In statistics, the logit (/ˈloʊdʒɪt/ LOH-jit) function is the quantile function associated with the standard logistic distribution. It has many uses in data analysis and machine learning, especially in data transformations.

Plot of logit(x) in the domain of 0 to 1, where the base of the logarithm is e.

Mathematically, the logit is the inverse of the standard logistic function $\sigma(x) = 1/(1+e^{-x})$, so the logit is defined as

$$\operatorname{logit} p = \sigma^{-1}(p) = \ln\frac{p}{1-p} \quad \text{for } p \in (0,1).$$

Because of this, the logit is also called the log-odds since it is equal to the logarithm of the odds $\frac{p}{1-p}$, where p is a probability. Thus, the logit is a type of function that maps probability values from $(0,1)$ to real numbers in $(-\infty,+\infty)$,[1] akin to the probit function.

Definition

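If p is a probability, then p/(1 − p) is the corresponding odds; the logit of the probability is the logarithm of the odds, i.e.:

$$\operatorname{logit}(p) = \ln\left(\frac{p}{1-p}\right) = \ln(p) - \ln(1-p) = -\ln\left(\frac{1}{p} - 1\right) = 2\operatorname{atanh}(2p-1).$$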
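The definition can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name `logit` and the sample probability 0.8 are chosen here for illustration. It compares the log-odds with the logarithm of the odds computed directly and with the equivalent identity 2·atanh(2p − 1).

```python
import math


def logit(p: float) -> float:
    """Natural-log logit (log-odds) of a probability p strictly inside (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # ln(p) - ln(1 - p); log1p(-p) evaluates ln(1 - p) accurately for small p.
    return math.log(p) - math.log1p(-p)


p = 0.8                                  # illustrative probability
odds = p / (1.0 - p)                     # roughly 4, i.e. odds of 4 to 1
print(logit(p))                          # log-odds via the helper
print(math.log(odds))                    # logarithm of the odds, computed directly
print(2.0 * math.atanh(2.0 * p - 1.0))   # equivalent hyperbolic form
```

The three printed values should agree up to floating-point rounding. A probability of exactly 0 or 1 is rejected because the odds are then zero or undefined.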

The base of the logarithm function used is of little importance in the present article, as long as it is greater than 1, but the natural logarithm with base e is the one most often used. The choice of base corresponds to the choice of logarithmic unit for the value: base 2 corresponds to a shannon, base e to a nat, and base 10 to a hartley; these units are particularly used in information-theoretic interpretations. For each choice of base, the logit function takes values between negative and positive infinity.
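As a rough illustration of how the base only rescales the value, the sketch below converts the same log-odds into shannons, nats, and hartleys. The function name `logit_in_base` and the sample probability are assumptions made for this example.

```python
import math


def logit_in_base(p: float, base: float = math.e) -> float:
    """Log-odds of p using a logarithm of the given base (base > 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if base <= 1.0:
        raise ValueError("base must be greater than 1")
    # Change of base: log_b(odds) = ln(odds) / ln(b).
    return math.log(p / (1.0 - p)) / math.log(base)


p = 0.8  # illustrative probability, odds of about 4 to 1
for unit, base in [("shannons", 2.0), ("nats", math.e), ("hartleys", 10.0)]:
    print(f"{logit_in_base(p, base):.4f} {unit}")
```

Because every base greater than 1 divides by a positive constant, the sign of the logit and the ordering of probabilities are the same in all three units.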

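The "logistic" function of any number $\alpha$ is given by the inverse-logit:

$$\operatorname{logit}^{-1}(\alpha) = \operatorname{logistic}(\alpha) = \frac{1}{1+\exp(-\alpha)} = \frac{\exp(\alpha)}{\exp(\alpha)+1} = \frac{\tanh(\alpha/2)+1}{2}.$$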
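A minimal sketch of the inverse-logit in Python follows, assuming the natural logarithm. The names `logit` and `logistic` are illustrative helpers, and the branch on the sign of the argument is one common way to avoid overflow in `exp`, not the only one.

```python
import math


def logit(p: float) -> float:
    """Natural-log logit of a probability p strictly inside (0, 1)."""
    return math.log(p) - math.log1p(-p)


def logistic(a: float) -> float:
    """Inverse logit: maps any real number a to a probability."""
    if a >= 0.0:
        # exp(-a) <= 1 here, so it cannot overflow.
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)  # a < 0, so exp(a) < 1 and cannot overflow
    return e / (e + 1.0)


a = 1.5                                    # illustrative real number
print(logistic(a))                         # 1 / (1 + exp(-a))
print((math.tanh(a / 2.0) + 1.0) / 2.0)    # equivalent tanh form
print(logit(logistic(a)))                  # round trip recovers a up to rounding
```

For arguments of very large magnitude the result saturates to exactly 0.0 or 1.0 in floating point, at which point the logit can no longer be recovered.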

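The difference between the logits of two probabilities is the logarithm of the odds ratio (R), thus providing a shorthand for writing the correct combination of odds ratios only by adding and subtracting:

$$\ln(R) = \ln\left(\frac{p_1/(1-p_1)}{p_2/(1-p_2)}\right) = \ln\left(\frac{p_1}{1-p_1}\right) - \ln\left(\frac{p_2}{1-p_2}\right) = \operatorname{logit}(p_1) - \operatorname{logit}(p_2).$$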
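The relation between logit differences and odds ratios can be sketched as follows. The probabilities 0.75, 0.5, and 0.2 and the helper names `odds` and `logit` are picked purely for illustration.

```python
import math


def odds(p: float) -> float:
    """Odds p / (1 - p) of a probability p strictly inside (0, 1)."""
    return p / (1.0 - p)


def logit(p: float) -> float:
    """Natural logarithm of the odds."""
    return math.log(odds(p))


p1, p2, p3 = 0.75, 0.5, 0.2            # illustrative probabilities

# A difference of logits equals the log of the odds ratio R.
print(logit(p1) - logit(p2))
print(math.log(odds(p1) / odds(p2)))

# Odds ratios multiply, so their logarithms add: comparing p1 with p3
# via p2 only requires adding the two logit differences.
print((logit(p1) - logit(p2)) + (logit(p2) - logit(p3)))
print(math.log(odds(p1) / odds(p3)))
```

Each pair of printed values should match up to rounding, which is the sense in which odds ratios can be combined by addition and subtraction on the logit scale.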

The Taylor series for the logit function is given by:

$$\operatorname{logit}(x) = 2\sum_{n=0}^{\infty} \frac{(2x-1)^{2n+1}}{2n+1}.$$
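As a hedged numerical check, the sketch below sums the first terms of the expansion about x = 1/2, taken here in the form 2·Σ (2x − 1)^(2n+1)/(2n + 1) for n = 0, 1, 2, …, and compares the partial sums with the closed form. The function name `logit_series` and the default number of terms are assumptions made for this example.

```python
import math


def logit_series(x: float, terms: int = 50) -> float:
    """Partial sum of 2 * sum_{n>=0} (2x - 1)^(2n + 1) / (2n + 1)."""
    z = 2.0 * x - 1.0  # lies in (-1, 1) whenever 0 < x < 1
    return 2.0 * sum(z ** (2 * n + 1) / (2 * n + 1) for n in range(terms))


x = 0.7  # illustrative point
for terms in (1, 3, 10, 50):
    print(terms, logit_series(x, terms))
print("closed form", math.log(x / (1.0 - x)))
```

The partial sums converge quickly near x = 1/2 and increasingly slowly as x approaches 0 or 1, where |2x − 1| approaches 1.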

History

Several approaches have been explored to adapt linear regression methods to a domain where the output is a probability value $(0,1)$, instead of any real number $(-\infty,+\infty)$. In many cases, such efforts have focused on modeling this problem by mapping the range $(0,1)$ to $(-\infty,+\infty)$ and then running the linear regression on these transformed values.[2]

In 1934, Chester Ittner Bliss used the cumulative normal distribution function to perform this mapping and called his model probit, an abbreviation for "probability unit". This is, however, computationally more expensive.[2]

In 1944, Joseph Berkson used log of odds and called this function logit, an abbreviation for "logistic unit", following the analogy for probit:

"I use this term [logit] for $\ln p/q$ following Bliss, who called the analogous function which is linear on $x$ for the normal curve 'probit'."

— Joseph Berkson (1944)[3]

Log odds was used extensively by Charles Sanders Peirce (late 19th century).[4] G. A. Barnard in 1949 coined the commonly used term log-odds;[5][6] the log-odds of an event is the logit of the probability of the event.[7] Barnard also coined the term lods as an abstract form of "log-odds",[8] but suggested that "in practice the term 'odds' should normally be used, since this is more familiar in everyday life".[9]

Uses and properties

Comparison with probit

Comparison of the logit function with a scaled probit (i.e. the inverse CDF of the normal distribution), comparing $\operatorname{logit}(x)$ vs. $\Phi^{-1}(x)/\sqrt{\pi/8}$, which makes the slopes the same at the y-origin.

Closely related to the logit function (and logit model) are the probit function and probit model. The logit and probit are both inverses of sigmoid functions, with a domain between 0 and 1, which makes them both quantile functions, i.e., inverses of the cumulative distribution function (CDF) of a probability distribution. In fact, the logit is the quantile function of the logistic distribution, while the probit is the quantile function of the normal distribution. The probit function is denoted $\Phi^{-1}(x)$, where $\Phi(x)$ is the CDF of the standard normal distribution, as just mentioned:

$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\,dy.$$

As shown in the graph on the right, the logit and probit functions are extremely similar when the probit function is scaled, so that its slope at y = 0 matches the slope of the logit. As a result, probit models are sometimes used in place of logit models because for certain applications (e.g., in item response theory) the implementation is easier.[14]
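The similarity can be explored numerically. The sketch below uses `statistics.NormalDist.inv_cdf` from the Python standard library as the probit and divides it by √(π/8). That constant follows from matching slopes at x = 1/2, where the logit has slope 4 and the unscaled probit has slope √(2π). The helper names and the sample probabilities are assumptions made for illustration.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()          # standard normal: mean 0, standard deviation 1
_SCALE = math.sqrt(math.pi / 8.0)   # dividing by this gives the probit slope 4 at x = 1/2


def logit(p: float) -> float:
    """Natural-log logit of a probability p strictly inside (0, 1)."""
    return math.log(p) - math.log1p(-p)


def scaled_probit(p: float) -> float:
    """Standard normal quantile, rescaled to match the logit's slope at p = 1/2."""
    return _STD_NORMAL.inv_cdf(p) / _SCALE


for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"p={p:.1f}  logit={logit(p):+.4f}  scaled probit={scaled_probit(p):+.4f}")
```

The two columns stay close near p = 1/2 and drift apart toward 0 and 1, where the logistic distribution's heavier tails make the logit grow faster in magnitude than the scaled probit.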

References

  1. "Logit/Probit" (PDF).
  2. Cramer, J. S. (2003). "The origins and development of the logit model" (PDF). Cambridge UP. Archived from the original (PDF) on 19 September 2024.
  3. Berkson 1944, p. 361, footnote 2.
  4. Stigler, Stephen M. (1986). The history of statistics : the measurement of uncertainty before 1900. Cambridge, Massachusetts: Belknap Press of Harvard University Press. ISBN 978-0-674-40340-6.
  5. Hilbe, Joseph M. (2009), Logistic Regression Models, CRC Press, p. 3, ISBN 9781420075779.
  6. Barnard 1949, p. 120.
  7. Cramer, J. S. (2003), Logit Models from Economics and Other Fields, Cambridge University Press, p. 13, ISBN 9781139438193.
  8. Barnard 1949, p. 120,128.
  9. Barnard 1949, p. 136.
  10. "R: Inverse logit function". Archived from the original on 2011-07-06. Retrieved 2011-02-18.
  11. Thrun, Sebastian (2003). "Learning Occupancy Grid Maps with Forward Sensor Models" (PDF). Autonomous Robots. 15 (2): 111–127. doi:10.1023/A:1025584807625. ISSN 0929-5593. S2CID 2279013.
  12. Styler, Alex (2012). "Statistical Techniques in Robotics" (PDF). p. 2. Retrieved 2017-01-26.
  13. Dickmann, J.; Appenrodt, N.; Klappstein, J.; Bloecher, H. L.; Muntzinger, M.; Sailer, A.; Hahn, M.; Brenk, C. (2015-01-01). "Making Bertha See Even More: Radar Contribution". IEEE Access. 3: 1233–1247. Bibcode:2015IEEEA...3.1233D. doi:10.1109/ACCESS.2015.2454533. ISSN 2169-3536.
  14. Albert, James H. (2016). "Logit, Probit, and other Response Functions". Handbook of Item Response Theory. Vol. Two. Chapman and Hall. pp. 3–22. doi:10.1201/b19166-1. ISBN 978-1-315-37364-5.