{{unreferenced|date=May 2011}}
In [[computer science]], a [[log probability]] is simply the logarithm of a probability. The use of '''log probabilities''' means representing [[probability|probabilities]] in [[logarithm]]ic space, instead of the standard <math>[0, 1]</math> [[interval (mathematics)|interval]].
Representing probabilities in this way has three main advantages:
# '''Speed.''' Since multiplication is more [[time complexity|expensive]] than addition, taking the product of a large number of probabilities is often faster if they are represented in log form. (The conversion to log form is expensive, but it is only incurred once.) Multiplication arises when calculating the probability that multiple independent events all occur: that probability is the product of the individual events' probabilities.
# '''Accuracy.''' The use of log probabilities improves [[numerical stability]] when the probabilities are very small, because of the way in which computers [[Floating point|approximate real numbers]] (see the sketch after this list).
# '''Simplicity.''' Many probability distributions have an exponential form. Taking the log of these distributions eliminates the exponential function, unwrapping the exponent. For example, the log probability of the normal distribution's PDF is <math>-\left(\frac{x-m_x}{\sigma_m}\right)^2+C</math> instead of <math>C_2 \exp\left(-\left(\frac{x-m_x}{\sigma_m}\right)^2\right)</math>. Log probabilities make some mathematical manipulations easier to perform.
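The accuracy point can be seen in a short Python sketch (the event count and the probability value below are arbitrary, chosen only for this illustration): the direct product of many small probabilities underflows to zero in double-precision floating point, while the equivalent sum of log probabilities stays easily representable.

<syntaxhighlight lang="python">
import math

# 1000 independent events, each with probability 0.01
probs = [0.01] * 1000

# Direct product underflows: 0.01**1000 = 1e-2000 lies far below the
# smallest positive double (about 5e-324), so the result collapses to 0.0.
direct = 1.0
for p in probs:
    direct *= p
print(direct)        # 0.0

# The sum of log probabilities is perfectly representable.
log_prob = sum(math.log(p) for p in probs)
print(log_prob)      # approximately -4605.17 (natural logarithm)
</syntaxhighlight>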
The use of log probabilities is widespread in several fields of computer science such as [[information theory]] and [[natural language processing]], as it represents the [[surprisal]], the minimum length of the message that specifies the outcome in an optimally efficient code.
== Representation issues ==
The logarithm function is not defined for zero, so log probabilities can only represent non-zero probabilities. Since the logarithm of a number in the <math>[0, 1)</math> interval is negative, negative log probabilities are often used instead. In that case the log probabilities in the following formulas would be [[Additive inverse|inverted]].
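A brief Python illustration (the probability value is arbitrary and serves only as an example):

<syntaxhighlight lang="python">
import math

p = 0.03125              # an example probability, equal to 2**-5
log_p = math.log2(p)     # -5.0: logs of values in (0, 1) are negative
neg_log_p = -log_p       #  5.0: the inverted "negative log probability" form

# A probability of exactly zero has no finite logarithm:
# math.log2(0.0) raises a ValueError.
</syntaxhighlight>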
== Basic manipulations ==
Any base can be selected for the logarithm. In log space, a probability <math>x</math> is represented by its logarithm:
: <math>x' = \log(x) \in \mathbb{R}</math>
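As a minimal Python sketch of this representation (the probability values are arbitrary example numbers), multiplying probabilities corresponds to adding their logarithms, and the exponential function converts back whenever a plain probability is needed:

<syntaxhighlight lang="python">
import math

x, y = 0.2, 0.4               # two example probabilities

# Convert to log space: x' = log(x)
x_log = math.log(x)
y_log = math.log(y)

# Multiplication of probabilities becomes addition of log probabilities,
# since log(x * y) = log(x) + log(y).
product_log = x_log + y_log

# Convert back with exp when the plain probability is required.
print(math.exp(product_log))  # approximately 0.08, i.e. x * y
</syntaxhighlight>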
However, in many applications a multiplication of probabilities (giving the probability of all independent events occurring) is used more often than their addition (giving the probability of at least one of them occurring). Additionally, the cost of computing the addition can be avoided in some situations by simply using the highest probability as an approximation. Since probabilities are non-negative, this gives a lower bound. This approximation is used in reverse to get a [[LogSumExp|continuous approximation of the max function]].
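The max-as-approximation idea can be sketched in Python (the helper function <code>log_add</code> and the probability values below are illustrative only, not part of any standard library): <code>log_add</code> computes the exact log-space sum by factoring out the larger term, while the larger term on its own is the cheap lower-bound approximation.

<syntaxhighlight lang="python">
import math

def log_add(a, b):
    """Return log(exp(a) + exp(b)) without leaving log space.

    Factoring out the larger term keeps exp() applied only to a
    non-positive argument, which avoids overflow.
    """
    if a < b:
        a, b = b, a
    return a + math.log1p(math.exp(b - a))

log_p1 = math.log(0.001)           # two example log probabilities
log_p2 = math.log(0.002)

exact = log_add(log_p1, log_p2)    # equals log(0.001 + 0.002) = log(0.003)
lower_bound = max(log_p1, log_p2)  # cheap approximation: just the larger term

print(abs(exact - math.log(0.003)) < 1e-12)  # True, up to rounding
print(lower_bound <= exact)                  # True: the max is a lower bound
</syntaxhighlight>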
== Addition in log space ==