The formulation of binary logistic regression as a [[log-linear model]] can be directly extended to multi-way regression. That is, we model the [[logarithm]] of the probability of seeing a given output using the linear predictor as well as an additional [[normalizing constant|normalization factor]], the logarithm of the [[partition function (mathematics)|partition function]]:
: <math>
\ln \Pr(Y_i=k) = \boldsymbol\beta_k \cdot \mathbf{X}_i - \ln Z, \;\;\;\; k = 1, \dots, K.
</math>
As in the binary case, we need an extra term <math>-\ln Z</math> to ensure that the whole set of probabilities forms a [[probability distribution]], i.e. so that they all sum to one. The reason we add a term for normalization, rather than multiplying as usual, is that we have taken the logarithm of the probabilities. Exponentiating both sides turns the additive term into a multiplicative factor, so that the probability is just the [[Gibbs measure]]:
: <math>
\Pr(Y_i=k) = \frac{1}{Z} e^{\boldsymbol\beta_k \cdot \mathbf{X}_i}, \;\;\;\; k = 1, \dots, K.
</math>
The quantity <math>Z</math> is called the [[partition function (mathematics)|partition function]] for the distribution. We can compute its value by applying the above constraint that the probabilities sum to one:
:<math>1 = \sum_{k=1}^{K} \Pr(Y_i=k) = \frac{1}{Z} \sum_{k=1}^{K} e^{\boldsymbol\beta_k \cdot \mathbf{X}_i}</math>
and therefore
:<math>Z = \sum_{k=1}^{K} e^{\boldsymbol\beta_k \cdot \mathbf{X}_i}.</math>
Using this fact, the probability equations become
:<math>
\Pr(Y_i=k) = \frac{e^{\boldsymbol\beta_k \cdot \mathbf{X}_i}}{\sum_{j=1}^{K} e^{\boldsymbol\beta_j \cdot \mathbf{X}_i}}, \;\;\;\; k = 1, \dots, K.
</math>
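For example, with purely illustrative numbers: suppose <math>K = 3</math> and the linear predictors for a particular observation evaluate to <math>\boldsymbol\beta_1 \cdot \mathbf{X}_i = 1</math>, <math>\boldsymbol\beta_2 \cdot \mathbf{X}_i = 2</math>, and <math>\boldsymbol\beta_3 \cdot \mathbf{X}_i = 3</math>. Then <math>Z = e^1 + e^2 + e^3 \approx 30.19</math> and
:<math>\Pr(Y_i=1) \approx \frac{2.72}{30.19} \approx 0.090, \;\;\; \Pr(Y_i=2) \approx \frac{7.39}{30.19} \approx 0.245, \;\;\; \Pr(Y_i=3) \approx \frac{20.09}{30.19} \approx 0.665,</math>
which sum to 1 as required.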
The following function:
:<math>\operatorname{softmax}(k, z_1, \ldots, z_K) = \frac{e^{z_k}}{\sum_{j=1}^K e^{z_j}}</math>
is referred to as the [[softmax function]]. The reason is that the effect of exponentiating the values <math>z_1, \ldots, z_K</math> is to exaggerate the differences between them. As a result, <math>\operatorname{softmax}(k, z_1, \ldots, z_K)</math> will return a value close to 0 whenever <math>z_k</math> is significantly less than the maximum of all the values, and will return a value close to 1 when applied to the maximum value, unless it is extremely close to the next-largest value. Thus, the softmax function can be used to construct a [[weighted average]] that behaves as a [[smooth function]] (which can be conveniently [[derivative|differentiated]], etc.) and which approximates the [[indicator function]]
:<math>f(k) = \begin{cases}
1 & \textrm{if } \; k = \operatorname{\arg\max}(z_1, \ldots, z_K), \\
0 & \textrm{otherwise}.
\end{cases}</math>
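A minimal Python sketch (the function and the sample values are illustrative only, not part of the model) makes this behaviour concrete: a clear maximum receives almost all of the output mass, while two nearly tied inputs share it.
<syntaxhighlight lang="python">
import math

def softmax(z):
    # Subtract the maximum before exponentiating: this cancels in the
    # ratio but prevents floating-point overflow for large inputs.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

print(softmax([1.0, 2.0, 8.0]))   # ≈ [0.0009, 0.0025, 0.9966]
print(softmax([1.0, 7.9, 8.0]))   # ≈ [0.0005, 0.4748, 0.5247]
</syntaxhighlight>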
The softmax function thus serves as the equivalent of the [[logistic function]] in binary logistic regression.
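To see this concretely, note that for <math>K = 2</math> the softmax function reduces to the logistic function applied to the difference of its two arguments:
:<math>\operatorname{softmax}(1, z_1, z_2) = \frac{e^{z_1}}{e^{z_1} + e^{z_2}} = \frac{1}{1 + e^{-(z_1 - z_2)}}.</math>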
Note that not all of the <math>\boldsymbol\beta_k</math> vectors of coefficients are uniquely [[identifiability|identifiable]]. This is due to the fact that all probabilities must sum to 1, making one of them completely determined once all the rest are known. As a result, there are only <math>K-1</math> separately specifiable probabilities, and hence <math>K-1</math> separately identifiable vectors of coefficients. One way to see this is to note that if we add a constant vector to all of the coefficient vectors, the equations are identical:
:<math>
\frac{e^{(\boldsymbol\beta_k + \mathbf{C}) \cdot \mathbf{X}_i}}{\sum_{j=1}^{K} e^{(\boldsymbol\beta_j + \mathbf{C}) \cdot \mathbf{X}_i}}
= \frac{e^{\mathbf{C} \cdot \mathbf{X}_i}\, e^{\boldsymbol\beta_k \cdot \mathbf{X}_i}}{e^{\mathbf{C} \cdot \mathbf{X}_i} \sum_{j=1}^{K} e^{\boldsymbol\beta_j \cdot \mathbf{X}_i}}
= \frac{e^{\boldsymbol\beta_k \cdot \mathbf{X}_i}}{\sum_{j=1}^{K} e^{\boldsymbol\beta_j \cdot \mathbf{X}_i}}
</math>
As a result, it is conventional to set <math>\mathbf{C} = -\boldsymbol\beta_K</math> (or alternatively, one of the other coefficient vectors), so that one of the vectors becomes <math>\mathbf{0}</math> and all of the other vectors are transformed into the difference between them and the vector we chose. This is equivalent to "pivoting" around one of the <math>K</math> choices, and examining how much better or worse the other <math>K-1</math> choices are relative to the one we are pivoting around. Mathematically, we transform the coefficients as follows:
:<math>
\begin{align}
\boldsymbol\beta'_k &= \boldsymbol\beta_k - \boldsymbol\beta_K, \;\;\; k = 1, \dots, K-1 \\
\boldsymbol\beta'_K &= \mathbf{0}
\end{align}
</math>
This leads to the following equations:
:<math>
\Pr(Y_i=k) = \frac{e^{\boldsymbol\beta'_k \cdot \mathbf{X}_i}}{1 + \sum_{j=1}^{K-1} e^{\boldsymbol\beta'_j \cdot \mathbf{X}_i}}, \;\;\;\; k = 1, \dots, K.
</math>
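Note that with <math>\boldsymbol\beta'_K = \mathbf{0}</math> the numerator for <math>k = K</math> is simply 1, which is also where the 1 in the denominator comes from. As a numerical check (a sketch with made-up coefficients and features, not part of the derivation above), the following Python snippet verifies both facts: adding a common constant vector to every <math>\boldsymbol\beta_k</math> leaves the probabilities unchanged, and the pivoted coefficients reproduce them via the <math>K-1</math>-vector form.
<syntaxhighlight lang="python">
import math

def probs(betas, x):
    # Pr(Y = k) from the full K-vector (softmax) form.
    exps = [math.exp(sum(b * v for b, v in zip(beta, x))) for beta in betas]
    z = sum(exps)
    return [e / z for e in exps]

def probs_pivoted(betas, x):
    # Pivot around the last category: beta'_k = beta_k - beta_K, beta'_K = 0,
    # so Pr(Y = K) = 1 / (1 + sum of the K-1 remaining exponentials).
    base = betas[-1]
    primed = [[bk - bK for bk, bK in zip(beta, base)] for beta in betas[:-1]]
    exps = [math.exp(sum(b * v for b, v in zip(beta, x))) for beta in primed]
    z = 1.0 + sum(exps)
    return [e / z for e in exps] + [1.0 / z]

betas = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]   # made-up coefficients, K = 3
x = [1.0, 2.0]                                   # made-up feature vector
p = probs(betas, x)

# Adding the same constant vector C to every beta_k changes nothing ...
shifted = [[b + c for b, c in zip(beta, [10.0, 5.0])] for beta in betas]
assert all(abs(a - b) < 1e-9 for a, b in zip(p, probs(shifted, x)))
# ... and the pivoted parameterization gives identical probabilities.
assert all(abs(a - b) < 1e-9 for a, b in zip(p, probs_pivoted(betas, x)))
</syntaxhighlight>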