Revision as of 21:51, 28 September 2024 by 2601:447:cd80:e200:5960:6acf:38e4:a8cc (→As a set of independent binary regressions)
Revision as of 21:52, 28 September 2024 by 2601:447:cd80:e200:5960:6acf:38e4:a8cc (→As a log-linear model)
Line 110:
: <math> \Pr(Y_i=k) = \frac{1}{Z} e^{\boldsymbol\beta_k \cdot \mathbf{X}_i} \;\;\;\;,\;\;k \le K. </math>

The quantity ''Z'' is called the [[partition function (mathematics)|partition function]] for the distribution. We can compute the value of the partition function by applying the above constraint that requires all probabilities to sum to 1:

:<math> 1 = \sum_{k=1}^{K} \Pr(Y_i=k) \;=\; \sum_{k=1}^{K} \frac{1}{Z} e^{\boldsymbol\beta_k \cdot \mathbf{X}_i} \;=\; \frac{1}{Z} \sum_{k=1}^{K} e^{\boldsymbol\beta_k \cdot \mathbf{X}_i}. </math>

Therefore:

:<math>Z = \sum_{k=1}^{K} e^{\boldsymbol\beta_k \cdot \mathbf{X}_i}.</math>

Note that this factor is "constant" only in the sense that it is not a function of ''Y''<sub>''i''</sub>, the variable over which the probability distribution is defined. It is not constant with respect to the explanatory variables '''X'''<sub>''i''</sub> or, crucially, with respect to the unknown regression coefficients '''''β'''''<sub>''k''</sub>, which must be determined through an [[mathematical optimization|optimization]] procedure.
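
The normalization above can be illustrated with a minimal Python sketch. The function names <code>partition_function</code> and <code>class_probabilities</code> and the coefficient values are hypothetical, chosen only for illustration. The sketch computes ''Z'' for one observation, divides each exponentiated score by it, and shows that ''Z'' changes when the explanatory variables change.

```python
import math

def scores(betas, x):
    """Linear predictors beta_k . x, one per outcome k = 1..K."""
    return [sum(b * xi for b, xi in zip(beta, x)) for beta in betas]

def partition_function(betas, x):
    """Z = sum over k of exp(beta_k . x) for a single observation x."""
    return sum(math.exp(s) for s in scores(betas, x))

def class_probabilities(betas, x):
    """Pr(Y = k) = exp(beta_k . x) / Z for each outcome k."""
    z = partition_function(betas, x)
    return [math.exp(s) / z for s in scores(betas, x)]

# Hypothetical coefficients: K = 3 outcomes, 2 explanatory variables.
betas = [[0.5, -1.0], [0.0, 0.0], [-0.3, 0.8]]

x_a = [1.0, 2.0]   # one observation
x_b = [0.0, 0.0]   # another observation; every score is 0 here

probs_a = class_probabilities(betas, x_a)
z_a = partition_function(betas, x_a)
z_b = partition_function(betas, x_b)

print(sum(probs_a))  # probabilities sum to 1 up to rounding error
print(z_a, z_b)      # Z differs between the two observations
```

For large scores, <code>math.exp</code> can overflow; a common remedy is to subtract the largest score from every score before exponentiating, which leaves the probabilities unchanged.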

Multinomial logistic regression: Difference between revisions