:<math>\Pr(Y_i=c) = \operatorname{softmax}(c, \boldsymbol\beta_0 \cdot \mathbf{X}_i, \boldsymbol\beta_1 \cdot \mathbf{X}_i, \dots) .</math>
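The two-outcome formula above can be illustrated with a minimal sketch. The coefficient vectors, the data point, and the helper names (<code>dot</code>, <code>outcome_probs</code>) are hypothetical values chosen only for illustration, and the <code>softmax</code> function here is an assumed implementation of the notation used in the formula, not a library call. The sketch computes both outcome probabilities and then repeats the computation after adding one common vector to both coefficient vectors.

```python
import math

def softmax(c, *scores):
    """Probability of outcome c given one linear score per outcome."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    return exps[c] / sum(exps)

def dot(beta, x):
    """Dot product of a coefficient vector and a data vector."""
    return sum(b * v for b, v in zip(beta, x))

def outcome_probs(b0, b1, x):
    """Return [Pr(Y=0), Pr(Y=1)] for the two-outcome model."""
    scores = (dot(b0, x), dot(b1, x))
    return [softmax(c, *scores) for c in (0, 1)]

# Hypothetical coefficients and data point (assumptions, not from the article).
beta0 = [0.5, -1.0]
beta1 = [-0.25, 2.0]
x_i = [1.0, 0.3]

probs = outcome_probs(beta0, beta1, x_i)

# Add the same arbitrary vector to both coefficient vectors.
shift = [3.0, -7.0]
shifted = outcome_probs(
    [b + s for b, s in zip(beta0, shift)],
    [b + s for b, s in zip(beta1, shift)],
    x_i,
)

print(probs)    # two probabilities that sum to 1
print(shifted)  # the same two probabilities, up to rounding error
```

In this sketch the two lists agree up to floating-point rounding, and <code>probs[1]</code> matches the logistic function evaluated at <code>dot(beta1, x_i) - dot(beta0, x_i)</code>, so only the difference between the two coefficient vectors affects the result.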
To prove that this is equivalent to the previous model, we start by recognizing that the above model is overspecified, in that <math>\Pr(Y_i=0)</math> and <math>\Pr(Y_i=1)</math> cannot be independently specified: rather, <math>\Pr(Y_i=0) + \Pr(Y_i=1) = 1</math>, so knowing one automatically determines the other. As a result, the model is [[nonidentifiable]], in that multiple combinations of <math>\boldsymbol\beta_{0}</math> and <math>\boldsymbol\beta_{1}</math> will produce the same probabilities for all possible explanatory variables. In fact, adding the same constant vector to both of them will produce the same probabilities:
:<math>
|