Content deleted Content added
Aimefournier (talk | contribs) m →As a log-linear model: Clarify that k = 0 is not a case, and prevent confusion about x symbol. |
Aimefournier (talk | contribs) m →Likelihood function: Fixed lower limit of k |
||
Line 231:
== Likelihood function ==
The observed values <math>y_i \in \{
The likelihood function for this model is defined by
:<math>L = \prod_{i=1}^n P(Y_i=y_i) = \prod_{i=1}^n
where the index <math>i</math> denotes the observations 1 to ''n'' and the index <math>j</math> denotes the classes 1 to ''K''. <math>\delta_{j,y_i}=\begin{cases}1, \text{ for } j=y_i \\ 0, \text{ otherwise}\end{cases}</math> is the Kronecker delta. The negative log-likelihood function is therefore the well-known cross-entropy:
:<math>-\log L = - \sum_{i=1}^n \sum_{j=1}^K \delta_{j,y_i} \log(P(Y_i=j))= - \sum_{j=1}^K\sum_{y_i=j}\log(P(Y_i=j)).</math>
==Application in natural language processing==
|