Revision as of 17:42, 10 October 2024 by 82.49.100.199 (talk), no edit summary		Revision as of 12:58, 13 October 2024 by Hooman Mallahzadeh (talk | contribs), edit summary: Linking.
Line 19: <math display="block"> P(w_m \mid w_1,\ldots,w_{m-1}) = \frac{1}{Z(w_1,\ldots,w_{m-1})} \exp (a^T f(w_1,\ldots,w_m))</math> where <math>Z(w_1,\ldots,w_{m-1})</math> is the [[Partition function (mathematics)|partition function]], <math>a</math> is the parameter vector, and <math>f(w_1,\ldots,w_m)</math> is the feature function. In the simplest case, the feature function is just an indicator of the presence of a certain ''n''-gram. It is helpful to use a prior on <math>a</math> or some form of [[Regularization (mathematics)|regularization]]. The log-bilinear model is another example of an exponential language model.
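As an illustration of the formula above, the following is a minimal sketch of the simplest case, in which each feature is an indicator for one ''n''-gram ending in the candidate word. The names <code>ngram_features</code> and <code>maxent_prob</code>, the dictionary representation of the parameter vector <math>a</math>, and the toy weights are assumptions made for this example; they do not come from any particular library.

```python
import math
from typing import Dict, List, Sequence, Tuple

NGram = Tuple[str, ...]


def ngram_features(context: Sequence[str], word: str, n: int = 2) -> List[NGram]:
    """Return the active indicator features: every k-gram (k <= n) ending in `word`."""
    seq = tuple(context) + (word,)
    return [seq[-k:] for k in range(1, min(n, len(seq)) + 1)]


def maxent_prob(
    context: Sequence[str],
    vocab: Sequence[str],
    weights: Dict[NGram, float],
    n: int = 2,
) -> Dict[str, float]:
    """Compute P(w_m | w_1..w_{m-1}) = exp(a^T f) / Z for every w_m in `vocab`."""
    # With indicator features, a^T f is the sum of the weights of the active n-grams.
    scores = {
        w: sum(weights.get(f, 0.0) for f in ngram_features(context, w, n))
        for w in vocab
    }
    # Subtract the maximum score before exponentiating for numerical stability.
    # The shift cancels between numerator and denominator, so the result is unchanged.
    top = max(scores.values())
    shifted_z = sum(math.exp(s - top) for s in scores.values())
    return {w: math.exp(s - top) / shifted_z for w, s in scores.items()}


# Toy parameter vector (assumed values): one bigram feature and one unigram feature.
weights = {("the", "cat"): 1.0, ("dog",): 0.5}
probs = maxent_prob(["the"], ["cat", "dog", "sat"], weights)
```

The partition function is computed here by summing over the whole vocabulary for the given context, which is what makes the distribution sum to one. Training the weights, and the prior or regularization term on <math>a</math>, are not shown.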

Language model: Difference between revisions