{{Short description|Purely statistical model of language}}
A '''word n-gram language model''' is a purely statistical model of language. It has been superseded by [[recurrent neural network]]-based models, which have in turn been superseded by [[large language model]]s.<ref>{{Cite journal|url=https://dl.acm.org/doi/10.5555/944919.944966|title=A neural probabilistic language model|first1=Yoshua|last1=Bengio|first2=Réjean|last2=Ducharme|first3=Pascal|last3=Vincent|first4=Christian|last4=Janvin|date=March 1, 2003|journal=The Journal of Machine Learning Research|volume=3|pages=1137–1155|via=ACM Digital Library}}</ref> It is based on the assumption that the probability of the next word in a sequence depends only on a fixed-size window of previous words. If only one previous word is considered, it is called a bigram model; if two words, a trigram model; if ''n''&nbsp;−&nbsp;1 words, an ''n''-gram model.<ref name=jm/> Special tokens are introduced to denote the start and end of a sentence, <math>\langle s\rangle</math> and <math>\langle /s\rangle</math>.
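The fixed-window assumption above can be illustrated with a minimal bigram model. The following sketch (an illustrative example with invented data, not drawn from a cited source) estimates <math>P(w_i \mid w_{i-1})</math> by maximum likelihood from counts, padding each sentence with the start and end tokens:

```python
from collections import defaultdict

def train_bigram(sentences):
    # Count bigrams and their left-context unigrams, padding each
    # sentence with start (<s>) and end (</s>) tokens.
    bigram_counts = defaultdict(int)
    context_counts = defaultdict(int)
    for sentence in sentences:
        tokens = ["<s>"] + sentence + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            bigram_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    return bigram_counts, context_counts

def prob(bigram_counts, context_counts, prev, cur):
    # Maximum-likelihood estimate: P(cur | prev) = count(prev, cur) / count(prev)
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, cur)] / context_counts[prev]

# Toy corpus of two tokenized sentences (hypothetical data)
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
b, c = train_bigram(corpus)
print(prob(b, c, "<s>", "the"))  # 1.0: both sentences begin with "the"
print(prob(b, c, "the", "cat"))  # 0.5: "the" is followed by "cat" once out of twice
```

A trigram model would instead condition on the pair of the two preceding tokens; in practice such raw counts are combined with smoothing, since unseen ''n''-grams otherwise receive zero probability.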