{{Short description|Purely statistical model of language}}
A '''word n-gram language model''' is a purely statistical model of language. It has been superseded by [[recurrent neural network]]-based models, which have in turn been superseded by [[large language model]]s.<ref>{{Cite journal|url=https://dl.acm.org/doi/10.5555/944919.944966|title=A neural probabilistic language model|first1=Yoshua|last1=Bengio|first2=Réjean|last2=Ducharme|first3=Pascal|last3=Vincent|first4=Christian|last4=Janvin|date=March 1, 2003|journal=The Journal of Machine Learning Research|volume=3|pages=1137–1155|via=ACM Digital Library}}</ref> It is based on the assumption that the probability of the next word in a sequence depends only on a fixed-size window of previous words. If only one previous word is considered, it is called a bigram model; if two words, a trigram model; if ''n''&nbsp;−&nbsp;1 words, an ''n''-gram model.<ref name=jm/> Special tokens are introduced to denote the start and end of a sentence, <math>\langle s\rangle</math> and <math>\langle /s\rangle</math>.
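The fixed-window assumption above can be illustrated with a minimal bigram model. The following sketch (an illustrative example with invented data, not drawn from a cited source) estimates <math>P(w_i \mid w_{i-1})</math> by maximum likelihood from counts, padding each sentence with the start and end tokens:

```python
from collections import defaultdict

def train_bigram(sentences):
    # Count bigrams and their left-context unigrams, padding each
    # sentence with start (<s>) and end (</s>) tokens.
    bigram_counts = defaultdict(int)
    context_counts = defaultdict(int)
    for sentence in sentences:
        tokens = ["<s>"] + sentence + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            bigram_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    return bigram_counts, context_counts

def prob(bigram_counts, context_counts, prev, cur):
    # Maximum-likelihood estimate: P(cur | prev) = count(prev, cur) / count(prev)
    if context_counts[prev] == 0:
        return 0.0
    return bigram_counts[(prev, cur)] / context_counts[prev]

# Toy corpus of two tokenized sentences (hypothetical data)
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
b, c = train_bigram(corpus)
print(prob(b, c, "<s>", "the"))  # 1.0: both sentences begin with "the"
print(prob(b, c, "the", "cat"))  # 0.5: "the" is followed by "cat" once out of twice
```

A trigram model would instead condition on the pair of the two preceding tokens; in practice such raw counts are combined with smoothing, since unseen ''n''-grams otherwise receive zero probability.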