An '''''n''-gram language model''' is a [[language model]] that models sequences of words as a [[Markov process]]. It makes the simplifying assumption that the probability of the next word in a sequence depends only on a fixed-size window of previous words. A bigram model considers one previous word, a trigram model considers two, and in general, an ''n''-gram model considers ''n'' − 1 words of previous context.<ref name=jm/>
For example, a bigram language model models the probability of the sentence ''I saw the red house'' as:
<math display="block">P(\text{I, saw, the, red, house}) \approx P(\text{I}\mid\langle s\rangle) P(\text{saw}\mid \text{I}) P(\text{the}\mid\text{saw}) P(\text{red}\mid\text{the}) P(\text{house}\mid\text{red}) P(\langle /s\rangle\mid \text{house})</math>
where <math>\langle s\rangle</math> and <math>\langle /s\rangle</math> are special tokens denoting the start and end of a sentence.
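The factorization above can be sketched in a few lines of Python. The bigram probabilities below are illustrative values chosen for the example, not estimates from any real corpus:

```python
import math

# Hypothetical bigram probabilities (illustrative values only).
bigram_p = {
    ("<s>", "I"): 0.25, ("I", "saw"): 0.33, ("saw", "the"): 0.5,
    ("the", "red"): 0.1, ("red", "house"): 0.8, ("house", "</s>"): 0.4,
}

tokens = ["<s>", "I", "saw", "the", "red", "house", "</s>"]

# P(sentence) ≈ product of P(w_i | w_{i-1}) over consecutive token pairs.
prob = math.prod(bigram_p[pair] for pair in zip(tokens, tokens[1:]))
print(prob)
```

Each factor corresponds to one conditional probability in the equation above; the sentence probability is simply their product.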
These conditional probabilities may be estimated from frequency counts in a [[text corpus]]. For example, <math>P(\text{saw}\mid \text{I})</math> can be naively estimated as the proportion of occurrences of the word ''I'' that are followed by ''saw'' in the corpus. The problem of sparsity (for example, if the bigram "red house" has zero occurrences in the corpus) may necessitate modifying the basic Markov model with [[smoothing]] techniques, particularly when using larger context windows.<ref name=jm/>
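A minimal sketch of this estimation procedure, using a made-up toy corpus (the sentences and the add-one smoothing variant are illustrative assumptions, not taken from the article):

```python
from collections import Counter

# Toy corpus; <s> and </s> mark sentence boundaries as in the article.
corpus = [
    "<s> I saw the red house </s>",
    "<s> I saw the dog </s>",
    "<s> the dog saw me </s>",
]

unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    tokens = sentence.split()
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def p_mle(word, prev):
    """Maximum-likelihood estimate P(word | prev) = count(prev, word) / count(prev)."""
    return bigrams[(prev, word)] / unigrams[prev]

def p_laplace(word, prev, vocab_size):
    """Add-one (Laplace) smoothing: unseen bigrams receive nonzero probability."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

V = len(unigrams)
print(p_mle("saw", "I"))           # every "I" in this corpus is followed by "saw"
print(p_mle("house", "red"))
print(p_laplace("cat", "the", V))  # unseen bigram, but probability > 0 after smoothing
```

The unsmoothed estimate assigns probability zero to any bigram absent from the corpus; add-one smoothing is the simplest of the smoothing techniques mentioned above, at the cost of redistributing some probability mass to unseen events.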
''n''-gram models are no longer commonly used in [[natural language processing]] research and applications, having been supplanted by state-of-the-art [[neural language model|deep learning methods]], most recently [[large language model]]s.
==References==
{{reflist|refs=
<ref name=jm>{{cite book
|last1=Jurafsky |first1=Dan |last2=Martin |first2=James H.
|title=Speech and Language Processing
|date=7 January 2023 |edition=3rd edition draft
|url=https://web.stanford.edu/~jurafsky/slp3/ed3book_jan72023.pdf
|access-date=24 May 2022
|chapter=N-gram Language Models
}}</ref>
}}
[[Category:Language modeling]]
[[Category:Statistical natural language processing]]
[[Category:Markov models]]