An '''''n''-gram language model''' is a [[language model]] that models sequences of words as a [[Markov process]]. It makes the simplifying assumption that the probability of the next word in a sequence depends only on a fixed-size window of previous words. A bigram model considers one previous word, a trigram model considers two, and in general, an ''n''-gram model considers ''n'' − 1 words of previous context.<ref name=jm/>
For example, a bigram language model models the probability of the sentence ''I saw the red house'' as:
<math display="block">P(\text{I, saw, the, red, house}) \approx P(\text{I}\mid\langle s\rangle) P(\text{saw}\mid \text{I}) P(\text{the}\mid\text{saw}) P(\text{red}\mid\text{the}) P(\text{house}\mid\text{red}) P(\langle /s\rangle\mid \text{house})</math>
where <math>\langle s\rangle</math> and <math>\langle /s\rangle</math> are special tokens denoting the start and end of a sentence.
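The factorization above can be sketched in a few lines of Python. The bigram probabilities below are illustrative values chosen for the example, not estimates from any real corpus:

```python
import math

# Hypothetical bigram probabilities (illustrative values only).
bigram_p = {
    ("<s>", "I"): 0.25, ("I", "saw"): 0.33, ("saw", "the"): 0.5,
    ("the", "red"): 0.1, ("red", "house"): 0.8, ("house", "</s>"): 0.4,
}

tokens = ["<s>", "I", "saw", "the", "red", "house", "</s>"]

# P(sentence) ≈ product of P(w_i | w_{i-1}) over consecutive token pairs.
prob = math.prod(bigram_p[pair] for pair in zip(tokens, tokens[1:]))
print(prob)
```

Each factor corresponds to one conditional probability in the equation above; the sentence probability is simply their product.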
These conditional probabilities may be estimated from frequency counts in a [[text corpus]]. For example, <math>P(\text{saw}\mid \text{I})</math> can be naively estimated as the proportion of occurrences of the word ''I'' that are followed by ''saw'' in the corpus. The problem of sparsity (for example, if the bigram "red house" has zero occurrences in the corpus) may necessitate modifying the basic Markov model with [[smoothing]] techniques, particularly when using larger context windows.<ref name=jm/>
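A minimal sketch of this estimation procedure, using a made-up toy corpus (the sentences and the add-one smoothing variant are illustrative assumptions, not taken from the article):

```python
from collections import Counter

# Toy corpus; <s> and </s> mark sentence boundaries as in the article.
corpus = [
    "<s> I saw the red house </s>",
    "<s> I saw the dog </s>",
    "<s> the dog saw me </s>",
]

unigrams = Counter()
bigrams = Counter()
for sentence in corpus:
    tokens = sentence.split()
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def p_mle(word, prev):
    """Maximum-likelihood estimate P(word | prev) = count(prev, word) / count(prev)."""
    return bigrams[(prev, word)] / unigrams[prev]

def p_laplace(word, prev, vocab_size):
    """Add-one (Laplace) smoothing: unseen bigrams receive nonzero probability."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

V = len(unigrams)
print(p_mle("saw", "I"))           # every "I" in this corpus is followed by "saw"
print(p_mle("house", "red"))
print(p_laplace("cat", "the", V))  # unseen bigram, but probability > 0 after smoothing
```

The unsmoothed estimate assigns probability zero to any bigram absent from the corpus; add-one smoothing is the simplest of the smoothing techniques mentioned above, at the cost of redistributing some probability mass to unseen events.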
''n''-gram models are no longer commonly used in [[natural language processing]] research and applications, having been supplanted by state-of-the-art [[neural language model|deep learning methods]], most recently [[large language model]]s.
==References==
{{reflist|refs=
<ref name=jm>{{cite book
|last1=Jurafsky |first1=Dan |last2=Martin |first2=James H.
|title=Speech and Language Processing
|date=7 January 2023 |edition=3rd edition draft
|url=https://web.stanford.edu/~jurafsky/slp3/ed3book_jan72023.pdf
|access-date=24 May 2022
|chapter=N-gram Language Models
}}</ref>
}}
[[Category:Language modeling]]
[[Category:Statistical natural language processing]]
[[Category:Markov models]]