''n''-gram-based searching was also used for [[plagiarism detection]].
== Bias–variance tradeoff ==
{{Main|Bias–variance tradeoff}}
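To choose a value for ''n'' in an ''n''-gram model, it is necessary to find the right trade-off between the stability of the estimate and its appropriateness. A larger ''n'' captures more context but leaves fewer observations per ''n''-gram, so the estimates are less stable (higher variance); a smaller ''n'' gives more stable estimates from a coarser model (higher bias). For this reason, trigrams (i.e. triplets of words) are a common choice with large training corpora (millions of words), whereas bigrams are often used with smaller ones.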
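
The effect of ''n'' on the stability of the estimate can be illustrated with a minimal Python sketch. It is not drawn from any cited source: the helper names <code>ngrams</code> and <code>unseen_fraction</code> and the toy sentences are assumptions chosen for illustration. It measures the fraction of ''n''-grams in a held-out text that never occur in the training text, which tends to grow with ''n'' when the corpus is small.

```python
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def unseen_fraction(train_tokens, test_tokens, n):
    """Fraction of test n-grams that never occur in the training tokens."""
    seen = set(ngrams(train_tokens, n))
    test = ngrams(test_tokens, n)
    if not test:
        return 0.0
    return sum(1 for gram in test if gram not in seen) / len(test)


# Toy data (hypothetical): a tiny "training corpus" and a held-out sentence.
train = "the cat sat on the mat the dog ran".split()
test = "the dog sat on the mat".split()

for n in (1, 2, 3):
    print(n, unseen_fraction(train, test, n))
# prints:
# 1 0.0
# 2 0.2
# 3 0.5
```

In this toy setting every held-out word has been seen, but one in five bigrams and half of the trigrams have not, so a trigram model would have to rely on smoothing or back-off for half of its predictions. With a corpus of millions of words the unseen fraction for trigrams falls, which is consistent with the choice described above.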