The '''factored language model''' ('''FLM''') is an extension of a conventional [[language model]] introduced by Jeff Bilmes and Katrin Kirchhoff in 2003. In an FLM, each word is viewed as a vector of ''k'' factors: <math>w_i = \{f_i^1, ..., f_i^k\}</math>. An FLM provides the probabilistic model <math>P(f|f_1, ..., f_N)</math>, where the prediction of a factor <math>f</math> is based on <math>N</math> parents <math>\{f_1, ..., f_N\}</math>. For example, if <math>w</math> represents a word token and <math>t</math> represents a [[part of speech]] tag for English, the expression <math>P(w_i|w_{i-2}, w_{i-1}, t_{i-1})</math> gives a model for predicting the current word token based on a traditional [[N-gram]] model as well as the [[part of speech]] tag of the previous word.
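The conditional model above can be illustrated with a minimal count-based sketch. The toy corpus, the factor pair (surface form, part-of-speech tag), and the function names are hypothetical; a real FLM would be trained on a large tagged corpus with smoothing:

```python
from collections import defaultdict

# Hypothetical toy data: each word token is a vector of two factors,
# (surface form, part-of-speech tag).
corpus = [
    ("the", "DET"), ("cat", "NOUN"), ("sat", "VERB"),
    ("the", "DET"), ("dog", "NOUN"), ("sat", "VERB"),
]

# Count contexts (w_{i-2}, w_{i-1}, t_{i-1}) and full events that
# additionally include the predicted word w_i.
context_counts = defaultdict(int)
event_counts = defaultdict(int)
for i in range(2, len(corpus)):
    w2, w1, t1 = corpus[i - 2][0], corpus[i - 1][0], corpus[i - 1][1]
    wi = corpus[i][0]
    context_counts[(w2, w1, t1)] += 1
    event_counts[(w2, w1, t1, wi)] += 1

def p(wi, w2, w1, t1):
    """Maximum-likelihood estimate of P(w_i | w_{i-2}, w_{i-1}, t_{i-1})."""
    c = context_counts[(w2, w1, t1)]
    return event_counts[(w2, w1, t1, wi)] / c if c else 0.0

print(p("sat", "the", "cat", "NOUN"))  # 1.0 in this toy corpus
```

The point of the factored form is that the context mixes factors of different types (here word tokens and a tag), rather than consisting only of previous words as in a plain n-gram model.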
A major advantage of factored language models is that they allow users to specify linguistic knowledge such as the relationship between word tokens and [[Part of speech]] in English, or morphological information (stems, root, etc.) in Arabic.
Like [[N-gram]] models, smoothing techniques are necessary in parameter estimation. In particular, generalized back-off is used in training an FLM.
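The idea of backing off can be sketched as follows. This is a simplified, hypothetical illustration with made-up counts: when the full context has been seen often enough, a discounted maximum-likelihood estimate is used; otherwise one parent is dropped and the procedure recurses. In generalized back-off proper, the choice of ''which'' parent to drop is governed by a back-off graph over the factors; here the most distant parent is simply dropped first:

```python
from collections import defaultdict

# Hypothetical event counts, keyed by (parents..., child).
counts = defaultdict(int)
counts[("the", "cat", "sat")] = 3  # c(w_{i-2}=the, w_{i-1}=cat, w_i=sat)
counts[("the", "cat")] = 3         # c(w_{i-2}=the, w_{i-1}=cat)
counts[("cat", "sat")] = 4         # c(w_{i-1}=cat, w_i=sat)
counts[("cat",)] = 5               # c(w_{i-1}=cat)
counts[("sat",)] = 4               # c(w_i=sat)
counts[()] = 20                    # total number of tokens

def backoff_prob(child, parents, threshold=2, discount=0.8):
    """Sketch of back-off: use a discounted ML estimate when the full
    context is frequent enough; otherwise drop a parent and recurse.
    The threshold and flat discount are illustrative only."""
    if not parents:
        return counts[(child,)] / counts[()]
    full = tuple(parents) + (child,)
    if counts[full] >= threshold:
        return discount * counts[full] / counts[tuple(parents)]
    return backoff_prob(child, parents[1:], threshold, discount)

print(backoff_prob("sat", ("the", "cat")))  # frequent context: 0.8
print(backoff_prob("sat", ("the", "dog")))  # unseen context, backs off: 0.2
```

A flat discount as used here does not yield a properly normalized distribution; practical FLM training uses principled discounting (e.g. Kneser-Ney-style methods) and may combine several back-off paths in parallel.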