* latent variables: the alignments <math>\{a^{(k)}\}_k</math>
In this form, this is exactly the kind of problem solved by the [[expectation–maximization algorithm]]. Due to the simplistic assumptions, the algorithm has a closed-form, efficiently computable solution. For a detailed derivation of the algorithm, see <ref name=":0">{{Cite book |last=Koehn |first=Philipp |url=https://books.google.com/books?id=4v_Cx1wIMLkC&newbks=0&hl=en |title=Statistical Machine Translation |date=2010 |publisher=Cambridge University Press |isbn=978-0-521-87415-1 |language=en |chapter=4. Word-Based Models}}</ref> chapter 4 and <ref>{{Cite web |title=CS288, Spring 2020, Lecture 05: Statistical Machine Translation |url=https://cal-cs288.github.io/sp20/slides/cs288_sp20_05_statistical_translation_1up.pdf |url-status=live |archive-url=https://web.archive.org/web/20201024011801/https://cal-cs288.github.io/sp20/slides/cs288_sp20_05_statistical_translation_1up.pdf |archive-date=24 Oct 2020}}</ref>.
In short, the EM algorithm goes as follows:<blockquote>INPUT. a corpus of English-foreign sentence pairs <math>\{(e^{(k)}, f^{(k)})\}_k</math>
=== Limitations ===
There are several limitations to the IBM model 1.<ref name=":0" />
* No fluency: Given any sentence pair <math>(e, f)</math>, any reordering of the English sentence is equally likely: <math>p(e|f) = p(e'|f)</math> for any permutation <math>e'</math> of the English sentence <math>e</math>.
* No length preference: All translation lengths are equally likely: <math>\sum_{e\text{ has length }l}p(e|f) = \frac 1N</math> for each <math>l \in \{1, 2, ..., N\}</math>.
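The no-fluency limitation can be checked directly: with the alignments summed out, <math>p(e|f) = \frac{\epsilon}{(l_f+1)^{l_e}} \prod_i \sum_j t(e_i|f_j)</math>, which depends on <math>e</math> only through the multiset of its words. A minimal Python sketch (the probability table <code>t</code> is made up purely for illustration):

```python
from itertools import permutations
from math import prod

def model1_likelihood(e_sent, f_sent, t, eps=1.0):
    """p(e|f) under IBM Model 1, alignments summed out in closed form:
    eps / (l_f + 1)^{l_e} * prod_i sum_j t(e_i | f_j)."""
    f_sent = ["NULL"] + list(f_sent)  # NULL token permits unaligned English words
    return (eps / len(f_sent) ** len(e_sent)) * prod(
        sum(t.get((f, e), 0.0) for f in f_sent) for e in e_sent)

# Hypothetical lexical translation probabilities t(e|f).
t = {("das", "the"): 0.7, ("das", "house"): 0.1, ("NULL", "the"): 0.1,
     ("Haus", "house"): 0.8, ("Haus", "the"): 0.1, ("NULL", "house"): 0.05}

f = ["das", "Haus"]
# Every permutation of the English words receives the same score.
scores = [model1_likelihood(list(p), f, t) for p in permutations(["the", "house"])]
```

Because the product over English positions is commutative, both orderings of "the house" get identical probability, which is exactly the fluency failure described above.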
The model is also weak at reordering and at adding or dropping words. In most language pairs, words that follow each other in one language appear in a different order after translation, but IBM Model 1 treats all reorderings as equally likely.
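The EM updates for the model have a closed form, and the whole training loop fits in a few lines. The following is an illustrative toy implementation (the corpus and variable names are invented for this sketch), not the reference algorithm from the cited sources:

```python
from collections import defaultdict

def train_ibm1(corpus, iterations=10):
    """EM training of IBM Model 1 lexical probabilities t(e|f).

    corpus: list of (english_words, foreign_words) pairs. A NULL token is
    prepended to each foreign sentence so English words may stay unaligned."""
    t = defaultdict(lambda: 1.0)  # uniform (unnormalised) initialisation
    for _ in range(iterations):
        count = defaultdict(float)  # expected counts c(f, e)
        total = defaultdict(float)  # expected counts c(f)
        for e_sent, f_sent in corpus:
            f_sent = ["NULL"] + f_sent
            for e in e_sent:
                # E-step: posterior over alignments of e, in closed form.
                z = sum(t[(f, e)] for f in f_sent)
                for f in f_sent:
                    p = t[(f, e)] / z
                    count[(f, e)] += p
                    total[f] += p
        # M-step: renormalise the expected counts.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[f]
    return t

corpus = [(["the", "house"], ["das", "Haus"]),
          (["the", "book"], ["das", "Buch"]),
          (["a", "book"], ["ein", "Buch"])]
t = train_ibm1(corpus)
```

Even on this three-sentence toy corpus, co-occurrence statistics pull the table in the right direction: "das" comes to prefer "the" over "book", and "Buch" prefers "book" over "the".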