Revision as of 18:10, 10 February 2025 by Closed Limelike Curves (edit summary: hatnote; Tag: Visual edit). Latest revision as of 22:52, 25 March 2025 by Cosmia Nebula (edit summary: →top; Tag: Visual edit).
The '''IBM alignment models''' are a sequence of increasingly complex models used in [[statistical machine translation]] to train a translation model and an alignment model, starting with lexical translation probabilities and moving to reordering and word duplication.<ref name=":1">{{Cite journal |last1=Brown |first1=Peter F. |author-link1=Peter Fitzhugh Brown |last2=Pietra |first2=Vincent J. Della |last3=Pietra |first3=Stephen A. Della |last4=Mercer |first4=Robert L. |author-link4=Robert Mercer |date=1993-06-01 |title=The mathematics of statistical machine translation: parameter estimation |url=https://dl.acm.org/doi/10.5555/972470.972474 |journal=Comput. Linguist. |volume=19 |issue=2 |pages=263–311 |issn=0891-2017}}</ref><ref>{{cite web | url = http://www.statmt.org/survey/Topic/IBMModels | title = IBM Models | date = 11 September 2015 | publisher = SMT Research Survey Wiki | access-date = 26 October 2015}}</ref> They underpinned the majority of statistical machine translation systems for almost twenty years starting in the early 1990s, until [[neural machine translation]] began to dominate. These models offer a principled probabilistic formulation and (mostly) tractable inference.<ref>{{cite web |author=Yarin Gal |author2=Phil Blunsom |date=12 June 2013 |title=A Systematic Bayesian Treatment of the IBM Alignment Models |url=http://mlg.eng.cam.ac.uk/yarin/PDFs/PY-IBM_presentation.pdf |archive-url=https://web.archive.org/web/20160304071924/http://mlg.eng.cam.ac.uk/yarin/PDFs/PY-IBM_presentation.pdf |archive-date=4 Mar 2016 |access-date=26 October 2015 |publisher=University of Cambridge}}</ref> The IBM alignment models were published in parts in 1988<ref>{{Cite journal |last1=Brown |first1=P. |last2=Cocke |first2=J. |last3=Della Pietra |first3=S. |last4=Della Pietra |first4=V. |last5=Jelinek |first5=F. |last6=Mercer |first6=R. |last7=Roossin |first7=P. |date=1988 |title=A Statistical Approach to Language Translation |url=https://aclanthology.org/C88-1016/ |journal=Coling Budapest 1988 Volume 1: International Conference on Computational Linguistics}}</ref> and 1990,<ref>{{Cite journal |last1=Brown |first1=Peter F. |last2=Cocke |first2=John |last3=Della Pietra |first3=Stephen A. |last4=Della Pietra |first4=Vincent J. |last5=Jelinek |first5=Frederick |last6=Lafferty |first6=John D. |last7=Mercer |first7=Robert L. |last8=Roossin |first8=Paul S. |date=1990 |title=A Statistical Approach to Machine Translation |url=https://aclanthology.org/J90-2002/ |journal=Computational Linguistics |volume=16 |issue=2 |pages=79–85}}</ref> and the entire series was published in 1993.<ref name=":1" /> Every author of the 1993 paper subsequently joined the hedge fund [[Renaissance Technologies]].<ref>{{Cite web |last=walutowyjohn |date=2013-01-28 |title=A Visionary Gift: Della Pietra Family Endows Biomedical Imaging Chair - SBU News |url=https://news.stonybrook.edu/alumni/a-visionary-gift-della-pietra-family-endows-biomedical-imaging-chair-2/ |access-date=2025-01-06 |website=Stony Brook University News |language=en-US}}</ref> The original work on statistical machine translation at [[IBM]] proposed five models, and a sixth model (Model 6) was proposed later. The sequence of the six models can be summarized as:

IBM alignment models: Difference between revisions