* observable variables: the English sentences <math>\{e^{(k)}\}_k</math>.
* latent variables: the alignments <math>\{a^{(k)}\}_k</math>.
In this form, this is exactly the kind of problem solved by the [[expectation–maximization algorithm]]. Because of the model's simplifying assumptions, each iteration has a closed-form, efficiently computable solution, obtained by solving:<math display="block">
\begin{cases}
\max_{t'} \sum_k \sum_i \sum_{a^{(k)}} t(a^{(k)} \mid e^{(k)}, f^{(k)}) \ln t'(e_i^{(k)} \mid f_{a^{(k)}(i)}^{(k)}) \\
\sum_e t'(e \mid f) = 1 \quad \text{for each foreign word } f
\end{cases}
</math>This can be solved by [[Lagrange multiplier|Lagrange multipliers]], then simplified. For a detailed derivation of the algorithm, see chapter 4 of <ref name=":0">{{Cite book |last=Koehn |first=Philipp |url=https://books.google.com/books?id=4v_Cx1wIMLkC&newbks=0&hl=en |title=Statistical Machine Translation |date=2010 |publisher=Cambridge University Press |isbn=978-0-521-87415-1 |language=en |chapter=4. Word-Based Models}}</ref> and <ref>{{Cite web |title=CS288, Spring 2020, Lecture 05: Statistical Machine Translation |url=https://cal-cs288.github.io/sp20/slides/cs288_sp20_05_statistical_translation_1up.pdf |url-status=live |archive-url=https://web.archive.org/web/20201024011801/https://cal-cs288.github.io/sp20/slides/cs288_sp20_05_statistical_translation_1up.pdf |archive-date=24 Oct 2020}}</ref>.
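The closed-form iteration above can be sketched in code. The following is a minimal illustration of EM for the Model 1 lexical translation table <math>t(e \mid f)</math>: the E-step computes the posterior alignment weight for each word pair (which factorizes per position under the model's independence assumptions), and the M-step re-normalizes the expected counts. Function and variable names here are illustrative, not taken from the cited sources.

```python
from collections import defaultdict

def ibm_model1_em(pairs, iterations=10):
    """EM for IBM Model 1 translation probabilities t(e|f).

    pairs: list of (english_tokens, foreign_tokens) sentence pairs.
    Returns a dict mapping (e, f) -> t(e|f).
    """
    # Uniform initialization over the English vocabulary.
    e_vocab = {e for es, _ in pairs for e in es}
    t = defaultdict(lambda: 1.0 / len(e_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected counts c(e, f)
        total = defaultdict(float)  # expected counts c(f)
        for es, fs in pairs:
            for e in es:
                # E-step: posterior over the aligned foreign word,
                # normalized over the candidate foreign words.
                z = sum(t[(e, f)] for f in fs)
                for f in fs:
                    delta = t[(e, f)] / z
                    count[(e, f)] += delta
                    total[f] += delta
        # M-step: re-normalize expected counts into t(e|f).
        t = defaultdict(float,
                        {(e, f): c / total[f] for (e, f), c in count.items()})
    return t
```

On a toy corpus such as {("the house", "das haus"), ("the book", "das buch"), ("a book", "ein buch")}, the iteration concentrates probability mass on the consistent pairs (e.g. "the"/"das", "book"/"buch"), mirroring the worked example in Koehn's chapter 4.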