Paraphrasing (computational linguistics): Difference between revisions

Paraphrases can also be generated through the use of [[statistical machine translation#Phrase-based translation|phrase-based translation]], as proposed by Bannard and Callison-Burch.<ref name=Bannard>{{cite conference|last1=Bannard|first1=Colin|last2=Callison-Burch|first2=Chris|title=Paraphrasing with Bilingual Parallel Corpora|booktitle=Proceedings of the 43rd Annual Meeting of the ACL|place=Ann Arbor, Michigan|pages=597-604|year=2005|url=https://dl.acm.org/citation.cfm?id=1219914}}</ref> The central idea is to align phrases with their translations in a pivot language and then use those translations to find potential paraphrases in the original language. For example, the phrase "under control" in an English sentence is aligned with the phrase "unter kontrolle" in its German counterpart. The phrase "unter kontrolle" is then found in another German sentence, where the aligned English phrase is "in check", a paraphrase of "under control".
 
The probability distribution can be modeled as <math>\Pr(e_2 | e_1)</math>, the probability that the phrase <math>e_2</math> is a paraphrase of <math>e_1</math>, which is equivalent to <math>\Pr(e_2|f) \Pr(f|e_1)</math> summed over all <math>f</math>, the potential translations of the phrase in the pivot language. Additionally, the sentence <math>S</math> containing <math>e_1</math> is added as a prior to give the paraphrase context. Thus the optimal paraphrase, <math>\hat{e_2}</math>, can be modeled as:
 
: <math>\hat{e_2} = \text{arg} \max_{e_2 \neq e_1} \Pr(e_2 | e_1, S) = \text{arg} \max_{e_2 \neq e_1} \sum_f \Pr(e_2 | f, S) \Pr(f | e_1, S)</math>
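
The pivot computation can be illustrated with a minimal Python sketch. The phrase tables, their probability values, and the function names <code>paraphrase_probs</code> and <code>best_paraphrase</code> are hypothetical and chosen only for illustration; they are not taken from Bannard and Callison-Burch. For simplicity the sketch omits the sentence context <math>S</math> and computes only <math>\sum_f \Pr(e_2 | f) \Pr(f | e_1)</math>.

```python
from collections import defaultdict

# Hypothetical toy phrase tables (illustrative values, not from the paper).
# P_F_GIVEN_E[e][f] = Pr(f | e): English phrase -> German pivot phrase.
P_F_GIVEN_E = {
    "under control": {"unter kontrolle": 0.8, "im griff": 0.2},
}
# P_E_GIVEN_F[f][e] = Pr(e | f): German pivot phrase -> English phrase.
P_E_GIVEN_F = {
    "unter kontrolle": {"under control": 0.7, "in check": 0.3},
    "im griff": {"under control": 0.5, "in hand": 0.5},
}


def paraphrase_probs(e1):
    """Return Pr(e2 | e1) = sum over f of Pr(e2 | f) * Pr(f | e1), for e2 != e1."""
    scores = defaultdict(float)
    for f, p_f in P_F_GIVEN_E.get(e1, {}).items():
        for e2, p_e2 in P_E_GIVEN_F.get(f, {}).items():
            if e2 != e1:  # a phrase is not counted as its own paraphrase
                scores[e2] += p_e2 * p_f
    return dict(scores)


def best_paraphrase(e1):
    """Return the candidate with the highest score, or None if there is none."""
    scores = paraphrase_probs(e1)
    return max(scores, key=scores.get) if scores else None


if __name__ == "__main__":
    print(paraphrase_probs("under control"))
    print(best_paraphrase("under control"))
```

With these toy tables, "in check" scores higher than "in hand" because it is reached through the more probable pivot phrase. A context-aware version would condition both factors on <math>S</math>, as in the formula above.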