Paraphrasing (computational linguistics): Difference between revisions

Expanded machine translation section
Line 20:
The probability distribution can be modeled as <math>\Pr(e_2 | e_1)</math>, the probability that phrase <math>e_2</math> is a paraphrase of <math>e_1</math>, which is obtained by summing <math>\Pr(e_2|f) \Pr(f|e_1)</math> over all <math>f</math>, the candidate translations of <math>e_1</math> in the pivot language. Additionally, the sentence <math>S</math> in which <math>e_1</math> occurs is added as a prior to give the paraphrase context. Thus the optimal paraphrase, <math>\hat{e_2}</math>, can be calculated as:
 
: <math>\hat{e_2} = \text{arg} \max_{e_2 \neq e_1} \Pr(e_2 | e_1, S) = \text{arg} \max_{e_2 \neq e_1} \sum_f \Pr(e_2 | f, S) \Pr(f | e_1, S)</math>
 
<math>\Pr(e_2|f)</math> and <math>\Pr(f|e_1)</math> can be approximated by their relative frequencies among the aligned phrase pairs. Adding <math>S</math> as a prior is modeled by calculating the probability of forming <math>S</math> when <math>e_1</math> is substituted with <math>e_2</math>.
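
The pivot computation can be illustrated with a minimal Python sketch. It assumes the phrase alignments are already available as (English phrase, pivot phrase) pairs, estimates both conditional probabilities as relative frequencies, and omits the sentence prior <math>S</math> entirely. The function names and the toy alignment counts are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def paraphrase_scores(aligned_pairs, e1):
    """Score candidates e2 by sum over f of Pr(e2 | f) * Pr(f | e1).

    aligned_pairs: iterable of (english_phrase, pivot_phrase) alignments.
    The sentence prior S from the text is NOT modeled here (assumption).
    """
    pair_counts = Counter(aligned_pairs)
    english_totals = Counter()
    pivot_totals = Counter()
    for (e, f), count in pair_counts.items():
        english_totals[e] += count
        pivot_totals[f] += count

    scores = defaultdict(float)
    for (e, f), count in pair_counts.items():
        if e != e1:
            continue
        # Relative-frequency estimate of Pr(f | e1).
        p_f_given_e1 = count / english_totals[e1]
        for (e2, f2), count2 in pair_counts.items():
            # Only phrases sharing the pivot f count; e2 = e1 is excluded.
            if f2 == f and e2 != e1:
                # Relative-frequency estimate of Pr(e2 | f).
                scores[e2] += (count2 / pivot_totals[f]) * p_f_given_e1
    return dict(scores)


def best_paraphrase(aligned_pairs, e1):
    """Return the arg max over e2 != e1, or None if no candidate exists."""
    scores = paraphrase_scores(aligned_pairs, e1)
    return max(scores, key=scores.get) if scores else None


# Toy alignment counts, invented for illustration only.
toy_pairs = (
    [("under control", "unter kontrolle")] * 3
    + [("in check", "unter kontrolle")] * 1
    + [("under control", "in den griff")] * 1
    + [("in check", "in den griff")] * 1
    + [("controlled", "in den griff")] * 2
)

toy_scores = paraphrase_scores(toy_pairs, "under control")
toy_best = best_paraphrase(toy_pairs, "under control")
```

With these toy counts, "in check" scores 0.25 and "controlled" scores 0.125, so "in check" is selected. A fuller version would additionally weight each candidate by the probability of the sentence formed after substitution, for example with a language model.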
 
=== Autoencoders ===