Paraphrasing (computational linguistics): Difference between revisions

While originally used to evaluate machine translations, [[BLEU]] has been used successfully to evaluate paraphrase generation models as well. However, a sentence often has several lexically different but equally valid paraphrases, which penalizes correct output under BLEU and other metrics based on overlap with a reference.<ref name=Chen>{{cite conference|last1=Chen|first1=David|last2=Dolan|first2=William|title=Collecting Highly Parallel Data for Paraphrase Evaluation|booktitle=Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies|place=Portland, Oregon|year=2011|pages=190-200|url=https://dl.acm.org/citation.cfm?id=2002497}}</ref>
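
The effect can be illustrated with a minimal sketch of clipped n-gram precision, the overlap count at the core of BLEU. This is not full BLEU (it omits the brevity penalty and the geometric mean over n-gram orders), and the function names and example sentences are hypothetical, chosen only for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against a single reference.

    Simplified illustration only: whitespace tokenization, one reference,
    no brevity penalty, no averaging over n-gram orders.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Hypothetical example sentences.
reference = "the cat sat on the mat"
paraphrase = "a feline was sitting on the rug"  # valid paraphrase, different wording
near_copy = "the cat sat on a mat"              # barely changed from the reference

for name, sentence in [("paraphrase", paraphrase), ("near_copy", near_copy)]:
    print(name, ngram_precision(sentence, reference, 1), ngram_precision(sentence, reference, 2))
```

In this sketch the near-copy scores far higher than the genuine paraphrase on both unigrams and bigrams, even though the paraphrase is the more useful output for a paraphrase generation system.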
 
Metrics specifically designed to evaluate paraphrase generation include PEM,<ref name=Liu>{{cite conference|last1=Liu|first1=Chang|last2=Dahlmeier|first2=Daniel|last3=Ng|first3=Hwee Tou|title=PEM: A Paraphrase Evaluation Metric Exploiting Parallel Texts|booktitle=Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing|place=MIT, Massachusetts|year=2010|pages=923-932|url=http://www.aclweb.org/anthology/D10-1090}}</ref>
ParaMetric,<ref name=Burch2>{{cite conference|last1=Callison-Burch|first1=Chris|last2=Cohn|first2=Trevor|last3=Lapata|first3=Mirella|title=ParaMetric: An Automatic Evaluation Metric for Paraphrasing|booktitle=Proceedings of the 22nd International Conference on Computational Linguistics|place=Manchester|year=2008|pages=97-104|url=https://pdfs.semanticscholar.org/be0d/0df960833c1bea2a39ba9a17e5ca958018cd.pdf}}</ref>
and PINC.<ref name=Chen /> PEM (paraphrase evaluation metric) attempts to evaluate the "adequacy, fluency, and lexical dissimilarity" of paraphrases by returning a single-value heuristic calculated from [[n-gram|N-gram]] overlap in a pivot language. However, a major drawback of PEM is that it must be trained using large, in-domain parallel corpora as well as human judgments.<ref name=Chen /> In other words, it is tantamount to training a paraphrase recognition system in order to evaluate a paraphrase generation system.
 
== References ==