Paraphrasing (computational linguistics): Difference between revisions

Metrics specifically designed to evaluate paraphrase generation include paraphrase in n-gram change (PINC)<ref name=Chen /> and paraphrase evaluation metric (PEM),<ref name=Liu>{{cite conference|last1=Liu|first1=Chang|last2=Dahlmeier|first2=Daniel|last3=Ng|first3=Hwee Tou|title=PEM: A Paraphrase Evaluation Metric Exploiting Parallel Texts |conference=Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing |place=MIT, Massachusetts |year=2010 |pages=923–932 |url=http://www.aclweb.org/anthology/D10-1090}}</ref> along with the aforementioned ParaMetric. PINC is designed to be used with BLEU and to help cover its inadequacies. Since BLEU has difficulty measuring lexical dissimilarity, PINC measures the lack of n-gram overlap between a source sentence and a candidate paraphrase. It is essentially the [[Jaccard index|Jaccard distance]] between the sentences, excluding n-grams that appear in the source sentence to maintain some semantic equivalence. PEM, on the other hand, attempts to evaluate the "adequacy, fluency, and lexical dissimilarity" of paraphrases by returning a single-value heuristic calculated using [[N-gram]] overlap in a pivot language. However, a major drawback of PEM is that it must be trained using large, in-domain parallel corpora and human judges.<ref name=Chen /> This is equivalent to training a paraphrase recognition system in order to evaluate a paraphrase generation system.
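
As an illustration, a PINC-style score can be sketched as the average, over n-gram orders, of the fraction of candidate n-grams that do not occur in the source sentence. The sketch below is a minimal example and not a reference implementation: the function names, the whitespace tokenization, the lowercasing, the default maximum order of 4, and the choice to skip orders longer than the candidate are all assumptions made here for illustration.

```python
def ngrams(tokens, n):
    """Return the set of distinct n-grams (as tuples) in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def pinc(source, candidate, max_n=4):
    """PINC-style score in [0, 1]; higher means less n-gram overlap.

    Assumptions (not from the article): whitespace tokenization,
    lowercasing, and skipping orders longer than the candidate.
    """
    src = source.lower().split()
    cand = candidate.lower().split()
    scores = []
    for n in range(1, max_n + 1):
        cand_ngrams = ngrams(cand, n)
        if not cand_ngrams:
            continue  # candidate has fewer than n tokens
        overlap = len(ngrams(src, n) & cand_ngrams)
        # Fraction of candidate n-grams that are new relative to the source.
        scores.append(1.0 - overlap / len(cand_ngrams))
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    print(pinc("the cat sat on the mat", "the cat lay on the rug"))
```

A candidate identical to the source scores 0, and a candidate sharing no words with the source scores 1. Because a high score says nothing about whether meaning is preserved, such a score would be read alongside BLEU, as described above.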
 
The Quora Question Pairs Dataset, which contains hundreds of thousands of duplicate questions, has become a common dataset for the evaluation of paraphrase detectors.<ref>{{cite web |title=Paraphrase Identification on Quora Question Pairs |url=https://paperswithcode.com/sota/paraphrase-identification-on-quora-question|website=Papers with Code}}</ref> Consistently reliable paraphrase detectors have all used the Transformer architecture, and all have relied on large amounts of pre-training with more general data before fine-tuning with the question pairs.
 
== See also ==