Metrics specifically designed to evaluate paraphrase generation include paraphrase in n-gram change (PINC)<ref name=Chen /> and paraphrase evaluation metric (PEM)<ref name=Liu>{{cite conference|last1=Liu|first1=Chang|last2=Dahlmeier|first2=Daniel|last3=Ng|first3=Hwee Tou|title=PEM: A Paraphrase Evaluation Metric Exploiting Parallel Texts|book-title=Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing|place=MIT, Massachusetts|year=2010|pages=923–932|url=http://www.aclweb.org/anthology/D10-1090}}</ref> along with the aforementioned ParaMetric. PINC is designed to be used in conjunction with BLEU to help cover its inadequacies. Since BLEU has difficulty measuring lexical dissimilarity, PINC measures the lack of n-gram overlap between a source sentence and a candidate paraphrase. It is essentially the [[Jaccard index|Jaccard distance]] between the two sentences, excluding n-grams that appear in the source sentence, so that some semantic equivalence is maintained. PEM, on the other hand, attempts to evaluate the "adequacy, fluency, and lexical dissimilarity" of paraphrases by returning a single-value heuristic calculated using [[n-gram]] overlap in a pivot language. A large drawback of PEM, however, is that it must be trained using a large, in-domain parallel corpus as well as human judges.<ref name=Chen /> In other words, it is tantamount to training a paraphrase recognition system in order to evaluate a paraphrase generation system.
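As a rough illustration of the computation described above, the following is a minimal sketch of PINC in Python. The whitespace tokenization, the maximum n-gram order, and all function and variable names are illustrative assumptions, not details taken from the cited paper.

<syntaxhighlight lang="python">
def ngrams(tokens: list, n: int) -> set:
    """Return the set of n-grams of order n from a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def pinc(source: str, candidate: str, max_n: int = 4) -> float:
    """Average, over n-gram orders 1..max_n, of the fraction of
    candidate n-grams that do NOT appear in the source sentence.
    Higher scores indicate greater lexical dissimilarity."""
    src_tokens = source.lower().split()
    cand_tokens = candidate.lower().split()
    scores = []
    for n in range(1, max_n + 1):
        cand_ngrams = ngrams(cand_tokens, n)
        if not cand_ngrams:
            continue  # candidate is shorter than n tokens
        overlap = len(cand_ngrams & ngrams(src_tokens, n))
        scores.append(1.0 - overlap / len(cand_ngrams))
    return sum(scores) / len(scores) if scores else 0.0


# A heavily reworded candidate scores close to 1.
print(pinc("the cat sat on the mat", "a feline rested upon the rug"))
</syntaxhighlight>

A score near 1 rewards rewording, which is why PINC is paired with BLEU: BLEU checks that meaning is preserved while PINC checks that the wording has actually changed.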
== See also ==