{{short description|Automatic generation or recognition of paraphrased text}}
{{about|automated generation and recognition of paraphrases||Paraphrase (disambiguation)}}
'''Paraphrase''' or '''paraphrasing''' in [[computational linguistics]] is the [[natural language processing]] task of detecting and generating [[paraphrase]]s. Applications of paraphrasing are varied including information retrieval, [[question answering]], [[Automatic summarization|text summarization]], and [[plagiarism detection]].<ref name=Socher /> Paraphrasing is also useful in the [[evaluation of machine translation]],<ref name=Callison>{{cite conference |last=Callison-Burch |first=Chris |title=Syntactic Constraints on Paraphrases Extracted from Parallel Corpora |conference=EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing |date=October 25–27, 2008 |place=Honolulu, Hawaii |pages=196–205|url=https://dl.acm.org/citation.cfm?id=1613743}}</ref> as well as [[semantic parsing]]<ref>Berant, Jonathan, and Percy Liang. "[http://www.aclweb.org/anthology/P14-1133 Semantic parsing via paraphrasing]." Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Vol. 1. 2014.</ref> and [[natural language generation|generation]]<ref>{{Cite journal |last=Wahle |first=Jan Philip |last2=Ruas |first2=Terry |last3=Kirstein |first3=Frederic |last4=Gipp |first4=Bela |date=2022 |title=How Large Language Models are Transforming Machine-Paraphrase Plagiarism |journal=Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing |___location=Online and Abu Dhabi, United Arab Emirates}}</ref> of new samples to expand existing [[Text corpus|corpora]].<ref name=Barzilay />
== Paraphrase generation ==
Since paraphrases convey the same meaning, their skip-thought vectors should be similar. A simple [[logistic regression]] can therefore be trained to good performance, using the absolute difference and the component-wise product of the two skip-thought vectors as input features.
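The feature construction described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the random vectors stand in for skip-thought encodings, which in a real system would come from a trained sentence encoder.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-ins for skip-thought encodings (hypothetical toy data): paraphrase
# pairs get nearby vectors, non-paraphrase pairs get more distant ones.
def encode_pair(paraphrase: bool, dim: int = 16):
    u = rng.normal(size=dim)
    v = u + rng.normal(scale=0.1 if paraphrase else 1.0, size=dim)
    return u, v

def pair_features(u, v):
    # The two features named in the text: absolute difference and
    # component-wise product of the two sentence vectors.
    return np.concatenate([np.abs(u - v), u * v])

# Build a toy training set of paraphrase (1) and non-paraphrase (0) pairs.
X, y = [], []
for label in (1, 0) * 100:
    u, v = encode_pair(bool(label))
    X.append(pair_features(u, v))
    y.append(label)

clf = LogisticRegression(max_iter=1000).fit(np.array(X), np.array(y))

# Score a new pair with the same feature map.
u, v = encode_pair(paraphrase=True)
prob = clf.predict_proba(pair_features(u, v).reshape(1, -1))[0, 1]
```

The absolute-difference features shrink toward zero for paraphrase pairs, so even a linear classifier separates the two classes well on this toy data.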
=== Transformers ===
Similar to how [[Transformer (machine learning model)|Transformer models]] advanced paraphrase generation, they have proven highly successful at identifying paraphrases. Models such as BERT can be adapted with a [[binary classification]] layer and trained end-to-end on identification tasks.<ref>{{Cite journal |last=Devlin |first=Jacob |last2=Chang |first2=Ming-Wei |last3=Lee |first3=Kenton |last4=Toutanova |first4=Kristina |date=2019 |title=BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding |url=http://aclweb.org/anthology/N19-1423 |journal=Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) |language=en |___location=Minneapolis, Minnesota |publisher=Association for Computational Linguistics |pages=4171–4186 |doi=10.18653/v1/N19-1423}}</ref><ref>{{Citation |last=Wahle |first=Jan Philip |title=Identifying Machine-Paraphrased Plagiarism |date=2022 |url=https://link.springer.com/10.1007/978-3-030-96957-8_34 |work=Information for a Better World: Shaping the Global Future |volume=13192 |pages=393–413 |editor-last=Smits |editor-first=Malte |place=Cham |publisher=Springer International Publishing |language=en |doi=10.1007/978-3-030-96957-8_34 |isbn=978-3-030-96956-1 |access-date=2022-10-06 |last2=Ruas |first2=Terry |last3=Foltýnek |first3=Tomáš |last4=Meuschke |first4=Norman |last5=Gipp |first5=Bela}}</ref> Transformers also transfer between domains and paraphrasing techniques more robustly than traditional machine learning methods such as [[logistic regression]].
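The "binary classification layer" amounts to a linear map from the encoder's pooled sentence-pair representation to two logits, trained with cross-entropy. The sketch below illustrates only that head with NumPy; the vector standing in for BERT's pooled <code>[CLS]</code> output is random toy data, and in real end-to-end training the gradient would also flow back through the entire encoder.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden = 8  # stand-in for the encoder's hidden size (768 in bert-base)

# Stand-in for the pooled [CLS] representation of a sentence pair
# "[CLS] s1 [SEP] s2 [SEP]"; a real system would obtain this from BERT.
cls_vec = rng.normal(size=hidden)
label = 1  # 1 = paraphrase, 0 = not a paraphrase

# The binary classification layer: one linear map to two logits.
W = rng.normal(scale=0.1, size=(2, hidden))
b = np.zeros(2)

def forward(x):
    logits = W @ x + b
    e = np.exp(logits - logits.max())
    return e / e.sum()  # softmax over {not-paraphrase, paraphrase}

# One gradient-descent step on the cross-entropy loss -log p[label].
probs = forward(cls_vec)
grad_logits = probs.copy()
grad_logits[label] -= 1.0  # d(cross-entropy)/d(logits) = p - onehot(label)
W -= 0.1 * np.outer(grad_logits, cls_vec)
b -= 0.1 * grad_logits
```

After the update, the probability assigned to the correct label strictly increases, which is the mechanism the fine-tuning loop repeats over a labeled paraphrase corpus.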
Other successful methods based on the Transformer architecture include using [[Adversarial machine learning|adversarial learning]] and [[Meta learning (computer science)|meta-learning]].<ref>{{Cite journal |last=Nighojkar |first=Animesh |last2=Licato |first2=John |date=2021 |title=Improving Paraphrase Detection with the Adversarial Paraphrasing Task |url=https://aclanthology.org/2021.acl-long.552 |journal=Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) |language=en |___location=Online |publisher=Association for Computational Linguistics |pages=7106–7116 |doi=10.18653/v1/2021.acl-long.552}}</ref><ref>{{Cite journal |last=Dopierre |first=Thomas |last2=Gravier |first2=Christophe |last3=Logerais |first3=Wilfried |date=2021 |title=ProtAugment: Intent Detection Meta-Learning through Unsupervised Diverse Paraphrasing |url=https://aclanthology.org/2021.acl-long.191 |journal=Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers) |language=en |___location=Online |publisher=Association for Computational Linguistics |pages=2454–2466 |doi=10.18653/v1/2021.acl-long.191}}</ref>
== Evaluation ==