{{short description|Automatic generation or recognition of paraphrased text}}
{{about|automated generation and recognition of paraphrases||Paraphrase (disambiguation)}}
'''Paraphrase''' or '''paraphrasing''' in [[computational linguistics]] is the [[natural language processing]] task of detecting and generating [[paraphrase]]s. Applications of paraphrasing are varied, including information retrieval, [[question answering]], [[Automatic summarization|text summarization]], and [[plagiarism detection]].<ref name=Socher /> Paraphrasing is also useful in the [[evaluation of machine translation]],<ref name=Callison>{{cite conference |last=Callison-Burch |first=Chris |title=Syntactic Constraints on Paraphrases Extracted from Parallel Corpora |conference=EMNLP '08 Proceedings of the Conference on Empirical Methods in Natural Language Processing |date=October 25–27, 2008 |place=Honolulu, Hawaii |pages=196–205|url=https://dl.acm.org/citation.cfm?id=1613743}}</ref> as well as in [[semantic parsing]]<ref>Berant, Jonathan, and Percy Liang. "[http://www.aclweb.org/anthology/P14-1133 Semantic parsing via paraphrasing]." Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Vol. 1. 2014.</ref> and [[natural language generation|generation]].
== Paraphrase generation ==
== Evaluation ==
Multiple methods can be used to evaluate paraphrases. Since paraphrase recognition can be posed as a classification problem, most standard evaluation metrics, such as [[accuracy]], [[F1 score]], or a [[receiver operating characteristic|ROC curve]], perform relatively well. However, F1 scores are difficult to calculate because of the trouble of producing a complete list of paraphrases for a given phrase, and because good paraphrases are dependent upon context. A metric designed to counter these problems is ParaMetric.<ref name=Burch2>{{cite conference |last1=Callison-Burch |first1=Chris |last2=Cohn |first2=Trevor |last3=Lapata |first3=Mirella |title=ParaMetric: An Automatic Evaluation Metric for Paraphrasing |conference=Proceedings of the 22nd International Conference on Computational Linguistics |place=Manchester |year=2008 |pages=97–104 |doi=10.3115/1599081.1599094 |s2cid=837398}}</ref>
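As a minimal illustration of the classification framing above, the following sketch scores a hypothetical paraphrase-recognition system with accuracy and F1 over the positive ("is a paraphrase") class; the gold labels and predictions are invented toy data, not from any real system.

```python
# Toy sketch: evaluating paraphrase recognition as binary classification.
# Labels: 1 = the sentence pair is a paraphrase, 0 = it is not.

def accuracy(gold, pred):
    """Fraction of sentence pairs labelled correctly."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def f1_score(gold, pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(g == 1 and p == 1 for g, p in zip(gold, pred))
    fp = sum(g == 0 and p == 1 for g, p in zip(gold, pred))
    fn = sum(g == 1 and p == 0 for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Invented gold labels and system predictions for six sentence pairs.
gold = [1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 1, 0]

print(accuracy(gold, pred))  # 4 of 6 pairs correct
print(f1_score(gold, pred))
```

Note that these metrics only evaluate a fixed set of labelled pairs; they say nothing about paraphrases missing from the test set, which is exactly the coverage problem described above.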
The evaluation of paraphrase generation faces difficulties similar to those of the evaluation of [[machine translation]]. The quality of a paraphrase depends on its context, whether it is being used as a summary, and how it is generated, among other factors. Additionally, a good paraphrase is usually lexically dissimilar from its source phrase. The simplest method of evaluating paraphrase generation is through human judges; unfortunately, evaluation by human judges tends to be time-consuming. Automated evaluation is challenging, as it is essentially a problem as difficult as paraphrase recognition. While originally used to evaluate machine translations, bilingual evaluation understudy ([[BLEU]]) has been used successfully to evaluate paraphrase generation models as well. However, paraphrases often have several lexically different but equally valid solutions, hurting BLEU and other similar evaluation metrics.<ref name=Chen>{{cite conference |last1=Chen |first1=David |last2=Dolan |first2=William |title=Collecting Highly Parallel Data for Paraphrase Evaluation |conference=Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies |place=Portland, Oregon |year=2011 |pages=190–200 |url=https://dl.acm.org/citation.cfm?id=2002497}}</ref>
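The weakness noted above can be seen in a simplified sentence-level BLEU sketch (clipped n-gram precisions up to bigrams with a brevity penalty, rather than the standard 4-gram corpus-level formulation); the example sentences are invented. A near-verbatim paraphrase scores high, while a lexically different but equally valid paraphrase is penalized.

```python
# Simplified sentence-level BLEU (unigrams and bigrams only), illustrating
# how lexical overlap drives the score.
import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=2):
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        precisions.append(overlap / max(sum(cand_counts.values()), 1))
    if min(precisions) == 0:
        return 0.0
    # Brevity penalty: discourages candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

reference = "the film received positive reviews from critics"
near_copy = "the movie received positive reviews from critics"
rephrased = "critics praised the film"  # valid paraphrase, little n-gram overlap

print(bleu(near_copy, reference))  # high: heavy lexical overlap
print(bleu(rephrased, reference))  # low, despite being a good paraphrase
```

The second candidate is a perfectly acceptable paraphrase, yet it scores far lower than the near copy, which is the mismatch between n-gram overlap and paraphrase quality that motivates alternative metrics.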