Paraphrasing (computational linguistics): Difference between revisions

Copy edit as part of article assessment
Added Transformer methods. Since 2018 most of the top-tier publications use a Transformer model.
Line 23:
=== Long short-term memory ===
[[Long short-term memory]] (LSTM) models have been used successfully to generate paraphrases.<ref name=Prakash>{{Citation|last1=Prakash|first1=Aaditya|last2=Hasan|first2=Sadid A.|last3=Lee|first3=Kathy|last4=Datla|first4=Vivek|last5=Qadir|first5=Ashequl|last6=Liu|first6=Joey|last7=Farri|first7=Oladimeji|title=Neural Paraphrase Generation with Stacked Residual LSTM Networks|year=2016|arxiv=1610.03098|bibcode=2016arXiv161003098P}}</ref> The model consists of an encoder and a decoder, both implemented using variations of a stacked [[Vanishing gradient problem#Residual networks|residual]] LSTM. First, the encoding LSTM takes [[one-hot]] encodings of the words in a sentence as input and produces a final hidden vector that represents the input sentence. The decoding LSTM takes this hidden vector as input and generates a new sentence, terminating in an end-of-sentence token. The encoder and decoder are trained to take a phrase and reproduce the one-hot distribution of a corresponding paraphrase by minimizing [[perplexity]] using simple [[stochastic gradient descent]]. New paraphrases are generated by feeding a new phrase to the encoder and passing its output to the decoder.
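
The training objective mentioned above can be illustrated with a minimal Python sketch. It computes perplexity as the exponential of the mean negative log-likelihood of the reference tokens, so minimizing perplexity is equivalent to minimizing the average cross-entropy. The function name and the probability values are assumptions made for illustration and are not taken from the cited paper.

```python
import math


def perplexity(token_probs):
    """Perplexity of a sequence, given the probability the model
    assigned to each token of the reference paraphrase."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Mean negative log-likelihood per token (average cross-entropy).
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


# Hypothetical probabilities a decoder might assign to the tokens
# of a reference paraphrase (illustrative values only).
confident = [0.9, 0.8, 0.95, 0.85]
uncertain = [0.2, 0.1, 0.3, 0.25]

print(perplexity(confident))  # lower: the model fits the reference well
print(perplexity(uncertain))  # higher: the model fits the reference poorly
```

A model that assigns probability 0.5 to every reference token has a perplexity of 2, which can be read as the model being as uncertain as if it chose uniformly between two words at each step.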
 
=== Transformers ===
With the introduction of [[Transformer (machine learning model)|Transformer models]], paraphrase generation approaches improved their ability to generate text by scaling [[neural network]] parameters and heavily parallelizing training through [[Feedforward neural network|feed-forward layers]].<ref>{{Cite journal |last=Zhou |first=Jianing |last2=Bhat |first2=Suma |date=2021 |title=Paraphrase Generation: A Survey of the State of the Art |url=https://aclanthology.org/2021.emnlp-main.414 |journal=Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing |language=en |location=Online and Punta Cana, Dominican Republic |publisher=Association for Computational Linguistics |pages=5075–5086 |doi=10.18653/v1/2021.emnlp-main.414}}</ref> These models can generate text so fluent that human readers often cannot tell whether an example was human-authored or machine-generated.<ref>{{Cite journal |last=Dou |first=Yao |last2=Forbes |first2=Maxwell |last3=Koncel-Kedziorski |first3=Rik |last4=Smith |first4=Noah |last5=Choi |first5=Yejin |date=2022 |title=Is GPT-3 Text Indistinguishable from Human Text? Scarecrow: A Framework for Scrutinizing Machine Text |url=https://aclanthology.org/2022.acl-long.501 |journal=Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) |language=en |location=Dublin, Ireland |publisher=Association for Computational Linguistics |pages=7250–7274 |doi=10.18653/v1/2022.acl-long.501}}</ref> Transformer-based paraphrase generation relies on [[Autoencoder|autoencoding]], [[Autoregressive model|autoregressive]], or [[Seq2seq|sequence-to-sequence]] methods.
Autoencoder models predict word replacement candidates from a probability distribution over the vocabulary, while autoregressive and seq2seq models generate new text conditioned on the source, predicting one word at a time.<ref>{{Cite journal |last=Liu |first=Xianggen |last2=Mou |first2=Lili |last3=Meng |first3=Fandong |last4=Zhou |first4=Hao |last5=Zhou |first5=Jie |last6=Song |first6=Sen |date=2020 |title=Unsupervised Paraphrasing by Simulated Annealing |url=https://www.aclweb.org/anthology/2020.acl-main.28 |journal=Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics |language=en |location=Online |publisher=Association for Computational Linguistics |pages=302–312 |doi=10.18653/v1/2020.acl-main.28}}</ref><ref>{{Cite journal |last=Wahle |first=Jan Philip |last2=Ruas |first2=Terry |last3=Meuschke |first3=Norman |last4=Gipp |first4=Bela |title=Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection |url=https://ieeexplore.ieee.org/document/9651895/ |journal=2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL) |location=Champaign, IL, USA |publisher=IEEE |pages=226–229 |doi=10.1109/JCDL52503.2021.00065 |isbn=978-1-6654-1770-9}}</ref> Other work makes paraphrasing controllable along predefined quality dimensions, such as semantic preservation or lexical diversity.<ref>{{Cite journal |last=Bandel |first=Elron |last2=Aharonov |first2=Ranit |last3=Shmueli-Scheuer |first3=Michal |last4=Shnayderman |first4=Ilya |last5=Slonim |first5=Noam |last6=Ein-Dor |first6=Liat |date=2022 |title=Quality Controlled Paraphrase Generation |url=https://aclanthology.org/2022.acl-long.45 |journal=Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) |language=en |location=Dublin, Ireland |publisher=Association for Computational Linguistics |pages=596–609 |doi=10.18653/v1/2022.acl-long.45}}</ref> Many Transformer-based paraphrase 
generation methods rely on unsupervised learning to leverage large amounts of training data and to scale.<ref>{{Cite journal |last=Lee |first=John Sie Yuen |last2=Lim |first2=Ho Hung |last3=Webster |first3=Carol |date=2022 |title=Unsupervised Paraphrasability Prediction for Compound Nominalizations |url=https://aclanthology.org/2022.naacl-main.237 |journal=Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies |language=en |location=Seattle, United States |publisher=Association for Computational Linguistics |pages=3254–3263 |doi=10.18653/v1/2022.naacl-main.237}}</ref><ref>{{Cite journal |last=Niu |first=Tong |last2=Yavuz |first2=Semih |last3=Zhou |first3=Yingbo |last4=Keskar |first4=Nitish Shirish |last5=Wang |first5=Huan |last6=Xiong |first6=Caiming |date=2021 |title=Unsupervised Paraphrasing with Pretrained Language Models |url=https://aclanthology.org/2021.emnlp-main.417 |journal=Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing |language=en |location=Online and Punta Cana, Dominican Republic |publisher=Association for Computational Linguistics |pages=5136–5150 |doi=10.18653/v1/2021.emnlp-main.417}}</ref>
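
The two decoding styles described in this section can be contrasted with a toy Python sketch. It is not a Transformer: the hand-written probability tables, the function names <code>top_k_candidates</code> and <code>greedy_decode</code>, and the example words are all hypothetical stand-ins for what a trained model would produce. In particular, the toy table conditions only on the previous word, whereas a real autoregressive or seq2seq model conditions on the source and on all previously generated words.

```python
def top_k_candidates(dist, k):
    """Masked-word style: rank replacement candidates for one
    position by their probability and keep the k best."""
    ranked = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)
    return [word for word, _ in ranked[:k]]


def greedy_decode(next_probs, start="<s>", end="</s>", max_len=20):
    """Autoregressive style: emit one word at a time, always taking
    the most probable next word, until the end token appears."""
    tokens = []
    prev = start
    for _ in range(max_len):
        dist = next_probs[prev]
        prev = max(dist, key=dist.get)
        if prev == end:
            break
        tokens.append(prev)
    return tokens


# Hypothetical distribution for the masked slot in "The film was [MASK]."
MASKED_SLOT = {"great": 0.5, "good": 0.3, "excellent": 0.15, "blue": 0.05}

# Hypothetical next-word table standing in for a trained decoder.
NEXT_WORD = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"film": 0.7, "movie": 0.3},
    "a": {"film": 0.5, "movie": 0.5},
    "film": {"was": 0.9, "</s>": 0.1},
    "movie": {"was": 0.9, "</s>": 0.1},
    "was": {"great": 0.8, "good": 0.2},
    "great": {"</s>": 1.0},
    "good": {"</s>": 1.0},
}

print(top_k_candidates(MASKED_SLOT, 2))
print(" ".join(greedy_decode(NEXT_WORD)))
```

Greedy selection is used here only because it is the simplest decoding rule; sampling or beam search could be substituted to obtain more varied paraphrases.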
 
== Paraphrase recognition ==