Revision as of 22:22, 28 April 2022 by Adam.Sudo (Extended confirmed users, 1,104 edits). Edit summary: Copy edit as part of article assessment
Revision as of 11:16, 6 October 2022 by NLPresearch (9 edits). Edit summary: Added Transformer methods. Since 2018 most of the top-tier publications use a Transformer model. Tag: Visual edit
Line 23:
=== Long short-term memory ===
There has been success in using [[long short-term memory]] (LSTM) models to generate paraphrases.<ref name=Prakash>{{Citation\|last1=Prakash\|first1=Aaditya\|last2=Hasan\|first2=Sadid A.\|last3=Lee\|first3=Kathy\|last4=Datla\|first4=Vivek\|last5=Qadir\|first5=Ashequl\|last6=Liu\|first6=Joey\|last7=Farri\|first7=Oladimeji\|title=Neural Paraphrase Generation with Stacked Residual LSTM Networks\|year=2016\|arxiv=1610.03098\|bibcode=2016arXiv161003098P}}</ref> The model consists of an encoder and a decoder, both implemented using variations of a stacked [[Vanishing gradient problem#Residual networks\|residual]] LSTM. First, the encoding LSTM takes a [[one-hot]] encoding of each word in a sentence as input and produces a final hidden vector that represents the input sentence. The decoding LSTM takes this hidden vector as input and generates a new sentence, terminating in an end-of-sentence token. The encoder and decoder are trained to take a phrase and reproduce the one-hot distribution of a corresponding paraphrase by minimizing [[perplexity]] using simple [[stochastic gradient descent]]. New paraphrases are generated by feeding a new phrase to the encoder and passing its output to the decoder.
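The training objective described above can be illustrated with a minimal sketch. It is not the cited authors' implementation: the vocabulary, the predicted distributions, and the helper names `one_hot` and `perplexity` are all hypothetical and chosen only for illustration. The sketch shows that, against a one-hot target, perplexity is the exponential of the average negative log-probability assigned to the reference paraphrase's words.

```python
import math


def one_hot(index, vocab_size):
    """Return a one-hot vector with 1.0 at position `index`."""
    vec = [0.0] * vocab_size
    vec[index] = 1.0
    return vec


def perplexity(predicted, target_ids):
    """Perplexity of the target words under per-step predicted distributions.

    predicted:  one probability distribution over the vocabulary per position.
    target_ids: vocabulary index of the reference paraphrase word per position.
    """
    # Cross-entropy against a one-hot target reduces to -log p(target word).
    nll = [-math.log(dist[t]) for dist, t in zip(predicted, target_ids)]
    return math.exp(sum(nll) / len(nll))


# Hypothetical toy vocabulary and decoder outputs (assumptions, not real data).
vocab = ["<eos>", "the", "cat", "feline", "sleeps"]
target_ids = [vocab.index(w) for w in ["the", "feline", "sleeps", "<eos>"]]
predicted = [
    [0.05, 0.80, 0.05, 0.05, 0.05],
    [0.05, 0.05, 0.40, 0.45, 0.05],
    [0.05, 0.05, 0.05, 0.05, 0.80],
    [0.90, 0.025, 0.025, 0.025, 0.025],
]
ppl = perplexity(predicted, target_ids)
print(f"perplexity of the toy paraphrase: {ppl:.3f}")
```

A model that assigns probability 1 to every reference word reaches the minimum perplexity of 1, while a model that guesses uniformly over a vocabulary of size V has perplexity V.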
=== Transformers ===
With the introduction of [[Transformer (machine learning model)\|Transformer models]], paraphrase generation approaches improved their ability to generate text by scaling the number of [[neural network]] parameters and heavily parallelizing training through [[Feedforward neural network\|feed-forward layers]].<ref>{{Cite journal \|last=Zhou \|first=Jianing \|last2=Bhat \|first2=Suma \|date=2021 \|title=Paraphrase Generation: A Survey of the State of the Art \|url=https://aclanthology.org/2021.emnlp-main.414 \|journal=Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing \|language=en \|location=Online and Punta Cana, Dominican Republic \|publisher=Association for Computational Linguistics \|pages=5075–5086 \|doi=10.18653/v1/2021.emnlp-main.414}}</ref> These models can generate text so fluent that human readers often struggle to tell whether an example was human-authored or machine-generated.<ref>{{Cite journal \|last=Dou \|first=Yao \|last2=Forbes \|first2=Maxwell \|last3=Koncel-Kedziorski \|first3=Rik \|last4=Smith \|first4=Noah \|last5=Choi \|first5=Yejin \|date=2022 \|title=Is GPT-3 Text Indistinguishable from Human Text? Scarecrow: A Framework for Scrutinizing Machine Text \|url=https://aclanthology.org/2022.acl-long.501 \|journal=Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) \|language=en \|location=Dublin, Ireland \|publisher=Association for Computational Linguistics \|pages=7250–7274 \|doi=10.18653/v1/2022.acl-long.501}}</ref> Transformer-based paraphrase generation relies on [[Autoencoder\|autoencoding]], [[Autoregressive model\|autoregressive]], or [[Seq2seq\|sequence-to-sequence]] methods.
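The difference between the autoencoding and autoregressive families named above can be sketched with a toy example. Everything here is an assumption made for illustration: the probability tables are invented, the function names `rank_replacements` and `greedy_decode` are hypothetical, and the lookup table conditions only on the previous word, whereas a trained Transformer conditions on the full source and on all previously generated words.

```python
def rank_replacements(distribution, top_k=2):
    """Autoencoding-style step: rank candidate words for one masked position."""
    ranked = sorted(distribution.items(), key=lambda kv: kv[1], reverse=True)
    return [word for word, _ in ranked[:top_k]]


def greedy_decode(next_word_probs, start="<bos>", end="<eos>", max_len=10):
    """Autoregressive-style loop: emit the most probable next word until `end`."""
    output, prev = [], start
    for _ in range(max_len):
        candidates = next_word_probs[prev]
        prev = max(candidates, key=candidates.get)
        if prev == end:
            break
        output.append(prev)
    return output


# Hypothetical probabilities for the masked word in "the [MASK] sleeps".
masked_distribution = {"cat": 0.55, "feline": 0.30, "dog": 0.10, "car": 0.05}

# Hypothetical lookup table standing in for a trained decoder.
next_word_probs = {
    "<bos>": {"the": 0.9, "a": 0.1},
    "the": {"feline": 0.6, "cat": 0.4},
    "feline": {"rests": 0.7, "sleeps": 0.3},
    "rests": {"<eos>": 1.0},
}

print(rank_replacements(masked_distribution))
print(" ".join(greedy_decode(next_word_probs)))
```

The first function edits a sentence in place by proposing substitutes for a single position, while the second builds a new sentence word by word and stops at the end-of-sentence token. Greedy selection is used here only for brevity; sampling or beam search are common alternatives.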
Autoencoder models predict word replacement candidates with a probability distribution over the vocabulary, while autoregressive and seq2seq models generate new text conditioned on the source, predicting one word at a time.<ref>{{Cite journal \|last=Liu \|first=Xianggen \|last2=Mou \|first2=Lili \|last3=Meng \|first3=Fandong \|last4=Zhou \|first4=Hao \|last5=Zhou \|first5=Jie \|last6=Song \|first6=Sen \|date=2020 \|title=Unsupervised Paraphrasing by Simulated Annealing \|url=https://www.aclweb.org/anthology/2020.acl-main.28 \|journal=Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics \|language=en \|location=Online \|publisher=Association for Computational Linguistics \|pages=302–312 \|doi=10.18653/v1/2020.acl-main.28}}</ref><ref>{{Cite journal \|last=Wahle \|first=Jan Philip \|last2=Ruas \|first2=Terry \|last3=Meuschke \|first3=Norman \|last4=Gipp \|first4=Bela \|title=Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection \|url=https://ieeexplore.ieee.org/document/9651895/ \|journal=2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL) \|location=Champaign, IL, USA \|publisher=IEEE \|pages=226–229 \|doi=10.1109/JCDL52503.2021.00065 \|isbn=978-1-6654-1770-9}}</ref> More advanced efforts also make paraphrasing controllable along predefined quality dimensions, such as semantic preservation or lexical diversity.<ref>{{Cite journal \|last=Bandel \|first=Elron \|last2=Aharonov \|first2=Ranit \|last3=Shmueli-Scheuer \|first3=Michal \|last4=Shnayderman \|first4=Ilya \|last5=Slonim \|first5=Noam \|last6=Ein-Dor \|first6=Liat \|date=2022 \|title=Quality Controlled Paraphrase Generation \|url=https://aclanthology.org/2022.acl-long.45 \|journal=Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) \|language=en \|location=Dublin, Ireland \|publisher=Association for Computational Linguistics \|pages=596–609 \|doi=10.18653/v1/2022.acl-long.45}}</ref> Many Transformer-based paraphrase generation methods rely on unsupervised learning to leverage large amounts of training data and to scale.<ref>{{Cite journal \|last=Lee \|first=John Sie Yuen \|last2=Lim \|first2=Ho Hung \|last3=Webster \|first3=Carol \|date=2022 \|title=Unsupervised Paraphrasability Prediction for Compound Nominalizations \|url=https://aclanthology.org/2022.naacl-main.237 \|journal=Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies \|language=en \|location=Seattle, United States \|publisher=Association for Computational Linguistics \|pages=3254–3263 \|doi=10.18653/v1/2022.naacl-main.237}}</ref><ref>{{Cite journal \|last=Niu \|first=Tong \|last2=Yavuz \|first2=Semih \|last3=Zhou \|first3=Yingbo \|last4=Keskar \|first4=Nitish Shirish \|last5=Wang \|first5=Huan \|last6=Xiong \|first6=Caiming \|date=2021 \|title=Unsupervised Paraphrasing with Pretrained Language Models \|url=https://aclanthology.org/2021.emnlp-main.417 \|journal=Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing \|language=en \|location=Online and Punta Cana, Dominican Republic \|publisher=Association for Computational Linguistics \|pages=5136–5150 \|doi=10.18653/v1/2021.emnlp-main.417}}</ref>
== Paraphrase recognition ==

Paraphrasing (computational linguistics): Difference between revisions