Content deleted Content added
No edit summary |
changed the spelling and added the correct tenses |
||
Line 24:
=== Neural NLP (present) ===
In 2003, [[word n-gram language model|word n-gram model]], at the time the best statistical algorithm, was
In 2010, [[Tomáš Mikolov]] (then a PhD student at [[Brno University of Technology]]) with co-authors applied a simple [[recurrent neural network]] with a single hidden layer to language modelling,<ref>{{cite book |last1=Mikolov |first1=Tomáš |last2=Karafiát |first2=Martin |last3=Burget |first3=Lukáš |last4=Černocký |first4=Jan |last5=Khudanpur |first5=Sanjeev |title=Interspeech 2010 |chapter=Recurrent neural network based language model |journal=Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010 |date=26 September 2010 |pages=1045–1048 |doi=10.21437/Interspeech.2010-343 |s2cid=17048224 |chapter-url=https://gwern.net/doc/ai/nn/rnn/2010-mikolov.pdf |language=en}}</ref> and in the following years he went on to develop [[Word2vec]]. In the 2010s, [[representation learning]] and [[deep learning|deep neural network]]-style (featuring many hidden layers) machine learning methods became widespread in natural language processing. That popularity was due partly to a flurry of results showing that such techniques<ref name="goldberg:nnlp17">{{cite journal |last=Goldberg |first=Yoav |year=2016 |arxiv=1807.10854 |title=A Primer on Neural Network Models for Natural Language Processing |journal=Journal of Artificial Intelligence Research |volume=57 |pages=345–420 |doi=10.1613/jair.4992 |s2cid=8273530 }}</ref><ref name="goodfellow:book16">{{cite book |first1=Ian |last1=Goodfellow |first2=Yoshua |last2=Bengio |first3=Aaron |last3=Courville |url=http://www.deeplearningbook.org/ |title=Deep Learning |publisher=MIT Press |year=2016 }}</ref> can achieve state-of-the-art results in many natural language tasks, e.g., in [[language modeling]]<ref name="jozefowicz:lm16">{{cite book |first1=Rafal |last1=Jozefowicz |first2=Oriol |last2=Vinyals |first3=Mike |last3=Schuster |first4=Noam |last4=Shazeer |first5=Yonghui |last5=Wu |year=2016 |arxiv=1602.02410 |title=Exploring the Limits of Language Modeling |bibcode=2016arXiv160202410J }}</ref> and parsing.<ref name="choe:emnlp16">{{cite journal |first1=Do Kook |last1=Choe |first2=Eugene |last2=Charniak |journal=Emnlp 2016 |url=https://aclanthology.coli.uni-saarland.de/papers/D16-1257/d16-1257 |title=Parsing as Language Modeling |access-date=2018-10-22 |archive-date=2018-10-23 |archive-url=https://web.archive.org/web/20181023034804/https://aclanthology.coli.uni-saarland.de/papers/D16-1257/d16-1257 |url-status=dead }}</ref><ref name="vinyals:nips15">{{cite journal |last1=Vinyals |first1=Oriol |last2=Kaiser |first2=Lukasz |display-authors=1 |journal=Nips2015 |title=Grammar as a Foreign Language |year=2014 |arxiv=1412.7449 |bibcode=2014arXiv1412.7449V |url=https://papers.nips.cc/paper/5635-grammar-as-a-foreign-language.pdf }}</ref> This is increasingly important [[artificial intelligence in healthcare|in medicine and healthcare]], where NLP helps analyze notes and text in [[Electronic health record|electronic health records]] that would otherwise be inaccessible for study when seeking to improve care<ref>{{Cite journal|last1=Turchin|first1=Alexander|last2=Florez Builes|first2=Luisa F.|date=2021-03-19|title=Using Natural Language Processing to Measure and Improve Quality of Diabetes Care: A Systematic Review|journal=Journal of Diabetes Science and Technology|volume=15|issue=3|language=en|pages=553–560|doi=10.1177/19322968211000831|pmid=33736486|pmc=8120048|issn=1932-2968}}</ref> or protect patient privacy.<ref>{{Cite journal |last1=Lee |first1=Jennifer |last2=Yang |first2=Samuel |last3=Holland-Hall |first3=Cynthia |last4=Sezgin |first4=Emre |last5=Gill |first5=Manjot |last6=Linwood |first6=Simon |last7=Huang |first7=Yungui |last8=Hoffman |first8=Jeffrey |date=2022-06-10 |title=Prevalence of Sensitive Terms in Clinical Notes Using Natural Language Processing Techniques: Observational Study |journal=JMIR Medical Informatics |language=en |volume=10 |issue=6 |pages=e38482 |doi=10.2196/38482 |issn=2291-9694 |pmc=9233261 |pmid=35687381 |doi-access=free }}</ref>
Line 54:
=== Neural networks ===
{{Further|Artificial neural network}}
A major drawback of statistical methods is that they require elaborate [[feature engineering]]. Since 2015,<ref>{{Cite web |last=Socher |first=Richard |title=Deep Learning For NLP-ACL 2012 Tutorial |url=https://www.socher.org/index.php/Main/DeepLearningForNLP-ACL2012Tutorial |access-date=2020-08-17 |website=www.socher.org}} This was an early Deep Learning tutorial at the ACL 2012 and met with both interest and (at the time) skepticism by most participants. Until then, neural learning was basically rejected because of its lack of statistical interpretability. Until 2015, deep learning had evolved into the major framework of NLP. [Link is broken, try http://web.stanford.edu/class/cs224n/]</ref> the statistical approach
Intermediate tasks (e.g., part-of-speech tagging and dependency parsing)
[[Neural machine translation]], based on then-newly-invented [[Seq2seq|sequence-to-sequence]] transformations, made obsolete the intermediate steps, such as word alignment, previously necessary for [[statistical machine translation]].
|