{{Short description|Processing of natural language by a computer}}
{{Multiple issues|
{{More citations needed|date=May 2024}}
{{Cleanup rewrite|date=July 2025}}
{{Cleanup reorganize|date=July 2025}}
}}
'''Natural language processing''' ('''NLP''') is the processing of [[natural language]] information by a [[computer]]. The study of NLP, a subfield of [[computer science]], is generally associated with [[artificial intelligence]]. NLP is related to [[information retrieval]], [[knowledge representation]], [[computational linguistics]], and, more broadly, to [[linguistics]].<ref name="nlpintro">
{{cite book |last=Eisenstein |first=Jacob |date=October 1, 2019 |title=Introduction to Natural Language Processing |url=https://mitpress.mit.edu/9780262042840/introduction-to-natural-language-processing/ |___location= |publisher=The MIT Press |page=1 |isbn=9780262042840 |access-date=}}</ref>
 
Major tasks in natural language processing include [[speech recognition]], [[text classification]], [[natural-language understanding|natural language understanding]], and [[natural language generation|natural-language generation]].
 
== History ==
{{See|History of natural language processing}}
 
Natural language processing has its roots in the 1950s.<ref>{{Cite web |title=NLP |url=https://cs.stanford.edu/people/eroberts/courses/soco/projects/2004-05/nlp/overview_history.html}}</ref> Already in 1950, [[Alan Turing]] published an article titled "[[Computing Machinery and Intelligence]]" which proposed what is now called the [[Turing test]] as a criterion of intelligence, though at the time that was not articulated as a problem separate from artificial intelligence. The proposed test includes a task that involves the automated interpretation and generation of natural language.
 
=== Symbolic NLP (1950s – early 1990s) ===
The premise of symbolic NLP is well-summarized by [[John Searle]]'s [[Chinese room]] experiment: Given a collection of rules (e.g., a Chinese phrasebook, with questions and matching answers), the computer emulates natural language understanding (or other NLP tasks) by applying those rules to the data it confronts.
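The following is a minimal, illustrative sketch (not taken from any historical system; the rules and wording are invented) of how such question-and-answer rules can be applied mechanically, here in Python:

<syntaxhighlight lang="python">
import re

# Hypothetical rules in the spirit of symbolic, pattern-matching dialogue systems:
# each rule pairs a regular expression with a canned response template.
RULES = [
    (re.compile(r"my (.+) hurts", re.IGNORECASE), "Why do you say your {0} hurts?"),
    (re.compile(r"i am (.+)", re.IGNORECASE), "How long have you been {0}?"),
]

def respond(utterance: str) -> str:
    """Apply the first matching rule; otherwise fall back to a generic reply."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(*match.groups())
    return "Please tell me more."

print(respond("My head hurts"))   # Why do you say your head hurts?
print(respond("It is raining"))   # Please tell me more. (no rule matches)
</syntaxhighlight>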
 
* '''1950s''': The [[Georgetown-IBM experiment|Georgetown experiment]] in 1954 involved fully [[automatic translation]] of more than sixty Russian sentences into English. The authors claimed that within three or five years, machine translation would be a solved problem.<ref>{{cite web|author=Hutchins, J.|year=2005|url=http://www.hutchinsweb.me.uk/Nutshell-2005.pdf|title=The history of machine translation in a nutshell|access-date=2019-02-04|archive-date=2019-07-13|archive-url=https://web.archive.org/web/20190713103044/http://www.hutchinsweb.me.uk/Nutshell-2005.pdf|url-status=dead}}{{self-published source|date=December 2013}}</ref> However, real progress was much slower, and after the [[ALPAC|ALPAC report]] in 1966, which found that ten years of research had failed to fulfill the expectations, funding for machine translation was dramatically reduced. Little further research in machine translation was conducted in America (though some research continued elsewhere, such as Japan and Europe<ref>"ALPAC: the (in)famous report", John Hutchins, MT News International, no. 14, June 1996, pp. 9–12.</ref>) until the late 1980s when the first [[statistical machine translation]] systems were developed.
* '''1960s''': Some notably successful natural language processing systems developed in the 1960s were [[SHRDLU]], a natural language system working in restricted "[[blocks world]]s" with restricted vocabularies, and [[ELIZA]], a simulation of a [[Rogerian psychotherapy|Rogerian psychotherapist]], written by [[Joseph Weizenbaum]] between 1964 and 1966. Using almost no information about human thought or emotion, ELIZA sometimes provided a startlingly human-like interaction. When the "patient" exceeded the very small knowledge base, ELIZA might provide a generic response, for example, responding to "My head hurts" with "Why do you say your head hurts?". [[Ross Quillian]]'s successful work on natural language was demonstrated with a vocabulary of only ''twenty'' words, because that was all that would fit in a computer memory at the time.<ref>{{Harvnb|Crevier|1993|pp=146–148}}, see also {{Harvnb|Buchanan|2005|p=56}}: "Early programs were necessarily limited in scope by the size and speed of memory"</ref>
 
* '''1970s''': During the 1970s, many programmers began to write "conceptual [[ontology (information science)|ontologies]]", which structured real-world information into computer-understandable data. Examples are MARGIE (Schank, 1975), SAM (Cullingford, 1978), PAM (Wilensky, 1978), TaleSpin (Meehan, 1976), QUALM (Lehnert, 1977), Politics (Carbonell, 1979), and Plot Units (Lehnert 1981). During this time, the first [[chatterbots]] were written (e.g., [[PARRY]]).
* '''1980s''': The 1980s and early 1990s mark the heyday of symbolic methods in NLP. Focus areas of the time included research on rule-based parsing (e.g., the development of [[Head-driven phrase structure grammar|HPSG]] as a computational operationalization of [[generative grammar]]), morphology (e.g., two-level morphology<ref>{{citation|last=Koskenniemi|first=Kimmo|title=Two-level morphology: A general computational model of word-form recognition and production|url=http://www.ling.helsinki.fi/~koskenni/doc/Two-LevelMorphology.pdf|year=1983|publisher=Department of General Linguistics, [[University of Helsinki]]|author-link=Kimmo Koskenniemi|access-date=2020-08-20|archive-date=2018-12-21|archive-url=https://web.archive.org/web/20181221032913/http://www.ling.helsinki.fi/~koskenni/doc/Two-LevelMorphology.pdf|url-status=dead}}</ref>), semantics (e.g., [[Lesk algorithm]]), reference (e.g., within Centering Theory<ref>Joshi, A. K., & Weinstein, S. (1981, August). [https://www.ijcai.org/Proceedings/81-1/Papers/071.pdf Control of Inference: Role of Some Aspects of Discourse Structure-Centering]. In ''IJCAI'' (pp. 385–387).</ref>) and other areas of natural language understanding (e.g., in the [[Rhetorical structure theory|Rhetorical Structure Theory]]). Other lines of research were continued, e.g., the development of chatterbots with [[Racter]] and [[Jabberwacky]]. An important development (that eventually led to the statistical turn in the 1990s) was the rising importance of quantitative evaluation in this period.<ref>{{Cite journal|last1=Guida|first1=G.|last2=Mauri|first2=G.|date=July 1986|title=Evaluation of natural language processing systems: Issues and approaches|journal=Proceedings of the IEEE|volume=74|issue=7|pages=1026–1035|doi=10.1109/PROC.1986.13580|s2cid=30688575|issn=1558-2256}}</ref>
 
=== Statistical NLP (1990s–2010s) ===
Up until the 1980s, most natural language processing systems were based on complex sets of hand-written rules. Starting in the late 1980s, however, there was a revolution in natural language processing with the introduction of [[machine learning]] algorithms for language processing. This was due to both the steady increase in computational power (see [[Moore's law]]) and the gradual lessening of the dominance of [[Noam Chomsky|Chomskyan]] theories of linguistics (e.g. [[transformational grammar]]), whose theoretical underpinnings discouraged the sort of [[corpus linguistics]] that underlies the machine-learning approach to language processing.<ref>Chomskyan linguistics encourages the investigation of "[[corner case]]s" that stress the limits of its theoretical models (comparable to [[pathological (mathematics)|pathological]] phenomena in mathematics), typically created using [[thought experiment]]s, rather than the systematic investigation of typical phenomena that occur in real-world data, as is the case in [[corpus linguistics]]. The creation and use of such [[text corpus|corpora]] of real-world data is a fundamental part of machine-learning algorithms for natural language processing. In addition, theoretical underpinnings of Chomskyan linguistics such as the so-called "[[poverty of the stimulus]]" argument entail that general learning algorithms, as are typically used in machine learning, cannot be successful in language processing. As a result, the Chomskyan paradigm discouraged the application of such models to language processing.</ref>
*'''1990s''': Many of the notable early successes in statistical methods in NLP occurred in the field of [[machine translation]], due especially to work at IBM Research, such as [[IBM alignment models]]. These systems were able to take advantage of existing multilingual [[text corpus|textual corpora]] that had been produced by the [[Parliament of Canada]] and the [[European Union]] as a result of laws calling for the translation of all governmental proceedings into all official languages of the corresponding systems of government. However, most other systems depended on corpora specifically developed for the tasks implemented by these systems, which was (and often continues to be) a major limitation in the success of these systems. As a result, a great deal of research has gone into methods of more effectively learning from limited amounts of data.
*'''2000s''': With the growth of the web, increasing amounts of raw (unannotated) language data have become available since the mid-1990s. Research has thus increasingly focused on [[unsupervised learning|unsupervised]] and [[semi-supervised learning]] algorithms. Such algorithms can learn from data that has not been hand-annotated with the desired answers, or from a combination of annotated and non-annotated data. Generally, this task is much more difficult than [[supervised learning]], and typically produces less accurate results for a given amount of input data. However, there is an enormous amount of non-annotated data available (including, among other things, the entire content of the [[World Wide Web]]), which can often make up for the inferior results if the algorithm used has a low enough [[time complexity]] to be practical.
 
=== Neural NLP (present) ===
In 2003, the [[word n-gram language model|word n-gram model]], at the time the best statistical algorithm, was outperformed by a [[multi-layer perceptron]] (with a single hidden layer and a context length of several words, trained on up to 14 million words with a CPU cluster for [[language model]]ling) by [[Yoshua Bengio]] and co-authors.<ref>{{Cite journal|url=https://dl.acm.org/doi/10.5555/944919.944966|title=A neural probabilistic language model|first1=Yoshua|last1=Bengio|first2=Réjean|last2=Ducharme|first3=Pascal|last3=Vincent|first4=Christian|last4=Janvin|date=March 1, 2003|journal=The Journal of Machine Learning Research|volume=3|pages=1137–1155|via=ACM Digital Library}}</ref>
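The following is a minimal, illustrative PyTorch sketch of the kind of architecture described above (a fixed window of word embeddings feeding one hidden layer that scores the next word); the sizes and names are arbitrary assumptions, not values from Bengio et al.:

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

# Arbitrary toy sizes (assumptions for illustration only).
vocab, emb_dim, context, hidden = 10_000, 64, 4, 128

net = nn.Sequential(
    nn.Embedding(vocab, emb_dim),        # word ids -> embedding vectors
    nn.Flatten(start_dim=1),             # concatenate the context embeddings
    nn.Linear(context * emb_dim, hidden),
    nn.Tanh(),                           # the single hidden layer
    nn.Linear(hidden, vocab),            # scores (logits) over the next word
)

context_ids = torch.randint(0, vocab, (1, context))  # a toy context of 4 word ids
next_word_logits = net(context_ids)
print(next_word_logits.shape)  # torch.Size([1, 10000])
</syntaxhighlight>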
 
In 2010, [[Tomáš Mikolov]] (then a PhD student at [[Brno University of Technology]]) with co-authors applied a simple [[recurrent neural network]] with a single hidden layer to language modelling,<ref>{{cite book |last1=Mikolov |first1=Tomáš |last2=Karafiát |first2=Martin |last3=Burget |first3=Lukáš |last4=Černocký |first4=Jan |last5=Khudanpur |first5=Sanjeev |title=Interspeech 2010 |chapter=Recurrent neural network based language model |journal=Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010 |date=26 September 2010 |pages=1045–1048 |doi=10.21437/Interspeech.2010-343 |s2cid=17048224 |chapter-url=https://gwern.net/doc/ai/nn/rnn/2010-mikolov.pdf |language=en}}</ref> and in the following years he went on to develop [[Word2vec]]. In the 2010s, [[representation learning]] and [[deep learning|deep neural network]]-style (featuring many hidden layers) machine learning methods became widespread in natural language processing. That popularity was due partly to a flurry of results showing that such techniques<ref name="goldberg:nnlp17">{{cite journal |last=Goldberg |first=Yoav |year=2016 |arxiv=1807.10854 |title=A Primer on Neural Network Models for Natural Language Processing |journal=Journal of Artificial Intelligence Research |volume=57 |pages=345–420 |doi=10.1613/jair.4992 |s2cid=8273530 }}</ref><ref name="goodfellow:book16">{{cite book |first1=Ian |last1=Goodfellow |first2=Yoshua |last2=Bengio |first3=Aaron |last3=Courville |url=http://www.deeplearningbook.org/ |title=Deep Learning |publisher=MIT Press |year=2016 }}</ref> can achieve state-of-the-art results in many natural language tasks, e.g., in [[language modeling]]<ref name="jozefowicz:lm16">{{cite book |first1=Rafal |last1=Jozefowicz |first2=Oriol |last2=Vinyals |first3=Mike |last3=Schuster |first4=Noam |last4=Shazeer |first5=Yonghui |last5=Wu |year=2016 |arxiv=1602.02410 |title=Exploring the Limits of Language Modeling |bibcode=2016arXiv160202410J }}</ref> and parsing.<ref name="choe:emnlp16">{{cite journal |first1=Do Kook |last1=Choe |first2=Eugene |last2=Charniak |journal=Emnlp 2016 |url=https://aclanthology.coli.uni-saarland.de/papers/D16-1257/d16-1257 |title=Parsing as Language Modeling |access-date=2018-10-22 |archive-date=2018-10-23 |archive-url=https://web.archive.org/web/20181023034804/https://aclanthology.coli.uni-saarland.de/papers/D16-1257/d16-1257 |url-status=dead }}</ref><ref name="vinyals:nips15">{{cite journal |last1=Vinyals |first1=Oriol |last2=Kaiser |first2=Lukasz |display-authors=1 |journal=Nips2015 |title=Grammar as a Foreign Language |year=2014 |arxiv=1412.7449 |bibcode=2014arXiv1412.7449V |url=https://papers.nips.cc/paper/5635-grammar-as-a-foreign-language.pdf }}</ref> This is increasingly important [[artificial intelligence in healthcare|in medicine and healthcare]], where NLP helps analyze notes and text in [[Electronic health record|electronic health records]] that would otherwise be inaccessible for study when seeking to improve care<ref>{{Cite journal|last1=Turchin|first1=Alexander|last2=Florez Builes|first2=Luisa F.|date=2021-03-19|title=Using Natural Language Processing to Measure and Improve Quality of Diabetes Care: A Systematic Review|journal=Journal of Diabetes Science and Technology|volume=15|issue=3|language=en|pages=553–560|doi=10.1177/19322968211000831|pmid=33736486|pmc=8120048|issn=1932-2968}}</ref> or protect patient privacy.<ref>{{Cite journal |last1=Lee |first1=Jennifer |last2=Yang |first2=Samuel |last3=Holland-Hall |first3=Cynthia 
|last4=Sezgin |first4=Emre |last5=Gill |first5=Manjot |last6=Linwood |first6=Simon |last7=Huang |first7=Yungui |last8=Hoffman |first8=Jeffrey |date=2022-06-10 |title=Prevalence of Sensitive Terms in Clinical Notes Using Natural Language Processing Techniques: Observational Study |journal=JMIR Medical Informatics |language=en |volume=10 |issue=6 |pages=e38482 |doi=10.2196/38482 |issn=2291-9694 |pmc=9233261 |pmid=35687381 |doi-access=free }}</ref>
 
== Approaches: Symbolic, statistical, neural networks {{anchor|Statistical natural language processing (SNLP)}} ==
* the larger such a (probabilistic) language model is, the more accurate it becomes, in contrast to rule-based systems that can gain accuracy only by increasing the amount and complexity of the rules, which leads to [[intractable problem|intractability]] problems.
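As a toy illustration of the statistical approach (the miniature corpus below is invented; real systems estimate smoothed higher-order or neural models from millions of words), a bigram model simply turns co-occurrence counts into next-word probabilities:

<syntaxhighlight lang="python">
from collections import Counter

# A tiny, made-up corpus for illustration only.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count bigrams (pairs of adjacent words) and the unigrams that can start a bigram.
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus[:-1])

def next_word_probability(prev: str, word: str) -> float:
    """Maximum-likelihood estimate of P(word | prev) from the counts."""
    return bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0

print(next_word_probability("the", "cat"))  # 0.5 (2 of the 4 occurrences of "the" precede "cat")
print(next_word_probability("the", "dog"))  # 0.0 (unseen bigram; real systems apply smoothing)
</syntaxhighlight>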
 
Although rule-based systems for manipulating symbols were still in use in 2020, they have become mostly obsolete with the advance of [[Large language model|LLM]]s in 2023.
 
Before that, they were commonly used:
* when the amount of training data is insufficient to successfully apply machine learning methods, e.g., for the machine translation of low-resource languages such as provided by the [[Apertium]] system,
* for preprocessing in NLP pipelines, e.g., [[Tokenization (lexical analysis)|tokenization]], or
=== Neural networks ===
{{Further|Artificial neural network}}
A major drawback of statistical methods is that they require elaborate [[feature engineering]]. Since 2015,<ref>{{Cite web |last=Socher |first=Richard |title=Deep Learning For NLP-ACL 2012 Tutorial |url=https://www.socher.org/index.php/Main/DeepLearningForNLP-ACL2012Tutorial |access-date=2020-08-17 |website=www.socher.org |archive-date=2021-04-14 |archive-url=https://web.archive.org/web/20210414054126/https://www.socher.org/index.php/Main/DeepLearningForNLP-ACL2012Tutorial |url-status=dead }} This was an early Deep Learning tutorial at the ACL 2012 and met with both interest and (at the time) skepticism by most participants. Until then, neural learning was basically rejected because of its lack of statistical interpretability. By 2015, deep learning had evolved into the major framework of NLP. [Link is broken, try http://web.stanford.edu/class/cs224n/]</ref> the statistical approach has been replaced by the [[Artificial neural network|neural networks]] approach, using [[Semantic networks|semantic networks]]<ref>{{cite book |last1=Segev |first1=Elad |title=Semantic Network Analysis in Social Sciences |date=2022 |publisher=Routledge |___location=London |isbn=9780367636524 |url=https://www.routledge.com/Semantic-Network-Analysis-in-Social-Sciences/Segev/p/book/9780367636524 |access-date=5 December 2021 |archive-date=5 December 2021 |archive-url=https://web.archive.org/web/20211205140726/https://www.routledge.com/Semantic-Network-Analysis-in-Social-Sciences/Segev/p/book/9780367636524 |url-status=live }}</ref> and [[word embedding]]s to capture semantic properties of words.
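As a small, assumed illustration of what capturing semantic properties with [[word embedding]]s means in practice (the three-dimensional vectors below are invented; real embeddings have hundreds of dimensions and are learned from corpora), related words receive nearby vectors that can be compared with cosine similarity:

<syntaxhighlight lang="python">
import numpy as np

# Toy, hand-made "embeddings"; real word vectors are learned (e.g., by Word2vec) and much larger.
embeddings = {
    "king":  np.array([0.8, 0.65, 0.1]),
    "queen": np.array([0.75, 0.7, 0.15]),
    "apple": np.array([0.1, 0.2, 0.9]),
}

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between two vectors: values near 1 mean very similar directions."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # close to 1: related words
print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # noticeably lower: unrelated words
</syntaxhighlight>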
 
Intermediate tasks (e.g., part-of-speech tagging and dependency parsing) are no longer needed.
 
[[Neural machine translation]], based on then-newly-invented [[Seq2seq|sequence-to-sequence]] transformations, made obsolete the intermediate steps, such as word alignment, previously necessary for [[statistical machine translation]].
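A deliberately minimal PyTorch sketch of the sequence-to-sequence idea follows (the sizes and names are assumptions; attention, training and beam search are omitted): an encoder compresses the source sentence into a state that conditions the decoder's word-by-word generation.

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb=32, hidden=64):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.GRU(emb, hidden, batch_first=True)
        self.decoder = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        _, state = self.encoder(self.src_emb(src_ids))            # encode the source sentence into a state
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), state)   # decode conditioned on that state
        return self.out(dec_out)                                   # per-position logits over the target vocabulary

# Toy vocabulary sizes and random token ids (assumptions for illustration only).
model = Seq2Seq(src_vocab=1000, tgt_vocab=1200)
src = torch.randint(0, 1000, (1, 7))   # a source "sentence" of 7 token ids
tgt = torch.randint(0, 1200, (1, 9))   # a target prefix of 9 token ids
logits = model(src, tgt)
print(logits.shape)  # torch.Size([1, 9, 1200])
</syntaxhighlight>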
 
== Common NLP tasks ==
 
; [[Argument mining]]
:The goal of argument mining is the automatic extraction and identification of argumentative structures from [[natural language]] text with the aid of computer programs.<ref>{{Cite journal|last1=Lippi|first1=Marco|last2=Torroni|first2=Paolo|date=2016-04-20|title=Argumentation Mining: State of the Art and Emerging Trends|url=https://dl.acm.org/doi/10.1145/2850417|journal=ACM Transactions on Internet Technology|language=en|volume=16|issue=2|pages=1–25|doi=10.1145/2850417|hdl=11585/523460|s2cid=9561587|issn=1533-5399|hdl-access=free}}</ref> Such argumentative structures include the premise, conclusions, the [[argument scheme]] and the relationship between the main and subsidiary argument, or the main and counter-argument within discourse.<ref>{{Cite web|title=Argument Mining – IJCAI2016 Tutorial|url=https://www.i3s.unice.fr/~villata/tutorialIJCAI2016.html|access-date=2021-03-09|website=www.i3s.unice.fr|archive-date=2021-04-18|archive-url=https://web.archive.org/web/20210418083659/https://www.i3s.unice.fr/~villata/tutorialIJCAI2016.html|url-status=dead}}</ref><ref>{{Cite web|title=NLP Approaches to Computational Argumentation – ACL 2016, Berlin|url=http://acl2016tutorial.arg.tech/|access-date=2021-03-09|language=en-GB}}</ref>
 
=== Higher-level NLP applications ===
; [[Machine translation]] (MT)
:Automatically translate text from one human language to another. This is one of the most difficult problems, and is a member of a class of problems colloquially termed "[[AI-complete]]", i.e. requiring all of the different types of knowledge that humans possess (grammar, semantics, facts about the real world, etc.) to solve properly.
; [[Natural-language understanding]] (NLU): Convert chunks of text into more formal representations such as [[first-order logic]] structures that are easier for [[computer]] programs to manipulate. Natural language understanding involves identifying the intended meaning among the multiple possible interpretations that can be derived from a natural language expression, usually in the form of organized notations of natural language concepts. The introduction and creation of a language metamodel and an ontology are efficient, though empirical, solutions. An explicit formalization of natural-language semantics, without confusion with implicit assumptions such as the [[closed-world assumption]] (CWA) vs. the [[open-world assumption]], or subjective Yes/No vs. objective True/False, is expected to form the basis of a semantics formalization.<ref>{{cite journal|last1=Duan|first1=Yucong|last2=Cruz|first2=Christophe|year=2011|title=Formalizing Semantic of Natural Language through Conceptualization from Existence|url=http://www.ijimt.org/abstract/100-E00187.htm|journal=International Journal of Innovation, Management and Technology|volume=2|issue=1|pages=37–42|archive-url=https://web.archive.org/web/20111009135952/http://www.ijimt.org/abstract/100-E00187.htm|archive-date=2011-10-09}}</ref>
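: For illustration (an assumed example, not taken from the cited source), a sentence such as "Every student reads a book" admits more than one first-order reading, e.g. <math>\forall x\,(\mathrm{student}(x) \rightarrow \exists y\,(\mathrm{book}(y) \land \mathrm{reads}(x,y)))</math> versus <math>\exists y\,(\mathrm{book}(y) \land \forall x\,(\mathrm{student}(x) \rightarrow \mathrm{reads}(x,y)))</math>; choosing between such readings is part of what NLU must do.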
; [[Natural language generation|Natural-language generation]]<nowiki> (NLG):</nowiki>
:Convert information from computer databases or semantic intents into readable human language.
; Book generation
:Not an NLP task proper but an extension of natural language generation and other NLP tasks is the creation of full-fledged books. The first machine-generated book was created by a rule-based system in 1984 (Racter, ''The policeman's beard is half-constructed'').<ref>{{Cite web|title=U B U W E B :: Racter|url=http://www.ubu.com/historical/racter/index.html|access-date=2020-08-17|website=www.ubu.com}}</ref> The first work published by a neural network appeared in 2018: ''[[1 the Road]]'', marketed as a novel, contains sixty million words. Both these systems are basically elaborate but nonsensical (semantics-free) [[language model]]s. The first machine-generated science book was published in 2019 (Beta Writer, ''Lithium-Ion Batteries'', Springer, Cham).<ref>{{Cite book|last=Writer|first=Beta|date=2019|title=Lithium-Ion Batteries|language=en-gb|doi=10.1007/978-3-030-16800-1|isbn=978-3-030-16799-8|s2cid=155818532}}</ref> Unlike ''Racter'' and ''1 the Road'', this is grounded on factual knowledge and based on text summarization.
; [[Document AI]]
:A Document AI platform sits on top of the NLP technology, enabling users with no prior experience of artificial intelligence, machine learning or NLP to quickly train a computer to extract the specific data they need from different document types. NLP-powered Document AI enables non-technical teams, for example lawyers, business analysts and accountants, to quickly access information hidden in documents.<ref>{{Cite web|title=Document Understanding AI on Google Cloud (Cloud Next '19) – YouTube|url=https://www.youtube.com/watch?v=7dtl650D0y0| archive-url=https://ghostarchive.org/varchive/youtube/20211030/7dtl650D0y0| archive-date=2021-10-30|access-date=2021-01-11|website=www.youtube.com| date=11 April 2019 }}{{cbignore}}</ref>
; [[Dialogue system|Dialogue management]]
:Computer systems intended to converse with a human.
 
# Apply the theory of [[conceptual metaphor]], explained by Lakoff as "the understanding of one idea, in terms of another" which provides an idea of the intent of the author.<ref>{{Cite book|title=A Cognitive Theory of Cultural Meaning|last= Strauss |first= Claudia |publisher= Cambridge University Press|year=1999|isbn=978-0-521-59541-4|pages=156–164}}</ref> For example, consider the English word ''big''. When used in a comparison ("That is a big tree"), the author's intent is to imply that the tree is ''physically large'' relative to other trees or the author's experience. When used metaphorically ("Tomorrow is a big day"), the author's intent is to imply ''importance''. The intent behind other usages, like in "She is a big person", will remain somewhat ambiguous to a person and a cognitive NLP algorithm alike without additional information.
# Assign relative measures of meaning to a word, phrase, sentence or piece of text based on the information presented before and after the piece of text being analyzed, e.g., by means of a [[probabilistic context-free grammar]] (PCFG). The mathematical equation for such algorithms is presented in [https://worldwide.espacenet.com/patent/search/family/055314712/publication/US9269353B1?q=pn%3DUS9269353 US Patent 9269353] {{Webarchive|url=https://web.archive.org/web/20240516102600/https://worldwide.espacenet.com/patent/search/family/055314712/publication/US9269353B1?q=pn=US9269353 |date=2024-05-16 }}:<ref>{{cite patent |country=US |number=9269353|status=patent}}</ref>
::<math> {RMM(token_N)}
=