{{Short description|Processing of natural language by a computer}}
{{Multiple issues|
{{About|computer processing|human brain processing|Language processing in the brain}}
{{More citations needed|date=May 2024}}
{{Cleanup rewrite|date=July 2025}}
{{Cleanup reorganize|date=July 2025}}
}}
'''Natural language processing''' (NLP) is the processing of [[natural language]] information by a [[computer]]. The study of NLP, a subfield of [[computer science]], is generally associated with [[artificial intelligence]]. NLP is related to [[information retrieval]], [[knowledge representation]], [[computational linguistics]], and more broadly with [[linguistics]].<ref name="nlpintro">
{{cite book |last=Eisenstein |first=Jacob |date=October 1, 2019 |title=Introduction to Natural Language Processing |url=https://mitpress.mit.edu/9780262042840/introduction-to-natural-language-processing/ |___location= |publisher=The MIT Press |page=1 |isbn=9780262042840 |access-date=}}</ref>
 
Major processing tasks in an NLP system include: [[speech recognition]], [[text classification]], [[natural-language understanding|natural language understanding]], and [[natural language generation]].
=== What is NLP? ===
Natural Language Processing (NLP) is a specialized branch of '''computer science and artificial intelligence'''. Its main goal is to enable computers to understand, interpret, and generate human language much as people do. Rather than operating on programming languages or code, NLP focuses on processing '''natural languages''', such as English, Hindi, or Telugu, that humans use every day.
 
=== Purpose of NLP ===
The core idea behind NLP is to make communication between humans and machines smoother and more natural. This involves teaching machines how to read, listen to, and even respond to text or speech in ways that are meaningful. It connects deeply with areas such as '''information retrieval''' (like search engines), '''knowledge representation''' (storing meaning and facts), and '''computational linguistics''' (applying linguistics through computers).
 
=== Key Goals and Challenges ===
NLP is not just about translating words; it is about understanding '''context, grammar, and meaning'''. For example, the word “bank” can mean a financial institution or the edge of a river, and an NLP system must determine which meaning fits the context. This makes the task both powerful and challenging, because human language is full of '''ambiguity, slang, and emotion'''.
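The “bank” example above is an instance of word-sense disambiguation. A minimal sketch of a Lesk-style approach, which picks the sense whose gloss words overlap most with the surrounding context, is shown below; the two glosses and the example sentences are invented for illustration and are not drawn from a real lexicon.

```python
# Simplified Lesk-style word-sense disambiguation for the word "bank".
# The sense glosses below are illustrative inventions, not a real lexicon.

SENSES = {
    "financial": {"money", "deposit", "loan", "account", "cash"},
    "river": {"water", "shore", "fishing", "stream", "mud"},
}

def disambiguate(sentence: str) -> str:
    """Pick the sense whose gloss words overlap most with the context."""
    context = set(sentence.lower().split())
    # Score each sense by the size of its overlap with the context words.
    return max(SENSES, key=lambda sense: len(SENSES[sense] & context))

print(disambiguate("she opened a deposit account at the bank"))    # financial
print(disambiguate("they went fishing on the bank of the stream")) # river
```

Real systems use far richer context than bag-of-words overlap, but the core idea, scoring candidate senses against the surrounding text, is the same.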
 
=== Major Tasks in NLP ===
There are several major tasks in NLP that help achieve these goals:
 
* '''Speech Recognition''': Converting spoken words into text (used in voice assistants like Siri).
* '''Text Classification''': Grouping or tagging text into categories (like spam vs. non-spam emails).
* '''Natural Language Understanding (NLU)''': Helping machines understand the meaning behind words and phrases.
* '''Natural Language Generation (NLG)''': Enabling machines to produce text or speech that sounds natural.
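The text-classification task in the list above (spam vs. non-spam) can be sketched with a tiny naive Bayes classifier; the four training messages below are invented for illustration, and class priors are omitted because the classes are balanced.

```python
# Minimal naive Bayes text classifier (spam vs. ham).
# The training data is a made-up toy set, not a real corpus.
from collections import Counter
import math

train = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch with the team monday", "ham"),
]

# Count word frequencies per class.
counts = {"spam": Counter(), "ham": Counter()}
for text, label in train:
    counts[label].update(text.split())

def classify(text: str) -> str:
    # Vocabulary size for add-one (Laplace) smoothing.
    vocab = len({w for c in counts.values() for w in c})
    scores = {}
    for label, c in counts.items():
        total = sum(c.values())
        # Sum of log-probabilities of each word given the class.
        scores[label] = sum(
            math.log((c[w] + 1) / (total + vocab)) for w in text.split()
        )
    return max(scores, key=scores.get)

print(classify("claim your free money"))   # spam
print(classify("team meeting on monday"))  # ham
```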
 
=== Applications in the Real World ===
NLP is used in everyday technology—like Google Translate, Alexa, Grammarly, and chatbots on websites. It helps in sentiment analysis (e.g., analyzing customer reviews), document summarization, and even in healthcare to process medical records. Its applications are growing rapidly in fields such as education, law, marketing, and customer service.
 
=== The Future of NLP ===
With the rise of '''machine learning and deep learning''', NLP is becoming even more powerful. Models like ChatGPT and BERT can understand and generate human-like responses. As research continues, NLP is expected to play a bigger role in making AI systems more intelligent, ethical, and responsive to human needs.
 
== History ==
The premise of symbolic NLP is well-summarized by [[John Searle]]'s [[Chinese room]] experiment: Given a collection of rules (e.g., a Chinese phrasebook, with questions and matching answers), the computer emulates natural language understanding (or other NLP tasks) by applying those rules to the data it confronts.
 
* '''1950s''': The [[Georgetown-IBM experiment|Georgetown experiment]] in 1954 involved fully [[automatic translation]] of more than sixty Russian sentences into English. The authors claimed that within three to five years, machine translation would be a solved problem.<ref>{{cite web|author=Hutchins, J.|year=2005|url=http://www.hutchinsweb.me.uk/Nutshell-2005.pdf|title=The history of machine translation in a nutshell|access-date=2019-02-04|archive-date=2019-07-13|archive-url=https://web.archive.org/web/20190713103044/http://www.hutchinsweb.me.uk/Nutshell-2005.pdf|url-status=dead}}{{self-published source|date=December 2013}}</ref> However, real progress was much slower, and after the [[ALPAC|ALPAC report]] in 1966, which found that ten years of research had failed to fulfill the expectations, funding for machine translation was dramatically reduced. Little further research in machine translation was conducted in America (though some research continued elsewhere, such as Japan and Europe<ref>"ALPAC: the (in)famous report", John Hutchins, MT News International, no. 14, June 1996, pp. 9–12.</ref>) until the late 1980s when the first [[statistical machine translation]] systems were developed.
* '''1960s''': Some notably successful natural language processing systems developed in the 1960s were [[SHRDLU]], a natural language system working in restricted "[[blocks world]]s" with restricted vocabularies, and [[ELIZA]], a simulation of a [[Rogerian psychotherapy|Rogerian psychotherapist]], written by [[Joseph Weizenbaum]] between 1964 and 1966. Using almost no information about human thought or emotion, ELIZA sometimes provided a startlingly human-like interaction. When the "patient" exceeded the very small knowledge base, ELIZA might provide a generic response, for example, responding to "My head hurts" with "Why do you say your head hurts?". [[Ross Quillian]]'s successful work on natural language was demonstrated with a vocabulary of only ''twenty'' words, because that was all that would fit in a computer memory at the time.<ref>{{Harvnb|Crevier|1993|pp=146–148}}, see also {{Harvnb|Buchanan|2005|p=56}}: "Early programs were necessarily limited in scope by the size and speed of memory"</ref>
 
* '''1970s''': During the 1970s, many programmers began to write "conceptual [[ontology (information science)|ontologies]]", which structured real-world information into computer-understandable data. Examples are MARGIE (Schank, 1975), SAM (Cullingford, 1978), PAM (Wilensky, 1978), TaleSpin (Meehan, 1976), QUALM (Lehnert, 1977), Politics (Carbonell, 1979), and Plot Units (Lehnert 1981). During this time, the first [[chatterbots]] were written (e.g., [[PARRY]]).
* '''1980s''': The 1980s and early 1990s mark the heyday of symbolic methods in NLP. Focus areas of the time included research on rule-based parsing (e.g., the development of [[Head-driven phrase structure grammar|HPSG]] as a computational operationalization of [[generative grammar]]), morphology (e.g., two-level morphology<ref>{{citation|last=Koskenniemi|first=Kimmo|title=Two-level morphology: A general computational model of word-form recognition and production|url=http://www.ling.helsinki.fi/~koskenni/doc/Two-LevelMorphology.pdf|year=1983|publisher=Department of General Linguistics, [[University of Helsinki]]|author-link=Kimmo Koskenniemi|access-date=2020-08-20|archive-date=2018-12-21|archive-url=https://web.archive.org/web/20181221032913/http://www.ling.helsinki.fi/~koskenni/doc/Two-LevelMorphology.pdf|url-status=dead}}</ref>), semantics (e.g., [[Lesk algorithm]]), reference (e.g., within Centering Theory<ref>Joshi, A. K., & Weinstein, S. (1981, August). [https://www.ijcai.org/Proceedings/81-1/Papers/071.pdf Control of Inference: Role of Some Aspects of Discourse Structure-Centering]. In ''IJCAI'' (pp. 385–387).</ref>) and other areas of natural language understanding (e.g., in the [[Rhetorical structure theory|Rhetorical Structure Theory]]). Other lines of research were continued, e.g., the development of chatterbots with [[Racter]] and [[Jabberwacky]]. An important development (that eventually led to the statistical turn in the 1990s) was the rising importance of quantitative evaluation in this period.<ref>{{Cite journal|last1=Guida|first1=G.|last2=Mauri|first2=G.|date=July 1986|title=Evaluation of natural language processing systems: Issues and approaches|journal=Proceedings of the IEEE|volume=74|issue=7|pages=1026–1035|doi=10.1109/PROC.1986.13580|s2cid=30688575|issn=1558-2256}}</ref>
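ELIZA, described in the 1960s entry above, worked by matching surface patterns in the user's input and reflecting them back through response templates. A minimal sketch of that mechanism follows; the two patterns below are invented and far simpler than Weizenbaum's original script.

```python
# ELIZA-style pattern matching with response templates.
# The rules are a toy illustration, not Weizenbaum's actual script.
import re

RULES = [
    (re.compile(r"my (.+) hurts", re.IGNORECASE), r"Why do you say your \1 hurts?"),
    (re.compile(r"i feel (.+)", re.IGNORECASE), r"Why do you feel \1?"),
]

def respond(utterance: str) -> str:
    for pattern, template in RULES:
        m = pattern.search(utterance)
        if m:
            # Substitute the captured fragment into the response template.
            return m.expand(template)
    # Outside the (tiny) knowledge base, fall back to a generic reply.
    return "Please tell me more."

print(respond("My head hurts"))  # Why do you say your head hurts?
```

The fallback reply mirrors how ELIZA produced generic responses whenever the input exceeded its small rule set.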
 
=== Statistical NLP (1990s–present) ===
*'''1990s''': Many of the notable early successes in statistical methods in NLP occurred in the field of [[machine translation]], due especially to work at IBM Research, such as [[IBM alignment models]]. These systems were able to take advantage of existing multilingual [[text corpus|textual corpora]] that had been produced by the [[Parliament of Canada]] and the [[European Union]] as a result of laws calling for the translation of all governmental proceedings into all official languages of the corresponding systems of government. However, most other systems depended on corpora specifically developed for the tasks implemented by these systems, which was (and often continues to be) a major limitation in the success of these systems. As a result, a great deal of research has gone into methods of more effectively learning from limited amounts of data.
*'''2000s''': With the growth of the web, increasing amounts of raw (unannotated) language data have become available since the mid-1990s. Research has thus increasingly focused on [[unsupervised learning|unsupervised]] and [[semi-supervised learning]] algorithms. Such algorithms can learn from data that has not been hand-annotated with the desired answers or using a combination of annotated and non-annotated data. Generally, this task is much more difficult than [[supervised learning]], and typically produces less accurate results for a given amount of input data. However, there is an enormous amount of non-annotated data available (including, among other things, the entire content of the [[World Wide Web]]), which can often make up for the worse efficiency if the algorithm used has a low enough [[time complexity]] to be practical.
*'''2003:''' The [[word n-gram language model|word n-gram model]], at the time the best statistical algorithm, was outperformed by a [[multi-layer perceptron]] with a single hidden layer and a [[context length]] of several words, trained on up to 14 million words ([[Yoshua Bengio|Bengio]] et al.).<ref>{{Cite journal|url=https://dl.acm.org/doi/10.5555/944919.944966|title=A neural probabilistic language model|first1=Yoshua|last1=Bengio|first2=Réjean|last2=Ducharme|first3=Pascal|last3=Vincent|first4=Christian|last4=Janvin|date=March 1, 2003|journal=The Journal of Machine Learning Research|volume=3|pages=1137–1155|via=ACM Digital Library}}</ref>
*'''2010:''' [[Tomáš Mikolov]] (then a PhD student at [[Brno University of Technology]]) with co-authors applied a simple [[recurrent neural network]] with a single hidden layer to language modelling,<ref>{{cite book |last1=Mikolov |first1=Tomáš |last2=Karafiát |first2=Martin |last3=Burget |first3=Lukáš |last4=Černocký |first4=Jan |last5=Khudanpur |first5=Sanjeev |title=Interspeech 2010 |chapter=Recurrent neural network based language model |journal=Proceedings of the 11th Annual Conference of the International Speech Communication Association, INTERSPEECH 2010 |date=26 September 2010 |pages=1045–1048 |doi=10.21437/Interspeech.2010-343 |s2cid=17048224 |chapter-url=https://gwern.net/doc/ai/nn/rnn/2010-mikolov.pdf |language=en}}</ref> and in the following years he went on to develop [[Word2vec]]. In the 2010s, [[representation learning]] and [[deep learning|deep neural network]]-style (featuring many hidden layers) machine learning methods became widespread in natural language processing. 
That popularity was due partly to a flurry of results showing that such techniques<ref name="goldberg:nnlp17">{{cite journal |last=Goldberg |first=Yoav |year=2016 |arxiv=1807.10854 |title=A Primer on Neural Network Models for Natural Language Processing |journal=Journal of Artificial Intelligence Research |volume=57 |pages=345–420 |doi=10.1613/jair.4992 |s2cid=8273530 }}</ref><ref name="goodfellow:book16">{{cite book |first1=Ian |last1=Goodfellow |first2=Yoshua |last2=Bengio |first3=Aaron |last3=Courville |url=http://www.deeplearningbook.org/ |title=Deep Learning |publisher=MIT Press |year=2016 }}</ref> can achieve state-of-the-art results in many natural language tasks, e.g., in [[language modeling]]<ref name="jozefowicz:lm16">{{cite book |first1=Rafal |last1=Jozefowicz |first2=Oriol |last2=Vinyals |first3=Mike |last3=Schuster |first4=Noam |last4=Shazeer |first5=Yonghui |last5=Wu |year=2016 |arxiv=1602.02410 |title=Exploring the Limits of Language Modeling |bibcode=2016arXiv160202410J }}</ref> and parsing.<ref name="choe:emnlp16">{{cite journal |first1=Do Kook |last1=Choe |first2=Eugene |last2=Charniak |journal=Emnlp 2016 |url=https://aclanthology.coli.uni-saarland.de/papers/D16-1257/d16-1257 |title=Parsing as Language Modeling |access-date=2018-10-22 |archive-date=2018-10-23 |archive-url=https://web.archive.org/web/20181023034804/https://aclanthology.coli.uni-saarland.de/papers/D16-1257/d16-1257 |url-status=dead }}</ref><ref name="vinyals:nips15">{{cite journal |last1=Vinyals |first1=Oriol |last2=Kaiser |first2=Lukasz |display-authors=1 |journal=Nips2015 |title=Grammar as a Foreign Language |year=2014 |arxiv=1412.7449 |bibcode=2014arXiv1412.7449V |url=https://papers.nips.cc/paper/5635-grammar-as-a-foreign-language.pdf }}</ref> This is increasingly important [[artificial intelligence in healthcare|in medicine and healthcare]], where NLP helps analyze notes and text in [[Electronic health record|electronic health records]] that would otherwise be inaccessible for 
study when seeking to improve care<ref>{{Cite journal|last1=Turchin|first1=Alexander|last2=Florez Builes|first2=Luisa F.|date=2021-03-19|title=Using Natural Language Processing to Measure and Improve Quality of Diabetes Care: A Systematic Review|journal=Journal of Diabetes Science and Technology|volume=15|issue=3|language=en|pages=553–560|doi=10.1177/19322968211000831|pmid=33736486|pmc=8120048|issn=1932-2968}}</ref> or protect patient privacy.<ref>{{Cite journal |last1=Lee |first1=Jennifer |last2=Yang |first2=Samuel |last3=Holland-Hall |first3=Cynthia |last4=Sezgin |first4=Emre |last5=Gill |first5=Manjot |last6=Linwood |first6=Simon |last7=Huang |first7=Yungui |last8=Hoffman |first8=Jeffrey |date=2022-06-10 |title=Prevalence of Sensitive Terms in Clinical Notes Using Natural Language Processing Techniques: Observational Study |journal=JMIR Medical Informatics |language=en |volume=10 |issue=6 |pages=e38482 |doi=10.2196/38482 |issn=2291-9694 |pmc=9233261 |pmid=35687381 |doi-access=free }}</ref>
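The word n-gram models that the 2003 neural model overtook amount to counting how often words follow one another. A minimal bigram sketch, on a toy corpus invented for illustration:

```python
# Word-bigram language model: count word-successor frequencies and
# estimate P(word | previous word). The corpus is a toy example.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate".split()

# Count how often each word follows each other word.
bigrams = defaultdict(Counter)
for prev, word in zip(corpus, corpus[1:]):
    bigrams[prev][word] += 1

def prob(prev: str, word: str) -> float:
    """Maximum-likelihood estimate of P(word | prev)."""
    return bigrams[prev][word] / sum(bigrams[prev].values())

def predict(prev: str) -> str:
    """Most likely next word after prev."""
    return bigrams[prev].most_common(1)[0][0]

print(predict("the"))      # cat
print(prob("the", "cat"))  # 2/3
```

A neural language model replaces these count tables with learned continuous representations, which is what allowed the multi-layer perceptron to generalize beyond word sequences seen verbatim in training.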
 
=== Applications of Natural Language Processing ===
Natural Language Processing (NLP) has become an essential technology across a broad range of industries. As deep learning and large-scale language models have evolved, the capabilities of NLP systems have significantly improved, enabling automation, insight generation, and efficient data interaction in both consumer-facing and enterprise environments.
 
'''1. Customer Service and Chatbots'''
 
NLP enables the development of intelligent virtual assistants and chatbots that can understand and respond to customer inquiries in real time. These systems are trained to interpret natural language input, identify the user's intent, and deliver helpful responses. They are widely used in e-commerce platforms, telecommunications, banking, and other service industries to streamline support, reduce wait times, and operate 24/7. In addition to answering queries, these chatbots can also handle appointment scheduling, order tracking, and complaint resolution.
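The intent-identification step described above can be sketched, in its simplest keyword-matching form, as follows; the intent names and keyword sets are invented for illustration, and production systems would use trained classifiers instead.

```python
# Minimal keyword-overlap intent matcher for a support chatbot.
# Intents and keywords are illustrative inventions.
INTENTS = {
    "order_tracking": {"order", "track", "shipping", "delivery"},
    "appointment": {"appointment", "schedule", "book"},
    "complaint": {"complaint", "refund", "broken", "unhappy"},
}

def detect_intent(message: str) -> str:
    words = set(message.lower().split())
    # Pick the intent with the largest keyword overlap.
    best = max(INTENTS, key=lambda i: len(INTENTS[i] & words))
    # With no overlap at all, fall back to a default handler.
    return best if INTENTS[best] & words else "fallback"

print(detect_intent("where is my order delivery"))  # order_tracking
print(detect_intent("hello there"))                 # fallback
```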
 
'''2. Healthcare'''
 
In the healthcare sector, NLP is used to process large volumes of unstructured clinical data, such as doctors’ notes and patient records. By extracting meaningful insights from text, NLP helps in generating summaries, identifying medical conditions, and flagging potential risks. It plays a critical role in improving patient care, supporting diagnosis, and even predicting health trends. NLP is also used to automate literature reviews, mine clinical trial data, and support medical research.
 
'''3. Legal and Compliance'''
 
Legal professionals use NLP tools to analyze contracts, extract key clauses, and identify regulatory risks. These systems assist in reviewing large volumes of legal text much faster than manual efforts, improving accuracy and consistency. In compliance, NLP helps monitor communication, flag policy violations, and ensure documentation meets legal standards. This reduces the administrative burden on law firms and helps maintain regulatory integrity.
 
'''4. Content Moderation and Sentiment Analysis'''
 
Social media platforms and online communities use NLP to automatically detect offensive language, hate speech, spam, and other forms of inappropriate content. Sentiment analysis, a specific NLP task, allows businesses and researchers to analyze text data and gauge public opinion about products, policies, or events. By understanding sentiment, companies can respond proactively to customer feedback or manage brand reputation. These tools are essential for maintaining safe online environments and understanding user behavior.
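Sentiment analysis, in its simplest lexicon-based form, counts positive and negative words; the word lists below are invented for illustration, and real systems use much larger lexicons or trained models that handle negation and context.

```python
# Lexicon-based sentiment scoring: count positive vs. negative words.
# The two word lists are toy illustrations, not a real sentiment lexicon.
POSITIVE = {"great", "love", "excellent", "good", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "awful"}

def sentiment(text: str) -> str:
    words = text.lower().split()
    # Net score: positive hits minus negative hits.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("great product, I love it"))   # positive
print(sentiment("awful and terrible service")) # negative
```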
 
'''5. Education and Accessibility'''
 
NLP is transforming education through automated grading systems, text summarization, and personalized learning platforms that adapt content based on student performance. It also supports learners with disabilities by providing speech-to-text and text-to-speech tools, helping those with visual or hearing impairments. Intelligent tutoring systems use NLP to simulate one-on-one teaching experiences by interpreting student input and providing targeted feedback. In academic research, NLP assists in synthesizing literature and extracting relevant insights.
 
'''6. Translation and Cross-Language Understanding'''
 
Machine translation systems like Google Translate rely on NLP to convert text from one language to another while preserving meaning and context. These systems help break language barriers, enabling communication in multinational organizations, travel, and global collaboration. NLP-based translation is used in mobile apps, websites, and email clients, and has expanded access to education, media, and services across the globe. Newer NLP models can even understand cultural nuances and idiomatic expressions, making translations more accurate and human-like.
 
==Approaches: Symbolic, statistical, neural networks{{anchor|Statistical natural language processing (SNLP)}} ==
=== Neural networks ===
{{Further|Artificial neural network}}
A major drawback of statistical methods is that they require elaborate [[feature engineering]]. Since 2015,<ref>{{Cite web |last=Socher |first=Richard |title=Deep Learning For NLP-ACL 2012 Tutorial |url=https://www.socher.org/index.php/Main/DeepLearningForNLP-ACL2012Tutorial |access-date=2020-08-17 |website=www.socher.org |archive-date=2021-04-14 |archive-url=https://web.archive.org/web/20210414054126/https://www.socher.org/index.php/Main/DeepLearningForNLP-ACL2012Tutorial |url-status=dead }} This was an early Deep Learning tutorial at the ACL 2012 and met with both interest and (at the time) skepticism by most participants. Until then, neural learning was basically rejected because of its lack of statistical interpretability. Until 2015, deep learning had evolved into the major framework of NLP. [Link is broken, try http://web.stanford.edu/class/cs224n/]</ref> the statistical approach has been replaced by the [[Artificial neural network|neural networks]] approach, using [[semantic networks]]<ref>{{cite book |last1=Segev |first1=Elad |title=Semantic Network Analysis in Social Sciences |date=2022 |publisher=Routledge |___location=London |isbn=9780367636524 |url=https://www.routledge.com/Semantic-Network-Analysis-in-Social-Sciences/Segev/p/book/9780367636524 |access-date=5 December 2021 |archive-date=5 December 2021 |archive-url=https://web.archive.org/web/20211205140726/https://www.routledge.com/Semantic-Network-Analysis-in-Social-Sciences/Segev/p/book/9780367636524 |url-status=live }}</ref> and [[word embedding]]s to capture semantic properties of words.
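Word embeddings capture semantic properties by placing related words near each other in a vector space, with similarity typically measured by the cosine of the angle between vectors. A minimal sketch, using invented 3-dimensional vectors (real embeddings have hundreds of dimensions learned from corpora):

```python
# Cosine similarity between toy word vectors. The vectors are invented
# for illustration; real embeddings are learned, high-dimensional values.
import math

vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Semantically related words get a higher similarity score.
print(cosine(vectors["king"], vectors["queen"]))  # near 1
print(cosine(vectors["king"], vectors["apple"]))  # much lower
```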
 
In this approach, intermediate tasks (e.g., part-of-speech tagging and dependency parsing) are no longer needed.
 
; [[Argument mining]]
:The goal of argument mining is the automatic extraction and identification of argumentative structures from [[natural language]] text with the aid of computer programs.<ref>{{Cite journal|last1=Lippi|first1=Marco|last2=Torroni|first2=Paolo|date=2016-04-20|title=Argumentation Mining: State of the Art and Emerging Trends|url=https://dl.acm.org/doi/10.1145/2850417|journal=ACM Transactions on Internet Technology|language=en|volume=16|issue=2|pages=1–25|doi=10.1145/2850417|hdl=11585/523460|s2cid=9561587|issn=1533-5399|hdl-access=free}}</ref> Such argumentative structures include the premise, conclusions, the [[argument scheme]] and the relationship between the main and subsidiary argument, or the main and counter-argument within discourse.<ref>{{Cite web|title=Argument Mining – IJCAI2016 Tutorial|url=https://www.i3s.unice.fr/~villata/tutorialIJCAI2016.html|access-date=2021-03-09|website=www.i3s.unice.fr|archive-date=2021-04-18|archive-url=https://web.archive.org/web/20210418083659/https://www.i3s.unice.fr/~villata/tutorialIJCAI2016.html|url-status=dead}}</ref><ref>{{Cite web|title=NLP Approaches to Computational Argumentation – ACL 2016, Berlin|url=http://acl2016tutorial.arg.tech/|access-date=2021-03-09|language=en-GB}}</ref>
 
=== Higher-level NLP applications ===
; [[Machine translation]] (MT)
:Automatically translate text from one human language to another. This is one of the most difficult problems, and is a member of a class of problems colloquially termed "[[AI-complete]]", i.e. requiring all of the different types of knowledge that humans possess (grammar, semantics, facts about the real world, etc.) to solve properly.
; [[Natural-language understanding]] (NLU)
:Convert chunks of text into more formal representations such as [[first-order logic]] structures that are easier for [[computer]] programs to manipulate. Natural language understanding involves the identification of the intended meaning from the multiple possible interpretations that can be derived from a natural language expression, which usually takes the form of organized notations of natural language concepts. The introduction and creation of a language metamodel and an ontology are efficient, albeit empirical, solutions. An explicit formalization of natural language semantics, free of confusion with implicit assumptions such as the [[closed-world assumption]] (CWA) vs. the [[open-world assumption]], or subjective yes/no vs. objective true/false, is expected to provide the basis for a formalization of semantics.<ref>{{cite journal|last1=Duan|first1=Yucong|last2=Cruz|first2=Christophe|year=2011|title=Formalizing Semantic of Natural Language through Conceptualization from Existence|url=http://www.ijimt.org/abstract/100-E00187.htm|journal=International Journal of Innovation, Management and Technology|volume=2|issue=1|pages=37–42|archive-url=https://web.archive.org/web/20111009135952/http://www.ijimt.org/abstract/100-E00187.htm|archive-date=2011-10-09}}</ref>
; [[Natural language generation|Natural-language generation]] (NLG)
:Convert information from computer databases or semantic intents into readable human language.
; Book generation
 
# Apply the theory of [[conceptual metaphor]], explained by Lakoff as "the understanding of one idea, in terms of another" which provides an idea of the intent of the author.<ref>{{Cite book|title=A Cognitive Theory of Cultural Meaning|last= Strauss |first= Claudia |publisher= Cambridge University Press|year=1999|isbn=978-0-521-59541-4|pages=156–164}}</ref> For example, consider the English word ''big''. When used in a comparison ("That is a big tree"), the author's intent is to imply that the tree is ''physically large'' relative to other trees or the author's experience. When used metaphorically ("Tomorrow is a big day"), the author's intent is to imply ''importance''. The intent behind other usages, like in "She is a big person", will remain somewhat ambiguous to a person and a cognitive NLP algorithm alike without additional information.
# Assign relative measures of meaning to a word, phrase, sentence or piece of text based on the information presented before and after the piece of text being analyzed, e.g., by means of a [[probabilistic context-free grammar]] (PCFG). The mathematical equation for such algorithms is presented in [https://worldwide.espacenet.com/patent/search/family/055314712/publication/US9269353B1?q=pn%3DUS9269353 US Patent 9269353] {{Webarchive|url=https://web.archive.org/web/20240516102600/https://worldwide.espacenet.com/patent/search/family/055314712/publication/US9269353B1?q=pn=US9269353 |date=2024-05-16 }}:<ref>{{cite patent |country=US |number=9269353|status=patent}}</ref>
::<math> {RMM(token_N)}
=