{{Short description|Field of linguistics and computer science}}{{More citations needed|date=May 2024}}{{Other uses|NLP (disambiguation){{!}}NLP}}{{About|natural language processing done by computers|the natural language processing done by the human brain|Language processing in the brain}}
'''Natural language processing''' ('''NLP''') is an [[interdisciplinary]] subfield of [[computer science]] and [[artificial intelligence]]. It is primarily concerned with providing computers with the ability to process data encoded in [[natural language]] and is thus closely related to [[information retrieval]], [[knowledge representation]] and [[computational linguistics]], a subfield of [[linguistics]]. Typically, data is collected in [[text corpus|text corpora]], using either rule-based, statistical or neural-based approaches in [[machine learning]] and [[deep learning]].
 
Major tasks in natural language processing are [[speech recognition]], [[text classification]], [[natural-language understanding]], and [[natural language generation|natural-language generation]].
The premise of symbolic NLP is well-summarized by [[John Searle]]'s [[Chinese room]] experiment: Given a collection of rules (e.g., a Chinese phrasebook, with questions and matching answers), the computer emulates natural language understanding (or other NLP tasks) by applying those rules to the data it confronts.
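The rule-application idea can be illustrated with a minimal sketch; the phrasebook entries and function names below are invented for illustration and are not drawn from any historical system.

<syntaxhighlight lang="python">
# Symbolic (rule-based) NLP in miniature: the program "understands" nothing;
# it only matches the input against hand-written rules and returns the paired answer.
# The phrasebook below is purely illustrative.
PHRASEBOOK = {
    "what is your name?": "My name is Room.",
    "how are you?": "I am well, thank you.",
}

def respond(utterance: str) -> str:
    """Apply the hand-written rules to the input; fall back when no rule matches."""
    key = utterance.strip().lower()
    return PHRASEBOOK.get(key, "I do not understand.")

print(respond("How are you?"))           # -> I am well, thank you.
print(respond("Why is the sky blue?"))   # -> I do not understand.
</syntaxhighlight>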
 
* '''1950s''': The [[Georgetown-IBM experiment|Georgetown experiment]] in 1954 involved fully [[automatic translation]] of more than sixty Russian sentences into English. The authors claimed that within three or five years, machine translation would be a solved problem.<ref>{{cite web|author=Hutchins, J.|year=2005|url=http://www.hutchinsweb.me.uk/Nutshell-2005.pdf|title=The history of machine translation in a nutshell}}{{self-published source|date=December 2013}}</ref> However, real progress was much slower, and after the [[ALPAC|ALPAC report]] in 1966, which found that ten years of research had failed to fulfill expectations, funding for machine translation was dramatically reduced. Little further research in machine translation was conducted in America (though some research continued elsewhere, such as in Japan and Europe<ref>"ALPAC: the (in)famous report", John Hutchins, MT News International, no. 14, June 1996, pp. 9–12.</ref>) until the late 1980s, when the first [[statistical machine translation]] systems were developed.
* '''1960s''': Some notably successful natural language processing systems developed in the 1960s were [[SHRDLU]], a natural language system working in restricted "[[blocks world]]s" with limited vocabularies, and [[ELIZA]], a simulation of a [[Rogerian psychotherapy|Rogerian psychotherapist]], written by [[Joseph Weizenbaum]] between 1964 and 1966. Using almost no information about human thought or emotion, ELIZA sometimes provided a startlingly human-like interaction. When the "patient" exceeded the very small knowledge base, ELIZA might provide a generic response, for example, responding to "My head hurts" with "Why do you say your head hurts?" (a minimal sketch of this pattern-matching style follows this list). [[Ross Quillian]]'s successful work on natural language was demonstrated with a vocabulary of only ''twenty'' words, because that was all that would fit in a computer memory at the time.<ref>{{Harvnb|Crevier|1993|pp=146–148}}, see also {{Harvnb|Buchanan|2005|p=56}}: "Early programs were necessarily limited in scope by the size and speed of memory"</ref>
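A minimal ELIZA-style sketch, reproducing the exchange quoted above; the rules are invented for illustration and are not Weizenbaum's original script.

<syntaxhighlight lang="python">
import re

# ELIZA-style keyword rules: each pattern captures part of the input and
# reflects it back inside a canned template. These two rules are illustrative only.
RULES = [
    (re.compile(r"\bmy (.+)", re.IGNORECASE), "Why do you say your {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
]
GENERIC_RESPONSE = "Please tell me more."

def eliza_respond(utterance: str) -> str:
    """Return the first matching rule's reflected response, else a generic one."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(match.group(1).rstrip(".!?"))
    return GENERIC_RESPONSE

print(eliza_respond("My head hurts"))        # -> Why do you say your head hurts?
print(eliza_respond("The weather is nice"))  # -> Please tell me more.
</syntaxhighlight>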
 
=== Statistical NLP (1990s–2010s) ===
Up until the 1980s, most natural language processing systems were based on complex sets of hand-written rules. Starting in the late 1980s, however, there was a revolution in natural language processing with the introduction of [[machine learning]] algorithms for language processing. This was due to both the steady increase in computational power (see [[Moore's law]]) and the gradual lessening of the dominance of [[Noam Chomsky|Chomskyan]] theories of linguistics (e.g. [[transformational grammar]]), whose theoretical underpinnings discouraged the sort of [[corpus linguistics]] that underlies the machine-learning approach to language processing.<ref>Chomskyan linguistics encourages the investigation of "[[corner case]]s" that stress the limits of its theoretical models (comparable to [[pathological (mathematics)|pathological]] phenomena in mathematics), typically created using [[thought experiment]]s, rather than the systematic investigation of typical phenomena that occur in real-world data, as is the case in [[corpus linguistics]]. The creation and use of such [[text corpus|corpora]] of real-world data is a fundamental part of machine-learning algorithms for natural language processing. In addition, theoretical underpinnings of Chomskyan linguistics such as the so-called "[[poverty of the stimulus]]" argument entail that general learning algorithms, as are typically used in machine learning, cannot be successful in language processing. As a result, the Chomskyan paradigm discouraged the application of such models to language processing.</ref>
*'''1990s''': Many of the notable early successes in statistical methods in NLP occurred in the field of [[machine translation]], due especially to work at IBM Research, such as [[IBM alignment models]] (a minimal sketch of the underlying word-alignment idea follows this list). These systems were able to take advantage of existing multilingual [[text corpus|textual corpora]] that had been produced by the [[Parliament of Canada]] and the [[European Union]] as a result of laws calling for the translation of all governmental proceedings into all official languages of the corresponding systems of government. However, most other systems depended on corpora specifically developed for the tasks implemented by these systems, which was (and often continues to be) a major limitation in the success of these systems. As a result, a great deal of research has gone into methods of more effectively learning from limited amounts of data.
*'''2000s''': With the growth of the web, increasing amounts of raw (unannotated) language data have become available since the mid-1990s. Research has thus increasingly focused on [[unsupervised learning|unsupervised]] and [[semi-supervised learning]] algorithms. Such algorithms can learn from data that has not been hand-annotated with the desired answers, or from a combination of annotated and non-annotated data. Generally, this task is much more difficult than [[supervised learning]], and typically produces less accurate results for a given amount of input data. However, there is an enormous amount of non-annotated data available (including, among other things, the entire content of the [[World Wide Web]]), which can often make up for the inferior results if the algorithm used has a low enough [[time complexity]] to be practical.
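A minimal sketch of the word-alignment idea behind the IBM alignment models referenced above: a bare-bones IBM Model 1 expectation–maximization loop over an invented three-sentence parallel corpus. The corpus and variable names are illustrative, not IBM's actual systems or data.

<syntaxhighlight lang="python">
from collections import defaultdict

# Invented toy parallel corpus (English-French), standing in for the large
# government-proceedings corpora mentioned above.
corpus = [
    ("the house".split(), "la maison".split()),
    ("the book".split(), "le livre".split()),
    ("a house".split(), "une maison".split()),
]

# Initialise the lexical translation probabilities t(f|e) uniformly.
english_vocab = {e for es, _ in corpus for e in es}
t = defaultdict(lambda: 1.0 / len(english_vocab))

# A few expectation-maximization iterations of IBM Model 1 (no NULL word).
for _ in range(10):
    count = defaultdict(float)   # expected co-occurrence counts
    total = defaultdict(float)   # normalisers per English word
    for es, fs in corpus:
        for f in fs:
            # E-step: distribute each foreign word's probability mass
            # over the English words of the same sentence pair.
            norm = sum(t[(f, e)] for e in es)
            for e in es:
                frac = t[(f, e)] / norm
                count[(f, e)] += frac
                total[e] += frac
    # M-step: re-estimate t(f|e) from the expected counts.
    for (f, e), c in count.items():
        t[(f, e)] = c / total[e]

# "maison" emerges as the most probable translation of "house".
best = max((f for (f, e) in t if e == "house"), key=lambda f: t[(f, "house")])
print(best, round(t[(best, "house")], 2))
</syntaxhighlight>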
 
=== Neural NLP (present) ===