Outline of natural language processing: Difference between revisions

rmv - COI spam, WP:EL
Line 347:
|
|-
|'''[[Jabberwacky]] '''
|1982
|[[Rollo Carpenter]]
Line 389:
|[[IBM]]
|A question answering system that won the [[Jeopardy!]] contest, defeating the best human players in February 2011.
|-
|[[Rosoka]]
|2007
|[[Rosoka Software]]
|Multilingual NLP for government and commercial applications. Extracts entities and relationships, and provides entity resolution, language identification, sentiment analysis, and geotagging.
|
|-
|MeTA
Line 433 ⟶ 427:
* [[Concept mining]] –
* [[Content determination]] –
* [[Converseon]] –
* [[DATR]] –
* [[DBpedia Spotlight]] –
* [[Deep linguistic processing]] –
Line 514 ⟶ 507:
 
=== Natural language processing toolkits ===
The following '''natural language processing [[List of toolkits|toolkits]]''' are notable collections of [[natural language processing]] software. They are suites of [[Library (computer science)|libraries]], [[Software framework|frameworks]], and [[Software application|applications]] for symbolic and statistical natural language and speech processing.
 
{|class="wikitable sortable"
!Name!!Language!!License!!Creators!!Website
|-
|[[Apertium]] || [[C++]], [[Java (programming language)|Java]] || [[GPL]] || (various) || [http://wiki.apertium.org/]
|-
|[[ChatScript]] || [[C++]] || [[GPL]] || [[Bruce Wilcox]] || [http://brilligunderstanding.com/]
|-
|[[Deeplearning4j]] || [[Java (programming language)|Java]], [[Scala (programming language)|Scala]] || [[Apache License|Apache 2.0]] || Adam Gibson, Skymind || [http://deeplearning4j.org/]
|Ariane || GETA's Specialized Languages for Linguistic Programming (Ariane-G5) || [[BSD]] + miscellaneous || Vincent Berment (Ariane-H online version) || [http://lingwarium.org/]
|-
|[[DELPH-IN]] || [[LISP]], [[C++]] || [[LGPL]], [[MIT License|MIT]], ... || Deep Linguistic Processing with [[HPSG]] Initiative || [http://www.delph-in.net/]
|[[Deeplearning4j]] || [[Java (programming language)|Java]], [[Scala (programming language)|Scala]] || [[Apache License|Apache 2.0]] || Adam Gibson, Skymind || [http://deeplearning4j.org/]
|-
|[[Distinguo]] || [[C++]] ||Commercial || [[Ultralingua Inc.]] || [http://ultralingua.com/en/semantic-search.htm]
|[[DELPH-IN]] || [[LISP]], [[C++]] || [[LGPL]], [[MIT License|MIT]], ... || Deep Linguistic Processing with [[HPSG]] Initiative || [http://www.delph-in.net/]
|-
|[[DKPro]] Core||[[Java (programming language)|Java]]||[[Apache License|Apache 2.0]] / Varying for individual modules ||[[Technische Universität Darmstadt]] / Online community
|[[Distinguo]] || [[C++]] ||Commercial || [[Ultralingua Inc.]] || [http://ultralingua.com/en/semantic-search.htm]
|-
|[[DKPro]] Core||[[Java (programming language)|Java]]||[[Apache License|Apache 2.0]] / Varying for individual modules ||[[Technische Universität Darmstadt]] / Online community || [https://dkpro.github.io/dkpro-core/]
|[[General Architecture for Text Engineering]] (GATE)||[[Java (programming language)|Java]]||[[LGPL]]||GATE open source community
|-
|[[FreeLing]] || [[C++]] (with [[Java (programming language)|Java]], [[Python (programming language)|Python]], and [[Perl]] APIs) || Affero GPL || [http://www.talp.upc.edu TALP Research Center], [[Universitat Politècnica de Catalunya]] || [http://nlp.cs.upc.edu/freeling]
|[[Gensim]] || [[Python (programming language)|Python]] || [[LGPL]] || Radim Řehůřek
|-
|[[General Architecture for Text Engineering]] (GATE)||[[Java (programming language)|Java]]||[[LGPL]]||GATE open source community|| [http://gate.ac.uk/]
|[[LinguaStream]]||[[Java (programming language)|Java]]||Free for research ||[[University of Caen]], [[France]]
|-
|[[Gensim]] || [[Python (programming language)|Python]] || [[LGPL]] || Radim Řehůřek || [https://github.com/piskvorky/gensim/]
|[[Mallet (software project)|Mallet]] || [[Java (programming language)|Java]] || [[Common Public License]] ||[[University of Massachusetts Amherst]]
|-
|[[LinguaStream]]||[[Java (programming language)|Java]]||Free for research ||[[University of Caen]], [[France]]|| [http://www.linguastream.org/]
|[[Modular Audio Recognition Framework]]||[[Java (programming language)|Java]]||[[BSD license|BSD]]||The MARF Research and Development Group, [[Concordia University (Quebec)|Concordia University]]
|-
|[[Mallet (software project)|Mallet]] || [[Java (programming language)|Java]] || [[Common Public License]] || [[University of Massachusetts Amherst]] || [http://mallet.cs.umass.edu/]
|[[MontyLingua]]||[[Python (programming language)|Python]], [[Java (programming language)|Java]] ||Free for research || [[MIT]]
|-
|[[Modular Audio Recognition Framework]] || [[Java (programming language)|Java]] || [[BSD license|BSD]] || The MARF Research and Development Group, [[Concordia University (Quebec)|Concordia University]] || [http://marf.sf.net]
|[[Natural Language Toolkit]] (NLTK) || [[Python (programming language)|Python]] || [[Apache License|Apache 2.0]] ||
|-
|Apache [[OpenNLP]] || [[Java (programming language)|Java]]||[[Apache Software Foundation|Apache License 2.0]]||Online community || [http://opennlp.apache.org]
|[https://meta-toolkit.org/ MeTA]
|C++, Python (wrapper via metapy)
|Free for research
|Sean Massung, Chase Geigle, ChengXiang Zhai at the University of Illinois Urbana-Champaign
|[https://meta-toolkit.org/]
|-
|[[MontyLingua]] || [[Python (programming language)|Python]], [[Java (programming language)|Java]]||Free for research ||[[MIT]] || [http://web.media.mit.edu/~hugo/montylingua/]
|-
|[[Natural Language Toolkit]] (NLTK) ||[[Python (programming language)|Python]] || [[Apache License|Apache 2.0]] || || [http://www.nltk.org]
|-
|NLP Lean Programming framework (NLPf)
|[[Java (programming language)|Java]]
|[[GNU Lesser General Public License|LGPL]]
|
|[https://gitlab.com/schrieveslaach/NLPf]
|-
|[[TextBlob]] || [[Python (programming language)|Python]] || [[MIT License|MIT]] || Steven Loria et al. || [https://textblob.readthedocs.io/en/dev/]
|-
|[[Rosoka Toolkit]] ||[[Java (programming language)|Java]] || Commercial || [[Rosoka Software, Corp.]]|| [https://www.rosoka.com/natural-language-processing]
|-
|Apache [[OpenNLP]] || [[Java (programming language)|Java]]||[[Apache Software Foundation|Apache License 2.0]]||Online community || [http://opennlp.apache.org]
|-
|Rasa NLU|| [[Python (programming language)|Python]]||[[Apache License|Apache 2.0]]||Rasa, Open Source Community || [http://rasa.com/]
|-
|[[spaCy]]
Line 573 ⟶ 544:
|[[MIT License|MIT]]
|Matthew Honnibal, Explosion AI
|[https://spacy.io]
|-
|[[UIMA]]||[[Java (programming language)|Java]] / [[C++]] || [[Apache License|Apache 2.0]] || [[Apache Software Foundation|Apache]] || [http://incubator.apache.org/uima/index.html]
|-
|[https://github.com/IllinoisCogComp/illinois-cogcomp-nlp CogcompNLP]
|[[Java (programming language)|Java]]
|Research and Academic Use License
|Cognitive Computation Group (Dan Roth)
|[https://github.com/IllinoisCogComp/illinois-cogcomp-nlp]
|-
|CoreNLP
|[[Java (programming language)|Java]]
|GNU GPL
|Stanford
|[https://stanfordnlp.github.io/CoreNLP/index.html]
|-
|[[Natural Language Toolkit]] (NLTK) ||[[Python (programming language)|Python]] || [[Apache License|Apache 2.0]] || || [http://www.nltk.org]
|[[UIMA]]||[[Java (programming language)|Java]] / [[C++]]|| [[Apache License|Apache 2.0]] ||[[Apache Software Foundation|Apache]]
|[https://github.com/JohnSnowLabs/spark-nlp Spark NLP]
|[[Scala (programming language)|Scala]] / [[Python (programming language)|Python]]
|[[Apache License|Apache 2.0]]
|JohnSnowLabs
|[https://github.com/JohnSnowLabs/spark-nlp]
|-
|}
Line 599 ⟶ 551:
=== Named entity recognizers ===
* ABNER (A Biomedical Named Entity Recognizer) – open source text mining program that uses linear-chain conditional random field sequence models. It automatically tags genes, proteins and other entity names in text. Written by Burr Settles of the University of Wisconsin-Madison.
* [http://nlp.stanford.edu/software/CRF-NER.shtml Stanford NER] (Named Entity Recognizer) — Java implementation of a Named Entity Recognizer that uses linear-chain conditional random field sequence models. It automatically tags persons, organizations, and locations in text in English, German, Chinese, and Spanish languages. Written by Jenny Finkel and other members of the Stanford NLP Group at Stanford University.
 
=== Translation software ===
Line 613 ⟶ 565:
 
=== Other software ===
* [[CTAKES]] – open-source natural language processing system for information extraction from electronic medical record clinical free-text. It processes clinical notes, identifying types of clinical named entities — drugs, diseases/disorders, signs/symptoms, anatomical sites and procedures. Each named entity has attributes for the text span, the ontology mapping code, context (family history of, current, unrelated to patient), and negated/not negated. Also known as Apache cTAKES.
* [[Boris (software)|BORIS]] –
* [[Digital Media Access Protocol|DMAP]] –
* [[CTAKES]] – open-source natural language processing system for information extraction from electronic medical record clinical free-text. It processes clinical notes, identifying types of clinical named entities — drugs, diseases/disorders, signs/symptoms, anatomical sites and procedures. Each named entity has attributes for the text span, the ontology mapping code, context (family history of, current, unrelated to patient), and negated/not negated. Also known as Apache cTAKES.
* [http://cubic.ai/ Cubic.ai] - voice assistant for a smart home. Cubic is a mobile app for Android devices that enables users to control smart home devices with voice commands.
* [[Digital Media Access Protocol|DMAP]] –
* [[ETAP-3]] &ndash; proprietary linguistic processing system focusing on English and Russian.<ref>{{cite web|url=http://www.iitp.ru/ru/science/works/452.htm |title=МНОГОЦЕЛЕВОЙ ЛИНГВИСТИЧЕСКИЙ ПРОЦЕССОР ЭТАП-3 |publisher=Iitp.ru |date= |accessdate=2012-02-14}}</ref> It is a [[Rule-based machine translation|rule-based system]] which uses the [[Meaning-Text Theory]] as its theoretical foundation.
* [[JAPE (linguistics)|JAPE]] &ndash; the Java Annotation Patterns Engine, a component of the open-source General Architecture for Text Engineering (GATE) platform. JAPE is a finite state transducer that operates over annotations based on regular expressions.
Line 634 ⟶ 584:
* [[Festival Speech Synthesis System]] &ndash;
* [[CMU Sphinx]] speech recognition system &ndash;
* [[Language Grid]] - Open source platform for language web services, which can customize language services by combining existing language services.
* [http://parlo.io Parlo Broca] - Enterprise Natural Language Understanding system for building enterprise chatbots
* [[Language Grid]] - Open source platform for language web services, which can customize language services by combining existing language services.
 
=== Chatterbots ===
Line 664 ⟶ 613:
*[[MegaHAL]]
*[[Mitsuku]], 2013 and 2016 [[Loebner Prize]] winner<ref>{{cite web|url=http://www.paulmckevitt.com/loebner2013/|title=Loebner Prize Contest 2013 |publisher=People.exeter.ac.uk |date=2013-09-14 |accessdate=2013-12-02}}</ref>
*[http://raphaella.mochuelitofriki.com/ Raphaella Ai 01110110]
*Rose - ... 2015 - 3x [[Loebner Prize]] winner, by [[Bruce Wilcox]].
*[[SimSimi]] - A popular artificial intelligence conversation program that was created in 2002 by ISMaker.
Line 670 ⟶ 618:
*[[Ultra Hal Assistant|Ultra Hal]] - 2007 [[Loebner Prize]] winner, by [[Robert Medeksza]].
*[[Verbot]]
*[http://ai.cronusbot.com/cronus/ Cronus Bot], by Innovative Solutions.
*[http://www.meetida.com IDA Chatbot], by Blunner.
 
====Instant messenger chatterbots====
Line 700 ⟶ 646:
}}</ref>
*[[Negobot]], a bot designed to catch online pedophiles by posing as a young girl and attempting to elicit personal details from people it speaks to.<ref>{{cite web|last1=Laorden|first1=Carlos|last2=Galan-Garcia|first2=Patxi|last3=Santos|first3=Igor|last4=Sanz|first4=Borja|last5=Hidalgo|first5=Jose Maria Gomez|last6=Bringas|first6=Pablo G.|title=Negobot: A conversational agent based on game theory for the detection of paedophile behaviour|url=http://paginaspersonales.deusto.es/isantos/publications/2012/Laorden_2012_CISIS_Negobot.pdf|isbn=978-3-642-33018-6|deadurl=yes|archiveurl=https://web.archive.org/web/20130917013039/http://paginaspersonales.deusto.es/isantos/publications/2012/Laorden_2012_CISIS_Negobot.pdf|archivedate=2013-09-17|df=}}</ref>
 
====Natural Language Understanding chatterbots====
*[[OnlineBotBuilder]], the first Online Chatbot Builder for [[Microsoft's Luis.ai service]] (March 2017 to present)<ref>{{Cite news
| last = Potschka
| first = Rob
| title = Online Bot Builder! Always Free! OnlineBotBuilder.com
| url = http://OnlineBotBuilder.com
| date = 2017-03-04
}}</ref>
 
== Natural language processing organizations ==
Line 761 ⟶ 698:
* [[Terry Winograd]] &ndash; professor of computer science at Stanford University, and co-director of the Stanford Human-Computer Interaction Group. He is known within the philosophy of mind and artificial intelligence fields for his work on natural language using the SHRDLU program.
* [[William Aaron Woods]] &ndash;
* [[Maurice Gross]] &ndash; author of the concept of local grammar,<ref name="AHI">[http://hdl.handle.net/2042/14456 Ibrahim, Amr Helmy. 2002. "Maurice Gross (1934-2001). À la mémoire de Maurice Gross". ''Hermès'' 34.]</ref> taking finite automata as the competence model of language.<ref name="RD">[http://www.nyu.edu/pages/linguistics/kaliedoscope/mauricegross13.pdf Dougherty, Ray. 2001. ''Maurice Gross Memorial Letter''.]</ref> Local grammars consisting of finite automata, coupled with morpho-syntactic dictionaries, support automatic text analysis<ref name="AHI"/><ref name="BL">[http://www.cairn.info/article.php?ID_ARTICLE=TL_046_0145 Lamiroy, Béatrice. 2003. " In memoriam Maurice Gross ", ''Travaux de linguistique'' 46:1, pp. 145-158.]</ref> by Intex software (now [http://www.nooj4nlp.net NooJ]) developed by Max Silberztein and by [http://www-igm.univ-mlv.fr/~unitex/index.php Unitex/GramLab] developed by the [http://ligm.u-pem.fr/ Gaspard-Monge Computer Science Laboratory (LIGM)].
* [[Stephen Wolfram]] &ndash; CEO and founder of [[Wolfram Research]], creator of the programming language (natural language understanding) [[Wolfram Language]], and natural language processing computation engine [[Wolfram Alpha]].<ref>http://blog.wolfram.com/2010/11/16/programming-with-natural-language-is-actually-going-to-work/</ref>
* [[Victor Yngve]] &ndash;