Revision as of 20:20, 20 January 2025 by MrOllie: Reverted 1 edit by Sarahouza (talk): Rv blog (Tags: Twinkle, Undo)
Revision as of 03:50, 10 February 2025 by Closed Limelike Curves: →Neural NLP (present): These are the same paradigm/subparadigm
Line 26: *'''2000s''': With the growth of the web, increasing amounts of raw (unannotated) language data have become available since the mid-1990s. Research has thus increasingly focused on [[unsupervised learning\|unsupervised]] and [[semi-supervised learning]] algorithms. Such algorithms can learn from data that has not been hand-annotated with the desired answers, or from a combination of annotated and non-annotated data. Generally, this task is much more difficult than [[supervised learning]], and typically produces less accurate results for a given amount of input data. However, an enormous amount of non-annotated data is available (including, among other things, the entire content of the [[World Wide Web]]), which can often make up for the inferior results if the algorithm used has a low enough [[time complexity]] to be practical. ~~=== Neural NLP (present) ===~~ In 2003, the [[word n-gram language model\|word n-gram model]], at the time the best statistical algorithm, was outperformed in [[language model]]ling by a [[multi-layer perceptron]] (with a single hidden layer and a context length of several words, trained on up to 14 million words with a CPU cluster) developed by [[Yoshua Bengio]] and co-authors.<ref>{{Cite journal\|url=https://dl.acm.org/doi/10.5555/944919.944966\|title=A neural probabilistic language model\|first1=Yoshua\|last1=Bengio\|first2=Réjean\|last2=Ducharme\|first3=Pascal\|last3=Vincent\|first4=Christian\|last4=Janvin\|date=March 1, 2003\|journal=The Journal of Machine Learning Research\|volume=3\|pages=1137–1155\|via=ACM Digital Library}}</ref>

Natural language processing: Difference between revisions