Language model: Difference between revisions

No edit summary
Tags: Visual edit, Mobile edit, Mobile web edit
Citation bot (talk | contribs)
Altered title. Removed URL that duplicated identifier. | Use this bot. Report bugs. | Suggested by Dominic3203 | Linked from User:LinguisticMystic/cs/outline | #UCB_webform_linked 1121/2277
Line 11:
In 1980, statistical approaches were explored and found to be more useful for many purposes than rule-based formal grammars. Discrete representations like [[Word n-gram language model|word ''n''-gram language models]], with probabilities for discrete combinations of words, made significant advances.
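
As a minimal sketch of what "probabilities for discrete combinations of words" can mean, the snippet below estimates bigram probabilities by simple counting. The function name `bigram_probabilities` and the toy corpus are assumptions made for illustration, and the estimate is an unsmoothed maximum-likelihood one, not a method taken from the article.

```python
from collections import Counter


def bigram_probabilities(tokens):
    """Estimate P(next | previous) by relative frequency (no smoothing)."""
    # Count each adjacent word pair and each word that appears as a context.
    pair_counts = Counter(zip(tokens, tokens[1:]))
    context_counts = Counter(tokens[:-1])
    return {
        (prev, nxt): count / context_counts[prev]
        for (prev, nxt), count in pair_counts.items()
    }


# Hypothetical toy corpus, used only to exercise the function.
toy_corpus = "the cat sat on the mat".split()
probs = bigram_probabilities(toy_corpus)

for (prev, nxt), p in sorted(probs.items()):
    print(f"P({nxt} | {prev}) = {p:.2f}")
```

Because the estimate is a plain relative frequency, any word pair absent from the corpus gets no probability at all, which is the gap that smoothing techniques are meant to address.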
 
In the 2000s, continuous representations for words, such as [[Word2vec|word embeddings]], began to replace discrete representations.<ref>{{Cite news |date=2022-02-22 |title=The Nature Of Life, The Nature Of Thinking: Looking Back On Eugene Charniak's Work And Life |url=https://cs.brown.edu/news/2022/02/22/the-nature-of-life-the-nature-of-thinking-looking-back-on-eugene-charniaks-work-and-life/ |archive-url=http://web.archive.org/web/20241103134558/https://cs.brown.edu/news/2022/02/22/the-nature-of-life-the-nature-of-thinking-looking-back-on-eugene-charniaks-work-and-life/ |archive-date=2024-11-03 |access-date=2025-02-05 |language=en}}</ref> Typically, the representation is a [[Real number|real-valued]] vector that encodes the meaning of the word in such a way that words closer together in the vector space are expected to be similar in meaning, and that common relationships between pairs of words, such as plurality or gender, are reflected in the geometry of that space.
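
The idea of closeness in a vector space can be sketched with cosine similarity over a few hand-made vectors. Everything in the snippet is an assumption for illustration: the three-dimensional `toy_embeddings` are invented rather than learned, and `cosine_similarity` and `nearest_word` are hypothetical helper names, not part of any particular library.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_word(vector, embeddings, exclude=()):
    """Return the word whose embedding is most similar to `vector`."""
    candidates = (w for w in embeddings if w not in exclude)
    return max(candidates, key=lambda w: cosine_similarity(vector, embeddings[w]))


# Invented toy vectors; real embeddings are learned and much higher-dimensional.
toy_embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man": [0.1, 0.9, 0.1],
    "woman": [0.1, 0.2, 0.8],
}

# A gender-like relationship expressed as a vector offset: king - man + woman.
analogy = [
    k - m + w
    for k, m, w in zip(
        toy_embeddings["king"], toy_embeddings["man"], toy_embeddings["woman"]
    )
]
result = nearest_word(analogy, toy_embeddings, exclude=("king", "man", "woman"))
print(result)
```

The toy vectors were chosen so that the offset lands on "queen"; with learned embeddings such relationships hold only approximately.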
 
== Pure statistical models ==
Line 101:
{{refbegin}}
 
* {{cite conference |author1=Jay M. Ponte |author2=W. Bruce Croft |citeseerx=10.1.1.117.4237 |doi=10.1145/290941.291008 |doi-access=free |url=https://dl.acm.org/doi/10.1145/290941.291008 |title=A Language Modeling Approach to Information Retrieval |book-title=Research and Development in Information Retrieval |year=1998 |pages=275–281 }}
* {{cite conference |author1=Fei Song |author2=W. Bruce Croft |citeseerx=10.1.1.21.6467 |doi=10.1145/319950.320022 |doi-access=free |url=https://dl.acm.org/doi/10.1145/319950.320022 |title=A General Language Model for Information Retrieval |book-title=Research and Development in Information Retrieval |year=1999 |pages=279–280 }}
* {{cite tech report |first=Stanley F. |last=Chen |author2=Joshua Goodman |title=An Empirical Study of Smoothing Techniques for Language Modeling |institution=Harvard University |year=1998 |citeseerx=10.1.1.131.5458 |url=https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=273adbdb43097636aa9260d9ecd60d0787b0ef4d }}