Language model: Difference between revisions

{{Use dmy dates|date=July 2022}}
 
A '''language model''' is a probabilistic [[Model#Conceptual model|model]] of a natural language.<ref>{{cite book |last1=Jurafsky |first1=Dan |last2=Martin |first2=James H. |title=Speech and Language Processing |date=2021 |edition=3rd |url=https://web.stanford.edu/~jurafsky/slp3/ |access-date=24 May 2022 |chapter=N-gram Language Models |archive-date=22 May 2022 |archive-url=https://web.archive.org/web/20220522005855/https://web.stanford.edu/~jurafsky/slp3/ |url-status=live }}</ref> In 1980, the first significant statistical language model was proposed, and during the decade IBM performed ‘[[Claude Shannon|Shannon]]-style’ experiments, in which potential sources for language modeling improvement were identified by observing and analyzing the performance of human subjects in predicting or correcting text.<ref>{{cite journal |last1=Rosenfeld |first1=Ronald |year=2000 |title=Two decades of statistical language modeling: Where do we go from here? |journal=Proceedings of the IEEE |volume=88 |issue=8|pages=1270–1278 |doi=10.1109/5.880083 |s2cid=10959945 |url=https://figshare.com/articles/journal_contribution/6611138 }}</ref>
 
Language models are useful for a variety of tasks, including [[speech recognition]]<ref>Kuhn, Roland, and Renato De Mori (1990). [https://www.researchgate.net/profile/Roland_Kuhn2/publication/3191800_Cache-based_natural_language_model_for_speech_recognition/links/004635184ee5b2c24f000000.pdf "A cache-based natural language model for speech recognition"]. ''IEEE Transactions on Pattern Analysis and Machine Intelligence'' 12.6: 570–583.</ref> (helping prevent predictions of low-probability, e.g. nonsense, sequences), [[machine translation]],<ref name="Semantic parsing as machine translation">Andreas, Jacob, Andreas Vlachos, and Stephen Clark (2013). [https://www.aclweb.org/anthology/P13-2009 "Semantic parsing as machine translation"] {{Webarchive|url=https://web.archive.org/web/20200815080932/https://www.aclweb.org/anthology/P13-2009/ |date=15 August 2020 }}. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers).</ref> [[natural language generation]] (generating more human-like text), [[optical character recognition]], [[handwriting recognition]],<ref>Pham, Vu, et al. (2014). [https://arxiv.org/abs/1312.4569 "Dropout improves recurrent neural networks for handwriting recognition"] {{Webarchive|url=https://web.archive.org/web/20201111170554/https://arxiv.org/abs/1312.4569 |date=11 November 2020 }}. 14th International Conference on Frontiers in Handwriting Recognition. IEEE.</ref> [[grammar induction]],<ref>Htut, Phu Mon, Kyunghyun Cho, and Samuel R. Bowman (2018). [https://arxiv.org/pdf/1808.10000.pdf?source=post_page--------------------------- "Grammar induction with neural language models: An unusual replication"] {{Webarchive|url=https://web.archive.org/web/20220814010528/https://arxiv.org/pdf/1808.10000.pdf?source=post_page--------------------------- |date=14 August 2022 }}. {{arXiv|1808.10000}}.</ref> and [[information retrieval]].<ref name=ponte1998>{{cite conference |first1=Jay M. |last1=Ponte |first2= W. Bruce |last2=Croft | title= A language modeling approach to information retrieval |conference=Proceedings of the 21st ACM SIGIR Conference |year=1998 |publisher=ACM |place=Melbourne, Australia | pages = 275–281| doi=10.1145/290941.291008}}</ref><ref name=hiemstra1998>{{cite conference | first=Djoerd | last=Hiemstra | year = 1998 | title = A linguistically motivated probabilistic model of information retrieval | conference = Proceedings of the 2nd European conference on Research and Advanced Technology for Digital Libraries | publisher = LNCS, Springer | pages=569–584 | doi= 10.1007/3-540-49653-X_34}}</ref>