Revision as of 20:25, 16 December 2023 by 2601:447:c601:3690:ed3b:6a76:4b96:4523 (no edit summary)
Revision as of 01:08, 30 December 2023 by IntentionallyDense (minor edit: v2.05b - WPCleaner - Fix errors for CW project (Double pipe in a link); tag: WPCleaner)
Line 6: Language models are useful for a variety of tasks, including [[speech recognition]]<ref>Kuhn, Roland, and Renato De Mori (1990). [https://www.researchgate.net/profile/Roland_Kuhn2/publication/3191800_Cache-based_natural_language_model_for_speech_recognition/links/004635184ee5b2c24f000000.pdf "A cache-based natural language model for speech recognition"]. ''IEEE Transactions on Pattern Analysis and Machine Intelligence'' 12.6: 570–583.</ref> (helping prevent predictions of low-probability, e.g. nonsense, sequences), [[machine translation]],<ref name="Semantic parsing as machine translation">Andreas, Jacob, Andreas Vlachos, and Stephen Clark (2013). [https://www.aclweb.org/anthology/P13-2009 "Semantic parsing as machine translation"] {{Webarchive\|url=https://web.archive.org/web/20200815080932/https://www.aclweb.org/anthology/P13-2009/ \|date=15 August 2020 }}. Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers).</ref> [[natural language generation]] (generating more human-like text), [[optical character recognition]], [[handwriting recognition]],<ref>Pham, Vu, et al. (2014). [https://arxiv.org/abs/1312.4569 "Dropout improves recurrent neural networks for handwriting recognition"] {{Webarchive\|url=https://web.archive.org/web/20201111170554/https://arxiv.org/abs/1312.4569 \|date=11 November 2020 }}. 14th International Conference on Frontiers in Handwriting Recognition. IEEE.</ref> [[grammar induction]],<ref>Htut, Phu Mon, Kyunghyun Cho, and Samuel R. Bowman (2018). [https://arxiv.org/pdf/1808.10000.pdf?source=post_page--------------------------- "Grammar induction with neural language models: An unusual replication"] {{Webarchive\|url=https://web.archive.org/web/20220814010528/https://arxiv.org/pdf/1808.10000.pdf?source=post_page--------------------------- \|date=14 August 2022 }}. {{arXiv\|1808.10000}}.</ref> and [[information retrieval]].<ref name=ponte1998>{{cite conference \|first1=Jay M.
\|last1=Ponte \|first2= W. Bruce \|last2=Croft \| title= A language modeling approach to information retrieval \|conference=Proceedings of the 21st ACM SIGIR Conference \|year=1998 \|publisher=ACM \|place=Melbourne, Australia \| pages = 275–281\| doi=10.1145/290941.291008}}</ref><ref name=hiemstra1998>{{cite conference \| first=Djoerd \| last=Hiemstra \| year = 1998 \| title = A linguistically motivated probabilistic model of information retrieval \| conference = Proceedings of the 2nd European conference on Research and Advanced Technology for Digital Libraries \| publisher = LNCS, Springer \| pages=569–584 \| doi= 10.1007/3-540-49653-X_34}}</ref> [[Large language model]]s, currently their most advanced form, combine larger datasets (frequently text scraped from the public internet), [[feedforward neural network]]s, and [[transformer (machine learning)\|transformer]]s. They have superseded [[recurrent neural network]]-based models, which had in turn superseded the pure statistical models, such as the [[Word n-gram language model\|word ''n''-gram language model]].

== Pure statistical models ==

Language model: Difference between revisions