=== Statistics ===
Machine learning and [[statistics]] are closely related fields in terms of methods, but distinct in their principal goal: statistics draws population [[Statistical inference|inferences]] from a [[Sample (statistics)|sample]], while machine learning finds generalisable predictive patterns.<ref>{{cite journal |first1=Danilo |last1=Bzdok |first2=Naomi |last2=Altman |author-link2=Naomi Altman |first3=Martin |last3=Krzywinski |title=Statistics versus Machine Learning |journal=[[Nature Methods]] |volume=15 |issue=4 |pages=233–234 |year=2018 |doi=10.1038/nmeth.4642 |pmid=30100822 |pmc=6082636 }}</ref>
Conventional statistical analyses require the ''a priori'' selection of a model most suitable for the study data set. In addition, only variables that are statistically significant or theoretically relevant, based on previous experience, are included in the analysis. In contrast, machine learning is not built on a pre-structured model; rather, the data shape the model by detecting underlying patterns. The more variables (input) used to train the model, the more accurate the final model can be.<ref>{{cite journal |last=Hung |first=A. J. |display-authors=etal |title=Algorithms to Measure Surgeon Performance and Anticipate Clinical Outcomes in Robotic Surgery |journal=JAMA Surgery |year=2018}}</ref>
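To make the contrast concrete, the following minimal sketch (illustrative only, not drawn from the cited sources; it assumes the scikit-learn library and synthetic data) fits a pre-specified two-variable linear model alongside a flexible model that learns its structure from all available inputs:

<syntaxhighlight lang="python">
# Illustrative contrast between a pre-structured statistical model and a
# pattern-driven machine learning model fitted to the same synthetic data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                 # ten candidate input variables
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=500)

# Statistical workflow: model form and variables are chosen a priori.
stat_model = LinearRegression().fit(X[:, :2], y)   # two pre-selected variables

# Machine learning workflow: a flexible model detects patterns in all inputs.
ml_model = RandomForestRegressor(random_state=0).fit(X, y)

print(stat_model.score(X[:, :2], y))           # fit of the pre-structured model
print(ml_model.score(X, y))                    # fit of the data-shaped model
</syntaxhighlight>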
{{Main|Feature learning}}
Several learning algorithms aim to discover better representations of the inputs provided during training.<ref name="pami">{{cite journal |author1=Y. Bengio |author2=A. Courville |author3=P. Vincent |title=Representation Learning: A Review and New Perspectives |journal= IEEE Transactions on Pattern Analysis and Machine Intelligence|year=2013|doi=10.1109/tpami.2013.50 |pmid=23787338 |volume=35 |issue=8 |pages=1798–1828|arxiv=1206.5538 |bibcode=2013ITPAM..35.1798B |s2cid=393948 }}</ref> Classic examples include [[principal component analysis]] and cluster analysis. Feature learning algorithms, also called representation learning algorithms, often attempt to preserve the information in their input while transforming it in a way that makes it useful, often as a pre-processing step before performing classification or prediction. This technique allows reconstruction of inputs coming from the unknown data-generating distribution, while not necessarily being faithful to configurations that are implausible under that distribution. This replaces manual [[feature engineering]], and allows a machine to both learn the features and use them to perform a specific task.
Feature learning can be either supervised or unsupervised. In supervised feature learning, features are learned using labelled input data. Examples include [[artificial neural network]]s, [[multilayer perceptron]]s, and supervised [[dictionary learning]]. In unsupervised feature learning, features are learned with unlabelled input data. Examples include dictionary learning, [[independent component analysis]], [[autoencoder]]s, [[matrix decomposition|matrix factorisation]]<ref>{{cite conference |author1=Nathan Srebro |author2=Jason D. M. Rennie |author3=Tommi S. Jaakkola |title=Maximum-Margin Matrix Factorization |conference=[[Conference on Neural Information Processing Systems|NIPS]] |year=2004}}</ref> and various forms of [[Cluster analysis|clustering]].<ref name="coates2011">{{cite conference |last1=Coates |first1=Adam |last2=Lee |first2=Honglak |last3=Ng |first3=Andrew Y. |title=An analysis of single-layer networks in unsupervised feature learning |conference=International Conference on Artificial Intelligence and Statistics (AISTATS) |year=2011}}</ref>
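As a minimal sketch of unsupervised feature learning used as a pre-processing step (illustrative only; it assumes the scikit-learn library and its bundled digits data set), principal components can be learned from the inputs alone and then reused by a downstream classifier:

<syntaxhighlight lang="python">
# Unsupervised feature learning with PCA: a representation is learned from the
# inputs without labels, then used as a pre-processing step for classification.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pca = PCA(n_components=20).fit(X_train)      # learn a 20-dimensional representation
clf = LogisticRegression(max_iter=1000).fit(pca.transform(X_train), y_train)

print(clf.score(pca.transform(X_test), y_test))  # accuracy on the learned features
</syntaxhighlight>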
=== Genetic algorithms ===
{{Main|Genetic algorithm}}
A genetic algorithm (GA) is a [[search algorithm]] and [[heuristic (computer science)|heuristic]] technique that mimics the process of [[natural selection]], using methods such as [[Mutation (genetic algorithm)|mutation]] and [[Crossover (genetic algorithm)|crossover]] to generate new [[Chromosome (genetic algorithm)|genotype]]s in the hope of finding good solutions to a given problem. In machine learning, genetic algorithms were used in the 1980s and 1990s.<ref>{{cite journal |last1=Goldberg |first1=David E. |first2=John H. |last2=Holland |title=Genetic algorithms and machine learning |journal=[[Machine Learning (journal)|Machine Learning]] |volume=3 |issue=2 |year=1988 |pages=95–99 |doi=10.1007/bf00113892 |s2cid=35506513 |url=https://deepblue.lib.umich.edu/bitstream/2027.42/46947/1/10994_2005_Article_422926.pdf |doi-access=free |access-date=3 September 2019 |archive-date=16 May 2011 |archive-url=https://web.archive.org/web/20110516025803/http://deepblue.lib.umich.edu/bitstream/2027.42/46947/1/10994_2005_Article_422926.pdf |url-status=live }}</ref><ref>{{Cite journal |title=Machine Learning, Neural and Statistical Classification |journal=Ellis Horwood Series in Artificial Intelligence |first1=D. |last1=Michie |first2=D. J. |last2=Spiegelhalter |first3=C. C. |last3=Taylor |year=1994 |bibcode=1994mlns.book.....M }}</ref> Conversely, machine learning techniques have been used to improve the performance of genetic and [[evolutionary algorithm]]s.<ref>{{cite journal |last1=Zhang |first1=Jun |last2=Zhan |first2=Zhi-hui |last3=Lin |first3=Ying |last4=Chen |first4=Ni |last5=Gong |first5=Yue-jiao |last6=Zhong |first6=Jing-hui |last7=Chung |first7=Henry S.H. |last8=Li |first8=Yun |last9=Shi |first9=Yu-hui |title=Evolutionary Computation Meets Machine Learning: A Survey |journal= IEEE Computational Intelligence Magazine|year=2011 |volume=6 |issue=4 |pages=68–75 |doi=10.1109/mci.2011.942584|bibcode=2011ICIM....6d..68Z |s2cid=6760276 }}</ref>
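The following minimal sketch (illustrative only; the "one-max" bit-counting objective is a toy assumption) shows the selection, crossover, and mutation loop that defines a genetic algorithm:

<syntaxhighlight lang="python">
# Minimal genetic algorithm: selection, crossover, and mutation evolve
# bit-string genotypes toward a fitness optimum (here, maximising the 1s).
import random

def fitness(genotype):
    return sum(genotype)                       # "one-max" toy objective

def crossover(a, b):
    point = random.randrange(1, len(a))        # single-point crossover
    return a[:point] + b[point:]

def mutate(genotype, rate=0.01):
    return [1 - g if random.random() < rate else g for g in genotype]

population = [[random.randint(0, 1) for _ in range(32)] for _ in range(50)]
for generation in range(100):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]                  # truncation selection
    population = [mutate(crossover(*random.sample(parents, 2)))
                  for _ in range(50)]

print(max(fitness(g) for g in population))     # best genotype found
</syntaxhighlight>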
=== Belief functions ===
{{Main|Dempster–Shafer theory}}
The theory of belief functions, also referred to as evidence theory or Dempster–Shafer theory, is a general framework for reasoning with uncertainty, with understood connections to other frameworks such as [[probability]], [[Possibility theory|possibility]] and [[Imprecise probability|imprecise probability theories]]. These theoretical frameworks can be thought of as a kind of learner, and they have analogous properties of how evidence is combined (e.g., Dempster's rule of combination), much as a [[Probability mass function|pmf]]-based Bayesian approach combines probabilities.<ref>{{Cite journal |last1=Verbert |first1=K. |last2=Babuška |first2=R. |last3=De Schutter |first3=B. |date=2017-04-01 |title=Bayesian and Dempster–Shafer reasoning for knowledge-based fault diagnosis–A comparative study |url=https://www.sciencedirect.com/science/article/abs/pii/S0952197617300118 |journal=Engineering Applications of Artificial Intelligence |volume=60 |pages=136–150 |doi=10.1016/j.engappai.2017.01.011 |issn=0952-1976}}</ref> However, belief functions carry many caveats compared to Bayesian approaches, particularly in how they incorporate ignorance and [[uncertainty quantification]]. Belief function approaches implemented within the machine learning ___domain typically fuse various [[ensemble methods]] to better handle the learner's [[decision boundary]], low sample sizes, and ambiguous classes, issues that standard machine learning approaches tend to have difficulty resolving.<ref name="YoosefzadehNajafabadi-2021">{{cite journal |last1=Yoosefzadeh-Najafabadi |first1=Mohsen |last2=Hugh |first2=Earl |last3=Tulpan |first3=Dan |last4=Sulik |first4=John |last5=Eskandari |first5=Milad |year=2021 |title=Application of Machine Learning Algorithms in Plant Breeding: Predicting Yield From Hyperspectral Reflectance in Soybean? |journal=Front. Plant Sci. |volume=11 }}</ref>
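As a minimal sketch of how evidence is combined under this framework (illustrative only; the two-hypothesis frame of discernment is an assumption), Dempster's rule of combination multiplies the masses of two sources, discards the mass assigned to conflicting (empty) intersections, and renormalises the remainder:

<syntaxhighlight lang="python">
# Dempster's rule of combination for two belief (mass) functions over the
# frame {a, b}: conflicting mass is discarded and the remainder renormalised.
from itertools import product

def combine(m1, m2):
    combined, conflict = {}, 0.0
    for (s1, v1), (s2, v2) in product(m1.items(), m2.items()):
        inter = s1 & s2
        if inter:
            combined[inter] = combined.get(inter, 0.0) + v1 * v2
        else:
            conflict += v1 * v2                # mass falling on the empty set
    return {s: v / (1.0 - conflict) for s, v in combined.items()}

# Two sources of evidence over the hypotheses "a" and "b".
m1 = {frozenset("a"): 0.6, frozenset("ab"): 0.4}
m2 = {frozenset("b"): 0.5, frozenset("ab"): 0.5}
print(combine(m1, m2))
</syntaxhighlight>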
=== Rule-based models ===
* [[DNA sequence]] classification
* [[Computational economics|Economics]]
* [[Data analysis|Financial market analysis]]
* [[General game playing]]
* [[Handwriting recognition]]
Recent advancements in machine learning have extended into the field of quantum chemistry, where novel algorithms now enable the prediction of solvent effects on chemical reactions, thereby offering new tools for chemists to tailor experimental conditions for optimal outcomes.<ref>{{Cite journal |last1=Chung |first1=Yunsie |last2=Green |first2=William H. |date=2024 |title=Machine learning from quantum chemistry to predict experimental solvent effects on reaction rates |journal=Chemical Science |language=en |volume=15 |issue=7 |pages=2410–2424 |doi=10.1039/D3SC05353A |issn=2041-6520 |pmc=10866337 |pmid=38362410 }}</ref>
Machine learning is becoming a useful tool to investigate and predict evacuation decision-making in both large-scale and small-scale disasters. Different solutions have been tested to predict whether and when householders decide to evacuate during wildfires and hurricanes.<ref>{{Cite journal |last1=Sun |first1=Yuran |last2=Huang |first2=Shih-Kai |last3=Zhao |first3=Xilei |date=1 February 2024 |title=Predicting Hurricane Evacuation Decisions with Interpretable Machine Learning Methods |journal=International Journal of Disaster Risk Science |language=en |volume=15 |issue=1 |pages=134–148 |doi=10.1007/s13753-024-00541-1 |issn=2192-6395 |doi-access=free |arxiv=2303.06557 |bibcode=2024IJDRS..15..134S }}</ref><ref>{{Citation |last1=Sun |first1=Yuran |title=8 - AI for large-scale evacuation modeling: promises and challenges |date=1 January 2024 |work=Interpretable Machine Learning for the Analysis, Design, Assessment, and Informed Decision Making for Civil Infrastructure |pages=185–204 |editor-last=Naser |editor-first=M. Z. |url=https://www.sciencedirect.com/science/article/pii/B9780128240731000149 |access-date=19 May 2024 |series=Woodhead Publishing Series in Civil and Structural Engineering |publisher=Woodhead Publishing |isbn=978-0-12-824073-1 |last2=Zhao |first2=Xilei |last3=Lovreglio |first3=Ruggiero |last4=Kuligowski |first4=Erica |archive-date=19 May 2024 |archive-url=https://web.archive.org/web/20240519121547/https://www.sciencedirect.com/science/article/abs/pii/B9780128240731000149 |url-status=live }}</ref><ref>{{Cite journal |last1=Xu |first1=Ningzhe |last2=Lovreglio |first2=Ruggiero |last3=Kuligowski |first3=Erica D. |last4=Cova |first4=Thomas J. |last5=Nilsson |first5=Daniel |last6=Zhao |first6=Xilei |date=1 March 2023 |title=Predicting and Assessing Wildfire Evacuation Decision-Making Using Machine Learning: Findings from the 2019 Kincade Fire |url=https://doi.org/10.1007/s10694-023-01363-1 |journal=Fire Technology |language=en |volume=59 |issue=2 |pages=793–825 |doi=10.1007/s10694-023-01363-1 |issn=1572-8099 |access-date=19 May 2024 |archive-date=19 May 2024 |archive-url=https://web.archive.org/web/20240519121534/https://link.springer.com/article/10.1007/s10694-023-01363-1 |url-status=live |url-access=subscription }}</ref> Other applications have focused on pre-evacuation decisions in building fires.<ref>{{Cite journal |last1=Wang |first1=Ke |last2=Shi |first2=Xiupeng |last3=Goh |first3=Algena Pei Xuan |last4=Qian |first4=Shunzhi |date=1 June 2019 |title=A machine learning based study on pedestrian movement dynamics under emergency evacuation |url=https://www.sciencedirect.com/science/article/pii/S037971121830376X |journal=Fire Safety Journal |volume=106 |pages=163–176 |doi=10.1016/j.firesaf.2019.04.008 |bibcode=2019FirSJ.106..163W |issn=0379-7112 |access-date=19 May 2024 |archive-date=19 May 2024 |archive-url=https://web.archive.org/web/20240519121539/https://www.sciencedirect.com/science/article/abs/pii/S037971121830376X |url-status=live |hdl=10356/143390 |hdl-access=free }}</ref><ref>{{Cite journal |last1=Zhao |first1=Xilei |last2=Lovreglio |first2=Ruggiero |last3=Nilsson |first3=Daniel |date=1 May 2020 |title=Modelling and interpreting pre-evacuation decision-making using machine learning |url=https://www.sciencedirect.com/science/article/pii/S0926580519313184 |journal=Automation in Construction |volume=113 |article-number=103140 |doi=10.1016/j.autcon.2020.103140 |hdl=10179/17315 |issn=0926-5805 |access-date=19 May 2024 |archive-date=19 May 2024
|archive-url=https://web.archive.org/web/20240519121548/https://www.sciencedirect.com/science/article/abs/pii/S0926580519313184 |url-status=live |hdl-access=free }}</ref>
== Limitations ==
=== Explainability ===
{{Main|Explainable artificial intelligence}}
Explainable AI (XAI), also known as interpretable AI or explainable machine learning (XML), is artificial intelligence (AI) in which humans can understand the decisions or predictions made by the AI.<ref>{{cite journal |last1=Rudin |first1=Cynthia |title=Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead |journal=Nature Machine Intelligence |date=2019 |volume=1 |issue=5 |pages=206–215 |doi=10.1038/s42256-019-0048-x |pmid=35603010 |pmc=9122117 }}</ref> It contrasts with the "black box" concept in machine learning, in which even the designers of an AI cannot explain why it arrived at a specific decision.<ref>{{cite journal |last1=Hu |first1=Tongxi |last2=Zhang |first2=Xuesong |last3=Bohrer |first3=Gil |last4=Liu |first4=Yanlan |last5=Zhou |first5=Yuyu |last6=Martin |first6=Jay |last7=Li |first7=Yang |last8=Zhao |first8=Kaiguang |title=Crop yield prediction via explainable AI and interpretable machine learning: Dangers of black box models for evaluating climate change impacts on crop yield |journal=Agricultural and Forest Meteorology |date=2023 |volume=336 |article-number=109458 |doi=10.1016/j.agrformet.2023.109458 |bibcode=2023AgFM..33609458H |s2cid=258552400 |doi-access=free }}</ref> By refining the mental models of users of AI-powered systems and dismantling their misconceptions, XAI promises to help users perform more effectively. XAI may be an implementation of the social [[right to explanation]].
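A minimal sketch of the contrast (illustrative only; it assumes the scikit-learn library and its bundled iris data set): a shallow decision tree is an interpretable model whose complete decision logic can be printed and read by a human, unlike a black-box model.

<syntaxhighlight lang="python">
# An interpretable model: a shallow decision tree's learned rules can be
# exported as readable if/else logic and inspected directly by a human.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2).fit(X, y)

# The complete decision logic, readable as nested if/else rules.
print(export_text(tree, feature_names=["sepal len", "sepal wid",
                                       "petal len", "petal wid"]))
</syntaxhighlight>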
=== Overfitting ===
| pages = 14192–14205
| doi = 10.1109/JIOT.2023.3340858
| bibcode = 2024IITJ...1114192A
| url = https://research-portal.uws.ac.uk/en/publications/c8edfe21-77d0-4c3e-a8bc-d384faf605a0
}}</ref> Running models directly on embedded devices eliminates the need to transfer and store data on cloud servers for further processing, thereby reducing the risk of data breaches, privacy leaks and theft of intellectual property, personal data and business secrets. Embedded machine learning can be achieved through various techniques, such as [[hardware acceleration]],<ref>{{Cite book|last1=Giri|first1=Davide|last2=Chiu|first2=Kuan-Lin|last3=Di Guglielmo|first3=Giuseppe|last4=Mantovani|first4=Paolo|last5=Carloni|first5=Luca P.|title=2020 Design, Automation & Test in Europe Conference & Exhibition (DATE) |chapter=ESP4ML: Platform-Based Design of Systems-on-Chip for Embedded Machine Learning |date=15 June 2020|chapter-url=https://ieeexplore.ieee.org/document/9116317|pages=1049–1054|doi=10.23919/DATE48585.2020.9116317|arxiv=2004.03640|isbn=978-3-9819263-4-7|s2cid=210928161|access-date=17 January 2022|archive-date=18 January 2022|archive-url=https://web.archive.org/web/20220118182342/https://ieeexplore.ieee.org/abstract/document/9116317?casa_token=5I_Tmgrrbu4AAAAA:v7pDHPEWlRuo2Vk3pU06194PO0-W21UOdyZqADrZxrRdPBZDMLwQrjJSAHUhHtzJmLu_VdgW|url-status=live}}</ref><ref>{{Cite web|last1=Louis|first1=Marcia Sahaya|last2=Azad|first2=Zahra|last3=Delshadtehrani|first3=Leila|last4=Gupta|first4=Suyog|last5=Warden|first5=Pete|last6=Reddi|first6=Vijay Janapa|last7=Joshi|first7=Ajay|date=2019|title=Towards Deep Learning using TensorFlow Lite on RISC-V|url=https://edge.seas.harvard.edu/publications/towards-deep-learning-using-tensorflow-lite-risc-v|access-date=17 January 2022|website=[[Harvard University]]|archive-date=17 January 2022|archive-url=https://web.archive.org/web/20220117031909/https://edge.seas.harvard.edu/publications/towards-deep-learning-using-tensorflow-lite-risc-v|url-status=live}}</ref> [[approximate computing]],<ref>{{Cite book|last1=Ibrahim|first1=Ali|last2=Osta|first2=Mario|last3=Alameh|first3=Mohamad|last4=Saleh|first4=Moustafa|last5=Chible|first5=Hussein|last6=Valle|first6=Maurizio|title=2018 25th IEEE International Conference on Electronics, Circuits and Systems (ICECS) |chapter=Approximate Computing Methods for Embedded Machine Learning |date=21 January 2019|pages=845–848|doi=10.1109/ICECS.2018.8617877|isbn=978-1-5386-9562-3|s2cid=58670712}}</ref> and model optimisation.<ref>{{Cite web|title=dblp: TensorFlow Eager: A Multi-Stage, Python-Embedded DSL for Machine Learning.|url=https://dblp.org/rec/journals/corr/abs-1903-01855.html|access-date=17 January 2022|website=dblp.org|language=en|archive-date=18 January 2022|archive-url=https://web.archive.org/web/20220118182335/https://dblp.org/rec/journals/corr/abs-1903-01855.html|url-status=live}}</ref><ref>{{Cite journal|last1=Branco|first1=Sérgio|last2=Ferreira|first2=André G.|last3=Cabral|first3=Jorge|date=5 November 2019|title=Machine Learning in Resource-Scarce Embedded Systems, FPGAs, and End-Devices: A Survey|journal=Electronics|volume=8|issue=11|pages=1289|doi=10.3390/electronics8111289|issn=2079-9292|doi-access=free|hdl=1822/62521|hdl-access=free}}</ref> Common optimisation techniques include [[Pruning (artificial neural network)|pruning]], [[Model compression#Quantization|quantisation]], [[knowledge distillation]], low-rank factorisation, neural architecture search, and parameter sharing.
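As a minimal sketch of one such technique (illustrative only; a symmetric 8-bit scheme is assumed), post-training quantisation maps 32-bit floating-point weights to 8-bit integers plus a scale factor, trading a small approximation error for a fourfold memory reduction:

<syntaxhighlight lang="python">
# Post-training quantisation sketch: float32 weights become int8 values plus
# one scale factor, shrinking the model for embedded deployment.
import numpy as np

def quantise(weights):
    scale = np.abs(weights).max() / 127.0      # symmetric int8 range
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantise(q, scale):
    return q.astype(np.float32) * scale        # approximate reconstruction

w = np.random.default_rng(0).normal(size=(256, 256)).astype(np.float32)
q, scale = quantise(w)
print(q.nbytes / w.nbytes)                     # 0.25: a 4x memory reduction
print(np.abs(w - dequantise(q, scale)).max())  # small quantisation error
</syntaxhighlight>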