Machine learning

{{Redirect|Statistical learning|statistical learning in linguistics|Statistical learning in language acquisition}}
{{Machine learning bar}}
{{Artificial intelligence|Approaches}}
{{Use dmy dates|date=April 2025}}
{{Use British English|date=April 2025}}
'''Machine learning''' ('''ML''') is a [[field of study]] in [[artificial intelligence]] concerned with the development and study of [[Computational statistics|statistical algorithms]] that can learn from [[data]] and [[generalise]] to unseen data, and thus perform [[Task (computing)|tasks]] without explicit [[Machine code|instructions]].{{Refn|The definition "without being explicitly programmed" is often attributed to [[Arthur Samuel (computer scientist)|Arthur Samuel]], who coined the term "machine learning" in 1959, but the phrase is not found verbatim in this publication, and may be a [[paraphrase]] that appeared later. Confer "Paraphrasing Arthur Samuel (1959), the question is: How can computers learn to solve problems without being explicitly programmed?" in {{Cite conference |chapter=Automated Design of Both the Topology and Sizing of Analog Electrical Circuits Using Genetic Programming |conference=Artificial Intelligence in Design '96 |last1=Koza |first1=John R. |last2=Bennett |first2=Forrest H. |last3=Andre |first3=David |last4=Keane |first4=Martin A. |title=Artificial Intelligence in Design '96 |date=1996 |publisher=Springer Netherlands |___location=Dordrecht, Netherlands |pages=151–170 |language=en |doi=10.1007/978-94-009-0279-4_9 |isbn=978-94-010-6610-5 }}}} Within machine learning, advances in the subdiscipline of [[deep learning]] have allowed [[Neural network (machine learning)|neural networks]], a class of statistical algorithms, to surpass many previous machine learning approaches in performance.<ref name="ibm">{{Cite web |title=What is Machine Learning? |url=https://www.ibm.com/topics/machine-learning |access-date=27 June 2023 |website=IBM |date=22 September 2021 |language=en-us |archive-date=27 December 2023 |archive-url=https://web.archive.org/web/20231227153910/https://www.ibm.com/topics/machine-learning |url-status=live }}</ref>
 
ML finds application in many fields, including [[natural language processing]], [[computer vision]], [[speech recognition]], [[email filtering]], [[agriculture]], and [[medicine]].<ref name="tvt">{{Cite journal |last1=Hu |first1=Junyan |last2=Niu |first2=Hanlin |last3=Carrasco |first3=Joaquin |last4=Lennox |first4=Barry |last5=Arvin |first5=Farshad |date=2020 |title=Voronoi-Based Multi-Robot Autonomous Exploration in Unknown Environments via Deep Reinforcement Learning |journal=IEEE Transactions on Vehicular Technology |volume=69 |issue=12 |pages=14413–14423 |doi=10.1109/tvt.2020.3034800 |s2cid=228989788 |issn=0018-9545 |doi-access=free |url=https://research.manchester.ac.uk/files/191737243/09244647.pdf }}</ref><ref name="YoosefzadehNajafabadi-2021">{{cite journal |last1=Yoosefzadeh-Najafabadi|first1=Mohsen |last2=Hugh |first2=Earl |last3=Tulpan |first3=Dan |last4=Sulik |first4=John |last5=Eskandari |first5=Milad |title=Application of Machine Learning Algorithms in Plant Breeding: Predicting Yield From Hyperspectral Reflectance in Soybean? |journal=Front. Plant Sci. |volume=11 |year=2021 |pages=624273|doi=10.3389/fpls.2020.624273 |pmid=33510761 |pmc=7835636 |doi-access=free |bibcode=2021FrPS...1124273Y }}</ref> The application of ML to business problems is known as [[predictive analytics]].
 
[[Statistics]] and [[mathematical optimisation]] (mathematical programming) methods comprise the foundations of machine learning. [[Data mining]] is a related field of study, focusing on [[exploratory data analysis]] (EDA) via [[unsupervised learning]].{{refn|Machine learning and pattern recognition "can be viewed as two facets of the same field".<ref name="bishop2006" />{{rp|vii}}}}<ref name="Friedman-1998">{{cite journal |last=Friedman |first=Jerome H. |author-link = Jerome H. Friedman|title=Data Mining and Statistics: What's the connection? |journal=Computing Science and Statistics |volume=29 |issue=1 |year=1998 |pages=3–9}}</ref>
== History ==
The term ''machine learning'' was coined in 1959 by [[Arthur Samuel (computer scientist)|Arthur Samuel]], an [[IBM]] employee and pioneer in the field of [[computer gaming]] and [[artificial intelligence]].<ref name="Samuel">{{Cite journal|last=Samuel|first=Arthur|date=1959|title=Some Studies in Machine Learning Using the Game of Checkers|journal=IBM Journal of Research and Development|volume=3|issue=3|pages=210–229|doi=10.1147/rd.33.0210|citeseerx=10.1.1.368.2254|s2cid=2126705 }}</ref><ref name="Kohavi">R. Kohavi and F. Provost, "Glossary of terms", Machine Learning, vol. 30, no. 2–3, pp. 271–274, 1998.</ref> The synonym ''self-teaching computers'' was also used in this time period.<ref name=cyberthreat>{{cite news |last1=Gerovitch |first1=Slava |title=How the Computer Got Its Revenge on the Soviet Union |url=https://nautil.us/issue/23/dominoes/how-the-computer-got-its-revenge-on-the-soviet-union |access-date=19 September 2021 |work=Nautilus |date=9 April 2015 |archive-date=22 September 2021 |archive-url=https://web.archive.org/web/20210922175839/https://nautil.us/issue/23/Dominoes/how-the-computer-got-its-revenge-on-the-soviet-union |url-status=dead }}</ref><ref>{{cite journal |last1=Lindsay |first1=Richard P. |title=The Impact of Automation On Public Administration |journal=Western Political Quarterly |date=1 September 1964 |volume=17 |issue=3 |pages=78–81 |doi=10.1177/106591296401700364 |s2cid=154021253 |url=https://journals.sagepub.com/doi/10.1177/106591296401700364 |access-date=6 October 2021 |language=en |issn=0043-4078 |archive-date=6 October 2021 |archive-url=https://web.archive.org/web/20211006190841/https://journals.sagepub.com/doi/10.1177/106591296401700364 |url-status=live |url-access=subscription }}</ref>
 
The earliest machine learning program was introduced in the 1950s when [[Arthur Samuel (computer scientist)|Arthur Samuel]] invented a [[computer program]] that calculated the winning chance in checkers for each side, but the history of machine learning is rooted in decades of human desire and effort to study human cognitive processes.<ref name="WhatIs">{{Cite web |title=History and Evolution of Machine Learning: A Timeline |url=https://www.techtarget.com/whatis/A-Timeline-of-Machine-Learning-History |access-date=8 December 2023 |website=WhatIs |language=en |archive-date=8 December 2023 |archive-url=https://web.archive.org/web/20231208220935/https://www.techtarget.com/whatis/A-Timeline-of-Machine-Learning-History |url-status=live }}</ref> In 1949, [[Canadians|Canadian]] psychologist [[Donald O. Hebb|Donald Hebb]] published the book ''[[Organization of Behavior|The Organization of Behavior]]'', in which he introduced a [[Hebbian theory|theoretical neural structure]] formed by certain interactions among [[nerve cells]].<ref>{{Cite journal |last=Milner |first=Peter M. |date=1993 |title=The Mind and Donald O. Hebb |url=https://www.jstor.org/stable/24941344 |journal=Scientific American |volume=268 |issue=1 |pages=124–129 |doi=10.1038/scientificamerican0193-124 |jstor=24941344 |pmid=8418480 |bibcode=1993SciAm.268a.124M |issn=0036-8733 |access-date=9 December 2023 |archive-date=20 December 2023 |archive-url=https://web.archive.org/web/20231220163326/https://www.jstor.org/stable/24941344 |url-status=live |url-access=subscription }}</ref> [[Hebb's model]] of [[neuron]]s interacting with one another laid the groundwork for how AIs and machine learning algorithms work using nodes, or [[artificial neuron]]s, which computers use to communicate data.<ref name="WhatIs" /> Other researchers who have studied human [[cognitive systems engineering|cognitive systems]] contributed to modern machine learning technologies as well, including logician [[Walter Pitts]] and [[Warren Sturgis McCulloch|Warren McCulloch]], who proposed early mathematical models of neural networks to devise [[algorithm]]s that mirror human thought processes.<ref name="WhatIs" />
 
By the early 1960s, an experimental "learning machine" with [[punched tape]] memory, called Cybertron, had been developed by [[Raytheon Company]] to analyse [[sonar]] signals, [[Electrocardiography|electrocardiograms]], and speech patterns using rudimentary [[reinforcement learning]]. It was repetitively "trained" by a human operator/teacher to recognise patterns and equipped with a "[[goof]]" button to cause it to reevaluate incorrect decisions.<ref>"Science: The Goof Button", [[Time (magazine)|Time]], 18 August 1961.</ref>
 
=== Artificial intelligence ===
[[File:AI hierarchy.svg|thumb|[[Deep learning]] is a subset of machine learning, which is itself a subset of [[artificial intelligence]].<ref name="journalimcms.org">{{cite journal |vauthors=Sindhu V, Nivedha S, Prakash M |date=February 2020|title=An Empirical Science Research on Bioinformatics in Machine Learning |journal=Journal of Mechanics of Continua and Mathematical Sciences |issue=7 |doi=10.26782/jmcms.spl.7/2020.02.00006 |doi-access=free}}</ref>]]
As a scientific endeavour, machine learning grew out of the quest for [[artificial intelligence]] (AI). In the early days of AI as an [[Discipline (academia)|academic discipline]], some researchers were interested in having machines learn from data. They attempted to approach the problem with various symbolic methods, as well as what were then termed "[[Artificial neural network|neural network]]s"; these were mostly [[perceptron]]s and [[ADALINE|other models]] that were later found to be reinventions of the [[generalised linear model]]s of statistics.<ref>{{cite book |last1=Sarle |first1=Warren S.|chapter=Neural Networks and statistical models |pages=1538–50 |year=1994 |title=SUGI 19: proceedings of the Nineteenth Annual SAS Users Group International Conference |publisher=SAS Institute |isbn=9781555446116 |oclc=35546178}}</ref> [[Probabilistic reasoning]] was also employed, especially in [[automated medical diagnosis]].<ref name="aima">{{cite AIMA|edition=2}}</ref>{{rp|488}}
 
 
=== Statistics ===
Machine learning and [[statistics]] are closely related fields in terms of methods, but distinct in their principal goal: statistics draws population [[Statistical inference|inferences]] from a [[Sample (statistics)|sample]], while machine learning finds generalisable predictive patterns.<ref>{{cite journal |first1=Danilo |last1=Bzdok |first2=Naomi |last2=Altman |author-link2=Naomi Altman |first3=Martin |last3=Krzywinski |title=Statistics versus Machine Learning |journal=[[Nature Methods]] |volume=15 |issue=4 |pages=233–234 |year=2018 |doi=10.1038/nmeth.4642 |pmid=30100822 |pmc=6082636 }}</ref> According to [[Michael I. Jordan]], the ideas of machine learning, from methodological principles to theoretical tools, have had a long pre-history in statistics.<ref name="mi jordan ama">{{cite web|url=https://www.reddit.com/r/MachineLearning/comments/2fxi6v/ama_michael_i_jordan/ckelmtt?context=3|title=statistics and machine learning|publisher=reddit|date=10 September 2014|access-date=1 October 2014|author=Michael I. Jordan|author-link=Michael I. Jordan|archive-date=18 October 2017|archive-url=https://web.archive.org/web/20171018192328/https://www.reddit.com/r/MachineLearning/comments/2fxi6v/ama_michael_i_jordan/ckelmtt/?context=3|url-status=live}}</ref> He also suggested the term [[data science]] as a placeholder to call the overall field.<ref name="mi jordan ama" />
 
Conventional statistical analyses require the a priori selection of a model most suitable for the study data set. In addition, only significant or theoretically relevant variables based on previous experience are included for analysis. In contrast, machine learning is not built on a pre-structured model; rather, the data shape the model by detecting underlying patterns. The more variables (input) used to train the model, the more accurate the ultimate model will be.<ref>Hung et al. Algorithms to Measure Surgeon Performance and Anticipate Clinical Outcomes in Robotic Surgery. JAMA Surg. 2018</ref>
== Approaches ==
 
{{Anchor|Algorithm types}}
[[File:Supervised_and_unsupervised_learning.png|thumb|upright=1.3|In [[supervised learning]], the training data is labelled with the expected answers, while in [[unsupervised learning]], the model identifies patterns or structures in unlabelled data.]]
Machine learning approaches are traditionally divided into three broad categories, which correspond to learning paradigms, depending on the nature of the "signal" or "feedback" available to the learning system:
* [[Supervised learning]]: The computer is presented with example inputs and their desired outputs, given by a "teacher", and the goal is to learn a general rule that [[Map (mathematics)|maps]] inputs to outputs.
* [[Unsupervised learning]]: No labels are given to the learning algorithm, leaving it on its own to find structure in its input. Unsupervised learning can be a goal in itself (discovering hidden patterns in data) or a means towards an end ([[feature learning]]).
* [[Reinforcement learning]]: A computer program interacts with a dynamic environment in which it must perform a certain goal (such as [[Autonomous car|driving a vehicle]] or playing a game against an opponent). As it navigates its problem space, the program is provided feedback that is analogous to rewards, which it tries to maximise.<ref name="bishop2006"/>
Although each algorithm has advantages and limitations, no single algorithm works for all problems.<ref>{{cite journal |last1=Jordan |first1=M. I. |last2=Mitchell |first2=T. M. |title=Machine learning: Trends, perspectives, and prospects |journal=Science |date=17 July 2015 |volume=349 |issue=6245 |pages=255–260 |doi=10.1126/science.aaa8415|pmid=26185243 |bibcode=2015Sci...349..255J |s2cid=677218 }}</ref><ref>{{cite book |last1=El Naqa |first1=Issam |last2=Murphy |first2=Martin J. |title=Machine Learning in Radiation Oncology |chapter=What is Machine Learning? |date=2015 |pages=3–11 |doi=10.1007/978-3-319-18305-3_1|isbn=978-3-319-18304-6 |s2cid=178586107 }}</ref><ref>{{cite journal |last1=Okolie |first1=Jude A. |last2=Savage |first2=Shauna |last3=Ogbaga |first3=Chukwuma C. |last4=Gunes |first4=Burcu |title=Assessing the potential of machine learning methods to study the removal of pharmaceuticals from wastewater using biochar or activated carbon |journal=Total Environment Research Themes |date=June 2022 |volume=1–2 |article-number=100001 |doi=10.1016/j.totert.2022.100001|s2cid=249022386 |doi-access=free |bibcode=2022TERT....100001O }}</ref>
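
The contrast between the first two paradigms can be shown in a short Python sketch. The toy data, the use of the scikit-learn library, and all parameter values below are illustrative assumptions, not material from the sources cited above:

<syntaxhighlight lang="python">
import numpy as np
from sklearn.neighbors import KNeighborsClassifier  # supervised learning
from sklearn.cluster import KMeans                  # unsupervised learning

rng = np.random.default_rng(0)
# Two Gaussian blobs in 2-D; labels 0 and 1 record each point's blob of origin.
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Supervised: a "teacher" supplies the desired outputs (the labels y),
# and the model learns a general rule mapping inputs to outputs.
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(clf.predict([[4.2, 3.8]]))   # predicted label for a new, unseen point

# Unsupervised: no labels are given; the algorithm finds structure on its own.
km = KMeans(n_clusters=2, n_init=10).fit(X)
print(km.labels_[:5])              # cluster assignments discovered from X alone
</syntaxhighlight>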
 
=== Supervised learning ===
=== Feature learning ===
{{Main|Feature learning}}
 
Several learning algorithms aim at discovering better representations of the inputs provided during training.<ref name="pami">{{cite journal |author1=Y. Bengio |author2=A. Courville |author3=P. Vincent |title=Representation Learning: A Review and New Perspectives |journal= IEEE Transactions on Pattern Analysis and Machine Intelligence|year=2013|doi=10.1109/tpami.2013.50 |pmid=23787338 |volume=35 |issue=8 |pages=1798–1828|arxiv=1206.5538 |bibcode=2013ITPAM..35.1798B |s2cid=393948 }}</ref> Classic examples include [[principal component analysis]] and cluster analysis. Feature learning algorithms, also called representation learning algorithms, often attempt to preserve the information in their input but also transform it in a way that makes it useful, often as a pre-processing step before performing classification or predictions. This technique allows reconstruction of the inputs coming from the unknown data-generating distribution, while not being necessarily faithful to configurations that are implausible under that distribution. This replaces manual [[feature engineering]], and allows a machine to both learn the features and use them to perform a specific task.
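
As a brief illustration of the idea, the following Python sketch uses [[principal component analysis]] to learn a low-dimensional representation of synthetic data; the data, the scikit-learn library, and the chosen dimensions are assumptions made for the example:

<syntaxhighlight lang="python">
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
# 200 samples of 10 correlated features (a latent 2-D structure plus noise).
latent = rng.normal(size=(200, 2))
X = latent @ rng.normal(size=(2, 10)) + 0.05 * rng.normal(size=(200, 10))

# Learn a 2-dimensional representation preserving most of the variance;
# the transformed data could feed a downstream classifier or regressor.
pca = PCA(n_components=2).fit(X)
Z = pca.transform(X)
print(Z.shape)                          # (200, 2)
print(pca.explained_variance_ratio_)    # share of variance kept per component
</syntaxhighlight>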
 
Feature learning can be either supervised or unsupervised. In supervised feature learning, features are learned using labelled input data. Examples include [[artificial neural network]]s, [[multilayer perceptron]]s, and supervised [[dictionary learning]]. In unsupervised feature learning, features are learned with unlabelled input data. Examples include dictionary learning, [[independent component analysis]], [[autoencoder]]s, [[matrix decomposition|matrix factorisation]]<ref>{{cite conference |author1=Nathan Srebro |author2=Jason D. M. Rennie |author3=Tommi S. Jaakkola |title=Maximum-Margin Matrix Factorization |conference=[[Conference on Neural Information Processing Systems|NIPS]] |year=2004}}</ref> and various forms of [[Cluster analysis|clustering]].<ref name="coates2011">{{cite conference
=== Regression ===
Regression analysis encompasses a large variety of statistical methods to estimate the relationship between input variables and their associated features. Its most common form is [[linear regression]], where a single line is drawn to best fit the given data according to a mathematical criterion such as [[ordinary least squares]]. The latter is often extended by [[regularization (mathematics)|regularisation]] methods to mitigate overfitting and bias, as in [[ridge regression]]. When dealing with non-linear problems, go-to models include [[polynomial regression]] (for example, used for trendline fitting in Microsoft Excel<ref>{{cite web|last1=Stevenson|first1=Christopher|title=Tutorial: Polynomial Regression in Excel|url=https://facultystaff.richmond.edu/~cstevens/301/Excel4.html|website=facultystaff.richmond.edu|access-date=22 January 2017|archive-date=2 June 2013|archive-url=https://web.archive.org/web/20130602200850/https://facultystaff.richmond.edu/~cstevens/301/Excel4.html|url-status=live}}</ref>), [[logistic regression]] (often used in [[statistical classification]]) or even [[kernel regression]], which introduces non-linearity by taking advantage of the [[kernel trick]] to implicitly map input variables to higher-dimensional space.
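
The ordinary least squares and ridge estimators described above fit in a few lines of Python; the synthetic data and the penalty strength lam = 1.0 in this sketch are illustrative assumptions:

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, 50)
y = 3.0 * x + 2.0 + rng.normal(0.0, 1.0, 50)   # noisy samples of y = 3x + 2

# Ordinary least squares: weights minimising the sum of squared errors.
A = np.column_stack([x, np.ones_like(x)])       # design matrix [x, 1]
w_ols, *_ = np.linalg.lstsq(A, y, rcond=None)

# Ridge regression: an L2 penalty shrinks the weights to mitigate overfitting
# (for simplicity this sketch penalises the intercept as well).
lam = 1.0
w_ridge = np.linalg.solve(A.T @ A + lam * np.eye(2), A.T @ y)

print(w_ols)     # approximately [3.0, 2.0]
print(w_ridge)   # slightly shrunk towards zero
</syntaxhighlight>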
 
[[General linear model|Multivariate linear regression]] extends the concept of linear regression to handle multiple dependent variables simultaneously. This approach estimates the relationships between a set of input variables and several output variables by fitting a [[Multidimensional system|multidimensional]] linear model. It is particularly useful in scenarios where outputs are interdependent or share underlying patterns, such as predicting multiple economic indicators or reconstructing images,<ref>{{cite journal |last1= Wanta |first1= Damian |last2= Smolik |first2= Aleksander |last3= Smolik |first3= Waldemar T. |last4= Midura |first4= Mateusz |last5= Wróblewski |first5= Przemysław |date= 2025 |title= Image reconstruction using machine-learned pseudoinverse in electrical capacitance tomography |journal= Engineering Applications of Artificial Intelligence |volume= 142 |article-number= 109888 |doi= 10.1016/j.engappai.2024.109888 |doi-access= free}}</ref> which are inherently multi-dimensional.
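
A minimal sketch of multi-output linear regression, again assuming synthetic data and the scikit-learn library purely for illustration:

<syntaxhighlight lang="python">
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3))                   # 3 input variables
B = np.array([[1.0, -2.0], [0.5, 0.0], [2.0, 1.0]])
Y = X @ B + 0.1 * rng.normal(size=(100, 2))     # 2 interdependent outputs

model = LinearRegression().fit(X, Y)            # one fit, multiple outputs
print(model.coef_.shape)                        # (2, 3): one row per output
print(model.predict(X[:1]))                     # joint prediction of both outputs
</syntaxhighlight>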
 
=== Bayesian networks ===
=== Genetic algorithms ===
{{Main|Genetic algorithm}}
A genetic algorithm (GA) is a [[search algorithm]] and [[heuristic (computer science)|heuristic]] technique that mimics the process of [[natural selection]], using methods such as [[Mutation (genetic algorithm)|mutation]] and [[Crossover (genetic algorithm)|crossover]] to generate new [[Chromosome (genetic algorithm)|genotype]]s in the hope of finding good solutions to a given problem. In machine learning, genetic algorithms were used in the 1980s and 1990s.<ref>{{cite journal |last1=Goldberg |first1=David E. |first2=John H. |last2=Holland |title=Genetic algorithms and machine learning |journal=[[Machine Learning (journal)|Machine Learning]] |volume=3 |issue=2 |year=1988 |pages=95–99 |doi=10.1007/bf00113892 |s2cid=35506513 |url=https://deepblue.lib.umich.edu/bitstream/2027.42/46947/1/10994_2005_Article_422926.pdf |doi-access=free |access-date=3 September 2019 |archive-date=16 May 2011 |archive-url=https://web.archive.org/web/20110516025803/http://deepblue.lib.umich.edu/bitstream/2027.42/46947/1/10994_2005_Article_422926.pdf |url-status=live }}</ref><ref>{{Cite journal |title=Machine Learning, Neural and Statistical Classification |journal=Ellis Horwood Series in Artificial Intelligence |first1=D. |last1=Michie |first2=D. J. |last2=Spiegelhalter |first3=C. C. |last3=Taylor |year=1994 |bibcode=1994mlns.book.....M }}</ref> Conversely, machine learning techniques have been used to improve the performance of genetic and [[evolutionary algorithm]]s.<ref>{{cite journal |last1=Zhang |first1=Jun |last2=Zhan |first2=Zhi-hui |last3=Lin |first3=Ying |last4=Chen |first4=Ni |last5=Gong |first5=Yue-jiao |last6=Zhong |first6=Jing-hui |last7=Chung |first7=Henry S.H. |last8=Li |first8=Yun |last9=Shi |first9=Yu-hui |title=Evolutionary Computation Meets Machine Learning: A Survey |journal= IEEE Computational Intelligence Magazine|year=2011 |volume=6 |issue=4 |pages=68–75 |doi=10.1109/mci.2011.942584|bibcode=2011ICIM....6d..68Z |s2cid=6760276 }}</ref>
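
A genetic algorithm can be sketched in a few dozen lines; the "OneMax" fitness function (maximising the number of 1s in a bit string), the population size, and the mutation rate below are illustrative assumptions rather than a canonical formulation:

<syntaxhighlight lang="python">
import random
random.seed(0)

GENOME_LEN = 20                       # genotype: a bit string of this length

def fitness(bits):
    return sum(bits)                  # "OneMax": count the 1s (maximum = 20)

def crossover(a, b):
    point = random.randrange(1, GENOME_LEN)     # single-point crossover
    return a[:point] + b[point:]

def mutate(bits, rate=0.05):
    return [1 - g if random.random() < rate else g for g in bits]

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(30)]

for generation in range(100):
    population.sort(key=fitness, reverse=True)   # selection pressure
    if fitness(population[0]) == GENOME_LEN:
        break                                    # optimal genotype found
    parents = population[:10]                    # keep the fittest
    population = parents + [
        mutate(crossover(random.choice(parents), random.choice(parents)))
        for _ in range(20)
    ]

best = max(population, key=fitness)
print(f"generation {generation}: best fitness {fitness(best)}")
</syntaxhighlight>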
 
=== Belief functions ===
{{Main|Dempster–Shafer theory}}
The theory of belief functions, also referred to as evidence theory or Dempster–Shafer theory, is a general framework for reasoning with uncertainty, with understood connections to other frameworks such as [[probability]], [[Possibility theory|possibility]] and [[Imprecise probability|imprecise probability theories]]. These theoretical frameworks can be thought of as a kind of learner and have some analogous properties of how evidence is combined (e.g., Dempster's rule of combination), just as a [[Probability mass function|pmf]]-based Bayesian approach would combine probabilities.<ref>{{Cite journal |last1=Verbert |first1=K. |last2=Babuška |first2=R. |last3=De Schutter |first3=B. |date=2017-04-01 |title=Bayesian and Dempster–Shafer reasoning for knowledge-based fault diagnosis–A comparative study |url=https://www.sciencedirect.com/science/article/abs/pii/S0952197617300118 |journal=Engineering Applications of Artificial Intelligence |volume=60 |pages=136–150 |doi=10.1016/j.engappai.2017.01.011 |issn=0952-1976}}</ref> However, belief functions carry many caveats compared to Bayesian approaches when it comes to incorporating ignorance and [[uncertainty quantification]]. Belief function approaches implemented within the machine learning ___domain typically leverage a fusion of various [[ensemble methods]] to better handle the learner's [[decision boundary]], low sample sizes, and ambiguous class issues that standard machine learning approaches tend to have difficulty resolving.<ref name="YoosefzadehNajafabadi-2021" /><ref name="Kohavi" /> However, the computational complexity of these algorithms is dependent on the number of propositions (classes), and can lead to much higher computation times when compared to other machine learning approaches.
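
Dempster's rule of combination mentioned above can be written directly. In this sketch the two-hypothesis frame, the mass values, and the helper name dempster_combine are illustrative assumptions:

<syntaxhighlight lang="python">
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions (dicts mapping frozensets to masses)."""
    combined, conflict = {}, 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + x * y
        else:
            conflict += x * y          # mass falling on the empty set
    # Normalise by 1 - K, where K is the total conflicting mass.
    return {s: v / (1.0 - conflict) for s, v in combined.items()}

# Frame of discernment {rain, sun}; two sources of partial evidence.
RAIN, SUN = frozenset({"rain"}), frozenset({"sun"})
EITHER = RAIN | SUN                    # mass on the whole frame = ignorance
m1 = {RAIN: 0.6, EITHER: 0.4}
m2 = {RAIN: 0.3, SUN: 0.5, EITHER: 0.2}

print(dempster_combine(m1, m2))        # combined beliefs, conflict discounted
</syntaxhighlight>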
 
=== Rule-based models ===
* [[DNA sequence]] classification
* [[Computational economics|Economics]]
* [[Data analysis|Financial data analysis]]<ref>Machine learning is included in the [[Chartered Financial Analyst (CFA)#Curriculum|CFA Curriculum]] (discussion is top-down); see: [https://www.cfainstitute.org/-/media/documents/study-session/2020-l2-ss3.ashx Kathleen DeRose and Christophe Le Lanno (2020). "Machine Learning"] {{Webarchive|url=https://web.archive.org/web/20200113085425/https://www.cfainstitute.org/-/media/documents/study-session/2020-l2-ss3.ashx |date=13 January 2020 }}.</ref>
* [[General game playing]]
* [[Handwriting recognition]]
{{colend}}
 
In 2006, the media-services provider [[Netflix]] held the first "[[Netflix Prize]]" competition to find a program to better predict user preferences and improve the accuracy of its existing Cinematch movie recommendation algorithm by at least 10%. A joint team made up of researchers from [[AT&T Labs]]-Research in collaboration with the teams Big Chaos and Pragmatic Theory built an [[Ensemble Averaging|ensemble model]] to win the Grand Prize in 2009 for $1 million.<ref>[https://web.archive.org/web/20151110062742/http://www2.research.att.com/~volinsky/netflix/ "BelKor Home Page"] research.att.com</ref> Shortly after the prize was awarded, Netflix realised that viewers' ratings were not the best indicators of their viewing patterns ("everything is a recommendation") and they changed their recommendation engine accordingly.<ref>{{cite web|url=http://techblog.netflix.com/2012/04/netflix-recommendations-beyond-5-stars.html|title=The Netflix Tech Blog: Netflix Recommendations: Beyond the 5 stars (Part 1)|access-date=8 August 2015|date=6 April 2012|archive-url=https://web.archive.org/web/20160531002916/http://techblog.netflix.com/2012/04/netflix-recommendations-beyond-5-stars.html|archive-date=31 May 2016}}</ref> In 2010, an article in ''[[The Wall Street Journal]]'' noted the use of machine learning by Rebellion Research to predict the [[2008 financial crisis]].<ref>{{cite web|url=https://www.wsj.com/articles/SB10001424052748703834604575365310813948080|title=Letting the Machines Decide|author=Scott Patterson|date=13 July 2010|publisher=[[The Wall Street Journal]]|access-date=24 June 2018|archive-date=24 June 2018|archive-url=https://web.archive.org/web/20180624151019/https://www.wsj.com/articles/SB10001424052748703834604575365310813948080|url-status=live}}</ref> In 2012, co-founder of [[Sun Microsystems]], [[Vinod Khosla]], predicted that 80% of medical doctors' jobs would be lost in the next two decades to automated machine learning medical diagnostic software.<ref>{{cite web|url=https://techcrunch.com/2012/01/10/doctors-or-algorithms/|author=Vinod Khosla|publisher=Tech Crunch|title=Do We Need Doctors or Algorithms?|date=10 January 2012|access-date=20 October 2016|archive-date=18 June 2018|archive-url=https://web.archive.org/web/20180618175811/https://techcrunch.com/2012/01/10/doctors-or-algorithms/|url-status=live}}</ref> In 2014, it was reported that a machine learning algorithm had been applied in the field of art history to study fine art paintings and that it may have revealed previously unrecognised influences among artists.<ref>[https://medium.com/the-physics-arxiv-blog/when-a-machine-learning-algorithm-studied-fine-art-paintings-it-saw-things-art-historians-had-never-b8e4e7bf7d3e When A Machine Learning Algorithm Studied Fine Art Paintings, It Saw Things Art Historians Had Never Noticed] {{Webarchive|url=https://web.archive.org/web/20160604072143/https://medium.com/the-physics-arxiv-blog/when-a-machine-learning-algorithm-studied-fine-art-paintings-it-saw-things-art-historians-had-never-b8e4e7bf7d3e |date=4 June 2016 }}, ''The Physics at [[ArXiv]] blog''</ref> In 2019, [[Springer Nature]] published the first research book created using machine learning.<ref>{{Cite web|url=https://www.theverge.com/2019/4/10/18304558/ai-writing-academic-research-book-springer-nature-artificial-intelligence|title=The first AI-generated textbook shows what robot writers are actually good at|last=Vincent|first=James|date=10 April 2019|website=The Verge|access-date=5 May 2019|archive-date=5 May 2019|archive-url=https://web.archive.org/web/20190505200409/https://www.theverge.com/2019/4/10/18304558/ai-writing-academic-research-book-springer-nature-artificial-intelligence|url-status=live}}</ref> In 2020, machine learning technology was used to help make diagnoses and aid researchers in developing a cure for COVID-19.<ref>{{Cite journal|title=Artificial Intelligence (AI) applications for COVID-19 pandemic|date=1 July 2020|journal=Diabetes & Metabolic Syndrome: Clinical Research & Reviews|volume=14|issue=4|pages=337–339|doi=10.1016/j.dsx.2020.04.012|doi-access=free|last1=Vaishya|first1=Raju|last2=Javaid|first2=Mohd|last3=Khan|first3=Ibrahim Haleem|last4=Haleem|first4=Abid|pmid=32305024|pmc=7195043}}</ref> Machine learning has recently been applied to predict the pro-environmental behaviour of travellers.<ref>{{Cite journal|title=Application of machine learning to predict visitors' green behavior in marine protected areas: evidence from Cyprus|first1=Hamed|last1=Rezapouraghdam|first2=Arash|last2=Akhshik|first3=Haywantee|last3=Ramkissoon|date=10 March 2021|journal=Journal of Sustainable Tourism|volume=31 |issue=11 |pages=2479–2505|doi=10.1080/09669582.2021.1887878|doi-access=free|hdl=10037/24073|hdl-access=free}}</ref> Machine learning technology has also been applied to optimise a smartphone's performance and thermal behaviour based on the user's interaction with the phone.<ref>{{Cite book|last1=Dey|first1=Somdip|last2=Singh|first2=Amit Kumar|last3=Wang|first3=Xiaohang|last4=McDonald-Maier|first4=Klaus|title=2020 Design, Automation & Test in Europe Conference & Exhibition (DATE) |chapter=User Interaction Aware Reinforcement Learning for Power and Thermal Efficiency of CPU-GPU Mobile MPSoCs |date=15 June 2020|chapter-url=https://ieeexplore.ieee.org/document/9116294|pages=1728–1733|doi=10.23919/DATE48585.2020.9116294|isbn=978-3-9819263-4-7|s2cid=219858480|url=http://repository.essex.ac.uk/27546/1/User%20Interaction%20Aware%20Reinforcement%20Learning.pdf |access-date=20 January 2022|archive-date=13 December 2021|archive-url=https://web.archive.org/web/20211213192526/https://ieeexplore.ieee.org/document/9116294/|url-status=live}}</ref><ref>{{Cite news|last=Quested|first=Tony|title=Smartphones get smarter with Essex innovation|work=Business Weekly|url=https://www.businessweekly.co.uk/news/academia-research/smartphones-get-smarter-essex-innovation|access-date=17 June 2021|archive-date=24 June 2021|archive-url=https://web.archive.org/web/20210624200126/https://www.businessweekly.co.uk/news/academia-research/smartphones-get-smarter-essex-innovation|url-status=live}}</ref><ref>{{Cite news|last=Williams|first=Rhiannon|date=21 July 2020|title=Future smartphones 'will prolong their own battery life by monitoring owners' behaviour'|url=https://inews.co.uk/news/technology/future-smartphones-prolong-battery-life-monitoring-behaviour-558689|access-date=17 June 2021|newspaper=[[i (British newspaper)|i]]|language=en|archive-date=24 June 2021|archive-url=https://web.archive.org/web/20210624201153/https://inews.co.uk/news/technology/future-smartphones-prolong-battery-life-monitoring-behaviour-558689|url-status=live}}</ref> When applied correctly, machine learning algorithms (MLAs) can utilise a wide range of company characteristics to predict stock returns without [[overfitting]]. By employing effective feature engineering and combining forecasts, MLAs can generate results that far surpass those obtained from basic linear techniques like [[Ordinary least squares|OLS]].<ref>{{Cite journal |last1=Rasekhschaffe |first1=Keywan Christian |last2=Jones |first2=Robert C. |date=1 July 2019 |title=Machine Learning for Stock Selection |url=https://www.tandfonline.com/doi/full/10.1080/0015198X.2019.1596678 |journal=Financial Analysts Journal |language=en |volume=75 |issue=3 |pages=70–88 |doi=10.1080/0015198X.2019.1596678 |s2cid=108312507 |issn=0015-198X |access-date=26 November 2023 |archive-date=26 November 2023 |archive-url=https://web.archive.org/web/20231126160605/https://www.tandfonline.com/doi/full/10.1080/0015198X.2019.1596678 |url-status=live |url-access=subscription }}</ref>
 
Recent advancements in machine learning have extended into the field of quantum chemistry, where novel algorithms now enable the prediction of solvent effects on chemical reactions, thereby offering new tools for chemists to tailor experimental conditions for optimal outcomes.<ref>{{Cite journal |last1=Chung |first1=Yunsie |last2=Green |first2=William H. |date=2024 |title=Machine learning from quantum chemistry to predict experimental solvent effects on reaction rates |journal=Chemical Science |language=en |volume=15 |issue=7 |pages=2410–2424 |doi=10.1039/D3SC05353A |issn=2041-6520 |pmc=10866337 |pmid=38362410 }}</ref>
 
Machine learning is becoming a useful tool to investigate and predict evacuation decision-making in large-scale and small-scale disasters. Different solutions have been tested to predict if and when householders decide to evacuate during wildfires and hurricanes.<ref>{{Cite journal |last1=Sun |first1=Yuran |last2=Huang |first2=Shih-Kai |last3=Zhao |first3=Xilei |date=1 February 2024 |title=Predicting Hurricane Evacuation Decisions with Interpretable Machine Learning Methods |journal=International Journal of Disaster Risk Science |language=en |volume=15 |issue=1 |pages=134–148 |doi=10.1007/s13753-024-00541-1 |issn=2192-6395 |doi-access=free |arxiv=2303.06557 |bibcode=2024IJDRS..15..134S }}</ref><ref>{{Citation |last1=Sun |first1=Yuran |title=8 - AI for large-scale evacuation modeling: promises and challenges |date=1 January 2024 |work=Interpretable Machine Learning for the Analysis, Design, Assessment, and Informed Decision Making for Civil Infrastructure |pages=185–204 |editor-last=Naser |editor-first=M. Z. |url=https://www.sciencedirect.com/science/article/pii/B9780128240731000149 |access-date=19 May 2024 |series=Woodhead Publishing Series in Civil and Structural Engineering |publisher=Woodhead Publishing |isbn=978-0-12-824073-1 |last2=Zhao |first2=Xilei |last3=Lovreglio |first3=Ruggiero |last4=Kuligowski |first4=Erica |archive-date=19 May 2024 |archive-url=https://web.archive.org/web/20240519121547/https://www.sciencedirect.com/science/article/abs/pii/B9780128240731000149 |url-status=live }}</ref><ref>{{Cite journal |last1=Xu |first1=Ningzhe |last2=Lovreglio |first2=Ruggiero |last3=Kuligowski |first3=Erica D. |last4=Cova |first4=Thomas J. |last5=Nilsson |first5=Daniel |last6=Zhao |first6=Xilei |date=1 March 2023 |title=Predicting and Assessing Wildfire Evacuation Decision-Making Using Machine Learning: Findings from the 2019 Kincade Fire |url=https://doi.org/10.1007/s10694-023-01363-1 |journal=Fire Technology |language=en |volume=59 |issue=2 |pages=793–825 |doi=10.1007/s10694-023-01363-1 |issn=1572-8099 |access-date=19 May 2024 |archive-date=19 May 2024 |archive-url=https://web.archive.org/web/20240519121534/https://link.springer.com/article/10.1007/s10694-023-01363-1 |url-status=live |url-access=subscription }}</ref> Other applications have focused on pre-evacuation decisions in building fires.<ref>{{Cite journal |last1=Wang |first1=Ke |last2=Shi |first2=Xiupeng |last3=Goh |first3=Algena Pei Xuan |last4=Qian |first4=Shunzhi |date=1 June 2019 |title=A machine learning based study on pedestrian movement dynamics under emergency evacuation |url=https://www.sciencedirect.com/science/article/pii/S037971121830376X |journal=Fire Safety Journal |volume=106 |pages=163–176 |doi=10.1016/j.firesaf.2019.04.008 |bibcode=2019FirSJ.106..163W |issn=0379-7112 |access-date=19 May 2024 |archive-date=19 May 2024 |archive-url=https://web.archive.org/web/20240519121539/https://www.sciencedirect.com/science/article/abs/pii/S037971121830376X |url-status=live |hdl=10356/143390 |hdl-access=free }}</ref><ref>{{Cite journal |last1=Zhao |first1=Xilei |last2=Lovreglio |first2=Ruggiero |last3=Nilsson |first3=Daniel |date=1 May 2020 |title=Modelling and interpreting pre-evacuation decision-making using machine learning |url=https://www.sciencedirect.com/science/article/pii/S0926580519313184 |journal=Automation in Construction |volume=113 |article-number=103140 |doi=10.1016/j.autcon.2020.103140 |hdl=10179/17315 |issn=0926-5805 |access-date=19 May 2024 |archive-date=19 May 2024 |archive-url=https://web.archive.org/web/20240519121548/https://www.sciencedirect.com/science/article/abs/pii/S0926580519313184 |url-status=live |hdl-access=free }}</ref>
 
Machine learning is also emerging as a promising tool in geotechnical engineering, where it is used to support tasks such as ground classification, hazard prediction, and site characterisation. Recent research emphasises a move toward data-centric methods in this field, where machine learning is not a replacement for engineering judgement, but a way to enhance it using site-specific data and patterns.<ref>{{Cite journal |last1=Phoon |first1=Kok-Kwang |last2=Zhang |first2=Wengang |date=2023-01-02 |title=Future of machine learning in geotechnics |url=https://www.tandfonline.com/doi/full/10.1080/17499518.2022.2087884 |journal=Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards |language=en |volume=17 |issue=1 |pages=7–22 |doi=10.1080/17499518.2022.2087884 |bibcode=2023GAMRE..17....7P |issn=1749-9518}}</ref>
 
== Limitations ==
=== Explainable AI ===
{{Main|Explainable artificial intelligence}}
 
Explainable AI (XAI), also known as Interpretable AI or Explainable Machine Learning (XML), is artificial intelligence (AI) in which humans can understand the decisions or predictions made by the AI.<ref>{{cite journal |last1=Rudin |first1=Cynthia |title=Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead |journal=Nature Machine Intelligence |date=2019 |volume=1 |issue=5 |pages=206–215 |doi=10.1038/s42256-019-0048-x |pmid=35603010 |pmc=9122117 }}</ref> It contrasts with the "black box" concept in machine learning where even its designers cannot explain why an AI arrived at a specific decision.<ref>{{cite journal |last1=Hu |first1=Tongxi |last2=Zhang |first2=Xuesong |last3=Bohrer |first3=Gil |last4=Liu |first4=Yanlan |last5=Zhou |first5=Yuyu |last6=Martin |first6=Jay |last7=LI |first7=Yang |last8=Zhao |first8=Kaiguang |title=Crop yield prediction via explainable AI and interpretable machine learning: Dangers of black box models for evaluating climate change impacts on crop yield|journal=Agricultural and Forest Meteorology |date=2023 |volume=336 |article-number=109458 |doi=10.1016/j.agrformet.2023.109458 |bibcode=2023AgFM..33609458H |s2cid=258552400 |doi-access=free }}</ref> By refining the mental models of users of AI-powered systems and dismantling their misconceptions, XAI promises to help users perform more effectively. XAI may be an implementation of the social right to explanation.
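
One common model-agnostic interpretability technique is permutation importance, sketched below in Python; the synthetic data and the scikit-learn library calls are illustrative assumptions, not a method prescribed by the sources above:

<syntaxhighlight lang="python">
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.1, 300)  # feature 2 is irrelevant

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Shuffle each feature in turn and measure how much performance drops:
# large drops flag features the model's predictions actually rely on.
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
print(result.importances_mean)   # feature 0 ranks highest, feature 2 near zero
</syntaxhighlight>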
 
=== Overfitting ===
| pages = 14192–14205
| doi = 10.1109/JIOT.2023.3340858
| bibcode = 2024IITJ...1114192A
| url = https://research-portal.uws.ac.uk/en/publications/c8edfe21-77d0-4c3e-a8bc-d384faf605a0
}}</ref> Running models directly on these devices eliminates the need to transfer and store data on cloud servers for further processing, thereby reducing the risk of data breaches, privacy leaks and theft of intellectual property, personal data and business secrets. Embedded machine learning can be achieved through various techniques, such as [[hardware acceleration]],<ref>{{Cite book|last1=Giri|first1=Davide|last2=Chiu|first2=Kuan-Lin|last3=Di Guglielmo|first3=Giuseppe|last4=Mantovani|first4=Paolo|last5=Carloni|first5=Luca P.|title=2020 Design, Automation & Test in Europe Conference & Exhibition (DATE) |chapter=ESP4ML: Platform-Based Design of Systems-on-Chip for Embedded Machine Learning |date=15 June 2020|chapter-url=https://ieeexplore.ieee.org/document/9116317|pages=1049–1054|doi=10.23919/DATE48585.2020.9116317|arxiv=2004.03640|isbn=978-3-9819263-4-7|s2cid=210928161|access-date=17 January 2022|archive-date=18 January 2022|archive-url=https://web.archive.org/web/20220118182342/https://ieeexplore.ieee.org/abstract/document/9116317?casa_token=5I_Tmgrrbu4AAAAA:v7pDHPEWlRuo2Vk3pU06194PO0-W21UOdyZqADrZxrRdPBZDMLwQrjJSAHUhHtzJmLu_VdgW|url-status=live}}</ref><ref>{{Cite web|last1=Louis|first1=Marcia Sahaya|last2=Azad|first2=Zahra|last3=Delshadtehrani|first3=Leila|last4=Gupta|first4=Suyog|last5=Warden|first5=Pete|last6=Reddi|first6=Vijay Janapa|last7=Joshi|first7=Ajay|date=2019|title=Towards Deep Learning using TensorFlow Lite on RISC-V|url=https://edge.seas.harvard.edu/publications/towards-deep-learning-using-tensorflow-lite-risc-v|access-date=17 January 2022|website=[[Harvard University]]|archive-date=17 January 2022|archive-url=https://web.archive.org/web/20220117031909/https://edge.seas.harvard.edu/publications/towards-deep-learning-using-tensorflow-lite-risc-v|url-status=live}}</ref> [[approximate computing]],<ref>{{Cite book|last1=Ibrahim|first1=Ali|last2=Osta|first2=Mario|last3=Alameh|first3=Mohamad|last4=Saleh|first4=Moustafa|last5=Chible|first5=Hussein|last6=Valle|first6=Maurizio|title=2018 25th IEEE International Conference on Electronics, Circuits and Systems (ICECS) |chapter=Approximate Computing Methods for Embedded Machine Learning |date=21 January 2019|chapter-url=https://ieeexplore.ieee.org/document/8617877|pages=845–848|doi=10.1109/ICECS.2018.8617877|isbn=978-1-5386-9562-3|s2cid=58670712|access-date=17 January 2022|archive-date=17 January 2022|archive-url=https://web.archive.org/web/20220117031855/https://ieeexplore.ieee.org/abstract/document/8617877?casa_token=arUW5Oy-tzwAAAAA:I9x6edlfskM6kGNFUN9zAFrjEBv_8kYTz7ERTxtXu9jAqdrYCcDbbwjBdgwXvb6QAH_-0VJJ|url-status=live}}</ref> and model optimisation.<ref>{{Cite web|title=dblp: TensorFlow Eager: A Multi-Stage, Python-Embedded DSL for Machine Learning.|url=https://dblp.org/rec/journals/corr/abs-1903-01855.html|access-date=17 January 2022|website=dblp.org|language=en|archive-date=18 January 2022|archive-url=https://web.archive.org/web/20220118182335/https://dblp.org/rec/journals/corr/abs-1903-01855.html|url-status=live}}</ref><ref>{{Cite journal|last1=Branco|first1=Sérgio|last2=Ferreira|first2=André G.|last3=Cabral|first3=Jorge|date=5 November 2019|title=Machine Learning in Resource-Scarce Embedded Systems, FPGAs, and End-Devices: A Survey|journal=Electronics|volume=8|issue=11|pages=1289|doi=10.3390/electronics8111289|issn=2079-9292|doi-access=free|hdl=1822/62521|hdl-access=free}}</ref> Common optimisation techniques include [[Pruning (artificial neural network)|pruning]], [[Model compression#Quantization (Embedded Machine Learning)|quantisation]], [[knowledge distillation]], low-rank factorisation, network architecture search, and parameter sharing.
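
A simplified sketch of post-training quantisation, one of the optimisation techniques listed above; the scale-only int8 scheme, the helper names, and the tensor shapes are illustrative assumptions (production systems use more elaborate schemes):

<syntaxhighlight lang="python">
import numpy as np

def quantise_int8(w):
    """Scale-only post-training quantisation of a float32 weight tensor."""
    scale = np.abs(w).max() / 127.0               # map the widest weight to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantise(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(5)
w = rng.normal(0, 0.5, size=(4, 4)).astype(np.float32)
q, scale = quantise_int8(w)

print(q.nbytes, "bytes vs", w.nbytes)             # 4x smaller storage
print(np.max(np.abs(w - dequantise(q, scale))))   # small rounding error
</syntaxhighlight>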
 
== Software ==
 
=== Free and open-source software{{anchor|Open-source_software}} ===
{{See also|Lists of open-source artificial intelligence software}}
{{Div col|colwidth=18em}}
* [[Caffe (software)|Caffe]]
* [[Google JAX]]
* [[Infer.NET]]
* [[Jubatus]]
* [[Keras]]
* [[Kubeflow]]
* [[Apache Spark#MLlib Machine Learning Library|Spark MLlib]]
* [[Apache SystemML|SystemML]]
* [[Theano (software)|Theano]]
* [[TensorFlow]]
* [[Torch (machine learning)|Torch]] / [[PyTorch]]