=== Feature learning ===
{{Main|Feature learning}}
 
Several learning algorithms aim at discovering better representations of the inputs provided during training.<ref name="pami">{{cite journal |author1=Y. Bengio |author2=A. Courville |author3=P. Vincent |title=Representation Learning: A Review and New Perspectives |journal= IEEE Transactions on Pattern Analysis and Machine Intelligence|year=2013|doi=10.1109/tpami.2013.50 |pmid=23787338 |volume=35 |issue=8 |pages=1798–1828|arxiv=1206.5538 |bibcode=2013ITPAM..35.1798B |s2cid=393948 }}</ref> Classic examples include [[principal component analysis]] and [[cluster analysis]]. Feature learning algorithms, also called representation learning algorithms, often attempt to preserve the information in their input while transforming it into a more useful form, often as a pre-processing step before classification or prediction. This allows reconstruction of inputs drawn from the unknown data-generating distribution, without necessarily being faithful to configurations that are implausible under that distribution. Feature learning replaces manual [[feature engineering]] and allows a machine both to learn the features and to use them to perform a specific task.
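
As a minimal sketch of this idea (assuming Python with scikit-learn; the data and dimensions are arbitrary illustrations, not drawn from the sources above), principal component analysis can be read as learning a compact representation from which the inputs are approximately reconstructed:

<syntaxhighlight lang="python">
# Minimal sketch: PCA learns a low-dimensional representation that
# approximately reconstructs the original inputs.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))     # 200 samples, 10 raw features

pca = PCA(n_components=3)          # learn a 3-dimensional representation
Z = pca.fit_transform(X)           # learned features, shape (200, 3)
X_hat = pca.inverse_transform(Z)   # reconstruction from the representation

print(np.mean((X - X_hat) ** 2))   # mean squared reconstruction error
</syntaxhighlight>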
 
Feature learning can be either supervised or unsupervised. In supervised feature learning, features are learned using labelled input data. Examples include [[artificial neural network]]s, [[multilayer perceptron]]s, and supervised [[dictionary learning]]. In unsupervised feature learning, features are learned with unlabelled input data. Examples include dictionary learning, [[independent component analysis]], [[autoencoder]]s, [[matrix decomposition|matrix factorisation]]<ref>{{cite conference |author1=Nathan Srebro |author2=Jason D. M. Rennie |author3=Tommi S. Jaakkola |title=Maximum-Margin Matrix Factorization |conference=[[Conference on Neural Information Processing Systems|NIPS]] |year=2004}}</ref> and various forms of [[Cluster analysis|clustering]].<ref name="coates2011">{{cite conference |author1=Adam Coates |author2=Honglak Lee |author3=Andrew Y. Ng |title=An analysis of single-layer networks in unsupervised feature learning |conference=International Conference on Artificial Intelligence and Statistics (AISTATS) |year=2011}}</ref>
=== Genetic algorithms ===
{{Main|Genetic algorithm}}
A genetic algorithm (GA) is a [[search algorithm]] and [[heuristic (computer science)|heuristic]] technique that mimics the process of [[natural selection]], using methods such as [[Mutation (genetic algorithm)|mutation]] and [[Crossover (genetic algorithm)|crossover]] to generate new [[Chromosome (genetic algorithm)|genotype]]s in the hope of finding good solutions to a given problem. In machine learning, genetic algorithms were used in the 1980s and 1990s.<ref>{{cite journal |last1=Goldberg |first1=David E. |first2=John H. |last2=Holland |title=Genetic algorithms and machine learning |journal=[[Machine Learning (journal)|Machine Learning]] |volume=3 |issue=2 |year=1988 |pages=95–99 |doi=10.1007/bf00113892 |s2cid=35506513 |url=https://deepblue.lib.umich.edu/bitstream/2027.42/46947/1/10994_2005_Article_422926.pdf |doi-access=free |access-date=3 September 2019 |archive-date=16 May 2011 |archive-url=https://web.archive.org/web/20110516025803/http://deepblue.lib.umich.edu/bitstream/2027.42/46947/1/10994_2005_Article_422926.pdf |url-status=live }}</ref><ref>{{Cite journal |title=Machine Learning, Neural and Statistical Classification |journal=Ellis Horwood Series in Artificial Intelligence |first1=D. |last1=Michie |first2=D. J. |last2=Spiegelhalter |first3=C. C. |last3=Taylor |year=1994 |bibcode=1994mlns.book.....M }}</ref> Conversely, machine learning techniques have been used to improve the performance of genetic and [[evolutionary algorithm]]s.<ref>{{cite journal |last1=Zhang |first1=Jun |last2=Zhan |first2=Zhi-hui |last3=Lin |first3=Ying |last4=Chen |first4=Ni |last5=Gong |first5=Yue-jiao |last6=Zhong |first6=Jing-hui |last7=Chung |first7=Henry S.H. |last8=Li |first8=Yun |last9=Shi |first9=Yu-hui |title=Evolutionary Computation Meets Machine Learning: A Survey |journal= IEEE Computational Intelligence Magazine|year=2011 |volume=6 |issue=4 |pages=68–75 |doi=10.1109/mci.2011.942584|bibcode=2011ICIM....6d..68Z |s2cid=6760276 }}</ref>
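
A minimal sketch of the selection, crossover and mutation loop described above (assuming Python; the "OneMax" fitness function and all parameter values are illustrative choices, not taken from the cited surveys):

<syntaxhighlight lang="python">
# Minimal genetic-algorithm sketch: evolve bit-strings toward the
# all-ones string ("OneMax") via selection, crossover and mutation.
import random

GENOME_LEN, POP_SIZE, GENERATIONS = 20, 30, 50

def fitness(genome):
    return sum(genome)  # number of 1-bits

def crossover(a, b):
    point = random.randrange(1, GENOME_LEN)  # single-point crossover
    return a[:point] + b[point:]

def mutate(genome, rate=0.05):
    return [bit ^ (random.random() < rate) for bit in genome]  # bit-flip

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    # keep the fitter half as parents, refill with mutated offspring
    population.sort(key=fitness, reverse=True)
    parents = population[:POP_SIZE // 2]
    offspring = [mutate(crossover(random.choice(parents), random.choice(parents)))
                 for _ in range(POP_SIZE - len(parents))]
    population = parents + offspring

print(max(fitness(g) for g in population))  # approaches GENOME_LEN
</syntaxhighlight>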
 
=== Belief functions ===
{{Main|Dempster–Shafer theory}}
The theory of belief functions, also referred to as evidence theory or Dempster–Shafer theory, is a general framework for reasoning with uncertainty, with understood connections to other frameworks such as [[probability]], [[Possibility theory|possibility]] and [[Imprecise probability|imprecise probability theories]]. These theoretical frameworks can be thought of as a kind of learner, and they have analogous rules for how evidence is combined (e.g., Dempster's rule of combination), just as a [[Probability mass function|pmf]]-based Bayesian approach combines probabilities.<ref>{{Cite journal |last1=Verbert |first1=K. |last2=Babuška |first2=R. |last3=De Schutter |first3=B. |date=2017-04-01 |title=Bayesian and Dempster–Shafer reasoning for knowledge-based fault diagnosis–A comparative study |url=https://www.sciencedirect.com/science/article/abs/pii/S0952197617300118 |journal=Engineering Applications of Artificial Intelligence |volume=60 |pages=136–150 |doi=10.1016/j.engappai.2017.01.011 |issn=0952-1976}}</ref> However, belief functions carry many caveats compared with Bayesian approaches in how they incorporate ignorance and [[uncertainty quantification]]. Belief function approaches implemented within the machine learning ___domain typically fuse various [[ensemble methods]] to better handle the learner's [[decision boundary]], low sample counts, and ambiguous classes, issues that standard machine learning approaches tend to have difficulty resolving.<ref name="YoosefzadehNajafabadi-2021">{{cite journal |last1=Yoosefzadeh-Najafabadi |first1=Mohsen |last2=Hugh |first2=Earl |last3=Tulpan |first3=Dan |last4=Sulik |first4=John |last5=Eskandari |first5=Milad |year=2021 |title=Application of Machine Learning Algorithms in Plant Breeding: Predicting Yield From Hyperspectral Reflectance in Soybean? |journal=Front. Plant Sci. |volume=11 |article-number=624273 |bibcode=2021FrPS...1124273Y |doi=10.3389/fpls.2020.624273 |pmc=7835636 |pmid=33510761 |doi-access=free}}</ref><ref name="Kohavi" /> However, the computational complexity of these algorithms depends on the number of propositions (classes) and can lead to much higher computation times than other machine learning approaches.
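
Dempster's rule of combination assigns to each set A the sum of the products m1(B)·m2(C) over all pairs with B ∩ C = A, renormalised by the total non-conflicting mass. A minimal sketch (assuming Python; the frame of discernment and the mass values are illustrative, not from the cited study):

<syntaxhighlight lang="python">
# Minimal sketch of Dempster's rule of combination: fuse two mass
# functions m1, m2 defined over subsets of the frame {a, b}.
from itertools import product

frame = frozenset({"a", "b"})
m1 = {frozenset({"a"}): 0.6, frame: 0.4}   # evidence source 1
m2 = {frozenset({"b"}): 0.3, frame: 0.7}   # evidence source 2

combined, conflict = {}, 0.0
for (B, mb), (C, mc) in product(m1.items(), m2.items()):
    A = B & C
    if A:
        combined[A] = combined.get(A, 0.0) + mb * mc
    else:
        conflict += mb * mc                # mass assigned to the empty set

# normalise by 1 - K, where K is the total conflicting mass
combined = {A: v / (1.0 - conflict) for A, v in combined.items()}
print(combined)
</syntaxhighlight>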
 
=== Rule-based models ===
== Applications ==
* [[DNA sequence]] classification
* [[Computational economics|Economics]]
* [[Data analysis|Financial market analysis]]<ref>Machine learning is included in the [[Chartered Financial Analyst (CFA)#Curriculum|CFA Curriculum]] (discussion is top-down); see: [https://www.cfainstitute.org/-/media/documents/study-session/2020-l2-ss3.ashx Kathleen DeRose and Christophe Le Lanno (2020). "Machine Learning"] {{Webarchive|url=https://web.archive.org/web/20200113085425/https://www.cfainstitute.org/-/media/documents/study-session/2020-l2-ss3.ashx |date=13 January 2020 }}.</ref>
* [[General game playing]]
* [[Handwriting recognition]]
=== Explainable AI ===
{{Main|Explainable artificial intelligence}}
 
Explainable AI (XAI), also known as interpretable AI or explainable machine learning (XML), is artificial intelligence (AI) in which humans can understand the decisions or predictions made by the AI.<ref>{{cite journal |last1=Rudin |first1=Cynthia |title=Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead |journal=Nature Machine Intelligence |date=2019 |volume=1 |issue=5 |pages=206–215 |doi=10.1038/s42256-019-0048-x |pmid=35603010 |pmc=9122117 }}</ref> It contrasts with the "black box" concept in machine learning, where even the system's designers cannot explain why an AI arrived at a specific decision.<ref>{{cite journal |last1=Hu |first1=Tongxi |last2=Zhang |first2=Xuesong |last3=Bohrer |first3=Gil |last4=Liu |first4=Yanlan |last5=Zhou |first5=Yuyu |last6=Martin |first6=Jay |last7=LI |first7=Yang |last8=Zhao |first8=Kaiguang |title=Crop yield prediction via explainable AI and interpretable machine learning: Dangers of black box models for evaluating climate change impacts on crop yield|journal=Agricultural and Forest Meteorology |date=2023 |volume=336 |article-number=109458 |doi=10.1016/j.agrformet.2023.109458 |bibcode=2023AgFM..33609458H |s2cid=258552400 |doi-access=free }}</ref> By refining users' mental models of AI-powered systems and dismantling their misconceptions, XAI promises to help users perform more effectively. XAI may be an implementation of the social [[right to explanation]].
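
One model-agnostic route to such explanations is permutation feature importance, which shuffles one input feature at a time and measures how much the model's accuracy drops. A minimal sketch (assuming Python with scikit-learn; the dataset and model are illustrative choices, not from the cited works):

<syntaxhighlight lang="python">
# Minimal sketch: explain a black-box classifier by permutation
# feature importance, a model-agnostic XAI technique.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

black_box = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# shuffle one feature at a time and measure the drop in accuracy;
# large drops indicate features the model's decisions depend on
result = permutation_importance(black_box, X_test, y_test,
                                n_repeats=10, random_state=0)
print(result.importances_mean.argsort()[::-1][:5])  # five most influential features
</syntaxhighlight>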
 
=== Overfitting ===
<ref>{{cite journal
 | journal = [[IEEE Internet of Things Journal]]
 | year = 2024
 | volume = 11
 | pages = 14192–14205
 | doi = 10.1109/JIOT.2023.3340858
 | bibcode = 2024IITJ...1114192A
 | url = https://research-portal.uws.ac.uk/en/publications/c8edfe21-77d0-4c3e-a8bc-d384faf605a0
}}</ref> Running models directly on these devices eliminates the need to transfer and store data on cloud servers for further processing, thereby reducing the risk of data breaches, privacy leaks and theft of intellectual property, personal data and business secrets. Embedded machine learning can be achieved through various techniques, such as [[hardware acceleration]],<ref>{{Cite book|last1=Giri|first1=Davide|last2=Chiu|first2=Kuan-Lin|last3=Di Guglielmo|first3=Giuseppe|last4=Mantovani|first4=Paolo|last5=Carloni|first5=Luca P.|title=2020 Design, Automation & Test in Europe Conference & Exhibition (DATE) |chapter=ESP4ML: Platform-Based Design of Systems-on-Chip for Embedded Machine Learning |date=15 June 2020|chapter-url=https://ieeexplore.ieee.org/document/9116317|pages=1049–1054|doi=10.23919/DATE48585.2020.9116317|arxiv=2004.03640|isbn=978-3-9819263-4-7|s2cid=210928161|access-date=17 January 2022|archive-date=18 January 2022|archive-url=https://web.archive.org/web/20220118182342/https://ieeexplore.ieee.org/abstract/document/9116317?casa_token=5I_Tmgrrbu4AAAAA:v7pDHPEWlRuo2Vk3pU06194PO0-W21UOdyZqADrZxrRdPBZDMLwQrjJSAHUhHtzJmLu_VdgW|url-status=live}}</ref><ref>{{Cite web|last1=Louis|first1=Marcia Sahaya|last2=Azad|first2=Zahra|last3=Delshadtehrani|first3=Leila|last4=Gupta|first4=Suyog|last5=Warden|first5=Pete|last6=Reddi|first6=Vijay Janapa|last7=Joshi|first7=Ajay|date=2019|title=Towards Deep Learning using TensorFlow Lite on RISC-V|url=https://edge.seas.harvard.edu/publications/towards-deep-learning-using-tensorflow-lite-risc-v|access-date=17 January 2022|website=[[Harvard University]]|archive-date=17 January 2022|archive-url=https://web.archive.org/web/20220117031909/https://edge.seas.harvard.edu/publications/towards-deep-learning-using-tensorflow-lite-risc-v|url-status=live}}</ref> [[approximate computing]],<ref>{{Cite book|last1=Ibrahim|first1=Ali|last2=Osta|first2=Mario|last3=Alameh|first3=Mohamad|last4=Saleh|first4=Moustafa|last5=Chible|first5=Hussein|last6=Valle|first6=Maurizio|title=2018 25th IEEE International Conference on Electronics, Circuits and Systems (ICECS) |chapter=Approximate Computing Methods for Embedded Machine Learning |date=21 January 2019|pages=845–848|doi=10.1109/ICECS.2018.8617877|isbn=978-1-5386-9562-3|s2cid=58670712}}</ref> and model optimisation.<ref>{{Cite web|title=dblp: TensorFlow Eager: A Multi-Stage, Python-Embedded DSL for Machine Learning.|url=https://dblp.org/rec/journals/corr/abs-1903-01855.html|access-date=17 January 2022|website=dblp.org|language=en|archive-date=18 January 2022|archive-url=https://web.archive.org/web/20220118182335/https://dblp.org/rec/journals/corr/abs-1903-01855.html|url-status=live}}</ref><ref>{{Cite journal|last1=Branco|first1=Sérgio|last2=Ferreira|first2=André G.|last3=Cabral|first3=Jorge|date=5 November 2019|title=Machine Learning in Resource-Scarce Embedded Systems, FPGAs, and End-Devices: A Survey|journal=Electronics|volume=8|issue=11|pages=1289|doi=10.3390/electronics8111289|issn=2079-9292|doi-access=free|hdl=1822/62521|hdl-access=free}}</ref> Common optimisation techniques include [[Pruning (artificial neural network)|pruning]], [[Model compression#Quantization|quantization]], [[knowledge distillation]], low-rank factorisation, network architecture search, and parameter sharing.
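
As a minimal sketch of one of these optimisation techniques (assuming Python with NumPy; the weight matrix and the 8-bit affine scheme are illustrative, not any specific framework's implementation), post-training quantisation maps 32-bit floating-point weights to 8-bit integers plus a scale and zero point:

<syntaxhighlight lang="python">
# Minimal sketch of post-training quantisation: store weights as
# uint8 plus a scale/zero-point, cutting memory use by 4x.
import numpy as np

weights = np.random.default_rng(0).normal(size=(64, 64)).astype(np.float32)

# affine quantisation to the uint8 range [0, 255]
w_min, w_max = weights.min(), weights.max()
scale = (w_max - w_min) / 255.0
zero_point = np.round(-w_min / scale).astype(np.uint8)

q = np.clip(np.round(weights / scale) + zero_point, 0, 255).astype(np.uint8)
dequantised = (q.astype(np.float32) - zero_point) * scale

print(q.nbytes / weights.nbytes)            # 0.25, i.e. 4x smaller
print(np.abs(weights - dequantised).max())  # small rounding error, at most scale/2
</syntaxhighlight>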