Neural architecture search: Difference between revisions


Revision as of 21:57, 19 August 2024 by 142.105.129.184 (grammar)
Latest revision as of 00:29, 27 August 2025 by Citation bot (Removed URL that duplicated identifier. \| Use this bot. Report bugs. \| Suggested by Headbomb \| Linked from Wikipedia:WikiProject_Academic_Journals/Journals_cited_by_Wikipedia/Sandbox \| #UCB_webform_linked 893/990)
(2 intermediate revisions by 2 users not shown)
Line 7:
* The ''performance estimation strategy'' evaluates the performance of a possible ANN from its design (without constructing and training it).

NAS is closely related to [[hyperparameter optimization]]<ref>Matthias Feurer and Frank Hutter. [https://link.springer.com/content/pdf/10.1007%2F978-3-030-05318-5_1.pdf Hyperparameter optimization]. In: ''AutoML: Methods, Systems, Challenges'', pages 3–38.</ref> and [[meta-learning (computer science)\|meta-learning]]<ref>{{Cite book\|chapter-url=https://link.springer.com/chapter/10.1007/978-3-030-05318-5_2\|doi = 10.1007/978-3-030-05318-5_2\|chapter = Meta-Learning\|title = Automated Machine Learning\|series = The Springer Series on Challenges in Machine Learning\|year = 2019\|last1 = Vanschoren\|first1 = Joaquin\|pages = 35–61\|isbn = 978-3-030-05317-8\|s2cid = 239362577}}</ref> and is a subfield of [[automated machine learning]] (AutoML).<ref>{{Cite journal \|~~last~~last1=Salehin \|~~first~~first1=Imrus \|last2=Islam \|first2=Md. Shamiul \|last3=Saha \|first3=Pritom \|last4=Noman \|first4=S. M. \|last5=Tuni \|first5=Azra \|last6=Hasan \|first6=Md. Mehedi \|last7=Baten \|first7=Md. Abu \|date=2024-01-01 \|title=AutoML: A systematic review on automated machine learning with neural architecture search ~~\|url=https://www.sciencedirect.com/science/article/pii/S2949715923000604~~ \|journal=Journal of Information and Intelligence \|volume=2 \|issue=1 \|pages=52–81 \|doi=10.1016/j.jiixd.2023.10.002 \|issn=2949-7159\|doi-access=free }}</ref>

==Reinforcement learning==

Line 17:
== Evolution ==
An alternative approach to NAS is based on [[evolutionary algorithm]]s, which has been employed by several groups.<ref>{{cite arXiv\|last1=Real\|first1=Esteban\|last2=Moore\|first2=Sherry\|last3=Selle\|first3=Andrew\|last4=Saxena\|first4=Saurabh\|last5=Suematsu\|first5=Yutaka Leon\|last6=Tan\|first6=Jie\|last7=Le\|first7=Quoc\|last8=Kurakin\|first8=Alex\|date=2017-03-03\|title=Large-Scale Evolution of Image Classifiers\|eprint=1703.01041\|class=cs.NE}}</ref><ref>{{Cite arXiv\|last1=Suganuma\|first1=Masanori\|last2=Shirakawa\|first2=Shinichi\|last3=Nagao\|first3=Tomoharu\|date=2017-04-03\|title=A Genetic Programming Approach to Designing Convolutional Neural Network Architectures\|class=cs.NE\|eprint=1704.00764v2\|language=en}}</ref><ref name=":0">{{Cite arXiv\|last1=Liu\|first1=Hanxiao\|last2=Simonyan\|first2=Karen\|last3=Vinyals\|first3=Oriol\|last4=Fernando\|first4=Chrisantha\|last5=Kavukcuoglu\|first5=Koray\|date=2017-11-01\|title=Hierarchical Representations for Efficient Architecture Search\|class=cs.LG\|eprint=1711.00436v2\|language=en}}</ref><ref name="Real 2018">{{cite arXiv\|last1=Real\|first1=Esteban\|last2=Aggarwal\|first2=Alok\|last3=Huang\|first3=Yanping\|last4=Le\|first4=Quoc V.\|date=2018-02-05\|title=Regularized Evolution for Image Classifier Architecture Search\|eprint=1802.01548\|class=cs.NE}}</ref><ref>{{cite 
arXiv\|last1=Miikkulainen\|first1=Risto\|last2=Liang\|first2=Jason\|last3=Meyerson\|first3=Elliot\|last4=Rawal\|first4=Aditya\|last5=Fink\|first5=Dan\|last6=Francon\|first6=Olivier\|last7=Raju\|first7=Bala\|last8=Shahrzad\|first8=Hormoz\|last9=Navruzyan\|first9=Arshak\|last10=Duffy\|first10=Nigel\|last11=Hodjat\|first11=Babak\|date=2017-03-04\|title=Evolving Deep Neural Networks\|class=cs.NE\|eprint=1703.00548}}</ref><ref>{{Cite book\|last1=Xie\|first1=Lingxi\|last2=Yuille\|first2=Alan\|title=2017 IEEE International Conference on Computer Vision (ICCV) \|chapter=Genetic CNN ~~\|chapter-url=https://ieeexplore.ieee.org/document/8237416~~\|year=2017\|pages=1388–1397\|doi=10.1109/ICCV.2017.154\|arxiv=1703.01513\|isbn=978-1-5386-1032-9\|s2cid=206770867}}</ref><ref name="Elsken 2018" /> An evolutionary algorithm for neural architecture search generally performs the following procedure.<ref name="liu2021survey">{{cite journal\|last1=Liu\|first1=Yuqiao\|last2=Sun\|first2=Yanan\|last3=Xue\|first3=Bing\|last4=Zhang\|first4=Mengjie\|last5=Yen\|first5=Gary G\|last6=Tan\|first6=Kay Chen\|title=A Survey on Evolutionary Neural Architecture Search\|journal=IEEE Transactions on Neural Networks and Learning Systems\|year=2021\|volume= 34\|issue=2 \|pages=1–21\|doi=10.1109/TNNLS.2021.3100554\|pmid=34357870\|arxiv=2008.10937\|s2cid=221293236}}</ref> First, a pool of candidate architectures is initialised together with their validation scores (fitness). At each step, the architectures in the candidate pool are mutated (e.g., a 3x3 convolution instead of a 5x5 convolution). Next, the new architectures are trained from scratch for a few epochs and their validation scores are obtained. The lowest-scoring architectures in the candidate pool are then replaced with the better, newer architectures. This procedure is repeated multiple times, so the candidate pool is refined over time. 
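The procedure described above can be sketched in a few lines of Python. This is only an illustrative toy, not an implementation from any of the cited papers: the search space (a list of kernel sizes), the names <code>random_architecture</code>, <code>mutate</code>, <code>proxy_fitness</code> and <code>evolve</code>, and in particular the scoring function are all assumptions made for the example. A real system would replace <code>proxy_fitness</code> with training each candidate for a few epochs and measuring its validation score.

```python
import random

# Hypothetical search space: an architecture is a list of kernel sizes,
# one entry per convolutional layer.
KERNEL_CHOICES = (1, 3, 5, 7)


def random_architecture(rng, depth=4):
    """Sample a random candidate from the toy search space."""
    return [rng.choice(KERNEL_CHOICES) for _ in range(depth)]


def mutate(arch, rng):
    """Return a copy of arch with one layer's kernel size changed."""
    child = list(arch)
    i = rng.randrange(len(child))
    child[i] = rng.choice([k for k in KERNEL_CHOICES if k != child[i]])
    return child


def proxy_fitness(arch):
    """Stand-in for 'train briefly and measure validation accuracy'.

    Assumption for illustration only: layers closer to 3x3 score higher.
    """
    return -sum(abs(k - 3) for k in arch)


def evolve(pool_size=8, steps=20, seed=0, fitness=proxy_fitness):
    """Run the mutate / evaluate / replace loop and return the scored pool."""
    rng = random.Random(seed)
    # Initialise the pool together with its fitness scores.
    scored = [(fitness(a), a) for a in
              (random_architecture(rng) for _ in range(pool_size))]
    for _ in range(steps):
        # Mutate every pool member and evaluate the new candidates.
        children = [mutate(arch, rng) for _, arch in scored]
        scored_children = [(fitness(c), c) for c in children]
        # Keep the best pool_size candidates, so the lowest-scoring pool
        # members are displaced by better, newer architectures.
        merged = sorted(scored + scored_children,
                        key=lambda item: item[0], reverse=True)
        scored = merged[:pool_size]
    return sorted(scored, key=lambda item: item[0], reverse=True)


if __name__ == "__main__":
    best_score, best_arch = evolve()[0]
    print(best_score, best_arch)
```

Because the pool only ever keeps the highest-scoring candidates, the best score in this sketch can never decrease from one step to the next. Published methods differ in how they select parents and which members they discard.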
Mutations in the context of evolving ANNs are operations such as adding or removing a layer, changing the type of a layer (e.g., from convolution to pooling), changing the hyperparameters of a layer, or changing the training hyperparameters. On [[CIFAR-10]] and [[ImageNet]], evolution and RL performed comparably, while both slightly outperformed [[random search]].<ref name="Real 2018" /><ref name=":0" />

== Bayesian optimization ==

Line 61:
{{Differentiable computing}}
[[Category:Artificial intelligence engineering]]
[[Category:Artificial intelligence]]