Neural architecture search: Difference between revisions

Wsafari (talk | contribs)
Modified section on evolutionary algorithm
 
== Evolution ==
An alternative approach to NAS is based on [[Evolutionary algorithm|evolutionary algorithms]], which has been employed by several groups.<ref>{{cite arXiv|last1=Real|first1=Esteban|last2=Moore|first2=Sherry|last3=Selle|first3=Andrew|last4=Saxena|first4=Saurabh|last5=Suematsu|first5=Yutaka Leon|last6=Tan|first6=Jie|last7=Le|first7=Quoc|last8=Kurakin|first8=Alex|date=2017-03-03|title=Large-Scale Evolution of Image Classifiers|eprint=1703.01041|class=cs.NE}}</ref><ref>{{Cite journal|last=Suganuma|first=Masanori|last2=Shirakawa|first2=Shinichi|last3=Nagao|first3=Tomoharu|date=2017-04-03|title=A Genetic Programming Approach to Designing Convolutional Neural Network Architectures|url=https://arxiv.org/abs/1704.00764v2|language=en}}</ref><ref name=":0">{{Cite journal|last=Liu|first=Hanxiao|last2=Simonyan|first2=Karen|last3=Vinyals|first3=Oriol|last4=Fernando|first4=Chrisantha|last5=Kavukcuoglu|first5=Koray|date=2017-11-01|title=Hierarchical Representations for Efficient Architecture Search|url=https://arxiv.org/abs/1711.00436v2|language=en}}</ref><ref name="Real 2018">{{cite arXiv|last1=Real|first1=Esteban|last2=Aggarwal|first2=Alok|last3=Huang|first3=Yanping|last4=Le|first4=Quoc V.|date=2018-02-05|title=Regularized Evolution for Image Classifier Architecture Search|eprint=1802.01548|class=cs.NE}}</ref><ref>{{Cite journal|last=Miikkulainen|first=Risto|last2=Liang|first2=Jason|last3=Meyerson|first3=Elliot|last4=Rawal|first4=Aditya|last5=Fink|first5=Dan|last6=Francon|first6=Olivier|last7=Raju|first7=Bala|last8=Shahrzad|first8=Hormoz|last9=Navruzyan|first9=Arshak|last10=Duffy|first10=Nigel|last11=Hodjat|first11=Babak|date=2017-03-04|title=Evolving Deep Neural Networks|url=http://arxiv.org/abs/1703.00548|journal=arXiv:1703.00548 [cs]}}</ref><ref>{{Cite journal|last=Xie|first=Lingxi|last2=Yuille|first2=Alan|title=Genetic CNN|url=https://ieeexplore.ieee.org/document/8237416|journal=2017 IEEE International Conference on Computer Vision (ICCV)|pages=1388–1397|doi=10.1109/ICCV.2017.154}}</ref><ref name="Elsken 2018" /> An evolutionary algorithm for neural architecture search generally performs the following procedure.<ref name="liu2021survey">{{cite arXiv|last1=Liu|first1=Yuqiao|last2=Sun|first2=Yanan|last3=Xue|first3=Bing|last4=Zhang|first4=Mengjie|last5=Yen|first5=Gary G.|last6=Tan|first6=Kay Chen|date=2020-08-25|title=A Survey on Evolutionary Neural Architecture Search|eprint=2008.10937|class=cs.NE}}</ref> First, a pool consisting of different candidate architectures along with their validation scores (fitness) is initialised. At each step, the architectures in the candidate pool are mutated (e.g., using a 3x3 convolution instead of a 5x5 convolution). Next, the new architectures are trained from scratch for a few epochs and their validation scores are obtained. This is followed by replacing the lowest-scoring architectures in the candidate pool with the better, newer architectures. This procedure is repeated multiple times, so the candidate pool is refined over time. Mutations in the context of evolving ANNs are operations such as adding or removing a layer, changing the type of a layer (e.g., from convolution to pooling), changing the hyperparameters of a layer, or changing the training hyperparameters. On [[CIFAR-10]] and [[ImageNet]], evolution and RL performed comparably, while both slightly outperformed [[random search]].<ref name="Real 2018" /><ref name=":0" />
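The pool-based procedure above can be sketched in a few lines of Python. This is a minimal toy illustration, not any published NAS system: the layer encoding, the `evaluate` fitness (which stands in for "train for a few epochs and read the validation score"), and all parameter values are invented for the example.

```python
import random

# Hypothetical search space: an architecture is a list of layer descriptors.
LAYER_TYPES = ["conv3x3", "conv5x5", "pool", "identity"]

def evaluate(arch):
    """Toy stand-in for training a few epochs and returning a
    validation score; here it simply rewards conv3x3 layers."""
    return sum(1.0 for layer in arch if layer == "conv3x3") / len(arch)

def mutate(arch):
    """Replace one layer with a random alternative
    (e.g., a 3x3 convolution instead of a 5x5 convolution)."""
    child = list(arch)
    i = random.randrange(len(child))
    child[i] = random.choice(LAYER_TYPES)
    return child

def evolve_nas(pool_size=10, arch_len=6, steps=200, seed=0):
    random.seed(seed)
    # 1. Initialise a pool of candidates with their fitness scores.
    pool = [[random.choice(LAYER_TYPES) for _ in range(arch_len)]
            for _ in range(pool_size)]
    scores = [evaluate(a) for a in pool]
    for _ in range(steps):
        # 2. Mutate a parent drawn from the pool.
        child = mutate(random.choice(pool))
        # 3. "Train" the child and obtain its validation score.
        child_score = evaluate(child)
        # 4. Replace the lowest-scoring candidate if the child beats it.
        worst = min(range(pool_size), key=lambda i: scores[i])
        if child_score > scores[worst]:
            pool[worst], scores[worst] = child, child_score
    return max(zip(scores, pool))

best_score, best_arch = evolve_nas()
```

Real systems differ mainly in the replacement rule; regularized evolution, for instance, removes the *oldest* candidate rather than the worst one.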
 
== Bayesian Optimization ==
[[Bayesian Optimization]], which has proven to be an efficient method for hyperparameter optimization, can also be applied to NAS. In this context, the objective function maps an architecture to its validation error after training for a number of epochs. At each iteration, BO uses a surrogate to model this objective function based on previously evaluated architectures and their validation errors. The next architecture to evaluate is then chosen by maximizing an acquisition function, such as expected improvement, which balances exploration and exploitation. Acquisition function maximization and objective function evaluation are often computationally expensive for NAS, which makes the application of BO challenging in this context. Recently, BANANAS<ref>{{Cite journal|last=White|first=Colin|last2=Neiswanger|first2=Willie|last3=Savani|first3=Yash|date=2020-11-02|title=BANANAS: Bayesian Optimization with Neural Architectures for Neural Architecture Search|url=http://arxiv.org/abs/1910.11858|journal=arXiv:1910.11858 [cs, stat]}}</ref> has achieved promising results in this direction by introducing a high-performing instantiation of BO coupled with a neural predictor.
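The surrogate-plus-acquisition loop can be illustrated with a toy discrete example. Everything here is a simplified assumption for illustration: architectures are encoded as bit-strings, the `objective` stands in for a validation-error measurement, and the surrogate is a naive nearest-neighbour predictor rather than the Gaussian process or neural predictor used in practice; only the expected-improvement formula follows the standard Gaussian form.

```python
import math
import random

def objective(arch):
    """Toy stand-in for validation error after a short training run;
    lower is better, with the all-ones encoding as the optimum."""
    return 1.0 - sum(arch) / len(arch)

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def surrogate(arch, observed):
    """Naive surrogate: predicted error is the mean of the nearest
    observed errors; uncertainty grows with distance to the data."""
    k = min(3, len(observed))
    near = sorted(observed, key=lambda o: hamming(arch, o[0]))[:k]
    mu = sum(err for _, err in near) / k
    sigma = 0.05 + 0.1 * hamming(arch, near[0][0])
    return mu, sigma

def expected_improvement(arch, observed, best_err):
    """EI for minimisation under a Gaussian predictive distribution."""
    mu, sigma = surrogate(arch, observed)
    z = (best_err - mu) / sigma
    pdf = math.exp(-z * z / 2) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return (best_err - mu) * cdf + sigma * pdf

def bo_nas(n_bits=8, n_init=4, n_iters=20, seed=0):
    random.seed(seed)
    space = [tuple((i >> b) & 1 for b in range(n_bits))
             for i in range(2 ** n_bits)]
    # Initial design: a few randomly evaluated architectures.
    observed = [(a, objective(a)) for a in random.sample(space, n_init)]
    for _ in range(n_iters):
        best_err = min(err for _, err in observed)
        seen = {a for a, _ in observed}
        # Choose the next architecture by maximising the acquisition.
        cand = max((a for a in space if a not in seen),
                   key=lambda a: expected_improvement(a, observed, best_err))
        observed.append((cand, objective(cand)))
    return min(observed, key=lambda o: o[1])

best_arch, best_err = bo_nas()
```

The expensive steps mentioned above are visible here: each iteration scores every unevaluated candidate with the acquisition function and then pays one full objective evaluation, which for real NAS means training a network.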
 
==Hill-climbing==