A different approach to obtaining a gradient with respect to hyperparameters is to differentiate the steps of an iterative optimization algorithm using [[automatic differentiation]].<ref>{{cite journal|last1=Domke|first1=Justin|title=Generic Methods for Optimization-Based Modeling|journal=AISTATS|date=2012|volume=22|url=http://www.jmlr.org/proceedings/papers/v22/domke12/domke12.pdf}}</ref><ref name=abs1502.03492>{{cite arXiv |last1=Maclaurin|first1=Douglas|last2=Duvenaud|first2=David|last3=Adams|first3=Ryan P.|eprint=1502.03492|title=Gradient-based Hyperparameter Optimization through Reversible Learning|class=stat.ML|date=2015}}</ref>
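The idea can be illustrated with a minimal sketch. The example below (a toy setup, not the method of any particular paper) implements forward-mode automatic differentiation with a small dual-number class, unrolls a few steps of gradient descent on a one-dimensional quadratic training loss, and differentiates a validation loss with respect to the learning rate; the loss functions and all numeric values are illustrative assumptions:

```python
class Dual:
    """Minimal forward-mode dual number: a value plus its derivative
    with respect to a single chosen input (here, the learning rate)."""
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot
    def _wrap(self, o):
        return o if isinstance(o, Dual) else Dual(o)
    def __add__(self, o):
        o = self._wrap(o)
        return Dual(self.val + o.val, self.dot + o.dot)
    __radd__ = __add__
    def __sub__(self, o):
        o = self._wrap(o)
        return Dual(self.val - o.val, self.dot - o.dot)
    def __mul__(self, o):
        o = self._wrap(o)
        return Dual(self.val * o.val, self.dot * o.val + self.val * o.dot)
    __rmul__ = __mul__

def train(lr, steps=20, w0=0.0, target=3.0):
    """Unrolled gradient descent on the training loss (w - target)^2.
    Every step is built from Dual arithmetic, so derivatives with
    respect to the learning rate flow through the whole unrolled loop."""
    w = Dual(w0)
    for _ in range(steps):
        grad = 2 * (w - target)   # d/dw of (w - target)^2
        w = w - lr * grad
    return w

def val_loss(lr, val_target=2.5):
    """Validation loss of the trained weight (illustrative)."""
    diff = train(lr) - val_target
    return diff * diff

# Hypergradient: seed the learning rate with derivative 1.
lr0 = 0.1
hypergrad = val_loss(Dual(lr0, dot=1.0)).dot

# Sanity check against central finite differences.
eps = 1e-6
fd = (val_loss(Dual(lr0 + eps)).val - val_loss(Dual(lr0 - eps)).val) / (2 * eps)
print(hypergrad, fd)
```

Reverse-mode variants of this unrolling, as in the Maclaurin et al. work cited above, scale to many hyperparameters but must store or reconstruct the intermediate optimization states.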
=== Evolutionary optimization ===
{{main article|Evolutionary algorithm}}
Evolutionary optimization is a methodology for the global optimization of noisy black-box functions. In hyperparameter optimization, evolutionary optimization uses [[evolutionary algorithms]] to search the space of hyperparameters for a given algorithm.<ref name="bergstra11" /> Evolutionary hyperparameter optimization follows a [[Evolutionary_algorithm#Implementation|process]] inspired by the biological concept of [[evolution]]:
# Create an initial population of random solutions (i.e., randomly generate tuples of hyperparameters, typically 100+)
# Evaluate the hyperparameter tuples and acquire their [[fitness function|fitness]] (e.g., 10-fold [[Cross-validation (statistics)|cross-validation]] accuracy of the machine learning algorithm with those hyperparameters)
# Rank the hyperparameter tuples by their relative fitness
# Replace the worst-performing hyperparameter tuples with new hyperparameter tuples generated through [[crossover (genetic algorithm)|crossover]] and [[mutation (genetic algorithm)|mutation]]
# Repeat steps 2–4 until satisfactory algorithm performance is reached or performance is no longer improving
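The steps above can be sketched in a few lines of Python. In this toy example the fitness function is an assumed stand-in for cross-validation accuracy (in practice, evaluating fitness would mean training and validating the model with each hyperparameter tuple), and the population size, generation count, and mutation scale are illustrative choices:

```python
import random

random.seed(0)

def fitness(params):
    """Toy stand-in for cross-validated accuracy: peaks at
    learning_rate = 0.1, regularization = 0.01 (assumed optimum)."""
    lr, reg = params
    return -((lr - 0.1) ** 2 + (reg - 0.01) ** 2)

def random_tuple():
    """Step 1: a random hyperparameter tuple (learning rate, regularization)."""
    return (random.uniform(0.0, 1.0), random.uniform(0.0, 0.1))

def crossover(a, b):
    """Take each hyperparameter from one of the two parents at random."""
    return tuple(random.choice(pair) for pair in zip(a, b))

def mutate(params, scale=0.02):
    """Perturb each hyperparameter with Gaussian noise (kept non-negative)."""
    return tuple(max(0.0, p + random.gauss(0.0, scale)) for p in params)

# Step 1: initial population of random hyperparameter tuples.
population = [random_tuple() for _ in range(100)]

for generation in range(30):           # step 5: repeat until budget exhausted
    # Steps 2-3: evaluate fitness and rank the tuples.
    population.sort(key=fitness, reverse=True)
    survivors = population[:50]
    # Step 4: replace the worst half with mutated crossovers of survivors.
    children = [mutate(crossover(*random.sample(survivors, 2)))
                for _ in range(50)]
    population = survivors + children

best = max(population, key=fitness)
print(best)
```

With the toy fitness above, the best tuple converges toward the assumed optimum; in a real setting each fitness evaluation is a full training run, which is why population sizes and generation counts are budget-limited.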
Evolutionary optimization has been used in hyperparameter optimization for statistical machine learning algorithms,<ref name="bergstra11" /> [[automated machine learning]],<ref name="tpot1" /><ref name="tpot2" /> [[Deep_learning#Deep_neural_networks|deep neural network]] architecture search,<ref name="miikkulainen1">{{cite journal | vauthors = Miikkulainen R, Liang J, Meyerson E, Rawal A, Fink D, Francon O, Raju B, Shahrzad H, Navruzyan A, Duffy N, Hodjat B | year = 2017 | title = Evolving Deep Neural Networks | url = https://arxiv.org/abs/1703.00548 | journal = arXiv Preprint }}</ref><ref name="jaderberg1">{{cite journal | vauthors = Jaderberg M, Dalibard V, Osindero S, Czarnecki WM, Donahue J, Razavi A, Vinyals O, Green T, Dunning I, Simonyan K, Fernando C, Kavukcuoglu K | year = 2017 | title = Population Based Training of Neural Networks | url = https://arxiv.org/abs/1711.09846 | journal = arXiv Preprint }}</ref> as well as training of the weights in deep neural networks.<ref name="such1">{{cite journal | vauthors = Such FP, Madhavan V, Conti E, Lehman J, Stanley KO, Clune J | year = 2017 | title = Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning | url = https://arxiv.org/abs/1712.06567 | journal = arXiv Preprint }}</ref>
=== Others ===
== Software ==