Population Based Training (PBT) learns both hyperparameter values and network weights. Multiple learning processes operate independently, using different hyperparameters. As with evolutionary methods, poorly performing models are iteratively replaced with models that adopt modified hyperparameter values and weights based on the better performers. This warm starting of the replacement model is the primary differentiator between PBT and other evolutionary methods. PBT thus allows the hyperparameters to evolve during training and eliminates the need for manual hyperparameter tuning. The process makes no assumptions regarding model architecture, loss function, or training procedure.
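
The following is a minimal sketch of PBT's periodic exploit-and-explore loop on a toy one-parameter model, where the only hyperparameter is the learning rate. The functions <code>train_step</code> and <code>evaluate</code>, the population size, the exploit interval, and the perturbation factors are all illustrative assumptions, not part of any reference implementation:

<syntaxhighlight lang="python">
import random

# Toy setting (assumption): each "model" is a single weight trained by
# gradient descent on f(w) = (w - 3)^2; the hyperparameter is the
# learning rate. In real PBT the model would be a full network.

def train_step(weight, lr):
    grad = 2.0 * (weight - 3.0)
    return weight - lr * grad

def evaluate(weight):
    return -(weight - 3.0) ** 2  # higher is better

# Population of workers, each with its own weights and hyperparameters.
population = [{"weight": random.uniform(-5.0, 5.0),
               "lr": 10 ** random.uniform(-3, 0)} for _ in range(8)]

for step in range(50):
    # Each worker trains independently under its own hyperparameters.
    for member in population:
        member["weight"] = train_step(member["weight"], member["lr"])

    # Periodically exploit and explore (interval is an assumption).
    if step % 10 == 9:
        population.sort(key=lambda m: evaluate(m["weight"]), reverse=True)
        top, bottom = population[:2], population[-2:]
        for loser in bottom:
            winner = random.choice(top)
            # Exploit: warm-start from a better performer by copying
            # its weights -- the step that distinguishes PBT from
            # other evolutionary methods.
            loser["weight"] = winner["weight"]
            # Explore: perturb the copied hyperparameter so values
            # keep evolving during training.
            loser["lr"] = winner["lr"] * random.choice([0.8, 1.2])

best = max(population, key=lambda m: evaluate(m["weight"]))
print(f"best lr={best['lr']:.4f}, weight={best['weight']:.4f}")
</syntaxhighlight>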
 
PBT and its variants are adaptive methods: they update hyperparameters during the training of the models. By contrast, non-adaptive methods use the sub-optimal strategy of assigning a constant set of hyperparameters for the whole training run. <ref>{{cite arxiv|last=Li|first=Ang|last2=Spyra|first2=Ola|last3=Perel|first3=Sagi|last4=Dalibard|first4=Valentin|last5=Jaderberg|first5=Max|last6=Gu|first6=Chenjie|last7=Budden|first7=David|last8=Harley|first8=Tim|last9=Gupta|first9=Pramod|date=2019-02-05|title=A Generalized Framework for Population Based Training|eprint=1902.01894|class=cs.AI}}</ref>
 
=== Early stopping-based ===