Normalization (machine learning): Difference between revisions

 
=== Improvements ===
BatchNorm has been very popular, and many improvements to it have been proposed. Some examples include:<ref name=":3">{{cite arXiv | eprint=1906.03548 | last1=Summers | first1=Cecilia | last2=Dinneen | first2=Michael J. | title=Four Things Everyone Should Know to Improve Batch Normalization | date=2019 | class=cs.LG }}</ref>
 
* Ghost batch: Randomly partition a batch into sub-batches and perform BatchNorm separately on each.
</math>where <math>\alpha</math> is a hyperparameter to be optimized on a validation set.
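The ghost-batch idea above can be sketched in a few lines. This is a minimal NumPy illustration, not a production layer: the function name and shapes are illustrative, and the learned scale and shift parameters (γ, β) of a full BatchNorm layer are omitted.

```python
import numpy as np

def ghost_batch_norm(x, ghost_size, eps=1e-5):
    """Normalize each sub-batch ("ghost batch") of x independently.

    x: array of shape (batch, features); ghost_size must divide batch.
    Illustrative sketch only: the learnable gamma/beta of BatchNorm
    and running statistics for inference are omitted.
    """
    batch, _ = x.shape
    assert batch % ghost_size == 0, "ghost_size must divide the batch size"
    out = np.empty_like(x, dtype=float)
    for start in range(0, batch, ghost_size):
        sub = x[start:start + ghost_size]
        # Statistics are computed per ghost batch, not over the full batch.
        mean = sub.mean(axis=0)
        var = sub.var(axis=0)
        out[start:start + ghost_size] = (sub - mean) / np.sqrt(var + eps)
    return out
```

Because each sub-batch is normalized with its own statistics, ghost batching injects more noise than full-batch BatchNorm, which is the intended regularizing effect.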
 
Other works attempt to eliminate BatchNorm entirely, such as the Normalizer-Free ResNet.<ref>{{cite arXiv | eprint=2102.06171 | last1=Brock | first1=Andrew | last2=De | first2=Soham | last3=Smith | first3=Samuel L. | last4=Simonyan | first4=Karen | title=High-Performance Large-Scale Image Recognition Without Normalization | date=2021 | class=cs.CV }}</ref>
 
== Layer normalization ==