Normalization (machine learning)

== Other normalizations ==
'''Weight normalization''' ('''WeightNorm''')<ref>{{Citation |last=Salimans |first=Tim |title=Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks |date=2016-06-03 |url=http://arxiv.org/abs/1602.07868 |access-date=2024-08-08 |doi=10.48550/arXiv.1602.07868 |last2=Kingma |first2=Diederik P.}}</ref> is a technique inspired by BatchNorm that normalizes weight matrices in a neural network, rather than its activations. It reparameterizes each weight vector <math>w</math> into a direction <math>v</math> and a learned scalar magnitude <math>g</math>:
:<math>w = g \frac{v}{\|v\|_2}.</math>
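
A minimal NumPy sketch of this reparameterization (the layer sizes, initialization, and <code>eps</code> term below are illustrative assumptions, not taken from the paper):

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical layer shapes for illustration only.
rng = np.random.default_rng(0)
v = rng.standard_normal((64, 128))  # direction parameters, one row per output unit
g = np.ones((64, 1))                # learned per-unit magnitude, initialized to 1

def weight_norm(v, g, eps=1e-12):
    """Reparameterize weights row-wise as w = g * v / ||v||_2."""
    norms = np.linalg.norm(v, axis=1, keepdims=True)
    return g * v / (norms + eps)

w = weight_norm(v, g)               # effective weight matrix for the forward pass
x = rng.standard_normal((32, 128))  # a batch of 32 inputs
y = x @ w.T                         # linear layer output with normalized weights
</syntaxhighlight>

In practice, deep-learning frameworks provide this reparameterization directly; PyTorch, for example, exposes it as <code>torch.nn.utils.weight_norm</code>.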
 
'''Gradient normalization''' ('''GradNorm''')<ref>{{Cite journal |last=Chen |first=Zhao |last2=Badrinarayanan |first2=Vijay |last3=Lee |first3=Chen-Yu |last4=Rabinovich |first4=Andrew |date=2018-07-03 |title=GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks |url=https://proceedings.mlr.press/v80/chen18a.html |journal=Proceedings of the 35th International Conference on Machine Learning |language=en |publisher=PMLR |pages=794–803}}</ref> normalizes gradient vectors during backpropagation. In multitask learning, it dynamically rescales each task's loss weight so that the gradient magnitudes of all tasks are brought to a common scale, preventing any single task from dominating training. A sketch of one update step appears below.
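
A toy NumPy sketch of a single GradNorm weight update, under simplifying assumptions: the gradient vectors, loss ratios, learning rate, and <math>\alpha</math> are placeholder values, and the per-task gradient norms are computed directly rather than by backpropagation through the last shared layer as in the paper.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

alpha = 1.5                             # balancing-strength hyperparameter
w = np.ones(3)                          # per-task loss weights, initialized to 1
# Stand-in per-task gradients and loss ratios L_i(t) / L_i(0):
grads = [rng.standard_normal(100) for _ in range(3)]
loss_ratio = np.array([0.9, 0.5, 0.7])

# Weighted gradient norms G_i = ||w_i * g_i|| for each task i.
G = np.array([np.linalg.norm(w[i] * g) for i, g in enumerate(grads)])
r = loss_ratio / loss_ratio.mean()      # relative inverse training rates
target = G.mean() * r ** alpha          # common target norm (treated as a constant)

# Gradient of sum_i |G_i - target_i| w.r.t. w_i; since G_i = w_i * ||g_i||,
# dG_i/dw_i = G_i / w_i for positive w_i.
grad_w = np.sign(G - target) * G / w
w -= 0.025 * grad_w                     # one gradient step on the loss weights
w *= len(w) / w.sum()                   # renormalize so the weights sum to the task count
</syntaxhighlight>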
 
== CNN-specific normalization ==