Normalization (machine learning): Difference between revisions

Content deleted Content added
Line 172:
 
=== Response normalization ===
{{Anchor|Local response normalization}}'''Local response normalization'''<ref>{{Cite journal |last1=Krizhevsky |first1=Alex |last2=Sutskever |first2=Ilya |last3=Hinton |first3=Geoffrey E |date=2012 |title=ImageNet Classification with Deep Convolutional Neural Networks |url=https://papers.nips.cc/paper_files/paper/2012/hash/c399862d3b9d6b76c8436e924a68c45b-Abstract.html |journal=Advances in Neural Information Processing Systems |publisher=Curran Associates, Inc. |volume=25}}</ref> was used in [[AlexNet]]. It was applied in a convolutional layer, just after a nonlinear activation function. It was defined by<math display="block">b_{x, y}^i=\frac{a_{x, y}^i}{\left(k+\alpha \sum_{j=\max (0, i-n / 2)}^{\min (N-1, i+n / 2)}\left(a_{x, y}^j\right)^2\right)^\beta}</math>where <math>a_{x,y}^i</math> is the activation of the neuron at ___location <math>(x,y)</math> and channel <math>i</math>. In words, each pixel in a channel is suppressed by the activations of the same pixel in its adjacent channels.
 
The numbers <math>k, n, \alpha, \beta</math> are hyperparameters picked by using a validation set.