Normalization (machine learning): Difference between revisions

{{Short description|Machine learning technique}}
{{Machine learning bar}}
In [[machine learning]], '''normalization''' is a statistical technique with various applications. There are two main forms of normalization, namely ''data normalization'' and ''activation normalization''. Data normalization (or [[feature scaling]]) includes methods that rescale input data so that the [[Feature (machine learning)|features]] have the same range, mean, variance, or other statistical properties. For instance, a popular choice of feature scaling method is [[Feature scaling#Rescaling (min-max normalization)|min-max normalization]], where each feature is transformed to have the same range (typically <math>[0,1]</math> or <math>[-1,1]</math>). This solves the problem of different features having vastly different scales, for example if one feature is measured in kilometers and another in nanometers.
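A minimal NumPy sketch of min-max normalization as described above (the feature matrix and the <code>min_max_normalize</code> helper are illustrative, not part of any standard library):

```python
import numpy as np

# Hypothetical feature matrix: column 0 measured in kilometers,
# column 1 measured in nanometers -- vastly different scales.
X = np.array([[1200.0, 3.0],
              [1500.0, 7.0],
              [ 900.0, 5.0]])

def min_max_normalize(X, low=0.0, high=1.0):
    """Rescale each feature (column) of X to the range [low, high]."""
    X_min = X.min(axis=0)
    X_max = X.max(axis=0)
    X_scaled = (X - X_min) / (X_max - X_min)  # each column now in [0, 1]
    return X_scaled * (high - low) + low      # shift/stretch to [low, high]

X_norm = min_max_normalize(X)  # both columns now share the range [0, 1]
```

After this transformation, both features span the same range, so neither dominates distance- or gradient-based computations purely because of its units.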
 
Activation normalization, on the other hand, is specific to [[deep learning]], and includes methods that rescale the activation of [[Hidden layer|hidden neurons]] inside [[Neural network (machine learning)|neural networks]].
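As an illustrative sketch (not a specific method from the article), rescaling hidden activations to zero mean and unit variance across the feature dimension can be written in NumPy as follows; the <code>normalize_activations</code> helper and the example activations are hypothetical:

```python
import numpy as np

def normalize_activations(h, eps=1e-5):
    """Standardize each sample's hidden activations to zero mean and
    unit variance across the feature dimension (a layer-norm-style
    rescaling, without learnable scale/shift parameters)."""
    mean = h.mean(axis=-1, keepdims=True)
    var = h.var(axis=-1, keepdims=True)
    return (h - mean) / np.sqrt(var + eps)  # eps avoids division by zero

h = np.array([[1.0, 2.0, 3.0, 4.0]])  # activations of one hidden layer
h_norm = normalize_activations(h)     # mean ~0, standard deviation ~1
```

Practical methods such as batch normalization differ mainly in which axis the statistics are computed over and in adding learnable scale and shift parameters.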
 
Normalization is often used to:
 
* increase the speed of training convergence,
* reduce sensitivity to variations and feature scales in input data,
* reduce [[overfitting]],
* and produce better model generalization to unseen data.
 
Normalization techniques are often theoretically justified as reducing internal covariate shift, smoothing optimization landscapes, and increasing [[Regularization (mathematics)|regularization]], though they are mainly justified by empirical success.<ref>{{Cite book |last=Huang |first=Lei |url=https://link.springer.com/10.1007/978-3-031-14595-7 |title=Normalization Techniques in Deep Learning |date=2022 |publisher=Springer International Publishing |isbn=978-3-031-14594-0 |series=Synthesis Lectures on Computer Vision |___location=Cham |language=en |doi=10.1007/978-3-031-14595-7}}</ref>
 
== Batch normalization ==