In the context of [[artificial neural network]]s, '''pruning''' is the practice of removing parameters (which may entail removing individual parameters, or parameters in groups, such as by [[artificial neurons|neurons]]) from an existing network.<ref>{{cite arxiv|last1=Blalock|first1=Davis|last2=Ortiz|first2=Jose Javier Gonzalez|last3=Frankle|first3=Jonathan|last4=Guttag|first4=John|date=2020-03-06|title=What is the State of Neural Network Pruning?|class=cs.LG|eprint=2003.03033}}</ref> The goal of this process is to maintain the accuracy of the network while increasing its efficiency, reducing the computational resources required to run the [[neural network]]. After a network is trained to a desired solution on the training data, its units (hidden-layer nodes or interconnections) are analysed to determine which do not contribute to the solution. Several approaches for identifying non-contributing units have been described in the literature; a widely used approach is to determine non-essential units by a form of sensitivity analysis. So far, few studies have analysed pruning algorithms for the classification of remotely-sensed data.
A basic algorithm for pruning is as follows:<ref>Molchanov, P., Tyree, S., Karras, T., Aila, T., & Kautz, J. (2016). ''Pruning convolutional neural networks for resource efficient inference''. arXiv preprint arXiv:1611.06440.</ref><ref>[https://jacobgil.github.io/deeplearning/pruning-deep-learning Pruning deep neural networks to make them fast and small].</ref>
#Evaluate the importance of each neuron.
#Rank the [[neurons]] according to their importance (assuming there is a clearly defined measure for "importance").
#Remove the least important neuron.
#Check a termination condition (to be determined by the user) to see whether to continue pruning.
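The following is a minimal sketch of this loop in [[Python (programming language)|Python]] with [[NumPy]]. The particular importance measure used here (the L2 norm of a neuron's weights) and the fixed pruning budget are assumptions made for illustration; the algorithm above does not prescribe them.
<syntaxhighlight lang="python">
import numpy as np

def prune_neurons(W, num_to_prune):
    """Prune `num_to_prune` neurons (rows of W) by repeatedly
    zeroing out the least important remaining neuron."""
    W = W.copy()
    for _ in range(num_to_prune):                    # termination: fixed budget
        importance = np.linalg.norm(W, axis=1)       # 1. evaluate importance
        importance[np.all(W == 0, axis=1)] = np.inf  # skip already-pruned neurons
        least = np.argmin(importance)                # 2. rank and pick the lowest
        W[least, :] = 0.0                            # 3. remove (zero out) the neuron
    return W

# Example: prune 2 of 5 neurons in a random 5x3 weight matrix.
W = np.random.randn(5, 3)
print(prune_neurons(W, 2))
</syntaxhighlight>
In practice the network is usually retrained (fine-tuned) between pruning steps so that the remaining units can compensate for the removed ones.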
== Types ==
=== Magnitude-based pruning ===
Magnitude-based pruning (MB) is the simplest pruning [[algorithm]]. It deletes interconnections with small ‘[[Salience (neuroscience)|saliency]]’, i.e. those whose deletion will have the least effect on the training error, where the saliency of an interconnection is taken to be the magnitude of its weight. The assumption is that interconnections with small weights have only a minor effect on the performance of the network. After reasonable initial training, the interconnection with the smallest magnitude is removed, the network is retrained, and the process is repeated iteratively until the training error reaches a certain limit.
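A single magnitude-based pruning step can be sketched as follows. Representing deletion by zeroing an entry of an in-memory weight matrix is an assumption made for illustration, and retraining between steps is only indicated by a comment.
<syntaxhighlight lang="python">
import numpy as np

def magnitude_prune_step(W):
    """Delete (zero out) the interconnection with the smallest |weight|."""
    saliency = np.abs(W)              # saliency = magnitude of the weight
    saliency[W == 0] = np.inf         # ignore connections already deleted
    idx = np.unravel_index(np.argmin(saliency), W.shape)
    W[idx] = 0.0                      # remove the smallest-magnitude connection
    return W

W = np.random.randn(4, 4)
W = magnitude_prune_step(W)
# In practice the network is now retrained, and the step is repeated
# until the training error reaches the chosen limit.
</syntaxhighlight>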
=== Optimal Brain Damage ===
The [https://nyuscholars.nyu.edu/en/publications/optimal-brain-damage Optimal Brain Damage] (OBD) pruning algorithm, introduced by [http://yann.lecun.com/exdb/publis/pdf/lecun-90b.pdf Le Cun, Denker and Solla] in 1990, is based on second-order derivatives of the error function. The aim is to iteratively delete the weights whose removal will cause the least increase in the error of the network: the saliency of a weight <math>w_k</math> is estimated as <math>s_k = \tfrac{1}{2} h_{kk} w_k^2</math>, where <math>h_{kk}</math> is the corresponding diagonal element of the [[Hessian matrix]] of the error function. An important practical problem in estimating these saliencies is the size of the Hessian matrix; its full computation is time-consuming, hence Le Cun et al. (1990) assume that the Hessian is diagonal. On the other hand, Hassibi and Stork (1993) argue that the Hessians of every problem they considered are strongly non-diagonal, and that this leads OBD to eliminate the wrong weights.
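The saliency computation under the diagonal-Hessian assumption can be sketched as below. The vector of diagonal Hessian entries <code>h_diag</code> is assumed to be supplied by the training procedure (for example, estimated during a pass over the training data); how it is obtained is not shown here.
<syntaxhighlight lang="python">
import numpy as np

def obd_saliency(w, h_diag):
    """OBD saliency s_k = 1/2 * h_kk * w_k^2 (diagonal-Hessian assumption)."""
    return 0.5 * h_diag * w ** 2

def obd_prune_step(w, h_diag):
    s = obd_saliency(w, h_diag)
    s[w == 0] = np.inf            # ignore weights that were already deleted
    w[np.argmin(s)] = 0.0         # delete the weight whose removal is estimated
    return w                      # to increase the error the least

w = np.random.randn(10)
h_diag = np.abs(np.random.randn(10))   # placeholder curvature values for illustration
w = obd_prune_step(w, h_diag)
</syntaxhighlight>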
== References ==
{{reflist}}