Learning vector quantization: Difference between revisions

Browse history interactively

← Previous edit

Content deleted Content added

VisualWikitext

Revision as of 15:41, 25 August 2022 edit Juvilius (talk \| contribs) 6 edits paper citations coming soon Tag: Reverted ← Previous edit		Latest revision as of 22:46, 19 June 2025 edit undo OAbot (talk \| contribs) Bots 643,903 edits m Open access bot: url-access updated in citation with #oabot.
(20 intermediate revisions by 8 users not shown)
Line 1: In [[computer science]], '''learning vector quantization''' ('''LVQ''') is a [[prototype\|prototype-based]] [[supervised learning\|supervised]] [[Statistical classification\|classification]] [[algorithm]]. LVQ is the supervised counterpart of [[vector quantization]] systems. LVQ can be understood as a special case of an [[artificial neural network]], more precisely, it applies a [[winner-take-all (computing)\|winner-take-all]] [[Hebbian learning]]-based approach. It is a precursor to [[self-organizing map]]s (SOM) and related to [[neural gas]] and the [[k-nearest neighbor algorithm]] (k-NN). LVQ was invented by [[Teuvo Kohonen]].<ref>T. Kohonen. Self-Organizing Maps. Springer, Berlin, 1997.</ref> ~~== Overview ==~~ LVQ can be understood as a special case of an [[artificial neural network]], more precisely, it applies a [[winner-take-all (computing)\|winner-take-all]] [[Hebbian learning]]-based approach. It is a precursor to [[self-organizing map]]s (SOM) and related to [[neural gas]], and to the [[k-nearest neighbor algorithm]] (k-NN). LVQ was invented by [[Teuvo Kohonen]].<ref>T. Kohonen. Self-Organizing Maps. Springer, Berlin, 1997.</ref> == Definition == An LVQ system is represented by prototypes <math>W=(w(i),...,w(n))</math> which are defined in the [[feature space]] of observed data. In winner-take-all training algorithms one determines, for each data point, the prototype which is closest to the input according to a given distance measure. The position of this so-called winner prototype is then adapted, i.e. the winner is moved closer if it correctly classifies the data point or moved away if it classifies the data point incorrectly. Line 9 ⟶ 7: LVQ systems can be applied to [[multi-class classification]] problems in a natural way. A key issue in LVQ is the choice of an appropriate measure of distance or similarity for training and classification. Recently, techniques have been developed which adapt a parameterized distance measure in the course of training the system, see e.g. (Schneider, Biehl, and Hammer, 2009)<ref>{{cite journal\|~~authors~~author1=P. Schneider, \|author2=B. Hammer~~, and~~ \|author3=M. Biehl \|title=Adaptive Relevance Matrices in Learning Vector Quantization\|journal= Neural Computation\|volume=21\|issue=10\|pages=3532–3561\|year=2009\|doi=10.1162/neco.2009.10-08-892\|pmid=19635012\|citeseerx=10.1.1.216.1183\|s2cid=17306078}}</ref> and references therein. LVQ can be a source of great help in classifying text documents.{{Citation needed\|date=December 2019\|reason=removed citation to predatory publisher content}} ==Algorithm== The algorithms are presented as in.<ref>{{Citation \|last=Kohonen \|first=Teuvo \|title=Learning Vector Quantization \|date=2001 \|work=Self-Organizing Maps \|volume=30 \|pages=245–261 \|url=http://link.springer.com/10.1007/978-3-642-56927-2_6 \|place=Berlin, Heidelberg \|publisher=Springer Berlin Heidelberg \|doi=10.1007/978-3-642-56927-2_6 \|isbn=978-3-540-67921-9\|url-access=subscription }}</ref> ~~Below follows an informal description.<br>~~ ~~The algorithm consists of three basic steps. The algorithm's input is:~~ Set up: * how many neurons the system will have <math>M</math> (in the simplest case it is equal to the number of classes) * what weight each neuron has <math>\vec{w_i}</math> for <math>i = 0,1,...,M - 1 </math> * Let the ~~corresponding~~data ~~label~~be denoted by <math>~~c_i~~x_i \in \R^D</math>, toand ~~each~~their ~~neuron~~corresponding labels by <math>y_i \in \~~vec~~{~~w_i}~~1, 2, \dots, C\}</math>. * ~~how~~The ~~fast~~complete ~~the~~dataset ~~neurons are learning~~is <math>\{(x_i, y_i)\~~eta~~ }_{i=1}^N</math>. * The set of code vectors is <math>w_j \in \R^D</math>. * and an input list <math> L </math> containing all the vectors of which the labels are known already (training set). * The learning rate at iteration step <math>t</math> is denoted by <math>\alpha_t</math>. * The hyperparameters <math>w</math> and <math>\epsilon</math> are used by LVQ2 and LVQ3. The original paper suggests <math>\epsilon \in [0.1, 0.5]</math> and <math>w \in [0.2, 0.3]</math>. === LVQ1 === Initialize several code vectors per label. Iterate until convergence criteria is reached. # Sample a datum <math>x_i</math>, and find out the code vector <math>w_j</math>, such that <math>x_i</math> falls within the [[Voronoi diagram\|Voronoi cell]] of <math>w_j</math>. # If its label <math>y_i</math> is the same as that of <math>w_j</math>, then <math>w_j \leftarrow w_j + \alpha_t(x_i - w_j)</math>, otherwise, <math>w_j \leftarrow w_j - \alpha_t(x_i - w_j)</math>. === LVQ2.1 ===▼ ~~The algorithm's flow is:~~ #LVQ2 ~~For~~is ~~next~~the ~~input~~same as LVQ3, but with this sentence removed: "If <math>~~\vec{x}~~w_j</math> ~~(with label~~and <math>yw_k</math>) inand <math> L x_i</math> ~~find~~have the ~~closest~~same ~~neuron~~class, then <math>w_j \~~vec{w_m}~~leftarrow w_j - \alpha_t(x_i - w_j)</math>~~, <br>i.e.~~ and <math>d(w_k \~~vec{x},\vec{w_m})~~leftarrow =w_k + \~~min\limits_i~~alpha_t(x_i ~~{d(\vec{x},\vec{w_i}~~- w_k)} </math>,.". ~~where~~If <math>~~\, d~~w_j</math> isand ~~the~~<math>w_k</math> ~~metric~~and ~~used~~<math>x_i</math> (have ~~[[Euclidean~~the same ~~distance\|Euclidean]]~~class, ~~etc.~~then )nothing happens. # Update <math>\vec{w_m}</math>. A better explanation is get <math>\vec{w_m}</math> closer to the input <math>\vec{x}</math>, if <math>\vec{x}</math> and <math>\vec{w_m}</math> belong to the same label and get them further apart if they don't. <br><math> \vec{w_m} \gets \vec{w_m} + \eta \cdot \left( \vec{x} - \vec{w_m} \right) </math> if <math> c_m = y</math> (closer together) <br> or <math> \vec{w_m} \gets \vec{w_m} - \eta \cdot \left( \vec{x} - \vec{w_m} \right) </math> if <math> c_m \neq y</math> (further apart). ~~# While there are vectors left in <math> L </math> go to step 1, else terminate.~~ === LVQ3 === ~~Note: <math>\vec{w_i}</math> and <math>\vec{x}</math> are [[vector space\|vectors]] in feature space.~~ [[File:Apollonian_circles.svg\|thumb\|Some Apollonian circles. Every blue circle intersects every red circle at a right angle. Every red circle passes through the two points ''{{mvar\|C, D}}'', and every blue circle separates the two points.]] Initialize several code vectors per label. Iterate until convergence criteria is reached. # Sample a datum <math>x_i</math>, and find out two code vectors <math>w_j, w_k</math> closest to it. ~~==Variants==~~ # Let <math>d_j := \\|x_i - w_j\\|, d_k := \\|x_i - w_k\\|</math>. ▲ LVQ2.1 # If <math>\min \left(\frac{d_j}{d_k}, \frac{d_k}{d_j}\right)>s </math>, where <math>s=\frac{1-w}{1+w}</math>, then * Generalized LVQ (GLVQ) #* If <math>w_j</math> and <math>x_i</math> have the same class, and <math>w_k</math> and <math>x_i</math> have different classes, then <math>w_j \leftarrow w_j + \alpha_t(x_i - w_j)</math> and <math>w_k \leftarrow w_k - \alpha_t(x_i - w_k)</math>. * Generalized Relevance LVQ (GRLVQ) #* If <math>w_k</math> and <math>x_i</math> have the same class, and <math>w_j</math> and <math>x_i</math> have different classes, then <math>w_j \leftarrow w_j - \alpha_t(x_i - w_j)</math> and <math>w_k \leftarrow w_k + \alpha_t(x_i - w_k)</math>. * Generalized Matrix LVQ (GMLVQ) #* If <math>w_j</math> and <math>w_k</math> and <math>x_i</math> have the same class, then <math>w_j \leftarrow w_j - \epsilon\alpha_t(x_i - w_j)</math> and <math>w_k \leftarrow w_k + \epsilon\alpha_t(x_i - w_k)</math>. * Relational GLVQ (RGLVQ) #* If <math>w_k</math> and <math>x_i</math> have different classes, and <math>w_j</math> and <math>x_i</math> have different classes, then the original paper simply does not explain what happens in this case, but presumably nothing happens in this case. * Median GLVQ (MGLVQ) # Otherwise, skip. * Limited Rank Matrix LVQ (LiRaMLVQ) Note that condition <math>\min \left(\frac{d_j}{d_k}, \frac{d_k}{d_j}\right)>s </math>, where <math>s=\frac{1-w}{1+w}</math>, precisely means that the point <math>x_i</math> falls between two [[Apollonian circles\|Apollonian spheres]]. * Generalized Tangent LVQ (GTLVQ) (also non-euclidean - curved manifold) * Localized GMLVQ (LGMLVQ) * Robust soft LVQ (RSLVQ) * Border-sensitive GLVQ (BSGLVQ) == References == Line 46 ⟶ 49: == Further reading == * {{cite journal \|last1=Somervuo \|first1=Panu \|last2=Kohonen \|first2=Teuvo \|date=1999 \|title=Self-organizing maps and learning vector quantization for feature sequences \|journal=Neural Processing Letters \|volume=10 \|issue=2 \|pages=151–159 \|doi=10.1023/A:1018741720065}} * [http://www.cis.hut.fi/panus/papers/dtwsom.pdf Self-Organizing Maps and Learning Vector Quantization for Feature Sequences, Somervuo and Kohonen. 2004] (pdf) * {{Cite journal \|last=Nova \|first=David \|last2=Estévez \|first2=Pablo A. \|date=2014-09-01 \|title=A review of learning vector quantization classifiers \|url=https://link.springer.com/article/10.1007/s00521-013-1535-3 \|journal=Neural Computing and Applications \|language=en \|volume=25 \|issue=3 \|pages=511–524 \|doi=10.1007/s00521-013-1535-3 \|issn=1433-3058\|arxiv=1509.07093 }} == External links ==