Learning vector quantization: Difference between revisions

Content deleted Content added
paper citations coming soon
Tag: Reverted
OAbot (talk | contribs)
m Open access bot: url-access updated in citation with #oabot.
 
(20 intermediate revisions by 8 users not shown)
Line 1:
In [[computer science]], '''learning vector quantization''' ('''LVQ''') is a [[prototype|prototype-based]] [[supervised learning|supervised]] [[Statistical classification|classification]] [[algorithm]]. LVQ is the supervised counterpart of [[vector quantization]] systems. LVQ can be understood as a special case of an [[artificial neural network]], more precisely, it applies a [[winner-take-all (computing)|winner-take-all]] [[Hebbian learning]]-based approach. It is a precursor to [[self-organizing map]]s (SOM) and related to [[neural gas]] and the [[k-nearest neighbor algorithm]] (k-NN). LVQ was invented by [[Teuvo Kohonen]].<ref>T. Kohonen. Self-Organizing Maps. Springer, Berlin, 1997.</ref>
 
== Overview ==
LVQ can be understood as a special case of an [[artificial neural network]], more precisely, it applies a [[winner-take-all (computing)|winner-take-all]] [[Hebbian learning]]-based approach. It is a precursor to [[self-organizing map]]s (SOM) and related to [[neural gas]], and to the [[k-nearest neighbor algorithm]] (k-NN). LVQ was invented by [[Teuvo Kohonen]].<ref>T. Kohonen. Self-Organizing Maps. Springer, Berlin, 1997.</ref>
 
== Definition ==
An LVQ system is represented by prototypes <math>W=(w(i),...,w(n))</math> which are defined in the [[feature space]] of observed data. In winner-take-all training algorithms one determines, for each data point, the prototype which is closest to the input according to a given distance measure. The position of this so-called winner prototype is then adapted, i.e. the winner is moved closer if it correctly classifies the data point or moved away if it classifies the data point incorrectly.
 
Line 9 ⟶ 7:
LVQ systems can be applied to [[multi-class classification]] problems in a natural way.
 
A key issue in LVQ is the choice of an appropriate measure of distance or similarity for training and classification. Recently, techniques have been developed which adapt a parameterized distance measure in the course of training the system, see e.g. (Schneider, Biehl, and Hammer, 2009)<ref>{{cite journal|authorsauthor1=P. Schneider, |author2=B. Hammer, and |author3=M. Biehl |title=Adaptive Relevance Matrices in Learning Vector Quantization|journal= Neural Computation|volume=21|issue=10|pages=3532–3561|year=2009|doi=10.1162/neco.2009.10-08-892|pmid=19635012|citeseerx=10.1.1.216.1183|s2cid=17306078}}</ref> and references therein.
 
LVQ can be a source of great help in classifying text documents.{{Citation needed|date=December 2019|reason=removed citation to predatory publisher content}}
 
==Algorithm==
The algorithms are presented as in.<ref>{{Citation |last=Kohonen |first=Teuvo |title=Learning Vector Quantization |date=2001 |work=Self-Organizing Maps |volume=30 |pages=245–261 |url=http://link.springer.com/10.1007/978-3-642-56927-2_6 |place=Berlin, Heidelberg |publisher=Springer Berlin Heidelberg |doi=10.1007/978-3-642-56927-2_6 |isbn=978-3-540-67921-9|url-access=subscription }}</ref>
Below follows an informal description.<br>
 
The algorithm consists of three basic steps. The algorithm's input is:
Set up:
* how many neurons the system will have <math>M</math> (in the simplest case it is equal to the number of classes)
 
* what weight each neuron has <math>\vec{w_i}</math> for <math>i = 0,1,...,M - 1 </math>
* Let the correspondingdata labelbe denoted by <math>c_ix_i \in \R^D</math>, toand eachtheir neuroncorresponding labels by <math>y_i \in \vec{w_i}1, 2, \dots, C\}</math>.
* howThe fastcomplete thedataset neurons are learningis <math>\{(x_i, y_i)\eta }_{i=1}^N</math>.
* The set of code vectors is <math>w_j \in \R^D</math>.
* and an input list <math> L </math> containing all the vectors of which the labels are known already (training set).
* The learning rate at iteration step <math>t</math> is denoted by <math>\alpha_t</math>.
* The hyperparameters <math>w</math> and <math>\epsilon</math> are used by LVQ2 and LVQ3. The original paper suggests <math>\epsilon \in [0.1, 0.5]</math> and <math>w \in [0.2, 0.3]</math>.
 
=== LVQ1 ===
Initialize several code vectors per label. Iterate until convergence criteria is reached.
 
# Sample a datum <math>x_i</math>, and find out the code vector <math>w_j</math>, such that <math>x_i</math> falls within the [[Voronoi diagram|Voronoi cell]] of <math>w_j</math>.
# If its label <math>y_i</math> is the same as that of <math>w_j</math>, then <math>w_j \leftarrow w_j + \alpha_t(x_i - w_j)</math>, otherwise, <math>w_j \leftarrow w_j - \alpha_t(x_i - w_j)</math>.
 
*=== LVQ2.1 ===
The algorithm's flow is:
#LVQ2 Foris nextthe inputsame as LVQ3, but with this sentence removed: "If <math>\vec{x}w_j</math> (with labeland <math>yw_k</math>) inand <math> L x_i</math> findhave the closestsame neuronclass, then <math>w_j \vec{w_m}leftarrow w_j - \alpha_t(x_i - w_j)</math>, <br>i.e. and <math>d(w_k \vec{x},\vec{w_m})leftarrow =w_k + \min\limits_ialpha_t(x_i {d(\vec{x},\vec{w_i}- w_k)} </math>,.". whereIf <math>\, dw_j</math> isand the<math>w_k</math> metricand used<math>x_i</math> (have [[Euclideanthe same distance|Euclidean]]class, etc.then )nothing happens.
# Update <math>\vec{w_m}</math>. A better explanation is get <math>\vec{w_m}</math> closer to the input <math>\vec{x}</math>, if <math>\vec{x}</math> and <math>\vec{w_m}</math> belong to the same label and get them further apart if they don't. <br><math> \vec{w_m} \gets \vec{w_m} + \eta \cdot \left( \vec{x} - \vec{w_m} \right) </math> if <math> c_m = y</math> (closer together) <br> or <math> \vec{w_m} \gets \vec{w_m} - \eta \cdot \left( \vec{x} - \vec{w_m} \right) </math> if <math> c_m \neq y</math> (further apart).
# While there are vectors left in <math> L </math> go to step 1, else terminate.
 
=== LVQ3 ===
Note: <math>\vec{w_i}</math> and <math>\vec{x}</math> are [[vector space|vectors]] in feature space.
[[File:Apollonian_circles.svg|thumb|Some Apollonian circles. Every blue circle intersects every red circle at a right angle. Every red circle passes through the two points ''{{mvar|C, D}}'', and every blue circle separates the two points.]]
Initialize several code vectors per label. Iterate until convergence criteria is reached.
 
# Sample a datum <math>x_i</math>, and find out two code vectors <math>w_j, w_k</math> closest to it.
==Variants==
# Let <math>d_j := \|x_i - w_j\|, d_k := \|x_i - w_k\|</math>.
* LVQ2.1
# If <math>\min \left(\frac{d_j}{d_k}, \frac{d_k}{d_j}\right)>s </math>, where <math>s=\frac{1-w}{1+w}</math>, then
* Generalized LVQ (GLVQ)
#* If <math>w_j</math> and <math>x_i</math> have the same class, and <math>w_k</math> and <math>x_i</math> have different classes, then <math>w_j \leftarrow w_j + \alpha_t(x_i - w_j)</math> and <math>w_k \leftarrow w_k - \alpha_t(x_i - w_k)</math>.
* Generalized Relevance LVQ (GRLVQ)
#* If <math>w_k</math> and <math>x_i</math> have the same class, and <math>w_j</math> and <math>x_i</math> have different classes, then <math>w_j \leftarrow w_j - \alpha_t(x_i - w_j)</math> and <math>w_k \leftarrow w_k + \alpha_t(x_i - w_k)</math>.
* Generalized Matrix LVQ (GMLVQ)
#* If <math>w_j</math> and <math>w_k</math> and <math>x_i</math> have the same class, then <math>w_j \leftarrow w_j - \epsilon\alpha_t(x_i - w_j)</math> and <math>w_k \leftarrow w_k + \epsilon\alpha_t(x_i - w_k)</math>.
* Relational GLVQ (RGLVQ)
#* If <math>w_k</math> and <math>x_i</math> have different classes, and <math>w_j</math> and <math>x_i</math> have different classes, then the original paper simply does not explain what happens in this case, but presumably nothing happens in this case.
* Median GLVQ (MGLVQ)
# Otherwise, skip.
* Limited Rank Matrix LVQ (LiRaMLVQ)
Note that condition <math>\min \left(\frac{d_j}{d_k}, \frac{d_k}{d_j}\right)>s </math>, where <math>s=\frac{1-w}{1+w}</math>, precisely means that the point <math>x_i</math> falls between two [[Apollonian circles|Apollonian spheres]].
* Generalized Tangent LVQ (GTLVQ) (also non-euclidean - curved manifold)
* Localized GMLVQ (LGMLVQ)
* Robust soft LVQ (RSLVQ)
* Border-sensitive GLVQ (BSGLVQ)
 
== References ==
Line 46 ⟶ 49:
 
== Further reading ==
* {{cite journal |last1=Somervuo |first1=Panu |last2=Kohonen |first2=Teuvo |date=1999 |title=Self-organizing maps and learning vector quantization for feature sequences |journal=Neural Processing Letters |volume=10 |issue=2 |pages=151–159 |doi=10.1023/A:1018741720065}}
* [http://www.cis.hut.fi/panus/papers/dtwsom.pdf Self-Organizing Maps and Learning Vector Quantization for Feature Sequences, Somervuo and Kohonen. 2004] (pdf)
* {{Cite journal |last=Nova |first=David |last2=Estévez |first2=Pablo A. |date=2014-09-01 |title=A review of learning vector quantization classifiers |url=https://link.springer.com/article/10.1007/s00521-013-1535-3 |journal=Neural Computing and Applications |language=en |volume=25 |issue=3 |pages=511–524 |doi=10.1007/s00521-013-1535-3 |issn=1433-3058|arxiv=1509.07093 }}
 
== External links ==