In [[computer science]], '''learning vector quantization''' ('''LVQ''') is a [[prototype|prototype-based]] [[supervised learning|supervised]] [[Statistical classification|classification]] [[algorithm]]. LVQ is the supervised counterpart of [[vector quantization]] systems.
== Overview ==
LVQ can be understood as a special case of an [[artificial neural network]], more precisely, it applies a [[winner-take-all (computing)|winner-take-all]] [[Hebbian learning]]-based approach. It is a precursor to [[self-organizing map]]s (SOM) and related to [[neural gas]], and to the [[k-nearest neighbor algorithm]] (k-NN). LVQ was invented by [[Teuvo Kohonen]].<ref>T. Kohonen. Self-Organizing Maps. Springer, Berlin, 1997.</ref>
== Definition ==
An LVQ system is represented by prototypes <math>W=(w(1),\dots,w(n))</math> which are defined in the [[feature space]] of observed data. In winner-take-all training algorithms one determines, for each data point, the prototype which is closest to the input according to a given distance measure. The position of this so-called winner prototype is then adapted, i.e. the winner is moved closer if it correctly classifies the data point or moved away if it classifies the data point incorrectly.
LVQ systems can be applied to [[multi-class classification]] problems in a natural way.
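At classification time a data point is assigned the label of its nearest prototype. The following is a minimal sketch of this winner-take-all classification rule, assuming a squared Euclidean distance; the function and array names are illustrative only.

<syntaxhighlight lang="python">
import numpy as np

def lvq_classify(x, prototypes, prototype_labels):
    """Assign to x the label of its nearest prototype (winner-take-all rule)."""
    distances = np.sum((prototypes - x) ** 2, axis=1)  # squared Euclidean distance to each prototype
    winner = np.argmin(distances)                      # index of the closest ("winner") prototype
    return prototype_labels[winner]
</syntaxhighlight>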
A key issue in LVQ is the choice of an appropriate measure of distance or similarity for training and classification. Techniques have been developed which adapt a parameterized distance measure in the course of training the system, see e.g. (Schneider, Biehl, and Hammer, 2009)<ref>{{cite journal|author1=P. Schneider |author2=M. Biehl |author3=B. Hammer |title=Adaptive Relevance Matrices in Learning Vector Quantization|journal=Neural Computation|volume=21|issue=10|pages=3532–3561|year=2009|doi=10.1162/neco.2009.10-08-892|pmid=19635012|citeseerx=10.1.1.216.1183|s2cid=17306078}}</ref> and references therein.
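One common form of such an adaptive measure, used in matrix relevance LVQ, is the parameterized quadratic form <math>d_\Lambda(x, w) = (x-w)^\top \Lambda (x-w)</math> with <math>\Lambda = \Omega^\top \Omega</math>. The sketch below shows how such a distance can be evaluated; the matrix <code>omega</code> stands in for parameters that would be adapted during training.

<syntaxhighlight lang="python">
import numpy as np

def adaptive_distance(x, w, omega):
    """Quadratic-form distance (x - w)^T Omega^T Omega (x - w).

    omega parameterizes the metric; in matrix relevance LVQ it is
    adapted during training alongside the prototypes.
    """
    diff = omega @ (x - w)      # project the difference vector
    return float(diff @ diff)   # squared norm in the transformed space
</syntaxhighlight>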
LVQ has also been applied to the classification of text documents.{{Citation needed|date=December 2019|reason=removed citation to predatory publisher content}}
== Algorithm ==
The algorithms are presented as in Kohonen's original description.<ref>{{Citation |last=Kohonen |first=Teuvo |title=Learning Vector Quantization |date=2001 |work=Self-Organizing Maps |volume=30 |pages=245–261 |url=http://link.springer.com/10.1007/978-3-642-56927-2_6 |place=Berlin, Heidelberg |publisher=Springer Berlin Heidelberg |doi=10.1007/978-3-642-56927-2_6 |isbn=978-3-540-67921-9|url-access=subscription }}</ref>
Below follows an informal description.

Set up:
* Let the data be denoted by <math>x_i \in \R^D</math>, and their corresponding labels by <math>y_i \in \{1, 2, \dots, C\}</math>. The complete dataset (the training set, whose labels are already known) is <math>\{(x_i, y_i)\}_{i=1}^N</math>.
* The set of code vectors (the "neurons" of the system) is <math>w_j \in \R^D</math>; each code vector carries a class label. In the simplest case the number of code vectors equals the number of classes <math>C</math>.
* Distances in the feature space are measured with a chosen metric <math>d</math> ([[Euclidean distance|Euclidean]], etc.).
* The learning rate at iteration step <math>t</math> is denoted by <math>\alpha_t</math>.
* The hyperparameters <math>w</math> and <math>\epsilon</math> are used by LVQ2 and LVQ3. The original paper suggests <math>\epsilon \in [0.1, 0.5]</math> and <math>w \in [0.2, 0.3]</math>.
=== LVQ1 ===
Initialize several code vectors per label. Iterate until a convergence criterion is reached; a minimal sketch of the resulting training loop is given after the steps below.
# Sample a datum <math>x_i</math>, and find the code vector <math>w_j</math> such that <math>x_i</math> falls within the [[Voronoi diagram|Voronoi cell]] of <math>w_j</math> (i.e. <math>w_j</math> is the closest code vector under the chosen metric).
# If its label <math>y_i</math> is the same as that of <math>w_j</math>, then <math>w_j \leftarrow w_j + \alpha_t(x_i - w_j)</math>, otherwise, <math>w_j \leftarrow w_j - \alpha_t(x_i - w_j)</math>.
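The following is a minimal NumPy sketch of this LVQ1 loop. The prototype initialization, the random sampling, and the linearly decaying learning-rate schedule are illustrative choices rather than part of the original formulation.

<syntaxhighlight lang="python">
import numpy as np

def train_lvq1(X, y, prototypes, prototype_labels, n_steps=1000, alpha0=0.1):
    """LVQ1: move the winning prototype toward correctly classified samples
    and away from misclassified ones."""
    prototypes = prototypes.copy()
    rng = np.random.default_rng(0)
    for t in range(n_steps):
        alpha_t = alpha0 * (1.0 - t / n_steps)                  # one possible decaying learning rate
        i = rng.integers(len(X))                                # sample a datum x_i
        distances = np.sum((prototypes - X[i]) ** 2, axis=1)
        j = np.argmin(distances)                                # winning code vector w_j
        if prototype_labels[j] == y[i]:
            prototypes[j] += alpha_t * (X[i] - prototypes[j])   # same label: move closer
        else:
            prototypes[j] -= alpha_t * (X[i] - prototypes[j])   # different label: move away
    return prototypes
</syntaxhighlight>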
=== LVQ2 ===
LVQ2 is the same as LVQ3, but with this sentence removed: "If <math>w_j</math>, <math>w_k</math> and <math>x_i</math> have the same class, then <math>w_j \leftarrow w_j + \epsilon\alpha_t(x_i - w_j)</math> and <math>w_k \leftarrow w_k + \epsilon\alpha_t(x_i - w_k)</math>.". That is, if the two closest code vectors and <math>x_i</math> all have the same class, then nothing happens.
=== LVQ3 ===
[[File:Apollonian_circles.svg|thumb|Some Apollonian circles. Every blue circle intersects every red circle at a right angle. Every red circle passes through the two points ''{{mvar|C, D}}'', and every blue circle separates the two points.]]
Initialize several code vectors per label. Iterate until a convergence criterion is reached.
# Sample a datum <math>x_i</math>, and find the two code vectors <math>w_j, w_k</math> closest to it.
# Let <math>d_j := \|x_i - w_j\|, d_k := \|x_i - w_k\|</math>.
# If <math>\min \left(\frac{d_j}{d_k}, \frac{d_k}{d_j}\right)>s </math>, where <math>s=\frac{1-w}{1+w}</math>, then
#* If <math>w_j</math> and <math>x_i</math> have the same class, and <math>w_k</math> and <math>x_i</math> have different classes, then <math>w_j \leftarrow w_j + \alpha_t(x_i - w_j)</math> and <math>w_k \leftarrow w_k - \alpha_t(x_i - w_k)</math>.
#* If <math>w_k</math> and <math>x_i</math> have the same class, and <math>w_j</math> and <math>x_i</math> have different classes, then <math>w_j \leftarrow w_j - \alpha_t(x_i - w_j)</math> and <math>w_k \leftarrow w_k + \alpha_t(x_i - w_k)</math>.
#* If <math>w_j</math>, <math>w_k</math> and <math>x_i</math> have the same class, then <math>w_j \leftarrow w_j + \epsilon\alpha_t(x_i - w_j)</math> and <math>w_k \leftarrow w_k + \epsilon\alpha_t(x_i - w_k)</math>.
#* If <math>w_j</math> and <math>x_i</math> have different classes, and <math>w_k</math> and <math>x_i</math> have different classes, then the original paper does not specify this case; presumably nothing happens.
# Otherwise, skip.

Note that the condition <math>\min \left(\frac{d_j}{d_k}, \frac{d_k}{d_j}\right)>s </math>, where <math>s=\frac{1-w}{1+w}</math>, precisely means that the point <math>x_i</math> falls between two [[Apollonian circles|Apollonian spheres]]. A sketch of a single LVQ3 update step is given below.
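The following is a minimal NumPy sketch of one LVQ3 update step following the rules above; the Euclidean metric and the default parameter values are illustrative assumptions.

<syntaxhighlight lang="python">
import numpy as np

def lvq3_step(x, y, prototypes, prototype_labels, alpha_t, w=0.25, eps=0.3):
    """One LVQ3 update: adjust the two code vectors closest to x when x
    falls inside the window defined by s = (1 - w) / (1 + w)."""
    d = np.sqrt(np.sum((prototypes - x) ** 2, axis=1))
    j, k = np.argsort(d)[:2]                       # indices of the two closest code vectors
    s = (1 - w) / (1 + w)
    if min(d[j] / d[k], d[k] / d[j]) > s:          # window condition
        same_j = prototype_labels[j] == y
        same_k = prototype_labels[k] == y
        if same_j and not same_k:
            prototypes[j] += alpha_t * (x - prototypes[j])
            prototypes[k] -= alpha_t * (x - prototypes[k])
        elif same_k and not same_j:
            prototypes[j] -= alpha_t * (x - prototypes[j])
            prototypes[k] += alpha_t * (x - prototypes[k])
        elif same_j and same_k:
            prototypes[j] += eps * alpha_t * (x - prototypes[j])
            prototypes[k] += eps * alpha_t * (x - prototypes[k])
        # if both closest code vectors have a different class, nothing happens
    return prototypes
</syntaxhighlight>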
== Variants ==
* Generalized LVQ (GLVQ)<ref>{{cite web |url=https://papers.nips.cc/paper/1995/file/9c3b1830513cc3b8fc4b76635d32e692-Paper.pdf |title=Generalized Learning Vector Quantization |website=nips.cc |publisher=NeurIPS Proceedings |access-date=25 August 2022}}</ref>
* Generalized Relevance LVQ (GRLVQ)<ref name="HammerVillmann2002">{{cite journal | last1 = Hammer | first1 = Barbara | last2 = Villmann | first2 = Thomas | title = Generalized relevance learning vector quantization | journal = Neural Networks | date = October 2002 | volume = 15 | issue = 8–9 | pages = 1059–1068 | issn = 0893-6080 | doi = 10.1016/S0893-6080(02)00079-5 }}</ref>
* Generalized Matrix LVQ (GMLVQ)<ref name="SchneiderBiehlHammer2009">{{cite journal | last1 = Schneider | first1 = Petra | last2 = Biehl | first2 = Michael | last3 = Hammer | first3 = Barbara | title = Adaptive Relevance Matrices in Learning Vector Quantization | journal = Neural Computation | date = December 2009 | volume = 21 | issue = 12 | pages = 3532–3561 | issn = 0899-7667 | eissn = 1530-888X | doi = 10.1162/neco.2009.11-08-908 | pmid = 19764875 }}</ref>
* Relational GLVQ (RGLVQ)<ref name="HammerSchleifZhu2011">{{cite book | title = Neural Information Processing | last1 = Hammer | first1 = Barbara | last2 = Schleif | first2 = Frank-Michael | last3 = Zhu | first3 = Xibin | chapter = Relational Extensions of Learning Vector Quantization | date = 2011 | pages = 481–489 | publisher = Springer Berlin Heidelberg | issn = 0302-9743 | eissn = 1611-3349 | doi = 10.1007/978-3-642-24958-7_56 }}</ref>
* Median GLVQ (MGLVQ)<ref name="NebelHammerFrohberg2015">{{cite journal | last1 = Nebel | first1 = David | last2 = Hammer | first2 = Barbara | last3 = Frohberg | first3 = Kathleen | last4 = Villmann | first4 = Thomas | title = Median variants of learning vector quantization for learning of dissimilarity data | journal = Neurocomputing | date = December 2015 | volume = 169 | pages = 295–305 | issn = 0925-2312 | doi = 10.1016/j.neucom.2014.12.096 }}</ref>
* Limited Rank Matrix LVQ (LiRaMLVQ)<ref name="BunteSchneiderHammer2012">{{cite journal | last1 = Bunte | first1 = Kerstin | last2 = Schneider | first2 = Petra | last3 = Hammer | first3 = Barbara | last4 = Schleif | first4 = Frank-Michael | last5 = Villmann | first5 = Thomas | last6 = Biehl | first6 = Michael | title = Limited Rank Matrix Learning, discriminative dimension reduction and visualization | journal = Neural Networks | date = February 2012 | volume = 26 | pages = 159–173 | issn = 0893-6080 | doi = 10.1016/j.neunet.2011.10.001 | pmid = 22041220 }}</ref>
* Robust soft LVQ (RSLVQ)<ref name="SeoObermayer2003">{{cite journal | last1 = Seo | first1 = Sambu | last2 = Obermayer | first2 = Klaus | title = Soft Learning Vector Quantization | journal = Neural Computation | date = 1 July 2003 | volume = 15 | issue = 7 | pages = 1589–1604 | issn = 0899-7667 | eissn = 1530-888X | doi = 10.1162/089976603321891819 | pmid = 12816567 }}</ref>
* Border-sensitive GLVQ (BSGLVQ)<ref name="KadenRiedelHermann2014">{{cite journal | last1 = Kaden | first1 = Marika | last2 = Riedel | first2 = Martin | last3 = Hermann | first3 = Wieland | last4 = Villmann | first4 = Thomas | title = Border-sensitive learning in generalized learning vector quantization: an alternative to support vector machines | journal = Soft Computing | date = 20 November 2014 | volume = 19 | issue = 9 | pages = 2423–2434 | issn = 1432-7643 | eissn = 1433-7479 | doi = 10.1007/s00500-014-1496-1 }}</ref>
== References ==
{{Reflist}}
== Further reading ==
* {{cite journal |last1=Somervuo |first1=Panu |last2=Kohonen |first2=Teuvo |date=1999 |title=Self-organizing maps and learning vector quantization for feature sequences |journal=Neural Processing Letters |volume=10 |issue=2 |pages=151–159 |doi=10.1023/A:1018741720065}}
* [http://www.cis.hut.fi/panus/papers/dtwsom.pdf Self-Organizing Maps and Learning Vector Quantization for Feature Sequences, Somervuo and Kohonen, 1999] (pdf)
* {{Cite journal |last=Nova |first=David |last2=Estévez |first2=Pablo A. |date=2014-09-01 |title=A review of learning vector quantization classifiers |url=https://link.springer.com/article/10.1007/s00521-013-1535-3 |journal=Neural Computing and Applications |language=en |volume=25 |issue=3 |pages=511–524 |doi=10.1007/s00521-013-1535-3 |issn=1433-3058|arxiv=1509.07093 }}