{{short description|Classical quantization technique from signal processing}}
{{Multiple issues|
{{Missing information|something|date=February 2009}}
{{Original research|date=November 2016}}
{{Technical|date=October 2023}}
}}
'''Vector quantization''' ('''VQ''') is a classical [[Quantization (signal processing)|quantization]] technique from [[signal processing]] that allows the modeling of [[probability density functions]] by the distribution of prototype vectors.
The density matching property of vector quantization is powerful, especially for identifying the density of large and high-dimensional data. Since data points are represented by the index of their closest centroid, commonly occurring data have low error, and rare data high error. This is why VQ is suitable for [[lossy data compression]]. It can also be used for lossy data correction and [[density estimation]].
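The index-of-nearest-centroid representation described above can be sketched in a few lines. This is an illustrative NumPy sketch, not code from any cited source; the codebook and points are made-up toy values:

```python
import numpy as np

def vq_encode(points, codebook):
    """Replace each point by the index of its nearest codebook vector."""
    # Squared Euclidean distance from every point to every codeword.
    d = ((points[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def vq_decode(indices, codebook):
    """Lossy reconstruction: each index maps back to its codeword."""
    return codebook[indices]

# Toy 2-D codebook with two prototype vectors.
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
points = np.array([[0.5, -0.2], [9.8, 10.1], [0.1, 0.3]])
indices = vq_encode(points, codebook)          # compressed representation
reconstruction = vq_decode(indices, codebook)  # lossy reconstruction
```

Points near a prototype reconstruct with low error, rare outliers with high error, which is the density-matching behaviour the paragraph describes.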
== Training ==
A simple training algorithm for vector quantization is:
# Pick a sample point at random
# Move the nearest quantization vector (centroid) a small fraction of the distance towards it
# Repeat
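The steps above can be sketched as follows. This is an illustrative NumPy sketch; the step count and the fraction moved per step are arbitrary choices, not values prescribed by the algorithm:

```python
import numpy as np

def train_vq(samples, n_codewords, steps=20000, fraction=0.05, seed=0):
    """Competitive-learning VQ training following the steps above."""
    rng = np.random.default_rng(seed)
    # Initialise the codebook with randomly chosen sample points.
    codebook = samples[rng.choice(len(samples), n_codewords, replace=False)].copy()
    for _ in range(steps):
        x = samples[rng.integers(len(samples))]         # 1. random sample point
        j = ((codebook - x) ** 2).sum(axis=1).argmin()  #    nearest codeword
        codebook[j] += fraction * (x - codebook[j])     # 2. move a small fraction
    return codebook                                     # 3. repeated in the loop
```

On data drawn from two well-separated clusters, a two-codeword codebook trained this way settles near the two cluster centres.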
| date = 2009-12-27
| url = http://lists.mplayerhq.hu/pipermail/bow/2009-December/000058.html
}}</ref>
* [[Cinepak]]
* [[Daala]] is transform-based but uses [[pyramid vector quantization]] on transformed coefficients<ref>{{cite IETF |title= Pyramid Vector Quantization for Video Coding |first1= JM. |last1= Valin |draft=draft-valin-videocodec-pvq-00 |date=October 2012 |publisher=[[Internet Engineering Task Force|IETF]]}}</ref>
* [[Digital Video Interactive]]: Production-Level Video and Real-Time Video
* [[Indeo]]
* [[AMR-WB+]]
* [[CELP]]
* [[CELT]] (now part of [[Opus (codec)|Opus]])
* [[Codec 2]]
* [[DTS Coherent Acoustics|DTS]]
* [[G.729]]
* [[iLBC]]
* [[Ogg Vorbis]]<ref>{{cite web
| title = Vorbis I Specification
| publisher = Xiph.org
| date = 2007-03-09
| url =
}}</ref>
* [[Opus (codec)|Opus]] is transform-based but uses vector quantization on transformed coefficients
* [[TwinVQ]]
=== Use in pattern recognition ===
VQ was also used in the eighties for speech and [[speaker recognition]].
Recently it has also been used for efficient [[nearest neighbor search]]
<ref>{{cite journal|author=H. Jegou |author2=M. Douze |author3=C. Schmid|title=Product Quantization for Nearest Neighbor Search|journal=IEEE Transactions on Pattern Analysis and Machine Intelligence|year=2011|volume=33|issue=1|pages=117–128|doi=10.1109/TPAMI.2010.57|pmid=21088323 |url=http://hal.archives-ouvertes.fr/docs/00/51/44/62/PDF/paper_hal.pdf |archive-url=https://web.archive.org/web/20111217142048/http://hal.archives-ouvertes.fr/docs/00/51/44/62/PDF/paper_hal.pdf |archive-date=2011-12-17 |url-status=live|citeseerx=10.1.1.470.8573 |s2cid=5850884 }}</ref>
and on-line signature recognition.<ref>{{cite journal|last=Faundez-Zanuy|first=Marcos|title=offline and On-line signature recognition based on VQ-DTW|journal=Pattern Recognition|year=2007|volume=40|issue=3|pages=981–992|doi=10.1016/j.patcog.2006.06.007}}</ref>
In [[pattern recognition]] applications, one codebook is constructed for each class (each class being a user in biometric applications) using acoustic vectors of this user. In the testing phase the quantization distortion of a testing signal is worked out with the whole set of codebooks obtained in the training phase. The codebook that provides the smallest vector quantization distortion indicates the identified user.
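The testing phase described above can be sketched as follows. This is an illustrative NumPy sketch, not code from the cited papers; each class's codebook is assumed to be an array of codewords:

```python
import numpy as np

def distortion(vectors, codebook):
    """Mean squared distance from each test vector to its nearest codeword."""
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).mean()

def identify(test_vectors, codebooks):
    """Return the class index whose codebook gives the smallest
    quantization distortion on the test signal."""
    return min(range(len(codebooks)),
               key=lambda i: distortion(test_vectors, codebooks[i]))
```

A test signal lying near one user's codewords yields the smallest distortion for that user's codebook, so that user is identified.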
The main advantage of VQ in [[pattern recognition]] is its low computational burden when compared with other techniques such as [[dynamic time warping]] (DTW) and [[hidden Markov model]] (HMM). The main drawback when compared to DTW and HMM is that it does not take into account the temporal evolution of the signals (speech, signature, etc.) because all the vectors are mixed up. In order to overcome this problem a multi-section codebook approach has been proposed.<ref>{{cite journal|last=Faundez-Zanuy|first=Marcos|author2=Juan Manuel Pascual-Gaspar |title=Efficient On-line signature recognition based on Multi-section VQ|journal=Pattern Analysis and Applications|year=2011|volume=14|issue=1|pages=37–45|doi=10.1007/s10044-010-0176-8|s2cid=24868914}}</ref> The multi-section approach consists of modelling the signal with several sections (for instance, one codebook for the initial part, another one for the center and a last codebook for the ending part).
=== Use as clustering algorithm ===
Since VQ seeks centroids as density points of nearby samples, it can also be used directly as a prototype-based clustering method: each centroid is then associated with one prototype.
By aiming to minimize the expected squared quantization error<ref>{{cite journal|last=Gray|first=R.M.|title=Vector Quantization|journal=IEEE ASSP Magazine|year=1984|volume=1|issue=2|pages=4–29|doi=10.1109/massp.1984.1162229|hdl=2060/19890012969|hdl-access=free}}</ref> and introducing a decreasing learning gain fulfilling the Robbins–Monro conditions, multiple iterations over the whole data set with a concrete but fixed number of prototypes converge to the solution of the [[k-means]] clustering algorithm in an incremental manner.
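One concrete gain schedule satisfying the Robbins–Monro conditions is a per-prototype gain of 1/''n''<sub>''j''</sub>, where ''n''<sub>''j''</sub> counts how often prototype ''j'' has won: the gains sum to infinity while their squares sum to a finite value, and each prototype tracks the running mean of the samples it has won, i.e. MacQueen's incremental k-means. An illustrative sketch, not taken from the cited source:

```python
import numpy as np

def incremental_kmeans(samples, k, epochs=25, seed=0):
    """Online VQ with decreasing per-prototype gain 1/n_j, which fulfils
    the Robbins-Monro conditions (sum of gains diverges, sum of squared
    gains converges)."""
    rng = np.random.default_rng(seed)
    prototypes = samples[rng.choice(len(samples), k, replace=False)].copy()
    counts = np.ones(k)  # one "win" each for the initial points
    for _ in range(epochs):
        for x in samples[rng.permutation(len(samples))]:
            j = ((prototypes - x) ** 2).sum(axis=1).argmin()  # winner
            counts[j] += 1
            prototypes[j] += (x - prototypes[j]) / counts[j]  # gain -> 0
    return prototypes
```

With the 1/''n''<sub>''j''</sub> gain, each update leaves the prototype exactly at the mean of all samples it has won so far, so repeated passes converge to a k-means solution incrementally.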
=== Generative Adversarial Networks (GAN) ===
VQ has been used to quantize a feature representation layer in the discriminator of [[Generative adversarial network]]s. The feature quantization (FQ) technique performs implicit feature matching.<ref>Feature Quantization Improves GAN Training https://arxiv.org/abs/2004.02088</ref> It improves GAN training and yields improved performance on a variety of popular GAN models: BigGAN for image generation, StyleGAN for face synthesis, and U-GAT-IT for unsupervised image-to-image translation.
== See also ==
'''Subtopics'''
{{col div|colwidth=40em}}
* [[Learning vector quantization]]
* [[Lloyd's algorithm]]
* [[Neural gas|Growing Neural Gas]], a neural network-like system for vector quantization
{{colend}}
'''Related topics'''
{{col div|colwidth=40em}}
* [[Speech coding]]
* [[Ogg Vorbis]]
* [[Rate-distortion function]]
* [[Data clustering]]
* [[Centroidal Voronoi tessellation]]
* [[Image segmentation]]
* [[Linde–Buzo–Gray algorithm|Linde–Buzo–Gray algorithm (LBG)]]
* [[K-means clustering]]
* [[Autoencoder]]
* [[Deep Learning]]
{{colend}}
''Part of this article was originally based on material from the [[Free On-line Dictionary of Computing]] and is used with [[Wikipedia:Foldoc license|permission]] under the GFDL.''
==External links==
* http://www.data-compression.com/vq.html {{Webarchive|url=https://web.archive.org/web/20171210201342/http://www.data-compression.com/vq.html |date=2017-12-10 }}
* [https://dl.acm.org/citation.cfm?id=1535126 VQ Indexes Compression and Information Hiding Using Hybrid Lossless Index Coding], Wen-Jan Chen and Wen-Tsung Huang