{{AFC submission|d|nn|u=Osmarks|ns=118|decliner=CoconutOctopus|declinets=20250607121517|ts=20250414141445}} <!-- Do not remove this line! -->
 
{{Short description|Vector search algorithm}}
{{Draft topics|stem}}
{{AfC topic|stem}}
 
{{Draft article}}
'''Product quantization''' ('''PQ''') is a technique for [[Nearest neighbor search#Approximation methods|approximate nearest neighbor]] search.<ref name="product2010jegou">{{cite journal |last1=Jégou |first1=Herve |last2=Douze |first2=Matthijs |last3=Schmid |first3=Cordelia |title=Product Quantization for Nearest Neighbor Search |journal=IEEE Transactions on Pattern Analysis and Machine Intelligence |date=2010-03-18 |volume=33 |issue=1 |pages=117–128 |doi=10.1109/TPAMI.2010.57 |pmid=21088323 |url=https://ieeexplore.ieee.org/document/5432202 |access-date=13 April 2025}}</ref> It [[lossy compression|lossily compresses]] vectors by decomposing the vector space into a [[Cartesian product]] of [[Dimension (vector space)|low-dimensional]] [[Linear subspace|subspaces]] and quantizing each subspace independently, so that each vector is stored as a short sequence of codebook indices.<ref name="wu2019vector">{{cite journal |last1=Wu |first1=Ze-bin |last2=Yu |first2=Jun-qing |title=Vector quantization: a review |journal=Frontiers of Information Technology & Electronic Engineering |date=2019-05-18 |volume=20 |issue=4 |pages=507–524 |doi=10.1631/FITEE.1700833 |url=https://link.springer.com/article/10.1631/fitee.1700833}}</ref> The distance between a product-quantized vector and an unquantized query can be computed efficiently from a precomputed [[lookup table]] of subspace distances, so product quantization can save compute, storage and memory bandwidth.<ref name="guo2019accelerating">{{cite arXiv |eprint=1908.10396 |last1=Guo |first1=Ruiqi |last2=Sun |first2=Philip |last3=Lindgren |first3=Erik |last4=Geng |first4=Quan |last5=Simcha |first5=David |last6=Chern |first6=Felix |last7=Kumar |first7=Sanjiv |title=Accelerating Large-Scale Inference with Anisotropic Vector Quantization |date=2019 |class=cs.LG }}</ref>
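
The following Python sketch illustrates the basic scheme: each vector is split into <code>m</code> subvectors, each subvector is quantized against its own [[k-means clustering|k-means]] codebook, and distances to an unquantized query are approximated by summing entries of a per-subspace lookup table (the "asymmetric" distance computation). The function names (<code>pq_train</code>, <code>pq_encode</code>, <code>adc_distances</code>) and all parameter values are illustrative only and are not part of any particular library.

<syntaxhighlight lang="python">
import numpy as np

def pq_train(data, m, k, iters=20, seed=0):
    """Learn one k-means codebook of k centroids for each of the m subspaces."""
    rng = np.random.default_rng(seed)
    n, d = data.shape
    ds = d // m                                   # dimensionality of each subspace
    codebooks = []
    for j in range(m):
        sub = data[:, j * ds:(j + 1) * ds]
        centroids = sub[rng.choice(n, k, replace=False)].copy()
        for _ in range(iters):                    # plain Lloyd/k-means iterations
            assign = ((sub[:, None, :] - centroids[None, :, :]) ** 2).sum(-1).argmin(1)
            for c in range(k):
                members = sub[assign == c]
                if len(members):
                    centroids[c] = members.mean(0)
        codebooks.append(centroids)
    return codebooks

def pq_encode(data, codebooks):
    """Replace each subvector by the index of its nearest centroid (assumes k <= 256)."""
    ds = codebooks[0].shape[1]
    codes = np.empty((data.shape[0], len(codebooks)), dtype=np.uint8)
    for j, cb in enumerate(codebooks):
        sub = data[:, j * ds:(j + 1) * ds]
        codes[:, j] = ((sub[:, None, :] - cb[None, :, :]) ** 2).sum(-1).argmin(1)
    return codes

def adc_distances(query, codes, codebooks):
    """Asymmetric distance computation: build an (m, k) lookup table of squared
    distances from the query's subvectors to every centroid, then approximate
    each database distance as a sum of m table entries."""
    m, ds = len(codebooks), codebooks[0].shape[1]
    table = np.stack([((query[j * ds:(j + 1) * ds] - cb) ** 2).sum(-1)
                      for j, cb in enumerate(codebooks)])
    return table[np.arange(m), codes].sum(axis=1)

# Illustrative usage: 64-dimensional float32 vectors compressed to 8 bytes each.
rng = np.random.default_rng(1)
base = rng.standard_normal((1000, 64)).astype(np.float32)
query = rng.standard_normal(64).astype(np.float32)
codebooks = pq_train(base, m=8, k=32)
codes = pq_encode(base, codebooks)                # 8 bytes per vector instead of 256
nearest = adc_distances(query, codes, codebooks).argsort()[:10]
</syntaxhighlight>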
 
Product quantization can be used by itself or as a component of more complex ANN search algorithms.<ref name="douze2024faiss">{{cite arXiv |eprint=2401.08281 |last1=Douze |first1=Matthijs |last2=Guzhva |first2=Alexandr |last3=Deng |first3=Chengqi |last4=Johnson |first4=Jeff |last5=Szilvasy |first5=Gergely |last6=Mazaré |first6=Pierre-Emmanuel |last7=Lomeli |first7=Maria |last8=Hosseini |first8=Lucas |last9=Jégou |first9=Hervé |title=The Faiss library |date=2024 |class=cs.LG }}</ref> The combination of product quantization with an inverted file index is sometimes known as IVFADC (inverted file with asymmetric distance computation): the database is partitioned into coarse cells and, for a given query, product-quantized distance computations are performed only within the cells nearest to that query.<ref name="matsui2018survey">{{cite journal |last1=Matsui |first1=Yusuke |last2=Uchida |first2=Yusuke |last3=Jégou |first3=Hervé |last4=Satoh |first4=Shin'ichi |title=A Survey of Product Quantization |journal=ITE Transactions on Media Technology and Applications |date=2018 |volume=6 |issue=1 |pages=2–10 |doi=10.3169/mta.6.2 |url=https://www.jstage.jst.go.jp/article/mta/6/1/6_2/_article/-char/ja/ |access-date=13 April 2025}}</ref>
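
Continuing the sketch above, an IVFADC-style index can be layered on top of the illustrative <code>pq_encode</code> and <code>adc_distances</code> helpers. The sketch below is a simplification: a full system would train the coarse centroids with k-means and train the product quantizer on the residuals, and the names (<code>ivf_build</code>, <code>ivf_search</code>, <code>nprobe</code>) are illustrative.

<syntaxhighlight lang="python">
def ivf_build(data, coarse_centroids, codebooks):
    """Assign each vector to its nearest coarse centroid and PQ-encode the
    residual (vector minus centroid), storing codes in per-cell inverted lists."""
    assign = ((data[:, None, :] - coarse_centroids[None, :, :]) ** 2).sum(-1).argmin(1)
    codes = pq_encode(data - coarse_centroids[assign], codebooks)
    lists = {c: [] for c in range(len(coarse_centroids))}
    for i, c in enumerate(assign):
        lists[c].append((i, codes[i]))
    return lists

def ivf_search(query, coarse_centroids, lists, codebooks, nprobe=4):
    """Probe only the nprobe coarse cells nearest to the query and compute
    asymmetric PQ distances against the residual codes stored in those cells."""
    order = ((query - coarse_centroids) ** 2).sum(-1).argsort()[:nprobe]
    results = []
    for c in order:
        if not lists[c]:
            continue
        ids = [i for i, _ in lists[c]]
        cell_codes = np.vstack([code for _, code in lists[c]])
        dists = adc_distances(query - coarse_centroids[c], cell_codes, codebooks)
        results.extend(zip(ids, dists.tolist()))
    return sorted(results, key=lambda r: r[1])
</syntaxhighlight>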
 
=== Optimized product quantization (OPQ) ===
 
'''Optimized product quantization''' ('''OPQ''') is a widely used enhancement that learns a [[rotation matrix]] and applies it to the vectors before quantization, so that the decomposition into subspaces better matches the distribution of the data.<ref name="matsui2018survey" />
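
A heavily simplified OPQ-style training loop, continuing the sketch above, alternates between fitting product-quantization codebooks on the rotated data and updating the rotation by solving an [[orthogonal Procrustes problem]]; the helper names (<code>reconstruct</code>, <code>opq_train</code>) are illustrative, and refinements of the published algorithms are omitted.

<syntaxhighlight lang="python">
def reconstruct(codes, codebooks):
    """Decode PQ codes back into approximate (rotated) vectors."""
    return np.hstack([cb[codes[:, j]] for j, cb in enumerate(codebooks)])

def opq_train(data, m, k, outer_iters=5):
    """Alternate between training PQ codebooks on the rotated data and
    updating the rotation matrix via orthogonal Procrustes."""
    d = data.shape[1]
    R = np.eye(d, dtype=data.dtype)               # start from the identity rotation
    for _ in range(outer_iters):
        rotated = data @ R
        codebooks = pq_train(rotated, m, k)
        approx = reconstruct(pq_encode(rotated, codebooks), codebooks)
        # Find the orthogonal R minimizing ||data @ R - approx||_F.
        u, _, vt = np.linalg.svd(data.T @ approx)
        R = u @ vt
    codebooks = pq_train(data @ R, m, k)          # refit codebooks for the final rotation
    return R, codebooks
</syntaxhighlight>

At search time, both the database vectors and the query are multiplied by the learned rotation before the encoding and lookup-table steps shown earlier.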
 
==References==
 
{{Reflist}}
 
{{Drafts moved from mainspace|date=March 2025}}