Revision as of 12:01, 20 January 2025 by John of Reading (→top: Typo fixing, fixed "Mahematical" by choosing a different "short description"; Tag: AWB)
Revision as of 10:53, 23 April 2025 by Citation bot (Altered template type. Add: isbn, volume, series. Suggested by Headbomb. #UCB_toolbar)
Line 77:
=== CVM Algorithm ===
Compared to other approximation algorithms for the count-distinct problem, the CVM Algorithm<ref>{{Cite ~~journal~~book |last1=Chakraborty |first1=Sourav |last2=Vinodchandran |first2=N. V. |last3=Meel |first3=Kuldeep S. |date=2022 |title=Distinct Elements in Streams: An Algorithm for the (Text) Book |series=Leibniz International Proceedings in Informatics (LIPIcs) |volume=244 |pages=6 pages, 727571 bytes |publisher=Schloss Dagstuhl – Leibniz-Zentrum für Informatik |doi=10.4230/LIPIcs.ESA.2022.34 |doi-access=free |arxiv=2301.10191 |isbn=978-3-95977-247-1 |issn=1868-8969}}</ref> (named by [[Donald Knuth]] after the initials of Sourav Chakraborty, N. V. Vinodchandran, and Kuldeep S. Meel) uses sampling instead of hashing. The CVM Algorithm provides an unbiased estimator for the number of distinct elements in a stream,<ref name=":0" /> in addition to the standard (ε-δ) guarantees. Below is the CVM algorithm, including the slight modification by Donald Knuth.<ref name=":0">{{cite journal |last1=Knuth |first1=Donald |date=May 2023 |title=The CVM Algorithm for Estimating Distinct Elements in Streams |url=https://cs.stanford.edu/~knuth/papers/cvm-note.pdf |journal=}}</ref>

{{nowrap|Initialize <math> p \leftarrow 1 </math>}}
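
For illustration, the sampling idea can be sketched in Python. This is a minimal sketch of the basic algorithm as described by Chakraborty, Vinodchandran, and Meel, not of Knuth's modified version. The function name `cvm_estimate`, the parameter `thresh` (the buffer size) and the optional `rng` argument are hypothetical names chosen for this example, and returning `None` stands in for the failure outcome of the original algorithm.

```python
import random


def cvm_estimate(stream, thresh, rng=None):
    """Estimate the number of distinct elements in `stream`.

    Illustrative sketch of the basic CVM algorithm (without Knuth's
    modification). `thresh` is the maximum buffer size; returns None
    if the algorithm fails.
    """
    rng = rng or random.Random()
    p = 1.0          # current sampling probability
    buffer = set()   # sampled elements

    for a in stream:
        # Forget any earlier decision about this element, then
        # re-sample it with the current probability p.
        buffer.discard(a)
        if rng.random() < p:
            buffer.add(a)

        if len(buffer) == thresh:
            # Buffer is full: keep each element with probability 1/2
            # and halve the sampling probability.
            buffer = {x for x in buffer if rng.random() < 0.5}
            p /= 2
            if len(buffer) == thresh:
                return None  # failure case (very unlikely for large thresh)

    # Each surviving element represents roughly 1/p distinct elements.
    return len(buffer) / p
```

When the stream has fewer distinct elements than `thresh`, the buffer never fills, `p` stays at 1, and the estimate equals the exact count. A larger `thresh` uses more memory and gives a tighter estimate.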

Count-distinct problem: Difference between revisions