Revision as of 02:36, 24 May 2024 edit 142.198.199.23 (talk) Simplified to the version that's easy to understand Tag: Visual edit ← Previous edit		Revision as of 04:02, 28 May 2024 edit undo WikiCleanerBot (talk \| contribs) Bots 1,007,823 edits m v2.05b - Bot T20 CW#61 - Fix errors for CW project (Reference before punctuation - Link equal to linktext) Tag: WPCleaner Next edit →
Line 52: === CVM Algorithm === Compared to other approximation algorithms for the count-distinct problem the CVM Algorithm<ref>{{Cite journal \|last=Chakraborty \|first=Sourav \|last2=Vinodchandran \|first2=N. V. \|last3=Meel \|first3=Kuldeep S. \|date=2022 \|title=Distinct Elements in Streams: An Algorithm for the (Text) Book \|url=http://arxiv.org/abs/2301.10191 \|pages=6 pages, 727571 bytes \|doi=10.4230/LIPIcs.ESA.2022.34 \|issn=1868-8969}}</ref> (named by [[~~Donald Knuth \|~~ Donald Knuth]] after the initials of Sourav Chakraborty, N. V. Vinodchandran, and Kuldeep S. Meel) uses sampling instead of hashing. The CVM Algorithm provides an unbiased estimator for the number of distinct elements in a stream,<ref name=":0" />, in addition to the standard (ε-δ) guarantees. Below is the modification of CVM algorithm, proposed by Donald Knuth, that maintains a buffer of maximum size s<ref name=":0">{{cite journal \|last1=Knuth \|first1=Donald \|date=May 2023 \|title=The CVM Algorithm for Estimating Distinct Elements in Streams \|url=https://cs.stanford.edu/~knuth/papers/cvm-note.pdf \|journal=}}</ref> {{nowrap\|Initialize <math> p \leftarrow 1 </math>}}

Count-distinct problem: Difference between revisions