Count-distinct problem: Difference between revisions

Content deleted Content added
WikiCleanerBot (talk | contribs)
m v2.05b - Bot T20 CW#61 - Fix errors for CW project (Reference before punctuation)
Simplified the description of CVM algorithm and also cited the source for epsilon-delta guarantees.
Line 51:
for a practical overview with comparative simulation results.
 
=== CVM Algorithm ===
Compared to other approximation algorithms for the count-distinct problem the CVM Algorithm uses sampling instead of hashing. The CVM Algorithm provides an unbiased estimator for the number of distinct elements in a stream<ref name=":0" />, howeverin addition to the boundstandard on(ε-δ) errorguarantees<ref>{{Cite isjournal not|last=Chakraborty currently|first=Sourav known|last2=Vinodchandran |first2=N. V. |last3=Meel |first3=Kuldeep S. |date=2022 |title=Distinct Elements in Streams: An Algorithm for the (Text) Book |url=http://arxiv.org/abs/2301.10191 |pages=6 pages, 727571 bytes |doi=10.4230/LIPIcs.ESA.2022.34 |issn=1868-8969}}</ref>. Here is the algorithm:<ref name=":0">{{cite journal |last1=Knuth |first1=Donald |title=The CVM Algorithm for Estimating Distinct Elements in Streams |date=May 2023}}</ref>
 
Initialize a counter, {{mvar|t}}, to zero, {{nowrap|<math> t \leftarrow 0 </math>.}}
Line 61:
{{nowrap|If <math> a </math> is in {{mvar|B}} then}}
{{nowrap|Delete <math> a </math> from {{mvar|B}}}}
{{nowrap|End If}}
{{nowrap|<math> u \leftarrow </math> random number in <math> [0, 1) </math>}}
{{nowrap|If <math> u \geqleq p </math> then}}
Insert <math> (a, u) </math> into {{nowrapmvar|NEXTB}}
{{nowrap|Else If <math> |B| <> s </math> then}}
{{nowrap|Insert <math> (a, u) </math> into {{mvar|B}}}}
{{nowrap|NEXT}}
{{nowrap|Else}}
{{nowrap|<math> (a', u') \leftarrow (a, u) </math> of max <math> u </math> in {{mvar|B}}}}
{{nowrap|IfDelete <math> u >(a', u') </math> thenfrom {{mvar|B}}}}
{{nowrap|<math> p \leftarrow u' </math>}}
{{nowrap|ElseEnd For}}
{{nowrap|Delete <math> (a', u') </math> from {{mvar|B}}}}
{{nowrap|Insert <math> (a, u) </math> into {{mvar|B}}}}
{{nowrap|<math> p \leftarrow u' </math>}}
{{nowrap|End If}}
{{nowrap|End If}}
{{nowrap|End For}}
{{nowrap|return <math> |B| / p </math>.}}