Content deleted Content added
Line 17:
=== STREAM ===
STREAM is an algorithm for clustering data streams described by Guha, Mishra, Motwani and O'Callaghan <ref name="CDS">S. Guha, N. Mishra, R. Motwani, L. O'Callaghan. [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.32.1927 Clustering Data Streams]. Proceedings of the Annual Symposium on Foundations of Computer Science, 2000</ref> which achieves a constant factor approximation for the k-Median problem in a single pass and using small space.
'''''Theorem''': STREAM can solve the k-Median problem on a data stream with time O(n<sup>1+e</sup>) and space θ(n<sup>ε</sup>) up to a factor 2<sup>O(1/e)</sup>, where n the number of points and e<1/2''.
Line 45:
The actual STREAM algorithm works as follows:
<ol>
<li> Input the first ''O(M/k)'' points, where ''M'' denotes bounded memory and ''k'' number of centers. Using the randomized algorithm presented in <ref
<li> Use a local search algorithm to cluster'' O(M)'' intermediate medians of level ''i'' to ''2k'' medians of level ''i+1''.
<li> Use the primal dual algorithm <ref> K. Jain and V. Vazirani. [http://portal.acm.org/citation.cfm?id=796509 Primal-dual approximation algorithms for metric facility ___location and k-median problems.] Proc. FOCS, 1999</ref> to cluster the ''O(M)'' medians to k medians.
|