Data stream clustering: Difference between revisions

Content deleted Content added
Bfoteini (talk | contribs)
No edit summary
Bfoteini (talk | contribs)
Line 21:
'''''Theorem''': STREAM can solve the k-Median problem on a data stream with time O(n<sup>1+e</sup>) and space θ(n<sup>ε</sup>) up to a factor 2<sup>O(1/e)</sup>, where n the number of points and e<1/2''.
 
The first goal is to show that clustering can take place in small space (not caring about the number of passes). Small-Space<ref>S. Guha, N. Mishra, R. Motwani, L. O'Callaghan. [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.32.1927 Clustering Data Streams]. Proceedings of the Annual Symposium on Foundations of Computer Science, 2000</ref> is a [[divide-and-conquer algorithm]] that divides the data into pieces, clusters each one of them (using k-means) and then clusters the centers obtained.
 
[[File:Small-Space.jpg | thumb | 440x140px | right | Small-Space Algorithm representation]]