Data stream clustering: Difference between revisions

Bfoteini (talk | contribs)
In [[computer science]], data stream [[clustering analysis | clustering]] is the clustering of data that arrive continuously, such as telephone records, multimedia data, and financial transactions. It is usually studied under the [[streaming algorithm | data stream model]] of computation. The objective is, given a sequence of points, to maintain a consistently good clustering of the sequence observed so far while using a small amount of memory and time.
 
 
<!-- in contrary to the traditional clustering where data are static. -->
 
 
 
== History ==
== Algorithms ==
Many algorithms have been proposed for the data stream clustering problem. The performance of an algorithm that operates on data streams is measured by three basic factors:
* The number of passes the algorithm must make over the stream.
* The available memory.
* The running time of the algorithm.
These algorithms resemble [[online algorithms]] but differ from them in two respects. Algorithms for data stream clustering have only a bounded amount of memory available, and they may defer action until a group of points has arrived, whereas online algorithms are required to take action after each point arrives.
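
The three factors above can be illustrated with a minimal sketch. It is not BIRCH, COBWEB, or STREAM; it is a simple batch-wise running-mean variant of k-means, chosen only because it is short. The names <code>chunks</code>, <code>sq_dist</code>, and <code>stream_kmeans</code>, and the parameters <code>k</code> and <code>chunk_size</code>, are hypothetical and introduced for illustration. The sketch makes one pass over the stream, keeps at most <code>k</code> centers plus one chunk of points in memory, and updates the centers only after a whole group of points has arrived.

```python
from itertools import islice


def chunks(stream, size):
    """Yield lists of at most `size` points, so only one chunk is held at a time."""
    it = iter(stream)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def stream_kmeans(stream, k, chunk_size):
    """Single-pass sketch (an assumption, not a published algorithm).

    Memory use is bounded by k centers plus chunk_size points.
    """
    centers, counts = [], []
    for batch in chunks(stream, chunk_size):
        # Seed the centers with the first k points seen on the stream.
        while len(centers) < k and batch:
            centers.append([float(x) for x in batch.pop(0)])
            counts.append(1)
        if not batch:
            continue
        # Assign every point of the chunk to its nearest current center.
        dim = len(centers[0])
        sums = [[0.0] * dim for _ in centers]
        seen = [0] * len(centers)
        for p in batch:
            j = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            seen[j] += 1
            sums[j] = [s + x for s, x in zip(sums[j], p)]
        # Act once per chunk: fold the chunk into each center's running mean.
        for j, n in enumerate(seen):
            if n:
                total = counts[j] + n
                centers[j] = [(c * counts[j] + s) / total
                              for c, s in zip(centers[j], sums[j])]
                counts[j] = total
    return centers, counts


points = [(0.0,), (10.0,), (1.0,), (11.0,), (2.0,), (12.0,)]
centers, counts = stream_kmeans(iter(points), k=2, chunk_size=3)
```

Passing an iterator rather than a list emphasises that each point is read exactly once. A larger <code>chunk_size</code> trades memory for fewer center updates, and the quality of the result depends heavily on the first <code>k</code> points, which is one reason the algorithms listed below use more careful summaries.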
 
Some of the best-known algorithms used for data stream clustering are:
* BIRCH
* COBWEB
* STREAM
 
===BIRCH===
 
===COBWEB===
 
===STREAM===
 
 
== References ==