Data stream clustering: Difference between revisions

Bfoteini (talk | contribs)
In [[computer science]], data stream [[clustering analysis | clustering]] is the clustering of data that arrive continuously, such as telephone records, multimedia data, and financial transactions. Data stream clustering is usually studied under the [[streaming algorithm | data stream model]] of computation. The objective is, given a sequence of points, to maintain a consistently good clustering of the sequence observed so far while using a small amount of memory and time.
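
As an illustration of maintaining a clustering of the points seen so far with little memory, the following is a minimal sketch of a sequential k-means update. It is not a method named in this article. The class name <code>SequentialKMeans</code>, the seeding rule (the first ''k'' points become the initial centers), and the use of squared Euclidean distance are all assumptions made for this example. The sketch stores only ''k'' centers and ''k'' counters, regardless of how many points arrive.

```python
from typing import List, Sequence


class SequentialKMeans:
    """Single-pass clustering sketch that stores only k centers and k counts."""

    def __init__(self, k: int) -> None:
        if k < 1:
            raise ValueError("k must be at least 1")
        self.k = k
        self.centers: List[List[float]] = []
        self.counts: List[int] = []

    @staticmethod
    def _sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        # Squared Euclidean distance (an assumed choice of metric).
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def update(self, point: Sequence[float]) -> int:
        """Absorb one point and return the index of the cluster it joined."""
        # Assumption: the first k points seed the centers.
        if len(self.centers) < self.k:
            self.centers.append(list(point))
            self.counts.append(1)
            return len(self.centers) - 1

        # Assign the point to its nearest center.
        nearest = min(
            range(self.k),
            key=lambda i: self._sq_dist(self.centers[i], point),
        )

        # Move that center toward the point so it stays the running mean
        # of every point assigned to it; the point itself is then discarded.
        self.counts[nearest] += 1
        step = 1.0 / self.counts[nearest]
        center = self.centers[nearest]
        for d, x in enumerate(point):
            center[d] += step * (x - center[d])
        return nearest
```

Calling <code>update</code> once per arriving point makes a single pass over the stream. The quality of the result depends heavily on the seeding and on the order of arrival, so this sketch carries no approximation guarantee.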
 
 
 
== Algorithms ==
Many algorithms have been proposed for the data stream clustering problem. The performance of an algorithm that operates on data streams is measured by three basic factors:
# The number of passes the algorithm must make over the stream.
# The available memory.
# The running time of the algorithm.
These algorithms have many similarities with [[online algorithms]], but the two are not identical. Unlike online algorithms, algorithms for data stream clustering have only a bounded amount of memory available. They may also defer action until a group of points has arrived, whereas online algorithms must take action after each point arrives.
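
The batch-at-a-time behaviour described above can be sketched as follows. This is a hypothetical illustration, not a specific published algorithm. The function names <code>reduce_to_centers</code> and <code>cluster_stream_in_batches</code>, the restriction to one-dimensional points, the evenly spaced seeding, and the fixed number of Lloyd iterations are all assumptions. The sketch makes one pass over the stream and never holds more than <code>batch_size</code> raw points plus ''k'' weighted centers.

```python
from typing import Iterable, List, Tuple

Weighted = Tuple[float, float]  # (value, weight); 1-D points for brevity


def reduce_to_centers(items: List[Weighted], k: int, iters: int = 10) -> List[Weighted]:
    """Summarise weighted 1-D points as at most k weighted centers."""
    if iters < 1:
        raise ValueError("iters must be at least 1")
    ordered = sorted(items)
    # Assumed seeding: k evenly spaced values from the sorted items.
    step = max(1, len(ordered) // k)
    centers = [ordered[i][0] for i in range(0, len(ordered), step)][:k]
    weights = [0.0] * len(centers)
    for _ in range(iters):
        sums = [0.0] * len(centers)
        weights = [0.0] * len(centers)
        for value, weight in ordered:
            # Assign each weighted point to its nearest current center.
            j = min(range(len(centers)), key=lambda i: abs(centers[i] - value))
            sums[j] += value * weight
            weights[j] += weight
        # Recompute each center as the weighted mean of its points.
        centers = [s / w if w > 0 else c for s, w, c in zip(sums, weights, centers)]
    # Drop centers that attracted no points.
    return [(c, w) for c, w in zip(centers, weights) if w > 0]


def cluster_stream_in_batches(stream: Iterable[float], k: int, batch_size: int) -> List[Weighted]:
    """One pass over the stream, acting only when a full batch has arrived."""
    summary: List[Weighted] = []  # at most k weighted centers
    buffer: List[Weighted] = []   # at most batch_size raw points
    for value in stream:
        buffer.append((value, 1.0))
        if len(buffer) == batch_size:
            # Fold the batch into the summary, then discard the raw points.
            summary = reduce_to_centers(summary + buffer, k)
            buffer.clear()
    if buffer:
        # Handle a final, partially filled batch.
        summary = reduce_to_centers(summary + buffer, k)
    return summary
```

For example, <code>cluster_stream_in_batches([0, 100, 1, 101, 2, 102, 3, 103], k=2, batch_size=4)</code> processes two batches and ends with two centers of weight 4 each. A strictly online method would instead have to commit to a decision after every single point.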
 
Some of the most well-known algorithms
 
== References ==