Apriori (Agrawal 94) uses [[breadth-first search]] and a [[hash tree]] structure to count candidate item sets efficiently. The algorithm generates candidate item sets of length <math>k</math> from the frequent item sets of length <math>k-1</math>, then prunes any candidate that has an infrequent sub-pattern. By the [[downward closure lemma]], the pruned candidate set still contains all frequent item sets of length <math>k</math>. The whole transaction database is then scanned to determine which of the candidates are actually frequent. To make this counting step fast, the candidates are stored in a hash tree, which holds item sets at its leaves and [[hash table]]s at its internal nodes (Zaki, 99).
Apriori is designed to operate on databases containing transactions, such as collections of items bought by customers in individual purchases.
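The candidate generation and pruning step described above can be sketched in Python as follows. This is an illustrative sketch rather than code from the original paper: the function name <code>generate_candidates</code> and the representation of item sets as <code>frozenset</code>s are choices made here, and the candidates are kept in a plain set instead of the hash tree used for fast counting.

<syntaxhighlight lang="python">
from itertools import combinations

def generate_candidates(frequent_prev, k):
    """Build candidate k-itemsets from the frequent (k-1)-itemsets.

    Join step: unite pairs of frequent (k-1)-itemsets whose union has size k.
    Prune step: discard any candidate with an infrequent (k-1)-subset,
    which is justified by the downward closure property.
    """
    prev = list(frequent_prev)
    joined = {a | b for i, a in enumerate(prev) for b in prev[i + 1:] if len(a | b) == k}
    return {c for c in joined
            if all(frozenset(s) in frequent_prev for s in combinations(c, k - 1))}

# Example: all 2-subsets of {a, b, c} are frequent, so {a, b, c} survives the prune.
L2 = {frozenset({"a", "b"}), frozenset({"a", "c"}), frozenset({"b", "c"})}
print(generate_candidates(L2, 3))  # {frozenset({'a', 'b', 'c'})}
</syntaxhighlight>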
== Algorithm ==
Apriori<math>(T,\varepsilon)</math>
<math>L_1 \gets \{\text{large 1-itemsets}\}</math>
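The level-wise loop that the pseudocode above begins can be written out in Python roughly as follows. This is a minimal sketch, not the authors' implementation: it reuses the <code>generate_candidates</code> helper sketched earlier and counts candidates with a plain subset test on each transaction, where the algorithm itself uses a hash tree for this counting step.

<syntaxhighlight lang="python">
from collections import defaultdict

def apriori(transactions, min_support):
    """Return every frequent itemset (as a frozenset) with its support count.

    transactions : iterable of item collections (the database T)
    min_support  : minimum absolute support count (the threshold epsilon)
    """
    transactions = [frozenset(t) for t in transactions]

    # L1: count individual items with one pass over the database.
    counts = defaultdict(int)
    for t in transactions:
        for item in t:
            counts[frozenset([item])] += 1
    current = {s: c for s, c in counts.items() if c >= min_support}

    frequent = dict(current)
    k = 2
    while current:
        # Generate C_k from L_{k-1} (join + downward-closure prune), as sketched above.
        candidates = generate_candidates(set(current), k)

        # One scan of the database counts each surviving candidate.
        counts = defaultdict(int)
        for t in transactions:
            for c in candidates:
                if c <= t:  # the candidate is contained in this transaction
                    counts[c] += 1

        current = {s: c for s, c in counts.items() if c >= min_support}
        frequent.update(current)
        k += 1
    return frequent

# Example usage on a toy transaction database.
txns = [{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}, {"milk"}]
print(apriori(txns, min_support=2))
</syntaxhighlight>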
== References ==
*Rakesh Agrawal, Tomasz Imielinski, and Arun N. Swami, ''Mining Association Rules between Sets of Items in Large Databases'', Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data.
*Rakesh Agrawal and Ramakrishnan Srikant, ''Fast Algorithms for Mining Association Rules'', Proc. 20th Int. Conf. Very Large Data Bases (VLDB), 1994.