Revision as of 11:10, 8 December 2006 by 212.162.130.92 (→Algorithm); revision as of 02:07, 9 December 2006 by BagpipingScotsman (m, grammar)
Apriori uses [[breadth-first search]] and a [[hash tree]] structure to count candidate item sets efficiently. It generates candidate item sets of length <math>k</math> from the frequent item sets of length <math>k-1</math>, then prunes every candidate that has an infrequent sub-pattern. By the [[downward closure lemma]], the resulting candidate set contains all frequent item sets of length <math>k</math>. The algorithm then scans the transaction database to determine which of the candidates are frequent. To make this counting fast, the candidates are stored in a hash tree that has item sets at the leaves and [[hash table]]s at the internal nodes (Zaki, 99). Note that this is not the same kind of [[hash tree]] as the one used in, for instance, p2p systems.

Apriori, while historically significant, suffers from a number of inefficiencies or trade-offs, which have spawned other algorithms. Candidate generation produces large numbers of subsets (the algorithm attempts to load up the candidate set with as many as possible before each scan). Bottom-up subset exploration (essentially a breadth-first traversal of the subset lattice) finds any maximal subset <math>S</math> only after all <math>2^{|S|}-1</math> of its proper subsets.

== Algorithm ==
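The candidate generation, pruning and counting steps described in the overview can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function name <code>apriori</code>, its parameters, and the use of an absolute support count as the threshold are hypothetical choices, and candidates are counted with plain set containment tests rather than the hash tree mentioned above.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: support count} for every frequent item set.

    Assumption: min_support is an absolute count, not a fraction.
    Simplification: candidates are kept in a plain set, not a hash tree.
    """
    transactions = [frozenset(t) for t in transactions]

    # First pass: count the individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join: unite pairs of frequent (k-1)-sets whose union has k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count: one scan of the transaction database per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


if __name__ == "__main__":
    baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in apriori(baskets, min_support=3).items():
        print(sorted(itemset), support)
```

Each pass of the <code>while</code> loop corresponds to one level of the breadth-first traversal of the subset lattice, so a frequent item set of length <math>k</math> is only reported after <math>k</math> scans of the database.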

Apriori algorithm: Difference between revisions