Revision as of 10:04, 4 August 2006 edit SB Johnny (talk \| contribs) 6,118 edits rm ((wikify)) tag - terms in article not already wikified can't be wikified in a practical way ← Previous edit		Revision as of 01:05, 10 August 2006 edit undo Pgan002 (talk \| contribs) Extended confirmed users 15,367 edits copy-edit: clean-up and clarification Next edit →
Line 2: {{cleanup-context}} '''Apriori''' is an ~~efficient~~algorithm for efficiently [[~~association~~data ~~rule~~mining\|mining data]] for [[~~data~~association ~~mining~~rule]]s. ~~algorithm,~~It was developed by Rakesh Agrawal, et al.. Apriori employs [[breadth-first search]] and uses a [[hash tree]] structure to count candidate item sets efficiently. The algorithm generates candidate item sets (patterns) of length <math>k</math> from <math>k-1</math> length item sets. Then, the patterns which have an infrequent sub pattern are pruned. According to the [[downward closure lemma]], the generated candidate set contains all frequent <math>k</math> length item sets. Following that, the whole transaction database is scanned to determine frequent item sets among the candidates. For determining frequent items in a fast manner, the algorithm uses a hash tree to store candidate itemsets. Note: This hash tree has item sets at the leaves and [[hash table]]s at internal nodes (Zaki, 99). Note that this is not the same kind of [[hash tree]] structure used in for instance p2p systems.▼ Apriori is designed to operate on [[database]]s containing transactions (eg: collection of items bought by customers or details of a website frequentation). Other algorithms are designed for finding association rules in data having no transactions (Winepi and Minepi), or having no timestamps (DNA sequencing). ▲Apriori ~~employs~~uses [[breadth-first search]] and ~~uses~~ a [[hash tree]] structure to count candidate item sets efficiently. ~~The algorithm~~It generates candidate item sets ~~(patterns)~~ of length <math>k</math> from item sets of length <math>k-1</math>. ~~length~~ ~~item~~Then ~~sets.~~it ~~Then,~~prunes the ~~patterns~~candidates which have an infrequent sub pattern ~~are pruned~~. According to the [[downward closure lemma]], the ~~generated~~ candidate set contains all frequent <math>k</math> -length item sets. ~~Following~~After that, it scans the ~~whole~~ transaction database ~~is scanned~~ to determine frequent item sets among the candidates. For determining frequent items ~~in a fast manner~~quickly, the algorithm uses a hash tree to store candidate itemsets. ~~Note:~~ This hash tree has item sets at the leaves and [[hash table]]s at internal nodes (Zaki, 99). Note that this is not the same kind of [[hash tree]] ~~structure~~ used in for instance p2p systems. == Algorithm ==

Apriori algorithm: Difference between revisions