Revision as of 01:05, 10 August 2006 by Pgan002 (copy-edit: clean-up and clarification); revision as of 02:55, 10 August 2006 by Pgan002 (copy-edit)
'''Apriori''' is an algorithm for efficiently [[data mining|mining data]] for [[association rule]]s. It was developed by Rakesh Agrawal et al.

Apriori uses [[breadth-first search]] and a [[hash tree]] structure to count candidate item sets efficiently. It generates candidate item sets of length <math>k</math> from the frequent item sets of length <math>k-1</math>, then prunes every candidate that has an infrequent sub-pattern. By the [[downward closure lemma]], the remaining candidate set still contains all frequent item sets of length <math>k</math>. The algorithm then scans the transaction database to determine which of the candidates are frequent. To make this counting fast, the candidates are stored in a hash tree that has item sets at the leaves and [[hash table]]s at the internal nodes (Zaki, 99). This is not the same kind of [[hash tree]] as the one used in, for instance, p2p systems.

Apriori is designed to operate on [[database]]s containing transactions (for example, collections of items bought by customers, or details of visits to a website). Other algorithms are designed for finding association rules in data having no transactions (Winepi and Minepi), or having no timestamps (DNA sequencing).

== Algorithm ==
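The generate, prune and count steps described above can be illustrated with a minimal Python sketch. It is not the original implementation: for brevity it stores candidates in plain Python sets instead of a hash tree, and the function name <code>apriori</code> and the parameter <code>min_support</code> (taken here to be an absolute transaction count) are assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {item set: support count} for every frequent item set.

    Illustrative sketch only: candidates are kept in plain sets,
    not in the hash tree that the article describes.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count the individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: n for s, n in counts.items() if n >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Generate: join frequent (k-1)-item sets whose union has k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune: by downward closure, every (k-1)-subset must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count: one scan of the transaction database per level.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s: n for s, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result
```

Each pass of the <code>while</code> loop handles one item set length, which corresponds to the breadth-first search mentioned above. The nested loop in the count step is the part a hash tree would speed up on larger inputs.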

Apriori algorithm: Difference between revisions