Apriori algorithm: Difference between revisions

Content deleted Content added
m Better explanation of limitation
update url on rebranded site
Line 4:
== Overview ==
 
The Apriori algorithm was proposed by Agrawal and Srikant in 1994. Apriori is designed to operate on [[database]]s containing transactions (for example, collections of items bought by customers, or details of a website frequentation or [[IP address]]es<ref>[https://www.dativadeductive.com/blogs/data-science-ip-matching/ The data science behind IP address matching] Published by dativadeductive.com, September 6, 2018, retrieved September 7, 2018</ref>). Other algorithms are designed for finding association rules in data having no transactions ([[Winepi]] and Minepi), or having no timestamps (DNA sequencing). Each transaction is seen as a set of items (an ''itemset''). Given a threshold <math>C</math>, the Apriori algorithm identifies the item sets which are subsets of at least <math>C</math> transactions in the database.
 
Apriori uses a "bottom up" approach, where frequent subsets are extended one item at a time (a step known as ''candidate generation''), and groups of candidates are tested against the data. The algorithm terminates when no further successful extensions are found.