ASSOCIATION RULES & THE APRIORI ALGORITHM
BY: JOE CASABONA

INTRODUCTION
• Recap
  o Data Mining
  o Three types
• Association Rules
• Apriori Algorithm

ASSOCIATION RULES
• Most apparent form of Data Mining
• Objective: Find all co-occurrence relationships among data items
• Strength: Support & Confidence

SUPPORT
• Those who buy X buy Y, where X and Y are sets
  o X => Y
• support = (X ∪ Y).count / n
• .count = number of occurrences
• n = number of total transactions
• Number produced is the % of all transactions (T) that contain both X and Y

CONFIDENCE
• confidence = (X ∪ Y).count / X.count
• % of transactions containing X that also contain Y
• Determines predictability of the rule
• Min Support and Min Confidence are determined in advance

EXAMPLE
• AR 1: Xbox ---> Controller
  o Support: 5/8 (share of transactions containing Xbox; Xbox and Controller appear together in 3/8)
  o Confidence: 3/5
• AR 2: COD4 ---> Xbox
  o Support: 5/8 (share of transactions containing COD4; COD4 and Xbox appear together in 2/8)
  o Confidence: 2/5
• On these figures AR 1 passes (5/8 ≥ 60%, 3/5 ≥ 50%); AR 2 fails (2/5 < 50%)

APRIORI ALGORITHM
• Generate all frequent item sets
  o All item sets with min support
• Generate all confident ARs from frequent item sets
• Downward Closure Property
  o Every subset of a frequent item set is also frequent

GENERATE FREQUENT ITEM SETS
• Count supports of each individual item
• Create a set F with all individual items with min support
• Create "Candidate Set" C[k] based on F[k-1]
• Check each element c in C[k] to see if it meets min support
• Return set of all frequent item sets

GENERATE CANDIDATE SETS
• Take two item sets from the seed set F[k-1] that differ only in the last element
• Join those item sets into c
• Compare each (k-1)-subset s of c to F[k-1]; if any s is not in F[k-1], delete c
• Return final candidate set

RULE GENERATION
• Take Frequent Item Set F
  o If {F[1], F[2], ..., F[k-1]} => {F[k]} meets some min confidence, make it a rule
  o Remove last element from antecedent, insert into consequent, check again

OTHER ALGORITHMS
• Eclat algorithm
• FP-Growth algorithm
• One-attribute-rule
• Zero-attribute-rule

SAMPLE DATA
• Xbox, Controller, COD4
• Xbox, COD4
• Xbox, Controller
• Controller, COD4
• Xbox, Rock Band, Controller
• Xbox, PS3
• COD4, COD5, Rock Band
• COD4, Rock Band
• Min Support: 60%
• Min Confidence: 50%

REFERENCES
• The book I am using: Liu, Bing. Web Data Mining, Chapter 2: Association Rules and Sequential Patterns. Springer, December 2006.
• Wikipedia: "Apriori Algorithm." http://en.wikipedia.org/wiki/Apriori_algorithm (March 23, 2009)
• Wikipedia: "Association rule learning." http://en.wikipedia.org/wiki/Association_rules (March 25, 2009)
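
CODE SKETCH
The support and confidence measures and the Apriori steps from the slides can be tried out on the sample data with a short Python sketch. This is an illustration under stated assumptions, not the presenter's or Liu's implementation: the function names (count, support, confidence, candidates, frequent_itemsets, rules) are hypothetical, support is taken as the share of transactions containing both X and Y, and rules() is a simplification that only tries single-item consequents instead of moving items from antecedent to consequent one at a time.

```python
from itertools import combinations

# Sample data from the slides: one set of items per transaction.
TRANSACTIONS = [
    {"Xbox", "Controller", "COD4"},
    {"Xbox", "COD4"},
    {"Xbox", "Controller"},
    {"Controller", "COD4"},
    {"Xbox", "Rock Band", "Controller"},
    {"Xbox", "PS3"},
    {"COD4", "COD5", "Rock Band"},
    {"COD4", "Rock Band"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def support(x, y, transactions):
    """Support of X => Y: share of all transactions containing X and Y."""
    return count(x | y, transactions) / len(transactions)


def confidence(x, y, transactions):
    """Confidence of X => Y: share of X's transactions that also contain Y."""
    return count(x | y, transactions) / count(x, transactions)


def candidates(prev_frequent, k):
    """Build C[k]: join sets from F[k-1] that differ only in the last item, then prune."""
    prev = sorted(tuple(sorted(f)) for f in prev_frequent)
    result = set()
    for a, b in combinations(prev, 2):
        if a[:-1] == b[:-1]:  # same items except the last one
            c = frozenset(a) | frozenset(b)
            # Downward closure: every (k-1)-subset of c must be in F[k-1].
            if all(frozenset(s) in prev_frequent for s in combinations(c, k - 1)):
                result.add(c)
    return result


def frequent_itemsets(transactions, min_support):
    """Level-wise search: F[1] first, then C[k] from F[k-1] until a level is empty."""
    n = len(transactions)
    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items
             if count(frozenset([i]), transactions) / n >= min_support}
    frequent = set(level)
    k = 2
    while level:
        level = {c for c in candidates(level, k)
                 if count(c, transactions) / n >= min_support}
        frequent |= level
        k += 1
    return frequent


def rules(frequent, transactions, min_confidence):
    """Simplified rule generation: single-item consequents only (an assumption)."""
    found = []
    for f in frequent:
        if len(f) < 2:
            continue
        for item in f:
            x, y = f - {item}, frozenset([item])
            if confidence(x, y, transactions) >= min_confidence:
                found.append((x, y))
    return found
```

With these definitions, support({"Xbox"}, {"Controller"}, TRANSACTIONS) is 0.375 and the confidence is 0.6, while COD4 => Xbox has confidence 0.4. At the slides' 60% minimum support, frequent_itemsets returns only {Xbox} and {COD4}, so no rule is generated. Lowering the threshold to 0.375 (a value chosen here purely for illustration) makes {Xbox, Controller} frequent and yields both Xbox => Controller and Controller => Xbox at 50% minimum confidence.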