association rules & the apriori algorithm

ASSOCIATION RULES & THE APRIORI ALGORITHM
BY: JOE CASABONA
INTRODUCTION
• Recap
o Data Mining
o Three types
• Association Rules
• Apriori Algorithm
ASSOCIATION RULES
• Most apparent form of Data Mining
• Objective: Find all co-occurrence relationships among data items
• Strength: Support & Confidence
SUPPORT
• Those who buy X buy Y, where X and Y are item sets
o X => Y
• support = (X ∪ Y).count / n
• (X ∪ Y).count = number of transactions that contain every item of X and Y
• n = number of total transactions
• Number produced is % of all transactions (T)
CONFIDENCE
• % of transactions containing X that also contain Y
• confidence = (X ∪ Y).count / X.count
• Determines predictability of the rule
• Min Support and Min Confidence are thresholds chosen by the user
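The two measures above can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the function names and the `baskets` toy data are assumptions, and support is taken as (X ∪ Y).count / n.

```python
def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= set(t))

def support(x, y, transactions):
    """Fraction of all transactions that contain both X and Y."""
    return count(set(x) | set(y), transactions) / len(transactions)

def confidence(x, y, transactions):
    """Fraction of the transactions containing X that also contain Y."""
    return count(set(x) | set(y), transactions) / count(x, transactions)

# Hypothetical toy data, not taken from the slides
baskets = [{"bread", "milk"}, {"bread"}, {"bread", "milk", "eggs"}, {"eggs"}]
print(support({"bread"}, {"milk"}, baskets))     # 2 of 4 transactions
print(confidence({"bread"}, {"milk"}, baskets))  # 2 of the 3 bread transactions
```

A rule is kept only if both numbers reach the chosen minimums.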
EXAMPLE
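• AR 1: Xbox ---> Controller
o Support: 3/8 (Xbox and Controller appear together in 3 of 8 transactions; Xbox alone appears in 5/8)
o Confidence: 3/5
• AR 2: COD4 ---> Xbox
o Support: 2/8 (COD4 and Xbox appear together in 2 of 8 transactions; COD4 alone appears in 5/8)
o Confidence: 2/5
• On Min Confidence (50%), AR 1 passes at 60% and AR 2 fails at 40%
• Only the antecedents (5/8) reach Min Support (60%); measured as (X ∪ Y).count / n, both rules fall below it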
APRIORI ALGORITHM
• Generate all frequent item sets
o All item sets with at least min support
• Generate all confident ARs from frequent item sets
• Downward Closure Property: every non-empty subset of a frequent item set is also frequent
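The downward closure property is what lets Apriori discard candidates early: a subset can never occur less often than the item set that contains it. The check below is a small sketch under that reading; `closure_holds` and the toy `baskets` are hypothetical names introduced for illustration.

```python
from itertools import combinations

def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if set(itemset) <= set(t))

def closure_holds(itemset, transactions):
    """True if every non-empty proper subset occurs at least as often as itemset."""
    base = count(itemset, transactions)
    items = sorted(itemset)
    return all(
        count(sub, transactions) >= base
        for r in range(1, len(items))
        for sub in combinations(items, r)
    )

# Hypothetical toy data, not taken from the slides
baskets = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"b", "c"}]
print(closure_holds({"a", "b", "c"}, baskets))
```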
GENERATE FREQUENT ITEM SETS
• Count supports of each individual item
• Create a set F with all individual items with min support
• Create "Candidate Set" C[k] based on F[k-1].
• Check each element c in C[k] to see if it meets min support
• Return set of all frequent item sets.
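The level-by-level loop described above might look like the following. It is a sketch rather than the presenter's implementation: `frequent_itemsets` is an assumed name, and candidates are built here by a simple union of two frequent (k-1)-sets instead of the ordered join and prune step.

```python
def frequent_itemsets(transactions, min_support):
    """Return {item set: support} for every item set meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def sup(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # F[1]: individual items that meet min support
    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items if sup(frozenset([i])) >= min_support}

    frequent = {}
    k = 2
    while level:
        for s in level:
            frequent[s] = sup(s)
        # C[k]: unions of two frequent (k-1)-sets that contain exactly k items
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # F[k]: candidates that meet min support
        level = {c for c in candidates if sup(c) >= min_support}
        k += 1
    return frequent

# Hypothetical toy data, not taken from the slides
baskets = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"b", "c"}]
result = frequent_itemsets(baskets, 0.5)
print(sorted(sorted(s) for s in result))
```

The loop stops as soon as a level produces no frequent item sets, since no larger item set can be frequent after that.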
GENERATE CANDIDATE SETS
• Take two item sets from the seed set F[k-1] that differ only in the last element
• Join those item sets into a candidate c of size k
• Compare each (k-1)-subset s of c to F[k-1]; if any s is not in F[k-1], delete c
• Return final candidate set
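The join and prune steps can be sketched as below, assuming each item set in F[k-1] is stored as a sorted tuple. The function name `candidate_gen` and the numeric toy input are illustrative choices.

```python
from itertools import combinations

def candidate_gen(prev_frequent):
    """Build C[k] from F[k-1], given as sorted tuples of length k-1."""
    prev = sorted(set(prev_frequent))
    prev_set = set(prev)
    candidates = []
    for i, f1 in enumerate(prev):
        for f2 in prev[i + 1:]:
            # Join step: identical except for the last element
            if f1[:-1] == f2[:-1]:
                c = f1 + (f2[-1],)
                # Prune step: every (k-1)-subset of c must be in F[k-1]
                if all(s in prev_set for s in combinations(c, len(c) - 1)):
                    candidates.append(c)
    return candidates

# Toy F[3] chosen for illustration
f3 = [(1, 2, 3), (1, 2, 4), (1, 3, 4), (1, 3, 5), (2, 3, 4)]
print(candidate_gen(f3))
```

With this input the join produces (1, 2, 3, 4) and (1, 3, 4, 5); the second is pruned because its subset (1, 4, 5) is not in F[3].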
RULE GENERATION
• Take Frequent Item Set F
o If {F[1], F[2],...F[k-1]} => {F[k]} meets some min confidence, make it a rule
o Remove last element from antecedent, insert into consequent, check again
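A sketch of rule generation follows. Note the difference from the walk described above: rather than moving one element at a time from antecedent to consequent, this version simply tries every split of the item set, which is easier to show in a few lines. `generate_rules` and the toy data are assumptions.

```python
from itertools import combinations

def generate_rules(itemset, transactions, min_confidence):
    """Return (antecedent, consequent, confidence) for each confident split."""
    transactions = [set(t) for t in transactions]
    items = sorted(itemset)

    def count(s):
        return sum(1 for t in transactions if set(s) <= t)

    # Assumes itemset is frequent, so it occurs in at least one transaction
    whole = count(items)
    rules = []
    for r in range(len(items) - 1, 0, -1):  # antecedent size, largest first
        for antecedent in combinations(items, r):
            consequent = tuple(i for i in items if i not in antecedent)
            conf = whole / count(antecedent)
            if conf >= min_confidence:
                rules.append((antecedent, consequent, conf))
    return rules

# Hypothetical toy data, not taken from the slides
baskets = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"b", "c"}]
rules = generate_rules({"b", "c"}, baskets, 0.8)
print(rules)
```

Here only c => b survives, because every transaction with c also has b, while b => c holds in just 2 of 3 cases.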
OTHER ALGORITHMS
• Eclat algorithm
• FP-Growth algorithm
• One-attribute-rule
• Zero-attribute-rule
SAMPLE DATA
• Xbox, Controller, COD4
• Xbox, COD4
• Xbox, Controller
• Controller, COD4
• Xbox, Rock Band, Controller
• Xbox, PS3
• COD4, COD5, Rock Band
• COD4, Rock Band
• Min Support: 60%
• Min Confidence: 50%
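The sample data above is small enough to check by hand, but a short script makes the counts explicit. This is a sketch with assumed helper names (`count`, `rule_stats`), and it measures rule support as (X ∪ Y).count / n.

```python
from fractions import Fraction

transactions = [
    {"Xbox", "Controller", "COD4"},
    {"Xbox", "COD4"},
    {"Xbox", "Controller"},
    {"Controller", "COD4"},
    {"Xbox", "Rock Band", "Controller"},
    {"Xbox", "PS3"},
    {"COD4", "COD5", "Rock Band"},
    {"COD4", "Rock Band"},
]
n = len(transactions)

def count(items):
    """Number of transactions containing every item in items."""
    return sum(1 for t in transactions if set(items) <= t)

def rule_stats(x, y):
    """Return (rule support, confidence) for x => y as exact fractions."""
    both = count({x, y})
    return Fraction(both, n), Fraction(both, count({x}))

for x, y in [("Xbox", "Controller"), ("COD4", "Xbox")]:
    sup, conf = rule_stats(x, y)
    print(f"{x} => {y}: support {sup}, confidence {conf}")
```

Xbox and COD4 each appear in 5 of 8 transactions, so they are the only single items that reach 60% support. Xbox => Controller has rule support 3/8 and confidence 3/5, while COD4 => Xbox has rule support 2/8 and confidence 2/5.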
REFERENCES
The Book I am using:
Liu, Bing. Web Data Mining, Chapter 2: Association Rules and Sequential Patterns. Springer, December 2006.
Wikipedia:
"Apriori Algorithm." http://en.wikipedia.org/wiki/Apriori_algorithm March 23, 2009
"Association rule learning." http://en.wikipedia.org/wiki/Association_rules March 25, 2009