Classification based on Association Rules

Introduction
• Association rules were originally designed for
finding multi-correlated items in transactions
• However, they can easily be adapted for
classification.
• How?
Example
{SL=L, SW=M, PL=S, PW=M} → virginica
{SL=S, SW=L, PL=M, PW=S} → setosa
:
:
Sepal Length (SL); Sepal Width (SW); Petal Length (PL); Petal Width (PW)
Large = L; Medium = M; Small = S;
Discretization of numeric attributes to create “Large”, “Medium”, “Small”
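The slides do not say which discretization method is used. As a minimal sketch, assuming simple equal-width binning into three bins, a numeric attribute could be mapped to S/M/L as follows. The function name `discretize` and the sample values are hypothetical and only for illustration.

```python
def discretize(values, labels=("S", "M", "L")):
    """Map each numeric value to a label using equal-width bins.

    Assumption: equal-width binning; other schemes (equal-frequency,
    entropy-based) are equally plausible for this step.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / len(labels)
    result = []
    for v in values:
        if width == 0:
            # All values identical: put everything in the first bin.
            idx = 0
        else:
            # Clamp so that the maximum value falls in the last bin.
            idx = min(int((v - lo) / width), len(labels) - 1)
        result.append(labels[idx])
    return result


# Made-up petal lengths, not taken from the real Iris data.
petal_lengths = [1.0, 1.5, 4.0, 4.5, 6.0, 7.0]
print(discretize(petal_lengths))  # ['S', 'S', 'M', 'M', 'L', 'L']
```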
Now apply association rule mining to find patterns of
the form: <feature set> → class label
Rank rules first by confidence and then by support
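The mining and ranking step can be sketched as below. This is a brute-force illustration that enumerates every feature subset, not an efficient miner such as Apriori. The function name `mine_class_rules`, the thresholds, and the toy records are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def mine_class_rules(records, min_support=0.2, min_confidence=0.6):
    """Return rules (feature_set, label, confidence, support), ranked
    by confidence first and support second (both descending).

    support    = fraction of all records containing feature_set with label
    confidence = fraction of records containing feature_set that have label
    """
    n = len(records)
    itemset_count = Counter()  # how often each feature set occurs
    rule_count = Counter()     # how often it occurs with a given label
    for items, label in records:
        items = sorted(items)
        for k in range(1, len(items) + 1):
            for subset in combinations(items, k):
                itemset_count[subset] += 1
                rule_count[(subset, label)] += 1

    rules = []
    for (subset, label), count in rule_count.items():
        support = count / n
        confidence = count / itemset_count[subset]
        if support >= min_support and confidence >= min_confidence:
            rules.append((subset, label, confidence, support))

    rules.sort(key=lambda r: (-r[2], -r[3]))
    return rules


# Toy, made-up records: (set of discretized features, class label).
records = [
    ({"PL=S", "PW=S"}, "setosa"),
    ({"PL=S", "PW=S"}, "setosa"),
    ({"PL=L", "PW=M"}, "virginica"),
    ({"PL=L", "PW=L"}, "virginica"),
    ({"PL=L", "PW=M"}, "versicolor"),
]
rules = mine_class_rules(records)
for features, label, conf, sup in rules:
    print(f"{set(features)} -> {label}  conf={conf:.2f} sup={sup:.2f}")
```

To classify a new record with such a ranked list, one common choice is to apply the first rule whose feature set is contained in the record.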
Integration with Bayes Classifier
• The frequent itemsets generated by the frequent itemset
mining algorithm can be used as features and integrated
into a Bayes classifier.
• Suppose <f1,f2> is a frequent itemset among the transactions
of class 1 (C1).
• E.g., <f1,f2> appears in 20% of the transactions of C1 but
only 5% of the transactions of C2.
• Then <f1,f2> is a good candidate feature to try out in the
Bayes classifier.
• [This is part of the assignment]
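The 20% versus 5% comparison above can be reproduced with a small sketch that computes the support of an itemset within each class's transactions. The helper names `class_support` and `is_discriminative`, the ratio threshold, and the transactions are all hypothetical; the slides do not prescribe a selection criterion.

```python
def class_support(itemset, transactions):
    """Fraction of the given transactions that contain every item in itemset."""
    itemset = set(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def is_discriminative(s_target, s_other, min_ratio=2.0):
    """Assumed criterion: support in the target class is at least
    min_ratio times the support in the other class."""
    return s_target >= min_ratio * s_other


# Made-up transactions: 10 for class C1, 20 for class C2.
c1 = [{"f1", "f2"}] * 2 + [{"f1"}] * 3 + [{"f3"}] * 5
c2 = [{"f1", "f2"}] * 1 + [{"f2"}] * 9 + [{"f3"}] * 10

s1 = class_support({"f1", "f2"}, c1)  # 2 of 10 transactions
s2 = class_support({"f1", "f2"}, c2)  # 1 of 20 transactions
print(s1, s2, is_discriminative(s1, s2))
```

An itemset that passes such a check could then be encoded as a binary feature (present or absent in a transaction) for the Bayes classifier.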
Integration with Bayesian Classifier
• Suppose we have <SL=L,PW=M> as a frequent
feature for virginica.
• Should we also have <SL=L> and <PW=M> as
separate features?
• What are the pros and cons?