Hybrid Fuzzy-Rough Rule Induction and Feature Selection

Dr. Richard Jensen
Aberystwyth University, UK
rkj@aber.ac.uk
Dr. Chris Cornelis
Ghent University, Belgium
Chris.Cornelis@UGent.be
Prof Qiang Shen
Aberystwyth University, UK
qqs@aber.ac.uk
FUZZ-IEEE 2009
Richard Jensen, Chris Cornelis and Qiang Shen
Outline
• Introduction
• Rough set theory (RST)
• Fuzzy-rough set theory
• Proposed method: QuickRules
• Experimentation
• Conclusion
Introduction
• Rule induction has many advantages: e.g.
understandability, accuracy, adding prior
knowledge
• ... but also limitations: scaling, dealing with
noise, uncertainty...
• Pre-processing often used
Rough set theory
[Figure: a set A shown with its lower approximation and upper approximation, both built from equivalence classes Rx]
Rx is the set of all points that are indiscernible
with point x in terms of feature subset B
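The definitions above can be illustrated with a minimal sketch in plain Python. The toy data, the feature names and the helper names `equivalence_classes` and `approximations` are assumptions made for illustration; they do not come from the slides.

```python
from collections import defaultdict

def equivalence_classes(data, features):
    """Group object ids that are indiscernible on every feature in `features`."""
    groups = defaultdict(set)
    for obj_id, row in data.items():
        key = tuple(row[f] for f in features)
        groups[key].add(obj_id)
    return list(groups.values())

def approximations(data, features, target):
    """Return the (lower, upper) approximation of the set `target`."""
    lower, upper = set(), set()
    for eq_class in equivalence_classes(data, features):
        if eq_class <= target:   # class lies entirely inside the set
            lower |= eq_class
        if eq_class & target:    # class overlaps the set
            upper |= eq_class
    return lower, upper

# Hypothetical toy data: objects 1 and 2 are indiscernible.
data = {
    1: {"colour": "red", "size": "small"},
    2: {"colour": "red", "size": "small"},
    3: {"colour": "blue", "size": "large"},
    4: {"colour": "blue", "size": "small"},
}
A = {1, 3}
lower, upper = approximations(data, ["colour", "size"], A)
print(lower, upper)
```

Object 1 belongs to A but shares its equivalence class with object 2, which does not, so that class contributes only to the upper approximation.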
Discovering rules via RST
• Equivalence classes
• Form the antecedent part of a rule
• The lower approximation tells us if this is
predictive of a given concept (certain rules)
• Typically done in one of two ways:
• Overlaying reducts
• Building rules by considering individual
equivalence classes (e.g. LEM2)
• These require a discretization procedure
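As a rough illustration of how equivalence classes yield certain rules, the sketch below emits one rule for every equivalence class whose objects all share the same decision, i.e. every class contained in the lower approximation of a concept. This is deliberately the simplest reading of the bullets above; it is neither LEM2 nor a reduct-overlay method, and the data and the function name `certain_rules` are hypothetical.

```python
from collections import defaultdict

def certain_rules(rows, features, decision):
    """Return (antecedent, consequent) pairs for consistent equivalence classes."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[f] for f in features)].append(row[decision])
    rules = []
    for values, labels in groups.items():
        # A single decision label means the class is inside a lower approximation.
        if len(set(labels)) == 1:
            rules.append((dict(zip(features, values)), labels[0]))
    return rules

# Hypothetical, already discretized data.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
]
rules = certain_rules(rows, ["outlook", "windy"], "play")
for antecedent, consequent in rules:
    print("IF", antecedent, "THEN play =", consequent)
```

The last two rows agree on both condition features but disagree on the decision, so their equivalence class produces no certain rule.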
Fuzzy rough sets
[Figure: rough set and fuzzy rough set definitions side by side; the fuzzy rough set is defined using a t-norm and an implicator]
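The formulas on this slide did not survive extraction. The sketch below assumes the commonly used definitions, in which the fuzzy lower approximation of A at x is the infimum over y of I(R(x, y), A(y)) and the fuzzy upper approximation is the supremum over y of T(R(x, y), A(y)). The choice of the Łukasiewicz implicator and t-norm, the relation R and the fuzzy set A are all assumptions made for illustration.

```python
def lukasiewicz_implicator(a, b):
    """I(a, b) = min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)

def lukasiewicz_tnorm(a, b):
    """T(a, b) = max(0, a + b - 1)."""
    return max(0.0, a + b - 1.0)

def fuzzy_lower(R, A, x):
    """Degree to which every object similar to x belongs to A."""
    return min(lukasiewicz_implicator(R[x][y], A[y]) for y in range(len(A)))

def fuzzy_upper(R, A, x):
    """Degree to which some object similar to x belongs to A."""
    return max(lukasiewicz_tnorm(R[x][y], A[y]) for y in range(len(A)))

# Hypothetical fuzzy similarity relation (reflexive, symmetric) and fuzzy set.
R = [[1.0, 0.8, 0.1],
     [0.8, 1.0, 0.3],
     [0.1, 0.3, 1.0]]
A = [1.0, 0.5, 0.0]

for x in range(len(A)):
    print(x, fuzzy_lower(R, A, x), fuzzy_upper(R, A, x))
```

When R and A take only the values 0 and 1, these expressions reduce to the crisp lower and upper approximations.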
Fuzzy-rough sets
• Fuzzy-rough feature selection
• Evaluation: function based on fuzzy-rough lower
approximation
• Generation: greedy hill-climbing
• Stopping criterion: when maximal ‘goodness’ is
reached (or to degree α)
• The fuzzy tolerance classes used during
this process can be used to create fuzzy
rules
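The three ingredients listed above (evaluation, generation, stopping criterion) can be sketched as a greedy hill-climber. This is a minimal sketch under stated assumptions, not the authors' implementation: the per-feature similarity `1 - |a(x) - a(y)| / range`, the minimum t-norm for combining features, the Łukasiewicz implicator (so that I(r, 0) = 1 - r) and the crisp decision labels are all choices made here for illustration, as are the function names and the data.

```python
def similarity(values, i, j):
    """Assumed fuzzy similarity of objects i and j on one numeric feature."""
    spread = max(values) - min(values)
    if spread == 0:
        return 1.0
    return max(0.0, 1.0 - abs(values[i] - values[j]) / spread)

def dependency(X, y, subset):
    """Evaluation: mean membership to the fuzzy-rough positive region."""
    if not subset:
        return 0.0
    n = len(X)
    columns = {a: [row[a] for row in X] for a in subset}
    total = 0.0
    for i in range(n):
        pos = 1.0
        for j in range(n):
            if y[j] != y[i]:
                # Similarity on the whole subset, combined with the min t-norm.
                r = min(similarity(columns[a], i, j) for a in subset)
                pos = min(pos, 1.0 - r)   # Lukasiewicz I(r, 0) = 1 - r
        total += pos
    return total / n

def greedy_select(X, y, alpha=1.0):
    """Generation: add the best feature until alpha * maximal dependency is reached."""
    all_features = list(range(len(X[0])))
    target = alpha * dependency(X, y, all_features)
    selected, best = [], 0.0
    while best < target:
        candidates = [a for a in all_features if a not in selected]
        if not candidates:
            break
        gains = {a: dependency(X, y, selected + [a]) for a in candidates}
        choice = max(gains, key=gains.get)
        if gains[choice] <= best:         # no improvement: stop
            break
        selected.append(choice)
        best = gains[choice]
    return selected, best

# Hypothetical data: feature 0 separates the classes, feature 1 is noise.
X = [[1.0, 5.0], [2.0, 1.0], [8.0, 4.0], [9.0, 2.0]]
y = [0, 0, 1, 1]
selected, gamma = greedy_select(X, y)
print(selected, gamma)
```

The similarity degrees computed inside `dependency` are the memberships of the fuzzy tolerance classes that the last bullet refers to; a rule learner could reuse them rather than recompute them.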
QuickRules
Check
Experimentation
• 10-fold cross validation
• 6 fuzzy/rough set classifiers
• 5 non fuzzy/rough set classifiers
Conclusion
• Proposed a rule induction method based
on fuzzy-rough sets
• Based on fuzzy-rough feature selection, using
fuzzy tolerance classes
• Future work
• Post-processing
• Other search mechanisms (from FS literature)
• Other measures, e.g. VQRS positive region and
dependency
• WEKA implementations of all fuzzy-rough
classifiers and feature selectors can be
downloaded from: