Machine Learning Feature Selection
J.-S. Roger Jang (張智星)
CSIE Dept., National Taiwan University (台灣大學 資訊工程系)
http://mirlab.org/jang, jang@mirlab.org
Feature Selection: Goal & Benefits
Feature selection
• Also known as input selection
Goal
• To select a subset of the original feature set that gives a better recognition rate
Benefits
• Improve the recognition rate
• Reduce the computation load
• Explain the relationships between features and classes
Exhaustive Search
Steps for direct exhaustive search
1. Use KNNC (k-nearest-neighbor classifier) as the classifier and LOO (leave-one-out) cross-validation to estimate the RR (recognition rate).
2. Generate all combinations of features and evaluate them one by one.
3. Select the feature combination with the best RR.
Drawback
• d features ⇒ 2^d − 1 models to evaluate
• d = 10 ⇒ 2^10 − 1 = 1023 models to evaluate. Time consuming!
Advantage
• The optimal feature set can be identified.
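A minimal sketch of these steps, assuming scikit-learn's KNeighborsClassifier stands in for the KNNC and LeaveOneOut cross-validation gives the RR estimate (the Iris data is only a placeholder):

    from itertools import combinations

    from sklearn.datasets import load_iris
    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    knnc = KNeighborsClassifier(n_neighbors=1)

    best_rr, best_subset = 0.0, None
    d = X.shape[1]
    for k in range(1, d + 1):                     # every subset size
        for subset in combinations(range(d), k):  # 2^d - 1 subsets in total
            # LOO recognition rate of KNNC on this feature combination
            rr = cross_val_score(knnc, X[:, list(subset)], y,
                                 cv=LeaveOneOut()).mean()
            if rr > best_rr:
                best_rr, best_subset = rr, subset
    print(f"best subset: {best_subset}, LOO RR = {best_rr:.3f}")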
Exhaustive Search
Direct exhaustive search over d = 5 features:
• 1 input: x1, x2, x3, x4, x5
• 2 inputs: {x1, x2}, {x1, x3}, {x1, x4}, {x1, x5}, {x2, x3}, . . .
• 3 inputs: {x1, x2, x3}, {x1, x2, x4}, {x1, x2, x5}, {x1, x3, x4}, . . .
• 4 inputs: {x1, x2, x3, x4}, {x1, x2, x3, x5}, {x1, x2, x4, x5}, . . .
Exhaustive Search
Characteristics of exhaustive search for feature selection
• The process is time consuming, but the identified feature set is optimal.
• Classifiers other than KNNC can be used.
• Performance indices other than the LOO RR can be used.
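For instance, the LOO estimate in the sketch above can be swapped for ordinary k-fold cross-validation (an illustrative assumption, not a choice made on the slides):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    knnc = KNeighborsClassifier(n_neighbors=1)
    # 10-fold cross-validation in place of leave-one-out as the RR estimate
    rr = cross_val_score(knnc, X, y, cv=10).mean()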
Heuristic Search
Heuristic search for input selection
• One-pass ranking (see the sketch after this list)
• Sequential forward selection
• Generalized sequential forward selection
• Sequential backward selection
• Generalized sequential backward selection
• ‘Add m, remove n’ selection
• Generalized ‘add m, remove n’ selection
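A minimal sketch of the first of these, one-pass ranking, which scores each feature on its own by LOO RR and then sorts (same scikit-learn stand-ins as before):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    knnc = KNeighborsClassifier(n_neighbors=1)

    # evaluate each feature by itself, then rank by LOO recognition rate
    rr_of = [(f, cross_val_score(knnc, X[:, [f]], y, cv=LeaveOneOut()).mean())
             for f in range(X.shape[1])]
    ranking = sorted(rr_of, key=lambda t: t[1], reverse=True)
    print(ranking)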
Sequential Forward Selection
Steps for sequential forward selection
1. Use KNNC as the classifier, LOO for RR estimate
2. Select, as the first feature, the one that gives the best RR on its own.
3. Select the next feature (among all unselected features) that, together with the selected features, gives the best RR.
4. Repeat the previous step until all features are selected.
Advantage
• With d features, only d(d+1)/2 models need to be evaluated. A lot more efficient!
Drawback
• The selected features are not always optimal.
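A minimal sketch of these steps; sfs is a hypothetical helper name, and scikit-learn's KNeighborsClassifier and LeaveOneOut again stand in for the KNNC and the LOO RR estimate:

    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    def sfs(X, y):
        """Greedily add the feature that, together with those already
        selected, gives the best LOO recognition rate."""
        knnc = KNeighborsClassifier(n_neighbors=1)
        selected, remaining, history = [], list(range(X.shape[1])), []
        while remaining:
            rr_of = {f: cross_val_score(knnc, X[:, selected + [f]], y,
                                        cv=LeaveOneOut()).mean()
                     for f in remaining}
            best = max(rr_of, key=rr_of.get)
            selected.append(best)
            remaining.remove(best)
            history.append((tuple(selected), rr_of[best]))
        return history  # one (feature subset, LOO RR) pair per model size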
Sequential Forward Selection
Sequential forward selection (SFS), where x2 turns out to be the best single feature, then x4, then x3:
• 1 input: x1, x2, x3, x4, x5 → select x2
• 2 inputs: {x2, x1}, {x2, x3}, {x2, x4}, {x2, x5} → select {x2, x4}
• 3 inputs: {x2, x4, x1}, {x2, x4, x3}, {x2, x4, x5} → select {x2, x4, x3}
• 4 inputs: {x2, x4, x3, x1}, {x2, x4, x3, x5} → . . .
Example: Iris Dataset
[Figures: LOO RR of the selected feature combinations on the Iris dataset, via sequential forward selection vs. exhaustive search]
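With the sfs helper sketched on the previous slide, the Iris comparison can be reproduced along these lines (the exact indices and rates depend on the data and the KNNC settings):

    from sklearn.datasets import load_iris

    X, y = load_iris(return_X_y=True)
    for subset, rr in sfs(X, y):  # sfs as sketched earlier
        print(subset, f"LOO RR = {rr:.3f}")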
Example: Wine Dataset
• SFS: 3 selected features, LOO RR = 93.8%
• SFS with input normalization: 6 selected features, LOO RR = 97.8%
• Exhaustive search: 8 selected features, LOO RR = 99.4%
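Input normalization can be grafted onto the same sketch by wrapping the KNNC in a pipeline; here StandardScaler (z-score normalization) is an assumed choice, not necessarily the normalization used on the slide:

    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # scale each input to zero mean and unit variance before KNNC, so that
    # distances are not dominated by features with large numeric ranges;
    # inside cross_val_score the scaler is refit on each training fold
    knnc_norm = make_pipeline(StandardScaler(),
                              KNeighborsClassifier(n_neighbors=1))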
Use of Input Selection
Common use of input selection
• Increase the model complexity sequentially by adding more inputs
• Select the model that has the best test RR
Typical curve of error vs. model complexity
• Determine the model structure with the least test error
[Figure: training error decreases monotonically with model complexity (# of selected inputs), while test error first falls and then rises; the optimal structure sits at the minimum of the test error curve]
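A sketch of this practice, assuming a held-out test set and the sfs helper sketched earlier: SFS on the training data yields one nested feature subset per model size, and the subset with the best test RR is kept.

    from sklearn.datasets import load_wine
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_wine(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0, stratify=y)

    knnc = KNeighborsClassifier(n_neighbors=1)
    best_rr, best_subset = 0.0, None
    for subset, _ in sfs(X_train, y_train):  # nested subsets from SFS
        cols = list(subset)
        knnc.fit(X_train[:, cols], y_train)
        test_rr = knnc.score(X_test[:, cols], y_test)
        if test_rr > best_rr:  # keep the structure with the best test RR
            best_rr, best_subset = test_rr, subset
    print(f"selected inputs: {best_subset}, test RR = {best_rr:.3f}")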