Lei-SVM

A Support Vector Method for Multivariate Performance Measures
Author: Thorsten Joachims
(ICML’05)
Presenter: Lei Tang
Motivation
Current classifiers focus on error rate.
How can we optimize directly for other performance measures,
such as precision, recall, and F-measure?
Existing Approach
Accurately estimate the probability of
class membership for each example.
(Difficult)
Optimize different tractable variants of the measure. But
for non-linear measures (e.g. F-measure),
extensive cross-validation (CV) is required.
Directly optimize the measure, as has been done for
ROCArea. But no such method exists for F-measure.
Reformulation
Given training examples S
and test examples S’, our goal is to minimize the expected loss on S’.
Decompose the loss function linearly:
Empirical loss:
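For reference, the quantities named on this slide can be written as follows. This is a sketch in the notation of Joachims (ICML'05); the symbols h, Δ, δ, n and n' are taken from that paper and are assumptions as far as these slides are concerned.

```latex
% Training sample and the risk of a hypothesis h on a new sample S' of size n'
S = \big((x_1, y_1), \ldots, (x_n, y_n)\big), \qquad
R^{\Delta}(h) = \int \Delta\big((h(x'_1), \ldots, h(x'_{n'})),\, (y'_1, \ldots, y'_{n'})\big)\, d\Pr(S')

% Special case: the loss decomposes linearly into a per-example loss \delta
\Delta\big((h(x'_1), \ldots, h(x'_{n'})),\, (y'_1, \ldots, y'_{n'})\big)
  = \frac{1}{n'} \sum_{i=1}^{n'} \delta\big(h(x'_i),\, y'_i\big)

% Empirical loss on the training sample
\hat{R}^{\Delta}_{S}(h) = \Delta\big((h(x_1), \ldots, h(x_n)),\, (y_1, \ldots, y_n)\big)
```

Error rate is the linearly decomposable case; F-measure is not, which is why the loss has to be defined on the whole sample at once.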
SVM
Original SVM:
Multivariate SVM:
Here, Ψ(x, y) is a function that returns a feature vector of (x, y).
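The two training problems named above can be sketched as follows, in the form given in Joachims (ICML'05). The unbiased hyperplane (no offset term) and the symbols C, ξ, Δ and the bar notation for whole-sample vectors are taken from that paper, not from these slides.

```latex
% Original (univariate) SVM: one slack variable and one constraint per example
\min_{w,\, \xi \ge 0} \;\; \frac{1}{2}\, w \cdot w + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad \forall i:\; y_i\, (w \cdot x_i) \ge 1 - \xi_i

% Multivariate SVM: a single slack variable, one constraint per wrong label vector
\min_{w,\, \xi \ge 0} \;\; \frac{1}{2}\, \lVert w \rVert^2 + C\, \xi
\quad \text{s.t.} \quad \forall \bar{y}' \in \bar{\mathcal{Y}} \setminus \bar{y}:\;
w^{T}\big[\Psi(\bar{x}, \bar{y}) - \Psi(\bar{x}, \bar{y}')\big] \ge \Delta(\bar{y}', \bar{y}) - \xi
```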
Prediction:
Problems
Too many constraints!
With N samples and k class labels, |Y| = k^N.
Do we really need to include all the
constraints?
Algorithm
Constraint Selection
Contingency Table
Still impractical! We have to calculate the argmax over all label assignments.
Contingency table
With N samples, how many different tables are there?
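One way to answer the question, sketched for binary labels: a table (a, b, c, d) is fixed by the number of true positives a and false positives b, since c and d follow from the class counts. That gives (n_pos + 1)(n_neg + 1) tables, which is at most quadratic in N, against 2^N label vectors. The cell naming (a = true positives, b = false positives, c = false negatives, d = true negatives) follows the paper; the function name is hypothetical.

```python
def contingency_tables(labels):
    """List every reachable table (a, b, c, d) for a binary label vector.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    n_pos = sum(1 for y in labels if y == +1)
    n_neg = len(labels) - n_pos
    return [
        (a, b, n_pos - a, n_neg - b)
        for a in range(n_pos + 1)
        for b in range(n_neg + 1)
    ]


labels = [+1, +1, -1, -1, -1]
tables = contingency_tables(labels)
print(len(tables), "tables versus", 2 ** len(labels), "label vectors")
```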
Algorithm for argmax
Given a table,
what should the
assignment be?
Exhaustively search all possible
contingency tables and take the
maximum.
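The search described above can be sketched as follows, assuming binary labels and the linear feature map Ψ(x, y) = Σ y_i x_i, so that each example contributes y_i · s_i with score s_i = w · x_i. For a fixed table the loss is constant, so the best assignment labels the a highest-scoring positives and the b highest-scoring negatives as +1. The loss used here (1 − F1, with loss 0 when the table has no positives at all) and all function names are assumptions for illustration.

```python
from itertools import product


def f1_loss(a, b, c, d):
    """1 - F1 from table counts; defined as 0 when 2a + b + c == 0 (assumption)."""
    denom = 2 * a + b + c
    return 0.0 if denom == 0 else 1.0 - 2.0 * a / denom


def table_of(y_pred, y_true):
    """Contingency counts: a=TP, b=FP, c=FN, d=TN."""
    pairs = list(zip(y_pred, y_true))
    a = sum(1 for p, t in pairs if p == 1 and t == 1)
    b = sum(1 for p, t in pairs if p == 1 and t == -1)
    c = sum(1 for p, t in pairs if p == -1 and t == 1)
    d = sum(1 for p, t in pairs if p == -1 and t == -1)
    return a, b, c, d


def objective(y_pred, y_true, scores, loss):
    """Loss of the candidate labeling plus its linear score sum_i y_i * s_i."""
    return loss(*table_of(y_pred, y_true)) + sum(p * s for p, s in zip(y_pred, scores))


def argmax_by_tables(y_true, scores, loss):
    """Search over tables (a, b) instead of over all 2^n label vectors."""
    pos = sorted((i for i, t in enumerate(y_true) if t == 1), key=lambda i: -scores[i])
    neg = sorted((i for i, t in enumerate(y_true) if t == -1), key=lambda i: -scores[i])
    best_val, best_y = float("-inf"), None
    for a in range(len(pos) + 1):
        for b in range(len(neg) + 1):
            y = [-1] * len(y_true)
            for i in pos[:a] + neg[:b]:  # top-a positives and top-b negatives get +1
                y[i] = 1
            val = objective(y, y_true, scores, loss)
            if val > best_val:
                best_val, best_y = val, y
    return best_val, best_y


def argmax_bruteforce(y_true, scores, loss):
    """Reference value: maximum of the same objective over every label vector."""
    return max(objective(list(y), y_true, scores, loss)
               for y in product((-1, 1), repeat=len(y_true)))


y_true = [1, 1, -1, -1, 1, -1]
scores = [0.4, -0.3, 0.2, -0.9, -0.1, 0.05]
best_val, best_y = argmax_by_tables(y_true, scores, f1_loss)
```

This version recomputes the score sum for every table, so it runs in cubic time; maintaining running sums while a and b grow would bring it down to quadratic time.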
Various Loss
F-measure:
Precision/Recall at k
(just look at the top k data points)
Precision/Recall Break-Even Point
The search space is reduced because
a + b = a + c, i.e. b = c
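The measures listed above can all be computed from the contingency table, and the constraints on a, b, c shrink the set of tables to search. The sketch below assumes a = true positives, b = false positives, c = false negatives, d = true negatives, and returns 0 when a ratio is undefined; both the zero convention and the function names are assumptions. A loss would then be 1 minus the score, up to scaling.

```python
def precision(a, b, c, d):
    """Fraction of predicted positives that are correct."""
    return a / (a + b) if a + b else 0.0


def recall(a, b, c, d):
    """Fraction of actual positives that are found."""
    return a / (a + c) if a + c else 0.0


def f_beta(a, b, c, d, beta=1.0):
    """F-measure: (1 + beta^2) a / ((1 + beta^2) a + b + beta^2 c)."""
    num = (1 + beta ** 2) * a
    den = num + b + beta ** 2 * c
    return num / den if den else 0.0


def all_tables(n_pos, n_neg):
    """Every table reachable with n_pos positives and n_neg negatives."""
    return [(a, b, n_pos - a, n_neg - b)
            for a in range(n_pos + 1) for b in range(n_neg + 1)]


def tables_at_k(n_pos, n_neg, k):
    """Precision/recall at k: exactly k examples are predicted positive."""
    return [t for t in all_tables(n_pos, n_neg) if t[0] + t[1] == k]


def tables_prbep(n_pos, n_neg):
    """Break-even point: a + b == a + c, i.e. b == c."""
    return [t for t in all_tables(n_pos, n_neg) if t[1] == t[2]]


print(len(all_tables(3, 4)), len(tables_at_k(3, 4, 2)), len(tables_prbep(3, 4)))
```

With b = c, precision and recall coincide, which is what makes the constraint a break-even condition.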