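A Support Vector Method for Multivariate Performance Measures
Author: Thorsten Joachims (ICML’05)
Presenter: Lei Tang

Motivation
- Current classifiers focus on error rate. How can we optimize directly for different performance measures?
- Precision, recall, F-measure, etc.

Existing Approaches
- Accurately estimate the probability of class membership for each example. (Difficult)
- Optimize different tractable variants of the measure. But for a non-linear measure (e.g. the F-measure), extensive cross-validation is required.
- Directly optimize the measure, as has been done for ROCArea. But nothing comparable exists for the F-measure.

Reformulation
- Given training examples $S = ((x_1, y_1), \ldots, (x_n, y_n))$ and test examples $S'$ of size $n'$, our goal is to minimize the expected loss
  $R^{\Delta}(h) = \int \Delta\big((h(x'_1), \ldots, h(x'_{n'})),\ (y'_1, \ldots, y'_{n'})\big)\, d\Pr(S')$
- Decompose the loss function linearly:
  $\Delta\big((h(x'_1), \ldots, h(x'_{n'})),\ (y'_1, \ldots, y'_{n'})\big) = \frac{1}{n'} \sum_{i=1}^{n'} \delta(h(x'_i), y'_i)$
- Empirical loss:
  $\hat{R}^{\Delta}_{S}(h) = \Delta\big((h(x_1), \ldots, h(x_n)),\ (y_1, \ldots, y_n)\big)$

SVM
- Original SVM:
  $\min_{w,\ \xi \ge 0}\ \frac{1}{2}\, w \cdot w + C \sum_{i=1}^{n} \xi_i \quad \text{s.t.}\quad \forall i:\ y_i\,(w \cdot x_i) \ge 1 - \xi_i$
- Multivariate SVM:
  $\min_{w,\ \xi \ge 0}\ \frac{1}{2}\, \|w\|^2 + C\,\xi \quad \text{s.t.}\quad \forall \bar{y}' \in \bar{\mathcal{Y}} \setminus \bar{y}:\ w^{T}\big[\Psi(\bar{x}, \bar{y}) - \Psi(\bar{x}, \bar{y}')\big] \ge \Delta(\bar{y}', \bar{y}) - \xi$
- Here, $\Psi(\bar{x}, \bar{y}')$ is a function that returns a feature vector describing the match between the inputs $\bar{x} = (x_1, \ldots, x_n)$ and the labels $\bar{y}' = (y'_1, \ldots, y'_n)$.
- Prediction:
  $h_w(\bar{x}) = \arg\max_{\bar{y}' \in \bar{\mathcal{Y}}}\ w^{T}\, \Psi(\bar{x}, \bar{y}')$

Problems
- Too many constraints!
- With N samples and k class labels, $|\bar{\mathcal{Y}}| = k^N$.
- Do we really need to include all the constraints?

Algorithm
- Constraint selection: instead of including every constraint, repeatedly add only the most violated one.

Contingency Table
- Still impractical! To find the most violated constraint we have to calculate
  $\arg\max_{\bar{y}' \in \bar{\mathcal{Y}}}\ \big\{ \Delta(\bar{y}', \bar{y}) + w^{T}\, \Psi(\bar{x}, \bar{y}') \big\}$
- Contingency table: a = true positives, b = false positives, c = false negatives, d = true negatives.
- With N samples, how many different tables? For binary labels, only $O(N^2)$.

Algorithm for argmax
- Given a table, what should the assignment be?
- Exhaustively search all possible contingency tables and take the maximum.

Various Losses
- F-measure: $F_1 = \frac{2a}{2a + b + c}$
- Precision/Recall at k (just look at the top k data points)
- Precision/Recall Break-Even Point: the search space is reduced because $a + b = a + c$, i.e. $b = c$.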
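The argmax search over contingency tables described under "Algorithm for argmax" can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's implementation: it assumes binary labels in {+1, -1}, a linear feature map so that the score of a labeling is the sum of y'_i times w·x_i, and a loss of 1 - F1. The names `f1_loss` and `argmax_over_tables` are hypothetical. For a fixed table (a, d), the best assignment labels the a highest-scoring true positives as +1 and the d lowest-scoring true negatives as -1.

```python
def f1_loss(a, b, c, d):
    """1 - F1 from contingency counts: a=TP, b=FP, c=FN, d=TN."""
    denom = 2 * a + b + c
    # Assumed convention: with no positives and none predicted, the loss is 0.
    return 0.0 if denom == 0 else 1.0 - 2.0 * a / denom


def argmax_over_tables(scores, y, loss=f1_loss):
    """Maximize loss(y', y) + sum_i y'_i * scores[i] over labelings y'.

    scores[i] stands for w . x_i (assumed precomputed); y[i] is +1 or -1.
    Returns (best_value, best_labels).
    """
    # Positives sorted by descending score, negatives by ascending score.
    pos = sorted((i for i, t in enumerate(y) if t == 1), key=lambda i: -scores[i])
    neg = sorted((i for i, t in enumerate(y) if t == -1), key=lambda i: scores[i])

    best_value, best_labels = float("-inf"), None
    for a in range(len(pos) + 1):          # number of true positives
        for d in range(len(neg) + 1):      # number of true negatives
            c = len(pos) - a               # false negatives
            b = len(neg) - d               # false positives
            labels = [0] * len(y)
            for rank, i in enumerate(pos):
                labels[i] = 1 if rank < a else -1
            for rank, i in enumerate(neg):
                labels[i] = -1 if rank < d else 1
            value = loss(a, b, c, d) + sum(l * s for l, s in zip(labels, scores))
            if value > best_value:
                best_value, best_labels = value, labels
    return best_value, best_labels


if __name__ == "__main__":
    demo_scores = [0.5, -0.2, 0.1, -0.7, 0.3]   # made-up values of w . x_i
    demo_y = [1, 1, -1, -1, 1]
    print(argmax_over_tables(demo_scores, demo_y))
```

This sketch recomputes the linear term for every table, so it runs in cubic time in the number of samples; updating the sum incrementally as a and d change would bring it down to quadratic time, matching the number of tables.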