Online Knowledge-Based Support Vector Machines

Gautam Kunapuli (1), Kristin P. Bennett (2), Amina Shabbeer (2), Richard Maclin (3) and Jude W. Shavlik (1)
(1) University of Wisconsin-Madison, USA; (2) Rensselaer Polytechnic Institute, USA; (3) University of Minnesota, Duluth, USA

ECML 2010, Barcelona, Spain

Outline
• Knowledge-Based Support Vector Machines
• The Adviceptron: Online KBSVMs
• A Real-World Task: Diabetes Diagnosis
• A Real-World Task: Tuberculosis Isolate Classification
• Conclusions

Knowledge-Based SVMs
• Introduced by Fung et al (2003)
• Allow the incorporation of expert advice into SVM formulations
• Advice is specified with respect to polyhedral regions in input (feature) space:

    (feature7 ≥ 5) ∧ (feature12 ≤ 4) ⇒ (class = +1)
    (feature2 ≤ −3) ∧ (feature3 ≤ 4) ∧ (feature10 ≥ 0) ⇒ (class = −1)
    (3·feature6 + 5·feature8 ≥ 2) ∧ (feature11 ≤ −3) ⇒ (class = +1)

• Advice can be incorporated into the SVM formulation as constraints, using advice variables

Knowledge-Based SVMs
In classic SVMs, we have T labeled data points (x_t, y_t), t = 1, …, T, and we learn a linear classifier w'x − b = 0 separating Class A (y = +1) from Class B (y = −1). The standard SVM formulation trades off regularization and loss:

    min_{w,b,ξ}  ½‖w‖² + λ e'ξ
    sub. to  Y(Xw − be) + ξ ≥ e,  ξ ≥ 0.

Knowledge-Based SVMs
We assume an expert provides polyhedral advice of the form

    Dx ≤ d ⇒ w'x ≥ b.

We can transform the logical constraint above using advice variables u:

    D'u + w = 0,  −d'u − b ≥ 0,  u ≥ 0.

These constraints are added to the standard formulation to give knowledge-based SVMs.

Knowledge-Based SVMs
In general, there are m advice sets, each with label z_i = ±1 for advice belonging to Class A or Class B:

    D_i x ≤ d_i ⇒ z_i (w'x − b) ≥ 0.

Each advice set adds the following constraints to the SVM formulation:

    D_i'u^i + z_i w = 0,  −d_i'u^i − z_i b ≥ 0,  u^i ≥ 0.

Knowledge-Based SVMs
The batch KBSVM formulation introduces advice slack variables to soften the advice constraints:

    min  ½‖w‖² + λ e'ξ + μ Σ_{i=1..m} (e'η^i + ζ_i)
    s.t.  Y(Xw − be) + ξ ≥ e,  ξ ≥ 0,
          D_i'u^i + z_i w + η^i = 0,
          −d_i'u^i − z_i b + ζ_i ≥ 1,
          u^i, η^i, ζ_i ≥ 0,  i = 1, …, m.

Outline
• Knowledge-Based Support Vector Machines
• The Adviceptron: Online KBSVMs
• A Real-World Task: Diabetes Diagnosis
• A Real-World Task: Tuberculosis Isolate Classification
• Conclusions

Online KBSVMs
• We need to derive an online version of KBSVMs
• The algorithm is provided with the advice, and with one labeled data point at each round
• At each step, the algorithm should update the hypothesis, w^t, as well as the advice vectors, u^{i,t}

Passive-Aggressive Algorithms
• We adopt the framework of passive-aggressive algorithms (Crammer et al, 2006), where at each round, when a new data point is given:
  – if loss = 0, there is no update (passive)
  – if loss > 0, the weights are updated to minimize the loss (aggressive)
• Why passive-aggressive algorithms?
  – readily applicable to most SVM losses
  – possible to derive elegant, closed-form update rules
  – simple rules provide fast updates; scalable
  – performance can be analyzed by deriving regret bounds

Online KBSVMs
• There are m advice sets, (D_i, d_i, z_i), i = 1, …, m
• At round t, the algorithm receives (x_t, y_t)
• The current hypothesis is w^t, and the current advice variables are u^{i,t}, i = 1, …, m

At round t, the formulation for deriving an update is

    min_{w, ξ, u^i, η^i, ζ_i}  ½‖w − w^t‖² + ½ Σ_{i=1..m} ‖u^i − u^{i,t}‖² + (λ/2) ξ² + (μ/2) Σ_{i=1..m} (‖η^i‖² + ζ_i²)
    subject to  y_t w'x_t − 1 + ξ ≥ 0,
                D_i'u^i + z_i w + η^i = 0,
                −d_i'u^i − 1 + ζ_i ≥ 0,      i = 1, …, m.
                u^i ≥ 0,

Formulation At The t-th Round
• The first two objective terms are proximal terms for the hypothesis and the advice vectors
• The ξ term is the data loss; the η and ζ terms are the advice loss
• The current hypothesis is w^t, and the current advice
variables are u^{i,t}, i = 1, …, m
• λ and μ are the tradeoff parameters
• The inequality constraints make deriving a closed-form update impossible

Decompose Into m+1 Sub-problems
• First sub-problem: update the hypothesis by fixing the advice variables to their values at the t-th iteration, u^i = u^{i,t}
• Some objective terms and constraints drop out of the formulation

Deriving The Hypothesis Update
• First sub-problem: update
the hypothesis by fixing the advice vectors, u^i = u^{i,t}:

    min_{w, ξ, η^i}  ½‖w − w^t‖² + (λ/2) ξ² + (μ/2) Σ_{i=1..m} ‖η^i‖²
    subject to  y_t w'x_t − 1 + ξ ≥ 0,                          (α)
                D_i'u^{i,t} + z_i w + η^i = 0,  i = 1, …, m.    (β^i)

• In each equality constraint, the fixed term −z_i D_i'u^{i,t} is the advice-estimate of the hypothesis according to the i-th advice set; denote it r^{i,t}
• Average the advice-estimates over all m advice vectors and denote r_t = (1/m) Σ_{i=1..m} r^{i,t}

The Hypothesis Update
For λ, μ > 0 and advice-estimate r_t, the hypothesis update is

    w^{t+1} = ν (w^t + α_t y_t x_t) + (1 − ν) r_t,
    α_t = (1/λ + ν‖x_t‖²)^{−1} · max(1 − ν y_t (w^t)'x_t − (1 − ν) y_t r_t'x_t, 0).

• The update is a convex combination of the standard passive-aggressive update and the average advice-estimate
• The parameter of the convex combination
is ν = 1/(1 + mμ)
• The update weight α_t depends on the hinge loss computed with respect to a composite weight vector that is a convex combination of the current hypothesis and the average advice-estimate

Deriving The Advice Updates
• Second sub-problem: update the advice vectors by fixing the hypothesis, w = w^{t+1}
• Some constraints and objective terms drop out of the formulation:

    min_{u^i, η^i, ζ_i}  ½ Σ_{i=1..m} ‖u^i − u^{i,t}‖² + (μ/2) Σ_{i=1..m} (‖η^i‖² + ζ_i²)
    subject to  D_i'u^i + z_i w^{t+1} + η^i = 0,
                −d_i'u^i − 1 + ζ_i ≥ 0,      i = 1, …, m.
                u^i ≥ 0,

• This problem splits into m sub-problems

Deriving The i-th Advice Update
• For each of the m sub-problems, update the i-th advice vector by fixing the hypothesis:

    min_{u^i, η^i, ζ_i}  ½‖u^i − u^{i,t}‖² + (μ/2)(‖η^i‖² + ζ_i²)
    subject to  D_i'u^i + z_i w^{t+1} + η^i = 0,   (β^i)
                −d_i'u^i − 1 + ζ_i ≥ 0,            (γ_i)
                u^i ≥ 0.                           (τ_i)

• The cone constraints u^i ≥ 0 are still complicating; we cannot derive a closed-form solution
• Use a projected-gradient approach:
  – drop the cone constraints to compute an intermediate closed-form update
  – project the intermediate update back onto the cone constraints

The m Advice Updates
For μ > 0 and current hypothesis w^{t+1}, for each advice set, i = 1, …, m, the update rule is

    u^{i,t+1} = max(u^{i,t} + D_i β^i − d_i γ_i, 0),

    [β^i; γ_i] = H_i^{−1} g_i,  where
    H_i = [ −(D_i'D_i + (1/μ) I_n)    D_i'd_i
             d_i'D_i                 −(d_i'd_i + 1/μ) ],
    g_i = [ D_i'u^{i,t} + z_i w^{t+1}
            −d_i'u^{i,t} − 1 ].

• The max(·, 0) is the projection onto the cone constraints
• The ratio s^i = β^i/γ_i is the hypothesis-estimate of the advice; the update is the error, i.e., the amount of violation of the constraint D_i x ≤ d_i by an ideal data point s^i
• Each advice update depends on the newly updated hypothesis

The Adviceptron
Algorithm 1: The Passive-Aggressive Adviceptron
 1: input: data (x_t, y_t), t = 1, …, T; advice sets (D_i, d_i, z_i), i = 1, …, m; parameters λ, μ > 0
 2: initialize: u^{i,1} = 0, w^1 = 0
 3: let: ν = 1/(1 + mμ)
 4: for (x_t, y_t) do
 5:   predict the label ŷ_t = sign((w^t)'x_t)
 6:   receive the correct label y_t
 7:   suffer the loss ℓ_t = max(1 − ν y_t (w^t)'x_t − (1 − ν) y_t r_t'x_t, 0)
 8:   update the hypothesis using u^{i,t}: α_t = ℓ_t / (1/λ + ν‖x_t‖²); w^{t+1} =
ν(w^t + α_t y_t x_t) + (1 − ν) r_t
 9:   update the advice variables using w^{t+1}: (β^i; γ_i) = H_i^{−1} g_i; u^{i,t+1} = (u^{i,t} + D_i β^i − d_i γ_i)_+
10: end for

Outline
• Knowledge-Based Support Vector Machines
• The Adviceptron: Online KBSVMs
• A Real-World Task: Diabetes Diagnosis
• A Real-World Task: Tuberculosis Isolate Classification
• Conclusions

Diagnosing Diabetes
• Standard data set from the UCI repository (768 × 8)
  – all patients are at least 21 years old and of Pima Indian heritage
  – features include body mass index and blood glucose level
• Expert advice for diagnosing diabetes from the NIH website on risks for Type-2 diabetes:
  – a person who is obese (characterized by BMI > 30) and has a high blood glucose level (> 126) is at a strong risk for diabetes:
      (BMI ≥ 30) ∧ (bloodglucose ≥ 126) ⇒ diabetes
  – a person who is at normal weight (BMI < 25) and has a low blood glucose level (< 100) is at a low risk for diabetes:
      (BMI ≤ 25) ∧ (bloodglucose ≤ 100) ⇒ ¬diabetes

Diagnosing Diabetes: Results
• 200 examples for training, the remainder for testing
• Results averaged over 20 randomized iterations
• Compared to advice-free online algorithms:
  – passive-aggressive (Crammer et al, 2006)
  – ROMMA (Li & Long, 2002)
  – max-margin perceptron (Freund & Schapire, 1999)
[Figure: learning curves; the baseline shown is the batch KBSVM trained on the 200 points.]

Outline
• Knowledge-Based Support Vector Machines
• The Adviceptron: Online KBSVMs
• A Real-World Task: Diabetes Diagnosis
• A Real-World Task: Tuberculosis Isolate Classification
• Conclusions

Tuberculosis Isolate Classification
• The task is to classify strains of the Mycobacterium tuberculosis complex (MTBC) into major genetic lineages based on DNA fingerprints
• MTBC is the causative agent of TB
  – a leading cause of disease and morbidity
  – strains vary in infectivity, transmission, virulence, immunogenicity and host associations depending on genetic lineage
• Lineage classification is crucial for surveillance, tracking and control of TB worldwide

Tuberculosis Isolate Classification
• Two types of DNA fingerprints for all culture-positive TB strains collected in the US by the CDC (44 data features)
• Six major lineages (classes) of TB for classification
  – ancestral: M. bovis, M. africanum, Indo-Oceanic
  – modern: Euro-American, East-Asian, East-African-Indian
• The problem is formulated as six 1-vs-many classification tasks

  Class                | # isolates | # pieces of positive advice | # pieces of negative advice
  East-Asian           |       4924 |              1              |              1
  East-African-Indian  |       1469 |              2              |              4
  Euro-American        |      25161 |              1              |              2
  Indo-Oceanic         |       5309 |              5              |              5
  M. africanum         |        154 |              1              |              3
  M. bovis             |        693 |              1              |              3

Expert Rules for TB Lineage Classification
[Figure: expert decision tree over fingerprint features (e.g., the split MIRU24 > 1 vs. MIRU24 ≤ 1) separating East-Asian, East-African-Indian, Indo-Oceanic, Euro-American, M. bovis and M. africanum.]
Rules provided by Dr. Lauren Cowan at the Centers for Disease Control, documented in Shabbeer et al (2010)

TB Results: Might Need Fewer Examples To Converge With Advice
[Figures: Euro-American vs. the rest and M. africanum vs. the rest; the batch KBSVM is shown as a baseline.]

TB Results: Can Converge To A Better Solution With Advice
[Figures: East-African-Indian vs. the rest and Indo-Oceanic vs. the rest; the batch KBSVM is shown as a baseline.]

TB Results: Possible To Still Learn Well With Only Advice
[Figures: East-Asian vs. the rest and M. bovis vs.
the rest; the batch KBSVM is shown as a baseline.]

Outline
• Knowledge-Based Support Vector Machines
• The Adviceptron: Online KBSVMs
• A Real-World Task: Diabetes Diagnosis
• A Real-World Task: Tuberculosis Isolate Classification
• Conclusions And Questions

Conclusions
• A new online learning algorithm: the adviceptron
• Makes use of prior knowledge in the form of (possibly imperfect) polyhedral advice
• Performs simple, closed-form updates via the passive-aggressive framework; scalable
• Good advice can help the learner converge to a better solution with fewer examples
• Encouraging empirical results on two important real-world tasks

References
(Fung et al, 2003) G. Fung, O. L. Mangasarian, and J. W. Shavlik. Knowledge-based support vector classifiers. In S. Becker, S. Thrun and K. Obermayer, eds, NIPS 15, pp. 521–528, 2003.
(Crammer et al, 2006) K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, and Y. Singer. Online passive-aggressive algorithms. J. of Mach. Learn. Res., 7:551–585, 2006.
(Freund and Schapire, 1999) Y. Freund and R. E. Schapire. Large margin classification using the perceptron algorithm. Mach. Learn., 37(3):277–296, 1999.
(Li and Long, 2002) Y. Li and P. M. Long. The relaxed online maximum margin algorithm. Mach. Learn., 46(1–3):361–387, 2002.
(Shabbeer et al, 2010) A. Shabbeer, L. Cowan, J. R. Driscoll, C. Ozcaglar, S. L. Vandenberg, B. Yener, and K. P. Bennett. TB-Lineage: an online tool for classification and analysis of strains of Mycobacterium tuberculosis complex. Unpublished manuscript, 2010.

Acknowledgements. The authors would like to thank Dr. Lauren Cowan of the Centers for Disease Control (CDC) for providing the TB dataset and the expert-defined rules for lineage classification. We gratefully acknowledge the support of DARPA under grant HR0011-07-C-0060 and the NIH under grant 1-R01-LM009731-01.
Views and conclusions contained in this document are those of the authors and do not necessarily represent the official opinion or policies, either expressed or implied, of the US government or of DARPA.

ECML 2010, Barcelona, Spain

KBSVMs: Deriving The Advice Constraints
We assume an expert provides polyhedral advice of the form

    Dx ≤ d ⇒ w'x ≥ b.

We know that p ⇒ q is equivalent to ¬p ∨ q, so the implication holds exactly when its negation, p ∧ ¬q, has no solution; that is, when the homogenized system

    Dx − dτ ≤ 0,  w'x − bτ < 0,  −τ < 0

has no solution (x, τ).

KBSVMs: Deriving The Advice Constraints
If the system above has no solution (x, τ), then by Motzkin's theorem of the alternative, the system

    D'u + w = 0,  −d'u − b ≥ 0,  u ≥ 0

has a solution u.
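The closed-form updates in Algorithm 1 amount to a few lines of linear algebra. Below is a minimal NumPy sketch of one adviceptron round (hypothesis update, then advice update), not the authors' code: the helper names (`hypothesis_update`, `advice_update`), the toy two-feature advice set, and the parameter values are illustrative assumptions made for this example.

```python
import numpy as np

def hypothesis_update(w, x, y, r_bar, lam, mu, m):
    """Hypothesis update (Algorithm 1, lines 7-8): a convex combination of the
    passive-aggressive step and the average advice-estimate r_bar."""
    nu = 1.0 / (1.0 + m * mu)
    # hinge loss w.r.t. the composite vector nu*w + (1 - nu)*r_bar
    loss = max(1.0 - nu * y * (w @ x) - (1.0 - nu) * y * (r_bar @ x), 0.0)
    alpha = loss / (1.0 / lam + nu * (x @ x))
    return nu * (w + alpha * y * x) + (1.0 - nu) * r_bar

def advice_update(u, D, d, z, w_new, mu):
    """Advice-vector update: solve [beta; gamma] = H^{-1} g for the intermediate
    step, then project back onto the cone u >= 0."""
    n = D.shape[1]
    H = np.block([
        [-(D.T @ D + np.eye(n) / mu), (D.T @ d)[:, None]],
        [(d @ D)[None, :], np.array([[-(d @ d + 1.0 / mu)]])],
    ])
    g = np.concatenate([D.T @ u + z * w_new, [-(d @ u) - 1.0]])
    sol = np.linalg.solve(H, g)
    beta, gamma = sol[:n], sol[n]
    return np.maximum(u + D @ beta - d * gamma, 0.0)

# Toy run with a single advice set over two features: "x1 >= 1 => class +1",
# encoded as Dx <= d with D = [-1 0], d = [-1], z = +1 (an illustrative choice).
lam, mu, m = 1.0, 1.0, 1
D = np.array([[-1.0, 0.0]])
d = np.array([-1.0])
z = 1.0
w = np.zeros(2)
u = np.zeros(1)
for x, y in [(np.array([2.0, 0.5]), 1.0), (np.array([-1.0, 0.3]), -1.0)]:
    r_bar = -z * (D.T @ u)   # advice-estimate of the hypothesis
    w = hypothesis_update(w, x, y, r_bar, lam, mu, m)
    u = advice_update(u, D, d, z, w, mu)
print(w, u)
```

Note that H depends only on the advice set and μ, so in a longer run it could be factored once per advice set, which is what makes the per-round cost of the advice update modest.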