Online Knowledge-Based
Support Vector Machines
Gautam Kunapuli1, Kristin P. Bennett2,
Amina Shabbeer2, Richard Maclin3 and
Jude W. Shavlik1
1University of Wisconsin-Madison, USA
2Rensselaer Polytechnic Institute, USA
3University of Minnesota, Duluth, USA
ECML 2010, Barcelona, Spain
Outline
• Knowledge-Based Support Vector Machines
• The Adviceptron: Online KBSVMs
• A Real-World Task: Diabetes Diagnosis
• A Real-World Task: Tuberculosis Isolate Classification
• Conclusions
Knowledge-Based SVMs
• Introduced by Fung et al. (2003)
• Allows incorporation of expert advice into
SVM formulations
• Advice is specified with respect to polyhedral
regions in input (feature) space
(feature7 ≥ 5) ∧ (feature12 ≤ 4) ⇒ (class = +1)
(feature2 ≤ −3) ∧ (feature3 ≤ 4) ∧ (feature10 ≥ 0) ⇒ (class = −1)
(3·feature6 + 5·feature8 ≥ 2) ∧ (feature11 ≤ −3) ⇒ (class = +1)
• Can be incorporated into the SVM formulation as constraints using advice variables (a sketch encoding the first rule follows below)
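As an illustration (added here, not on the original slide), the first rule above can be encoded as a polyhedral region Dx ≤ d with label z by rewriting each conjunct as a "≤" row; the feature count and indexing below are hypothetical:

```python
import numpy as np

n = 13  # hypothetical number of features

# Rule: (feature7 >= 5) AND (feature12 <= 4)  =>  (class = +1)
D = np.zeros((2, n))
d = np.zeros(2)
D[0, 7], d[0] = -1.0, -5.0    # feature7 >= 5    <=>  -x_7 <= -5
D[1, 12], d[1] = 1.0, 4.0     # feature12 <= 4   <=>   x_12 <= 4
z = +1                        # label attached to this advice region

def in_advice_region(x, D, d):
    """True if x lies inside the polyhedral advice region D x <= d."""
    return bool(np.all(D @ x <= d))
```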
Knowledge-Based SVMs
In classic SVMs, we have T labeled data points (x_t, y_t), t = 1, …, T, and learn a linear classifier with separating hyperplane w'x − b = 0.

[Figure: two classes, Class A (y = +1) and Class B (y = −1), separated by the learned hyperplane]
The standard SVM formulation
trades off regularization and loss:
$$
\min_{w,\,\xi}\ \tfrac{1}{2}\|w\|^2 + \lambda\, e'\xi
\quad \text{s.t.}\quad Y(Xw - be) + \xi \ge e, \quad \xi \ge 0.
$$
Knowledge-Based SVMs
We assume an expert provides polyhedral advice of the form
$$
Dx \le d \ \Rightarrow\ w'x \ge b.
$$

[Figure: the advice region Dx ≤ d shown inside Class A (y = +1), with Class B (y = −1) on the other side of the hyperplane]

We can transform the logical implication above into linear constraints using advice variables u:
$$
D'u + w = 0, \qquad -d'u - b \ge 0, \qquad u \ge 0.
$$
These constraints are added to the standard formulation to give Knowledge-Based SVMs.
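A one-line check, added here for completeness (the appendix derives the reverse direction via Motzkin's theorem): if u ≥ 0 satisfies D'u + w = 0 and −d'u − b ≥ 0, then for any x in the advice region Dx ≤ d,
$$
w'x = -u'Dx \ \ge\ -u'd = -d'u \ \ge\ b,
$$
so the implication Dx ≤ d ⇒ w'x ≥ b indeed holds.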
Knowledge-Based SVMs
In general, there are m advice sets, each with a label z_i = ±1 indicating whether the advice belongs to Class A or Class B:
$$
D_i x \le d_i \ \Rightarrow\ z_i\,(w'x - b) \ge 0.
$$

Each advice set adds the following constraints to the SVM formulation:
$$
D_i'u^i + z_i w = 0, \qquad -d_i'u^i - z_i b \ge 0, \qquad u^i \ge 0.
$$

[Figure: three polyhedral advice regions, D_1 x ≤ d_1, D_2 x ≤ d_2, and D_3 x ≤ d_3, overlaid on Class A (y = +1) and Class B (y = −1)]
Knowledge-Based SVMs
The batch KBSVM formulation introduces advice slack variables to soften the advice constraints:
$$
\begin{aligned}
\min_{w,\,b,\,\xi,\,u^i,\,\eta^i,\,\zeta_i}\quad
& \tfrac{1}{2}\|w\|^2 + \lambda\, e'\xi + \mu \sum_{i=1}^{m}\bigl(e'\eta^i + \zeta_i\bigr) \\
\text{s.t.}\quad
& Y(Xw - be) + \xi \ge e, \quad \xi \ge 0, \\
& D_i'u^i + z_i w + \eta^i = 0, \\
& -d_i'u^i - z_i b + \zeta_i \ge 1, \\
& u^i,\ \eta^i,\ \zeta_i \ge 0, \qquad i = 1,\dots,m.
\end{aligned}
$$

[Figure: the three advice regions and the two classes, as on the previous slide]
Outline
• Knowledge-Based Support Vector Machines
• The Adviceptron: Online KBSVMs
• A Real-World Task: Diabetes Diagnosis
• A Real-World Task: Tuberculosis Isolate Classification
• Conclusions
Online KBSVMs
• Need to derive an online version of KBSVMs
• The algorithm is provided with advice and one labeled data point at each round
• At each step, the algorithm should update the hypothesis, w^t, as well as the advice vectors, u^{i,t}
Passive-Aggressive Algorithms
• Adopt the framework of passive-aggressive algorithms (Crammer et al., 2006), where at each round, when a new data point arrives:
  – if loss = 0, there is no update (passive)
  – if loss > 0, update the weights to minimize the loss (aggressive)
• Why passive-aggressive algorithms?
  – readily applicable to most SVM losses
  – possible to derive elegant, closed-form update rules (see the sketch below)
  – simple rules provide fast updates; scalable
  – performance can be analyzed by deriving regret bounds
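For context, a minimal sketch of the kind of closed-form rule the framework yields: the standard passive-aggressive update of Crammer et al. (2006) for the hinge loss, with no advice (the adviceptron's own update appears later in the talk):

```python
import numpy as np

def pa_update(w, x, y):
    """Standard passive-aggressive step for the hinge loss (no advice).
    Passive when (x, y) is classified with margin >= 1; otherwise take the
    smallest step that zeroes the hinge loss on this example."""
    loss = max(1.0 - y * (w @ x), 0.0)
    if loss == 0.0:
        return w                    # passive: no update
    tau = loss / (x @ x)            # aggressive: closed-form step size
    return w + tau * y * x
```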
Formulation At The t-th Round
• There are m advice sets, (D_i, d_i, z_i), i = 1, …, m
• At round t, the algorithm receives (x^t, y_t)
• The current hypothesis is w^t, and the current advice-vector estimates are u^{i,t}, i = 1, …, m

At round t, the formulation for deriving an update is
$$
\begin{aligned}
\min_{\xi,\,u^i,\,\eta^i,\,\zeta_i \ge 0,\;w}\quad
& \tfrac{1}{2}\|w - w^t\|_2^2
 + \tfrac{1}{2}\sum_{i=1}^{m}\|u^i - u^{i,t}\|_2^2
 + \tfrac{\lambda}{2}\,\xi^2
 + \tfrac{\mu}{2}\sum_{i=1}^{m}\bigl(\|\eta^i\|_2^2 + \zeta_i^2\bigr) \\
\text{s.t.}\quad
& y_t\, w'x^t - 1 + \xi \ge 0, \\
& D_i'u^i + z_i w + \eta^i = 0, \\
& -d_i'u^i - 1 + \zeta_i \ge 0, \\
& u^i \ge 0, \qquad i = 1,\dots,m.
\end{aligned}
$$
• The objective contains proximal terms for the hypothesis and the advice vectors, the data loss (ξ) and the advice losses (η^i, ζ_i); λ and μ are the trade-off parameters
• The inequality constraints make deriving a closed-form update impossible
Decompose Into m+1 Sub-problems
• First sub-problem: update the hypothesis by fixing the advice variables to their values at the t-th iteration, u^i = u^{i,t}
• With the u^i fixed, some objective terms and constraints drop out of the formulation above
Deriving The Hypothesis Update
• First sub-problem: update the hypothesis by fixing the advice vectors:
$$
\begin{aligned}
w^{t+1} = \arg\min_{w,\,\xi,\,\eta^i}\quad
& \tfrac{1}{2}\|w - w^t\|_2^2 + \tfrac{\lambda}{2}\,\xi^2 + \tfrac{\mu}{2}\sum_{i=1}^{m}\|\eta^i\|_2^2 \\
\text{s.t.}\quad
& y_t\, w'x^t - 1 + \xi \ge 0, \qquad (\alpha) \\
& D_i'u^{i,t} + z_i w + \eta^i = 0, \quad i = 1,\dots,m. \qquad (\beta^i)
\end{aligned}
$$
• Update the advice vectors by fixing the hypothesis
  – breaks down into m sub-problems, one for each advice set
• The fixed term D_i'u^{i,t} gives an advice-estimate of the hypothesis according to the i-th advice set (from the equality constraint with η^i = 0 and z_i² = 1, this estimate is r^{i,t} = −z_i D_i'u^{i,t})
• Average the advice-estimates over all m advice vectors and denote the average as
$$
r^t = \frac{1}{m}\sum_{i=1}^{m} r^{i,t}.
$$
The Hypothesis Update
For λ, μ > 0 and advice-estimate r^t, the hypothesis update is
$$
w^{t+1} = \nu\,(w^t + \alpha_t\, y_t\, x^t) + (1 - \nu)\, r^t,
$$
$$
\alpha_t = \Bigl(\tfrac{1}{\lambda} + \nu\,\|x^t\|^2\Bigr)^{-1}
\max\Bigl(1 - \nu\, y_t\, {w^t}'x^t - (1 - \nu)\, y_t\, {r^t}'x^t,\ 0\Bigr).
$$
• The update is a convex combination of the standard passive-aggressive update and the average advice-estimate; the parameter of the convex combination is ν = 1/(1 + mμ)
• The update weight α_t depends on the hinge loss computed with respect to a composite weight vector that is a convex combination of the current hypothesis and the average advice-estimate
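A minimal numpy sketch of this closed-form step. The expression r^{i,t} = −z_i D_i'u^{i,t} for the advice-estimates is read off the equality constraint on the previous slide (an inference, not a verbatim formula from the deck):

```python
import numpy as np

def average_advice_estimate(advice, U):
    """r^t = (1/m) sum_i r^{i,t}, with r^{i,t} = -z_i * D_i' u^{i,t}
    (inferred from the constraint D_i'u^i + z_i w + eta^i = 0)."""
    return np.mean([-z * (D.T @ u) for (D, _, z), u in zip(advice, U)], axis=0)

def hypothesis_update(w, x, y, r, lam, nu):
    """One adviceptron hypothesis step: a convex combination, with weight
    nu = 1/(1 + m*mu), of the passive-aggressive update and the average
    advice-estimate r."""
    loss = max(1.0 - nu * y * (w @ x) - (1.0 - nu) * y * (r @ x), 0.0)
    alpha = loss / (1.0 / lam + nu * (x @ x))
    return nu * (w + alpha * y * x) + (1.0 - nu) * r
```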
Deriving The Advice Updates
• Second sub-problem: update the advice vectors by fixing the hypothesis, w = w^{t+1}
• With w fixed, some constraints and objective terms drop out of the formulation, leaving
$$
\begin{aligned}
\min_{u^i,\,\eta^i,\,\zeta_i \ge 0}\quad
& \tfrac{1}{2}\sum_{i=1}^{m}\|u^i - u^{i,t}\|_2^2
 + \tfrac{\mu}{2}\sum_{i=1}^{m}\bigl(\|\eta^i\|_2^2 + \zeta_i^2\bigr) \\
\text{s.t.}\quad
& D_i'u^i + z_i w^{t+1} + \eta^i = 0, \\
& -d_i'u^i - 1 + \zeta_i \ge 0, \\
& u^i \ge 0, \qquad i = 1,\dots,m.
\end{aligned}
$$
• This problem splits into m independent sub-problems, one per advice set
Deriving The i-th Advice Updates
• m sub-problems: update the i-th advice vector by fixing the hypothesis:
$$
\begin{aligned}
u^{i,t+1} = \arg\min_{u^i,\,\eta^i,\,\zeta_i}\quad
& \tfrac{1}{2}\|u^i - u^{i,t}\|_2^2 + \tfrac{\mu}{2}\bigl(\|\eta^i\|_2^2 + \zeta_i^2\bigr) \\
\text{s.t.}\quad
& D_i'u^i + z_i w^{t+1} + \eta^i = 0, \qquad (\beta^i) \\
& -d_i'u^i - 1 + \zeta_i \ge 0, \qquad\qquad\ (\gamma_i) \\
& u^i \ge 0. \qquad\qquad\qquad\qquad\quad\ \ (\tau_i)
\end{aligned}
$$
• The cone constraints u^i ≥ 0 are still complicating: a closed-form solution cannot be derived directly
• Use a projected-gradient approach:
  – drop the cone constraints to compute an intermediate closed-form update
  – project the intermediate update back onto the cone constraints
The m Advice Updates
For μ > 0 and current hypothesis w^{t+1}, for each advice set, i = 1, …, m, the update rule is given by
$$
u^{i,t+1} = \max\bigl(u^{i,t} + D_i\,\beta^i - d_i\,\gamma_i,\ 0\bigr),
$$
$$
\begin{bmatrix} \beta^i \\ \gamma_i \end{bmatrix}
=
\begin{bmatrix}
-\bigl(D_i'D_i + \tfrac{1}{\mu} I_n\bigr) & D_i'd_i \\
d_i'D_i & -\bigl(d_i'd_i + \tfrac{1}{\mu}\bigr)
\end{bmatrix}^{-1}
\begin{bmatrix}
D_i'u^{i,t} + z_i w^{t+1} \\
-d_i'u^{i,t} - 1
\end{bmatrix}.
$$
• The max(·, 0) is the projection onto the cone constraints u^i ≥ 0
• The ratio s^i = β^i / γ_i is a hypothesis-estimate of the advice; the update is driven by the error, i.e., the amount by which this ideal data point s^i violates the constraint D_i x ≤ d_i
• Each advice update depends on the newly updated hypothesis w^{t+1}
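A numpy sketch of one advice update, following the closed-form (β^i, γ_i) system above and the projection onto u ≥ 0 (function and variable names are mine):

```python
import numpy as np

def advice_update(u, D, d, z, w_new, mu):
    """One adviceptron advice step: solve for (beta, gamma) in closed form,
    then project the intermediate update back onto the cone u >= 0."""
    n = D.shape[1]
    H = np.vstack([
        np.hstack([-(D.T @ D + np.eye(n) / mu), (D.T @ d)[:, None]]),
        np.hstack([(d @ D)[None, :], [[-(d @ d + 1.0 / mu)]]]),
    ])
    g = np.concatenate([D.T @ u + z * w_new, [-(d @ u) - 1.0]])
    beta_gamma = np.linalg.solve(H, g)
    beta, gamma = beta_gamma[:-1], beta_gamma[-1]
    return np.maximum(u + D @ beta - d * gamma, 0.0)
```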
The Adviceptron

Algorithm 1: The Passive-Aggressive Adviceptron Algorithm
1: input: data (x^t, y_t), t = 1, …, T; advice sets (D_i, d_i, z_i), i = 1, …, m; parameters λ, μ > 0
2: initialize: u^{i,1} = 0, w^1 = 0
3: let: ν = 1/(1 + mμ)
4: for (x^t, y_t) do
5:     predict label ŷ_t = sign((w^t)'x^t)
6:     receive correct label y_t
7:     suffer loss ℓ_t = max(1 − ν y_t (w^t)'x^t − (1 − ν) y_t (r^t)'x^t, 0)
8:     update hypothesis using u^{i,t}:
           α = ℓ_t / (1/λ + ν‖x^t‖²),    w^{t+1} = ν (w^t + α y_t x^t) + (1 − ν) r^t
9:     update advice variables using w^{t+1}:
           (β^i, γ_i) = H_i^{−1} g_i,    u^{i,t+1} = (u^{i,t} + D_i β^i − d_i γ_i)_+
10: end for
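Putting the pieces together, a compact end-to-end sketch of Algorithm 1 in numpy. Two assumptions beyond what the slides state explicitly are flagged in comments: the advice-estimate r^{i,t} = −z_i D_i'u^{i,t} is inferred from the constraints, and the advice is taken in the bias-free, unit-margin form used by the online formulation.

```python
import numpy as np

def adviceptron(stream, advice, lam=1.0, mu=1.0):
    """Passive-Aggressive Adviceptron (sketch of Algorithm 1).

    stream : iterable of (x_t, y_t) with y_t in {+1, -1}
    advice : list of (D_i, d_i, z_i), assumed to encode
             D_i x <= d_i  =>  z_i w'x >= 1  (bias-free online form)
    """
    m = len(advice)
    nu = 1.0 / (1.0 + m * mu)
    n = advice[0][0].shape[1]
    w = np.zeros(n)
    U = [np.zeros(D.shape[0]) for D, _, _ in advice]

    # Per-advice-set system matrices H_i do not depend on t, so precompute them.
    H = []
    for D, d, _ in advice:
        H.append(np.vstack([
            np.hstack([-(D.T @ D + np.eye(n) / mu), (D.T @ d)[:, None]]),
            np.hstack([(d @ D)[None, :], [[-(d @ d + 1.0 / mu)]]]),
        ]))

    for x, y in stream:
        y_hat = np.sign(w @ x)            # line 5: predict (label then revealed)
        # average advice-estimate (inferred: r^{i,t} = -z_i D_i' u^{i,t})
        r = np.mean([-z * (D.T @ u) for (D, _, z), u in zip(advice, U)], axis=0)
        # lines 7-8: suffer loss, update the hypothesis in closed form
        loss = max(1.0 - nu * y * (w @ x) - (1.0 - nu) * y * (r @ x), 0.0)
        alpha = loss / (1.0 / lam + nu * (x @ x))
        w = nu * (w + alpha * y * x) + (1.0 - nu) * r
        # line 9: update advice variables using the new hypothesis, then project
        for i, ((D, d, z), u) in enumerate(zip(advice, U)):
            g = np.concatenate([D.T @ u + z * w, [-(d @ u) - 1.0]])
            beta_gamma = np.linalg.solve(H[i], g)
            beta, gamma = beta_gamma[:-1], beta_gamma[-1]
            U[i] = np.maximum(u + D @ beta - d * gamma, 0.0)
    return w
```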
Outline
• Knowledge-Based Support Vector Machines
• The Adviceptron: Online KBSVMs
• A Real-World Task: Diabetes Diagnosis
• A Real-World Task: Tuberculosis Isolate Classification
• Conclusions
Diagnosing Diabetes
• Standard data set from UCI repository (768 x 8)
– all patients at least 21 years old of Pima Indian heritage
– features include body mass index, blood glucose level
• Expert advice for diagnosing diabetes taken from the NIH website on risk factors for Type-2 diabetes (the two rules are encoded as advice sets in the sketch below):
  – a person who is obese (BMI > 30) and has a high blood glucose level (> 126) is at strong risk for diabetes
    (BMI ≥ 30) ∧ (bloodglucose ≥ 126) ⇒ diabetes
  – a person who is at a normal weight (BMI < 25) and has a low blood glucose level (< 100) is at low risk for diabetes
    (BMI ≤ 25) ∧ (bloodglucose ≤ 100) ⇒ ¬diabetes
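A minimal sketch (added, not from the slides) of how these two rules could be packaged as advice sets for the adviceptron; the column indices for blood glucose and BMI in the 8-feature Pima data are assumptions:

```python
import numpy as np

n = 8                  # number of features in the Pima Indians diabetes data
GLUCOSE, BMI = 1, 5    # assumed column indices for blood glucose and BMI

# Rule 1: (BMI >= 30) AND (glucose >= 126)  =>  diabetes  (z = +1)
D1, d1 = np.zeros((2, n)), np.zeros(2)
D1[0, BMI], d1[0] = -1.0, -30.0        # BMI >= 30       <=>  -BMI <= -30
D1[1, GLUCOSE], d1[1] = -1.0, -126.0   # glucose >= 126  <=>  -glucose <= -126

# Rule 2: (BMI <= 25) AND (glucose <= 100)  =>  not diabetes  (z = -1)
D2, d2 = np.zeros((2, n)), np.zeros(2)
D2[0, BMI], d2[0] = 1.0, 25.0
D2[1, GLUCOSE], d2[1] = 1.0, 100.0

advice = [(D1, d1, +1), (D2, d2, -1)]  # the (D_i, d_i, z_i) format used above
```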
Diagnosing Diabetes: Results
• 200 examples for training, the remaining examples for testing
• Results averaged over 20 randomized iterations
• Compared to advice-free online algorithms:
  – Passive-Aggressive (Crammer et al., 2006)
  – ROMMA (Li & Long, 2002)
  – Max-margin perceptron (Freund & Schapire, 1999)

[Figure: diabetes learning curves; the batch KBSVM trained on all 200 points is shown for comparison]
Outline
• Knowledge-Based Support Vector Machines
• The Adviceptron: Online KBSVMs
• A Real-World Task: Diabetes Diagnosis
• A Real-World Task: Tuberculosis Isolate Classification
• Conclusions
Tuberculosis Isolate Classification
• Task is to classify strains of Mycobacterium
tuberculosis complex (MTBC) into major
genetic lineages based on DNA fingerprints
• MTBC is the causative agent for TB
– leading cause of disease and morbidity
– strains vary in infectivity, transmission, virulence,
immunogenicity, host associations depending on
genetic lineage
• Lineage classification is crucial for surveillance,
tracking and control of TB world-wide
Tuberculosis Isolate Classification
• Two types of DNA fingerprints for all culture-positive TB strains
collected in the US by the CDC (44 data features)
• Six major lineages (classes) of TB for classification
– ancestral: M. bovis, M. africanum, Indo-Oceanic
– modern: Euro-American, East-Asian, East-African-Indian
• Problem formulated as six 1-vs-many classification tasks
Class               | # isolates | # pieces of Positive Advice | # pieces of Negative Advice
East-Asian          |       4924 |                           1 |                           1
East-African-Indian |       1469 |                           2 |                           4
Euro-American       |      25161 |                           1 |                           2
Indo-Oceanic        |       5309 |                           5 |                           5
M. africanum        |        154 |                           1 |                           3
M. bovis            |        693 |                           1 |                           3
Expert Rules for TB Lineage Classification
[Figure: expert decision-tree rules assigning isolates to the East-Asian, East-African-Indian, Indo-Oceanic, Euro-American, M. bovis, and M. africanum lineages; one split tests MIRU24 > 1 vs. MIRU24 ≤ 1]

Rules provided by Dr. Lauren Cowan at the Centers for Disease Control and Prevention, documented in Shabbeer et al. (2010)
TB Results: Might Need Fewer
Examples To Converge With Advice
[Figure: learning curves for Euro-American vs. the Rest and M. africanum vs. the Rest; the batch KBSVM is shown for comparison]
TB Results: Can Converge To A Better
Solution With Advice
[Figure: learning curves for East-African-Indian vs. the Rest and Indo-Oceanic vs. the Rest; the batch KBSVM is shown for comparison]
TB Results: Possible To Still Learn Well
With Only Advice
[Figure: learning curves for East-Asian vs. the Rest and M. bovis vs. the Rest; the batch KBSVM is shown for comparison]
Outline
• Knowledge-Based Support Vector Machines
• The Adviceptron: Online KBSVMs
• A Real-World Task: Diabetes Diagnosis
• A Real-World Task: Tuberculosis Isolate Classification
• Conclusions And Questions
Conclusions
• New online learning algorithm: the adviceptron
• Makes use of prior knowledge in the form of (possibly
imperfect) polyhedral advice
• Performs simple, closed-form updates via the passive-aggressive framework; scalable
• Good advice can help converge to a better solution
with fewer examples
• Encouraging empirical results on two important
real-world tasks
References.
(Fung et al., 2003) G. Fung, O. L. Mangasarian, and J. W. Shavlik. Knowledge-based support vector machine classifiers. In S. Becker, S. Thrun, and K. Obermayer, eds., NIPS 15, pp. 521–528, 2003.
(Crammer et al., 2006) K. Crammer, O. Dekel, J. Keshet, S. Shalev-Shwartz, and Y. Singer. Online passive-aggressive algorithms. Journal of Machine Learning Research, 7:551–585, 2006.
(Freund and Schapire, 1999) Y. Freund and R. E. Schapire. Large margin classification using the perceptron algorithm. Machine Learning, 37(3):277–296, 1999.
(Li and Long, 2002) Y. Li and P. M. Long. The relaxed online maximum margin algorithm. Machine Learning, 46(1/3):361–387, 2002.
(Shabbeer et al., 2010) A. Shabbeer, L. Cowan, J. R. Driscoll, C. Ozcaglar, S. L. Vandenberg, B. Yener, and K. P. Bennett. TB-Lineage: An online tool for classification and analysis of strains of Mycobacterium tuberculosis complex. Unpublished manuscript, 2010.
Acknowledgements.
The authors would like to thank Dr. Lauren Cowan of the Centers for Disease Control and Prevention (CDC) for providing the TB data set and the expert-defined rules for lineage classification.
We gratefully acknowledge the support of DARPA under grant HR0011-07-C-0060 and the NIH under grant 1-R01-LM009731-01.
Views and conclusions contained in this document are those of the authors and do not necessarily represent the official opinions or policies, either expressed or implied, of the US government or of DARPA.
KBSVMs: Deriving The Advice Constraints
We assume an expert provides polyhedral advice of the form
$$
Dx \le d \ \Rightarrow\ w'x \ge b.
$$
We know p ⇒ q is equivalent to ¬p ∨ q, so the implication holds for every x exactly when its negation, p ∧ ¬q, has no solution. After homogenizing with τ, this means the system
$$
Dx - d\tau \le 0, \qquad w'x - b\tau < 0, \qquad -\tau < 0
$$
has no solution (x, τ).

[Figure: the advice region Dx ≤ d inside Class A (y = +1), with Class B (y = −1) on the other side of the hyperplane]
KBSVMs: Deriving The Advice Constraints
If the system
$$
Dx - d\tau \le 0, \qquad w'x - b\tau < 0, \qquad -\tau < 0
$$
has no solution (x, τ), then by Motzkin's Theorem of the Alternative the system
$$
D'u + w = 0, \qquad -d'u - b \ge 0, \qquad u \ge 0
$$
has a solution u.

[Figure: the advice region Dx ≤ d inside Class A (y = +1), with Class B (y = −1) on the other side of the hyperplane]