Complex Support Vector Machines for Regression and Quaternary Classification
P. Bouboulis, S. Theodoridis, C. Mavroforakis, L. Dalla
Learning (MKL) 19, 2010 (?)
(I got it from arXiv)
Coffeetalk, 1 May 2014
Classification of complex data
• Classifying bird songs:
[Figure: waveform and spectrogram of a bird song; the spectrogram shows frequency (0 to 8000 Hz) against time (0 to 9 s).]
• But... the spectrogram contains complex values!
• Can I classify a complex-valued image?
Complex support vector machine
• Inputs are complex vectors: $\tilde{x} = x^r + i\,x^i$
• Output can be:
  • a complex number (regression)
  • a class label (classification)
• To output a label:
  For the real-valued SVM: $f(x) = \langle w, x \rangle + b$; assign to $\omega_1$ if $f(x) \geq 0$, to $\omega_2$ if $f(x) < 0$.
  For the complex SVM: $f(\tilde{x}) = \langle \tilde{w}, \tilde{x} \rangle_H + b$
Complex inner product
• You can define the inner product in different ways:
• Simple: $\langle \tilde{w}, \tilde{x} \rangle_H = \langle w^r, x^r \rangle + \langle w^i, x^i \rangle$
  • Just a concatenation of the real and imaginary spaces.
  • You can use the same SVM as always... (see the sketch below)
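A minimal sketch of this "simple" route (my illustration, not the paper's code; assumes NumPy and scikit-learn, with hypothetical toy data):

```python
import numpy as np
from sklearn.svm import SVC

def to_real_features(X):
    """Concatenate real and imaginary parts: C^d -> R^(2d)."""
    return np.hstack([X.real, X.imag])

# Hypothetical toy data: 100 complex feature vectors with labels +/-1.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) + 1j * rng.normal(size=(100, 5))
y = np.where(X[:, 0].real >= 0, 1, -1)  # label from the real part of feature 0

clf = SVC(kernel="rbf").fit(to_real_features(X), y)
print(clf.predict(to_real_features(X[:3])))
```

This is exactly the $\langle w^r, x^r \rangle + \langle w^i, x^i \rangle$ inner product: the SVM never sees that the features were complex.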
• Complex: $\langle \tilde{w}, \tilde{x} \rangle_H = \langle w^r, x^r \rangle + \langle w^i, x^i \rangle + i \left( \langle w^i, x^r \rangle - \langle w^r, x^i \rangle \right)$
• Now the output is not so clear:
  $\mathrm{sign}\left( \langle \tilde{w}, \tilde{x} \rangle_H + b \right)$
  ... the sign of a complex number?
Not 2-class, but a 4-class classifier!
[Fig. 4 from the paper: a complex couple of hyperplanes separates the space of complex numbers (i.e., H = C) into four parts.]
• Now the classifier can distinguish FOUR classes:
[Fig. 5 from the paper: a complex couple of hyperplanes separates the four given classes; the hyperplanes are chosen so as to maximize the margin between the classes.]
$\mathrm{sign}\big( \mathrm{Re}(\langle \tilde{w}, \tilde{x} \rangle_H + b) \big) \gtrless 0 \quad \text{and} \quad \mathrm{sign}\big( \mathrm{Im}(\langle \tilde{w}, \tilde{x} \rangle_H + b) \big) \gtrless 0$
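In code, the quaternary decision rule is just the two signs read off the complex score (a sketch of my own, not the paper's code; assumes NumPy):

```python
import numpy as np

def quaternary_decision(score):
    """Map complex scores f(x) = <w~, x~>_H + b to labels in {+/-1 +/- i}."""
    s = np.asarray(score)
    return np.sign(s.real) + 1j * np.sign(s.imag)

print(quaternary_decision([0.3 - 2.0j, -1.5 + 0.2j]))  # [ 1.-1.j -1.+1.j]
```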
(Pasted from the paper: the norm of the hyperplane couple reduces to the standard regularizer; the step uses the parallelogram law $\|a+b\|^2 + \|a-b\|^2 = 2\|a\|^2 + 2\|b\|^2$.)

$\|w^r + v^r\|_H^2 + \|w^i - v^i\|_H^2 + \|w^i + v^i\|_H^2 + \|w^r - v^r\|_H^2 = 2\|w^r\|_H^2 + 2\|w^i\|_H^2 + 2\|v^r\|_H^2 + 2\|v^i\|_H^2 = 2\left( \|w\|_H^2 + \|v\|_H^2 \right)$
Further things in the paper...
• To find the quadratic program (i.e., to take derivatives of real-valued functions in a complex space), use "Wirtinger's calculus" (it appears you can write it as two independent quadratic programming problems).
• Derivations are given for both regression and classification.
• Kernel mappings for complex data are proposed: $\Phi_C(\tilde{x})$ (a kernel sketch follows below).
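For illustration, a sketch of the complex Gaussian kernel in the form the authors use in related work, $\kappa_C(z, w) = \exp\left( -\sum_j (z_j - \bar{w}_j)^2 / \sigma^2 \right)$; the squared (not modulus) difference makes it genuinely complex-valued. Treat the exact form as an assumption and verify it against the paper:

```python
import numpy as np

def complex_gaussian_kernel(z, w, sigma=1.0):
    """Complex Gaussian kernel on C^d; note the square, not |.|^2,
    so the kernel value is itself a complex number."""
    diff = z - np.conj(w)
    return np.exp(-np.sum(diff * diff) / sigma**2)

z = np.array([1.0 + 0.5j, -0.2j])
w = np.array([0.5 - 0.5j, 0.1 + 0.0j])
print(complex_gaussian_kernel(z, w))  # complex-valued output
```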
For your information: the SVM optimization problem

• It looks like:

minimize over $w, v, c, \xi$:
$\|w\|_H^2 + \|v\|_H^2 + \frac{C}{N} \sum_{n=1}^N (\xi_n^r + \xi_n^i)$
subject to
$d_n^r \left( \langle \Phi_C^*(z_n), v \rangle_H + c \right) \geq 1 - \xi_n^r$
$d_n^i \left( \langle \Phi_C^*(z_n), w \rangle_H + c \right) \geq 1 - \xi_n^i$
$\xi_n^r \geq 0, \quad \xi_n^i \geq 0, \quad \text{for } n = 1, \dots, N$   (28)

• Labels: $d_n = \pm 1 \pm i$ (four classes)

• From the paper: following a similar procedure as in the complex SVR case, the dual problem splits into two separate maximization tasks, one QP for the real and one for the imaginary part of the label:

maximize over $a$:
$\sum_{n=1}^N a_n - \sum_{n,m=1}^N a_n a_m d_n^r d_m^r \kappa_C^r(z_m, z_n)$
subject to $\sum_{n=1}^N a_n d_n^r = 0$ and $0 \leq a_n \leq \frac{C}{N}$ for $n = 1, \dots, N$

and

maximize over $b$:
$\sum_{n=1}^N b_n - \sum_{n,m=1}^N b_n b_m d_n^i d_m^i \kappa_C^r(z_m, z_n)$
subject to $\sum_{n=1}^N b_n d_n^i = 0$ and $0 \leq b_n \leq \frac{C}{N}$ for $n = 1, \dots, N$.

• From the paper: similar to the regression case, these problems are equivalent to two distinct real SVM (dual) problems. (A sketch of this split follows below.)
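Since the dual decouples into two real SVM problems, training can be sketched as two ordinary SVMs, one per label component (my illustration: scikit-learn as the QP solver, and a plain RBF kernel on stacked real/imaginary features standing in for the paper's $\kappa_C^r$):

```python
import numpy as np
from sklearn.svm import SVC

def stack(Z):
    """Represent complex inputs as real feature vectors."""
    return np.hstack([Z.real, Z.imag])

# Hypothetical toy data with quaternary labels d in {+/-1 +/- i}.
rng = np.random.default_rng(1)
Z = rng.normal(size=(200, 3)) + 1j * rng.normal(size=(200, 3))
d = np.where(Z[:, 0].real >= 0, 1, -1) + 1j * np.where(Z[:, 1].imag >= 0, 1, -1)

# One real SVM per label component, mirroring the two split duals.
svm_re = SVC(kernel="rbf", C=1.0).fit(stack(Z), d.real)
svm_im = SVC(kernel="rbf", C=1.0).fit(stack(Z), d.imag)

d_hat = svm_re.predict(stack(Z[:5])) + 1j * svm_im.predict(stack(Z[:5]))
print(d_hat)  # predictions in {+/-1 +/- i}
```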
Conclusion
• Treating the real and imaginary parts of $\tilde{x} = x^r + i\,x^i$ as independent features is OK!
• In the fully complex treatment, we get a four-class classifier
• ... and we can define new kernels
[Fig. 1 from the paper: the element $\kappa_C^r(\cdot, (0,0)^T)$ of the induced real feature space of the complex Gaussian kernel.]