Total Variation and Euler's Elastica for Supervised Learning
Tong Lin, Hanlin Xue, Ling Wang, Hongbin Zha
Contact: tonglin123@gmail.com
Peking University, China
2012-6-29
Key Lab. of Machine Perception, School of EECS, Peking University, China
Background
• Supervised Learning:
  • Definition: predict u : x → y from training data (x_1, y_1), …, (x_N, y_N)
  • Two tasks: classification and regression
• Prior Work:
  • SVM: hinge loss ℓ(u, y) = max(0, 1 − yu)
  • RLS (Regularized Least Squares; Rifkin, 2002): squared loss ℓ(u, y) = (u − y)²
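As a concrete reference, the two classical losses can be written as plain functions (a minimal sketch; labels y ∈ {−1, +1} are assumed for the hinge loss):

```python
def hinge_loss(u, y):
    # SVM hinge loss: zero once the margin y*u reaches 1
    return max(0.0, 1.0 - y * u)

def squared_loss(u, y):
    # RLS squared loss: penalizes any deviation of u from y
    return (u - y) ** 2

print(hinge_loss(0.3, 1), squared_loss(0.3, 1))
```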
Background
• Prior Work (Cont.) :
• Laplacian Energy: “Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples,” Belkin et al., JMLR 7:2399–2434, 2006
• Hessian Energy: “Semi-supervised Regression using Hessian Energy with an Application to Semi-supervised Dimensionality Reduction,” K. I. Kim, F. Steinke, M. Hein, NIPS 2009
• GLS: “Classification using Geometric Level Sets,” Varshney & Willsky, JMLR 11:491–516, 2010
Motivation
[Figure: classification results of SVM vs. our proposed method]
3D display of the output classification function u(x) by the proposed EE model.
Large margin should not be the sole criterion; we argue that sharper edges and smoother boundaries can also play significant roles.
Models
• General: min_u Σ_{i=1}^N ℓ(u(x_i), y_i) + λ S(u)
• Laplacian Regularization (LR): min_u ∫ (u − y)² dx + λ ∫ |∇u|² dx
• Total Variation (TV): min_u ∫ (u − y)² dx + λ ∫ |∇u| dx
• Euler’s Elastica (EE): min_u ∫ (u − y)² dx + λ ∫ (a + bκ²) |∇u| dx, with curvature κ = ∇·(∇u/|∇u|)
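The three regularizers can be compared numerically. A minimal sketch, assuming a 2-D grid discretization with finite-difference gradients (the grid size, test function, and a = b = 1 defaults are illustrative, not from the slides):

```python
import numpy as np

def energies(u, h=1.0, a=1.0, b=1.0, eps=1e-8):
    """Discrete S_LR, S_TV, S_EE of a function u sampled on a 2-D grid."""
    g0, g1 = np.gradient(u, h)            # derivatives along the two grid axes
    mag = np.sqrt(g0**2 + g1**2)
    s_lr = np.sum(mag**2) * h**2          # S_LR = integral of |grad u|^2
    s_tv = np.sum(mag) * h**2             # S_TV = integral of |grad u|
    n0, n1 = g0 / (mag + eps), g1 / (mag + eps)
    # curvature kappa = div(grad u / |grad u|)
    kappa = np.gradient(n0, h, axis=0) + np.gradient(n1, h, axis=1)
    s_ee = np.sum((a + b * kappa**2) * mag) * h**2  # S_EE = integral of (a + b*kappa^2)|grad u|
    return s_lr, s_tv, s_ee

xs = np.linspace(-1.0, 1.0, 64)
h = xs[1] - xs[0]
X, _ = np.meshgrid(xs, xs)
u = np.tanh(5 * X)                        # a sharp transition across x = 0
print(energies(u, h=h))
```

With a = b = 1, S_EE is never smaller than S_TV, since (a + bκ²) ≥ a pointwise.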
TV & EE in Image Processing
• TV measures the total amount of change in the function values: TV(u) = ∫ |∇u| dx
• Image denoising (Rudin, Osher, Fatemi, 1992)
• The elastica was introduced by Euler in 1744 to model torsion-free elastic rods
• Image inpainting (Chan et al., 2002)
• TV can preserve sharp edges, while EE can produce smooth boundaries
• For details, see T. Chan & J. Shen’s textbook: Image Processing and Analysis: Variational, PDE, Wavelet, and Stochastic Methods, SIAM, 2005
Decision boundary
In a d-dimensional space the mean curvature κ has the same expression, up to the constant factor 1/(d − 1).
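This relation can be sanity-checked numerically: for u(x) = |x| the level sets are spheres of radius r, whose mean curvature is 1/r, while κ = ∇·(x/|x|) evaluates to (d − 1)/r. A small finite-difference sketch (the step size h is an arbitrary choice):

```python
import numpy as np

def kappa_at(p, h=1e-5):
    """Central-difference divergence of the unit normal n(x) = x/|x| at point p."""
    d = len(p)
    div = 0.0
    for i in range(d):
        e = np.zeros(d)
        e[i] = h
        div += ((p + e)[i] / np.linalg.norm(p + e)
                - (p - e)[i] / np.linalg.norm(p - e)) / (2 * h)
    return div

print(kappa_at(np.array([3.0, 4.0])))        # r = 5 in 2-D: (d-1)/r = 1/5
print(kappa_at(np.array([3.0, 4.0, 12.0])))  # r = 13 in 3-D: (d-1)/r = 2/13
```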
Framework
Energy Functional Minimization
min_u E[u] = ∫ (u − y)² dx + λ S(u)
• The calculus of variations → the Euler–Lagrange PDE (#):
  • S_LR(u) = ∫ |∇u|² dx:   −λ Δu + 2(u − y) = 0
  • S_TV(u) = ∫ |∇u| dx:   −λ ∇·(∇u/|∇u|) + 2(u − y) = 0
  • S_EE(u) = ∫ (a + bκ²) |∇u| dx:   −λ ∇·V + 2(u − y) = 0, with flux V = (a + bκ²) N − (2b/|∇u|) ∂_T(κ|∇u|) T, where N = ∇u/|∇u| and the tangent T ⊥ N
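For the TV model, gradient-descent time marching evolves u_t = λ ∇·(∇u/|∇u|) − 2(u − y) toward steady state. A minimal grid-based sketch of this idea (the paper works in an RBF function space; the grid discretization, smoothing eps, step size, and parameter values here are illustrative assumptions):

```python
import numpy as np

def tv_time_marching(y, lam=0.5, dt=0.05, steps=400, eps=1e-2):
    """Explicit time marching for u_t = lam * div(grad u / |grad u|) - 2(u - y)."""
    u = y.copy()
    for _ in range(steps):
        g0, g1 = np.gradient(u)
        mag = np.sqrt(g0**2 + g1**2) + eps           # smoothed |grad u|
        curv = np.gradient(g0 / mag, axis=0) + np.gradient(g1 / mag, axis=1)
        u = u + dt * (lam * curv - 2.0 * (u - y))    # descend the TV energy
    return u

rng = np.random.default_rng(0)
clean = np.where(np.arange(64)[None, :] < 32, -1.0, 1.0) * np.ones((64, 1))
y = clean + 0.3 * rng.standard_normal((64, 64))
u = tv_time_marching(y)
# mean squared error against the clean step, before and after marching
print(float(np.mean((u - clean) ** 2)), float(np.mean((y - clean) ** 2)))
```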
Solutions
Radial basis function (RBF) approximation:
a. Laplacian Regularization (LR)
b. TV & EE: we develop two solutions
   • Gradient descent time marching (GD)
   • Lagged linear equation iteration (LagLE)
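The LagLE idea for TV, in a 1-D sketch: freeze the diffusivity w = 1/|u′| at the previous iterate so the Euler–Lagrange equation 2(u − y) − λ (w u′)′ = 0 becomes linear and is solved exactly at each iteration (this is the classic lagged-diffusivity scheme; the 1-D grid, λ, and eps here are illustrative assumptions, not the paper's RBF formulation):

```python
import numpy as np

def tv_lagged(y, lam=1.0, iters=20, eps=1e-3):
    """Lagged linear-equation iteration for min_u sum (u-y)^2 + lam * sum |du|."""
    n = len(y)
    u = y.copy()
    for _ in range(iters):
        w = 1.0 / np.sqrt(np.diff(u) ** 2 + eps**2)  # lagged diffusivity weights
        A = 2.0 * np.eye(n)                          # from the fidelity term 2(u - y)
        for i in range(n - 1):                       # add weighted graph Laplacian lam * L_w
            A[i, i] += lam * w[i]
            A[i + 1, i + 1] += lam * w[i]
            A[i, i + 1] -= lam * w[i]
            A[i + 1, i] -= lam * w[i]
        u = np.linalg.solve(A, 2.0 * y)              # one linear solve per iteration
    return u

rng = np.random.default_rng(1)
y = np.concatenate([np.zeros(20), np.ones(20)]) + 0.2 * rng.standard_normal(40)
u = tv_lagged(y)
print(np.round(u[:3], 2), np.round(u[-3:], 2))       # near-flat plateaus, jump preserved
```

Each iteration replaces the nonlinear PDE by a symmetric positive-definite tridiagonal system, so no step size is needed, in contrast to GD time marching.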
Experiments: Two-Moon Data
[Figure: decision boundaries of SVM and EE on the two-moon data]
Both methods can achieve 100% accuracy under various parameter combinations.
Experiments: Binary Classification
Experiments: Multi-class Classification
Experiments: Multi-class Classification
Note: Results of TV and EE are computed by the LagLE method.
Experiments: Regression
Conclusions
• Contributions:
• Introduce TV&EE to the ML community
• Demonstrate the significance of curvature and gradient empirically
• Achieve superior performance for classification and regression
• Future Work :
• Hinge loss
• Other basis functions
• Extension to semi-supervised setting
• Existence and uniqueness of the PDE solutions
• Fast algorithm to reduce the running time
The end. Thank you!