ICML 2012 Oral Presentation (20 min)


Total Variation and Euler's Elastica for Supervised Learning

Tong Lin, Hanlin Xue, Ling Wang, Hongbin Zha

Contact: [email protected]

Peking University, China

2012-6-29

Key Lab of Machine Perception, School of EECS, Peking University, China


Background

Supervised Learning:

Definition: predict u : x → y, given training data (x_1, y_1), …, (x_N, y_N)

Two tasks: Classification and Regression

Prior Work

SVM: Support Vector Machines

Hinge loss: max(0, 1 − y u(x))

RLS: Regularized Least Squares, Rifkin, 2002

Squared loss: (u(x) − y)²
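As a quick illustration (my own sketch, not from the slides), the two loss functions can be computed directly:

```python
import numpy as np

def hinge_loss(u, y):
    """SVM hinge loss max(0, 1 - y*u(x)), for labels y in {-1, +1}."""
    return np.maximum(0.0, 1.0 - y * u)

def squared_loss(u, y):
    """RLS squared loss (u(x) - y)^2."""
    return (u - y) ** 2

# A confidently correct prediction (u = 2, y = +1) costs nothing under
# the hinge loss but is still penalized by the squared loss.
u, y = 2.0, 1.0
print(hinge_loss(u, y))    # 0.0
print(squared_loss(u, y))  # 1.0
```

The hinge loss only penalizes points inside the margin, which is what drives the sparsity of support vectors.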


Background

Prior Work (Cont.)

Laplacian Energy: "Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples," Belkin et al., JMLR 7:2399-2434, 2006

Hessian Energy: "Semi-supervised Regression using Hessian Energy with an Application to Semi-supervised Dimensionality Reduction," K.I. Kim, F. Steinke, M. Hein, NIPS 2009

GLS: "Classification using Geometric Level Sets," Varshney & Willsky, JMLR 11:491-516, 2010


Motivation

[Figure: 3-D displays of the output classification function u(x): SVM vs. the proposed EE model]

Large margin should not be the sole criterion; we argue that sharper edges and smoother boundaries can also play significant roles.

General Models: min_u Σ_{i=1}^n L(u(x_i), y_i) + λ S(u)

Laplacian Regularization (LR):
  min_u ∫ (u − y)² dx + λ ∫ |∇u|² dx

Total Variation (TV):
  min_u ∫ (u − y)² dx + λ ∫ |∇u| dx

Euler's Elastica (EE):
  min_u ∫ (u − y)² dx + λ ∫ (a + b κ²) |∇u| dx,   with curvature κ = ∇·(∇u/|∇u|)
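The three regularizers can be compared numerically with a simple finite-difference sketch on a 2-D grid (my own illustration, not the authors' code; `a`, `b`, and `eps` are assumed parameters):

```python
import numpy as np

def energies(u, a=1.0, b=1.0, eps=1e-8):
    """Discrete LR, TV, and EE regularization energies on a 2-D grid.

    LR: sum |grad u|^2,  TV: sum |grad u|,
    EE: sum (a + b*kappa^2)|grad u|, kappa = div(grad u / |grad u|).
    eps keeps |grad u| away from zero in the divisions.
    """
    ux, uy = np.gradient(u)
    mag = np.sqrt(ux**2 + uy**2 + eps)
    # curvature = divergence of the unit normal field grad u / |grad u|
    kappa = np.gradient(ux / mag, axis=0) + np.gradient(uy / mag, axis=1)
    return (mag**2).sum(), mag.sum(), ((a + b * kappa**2) * mag).sum()

# Sanity check: a plane has zero curvature, so with a = 1 its EE energy
# collapses to its TV energy.
x = np.linspace(0.0, 1.0, 32)
u = np.tile(x, (32, 1))           # linear ramp along one axis
lr, tv, ee = energies(u)
```

On a linear ramp the curvature term vanishes, so the EE energy equals the TV energy; they differ only where level sets of u bend.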


TV&EE in Image Processing

TV: a measure of the total amount of change in the function's values

Image denoising (Rudin, Osher, Fatemi, 1992)

The elastica was introduced by Euler in 1744 to model torsion-free elastic rods

Image inpainting (Chan et al., 2002)

TV can preserve sharp edges, while EE can produce smooth boundaries

• For details, see T. Chan & J. Shen's textbook: Image Processing and Analysis: Variational, PDE, Wavelet, and Stochastic Methods, SIAM, 2005
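The edge-preservation contrast can be checked numerically on a 1-D signal: the quadratic (LR-style) energy charges a sharp step far more than a gradual ramp, while TV charges both the same (a small illustration of mine, not the authors' code):

```python
import numpy as np

def lr_energy(u):
    """Discrete Dirichlet (LR-style) energy: sum of squared differences."""
    return (np.diff(u) ** 2).sum()

def tv_energy(u):
    """Discrete total variation: sum of absolute differences."""
    return np.abs(np.diff(u)).sum()

step = np.concatenate([np.zeros(50), np.ones(50)])   # sharp edge
ramp = np.linspace(0.0, 1.0, 100)                    # gradual transition

# Both signals rise by 1, so their TV energies agree (both ~1.0), but the
# quadratic energy charges the sharp step about 99x more than the ramp.
print(tv_energy(step), tv_energy(ramp))   # ~1.0, ~1.0
print(lr_energy(step), lr_energy(ramp))   # 1.0, ~0.0101
```

This is why minimizing the quadratic energy blurs edges while minimizing TV can keep them sharp.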

Decision boundary

The mean curvature κ of a level hypersurface in d-dimensional space has the same expression, ∇·(∇u/|∇u|), up to the constant 1/(d − 1).

Framework

Energy functional minimization: min_u ∫ (u − y)² dx + λ S(u), with one of

  S_LR = ∫ |∇u|² dx
  S_TV = ∫ |∇u| dx
  S_EE = ∫ (a + b κ²) |∇u| dx

The calculus of variations yields the Euler-Lagrange PDE of each model:

  LR:  2(u − y) − λ Δu = 0    (#)
  TV:  2(u − y) − λ ∇·(∇u/|∇u|) = 0
  EE:  2(u − y) − λ ∇·V = 0

where the flux V is built from the unit normal n = ∇u/|∇u| and the curvature

  κ = ∇·(∇u/|∇u|) = Δu/|∇u| − (∇uᵀ (∇²u) ∇u)/|∇u|³
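Gradient-descent time marching relaxes the Euler-Lagrange equation toward steady state. A minimal 1-D sketch for the TV model (my illustration; `lam`, `dt`, `steps`, and the eps-smoothing of |u_x| are assumed choices, not the authors' settings):

```python
import numpy as np

def tv_time_marching(y, lam=0.1, dt=0.1, steps=500, eps=1e-6):
    """Flow the TV Euler-Lagrange equation to steady state (1-D sketch):
        u_t = lam * d/dx( u_x / |u_x| ) - 2 (u - y).
    eps smooths |u_x| to avoid division by zero."""
    u = y.astype(float).copy()
    for _ in range(steps):
        ux = np.gradient(u)
        flux = ux / np.sqrt(ux**2 + eps)     # smoothed u_x / |u_x|
        u = u + dt * (lam * np.gradient(flux) - 2.0 * (u - y))
    return u

# Denoising a noisy step: the data term keeps u close to y while the
# TV term flattens the noise without blurring the jump.
rng = np.random.default_rng(0)
y = np.concatenate([np.zeros(50), np.ones(50)]) + 0.1 * rng.standard_normal(100)
u = tv_time_marching(y)
```

The result has strictly smaller total variation than the noisy input, while the sharp jump in the middle survives.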

Solutions

Radial Basis Function approximation

a. Laplacian Regularization (LR)

b. TV & EE: we develop two solutions

Gradient descent time marching (GD)

Lagged linear equation iteration (LagLE)
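The RBF parameterization u(x) = Σ_j w_j φ(||x − x_j||) can be sketched as follows. Here a plain ridge penalty stands in for the LR smoothness integral (an illustrative simplification of mine, with Gaussian width `sigma` and weight `lam` as assumed parameters), which still yields a closed-form linear system:

```python
import numpy as np

def rbf_fit(X, y, sigma=1.0, lam=1e-3):
    """Fit u(x) = sum_j w_j phi(||x - x_j||) with Gaussian RBF bases.

    NOTE: a ridge term replaces the LR smoothness integral here
    (a hypothetical stand-in), giving the system (K + lam*I) w = y.
    """
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2.0 * sigma**2))
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def rbf_predict(X_train, w, X_new, sigma=1.0):
    """Evaluate the fitted RBF expansion at new points."""
    d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma**2)) @ w

# Tiny binary problem: labels in {-1, +1}; sign(u(x)) is the decision.
X = np.array([[0.0, 0.0], [0.0, 1.0], [3.0, 0.0], [3.0, 1.0]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
w = rbf_fit(X, y)
pred = np.sign(rbf_predict(X, w, X))
```

For the TV and EE models the system becomes nonlinear in w, which is where the GD and LagLE iterations come in.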

Experiments: Two-Moon Data

SVM

EE

Both methods can achieve 100% accuracy with different parameter combinations


Experiments: Binary Classification


Experiments: Multi-class Classification


Experiments: Multi-class Classification

Note: Results of TV and EE are computed by the LagLE method.


Experiments: Regression


Conclusions

Contributions:

Introduce TV & EE to the ML community

Demonstrate the significance of curvature and gradient empirically

Achieve superior performance for classification and regression

Future Work

Hinge loss

Other basis functions

Extension to semi-supervised setting

Existence and uniqueness of the PDE solutions

Fast algorithm to reduce the running time

End, thank you!
