Supervised Descent Method and its Applications to Face Alignment

Xuehan Xiong, Fernando De la Torre (CVPR-2013, extensions: submitted to TPAMI)

Zoltán Szabó

Gatsby Unit, Tea Talk

March 16, 2015

Motivation

Computer vision: many tasks boil down to continuous nonlinear optimization.

Our focus: facial feature detection/tracking.

Newton’s method (and its variants)

Task: $\min_x f(x)$, $f \in C^2$.

Newton's method: locally quadratic approximation, $x_0$: given.

Second-order Taylor expansion ($k = 0, 1, 2, \ldots$) around $x_k$:

$$f(x_k + \Delta x) \approx f(x_k) + J_f(x_k)^T \Delta x + \frac{1}{2} \Delta x^T H(x_k) \Delta x$$

⇒

$$\Delta x_{k+1} = -H^{-1}(x_k) J_f(x_k), \quad x_{k+1} = x_k + \Delta x_{k+1}.$$
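The Newton update above can be sketched in a few lines of NumPy. This is a minimal illustration, not part of the original talk: the function name `newton_minimize`, the toy objective, and the stopping tolerance are all assumptions chosen for the example.

```python
import numpy as np

def newton_minimize(grad, hess, x0, n_iter=50, tol=1e-12):
    """Iterate x_{k+1} = x_k - H(x_k)^{-1} J_f(x_k) until the step is tiny."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        # Solve H(x_k) dx = -J_f(x_k) instead of forming the inverse explicitly.
        dx = np.linalg.solve(hess(x), -grad(x))
        x = x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x

# Toy objective (assumed for illustration): f(x) = x0^4 + x0^2 + (x1 - 1)^2,
# which is convex with its minimum at (0, 1).
def grad(x):
    return np.array([4 * x[0] ** 3 + 2 * x[0], 2 * (x[1] - 1)])

def hess(x):
    return np.array([[12 * x[0] ** 2 + 2, 0.0], [0.0, 2.0]])

x_min = newton_minimize(grad, hess, x0=[1.0, -1.0])
print(x_min)
```

Solving the linear system at every step is the cost that the rest of the talk tries to avoid.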

Newton’s method: pro & contra

Advantages:

If it converges ⇒ quadratic rate ($q = 2$): $\lim_{k \to \infty} \frac{\|x_k - x_*\|}{\|x_{k-1} - x_*\|^q} = L > 0$.

If $x_0$ is "close enough" to $x_*$ ⇒ convergence.

Disadvantages (in CV):

$f$: non-differentiable (SIFT) → numerical $J_f$, $H$: slow.

Linear equation: expensive (quasi-Newton too).
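The quadratic rate can be checked numerically on a one-dimensional toy problem. The sketch below assumes the objective $f(x) = e^x - 2x$ (not from the talk), whose minimizer is $x_* = \ln 2$; for this choice the ratio $|x_k - x_*| / |x_{k-1} - x_*|^2$ should tend to $L = 1/2$.

```python
import math

def newton_errors(x0, n_iter):
    """Run 1-D Newton on f(x) = exp(x) - 2x and record |x_k - x_*|."""
    x_star = math.log(2.0)
    x = x0
    errors = [abs(x - x_star)]
    for _ in range(n_iter):
        # f'(x) = exp(x) - 2, f''(x) = exp(x)
        x = x - (math.exp(x) - 2.0) / math.exp(x)
        errors.append(abs(x - x_star))
    return errors

errors = newton_errors(1.0, 3)
# Ratio from the definition of the rate with q = 2.
ratios = [errors[k] / errors[k - 1] ** 2 for k in range(1, len(errors))]
print(ratios)
```

Only a few iterations are used on purpose: after that the error reaches floating-point precision and the ratio is no longer meaningful.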

Optimization idea (z-axis: reversed)

Face alignment: $x_0$ = initial estimate, $x_*$ = manual labels

Face alignment: formulation

Image: $d$.

Landmark locations ($p$ landmarks): $x = [x_1; y_1; \ldots; x_p; y_p] \in \mathbb{R}^{2p}$.

Feature extraction function (SIFT): $h$.

Features around the landmarks: $h(x; d) \in \mathbb{R}^{128p}$.

Task: for given $x_0$,

$$g(\Delta x) = f(x_0 + \Delta x) = \|h(x_0 + \Delta x; d) - \phi_*\|_2^2 \to \min_{\Delta x}, \quad \phi_* = h(x_*; d).$$

Face alignment

Task: $f(x_0 + \Delta x) = \|h(x_0 + \Delta x; d) - \phi_*\|_2^2 \to \min_{\Delta x}$.

By the Newton trick (+ chain rule):

$$\Delta x_1 = -H^{-1}(x_0) J_f(x_0) = -2 H^{-1}(x_0) J_h^T(x_0) (\phi_0 - \phi_*),$$

$$\phi_0 = h(x_0; d), \quad \phi_* = h(x_*; d).$$


Idea [$H := H(x_0)$, $J_h := J_h(x_0)$]:

$$\Delta x_1 = -2 H^{-1} J_h^T (\phi_0 - \phi_*) = \left[-2 H^{-1} J_h^T\right] \phi_0 + \left[2 H^{-1} J_h^T \phi_*\right] = R_0 \phi_0 + b_0$$

⇒ optimize for $(R_0, b_0)$ based on samples.
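The regrouping of the Newton step into an affine map of $\phi_0$ can be verified numerically. In the sketch below every quantity is a random stand-in (an assumption for illustration): `J_h` plays the role of $J_h(x_0)$, `H` is an arbitrary symmetric positive definite matrix in place of $H(x_0)$, and the sizes are toy values rather than $2p$ and $128p$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 10                       # toy sizes: n ~ 2p coordinates, m ~ 128p features

J_h = rng.normal(size=(m, n))      # stand-in for J_h(x_0)
H = 2 * J_h.T @ J_h + 0.1 * np.eye(n)   # stand-in for H(x_0), symmetric positive definite
phi0 = rng.normal(size=m)          # stand-in for h(x_0; d)
phi_star = rng.normal(size=m)      # stand-in for h(x_*; d)

H_inv_Jt = np.linalg.solve(H, J_h.T)    # H^{-1} J_h^T, shape (n, m)

# Newton step written directly.
dx_newton = -2 * H_inv_Jt @ (phi0 - phi_star)

# Same step written as an affine function of phi0.
R0 = -2 * H_inv_Jt
b0 = 2 * H_inv_Jt @ phi_star
dx_affine = R0 @ phi0 + b0

print(np.allclose(dx_newton, dx_affine))
```

Note that `b0` depends on `phi_star`, which requires the true landmarks $x_*$. That is not available for a new image, which is one way to read why $(R_0, b_0)$ are learned from labelled samples instead of computed.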

Face alignment

The algorithm is unlikely to converge in 1 iteration.

Cascade of regressors:

$$\Delta x_1 = R_0 \phi_0 + b_0,$$

$$\vdots$$

$$\Delta x_k = R_{k-1} \phi_{k-1} + b_{k-1},$$

where $\phi_{k-1} = h(x_{k-1}; d)$: features at the previous landmarks.

Face alignment: optimization ($k = 0$)

Given: set of images: $\{d^i\}_{i=1}^N$, initial estimates: $\{x_0^i\}_{i=1}^N$, hand-labelled landmarks: $\{x_*^i\}_{i=1}^N$ ⇒

Optimal updates, extracted features:

$$\Delta x_{*,0}^{i} = x_*^i - x_0^i, \quad \phi_0^i = h(x_0^i; d^i).$$

Objective:

$$J(R_0, b_0) = \frac{1}{N} \sum_{i=1}^{N} \left\| \Delta x_{*,0}^{i} - R_0 \phi_0^i - b_0 \right\|_2^2 \to \min_{R_0, b_0}.$$
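The objective above is an ordinary linear least-squares problem in $(R_0, b_0)$. A minimal sketch of one way to solve it with NumPy follows; the function name `fit_stage`, the synthetic data, and the toy dimensions are assumptions for illustration, and plain least squares is used here without any claim that it matches the authors' solver.

```python
import numpy as np

def fit_stage(Phi, DX):
    """Minimise (1/N) * sum_i ||DX[i] - R @ Phi[i] - b||^2 over (R, b).

    Phi: (N, m) features phi_0^i, DX: (N, n) optimal updates.
    """
    N = Phi.shape[0]
    A = np.hstack([Phi, np.ones((N, 1))])        # ones column carries the bias b
    W, *_ = np.linalg.lstsq(A, DX, rcond=None)   # W has shape (m + 1, n)
    R = W[:-1].T                                 # (n, m)
    b = W[-1]                                    # (n,)
    return R, b

# Synthetic, noise-free check with toy sizes (real sizes would be m = 128p, n = 2p).
rng = np.random.default_rng(0)
N, m, n = 500, 6, 4
R_true = rng.normal(size=(n, m))
b_true = rng.normal(size=n)
Phi = rng.normal(size=(N, m))
DX = Phi @ R_true.T + b_true

R0, b0 = fit_stage(Phi, DX)
print(np.allclose(R0, R_true), np.allclose(b0, b_true))
```

The $1/N$ factor does not change the minimizer, so it is dropped in the solve.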

Face alignment: optimization (general $k$)

Update the landmark estimates ($x_k$):

$$\Delta x_k^i = R_{k-1} \phi_{k-1}^i + b_{k-1}, \quad x_k^i = x_{k-1}^i + \Delta x_k^i \quad (i = 1, \ldots, N).$$

Compute optimal updates ($\forall i$), extract features:

$$\Delta x_{*,k}^{i} = x_*^i - x_k^i, \quad \phi_k^i = h(x_k^i; d^i).$$

Objective:

$$J(R_k, b_k) = \frac{1}{N} \sum_{i=1}^{N} \left\| \Delta x_{*,k}^{i} - R_k \phi_k^i - b_k \right\|_2^2 \to \min_{R_k, b_k}.$$

Numerical experience: convergence in 4-5 steps.
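The stage-by-stage training can be sketched as a loop that alternates feature extraction, a least-squares fit, and a landmark update. Everything below is a toy under stated assumptions: `toy_features` is a hypothetical smooth stand-in for SIFT, the "image" is reduced to a hidden target vector, and `train_cascade` is an illustrative name, not the authors' code.

```python
import numpy as np

def toy_features(x, d):
    # Hypothetical stand-in for h(x; d): a nonlinear function of the offset
    # between the current landmarks x and the hidden shape stored in d.
    return np.tanh(x - d)

def train_cascade(images, x_init, x_star, h, n_stages=5):
    """Learn (R_k, b_k) for k = 0..n_stages-1 and track the mean squared error."""
    X = x_init.copy()
    stages = []
    errors = [np.mean(np.sum((x_star - X) ** 2, axis=1))]
    for _ in range(n_stages):
        Phi = np.stack([h(x, d) for x, d in zip(X, images)])   # phi_k^i
        DX = x_star - X                                        # optimal updates
        A = np.hstack([Phi, np.ones((Phi.shape[0], 1))])
        W, *_ = np.linalg.lstsq(A, DX, rcond=None)             # fit (R_k, b_k)
        R, b = W[:-1].T, W[-1]
        X = X + Phi @ R.T + b                                  # landmark update
        stages.append((R, b))
        errors.append(np.mean(np.sum((x_star - X) ** 2, axis=1)))
    return stages, errors

rng = np.random.default_rng(0)
N, dim = 200, 4
x_star = rng.normal(size=(N, dim))
images = x_star                     # toy: each "image" is just its hidden shape
x_init = x_star + rng.uniform(-1.5, 1.5, size=(N, dim))

stages, errors = train_cascade(images, x_init, x_star, toy_features)
print(errors)
```

Because $(R, b) = (0, 0)$ is always a candidate, each least-squares fit cannot increase the training error, so `errors` is non-increasing on the training set. Nothing here says how the error behaves on unseen images.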

Face alignment: training, testing

Training ⇒ $\{R_k, b_k\}$.

Testing: test image: $\tilde{d}$, initial estimate: $x_0$, extract features: $\phi_0 = h(x_0; \tilde{d})$, iteratively compute $\Delta x_k$ and the features at $x_k$ ($k = 1, \ldots$):

$$\Delta x_k = R_{k-1} \phi_{k-1} + b_{k-1}, \quad \phi_k = h(x_k; \tilde{d}).$$
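At test time the learned stages are simply applied in order. The sketch below assumes the stages are available as a list of `(R, b)` pairs; `sdm_predict` and `toy_features` are illustrative names, and the stages are set by hand (not trained) so that the result can be checked exactly.

```python
import numpy as np

def sdm_predict(x0, image, stages, h):
    """Apply x_k = x_{k-1} + R_{k-1} h(x_{k-1}; image) + b_{k-1} for every stage."""
    x = np.asarray(x0, dtype=float)
    for R, b in stages:
        phi = h(x, image)          # features at the current landmark estimate
        x = x + R @ phi + b
    return x

def toy_features(x, image):
    # Hypothetical feature function: the offset from a target hidden in the "image".
    return x - image

dim = 4
# Hand-set stages: each one halves the remaining offset.
stages = [(-0.5 * np.eye(dim), np.zeros(dim)) for _ in range(5)]
image = np.array([1.0, 2.0, 3.0, 4.0])
x0 = np.zeros(dim)

x_hat = sdm_predict(x0, image, stages, toy_features)
print(x_hat)
```

No linear system is solved at test time: each stage costs one feature extraction and one matrix-vector product.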

Facial feature detection: 2 "face in the wild" datasets

Last row: 10 worst cases.

Facial feature detection: cumulative error distribution curves

Facial feature tracking

Initialization = landmark estimate from the previous frame.

(a): average RMSEs on 29 videos, (b): RMSE = 5.03 demo.

Summary

Focus: continuous nonlinear optimization.

Newton’s method: expensive.

Idea: supervised Newton method, learn cascade of affine regressors based on samples.

Application: facial feature detection, face tracking.

Thank you for the attention!
