Optimization & Learning for
Registration of Moving
Dynamic Textures
Junzhou Huang1, Xiaolei Huang2, Dimitris Metaxas1
Rutgers University1, Lehigh University2
Outline






Background
Goals & Problems
Related Work
Proposed Method
Experimental Results
Discussion & Conclusion
Background

Dynamic Textures (DT)


static camera, exhibiting certain stationary properties
Moving Dynamic Textures (MDT)

dynamic textures captured by a moving camera
DT [Kwatra et al. SIGGRAPH’03]
MDT [Fitzgibbon ICCV’01]
Background

Video registration
Required by many video analysis applications
Traditional assumption
Static, rigid, brightness constancy
Bergen et al. ECCV’92, Black et al. ICCV’93
Relaxing the rigidity assumption
Dynamic textures
Fitzgibbon ICCV’01; Doretto et al. IJCV’03; Yuan et al. ECCV’04; Chan et al. NIPS’05; Vidal et al. CVPR’05; Lin et al. PAMI’07; Rav-Acha et al. Dynamic Vision Workshop at ICCV’05; Vidal et al. ICCV’07
Our Goal

Registration of Moving Dynamic Textures

Recover the camera motion and register image
frames in the MDT image sequence
[Example MDT videos: translation to the left; translation to the right]
Complex Optimization Problem

Complex optimization



W.r.t. camera motion, dynamic texture model
Chicken-and-Egg Problem
Challenges



About the mean images
About the Linear Dynamic System (LDS) model
About the camera motion
Related Work

Fitzgibbon, ICCV’01




Pioneering attempt
Stochastic rigidity
Non-linear optimization
Vidal et al. CVPR’05



Time-varying LDS model
Static assumption within small time windows
Simple and general framework, but it often underestimates the motion
Formulation

Registration of MDT





I(t), the video frame
θ(t), camera motion parameters
y0, the desired average image of the video
y(t), appearance of DT
x(t), dynamics of DT
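As a minimal sketch of how these quantities might fit together, assume the standard LDS dynamic-texture form of Doretto et al. IJCV’03 for the dynamics and appearance, and a warp induced by the camera motion (the paper marginalizes the appearance–dynamics mapping, so its exact form differs):
\[
x(t) = A\,x(t-1) + v(t), \qquad
y(t) = C\,x(t) + w(t), \qquad
I(t) = \mathcal{W}_{\theta(t)}\big(y_0 + y(t)\big) + n(t),
\]
with $A$ the dynamics matrix, $C$ the appearance mapping, $\mathcal{W}_{\theta(t)}$ the warp given camera motion $\theta(t)$, and $v(t)$, $w(t)$, $n(t)$ Gaussian noise terms.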
Generative Model
[Graphical model: hidden dynamics x(t-1), x(t), x(t+1); DT appearances y(t-1), y(t), y(t+1); the average image y0; warps W(t-1), W(t), W(t+1) determined by the camera motion; observed frames I(t-1), I(t), I(t+1).]
Generative image model for an MDT
First Observation

Good registration



A good registration, i.e., one based on accurate camera motion, should simplify the dynamic texture model while preserving all useful information
Used by Fitzgibbon, ICCV’01: minimizing the entropy of an auto-regressive process
Used by Vidal et al., CVPR’05: optimizing the time-varying LDS model via a piecewise LDS model
Second Observation

Good registration


A good registration, i.e., one based on accurate camera motion, should lead to a sharp average image whose derivative-filter statistics are similar to those of the input image frames (illustrated by the sketch after this slide).
Statistics of derivative filters in images


Student-t distribution/heavy-tailed image priors
Huang et al. CVPR’99, Roth et al. CVPR’05
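To make the observation concrete, here is a minimal Python sketch (not the authors' code) that compares the histogram of derivative-filter responses of a candidate average image with the pooled histogram of the input frames; `frames` is assumed to be a list of 2-D grayscale arrays.

```python
import numpy as np

def derivative_histogram(image, bins):
    """Histogram of horizontal derivative-filter responses for one image."""
    dx = np.diff(image.astype(np.float64), axis=1)        # simple [-1, 1] derivative filter
    hist, _ = np.histogram(dx, bins=bins, density=True)
    return hist

def statistics_gap(frames, average_image, bins=np.linspace(-60, 60, 61)):
    """L1 distance between the derivative statistics of the average image and
    the pooled derivative statistics of the input frames; smaller is better."""
    frame_hist = np.mean([derivative_histogram(f, bins) for f in frames], axis=0)
    avg_hist = derivative_histogram(average_image, bins)
    return float(np.abs(frame_hist - avg_hist).sum())
```

Under this observation, a good registration should shrink the gap returned by `statistics_gap`, which is what the derivative-response plot for the grass video later in the deck shows.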
Prior Models



The Average Image Prior
The Motion Prior
The Dynamics Prior
Average Image Priors

Student-t distribution

Model parameters learned via the contrastive divergence method (prior form sketched below)
(a) before registration, (b) in the middle of registration, (c) after registration
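A hedged sketch of what such a Student-t (heavy-tailed) prior on the derivative-filter responses of the average image $y_0$ can look like, in the Fields-of-Experts style of Roth et al. CVPR’05; the filters $f_j$ and exponents $\alpha_j$ would be the model parameters learned by contrastive divergence, and the paper's exact parameterization may differ:
\[
p(y_0) \;\propto\; \prod_{j}\prod_{i}\Big(1 + \tfrac{1}{2}\,(f_j * y_0)_i^{2}\Big)^{-\alpha_j},
\]
where $(f_j * y_0)_i$ is the response of derivative filter $f_j$ at pixel $i$.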
Motion / Dynamics Priors

Gaussian Perturbation (Motion)



Uncertainty in the motion is modeled by a Gaussian perturbation about the mean estimate M0, with covariance matrix S (a diagonal matrix)
Motivated by the work [Pickup et al. NIPS’06]
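A minimal sketch of this motion prior, assuming $\theta(t)$ collects the camera-motion parameters of frame $t$:
\[
p\big(\theta(t)\big) = \mathcal{N}\big(\theta(t);\, M_0,\, S\big), \qquad S \text{ diagonal},
\]
i.e., each motion parameter is perturbed independently about its entry in the mean estimate $M_0$.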
GPDM / MAR model (Dynamics)


Marginalizing over all possible mappings between
appearance and dynamics
Motivated by the work [Wang et al. NIPS’05],
[Moon et al. CVPR’06]
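A hedged sketch of the resulting dynamics prior: marginalizing a kernelized mapping from the dynamics $X = [x(1),\dots,x(T)]^{\top}$ to the appearances $Y = [y(1),\dots,y(T)]^{\top}$ under a Gaussian prior yields a GP-LVM/GPDM-style likelihood (Wang et al. NIPS’05); the kernel and any per-dimension scaling in the paper may differ:
\[
p(Y \mid X) \;=\; \frac{1}{\sqrt{(2\pi)^{TD}\,\lvert K_X\rvert^{D}}}\,
\exp\!\Big(-\tfrac{1}{2}\,\mathrm{tr}\big(K_X^{-1} Y Y^{\top}\big)\Big),
\]
where $K_X$ is the kernel matrix over the latent dynamics and $D$ is the appearance dimension.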
Joint Optimization

Generative image model

Optimization

Final marginal likelihood

Scaled conjugate gradients algorithm (SCG)
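Putting the pieces together, a hedged sketch of the kind of joint objective being maximized, treating the priors as independent factors (the paper's final marginal likelihood may group or integrate these terms differently):
\[
\{\hat{\theta}, \hat{y}_0, \hat{X}, \hat{Y}\} = \arg\max_{\theta,\,y_0,\,X,\,Y}\;
p\big(I(1{:}T) \mid \theta, y_0, Y\big)\, p(y_0)\, p(\theta)\, p(Y \mid X)\, p(X),
\]
with the maximization carried out by the scaled conjugate gradients (SCG) algorithm.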
Procedures





Obtaining image derivative prior model
Dividing the long sequence into many short image
sequences
Initialization for video registration
Performing model optimization with the proposed
prior models until model convergence.
With estimated y0, Y and X, the camera motion is then
obtained iteratively by Maximum Likelihood
estimation using SCG optimization
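A runnable toy sketch of that last step (not the paper's implementation): estimating a per-frame translation by minimizing a simple Gaussian image likelihood against the current average image, using scipy's conjugate-gradient optimizer as a stand-in for SCG; the real objective also involves the DT appearance and the prior terms.

```python
import numpy as np
from scipy.ndimage import shift as translate
from scipy.optimize import minimize

def negative_log_likelihood(theta, frame, y0):
    """Toy Gaussian likelihood: warp the frame back by translation theta
    and compare it with the current average image y0."""
    aligned = translate(frame, shift=(-theta[0], -theta[1]), order=1, mode='nearest')
    residual = aligned - y0
    return 0.5 * float(np.sum(residual ** 2))

def estimate_translation(frame, y0, theta_init=(0.0, 0.0)):
    """ML-style estimate of the per-frame camera translation with a
    conjugate-gradient optimizer (stand-in for scaled conjugate gradients)."""
    result = minimize(negative_log_likelihood, x0=np.asarray(theta_init, dtype=float),
                      args=(frame, y0), method='CG')
    return result.x
```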
Obtaining Data

Three DT video sequences


DT data [Kwatra et al. SIGGRAPH’03]
Synthesized MDT video sequence


60 frames each; no camera motion from the 1st to the 20th frame and from the 41st to the 60th frame
Camera motion with speed [1, 0] from the 21st to the 40th frame
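A minimal sketch of how such a synthetic MDT clip can be produced from a DT clip, assuming the speed [1, 0] means one pixel of horizontal translation per frame (an illustration, not necessarily the authors' exact protocol):

```python
import numpy as np

def synthesize_mdt(dt_frames, speed=(1, 0), moving_range=(20, 40)):
    """Apply a cumulative camera translation to DT frames whose 0-indexed
    position falls inside moving_range; the camera is static elsewhere."""
    mdt, offset = [], np.zeros(2)
    for t, frame in enumerate(dt_frames):
        if moving_range[0] <= t < moving_range[1]:
            offset += np.asarray(speed, dtype=float)   # assumed (x, y) ordering
        # Integer-pixel translation; np.roll wraps content around the borders.
        shifted = np.roll(frame, shift=(int(offset[1]), int(offset[0])), axis=(0, 1))
        mdt.append(shifted)
    return mdt
```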
Grass MDT Video

The average image
(a) one frame, (b) the average image after registration, (c) the average image before registration
Grass MDT Video
The statistics of derivative filter responses
[Histogram of derivative-filter responses (probability distribution vs. gradient, roughly −60 to 60) for the input images, the average image after registration, and the average image before registration.]
Evaluation / Comparison

False Estimation Fraction (FEF; a worked example follows this slide)

Comparison with two classical methods


Hybrid method [Bergen et al. ECCV’92; Black et al. ICCV’93]
Vidal’s method [Vidal et al. CVPR’05]
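The slides do not spell out the FEF formula, but it is consistent with the reported numbers if read as the relative error of the estimated total motion; checking against the flower-bed results reported later:
\[
\mathrm{FEF} = \frac{\lvert \text{estimated motion} - \text{ground truth}\rvert}{\lvert \text{ground truth}\rvert},
\qquad
\frac{\lvert 104.52 - 110\rvert}{110} \approx 4.98\%,
\qquad
\frac{\lvert 60 - 85\rvert}{85} \approx 29.41\%.
\]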
Waterfall MDT Video

Motion estimation
(a) Ground truth, (b) by hybrid method, (c) by Vidal’s, (d) by our method
Waterfall MDT Video

The average image and its statistics
The average image and its derivative filter response distribution after
registration by: (a) our method, (b) Vidal’s method, (c) hybrid method
FEF Comparison

On three synthesized MDT videos
Experiment on real MDT Video




Moving flower bed video
554 frames total
Ground truth motion: 110 pixels
Estimate: 104.52 pixels (FEF 4.98%)
Conclusions

Proposed: powerful priors for MDT registration
Solution for: camera motion, the average image of the video, and the dynamic texture model
What have we learned?


Correct registration simplifies the DT model while preserving useful information
Better registration leads to a sharper average image
Thank you !
Future work



More complex camera motion
Different metrics for performance evaluation
Multiple dynamic texture segmentation
Experiment on real MDT Video


Moving flower bed video
Our method




554 frames total
Ground truth motion: 110 pixels
Estimate: 104.52 pixels (FEF 4.98%)
Vidal’s method



250 frames [Vidal et al. CVPR’05]
Ground truth motion: 85 pixels
Estimate: 60 pixels (FEF 29.41%)