ICCV07 ppt - UCLA Statistics

Deformable Template as Active Basis
Zhangzhang Si
UCLA Department of Statistics
Ying Nian Wu, Zhangzhang Si, Chuck Fleming, Song-Chun Zhu ICCV07
The work presented in this 2007 talk is outdated; see
http://www.stat.ucla.edu/~ywu/AB/ActiveBasisMarkII.html
for the most up-to-date results.
2016/3/19
CIVS, Statistics Dept. UCLA
1
Motivation
Design a deformable template to model a set of images of a
certain object category. The template can be learned from
example images.
Related work
Representation: generative and deformable models
1. Sparse coding [Olshausen-Field 96]
2. Deformable templates [Yuille-Hallinan-Cohen 89]
3. Active contours [Kass-Witkin-Terzopoulos 87]
4. Active appearance [Cootes-Edwards-Taylor 95]
5. Texton model [Zhu et al. 02]
Computation: learning and pursuit algorithm
1. Matching pursuit [Mallat and Zhang 93]
2. HMAX [Riesenhuber-Poggio 99, Mutch-Lowe 06]
3. Adaboost [Freund-Schapire 96, Viola-Jones 99]
Linear additive image model
Image reconstruction by matching pursuit:
$I = \sum_{i=1}^{n} c_i B_i + \varepsilon$,
where $\{B_i, i = 1, \dots, n\}$ are selected from a dictionary of Gabor wavelet elements $\{B_{x,y,s,a}\}$, indexed by location $(x, y)$, scale $s$, and orientation $a$.
Two extensions:
1. From encoding a single image to simultaneously encoding a set of images.
2. Allow each Gabor wavelet element $B_i$ to be locally perturbed.
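The single-image case that these extensions start from can be illustrated with a minimal matching pursuit sketch. It assumes images and dictionary elements are flattened into plain lists and that every element has unit norm; the function name `matching_pursuit` and the tiny 3-dimensional dictionary are hypothetical stand-ins for a real Gabor dictionary.

```python
import math

def dot(u, v):
    """Inner product <u, v> of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def matching_pursuit(image, dictionary, n):
    """Greedily select n elements; return [(index, coefficient)] and the residual.

    Assumes every element of `dictionary` has unit norm.
    """
    residual = list(image)
    selected = []
    for _ in range(n):
        # Response of every dictionary element to the current residual
        responses = [dot(residual, b) for b in dictionary]
        # Pick the element with the largest absolute response
        k = max(range(len(dictionary)), key=lambda j: abs(responses[j]))
        c = responses[k]
        selected.append((k, c))
        # Explain away the selected component
        residual = [r - c * b for r, b in zip(residual, dictionary[k])]
    return selected, residual

# Toy dictionary: three axis-aligned elements plus one diagonal element
s = 1 / math.sqrt(2)
dictionary = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [s, s, 0]]
image = [3.0, 3.0, 0.5]
selected, residual = matching_pursuit(image, dictionary, 2)
```

In this toy case the diagonal element is chosen first, then the third axis element, and the residual is numerically zero after two steps.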
The active basis model
Active basis: $\{B_i, i = 1, \dots, n\}$, common to all $I_m$
(Gabor elements represented by bars)
“Active”: local perturbation
$B_{x,y,s,a} \approx B_i$ if:
$x = x_i + \delta \sin a_i$
$y = y_i + \delta \cos a_i$
$a = a_i + \gamma$
where $\delta \in [-b_1, b_1]$, $\gamma \in [-b_2, b_2]$
When encoding image $I_m$, we use the perturbed version of $B_i$:
$B_{m,i} = \arg\max_{B \approx B_i} |\langle I_m, B \rangle|$
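The local perturbation rule can be sketched as an enumeration of candidate elements around a nominal element, followed by an arg max over absolute responses. This is only an illustration under assumptions: orientation is taken to be an integer index out of `n_orient` evenly spaced angles, shifts are integer pixel steps, and `perturbations`, `best_perturbation`, and `toy_response` are hypothetical names, not part of the released code.

```python
import math

def perturbations(xi, yi, ai, b1, b2, n_orient):
    """Candidates (x, y, a) with x = xi + d*sin(angle), y = yi + d*cos(angle), a = ai + g."""
    angle = math.pi * ai / n_orient  # orientation index to angle (assumed discretization)
    out = []
    for d in range(-b1, b1 + 1):          # shift d in [-b1, b1]
        x = xi + round(d * math.sin(angle))
        y = yi + round(d * math.cos(angle))
        for g in range(-b2, b2 + 1):      # rotation g in [-b2, b2]
            out.append((x, y, (ai + g) % n_orient))
    return out

def best_perturbation(response, xi, yi, ai, b1, b2, n_orient):
    """Return the candidate with the largest absolute response."""
    return max(perturbations(xi, yi, ai, b1, b2, n_orient),
               key=lambda p: abs(response(*p)))

def toy_response(x, y, a):
    """Hypothetical filter response: strong only at one perturbed position."""
    return -2.0 if (x, y, a) == (10, 11, 7) else 0.1

candidates = perturbations(10, 10, 0, b1=2, b2=1, n_orient=8)
best = best_perturbation(toy_response, 10, 10, 0, b1=2, b2=1, n_orient=8)
```

With `b1=2` and `b2=1` there are 5 shifts times 3 orientations, so 15 candidates; the sketch selects the one whose response has the largest magnitude regardless of sign.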
Deformable template using active basis
A car template
(Gabor elements represented by bars)
An incoming car image:
Deformable template using active basis
A car template
Deformed to fit many car instances
Learning the template: pursuing the active basis
q(I): background distribution
(all natural images)
p(I): pursued model to approximate
the true distribution.
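[Figure: elements $B_1$, $B_2$, $B_3$ are pursued in sequence; the marginal densities $p(r_1)$, $p(r_2)$, $p(r_3)$ are shown against the background density $q(r)$, with $\mathrm{KL}(p(r_i) \,\|\, q(r_i))$ plotted against the number of Gabor elements selected.]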
Pursuing the active basis
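$\mathbf{B} = \{B_i, i = 1, \dots, n\}$ and $\Lambda = \{\lambda_i, i = 1, \dots, n\}$ are the model parameters.
MLE:
$(\mathbf{B}^*, \Lambda^*) = \arg\max_{\mathbf{B}, \Lambda} \sum_{m=1}^{M} \log p(I_m; \mathbf{B}, \Lambda)$
$= \arg\max_{\mathbf{B}, \Lambda} \sum_{m=1}^{M} \log \frac{p(I_m; \mathbf{B}, \Lambda)}{q(I_m)}$
$= \arg\max_{\mathbf{B}, \Lambda} \sum_{m=1}^{M} \log \frac{p(r_{m,1}, \dots, r_{m,n}; \mathbf{B}, \Lambda)}{q(r_{m,1}, \dots, r_{m,n})}$ (projected on $\{B_1, \dots, B_n\}$)
$= \arg\max_{\mathbf{B}, \Lambda} \sum_{i=1}^{n} \sum_{m=1}^{M} \log \frac{p(r_{m,i}; \lambda_i)}{q(r_{m,i})}$ (orthogonality of $\{B_1, \dots, B_n\}$)
where $r_{m,i} = \langle I_m, B_{m,i} \rangle$, and $B_{m,i}$ is the locally perturbed version of $B_i$.
For each step $i = 1, \dots, n$, we want to choose $B_i$ to maximize
$\frac{1}{M} \sum_{m=1}^{M} \log \frac{p(r_{m,i}; \lambda_i)}{q(r_{m,i})} \approx \mathrm{KL}(p(r_{m,i}; \lambda_i) \,\|\, q(r_{m,i}))$
subject to the orthogonality constraint.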
Pursuing the active basis
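$\log \frac{p(r_{m,i})}{q(r_{m,i})}$ ?
In the case of an exponential model:
$\frac{p(r_{m,i})}{q(r_{m,i})} = \frac{1}{Z(\lambda_i)} \exp\{\varphi(r_{m,i}) / \lambda_i\}$
$\sum_{m=1}^{M} \log \frac{p(r_{m,i}; \lambda_i)}{q(r_{m,i})} = \frac{1}{\lambda_i} \sum_{m=1}^{M} \varphi(r_{m,i}) - M \log Z(\lambda_i)$
We have two designs for the transformation $\varphi(\cdot)$:
1) Whitening: $\varphi(r_{m,i}) \sim \mathrm{Exp}(1)$ under $q$.
2) Sigmoid: $\varphi(r_{m,i}) \to \text{const}$.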
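One generic way to obtain a transformation with the whitening property (an Exp(1) distribution under the background q) is the probability integral transform: map a response r to minus the log of its background tail probability. The sketch below does this with an empirical CDF built from background samples. It is an assumed construction for illustration, not necessarily the whitening used by the authors; `make_whitening` and the half-normal toy background are hypothetical.

```python
import bisect
import math
import random

def make_whitening(background_responses):
    """Build phi(r) = -log P_q(R > r) from background samples (empirical tail)."""
    sorted_bg = sorted(background_responses)
    n = len(sorted_bg)

    def phi(r):
        rank = bisect.bisect_right(sorted_bg, r)  # number of background values <= r
        tail = (n - rank + 1) / (n + 1)           # smoothed tail estimate, never zero
        return -math.log(tail)

    return phi

# Toy background: absolute values of Gaussian noise (assumed, for illustration only)
random.seed(0)
background = [abs(random.gauss(0.0, 1.0)) for _ in range(5000)]
phi = make_whitening(background)

# Under q, the transformed responses should have mean close to 1, as Exp(1) does
mean_phi = sum(phi(r) for r in background) / len(background)
```

Because the transform is monotone in r, strong responses that are rare under the background receive large values, which is what the pursuit criterion rewards.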
Shared pursuit algorithm
0) For $m = 1, \dots, M$, and for each $B$, compute $\varphi(\langle I_m, B \rangle)$ (convolution & transformation).
1) Choose $B_i$ with maximum $\sum_{m=1}^{M} \varphi(\langle I_m, B_{m,i} \rangle)$
(recall that $B_{m,i} = \arg\max_{B \approx B_i} |\langle I_m, B \rangle|$).
2) For $m = 1, \dots, M$, for each $B \not\perp B_{m,i}$, set $\varphi(\langle I_m, B \rangle) = 0$.
3) Go back to 1), until a stopping criterion is met, e.g.,
$\frac{1}{M} \sum_{m=1}^{M} \varphi(\langle I_m, B_{m,i} \rangle) \le$ average in natural images.
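The steps of the shared pursuit can be sketched on a toy one-dimensional problem. The sketch assumes step 0 has already produced a table `scores[m][j]` of transformed responses, that candidates live on a 1D grid where perturbation means moving at most `b` positions, and that two candidates count as non-orthogonal when they are within `overlap` positions of each other. These simplifications, and the names `shared_pursuit`, `b`, `overlap`, and `threshold`, are assumptions for illustration only.

```python
def shared_pursuit(scores, b, overlap, n_max, threshold):
    """Select shared elements; return [(nominal index, [perturbed index per image])]."""
    scores = [list(row) for row in scores]  # work on a copy
    M, J = len(scores), len(scores[0])

    def local_best(m, j):
        # Best locally perturbed candidate for image m around nominal position j
        lo, hi = max(0, j - b), min(J, j + b + 1)
        return max(range(lo, hi), key=lambda k: scores[m][k])

    template = []
    while len(template) < n_max:
        # Step 1: sum over images of the best perturbed response for each candidate
        totals = [sum(scores[m][local_best(m, j)] for m in range(M)) for j in range(J)]
        i = max(range(J), key=lambda j: totals[j])
        # Step 3: stop when the average score is no better than the background level
        if totals[i] / M <= threshold:
            break
        picks = [local_best(m, i) for m in range(M)]
        template.append((i, picks))
        # Step 2: zero out candidates overlapping the perturbed element in each image
        for m, k in enumerate(picks):
            for t in range(max(0, k - overlap), min(J, k + overlap + 1)):
                scores[m][t] = 0.0
    return template

# Three toy "images": the same two structures, slightly shifted in each image
scores = [
    [0, 0, 5, 0, 0, 0, 0, 3, 0, 0],
    [0, 0, 0, 5, 0, 0, 0, 3, 0, 0],
    [0, 4, 0, 0, 0, 0, 0, 0, 3, 0],
]
template = shared_pursuit(scores, b=1, overlap=1, n_max=5, threshold=1.0)
```

Here the first selected element sits at nominal position 2 and is perturbed to positions 2, 3, and 1 in the three images; the second sits at position 7 with perturbed positions 7, 7, and 8, after which the stopping criterion fires.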
Learning the template: pursuing the active basis
A car template consisting of 60 Gabor elements
Car instances
Experiment 1: learning an active basis model of a vehicle template
• 37 training images, listed in descending order of log-likelihood ratio
• 4.3 seconds (Core 2 Duo 2.4 GHz), after convolution
Experiment 2: learning without alignment
Active basis pursuit + EM
The bounding box of the first example is given for initialization.
Iterate:
- Estimate the bounding boxes using current model.
- Re-learn the model from estimated bounding boxes.
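The alternation above can be illustrated on toy 1D signals. To keep the sketch short, the model here is a plain average of the currently boxed windows, which is a deliberately swapped-in stand-in for active basis pursuit, and the box estimate is the offset with the highest inner product against that average. The function `learn_without_alignment` and the toy signals are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def learn_without_alignment(signals, first_offset, width, n_iter=3):
    """Alternate between locating the object in each signal and re-learning the model."""
    # Initialization: the box of the first example is given
    template = signals[0][first_offset:first_offset + width]
    offsets = []
    for _ in range(n_iter):
        # Estimate the boxes using the current model
        offsets = [
            max(range(len(s) - width + 1),
                key=lambda o: dot(s[o:o + width], template))
            for s in signals
        ]
        # Re-learn the model from the estimated boxes (average of the windows)
        windows = [s[o:o + width] for s, o in zip(signals, offsets)]
        template = [sum(col) / len(windows) for col in zip(*windows)]
    return template, offsets

# The same pattern [1, 2, 1] placed at different offsets in each signal
signals = [
    [0, 0, 1, 2, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 2, 1, 0],
    [1, 2, 1, 0, 0, 0, 0, 0],
]
template, offsets = learn_without_alignment(signals, first_offset=2, width=3)
```

On this toy data the offsets settle after the first pass, since the initial template already matches the pattern exactly; with real images several iterations would normally be needed.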
Experiment 3: learning and clustering
Learning active basis
EM clustering
Experiment 4: car detection with active basis model
• Scan the bounding box over the image at multiple resolutions.
• Compute the log-likelihood ratio (LLR) by combining the responses from the active basis.
[Figure: maps of LLR at different scales; maximum LLR over scale; map of LLR at the optimal scale.]
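A minimal sketch of the scoring and scanning described on this slide, assuming the exponential model from the pursuit slides, so that each selected element contributes $\varphi(r_i)/\lambda_i - \log Z(\lambda_i)$ to the template's LLR. The LLR maps are supplied as plain nested lists keyed by scale; `template_llr`, `best_detection`, and all numbers are hypothetical.

```python
def template_llr(phi_values, lambdas, log_zs):
    """Sum the per-element log-likelihood ratios phi_i / lambda_i - log Z(lambda_i)."""
    return sum(phi / lam - log_z
               for phi, lam, log_z in zip(phi_values, lambdas, log_zs))

def best_detection(llr_maps):
    """Return (llr, scale, row, col) of the maximum LLR over all scales and positions."""
    best = None
    for scale, grid in llr_maps.items():
        for row, values in enumerate(grid):
            for col, llr in enumerate(values):
                if best is None or llr > best[0]:
                    best = (llr, scale, row, col)
    return best

# Score of one window with two elements (toy numbers)
window_llr = template_llr([2.0, 4.0], [1.0, 2.0], [0.5, 0.5])

# Toy LLR maps at two scales; each entry is the LLR of the window at that position
llr_maps = {
    0.5: [[0.1, 2.0], [0.3, -1.0]],
    1.0: [[0.0, 3.5], [1.0, 0.2]],
}
detection = best_detection(llr_maps)
```

The detection is reported at the scale whose map contains the overall maximum, which corresponds to the "map of LLR at optimal scale" mentioned on the slide.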
Experiment 5: head-and-shoulder recognition
Features: using the same set of Gabor filters.
Some positives
Some negatives
Human head and shoulders, roughly aligned
Negatives include various indoor and outdoor scenes, with and without humans
43 training positives, 157 training negatives
88 testing positives, 474 testing negatives
Experiment 5: head-and-shoulder recognition
Comparison with Adaboost
The ROC of the sigmoid model further improves on the result presented in the paper.
Main contributions
1. An active basis model as deformable template.
2. A shared pursuit algorithm for fast learning.
http://www.stat.ucla.edu/~ywu/ActiveBasis.html
Download
1) Training and testing images
2) Matlab and mex-C source code that reproduces all the experiments in the paper and PowerPoint.