IJCAI 2007
Wei Liu, Xiaoou Tang, and Jianzhuang Liu
Dept. of Information Engineering
The Chinese University of Hong Kong
Outline
• What is sketch-based facial photo hallucination
• Related Works
• Our Approach
• Tensor Model
• TensorPatches
• Bayesian Tensor Inference
• Experimental Results
Definition
Sketch-based facial photo hallucination: hallucinate (imagine) photorealistic faces from sketches, i.e., the backward transform from sketches to photos.
Bidirectional transforms on photo-sketch pairs.
(a) Forward transform: synthesizing a sketch
image from a photo image; (b) backward transform:
hallucinating a photorealistic image from a sketch
image.
Related Works
• Face Sketching
  o H. Chen et al. “Example-based facial sketch generation with non-parametric sampling”, in Proc. of ICCV, 2001.
  o X. Tang and X. Wang. “Face sketch recognition”, IEEE Trans. on CSVT, 14(1):50-57, 2004.
  o Q. Liu et al. “A nonlinear approach for face sketch synthesis and recognition”, in Proc. of CVPR, 2005.
• Ideas
  o Pixel-wise non-parametric sampling
  o Global linear model: PCA
  o Local linear model: LLE
Motivations
• Consider the complexity of image spaces and the conspicuous distinction between photos and sketches.
• Extract the local relations by explicitly establishing the connection between two feature spaces (sketch and photo) formed by a patch-based tensor model.
• Formulate a Bayesian approach accounting for the statistical inference from sketches to their corresponding photos in terms of the learned tensor model.
Tensor Model
• We make use of a novel tensor model to exclusively account for the representation of images with two styles: photo-style and sketch-style.
• As small image patches can account for high-level statistics of images, we take patches as constitutive elements of the tensor model.
• Based on a patch corpus with photo- and sketch-styles, we arrange patches into a high-order tensor which will disclose the latent connection between the two styles.
Patch-based Tensor Models
• Generic Images
  o Model a 3rd order tensor resulting from the confluence of 3 modes: patch examples, patch styles, and patch features.
  o Tensor transfer can learn the hidden relations between photo patch space and sketch patch space.
• Face Images
  o Model a 4th order tensor resulting from the confluence of 4 modes: people, patches, styles, and features.
  o Utilize PCA prior of face images.
  o Bayesian Tensor Inference will incorporate the learned relations that the tensor model entails into a Bayesian framework.
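To make the 4th order arrangement concrete, here is a minimal sketch of how face patches might be stacked into a tensor with the modes people, patch positions, styles, and features. It is an illustration under stated assumptions, not the authors' implementation: the helper names `extract_patches` and `build_patch_tensor` are hypothetical, patches are non-overlapping, raw pixels stand in for the features, and the images are random placeholders.

```python
import numpy as np

def extract_patches(image, patch_size):
    """Split a 2-D image into non-overlapping square patches, one flattened patch per row."""
    h, w = image.shape
    rows, cols = h // patch_size, w // patch_size
    # Crop so that the image divides evenly into patches.
    cropped = image[:rows * patch_size, :cols * patch_size]
    blocks = cropped.reshape(rows, patch_size, cols, patch_size).swapaxes(1, 2)
    return blocks.reshape(rows * cols, patch_size * patch_size)

def build_patch_tensor(photos, sketches, patch_size):
    """Stack patches into a tensor indexed by (people, positions, styles, features)."""
    per_person = []
    for photo, sketch in zip(photos, sketches):
        photo_patches = extract_patches(photo, patch_size)
        sketch_patches = extract_patches(sketch, patch_size)
        # Axis 1 of the stacked array is the style mode: 0 = photo, 1 = sketch.
        per_person.append(np.stack([photo_patches, sketch_patches], axis=1))
    return np.stack(per_person, axis=0)

# Placeholder data (assumption): 5 people, 16x16 images, 4x4 patches.
rng = np.random.default_rng(0)
photos = rng.random((5, 16, 16))
sketches = rng.random((5, 16, 16))
D = build_patch_tensor(photos, sketches, patch_size=4)
print(D.shape)  # (5, 16, 2, 16)
```

Each slice `D[p, q, s]` holds the feature vector of person `p` at patch position `q` in style `s`, which is the layout a 4-mode decomposition would operate on.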
Multilinear Analysis
• Why use Multilinear Analysis?
  o It unifies multiple factors in a framework.
  o It explicitly models the interaction between these factors.
• Theoretical Foundation – Tensor Algebra
  o Tensor – Multidimensional Array: $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_n}$
  o Tensor Product: $(\mathcal{A} \times_k U)_{i_1 i_2 \cdots i_{k-1} j_k i_{k+1} \cdots i_n} = \sum_{i_k=1}^{I_k} (\mathcal{A})_{i_1 i_2 \cdots i_{k-1} i_k i_{k+1} \cdots i_n} (U)_{j_k i_k}$
  o High Order Singular Value Decomposition (HOSVD): $\mathcal{A} = \mathcal{C} \times_1 U_1 \times_2 U_2 \times_3 \cdots \times_n U_n$
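The mode-k product and HOSVD defined above can be sketched in a few lines of NumPy. This is a generic textbook-style sketch, not code from the talk: the function names `unfold`, `mode_product`, and `hosvd` are chosen here for illustration, and the input tensor is random.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-k unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Mode-k product A x_k U: sum over index i_k of A[..., i_k, ...] * U[j_k, i_k]."""
    # tensordot appends the new axis last, so move it back to position `mode`.
    result = np.tensordot(tensor, matrix, axes=(mode, 1))
    return np.moveaxis(result, -1, mode)

def hosvd(tensor):
    """Return core tensor C and mode matrices U_k with A = C x_1 U_1 ... x_n U_n."""
    mode_matrices = []
    for mode in range(tensor.ndim):
        # Left singular vectors of each unfolding give the mode matrix.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        mode_matrices.append(u)
    core = tensor
    for mode, u in enumerate(mode_matrices):
        # Project every mode onto its basis to obtain the core tensor.
        core = mode_product(core, u.T, mode)
    return core, mode_matrices

rng = np.random.default_rng(0)
A = rng.random((4, 3, 5))
C, Us = hosvd(A)

# Rebuild A from the core tensor and the mode matrices.
rebuilt = C
for mode, u in enumerate(Us):
    rebuilt = mode_product(rebuilt, u, mode)
print(np.allclose(rebuilt, A))
```

Using the unfolding plus an ordinary SVD per mode keeps the sketch short; for large patch ensembles a truncated SVD per mode would be the natural variation.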
How Multilinear Analysis Works
• Ensemble Representation
  $\mathcal{D} = \mathcal{C} \times_1 U_1 \times_2 U_2 \times_3 \cdots \times_{n-1} U_{n-1} \times_n U_n$
  o Ensemble Tensor ($\mathcal{D}$): arrange samples based on factors
  o Core Tensor ($\mathcal{C}$): control the interaction between factors
  o Mode matrices ($U_1, \ldots, U_{n-1}$): capture the variations of each factor
  o Base Matrix ($U_n$): span the space of the self variations
How Multilinear Analysis Works
• Individual Sample Representation
  $x = \mathcal{C} \times_1 u_1^T \times_2 u_2^T \times_3 \cdots \times_{n-1} u_{n-1}^T \times_n U_n$
  o Core Tensor ($\mathcal{C}$): coordinate the interaction between factors
  o $u_1^T, \ldots, u_{n-1}^T$: the vector representation of each factor
  o Obtain the coefficients of the basis $U_n$
TensorPatches
• Formulation of Multi-Style Patch Ensembles
  $\mathcal{D} = \mathcal{C} \times_1 U_{\mathrm{people}} \times_2 U_{\mathrm{positions}} \times_3 U_{\mathrm{styles}} \times_4 U_{\mathrm{features}}$
  o $\mathcal{D}$: patch ensemble
  o $\mathcal{C}$: core tensor coordinating the interaction between factors
  o $U_{\mathrm{people}}$: encoding people-people related information
  o $U_{\mathrm{positions}}$: encoding position-position related information
  o $U_{\mathrm{styles}}$: encoding style-style related information
  o $U_{\mathrm{features}}$: the basis spanning the variation subspace of patches
• The tensor decomposition can be done by HOSVD
Bidirectional Mapping/Inferring
Forward transform: mapping the “Photo Patch Space” to the “Common Variation Space”, from which the “Sketch Patch Space” is inferred.
$y \approx A_y B_x x$
Backward transform: mapping the “Sketch Patch Space” to the “Common Variation Space”, from which the “Photo Patch Space” is inferred.
$x \approx A_x B_y y$
Illustration of relations among common variation space, photo patch space and sketch patch space.
Hidden Relations
• From common variation space to photo/sketch image spaces:
  $I_x = A_x w, \quad I_y = A_y w.$
• From photo/sketch image spaces to common variation space:
  $w = (A_x^T A_x)^{-1} A_x^T I_x = B_x I_x, \quad w = (A_y^T A_y)^{-1} A_y^T I_y = B_y I_y.$
• The people parameter vector $w$ remains to be solved for new face images.
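The hidden relations above amount to a pair of linear maps and their left pseudoinverses. The following sketch illustrates the backward transform from a sketch feature vector to a photo feature vector through the shared vector w. The matrices here are random stand-ins (an assumption); in the talk they come from the learned tensor model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_people = 64, 10   # hypothetical sizes

# Random stand-ins for the learned maps from the common variation space.
A_x = rng.standard_normal((n_features, n_people))  # photo style
A_y = rng.standard_normal((n_features, n_people))  # sketch style

# B = (A^T A)^{-1} A^T is the left pseudoinverse when A has full column rank.
B_x = np.linalg.pinv(A_x)
B_y = np.linalg.pinv(A_y)

# A shared people parameter vector w generates a consistent photo/sketch pair.
w_true = rng.standard_normal(n_people)
I_x = A_x @ w_true
I_y = A_y @ w_true

# Backward transform: sketch -> common variation space -> photo.
w_est = B_y @ I_y
I_x_est = A_x @ w_est
print(np.allclose(w_est, w_true), np.allclose(I_x_est, I_x))
```

In this noise-free toy setting the recovery is exact; with real features the relation is only approximate, which is what motivates treating w probabilistically.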
Bayesian Tensor Inference
• We fulfill the backward transform $I_y \to I_x$ through $w$ by taking these quantities as a whole into a global optimization formulation.
• Our inference approach is still deduced from canonical Bayesian statistics, exploiting PCA to represent the photo feature vector $I_x$ (using the latent variable $a$) to be hallucinated.
• The advantage of our approach is to take into account the statistics among $a$, $w$, and $I_y$.
Bayesian Tensor Inference
• Perform PCA on the training photo vectors $\{I_x\}$:
  $I_x \approx U a + \mu, \quad p(a) \propto \exp\{-a^T \Lambda^{-1} a\}.$
• Using the learned relations $I_y = A_y w$, $w = B_x I_x$, we have
  $p(I_y \mid w) \propto \exp\left\{-\frac{\|I_y - A_y w\|^2}{\lambda_1}\right\}, \quad p(w \mid a) \propto \exp\left\{-\frac{\|w - B_x (U a + \mu)\|^2}{\lambda_2}\right\}.$
• We find the MAP solution $a^*$ for hallucinating the optimal $I_x^*$ as follows:
  $a^* = \arg\max_{w,a} p(w, a \mid I_y) = \arg\max_{w,a} p(I_y \mid w, a)\, p(w, a) = \arg\max_{w,a} p(I_y \mid w)\, p(w \mid a)\, p(a).$
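The slides do not say how the MAP problem is solved. One possible reading: since all three factors are Gaussian, the negative log-posterior is a sum of three squared terms in (w, a), so it can be minimized as a single linear least-squares problem. The sketch below follows that reading. The function names `map_inference` and `energy`, the dimensions, and the random stand-in data are all assumptions made for illustration.

```python
import numpy as np

def map_inference(I_y, A_y, B_x, U, mu, eigvals, lam1, lam2):
    """Jointly minimize the negative log-posterior over (w, a) by linear least squares."""
    m = A_y.shape[1]   # dimension of w
    k = U.shape[1]     # dimension of a
    s1, s2 = np.sqrt(lam1), np.sqrt(lam2)
    # Each block row is one quadratic term of the energy written as a residual.
    likelihood = np.hstack([A_y / s1, np.zeros((A_y.shape[0], k))])
    coupling = np.hstack([np.eye(m) / s2, -(B_x @ U) / s2])
    prior = np.hstack([np.zeros((k, m)), np.diag(1.0 / np.sqrt(eigvals))])
    M = np.vstack([likelihood, coupling, prior])
    rhs = np.concatenate([I_y / s1, (B_x @ mu) / s2, np.zeros(k)])
    z, *_ = np.linalg.lstsq(M, rhs, rcond=None)
    w, a = z[:m], z[m:]
    return U @ a + mu, w, a   # hallucinated I_x*, plus w* and a*

def energy(w, a, I_y, A_y, B_x, U, mu, eigvals, lam1, lam2):
    """Negative log-posterior up to an additive constant."""
    data = np.sum((I_y - A_y @ w) ** 2) / lam1
    link = np.sum((w - B_x @ (U @ a + mu)) ** 2) / lam2
    return data + link + np.sum(a ** 2 / eigvals)

# Random stand-ins (assumption) for the learned quantities.
rng = np.random.default_rng(0)
d, m, k = 40, 8, 5
A_x = rng.standard_normal((d, m))
A_y = rng.standard_normal((d, m))
B_x = np.linalg.pinv(A_x)
U, _ = np.linalg.qr(rng.standard_normal((d, k)))  # orthonormal stand-in for the PCA basis
mu = rng.standard_normal(d)
eigvals = np.linspace(5.0, 1.0, k)                # stand-in diagonal of Lambda
I_y = rng.standard_normal(d)                      # stand-in sketch feature vector

I_x_star, w_star, a_star = map_inference(I_y, A_y, B_x, U, mu, eigvals, 1.0, 1.0)
best = energy(w_star, a_star, I_y, A_y, B_x, U, mu, eigvals, 1.0, 1.0)
print(I_x_star.shape, best)
```

Stacking the three residuals keeps the solver to one `lstsq` call; the weights λ1 and λ2 enter only as row scalings, so they are easy to tune.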
Architecture of our sketch-based facial photo hallucination approach.
(1) Learn the TensorPatches model taking image derivatives as features in the training phase.
(2) Obtain the initial result applying the local geometry preserving method.
(3) Infer image derivatives of the target photo using the Bayesian Tensor Inference method, given input image derivatives extracted from the test sketch face.
(4) Conduct gradient correction to hallucinate the final result.
Photo hallucination results for Asian and European faces. (a) Input sketch images, (b) eigentransform method, (c) local geometry preserving method, (d) our method, (e) groundtruth face photos.
Thanks!
Jan 2007
If you have any questions about this paper, feel free to contact me at wliu5@ie.cuhk.edu.hk.
Visit http://mmlab.ie.cuhk.edu.hk/~face/ for more information about my other work.