Markov Random Fields (MRF)
• A graphical model for describing spatial consistency in images.
• Suppose you want to label image pixels with some labels $\{l_1, \dots, l_k\}$, e.g., segmentation, stereo disparity, foreground-background, etc.
Ref:
1. S. Z. Li. Markov Random Field Modeling in Image Analysis. Springer-Verlag, 1991.
2. S. Geman and D. Geman. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. PAMI, 6(6):721–741, 1984.
From Slides by S. Seitz - University of Washington
Definition
MRF Components:
• A set of sites: $P = \{1, \dots, m\}$; each pixel is a site.
• A neighborhood for each pixel: $N = \{N_p \mid p \in P\}$.
• A set of random variables (a random field), one for each site: $F = \{F_p \mid p \in P\}$, denoting the label at each pixel.
• Each random variable takes a value $f_p$ from the set of labels $L = \{l_1, \dots, l_k\}$.
• We have a joint event $\{F_1 = f_1, \dots, F_m = f_m\}$, or a configuration, abbreviated as $F = f$.
• The joint probability of such a configuration: $\Pr(F = f)$, or $\Pr(f)$.
Definition
MRF Components:
• Positivity: $\Pr(f) > 0$ for all configurations $f$.
• Markov property: each random variable depends on the other RVs only through its neighbors: $\Pr(f_p \mid f_{S \setminus \{p\}}) = \Pr(f_p \mid f_{N_p}), \ \forall p$.
• So we need to define a neighborhood system: $N_p$ (the neighbors of site $p$).
  – There are no strict rules for neighborhood definition; a common choice is sketched below.
[Figure: the cliques for this neighborhood system.]
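For illustration (my own sketch, not from the slides), here is the common 4-connected neighborhood on an H x W grid; its cliques are the single sites and the horizontally/vertically adjacent pairs:

def neighbors_4(p, H, W):
    """4-connected neighborhood N_p of site p = (row, col) on an H x W grid."""
    r, c = p
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < H and 0 <= j < W]

# Cliques for this neighborhood: singletons {p} and neighboring pairs {p, q}.
H, W = 3, 3
sites = [(r, c) for r in range(H) for c in range(W)]
pair_cliques = [(p, q) for p in sites for q in neighbors_4(p, H, W) if p < q]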
Definition
MRF Components:
• The joint probability of such a configuration: $\Pr(F = f)$, or $\Pr(f)$.
• Markov property: each random variable depends on the other RVs only through its neighbors: $\Pr(f_p \mid f_{S \setminus \{p\}}) = \Pr(f_p \mid f_{N_p}), \ \forall p$.
• So we need to define a neighborhood system: $N_p$ (the neighbors of site $p$).
• Hammersley-Clifford theorem:
  $\Pr(f) \propto \exp\big(-\textstyle\sum_C V_C(f)\big)$
  where the sum is over all cliques in the neighborhood system and $V_C$ is the clique potential.
• We may decide
  1. NOT to include all cliques in a neighborhood; or
  2. to use different $V_C$ for different cliques in the same neighborhood.
Optimal Configuration
MRF Components:
• Hammersley-Clifford theorem: $\Pr(f) \propto \exp\big(-\sum_C V_C(f)\big)$, where the sum runs over all cliques in the neighborhood system and $V_C$ is the clique potential: the prior probability that the elements of clique $C$ take certain values.
• Consider MRFs with arbitrary cliques among neighboring pixels:
  $\Pr(f) \propto \exp\Big(-\sum_{c \in C} V_c(f_{p_1}, f_{p_2}, \dots)\Big), \qquad p_1, p_2, \dots \in c$
• Typical potential, the Potts model (see the sketch below):
  $V_{(p,q)}(f_p, f_q) = u_{\{p,q\}}\,\big(1 - \delta(f_p - f_q)\big)$
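As a small illustration (mine, not the slides'; the interaction weight u is assumed constant over all pairs), the Potts potential in Python:

def potts(fp, fq, u=1.0):
    """Potts pair potential u * (1 - delta(fp - fq)): zero cost if the two
    labels agree, constant cost u if they differ."""
    return u * (fp != fq)

# Neighboring pixels pay nothing for agreeing and u for disagreeing:
assert potts(2, 2) == 0.0 and potts(2, 3) == 1.0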
Optimal Configuration
MRF Components:
• Hammersley-Clifford theorem: $\Pr(f) \propto \exp\big(-\sum_C V_C(f)\big)$
• Consider MRFs with clique potentials over pairs of neighboring pixels:
  $\Pr(f) \propto \exp\Big(-\sum_p V_p(f_p) - \sum_p \sum_{q \in N(p)} V_{(p,q)}(f_p, f_q)\Big)$
  This pairwise form is the most commonly used and is very popular in vision.
• Energy function (see the sketch after this slide):
  $E(f) = \sum_p V_p(f_p) + \sum_p \sum_{q \in N_p} V_{(p,q)}(f_p, f_q)$
• There are two constraints to satisfy:
  1. Data constraint: the labeling should reflect the observation.
  2. Smoothness constraint: the labeling should reflect spatial consistency (pixels close to each other are most likely to have similar labels).
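A minimal sketch of evaluating this energy on a grid (my own; it assumes a 4-connected neighborhood, a per-site table of unary costs V_p, and the Potts pair potential from above with weight u):

import numpy as np

def energy(f, unary, u=1.0):
    """E(f) = sum_p V_p(f_p) + sum_p sum_{q in N_p} V_{p,q}(f_p, f_q).

    f     : (H, W) integer label image
    unary : (H, W, K) costs, unary[r, c, l] = V_p(l)
    Each 4-neighbor pair {p, q} is counted once.
    """
    H, W = f.shape
    rows, cols = np.arange(H)[:, None], np.arange(W)[None, :]
    data = unary[rows, cols, f].sum()
    smooth = u * ((f[:, 1:] != f[:, :-1]).sum() + (f[1:, :] != f[:-1, :]).sum())
    return data + smooth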
Probabilistic interpretation
• The problem: we do not observe the labels; we observe something else that depends on these labels with some noise (e.g., intensity or disparity).
• At each site we have an observation $i_p$.
• The observed value at each site depends on its label: the probability of a certain observed value given a certain label at site $p$ is $g(i_p, f_p) = \Pr(i_p \mid F_p = f_p)$.
• The overall observation probability given the labels:
  $\Pr(O \mid f) = \prod_p g(i_p, f_p)$
• We need to infer the labels given the observation: $\Pr(f \mid O) \propto \Pr(O \mid f)\,\Pr(f)$
Using MRFs
• How do we model different problems?
• Given observations $y$ and the parameters of the MRF, how do we infer the hidden variables $x$?
• How do we learn the parameters of the MRF?
Modeling image pixel labels as MRF
MRF-based segmentation
[Figure: a grid MRF for segmentation; each pixel $y_i$ of the real image is tied to its hidden label $x_i$ in the label image through $\psi(x_i, y_i)$, and neighboring labels are tied through $\psi(x_i, x_j)$.]
Slides by R. Huang – Rutgers University
MRF-based segmentation
• Classifying image pixels into different regions under the constraint of both local observations and spatial relationships.
• Probabilistic interpretation:
  $(x^*, \Theta^*) = \arg\max_{(x, \Theta)} P(x, \Theta \mid y)$
  where $x$ are the region labels, $\Theta$ the model parameters, and $y$ the image pixels.
Model joint probability
$(x^*, \Theta^*) = \arg\max_{(x, \Theta)} P(x, \Theta \mid y)$
where $x$ are the region labels, $\Theta$ the model parameters, and $y$ the image pixels. How did we factorize?
$P(x, y) = \frac{1}{Z} \prod_{(i,j)} \psi(x_i, x_j) \prod_i \psi(x_i, y_i)$
Here $\psi(x_i, x_j)$ is the label-label compatibility function between neighboring label nodes, enforcing the smoothness constraint, and $\psi(x_i, y_i)$ is the image-label compatibility function between a label node and its local observation, enforcing the data constraint.
Probabilistic interpretation
• We need to infer the labels given the observation: $\Pr(f \mid O) \propto \Pr(O \mid f)\,\Pr(f)$
• The MAP estimate of $f$ should minimize the posterior energy (a sketch of the data term follows):
  $E(f) = \sum_p \sum_{q \in N_p} V_{(p,q)}(f_p, f_q) - \sum_p \ln g(i_p, f_p)$
  Data (observation) term: the data constraint. Neighborhood term: the smoothness constraint.
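For instance, with the Gaussian observation model used in the example below, the data term $-\ln g(i_p, f_p)$ can be tabulated per label (a sketch under that assumption):

import numpy as np

def unary_from_gaussians(img, mus, sigmas):
    """unary[r, c, l] = -ln g(i_p, l) with g = N(i_p; mu_l, sigma_l^2)."""
    i = img[..., None]                      # (H, W, 1)
    mu = np.asarray(mus)[None, None, :]     # (1, 1, K)
    s = np.asarray(sigmas)[None, None, :]   # (1, 1, K)
    return 0.5 * ((i - mu) / s) ** 2 + np.log(s * np.sqrt(2.0 * np.pi))

# Plugging this table into the energy() sketch above gives the posterior
# energy, so minimizing it over the labelings f is the MAP estimate.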
Applying and learning MRF
MRF-based segmentation
EM algorithm:
• E-step (inference):
  $P(x \mid y, \Theta) = \frac{1}{Z} P(y \mid x, \Theta)\, P(x \mid \Theta), \qquad x^* = \arg\max_x P(x \mid y, \Theta)$
  Methods to be described below.
• M-step (learning):
  $\Theta^* = \arg\max_\Theta \mathbb{E}\big[\log P(x, y \mid \Theta)\big] = \arg\max_\Theta \sum_x P(x \mid y, \Theta) \log P(x, y \mid \Theta)$
  e.g., by the pseudo-likelihood method.
Slides by R. Huang – Rutgers University
Applying and learning MRF: Example
$x^* = \arg\max_x P(x \mid y)$
Since $P(x \mid y) = P(x, y)/P(y) \propto \frac{1}{Z_1} P(x, y)$, this is
$x^* = \arg\max_x P(x, y)$
with
$P(x, y) = \frac{1}{Z_2} \prod_i \psi(x_i, y_i) \prod_{(i,j)} \psi(x_i, x_j)$
$\psi(x_i, y_i) = G(y_i; \mu_{x_i}, \sigma_{x_i}^2), \qquad \psi(x_i, x_j) = \exp\big(-(x_i - x_j)^2 / \sigma^2\big)$
$\Theta = [\mu_{x_i}, \sigma_{x_i}^2, \sigma^2]$
These two compatibility functions are sketched below.
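The two compatibility functions of this example, written out as a sketch (mine; sigma here is the smoothness scale from Theta):

import numpy as np

def psi_data(x_i, y_i, mus, sigmas):
    """psi(x_i, y_i) = G(y_i; mu_{x_i}, sigma_{x_i}^2), a Gaussian likelihood."""
    mu, s = mus[x_i], sigmas[x_i]
    return np.exp(-0.5 * ((y_i - mu) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

def psi_smooth(x_i, x_j, sigma):
    """psi(x_i, x_j) = exp(-(x_i - x_j)^2 / sigma^2), favoring equal labels."""
    return np.exp(-((x_i - x_j) ** 2) / sigma ** 2)

# P(x, y) is proportional to the product of psi_data over all sites and
# psi_smooth over all neighboring pairs; Z_2 cancels in the arg max over x.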
Inference in MRFs
• Inference in MRFs
  – Classical:
    • Gibbs sampling, simulated annealing (self-study)
    • Iterated conditional modes (ICM) (also self-study)
  – State of the art:
    • Graph cuts
    • Belief propagation
    • Linear programming (not covered in this lecture)
    • Tree-reweighted message passing (not covered in this lecture)
Gibbs sampling and simulated annealing
• Gibbs sampling:
– A way to generate random samples from a (potentially very
complicated) probability distribution
• Simulated annealing:
– A schedule for modifying the probability distribution so that, at “zero
temperature”, you draw samples only from the MAP solution.
Simulated annealing algorithm (the slide's pseudocode, as a runnable Python sketch):

import math, random

def simulated_annealing(x0, E, neighbour, k_max, e_max):
    x, e = x0, E(x0)                      # initial state and energy
    for k in range(k_max):                # while time remains...
        if e <= e_max:                    # ...and not yet good enough
            return x
        t = max(1e-9, 1.0 - k / k_max)    # temperature schedule temp(k/k_max)
        xn = neighbour(x)                 # pick some neighbour
        en = E(xn)                        # compute its energy
        # Acceptance P(e, en, t): downhill always, uphill with prob exp(-(en-e)/t).
        if en < e or random.random() < math.exp(-(en - e) / t):
            x, e = xn, en                 # yes, change state
    return x                              # return current solution
Gibbs sampling and simulated annealing cont.
• Simulated annealing gradually lowers the "temperature" of the probability distribution, ultimately giving zero probability to all but the MAP estimate.
+ Finds the global MAP solution.
– Takes forever (Gibbs sampling is in the inner loop…).
Iterated conditional modes
• Start with an estimate of the labeling $x$.
• For each node $x_i$:
  – Condition on all the neighbors.
  – Find the label decreasing the energy function the most.
  – Repeat till convergence.
+ Fast.
– Depends heavily on initialization; converges to a local minimum. (See the sketch below.)
Described in: Winkler, 1995. Introduced by Besag in 1986.
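A minimal ICM sketch (mine; it assumes the 4-connected grid, the unary cost table, and the Potts weight u used in the earlier sketches):

import numpy as np

def icm(f, unary, u=1.0, max_sweeps=20):
    """Iterated conditional modes: greedily relabel each site given its
    neighbors until no single-site change lowers the energy."""
    f = f.copy()
    H, W, K = unary.shape
    for _ in range(max_sweeps):
        changed = False
        for r in range(H):
            for c in range(W):
                nbrs = [f[i, j] for i, j in ((r - 1, c), (r + 1, c),
                                             (r, c - 1), (r, c + 1))
                        if 0 <= i < H and 0 <= j < W]
                # Local cost of each candidate label at site (r, c).
                cost = unary[r, c] + u * np.array(
                    [sum(l != n for n in nbrs) for l in range(K)])
                best = int(np.argmin(cost))
                if best != f[r, c]:
                    f[r, c], changed = best, True
        if not changed:
            break
    return f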
Solving Energy Minimization with Graph Cuts
• Many classes of energy-minimization problems in computer vision can be reduced to graph cuts.
• Multiple-label problems are solved through a series of binary decisions.
Yevgeny Doctor IP Seminar 2008, IDC
Approximate Energy Minimization
• "Fast Approximate Energy Minimization via Graph Cuts." Yuri Boykov, Olga Veksler, Ramin Zabih, 1999.
• For two classes of interaction potentials $V$ ($E_{smooth}$):
  – $V$ is a semi-metric on a label space $L$ if for every $\alpha, \beta \in L$:
    • $V(\alpha, \beta) = V(\beta, \alpha) \ge 0$
    • $V(\alpha, \beta) = 0 \iff \alpha = \beta$
  – $V$ is a metric on $L$ if, in addition, the triangle inequality holds:
    • $V(\alpha, \beta) \le V(\alpha, \gamma) + V(\gamma, \beta), \quad \forall \alpha, \beta, \gamma \in L$
• For example, the truncated $L_2$ distance and the Potts interaction penalty are both metrics.
Solution for Semi-metric Class
• Swap-move algorithm:
  1. Start with an arbitrary labeling $f$.
  2. Set success := 0.
  3. For each pair of labels $\{\alpha, \beta\} \subset L$:
     3.1. Find $f^* = \arg\min E(f')$ among $f'$ within one $\alpha$-$\beta$ swap of $f$.
     3.2. If $E(f^*) < E(f)$, set $f := f^*$ and success := 1.
  4. If success = 1, go to step 2.
  5. Return $f$.
• $\alpha$-$\beta$ swap: in the new labeling $f'$, some pixels that were labeled $\alpha$ in $f$ are now labeled $\beta$, and vice versa. (A sketch of the outer loop follows.)
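A sketch of that outer loop (mine). The inner step, finding the best labeling within one alpha-beta swap, is solved with the graph cut described on the next slides; best_swap_by_graph_cut is a hypothetical stand-in for it here:

from itertools import combinations

def swap_move(f, labels, energy, best_swap_by_graph_cut):
    """Alpha-beta swap: sweep all label pairs until no swap lowers E(f)."""
    success = True
    while success:                                        # steps 2 and 4
        success = False
        for alpha, beta in combinations(labels, 2):       # step 3
            f_star = best_swap_by_graph_cut(f, alpha, beta)  # step 3.1 (min-cut)
            if energy(f_star) < energy(f):                # step 3.2
                f, success = f_star, True
    return f                                              # step 5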
Solve the $\alpha$-$\beta$ swap step with a graph cut
• Graph construction, the induced cut and labeling, and the edge weights: [figures from "Fast Approximate Energy Minimization via Graph Cuts," Yuri Boykov, Olga Veksler, Ramin Zabih, 1999.]
Computing a multiway cut
• With two labels: the classical min-cut problem.
  – Solvable by standard network-flow algorithms:
    • polynomial time in theory, nearly linear in practice.
• More than 2 labels: NP-hard.
  – But efficient approximation algorithms exist:
    • within a factor of 2 of optimal;
    • they compute a local minimum in a strong sense:
      – even very large moves will not improve the energy.
• Yuri Boykov, Olga Veksler, and Ramin Zabih, "Fast Approximate Energy Minimization via Graph Cuts," International Conference on Computer Vision, September 1999.
Basic idea
• Reduce to a series of 2-way-cut subproblems, using one of:
  – swap move: pixels with label $l_1$ can change to $l_2$, and vice versa;
  – expansion move: any pixel can change its label to $l_1$.
From Slides by S. Seitz - University of Washington
Belief propagation
• Message passing (original: Weiss & Freeman '01; faster: Felzenszwalb & Huttenlocher '04)
  – Send messages between neighbors.
  – Messages estimate the cost (or energy) of a configuration of a clique given all other cliques.
[Figure: node $p$ receives messages from neighbors $s_1, s_2, s_3$ and sends message $m_{p \to q}$ to node $q$.]
$m^t_{p \to q}(f_q) = \min_{f_p}\Big(D(f_p) + V(f_p, f_q) + \sum_{s \in N(p) \setminus q} m^{t-1}_{s \to p}(f_p)\Big)$
Messages are initialized to zero. (A sketch of one update follows.)
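As a sketch (mine), one such min-sum message update, vectorized over the K labels with NumPy:

import numpy as np

def message_update(D_p, V, msgs_into_p):
    """m_{p->q}(f_q) = min_{f_p} [D(f_p) + V(f_p, f_q) + sum_s m_{s->p}(f_p)].

    D_p         : (K,) data costs at node p
    V           : (K, K) pair potential, indexed V[f_p, f_q]
    msgs_into_p : list of (K,) messages from N(p) excluding q
    """
    h = D_p + sum(msgs_into_p)            # total cost of each choice of f_p
    return (h[:, None] + V).min(axis=0)   # minimize over f_p for every f_q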
Belief propagation
• Gathering belief
  – After time $T$, the messages are combined to compute a belief:
[Figure: node $q$ gathers the final messages from its neighbors $p_1, \dots, p_4$.]
$b_q(f_q) = D(f_q) + \sum_{p \in N(q)} m^T_{p \to q}(f_q)$
  – The label with the largest belief wins (sketched below).
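And the decision step (again a sketch; in this min-sum/energy form the "largest belief" corresponds to the smallest cost):

def belief_decision(D_q, msgs_into_q):
    """b_q(f_q) = D(f_q) + sum_{p in N(q)} m_{p->q}(f_q); return best label."""
    b_q = D_q + sum(msgs_into_q)   # (K,) combined belief after T iterations
    return int(b_q.argmin()), b_q  # minimum cost = maximum belief here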
Inference in MRFs
• Loopy BP
  – Tractable; a good approximation in networks with loops.
  – Not guaranteed to converge; may oscillate indefinitely.
Stereo as energy minimization
• Matching cost formulated as an energy:
  – At pixel $p = (x, y)$:
  – a "data" term penalizing bad matches (truncated; sketched below):
    $D(x, y, d) = \lVert I(x, y) - J(x - d, y) \rVert$
  – a "neighborhood" term encouraging spatial smoothness: the norm of the difference between labels at neighboring pixels (also truncated):
    $V(p_1, p_2) = \lVert d_{p_1} - d_{p_2} \rVert$
  – Total energy:
    $E = \sum_p D(x, y, d_p) + \sum_{\{p_1, p_2\} \in \mathrm{Nbrs}} V(d_{p_1}, d_{p_2})$
From Slides by S. Seitz - University of Washington
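A sketch of the truncated data term for grayscale images (assumptions mine: absolute-difference norm, truncation at tau, disparities 0..d_max, and out-of-range pixels keep the truncation cost):

import numpy as np

def stereo_data_cost(I, J, d_max, tau=20.0):
    """cost[y, x, d] = min(|I(x, y) - J(x - d, y)|, tau), truncated L1."""
    H, W = I.shape
    cost = np.full((H, W, d_max + 1), tau, dtype=np.float64)
    for d in range(d_max + 1):
        diff = np.abs(I[:, d:].astype(np.float64) - J[:, :W - d])
        cost[:, d:, d] = np.minimum(diff, tau)
    return cost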
Stereo as a Graph cut
[Figure: the terminals of the graph are the possible disparity labels.]
From Slides by Yuri Boykov, Olga Veksler, Ramin Zabih “Markov Random Fields
with Efficient Approximations” – CVPR 98
Stereo as a graph problem [Boykov, 1999]
[Figure: a graph whose terminal nodes are the disparity labels $d_1, d_2, d_3$ and whose other nodes are the pixels; pixel-to-label edges carry the data weights $D(x, y, d)$, and pixel-to-pixel edges carry the smoothness weights $V(d_i, d_j)$.]
Graph definition
[Figure: disparity labels $d_1, d_2, d_3$ over the pixel grid.]
• Initial state:
  – Each pixel is connected to its immediate neighbors.
  – Each disparity label is connected to all of the pixels.
Stereo matching by graph cuts
[Figure: disparity labels $d_1, d_2, d_3$ over the pixel grid, with a cut separating them.]
• Graph cut:
  – Delete enough edges so that each pixel is (transitively) connected to exactly one label node.
  – Cost of a cut: the sum of the deleted edge weights.
  – Finding the minimum-cost cut is equivalent to finding the global minimum of the energy function.
Motion estimation as energy minimization
• Matching cost formulated as an energy:
  – At pixel $p = (x, y)$:
  – a "data" term penalizing bad matches (truncated):
    $D(p, d_p) = \lVert I(p) - J(p + d_p) \rVert$
  – a "neighborhood" term encouraging spatial smoothness: the norm of the difference between labels at neighboring pixels (also truncated):
    $V(p_1, p_2) = \lVert d_{p_1} - d_{p_2} \rVert$
  – Total energy:
    $E = \sum_p D(p, d_p) + \sum_{\{p_1, p_2\} \in \mathrm{Nbrs}} V(d_{p_1}, d_{p_2})$
Results with window search
[Figure: window-based matching (best window size) vs. ground truth.]
Better methods exist...
[Figure: state-of-the-art result vs. ground truth.]
Boykov et al., Fast Approximate Energy Minimization via Graph Cuts,
International Conference on Computer Vision, September 1999.
GrabCut
[Figure: user input and result for three segmentation tools.]
• Magic Wand (198?): regions only.
• Intelligent Scissors, Mortensen and Barrett (1995): boundary only.
• GrabCut, Rother et al. 2004: regions & boundary.
Slides © Rother et al., Microsoft Research, Cambridge
Data Term
[Figure: foreground and background pixel colors plotted in the R-G plane.]
• Foreground and background colors are each modeled by a Gaussian mixture model (typically 5-8 components).
• $D(\cdot)$ is the (negative) log-likelihood of a pixel given the mixture model $\Theta$; a sketch follows.
Slides C Rother et al., Microsoft Research, Cambridge
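A sketch of evaluating that data term for one pixel (mine; it assumes K diagonal-covariance components with weights pi, means mu, and variances var, fit separately for foreground and background):

import numpy as np

def gmm_data_term(z, pi, mu, var):
    """D(z) = -log sum_k pi_k N(z; mu_k, diag(var_k)) for an RGB pixel z.

    pi : (K,) mixture weights, mu : (K, 3) means, var : (K, 3) variances.
    """
    d = z[None, :] - mu                                  # (K, 3)
    log_comp = (np.log(pi)
                - 0.5 * np.sum(np.log(2.0 * np.pi * var), axis=1)
                - 0.5 * np.sum(d * d / var, axis=1))     # (K,) log pi_k N_k
    m = log_comp.max()                                   # log-sum-exp trick
    return float(-(m + np.log(np.exp(log_comp - m).sum())))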
Smoothness term
• An object is a coherent set of pixels. [Smoothness-term equation shown on the original slide.]
• Probability of a configuration: [equation shown on the original slide.]
Iterate until convergence:
1. Compute a configuration given the mixture model (E-step).
2. Compute the model parameters given the configuration (M-step).
Moderately simple examples
… GrabCut completes automatically
Difficult Examples
[Figure: camouflage & low contrast; initial rectangle; initial result.]
[Figure: further difficult cases: fine structure; "no telepathy."]