
Outline

• Statistical Modeling and Conceptualization of Visual Patterns

– S. C. Zhu, “Statistical modeling and conceptualization of visual patterns,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 6, pp. 1-22, 2003.

A Common Framework of Visual Knowledge Representation

• Visual patterns in natural images

– Natural images consist of an overwhelming number of visual patterns

• Generated by very diverse stochastic processes

• Comments

– Any single image normally consists of a few recognizable/segmentable visual patterns

– Scientifically, given that visual patterns are generated by stochastic processes, should we model the underlying stochastic processes themselves, or the visual patterns presented in observations from those processes?


A Common Framework of Visual Knowledge Representation – cont.


A Common Framework of Visual Knowledge Representation – cont.

• Image analysis as an image parsing problem

– Parse generic images into their constituent patterns

(according to the underlying stochastic processes)

• Perceptual grouping when applied to point, line, and curve processes

• Image segmentation when applied to region processes

• Object recognition when applied to high level objects


A Common Framework of Visual Knowledge Representation – cont.


A Common Framework of Visual Knowledge Representation – cont.

• Required components for parsing

– Mathematical definitions and models of various visual patterns

• Definitions and models are intrinsically recursive

– Grammars (also called rules)

• Specify the relationships among various patterns

• Grammars should be stochastic in nature

– A parsing algorithm


Syntactical Pattern Recognition


A Common Framework of Visual Knowledge Representation – cont.

• Conceptualization of visual patterns

– The concept of a pattern is an abstraction of some properties decided by certain “visual purposes”

• These properties are feature statistics computed from

– Raw signals

– Some hidden descriptions inferred from raw signals

– Mathematically, each pattern is equivalent to a set of observable signals governed by a statistical model


A Common Framework of Visual Knowledge Representation – cont.

• Statistical modeling of visual patterns

– Statistical models are intrinsic representations of visual knowledge and image regularities

• Due to noise and distortion in the imaging process?

• Due to noise and distortion in the underlying generative process?

• Due to transformations in the underlying stochastic process?

– Pattern theory


A Common Framework of Visual Knowledge Representation – cont.

• Statistical modeling of visual patterns

- continued

– Mathematical spaces for patterns and models

• Depends on the forms

– Parametric

– Non-parametric

– Attributed graphs

– Different models

• Descriptive models

– Bottom-up, feature-based models

• Generative models

– Hidden variables for generating images in a top-down manner


A Common Framework of Visual Knowledge Representation – cont.

• Learning a visual vocabulary

– Hierarchy of visual descriptions for general visual patterns

– Vocabulary of visual description

• Learning from an ensemble of natural images

• Vocabulary is far from enough

– Rich structures in physics

– Large vocabulary in speech and language


A Common Framework of Visual Knowledge Representation – cont.

• Computational tractability

– Computational heuristics for effective inference of visual patterns

• Discriminative models

– A framework

• Discriminative probabilities are used as proposal probabilities that drive the Markov chain search for fast convergence and mixing

• Generative models are top-down probabilities and the hidden variables to be inferred from posterior probabilities


A Common Framework of Visual Knowledge Representation – cont.

• Discussion

– Images are generated by rendering 3D objects under some external conditions

• All the images from one object form a low dimensional manifold in a high dimensional image space

• Rendering can be modeled fairly accurately

• Describing a 3D object requires a huge amount of data

– Under this setting

• A visual pattern simply corresponds to the manifold

• Descriptive model attempts to characterize the manifold

• Generative model attempts to learn the 3D objects and the rendering


3D Model-Based Recognition


Literature Survey

• To develop a generic vision system, regularities in images must be modeled

• The study of natural image statistics

– Ecological influence on visual perception

– Natural images have high-order (i.e., non-Gaussian) structures

• The histograms of Gabor-type filter responses on natural images have high kurtosis

– Histograms of gradient filters are consistent over a range of scales
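A quick numeric check of this claim is easy to sketch. The snippet below is an illustrative sketch assuming only NumPy; the toy block image and the simple difference filter stand in for a real natural image and a Gabor filter. The kurtosis of the gradient responses comes out well above the Gaussian value of 3 whenever an image has flat regions separated by sharp edges.

```python
# Sketch: kurtosis of gradient-filter responses (NumPy only).
# A Gaussian gives kurtosis 3; images with flat regions and sharp edges
# give much larger values, i.e. heavy-tailed, non-Gaussian histograms.
import numpy as np

def gradient_kurtosis(image):
    dx = image[:, 1:] - image[:, :-1]          # simple horizontal gradient filter
    r = dx.ravel().astype(float)
    r -= r.mean()
    return np.mean(r ** 4) / np.mean(r ** 2) ** 2

if __name__ == "__main__":
    # Toy "cartoon" image: constant 16x16 blocks plus mild noise, standing in
    # for the piecewise-smooth structure of a real natural image.
    img = np.kron(np.random.randint(0, 255, (16, 16)), np.ones((16, 16)))
    img = img.astype(float) + np.random.randn(*img.shape)
    print("kurtosis of gradient responses:", gradient_kurtosis(img))
```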


Natural Image Statistics Example


Analytical Probability Models for Spectral Representation

• Transported generator model (Grenander and Srivastava, 2000), where

– the g_i's are selected randomly from some generator space G

– the weights a_i's are i.i.d. standard normal

– the scales r_i's are i.i.d. uniform on the interval [0, L]

– the locations z_i's are samples from a 2D homogeneous Poisson process with a uniform intensity λ, and

– the parameters are assumed to be independent of each other
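The model equation itself did not survive extraction. As a rough placeholder (the exact form and scaling convention are assumptions, not a quote from Grenander and Srivastava), a superposition of weighted, scaled, and translated generators of the kind described above can be written as

I(z) \;=\; \sum_i a_i \, g_i\!\left( \frac{z - z_i}{r_i} \right), \qquad z \in \mathbb{R}^2,

with the a_i, r_i, z_i, and g_i distributed as listed above.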

Analytical Probability Models

- continued

• Define

• Model u by a scaled …-density

Analytical Probability Models

- continued


Analytical Probability Models

- continued


Analytical Probability Models

- continued


Analysis of Natural Image Components

• Harmonic analysis

– Decomposing various classes of functions by different bases

– Including Fourier transform, wavelet transforms, edgelets, curvelets, and so on


Sparse Coding

From S. C. Zhu


Grouping of Natural Image Elements

• Gestalt laws

– Gestalt grouping laws

– Should be interpreted as heuristics rather than deterministic laws

– Nonaccidental property


Illusion


Illusion – cont.


Ambiguous Figure


Statistical Modeling of Natural Image Patterns

• Synthesis-by-analysis


Analog from Speech Recognition


Modeling of Natural Image Patterns

• Shape-from-X problems are fundamentally ill-posed

• Markov random field models

• Deformable templates for objects

– Inhomogeneous MRF models on graphs


Four Categories of Statistical Models

• Descriptive models

– Constructed based on statistical descriptions of the image ensembles

– Homogeneous models

• Statistics are assumed to be the same for all elements in the graph

– Inhomogeneous models

• The elements of the underlying graph are labeled and different features and statistics are used at different sites


Variants of Descriptive Models

• Causal Markov models

– By imposing a partial ordering among the vertices of the graph, the joint probability can be factorized as a product of conditional probabilities

– Belief propagation networks

• Pseudo-descriptive models


Generative Models

• Use of hidden variables that can “explain away” the strong dependency in observed images

– This requires a vocabulary

– Grammars to generate images from hidden variables

• Note that generative models can not be separated from descriptive models

– The description of hidden variables requires descriptive models


Discriminative Models

• Approximation of posterior probabilities of hidden variables based on local features

– Can be seen as importance proposal probabilities


An Example


Problem Formulation

Input: a set of images  S = { I_1, I_2, ..., I_n } ~ f(I)

Output: a probability model  p(I) → f(I)

Here, f(I) represents the ensemble of images in a given domain; we shall discuss the relationship between ensembles and probability models later.

Problem Formulation

The model p approaches the true density:

p* = arg min_{p ∈ Ω} D(f || p)

The Kullback-Leibler divergence:

D(f || p) = ∫ f(I) log ( f(I) / p(I) ) dI = E_f[ log f(I) ] - E_f[ log p(I) ]

Maximum Likelihood Estimate

p* = arg max_{p ∈ Ω} E_f[ log p(I) ] ≈ arg max_{p ∈ Ω} Σ_{i=1}^{n} log p(I_i)
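To make the jump from the KL objective to the sample average concrete, here is a minimal numeric sketch (NumPy only; the 1-D Gaussian family and the grid search are illustrative stand-ins for the model family Ω and the arg max). Since E_f[log f(I)] does not depend on p, minimizing D(f || p) over the family is the same as maximizing the average log-likelihood of the samples.

```python
# Numeric sketch (NumPy only): argmin_p D(f||p) over a toy 1-D Gaussian family
# reduces to argmax_p of the average log-likelihood, because E_f[log f] is a
# constant with respect to p. The family and grid are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(loc=2.0, scale=1.5, size=10_000)    # draws from the "true" f

def avg_log_p(mu, sigma, x):
    return np.mean(-0.5 * np.log(2 * np.pi * sigma ** 2)
                   - (x - mu) ** 2 / (2 * sigma ** 2))

# A grid over (mu, sigma) stands in for searching the family Omega.
grid = [(m, s) for m in np.linspace(0.0, 4.0, 81) for s in np.linspace(0.5, 3.0, 51)]
mu_hat, sigma_hat = max(grid, key=lambda th: avg_log_p(th[0], th[1], samples))
print(mu_hat, sigma_hat)    # close to the true (2.0, 1.5)
```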

Model Pursuit

Ω_1, Ω_2, ..., Ω_n → Ω*,    p → f

1. What is Ω – the family of models?

2. How do we augment the space Ω?

Two Choices of Models

1. The exponential family – descriptive models

--- Characterize images by features and statistics

2. The mixture family – generative models

--- Characterize images by hidden variables

Features are deterministic mathematical transforms of an image.

Hidden variables are stochastic and are inferred from an image.

Features F(I)   vs.   hidden variables H ~ p(H | I)

I: Descriptive Models

Step 1: extracting image features/statistics as transforms

φ_1(I), φ_2(I), ..., φ_k(I)

For example: histograms of Gabor filter responses.

Other features/statistics: Gabors, geometry, Gestalt laws, faces.
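A minimal sketch of Step 1, assuming only NumPy: each φ_k(I) is the normalized histogram of one filter's responses. The tiny difference and Laplacian kernels below are illustrative stand-ins for a real Gabor filter bank.

```python
# Sketch of Step 1 (NumPy only): phi_k(I) = normalized histogram of the k-th
# filter's responses over the image.
import numpy as np

def make_filters():
    dx = np.array([[-1.0, 1.0]])                                          # horizontal gradient
    dy = dx.T                                                             # vertical gradient
    lap = np.array([[0.0, 1.0, 0.0], [1.0, -4.0, 1.0], [0.0, 1.0, 0.0]])  # Laplacian
    return [dx, dy, lap]

def convolve_valid(img, k):
    # Naive 'valid' correlation; fine for the tiny kernels used here.
    kh, kw = k.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

def feature_statistics(img, bins=15):
    stats = []
    for k in make_filters():
        resp = convolve_valid(img, k).ravel()
        hist, _ = np.histogram(resp, bins=bins, density=True)
        stats.append(hist)                                                # one phi_k(I)
    return stats

if __name__ == "__main__":
    print([h.shape for h in feature_statistics(np.random.rand(32, 32))])
```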

I.I: Descriptive Models

Step 2: using features/statistics to constrain the model

Two cases:

1. On the infinite lattice Z² --- an equivalence class.

2. On any finite lattice Λ --- a conditional probability model.

[Figure: the image space on Z² vs. the image space on a finite lattice Λ]

I.I Descriptive Model on Finite Lattice

Modeling by maximum entropy:

p = arg max_p { - ∫ p(I) log p(I) dI }

subject to:

E_p[ φ_m(I) ] = E_f[ φ_m(I) ] ≈ (1/n) Σ_j φ_m(I_j)

Remark: p and f have the same projected marginal statistics.

Minimax Entropy Learning

p* = arg min_{p ∈ Ω} D(f || p) = arg min_{p ∈ Ω} E_f[ -log p ]

For a Gibbs (maximum entropy) model p, this leads to the minimax entropy principle (Zhu, Wu, Mumford 96, 97):

p* = arg min_Φ { max_β entropy( p(I; β) ) }    (min over the chosen features Φ, max over the parameters β)

FRAME Model

• FRAME model

– Filtering, random field, and maximum entropy

– A well-defined mathematical model for textures by combining filtering and random field models


I.I Descriptive Model on Finite Lattice

The FRAME model (Zhu, Wu, Mumford, 1996):

p(I; β) = (1 / Z(β)) exp{ Σ_{j=1}^{k} β_j φ_j(I) }

This includes all Markov random field models.

Remark: all known exponential models come from maximum entropy, and maximum entropy was proposed in physics (Jaynes, 1957). The nice thing is that it provides a parametric model integrating the features.

I.I Descriptive Model on Finite Lattice

Two learning phases:

1. Choose information-bearing features -- augmenting the probability family:

Ω_1, Ω_2, ..., Ω_n → Ω*

2. Compute the parameters Λ by MLE -- learning within a family.

Maximum Entropy

• Maximum entropy

– Is an important principle in statistics for constructing a probability distribution on a set of random variables

– Suppose the available information is the expectations of some known functions φ_n(x), that is

E_p[ φ_n(x) ] = ∫ φ_n(x) p(x) dx = μ_n,   for n = 1, ..., N

– Let Ω be the set of all probability distributions p(x) which satisfy the constraints

Ω = { p(x) : E_p[ φ_n(x) ] = μ_n,  n = 1, ..., N }

Maximum Entropy – cont.

• Maximum Entropy – continued

– According to the maximum entropy principle, a good choice of the probability distribution is the one that has the maximum entropy:

p*(x) = arg max { - ∫ p(x) log p(x) dx }

subject to

E_p[ φ_n(x) ] = ∫ φ_n(x) p(x) dx = μ_n,   for n = 1, ..., N

∫ p(x) dx = 1

Maximum Entropy – cont.

• Maximum Entropy – continued

– By Lagrange multipliers, the solution for p(x) is

p(x; Λ) = (1 / Z(Λ)) exp{ Σ_{n=1}^{N} λ_n φ_n(x) }

– where Λ = (λ_1, λ_2, ..., λ_N) and

Z(Λ) = ∫ exp{ Σ_{n=1}^{N} λ_n φ_n(x) } dx

Maximum Entropy – cont.

• Maximum Entropy – continued

– Λ = (λ_1, λ_2, ..., λ_N) is determined by the constraints

– But a closed-form solution is not available in general

• Numerical solution:

dλ_n / dt = E_{p(x; Λ)}[ φ_n(x) ] - μ_n,    n = 1, 2, ..., N
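On a small discrete space the expectations and Z(Λ) can be computed exactly, so the numerical solution can be sketched directly (NumPy only; the sample space, constraint functions φ_n, and targets μ_n are invented for illustration). One note on sign conventions: with the exp{ +Σ λ_n φ_n } form above, stepping λ toward the constraints means moving along μ_n - E_p[φ_n]; the slide's E_p[φ_n] - μ_n is the same update when the exponent carries a minus sign.

```python
# Sketch (NumPy only): solve for the Lagrange multipliers on a toy discrete
# space where p(x; Lambda), Z(Lambda) and E_p[phi_n] are exact.
import numpy as np

xs = np.arange(6, dtype=float)            # toy sample space {0, 1, ..., 5}
phi = np.stack([xs, xs ** 2])             # constraint functions phi_1, phi_2
mu = np.array([2.0, 6.0])                 # target expectations mu_1, mu_2

lam = np.zeros(2)
for _ in range(20_000):
    logits = lam @ phi                    # sum_n lambda_n phi_n(x)
    p = np.exp(logits - logits.max())
    p /= p.sum()                          # p(x; Lambda) = exp{...} / Z(Lambda)
    lam += 0.01 * (mu - phi @ p)          # step toward matching E_p[phi_n] = mu_n

print("E_p[phi] =", phi @ p, " target mu =", mu)
```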

Maximum Entropy – cont.

• Maximum Entropy – continued

– The solutions are guaranteed to exist and be unique by the following properties


Minimax Entropy Learning

(cont.)

Intuitive interpretation of minimax entropy.


Learning A High Dimensional Density


Toy Example I


Toy Example II


FRAME Model

• Texture modeling

– The features φ_n(x) can be anything you want

– Histograms of filter responses are a good feature for textures

FRAME Model – cont.

• The FRAME algorithm

– Initialization

Input a texture image I_obs

Select a group of K filters S_K = { F^(1), F^(2), ..., F^(K) }

Compute the observed histograms { H_obs^(a), a = 1, ..., K }

Initialize λ_i^(a) = 0 for i = 1, 2, ..., L and a = 1, 2, ..., K

Initialize I_syn as a uniform white noise image

FRAME Model – cont.

• The FRAME algorithm – continued

– The algorithm

Repeat

Calculate H_syn^(a), a = 1, ..., K from I_syn and use it as an estimate of E_{p(I; Λ_K, S_K)}[ H^(a) ]

Update each λ_i^(a) by  dλ_i^(a) / dt = E_{p(I; Λ_K, S_K)}[ H^(a) ] - H_obs^(a)

Apply the Gibbs sampler to flip I_syn for w sweeps

Until  (1/2) Σ_{i=1}^{L} | H_i^obs(a) - H_i^syn(a) | ≤ ε  for a = 1, 2, ..., K
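A compact, runnable caricature of this loop is sketched below, assuming only NumPy: one toroidal gradient filter, 8 grey levels, a 32x32 lattice, and a noise image playing the role of I_obs. It illustrates the structure (histogram matching through λ updates interleaved with Gibbs sweeps) rather than reproducing the authors' implementation.

```python
# Illustrative FRAME-style loop (NumPy only). Sizes, schedules and the single
# filter are simplifications; the energy uses the exp{-<lambda, H>} convention,
# which pairs with the update d(lambda)/dt = E_p[H] - H_obs.
import numpy as np

rng = np.random.default_rng(0)
G, L, SIZE = 8, 9, 32                                   # grey levels, bins, lattice
edges = np.linspace(-(G - 1), G - 1, L + 1)             # histogram bin edges

def response_hist(img):
    dx = np.roll(img, -1, axis=1) - img                 # toroidal horizontal gradient
    h, _ = np.histogram(dx, bins=edges)
    return h / dx.size

def local_energy(img, i, j, val, lam):
    # Only the two gradients touching pixel (i, j) change when its value does.
    left = img[i, (j - 1) % SIZE]
    right = img[i, (j + 1) % SIZE]
    b = np.digitize([val - left, right - val], edges[1:-1])
    return lam[b[0]] + lam[b[1]]

def gibbs_sweep(img, lam):
    for i in range(SIZE):
        for j in range(SIZE):
            e = np.array([local_energy(img, i, j, v, lam) for v in range(G)])
            p = np.exp(-(e - e.min()))                  # p(I_ij = v | rest)
            img[i, j] = rng.choice(G, p=p / p.sum())
    return img

I_obs = rng.integers(0, G, (SIZE, SIZE))                # stand-in "observed" texture
H_obs = response_hist(I_obs)
I_syn = rng.integers(0, G, (SIZE, SIZE))                # white-noise initialization
lam = np.zeros(L)

for step in range(30):
    H_syn = response_hist(I_syn)                        # Monte Carlo estimate of E_p[H]
    lam += H_syn - H_obs                                # d(lambda)/dt = E_p[H] - H_obs
    I_syn = gibbs_sweep(I_syn, lam)                     # w = 1 sweep per update
    if 0.5 * np.abs(H_obs - H_syn).sum() < 0.05:        # histogram-matching stop rule
        break
```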

FRAME Model – cont.

• The Gibbs sampler


FRAME Model – cont.

• Filter selection

– In practice, we want a small number of “good” filters

– One way to do that is to choose filters that carry the most information

• In other words, minimum entropy
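In code, one cheap surrogate for this criterion is to rank the unused filters by how far their observed histograms sit from the histograms of the current synthesized texture. The sketch below (NumPy only, illustrative names; the L1 distance is a crude stand-in for the exact entropy decrease) follows that idea.

```python
# Sketch of greedy filter selection: pick the unused filter whose observed
# histogram differs most from the current synthesized histogram.
import numpy as np

def normalized_hist(responses, bins=15, value_range=(-1.0, 1.0)):
    h, _ = np.histogram(responses, bins=bins, range=value_range)
    return h / max(h.sum(), 1)

def select_next_filter(responses_obs, responses_syn, chosen, bins=15):
    """responses_obs / responses_syn: lists of 1-D response arrays, one per filter."""
    best, best_gain = None, -np.inf
    for a in range(len(responses_obs)):
        if a in chosen:
            continue
        gain = 0.5 * np.abs(normalized_hist(responses_obs[a], bins)
                            - normalized_hist(responses_syn[a], bins)).sum()
        if gain > best_gain:
            best, best_gain = a, gain
    return best, best_gain
```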


FRAME Model – cont.

• Filter selection algorithm

– Initialization


FRAME Model – cont.


Descriptive Models

– cont.


Existing Texture Features


Existing Feature Statistics


Most General Feature Statistics


Julesz Ensemble – cont.

• Definition

– Given a set of normalized statistics h = { h^(a) : a = 1, 2, ..., K } on a lattice Λ, a Julesz ensemble Ω(h) is the limit of the following set as Λ → Z² and H → { h }, under some boundary conditions:

Ω_Λ(H) = { I : h(I) ∈ H }

Julesz Ensemble – cont.

• Feature selection

– A feature can be selected from a large set of features through information gain, or the decrease in entropy


Example: 2D Flexible Shapes


A Random Field for 2D Shape

The neighborhood

Co-linearity, co-circularity, proximity, parallelism, symmetry, …


A Descriptive Shape Model

Random 2D shapes sampled from a Gibbs model.

(Zhu, 1999)


A Descriptive Shape Model

Random 2D shapes sampled from a Gibbs model.


Example: Face Modeling


Generative Models

• Use of hidden variables that can “explain away” the strong dependency in observed images

– This requires a vocabulary

– Grammars to generate images from hidden variables

• Note that generative models can not be separated from descriptive models

– The description of hidden variables requires descriptive models


Generative Models

– cont.


Philosophy of Generative Models

[Diagram: an unknown world structure H and a joint density f(H, I); the observer only sees images I_1, I_2, ..., I_n ~ f(I)]

Features F(I)   vs.   hidden variables H ~ p(H | I)

Example of Generative Model: image coding

I = Σ_{i=1}^{k} α_i ψ_i + n

where the α_i are random variables and the ψ_i are parameters (wavelets).

Assumptions:

1. Overcomplete basis

2. High kurtosis for the i.i.d. α_i, e.g.  p(α) ∝ exp{ -λ |α| }
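Inferring the hidden coefficients under such a sparse prior can be sketched with a simple matching-pursuit loop (NumPy only; matching pursuit is just one convenient stand-in for the MAP or posterior inference the model actually calls for):

```python
# Sketch (NumPy only): matching pursuit as a stand-in for inferring the sparse
# coefficients alpha in I = sum_i alpha_i psi_i + n under a heavy-tailed prior.
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms=10):
    """dictionary: (n_basis, dim) array of unit-norm basis functions psi_i."""
    residual = np.asarray(signal, dtype=float).copy()
    alpha = np.zeros(len(dictionary))
    for _ in range(n_atoms):
        corr = dictionary @ residual              # correlation with every psi_i
        i = int(np.argmax(np.abs(corr)))          # best-matching atom
        alpha[i] += corr[i]
        residual -= corr[i] * dictionary[i]       # explain away that component
    return alpha, residual
```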

A Generative Model

(Zhu and Guo, 2000)

[Diagram: texton layers (T_1, Ψ_1) and (T_2, Ψ_2) are combined by occlusion, plus additive noise, to form the image I]

Example: Texton Map

One layer of hidden variables: the texton map

T = { n, (x_i, y_i, θ_i, s_i, α_i) : i = 1, 2, ..., n }

Learning with Generative Model

p(I_obs; Θ) = ∫ p(I_obs | H; Θ) p(H; β) dH,    Θ = (Ψ_1, ..., Ψ_k; β_1, ..., β_k)

1. A generative model from H to I:

I = I(T_1, Ψ_1) ⊕ I(T_2, Ψ_2) ⊕ ... ⊕ I(T_k, Ψ_k) + n    (⊕ denotes the occlusion combination)

2. A descriptive model for H:

p(H; β) = Π_{i=1}^{k} p(T_i; β_i)

Learning with Generative Model

p* = arg min_{p ∈ Ω} D(f || p) = arg max_{p ∈ Ω} log p(I_obs; Θ)

Learning by MLE:

∂ log p(I_obs; Θ) / ∂Θ = ∫ [ ∂ log p(I_obs | H; Θ) / ∂Θ + Σ_{i=1}^{k} ∂ log p(T_i; β_i) / ∂β_i ] p(H | I_obs; Θ) dH

1. Regression, fitting

2. Minimax entropy learning

3. Stochastic inference
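Structurally, these three steps fit a stochastic-EM-style loop: sample the hidden variables from the posterior, then refit the generative parameters and the descriptive parameters from those samples. The skeleton below only pins down that structure; every helper (sample_posterior, fit_likelihood, fit_descriptive) is a placeholder supplied by the caller, not part of the original method.

```python
# Structural sketch only: a stochastic-EM-style loop matching the slide's three
# ingredients. All callables are user-supplied placeholders.
def learn_generative_model(images, theta, beta,
                           sample_posterior,   # stochastic inference: H ~ p(H | I; theta, beta)
                           fit_likelihood,     # regression / fitting of p(I | H; theta)
                           fit_descriptive,    # minimax entropy learning of p(H; beta)
                           n_iters=100, n_samples=10):
    for _ in range(n_iters):
        # "E"-like step: Monte Carlo samples of the hidden variables
        samples = [sample_posterior(img, theta, beta)
                   for img in images for _ in range(n_samples)]
        # "M"-like steps: update the two halves of the model from the samples
        theta = fit_likelihood(images, samples, theta)
        beta = fit_descriptive(samples, beta)
    return theta, beta
```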

Stochastic Inference by DDMCMC

Goal: sampling H ~ p(H | I_obs; Θ)

Method: a "symphony" algorithm by data-driven Markov chain Monte Carlo (Zhu, Zhang and Tu 1999)

1. Posterior probability p(H | I_obs; Θ)    (computer vision)

2. Importance proposal probability density q(H | I)    (pattern recognition)
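The elementary move this framework relies on is a Metropolis-Hastings step whose candidates come from the bottom-up proposal q(H | I); a minimal sketch with user-supplied callables (all names illustrative) is:

```python
# Sketch (NumPy only) of one data-driven Metropolis-Hastings move: candidates
# come from a bottom-up proposal q and are accepted against the top-down posterior.
import numpy as np

def mh_step(H, log_posterior, propose, log_q, rng):
    """log_posterior(H) ~ log p(H | I_obs); propose(H, rng) draws H' from q;
    log_q(H_from, H_to) = log q(H_to | H_from)."""
    H_new = propose(H, rng)
    log_accept = (log_posterior(H_new) - log_posterior(H)
                  + log_q(H_new, H) - log_q(H, H_new))
    if np.log(rng.uniform()) < log_accept:
        return H_new, True          # move accepted
    return H, False                 # move rejected
```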

Example of A Generative Model

An observed image:


Data Clustering

The saliency maps used as proposal probabilities




A Descriptive Model for Texton Map


Example of A Generative Model


Data Clustering


A Descriptive Model on Texton Map




Example of A Generative Model


A Descriptive Model for Texton Map

