INF 5300 Most essential issues
• Contextual classification
• Texture based on filtering
Review
INF 5300 28.5.04 - AS
Background – contextual classification
• An image normally contains areas of similar class
– neighboring pixels tend to be correlated.
• Classified images based on a non-contextual model often contain isolated misclassified pixels (or small regions).
• How can we get rid of this?
– Majority filtering in a local neighborhood
– Remove small regions by region area
– Relaxation (Kittler and Foglein – see INF 3300 Lecture 23.09.03)
– Bayesian models for the joint distribution of pixel labels in a neighborhood.
• How do we know if the small regions are correct or not?
– Look at the data, integrate spatial models in the classifier.
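As an illustration of the first option, majority filtering, here is a minimal sketch. It assumes the classified image is a list of rows of integer labels and uses a square window; the function name `majority_filter` and the tie rule (keep the current label on ties) are choices made for this example, not part of the lecture.

```python
from collections import Counter

def majority_filter(labels, radius=1):
    """Replace each label by the most common label in its (2*radius+1)^2 window."""
    rows, cols = len(labels), len(labels[0])
    out = [row[:] for row in labels]
    for r in range(rows):
        for c in range(cols):
            # Collect the labels in the window, clipped at the image border.
            window = [
                labels[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            counts = Counter(window)
            best = max(counts.values())
            # Assumption: on ties the pixel keeps its current label.
            if counts[labels[r][c]] < best:
                out[r][c] = counts.most_common(1)[0][0]
    return out

# A classified image with one isolated pixel of class 1 inside a class-0 region.
classified = [
    [0, 0, 0, 1, 1],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
smoothed = majority_filter(classified)
```

The isolated pixel is relabeled while the border between the two regions stays in place. Note that the filter only looks at the labels, not at the data behind them, which is exactly the limitation the last bullet points at.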
Bayesian image classification
X = {x1,...,xN}: image of feature vectors to classify
C = {c1,...,cN}: class labels of the pixels
• Classification consists of choosing the class labels that maximize the posterior probability
$$P(C \mid X) = \frac{P(X \mid C)\,P(C)}{\sum_{\text{all classes}} P(X \mid C)\,P(C)}$$
• Maximizing P(C|X) with respect to c1,...,cN is equivalent to maximizing P(X|C)P(C), since the denominator does not depend on the classes c1,...,cN.
Haslett’s model (Haslett 1983)
• Based on 4-neighbors
• Model the probability of observing classes a, b, c, d as neighbors of pixel i, which has class k:
$$g(a,b,c,d \mid k) = \pi(a \mid k)\,\pi(b \mid k)\,\pi(c \mid k)\,\pi(d \mid k)$$
– π(a|k) is the probability of finding class a as the north neighbor of a pixel of class k.
– The same transition probabilities π(·|k) are used for all four neighbor positions.
– π(k|k) = 0.9 and π(l|k) = 0.1/(K-1) for l ≠ k are often used, where K is the number of classes.
(Figure: the 4-neighborhood of a pixel with class k, with a to the north, b to the west, c to the east and d to the south.)
Haslett’s model
• Classify each pixel i to the class k that maximizes
$$G(k) = \pi(k)\,p(x_i \mid k)\,T_k(x_{iN})\,T_k(x_{iE})\,T_k(x_{iW})\,T_k(x_{iS}), \qquad T_k(x) = \sum_m \pi(m \mid k)\,p(x \mid m)$$
• π(k) is the prior probability of class k (often equal for all classes). xiN, xiW, xiE, xiS are the north, west, east and south neighbors of pixel i.
• Haslett’s method is non-iterative and thus fast.
• It is sub-optimal in terms of finding class labels for all pixels in the scene.
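The rule G(k) can be sketched in a few lines. This is only an illustration under stated assumptions: scalar pixel values, one-dimensional Gaussian class densities p(x|k), equal priors, and neighbors outside the image simply skipped. The names `gaussian_pdf` and `haslett_classify` are made up for the example.

```python
import math

def gaussian_pdf(x, mean, std):
    """1-D Gaussian density, used here as the class-conditional p(x | k)."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def haslett_classify(image, means, stds, same_prob=0.9):
    K = len(means)
    rows, cols = len(image), len(image[0])
    # trans[k][m] = pi(m | k): same_prob on the diagonal, the rest shared equally.
    trans = [[same_prob if m == k else (1 - same_prob) / (K - 1) for m in range(K)]
             for k in range(K)]
    prior = [1.0 / K] * K  # assumption: equal priors pi(k)

    def likelihoods(x):
        return [gaussian_pdf(x, means[m], stds[m]) for m in range(K)]

    labels = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            p_center = likelihoods(image[r][c])
            scores = []
            for k in range(K):
                g = prior[k] * p_center[k]
                for rr, cc in ((r - 1, c), (r, c + 1), (r, c - 1), (r + 1, c)):
                    if 0 <= rr < rows and 0 <= cc < cols:  # skip neighbors outside the image
                        p_n = likelihoods(image[rr][cc])
                        # T_k(x) = sum over m of pi(m | k) p(x | m)
                        g *= sum(trans[k][m] * p_n[m] for m in range(K))
                scores.append(g)
            labels[r][c] = scores.index(max(scores))
    return labels

# The center value 6 is closer to class 1 (mean 10) than to class 0 (mean 0),
# but all four neighbors clearly belong to class 0.
image = [[0, 0, 0], [0, 6, 0], [0, 0, 0]]
labels = haslett_classify(image, means=[0.0, 10.0], stds=[3.0, 3.0])
```

In this toy image the non-contextual rule would assign the center pixel to class 1, while the contextual score G(k) assigns it to class 0. Each pixel is visited exactly once, which is why the method is non-iterative.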
Markov random fields
Basic assumption:
• The class label ci at pixel i is assumed to depend only on the class labels cj in a neighborhood Ni surrounding pixel i.
• P(ci | c1,...,cN) = P(ci | cj, j ∈ Ni): only the labels of the neighbors of pixel i matter.
• Using the equivalence between Gibbs random fields and Markov random fields, this probability can be shown to be
$$P(c_i \mid c_j,\ j \in N_i) = \frac{1}{Z}\, e^{-U(C)/T}$$
where C denotes the class labels in a local neighborhood, and Z and T are constants that can be ignored. U is called the energy function.
Energy functions and cliques
• For a Gibbs random field, U can be expressed as a sum of potential functions V over all cliques in the neighborhood:
$$U(c) = \sum_{\text{all cliques}} V(c)$$
(Figure: a second-order neighborhood and all cliques for this neighborhood.)
• A clique is a set of pixels that are all neighbors of each other; the model used here only needs pairs of neighbors.
• Using this scheme, texture models can be defined, but we will only look at a simple model called the Ising model.
The Ising model for spatial context
$$U_{\text{spatial}}(i) = \beta \sum_{k \in N_i} I(c_i, c_k)$$
• β controls the degree of spatial smoothing.
• I(ci,ck) = -1 if ci = ck and 0 otherwise.
• This corresponds to counting the number of pixels in the neighborhood assigned to the same class as pixel i.
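A small sketch of the Ising term for one pixel may make the sign convention concrete. It assumes a 4-neighborhood and a label image stored as a list of rows; `ising_energy` and its arguments are names chosen for this example.

```python
def ising_energy(labels, r, c, candidate, beta=1.5):
    """U_spatial for giving pixel (r, c) the label `candidate` (4-neighborhood assumed)."""
    rows, cols = len(labels), len(labels[0])
    energy = 0.0
    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
        if 0 <= rr < rows and 0 <= cc < cols:
            # I(c_i, c_k) = -1 when the labels agree, 0 otherwise.
            if labels[rr][cc] == candidate:
                energy -= beta
    return energy

labels = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
energy_if_one = ising_energy(labels, 1, 1, candidate=1)   # no neighbor agrees
energy_if_zero = ising_energy(labels, 1, 1, candidate=0)  # all four neighbors agree
```

Every agreeing neighbor lowers the energy by β, so the label shared by most neighbors gets the lowest spatial energy.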
How to classify the image
• Classification consists of choosing the class that maximizes P(xi|C)P(C).
• We can rewrite this in the form
$$P(x_i \mid C)\,P(C) = \frac{1}{Z_1}\, e^{-U_{\text{data}}(X \mid C)}\, e^{-U_{\text{spatial}}(C)}$$
• Maximizing P(xi|C)P(C) is equivalent to minimizing
$$U = U_{\text{data}}(X \mid C) + U_{\text{spatial}}(C)$$
where
$$U_{\text{spatial}}(i) = \beta \sum_{k \in N_i} I(c_i, c_k), \qquad U_{\text{data}}(X \mid C) = -\log P(x_i \mid C)$$
Udata(X|C)
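• Any kind of probability-based classifier can be used, for example a Gaussian classifier with K classes, a d-dimensional feature vector, and mean µk and covariance matrix Σk for class k. Since Udata is the negative log-likelihood:
$$U_{\text{data}}(x_i \mid c_i = k) = \frac{d}{2}\log(2\pi) + \frac{1}{2}\log|\Sigma_k| + \frac{1}{2}x_i^T\Sigma_k^{-1}x_i - \mu_k^T\Sigma_k^{-1}x_i + \frac{1}{2}\mu_k^T\Sigma_k^{-1}\mu_k$$
• The first term is the same for all classes and can be dropped when classes are compared:
$$U_{\text{data}}(x_i \mid c_i = k) = \frac{1}{2}x_i^T\Sigma_k^{-1}x_i - \mu_k^T\Sigma_k^{-1}x_i + \frac{1}{2}\mu_k^T\Sigma_k^{-1}\mu_k + \frac{1}{2}\log|\Sigma_k| + \text{const}$$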
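To make the Gaussian data energy concrete, here is a minimal sketch restricted to d = 2, so that the inverse covariance can be written in closed form without a linear algebra library. The restriction to two features, the function name `gaussian_energy_2d`, and the example numbers are assumptions for illustration only.

```python
import math

def gaussian_energy_2d(x, mean, cov):
    """U_data = -log p(x | k) for a 2-D Gaussian with mean `mean` and 2x2 covariance `cov`."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Closed-form inverse of the 2x2 covariance matrix.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mean[0], x[1] - mean[1]]
    mahalanobis = sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))
    # (d/2) log(2 pi) with d = 2, plus half the log-determinant and half the quadratic form.
    return math.log(2 * math.pi) + 0.5 * math.log(det) + 0.5 * mahalanobis

# Two classes with the same covariance but different means (made-up numbers).
cov = [[2.0, 0.5], [0.5, 1.0]]
class_means = [(0.0, 0.0), (3.0, 3.0)]
x = (1.0, 1.0)
energies = [gaussian_energy_2d(x, m, cov) for m in class_means]
best_class = energies.index(min(energies))
```

The quadratic form here is the expanded sum of the three terms involving xi and µk in the formula above, written as (x − µ)ᵀ Σ⁻¹ (x − µ). The non-contextual classifier picks the class with the smallest energy.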
But how do we minimize U for the whole image?
• This is an optimization problem involving simultaneous optimization of N class labels.
• Three common methods:
– Simulated annealing
– Maximizing posterior marginals
– Iterated conditional modes (ICM)
• We will only study the ICM algorithm, which converges only to a local minimum and is theoretically suboptimal, but computationally feasible.
ICM algorithm
1. Initialize ci, i=1,...,N with the non-contextual classification, by finding the class that minimizes Udata.
2. For all pixels i in the image, update ci with the class that minimizes U = Udata + Uspatial.
3. Repeat step 2 n times.
Usually fewer than 10 iterations are sufficient.
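The three steps can be sketched as follows. The sketch assumes that the data energies are already computed and stored as `u_data[r][c][k]`, that a 4-neighborhood is used, and that labels are updated in place in raster order; the early stop when nothing changes is an addition for convenience. The names `icm` and `argmin` are hypothetical.

```python
def argmin(values):
    """Index of the smallest value in a list."""
    return min(range(len(values)), key=values.__getitem__)

def icm(u_data, beta=1.5, n_iter=5):
    rows, cols, K = len(u_data), len(u_data[0]), len(u_data[0][0])
    # Step 1: non-contextual start, the class with the lowest U_data in each pixel.
    labels = [[argmin(u_data[r][c]) for c in range(cols)] for r in range(rows)]
    for _ in range(n_iter):  # Step 3: repeat the sweep
        changed = False
        for r in range(rows):  # Step 2: visit all pixels
            for c in range(cols):
                neigh = [labels[rr][cc]
                         for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                         if 0 <= rr < rows and 0 <= cc < cols]
                # U = U_data + U_spatial, with the Ising term -beta per agreeing neighbor.
                totals = [u_data[r][c][k] - beta * neigh.count(k) for k in range(K)]
                best = argmin(totals)
                if best != labels[r][c]:
                    labels[r][c] = best
                    changed = True
        if not changed:  # assumption: stop early once a sweep changes nothing
            break
    return labels

# All pixels prefer class 0 except the center, which weakly prefers class 1.
u_data = [[[0.0, 3.0] for _ in range(3)] for _ in range(3)]
u_data[1][1] = [2.0, 1.0]
result = icm(u_data, beta=1.5)
```

With β = 0 the spatial term vanishes and the center pixel keeps its non-contextual label; with β = 1.5 the four agreeing neighbors outweigh the weak data preference.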
How to choose the smoothing parameter β
• β controls the degree of spatial smoothing
• β normally lies in the range 1≤ β ≤2.5
• The value of β can be estimated based on formal parameter
estimation procedures (heavy statistics, but the best way!)
• Another approach is to try different values of β, and choose the
one that produces the best classification rate on the training
data set.
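The second approach amounts to a small grid search. The sketch below assumes a hypothetical callable `classify(beta)` that returns a label image for a given β (for example an ICM run), and a reference label image `truth` for the training data; both names are placeholders.

```python
def choose_beta(classify, candidates, truth):
    """Return the candidate beta whose classification agrees best with `truth`."""
    def accuracy(pred):
        pairs = [(p, t) for prow, trow in zip(pred, truth) for p, t in zip(prow, trow)]
        return sum(p == t for p, t in pairs) / len(pairs)
    # max() keeps the first candidate among equally good ones.
    return max(candidates, key=lambda beta: accuracy(classify(beta)))

# Toy stand-in for a real classifier: one pixel is only corrected for beta >= 1.5.
def toy_classify(beta):
    return [[0, 0], [0, 1 if beta < 1.5 else 0]]

truth = [[0, 0], [0, 0]]
best_beta = choose_beta(toy_classify, [1.0, 1.5, 2.0, 2.5], truth)
```

Choosing β on the training data can favor too much smoothing if the training regions are large and homogeneous, so a separate validation set may be preferable where one is available.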
A taxonomy of texture models
We can categorize texture models into different groups:
• Statistical models
– GLCM (gray-level co-occurrence matrices), GLRL (gray-level run lengths)
– Autocorrelation features
• Geometrical models
– Voronoi tessellation, structural models
• Model-based methods
– Markov random field models
– Fractals
• Signal-processing methods
– Frequency-based methods like wavelets, Gabor filters, filter banks, etc.
Texture based on filtering
• To discriminate textures containing structures with different spatial frequencies or different orientations, spatial filtering methods are useful.
• The most common approach is to set up a filter bank whose filters cover different ranges of the frequency spectrum, and compute the response to each filter. A special feature extraction function is then used to combine the filter outputs into texture descriptors.
• A simple approach is to use edge detection filters.
• This frequency-based approach is best suited to textures which can be identified as belonging to different regions of the Fourier spectrum.
Texture based on filter banks
(Figure: an original texture image, a 1D profile through it, the filtered profile, the profile after the nonlinear transform and after smoothing, and the resulting 2D feature image.)
Designing the filters in the filter bank
• A filter bank is a collection of spatial filters that covers the most interesting parts of the frequency domain.
• To detect a set of textures, a filter bank with filters that are tailored to the frequency characteristics of the textures is needed.
• The main idea is to partition the frequency space into different regions and apply one filter for each region.
• How do we partition the frequency domain, and how many filters do we use?
• Can we use prior knowledge about the textures to
tailor the filters?
Unsupervised filter banks
• Unsupervised means that no information about the
textures is used when selecting the filter banks.
• Several approaches have been tried:
– Laws filter masks
– Ring and wedge filters
– Gabor filter banks
– Wavelet transform
– Discrete Cosine Transform
– Quadrature Mirror Filters
Frequency response of ring and wedge filters
Gabor filter kernels
• We consider even-symmetric Gabor filters of the following form:
$$h(x, y) = \exp\left\{-\frac{1}{2}\left[\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right]\right\} \cos(2\pi f_0 x)$$
• This yields a filter with orientation 0°. Other filter orientations are found by rotating the reference coordinate system x,y. (Orientations 0°, 45°, 90°, and 135° are often used.)
• The Fourier transform of a Gaussian function is a Gaussian, thus the frequency response of each filter is a Gaussian function with a given center frequency and width.
• Gabor filters are claimed to give optimal localization properties both in the spatial and in the frequency domain (mainly because of their Gaussian shape).
• Different filter parameters can be chosen.
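A sampled version of such a kernel can be sketched as below. The sketch assumes an odd kernel size, f0 given in cycles per pixel, and a rotation of the coordinate system by `theta_deg` to obtain other orientations; the function name and the parameter values in the example are illustrative only.

```python
import math

def gabor_kernel(size, f0, sigma_x, sigma_y, theta_deg=0.0):
    """Even-symmetric Gabor kernel sampled on a size x size grid (size assumed odd)."""
    half = size // 2
    theta = math.radians(theta_deg)
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the reference coordinate system to get other orientations.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-0.5 * (xr ** 2 / sigma_x ** 2 + yr ** 2 / sigma_y ** 2))
            row.append(envelope * math.cos(2 * math.pi * f0 * xr))
        kernel.append(row)
    return kernel

# One kernel per commonly used orientation (made-up size, frequency and widths).
kernels = {theta: gabor_kernel(9, 0.125, 2.0, 2.0, theta) for theta in (0, 45, 90, 135)}
```

Convolving the image with each kernel gives one filtered image per orientation. In practice the kernel is often made zero-mean so that it does not respond to constant gray levels; that step is left out here.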
Frequency response for a Gabor filter bank
• Jain and Farrokhnia suggest a set of filters with center frequencies
$$\frac{\sqrt{2}}{2^6},\ \frac{\sqrt{2}}{2^5},\ \frac{\sqrt{2}}{2^4},\ \frac{\sqrt{2}}{2^3},\ \frac{\sqrt{2}}{2^2}$$
and orientations 0°, 45°, 90°, and 135°.
This gives an almost uniform coverage of the spectrum.
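The parameter grid of such a bank is easy to enumerate. The sketch below assumes that the center frequencies are √2 divided by powers of two from 2^6 down to 2^2 (the radical signs are an assumption about the original slide) and that they are measured in cycles per pixel.

```python
import math

# Center frequencies sqrt(2)/2^6 ... sqrt(2)/2^2, one octave apart (assumed reading).
center_frequencies = [math.sqrt(2) / 2 ** n for n in range(6, 1, -1)]
orientations = [0, 45, 90, 135]

# One (frequency, orientation) pair per filter in the bank.
bank = [(f0, theta) for f0 in center_frequencies for theta in orientations]
```

Five frequencies and four orientations give 20 filters; each pair would be turned into one Gabor kernel.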
From the output of a filter bank to texture features
• The result after applying a filter bank with M filters to an image is M filtered images.
• If texture is computed in a local window, M subimages result from each window position.
• These cannot be used as feature vectors directly. We apply some kind of feature extraction to the filtered images. We are looking for a feature extraction step that will give constant feature values for regions with equal texture, and different values for regions with different texture.
• There is no evident way to do this. A common approach is to first perform a non-linear transform, then to smooth the resulting image.
• The success of the texture model will depend on the success of this step.
Jain and Farrokhnia's feature extraction approach
• First, each filtered image is subjected to a nonlinear transform using a tanh function
$$\psi(t) = \tanh(\alpha t) = \frac{1 - e^{-2\alpha t}}{1 + e^{-2\alpha t}}$$
where α is a constant.
• This results in a threshold-like function, and gradual changes in the filtered images are converted to square-like blobs.
• Then, they compute the average absolute deviation from the mean in small overlapping windows
$$e_k(x, y) = \frac{1}{M^2} \sum_{(a,b) \in W_{xy}} \left|\psi(r_k(a, b))\right|$$
where rk is filtered image no. k and W_xy is an M × M window centered at (x, y).
• This is similar to Laws' texture feature.
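The two steps (tanh transform, then a local average of absolute values) can be sketched as follows. The sketch assumes that the filtered image is a list of rows of floats, that the absolute value is taken inside the sum, and that window positions outside the image contribute zero; `alpha=0.25` and the 3 × 3 window are illustrative values, and the function name is hypothetical.

```python
import math

def texture_feature(filtered, alpha=0.25, window=3):
    """Feature image e_k: windowed average of |tanh(alpha * r_k)| (window size assumed odd)."""
    rows, cols = len(filtered), len(filtered[0])
    half = window // 2
    # Nonlinear transform psi(t) = tanh(alpha * t), applied pixel by pixel.
    squashed = [[math.tanh(alpha * v) for v in row] for row in filtered]
    feature = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for rr in range(r - half, r + half + 1):
                for cc in range(c - half, c + half + 1):
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += abs(squashed[rr][cc])
            # Divide by M^2; positions outside the image count as zero.
            feature[r][c] = total / window ** 2
    return feature

# A strongly responding 5 x 5 filtered image: tanh saturates at 1.
strong = [[100.0] * 5 for _ in range(5)]
feature = texture_feature(strong)
```

In the interior the feature is constant (close to 1 here), which is the behavior wanted for a region of uniform texture; the lower values at the border are an artifact of the zero padding chosen for this sketch.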
• Let f0 be the radial center frequency for a bandpass filter in the filter bank.
• Use the following corresponding Gaussian lowpass filter
$$h_G(n) = \frac{1}{\sqrt{2\pi}\,\sigma_s}\, e^{-\frac{1}{2}\frac{n^2}{\sigma_s^2}}, \qquad \text{where } \sigma_s = \frac{1}{2\sqrt{2}\, f_0}$$
• Other choices are also used.
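A sampled 1-D version of this smoothing filter can be sketched as below, assuming σs = 1/(2√2 f0) with f0 in cycles per pixel and a kernel truncated at about three standard deviations; the truncation and the function name are choices made for this example.

```python
import math

def smoothing_kernel(f0, truncate=3.0):
    """Gaussian lowpass taps h_G(n) matched to a bandpass filter with center frequency f0."""
    sigma_s = 1.0 / (2.0 * math.sqrt(2.0) * f0)
    half = int(math.ceil(truncate * sigma_s))  # assumption: cut the kernel at ~3 sigma
    norm = math.sqrt(2 * math.pi) * sigma_s
    taps = [math.exp(-0.5 * (n / sigma_s) ** 2) / norm for n in range(-half, half + 1)]
    return sigma_s, taps

# For f0 = sqrt(2)/16 cycles per pixel the smoothing width is about 4 pixels.
sigma_s, taps = smoothing_kernel(math.sqrt(2) / 16)
```

Low center frequencies give wide smoothing kernels and high center frequencies give narrow ones, so the smoothing adapts to the scale of the texture each bandpass filter responds to. For a 2-D image the same taps can be applied along rows and then along columns, since the Gaussian is separable.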
The texture segmentation or classification step
• After the filtering, nonlinear transform and smoothing, a set of K feature images results. How do we use these to discriminate between various textures?
• It is possible to use them as input to a regular feature selection process. They can be used either individually or as a multivariate feature vector.
• Either unsupervised texture segmentation or supervised texture classification can be applied.
Frequency-based texture computation: when does it work?
• Scientists often report good results for texture patches like Brodatz images, which contain large regions of different textures with different orientations.
• Such synthetic texture patches have large regions and sharp borders between different textures. This is often not the case in real applications!
• A keyword is orientation: do the textures we want to discriminate have different orientations, or are they isotropic?
– For isotropic textures, filtering methods are often not so good.