INF 5300: Most essential issues
Contextual classification
Texture based on filtering
Review, INF 5300, 28.5.04 - AS

Background: contextual classification
• An image normally contains areas of similar class: neighboring pixels tend to be correlated.
• Classified images based on a non-contextual model often contain isolated misclassified pixels (or small regions).
• How can we get rid of these?
  – Majority filtering in a local neighborhood
  – Removing small regions by region area
  – Relaxation (Kittler and Foglein, see INF 3300 Lecture 23.09.03)
  – Bayesian models for the joint distribution of pixel labels in a neighborhood
• How do we know whether the small regions are correct or not?
  – Look at the data, and integrate spatial models in the classifier.

Bayesian image classification
X = {x_1, ..., x_N}: image of feature vectors to classify
C = {c_1, ..., c_N}: class labels of the pixels
• Classification consists of choosing the class labels that maximize the posterior probability
$$P(C \mid X) = \frac{P(X \mid C)\,P(C)}{\sum_{\text{all classes}} P(X \mid C)\,P(C)}$$
• Maximizing P(C|X) with respect to c_1, ..., c_N is equivalent to maximizing P(X|C)P(C), since the denominator does not depend on the classes c_1, ..., c_N.

Haslett's model (Haslett 1983)
• Based on 4-neighbors.
• Model the probability of observing classes a, b, c, d as neighbors of pixel i, which has class k:
$$g(a,b,c,d \mid k) = \pi(a \mid k)\,\pi(b \mid k)\,\pi(c \mid k)\,\pi(d \mid k)$$
  – π(a|k) is the probability of finding class a as the north neighbor of class k.
  – The same transition probabilities π(·|k) are used for all four neighbor directions.
  – π(k|k) = 0.9 and π(l|k) = 0.1/(K-1) for l ≠ k is often used, where K is the number of classes.
Neighborhood layout:
      a
    b k c
      d

Haslett's model
• Classify each pixel i to the class k that maximizes
$$G(k) = \pi(k)\,p(x_i \mid k)\,T_k(x_{iN})\,T_k(x_{iE})\,T_k(x_{iW})\,T_k(x_{iS}), \qquad T_k(x) = \sum_m \pi(m \mid k)\,p(x \mid m)$$
• π(k) is the prior probability of class k (often equal for all classes). x_{iN}, x_{iW}, x_{iE}, x_{iS} are the feature vectors of the north, west, east and south neighbors of pixel i.
• Haslett's method is non-iterative and thus fast.
• It is sub-optimal in terms of finding class labels for all pixels in the scene.

Markov random fields
Basic assumption:
• The class label c_i at pixel i is assumed to depend only on the class labels c_j of the neighbors in a neighborhood N_i surrounding pixel i:
$$P(c_i \mid c_1, \dots, c_N) = P(c_i \mid c_j,\ j \in N_i)$$
• Using the equivalence between Gibbs random fields and Markov random fields, this probability can be shown to be
$$P(c_i \mid c_j,\ j \in N_i) = \frac{1}{Z}\, e^{-U(C)/T}$$
where C denotes the class labels in the local neighborhood, and Z and T are constants that can be ignored. U is called the energy function.

Energy functions and cliques
• For a Gibbs random field, U can be expressed as a sum of potential functions over all cliques in the neighborhood:
$$U(c) = \sum_{\text{all cliques}} V(c)$$
[Figure: a second-order neighborhood and all cliques for this neighborhood]
• A clique is a set of pixels in which every pixel is a neighbor of every other; here we only use pairs of neighbors.
• Using this scheme, texture models can be defined, but we will only look at a simple model called the Ising model.

The Ising model for spatial context
$$U_{spatial}(i) = \beta \sum_{k \in N_i} I(c_i, c_k)$$
• β controls the degree of spatial smoothing.
• I(c_i, c_k) = -1 if c_i = c_k, and 0 otherwise.
• This corresponds to counting the number of pixels in the neighborhood assigned to the same class as pixel i: the energy decreases by β for every neighbor that agrees with pixel i.

How to classify the image
• Classification consists of choosing the class that maximizes P(x_i|C)P(C).
• We can rewrite this in the form
$$P(x_i \mid C)\,P(C) = \frac{1}{Z_1}\, e^{-U_{data}(X \mid C)}\, e^{-U_{spatial}(C)}$$
• Maximizing P(x_i|C)P(C) is equivalent to minimizing
$$U = U_{data}(X \mid C) + U_{spatial}(C)$$
where
$$U_{spatial}(i) = \beta \sum_{k \in N_i} I(c_i, c_k), \qquad U_{data}(X \mid C) = -\log P(x_i \mid C)$$

Udata(X|C)
• Any kind of probability-based classifier can be used, for example a Gaussian classifier with K classes, d-dimensional feature vectors, mean µ_k and covariance matrix Σ_k. With U_data defined as the negative log-likelihood:
$$U_{data}(x_i \mid c_i = k) = \frac{d}{2}\log(2\pi) + \frac{1}{2}\log|\Sigma_k| + \frac{1}{2} x_i^T \Sigma_k^{-1} x_i - \mu_k^T \Sigma_k^{-1} x_i + \frac{1}{2}\mu_k^T \Sigma_k^{-1} \mu_k$$
• Dropping the class-independent constant (d/2) log(2π), it is enough to minimize
$$\frac{1}{2} x_i^T \Sigma_k^{-1} x_i - \mu_k^T \Sigma_k^{-1} x_i + \frac{1}{2}\mu_k^T \Sigma_k^{-1} \mu_k + \frac{1}{2}\log|\Sigma_k|$$

But how do we minimize U for the whole image?
• This is an optimization problem involving simultaneous optimization of N class labels.
• Three common methods:
  – Simulated annealing
  – Maximizing posterior marginals
  – Iterated conditional modes (ICM)
• We will only study the ICM algorithm, which converges only to a local minimum and is theoretically suboptimal, but is computationally feasible.

ICM algorithm
1. Initialize c_i, i = 1, ..., N, as the non-contextual classification, by finding the class that minimizes Udata.
2. For all pixels i in the image, update c_i with the class that minimizes U = Udata + Uspatial.
3. Repeat step 2 n times. Usually fewer than 10 iterations are sufficient.

How to choose the smoothing parameter β
• β controls the degree of spatial smoothing.
• β normally lies in the range 1 ≤ β ≤ 2.5.
• The value of β can be estimated with formal parameter estimation procedures (heavy statistics, but the best way!).
• Another approach is to try different values of β and choose the one that gives the best classification rate on the training data set.
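
The ICM steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names `gaussian_data_energy` and `icm`, the 8-pixel (second-order) neighborhood, the in-place raster-scan update order and the early stop when no label changes are all assumptions made for the example.

```python
import numpy as np

def gaussian_data_energy(X, means, covs):
    """U_data[r, c, k] = -log p(x_rc | class k) under Gaussian class models.

    X: (H, W, d) feature image, means: (K, d), covs: (K, d, d).
    """
    H, W, d = X.shape
    U = np.empty((H, W, len(means)))
    for k, (mu, cov) in enumerate(zip(means, covs)):
        diff = X - mu
        # Mahalanobis distance (x - mu)^T Sigma^-1 (x - mu) for every pixel
        maha = np.einsum('hwi,ij,hwj->hw', diff, np.linalg.inv(cov), diff)
        _, logdet = np.linalg.slogdet(cov)
        U[..., k] = 0.5 * (d * np.log(2 * np.pi) + logdet + maha)
    return U

def icm(U_data, beta=1.5, n_iter=10):
    """Iterated conditional modes with an Ising prior (assumed 8-neighborhood)."""
    H, W, K = U_data.shape
    labels = U_data.argmin(axis=2)          # step 1: non-contextual start
    for _ in range(n_iter):                 # step 3: repeat n times
        changed = False
        for r in range(H):                  # step 2: visit every pixel
            for c in range(W):
                counts = np.zeros(K)        # neighbors per class
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (dr or dc) and 0 <= rr < H and 0 <= cc < W:
                            counts[labels[rr, cc]] += 1
                # U_spatial(k) = -beta * (number of neighbors with class k)
                best = int(np.argmin(U_data[r, c] - beta * counts))
                if best != labels[r, c]:
                    labels[r, c] = best
                    changed = True
        if not changed:                     # assumption: stop early if stable
            break
    return labels
```

With a two-class image in which a single pixel is misclassified by the data term alone, the spatial term pulls that pixel back to the class of its neighbors as long as β times the neighbor count outweighs the difference in Udata.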
A taxonomy of texture models
We can divide texture models into different groups:
• Statistical models
  – GLCM, GLRL
  – Autocorrelation features
• Geometrical models
  – Voronoi tessellation, structural models
• Model-based methods
  – Markov random field models
  – Fractals
• Signal-processing methods
  – Frequency-based methods like wavelets, Gabor filters, filter banks, etc.

Texture based on filtering
• To discriminate textures containing structures with different spatial frequencies or different orientations, spatial filtering methods are useful.
• The most common approach is to set up a filter bank whose filters cover different ranges of the frequency spectrum, and to compute the response to each filter. A feature extraction function is then used to combine the filter outputs into texture descriptors.
• A simple approach is to use edge detection filters.
• This frequency-based approach is best suited to textures that can be identified as belonging to different regions of the Fourier spectrum.

Texture based on filter banks
[Figure: original 1D profile, filtered profile, nonlinear transform, smoothed profile, and the resulting 2D feature image]

Designing the filters in the filter bank
• A filter bank is a collection of spatial filters that covers the most interesting parts of the frequency domain.
• To detect a set of textures, we need a filter bank with filters tailored to the frequency characteristics of those textures.
• The main idea is to partition the frequency space into different regions and apply one filter for each region.
• How do we partition the frequency domain, and how many filters do we use?
• Can we use prior knowledge about the textures to tailor the filters?
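
As one possible answer to the partitioning question, the sketch below splits the frequency plane into concentric rings and measures how much spectral energy falls in each ring. The ring-shaped partition, the function names `ring_masks` and `ring_energies`, and the ring edges are assumptions chosen for illustration; other partitions (wedges, Gabor-shaped regions) are equally valid.

```python
import numpy as np

def ring_masks(shape, edges):
    """Boolean masks that split the 2-D frequency plane into concentric rings.

    edges: increasing radial frequencies in cycles/pixel, e.g. (0, 0.1, 0.2, 0.8).
    """
    fy = np.fft.fftfreq(shape[0])[:, None]   # vertical frequency of each FFT bin
    fx = np.fft.fftfreq(shape[1])[None, :]   # horizontal frequency of each FFT bin
    radius = np.hypot(fy, fx)                # radial frequency
    return [(radius >= lo) & (radius < hi)
            for lo, hi in zip(edges[:-1], edges[1:])]

def ring_energies(image, edges):
    """Fraction of the spectral energy that falls inside each ring."""
    power = np.abs(np.fft.fft2(image)) ** 2
    total = power.sum()
    return [power[mask].sum() / total
            for mask in ring_masks(image.shape, edges)]
```

For an image that is a pure cosine with frequency 0.125 cycles/pixel, nearly all the energy ends up in the ring that contains 0.125, which is the behavior one would want from a filter tuned to that texture.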
Unsupervised filter banks
• Unsupervised means that no information about the textures is used when selecting the filter bank.
• Several approaches have been tried:
  – Laws filter masks
  – Ring and wedge filters
  – Gabor filter banks
  – Wavelet transform
  – Discrete Cosine Transform
  – Quadrature Mirror Filters

Frequency response of ring and wedge filters
[Figure: frequency response of ring and wedge filters]

Gabor filter kernels
• We consider even-symmetric Gabor filters of the following form:
$$h(x,y) = e^{-\frac{1}{2}\left[\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right]} \cos(2\pi f_0 x)$$
• This yields a filter with orientation 0°. Other filter orientations are found by rotating the reference coordinate system x, y. (Orientations 0°, 45°, 90°, and 135° are often used.)
• The Fourier transform of a Gaussian function is a Gaussian, thus the frequency response of each filter is a Gaussian function with a given center frequency and width.
• Gabor filters are claimed to give optimal localization properties in both the spatial and the frequency domain (mainly because of their Gaussian shape).
• Different filter parameters can be chosen.

Frequency response for a Gabor filter bank
• Jain and Farrokhnia suggest a set of filters with center frequencies
$$\frac{\sqrt{2}}{2^6},\ \frac{\sqrt{2}}{2^5},\ \frac{\sqrt{2}}{2^4},\ \frac{\sqrt{2}}{2^3},\ \frac{\sqrt{2}}{2^2}$$
and orientations 0°, 45°, 90°, and 135°. This gives an almost uniform coverage of the spectrum.

From the output of a filter bank to texture features
• The result of applying a filter bank with M filters to an image is M filtered images.
• If texture is computed in a local window, M subimages result from each window position.
• These cannot be used as feature vectors directly. Some kind of feature extraction must be applied to the filtered images. We are looking for a feature extraction step that gives constant feature values within regions of equal texture, and different values for regions with different texture.
• There is no evident way to do this. A common approach is to first perform a nonlinear transform and then smooth the resulting image.
• The success of the texture model will depend on the success of this step.

Jain and Farrokhnia's feature extraction approach
• First, each filtered image is subjected to a nonlinear transform using a tanh function
$$\psi(t) = \tanh(\alpha t) = \frac{1 - e^{-2\alpha t}}{1 + e^{-2\alpha t}}$$
where α is a constant.
• This results in a threshold-like function, and gradual changes in the filtered images are converted to square-like blobs.
• Then they compute the average absolute deviation from the mean in small overlapping windows:
$$e_k(x,y) = \frac{1}{M^2} \sum_{(a,b) \in W_{xy}} \left|\psi(r_k(a,b))\right|$$
where r_k is filtered image no. k and W_xy is an M × M window centered at (x, y). Here M is the window size, not the number of filters.
• This is similar to Laws' texture energy feature.

• Let f_0 be the radial center frequency of a bandpass filter in the filter bank.
• Use the following corresponding Gaussian lowpass filter for the smoothing:
$$h_G(n) = \frac{1}{\sqrt{2\pi}\,\sigma_s}\, e^{-\frac{1}{2}\frac{n^2}{\sigma_s^2}}, \qquad \sigma_s = \frac{1}{2\sqrt{2}\,f_0}$$
• Other choices are also used.

The texture segmentation or classification step
• After the filtering, nonlinear transform and smoothing, a set of K feature images results. How do we use these to discriminate between various textures?
• It is possible to use them as input to a regular feature selection process. They can be used either individually or as a multivariate feature vector.
• Either unsupervised texture segmentation or supervised texture classification can be applied.

Frequency-based texture computation: when does it work?
• Scientists often report good results for texture patches like Brodatz images, which contain large regions of different textures with different orientations.
• Such synthetic texture patches have large regions and sharp borders between different textures. This is often not the case in real applications!
• A keyword is orientation: do the textures we want to discriminate have different orientations, or are they isotropic?
  – For isotropic textures, filtering methods are often not so good.
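
The filter bank pipeline reviewed above (even-symmetric Gabor filtering, tanh nonlinearity, local averaging of the absolute value) can be sketched as follows. This is an illustrative sketch under stated assumptions, not Jain and Farrokhnia's exact procedure: the envelope width `sigma = 0.5 / f0`, the kernel size, `alpha = 0.25`, the box-shaped averaging window, the circular boundary handling through the FFT, and all function names are choices made for this example.

```python
import numpy as np

def gabor_kernel(f0, theta_deg, sigma_x, sigma_y, size):
    """Even-symmetric Gabor kernel h(x, y), rotated to orientation theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    t = np.deg2rad(theta_deg)
    xr = x * np.cos(t) + y * np.sin(t)        # rotated coordinate system
    yr = -x * np.sin(t) + y * np.cos(t)
    envelope = np.exp(-0.5 * (xr**2 / sigma_x**2 + yr**2 / sigma_y**2))
    return envelope * np.cos(2 * np.pi * f0 * xr)

def circular_filter(image, kernel):
    """Convolve via the Fourier domain (assumes circular image boundaries)."""
    H, W = image.shape
    kh, kw = kernel.shape
    padded = np.zeros((H, W))
    padded[:kh, :kw] = kernel
    # move the kernel center to the origin so the output is not shifted
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

def texture_features(image, freqs, thetas=(0, 45, 90, 135),
                     alpha=0.25, window=9):
    """One feature image per (frequency, orientation) pair, shape (H, W, n)."""
    box = np.ones((window, window)) / window**2   # M x M averaging window
    feats = []
    for f0 in freqs:
        sigma = 0.5 / f0                          # assumed envelope width
        size = 2 * int(3 * sigma) + 1             # assumed kernel support
        for theta in thetas:
            r = circular_filter(image, gabor_kernel(f0, theta, sigma, sigma, size))
            psi = np.tanh(alpha * r)              # nonlinear transform
            feats.append(circular_filter(np.abs(psi), box))  # local average
    return np.stack(feats, axis=-1)
```

On a test image whose left half varies along x and whose right half varies along y at the same frequency, the 0° feature image is large on the left and close to zero on the right, and the 90° feature image shows the opposite pattern. This matches the point above that orientation is the property these filters discriminate best.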