Spatio-chromatic image content descriptors and
their analysis using Extreme Value theory
Vasileios Zografos and Reiner Lenz
(zografos@isy.liu.se, Reiner.Lenz@liu.se)
Computer Vision Laboratory, Linköping University, Sweden
Garnics
1. Introduction
Challenges for content-based image retrieval (CBIR):
Increase in online visual information
Large variation in content, appearance and quality
Images indexed by simple and erroneous textual tags

Complex, sophisticated, slow descriptors are not suited for large-scale CBIR tasks
Our proposal:
Fast spatio-chromatic descriptors suited for fast search over large image databases
Low-dimensional representation using models derived from Extreme Value theory
2. Spatio-chromatic descriptors
Symmetry groups and filter design:
Filter systems should be adapted to
• transformations of the image grid
• properties of the RGB color space
Symmetry groups D(4) and D(3)

Digital images are defined on grids (square or hexagonal)
• their symmetry groups are the dihedral groups D(4) and D(6) (see [1]).

RGB channels are on average interchangeable
• the RGB symmetry group is the permutation group S(3), which is isomorphic to the dihedral group D(3) (see [2]).

The representation theory of the dihedral groups is used to construct filter systems that:
• have simple transformation properties under grid and color transformations
• produce uncorrelated filter responses
• give minimum-mean-squared-error encoding
• generalize the FFT to spatial RGB distributions
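As a minimal illustration of the "simple transformation properties" under color transformations, the sketch below projects an RGB triple onto a basis adapted to channel permutations: one intensity vector and two chromatic vectors. The basis vectors and the function names (`to_basis`, `check_invariants`) are assumptions made for this example and are not taken from the poster; the actual filter systems in [1] and [2] also include the spatial part.

```python
import itertools
import math

# Orthonormal basis adapted to permutations of (R, G, B) (assumed for this sketch):
# row 0 spans the invariant "intensity" direction, rows 1-2 span the
# two-dimensional "chromatic" subspace on which D(3) acts.
BASIS = [
    [1 / math.sqrt(3), 1 / math.sqrt(3), 1 / math.sqrt(3)],
    [1 / math.sqrt(2), -1 / math.sqrt(2), 0.0],
    [1 / math.sqrt(6), 1 / math.sqrt(6), -2 / math.sqrt(6)],
]


def to_basis(rgb):
    """Project an RGB triple onto the intensity/chromatic basis."""
    return [sum(b * c for b, c in zip(row, rgb)) for row in BASIS]


def check_invariants(rgb):
    """Return (intensity, chromatic magnitude) for all 6 channel permutations."""
    results = []
    for perm in itertools.permutations(range(3)):
        permuted = [rgb[i] for i in perm]
        intensity, c1, c2 = to_basis(permuted)
        results.append((intensity, math.hypot(c1, c2)))
    return results


if __name__ == "__main__":
    for intensity, chroma in check_invariants([0.2, 0.5, 0.9]):
        print(f"intensity={intensity:.6f}  chroma magnitude={chroma:.6f}")
```

Permuting the channels leaves the intensity component and the magnitude of the chromatic vector unchanged, which is one reason for describing each filter vector by its magnitude.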
3. Extreme value theory (EVT)
“The limiting distribution of the extrema of a large number of i.i.d. random variables, is
one of the three parametric forms:”
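Weibull: $W(x) = 1 - \exp\left(-\left(\frac{x-\mu}{\sigma}\right)^{k}\right)$
Fréchet: $F(x) = \exp\left(-\left(\frac{x-\mu}{\sigma}\right)^{-k}\right)$
Gumbel: $G(x) = \exp\left(-\exp\left(-\frac{x-\mu}{\sigma}\right)\right)$    (1)
with location $\mu$, scale $\sigma$ and shape $k$.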
Our filters are essentially sums of differences of correlated variables. Sums of correlated variables also lead to the EVT forms in (1) [3].
We can use (1) as analytical models of the spatio-chromatic filtered image
distribution.
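The three forms in (1) can be written down directly as functions. The sketch below assumes the standard location/scale/shape parameterization (`mu`, `sigma`, `k`), which may differ in detail from the one used on the poster. The helper `simulate_maxima` is a hypothetical illustration of the limit statement: centered maxima of i.i.d. exponential draws approach the Gumbel form.

```python
import math
import random


def weibull_cdf(x, mu, sigma, k):
    """Weibull form, supported on x > mu."""
    if x <= mu:
        return 0.0
    return 1.0 - math.exp(-(((x - mu) / sigma) ** k))


def frechet_cdf(x, mu, sigma, k):
    """Frechet form, supported on x > mu."""
    if x <= mu:
        return 0.0
    return math.exp(-(((x - mu) / sigma) ** (-k)))


def gumbel_cdf(x, mu, sigma):
    """Gumbel form, supported on the whole real line."""
    return math.exp(-math.exp(-(x - mu) / sigma))


def simulate_maxima(n=200, trials=2000, seed=0):
    """Maxima of n i.i.d. Exp(1) draws, centered by log(n)."""
    rng = random.Random(seed)
    return [
        max(rng.expovariate(1.0) for _ in range(n)) - math.log(n)
        for _ in range(trials)
    ]


if __name__ == "__main__":
    maxima = simulate_maxima()
    for x in (-1.0, 0.0, 1.0, 2.0):
        empirical = sum(1 for m in maxima if m <= x) / len(maxima)
        print(f"x={x:+.1f}  empirical={empirical:.3f}  Gumbel={gumbel_cdf(x, 0.0, 1.0):.3f}")
```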
4. Our approach
Method:
Filter each image with the 48 spatio-chromatic filters organized in 24 vectors
Represent the magnitude of each filter vector as: model type + 3 parameters from (1)
Parameter estimation: maximum-likelihood (ML) estimation using Newton-Raphson iteration
Model type selection: residual-based goodness-of-fit (g.o.f.) with the coefficient of determination R²
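The parameter-estimation step can be sketched for one model type. The example below is a simplification, not the poster's implementation: it fits a two-parameter Weibull (location fixed at 0, an assumption made here) by solving the profile-likelihood equation for the shape with Newton-Raphson and then computing the scale in closed form. The function name `fit_weibull_ml` and its defaults are hypothetical.

```python
import math
import random


def fit_weibull_ml(samples, k0=1.0, tol=1e-10, max_iter=100):
    """ML fit of a Weibull with location 0; returns (scale, shape).

    All samples must be strictly positive.
    """
    logs = [math.log(x) for x in samples]
    mean_log = sum(logs) / len(logs)
    k = k0
    for _ in range(max_iter):
        w = [x ** k for x in samples]
        s0 = sum(w)
        s1 = sum(wi * li for wi, li in zip(w, logs))
        s2 = sum(wi * li * li for wi, li in zip(w, logs))
        # Profile-likelihood equation g(k) = 0 and its derivative.
        g = s1 / s0 - 1.0 / k - mean_log
        dg = (s2 * s0 - s1 * s1) / (s0 * s0) + 1.0 / (k * k)
        k_new = k - g / dg
        if k_new <= 0.0:
            # Guard against an overshoot into the invalid region.
            k_new = k / 2.0
        converged = abs(k_new - k) < tol
        k = k_new
        if converged:
            break
    sigma = (sum(x ** k for x in samples) / len(samples)) ** (1.0 / k)
    return sigma, k


if __name__ == "__main__":
    rng = random.Random(1)
    # random.weibullvariate(alpha, beta): alpha is the scale, beta the shape.
    data = [rng.weibullvariate(2.0, 1.5) for _ in range(5000)]
    print(fit_weibull_ml(data))
```

Applied to the magnitudes of each of the 24 filter vectors, a fit of this kind yields the per-vector parameters that make up the low-dimensional descriptor.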
Result:
We can do analysis and classification in a low-dimensional 24x3 parameter space
Additional benefits of the EVT models compared to histograms:
• Continuous; natural clustering in scale-shape space; isolation of semantic information (saliency)
How well do the EVT models explain our filtered data?
Two image databases (1100 colour photos + 30000 thumbnails), natural and synthetic
Tested all 3 models in (1)
Various g.o.f. measures (K-S test, G-test, chi-square, R²)
Results:
The EVT models provide a good fit to over 80% of the filtered images
Especially suited for natural images
R² is more robust than the other typical statistical measures
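A residual-based R² score can be sketched as follows. The poster does not state how the residuals are formed; this example assumes they are taken between the empirical CDF of the data and the fitted model CDF. The names `r_squared` and `make_weibull_cdf` are hypothetical.

```python
import math
import random


def r_squared(samples, model_cdf):
    """Coefficient of determination between empirical and model CDF values."""
    xs = sorted(samples)
    n = len(xs)
    empirical = [(i + 0.5) / n for i in range(n)]
    fitted = [model_cdf(x) for x in xs]
    mean_emp = sum(empirical) / n
    ss_res = sum((e - f) ** 2 for e, f in zip(empirical, fitted))
    ss_tot = sum((e - mean_emp) ** 2 for e in empirical)
    return 1.0 - ss_res / ss_tot


def make_weibull_cdf(sigma, k):
    """Weibull CDF with location 0, returned as a one-argument function."""
    def cdf(x):
        return 1.0 - math.exp(-((x / sigma) ** k)) if x > 0 else 0.0
    return cdf


if __name__ == "__main__":
    rng = random.Random(2)
    data = [rng.weibullvariate(2.0, 1.5) for _ in range(2000)]
    print("matching model:  ", r_squared(data, make_weibull_cdf(2.0, 1.5)))
    print("mismatched model:", r_squared(data, make_weibull_cdf(2.0, 0.5)))
```

Model type selection then amounts to computing this score for each candidate form in (1) and keeping the one with the highest value.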
5. Experiments – The scale-shape space
The scale-shape space is the space spanned by two of the model parameters in (1): scale and shape.
We can analyse the location and dispersion of filtered images in that space and their
trajectories as their properties vary. It turns out that the images occupy different portions of
that space depending on their texture properties and intensity variation.
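A short worked example suggests why simple transformations trace regular trajectories in this space. It assumes a Weibull model with location $\mu = 0$ and a transformation applied directly to the filter-response magnitude $X$; the poster's experiments transform the image itself, so this is only an idealized sketch.

```latex
% X follows the Weibull form with scale \sigma and shape k (location 0):
%   P(X \le x) = 1 - \exp\left(-(x/\sigma)^{k}\right), \quad x > 0.

% Linear scaling Y = aX, a > 0: only the scale changes.
P(aX \le y) = P\left(X \le \tfrac{y}{a}\right)
            = 1 - \exp\left(-\left(\frac{y}{a\sigma}\right)^{k}\right)
\;\Rightarrow\; (\sigma, k) \mapsto (a\sigma,\, k)

% Power-law transformation Y = X^{p}, p > 0: scale and shape both change.
P(X^{p} \le y) = P\left(X \le y^{1/p}\right)
               = 1 - \exp\left(-\left(\frac{y}{\sigma^{p}}\right)^{k/p}\right)
\;\Rightarrow\; (\sigma, k) \mapsto (\sigma^{p},\, k/p)
```

Under these assumptions, a linear change moves a point along the scale axis only, while a power-law change moves it along a curve in both coordinates.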
Fig 1. Samples from a photo database distributed in scale-shape space. This behaviour generalises to other datasets.
Fig 2. Trajectories of model parameters in scale-shape space of an image under linear and nonlinear transformations (left) and increase in noise and smoothing (right).
Fig 3. Original, downscaled image (left) and a filtered result (middle). The filter responses at the tails (i.e. extrema) of the distribution are shown on the right. We can see that extrema typically correspond to salient features such as edges and corners.
Fig 4. The intensity and colour filters also have a natural, distinct distribution in this space.
6. Experiments – classification and retrieval
The filters and EVT models can be used for very fast classification and retrieval.
Trained an SVM on the 24x3 parameters
4-class classification example of scenes and painting styles (abstract classes)
Fig 5. Top ranked results from the 4 classes. Overall all-to-all classification score 40.5%.
7. Conclusions
Presented a set of spatio-chromatic descriptors well suited for fast image retrieval
We have used the EVT models to describe the filter output distributions
More flexible, more descriptive and more compact than competing representations such as histograms and fragmentation theory.
References:
[1] R. Lenz. “Investigation of receptive fields using representations of dihedral groups” JVCIR 6 (1995) 209-227
[2] R. Lenz et al. “A group theoretical toolbox for color image operators” ICIP 3 (2005) 557-560
[3] E. Bertin et al. “Generalized extreme value statistics and sum of correlated variables” J. Phys. A: Math. Gen. 39 (2006) 7607
This research was funded by the EU FP7/2007-2013 programme, under grant agreement No 247947 – GARNICS.