Lecture3 - Image Sim..

Seminar on Image Similarity and Image Retrieval
Presentation by Feliks Beilis
Background
Object categorization and object class detection: how to find images in a database that match a specific query, for example any red car or any brown horse.
Methods used: EMD on histograms, EMD on signatures.
Texture classification.
Methods used: Gabor filters, patch matching, EMD.
Left texture is the source. The three result panels list, for each rank, the distance and the retrieved image:

Rank  Panel 1              Panel 2              Panel 3
1)    0.00  29020.jpg      0.00  29020.jpg       0.00  29020.jpg
2)    0.11  29077.jpg      0.06  29077.jpg       8.16  29077.jpg
3)    0.19  157090.jpg     0.09  29005.jpg      12.23  29005.jpg
4)    0.21  197037.jpg     0.10  96035.jpg      12.64  29017.jpg
5)    0.21  81005.jpg      0.10  1033.jpg       13.82  20003.jpg
6)    0.21  29017.jpg      0.10  25013.jpg      14.52  53062.jpg
7)    0.22  197058.jpg     0.10  20003.jpg      14.70  29018.jpg
8)    0.22  77045.jpg      0.11  140075.jpg     14.78  29019.jpg

Are these 3 textures the same as the source?
Image recognition: finding a specific object, for example a person's face.
Methods used: SIFT, color interest points.
Points that describe this picture (SIFT)
Image editing with patch-matching algorithms: how to change an image using its existing data, and how to reconstruct an image.
Methods used: NNF (nearest-neighbor field), editing tools with constraints.
Object categorization and object class detection
In this section we will talk mostly about EMD, the Earth Mover's Distance, but there are also other methods for comparing histograms.
We chose to focus on EMD because it matches perceptual similarity for image retrieval better than the other methods.
In IP we discussed why EMD is better than the other methods, so I won't explain that part of the proof thoroughly.
Just a reminder:
Minkowski distance
Kullback-Leibler divergence
χ² statistics, with m_i = (h_i + k_i) / 2
Quadratic form distance
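As a reminder in code, here is a minimal sketch of these four measures for two histograms h and k with the same number of bins. The function names and the similarity matrix A are our own illustration, not taken from the lecture, and the Kullback-Leibler version assumes k is non-zero wherever h is non-zero.

```python
import math

def minkowski(h, k, r=2):
    """Minkowski-form distance L_r between two histograms."""
    return sum(abs(a - b) ** r for a, b in zip(h, k)) ** (1.0 / r)

def kullback_leibler(h, k):
    """Kullback-Leibler divergence; bins where h is zero contribute nothing."""
    return sum(a * math.log(a / b) for a, b in zip(h, k) if a > 0)

def chi_square(h, k):
    """Chi-square statistic with m_i = (h_i + k_i) / 2; empty bin pairs are skipped."""
    total = 0.0
    for a, b in zip(h, k):
        m = (a + b) / 2.0
        if m > 0:
            total += (a - m) ** 2 / m
    return total

def quadratic_form(h, k, A):
    """Quadratic-form distance sqrt((h-k)^T A (h-k)); A[i][j] is the similarity of bins i and j."""
    d = [a - b for a, b in zip(h, k)]
    n = len(d)
    return math.sqrt(sum(d[i] * A[i][j] * d[j] for i in range(n) for j in range(n)))

h = [0.5, 0.5, 0.0]
k = [0.25, 0.25, 0.5]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # no cross-bin similarity
print(minkowski(h, k, r=1))
print(kullback_leibler(h, k))
print(chi_square(h, k))
print(quadratic_form(h, k, identity))
```

With the identity matrix the quadratic form reduces to the L2 distance; only a matrix with non-zero off-diagonal entries lets it compare neighboring bins.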
Histogram vs Signature
Signatures are derived from histograms and are represented as
{s_j = (m_j, w_j)}
- m_j: the cluster mean, a d-dimensional vector in the space of the bins.
- w_j: the number of pixels that belong to that cluster.
- j: the index of the corresponding bin in the histogram.
The EMD methods described earlier for histograms can now be used on signatures.
A more intuitive explanation comes next.
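A minimal sketch of both ideas, under simplifying assumptions: the feature space is 1-D (one number per bin center), both signatures are non-empty and normalized to total weight 1, and the ground distance is the absolute difference between cluster means. In that special case EMD equals the area between the two cumulative distributions, so no transportation solver is needed; the general d-dimensional case needs a linear-programming solver. The function names are hypothetical.

```python
def histogram_to_signature(hist, centers):
    """Keep only the non-empty bins as (mean, weight) pairs."""
    return [(c, w) for c, w in zip(centers, hist) if w > 0]

def emd_1d(sig_a, sig_b):
    """EMD between two 1-D signatures after normalizing each to total weight 1."""
    total_a = sum(w for _, w in sig_a)
    total_b = sum(w for _, w in sig_b)
    # Mass of A enters with a plus sign, mass of B with a minus sign.
    events = sorted([(m, w / total_a) for m, w in sig_a] +
                    [(m, -w / total_b) for m, w in sig_b])
    work = 0.0
    cdf_diff = 0.0          # running difference between the two cumulative sums
    prev = events[0][0]
    for pos, delta in events:
        work += abs(cdf_diff) * (pos - prev)   # mass carried over this gap
        cdf_diff += delta
        prev = pos
    return work

centers = [0, 1, 2, 3]
a = histogram_to_signature([4, 0, 0, 0], centers)
near = histogram_to_signature([0, 4, 0, 0], centers)
far = histogram_to_signature([0, 0, 0, 4], centers)
print(emd_1d(a, near), emd_1d(a, far))
```

Moving all the mass three bins away costs three times as much as moving it one bin away, whereas a bin-by-bin measure would rate both pairs as equally different.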
Histogram of Image A
Histogram of Image B
Signature of Image A
Signature of Image B
The Experiment
Our database contains 20,000 images.
In our first experiment we identified 75 images of red cars; from this set we chose 10 “good” images, in which the background was green or grey.
We performed ten queries, using a different “good” car image each time.
For this experiment we used histograms with coarse binning and with fine binning.
Over the 20,000 images, coarse binning left us with 15.3 non-zero bins on average, and fine binning left us with 39 non-zero bins on average.
EMD outperformed the other methods, and the results with signatures are much better than with histograms.
Middle: coarse binning
Bottom: fine binning
The Experiment
In our second experiment the colors of the objects and of the background are quite similar. We took 157 images of brown horses in green fields; again 10 “good” images were chosen, and again we used coarse and fine histograms.
For coarse binning, EMD on signatures outperformed the others, but the Jeffrey divergence and the χ² statistics outperformed EMD on histograms.
(This can be explained by the distance being computed between bin centers that are farther apart, which makes it less meaningful.)
For fine binning, EMD on histograms outperformed the other histogram measures, but EMD on signatures outperformed all the rest.
Middle: coarse binning
Bottom: fine binning
Conclusion
EMD has desirable properties for image retrieval; compared to the other methods, it has advantages on all the criteria considered.
As we saw, signatures give better results than histograms in image retrieval.
Texture classification
We will focus on texture classification, mostly using Gabor texture features.
While color is a purely point-wise property, texture involves a notion of spatial extent: a single point has no texture.
If texture is defined in the frequency domain, the texture information is carried by a point together with its neighbors.
Short Background on Gabor Filtering
A Gabor filter is similar to a Fourier filter but is limited to certain frequency bands; Gabor filters do an excellent job in image compaction.
Gabor filters are defined by harmonic functions modulated by Gaussian distributions.
Fourier transform of a Gabor filter ->
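A minimal sketch of such a filter in the spatial domain, assuming the common real-valued form: a cosine carrier multiplied by a Gaussian envelope. The parameter names (wavelength, theta, sigma, gamma, psi) and the small bank built at the end are illustrative choices, not values from the lecture; the kernel size is assumed to be odd.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Real Gabor kernel: Gaussian envelope times a cosine wave along direction theta."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates so the wave runs along theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr * xr + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

# Hypothetical bank: 4 wavelengths (scales) times 6 orientations.
bank = [[gabor_kernel(9, wl, o * math.pi / 6, sigma=wl / 2.0) for o in range(6)]
        for wl in (2.0, 3.0, 4.0, 6.0)]
print(len(bank), len(bank[0]), len(bank[0][0]))
```

Convolving an image with every kernel of the bank and taking the magnitude of each response gives one energy value per scale and orientation.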
Texture representation
After applying Gabor filters to the image at different orientations and different scales, we obtain an array of magnitudes. These magnitudes describe the energy content of the image at each scale and orientation.
The main purpose of texture-based retrieval is to find images or regions with similar textures. Since this similarity is not rotation invariant, similar textures with a different direction may be missed by the retrieval or get a low rank.
An example is on the next page.
To solve this problem we suggest a simple circular shift.
The orientation with the highest total energy is called the dominant orientation; we then shift the features of each image so that its dominant orientation comes first, which lines up rotated versions of the same texture.
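The circular shift can be sketched as follows, assuming the feature is stored as a table energies[scale][orientation]. The function name and this layout are our assumptions.

```python
def align_to_dominant(energies):
    """Circularly shift the orientation axis so the dominant orientation comes first."""
    n_orient = len(energies[0])
    # Total energy of each orientation, summed over all scales.
    totals = [sum(row[o] for row in energies) for o in range(n_orient)]
    dominant = totals.index(max(totals))
    return [row[dominant:] + row[:dominant] for row in energies]

texture = [[1, 5, 2],
           [0, 4, 1]]          # 2 scales x 3 orientations
rotated = [[2, 1, 5],
           [1, 0, 4]]          # same texture, orientations shifted by one step
print(align_to_dominant(texture))
print(align_to_dominant(rotated))
```

After alignment the two feature tables are identical, so any histogram distance between them is zero even though the textures point in different directions.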
Results
Our database included 1000 images with different kinds of texture; it contained both natural images and texture images.
In the first retrieval experiment, all 15 similar textures were retrieved within the first 18 images, and only one image was irrelevant.
Results were also obtained on a color image database with 360 different images; there the same images were retrieved within the first 25 images.
How to apply EMD to textures
As we saw before, a texture can be represented with Gabor filters, here as 24 bins (4 scales × 6 orientations). Once the texture is represented this way, we can treat it as a histogram or a signature and use EMD and the other similarity methods on it.
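A minimal sketch of the 24-bin representation. The ground distance between bins is an assumption on our side: the scale difference plus the cyclic orientation difference, so that the first and the last orientation count as neighbors. Function names are hypothetical.

```python
def texture_histogram(energies):
    """Flatten a scales x orientations energy table into a normalized histogram."""
    flat = [e for row in energies for e in row]
    total = sum(flat)
    return [e / total for e in flat]

def ground_distance(bin_a, bin_b, n_orient=6):
    """Distance between two bins: scale step plus cyclic orientation step (assumed form)."""
    scale_a, orient_a = divmod(bin_a, n_orient)
    scale_b, orient_b = divmod(bin_b, n_orient)
    d_orient = abs(orient_a - orient_b)
    return abs(scale_a - scale_b) + min(d_orient, n_orient - d_orient)

energies = [[1.0] * 6 for _ in range(4)]   # 4 scales x 6 orientations, dummy values
hist = texture_histogram(energies)
print(len(hist))
print(ground_distance(0, 5))   # orientations 0 and 5 at the same scale
```

A ground distance like this is what lets EMD treat a small change of orientation as a small change of texture.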
Results
Database: 1744 texture patches.
Using EMD we can find partial matches between textures. The query was 20% texture and 80% “don't care”; the first 16 patches returned were of the same texture, followed by patches that partially contain the original texture.
origin
We created a database of 250 images that includes 25 zebra images, then cropped a block of zebra-stripe pattern and asked for images with at least 20% of that pattern; the best 8 matches are shown above.
We did the same with a block of cheetah pattern, asking for images with at least 10% of that pattern; the best 12 matches are shown above.
Conclusion
In this chapter we talked about Gabor texture retrieval, focusing mostly on rotation, but the approach can be combined with other methods: we saw that the EMD measure can use Gabor features for texture retrieval. Textures are usually homogeneous and correspond to different parts of an image, so texture-based image retrieval is very useful.
Reminder
Patch matching can be done with SIFT algorithms or with histogram distances (as described earlier).
Image recognition
SIFT: scale-invariant feature transform.
What is SIFT? SIFT is an algorithm used to detect and describe local features in images. We will mostly talk about the Harris corner detector.
Corners have long been considered useful interest points and have therefore been used in many different algorithms. Color is also of great importance for matching images. Even in the RGB color cube, most interest points are found using intensity alone (luminance, the light returned from bright objects), which is useful for studio photography or artificial images.
However, in natural images strong contrast changes may occur, and those changes are not that noticeable with an intensity-based approach.
Harris corner detector reminder
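Intuitive:
Harris corner detection uses the image derivatives along the X and Y axes, obtained by convolving with (-1,0,1) and (-1,0,1)^T; the products of these derivatives form the second-moment matrix M.
More formal explanation for different corners, with λ1 and λ2 the eigenvalues of M:
det(M) = λ1 λ2
trace(M) = λ1 + λ2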
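A minimal sketch of the corner measure built from these two quantities, R = det(M) - k * trace(M)^2, on a grayscale image stored as a list of rows. The 3x3 unweighted window and k = 0.04 are common choices that we assume here; they are not stated in the lecture.

```python
def harris_response(img, k=0.04):
    """Harris cornerness R = det(M) - k * trace(M)**2 for every interior pixel."""
    h, w = len(img), len(img[0])
    # First derivatives with the (-1, 0, 1) kernel; border pixels stay zero.
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = img[y][x + 1] - img[y][x - 1]
            iy[y][x] = img[y + 1][x] - img[y - 1][x]
    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            a = b = c = 0.0   # entries of M = [[a, b], [b, c]] summed over a 3x3 window
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    a += gx * gx
                    b += gx * gy
                    c += gy * gy
            det = a * c - b * b
            trace = a + c
            resp[y][x] = det - k * trace * trace
    return resp

# 8x8 test image: a bright square in the bottom-right quadrant.
img = [[1.0 if (y >= 4 and x >= 4) else 0.0 for x in range(8)] for y in range(8)]
resp = harris_response(img)
print(resp[4][4], resp[6][4], resp[1][1])   # corner, edge, flat region
```

Both eigenvalues large gives a positive response (corner), one large eigenvalue gives a negative response (edge), and two small eigenvalues give a response near zero (flat region).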
RGB
normalized RGB
With RGB, the corners are spread all over the image and are not concentrated in a specific area.
With normalized RGB, the corners are found around the silhouette of the parrot, but the detection is unstable in dark areas, as can be seen at the bottom of the image.
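The instability in dark areas follows from the normalization itself. A minimal sketch, assuming the standard definition r = R / (R + G + B) and likewise for g and b; the neutral fallback for black pixels is our own choice.

```python
def normalized_rgb(r, g, b, eps=1e-6):
    """Chromaticity coordinates: each channel divided by the channel sum."""
    s = r + g + b
    if s < eps:
        return (1.0 / 3, 1.0 / 3, 1.0 / 3)   # assumption: treat black as neutral
    return (r / s, g / s, b / s)

# Same surface under bright and dim light: the chromaticity is unchanged.
print(normalized_rgb(200, 100, 100), normalized_rgb(100, 50, 50))

# Dark pixels: one grey level of noise changes the chromaticity completely.
print(normalized_rgb(2, 1, 1), normalized_rgb(1, 1, 2))
```

Scaling all three channels by the same factor cancels out, which removes intensity changes, but when the sum is small a tiny amount of noise dominates the ratio.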
Harris detector with other color spaces
Quasi-invariant colors: derived from RGB color with special equations (HSI, OCS, spherical color space)
Can we improve corner detection?
Harris detector with scale invariance
As we will see, scale has a huge impact on corner detection, so we will use a “fixed scale” to improve the results.
There are some drawbacks with images that are too large or too small.
We will use a function to set the “fixed scale”.
E: cornerness measure for each pixel (part of the Harris algorithm)
M: second-moment matrix
Convolution
t: the amount of scale change
The optimal rescaling factor for images, determined by experiments with the Harris detector, is 1.2 < t < √2.
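One way to read the factor t, as an assumption on our side: successive scales form a geometric series sigma_n = sigma_0 * t^n, and the detector is run once per level. The function name and the starting scale are hypothetical.

```python
import math

def scale_levels(sigma0, t, n_levels):
    """Geometric series of scales: sigma_n = sigma0 * t**n."""
    return [sigma0 * t ** n for n in range(n_levels)]

t = 1.3                              # assumed value inside the range 1.2 < t < sqrt(2)
print(1.2 < t < math.sqrt(2))
for n, sigma in enumerate(scale_levels(1.0, t, 5)):
    print(n, round(sigma, 3))
```

A factor close to 1.2 samples the scales densely at a higher cost; a factor close to √2 is cheaper but coarser.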
Now we can see the changes in the Harris detector output after “fixing the scale”: the parrot is highly prioritized.
Some more improvement is coming -> next
Colored scale-invariant Harris corner detection
Now let us add color information to the scale decision.
We will build a function from the 3-dimensional color data to a 1-dimensional data set and combine it with the scale-invariant function we already know. Combining this information gives a different definition of interest points.
Working in a quasi-invariant color space, the interest points are now free of shading, illumination or specular changes, so the lighting conditions don't affect the result.
Natural, cluttered images of animals have varying lighting conditions, and this method overcomes them, as we will see now.
The background is structured, with strong illumination changes, and quasi-invariant HSI found exactly the scorpion.
Image retrieval
For the retrieval experiment we take 1000 images; for every image, 18 images are taken with different rotations, so the result is a database of 18,000 images.
As can be seen, the quasi-invariant color approach outperformed the others.
Conclusion
We saw that the methods explained above are much better than luminance-based methods.
Color-based scale selection leads to better stability; it can also be carried over to various color spaces, so we can take advantage of their different color properties.
In retrieval scenarios our approach was much more stable, which leads to higher retrieval rates.
Image editing and reconstruction
What is image editing?
As digital and computational photography have matured, researchers have developed methods for high-level editing.
We can now resize an image while keeping a good likeness of the original; we can also erase an unwanted portion of an image and let automatic image completion fill in the missing data.
Image reshuffling algorithms allow us to take a portion of an image and move it around so that the remainder still resembles the original image.
These algorithms depend on user intervention to obtain the best results, because the user knows what he expects from the modified image.
NNF: nearest-neighbor field
The algorithm finds, for every patch in image A, the most similar patch in image B.
To be efficient, our algorithm relies on 3 key observations:
- Dimensionality of the offset space: the algorithm searches the 2D space of possible patch offsets, achieving greater speed and efficiency than a standard kd-tree search.
- Natural structure of images: searching independently for each pixel ignores the natural structure in images; searching adjacent pixels interdependently improves efficiency.
- Law of large numbers: a single random choice of patch is likely to be a bad guess, but the larger the field of patches, the better the chance that some of them get a correct offset.
Example of PatchMatch
A good estimate of the match is enough; it doesn't need to be perfect.
Phases of the Algorithm
Propagation: searches for good matches among the offsets of the neighboring patches.
The blue patch (b) checks the offsets of its neighbors above (red) and to the left (green), and then (c) searches in neighborhoods of a certain radius.
The Algorithm
The outcome of the PatchMatch algorithm is an offset map: a 2D field of 2D vectors with the same dimensions as the source image. Each vector stores the location of the best match currently known.
1. Initialization: random, except in areas where we have initial information, called constraints; we talk about them later.
2. Propagation step
3. Random search
Steps 2 and 3 are executed consecutively for each pixel.
Propagation step: the natural correlation between neighbors is exploited. Assume the red patch is the best match of a neighbor; when we move one pixel over to our target, the black patch, we can try that same offset as our guess, and there is a good chance that it will also be a match. The same is done with the neighbor below the current patch. We take the best match of these 3 patches (using the patch-matching distance), so in this example we are propagating from the bottom left.
If we stopped -> go to the search step -> next
Search step: we take a random unit vector and scale it by a decreasing radius; once the radius falls below a certain threshold we stop. The random vector is added to the current best offset and the patches are compared. If the new candidate is better, it takes over as the best offset from that moment; either way, the search step is repeated with a different random vector.
The algorithm is then applied multiple times.
The top image is reconstructed using patches from the bottom image; after 5 iterations the image is complete.
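The three steps can be sketched on small grayscale images stored as lists of rows. This is a simplified illustration under our own assumptions, not the authors' implementation: the field stores absolute match positions instead of offsets, the patch distance is a sum of squared differences, and the random search samples a square window whose radius is halved each time.

```python
import random

def patch_dist(a, b, ax, ay, bx, by, p):
    """Sum of squared differences between the p x p patches at (ax, ay) in a and (bx, by) in b."""
    return sum((a[ay + j][ax + i] - b[by + j][bx + i]) ** 2
               for j in range(p) for i in range(p))

def patchmatch(a, b, p=3, iterations=5, seed=0):
    rng = random.Random(seed)
    ah, aw = len(a) - p + 1, len(a[0]) - p + 1   # valid patch origins in a
    bh, bw = len(b) - p + 1, len(b[0]) - p + 1   # valid patch origins in b
    # 1. Random initialization: nnf[y][x] is the matched patch origin (bx, by) in b.
    nnf = [[(rng.randrange(bw), rng.randrange(bh)) for _ in range(aw)] for _ in range(ah)]
    cost = [[patch_dist(a, b, x, y, nnf[y][x][0], nnf[y][x][1], p)
             for x in range(aw)] for y in range(ah)]

    def try_candidate(x, y, bx, by):
        # Accept the candidate only if it is inside b and strictly better.
        if 0 <= bx < bw and 0 <= by < bh:
            d = patch_dist(a, b, x, y, bx, by, p)
            if d < cost[y][x]:
                nnf[y][x], cost[y][x] = (bx, by), d

    for it in range(iterations):
        # Alternate the scan direction so good matches spread both ways.
        step = 1 if it % 2 == 0 else -1
        ys = range(ah) if step == 1 else range(ah - 1, -1, -1)
        xs = range(aw) if step == 1 else range(aw - 1, -1, -1)
        for y in ys:
            for x in xs:
                # 2. Propagation: reuse the offsets of the already visited neighbors.
                if 0 <= x - step < aw:
                    nx, ny = nnf[y][x - step]
                    try_candidate(x, y, nx + step, ny)
                if 0 <= y - step < ah:
                    nx, ny = nnf[y - step][x]
                    try_candidate(x, y, nx, ny + step)
                # 3. Random search around the current best, with a shrinking radius.
                radius = max(bw, bh)
                while radius >= 1:
                    cx, cy = nnf[y][x]
                    try_candidate(x, y,
                                  cx + rng.randint(-radius, radius),
                                  cy + rng.randint(-radius, radius))
                    radius //= 2
    return nnf, cost

img = [[(x * 7 + y * 13) % 17 for x in range(8)] for y in range(8)]
_, cost_init = patchmatch(img, img, iterations=0)
_, cost_final = patchmatch(img, img, iterations=5)
print(sum(map(sum, cost_init)), sum(map(sum, cost_final)))
```

Because a candidate is accepted only when it lowers the patch distance, the total error can never increase from one iteration to the next.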
Real world implementation
Our algorithm is much faster and uses much less memory than a kd-tree.
For a 7x7 patch size we found our algorithm to be 20x to 100x faster, using about 20x less memory.
For smaller patches we obtain smaller speedups.
We also made a GPU implementation of NNF (on an 8800 GTS video card) that is 7x faster than the CPU implementation.
Editing tools
Now we will talk about novel interactive editing tools enabled by our algorithm.
By modifying the search in various ways we can introduce local constraints on the offsets, giving the user control over the synthesis process.
We will mostly focus on:
Video …..
http://gfx.cs.princeton.edu/pubs/Barnes_2009_PAR/
Search Space Constraints
Completing large regions of an image is a challenging task, because the boundaries of the missing region provide few or no constraints.
In our work we adopted a user-interaction approach, allowing the user to draw curves across the missing region. Our algorithm synthesizes the curves and the texture simultaneously, in one unified framework.
The user provides a completion region and a mask.
Deformation constraints
Many recent retargeting methods allow the user to mark important regions. One important cue that has been overlooked is lines: objects with straight edges are very common in images (buildings, roads, trucks), and keeping those lines straight is important.
In our algorithm we overcome this problem by constraining the domain of possible nearest-neighbor locations in the output.
We impose these constraints with “gradual scaling”: deformations appear gradually because of the lack of space, and we are able to correct them.
Another example: scaling
Hard constraints (reshuffling)
The user wants to keep a region of the image as a hard constraint, without changing it during the process. We achieve that by fixing the NN field at the relevant region points: after each iteration we simply correct the offsets to the output position, so the other objects gradually rearrange to align with these constrained regions.
a
b
c
The object was moved and the background (b) was “reshuffled”, completing all the missing data.
(c): patch transform.
Summary
We saw different methods for image retrieval; some of them were feature based and some were region based. Every method has its advantages and disadvantages, and which one to use depends on the implementation.
We also saw a new editing-tool algorithm that shows what can be done using those image retrieval methods.
http://gfx.cs.princeton.edu/pubs/Barnes_2009_PAR/
http://docs.opencv.org/doc/tutorials/features2d/trackingmotion/harris_detector/harris_detector.html
Content-based Image Retrieval Using Gabor Texture Features
IEEE Transactions PAMI 2000
http://www.gscit.monash.edu.au/~dengs/resource/papers/pcm00.pdf
B. S. Manjunath and W. Y. Ma. “Texture features for browsing and retrieval of large image data”
IEEE Transactions on Pattern Analysis and Machine Intelligence (Special Issue on Digital Libraries), Vol. 18 (8), August 1996, pp. 837-842.
http://jamf.eu/jamf/export/2097/trunk/doc/papers/96PAMITrans.pdf
Y. Rubner and C. Tomasi and L. J. Guibas.
The Earth Mover's Distance as a Metric for Image Retrieval.
International Journal of Computer Vision, 40(2) November 2000, pages 99--121.
http://www.cs.duke.edu/~tomasi/papers/rubner/rubnerIjcv00.pdf
Colour Interest Points for Image Retrieval
Julian Stottinger, Nicu Sebe, Theo Gevers, and Allan Hanbury
Computer Vision Winter Workshop 2007.
http://oldwww.prip.tuwien.ac.at/people/julian/publications1/data/Stoettinger_et_al_CVWW07.pdf
J. Huang and R. Zabih. "Combining Color and Spatial Information for Content-based Image Retrieval".
http://www.cs.cornell.edu/rdz/Papers/ecdl2/spatial.htm

Patch based:
Learning Image Patch Similarity
(Chapter in book - very detailed; explains bg point matching patch, features.)
http://ttic.uchicago.edu/~gregory/thesis/thesisChapter6.pdf
PatchMatch: A Randomized Correspondence Algorithm for Structural Image Editing
Connelly Barnes, Eli Shechtman, Adam Finkelstein, Dan B Goldman
ftp://194.153.101.105/Faculty/arik/Seminar2009/papers/patchMatch.pdf