GEMINI (GEneric Multimedia IndexIng) Approach
To illustrate the basic idea of the GEMINI approach, we focus on ‘whole match’
queries. For such queries, the problem can be characterised as follows:
We have a collection of $N$ objects, $O_1, O_2, \ldots, O_N$.
The distance/dissimilarity between two objects $(O_i, O_j)$ is given by the function
$D(O_i, O_j)$, which can be implemented as a (possibly slow) program.
The user specifies a query object $Q$ and a tolerance $\varepsilon$.
Our goal is to find the objects in the collection that are within distance $\varepsilon$ of the query
object. An obvious solution is to apply sequential scanning: for each object
$O_i$ ($1 \le i \le N$), we can compute its distance from $Q$ and report the objects with
distance $D(O_i, Q) \le \varepsilon$.
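The sequential scan can be sketched in a few lines. This is a minimal illustration only: the function name `sequential_scan`, the use of Euclidean distance as the object distance, and the sample points are assumptions made for the example, not part of the approach itself.

```python
import math
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def sequential_scan(
    collection: Sequence[T],
    query: T,
    distance: Callable[[T, T], float],
    eps: float,
) -> List[int]:
    """Return the indices i of all objects with D(O_i, Q) <= eps.

    Every object is compared with the query, so the cost is N calls
    to the (possibly slow) distance function.
    """
    return [i for i, obj in enumerate(collection) if distance(obj, query) <= eps]


# Assumed toy data: 2-D points compared with Euclidean distance.
collection = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
hits = sequential_scan(collection, (0.0, 0.0), math.dist, eps=2.0)
print(hits)  # [0, 2]
```

The two costs named in the text are both visible here: the loop runs N times, and each iteration pays for one full distance computation.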
However, sequential scanning may be slow for two reasons:
1. The distance computation may be expensive.
2. The database size N might be very large.
GEMINI aims to provide a faster alternative, and is based on two ideas:
1. A quick-and-dirty test, to quickly discard the vast majority of non-qualifying
objects (possibly permitting some false positives).
2. The use of spatial access methods to achieve faster-than-sequential searching.
The reason for the distance computation being expensive is that multimedia objects
can have a very large dimensionality. The quick-and-dirty test is designed to reduce
this dimensionality to more manageable proportions, often to only one or two
dimensions. Effectively, each multimedia object is projected onto a lower-dimensional
space by extracting some important features. The distance between the
query and collection objects is measured in this lower-dimensional space, with little
computational effort. Any object whose distance from the query object exceeds $\varepsilon$ is
disqualified from further consideration. Finally, each collection object that was not
disqualified is fully compared to the query object in the original high-dimensional space.
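The filter-then-refine idea can be illustrated as follows. This is a sketch under stated assumptions: objects are taken to be equal-length numeric vectors compared with Euclidean distance, and the one-dimensional feature is the sum of the coordinates divided by sqrt(n). That feature is our own choice for the example; it is the projection onto the unit vector (1, ..., 1)/sqrt(n), so feature distances never exceed the Euclidean distance. All names (`feature`, `filter_and_refine`) are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def full_distance(a: Vector, b: Vector) -> float:
    """The 'expensive' distance in the original space (assumed Euclidean)."""
    return math.dist(a, b)


def feature(x: Vector) -> float:
    """Assumed 1-D feature: projection onto the unit vector (1,...,1)/sqrt(n)."""
    return sum(x) / math.sqrt(len(x))


def filter_and_refine(
    collection: Sequence[Vector], query: Vector, eps: float
) -> Tuple[List[int], List[int]]:
    """Return (candidates after the quick test, true hits after the full test)."""
    fq = feature(query)
    # Quick-and-dirty test in feature space. In practice the features of the
    # collection would be computed once and stored, not recomputed per query.
    candidates = [
        i for i, obj in enumerate(collection) if abs(feature(obj) - fq) <= eps
    ]
    # Full comparison only for the objects that were not disqualified.
    hits = [i for i in candidates if full_distance(collection[i], query) <= eps]
    return candidates, hits


collection = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 1.5, 1.0, 0.5],
    [3.0, -1.0, 3.0, -1.0],  # same feature value as the query, but far away
    [9.0, 9.0, 9.0, 9.0],
]
query = [1.0, 1.0, 1.0, 1.0]
candidates, hits = filter_and_refine(collection, query, eps=1.0)
print(candidates)  # [0, 1, 2]
print(hits)        # [0, 1]
```

Object 2 is a false positive: it survives the quick test and is removed only by the full comparison. Object 3 is discarded without its full distance ever being computed.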
Spatial access methods involve segmenting the multimedia objects into smaller,
logically-cohesive components, and storing these in a data structure such that the
original object can easily be reconstructed. For example, a video clip could be divided
into scenes, while a still image could be stored as a fixed number of overlapping
rectangular segments. When evaluating a query, objects can be compared on a
component-by-component basis, possibly with short-circuiting (if the first c components
are dissimilar, then terminate the comparison) [1].
[1] For the time being, we shall ignore this spatial access aspect, as it has more to do with finding an
adequate object decomposition and associated data structure than with the retrieval problem.
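Although the decomposition itself is set aside here, the short-circuit comparison is easy to sketch. The code below is one possible reading of the rule, not a prescribed implementation: objects are assumed to be already split into comparable components, `component_distance` and `threshold` are hypothetical parameters, and the example represents each image segment by a single brightness value.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

C = TypeVar("C")


def compare_components(
    a: Sequence[C],
    b: Sequence[C],
    component_distance: Callable[[C, C], float],
    threshold: float,
    c: int,
) -> Optional[List[float]]:
    """Compare two objects component by component.

    Returns the list of per-component distances, or None if the first c
    components are all dissimilar (distance > threshold), in which case
    the remaining components are never compared.
    """
    distances: List[float] = []
    for x, y in zip(a, b):
        distances.append(component_distance(x, y))
        if len(distances) == c and all(d > threshold for d in distances):
            return None  # short-circuit: give up on this object
    return distances


evaluated = []  # records which component pairs were actually compared


def brightness_gap(x: float, y: float) -> float:
    evaluated.append((x, y))
    return abs(x - y)


image_a = [10.0, 20.0, 30.0, 40.0]
image_b = [50.0, 60.0, 30.0, 40.0]  # first two segments differ strongly
image_c = [10.0, 60.0, 31.0, 40.0]  # first segment matches

abandoned = compare_components(image_a, image_b, brightness_gap, 5.0, c=2)
pairs_before_abandon = len(evaluated)
completed = compare_components(image_a, image_c, brightness_gap, 5.0, c=2)
print(abandoned, pairs_before_abandon)  # None 2
print(completed)                        # [0.0, 40.0, 1.0, 0.0]
```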
The projection from high-dimensional to low-dimensional space should be contractive,
i.e. it should place objects either the same distance apart or closer together than they
were in the high-dimensional space. That way, there will be no false dismissals, though
there may be false positives (which are allowed).
Formally, let $F(\cdot)$ be the projection of objects to $f$-dimensional space, that is, $F(O)$ is
the $f$-dimensional point that corresponds to object $O$. Let $D_{Feature}(F(O_i), F(O_j))$ be the
distance between the projections of objects $O_i, O_j$. Then, for $F(\cdot)$ to be a valid
projection, we must prove that $D_{Feature}(F(O_i), F(O_j)) \le D(O_i, O_j)$.
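The reason this inequality rules out false dismissals can be written out in one line. In the short derivation below, $\varepsilon$ denotes the tolerance, $F$ the projection, and $O$ any object that truly qualifies for the query $Q$.

```latex
\begin{align*}
D(O, Q) \le \varepsilon
  \quad\Longrightarrow\quad
  D_{Feature}(F(O), F(Q)) \;\le\; D(O, Q) \;\le\; \varepsilon .
\end{align*}
```

Every qualifying object therefore also passes the test in feature space. The converse does not hold: $D_{Feature}(F(O), F(Q)) \le \varepsilon$ says nothing about $D(O, Q)$, which is where the false positives come from.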
If we can find a suitable feature-extracting projection that satisfies the above
requirement, then the MMIR problem can be expressed as two algorithms:
Algorithm 1 (GEMINI):
1. Define the distance $D(O_i, O_j)$ between two objects of the media at hand.
2. Find a feature-extraction function $F(\cdot)$ that reduces the dimensionality [2].
3. Prove that $D_{Feature}(F(O_i), F(O_j)) \le D(O_i, O_j)$, to guarantee correctness.
4. Use an appropriate data structure to store and retrieve the f-dimensional features
of each collection object.
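Step 3 asks for a proof, and a brute-force check cannot replace one. It can, however, expose a feature function that is not contractive before effort is spent on a proof. The sketch below tests every pair in a small sample and reports the first violation of the inequality. The helper `find_counterexample`, the two candidate feature functions, and the sample vectors are all hypothetical; Euclidean distance is assumed in both spaces.

```python
import math
from itertools import combinations
from typing import Callable, Optional, Sequence, Tuple

Vector = Sequence[float]


def find_counterexample(
    feature: Callable[[Vector], Vector],
    sample: Sequence[Vector],
    tol: float = 1e-9,
) -> Optional[Tuple[Vector, Vector]]:
    """Return the first pair with D_Feature(F(a), F(b)) > D(a, b), else None.

    A result of None is NOT a proof of contractiveness; it only means that
    no violation occurs within this particular sample.
    """
    for a, b in combinations(sample, 2):
        if math.dist(feature(a), feature(b)) > math.dist(a, b) + tol:
            return a, b
    return None


def scaled_sum(x: Vector) -> Vector:
    """Projection onto (1,...,1)/sqrt(n): contractive for Euclidean distance."""
    return (sum(x) / math.sqrt(len(x)),)


def plain_sum(x: Vector) -> Vector:
    """Unscaled sum: can stretch distances by up to sqrt(n), so not contractive."""
    return (sum(x),)


sample = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [1.0, -1.0, 1.0, -1.0],
    [2.0, 0.0, 0.0, 0.0],
]
good = find_counterexample(scaled_sum, sample)
bad = find_counterexample(plain_sum, sample)
print(good)  # None
print(bad)   # ([0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0])
```

For the reported pair, the unscaled sum places the objects 4 apart in feature space although they are only 2 apart in the original space, so a query with a tolerance between 2 and 4 would wrongly dismiss a qualifying object.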
Algorithm 2 (Search):
1. Map the query object $Q$ into a point $F(Q)$.
2. Using a spatial access method, retrieve all points within distance $\varepsilon$ of $F(Q)$.
3. Retrieve the corresponding objects, compute their actual distance from Q and
discard the false positives.
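The search steps above can be sketched end to end. The sketch makes several assumptions that are ours, not part of the approach: objects are numeric vectors under Euclidean distance, the feature is one-dimensional (sum of coordinates divided by sqrt(n)), and a sorted list queried with `bisect` stands in for a real spatial access method, which would be needed for features of two or more dimensions. The class name `FeatureIndex` is hypothetical.

```python
import bisect
import math
from typing import List, Sequence

Vector = Sequence[float]


def full_distance(a: Vector, b: Vector) -> float:
    return math.dist(a, b)


def feature(x: Vector) -> float:
    # Assumed contractive 1-D feature: projection onto (1,...,1)/sqrt(n).
    return sum(x) / math.sqrt(len(x))


class FeatureIndex:
    """Sorted (feature value, object id) pairs: a 1-D stand-in for a
    spatial access method."""

    def __init__(self, collection: Sequence[Vector]) -> None:
        self.collection = list(collection)
        self.entries = sorted(
            (feature(obj), i) for i, obj in enumerate(self.collection)
        )
        self.keys = [key for key, _ in self.entries]

    def range_query(self, center: float, eps: float) -> List[int]:
        """Ids of all objects whose feature lies in [center - eps, center + eps]."""
        lo = bisect.bisect_left(self.keys, center - eps)
        hi = bisect.bisect_right(self.keys, center + eps)
        return [i for _, i in self.entries[lo:hi]]

    def search(self, query: Vector, eps: float) -> List[int]:
        fq = feature(query)                      # step 1: map Q to F(Q)
        candidates = self.range_query(fq, eps)   # step 2: range query
        return sorted(                           # step 3: drop false positives
            i
            for i in candidates
            if full_distance(self.collection[i], query) <= eps
        )


index = FeatureIndex(
    [
        [1.0, 1.0, 1.0, 1.0],
        [1.0, 1.5, 1.0, 0.5],
        [3.0, -1.0, 3.0, -1.0],
        [9.0, 9.0, 9.0, 9.0],
    ]
)
query = [1.0, 1.0, 1.0, 1.0]
candidate_ids = index.range_query(feature(query), 1.0)
result = index.search(query, eps=1.0)
print(candidate_ids)  # [0, 1, 2]
print(result)         # [0, 1]
```

The range query touches only the slice of the sorted list around F(Q), so the number of full distance computations depends on how many objects fall in that slice rather than on N.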
[2] This step will often require the assistance of a domain expert, who will have the insight to choose an
appropriate contractive feature-extraction function.