Outline

Storage of images for Efficient Retrieval

Representing IDB as relations


Representing IDB with spatial data structures


represent the image as rectangles
Representing IDB using image transformations


straightforward
can be represented by a vector of length k (k is between 100 and 200)
However, different photographs of the same person may vary, depending on a variety of factors
- time of the day at which the photographs were taken
- the lighting conditions
- the camera used
- the exact position of the subject’s head and facial expression
- etc.

Representing IDB with Relations

(1) Create a relation called images having the scheme:

(Image, ObjID, XLB, XUB, YLB, YUB)

where Image is the name of the image file and (XLB, XUB, YLB, YUB) represents the rectangle in question.

Suppose R is a rectangle specified by (XLB, XUB, YLB, YUB) and R is in Rec(I); then there exists a tuple in images:

(I, newid, XLB, XUB, YLB, YUB)
(2) For each property p, create a relation Rp having the scheme:

(Image, XLB, XUB, YLB, YUB, Value)

where Image is the image file and (XLB, XUB, YLB, YUB) denotes a rectangular cell in the image.

Properties can be
- Pixel level properties (RGB values)
- Object/region level properties (Name, age)
- Image level properties (when image was captured, where, by whom)
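
The two schemes above can be tried out directly in a relational engine. The following is a minimal sketch using Python's built-in sqlite3 module, not a prescribed implementation. The table name R_name (standing for Rp with p = Name), the coordinates, and the helper functions objects_at and name_of are illustrative assumptions that do not come from the slides.

```python
import sqlite3

# In-memory database; column names follow the two schemes on the slide.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE images (Image TEXT, ObjID TEXT,
                     XLB INTEGER, XUB INTEGER, YLB INTEGER, YUB INTEGER);
-- R_name is a hypothetical instance of Rp for the property p = Name.
CREATE TABLE R_name (Image TEXT,
                     XLB INTEGER, XUB INTEGER, YLB INTEGER, YUB INTEGER,
                     Value TEXT);
""")

# Made-up rectangles for two objects in pic1.gif.
conn.executemany("INSERT INTO images VALUES (?, ?, ?, ?, ?, ?)", [
    ("pic1.gif", "o1", 10, 60, 20, 90),
    ("pic1.gif", "o2", 80, 120, 15, 70),
])
conn.execute("INSERT INTO R_name VALUES (?, ?, ?, ?, ?, ?)",
             ("pic1.gif", 10, 60, 20, 90, "Jim Hatch"))


def objects_at(image, x, y):
    """Return the ObjIDs whose rectangle in `image` contains the point (x, y)."""
    rows = conn.execute(
        "SELECT ObjID FROM images "
        "WHERE Image = ? AND XLB <= ? AND ? <= XUB AND YLB <= ? AND ? <= YUB "
        "ORDER BY ObjID",
        (image, x, x, y, y),
    ).fetchall()
    return [row[0] for row in rows]


def name_of(image, obj_id):
    """Look up the Name property by joining on the shared rectangle."""
    row = conn.execute(
        "SELECT n.Value FROM images i JOIN R_name n "
        "ON n.Image = i.Image AND n.XLB = i.XLB AND n.XUB = i.XUB "
        "AND n.YLB = i.YLB AND n.YUB = i.YUB "
        "WHERE i.Image = ? AND i.ObjID = ?",
        (image, obj_id),
    ).fetchone()
    return row[0] if row else None


print(objects_at("pic1.gif", 30, 50))
print(name_of("pic1.gif", "o1"))
```

Because a property relation carries no ObjID, the rectangle coordinates act as the join key between images and each Rp in this sketch.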

Querying IDB Relations






- Eliciting the contents of an image is done using image processing algorithms
- Image processing algorithms are only partially accurate
- This implies that tuples placed in an IDB relation by an image processing program have certain associated probabilistic attributes
  - The probability that John Lee is the name of o2 is 0.75
  - The probability that Ken Yip is the name of o2 is 0.15
  - There is a 10% missing probability

ObjId  Name       Prob
o1     Jim Hatch  0.8
o2     John Lee   0.75
o2     Ken Yip    0.15
o3     John Lee   1
Complex Queries

Query: What is the probability that pic1.gif contains both Jim Hatch and Ken Yip? (assume o1 and o2 are in pic1.gif)

- Is it 0.8 * 0.15 = 0.12?
- In general, the answer is NO: the product is correct only if the two identifications are independent, and the relation does not say that they are

ObjId  Name       Prob
o1     Jim Hatch  0.8
o2     John Lee   0.75
o2     Ken Yip    0.15
o3     John Lee   1
Complex Queries


Consider pic8.gif with o10 and o11

ObjId  Name       Prob
o10    Ken Yip    0.5
o10    Jim Hatch  0.4
o11    Jim Hatch  0.8
o11    John Lee   0.1

Four possibilities
- possibility 1: o10 is Ken Yip and o11 is Jim Hatch
- possibility 2: o10 is Ken Yip and o11 is not Jim Hatch
- possibility 3: o10 is not Ken Yip but o11 is Jim Hatch
- possibility 4: o10 is not Ken Yip and o11 is not Jim Hatch

Suppose p_i denotes the probability of possibility i
- p1 + p2 = 0.5
- p3 + p4 = 0.5
- p1 + p3 = 0.8
- p2 + p4 = 0.2
- p1 + p2 + p3 + p4 = 1
Interval Probability


- Need to solve these linear equations to determine the probability that pic8.gif contains both Ken Yip and Jim Hatch
- The result is an interval of probabilities
  - p1 between 0.3 and 0.5
- Requires the use of interval probabilities
- Interval probabilities allow us to represent the margin of error of image processing algorithms in identifying the object
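
The interval for p1 quoted above can be checked by hand. From p1 + p2 = a, p1 + p3 = b, and p1 + p2 + p3 + p4 = 1 with every p_i non-negative, it follows that max(0, a + b - 1) <= p1 <= min(a, b). The sketch below applies that closed form instead of a general linear-programming solver; the function name joint_bounds is an assumption made for illustration, and exact fractions are used only to avoid floating-point noise.

```python
from fractions import Fraction


def joint_bounds(a, b):
    """Range of p1 = P(A and B) when only P(A) = a and P(B) = b are known.

    p2 = a - p1 >= 0 and p3 = b - p1 >= 0 give the upper bound min(a, b);
    p4 = 1 - a - b + p1 >= 0 gives the lower bound a + b - 1 (or 0 if negative).
    """
    lower = max(Fraction(0), a + b - 1)
    upper = min(a, b)
    return lower, upper


# pic8.gif: P(o10 is Ken Yip) = 0.5 and P(o11 is Jim Hatch) = 0.8
low, high = joint_bounds(Fraction("0.5"), Fraction("0.8"))
print(float(low), float(high))  # 0.3 0.5
```

Applied to the pic1.gif query (0.8 and 0.15), the same function gives the range 0 to 0.15, so the product 0.12 is only one of many consistent answers.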
For example, with a 3% margin of error on each identification:

ObjId  Name       Prob(l)  Prob(u)
o1     Jim Hatch  0.77     0.83
o2     John Lee   0.72     0.78
o2     Ken Yip    0.12     0.18
o3     John Lee   0.97     1
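
The lower and upper columns in the 3% example are consistent with widening each point probability by the error margin and clamping the result to the range 0 to 1. The sketch below assumes exactly that rule; the function name to_interval is hypothetical, and the slides do not state how the bounds were actually produced.

```python
from fractions import Fraction


def to_interval(prob, error):
    """Widen a point probability by +/- error, clamped to [0, 1]."""
    lower = max(Fraction(0), prob - error)
    upper = min(Fraction(1), prob + error)
    return lower, upper


# Point probabilities from the original relation.
rows = [
    ("o1", "Jim Hatch", "0.8"),
    ("o2", "John Lee", "0.75"),
    ("o2", "Ken Yip", "0.15"),
    ("o3", "John Lee", "1"),
]

for obj_id, name, prob in rows:
    low, high = to_interval(Fraction(prob), Fraction("0.03"))
    print(obj_id, name, float(low), float(high))
```

The clamp is what keeps the upper bound for o3 at 1 rather than 1.03.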
Representing image DBs with R-Trees


- Create a relation with two attributes (ImageId, ObjId)
- Create an R-tree that stores rectangles
  - if the same rectangle appears in two images, then we have an overflow list
- Each rectangle has an associated set of fields that specifies the object/region level properties
- Not good for nearest neighbor queries as the tree is constructed with only two dimensions out of the n+2 dimensions of the image
- Generalized R-Trees use all the n+2 dimensions to construct the tree
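
As a rough illustration of the pieces involved, the sketch below shows the two geometric primitives an R-tree relies on (a rectangle overlap test and a minimum bounding rectangle) together with one possible reading of the overflow list: a single rectangle key holding several (ImageId, ObjId) entries. It is not an R-tree. The window query is a plain linear scan standing in for tree traversal, and all names and coordinates are made up for the example.

```python
from collections import defaultdict


def overlaps(r, s):
    """True if rectangles r and s, each (XLB, XUB, YLB, YUB), share a point."""
    return r[0] <= s[1] and s[0] <= r[1] and r[2] <= s[3] and s[2] <= r[3]


def mbr(rects):
    """Minimum bounding rectangle, the box an internal R-tree node keeps."""
    rects = list(rects)
    return (min(r[0] for r in rects), max(r[1] for r in rects),
            min(r[2] for r in rects), max(r[3] for r in rects))


# Assumed overflow-list layout: one rectangle, many (ImageId, ObjId) entries.
entries = defaultdict(list)
entries[(10, 60, 20, 90)].append(("pic1.gif", "o1"))
entries[(10, 60, 20, 90)].append(("pic2.gif", "o7"))  # same rectangle, other image
entries[(80, 120, 15, 70)].append(("pic1.gif", "o2"))


def window_query(window):
    """Linear scan used here in place of a real R-tree descent."""
    hits = []
    for rect, owners in entries.items():
        if overlaps(rect, window):
            hits.extend(owners)
    return sorted(hits)


print(mbr(entries))
print(window_query((50, 70, 30, 40)))
```

Only the two spatial coordinates take part in the overlap test here, which mirrors the limitation noted above: the n property dimensions play no role in the search.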
Implementations


- Many use object-oriented implementation
- Support methods such as
  - rotate, segment, edit
- Most implementations assume that whole images are compared
- Perform feature extraction and thus represent an image as a vector of n fields
- An index is created on the n-dimensional vectors
  - multidimensional extension of point quadtree
  - R-Tree
- To perform similarity search, they compute the Euclidean distance between the vector representing the query image and those of all images
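
The similarity search described above can be sketched as a brute-force ranking by Euclidean distance. The feature vectors, the file names attached to them, and the function most_similar are hypothetical; a real system would obtain the vectors from feature extraction and would typically consult the index rather than scan every image.

```python
import math

# Hypothetical 3-field feature vectors, one per stored image.
features = {
    "pic1.gif": (0.20, 0.70, 0.10),
    "pic2.gif": (0.90, 0.10, 0.40),
    "pic8.gif": (0.25, 0.65, 0.20),
}


def most_similar(query_vector, k=2):
    """Return the k image names closest to query_vector in Euclidean distance."""
    ranked = sorted(
        features,
        key=lambda name: math.dist(query_vector, features[name]),
    )
    return ranked[:k]


print(most_similar((0.20, 0.70, 0.15)))
```

math.dist requires Python 3.8 or later; on older versions the distance can be computed with math.sqrt over the summed squared differences.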