A Tutorial of Projects for CSM16 Visual Information Systems

By Li Chen
20/02/2004
This tutorial introduces the steps of designing and developing part of the projects in detail. The particular algorithms/methods described here should be treated as samples: they are not the only solutions, and you can develop your own. For general information, refer to the formal proposal documents for the projects of this module.
Project 1: Develop a system which applies the Gaussian model to realize image classification, content-based and semantic-based retrieval
1. Training stage
1.1 Analysis of data
First you have to decide which type of image data will be used in the project, and then predefine their labels/categories.
1.2 Choose training samples
An example:
There are six images (training samples), which are separated into two categories {w1, w2}:
w1 : image1, image2, image3
w2 : image4, image5, image6
1.3 Extract primitive features
Primitive features can be colour or texture. Here I will give an example of a colour histogram extracted from an RGB image. The related code is as follows.
function RGBHist = Colour_Feature(rgbimage)
% Extract the colour histogram features from an RGB image.
% The three channel histograms are normalised and concatenated
% into a single column feature vector.
RGBHist = [];
% extract Red
counts = imhist(rgbimage(:,:,1));
RGBHist = [RGBHist; counts / sum(counts)];
% extract Green
counts = imhist(rgbimage(:,:,2));
RGBHist = [RGBHist; counts / sum(counts)];
% extract Blue
counts = imhist(rgbimage(:,:,3));
RGBHist = [RGBHist; counts / sum(counts)];
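A minimal usage sketch is given below; the image file names are placeholders (assumptions), and for 8-bit images each feature vector has 3 x 256 = 768 elements.

% Sketch: collect the colour-histogram features of the w1 training samples.
% The file names below are placeholders; replace them with your own images.
files_w1 = {'image1.jpg', 'image2.jpg', 'image3.jpg'};
Features_w1 = [];
for k = 1:length(files_w1)
    img = imread(files_w1{k});
    Features_w1 = [Features_w1, Colour_Feature(img)];   % one column per image
end
% Features_w1 is now a matrix with one 768x1 feature vector per column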
1.4 Calculate parameters in Gaussian classification functions
The Gaussian probability density function for the category w_i has the following form:

$$P(x \mid w_i) = \left[(2\pi)^d \,|\Sigma_i|\right]^{-1/2} \exp\left\{-\frac{1}{2}(x - u_i)^T \Sigma_i^{-1} (x - u_i)\right\}$$

where
u_i --- the mean vector of class w_i
Σ_i --- the covariance matrix of the i-th class
x --- the feature vector.

u_i and Σ_i are calculated from the training samples belonging to the category w_i:

$$u_i = \frac{1}{N_i}\sum_{j=1}^{N_i} x_j, \qquad x_j \in w_i,$$

where N_i is the number of training patterns from the class w_i. The covariance matrix is

$$\Sigma_i = \frac{1}{N_i}\sum_{j=1}^{N_i} (x_j - u_i)(x_j - u_i)^T$$
The following is an example of the procedure for calculating the parameters of the Gaussian probability density of the class w1:
Assume:
w1 : image1, image2, image3
After extracting primitive features from the images:

$$\text{Features\_image1} = \begin{bmatrix}1\\1\\3\end{bmatrix}, \quad \text{Features\_image2} = \begin{bmatrix}2\\3\\4\end{bmatrix}, \quad \text{Features\_image3} = \begin{bmatrix}3\\1\\2\end{bmatrix}$$

The mean vector:

$$u_1 = \frac{1}{3}\left\{\begin{bmatrix}1\\1\\3\end{bmatrix} + \begin{bmatrix}2\\3\\4\end{bmatrix} + \begin{bmatrix}3\\1\\2\end{bmatrix}\right\} = \frac{1}{3}\begin{bmatrix}6\\5\\9\end{bmatrix} = \begin{bmatrix}2\\5/3\\3\end{bmatrix}$$

The covariance matrix:

$$\Sigma_1 = \frac{1}{3}\sum_{j=1}^{3}(x_j - u_1)(x_j - u_1)^T
= \frac{1}{3}\left\{\begin{bmatrix}1-2\\1-5/3\\3-3\end{bmatrix}\begin{bmatrix}1-2\\1-5/3\\3-3\end{bmatrix}^T + \begin{bmatrix}2-2\\3-5/3\\4-3\end{bmatrix}\begin{bmatrix}2-2\\3-5/3\\4-3\end{bmatrix}^T + \begin{bmatrix}3-2\\1-5/3\\2-3\end{bmatrix}\begin{bmatrix}3-2\\1-5/3\\2-3\end{bmatrix}^T\right\}$$

$$= \frac{1}{3}\left\{\begin{bmatrix}1 & 2/3 & 0\\2/3 & 4/9 & 0\\0 & 0 & 0\end{bmatrix} + \begin{bmatrix}0 & 0 & 0\\0 & 16/9 & 4/3\\0 & 4/3 & 1\end{bmatrix} + \begin{bmatrix}1 & -2/3 & -1\\-2/3 & 4/9 & 2/3\\-1 & 2/3 & 1\end{bmatrix}\right\}
= \begin{bmatrix}2/3 & 0 & -1/3\\0 & 8/9 & 2/3\\-1/3 & 2/3 & 2/3\end{bmatrix}$$

Its inverse (computed with pinv):

$$\Sigma_1^{-1} \approx \begin{bmatrix}1.1629 & 0.4281 & -0.2604\\0.4281 & 0.5993 & 0.2354\\-0.2604 & 0.2354 & 0.3068\end{bmatrix}$$

(If you have forgotten the formula for inverting a matrix, you can check your mathematics books for the details. In any case, you do not need to know the calculation details: simply use the function pinv(x) to calculate the inverse of x. Note that Σ_1 in this toy example is actually singular, so pinv returns its pseudoinverse.)
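The same parameter calculation can be done directly in MATLAB. The following sketch assumes the training feature vectors of one class are stored as the columns of a matrix X (here the toy 3-dimensional vectors from the example; with real histogram features only the dimensionality changes).

% Sketch: estimate the Gaussian parameters of one class from its training samples.
% Columns of X are the feature vectors of the class (toy values from the example).
X = [1 2 3;
     1 3 1;
     3 4 2];
Ni = size(X, 2);                    % number of training patterns in the class
u1 = mean(X, 2);                    % mean vector u_1 = [2; 5/3; 3]
D  = X - repmat(u1, 1, Ni);         % differences (x_j - u_1)
Sigma1 = (D * D') / Ni;             % covariance matrix (1/N_i) * sum of outer products
Sigma1Inv = pinv(Sigma1);           % (pseudo)inverse used in the density function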
2. Testing stage
2.1 Theoretical inference
$$P(x, w_i) = P(x)\,P(w_i \mid x) = P(w_i)\,P(x \mid w_i)$$
$$\Rightarrow\; P(w_i \mid x) = P(w_i)\,P(x \mid w_i)\,/\,P(x)$$

To simplify the problem here, we suppose P(x) and P(w_i) are scale factors or constants, so

$$P(w_i \mid x) = \alpha\,P(x \mid w_i),$$

where α is a constant which makes sure that $\sum_i P(w_i \mid x) = 1$.
2.2 Classify an unknown sample
An example:
After the training stage, we have one Gaussian function P(x | w1) for category w1 and another, P(x | w2), for category w2. From the above theoretical inference, we get the posterior probabilities P(w1 | x) and P(w2 | x).
If P(w1 | x) > P(w2 | x), assign x to w1; otherwise assign x to w2.
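A minimal sketch of the testing stage is given below, assuming equal priors P(w1) = P(w2) so that comparing P(x | w1) and P(x | w2) directly is enough; u1, Sigma1, u2, Sigma2 are assumed to come from the training stage above.

function p = Gaussian_Density(x, u, Sigma)
% Evaluate the Gaussian probability density P(x | w_i) for feature vector x,
% given the class mean u and covariance Sigma estimated in the training stage.
% If Sigma is (near-)singular, regularise it first, e.g. Sigma + 1e-6*eye(length(x)).
d = length(x);
dx = x - u;
p = ((2*pi)^d * det(Sigma))^(-1/2) * exp(-0.5 * dx' * pinv(Sigma) * dx);

With equal priors, an unknown sample x is then assigned to w1 if Gaussian_Density(x, u1, Sigma1) > Gaussian_Density(x, u2, Sigma2), and to w2 otherwise.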
3. Content-based image retrieval
I recommend that you try the simplest function to calculate the distance between the query example and the data in your image database, and rank them according to the distance values.
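For example, a sketch using the Euclidean distance is given below; the variable names query (a d x 1 feature vector) and dbFeatures (a d x M matrix with one image feature vector per column) are assumptions.

% Sketch: rank database images by Euclidean distance to the query feature vector.
M = size(dbFeatures, 2);
dist = zeros(1, M);
for k = 1:M
    dist(k) = sqrt(sum((dbFeatures(:, k) - query).^2));   % Euclidean distance
end
[sortedDist, rankIdx] = sort(dist, 'ascend');   % rankIdx(1) is the most similar image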
References:
[1] B. S. Manjunath, J. R. Ohm, V. V. Vasudevan and A. Yamada, "Color and texture descriptors", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 11, No. 6, June 2001, pp. 703-715.
[2] R. J. Schalkoff, Pattern Recognition: Statistical, Structural and Neural Approaches, Wiley, New York/Chichester, 1992. (Chapter 2)
Project 2: Develop a system which uses a multiple-classifier combination method to realize image classification.
1. Design individual classifiers
Individual classifiers can be constructed based on different feature spaces (colour, texture) [4] and different classification algorithms (K-Nearest Neighbours, Neural Networks, Gaussian classification, etc.).
For an introduction to Neural Networks, refer to the documents by Mr. Jifeng Wang which accompany this tutorial.
For details of classifiers based on the Gaussian model, refer to Project 1.
Here I will introduce the K-Nearest Neighbours algorithm.
1.1 An example of the K-Nearest Neighbours algorithm
Given a training set of patterns X = {(x1, y1), (x2, y2), ..., (xN, yN)}, where xi is a feature vector and yi is the class label of xi, assume a 2-class classification problem. In the figure below, circles denote samples of w1 and squares denote samples of w2, and the crossed symbol is the input vector x. K should be related to the size N of the training set; suppose K = 4. Draw a circle centred on x which encloses K = 4 training samples. There are 3 samples from w1 and 1 sample from w2, so assign x to w1.
[Figure: circles (w1) and squares (w2) around the crossed query point x, with the circle enclosing its K = 4 nearest neighbours]
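A minimal K-Nearest Neighbours sketch under the same assumptions is given below: the training features are stored as columns of X, the numeric class labels in y, the Euclidean distance is used, and K = 4.

function label = KNN_Classify(x, X, y, K)
% Classify an unknown feature vector x by a majority vote among its K nearest
% training patterns. X is d x N (one column per pattern), y is 1 x N numeric labels.
N = size(X, 2);
dist = zeros(1, N);
for j = 1:N
    dist(j) = sqrt(sum((X(:, j) - x).^2));    % distance to each training pattern
end
[sortedDist, idx] = sort(dist, 'ascend');
neighbours = y(idx(1:K));                     % labels of the K nearest patterns
classes = unique(neighbours);                 % majority vote among those labels
votes = zeros(size(classes));
for c = 1:length(classes)
    votes(c) = sum(neighbours == classes(c));
end
[maxVotes, best] = max(votes);
label = classes(best);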
2. Combination strategies
Fixed rules for combining classifiers are discussed both theoretically and practically in [5]. Xu et al. [3] summarised possible solutions for fusing individual classifiers, dividing them into three categories according to the levels of information available from the various classifiers. I will give two different algorithms as examples.
2.1 An example of the simplest majority voting algorithm
Assume 3 classifiers, all of which output labels for unknown images. If two of them agree with each other, we take their result as the final decision. For example, if classifier 1 says an image is A, classifier 2 says it is B, and classifier 3 says it is A, then A is the final decision. In some situations the three classifiers output different labels, and you then have to decide which one is right using some rule.
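A tiny sketch of this vote for three label outputs is given below; the tie-breaking rule (falling back to classifier 1) is only one possible choice and is an assumption.

function decision = Majority_Vote(labels)
% Majority vote over the label outputs of three classifiers.
% labels is a 1x3 cell array of label strings, e.g. {'A', 'B', 'A'}.
if strcmp(labels{1}, labels{2}) || strcmp(labels{1}, labels{3})
    decision = labels{1};      % classifier 1 agrees with at least one other
elseif strcmp(labels{2}, labels{3})
    decision = labels{2};      % classifiers 2 and 3 agree
else
    decision = labels{1};      % all disagree: fall back to classifier 1 (your rule may differ)
end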
2.2 The Borda count method
This example follows the notes from Dr. Tang's lecture.
Assume we have 4 categories {A, B, C, D} and 3 classifiers. Each classifier outputs a rank list.
When an unknown image x is input into the system, we have the following output information:
Rank value    Classifier 1    Classifier 2    Classifier 3
4             C               A               B
3             B               B               A
2             D               D               C
1             A               C               D
The rank values are the scores assigned to the different ranked levels.
The score of x belonging to category A is:
$$S_A = S_A^1 + S_A^2 + S_A^3 = 1 + 4 + 3 = 8$$
The score of x belonging to category B is:
$$S_B = S_B^1 + S_B^2 + S_B^3 = 3 + 3 + 4 = 10$$
The score of x belonging to category C is:
$$S_C = S_C^1 + S_C^2 + S_C^3 = 4 + 1 + 2 = 7$$
The score of x belonging to category D is:
$$S_D = S_D^1 + S_D^2 + S_D^3 = 2 + 2 + 1 = 5$$
The final decision is B because it obtains the highest value.
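The same Borda count combination can be written in a few lines of MATLAB. The sketch below encodes each classifier's output as a row of rank scores over the categories {A, B, C, D}, using the values from the table above.

% Sketch: Borda count combination of three classifiers over four categories.
% Each row of R holds one classifier's rank scores for the categories [A B C D].
R = [1 3 4 2;     % classifier 1: A=1, B=3, C=4, D=2
     4 3 1 2;     % classifier 2: A=4, B=3, C=1, D=2
     3 4 2 1];    % classifier 3: A=3, B=4, C=2, D=1
S = sum(R, 1);                          % total score per category: [8 10 7 5]
labelNames = {'A', 'B', 'C', 'D'};
[maxScore, winner] = max(S);
final_decision = labelNames{winner};    % 'B', the category with the highest score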
References:
[3] L. Xu, A. Krzyzak, C. Y. Suen, "Methods of Combining Multiple Classifiers and Their Applications to Handwriting Recognition", IEEE Transactions on Systems, Man, and Cybernetics, 22(3), 1992, pp. 418-435.
[4] B. S. Manjunath, J. R. Ohm, V. V. Vasudevan and A. Yamada, "Color and texture descriptors", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 11, No. 6, June 2001, pp. 703-715.
[5] J. Kittler, M. Hatef, R. Duin and J. Matas, “On Combining Classifiers”, IEEE Transactions on
Pattern Analysis and Machine Intelligence, 20(3), 1998, pp. 226-239.
[6] R. Duin, “The Combining Classifier: to Train or Not to Train”, in: R. Kasturi, D.
Laurendeau, C. Suen (eds.), ICPR16, Proceedings 16th International Conference on Pattern
Recognition (Quebec City, Canada, Aug.11-15), vol. II, IEEE Computer Society Press, Los
Alamitos, 2002, pp.765-770.