A Tutorial of Some Useful Methods for CSM16 Visual Information Systems
By Li Chen

Gaussian Model

1. Training stage

1.1 Analysis of data
First decide which type of image data will be used in the project, and then predefine the labels/categories.

1.2 Choose training samples
An example: there are six images (training samples), which are separated into two categories {w_1, w_2}:
w_1: image1, image2, image3
w_2: image4, image5, image6

1.3 Extract primitive features
Primitive features can be colour or texture. Here is an example of a colour histogram extracted from an RGB image. The corresponding Matlab code is as follows.

function RGBHist = Colour_Feature(rgbimage)
% Extract the colour histogram features of an RGB image
RGBHist = [];
% extract Red
[counts, x] = imhist(rgbimage(:,:,1));
totalpixels = sum(counts);
for j = 1:length(x)
    RGBHist = [RGBHist counts(j)/totalpixels];   % normalised bin value
end
% extract Green
[counts, x] = imhist(rgbimage(:,:,2));
totalpixels = sum(counts);
for j = 1:length(x)
    RGBHist = [RGBHist counts(j)/totalpixels];
end
% extract Blue
[counts, x] = imhist(rgbimage(:,:,3));
totalpixels = sum(counts);
for j = 1:length(x)
    RGBHist = [RGBHist counts(j)/totalpixels];
end
RGBHist = RGBHist';   % return the feature vector as a column vector

1.4 Calculate parameters of the Gaussian classification functions
The Gaussian probability density function for category w_i has the form

P(x | w_i) = [(2π)^d |Σ_i|]^(-1/2) exp{ -(1/2) (x - u_i)^T Σ_i^(-1) (x - u_i) }

where
u_i is the mean vector of class w_i,
Σ_i is the covariance matrix of the i-th class,
x is the feature vector and d is its dimension.

u_i and Σ_i are calculated from the training samples belonging to category w_i:

u_i = (1/N_i) sum_{j=1}^{N_i} x_j,  x_j ∈ w_i,

where N_i is the number of training patterns from class w_i. The covariance matrix is

Σ_i = (1/N_i) sum_{j=1}^{N_i} (x_j - u_i)(x_j - u_i)^T.

The following is an example of the procedure for calculating the parameters of the Gaussian probability density of class w_1.

Assume w_1: image1, image2, image3. After extracting primitive features from the images (written as column vectors),

Features_image1 = [1; 1; 3], Features_image2 = [2; 3; 4], Features_image3 = [3; 1; 2].

The mean vector is

u_1 = (1/3) ([1; 1; 3] + [2; 3; 4] + [3; 1; 2]) = (1/3) [6; 5; 9] = [2; 5/3; 3].

The covariance matrix (matrices are written row by row, rows separated by semicolons) is

Σ_1 = (1/3) sum_{j=1}^{3} (x_j - u_1)(x_j - u_1)^T
    = (1/3) { [-1; -2/3; 0][-1, -2/3, 0] + [0; 4/3; 1][0, 4/3, 1] + [1; -2/3; -1][1, -2/3, -1] }
    = (1/3) ( [1, 2/3, 0; 2/3, 4/9, 0; 0, 0, 0] + [0, 0, 0; 0, 16/9, 4/3; 0, 4/3, 1] + [1, -2/3, -1; -2/3, 4/9, 2/3; -1, 2/3, 1] )
    = [2/3, 0, -1/3; 0, 8/9, 2/3; -1/3, 2/3, 2/3].

Note that with only three training samples this covariance matrix is singular, so its ordinary inverse does not exist. For the matrix operations you can check your mathematics books; in Matlab you can use the function pinv(x) to calculate the (pseudo-)inverse of x, which here gives

pinv(Σ_1) ≈ [1.1629, 0.4281, -0.2604; 0.4281, 0.5993, 0.2354; -0.2604, 0.2354, 0.3068].

2. Testing stage

2.1 Theoretical inference

P(x, w_i) = P(x) P(w_i | x) = P(w_i) P(x | w_i)
P(w_i | x) = P(w_i) P(x | w_i) / P(x)

To simplify the problem here, we suppose P(x) and P(w_i) are scale factors (constants), so

P(w_i | x) = α P(x | w_i),

where α is a constant chosen to make sure that sum_i P(w_i | x) = 1.

2.2 Classify an unknown sample
An example: after the training stage we have one Gaussian function for category w_1, P(x | w_1), and another Gaussian function for category w_2, P(x | w_2). From the theoretical inference above we obtain the posterior probabilities P(w_1 | x) and P(w_2 | x). If P(w_1 | x) > P(w_2 | x), assign x to w_1; otherwise assign x to w_2.

3. Content-based image retrieval
You may try some simple distance functions (such as the Euclidean distance or the sum of absolute differences) to calculate the distance between the query example and the data in the image database, and rank the database images according to these distance values. You can also make use of the analysed classes in the database to speed up the process. A short Matlab sketch combining these steps is given below.
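The following Matlab sketch (not part of the original handout) ties sections 1.4, 2.2 and 3 together: it estimates the class means and covariance matrices from training features, applies the Gaussian decision rule to an unknown sample, and ranks database images by Euclidean distance to a query. The feature vectors of class w_2, the unknown sample, the database matrix and the query are made-up values used only for illustration.

% Training (section 1.4): each column of X1/X2 is one training feature vector.
X1 = [1 2 3; 1 3 1; 3 4 2];          % class w1 (the worked example above)
X2 = [5 6 7; 6 5 6; 2 1 3];          % class w2 (hypothetical values)
u1 = mean(X1, 2);  u2 = mean(X2, 2); % class mean vectors
D1 = X1 - repmat(u1, 1, size(X1, 2));
D2 = X2 - repmat(u2, 1, size(X2, 2));
S1 = D1 * D1' / size(X1, 2);         % covariance, divided by N_i as in section 1.4
S2 = D2 * D2' / size(X2, 2);

% Gaussian density of section 1.4; pinv and a floor on the determinant keep the
% expression usable when the covariance is singular, as in the worked example.
gauss = @(x, u, S) ((2*pi)^length(x) * max(det(S), eps))^(-0.5) ...
        * exp(-0.5 * (x - u)' * pinv(S) * (x - u));

% Testing (section 2.2): assign the unknown sample to the more probable class.
x = [2; 2; 3];                       % hypothetical unknown sample
if gauss(x, u1, S1) > gauss(x, u2, S2)
    disp('assign x to w1');
else
    disp('assign x to w2');
end

% Content-based retrieval (section 3): rank database images by Euclidean distance.
database = [X1 X2];                  % columns = feature vectors of database images
query = [2; 3; 3];                   % hypothetical query feature vector
dist = sqrt(sum((database - repmat(query, 1, size(database, 2))).^2, 1));
[sortedDist, order] = sort(dist);    % order(1) is the most similar image
fprintf('Ranking of database images (most similar first): %s\n', mat2str(order));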
References:
[1] B. S. Manjunath, J. R. Ohm, V. V. Vasudevan and A. Yamada, "Color and Texture Descriptors", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 11, No. 6, June 2001, pp. 703-715.
[2] R. J. Schalkoff, Pattern Recognition: Statistical, Structural and Neural Approaches, Wiley, New York, 1992 (Chapter 2).

Multiple classifier combination method

1. Design individual classifiers
Individual classifiers can be constructed from different feature spaces (colour, texture) [4] and with different classification algorithms (K-Nearest Neighbours, neural networks, Gaussian classification, etc.). For details of classifiers based on the Gaussian model, refer to project 1. The K-Nearest Neighbours algorithm is described below.

1.1 An example of the K-Nearest Neighbours algorithm
Given a training set of patterns X = {(x_1, y_1), (x_2, y_2), ..., (x_N, y_N)}, where x_i is a feature vector and y_i is the class label of x_i, assume a two-class classification problem: circles are members of class w_1, squares are members of class w_2, and the cross is the input vector x. K should be related to the size N of the training set; here we suppose K = 4. Draw a circle centred at x that encloses K = 4 training samples. Three of these samples belong to w_1 and one belongs to w_2, so x is assigned to w_1. See the lecture notes for more details.

(Figure: the query point x and its K = 4 nearest neighbours among the training samples.)

2. Combination strategies
Fixed rules for combining classifiers are discussed in theory and with practical examples in [5]. Xu et al. [3] summarised possible solutions for fusing individual classifiers through three approaches, according to the level of information available from the various classifiers. Two different algorithms are given here as examples.

2.1 An example of the simplest majority voting algorithm
Assume three classifiers, each of which outputs a label for an unknown image. If two of them agree with each other, the agreed label is the result. For example, if the output from classifier 1 is A, from classifier 2 is B, and from classifier 3 is A, then A is the final decision. If the three classifiers output three different labels, some additional rule has to decide which one is right.

2.2 The Borda count method
(An example can also be found in the notes from Dr. Tang's lecture.) Assume we have 4 categories {A, B, C, D} and 3 classifiers, and each classifier outputs a ranked list. When an unknown image x is input into the system, we have the following output:

Rank value   Classifier 1   Classifier 2   Classifier 3
    4             C              A              B
    3             B              B              A
    2             D              D              C
    1             A              C              D

The rank values are the scores assigned to the different ranked positions. The score of x belonging to category A is

S_A = S_A^1 + S_A^2 + S_A^3 = 1 + 4 + 3 = 8.

Similarly,

S_B = S_B^1 + S_B^2 + S_B^3 = 3 + 3 + 4 = 10,
S_C = S_C^1 + S_C^2 + S_C^3 = 4 + 1 + 2 = 7,
S_D = S_D^1 + S_D^2 + S_D^3 = 2 + 2 + 1 = 5.

The final decision is B because it obtains the highest score. A short Matlab sketch of both combination strategies is given below.
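The following is a minimal Matlab sketch (not from the original notes) of the two combination strategies above: majority voting (section 2.1) and the Borda count (section 2.2). The classifier outputs reproduce the examples and the table above.

% Majority voting (section 2.1): the label output by most classifiers wins.
outputs = {'A', 'B', 'A'};                      % labels from the three classifiers
[labels, lastIdx, map] = unique(outputs);
votes = accumarray(map(:), 1);                  % count votes per distinct label
[maxVotes, winner] = max(votes);
fprintf('Majority vote decision: %s\n', labels{winner});

% Borda count (section 2.2): each classifier outputs a ranked list; positions
% are scored 4, 3, 2, 1 and the scores are summed per category.
categories = {'A', 'B', 'C', 'D'};
rankLists  = {'C' 'B' 'D' 'A';                  % classifier 1, best to worst
              'A' 'B' 'D' 'C';                  % classifier 2
              'B' 'A' 'C' 'D'};                 % classifier 3
rankValues = [4 3 2 1];                         % score of each ranked position
scores = zeros(1, numel(categories));
for c = 1:size(rankLists, 1)
    for r = 1:numel(rankValues)
        idx = strcmp(categories, rankLists{c, r});
        scores(idx) = scores(idx) + rankValues(r);
    end
end
[bestScore, best] = max(scores);
fprintf('Borda scores [A B C D] = %s -> decision: %s\n', ...
        mat2str(scores), categories{best});

Running this reproduces the result above: B wins the Borda count with a score of 10.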
References:
[3] L. Xu, A. Krzyzak and C. Y. Suen, "Methods of Combining Multiple Classifiers and Their Applications to Handwriting Recognition", IEEE Transactions on Systems, Man, and Cybernetics, 22(3), 1992, pp. 418-435.
[4] B. S. Manjunath, J. R. Ohm, V. V. Vasudevan and A. Yamada, "Color and Texture Descriptors", IEEE Transactions on Circuits and Systems for Video Technology, Vol. 11, No. 6, June 2001, pp. 703-715.
[5] J. Kittler, M. Hatef, R. Duin and J. Matas, "On Combining Classifiers", IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(3), 1998, pp. 226-239.
[6] R. Duin, "The Combining Classifier: to Train or Not to Train", in: R. Kasturi, D. Laurendeau and C. Suen (eds.), Proceedings of the 16th International Conference on Pattern Recognition (ICPR16), Quebec City, Canada, Aug. 11-15, Vol. II, IEEE Computer Society Press, Los Alamitos, 2002, pp. 765-770.