Information Retrieval
Image Retrieval: Project paper
The University of Joensuu, Department of Computer Science
20.11.2003

Ioan Cleju 157743 icleju@cs.joensuu.fi
Tuomas Hakala 149162 thakala@cs.joensuu.fi
Suvi Korpela 153927 skorpela@cs.joensuu.fi
Andrei Mihaila 157752 amihaila@cs.joensuu.fi
Risto Sarapik 142821 rsara@cs.joensuu.fi

1. Introduction
2. IR-Model
3. Technical description
3.1. K-means Clustering Algorithm
3.2. Self Organizing Maps
3.3. GUI
4. Evaluation
4.1. Test plan
4.2. The results of the tests
4.3. Conclusions
5. Description of other systems in the field
5.1. Comparison to other systems
6. Future development
7. References

1. Introduction

Among the increasing stream of published data is a growing amount of visual information. Digital image and video libraries are becoming more common and widely used. There is a huge number of visual data sources, among them various digital image collections and databases for all kinds of purposes. The fast development of computing hardware has made visual information an integral part of normal everyday computing. In order to utilize the visual information stored in databases we need an effective image retrieval method. It can be based on textual keywords connected to the stored images, or it can be based on measurable visual properties of the images such as colors, shapes, textures and spatial relations. Traditional text-based image retrieval systems have some obvious limitations. Connecting keywords to every image in the whole database has to be done manually and is quite laborious. The other problem is the content of the images: how to find correct, all-embracing and unambiguous keywords. Content-based image retrieval is another way to handle this problem. It is based on features extracted automatically from the contents of the image. The problems with this method stem from the fact that a human being and a computer read images in different ways.
The method we have studied is the latter, content-based image retrieval. By analyzing the key features of a pattern image we get a group of values which are typical of that image and which can be compared to the corresponding features of the images in the database. The query result is a set of images similar to the query image. How similar the results look to the original pattern depends on the methods used to analyze the pattern image and on the features that are compared.

2. IR-Model

An information retrieval model is a model that presents the idea of how the system retrieves information for its user [1]. In other words, the retrieval algorithm that the system uses is built upon the chosen model. Generally, different information retrieval models can be used depending on the system's purpose and the users' needs. Our system should be able to group and rank images. In practice the system extracts key vectors from the images and uses them to perform these tasks, and correspondingly we used two different models. In the phase of ordering the vectors and measuring the distances between them we used the generalized vector space model [3]. In 1985, Wong, Ziarko and Wong proposed an interpretation in which the index term vectors are assumed to be linearly independent but not pairwise orthogonal. This interpretation leads to the generalized vector space model. In our system the generalized vector space model appears in the structure of the vector. The vector is composed of a set of elements, each of which holds the number of pixels in an image containing a specified color; the color is represented in RGB, so each element of the vector can itself be considered a multidimensional structure. In the phase of grouping similar vectors we used a neural network model. Since neural networks are known to be good pattern matchers, it is natural to consider their use as an alternative model for information retrieval.
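As an illustration of the vector structure described above, here is a minimal sketch in Python. The class name, field layout and example values are our own, chosen purely for illustration; they are not part of the system's actual code:

```python
from dataclasses import dataclass

@dataclass
class ColorFeature:
    """One element of the image feature vector: a quantized RGB color
    and the share of the image's pixels mapped to that color."""
    rgb: tuple       # (r, g, b), each component in 0..255
    fraction: float  # share of pixels, between 0.0 and 1.0

# A vector for an image reduced to three colors (values are made up):
vector = [
    ColorFeature((200, 30, 30), 0.55),
    ColorFeature((250, 250, 250), 0.30),
    ColorFeature((20, 20, 120), 0.15),
]
assert abs(sum(f.fraction for f in vector) - 1.0) < 1e-9  # shares cover the whole image
```

Storing the fraction rather than the raw pixel count makes vectors of differently sized images directly comparable.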
It is now well established that our brain is composed of billions of neurons working in parallel towards the same goal. Each neuron can be viewed as a basic processing unit which, when stimulated by input signals, may emit output signals as a reaction. It has also been established that similar inputs cause neurons in the same area to react. These are the basic observations that led to the development of self-organizing maps as an instrument of neural networks [6].

3. Technical description

The system has to implement several tasks in order to become a standalone application from which we can retrieve images. The first one is feature extraction. Given an image, the system has to extract features that it will use for comparison with other images and for indexing. This version of the product focuses only on the color information of the image. We used color quantization as a technique to reduce the number of colors. The final number of colors depends on the type of the images in the database, the desired precision of the process and the run time. For a retrieval process, both the query image and all the other images have to be reduced to the same number of colors. Of course, feature extraction is performed online only for the query image. After the number of colors is reduced, the extracted vector contains information about the number of colors and the actual colors. Another task is to construct the database. All images are passed through the feature extractor, and in the database we store the path to each image and its feature vector. After the database is constructed, it has to be indexed. For indexing we use a self-organizing map. This structure organizes our data in a two-dimensional space, with the property that similar images end up in the same region. To achieve this, the SOM needs to be trained first: it takes all the vectors obtained from the images and adjusts its parameters. To train, a SOM needs to know how to compute the distance between vectors.
Usually the Euclidean distance is used. However, because we know the actual meaning of each element in the vector, we can develop our own distance, and that is what we did. Here is the way we compute the distance between two vectors [10]. Using a greedy algorithm, each color from one image is associated with one color from the other image, so that the sum of the Euclidean distances between the colors is minimized. After the association, the actual distance is computed as the sum of the distances between each pair of corresponding colors, and the distance between two colors is computed as a sum of three terms. The first is the absolute difference of the colors' percentages in the two images. The second is the Euclidean distance between the colors, normalized to the range between 0 and 1. The last term is the product of the other two. With the SOM constructed, we can begin the actual indexing. All the vectors of the images are presented to the map, which returns for each the two coordinates that represent the location of the vector in the SOM's space. These coordinates are registered in the database. All the tasks described above are done offline; this is the offline approach [2]. The online task, the actual retrieval, works like this. The user selects a query image. The image is processed and its features are extracted. The features are presented to the SOM, which returns their location. A collection of images that are in the neighborhood (larger or smaller, depending on the user's preferences) is retrieved from the database. All these images are then compared to the query image, sorted, and shown in the GUI.

3.1. K-means Clustering Algorithm

There are many approaches to the problem of clustering: algorithms based on splitting or merging, neural networks, and randomized approaches. Among the approaches that try to minimize an objective function, the most widely used is the Generalized Lloyd Algorithm, also known as the K-means clustering algorithm.
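The three-term distance described above can be sketched as follows. This is our own illustrative Python rendering of the measure as summarized from [10], with a naive O(n³) greedy pairing; the function names are made up and this is not the system's actual implementation:

```python
import math

MAX_RGB_DIST = math.sqrt(3) * 255.0   # largest possible distance between two RGB colors

def rgb_dist(c1, c2):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

def image_distance(v1, v2):
    """v1, v2: feature vectors as lists of (rgb, fraction), equal length."""
    # Greedy association: repeatedly pair the globally closest unmatched colors.
    free1, free2, pairs = set(range(len(v1))), set(range(len(v2))), []
    while free1:
        i, j = min(((i, j) for i in free1 for j in free2),
                   key=lambda ij: rgb_dist(v1[ij[0]][0], v2[ij[1]][0]))
        pairs.append((i, j))
        free1.discard(i)
        free2.discard(j)
    total = 0.0
    for i, j in pairs:
        d_share = abs(v1[i][1] - v2[j][1])                     # term 1: share difference
        d_color = rgb_dist(v1[i][0], v2[j][0]) / MAX_RGB_DIST  # term 2: normalized color distance
        total += d_share + d_color + d_share * d_color         # term 3: product of the two
    return total

v = [((255, 0, 0), 0.6), ((0, 0, 255), 0.4)]
print(image_distance(v, v))  # identical vectors → 0.0
```

The product term makes a pair that differs both in color and in share count more than the sum of the two differences alone.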
Given a d-dimensional space, a set of n data points in this space, and an integer k, the algorithm is supposed to find a set of k centers that minimizes the total squared distance from each data point to its nearest center. Our approach is a refined version of the K-means clustering algorithm [4], which has the advantage of being computationally very feasible. It uses an iterative method to compute the set of centers. We adopt the convention that the neighborhood of a center z denotes all the points in the data set for which z is the nearest center. Note that, given a set of points, the placement of the center that minimizes the total squared distance from the points to the center is the centroid of the set. Each step of the algorithm recomputes the neighborhoods of the centers and then moves each center to the centroid of its neighborhood. The algorithm eventually converges to a local minimum; how close this local minimum is to the global minimum depends on the choice of the initial centers. The implementation of the algorithm uses a tree structure as its key element. The width of the tree depends on the number of dimensions of the space. The root of the tree corresponds to the entire space, which is a hypercube. The space is divided, each dimension being split in two, and all the resulting hypercubes correspond to children of the root. Each region is split again, and correspondingly the tree grows deeper. A hypercube stops splitting when the number of data points inside it falls below a certain limit (the granularity). The algorithm is implemented for the three-dimensional case, so we construct an octal tree (octree). As in the general case, the root of the tree corresponds to the entire space of data points, which is a cube. The cube is then divided into 8 cubes (each of the cube's dimensions is halved) and these cubes correspond to the children of the root.
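The recursive splitting of the color cube can be sketched as follows. This is a simplified illustration, not the system's code: the class name, the [0, 256) bounds and the granularity value are assumptions for the example:

```python
class OctreeNode:
    """Node of the octree over the RGB cube [0, 256)^3; a node splits
    into 8 children until it holds no more than `granularity` points."""
    def __init__(self, low, high, points, granularity):
        self.low, self.high = low, high   # opposite corners of the cube
        self.points = points
        self.children = []
        if len(points) > granularity:
            mid = tuple((l + h) / 2 for l, h in zip(low, high))
            for octant in range(8):       # one child per lower/upper-half combination
                lo = tuple(mid[d] if (octant >> d) & 1 else low[d] for d in range(3))
                hi = tuple(high[d] if (octant >> d) & 1 else mid[d] for d in range(3))
                inside = [p for p in points
                          if all(lo[d] <= p[d] < hi[d] for d in range(3))]
                self.children.append(OctreeNode(lo, hi, inside, granularity))

# Four RGB points: two dark, two light; with granularity 2 the root splits once.
pts = [(10, 10, 10), (15, 20, 5), (200, 200, 200), (210, 190, 205)]
root = OctreeNode((0, 0, 0), (256, 256, 256), pts, granularity=2)
print(len(root.children))  # → 8
```

The half-open intervals make every point fall into exactly one child, so no point is counted twice.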
The process is repeated recursively until the number of points in the cubes corresponding to the leaves does not exceed the granularity. This splitting of the histogram is done just once, at the beginning of the algorithm. After the tree construction, the actual algorithm can start. First, k locations for the centers are chosen (randomly, or based on particularities of the data set). Then the first iteration may begin. Each iteration of the algorithm receives a set of centers, computes the neighborhood of each of them and then moves each center to the centroid of its neighborhood. Iterations are repeated until the locations of the centers stabilize. In most K-means implementations, the bulk of the computation goes into constructing the neighborhood of each center, and this is where the algorithm brings its biggest improvement. With the help of the already constructed tree, the association between data points and centers is built more cheaply. The idea is to propagate the centers through the tree, each node retaining only the centers that are candidates for the points inside its corresponding cube. The algorithm is known as the filtering algorithm. In the end, each leaf holds only a small number of centers, or even just one; if a leaf receives several centers, it eliminates all but one. All the data points corresponding to the leaf then belong to the neighborhood of the single center that remained unfiltered at the leaf. Because the union of all leaves covers all the data points (in fact the whole space), each data point is assigned a center. At this point the centroid of each center's neighborhood is computed, and each center moves to its centroid. With the new center locations generated, the next iteration can start. The key point in this approach is the propagation of the centers, i.e. the filtering algorithm. There are two cases, depending on whether the node is internal or a leaf.
If the node is internal, the algorithm should select from the centers only those that might be nearest to some point inside the node's corresponding cube. A cube (especially one whose node is close to the root) can contain a large number of data points. The filtering algorithm tries to eliminate those centers that are farther from every part of the cube than some other center. Filtering works as follows: first find the center nearest to the geometric center of the cube; call it z. Then take each of the other centers z′ and check whether any part of the cube is closer to z′ than to z. This can be decided by considering the plane bisecting the segment between the two points z and z′: if this plane intersects the cube, some points of the cube are closer to z′ than to z and others closer to z than to z′. If no part of the cube is closer to z′ than to z, then z′ is eliminated from the candidate list. The new list of centers is passed to the children, the filter is applied to the children, and so on. If the node is a leaf, the weighted distances from each center to all its points are computed and only the center with the minimum distance is kept; so after filtering the centers received by a leaf, only one center remains. The iterative algorithm continues until the centers keep the same locations for two consecutive iterations. As noted before, the way the initial centers are generated is very important both for the final result of the algorithm and for its run time. Taking advantage of the tree structure constructed for the algorithm, we propose a way to generate the initial centers. For this we need one additional piece of information: each node in the tree keeps a record of the number of data points it contains. The tree receives the number of centers it has to generate and passes this number to the root. Each of the root's children then receives a number of centers to generate that is proportional to the number of points belonging to its subspace.
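Returning to the internal-node case, the pruning test can be sketched as below. Because d(x, z′)² − d(x, z)² is linear in x, its extremes over an axis-aligned cube are reached at the corners, so checking the corners is equivalent to checking the bisecting plane against the whole cube. The function names are our own, illustrative choices:

```python
import itertools

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def filter_centers(low, high, centers):
    """Return the centers that may be nearest to some point of the
    axis-aligned cube [low, high]; the others are pruned away."""
    mid = tuple((l + h) / 2 for l, h in zip(low, high))
    z = min(centers, key=lambda c: sq_dist(c, mid))   # center nearest the cube's middle
    corners = list(itertools.product(*zip(low, high)))
    kept = []
    for zp in centers:
        # d(x, zp)^2 - d(x, z)^2 is linear in x, so over the cube it reaches its
        # extremes at the corners: zp survives iff some corner is closer to zp.
        if zp is z or any(sq_dist(c, zp) < sq_dist(c, z) for c in corners):
            kept.append(zp)
    return kept

centers = [(10, 10, 10), (500, 500, 500), (60, 40, 50)]
print(filter_centers((0, 0, 0), (100, 100, 100), centers))
# → [(10, 10, 10), (60, 40, 50)]: the far-away center is filtered out
```

A node passes the surviving list to its children, so the candidate set can only shrink on the way down the tree.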
The number of centers to be generated by an internal node equals the total number of centers to be generated by its children, and the process is recursive. Finally, the leaves generate the centers in their corresponding subspaces. Centers can be generated at random or in a deterministic way (for some applications the process must not contain any stochastic part). This way we ensure that the centers are generated in regions with a high density of data points.

3.2. Self Organizing Maps

The Self Organizing Map is a technique introduced by Professor Teuvo Kohonen which reduces the dimensionality of data vectors to a small number of dimensions [5]. The SOM not only reduces the dimensionality of the vectors but also groups similar data together: it converts the complex relations between vectors into simple geometric (usually two-dimensional) relations. Most SOMs have two dimensions, and so does the one we are using, so we concentrate on this case. The SOM consists of a two-dimensional grid of nodes. It has an input (a vector) and an output (a vector with two components). The input is connected to all the nodes of the network. Each node also has an associated model vector; the size of the model and of the input must be the same. When an input vector is presented to the network, the output should be the coordinates of the node whose model is most similar to the input vector. Training is an iterative algorithm: the same set of input vectors is shown repeatedly to the network. The more iterations, the better the solution found, at the cost of a longer training time. As the iteration number increases, the neighborhood and the ratio by which a model changes both shrink. We can consider a neighborhood function that gives the ratio by which a node is modified. Usually the neighborhood function follows a Gaussian distribution with parameters depending on the coordinates of the winner and on the iteration number.
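A minimal training loop along these lines might look as follows. This is a sketch, not the system's code: it assumes the standard exponential decay schedules for the neighborhood width σ and the learning step η, and the grid size, fixed seed and epoch count are illustrative (the lattice must be wider than 2 nodes so that log(σ(0)) is nonzero):

```python
import math, random

def train_som(data, width, height, dim, epochs=1000):
    """Sketch of SOM training with a gaussian neighborhood and exponentially
    decaying neighborhood width (sigma) and learning step (eta)."""
    random.seed(0)                                   # illustrative fixed seed
    grid = {(i, j): [random.random() for _ in range(dim)]
            for i in range(width) for j in range(height)}
    sigma0 = max(width, height) / 2.0                # initial radius of the lattice
    eta0 = 0.1                                       # initial learning step
    t1 = epochs / math.log(sigma0)                   # decay constant for sigma
    t2 = float(epochs)                               # decay constant for eta
    for n in range(epochs):
        sigma = sigma0 * math.exp(-n / t1)
        eta = eta0 * math.exp(-n / t2)
        for x in data:
            # competition: the node with the model closest to the input wins
            win = min(grid, key=lambda p: sum((a - b) ** 2 for a, b in zip(grid[p], x)))
            for p, model in grid.items():
                d2 = (p[0] - win[0]) ** 2 + (p[1] - win[1]) ** 2
                h = math.exp(-d2 / (2 * sigma * sigma))   # gaussian neighborhood
                for k in range(dim):                      # move model towards input
                    model[k] += eta * h * (x[k] - model[k])
    return grid

def best_match(grid, x):
    """Interrogation: coordinates of the node whose model is most similar to x."""
    return min(grid, key=lambda p: sum((a - b) ** 2 for a, b in zip(grid[p], x)))

# Tiny demonstration: a 4x4 map trained on two clusters of 2-D points.
grid = train_som([(0.0, 0.0), (0.05, 0.05), (1.0, 1.0), (0.95, 0.95)], 4, 4, 2, epochs=50)
```

After training, inputs from the two clusters should win at different regions of the grid.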
So the ratio decreases as the distance to the winner grows and as the iteration number increases. For the Gaussian function, σ (the standard deviation) is the key parameter, and it should decrease over time [6]. A generally used formula for sigma is

σ(n) = σ(0) · exp(-n/T1)

The other parameter, the amplitude (learning step) η, also decreases over time, and a usual formula is

η(n) = η(0) · exp(-n/T2)

Good initial parameters are σ(0) = the radius of the lattice and η(0) = 0.1.

When considering SOMs, there are two phases. The first phase is the training of the network. During this phase, all the input vectors are shown to the network, and the inner models of the nodes change so that in the end closer nodes correspond to closer models. The training is based on competition: for each data vector one node wins, namely the node containing the most similar model. As a reward, the winner and its neighbors are modified by a certain ratio towards the input vector, so that they become more similar to it. The models within the nodes are first initialized either at random or as a regular array of vector values (preferred especially if the SOM is developed for a specific task and more is known about the distribution of the vectors). During the training there are two important stages:
- self-organizing and ordering
- convergence
The self-organizing stage should take about 1000 epochs, and convergence at least 500 epochs. With these values, the training parameters become T1 = 1000/log(σ(0)) and T2 = 1000.

The second phase is the interrogation of the network. The SOM is already trained; if we present it a vector, it returns the coordinates of the winning node. Responses to similar vectors should fall in the same region. The dimensions of the network depend on the application.

3.3. GUI

The application offers a graphical user interface with full functionality. The same interface provides:
- an interface for administrating the system.
There is an option to create a new image database by selecting the folder containing the images; the features are extracted from them and the database is populated. The neural network is trained with all the data from the newly created database, and the images are indexed according to their responses from the SOM.
- an interface for users to query images.
The working database can be chosen. The user can select an image to query via the File - Query image option in the main menu, which opens a file dialog. The user can query with jpeg or gif images which need not exist in the database. The result of the query is displayed by showing the images ordered by relevance, in a thumbnail view; for each image, its distance to the query image is also shown. The status bar displays the number of similar images found, the cluster of the query image, and the cluster of the currently focused result image. The user can select the image database to query from the Settings – File menu.

[Example screenshot of the system's graphical user interface]

4. Evaluation

The purpose of evaluating a retrieval system, in our case the image retrieval system, is to make sure that the system works; to analyze its functionality and the resources it needs to perform its task; to measure how well it performs the retrieval task, so that conclusions can be drawn from the answer sets it returns; and to judge how user-friendly it is. The type of evaluation to be considered depends on the objectives of the retrieval system [3]. Usually the first type of analysis to consider is a functional analysis, in which the retrieval system's functions are tested one by one. Such an analysis should include an error-analysis phase, in which the misbehaviors and errors of the system are identified by trying to make the system fail.
In our case we did the error analysis using the black-box testing method, meaning that we did not spend much time searching for errors in the source code; instead we focused on the results the system generated, in other words mainly on evaluating the performance of the image retrieval system. The most common measures of system performance are time and space: the shorter the response time and the smaller the space used, the better the system is considered to be. There is an inherent tradeoff between space complexity and time complexity, which frequently allows trading one for the other. Regarding the resources the system uses: the space requirements include the collection of images and the collection of vectors, where each vector represents the properties of one image. The more colours are used when creating the vectors, the more space is needed, because the vector length grows as more colour information about the image is added to it. The time resources of the system depend on the hardware and on the number of vectors (i.e. the number of images), because with more images the system has to perform more comparisons between vectors. The other way to evaluate the system, which we mainly focus on in this chapter, is to analyze the answer set A of images the system generates for each user query. For each query we had a related set R of relevant images (the ones an ideally working system would return), which we judged to be similar to the given query image. There are two main measures in this type of approach which tell how reliable the answers of the system are. Let Ra denote the set of relevant images that were actually retrieved. The first measure, recall, is the fraction of the relevant images that has been retrieved; in other words, recall tells what percentage of the relevant images were retrieved:

Recall = |Ra|/|R|
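Both recall and the companion precision measure |Ra|/|A| can be computed directly from the sets, as in this small sketch. The image names are made up; the numbers reproduce test query 1 of the 2-colour results (|R| = 7, |A| = 10, |Ra| = 1):

```python
def recall_precision(relevant, answer):
    """relevant: the set R of images judged relevant for a query;
    answer: the set A of images the system actually returned."""
    ra = len(relevant & answer)     # |Ra|: relevant images that were retrieved
    return ra / len(relevant), ra / len(answer)   # (|Ra|/|R|, |Ra|/|A|)

# Test query 1 with 2-colour vectors: |R| = 7, |A| = 10, |Ra| = 1.
relevant = {f"rel{i}" for i in range(7)}
answer = {"rel0"} | {f"other{i}" for i in range(9)}
print(recall_precision(relevant, answer))  # → (0.14285714285714285, 0.1)
```

These match the 14.29 % recall and 10.00 % precision reported for that query.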
The second measure, precision, is the fraction of the retrieved images (the answer set) that is relevant; in other words, precision shows how well the system has managed to find images relevant to the query image:

Precision = |Ra|/|A|

[Example of precision and recall for a given query]

The average precision Pavg(r) at recall level r is computed by the formula

Pavg(r) = Σ(i=1..Nq) Pi(r)/Nq

where Nq is the number of queries and Pi(r) is the precision at recall level r for the i-th query.

4.1. Test plan

Here is a short description of the test plan for evaluating the performance of the system. We have two image databases: one with images of national flags, consisting of 251 images at 48x32 resolution, and the other one (120x118) with general images: people, animals, flowers, satellite images, cars and chemicals. For the evaluation we used the first one, and we made six image queries for each number of colours used to create the vectors. We created the image vectors by reducing the number of colours in the pictures to 2, 3, 4, 5 and 6, so we had a total of 30 queries, six for every colour amount. We used the same test images for each colour amount, because this way the differences are easier to see, as is whether the system's performance improves when the colour amount is increased.

4.2.
The results of the tests

The results of the performance evaluation are shown in the five tables below (R, A and Ra as defined in the previous section).

Result table using 2-colour vectors:
Test No   R    A   Ra   Recall     Precision
1         7   10    1   14.29 %    10.00 %
2         9   25    5   55.56 %    20.00 %
3        14   59   13   92.86 %    22.03 %
4         6   43    5   83.33 %    11.63 %
5        10   61    8   80.00 %    13.11 %
6        20   61   10   50.00 %    16.39 %
Average                 62.67 %    15.53 %

Result table using 3-colour vectors:
Test No   R    A   Ra   Recall     Precision
1         7   16    3   42.86 %    18.75 %
2         9   38    5   55.56 %    13.16 %
3        14    7    2   14.29 %    28.57 %
4         6   15    1   16.67 %     6.67 %
5        10   46    8   80.00 %    17.39 %
6        20   32   13   65.00 %    40.63 %
Average                 45.73 %    20.86 %

Result table using 4-colour vectors:
Test No   R    A   Ra   Recall     Precision
1         7   18    4   57.14 %    22.22 %
2         9   38    6   66.67 %    15.79 %
3        14   42    5   35.71 %    11.90 %
4         6   26    6  100.00 %    23.08 %
5        10   45    9   90.00 %    20.00 %
6        20   45   19   95.00 %    42.22 %
Average                 74.09 %    22.54 %

Result table using 5-colour vectors:
Test No   R    A   Ra   Recall     Precision
1         7   22    5   71.43 %    22.73 %
2         9   19    4   44.44 %    21.05 %
3        14   15    3   21.43 %    20.00 %
4         6   23    6  100.00 %    26.09 %
5        10   57    9   90.00 %    15.79 %
6        20   57   20  100.00 %    35.09 %
Average                 71.22 %    23.46 %

Result table using 6-colour vectors:
Test No   R    A   Ra   Recall     Precision
1         7   15    1   14.29 %     6.67 %
2         9   29    5   55.56 %    17.24 %
3        14   22   10   71.43 %    45.45 %
4         6   40    6  100.00 %    15.00 %
5        10   25    7   70.00 %    28.00 %
6        20   27   14   70.00 %    51.85 %
Average                 63.54 %    27.37 %

[Graphs of the average recall and precision, with the test groups on the x-axis: 1 = 2 colours, 2 = 3 colours, 3 = 4 colours, 4 = 5 colours, 5 = 6 colours]

The time the system uses to perform the main tasks is shown below:

Colours                                2      3      4      5      6
Network learning time (2000 epochs)    3 m    7 m   13 m   18 m   25 m
Performing the query                   0.4 s  0.5 s  0.5 s  0.6 s  0.6 s

4.3. Conclusions

The system performs best when the vectors contain information about four colours, as can be seen from the test results, where the average recall is computed for each group of tests. In these test cases we used images of flags, which are quite simple and do not contain many colours. One reason for using images as simple as flags was that it was easier to specify the relevant images for each query image, so that we could compute the recall and the precision more reliably. If we had used real-world images such as photographs, we would presumably have seen the system work better as the colour information in the vectors increased, because the comparisons between the vectors would be more precise. As the time results show, the network learning time increases when the colour information in the vectors is increased. Here we see the trade-off between time and space on one side and the precision of the system's answer sets on the other. Our tests show that the system works quite well with images that contain a relatively small number of colours. Beyond the tests documented in this chapter, the queries we made showed that the system also works well with more complicated images. In conclusion, we are pleased with the system's behavior.

5. Description of other systems in the field

Many collections of digital images have been created by digitizing existing collections of photographs, diagrams, drawings, paintings and prints in different domains. Research in image retrieval techniques has become more active since the 1970s, as a study field of the database management and computer vision communities. Corresponding to these two communities, two ways to retrieve images developed: text based and visual based. The first approaches used text to annotate the images, and retrieval was then text based.
However, there are two major limitations: first, the annotations must be made by humans and the number of images can be huge; second, limitations arise from the rich content of an image and the subjectivity of human perception. In the 1990s, content-based image retrieval was proposed as a way to overcome these difficulties. Instead of being manually annotated, images are indexed by visual content information. Recently, MPEG (Moving Picture Experts Group) developed the MPEG-7 standard, formally named "Multimedia Content Description Interface", for describing multimedia content data [7]. There are two tasks that every system should implement: one is feature extraction and the other is the indexing of the feature vectors. Due to the subjectivity of perception, there is no single best representation for a given feature. The most used features are color, texture, shape and color layout. For each of them there are several models, each with its strong points as well as weak points. Even after feature extraction reduces the data considerably, the dimension of the vectors can still be high (typically on the order of hundreds of components). Considering also that the database can be large and that the response should be as fast as possible, most systems also implement an indexing algorithm. The high dimensionality of the vectors and non-Euclidean similarity measures are the biggest problems in indexing. Three research communities contribute to this area: computational geometry, database management and pattern recognition [9]. The most used indexing techniques are the bucketing algorithm, k-d tree, priority k-d tree, quad-tree, K-D-B tree, hB-tree and R-tree. Among newer techniques, clustering and neural networks show promising results.
Most of the systems support one or more of the following search options:
- random browsing;
- search by example;
- search by sketch;
- search by text (keyword or speech);
- navigation with customized image categories.
Here are some of the most representative systems that have been developed.
QBIC is the first commercial system, developed by IBM. Queries can be example images, user-constructed sketches, selected colors or textures, etc. As features, it includes color, texture and shape. It uses an R*-tree as its multidimensional indexing structure. Its demo is at http://wwwqbic.almaden.ibm.com .
Virage is a system developed at Virage Inc. It supports queries based on color, color layout, texture and structure. It goes further than QBIC in that it supports combinations of these queries, with weights for each. Its demo is at http://www.virage.com/cgi-bin/query-e .
RetrievalWare is an engine developed by Excalibur Technologies Corp. It uses neural networks in the retrieval process. As features, it uses color, shape, texture, brightness, color layout and aspect ratio. Its demo page is http://www.virage.com/cgi-bin/query-e .
Photobook is a set of interactive tools for browsing and searching images, developed at the MIT Media Lab. It uses shape, texture and face features. Its more recent versions propose including the human in the image annotation and retrieval loop, the motivation being that no single feature can best model images from every domain.
VisualSEEk and WebSEEk: the first is a visual feature search engine and the latter a WWW-oriented search engine, both developed at Columbia University. The visual information is based on color sets and wavelet transform based features. The demo page is http://www.ee.columbia.edu/~sfchang/demos.html .
PicSOM was developed at the Laboratory of Computer and Information Science at Helsinki University of Technology [8]. It is the first system that uses self-organizing maps as a means of indexing images.
The system tries to adapt to the user's preferences regarding the similarity of images in the database. Its features describe the color content, texture, shape and structure of the images.

5.1. Comparison to other systems

In general, our system follows the common architecture of other systems in the field. It integrates both important parts of an image retrieval system, the feature extraction and the indexing system, so its architecture is the same as that of the majority of systems. Due to the lack of time and knowledge, the features include only color information. However, the algorithms used for color extraction and for computing the distance between histograms are among the best in their field. For indexing, we followed a very promising technique, the self-organizing map, which has come into use only in recent years. Of the other systems we have studied, only PicSOM uses the same indexing technique. Their SOM is a special one (a tree-structured SOM), but considering that for now we work only with small databases, our implementation fits well. With the help of our initialization technique, the training times stay in a normal range; in a future version working with larger amounts of data we will have to reconsider this problem. The indexing part can also be developed further. We can keep the general SOM approach but implement a tree-structured SOM, which improves both the organization and the training time. Another, more deterministic indexing technique, based for example on a tree organization structure, could also be developed. Even the space of features can be clustered, similarly to the way the colors are clustered.

6. Future development

The system can be developed further; in fact the architecture was specifically designed to permit easy development. Feature extraction: currently the features are associated only with colors. The next step will be to add some spatial information, which should not be too hard.
The easiest way is to divide the image into sub-blocks and to perform the same color extraction algorithm on each block. Another way would be to extract the colors as now and then associate with each color a region of the image (a bounding rectangle, for example). Of course, some statistical parameters can also be extracted, such as the dispersion of the colors. Some information about shape can be added: first run an edge detection algorithm and then process the resulting information. Texture is harder to model. Filters can be applied to the original images and their responses analyzed to obtain information about the texture. Alternatively, after edge detection, the general orientation and some statistical information about it can be obtained. More complex transforms, such as the Fourier transform or, better, a wavelet transform, can also be applied and the result stored as part of the feature vector.

7. References

[1] E. Sutinen, "Information Retrieval, Lecture Notes", The University of Joensuu.
[2] P. Fränti, "Image Analysis, Lecture Notes", The University of Joensuu.
[3] R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, ACM Press, London, 1999.
[4] T. Kanungo, D.M. Mount, N.S. Netanyahu, C.D. Piatko, R. Silverman and A.Y. Wu, "An Efficient k-Means Clustering Algorithm: Analysis and Implementation", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, July 2002.
[5] T. Kohonen, The Self-Organizing Map (SOM), WWW, http://www.cis.hut.fi/projects/somtoolbox/theory/somalgorithm.shtml, 19.10.2003.
[6] S. Haykin, Neural Networks: A Comprehensive Foundation.
[7] International Organization for Standardization, ISO/IEC JTC1/SC29/WG11, Coding of Moving Pictures and Audio, March 2003.
[8] M. Koskela, Content-Based Image Retrieval with Self-Organizing Maps, WWW, http://www.cis.hut.fi/picsom/thesis-koskela.pdf, 15.10.2003.
[9] Y.
Rui and T.S. Huang, "Image Retrieval: Current Techniques, Promising Directions and Open Issues", Journal of Visual Communication and Image Representation, 10:39-62, January 1999.
[10] A. Mojsilovic, J. Hu and E. Soljanin, "Extraction of Perceptually Important Colors and Similarity Measurement for Image Matching, Retrieval, and Analysis", IEEE Transactions on Image Processing, vol. 11, no. 11, November 2002.