A fuzzy video content representation for video summarization and content-based retrieval
Anastasios D. Doulamis, Nikolaos D. Doulamis, Stefanos D. Kollias
2000
Presented by Mohammed S. Al-Logmani

Agenda
• Introduction
• Motivation/Problem Statement
• Video Sequence Analysis
• Fuzzy Visual Content Representation
• Video Summarization
• Content-Based Retrieval
• Experimental Results
• Future Work
• Conclusion

Introduction
• The increasing amount of digital image and video data requires new technologies for efficient searching, indexing, content-based retrieval, and management of multimedia databases.
• Drawbacks of keyword annotations:
  • They take a large amount of effort to develop.
  • Text alone cannot efficiently characterize rich visual content.

Introduction Cont.
• Content-based algorithms
  • QBIC
  • VisualSEEk
  • Virage
• These cannot easily be applied to video databases.
  • Performing queries on every frame is inefficient and time consuming.
  • Video databases are distributed, which imposes large storage and transmission requirements.

Introduction Cont.
• Content-based sampling algorithms
  • Extract a small but meaningful amount of information (summarization)
  • Require a more meaningful representation of visual content than the traditional pixel-based one
• Related work:
  • A hidden Markov model for color image retrieval
  • An approach to image retrieval based on user sketches
  • A hierarchical color clustering method
  • Construction of a compact image map or image mosaics for video summarization
  • A pictorial summary of video sequences based on story units

Motivation/Problem Statement
• Increase the flexibility of content-based retrieval systems
• Provide an interpretation closer to human perception
• Produce a more robust description of visual content
  • Possible instabilities of the segmentation are reduced
• Fuzzy representation of visual content
• Video summarization
  • Performed by minimizing a cross-correlation criterion among the video frames using a genetic algorithm (GA).
  • The correlation is computed from several features extracted by a color/motion segmentation and formulated as a fuzzy feature vector.
• Content-based indexing and retrieval
  • The user provides queries (images or sketches), which are analyzed in the same way as video frames in the video summarization scheme.
  • A distance metric or similarity measure is then used to find the set of frames that best match the user's query.

Video Sequence Analysis
• A color/motion segmentation algorithm is applied for visual content description
• Multiresolution Recursive Shortest Spanning Tree (M-RSST)
  • Recursively applies the RSST to images of increasing resolution (a truncated image pyramid is created)
  • Produces the same results as RSST in less time
  • Eliminates small segments

Video Sequence Analysis cont.
• Factors affecting the segmentation efficiency
  • The initial image resolution level
    • Selected to be 3 (downsampling by 8x8 pixels)
  • The threshold used for terminating the algorithm
    • Euclidean distance of the color or motion intensities between two neighboring segments
    • The segmentation terminates if no segments are merged from one step to the next.

Fuzzy visual content representation
• The size and location of segments cannot be used directly
  • The number of segments is not constant across video frames
• To overcome this problem, segments are classified into pre-determined classes of color/motion properties
• To avoid classifying two similar segments into different classes, each segment is allocated a degree of membership in each class
  • This results in a fuzzy classification formulation
• A fuzzy multidimensional histogram is created

Fuzzy visual content representation Cont.
• Example: a single property s is used for each segment.
• s takes values in [0,1]
• It is classified into Q classes using Q membership functions
• The nth membership function gives the degree of membership of s in the nth class

Fuzzy visual content representation Cont.
• Assume a video frame consists of K segments
• First, evaluate the degree of membership of the feature s_i, i = 1, 2, ..., K, of the ith segment
• Then, combine the K segment memberships into the frame's degree of membership in the nth class through the fuzzy histogram

Video summarization
• Extraction of key-frames
  • Key-frames are extracted by minimizing a cross-correlation criterion, so that the selected frames are not similar to each other.
• The genetic algorithm (GA)
  • The key-frame selection problem has similarities to the traveling salesman problem (TSP).
  • Initially, a population of m chromosomes is created.
  • The performance of all chromosomes in population P(n) is evaluated using a correlation measure.
  • Chromosome quality is evaluated using fitness functions.
  • Parents are selected so that a fitter chromosome produces a higher number of offspring.
  • The GA terminates when the best chromosome fitness remains constant for a large number of generations.

Video summarization Cont.
• About 170 shots were examined, with number of key-frames Kf = 6 and Q = 3

Content-based retrieval
• Apply the previous scheme to discard all redundant temporal video information
• The user can submit:
  • Images (query by example)
  • Sketches (query by sketch)
• Analyze the query using M-RSST
• Extract and classify the segments
• Apply a distance or similarity measure

Experimental results
[Result figures not recoverable from the slide text]

Future Work
• Increase the system accuracy by developing a fuzzy adaptive mechanism for estimating the distance weights.

Conclusion
• Presented a fuzzy video content representation
• Efficient for:
  • Video summarization
  • Content-based image indexing and retrieval
• Experimental results indicate that this approach outperforms the other methods in both accuracy and computational efficiency
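
Illustration: fuzzy histogram
The fuzzy histogram from the "Fuzzy visual content representation" slides can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the slides do not give the shape of the membership functions or the normalization, so the evenly spaced triangular functions and the averaging over the K segments are assumptions, and the names `triangular_memberships` and `fuzzy_histogram` are hypothetical.

```python
def triangular_memberships(s, num_classes):
    """Degrees of membership of a property s in [0, 1] in each of Q classes.

    Assumption: Q evenly spaced, overlapping triangular membership functions
    centred at 0, 1/(Q-1), ..., 1. The slides do not specify the shape.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    width = 1.0 / (num_classes - 1)  # spacing between class centres
    return [max(0.0, 1.0 - abs(s - n * width) / width)
            for n in range(num_classes)]


def fuzzy_histogram(segment_values, num_classes):
    """Fixed-length frame descriptor built from a variable number of segments.

    Assumption: the nth bin is the mean membership in class n over the
    K segments of the frame.
    """
    k = len(segment_values)
    if k == 0:
        raise ValueError("frame has no segments")
    hist = [0.0] * num_classes
    for s in segment_values:
        for n, mu in enumerate(triangular_memberships(s, num_classes)):
            hist[n] += mu / k
    return hist


# A frame with K = 2 segments and Q = 3 classes.
print(fuzzy_histogram([0.25, 0.5], 3))
```

Because neighboring classes overlap, a segment whose property lies near a class boundary contributes to both classes, so a small change in the segmentation shifts the histogram only slightly. The descriptor also has length Q regardless of how many segments the frame contains.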