Performance Comparison of Traditional Metrics for Transition Detection in Video in Presence of Object & Camera Motion

Ashishgoud Purushotham, M.Tech., H.O.D., E.C.E., J.D.C.T., Indore, India, ashishgoudp@gmail.com, +91 9754207233
G. Usha Rani, M.Tech., Asst. Prof., E.C.E., J.D.C.T., Indore, India, usharanigodugu@gmail.com
Yeshvant Birla, M.Tech., Asst. Prof., E.C.E., J.D.C.T., Indore, India, yeshvantbirla@gmail.com
Margi S. Kanani, M.Tech. student, E.C.E., J.D.C.T., Indore, India, margik.90@gmail.com

ABSTRACT: Video has swiftly become a fundamental component of today's multimedia applications such as virtual walkthroughs and video on demand. Shot changes can be categorized as gradual or abrupt transitions. Shot transition detection is a main step in video analysis, summarization and retrieval. Video editing special effects such as dissolve, fade-in, fade-out and wipe, and camera movements such as tilt, pan and zoom, produce gradual transitions. Because such effects are so widespread, segmenting a video into shots can be very difficult. Abrupt transitions are comparatively easy to detect because the frames of two consecutive shots are largely uncorrelated, and many algorithms have been proposed for detecting video shot boundaries. Detecting gradual transitions in the presence of object and camera motion is an even tougher task. Shot boundaries can be divided into two types: cut transition (CT) and gradual transition (GT). A shot is the visual content captured by a single camera, in which the sequence of frames has no major changes.

Fig. 1. Consecutive frames with abrupt transitions

In this paper, we compare the results of traditional metrics: twin comparison, likelihood ratio and wavelet decomposition. The experimental study uses a number of videos that include dissolve transitions and fast motion of camera and objects.
The performance of these three metrics is compared.

Keywords: Shot boundary detection, gradual transition, dissolve, fades, wipe.

I. INTRODUCTION

With the development of the Internet and video processing technology, acquiring, storing and transmitting video have become more and more convenient, and the amount of video on the Internet has grown greatly. Videos are widely used in areas such as sports, education and entertainment. Video browsing and retrieval are time consuming and difficult for users, because video contains a large amount of important information, is high-dimensional, and has no obvious structure. Searching for the video sections of interest is therefore a very difficult task, and developing new tools and technologies for efficient and effective indexing, browsing and retrieval of video data remains a challenge. Shot boundary detection is the initial step in video retrieval. A transition between shots is either gradual or abrupt.

Fig. 1 shows consecutive frames with an abrupt transition. In an abrupt transition (AT), the change from one shot to the next occurs suddenly. All non-cut transitions are gradual transitions (GT), including dissolves, fade in/out, and wipes. During a fade, some monochrome frames separate the two shots temporally and spatially. A dissolve is a transition in which the intensity of the next shot progressively increases while the intensity of the first shot progressively decreases. A wipe occurs when pixels from the second shot replace those of the first shot in a regular pattern. Consecutive frames of a dissolve transition are shown in Fig. 2.

Fig. 2. Consecutive frames of dissolve transitions

This paper focuses on the detection of gradual transitions in the presence of motion. Specifically, we use a B-spline interpolation curve fitting technique [3] for estimating the associated linear-like production features and make use of the "goodness" of fit to detect the presence of dissolve transition effects. The rest of the paper is structured as follows.
Section II contains a brief survey of previous work. The metrics used for evaluation of shot boundary detection are presented in Section III. Section IV describes the test video sequences, the evaluation criterion and the results. Finally, Section V concludes the paper.

II. PREVIOUS WORK

Several approaches have been proposed for shot boundary detection. Amiri and Fathy [1] proposed video shot boundary detection using generalized eigenvalue decomposition and Gaussian transition detection. Y.-N. Li et al. [2] used thresholding and bisection-based processing of the large number of non-boundary frames. Using motion intensity and a motion suppression value, Yang Xu et al. [4] proposed a 3-DWT based motion suppression algorithm. Jun Li et al. [5] used color and edge features in different directions obtained from wavelet transform coefficients. Vasileios Chasanis et al. [6] proposed an algorithm for detecting abrupt cuts and dissolves using color histograms and the χ² value. The algorithm proposed in [7] is based on a K-step slipped window and uses low-level features and editing features of the video shot. Using global features such as the color histogram, Mohanta et al. [8] proposed model-based shot boundary detection. Na Lv et al. [9] proposed a video shot boundary detection algorithm using a gray variance based method and a block color histogram. Using C frames, T. Lu and P. N. Suganthan [10] proposed an accumulation-based algorithm. A classification algorithm for shot boundary detection using the kappa coefficient was proposed by Koumousis et al. [11]. Liu and Li [12] proposed a video segmentation algorithm that calculates the difference between the DC images of all I-frames.

After reviewing the literature in detail, we found that most of the algorithms are unable to differentiate between gradual transitions and motion. Most algorithms that give good results on hard cuts fail on soft cuts. Hard cuts usually involve sudden and extensive changes in visual content, while soft cuts show slow and gradual changes. We describe the major metrics and methods in Section III and use them to detect gradual transitions in the test video sequences.

III. MAJOR METRICS USED FOR EVALUATION

Likelihood Ratio: Based on the assumption of uniform second-order statistics, Jain et al. [13] computed a likelihood ratio test. It is a typical hypothesis test in which a ratio of probabilities is used as the test statistic. This statistical method expands on the idea of pixel difference by breaking the images into regions and comparing statistical measures of the pixels in those regions. It is defined as

$$\mathrm{LHR}(i) = \frac{\left[\dfrac{\sigma_i + \sigma_{i+1}}{2} + \left(\dfrac{\mu_i - \mu_{i+1}}{2}\right)^2\right]^2}{\sigma_i \times \sigma_{i+1}} \tag{1}$$

where LHR is the likelihood ratio involving two consecutive regions, $\mu_i$ is the mean of the current frame and $\sigma_i$ is the standard deviation of the current frame; $\mu_{i+1}$ and $\sigma_{i+1}$ are the corresponding values for the next frame. A limitation of the likelihood ratio is that no change is detected when the two compared images have the same mean and variance but entirely different probability density functions.

Flow chart: Video → Frames → Apply likelihood ratio → Thresholding → Dissolve detection

Twin Comparison: The twin-comparison shot detection scheme [14] is a well-known method for detecting both gradual and abrupt transitions. It makes use of two thresholds, one for locating gradual transitions and the other for finding abrupt transitions.

Flow chart: Video → Convert into frames → Apply twin comparison algorithm → Thresholding → Dissolve detection

In addition to interframe differences, the twin-comparison method uses cumulative differences between the frames of a gradual transition. The algorithm assumes that two frames which have a common background and unchanging objects will show little difference in their histograms.
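The two-threshold idea described above can be illustrated with a minimal sketch. It assumes a precomputed list of frame-to-frame differences; the function name, the accumulation rule (summing consecutive differences while they stay above the lower threshold) and the threshold values are illustrative assumptions, not the exact procedure of [14].

```python
from typing import List, Tuple


def twin_comparison(diffs: List[float], t_low: float, t_high: float
                    ) -> Tuple[List[int], List[Tuple[int, int]]]:
    """Two-threshold shot detection on a sequence of frame differences.

    diffs[i] is the difference between frame i and frame i + 1.
    Returns (cuts, graduals): cuts holds the index of the first frame of
    each new shot, graduals holds (start_frame, end_frame) pairs.
    """
    cuts: List[int] = []
    graduals: List[Tuple[int, int]] = []
    start = None          # first frame of a candidate gradual transition
    accumulated = 0.0     # cumulative difference since the candidate began

    for i, d in enumerate(diffs):
        if d >= t_high:
            # A single large difference is treated as an abrupt transition.
            # Simplification: any open candidate is discarded here.
            cuts.append(i + 1)
            start, accumulated = None, 0.0
        elif d >= t_low:
            # Moderate difference: open or extend a candidate transition.
            if start is None:
                start, accumulated = i, 0.0
            accumulated += d
        else:
            # Difference fell back below t_low: close the candidate and
            # accept it only if the cumulative difference is large enough.
            if start is not None and accumulated >= t_high:
                graduals.append((start, i))
            start, accumulated = None, 0.0

    # Candidate still open when the sequence ends.
    if start is not None and accumulated >= t_high:
        graduals.append((start, len(diffs)))
    return cuts, graduals


# Hypothetical difference values: one cut, then one slow transition.
example = [1, 1, 50, 1, 6, 7, 8, 6, 1, 1]
print(twin_comparison(example, t_low=5, t_high=20))  # ([3], [(4, 8)])
```

In this sketch a short bump that exceeds the lower threshold but never accumulates to the higher one is discarded, which is how the scheme tries to avoid reporting small fluctuations as transitions.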
The basic formulation for histogram comparison is as follows: the histogram (either color or grayscale) is computed for each frame and the difference is calculated as

$$D(i, i+1) = \sum_{j=0}^{B-1} \left| H_i(j) - H_{i+1}(j) \right| \tag{2}$$

where $H_i(j)$ is the $j$th element of the histogram of the $i$th frame, and $B$ is the number of bins in the histogram. Here, we have used a difference function defined by the histogram intersection:

$$D(i, i+1) = 1 - \mathrm{Intersection}(H_i, H_{i+1}) = 1 - \frac{1}{N} \sum_{j=0}^{B-1} \min\left(H_i(j), H_{i+1}(j)\right) \tag{3}$$

where $N$ is the normalizing factor of the histogram.

Wavelet based metric: A Daubechies second-order, third-level decomposition filter is used for decomposition of each frame, as shown in Fig. (3). The metric is computed on the LL3 sub-band of consecutive frames:

$$\mathrm{WDBF}(k) = \sum_{i=1}^{N/2^{l}} \sum_{j=1}^{M/2^{l}} \left[ LL3(i, j, k) - LL3(i, j, k+1) \right]^2 \tag{5}$$

where $N \times M$ is the image size, $l$ is the number of decomposition levels and $k$ is the frame number.

Flow chart: Video → Convert into frames → RGB to gray conversion → Apply 3-level wavelet decomposition → Use LL3 part → Apply WDBF metric → Thresholding → Dissolve detection

IV. TEST VIDEO AND EVALUATION CRITERION USED FOR METRIC EVALUATION

Test Video Sequence: All algorithms have been tested on clips from the movies Landscape, Alvin & Chipmunks part 2 and Titanic. These video sequences were inspected manually, frame by frame, using the VirtualDub software to find the actual transitions. We mostly considered video clips from these movies that contain dissolve transitions together with fast camera and object motion.

Evaluation criterion: Two metrics, precision and recall, are used to evaluate the performance of the shot boundary detection algorithms. Recall is defined as

$$R = \frac{C}{C + M} = \frac{C}{D} \tag{6}$$

whereas precision is defined as

$$P = \frac{C}{C + FP} \tag{7}$$

where $D$ is the total number of actual frames with dissolve boundaries, $C$ is the number of dissolve frames correctly detected by the algorithm, $M$ is the number of dissolve frames missed by the algorithm and $FP$ is the number of false positives detected by the algorithm. The F1 measure is used to rank the performance of the different algorithms [4]. F1 combines recall and precision with equal weight.
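The recall and precision definitions above, together with their equally weighted harmonic mean (F1), translate directly into code. The following is a minimal sketch; the function names and the example counts are illustrative assumptions, and returning 0 when a denominator is zero is a convention chosen here rather than something stated in the paper.

```python
def recall(correct: int, missed: int) -> float:
    """R = C / (C + M), where C + M is the number of actual dissolve frames D."""
    total_actual = correct + missed
    return correct / total_actual if total_actual else 0.0


def precision(correct: int, false_positives: int) -> float:
    """P = C / (C + FP), the fraction of detections that are correct."""
    total_detected = correct + false_positives
    return correct / total_detected if total_detected else 0.0


def f1_measure(r: float, p: float) -> float:
    """Harmonic mean of recall and precision with equal weight."""
    return 2 * r * p / (r + p) if (r + p) else 0.0


# Hypothetical counts: 80 dissolve frames found, 20 missed, 120 false alarms.
r = recall(80, 20)
p = precision(80, 120)
print(r, p, round(f1_measure(r, p), 4))
```

The functions work equally well on percentages: with the values reported for LHR on AC2 in Table 2 (R = 49.50, P = 42.17), f1_measure returns approximately 45.54.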
The F1 measure is the harmonic mean of recall and precision and is given by

$$F1(R, P) = \frac{2 \times R \times P}{R + P} \tag{8}$$

Fig. 3. Three-level wavelet decomposition

Evaluation Results of the Traditional Shot Boundary Detection Metrics:

The wavelet transform [16] is used as a shot detection metric. The performance of the traditional metrics twin comparison, WDBF and likelihood ratio is compared in Table 2. Overall, it has been observed that, because of fast camera and object motion, WDBF and twin comparison give poor results in terms of false positives, while the likelihood ratio gives better results than the other metrics, as shown in Table 2.

We selected a test video sequence which contains significant camera motion together with dissolve transitions. Fig. 4 shows consecutive frames from the movie Landscape. The part of the video used for analysis contains 409 frames, with camera motion (frames 3-84) and 69 frames of dissolve transition (frames 79-96 and 190-242).

Table 1. Number of frames considered for analysis from the test video sequences (AC2: Alvin & Chipmunks part 2, LS: Landscape, TI: Titanic)

Movie   Number of frames   No. of dissolves
AC2     2069               162
LS      1203               529
TI      4392               909

Table 2. Performance comparison results between WDBF, likelihood ratio (LHR) and twin comparison

Metric            Video   R       P       F1
LHR               AC2     49.50   42.17   45.54
LHR               LS      45.98   43.56   44.73
LHR               TI      56.89   51.89   54.27
WDBF              AC2     36.20   36.77   36.48
WDBF              LS      41.78   40.89   41.13
WDBF              TI      45.56   46.76   45.65
Twin comparison   AC2     76.30   11.20   19.53
Twin comparison   LS      73.67   20.98   19.53
Twin comparison   TI      79.56   18.90   30.54

Fig. 4. Consecutive frames from the movie clip Landscape showing dissolve transition (frames 16, 32, 54, 124, 187, 217, 289, 293, 301, 308, 315, 320, 333, 338, 352 and 363)

V. CONCLUSION

In this paper, we have compared traditional metrics: wavelet decomposition, twin comparison and likelihood ratio.
It has been observed that the likelihood ratio gives better results in identifying gradual effects such as dissolve, fade and wipe transitions. Its precision and F1 measure are higher than those of the other metrics on all three videos; twin comparison achieves higher recall, but with very low precision. We have extensively tested the major metrics and presented experimental results obtained by applying these approaches to video sequences from three different movies.

Fig. 5(a). Result using likelihood ratio (likelihood ratio versus frame index)
Fig. 5(b). Results using twin comparison (twin comparison versus frame index)
Fig. 5(c). Result using WDBF metric (WDBF versus frame index)

References:

[1]. Ali Amiri, Mahmood Fathy, "Video shot boundary detection using generalized eigenvalue decomposition and Gaussian transition detection", Computing and Informatics, vol. 30, pp. 595-619, 2011.
[2]. Y.-N. Li, Z.-M. Lu, X.-M. Niu, "Fast video shot boundary detection framework employing pre-processing techniques", IET Image Processing, vol. 3, iss. 3, pp. 121-134, 2009.
[3]. Jeho Nam, Ahmed H. Tewfik, "Detection of gradual transitions in video sequences using B-spline interpolation", IEEE Transactions on Multimedia, vol. 7, iss. 4, pp. 667-678, Aug. 2005.
[4]. Yang Xu, Xu De, Gaun Tengfei, Wu Aimin, Lang Congyan, "3 DWT based motion suppression for video shot boundary detection", Knowledge-Based Intelligent Information and Engineering Systems, Lecture Notes in Computer Science, vol. 3682, pp. 1204-1209, 2005.
[5]. Jun Li, Youdong Ding, Yunyu Shi, Qingyue Zeng, "DWT-based shot boundary detection using support vector machine", Information Assurance and Security, vol. 1, pp. 435-438, 18-20 Aug. 2009.
[6]. Vasileios Chasanis, Aristidis Likas, Nikolaos Galatsanos, "Simultaneous detection of abrupt cuts and dissolves in videos using support vector machines", Pattern Recognition Letters, vol. 30, pp. 55-65, 2009.
[7]. Tuanfa Qin, Jiayu Gu, Huiting Chen, Zhenhua Tang, "A fast shot boundary detection based on K-step slipped window", 2nd IEEE International Conference on Network Infrastructure and Digital Content, pp. 190-195, 24-26 Sept. 2010.
[8]. Partha Pratim Mohanta, Sanjoy Kumar Saha, Bhabatosh Chanda, "A model-based shot boundary detection technique using frame transition parameters", IEEE Transactions on Multimedia, vol. 14, iss. 1, pp. 223-233, Feb. 2012.
[9]. Na Lv, Zhiquan Feng, Jingliang Peng, "Mutual information based video shot boundary detection", Image Analysis and Signal Processing, pp. 1-5, 9-11 Nov. 2012.
[10]. Tong Lu, P. N. Suganthan, "An accumulation algorithm for video shot boundary detection", Multimedia Tools and Applications, vol. 22, iss. 1, pp. 89-106, Jan. 2004.
[11]. K. Koumousis, V. Fotopoulos, A. N. Skodras, "A new approach to gradual video transition detection", Informatics (PCI), pp. 245-249, 5-7 Oct. 2012.
[12]. Liu Liu, Jian-Xun Li, "A novel shot segmentation algorithm based on motion edge feature", 2010 Symposium on Photonics and Optoelectronic, pp. 1-5, 19-20 June 2010.
[13]. R. Jain, R. Kasturi, B. Schunck, Machine Vision, pp. 406-415, McGraw-Hill, New York, 1995.
[14]. Fa-Xin Yu, Zhe-Ming Lu, Yue-Nan Li, "Dissolve detection based on twin-comparison with curve fitting", International Journal of Innovative, ISSN 1349-4198, vol. 7, no. 5(A), pp. 2417-2426, May 2011.
[15]. I. K. Sethi, N. Patel, "A statistical approach to scene change detection", SPIE Proc. Storage and Retrieval for Image and Video Databases III, vol. 2420, pp. 329-338, 1995.
[16]. Khin Thandar Tint, Kyi Soe, "Key frame extraction for video summarization using DWT wavelet statistics", International Journal of Research in Computer Engineering & Technology, vol. 2, no. 5, May 2013.