Performance Comparison of Traditional Metrics For Transition Detection
in Video in Presence of Object & Camera Motion
Ashishgoud Purushotham
M.Tech., H.O.D., E.C.E., J.D.C.T., Indore, India
ashishgoudp@gmail.com
+91 9754207233
G. Usha Rani
M.Tech., Asst. Prof., E.C.E., J.D.C.T., Indore, India
usharanigodugu@gmail.com
Yeshvant Birla
M.Tech., Asst. Prof. E.C.E., J.D.C.T., Indore, India
yeshvantbirla@gmail.com
Margi S. Kanani
M.Tech. student, E.C.E., J.D.C.T., Indore, India
margik.90@gmail.com
ABSTRACT:
Video has swiftly become a fundamental component of today’s multimedia applications such as virtual walkthrough and video on demand. Shot changes can be categorized as gradual and abrupt transitions. The main application of shot transition detection methods is video analysis, summarization and retrieval. Gradual transitions include video editing special effects such as dissolve, fade-in, fade-out and wipe, as well as camera movements such as tilt, pan and zoom. Because such effects are widely used, segmenting video into shots can be very difficult. Abrupt transitions are comparatively easy to detect because the frames of consecutive shots are largely uncorrelated, and many algorithms have been proposed for detecting video shot boundaries. Finding gradual transitions in the presence of object and camera motion is an even tougher task. In this paper, we compare the results of traditional metrics, namely twin comparison, likelihood ratio and wavelet decomposition. The experimental study uses a number of videos that include dissolve transitions and fast motion of camera and objects, and the performance of the three metrics is compared on them.
Keywords
Shot boundary detection, gradual transition, dissolve, fades, wipe.
I. INTRODUCTION
With the development of the Internet and video processing technology, acquiring, storing and transmitting video has become more and more convenient, and the amount of video on the Internet has greatly increased. Areas such as sports, education and entertainment use video widely. Video browsing and retrieval is time consuming and difficult for users because video carries a large amount of information, is high-dimensional and has no obvious structure. Searching for the video sections of interest is therefore a very difficult task.
Developing new tools and technologies for efficient and effective indexing, browsing and retrieval of video data has therefore been a challenging task. In video retrieval, shot boundary detection is the initial step. A shot is the visual content captured by a single camera, that is, a sequence of frames with no key changes. The transition between shots is gradual or abrupt: according to whether the change is abrupt or not, shot boundaries can be divided into two types, cut transition (CT) and gradual transition (GT).
Fig. 1 Consecutive frames with abrupt transitions
Fig. 2 Consecutive frames of dissolve transitions
Fig. 1 shows consecutive frames with an abrupt transition (AT), in which the transition from one shot to another occurs suddenly. All non-cut transitions are gradual transitions (GT), including dissolves, fade in/out and wipes. During a fade, some monochrome frames separate the two shots temporally and spatially. A dissolve in a video sequence is a transition in which the intensity of the incoming shot progressively increases while the intensity of the outgoing shot progressively decreases. A wipe occurs when pixels from the second shot replace those of the first shot in a regular pattern. Consecutive frames of a dissolve transition are shown in Fig. 2. This paper focuses on the detection of gradual transitions in the presence of motion. Specifically, we use a B-spline interpolation curve fitting technique [3] for estimating the associated linear-like production features and make use of the “goodness” of fit to detect the presence of dissolve transition effects.
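The dissolve described above is often modeled as a linear cross-fade between the outgoing and the incoming shot. The sketch below illustrates that model only; the linear weighting is an assumption that the paper does not state, and `dissolve_frames` and the tiny 2 x 2 frames are hypothetical names and data chosen for illustration.

```python
def dissolve_frames(first, second, length):
    """Blend two equally sized grayscale frames over `length` steps.

    Assumption: a linear cross-fade, where the weight of `second`
    rises from 0 to 1 while the weight of `first` falls from 1 to 0.
    """
    frames = []
    for t in range(length):
        alpha = t / (length - 1) if length > 1 else 1.0
        frames.append([[(1 - alpha) * a + alpha * b for a, b in zip(row_a, row_b)]
                       for row_a, row_b in zip(first, second)])
    return frames


# Hypothetical shots: a bright frame dissolving into a dark one.
outgoing = [[200, 200], [200, 200]]
incoming = [[50, 50], [50, 50]]
clip = dissolve_frames(outgoing, incoming, 5)

# Mean intensity of each blended frame.
means = [sum(map(sum, frame)) / 4 for frame in clip]
print(means)
```

Under this model the mean intensity changes steadily across the transition, which is why a slow, monotonic trend in a frame statistic is a common cue for dissolves, and also why camera or object motion that produces a similar trend is easily confused with one.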
The rest of the paper is structured as follows. Section II contains a brief survey of previous work. The metrics used for the evaluation of shot boundary detection are presented in Section III. Section IV contains the test video sequences, the evaluation criterion and the results. Finally, Section V concludes the paper.
II. PREVIOUS WORK
Several approaches have been proposed for shot boundary detection. Amiri and Fathy [1] proposed video shot boundary detection using generalized eigenvalue decomposition and Gaussian transition detection. Y.-N. Li et al. [2] used thresholding and bisection-based comparison to discard a large number of non-boundary frames. Using motion intensity and a motion suppression value, Yang Xu et al. [4] proposed a 3-DWT based motion suppression algorithm. Jun Li et al. [5] used color and edge features in different directions obtained from wavelet transform coefficients. Vasileios Chasanis et al. [6] proposed an algorithm for detecting abrupt cuts and dissolves using color histograms and the χ² value. The algorithm proposed in [7] is based on a k-step slipped window and uses low-level and editing features of the video shot. Using global features such as the color histogram, Mohanta et al. [8] proposed model-based shot boundary detection. Na Lv et al. [9] proposed a video shot boundary detection algorithm using a gray variance based method and a block color histogram. Using C frames, T. Lu and P. N. Suganthan [10] proposed an accumulation-based algorithm. A classification algorithm for shot boundary detection using the kappa coefficient was proposed by K. Koumousis et al. [11]. Liu and Jian-Xun Li [12] proposed a video segmentation algorithm that calculates the difference between the DC images of all I-frames.
After reviewing the literature in detail, we found that most of the algorithms are unable to differentiate between gradual transitions and motion. Most of the algorithms that give good results with hard cuts fail on soft cuts. Hard cuts usually involve sudden and extensive changes in visual content, while soft cuts show slow and gradual changes. The major metrics and methods are described in Section III and are used to detect gradual transitions in the test video sequences.
III. MAJOR METRICS USED FOR EVALUATION
Likelihood Ratio:
Based on the assumption of uniform second-order statistics, Jain et al. [13] computed a likelihood ratio test. It is a typical hypothesis test in which a ratio of probabilities is used as the test statistic. This statistical method expands on the idea of pixel difference by breaking the images into regions and comparing statistical measures of the pixels in those regions. It is defined as
$$LHR(i) = \frac{\left[ \frac{\sigma_i + \sigma_{i+1}}{2} + \left( \frac{\mu_i - \mu_{i+1}}{2} \right)^2 \right]^2}{\sigma_i \times \sigma_{i+1}} \qquad (1)$$
where LHR is the likelihood ratio involving two consecutive regions, \mu_i is the mean of the current frame and \sigma_i is its variance, as in [13].
A limitation of the likelihood ratio is that no change is detected when the two images being compared have the same mean and variance but entirely different probability density functions.
Flow chart (likelihood ratio): Video → Frames → Apply likelihood ratio → Thresholding → Dissolve detection
Twin Comparison:
The twin-comparison shot detection scheme [14] is a well-known method for detecting both gradual and abrupt transitions. It makes use of two thresholds, one for locating gradual transitions and the other for finding abrupt transitions.
Flow chart (twin comparison): Video → Convert into frames → Apply twin comparison algorithm → Thresholding → Dissolve detection
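The two-threshold idea can be sketched as follows, assuming the frame-to-frame differences have already been computed. This is a simplified illustration rather than the exact scheme of [14]: it sums consecutive differences while they stay above the low threshold, and the names `twin_comparison`, `t_low` and `t_high` as well as the sample values are hypothetical.

```python
def twin_comparison(diffs, t_low, t_high):
    """Classify transitions from a list of frame-to-frame differences.

    diffs[k] is the difference between frame k and frame k + 1.
    Returns (cuts, graduals), where graduals holds (start, end) index pairs.
    Assumption: differences are accumulated by simple summation.
    """
    cuts, graduals = [], []
    start, accumulated = None, 0.0
    for k, d in enumerate(diffs):
        if d >= t_high:
            cuts.append(k)                    # abrupt transition
            start, accumulated = None, 0.0
        elif d >= t_low:
            if start is None:                 # candidate start of a gradual transition
                start, accumulated = k, 0.0
            accumulated += d
        else:
            # Candidate ends; accept it only if the accumulated change is large.
            if start is not None and accumulated >= t_high:
                graduals.append((start, k - 1))
            start, accumulated = None, 0.0
    if start is not None and accumulated >= t_high:
        graduals.append((start, len(diffs) - 1))
    return cuts, graduals


# Hypothetical difference sequence: one cut, then a slow transition.
diffs = [1, 2, 40, 1, 6, 7, 8, 7, 6, 1, 2]
print(twin_comparison(diffs, t_low=5, t_high=30))
```

Fast camera or object motion also produces runs of moderate differences, so a sketch like this would report them as gradual transitions unless the thresholds are tuned carefully.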
In addition to interframe differences, the twin-comparison method uses cumulative differences between the frames of a gradual transition. The algorithm assumes that two frames which have a common background and unchanging objects will show little difference in their histograms. The basic formulation for histogram comparison is as follows: the histogram (either color or grayscale) is computed for each frame and the difference is calculated as
$$D(i, i+1) = \sum_{j=0}^{B-1} \left| H_i(j) - H_{i+1}(j) \right| \qquad (2)$$
where H_i(j) is the jth element of the histogram of the ith frame, and B is the number of bins in the histogram. Here, we have used a difference function defined by the histogram intersection
$$D(i, i+1) = 1 - \mathrm{Intersection}(H_i, H_{i+1}) = 1 - \frac{1}{M} \sum_{m=1}^{M} \sum_{j=0}^{B_m - 1} \min\left( H_i^m(j), H_{i+1}^m(j) \right) \qquad (3)$$
where H_i^m is the m-th of the M histograms of frame i and B_m is its number of bins.
Wavelet based metric:
A Daubechies second-order, third-level decomposition filter is used for the decomposition of each frame, as shown in Fig. 3.
Flow chart (wavelet based metric): Video → Convert into frames → RGB to gray conversion → Apply 3-level wavelet decomposition → Use LL3 part → Apply WDBF metric → Thresholding → Dissolve detection
$$WDBF(k) = \sum_{i=1}^{N/2^l} \sum_{j=1}^{M/2^l} \left[ LL_3(i, j, k) - LL_3(i, j, k+1) \right]^2 \qquad (5)$$
where N × M is the image size, l is the number of levels and k is the frame number.
IV. TEST VIDEO AND EVALUATION CRITERION USED FOR METRIC EVALUATION:
Test Video Sequence:
All algorithms have been tested on clips of the movies Landscape, Alvin & Chipmunks part 2 and Titanic. These video sequences were manually observed frame by frame using VirtualDub software to find the actual transitions. We mostly considered video clips from these movies that contain dissolve transitions with fast camera and object motion.
Evaluation criterion:
Two metrics, precision and recall, are used to evaluate the performance of a shot boundary detection algorithm. Recall is defined as
$$R = \frac{C}{C + M} = \frac{C}{D} \qquad (6)$$
whereas precision is defined as
$$P = \frac{C}{C + FP} \qquad (7)$$
where D is the total number of actual frames with dissolve boundaries, C is the number of dissolve frames correctly detected by the algorithm, M is the number of dissolve frames missed by the algorithm and FP is the number of false positives detected by the algorithm.
The F1 measure is used to rank the performance of the different algorithms [4]. F1 combines recall and precision with equal weight. It is the harmonic mean of recall and precision and is given by
$$F1(R, P) = \frac{2 \times R \times P}{R + P} \qquad (8)$$
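Eqs. (6) to (8) translate directly into a few lines of code. The sketch below is illustrative only: the function names are hypothetical, the counts are made-up numbers rather than results from the paper, and returning 0.0 for an empty denominator is an assumption added to keep the functions total.

```python
def recall(correct, missed):
    """Eq. (6): R = C / (C + M)."""
    total = correct + missed
    return correct / total if total else 0.0


def precision(correct, false_positives):
    """Eq. (7): P = C / (C + FP)."""
    total = correct + false_positives
    return correct / total if total else 0.0


def f1_measure(r, p):
    """Eq. (8): harmonic mean of recall and precision."""
    return 2 * r * p / (r + p) if (r + p) else 0.0


# Hypothetical counts: 80 dissolve frames found, 20 missed, 40 false alarms.
c, m, fp = 80, 20, 40
r, p = recall(c, m), precision(c, fp)
f1 = f1_measure(r, p)
print(round(r, 4), round(p, 4), round(f1, 4))
```

Because F1 is a harmonic mean, a detector with high recall but many false positives still receives a low score, which matters when comparing metrics that over-detect under motion.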
Fig. 3. Three-level wavelet decomposition
Evaluation Results of the Traditional Shot Boundary Detection Metrics:
The performance of the traditional metrics twin comparison, WDBF and likelihood ratio is compared, where WDBF uses the wavelet transform [16] as a shot detection metric. The comparison between these algorithms is shown in Table 2. Overall, because of fast camera and object motion, WDBF and twin comparison give poor results in terms of false positives, while the likelihood ratio gives better results than the other metrics, as shown in Table 2.
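As a rough illustration of the frame comparisons behind these results, the sketch below implements the likelihood ratio of Eq. (1) and the histogram differences of Eqs. (2) and (3) on flat lists of gray levels. Several choices are assumptions rather than details from the paper: \sigma is treated as the variance, a single histogram per frame is used (M = 1), the intersection is normalized by the pixel count, and all function names and sample values are hypothetical.

```python
from statistics import mean, pvariance


def likelihood_ratio(region_a, region_b):
    """Eq. (1) for two regions given as flat lists of gray levels."""
    mu_a, mu_b = mean(region_a), mean(region_b)
    var_a, var_b = pvariance(region_a), pvariance(region_b)
    if var_a * var_b == 0:
        return float("inf")          # a flat region makes the ratio undefined
    numerator = ((var_a + var_b) / 2 + ((mu_a - mu_b) / 2) ** 2) ** 2
    return numerator / (var_a * var_b)


def histogram(pixels, bins=8, levels=256):
    """Gray-level histogram with `bins` equally wide bins."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // levels] += 1
    return counts


def bin_difference(h1, h2):
    """Eq. (2): sum of absolute bin-wise differences."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def intersection_difference(h1, h2):
    """Eq. (3) with M = 1, normalized by the number of pixels (assumption)."""
    return 1 - sum(min(a, b) for a, b in zip(h1, h2)) / sum(h1)


# Hypothetical regions: the second is a slightly brighter copy of the first.
a = [10, 20, 30, 40]
b = [12, 22, 32, 42]
print(likelihood_ratio(a, a), likelihood_ratio(a, b))

h1 = histogram([0, 0, 255, 255], bins=2)
h2 = histogram([0, 255, 255, 255], bins=2)
print(bin_difference(h1, h2), intersection_difference(h1, h2))
```

The ratio equals 1 for regions with identical mean and variance and grows as they diverge, so a threshold slightly above 1 is what separates transition frames from ordinary ones.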
We selected a test video sequence that contains significant camera motion together with dissolve transitions. Fig. 4 shows consecutive frames from the movie Landscape. The part of the video used for analysis contains 409 frames, with camera motion in frames 3-84 and 69 frames of dissolve transition (frames 79-96 and frames 190-242).
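On a clip such as this, the wavelet-based metric of Eq. (5) yields one value per pair of consecutive frames. The sketch below shows one way it could be computed in plain Python. It assumes that "Daubechies second order" means the 4-tap filter often called db2, that frame edges are handled periodically, and that frame dimensions are divisible by 2^l; none of these details are given in the paper, and the function names are hypothetical.

```python
import math

# Low-pass analysis coefficients of the 4-tap Daubechies filter (assumed db2).
_S3 = math.sqrt(3.0)
_NORM = 4 * math.sqrt(2.0)
DB2 = [(1 + _S3) / _NORM, (3 + _S3) / _NORM, (3 - _S3) / _NORM, (1 - _S3) / _NORM]


def lowpass_rows(img):
    """Filter each row with DB2 and keep every second sample (periodic edges)."""
    out = []
    for row in img:
        n = len(row)
        out.append([sum(h * row[(2 * i + k) % n] for k, h in enumerate(DB2))
                    for i in range(n // 2)])
    return out


def transpose(img):
    return [list(col) for col in zip(*img)]


def ll_band(img, levels=3):
    """Approximation (LL) sub-band after `levels` decomposition steps."""
    for _ in range(levels):
        # Rows first, then columns via a transpose.
        img = transpose(lowpass_rows(transpose(lowpass_rows(img))))
    return img


def wdbf(frame_k, frame_k1, levels=3):
    """Eq. (5): sum of squared LL3 differences between two gray frames."""
    a, b = ll_band(frame_k, levels), ll_band(frame_k1, levels)
    return sum((x - y) ** 2 for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


# Hypothetical 8 x 8 gray frames: a flat frame and a slightly brighter one.
flat = [[10.0] * 8 for _ in range(8)]
brighter = [[12.0] * 8 for _ in range(8)]
print(wdbf(flat, flat), wdbf(flat, brighter))
```

Using only the LL3 band makes the metric cheap, since each frame shrinks by a factor of 64, but global brightness or motion changes still reach the approximation band and can raise the value without any transition.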
Table 1. Number of frames considered for analysis from the test video sequences (AC2: Alvin & Chipmunks part 2, LS: Landscape, TI: Titanic)
Movie   Number of frames   No. of dissolves
AC2     2069               162
LS      1203               529
TI      4392               909
Table 2. Performance comparison results between WDBF, likelihood ratio and twin comparison (R: recall, P: precision, all values in %)
Metric            Video   R       P       F1
LHR               AC2     49.50   42.17   45.54
LHR               LS      45.98   43.56   44.73
LHR               TI      56.89   51.89   54.27
WDBF              AC2     36.20   36.77   36.48
WDBF              LS      41.78   40.89   41.13
WDBF              TI      45.56   46.76   45.65
Twin comparison   AC2     76.30   11.20   19.53
Twin comparison   LS      73.67   20.98   19.53
Twin comparison   TI      79.56   18.90   30.54
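The ranking implied by Table 2 can be summarized by averaging the reported F1 values per metric. The snippet below only restates numbers from the table and assumes the reading in which the three groups of R, P and F1 values belong to LHR, WDBF and twin comparison in that order, each over the videos AC2, LS and TI; the variable names are hypothetical.

```python
# Reported F1 values (%) from Table 2, in the order AC2, LS, TI.
reported_f1 = {
    "LHR": [45.54, 44.73, 54.27],
    "WDBF": [36.48, 41.13, 45.65],
    "Twin comparison": [19.53, 19.53, 30.54],
}

# Average over the three videos and sort from best to worst.
mean_f1 = {metric: sum(values) / len(values) for metric, values in reported_f1.items()}
ranking = sorted(mean_f1, key=mean_f1.get, reverse=True)

for metric in ranking:
    print(f"{metric}: mean F1 = {mean_f1[metric]:.2f}")
```

Averaging over only three clips is a coarse summary, so the ordering should be read as indicative rather than statistically established.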
Fig. 4. Consecutive frames from movie clip Landscape showing dissolve transition (frames 16, 32, 54, 124, 187, 217, 289, 293, 301, 308, 315, 320, 333, 338, 352 and 363)
Fig. 5(a). Result using likelihood ratio (likelihood ratio plotted against frame index)
Fig. 5(b). Results using twin comparison (twin comparison value plotted against frame index)
Fig. 5(c). Result using WDBF metric (WDBF value plotted against frame index)
V. CONCLUSION
In this paper, we have compared traditional metrics, namely wavelet decomposition, twin comparison and likelihood ratio. It has been observed that the likelihood ratio gives better results in identifying gradual effects such as dissolve, fade and wipe transitions. Its performance is also better than that of the other metrics in terms of precision and F1 measure, although twin comparison attains higher recall (Table 2). We have extensively tested the major metrics and presented experimental results obtained by applying these approaches to video sequences from three different movies.
References:
[1]. Ali Amiri, Mahmood Fathy, “Video shot boundary detection using generalized eigenvalue decomposition and Gaussian transition detection”, Computing and Informatics, vol. 30, pp. 595-619, 2011.
[2]. Y.-N. Li, Z.-M. Lu, X.-M. Niu, “Fast video shot boundary detection framework employing pre-processing techniques”, IET Image Processing, vol. 3, iss. 3, pp. 121-134, 2009.
[3]. Jeho Nam and Ahmed H. Tewfik, “Detection of gradual transitions in video sequences using B-spline interpolation”, IEEE Transactions on Multimedia, vol. 7, iss. 4, pp. 667-678, Aug. 2005.
[4]. Yang Xu, Xu De, Gaun Tengfei, Wu Aimin, Lang Congyan, “3 DWT based motion suppression for video shot boundary detection”, Knowledge-Based Intelligent Information and Engineering Systems, Lecture Notes in Computer Science, vol. 3682, pp. 1204-1209, 2005.
[5]. Jun Li, Youdong Ding, Yunyu Shi, Qingyue Zeng, “DWT-based shot boundary detection using support vector machine”, Information Assurance and Security, vol. 1, pp. 435-438, 18-20 Aug. 2009.
[6]. Vasileios Chasanis, Aristidis Likas, Nikolaos Galatsanos, “Simultaneous detection of abrupt cuts and dissolves in videos using support vector machines”, Pattern Recognition Letters, vol. 30, pp. 55-65, 2009.
[7]. Tuanfa Qin, Jiayu Gu, Huiting Chen, Zhenhua Tang, “A fast shot boundary detection based on K-step slipped window”, 2nd IEEE International Conference on Network Infrastructure and Digital Content, pp. 190-195, 24-26 Sept. 2010.
[8]. Partha Pratim Mohanta, Sanjoy Kumar Saha, Bhabatosh Chanda, “A model-based shot boundary detection technique using frame transition parameters”, IEEE Transactions on Multimedia, vol. 14, iss. 1, pp. 223-233, Feb. 2012.
[9]. Na Lv, Zhiquan Feng, Jingliang Peng, “Mutual information based video shot boundary detection”, Image Analysis and Signal Processing, pp. 1-5, 9-11 Nov. 2012.
[10]. Tong Lu, P. N. Suganthan, “An accumulation algorithm for video shot boundary detection”, Multimedia Tools and Applications, vol. 22, iss. 1, pp. 89-106, Jan. 2004.
[11]. K. Koumousis, V. Fotopoulos, A. N. Skodras, “A new approach to gradual video transition detection”, Informatics (PCI), pp. 245-249, 5-7 Oct. 2012.
[12]. Liu Liu, Jian-Xun Li, “A novel shot segmentation algorithm based on motion edge feature”, 2010 Symposium on Photonics and Optoelectronics, pp. 1-5, 19-20 June 2010.
[13]. R. Jain, R. Kasturi, B. Schunck, Machine Vision, pp. 406-415, McGraw-Hill, New York, 1995.
[14]. Fa-Xin Yu, Zhe-Ming Lu, Yue-Nan Li, “Dissolve detection based on twin-comparison with curve fitting”, International Journal of Innovative, ISSN 1349-4198, vol. 7, no. 5(A), pp. 2417-2426, May 2011.
[15]. I. K. Sethi, N. Patel, “A statistical approach to scene change detection”, Proc. SPIE Storage and Retrieval for Image and Video Databases III, vol. 2420, pp. 329-338, 1995.
[16]. Khin Thandar Tint, Kyi Soe, “Key frame extraction for video summarization using DWT wavelet statistics”, International Journal of Research in Computer Engineering & Technology, vol. 2, no. 5, May 2013.