Automatic Local Affine Feature Tracking

Local Affine Feature Tracking
in Films/Sitcoms
Chunhui Gu
CS 294-6
Final Presentation
Dec. 13, 2006
Objective
• Automatically detect and track local affine
features in film/sitcom frame sequences.
– Current Dataset: Sex and the City
– Why sitcom?
• Simple daily environment
• Few or no special effects
• Repeated scenes
Outline
• Preprocessing
• Tracking Algorithm
– Pairwise local matching
– Robust features
• Feature Matching across Shots
• Results
– Feature matching vs baseline color histogram
– Time complexity
– When does tracking fail
Preprocessing
• Pipeline: Frame Extraction → Shot Detection → MSER Interest Point Detection → SIFT Feature Extraction
• Shot Detection splits the frame sequence into shots (…, (i-1)’th shot, i’th shot, …)
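A minimal sketch of the Shot Detection step. The slides only state that shot detection uses a color histogram with B=16 bins; the L1 histogram difference, the threshold value, and the function names below are illustrative assumptions, not the author's implementation.

```python
import numpy as np

def color_histogram(frame, bins=16):
    """Concatenated per-channel histogram of an HxWx3 uint8 frame, normalized to sum to 1."""
    hists = [np.histogram(frame[..., c], bins=bins, range=(0, 256))[0] for c in range(3)]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()

def detect_shot_boundaries(frames, bins=16, threshold=0.5):
    """Return the indices k at which frame k starts a new shot.

    Assumed rule: a boundary is declared when the L1 distance between the
    histograms of consecutive frames exceeds `threshold`.
    """
    boundaries = []
    prev = color_histogram(frames[0], bins)
    for k in range(1, len(frames)):
        cur = color_histogram(frames[k], bins)
        if np.abs(cur - prev).sum() > threshold:
            boundaries.append(k)
        prev = cur
    return boundaries
```

Each frame is visited once and each histogram costs a function of B, which is consistent with a per-frame cost that depends only on the bin count.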
Tracking Algorithm
• Basic: Pairwise Matching
– Feature $f_m^i$ at position $(x_m^i, y_m^i)$ in Frame i
– Candidate features $f_n^j$ in Frame j=i+1
– Match feature $f_m^i$ in Frame i to the feature $f_n^j$ in Frame j=i+1 with minimum distance: $\min_n d(f_m^i, f_n^j)$
– Thresholding on both minimum distance and ratio
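The pairwise matching step can be sketched as follows. This is a sketch under stated assumptions: the local window is taken to be a spatial box of half-width `window` pixels, the distance d is Euclidean, "ratio" is read as the best-to-second-best distance ratio, and all threshold values and names are hypothetical.

```python
import numpy as np

def match_pairwise(desc_i, pos_i, desc_j, pos_j, window=15.0, max_dist=0.6, max_ratio=0.8):
    """Match each feature in frame i to at most one feature in frame j = i + 1.

    desc_*: (n, d) descriptor arrays; pos_*: (n, 2) arrays of (x, y) positions.
    Returns a list of (m, n) index pairs.
    """
    matches = []
    for m in range(len(desc_i)):
        # Candidates: features of frame j inside the local window around (x_m, y_m).
        near = np.flatnonzero(np.abs(pos_j - pos_i[m]).max(axis=1) <= window)
        if near.size == 0:
            continue
        d = np.linalg.norm(desc_j[near] - desc_i[m], axis=1)
        order = np.argsort(d)
        best = d[order[0]]
        if near.size > 1:
            second = d[order[1]]
            # Two equally good candidates are treated as ambiguous.
            ratio = best / second if second > 0 else 1.0
        else:
            ratio = 0.0
        # Threshold on both the minimum distance and the ratio.
        if best <= max_dist and ratio <= max_ratio:
            matches.append((m, int(near[order[0]])))
    return matches
```

Restricting candidates to the local window keeps the per-feature cost proportional to the window content rather than to the full frame.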
Tracking Algorithm
• Problem of Pairwise Matching
– Sensitive to occlusion and feature misdetection
• Solutions:
– Use multiple overlapping windows
– Backward Matching
• Match features in current frame to features in all previous
frames within the shot
• Pruning process (reduce computation time)
• Select a proportion of features that have longer
tracking length as robust features
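A minimal sketch of backward matching and robust feature selection. It assumes greedy nearest-descriptor linking, a pruning rule that drops tracks not seen in the last `max_gap` frames, and hypothetical threshold values; the slides do not specify these details.

```python
import numpy as np

def track_shot(frame_descs, max_gap=20, max_dist=0.6):
    """Link features across the frames of one shot by backward matching.

    frame_descs: list of (n_k, d) descriptor arrays, one per frame.
    Returns a list of tracks; each track is a list of (frame_index, feature_index).
    """
    tracks = []      # all tracks found so far in the shot
    last_desc = []   # most recent descriptor of each track
    for k, descs in enumerate(frame_descs):
        used = set()  # tracks already extended (or created) in this frame
        for n, desc in enumerate(descs):
            # Pruning: only tracks seen within the last max_gap frames are candidates.
            cand = [t for t in range(len(tracks))
                    if t not in used and k - tracks[t][-1][0] <= max_gap]
            if cand:
                d = [np.linalg.norm(last_desc[t] - desc) for t in cand]
                best = int(np.argmin(d))
                if d[best] <= max_dist:
                    t = cand[best]
                    tracks[t].append((k, n))
                    last_desc[t] = desc
                    used.add(t)
                    continue
            # No acceptable match in any previous frame: start a new track.
            tracks.append([(k, n)])
            last_desc.append(desc)
            used.add(len(tracks) - 1)
    return tracks

def select_robust(tracks, proportion=0.25):
    """Keep the given proportion of tracks with the longest tracking length."""
    keep = max(1, int(round(proportion * len(tracks))))
    return sorted(tracks, key=len, reverse=True)[:keep]
```

Because a track can be extended after a gap, a feature that is occluded or missed for a few frames rejoins its earlier track instead of starting a new one.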
Shot grouping/Scene Retrieval
[Figure: Scene 5 groups Shots 49, 53, 56, and 60. Each shot k is represented by its robust feature set $f_{rf}^{k} = \{x_1, x_2, \ldots, x_{m_k}\}$. Frame ranges shown: 10746 to 10772, 10933 to 10968, 11393 to 11435, 11533 to 11560.]
Inter-Shot Matching


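• Shot I: robust features $f^I_1 = \{x_1, x_2, \ldots, x_{m_1}\}$, $f^I_2 = \{x_1, x_2, \ldots, x_{m_2}\}$, …, $f^I_p = \{x_1, x_2, \ldots, x_{m_p}\}$
• Shot J: robust features $f^J_1 = \{x_1, x_2, \ldots, x_{n_1}\}$, $f^J_2 = \{x_1, x_2, \ldots, x_{n_2}\}$, …, $f^J_q = \{x_1, x_2, \ldots, x_{n_q}\}$
• D: distance between Shot I and Shot J, computed from their robust features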
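One way a shot-to-shot distance D could be computed from robust features is sketched below. The slides do not define D, so everything here is an assumption: each robust feature is summarized by the mean of its descriptors, features are compared with Euclidean distance, and D is one minus the fraction of matched features. The name `shot_distance` and the threshold are hypothetical.

```python
import numpy as np

def shot_distance(tracks_i, tracks_j, max_dist=0.6):
    """Distance between two shots from their robust features.

    tracks_*: list of (m_k, d) arrays holding the descriptors x_1..x_m of each
    robust feature. Returns a value in [0, 1]; smaller means more shared features.
    """
    # Summarize each robust feature by its mean descriptor (assumption of this sketch).
    mean_i = np.stack([t.mean(axis=0) for t in tracks_i])   # shape (p, d)
    mean_j = np.stack([t.mean(axis=0) for t in tracks_j])   # shape (q, d)
    # dist[a, b]: Euclidean distance between feature a of Shot I and feature b of Shot J.
    dist = np.linalg.norm(mean_i[:, None, :] - mean_j[None, :, :], axis=2)
    matched_i = int((dist.min(axis=1) <= max_dist).sum())   # features of I with a match in J
    matched_j = int((dist.min(axis=0) <= max_dist).sum())   # features of J with a match in I
    score = min(matched_i, matched_j) / min(len(tracks_i), len(tracks_j))
    return 1.0 - score
```

The p-by-q distance array is what makes comparing one shot pair cost on the order of T^2 for T robust features per shot.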
“Confusion Table”
[Figure: shot-by-shot confusion tables over shots 50 to 75, in three panels: Ground Truth, Color Histograms, Feature Matching]
ROC
[Figure: ROC curve of Feature Matching. x-axis: False Alarm (0 to 1); y-axis: True Detection (0 to 1)]
When Does Tracking Fail?
• Tracking feature outside local window
– Rare when continuous tracking
– Happens when occlusion occurs
• Same feature splitting to two or more groups
– Long occlusion
– Multiple matching in a single frame
Computation Complexity
• Everything except the MSER and SIFT algorithms is
implemented in Matlab (slow…)
Stage                   Complexity    Time
Frame Extraction        O(N)          ~0.3s/frame
Shot Detection          O(N*f(B))     ~0.07s/frame (B=16)
MSER Detection          O(N)          ~0.3s/frame
SIFT Detection          O(N)          ~0.9s/frame
Feature Tracking        O(N*F*W*L)    ~0.5s/frame
Matching across shots   O(S^2*T^2)    ~1s/shot pair
N: # of frames (30,000)
B: # of bins for color hist (16)
F: ave. # of features per frame (400)
W: local window size (15)
L: tracking length (20)
T: ave. # of robust trackers per shot (300)
S: # of shots (35)
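As a rough worked estimate, the timings listed above can be combined into a total running time. The sketch assumes the per-frame stages run sequentially on every frame and that each unordered pair of distinct shots is compared once; neither is stated in the slides.

```python
# Per-frame costs in seconds, copied from the timings above.
per_frame = {
    "Frame Extraction": 0.3,
    "Shot Detection": 0.07,
    "MSER Detection": 0.3,
    "SIFT Detection": 0.9,
    "Feature Tracking": 0.5,
}
N = 30_000       # number of frames
S = 35           # number of shots
per_pair = 1.0   # seconds per shot pair

frame_seconds = N * sum(per_frame.values())
# Assumption: each unordered pair of distinct shots is compared once.
n_pairs = S * (S - 1) // 2
pair_seconds = per_pair * n_pairs
total_hours = (frame_seconds + pair_seconds) / 3600

print(f"per-frame stages: {frame_seconds:.0f} s")
print(f"shot matching:    {pair_seconds:.0f} s over {n_pairs} pairs")
print(f"total:            {total_hours:.1f} h")
```

Under these assumptions the per-frame stages dominate: about 2.07 s per frame, or roughly 17 hours for 30,000 frames, while matching 595 shot pairs adds only about 10 minutes.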
Conclusion
• We successfully implemented local affine feature
tracking on the sitcom “Sex and the City”. The tracking
method is robust to occlusion and feature misdetection.
• Although we have no quantitative precision/recall curve
(ground truth is hard to obtain), the demonstration shows
that precision is almost perfect, with good recall.
• We show one successful application of using robust
features to associate similar shots together for scene
retrieval.
Future Work
• Implement algorithm in real-time (C/C++)
• Search unique shots in films/sitcoms
• Separate indoor scenes from outdoor
scenes
• Determine context of the scene
Acknowledgement