1 - Technische Universiteit Eindhoven

Video processing for
multimedia systems
G. de Haan
technische universiteit eindhoven
Schedule lectures 5P530
Weeks 1 to 4: Basics (Ch 2, 3); Filtering (Ch 4); Video Enhancement (Ch 5); Picture-Rate Conversion (Ch 7/9)
Weeks 5 to 8: De-interlacing (Ch 8); Questions; Motion Estimation (Ch 10); Object Detection (Ch 11)
Motion
Estimation
Motion Estimation
• Is there any motion?
• How fast?
• In which direction?
[Figure: displacement vector with components Dx and Dy]
Application dependency of ME
• Scan rate conversion (true-motion vectors)
  • Picture rate conversion
  • De-interlacing
• Video compression (low prediction error)
  • MPEG
  • H.263
True-motion vectors are usually more consistent than coding vectors. Consistency has some, but no dominant, relevance for coding efficiency.
Motion estimation and coding
[Block diagram of a motion-compensated coder. Labels: Input, Prediction, prediction error (Input minus Prediction), Image compression, Output, Picture delay, Motion compensation]
Image compression: accuracy demands decrease with increasing frequency (DCT transform + quantization)
Pixel-recursive ME methods
Pixel (pel) recursive ME: earliest methods, many variants
Algorithm: determine the gradient of the displaced frame difference (DFD), and update the vector in the direction of decreasing DFD.
[Figure: DFD² plotted against displacement D, with slope dDFD²/dD and successive iterations i, i+1, i+2, i+3]
Pel-recursive ME
1) $\vec{D}_i = \vec{D}_{i-1} - \vec{u}$
2) $\vec{u} = \frac{d}{d\vec{D}} DFD(\vec{x}, \vec{D}_{i-1}, n)$
3) $DFD(\vec{x}, \vec{D}_{i-1}, n) = F(\vec{x}, n) - F(\vec{x} - \vec{D}_{i-1}, n-1)$
4) $\vec{u} = DFD(\vec{x}, \vec{D}_{i-1}, n) \cdot \frac{d}{d\vec{x}} F(\vec{x} - \vec{D}_{i-1}, n)$
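The recursion can be illustrated with a minimal 1-D sketch. Everything concrete here is an assumption for illustration: the smooth luminance profile `prev_picture`, the gain factor `gain`, the number of iterations, and the choice to evaluate the spatial gradient on the displaced previous picture with a central difference.

```python
import math

TRUE_SHIFT = 3.0  # hypothetical true displacement between picture n-1 and n


def prev_picture(t):
    """Hypothetical smooth luminance profile of picture n-1."""
    return 100.0 * math.sin(t / 10.0)


def cur_picture(t):
    """Picture n: the previous picture displaced by TRUE_SHIFT."""
    return prev_picture(t - TRUE_SHIFT)


def dfd(x, d):
    """Displaced frame difference F(x, n) - F(x - d, n-1)."""
    return cur_picture(x) - prev_picture(x - d)


def pel_recursive(x, d0=0.0, gain=0.005, iterations=50, h=1e-3):
    """Iterate D_i = D_{i-1} - u with u = gain * DFD * spatial gradient."""
    d = d0
    for _ in range(iterations):
        # Central-difference gradient of the displaced previous picture.
        grad = (prev_picture(x - d + h) - prev_picture(x - d - h)) / (2 * h)
        u = gain * dfd(x, d) * grad
        d = d - u
    return d


estimate = pel_recursive(x=0.0)
```

With this profile and gain the estimate settles on the true displacement; too large a gain makes the recursion overshoot or diverge.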
Pel-recursive ME: the use of predictions
[Figure: spatial causal predictions and temporal predictions around the current pixel; axes position x and time]
Why not so popular anymore?
• Pel-recursive estimators require fairly complex calculations for every pixel in the image
• As soon as applications that required real-time motion estimation became practical, complexity reduction of the estimator was crucial
  • Primarily coding, later also scan conversion
  • For coding, one vector per pixel is not attractive
Block-matching ME methods: full search
Block-matching: find the corresponding block in image n-1
[Figure: current block in image n and the corresponding block inside a search area in image n-1; horizontal axis is the image number]
Finding block similarity
[Figure: current block and search area; candidate displacement with components Dx and Dy]
Formal definitions
Luminance value in the previous picture, shifted over candidate vector C:
$F(\vec{x} - \vec{C}, n-1)$
A block matcher optimizes a function, Cost, varying C:
$\varepsilon(\vec{C}, \vec{X}, n) = \sum_{\vec{x} \in B(\vec{X})} \mathrm{Cost}\big(F(\vec{x}, n), F(\vec{x} - \vec{C}, n-1)\big)$
The candidate vector for which the error is minimal is assumed to be the displacement vector:
$\vec{D}(\vec{x}, n)$
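A minimal full-search sketch of these definitions, using the absolute difference as Cost. The test picture (random texture shifted with wrap-around), the 8x8 block, the ±4 search range and all function names are assumptions for illustration.

```python
import random


def make_pair(width=32, height=32, dx=2, dy=-1, seed=0):
    """Return (prev, cur) where cur(x) = prev(x - D), wrapping at the borders."""
    rng = random.Random(seed)
    prev = [[rng.randrange(256) for _ in range(width)] for _ in range(height)]
    cur = [[prev[(y - dy) % height][(x - dx) % width] for x in range(width)]
           for y in range(height)]
    return prev, cur


def sad(cur, prev, bx, by, cx, cy, size=8):
    """Sum over the block of |F(x, n) - F(x - C, n-1)|."""
    total = 0
    for y in range(by, by + size):
        for x in range(bx, bx + size):
            total += abs(cur[y][x] - prev[y - cy][x - cx])
    return total


def full_search(cur, prev, bx, by, search=4, size=8):
    """Test every integer candidate in the search area; keep the lowest error."""
    best = None
    for cy in range(-search, search + 1):
        for cx in range(-search, search + 1):
            err = sad(cur, prev, bx, by, cx, cy, size)
            if best is None or err < best[0]:
                best = (err, (cx, cy))
    return best


prev, cur = make_pair()
error, vector = full_search(cur, prev, bx=12, by=12)
```

The block at (12, 12) is chosen far enough from the borders that no candidate reads outside the picture.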
Normalised cross-correlation
$\varepsilon(\vec{C}, \vec{X}, n) = \dfrac{\sum_{\vec{x} \in B(\vec{X})} F(\vec{x}, n) \cdot F(\vec{x} - \vec{C}, n-1)}{\sqrt{\sum_{\vec{x} \in B(\vec{X})} F^2(\vec{x}, n) \cdot \sum_{\vec{x} \in B(\vec{X})} F^2(\vec{x} - \vec{C}, n-1)}}$
• favourable performance
• rather high operations count
Summed Square Error
$\varepsilon(\vec{C}, \vec{X}, n) = \sum_{\vec{x} \in B(\vec{X})} \big(F(\vec{x}, n) - F(\vec{x} - \vec{C}, n-1)\big)^2$
• good performance
• acceptable operations count
Summed Absolute Difference
$\varepsilon(\vec{C}, \vec{X}, n) = \sum_{\vec{x} \in B(\vec{X})} \big|F(\vec{x}, n) - F(\vec{x} - \vec{C}, n-1)\big|$
• still good performance
• favourable operations count
Significantly different pixels
$\varepsilon(\vec{C}, \vec{X}, n) = \sum_{\vec{x} \in B(\vec{X})} T\big(F(\vec{x}, n) - F(\vec{x} - \vec{C}, n-1)\big)$
with:
$T(a) = 1$ if $|a| > \text{threshold}$, and $T(a) = 0$ if $|a| \le \text{threshold}$
• Rather poor performance
• Favourable operations count, reduced register size compared to SAD
Alternative match criteria (listed from highest to lowest complexity)
• Correlation (NCCF) of pixels in the two blocks
• Mean Square Error (MSE) between pixels in the blocks
• Mean Absolute Difference (MAD) between pixels in the blocks
• Number of significantly different pixels (NSD) in the two blocks
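The four criteria can be written as small functions over two blocks given as flat pixel lists. This is a sketch: the summed forms are used (the mean forms differ only by a constant factor per block size), and the NSD threshold of 10 and the sample pixel values are arbitrary assumptions.

```python
import math


def ncc(a, b):
    """Normalised cross-correlation; 1.0 means identical up to a gain factor."""
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den else 0.0


def sse(a, b):
    """Summed square error."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def sad(a, b):
    """Summed absolute difference."""
    return sum(abs(p - q) for p, q in zip(a, b))


def nsd(a, b, threshold=10):
    """Number of pixels whose absolute difference exceeds the threshold."""
    return sum(1 for p, q in zip(a, b) if abs(p - q) > threshold)


cur_block = [10, 20, 30, 40]    # hypothetical pixels of the current block
cand_block = [12, 20, 60, 40]   # hypothetical pixels of a candidate block

scores = (sse(cur_block, cand_block), sad(cur_block, cand_block),
          nsd(cur_block, cand_block))
```

Note that NCC is maximised by the best match, whereas SSE, SAD and NSD are minimised. NSD needs only a counter per block, which reflects the reduced register size mentioned for that criterion.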
Comparison of match criteria
[Figure: results obtained with MSE, SAD and NSDP]
Operations count of full search block matching
• CCIR signal
  • 720x288x50 (pixels/s)
• Search window for realistic velocities
  • 64x48 (HxV in pixels) ≈ 3000 possible vectors, assuming integer vector accuracy
• Matching error (SAD) calculation only:
  • approximately 1x10^11 (ops/s)
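The order of magnitude can be reproduced with a few lines of arithmetic. The figure of three operations per pixel per candidate (subtract, absolute value, accumulate) is an assumption; the slide only states the total.

```python
# Every pixel belongs to one block, and every block is tested against
# all candidates, so the cost per pixel scales with the candidate count.
pixels_per_second = 720 * 288 * 50   # CCIR field size times field rate
candidates = 64 * 48                 # integer vectors in the search window
ops_per_pixel_per_candidate = 3      # assumed: subtract, abs, accumulate

ops_per_second = pixels_per_second * candidates * ops_per_pixel_per_candidate
```

This gives roughly 0.96 x 10^11 operations per second, in line with the quoted approximation.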
Block-matching: efficient search techniques
Finding block similarity
[Figure: current block and search area; axes Dx and Dy]
Sub-sampled search
[Figure: current block and search area; axes Dx and Dy]
Sub-sampled full search
[Figure: candidate grid in the search area with steps labelled 1 and 2; axes Dx and Dy]
3-step search (Koga et al., 1981)
[Figure: candidate positions of the three search steps in the search area; axes Dx and Dy]
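A sketch of a three-step search over a generic error function. The step sizes 4, 2, 1 (covering ±7), the function names and the synthetic convex error surface standing in for the SAD are assumptions for illustration.

```python
def three_step_search(cost, start=(0, 0), steps=(4, 2, 1)):
    """Refine the best candidate on a 3x3 grid whose spacing shrinks each step."""
    best = start
    best_err = cost(*start)
    evaluated = 1
    for step in steps:
        centre = best
        for oy in (-step, 0, step):
            for ox in (-step, 0, step):
                if ox == 0 and oy == 0:
                    continue  # the centre was evaluated in the previous step
                cand = (centre[0] + ox, centre[1] + oy)
                err = cost(*cand)
                evaluated += 1
                if err < best_err:
                    best, best_err = cand, err
    return best, best_err, evaluated


def synthetic_error(cx, cy):
    """Hypothetical error surface with a single minimum at (5, -3)."""
    return abs(cx - 5) + abs(cy + 3)


vector, error, evaluated = three_step_search(synthetic_error)
```

Only 25 candidates are evaluated instead of the 225 of a full ±7 search. On an error surface with several minima the same procedure may end in a local minimum.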
One-at-a-time search (Srinivasan & Rao, 1985)
[Figure: search path in the (Dx, Dy) plane]
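A sketch of a one-at-a-time search: the vector is first refined horizontally in unit steps until the error stops decreasing, then vertically. The ±7 limit, the function names and the synthetic error surface are assumptions; for simplicity the sketch re-evaluates the point it came from.

```python
def one_at_a_time_search(cost, start=(0, 0), limit=7):
    """Minimise along the horizontal axis first, then along the vertical axis."""
    best = start
    best_err = cost(*start)
    for axis in (0, 1):
        improved = True
        while improved:
            improved = False
            for delta in (-1, 1):
                cand = list(best)
                cand[axis] += delta
                if abs(cand[axis]) > limit:
                    continue  # stay inside the search area
                err = cost(*cand)
                if err < best_err:
                    best, best_err = tuple(cand), err
                    improved = True
                    break
    return best, best_err


def synthetic_error(cx, cy):
    """Hypothetical error surface with a single minimum at (5, -3)."""
    return abs(cx - 5) + abs(cy + 3)


vector, error = one_at_a_time_search(synthetic_error)
```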
Successive approximation may become necessary
[Figure: contour plot of the error plane in the (Dx, Dy) plane, with search steps 1, 2, 3 leading from the start point x0 to xmin]
Prevention of trap in local minimum
[Figure: contour plot of the error plane with start points X0(a) to X0(d) and the minima Xmin(a) to Xmin(d) reached from them]
Reality is even more complicated…
And sometimes there is no unique solution…
Comparison of search techniques
[Figure: results obtained with FS, LogS and OTS]
Pixel sub-sampling in match function
Intermediate conclusion
• Efficient search techniques can greatly reduce the operations count of a block-matching motion estimator, but increase the risk of getting trapped in a local minimum of the error function
• Methods to prevent the disadvantages of efficient search increase complexity again
Pixel sub-sampling of match error criterion
[Figure: current block and search area; axes Dx and Dy]
Pixel sub-sampling in match error criterion
[Figure: sub-sampling pattern labelled 1 4 2 4]
Block sub-sampling
[Figure: current block, candidate vector and search area in pictures n-1 and n; axes H-position, V-position and picture number]
Interpolate missing motion vectors
[Figure: the current block with its neighbours Up, Le (left), Ri (right) and Lo (lower)]
1: Current Dx = median{Lex, (Upx+Lox)/2, Rix}
   Current Dy = median{Ley, (Upy+Loy)/2, Riy}
2: Use the vector-median to prevent new vectors
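Option 1 above can be sketched directly. The function names and the sample vectors are assumptions for illustration; vectors are (Dx, Dy) tuples.

```python
def median3(a, b, c):
    """Median of three scalars."""
    return sorted((a, b, c))[1]


def interpolate_vector(left, up, right, low):
    """Component-wise median of left, the up/low average, and right."""
    dx = median3(left[0], (up[0] + low[0]) / 2, right[0])
    dy = median3(left[1], (up[1] + low[1]) / 2, right[1])
    return (dx, dy)


# Hypothetical neighbouring vectors of a skipped block.
current = interpolate_vector(left=(4, 0), up=(2, 0), right=(8, 0), low=(6, 2))
```

Because the median is taken per component, the result can combine components of different neighbours or of the up/low average, so it may be a vector that none of the neighbours has. Option 2, the vector median, avoids this by always returning one of the input vectors.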
Summary cost reduction block matchers
• Simple match criterion
• Efficient search strategy
• Pixel sub-sampling in match criterion
  • a factor of four is usually feasible with little influence on the performance
• Block sub-sampling
  • only valid if motion field is smooth
Vectors and
object velocity
Full search block matching motion vectors
True motion versus best match
Poor relation vectors & velocities
[Figure: three image regions and their SAD error surfaces: 1 = number 7, 2 = arm, 3 = scarf]
• Seven: 1 clear min
• Arm: no clear min
• Scarf: multiple min
SAD: $\varepsilon(\vec{C}, \vec{X}, n) = \sum_{\vec{x} \in B(\vec{X})} \big|F(\vec{x}, n) - F(\vec{x} - \vec{C}, n-1)\big|$
C is motion vector, F image grey value, B 8x8 block, x pixel position, n picture nr
Block-matching: true-motion estimation
What is wrong with block matching?
• Blocks are not unique
• Optimization is an ill-posed problem
• Testing for best match gives too many solutions
• Solutions:
  • Introduce bias, e.g. towards consistent vectors (test better)
  • Post-processing, e.g. eliminating outliers (test again)
  • Pre-selection of likely candidates (test less)
Introduce bias
Test better…
Introduce bias: Test better
Minimal match error gives no unique solution:
$\varepsilon(\vec{C}, \vec{X}, n) = \sum_{\vec{x} \in B(\vec{X})} \big|F(\vec{x}, n) - F(\vec{x} - \vec{C}, n-1)\big|$
An improved criterion takes into account that vectors are consistent within objects and over time:
$\varepsilon(\vec{C}, \vec{X}, n) = \sum_{\vec{x} \in B(\vec{X})} \big|F(\vec{x}, n) - F(\vec{x} - \vec{C}, n-1)\big| + P_s(\vec{C}) + P_t(\vec{C})$
$P_s$ and $P_t$ are penalties depending on the spatial and temporal consistency of the candidate vector.
PROBLEM: Consistency is only known after completion…
Usually an iterative approach is required.
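A sketch of how penalties can change the winner. The slides do not define the penalties; here, purely as an assumption, the spatial penalty is the summed L1 distance to already estimated neighbouring vectors and the temporal penalty is the L1 distance to the vector found in the previous picture. The SAD values are hypothetical.

```python
def l1(a, b):
    """L1 distance between two vectors."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def penalised_error(sad_value, cand, spatial_neighbours, temporal_vector,
                    ws=1, wt=1):
    """SAD plus assumed spatial and temporal consistency penalties."""
    ps = ws * sum(l1(cand, v) for v in spatial_neighbours)
    pt = wt * l1(cand, temporal_vector)
    return sad_value + ps + pt


sads = {(0, 0): 50, (3, 1): 50, (-8, 5): 48}   # hypothetical match errors
neighbours = [(3, 1), (3, 0)]                  # vectors of neighbouring blocks
temporal = (3, 1)                              # vector from the previous picture

plain_best = min(sads, key=sads.get)
biased_best = min(
    sads, key=lambda c: penalised_error(sads[c], c, neighbours, temporal))
```

The plain SAD picks the inconsistent vector (-8, 5) by a margin of two grey levels, while the penalised criterion prefers (3, 1), which agrees with its surroundings.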
Post-processing: Test again…
Post-processing to improve vector consistency (Reuter, 1988)
[Figure: block grid with V-Pos rows y-2Y to y+3Y and H-Pos columns x-4X to x+4X, showing the neighbourhood used by the filter]
$\vec{D}_o(\vec{X}) = F_p\big(\vec{D}(\vec{X} + \vec{k})\big), \quad \vec{k} \in \text{Neighbourhood}$
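One possible choice for the post-filter is a component-wise median over a neighbourhood of blocks. The sketch below assumes a window of 5x3 blocks, border handling by repeating the edge blocks, and a hypothetical vector field with a single outlier.

```python
from statistics import median


def median_postfilter(field, half_w=2, half_h=1):
    """Component-wise median over a (2*half_w+1) x (2*half_h+1) block window."""
    rows, cols = len(field), len(field[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            xs, ys = [], []
            for dr in range(-half_h, half_h + 1):
                for dc in range(-half_w, half_w + 1):
                    rr = min(max(r + dr, 0), rows - 1)   # repeat edge blocks
                    cc = min(max(c + dc, 0), cols - 1)
                    xs.append(field[rr][cc][0])
                    ys.append(field[rr][cc][1])
            out_row.append((median(xs), median(ys)))
        out.append(out_row)
    return out


field = [[(2, 0)] * 5 for _ in range(3)]
field[1][2] = (-7, 6)   # a single outlier vector
smoothed = median_postfilter(field)
```

An averaging filter would smear the outlier over its neighbours, whereas the median removes it completely.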
The effect of post-filtering (5x3 blocks)
[Figure: vector fields: original, after average filtering, after median filtering]
Pre-selection: Test less…
Hierarchical block matching (Thoma & Bierling, 1989)
[Diagram: a coarse estimation on the down-sampled picture at the highest level initialises medium-size update vectors on the down-sampled picture at the intermediate level, which in turn initialise small-size update vectors on the original picture]
Hierarchical block matching
[Figure: result of hierarchical block matching versus full search]
Pre-selection in Fourier domain: Phase Plane Correlation
• PPC is a two-step hierarchical motion estimator
  1) Select the largest correlation peaks in the Fourier domain using blocks larger than 64x64
  2) Test SAD only for these vectors on a small block, here 8x8, in the spatial domain
• Algorithm originally proposed by Graham Thomas, and applied in professional studio scan converters
Time recursive block matching (Ninomiya, 1982)
[Figure: candidate vectors in the (Cx, Cy) plane, both axes running from -6 to +6]
Test SAD only for these vectors, centred around the result vector of the previous picture
ST-recursive candidate selection (after break)
3-D Recursive Search block-matching
3-Dimensional Recursive Search (3DRS)
Assumptions:
1. Objects are LARGER than blocks
2. Objects have INERTIA
Candidate set
• Spatial candidates
• Temporal candidates
• Updated candidates
3-D RS: How to start? Single random update sufficient!
[Figure: candidate vectors in the (Dx, Dy) plane: spatial prediction candidates, temporal prediction candidate, and a noise vector update]
Chosen candidates
[Figure: map of the chosen candidate type per block: Spatial, Temporal, Update]
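A much simplified sketch of one 3-D RS pass. The candidate positions (left and upper block in the current field, lower-right block in the previous field), the cyclic update table `UPDATES`, the zero initialisation and the wrap-around test picture are all assumptions for illustration, not the configuration of the slides.

```python
import random

BLOCK = 8
# Assumed cyclic table of update vectors added to the spatial prediction.
UPDATES = [(1, 0), (-1, 0), (0, 1), (0, -1), (2, 0), (-2, 0), (0, 2), (0, -2)]


def sad(cur, prev, bx, by, vec):
    """SAD of one block for candidate vec, wrapping at the picture borders."""
    h, w = len(cur), len(cur[0])
    cx, cy = vec
    return sum(abs(cur[y][x] - prev[(y - cy) % h][(x - cx) % w])
               for y in range(by, by + BLOCK) for x in range(bx, bx + BLOCK))


def three_d_rs(cur, prev, prev_field):
    """One scan over all blocks, testing only four candidates per block."""
    rows, cols = len(cur) // BLOCK, len(cur[0]) // BLOCK
    field = [[(0, 0)] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            left = field[r][c - 1] if c > 0 else (0, 0)      # spatial
            up = field[r - 1][c] if r > 0 else (0, 0)        # spatial
            temporal = prev_field[min(r + 1, rows - 1)][min(c + 1, cols - 1)]
            upd = UPDATES[count % len(UPDATES)]
            update = (left[0] + upd[0], left[1] + upd[1])    # updated
            count += 1
            candidates = [left, up, temporal, update]
            field[r][c] = min(
                candidates,
                key=lambda v: sad(cur, prev, c * BLOCK, r * BLOCK, v))
    return field


rng = random.Random(1)
prev = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
cur = [[prev[y][(x - 1) % 32] for x in range(32)] for y in range(32)]  # pan (1, 0)
zero_field = [[(0, 0)] * 4 for _ in range(4)]
field = three_d_rs(cur, prev, zero_field)
```

In this contrived example the first update happens to equal the true vector, after which it spreads to all other blocks through the spatial candidates. That is the sense in which a single update can be sufficient; in general several blocks or pictures are needed before the field converges.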
Performance
Operations Count
[Bar chart of operations count per method, vertical axis 0 to 140. Off-scale bars are annotated FS: 2000, H3: 1500 and Pel-Rec: 1000; the other bars (PPC, 4-St, 3-St, OTS, H2, 3-D RS) lie between 10 and 125]
Performance of a true-motion estimator: Smoothness
Vector field smoothness
[Bar chart of vector field smoothness for 4-St, 3-St, FS, OTS, H2, PPC and 3-D RS, vertical axis 0 to 4.5; bar values range from 0.2 to 4.3]
Performance testing of true-motion estimator: M2SE
[Diagram: ME followed by MC; picture n is interpolated from pictures n-1 and n+1]
$MMSE(n) = \sum_{\vec{x}} \big(F(\vec{x}, n) - F_{mc}(\vec{x}, n)\big)^2$
$F_{mc}(\vec{x}, n) = \frac{1}{2}\big(F(\vec{x} - \vec{D}(\vec{x}), n-1) + F(\vec{x} + \vec{D}(\vec{x}), n+1)\big)$
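The measure can be sketched for the special case of a single global vector. The wrap-around random test pictures and the function names are assumptions; a real evaluation would use the per-block vector field D(x).

```python
import random


def frame(base, n, d):
    """Picture number n of a sequence in which base moves by d per picture."""
    h, w = len(base), len(base[0])
    return [[base[(y - n * d[1]) % h][(x - n * d[0]) % w] for x in range(w)]
            for y in range(h)]


def m2se(prev, cur, nxt, d):
    """Squared error between picture n and its motion-compensated average."""
    h, w = len(cur), len(cur[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            f_mc = 0.5 * (prev[(y - d[1]) % h][(x - d[0]) % w]
                          + nxt[(y + d[1]) % h][(x + d[0]) % w])
            total += (cur[y][x] - f_mc) ** 2
    return total


rng = random.Random(2)
base = [[rng.randrange(256) for _ in range(16)] for _ in range(16)]
true_d = (2, -1)
prev, cur, nxt = (frame(base, n, true_d) for n in (0, 1, 2))

score_true = m2se(prev, cur, nxt, true_d)
score_zero = m2se(prev, cur, nxt, (0, 0))
```

Since picture n itself is not used in the interpolation, a vector that merely gives a low match error between two pictures does not automatically score well; the vector has to describe the motion across all three pictures.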
M2SE score of ME-methods
[Bar chart of M2SE for 4-St, OTS, 3-St, H2, FS, H3, PPC and 3-DRS, vertical axis 0 to 250; bar values range from 101 to 244]
Comparison of best vector fields
[Figure: Phase Plane Correlation motion vectors versus 3-D Recursive Search BM motion vectors]
MC up-conversion: relevance of true-motion vectors
[Figure: interpolated image using full search motion vectors versus interpolated image using 3D-RS motion vectors]
In contrast with coding, for scan rate conversion true motion is an absolute must. RATHER SMOOTH THAN ACCURATE!!
Simplifications
1) Reduced candidate set
With 8 predictions and 1 update: 9 candidates
[Figure: block grid (H-pos x-2X to x+2X, V-pos y-Y to y+2Y) around the current block, with spatial candidates Sa, Sb, Sc, Sd from blocks in the current field and temporal candidates Ta, Tb, Tc, Td from blocks in the previous field]
3DRS: 4 candidates are enough (including 1 update)
[Figure: same grid with spatial candidates Sa, Sb and one temporal candidate T]
Y-estimator: advantage for pipe-lining
[Figure: same grid with spatial candidates Sa, Sb and one temporal candidate T]
Effect of candidate reduction
[Figure: three vector fields, with scores in panel order]
• M2SE: 21.5, S: 2.8
• M2SE: 26.0, S: 1.7
• M2SE: 23.3, S: 2.6
Block diagram of Y-estimator: simple hardware
[Block diagram. Labels: Prediction memory; Update Generator (Mod p count, Look Up Table, update U(X,n)); Best vector selection with inputs from the current picture and previous picture; outputs D(X,n) and D(x,n); further labels 0 and Nbl]
Simplifications
2) Reduced resolution for ME
ME with reduced resolution compared to application
[Block diagram: the input feeds the application (de-interlacing, PRC, etc.), which produces the output, and also a down-scaler for the video signal; motion estimation on the reduced video yields D(x,n); the motion vectors are up-scaled before use in the application]
Sophistications
Iterating more than once on an image pair
[Chart: effect of iterations on the 1st image (once versus 10 times); M2SE and 100 x smoothness plotted against the iteration number 1 to 10, vertical axis 0 to 300]
Remark 1: If estimating in the output domain (100Hz): 2 iterations on video and 4 iterations on film material!
Remark 2: Effect mainly shows in 1st image after scene change:
• 1 iteration, 10th frame: M2SE: 29, Smoothness: 2.8
• 10 iterations, 10th frame: M2SE: 28, Smoothness: 3.5
Block erosion
Block diagram of Y-estimator: simple hardware
[Block diagram. Labels: Prediction memory; Update Generator (Mod p count, Look Up Table, update U(X,n)); Best vector selection with inputs from the current picture and previous picture; Block erosion stage turning D(X,n) into D(x,n); further labels 0 and Nbl]
Block erosion
[Figure: the centre block C is split into four sub-blocks V1 to V4; each sub-block vector is obtained with a median involving C and its neighbouring blocks U (up), D (down), L (left) and R (right)]
The effect of block erosion
[Figure: vector fields with no BE, 1 step BE, 2 step BE and 3 step BE]
Advanced
scanning
3-Dimensional Recursive Search (3DRS)
[Figure: block scanning orders: normal scan, meandering scan, reverse scan]
Parametric
motion models
Global motion estimation
• Simple parametric motion model:
  • $D_x(\vec{x}, n) = p_1(n) + p_3(n)\,x + p_5(n)\,y + \dots$
  • $D_y(\vec{x}, n) = p_2(n) + p_4(n)\,y + p_6(n)\,x + \dots$
• p1 and p2 describe pan and tilt
• p3 and p4 describe zoom
• p5 and p6 describe rotation
Sample vector field to calculate model parameters
Motion model with 4 parameters can be calculated from any 2 independent sample vectors
So, in total, 18 models can be estimated from these 9 vectors
Derive robust background model from sample vectors
Take the median of all estimated parameters to eliminate outliers:
$p_1 = \mathrm{median}\{p_1^{(1)}, p_1^{(2)}, p_1^{(3)}, \dots, p_1^{(18)}\}$
$p_2 = \mathrm{median}\{p_2^{(1)}, p_2^{(2)}, p_2^{(3)}, \dots, p_2^{(18)}\}$
$p_3 = \mathrm{median}\{p_3^{(1)}, p_3^{(2)}, p_3^{(3)}, \dots, p_3^{(18)}\}$
$p_4 = \mathrm{median}\{p_4^{(1)}, p_4^{(2)}, p_4^{(3)}, \dots, p_4^{(18)}\}$
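A sketch of the median-based model estimation for the 4-parameter case Dx = p1 + p3·x, Dy = p2 + p4·y. The 3x3 grid of sample positions, the parameter values and the outlier are assumptions. Pairs are used only if they differ in both coordinates; on a 3x3 grid that happens to give 18 pairs, which matches the count on the slide, although the slide does not state how the pairs are chosen.

```python
from itertools import combinations
from statistics import median


def true_vector(x, y):
    """Hypothetical background motion: pan/tilt (2, -1) plus a slight zoom."""
    return (2.0 + 0.01 * x, -1.0 + 0.01 * y)


# Nine sample vectors (x, y, Dx, Dy) on an assumed 3x3 grid of positions.
samples = []
for y in (50, 150, 250):
    for x in (100, 300, 500):
        dx, dy = true_vector(x, y)
        samples.append((x, y, dx, dy))
samples[4] = (300, 150, -6.0, 4.0)   # centre sample hits a foreground object


def model_from_pair(a, b):
    """Solve p1..p4 from two samples; None if the pair is not independent."""
    (x1, y1, dx1, dy1), (x2, y2, dx2, dy2) = a, b
    if x1 == x2 or y1 == y2:
        return None
    p3 = (dx2 - dx1) / (x2 - x1)
    p4 = (dy2 - dy1) / (y2 - y1)
    return (dx1 - p3 * x1, dy1 - p4 * y1, p3, p4)


models = []
for a, b in combinations(samples, 2):
    m = model_from_pair(a, b)
    if m is not None:
        models.append(m)

params = tuple(median(m[i] for m in models) for i in range(4))
```

The outlier contaminates only the four models that use the centre sample, so the median over the 18 models still returns the background parameters.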
Extra candidate from parametric motion model (SAA4992)
[Block diagram. Labels: Prediction memory; Update vector generator (Mod p counter, Look up table, update U(X,n)); Best vector selection with inputs from the current picture and previous picture; Block erosion; D(X,n) and D(x,n); sample vectors passed to a micro processor that calculates parameters P1, P2, …; calculation of local candidates from these parameters; further labels 0 and Nbl]
Effect of extra candidates from parametric model
[Figure: result without parametric candidate versus result with parametric candidate]
Clearly, the effect depends on the settings of the candidate’s penalty!
Block-hopping
Chosen candidates
[Figure: map of the chosen candidate type per block: Spatial, Temporal, Update]
• In many cases the spatial prediction (SP) is good.
• Save calculations on average by checking the other candidates only if the SP error is above Th
Block-hopping
[Figure: blocks for which all SADs are calculated; grey blocks are skipped]
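The decision rule can be sketched as follows. The function names, the fixed threshold and the hypothetical error table are assumptions; adapting the threshold to the available resources is not shown.

```python
def block_hopping(candidates, sad_of, threshold):
    """Accept the spatial prediction if its SAD does not exceed the threshold.

    candidates[0] is the spatial prediction (SP). Returns the chosen vector
    and the number of SADs that had to be calculated.
    """
    sp = candidates[0]
    sp_err = sad_of(sp)
    if sp_err <= threshold:
        return sp, 1          # the other candidates are skipped
    errors = [sp_err] + [sad_of(c) for c in candidates[1:]]
    best = min(range(len(candidates)), key=errors.__getitem__)
    return candidates[best], len(candidates)


# Hypothetical match errors for three candidates of one block.
table = {(1, 0): 5, (0, 0): 40, (2, 0): 3}
cands = [(1, 0), (0, 0), (2, 0)]

relaxed = block_hopping(cands, table.get, threshold=10)
strict = block_hopping(cands, table.get, threshold=4)
```

With the relaxed threshold the block costs one SAD but settles for a slightly worse vector; with the strict threshold all three SADs are calculated.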
Block hopping: optimal resource usage
[Block diagram. Labels: Vector memory; Calc. SAD of SP; compare with Th; MUXs; Calc. all SADs; Assign best D; Assign SP; Adapt threshold; Calculate resource usage/field]
Motion
estimation and
occlusion
The basic block matching concept
[Figure: reference block of 8 x 8 pixels, candidate vector and search area in pictures n-1 and n; axes H-position, V-position and picture number]
How to estimate motion in occlusion areas?
[Figure: pictures n-1 and n; for part of the picture the information is not available in the previous picture]
Ambiguities due to uncovering
[Figure: position versus time for pictures n-1 and n]
Preference for FG-vector in uncovered areas?
How to estimate motion in occlusion areas?
[Figure: pictures n-1 and n; some information is not available in the next picture, other information is not available in the previous picture]
Motion estimation problem in occlusion areas
• Observations:
  • Foreground: matches always, i.e. in previous and in next picture
  • Background:
    • In case of covering all background will match in previous picture
    • In case of uncovering all background will match in next picture
• Conclusion:
  • Switch between “forward” and “backward” motion estimation to prevent ambiguities
Solution: in covering areas, “forward” estimation
[Figure: reference block of 8 x 8 pixels, candidate vector and search area in pictures n-1 and n; axes H-position, V-position and picture number]
Solution: in uncovering areas, “backward” estimation
[Figure: reference block of 8 x 8 pixels, candidate vector and search area in pictures n-1 and n, with the roles of the two pictures swapped; axes H-position, V-position and picture number]
Unambiguous motion vectors for original images
Look for correspondences in BOTH neighbouring images, select the prediction with the highest correlation
[Figure: position versus time for pictures n-1, n and n+1, with a forward and a backward prediction]
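A 1-D sketch of the three-picture idea for an uncovered background block. The scene (a static background ramp and a foreground object moving two pixels per picture), the ±2 search range and the use of the SAD instead of a correlation measure are assumptions for illustration.

```python
def picture(fg_start, length=24, fg_width=4):
    """One scan line: static background ramp with a bright foreground object."""
    line = [10 * x for x in range(length)]
    for x in range(fg_start, fg_start + fg_width):
        line[x] = 255
    return line


def best_match(block_start, size, cur, ref, sign, search=2):
    """Lowest SAD over candidates c, comparing cur[x] with ref[x + sign*c]."""
    best = None
    for c in range(-search, search + 1):
        err = sum(abs(cur[x] - ref[x + sign * c])
                  for x in range(block_start, block_start + size))
        if best is None or err < best[0]:
            best = (err, c)
    return best


# The foreground moves right, uncovering background at positions 8 and 9.
prev, cur, nxt = picture(8), picture(10), picture(12)

match_prev = best_match(8, 2, cur, prev, sign=-1)   # F(x - C, n-1)
match_next = best_match(8, 2, cur, nxt, sign=+1)    # F(x + C, n+1)
choice = "next" if match_next[0] < match_prev[0] else "previous"
```

The uncovered background has no counterpart in the previous picture, so the best match there is poor and points to a wrong vector, while the next picture gives a perfect match with the zero vector.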
Comparison 2 frame and 3 frame motion estimation
[Figure: result of 2 frame ME versus 3 frame ME]
Global motion
estimation
Projection based global motion estimation
• Algorithm:
  • Accumulate luminance over all lines
  • Accumulate luminance over all columns
  • Determine global H- and V-motion based on these projections
Demo: Samsung ME
Projection based global motion estimation
• Global motion: minimum SAD of the projections of the current and previous image
[Figure: projections F(i,k) and F(i,k+1) plotted against i and fed to the global ME; output labelled 2v]
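A sketch of the projection method. The wrap-around test pictures, the ±4 shift range and the function names are assumptions; a practical implementation has to deal with picture borders instead of wrapping.

```python
import random


def projections(img):
    """Accumulate luminance per line and per column."""
    rows = [sum(line) for line in img]
    cols = [sum(col) for col in zip(*img)]
    return rows, cols


def best_shift(cur_p, prev_p, max_shift):
    """Shift v that minimises the SAD between the two 1-D projections."""
    n = len(cur_p)

    def err(v):
        return sum(abs(cur_p[i] - prev_p[(i - v) % n]) for i in range(n))

    return min(range(-max_shift, max_shift + 1), key=err)


def global_motion(cur, prev, max_shift=4):
    """Global (H, V) motion from the column and line projections."""
    cur_rows, cur_cols = projections(cur)
    prev_rows, prev_cols = projections(prev)
    return (best_shift(cur_cols, prev_cols, max_shift),
            best_shift(cur_rows, prev_rows, max_shift))


rng = random.Random(3)
prev = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
dx, dy = 3, -2
cur = [[prev[(y - dy) % 32][(x - dx) % 32] for x in range(32)]
       for y in range(32)]

motion = global_motion(cur, prev)
```

Two 1-D searches replace one 2-D search, which keeps the cost very low. The method has no means to separate a large moving object from the background, since both contribute to the same projections.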
Success and failure of the projection based global ME
Conclusions
• Motion estimators for scan rate conversion differ from estimators for coding, due to the additional true-motion constraint
• True motion results from constraints like spatial and temporal consistency
  • 3 options: better criterion, post-processing, pre-selection
• Pre-selection options
  • Hierarchical approach (e.g. Phase Plane Correlation)
  • Recursive approach (3-D RS)
Conclusions
• Picture rate conversion requires very consistent but not necessarily very accurate motion vectors (integer resolution sufficient); the range should be at least +/-16 pixels
• De-interlacing requires very accurate motion vectors (at least 1/4 pixel). For larger vectors the accuracy is less important
Prepare yourself for the exam…
• Last week:
  • Chapter 8
• Today:
  • Chapter 10 (not: object based ME)
• I recommend you read the text
  • Book available at Pt9:24
• And try the exercises in the book:
  • Chapter 8
  • Chapter 10, skip 10.6
  • You have to download VidProc (w3.ics.ele.tue.nl/~dehaan/ )
  • Send me e-mail for password (G.d.Haan@tue.nl)
Questions. Part 6. Motion estimation
1. A full-search block-matcher uses blocks of 8x8 pixels and a search range of 7x5 blocks. How many candidate vectors have to be evaluated per block?
2. A 3-D recursive-search block-matcher uses blocks of 8x8 pixels and a search range of 7x5 blocks. How many candidate vectors have to be evaluated per block in case the true-motion vector is (Dx,Dy) = (4,3)?
3. Increase the number of iterations using 3DRS block-matching and evaluate the effect on the M2SE and smoothness
4. Analyze the effect of a parametric motion model in 3DRS (choose a suitable test sequence!)
5. Try some of the available motion estimation algorithms of the software (4 frames will do..)
   1. How do they compare in M2SE and Smoothness?
   2. What is the effect of vector-field post-processing on these two quality metrics?
   3. What is the effect of the match-criterion and the number of images used in it?
   4. How would you rate the algorithms by (subjectively) evaluating the vector field?