CAMED: Complexity Adaptive Motion Estimation & Mode Decision for H.264 Video Yong Wang

advertisement
ICASSP’06
CAMED: Complexity Adaptive
Motion Estimation & Mode
Decision for H.264 Video
Yong Wang
Prof. Shih-Fu Chang
Digital Video and Multimedia (DVMM) Lab, Columbia University
CAMED
Outline
„
„
Complexity aware H.264 coding
CAMED
„
„
„
„
„
Rate-Distortion-Complexity video coding
Complexity cost function
Complexity control
Experiment results
Conclusions & future work
2
CAMED
H.264: Emerging Video Standard
„
More efficient motion prediction
„
„
„
„
„
„
„
Variable block size motion compensation (as small as 4 x 4)
Quarter pixel interpolation
Multiple reference frames (up to 5)
Decoupling of referencing order from display order
Directional spatial prediction for intra coding
SKIP/DIRECT Mode
Improved coding efficiency
„
„
„
„
Small block-size transform (4 x 4 integer transform)
Lossless inverse transform
Context-based adaptive binary arithmetic coding (CABAC)
Switching P (SP) and switching I (SI) pictures
3
CAMED
Complexity Issue In H.264
„
Much better quality, with much higher complexity
[Lappalainen’03]
[Schäfer’03]
4
CAMED
Complexity Aware H.264 Coding
„
„
Important for mobile video solutions
H.264: Emerging coding standard
„
„
„
Much higher quality
Much higher complexity
Complexity constrained video coding
min D ( P ) ,
s.t., R ( P ) ≤ RT , P : QP, MV , Mode,
C ( P ) ≤ CT
5
?
CAMED
Approaches for Comp. Reduction
ƒ Hardware level
ƒ Hardware architecture (MMX, SIMD [Chen’04])
ƒ Low power chip (Advanced voltage scaling [ARM’03])
ƒ High capacity battery [GIT]
ƒ Encoder complexity
ƒ
ƒ
ƒ
ƒ
ƒ
Fast motion estimation (FME) [Li’05, Tourapis’03 ]
Fast mode decision (FMD) [Zhu’05, Peng’05]
Fast reference frame selection [Su’04]
Adaptive I-B-P GOP structure [Ray’04]
SAD (Sum of Absolute Difference) complexity modeling
ƒ Decoder complexity
ƒ
ƒ
ƒ
ƒ
[He’04]
Smart inverse DCT [Ortega’00]
Interpolation approximation [Wang’05]
Implementation optimization (buffer reusage, assembly coding)
Complexity Adaptive ME & MD (CAMED)
6
CAMED
CAMED: Philosophy
P(t ) +
-
Transform
+
d (t )
Inverse
Transform
Quantization
d T (t )
Inverse
Quantization
Entropy
Encoding
d T (t ) Entropy
Decoding
+
+
PR (t )
+
Input
Frames
Motion
Compensation
P (t − 1)
Pre-decoded
Frame Buffer
v
v
Motion
Estimation &
Mode Decision
Video Encoder
Decoded
Frame
De-blocking
(H.264)
Entropy
Decoding
Video Decoder
v
v
Entropy
Encoding
7
Bit Strem
Combining
Encoded
Bitstream
CAMED
CAMED: Highlight
„
„
„
„
„
„
Develop an encoding algorithm that achieves both high
video quality and low decoding complexity
Addressed the most significant module (Motion
Compensation) in decoders, so that the complexity
reduction is efficient.
Nothing need to be done in decoders, and the extra cost in
encoders is trivial
Full standard compliant
The proposed method can be a valuable complement for
hardware-level and encoder-level solutions
Very promising for mobile video applications.
8
CAMED
R-D-C Optimization
„
R-D-C Lagrange optimization
R , D ,C
VC* ( B, M ) = arg min J MOTION
( V B, M )
V∈sup{V}
{
}
R,D
= arg min J MOTION
(V B, M ) + γ MOTION CMOTION (V B, M )
V∈sup{V}
R , D ,C
M C* ( MB, QP ) = arg min J MODE
( M MB, QP )
M ∈sup{M }
{
}
R,D
= arg min J MODE
( M MB, QP ) + γ MODE CMODE ( M MB, QP )
M ∈sup{M }
γ MOTION = γ MODE
9
CAMED
Complexity Cost Function
„
Pixel-based interpolation operation
⎧ 0
⎪ e
6
⎪
⎪⎪ e6 + e2
cP ( V ) = ⎨
⎪ 2e6 + e2
⎪ 7e6
⎪
⎪⎩7e6 + e2
„
V is integer MV
V is subpixel b,h
V is subpixel a,c,d,n
V is subpixel e,g,p,r
V is subpixel j
V is subpixel i,f,k,q
G a b c H
d e
h i
f
j
n p q
M
s
Can be further simplified counting 7-tap interpolations.
10
g
k m
r
N
CAMED
Complexity Cost Function (cont.)
„
Block-based interpolation operation
Mode
Integer
1 6-tap
2 6-tap
7 6-tap
Skip/Direct
0
256
512
592
16x16
0
256
512
592
16x8
0
128
256
296
8x16
0
128
256
296
8x8
0
64
128
168
8x4
0
32
64
84
4x8
0
32
64
84
4x4
0
16
32
52
Intra
0
0
0
0
11
CAMED
Complexity Cost Function (cont.)
„
Complexity benchmark
„
Real time AVC decoder in Dell Axim x50 PDA
Dell Axim x50 w/ 524MHz Xscale CPU, Foreman, QCIF, 15fps, baseline, 200Kbps
Decoding Time Per Pixel (10E-7s)
0.6
M1
M2
M3
M4
M5
M6
M7
0.5
0.4
0.3
0.2
0.1
0
V1
V5
V9
V13
V2
V3
V4
V6
V8
V14
Different Subpixel Levels
12
V16
V7
V10
V11
V12
V15
CAMED
Complexity Cost Function (cont.)
„
Modeling based on complexity benchmark
c (v, M ) = b
−p
M
„
„
„
(n
c + n2,v c2 + cl )
6,v 6
Content-independent: the complexity is mainly decided
by the arithmetic calculations
Can be summarized into look-up table to reduce
encoding cost
Reflects the actual computing complexity, including not
only interpolation, but also other cost rather than
interpolation such as memory access and logic flow.
13
CAMED
Complexity Control
„
The procedure to adjust coding parameters so that the
complexity level meets the target requirement.
„
„
Similar to rate control
Control parameter: Lagrange multiplier
γ MOTION , γ MODE
Mobile, 600kbps
„
Complexity modeling
C ( γ MODE ) = D ( K1 ln ( γ MODE ) + K 0 )
⎧⎪ C ( t ) − K0 D ( t ) ⎫⎪
γ MODE ( t ) = exp ⎨
⎬
K
D
t
1 ( )
⎩⎪
⎭⎪
Compelxity (1K Interpolations)
40
35
30
25
15
10
5
0 0
10
14
Each line is for one P frame
20
10
1
10
2
Lagrange multiplier for block mode
10
3
CAMED
Complexity Control (cont.)
„
Model estimation
„
„
Complexity
Collect complexity stat using
sliding window
Estimate model parameters C E
through regression
Observed
(γ , C ) Point
Increased frame complexity
Observed
„
γ MODE
Buffer management
„
„
(γ , C ) Point
Lagrange Multiplier
Use complexity buffer to simulate the complexity usage status
on the decoder side
Estimate of available resource, adjust the parameters, and
avoid issues of buffer overflow or underflow
15
CAMED
Experiment Setup
„
Environment
„
„
„
„
„
„
„
„
Software: H.264 JM 8.2
Sequences: Akiyo, Foreman, Stefan, Mobile
Resolution: QCIF/CIF
Profile: Main/Baseline
Bit rate: 100kbps~1Mbps
Block mode: all on
Direct mode: temporal
Motion estimation search range: 32 pixels
16
CAMED
R-D-C Optimization
Foreman P S NR
Foreman Comple xity
80
40
39
70
P S NR(dB)
38
37
36
35
34
33
32
H.264
Lamda=5
Lamda=20
Lamda=50
Lamda=200
Lamda=500
31
30
29
28
0
500
1000
Bandwidth (kbps )
1500
Re ma ining Comple xity(%)
42
41
60
50
40
30
La mda =5
La mda =20
La mda =50
La mda =200
La mda =500
20
10
0
0
500
1000
Bandwidth (kbps )
1500
Up to 60% 6-tap interpolation can be saved while keeping the video quality
almost intact (quality difference less than 0.2dB)
17
CAMED
Demonstration
18
CAMED
Results: Complexity Benchmark
Foreman, QCIF, 15fps, baseline
50.00
48.00
46.00
PSNR (dB)
44.00
42.00
Beat H.264 up to 1.8dB in
terms of C-D performance
40.00
H.264
L1.0
L2.0
L5.0
L0.5
L10
400K
38.00
36.00
34.00
20.00
22.00
24.00
26.00
28.00
30.00
32.00
Decoding Time (s)
Dell Axim x50
19
34.00
36.00
38.00
40.00
CAMED
Results: Compatibility w/ FME
20
CAMED
Results: Complexity Control
Forema n, 1000kbps
400
Comple xity (K inte rpolation)
350
300
J M82
50K
100K
150K
200K
250
200
150
100
50
0
0
5
10
GOP Index
21
15
20
CAMED
Results: Complexity Control
CIF, 30fps, 100kbps
22
CAMED
Conclusions & Future Work
„
Proposed CAMED for complexity-efficient H.264 coding
Applied rigorous R-D-C optimization framework
Defined several complexity cost functions
Proposed the algorithm of complexity control
Performance Validated through extensive simulation experiment
„
Future work
„
„
„
„
„
„
„
Theoretical analysis on the relationship between the complexity
level and the control parameter (i.e., the Lagrange multiplier)
Considering hardware acceleration (such as MMX, SSE)
More efficient complexity modeling and complexity control
23
THANK YOU!
Prof. Shih-Fu Chang
sfchang@ee.columbia.edu
Yong Wang
ywang@ee.columbia.edu
Download