Standardization in JVT: Scalable Video Coding Jens-Rainer Ohm

advertisement
Jens-Rainer Ohm
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Standardization of SVC
Standardization in JVT:
Scalable Video Coding
Jens-Rainer Ohm
RWTH Aachen University
Institute of Communications Engineering
ohm@ient.rwth-aachen.de
http://www.ient.rwth-aachen.de
RWTH Aachen University
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
JVT
Jens-Rainer Ohm
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Standardization of SVC
Outline
! Introduction
! Summary on MPEG Call for Proposals on SVC and
subsequent Core Experiments process
! Results
! Technical solutions
! Present status of SVC standard development in JVT
! Temporal scalability and open-loop concepts
! Spatial scalability
! SNR and fine-granularity scalability
! Conclusions
2
RWTH Aachen University
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
JVT
Jens-Rainer Ohm
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Standardization of SVC
Scalable Media – the Idea
! Universal Media Access:
code once and then customize the stream to access content
! “Anytime”
! from “Anywhere” (i.e. using any access network - wireless,
internet etc.)
! and by “Anyone” (i.e. with any terminal complexity)
! Compatibility of different formats/resolutions
Scalable
Coded Content
Network
Terminal
Terminal capabilities & Network characteristics feedback
3
RWTH Aachen University
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
JVT
Jens-Rainer Ohm
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Standardization of SVC
MPEG/ITU-T Work Item on Scalable Video Coding
! SVC Work item
! started as ISO/IEC 21000-13
(MPEG-21 Scalable Video Coding)
! was moved to ISO/IEC 14496-10 January 2005
! Timeline:
! October 2003: Call for Proposals
! March 2004: Evaluation of proposals
! January 2005: Working Draft (now within JVT)
! October 2005: Committee Draft
! March 2006 Final Committee Draft
! July 2006 Final Draft International Standard
4
RWTH Aachen University
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
JVT
Jens-Rainer Ohm
Standardization of SVC
ITU-T VICA Workshop, 22-23 July 2005, Geneva
MPEG Call for Proposals on SVC – Results
!
Scenario 2: Scalability over 2 spatial layers
Scenario 2 Overview
MOS x2 loss compared to anchor
MCTF+block transform
0
Advanced scalable
-1
MC prediction
-2
-3
-4
-5
QCIF 1
5
MCTF+2DWT
RWTH Aachen University
QCIF 2
ohm@ient.rwth-aachen.de
CIF 1
CIF 2
Institute of Communications Engineering
CIF 3
A2
J2
K2
B2
H2
D2
I2
E2
L2
C2
G2
F2
JVT
Jens-Rainer Ohm
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Standardization of SVC
Following the Call
!
Core experiments were run to further
identify and develop SVC technology
! based on Scalable Video Model (SVM)
! Comparison made for wavelet-based and
AVC/H.264-extending solutions
!
!
6
Full range of scalabilities tested (3x spatial, 3x
temporal, medium-granularity SNR)
Formal visual tests (experts & non-experts)
indicated superiority of AVC/H.264 solution,
particularly at low rates and resolutions
RWTH Aachen University
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
JVT
Jens-Rainer Ohm
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Standardization of SVC
JSVM and Working Draft
! JSVM: Joint Scalable Video Model (extending from
MPEG Scalable Video Model 3.0)
! Working Draft of 14496-10:2005/AMD1 Scalable Video
Coding
! Extension of AVC / H.264 with compatible base layer
! Motion-compensated temporal filtering (MCTF) with adaptive
prediction and update steps for open-loop compression
! Layered structure with "bottom-up" prediction from lower layers
! One decoder loop
! FGS functionality as extension of CABAC, modified interleaved
scan order („cyclic block coding“)
! New SVC-specific NAL unit types
! Bitstream scalability at level of NAL packets
7
RWTH Aachen University
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
JVT
Jens-Rainer Ohm
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Standardization of SVC
Motion-compensated Lifting Filters
! Temporal-axis
video
A B A B A B
lifting filters,
sequence
MC/IMC in lifting flow
H = A − MC ( B)
! MC and IMC
MC
MC
MC
Prediction
must be aligned
1 -1
1 -1
1 -1
highpass
! H pictures similar to
sequence
H
H
H
MC prediction P pictures
! Adaptive predict & update
Update
! Switching intra/forward/
1/2 1 1/2 1 1/2 1
backward/bi-directional lowpass
L
L
L
sequence
! L pictures similar to
I pictures when update
1
L = [ MC ( A ) + B ]
turned down
2
IMC
8
RWTH Aachen University
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
IMC
IMC
JVT
Jens-Rainer Ohm
Standardization of SVC
ITU-T VICA Workshop, 22-23 July 2005, Geneva
MCTF as „Temporal-axis Wavelet Tree“
A
B
A
B
A
original video sequence
B
A
B
H
L
H
L
H
L
L
H
1st decomposition level
video sequence
1st temporal level
H
2nd temporal level
3rd temporal level
9
LL
LH
LL
LLL
LH
LL
LH
2nd decomposition level
LLH
! No closed loop in prediction
and update required
! Prediction loop over LLL.. possible
RWTH Aachen University
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
LLH
LLL
3rd level
JVT
Jens-Rainer Ohm
Standardization of SVC
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Layered Coding with compatible base layer
GOP boundaries
10
MCTF enhancement
layer
L3
H1
H2
H1
H3
H1
H2
H1
L3
AVC Main Profile
compatible base layer
A
B3
B2
B3
B1
B3
B2
B3
A
RWTH Aachen University
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
JVT
Jens-Rainer Ohm
Standardization of SVC
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Layered Coding with Spatial Pyramids
Coding
4CIF
Scaling
Prediction
Coding
CIF
Scaling
Scalable Stream
Prediction
QCIF
Coding
! Pyramid structure
! Basic building blocks from AVC / H.264
! Spatial transform coding using integer DCT approximations
(4x4 and 8x8)
11
RWTH Aachen University
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
JVT
Jens-Rainer Ohm
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Standardization of SVC
Inter Layer Prediction
8x8
8x4
4x8
4x4
Direct,
16x16,
16x8,
8x16
Intra
16x16
16x8
16x16
16x16
Intra-BL Intra-BL
8x16
8x8
16x16
16x16
Intra-BL Intra-BL
! Motion: Upsampled partitioning and motion vectors for
prediction
! Residual: Block-wise upsampled residual (bi-linear)
! Intra: Upsampled intra MB (H.264 / AVC 1/2-pel filter)
12
RWTH Aachen University
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
JVT
Jens-Rainer Ohm
Standardization of SVC
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Extended Spatial scalability
! Cropping and non-dyadic up-sampling
! Macroblocks not aligned between layers
wenh
wextract
(xorig ,yorig)
henh
down-sampling
hextract
hbase
wbase
Base spatial layer
Enhancement spatial layer
13
RWTH Aachen University
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
JVT
Jens-Rainer Ohm
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Standardization of SVC
SNR Scalability
! Layered representation of L/H pictures
! Non-scalable ‘base-layer’ (BL)
! Motion information
! Coarsest represenation of intra and residual data
! Minimum acceptable reconstruction quality
! Quality scalable enhancement layers (EL)
! Residue between original and SNR-BL representation
! Doubled quantizer precision per SNR-EL
! FGS: truncation of EL packets at arbitrary points
! Refinement in the transform domain
! Single inverse transform at the decoder side
! “Dead substream” concept allows flexible increase of
quality beyond the limitations of layered coding
14
RWTH Aachen University
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
JVT
Jens-Rainer Ohm
Standardization of SVC
ITU-T VICA Workshop, 22-23 July 2005, Geneva
SNR&Spatial scalability: 2 spatial layer example
Video
Bitstream
bitstream
Texture
ResidualCoding
5/3 MCTF
2D Spatial Decimation
AVC
(with Hierarchical
B pictures)
Texture
N1 1x Quality Layer
N 1x Quality Layer
N x Quality Layer
Quality BaseLayer
Motion {Mp1 }
MotionCoding
Motion
Interpolation
Texture
Multiplex
Motion
2D Texture
Interpolation
N1 1x Quality Layer
N 0x Quality Layer
N x Quality Layer
Motion {Mp 0}
bistream
15
RWTH Aachen University
bistream
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
JVT
Jens-Rainer Ohm
Standardization of SVC
ITU-T VICA Workshop, 22-23 July 2005, Geneva
SNR&Spatial scalability: 2 spatial layer example
Video
Bitstream
bitstream
Texture
ResidualCoding
5/3 MCTF
Texture
N1 1x Quality Layer
N 1x Quality Layer
N x Quality Layer
Quality BaseLayer
Motion {Mp1 }
MotionCoding
AVC
(with Hierarchical
B pictures)
=
Progresive
Motion Refinement
Quality Layer
Interpolation
Texture
ResidualCoding
CGS
OR
Quality Layer
2D Texture
MotionCoding
Interpolation
(Opt.)
N1 1x Quality Layer
N 0x Quality Layer
N x Quality Layer
Motion {Mp 0}
bistream
16
RWTH Aachen University
bistream
Motion
ResidualCoding
FGS
Quality Layer
2D Spatial Decimation
Multiplex
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
JVT
Jens-Rainer Ohm
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Standardization of SVC
Realisation of FGS SNR Scalability
!
!
!
Three scans for FGS coding of transform coefficients
1. Non-significant transform coefficients of significant blocks
2. Refinement symbols for already significant coefficients
3. Coefficients of non-significant blocks
Scanning pattern for each scan
! Scanning from low to high frequency bands (zig-zag)
! Inside each band: first luma, then chroma in raster scan
Coding of transform coefficient levels
! Non-significant coefficients
!
!
!
Significant coefficients
!
17
Coded Block Pattern, Coded Block Bit,
DeltaQP, Transform Size
CABAC symbols (SIG, LAST, SIGN, ABS)
Refinement symbol (-1, 0, +1)
RWTH Aachen University
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
JVT
Jens-Rainer Ohm
Standardization of SVC
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Bitstream structure
Spatial
! Data cube model and layered stream
representation
4CIF
B1
Layer 3
T0, S2, B0
Layer3
T1, S2, B0
Layer 3
T2, S2, B0
B1
CIF
Layer 1
T0, S1, B0
Layer 1
T1, S1, B0
Layer 2
T2, S1, B0
B1
SN
QCIF
Base-Layer
Layer 1
T1, S0, B0
R
Layer 2
T2, S0, B0
Temporal
7.5 fps
18
RWTH Aachen University
15 fps
ohm@ient.rwth-aachen.de
30 fps
Institute of Communications Engineering
JVT
Jens-Rainer Ohm
Standardization of SVC
ITU-T VICA Workshop, 22-23 July 2005, Geneva
…
scalability level A
SVC NAL unit
length
extension header with
level ID
length
SVC NAL unit
SVC sample
length
SVC NAL unit
Length
Coded
Video
NAL U
length
extension header with
level ID
length
SEI
NAL U
AVC NAL unit
Length
length
Param
NAL U
AVC NAL unit
length
Length
! New SVC NAL unit types can be ignored by
conventional AVC parser
scalability level B
sample
19
RWTH Aachen University
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
JVT
Jens-Rainer Ohm
Standardization of SVC
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Performance Example
CREW 176x144 15
37
36.5
36
35.5
]
B
d[
R
N
S
P
35
34.5
34
JSVM2 CombinedScalability
JSVM2 SingleLayer
JSVM2 SnrScalability
JSVM2 SpatialScalability
33.5
33
80
20
RWTH Aachen University
100
120
ohm@ient.rwth-aachen.de
140
Rate [kbit/s]
160
180
Institute of Communications Engineering
200
JVT
Jens-Rainer Ohm
Standardization of SVC
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Performance Example
CREW 352x288 30
37.5
37
36.5
36
]
B
d[
R
N
S
P
35.5
35
34.5
JSVM2 CombinedScalability
JSVM2 SingleLayer
JSVM2 SnrScalability
JSVM2 SpatialScalability
34
33.5
350
21
RWTH Aachen University
400
450
500
ohm@ient.rwth-aachen.de
550
600
Rate [kbit/s]
650
700
750
Institute of Communications Engineering
800
JVT
Jens-Rainer Ohm
Standardization of SVC
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Performance Example
CREW 704x576 60
37.5
37
36.5
]
B
d[
R
N
S
P
36
35.5
JSVM2 CombinedScalability
JSVM2 SingleLayer
JSVM2 SnrScalability
JSVM2 SpatialScalability
35
34.5
1400
22
RWTH Aachen University
1600
1800
2000
2200
2400
Rate [kbit/s]
ohm@ient.rwth-aachen.de
2600
2800
3000
Institute of Communications Engineering
3200
JVT
Jens-Rainer Ohm
ITU-T VICA Workshop, 22-23 July 2005, Geneva
Standardization of SVC
Summary
! SVC to become an extension of H.264 / AVC
!
!
!
!
Residual coding using existing tools
MCTF open loop structure with update step
Pyramid structure with inter layer prediction
FGS quality layers plus a minimum quality base layer
! Good compression performance
! tradeoffs by the amount of flexibility in scalability
! Further improvements expected e.g. by
!
!
!
!
!
23
adaptive MCTF structures
combinations closed/open loop
improvement of quantization & FGS schemes
cross-layer dependencies of texture and motion
joint RD optimization over multiple layers
RWTH Aachen University
ohm@ient.rwth-aachen.de
Institute of Communications Engineering
JVT
Download