SVC(project)

Overview of the Scalable Video Coding Extension of the H.264/AVC Standard
Kai-Chao Yang, NTHU, Taiwan
2007/8

Outline
- Introduction
    - Problems
    - Definition
    - Functionality
    - Goal
    - Competition
    - Applications
    - Targets
- History of SVC
- Structure of SVC
- Temporal Scalability
- Spatial Scalability
- Quality Scalability
- Combined Scalability
- Profiles of SVC
- Conclusions

Introduction - problem
- Non-scalable video streaming
    - Multiple video streams are needed for heterogeneous clients
    [Figure: clients served at 8 Mb/s, 6 Mb/s, 4 Mb/s, 1 Mb/s, and 512 Kb/s]

Introduction - definition
- Scalable video stream
    [Figure: a stream made of sub-stream 1, sub-stream 2, ..., sub-stream n; reconstruction from sub-streams k1, k2, ..., ki ranges from low quality to high quality]
- Scalability
    - Removal of parts of the video bit-stream to adapt to the various needs of end users and to varying terminal capabilities or network conditions

Introduction - functionality
- Functionality of SVC
    - Graceful degradation when “right” parts of the bit-stream are lost
    - Bit-rate adaptation to match the channel throughput
    - Format adaptation for backwards compatible extension
    - Power adaptation for trade-off between runtime and quality

Introduction - mode
- Example
    [Figure: layered example. Labels: most significant bit, base layer, enhancement layer (Enhancement 1 to Enhancement 5); values shown: 0 1 1 0 1, residual 10010 01101 10010 11001 00101]
- Scalability mode
    - Fidelity reduction (SNR scalability)
    - Picture size reduction (spatial scalability)
    - Frame rate reduction (temporal scalability)
    - Sharpness reduction (frequency scalability)
    - Selection of content (ROI or object-based scalability)

Structure of SVC
    [Figure: encoder block diagram. Labels: spatial decimation, temporal scalable coding, prediction, base layer coding, SNR scalable coding (repeated per layer), multiplex]
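The layered example on the "Introduction - mode" slide can be illustrated with a small bit-plane sketch. This is only an illustration under stated assumptions: it treats the five bit strings from the slide as five unsigned 5-bit residual values, and it assumes that a decoder receiving only the first k layers fills the missing low-order bits with zeros. The function names are hypothetical and are not taken from the slides or from any codec.

```python
# Sketch of fidelity (SNR) scalability by bit-plane truncation.
# ASSUMPTION: the slide's five bit strings are five unsigned 5-bit residuals.
# ASSUMPTION: missing low-order bit-planes are replaced by zeros.

TOTAL_BITS = 5
RESIDUALS = [0b10010, 0b01101, 0b10010, 0b11001, 0b00101]


def keep_bitplanes(value: int, total_bits: int, kept: int) -> int:
    """Keep the `kept` most significant bit-planes of `value`, zero the rest."""
    if not 0 <= kept <= total_bits:
        raise ValueError("kept must be between 0 and total_bits")
    dropped = total_bits - kept
    return (value >> dropped) << dropped


def reconstruct(kept: int) -> list:
    """Values a decoder would rebuild from the first `kept` bit-planes."""
    return [keep_bitplanes(v, TOTAL_BITS, kept) for v in RESIDUALS]


def total_error(kept: int) -> int:
    """Sum of absolute reconstruction errors for `kept` bit-planes."""
    return sum(abs(v - r) for v, r in zip(RESIDUALS, reconstruct(kept)))


if __name__ == "__main__":
    for kept in range(1, TOTAL_BITS + 1):
        print(kept, reconstruct(kept), total_error(kept))
```

Each additional bit-plane received can only lower the reconstruction error, which is the graceful-degradation behaviour listed under the functionality of SVC.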
Temporal Scalability
- Hierarchical prediction structures (numbers give the coding order of the pictures, listed in display order)
    - Hierarchical B pictures (GOP): 0 4 3 5 2 7 6 8 1 12 11 13 10 15 14 16 9
    - Non-dyadic hierarchical prediction: 0 3 4 2 6 7 5 8 9 1 12 13 11 15 16 14 17 18 10
    - Hierarchical prediction with zero delay: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Temporal Scalability
- Video coding experiment with H.264/MPEG4-AVC
    - Foreman, CIF 30 Hz @ 1320 kbps
    - Performance as a function of N
- Cascaded QP assignment
    - QP(P) = QP(B0) - 3 = QP(B1) - 4 = QP(B2) - 5
- Prediction structures
    - N=1: I P P P P P P P P
    - N=2: I B0 P B0 P B0 P B0 P
    - N=4: I B1 B0 B1 P B1 B0 B1 P
    - N=8: I B2 B1 B2 B0 B2 B1 B2 P
(This slide is copied from JVT-W132-Talk.)

Spatial Scalability
    [Figure: spatial scalable encoder. Labels: spatial decimation; hierarchical MCP & intra-prediction (texture, motion); base layer coding; inter-layer prediction (intra, motion, residual); H.264/AVC MCP & intra-prediction; H.264/AVC compatible coder producing the H.264/AVC compatible base layer bit-stream; multiplex producing the scalable bit-stream]

Spatial Scalability
- Similar to MPEG-2, H.263, and MPEG-4
- Arbitrary resolution ratio
- The same coding order in all spatial layers
- Combination with temporal scalability
- Inter-layer prediction
    [Figure: two spatial layers (Spatial 0, Spatial 1) with temporal levels (Temporal 0, Temporal 1, Temporal 2) and intra pictures]

Spatial Scalability
- The prediction signals are formed by
    - MCP inside the enhancement layer (temporal), for small motion and high spatial detail
    - Up-sampling from the lower layer (spatial)
    - Average of the above two predictions (temporal + spatial)
- Inter-layer prediction
    - Three kinds of inter-layer prediction
        - Inter-layer motion prediction
        - Inter-layer residual prediction
        - Inter-layer intra prediction
    - Base mode MB
        - Only the residual is transmitted, with no additional side information.

Spatial Scalability
- Inter-layer motion prediction
    - base_mode_flag = 1
        - The reference layer is inter-coded
        - Data are derived from the reference layer
            - MB partitioning
            - Reference indices
            - MVs
        [Figure: an 8x8 block at (x1, y1) with motion vector (x2, y2) in the reference layer maps to a 16x16 block at (2x1, 2y1) with motion vector (2x2, 2y2)]
    - motion_pred_flag
        - 1: MV predictors are obtained from the reference layer
        - 0: MV predictors are obtained by conventional spatial predictors

Spatial Scalability
- Inter-layer residual prediction
    - residual_pred_flag = 1
    - Predictor
        - Block-wise up-sampling by a bi-linear filter from the corresponding 8x8 sub-MB in the reference layer
        - Transform block basis

Spatial Scalability
- Inter-layer intra prediction
    - base_mode_flag = 1
    - The reference layer is intra-coded
    - Up-sampling from the reference layer
        - Luma: one-dimensional 4-tap FIR filter
        - Chroma: bi-linear filter

Spatial Scalability
- Past spatial scalable video
    - Inter-layer intra prediction requires complete decoding of the base layer.
    - Multiple motion compensation and deblocking filter operations are needed.
    - Full decoding + inter-layer prediction: complexity > simulcast.
- Single-loop decoding
    - Inter-layer intra prediction is restricted to MBs for which the co-located base layer is intra-coded

Spatial Scalability
- Single-loop vs. multi-loop decoding
    [Figure: single-loop vs. multi-loop decoding with I, B, P, and inter-coded pictures]
(This slide is copied from http://iphome.hhi.de/wiegand/assets/pdfs/H264AVC_SVC.pdf)

Spatial Scalability
- Generalized spatial scalability in SVC
    - Arbitrary ratio
        - Neither the horizontal nor the vertical resolution can decrease from one layer to the next.
    - Cropping
        - Containing new regions
        - Higher quality of interesting regions

Spatial Scalability
- Encoder control (JSVM)
    - Base layer
      \[ p_0' = \arg\min_{\{p_0\}} \{ D_0(p_0) + \lambda_0 R_0(p_0) \} \]
        - p0' is optimized for the base layer
    - Enhancement layer
      \[ p_1' = \arg\min_{\{p_1 \mid p_0\}} \{ D_1(p_1 \mid p_0) + \lambda_1 R_1(p_1 \mid p_0) \} \]
        - Decisions of p1 depend on p0
        - p1' is optimized for the enhancement layer
    - Efficient base layer coding but inefficient enhancement layer coding

Spatial Scalability
- Encoder control (optimization)
    - Base layer
        - Considering enhancement layer coding
        - Eliminating choices of p0 that disadvantage enhancement layer coding
      \[ p_0' = \arg\min_{\{p_0,\, p_1 \mid p_0\}} \{ (1 - w)[D_0(p_0) + \lambda_0 R_0(p_0)] + w[D_1(p_1 \mid p_0) + \lambda_1 R_1(p_1 \mid p_0)] \} \]
    - Enhancement layer
        - No change
    - w
        - w = 0: JSVM encoder control
        - w = 1: Single-loop encoder control (base layer is not controlled)

Quality Scalability
- Coarse-grain quality scalability (CGS)
    - A special case of spatial scalability
        - Identical sizes for base and enhancement layers
        - Smaller quantization step sizes for the residual of higher enhancement layers
    - Designed for only several selected bit-rate points
        - Supported bit-rate points = number of layers
        - Switching can only occur at IDR access units

Quality Scalability
- Medium-grain quality scalability (MGS)
    - More enhancement layers are supported
        - Refinement quality layers of residual
    - Key pictures
        - Drift control
    - Switching can occur at any access unit
    - CGS + key pictures + refinement quality layers

Quality Scalability
- Drift control
    - Drift: the effect caused by unsynchronized MCP at the encoder and decoder side
    - Trade-off of MCP in quality SVC
        - Coding efficiency vs. drift

Quality Scalability
- MPEG-4 quality scalability with FGS
    [Figure: base layer with refinement data (possibly lost or truncated)]
    - Base layer is stored and used for MCP of following pictures
    - Drift: drift free
    - Complexity: low
    - Efficiency: efficient base layer but inefficient enhancement layer
        - Refinement data are not used for MCP

Quality Scalability
- MPEG-2 quality scalability (without FGS)
    [Figure: base layer with refinement data (possibly lost or truncated)]
    - Only 1 reference picture is stored and used for MCP of following pictures
    - Drift: both base layer and enhancement layer
        - Frequent intra updates are necessary
    - Complexity: low
    - Efficiency: efficient enhancement layer but inefficient base layer

Quality Scalability
- 2-loop prediction
    [Figure: base layer with refinement data (possibly lost or truncated)]
    - Several closed encoder loops run at different bit-rate points in a layered structure
    - Drift: enhancement layer
    - Complexity: high
    - Efficiency: efficient base layer and moderately efficient enhancement layer

Quality Scalability
- SVC concepts
    [Figure: base layer with refinement data (possibly lost or truncated)]
    - Key picture
        - Trade-off between coding efficiency and drift
        - MPEG-4 FGS: all key pictures
        - MPEG-2 quality scalability: no key pictures

Quality Scalability
- Drift control with hierarchical prediction
    [Figure: base layer with refinement data (possibly lost or truncated) over pictures P B2 B1 B2 P B2 B1 B2 P]
    - Key pictures
        - Base layer is stored and used for the MCP of following pictures
    - Other pictures
        - Enhancement layer is stored and used for the MCP of following pictures
    - GOP size adjusts the trade-off between enhancement layer coding efficiency and drift

Combined Scalability
- SVC encoder structure
    [Figure: encoder with several dependency layers, each with its own temporal decomposition; the same motion/prediction information is used within a dependency layer]

Combined Scalability
- Dependency and quality refinement layers
    [Figure: dependency layers D=0, D=1, D=2, each with quality refinement layers Q=0, Q=1, Q=2, combined into the scalable bitstream]
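Each dependency layer in the combined structure carries its own temporal decomposition. For the dyadic hierarchical B pictures shown under Temporal Scalability, the coding order and the temporal level of every picture can be derived as in the sketch below. It is a minimal illustration, assuming a power-of-two GOP size and a depth-first coding order; all function names are hypothetical. The `qp_for_level` helper reads the slide's cascaded QP assignment as QP(P) = QP(B0) - 3 = QP(B1) - 4 = QP(B2) - 5 and maps B0, B1, B2 to temporal levels 1, 2, 3, which is an assumption made for this example.

```python
# Sketch of dyadic hierarchical B pictures: coding order, temporal levels,
# and frame-rate reduction by dropping the highest temporal levels.
# ASSUMPTION: gop_size is a power of two; coding order is depth-first.


def dyadic_coding_order(num_gops: int, gop_size: int) -> list:
    """Return display indices in the order the pictures are coded."""
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    order = [0]  # first key picture

    def visit(lo: int, hi: int) -> None:
        # Code the picture midway between two already coded pictures,
        # then refine the left half and the right half.
        if hi - lo < 2:
            return
        mid = (lo + hi) // 2
        order.append(mid)
        visit(lo, mid)
        visit(mid, hi)

    for gop in range(num_gops):
        lo, hi = gop * gop_size, (gop + 1) * gop_size
        order.append(hi)  # key picture that closes this GOP
        visit(lo, hi)
    return order


def coding_numbers(num_gops: int, gop_size: int) -> list:
    """Coding-order number of each picture, listed in display order."""
    order = dyadic_coding_order(num_gops, gop_size)
    numbers = [0] * len(order)
    for number, frame in enumerate(order):
        numbers[frame] = number
    return numbers


def temporal_level(frame: int, gop_size: int) -> int:
    """Temporal level: 0 for key pictures, larger for finer B pictures."""
    pos = frame % gop_size
    if pos == 0:
        return 0
    level = gop_size.bit_length() - 1  # log2(gop_size)
    while pos % 2 == 0:
        pos //= 2
        level -= 1
    return level


def frames_up_to_level(last_frame: int, gop_size: int, max_level: int) -> list:
    """Pictures that remain after dropping levels above max_level."""
    return [f for f in range(last_frame + 1)
            if temporal_level(f, gop_size) <= max_level]


def qp_for_level(qp_key: int, level: int) -> int:
    """Cascaded QP (assumed mapping): key pictures use qp_key, level k uses qp_key + 2 + k."""
    return qp_key if level == 0 else qp_key + 2 + level
```

With two GOPs of size 8, `coding_numbers(2, 8)` reproduces the coding-order numbers listed for hierarchical B pictures, and `frames_up_to_level(16, 8, 1)` keeps every fourth picture, which is a frame-rate reduction by a factor of four.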
Combined Scalability
    [Figure: dependency layers D0 and D1, each with quality layers Q0 and Q1, over pictures of temporal levels T0 T2 T1 T2 T0]

Combined Scalability
- Bit-stream format
    [Figure: NAL unit header, NAL unit header extension, NAL unit payload; field widths shown in the figure: 2, 6, 3, 3, 2 (with fields P, T, D, Q) and 1, 1, 1, 1, 1, 3]
    - P (priority_id): indicates the importance of a NAL unit
    - T (temporal_id): indicates the temporal level
    - D (dependency_id): indicates the spatial/CGS layer
    - Q (quality_id): indicates the MGS/FGS layer

Combined Scalability
- Bit-stream switching
    - Inside a dependency layer
        - Switching everywhere
    - Outside a dependency layer
        - Switching up only at IDR access units
        - Switching down everywhere if using multiple-loop decoding

Profiles of SVC
- Scalable Baseline
    - For conversational and surveillance applications requiring low decoding complexity
    - Spatial scalability: fixed ratio (1, 1.5, or 2) and MB-aligned cropping
    - Temporal and quality scalability: arbitrary
    - No interlaced coding tools
    - B-slices, weighted prediction, CABAC, and 8x8 luma transform
    - The base layer conforms to the Baseline profile of H.264/AVC

Profiles of SVC
- Scalable High
    - For broadcast, streaming, and storage
    - Spatial, temporal, and quality scalability: arbitrary
    - The base layer conforms to the High profile of H.264/AVC
- Scalable High Intra
    - Scalable High + all IDR pictures

References
- H. Schwarz, D. Marpe, and T. Wiegand, “Overview of the Scalable Video Coding Extension of the H.264/AVC Standard,” CSVT 2007.
- T. Wiegand, “Scalable Video Coding,” Joint Video Team, doc. JVT-W132, San Jose, USA, April 2007.
- T. Wiegand, “Scalable Video Coding,” Digital Image Communication, Course at Technical University of Berlin, 2006. (Available on http://iphome.hhi.de/wiegand/dic.htm)
- H. Schwarz, D. Marpe, and T. Wiegand, “Constrained Inter-Layer Prediction for Single-Loop Decoding in Spatial Scalability,” Proc. of ICIP’05.
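The P, T, D, and Q identifiers from the bit-stream format slide are what a stream extractor inspects when it removes NAL units. The sketch below unpacks the four identifiers from two bytes and applies a simplified keep/drop rule. It rests on loud assumptions: the bit layout (2 unlabeled bits, then 6-bit P, 3-bit T, 3-bit D, 2-bit Q, most significant bit first) is only one reading of the widths in the slide's figure, and the published H.264 Annex G header uses a different order and widths (for example a 4-bit quality_id), so this layout is illustrative only. The names `SvcIds`, `unpack_ids`, `pack_ids`, and `keep_unit` are hypothetical, and the extraction rule is a simplification rather than the standard's procedure.

```python
# Sketch: unpack P/T/D/Q identifiers and filter NAL units for a target point.
# ASSUMPTION: layout read from the slide figure; not the published syntax.
from typing import NamedTuple


class SvcIds(NamedTuple):
    priority_id: int    # P: importance of the NAL unit
    temporal_id: int    # T: temporal level
    dependency_id: int  # D: spatial/CGS layer
    quality_id: int     # Q: MGS/FGS layer


FIELD_WIDTHS = (
    ("unlabeled", 2),       # two bits the slide does not name
    ("priority_id", 6),
    ("temporal_id", 3),
    ("dependency_id", 3),
    ("quality_id", 2),
)
TOTAL_BITS = sum(width for _, width in FIELD_WIDTHS)  # 16 bits, two bytes


def unpack_ids(data: bytes) -> SvcIds:
    """Read the identifiers from the first two bytes, MSB first."""
    if len(data) < TOTAL_BITS // 8:
        raise ValueError("need at least two bytes")
    word = int.from_bytes(data[:TOTAL_BITS // 8], "big")
    remaining = TOTAL_BITS
    fields = {}
    for name, width in FIELD_WIDTHS:
        remaining -= width
        fields[name] = (word >> remaining) & ((1 << width) - 1)
    return SvcIds(fields["priority_id"], fields["temporal_id"],
                  fields["dependency_id"], fields["quality_id"])


def pack_ids(ids: SvcIds) -> bytes:
    """Inverse of unpack_ids; the unlabeled bits are written as zero."""
    values = {"unlabeled": 0, **ids._asdict()}
    word = 0
    for name, width in FIELD_WIDTHS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word.to_bytes(TOTAL_BITS // 8, "big")


def keep_unit(ids: SvcIds, target_d: int, max_t: int, max_q: int) -> bool:
    """Simplified rule: drop higher temporal levels and higher dependency
    layers; keep lower dependency layers whole; limit quality only in the
    target dependency layer."""
    if ids.temporal_id > max_t:
        return False
    if ids.dependency_id < target_d:
        return True
    return ids.dependency_id == target_d and ids.quality_id <= max_q
```

A usage example under the same assumptions: `unpack_ids(pack_ids(SvcIds(5, 2, 1, 1)))` returns the identifiers unchanged, and `keep_unit(SvcIds(0, 2, 1, 1), target_d=1, max_t=1, max_q=1)` is false because the unit's temporal level exceeds the target.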