Jens-Rainer Ohm ITU-T VICA Workshop, 22-23 July 2005, Geneva Standardization of SVC Standardization in JVT: Scalable Video Coding Jens-Rainer Ohm RWTH Aachen University Institute of Communications Engineering ohm@ient.rwth-aachen.de http://www.ient.rwth-aachen.de RWTH Aachen University ohm@ient.rwth-aachen.de Institute of Communications Engineering JVT Jens-Rainer Ohm ITU-T VICA Workshop, 22-23 July 2005, Geneva Standardization of SVC Outline ! Introduction ! Summary on MPEG Call for Proposals on SVC and subsequent Core Experiments process ! Results ! Technical solutions ! Present status of SVC standard development in JVT ! Temporal scalability and open-loop concepts ! Spatial scalability ! SNR and fine-granularity scalability ! Conclusions 2 RWTH Aachen University ohm@ient.rwth-aachen.de Institute of Communications Engineering JVT Jens-Rainer Ohm ITU-T VICA Workshop, 22-23 July 2005, Geneva Standardization of SVC Scalable Media – the Idea ! Universal Media Access: code once and then customize the stream to access content ! “Anytime” ! from “Anywhere” (i.e. using any access network - wireless, internet etc.) ! and by “Anyone” (i.e. with any terminal complexity) ! Compatibility of different formats/resolutions Scalable Coded Content Network Terminal Terminal capabilities & Network characteristics feedback 3 RWTH Aachen University ohm@ient.rwth-aachen.de Institute of Communications Engineering JVT Jens-Rainer Ohm ITU-T VICA Workshop, 22-23 July 2005, Geneva Standardization of SVC MPEG/ITU-T Work Item on Scalable Video Coding ! SVC Work item ! started as ISO/IEC 21000-13 (MPEG-21 Scalable Video Coding) ! was moved to ISO/IEC 14496-10 January 2005 ! Timeline: ! October 2003: Call for Proposals ! March 2004: Evaluation of proposals ! January 2005: Working Draft (now within JVT) ! October 2005: Committee Draft ! March 2006 Final Committee Draft ! July 2006 Final Draft International Standard 4 RWTH Aachen University ohm@ient.rwth-aachen.de Institute of Communications Engineering JVT Jens-Rainer Ohm Standardization of SVC ITU-T VICA Workshop, 22-23 July 2005, Geneva MPEG Call for Proposals on SVC – Results ! Scenario 2: Scalability over 2 spatial layers Scenario 2 Overview MOS x2 loss compared to anchor MCTF+block transform 0 Advanced scalable -1 MC prediction -2 -3 -4 -5 QCIF 1 5 MCTF+2DWT RWTH Aachen University QCIF 2 ohm@ient.rwth-aachen.de CIF 1 CIF 2 Institute of Communications Engineering CIF 3 A2 J2 K2 B2 H2 D2 I2 E2 L2 C2 G2 F2 JVT Jens-Rainer Ohm ITU-T VICA Workshop, 22-23 July 2005, Geneva Standardization of SVC Following the Call ! Core experiments were run to further identify and develop SVC technology ! based on Scalable Video Model (SVM) ! Comparison made for wavelet-based and AVC/H.264-extending solutions ! ! 6 Full range of scalabilities tested (3x spatial, 3x temporal, medium-granularity SNR) Formal visual tests (experts & non-experts) indicated superiority of AVC/H.264 solution, particularly at low rates and resolutions RWTH Aachen University ohm@ient.rwth-aachen.de Institute of Communications Engineering JVT Jens-Rainer Ohm ITU-T VICA Workshop, 22-23 July 2005, Geneva Standardization of SVC JSVM and Working Draft ! JSVM: Joint Scalable Video Model (extending from MPEG Scalable Video Model 3.0) ! Working Draft of 14496-10:2005/AMD1 Scalable Video Coding ! Extension of AVC / H.264 with compatible base layer ! Motion-compensated temporal filtering (MCTF) with adaptive prediction and update steps for open-loop compression ! Layered structure with "bottom-up" prediction from lower layers ! One decoder loop ! FGS functionality as extension of CABAC, modified interleaved scan order („cyclic block coding“) ! New SVC-specific NAL unit types ! Bitstream scalability at level of NAL packets 7 RWTH Aachen University ohm@ient.rwth-aachen.de Institute of Communications Engineering JVT Jens-Rainer Ohm ITU-T VICA Workshop, 22-23 July 2005, Geneva Standardization of SVC Motion-compensated Lifting Filters ! Temporal-axis video A B A B A B lifting filters, sequence MC/IMC in lifting flow H = A − MC ( B) ! MC and IMC MC MC MC Prediction must be aligned 1 -1 1 -1 1 -1 highpass ! H pictures similar to sequence H H H MC prediction P pictures ! Adaptive predict & update Update ! Switching intra/forward/ 1/2 1 1/2 1 1/2 1 backward/bi-directional lowpass L L L sequence ! L pictures similar to I pictures when update 1 L = [ MC ( A ) + B ] turned down 2 IMC 8 RWTH Aachen University ohm@ient.rwth-aachen.de Institute of Communications Engineering IMC IMC JVT Jens-Rainer Ohm Standardization of SVC ITU-T VICA Workshop, 22-23 July 2005, Geneva MCTF as „Temporal-axis Wavelet Tree“ A B A B A original video sequence B A B H L H L H L L H 1st decomposition level video sequence 1st temporal level H 2nd temporal level 3rd temporal level 9 LL LH LL LLL LH LL LH 2nd decomposition level LLH ! No closed loop in prediction and update required ! Prediction loop over LLL.. possible RWTH Aachen University ohm@ient.rwth-aachen.de Institute of Communications Engineering LLH LLL 3rd level JVT Jens-Rainer Ohm Standardization of SVC ITU-T VICA Workshop, 22-23 July 2005, Geneva Layered Coding with compatible base layer GOP boundaries 10 MCTF enhancement layer L3 H1 H2 H1 H3 H1 H2 H1 L3 AVC Main Profile compatible base layer A B3 B2 B3 B1 B3 B2 B3 A RWTH Aachen University ohm@ient.rwth-aachen.de Institute of Communications Engineering JVT Jens-Rainer Ohm Standardization of SVC ITU-T VICA Workshop, 22-23 July 2005, Geneva Layered Coding with Spatial Pyramids Coding 4CIF Scaling Prediction Coding CIF Scaling Scalable Stream Prediction QCIF Coding ! Pyramid structure ! Basic building blocks from AVC / H.264 ! Spatial transform coding using integer DCT approximations (4x4 and 8x8) 11 RWTH Aachen University ohm@ient.rwth-aachen.de Institute of Communications Engineering JVT Jens-Rainer Ohm ITU-T VICA Workshop, 22-23 July 2005, Geneva Standardization of SVC Inter Layer Prediction 8x8 8x4 4x8 4x4 Direct, 16x16, 16x8, 8x16 Intra 16x16 16x8 16x16 16x16 Intra-BL Intra-BL 8x16 8x8 16x16 16x16 Intra-BL Intra-BL ! Motion: Upsampled partitioning and motion vectors for prediction ! Residual: Block-wise upsampled residual (bi-linear) ! Intra: Upsampled intra MB (H.264 / AVC 1/2-pel filter) 12 RWTH Aachen University ohm@ient.rwth-aachen.de Institute of Communications Engineering JVT Jens-Rainer Ohm Standardization of SVC ITU-T VICA Workshop, 22-23 July 2005, Geneva Extended Spatial scalability ! Cropping and non-dyadic up-sampling ! Macroblocks not aligned between layers wenh wextract (xorig ,yorig) henh down-sampling hextract hbase wbase Base spatial layer Enhancement spatial layer 13 RWTH Aachen University ohm@ient.rwth-aachen.de Institute of Communications Engineering JVT Jens-Rainer Ohm ITU-T VICA Workshop, 22-23 July 2005, Geneva Standardization of SVC SNR Scalability ! Layered representation of L/H pictures ! Non-scalable ‘base-layer’ (BL) ! Motion information ! Coarsest represenation of intra and residual data ! Minimum acceptable reconstruction quality ! Quality scalable enhancement layers (EL) ! Residue between original and SNR-BL representation ! Doubled quantizer precision per SNR-EL ! FGS: truncation of EL packets at arbitrary points ! Refinement in the transform domain ! Single inverse transform at the decoder side ! “Dead substream” concept allows flexible increase of quality beyond the limitations of layered coding 14 RWTH Aachen University ohm@ient.rwth-aachen.de Institute of Communications Engineering JVT Jens-Rainer Ohm Standardization of SVC ITU-T VICA Workshop, 22-23 July 2005, Geneva SNR&Spatial scalability: 2 spatial layer example Video Bitstream bitstream Texture ResidualCoding 5/3 MCTF 2D Spatial Decimation AVC (with Hierarchical B pictures) Texture N1 1x Quality Layer N 1x Quality Layer N x Quality Layer Quality BaseLayer Motion {Mp1 } MotionCoding Motion Interpolation Texture Multiplex Motion 2D Texture Interpolation N1 1x Quality Layer N 0x Quality Layer N x Quality Layer Motion {Mp 0} bistream 15 RWTH Aachen University bistream ohm@ient.rwth-aachen.de Institute of Communications Engineering JVT Jens-Rainer Ohm Standardization of SVC ITU-T VICA Workshop, 22-23 July 2005, Geneva SNR&Spatial scalability: 2 spatial layer example Video Bitstream bitstream Texture ResidualCoding 5/3 MCTF Texture N1 1x Quality Layer N 1x Quality Layer N x Quality Layer Quality BaseLayer Motion {Mp1 } MotionCoding AVC (with Hierarchical B pictures) = Progresive Motion Refinement Quality Layer Interpolation Texture ResidualCoding CGS OR Quality Layer 2D Texture MotionCoding Interpolation (Opt.) N1 1x Quality Layer N 0x Quality Layer N x Quality Layer Motion {Mp 0} bistream 16 RWTH Aachen University bistream Motion ResidualCoding FGS Quality Layer 2D Spatial Decimation Multiplex ohm@ient.rwth-aachen.de Institute of Communications Engineering JVT Jens-Rainer Ohm ITU-T VICA Workshop, 22-23 July 2005, Geneva Standardization of SVC Realisation of FGS SNR Scalability ! ! ! Three scans for FGS coding of transform coefficients 1. Non-significant transform coefficients of significant blocks 2. Refinement symbols for already significant coefficients 3. Coefficients of non-significant blocks Scanning pattern for each scan ! Scanning from low to high frequency bands (zig-zag) ! Inside each band: first luma, then chroma in raster scan Coding of transform coefficient levels ! Non-significant coefficients ! ! ! Significant coefficients ! 17 Coded Block Pattern, Coded Block Bit, DeltaQP, Transform Size CABAC symbols (SIG, LAST, SIGN, ABS) Refinement symbol (-1, 0, +1) RWTH Aachen University ohm@ient.rwth-aachen.de Institute of Communications Engineering JVT Jens-Rainer Ohm Standardization of SVC ITU-T VICA Workshop, 22-23 July 2005, Geneva Bitstream structure Spatial ! Data cube model and layered stream representation 4CIF B1 Layer 3 T0, S2, B0 Layer3 T1, S2, B0 Layer 3 T2, S2, B0 B1 CIF Layer 1 T0, S1, B0 Layer 1 T1, S1, B0 Layer 2 T2, S1, B0 B1 SN QCIF Base-Layer Layer 1 T1, S0, B0 R Layer 2 T2, S0, B0 Temporal 7.5 fps 18 RWTH Aachen University 15 fps ohm@ient.rwth-aachen.de 30 fps Institute of Communications Engineering JVT Jens-Rainer Ohm Standardization of SVC ITU-T VICA Workshop, 22-23 July 2005, Geneva … scalability level A SVC NAL unit length extension header with level ID length SVC NAL unit SVC sample length SVC NAL unit Length Coded Video NAL U length extension header with level ID length SEI NAL U AVC NAL unit Length length Param NAL U AVC NAL unit length Length ! New SVC NAL unit types can be ignored by conventional AVC parser scalability level B sample 19 RWTH Aachen University ohm@ient.rwth-aachen.de Institute of Communications Engineering JVT Jens-Rainer Ohm Standardization of SVC ITU-T VICA Workshop, 22-23 July 2005, Geneva Performance Example CREW 176x144 15 37 36.5 36 35.5 ] B d[ R N S P 35 34.5 34 JSVM2 CombinedScalability JSVM2 SingleLayer JSVM2 SnrScalability JSVM2 SpatialScalability 33.5 33 80 20 RWTH Aachen University 100 120 ohm@ient.rwth-aachen.de 140 Rate [kbit/s] 160 180 Institute of Communications Engineering 200 JVT Jens-Rainer Ohm Standardization of SVC ITU-T VICA Workshop, 22-23 July 2005, Geneva Performance Example CREW 352x288 30 37.5 37 36.5 36 ] B d[ R N S P 35.5 35 34.5 JSVM2 CombinedScalability JSVM2 SingleLayer JSVM2 SnrScalability JSVM2 SpatialScalability 34 33.5 350 21 RWTH Aachen University 400 450 500 ohm@ient.rwth-aachen.de 550 600 Rate [kbit/s] 650 700 750 Institute of Communications Engineering 800 JVT Jens-Rainer Ohm Standardization of SVC ITU-T VICA Workshop, 22-23 July 2005, Geneva Performance Example CREW 704x576 60 37.5 37 36.5 ] B d[ R N S P 36 35.5 JSVM2 CombinedScalability JSVM2 SingleLayer JSVM2 SnrScalability JSVM2 SpatialScalability 35 34.5 1400 22 RWTH Aachen University 1600 1800 2000 2200 2400 Rate [kbit/s] ohm@ient.rwth-aachen.de 2600 2800 3000 Institute of Communications Engineering 3200 JVT Jens-Rainer Ohm ITU-T VICA Workshop, 22-23 July 2005, Geneva Standardization of SVC Summary ! SVC to become an extension of H.264 / AVC ! ! ! ! Residual coding using existing tools MCTF open loop structure with update step Pyramid structure with inter layer prediction FGS quality layers plus a minimum quality base layer ! Good compression performance ! tradeoffs by the amount of flexibility in scalability ! Further improvements expected e.g. by ! ! ! ! ! 23 adaptive MCTF structures combinations closed/open loop improvement of quantization & FGS schemes cross-layer dependencies of texture and motion joint RD optimization over multiple layers RWTH Aachen University ohm@ient.rwth-aachen.de Institute of Communications Engineering JVT