Basics of MPEG
• Picture sizes: up to 4095 x 4095
• Most algorithms are for the CCIR 601 format for video frames
  • Y-Cb-Cr color space
  • NTSC: 525 lines per frame at 60 fields/s (30 frames/s), 720 x 480 pixel luminance frame, 360 x 480 pixel chrominance frame
  • PAL: 625 lines per frame at 50 fields/s (25 frames/s), 720 x 576 pixel luminance frame, 360 x 576 pixel chrominance frame
• SIF (source input format) for digital TV
  • Luminance resolution: 360 x 240 pixels at 30 fps or 360 x 288 pixels at 25 fps
  • Chrominance resolution: half the luminance resolution in both dimensions

Detour: Motion Vectors with Subpixel Accuracy
• Find the motion vector (u,v) with integer pixel accuracy
  • Let the mean absolute error (MAE) at that position be m0
• Compute the MAE at its 4-neighbor pixels (m1 .. m4)
  (Figure: the center pixel 0 and its four neighbors 1 to 4.)
• Horizontal pixels
  • Model the MAE with the function p(i) = a|i - b| + c, where b is the subpixel position of the minimum
  • If 2(m3 - m0) < (m4 - m0), the i coordinate is to the left of the center
  • If (m3 - m0) > 2(m4 - m0), the i coordinate is to the right of the center
  • Otherwise it is along the center line
• Similarly for the vertical direction

Basics of MPEG
• Types of pictures
  • I (intra) frame
    • Compressed using only intraframe coding
    • Moderate compression but faster random access
  • P (predicted) frame
    • Coded with motion compensation using past I frames or P frames
    • Can be used as reference pictures for additional motion compensation
  • B (bidirectional) frame
    • Coded with motion compensation from past or future I or P frames
  • D (DC) frame
    • Limited use: encodes only DC components of intraframe coding

MPEG: Video Encoding
• The MPEG standards
  • do not define an encoding process
  • define the syntax of the coded stream
  • define a decoding process
(Figure: encoder block diagram. Labeled blocks: Input, Frame Memory, Pre-processing, subtractor (+/-), DCT, Quantizer (Q), VLC Encoder, Buffer, Regulator, Output, Q^-1, IDCT, adder (+), Frame Memory, Motion Compensation, Motion Estimation. Labeled signals: motion vectors, predictive frame.)
• Some highlights
  • Interframe predictive coding (P-pictures)
    • For each macroblock the motion estimator produces the best matching
      macroblock
    • The two macroblocks are subtracted and the difference is DCT coded
  • Interframe interpolative coding (B-pictures)
    • The motion vector estimation is performed twice, once against a past and once against a future reference picture
    • The encoder forms a prediction error macroblock from either prediction or from their average
    • The prediction error is encoded using a block-based DCT
  • The encoder needs to reorder pictures, because a B-frame can be coded only after the later reference picture it depends on

MPEG: Structure of the Coded Bit-Stream
• Sequence layer: picture dimensions, pixel aspect ratio, picture rate, minimum buffer size, DCT quantization matrices
• GOP layer: contains at least one I picture, starts with an I or B picture, ends with an I or P picture, has a closed GOP flag, timing info, user data
• Picture layer: temporal reference number, picture type, synchronization info, resolution, range of motion vectors
• Slices: position of slice in picture, quantization scale factor
• Macroblock: position, H and V motion vectors, which blocks are coded and transmitted
(Figure: nesting of the layers. Sequence layer: GOP-1, GOP-2, ..., GOP-n. GOP layer: I B B B P B B ... Picture layer: Slice-1, Slice-2, ..., Slice-N. Slice layer: mb-1, mb-2, ..., mb-n. Macroblock layer: blocks 0 to 5. Block: 8x8.)

MPEG: Macroblock Coding
(Figure: decision tree of macroblock coding options by picture type.)
• I picture: intraframe coded, with or without a change to MQUANT
• P picture
  • Motion compensated, or motion vector set to 0
  • Interframe coded (with or without a change to MQUANT) or not coded
  • Or intraframe coded (with or without a change to MQUANT)
• B picture
  • Forward motion compensation, backward motion compensation, or interpolated compensation
  • Interframe coded (with or without a change to MQUANT) or not coded
  • Or intraframe coded (with or without a change to MQUANT)
• MQUANT = scale factor q applied to the quantization matrix

MPEG-2
• Why another standard?
  • Support higher bit rates, e.g., 80-100 Mbits/s for HDTV instead of the 1.15 Mbits/s for SIF
  • Support a larger number of applications
  • The encoding standard should be a toolkit rather than a flat procedure
    • Interlaced and non-interlaced frames
    • Different color subsampling modes, e.g., 4:2:2, 4:2:0, 4:4:4
    • Flexible quantization schemes that can be changed at picture level
    • Scalable bit-streams
    • Profiles and levels

MPEG-2: Effects of Interlacing
• Field or frame pictures can be encoded
• Prediction Modes and Motion Compensation
  • Frame prediction: current frame predicted from previous frame
  • Field prediction:
    • Top and bottom fields of the reference frame predict the first field
    • Bottom field of the previous frame and top field of the current frame predict the bottom field of the current frame
  • 16 x 8 motion compensation mode
    • A macroblock may have two motion vectors
    • A B picture macroblock may have four!
  • Dual prime motion compensation
    • Top field of the current frame is predicted from two motion vectors coming from the top and bottom fields of the reference frame
    • Works for P pictures

MPEG-2: Profiles and Levels
(Resolutions are width x height / frame rate. Bitrates are in Mbits/s. "-" marks a combination with no entry on the slide.)
Level      Layer        SNR (4:2:0)    Spatial (4:2:0)   High (4:2:0; 4:2:2)   Multiview (4:2:0)
High       Enhancement  -              -                 1920 x 1152/60        1920 x 1152/60
           Lower        -              -                 960 x 576/30          1920 x 1152/60
           Bitrate      -              -                 100, 80, 25           130, 50, 80
High-1440  Enhancement  -              1440 x 1152/60    1440 x 1152/60        1920 x 1152/60
           Lower        -              720 x 576/30      720 x 576/30          1920 x 1152/60
           Bitrate      -              60, 40, 15        80, 60, 20            100, 40, 60
Main       Enhancement  720 x 576/30   -                 720 x 576/30          720 x 576/30
           Lower        -              -                 352 x 288/30          720 x 576/30
           Bitrate      15, 10         -                 20, 15, 4             25, 10, 15
Low        Enhancement  352 x 288/30   -                 -                     352 x 288/30
           Lower        -              -                 -                     352 x 288/30
           Bitrate      4, 3           -                 -                     8, 4, 4

MPEG-2 Applications
• Digital Betacam: 90 Mbits/s video
• MPEG-2
  • Main Profile, Main Level, 4:2:0: 15 Mbits/s
  • High Profile, High Level, 4:2:0: adequate, expensive
  • Image quality preserved across generations of processing
• Multiview Profile
  • Stereoscopic view: disparity prediction
  • Virtual walk-throughs composed from multiple viewpoints
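
Returning to the detour on subpixel motion vectors: the two inequalities there can be sketched in a few lines of Python. This is a minimal illustration, not part of any MPEG specification. The function names `half_pel_offset` and `refine_motion_vector` are hypothetical. The sketch assumes that m3 and m4 are the left and right neighbors, that m1 and m2 are the neighbors above and below, and that v grows downward. Under the model p(i) = a|i - b| + c with |b| <= 1/2, the two tests work out to b < -1/4 and b > 1/4, so the sketch reads the rule as rounding b to the nearest half pixel; that reading is an inference from the model, not something the slides state.

```python
def half_pel_offset(m_minus, m0, m_plus):
    """Half-pixel refinement along one axis.

    m_minus, m0, m_plus are the MAE values at offsets -1, 0, +1.
    Returns -0.5, 0.0 or +0.5 using the two inequalities from the slides.
    """
    d_minus = m_minus - m0   # error increase toward the -1 neighbor
    d_plus = m_plus - m0     # error increase toward the +1 neighbor
    if 2 * d_minus < d_plus:
        return -0.5          # minimum lies toward the -1 neighbor
    if d_minus > 2 * d_plus:
        return 0.5           # minimum lies toward the +1 neighbor
    return 0.0               # minimum stays on the center line


def refine_motion_vector(u, v, m0, m1, m2, m3, m4):
    """Refine an integer motion vector (u, v) to half-pixel accuracy.

    ASSUMPTION: m3/m4 are the left/right neighbors and m1/m2 are the
    neighbors above/below; the slides only fix m3 and m4 as horizontal.
    """
    du = half_pel_offset(m3, m0, m4)   # horizontal direction
    dv = half_pel_offset(m1, m0, m2)   # vertical direction, same rule
    return (u + du, v + dv)


if __name__ == "__main__":
    # MAE rises slowly to the left and steeply to the right,
    # so the horizontal component should move half a pixel left.
    print(refine_motion_vector(3, -2, m0=10, m1=20, m2=20, m3=12, m4=30))
```

The same helper serves both axes because the slides apply the identical test in the vertical direction.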