MPEG Video (Part 2)
Ketan Mayer-Patel
CS 294-9 :: Fall 2003

Last Time
• Overall MPEG bitstream organization.
• I-Frames
• Examples of many encoding techniques:
  – Subsampling (chrominance planes)
  – Transform Coding (DCT, zig-zag)
  – Run-length Encoding (AC coeffs)
  – Predictive Encoding (DC coeffs)
  – Entropy Encoding (Huffman encoding)
  – Quantization (All coefficients)

This Time
• P and B frames
  – Motion compensation.
    • Search techniques
    • The problem with error measurements
  – Skipped macroblocks
• Quantization control
  – Variable bitrate vs. Constant bitrate
• DCT Artifacts
  – Spider noise
  – Blockiness

P-Frames
• Two types of macroblocks in P-Frames:
  – I-Macroblocks.
    • Just like macroblocks in an I-Frame
    • DC term is differentially encoded from DC predictor
      – DC predictor is simply last coded DC term.
      – Predictor reset at slice boundaries.
      – Encoded as DC size followed by that many bits.
    • AC terms
      – RLE’d as (run,value) pairs. Huffman encoded.
  – P-Macroblocks

P-Macroblocks
[Figure: P-macroblock layout. Fields in bitstream order:]
• Macroblock Address Increment (variable)
• Macroblock Type (1-6 bits)
• Q Scale (5 bits)
• Motion Vector (variable)
• Block Pattern (3-9 bits)
• Luminance Blocks
• U Block
• V Block
Macroblock Type determines if Q Scale, Motion Vector, or Block Pattern exist.
One or all of the blocks may be absent in a P-Macroblock.

Address Increment
• Each macroblock has an address.
  – MB_WIDTH = width of luminance / 16
  – MB_ROW = row # of upper left pixel / 16
  – MB_COL = col. # of upper left pixel / 16
  – MB_ADDR = MB_ROW * MB_WIDTH + MB_COL
• Decoder maintains PREV_MBADDR.
  – Set to -1 at beginning of picture.
  – Set to (SLICE_ROW*MB_WIDTH-1) at slice header.
• MB address increment added to PREV_MBADDR provides current macroblock address.
  – PREV_MBADDR set to current macroblock address.

Address Increment Coding
• Address increment coded using Huffman code.
  – 33 codes for values (1-33).
    • 1 is smallest (1-bit)
    • 33 is largest (11-bits)
  – 1 code for ESCAPE
    • ESCAPE means add 33 to address increment code that follows.
    • ESCAPEs can be chained, allowing any positive value to be encoded as an address increment.
• This occurs for I-Frames as well.

MB Type
• Huffman coded.
  – 7 possible codes (1-6 bits)
• Determines the following:
  – Intra or non-intra.
  – Q scale specified or not.
  – Motion vector exists or not.
  – Block pattern exists or not.
• Not all combinations are possible.
• Not all possible combinations are feasible.

Quantization Scale
• 5 bits.
• Zero is illegal.
• Encoded as 1-31 which results in q-scale values of (2-62).
  – Odd values impossible to encode.
• Decoder maintains current q-scale.
  – If not specified, current q-scale used.
  – If specified, current q-scale replaced.

Motion Vector
• Two components:
  – Horizontal and vertical offsets.
  – Offset is from upper left pixel of macroblock.
  – Positive values indicate right and down.
  – Negative values indicate left and up.
  – Offsets are specified in half pixels.
• Motion vector is used to define a predictive base for the current macroblock from the reference picture.

Motion Vector Illustrated
[Figure: a macroblock in the P-frame and its prediction base in the previously decoded I- or P-frame]
• Prediction base does not have to be macroblock aligned.
• If predictive base is half-pixel aligned, bilinear interpolation is used.
• Whatever luminance pixels are picked out, the corresponding chrominance pixels are used to form the chrominance prediction.

Motion Vector Encoding
• If no motion vector is present, then motion vector is understood to be (0,0).
• Horiz. component followed by vertical.
• Decoder maintains motion vector predictor.
  – Set to (0,0) at beginning of picture or slice or whenever an I-macroblock is encountered.
  – Difference between predictor and value is Huffman encoded.
    • Actually a bit more complicated than this.
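
The simplified predictor model on the Motion Vector Encoding slide can be sketched in Python as follows. This is only an illustration of the slide's description, not the full decoding procedure: the function name and the `kind` / `mv_diff` fields are made up for this example, the differences are assumed to be already Huffman-decoded, and the extra complications the slide alludes to are left out. Leaving the predictor at (0,0) after a P-macroblock with no coded vector is also an assumption.

```python
def decode_motion_vectors(macroblocks):
    """Recover motion vectors (in half-pixel units) for one slice.

    Each entry is a dict with a hypothetical layout:
      kind    -- "I" for an intra macroblock, "P" for a predicted one
      mv_diff -- (dx, dy) difference from the predictor, or None when
                 no motion vector is coded for the macroblock
    """
    pred = (0, 0)  # predictor starts at (0,0) for each picture or slice
    vectors = []
    for mb in macroblocks:
        if mb["kind"] == "I":
            pred = (0, 0)           # an I-macroblock resets the predictor
            vectors.append(None)    # intra macroblocks carry no vector
            continue
        diff = mb.get("mv_diff")
        if diff is None:
            mv = (0, 0)             # absent vector is understood as (0,0)
        else:
            # Horizontal component first, then vertical.
            mv = (pred[0] + diff[0], pred[1] + diff[1])
        pred = mv                   # assumption: predictor tracks last vector
        vectors.append(mv)
    return vectors


slice_mbs = [
    {"kind": "P", "mv_diff": (4, -2)},
    {"kind": "P", "mv_diff": (1, 1)},
    {"kind": "I"},
    {"kind": "P", "mv_diff": (3, 0)},
]
print(decode_motion_vectors(slice_mbs))
```

Because neighboring macroblocks tend to move together, the coded differences stay small even when the vectors themselves are large, which is what makes the Huffman code effective.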

Predictive Base
• P-Macroblocks always specify a predictive base:
  – Either motion vector picks out an area, or
  – Absence of a motion vector implies (0,0) (i.e., predictive base is same macroblock in reference frame.)

Block Pattern
• The goal of motion compensation is to find the predictive base that matches most closely with the macroblock.
  – If match is really good, then no appreciable difference will need to be encoded at all.
• Block pattern indicates which blocks have enough error to warrant coding.
• Absence of block pattern indicates no blocks needed coding.

Block
• Difference between pixels in prediction and macroblock is encoded as block:
  – 9-bit input values
  – Still produces 12-bit coefficients
  – Sometimes called error blocks.

Error Block Encoding
• Different quantization matrix is used.
  – Default is “16” in all coefficient positions.
  – Error blocks have lots of high frequency info.
  – No good perceptual correlation between frequencies of error coding and artifacts.
• DC no longer specially treated.
  – No differential encoding from predictor.
• All terms are zig-zag RLE’d and then (run,value) pairs Huffman encoded.

P-Frame Review
• Macroblocks are either I-macroblocks or P-macroblocks.
• I-macroblocks just like macroblocks in I-frame.
• P-macroblocks define predictive base and encode the difference.

Skipped Macroblocks
• If a P-macroblock has a (0,0) motion vector and no appreciable difference to encode, then it can be skipped altogether.
• A skipped macroblock is detected when the address increment of the next coded macroblock is greater than 1.
• First macroblock and last macroblock of a slice must not be skipped.
• Last slice must include lower right macroblock.

Decoder State Updates
• DC predictors are reset whenever a P-macroblock or skipped macroblock is encountered.
• Motion vector predictors are reset whenever an I-macroblock is encountered.
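
To tie together the address increment, skipped-macroblock, and state-update rules for P-frames, here is a minimal Python sketch. The function name and input layout are hypothetical, and it assumes each address increment has already been Huffman-decoded into a count of ESCAPE codes plus a final code value in 1-33. It only records which predictor reset each macroblock triggers; it does not decode any pixel data.

```python
ESCAPE_STEP = 33  # each ESCAPE adds 33 to the increment code that follows


def walk_slice(slice_row, mb_width, coded):
    """List every macroblock in a P-frame slice, including skipped ones.

    coded holds one (escapes, value, kind) tuple per coded macroblock:
      escapes -- number of ESCAPE codes preceding the increment code
      value   -- the increment code itself, 1..33
      kind    -- "I" or "P"
    Returns (address, kind, event) tuples; kind is "skipped" for the
    macroblocks that an increment jumped over.
    """
    prev_addr = slice_row * mb_width - 1    # PREV_MBADDR at the slice header
    out = []
    for escapes, value, kind in coded:
        addr = prev_addr + escapes * ESCAPE_STEP + value
        # Every address jumped over by the increment is a skipped macroblock.
        for skipped_addr in range(prev_addr + 1, addr):
            out.append((skipped_addr, "skipped", "reset DC predictors"))
        if kind == "I":
            out.append((addr, "I", "reset motion vector predictor"))
        else:
            out.append((addr, "P", "reset DC predictors"))
        prev_addr = addr                    # PREV_MBADDR follows the decoder
    return out


# A slice starting on macroblock row 1 of a picture 4 macroblocks wide.
for entry in walk_slice(1, 4, [(0, 1, "I"), (0, 3, "P"), (0, 1, "P")]):
    print(entry)
```

In the usage example the second increment is 3, so addresses 5 and 6 are reported as skipped before the coded P-macroblock at address 7.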

B-Frames
• B-frames have 4 macroblock types:
  – I-macroblocks
  – P-macroblocks
    • Predictive base specified from previous reference frame.
  – B-macroblocks
    • Predictive base specified from subsequent reference frame.
  – Bi-macroblocks.
    • Predictive base specified from both reference frames.

Skipped Macroblocks
• Handled slightly differently than in P-frames.
• Skipped macroblock implies:
  – Same macroblock type as last encoded macroblock (i.e., P-, B-, or Bi-).
  – Motion vectors same as previous encoded macroblock.
    • Compare to (0,0) assumption in P-frame.
    • Also means that predictors are not reset.
  – Can’t skip macroblock following an I-macroblock.
• Other state changes as per P-frames.

Motion Compensation
• Provides most of MPEG’s compression.
• Relies on temporal coherence.
• Finding a good motion vector is essentially a search problem.
• Evaluating “goodness” of a motion vector can be a bit tricky.
• MC is what makes MPEG asymmetric.
  – Harder to encode than to decode.

Exhaustive Search
• The most obvious and easiest solution.
• Encoding time related to size of search window.
• Although time consuming, also embarrassingly parallel.

Logarithmic Search
• Evaluate the search window with an even sampling of motion vectors.
• Take the best and reevaluate in the region of that motion vector with denser sampling.

Predictive Search
• Motion vectors are differentially encoded for a reason.
  – Tend to be correlated from one macroblock to the next.
• Use previous macroblock’s motion vector as centering point for search.
• Or, use motion vector from same block in previous frame as center of search.
• My research is looking at using depth and other spatial info to guide encoding.

Error Measurements
• Regardless of search algorithm, need to determine which motion vector is best.
• Simple measures:
  – Mean Squared Error
  – Mean Absolute Error
  – Minimum Difference Variance
• Fundamental problem is that no simple metric correlates well with perceptual quality.

VBR vs. CBR
• Two ways to handle bitrate:
  – Variable Bit Rate (VBR)
    • Allows compressed bitrate to vary
  – Constant Bit Rate (CBR)
    • Bitrate constant over some averaging window.
• MPEG buffer model.
  – Optional (don’t have to use it).
  – Sequence header carries parameters for a buffer model that can describe bitrate behavior.

VBR Q-scale adjustments
• In general, VBR is used to maintain quality.
• Q scale is adjusted to provide maximum compression given a quality limit.
• Need some metric for quality.
  – Same issues with judging perceptual quality crop up here.
• Common solution: q scale statically set for I-, P-, and B-frames.
  – A variation on this is differentiating among macroblock types.

CBR Q-scale adjustments
• To achieve CBR, q-scale is used to control bitrate.
  – Higher q-scale provides better compression at the expense of quality.
  – Lower q-scale provides better quality at the expense of compression.
• Algorithms for controlling how q-scale is adjusted can get pretty complicated.
• Common solution is to have target I, P, and B frame sizes and then adjust q-scale as macroblocks are encoded to hit the target.
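
One hypothetical way to realize the "target frame size" approach is a simple per-macroblock feedback loop. The Python sketch below is not taken from any particular encoder: `encode_mb` stands in for the real macroblock encoder and is assumed to return the number of bits it produced at a given q-scale, the starting q-scale of 8 is arbitrary, and stepping the q-scale code by one is just the simplest possible adjustment rule.

```python
def encode_frame_cbr(num_mbs, target_bits, encode_mb, q_scale=8):
    """Encode one frame, nudging the q-scale code to track a bit budget.

    encode_mb(index, q_scale) -> bits used (hypothetical callback).
    Returns (total_bits, list of the q-scale code used per macroblock).
    """
    used = 0
    history = []
    for i in range(num_mbs):
        history.append(q_scale)
        used += encode_mb(i, q_scale)
        # Share of the frame budget that should be spent after i+1 macroblocks.
        budget = target_bits * (i + 1) / num_mbs
        if used > budget:
            q_scale = min(31, q_scale + 1)   # over budget: quantize coarser
        elif used < budget:
            q_scale = max(1, q_scale - 1)    # under budget: quantize finer
    return used, history


def toy_encoder(index, q_scale):
    """Toy stand-in: bits shrink as the q-scale grows."""
    return 800 // q_scale


total, q_history = encode_frame_cbr(4, 300, toy_encoder)
print(total, q_history)
```

With the toy encoder the frame starts over budget, so the q-scale code climbs by one at each macroblock. A real controller would also have to respect the buffer model and the different target sizes for I-, P-, and B-frames.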