MPEG-4 - Hanyang University

advertisement
Acknowledgement: Many of the materials
are adapted from Prof. Jechang Jeong’s
excellent presentation on international
coding standards.
Computer Vision
– Coding Standards
Hanyang University
Jong-Il Park
Topics to be covered
 International coding standards
 Background and brief history
 Key techniques in
 JPEG
 MPEG-1,2,4
* Only image/video coding techniques will be covered
Department of Computer Science and Engineering, Hanyang University
Multimedia Everywhere
 Towards Multimedia :
Consumer
Electronics
Computer
Multimedia
TeleCommunication
Broadcasting
Department of Computer Science and Engineering, Hanyang University
Still Picture Compression Standards
 1980 : ITU-T T.4 : G3 FAX for PSTN




Modified Huffman and Modified READ
1984 : ITU-T T.6 : G4 FAX for ISDN
Modified MR
1992 : JPEG (ISO 10918, ITU-T T.81) : Color Still Pictures
used for Color Fax, Electronic Still Camera, Color
Printer, Computer Applications etc
Lossless/Lossy Modes, Baseline/Extended Modes,
Progressive/Sequential Modes
DPCM + DCT + Q + RLE + Huffman/Arithmetic Codes
Motion JPEG can be used for Moving Pictures.
1993 : JBIG (ISO 11544, ITU-T T.82) : Bi-level Pictures
Improvement on T.4 and T.6
Recently: JPEG-LS, JBIG2, etc
Department of Computer Science and Engineering, Hanyang University
Moving Picture Compression Standards
 1982 : ITU-R BT.601 : Studio Quality PCM Component Video
Common to 525/60 and 625/50 Systems
13.5 MHz Sampling, 8 bit/sample, 4:2:2 Format
 1990 : ITU-T H.261 : Video Phone/Conference Application via ISDN
Bitrate = p x 64 kbps, p = 1-30
MC DPCM + DCT + Q + RLE + Huffman Codes
Reference Model 1 - 8
 1992 : MPEG-1 Video : DSM Applications (e.g. Video CD)
Bitrate = 1.5 Mbps
MC DPCM + DCT + Q + RLE + Huffman Codes
GOP Structure for Random Access and Error Recovery
(I, P, B Frames)
Simulation Model 1 - 3
Department of Computer Science and Engineering, Hanyang University
Moving Picture Compression Standards(Cont.)
 1994 : MPEG-2 Video (ISO 13818-2, ITU-T H.262) :
Generic Algorithm for Various Applications
(Broadcasting, Communication, Network, DSM etc)
5 Profiles of Functionality
(Simple, Main, Spatial Scalable, SNR Scalable, High)
4 Levels of Resolution (Low, Main, High-1440, High)
Deals with Interlaced Scan as well as Progressive Scan
Field/Frame ME & DCT, Dual Prime ME, Intra VLC,
Alternate Scan, Nonuniform Q, etc
 1993 : ITU-R CMTT.721 : 140 Mbps Contribution Quality Video
Adaptive DPCM, Componentwise
 1993 : ITU-R CMTT.723 : 34-45 Mbps Contribution Quality Video
MC DPCM + DCT + Q + RLE + Huffman Codes
Department of Computer Science and Engineering, Hanyang University
Moving Picture Compression Standards(Cont.)
 1995 : ITU-T H.263 : Videophone via PSTN
Bitrate < 64 kbps
(V.34 modem = 33.6 kbps, Recent modem = 56 kbps)
Improved version of H.261
 1998 : MPEG-4
Bitrates < 2 Mbps
Targets: Multimedia data base access
Wireless multimedia communication
Components of H.263 are incorporated
Content-based compression
Synthetic and natural video/audio
Multiple tools/algorithms/profiles => Flexibility
 1999 : MPEG-4 Version 2, MPEG-7
Department of Computer Science and Engineering, Hanyang University
Continuous-tone still image
 JPEG(Joint Photographic Experts Group)
Applications : color FAX, digital still camera, multimedia computer, internet
JPEG Standard consists of
- a lossy baseline coding system
- an extended coding system for greater compression, higher precision
or progressive reconstruction applications
- a lossless independent coding system for reversible compression
References
- ITU-T recommendation T.81, “Information Technology - Digital
compression and Coding of Continuous-Tone Still Images Requirements and Guideline”, 92. 2
- K. R. Rao, J. J. Hwang, “Techniques & Standards for Image,
Video & Audio Coding”, Prentice Hall PTR, 1996
Department of Computer Science and Engineering, Hanyang University
Baseline system
 Baseline system : most widely used among JPEG standards
Data precision
- 8 bits for input and output
- 11 bits for quantized DCT coefficients
Algorithm
- DCT + quantization + variable length coding
Compression Guideline
- 0.25 ~ 0.5 bits/pixel : moderate to good quality, some applications
- 0.5 ~ 0.75 bits/pixel : good to very good quality, many applications
- 0.75 ~ 1.5 bits/pixel : excellent quality, most applications
- 1.5 ~ 2.0 bits/pixel : indistinguishable (visually lossless) quality,
most demanding applications
Department of Computer Science and Engineering, Hanyang University
Block diagram of baseline system
 Baseline system encoder
 Baseline system decoder
Department of Computer Science and Engineering, Hanyang University
Quantization and inverse quant.
 Quantization table
- No default values for quantization tables
- Application may specify the tables
- Q(u, v) : quantization table
integer value from 1 to 255
 F u , v  

: F Q u , v   round
 Qu , v  
Dequantization : Ru , v   F Q u , v  Qu , v 
Quantization
Department of Computer Science and Engineering, Hanyang University
Example
f (x,y)
FQ (u,v)
F (u,v)
FDCT
r (x,y)
Quant.
e (x,y)
Inverse Q
& IDCT
Department of Computer Science and Engineering, Hanyang University
Entropy coding
 DC Coefficient Coding
Differential Coding
DC coefficients of adjacent blocks are strongly correlated.
VLC(Huffman Coding)
Department of Computer Science and Engineering, Hanyang University
Entropy coding(Cont.)
 AC coefficients Coding
- Zigzag Scanning
- VLC(Variable Length Coding, Huffman Coding)
Department of Computer Science and Engineering, Hanyang University
Eg. JPEG Compression
Original
image
(24bpp)
JPEG
Compressed
image
(8:1 -- 3bpp)
JPEG
Compressed
image
( 32:1 -0.75bpp )
JPEG
Compressed
image
( 128:1 -0.1875bpp )
Department of Computer Science and Engineering, Hanyang University
MPEG Digital Video Technology
 MPEG-1( ISO/IEC 11172 ) and MPEG-2( ISO/IEC 13818 )
Applications :
MPEG-1 : Digital Storage Media(CD-ROM…)
MPEG-2 : Higher bit rates and broader generic applications
( Consumer electronics, Telecommunications,
Digital Broadcasting, HDTV, DVD, VOD, etc. )
Coding scheme :
Spatial redundancy : DCT + Quantization
Temporal redundancy : Motion estimation and compensation
Statistical redundancy : VLC
References :
- ISO/IEC 11172-2 (MPEG-1), ISO/IEC 13818-2 (MPEG-2)
- K.R.RAO and J.J. HWANG, “TECHNIQUES & STANDARDS
FOR IMAGE•VIDEO & AUDIO CODING,” Prentice Hall, 1996.
Department of Computer Science and Engineering, Hanyang University
MPEG Overview
 MPEG :
- Motion Picture Experts Group
- Specifies a standard compression, transmission, and decompression scheme
for video and audio.
- ISO/IEC 11172 : MPEG-1
- ISO/IEC 13818 : MPEG-2
- Consists of 3 parts.
Part 1 : System
Part 2 : Video
Part 3 : Audio
Department of Computer Science and Engineering, Hanyang University
MPEG compression of video
 How to remove spectral, spatial, temporal, and statistical redundancy?
Department of Computer Science and Engineering, Hanyang University
Intra-frame compression
Rate Control
Quantization step size
Video
DCT
No information loss
No data reduction
Entropy
Coding
Q
Information loss
Data reduction
MUX
Buffer
RLE
VLC
Data reduction Data reducetion
Run Length Coding
Generates (Run, Level)
symbols
Variable Length Coding
Use short words for
most frequent symbols
(like Morse code)
111
110
101
100
011
010
001
000
8-bit
quantization
Compressed Data
11
10
2-bit
quantization
01
00
In pu t Val u e
In pu t Val u e
Quantizing
Coefficients processing order
to encourage runs of 0s
Reduce the number of bits for each coefficient.
Give preference to certain coefficients.
Reduction can differ for each coefficient
Department of Computer Science and Engineering, Hanyang University
Removing spatial redundancy
 Pixel Coding using the DCT
• As human eyes are insensitive to HF color changes, the R,G, B signal is
converted into a luminance and two color difference signals. We can remove
redundancy more on U, V than on Y.
• The top left DCT component is taken as the dc datum for the block.
• DCT coefficients to the right are increasingly higher horizontal spatial freqs.
DCT coefficients below are higher vertical spatial frequencies.
Department of Computer Science and Engineering, Hanyang University
Inter-frame compression
Activity
calculator
Rate
control
Field/Frame
DCT
selector
SOURCE
INPUT
Frame
reordering
Field/Frame
memory
+
+
MQ
Side informations
DCT
VLC
MUX
Q
BUFFER
CODED
BITSTREAM
De Q
Motion
estimator 1
IDCT
+
Adaptive
predictor
Field/Frame
memory
Motion
estimator 2
Side informations
Department of Computer Science and Engineering, Hanyang University
Temporal redundancy
 Inter-frame prediction & motion estimation
• This really reduces the overall bit rate from frame to frame!
Department of Computer Science and Engineering, Hanyang University
Motion estimation
Department of Computer Science and Engineering, Hanyang University
Putting it all together
 I, P, B Frames
• The Intra Frames contain full picture information
• Predicted(P) Frames are predicted from past I, or P frames
• Bi-directional predicted frames offer the greatest compression and use past
and future I & P frames for motion compensation.
Department of Computer Science and Engineering, Hanyang University
Building the elementary stream
• This slide shows how the actual blocks, slices, frames etc. are all put together to
form the elementary stream
• Along with the actual picture data, header information is required to reconstruct the
I, B, P frames. This header structure is shown.
• The next stage is to take this ES and convert it into something that can be
transmitted and decoded at the other end.
Department of Computer Science and Engineering, Hanyang University
Ordering frames
 Frame Reordering
Department of Computer Science and Engineering, Hanyang University
MPEG-4
 MPEG-4( ISO/IEC 14496 )
Applications :
Internet Multimedia
Wireless Multimedia Communication
Multimedia Contents for Computers and Consumer Electronics
Interactive Digital TV
Coding scheme :
Spatial redundancy : DCT + Quantization, Wavelet Transform
Temporal redundancy : Motion estimation and compensation
Statistical redundancy : VLC (Huffman Coding, Arithmetic Coding)
Shape Coding : Context-based Arithmetic Coding
References :
- ISO/IEC 14496
Department of Computer Science and Engineering, Hanyang University
Interactive television
Department of Computer Science and Engineering, Hanyang University
Scene composition
Department of Computer Science and Engineering, Hanyang University
MPEG-4
Department of Computer Science and Engineering, Hanyang University
MPEG-4: Background
Department of Computer Science and Engineering, Hanyang University
MPEG-4: Concept
Department of Computer Science and Engineering, Hanyang University
MPEG-4: Scene composition
Department of Computer Science and Engineering, Hanyang University
MPEG-4 Video: Summary
Department of Computer Science and Engineering, Hanyang University
MPEG-4 Decoder
Department of Computer Science and Engineering, Hanyang University
Sprite in MPEG-4
Department of Computer Science and Engineering, Hanyang University
SNR Scalability
Department of Computer Science and Engineering, Hanyang University
Spatial Scalability
Department of Computer Science and Engineering, Hanyang University
Download