Berkeley Multimedia Research Center September 1996

Image/Video Compression
September 28, 1999
Lawrence A. Rowe
University of California, Berkeley
URL: http://www.BMRC.Berkeley.EDU/~larry
Copyright ©1999, L.A. Rowe
Outline
• Background
• Block Transform Coding
• Other Coding Algorithms
• Software/Hardware CODECs
• Pragmatic Issues
Multimedia Systems and Applications
Video Data Size
size of uncompressed video in gigabytes
             1 sec    1 min    1 hour    1000 hours
1920x1080    0.19     11.20    671.85    671,846.40
1280x720     0.08      4.98    298.60    298,598.40
640x480      0.03      1.66     99.53     99,532.80
320x240      0.01      0.41     24.88     24,883.20
160x120      0.00      0.10      6.22      6,220.80
image size of video
[Figure: relative frame sizes 1280x720 (1.77), 640x480 (1.33), 320x240, 160x120]
Video Bit Rate Calculation
(width * height * depth * fps) / compression factor = bits/sec
width ~ pixels (160, 320, 640, 720, 1280, 1920, …)
height ~ pixels (120, 240, 480, 485, 720, 1080, …)
depth ~ bits (1, 4, 8, 15, 16, 24, …)
fps ~ frames per second (5, 15, 20, 24, 30, …)
compression factor (1, 6, 24, …)
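The bit rate formula can be sketched in a few lines of Python. The function names `video_bit_rate` and `storage_bytes` are made up for this illustration, and a gigabyte is assumed to be 10^9 bytes, which appears to be the convention behind the Video Data Size figures.

```python
def video_bit_rate(width, height, depth_bits, fps, compression_factor=1):
    """Bits per second: (width * height * depth * fps) / compression factor."""
    return width * height * depth_bits * fps / compression_factor


def storage_bytes(bit_rate, seconds):
    """Bytes needed to store `seconds` of video at `bit_rate` bits/sec."""
    return bit_rate * seconds / 8


if __name__ == "__main__":
    # 640x480, 24 bits/pixel, 30 frames/sec, uncompressed and at 25:1
    for factor in (1, 25):
        rate = video_bit_rate(640, 480, 24, 30, factor)
        gigabytes = storage_bytes(rate, 3600) / 1e9  # assumes 1 GB = 10^9 bytes
        print(f"{factor}:1  {rate / 1e6:.1f} Mbit/s  {gigabytes:.2f} GB/hour")
```

Under these assumptions, one hour of uncompressed 640x480 video comes out at about 99.53 GB.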
Effects of Compression
storage for 1 hour of compressed video in megabytes
             1:1        3:1        6:1        25:1      100:1
1920x1080    671,846    223,949    111,974    26,874    6,718
1280x720     298,598     99,533     49,766    11,944    2,986
640x480       99,533     33,178     16,589     3,981      995
320x240       24,883      8,294      4,147       995      249
160x120        6,221      2,074      1,037       249       62
3 bytes/pixel, 30 frames/sec
Be Careful...
mpeg 200:1, jpeg 24:1
[Diagram: analog source, digital representation, compressed representation; labeled "vs"]
Another View
Data Rate    Size/Hour
128 Kbs      60 MB
384 Kbs      170 MB
1.5 Mbs      680 MB
3.0 Mbs      1.4 GB
6.0 Mbs      2.7 GB
25 Mbs       11.0 GB
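The Size/Hour column follows from the data rate by simple arithmetic. The sketch below assumes 1 Kbs = 1,000 bits/sec and 1 MB = 10^6 bytes (the slide does not state its conventions, and its figures appear to be rounded); `megabytes_per_hour` is a name invented for this example.

```python
def megabytes_per_hour(bits_per_second):
    """Storage for one hour at a constant data rate, in MB (10^6 bytes)."""
    seconds_per_hour = 3600
    return bits_per_second * seconds_per_hour / 8 / 1e6


if __name__ == "__main__":
    rates = [("128 Kbs", 128e3), ("384 Kbs", 384e3), ("1.5 Mbs", 1.5e6),
             ("3.0 Mbs", 3.0e6), ("6.0 Mbs", 6.0e6), ("25 Mbs", 25e6)]
    for label, bps in rates:
        print(f"{label:>8}  {megabytes_per_hour(bps):9.1f} MB/hour")
```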
Perceptual Coding
• Encode source signal using lossy compression
Lossless algorithms typically reduce signal by 3:1
Must use lossy algorithm to get adequate compression
• Hide errors where humans will not see or hear them
Study hearing and vision system to understand how we see/hear
Masking refers to one signal overwhelming/hiding another
(e.g., loud siren or bright flash)
Audio perception spans 20 Hz to 20 kHz, but most sounds fall in the
low frequencies (e.g., 2 kHz to 4 kHz)
Visual perception influenced by edges and low frequencies
What is…
• JPEG - Joint Photographic Experts Group
Still image compression, intraframe picture technology
MJPEG is sequence of images coded with JPEG
• MPEG - Moving Picture Experts Group
Many standards MPEG1, MPEG2, and MPEG4
Very sophisticated technology involving intra- and
interframe picture coding and many other optimizations
=> high quality and cost in time/computation
• H.261/H.263/H.263+ - Video Conferencing
Low to medium bit rate, quality, and computational cost
Used in H.320 and H.323 video conferencing standards
Coding Overview
• Digitize
Subsample to reduce data
• Intraframe compression
Remove redundancy within frame (spatial compression)
• Interframe compression
Remove redundancy between frames (temporal
compression)
• Symbol coding
Efficient coding of sequence of symbols
Digitizing
• Modify color space
24 bit RGB => 15 or 16 bit RGB
24 bit RGB => YUV (8 bit Y, 4 bit U, 4 bit V)
24 bit RGB => 8 bit color map
• Encode only 1 field
• Reduce frame rate
Film is 24 fps, so why encode 30 frames of video?
Is 20 fps good enough? 18? 15? 12? 8? 4? ...
Variable frame rate?
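The data-reduction steps above can be illustrated with a small sketch. All three function names are hypothetical. The YUV conversion uses the common BT.601 weights, which is an assumption, since the slide does not say which conversion is meant, and the slide's 8/4/4 bit allocation for Y, U and V is not modeled here.

```python
def rgb24_to_rgb555(r, g, b):
    """Pack 8-bit R, G, B into 15 bits by dropping the 3 low bits of each."""
    return ((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3)


def rgb_to_yuv(r, g, b):
    """RGB to Y, U, V with BT.601 weights (assumed, not from the slide)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def reduce_frame_rate(frames, keep_every):
    """Keep one frame out of every `keep_every` (e.g., 30 fps -> 15 fps)."""
    return frames[::keep_every]


if __name__ == "__main__":
    print(hex(rgb24_to_rgb555(255, 128, 0)))
    print(rgb_to_yuv(255, 128, 0))
    print(len(reduce_frame_rate(list(range(30)), 2)), "frames kept out of 30")
```

Going from 24-bit to 15-bit RGB alone cuts the data by a little over a third, before any compression is applied.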
Block Transform Encoding
DCT -> Quantize -> Zig-zag -> Run-length Code -> Huffman Code -> 011010001011101...
Block Encoding
original image
139 144 149 153
144 151 153 156
150 155 160 163
159 161 162 160
DCT (top-left value is the DC component, the rest are AC components)
1260   -1  -12   -5
 -23  -17   -6   -3
 -11   -9   -2    2
  -7   -2    0    1
Quantize
  79    0   -1    0
  -2   -1    0    0
  -1   -1    0    0
   0    0    0    0
zigzag
79 0 -2 -1 -1 -1 0 0 -1 0 0 0 0 0 0 0
run-length code (run of zeros, value)
(0,79) (1,-2) (0,-1) (0,-1) (0,-1) (2,-1) (0,0)
Huffman code
10011011100011...
coded bitstream < 10 bits (0.55 bits/pixel)
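The zigzag and run-length steps can be sketched as follows. The 4x4 quantized block is as read from the slide, the (0, 0) end-of-block pair and the function names are assumptions of this example, and the Huffman stage is left out.

```python
def zigzag_order(n):
    """Visit order for an n x n block: anti-diagonals, alternating direction."""
    cells = [(i, j) for i in range(n) for j in range(n)]
    # Odd diagonals are walked with the row index ascending,
    # even diagonals with the column index ascending.
    return sorted(cells, key=lambda p: (p[0] + p[1],
                                        p[0] if (p[0] + p[1]) % 2 else p[1]))


def zigzag(block):
    """Flatten a square block into a 1-D list in zigzag order."""
    return [block[i][j] for i, j in zigzag_order(len(block))]


def run_length(seq):
    """Encode as (zeros skipped, nonzero value) pairs plus an end marker."""
    pairs, run = [], 0
    for value in seq:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append((0, 0))  # end-of-block marker; trailing zeros are implied
    return pairs


quantized = [[79, 0, -1, 0],
             [-2, -1, 0, 0],
             [-1, -1, 0, 0],
             [0, 0, 0, 0]]
scan = zigzag(quantized)
pairs = run_length(scan)

if __name__ == "__main__":
    print(scan)
    print(pairs)
```

The zigzag scan groups the low-frequency coefficients first, so the zeros left by quantization collect at the end and collapse into the end-of-block marker.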
Result of Coding/Decoding
original block
139 144 149 153
144 151 153 156
150 155 160 163
159 161 162 160
reconstructed block
144 146 149 152
148 150 152 154
155 156 157 158
160 161 161 162
errors (original - reconstructed)
 -5  -2   0   1
 -4   1   1   2
 -5  -1   3   5
 -1   0   1  -2
Examples
[Images: the same picture at three settings]
Uncompressed (262 KB)
Compressed (50) (22 KB, 12:1)
Compressed (1) (6 KB, 43:1)
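Discrete Cosine Transform
$$F[u,v] = \frac{4\,C(u)\,C(v)}{n^2} \sum_{j=0}^{n-1} \sum_{k=0}^{n-1} f(j,k) \cos\left[\frac{(2j+1)u\pi}{2n}\right] \cos\left[\frac{(2k+1)v\pi}{2n}\right]$$
where
$$C(w) = \begin{cases} \dfrac{1}{\sqrt{2}} & \text{for } w = 0 \\ 1 & \text{for } w = 1, 2, \ldots, n-1 \end{cases}$$
• Inverse is very similar
• DCT better at reducing redundancy than Discrete Fourier
Transform but computationally expensive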
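A direct, unoptimized sketch of the transform as defined above, taking C(0) = 1/sqrt(2). The inverse shown is derived for this particular normalization and is an assumption of the example, not a formula from the slides. Real codecs use fast factorizations rather than this O(n^4) loop.

```python
import math


def c(w):
    """Normalization term C(w): 1/sqrt(2) for w = 0, otherwise 1."""
    return 1 / math.sqrt(2) if w == 0 else 1.0


def basis(x, w, n):
    """Cosine basis value cos((2x + 1) * w * pi / (2n))."""
    return math.cos((2 * x + 1) * w * math.pi / (2 * n))


def dct2(f):
    """Forward 2-D DCT of an n x n block, straight from the definition."""
    n = len(f)
    return [[4 * c(u) * c(v) / n ** 2
             * sum(f[j][k] * basis(j, u, n) * basis(k, v, n)
                   for j in range(n) for k in range(n))
             for v in range(n)] for u in range(n)]


def idct2(F):
    """Inverse matching the scaling used in dct2 (derived, an assumption)."""
    n = len(F)
    return [[sum(c(u) * c(v) * F[u][v] * basis(j, u, n) * basis(k, v, n)
                 for u in range(n) for v in range(n))
             for k in range(n)] for j in range(n)]


if __name__ == "__main__":
    block = [[139, 144, 149, 153],
             [144, 151, 153, 156],
             [150, 155, 160, 163],
             [159, 161, 162, 160]]
    coeffs = dct2(block)
    restored = idct2(coeffs)
    print([[round(x, 1) for x in row] for row in coeffs])
    print([[round(x) for x in row] for row in restored])
```

The scale of the coefficients depends on the normalization chosen, so the printed values need not match numbers quoted elsewhere.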
DCT vs DFT
[Figure: three plots: original signal, recovered from DCT, recovered from DFT]
Inter-frame Compression
• Pixel difference with previous frame
If previous pixel very similar, skip it
• Send sequence of blocks rather than frames
If previous block similar, skip it or send difference
• Motion compensation
Search around block in previous frame for a better
matching block and encode position and error difference
Search in future frame
Average block in previous and future frame
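Block-based motion compensation can be sketched as an exhaustive search that minimizes the sum of absolute differences (SAD). The SAD cost, the block size, the search radius and the test frames are all choices made for this example. Real encoders use faster search strategies and typically 16x16 macroblocks.

```python
def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between a current and a reference block."""
    return sum(abs(cur[cy + y][cx + x] - ref[ry + y][rx + x])
               for y in range(size) for x in range(size))


def motion_search(cur, ref, bx, by, size=4, radius=2):
    """Full search around (bx, by) in the previous frame.

    Returns (dx, dy, cost): the offset into `ref` of the best matching block
    and its SAD. The encoder would send (dx, dy) plus the error difference.
    """
    height, width = len(ref), len(ref[0])
    best = (0, 0, sad(cur, ref, bx, by, bx, by, size))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rx, ry = bx + dx, by + dy
            if 0 <= rx <= width - size and 0 <= ry <= height - size:
                cost = sad(cur, ref, bx, by, rx, ry, size)
                if cost < best[2]:
                    best = (dx, dy, cost)
    return best


# Synthetic 8x8 frames: the current frame is the previous frame shifted
# by one pixel up and one pixel left.
ref_frame = [[x * x + 10 * y * y for x in range(8)] for y in range(8)]
cur_frame = [[ref_frame[min(y + 1, 7)][min(x + 1, 7)] for x in range(8)]
             for y in range(8)]

if __name__ == "__main__":
    print(motion_search(cur_frame, ref_frame, 2, 2))
```

A cost of zero means the block can be sent as a motion vector alone. A small nonzero cost means a vector plus a cheap error block.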
Background
• JPEG still image (1 bpp)
Symmetric codec
• MPEG
Interleaved audio and video
Low cost decoder at expense of high cost encoder
(asymmetric)
• H.26x
Video conferencing standard (QCIF, CIF, 4CIF, and
16CIF)
Variable bit rates and coding flexibility
MPEG Standards
• MPEG1 - VHS quality (1992)
CIF images, 4:2:0 sampling, 1.5 Mbs
Frame encoding
• MPEG2 - broadcast quality (1994)
CCIR 601 images, 4:2:2 sampling, 15 Mbs
Interlaced and progressive scanning
Frame and field encoding
MPEG Technology
• Picture coding types
• Motion compensation
• Bit rate control
• Picture order
Do MPEG slides HERE
• Frame types
I, P, B, and Bi frames
• Jargon
GOP, macroblock, slice
• Motion vectors
Predicted and interpolated vectors
Compression Formats
• Cinepak (2 bpp)
Inter-frame, RGB15, software playback
• Motion JPEG (1-4 bpp)
Intra-frame, DCT+Quantization, good for editing
• MPEG (0.5-2 bpp)
Inter-frame, DCT+Quantization+Motion Compensation, excellent
for playback
• H.261/H.263
Inter-frame, DCT+Quantization+Motion Compensation, block
stream, excellent for conferencing
Quantitative Measures of Quality
• Mean Squared Error
$$\mathrm{MSE} = \frac{1}{n^2} \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \left[ f(i,j) - f'(i,j) \right]^2$$
where f(i,j) is the original image and f'(i,j) is the image
after compression and decompression
• Perceptual Measures
Perceptual Distortion Measure (Heeger)
Picture Quality Scale (Miyahara, et al.)
Spatial/Temporal Measure (Webster, et al.)
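The mean squared error is simple to compute. The sketch below applies it to the 4x4 original and reconstructed blocks from the Result of Coding/Decoding slide, taking the reconstructed values implied by that slide's error table. The function name `mse` is made up for the example.

```python
def mse(original, decoded):
    """Mean squared error between two n x n images."""
    n = len(original)
    return sum((original[i][j] - decoded[i][j]) ** 2
               for i in range(n) for j in range(n)) / n ** 2


original = [[139, 144, 149, 153],
            [144, 151, 153, 156],
            [150, 155, 160, 163],
            [159, 161, 162, 160]]
reconstructed = [[144, 146, 149, 152],
                 [148, 150, 152, 154],
                 [155, 156, 157, 158],
                 [160, 161, 161, 162]]

if __name__ == "__main__":
    print(mse(original, reconstructed))  # 118 / 16 = 7.375
```

MSE weights every pixel error equally, which is why the perceptual measures listed above exist as alternatives.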
Discussion
• Where are the lossy steps?
Quantization and subsampling before coding
• How do you choose quantization matrix?
Standard prescribes matrix based on psychophysics
Vary quantization by scaling Q matrix (i.e. MQUANT)
Custom designed Q matrix
• Can improve compression by using motion
compensation (MC)
• Comparison of standards…
JPEG - still image uses fixed Q matrix
H.261 - video conferencing uses MC and variable quantization
MPEG - video playback uses MC and variable quantization
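Quantization with a scaled Q matrix can be sketched as below. The Q values are the top-left 4x4 corner of the example luminance table from the JPEG standard, borrowed here as an assumption because the slides do not say which matrix they use. The `scale` parameter stands in for an MQUANT-style multiplier, and the DCT coefficients are as read from the Block Encoding slide.

```python
# Top-left 4x4 of the JPEG example luminance table (assumed, see lead-in).
Q_4X4 = [[16, 11, 10, 16],
         [12, 12, 14, 19],
         [14, 13, 16, 24],
         [14, 17, 22, 29]]

# DCT coefficients as read from the Block Encoding slide.
DCT_COEFFS = [[1260, -1, -12, -5],
              [-23, -17, -6, -3],
              [-11, -9, -2, 2],
              [-7, -2, 0, 1]]


def quantize(coeffs, q, scale=1):
    """Divide each coefficient by its scaled step size and round.

    Python's round() rounds halves to the nearest even integer; actual
    codecs define their own rounding rule.
    """
    return [[round(cv / (qv * scale)) for cv, qv in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, q)]


def dequantize(levels, q, scale=1):
    """Decoder side: multiply the levels back by the scaled step sizes."""
    return [[lv * qv * scale for lv, qv in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, q)]


if __name__ == "__main__":
    for scale in (1, 2):
        levels = quantize(DCT_COEFFS, Q_4X4, scale)
        print("scale", scale, levels)
        print("  decoded", dequantize(levels, Q_4X4, scale))
```

With scale 1 this happens to give the same quantized block as the Block Encoding slide. A larger scale zeroes more coefficients, trading quality for bit rate.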
Discussion
• How to do very low bit rate (28.8 Kbs)?
Small image, low quality/bit rate, and better coding
MPEG4 will address this problem
• Gazillions of other proposals…
Fine, but who needs them
Scalable proposals will be useful
• Must be prepared to deliver several formats
H.261 Coding Example
[Figure: diagrams of numbered blocks (labels 1-7); layout not recoverable]
Other Techniques
• Vector Quantization
Use small codebook of values
Slow encode, fast decode
• Fractal Coding
Fit curves to signal
Slow encode, fast decode?
• Wavelet Coding
Better transform than DCT – incorporates both spatial and frequency
in transform
Used in most research work and scalable codecs
Scalable Algorithms
• Some applications want to control bandwidth and
performance
Send low quality, low bit rate image on high priority
channel and quality improvements on low priority
channel(s)
May be dictated by network bandwidth (wireless), end
station (PDA), or application need (video gallery)
• Scalability parameters
Image size
Frame rate
Reconstructed image quality
Pyramid Codes
[Figure: pyramid of coded images, labeled CodedImage-0 (Low Quality) and CodedImage-1]
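A two-level pyramid can be sketched as a half-size base layer plus a residual that restores full quality. Reading CodedImage-0 as the base and CodedImage-1 as the refinement is an interpretation of the figure labels, and the 2x2 averaging, the pixel-replication upsampling and the function names are assumptions of this example. Image dimensions are assumed to be even.

```python
def downsample2(img):
    """Average each 2x2 block to produce a half-size base image."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) // 4
             for x in range(0, w, 2)] for y in range(0, h, 2)]


def upsample2(img):
    """Double the size by pixel replication."""
    return [[img[y // 2][x // 2] for x in range(2 * len(img[0]))]
            for y in range(2 * len(img))]


def encode_pyramid(img):
    """Return (base layer, residual layer)."""
    base = downsample2(img)
    predicted = upsample2(base)
    residual = [[p - q for p, q in zip(row, prow)]
                for row, prow in zip(img, predicted)]
    return base, residual


def decode_pyramid(base, residual=None):
    """Decode the base alone (low quality) or base plus residual (exact)."""
    predicted = upsample2(base)
    if residual is None:
        return predicted
    return [[p + r for p, r in zip(prow, rrow)]
            for prow, rrow in zip(predicted, residual)]


if __name__ == "__main__":
    image = [[139, 144, 149, 153],
             [144, 151, 153, 156],
             [150, 155, 160, 163],
             [159, 161, 162, 160]]
    base, residual = encode_pyramid(image)
    print("base:", base)
    print("low quality:", decode_pyramid(base))
    print("exact:", decode_pyramid(base, residual) == image)
```

A receiver on a slow link can stop after the base layer, while one with spare bandwidth adds the residual for the full-quality image.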
MPEG4 Standards Work
• Original goal was very low-bit rate coding
Can do remarkably well at rates as low as 10 bits/sec but
coding/decoding time is large
Hierarchical VQ is popular because of low cost to decode,
but trade-off is less coding efficiency
• Recent activity has moved towards
programmable decoders
Idea is to allow code to be downloaded to the decoding
chip
• Difficult trade-off between power
consumption, quality and coding algorithm
Understanding CODEC Performance
CODEC    Compress           Decompress
JPEG     314 inst/pixel     283 inst/pixel
MPEG1    1120 inst/pixel    60-100 inst/pixel
• Problem is motion vector search in MPEG
• Low bit-rate coding (1.5 Mbs) requires
excellent coding
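The instructions-per-pixel figures translate into a processor budget once a frame size and rate are fixed. The sketch below assumes 320x240 at 30 frames/sec, which is a choice for illustration rather than a figure tied to the table. `required_mips` is a hypothetical helper name.

```python
def required_mips(inst_per_pixel, width, height, fps):
    """Millions of instructions per second needed for real-time operation."""
    return inst_per_pixel * width * height * fps / 1e6


if __name__ == "__main__":
    cases = [("JPEG compress", 314), ("JPEG decompress", 283),
             ("MPEG1 compress", 1120), ("MPEG1 decompress (upper)", 100)]
    for label, inst in cases:
        print(f"{label:26s} {required_mips(inst, 320, 240, 30):8.1f} MIPS")
```

Under these assumptions MPEG1 compression needs roughly 2,580 MIPS, about ten times the cost of decoding.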
MPEG Coding Performance
• Decoding is easy
MPEG1 decoding in software on most platforms
Hardware decoders are widely available ($150/board)
Windows graphics accelerators with MPEG decoding now
entering market (e.g., Matrox, Diamond, …)
• Encoding is expensive
Sequential software encoders are 20:1 real-time
Real-time encoders use parallel processing
Real-time hardware encoders are expensive (e.g., $12K-$50K for MPEG1 and $100K-$500K for MPEG2)
Hardware-assisted off-line MPEG1 encoders (3:1) used for
multimedia authoring at reasonable cost ($5K)
Put decoding/encoding
performance slides here
Other Boards
• MJPEG
DEC J300 (Turbochannel, CL560) costs $3K
Parallax (Sun, HP, IBM R6K, CL560) costs $4K-$7.5K
SGI Cosmo (CL560) cost (?)
Radius VVS (Mac, CL560) costs $3K (?)
Radius Truevision (PC, (?)) costs (?)
PC boards (ASL, RealMagic MPEG Editor, IIT, …)
• Proprietary Algorithms
Intel Smart Video Recorder Pro ($3K, i860)
• Programmable Boards
Sun VideoPix (CL4K) supports MJPEG, MPEG, CELLB costs $1.5K
Chromatics MPACT chip board (mid ‘96)