Image and Video Compression - Indian Institute of Technology

Image and Video Compression: An Overview
Jayanta Mukhopadhyay
Department of Computer Science & Engineering
Indian Institute of Technology, Kharagpur, 721302, India
jay@cse.iitkgp.ernet.in
1
Data Compression
• Alternative description of data requiring less storage and bandwidth.
• Example: Uncompressed 1 Mbyte; Compressed (JPEG) 50 Kbyte (20:1)
2
• IMAGE COMPRESSION
The process of obtaining a representation of the image with fewer bits than the original.
• DECOMPRESSION
The process of recovering the image from its compressed form to its (almost) original form.
• LOSSLESS COMPRESSION
Compression without any loss of information: there is no difference between the original image and the reconstructed one.
• LOSSY COMPRESSION
Compression with partial loss of information: the reconstructed image differs from the original one (within an accepted limit).
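The difference between the two kinds of compression can be illustrated with a minimal sketch. It assumes Python's standard zlib module as a stand-in lossless coder and a hypothetical uniform quantizer (step 16) as a stand-in lossy coder; neither is the JPEG scheme itself, and the synthetic pixel row is made up for illustration.

```python
import zlib

def lossless_roundtrip(data: bytes) -> bytes:
    """Compress and decompress with DEFLATE; the output equals the input exactly."""
    return zlib.decompress(zlib.compress(data))

def lossy_roundtrip(pixels, step=16):
    """Hypothetical lossy coder: map each 8-bit pixel to the centre of its quantization bin."""
    return [(p // step) * step + step // 2 for p in pixels]

# Synthetic 8-bit "image row" (an assumption, not real image data).
pixels = [(3 * i) % 256 for i in range(1024)]
raw = bytes(pixels)

restored = lossless_roundtrip(raw)          # identical to raw
approx = lossy_roundtrip(pixels)            # close to pixels, but not identical
max_error = max(abs(a - b) for a, b in zip(pixels, approx))
ratio = len(raw) / len(zlib.compress(raw))  # compression ratio of the lossless coder

print("lossless exact:", restored == raw)
print("lossy max error:", max_error)
print("lossless ratio: %.1f:1" % ratio)
```

The lossy path bounds the error by half the quantization step, which is one way to read "within an accepted limit".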
3
Compression / Decompression
4
Different Forms of Alternative
Representation
• Image Transforms
- DCT
- DWT
• Fractal Representation
- Partial Iterated Function System
• Multi-resolution Representation
- Laplacian Pyramid
- Wavelet Subbands
• Differential Representation
- DPCM
5
6
JPEG Scheme
$$C_{2e}\{x(n)\} = X_{II}^{(N)}(k) = \sqrt{\frac{2}{N}}\,\alpha(k) \sum_{n=0}^{N-1} x(n) \cos\left(\frac{(2n+1)\pi k}{2N}\right), \quad 0 \le k \le N-1$$
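The Type-II DCT on this slide can be sketched directly from the summation. This is a plain O(N^2) illustration, not a fast algorithm. The slide does not define α(k); the sketch assumes the usual orthonormal choice α(0) = 1/√2 and α(k) = 1 otherwise, under which the inverse is simply the transposed sum.

```python
import math

def alpha(k):
    # Assumed normalisation (not stated on the slide): 1/sqrt(2) for k = 0, else 1.
    return 1 / math.sqrt(2) if k == 0 else 1.0

def dct2(x):
    """Type-II DCT of a length-N sequence, following the summation term by term."""
    N = len(x)
    return [math.sqrt(2 / N) * alpha(k) *
            sum(x[n] * math.cos((2 * n + 1) * math.pi * k / (2 * N)) for n in range(N))
            for k in range(N)]

def idct2(X):
    """Inverse transform; valid because the assumed normalisation is orthonormal."""
    N = len(X)
    return [math.sqrt(2 / N) *
            sum(alpha(k) * X[k] * math.cos((2 * n + 1) * math.pi * k / (2 * N))
                for k in range(N))
            for n in range(N)]

flat = [10.0] * 8                      # a constant block has only a DC coefficient
coeffs = dct2(flat)
ramp = [float(v) for v in range(8)]
recovered = idct2(dct2(ramp))          # round trip returns the input up to rounding
```

For a constant block, all energy lands in the k = 0 coefficient, which is why smooth image regions compress well after the transform.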
7
Quantization Table
8
DC and AC encoding
9
JPEG: Code Structure
10
JPEG2000 Scheme
11
Dyadic Decomposition
12
JPEG2000 9/7 wavelet filters

      Analysis Filter Coefficients      Synthesis Filter Coefficients
i     Lowpass H(i)    Highpass G(i)     Lowpass H'(i)    Highpass G'(i)
0      0.6029         -1.1151            1.1151          -0.6029
1      0.2669          0.5913            0.5913           0.2669
2     -0.0782          0.0575           -0.0575           0.0782
3     -0.0169         -0.0913           -0.0913          -0.0169
4      0.0267         (none)            (none)           -0.0267
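A quick sanity check on the tabulated coefficients is sketched below. It assumes each filter is symmetric about i = 0, so the listed values for i = 0..4 (or 0..3) expand to 9 and 7 taps, and it assumes the two values listed for i = 4 belong to the 9-tap filters H and G'. Under those assumptions the lowpass filters should pass DC and the highpass filters should reject it.

```python
# One-sided coefficients as listed in the table (index i = 0, 1, 2, ...).
H  = [0.6029, 0.2669, -0.0782, -0.0169, 0.0267]    # analysis lowpass (9 taps)
G  = [-1.1151, 0.5913, 0.0575, -0.0913]            # analysis highpass (7 taps)
Hs = [1.1151, 0.5913, -0.0575, -0.0913]            # synthesis lowpass (7 taps)
Gs = [-0.6029, 0.2669, 0.0782, -0.0169, -0.0267]   # synthesis highpass (9 taps)

def full_filter(half):
    """Mirror the one-sided list about i = 0 (symmetry is an assumption)."""
    return half[:0:-1] + half

def dc_gain(half):
    """Sum of all taps, i.e. the filter response to a constant signal."""
    return sum(full_filter(half))

for name, f in (("H", H), ("G", G), ("H'", Hs), ("G'", Gs)):
    print(name, len(full_filter(f)), "taps, DC gain %.4f" % dc_gain(f))
```

With the four-decimal values from the table, the analysis lowpass gain comes out close to 1, the synthesis lowpass gain close to 2, and both highpass gains close to 0.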
13
JPEG2000 Code Structure
14
Fractals
Infinite Detailing
Continuous at every point but nowhere differentiable
Self Similarity: rules of production similar at every scale
Whole can be described from part and vice versa
15
BASIC COMPRESSION SCHEME
[Figure: range block R1, transformation W1, and candidate domain blocks D1 and D2]
For every RANGE BLOCK (R), find a suitable DOMAIN BLOCK (D) so that
R ≈ W(D)
[W is the transformation applied on D]
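The range-to-domain search can be sketched on a 1-D signal. The slide does not specify W; this sketch assumes the common choice of spatial shrinking by 2 followed by an affine intensity map s*d + o fitted by least squares, with |s| clamped to 0.9 to keep the map contractive. The function names and the block sizes are hypothetical.

```python
def shrink(domain):
    """Average neighbouring pairs so a domain block matches the range-block length."""
    return [(domain[i] + domain[i + 1]) / 2 for i in range(0, len(domain), 2)]

def fit_affine(d, r):
    """Least-squares contrast s and brightness o so that s*d + o approximates r."""
    n = len(d)
    md, mr = sum(d) / n, sum(r) / n
    var = sum((x - md) ** 2 for x in d)
    s = 0.0 if var == 0 else sum((x - md) * (y - mr) for x, y in zip(d, r)) / var
    s = max(-0.9, min(0.9, s))            # assumed clamp to keep W contractive
    o = mr - s * md
    err = sum((s * x + o - y) ** 2 for x, y in zip(d, r))
    return s, o, err

def encode(signal, rsize=4):
    """For every range block, keep (domain start, s, o) of the best-matching domain block."""
    dsize = 2 * rsize
    code = []
    for r0 in range(0, len(signal), rsize):
        r = signal[r0:r0 + rsize]
        best = None
        for d0 in range(0, len(signal) - dsize + 1, rsize):
            s, o, err = fit_affine(shrink(signal[d0:d0 + dsize]), r)
            if best is None or err < best[3]:
                best = (d0, s, o, err)
        code.append(best[:3])
    return code

signal = [float(i % 8) for i in range(16)]   # toy signal; length is a multiple of rsize
code = encode(signal)
print(code)
```

Only the triples (domain position, s, o) are stored, one per range block; the pixel values themselves are not.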
16
ADAPTIVE SCHEMES
• RANGE & DOMAIN SIZE
– QUADTREE PARTITION
• DOMAIN POOL
– TRIANGULATED PARTITION
– LATTICE OF SEPARATION
• TRANSFORMATION SET
– IGNORE ISOMETRIES
– NON-LINEAR CONTRACTIVE TRANSFORM
17
DECOMPRESSION ALGORITHM
• Start with any arbitrary image.
• In every iteration, for each Range Block, apply the transformation to the corresponding Domain Block.
• After a sufficiently large number of iterations, the result converges to the target image (a lossy version of the original image).
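The iteration can be sketched on a 1-D signal. The stored code below is hypothetical: each entry (domain start, s, o) says that a range block of 4 samples is rebuilt from an 8-sample domain block, shrunk by pairwise averaging and mapped through s*d + o. Because |s| < 1, two very different starting signals end up at the same result.

```python
def shrink(domain):
    """Average neighbouring pairs so a domain block matches the range-block length."""
    return [(domain[i] + domain[i + 1]) / 2 for i in range(0, len(domain), 2)]

def decode(code, length, rsize=4, iterations=30, start=0.0):
    """Iterate the stored transformations from an arbitrary constant starting signal."""
    image = [start] * length
    for _ in range(iterations):
        new = []
        for d0, s, o in code:               # one entry per range block, in order
            d = shrink(image[d0:d0 + 2 * rsize])
            new.extend(s * x + o for x in d)
        image = new
    return image

# Hypothetical code for an 8-sample signal with two range blocks.
code = [(0, 0.5, 1.0), (0, -0.5, 6.0)]
from_black = decode(code, 8, start=0.0)
from_white = decode(code, 8, start=255.0)
gap = max(abs(a - b) for a, b in zip(from_black, from_white))
print(from_black)
print("difference between the two decodes:", gap)
```

Each pass halves the distance between the two runs here, so after 30 passes the starting image no longer matters.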
18
MPEG2
• Uses DCT-based, JPEG-like coding for spatial compression and motion compensation for temporal compression.
• Lossy video compression and lossy audio
compression.
• Core of most digital television and DVD
formats.
• Designed to code standard-definition television at bit rates from about 3-15 Mbit/s and high-definition television at 15-30 Mbit/s.
19
Encoding (DCT, Quant., Motion Estimation & Compensation, VLC)
INTRA → Motion Compensated Inter Frames → INTRA
Decoding (VLD, IQuant., IDCT, Inverse Motion Compensation)
20
Frame Types
• I Frames (Intra Pictures)
- encoded using only information within that frame.
- used as reference frames.
- provide the least compression.
• P Frames (Predicted Pictures)
- encoded with reference to a previous I or P frame.
- used as reference frames.
- provide moderate compression.
• B Frames (Bidirectional Pictures)
- require both past and future reference frames.
- never used as reference.
- provide the highest compression.
21
MPEG2 : Coding Sequence
Frame: 1 2 3 4 5 6 7 8 1
Type:  I B B B P B B B I
[Figure: arrows between frames mark Forward Prediction and Bidirectional Prediction]
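Because a B frame needs its future reference before it can be decoded, the frames are transmitted in a different order from the one in which they are displayed. The sketch below derives such a coding order from the display-order pattern on this slide. The function name is hypothetical, and the frames are indexed 1 to 9 (the slide labels the final I frame as frame 1 of the next group).

```python
def coding_order(display_types):
    """Return 1-based display indices in the order the frames would be coded."""
    order, pending_b = [], []
    for idx, ftype in enumerate(display_types, start=1):
        if ftype == "B":
            pending_b.append(idx)      # must wait for its future reference
        else:
            order.append(idx)          # I or P: the reference frame is coded first
            order.extend(pending_b)    # then the B frames that depend on it
            pending_b = []
    order.extend(pending_b)            # trailing B frames (edge case, no future reference)
    return order

display = "IBBBPBBBI"
order = coding_order(display)
print(order)                           # display indices in coding order
print("".join(display[i - 1] for i in order))
```

For the pattern I B B B P B B B I this gives the coded sequence I P B B B I B B B.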
22
MPEG2 Encoder
[Block diagram: Video in → (−) → DCT → Q → VLC → Coded bitstream out; feedback loop: Q → IQ → IDCT → (+) → MCP → prediction]
(I)DCT = (Inverse) Discrete Cosine Transform
(I)Q = (Inverse) Quantization
MCP = Motion Compensated Prediction
VLC = Variable Length Coder
23
MPEG2 Decoder
[Block diagram: Coded bitstream in → VLD → IQ → IDCT → (+) → Decoded video out; MCP feeds the adder]
(I)DCT = (Inverse) Discrete Cosine Transform
(I)Q = (Inverse) Quantization
MCP = Motion Compensated Prediction
VLD = Variable Length Decoder
24
Intra Frame Encoding
For each 8x8 block: DCT → Quant. → Zig-Zag Scan → RLE → Huffman → ……011000011010
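The zig-zag scan and run-length stages of this pipeline can be sketched as follows. The DCT, quantization and Huffman stages are left out, the quantized block is made up for illustration, and the run-length format is simplified to (zero run, value) pairs followed by an end-of-block marker.

```python
def zigzag_indices(n=8):
    """(row, col) positions of an n x n block in zig-zag order."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],                 # anti-diagonal number
                                  rc[0] if (rc[0] + rc[1]) % 2   # odd diagonal: go down
                                  else rc[1]))                   # even diagonal: go up

def run_length(ac):
    """Encode AC coefficients as (zero_run, value) pairs, closed by 'EOB'."""
    pairs, run = [], 0
    for v in ac:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")                # trailing zeros collapse into end-of-block
    return pairs

# Hypothetical quantized 8x8 block: a few low-frequency coefficients survive.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0], block[1][1] = 12, 5, -3, 2

scan = [block[r][c] for r, c in zigzag_indices()]
dc, ac = scan[0], scan[1:]
symbols = run_length(ac)
print(dc, symbols)
```

The scan groups the surviving low-frequency coefficients at the front, so the long tail of zeros costs a single end-of-block symbol.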
25
Motion Estimation & Prediction to construct Inter Frame (P/B- frames)
[Figure: macroblock m in the predicted P frame is matched against the reference frame; m' is its best-matching block in the reference]
e = m - m'
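Finding m' can be sketched as an exhaustive block-matching search. The matching criterion is not given on the slide; this sketch assumes the sum of absolute differences (SAD) over a small search window, with hypothetical function names, block size and test frames.

```python
def block(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]

def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def motion_search(ref, cur, top, left, size=2, radius=2):
    """Return the motion vector (dy, dx) and the prediction error e = m - m'."""
    m = block(cur, top, left, size)
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= len(ref) - size and 0 <= x <= len(ref[0]) - size:
                cost = sad(m, block(ref, y, x, size))
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    _, dy, dx = best
    m_ref = block(ref, top + dy, left + dx, size)       # m' in the reference frame
    e = [[a - b for a, b in zip(rm, rr)] for rm, rr in zip(m, m_ref)]
    return (dy, dx), e

# Toy frames: a 2x2 patch moves from (1, 1) in the reference to (2, 3) in the current frame.
ref = [[0] * 6 for _ in range(6)]
cur = [[0] * 6 for _ in range(6)]
patch = [[50, 60], [70, 80]]
for i in range(2):
    for j in range(2):
        ref[1 + i][1 + j] = patch[i][j]
        cur[2 + i][3 + j] = patch[i][j]

mv, e = motion_search(ref, cur, top=2, left=3)
print(mv, e)
```

When the match is exact, the error block is all zeros and only the motion vector carries information.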
26
P-Frame Encoding
Difference = block - Best Match (in Reference)
Difference → DCT + Quant. + RLE → Huffman → 01101100
Motion Vector → Huffman

B-Frame Encoding
Difference = Target - 0.5 x [Past reference + Future reference]
Difference → DCT + Quant. + RLE → Huffman coder → 01101100
Motion vectors → Huffman coder
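The B-frame difference, Target - 0.5 x (Past reference + Future reference), can be sketched per sample as below. The sample values are made up, motion compensation is assumed to have been applied already, and the DCT, quantization and entropy-coding stages are omitted.

```python
def b_frame_residual(target, past, future):
    """Residual = target - 0.5 * (past + future), sample by sample."""
    return [t - 0.5 * (p + f) for t, p, f in zip(target, past, future)]

def b_frame_reconstruct(residual, past, future):
    """Decoder side: add the residual back to the averaged prediction."""
    return [r + 0.5 * (p + f) for r, p, f in zip(residual, past, future)]

# Hypothetical motion-compensated samples from the two reference frames.
past = [10, 20, 30, 40]
future = [14, 24, 34, 44]
target = [12, 23, 32, 41]

residual = b_frame_residual(target, past, future)
rebuilt = b_frame_reconstruct(residual, past, future)
print(residual)
print(rebuilt)
```

Averaging two references usually leaves a smaller residual than either reference alone, which is consistent with B frames giving the highest compression.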
27
MPEG-4
29
MJPEG2000
(Motion JPEG2000)
30
MCJ2K codec Structure
[Block diagram: Video in → I frame → JPEG2000 Compression → Compressed Stream; P or B frame → Motion Estimation/Compensation → (+) → JPEG2000 Compression → Compressed Stream; JPEG2000 Decompression of I or P Frames feeds Motion Estimation/Compensation]
31
H.264/AVC: Encoder
32
Image Transform
• MPEG-2 uses an 8x8 DCT (Discrete Cosine Transform).
• H.264/AVC uses a separable integer 4x4 transform, an approximation to the 4x4 DCT.
• Can be computed with only 16-bit additions, subtractions, and shifts.
• Repeated transform of DC coefficients for 8x8 chroma and 16x16 Intra luma blocks.
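The 4x4 integer transform can be sketched in two equivalent ways: as the matrix product Y = Cf X Cf^T with the H.264 core matrix Cf, and as a butterfly that uses only additions, subtractions and shifts. The scaling that the standard folds into quantization is omitted, so this is only the core transform, and the helper names are hypothetical.

```python
# H.264/AVC forward core transform matrix.
CF = [[1,  1,  1,  1],
      [2,  1, -1, -2],
      [1, -1, -1,  1],
      [1, -2,  2, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def core_transform(x):
    """Y = Cf * X * Cf^T using plain integer matrix products."""
    return matmul(matmul(CF, x), transpose(CF))

def transform_1d(a, b, c, d):
    """Same 1-D transform with additions, subtractions and shifts only."""
    s0, s1 = a + d, b + c
    d0, d1 = a - d, b - c
    return (s0 + s1, (d0 << 1) + d1, s0 - s1, d0 - (d1 << 1))

def core_transform_fast(x):
    """Separable version: transform every row, then every column."""
    rows = [transform_1d(*row) for row in x]
    cols = [transform_1d(*col) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]

flat = [[1] * 4 for _ in range(4)]                  # constant block
sample = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
print(core_transform(flat))
print(core_transform_fast(sample) == core_transform(sample))
```

Since every entry of Cf is ±1 or ±2, no multiplications are needed, which is what makes the shift-and-add form possible.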
33
Intra-frame Prediction
• Two modes for luma block
– Intra 4x4
• 9 modes
• Used in texture area
– Intra 16x16
• 4 modes
• Used in flat area
• One mode for chroma blocks
– Similar to Intra 16x16
• I_PCM: bypass the prediction and transform
coding
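Intra prediction builds a block from already-decoded neighbouring samples and codes only the difference. The sketch below covers three of the nine Intra 4x4 modes (vertical, horizontal and DC) and picks the one with the smallest sum of absolute differences. The mode-selection rule and the sample values are assumptions for illustration; a real encoder may choose modes differently.

```python
def predict_4x4(mode, above, left):
    """Build a 4x4 prediction from the 4 samples above and the 4 samples to the left."""
    if mode == "vertical":
        return [list(above) for _ in range(4)]          # copy the row above downwards
    if mode == "horizontal":
        return [[left[r]] * 4 for r in range(4)]        # copy the left column across
    if mode == "dc":
        dc = (sum(above) + sum(left) + 4) >> 3          # rounded mean of the 8 neighbours
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("unsupported mode: " + mode)

def sad(pred, blk):
    return sum(abs(p - b) for pr, br in zip(pred, blk) for p, b in zip(pr, br))

def best_mode(blk, above, left, modes=("vertical", "horizontal", "dc")):
    """Pick the mode whose prediction is closest to the block (assumed SAD criterion)."""
    return min(modes, key=lambda m: sad(predict_4x4(m, above, left), blk))

above = [10, 20, 30, 40]
left = [10, 10, 10, 10]
blk = [[10, 20, 30, 40] for _ in range(4)]              # vertical stripes

mode = best_mode(blk, above, left)
residual = [[b - p for b, p in zip(br, pr)]
            for br, pr in zip(blk, predict_4x4(mode, above, left))]
print(mode, residual)
```

For this striped block the vertical mode predicts it exactly, so the residual passed on to the transform is all zeros.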
34
Intra 16x16 & Chroma prediction
• 4 modes
35
Intra 4x4
• 9 modes
36
Inter-Frame Prediction
– Multiple reference pictures and decoupling of reference
order from display order.
– Quarter sample accuracy motion vector
37
Macro Block and Block size
Macro Block: 16x16, divided into four 8x8 blocks
38
Performance comparison
H264/AVC Frame
MPEG-2 Frame
39
Performance comparison contd.
MPEG-2 Vs H264/AVC Baseline profile(foreman.cif video)
40