Chapter 11. Image compression
Digital image processing
11. IMAGE COMPRESSION
11.1. Introduction
11.2. Pixel level coding
PCM coding
Entropic coding
RL coding
Arithmetic coding
11.3. Predictive coding techniques
Principles of the predictive coding
Delta modulation
1-D DPCM
2-D DPCM
Comparison of the performances of various predictive coding methods
11.4. Transform-based image compression techniques
Bits allocation
Transform-based image compression
Zonal coding; threshold-based coding
Transform-based coding with visibility criteria
Adaptive transform-based coding
2-D DCT based image coding
Image compression algorithm based on DCT
11.5. Hybrid coding; vectorial DPCM based coding
11.6. Interframe coding
Principles of the inter-frame coding
Predictive coding with motion compensation
Motion estimation algorithms
Hybrid inter-frame coding
11.1. INTRODUCTION
• Def. Compression = minimization of the number of bits needed to represent the visual
information in the scene, without noticeable loss of information
• Applications:
• image transmission
• image storage in multimedia/image databases
The need for image compression:
• huge size of the data files in uncompressed form
• storage devices have relatively slow access => impossible to render/index image information in real time
• limited bandwidth of the communication channels => impossible to transmit images in real time
Typical file sizes and bandwidths of uncompressed data (translated from Romanian):

Type      | Content                                | File size / bandwidth
TEXT      | ASCII, EBCDIC                          | 2 kB/page
IMAGES    | bitmap graphics; fax; still images     | simple (320x200x8 bits): 77 kB/image; complex, true color (1100x900x24 bits): 3 MB/image
AUDIO     | uncoded audio or voice sequences       | voice/telephone, 8 kHz/8 bits (mono): 6-44 kB/s; audio CD, 44.1 kHz, 16 bits, stereo (44.1x2chx16bits): 176 kB/s
ANIMATION | sound and images at 15-19 frames/s     | sequences of 16 bits/pixel at 16 frames/s (320x640x16bitsx16frames/s): 6.5 MB/s
VIDEO     | analog TV or images at 24-30 frames/s  | color sequences on 24 bits at 30 frames/s (640x480x24bitsx30frames/s): 27.6 MB/s

Typical compression ratios (memory needed for storage before vs. after compression):
 text 2:1,
 color images 15:1,
 maps 10:1,
 stereo sound 6:1,
 animation 50:1,
 video 50:1.
•Shannon => the average amount of information per symbol (the entropy):

H = − Σ_{i=0}^{L−1} p_i·log2(p_i)  [bits/symbol],  L = number of symbols

=> It is possible to losslessly encode the source having the entropy H on (H+ε) bits/symbol, ε>0, ε→0.

The compression factor (ratio) C:

C = (average number of bits/symbol of the uncompressed data) / (average number of bits/symbol of the compressed data) = B/(H+ε) ≈ B/H
•Maximal (ideal) compression can be achieved by vector quantization (theoretically). Practically, only near-optimal compression is achieved.
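As a quick numeric illustration of the entropy bound and the compression factor above (a sketch; the 8-symbol source and B = 3 bits/symbol are assumed values, not from the slides):

```python
import math

def entropy(probs):
    """Shannon entropy H = -sum(p * log2 p), in bits/symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical source: 8 symbols, stored uncompressed with B = 3 bits/symbol
probs = [0.5, 0.2, 0.1, 0.05, 0.05, 0.04, 0.03, 0.03]
B = 3
H = entropy(probs)
C = B / H          # ideal compression factor, neglecting the small epsilon
print(f"H = {H:.3f} bits/symbol, C = {C:.2f}:1")
```

The skewed distribution gives H ≈ 2.22 bits/symbol, so even ideal lossless coding of this source gains only about 1.35:1.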
11.2. PIXEL LEVEL CODING
11.2.1. PCM (pulse code modulation)
Block diagram: u(t) → [sample and hold] → u(nT) → [zero-memory quantizer + encoder] → g_i(n) → [digital modulation] → transmission

The minimal number of bits per pixel achievable by PCM:

R_PCM = (1/2)·log2(σ²u/σ²q)

σ²u = the variance of the signal at the input of the quantizer; σ²q = the variance of the quantization error
11.2.2. Entropic coding (Huffman)
= coding one block of M pixels, M – small; without compression: B bits/pixel needed =>
MB bits/block
=> the source has L = 2^MB symbols => any block can be encoded on less than MB bits in average, according to:

R = − Σ_{i=0}^{L−1} p_i·log2(p_i) + ε = H + ε = average bit rate
=> entropic coding principle: use a codebook with variable length code words, the length
of the code word is different for each block as follows:
- the more frequent blocks (pi large) => have -log2pi smaller => shorter code word
- the less frequent blocks (pi small) => have -log2pi larger => longer code word
•Huffman coding algorithm:
1) Order the probabilities pi, i = 1,2,…,L, decreasingly and write them in a list of symbols.
2) Choose from the list the last 2 elements (those with the smallest probabilities, pk, pl). From the two => create a node with the probability pk+pl => and insert it in the ordered list in the right position; erase pk, pl from the list.
3) Assign the codes: 0 to k; 1 to l.
4) Repeat steps (1)…(3) until a single node (the root) remains.
Symbol | pi    | −log2(pi) | Code | Subtotal
A      | 15/39 | 1.38      | 0    | 1·15
B      | 7/39  | 2.48      | 111  | 3·7
C      | 6/39  | 2.70      | 110  | 3·6
D      | 6/39  | 2.70      | 101  | 3·6
E      | 5/39  | 2.96      | 100  | 3·5
Total: 87 bits
[Figure: the Huffman binary tree of the construction]
Average code length: 87/39 = 2.23 bits per symbol in average (close to the entropy H).
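The construction above can be sketched in a few lines of Python (a minimal illustration of mine, using a heap of sub-trees; it reproduces the code lengths 1, 3, 3, 3, 3 of the table and the 87-bit total):

```python
import heapq
from fractions import Fraction

def huffman_lengths(probs):
    """Build a Huffman tree bottom-up; return the code length of each symbol."""
    # Heap entries: (probability, unique tie-breaker, set of symbol indices)
    heap = [(p, i, {i}) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    tie = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)     # the two smallest probabilities
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 | s2:                   # every symbol under the new node
            lengths[i] += 1                 # gains one code bit
        heapq.heappush(heap, (p1 + p2, tie, s1 | s2))
        tie += 1
    return lengths

counts = [15, 7, 6, 6, 5]                   # A..E, out of 39 occurrences
probs = [Fraction(c, 39) for c in counts]
L = huffman_lengths(probs)
total_bits = sum(l * c for l, c in zip(L, counts))
print(L, total_bits, total_bits / 39)       # average code length ≈ 2.23 bits/symbol
```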
Decoding exercise:

Symbol | Code
A      | 0
B      | 111
C      | 110
D      | 101
E      | 100

The received string: 1000100010101010110111 (22 bits)
Decode!
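To check your answer: decoding a prefix code is a greedy left-to-right scan (a sketch of mine; the codebook of the table is hard-coded):

```python
code = {"0": "A", "111": "B", "110": "C", "101": "D", "100": "E"}

def decode(bits, code):
    """Greedy prefix-code decoding: grow the buffer until it matches a codeword."""
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in code:          # prefix property: the first match is the symbol
            out.append(code[buf])
            buf = ""
    assert buf == "", "stream ended in the middle of a codeword"
    return "".join(out)

print(decode("1000100010101010110111", code))  # → EAEADADACB
```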
•
Versions of the Huffman coding:
* the truncated Huffman code
- first L1 symbols – Huffman encoded;
- the remaining L-L1 are encoded using a prefix code followed by a fixed length code
* the modified Huffman code
- the integer number “i” is represented as i=qL1+j
- encode first L1 symbols with a Huffman code
- if above L1 => encode q by a prefix code + use a terminator code = Huffman code of j
 (L  1)
i  qL1  j,0  q  Int
,0  j  L1  1

 L1 
11.2.3. RLC (run length coding)
=> For binary images: 0 – white, 1 – black; p(white) >> p(black); encode the length of the white runs
=> applications: fax; sketches
p(white) = p(0) = p; p(black) = p(1) = q; p → 1.
 Notations:
M = the maximum length of a run of zeros; M = 2^m − 1 => m bits to encode a run
11.2.3. RLC - continued
 The probability of appearance of a run of l symbols '0', 0 ≤ l ≤ M, followed by a '1':

g(l) = p^l·(1−p), for 0 ≤ l ≤ M−1;  g(M) = p^M

 The average number of symbols per run:

μ = Σ_{l=0}^{M−1} (l+1)·p^l·(1−p) + M·p^M = (1 − p^M)/(1 − p)

 The compression ratio:

C = μ/m = (1 − p^M)/(m·(1 − p))

 The average number of bits per pixel:

Ba = m/μ = (number of bits used to encode a run of '0's, ended by the bit '1') / (average number of '0' bits contained in a run)

 The coding efficiency η = H/Ba can be increased by Huffman coding or by arithmetic coding of the run lengths (instead of the uniform m-bit encoding).
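A sketch of the m-bit run-length scheme described above (my own minimal implementation): each m-bit word is the length l of a run of '0's terminated by a '1', and a run of exactly M = 2^m − 1 zeros is emitted without a terminating '1', matching the two branches of g(l).

```python
def rlc_encode(bits, m):
    """Encode a binary string as run lengths of '0's; each emitted word is an
    m-bit count: l < M means l zeros followed by '1', l == M means M zeros."""
    M = 2 ** m - 1
    words, run = [], 0
    for b in bits:
        if b == "0":
            run += 1
            if run == M:            # maximal run: emit M and restart counting
                words.append(M)
                run = 0
        else:                       # a '1' terminates the current run
            words.append(run)
            run = 0
    # NOTE: a trailing run of zeros would need a flush convention; the input
    # is assumed to end with '1' here for simplicity.
    return words

def rlc_decode(words, m):
    M = 2 ** m - 1
    return "".join("0" * l + ("1" if l < M else "") for l in words)

data = "0" * 7 + "1" + "0" * 15 + "1" + "00" + "1"   # mostly-white line
words = rlc_encode(data, m=4)
assert rlc_decode(words, m=4) == data
print(words, f"{len(data)} bits -> {4 * len(words)} bits")
```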
11.2.4. The arithmetic coding
=> A single codeword is used for an entire sequence of symbols
=> The code = a number in a sub-range of [0;1).

Source's symbol | Probability | Initial sub-range
a1 | 0.2 | [0.0, 0.2)
a2 | 0.2 | [0.2, 0.4)
a3 | 0.4 | [0.4, 0.8)
a4 | 0.2 | [0.8, 1.0)

The sequence to encode: a1 a2 a3 a3 a4. Each encoded symbol narrows the current interval to its sub-range:

[0, 1) →a1→ [0.0, 0.2) →a2→ [0.04, 0.08) →a3→ [0.056, 0.072) →a3→ [0.0624, 0.0688)

=> codeword: any number in [0.0624; 0.0688), e.g. 0.068

* Decoding:
1) 0.068 ∈ [0.0; 0.2) => a1;  (0.068 − 0.0)/(0.2 − 0.0) = 0.34
2) 0.34 ∈ [0.2; 0.4) => a2;  (0.34 − 0.2)/(0.4 − 0.2) = 0.7
Arithmetic decoding - continued
3) 0.7 ∈ [0.4; 0.8) => a3;  (0.7 − 0.4)/(0.8 − 0.4) = 0.75
4) 0.75 ∈ [0.4; 0.8) => a3;  (0.75 − 0.4)/(0.8 − 0.4) = 0.875
5) 0.875 ∈ [0.8; 1) => a4
=> The decoded sequence: a1 a2 a3 a3 a4
* Drawbacks of the arithmetic coding:
- limited by the representation precision for long sequences
- an end-of-message flag is needed
* Alternative solutions: re-normalization and rounding
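The worked example above can be reproduced with a few lines of floating-point arithmetic coding (a sketch; as noted above, practical coders use integer arithmetic with re-normalization, and an end-of-message flag instead of a known length n):

```python
# Sub-ranges from the table: a1 [0,0.2), a2 [0.2,0.4), a3 [0.4,0.8), a4 [0.8,1)
ranges = {"a1": (0.0, 0.2), "a2": (0.2, 0.4), "a3": (0.4, 0.8), "a4": (0.8, 1.0)}

def ar_encode(seq):
    low, high = 0.0, 1.0
    for s in seq:                        # narrow [low, high) to the symbol's sub-range
        lo, hi = ranges[s]
        low, high = low + (high - low) * lo, low + (high - low) * hi
    return low, high                     # any number in [low, high) is a valid code

def ar_decode(x, n):
    out = []
    for _ in range(n):                   # n symbols assumed known to the decoder
        for s, (lo, hi) in ranges.items():
            if lo <= x < hi:
                out.append(s)
                x = (x - lo) / (hi - lo)   # rescale, as in the decoding steps above
                break
    return out

low, high = ar_encode(["a1", "a2", "a3", "a3", "a4"])
print(low, high)                         # final interval, contains 0.068
print(ar_decode(0.068, 5))               # → ['a1', 'a2', 'a3', 'a3', 'a4']
```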
11.2.5. Bit-plane encoding
=> e.g. 8 planes of 1 bit each, independently encoded (MSB … LSB)
=> disadvantage: sensitive to errors
11.3. PREDICTIVE CODING
11.3.1. DPCM coding principles
= Maximize image compression efficiency by exploiting the spatial redundancy present in
an image!
The line-by-line difference of the luminance contains mostly near-zero data => spatial redundancy: the brightness almost repeats from one point to the next! => no need to encode all the brightness information, only the new one!
As the images show, one can "estimate" ("predict") the brightness of the subsequent spatial point based on the brightness of the previous (one or more) spatial points = PREDICTIVE CODING
11.3.1. DPCM coding principles – continued – BASIC FORMULATION OF DPCM:
Let {u(m)} – the image pixels, represented on a line by line basis, as a vector.
Consider we already encoded u(0), u(1),…, u(n−1); then at the decoder only their decoded versions (original + coding error) are available, denoted: u•(0), u•(1),…, u•(n−1).
=> currently to do: predictively encode u(n), n = the current sample:
(1) estimate (predict) the grey level of the n-th sample based on the knowledge of the previously encoded neighbor pixels u•(0), u•(1),…, u•(n−1):

û(n) = ψ(u•(n−1), u•(n−2), …)

(2) compute the prediction error:

e(n) = u(n) − û(n)

(3) quantize the prediction error e(n) and keep the quantized value e•(n); encode e•(n) by PCM and transmit it.

At the decoder: 1) decode => get e•(n); 2) build the prediction û(n); 3) construct u•(n):

u•(n) = û(n) + e•(n)

* The encoding & decoding error:

u(n) − u•(n) = u(n) − û(n) − e•(n) = e(n) − e•(n) = q(n), i.e. the quantization error of e(n)
11.3. PREDICTIVE CODING - continued
Basic DPCM codec
DPCM codec: (a) with distortions; (b) without distortions
11.3.2. Delta modulation
=> the simplest DPCM encoder; prediction from 1 sample, 1 bit quantizer.
u (n)  u(n  1);

e(n)  u(n) u (n  1)
=> alternatively, one can predict the value of the sample as:
u  (n)    u (n  1)   (n);
with:
 (n)= zero-mean error;
 - factor which minimizes the mean square error (leak factor); given by the error
covariance equation:
E (n)  (m)  (1   2 )u 2  (m  n)
=> SNR can be estimated as:
(SNR) M  10log10
{1  (2  1)f(1)}
dB
2(1  )f(1)
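A minimal delta-modulation sketch (my own illustration: fixed step q, prediction = the previous reconstructed sample): the encoder sends one bit per sample and the decoder integrates the ±q steps into a staircase approximation.

```python
def dm_encode(samples, q):
    """1-bit DPCM: predict the previous reconstruction, send the error's sign."""
    bits, rec = [], 0.0                 # the decoder is assumed to start from 0 too
    for u in samples:
        e = u - rec                     # prediction error, prediction = rec
        b = 1 if e >= 0 else 0
        bits.append(b)
        rec += q if b else -q           # reconstruction tracks the input in ±q steps
    return bits

def dm_decode(bits, q):
    out, rec = [], 0.0
    for b in bits:
        rec += q if b else -q
        out.append(rec)
    return out

signal = [0, 2, 5, 9, 12, 12, 11, 8, 4, 1]
bits = dm_encode(signal, q=2.0)
print(bits)                             # one bit per sample
print(dm_decode(bits, q=2.0))           # staircase approximation of the signal
```

Note the two classic artifacts: slope overload when the input rises faster than q per sample, and granular noise (±q oscillation) on flat regions.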
Delta modulation result for images: q = 6.5; ρ = 1
Original
Compressed
(compression rate 8:1)
11.3.3. 1-D DPCM (line-by-line DPCM)
=> the usual version of applying DPCM to digital images. The sequence to encode = one image line; consider the sequence {u(n)} stationary => one can use a p-th order linear predictor:

u(n) = Σ_{k=1}^{p} a(k)·u•(n−k) + e(n)

where the a(k) are chosen to minimize:

E[e(n)²] = σ²e
**The equations of the 1-D DPCM codec:
(1) Predictor:  û(n) = Σ_{k=1}^{p} a(k)·u•(n−k)
(2) Quantizer's input:  e(n) = u(n) − û(n)
(3) Quantizer's output:  e•(n)
(4) Decoder:  u•(n) = û(n) + e•(n) = Σ_{k=1}^{p} a(k)·u•(n−k) + e•(n)

=> Estimation of the SNR:

(SNR)_DPCM = 10·log10 [ (1 − ρ²·f(B)) / ((1 − ρ²)·f(B)) ]  [dB]

=> Average bit rate reduction vs. PCM:

R_PCM − R_DPCM = (1/2)·log2 [1/(1 − ρ²)]  bits/pixel
11.3.3. 1-D DPCM - Comparison to PCM; the effect of error propagation:
[Figure: PCM on 8 bits; PCM on 3 bits; DPCM on 3 bits; the error in DPCM on 3 bits]
11.3.4. 2-D DPCM
=> prediction as a function of one neighbor on the same line and 3 neighbors on the previous line (A = u•(m,n−1), B = u•(m−1,n−1), C = u•(m−1,n), D = u•(m−1,n+1); X = u(m,n) = the current pixel):

û(m,n) = a1·u•(m−1,n) + a2·u•(m,n−1) + a3·u•(m−1,n−1) + a4·u•(m−1,n+1)

with the coefficients chosen to minimize E[e(m,n)²] = σ²e, which leads to the normal equations:

r(1,0) = a1·r(0,0) + a2·r(1,−1) + a3·r(0,1) + a4·r(0,1)
r(0,1) = a1·r(1,−1) + a2·r(0,0) + a3·r(1,0) + a4·r(1,−2)
r(1,1) = a1·r(0,1) + a2·r(1,0) + a3·r(0,0) + a4·r(0,2)
r(1,−1) = a1·r(0,1) + a2·r(1,−2) + a3·r(0,2) + a4·r(0,0)

σ²e = E[e²(m,n)] = r(0,0) − a1·r(1,0) − a2·r(0,1) − a3·r(1,1) − a4·r(1,−1)

r(k,l) = the covariance of u(m,n). If r(k,l) is separable (r(k,l) = σ²u·ρ1^|k|·ρ2^|l|) =>:

a1 = ρ1,  a2 = ρ2,  a3 = −ρ1·ρ2,  a4 = 0
σ²e = σ²u·(1 − ρ1²)·(1 − ρ2²)
11.3.4. 2-D DPCM - continued
**Equations of the 2-D DPCM codec:
(1) Predictor:  û(m,n) = a1·u•(m−1,n) + a2·u•(m,n−1) + a3·u•(m−1,n−1) + a4·u•(m−1,n+1)
(2) Quantizer's input:  e(m,n) = u(m,n) − û(m,n)
(3) Quantizer's output:  e•(m,n)
(4) Decoder:  u•(m,n) = û(m,n) + e•(m,n)
** Comparison of the performances of the predictive coding methods:
[Figure: SNR (dB), from 0 to 35, vs. rate (bits/sample), from 1 to 4, for: 2-D DPCM (maximum); 2-D DPCM with three-point prediction; line-by-line DPCM; PCM]
11.4. TRANSFORM-BASED IMAGE COMPRESSION
( “block quantization”)
• The essence of transform-based image compression:
(1) an image block is transformed using a unitary transform, so that a large percent of the energy of the block is packed into a small number of transform coefficients
(2) the transform coefficients are independently quantized
• The optimal transform-based encoder = the encoder which minimizes the mean square error of the decoded data, for a given number of bits => the Karhunen-Loeve based encoder.
• Transform-based image coding algorithm:
u = a random vector, u[N×1], of zero mean; v = A·u – a linear transform of u, such that the v(k) are uncorrelated => they can be quantized independently => v'(k); u' = B·v' – the inverse transform
11.4.1. Main issues to be solved in transform-based image coding:
(1) Finding the optimal transform matrices A[N×N], B[N×N], able to minimize the mean square value of the distortion D for a given number of bits:

D = (1/N)·E[ Σ_{n=1}^{N} (u(n) − u'(n))² ] = (1/N)·E[ (u − u')^T·(u − u') ]

(2) Designing the optimal quantizers for v, to minimize D.

• Approaches to solve these issues:
(1) For a given quantizer and a given forward transform matrix A, the optimal reconstruction matrix B is:

B = A⁻¹·diag(λ1, λ2, …, λN),  λk = λ̄k/λ'k

where λ̄k = E[v(k)·v'*(k)] and λ'k = E[|v'(k)|²].

• The optimal matrix A from the maximal de-correlation point of view = the KLT of u; the lines of A = the eigenvectors of the covariance matrix R of u. Then B = A⁻¹ = A*^T.
Unfortunately the KLT does not have a fast computation algorithm => it is typically replaced in compression by the DCT, DST, DFT, Hadamard or Slant transforms => a good compromise between decorrelation performance & computational speed.
11.4.2. Bits allocation:
• The variance of v(k) is typically non-uniform => a different number of bits nk is needed for each coefficient for the same quantization error; v(k) → nk
• The optimal design of the transform-based encoder: how to efficiently allocate the total available number of bits among the transform coefficients, so that D is minimized; for A – some unitary matrix, B = A⁻¹ = A*^T:

D = (1/N)·Σ_{k=0}^{N−1} E[|v(k) − v'(k)|²] = (1/N)·Σ_{k=0}^{N−1} σ²k·f(nk)

where: σ²k = the variance of v(k); nk = the number of bits allocated for the quantization of v(k); f(nk) = the distortion-rate function of the quantizer, monotonic, f(0) = 1, f(∞) = 0.

• The transform-based encoder rate is: RA = (1/N)·Σ_{k=0}^{N−1} nk ≤ B; B = the average number of bits/sample.
• Bits allocation = finding the optimal nk, nk ≥ 0, which minimize D for the given average number of bits/sample B.
» Bits allocation algorithm:
(1) Define h(x) = f'⁻¹(x), where f'(x) = ∂f(x)/∂x.
(2) Find θ = the root of the non-linear equation:

RA = (1/N)·Σ_{k: σ²k > θ} h(θ·f'(0)/σ²k) = B    (11.30)

One encodes only the coefficients having σ²k > θ.
(3) The number of bits nk is:

nk = 0, for σ²k ≤ θ;  nk = h(θ·f'(0)/σ²k), for σ²k > θ    (11.31)

=> The minimum mean distortion:

D = (1/N)·[ Σ_{k: σ²k > θ} σ²k·f(nk) + Σ_{k: σ²k ≤ θ} σ²k ]    (11.32)

• If instead of imposing B one imposes D = d: (1) solve (11.32) => θ; (2) solve (11.31) => nk; (3) estimate B.
» Practical (simplified) algorithm to allocate the bits, for nk integer:
= iterative algorithm, j = the current iteration:
(1) At the beginning of the iterations, set: nk⁰ = 0, 0 ≤ k ≤ N−1; j = 1.
(2) Compute: nk^j = nk^(j−1) + δ(k − I), where I is chosen so that Δ_I = max_k Δ_k, with Δ_k = σ²k·[f(nk^(j−1)) − f(nk^(j−1) + 1)].
(3) If Σ_{k=0}^{N−1} nk^j = N·B => stop the algorithm. Otherwise j = j + 1 and repeat (2).
11.4.3. Transform-based image compression steps:
• The image = U[M×N], U = {u(m,n)} → vector u[MN×1] => A[MN×MN], B[MN×MN], v[MN×1]
• In practice: use DCT, DST, DFT, Slant, Hadamard – they have fast algorithms
• Practical implementation of the transform-based image compression:
(1) divide the image into small blocks: U[M×N] => I = MN/pq blocks Ui[p×q], i = 0,1,…,I−1, p << M, q << N
(2) transform + independently encode the blocks Ui[p×q].
=> Advantages: (a) memory-efficient implementation of the transform (MN/pq times smaller);
(b) the number of operations for computing the transform decreases by a factor of about log2(MN)/log2(pq)
Disadvantage: lower compression rate r.
[Fig. 11.11: Compression rate r (bits/pixel), from 0.5 to 2.5, vs. block size (2x2 … 128x128), for a KLT-based compression scheme]
• Practical algorithm for transform-based image compression (Fig. 11.12, 2-D transform image coding), with I = MN/pq blocks:

Encoding: Ui[p×q] → Vi = Ap·Ui·Aq^T → quantizer → transmit/storage
Decoding: U'i = Ap*^T·V'i·Aq*

* The steps of the algorithm:
(1) Divide the input image into blocks Ui; transform them (2-D) to get Vi, i = 0,…,I−1
(2) Bit allocation (the original figure shows the bit allocation matrix of the DCT for a 16x16 pixels block)
(3) Quantizer design – normalize each coefficient by its standard deviation before quantization:

ṽi(k,l) = vi(k,l)/σk,l , (k,l) ≠ (0,0)

(4) Encode the quantizer's output.
(5) Decode & inverse quantization of the coefficients – at the decoder:

v'i(k,l) = σk,l·ṽ'i(k,l), for (k,l) ∈ It;  v'i(k,l) = 0, otherwise

=> One can compute the mean distortion and the mean bit rate:

D = (1/pq)·Σ_{k=0}^{p−1} Σ_{l=0}^{q−1} σ²k,l·f(nk,l),   R = (1/pq)·Σ_{k=0}^{p−1} Σ_{l=0}^{q−1} nk,l

[Figure: mean distortion D (10⁻³ … 10⁻¹) vs. rate R (0.25 … 4.0 bits/pixel) for the Cosine/KL, Hadamard and Sine transforms]
11.4.4. Zonal coding and threshold-based coding:
• Zonal coding: according to the bit allocation matrix => transmit just Nt coefficients = the transform coefficients with large variances => the zonal mask = mask with value 1 for the transmitted coefficients, 0 otherwise:

m(k,l) = 1, for (k,l) ∈ It;  m(k,l) = 0, otherwise

• Threshold-based coding: transmit only the coefficients having an amplitude above a threshold η (regardless of their variance); the set of the transmitted coefficients is:

I't = {(k,l): |v(k,l)| > η}

m'(k,l) = 1, for (k,l) ∈ I't;  m'(k,l) = 0, otherwise  = the threshold-based mask
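Both masks are a few lines each (a sketch of mine; the flattened 4x4 variances and coefficient values are made up for illustration):

```python
def zonal_mask(variances, Nt):
    """m(k,l) = 1 for the Nt coefficients with the largest variances."""
    order = sorted(range(len(variances)), key=lambda i: -variances[i])
    keep = set(order[:Nt])
    return [1 if i in keep else 0 for i in range(len(variances))]

def threshold_mask(coeffs, eta):
    """m'(k,l) = 1 where the coefficient amplitude exceeds the threshold eta."""
    return [1 if abs(v) > eta else 0 for v in coeffs]

# Hypothetical 4x4 block, flattened row by row:
variances = [900, 400, 120, 30, 380, 200, 60, 10, 100, 50, 20, 5, 25, 12, 6, 2]
coeffs = [510, -96, 14, 3, -80, 41, -9, 1, 22, -15, 4, 0, 7, -2, 1, 0]
print(zonal_mask(variances, Nt=6))     # fixed mask, designed once from statistics
print(threshold_mask(coeffs, eta=20))  # adaptive mask, recomputed per block
```

The design difference is visible here: the zonal mask is fixed in advance from the variance model, while the threshold mask depends on the actual block (so the positions of the kept coefficients must also be transmitted).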
11.4.5. Transform-based coding with visibility criteria:
[Fig. 11.16: Transform-based codec with visibility criteria (F – the unitary Fourier transform matrix):
a. The encoding system: L(m,n) → brightness/contrast transform f(·) → u → v = F·U·F → visual quantization → v' → F*·V'·F* → u' → inverse contrast/brightness transform f⁻¹(·) → L'(m,n)
b. Visual quantization based on visibility criteria: v(k,l) → frequency filtering H(k,l) → w(k,l) → MMSE quantizer → w'(k,l) → inverse frequency filtering 1/H(k,l) → v'(k,l)]

11.4.6. Adaptive transform-based coding:
• adapt the transform
• adapt the bit allocation
• adapt the quantization levels
11.4.7. 2-D DCT-based coding:

v(k,l) = α(k)·α(l)·Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} u(m,n)·cos[(2m+1)kπ/(2N)]·cos[(2n+1)lπ/(2N)]

u(m,n) = Σ_{k=0}^{N−1} Σ_{l=0}^{N−1} α(k)·α(l)·v(k,l)·cos[(2m+1)kπ/(2N)]·cos[(2n+1)lπ/(2N)]

with α(0) = √(1/N) and α(k) = √(2/N) for 1 ≤ k ≤ N−1.

Fig. 11.17 The influence of the different DCT coefficients
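The transform pair above, implemented directly from the formulas (an O(N⁴) sketch, fine for small blocks; real codecs use fast algorithms, as noted earlier; the 4x4 sample block is an arbitrary example):

```python
import math

def alpha(k, N):
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def dct2(u):
    """Forward 2-D DCT, computed term by term from the definition."""
    N = len(u)
    return [[alpha(k, N) * alpha(l, N) * sum(
                u[m][n] * math.cos((2 * m + 1) * k * math.pi / (2 * N))
                        * math.cos((2 * n + 1) * l * math.pi / (2 * N))
                for m in range(N) for n in range(N))
             for l in range(N)] for k in range(N)]

def idct2(v):
    """Inverse 2-D DCT; exact reconstruction up to rounding error."""
    N = len(v)
    return [[sum(alpha(k, N) * alpha(l, N) * v[k][l]
                 * math.cos((2 * m + 1) * k * math.pi / (2 * N))
                 * math.cos((2 * n + 1) * l * math.pi / (2 * N))
                 for k in range(N) for l in range(N))
             for n in range(N)] for m in range(N)]

u = [[52, 55, 61, 66], [70, 61, 64, 73], [63, 59, 55, 90], [67, 61, 68, 104]]
v = dct2(u)
w = idct2(v)
err = max(abs(u[m][n] - w[m][n]) for m in range(4) for n in range(4))
print(v[0][0], err)   # v(0,0) = (1/N)·Σu (the DC term); err ≈ 0
```

Most of the block's energy ends up in v(0,0) and the other low-frequency coefficients, which is exactly what the zonal and threshold masks of 11.4.4 exploit.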
11.4.7. 2-D DCT-based coding - continued:
• DCT-based encoding algorithm:
• DCT-based encoding algorithm - continued: