wavelet_coding

Roadmap to Lossy Image Compression
• JPEG standard: DCT-based image coding
• First-generation wavelet coding
    • FBI WSQ standard
• Second-generation schemes
    • Embedded Zerotree Wavelet (EZW)
    • A unified where-and-what perspective
    • A classification-based interpretation
• Beyond wavelet coding
EE591U Wavelets and Filter Banks
Copyright Xin Li 2008
A Tour of JPEG Coding Algorithm
[Figure: flow-chart diagram of the DCT-based coding algorithm specified by the Joint Photographic Experts Group (JPEG), with blocks T (transform), Q (quantization), and C (entropy coding)]
Transform Coding of Images
• Why not transform the whole image together?
    • It requires a large memory to store the transform matrix
    • It is not a good idea for compression, because the statistics vary spatially within an image
• Idea: partition the image into blocks
    • Each block is viewed as a smaller image and processed independently
8-by-8 DCT Basis Images
\[
A_{8\times 8} =
\begin{bmatrix}
a_{11} & \cdots & \cdots & a_{18} \\
\vdots & \ddots &        & \vdots \\
\vdots &        & \ddots & \vdots \\
a_{81} & \cdots & \cdots & a_{88}
\end{bmatrix},
\qquad
a_{kl} =
\begin{cases}
\dfrac{1}{\sqrt{8}}, & k = 1,\ 1 \le l \le 8 \\[6pt]
\dfrac{1}{2}\cos\dfrac{(2l-1)(k-1)\pi}{16}, & 2 \le k \le 8,\ 1 \le l \le 8
\end{cases}
\]
\[
Y = \sum_{i=1}^{8}\sum_{j=1}^{8} x_{ij} B_{ij},
\qquad
B_{ij} = b_i b_j^{T},
\qquad
b_i = [a_{i1}, \ldots, a_{i8}]^{T}
\]
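The definition of a_kl can be checked numerically. The following is a minimal sketch in plain Python (0-based indices, so k and l run from 0 to 7); the function names `dct_matrix` and `basis_image` are illustrative and not part of the slides. It builds A, verifies that its rows are orthonormal, and forms one basis image as the outer product b_i b_j^T.

```python
import math

N = 8

def dct_matrix(n=N):
    """Return the n-by-n DCT matrix A with entries a_kl (0-based k, l)."""
    rows = []
    for k in range(n):
        if k == 0:
            # First row: constant 1/sqrt(n)
            rows.append([1.0 / math.sqrt(n)] * n)
        else:
            # Remaining rows: sqrt(2/n) * cos((2l+1) k pi / (2n)); sqrt(2/8) = 1/2
            scale = math.sqrt(2.0 / n)
            rows.append([scale * math.cos((2 * l + 1) * k * math.pi / (2 * n))
                         for l in range(n)])
    return rows

def basis_image(A, i, j):
    """Outer product b_i b_j^T, where b_i is row i of A."""
    n = len(A)
    return [[A[i][r] * A[j][c] for c in range(n)] for r in range(n)]

A = dct_matrix()

# Gram matrix A A^T; for an orthonormal transform it equals the identity.
gram = [[sum(A[i][m] * A[j][m] for m in range(N)) for j in range(N)]
        for i in range(N)]

# The (0, 0) basis image is constant (the DC pattern).
B00 = basis_image(A, 0, 0)
print(B00[0][0])
```

Because A is orthonormal, the inverse transform is simply the transpose, which is what makes the decoder's inverse DCT as cheap as the forward one.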
Block Processing under MATLAB
• Type "help blkproc" to learn the usage of this function
    • B = BLKPROC(A,[M N],FUN) processes the image A by applying the function FUN to each distinct M-by-N block of A, padding A with zeros if necessary.
• Example
    I = imread('cameraman.tif');   % read a grayscale test image
    fun = @dct2;                   % function applied to each block: 2-D DCT
    J = blkproc(I,[8 8],fun);      % apply fun to each distinct 8-by-8 block
Block-based DCT Example
[Figure: input image I and its block-based DCT J]
Note that white lines are artificially added to the border of each 8-by-8 block to denote that each block is processed independently.
Boundary Padding
[Figure: example in which the boundary values 12, 13, ..., 19 are repeated to fill the padded regions]
When the width or height of an image is not a multiple of 8, the boundary is artificially padded with repeated columns/rows to make it a multiple of 8.
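Padding and block partitioning can be sketched together. The code below is one possible reading of "repeated columns/rows" (the last column and last row are replicated); the helper names `pad_to_multiple` and `split_blocks` and the 10-by-13 test image are made up for illustration.

```python
def pad_to_multiple(img, block=8):
    """Pad a 2-D list by repeating its last column and last row
    until both dimensions are multiples of `block`."""
    h, w = len(img), len(img[0])
    pad_w = (-w) % block          # extra columns needed
    pad_h = (-h) % block          # extra rows needed
    rows = [row + [row[-1]] * pad_w for row in img]
    rows += [list(rows[-1]) for _ in range(pad_h)]
    return rows

def split_blocks(img, block=8):
    """Yield (top, left, tile) for each distinct block-by-block tile."""
    for top in range(0, len(img), block):
        for left in range(0, len(img[0]), block):
            yield top, left, [row[left:left + block]
                              for row in img[top:top + block]]

# Hypothetical 10-by-13 image whose pixel value encodes its position.
img = [[r * 13 + c for c in range(13)] for r in range(10)]
padded = pad_to_multiple(img)
blocks = list(split_blocks(padded))
print(len(padded), len(padded[0]), len(blocks))   # 16 16 4
```

Replicating boundary pixels, rather than padding with zeros as blkproc does, avoids introducing an artificial edge inside the boundary blocks.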
Work with a Toy Example
    183  160   94  153  194  163  132  165
    183  153  116  176  187  166  130  169
    179  168  171  182  179  170  131  167
    177  177  179  177  179  165  131  167
    178  178  179  176  182  164  130  171
    179  180  180  179  183  169  132  169
    179  179  180  182  183  170  129  173
    180  179  181  179  181  170  130  169
Any 8-by-8 block in an image is processed in a similar fashion.
Encoding Stage I: Transform
• Step 1: DC level shifting
Original block:
    183  160   94  153  194  163  132  165
    183  153  116  176  187  166  130  169
    179  168  171  182  179  170  131  167
    177  177  179  177  179  165  131  167
    178  178  179  176  182  164  130  171
    179  180  180  179  183  169  132  169
    179  179  180  182  183  170  129  173
    180  179  181  179  181  170  130  169
Subtracting 128 (the DC level) from every pixel gives:
     55   32  -34   25   66   35    4   37
     55   25  -12   48   59   38    2   41
     51   40   43   54   51   42    3   39
     49   49   51   49   51   37    3   39
     50   50   51   48   54   36    2   43
     51   52   52   51   55   41    4   41
     51   51   52   54   55   42    1   45
     52   51   53   51   53   42    2   41
Encoding Stage I: Transform (Cont'd)
• Step 2: 8-by-8 DCT
Level-shifted block:
     55   32  -34   25   66   35    4   37
     55   25  -12   48   59   38    2   41
     51   40   43   54   51   42    3   39
     49   49   51   49   51   37    3   39
     50   50   51   48   54   36    2   43
     51   52   52   51   55   41    4   41
     51   51   52   54   55   42    1   45
     52   51   53   51   53   42    2   41
After the 8×8 DCT (coefficients rounded to integers):
    313   56  -27   18   78  -60   27  -27
    -38  -27   13   44   32   -1  -24  -10
    -20  -17   10   33   21   -6  -16   -9
    -10   -8    9   17    9  -10  -13    1
     -6    1    6    4   -3   -7   -5    5
      2    3    0   -3   -7   -4    0    3
      4    4   -1   -2   -9    0    2    4
      3    1    0   -4   -2   -1    3    1
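Both transform steps can be reproduced in a few lines. The sketch below is plain Python and assumes the separable form Y = A·X·Aᵀ with the DCT matrix defined earlier; the names `dct_matrix`, `dct2`, and `SLIDE_DCT` are illustrative. `SLIDE_DCT` holds the rounded coefficients of the example, with minus signs written out explicitly.

```python
import math

# Toy 8-by-8 block from the slides.
X = [
    [183, 160,  94, 153, 194, 163, 132, 165],
    [183, 153, 116, 176, 187, 166, 130, 169],
    [179, 168, 171, 182, 179, 170, 131, 167],
    [177, 177, 179, 177, 179, 165, 131, 167],
    [178, 178, 179, 176, 182, 164, 130, 171],
    [179, 180, 180, 179, 183, 169, 132, 169],
    [179, 179, 180, 182, 183, 170, 129, 173],
    [180, 179, 181, 179, 181, 170, 130, 169],
]

# Rounded DCT coefficients of the example, used only for comparison.
SLIDE_DCT = [
    [313,  56, -27,  18,  78, -60,  27, -27],
    [-38, -27,  13,  44,  32,  -1, -24, -10],
    [-20, -17,  10,  33,  21,  -6, -16,  -9],
    [-10,  -8,   9,  17,   9, -10, -13,   1],
    [ -6,   1,   6,   4,  -3,  -7,  -5,   5],
    [  2,   3,   0,  -3,  -7,  -4,   0,   3],
    [  4,   4,  -1,  -2,  -9,   0,   2,   4],
    [  3,   1,   0,  -4,  -2,  -1,   3,   1],
]

def dct_matrix(n=8):
    """Orthonormal DCT matrix: row 0 is constant, row k is a sampled cosine."""
    A = [[1.0 / math.sqrt(n)] * n]
    for k in range(1, n):
        A.append([math.sqrt(2.0 / n) * math.cos((2 * l + 1) * k * math.pi / (2 * n))
                  for l in range(n)])
    return A

def dct2(block):
    """2-D DCT computed as A * block * A^T."""
    n = len(block)
    A = dct_matrix(n)
    # Transform along columns (vertical frequency k), then along rows.
    tmp = [[sum(A[k][i] * block[i][j] for i in range(n)) for j in range(n)]
           for k in range(n)]
    return [[sum(tmp[k][j] * A[l][j] for j in range(n)) for l in range(n)]
            for k in range(n)]

shifted = [[p - 128 for p in row] for row in X]   # Step 1: DC level shifting
Y = dct2(shifted)                                 # Step 2: 8-by-8 DCT

# Largest gap between the exact coefficients and the rounded example values.
max_dev = max(abs(Y[k][l] - SLIDE_DCT[k][l]) for k in range(8) for l in range(8))
print(round(Y[0][0]), round(Y[0][1]), round(Y[1][0]))   # 313 56 -38
```

The DC coefficient equals the sum of the level-shifted pixels divided by 8, which is why the level shift mainly reduces the magnitude of that single coefficient.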
Encoding Stage II: Quantization
Q-table: specifies the quantization step size (see slide #28)
     16   11   10   16   24   40   51   61
     12   12   14   19   26   58   60   55
     14   13   16   24   40   57   69   56
     14   17   22   29   51   87   80   62
     18   22   37   56   68  109  103   77
     24   35   55   64   81  104  113   92
     49   64   78   87  103  121  120  101
     72   92   95   98  112  100  103   99
\[
f: \; s_{ij} = \left[\frac{x_{ij}}{Q_{ij}}\right],
\qquad
f^{-1}: \; \hat{x}_{ij} = s_{ij} \cdot Q_{ij},
\qquad 1 \le i, j \le 8
\]
where [·] denotes rounding to the nearest integer.
Notes:
• The Q-table can be specified by the user
• The Q-table is scaled up/down by a chosen quality factor
• The quantization step size Q_ij depends on the coordinates (i,j) within the 8-by-8 block
• The quantization step size Q_ij tends to increase from top-left to bottom-right
Encoding Stage II: Quantization (Cont'd)
Example
DCT coefficients x_ij:
    313   56  -27   18   78  -60   27  -27
    -38  -27   13   44   32   -1  -24  -10
    -20  -17   10   33   21   -6  -16   -9
    -10   -8    9   17    9  -10  -13    1
     -6    1    6    4   -3   -7   -5    5
      2    3    0   -3   -7   -4    0    3
      4    4   -1   -2   -9    0    2    4
      3    1    0   -4   -2   -1    3    1
Quantized coefficients s_ij = f(x_ij):
     20    5   -3    1    3   -2    1    0
     -3   -2    1    2    1    0    0    0
     -1   -1    1    1    1    0    0    0
     -1    0    0    1    0    0    0    0
      0    0    0    0    0    0    0    0
      0    0    0    0    0    0    0    0
      0    0    0    0    0    0    0    0
      0    0    0    0    0    0    0    0
Encoding Stage III: Entropy Coding
Quantized coefficients:
     20    5   -3    1    3   -2    1    0
     -3   -2    1    2    1    0    0    0
     -1   -1    1    1    1    0    0    0
     -1    0    0    1    0    0    0    0
      0    0    0    0    0    0    0    0
      0    0    0    0    0    0    0    0
      0    0    0    0    0    0    0    0
      0    0    0    0    0    0    0    0
Zigzag scan:
    (20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
[Figure: zigzag scan order over the 8-by-8 block]
EOB (End Of the Block): all following coefficients are zero
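The zigzag order itself is easy to generate. The sketch below sorts the 64 positions by anti-diagonal and alternates the direction of travel on odd and even diagonals; `zigzag_order` and `zigzag_scan` are illustrative names, and "EOB" is represented by a plain string.

```python
# Quantized coefficients of the toy block.
S = [
    [20,  5, -3, 1, 3, -2, 1, 0],
    [-3, -2,  1, 2, 1,  0, 0, 0],
    [-1, -1,  1, 1, 1,  0, 0, 0],
    [-1,  0,  0, 1, 0,  0, 0, 0],
    [ 0,  0,  0, 0, 0,  0, 0, 0],
    [ 0,  0,  0, 0, 0,  0, 0, 0],
    [ 0,  0,  0, 0, 0,  0, 0, 0],
    [ 0,  0,  0, 0, 0,  0, 0, 0],
]

def zigzag_order(n=8):
    """Positions (row, col) in zigzag order.

    Positions are grouped by anti-diagonal d = row + col. Odd diagonals are
    walked top-right to bottom-left (row increasing), even diagonals the
    other way (col increasing).
    """
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def zigzag_scan(block):
    """Scan a block in zigzag order and replace the trailing zeros by EOB."""
    seq = [block[r][c] for r, c in zigzag_order(len(block))]
    while seq and seq[-1] == 0:
        seq.pop()
    return seq + ["EOB"]

print(zigzag_scan(S))
```

For the toy block this yields 28 values followed by EOB, so the 36 trailing zeros cost a single symbol.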
Run-length Coding
    (20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
The first entry (20) is the DC coefficient; the remaining entries are the AC coefficients.
• DC coefficient: DPCM coding → encoded bit stream
• AC coefficients: run-length coding (run, level)
    (5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
    → (0,5),(0,-3),(0,-1),(0,-2),(0,-3),(0,1),(0,1),(0,-1),(0,-1),(2,0),(0,1),
      (0,2),(0,3),(0,-2),(0,1),(0,1),(6,0),(0,1),(0,1),(1,0),(0,1),EOB
    → Huffman coding → encoded bit stream
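The pairing used in this example can be written down as a short routine. The sketch below follows the convention of the example, in which a nonzero value v becomes (0, v) and a run of n zeros becomes (n, 0); baseline JPEG itself pairs each zero run with the nonzero value that follows it, so this is not the standard's exact symbol format. The DPCM helper and its DC values 22 and 19 for neighbouring blocks are hypothetical, and Huffman coding is left out.

```python
def run_length_encode(ac):
    """Encode AC coefficients: (0, v) for nonzero v, (n, 0) for n zeros."""
    pairs, run = [], 0
    for v in ac:
        if v == 0:
            run += 1
        else:
            if run:
                pairs.append((run, 0))
                run = 0
            pairs.append((0, v))
    # Any zeros left at the end are implied by the end-of-block marker.
    return pairs + ["EOB"]

def run_length_decode(pairs):
    """Invert run_length_encode (trailing zeros are not restored)."""
    ac = []
    for p in pairs:
        if p == "EOB":
            break
        run, level = p
        ac.extend([0] * run if level == 0 else [level])
    return ac

def dpcm_encode(dc_values):
    """Code each DC coefficient as the difference from the previous block's DC."""
    out, prev = [], 0
    for dc in dc_values:
        out.append(dc - prev)
        prev = dc
    return out

AC = [5, -3, -1, -2, -3, 1, 1, -1, -1, 0, 0, 1, 2, 3, -2,
      1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1]
pairs = run_length_encode(AC)
print(pairs)
print(dpcm_encode([20, 22, 19]))   # [20, 2, -3]
```

DPCM pays off because neighbouring blocks usually have similar mean intensity, so the DC differences are small.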
JPEG Decoding Stage I: Entropy Decoding
• AC coefficients: encoded bit stream → Huffman decoding →
    (0,5),(0,-3),(0,-1),(0,-2),(0,-3),(0,1),(0,1),(0,-1),(0,-1),(2,0),(0,1),
    (0,2),(0,3),(0,-2),(0,1),(0,1),(6,0),(0,1),(0,1),(1,0),(0,1),EOB
• DC coefficient: encoded bit stream → DPCM decoding
• Recovered sequence (the first entry is the DC coefficient, the rest are the AC coefficients):
    (20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
JPEG Decoding Stage II: Inverse Quantization
    (20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
After the inverse zigzag scan:
     20    5   -3    1    3   -2    1    0
     -3   -2    1    2    1    0    0    0
     -1   -1    1    1    1    0    0    0
     -1    0    0    1    0    0    0    0
      0    0    0    0    0    0    0    0
      0    0    0    0    0    0    0    0
      0    0    0    0    0    0    0    0
      0    0    0    0    0    0    0    0
After inverse quantization f^-1 (each entry multiplied by its Q-table step size):
    320   55  -30   16   72  -80   51    0
    -36  -24   14   38   26    0    0    0
    -14  -13   16   24   40    0    0    0
    -14    0    0   29    0    0    0    0
      0    0    0    0    0    0    0    0
      0    0    0    0    0    0    0    0
      0    0    0    0    0    0    0    0
      0    0    0    0    0    0    0    0
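The two decoder steps shown here, inverse zigzag scan and inverse quantization, can be sketched as follows. The function names are illustrative; the zigzag order is generated the same way as on the encoder side, and the sequence is given without its EOB marker.

```python
# Quantization table of the example.
Q = [
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
]

# Decoded coefficient sequence, without the EOB marker.
SEQ = [20, 5, -3, -1, -2, -3, 1, 1, -1, -1, 0, 0, 1, 2, 3, -2,
       1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1]

def zigzag_order(n=8):
    """Positions (row, col) in zigzag order, diagonal by diagonal."""
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def inverse_zigzag(seq, n=8):
    """Place the sequence back into an n-by-n block; missing entries are zero."""
    block = [[0] * n for _ in range(n)]
    for (r, c), v in zip(zigzag_order(n), seq):
        block[r][c] = v
    return block

def dequantize(levels, q):
    """f^-1: xhat_ij = s_ij * Q_ij."""
    return [[s * step for s, step in zip(srow, qrow)]
            for srow, qrow in zip(levels, q)]

S = inverse_zigzag(SEQ)
X_HAT = dequantize(S, Q)
for row in X_HAT[:4]:
    print(row)
```

Every step up to this point is exactly invertible; the only information lost in the whole chain is the rounding performed by the quantizer.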
JPEG Decoding Stage III: Inverse Transform
Dequantized coefficients:
    320   55  -30   16   72  -80   51    0
    -36  -24   14   38   26    0    0    0
    -14  -13   16   24   40    0    0    0
    -14    0    0   29    0    0    0    0
      0    0    0    0    0    0    0    0
      0    0    0    0    0    0    0    0
      0    0    0    0    0    0    0    0
      0    0    0    0    0    0    0    0
After the 8×8 IDCT:
     67   12   -9   20   69   43   -8   42
     58   25   15   30   65   40   -4   47
     46   41   44   40   59   38    0   49
     41   52   59   43   57   42    3   42
     44   54   58   40   58   47    1   33
     53   50   53   46   63   41    0   45
     55   50   56   53   64   34   -1   57
     52   51   53   51   53   42    2   41
After adding 128 (the DC level):
    195  140  119  148  197  171  120  170
    186  153  143  158  193  168  124  175
    174  169  172  168  187  166  128  177
    169  180  187  171  185  170  131  170
    172  182  186  168  186  175  129  161
    181  178  181  174  191  169  128  173
    183  178  184  181  192  162  127  185
    180  179  181  179  181  170  130  169
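The inverse transform can be sketched as the transpose of the forward one. The code below assumes the orthonormal DCT matrix A from the basis-image definition, computes Aᵀ·X̂·A, adds back the DC level of 128, rounds, and clips to the 8-bit range; the function names are illustrative, and rounding and clipping are assumptions about the final step rather than something stated on the slide. Only a few entries of the top rows are compared against the example here.

```python
import math

# Dequantized coefficients of the toy block.
X_HAT = [
    [320,  55, -30, 16, 72, -80, 51, 0],
    [-36, -24,  14, 38, 26,   0,  0, 0],
    [-14, -13,  16, 24, 40,   0,  0, 0],
    [-14,   0,   0, 29,  0,   0,  0, 0],
    [  0,   0,   0,  0,  0,   0,  0, 0],
    [  0,   0,   0,  0,  0,   0,  0, 0],
    [  0,   0,   0,  0,  0,   0,  0, 0],
    [  0,   0,   0,  0,  0,   0,  0, 0],
]

def dct_matrix(n=8):
    """Orthonormal DCT matrix: row 0 is constant, row k is a sampled cosine."""
    A = [[1.0 / math.sqrt(n)] * n]
    for k in range(1, n):
        A.append([math.sqrt(2.0 / n) * math.cos((2 * l + 1) * k * math.pi / (2 * n))
                  for l in range(n)])
    return A

def idct2(coeffs):
    """Inverse 2-D DCT computed as A^T * coeffs * A."""
    n = len(coeffs)
    A = dct_matrix(n)
    tmp = [[sum(A[k][i] * coeffs[k][l] for k in range(n)) for l in range(n)]
           for i in range(n)]
    return [[sum(tmp[i][l] * A[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def decode_block(coeffs, dc_level=128):
    """IDCT, undo the level shift, round, and clip to [0, 255]."""
    return [[min(255, max(0, round(v + dc_level))) for v in row]
            for row in idct2(coeffs)]

decoded = decode_block(X_HAT)
print(decoded[0][0], decoded[0][4], decoded[0][7])   # 195 197 170
```

Since only the low-frequency coefficients survive quantization, the decoded block is a smoothed version of the original: the sharp dip to 94 in the first row comes back as 119.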
Quantization Noise
Original block X:
    183  160   94  153  194  163  132  165
    183  153  116  176  187  166  130  169
    179  168  171  182  179  170  131  167
    177  177  179  177  179  165  131  167
    178  178  179  176  182  164  130  171
    179  180  180  179  183  169  132  169
    179  179  180  182  183  170  129  173
    180  179  181  179  181  170  130  169
Decoded block \hat{X}:
    195  140  119  148  197  171  120  170
    186  153  143  158  193  168  124  175
    174  169  172  168  187  166  128  177
    169  180  187  171  185  170  131  170
    172  182  186  168  186  175  129  161
    181  178  181  174  191  169  128  173
    183  178  184  181  192  162  127  185
    180  179  181  179  181  170  130  169
Distortion calculation: \mathrm{MSE} = \frac{1}{N}\,\|X - \hat{X}\|^2, where N is the number of pixels
Rate calculation: Rate = length of encoded bit stream / number of pixels (bps)
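Both figures of merit take only a few lines to compute. The sketch below evaluates the MSE and the corresponding PSNR for the first row of the original and decoded blocks; the function names, the use of PSNR with a peak value of 255, and the bit-stream length of 96 bits are assumptions made for illustration, since the slides do not give the length of the encoded stream.

```python
import math

def mse(a, b):
    """Mean squared error between two equally sized 2-D lists."""
    n = sum(len(row) for row in a)
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n

def psnr(mse_value, peak=255):
    """Peak signal-to-noise ratio in dB for a given MSE."""
    return 10 * math.log10(peak ** 2 / mse_value)

def rate(num_bits, num_pixels):
    """Rate = length of encoded bit stream / number of pixels."""
    return num_bits / num_pixels

# First row of the original block and of the decoded block.
orig_row = [[183, 160, 94, 153, 194, 163, 132, 165]]
dec_row = [[195, 140, 119, 148, 197, 171, 120, 170]]

d = mse(orig_row, dec_row)
print(d)                    # 179.5
print(round(psnr(d), 2))
print(rate(96, 64))         # 1.5 (hypothetical 96-bit stream for 64 pixels)
```

Plotting distortion against rate for several quality factors gives the operational rate-distortion curve used to compare coders.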
JPEG Examples
[Figure: the same image coded at quality factor 10 (8k bytes), 50 (21k bytes), and 90 (58k bytes)]
Quality factor scale: 0 = worst quality, highest compression; 100 = best quality, lowest compression
Roadmap to Lossy Image Compression
• Lifting scheme: unifying prediction and transform
• First-generation schemes
    • FBI WSQ standard
    • Probabilistic modeling of wavelet coefficients
• Second-generation schemes
    • Embedded Zerotree Wavelet (EZW)
    • SPIHT coder
    • A unified where-and-what perspective
• JPEG2000
Early Attempts
• Each band is modeled by a Gaussian random variable with zero mean and unknown variance (e.g., WSQ)
• Only a modest gain over JPEG (DCT-based) is achieved
• Question: is this an accurate model, and how can we test it?
FBI Wavelet Scalar Quantization (WSQ)
Each band is approximately modeled by a Gaussian r.v.
\[
x_k \sim N(0, \sigma_k^2), \qquad k: \text{band index}
\]
Given R, minimize
\[
D = \sum_k \frac{1}{m_k} D_k, \qquad m_k = \frac{\text{image size}}{\text{subband size}}
\]
Rate Allocation Problem*
[Figure: wavelet decomposition into the LL, HL, LH, and HH bands]
Given a quota of bits R, how should we allocate them to each band to minimize the overall MSE distortion?
Solution: Lagrangian multiplier technique (turn a constrained optimization problem into an unconstrained one)
\[
\min D = \sum_k \frac{1}{m_k} D_k \quad \text{s.t.} \quad R \le R^{*}
\qquad \longrightarrow \qquad
\min D + \lambda R
\]
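To see where the Lagrangian formulation leads, here is a short worked derivation under two assumptions that are not stated on the slide: each band follows the high-rate Gaussian model D_k(R_k) = σ_k² 2^{-2R_k}, and the total rate in bits per pixel is R = Σ_k R_k / m_k.

```latex
\begin{aligned}
J(\{R_k\}) &= \sum_k \frac{1}{m_k}\,\sigma_k^2\,2^{-2R_k}
              + \lambda \sum_k \frac{R_k}{m_k} \\[4pt]
\frac{\partial J}{\partial R_k}
  &= \frac{1}{m_k}\left(-2\ln 2\;\sigma_k^2\,2^{-2R_k} + \lambda\right) = 0
  \;\Longrightarrow\;
  D_k = \sigma_k^2\,2^{-2R_k} = \frac{\lambda}{2\ln 2} \\[4pt]
R_k &= \frac{1}{2}\log_2\frac{2\ln 2\;\sigma_k^2}{\lambda}
\end{aligned}
```

Under these assumptions every coded band ends up with the same distortion, and bands with larger variance receive more bits. The expression for R_k only applies when it is non-negative; a band whose variance falls below λ/(2 ln 2) would receive no bits.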
Proof by Contradiction (I)
Assumption: our modeling target Ω is the collection of natural images
Suppose each coefficient X in a high band does follow a Gaussian distribution, i.e., X~N(0,σ²). Then flipping the sign of X (i.e., replacing X with –X) should not matter and should generate another element in Ω (i.e., a different but meaningful image).
Let's test it!
Proof by Contradiction (II)
[Figure: image → DWT → sign flip → IWT → resulting image]
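The experiment can be mimicked in one dimension. The sketch below uses a single-level, unnormalized Haar transform (pairwise averages and differences) as a stand-in for the DWT; the transform choice, the function names, and the 8-sample step signal are assumptions for illustration and not the setup used for the figure.

```python
def haar_forward(x):
    """One level of an unnormalized Haar transform: pairwise averages and differences."""
    approx = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward."""
    out = []
    for a, d in zip(approx, detail):
        out += [a + d, a - d]
    return out

# A step edge that falls inside one sample pair.
signal = [10, 10, 10, 10, 10, 50, 50, 50]
approx, detail = haar_forward(signal)

# Flip the sign of every high-band coefficient, then reconstruct.
flipped = haar_inverse(approx, [-d for d in detail])
print(flipped)
```

The reconstruction still has the same coefficient magnitudes, but the clean step 10 → 50 turns into the oscillation 50, 10, 50. The sign of a high-band coefficient therefore carries structural information that a zero-mean symmetric model treats as irrelevant.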
What is wrong with that?
• Think of two coefficients, one in a smooth region and the other around an edge: do they follow the same probability distribution?
• Think of all coefficients around the same edge: do they follow the same probability distribution?
• The model ignores topology and geometry
The Importance of Modeling Singularity Location Uncertainty
• Singularities carry critical visual information: edges, lines, corners …
• The location of singularities is important
    • Recall the locality of wavelets in the spatial-frequency domain
    • Singularities in spatial domain → significant coefficients in wavelet domain
Where-and-What Coding
• Communication context
    [Figure: Alice sends a picture to Bob over a communication channel]
• Where
    • The location of significant coefficients
• What
    • The sign and magnitude of significant coefficients
Roadmap to Lossy Image Compression
• Lifting scheme: unifying prediction and transform
• First-generation schemes
    • FBI WSQ standard
• Second-generation schemes
    • Embedded Zerotree Wavelet (EZW)
    • A unified where-and-what perspective
    • A classification-based interpretation
• Scalable and ROI coding in JPEG2000
1993-2003
• Embedded Zerotree Wavelet (EZW), 1993
• Set Partitioning in Hierarchical Trees (SPIHT), 1995
• Space-Frequency Quantization (SFQ), 1996
• Estimation Quantization (EQ), 1997
• Embedded Block Coding with Optimal Truncation (EBCOT), 2000
• Least-Square Estimation Quantization (LSEQ), 2003
A Simpler Two-Stage Coding
• Position coding stage (where)
    • Generate a binary map indicating the location of significant coefficients (|X|>T)
    • Use context-based adaptive binary arithmetic coding (e.g., JBIG) to code the binary map
• Intensity coding stage (what)
    • Code the sign and magnitude of significant coefficients
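The split into "where" and "what" can be illustrated on a small array. In the sketch below the 4-by-4 coefficient values, the threshold T = 16, and the function names are all hypothetical; the arithmetic coding of the binary map is not implemented, only the two data streams that would be handed to the entropy coder.

```python
def significance_map(coeffs, T):
    """Where: binary map with 1 at positions where |X| > T."""
    return [[1 if abs(c) > T else 0 for c in row] for row in coeffs]

def significant_values(coeffs, T):
    """What: (sign, magnitude) of each significant coefficient, in raster order."""
    return [(1 if c > 0 else -1, abs(c))
            for row in coeffs for c in row if abs(c) > T]

# Hypothetical wavelet coefficients.
coeffs = [
    [ 34,  -3,  1, 0],
    [-20,   2,  0, 1],
    [  4, -18,  0, 0],
    [  0,   1, -1, 0],
]

where = significance_map(coeffs, T=16)
what = significant_values(coeffs, T=16)
print(where)   # [[1, 0, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0]]
print(what)    # [(1, 34), (-1, 20), (-1, 18)]
```

Significant positions tend to cluster along edges, so the map is far from random, and that is what a context-based binary coder exploits.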
Classification-based Modeling
Significant class:
\[ X_1 \sim N(0, \sigma_1^2) \]
Insignificant class:
\[ X_0 \sim N(0, \sigma_0^2) \]
Mixture (a: probability that a coefficient belongs to the significant class):
\[
X = aX_1 + (1-a)X_0 \sim N(0, \sigma^2), \qquad \sigma^2 = a\sigma_1^2 + (1-a)\sigma_0^2
\]
Classification Gain
Without classification:
\[ D(R) = \sigma^2\, 2^{-2R} \]
With classification:
\[ D'(R) = \sigma_0^{2(1-a)}\, \sigma_1^{2a}\, 2^{-2R} \]
Classification gain:
\[
G = 10\log_{10}\frac{D(R)}{D'(R)}\ \text{dB}
  = 10\log_{10}\frac{a\sigma_1^2 + (1-a)\sigma_0^2}{\sigma_0^{2(1-a)}\,\sigma_1^{2a}}\ \text{dB} \;\ge\; 0
\]
Example
\[ \sigma_0^2 = 1, \qquad \sigma_1^2 = 100 \]
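For these two variances the gain formula can be evaluated directly. The sketch below computes G as the ratio of the weighted arithmetic mean to the weighted geometric mean of the class variances, in dB, for a few mixing weights a; the function name and the chosen values of a are illustrative.

```python
import math

def classification_gain_db(a, var0, var1):
    """G = 10 log10( (a*var1 + (1-a)*var0) / (var0^(1-a) * var1^a) ) in dB."""
    arithmetic = a * var1 + (1 - a) * var0       # variance without classification
    geometric = var0 ** (1 - a) * var1 ** a      # effective variance with classification
    return 10 * math.log10(arithmetic / geometric)

for a in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(a, round(classification_gain_db(a, 1.0, 100.0), 2))
```

The gain is zero when only one class is present (a = 0 or a = 1) and positive in between, as the arithmetic-geometric mean inequality requires. At a = 0.5 it is 10·log10(50.5/10), about 7.03 dB.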
Advanced Wavelet Coding
• SPIHT: a simpler yet more efficient implementation of the EZW coder
• SFQ: rate-distortion optimized zerotree coder
• EQ: rate-distortion optimization via backward adaptive classification
• EBCOT (adopted by JPEG2000): a versatile embedded coder
Beyond SPIHT
[Figure: decoded images before and after enhancement]
• JPEG-decoded at a rate of 0.32 bpp (PSNR = 32.07 dB); SFG-enhanced at a rate of 0.32 bpp (PSNR = 33.22 dB)
• SPIHT-decoded at a rate of 0.20 bpp (PSNR = 26.18 dB); SFG-enhanced at a rate of 0.20 bpp (PSNR = 27.33 dB)
Panel labels: Maximum-Likelihood (ML) Decoding; Maximum a Posteriori (MAP) Decoding
Open Problems Related to Image Coding
• Coding of specific classes of images (e.g., satellite, microarray, fingerprint)
• Coding of color-filter-array (CFA) images
• Error resilient coding of images
• Perceptual image coding
• Image coding for pattern recognition
Coding of Specific Class of Images
How to design specific coding algorithms for each class?
CFA Image Coding
[Figure: two processing chains for a Bayer-pattern CFA image]
• Approach I: CFA interpolation (demosaicing) → color image compression
• Approach II: CFA data compression → CFA interpolation (demosaicing)
Which one is better and why?
Error Resilient Image Coding
[Figure: source → source encoder → channel encoder → channel → channel decoder → source decoder → destination; the channel encoder, channel, and channel decoder together form the super-channel]
How can we optimize the end-to-end performance in the presence of channel errors?
Perceptual Image Coding
Characterizing image distortion is difficult!
How do we objectively define image quality when it is inherently subject to individual opinions?
Image Coding for PR
[Figure: image sensor → communication channel → pattern recognition]
How does coding distortion affect the recognition performance?
We need to develop a new image representation that can simultaneously support low-level (e.g., compression, denoising) and high-level (e.g., recognition and retrieval) vision tasks.