EE591U Wavelets and Filter Banks, Copyright Xin Li 2008

Slide 1: Roadmap to Lossy Image Compression
• JPEG standard: DCT-based image coding
• First-generation wavelet coding
• Second-generation schemes
• FBI WSQ standard
• Embedded Zerotree Wavelet (EZW)
• A unified where-and-what perspective
• A classification-based interpretation
• Beyond wavelet coding

Slide 2: A Tour of JPEG Coding Algorithm
[Figure: flow-chart of the DCT-based coding algorithm specified by the Joint Photographic Experts Group (JPEG), with blocks labeled T, Q and C]

Slide 3: Transform Coding of Images
• Why not transform the whole image together?
  - It requires a large memory to store the transform matrix.
  - It is not a good idea for compression, because the statistics vary spatially within an image.
• Idea: partition the image into blocks.
  - Each block is viewed as a smaller image and processed independently.

Slide 4: 8-by-8 DCT Basis Images
\[ A_{8\times 8} = \begin{bmatrix} a_{11} & \cdots & a_{18} \\ \vdots & \ddots & \vdots \\ a_{81} & \cdots & a_{88} \end{bmatrix}, \qquad a_{kl} = \begin{cases} \dfrac{1}{\sqrt{8}}, & k = 1,\ 1 \le l \le 8 \\[4pt] \dfrac{1}{2}\cos\dfrac{(2l-1)(k-1)\pi}{16}, & 2 \le k \le 8,\ 1 \le l \le 8 \end{cases} \]
\[ Y = \sum_{i=1}^{8}\sum_{j=1}^{8} x_{ij} B_{ij}, \qquad B_{ij} = b_i b_j^T, \qquad b_i = [a_{i1}, \ldots, a_{i8}]^T \]

Slide 5: Block Processing under MATLAB
Type "help blkproc" to learn the usage of this function:
B = BLKPROC(A,[M N],FUN) processes the image A by applying the function FUN to each distinct M-by-N block of A, padding A with zeros if necessary.
Example:

    I = imread('cameraman.tif');   % read a grayscale test image
    fun = @dct2;                   % handle to the 2-D DCT
    J = blkproc(I,[8 8],fun);      % DCT of each distinct 8-by-8 block

Slide 6: Block-based DCT Example
[Figure: the image I and its block-based DCT J]
Note that white lines are artificially added to the border of each 8-by-8 block to show that each block is processed independently.

Slide 7: Boundary Padding Example
    12 13 14 15 16 17 18 19
    12 13 14 15 16 17 18 19
    12 13 14 15 16 17 18 19
    12 13 14 15 16 17 18 19
(The figure marks part of this array as "padded regions".)
When the width or height of an image is not a multiple of 8, the boundary is artificially padded with repeated columns or rows to make it a multiple of 8.

Slide 8: Work with a Toy Example
    183 183 179 177 178 179 179 180
    160 153 168 177 178 180 179 179
     94 116 171 179 179 180 180 181
    153 176 182 177 176 179 182 179
    194 187 179 179 182 183 183 181
    163 166 170 165 164 169 170 170
    132 130 131 131 130 132 129 130
    165 169 167 167 171 169 173 169
Any 8-by-8 block in an image is processed in a similar fashion.

Slide 9: Encoding Stage I: Transform
• Step 1: DC level shifting. Subtract 128 (the DC level) from every pixel of the toy block:
     55  55  51  49  50  51  51  52
     32  25  40  49  50  52  51  51
    -34 -12  43  51  51  52  52  53
     25  48  54  49  48  51  54  51
     66  59  51  51  54  55  55  53
     35  38  42  37  36  41  42  42
      4   2   3   3   2   4   1   2
     37  41  39  39  43  41  45  41

Slide 10: Encoding Stage I: Transform (cont'd)
• Step 2: 8-by-8 DCT of the level-shifted block gives the coefficients x_ij:
    313  56 -27  18  78 -60  27  27
    -38 -27  13  44  32   1  24  10
    -20 -17  10  33  21   6  16   9
    -10   8   9  17   9  10  13   1
      6   1   6   4   3   7   5   5
      2   3   0   3   7   4   0   3
      4   4   1   2   9   0   2   4
      3   1   0   4   2   1   3   1
Minus signs are restored only for coefficients that survive quantization, where they can be read back from the zigzag sequence. The remaining entries are shown as magnitudes because their signs were lost.

Slide 11: Encoding Stage II: Quantization
Q-table (specifies the quantization stepsize; see slide #28):
    16  11  10  16  24  40  51  61
    12  12  14  19  26  58  60  55
    14  13  16  24  40  57  69  56
    14  17  22  29  51  87  80  62
    18  22  37  56  68 109 103  77
    24  35  55  64  81 104 113  92
    49  64  78  87 103 121 120 101
    72  92  95  98 112 100 103  99
\[ f:\ s_{ij} = \left[ \frac{x_{ij}}{Q_{ij}} \right], \qquad f^{-1}:\ \hat{x}_{ij} = s_{ij}\, Q_{ij}, \qquad 1 \le i, j \le 8 \]
where [.] denotes rounding to the nearest integer.
Notes:
• The Q-table can be specified by the customer.
• The Q-table is scaled up or down by a chosen quality factor.
• The quantization stepsize Q_ij depends on the coordinates (i,j) within the 8-by-8 block.
• The quantization stepsize Q_ij increases from top-left to bottom-right.

Slide 12: Encoding Stage II: Quantization (cont'd)
Example: applying f to the DCT coefficients x_ij gives the quantized block s_ij:
    20   5  -3   1   3  -2   1   0
    -3  -2   1   2   1   0   0   0
    -1  -1   1   1   1   0   0   0
    -1   0   0   1   0   0   0   0
     0   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0

Slide 13: Encoding Stage III: Entropy Coding
Zigzag scan of s_ij:
(20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
EOB (End Of the Block): all following coefficients are zero.

Slide 14: Run-length Coding
(20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
The first entry (20) is the DC coefficient; the remaining entries are the AC coefficients.
• DC coefficient: DPCM coding -> encoded bit stream
• AC coefficients: run-length coding into (run, level) pairs -> Huffman coding -> encoded bit stream
AC coefficients:
(5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
Run-length pairs, where (0,v) is a nonzero coefficient v and (n,0) is a run of n zeros:
(0,5),(0,-3),(0,-1),(0,-2),(0,-3),(0,1),(0,1),(0,-1),(0,-1),(2,0),(0,1),
(0,2),(0,3),(0,-2),(0,1),(0,1),(6,0),(0,1),(0,1),(1,0),(0,1),EOB
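The three encoding stages above can be traced with a short sketch. It is only an illustration under stated assumptions: it uses plain Python with the standard library, all function names (`dct_matrix`, `dct2`, `quantize`, `zigzag_scan`, `run_length`) are made up for this example, and DPCM and Huffman coding are left out. The run-length step follows the notation of the slides, where a run of n zeros is written (n,0); baseline JPEG itself folds each zero run into the pair of the next nonzero coefficient.

```python
import math

# Standard JPEG luminance Q-table (the table on the quantization slide).
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

# Original 8-by-8 toy block from the slides.
BLOCK = [
    [183, 183, 179, 177, 178, 179, 179, 180],
    [160, 153, 168, 177, 178, 180, 179, 179],
    [94, 116, 171, 179, 179, 180, 180, 181],
    [153, 176, 182, 177, 176, 179, 182, 179],
    [194, 187, 179, 179, 182, 183, 183, 181],
    [163, 166, 170, 165, 164, 169, 170, 170],
    [132, 130, 131, 131, 130, 132, 129, 130],
    [165, 169, 167, 167, 171, 169, 173, 169],
]

# Quantized block s_ij as given on the slides (signs read from the zigzag sequence).
SLIDE_S = [
    [20, 5, -3, 1, 3, -2, 1, 0],
    [-3, -2, 1, 2, 1, 0, 0, 0],
    [-1, -1, 1, 1, 1, 0, 0, 0],
    [-1, 0, 0, 1, 0, 0, 0, 0],
] + [[0] * 8 for _ in range(4)]


def dct_matrix(n=8):
    """Orthonormal DCT matrix A; entry [k][l] is a_kl with 0-based k and l."""
    return [
        [(math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
         * math.cos((2 * l + 1) * k * math.pi / (2 * n)) for l in range(n)]
        for k in range(n)
    ]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def dct2(block):
    """2-D DCT of a square block, computed as A * block * A^T."""
    a = dct_matrix(len(block))
    a_t = [list(col) for col in zip(*a)]
    return matmul(matmul(a, block), a_t)


def quantize(coeffs, q=Q):
    """s_ij = round(x_ij / Q_ij); Python's round() sends exact ties to the even integer."""
    return [[round(c / step) for c, step in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, q)]


def zigzag_scan(s):
    """Read s in zigzag order, drop the trailing zeros and mark the end with 'EOB'."""
    n = len(s)
    seq = []
    for d in range(2 * n - 1):
        diag = [(i, d - i) for i in range(n) if 0 <= d - i < n]
        if d % 2 == 0:
            diag.reverse()  # even anti-diagonals are read from bottom-left to top-right
        seq.extend(s[i][j] for i, j in diag)
    while seq and seq[-1] == 0:
        seq.pop()
    return seq + ["EOB"]


def run_length(ac):
    """Slide-style pairs: a run of n zeros becomes (n, 0), a nonzero v becomes (0, v)."""
    pairs, zeros = [], 0
    for v in ac:
        if v == 0:
            zeros += 1
            continue
        if zeros:
            pairs.append((zeros, 0))
            zeros = 0
        pairs.append((0, v))
    return pairs


shifted = [[p - 128 for p in row] for row in BLOCK]  # Step 1: DC level shifting
coeffs = dct2(shifted)                               # Step 2: 8-by-8 DCT
computed_s = quantize(coeffs)                        # Stage II: quantization
scan = zigzag_scan(SLIDE_S)                          # Stage III: zigzag scan of the slide's s_ij
pairs = run_length(scan[1:-1])                       # AC part only; the DC term goes to DPCM
print(scan)
print(pairs)
```

The zigzag and run-length steps are run on the quantized block given in the slides, so their output can be compared with the sequences above. `computed_s` is recomputed from the pixel values and may differ from the slide in entries that sit close to a rounding boundary.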
Slide 15: JPEG Decoding Stage I: Entropy Decoding
• AC coefficients: encoded bit stream -> Huffman decoding ->
(0,5),(0,-3),(0,-1),(0,-2),(0,-3),(0,1),(0,1),(0,-1),(0,-1),(2,0),(0,1),
(0,2),(0,3),(0,-2),(0,1),(0,1),(6,0),(0,1),(0,1),(1,0),(0,1),EOB
• DC coefficient: encoded bit stream -> DPCM decoding
Together they give the zigzag sequence (DC coefficient first, then the AC coefficients):
(20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)

Slide 16: JPEG Decoding Stage II: Inverse Quantization
Undoing the zigzag scan of
(20,5,-3,-1,-2,-3,1,1,-1,-1,0,0,1,2,3,-2,1,1,0,0,0,0,0,0,1,1,0,1,EOB)
gives the quantized block s_ij:
    20   5  -3   1   3  -2   1   0
    -3  -2   1   2   1   0   0   0
    -1  -1   1   1   1   0   0   0
    -1   0   0   1   0   0   0   0
     0   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0
     0   0   0   0   0   0   0   0
Applying f^{-1} (multiplying each s_ij by Q_ij) gives the dequantized coefficients:
    320  55 -30  16  72 -80  51   0
    -36 -24  14  38  26   0   0   0
    -14 -13  16  24  40   0   0   0
    -14   0   0  29   0   0   0   0
      0   0   0   0   0   0   0   0
      0   0   0   0   0   0   0   0
      0   0   0   0   0   0   0   0
      0   0   0   0   0   0   0   0

Slide 17: JPEG Decoding Stage III: Inverse Transform
The 8-by-8 IDCT of the dequantized coefficients gives:
     67  58  46  41  44  53  55  52
     12  25  41  52  54  50  50  51
     -9  15  44  59  58  53  56  53
     20  30  40  43  40  46  53  51
     69  65  59  57  58  63  64  53
     43  40  38  42  47  41  34  42
     -8  -4   0   3   1   0  -1   2
     42  47  49  42  33  45  57  41
Adding back 128 (the DC level) gives the reconstructed block:
    195 186 174 169 172 181 183 180
    140 153 169 180 182 178 178 179
    119 143 172 187 186 181 184 181
    148 158 168 171 168 174 181 179
    197 193 187 185 186 191 192 181
    171 168 166 170 175 169 162 170
    120 124 128 131 129 128 127 130
    170 175 177 170 161 173 185 169

Slide 18: Quantization Noise
X is the original block from "Work with a Toy Example" and \hat{X} is the reconstructed block from the inverse transform.
• Distortion calculation: \mathrm{MSE} = \|X - \hat{X}\|^2 (normalized by the number of pixels)
• Rate calculation: Rate = length of encoded bit stream / number of pixels (bps)

Slide 19: JPEG Examples
[Figure: the same image coded at three quality factors]
• Quality factor 10: 8k bytes
• Quality factor 50: 21k bytes
• Quality factor 90: 58k bytes
Quality scale: 0 = worst quality, highest compression; 100 = best quality, lowest compression.

Slide 20: Roadmap to Lossy Image Compression
• Lifting scheme: unifying prediction and transform
• First-generation schemes
• Second-generation schemes
• FBI WSQ standard
• Probabilistic modeling of wavelet coefficients
• Embedded Zerotree Wavelet (EZW)
• SPIHT coder
• A unified where-and-what perspective
• JPEG2000

Slide 21: Early Attempts
• Each band is modeled by a Gaussian random variable with zero mean and unknown variance (e.g., WSQ).
• Only a modest gain over (DCT-based) JPEG is achieved.
• Question: is this an accurate model, and how can we test it?

Slide 22: FBI Wavelet Scalar Quantization (WSQ)
Each band is approximately modeled by a Gaussian r.v.
\[ x_k \sim N(0, \sigma_k^2), \qquad k:\ \text{band index} \]
Given R, minimize
\[ D = \sum_k \frac{1}{m_k} D_k, \qquad m_k = \frac{\text{image size}}{\text{subband size}} \]

Slide 23: Rate Allocation Problem*
[Figure: one-level subband decomposition into LL, HL, LH and HH bands]
Given a quota of bits R, how should we allocate them to each band to minimize the overall MSE distortion?
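One way to make this allocation question concrete is the small numerical sketch below, which applies the Lagrangian multiplier technique named next as the solution. It is an illustration under assumptions, not the WSQ procedure: each band is taken to follow the Gaussian rate-distortion form D_k(R_k) = σ_k² 2^{-2R_k} (the same form as D(R) on the Classification Gain slide), the total rate is assumed to be R = Σ_k R_k / m_k, and the subband variances are made up. Setting the derivative of D + λR to zero gives the same distortion level θ in every band that receives bits, so R_k = max(0, ½ log₂(σ_k²/θ)); the hypothetical `allocate` function finds θ by bisection.

```python
import math


def allocate(variances, m, total_rate, iters=60):
    """Per-band rates R_k (bits per coefficient) with sum(R_k / m_k) close to total_rate."""
    def rates(theta):
        # Bands whose variance is below the common distortion level theta get no bits.
        return [max(0.0, 0.5 * math.log2(v / theta)) for v in variances]

    lo, hi = 1e-12, max(variances)  # theta = max variance spends zero bits
    for _ in range(iters):
        theta = math.sqrt(lo * hi)  # bisection on a logarithmic scale
        used = sum(r / mk for r, mk in zip(rates(theta), m))
        if used > total_rate:
            lo = theta              # over budget: raise the distortion level
        else:
            hi = theta              # within budget: try a lower distortion level
    return rates(hi)


# Made-up variances for the LL, HL, LH and HH bands of a one-level decomposition.
variances = [400.0, 40.0, 40.0, 10.0]
m = [4, 4, 4, 4]                    # each band holds a quarter of the coefficients
band_rates = allocate(variances, m, total_rate=1.0)
for name, r in zip(["LL", "HL", "LH", "HH"], band_rates):
    print(f"{name}: {r:.3f} bits per coefficient")
```

With these made-up numbers the low-variance HH band ends up with zero rate, while LL receives the most bits.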
Solution: the Lagrangian multiplier technique, which turns a constrained optimization problem into an unconstrained one:
\[ \min D = \sum_k \frac{1}{m_k} D_k \quad \text{s.t.}\ R = R^* \qquad \Longrightarrow \qquad \min\ D + \lambda R \]

Slide 24: Proof by Contradiction (I)
• Assumption: our modeling target Ω is the collection of natural images.
• Suppose each coefficient X in a high band does follow a Gaussian distribution, i.e., X ~ N(0, σ²). Then flipping the sign of X (i.e., replacing X with -X) should not matter and should generate another element of Ω (i.e., a different but meaningful image).
• Let's test it!

Slide 25: Proof by Contradiction (II)
[Figure: image -> DWT -> sign flip -> IWT -> resulting image]

Slide 26: What is wrong with that?
• Think of two coefficients, one in a smooth region and the other around an edge: do they follow the same probability distribution?
• Think of all coefficients around the same edge: do they follow the same probability distribution?
• The model ignores topology and geometry.

Slide 27: The Importance of Modeling Singularity Location Uncertainty
• Singularities carry critical visual information: edges, lines, corners, ...
• The location of singularities is important.
  - Recall the locality of wavelets in the spatial-frequency domain.
  - Singularities in the spatial domain -> significant coefficients in the wavelet domain.

Slide 28: Where-and-What Coding
[Figure: communication context. Alice sends a picture to Bob over a communication channel.]
• Where: the location of significant coefficients
• What: the sign and magnitude of significant coefficients

Slide 29: Roadmap to Lossy Image Compression
• Lifting scheme: unifying prediction and transform
• First-generation schemes
• Second-generation schemes
• FBI WSQ standard
• Embedded Zerotree Wavelet (EZW)
• A unified where-and-what perspective
• A classification-based interpretation
• Scalable and ROI coding in JPEG2000

Slide 30: 1993-2003
• Embedded Zerotree Wavelet (EZW), 1993
• Set Partitioning in Hierarchical Trees (SPIHT), 1995
• Space-Frequency Quantization (SFQ), 1996
• Estimation Quantization (EQ), 1997
• Embedded Block Coding with Optimal Truncation (EBCOT), 2000
• Least-Square Estimation Quantization (LSEQ), 2003

Slide 31: A Simpler Two-Stage Coding
• Position coding stage (where)
  - Generate a binary map indicating the location of significant coefficients (|X| > T).
  - Use context-based adaptive binary arithmetic coding (e.g., JBIG) to code the binary map.
• Intensity coding stage (what)
  - Code the sign and magnitude of significant coefficients.

Slide 32: Classification-based Modeling
• Significant class: X_1 \sim N(0, \sigma_1^2)
• Insignificant class: X_0 \sim N(0, \sigma_0^2)
• Mixture:
\[ X = a X_1 + (1-a) X_0 \sim N(0, \sigma^2), \qquad \sigma^2 = a \sigma_1^2 + (1-a) \sigma_0^2 \]

Slide 33: Classification Gain
• Without classification:
\[ D(R) = \sigma^2\, 2^{-2R} \]
• With classification:
\[ D'(R) = \sigma_0^{2(1-a)}\, \sigma_1^{2a}\, 2^{-2R} \]
• Classification gain:
\[ G = 10 \log_{10} \frac{D(R)}{D'(R)}\ \text{dB} = 10 \log_{10} \frac{a \sigma_1^2 + (1-a) \sigma_0^2}{\sigma_0^{2(1-a)}\, \sigma_1^{2a}}\ \text{dB} \ge 0 \]

Slide 34: Example
\[ \sigma_0^2 = 1, \qquad \sigma_1^2 = 100 \]
[Figure: plot for this example]

Slide 35: Advanced Wavelet Coding
• SPIHT: a simpler yet more efficient implementation of the EZW coder
• SFQ: rate-distortion optimized zerotree coder
• EQ: rate-distortion optimization via backward adaptive classification
• EBCOT (adopted by JPEG2000): a versatile embedded coder

Slide 36: Beyond SPIHT
[Figure: decoded images compared with their SFG-enhanced versions]
• JPEG-decoded at a rate of 0.32 bpp (PSNR = 32.07 dB); SFG-enhanced at a rate of 0.32 bpp (PSNR = 33.22 dB)
• SPIHT-decoded at a rate of 0.20 bpp (PSNR = 26.18 dB); SFG-enhanced at a rate of 0.20 bpp (PSNR = 27.33 dB)
• Maximum-Likelihood (ML) decoding versus Maximum a Posteriori (MAP) decoding

Slide 37: Open Problems Related to Image Coding
• Coding of specific classes of images (e.g., satellite, microarray, fingerprint)
• Coding of color-filter-array (CFA) images
• Error resilient coding of images
• Perceptual image coding
• Image coding for pattern recognition

Slide 38: Coding of Specific Class of Images
How should we design specific coding algorithms for each class?

Slide 39: CFA Image Coding
• Approach I: Bayer pattern -> CFA interpolation (demosaicing) -> color image compression
• Approach II: Bayer pattern -> CFA data compression -> CFA interpolation (demosaicing)
Which one is better and why?

Slide 40: Error Resilient Image Coding
[Figure: source -> source encoder -> channel encoder -> channel -> channel decoder -> source decoder -> destination; the channel encoder, channel and channel decoder together form the super-channel]
How can we optimize the end-to-end performance in the presence of channel errors?

Slide 41: Perceptual Image Coding
Characterizing image distortion is difficult! How do we objectively define image quality when it is subject to individual opinion?

Slide 42: Image Coding for PR
[Figure: image sensor -> communication channel -> pattern recognition]
• How does coding distortion affect the recognition performance?
• We need to develop a new image representation that can simultaneously support low-level (e.g., compression, denoising) and high-level (e.g., recognition and retrieval) vision tasks.
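As a numerical companion to the Classification Gain and Example slides earlier in the deck, the following sketch evaluates the gain G for σ₀² = 1 and σ₁² = 100 at a few values of the mixing weight a. The function name `classification_gain_db` and the chosen values of a are assumptions made for illustration; the formula is the ratio of the mixture variance to the weighted geometric mean of the two class variances, as in the gain expression of the deck.

```python
import math


def classification_gain_db(a, var0, var1):
    """G = 10 log10( (a*var1 + (1-a)*var0) / (var0**(1-a) * var1**a) ) in dB."""
    arithmetic = a * var1 + (1 - a) * var0    # variance of the unclassified mixture
    geometric = var0 ** (1 - a) * var1 ** a   # weighted geometric mean of the class variances
    return 10 * math.log10(arithmetic / geometric)


for a in (0.0, 0.1, 0.25, 0.5, 0.75, 1.0):
    gain = classification_gain_db(a, var0=1.0, var1=100.0)
    print(f"a = {a:4.2f}   G = {gain:5.2f} dB")
```

The gain vanishes at a = 0 and a = 1, where only one class is present, and is positive in between. At a = 0.5 the ratio is 50.5/10, which is about 7 dB.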