Digital Watermarking
7. Image Compression
H. Danyali
hdanyali@ieee.org
Digital Watermarking, Shiraz University of Technology

Raw (uncompressed) Data

Picture size (pixels)   Bits/pixel   Frames/sec   Bit rate           Common application
600 x 800               24           -            11.52 Mbits        Screen images
1200 x 1600             24           -            46.08 Mbits        2M pixels digital photos
2048 x 2560             24           -            125.83 Mbits       High quality images
96 x 128                24           7.5          2.21 Mbits/sec     Videophone
288 x 352               24           30           72.99 Mbits/sec    Video conference
480 x 720               24           30           248.83 Mbits/sec   Standard TV
1080 x 1920             24           30           1.49 Gbits/sec     High-definition TV

1200 x 1600 pixels x 24 bits/pixel = 46,080,000 bits
Downloading time (over a 28.8 kbps link) = 46,080,000 / 28,800 s = 1600 s = 26.67 min.

Image Compression

Data and Information
• Data and information are not synonymous.
• Data are the means by which information is conveyed.
• Various amounts of data may be used to represent the same amount of information.

Data Redundancy

Lossy vs. Lossless Compression

Some General Concepts
• How Can an Image be Compressed, AT ALL!?
– If images were random matrices, it would be better not to try any compression
– Image pixels are highly correlated, which means redundant information
• Information, Uncertainty and Redundancy
– Information is uncertainty (to be) resolved
– Redundancy is repetition of the information we already have
– Compression is about removing redundancy
• Entropy and Entropy Coding
– Entropy is a statistical measure of uncertainty, or a measure of the amount of information to be resolved
– Entropy coding: approaching the entropy (no-redundancy) limit

Some General Concepts
• Bit Rate and Compression Ratio
– Bit rate: bits/pixel, sometimes written as bpp
– Compression ratio (CR): $CR = \dfrac{\text{number of bits to represent the original image}}{\text{number of bits in the compressed bit stream}}$
• Binary, Gray-Scale, Color Image Compression
– Original binary image: 1 bit/pixel
– Original gray-scale image: typically 8 bits/pixel
– Original color image: typically 24 bits/pixel
• Lossless, Nearly Lossless and Lossy Compression
– Lossless: the original image can be exactly reconstructed
– Nearly lossless: the reconstructed image is nearly (visually) lossless
– Lossy: the reconstructed image loses quality (but the CR is higher)

Data Redundancy

Coding Redundancy

Coding Redundancy (cont.)

Coding Redundancy, An Example
Assume $r_k \in [0, 1]$ represents the gray levels of an image

A Graphic Representation of Data Compression
$l_2(r_k)$ increases as $p_r(r_k)$ decreases

Interpixel / Interframe Redundancy

Pixel Correlations
Notes:
1- 3 dominant ranges of gray-level value in both histograms.
2- High correlation between pixels separated by 25 and 90 samples in (f), related to the spacing between the matches in (b).
3- Adjacent pixels of both images are highly correlated: at $\Delta n = 1$ the correlation coefficient is 0.9922 in (a) and 0.9928 in (b).
Interpixel redundancy: also known as spatial redundancy, geometric redundancy, interframe redundancy

Illustration of run-length coding
The binary image can be represented more efficiently by the value and the length of its constant gray-level runs

Compression by Quantization
• The human visual system (HVS) has limitations that compression can exploit; a good example is quantization.
– The quantized image still conveys the information but requires much less data
• IGS: Improved gray-scale quantization recognizes the eye’s inherent sensitivity to edges and breaks them up by adding to each pixel a pseudo-random number (generated from the low-order bits of neighboring pixels) before quantizing.

Fidelity Criteria
• Objective fidelity criteria
– root-mean-square (rms) error
– mean-square signal-to-noise ratio (SNRms or SNR), PSNR
– Pixel error: $e(x, y) = \hat{f}(x, y) - f(x, y)$, where $f$ is the original image and $\hat{f}$ the reconstructed image
– Image error: $\sum_{x} \sum_{y} \left[ \hat{f}(x, y) - f(x, y) \right]$
– Other criteria (HVS-based, SSIM)
• Subjective fidelity criteria

A Subjective Fidelity Criterion

Image Compression Models
Source encoding: to remove input redundancies
Channel encoding: to increase the noise immunity of the source encoder’s output

Source Encoding Model

Channel Encoding
• Channel encoding is important when the channel is noisy or prone to error.
• An example of channel coding: Hamming encoding
- It is based on appending enough bits to the data to ensure that some minimum distance exists between the code words.
- Distance between code words: the minimum number of digits that must change in one word so that the other word results.
- Example: the distance between 010110 and 110011 is 3.

An Example of Hamming Encoding
• The 7-bit Hamming code word (4 data bits + 3 redundancy bits): $h_1 h_2 h_3 h_4 h_5 h_6 h_7$
• Encoder:
– 4-bit binary data: $b_3 b_2 b_1 b_0$
– $h_1 = b_3 \oplus b_2 \oplus b_0$ (even parity bit for the field $b_3 b_2 b_0$)
– $h_2 = b_3 \oplus b_1 \oplus b_0$ (even parity bit for the field $b_3 b_1 b_0$)
– $h_4 = b_2 \oplus b_1 \oplus b_0$ (even parity bit for the field $b_2 b_1 b_0$)
– $h_3 = b_3$, $h_5 = b_2$, $h_6 = b_1$, $h_7 = b_0$
– Distance between code words: 3
– All single-bit errors can be detected and corrected.

Hamming Decoding
• To decode a Hamming encoded result, the channel decoder must check the encoded value for odd parity over the bit fields in which even parity was established.
• A single-bit error is indicated by a nonzero parity word $c_4 c_2 c_1$, where:
– $c_1 = h_1 \oplus h_3 \oplus h_5 \oplus h_7$
– $c_2 = h_2 \oplus h_3 \oplus h_6 \oplus h_7$
– $c_4 = h_4 \oplus h_5 \oplus h_6 \oplus h_7$
• If a nonzero value is found, the decoder simply complements the code word bit position indicated by the parity word. The final decoded binary value is $h_3 h_5 h_6 h_7$.

Error-Free (Lossless) Compression
• Variable length coding
– Huffman coding
– Arithmetic coding
• LZW (Lempel-Ziv-Welch) coding
• Bit-plane coding
• Lossless predictive coding
Lossless methods:
- generally consist of two stages:
1- providing an alternative representation (mapping) of the image to reduce the interpixel redundancy
2- coding the representation to eliminate coding redundancy (symbol coding)
- normally provide a compression ratio (CR) of 2 to 10
- are applicable to both binary and gray-level images

Huffman Coding
• The most popular technique
• Yields the smallest possible number of code symbols per source symbol (the resulting code is optimal).
• First step: source reduction, by ordering the symbols according to their probability and combining the two lowest-probability symbols into a single symbol that replaces them in the next reduction

Huffman Coding (cont.)
• Second step: code assignment procedure
• The code of Fig 8.12 is uniquely decodable by scanning the encoded string from left to right. For example, the encoded string 010100111100 reveals the code words 01010 (a3), 011 (a1), 1 (a2), 1 (a2), 00 (a6), i.e. a3 a1 a2 a2 a6
- H(z) = entropy of the source = 2.14 bits/symbol
- Average length of this code: Lavg = (0.4)(1) + (0.3)(2) + (0.1)(3) + (0.1)(4) + (0.06)(5) + (0.04)(5) = 2.2 bits/symbol
- Code efficiency = H(z) / Lavg = 0.973

Arithmetic Coding
• Unlike in a variable-length code, a one-to-one correspondence between source symbols and code words does not exist. Instead, an entire sequence of source symbols is assigned to a single arithmetic code word.
• The code word itself defines an interval of real numbers between 0 and 1.

Lossy Gray-Scale Image Compression

Quantization
• Quantization: Widely Used in Lossy Compression
– Represent certain image components with fewer bits (compression)
– With unavoidable distortions (lossy)
• Quantizer Design
– Find the best tradeoff between maximal compression and minimal distortion
• Scalar Quantization
[Figure: uniform scalar quantization vs. non-uniform scalar quantization]

Quantization
• Vector Quantization
– Group multiple image components together to form a vector
– Quantize the vector in a higher dimensional space
– More efficient than scalar quantization (in terms of compression)
[Figure: vector quantization in the plane of gray-level image component 1 vs. gray-level image component 2. From Prof. Al Bovik]

Lossy Image Coding (Compression) Methods
• Spatial domain methods
• Transform coding
– Adaptive (adaptive to the local image content)
– Nonadaptive (fixed for all subimages)

Ideas on Lossy Image Compression
• Block-Based Image Compression
– Partition the image into MxM blocks and code each block independently
[Figure: image partitioned into MxM blocks. From Prof. Al Bovik]
• Transform-Domain Compression
– Scalar or vector quantization of transform coefficients (instead of image pixels)

Transform Coding Scheme

Discrete Cosine Transform (DCT)
• 2D-DCT:
$$X(u, v) = \frac{4\,C(u)\,C(v)}{N^2} \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} x(m, n) \cos\left[\frac{(2m+1)u\pi}{2N}\right] \cos\left[\frac{(2n+1)v\pi}{2N}\right]$$
• Inverse 2D-DCT:
$$x(m, n) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} C(u)\,C(v)\,X(u, v) \cos\left[\frac{(2m+1)u\pi}{2N}\right] \cos\left[\frac{(2n+1)v\pi}{2N}\right]$$
where $C(u) = \frac{1}{\sqrt{2}}$ for $u = 0$ and $C(u) = 1$ for $u = 1, \ldots, N-1$.
• DFT vs. DCT
– DFT: implies a periodic extension of the signal; the discontinuities at the boundaries produce high frequencies
– DCT: implies a reflected periodic extension; the extended signal is continuous
[Figure: periodic extension vs. reflected periodic extension. From Prof. Al Bovik]

2D-DCT
[Figure: an 8x8 image block and its 8x8 DCT block; the DC component is at the top-left corner, with frequency increasing from low to high toward the right and toward the bottom]

JPEG Compression
– Partition the image into 8x8 blocks; for each block:
1. subtract 128 from every pixel (e.g. the first row 183 183 179 177 178 179 179 180 becomes 55 55 51 49 50 51 51 52)
2. apply the 2D-DCT
3. apply scalar quantization to the DCT coefficients
4. zig-zag scan the quantized coefficients
[Figure: 8x8 pixel block, level-shifted block, DCT block and quantized block; after quantization most high-frequency coefficients are 0]

JPEG Compression
– Adjust the quantization step to achieve a tradeoff between CR and distortion
[Figure: Original: 100KB; JPEG: 9KB; JPEG: 5KB]
– Artifacts: inside blocks: blurring (why?); across blocks: blocking (why?)

Wavelet and JPEG2000 Compression
• Wavelet Transform
– Energy compaction
– Lower entropy

Wavelet and JPEG2000 Compression
• Bitplane Coding
– Scan bitplanes from MSB to LSB
– Progressive (scalable)
[Figure: coefficients drawn as columns of bits, with the sign bit on top followed by the bitplanes from the MSB downward]
[Figure: JPEG (64:1) vs. JPEG2000 (64:1)]

Wavelet Image Coding
Examples:
• EZW (1993)
• SPIHT (1996)
• JPEG2000

Multiresolution Wavelet Decomposition
Multi-level wavelet decomposition of an image
[Figure: input image → 2D-DWT → level-1 subbands LL1, HL1, LH1, HH1; LL1 → 2D-DWT → level-2 subbands LL2, HL2, LH2, HH2]

Bitplanes and Self-Similarity Across Scales
• Most of the image’s energy is concentrated in the highest level of the subband pyramid.
• There is a spatial self-similarity between subbands in different levels of the pyramid.
• During bitplane coding, the coding of a subband in a higher level of the pyramid starts at a higher bitplane.
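The MSB-to-LSB bitplane scan described above can be illustrated with a minimal Python sketch. It assumes integer coefficients stored in sign-magnitude form; the function names `bitplanes` and `reconstruct` are hypothetical helpers, not part of any codec. It also shows why the scheme is progressive: decoding only the first few planes already gives a coarse approximation.

```python
def bitplanes(coeffs):
    """Split integer coefficients into sign bits and magnitude bitplanes (MSB first)."""
    mags = [abs(c) for c in coeffs]
    num_planes = max(max(mags).bit_length(), 1)
    signs = [1 if c < 0 else 0 for c in coeffs]
    # planes[0] is the most significant bitplane, planes[-1] the least significant
    planes = [[(m >> level) & 1 for m in mags]
              for level in range(num_planes - 1, -1, -1)]
    return signs, planes


def reconstruct(signs, planes, keep=None):
    """Rebuild coefficients from the first `keep` bitplanes (all planes if None)."""
    num_planes = len(planes)
    keep = num_planes if keep is None else keep
    mags = [0] * len(signs)
    for idx in range(keep):
        level = num_planes - 1 - idx
        for i, bit in enumerate(planes[idx]):
            mags[i] |= bit << level
    return [-m if s else m for s, m in zip(signs, mags)]


coeffs = [37, 18, -20, 14, 9, -6]          # sample coefficient values
signs, planes = bitplanes(coeffs)
print(signs)                               # one sign bit per coefficient
for level, plane in zip(range(len(planes) - 1, -1, -1), planes):
    print("bitplane level", level, plane)
print(reconstruct(signs, planes, keep=2))  # coarse version from the top 2 planes
print(reconstruct(signs, planes))          # exact once every plane is received
```

Truncating the stream after any plane still decodes to a valid, coarser set of coefficients, which is the property an embedded coder relies on.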
Bitplane coding

Coefficient         37   18  -20   14    9   -6
Sign bit             0    0    1    0    0    1
Bitplane level 5     1    0    0    0    0    0
Bitplane level 4     0    1    1    0    0    0
Bitplane level 3     0    0    0    1    1    0
Bitplane level 2     1    0    1    1    0    1
Bitplane level 1     0    1    0    1    0    1
Bitplane level 0     1    0    0    0    1    0

Set Partitioning in Hierarchical Trees (SPIHT)
Introduced by A. Said and W. A. Pearlman in 1996
http://www.cipr.rpi.edu/research/SPIHT/
Key Ideas
• Multi-pass zero-tree coding.
• Ordered bit plane transmission.
• Exploitation of self-similarity across wavelet scales.

SPIHT Definitions
• Sets
– O(i,j): set of coordinates of all offspring of node (i,j).
– D(i,j): set of coordinates of all descendants of node (i,j).
– H: set of coordinates of all spatial orientation tree roots.
– L(i,j): D(i,j) - O(i,j).
• Lists
– LIS: List of Insignificant Sets
  Type A: entries represent the sets D(i,j)
  Type B: entries represent the sets L(i,j)
– LIP: List of Insignificant Pixels
– LSP: List of Significant Pixels
• Significance test
$$S_n(T) = \begin{cases} 1, & \max_{(i,j) \in T} \{|c_{i,j}|\} \ge 2^n \\ 0, & \text{otherwise} \end{cases}$$

SPIHT Algorithm
1. Initialization
• Output $n = \lfloor \log_2 ( \max_{(i,j)} \{|c_{i,j}|\} ) \rfloor$
• Set the LSP as an empty list
• Add all $(i, j) \in H$ to the LIP, and only those with descendants to the LIS as type A.
2. Sorting Pass
• Sort each entry in the LIP.
• Sort each entry in the LIS type A.
• Sort each entry in the LIS type B.
3. Refinement Pass
• Output the nth most significant bit of every entry in the LSP except those added in the last sorting pass.
4. Quantization Step Update
• Decrement n by 1 and go to Step 2.

SPIHT Sorting Pass
[Figure: worked sorting-pass example showing the LSP, the LIP (initially (0,0), (0,1), (1,0), (1,1)) and the LIS (initially (0,1)A, (1,0)A, (1,1)A), with the significance bits output for the LIP, the LIS type A entries and the LIS type B entries; a significant type A entry whose set L is not empty is re-added as type B, and the offspring of a significant type B entry, here (0,4), (0,5), (1,4), (1,5), are added as type A while the parent (0,1) is removed]
Output bitstream: Header | Bitplane n | Bitplane n-1 | ... | Bitplane 0

SPIHT Properties
• Provides good image quality (high PSNR).
• Produces a fully embedded bit stream (optimized for progressive image transmission).
• Can code to an exact bit rate or distortion.
• Can be used for lossless compression.
• Fast coding/decoding (nearly symmetric).
• Has wide application and is completely adaptive.

Demo
(Show a demo by using VCDemo software)
Bit rate = 0.31 bpp
SPIHT: PSNR = 35.12 dB
JPEG: PSNR = 31.8 dB (quality factor 15%)

Original Lena and Barbara Images

JPEG (0.25 bpp, CR: 32:1)

SPIHT (0.25 bpp, CR: 32:1)

Comparison (JPEG and SPIHT at CR: 32:1)
Top: JPEG
Bottom: SPIHT

Failure of Logarithmic Wavelet Transform, An Example

Wavelet Packet

Wavelet Packet (Cont.)

Binary Image Compression

Run Length Coding
• Run Length
– The length of a run of consecutive identical symbols
• Run Length Coding Example
– Row m is stored as the first symbol '1' followed by the run lengths 7, 5, 8, 3, 1
• When Does it Work?
– Images containing many runs of 1’s and 0’s
• When Does it Not Work?
– [Figure: a row m in which every run has length 1; what's stored is '1' followed by the run lengths 1, 1, 1, ..., one per pixel, so nothing is saved]

Run Length Coding
CCITT test image No. 1
Size: 1728 x 2376, 1 bit/pixel (bpp)
Original: 513216 bytes
Compressed: 37588 bytes
CR = 13.65

Run Length Coding
• Decoding Example
A binary image is encoded using a run length code row by row, where “0” represents white and “1” represents black. The code is given by
Row 1: “0”, 16
Row 2: “0”, 16
Row 3: “0”, 7, 2, 7
Row 4: “0”, 4, 8, 4
Row 5: “0”, 3, 2, 6, 3, 2
Row 6: “0”, 2, 2, 8, 2, 2
Row 7: “0”, 2, 1, 10, 1, 2
Row 8: “1”, 3, 10, 3
Row 9: “1”, 3, 10, 3
Row 10: “0”, 2, 1, 10, 1, 2
Row 11: “0”, 2, 2, 8, 2, 2
Row 12: “0”, 3, 2, 6, 3, 2
Row 13: “0”, 4, 8, 4
Row 14: “0”, 7, 2, 7
Row 15: “0”, 16
Row 16: “0”, 16
Decode the image
[Figure: the decoded 16x16 binary image, rows and columns numbered 1 to 16]

Chain Coding
• Assume the image contains only single-pixel-wide contours (a contour image, not a region image)
• After the initial point position, code the direction only (3 bits/step)
[Figure: the 8 direction codes 0 to 7 around a pixel on the (m, n) grid, and a contour with its initial point marked. From Prof. Al Bovik]
Code stream (initial point position, then chain code): (3, 2), 1, 0, 1, 1, 1, 1, 3, 3, 3, 4, 4, 5, 4

Chain Coding
• Decoding Example
The chain code for an 8x8 binary image is given by:
(1, 6), 7, 7, 0, 1, 1, 3, 3, 3, 1, 1, 0, 7, 7
Decode the image
[Figure: the 8 direction codes 0 to 7, and the decoded 8x8 image with rows and columns numbered 1 to 8]

Lossless Gray-Scale Image Compression

Variable Word Length Coding
• Intuitive Idea
– Assign short words to gray levels that occur frequently
– Assign long words to gray levels that occur infrequently
• How Much Can Be Compressed?
– Theoretical limit: entropy of the histogram, $H = -\sum_{k=0}^{K-1} p(k) \log_2 p(k)$
– Practical algorithms (approach entropy): Huffman coding, arithmetic coding
– Maximum entropy: uniform distribution
– Minimum entropy: impulse (delta) distribution
– Typical image histograms lie in between
[Figure: histograms over gray levels k = 0 to K-1 for the maximum-entropy, typical and minimum-entropy cases]

Variable Word Length Coding: Example
• A 4x4, 4 bits/pixel original image is given by
 2   8   6   6
 6   8   8   8
 8   8  10  10
 9  10  10  14
Default Code Book
 0: 0000    1: 0001    2: 0010    3: 0011
 4: 0100    5: 0101    6: 0110    7: 0111
 8: 1000    9: 1001   10: 1010   11: 1011
12: 1100   13: 1101   14: 1110   15: 1111
Encoded image:
0010 1000 0110 0110
0110 1000 1000 1000
1000 1000 1010 1010
1001 1010 1010 1110
Bit rate = 4 bits/pixel
Total # of bits used to represent the image: 4x16 = 64 bits

Variable Word Length Coding: Example
• Encode the original image with the Huffman code book given below
Huffman Code Book
 0: 0000000    1: 0000001    2: 0001       3: 0000010
 4: 0000011    5: 0000100    6: 01         7: 0000101
 8: 10         9: 00100     10: 11        11: 0000110
12: 0000111   13: 001010    14: 0011      15: 001011
Encoded image:
0001 10 01 01
01 10 10 10
10 10 11 11
00100 11 11 0011
Total # of bits used to represent the image: 4+2+2+2+2+2+2+2+2+2+2+2+5+2+2+4 = 39 bits
Bit rate = 39/16 = 2.4375 bits/pixel
CR = 64/39 = 1.6410

Predictive Coding
• Intuitive Idea
– Image pixels are highly correlated (dependent)
– Predict the image pixels to be coded from those already coded
• Differential Pulse-Code Modulation (DPCM)
– Simplest form: code the difference between pixels
Original pixels: 82, 83, 86, 88, 56, 55, 56, 60, 58, 55, 50, ……
DPCM: 82, 1, 3, 2, -32, -1, 1, 4, -2, -3, -5, ……
– Key features: invertible, and lower entropy (why?)
[Figure: image histogram over 0 to K-1 (high entropy) vs. DPCM histogram over 1-K to K-1 (low entropy). From Prof. Al Bovik]

Advanced Predictive Coding
• Higher Order (Pattern) Prediction
– Use both 1D and 2D patterns for prediction
[Figure: prediction patterns: 1D causal, 1D non-causal, 2D causal, 2D non-causal]
• Apply Image Transforms before Predictive Coding
– Decouple dependencies between image pixels
• Use Advanced Statistical Image Models
– A better understanding of “the nature” of image structures implies the potential for better prediction
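
Returning to the simplest DPCM form in the Predictive Coding slide, the following minimal Python sketch shows why the scheme is invertible. It assumes a one-pixel previous-sample predictor with no quantization of the differences; the function names `dpcm_encode` and `dpcm_decode` are hypothetical.

```python
def dpcm_encode(pixels):
    """Keep the first pixel, then store each pixel minus its predecessor."""
    if not pixels:
        return []
    return [pixels[0]] + [cur - prev for prev, cur in zip(pixels, pixels[1:])]


def dpcm_decode(residuals):
    """Invert dpcm_encode by accumulating the differences."""
    pixels = []
    for r in residuals:
        pixels.append(r if not pixels else pixels[-1] + r)
    return pixels


pixels = [82, 83, 86, 88, 56, 55, 56, 60, 58, 55, 50]
residuals = dpcm_encode(pixels)
print(residuals)   # [82, 1, 3, 2, -32, -1, 1, 4, -2, -3, -5]
assert dpcm_decode(residuals) == pixels
```

Apart from the first sample and the jump at the edge (88 to 56), the differences stay close to zero. Over a whole image this concentration around zero is what makes the DPCM histogram more peaked than the pixel histogram; an 11-sample run is too short to estimate either entropy reliably.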