Introduction to Digital Image Processing with MATLAB®, Asia Edition
McAndrew‧Wang‧Tseng
Chapter 14: Image Coding and Compression
© 2010 Cengage Learning Engineering. All Rights Reserved.

14.1 Lossless and Lossy Compression
• For reasons of both storage and file transfer, it is important to make image files smaller where possible
• We distinguish between two classes of compression methods:
  Lossless compression, where all the information is retained
  Lossy compression, where some information is lost

14.2 Huffman Coding
• The average number of bits per pixel can be calculated as the expected value (in a probabilistic sense) of the code length, $\bar{L} = \sum_i p_i \ell_i$, where $p_i$ is the probability of gray value $i$ and $\ell_i$ is the length of its code
• To construct a Huffman code:
  1. Determine the probabilities of each gray value in the image
  2. Form a binary tree by adding probabilities two at a time, always taking the two lowest available values
  3. Assign 0 and 1 arbitrarily to each branch of the tree from its apex
  4. Read the codes from the top down
FIGURE 14.1
• As before, the average number of bits per pixel of the resulting code can be evaluated as an expected value
• Huffman codes are uniquely decodable, because no code word is a prefix of another

14.3 Run-length Encoding
• Run-length encoding (RLE) is based on a simple idea: to encode strings of 0s and 1s by the number of repetitions in each string
• For a binary image, encode each line separately, starting with the number of 0s
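The row-by-row scheme just described can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB code of Section 14.3.1; the function names `rle_encode_row` and `rle_decode_row` and the two sample rows are made up for this example.

```python
def rle_encode_row(row):
    """Return run lengths for a binary row, starting with the number of 0s."""
    runs = []
    current = 0   # the first run always counts 0s, so it may have length 0
    count = 0
    for bit in row:
        if bit == current:
            count += 1
        else:
            runs.append(count)   # close the finished run
            current = bit
            count = 1
    runs.append(count)           # close the final run
    return runs


def rle_decode_row(runs):
    """Rebuild the binary row from its run lengths (runs alternate 0s, 1s)."""
    row = []
    value = 0
    for length in runs:
        row.extend([value] * length)
        value = 1 - value
    return row


# Hypothetical two-row binary image, used only for illustration.
image = [
    [0, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 1],
]
codes = [rle_encode_row(r) for r in image]
print(codes)  # [[1, 2, 3], [0, 3, 2, 1]]
```

A row that begins with a 1 gets a leading run length of 0, which is what lets the decoder assume that every code starts with a run of 0s.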
• Alternatively, encode each row as a list of pairs of numbers, the first number in each pair giving the starting position of a run of 1s and the second number its length
• Grayscale images can be encoded by breaking them up into their bit planes
• Each plane can then be encoded separately using our chosen implementation of RLE
• However, small changes of gray value may cause significant changes in bits: going from 127 (01111111) to 128 (10000000) changes all eight bits
• To overcome this difficulty, we may encode the gray values with their binary Gray codes
  A Gray code is an ordering of all binary strings of a given length so that there is only one bit change between one string and the next
Table: 4-bit Gray codes
Figure panels: Binary bit plane; Gray codes

14.3.1 Run-length Encoding in MATLAB
FIGURE 14.3
• We can reduce the size of the output by storing it using the data type uint16

14.4 The JPEG Algorithm
• Lossy compression trades some acceptable data loss for greater rates of compression
• The algorithm developed by the Joint Photographic Experts Group (JPEG) has become one of the most popular lossy methods
• It uses transform coding, where the coding is done not on the pixel values themselves, but on a transform of them
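To see why coding a transform rather than the pixel values can help, the following Python sketch transforms one smooth row of eight pixels and measures how much of its energy ends up in the first two coefficients. It assumes an orthonormal 8-point 1-D DCT as the transform; the sample row and the names `dct_1d` and `idct_1d` are made up for this illustration.

```python
import math

N = 8  # number of samples in the row


def c(w):
    """Scale factor that makes the transform orthonormal."""
    return 1 / math.sqrt(2) if w == 0 else 1.0


def dct_1d(x):
    """Forward 1-D DCT of a length-N sequence."""
    return [math.sqrt(2 / N) * c(u)
            * sum(x[j] * math.cos((2 * j + 1) * u * math.pi / (2 * N))
                  for j in range(N))
            for u in range(N)]


def idct_1d(X):
    """Inverse 1-D DCT; recovers the samples from the coefficients."""
    return [sum(math.sqrt(2 / N) * c(u) * X[u]
                * math.cos((2 * j + 1) * u * math.pi / (2 * N))
                for u in range(N))
            for j in range(N)]


# Hypothetical smooth row of gray values (a gentle ramp).
row = [100, 102, 104, 106, 108, 110, 112, 114]
coeffs = dct_1d(row)

# Fraction of the total energy carried by the first two coefficients.
energy = [v * v for v in coeffs]
share = sum(energy[:2]) / sum(energy)

print([round(v, 2) for v in coeffs])
print(f"share of energy in first two coefficients: {share:.4f}")
```

For a smooth row like this one, nearly all of the energy sits in the first two coefficients, so the remaining ones can be stored coarsely or dropped with little visible effect.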
• The transform used is the discrete cosine transform (DCT)
  In the JPEG algorithm it is applied only to 8 × 8 blocks
  If f(j, k) is one such block, then the forward (2-D) DCT is defined as
  $$F(u,v) = \frac{C(u)\,C(v)}{4} \sum_{j=0}^{7} \sum_{k=0}^{7} f(j,k) \cos\frac{(2j+1)u\pi}{16} \cos\frac{(2k+1)v\pi}{16}$$
  and the corresponding inverse DCT as
  $$f(j,k) = \sum_{u=0}^{7} \sum_{v=0}^{7} \frac{C(u)\,C(v)}{4} F(u,v) \cos\frac{(2j+1)u\pi}{16} \cos\frac{(2k+1)v\pi}{16}$$
  where C(w) is defined as
  $$C(w) = \begin{cases} 1/\sqrt{2} & \text{if } w = 0 \\ 1 & \text{otherwise} \end{cases}$$
• The DCT has a number of properties that make it particularly suitable for compression:
  It is real-valued, so there is no need to manipulate complex numbers
  It has a high information-packing ability, because it packs large amounts of information into a small number of coefficients
  It can be implemented very efficiently in hardware
  As with the FFT, there is a “fast” version of the transform that maximizes efficiency
  The basis values are independent of the data [7]
FIGURE 14.4
• The JPEG baseline compression scheme is applied as follows:
1. The image is divided into 8 × 8 blocks, with each block transformed and compressed separately
2. For a given block, the values are shifted by subtracting 128 from each value
3. The DCT is applied to this shifted block
4. The DCT values are normalized by dividing them, element by element, by a normalization matrix Q and rounding the results to integers. It is this normalization that provides the compression, by making most of the elements of the block zero
5. This matrix is formed into a vector by reading off all nonzero values from the top left in a zigzag fashion
6. The first coefficient of each vector (the DC coefficient) is encoded by listing the difference between each value and the value from the previous block.
This helps keep all values (except for the first) small
7. These values are then compressed using RLE
8. All other values (known as the AC coefficients) are compressed using Huffman coding
• To decompress, the steps above are applied in reverse:
1. The Huffman encoding and RLE can be decoded with no loss of information
2. The vector is read back into an 8 × 8 matrix
3. The matrix is multiplied, element by element, by the normalization matrix
4. The inverse DCT is applied to the result
5. The result is shifted back by 128 to obtain (an approximation of) the original image block
• The normalization matrix
• In the resulting vector, EOB signifies the end of the block. By this stage we have reduced an 8 × 8 block to a vector of length 21, containing only small values
• To uncompress, the steps are reversed
• It can be seen that the reconstructed values are very close to the values in the original block, so the differences between original and reconstructed values are small
• The algorithm works best on regions of low frequency; in such cases the original block can be reconstructed with only very small errors
FIGURE 14.5
FIGURE 14.6
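The block-level steps of this section (shift, DCT, normalize, zigzag, and the reverse) can be sketched in Python as below. Several things here are assumptions rather than content from the slides: the matrix `Q` is the standard JPEG luminance quantization table, used as a stand-in for the normalization matrix; the test block is a made-up smooth ramp; and the function names are hypothetical. Entropy coding (RLE and Huffman) is left out.

```python
import math

# Assumed normalization matrix: the standard JPEG luminance table.
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def c(w):
    return 1 / math.sqrt(2) if w == 0 else 1.0


def basis(a, w):
    """cos((2a + 1) w pi / 16), the 8-point DCT basis value."""
    return math.cos((2 * a + 1) * w * math.pi / 16)


def dct2(f):
    """Forward 2-D DCT of an 8 x 8 block."""
    return [[c(u) * c(v) / 4 * sum(f[j][k] * basis(j, u) * basis(k, v)
                                   for j in range(8) for k in range(8))
             for v in range(8)] for u in range(8)]


def idct2(F):
    """Inverse 2-D DCT of an 8 x 8 block."""
    return [[sum(c(u) * c(v) / 4 * F[u][v] * basis(j, u) * basis(k, v)
                 for u in range(8) for v in range(8))
             for k in range(8)] for j in range(8)]


def compress_block(block):
    """Shift by 128, transform, divide by Q element by element, round."""
    F = dct2([[p - 128 for p in row] for row in block])
    return [[round(F[u][v] / Q[u][v]) for v in range(8)] for u in range(8)]


def decompress_block(q):
    """Multiply by Q element by element, inverse transform, shift back."""
    g = idct2([[q[u][v] * Q[u][v] for v in range(8)] for u in range(8)])
    return [[round(x) + 128 for x in row] for row in g]


def zigzag(q):
    """Read the block in zigzag order, dropping trailing zeros."""
    order = sorted(((u, v) for u in range(8) for v in range(8)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else -p[0]))
    vec = [q[u][v] for u, v in order]
    while vec and vec[-1] == 0:
        vec.pop()
    return vec + ["EOB"]


# Hypothetical smooth (low-frequency) block.
block = [[100 + 2 * j + 3 * k for k in range(8)] for j in range(8)]
q = compress_block(block)
restored = decompress_block(q)
max_error = max(abs(restored[j][k] - block[j][k])
                for j in range(8) for k in range(8))
print(zigzag(q))
print("largest reconstruction error:", max_error)
```

On a smooth block like this one, almost every normalized coefficient rounds to zero, so the zigzag vector is short and the reconstruction error stays within a few gray levels.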
FIGURE 14.7
FIGURE 14.8
• Suppose we take our extra parameter n to be 2. This has the effect of doubling each value in the normalization matrix, and should thus set more of the DCT values to 0
FIGURE 14.9
FIGURE 14.10
FIGURES 14.11 to 14.13
• We have seen how changing the compression rate may affect the output. The JPEG algorithm, however, is particularly designed for storage
• The output vector is further encoded using Huffman coding
• For the DC (first) value, values are grouped into categories: category k contains all elements x whose absolute value satisfies $2^{k-1} \le |x| \le 2^k - 1$
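The category rule above can be checked with a short Python sketch. The function `dc_category` follows the inequality directly, with 0 placed in category 0 as an added assumption. The helper `amplitude_bits` goes one step beyond the slides: it shows the usual JPEG convention of following the category with k extra bits that identify the value within it, and is included only as an illustration. Both names are hypothetical.

```python
def dc_category(x):
    """Category k with 2**(k-1) <= |x| <= 2**k - 1; x = 0 is category 0."""
    # The bit length of |x| is exactly the k that satisfies the inequality.
    return abs(x).bit_length()


def amplitude_bits(x):
    """k-bit string locating x within its category (usual JPEG convention).

    Positive values are written in plain binary; negative values are
    offset by 2**k - 1 so that they also fit in k bits.
    """
    k = dc_category(x)
    if k == 0:
        return ""
    value = x if x > 0 else x + (1 << k) - 1
    return format(value, "0{}b".format(k))


# A few sample DC differences, chosen only for illustration.
for x in (0, 1, -1, 5, -5, 12, -40):
    print(x, dc_category(x), amplitude_bits(x))
```

In this scheme the Huffman code is applied to the category k, which takes only a few distinct values, rather than to x itself.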