Image compression

Date: 08/06/04

o images take a lot of storage space
 a 1024 x 1024 x 32-bit image requires 4 MB
 suppose you have video that is 640 x 480 x 24 bits x 30 frames per second; 1 minute of video would require 1.54 GB
o that many bytes take a long time to transfer over slow connections - suppose you have a 56,000 bps connection
 4 MB will take almost 10 minutes
 1.54 GB will take almost 66 hours (checked in the sketch below)
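These transfer times follow directly from the bit counts; a quick arithmetic check (plain Java, using the sizes and line speed from the bullets above):

    public class StorageMath {
        public static void main(String[] args) {
            // still image: 1024 x 1024 pixels at 32 bits (4 bytes) per pixel
            long imageBytes = 1024L * 1024L * 4L;            // 4,194,304 bytes = 4 MB
            // video: 640 x 480 pixels, 24 bits (3 bytes) per pixel, 30 fps, 60 seconds
            long videoBytes = 640L * 480L * 3L * 30L * 60L;  // 1,658,880,000 bytes ~ 1.54 GB
            double bitsPerSecond = 56000.0;
            System.out.printf("image: %.1f minutes%n", imageBytes * 8 / bitsPerSecond / 60);  // ~10.0
            System.out.printf("video: %.1f hours%n", videoBytes * 8 / bitsPerSecond / 3600);  // ~65.8
        }
    }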
o storage problems, plus the desire to exchange images over the Internet, have led to a large interest in image compression algorithms
o the same information can be represented many ways
o a representation contains redundancy if more data than necessary is used to represent the information
o compression algorithms remove redundancy
 lossless algorithms remove only redundancy present in the data
 lossy algorithms create redundancy (by discarding some information) and then remove
it
o types of redundancy
 coding redundancy
 our quantized data is represented using codewords
 if the size of the codeword is larger than is necessary to represent all
quantization levels, then we have coding redundancy
 interpixel redundancy
 the intensity at a pixel may correlate strongly with the intensity value of its
neighbors
 may remove redundancy by representing changes in intensity rather than
absolute intensity values
 psychovisual redundancy
 may have information present that is of lesser importance to human perception
 e.g. high spatial frequencies or many quantization levels
 interframe redundancy
 temporal equivalent of interpixel redundancy
o measuring compression algorithm performance
 compression ratio
 C = n / nc
 n is the size of the uncompressed data, nc is the size of the compressed data
 larger values of C indicate better compression
 symmetry
 how does the time required for compression compare with the time required for
decompression
 requirements for an algorithm may depend on the application
 e.g. images on a CD-ROM or served over the Web versus images being stored
from a security camera
 fidelity criteria
 when a lossy algorithm is used, how does the decompressed image compare to
the original image
 one objective measure is the root-mean-square (RMS) error (see the sketch after this list)
 smaller values of RMS error indicate the decompressed image is closer to the original
 small RMS error does not always correlate well with subjective perception
 subjective fidelity measures are also important
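A sketch of the RMS error computation (assuming 8-bit grayscale images stored as int arrays; the method name is illustrative):

    /** Root-mean-square error between an original and a decompressed image.
     *  RMS = sqrt( (1/N) * sum of (decompressed - original)^2 over all N pixels )
     *  Smaller values mean the decompressed image is closer to the original. */
    public static double rmsError(int[] original, int[] decompressed) {
        double sumSquaredError = 0.0;
        for (int i = 0; i < original.length; i++) {
            double diff = decompressed[i] - original[i];
            sumSquaredError += diff * diff;
        }
        return Math.sqrt(sumSquaredError / original.length);
    }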
Lossless compression algorithms
 delta compression
 takes advantage of interpixel redundancy in a scan line
 assumes most intensity changes happen gradually
 use codewords to represent the change in pixel intensity along a scan line
 e.g. use 4-bit codewords to represent intensity changes of -7 to +7, plus a flag marking an escape to an 8-bit codeword
 DeltaEncoder.java
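A minimal sketch of the scheme (not the DeltaEncoder.java referenced above; bit packing is omitted and the flag value is an assumption):

    /** Delta-encode one scan line: the first pixel is stored absolutely, then
     *  signed differences. Differences in [-7, +7] fit a 4-bit codeword; the
     *  flag value -8 marks an escape to a full 8-bit absolute value. */
    public static int[] deltaEncode(int[] scanLine) {
        int[] out = new int[scanLine.length * 2];  // worst case: every delta escapes
        int n = 0;
        out[n++] = scanLine[0];                    // first pixel stored as-is
        for (int i = 1; i < scanLine.length; i++) {
            int delta = scanLine[i] - scanLine[i - 1];
            if (delta >= -7 && delta <= 7) {
                out[n++] = delta;                  // fits a 4-bit codeword
            } else {
                out[n++] = -8;                     // flag: next value is 8 bits
                out[n++] = scanLine[i];
            }
        }
        return java.util.Arrays.copyOf(out, n);
    }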
 run length encoding
 also takes advantage of interpixel redundancy
 especially suited for synthetic images containing large homogeneous regions
 encode runs (a sequence of pixels of equal intensity) with a (length, intensity)
pair
 runs can be constrained to a scan line, or allowed to extend to multiple scan
lines
 application - compression of binary images to be faxed
 if applied to unsuitable images (with few long runs), it can actually increase the size of the data
 RunLengthEncoder.java
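A minimal sketch of encoding runs within one scan line (not the RunLengthEncoder.java referenced above):

    import java.util.ArrayList;
    import java.util.List;

    public class RunLength {
        /** Encode a scan line as (length, intensity) pairs. */
        public static List<int[]> encode(int[] scanLine) {
            List<int[]> runs = new ArrayList<>();
            int start = 0;  // index where the current run began
            for (int i = 1; i <= scanLine.length; i++) {
                // close the run at a change of intensity or at the end of the line
                if (i == scanLine.length || scanLine[i] != scanLine[start]) {
                    runs.add(new int[] { i - start, scanLine[start] });
                    start = i;
                }
            }
            return runs;
        }
    }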
 statistical coding
 want to remove coding redundancy
 to measure the effectiveness of a coding scheme, we compute the entropy
 entropy - in information theory, it is the measure of the information
content of a message
 entropy gives the average bits per pixel required to encode an image, H
 based on the probability of occurrence of each gray level, pi:
H = -Σ pi log2 pi (summed over all gray levels; sketched in code below)
 probabilities are computed by normalizing the histogram of the image
 we can then state the amount of redundancy - r = b - H, where b is the
number of bits used per codeword
 we can also state the maximum compression ratio achievable by removing coding redundancy - Cmax = b / H
 entropy encoding is a method for efficiently representing the information content of a message
 we are interested in encoding the information content only, gaining efficiency by discarding the redundant part of the representation
 we will use variable-length codewords
 codewords that occur infrequently (low probability) should use more
bits
 codewords that occur frequently (high probability) should use fewer
bits
 no codeword of length n should be identical to the first n bits of any
other codeword
 Entropy.java
 measuring entropy encoding performance on an image
 compute the average bit length of its codewords
 lower limit is H, the entropy
 upper limit is b, number of bits used in fixed-length codewords
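A sketch of the entropy computation defined above (not the Entropy.java referenced earlier; assumes a 256-bin grayscale histogram):

    /** Entropy H = -sum(pi * log2(pi)) over gray levels with nonzero counts;
     *  H is the lower bound on average bits per pixel for any entropy code. */
    public static double entropy(int[] histogram) {
        long totalPixels = 0;
        for (int count : histogram) totalPixels += count;
        double h = 0.0;
        for (int count : histogram) {
            if (count == 0) continue;                 // p log p -> 0 as p -> 0
            double p = (double) count / totalPixels;  // normalize the histogram
            h -= p * (Math.log(p) / Math.log(2.0));   // log base 2
        }
        return h;
    }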
 Huffman coding
 one algorithm to perform entropy encoding
 used as a step or phase in many other compression algorithms
 rank gray levels in decreasing order of probability
 build a Huffman tree
 pair the least frequent gray levels
 assign 0 to one and 1 to the other
 replace the pair with the sum of their probabilities
 continue until no more pairs
 read the codeword for each gray level by starting at the root and
following the nodes out to the leaf
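A minimal sketch of the tree construction with a priority queue (class and method names are illustrative):

    import java.util.HashMap;
    import java.util.Map;
    import java.util.PriorityQueue;

    public class Huffman {
        static class Node {
            final int grayLevel;       // meaningful only at leaves
            final double probability;
            final Node left, right;
            Node(int g, double p, Node l, Node r) {
                grayLevel = g; probability = p; left = l; right = r;
            }
            boolean isLeaf() { return left == null; }
        }

        /** Build codewords for gray levels 0..n-1 from their probabilities. */
        public static Map<Integer, String> buildCodes(double[] probabilities) {
            PriorityQueue<Node> queue =
                new PriorityQueue<>((a, b) -> Double.compare(a.probability, b.probability));
            for (int g = 0; g < probabilities.length; g++)
                queue.add(new Node(g, probabilities[g], null, null));
            while (queue.size() > 1) {
                Node a = queue.poll(), b = queue.poll();  // pair the two least probable
                queue.add(new Node(-1, a.probability + b.probability, a, b));
            }
            Map<Integer, String> codes = new HashMap<>();
            assign(queue.poll(), "", codes);              // walk from root out to the leaves
            return codes;
        }

        private static void assign(Node node, String code, Map<Integer, String> codes) {
            if (node.isLeaf()) { codes.put(node.grayLevel, code); return; }
            assign(node.left, code + "0", codes);   // assign 0 to one branch
            assign(node.right, code + "1", codes);  // and 1 to the other
        }
    }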


In-class exercise on Huffman coding
o Given a 3-bit grayscale image with the following gray-level probabilities, give the Huffman tree
for the encoding and list the codewords for all gray levels.
 0 - 0.12
 1 - 0.26
 2 - 0.30
 3 - 0.15
 4 - 0.10
 5 - 0.03
 6 - 0.02
 7 - 0.02
Dictionary-based coding
o a type of universal coding (versus entropy coding)
 does not require a priori knowledge of data source characteristics
 thus can compress data in one pass
o based upon the realization that any stream of data with some measure of redundancy consists of
repetitions
o encode variable-length strings (or sequences of bits) with a single codeword
o a universal code will realize optimum compression in the limit for large datasets (doesn't work
as well on small datasets)
o Lempel and Ziv first published their work in 1977
 LZ77
 zip, gzip
 PNG
 implemented in java.util.zip
 LZ78
 GIF is based on a (patented) variant of this - LZW (Lempel-Ziv-Welch)
o a simple example of LZ77
 we can achieve compression of an arbitrary sequence of bits by always coding a series
of 0's and 1's as some previous such string (the prefix string) plus one new bit
 the new string formed by adding the new bit to the previously used prefix string
becomes a potential prefix string for future strings
 we wish to code the string: 101011011010101011
 the first bit, a 1, has no predecessors, so it has a "null" prefix string and the one extra bit is itself: 1,01011011010101011
 the same goes for the 0 that follows since it can't be expressed in terms of the
only existing prefix: 1,0,1011011010101011
 the following 10 is obviously a combination of the 1 prefix and a 0:
1,0,10,11011010101011
 we eventually parse the whole string as follows: 1,0,10,11,01,101,010,1011
 since we found 8 phrases, we will use a three bit code to label the null phrase
and the first seven phrases for a total of 8 numbered phrases
 write the string in terms of the number of the prefix phrase plus the new bit
needed to create the new phrase
 the eight phrases can be described by: (000,1),(000,0),(001,0),(001,1),(010,1),(011,1),(101,0),(110,1)
 the coded version of the above string is:
00010000001000110101011110101101
 the larger the initial string, the more savings we get as we move along, because
prefixes that are quite large become representable as small numerical indices
 Ziv proved that for long documents the compression of the file approaches the
optimum obtainable as determined by the information content of the document
 another way to think of it (sketched in code after this list)
 the algorithm searches the window for the longest match with the beginning of the look-ahead buffer and outputs a pointer to that match

 since it is possible that not even a one-character match can be found, the output cannot contain just pointers
 LZ77 solves this problem this way:
 after each pointer it outputs the first character in the look-ahead buffer after the match
 if there is no match, it outputs a null-pointer and the character at the coding position
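A minimal sketch of the window search and the null-pointer case described above (fixed-size window; names are illustrative and bit-level output is omitted):

    import java.util.ArrayList;
    import java.util.List;

    public class Lz77 {
        /** Output token: distance back to the match, match length, next literal byte.
         *  distance == 0 and length == 0 is the null-pointer-plus-literal case. */
        public static class Token {
            public final int distance, length;
            public final byte next;
            public Token(int d, int l, byte n) { distance = d; length = l; next = n; }
        }

        public static List<Token> encode(byte[] data, int windowSize) {
            List<Token> out = new ArrayList<>();
            int pos = 0;
            while (pos < data.length) {
                int bestLen = 0, bestDist = 0;
                // search the window for the longest match with the look-ahead buffer
                for (int start = Math.max(0, pos - windowSize); start < pos; start++) {
                    int len = 0;
                    while (pos + len + 1 < data.length          // reserve one literal byte
                            && data[start + len] == data[pos + len]) len++;
                    if (len > bestLen) { bestLen = len; bestDist = pos - start; }
                }
                // emit the pointer (possibly null) plus the first unmatched character
                out.add(new Token(bestDist, bestLen, data[pos + bestLen]));
                pos += bestLen + 1;
            }
            return out;
        }
    }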
o deflation in gzip and PNG uses a variant of LZ77
 sliding window compression technique
 window consists of two parts
 previously seen data (dictionary)
 look-ahead buffer
 a "string" is an arbitrary sequence of bytes (they don't have to be printable characters)
 try to match strings of bytes in the look-ahead buffer with strings of bytes in the
dictionary
 the second occurrence of a string is replaced with a pair containing
 distance to the match (maximum is 32K bytes)
 length of the match
 if no match, string is not replaced
 unreplaced bytes and match lengths are Huffman coded using one tree
 match distances are Huffman coded using another tree
 coding is done in blocks - different blocks may use different trees
 trees are stored with the blocks
 hash tables are used for string matching
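Since deflate is implemented in java.util.zip, a minimal round trip looks roughly like this (buffer sizing is simplified for the small example input):

    import java.util.zip.Deflater;
    import java.util.zip.Inflater;

    public class DeflateRoundTrip {
        public static void main(String[] args) throws Exception {
            byte[] original = "these bytes repeat, these bytes repeat, these bytes repeat"
                    .getBytes("US-ASCII");

            // compress: deflate maintains the sliding window internally
            Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
            deflater.setInput(original);
            deflater.finish();
            byte[] buffer = new byte[1024];
            int compressedLength = deflater.deflate(buffer);
            deflater.end();

            // decompress and verify the round trip
            Inflater inflater = new Inflater();
            inflater.setInput(buffer, 0, compressedLength);
            byte[] restored = new byte[original.length];
            inflater.inflate(restored);
            inflater.end();

            System.out.println(new String(restored, "US-ASCII"));
            System.out.println("C = " + (double) original.length / compressedLength);
        }
    }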
o compression in TIFF and GIF uses LZW
 based on a code table that maps frequently encountered bit strings to codewords
 output bit string and code table are produced simultaneously (one pass)
 code table has 4096 entries - codewords can be up to 12 bits
 entries 0 - 255: contain 0 - 255
 entry 256: contains clear code
 entry 257: contains end of information code
 entries 258 - 4095: contain frequently encountered bit strings - built on the fly
 codewords from 0 - 511 are 9 bits long
 codewords from 512 - 1023 are 10 bits long
 codewords from 1024 - 2047 are 11 bits long
 codewords from 2048 to 4095 are 12 bits long
 encoding algorithm
 initialize first 258 entries of code table
 set prefix string to null
 output clear code
 iterate
 read new image byte
 create current string by concatenating prefix string with new byte
 if current string in code table
 update prefix and repeat
 else
 output codeword of prefix
 put current string in code table
 set prefix to new byte and repeat
 a tree structure may be imposed on the table to facilitate searching
 end of information is output at end of image (or image strip)
 when code table is full, output clear code and reinitialize table
 decoding
 code table is built during decompression
 must be careful to read the correct number of bits
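A minimal sketch of the LZW encoding loop for byte data (a string-keyed table for clarity; the clear/end codes and variable-width codeword packing described above are omitted):

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class LzwEncoder {
        public static List<Integer> encode(byte[] image) {
            Map<String, Integer> table = new HashMap<>();
            for (int i = 0; i < 256; i++)       // entries 0-255 contain 0-255
                table.put("" + (char) i, i);
            int nextCode = 258;                 // 256 = clear, 257 = end of information
            List<Integer> out = new ArrayList<>();
            String prefix = "";                 // start with the null prefix string
            for (byte b : image) {
                String current = prefix + (char) (b & 0xFF);
                if (table.containsKey(current)) {
                    prefix = current;           // keep growing the match
                } else {
                    out.add(table.get(prefix)); // output codeword of the prefix
                    if (nextCode < 4096)        // table holds at most 4096 entries
                        table.put(current, nextCode++);
                    prefix = "" + (char) (b & 0xFF);
                }
            }
            if (!prefix.isEmpty()) out.add(table.get(prefix));
            return out;
        }
    }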



 DeflateTest.java
 HuffmanTest.java
Comparison of compression ratios for lossless techniques

Lossy compression
o take advantage of psychovisual redundancy to create more coding redundancy
 example - convert 100, 101, 100, 99, 101 to 100, 100, 100, 100, 100
o JPEG- Joint Photographics Experts Group - compression algorithm based on transform coding
http://www.jpeg.org/public/jpeghomepage.htm
o transform coding
 map the image into a set of transform coefficients using a reversible, linear transform
 a significant number of the coefficients will have small magnitudes
 these can be coarsely quantized or discarded
 compression is achieved during the quantization step, not during the transform
o block coding
 subdivide the image into small, non-overlapping blocks
 apply the transform to each block separately
 allows the compression to adapt to local image characteristics
 reduces the computational requirements
o choice of transform
 there are many that would work, including the discrete Fourier transform (DFT)
 discrete cosine transform (DCT) has several advantages
 real coefficients, rather than complex
 packs more information into fewer coefficients than DFT
 periodicity is 2N rather than N, reducing artifacts from discontinuities at block
edges
 discrete wavelet transform (DWT) is even better - JPEG 2000 will use this - http://www.jpeg.org/JPEG2000.htm
o choice of block size
 affects root-mean-square (RMS) error due to transform coding
 affects computational complexity
 want an n x n subimage, where n is an integer power of 2
 as block size increases
 compression level and computational requirements increase
 RMS error decreases, then levels off
 adaptability decreases
o quantization
 DCT coefficients are real numbers, which require more bits to represent
 use an 8 x 8 quantization table that gives greater precision to lower frequency
components
 higher frequency components may be discarded because their coefficients get set to
zero
 table may be scaled, giving us a compression versus quality parameter
 low quality parameter will zero more coefficients and give a high compression
ratio
 high quality parameter will zero fewer coefficients and give a lower
compression ratio
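A sketch of the quantization step for one 8 x 8 block (the quantization table is passed in, and the quality control is reduced to a single scale factor, which is a simplification):

    public class JpegQuantize {
        /** Quantize an 8 x 8 block of DCT coefficients: divide each coefficient
         *  by its (scaled) table entry and round. Small high-frequency
         *  coefficients round to zero and are effectively discarded. */
        public static int[][] quantize(double[][] dct, int[][] table, double scale) {
            int[][] quantized = new int[8][8];
            for (int u = 0; u < 8; u++)
                for (int v = 0; v < 8; v++) {
                    // a larger scale (lower quality) zeroes more coefficients
                    double step = Math.max(1.0, table[u][v] * scale);
                    quantized[u][v] = (int) Math.round(dct[u][v] / step);
                }
            return quantized;
        }

        /** Approximate inverse used during decompression. */
        public static double[][] dequantize(int[][] q, int[][] table, double scale) {
            double[][] dct = new double[8][8];
            for (int u = 0; u < 8; u++)
                for (int v = 0; v < 8; v++)
                    dct[u][v] = q[u][v] * Math.max(1.0, table[u][v] * scale);
            return dct;
        }
    }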
o JPEG standard
 actually defines three different coding systems
 lossy baseline coding system
 extended coding system
 lossless independent coding system
 to be JPEG compatible only the baseline coding system must be implemented

o algorithm to compress 8-bit grayscale
 subdivide into 8 x 8 blocks
 for each block
 shift all pixels by -128
 perform DCT
 quantize DCT coefficients
 reorder coefficients using a zigzag pattern
 delta encode the zero frequency coefficient (DC)
 run length encode zero-valued (AC) coefficients
 Huffman code AC coefficients
 JPEG classes provided in package com.sun.image.codec.jpeg
 factory class JPEGCodec creates instances of encoders and decoders - JPEGEncoder.java
 access to control parameters is provided - JPEGQuantTable.java
 Comparison of quality versus compression ratio using JPEGTool application
 JPEG compression is not as effective as the lossless methods on synthetic images
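As an alternative to the com.sun classes (which are not part of the core API), the standard javax.imageio package can write JPEG with an explicit quality parameter; a minimal sketch:

    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.io.IOException;
    import javax.imageio.IIOImage;
    import javax.imageio.ImageIO;
    import javax.imageio.ImageWriteParam;
    import javax.imageio.ImageWriter;
    import javax.imageio.stream.ImageOutputStream;

    public class JpegWrite {
        /** Write image as JPEG with quality in [0.0, 1.0]; lower quality
         *  quantizes more coarsely and gives a higher compression ratio. */
        public static void writeJpeg(BufferedImage image, File file, float quality)
                throws IOException {
            ImageWriter writer = ImageIO.getImageWritersByFormatName("jpeg").next();
            ImageWriteParam param = writer.getDefaultWriteParam();
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionQuality(quality);
            try (ImageOutputStream out = ImageIO.createImageOutputStream(file)) {
                writer.setOutput(out);
                writer.write(null, new IIOImage(image, null, null), param);
            } finally {
                writer.dispose();
            }
        }
    }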
o fractal compression
 fractals
 repeated evaluation of a simple equation at varying scales
 generates infinite level of detail
 has the property of self-similarity - parts of the shape look like transformed
copies of other parts
 apply this concept to images - one group of pixels is a transformed copy of another
group of pixels
 the transformation is not just geometric, but also gray level
 the transformation can be represented more compactly than the original group of pixels
 compression algorithm
 subdivide the image into non-overlapping ranges
 search for a domain twice the size of the range which is most similar to the
range
 select the best set of translation, orientation, and gray-level mapping parameters
 store these in a compact format
 can subdivide a range further if an appropriate domain can't be found
 specify an RMS error tolerance parameter
 if the result for a range exceeds this, we subdivide further
 gives us a quality versus computational requirements control parameter
 algorithm is not symmetric - compression takes much longer than decompression
 decompression can be done to any image resolution
o compressing video
 could do JPEG compression on each frame - e.g. QuickTime, Video for Windows
 would like to reduce interframe redundancy
 MPEG - Moving Picture Experts Group - develops standards - http://mpeg.telecomitalialab.com/
 MPEG-1 - video CD, MP3
 MPEG-2 - digital television set-top boxes, DVD
 MPEG-4 - multimedia for the web
 working on MPEG-7 (Multimedia Content Description Interface) and MPEG-21 (Multimedia Framework)
 compress a reference frame at regular intervals
 perform motion estimation for subsequent frames
 try to encode blocks in subsequent frames in terms of blocks in the reference frame
 highly asymmetrical - compression (often hardware assisted) takes much longer than
decompression