Huffman Coding

advertisement
Huffman Coding
Huffman coding (Gonzalez et al. 2008,
A.Wong Lecture 16b 2011)
• Popular lossless data compression technique
• Removes coding redundancy (or in context of image
processing, redundant image data)
• Often used in compression standards:
▫ JPEG
▫ MPEG-1, 2, 4
• Uses variable-length coding
▫ Encoding a source symbol using a table that considers
the estimated occurrence probability of that symbol
▫ Resulting data size depends on underlying image
characteristics
Steps
1.) Determine histogram of image
2.) Construct Huffman tree
3.) Encode the image using codes generated from
the Huffman tree
7
3
3
3
3
3
3
3
1
1
9
9
5
6
9
9
Frequency
1.) Histogram
8
7
6
5
4
3
2
1
0
1
3
5
6
Intensity
7
9
http://courses.cs.washington.edu/courses/cse373/02wi/slides/ImageADT/sld0
12.htm
Frequency
2) Huffman tree
Sorted from lowest to highest frequency
Intensity
Frequency
Intensity
Frequency
1
2
5
1
3
7
6
1
5
1
7
1
6
1
1
2
7
1
9
4
9
4
3
7
10
5
0
1 3 5 6 7 9
Intensity
Intensity
Frequency
5
1
6
1
7
1
1
2
9
4
3
7
2
1
1
5
6
•Lower frequency becomes left child node
•Higher frequency becomes right child node
•Sum of the two children nodes becomes the parent node
Intensity
Frequency
5
1
6
1
7
1
1
2
9
4
3
7
3
1
2
7
1
1
5
6
5
Intensity
Frequency
5
1
6
1
7
1
1
2
9
4
3
7
3
2
1
1
2
7
1
1
5
6
16
9
7
3
4
5
9
3
2
1
1
2
7
1
1
5
6
3) Encoding
• If traversing the left branch  Label 1
• If traversing the right branch  Label 0
• Follow this procedure from the root to the child
of interest adding a 1 or 0 depending on the
traversal
1
0
0
1
3
1
0
9
Intensity
Encoding
3
1
9
01
1
001
5
00001
6
00000
7
0001
0
1
1
1
0
7
0
1
5
6
Frequency
What’s awesome here?
8
7
6
5
4
3
2
1
0
1
3
5
6
Intensity
7
9
Intensity
Encoding
1
001
3
1
5
00001
6
00000
7
0001
9
01
Real Example
• Which values are going to have the shortest code?
Q8.8 in textbook
How many unique Huffman codes are
there for a three-symbol source?
Construct them.
Let’s assume the three symbols are: A, B and C
where the probability of A, B and C are in order
from lowest to highest.
What would the Huffman tree look
like?
9
1
4
5
0
C
1
1
3
A
B
0
A
11
B
10
C
0
What would happen if the probability
of C was less than the sum of A and B?
7
1
3
4
0
C
1
1
3
A
B
0
A
01
B
00
C
1
• Therefore there are two unique codes for a
three-symbol source
• Notice the codes are complements of each other
A
11
A
01
B
10
B
00
C
0
C
1
Q8.10 in textbook
Using the Huffman code in Fig. 8.8, decode the
encoded string:
0101000001010111110100
0101000001010111110100
Another example
010100111100
Download