Beginner - Example: Huffman Encoding:

Huffman encoding is an example of entropy encoding.
It is based on statistical methods. Given the characters
that must be encoded, together with the probabilities of
their occurrence, the Huffman encoding algorithm
determines an optimal code that uses the minimum
number of bits. Hence the lengths (numbers of bits) of
the coded characters will differ. In text, the shortest
code is assigned to those characters that occur most
frequently. To determine a Huffman code, it is useful
to construct a binary tree. The nodes of this tree
represent the characters that are to be encoded. Every
node contains the occurrence probability of the
characters belonging to its subtree. 0 and 1 are
assigned to the edges of the tree. The two characters
with the lowest probabilities are combined into the first
binary subtree. Their root node is labeled with these
characters and the combined probability. The edges
are labeled with 1 and 0, respectively. (This assignment
is arbitrary; therefore, the same data can yield
different Huffman codes.) The nodes below this root
node won't be considered anymore. Again, the two
nodes with the lowest probabilities are combined into
a binary subtree, and the root node and the edges are
labeled as before. Continue in this way until the node
containing the whole used alphabet, with probability 1,
is reached. The encoded data for the characters are the
paths through the tree to their nodes. The result is
stored in a table. If the information of an image can be
transformed into a bit stream, such a table can be used
to compress the data without loss.

Example: Huffman Encoding:
In the figure, characters A, B, C, D and E have the following probabilities of
occurrence:
p(A) = 0.16, p(B) = 0.51, p(C) = 0.09, p(D) = 0.13, p(E) = 0.11
The edge from node CE to node C is assigned a 1 and the edge from CE to
E a 0.
The following nodes remain after the first step:
p(A) = 0.16, p(B) = 0.51, p(CE) = 0.20, p(D) = 0.13
The edge from AD to A is assigned a 1 and the edge from AD to D a 0.
The following nodes remain after the second step:
p(AD) = 0.29, p(B) = 0.51, p(CE) = 0.20
The edge from ADCE to AD is assigned a 0 and the edge from ADCE to
CE a 1.
The following nodes remain after the third step:
p(ADCE) = 0.49, p(B) = 0.51
The edge from ADCEB to B is assigned a 1 and the edge from ADCEB to
ADCE a 0.
The figure shows the resulting Huffman code in the form of a binary tree.
The result is the following code that is stored in a table:
w(A) = 001, w(B) = 1, w(C) = 011, w(D) = 000, w(E) = 010
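The construction described above can be sketched in Python. This is a minimal illustration, not a reference implementation: the names `huffman_code`, `encode` and `decode` are ours, and because the 0/1 edge assignment is arbitrary, the exact bit patterns may differ from the figure, although the code lengths agree.

```python
import heapq
from itertools import count

def huffman_code(probabilities):
    """Build a Huffman code table from a {symbol: probability} mapping."""
    tiebreak = count()  # unique tags so the heap never compares dicts
    # Each heap entry is a subtree: (probability, tag, partial code table).
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Combine the two subtrees with the lowest probabilities ...
        p0, _, t0 = heapq.heappop(heap)
        p1, _, t1 = heapq.heappop(heap)
        # ... and label the two new edges 0 and 1 (the choice is arbitrary,
        # so equally optimal codes with different bit patterns are possible).
        merged = {sym: "0" + code for sym, code in t0.items()}
        merged.update({sym: "1" + code for sym, code in t1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]  # table of the final node (probability 1)

def encode(text, table):
    return "".join(table[ch] for ch in text)

def decode(bits, table):
    inverse = {code: sym for sym, code in table.items()}
    symbols, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free: a full match is a symbol
            symbols.append(inverse[current])
            current = ""
    return "".join(symbols)

probs = {"A": 0.16, "B": 0.51, "C": 0.09, "D": 0.13, "E": 0.11}
table = huffman_code(probs)
# B, the most frequent character, gets a 1-bit code; A, C, D, E get 3 bits,
# matching the lengths of the codes in the example.
```

Because the resulting code is prefix-free, `decode(encode(text, table), table)` recovers the original text exactly, which is what makes the table usable for lossless compression.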