Set Problem

We are going to model the throw of a pair of fair dice as a random variable, calculating the associated probability and information content of each value, and the entropy of the distribution.

Each throw of a pair of dice can be thought of as an event that produces some output, namely a number from the set S = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}, corresponding to the sum of the two dice values. For convenience, we label each possible throw by a 2-tuple (i, j), where i, j ∈ {1, 2, ..., 6}. Since the dice are fair and independent of each other, P(i, j) = P(i) ∗ P(j) = 1/6 ∗ 1/6 = 1/36 for all i and j. We can then calculate the probability distribution on S by considering all throws (i, j) that result in a particular output; for example, the throws producing the output 3 are S_3 = {(1, 2), (2, 1)}, so P(3) = 2/36.

Output   Probability   Information (in bits)
2        1/36          log2(36/1) = 5.1699
3        2/36          log2(36/2) = 4.1699
4        3/36          log2(36/3) = 3.5850
5        4/36          log2(36/4) = 3.1699
6        5/36          log2(36/5) = 2.8480
7        6/36          log2(36/6) = 2.5850
8        5/36          log2(36/5) = 2.8480
9        4/36          log2(36/4) = 3.1699
10       3/36          log2(36/3) = 3.5850
11       2/36          log2(36/2) = 4.1699
12       1/36          log2(36/1) = 5.1699

The entropy of the distribution is H = Σ P(s) ∗ log2(1/P(s)) = 3.2744 bits.

Remark: the table shows a clear connection between the probability of a random-variable output and the information gained upon reading it: the more probable the event, the less information is associated with it, and vice versa.

Now we can encode the values of our random variable using Huffman and Shannon-Fano codings.

Huffman encoding

Using the algorithm proposed by David A. Huffman, we build a Huffman tree:

[Figure: Huffman tree. Each internal node carries the merged probability of its subtree (the root splits into subtrees of weight 21/36 and 15/36), and each leaf carries a probability:output pair, e.g. 6/36:7.]

Traversing the tree from its root to the leaves containing the random-variable values, we obtain the following Huffman encodings:

Output             2      3      4      5      6      7      8      9      10     11     12
Probability        1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36
Huffman encoding   10110  0011   111    100    010    000    011    110    0010   1010   10111

We can see that the outputs that occur most frequently get the shortest bit strings, while the outputs that occur less frequently get longer ones. The average length of this encoding is

<l> = 5 ∗ (1/36 + 1/36) + 4 ∗ (2/36 + 3/36 + 2/36) + 3 ∗ (3/36 + 4/36 + 5/36 + 6/36 + 5/36 + 4/36) = 3.3056 bits.

Shannon-Fano encoding

Applying the Shannon-Fano algorithm, we build a Shannon-Fano tree:

[Figure: Shannon-Fano tree. At each node the symbols are split into two groups of as nearly equal total probability as possible; each leaf carries a probability:output pair, e.g. 6/36:7.]

Again, traversing the tree from its root to the leaves, we get the Shannon-Fano encodings:

Output                  2      3      4      5      6      7      8      9      10     11     12
Probability             1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36
Shannon-Fano encoding   11110  1101   1011   100    010    00     011    1010   1100   1110   11111

The average length is

<l> = 5 ∗ (1/36 + 1/36) + 4 ∗ (2/36 + 3/36 + 4/36 + 3/36 + 2/36) + 3 ∗ (4/36 + 5/36 + 5/36) + 2 ∗ 6/36 = 3.3333 bits.

Hence, in our particular example, the Huffman encoding is more efficient than the Shannon-Fano one: the average length of the code is shorter and closer to the entropy. This corresponds to a well-known result: Shannon-Fano coding is almost never used, because Huffman coding is almost as computationally simple and produces prefix codes that always achieve the lowest expected code word length [1].

[1] http://en.wikipedia.org/wiki/Shannon–Fano_coding
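
Numerical check (Python)

The probability, information, and entropy figures above are easy to reproduce programmatically. The following is a minimal Python sketch: it counts the 36 ordered throws, then prints each output's probability, its self-information log2(1/p), and the entropy of the distribution, which should come out to about 3.2744 bits.

    from collections import Counter
    from fractions import Fraction
    from math import log2

    # Count the ordered throws (i, j) that produce each sum 2..12.
    counts = Counter(i + j for i in range(1, 7) for j in range(1, 7))
    probs = {s: Fraction(c, 36) for s, c in sorted(counts.items())}

    # Self-information log2(1/p) of each output, then the entropy H = sum p * log2(1/p).
    for s, p in probs.items():
        print(f"{s:2d}  {str(p):5s}  {log2(1 / p):.4f} bits")

    entropy = sum(p * log2(1 / p) for p in probs.values())
    print(f"Entropy: {entropy:.4f} bits")  # expected to print about 3.2744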
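
A Huffman code for this distribution can likewise be built with a small priority-queue construction, sketched below. Ties between subtrees of equal probability are broken arbitrarily here, so the individual codewords may differ from the tree above; the expected length, however, should still be 119/36 ≈ 3.3056 bits, since any Huffman code for a given distribution is optimal.

    import heapq
    from collections import Counter
    from fractions import Fraction

    def huffman_code(probs):
        """Return a {symbol: bit string} Huffman code for {symbol: probability}."""
        # One heap entry per subtree: (probability, tie-breaker, {symbol: codeword so far}).
        heap = [(p, n, {s: ""}) for n, (s, p) in enumerate(probs.items())]
        heapq.heapify(heap)
        counter = len(heap)
        while len(heap) > 1:
            p0, _, code0 = heapq.heappop(heap)  # the two least probable subtrees
            p1, _, code1 = heapq.heappop(heap)
            merged = {s: "0" + w for s, w in code0.items()}       # prepend 0 on one side ...
            merged.update({s: "1" + w for s, w in code1.items()})  # ... and 1 on the other
            heapq.heappush(heap, (p0 + p1, counter, merged))
            counter += 1
        return heap[0][2]

    counts = Counter(i + j for i in range(1, 7) for j in range(1, 7))
    probs = {s: Fraction(c, 36) for s, c in sorted(counts.items())}

    code = huffman_code(probs)
    avg = sum(probs[s] * len(w) for s, w in code.items())
    for s in sorted(code):
        print(s, code[s])
    print("Average length:", float(avg))  # 119/36, about 3.3056 bits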
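
For Shannon-Fano, one common convention (assumed here, since the split rule was not spelled out above) is to sort the symbols by decreasing probability and recursively divide the list into two parts whose total probabilities are as close as possible. With that rule the sketch below should reproduce code lengths matching the calculation above, giving an average of 120/36 ≈ 3.3333 bits.

    from collections import Counter
    from fractions import Fraction

    def shannon_fano(items):
        """Shannon-Fano code for a list of (symbol, probability) pairs
        sorted by decreasing probability."""
        if len(items) == 1:
            return {items[0][0]: ""}
        total = sum(p for _, p in items)
        # Choose the split point that makes the two groups' totals as equal as
        # possible (ties go to the earlier split -- one of several conventions).
        best_k, best_diff, running = 1, None, Fraction(0)
        for k in range(1, len(items)):
            running += items[k - 1][1]
            diff = abs(total - 2 * running)
            if best_diff is None or diff < best_diff:
                best_k, best_diff = k, diff
        left, right = items[:best_k], items[best_k:]
        code = {s: "0" + w for s, w in shannon_fano(left).items()}
        code.update({s: "1" + w for s, w in shannon_fano(right).items()})
        return code

    counts = Counter(i + j for i in range(1, 7) for j in range(1, 7))
    probs = {s: Fraction(c, 36) for s, c in counts.items()}
    items = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

    code = shannon_fano(items)
    avg = sum(probs[s] * len(w) for s, w in code.items())
    print({s: code[s] for s in sorted(code)})
    print("Average length:", float(avg))  # 120/36, about 3.3333 bits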