Huffman Compression

Steps for the Huffman Compression

1) Write down all of the letters appearing in the text in a list, from most frequent to least frequent. For each letter, put its frequency as a subscript (it will look like a Scrabble tile). For our example (which was DANIEL LEWIS DREIBELBIS), you would get:

   E4 I4 L3 D2 S2 B2 A1 N1 R1 W1

2) Combine the two letters with the smallest frequencies. If there is more than one way to pick your two letters, pick any of them. This gives you a token of two letters; put the sum of their frequencies as its subscript. For our example, we would combine R1 and W1 to get RW2.

3) Repeat step 2), always combining the two tokens with the smallest frequencies. While combining these tokens, draw a chart to keep track of how you did it (see the Huffman chart).

4) Continue until there is only one token left, which contains all of the letters. See the Huffman chart to see what you should end up with.

5) You now have a tree that describes how all of the elements were combined. Redraw this tree, this time with the combined token on top and all the branches going down:

                  EILDSBANRW
                /            \
            EILD              SBANRW
           /    \            /      \
         EI      LD        SB        ANRW
        /  \    /  \      /  \      /    \
       E    I  L    D    S    B   AN      RW
                                 /  \    /  \
                                A    N  R    W

6) Now, for each letter, assign the code that tells how you would traverse the tree from the top to reach that letter. Use a “0” to represent a left branch and a “1” to represent a right branch. For instance, to get to the letter “B”, you must first go right, then left, then right again, and so the code for “B” will be “101”. The complete code for our example is:

   Letter  Frequency  Code
   E       4          000
   I       4          001
   L       3          010
   D       2          011
   S       2          100
   B       2          101
   A       1          1100
   N       1          1101
   R       1          1110
   W       1          1111

The final result for DANIEL LEWIS DREIBELBIS:

   01111001101001000010
   0100001111001100
   0111110000001101000010101001100

To read the code, take out your tree and use the sequence of zeros and ones as a set of directions. So if you saw 0111110, the first three digits will give you directions to the letter “D”, and the final four digits will give you directions to the letter “R”.
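
The steps above can be sketched in Python. This is only an illustration under stated assumptions, not part of the original handout: the function names (build_tree, assign_codes, encode, decode) are made up for this example, spaces are skipped as in the handout, and ties are broken by an arbitrary counter. Because step 2) allows any choice among ties, the generated codes may differ from the table above, but the total number of digits comes out the same.

```python
import heapq
from collections import Counter


def build_tree(text):
    """Steps 1-4: repeatedly combine the two tokens with the smallest frequencies."""
    freq = Counter(ch for ch in text if ch != " ")  # assumption: spaces are not encoded
    # Heap entries are (frequency, tie_breaker, tree).
    # A tree is either a single letter or a (left, right) pair.
    heap = [(n, i, ch) for i, (ch, n) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        n1, _, t1 = heapq.heappop(heap)
        n2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (n1 + n2, next_id, (t1, t2)))
        next_id += 1
    return heap[0][2]


def assign_codes(tree, prefix=""):
    """Step 6: '0' for a left branch, '1' for a right branch."""
    if isinstance(tree, str):
        return {tree: prefix or "0"}  # a one-letter text still needs one digit
    left, right = tree
    codes = assign_codes(left, prefix + "0")
    codes.update(assign_codes(right, prefix + "1"))
    return codes


def encode(word, codes):
    """Replace each letter by its code."""
    return "".join(codes[ch] for ch in word)


def decode(bits, codes):
    """Read digits left to right and emit a letter each time a full code is matched."""
    lookup = {code: ch for ch, code in codes.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in lookup:
            out.append(lookup[current])
            current = ""
    return "".join(out)


# The code table from the handout, typed in by hand.
HANDOUT_CODES = {
    "E": "000", "I": "001", "L": "010", "D": "011", "S": "100",
    "B": "101", "A": "1100", "N": "1101", "R": "1110", "W": "1111",
}

text = "DANIEL LEWIS DREIBELBIS"
codes = assign_codes(build_tree(text))
total_digits = sum(len(codes[ch]) for ch in text if ch != " ")

print(encode("DANIEL", HANDOUT_CODES))  # 01111001101001000010
print(decode("0111110", HANDOUT_CODES))  # DR
print(total_digits)                      # 67, the same total as the handout's code
```

Decoding works without separators because no letter's code is the beginning of another letter's code, so the first match found while reading left to right is always the right one.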