Project 7: Huffman Code

Extend the most recent version of the Huffman Code program to include decode information in the binary output file, and use that information to restore the original ASCII text.

Embedding the Code
- In order for the coded file to be useful, we have to store the code along with it.
- Then we can read and decode the file at a later time.
  - Even on a different computer (with the same architecture).
- In order to decode:
  - First read the code.
  - Then read and decode the message.

Serialization
- We will need to serialize the decode tree.
  - Convert the data structure into a byte array.
  - Necessary any time an object is to be written to a file or transmitted over a network.
- Also deserialize.
  - Convert the byte array back into the data structure in memory.
- Serialize and deserialize methods are required for any class whose objects need to be preserved in files or transmitted over a network.

Serialization
- We will write the serialized decode tree into the binary output file ahead of the coded message.
- On decode, first read back and deserialize the decode tree, then read and decode the message.
- Serialization and deserialization are byte-level operations.
  - Don't want to do one bit at a time.
  - Will need new operations in our binary file classes.
  - http://www.cse.usf.edu/~turnerr/Data_Structures/Downloads/Project_7_Binary_File_Classes/

Serializing the Decode Tree
- How do we convert the decode tree into a byte array that can be converted back into a decode tree elsewhere?
- Don't output pointers!
  - We can't put nodes back into the same memory locations.
  - Have to create new nodes and link them together in the same way they were linked in the original, but at new memory locations.
- Applies to all serialization operations.

How to serialize a decode tree?

    * 1.0
        * 0.55
            * 0.35
                a 0.20
                d 0.15
            * 0.20
                c 0.10
                b 0.10
        e 0.45

    (* marks a nonleaf node; children are listed left, then right)

- Note that each nonleaf node has two child nodes.

How to deserialize a decode tree?
- Need the child node addresses in order to restore a nonleaf node.
  - Get restored left child address.
  - Get restored right child address.
  - Get data for node.
  - Create new node with data, and pointers to left child and right child.

How to deserialize a decode tree?
- Work from the bottom up.
- Leaf nodes can be restored immediately from the serialized data.
  - Push addresses onto a stack.
- Get parent data from serialized data.
  - Pop the two child addresses. With left-before-right postorder output, the right child comes off the stack first, then the left child.
  - Restore parent node.
  - Push parent node address onto stack.

Serialization Algorithm
- Do a postorder depth-first traversal.
- Output node data as each node is visited.
- Output only the node data, not the pointers.
- Output order for the example tree (* marks a nonleaf node):
    a 0.20, d 0.15, * 0.35, c 0.10, b 0.10, * 0.20, * 0.55, e 0.45, * 1.0

Deserialization Algorithm

    While any node data left in serialized stream:
        Get next node data from serialized stream.
        If it is a leaf:
            Create a new node in memory with the data.
            Push address of new node onto stack.
        Else:
            Pop child address from stack.
            Pop child address from stack.
            Create new node in memory with data from the serialized stream
            and child addresses from the stack.
            Push address of new node onto the stack.

Implementing Serialization
- Add code to class Huffman_Tree to serialize and deserialize the decode tree.
- Output and input data from the Char_Freq objects.
  - Need serialize and deserialize methods in that class also.

Implementing Serialization
- How do we indicate end of the serialized decode tree?
- Use a sentinel.
  - A unique value that cannot appear as real data.
  - Char_Freq(0, 0)

Sample Run
(figure not reproduced)

Test on Full Text
- Delete screen output of decoded text.

The Files
(figure not reproduced)

Development Environment
- You may develop your program on any system you like.
- The same source files should compile and run on either Windows or Linux.

Ground Rules
- You may work with one other person.
  - OK to work alone if you prefer.
- If you do work as a pair:
  - Work together!
  - Submit a single program.
  - Both members are expected to contribute.
  - Both members should understand the program in detail.
- Do not share your code with other students.
- The rule against sharing code applies both before and after submitting the project.
- OK to discuss the project.
- Do not copy any other student’s work.
  - Don’t look at anyone else’s program.
  - Don’t let anyone look at your program.

Ground Rules
- Except for code posted on the current class web site:
  - Do not copy code from the Internet or any other source.
  - Write your own code.

Submission
- Project is due by 11:59 PM, Sunday night, April 24.
- Deliverables:
  - Source files only.
  - Zip using Windows "Send to Compressed Folder".
- If you work with another student, include both names in the assignment comments.
  - The other student should submit only a Blackboard submission comment that includes both names.
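
Sketch: serializing and deserializing a decode tree

The postorder serialization and stack-based deserialization described in the earlier slides can be sketched in C++ roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code. Node, Bytes, put_record, leaf, join, serialize, and deserialize are hypothetical names standing in for the course's Huffman_Tree and Char_Freq classes. Frequencies are assumed to be integer counts, a character value of 0 is assumed to mark a nonleaf node (so the text may not contain NUL characters), and each record is assumed to be one character byte followed by a four-byte count, low byte first. The record (0, 0) plays the role of the Char_Freq(0, 0) sentinel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stack>
#include <utility>
#include <vector>

// Hypothetical node type; ch == 0 marks a nonleaf node (assumption).
struct Node {
    char ch;
    std::uint32_t freq;
    std::unique_ptr<Node> left, right;
};

using Bytes = std::vector<std::uint8_t>;

// Helpers for building a tree by hand.
std::unique_ptr<Node> leaf(char ch, std::uint32_t freq) {
    auto n = std::make_unique<Node>();
    n->ch = ch;
    n->freq = freq;
    return n;
}

std::unique_ptr<Node> join(std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    auto n = std::make_unique<Node>();
    n->ch = 0;
    n->freq = l->freq + r->freq;
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}

// Append one 5-byte record: character, then count (low byte first).
void put_record(Bytes& out, char ch, std::uint32_t freq) {
    out.push_back(static_cast<std::uint8_t>(ch));
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>(freq >> shift));
}

// Postorder: left subtree, right subtree, then the node itself.
// Only node data is written, never pointers.
void serialize_node(const Node* n, Bytes& out) {
    if (n == nullptr) return;
    serialize_node(n->left.get(), out);
    serialize_node(n->right.get(), out);
    put_record(out, n->ch, n->freq);
}

Bytes serialize(const Node* root) {
    Bytes out;
    serialize_node(root, out);
    put_record(out, '\0', 0);  // sentinel: end of serialized tree
    return out;
}

std::unique_ptr<Node> deserialize(const Bytes& in) {
    std::stack<std::unique_ptr<Node>> pending;
    for (std::size_t i = 0; i + 5 <= in.size(); i += 5) {
        char ch = static_cast<char>(in[i]);
        std::uint32_t freq = 0;
        for (int b = 0; b < 4; ++b)
            freq |= static_cast<std::uint32_t>(in[i + 1 + b]) << (8 * b);
        if (ch == 0 && freq == 0) break;  // sentinel reached
        auto node = std::make_unique<Node>();
        node->ch = ch;
        node->freq = freq;
        if (ch == 0) {
            // Nonleaf: the right child was pushed last, so it pops first.
            assert(pending.size() >= 2);
            node->right = std::move(pending.top());
            pending.pop();
            node->left = std::move(pending.top());
            pending.pop();
        }
        pending.push(std::move(node));
    }
    assert(pending.size() == 1);  // exactly the root should remain
    auto root = std::move(pending.top());
    pending.pop();
    return root;
}
```

For the example tree, with counts 20, 15, 10, 10, and 45 standing in for the probabilities, serialize writes the records in the order a, d, nonleaf, c, b, nonleaf, nonleaf, e, root, followed by the sentinel. Passing those bytes to deserialize rebuilds a tree of the same shape at new memory locations.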