An Enhanced Steganographic Method for Data Hiding in Microsoft Word Documents by a Change Tracking Technique
Dan DeBlasio and Brad Mundt
University of Central Florida
deblasio@cs.ucf.edu, bmundt@pegasus.cc.ucf.edu

Steganography

What is Steg?
  The word steganography is of Greek origin and means "covered, or hidden writing".
  Modern implementations embed secret information in something that looks innocent.
  Typical carriers: pictures, audio, video.

Picture Steg
  [Image: a tree]
  By removing all but the last 2 bits of each color component, an almost completely black image results. Making the resulting image 85 times brighter reveals the hidden image.
  [Image: the image extracted from the tree image]

Our selection
  "A New Steganographic Method for Data Hiding in Microsoft Word Documents by a Change Tracking Technique" by Tsung-Yuan Liu and Wen-Hsiang Tsai
  A new steg method that hides information via MS Word change tracking
  The result appears as a collaborative document with corrections
  Deleted words are struck through and new words are underlined

Overview
  Transform a regular document into one with misspellings or mistakes
  Make changes and let Word track them
  Using a Huffman tree on the original words in the file, create a binary string of the embedded message
  Decode the binary string to get the secret message

Original sentence
  The quick brown fox jumped over the lazy dog.

Sentence with misspellings
  The quick borwn fox jumped oevr the lazy dag.

Sentence with changes tracked
  The quick [borwn] brown fox jumped [oevr] over the lazy [dag] dog.
  (Brackets mark the struck-through deletions; the word after each is the underlined insertion.)

Huffman Tree
  Created from the frequency of misspellings

  Word      Huffman
  teh       11
  qwik      101
  borwn     10010
  broun     0011
  jumpped   000010
  oevr      0001110
  orver     100010
  dag       01

How to Decode
  The quick [borwn] brown fox jumped [oevr] over the lazy [dag] dog.
  Word      Huffman
  teh       11
  qwik      101
  borwn     10010
  broun     0011
  jumpped   000010
  oevr      0001110
  orver     100010
  dag       01

  borwn, oevr, dag: 10010 0001110 01
  Regrouped into 7-bit characters: 1001000 0111001
  72 105
  Hi

Our implementation
  Recreated the method from scratch
  Made improvements
  Modular code for easy improvements
  Encryption: simple XOR pad with the key in the author field
  Example next, then a live demo

A Word document
  Before steg encoding

Encoding in Progress

Encoding Complete

The Word document
  After steg encoding

Decoding in Progress

Decoding Complete

Live Demo

Statistical Results

  Document     Changes   Bits    b/change   Words   % changed   Sent.   Changes/Sent
  Document 1   40.4      93.8    2.321      475     8.5%        29      1.393
  Document 2   43.2      99.8    2.310      551     7.8%        31      1.394
  Document 3   65.6      129.6   1.976      641     10.2%       51      1.286
  Document 4   124.4     224.8   1.968      1268    9.8%        89      1.398
  Document 5   37.6      78      2.074      481     7.8%        27      1.392

Improvements
  Because the code is modular, improvements will be easy
  More advanced or alternate encryption algorithms
  Implement a Markov model instead of a Huffman tree against statistical attacks
  Use synonyms instead of misspellings, or a combination of both

That’s all for now folks!
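
The decoding step from the "How to Decode" slide can be sketched in Python. This is a minimal illustration, not the authors' implementation. The code table is copied from the slide. The function names, the list of deleted words as input, and the 7-bit ASCII grouping are assumptions made for this example.

```python
# Hypothetical sketch of the decoding step; not the authors' code.

# Misspelling -> Huffman code, copied from the slide's table.
HUFFMAN_CODES = {
    "teh": "11",
    "qwik": "101",
    "borwn": "10010",
    "broun": "0011",
    "jumpped": "000010",
    "oevr": "0001110",
    "orver": "100010",
    "dag": "01",
}


def words_to_bits(deleted_words):
    """Concatenate the code of each tracked (deleted) misspelling, in document order."""
    return "".join(HUFFMAN_CODES[word] for word in deleted_words)


def bits_to_text(bits, width=7):
    """Split a bit string into fixed-width groups and map each group to a character.

    The 7-bit width is an assumption; trailing bits that do not fill a
    whole group are ignored.
    """
    usable = len(bits) - len(bits) % width
    return "".join(chr(int(bits[i:i + width], 2)) for i in range(0, usable, width))


if __name__ == "__main__":
    # Deleted words recovered from the tracked changes in the example sentence.
    print(words_to_bits(["borwn", "oevr", "dag"]))
    # 7-bit codes for the character values 72 and 105.
    print(bits_to_text("1001000" + "1101001"))
```

Keeping the table as a plain dictionary means a different code table or character width can be swapped in without touching the decoding logic.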