KEEPING SECRETS SECRET – IMPLEMENTATION OF STEGANOGRAPHY WITH AUDIO FILE AND ENCRYPTED DOCUMENT

Vijaya Lakshmi Chittimalli
B.Tech, Jawaharlal Nehru Technological University, 2006

PROJECT

Submitted in partial satisfaction of the requirements for the degree of
MASTER OF SCIENCE in COMPUTER SCIENCE
at CALIFORNIA STATE UNIVERSITY, SACRAMENTO

FALL 2009

A Project by Vijaya Lakshmi Chittimalli

Approved by:
Dr. Isaac Ghansah, Committee Chair
Prof. Dick Smith, Second Reader

Student: Vijaya Lakshmi Chittimalli

I certify that this student has met the requirements for format contained in the University format manual, and that this project is suitable for shelving in the Library and credit is to be awarded for the Project.

Dr. Cui Zhang, Graduate Coordinator
Department of Computer Science

Abstract

Steganography is the art of hiding messages inside an image, audio, or video file such that the very existence of the message is unknown to a third party. Cryptography is used to encrypt data so that it is unreadable by a third party. Keeping Secrets Secret is an application that combines both techniques to embed a text document in an audio signal. The text document is compressed and then embedded into the audio file in order to achieve robustness and better performance. Users can then easily and securely send the compressed data over the network.
The main task of this application is to let the user pass information securely: the data is encrypted according to the specified standards and algorithms and stored inside an audio file in a form that is undetectable. The application also has a reversal process, which de-embeds the data file from the audio file and decrypts the data to its original format when the user requests it. The application is developed in the Java programming language and is compatible with the Windows environment.

TABLE OF CONTENTS

List of Tables
List of Figures

Chapter
1. INTRODUCTION AND PROJECT OVERVIEW
   1.1 Modules
2. STUDY OF STEGANOGRAPHY AND CRYPTOGRAPHY
   2.1 Steganography
   2.2 Cryptography
   2.3 Comparison of Cryptography and Steganography
3. SYSTEM ANALYSIS
4. EXISTING AND PROPOSED SYSTEM
   4.1 Existing System
   4.2 Proposed System
5. FEASIBILITY STUDY AND SYSTEM REQUIREMENTS
   5.1 Operational Feasibility
   5.2 Technical Feasibility
   5.3 Financial and Economical Feasibility
6. SYSTEM DESIGN
7. FEATURES OF THE LANGUAGE USED
   7.1 About Java
   7.2 Applications and Applets
   7.3 Features of Java
       7.3.1 Security
       7.3.2 Portability
       7.3.3 The Byte code
       7.3.4 Java Virtual Machine (JVM)
   7.4 Java Architecture
   7.5 Characteristics of Java
       7.5.1 Simple
       7.5.2 Object-Oriented
       7.5.3 Robust
8. SYSTEM DESCRIPTION
   8.1 Compression
   8.2 Blowfish Algorithm for Encryption and Decryption
   8.3 Steganography (Embed and De-embed Process)
9. UML DIAGRAMS
   9.1 Use Case Diagrams
   9.2 Activity Diagrams
10. TESTING
11. USER GUIDE
12. CONCLUSION AND FUTURE WORK
References

LIST OF TABLES

Table 1: Case Generation Report
Table 2: Test Case Analysis

LIST OF FIGURES

Figure 1: Types of Cryptography
Figure 2: Examples of Symmetric Cryptography
Figure 3: Development process of a typical Java program
Figure 4: A bitmap with information for run-length encoding
Figure 5: DSSS Application
Figure 6: Use Case Diagrams for Sender
Figure 7: Use Case Diagrams for Receiver
Figure 8: Activity Diagram for compression, encryption and embedding
Figure 9: Activity Diagram for Un-compression, decryption and De-embedding
Figure 10: Step to initiate the compression
Figure 11: Selecting a file for compression
Figure 12: File Compression Successful
Figure 13: Step to Initiate Encryption
Figure 14: Selecting a file for Encryption
Figure 15: File encryption successful
Figure 16: Initiate the Embed Process
Figure 17: Selecting files for Embed Process
Figure 18: Embed Process Successful
Figure 19: Step to initiate the De-Embedding
Figure 20: Selecting a file for De-Embedding
Figure 21: File De Embed Successful
Figure 22: Step to Initiate Decryption
Figure 23: Selecting a file for Decryption
Figure 24: Step to initiate the compression
Figure 25: Selecting file for Un-Compression Process
Figure 26: File Un-Compression Successful

Chapter 1
INTRODUCTION AND PROJECT OVERVIEW

Information hiding
has recently gained importance in various applications. One area where information hiding is important is the military. Steganography has become one of the popular approaches to information hiding, and a recent development in this field is hiding information in music. The embedded data should be as immune as possible to modification by intelligent attacks or anticipated manipulations, so the hidden messages are encrypted before being hidden in audio files. The goal of this project is to hide encrypted information in music.

Keeping Secrets Secret is software that alters the properties of an audio file so that an encrypted text document can be embedded in it. The text file is first compressed, then encrypted, and then embedded into the audio file to allow maximum performance and robustness. This allows users to carry the compressed data easily and securely.

The main task of the audio steganography is to let the user pass information securely: the data is encrypted according to the specified standards and algorithms and stored in a form that is undetectable. The application has a reversal process, which de-embeds the data file from the audio file and decrypts the data to its original format when the user requests it. During encryption and decryption the application should satisfy the standards of authentication and authorization of the user.

The application provides a user-friendly graphical user interface that is designed to be self-explanatory for the end user. The system provides proper navigation within the environment, so that users have a smooth flow while working with it.
The application is designed so that, as soon as it has created a buffer for the compressed data, it asks the user for the encryption key and then carries out encryption with the key provided. The key should be chosen so that unauthorized persons cannot recover the encrypted information at any point in time. The application provides de-embedding, decryption and uncompression for the reversal process, which is carried out at the other end; the receiver can decrypt and uncompress the data only if the correct key is entered. The decryption process produces a log while the application is processing and lists any errors that occur.

This application uses the Blowfish algorithm for encryption. Blowfish is a 64-bit block cipher with a variable-length key. It was chosen because it requires less memory.

1.1 Modules:

1) Compression Module / Uncompression Module
2) Blowfish Algorithm Encryption Module / Decryption Module
3) Steganography Module (Embedding and De-embedding)
4) GUI Module

The 1st chapter gives an overview of the system and the need for it. The 2nd chapter gives the literature review of the steganographic and cryptographic techniques that form the foundation of this system. The 3rd chapter gives the analysis of the system modules, and the 4th chapter compares the existing system with the proposed system. The 5th and 6th chapters explain the feasibility of the system, the system requirements, and how the system is designed. Once the requirements and the system design are finalized, the next task is to choose the implementation language. The 7th chapter discusses why Java was chosen for this system and the main features of the language. The 8th chapter gives an overview of the modules in the system and an explanation of each module.
The 9th chapter presents the UML representation of the system, which follows all the stages of the SDLC (Software Development Life Cycle). The final test analysis of the project is given in the 10th chapter.

Chapter 2
STUDY OF STEGANOGRAPHY AND CRYPTOGRAPHY

2.1 Steganography:

As explained above, steganography is the art of concealing the existence of information within seemingly innocuous carriers. It includes various methods of secret communication that conceal the very existence of the message. Among these methods are invisible inks, microdots, character arrangement (other than the cryptographic methods of permutation and substitution), digital signatures, covert channels and spread-spectrum communications.

“In steganography “cover” is the medium which is used to hide the secret information. The information to be hidden can be a plain text message, a cipher text, another image, or anything that can be represented in binary.”[1] An image, audio file or video file usually acts as the cover that holds the information. If an image is used as the cover, it can be altered in noisy areas with a lot of color variation so that the alterations are less obvious. The message can also be scattered randomly throughout the image.

Common methods of concealing data in digital images include:

Least significant bit insertion: This is a very simple method of hiding a message in a digital image. The least significant bit (LSB) of each byte in the image is used to store the secret data. The resulting changes to the image are too small to be recognized by the human eye. The disadvantage of this technique is that, because every pixel may carry hidden bits, the image must be stored in a lossless format such as BMP or GIF; lossy compression would destroy the hidden data.

Masking and filtering: “These methods hide information in a manner similar to paper watermarks. This can be done, for example, by modifying the luminance of parts of the image.
It does change the visible properties of an image, but if done with care the distortion is barely discernable.”[1]

Transformations: This is a more complex way of hiding information in an image, and is considered the most efficient. Various algorithms and transformations are applied to the image to hide information in it. The DCT (discrete cosine transform) is one such method.

2.2 Cryptography:

“On the other hand cryptography is the process of conversion of data into scrambled code that can be deciphered and sent across a public or private network”. [2]

The two main forms of encryption are symmetric and asymmetric. Symmetric algorithms use the same key for encryption and decryption. Other names for this type of encryption are secret-key, shared-key, and private-key. Because both parties must hold the same key, a symmetric scheme can be easy to break if that key is exposed. Asymmetric cryptography uses one key for encryption and a different key for decryption. These keys are known as the public key and the private key; the private key cannot feasibly be derived from the public key. For this reason asymmetric cryptography is generally considered more secure for key distribution than symmetric cryptography. Its most common application is sending messages: the sender encrypts the message with the receiver's public key, and the receiver decrypts it with the corresponding private key.

Types of algorithms: Cryptographic algorithms can be categorized in several ways. One way is by the number of keys employed for encryption and decryption, further refined by application and use. The three types of algorithms discussed here, as shown in Figure 1, which is taken from [3], are:

1. Secret Key Cryptography (SKC): Uses a single key for both encryption and decryption.
2. Public Key Cryptography (PKC): Uses one key for encryption and another for decryption.
3. Hash Functions: Use a mathematical transformation to irreversibly "encrypt" information.

Figure 1: Types of Cryptography [3]

Figure 2, which is taken from [3], gives examples of symmetric cryptography:

Figure 2: Examples of Symmetric Cryptography [3]

2.3 Comparison of Cryptography and Steganography:

Cryptography and steganography are sometimes described as nearly the same and sometimes as entirely different. Both are used for secure communication, but they work in entirely different ways: cryptography provides privacy, whereas steganography is intended to provide secrecy. In a real-world application you need cryptography when you use your credit card on the Internet, because you do not want your number revealed to the public. Though your code may be unbreakable, anyone watching the network can see that you have sent a message. For true secrecy, you do not want anyone to know you are sending a message at all.

The main differences between steganography and cryptography are:

1. We can apply cryptography in the process of using steganography, but not the other way round.
2. We need a “cover” to use steganography, but we do not need one for cryptography.
3. Cryptography: “Protects Message”, Steganography: “Hides Message”.

The use of cryptography is therefore visible to an observer, while the main purpose of steganography is to hide the fact that it is being used at all. Depending on the need, a combination of the two methods can be very useful for secure transmission. This is the main motive for developing Keeping Secrets Secret. Here the message first undergoes cryptographic techniques and then steganographic techniques for efficient and safe transmission.
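To make the least significant bit insertion of Section 2.1 concrete, the following is a minimal sketch that hides payload bytes in the low bits of a generic carrier byte array and recovers them again. It is only an illustration under stated assumptions: the class name LsbSketch and the methods embed and extract are hypothetical, the carrier is treated as raw samples with no file header, and this is not presented as the embedding method the application itself uses.

```java
final class LsbSketch {
    // Hide each payload bit in the least significant bit of one carrier byte.
    // Assumption: the carrier is raw sample data (no header bytes to protect).
    static byte[] embed(byte[] carrier, byte[] payload) {
        if (carrier.length < payload.length * 8) {
            throw new IllegalArgumentException("carrier too small for payload");
        }
        byte[] stego = carrier.clone();
        for (int i = 0; i < payload.length * 8; i++) {
            // Take payload bits most significant first.
            int bit = (payload[i / 8] >> (7 - i % 8)) & 1;
            // Clear the carrier's lowest bit, then set it to the payload bit.
            stego[i] = (byte) ((stego[i] & 0xFE) | bit);
        }
        return stego;
    }

    // Rebuild the payload by collecting the lowest bit of each carrier byte.
    static byte[] extract(byte[] stego, int payloadLength) {
        byte[] payload = new byte[payloadLength];
        for (int i = 0; i < payloadLength * 8; i++) {
            payload[i / 8] = (byte) ((payload[i / 8] << 1) | (stego[i] & 1));
        }
        return payload;
    }
}
```

Each carrier byte changes by at most one unit, which is why the alteration is hard to perceive. In the combined scheme described above, the payload passed to embed would already be encrypted.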
Chapter 3
SYSTEM ANALYSIS

People have long tried to solve the security problems of general digital communication systems, but these problems persist. A secure and easy transfer method has therefore evolved, which encrypts the file to be sent and hides it inside an audio file, with decryption at the receiving end. The file to be sent passes through both cryptographic standards and steganographic techniques. The advantages of this audio steganography are:

1. High-level security
2. Cost-effective transfer

In a world where anyone can access information over the network, intercepting that information is no longer a difficult task. Many organizations send information in and out of their networks at various levels, and this information may need protection. If an organization has this application, each employee can send information to any other registered employee, and thus communicate and perform the prescribed tasks in a secure fashion. The file the employee sends reaches its destination quickly in an audio file format; the end user then de-embeds the file, decrypts it, uncompresses it, and uses it. The various branches of the organization can be connected to a single host server, so that an employee of one branch can send files to an employee of another branch through the server in a secure format.

Chapter 4
EXISTING AND PROPOSED SYSTEM

4.1 Existing System

In a traditional client-server architecture the server is only a database server that offers data. The majority of the business logic (validations and so on) therefore has to be placed on the client systems, which makes maintenance expensive. Such clients are called ‘fat clients’. These clients are responsible for the security of the communication between client and server.
Since the actual processing of the data takes place on the remote client, the data has to be transported over the network, which requires a secure transfer method. Present-day transactions are considered "un-trusted" in terms of security, i.e. they are relatively easy to hack. The lack of a secure transfer mode in the existing system is the motivation for a new system with higher-level security standards for information exchange.

4.2 Proposed System

The proposed system has new features to overcome the issues mentioned above. Transactions take place in a secure format between the various clients. The system lets the user transfer data through the network easily by compressing large files. It also identifies the user and provides communication according to the security standards. In this system the data is sent through the network as an audio file. The user who receives the file performs de-embedding, decryption, and uncompression at their level of the hierarchy to get the required information. Compressing the data improves transfer performance, and embedding the encrypted data in the audio file in such a way that it appears to be a normal audio file provides security while the transfer takes place.

Chapter 5
FEASIBILITY STUDY AND SYSTEM REQUIREMENTS

Feasibility Study

A feasibility study is a high-level prototype version of the entire system analysis and design process. The study begins by clarifying the problem definition. Its purpose is to determine whether it is worth implementing the set of requirements with the approaches proposed, including the algorithms and the programming language used. Once an acceptable problem definition has been generated, development of a logical model of the system begins. Alternatives are then searched for and analyzed carefully. There are 3 parts in the feasibility study.
5.1 Operational Feasibility

The questions that arise here are:

1. Will the system be used if it is developed and implemented?
2. Is there sufficient support for the project from the platform on which it is developed?
3. In which test cases might the system fail, and how can those problems be overcome?
4. Will this application be compatible with different platforms?

The system might face a performance issue when embedding the data directly into the audio file, but the compression technique addresses that problem and improves performance to a great extent. Developing the system in Java also helps, because Java's platform independence addresses the compatibility issue.

5.2 Technical feasibility

1. Does the necessary technology exist to do what is being suggested?
2. Does the proposed equipment have the technical capacity to use the new system?
3. Are there technical guarantees of accuracy, reliability and data security?

These issues are addressed as follows:

1. The project is developed on a Pentium IV with 256 MB RAM.
2. The system can be developed on any platform.
3. The observer pattern, together with the factory pattern, is used to update the results.
4. The language used in the development is Java 1.5, in a Windows environment.

5.3 Financial and Economical Feasibility

The system, once developed and installed, will be of good benefit to the organization. It will be developed and operated on the existing hardware and software infrastructure, so there is no need for additional hardware or software.

Software Requirements

Operating System: Windows (Client/Server)
Front-end: Java J2SDK 1.5, Swing

Hardware Requirements

System Configuration: Pentium IV Processor with 700 MHz Clock Speed, 256 MB RAM, 20 GB Hard Disk, 32 Bit PCI Ethernet Card
Chapter 6
SYSTEM DESIGN

The proposed system is designed to provide a method by which a normal audio file can carry important information and still sound the same as the original audio file. The information hidden inside the audio file can be in .txt, .doc or .xls format. To increase performance and security, the file to be embedded goes through compression and encryption before embedding, and at the receiving end it goes through decryption and uncompression.

The interface design covers the user interface, which follows GUI standards and provides a proper navigation system. The user follows the GUI options in order to use the system. The operations provided through the GUI are compression, encryption, embedding, de-embedding, decryption, uncompression, general information and exit. The encryption and decryption services are provided by the security services module, which carries them out using the Blowfish algorithm.

To use this application, the user first selects the Compression option, which calls the Compression Module on the back end and compresses the file. After compression is completed, the user selects the file for the encryption option, which calls the Encryption Module on the back end and encrypts the file. After encryption is completed, the user selects the file to embed into the audio file, which calls the Embedding Module on the back end and embeds the file into the audio file. The user at the other end, working with the same application, uses the options De-Embed Files, Decrypt and Uncompress, which call the De-Embed, Decrypt and Uncompress modules on the back end.
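The first two sender-side steps described above (compress, then encrypt) and their reversal on the receiver side can be sketched with standard JDK classes. This is a minimal sketch under stated assumptions, not the application's actual modules: the class name SenderPipelineSketch and its method names are hypothetical, GZIP from java.util.zip stands in for the Compression Module, the JDK's built-in "Blowfish" cipher stands in for the Encryption Module, CBC mode with a random IV is a choice made here, and a modern JDK is assumed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class SenderPipelineSketch {
    // Assumption: CBC mode with PKCS5 padding; the source does not name a mode.
    private static final String TRANSFORMATION = "Blowfish/CBC/PKCS5Padding";
    private static final int IV_LENGTH = 8; // Blowfish block size in bytes

    static byte[] compress(byte[] data) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
                gzip.write(data);
            } // closing the stream writes the GZIP trailer
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static byte[] uncompress(byte[] data) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(data))) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns IV followed by ciphertext so the receiver can decrypt.
    static byte[] encrypt(byte[] plain, byte[] key) {
        try {
            byte[] iv = new byte[IV_LENGTH];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "Blowfish"), new IvParameterSpec(iv));
            byte[] body = cipher.doFinal(plain);
            byte[] out = Arrays.copyOf(iv, iv.length + body.length);
            System.arraycopy(body, 0, out, iv.length, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static byte[] decrypt(byte[] message, byte[] key) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "Blowfish"),
                    new IvParameterSpec(message, 0, IV_LENGTH));
            return cipher.doFinal(message, IV_LENGTH, message.length - IV_LENGTH);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII); // hypothetical 128-bit key
        byte[] document = "Meet at noon. Meet at noon. Meet at noon.".getBytes(StandardCharsets.UTF_8);
        byte[] protectedBytes = encrypt(compress(document), key);          // sender side
        byte[] recovered = uncompress(decrypt(protectedBytes, key));       // receiver side
        System.out.println("round trip ok: " + Arrays.equals(document, recovered));
    }
}
```

Compressing before encrypting matters for the order of the steps: encrypted data looks random and would no longer compress well. The bytes returned by encrypt are what an embedding step would then hide in the audio file.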
Chapter 7
FEATURES OF THE LANGUAGE USED

7.1 About Java

The language was initially called “Oak” and was renamed “Java” in 1995. The primary motivation for the language was the need for a platform-independent (i.e., architecture-neutral) language. Some of the important aspects of Java that helped this project are:

1. Java is platform independent.
2. Java is cohesive and consistent.
3. Except for those constraints imposed by the Internet environment, Java gives the programmer full control.
4. Java provides supporting APIs such as java.util.zip for compression and javax.sound.sampled for working with the audio file formats into which data is embedded.

7.2 Applications and Applets

Java's ability to create applets makes it important. An applet is an application designed to be transmitted over the Internet and executed by the JVM on another machine. An applet is a Java program that is usually used to provide the GUI for an application. It can react to user input and change dynamically according to that input.

7.3 Features of Java

7.3.1 Security

Every time we download a “normal” program, there is a possibility that malicious code is downloaded along with the data we intend to download. Prior to Java, most users did not download executable programs frequently, and those who did scanned them for viruses prior to execution. Java ensures security by providing a “firewall” between a networked application and your computer. When you use a Java-compatible Web browser, you can safely download Java applets without fear of virus infection or malicious intent.

7.3.2 Portability

Java programs can be executed on any platform because of Java's platform independence.

7.3.3 The Byte code

The key that allows Java to solve the security and portability problems is that the output of the Java compiler is byte code.
“Byte code is a highly optimized set of instructions designed to be executed by the Java run-time system, which is called the Java Virtual Machine (JVM). That is, in its standard form, the JVM is an interpreter for byte code. Translating a Java program into byte code helps makes it much easier to run a program in a wide variety of environments”. [4]

7.3.4 Java Virtual Machine (JVM)

The Java virtual machine is an important element of Java technology. The virtual machine can be embedded within a web browser or an operating system. Once a piece of Java code is loaded onto a machine, it is verified. As part of the loading process, a class loader is invoked; it performs byte code verification, which makes sure that the code generated by the compiler will not corrupt the machine it is loaded on. Byte code verification takes place when a class is loaded, before it is executed, to make sure that the code is accurate and correct. Byte code verification is therefore integral to the compiling and executing of Java code.

Figure 3: Development process of a typical Java program (Java source, .java → compile → Java byte code, .class → Java Virtual Machine)

Figure 3 shows the development process of a typical Java program: producing byte code and executing that byte code. The first box indicates that the Java source code is located in a .java file that is processed by a Java compiler. The Java compiler produces a file called a .class file, which contains the byte code. The class file is then loaded, across the network or locally on your machine, into the execution environment, which is the Java virtual machine; the JVM interprets and executes the byte code.

7.4 Java Architecture

The Java architecture provides a portable, robust, high-performing environment for development. Java provides portability by compiling source code into byte code for the Java Virtual Machine, which is then interpreted on each platform by the run-time environment.
Java is a dynamic system, able to load code when needed from the local machine or across the network.

7.5 Characteristics of Java:

7.5.1 Simple

Java was designed to make the programmer's life easy by providing the necessary libraries and APIs for performing common tasks. In Java there are a small number of clearly defined ways to accomplish a given task.

7.5.2 Object-Oriented

The object model in Java is simple and easy to extend, while simple types, such as integers, are kept as high-performance non-objects.

7.5.3 Robust

“The ability to create robust programs was given a high priority in the design of Java. Java is strictly typed language; it checks your code at compile time and run time.”[4]

Another important Java API used in this project is the support for audio files; audio files are handled using the javax.sound.sampled package. The Java Sound API can handle audio transport in both a streaming, buffered fashion and an in-memory, unbuffered fashion. "Streaming" is used here in a general sense to refer to real-time handling of audio bytes; it does not refer to the specific, well-known case of sending audio over the Internet in a certain format. In other words, a stream of audio is simply a continuous set of audio bytes that arrive more or less at the same rate that they are to be handled (played, recorded, etc.). Operations on the bytes commence before all the data has arrived. In the streaming model, particularly in the case of audio input rather than audio output, you do not necessarily know in advance how long the sound is and when it will finish arriving. You simply handle one buffer of audio data at a time, until the operation is halted. In the case of audio output (playback), you also need to buffer data if the sound you want to play is too large to fit in memory all at once. In other words, you deliver your audio bytes to the sound engine in chunks, and it takes care of playing each sample at the right time.
Mechanisms are provided that make it easy to know how much data to deliver in each chunk.

Chapter 8
SYSTEM DESCRIPTION

The system is designed as a set of modules: the GUI module, Compression module, Security System module, Steganography module, and Connection Manager module.

The GUI module deals with the design of the interface, which gives the user the flexibility of accessing the file system and selecting the required file for transfer. It also collects information from the user in order to check authorization before providing access to the file system. The interface design further includes the services for sending and receiving files with the encryption and decryption standards.

The Compression module deals with the compression and uncompression of the data file, so that the file can be sent easily with minimum upload time.

The Security implementation module implements the encryption and decryption standards used to transfer files from one system to another in a distributed environment. The algorithm used for this purpose is Blowfish, where the user can enter a key according to the level of encryption desired.

The modules of the system are:

1) Compression (Compression and Uncompression Modules)
2) Steganography (Embed and De-embed Modules)
3) Blowfish Algorithm Implementation (Encryption and Decryption Modules)
4) GUI Module (User Interface Module)

8.1 COMPRESSION:

Most of the information transferred between the client and the server contains redundant data; the time to transfer this information can be reduced greatly if the redundant information and extra white space are managed properly. Managing here means that the original file is compressed using a compression technique that removes the redundancy and other information that adds very little to the actual source content.
Even after the file is compressed, the original file format and styling are still preserved. The file to be transferred can be compressed at the sender's end to increase the speed of transfer and uncompressed at the receiver's end to recover the original file. Some of the simplest compression techniques are explained below.

Consider the string

BBBWWDDXXXXKKKKWWZZZ

This string can be encoded more compactly by replacing each run of repeated characters with a single instance of the character and a number that gives how many times it is repeated. The string above can be encoded as follows:

3B2W2D4X4K2W3Z

Here "3B" means three B's, "2W" means two W's, and so on. Compressing a string in this way is called run-length encoding.

As another example, consider the storage of a rectangular image. As a single-color bitmapped image, it can be stored as shown in Figure 4, taken from [5].

Figure 4: A bitmap with information for run-length encoding [5]

Another approach might be to store the image as a graphics metafile:

Rectangle 11, 3, 20, 5

“This says, the rectangle starts at coordinate (11, 3) of width 20 and length 5 pixels. The rectangular image can be compressed with run-length encoding by counting identical bits as follows”. [5]

0, 40
0, 40
0, 10  1, 20  0, 10
0, 10  1, 1  0, 18  1, 1  0, 10
0, 10  1, 1  0, 18  1, 1  0, 10
0, 10  1, 1  0, 18  1, 1  0, 10
0, 10  1, 20  0, 10
0, 40

The first line says that the first row of the bitmap consists of forty zeros, and the second row is the same. The third line says that the third row starts with ten zeros, followed by twenty ones and then ten more zeros. In this way the original representation of the bitmap can be recovered.

One way to carry out compression and uncompression is to use the java.util.zip package. The package provides various methods to read, create, and modify ZIP and GZIP file formats.
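The run-length encoding of the example string earlier in this section can be sketched in a few lines. This is an illustration of the technique only, not the application's Compression Module; the class name RunLengthEncoder and its methods are hypothetical, and the sketch assumes the input text contains no digit characters, since digits are used for the run counts.

```java
final class RunLengthEncoder {
    // Replace each run of a repeated character with "<count><character>".
    // Assumption: the input contains no digits, so counts stay unambiguous.
    static String encode(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char symbol = text.charAt(i);
            int run = 1;
            while (i + run < text.length() && text.charAt(i + run) == symbol) {
                run++;
            }
            out.append(run).append(symbol);
            i += run;
        }
        return out.toString();
    }

    // Expand "<count><character>" pairs back into the original text.
    static String decode(String encoded) {
        StringBuilder out = new StringBuilder();
        int count = 0;
        for (char ch : encoded.toCharArray()) {
            if (ch >= '0' && ch <= '9') {
                count = count * 10 + (ch - '0'); // counts may have several digits
            } else {
                for (int k = 0; k < count; k++) {
                    out.append(ch);
                }
                count = 0;
            }
        }
        return out.toString();
    }
}
```

For the 20-character string BBBWWDDXXXXKKKKWWZZZ, encode produces the 14-character string 3B2W2D4X4K2W3Z, and decode restores the original. Note that text without repeated runs grows under this scheme, which is one reason general-purpose formats such as ZIP and GZIP use more elaborate methods.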
The package also provides utility classes for computing checksums of arbitrary input streams, which can be used to validate input data. Using the classes in this package we can compress the input file: the file is read, and the compressed output file is written, using file streams such as FileInputStream and FileOutputStream.

8.2 BLOWFISH ALGORITHM FOR ENCRYPTION AND DECRYPTION:

To add security to the transfer process, the compressed file is encrypted using an encryption algorithm. Encryption algorithms are of two types:
1. Symmetric Key
2. Public Key

A symmetric key algorithm such as Blowfish uses the same key for both encryption and decryption, whereas a public key algorithm uses two keys: the public key for encryption and the private key for decryption.

Blowfish is a symmetric-key block cipher with a 64-bit block size and a variable-length key of 32 to 448 bits. Its subkeys are held in an 18-entry P-array and four S-boxes of 256 entries each. The algorithm has 16 rounds, and each round consists of a key-dependent permutation and a key- and data-dependent substitution. All operations are XORs and additions on 32-bit words. The only additional operations are four indexed array data lookups per round.

The input is a 64-bit data element A. It is divided into two 32-bit halves, A1 and A2. As mentioned above, the algorithm has 16 rounds, and each round goes through the following steps:

For i = 1 to 16:
    A1 = A1 XOR Pi
    A2 = F(A1) XOR A2
    Swap A1 and A2

After the 16th round, swap A1 and A2 again to undo the last swap. Then A2 = A2 XOR P17 and A1 = A1 XOR P18. Finally, recombine A1 and A2 to get the ciphertext.

Function F looks like this: divide A1 into four eight-bit quarters a, b, c, and d. Then

F(A1) = ((S1,a + S2,b mod 2^32) XOR S3,c) + S4,d mod 2^32.

Decryption is exactly the same as encryption, except that P1, P2,..., P18 are used in the reverse order.
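The encryption and decryption steps can be tried out with the Blowfish implementation in the standard javax.crypto API, as in the minimal sketch below. Several details are assumptions that the text does not specify: the class name BlowfishSketch, the use of CBC mode with PKCS5 padding, the random 8-byte IV stored in front of the ciphertext, and the use of the raw bytes of the user's key string as the Blowfish key.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper class, not taken from the application's source.
final class BlowfishSketch {

    private static final String TRANSFORMATION = "Blowfish/CBC/PKCS5Padding"; // assumed mode
    private static final int BLOCK_BYTES = 8; // Blowfish block size is 64 bits

    // Encrypts data; the result is the random IV followed by the ciphertext.
    static byte[] encrypt(byte[] key, byte[] plain) throws Exception {
        byte[] iv = new byte[BLOCK_BYTES];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE,
                new SecretKeySpec(key, "Blowfish"), new IvParameterSpec(iv));
        byte[] body = cipher.doFinal(plain);
        byte[] out = new byte[BLOCK_BYTES + body.length];
        System.arraycopy(iv, 0, out, 0, BLOCK_BYTES);
        System.arraycopy(body, 0, out, BLOCK_BYTES, body.length);
        return out;
    }

    // Decrypts data produced by encrypt(), using the same key.
    static byte[] decrypt(byte[] key, byte[] data) throws Exception {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE,
                new SecretKeySpec(key, "Blowfish"),
                new IvParameterSpec(data, 0, BLOCK_BYTES));
        return cipher.doFinal(data, BLOCK_BYTES, data.length - BLOCK_BYTES);
    }

    public static void main(String[] args) throws Exception {
        // A 16-byte (128-bit) key, inside the 32 to 448 bit range.
        byte[] key = "0123456789abcdef".getBytes(StandardCharsets.UTF_8);
        byte[] plain = "compressed file contents".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decrypt(key, encrypt(key, plain));
        System.out.println("round trip ok: " + Arrays.equals(plain, restored));
    }
}
```

Because the same key is used in both directions, the receiver must be given the key by some separate means; decrypting with a different key normally fails with a padding error rather than returning the original data.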
8.3 STEGANOGRAPHY (EMBED AND DE-EMBED PROCESS):

Steganography is the art of hiding information in ways that prevent the detection of hidden messages. The word is derived from Greek and literally means “Covered Writing”. It includes a vast array of secret communication methods that conceal the message’s very existence. These methods include invisible inks, microdots, character arrangement, digital signatures, covert channels, and spread spectrum communications.

In this application, the end user embeds a compressed and encrypted data file into an audio file, which acts as the carrier for the data. This keeps the data from being visible, so it is secure during transmission. The user on the receiving end uses the De-embed module to retrieve the data file from the audio file. That module deals with identifying the hidden data in the audio file.

The two major approaches to implementing this are:

Spread Spectrum is a method that works by modifying or adding random noise to the signal; the information is spread across the carrier.

Echo data hiding is another method, which uses the echoes in sound files to hide information. By adding extra sound to an echo inside an audio file, information can be concealed.

We use one of the Spread Spectrum techniques for embedding data into the audio file. There are two approaches in Spread Spectrum:
1. Direct Sequence Spread Spectrum (DSSS)
2. Frequency Hopped Spread Spectrum (FHSS)

We use Direct Sequence Spread Spectrum (DSSS) for embedding the data into the audio file. The basic concept is that the data is spread all over the carrier by modifying the high-pitch bits of the audio file: those bits are replaced with the bits of the data file. Because the data bits occupy only the high-pitch audio bits, and only a few bits are replaced, the change is difficult to detect.
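The replacement of high-pitch audio bits with data bits can be sketched as below. This is a simplified illustration of the byte-replacement idea only, not a full DSSS modulator and not the application's actual code. It assumes that "high pitch" can be modelled as an audio byte whose absolute value reaches a threshold; the class EmbedSketch, the THRESHOLD value and the method names are all hypothetical. The positions of the replaced bytes are recorded in an array so that the data can be read back, as this chapter describes for the embed step. How that array reaches the receiver is not covered here.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not taken from the application's source.
final class EmbedSketch {

    // Assumed rule: a byte counts as "high pitch" at or above this magnitude.
    static final int THRESHOLD = 100;

    static boolean isHighPitch(byte sample) {
        return Math.abs((int) sample) >= THRESHOLD;
    }

    // Replaces high-pitch carrier bytes with payload bytes, in place,
    // and returns the carrier positions that were used.
    static int[] embed(byte[] carrier, byte[] payload) {
        int[] positions = new int[payload.length];
        int found = 0;
        // First pass: locate enough high-pitch bytes before changing anything.
        for (int i = 0; i < carrier.length && found < payload.length; i++) {
            if (isHighPitch(carrier[i])) {
                positions[found++] = i;
            }
        }
        if (found < payload.length) {
            throw new IllegalArgumentException("carrier too small for payload");
        }
        // Second pass: overwrite one carrier byte (8 bits) per payload byte.
        for (int k = 0; k < payload.length; k++) {
            carrier[positions[k]] = payload[k];
        }
        return positions;
    }

    // Reads the payload back from the recorded positions.
    static byte[] deEmbed(byte[] stego, int[] positions) {
        byte[] payload = new byte[positions.length];
        for (int k = 0; k < positions.length; k++) {
            payload[k] = stego[positions[k]];
        }
        return payload;
    }

    public static void main(String[] args) {
        byte[] carrier = new byte[256];
        for (int i = 0; i < carrier.length; i++) {
            carrier[i] = (byte) i; // stand-in for real audio data
        }
        byte[] payload = "secret".getBytes(StandardCharsets.UTF_8);
        int[] positions = embed(carrier, payload);
        System.out.println(new String(deEmbed(carrier, positions), StandardCharsets.UTF_8));
    }
}
```

The exception thrown when the carrier has too few usable bytes reflects the practical limit that the data file has to be smaller than the audio file that carries it.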
Direct Sequence Spread Spectrum (DSSS):

The data to be transmitted is divided into small pieces, and each piece is allocated to a high-pitch frequency channel across the spectrum. A phase-varying modulation technique is used to modulate each piece of data over the channel. Figure 5 depicts the application of DSSS to the audio signal.

Figure 5: DSSS Application

Frequency Hopped Spread Spectrum (FHSS):

The data carrier frequency is periodically modified (hopped) across a specific range of frequencies (spreading). The shifting pattern is determined by the chosen code sequence (FSK, Frequency Shift Keying).

Because we use DSSS for embedding, the data to be embedded into the audio file is split into 8-bit blocks. Whenever a high-pitch node is found, 8 bits of the audio file are replaced with the data file bits, and the node position is stored in an array, which helps in retrieving the original data bits.

Chapter 9
UML DIAGRAMS

UML diagrams for “Keeping Secrets Secret” are given below.

9.1 Use Case Diagrams:

A Use Case diagram organizes the system requirements pictorially. Each use case represents a requirement in the system.

Figure 6 represents the Use Case Diagram for the sender. The requirements that fall on the sender’s side are Compression, Encryption and Embedding. Each of them is depicted as a use case.

Figure 6: Use Case Diagrams for Sender

Figure 7 represents the Use Case Diagram for the receiver. The requirements that fall on the receiver’s side are Un-Compression, Decryption and De-Embedding.

Figure 7: Use Case Diagrams for Receiver

9.2 Activity Diagrams:

Activity Diagrams for Compression, Encryption and Embedding: An activity diagram is a flow chart of a particular process. Here the activity diagram depicts the flow between Compression, Encryption and Embedding. Figure 8 represents the activity diagram from the sender’s side.
Figure 8: Activity Diagram for compression, encryption and embedding.

Activity Diagrams for Un-compression, Decryption and De-Embedding: Here the activity diagram depicts the flow between Un-Compression, Decryption and De-Embedding. Figure 9 represents the activity diagram from the receiver’s side.

Figure 9: Activity Diagram for Un-compression, decryption and De-embedding.

Chapter 10
TESTING

The test procedure used in the testing process is black box testing. Test cases are analyzed accordingly.

Black Box Testing

This test involves manually evaluating the flow from one module to the next and checking that the process flow is correct. Testing covers the following criteria:

Compression process
Encryption process
Embedding process
Retrieving process
Decryption process
Uncompression process
Key prescription
Information display
Exit process

Case Generation Report:

Test Type | Case | Expected Result
Operational / Unit / Functional Test | Compress | Receive the file to compress and save the result in the same location.
-do- | UnCompress | Receive the file to uncompress and save the result in the same location.
-do- | Encryption | Receive the file to be encrypted, encrypt it according to the key, and save.
-do- | Decryption | Receive the file to be decrypted, decrypt it with the same key, and save.
-do- | Exit | Ends the process.
-do- | Embed | Receive the encrypted file and the audio file, embed, and save in the same location.
-do- | Retrieve | Receive the folder containing the audio file, retrieve the encrypted file, and save it in the same folder.

Table 1: Case Generation Report

Test Case Analysis:

Test Type | Expected Result | Observed Result | Remarks
Receive data, decrypt and save | Decrypted | Path failure | Address of the file is corrected

Table 2: Test Case Analysis

All the above validations on buttons have been verified, and they execute successfully. The flow was tested under different possible conditions by means of this testing.
Chapter 11
USER GUIDE

Step 1: The first step is to compress the text file, which is done using the compression module. Select the “Compress” option from the drop-down menu. Figure 10 represents the step of initiating the compression.

Figure 10: Initiating the compression

Step 2: Once the “Compress” option is selected, a new window opens in which the path to the text file to be compressed must be given. Once the path is selected, hit the “Compress” button on the window. Figure 11 represents the process of selecting a file for compression.

Figure 11: Selecting a file for compression

Once the “Compress” button is hit, the file is compressed and an alert pops up that says “File Compressed Successfully”. The compressed file is placed in the same folder as the original text file, with the same filename but the extension .cmp. Figure 12 shows the prompt reporting that file compression is successful.

Figure 12: File Compression Successful

Step 3: Once the compression is done, the compressed file has to be encrypted using the encryption module. Select the “Encrypt” option from the drop-down menu. Figure 13 represents the step for initiating the encryption process.

Figure 13: Step to Initiate Encryption.

Step 4: Once the “Encrypt” option is selected, a new window opens in which the path to the compressed file to be encrypted must be selected. Once the path is selected, hit the “Encrypt” button on the window. Figure 14 represents the step for selecting an already compressed file for encryption.

Figure 14: Selecting a file for Encryption.

Once the “Encrypt” button is hit, the file is encrypted and a message saying “File Encrypted Successfully” is displayed in the log on the right-hand side of the window. The encrypted file is placed in the same folder as the compressed file, with the same filename but the extension .enc.
Figure 15 shows the prompt reporting that file encryption is successful.

Figure 15: File encryption successful.

Step 5: The next step is to embed the encrypted file, which is done using the “Embed” module. Select the “Embed” option from the drop-down menu. Figure 16 shows the step to initiate the embed process.

Figure 16: Initiate the Embed Process.

Step 6: Once the “Embed” option is selected, a new window opens in which the path to the audio file and the path to the encrypted file must be selected. Once the paths are selected, hit the “Embed” button on the window. Figure 17 shows the prompt in which an audio file and an already encrypted file are selected for the embed process.

Figure 17: Selecting files for Embed Process.

Once the “Embed” button is hit, the embed process completes and an alert pops up that says “Embed Process Complete”. The new audio file is placed in the same folder as the original audio file. Figure 18 represents the prompt reporting that the embed process is successful.

Figure 18: Embed Process Successful.

Reversal Process:

Step 1: The first step in the reversal process is to de-embed the encrypted file from the audio file, which is done using the De-embed module. Select the “De-embed” option from the drop-down menu. Figure 19 represents the step of initiating the de-embedding.

Figure 19: Step to initiate the De-Embedding

Step 2: Once the “De-embed” option is selected, a new window opens in which the path to the audio file that hides the encrypted file must be selected. Once the path is selected, hit the “De-Embed” button on the window. Figure 20 represents the prompt in which a file is selected for de-embedding.

Figure 20: Selecting a file for De-Embedding.

Once the “De-Embed” button is hit, an alert pops up that says “De-Embed Process Completed”. Once this process completes successfully, an encrypted file is placed in the same folder as the audio file. The file has the extension .enc.
Figure 21 represents the prompt that says “De-Embed Process Completed”.

Figure 21: File De Embed Successful

Step 3: The next step is to decrypt the encrypted file produced in the previous step. Select the “Decrypt” option from the drop-down menu. Figure 22 represents the step to initiate the decryption process.

Figure 22: Step to Initiate Decryption

Step 4: Once the “Decrypt” option is selected, a new window opens in which the path to the encrypted file must be browsed. Once the path is selected, hit the “Decrypt” button on the window. Figure 23 shows the prompt for selecting a file for decryption.

Figure 23: Selecting a file for Decryption.

Step 5: The next step is to uncompress the text file. Select the “Un-Compress” option from the drop-down menu. Figure 24 represents the step of initiating the un-compression process.

Figure 24: Step to initiate the Un-Compression.

Step 6: Once the “Un-Compress” option is selected, a new window opens in which the path to the file to be uncompressed must be selected. Once the path is selected, hit the “Un-Compress” button on the window. Figure 25 represents the prompt in which a file is selected for un-compression.

Figure 25: Selecting file for Un-Compression Process.

Once the “Un-Compress” button is hit, the file is uncompressed and an alert pops up that says “File Un-Compressed Successfully”. The uncompressed file is placed in the same folder as the original compressed file. Figure 26 represents the prompt reporting that the un-compression is successful.

Figure 26: File Un-Compression Successful

Chapter 12
CONCLUSION AND FUTURE WORK

Keeping Secrets Secret is an application that alters the audio file properties so that the encrypted text document can be embedded into it. As mentioned earlier, the main purpose of building such an application is secure communication over the network.
The different modules in the system, such as Compression, Encryption and Embedding, serve this purpose. Both security and efficiency are achieved through these modules. Efficiency is achieved by compression: since a whole file is embedded, compressing the original file speeds up the whole process and also increases the size of file that can be used. Security is achieved using cryptography and steganography.

The main strength of this system is its high level of security: the information is first encrypted, then hidden, and then transmitted. Algorithms such as DSSS (Direct Sequence Spread Spectrum) also add to the system’s strengths. It was a challenging task to implement those methods in practice, make them work, and check whether the process holds up under all conditions.

The main drawback of this system is that the file to be embedded must be smaller than the file in which it is embedded.

Developing a tool like this has been a great experience, as the Author got to see various other areas in the security field and various methods of performing a particular task. The Author also got a chance to discover and learn new programming techniques through this project.

With reference to future work, a new module can be added to the present system for communication over the network, where a file can be sent from one system to another by taking a parameter such as a system IP address, or by emailing the file to a particular account when the email address is provided. Also, this system is limited to using steganography for hiding information. Steganalysis, the process of detecting the presence of steganography, could be another module added to the present system. These two aspects can be added to the present application to enhance it further.

REFERENCES

[1] Sonali Guptha. (2005, April). All About Steganography: How it Works? [Online].
Available: http://palisade.plynt.com/issues/2005Apr/steganography/
[2] IT Security Notes and References for CISSP candidates. [Online]. Available: http://www.barcodesinc.com/articles/cryptography2.htm
[3] Gary C. Kessler. (2009, August). An Overview of Cryptography: Cryptographic Algorithms. [Online]. Available: http://www.garykessler.net/library/crypto.html#intro
[4] Patrick Naughton and Herbert Schildt, “Java2: The Complete Reference”. Tata McGraw Hill, 1999, pp. 534-536.
[5] Qusay H. Mahmoud. (2002, February). Compressing and Decompressing Data Using Java APIs. [Online]. Available: http://java.sun.com/developer/technicalArticles/Programming/compression/
[6] John E. Hershey, “Cryptography Demystified”. Tata McGraw Hill, 2004, pp. 56-180.
[7] Rich Helton and Johennie Helton, “Mastering Java Security”. Wiley Dreamtech, 2002, pp. 10-140.
[8] Wei Qin Cheng, Fei Han, Man Juon Tung, Kai Xu, “Robust Audio Steganography using Direct-Sequence Spread Spectrum Technology”, 2007.
[9] Jeff England, “Audio Steganography, Echo Data Hiding”, EE6886.
[10] Nick Sterling, Sarah Wahl, Sarah Summers, “Spread Spectrum Steganography”.
[11] Roger S. Pressman, “Software Engineering”. Tata McGraw Hill, 2004, pp. 50-70.
[12] Package javax.sound.sampled. [Online]. Available: http://www.j2ee.me/j2se/1.4.2/docs/api/javax/sound/sampled/package-summary.html