KEEPING SECRETS SECRET – IMPLEMENTATION OF STEGANOGRAPHY
WITH AUDIO FILE AND ENCRYPTED DOCUMENT
Vijaya Lakshmi Chittimalli
B.Tech, Jawaharlal Nehru Technological University, 2006
PROJECT
Submitted in partial satisfaction of
the requirements for the degree of
MASTER OF SCIENCE
in
COMPUTER SCIENCE
at
CALIFORNIA STATE UNIVERSITY, SACRAMENTO
FALL
2009
KEEPING SECRETS SECRET – IMPLEMENTATION OF STEGANOGRAPHY
WITH AUDIO FILE AND ENCRYPTED DOCUMENT
A Project
by
Vijaya Lakshmi Chittimalli
Approved by:
__________________________________, Committee Chair
Dr. Isaac Ghansah
__________________________________, Second Reader
Prof. Dick Smith
____________________________
Date
Student: Vijaya Lakshmi Chittimalli
I certify that this student has met the requirements for format contained in the University
format manual, and that this project is suitable for shelving in the Library and credit is to
be awarded for the Project.
__________________________, Graduate Coordinator
Dr. Cui Zhang
Department of Computer Science
________________
Date
Abstract
of
KEEPING SECRETS SECRET – IMPLEMENTATION OF STEGANOGRAPHY
WITH AUDIO FILE AND ENCRYPTED DOCUMENT
by
Vijaya Lakshmi Chittimalli
Steganography is the art of hiding messages inside an image, audio or video file such
that the very existence of the message is unknown to a third party. Cryptography is used
to encrypt the data so that it is unreadable by a third party.
Keeping Secrets Secret is an application that combines both of these techniques to embed
a text document in an audio signal. The text document is compressed, encrypted and then
embedded into the audio file in order to achieve robustness and better performance. Users
can then send the compressed data over the network easily and securely. The major task of
this application is to let the user pass information that is encrypted according to the
specified standards and algorithms and stored in a form that is undetectable in an audio
file. The application has a reverse process, which de-embeds the data file from the audio
file and decrypts the data to its original format upon a proper request by the user. The
application is developed in the Java programming language and is compatible with the
Windows environment.
_______________________, Committee Chair
Dr. Isaac Ghansah
_______________________
Date
TABLE OF CONTENTS
List of Tables
List of Figures
Chapter
1. INTRODUCTION AND PROJECT OVERVIEW
1.1 Modules
2. STUDY OF STEGANOGRAPHY AND CRYPTOGRAPHY
2.1 Steganography
2.2 Cryptography
2.3 Comparison of Cryptography and Steganography
3. SYSTEM ANALYSIS
4. EXISTING AND PROPOSED SYSTEM
4.1 Existing System
4.2 Proposed System
5. FEASIBILITY STUDY AND SYSTEM REQUIREMENTS
5.1 Operational Feasibility
5.2 Technical Feasibility
5.3 Financial and Economical Feasibility
6. SYSTEM DESIGN
7. FEATURES OF THE LANGUAGE USED
7.1 About Java
7.2 Applications and Applets
7.3 Features of Java
7.3.1 Security
7.3.2 Portability
7.3.3 The Byte code
7.3.4 Java Virtual Machine (JVM)
7.4 Java Architecture
7.5 Characteristics of Java
7.5.1 Simple
7.5.2 Object-Oriented
7.5.3 Robust
8. SYSTEM DESCRIPTION
8.1 Compression
8.2 BlowFish Algorithm for Encryption and Decryption
8.3 Steganography (Embed and De-embed Process)
9. UML DIAGRAMS
9.1 Use Case Diagrams
9.2 Activity Diagrams
10. TESTING
11. USER GUIDE
12. CONCLUSION AND FUTURE WORK
References
LIST OF TABLES
Table 1: Case Generation Report
Table 2: Test Case Analysis
LIST OF FIGURES
Figure 1: Types of Cryptography
Figure 2: Examples of Symmetric Cryptography
Figure 3: Development process of a typical Java program
Figure 4: A bitmap with information for run-length encoding
Figure 5: DSSS Application
Figure 6: Use Case Diagrams for Sender
Figure 7: Use Case Diagrams for Receiver
Figure 8: Activity Diagram for compression, encryption and embedding
Figure 9: Activity Diagram for Un-compression, decryption and De-embedding
Figure 10: Step to initiate the compression
Figure 11: Selecting a file for compression
Figure 12: File Compression Successful
Figure 13: Step to Initiate Encryption
Figure 14: Selecting a file for Encryption
Figure 15: File encryption successful
Figure 16: Initiate the Embed Process
Figure 17: Selecting files for Embed Process
Figure 18: Embed Process Successful
Figure 19: Step to initiate the De-Embedding
Figure 20: Selecting a file for De-Embedding
Figure 21: File De Embed Successful
Figure 22: Step to Initiate Decryption
Figure 23: Selecting a file for Decryption
Figure 24: Step to initiate the compression
Figure 25: Selecting file for Un-Compression Process
Figure 26: File Un-Compression Successful
Chapter 1
INTRODUCTION AND PROJECT OVERVIEW
Information hiding has recently gained importance in various applications, one of which
is military communication. Steganography has become one of the popular approaches to
information hiding, and a recent development in this field is hiding information in
music. The embedded data should be as immune as possible to intelligent attacks and
anticipated manipulations, so the hidden messages are encrypted before they are hidden in
audio files. The goal of this project is to hide encrypted information in music.
Keeping Secrets Secret is software that alters the properties of an audio file so that an
encrypted text document can be embedded into it. The text file is first compressed, then
encrypted, and finally embedded into the audio file for maximum performance and
robustness. This allows users to carry the compressed data easily and securely. The major
task of audio steganography here is to let the user pass information that is encrypted
according to the specified standards and algorithms and stored in an undetectable form.
The application has a reverse process, which de-embeds the data file from the audio file
and decrypts the data to its original format upon a proper request by the user. While
encryption and decryption are performed, the application should satisfy the standards of
authentication and authorization of the user.
The application provides a user-friendly graphical user interface that the end user can
learn without assistance. The system provides proper navigation within the environment,
so that users have a smooth workflow. The application is designed so that, as soon as it
creates the buffer for the compressed data, it asks the user for the encryption key and
then operates according to the key provided. The key should be chosen so that it prevents
unauthorized persons from reading the encrypted information at any point in time.
For the reverse process, which is carried out at the other end, the application provides
de-embedding, decryption and uncompression; the receiver is able to decrypt and
uncompress the data only if the correct key is entered. The decryption process produces a
log while the application is processing and lists any errors that occur.
This application uses the Blowfish algorithm for encryption. Blowfish is a 64-bit block
cipher with a variable-length key (32 to 448 bits). It was chosen because it requires
little memory.
1.1 Modules:
1) Compression Module /Uncompression Module
2) Blowfish Algorithm Encryption Module/Decryption Module
3) Steganography Module (Embedding and De-embedding)
4) GUI Module
Chapter 1 gives an overview of the system and the need for it. Chapter 2 reviews the
steganographic and cryptographic techniques that form the foundation of this system.
Chapter 3 gives the analysis of the system modules, and Chapter 4 compares the existing
system with the proposed system. Chapters 5 and 6 explain the feasibility of the system,
its requirements and its design. Once the requirements and the system design are
finalized, the next task is to choose the implementation language; Chapter 7 discusses
why Java was chosen for this system and the main features of the language. Chapter 8
gives an overview of the modules in the system and explains each of them. Chapter 9
presents the UML representation of the system, which follows all the stages of the SDLC
(Software Development Life Cycle). The final test analysis of the project is given in
Chapter 10.
Chapter 2
STUDY OF STEGANOGRAPHY AND CRYPTOGRAPHY
2.1 Steganography:
As explained above, steganography is the art of concealing the existence of information
within seemingly innocuous carriers. It includes various methods of secret
communication that conceal the very existence of the message. Among these methods
are invisible inks, microdots, character arrangement (other than the cryptographic
methods of permutation and substitution), digital signatures, covert channels and
spread-spectrum communications.
“In steganography “cover” is the medium which is used to hide the secret information.
The information to be hidden can be a plain text message, a cipher text, another image, or
anything that can be represented in binary.”[1]
An image, audio file or video file usually acts as the cover that holds the information.
If an image is used as the cover, it can be altered in noisy areas with a lot of color
variation so that the alterations are less obvious. The message can also be scattered
randomly throughout the image.
Common methods of concealing data in digital images include:

Least significant bit insertion: This is a very simple method of hiding a message in
a digital image. The least significant bit (LSB) of each byte in the image is used to
store the secret data. The changes in the resulting image are too small to be recognized
by the human eye. The disadvantage of this technique is that, since it relies on the
exact value of each pixel, a lossless image format such as BMP or GIF has to be used.

Masking and filtering: “These methods hide information in a manner similar to paper
watermarks. This can be done, for example, by modifying the luminance of parts of the
image. It does change the visible properties of an image, but if done with care the
distortion is barely discernable.”[1]

Transformations: This is a more complex way of hiding information in an image, and it
is often considered the most efficient. Various algorithms and transformations are
applied to the image to hide information in it. The DCT (discrete cosine transform) is
one such method.
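The least significant bit method described above can be sketched in Java as follows. This is only an illustration under stated assumptions: the class name LsbDemo is hypothetical, the cover is treated as a plain array of bytes (pixel or sample values without any file header), and the receiver is assumed to know the message length in advance. It is not the embedding routine used by this project.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Illustrative LSB embedding over a raw byte array (hypothetical class, not the project's code). */
final class LsbDemo {

    /** Stores each message bit in the lowest bit of one cover byte, most significant bit first. */
    static byte[] embed(byte[] cover, byte[] message) {
        if (message.length * 8 > cover.length) {
            throw new IllegalArgumentException("cover too small for message");
        }
        byte[] stego = Arrays.copyOf(cover, cover.length);
        for (int i = 0; i < message.length * 8; i++) {
            int bit = (message[i / 8] >> (7 - i % 8)) & 1;
            // Clear the lowest bit of the cover byte, then set it to the message bit.
            stego[i] = (byte) ((stego[i] & 0xFE) | bit);
        }
        return stego;
    }

    /** Rebuilds messageLength bytes from the lowest bits of the stego bytes. */
    static byte[] extract(byte[] stego, int messageLength) {
        byte[] message = new byte[messageLength];
        for (int i = 0; i < messageLength * 8; i++) {
            message[i / 8] = (byte) ((message[i / 8] << 1) | (stego[i] & 1));
        }
        return message;
    }

    public static void main(String[] args) {
        byte[] cover = new byte[64];
        Arrays.fill(cover, (byte) 0x80); // stand-in for pixel or sample data
        byte[] secret = "Hi".getBytes(StandardCharsets.US_ASCII);
        byte[] stego = embed(cover, secret);
        System.out.println(new String(extract(stego, secret.length), StandardCharsets.US_ASCII));
    }
}
```

Each cover byte changes by at most one, which is why the alteration is hard to notice, and also why any lossy recompression of the cover would destroy the message.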
2.2 Cryptography:
“On the other hand cryptography is the process of conversion of data into scrambled code
that can be deciphered and sent across a public or private network”. [2]
The two main forms of encryption are symmetric and asymmetric. Symmetric algorithms use
the same key for encryption and decryption. Other names for this type of encryption are
secret-key, shared-key and private-key. Because both parties must hold the same key,
symmetric cryptography is only as secure as the channel used to share that key.
Asymmetric cryptography uses one key for encryption and a different key for decryption.
These keys are known as the public key and the private key; the private key cannot
feasibly be derived from the public key. Because no secret key has to be exchanged,
asymmetric cryptography avoids the key-distribution problem of symmetric cryptography.
The most common application of asymmetric encryption is sending messages: the sender
encrypts the message with the receiver's public key, and the receiver decrypts it with
the corresponding private key.
Types of algorithms:
Cryptographic algorithms can be categorized in several ways. One of the ways is that
they can be categorized based on the number of keys that are employed for encryption
and decryption, and further defined by their application and use. The three types of
algorithms that will be discussed, as shown in Figure 1 which is taken from [3] are:
1. Secret Key Cryptography (SKC): Uses a single key for both encryption and
decryption.
2. Public Key Cryptography (PKC): Uses one key for encryption and another for
decryption.
3. Hash Functions: Use a one-way mathematical transformation to irreversibly “encrypt” information.
Figure 1: Types of Cryptography [3]
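To make the third category concrete, the sketch below computes a hash with the standard java.security.MessageDigest class. The choice of SHA-256 and the class name HashDemo are assumptions made for illustration only; the project itself does not specify a hash function.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative one-way hash (hypothetical class; SHA-256 is an assumed choice). */
final class HashDemo {

    /** Returns the SHA-256 digest of the text as lowercase hexadecimal. */
    static String sha256Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("abc"));
    }
}
```

Unlike the two key-based categories, there is no key and no decryption step: the digest cannot be turned back into the input.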
Here Figure 2 which is taken from [3] gives the example of Symmetric cryptography:
Figure 2: Examples of Symmetric Cryptography [3]
2.3 Comparison of Cryptography and Steganography:
Cryptography and steganography are sometimes described as almost the same and sometimes
as entirely different. Both are used for secure communication, but they work in entirely
different ways. Cryptography provides privacy, whereas steganography is intended to
provide secrecy. For example, you need cryptography when you use your credit card on the
Internet, because you do not want your number revealed to the public. Although your code
may be unbreakable, anyone can see that you have sent a message. For true secrecy, you do
not want anyone to know that you are sending a message at all.
The main differences between steganography and cryptography are:
1. We can apply cryptography in the process of using steganography, but not the
other way round.
2. We need a “cover” to use steganography, but we do not need one for cryptography.
3. Cryptography: “Protects Message”, Steganography: “Hides Message”.
The use of cryptography is therefore visible, while the main motive of steganography is
to hide the fact that it is being used. Depending on the need, a combination of the two
methods can be very useful for secure transmission. This is the main motivation for
developing Keeping Secrets Secret: the message first undergoes cryptographic techniques
and then steganographic techniques for efficient and safe transmission.
Chapter 3
SYSTEM ANALYSIS
People have long tried to solve the problems of digital communication systems, but these
problems still exist. This project therefore proposes a secure and easy transfer method
that encrypts the file to be sent and hides it inside an audio file, from which the
receiver extracts and decrypts it. The file to be sent goes through both cryptographic
and steganographic techniques. The advantages of this audio steganography are:
1. High level Security
2. Cost-effective transfer
In a world where everyone is free to access information over the network, intercepting
that information is no longer a difficult task. Many organizations send information in
and out of their networks at various levels, and this information may need protection.
If an organization has this application, each employee can send information to any other
registered employee, and thus communicate and perform the prescribed tasks in a secure
fashion. The audio file that the employee sends reaches its destination quickly, and the
end user then de-embeds the hidden file, decrypts it and uncompresses it for use. The
various branches of the organization can be connected to a single host server, so that an
employee of one branch can send files to an employee of another branch through the server
in a secure format.
Chapter 4
EXISTING AND PROPOSED SYSTEM
4.1 Existing System
In a traditional client-server architecture the server is only a database server that
offers data. The majority of the business logic, such as validations, therefore has to be
placed on the client systems, which makes maintenance expensive. Such clients are called
‘fat clients’. These clients are also responsible for the security of the communication
between client and server.
Since the actual processing of the data takes place on the remote client, the data has to
be transported over the network, which requires a secure transfer method. Present-day
transactions are considered "un-trusted" in terms of security, i.e. they are relatively
easy to intercept. The lack of a secure transfer mode in the existing system is the
motivation for a new system with higher security standards for information exchange.
4.2 Proposed System
The proposed system has new features to overcome the issues mentioned above.
Transactions take place in a secure format between the various clients.
The system gives the user the flexibility to transfer data through the network easily by
compressing large files. It also identifies the user and provides communication according
to the security standards. In this system the data is sent through the network as an
audio file. The user who receives the file performs de-embedding, decryption and
uncompression, according to their level in the hierarchy, to obtain the required
information. Compressing the data improves the performance of the transfer, and
embedding the encrypted data in the audio file in such a way that it still appears to be
a normal audio file provides security while the transfer takes place.
Chapter 5
FEASIBILITY STUDY AND SYSTEM REQUIREMENTS
Feasibility Study
A feasibility study is a high-level, condensed version of the entire system analysis and
design process. The study begins by clarifying the problem definition. Its purpose is to
determine whether it is worth implementing the set of requirements with the chosen
approaches, which include the algorithms and the programming language used. Once an
acceptable problem definition has been generated, development of a logical model of the
system begins, and alternatives are analyzed carefully. There are three parts to the
feasibility study.
5.1 Operational Feasibility
Questions that arise here are:
• Will the system be used if it is developed and implemented?
• Is there sufficient support for the project from the platform on which it is
developed?
• In which test cases might the system fail, and how can those problems be overcome?
• Will this application be compatible with different platforms?
This system might face a performance issue if it embedded the data directly into the
audio file, but the compression step addresses that problem and improves performance
considerably. Developing the system in Java also helps, because Java's platform
independence addresses the compatibility issue.
5.2 Technical Feasibility
• Does the necessary technology exist to do what has been suggested?
• Does the proposed equipment have the technical capacity for using the new
system?
• Are there technical guarantees of accuracy, reliability and data security?
The above issues are addressed as follows:
• The project is developed on a Pentium IV with 256 MB RAM.
• Any platform can serve as the development environment.
• The observer pattern along with the factory pattern will update the results
eventually.
• The language used in the development is Java 1.5 in a Windows environment.
5.3 Financial and Economical Feasibility
The system, once developed and installed, will benefit the organization. It will be
developed and operated on the existing hardware and software infrastructure, so no
additional hardware or software is needed.
Software Requirements
• Operating System: Windows (Client/Server).
• Front-end: Java J2SDK 1.5, Swing.
Hardware Requirements
• System Configuration: Pentium IV Processor with 700 MHz Clock Speed,
256 MB RAM, 20 GB Hard Disk, 32 Bit PCI Ethernet Card.
Chapter 6
SYSTEM DESIGN
The proposed system is designed to provide a method by which a normal audio file can
carry important information and still sound the same as the original audio file. The
information hidden inside the audio file can be in .txt, .doc or .xls format.
To increase performance and security, the file to be embedded goes through compression
and encryption at the sending end, and through decryption and uncompression at the
receiving end.
The interface design is concerned with a user interface that follows GUI standards and
provides proper navigation. The user operates the system by selecting one of the options
provided through the GUI: compression, encryption, embedding, de-embedding, decryption,
uncompression, general information and exit.
Encryption and decryption are provided by the security services module, which carries
them out using the Blowfish algorithm.
To use this application, the user first chooses the Compression option, which accesses
the Compression Module on the back end and compresses the file. After compression is
completed, the user selects the compressed file for the Encryption option, which accesses
the Encryption Module on the back end and encrypts the file. After encryption is
completed, the user selects the encrypted file for embedding, which accesses the
Embedding Module on the back end and embeds the file into the audio file. The user at the
other end chooses the options De-Embed Files, Decrypt and Uncompress, in that order,
which access the De-Embed, Decrypt and Uncompress modules on the back end.
Chapter 7
FEATURES OF THE LANGUAGE USED
7.1 About Java
Initially the language was called “Oak”, but it was renamed “Java” in 1995. The
primary motivation for this language was the need for a platform-independent (i.e.,
architecture-neutral) language.
Some of the important aspects of Java that helped this project are:
1. Java is platform independent.
2. Java is cohesive and consistent.
3. Except for those constraints imposed by the Internet environment, Java gives the
programmer full control.
4. Java provides supporting APIs such as java.util.zip for compression and
javax.sound.sampled for handling the audio files into which data is embedded.
7.2 Applications and Applets
Java’s ability to create applets makes it important. An applet is an application designed
to be transmitted over the Internet and executed by the JVM on another machine. An applet
is a Java program that is usually used to provide a GUI; it can react to user input and
change dynamically according to that input.
7.3 Features of Java
7.3.1 Security
Every time we download a “normal” program, there is a possibility that malicious code
is downloaded along with it. Prior to Java, most users did not download executable
programs frequently, and those who did scanned them for viruses prior to execution. Java
provides security by placing a “firewall” between a networked application and your
computer. When you use a Java-compatible Web browser, downloaded applets run inside this
restricted environment, which greatly reduces the risk of virus infection or malicious
behavior.
7.3.2 Portability
Java programs can be executed on any platform that provides a Java Virtual Machine; this
platform independence is one of Java's distinguishing features.
7.3.3 The Byte code
The key that allows Java to solve the security and portability problems is that the
output of the Java compiler is byte code. “Byte code is a highly optimized set of instructions
designed to be executed by the Java run-time system, which is called the Java Virtual
Machine (JVM). That is, in its standard form, the JVM is an interpreter for byte code.
Translating a Java program into byte code helps makes it much easier to run a program in
a wide variety of environments”. [4]
7.3.4 Java Virtual Machine (JVM)
The Java Virtual Machine is an important element of the Java technology. The virtual
machine can be embedded within a web browser or an operating system. Once a piece of
Java code is loaded onto a machine, it is verified: as part of the loading process, a
class loader is invoked and byte code verification makes sure that the code generated by
the compiler will not corrupt the machine it is loaded on. Byte code verification thus
takes place after compilation, when the class is loaded, and is integral to the safe
execution of Java code.
Figure 3: Development process of a typical Java program (Java source, .java → compile → Java byte code, .class → Java Virtual Machine)
Figure 3 shows how a typical Java program is compiled to byte code and how that byte code
is executed. The first box indicates that the Java source code is located in a .java
file, which is processed by the Java compiler. The compiler produces a .class file, which
contains the byte code. The class file is then loaded, across the network or locally on
your machine, into the execution environment, the Java Virtual Machine, which interprets
and executes the byte code.
7.4 Java Architecture
Java architecture provides a portable, robust, high-performing environment for
development. Java achieves portability by compiling source code to byte code for the Java
Virtual Machine, which is then interpreted on each platform by the run-time environment.
Java is a dynamic system, able to load code when needed from the local machine or across
the network.
7.5 Characteristics of Java:
7.5.1 Simple
Java was designed to make the programmer’s life easy by providing the necessary
libraries and APIs for performing common tasks. In Java there are a small
number of clearly defined ways to accomplish a given task.
7.5.2 Object-Oriented
The object model in Java is simple and easy to extend, while simple types, such as
integers, are kept as high-performance non-objects.
7.5.3 Robust
“The ability to create robust programs was given a high priority in the design of Java.
Java is strictly typed language; it checks your code at compile time and run time.”[4]
Another important Java API used in this project is the support for audio files, which
are handled using the javax.sound.sampled package. The
Java Sound API can handle audio transport in both a streaming, buffered fashion and an
in-memory, unbuffered fashion. "Streaming" is used here in a general sense to refer to
real-time handling of audio bytes; it does not refer to the specific, well-known case of
sending audio over the Internet in a certain format. In other words, a stream of audio is
simply a continuous set of audio bytes that arrive more or less at the same rate that they
are to be handled (played, recorded, etc.). Operations on the bytes commence before all
the data has arrived. In the streaming model, particularly in the case of audio input rather
than audio output, you do not necessarily know in advance how long the sound is and
when it will finish arriving. You simply handle one buffer of audio data at a time, until
the operation is halted. In the case of audio output (playback), you also need to buffer
data if the sound you want to play is too large to fit in memory all at once. In other words,
you deliver your audio bytes to the sound engine in chunks, and it takes care of playing
each sample at the right time. Mechanisms are provided that make it easy to know how
much data to deliver in each chunk.
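The buffered handling described above can be sketched with javax.sound.sampled as follows. The class name AudioChunkDemo, the 8 kHz 16-bit mono format and the in-memory silent signal are assumptions chosen so that the example runs without an audio file or sound device; a real application would typically obtain the stream from AudioSystem.getAudioInputStream instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;

/** Illustrative chunk-by-chunk reading of audio bytes (hypothetical class). */
final class AudioChunkDemo {

    /** Reads the stream one buffer at a time and returns the total number of bytes handled. */
    static long consume(AudioInputStream in, int bufferSize) {
        if (bufferSize < in.getFormat().getFrameSize()) {
            throw new IllegalArgumentException("buffer must hold at least one frame");
        }
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buffer, 0, buffer.length)) != -1) {
                // A real application would play, record or modify buffer[0..n) here.
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed format: 8 kHz, 16-bit, mono, signed, little-endian PCM.
        AudioFormat format = new AudioFormat(8000f, 16, 1, true, false);
        byte[] pcm = new byte[16000]; // one second of silence
        AudioInputStream in = new AudioInputStream(
                new ByteArrayInputStream(pcm), format, pcm.length / format.getFrameSize());
        System.out.println(consume(in, 1024) + " bytes read");
    }
}
```

The loop never needs to know the total length of the sound in advance, which is the point of the streaming model.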
Chapter 8
SYSTEM DESCRIPTION
This system is designed as the following modules: the GUI module, the Compression
module, the Security System module, the Steganography module and the Connection
Manager module.
The GUI module deals with the design of the interface, which gives the user the
flexibility to access the file system and select the required file for transfer. It also
collects information from the user to check authorization before providing access to the
file system. The interface design also covers the services for sending and receiving
files with encryption and decryption.
The Compression module deals with compression and uncompression of the data file, so
that the file can be sent easily with minimum upload time.
The Security implementation module implements the encryption and decryption standards
used to transfer files from one system to another in a distributed environment. The
algorithm used for this purpose is Blowfish, where the user can enter a key whose length
depends on the level of encryption desired.
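A minimal sketch of Blowfish encryption and decryption with the standard javax.crypto API is given below. It rests on several assumptions that do not come from the project itself: the class name BlowfishDemo, CBC mode with PKCS5 padding, a random 8-byte IV stored in front of the ciphertext, and the use of the raw bytes of the user's passphrase as the key.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative Blowfish round trip (hypothetical class; mode and padding are assumed). */
final class BlowfishDemo {
    private static final String TRANSFORMATION = "Blowfish/CBC/PKCS5Padding";
    private static final int BLOCK_BYTES = 8; // Blowfish works on 64-bit blocks

    /** Returns IV followed by ciphertext. The key may be 4 to 56 bytes long. */
    static byte[] encrypt(byte[] key, byte[] plain) {
        try {
            byte[] iv = new byte[BLOCK_BYTES];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "Blowfish"), new IvParameterSpec(iv));
            byte[] body = cipher.doFinal(plain);
            byte[] out = Arrays.copyOf(iv, iv.length + body.length);
            System.arraycopy(body, 0, out, iv.length, body.length);
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reverses encrypt(): reads the IV from the first block and decrypts the rest. */
    static byte[] decrypt(byte[] key, byte[] data) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "Blowfish"),
                    new IvParameterSpec(data, 0, BLOCK_BYTES));
            return cipher.doFinal(data, BLOCK_BYTES, data.length - BLOCK_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII); // 128-bit key
        byte[] cipherText = encrypt(key, "secret text".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(key, cipherText), StandardCharsets.UTF_8));
    }
}
```

A longer key gives a larger key space, which corresponds to the "level of encryption" the user chooses; a production design would normally derive the key from the passphrase with a key-derivation function rather than use it directly.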
The Modules of the system are:
1) Compression (Compression and Uncompression Modules)
2) Steganography (Embed and De-embed Modules)
3) Blowfish Algorithm Implementation (Encryption and Decryption Modules)
4) GUI Module (User Interface Module)
8.1 COMPRESSION:
Most of the information transferred between the client and the server contains redundant
data; the time to transfer this information can be reduced greatly if the redundant
information and extra white space are managed properly. Managing here means that the
original file is compressed using a compression technique that removes redundancy and
other information that adds very little to the actual source content. With lossless
compression, the original file format and styling are preserved when the file is
uncompressed. The file to be transferred is compressed at the sender's end to increase
the speed of transfer and uncompressed at the receiver's end to recover the original
file.
Some of the simplest compression techniques are explained below.
Consider the string
BBBWWDDXXXXKKKKWWZZZ
This string can be encoded more compactly by replacing each run of a repeated character
with a single instance of the character and a number that represents how many times it
is repeated. The earlier string can be encoded as follows:
3B2W2D4X4K2W3Z
Here "3B" means three B's, "2W" means two W's, and so on. Compressing a string in this
way is called run-length encoding.
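Run-length encoding of a string can be sketched in Java as follows. This is a minimal illustration rather than the project's code: the class name RunLength is hypothetical, and the decoder assumes that the original text contains no digits. Note that the trailing run ZZZ encodes as 3Z.

```java
final class RunLength {
    /** Encodes each run as a count followed by the character, e.g. "BBBWW" -> "3B2W". */
    static String encode(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            int run = 1;
            // Extend the run while the next character is the same.
            while (i + run < text.length() && text.charAt(i + run) == c) {
                run++;
            }
            out.append(run).append(c);
            i += run;
        }
        return out.toString();
    }

    /** Reverses encode(); ASSUMPTION: the original text itself contains no digits. */
    static String decode(String encoded) {
        StringBuilder out = new StringBuilder();
        int count = 0;
        for (char c : encoded.toCharArray()) {
            if (Character.isDigit(c)) {
                count = count * 10 + (c - '0');   // counts may have several digits
            } else {
                for (int k = 0; k < count; k++) {
                    out.append(c);
                }
                count = 0;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String packed = encode("BBBWWDDXXXXKKKKWWZZZ");
        System.out.println(packed);          // 3B2W2D4X4K2W3Z
        System.out.println(decode(packed));  // BBBWWDDXXXXKKKKWWZZZ
    }
}
```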
As another example, consider the storage of a rectangular image. As a single color
bitmapped image, it can be stored as shown in Figure 4 taken from [5].
Figure 4: A bitmap with information for run-length encoding [5]
Another approach might be to store the image as a graphics metafile:
Rectangle 11, 3, 20, 5
“This says, the rectangle starts at coordinate (11, 3) of width 20 and length 5 pixels.
The rectangular image can be compressed with run-length encoding by counting identical
bits as follows”. [5]
0, 40
0, 40
0, 10 1, 20 0, 10
0, 10 1, 1 0, 18 1, 1 0, 10
0, 10 1, 1 0, 18 1, 1 0, 10
0, 10 1, 1 0, 18 1, 1 0, 10
0, 10 1, 20 0, 10
0, 40
The first line says that the first row of the bitmap consists of forty zeros, and the
second row has the same format. The third line says that the row starts with ten zeros,
followed by twenty ones and then ten more zeros. In this way we can get back the
original representation of the bitmap.
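The row-by-row counting described above can be sketched as follows. The class and method names are hypothetical and chosen only for illustration; each row is held as an array of 0/1 values, and every run is stored as a {value, count} pair in the same "value, count" notation used in the listing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BitmapRle {
    /** Encodes one bitmap row as a list of {value, count} runs. */
    static List<int[]> encodeRow(int[] row) {
        List<int[]> runs = new ArrayList<>();
        int i = 0;
        while (i < row.length) {
            int value = row[i];
            int count = 0;
            while (i < row.length && row[i] == value) {
                count++;
                i++;
            }
            runs.add(new int[] {value, count});
        }
        return runs;
    }

    /** Rebuilds the original row from its runs. */
    static int[] decodeRow(List<int[]> runs) {
        int total = 0;
        for (int[] run : runs) {
            total += run[1];
        }
        int[] row = new int[total];
        int pos = 0;
        for (int[] run : runs) {
            Arrays.fill(row, pos, pos + run[1], run[0]);
            pos += run[1];
        }
        return row;
    }

    /** Formats runs in the "value, count" notation, separated by spaces. */
    static String format(List<int[]> runs) {
        StringBuilder sb = new StringBuilder();
        for (int[] run : runs) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(run[0]).append(", ").append(run[1]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] row = new int[40];
        Arrays.fill(row, 10, 30, 1);                 // ten zeros, twenty ones, ten zeros
        System.out.println(format(encodeRow(row)));  // 0, 10 1, 20 0, 10
    }
}
```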
One way to implement this process of compression and uncompression is to use the
java.util.zip package.
The package provides classes to read, create, and modify files in the ZIP and GZIP
formats. It also provides utility classes for computing checksums of arbitrary input
streams, which can be used to validate input data.
Using the classes in this package we can compress the input file; the file is read and
the compressed output file is written using file streams such as FileInputStream and
FileOutputStream.
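A minimal sketch of compression and uncompression with java.util.zip is given below. It assumes the GZIP format and works on in-memory byte arrays so that it stays self-contained; whether the project used GZIP or ZIP is not stated here, and the class name GzipSketch is hypothetical. For files, the byte-array streams can be replaced with FileInputStream and FileOutputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipSketch {
    /** Compresses the given bytes into GZIP format. */
    static byte[] compress(byte[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer (CRC-32 and length).
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Restores the original bytes from GZIP data. */
    static byte[] uncompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] text = "BBBWWDDXXXXKKKKWWZZZ".getBytes(StandardCharsets.UTF_8);
        byte[] restored = uncompress(compress(text));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

Very short inputs can grow slightly because of the GZIP header and trailer; the saving appears on larger, redundant files.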
8.2 BLOWFISH ALGORITHM FOR ENCRYPTION AND DECRYPTION:
To add security to the transfer process, the compressed file is encrypted using an
encryption algorithm.
Encryption algorithms are of two types:
1. Symmetric Key and
2. Public Key.
A symmetric key algorithm such as Blowfish uses the same key for both encryption and
decryption, whereas a public key algorithm uses two keys: the public key is used for
encryption and the private key is used for decryption.
Blowfish is a symmetric block cipher with a 64-bit block size and a variable-length key
of 32 to 448 bits. Its subkeys are held in an 18-entry P-array and four S-boxes of 256
entries each. The algorithm has 16 rounds, and each round consists of a key-dependent
permutation and a key- and data-dependent substitution. All operations are XORs and
additions on 32-bit words. The only additional operations are four indexed array data
lookups per round.
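For orientation, the sketch below encrypts and decrypts a byte array with the Blowfish implementation that ships with the JDK (javax.crypto). The CBC mode, the PKCS5 padding, and the class name BlowfishSketch are assumptions made for this example and are not taken from the project; the initialization vector is 8 bytes because the block size is 64 bits, and the same key is used in both directions.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class BlowfishSketch {
    // ASSUMPTION: mode and padding are chosen for illustration only.
    private static final String TRANSFORMATION = "Blowfish/CBC/PKCS5Padding";

    static byte[] encrypt(byte[] key, byte[] iv, byte[] plain) {
        return run(Cipher.ENCRYPT_MODE, key, iv, plain);
    }

    static byte[] decrypt(byte[] key, byte[] iv, byte[] cipherText) {
        return run(Cipher.DECRYPT_MODE, key, iv, cipherText);
    }

    private static byte[] run(int mode, byte[] key, byte[] iv, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            // The same secret key serves encryption and decryption (symmetric cipher).
            cipher.init(mode, new SecretKeySpec(key, "Blowfish"), new IvParameterSpec(iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII); // 128-bit key
        byte[] iv = new byte[8];                                             // one 64-bit block
        new SecureRandom().nextBytes(iv);
        byte[] secret = encrypt(key, iv, "compressed payload".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(key, iv, secret), StandardCharsets.UTF_8));
    }
}
```

The receiver needs both the key and the initialization vector; the vector is not secret and is commonly stored in front of the ciphertext.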
The input is a 64-bit data element A. This is divided into two halves of 32 bits each,
A1 and A2. As mentioned above, the algorithm has 16 rounds, and each round goes through
the following steps (for i = 1 to 16):
A1 = A1 XOR Pi
A2 = F(A1) XOR A2
Swap A1 and A2
After the 16th round, swap A1 and A2 once more to undo the last swap. Then, A2 = A2 XOR
P17 and A1 = A1 XOR P18. Finally, recombine A1 and A2 to get the ciphertext.
Function F works as follows: divide A1 into four eight-bit quarters a, b, c, and d. Then,
F(A1) = ((S1[a] + S2[b] mod 2^32) XOR S3[c]) + S4[d] mod 2^32
Decryption is exactly the same as encryption, except that P1, P2, ..., P18 are used in
the reverse order.
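The round structure can be sketched in Java as shown below. This is only an illustration of the 16-round flow and of decryption by reversing the P-array. It is not a usable Blowfish: as a loudly stated assumption, the P-array and S-boxes are filled from a seeded pseudo-random generator, whereas real Blowfish initializes them from the digits of pi and then runs its key schedule. The final step follows the published description of the cipher, XORing the right half with P17 and the left half with P18. The class name FeistelSketch is hypothetical.

```java
import java.util.Random;

final class FeistelSketch {
    private final int[] p = new int[18];        // P1..P18 stored at p[0]..p[17]
    private final int[][] s = new int[4][256];  // S1..S4

    FeistelSketch(long seed) {
        // ASSUMPTION: placeholder subkeys, NOT the real Blowfish constants or key schedule.
        Random rng = new Random(seed);
        for (int i = 0; i < p.length; i++) {
            p[i] = rng.nextInt();
        }
        for (int[] box : s) {
            for (int j = 0; j < box.length; j++) {
                box[j] = rng.nextInt();
            }
        }
    }

    /** F splits x into four 8-bit quarters a, b, c, d and mixes four S-box lookups. */
    private int f(int x) {
        int a = (x >>> 24) & 0xFF;
        int b = (x >>> 16) & 0xFF;
        int c = (x >>> 8) & 0xFF;
        int d = x & 0xFF;
        // Java int addition wraps around, which is addition mod 2^32.
        return ((s[0][a] + s[1][b]) ^ s[2][c]) + s[3][d];
    }

    long encrypt(long block) { return crypt(block, false); }

    long decrypt(long block) { return crypt(block, true); }

    private long crypt(long block, boolean reverse) {
        int a1 = (int) (block >>> 32);  // left half
        int a2 = (int) block;           // right half
        for (int i = 0; i < 16; i++) {
            a1 ^= p[reverse ? 17 - i : i];
            a2 ^= f(a1);
            int t = a1; a1 = a2; a2 = t;    // swap the halves
        }
        int t = a1; a1 = a2; a2 = t;        // undo the last swap
        a2 ^= p[reverse ? 1 : 16];          // P17 when encrypting, P2 when decrypting
        a1 ^= p[reverse ? 0 : 17];          // P18 when encrypting, P1 when decrypting
        return ((long) a1 << 32) | (a2 & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        FeistelSketch cipher = new FeistelSketch(42L);
        long block = 0x0123456789ABCDEFL;
        System.out.println(cipher.decrypt(cipher.encrypt(block)) == block);
    }
}
```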
8.3 STEGANOGRAPHY (EMBED AND DE-EMBED PROCESS):
Steganography is the art of hiding information in ways that prevent the detection of
hidden messages. The word is derived from Greek and literally means “Covered Writing”.
It includes a vast array of secret communication methods that conceal the message’s very
existence. These methods include invisible inks, microdots, character arrangement,
digital signatures, covert channels, and spread spectrum communications.
In this technology, the end user embeds a compressed and encrypted data file into an
audio file, which acts as the carrier for the data. This keeps the data from being
visible, and hence it is secure during transmission. The user on the receiving end uses
the De-embed module to retrieve the data file from the audio file. The module deals with
identifying the hidden data in the audio file.
The two major approaches to implement this are:
Spread Spectrum is a method which works by modifying or adding random noise to the
signal; the information is spread across the carrier.
Echo data hiding is another method, which uses the echoes in sound files to hide
information. By simply adding extra sound to an echo inside an audio file,
information can be concealed.
We use one of the Spread Spectrum techniques for embedding data into the audio file.
There are two approaches in Spread Spectrum:
1. Direct Sequence Spread Spectrum (DSSS)
2. Frequency Hopped Spread Spectrum (FHSS)
We use Direct Sequence Spread Spectrum (DSSS) for embedding the data into the audio
file. The basic concept of Spread Spectrum is that the data is spread all over the
carrier by modifying the high pitch bits of the audio file. The modification is done
such that the high pitch bits of the audio file are replaced with the data file bits.
Since the data file bits occupy the high pitch audio bits, and the number of bits being
replaced is very small, it is difficult to detect the change.
Direct Sequence Spread Spectrum (DSSS):
Data to be transmitted is divided into small pieces, and each piece is allocated to a
frequency channel across the spectrum with the high pitch. A phase-varying modulation
technique is used to modulate each piece of data over the channel. Figure 5 depicts the
application of DSSS on the audio signal.
Figure 5: DSSS Application
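In the textbook formulation of DSSS, each data bit is combined with a longer pseudo-noise (PN) chip sequence, so that one bit is spread over many chips and can still be recovered if a few chips are disturbed. The sketch below shows that idea at the bit level only. It is not the embedding routine of this project and it does not modulate an audio signal; the class name, the PN sequence, and the majority-vote recovery are assumptions made for illustration.

```java
final class DsssSketch {
    /** Spreads each data bit over pn.length chips by XOR with the PN sequence. */
    static int[] spread(int[] bits, int[] pn) {
        int[] chips = new int[bits.length * pn.length];
        for (int i = 0; i < bits.length; i++) {
            for (int j = 0; j < pn.length; j++) {
                chips[i * pn.length + j] = bits[i] ^ pn[j];
            }
        }
        return chips;
    }

    /** Recovers each bit by XOR with the same PN sequence and a majority vote. */
    static int[] despread(int[] chips, int[] pn) {
        int[] bits = new int[chips.length / pn.length];
        for (int i = 0; i < bits.length; i++) {
            int ones = 0;
            for (int j = 0; j < pn.length; j++) {
                ones += chips[i * pn.length + j] ^ pn[j];
            }
            bits[i] = (2 * ones > pn.length) ? 1 : 0;
        }
        return bits;
    }

    public static void main(String[] args) {
        int[] pn = {1, 0, 1, 1, 0, 0, 1};   // hypothetical 7-chip PN sequence
        int[] data = {1, 0, 1, 1};
        int[] chips = spread(data, pn);
        chips[3] ^= 1;                      // simulate one disturbed chip
        System.out.println(java.util.Arrays.toString(despread(chips, pn)));
    }
}
```

Both ends must share the same PN sequence, which plays the role of the chosen code sequence.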
Frequency Hopped Spread Spectrum (FHSS):
The data carrier frequency is periodically modified (hopped) across a specific range of
frequencies (spreading). The shifting pattern is determined by the chosen code sequence
(FSK – Frequency Shift Keying).
Since we use DSSS for embedding the data into the file, the data to be embedded into
the audio file is split into 8-bit blocks. Whenever a high pitch node is found, 8 bits
of the audio file are replaced with the data file bits, and the node position is stored
in an array, which helps in retrieving the original data bits.
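The replacement step described in this paragraph can be sketched as follows. The sketch is a literal reading of the paragraph under stated assumptions and is not the project's implementation: the audio data is treated as a plain byte array, a "high pitch node" is modelled as a byte whose unsigned value is at least a hypothetical THRESHOLD, and the class and method names are invented for illustration.

```java
final class EmbedSketch {
    // ASSUMPTION: stand-in criterion for a "high pitch node"; the real test is not given here.
    static final int THRESHOLD = 0xC0;

    /** Returns the first `needed` carrier positions that qualify as high pitch nodes. */
    static int[] findSlots(byte[] carrier, int needed) {
        int[] slots = new int[needed];
        int found = 0;
        for (int i = 0; i < carrier.length && found < needed; i++) {
            if ((carrier[i] & 0xFF) >= THRESHOLD) {
                slots[found++] = i;
            }
        }
        if (found < needed) {
            throw new IllegalArgumentException("carrier has too few candidate positions");
        }
        return slots;
    }

    /** Replaces one carrier byte per 8-bit payload block; the carrier itself is not modified. */
    static byte[] embed(byte[] carrier, byte[] payload, int[] slots) {
        if (slots.length < payload.length) {
            throw new IllegalArgumentException("not enough slots for the payload");
        }
        byte[] stego = carrier.clone();
        for (int k = 0; k < payload.length; k++) {
            stego[slots[k]] = payload[k];
        }
        return stego;
    }

    /** Reads the payload back from the stored positions. */
    static byte[] extract(byte[] stego, int[] slots) {
        byte[] payload = new byte[slots.length];
        for (int k = 0; k < slots.length; k++) {
            payload[k] = stego[slots[k]];
        }
        return payload;
    }

    public static void main(String[] args) {
        byte[] carrier = {0x10, (byte) 0xF0, 0x20, (byte) 0xC5, (byte) 0xFF, 0x00};
        byte[] payload = {0x41, 0x42};
        int[] slots = findSlots(carrier, payload.length);
        byte[] stego = embed(carrier, payload, slots);
        System.out.println(new String(extract(stego, slots)));   // AB
    }
}
```

The stored positions are needed for retrieval because an embedded byte no longer has to satisfy the threshold, so the receiver cannot find the slots again by scanning.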
Chapter 9
UML DIAGRAMS
UML diagrams for “Keeping Secrets Secret” are given below:
9.1 Use Case Diagrams:
A Use Case diagram is a pictorial way of organizing the system requirements. Each
use case represents a requirement in the system.
Figure 6 represents the Use Case Diagram for Sender:
The requirements which fall onto the sender’s side are Compression, Encryption and
Embedding. They are depicted pictorially as shown below, where each use case acts
as a requirement.
Figure 6: Use Case Diagrams for Sender
Figure 7 represents the Use Case Diagram for Receiver:
The requirements which fall onto the receiver’s side are Un-Compression, Decryption and
De-Embedding.
Figure 7: Use Case Diagrams for Receiver
9.2 Activity Diagrams:
Activity Diagrams for Compression, Encryption and Embedding:
An activity diagram is a flow chart of a particular process. Here the activity diagram
depicts the flow between Compression, Encryption and Embedding. Figure 8
represents the activity diagram from the sender’s side.
Figure 8: Activity Diagram for compression, encryption and embedding.
Activity Diagrams for Un-compression, Decryption and De-Embedding:
Here the activity diagram depicts the flow between the Un-Compression, Decryption and
De-Embedding. Figure 9 represents the activity diagram from receiver’s side.
Figure 9: Activity Diagram for Un-compression, decryption and De-embedding.
Chapter 10
TESTING
The test procedure used in the testing process is Black box testing. Test cases are
analyzed accordingly.
Black Box Testing
This test involves manually evaluating the flow from one module to the next and checking
that the process flow is correct. The testing covers the following criteria:
• Compression process
• Encryption process
• Embedding process
• Retrieving process
• Decryption process
• Uncompression process
• Key prescription information display
• Exit process
Case Generation Report:
Test Type | Case | Expected Result
Operational / Unit / Functional Test | Compress | Receives the file to compress and saves it in the same location.
Operational / Unit / Functional Test | UnCompress | Receives the file to uncompress and saves it in the same location.
Operational / Unit / Functional Test | Encryption | Receives the file to be encrypted, encrypts it according to the key and saves it.
Operational / Unit / Functional Test | Decryption | Receives the file to be decrypted, decrypts it with the same key and saves it.
Operational / Unit / Functional Test | Exit | Ends the process.
Operational / Unit / Functional Test | Embed | Receives the encrypted file and the audio file to embed into, and saves the result in the same location.
Operational / Unit / Functional Test | Retrieve | Receives the folder containing the audio file, retrieves the encrypted file and saves it in the same folder.
Table 1: Case Generation Report
Test Case Analysis:
Test Type | Expected Result | Observed Result | Remarks
decrypted | Receive data, decrypt and save | Path failure | Address of the file is corrected
Table 2: Test Case Analysis
All the above validations on buttons have been verified and they are successfully
executed. The flow is tested at different possible conditions by means of this testing.
Chapter 11
USER GUIDE
Step 1: The first step is to compress the text file which is done using the compression
module. Select the “Compress” option from drop down menu. Figure 10 represents the
step of initiating the compression.
Figure 10: Initiating the compression
Step 2: Once the “Compress” option is selected a new window is opened where in the
path to the text file which has to be compressed must be given. Once the path is selected
hit the “Compress” button on the window. Figure 11 represents the process of selecting a
file for compression.
Figure 11: Selecting a file for compression
Once the compress button is hit the file is compressed successfully and an alert pops up
which says “File Compressed Successfully”. The compressed file is placed in the same
folder where the original text file is present with the same filename but different
extension which is .cmp. Figure 12 shows the prompt which says file compression is
successful.
Figure 12: File Compression Successful
Step 3: Once the compression is done, this compressed file has to be encrypted using the
encryption module. Select “Encrypt” option from the drop down menu. Figure 13
represents the step for initiating the Encryption process.
Figure 13: Step to Initiate Encryption.
Step 4: Once the “Encrypt” option is selected a new window is opened where in the path
to the compressed file which has to be encrypted must be selected. Once the path is
selected hit the “Encrypt” button on the window. Figure 14 represents the step for
selecting already compressed file for encryption.
Figure 14: Selecting a file for Encryption.
Once the “Encrypt” button is hit the file is encrypted successfully and a message is
displayed on the right hand side log of the window saying “File Encrypted Successfully”.
The Encrypted file is placed in the same folder where the original compressed file is
present with the same filename but different extension which is .enc. Figure 15 shows the
prompt which says file Encryption is successful.
Figure 15: File encryption successful.
Step 5: The next step is to embed the encrypted file which is done using the “Embed”
module. Select the “Embed” option from drop down menu. Figure 16 shows the step to
initiate the Embed process.
Figure 16: Initiate the Embed Process.
Step 6: Once the “Embed” option is selected a new window is opened where in the path
to the Audio file and the path to the encrypted file must be selected. Once the paths are
selected hit the “Embed” button on the window. Figure 17 shows the prompt where an
Audio file and already encrypted file will have to be selected for embedding process.
Figure 17: Selecting files for Embed Process.
Once the “Embed” button is hit, the embed process runs and an alert pops up
which says “Embed Process Complete”. The new audio file is placed in the same folder
as the original audio file. Figure 18 represents the prompt which says Embed Process
Successful.
Figure 18: Embed Process Successful.
Reversal Process:
Step 1: The first step in the reversal process is to de embed the encrypted file from the
audio file which is done using the De embed module. Select the “De-embed” option from
drop down menu. Figure 19 represents the step of initiating the De-Embedding.
Figure 19: Step to initiate the De-Embedding
Step 2: Once the “De-embed” option is selected a new window is opened where in the
path to the audio file in which the encrypted file is hidden must be selected. Once the
path is selected hit the “De-Embed” button on the window. Figure 20 represents a prompt
where a file has to be selected for De-Embedding.
Figure 20: Selecting a file for De-Embedding.
Once the “De-Embed” button is hit an alert pops up which says “De-Embed Process
Completed”. Once this process is completed successfully an encrypted file is placed in
the same folder where the Audio file is present. The file has an extension .enc. Figure 21
represents the prompt which says “De-Embed Process Completed”.
Figure 21: File De Embed Successful
Step 3: The next step is to decrypt the encrypted file which is produced in the previous
step. Select the “Decrypt” option from drop down menu. Figure 22 represents the step to
initiate the Decryption process.
Figure 22: Step to Initiate Decryption
Step 4: Once the “Decrypt” option is selected a new window is opened where the path
to the encrypted file must be browsed. Once the path is selected hit the “Decrypt” button
on the window. Figure 23 shows the prompt for selecting a file for decryption.
Figure 23: Selecting a file for Decryption.
Step 5: The next step is to uncompress the text file. Select the “Un-Compress” option
from drop down menu. Figure 24 represents the process of initiating the Un-Compression
process.
Figure 24: Step to initiate the Un-Compression.
Step 6: Once the “Un-Compress” option is selected a new window is opened where in the
path to the text file which has to be uncompressed must be selected. Once the path is
selected hit the “Un-Compress” button on the window. Figure 25 represents the prompt
where a file will have to be selected for un-compression.
Figure 25: Selecting file for Un-Compression Process.
Once the “Un-Compress” button is hit the file is uncompressed successfully and an alert
pops up which says “File Un-Compressed Successfully”. The uncompressed file is placed in
the same folder as the original compressed file. Figure 26 represents the prompt which
says “File Un-Compression Successful”.
Figure 26: File Un-Compression Successful
Chapter 12
CONCLUSION AND FUTURE WORK
Keeping Secrets Secret is an application which alters the Audio file properties to allow
the encrypted text document to be embedded into it. As mentioned earlier, the main
purpose of building an application like this is for secure communication over the
network. The different modules in the system like Compression, Encryption and
Embedding serve the above purpose. Both Security and Efficiency are achieved through
these modules. Efficiency is achieved by Compression; since a whole file is used over
here, compressing the original file would speed up the whole process and also increases
the scope of using large file sizes. Security is achieved using Cryptography and
Steganography.
The main strength of this system is its high level of security: the information is first
encrypted, then hidden, and then transmitted. Algorithms like DSSS (Direct Sequence
Spread Spectrum) also add to the system’s strengths. It was a challenging task to
implement those methods practically, make them work, and check that the process holds
up under all conditions. The main drawback of this system is that the file to be
embedded must be smaller than the file in which it is embedded.
Developing a tool like this has been a great experience as the Author got to see
various other areas in the Security field and various methods to perform a particular task.
The Author also got a chance to discover and learn new programming techniques through
this project.
With reference to Future Work, a new module can be added to the present system
for communication over network, where a file can be sent from one system to another by
taking a parameter like a system IP address or emailing the file to a particular account
when the email address is provided. Also this system is limited to using steganography
for hiding the information. Steganalysis which is the process of detecting the presence of
steganography can be another module which can be added to the present system. The
above two aspects can be added to the present application to further enhance it.
REFERENCES
[1] Sonali Guptha. (2005, April). All About Steganography: How it Works? [Online].
Available: http://palisade.plynt.com/issues/2005Apr/steganography/
[2] IT Security Notes and References for CISSP candidates. [Online]. Available:
http://www.barcodesinc.com/articles/cryptography2.htm
[3] Gary C. Kessler. (2009, August). An Overview of Cryptography: Cryptographic
Algorithms. [Online]. Available: http://www.garykessler.net/library/crypto.html#intro
[4] Patrick Naughton and Herbert Schildt, “Java 2: The Complete Reference”. Tata
McGraw Hill, 1999, pp. 534-536.
[5] Qusay H. Mahmoud. (2002, February). Compressing and Decompressing Data Using
Java APIs. [Online]. Available:
http://java.sun.com/developer/technicalArticles/Programming/compression/
[6] John E. Hershey, “Cryptography Demystified”. Tata McGraw Hill, 2004, pp. 56-180.
[7] Rich Helton and Johennie Helton, “Mastering Java Security”. Wiley Dreamtech,
2002, pp. 10-140.
[8] Wei Qin Cheng, Fei Han, Man Juon Tung, Kai Xu, “Robust Audio Steganography
using Direct-Sequence Spread Spectrum Technology”, 2007.
[9] Jeff England, “Audio Steganography, Echo Data Hiding”, EE6886.
[10] Nick Sterling, Sarah Wahl, Sarah Summers, “Spread Spectrum Steganography”.
[11] Roger S. Pressman, “Software Engineering”. Tata McGraw Hill, 2004, pp. 50-70.
[12] Package javax.sound.sampled. [Online]. Available:
http://www.j2ee.me/j2se/1.4.2/docs/api/javax/sound/sampled/package-summary.html