RESEARCH PAPER ON ENHANCING DATA
COMPRESSION RATE USING
STEGANOGRAPHY
Tamanna Garg
School of Computer Science & Engineering
Bahra University, Shimla Hills, India.
garg.tamanna1@gmail.com
Sonia Vatta
School of Computer Science & Engineering
Bahra University, Shimla Hills, India.
soniavatta@yahoo.com
ABSTRACT- This paper describes a compression algorithm based on steganography. The algorithm has been used to develop an
application that helps users hide large text documents inside small images. With the developed compression application,
the maximum number of bits hidden per pixel can be increased to eight. After the data is hidden inside an image, there
appears to be no visible distortion at all. The application is also compatible with all document and image formats, and it
automatically converts the output stego image to bmp format.
Keywords- steganography, cryptography, LSB, embedding, extraction, secret key, compression.
I. INTRODUCTION
In the contemporary era there is a dire need to convey confidential information secretly. Steganography serves this
purpose by hiding information inside a carrier. In other words, steganography hides a document in a carrier in such a way
that no one can even judge that hidden information exists.
Image steganography is very prominent nowadays. Hiding a small amount of information in small or large images is an easy
task, but hiding a large amount of information in small images is very complicated. This research work describes a
compression algorithm that has been used to design an application capable of hiding large documents in small images
without any visible changes. The developed compression algorithm is capable of hiding up to eight bits of data per pixel.
The primary objective of steganography is to avoid drawing attention to the transmission of hidden information. The basic
terms used in steganography systems are the cover message, the secret message, the secret key and the embedding algorithm.
In this research work, the embedding algorithm is the compression algorithm. The cover message is the carrier of the
message, such as an image, video, audio, text or some other digital media. Here the carrier is an image. The secret message
is the information that needs to be hidden in the suitable digital media. The secret information in this work is a text
document in any format. The secret key is usually used to encrypt the message for more security. The embedding algorithm is
the method used to embed the secret information in the cover message.
In steganography, before the hiding process starts, the sender must select an appropriate message carrier, the message to
be hidden and a secret key used as a password. A robust steganography algorithm that can embed the message effectively
must also be selected. The sender then sends the hidden message to the receiver using any modern communication technique.
After receiving it, the receiver recovers the hidden message using the extraction algorithm and the secret key.
Figure 1: General Steganography Approach
II. REVIEW OF LITERATURE
In any field, the literature review provides strong support in finding the questions on which to carry out research. The
review of literature reveals that further investigation in the field is required. Many research developments have therefore
been taken into consideration in relation to this work.
James C. Judge, in his work 'Steganography: Past, Present, Future', stated that steganography is the term applied to any
number of processes that will hide a message within an object, where the hidden message will not be apparent to an
observer [1]. Muhalim bin Mohamed Amin et al., in their work 'Information Hiding Using Steganography', put forward a system
that enhances the compression rate using the LSB technique by randomly dispersing the bits of the message in the image.
This technique makes it harder for unauthorized people to extract the original message [2]. T. Morkel et al., in their work
'An Overview of Image Steganography', asserted that different applications have different requirements of the
steganography technique used. For example, some applications may require absolute invisibility of the secret information,
while others require a larger secret message to be hidden [3].
In another study, entitled 'An Overview of Steganography', Shawn D. Dickman stated that steganography is a useful tool
that allows covert transmission of information over an overt communications channel [4]. Namita Tiwari et al., in
'Evaluation of Various LSB based methods of Image Steganography on GIF File Format', noted that many different carrier
file formats can be used, but digital images are the most popular because of their frequency on the Internet [5]. Yongzhen
Zheng et al., in their work 'Identification of Steganography Software based on Core Instructions Template Matching',
proposed an approach based on the principles of the LSB replacement steganography algorithm that identifies steganography
software by core instructions template matching [6]. Dipesh Agrawal and Samidha Diwedi, in their research 'Analysis of
random bit image steganography techniques', propounded that many steganography techniques can be used to hide a secret
message in an image, such as least significant bit (LSB) substitution and layout management schemes that replace only the
1's or only the 0's in the lower nibble of a byte [7].
Saddaf Rubab and Dr. M. Younus, in their project 'Improved Image Steganography Technique for Colored Images using
Huffman Encoding with Symlet Wavelets', presented a newly devised algorithm to hide text in a colored image of any size
using Huffman encoding and a 2D wavelet transform. The results showed negligible degradation of image quality. The
algorithm gives more capacity for larger image sizes, enhances security and preserves the image quality. By inserting
Huffman codes into the three components of the colored image, the scheme becomes more complicated [8]. Shamim Ahmed Laskar
and Kattamanchi Hemachandran, in their work 'High Capacity data hiding using LSB Steganography and Encryption', proposed a
high capacity data embedding approach that combines steganography and cryptography. The combination of these two methods
enhances the security of the embedded data. The main objective of that work was to provide resistance against visual and
statistical attacks as well as high capacity [9].
Hemalatha S et al., in their project 'A Secure and High Capacity Image Steganography Technique', provided a novel image
steganography technique to hide multiple secret images and keys in a color cover image using the Integer Wavelet Transform
(IWT). The disadvantage of the approach, however, is that it is susceptible to noise if spatial domain techniques are used
to hide the key [10]. Elham Ghasemi et al., in their work 'High Capacity Image Steganography Based on Genetic Algorithm and
Wavelet Transform', described the application of the wavelet transform and a genetic algorithm (GA) in a novel
steganography scheme. A GA-based mapping function is employed to embed data in discrete wavelet transform coefficients in
4*4 blocks of the cover image. The optimal pixel adjustment process (OPAP) is applied after embedding the message. That
work introduced a novel steganography technique to increase the capacity and the imperceptibility of the image after
embedding [11].
Rahul Jain and Naresh Kumar, in their research 'Efficient data hiding scheme using lossless data compression and image
steganography', described a data hiding scheme using image steganography and compression. The embedding capacity of the
image is improved by preprocessing the secret message with a lossless data compression technique. This preprocessing
reduces the size of the secret data by a significant amount and thus permits more data in the same image [12]. Prashant
Dahake, in his work 'An Efficient Encryption Using Data Compression towards Steganography', stated that compactness is
achieved using a data compression technique, namely arithmetic coding. In the proposed system, additional security is
provided by an encryption technique, which may use any cryptographic algorithm and is applied to the compressed data [13].
The above study first gave a general definition of steganography and then described the work done by researchers to
enhance the quality as well as the size of the data hidden in digital media.
Close observation showed that there was a problem in hiding large text information in a small image. The aim was therefore
to develop a technique that could enhance the compression rate in order to hide large information in small images.
III. OBJECTIVES
The main goal of this research work is to enhance the data compression rate by designing a compression algorithm and
applying it to bmp images to facilitate the hiding of large text in an image. This project has the following objectives:
- To explore techniques of hiding data using the encryption module of this project.
- To explore techniques of retrieving secret data using the decryption module.
- To design a compression algorithm.
- To enhance the data compression rate by using the designed algorithm.
- To create a tool that can be used to hide data inside a 24-bit colored image.
IV. OLD TECHNIQUE & PROPOSED TECHNIQUE
1. OLD TECHNIQUE
The old technique was based on the LSB algorithm. LSB (Least Significant Bit) substitution is the process of adjusting the
least significant bits of the pixels of the carrier image. It is a simple approach for embedding a message into the image.
Least significant bit insertion varies according to the number of bits per pixel in an image. For an 8-bit image, the least
significant bit, i.e. the 8th bit of each byte of the image, is changed to a bit of the secret message. For a 24-bit image,
the least significant bit of each color component (red, green and blue) is changed. LSB works well with BMP images because
BMP stores pixel data losslessly. However, hiding a secret message inside a BMP file using the LSB algorithm requires a
large cover image. LSB substitution is also possible for the GIF format, but the problem with a GIF image is that changing
the least significant bit of a palette index can select a completely different color. The problem can be avoided by using
only grayscale GIF images, as a grayscale palette contains 256 shades that change gradually, so the changes are very hard
to detect. For JPEG, direct substitution in the pixel values is not possible because JPEG uses lossy compression, so LSB
substitution is applied to the quantized transform coefficients rather than to the pixels. There are many approaches
available for hiding data within an image; one of the simple least significant bit substitution approaches is the "Optimum
Pixel Adjustment Procedure" (OPAP). The simple OPAP algorithm explains the procedure of hiding sample text in an image.
Step 1: A few least significant bits (LSBs) are substituted with the data to be hidden.
Step 2: The pixel values are then adjusted, keeping the hidden bits in place, to minimize the error with respect to the
cover image.
Step 3: Let n LSBs be substituted in each pixel.
Step 4: Let d = decimal value of the pixel after the substitution, d1 = decimal value of the last n bits of the pixel before
the substitution, and d2 = decimal value of the n bits hidden in that pixel.
Step 5: If |d1 - d2| <= (2^n)/2, then no adjustment is made in that pixel.
Else
Step 6: If (d1 < d2), d = d - 2^n. If (d1 > d2), d = d + 2^n.
This "d" is converted to binary and written back to the pixel.
This method of substitution is simple, makes the data easy to retrieve, gives better image quality and provides enhanced
security.
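The adjustment steps above can be sketched in Java, the language in which the system is implemented. This is a minimal sketch and not the authors' code: the class name OpapSketch and the method embed are hypothetical, the comparison in Step 5 is read as an absolute difference, d1 is taken from the pixel before substitution, and the 0-255 range check (which skips an adjustment that would leave the valid range of an 8-bit component) is an added assumption.

```java
final class OpapSketch {
    /**
     * Hides the n low bits of data in an 8-bit pixel value and applies
     * the adjustment of Steps 5-6. Returns the new pixel value.
     */
    static int embed(int pixel, int data, int n) {
        if (pixel < 0 || pixel > 255 || n < 1 || n > 8) {
            throw new IllegalArgumentException("pixel must be 0..255 and n must be 1..8");
        }
        int mask = (1 << n) - 1;
        int d1 = pixel & mask;            // last n bits before substitution
        int d2 = data & mask;             // n bits to hide
        int d = (pixel & ~mask) | d2;     // pixel value after plain LSB substitution
        int half = (1 << n) / 2;
        if (Math.abs(d1 - d2) > half) {
            // Moving by 2^n leaves the n hidden bits unchanged but brings
            // the value closer to the original pixel.
            int adjusted = (d1 < d2) ? d - (1 << n) : d + (1 << n);
            if (adjusted >= 0 && adjusted <= 255) {  // assumption: stay in range
                d = adjusted;
            }
        }
        return d;
    }
}
```

For example, hiding the three bits 111 in a pixel of value 16 gives 23 after plain substitution and 15 after the adjustment, so the error against the cover pixel drops from 7 to 1 while the low three bits still read 111.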
Figure 2: General LSB technique
2. PROPOSED TECHNIQUE
The proposed algorithm is basically an extension of the original LSB technique, which is quite vulnerable. Instead of
hiding data in the least significant bits of the RGB components of a pixel, the data is hidden as shown below.
Let the data to be hidden be the word “ABC”.
ASCII code of A = 65 and the corresponding binary is 01000001.
ASCII code of B = 66 and the corresponding binary is 01000010.
ASCII code of C = 67 and the corresponding binary is 01000011.
Let the first pixel’s RGB component be: -
The red component of the first pixel is replaced with the binary of 65, i.e. A.
Let the second pixel’s RGB component be: -
The green component of the second pixel is replaced with the binary of 66, i.e. B.
Let the third pixel’s RGB component be: -
The blue component of the third pixel is replaced with the binary of 67, i.e. C.
The process continues until all the pixels are exhausted.
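The “ABC” walk-through can be sketched in Java as follows. This is an illustration under stated assumptions, not the authors' implementation: pixels are assumed to be packed as 0xRRGGBB integers (the layout returned by java.awt.image.BufferedImage.getRGB, ignoring alpha), and the names ComponentReplaceSketch, embed and extract are hypothetical.

```java
final class ComponentReplaceSketch {
    // Bit offsets of the red, green and blue bytes in a packed 0xRRGGBB value.
    private static final int[] SHIFTS = {16, 8, 0};

    /** Replaces one whole colour component per pixel with one message byte. */
    static int[] embed(int[] pixels, byte[] message) {
        if (message.length > pixels.length) {
            throw new IllegalArgumentException("message does not fit in the image");
        }
        int[] out = pixels.clone();
        for (int i = 0; i < message.length; i++) {
            int shift = SHIFTS[i % 3];               // red, green, blue, red, ...
            int cleared = out[i] & ~(0xFF << shift); // wipe the chosen component
            out[i] = cleared | ((message[i] & 0xFF) << shift);
        }
        return out;
    }

    /** Reads back length bytes from the same component positions. */
    static byte[] extract(int[] pixels, int length) {
        byte[] message = new byte[length];
        for (int i = 0; i < length; i++) {
            message[i] = (byte) ((pixels[i] >> SHIFTS[i % 3]) & 0xFF);
        }
        return message;
    }
}
```

With the three pixels 0x112233, 0x445566 and 0x778899 and the message “ABC”, the sketch yields 0x412233, 0x444266 and 0x778843: the red byte of the first pixel becomes 0x41 (65), the green byte of the second becomes 0x42 (66) and the blue byte of the third becomes 0x43 (67). Replacing a whole component in this way changes the colour noticeably, which is consistent with the distortion the text describes.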
The stego image obtained after the algorithm completes its execution is distorted, and it is easy to detect that some
kind of alteration has been made to the image. So, to enhance the security of the secret message, the resulting stego image
is covered with a new cover image; this is the first level of security. By just looking at the resulting image, no one
would be able to tell that something is hidden inside it. The new cover image can be the same as or different from the
original.
In order to increase the storage capacity of the image, a compression algorithm has been used. Each component of an RGB
pixel is represented with 8 bits, so the maximum compression is 8 bits per pixel and the minimum is 1 bit per pixel.
The proposed steganography algorithm comprises two techniques: a data hiding technique and a data retrieving technique.
The data hiding technique, as the name suggests, is used to hide the secret message and key in the cover image, while the
data retrieving technique is used to retrieve the key and the hidden secret message from the stego image. The data is thus
protected in the image without being revealed to an unauthorized party.
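The 1 to 8 bits per pixel range implies a simple capacity check, sketched below in Java. The paper does not define how its compression ratio is computed, so this sketch assumes it is the smallest number of bits per pixel that lets the whole message fit; the names CapacitySketch, bitsPerPixelNeeded and maxMessageBytes are hypothetical.

```java
final class CapacitySketch {
    static final int MAX_BITS_PER_PIXEL = 8;

    /**
     * Smallest number of bits per pixel (1..8) that holds the message,
     * or -1 if the message does not fit even at 8 bits per pixel.
     */
    static int bitsPerPixelNeeded(long messageBytes, long pixelCount) {
        if (messageBytes < 0 || pixelCount <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        long messageBits = messageBytes * 8;
        // Ceiling division, with a floor of 1 bit per pixel.
        long needed = Math.max(1, (messageBits + pixelCount - 1) / pixelCount);
        return needed <= MAX_BITS_PER_PIXEL ? (int) needed : -1;
    }

    /** Largest message, in bytes, that fits at 8 bits per pixel. */
    static long maxMessageBytes(long pixelCount) {
        return pixelCount * MAX_BITS_PER_PIXEL / 8;
    }
}
```

Under this reading, an image needs at least as many pixels as the message has bytes; a 100-byte message in a 300-pixel image, for instance, needs 3 bits per pixel.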
A. Proposed embedding technique.
Inputs: - Text file, cover image 1, cover image 2 and secret key.
Output: - Stego image.
Begin
1. Select a text file, convert it into binary form and calculate the number of bits in it.
2. Select a carrier image (cover image 1) for hiding purpose, find the number of pixels, convert it into an RGB image and
   call the compression function.
3. If the bits calculated are compatible with the image resolution, then
   Start sub iteration 1
      Replace the red component of the first pixel with the first character.
      Replace the green component of the second pixel with the second character.
      Replace the blue component of the third pixel with the third character.
      Repeat the iterations until the pixels are exhausted.
   Stop sub iteration 1
   Else
      Repeat sub iteration 1
      Find the necessary compression ratio and perform sub iteration 2.
   Sub iteration 2
      Replace the necessary bits, as defined by the compression ratio, in the immediate component of each pixel.
      Store the information about the bits embedded in a binary address file.
   Stop sub iteration 2
4. Provide a security key to encrypt the data for better security.
5. Select the 2nd cover image to hide the distorted stego image.
End
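One possible reading of sub iteration 2 is sketched below in Java. The paper does not specify which bits of a component are replaced or how they are ordered, so this sketch assumes that k bits of the message (1 to 8, most significant bit first) overwrite the low k bits of one colour component per pixel, cycling red, green, blue. Pixels are assumed to be packed 0xRRGGBB integers, and the names KBitEmbedSketch and embed are hypothetical. The key-based encryption of step 4, the binary address file and the second cover image of step 5 are not modelled.

```java
final class KBitEmbedSketch {
    // Bit offsets of the red, green and blue bytes in a packed 0xRRGGBB value.
    private static final int[] SHIFTS = {16, 8, 0};

    /** Writes k message bits into the low k bits of one component per pixel. */
    static int[] embed(int[] pixels, byte[] message, int k) {
        if (k < 1 || k > 8) {
            throw new IllegalArgumentException("k must be 1..8");
        }
        long totalBits = message.length * 8L;
        if (totalBits > (long) pixels.length * k) {
            throw new IllegalArgumentException("message does not fit at k bits per pixel");
        }
        int[] out = pixels.clone();
        int mask = (1 << k) - 1;
        long bitPos = 0;
        for (int p = 0; bitPos < totalBits; p++) {
            int chunk = 0;
            for (int b = 0; b < k; b++, bitPos++) {
                int bit = 0;                       // pad the last chunk with zeros
                if (bitPos < totalBits) {
                    int index = (int) (bitPos / 8);
                    bit = (message[index] >> (7 - (int) (bitPos % 8))) & 1;
                }
                chunk = (chunk << 1) | bit;
            }
            int shift = SHIFTS[p % 3];             // red, green, blue, red, ...
            out[p] = (out[p] & ~(mask << shift)) | (chunk << shift);
        }
        return out;
    }
}
```

With k = 8 this reduces to the whole-component replacement of sub iteration 1. The receiver has to know k and the message length, which is presumably the role of the stored information about the embedded bits.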
B. Proposed Extraction technique.
Input: - Stego image and secret key.
Output: - Secret text file.
Begin
1. Browse the stego image.
2. Choose the folder in which you want to extract the hidden text file.
3. Provide the necessary security key.
4. Convert the binary file into human readable form.
End
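Step 4 of the extraction can be sketched in Java under explicit assumptions, since the paper does not describe the bit layout: the sender is assumed to have written k bits per pixel (1 to 8, most significant bit first) into the low k bits of one colour component per pixel, cycling red, green, blue, in packed 0xRRGGBB pixels, and the receiver is assumed to know k and the message length. The names KBitExtractSketch and extract are hypothetical, and key checking and decryption are not modelled.

```java
final class KBitExtractSketch {
    // Bit offsets of the red, green and blue bytes in a packed 0xRRGGBB value.
    private static final int[] SHIFTS = {16, 8, 0};

    /** Rebuilds messageLength bytes from the low k bits of one component per pixel. */
    static byte[] extract(int[] stegoPixels, int messageLength, int k) {
        if (k < 1 || k > 8) {
            throw new IllegalArgumentException("k must be 1..8");
        }
        long totalBits = messageLength * 8L;
        if (totalBits > (long) stegoPixels.length * k) {
            throw new IllegalArgumentException("image too small for that message length");
        }
        byte[] message = new byte[messageLength];
        int mask = (1 << k) - 1;
        long bitPos = 0;
        for (int p = 0; bitPos < totalBits; p++) {
            int chunk = (stegoPixels[p] >> SHIFTS[p % 3]) & mask;
            // Unpack the chunk most significant bit first; ignore padding bits.
            for (int b = k - 1; b >= 0 && bitPos < totalBits; b--, bitPos++) {
                int bit = (chunk >> b) & 1;
                message[(int) (bitPos / 8)] |= (byte) (bit << (7 - (int) (bitPos % 8)));
            }
        }
        return message;
    }
}
```

The recovered bytes can then be written to the chosen folder, or decoded with new String(bytes, StandardCharsets.US_ASCII) when the hidden file is plain text.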
The main focus of the proposed steganography technique is to hide text files in images, compress the text files so as to
increase the overall storage capacity, apply a secret key to the resulting stego image and transfer the secret message
without any vulnerability or threat.
Figure 3: General Layout of Proposed System.
This system is able to maintain the accuracy and confidentiality of the data. It hides text files in images using a
secret key and is also able to retrieve the data from the stego image.
V. IMPLEMENTATION OF SYSTEM
The system has been developed in Java. It comprises two main interfaces, one for the embedding process and the other for
the extraction process.
Overview of System
The embedding form is shown below:
Figure 4: Embedding form of application
The embedding form shown above comprises three main browsing fields: one for the text file to be embedded, a second for
the image in which the file will be embedded and a third for the cover image that hides the underlying distortion. One
important point to note is that the cover file may or may not be the same as the one used for the hiding process. After
filling in these fields, the next step is to check the encryption checkbox. The user need not worry about the underlying
compression procedure, which is performed automatically by the system. The user then needs to provide the secret key twice
for verification; various validations are applied here. The secret key is embedded inside the image along with the text
file. Once the data has been keyed in and the secret key has been entered, the new stego image can be saved to a different
image location. The user can then send the new stego image via the internet or email to other parties without revealing
the secret data inside the image. If the other parties want to extract the hidden data, they need to upload the stego image
to the system and provide the secret key to retrieve the text file hidden inside the image.
The extraction form is shown below:
Figure 5: Extraction form of application
VI. RESULTS
The system is tested using the images shown in Figures 6-8.
Example 1
Figure 6a: Original image (.jpg)
Figure 6b: Cover image (.jpg)
Figure 6c: Stego image (.jpg.bmp)
Figure 6a shows the original image before the message is stored in it. Figure 6b shows the cover image. It should be noted
that the original image and the cover image are exactly the same and both have the extension .jpg. The resulting stego
image has the double extension .jpg.bmp. The stego image shown in Figure 6c does not have any changes noticeable to the
naked eye.
Example 2
Figure 7a: Original image (.jpg)
Figure 7b: Cover image (.jpg)
Figure 7c: Stego image (.jpg.bmp)
In this example the original image and the cover image, Figures 7a and 7b respectively, are exactly the same and both have
the extension .jpg. The resulting stego image, shown in Figure 7c, does not have any noticeable changes and has the
extension .jpg.bmp.
Example 3
Figure 8a: Original image (.jpg)
Figure 8b: Cover image (.jpg)
Figure 8c: Stego image (.jpg.bmp)
In this example the original image and the cover image, Figures 8a and 8b respectively, are exactly the same and both have
the extension .jpg. The resulting stego image, shown in Figure 8c, does not have any noticeable changes and has the
extension .jpg.bmp.
What happens here is that the data is embedded inside the original image using the proposed algorithm, but the images
obtained after the embedding process are distorted, so to overcome this limitation the distorted image is covered using a
cover image.
Using the proposed algorithm, testing was done on images of several sizes to see the various sizes of data that can be
stored in an image.
Table 1: Comparison of different file sizes in jpg format.
Sr. no.  Original image  Text file  Cover image   Stego image       Embedding  Extraction
         size (.jpg)     size       size (.jpg)   size (.jpg.bmp)
1        289 KB          216 KB     289 KB        289 KB            Done       Done
2        201 KB          345 KB     201 KB        201 KB            Done       Done
3        172 KB          556 KB     172 KB        172 KB            Done       Done
4        151 KB          818 KB     151 KB        151 KB            Done       Done
5        134 KB          1.10 MB    134 KB        134 KB            Done       Done
6        126 KB          1.50 MB    126 KB        126 KB            Done       Done
7        114 KB          1.71 MB    114 KB        114 KB            Done       Done
8        73 KB           1.30 MB    73 KB         73 KB             Done       Done
9        38 KB           1.10 MB    38 KB         38 KB             Done       Done
10       27 KB           1.50 MB    27 KB         27 KB             Done       Done
After embedding the text file, the developed application automatically converts the image into bmp format.
Table 2: Text file formats supported by our system.
Text file format  Embedding  Extraction
.txt              Done       Done
.docx             Done       Done
.pdf              Done       Done
.ppt              Done       Done
.cpp              Done       Done
VII. CONCLUSION AND SUGGESTIONS
After the analysis and development of a steganography application with a compression capability, which was not present in
earlier steganography applications, it is concluded that the designed application works well even with large documents
because of that capability. The application can be used to hide a large document in a small image by increasing the
maximum number of bits hidden per pixel. The developed application can hide up to eight bits per pixel through its
compression technique. Even after the compression is applied, the generated stego image is free from any visible changes.
Steganography will continue to increase in popularity over cryptography as it gets more and more advanced, as will the
steganalysis tools for detecting it. At present, most tools can detect files hidden in an image, but small sentences and
one-word answers like ‘yes’ are virtually impossible to find. There also seem to be very few tools for hiding data in
videos. There are some available for audio, but this is still an area that lags behind image steganography. The future may
see audio files and video streams that can be decoded on the fly to form their correct messages.
REFERENCES
[1] James C. Judge, "Steganography: Past, Present, Future", GSEC Version 1.2f, SANS Institute, 2001.
[2] M.M. Amin, M. Salleh, S. Ibrahim, M.R. Katmin, "Information Hiding Using Steganography", 4th National Conference on
Telecommunication Technology Proceeding 2003 (NCTT2003), Concorde Hotel, Shah Alam, Selangor, 14-15 January 2003,
pp. 21-25.
[3] T. Morkel, J.H.P. Eloff and M.S. Olivier, "An Overview of Image Steganography", in Proceedings of the Fifth Annual
Information Security South Africa Conference (ISSA2005), Sandton, South Africa, June/July 2005 (published
electronically).
[4] Shawn D. Dickman, "An Overview of Steganography", James Madison University Infosec Tech Report, July 2007,
JMUINFOSEC-TR-2007-002.
[5] Namita Tiwari, Dr. Madhu Shandilya, "Evaluation of Various LSB based methods of Image Steganography on GIF File
Format", International Journal of Computer Applications (0975-8887), Volume 6, No. 2, September 2010.
[6] Yongzhen Zheng, Fenlin Liu, Xiangyang Luo, Chunfang Yang, "Identification of Steganography Software based on Core
Instructions Template Matching", in Multimedia Information Networking and Security (MINES), 2012 Fourth International
Conference on, Shanghai, 4-6 Nov. 2011.
[7] Dipesh Agrawal and Samidha Diwedi, "Analysis of random bit image steganography techniques", IJCA Proceedings on
International Conference on Recent Trends in Engineering & Technology 2013 (ICRTET), New York, USA, 1-4, May 2013.
[8] Saddaf Rubab, Dr. M. Younus, "Improved Image Steganography Technique for Colored Images using Huffman Encoding with
Symlet Wavelets", IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 2, No. 1, March 2012.
[9] Shamim Ahmed Laskar and Kattamanchi Hemachandran, "High Capacity data hiding using LSB Steganography and Encryption",
International Journal of Database Management Systems (IJDMS), Vol. 4, No. 6, December 2012.
[10] Hemalatha S, U Dinesh Acharya, Renuka A and Priya R. Kamath, "A Secure and High Capacity Image Steganography
Technique", Signal & Image Processing: An International Journal (SIPIJ), Vol. 4, No. 1, February 2013.
[11] Elham Ghasemi, Jamshid Shanbehzadeh and Nima Fassihi, "High Capacity Image Steganography Based on Genetic Algorithm
and Wavelet Transform".
[12] Rahul Jain and Naresh Kumar, "Efficient data hiding scheme using lossless data compression and image steganography",
International Journal of Engineering Science and Technology (IJEST).
[13] Prashant Dahake, "An efficient encryption using data compression towards steganography".