- the Journal of Information, Knowledge and Research in

advertisement
JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN
ELECTRONICS AND COMMUNICATION ENGINEERING
STEGANOGRAPHY THROUGH DIGITAL IMAGE
PROCESSING TECHNIQUE
1 PALLAVI
KHARE, 2 JAIKARAN SINGH, 3 MUKESH TIWARI
1,2,3
Dept. of E&TC, SSSIST Bhopal, India
Pallavi3386@gmail.com
ABSTRACT: This paper presents a proposal for communication which is major part of daily life of today. We
are living an age of science. . With the increasing communication traffic demand, data security has become very
important field. Lots of data security and data hiding algorithms have been developed in the last decade. we are
implementing a method of “Digital Stenography” for image data hiding. Digital Image Steganography system
allows an average user to securely transfer text messages by hiding them in a digital image file. A combination
of Steganography and encryption algorithms provides a strong backbone for its security. Digital Image
Steganography system features innovative techniques for hiding text in a digital image file or even using it as a
key to the encryption.
KEYWORDS: Steganography, Digital Stenography, Encryption ,Digitization Of
Steganography.
1. INTRODUCTION
The word steganography is of Greek origin and
means "covered, or hidden writing". Steganography
is the art and science of writing hidden messages in
such a way that no one apart from the sender and
intended recipient even realizes there is a hidden
message. Generally, a steganographic message will
appear to be something else: a picture, an article, a
shopping list, or some other message.
A
steganographic message (the plaintext) is often first
encrypted by some traditional means, producing a
cipher text.
Digital image steganography system is a stand-alone
application that combines steganography and
encryption to enhance the confidentiality of intended
message. The user’s intended message is first
encrypted to create unintelligible cipher text. Then
the cipher text will be hidden within an image file in
such a way as to minimize the perceived loss in
quality. The recipient of the image is able to retrieve
the hidden message back from the image with this
system.
1.1 ENCODING
In order to hide text in an image, the user must
provide the text and a Target Image in which it is to
be hidden. Optionally, users may enter or load a text
key. When a user key is not provided, a default key is
used. To make choosing images easier, an image bar
is provided for the user. The user may set the source
directory, whose contents are displayed as thumbnails
in the image bar located on the bottom panel of
application. These thumbnails are automatically
generated from the specified source directory. The
user then selects a target image and loads it for use as
the Target Image. The user may also open the Target
Image by selecting Open Image under File menu. The
Image, JPEG
user may enter the text to be embedded in several
ways. He or she may type it directly into the text
window, open a text file, or paste text from another
application. The user may also open a text file by
selecting Open Text under the File menu or use the
Edit menu to paste from the clipboard. Loading a text
key can be done from the Open Text Key selection
on the Tools menu. When user has specified both the
target image, and text body, Digital image
steganography system is ready to hide the text body
within the image. The user may then select Embed
Text from the Tools menu. The key image will then
be hashed into an appropriate Key Value. This value
will be used for the encryption of the user’s plaintext
to produce cipher text. The cipher text and key value
will then be input into the steganography and bit
placement algorithms, and the Output Image will be
generated. After the application has generated the
image, the user may then inspect it by selecting the
Image tab on the main panel. The user is then able to
compare the encrypted target image with the original
target image located on the right box of the main
window. The user may wish to save the generated
image by selecting Save Image or Save Image As…
under File menu.
1.2 DECODING
In order to retrieve the hidden text from a source
image, the user needs to provide a Source Image and
any Keys used when the Source Image was
generated. Loading the Source Image file is done
similarly to the process of opening a Target Image. If
the sender has used the default key, the recipient need
not load any keys to the application. If a non-default
key was used in text hiding process, the receiving
party must have prearranged knowledge of the key
for use in retrieving the text. This key could be text
ISSN: 0975 – 6779| NOV 10 TO OCT 11 | VOLUME – 01, ISSUE - 02
Page 137
JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN
ELECTRONICS AND COMMUNICATION ENGINEERING
file or a key string. If the key is a text file, the user
may load the key by selecting Open Text Key from
the Tools menu. Lastly, the user may simply type in
the key by activating the Text tab of the key panel
and entering it into the text box there. After the
Source Image and any required Keys are loaded into
the application, the hidden text can be retrieved by
selecting Extract Text from the Tools menu. The key
will be hashed into an appropriate Key Value. This
key value will be used to recover the hidden cipher
text from the Source Image. The key will then be
used for the decryption of the cipher text to produce
plaintext. The plaintext will finally be displayed in
the text box on the Text tab of the main panel. After
the application has generated the text, the user may
wish to save the generated text by selecting Save
Text under the File menu. The application is not able
to identify the presence of the hidden message in
images until the actual decoding process is attempted,
as the image files are virtually indistinguishable from
normal images.
2. DIGITAL IMAGE DEFINITIONS
A digital image a[m,n] described in a 2D
discrete space is derived from an analog image a(x,y)
in a 2D continuous space through a sampling process
that is frequently referred to as digitization. The
effect of digitization is shown in Figure 1. The 2D
continuous image a(x,y) is divided into N rows and
M columns. The intersection of a row and a column
is termed a pixel. The value assigned to the integer
coordinates [m,n] with {m=0,1,2,...,M-1} and
{n=0,1,2,...,N-1} is a[m,n]. In fact, in most cases
a(x,y)--which we might consider to be the physical
signal that impinges on the face of a 2D sensor--is
actually a function of many variables including depth
(z), color (λ), and time (t).. The image shown in
Figure 1 has been divided into N = 16 rows and M =
16 columns. The value assigned to every pixel is the
average brightness in the pixel rounded to the nearest
integer value. The process of representing the
amplitude of the 2D signal at a given coordinate as an
integer value with L different gray levels is usually
referred to as amplitude quantization.
Fig. 1. Digitization of a continuous image
Common Values: There are standard values for the various parameters
encountered in digital image processing. These
values can be caused by video standards, by
algorithmic requirements, or by the desire to keep
digital circuitry simple. Table 1 gives some
commonly encountered values.
Parameter
Symbol
Typical values
Rows
N
Columns
M
Gray Levels
L
256, 512 , 525 ,625 ,
1024, 1035
256, 512, 768, 1024,
1320
2, 64, 256, 1024, 4096,
16384
Table 2.1 Common values of digital image
parameters
Quite frequently we see cases of M=N=2K where {K
= 8,9,10}. The number of distinct gray levels is
usually a power of 2, that is, L=2B where B is the
number of bits in the binary representation of the
brightness levels. When B>1 we speak of a gray-level
image; when B=1 we speak of a binary image.
3. IMAGE DEFINITION
To a computer, an image is a collection of numbers
that constitute different light intensities in different
areas of the image. This numeric representation
forms a grid and the individual points are referred to
as pixels. Most images on the Internet consists of a
rectangular map of the image’s pixels (represented as
bits) where each pixel is located and its colour. These
pixels are displayed horizontally row by row.
The number of bits in a colour scheme, called the bit
depth, refers to the number of bits used for each
pixel. The smallest bit depth in current colour
schemes is 8, meaning that there are 8 bits used to
describe the colour of each pixel. Monochrome and
grey scale images use 8 bits for each pixel and are
able to display 256 different colours or shades of
grey. Digital colour images are typically stored in 24bit files and use the RGB colour model, also known
as true colour. All colour variations for the pixels of a
24 bit image are derived from three primary colours:
red, green and blue, and each primary colour is
represented by 8 bits. Thus in one given pixel, there
can be 256 different quantities of red, green and blue,
adding up to more than 16-million combinations,
resulting in more than 16-million colours. Not
surprisingly the larger amount of colours that can be
displayed, the larger the file size.
When working with larger images of greater bit
depth, the images tend to become too large to
transmit over a standard Internet connection. In order
to display an image in a reasonable amount of time,
techniques must be incorporated to reduce the
image’s file size. These techniques make use of
mathematical formulas to analyze and condense
image data, resulting in smaller file sizes. This
process is called compression.
ISSN: 0975 – 6779| NOV 10 TO OCT 11 | VOLUME – 01, ISSUE - 02
Page 138
JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN
ELECTRONICS AND COMMUNICATION ENGINEERING
In images there are two types of compression: lossy
and lossless. Both methods save storage space, but
the procedures that they implement differ. Lossy
compression creates smaller files by discarding
excess image data from the original image. It
removes details that are too small for the human eye
to differentiate, resulting in close approximations of
the original image, although not an exact duplicate.
An example of an image format that uses this
compression technique is JPEG (Joint Photographic
Experts Group).
Lossless compression, on the other hand, never
removes any information from the original image, but
instead represents data in mathematical formulas. The
original image’s integrity is maintained and the
decompressed image output is bit-by-bit identical to
the original image input. The most popular image
formats that use lossless compression is GIF
(Graphical Interchange Format) and 8-bit BMP (a
Microsoft Windows bitmap file).
Compression plays a very important role in choosing
which steganographic algorithm to use. Lossy
compression techniques result in smaller image file
sizes, but it increases the possibility that the
embedded message may be partly lost due to the fact
that excess image data will be removed. Lossless
compression though, keeps the original digital image
intact without the chance of lost, although is does not
compress the image to such a small file size.
Different steganographic algorithms have been
developed for both of these compression types and
are explained in the following sections.
3.1Image and Transform Domain
Image steganography techniques can be divided into
two groups: those in the Image Domain and those in
the Transform Domain. Image – also known as
spatial – domain techniques embed messages in the
intensity of the pixels directly, while for transform –
also known as frequency – domain, images are first
transformed and then the message is embedded in the
image.
Image domain techniques encompass bit-wise
methods that apply bit insertion and noise
manipulation and are sometimes characterized as
“simple systems”. The image formats that are most
suitable for image domain steganography are lossless
and the techniques are typically dependent on the
image format.
Steganography in the transform domain involves the
manipulation of algorithms and image transforms.
These methods hide messages in more significant
areas of the cover image, making it more robust.
Many transform domain methods are independent of
the image format and the embedded message may
survive conversion between lossy and lossless
compression.
In the next sections I have explained steganographic
algorithms in categories according to image file
formats and the domain in which they are performed.
Steganography
Text
Images
Transform Domain
Audio/Video
Image Domain
Patchwork
JPEG
LSB in
BMP& GIF
Spread Spectrum
Fig. 2 Categories of Image Steganography
4. IMAGE DOMAIN
Least significant bit (LSB) insertion is a
common, simple approach to embedding information
in a cover image. The least significant bit (in other
words, the 8th bit) of some or all of the bytes inside
an image is changed to a bit of the secret message.
When using a 24-bit image, a bit of each of the red,
green and blue colour components can be used, since
they are each represented by a byte. In other words,
one can store 3 bits in each pixel. An 800 × 600 pixel
image, can thus store a total amount of 1,440,000 bits
or 180,000 bytes of embedded data. For example a
grid for 3 pixels of a 24-bit image can be as follows:
(00101101 00011100 11011100)
(10100110 11000100 00001100)
(11010010 10101101 01100011)
When the number 200, which binary
representation is 11001000, is embedded into the
least significant bits of this part of the image, the
resulting grid is as follows:
(00101101 00011101 11011100)
(10100110 11000101 00001100)
(11010010 10101100 01100011)
Although the number was embedded into the first 8
bytes of the grid, only the 3 underlined bits needed to
be changed according to the embedded message. On
average, only half of the bits in an image will need to
be modified to hide a secret message using the
maximum cover size. Since there are 256 possible
ISSN: 0975 – 6779| NOV 10 TO OCT 11 | VOLUME – 01, ISSUE - 02
Page 139
JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN
ELECTRONICS AND COMMUNICATION ENGINEERING
intensities of each primary colour, changing the LSB
of a pixel results in small changes in the intensity of
the colours. These changes cannot be perceived by
the human eye - thus the message is successfully
hidden. With a well-chosen image, one can even hide
the message in the least as well as second to least
significant bit and still not see the difference.
In the above example, consecutive bytes of the image
data – from the first byte to the end of the message –
are used to embed the information. This approach is
very easy to detect. A slightly more secure system is
for the sender and receiver to share a secret key that
specifies only certain pixels to be changed. Should an
adversary suspect that LSB steganography has been
used, he has no way of knowing which pixels to
target without the secret key.
In its simplest form, LSB makes use of BMP images,
since they use lossless compression. Unfortunately to
be able to hide a secret message inside a BMP file,
one would require a very large cover image.
Nowadays, BMP images of 800 × 600 pixels are not
often used on the Internet and might arouse
suspicion. For this reason, LSB steganography has
also been developed for use with other image file
formats.
4.1LSB and Palette Based Images
Palette based images, for example GIF images, are
another popular image file format commonly used on
the Internet. By definition a GIF image cannot have a
bit depth greater than 8, thus the maximum number
of colours that a GIF can store is 256. GIF images are
indexed images where the colours used in the image
are stored in a palette, sometimes referred to as a
colour lookup table. Each pixel is represented as a
single byte and the pixel data is an index to the colour
palette. The colours of the palette are typically
ordered from the most used colour to the least used
colours to reduce lookup time.
GIF images can also be used for LSB steganography,
although extra care should be taken. The problem
with the palette approach used with GIF images is
that should one change the least significant bit of a
pixel, it can result in a completely different colour
since the index to the colour palette is changed. If
adjacent palette entries are similar, there might be
little or no noticeable change, but should the adjacent
palette entries be very dissimilar, the change would
be evident. One possible solution is to sort the palette
so that the colour differences between consecutive
colours are minimized. Another solution is to add
new colours which are visually similar to the existing
colours in the palette. This requires the original
image to have less unique colours than the maximum
number of colours (this value depends on the bit
depth used). Using this approach, one should thus
carefully choose the right cover image. Unfortunately
any tampering with the palette of an indexed image
leaves a very clear signature, making it easier to
detect.
A final solution to the problem is to use grey scale
images. In an 8-bit grey scale GIF image, there are
256 different shades of grey. The changes between
the colours are very gradual, making it harder to
detect.
5. TRANSFORM DOMAIN
To understand the steganography algorithms that
can be used when embedding data in the transform
domain, one must first explain the type of file format
connected with this domain. The JPEG file format is
the most popular image file format on the Internet,
because of the small size of the images.
5.1 JPEG compression
To compress an image into JPEG format, the RGB
colour representation is first converted to a YUV
representation. In this representation the Y
component corresponds to the luminance (or
brightness) and the U and V components stand for
chrominance (or colour). According to research the
human eye is more sensitive to changes in the
brightness (luminance) of a pixel than to changes in
its color. This fact is exploited by the JPEG
compression by down sampling the color data to
reduce the size of the file. The colour components (U
and V are halved in horizontal and vertical directions,
thus decreasing the file size by a factor of 2.
The next step is the actual transformation of the
image. For JPEG, the Discrete Cosine Transform
(DCT) is used, but similar transforms are for example
the Discrete Fourier Transform (DFT). These
mathematical transforms convert the pixels in such a
way as to give the effect of “spreading” the location
of the pixel values over part of the image . The DCT
transforms a signal from an image representation into
a frequency representation, by grouping the pixels
into 8 × 8 pixel blocks and transforming the pixel
blocks into 64 DCT coefficients each. A modification
of a single DCT coefficient will affect all 64 image
pixels in that block.
The next step is the quantization phase of the
compression. Here another biological property of the
human eye is exploited: The human eye is fairly good
at spotting small differences in brightness over a
relatively large area, but not so good as to distinguish
between different strengths in high frequency
brightness. This means that the strength of higher
frequencies can be diminished, without changing the
appearance of the image. JPEG does this by dividing
all the values in a block by a quantization coefficient.
The results are rounded to integer values and the
coefficients are encoded using Huffman coding to
further reduce the size.
5.2 JPEG steganography
Originally it was thought that steganography
would not be possible to use with JPEG images, since
they use lossy compression which results in parts of
the image data being altered. One of the major
characteristics of steganography is the fact that
information is hidden in the redundant bits of an
object and since redundant bits are left out when
ISSN: 0975 – 6779| NOV 10 TO OCT 11 | VOLUME – 01, ISSUE - 02
Page 140
JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN
ELECTRONICS AND COMMUNICATION ENGINEERING
using JPEG it was feared that the hidden message
would be destroyed. Even if one could somehow
keep the message intact it would be difficult to
embed the message without the changes being
noticeable because of the harsh compression applied.
However, properties of the compression
algorithm have been exploited in order to develop a
steganographic algorithm for JPEGs.
One of these properties of JPEG is exploited to
make the changes to the image invisible to the human
eye. During the DCT transformation phase of the
compression algorithm, rounding errors occur in the
coefficient data that are not noticeable. Although this
property is what classifies the algorithm as being
lossy, this property can also be used to hide
messages.
It is neither feasible nor possible to embed
information in an image that uses lossy compression,
since the compression would destroy all information
in the process. Thus it is important to recognize that
the JPEG compression algorithm is actually divided
into lossy and lossless stages. The DCT and the
quantization phase form part of the lossy stage, while
the Huffman encoding used to further compress the
data is lossless. Steganography can take place
between these two stages. Using the same principles
of LSB insertion the message can be embedded into
the least significant bits of the coefficients before
applying the Huffman encoding. By embedding the
information at this stage, in the transform domain, it
is extremely difficult to detect, since it is not in the
visual domain.
6. IMAGE OR TRANSFORM DOMAIN
6.1Patchwork
Patchwork is a statistical technique that uses
redundant pattern encoding to embed a message in an
image. The algorithm adds redundancy to the hidden
information and then scatters it throughout the image.
A pseudorandom generator is used to select two areas
of the image (or patches), patch A and patch B. All
the pixels in patch A is lightened while the pixels in
patch B are darkened. In other words the intensities
of the pixels in the one patch are increased by a
constant value, while the pixels of the other patch are
decreased with the same constant value. The contrast
changes in this patch subset encodes one bit and the
changes are typically small and imperceptible, while
not changing the average luminosity.
A disadvantage of the patchwork approach is
that only one bit is embedded. One can embed more
bits by first dividing the image into sub-images and
applying the embedding to each of them. The
advantage of using this technique is that the secret
message is distributed over the entire image, so
should one patch be destroyed, the others may still
survive. This however, depends on the message size,
since the message can only be repeated throughout
the image if it is small enough. If the message is too
big, it can only be embedded once.
The patchwork approach is used independent of
the host image and proves to be quite robust as the
hidden message can survive conversion between
lossy and lossless compression.
6.2 Spread Spectrum
In spread spectrum techniques, hidden data is
spread throughout the cover-image making it harder
to detect. A system proposed by Marvel et al.
combines spread spectrum communication, error
control coding and image processing to hide
information in images.
Spread spectrum communication can be defined
as the process of spreading the bandwidth of a
narrowband signal across a wide band of frequencies.
This can be accomplished by adjusting the
narrowband waveform with a wideband waveform,
such as white noise. After spreading, the energy of
the narrowband signal in any one frequency band is
low and therefore difficult to detect. In spread
spectrum image steganography the message is
embedded in noise and then combined with the cover
image to produce the stego image. Since the power of
the embedded signal is much lower than the power of
the cover image, the embedded image is not
perceptible to the human eye or by computer analysis
without access to the original image.
7. LIMITATION AND FUTURE SCOPE
Digital Image Steganography system allows an
average user to securely transfer text messages by
hiding them in a digital image file. A combination of
Steganography and encryption algorithms provides a
strong backbone for its security. Digital Image
Steganography system features innovative techniques
for hiding text in a digital image file or even using it
as a key to the encryption.
Digital Image Steganography system allows a user to
securely transfer a text message by hiding it in a
digital image file. 128 bit AES encryption is used to
protect the content of the text message even if its
presence were to be detected. Currently, no methods
are known for breaking this kind of encryption within
a reasonable period of time (i.e., a couple of years).
Additionally, compression is used to maximize the
space available in an image.
8.REFERENCES
[1]
N . Provos, “Defending Against Statistical
Steganography,” Proc 10th
USENEX Security
Symposium 2005.
[2]
N . Provos and P. Honeyman, “Hide and
Seek: An introduction to Steganography,” IEEE
Security & Privacy Journal 2003.
[3]
S . Katzenbeisser and Petitcolas ,
”Information Hiding Techniques for Stenography and
Digital watermaking” Artech House, Norwood, MA.
2000 .
[4]
L. Reyzen And S. Russell , “More efficient
provably secure Steganography” 2007.
ISSN: 0975 – 6779| NOV 10 TO OCT 11 | VOLUME – 01, ISSUE - 02
Page 141
JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN
ELECTRONICS AND COMMUNICATION ENGINEERING
[5]
S.Lyu and H. Farid , “Steganography using
higher order image statistics , “ IEEE Trans. Inf.
Forens. Secur. 2006.
[6]
Venkatraman , s, Abraham , A . & Paprzycki
M.” Significance of Steganography on Data Security
“ , Proceedings of the International Conference on
Information Technology : Coding and computing ,
2004.
[7]
Fridrich , J ., Goljan M., and Hogea , D ;
New Methodology for Breaking stenographic
Techniques for JPEGs. “ Electronic Imaging 2003”.
[8]
http:/ aakash.ece.ucsb.edu./ data hiding /
stegdemo.aspx.Ucsb
data
hiding
online
demonstration . Released on Mar .09,2005.
[9]
Mitsugu
Iwanmoto
and
Hirosuke
Yamamoto, “The Optimal n-out-of-n Visual Secret
Sharing Scheme for GrayScale Images”, IEICE
Trans. Fundamentals, vol.E85-A, No.10, October
2002, pp. 2238-2247.
[10]
Doron Shaked, Nur Arad, Andrew Fitzhugh,
Irwin Sobel, “Color Diffusion: Error Diffusion for
Color Halftones”, HP Laboratories Israel, May 1999.
[11]
Z.Zhou, G.R.Arce, and G.Di Crescenzo,
“Halftone Visual Cryptography”, IEEE Tans. On
Image Processing,vol.15, No.8, August 2006, pp.
2441-2453.
[12]
M.Naor
and
A.Shamir,
“Visual
Cryptography”, in Proceedings of Eurocrypt 1994,
lecture notes in computer science, 1994, vol.950, pp.
1-12.
[13]
Robert Ulichney, “The void-and-cluster
method for dither array generation”, IS&T/SPIE
Symposium on Electronic Imaging and Science, San
Jose, CA, 1993, vol.1913, pp.332-343.
ISSN: 0975 – 6779| NOV 10 TO OCT 11 | VOLUME – 01, ISSUE - 02
Page 142
Download