Project Report on Steganography

Project Report on
Implementation of Message Encryption/Decryption
Techniques
Work carried at
Defence Terrain Research Laboratory (DTRL)
Defence Research & Development Organisation (DRDO)
Under the supervision of
Ms Geeta Gupta
Scientist
DTRL, DRDO
Metcalf House, Delhi-54
Submitted by:
Neelam Aggarwal (1151153108)
TABLE OF CONTENTS
Page no.
DECLARATION..................................................................................... 6
CERTIFICATE.........................................................................................7
ACKNOWLEDGEMENT......................................................................... 8
ABSTRACT............................................................................................ 9
1. INTRODUCTION..............................................................................10
1.1 PURPOSE.............................................................................11
1.2 SOFTWARE & HARDWARE REQUIREMENTS........................12
2. DIGITAL IMAGE PROCESSING.........................................................13
2.1 HISTORY..............................................................................16
2.2 IMAGE SAMPLING & QUANTIZATION.................................17
2.3 IMAGE REPRESENTATION....................................................20
2.4 IMAGE OPERATIONS...........................................................22
2.5 IMAGE FORMATS................................................................28
3. MESSAGE ENCODING TECHNIQUES...............................................32
3.1 STEGANOGRAPHY...............................................................32
3.2 ALGORITHMS.......................................................................40
3.2.1 LSB...........................................................................40
3.2.2 DCT..........................................................................43
3.3 APPLICATIONS.....................................................................46
4. SOFTWARE PARADIGM APPLIED....................................................47
4.1 WATERFALL MODEL............................................................47
4.2 SOFTWARE REQUIREMENTS SPECIFICATION......................47
4.3 DESIGN................................................................................48
4.4. TESTING..............................................................................48
5. INTERFACE LAYOUTS AND RESULTS..............................................51
6. CONCLUSION..................................................................................64
7. SCOPE AND LIMITATIONS...............................................................66
8. FUTURE SCOPE...............................................................................67
9. REFERENCES...................................................................................68
List of Figures
Page no.
1) Figure 2.1 Standard Processing of Space Borne Images .......... 14
2) Figure 2.2 Continuous Image .......... 17
3) Figure 2.3 A plot of amplitude values along the line of continuous image .......... 18
4) Figure 2.4 Sampling of the continuous image .......... 18
5) Figure 2.5 Quantization of sampled image .......... 18
6) Figure 2.6 a) Continuous image projected onto a sensor array; b) Result of image sampling and quantization .......... 20
7) Figure 2.7 Coordinate convention to represent digital images .......... 21
8) Figure 2.8 Array .......... 21
9) Figure 2.9 Array .......... 22
10) Figure 2.10 Low contrast to high contrast .......... 23
11) Figure 2.11 Blurring of Image .......... 25
12) Figure 2.12 Sharpened Image .......... 26
13) Figure 2.13 Image and its histogram .......... 27
14) Figure 3.1 Steganography .......... 32
15) Figure 3.2 Modern Steganography .......... 33
16) Figure 3.3 Block Diagram .......... 34
17) Figure 3.4 Steganography in practice .......... 34
18) Figure 5.1 Interface for loading an image .......... 51
19) Figure 5.2 Dialog Box for image file selection .......... 52
20) Figure 5.3 Loaded Image .......... 53
21) Figure 5.4 Interface for image operations .......... 54
22) Figure 5.5 Blurred Image .......... 55
23) Figure 5.6 Contrast Image .......... 56
24) Figure 5.7 Gray Scale Image .......... 57
25) Figure 5.8 Inverted Image .......... 58
26) Figure 5.9 Sharpened Image .......... 59
27) Figure 5.10 Image before Encoding .......... 60
28) Figure 5.11 Image after Encoding .......... 61
29) Figure 5.12 Interface when a wrong password is entered .......... 62
30) Figure 5.13 Message decoded with correct password .......... 63
Declaration
I hereby declare that this submission is my own work and that, to the best of my knowledge and belief, it contains no material previously published or written by another person, nor material which to a substantial extent has been accepted for the award of any other degree of the university.
NEELAM AGGARWAL
(B.Tech, IT)
DATED:
Certificate
This is to certify that the project report entitled “Implementation of Message Encoding Algorithm”, submitted by Neelam Aggarwal, a student of B.Tech (IT), 4th year, Bharti Vidyapeeth’s College of Engineering, is a record of the candidate’s own work carried out by her under my supervision. The matter embodied in this project report is original and has not been submitted for the award of any other degree.
Date :
Mrs. Vanita Jain
(HOD of IT dept.)
Acknowledgement
I owe a great many thanks to the many people who helped and supported me during this project.
My deepest thanks to the members of DTRL for guiding and correcting me through the completion of the project. They helped at every stage and made the necessary corrections as and when needed. I express my thanks to them for extending their support.
I would also like to thank my institution and my faculty members, without whom this project would have remained a distant dream. I also extend my heartfelt thanks to my well-wishers.
Abstract
Encrypting data has been the most popular approach for protecting information, but this protection can be broken with enough computational power. An alternative approach is to hide the data by making it look like something else, so that only the intended receiver would realize its true content. In particular, if the data is hidden in an image, then everyone else would view it simply as a picture, while the receiver could still retrieve the true information. This technique is often called data hiding or steganography. To implement steganography, the images, which are collections of pixels, should be in a proper format; image processing is used to convert the chosen image into that format.
Image processing usually refers to digital image processing. Image processing is any form of signal processing for which the input is an image, such as a photograph or video frame; the output may be either an image or a set of characteristics or parameters related to the image. The basic operations performed on images include contrast enhancement, gray scale conversion, inversion of the image, etc.
1) INTRODUCTION
1.1) Introduction:
Our goal is to build a simple application that is able to send and
receive encrypted messages embedded inside images. The user is
able to choose the image he wants and the program must tell if this
image will suit the text or not. No pixel deformation or size distortion
is allowed. TIF images may suffer slight size increments or
decrements, but we will get to that later. The user can set a different
password for every message he sends, which will enable the
manager to transmit the same image to two groups, but with two
different passwords and two different messages. Encrypting data has
been the most popular approach for protecting information but this
protection can be broken with enough computational power. An
alternative approach to encrypting data would be to hide it by making this information look like something else. In this way only the concerned receiver would realize its true content. In particular, if the data is hidden inside an image then everyone else would view it as a picture. At the same time the receiver could still retrieve the true information. This technique is often called data hiding or steganography. For implementing steganography, the images, which are collections of pixels, should be in a proper format. For this purpose, image processing is done to convert the required image into a proper format.
Image processing usually refers to digital image processing. Image
processing is any form of signal processing for which the input is an
image, such as a photograph or video frame.
1.2) Purpose
The purpose of this project is to build software through which we can perform basic image operations on the desired images and also encrypt or hide messages for the purpose of security. Through steganography we hide the messages themselves. Cryptography, by contrast, was created as a technique for securing the secrecy of communication, and many different methods have been developed to encrypt and decrypt data in order to keep the message secret. Unfortunately it is sometimes not enough to keep the contents of a message secret; it may also be necessary to keep the existence of the message secret. The technique used to achieve this is called steganography.
Steganography is the art and science of invisible communication. This is accomplished by hiding information in other information, thus hiding the very existence of the communicated information. In this sense image steganography offers protection that cryptography alone does not. The purpose of image processing here is to improve the quality of an image so that the required operations can easily be performed on it. Image steganography is performed on image formats which are suitable for it.
One can use this software to perform simple image operations on images and to encrypt the desired message that is to be sent to another person, thereby preserving its security. In short, this software combines image operations with message encryption.
1.3) Software and Hardware Requirements:
Hardware:
Processor: Intel(R) Pentium(R) D CPU, 2.66 GHz
RAM: 512 MB
Operating System: Windows XP
Software:
Front End: NetBeans IDE 6.5
JDK: 1.6.0_26
2) Digital Image Processing:
Digital image processing is the use of algorithms to perform image
processing of digital images. As a subcategory or field of digital signal
processing, digital image processing has many advantages over
analog image processing. It allows a much wider range of algorithms
to be applied to the input data and can avoid problems such as the
build-up of noise and signal distortion during processing. Since
images are defined over two dimensions (perhaps more) digital
image processing may be modelled in the form of multidimensional
systems.
With the fast computers and signal processors that became available in the 2000s, digital image processing has become the most common form of image processing and is generally used because it is not only the most versatile method, but also the cheapest.
Digital image processing technology for medical applications was
inducted into the Space Foundation Space Technology Hall of Fame
in 1994.
Image processing in its broadest sense is an umbrella term for
representing and analyzing of data in visual form. More narrowly,
image processing is the manipulation of numeric data contained in a
digital image for the purpose of enhancing its visual appearance.
Through image processing, faded pictures can be enhanced, medical
images clarified, and satellite photographs calibrated. Image
processing software can also translate numeric information into
visual images that can be edited, enhanced, filtered, or animated in
order to reveal relationships previously not apparent. Image
analysis, in contrast, involves collecting data from digital images in
the form of measurements that can then be analyzed and
transformed.
Originally developed for space exploration and biomedicine, digital
image processing and analysis are now used in a wide range of
industrial, artistic, and educational applications. Software for image
processing and analysis is widely available on all major computer
platforms. This software supports the modern adage that "a picture
is worth a thousand words, but an image is worth a thousand
pictures."
Fig 2.1 Standard processing of space borne images
Each of the pixels that represent an image stored inside a computer has a pixel value which describes how bright that pixel is, and/or what color it should be. In the simplest case of binary images, the pixel value is a 1-bit number indicating either foreground or background. For a gray scale image, the pixel value is a single number that represents the brightness of the pixel. The most common pixel format is the byte image, where this number is stored as an 8-bit integer giving a range of possible values from 0 to 255. Typically zero is taken to be black and 255 is taken to be white; values in between make up the different shades of gray.
Although simple 8-bit integers or vectors of 8-bit integers are the
most common sorts of pixel values used, some image formats
support different types of value, for instance 32-bit signed integers
or floating point values. Such values are extremely useful in image
processing as they allow processing to be carried out on the image
where the resulting pixel values are not necessarily 8-bit integers. If
this approach is used then it is usually necessary to set up a color
map which relates particular ranges of pixel values to particular
displayed colors.
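As an illustration of the byte-image representation described above, the following minimal Java sketch (class and method names are illustrative, not taken from the project code) unpacks the 8-bit red, green and blue components of a single pixel from a java.awt.image.BufferedImage.

import java.awt.image.BufferedImage;

public class PixelDemo {
    // Returns the {red, green, blue} components of the pixel at (x, y),
    // each an 8-bit value in the range 0-255.
    public static int[] channels(BufferedImage img, int x, int y) {
        int argb = img.getRGB(x, y);          // packed 32-bit ARGB value
        int r = (argb >> 16) & 0xFF;          // bits 16-23: red
        int g = (argb >> 8) & 0xFF;           // bits 8-15: green
        int b = argb & 0xFF;                  // bits 0-7: blue
        return new int[] { r, g, b };
    }
}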
2.1) History:
Many of the techniques of digital image processing, or digital picture
processing as it often was called, were developed in the 1960s at the
Jet Propulsion Laboratory, Massachusetts Institute of Technology,
Bell Laboratories, University of Maryland, and a few other research
facilities, with application to satellite imagery, wire-photo standards
conversion, medical imaging, videophone, character recognition, and
photograph enhancement. The cost of processing was fairly high,
however, with the computing equipment of that era. That changed in
the 1970s, when digital image processing proliferated as cheaper
computers and dedicated hardware became available. Images then
could be processed in real time, for some dedicated problems such
as television standards conversion. As general-purpose computers
became faster, they started to take over the role of dedicated
hardware for all but the most specialized and computer-intensive
operations.
2.2) Image Sampling and Quantization:
To create a digital image, we need to convert the continuous sensed
data into digital form. This involves two processes:
1) Sampling
2) Quantization.
The basic idea behind sampling and quantization is illustrated in Figs. 2.2 to 2.5. An image may be continuous with respect to the x- and y-coordinates, and also in amplitude. To convert it to digital form, we have to sample the function in both coordinates and in amplitude. Digitizing the coordinate values is called sampling. Digitizing the amplitude values is called quantization.
Fig 2.2 Continuous Image.
Fig 2.3 a plot of amplitude values along the line of continuous image.
18
Fig 2.4 Sampling of the above continuous image.
Fig 2.5 Quantization of the sampled image.
The one-dimensional function shown in Fig. 2.3 is a plot of amplitude
(gray level) values of the continuous image along the line segment
AB in Fig. 2.2.The random variations are due to image noise. To
sample this function, we take equally spaced samples along line AB,
as shown in Fig. 2.4.The location of each sample is given by a vertical
tick mark in the bottom part of the figure. The samples are shown as
small white squares superimposed on the function.
The set of these discrete locations gives the sampled function.
However, the values of the samples still span (vertically) a
continuous range of gray-level values.
In order to form a digital function, the gray-level values also must be
converted (quantized) into discrete quantities. The right side of Fig.
2.4 shows the gray-level scale divided into eight discrete levels,
ranging from black to white. The vertical tick marks indicate the
specific value assigned to each of the eight gray levels. The
continuous gray levels are quantized simply by assigning one of the
eight discrete gray levels to each sample. The assignment is made
depending on the vertical proximity of a sample to a vertical tick
mark. The digital samples resulting from both sampling and
quantization are shown in Fig. 2.6. Starting at the top of the image
and carrying out this procedure line by line produces a two-dimensional digital image. Sampling in the manner just described
assumes that we have a continuous image in both coordinate
directions as well as in amplitude. In practice, the method of
sampling is determined by the sensor arrangement used to generate
the image.
Fig 2.6
a) Continuous image projected onto a sensor array.
b) Result of image sampling and quantization.
2.3) Image Representation:
The result of sampling and quantization is a matrix of real numbers.
We will use two principal ways to represent digital images. Assume
that an image f(x, y) is sampled so that the resulting digital image has
M rows and N columns.
The values of the coordinates (x, y) now become discrete quantities.
For notational clarity and convenience, we shall use integer values
for these discrete coordinates. Thus, the values of the coordinates at
the origin are (x, y) = (0, 0). The next coordinate values along the first
row of the image are represented as (x, y) = (0, 1). It is important to
keep in mind that the notation (0, 1) is used to signify the second
sample along the first row. It does not mean that these are the actual
values of physical coordinates when the image was sampled.
Fig 2.7 Coordinate convention to represent digital images.
The notation introduced in the preceding paragraph allows us to
write the complete M×N digital image in the following compact
matrix form:
Fig 2.8 Array
The right side of this equation is by definition a digital image. Each
element of this matrix array is called an image element, picture
element, pixel, or pel. The terms image and pixel will be used
throughout the rest of our discussions to denote a digital image and
its elements. In some discussions, it is advantageous to use a more
traditional matrix notation to denote a digital image and its
elements:
Fig 2.9 Array
2.4) Image Operations:
2.4.1) Image Editing:
Image editing encompasses the process of altering images, whether
they are digital photographs, traditional analog photographs, or
illustrations. Traditional analog image editing is known as photo
retouching, using tools such as an airbrush to modify photographs, or
editing illustrations with any traditional art medium. Graphic
software programs, which can be broadly grouped into vector
graphics editors, raster graphics editors, and 3d modellers, are the
primary tools with which a user may manipulate, enhance, and
transform images. Many image editing programs are also used to
render or create computer art from scratch.
2.4.2) Contrast of images:
To apply a contrast filter, you determine whether a pixel is lighter or darker than a threshold amount. If it is lighter, you scale the pixel's intensity up; otherwise you scale it down. In code this is done by subtracting the threshold from the pixel value, multiplying by the contrast factor and adding the threshold value back again. As with the brightness filter, the resulting value needs to be clamped to ensure it remains in the range 0 - 255.
To apply a brightness filter you simply add a fixed amount to every
pixel in the image and then clamp the result to ensure it remains in
the range 0 - 255.
Fig 2.10 low contrast to high contrast
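The following minimal Java sketch illustrates the contrast and brightness operations described above for a single 8-bit channel value; the class name, helper names and the choice of threshold are illustrative assumptions, not the project's actual code.

public class ContrastFilter {
    // Clamp a value to the valid 8-bit range 0-255.
    private static int clamp(int v) {
        return Math.max(0, Math.min(255, v));
    }

    // Contrast: subtract the threshold, scale by the contrast factor,
    // then add the threshold back and clamp.
    public static int contrast(int value, double factor, int threshold) {
        return clamp((int) Math.round((value - threshold) * factor) + threshold);
    }

    // Brightness: add a fixed offset to the value and clamp.
    public static int brighten(int value, int offset) {
        return clamp(value + offset);
    }
}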
2.4.3) RGB to Gray scale:
In photography and computing, a gray scale or grey scale digital
image is an image in which the value of each pixel is a single sample,
that is, it carries only intensity information. Images of this sort, also
known as black-and-white, are composed exclusively of shades of
gray, varying from black at the weakest intensity to white at the
strongest.
Gray scale images are distinct from one-bit bi-tonal black-and-white
images, which in the context of computer imaging are images with
only the two colors, black, and white (also called bi-level or binary
images). Gray scale images have many shades of gray in between.
Gray scale images are also called monochromatic, denoting the
absence of any chromatic variation (i.e., one color).
Gray scale images are often the result of measuring the intensity of
light at each pixel in a single band of the electromagnetic spectrum
(e.g. infrared, visible light, ultraviolet, etc.), and in such cases they
are monochromatic proper when only a given frequency is captured.
But also they can be synthesized from a full color image.
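A gray value can be synthesized from a colour pixel by weighting the channels; the sketch below uses the common luminance weights 0.299, 0.587 and 0.114 and is only a minimal illustration (class and method names are assumptions, not the project's code).

import java.awt.image.BufferedImage;

public class GrayScale {
    // Convert every pixel to a gray value using standard luminance weights.
    public static BufferedImage toGray(BufferedImage src) {
        BufferedImage out = new BufferedImage(src.getWidth(), src.getHeight(),
                                              BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int argb = src.getRGB(x, y);
                int r = (argb >> 16) & 0xFF;
                int g = (argb >> 8) & 0xFF;
                int b = argb & 0xFF;
                int gray = (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
                out.setRGB(x, y, (gray << 16) | (gray << 8) | gray);
            }
        }
        return out;
    }
}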
2.4.4) Inverted Image:
An inverted image can be interpreted as a digital version of a photographic negative. After inversion, every color is replaced by its exact opposite (not a scientific term, but useful conceptually). To put this in more precise terms: a positive image is a normal, original RGB or gray image, while a negative image denotes a tonal inversion of a positive image, in which light areas appear dark and dark areas appear light. In negative images a color reversal is also achieved, such that red areas appear cyan, greens appear magenta, and blues appear yellow. In the simplest gray scale case, using 0 for black and 255 for white, a near-black pixel value of 5 will be converted to 250, or near-white.
Image inversion is one of the easiest techniques in image processing.
Therefore, it’s very applicable to demonstrations of performance,
acceleration, and optimization. Many state of the art image processing libraries, such as OpenCV, Gandalf and VXL, perform this operation as fast as possible, even though further acceleration using parallel hardware is possible.
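As described above, inversion simply replaces each 8-bit channel value v by 255 - v; a minimal Java sketch with illustrative names:

public class Invert {
    // Invert one packed RGB pixel: each 8-bit channel value v becomes 255 - v.
    public static int invertPixel(int rgb) {
        int r = 255 - ((rgb >> 16) & 0xFF);
        int g = 255 - ((rgb >> 8) & 0xFF);
        int b = 255 - (rgb & 0xFF);
        return (r << 16) | (g << 8) | b;
    }
}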
2.4.5) Blurring:
Blurring an image usually makes the image unfocused. In signal
processing, blurring is generally obtained by convolving the image
with a low pass filter. In this application, the amount of blurring is increased by increasing the pixel radius.


Fig 2.11 Blurring of image
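A minimal Java sketch of blurring by convolution with a small averaging (low pass) kernel, as described above; border pixels are simply copied to keep the sketch short, and all names are illustrative rather than taken from the project.

public class Convolution {
    // Convolve a grayscale image (values 0-255) with a 3x3 kernel.
    // Border pixels are copied unchanged to keep the sketch short.
    public static int[][] convolve3x3(int[][] img, double[][] kernel) {
        int h = img.length, w = img[0].length;
        int[][] out = new int[h][w];
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (y == 0 || x == 0 || y == h - 1 || x == w - 1) {
                    out[y][x] = img[y][x];
                    continue;
                }
                double sum = 0;
                for (int ky = -1; ky <= 1; ky++) {
                    for (int kx = -1; kx <= 1; kx++) {
                        sum += kernel[ky + 1][kx + 1] * img[y + ky][x + kx];
                    }
                }
                out[y][x] = Math.max(0, Math.min(255, (int) Math.round(sum)));
            }
        }
        return out;
    }

    // 3x3 box blur: every kernel weight is 1/9, i.e. a simple average.
    public static final double[][] BOX_BLUR = {
        { 1 / 9.0, 1 / 9.0, 1 / 9.0 },
        { 1 / 9.0, 1 / 9.0, 1 / 9.0 },
        { 1 / 9.0, 1 / 9.0, 1 / 9.0 }
    };
}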
2.4.6) Sharpening an Image:
Sharpening is one of the most impressive transformations you can
apply to an image since it seems to bring out image detail that was
not there before. What it actually does, however, is to emphasize
edges in the image and make them easier for the eye to pick out; while the visual effect is to make the image seem sharper, no new details are actually created.
Fig 2.12 sharpened images
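Sharpening can reuse the same convolution routine sketched in the blurring section, just with an edge-emphasizing kernel; the 3x3 kernel below is one common choice and is shown only as an illustration, not necessarily the kernel used in this project.

public class Sharpen {
    // A common 3x3 sharpening kernel: emphasizes the centre pixel
    // relative to its four neighbours (the weights sum to 1).
    public static final double[][] SHARPEN = {
        {  0, -1,  0 },
        { -1,  5, -1 },
        {  0, -1,  0 }
    };
    // Usage with the Convolution sketch above:
    // int[][] sharpened = Convolution.convolve3x3(gray, SHARPEN);
}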
2.4.7) Histogram:
Histograms are the basis for numerous spatial domain processing
techniques. Histogram manipulation can be used effectively for
image enhancement. The histogram of a digital image with gray levels in the range [0, L-1] is a discrete function h(rk) = nk, where rk is the kth gray level and nk is the number of pixels in the image having gray level rk. It is common practice to normalize a histogram by dividing each of its values by the total number of pixels in the image, denoted by n. Thus, a normalized histogram is given by p(rk) = nk / n, for k = 0, 1, …, L-1. Loosely speaking, p(rk) gives an estimate of the probability of occurrence of gray level rk. Note that the sum of all components of a normalized histogram is equal to 1.
Fig 2.13 Image and its histogram
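A minimal Java sketch of computing the histogram h(rk) = nk and its normalized form p(rk) = nk / n for an 8-bit grayscale image (L = 256); the names are illustrative.

public class Histogram {
    // h[k] = number of pixels with gray level k, for k = 0..255.
    public static int[] histogram(int[][] gray) {
        int[] h = new int[256];
        for (int[] row : gray) {
            for (int v : row) {
                h[v]++;
            }
        }
        return h;
    }

    // p[k] = h[k] / n, where n is the total number of pixels;
    // the normalized values sum to 1.
    public static double[] normalize(int[] h, int totalPixels) {
        double[] p = new double[256];
        for (int k = 0; k < 256; k++) {
            p[k] = (double) h[k] / totalPixels;
        }
        return p;
    }
}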
2.4.8) Equalization:
Consider for a moment continuous functions, and let the variable r
represent the gray levels of the image to be enhanced. In the initial
part of our discussion we assume that r has been normalized to the
interval [0, 1], with r=0 representing black and r=1 representing
white. Later, we consider a discrete formulation and allow pixel
values to be in the interval [0, L-1].
For any r satisfying the aforementioned conditions, we focus attention on transformations of the form
s = T(r), 0 ≤ r ≤ 1,
that produce a level s for every pixel value r in the original image. For reasons that will become obvious shortly, we assume that the transformation function T(r) satisfies the following conditions:
(a) T(r) is single-valued and monotonically increasing in the interval 0 ≤ r ≤ 1; and
(b) 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1.
The requirement in (a) that T(r) be single valued is needed to
guarantee that the inverse transformation will exist, and the
monotonicity condition preserves the increasing order from black to
white in the output image. A transformation function that is not
monotonically increasing could result in at least a section of the
intensity range being inverted, thus producing some inverted gray
levels in the output image. While this may be a desirable effect in
some cases, that is not what we are after in the present discussion.
Finally, condition (b) guarantees that the output gray levels will be in
the same range as the input levels.
2.5) Image Formats:
2.5.1) Raster formats:
a) Joint Photographic Expert Group (JPEG):
JPEG stands for "Joint Photographic Expert Group" and, as its name
suggests, was specifically developed for storing photographic images.
It has also become a standard format for storing images in digital
cameras and displaying photographic images on internet web pages.
JPEG files are significantly smaller than those saved as TIFF; however, this comes at a cost, since JPEG employs lossy compression. A great
thing about JPEG files is their flexibility. The JPEG file format is really
a toolkit of options whose settings can be altered to fit the needs of
each image.
b) Tagged Image File Format (TIFF):
TIFF stands for "Tagged Image File Format" and is a standard in the
printing and publishing industry. TIFF files are significantly larger
than their JPEG counterparts, and can be either uncompressed or
compressed using lossless compression. Unlike JPEG, TIFF files can
have a bit depth of either 16-bits per channel or 8-bits per channel,
and multiple layered images can be stored in a single TIFF file.
TIFF files are an excellent option for archiving intermediate files
which you may edit later, since it introduces no compression
artefacts. Many cameras have an option to create images as TIFF
files, but these can consume excessive space compared to the same
JPEG file. If your camera supports the RAW file format this is a
superior alternative, since these are significantly smaller and can
retain even more information about your image.
c) Portable Network Graphics (PNG):
PNG uses ZIP compression which is lossless, and slightly more
effective than LZW (slightly smaller files). PNG is a newer format,
designed to be both versatile and royalty free, back when the LZW
patent was disputed. The PNG (Portable Network Graphics) file
format was created as the free, open-source successor to the GIF.
The PNG file format supports true color (16 million colors) while the
GIF supports only 256 colors. The PNG file excels when the image has
large, uniformly colored areas. The lossless PNG format is best suited
for editing pictures, and the lossy formats, like JPG, are best for the
final distribution of photographic images.
d) Graphics Interchange Format (GIF):
GIF is limited to an 8-bit palette, or 256 colors. This makes the GIF
format suitable for storing graphics with relatively few colors such as
simple diagrams, shapes, logos and cartoon style images. The GIF
format supports animation and is still widely used to provide image
animation effects. It also uses a lossless compression that is more
effective when large areas have a single color, and ineffective for
detailed images or dithered images.
e) Bitmap Format (BMP):
The BMP file format (Windows bitmap) handles graphics files within
the Microsoft Windows OS. Typically, BMP files are uncompressed,
hence they are large; the advantage is their simplicity and wide
acceptance in Windows programs.
f) RAW:
RAW refers to a family of raw image formats that are options
available on some digital cameras. These formats usually use a
lossless or nearly-lossless compression, and produce file sizes much
smaller than the TIFF formats of full-size processed images from the
same cameras. Although there is a standard raw image format, (ISO
12234-2, TIFF/EP), the raw formats used by most cameras are not
standardized or documented, and differ among camera
manufacturers. Many graphic programs and image editors may not
accept some or all of them, and some older ones have been
effectively orphaned already. Adobe's Digital Negative (DNG)
specification is an attempt at standardizing a raw image format to be
used by cameras, or for archival storage of image data converted
from undocumented raw image formats, and is used by several niche
and minority camera manufacturers including Pentax, Leica, and
Samsung.
2.5.2) Vector Formats:
a) Computer Graphics Metafile (CGM):
CGM is a file format for 2D vector graphics, raster graphics, and text,
and is defined by ISO/IEC 8632. All graphical elements can be
specified in a textual source file that can be compiled into a binary
file or one of two text representations. CGM provides a means of
graphics data interchange for computer representation of 2D
graphical information independent from any particular application,
system, platform, or device. It has been adopted to some extent in
the areas of technical illustration and professional design, but has
largely been superseded by formats like SVG.
b) Scalable Vector Graphics (SVG):
SVG is an open standard created and developed by the World Wide
Web Consortium to address the need (and attempts of several
corporations) for a versatile, scriptable and all-purpose vector format
for the web and otherwise. The SVG format does not have a
compression scheme of its own, but due to the textual nature of
XML, an SVG graphic can be compressed using a program such as
gzip. Because of its scripting potential, SVG is a key component in
web applications: interactive web pages that look and act like
applications.
3) Message Encoding Techniques:
3.1) Steganography:
Steganography is the art and science of invisible communication. This is accomplished by hiding information in other information, thus hiding the very existence of the communicated information. In this sense image steganography offers protection that cryptography alone does not. The purpose of image processing here is to improve the quality of an image so that the required operations can easily be performed on it. Image steganography is performed on image formats which are suitable for it. Steganography includes the concealment of information
within computer files. In digital steganography, electronic
communications may include steganographic coding inside of a
transport layer, such as a document file, image file, program or
protocol. Media files are ideal for steganographic transmission
because of their large size. As a simple example, a sender might start
with an innocuous image file and adjust the color of every 100th
pixel to correspond to a letter in the alphabet, a change so subtle
that someone not specifically looking for it is unlikely to notice it.
Fig 3.1 Steganography
Patchwork – The biggest disadvantage of the patchwork approach is
the small amount of information that can be hidden in one image.
This property can be changed to accommodate more information but
one may have to sacrifice the secrecy of the information.
Patchwork’s main advantage, however, is its robustness against
malicious or unintentional image manipulation. Should a stego image
using patchwork be cropped or rotated, some of the message data
may be lost but since the message is repeatedly embedded in the
image, most of the information will survive.
Patchwork is most suitable for transmitting a small amount of very
sensitive information.
3.1.1) Ancient Steganography:
The first recorded uses of steganography can be traced back to 440
BC when Herodotus mentions two examples of steganography in The
Histories of Herodotus. Demaratus sent a warning about a
forthcoming attack to Greece by writing it directly on the wooden
backing of a wax tablet before applying its beeswax surface. Wax
tablets were in common use then as reusable writing surfaces,
sometimes used for shorthand. Another ancient example is that of
Histiaeus, who shaved the head of his most trusted slave and
tattooed a message on it. After his hair had grown the message was
hidden. The purpose was to instigate a revolt against the Persians.
3.1.2) Modern Steganography:
Fig 3.2 Modern Steganography — THE PRISONER'S PROBLEM:
Alice and Bob are communicating with each other by exchanging secret messages.
Fig 3.3 Block Diagram
3.1.3) Steganography in practice:
Fig 3.4 Steganography in practice
Steganography is a special case of data hiding, but data hiding is not always steganography. In steganography the main goal is to escape detection by the warden, Wendy, while Alice and Bob are communicating with each other.
3.1.4) Different techniques:
a) Physical steganography:
• Hidden messages within wax tablets — in ancient Greece, people wrote messages on the wood, and then covered it with wax upon which an innocent covering message was written.
• Hidden messages on a messenger's body — also used in ancient Greece. Herodotus tells the story of a message tattooed on a slave's shaved head, hidden by the growth of his hair, and exposed by shaving his head again. The message allegedly carried a warning to Greece about Persian invasion plans. This method has obvious drawbacks, such as delayed transmission while waiting for the slave's hair to grow, and the restrictions on the number and size of messages that can be encoded on one person's scalp.
• During World War II, the French Resistance sent some messages written on the backs of couriers using invisible ink.
• Hidden messages on paper written in secret inks, under other messages or on the blank parts of other messages.
• Messages written in Morse code on knitting yarn and then knitted into a piece of clothing worn by a courier.
• Messages written on envelopes in the area covered by postage stamps.
• During and after World War II, espionage agents used photographically produced microdots to send information back and forth. Microdots were typically minute, less than the size of the period produced by a typewriter. World War II microdots needed to be embedded in the paper and covered with an adhesive, such as collodion. This was reflective and thus detectable by viewing against glancing light.
b) Digital Steganography:
Modern steganography entered the world in 1985 with the advent of
the personal computer being applied to classical steganography
problems. Development following that was slow, but has since taken
off, going by the number of "stego" programs available. Digital
steganography techniques include:
• Concealing messages within the lowest bits of noisy images or sound files.
• Concealing data within encrypted data or within random data. The data to be concealed is first encrypted before being used to overwrite part of a much larger block of encrypted data or a block of random data (an unbreakable cipher like the one-time pad generates cipher texts that look perfectly random if you don't have the private key).
• Chaffing and winnowing.
• Mimic functions convert one file to have the statistical profile of another. This can thwart statistical methods that help brute-force attacks identify the right solution in a cipher text-only attack.
• Concealed messages in tampered executable files, exploiting redundancy in the targeted instruction set.
• Pictures embedded in video material (optionally played at slower or faster speed).
• Injecting imperceptible delays to packets sent over the network from the keyboard. Delays in key presses in some applications (telnet or remote desktop software) can mean a delay in packets, and the delays in the packets can be used to encode data.
• Changing the order of elements in a set.
• Content-Aware Steganography hides information in the semantics a human user assigns to a datagram. These systems offer security against a non-human adversary/warden.
• Blog-Steganography. Messages are fractionalized and the (encrypted) pieces are added as comments of orphaned web-logs (or pin boards on social network platforms). In this case the selection of blogs is the symmetric key that sender and recipient are using; the carrier of the hidden message is the whole blogosphere.
• Modifying the echo of a sound file (Echo Steganography).
• Secure Steganography for Audio Signals.
• Image bit-plane complexity segmentation steganography.
c) Network Steganography:
All information hiding techniques that may be used to exchange
steganographs in telecommunication networks can be classified
under the general term of network steganography. This
nomenclature was originally introduced by Krzysztof Szczypiorski in
2003. Contrary to the typical steganographic methods which utilize
digital media (images, audio and video files) as a cover for hidden
data, network steganography utilizes communication protocols'
control elements and their basic intrinsic functionality. As a result,
such methods are harder to detect and eliminate.
Typical network steganography methods involve modification of the
properties of a single network protocol. Such modification can be
applied to the PDU (Protocol Data Unit).
Moreover, it is feasible to utilize the relation between two or more
different network protocols to enable secret communication. These
applications fall under the term inter-protocol steganography.
Network steganography covers a broad spectrum of techniques,
which include, among others:
• Steganophony - the concealment of messages in Voice-over-IP conversations, e.g. the employment of delayed or corrupted packets that would normally be ignored by the receiver (this method is called LACK - Lost Audio Packets Steganography), or, alternatively, hiding information in unused header fields.
• WLAN Steganography – the utilization of methods that may be exercised to transmit steganograms in Wireless Local Area Networks. A practical example of WLAN Steganography is the HICCUPS system (Hidden Communication System for Corrupted Networks).
d) Printed Steganography:
Digital steganography output may be in the form of printed
documents. A message, the plaintext, may be first encrypted by
traditional means, producing a cipher text. Then, an innocuous cover
text is modified in some way so as to contain the cipher text,
resulting in the stego text. For example, the letter size, spacing,
typeface, or other characteristics of a cover text can be manipulated
to carry the hidden message. Only a recipient who knows the
technique used can recover the message and then decrypt it. Francis
Bacon developed Bacon's cipher as such a technique.
The cipher text produced by most digital steganography methods,
however, is not printable. Traditional digital methods rely on
perturbing noise in the channel file to hide the message; as such, the
channel file must be transmitted to the recipient with no additional
noise from the transmission. Printing introduces much noise in the
cipher text, generally rendering the message unrecoverable.
e) Text Steganography:
Steganography can be applied to different types of media including
text, audio, image and video etc. However, text steganography is
considered to be the most difficult kind of steganography, due to the lack of redundancy in text as compared to image or audio, even though text has a smaller memory footprint and allows simpler communication. The
method that could be used for text steganography is data
compression. Data compression encodes information in one
representation into another representation. The new representation
of data is smaller in size. One of the possible schemes to achieve data
compression is Huffman coding. Huffman coding assigns smaller
length code words to more frequently occurring source symbols and
longer length code words to less frequently occurring source
symbols. Unicode steganography uses lookalike characters of the
usual ASCII set to look normal, while really carrying extra bits of
information. If the text is displayed correctly, there should be no
visual difference from ordinary text. Some systems however may
display the fonts differently, and would be easily spotted.
3.2) Algorithms:
3.2.1) LSB:
Algorithm to embed the text message:
Step 1: Read the cover image and the text message which is to be hidden in the cover image.
Step 2: Convert the text message into binary.
Step 3: Calculate the LSB of each pixel of the cover image.
Step 4: Replace the LSB of the cover image with each bit of the secret message, one by one.
Step 5: Write the stego image.
Algorithm to retrieve the text message:
Step 1: Read the stego image.
Step 2: Calculate the LSB of each pixel of the stego image.
Step 3: Retrieve the bits and convert each group of 8 bits into a character.
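A minimal Java sketch of the embedding and retrieval steps above, hiding one message bit per pixel in the least significant bit of the blue channel of a BufferedImage; a real implementation would also have to store the message length (or a terminator) and handle the password, and all names here are illustrative assumptions rather than the project's actual code.

import java.awt.image.BufferedImage;

public class LsbStego {
    // Steps 2-5: embed each bit of the message into the LSB of the blue channel,
    // scanning pixels left to right, top to bottom.
    public static void embed(BufferedImage img, byte[] message) {
        int bitIndex = 0, totalBits = message.length * 8;
        for (int y = 0; y < img.getHeight() && bitIndex < totalBits; y++) {
            for (int x = 0; x < img.getWidth() && bitIndex < totalBits; x++) {
                int bit = (message[bitIndex / 8] >> (7 - bitIndex % 8)) & 1;
                int argb = img.getRGB(x, y);
                argb = (argb & ~1) | bit;          // replace LSB of the blue channel
                img.setRGB(x, y, argb);
                bitIndex++;
            }
        }
    }

    // Retrieval: read the LSBs back in the same order and repack them into bytes.
    public static byte[] extract(BufferedImage img, int messageLength) {
        byte[] message = new byte[messageLength];
        int bitIndex = 0, totalBits = messageLength * 8;
        for (int y = 0; y < img.getHeight() && bitIndex < totalBits; y++) {
            for (int x = 0; x < img.getWidth() && bitIndex < totalBits; x++) {
                int bit = img.getRGB(x, y) & 1;    // LSB of the blue channel
                message[bitIndex / 8] |= bit << (7 - bitIndex % 8);
                bitIndex++;
            }
        }
        return message;
    }
}

Note that the stego image must be written in a lossless format (for example BMP or PNG); lossy compression would destroy the embedded bits.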
FLOWCHART FOR LSB ENCODING
1. START
2. DISPLAY IMAGE
3. ENTER STRING TO BE ENCODED
4. FIND OUT ASCII CODE FOR EACH LETTER
5. CONVERT ASCII CODE INTO BINARY CODE
6. EXTRACT RGB VALUES FROM THE IMAGE
7. (ENCODING) EMBED FIRST THREE BITS OF A LETTER INTO LAST THREE BITS OF RED VALUE, NEXT THREE BITS INTO GREEN VALUE AND LAST TWO BITS INTO THE BLUE VALUE OF A PIXEL
8. STORE RGB VALUES IN ARRAY
9. DISPLAY IMAGE
10. STOP
FLOWCHART FOR LSB DECODING
1. START
2. DISPLAY IMAGE
3. RETRIEVE RGB VALUES FROM IMAGE FOR EACH PIXEL
4. (DECODING) EXTRACT FIRST THREE BITS OF A LETTER FROM LAST THREE BITS OF RED VALUE, NEXT THREE BITS FROM GREEN VALUE AND LAST TWO BITS FROM BLUE VALUE OF A PIXEL
5. STORE IT IN ARRAY
6. CONVERT BINARY CODE TO ASCII CODE
7. DISPLAY THE STRING
8. STOP
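The flowcharts above distribute the 8 bits of each character as 3 bits in the red channel, 3 in the green channel and 2 in the blue channel. A minimal sketch of that packing and unpacking for a single character and a single packed RGB pixel (illustrative names):

public class BitSplit {
    // Hide one 8-bit character in a packed RGB pixel: top 3 bits in red,
    // next 3 bits in green, last 2 bits in blue (all in the channel LSBs).
    public static int embedChar(int rgb, char c) {
        int hi3 = (c >> 5) & 0x7;   // bits 7-5 of the character
        int mid3 = (c >> 2) & 0x7;  // bits 4-2
        int lo2 = c & 0x3;          // bits 1-0
        int r = ((rgb >> 16) & 0xFF & ~0x7) | hi3;
        int g = ((rgb >> 8) & 0xFF & ~0x7) | mid3;
        int b = (rgb & 0xFF & ~0x3) | lo2;
        return (r << 16) | (g << 8) | b;
    }

    // Recover the character by reading the same low-order bits back.
    public static char extractChar(int rgb) {
        int hi3 = (rgb >> 16) & 0x7;
        int mid3 = (rgb >> 8) & 0x7;
        int lo2 = rgb & 0x3;
        return (char) ((hi3 << 5) | (mid3 << 2) | lo2);
    }
}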
3.2.2) DCT Algorithm:
Algorithm to embed the text message:
Step 1: Read the cover image.
Step 2: Read the secret message and convert it into binary.
Step 3: The cover image is broken into 8×8 blocks of pixels.
Step 4: Working from left to right, top to bottom, subtract 128 from each block of pixels.
Step 5: The DCT is applied to each block.
Step 6: Each block is compressed through the quantization table.
Step 7: Calculate the LSB of each DC coefficient and replace it with a bit of the secret message.
Step 8: Write the stego image.
Algorithm to retrieve the text message:
Step 1: Read the stego image.
Step 2: The stego image is broken into 8×8 blocks of pixels.
Step 3: Working from left to right, top to bottom, subtract 128 from each block of pixels.
Step 4: The DCT is applied to each block.
Step 5: Each block is compressed through the quantization table.
Step 6: Calculate the LSB of each DC coefficient.
Step 7: Retrieve the bits and convert each group of 8 bits into a character.
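The sketch below illustrates the forward 2-D DCT applied to one 8×8 block in Step 5, using the standard DCT-II definition with C(0) = 1/sqrt(2); a full implementation would also need the quantization table and the inverse transform, and the names are illustrative.

public class Dct8x8 {
    private static final int N = 8;

    // Forward 2-D DCT-II of one 8x8 block of (level-shifted) pixel values:
    // F(u,v) = (1/4) C(u) C(v) sum_x sum_y f(x,y)
    //          cos[(2x+1)u*pi/16] cos[(2y+1)v*pi/16]
    public static double[][] forward(double[][] block) {
        double[][] out = new double[N][N];
        for (int u = 0; u < N; u++) {
            for (int v = 0; v < N; v++) {
                double sum = 0.0;
                for (int x = 0; x < N; x++) {
                    for (int y = 0; y < N; y++) {
                        sum += block[x][y]
                             * Math.cos((2 * x + 1) * u * Math.PI / (2.0 * N))
                             * Math.cos((2 * y + 1) * v * Math.PI / (2.0 * N));
                    }
                }
                double cu = (u == 0) ? 1.0 / Math.sqrt(2) : 1.0;
                double cv = (v == 0) ? 1.0 / Math.sqrt(2) : 1.0;
                out[u][v] = 0.25 * cu * cv * sum;   // normalization factor for N = 8
            }
        }
        return out;
    }
}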
3.2.3) Comparison:
The relative ease of implementing LSB makes it a popular method. To hide a secret message in an image, a proper cover image is needed. Because this method uses bits of each pixel in the image, it is necessary to use a lossless compression format, otherwise the hidden information will get lost in the transformations of a lossy compression algorithm. The disadvantages of using LSB alteration lie mainly in the fact that it requires a fairly large cover image to create a usable amount of hiding space. Even now, uncompressed images of 800 x 600 pixels are not often used on the Internet, so using these might raise suspicion. Another disadvantage arises when an image concealing a secret is compressed with a lossy compression algorithm: the hidden message will not survive the operation and is lost after the transformation.
On the other hand, the Discrete Cosine Transform provides a
mathematical and computational method of taking spatial data,
dividing it into parts of differing importance with respect to visual
quality, and compressing it into an accurate and overall high quality
image. The DCT upholds many of the properties of the Fourier
Transform, such as orthonormality and corresponding relations that
follow, such as Parseval's and Plancherel's theorems. Its inverse, the IDCT, allows reconstruction of an image frame that had been encoded by the DCT, and hence transforms it back to the spatial domain. Despite its
similarities to the Fourier Transform, it has been shown that for
application purposes the DCT is much more practical and efficient,
and it is commonly thought of in regards to image compression for
JPEG and MPEG files.
Comparative analysis of LSB based and DCT based steganography is done on the basis of parameters like PSNR. Both gray scale and colored images have been used for the experiments. The peak signal to noise ratio (PSNR) is used to compute how well the methods perform: PSNR is the peak signal to noise ratio, in decibels, between two images, and it is used as a quality measurement between them. The higher the PSNR, the closer the stego image is in quality to the original.
Comparison of LSB based and DCT based stego images using the PSNR shows that the PSNR of the DCT based steganography scheme is higher than that of the LSB based scheme for all types of images (gray scale as well as color). The DCT based scheme works with minimal distortion of the image quality compared to the LSB based scheme. Even though the amount of secret data that can be hidden using DCT is very small compared to the LSB based scheme, the DCT based steganography scheme is still recommended because of its minimal distortion of image quality.
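A minimal sketch of the PSNR computation used for this comparison, for two 8-bit grayscale images of equal size: PSNR = 10 * log10(255^2 / MSE), where MSE is the mean squared error between the cover and stego images (illustrative names).

public class Psnr {
    // PSNR in decibels between two same-sized 8-bit grayscale images.
    public static double psnr(int[][] original, int[][] stego) {
        double mse = 0.0;
        int h = original.length, w = original[0].length;
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                double diff = original[y][x] - stego[y][x];
                mse += diff * diff;
            }
        }
        mse /= (double) (h * w);
        if (mse == 0) {
            return Double.POSITIVE_INFINITY;   // identical images
        }
        return 10.0 * Math.log10((255.0 * 255.0) / mse);
    }
}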
3.3) Applications:
a) Used in modern printers.
b) Alleged use by terrorists.
c) Alleged use by intelligence agencies.
d) Protecting privacy and anonymity, which is a concern on the internet.
e) Allows two parties to communicate secretly and covertly.
f) Allows morally-conscious people to safely blow the whistle on internal actions.
g) Allows copyright protection on digital files, using the hidden message as a digital watermark.
h) One of the other main uses of image steganography is the transportation of high-level or top-secret documents between international governments.
4) Software paradigm:
4.1) Waterfall model:
The waterfall model is a sequential design process, often used in
software development processes, in which progress is seen as
flowing steadily downwards (like a waterfall) through the phases of
Conception, Initiation, Analysis, Design, Construction, Testing,
Production/Implementation and Maintenance.
4.2) Software Requirements Specification:
A software requirements specification (SRS) is a complete description of the behaviour of the system to be developed. It includes
a set of use cases that describe all the interactions the users will have
with the software. In addition to use cases, the SRS also contains
non-functional (or supplementary) requirements. Non-functional
requirements are requirements which impose constraints on the
design or implementation (such as performance engineering
requirements, quality standards, or design constraints).
4.3) Design:
A software development process, also known as a software
development life cycle (SDLC), is a structure imposed on the
development of a software product. Similar terms include software
life cycle and software process. It is often considered a subset of
systems development life cycle. There are several models for such
processes, each describing approaches to a variety of tasks or
activities that take place during the process. Some people consider a
lifecycle model a more general term and a software development
process a more specific term. For example, there are many specific
software development processes that 'fit' the spiral lifecycle model.
ISO 12207 is an ISO standard for software lifecycle processes. It aims
to be the standard that defines all the tasks required for developing
and maintaining software.
4.4) Testing:
There are three types of testing: unit testing, integration testing and system testing. Unit testing is done on a single module. Integration testing is done by testing two or more modules together. System testing is done when all the modules of the software are integrated and tested together.
Testing is the process of identifying errors during the execution of a program.
4.4.1) Unit Testing:
In computer programming, unit testing is a method by which
individual units of source code are tested to determine if they are fit
for use. A unit is the smallest testable part of an application. In
procedural programming a unit may be an individual function or
procedure. In object-oriented programming a unit is usually an
interface, such as a class. Unit tests are created by programmers or
occasionally by white box testers during the development process.
The goal of unit testing is to isolate each part of the program and
show that the individual parts are correct. A unit test provides a
strict, written contract that the piece of code must satisfy. As a
result, it affords several benefits. Unit tests find problems early in the
development cycle.
4.4.2) Integration Testing:
Integration testing is a logical extension of unit testing. In its simplest
form, two units that have already been tested are combined into a
component and the interface between them is tested. A component,
in this sense, refers to an integrated aggregate of more than one
unit. In a realistic scenario, many units are combined into
components, which are in turn aggregated into even larger parts of
the program. Integration testing (sometimes called Integration and
Testing, abbreviated "I&T") is the phase in software testing in which
individual software modules are combined and tested as a group. It
occurs after unit testing and before system testing. Integration
testing takes as its input modules that have been unit tested, groups
them in larger aggregates, applies tests defined in an integration test
plan to those aggregates, and delivers as its output the integrated
system ready for system testing. The purpose of integration testing is
to verify functional, performance, and reliability requirements placed
on major design items. These "design items", i.e. assemblages (or
groups of units), are exercised through their interfaces using Black
box testing, success and error cases being simulated via appropriate
parameter and data inputs. Simulated usage of shared data areas
and inter-process communication is tested and individual subsystems
are exercised through their input interface. Test cases are
constructed to test that all components within assemblages interact
correctly, for example across procedure calls or process activations,
and this is done after testing individual modules, i.e. unit testing. The
overall idea is a "building block" approach, in which verified
assemblages are added to a verified base which is then used to
support the integration testing of further assemblages.
4.4.3) System Testing:
System testing of software or hardware is testing conducted on a
complete, integrated system to evaluate the system's compliance
with its specified requirements. System testing falls within the scope
of black box testing, and as such, should require no knowledge of the
inner design of the code or logic. As a rule, system testing takes as its input all of the "integrated" software components that have successfully passed integration testing, together with the software system as a whole.
5) Interface Layouts and Results:
GUI for User Input
Fig 5.1 Interface for Loading an Image
Fig 5.2 Dialog Box for Image File Selection
Fig 5.3 Loaded Image
Fig 5.4 Interface for Image Operations
Fig 5.5 Blurred Image
Fig 5.6 Contrast Image
Fig 5.7 Gray Scale Image
Fig 5.8 Inverted Image
Fig 5.9 Sharpened Image
OUTPUT FOR STEGANOGRAPHY:
FOR ENCODING:
Fig 5.10 Image before Encoding
Fig 5.11 Image after Encoding
FOR DECODING:
Fig 5.12 Interface when a wrong password is entered
Fig 5.13 Message decoded with correct password
6) Conclusion:
In this report we studied the various image filtering operations which help to enhance, modify, warp or otherwise transform images. A filter compares a pixel with the pixels around it in order to compute the filtered image.
Steganography is the art of hiding information in an innocuous cover. Its basic purpose is to make communication unintelligible to those who do not possess the right keys. The message can be hidden inside images or other digital objects, where it remains imperceptible to a casual observer. By embedding a secret message into a cover image, a stego-image is obtained, and the stego-image does not contain any easily detectable visual artifacts due to message embedding.
The common approaches for message hiding in images include Least Significant Bit (LSB) insertion methods, Frequency Domain Techniques, Spread Spectrum Techniques, statistical methods, Cover Generation methods and Fractal Techniques. The change in behaviour of the stego-image depends on the specific approach used for hiding information. Attacks on stego-images also take different forms, corresponding to the different steganographic approaches.
We also studied the LSB technique of image encoding. LSB is often used because it is the easiest technique to encode and decode. The image formats typically used with such steganography methods are lossless, and the data can be directly inserted and recovered in the presence of the stego-key. For a 24-bit image, the least significant bits of the colour components (red, green and blue) are changed. LSB is effective with BMP images since the compression in BMP is lossless, but hiding a secret message inside a BMP file using the LSB algorithm requires a large image to be used as a cover. LSB substitution is also possible for the GIF format, but the problem with GIF images is that whenever a least significant bit is changed, the whole colour palette entry may change. This problem can be avoided by using only gray scale GIF images, since a gray scale image contains 256 shades and the changes are gradual, making them very hard to detect. For JPEG, direct substitution in the pixel values is not possible since JPEG uses lossy compression; instead, LSB substitution is applied to the quantized DCT coefficients to embed the data into the image.
So, in a way, LSB is an efficient technique for encoding large amounts of data and is comparatively easier to implement than other techniques.
7) Scope and Limitations:
In the present world, data transfer over the internet is growing rapidly because it is easier as well as faster to transfer data to its destination. Many individuals and business people therefore use the internet to transfer business documents and other important information. Security is an important issue while transferring data over the internet, because any unauthorized individual can hack the data and make it useless, or obtain information not intended for him. The scope of the project is to limit unauthorised access and provide better security during message transmission. To meet these requirements, I use a simple and basic approach to steganography. In this project, the proposed approach finds a suitable algorithm for embedding the data in an image using steganography, which provides a better security pattern for sending messages through a network. I used the Least Significant Bit algorithm in this project to develop the application; it is fast and reliable, and its compression ratio is moderate compared to other algorithms. The image resolution does not change much, and the change is negligible when we embed the message into the image, and the image is protected with a personal password. So it is very difficult for unauthorized personnel to damage or read the data.
8) Future Scope:
Future work includes experimentation with a wider range of images, with higher quality and better optimisation, and making the message survive even after alterations to the cover image such as cropping, resizing, etc. Further work on this project could improve the compression ratio of the image to the text. This project can also be extended so that it can be used with different image formats like .bmp, .jpeg, .tif, etc. The security provided by the Least Significant Bit algorithm is good, but we can improve it to a certain extent by varying the carriers as well as by using different keys for encryption and decryption.
Future implementations will build on the research related to the approaches referred to above, such as DCT. This will include an implementation of DCT, which is more suitable for formats like JPEG and MPEG. A DCT implementation will allow working with a wider range of images, and it involves fewer calculations thanks to the fast transform algorithms used to compute it.
9) References:
a) 1000 Java Tips (Alexandre Patchine, Dr. Heinz Kabutz)
b) Digital Image Processing, 2nd Edition (Rafael C. Gonzalez)
c) The Complete Reference, 5th Edition (Herbert Schildt)
d) IEEE paper titled "A Steganography Implementation" by Mehboob and Faruqui.
e) Bandwidthers blog on steganography.
f) A quick primer on MSDN blogs.
g) "Implementation and Evaluation of Image Processing Algorithms on Reconfigurable Architecture" by Daggu Venkateshwar Rao, Shruti Patil, Naveen Anne Babu and V. Muthukumar.
h) http://www.peterindia.net/SteganographyLinks.html
i) Paper titled "Information Security through Image Steganography" by Nani Koduri, M.S. in Information Security and Computer Forensics.
j) Paper titled "Hiding Data in Data", April 2002 issue of Windows & .NET Magazine, by Gary C. Kessler.