Project Report on Implementation of Message Encryption/Decryption Techniques

Work carried out at the Defence Terrain Research Laboratory (DTRL),
Defence Research & Development Organisation (DRDO)

Under the supervision of
Ms Geeta Gupta, Scientist, DTRL, DRDO
Metcalf House, Delhi-54

Submitted by: Neelam Aggarwal (1151153108)

TABLE OF CONTENTS

DECLARATION
CERTIFICATE
ACKNOWLEDGEMENT
ABSTRACT
1. INTRODUCTION
   1.1 INTRODUCTION
   1.2 PURPOSE
   1.3 SOFTWARE & HARDWARE REQUIREMENTS
2. DIGITAL IMAGE PROCESSING
   2.1 HISTORY
   2.2 IMAGE SAMPLING & QUANTIZATION
   2.3 IMAGE REPRESENTATION
   2.4 IMAGE OPERATIONS
   2.5 IMAGE FORMATS
3. MESSAGE ENCODING TECHNIQUES
   3.1 STEGANOGRAPHY
   3.2 ALGORITHMS
       3.2.1 LSB
       3.2.2 DCT
   3.3 APPLICATIONS
4. SOFTWARE PARADIGM APPLIED
   4.1 WATERFALL MODEL
   4.2 SOFTWARE REQUIREMENTS SPECIFICATION
   4.3 DESIGN
   4.4 TESTING
5. INTERFACE LAYOUTS AND RESULTS
6. CONCLUSION
7. SCOPE AND LIMITATIONS
8. FUTURE SCOPE
9. REFERENCES

List of Figures

1) Figure 2.1 Standard Processing of Space Borne Images
2) Figure 2.2 Continuous Image
3) Figure 2.3 A plot of amplitude values along a line of the continuous image
4) Figure 2.4 Sampling of the continuous image
5) Figure 2.5 Quantization of the sampled image
6) Figure 2.6 a) Continuous image projected onto a sensor array b) Result of image sampling and quantization
7) Figure 2.7 Coordinate convention used to represent digital images
8) Figure 2.8 Array
9) Figure 2.9 Array
10) Figure 2.10 Low contrast to high contrast
11) Figure 2.11 Blurring of Image
12) Figure 2.12 Sharpened Image
13) Figure 2.13 Image and its histogram
14) Figure 3.1 Steganography
15) Figure 3.2 Modern Steganography
16) Figure 3.3 Block Diagram
17) Figure 3.4 Steganography in practice
18) Figure 5.1 Interface for loading an image
19) Figure 5.2 Dialog box for image file selection
20) Figure 5.3 Loaded Image
21) Figure 5.4 Interface for image operations
22) Figure 5.5 Blurred Image
23) Figure 5.6 Contrast Image
24) Figure 5.7 Gray Scale Image
25) Figure 5.8 Inverted Image
26) Figure 5.9 Sharpened Image
27) Figure 5.10 Image before Encoding
28) Figure 5.11 Image after Encoding
29) Figure 5.12 Interface when a wrong password is entered
30) Figure 5.13 Message decoded with the correct password

Declaration

I hereby declare that this submission is my own work and that, to the best of my knowledge and belief, it contains no material previously published or written by another person, nor material which to a substantial extent has been accepted for the award of any other degree of the university.

NEELAM AGGARWAL (B.Tech, IT)
DATED:

Certificate

This is to certify that the project report entitled "Implementation of Message Encoding Algorithm", which is submitted by Neelam Aggarwal, a 4th-year B.Tech (IT) student of Bharti Vidyapeeth's College of Engineering, is a record of the candidate's own work carried out by her under my supervision. The matter embodied in this project report is original and has not been submitted for the award of any other degree.

Date:
Mrs. Vanita Jain (HOD of IT dept.)

Acknowledgement

I owe a great many thanks to the many people who helped and supported me during this project. My deepest thanks go to the members of DTRL for guiding and correcting me throughout the project and for making the necessary corrections as and when needed; I express my thanks for their support. I would also like to thank my institution and my faculty members, without whom this project would have been a distant reality. I also extend my heartfelt thanks to my well-wishers.

Abstract

Encrypting data has been the most popular approach for protecting information, but this protection can be broken with enough computational power. An alternative approach is to hide the data by making the information look like something else, so that only the receiver would realize its true content. In particular, if the data is hidden in an image then everyone else would view it simply as a picture.
At the same time, the receiver can still retrieve the true information. This technique is often called data hiding, or steganography. For implementing steganography, the images, which are collections of pixels, should be in a proper format, and image processing is done to convert the required image into that format. Image processing usually refers to digital image processing: any form of signal processing for which the input is an image, such as a photograph or video frame, and the output is either an image or a set of characteristics or parameters related to the image. The basic operations performed on images are contrast enhancement, gray-scale conversion, inversion of the image, etc.

1) INTRODUCTION

1.1) Introduction:

Our goal is to build a simple application that is able to send and receive encrypted messages embedded inside images. The user is able to choose the image he wants, and the program must tell whether this image will suit the text or not. No pixel deformation or size distortion is allowed. TIF images may suffer slight size increments or decrements, but we will get to that later. The user can set a different password for every message he sends, which enables a manager to transmit the same image to two groups, but with two different passwords and two different messages.

Encrypting data has been the most popular approach for protecting information, but this protection can be broken with enough computational power. An alternative approach is to hide the data by making the information look like something else; in this way only the concerned receiver would realize its true content. In particular, if the data is hidden inside an image then everyone else would view it as a picture. At the same time, the receiver can still retrieve the true information. This technique is often called data hiding, or steganography.
For implementing steganography, the images, which are collections of pixels, should be in a proper format; image processing is done to convert the required image into that format. Image processing usually refers to digital image processing: any form of signal processing for which the input is an image, such as a photograph or video frame.

1.2) Purpose:

The purpose of this project is to build software through which we can perform basic image operations on the desired images and also encrypt, or hide, messages for the purpose of security. Through steganography we are concealing the messages, whereas cryptography was created as a technique for securing the secrecy of communication, and many different methods have been developed to encrypt and decrypt data in order to keep a message secret. Unfortunately it is sometimes not enough to keep the contents of a message secret; it may also be necessary to keep the existence of the message secret. The technique used to achieve this is called steganography.

Steganography is the art and science of invisible communication. This is accomplished by hiding information in other information, thus hiding the existence of the communicated information. Thus image steganography is a better approach than cryptography. The purpose of image processing is to improve the quality of an image so that the required operations can easily be performed on it; image steganography is then performed on the image formats which are suitable. One can use this software to perform simple image operations on images and to encode a message that is to be sent to another person while preserving its security. In short, this software combines image operations with message encryption.
1.3) Software and Hardware Requirements:

Hardware:
- Processor: Intel(R) Pentium(R) D CPU, 2.66 GHz
- RAM: 512 MB
- Operating System: Windows XP

Software:
- Front End: NetBeans IDE 6.5
- JDK 1.6.0_26

2) Digital Image Processing:

Digital image processing is the use of algorithms to perform image processing on digital images. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing: it allows a much wider range of algorithms to be applied to the input data, and it can avoid problems such as the build-up of noise and signal distortion during processing. Since images are defined over two dimensions (perhaps more), digital image processing may be modelled in the form of multidimensional systems. With the fast computers and signal processors available in the 2000s, digital image processing has become the most common form of image processing, and is generally used because it is not only the most versatile method but also the cheapest. Digital image processing technology for medical applications was inducted into the Space Foundation Space Technology Hall of Fame in 1994.

Image processing in its broadest sense is an umbrella term for representing and analyzing data in visual form. More narrowly, image processing is the manipulation of the numeric data contained in a digital image for the purpose of enhancing its visual appearance. Through image processing, faded pictures can be enhanced, medical images clarified, and satellite photographs calibrated. Image processing software can also translate numeric information into visual images that can be edited, enhanced, filtered, or animated in order to reveal relationships previously not apparent. Image analysis, in contrast, involves collecting data from digital images in the form of measurements that can then be analyzed and transformed.
Originally developed for space exploration and biomedicine, digital image processing and analysis are now used in a wide range of industrial, artistic, and educational applications. Software for image processing and analysis is widely available on all major computer platforms. This software supports the modern adage that "a picture is worth a thousand words, but an image is worth a thousand pictures."

Fig 2.1 Standard processing of space-borne images

Each of the pixels that represent an image stored inside a computer has a pixel value which describes how bright that pixel is, and/or what color it should be. In the simplest case of binary images, the pixel value is a 1-bit number indicating either foreground or background. For a gray-scale image, the pixel value is a single number that represents the brightness of the pixel. The most common pixel format is the byte image, where this number is stored as an 8-bit integer, giving a range of possible values from 0 to 255. Typically zero is taken to be black and 255 is taken to be white; values in between make up the different shades of gray.

Although simple 8-bit integers or vectors of 8-bit integers are the most common sorts of pixel values used, some image formats support different types of value, for instance 32-bit signed integers or floating-point values. Such values are extremely useful in image processing, as they allow processing to be carried out where the resulting pixel values are not necessarily 8-bit integers. If this approach is used then it is usually necessary to set up a color map which relates particular ranges of pixel values to particular displayed colors.
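The packed-pixel representation described above can be illustrated in Java, the language this project is built in. The sketch below assumes the 0xAARRGGBB packing returned by java.awt.image.BufferedImage.getRGB(x, y), where each 8-bit channel occupies one byte of a 32-bit integer:

```java
// Sketch: unpacking the 8-bit channel values of a packed ARGB pixel,
// as returned by java.awt.image.BufferedImage.getRGB(x, y).
public class PixelValue {
    // Each channel occupies one byte of the 32-bit pixel: 0xAARRGGBB.
    static int alpha(int argb) { return (argb >>> 24) & 0xFF; }
    static int red(int argb)   { return (argb >>> 16) & 0xFF; }
    static int green(int argb) { return (argb >>> 8)  & 0xFF; }
    static int blue(int argb)  { return argb & 0xFF; }

    public static void main(String[] args) {
        int white = 0xFFFFFFFF;           // all channels at 255 (white)
        int black = 0xFF000000;           // all channels at 0 (black)
        System.out.println(red(white));   // 255
        System.out.println(blue(black));  // 0
    }
}
```

The same shifting and masking applies in reverse when writing a modified pixel back with setRGB.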
2.1) History:

Many of the techniques of digital image processing, or digital picture processing as it was often called, were developed in the 1960s at the Jet Propulsion Laboratory, the Massachusetts Institute of Technology, Bell Laboratories, the University of Maryland, and a few other research facilities, with applications to satellite imagery, wire-photo standards conversion, medical imaging, videophone, character recognition, and photograph enhancement. The cost of processing was fairly high, however, with the computing equipment of that era. That changed in the 1970s, when digital image processing proliferated as cheaper computers and dedicated hardware became available. Images could then be processed in real time for some dedicated problems such as television standards conversion. As general-purpose computers became faster, they started to take over the role of dedicated hardware for all but the most specialized and compute-intensive operations.

2.2) Image Sampling and Quantization:

To create a digital image, we need to convert the continuous sensed data into digital form. This involves two processes: 1) sampling and 2) quantization. The basic idea behind sampling and quantization is illustrated in Figs. 2.2 to 2.5. An image may be continuous with respect to the x- and y-coordinates, and also in amplitude. To convert it to digital form, we have to sample the function in both coordinates and in amplitude. Digitizing the coordinate values is called sampling; digitizing the amplitude values is called quantization.

Fig 2.2 Continuous image.

Fig 2.3 A plot of amplitude values along a line of the continuous image.

Fig 2.4 Sampling of the above continuous image.

Fig 2.5 Quantization of the sampled image.

The one-dimensional function shown in Fig. 2.3 is a plot of amplitude (gray level) values of the continuous image along the line segment AB in Fig. 2.2. The random variations are due to image noise.
To sample this function, we take equally spaced samples along line AB, as shown in Fig. 2.4. The location of each sample is given by a vertical tick mark in the bottom part of the figure, and the samples are shown as small white squares superimposed on the function. The set of these discrete locations gives the sampled function. However, the values of the samples still span (vertically) a continuous range of gray-level values. In order to form a digital function, the gray-level values must also be converted (quantized) into discrete quantities. The right side of Fig. 2.4 shows the gray-level scale divided into eight discrete levels, ranging from black to white, with vertical tick marks indicating the specific value assigned to each of the eight gray levels. The continuous gray levels are quantized simply by assigning one of the eight discrete gray levels to each sample; the assignment is made depending on the vertical proximity of a sample to a tick mark. The digital samples resulting from both sampling and quantization are shown in Fig. 2.6. Starting at the top of the image and carrying out this procedure line by line produces a two-dimensional digital image.

Sampling in the manner just described assumes that we have a continuous image in both coordinate directions as well as in amplitude. In practice, the method of sampling is determined by the sensor arrangement used to generate the image.

Fig 2.6 a) Continuous image projected onto a sensor array. b) Result of image sampling and quantization.

2.3) Image Representation:

The result of sampling and quantization is a matrix of real numbers. We will use two principal ways to represent digital images. Assume that an image f(x, y) is sampled so that the resulting digital image has M rows and N columns. The values of the coordinates (x, y) now become discrete quantities. For notational clarity and convenience, we shall use integer values for these discrete coordinates.
Thus, the values of the coordinates at the origin are (x, y) = (0, 0), and the next coordinate values along the first row of the image are represented as (x, y) = (0, 1). It is important to keep in mind that the notation (0, 1) is used to signify the second sample along the first row; it does not mean that these are the actual values of the physical coordinates when the image was sampled.

Fig 2.7 Coordinate convention used to represent digital images.

The notation introduced in the preceding paragraph allows us to write the complete M×N digital image in the following compact matrix form:

Fig 2.8 Array

The right side of this equation is by definition a digital image. Each element of this matrix array is called an image element, picture element, pixel, or pel. The terms image and pixel will be used throughout the rest of our discussion to denote a digital image and its elements. In some discussions, it is advantageous to use a more traditional matrix notation to denote a digital image and its elements:

Fig 2.9 Array

2.4) Image Operations:

2.4.1) Color depth:

Image editing encompasses the process of altering images, whether they are digital photographs, traditional analog photographs, or illustrations. Traditional analog image editing is known as photo retouching, using tools such as an airbrush to modify photographs, or editing illustrations with any traditional art medium. Graphics software programs, which can be broadly grouped into vector graphics editors, raster graphics editors, and 3D modellers, are the primary tools with which a user may manipulate, enhance, and transform images. Many image editing programs are also used to render or create computer art from scratch.

2.4.2) Contrast of images:

To apply a contrast filter, you determine whether a pixel is lighter or darker than a threshold amount. If it is lighter, you scale the pixel's intensity up; otherwise you scale it down.
In code this is done by subtracting the threshold from a pixel, multiplying by the contrast factor, and adding the threshold value back again. As with the brightness filter, the resulting value needs to be clamped to ensure it remains in the range 0–255. To apply a brightness filter you simply add a fixed amount to every pixel in the image and then clamp the result to ensure it remains in the range 0–255.

Fig 2.10 Low contrast to high contrast

2.4.3) RGB to gray scale:

In photography and computing, a gray-scale (or grey-scale) digital image is an image in which the value of each pixel is a single sample; that is, it carries only intensity information. Images of this sort, also known as black-and-white, are composed exclusively of shades of gray, varying from black at the weakest intensity to white at the strongest. Gray-scale images are distinct from one-bit bi-tonal black-and-white images, which in the context of computer imaging are images with only the two colors black and white (also called bi-level or binary images); gray-scale images have many shades of gray in between. Gray-scale images are also called monochromatic, denoting the absence of any chromatic variation (i.e., one color). Gray-scale images are often the result of measuring the intensity of light at each pixel in a single band of the electromagnetic spectrum (e.g. infrared, visible light, ultraviolet, etc.), and in such cases they are properly monochromatic when only a given frequency is captured. They can, however, also be synthesized from a full-color image.

2.4.4) Inverted image:

An inverted image can be interpreted as a digital version of a photographic negative: after inversion, every color is replaced by its exact opposite. To put this in more precise terms, a positive image is defined as a normal, original RGB or gray image.
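As a concrete companion to the descriptions above, the contrast and brightness filters of section 2.4.2 can be sketched per channel in Java as follows. The midpoint threshold of 128 is an assumed value, not one fixed by the text; in practice each of an image's channels would be run through these functions:

```java
// Sketch of the contrast and brightness point filters: contrast scales
// each value's distance from a threshold (assumed midpoint 128 here),
// brightness adds a fixed offset; both results are clamped to 0-255.
public class PointFilters {
    static int clamp(int v) { return Math.max(0, Math.min(255, v)); }

    // newValue = (value - threshold) * factor + threshold, then clamp
    static int contrast(int value, double factor) {
        int threshold = 128;   // assumed midpoint threshold
        return clamp((int) Math.round((value - threshold) * factor + threshold));
    }

    // newValue = value + offset, then clamp
    static int brighten(int value, int offset) {
        return clamp(value + offset);
    }

    public static void main(String[] args) {
        System.out.println(contrast(200, 2.0)); // 255 (272 clamped)
        System.out.println(contrast(100, 2.0)); // 72
        System.out.println(brighten(250, 20));  // 255
    }
}
```

Note how a factor above 1.0 pushes light values lighter and dark values darker, which is exactly the "scale up if lighter, down if darker" behaviour the text describes.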
A negative image denotes a tonal inversion of a positive image, in which light areas appear dark and dark areas appear light. In negative images a color reversal is also achieved, such that red areas appear cyan, greens appear magenta, and blues appear yellow. In the simpler gray-scale case, using 0 for black and 255 for white, a near-black pixel value of 5 is converted to 250, or near-white.

Image inversion is one of the easiest techniques in image processing, and it is therefore very applicable to demonstrations of performance, acceleration, and optimization. Many state-of-the-art image processing libraries, such as OpenCV, Gandalf, and VXL, perform this operation as fast as possible, even though further acceleration using parallel hardware is possible.

2.4.5) Blurring:

Blurring an image usually makes the image appear unfocused. In signal processing, blurring is generally obtained by convolving the image with a low-pass filter; the amount of blurring increases with the filter radius.

Fig 2.11 Blurring of image

2.4.6) Sharpening an image:

Sharpening is one of the most impressive transformations you can apply to an image, since it seems to bring out image detail that was not there before. What it actually does, however, is emphasize edges in the image and make them easier for the eye to pick out; while the visual effect is to make the image seem sharper, no new details are actually created.

Fig 2.12 Sharpened image

2.4.7) Histogram:

Histograms are the basis for numerous spatial-domain processing techniques, and histogram manipulation can be used effectively for image enhancement. The histogram of a digital image with gray levels in the range [0, L-1] is the discrete function h(rk) = nk, where rk is the kth gray level and nk is the number of pixels in the image having gray level rk.
It is common practice to normalize a histogram by dividing each of its values by the total number of pixels in the image, denoted by n. Thus, a normalized histogram is given by p(rk) = nk/n, for k = 0, 1, 2, …, L-1. Loosely speaking, p(rk) gives an estimate of the probability of occurrence of gray level rk. Note that the sum of all components of a normalized histogram is equal to 1.

Fig 2.13 Image and its histogram

2.4.8) Equalization:

Consider for a moment continuous functions, and let the variable r represent the gray levels of the image to be enhanced. In the initial part of our discussion we assume that r has been normalized to the interval [0, 1], with r = 0 representing black and r = 1 representing white. Later, we consider a discrete formulation and allow pixel values to be in the interval [0, L-1]. For any r satisfying the aforementioned conditions, we focus attention on transformations of the form

s = T(r), 0 ≤ r ≤ 1,

which produce a level s for every pixel value r in the original image. For reasons that will become obvious shortly, we assume that the transformation function T(r) satisfies the following conditions:

(a) T(r) is single-valued and monotonically increasing in the interval 0 ≤ r ≤ 1; and
(b) 0 ≤ T(r) ≤ 1 for 0 ≤ r ≤ 1.

The requirement in (a) that T(r) be single-valued is needed to guarantee that the inverse transformation will exist, and the monotonicity condition preserves the increasing order from black to white in the output image. A transformation function that is not monotonically increasing could result in at least a section of the intensity range being inverted, thus producing some inverted gray levels in the output image. While this may be a desirable effect in some cases, it is not what we are after in the present discussion. Finally, condition (b) guarantees that the output gray levels will be in the same range as the input levels.
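The histogram definitions of section 2.4.7, h(rk) = nk and p(rk) = nk/n, translate directly into code. The following is a minimal sketch for an 8-bit gray image represented as a flat array of gray levels (the array-of-ints representation is an assumption for illustration):

```java
// Sketch of computing the histogram h(r_k) = n_k of an 8-bit gray image
// and its normalized form p(r_k) = n_k / n (section 2.4.7).
public class Histogram {
    // Count how many pixels carry each gray level 0-255.
    static int[] histogram(int[] pixels) {
        int[] h = new int[256];
        for (int p : pixels) h[p]++;
        return h;
    }

    // Divide each count by the total pixel count n; the result sums to 1.
    static double[] normalize(int[] h, int n) {
        double[] p = new double[256];
        for (int k = 0; k < 256; k++) p[k] = (double) h[k] / n;
        return p;
    }

    public static void main(String[] args) {
        int[] img = {0, 0, 128, 255};           // a tiny 4-pixel "image"
        int[] h = histogram(img);
        double[] p = normalize(h, img.length);
        System.out.println(h[0]);               // 2
        System.out.println(p[0]);               // 0.5
    }
}
```

Histogram equalization then builds on p(rk): the discrete transformation s_k is the running sum of p(r_0) through p(r_k), scaled back to [0, L-1].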
2.5) Image Formats:

2.5.1) Raster formats:

a) Joint Photographic Experts Group (JPEG): JPEG stands for "Joint Photographic Experts Group" and, as the name suggests, the format was specifically developed for storing photographic images. It has also become a standard format for storing images in digital cameras and for displaying photographic images on web pages. JPEG files are significantly smaller than those saved as TIFF; however, this comes at a cost, since JPEG employs lossy compression. A great strength of JPEG files is their flexibility: the JPEG file format is really a toolkit of options whose settings can be altered to fit the needs of each image.

b) Tagged Image File Format (TIFF): TIFF stands for "Tagged Image File Format" and is a standard in the printing and publishing industry. TIFF files are significantly larger than their JPEG counterparts, and can be either uncompressed or compressed using lossless compression. Unlike JPEG, TIFF files can have a bit depth of either 16 bits per channel or 8 bits per channel, and multiple layered images can be stored in a single TIFF file. TIFF files are an excellent option for archiving intermediate files which you may edit later, since the format introduces no compression artefacts. Many cameras have an option to create images as TIFF files, but these can consume excessive space compared to the same JPEG file. If your camera supports the RAW file format, that is a superior alternative, since RAW files are significantly smaller and can retain even more information about your image.

c) Portable Network Graphics (PNG): PNG uses ZIP compression, which is lossless and slightly more effective than LZW (giving slightly smaller files). PNG is a newer format, designed to be both versatile and royalty-free, dating from the period when the LZW patent was disputed. The PNG file format was created as the free, open-source successor to the GIF. PNG supports true color (16 million colors), while GIF supports only 256 colors.
PNG excels when the image has large, uniformly colored areas. The lossless PNG format is best suited for editing pictures, while lossy formats like JPEG are best for the final distribution of photographic images.

d) Graphics Interchange Format (GIF): GIF is limited to an 8-bit palette, or 256 colors. This makes the GIF format suitable for storing graphics with relatively few colors, such as simple diagrams, shapes, logos, and cartoon-style images. The GIF format supports animation and is still widely used to provide image animation effects. It also uses a lossless compression that is more effective when large areas have a single color, and ineffective for detailed or dithered images.

e) Bitmap Format (BMP): The BMP file format (Windows bitmap) handles graphics files within the Microsoft Windows OS. Typically, BMP files are uncompressed and hence large; the advantage is their simplicity and wide acceptance in Windows programs.

f) RAW: RAW refers to a family of raw image formats that are options available on some digital cameras. These formats usually use lossless or nearly lossless compression, and produce file sizes much smaller than the TIFF formats of full-size processed images from the same cameras. Although there is a standard raw image format (ISO 12234-2, TIFF/EP), the raw formats used by most cameras are not standardized or documented, and differ among camera manufacturers. Many graphics programs and image editors may not accept some or all of them, and some older ones have effectively been orphaned already. Adobe's Digital Negative (DNG) specification is an attempt at standardizing a raw image format to be used by cameras, or for archival storage of image data converted from undocumented raw image formats, and is used by several niche and minority camera manufacturers including Pentax, Leica, and Samsung.
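Since this project loads and saves images from Java, it is worth noting that several of the raster formats above are handled by the standard javax.imageio.ImageIO API. The sketch below simply queries which formats the local runtime can read and write; JPEG, PNG, GIF and BMP are standard, while TIFF read/write support was only bundled with the JDK from Java 9 onward:

```java
import javax.imageio.ImageIO;
import java.util.Arrays;

// Sketch: querying which of the raster formats above the local Java
// runtime can read and write via the standard ImageIO API.
public class FormatSupport {
    public static void main(String[] args) {
        System.out.println("Readers: " + Arrays.toString(ImageIO.getReaderFormatNames()));
        System.out.println("Writers: " + Arrays.toString(ImageIO.getWriterFormatNames()));
    }
}
```

This matters for steganography in particular: embedding in a JPEG is destroyed by its lossy re-compression, so lossless formats such as PNG or BMP are the safe carriers.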
2.5.2) Vector formats:

a) Computer Graphics Metafile (CGM): CGM is a file format for 2D vector graphics, raster graphics, and text, defined by ISO/IEC 8632. All graphical elements can be specified in a textual source file that can be compiled into a binary file or one of two text representations. CGM provides a means of graphics data interchange for computer representation of 2D graphical information independent of any particular application, system, platform, or device. It has been adopted to some extent in the areas of technical illustration and professional design, but has largely been superseded by formats like SVG.

b) Scalable Vector Graphics (SVG): SVG is an open standard created and developed by the World Wide Web Consortium to address the need (and the attempts of several corporations) for a versatile, scriptable, all-purpose vector format for the web and elsewhere. The SVG format does not have a compression scheme of its own, but due to the textual nature of XML, an SVG graphic can be compressed using a program such as gzip. Because of its scripting potential, SVG is a key component in web applications: interactive web pages that look and act like applications.

3) Message Encoding Techniques:

3.1) Steganography:

Steganography is the art and science of invisible communication. This is accomplished by hiding information in other information, thus hiding the existence of the communicated information; in this sense image steganography is a better approach than cryptography. Steganography includes the concealment of information within computer files. In digital steganography, electronic communications may include steganographic coding inside a transport layer, such as a document file, image file, program, or protocol.
Media files are ideal for steganographic transmission because of their large size. As a simple example, a sender might start with an innocuous image file and adjust the color of every 100th pixel to correspond to a letter in the alphabet, a change so subtle that someone not specifically looking for it is unlikely to notice it.

Fig 3.1 Steganography

Patchwork: The biggest disadvantage of the patchwork approach is the small amount of information that can be hidden in one image. This property can be changed to accommodate more information, but one may have to sacrifice the secrecy of the information. Patchwork's main advantage, however, is its robustness against malicious or unintentional image manipulation. Should a stego image using patchwork be cropped or rotated, some of the message data may be lost, but since the message is repeatedly embedded in the image, most of the information will survive. Patchwork is therefore most suitable for transmitting a small amount of very sensitive information.

3.1.1) Ancient steganography:

The first recorded uses of steganography can be traced back to 440 BC, when Herodotus mentions two examples in The Histories. Demaratus sent a warning about a forthcoming attack on Greece by writing it directly on the wooden backing of a wax tablet before applying its beeswax surface; wax tablets were in common use then as reusable writing surfaces, sometimes used for shorthand. Another ancient example is that of Histiaeus, who shaved the head of his most trusted slave and tattooed a message on it. After the slave's hair had grown back, the message was hidden. The purpose was to instigate a revolt against the Persians.

3.1.2) Modern steganography:

Fig 3.2 Modern steganography — the prisoner's problem: Alice and Bob are communicating with each other by exchanging secret messages.

Fig 3.3 Block diagram

3.1.3) Steganography in practice:

Fig 3.4 Steganography in practice

Steganography is a special case of data hiding.
Data hiding, however, is not always steganography. In steganography the main goal is for Alice and Bob to communicate without arousing the warden Wendy's suspicion.

3.1.4) Different techniques:
a) Physical steganography:
- Hidden messages within wax tablets – in ancient Greece, people wrote messages on the wood and then covered it with wax, upon which an innocent covering message was written.
- Hidden messages on a messenger's body – also used in ancient Greece. Herodotus tells the story of a message tattooed on a slave's shaved head, hidden by the growth of his hair, and exposed by shaving his head again. The message allegedly carried a warning to Greece about Persian invasion plans. This method has obvious drawbacks, such as delayed transmission while waiting for the slave's hair to grow, and the restriction on the number and size of messages that can be encoded on one person's scalp.
- During World War II, the French Resistance sent some messages written on the backs of couriers using invisible ink.
- Hidden messages on paper written in secret inks, under other messages or on the blank parts of other messages.
- Messages written in Morse code on knitting yarn and then knitted into a piece of clothing worn by a courier.
- Messages written on envelopes in the area covered by postage stamps.
- During and after World War II, espionage agents used photographically produced microdots to send information back and forth. Microdots were typically minute, smaller than the period produced by a typewriter. World War II microdots needed to be embedded in the paper and covered with an adhesive, such as collodion. The adhesive was reflective and thus detectable by viewing against glancing light.
b) Digital Steganography:
Modern steganography entered the world in 1985 with the advent of personal computers being applied to classical steganography problems. Development following that was slow, but has since taken off, judging by the number of "stego" programs available.
Digital steganography techniques include:
- Concealing messages within the lowest bits of noisy images or sound files.
- Concealing data within encrypted data or within random data. The data to be concealed is first encrypted before being used to overwrite part of a much larger block of encrypted data or a block of random data (an unbreakable cipher like the one-time pad generates ciphertexts that look perfectly random to anyone without the private key).
- Chaffing and winnowing.
- Mimic functions, which convert one file to have the statistical profile of another. This can thwart the statistical methods that help brute-force attacks identify the right solution in a ciphertext-only attack.
- Concealing messages in tampered executable files, exploiting redundancy in the targeted instruction set.
- Pictures embedded in video material (optionally played at slower or faster speed).
- Injecting imperceptible delays into packets sent over the network from the keyboard. Delays in key presses in some applications (telnet or remote desktop software) can mean a delay in packets, and the delays in the packets can be used to encode data.
- Changing the order of elements in a set.
- Content-aware steganography, which hides information in the semantics a human user assigns to a datagram. These systems offer security against a non-human adversary/warden.
- Blog steganography. Messages are fractionalized and the (encrypted) pieces are added as comments on orphaned web logs (or pin boards on social network platforms). In this case the selection of blogs is the symmetric key that sender and recipient are using; the carrier of the hidden message is the whole blogosphere.
- Modifying the echo of a sound file (echo steganography).
- Secure steganography for audio signals.
- Image bit-plane complexity segmentation steganography.
c) Network Steganography:
All information hiding techniques that may be used to exchange steganograms in telecommunication networks can be classified under the general term of network steganography.
This nomenclature was originally introduced by Krzysztof Szczypiorski in 2003. In contrast to typical steganographic methods, which utilize digital media (image, audio and video files) as a cover for hidden data, network steganography utilizes the control elements of communication protocols and their basic intrinsic functionality. As a result, such methods are harder to detect and eliminate. Typical network steganography methods involve modifying the properties of a single network protocol. Such modification can be applied to the PDU (Protocol Data Unit). Moreover, it is feasible to utilize the relation between two or more different network protocols to enable secret communication; these applications fall under the term inter-protocol steganography. Network steganography covers a broad spectrum of techniques, which include, among others:
- Steganophony – the concealment of messages in Voice-over-IP conversations, e.g. the use of delayed or corrupted packets that would normally be ignored by the receiver (this method is called LACK – Lost Audio Packets Steganography), or, alternatively, hiding information in unused header fields.
- WLAN steganography – methods for transmitting steganograms in Wireless Local Area Networks. A practical example of WLAN steganography is the HICCUPS system (Hidden Communication System for Corrupted Networks).
d) Printed Steganography:
Digital steganography output may be in the form of printed documents. A message, the plaintext, may first be encrypted by traditional means, producing a ciphertext. Then an innocuous cover text is modified in some way so as to contain the ciphertext, resulting in the stego text. For example, the letter size, spacing, typeface, or other characteristics of a cover text can be manipulated to carry the hidden message. Only a recipient who knows the technique used can recover the message and then decrypt it.
Francis Bacon developed Bacon's cipher as such a technique. The ciphertext produced by most digital steganography methods, however, is not printable. Traditional digital methods rely on perturbing noise in the channel file to hide the message; as such, the channel file must be transmitted to the recipient with no additional noise from the transmission. Printing introduces much noise into the ciphertext, generally rendering the message unrecoverable.
e) Text Steganography:
Steganography can be applied to different types of media, including text, audio, image and video. Text steganography is considered the most difficult kind, owing to the lack of redundancy in text compared to image or audio, but it has the advantages of smaller memory occupation and simpler communication. One method that can be used for text steganography is data compression. Data compression encodes information in one representation into another representation that is smaller in size. One possible scheme to achieve data compression is Huffman coding, which assigns shorter code words to more frequently occurring source symbols and longer code words to less frequently occurring ones. Unicode steganography uses lookalike characters of the usual ASCII set to look normal while actually carrying extra bits of information. If the text is displayed correctly, there should be no visual difference from ordinary text; some systems, however, may display the fonts differently, and the hidden text would then be easily spotted.

3.2) Algorithms:
3.2.1) LSB:
Algorithm to embed a text message:
Step 1: Read the cover image and the text message which is to be hidden in the cover image.
Step 2: Convert the text message into binary.
Step 3: Calculate the LSB of each pixel of the cover image.
Step 4: Replace the LSBs of the cover image with the bits of the secret message, one by one.
Step 5: Write the stego image.
Algorithm to retrieve the text message:
Step 1: Read the stego image.
Step 2: Calculate the LSB of each pixel of the stego image.
Step 3: Retrieve the bits and convert each group of 8 bits into a character.

Flowchart for LSB encoding:
Start → Display image → Enter the string to be encoded → Find the ASCII code for each letter → Convert the ASCII code into binary → Extract the RGB values from the image → (Encoding) Embed the first three bits of a letter into the last three bits of the red value, the next three bits into the green value, and the last two bits into the blue value of a pixel → Store the RGB values in an array → Display image → Stop.

Flowchart for LSB decoding:
Start → Display image → Retrieve the RGB values from the image for each pixel → (Decoding) Extract the first three bits of a letter from the last three bits of the red value, the next three bits from the green value, and the last two bits from the blue value of a pixel → Store them in an array → Convert the binary code to ASCII → Display the string → Stop.

3.2.2) DCT Algorithm:
Algorithm to embed a text message:
Step 1: Read the cover image.
Step 2: Read the secret message and convert it into binary.
Step 3: Break the cover image into 8×8 blocks of pixels.
Step 4: Working from left to right, top to bottom, subtract 128 from each pixel in each block.
Step 5: Apply the DCT to each block.
Step 6: Compress each block using the quantization table.
Step 7: Calculate the LSB of each DC coefficient and replace it with a bit of the secret message.
Step 8: Write the stego image.
Algorithm to retrieve the text message:
Step 1: Read the stego image.
Step 2: Break the stego image into 8×8 blocks of pixels.
Step 3: From left to right, top to bottom, subtract 128 from each pixel in each block.
Step 4: Apply the DCT to each block.
Step 5: Compress each block using the quantization table.
Step 6: Calculate the LSB of each DC coefficient.
Step 7: Retrieve the bits and convert each group of 8 bits into a character.

3.2.3) Comparison:
The relative ease of implementing LSB makes it a popular method. To hide a secret message in an image, a proper cover image is needed.
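To make Step 5 of the DCT algorithm above concrete, here is an illustrative, deliberately naive Java sketch of the 2-D DCT-II of one 8×8 block. The class and method names are hypothetical, and this is not the project's actual code; a production implementation would use a fast transform instead of this direct O(N⁴) form.

```java
// Hypothetical sketch: direct 2-D DCT-II of an 8x8 block of level-shifted
// pixel values (i.e. after Step 4's "subtract 128").
public class Dct8x8 {
    static final int N = 8;

    // Returns the 8x8 matrix of DCT coefficients; out[0][0] is the DC term.
    static double[][] dct(double[][] block) {
        double[][] out = new double[N][N];
        for (int u = 0; u < N; u++) {
            for (int v = 0; v < N; v++) {
                double sum = 0;
                for (int x = 0; x < N; x++)
                    for (int y = 0; y < N; y++)
                        sum += block[x][y]
                             * Math.cos((2 * x + 1) * u * Math.PI / (2.0 * N))
                             * Math.cos((2 * y + 1) * v * Math.PI / (2.0 * N));
                // Orthonormal scaling factors.
                double cu = (u == 0) ? Math.sqrt(1.0 / N) : Math.sqrt(2.0 / N);
                double cv = (v == 0) ? Math.sqrt(1.0 / N) : Math.sqrt(2.0 / N);
                out[u][v] = cu * cv * sum;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A flat block: every pixel 130, so every level-shifted value is 2.0.
        double[][] block = new double[N][N];
        for (double[] row : block) java.util.Arrays.fill(row, 2.0);
        // For a constant block, all energy lands in the DC coefficient
        // (about 16.00 here) and the AC coefficients are ~0.
        System.out.printf("DC = %.2f%n", dct(block)[0][0]);
    }
}
```

A flat block concentrating all of its energy in the single DC coefficient is exactly why Step 7 can hide message bits there with little visible effect.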
Because this method uses bits of every pixel in the image, it is necessary to use a lossless compression format; otherwise the hidden information will be lost in the transformations of a lossy compression algorithm. The disadvantages of LSB alteration lie mainly in the fact that it requires a fairly large cover image to create a usable amount of hiding space. Uncompressed images of 800×600 pixels are not often used on the Internet, so transmitting them might arouse suspicion. Another disadvantage arises when an image concealing a secret is compressed with a lossy algorithm: the hidden message will not survive the operation and is lost after the transformation. The Discrete Cosine Transform, on the other hand, provides a mathematical and computational method of taking spatial data, dividing it into parts of differing importance with respect to visual quality, and compressing it into an accurate and overall high-quality image. The DCT upholds many of the properties of the Fourier Transform, such as orthonormality and the relations that follow from it, such as Parseval's and Plancherel's theorems. Its inverse, the IDCT, allows reconstruction of an image frame that has been encoded by the DCT, transforming it back to the spatial domain. Despite its similarities to the Fourier Transform, it has been shown that for application purposes the DCT is much more practical and efficient, and it is commonly associated with image compression for JPEG and MPEG files. A comparative analysis of LSB-based and DCT-based steganography is done on the basis of parameters such as PSNR. Both grayscale and colour images have been used for the experiments. The peak signal-to-noise ratio (PSNR), expressed in decibels, is used as a quality measurement between two images: the higher the PSNR, the better the quality.
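The PSNR comparison can be made concrete with a small sketch. This is a hedged illustration with hypothetical names, not the project's code; it assumes 8-bit images flattened into equal-length int arrays.

```java
// Hypothetical sketch: PSNR between two equal-size 8-bit images.
// PSNR = 10 * log10(MAX^2 / MSE), with MAX = 255 for 8-bit pixels.
public class Psnr {
    static double psnr(int[] cover, int[] stego) {
        double mse = 0;
        for (int i = 0; i < cover.length; i++) {
            double d = cover[i] - stego[i];
            mse += d * d;
        }
        mse /= cover.length; // mean squared error
        if (mse == 0) return Double.POSITIVE_INFINITY; // identical images
        double max = 255.0;
        return 10.0 * Math.log10(max * max / mse);
    }

    public static void main(String[] args) {
        int[] cover = {52, 55, 61, 66};
        int[] stego = {52, 55, 61, 67}; // one LSB flipped
        // Flipping a single LSB in four pixels still gives a PSNR above
        // 50 dB, which is why LSB embedding is visually imperceptible.
        System.out.printf("PSNR = %.2f dB%n", psnr(cover, stego));
    }
}
```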
A comparison of LSB-based and DCT-based stego images using PSNR shows that the PSNR of the DCT-based steganography scheme is higher than that of the LSB-based scheme for all types of images (grayscale as well as colour). The DCT-based scheme works with minimal distortion of the image quality compared to the LSB-based scheme. Even though the amount of secret data that can be hidden using this technique is very small compared to the LSB-based scheme, the DCT-based scheme is still recommended because of the minimal distortion of image quality.

3.3) Applications:
a) Used in modern printers.
b) Alleged use by terrorists.
c) Alleged use by intelligence agencies.
d) Privacy and anonymity are a concern on the Internet; steganography allows two parties to communicate secretly and covertly.
e) It allows morally conscious people to safely whistle-blow on internal actions.
f) It allows copyright protection on digital files, using the hidden message as a digital watermark.
g) One of the other main uses of image steganography is the transportation of high-level or top-secret documents between international governments.

4) Software paradigm:
4.1) Waterfall model:
The waterfall model is a sequential design process, often used in software development, in which progress is seen as flowing steadily downwards (like a waterfall) through the phases of conception, initiation, analysis, design, construction, testing, production/implementation and maintenance.
4.2) Software Requirements Specification:
A requirements specification for a software system is a complete description of the behaviour of the system to be developed.
It includes a set of use cases that describe all the interactions the users will have with the software. In addition to use cases, the SRS also contains non-functional (or supplementary) requirements. Non-functional requirements are requirements which impose constraints on the design or implementation (such as performance engineering requirements, quality standards, or design constraints).
4.3) Design:
A software development process, also known as a software development life cycle (SDLC), is a structure imposed on the development of a software product. Similar terms include software life cycle and software process. It is often considered a subset of the systems development life cycle. There are several models for such processes, each describing approaches to the variety of tasks or activities that take place during the process. Some people consider a life-cycle model a more general term and a software development process a more specific term; for example, there are many specific software development processes that 'fit' the spiral life-cycle model. ISO 12207 is an ISO standard for software life-cycle processes. It aims to be the standard that defines all the tasks required for developing and maintaining software.
4.4) Testing:
Testing is the process of identifying errors during the execution of a program. There are three types of testing: unit testing, integration testing and system testing. Unit testing is done on a single module. Integration testing is done by testing two or more modules together. System testing is done when all the modules of the software work and are tested together.
4.4.1) Unit Testing:
In computer programming, unit testing is a method by which individual units of source code are tested to determine whether they are fit for use. A unit is the smallest testable part of an application. In procedural programming a unit may be an individual function or procedure.
In object-oriented programming a unit is usually an interface, such as a class. Unit tests are created by programmers or occasionally by white-box testers during the development process. The goal of unit testing is to isolate each part of the program and show that the individual parts are correct. A unit test provides a strict, written contract that the piece of code must satisfy, and as a result it affords several benefits: in particular, unit tests find problems early in the development cycle.
4.4.2) Integration Testing:
Integration testing is a logical extension of unit testing. In its simplest form, two units that have already been tested are combined into a component and the interface between them is tested. A component, in this sense, refers to an integrated aggregate of more than one unit. In a realistic scenario, many units are combined into components, which are in turn aggregated into even larger parts of the program. Integration testing (sometimes called Integration and Testing, abbreviated "I&T") is the phase in software testing in which individual software modules are combined and tested as a group. It occurs after unit testing and before system testing. Integration testing takes as its input modules that have been unit tested, groups them into larger aggregates, applies the tests defined in an integration test plan to those aggregates, and delivers as its output the integrated system, ready for system testing. The purpose of integration testing is to verify the functional, performance, and reliability requirements placed on major design items. These "design items", i.e. assemblages (or groups of units), are exercised through their interfaces using black-box testing, with success and error cases being simulated via appropriate parameter and data inputs. Simulated usage of shared data areas and inter-process communication is tested, and individual subsystems are exercised through their input interfaces.
Test cases are constructed to check that all components within assemblages interact correctly, for example across procedure calls or process activations; this is done after testing the individual modules, i.e. unit testing. The overall idea is a "building block" approach, in which verified assemblages are added to a verified base, which is then used to support the integration testing of further assemblages.
4.4.3) System Testing:
System testing of software or hardware is testing conducted on a complete, integrated system to evaluate the system's compliance with its specified requirements. System testing falls within the scope of black-box testing, and as such should require no knowledge of the inner design of the code or logic. As a rule, system testing takes as its input all of the "integrated" software components that have successfully passed integration testing, together with the software system itself.

5) Interface Layouts and Results:
GUI for user input:
Fig 5.1 Interface for Loading an Image
Fig 5.2 Dialog Box for Image File Selection
Fig 5.3 Loaded Image
Fig 5.4 Interface for Image Operations
Fig 5.5 Blurred Image
Fig 5.6 Contrast Image
Fig 5.7 Gray Scale Image
Fig 5.8 Inverted Image
Fig 5.9 Sharpened Image
Output for steganography – encoding:
Fig 5.10 Image before Encoding
Fig 5.11 Image after Encoding
Output for steganography – decoding:
Fig 5.12 Interface when a wrong password is entered
Fig 5.13 Message decoded with the correct password

6) Conclusion:
In this report we studied the various image filtering operations, which help to enhance, modify, warp or otherwise transform images. A filter computes each output pixel by comparing a pixel with the pixels around it. Steganography is the art of hiding information in an innocuous cover. Its basic purpose is to make the communication inaccessible to those who do not possess the right keys. The message can be hidden inside images or other digital objects, where it remains imperceptible to a casual observer.
By embedding a secret message into a cover image, a stego-image is obtained; the stego-image should not contain any easily detectable visual artifacts due to the message embedding. Common approaches for hiding messages in images include Least Significant Bit (LSB) insertion methods, frequency-domain techniques, spread-spectrum techniques, statistical methods, cover-generation methods and fractal techniques. The change in behaviour of the stego-image depends on the specific approach used for hiding the information, and attacks on stego-images likewise differ in nature according to the steganographic approach they target. We studied the LSB technique of image encoding. LSB is often used because it is the easiest technique to encode and decode. The image formats typically used with such steganography methods are lossless, and the data can be directly inserted and recovered in the presence of the stego-key. For a 24-bit image, bits of each colour component (red, green and blue) are changed. LSB is effective with BMP images, since BMP compression is lossless; however, hiding a secret message inside a BMP file using the LSB algorithm requires a large image to serve as a cover. LSB substitution is also possible for the GIF format, but the problem with GIF images is that whenever a least significant bit is changed, the whole colour palette can change. The problem can be avoided by using only grayscale GIF images, since a grayscale image contains 256 shades and the changes occur gradually, making them very hard to detect. For JPEG, direct LSB substitution in pixel values is not possible, since JPEG uses lossy compression; instead, LSB-style substitution must be applied in the compressed domain, as in the DCT-based scheme. In summary, LSB is an efficient technique for encoding large amounts of data and is comparatively easier to implement than other techniques.
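As an illustration of the 24-bit RGB embedding just described, following the 3–3–2 bit split from the LSB flowchart in section 3.2.1, the per-pixel operation might look like the following sketch. The class and method names are hypothetical, not the project's actual code.

```java
// Hypothetical sketch: embed one 8-bit character into a single 24-bit RGB
// pixel (packed 0xRRGGBB): 3 bits in red, 3 in green, 2 in blue LSBs.
public class LsbPixel {
    // Overwrite the channel LSBs with the character's bits.
    static int embed(int rgb, char ch) {
        int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
        r = (r & ~0x07) | ((ch >> 5) & 0x07); // top 3 bits -> red LSBs
        g = (g & ~0x07) | ((ch >> 2) & 0x07); // next 3 bits -> green LSBs
        b = (b & ~0x03) | (ch & 0x03);        // last 2 bits -> blue LSBs
        return (r << 16) | (g << 8) | b;
    }

    // Reassemble the character from the channel LSBs.
    static char extract(int rgb) {
        int r = (rgb >> 16) & 0x07, g = (rgb >> 8) & 0x07, b = rgb & 0x03;
        return (char) ((r << 5) | (g << 2) | b);
    }

    public static void main(String[] args) {
        int stego = embed(0xFFFFFF, 'A'); // white pixel becomes 0xFAF8FD
        System.out.println(extract(stego)); // prints A
    }
}
```

Note how little the pixel changes: each channel moves by at most 7 levels out of 256, which is why the embedding is imperceptible to a casual observer.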
7) Scope and Limitations:
In the present world, data transfer over the Internet is growing rapidly because it is easier and faster to move data to its destination. Many individuals and businesses therefore transfer business documents and important information over the Internet. Security is an important issue when transferring data this way, because an unauthorized individual can intercept the data and render it useless, or obtain information not intended for them. The scope of the project is to limit unauthorized access and provide better security during message transmission. To meet these requirements, I use a simple and basic approach to steganography. In this project, the proposed approach finds a suitable algorithm for embedding data in an image using steganography, providing a better security pattern for sending messages through a network. I used the Least Significant Bit algorithm to develop the application; it is fast and reliable, and its compression ratio is moderate compared to other algorithms. The change in image quality when the message is embedded is negligible, and the image is protected with a personal password, making it difficult for unauthorized personnel to recover or damage the data.

8) Future Scope:
Future work includes experimentation with a wider range of higher-quality, more optimised images, and making the message survive alterations to the cover image such as cropping and resizing. Further work could improve the compression ratio of the image relative to the text. The project can also be extended so that it works with different image formats such as .bmp, .jpeg and .tif. The security of the Least Significant Bit algorithm is good, but the level can be improved to a certain extent by varying the carriers as well as by using different keys for encryption and decryption.
Future implementations will extend the work to the other approaches referred to above, such as DCT. This will include an implementation of DCT-based embedding, which is better suited to formats like JPEG and MPEG. A DCT implementation will allow working with a wider range of images, and fast transform algorithms (analogous to the FFT) keep the amount of calculation low.

9) References:
a) 1000 Java Tips (Alexandre Patchine, Dr. Heinz Kabutz)
b) Digital Image Processing, 2nd Edition (Rafael C. Gonzalez)
c) Java: The Complete Reference, 5th Edition (Herbert Schildt)
d) IEEE paper titled "A steganography implementation" by Mehboob and Faruqui.
e) Bandwidthers blog on steganography.
f) A quick primer on MSDN blogs.
g) "Implementation and Evaluation of Image Processing Algorithms on Reconfigurable Architecture" by Daggu Venkateshwar Rao, Shruti Patil, Naveen Anne Babu and V Muthukumar.
h) http://www.peterindia.net/SteganographyLinks.html
i) Paper titled "Information Security through Image Steganography" by Nani Koduri, M.S. in Information Security and Computer Forensics.
j) Paper titled "Hiding Data in Data" by Gary C. Kessler, Windows & .NET Magazine, April 2002.