JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN ELECTRONICS AND COMMUNICATION ENGINEERING STEGANOGRAPHY THROUGH DIGITAL IMAGE PROCESSING TECHNIQUE 1 PALLAVI KHARE, 2 JAIKARAN SINGH, 3 MUKESH TIWARI 1,2,3 Dept. of E&TC, SSSIST Bhopal, India Pallavi3386@gmail.com ABSTRACT: This paper presents a proposal for communication which is major part of daily life of today. We are living an age of science. . With the increasing communication traffic demand, data security has become very important field. Lots of data security and data hiding algorithms have been developed in the last decade. we are implementing a method of “Digital Stenography” for image data hiding. Digital Image Steganography system allows an average user to securely transfer text messages by hiding them in a digital image file. A combination of Steganography and encryption algorithms provides a strong backbone for its security. Digital Image Steganography system features innovative techniques for hiding text in a digital image file or even using it as a key to the encryption. KEYWORDS: Steganography, Digital Stenography, Encryption ,Digitization Of Steganography. 1. INTRODUCTION The word steganography is of Greek origin and means "covered, or hidden writing". Steganography is the art and science of writing hidden messages in such a way that no one apart from the sender and intended recipient even realizes there is a hidden message. Generally, a steganographic message will appear to be something else: a picture, an article, a shopping list, or some other message. A steganographic message (the plaintext) is often first encrypted by some traditional means, producing a cipher text. Digital image steganography system is a stand-alone application that combines steganography and encryption to enhance the confidentiality of intended message. The user’s intended message is first encrypted to create unintelligible cipher text. Then the cipher text will be hidden within an image file in such a way as to minimize the perceived loss in quality. The recipient of the image is able to retrieve the hidden message back from the image with this system. 1.1 ENCODING In order to hide text in an image, the user must provide the text and a Target Image in which it is to be hidden. Optionally, users may enter or load a text key. When a user key is not provided, a default key is used. To make choosing images easier, an image bar is provided for the user. The user may set the source directory, whose contents are displayed as thumbnails in the image bar located on the bottom panel of application. These thumbnails are automatically generated from the specified source directory. The user then selects a target image and loads it for use as the Target Image. The user may also open the Target Image by selecting Open Image under File menu. The Image, JPEG user may enter the text to be embedded in several ways. He or she may type it directly into the text window, open a text file, or paste text from another application. The user may also open a text file by selecting Open Text under the File menu or use the Edit menu to paste from the clipboard. Loading a text key can be done from the Open Text Key selection on the Tools menu. When user has specified both the target image, and text body, Digital image steganography system is ready to hide the text body within the image. The user may then select Embed Text from the Tools menu. The key image will then be hashed into an appropriate Key Value. This value will be used for the encryption of the user’s plaintext to produce cipher text. The cipher text and key value will then be input into the steganography and bit placement algorithms, and the Output Image will be generated. After the application has generated the image, the user may then inspect it by selecting the Image tab on the main panel. The user is then able to compare the encrypted target image with the original target image located on the right box of the main window. The user may wish to save the generated image by selecting Save Image or Save Image As… under File menu. 1.2 DECODING In order to retrieve the hidden text from a source image, the user needs to provide a Source Image and any Keys used when the Source Image was generated. Loading the Source Image file is done similarly to the process of opening a Target Image. If the sender has used the default key, the recipient need not load any keys to the application. If a non-default key was used in text hiding process, the receiving party must have prearranged knowledge of the key for use in retrieving the text. This key could be text ISSN: 0975 – 6779| NOV 10 TO OCT 11 | VOLUME – 01, ISSUE - 02 Page 137 JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN ELECTRONICS AND COMMUNICATION ENGINEERING file or a key string. If the key is a text file, the user may load the key by selecting Open Text Key from the Tools menu. Lastly, the user may simply type in the key by activating the Text tab of the key panel and entering it into the text box there. After the Source Image and any required Keys are loaded into the application, the hidden text can be retrieved by selecting Extract Text from the Tools menu. The key will be hashed into an appropriate Key Value. This key value will be used to recover the hidden cipher text from the Source Image. The key will then be used for the decryption of the cipher text to produce plaintext. The plaintext will finally be displayed in the text box on the Text tab of the main panel. After the application has generated the text, the user may wish to save the generated text by selecting Save Text under the File menu. The application is not able to identify the presence of the hidden message in images until the actual decoding process is attempted, as the image files are virtually indistinguishable from normal images. 2. DIGITAL IMAGE DEFINITIONS A digital image a[m,n] described in a 2D discrete space is derived from an analog image a(x,y) in a 2D continuous space through a sampling process that is frequently referred to as digitization. The effect of digitization is shown in Figure 1. The 2D continuous image a(x,y) is divided into N rows and M columns. The intersection of a row and a column is termed a pixel. The value assigned to the integer coordinates [m,n] with {m=0,1,2,...,M-1} and {n=0,1,2,...,N-1} is a[m,n]. In fact, in most cases a(x,y)--which we might consider to be the physical signal that impinges on the face of a 2D sensor--is actually a function of many variables including depth (z), color (λ), and time (t).. The image shown in Figure 1 has been divided into N = 16 rows and M = 16 columns. The value assigned to every pixel is the average brightness in the pixel rounded to the nearest integer value. The process of representing the amplitude of the 2D signal at a given coordinate as an integer value with L different gray levels is usually referred to as amplitude quantization. Fig. 1. Digitization of a continuous image Common Values: There are standard values for the various parameters encountered in digital image processing. These values can be caused by video standards, by algorithmic requirements, or by the desire to keep digital circuitry simple. Table 1 gives some commonly encountered values. Parameter Symbol Typical values Rows N Columns M Gray Levels L 256, 512 , 525 ,625 , 1024, 1035 256, 512, 768, 1024, 1320 2, 64, 256, 1024, 4096, 16384 Table 2.1 Common values of digital image parameters Quite frequently we see cases of M=N=2K where {K = 8,9,10}. The number of distinct gray levels is usually a power of 2, that is, L=2B where B is the number of bits in the binary representation of the brightness levels. When B>1 we speak of a gray-level image; when B=1 we speak of a binary image. 3. IMAGE DEFINITION To a computer, an image is a collection of numbers that constitute different light intensities in different areas of the image. This numeric representation forms a grid and the individual points are referred to as pixels. Most images on the Internet consists of a rectangular map of the image’s pixels (represented as bits) where each pixel is located and its colour. These pixels are displayed horizontally row by row. The number of bits in a colour scheme, called the bit depth, refers to the number of bits used for each pixel. The smallest bit depth in current colour schemes is 8, meaning that there are 8 bits used to describe the colour of each pixel. Monochrome and grey scale images use 8 bits for each pixel and are able to display 256 different colours or shades of grey. Digital colour images are typically stored in 24bit files and use the RGB colour model, also known as true colour. All colour variations for the pixels of a 24 bit image are derived from three primary colours: red, green and blue, and each primary colour is represented by 8 bits. Thus in one given pixel, there can be 256 different quantities of red, green and blue, adding up to more than 16-million combinations, resulting in more than 16-million colours. Not surprisingly the larger amount of colours that can be displayed, the larger the file size. When working with larger images of greater bit depth, the images tend to become too large to transmit over a standard Internet connection. In order to display an image in a reasonable amount of time, techniques must be incorporated to reduce the image’s file size. These techniques make use of mathematical formulas to analyze and condense image data, resulting in smaller file sizes. This process is called compression. ISSN: 0975 – 6779| NOV 10 TO OCT 11 | VOLUME – 01, ISSUE - 02 Page 138 JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN ELECTRONICS AND COMMUNICATION ENGINEERING In images there are two types of compression: lossy and lossless. Both methods save storage space, but the procedures that they implement differ. Lossy compression creates smaller files by discarding excess image data from the original image. It removes details that are too small for the human eye to differentiate, resulting in close approximations of the original image, although not an exact duplicate. An example of an image format that uses this compression technique is JPEG (Joint Photographic Experts Group). Lossless compression, on the other hand, never removes any information from the original image, but instead represents data in mathematical formulas. The original image’s integrity is maintained and the decompressed image output is bit-by-bit identical to the original image input. The most popular image formats that use lossless compression is GIF (Graphical Interchange Format) and 8-bit BMP (a Microsoft Windows bitmap file). Compression plays a very important role in choosing which steganographic algorithm to use. Lossy compression techniques result in smaller image file sizes, but it increases the possibility that the embedded message may be partly lost due to the fact that excess image data will be removed. Lossless compression though, keeps the original digital image intact without the chance of lost, although is does not compress the image to such a small file size. Different steganographic algorithms have been developed for both of these compression types and are explained in the following sections. 3.1Image and Transform Domain Image steganography techniques can be divided into two groups: those in the Image Domain and those in the Transform Domain. Image – also known as spatial – domain techniques embed messages in the intensity of the pixels directly, while for transform – also known as frequency – domain, images are first transformed and then the message is embedded in the image. Image domain techniques encompass bit-wise methods that apply bit insertion and noise manipulation and are sometimes characterized as “simple systems”. The image formats that are most suitable for image domain steganography are lossless and the techniques are typically dependent on the image format. Steganography in the transform domain involves the manipulation of algorithms and image transforms. These methods hide messages in more significant areas of the cover image, making it more robust. Many transform domain methods are independent of the image format and the embedded message may survive conversion between lossy and lossless compression. In the next sections I have explained steganographic algorithms in categories according to image file formats and the domain in which they are performed. Steganography Text Images Transform Domain Audio/Video Image Domain Patchwork JPEG LSB in BMP& GIF Spread Spectrum Fig. 2 Categories of Image Steganography 4. IMAGE DOMAIN Least significant bit (LSB) insertion is a common, simple approach to embedding information in a cover image. The least significant bit (in other words, the 8th bit) of some or all of the bytes inside an image is changed to a bit of the secret message. When using a 24-bit image, a bit of each of the red, green and blue colour components can be used, since they are each represented by a byte. In other words, one can store 3 bits in each pixel. An 800 × 600 pixel image, can thus store a total amount of 1,440,000 bits or 180,000 bytes of embedded data. For example a grid for 3 pixels of a 24-bit image can be as follows: (00101101 00011100 11011100) (10100110 11000100 00001100) (11010010 10101101 01100011) When the number 200, which binary representation is 11001000, is embedded into the least significant bits of this part of the image, the resulting grid is as follows: (00101101 00011101 11011100) (10100110 11000101 00001100) (11010010 10101100 01100011) Although the number was embedded into the first 8 bytes of the grid, only the 3 underlined bits needed to be changed according to the embedded message. On average, only half of the bits in an image will need to be modified to hide a secret message using the maximum cover size. Since there are 256 possible ISSN: 0975 – 6779| NOV 10 TO OCT 11 | VOLUME – 01, ISSUE - 02 Page 139 JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN ELECTRONICS AND COMMUNICATION ENGINEERING intensities of each primary colour, changing the LSB of a pixel results in small changes in the intensity of the colours. These changes cannot be perceived by the human eye - thus the message is successfully hidden. With a well-chosen image, one can even hide the message in the least as well as second to least significant bit and still not see the difference. In the above example, consecutive bytes of the image data – from the first byte to the end of the message – are used to embed the information. This approach is very easy to detect. A slightly more secure system is for the sender and receiver to share a secret key that specifies only certain pixels to be changed. Should an adversary suspect that LSB steganography has been used, he has no way of knowing which pixels to target without the secret key. In its simplest form, LSB makes use of BMP images, since they use lossless compression. Unfortunately to be able to hide a secret message inside a BMP file, one would require a very large cover image. Nowadays, BMP images of 800 × 600 pixels are not often used on the Internet and might arouse suspicion. For this reason, LSB steganography has also been developed for use with other image file formats. 4.1LSB and Palette Based Images Palette based images, for example GIF images, are another popular image file format commonly used on the Internet. By definition a GIF image cannot have a bit depth greater than 8, thus the maximum number of colours that a GIF can store is 256. GIF images are indexed images where the colours used in the image are stored in a palette, sometimes referred to as a colour lookup table. Each pixel is represented as a single byte and the pixel data is an index to the colour palette. The colours of the palette are typically ordered from the most used colour to the least used colours to reduce lookup time. GIF images can also be used for LSB steganography, although extra care should be taken. The problem with the palette approach used with GIF images is that should one change the least significant bit of a pixel, it can result in a completely different colour since the index to the colour palette is changed. If adjacent palette entries are similar, there might be little or no noticeable change, but should the adjacent palette entries be very dissimilar, the change would be evident. One possible solution is to sort the palette so that the colour differences between consecutive colours are minimized. Another solution is to add new colours which are visually similar to the existing colours in the palette. This requires the original image to have less unique colours than the maximum number of colours (this value depends on the bit depth used). Using this approach, one should thus carefully choose the right cover image. Unfortunately any tampering with the palette of an indexed image leaves a very clear signature, making it easier to detect. A final solution to the problem is to use grey scale images. In an 8-bit grey scale GIF image, there are 256 different shades of grey. The changes between the colours are very gradual, making it harder to detect. 5. TRANSFORM DOMAIN To understand the steganography algorithms that can be used when embedding data in the transform domain, one must first explain the type of file format connected with this domain. The JPEG file format is the most popular image file format on the Internet, because of the small size of the images. 5.1 JPEG compression To compress an image into JPEG format, the RGB colour representation is first converted to a YUV representation. In this representation the Y component corresponds to the luminance (or brightness) and the U and V components stand for chrominance (or colour). According to research the human eye is more sensitive to changes in the brightness (luminance) of a pixel than to changes in its color. This fact is exploited by the JPEG compression by down sampling the color data to reduce the size of the file. The colour components (U and V are halved in horizontal and vertical directions, thus decreasing the file size by a factor of 2. The next step is the actual transformation of the image. For JPEG, the Discrete Cosine Transform (DCT) is used, but similar transforms are for example the Discrete Fourier Transform (DFT). These mathematical transforms convert the pixels in such a way as to give the effect of “spreading” the location of the pixel values over part of the image . The DCT transforms a signal from an image representation into a frequency representation, by grouping the pixels into 8 × 8 pixel blocks and transforming the pixel blocks into 64 DCT coefficients each. A modification of a single DCT coefficient will affect all 64 image pixels in that block. The next step is the quantization phase of the compression. Here another biological property of the human eye is exploited: The human eye is fairly good at spotting small differences in brightness over a relatively large area, but not so good as to distinguish between different strengths in high frequency brightness. This means that the strength of higher frequencies can be diminished, without changing the appearance of the image. JPEG does this by dividing all the values in a block by a quantization coefficient. The results are rounded to integer values and the coefficients are encoded using Huffman coding to further reduce the size. 5.2 JPEG steganography Originally it was thought that steganography would not be possible to use with JPEG images, since they use lossy compression which results in parts of the image data being altered. One of the major characteristics of steganography is the fact that information is hidden in the redundant bits of an object and since redundant bits are left out when ISSN: 0975 – 6779| NOV 10 TO OCT 11 | VOLUME – 01, ISSUE - 02 Page 140 JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN ELECTRONICS AND COMMUNICATION ENGINEERING using JPEG it was feared that the hidden message would be destroyed. Even if one could somehow keep the message intact it would be difficult to embed the message without the changes being noticeable because of the harsh compression applied. However, properties of the compression algorithm have been exploited in order to develop a steganographic algorithm for JPEGs. One of these properties of JPEG is exploited to make the changes to the image invisible to the human eye. During the DCT transformation phase of the compression algorithm, rounding errors occur in the coefficient data that are not noticeable. Although this property is what classifies the algorithm as being lossy, this property can also be used to hide messages. It is neither feasible nor possible to embed information in an image that uses lossy compression, since the compression would destroy all information in the process. Thus it is important to recognize that the JPEG compression algorithm is actually divided into lossy and lossless stages. The DCT and the quantization phase form part of the lossy stage, while the Huffman encoding used to further compress the data is lossless. Steganography can take place between these two stages. Using the same principles of LSB insertion the message can be embedded into the least significant bits of the coefficients before applying the Huffman encoding. By embedding the information at this stage, in the transform domain, it is extremely difficult to detect, since it is not in the visual domain. 6. IMAGE OR TRANSFORM DOMAIN 6.1Patchwork Patchwork is a statistical technique that uses redundant pattern encoding to embed a message in an image. The algorithm adds redundancy to the hidden information and then scatters it throughout the image. A pseudorandom generator is used to select two areas of the image (or patches), patch A and patch B. All the pixels in patch A is lightened while the pixels in patch B are darkened. In other words the intensities of the pixels in the one patch are increased by a constant value, while the pixels of the other patch are decreased with the same constant value. The contrast changes in this patch subset encodes one bit and the changes are typically small and imperceptible, while not changing the average luminosity. A disadvantage of the patchwork approach is that only one bit is embedded. One can embed more bits by first dividing the image into sub-images and applying the embedding to each of them. The advantage of using this technique is that the secret message is distributed over the entire image, so should one patch be destroyed, the others may still survive. This however, depends on the message size, since the message can only be repeated throughout the image if it is small enough. If the message is too big, it can only be embedded once. The patchwork approach is used independent of the host image and proves to be quite robust as the hidden message can survive conversion between lossy and lossless compression. 6.2 Spread Spectrum In spread spectrum techniques, hidden data is spread throughout the cover-image making it harder to detect. A system proposed by Marvel et al. combines spread spectrum communication, error control coding and image processing to hide information in images. Spread spectrum communication can be defined as the process of spreading the bandwidth of a narrowband signal across a wide band of frequencies. This can be accomplished by adjusting the narrowband waveform with a wideband waveform, such as white noise. After spreading, the energy of the narrowband signal in any one frequency band is low and therefore difficult to detect. In spread spectrum image steganography the message is embedded in noise and then combined with the cover image to produce the stego image. Since the power of the embedded signal is much lower than the power of the cover image, the embedded image is not perceptible to the human eye or by computer analysis without access to the original image. 7. LIMITATION AND FUTURE SCOPE Digital Image Steganography system allows an average user to securely transfer text messages by hiding them in a digital image file. A combination of Steganography and encryption algorithms provides a strong backbone for its security. Digital Image Steganography system features innovative techniques for hiding text in a digital image file or even using it as a key to the encryption. Digital Image Steganography system allows a user to securely transfer a text message by hiding it in a digital image file. 128 bit AES encryption is used to protect the content of the text message even if its presence were to be detected. Currently, no methods are known for breaking this kind of encryption within a reasonable period of time (i.e., a couple of years). Additionally, compression is used to maximize the space available in an image. 8.REFERENCES [1] N . Provos, “Defending Against Statistical Steganography,” Proc 10th USENEX Security Symposium 2005. [2] N . Provos and P. Honeyman, “Hide and Seek: An introduction to Steganography,” IEEE Security & Privacy Journal 2003. [3] S . Katzenbeisser and Petitcolas , ”Information Hiding Techniques for Stenography and Digital watermaking” Artech House, Norwood, MA. 2000 . [4] L. Reyzen And S. Russell , “More efficient provably secure Steganography” 2007. ISSN: 0975 – 6779| NOV 10 TO OCT 11 | VOLUME – 01, ISSUE - 02 Page 141 JOURNAL OF INFORMATION, KNOWLEDGE AND RESEARCH IN ELECTRONICS AND COMMUNICATION ENGINEERING [5] S.Lyu and H. Farid , “Steganography using higher order image statistics , “ IEEE Trans. Inf. Forens. Secur. 2006. [6] Venkatraman , s, Abraham , A . & Paprzycki M.” Significance of Steganography on Data Security “ , Proceedings of the International Conference on Information Technology : Coding and computing , 2004. [7] Fridrich , J ., Goljan M., and Hogea , D ; New Methodology for Breaking stenographic Techniques for JPEGs. “ Electronic Imaging 2003”. [8] http:/ aakash.ece.ucsb.edu./ data hiding / stegdemo.aspx.Ucsb data hiding online demonstration . Released on Mar .09,2005. [9] Mitsugu Iwanmoto and Hirosuke Yamamoto, “The Optimal n-out-of-n Visual Secret Sharing Scheme for GrayScale Images”, IEICE Trans. Fundamentals, vol.E85-A, No.10, October 2002, pp. 2238-2247. [10] Doron Shaked, Nur Arad, Andrew Fitzhugh, Irwin Sobel, “Color Diffusion: Error Diffusion for Color Halftones”, HP Laboratories Israel, May 1999. [11] Z.Zhou, G.R.Arce, and G.Di Crescenzo, “Halftone Visual Cryptography”, IEEE Tans. On Image Processing,vol.15, No.8, August 2006, pp. 2441-2453. [12] M.Naor and A.Shamir, “Visual Cryptography”, in Proceedings of Eurocrypt 1994, lecture notes in computer science, 1994, vol.950, pp. 1-12. [13] Robert Ulichney, “The void-and-cluster method for dither array generation”, IS&T/SPIE Symposium on Electronic Imaging and Science, San Jose, CA, 1993, vol.1913, pp.332-343. ISSN: 0975 – 6779| NOV 10 TO OCT 11 | VOLUME – 01, ISSUE - 02 Page 142