Professor: Dr. YUN QING SHI
Presentation by: KARTHIK RAGHAVENDRA

OUTLINE:
• STEGANOGRAPHY OVERVIEW
• JPEG FILE INTERCHANGE FORMAT
• JSTEG ALGORITHM
• F3 ALGORITHM
• F4 ALGORITHM
• F5 ALGORITHM AND ITS ADVANTAGES OVER OTHER ALGORITHMS
• F5 ALGORITHM DEMO

STEGANOGRAPHY OVERVIEW:
Steganography is the art and science of writing hidden messages in a carrier medium in such a way that no one, apart from the sender and intended recipient, suspects the existence of the message. It is a form of security through obscurity. Many different carrier file formats can be used, but digital images are the most popular because of their frequency on the internet.

Figure: a Steganography Encryption algorithm turns the CARRIER and the SECRET MESSAGE into an IMAGE STEGANOGRAM; a Steganography Decryption algorithm recovers the SECRET MESSAGE from it.

• No visible changes in the image steganogram
• Must resist visual and statistical attacks
• High capacity for the secret message

JPEG FILE INTERCHANGE FORMAT:
Fig 1: Flow of information in the JPEG compressor
• The JPEG compressor cuts the uncompressed image into blocks of 8 x 8 pixels.
• The Discrete Cosine Transformation transfers the 8 x 8 brightness values into 8 x 8 frequency coefficients.
• Quantization suitably rounds the frequency coefficients to integers in the range -2048..2047. (Lossy step)
• The histogram in Fig 2 shows the discrete distribution of the coefficients' frequency of occurrence.
• Quantization is followed by Huffman coding, which ensures redundancy-free coding of the quantized coefficients.

Fig 2: Histogram of JPEG coefficients after quantization
The distribution in Fig 2 shows 2 characteristic properties:
• The coefficients' frequency of occurrence decreases with increasing absolute value.
• The difference between 2 neighbouring bars in the middle of the histogram is larger than at the margin.

JSTEG ALGORITHM:
Fig 3a: Histogram of JPEG coefficients after quantization
Fig 3b: JSTEG equalises pairs of coefficients
After quantization, JSTEG replaces the least significant bits (LSBs) of the frequency coefficients with the bits of the secret message. The embedding mechanism skips all coefficients with the values 0 or 1, as observed in Figure 3b.
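As a rough illustration of the JSTEG embedding described above, the following Python sketch overwrites the LSB of each usable quantized coefficient with one message bit. The names jsteg_embed and jsteg_extract, the flat list of coefficients, and the two's-complement handling of negative values are assumptions made for this sketch. It is not Derek Upham's implementation.

```python
def jsteg_embed(coeffs, bits):
    """Sketch: overwrite the LSB of each usable coefficient with a message bit.

    coeffs: quantized JPEG coefficients as a flat list of ints (assumed layout).
    bits:   message bits as a list of 0/1 values.
    """
    out = list(coeffs)
    pos = 0
    for i, c in enumerate(out):
        if pos == len(bits):
            break
        if c in (0, 1):                    # JSTEG skips coefficients 0 and 1
            continue
        # Clear the LSB, then set it to the message bit. Python ints behave
        # like two's complement here, so -3 and -4 form a pair (assumption).
        out[i] = (c & ~1) | bits[pos]
        pos += 1
    if pos < len(bits):
        raise ValueError("message does not fit into the carrier")
    return out


def jsteg_extract(coeffs, n_bits):
    """Read back the LSBs of all coefficients that are not 0 or 1."""
    return [c & 1 for c in coeffs if c not in (0, 1)][:n_bits]


carrier = [5, 0, -3, 1, 2, 7]
message = [0, 1, 1]
stego = jsteg_embed(carrier, message)
recovered = jsteg_extract(stego, len(message))
```

Because the LSB is overwritten directly, each pair of values (2, 3), (4, 5), (-2, -1), ... is pushed towards equal frequency, which is the effect Fig 3b refers to.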
JSTEG is resistant against visual attacks and offers a good capacity of about 12.8% of the steganogram's size, but the secret message can be easily detected by statistical attacks.

Fig 4: Probability of embedding in a JSTEG steganogram (50% of capacity used)
Fig 4 shows the statistical attack on a JSTEG steganogram (with 50% of the capacity used, i.e. 7680 bytes). The diagram presents the probability of embedding as a function of an increasing sample: initially the sample comprises the first 1% of the JPEG coefficients, then the first 2%, 3%, ... The probability is 1.00 up to 54% and 0.45 at 56%. A sample of 59% or more contains enough unchanged coefficients to let the p-value drop to 0.00.

F3 ALGORITHM:
Fig 5a: Histogram of JPEG coefficients after quantization
Fig 5b: F3 produces a superior number of even coefficients
• F3 does not overwrite bits like JSTEG. Instead, it decrements the coefficient's absolute value when its LSB does not match the message bit. Zero coefficients are the exception: their absolute value cannot be decremented, so they are not used in this method.
• The LSBs of the nonzero coefficients match the secret message after embedding, but they are not overwritten, because overwritten bits can be detected by statistical methods (Chi-Square method).
• Some embedded bits fall victim to shrinkage. Shrinkage occurs every time F3 decrements the absolute value of 1 or -1, producing a 0.
• The receiver cannot distinguish a 0 coefficient that is steganographically unused from a 0 produced by shrinkage, so it skips all zero coefficients. Hence the sender must embed the affected bit again (repeated embedding).

Figure 5b shows the histogram of frequency of occurrence versus JPEG coefficient value after applying the F3 algorithm. The histogram shows more even coefficients than odd coefficients. This is due to repeated embedding after shrinkage. Shrinkage occurs only if we embed a 0 bit. The repetition of these 0 bits shifts the ratio of steganographic values in favour of the steganographic zeros.
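The F3 rule and the shrinkage effect described above can be sketched in Python as follows. The names f3_embed and f3_extract and the flat coefficient list are hypothetical, and the sketch assumes that the steganographic value of a coefficient is the parity of its absolute value.

```python
def f3_embed(coeffs, bits):
    """Sketch of F3: decrement |c| when its parity does not match the bit."""
    out = list(coeffs)
    pos = 0
    for i, c in enumerate(out):
        if pos == len(bits):
            break
        if c == 0:                          # zeros carry no message bit
            continue
        if (abs(c) & 1) != bits[pos]:
            c = c - 1 if c > 0 else c + 1   # decrement the absolute value
            out[i] = c
        if c == 0:
            # Shrinkage: 1 or -1 became 0. The receiver will skip this
            # coefficient, so the same bit is embedded again in the next one.
            continue
        pos += 1
    if pos < len(bits):
        raise ValueError("message does not fit into the carrier")
    return out


def f3_extract(coeffs, n_bits):
    """Read the parity of the absolute value of every nonzero coefficient."""
    return [abs(c) & 1 for c in coeffs if c != 0][:n_bits]


carrier = [3, 1, 0, -2, -1, 4, 5]
message = [1, 0, 0, 1]
stego = f3_embed(carrier, message)
recovered = f3_extract(stego, len(message))
```

In this small example the coefficients 1 and -1 both shrink to 0 while a 0 bit is being embedded, so each of those 0 bits ends up in a later coefficient. This repetition is what skews the histogram towards even values.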
The resulting surplus of even coefficients is undesirable and can be detected by statistical means.

F3 WEAKNESSES:
• Because shrinkage affects steganographic zeros exclusively, F3 embeds more zeros than ones and produces statistically detectable peculiarities in the histogram.
• The histogram of Figure 2 contains more odd than even coefficients (excluding 0). Therefore, an unchanged carrier medium contains more steganographic ones than zeros.

F4 ALGORITHM:
Fig 6a: Histogram of JPEG coefficients after quantization
Fig 6b: Histogram of JPEG coefficients with F4 interpretation of steganographic values
F4 eliminates the 2 weaknesses of the F3 algorithm by mapping negative coefficients to the inverted steganographic value: even negative coefficients represent a steganographic one and odd negative coefficients a zero, while even positive coefficients represent a zero and odd positive coefficients a one, as shown in Figure 6b. In Figure 6b, each 2 bars of the same height represent coefficients with inverse steganographic values (steganographic zeros are black, steganographic ones white).

F4 WEAKNESSES:
• F4 embeds the secret message data continuously, so the changes concentrate at the start of the file and the unused rest resides at the end. This phenomenon is called continuous embedding.
• For a very short secret message comprising 217 bytes (1736 bits), F4 changes 1157 places. This number of changes is large relative to the message length, which is not a good property for an attack-proof steganographic algorithm. A new mechanism is required to decrease the number of bit changes.

F5 ALGORITHM AND ITS ADVANTAGES OVER OTHER ALGORITHMS:
The core embedding operation is the same as in the F4 algorithm. F5 is an enhanced version of F4 with 2 main features that help to prevent statistical attacks and improve embedding efficiency:
• PERMUTATIVE STRADDLING
• MATRIX ENCODING

CONTINUOUS EMBEDDING PROBLEM:
In most cases, an embedded message does not require the full capacity, so part of the file remains unused. Figure 7 illustrates continuous embedding as used by algorithms like F4.
Figure 7 shows that the changes (x) concentrate at the start of the file, while the unused rest resides at the end. To prevent attacks, the embedding function should use the carrier medium as regularly as possible. The embedding density must be the same everywhere.
Fig 7: Continuous Embedding concentrates changes (x)

PERMUTATIVE STRADDLING:
To prevent the continuous embedding problem discussed before, the F5 algorithm uses a technique called permutative straddling, which scatters the secret message over the whole carrier medium, as shown in Figure 8 (treat each pixel as a JPEG coefficient).
• The straddling mechanism used in F5 first shuffles all coefficients using a permutation.
• F5 then embeds into the permuted sequence.
• Shrinkage does not change the number of coefficients (only their values).
• The permutation depends on a key derived from a password.
• F5 delivers the steganographically changed coefficients in their original sequence to the Huffman coder.
• With the correct key, the receiver is able to repeat the permutation.
Fig 8: Permutative Straddling scatters the changes (x)

MATRIX ENCODING BY RON CRANDALL:
A new and efficient technique that improves embedding efficiency by reducing the number of changes needed to embed the secret message.

EXAMPLE: Message with 1736 bits
• F4: 1157 bits changed
• F5: 459 bits changed

MATRIX ENCODING IMPLEMENTATION:
Suppose we want to embed 2 bits x1, x2 in three modifiable bit places a1, a2, a3, changing one place at most. We have the following 4 cases:
• x1 = a1 XOR a3 and x2 = a2 XOR a3: change nothing
• x1 ≠ a1 XOR a3 and x2 = a2 XOR a3: change a1
• x1 = a1 XOR a3 and x2 ≠ a2 XOR a3: change a2
• x1 ≠ a1 XOR a3 and x2 ≠ a2 XOR a3: change a3
In all 4 cases, we do not change more than one bit.
General case: given a code word 'a' with 'n' modifiable bit places and 'k' secret message bits 'x', matrix encoding embeds the 'k' bits by changing at most one of n = 2^k − 1 places.

PRESERVING CHARACTERISTIC PROPERTIES:

F5 IMPLEMENTATION:
Fig: Block Diagram of F5 Implementation
• Password-driven permutation.
• Pseudo one-time pad for a uniformly distributed message.
• Matrix encoding with minimal embedding rate.
• Core embedding operation like F4.
F5 IMPLEMENTATION STEPS:
1. Start JPEG compression. Stop after the quantisation of coefficients.
2. Initialise a cryptographically strong random number generator with the key derived from the password.
3. Instantiate a permutation (two parameters: random generator and number of coefficients).
4. Determine the parameter k from the capacity of the carrier medium and the length of the secret message.
5. Calculate the code word length n = 2^k − 1.
6. Embed the secret message with (1, n, k) matrix encoding (hash-based embedding).
7. Apply the inverse permutation.
8. Continue JPEG compression (Huffman coding etc.).

ADVANTAGES OF F5:
• F5 has a high embedding capacity (>13%), which can be pushed even further.
• F5 has a high embedding efficiency.
• F5 resists both visual and statistical attacks.
• F5 uses a common image format as carrier medium (JPEG).

REFERENCES:
• Andreas Westfeld: F5 - A Steganographic Algorithm: High Capacity Despite Better Steganalysis.
• Ron Crandall: Some Notes on Steganography. Posted on Steganography Mailing List, 1998. http://os.inf.tu-dresden.de/~westfeld/crandall.pdf
• Derek Upham: Jsteg, 1997, e.g. http://www.tiac.net/users/korejwa/jsteg.htm
• Andreas Westfeld: The Steganographic Algorithm F5, 1999. http://wwwrn.inf.tu-dresden.de/~westfeld/f5.html
• www.Wikipedia.org