Steganalysis of Block-DCT Image Steganography
Ying Wang and Pierre Moulin
Beckman Institute, CSL & ECE Department
University of Illinois at Urbana-Champaign
September 29th, 2003

Introduction
• Steganography is a branch of information hiding, aiming to achieve perfectly secret communication.

Steganographer vs. Steganalyzer
Steganographer
• Embedding distortion $D_e$
• Various embedding methods can be used.
Steganalyzer
• Trace of embedding?
  – Is $X^N$ typical of $P_S^N$?
• Detection methods
  – Ad hoc
  – Detection-theoretic

Block-DCT Embedding
Spatial domain
• Host image: $u(m,n)$
• 2-D stationary process with zero mean and correlation function
  $r_u(s,t) = E[u(m,n)\,u(m+s,n+t)]$
DCT domain
• $8 \times 8$ DCT coefficients: $\tilde{u}(k,l)$
• 64 equal-size channels containing approximately independent data, with variances $\sigma_{\tilde{u}}^2(k,l)$

[Figure: the spatial-domain image $u(m,n)$ partitioned into $8 \times 8$ blocks, and the corresponding $8 \times 8$ blocks of DCT coefficients $\tilde{u}(k,l)$]

Modified Spread Spectrum Data Hiding Model
DCT domain
• Marked DCT coefficients:
  $\tilde{u}'(k,l) = \tilde{v}(k,l) + \tilde{z}(k,l)$, where $\tilde{v}(k,l) = \tilde{a}_{k,l}\,\tilde{u}(k,l)$ and $\tilde{z}(k,l) \sim N(0, \sigma_{\tilde{z}}^2(k,l))$
• Constraint $D_e$ and 1-D undetectability constraint:
  $p_{\tilde{u}'(k,l)} = p_{\tilde{u}(k,l)}, \quad \forall k, l.$
Spatial domain
• Stego-image: $u'(m,n) = v(m,n) + z(m,n)$
  $\tilde{a}_{k,l}\,\tilde{u}(k,l) \;\overset{\text{IDCT}}{\underset{\text{DCT}}{\rightleftarrows}}\; v(m,n)$
  $\tilde{z}(k,l) \;\overset{\text{IDCT}}{\underset{\text{DCT}}{\rightleftarrows}}\; z(m,n)$

Statistics of the Pixel Differences
• Block processing introduces discontinuities at the block boundaries.
• Develop a steganalysis method based on pixel differences!

Host image
• $d(m,n) = u(m,n) - u(m,n+1)$ is a stationary process with zero mean and correlation function
  $r_d(k,l) = E[d_{m,n}\,d_{m+k,n+l}] = 2 r_u(k,l) - r_u(k,l+1) - r_u(k,l-1)$
• The pdfs for all pairs are the same.
Stego-image
• $d'(m,n) = u'(m,n) - u'(m,n+1)$ is non-stationary.
• The pdfs for inner pairs $\{d_0\}$ and border pairs $\{d_1\}$ are different.

Binary Hypothesis Testing Problem
• Two populations: $\{d_0\}$ and $\{d_1\}$
• Difficulty: the pdfs are unknown!
• We use non-parametric two-sample goodness-of-fit tests such as the Kolmogorov-Smirnov (K-S) test.
• K-S test: $H_0: F_0 = F_1$ versus $H_1: F_0 \neq F_1$, where $F_0$ and $F_1$ are the cumulative distribution functions of $\{d_0\}$ and $\{d_1\}$.
• Test statistic:
  $D_{M,N} = \sup_x |S_0(x) - S_1(x)|$
  $S_0(x) = \#\{d_0(m,n) \le x\} / M, \qquad S_1(x) = \#\{d_1(m,n) \le x\} / N$

• The decision rule with $P_{FA} = \alpha$ is
  $\delta(D_{M,N}) = \begin{cases} 1, & D_{M,N} > D_{M,N,\alpha} \\ 0, & D_{M,N} < D_{M,N,\alpha} \\ 0/1, & D_{M,N} = D_{M,N,\alpha} \end{cases}$

Discussion
• With the same embedding strength, stego-images of smooth host images such as Lena and Jet are more likely to be detected than those of images with noise-like textures such as Baboon.
  – The best candidates for steganography are complex images such as Baboon.
  – Block-DCT steganography is not suitable for smooth images.

• The key idea of our paper is to find an intrinsic property of natural images that is modified by the information hiding process.
  – Another example: detecting wavelet-based information hiding. Upsampling turns a stationary process in one subband into a non-stationary process in the spatial domain.

• The K-S test is universal in the sense that the pdfs can be unknown.
• Compared with the likelihood ratio test, the K-S test achieves this universality at the cost of performance degradation.

References
• N. F. Johnson and S. Katzenbeisser, "A survey of steganographic techniques", in S. Katzenbeisser and F. Petitcolas (Eds.), Information Hiding, pp. 43-78, Artech House, Norwood, MA, 2000.
• J. D. Gibbons and S. Chakraborti, Nonparametric Statistical Inference, Marcel Dekker, New York, 1992.
• L. Breiman, Probability, SIAM, Philadelphia, 1992.
• O. Dabeer, K. Sullivan, U. Madhow, S. Chandrasekharan, and B. S. Manjunath, "Detection of hiding in the least significant bit", Proc. CISS, The Johns Hopkins University, Mar. 2003.

[Figure: test images Lena and Baboon]
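The detection procedure from the Binary Hypothesis Testing slides can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes horizontal differences $d(m,n) = u(m,n) - u(m,n+1)$, block boundaries aligned at multiples of 8 columns, and the usual large-sample approximation for the threshold $D_{M,N,\alpha}$ (the slides do not say how the threshold is obtained). The helpers `toy_host` and `toy_blockwise_distortion` are hypothetical: the first draws rows from a stationary AR(1) process, and the second adds one random offset per 8×8 block as a crude stand-in for block-wise processing. It is not the modified spread spectrum embedding described above.

```python
import bisect
import math
import random

BLOCK = 8  # assumed block size, matching the 8x8 block DCT


def split_differences(image, block=BLOCK):
    """Split d(m,n) = u(m,n) - u(m,n+1) into inner pairs and border pairs."""
    inner, border = [], []
    for row in image:
        for n in range(len(row) - 1):
            d = row[n] - row[n + 1]
            # Columns n and n+1 lie in different blocks when n is the
            # last column of a block.
            (border if n % block == block - 1 else inner).append(d)
    return inner, border


def ks_statistic(sample0, sample1):
    """Two-sample K-S statistic: sup_x |S0(x) - S1(x)| over empirical CDFs."""
    s0, s1 = sorted(sample0), sorted(sample1)
    m, n = len(s0), len(s1)
    # Both empirical CDFs are step functions, so the supremum is attained
    # at one of the sample points.
    return max(
        abs(bisect.bisect_right(s0, x) / m - bisect.bisect_right(s1, x) / n)
        for x in s0 + s1
    )


def ks_threshold(m, n, alpha):
    """Large-sample approximation of the threshold (an assumption here)."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((m + n) / (m * n))


def detect(image, alpha=0.01, block=BLOCK):
    """Return (D, flagged): the K-S statistic and the threshold decision."""
    inner, border = split_differences(image, block)
    d = ks_statistic(inner, border)
    return d, d > ks_threshold(len(inner), len(border), alpha)


def toy_host(rows, cols, rho=0.95, rng=None):
    """Hypothetical host: each row is a stationary zero-mean AR(1) process."""
    rng = rng or random.Random(0)
    sigma = 1.0 / math.sqrt(1.0 - rho * rho)  # stationary standard deviation
    image = []
    for _ in range(rows):
        u = rng.gauss(0.0, sigma)
        row = [u]
        for _ in range(cols - 1):
            u = rho * u + rng.gauss(0.0, 1.0)
            row.append(u)
        image.append(row)
    return image


def toy_blockwise_distortion(image, strength=2.0, block=BLOCK, rng=None):
    """Hypothetical stand-in for block processing: one offset per block."""
    rng = rng or random.Random(1)
    offsets = {}
    out = []
    for m, row in enumerate(image):
        new_row = []
        for n, u in enumerate(row):
            key = (m // block, n // block)
            if key not in offsets:
                offsets[key] = rng.gauss(0.0, strength)
            new_row.append(u + offsets[key])
        out.append(new_row)
    return out


if __name__ == "__main__":
    host = toy_host(128, 128)
    stego = toy_blockwise_distortion(host)
    for name, img in (("host", host), ("toy stego", stego)):
        d, flagged = detect(img, alpha=0.001)
        print(f"{name}: D = {d:.3f}, flagged = {flagged}")
```

The threshold approximation assumes independent samples, which pixel differences of a real image only approximately satisfy, so the nominal false-alarm level should be read as indicative. For real images, a library routine such as SciPy's two-sample K-S test could replace `ks_statistic` and `ks_threshold`.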