New Feature Presentation of Transition Probability Matrix for Image Tampering Detection

Luyi Chen (1), Shilin Wang (2), Shenghong Li (1), Jianhua Li (1)
(1) Department of Electrical Engineering, Shanghai Jiaotong University
(2) School of Information Security, Shanghai Jiaotong University

Outline
- Markov Transition Probability
  - Second order statistics and feature extraction
  - Dimension and correlation between variables
- New Form of the Feature
  - Two elements and three elements
- Experiment Result
- Conclusion

Context
- Inspired by the use of the Markov transition probability matrix for image tampering detection, treated as a two-class classification problem (proposed by Shi et al., 2007)
- Current feature extraction method
  - Every element of the 2D matrix is used as a feature (huge dimension)
  - Boosting selection or PCA is used for dimension reduction, but the resulting low-dimensional features have no corresponding physical meaning
- Goal: reduce the dimension by decomposing adjacent elements into statistically uncorrelated variables

Second Order Statistical Modeling of Image
- The image is transformed with the 8x8 block DCT (BDCT)
- A horizontal difference array is computed from the BDCT coefficients
- The difference array is modeled with a horizontal transition probability
- The same modeling can be applied in four directions
[Figure: BDCT coefficients X_ij, the difference array with adjacent elements Y_ij and Y_i,j+1, and the transition probability between them]

Feature Extraction of Transition Probability Matrix
- Thresholding is applied to the difference array (with threshold T)
- The transition probability matrix is used as the feature
- The dimension of the feature is (2T+1)^2
- If transitions in all four directions are considered, the dimension is multiplied by 4
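The feature extraction steps listed above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the input is assumed to be a 2D list of integer BDCT coefficients that has already been computed, thresholding is assumed to saturate values at -T and T, and each row of the matrix is assumed to be normalized as a conditional probability given the current difference value.

```python
def horizontal_difference(x):
    """Difference array y[i][j] = x[i][j] - x[i][j+1] for a 2D list of integers."""
    return [[row[j] - row[j + 1] for j in range(len(row) - 1)] for row in x]


def clip(y, T):
    """Threshold every element of y to the range [-T, T] (assumed convention)."""
    return [[max(-T, min(T, v)) for v in row] for row in y]


def transition_matrix(y, T):
    """Estimate P[m][n] = Pr(next = n - T | current = m - T) along each row of y.

    y must already be clipped to [-T, T]. Rows of P for values that never
    occur as 'current' are left as zeros.
    """
    size = 2 * T + 1
    counts = [[0] * size for _ in range(size)]
    for row in y:
        for current, nxt in zip(row, row[1:]):
            counts[current + T][nxt + T] += 1
    matrix = []
    for count_row in counts:
        total = sum(count_row)
        matrix.append([c / total if total else 0.0 for c in count_row])
    return matrix


def tpm_feature(x, T):
    """Flatten the (2T+1) x (2T+1) horizontal transition matrix into a feature list."""
    y = clip(horizontal_difference(x), T)
    matrix = transition_matrix(y, T)
    return [p for matrix_row in matrix for p in matrix_row]


if __name__ == "__main__":
    coefficients = [[5, 3, 4, 4, 1], [0, 1, 1, 3, 2]]  # toy data, not real BDCT output
    feature = tpm_feature(coefficients, T=2)
    print(len(feature))  # (2T+1)^2 entries for one direction
```

The other three directions would repeat the same counting on differently oriented difference arrays, which is why the total dimension is four times (2T+1)^2.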
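Transition probability matrix (shown for T = 4):

\[
\begin{bmatrix}
P_{-4,-4} & P_{-3,-4} & \cdots & P_{3,-4} & P_{4,-4} \\
P_{-4,-3} & P_{-3,-3} & \cdots & P_{3,-3} & P_{4,-3} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
P_{-4,3} & P_{-3,3} & \cdots & P_{3,3} & P_{4,3} \\
P_{-4,4} & P_{-3,4} & \cdots & P_{3,4} & P_{4,4}
\end{bmatrix}
\]

Example: Transition Probability Matrix
[Figure: example transition probability matrix for difference values from -4 to 4; probabilities range from 0 to 0.8]

Problem of Current Presentation of the Feature
- The dimension of the feature grows with the square of the threshold
[Figure: dimension of feature (0 to 250) versus threshold of difference element (2 to 7)]

Correlation Between Adjacent Elements in Difference Array
- Assume adjacent BDCT coefficients are uncorrelated, i.e.,

\[ E(x_{ij}\, x_{i+k,j}) = 0, \quad k = 1, 2, \ldots, N-1 \]

- Then the correlation between elements of the difference array is

\[
\rho(y_{ij}, y_{i+k,j}) = \frac{E[(y_{ij} - \bar{y})(y_{i+k,j} - \bar{y})]}{\sigma_y^2} =
\begin{cases} -0.5 & (k = 1) \\ 0 & (k > 1) \end{cases}
\]

- With y_{ij} = x_{ij} - x_{i+1,j}, neighbouring differences share the term x_{i+1,j} with opposite signs; assuming coefficients of equal variance, this gives a covariance of -\sigma_x^2 against a variance of 2\sigma_x^2
[Figure: BDCT coefficients X_ij, the difference array with adjacent elements Y_ij and Y_i,j+1, and the transition probability between them]

Correlation Calculated on Dataset
Figure. Correlation between adjacent elements on the difference array of block DCT coefficients: (1) k=1; (2) k=2

PCA Transform of Two-component Random Parameters
- Correlation matrix

\[ R = \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} \]

- Eigenvalues

\[ \lambda_1 = 1 + \rho, \quad \lambda_2 = 1 - \rho \]

- Eigenvectors

\[ \phi_1 = \begin{bmatrix} \tfrac{1}{\sqrt{2}} \\ \tfrac{1}{\sqrt{2}} \end{bmatrix}, \quad \phi_2 = \begin{bmatrix} \tfrac{1}{\sqrt{2}} \\ -\tfrac{1}{\sqrt{2}} \end{bmatrix} \]

- Uncorrelated new random variables

\[ \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} \phi_1 & \phi_2 \end{bmatrix}^T \begin{bmatrix} y_{ij} \\ y_{i+1,j} \end{bmatrix} \]

Decomposition of Second Order Statistics into Marginal Ones
- The marginal histograms are histograms of the outputs of two linear filters (the common scale factor is dropped)

\[ z^{1}_{ij} = y_{ij} + y_{i+1,j} = x_{ij} - x_{i+2,j} \]
\[ z^{2}_{ij} = y_{ij} - y_{i+1,j} = x_{ij} - 2x_{i+1,j} + x_{i+2,j} \]

[Figure: transition probability matrix alongside the sum histogram and the difference histogram]

Feature Dimension Linearly Proportional to Threshold
[Figure: feature dimension (0 to 250) versus threshold (2 to 7) for the transition probability matrix and for our new form]

The Approach Can be Generalized to Three Elements
- Correlation matrix

\[ R = \begin{bmatrix} 1 & \rho & 0 \\ \rho & 1 & \rho \\ 0 & \rho & 1 \end{bmatrix} \]

- Eigenvalues

\[ \lambda_1 = 1 + \sqrt{2}\,\rho, \quad \lambda_2 = 1, \quad \lambda_3 = 1 - \sqrt{2}\,\rho \]

- Eigenvectors

\[ \phi_1 = \begin{bmatrix} \tfrac{1}{2} \\ \tfrac{1}{\sqrt{2}} \\ \tfrac{1}{2} \end{bmatrix}, \quad \phi_2 = \begin{bmatrix} \tfrac{1}{\sqrt{2}} \\ 0 \\ -\tfrac{1}{\sqrt{2}} \end{bmatrix}, \quad \phi_3 = \begin{bmatrix} \tfrac{1}{2} \\ -\tfrac{1}{\sqrt{2}} \\ \tfrac{1}{2} \end{bmatrix} \]

- Decomposed variables

\[ \begin{bmatrix} z_1 \\ z_2 \\ z_3 \end{bmatrix} = \begin{bmatrix} \phi_1 & \phi_2 & \phi_3 \end{bmatrix}^T \begin{bmatrix} y_{ij} \\ y_{i+1,j} \\ y_{i+2,j} \end{bmatrix} \]

Dataset and Classifier
- Columbia Splicing Detection Evaluation Dataset
  - 921 authentic, 910 spliced
  - 2/3 training, 1/3 test
- LibSVM, Gaussian RBF kernel

Single Feature Performance

Type of joint statistics                      Feature dimension   Accuracy
2 elements
  1st order Markov transition probability     81                  87.09 (1.39)
  Our new form (Sec. 3.1)                     46                  87.97 (1.45)
3 elements
  2nd order Markov transition probability     729                 85.84 (0.92)
  Our new form (Sec. 3.2)                     77                  85.54 (1.34)

Combined Features Performance

Feature                                       T   Dimension   Accuracy
Moment + Transition Probability Matrix        3   266         89.86 (1.02)
Moment + New Form                             3   220         89.62 (0.91)
Moment + New Form                             4   236         89.78 (1.03)
Moment + New Form                             5   252         89.78 (1.09)

Computation Complexity Comparison

Feature type                                  Computing time (seconds)
Transition Probability Matrix                 0.0516 (0.0054)
Marginal distribution of two new variables    0.0502 (0.0005)

Measured on a Core 2 Duo 1.6 GHz with 3 GB RAM

Conclusion
- Our new form has a lower feature dimension, slightly faster computation, and almost the same classification accuracy
- The dimension reduction is larger at higher order, but further research is needed to improve discrimination performance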
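The two-element decomposition summarized in the conclusion (replacing the joint transition matrix by the marginal histograms of the sum and difference of adjacent difference-array elements) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name is hypothetical, the input is assumed to be a difference array already thresholded to [-T, T], adjacent elements are taken along each row, and each histogram is assumed to use one bin per integer value from -2T to 2T. Because the binning is an assumption, the resulting length need not match the dimensions reported in the tables.

```python
def sum_difference_feature(y, T):
    """Concatenate normalized histograms of z1 = a + b and z2 = a - b.

    y is a 2D list of integers already clipped to [-T, T]; (a, b) runs over
    all pairs of horizontally adjacent elements. Both z1 and z2 lie in
    [-2T, 2T], so each histogram has 4T + 1 bins (assumed binning).
    """
    bins = 4 * T + 1
    sum_hist = [0] * bins
    diff_hist = [0] * bins
    pairs = 0
    for row in y:
        for a, b in zip(row, row[1:]):
            sum_hist[a + b + 2 * T] += 1
            diff_hist[a - b + 2 * T] += 1
            pairs += 1
    if pairs == 0:
        return [0.0] * (2 * bins)
    return [c / pairs for c in sum_hist] + [c / pairs for c in diff_hist]


if __name__ == "__main__":
    clipped = [[2, -1, 0, 2], [0, 0, 1, -2]]  # toy thresholded difference array, T = 2
    feature = sum_difference_feature(clipped, T=2)
    # Length grows linearly with T here, versus (2T+1)^2 for the joint matrix.
    print(len(feature), (2 * 2 + 1) ** 2)
```

In this sketch the feature length per direction is 2(4T+1), which grows linearly with the threshold, whereas the transition probability matrix grows as (2T+1)^2.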