Image Clustering Based on Camera Fingerprints
Chang-Tsun Li
Department of Computer Science, University of Warwick, UK
WCPM

Digital Image Acquisition Process
Scene → Lens → CFA → Sensor → CFA Interpolation → Post-Processing → Photo
CFA: Colour Filter Array
Bayer CFA:
  R G
  G B
Mapping of CFA to sensor pixels:
  R G R G R G
  G B G B G B
  R G R G R G
  G B G B G B

What Is a Camera Fingerprint?
• Lens aberration
• Sensor pattern noise
• Colour filter array (CFA) interpolation artefacts
• Camera response function
• Quantisation table of JPEG compression
[Figure: acquisition pipeline, as above]

Camera Fingerprint for Multimedia Forensics
Multimedia forensics: the use of “fingerprints” left in images by the imaging devices for
• source device identification
• source device linking
• content integrity verification
• image classification

What Is Sensor Pattern Noise?
Sensor pattern noise (SPN) is the noise left in images by the sensors of digital imaging devices such as cameras, camcorders and scanners.
SPN is mainly caused by
– manufacturing imperfections and
– the different sensitivity of pixels to light due to the inhomogeneity of silicon wafers.
Even sensors made from the same silicon wafer each possess a unique SPN, because the imperfections are non-uniform.
SPN can therefore differentiate cameras of the same model.

“Traditional” SPN Extraction Method
Lukáš et al.'s model for SPN extraction (IEEE TIFS 2006):
$n(i, j) = I(i, j) - I'(i, j)$, with $I' = \mathrm{Wiener\_filter}(I)$
– I is the original image
– I' is the low-pass filtered version of I, obtained with the Wiener filter applied in the wavelet transform domain
– n is the extracted SPN
SPN is the high-frequency component of the image.

Interference from Scene Details
Scene details, e.g., brick walls, tree leaves, or other kinds of texture, also contribute to the high-frequency components of images.
[Figure panels: natural image, SPN, a clean SPN, a contaminated SPN]

SPN Enhancement at Warwick
• C.-T. Li, "Source Camera Identification Using Enhanced Sensor Pattern Noise," IEEE Trans. on Information Forensics and Security, June 2010
• C.-T. Li and Y. Li, "Color-Decoupled Photo Response Non-Uniformity for Digital Image Forensics," IEEE Trans. on Circuits and Systems for Video Technology, 2012
• X. Lin and C.-T. Li, "Preprocessing Reference Sensor Pattern Noise via Spectrum Equalization," IEEE Trans. on Information Forensics and Security, 2016
• X. Lin and C.-T. Li, "Enhancing Sensor Pattern Noise via Filtering Distortion Removal," IEEE Signal Processing Letters, accepted for publication in 2016

Image Classification/Clustering
Scenario: a forensic investigator
• has a large set of images taken by an unknown number of unknown digital cameras and
• wishes to cluster those images into a number of classes, each containing the images acquired by the same camera.
[Figure: each data point represents one image; each cluster represents one unknown device]

Challenges Facing Image Classification
The forensic investigator does not have the cameras that took the images, so no reference SPNs can be generated for comparison.
No prior knowledge about the number and types of the imaging devices is available.
With a large dataset, exhaustive fingerprint comparison is computationally prohibitive.
Given the sheer number of images, analysing each image at its full size is computationally infeasible.

Image Classification – an MRF Approach
Step 1. Extract and enhance the fingerprint of each block cropped from the images.
Step 2. Establish a similarity matrix ρ for a Focus Set of M images.
Step 3. Train the classifier based on the similarity matrix ρ. Each fingerprint i is treated as a random variable, and for each one:
  3.1. Assign a unique random class label.
  3.2. Calculate a reference similarity (i.e., a “soft” threshold).
  3.3. Establish a membership committee (neighbourhood).
  3.4.
       Update the class label iteratively, based on the information from the membership committee, until no SPN changes its class label throughout an entire iteration.
Step 4. Classify the rest of the dataset using the classifier.

Establishing Similarity Matrix
To establish an M × M similarity matrix ρ, the similarity between any two enhanced SPNs i and j in the Focus Set is calculated using
$$\rho(i, j) = \frac{(n_i - \bar{n}_i) \cdot (n_j - \bar{n}_j)}{\|n_i - \bar{n}_i\| \, \|n_j - \bar{n}_j\|}, \quad i, j \in \{1, 2, 3, \ldots, M\}$$
[Figure: the M × M matrix ρ, with rows and columns indexed 1, 2, 3, 4, …, M and 1.00 on the diagonal]

Classifier Training
• Each fingerprint (SPN) is treated as a random variable.
3.1. Assign a unique random class label to each SPN.
3.2. Calculate a reference similarity r:
$$r = \frac{\mu_1 + \mu_2}{2}$$
Normally, intra-class similarities > inter-class similarities.
[Figure: inter-class and intra-class similarity distributions, with μ1, r and μ2 marked on the similarity axis]
• A similarity less than r indicates that the two images were taken by different devices; otherwise they were taken by the same device.

Classifier Training
3.3. Establish a membership committee.
For each SPN i, a membership committee Ci is established, consisting of the c SPNs in the focus set that are most similar to i.

Classifier Training
3.4. Update the class label iteratively according to p(fi | ρ(i, Ci), Li) until there is no change of class label to any SPN in x consecutive iterations.
• fi: class label of SPN i
• $L_i = \{f_i\} \cup \{f_j \mid j \in C_i\}$
• ρ(i, Ci): the similarities between SPN i and the members of Ci, i.e., $\rho(i, C_i) = \{\rho(i, j) \mid j \in C_i\}$
• p(fi | ρ(i, Ci), Li): probability of assigning fi given the conditions
• ri: reference similarity (“soft” threshold) of SPN i
$$p(f_i \mid \rho(i, C_i), L_i) = \frac{1}{Z_i} \exp[-U_i(f_i, \rho(i, C_i), L_i)]$$
$$Z_i = \sum_{f_i \in L_i} \exp[-U_i(f_i, \rho(i, C_i), L_i)]$$

Objective Function / Cost Function
$$U_i(f_i, \rho(i, C_i), L_i) = \sum_{j \in C_i} s(f_i, f_j) \, (\rho(i, j) - r_i)$$
$$s(f_i, f_j) = \begin{cases} 1, & \text{if } f_i \neq f_j \\ -1, & \text{if } f_i = f_j \end{cases}$$
[Figure: inter-class and intra-class similarity distributions, with μ1, r and μ2 marked on the similarity axis]
The combination of s(·) and ρ(·) means that
• a penalty (i.e., a positive value) is incurred if ρ(i,j) > r and a label different from fj is to be assigned to i, or if ρ(i,j) < r and the same label as fj is to be assigned to i;
• a reward (i.e., a negative value) is given if ρ(i,j) < r and a label different from fj is to be assigned to i, or if ρ(i,j) > r and the same label as fj is to be assigned to i.

Image Classification
The centroids of the image clusters produced by the classifier training process at the end of Step 3.4 are used to classify the images.
To classify an image x, we compare its SPN with the centroid of each identified cluster and assign x to the class whose centroid is most similar to it.

Classifier Training - Simulation
[Animation: initial label configuration, in which each pattern is assigned a unique label / colour, followed by clustering in progress]

Final Classification
[Figure: final classification]

Experimental Results
Misclassification rates: 1200 images taken by six cameras, each taking 200.
Class identification stops when there are no class label changes throughout an iteration.

Table 1. Classification error rate. c is the size of the membership committee; M is the size of the focus set.

Block size   256 × 256        256 × 512        512 × 512
M            120     300      120     300      120     300
c = M-1      8.889   4.000    3.778   1.333    1.444   1.444
c = M/2      8.333   4.000    2.333   1.333    1.444   1.444
c = M/3      8.333   4.000    2.333   1.333    1.444   1.444
c = M/4      8.333   4.000    2.333   1.333    1.444   1.556
c = M/5      8.333   4.000    3.778   1.222    1.444   1.444

A misclassification rate in the range 1.2 to 1.6 is likely to be the best the system can achieve.

Conclusions
Multimedia forensics based on the “fingerprints” that imaging devices leave in images has emerged as a new area of research in the last few years.
Sensor pattern noise (SPN) is one of the most promising types of fingerprint.
The “traditional” SPN extraction method is unable to cope with the interference of scene details.
The proposed classifier is feasible, but it cannot classify images without the clean SPNs provided by the proposed SPN enhancer.
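
Illustrative sketch of Steps 2 to 4
The training and classification steps of the MRF approach might be sketched roughly as follows. This is a minimal pure-Python illustration under stated assumptions, not the author's implementation: all function names are hypothetical; μ1 and μ2 are assumed to come from a one-dimensional two-means split of each SPN's similarities; initial labels are simply the indices rather than random; each update takes the most probable label (lowest U_i) instead of sampling from p; a sweep cap is added as a safeguard; and a cluster centroid is assumed to be the mean of its members' SPNs.

```python
import math


def ncc(a, b):
    """Normalised cross-correlation rho between two equal-length SPN vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def similarity_matrix(spns):
    """Step 2: symmetric M x M matrix of pairwise similarities."""
    m = len(spns)
    rho = [[1.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            rho[i][j] = rho[j][i] = ncc(spns[i], spns[j])
    return rho


def reference_similarity(values, sweeps=20):
    """Step 3.2: r = (mu1 + mu2) / 2.

    Assumption: mu1 and mu2 are estimated by 1-D two-means on `values`.
    """
    mu1, mu2 = min(values), max(values)
    for _ in range(sweeps):
        low = [v for v in values if abs(v - mu1) <= abs(v - mu2)]
        high = [v for v in values if abs(v - mu1) > abs(v - mu2)]
        if low:
            mu1 = sum(low) / len(low)
        if high:
            mu2 = sum(high) / len(high)
    return (mu1 + mu2) / 2.0


def energy(f, i, labels, rho, r_i, committee):
    """U_i for candidate label f; s = -1 for matching labels, +1 otherwise."""
    return sum((-1.0 if f == labels[j] else 1.0) * (rho[i][j] - r_i)
               for j in committee)


def train(rho, c, max_sweeps=100):
    """Steps 3.1 to 3.4: return one class label per SPN in the focus set."""
    m = len(rho)
    labels = list(range(m))  # 3.1: unique initial labels (indices, not random)
    refs, committees = [], []
    for i in range(m):
        others = [j for j in range(m) if j != i]
        refs.append(reference_similarity([rho[i][j] for j in others]))  # 3.2
        others.sort(key=lambda j: rho[i][j], reverse=True)
        committees.append(others[:c])  # 3.3: the c most similar SPNs
    for _ in range(max_sweeps):  # 3.4: sweep until no label changes
        changed = False
        for i in range(m):
            # Candidate set L_i: own label plus the committee's labels.
            candidates = [labels[i]] + [labels[j] for j in committees[i]]
            best = min(candidates, key=lambda f: energy(
                f, i, labels, rho, refs[i], committees[i]))
            if best != labels[i]:
                labels[i], changed = best, True
        if not changed:
            break
    return labels


def classify(spn, spns, labels):
    """Step 4: label of the cluster whose centroid is most similar to `spn`."""
    best_label, best_sim = None, -math.inf
    for label in sorted(set(labels)):
        members = [s for s, f in zip(spns, labels) if f == label]
        centroid = [sum(col) / len(members) for col in zip(*members)]
        sim = ncc(spn, centroid)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label
```

A typical call would be `labels = train(similarity_matrix(spns), c=len(spns) // 2)` on the focus set, followed by `classify(new_spn, spns, labels)` for each remaining image. Taking the lowest-energy label keeps the sketch deterministic; sampling from p(fi | ρ(i, Ci), Li) would be the stochastic alternative.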