UNIVERSAL COUNTER FORENSICS
METHODS FOR FIRST ORDER STATISTICS
M. Barni, M. Fontani, B. Tondi, G. Di Domenico
Dept. of Information Engineering, University of Siena (IT)
Outline
1. MultiMedia Forensics & Counter-Forensics
2. Universal counter-forensics
3. Proposed approach
   1. Application to pixel domain
   2. Application to DCT domain
4. Results and discussion
MM Forensics & Counter-Forensics
• MM Forensics:
  • Goal: investigate the history of a MM content
  • Rapidly evolving field, but…
  • Countermeasures are evolving too!
• Counter-Forensics:
  • Goal: edit a content without leaving traces (fingerprints)
REWIND project: www.rewindproject.eu
Forensics & Counter-Forensics
• MM Forensics is evolving rapidly…
• Countermeasures are evolving too!
• Counter-Forensics goal: allow altering a content without leaving traces (fingerprints)
• Counter-Forensics taxonomy [K07]:
  • Scope: Universal vs. Targeted
  • Approach: Integrated vs. Postprocessing
[K07] M. Kirchner and R. Böhme, "Tamper hiding: Defeating image forensics," in Information Hiding, ser. Lecture Notes in Computer Science, vol. 4567. Springer, 2007, pp. 326–341.
Universal Counter-Forensics
• General idea:
A. If you know what statistic is used by the analyst
B. just adapt the statistic of your forgery to be very close to the statistic of "good" sequences
C. Any detector based on that statistic will be fooled!
• Game Theory:
• This scenario can be seen as a game [B12]
• Forensic Analyst vs. Attacker
• Different games are possible:
① The adversary directly knows the statistic of the "untouched sequences"
② The adversary only has a training set of “untouched sequences”
[B12] M. Barni. A game theoretic approach to source identification with known statistics. In Proc. of ICASSP 2012, IEEE Int.
Conference on Acoustics, Speech, and Signal Processing, 2012.
Outline of the scheme
• Fool a detector = force it to misclassify
• Approach: make the processed image's statistics close to those of (an) untouched image
• If it's close enough… the detector must make a false-positive or a false-negative error
• Assumptions:
• Analyst’s detector relies only on first order statistics
• Adversary has a database (DB) of histograms of untouched
images
• So the adversary:
• Processes the image
• Searches the DB for the nearest untouched histogram
• Computes a transformation map from one histogram to the other
• Applies the transformation, minimizing perceptual distortion
Practical applications
• We show how the proposed method can be used for two
different CF tasks:
• Hiding traces left by processing operations in the histogram of pixel
values
• Hiding traces left by double JPEG compression in the histogram of
quantized DCT coefficients
• You will notice that switching between different domains does not change the scheme, only the implementation of each "block"
Application #1
Conceal traces in the image histogram
• We propose a method to conceal traces left by any
processing operation in the image histogram
• Many detectors exist based on histogram analysis:
• Detection of Contrast Enhancement (pixel histogram) [S08]
• Detection of double JPEG compression (histograms of DCT
coefficients) [B12]
• We make no assumptions on the previous processing
[S08] M. C. Stamm and K. J. R. Liu. Blind forensics of contrast enhancement in digital images. In Proc. of ICIP 2008, pages 3112–
3115, 2008.
[B12] T. Bianchi, A. Piva, "Image Forgery Localization via Block-Grained Analysis of JPEG Artifacts," IEEE Transactions on Information Forensics & Security, vol. 7, no. 3, pp. 1003–1017, 2012.
Basic notation
• Y and hY denote the processed image and its histogram
• X and hX denote the untouched image and its histogram
• Z and hZ denote the attacked image and its histogram
• Γ denotes the set of histograms (in the database)
respecting possible constraints imposed by the attacker
(e.g: retaining a minimum contrast)
• With ν* we always denote the normalized version of the corresponding histogram h* (e.g., $\nu_Y(i) = h_Y(i) / \sum_k h_Y(k)$)
Phase 1: histogram retrieval
• Goal: search a database of untouched image histograms
to find h* such that:
• It has the most similar shape w.r.t. hY
• It belongs to Γ
• We propose to use the Chi-square distance, defined as
  $$d_{\chi^2}(\nu_Y, \nu) = \sum_{i} \frac{\left(\nu_Y(i) - \nu(i)\right)^2}{\nu_Y(i) + \nu(i)}$$
• Therefore, the retrieved histogram is (a retrieval sketch follows)
  $$h_X = \arg\min_{h \in \Gamma} d_{\chi^2}(\nu_Y, \nu_h)$$
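A minimal Python sketch of Phase 1, assuming the database is an in-memory list of 256-bin NumPy histograms and that the constraint set Γ is passed as a boolean predicate; the `gamma_ok` argument and the toy database are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def chi_square(p, q):
    """Chi-square distance between two normalized histograms."""
    d = p + q
    m = d > 0
    return np.sum((p[m] - q[m]) ** 2 / d[m])

def retrieve(h_Y, db, gamma_ok=lambda h: True):
    """Return the DB histogram in Gamma whose shape is closest to h_Y."""
    nu_Y = h_Y / h_Y.sum()
    best, best_d = None, np.inf
    for h in db:
        if not gamma_ok(h):                      # enforce the constraint set
            continue
        d = chi_square(nu_Y, h / h.sum())
        if d < best_d:
            best, best_d = h, d
    return best

# Toy usage: three random "untouched" histograms, query = first one + noise
rng = np.random.default_rng(0)
db = [rng.integers(1, 100, 256).astype(float) for _ in range(3)]
h_Y = db[0] + rng.integers(0, 5, 256)
h_X = retrieve(h_Y, db)
print(chi_square(h_Y / h_Y.sum(), h_X / h_X.sum()))
```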
Phase 2: histogram mapping
• Goal: find the best mapping matrix $\mathbf{n} = \{n_{ij}\}$ that turns $h_Y$ into $h_X$
  • $n_{ij}$ = number of pixels to be moved from value $i$ to value $j$
• A maximum distortion constraint is given, that avoids changes bigger than $D_{max}$ of the value of a pixel ($n_{ij} = 0$ whenever $|i - j| > D_{max}$)
• We choose the Kullback-Leibler divergence to measure the statistical dissimilarity between the histograms, yielding the following optimization problem:
  $$\min_{\mathbf{n}} \; D_{KL}\!\left(\nu_Z \,\|\, \nu_X\right) \quad \text{s.t.} \quad h_Z(j) = \sum_i n_{ij}, \;\; \sum_j n_{ij} = h_Y(i), \;\; n_{ij} \in \mathbb{N}, \;\; n_{ij} = 0 \text{ if } |i-j| > D_{max}$$
• Convex objective, but a Mixed Integer Non-Linear Problem overall (a relaxed sketch follows)
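A toy-sized Python sketch of the mapping step under stated simplifications: the integrality constraint is relaxed and recovered by rounding, a generic SciPy solver replaces a dedicated MINLP solver, and the 16-bin histograms are synthetic.

```python
import numpy as np
from scipy.optimize import minimize

B, DMAX = 16, 2                                  # toy: 16 bins, |i - j| <= 2
rng = np.random.default_rng(0)
h_Y = rng.integers(5, 50, B).astype(float)       # processed-image histogram
h_X = rng.integers(5, 50, B).astype(float)       # retrieved target histogram
nu_X = h_X / h_X.sum()

idx = [(i, j) for i in range(B) for j in range(B) if abs(i - j) <= DMAX]

def unpack(x):
    """Rebuild the full B x B mapping matrix from the free variables."""
    n = np.zeros((B, B))
    for k, (i, j) in enumerate(idx):
        n[i, j] = x[k]
    return n

def kl(x):
    """KL divergence between the attacked and target (normalized) histograms."""
    h_Z = unpack(x).sum(axis=0)
    nu_Z = h_Z / h_Z.sum()
    m = nu_Z > 0
    return np.sum(nu_Z[m] * np.log(nu_Z[m] / np.maximum(nu_X[m], 1e-12)))

# Row sums must reproduce h_Y: every original pixel has to go somewhere
cons = [{"type": "eq", "fun": lambda x, i=i: unpack(x)[i, :].sum() - h_Y[i]}
        for i in range(B)]
x0 = np.array([h_Y[i] / (2 * DMAX + 1) for i, j in idx])
res = minimize(kl, x0, bounds=[(0, None)] * len(idx),
               constraints=cons, method="SLSQP")
n_map = np.rint(unpack(res.x)).astype(int)       # recover integers by rounding
print("final KL divergence:", kl(res.x))
```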
Phase 3: pixel remapping
• We have the mapping matrix, but which specific pixels should be changed?
• Intuition: editing pixels in textured/high-variance regions causes smaller perceptual impact
• We propose an iterative approach: for each couple (i, j)
  1. Evaluate the SSIM map between Z and Y
  2. Find pixels having value i, and:
     a. scan these pixels by decreasing SSIM, change the first n(i,j) of them to j
     b. mark edited pixels as "unchangeable", repeat 2. for (i, j+1)
  3. If no more pixels of value i have to be remapped, repeat from 1., with (i+1, j)
• Remarks
  • SSIM map evaluated iteratively, to take into account on-going modifications
  • Obtained image will have, by construction, the desired histogram (a sketch of the loop follows)
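A Python sketch of the remapping loop, assuming grayscale uint8 images and using scikit-image's SSIM map as the perceptual guide; `n_map` stands for the output of the Phase 2 optimization, and the paper's exact scanning order and SSIM window settings are not reproduced.

```python
import numpy as np
from skimage.metrics import structural_similarity

def remap_pixels(Y, n_map):
    """Turn Y into Z following n_map (n_map[i, j] = number of pixels to move
    from gray value i to j), always picking pixels where the running SSIM
    between Z and Y is still highest."""
    Z = Y.copy()
    locked = np.zeros(Y.shape, dtype=bool)
    for i, j in zip(*np.nonzero(n_map)):
        if i == j:
            continue                                 # pixels that stay put
        # Iterative update: re-evaluate the SSIM map after previous changes
        _, ssim_map = structural_similarity(Y, Z, data_range=255, full=True)
        cand = np.argwhere((Z == i) & ~locked)
        order = np.argsort(-ssim_map[cand[:, 0], cand[:, 1]])  # decreasing SSIM
        chosen = cand[order[: int(n_map[i, j])]]
        Z[chosen[:, 0], chosen[:, 1]] = j
        locked[chosen[:, 0], chosen[:, 1]] = True    # mark as "unchangeable"
    return Z
```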
Advantage of iterative remapping
• If the SSIM map is not iteratively computed, visible artifacts are likely to appear…
[Figure: attacked image with iterative SSIM update vs. without iterative update]
Experimental validation
• We use the proposed technique to hide traces left by:
• Gamma-correction
• Histogram Stretching (equalization)
• Both these operators leave strong traces in image histogram
[Figure: histograms of an original, a gamma-corrected, and an equalized image]
Case study
[Figure: case-study pipeline for a gamma-corrected image — its histogram is used to search the database, the best-matching untouched histogram is retrieved from the DB, and the remapped image reproduces the remapped histogram]
[Figure: pixel histogram before counter-forensics, the retrieved DB histogram, and the histogram after counter-forensics, obtained with Dmax = 4]
Histogram enhancement detection
• Stamm’s detector [S08]
• It detects the peak-and-gap behavior of the histogram
• This is done by considering the contribution of high frequencies in the Fourier transform of the histogram (an illustrative sketch follows)
[Figure: original, gamma-corrected, and equalized examples]
[S08] M. C. Stamm and K. J. R. Liu. Blind forensics of contrast enhancement in digital images. In Proc. of ICIP 2008, pages 3112–
3115, 2008.
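For illustration only, a sketch of the kind of feature such detectors look at: peak-and-gap histograms put energy into the high frequencies of the histogram's Fourier transform. The cutoff and the absence of any weighting/threshold are assumptions of this sketch, not Stamm's actual detector.

```python
import numpy as np

def high_freq_feature(img, cutoff=32):
    """Energy of the pixel-histogram spectrum away from the low frequencies."""
    h = np.bincount(img.ravel(), minlength=256).astype(float)
    H = np.abs(np.fft.fft(h / h.sum()))
    return H[cutoff:256 - cutoff].sum()

# Example: a gamma-corrected image scores much higher than the original
rng = np.random.default_rng(0)
img = rng.integers(0, 256, (128, 128)).astype(np.uint8)
gamma = (255 * (img / 255.0) ** 0.6).astype(np.uint8)
print(high_freq_feature(img), high_freq_feature(gamma))
```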
Dataset & Experiment setup
• Database of untouched histograms from 25,000 JPEG images (MIRFLICKR dataset). Total weight: ~10 MB
• Apply gamma-correction and histogram equalization to 1,300 images from the UCID dataset
• Each processed image is "attacked" with the proposed technique, using {2, 4, 6} as values for the Dmax constraint
• We constrain the database search to histograms whose contrast is not smaller than that of the enhanced image (this is our Γ)
• We evaluate the performance of Stamm's detector in distinguishing:
• Processed vs. untouched images
• Processed&Attacked vs. untouched images
• We evaluate the similarity between attacked and processed images
using:
• PSNR (“mathematical” metric)
• Structural Similarity Index (“perceptual” metric) [W04]
Experimental results
• Results in countering detection of gamma-correction
[Results table: detector performance and Attacked – Processed distance]
Experimental results
• Results in countering detection of histogram equalization
[Results table: detector performance and Attacked – Processed distance]
Application #2
Conceal traces in the histograms of DCT coefficients
• Method to conceal traces left by double compression in the histograms of quantized DCT coefficients
• A huge number of detectors exploits double quantization, e.g.:
  • Estimation of the previous compression [P08]
  • Forgery detection [H06]
[P08] T. Pevny and J. Fridrich, “Estimation of primary quantization matrix for steganalysis of double-compressed JPEG images,”
Proceedings of SPIE, vol. 6819, pp. 681911–681911–13, 2008
[H06] J. He, Z. Lin, L. Wang, and X. Tang, “Detecting doctored JPEG images via DCT coefficient analysis,” in Lecture Notes in
Computer Science. Springer, 2006, pp. 423–435.
Double Quantization
• DQ is a sequence of three steps:
1. quantization with step b
2. de-quantization with step b
3. quantization with step a
[Figure: histogram of doubly quantized coefficients, showing the characteristic gaps (a toy demonstration follows)]
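A small NumPy demonstration of the three steps on synthetic Laplacian-distributed coefficients; the steps b = 7 and a = 3 are chosen only so that the gaps are easy to see.

```python
import numpy as np

rng = np.random.default_rng(0)
c = rng.laplace(scale=20, size=100_000)   # stand-in for unquantized DCT coeffs

b, a = 7, 3                               # first quantization stronger (b > a)
q1  = np.round(c / b)                     # 1. quantization with step b
deq = q1 * b                              # 2. de-quantization with step b
q2  = np.round(deq / a).astype(int)       # 3. quantization with step a

present = set(q2.tolist())
print("empty bins near zero:", [v for v in range(-12, 13) if v not in present])
# several bins never occur: the characteristic gaps of double quantization
```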
More on DQ…
• Why is it interesting?
• Allows forgery detection
• Tells something about the
history of the content
(e.g. fake quality problem)
• NOTICE:
• Effect is visible when first quantization is stronger than the
second
• The behavior is observed in the histogram of quantized DCT
coefficients
• If JPEG compression has been carried out, holes are always present in the histogram of de-quantized coefficients
More on DCT histograms…
• Double JPEG compression leaves traces in the histogram of each DCT coefficient
• How is this histogram calculated?
• Intuition:
[Diagram: the image is split into single 8x8 blocks, a block-wise DCT is computed, and the coefficients of each frequency are collected into per-frequency histograms (a sketch follows)]
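A Python sketch of how the 64 per-frequency histograms can be computed, assuming a grayscale image and SciPy's DCT-II; `q_table` stands for the 8x8 quantization table of the last JPEG compression.

```python
import numpy as np
from scipy.fftpack import dct

def dct_coeff_histograms(img, q_table):
    """One histogram (dict value -> count) per DCT frequency, 64 in total."""
    h, w = (s - s % 8 for s in img.shape)            # crop to multiples of 8
    hists = [dict() for _ in range(64)]
    for r in range(0, h, 8):
        for c in range(0, w, 8):
            block = img[r:r + 8, c:c + 8].astype(float) - 128
            D = dct(dct(block.T, norm="ortho").T, norm="ortho")   # 2-D DCT-II
            Q = np.round(D / q_table).astype(int)
            for k, v in enumerate(Q.ravel()):        # k = frequency, row-major
                hists[k][v] = hists[k].get(v, 0) + 1
    return hists
```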
Perception in the DCT domain
• Understand the relationship between changes in the DCT domain and effects in the spatial domain
• Just Noticeable Difference (JND) => minimum amount of change in a coefficient leading to a visible artifact
• Watson defined JNDs for the DCT case, taking into account Human Visual System (HVS) properties:
  • More sensitive to low frequencies
  • Luminance masking: brighter blocks can be changed more
  • Contrast masking: more contrast allows more editing
[Figure: Watson's 8x8 JND matrix, ranging from about 1.4, 1.0, and 1.45 at the lowest frequencies to 14.5, 17.2, and 21 at the highest]
What we want to do
• In this case, traces are left in DCT histograms of
quantized coefficients…
• We must change these histograms, to make them similar to those of a singly-compressed image!
• We need to revisit the previous application to adapt to the
DCT domain
• More histograms (64 instead of 1)
• More variables (coefficients vary from -1024 to 1016)
• Less intuitive remapping rules…
Histogram retrieval… revisited!
• Need all DCT histograms of singly compressed images
• Just take some JPEG images and extract them? NO!
• DCT histograms depend on the quantization they have undergone
• Search would be practically dominated by this fact
• We need to simulate JPEG compressed images:
• Take DCT histograms of never-compressed images
• During the search, quantize each of them with the same quantization step as the query histogram (a sketch follows)
• Distances may be weighted, to give more importance to low-frequency coefficients
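A minimal sketch of the simulation step: the histogram of never-compressed DCT coefficients is re-binned on the fly with the quantization step of the query. The (values, counts) histogram format is an assumption of this sketch.

```python
import numpy as np

def quantize_histogram(values, counts, q):
    """Re-bin a histogram of never-compressed DCT coefficients with step q,
    simulating the histogram of a singly JPEG-compressed image."""
    qvals = np.round(np.asarray(values, dtype=float) / q).astype(int)
    out = {}
    for v, c in zip(qvals, counts):
        out[int(v)] = out.get(int(v), 0) + int(c)
    return out
```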
Histogram mapping… revisited!
• The problem is the very same, repeated 64 times
• Problem: how to set the perceptual constraint (Dmax)?
• Idea: make it depend on JNDs
=> allow at most the amount of change leading to a JND
• Here we cannot exploit local information (luminance/contrast)
• Notice:
  • we're working on quantized coefficients!
  • changes will be expanded after de-quantization!
  • => Watson's matrix must be divided by the quantization step (a sketch follows)
[Figure: Watson's JND matrix (entries such as 1.4, 1.0, 1.45, …, 14.5, 17.2, 21) divided, entry by entry, by the quantization steps (here 2)]
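A small sketch of this choice of Dmax, where `watson_jnd` stands for Watson's 8x8 JND matrix (only a few of its entries appear on the slide) and `q_table` for the quantization table of the last compression; flooring to an integer budget is an assumption of the sketch.

```python
import numpy as np

def dmax_per_frequency(watson_jnd, q_table):
    """Distortion budget on *quantized* coefficients: a change of 1 here
    becomes a change of q after de-quantization, so the JND is divided by q.
    A budget of 0 simply leaves that frequency untouched."""
    return np.floor(np.asarray(watson_jnd, dtype=float) / q_table).astype(int)
```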
Pixel mapping… revisited!
• We have to move some DCT coefficients from one value to another… how do we choose them?
• We exploit the Watson model again
• This time, we can exploit local information too
• Algorithm:
  1. Evaluate the JND for all blocks;
  2. For each element n(i,j):
     a. Find coefficients having value i, and:
     b. scan these coefficients by decreasing JND, change the first n(i,j) of them to j
     c. mark edited coefficients as "unchangeable", repeat 2. for (i, j+1)
  3. If no more coefficients of value i have to be remapped, repeat from 2., with (i+1, j)
Does it work so smoothly?
• No, it doesn’t
• Artifacts show up, probably due to the high number of
changed coefficients in high frequencies
• Possible solutions
• Consider the joint impact of changes in more than one frequency
• Anything else? [open question!]
• However, most detectors usually rely on low-frequency
coefficients
• We ran some experiments remapping only the first 16 coefficients (in zig-zag ordering); a small helper is sketched below
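A small helper, assuming the standard JPEG zig-zag scan, to pick the 16 low-frequency positions that are actually remapped.

```python
def zigzag_indices(n=16):
    """(row, col) positions of the first n 8x8 DCT frequencies in zig-zag order."""
    cells = [(r, c) for r in range(8) for c in range(8)]
    cells.sort(key=lambda rc: (rc[0] + rc[1],
                               rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))
    return cells[:n]

print(zigzag_indices(6))   # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```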
Experimental setup: detector
• We implement a detector for double compression based on calibration
• Calibration allows estimating the original distribution of a quantized signal
• Basic idea with JPEG:
  • Cut a small number of rows/columns
  • Compute the 8x8 DCT and the histograms (a sketch follows)
[Figure: DCT coefficient histogram as read from the file vs. the estimated (calibrated) histogram]
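A sketch of the calibration idea restricted to the DC coefficient, assuming a grayscale decoded image; the 4-pixel crop and the DC-only restriction are assumptions of this sketch, not the exact detector.

```python
import numpy as np
from scipy.fftpack import dct

def dc_histogram(img, q_dc):
    """Histogram of the quantized DC coefficients of the 8x8 block DCT."""
    h, w = (s - s % 8 for s in img.shape)
    blocks = (img[:h, :w].astype(float)
              .reshape(h // 8, 8, w // 8, 8)
              .transpose(0, 2, 1, 3).reshape(-1, 8, 8) - 128)
    D = dct(dct(blocks, axis=1, norm="ortho"), axis=2, norm="ortho")
    q = np.round(D[:, 0, 0] / q_dc).astype(int)
    return np.bincount(q - q.min()), int(q.min())

def calibrated_dc_histogram(decoded, q_dc, crop=4):
    """Cropping breaks the 8x8 grid alignment, so the recomputed histogram
    approximates the statistics before the last quantization."""
    return dc_histogram(decoded[crop:, crop:], q_dc)
```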
Experimental setup: method
• 200 TIFF (never compressed) images
• The experiment consists in evaluating the detector performance before and after the counter-forensic attack
• Pipeline: Compress → Run detector → Re-compress → Run detector → Remove traces → Run detector (a compression helper is sketched after this slide)
• Detector evaluated in these tasks:
• Discriminate single- vs. double- compressed images
• Discriminate single- vs. attacked images
• We do not want to cheat
• i.e., we do not use threshold values from the first experiment to do
classification in the second
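A sketch of the compression steps of this pipeline using Pillow; the qualities 80 and 90 are illustrative (the first compression just has to be the stronger one), and `remove_traces` / the detector stand for the method and detector described in the surrounding slides.

```python
import io
import numpy as np
from PIL import Image

def jpeg(arr, quality):
    """JPEG-compress a grayscale uint8 array in memory and decode it back."""
    buf = io.BytesIO()
    Image.fromarray(arr).save(buf, format="JPEG", quality=quality)
    return np.array(Image.open(buf))

# Pipeline (detector and trace removal not shown here):
#   single   = jpeg(tiff_image, quality=80)   # single compression -> run detector
#   double   = jpeg(single, quality=90)       # double compression -> run detector
#   attacked = remove_traces(double)          # proposed method    -> run detector
```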
Experimental results
[ROC curves (detection probability vs. false-alarm probability): doubly compressed vs. singly compressed, AUC = 0.998; remapped (attacked) vs. singly compressed, AUC = 0.771. Mean SSIM of the attacked images: 0.968, mean PSNR: 42.9 dB]
Conclusions
• Our universal CF methods allow concealing traces left by any processing in the first-order statistics
• Evaluation of the effectiveness should probably rely on statistical measures rather than on detectors
• Future works:
  • Explore connections with Optimal Transportation theory
  • Explore the use of un-quantized DCT coefficients (conceal traces of single compression)
  • Develop an integrated method to re-compress an image without leaving traces
  • Explore the use of different objective functions for the histogram mapping problem
Thank you
Questions?
Acknowledgments
This work has been supported by the REWIND project