International Journal of Innovative
Computing, Information and Control
Volume 3, Number 3, June 2007
ICIC International ©2007
ISSN 1349-4198
pp. 575-588
A LSB STEGANOGRAPHY APPROACH AGAINST PIXELS SAMPLE
PAIRS STEGANALYSIS
Xiangyang Luo, Fenlin Liu
Zhengzhou Information Science and Technology Institute
Zhengzhou 450002, P. R. China
xiangyangluo@126.com; liufenlin@vip.sina.com
Peizhong Lu
Department of Computer Science and Engineering
Fudan University
Shanghai 200433, P. R. China
pzlu@fudan.edu.cn
Received October 2005; revised February 2006
Abstract. Through dynamic compensation of the pixel values of an LSB (Least Significant
Bit) embedded image, this paper presents a novel LSB information hiding method against
pixel Sample Pairs Steganalysis (SPA), a powerful, high-precision steganalysis method
proposed by Dumitrescu et al. The approach embeds messages in the LSB plane of
the carrier image randomly via a chaotic system, then applies a dynamic compensation
to the stego-image. Even when the embedding ratio is near 100%, this dynamic
compensation leads SPA steganalysis to an incorrect judgment, because the estimate it
obtains is very small, close to 0. Moreover, the initial value of the chaotic system
and the selection parameters of the compensation can be used to improve the security of
the steganography approach. Experimental results show that this method can also resist some
other steganalysis methods, such as RS analysis, the DIH method, and various improved
versions of SPA and RS steganalysis.
Keywords: Steganography, LSB embedding, Sample pairs steganalysis, Chaotic system,
Dynamic compensation
1. Introduction. Steganography is one of the important research subjects in the field of
information security. It enables secret communication by embedding messages in texts,
images, audio, video files or other digital carriers. Among image information hiding
methods, LSB embedding is widely used for its high hiding capacity and its ease of
implementation. Many public steganographic software packages, such as S-Tools, EZStego
and Steganos, apply this technique. It is therefore of great significance to detect images
with hidden messages produced by LSB embedding effectively, accurately and reliably, and
many researchers have worked on LSB steganography and steganalysis over the
years.
Fridrich et al. [1] developed a steganalysis method for detecting LSB embedding in
24-bit color images (the Raw Quick Pairs, or RQP, method), which is based on analysis of
close pairs of colors created by LSB embedding. It works reasonably well as long as the
number of unique colors in the cover image is less than 30% of the number of pixels. The
RQP method can only provide a rough estimate of the size of the secret message, and the
result becomes progressively unreliable once the number of unique colors exceeds about
50 percent of the number of pixels. Westfeld [2,3] performed blind steganalysis based
on statistical analysis of PoVs (pairs of values). This method, the so-called χ²-statistical
test, gives successful results against sequential LSB steganography. Provos [4] pointed out
that the same idea can also detect random embedding if applied to smaller parts of images.
Chandramouli [5,6] introduced a hypothesis test to judge the existence of secret messages
hidden by LSB embedding, and the general framework was given in [7]. Fridrich et al. [8,9]
also proposed the powerful RS (regular and singular groups) method. This method uses
statistics of the changes in the regular groups and singular groups of the image to estimate
the embedded length accurately. It is suitable for color or gray-scale images when the
messages are embedded randomly. Sorina Dumitrescu et al. [10,11] proposed SPA, a
method to detect LSB steganography via sample pair analysis. When the embedding
ratio is more than 3%, this method can estimate the embedded length with relatively
high precision. Paper [12] presented a reliable detection method for LSB steganography
based on the difference image histogram (the DIH method). This method not only detects
the existence of messages embedded by sequential or random LSB replacement in images
reliably, but also estimates the hidden length. If the embedding ratio is above 50%, the
result of the DIH method is more accurate than that of the RS method. In a sense, RS and
SPA are two of the most reliable detectors of thinly-spread LSB steganography to date
[13]. Andrew D. Ker [13,14] evaluated the reliability of Pairs [15], RS and SPA (which is
called Couples Steganalysis in [14]) through extensive experiments, and proposed
some good improvements. We also presented an LSM sample pair steganalysis based on an
improvement of SPA in [16].
However, most of the methods discussed above (especially the highly reliable ones)
are based on some important statistical hypotheses. Though these assumptions hold
for natural images, they also give the encoder a way to avoid detection:
the encoder may purposely adjust the embedded image so that it still satisfies
these assumptions and thus avoids exposure. Against RS analysis, Jeong Jae Yu et al. put
forward SES [17], which randomly adds one to or subtracts one from the pixel
values of the carrier image during embedding, and uses the part of the image without
hidden message to adjust the RS statistics. This method
can resist detection by RS or the χ²-statistical test, and the more
pixels are used to adjust the RS statistics, the slimmer the
chance that a hidden message is caught by RS detection. So the message embedded
by LSB should not be too large if the system is to be secure. The authors
recommended that more than 50% of the pixels be used to adjust the RS statistics.
However, as far as SPA steganalysis is concerned, no effective attack seems
to have been proposed in the published literature.
Through analysis of the method described in [10], we developed a new LSB
embedding method that resists SPA detection, called the DCLS (Dynamic
Compensation LSB Steganography) method for short. The DCLS method is perhaps the first
one focused on defeating SPA. The method first embeds messages in the LSB plane
of the carrier image randomly via a chaotic system, then applies a dynamic compensation
to the stego-image. The compensation style is not fixed, i.e., the position and shape of the
compensated part (a rectangle or a circle, for example) can be chosen at will. Moreover, some parameters,
such as the initial value of the chaotic system and the starting point, length and width of
a chosen rectangular area, can be used as part of a system key, which further improves
the security of the whole hiding system. Extensive experiments show that the
method described in this paper can also effectively evade attacks from RS, DIH, improved
RS and SPA, and other analyses based on sample pairs. Even when the embedding ratio
is near 100%, the method remains effective against such attacks. The method can
also defeat the χ²-statistical test.
This paper is organized as follows. The principle of SPA steganalysis is briefly introduced
in Section 2. Section 3 describes the principle and algorithm of the DCLS method, as well
as the corresponding analysis. Section 4 presents the chaotic system and experimental
results, and Section 5 concludes the paper.
2. Principle of SPA Steganalysis.
2.1. SPA steganalysis method. The SPA method is based on finite-state
machine theory. The states of the finite-state machine are selected multisets of sample pairs.
For sample pairs drawn from images, these multisets satisfy some inherent statistical
relations. After random LSB embedding the multisets change, and so do these
statistical relations.
Assume that the pixel values of an image are represented by the succession of samples
$s_1, s_2, \cdots, s_N$ (the index represents the location of a sample in the image). A sample pair
is a two-tuple $(s_i, s_j)$, $1 \le i, j \le N$. Let $P$ be a set of sample pairs drawn from an
image; then $P$ can be seen as a multiset of two-tuples $(u, v)$, where $u$ and $v$ are the values
of two adjacent samples, $0 \le u \le 2^b - 1$, $0 \le v \le 2^b - 1$, and $b$ is the number of bits used to
represent each sample value. Denote by $D_n$ the submultiset of $P$ that consists of sample
pairs of the form $(u, u+n)$ or $(u+n, u)$, where $n$ is a fixed integer, $0 \le n \le 2^b - 1$. For
each integer $m$, $0 \le m \le 2^{b-1} - 1$, define
$$C_m = \{(u, v) \in P \mid \lfloor u/2 \rfloor - \lfloor v/2 \rfloor = m \ \text{or}\ \lfloor v/2 \rfloor - \lfloor u/2 \rfloor = m\},$$
where $\lfloor \cdot \rfloor$ denotes the largest integer not exceeding its argument.
Obviously, the $D_n$ form a partition of $P$, the $C_m$ form another partition of $P$, and $D_{2m}$
is contained in $C_m$. We partition $D_{2m+1}$ into two multisets $X_{2m+1}$ and $Y_{2m+1}$, where
$X_{2m+1} = D_{2m+1} \cap C_{m+1}$, $Y_{2m+1} = D_{2m+1} \cap C_m$, for $0 \le m \le 2^{b-1} - 2$, and $X_{2^b-1} = \emptyset$,
$Y_{2^b-1} = D_{2^b-1}$. Both $X_{2m+1}$ and $Y_{2m+1}$ contain pairs $(u, v)$ that differ by $2m+1$ (i.e.,
$|u - v| = 2m + 1$). Pairs in which the even component is larger are in $X_{2m+1}$,
whereas pairs in which the odd component is larger are in $Y_{2m+1}$. For natural
images (normal signals), a sample pair in $D_{2m+1}$ is equally likely to have a larger or
a smaller even component; all the algorithms discussed in [10] are
based on this important assumption. That is to say:
$$E\{|X_{2m+1}|\} = E\{|Y_{2m+1}|\} \tag{1}$$
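The definitions of $D_n$, $C_m$, $X_{2m+1}$ and $Y_{2m+1}$ above can be illustrated with a minimal sketch. The function name `spa_multisets` and the toy pair list are hypothetical, introduced only for illustration; the code simply counts how many pairs fall into each multiset.

```python
from collections import Counter

def spa_multisets(pairs):
    """Count |D_n|, |C_m|, |X_{2k+1}| and |Y_{2k+1}| for a list of (u, v) sample pairs.

    X and Y are keyed by the odd difference n = 2k + 1.
    """
    D, C, X, Y = Counter(), Counter(), Counter(), Counter()
    for u, v in pairs:
        n = abs(u - v)               # the pair belongs to D_n
        m = abs(u // 2 - v // 2)     # the pair belongs to C_m
        D[n] += 1
        C[m] += 1
        if n % 2 == 1:               # odd difference 2k + 1: the pair is in X or Y
            k = (n - 1) // 2
            if m == k + 1:           # D_{2k+1} ∩ C_{k+1}: the even component is larger
                X[n] += 1
            else:                    # D_{2k+1} ∩ C_k: the odd component is larger
                Y[n] += 1
    return D, C, X, Y

# Toy pairs: (4, 1) has the larger even value, (5, 2) and (3, 2) the larger odd value.
pairs = [(4, 1), (5, 2), (6, 6), (3, 2)]
D, C, X, Y = spa_multisets(pairs)
```

Under assumption (1), the counts in `X` and `Y` for a given odd difference should be roughly equal on a natural image; four hand-picked pairs are of course far too few to show that.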
Through deduction, the following quadratic equations can be derived:
$$\frac{(|C_m| - |C_{m+1}|)\,p^2}{4} - \frac{(|D'_{2m}| - |D'_{2m+2}| + 2|Y'_{2m+1}| - 2|X'_{2m+1}|)\,p}{2} + |Y'_{2m+1}| - |X'_{2m+1}| = 0, \quad m \ge 1 \tag{2}$$
and
$$\frac{(2|C_0| - |C_1|)\,p^2}{4} - \frac{(2|D'_0| - |D'_2| + 2|Y'_1| - 2|X'_1|)\,p}{2} + |Y'_1| - |X'_1| = 0, \quad m = 0 \tag{3}$$
where $1 \le m \le 2^{b-1} - 1$, and $|*'|$ denotes the cardinality of the multiset $*$ in the
embedded image. Solving either (2) or (3) yields the value of $p$, the length of the
hidden information.
To get a more robust estimate of the length, literature [10] uses a more accurate assumption
$$E\left\{\left|\bigcup_{m=i}^{j} X_{2m+1}\right|\right\} = E\left\{\left|\bigcup_{m=i}^{j} Y_{2m+1}\right|\right\} \tag{4}$$
to replace the assumption (1), where $1 \le i \le m \le j \le 2^{b-1} - 1$, then a more powerful
quadratic equation can be obtained to estimate the value of $p$:
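$$\frac{(|C_i| - |C_{j+1}|)\,p^2}{4} - \frac{\left(|D'_{2i}| - |D'_{2j+2}| + 2\sum_{m=i}^{j}(|Y'_{2m+1}| - |X'_{2m+1}|)\right)p}{2} + \sum_{m=i}^{j}(|Y'_{2m+1}| - |X'_{2m+1}|) = 0, \quad i \ge 1 \tag{5}$$
and
$$\frac{(2|C_0| - |C_{j+1}|)\,p^2}{4} - \frac{\left(2|D'_0| - |D'_{2j+2}| + 2\sum_{m=0}^{j}(|Y'_{2m+1}| - |X'_{2m+1}|)\right)p}{2} + \sum_{m=0}^{j}(|Y'_{2m+1}| - |X'_{2m+1}|) = 0, \quad i = 0 \tag{6}$$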
A precise estimate of the hiding ratio can be obtained by solving either equation (5)
or (6) (see footnote 1). Through experiments, the authors of [10] give recommended values
of i, j and the judgment threshold. The most precise length estimate is obtained
when i = 0, j = 30 and the judgment threshold is 0.018, and the average error is 0.023 in
this situation. The experimental results reported in [10] are shown in Table 1:
Table 1. The detection results in literature [10]
Embedding ratio   0%       3%       5%   10%   15%   20%
Error ratio       0.1379   0.1103   0    0     0     0
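As a rough illustration of how equation (6) turns pair statistics into an estimate of p, the sketch below counts the multisets on the examined image and solves the quadratic. Several choices are assumptions made only for this sketch: sample pairs are taken as horizontally adjacent pixels, the image is a list of rows of integers, the function name `spa_estimate` is hypothetical, and of the two roots the one with the smaller absolute value is returned.

```python
import math
from collections import Counter

def spa_estimate(rows, j=30):
    """Estimate p from equation (6) (i = 0) using horizontally adjacent pairs.

    Returns None when the equation has no real root or is degenerate.
    """
    c_cnt, d_cnt = Counter(), Counter()
    x = y = 0
    for row in rows:
        for u, v in zip(row, row[1:]):
            d = abs(u - v)
            d_cnt[d] += 1                          # |D_d|
            c_cnt[abs(u // 2 - v // 2)] += 1       # |C_m|
            if d % 2 == 1 and d <= 2 * j + 1:      # odd difference 2m+1 with m <= j
                if max(u, v) % 2 == 0:
                    x += 1                         # even component larger: X
                else:
                    y += 1                         # odd component larger: Y
    a = (2 * c_cnt[0] - c_cnt[j + 1]) / 4
    b = -(2 * d_cnt[0] - d_cnt[2 * j + 2] + 2 * (y - x)) / 2
    c = y - x
    if a == 0:                                     # equation degenerates to b*p + c = 0
        return -c / b if b != 0 else None
    disc = b * b - 4 * a * c
    if disc < 0:
        return None
    r1 = (-b - math.sqrt(disc)) / (2 * a)
    r2 = (-b + math.sqrt(disc)) / (2 * a)
    return min(r1, r2, key=abs)                    # assumption: keep the smaller root

balanced = spa_estimate([[0, 1, 2, 1, 0]])   # two X pairs and two Y pairs
one_sided = spa_estimate([[1, 2, 1, 2]])     # three X pairs, no Y pairs
```

In the first toy row the X and Y counts are equal, so the constant term of the quadratic vanishes and the estimate is 0; in the second, all odd-difference pairs fall on one side and the degenerate equation gives an estimate of 1. The estimate would then be compared with the judgment threshold (0.018 in [10]).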
2.2. Principle analysis of SPA method. As illustrated in Section 2.1,
SPA steganalysis is based on assumption (1) or assumption (4). From
statistics of certain characteristics of the sample pairs in the image, it constructs an
equation in $p$. The resulting estimate of the hidden message ratio is then compared with
the threshold to judge whether a secret message exists. The statistical characteristics
include $|C_m|$, $|D_{2m}|$, $|X_{2m+1}|$ and $|Y_{2m+1}|$, the cardinalities of the multisets $C_m$, $D_{2m}$, $X_{2m+1}$
and $Y_{2m+1}$.
For each modification pattern $\pi \in \{00, 10, 01, 11\}$ and any submultiset $A \subseteq P$, denote
by $\rho(\pi, A)$ the probability that the sample pairs of $A$ are modified with pattern $\pi$ as a
result of the LSB embedding. The multiset $A$ is called unbiased if $\rho(\pi, A) = \rho(\pi, P)$
holds for each modification pattern $\pi$; otherwise $A$ is
called biased.
Footnote 1. $|C_t|$ ($0 \le t \le 2^{b-1} - 1$) is unknown to attackers, because they cannot get the original carrier images. All the
values in equations (2)-(6) can only be obtained from statistics of the examined
image, so $|C_t|$ in these equations is actually $|C'_t|$.
The most straightforward attack is to avoid LSB embedding at locations where
adjacent sample pairs have close values, for example, no embedding in
adjacent sample pairs that differ by less than 3 in value. In other words, we can
purposefully make $C_0$ biased, which degrades the accuracy of detection. Such an attack against
SPA embeds message bits only into candidate sample locations where all adjacent
sample pairs are in $C_t$, $t \ge \tau$, where $\tau$ is a prefixed threshold. In other words, any
sample pair $(u, v)$ with $|u - v| \ge 2\tau - 1$ will satisfy $|u' - v'| \ge 2\tau - 1$, where $(u', v')$ denotes the
values of the two samples after LSB embedding. Clearly, this LSB embedding
scheme is conditioned on $C_t$, $t \ge \tau$, and both encoder and decoder can refer to the same $C_t$
to decide whether a sample is a candidate for embedding.
This kind of attack was examined in [10], where the corresponding
countermeasures are discussed. In general, the proposed method is open to attack if the locations
of the chosen sample pairs in $P$ are known and if the algorithm examines a specific close set $C_s$
and the chosen $s$ is also known. But SPA detection can use statistics of all the possible
relations of adjacent pixel pairs: through equation (5), different multisets $\bigcup_{m=i}^{j} C_m$ and
$\bigcup_{m=i+1}^{j+1} C_m$ can be chosen to estimate the value of $p$ for different $i$ and $j$. The estimate
is more accurate if $\bigcup_{m=i}^{j} C_m$ and $\bigcup_{m=i+1}^{j+1} C_m$ are unbiased. In fact, it is extremely
difficult, if not impossible, to select the locations of embedded message bits in such a way
that all of $C_m$, $0 \le m \le 2^{b-1} - 1$, become biased. So attempting to foil
SPA detection by choosing embedding locations is not feasible.
Let $\varepsilon_m$ denote the difference between $|Y_{2m+1}|$ and $|X_{2m+1}|$, i.e., $\varepsilon_m = |Y_{2m+1}| - |X_{2m+1}|$,
$0 \le m \le 2^{b-1} - 2$.
For $1 \le i \le j \le 2^{b-1} - 2$, denote
$$e_{ij} = \frac{2\sum_{m=i}^{j} \varepsilon_m}{|D_{2i}| - |D_{2j+2}|},$$
and for $0 = i \le j \le 2^{b-1} - 2$, denote
$$e_{ij} = \frac{2\sum_{m=0}^{j} \varepsilon_m}{2|D_0| - |D_{2j+2}|}.$$
The error bound on the estimate of the embedding ratio is presented in
literature [10] through theoretical deduction as
$$|p - \hat{p}(i, j)| \le \frac{2|e_{ij}|}{1 - e_{ij}}\,(1 - p),$$
where $0 \le i \le j \le 2^{b-1} - 2$, and $\hat{p}(i, j)$ is the estimate of $p$ obtained by solving equation (5) or (6).
Note that $|e_{ij}|$ should be small to reduce the estimation error; in other words, we would
like to reduce $\sum_{m=i}^{j} \varepsilon_m$ and increase $|D_{2i}| - |D_{2j+2}|$. $\sum_{m=i}^{j} \varepsilon_m$ decreases in general as the
difference between $i$ and $j$ increases. Moreover, given an $i$, the larger the distance $j - i$,
the larger the difference $|D_{2i}| - |D_{2j+2}|$ and the more robust the estimate of $p$. Therefore, as stated
in [10], we should let $i$ be 0 and choose a sufficiently large $j$ to obtain a robust
estimate by solving equation (6). In fact, $|X_{2m+1}|$ and $|Y_{2m+1}|$ are so small that they can be
ignored when $m > 30$, and the recommended value $j = 30$ is given by the authors of [10]
based on extensive experiments.
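To make the error bound concrete, the following sketch evaluates $e_{ij}$ and the bound $\frac{2|e_{ij}|}{1 - e_{ij}}(1 - p)$ for given counts. The function names and the sample numbers are hypothetical and chosen only for illustration; they are not measurements from any image.

```python
def e_ij(eps_sum, d_2i, d_2j2, i):
    """e_ij computed from sum(eps_m for m = i..j), |D_{2i}| and |D_{2j+2}|.

    For i = 0 the denominator is 2|D_0| - |D_{2j+2}|, otherwise |D_{2i}| - |D_{2j+2}|.
    """
    denom = (2 * d_2i if i == 0 else d_2i) - d_2j2
    return 2 * eps_sum / denom

def error_bound(eps_sum, d_2i, d_2j2, i, p):
    """Upper bound on |p - p_hat(i, j)| for a true embedding ratio p."""
    e = e_ij(eps_sum, d_2i, d_2j2, i)
    return 2 * abs(e) / (1 - e) * (1 - p)

# Hypothetical counts: sum of eps_m = 5, |D_0| = 1000, |D_{2j+2}| = 0, i = 0, p = 0.5.
e = e_ij(5, 1000, 0, 0)
bound = error_bound(5, 1000, 0, 0, 0.5)
```

With these numbers $e_{0j} = 0.005$, so the bound is about 0.005 at p = 0.5: a small imbalance between X and Y relative to a large $|D_0|$ keeps the estimate tight, which is the reasoning behind choosing i = 0 and a large j.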
3. Dynamic Compensation LSB Steganography System. Because equation (6)
is based on assumption (4), our attempt to defeat the SPA attack starts from
assumption (4).
For an image whose least significant bits have been embedded with a message, $\left|\bigcup_{m=i}^{j} X_{2m+1}\right|$
and $\left|\bigcup_{m=i}^{j} Y_{2m+1}\right|$ will be altered, and the distance between $Y_{2m+1}$ and $X_{2m+1}$ before
and after embedding will be altered correspondingly. But how much does this alteration
affect the obtained hiding ratio?
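Let the irrelevance between $\bigcup_{m=i}^{j} X'_{2m+1}$ and $\bigcup_{m=i}^{j} Y'_{2m+1}$ be written as $\delta_{i,j}$:
$$\delta_{i,j} = \frac{\left|\,\left|\bigcup_{m=i}^{j} Y'_{2m+1}\right| - \left|\bigcup_{m=i}^{j} X'_{2m+1}\right|\,\right|}{\left|\bigcup_{m=i}^{j} Y'_{2m+1}\right| + \left|\bigcup_{m=i}^{j} X'_{2m+1}\right|},$$
then
$$\left|\,\left|\bigcup_{m=i}^{j} Y'_{2m+1}\right| - \left|\bigcup_{m=i}^{j} X'_{2m+1}\right|\,\right| = \delta_{i,j} \times \left(\left|\bigcup_{m=i}^{j} Y'_{2m+1}\right| + \left|\bigcup_{m=i}^{j} X'_{2m+1}\right|\right).$$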
Figure 2 shows how changing $\delta_{i,j}$ affects the SPA detection result for the original
image Lena (Figure 1, color, red component), under assumption (1) ($i = j$) and
assumption (4) ($i = 0$, $j = 30$).
Figure 1. Original Lena image
Figure 2. The effect irrelevance causes on detection results
Generally, the value of $\delta_{i,j}$ becomes larger after LSB embedding. As shown in
Figure 2, the estimate of $p$ obtained from detection grows as $\delta_{i,j}$ increases.
In other words, the closer the difference between $\left|\bigcup_{m=i}^{j} Y'_{2m+1}\right|$ and $\left|\bigcup_{m=i}^{j} X'_{2m+1}\right|$
is to 0, the smaller the estimated length is. So to defeat SPA, we must make the difference
between $\left|\bigcup_{m=i}^{j} Y'_{2m+1}\right|$ and $\left|\bigcup_{m=i}^{j} X'_{2m+1}\right|$ small enough that $p$ falls below the
judgment threshold. Based on this idea, we present an LSB embedding method that can
defeat SPA steganalysis. In this method, all the pixel values in part of the LSB embedded
image are compensated: 1 is added to (or subtracted from) all
the pixel values in the chosen part so that $\delta_{i,j}$ becomes as small as possible, close to 0,
which in turn makes the SPA estimate close to 0. The extent of this part can be dynamically
chosen according to $\delta_{i,j}$, the irrelevance after embedding.
Denote the carrier image by $I$, the embedded message by $S$, the embedded image by $I'$,
and the dynamically compensated $I'$ by $I''$. Assume that the width
and height of the image are $W = M$ and $H = N$, where $M$ and $N$ are the
numbers of pixels in the horizontal and vertical directions of the image, respectively. Generally, the
whole LSB steganography system falls into two processes, information embedding
and information extraction, which are carried out on the two sides of the
communication, the sender and the receiver. In this paper, we first preprocess the
carrier image, then embed the message in the LSB plane of the pixels, and
afterwards compensate the pixel values of the image, that is, adjust the statistics
$\left|\bigcup_{m=0}^{j} Y'_{2m+1}\right|$ and $\left|\bigcup_{m=0}^{j} X'_{2m+1}\right|$ of the embedded image. Since $|Y'_{2m+1}|$
and $|X'_{2m+1}|$ are nearly 0 when $m > 30$, only $\left|\bigcup_{m=0}^{30} Y'_{2m+1}\right|$ and
$\left|\bigcup_{m=0}^{30} X'_{2m+1}\right|$ need to be dynamically compensated in practice. The principle of the whole hiding system is illustrated
in Figure 3.
The message embedding algorithm, compensation preprocessing, compensation
algorithm and extraction algorithm are explained in turn below.
Figure 3. The steganography system against sample pairs steganalysis
3.1. Message embedding algorithm. There are two modes of LSB embedding,
sequential embedding and random embedding. In line with the SPA steganalysis
method, we consider only the latter mode in this paper. In the random embedding process, we use a random number generator based on a chaotic system.
Input: original carrier image $I$; message bit-stream $m_1, \cdots, m_l$, where $l$ is the embedding
length and $m_i = 0$ or $1$, $1 \le i \le l$; initial key $K$.
Output: stego-image $I'$.
Steps:
1. Index the image. Index the pixels of the cover image from
left to right and from top to bottom as $c_1, c_2, \cdots, c_{M \times N}$.
2. Generate a random sequence. From the initial key $K$, generate a random
sequence $k_1, \cdots, k_{M \times N}$ via a perturbed digital chaotic system.
3. Select embedding positions. Use the pixels at the following index locations to embed
message bits:
$j_1 = k_1$
$j_i = j_{i-1} + k_i$, $i \ge 2$
That is, message bit $m_i$ replaces the LSB of the pixel at location
$c_{j_i}$, and the next embedding location is obtained by adding the next element of the sequence. If $(j_{i-1} + k_i) > M \times N$,
take $j_i = (j_{i-1} + k_i) \bmod (M \times N)$, i.e., reduce modulo $M \times N$.
4. Embed the LSB message. Embed all message bits to get the stego-image $I'$.
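The four steps can be sketched as follows. This is only an illustration under stated assumptions: Python's `random.Random(key)` stands in for the perturbed digital chaotic generator, the step bound `max_step` and the function names are hypothetical, and the image is a flat row-major list of pixel values. The sketch does not handle positions that collide after the modulo wrap-around, which the text does not discuss.

```python
import random

def embedding_positions(key, n_bits, n_pixels, max_step=4):
    """1-based indices j_1..j_l with j_1 = k_1 and j_i = j_{i-1} + k_i.

    Indices larger than n_pixels are reduced modulo n_pixels.
    """
    rng = random.Random(key)              # assumption: stand-in for the chaotic system
    positions, j = [], 0
    for _ in range(n_bits):
        j += rng.randint(1, max_step)     # k_i >= 1, so the position always advances
        if j > n_pixels:
            j %= n_pixels
        positions.append(j)
    return positions

def embed_lsb(pixels, bits, key, max_step=4):
    """Replace the LSB of pixel c_{j_i} with message bit m_i and return the stego pixels."""
    stego = list(pixels)
    for j, bit in zip(embedding_positions(key, len(bits), len(stego), max_step), bits):
        stego[j - 1] = (stego[j - 1] & ~1) | bit   # clear the LSB, then set it to the bit
    return stego

cover = list(range(100, 164))             # toy 8 x 8 "image", flattened row by row
bits = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed_lsb(cover, bits, key=2007)
used = embedding_positions(2007, len(bits), len(cover))
```

Because both sides derive the positions from the same key, the receiver can regenerate `used` and read the bits back; each cover pixel changes by at most 1.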
3.2. Compensation preprocessing. Because the pixel values of part of the image are
increased by 1 in the compensation process, a pixel value of 255 would become 256,
which exceeds the range representable in one byte. One countermeasure is
to wrap 255 around to 0 when adding 1, but this makes
part of the image unsmooth, producing flecks, as illustrated in Figure 4 (in the back
brim of the hat and the bulge of the nose). Therefore, it is necessary to preprocess the stego-image before compensating, and the concrete operation is to scan all the pixel values in
the carrier image and change every pixel whose value is 255 to 253. Obviously, such a
modification has little effect on the image and causes no visible difference. At
the same time, this modification preserves the parity of adjacent sample pairs and
does not affect correct extraction of the message bits. If the compensation is realized by
subtracting 1 from the pixel values, analogous measures should be taken.
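A minimal sketch of this preprocessing, assuming a flat list of 8-bit pixel values (the function name `preprocess` is hypothetical): replacing 255 by 253 keeps every least significant bit unchanged, so embedded bits survive, and a later +1 can no longer overflow a byte.

```python
def preprocess(pixels):
    """Map 255 -> 253 so that adding 1 later stays within 0..255; LSBs are unchanged."""
    return [253 if p == 255 else p for p in pixels]

original = [0, 254, 255, 253, 128]
prepared = preprocess(original)
```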
Figure 4. The color image Lena after the pixel value 255 being modified as 0
3.3. Compensation algorithm. The irrelevance between $|Y'_{2m+1}|$ and $|X'_{2m+1}|$ grows
after LSB embedding, and the compensation algorithm deals with how to eliminate this alteration. For ease of understanding, we first assume that the adjacent pixel pairs whose
difference is $2m + 1$ are distributed uniformly over the embedded image. Then after 1 is added to the pixel
values of half the stego-image, the total number of adjacent pairs whose
difference is $2m + 1$ is almost unchanged (there is only a slight alteration near the dividing line). But
since all the pixel values in the compensation area are increased by 1, the parity of these pixel values
is interchanged. As for $|Y'_{2m+1}|$, it is reduced by $|Y'_{2m+1}|/2$ but increased
by $|X'_{2m+1}|/2$, so its value after modification is
$$|Y''_{2m+1}| = \frac{|Y'_{2m+1}|}{2} + \frac{|X'_{2m+1}|}{2}.$$
In this paper, $|*''|$ denotes the cardinality of the multiset $*$ in the compensated stego-image.
Similarly,
$$|X''_{2m+1}| = \frac{|Y'_{2m+1}|}{2} + \frac{|X'_{2m+1}|}{2}.$$
It is obvious that $|Y''_{2m+1}| = |X''_{2m+1}|$ after half the
pixel values of the image have been compensated. Then the value of $\left|\,|Y''_{2m+1}| - |X''_{2m+1}|\,\right|$
is 0, i.e., their irrelevance is 0. So solving equation (6) in SPA gives $p = 0$,
which leads SPA to the incorrect judgment that no hidden message is in the image.
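The parity argument above can be checked on a toy example. The sketch assumes that sample pairs are horizontally adjacent pixels and that the compensation area is the top half of the rows; the function name `count_xy` and the pixel values are hypothetical.

```python
def count_xy(rows, j=30):
    """Return (|X|, |Y|) summed over odd differences 2m+1, m = 0..j, for horizontal pairs."""
    x = y = 0
    for row in rows:
        for u, v in zip(row, row[1:]):
            d = abs(u - v)
            if d % 2 == 1 and d <= 2 * j + 1:
                if max(u, v) % 2 == 0:
                    x += 1            # even component larger: X
                else:
                    y += 1            # odd component larger: Y
    return x, y

# Toy stego-image in which every odd-difference pair has the larger odd value (all Y).
stego = [[10, 13, 12, 15] for _ in range(4)]
before = count_xy(stego)

# Add 1 to every pixel in the top half: parities flip, so those pairs move from Y to X.
compensated = [[p + 1 for p in row] if r < 2 else list(row)
               for r, row in enumerate(stego)]
after = count_xy(compensated)
```

Before compensation all 12 pairs are in Y; afterwards 6 are in X and 6 in Y, so the difference, and with it the irrelevance, drops to 0. Real images are not this uniform, which is why the compensation area has to be chosen dynamically.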
For example, if the image Lena has been embedded with a 5% message, we compute
$\left|\bigcup_{m=0}^{30} Y'_{2m+1}\right|$ and $\left|\bigcup_{m=0}^{30} X'_{2m+1}\right|$ and calculate their irrelevance $\delta_{0,30} = 0.0107$; the
estimate obtained by SPA is $p = 4.97\%$. We then compensate the pixel values of half
this stego-image, compute $\left|\bigcup_{m=0}^{30} Y''_{2m+1}\right|$ and $\left|\bigcup_{m=0}^{30} X''_{2m+1}\right|$, and calculate their
irrelevance: we get $\delta_{0,30} = 0.0013$, and the estimate obtained by SPA is
$p = 0.59\%$. Since the threshold used in SPA is 0.018, SPA steganalysis will come to the
conclusion that no hidden information is in the image, so the hiding system is protected
from the SPA attack.
But after the image Lena has been embedded with a 90%
message, the irrelevance of $\left|\bigcup_{m=0}^{30} Y'_{2m+1}\right|$ and $\left|\bigcup_{m=0}^{30} X'_{2m+1}\right|$ obtained by
calculation is $\delta_{0,30} = 0.1873$, and correspondingly $p = 93.72\%$. After the pixel values of half
the image have been compensated, repeating the above procedure gives $\delta_{0,30} = 0.0094$ and $p = 4.64\%$.
This result is still larger than the threshold, so the correct judgment will
be made, and the steganography will be exposed to the SPA attack. This means that simply
adding one to the pixel values of half the stego-image as a whole is not adequate, and the
compensation area of the image should be dynamically chosen.
One of the simplest selection methods is exhaustive search. Assume that the
compensation area of the image is rows $0 \sim i$, $1 \le i \le N - 2$ (when $i$ is $N - 1$,
all the pixel values in the image are increased by one and $\delta_{0,30}$ is unchanged, although $X_{2m+1}$ and
$Y_{2m+1}$ are interchanged). Starting from $i = 1$, add 1 to all the pixel values in rows $0 \sim i$
and calculate the corresponding $\delta_{0,30}^{(1)}$; then increase $i$ by 1 in turn, so that
$\delta_{0,30}^{(i)}$ is obtained for each of the $N - 2$ cases. Choose the lowest irrelevance $\min_i \delta_{0,30}^{(i)}$;
the rows $0 \sim i$ corresponding to $\min_i \delta_{0,30}^{(i)}$ are then used as the compensation area.
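A minimal sketch of this exhaustive search is given below. It assumes horizontally adjacent sample pairs, an image stored as a list of rows, and a stego-image that has already been preprocessed so that adding 1 cannot overflow; the function names `irrelevance` and `best_row_compensation` are hypothetical.

```python
def irrelevance(rows, j=30):
    """delta_{0,j} = | |Y| - |X| | / (|Y| + |X|) over horizontally adjacent pairs."""
    x = y = 0
    for row in rows:
        for u, v in zip(row, row[1:]):
            d = abs(u - v)
            if d % 2 == 1 and d <= 2 * j + 1:
                if max(u, v) % 2 == 0:
                    x += 1            # even component larger: X
                else:
                    y += 1            # odd component larger: Y
    total = x + y
    return abs(y - x) / total if total else 0.0

def best_row_compensation(stego, j=30):
    """Try compensation areas rows 0..i for i = 1..N-2; return (i, delta) with the lowest delta."""
    best_i, best_delta = None, float("inf")
    for i in range(1, len(stego) - 1):
        candidate = [[p + 1 for p in row] if r <= i else row
                     for r, row in enumerate(stego)]
        delta = irrelevance(candidate, j)
        if delta < best_delta:
            best_i, best_delta = i, delta
    return best_i, best_delta

# Toy stego-image with 8 identical rows in which every pair is in Y (irrelevance 1).
stego = [[0, 1, 0, 1, 0] for _ in range(8)]
best_i, best_delta = best_row_compensation(stego)
```

For this toy image the search selects rows 0 to 3, exactly half of the image, and the irrelevance falls from 1 to 0. The search recomputes the statistics for every candidate i, which is simple but costs on the order of N passes over the image.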
Obviously, the compensation area can be chosen at will. It can be a randomly chosen
rectangle or any other shape in the stego-image. The size of the area should be adjusted to get
the lowest $\delta_{0,30}$, which leads SPA steganalysis
to a relatively low estimate $p$; we call the compensation with the lowest $\delta_{0,30}$
the optimum one in this selection.
After the image Lena has been LSB embedded with a 90% ratio, the irrelevance obtained
after the optimum compensation is $2.3736 \times 10^{-4}$ at $i = 241$, and the embedding ratio
obtained by SPA detection is $p = 0.12\%$. Since this value is far below the threshold 0.018,
SPA steganalysis comes to the incorrect conclusion that no information is hidden in
Lena. When the image is embedded with a ratio of 100% and optimally compensated by
the above method, the SPA detection result is 0.073%, still far lower than the judgment
threshold.
Obviously, both sides of the communication must know in which part of
the image the pixel values were increased by 1; otherwise correct message extraction is
impossible. The parameters, such as the starting point and the bottom edge of the
rectangular range, can therefore be regarded as a secret key $K'$. $K'$ and $K$ together form the system
secret key, on which the security of the steganography system depends. The system secret
key is transmitted through a secure channel.
3.4. Message extraction algorithm. Extraction is the reverse of embedding
and compensation. First, subtract 1 from the pixel values in the range given by
the secret key $K'$, so that the image $I''$ is restored to the image before compensation, $I'$ (ignoring the effects of noise in transmission).
Then generate the corresponding pseudo-random sequence from the embedding secret
key $K$, find the pixels at the locations given by the pseudo-random sequence in
the image, and finally extract the embedded message from their least significant bits. The extraction algorithm is quite simple and is not the emphasis of this paper, so we give no
detailed description here.
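Although the text gives no detailed description, the extraction steps can be sketched briefly. The assumptions are the same illustrative ones as elsewhere and are not part of the original scheme: `random.Random(key)` stands in for the chaotic generator keyed by K, the compensation area (key K′) is simplified to rows 0 to `comp_rows`, the image is a flat row-major list, and all function names are hypothetical.

```python
import random

def positions(key, n_bits, n_pixels, max_step=4):
    """Regenerate the 1-based embedding positions from the key (stand-in generator)."""
    rng = random.Random(key)
    out, j = [], 0
    for _ in range(n_bits):
        j += rng.randint(1, max_step)
        if j > n_pixels:
            j %= n_pixels
        out.append(j)
    return out

def extract(compensated, width, comp_rows, key, n_bits, max_step=4):
    """Undo the +1 on rows 0..comp_rows, then read the LSBs at the keyed positions."""
    restored = [p - 1 if idx // width <= comp_rows else p
                for idx, p in enumerate(compensated)]
    return [restored[j - 1] & 1
            for j in positions(key, n_bits, len(restored), max_step)]

# Round trip on a toy 8 x 8 image: embed, compensate rows 0..1, then extract.
width, key, bits = 8, 2007, [1, 0, 1, 1, 0, 0, 1, 0]
stego = list(range(100, 164))
for j, b in zip(positions(key, len(bits), len(stego)), bits):
    stego[j - 1] = (stego[j - 1] & ~1) | b
compensated = [p + 1 if idx // width <= 1 else p for idx, p in enumerate(stego)]
recovered = extract(compensated, width, 1, key, len(bits))
```

Without the correct `comp_rows` the receiver would read flipped bits in the compensated rows, which is why the compensation parameters have to be part of the shared key.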
Note 1: The purpose of dynamic compensation is to obtain a relatively low $\delta_{0,30}$, measured
after the pixel values in the selected optimum area are increased or decreased by one,
so that SPA detection gets a very low estimate of $p$. What effect does dynamic
compensation have on the $\delta_{i,j}$ for other $i$, $j$?
Figures 5(a), 5(b) and 5(c) show the $\delta_{i,j}$ for different $i$, $j$ in the original
image Lena, the 50% LSB embedded Lena and the dynamically compensated stego-Lena,
respectively. It can be seen that the $\delta_{i,j}$ become low when $\delta_{0,30}$ is low after compensation
(the lowest $\delta_{i,j}$ before compensation is about 0.035 and the largest $\delta_{i,j}$ after compensation
is less than 0.013), i.e., all the $\delta_{i,j}$ are compensated, which makes attacks on the LSB embedding
difficult. In fact, $|Y_{2m+1}|$ and $|X_{2m+1}|$ are small enough to be ignored when $m > 30$.
Figure 5. (a) The irrelevance of original grey-color image Lena; (b)(c)
The irrelevance before and after the compensation in the 50% stego-image Lena
4. Experiment Results and Analysis. The DCLS approach first embeds messages in the LSB
plane of the carrier image randomly via a chaotic system. In this experiment, we selected
the perturbed digital chaotic system proposed in [19] and made a small improvement.
The system uses perturbation and diffusion techniques to improve the dynamic characteristics of the digital
chaotic system. Experimental results show that this chaotic system has some excellent
dynamic characteristics, such as a long cycle and good randomness, so the steganography
system has comparatively high security. The chaotic system can be described briefly as
follows.
Consider a one-dimensional piecewise linear chaotic map with four subintervals
$[0, a)$, $[a, b)$, $[b, 1 - a)$, $[1 - a, 1]$:
$$g(x) = \begin{cases} 4x, & 0 \le x < a \\ 2 - 4x, & a \le x < b \\ 4x - 2, & b \le x < 1 - a \\ 4 - 4x, & 1 - a \le x \le 1 \end{cases} \tag{7}$$
Diffuse the third subinterval of the domain of $g(x)$, $c_3 = [b, 1 - a)$: divide $c_3$ into
two segments in the ratio $e : 1 - e$, $c_{31} = [b, b + e/r)$ and $c_{32} = [b + e/r, 1 - a]$,
where $e$ is the diffusion coefficient, $r$ is a constant, and $x$ is the initial number. Let $a = 0.25$, $b = 0.5$ and
$r = 4$; then, according to the perturbing algorithm presented in [19], we obtain a new
piecewise linear chaotic map:
$$f(x) = \begin{cases} g(x), & x \notin c_3 \\ g\left(\frac{4}{2e}(x - 0.5)\right), & x \in c_{31} \\ g\left(\frac{x - (0.5 + \frac{e}{4})}{1 - e} + 0.75\right), & x \in c_{32} \end{cases} \tag{8}$$
where $e \in (0, 1)$ is the diffusion coefficient, $x \in (0, 1)$, and the initial number is $x_0$. When
constructing the chaotic sequence, $x_0$ and $e$ are obtained from the key $K$. Dividing the
range of the map into $2^A$ equal subintervals, one obtains random numbers in
$0 \sim 2^A - 1$, which form a sequence. To use the chaotic system in the
DCLS method, we make a small modification to this sequence: adding 1 to every element
gives a new sequence that is also random.
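The construction of the step sequence can be sketched as follows, with a = 0.25, b = 0.5 and r = 4. This is an illustration only, under several assumptions: the last branch of g is taken as 4 − 4x so that g maps [0, 1] into itself, the values x0 and e are passed in directly because the derivation from K is not specified, the clamp before calling g is a numerical guard added here, and the function names are hypothetical. Ordinary floating-point iteration is used, whereas the digital chaotic system of [19] works in finite precision with perturbation, so the orbit produced here may degenerate over long runs.

```python
def g(x, a=0.25, b=0.5):
    """Piecewise linear map on [0, 1] with four subintervals, as in equation (7)."""
    if x < a:
        return 4 * x
    if x < b:
        return 2 - 4 * x
    if x < 1 - a:
        return 4 * x - 2
    return 4 - 4 * x

def f(x, e):
    """Diffused map of equation (8): c31 = [0.5, 0.5 + e/4), c32 = [0.5 + e/4, 0.75)."""
    mid = 0.5 + e / 4
    if not (0.5 <= x < 0.75):
        return g(x)
    if x < mid:
        y = (4 / (2 * e)) * (x - 0.5)      # stretches c31 onto [0, 0.5)
    else:
        y = (x - mid) / (1 - e) + 0.75     # stretches c32 onto [0.75, 1]
    return g(min(max(y, 0.0), 1.0))        # guard against rounding just outside [0, 1]

def chaotic_steps(x0, e, count, A=3):
    """Quantise the orbit of f into 2**A bins and add 1, giving steps in 1..2**A."""
    steps, x = [], x0
    for _ in range(count):
        x = f(x, e)
        k = min(int(x * 2 ** A), 2 ** A - 1)   # bin index in 0..2**A - 1
        steps.append(k + 1)                    # the "+1" modification for DCLS
    return steps

steps = chaotic_steps(x0=0.3141, e=0.37, count=50)
```

Adding 1 guarantees that every step is at least 1, so successive embedding positions $j_i = j_{i-1} + k_i$ never coincide before the index wraps around.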
4.1. Standard images and detection results. To test the effectiveness of the
steganographic technique described in this paper, some standard images are embedded
with messages and analyzed in the experiment. The test image set includes twenty-four
typical gray-scale images, twenty-two of which come from the twenty-four images
listed in [10]. All the images are embedded by the LSB algorithm described in Section 3.1,
and then compensated by the algorithm described in
Section 3.3. For each image, we run experiments with embedding
ratios of 3%, 5%, 10%, 20%, · · · , 90%.
Table 2 gives the average estimate $p$ obtained by SPA steganalysis before and after
these twenty-four stego-images are dynamically compensated. DC means dynamic
compensation.
Table 2. The effect on SPA steganalysis result caused by dynamic compensation
Embedding ratio   Average estimate value (%)
                  Before DC   After DC
0                 0.69        0.69
3%                3.07        0.01
5%                4.95        0.02
10%               10.27       0.03
20%               20.13       0.04
30%               29.94       0.04
40%               40.30       0.06
50%               50.35       0.06
60%               60.37       0.06
70%               70.10       0.09
80%               81.01       0.10
90%               89.60       0.11
The table above shows that after dynamic compensation is applied to the stego-images,
the estimate of the embedding ratio obtained by SPA steganalysis is greatly reduced.
Embedding ratio 0 in the table means the images are original ones, so the estimates
are the same before and after compensation. The largest average value of p after compensation is only 0.11%,
much smaller than the judgment threshold 1.8%, so the method defeats the SPA attack effectively.
A similar experiment was also applied to 2000 JPG images of
1024 × 768 pixels from a digital camera. The result is encouraging: none of the images
carrying secret messages is caught by SPA detection.
Figure 6. (a) Compensation area chosen by the method of average interval;
(b) SPA detection result before and after the compensation
4.2. Performance effect caused by random selection. In this part we discuss the
effect on the detection result caused by the random selection of compensation area. The
test image set used in the experiment is the same as the one used in Section 4.1, and the
586
X. LUO, F. LIU AND P. LU
compensation area is selected by the method of average interval. Some i × i areas are
chosen as the compensation area, where 32 ≤ i ≤ 128, as illustrated in Figure 6(a). The
(i)
value of i is determined by min δ0,30 . Figure 6(b) shows the effect on SPA steganalysis
after the implementation of average interval method to select compensation area.
When the compensation area of the image is chosen by the method of average interval, the estimated embedding ratio is very small, close to 0, even when the
actual embedding ratio is as high as 90%. We conclude that this method can defeat the SPA
attack effectively.
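The layout of the compensation areas can be illustrated with a short sketch. It rests on an assumption: "average interval" is read here as placing the i × i blocks on a grid with equal gaps between them, which is only one possible reading of the text. The function `block_origins`, its parameters `n_rows` and `n_cols`, and the 512 × 512 image size are hypothetical, and the choice of i is not modeled.

```python
def block_origins(height, width, i, n_rows, n_cols):
    """Top-left corners of n_rows x n_cols evenly spaced blocks of size i x i.

    Assumed interpretation of "average interval": equal gaps between
    neighbouring blocks and between the outer blocks and the image border.
    """
    if not 32 <= i <= 128:
        raise ValueError("block size i must satisfy 32 <= i <= 128")
    if n_rows * i > height or n_cols * i > width:
        raise ValueError("blocks do not fit in the image")
    gap_y = (height - n_rows * i) // (n_rows + 1)
    gap_x = (width - n_cols * i) // (n_cols + 1)
    return [
        (gap_y + r * (i + gap_y), gap_x + c * (i + gap_x))
        for r in range(n_rows)
        for c in range(n_cols)
    ]


# Hypothetical example: nine 64 x 64 blocks in a 512 x 512 image.
origins = block_origins(512, 512, 64, 3, 3)
print(origins)
```

Keeping the block positions a deterministic function of a few parameters fits the paper's remark that the selection parameters of the compensation can serve as part of the secret key.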
Figure 7(a) and 7(b) show the distribution of $|C_m|$ in the grayscale image Lena with 50% LSB embedding, before and after compensation respectively. The
two distributions are clearly very similar. $|C_m|$ does change when the
stego-image is compensated, but the change is small because $|D_{2m}|$ stays
the same before and after compensation. Extensive experiments show that $|C_m|$
still satisfies the inequality $2|C_0| > |C_1| > |C_2| > \cdots > |C_m| > |C_{m+1}| > \cdots$ that holds for
natural images.
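The quantity $|C_m|$ can be tallied with a few lines of NumPy. The sketch below assumes the usual SPA definition from [10]: $C_m$ collects the sample pairs whose values differ by m after the LSB is dropped. It also assumes that pairs are horizontally adjacent pixels. The synthetic image stands in for Lena, and the function name `c_m_counts` is illustrative.

```python
import numpy as np


def c_m_counts(image, max_m=8):
    """Count horizontally adjacent pairs (u, v) with |u // 2 - v // 2| == m."""
    img = np.asarray(image, dtype=np.int64)
    high = img >> 1  # drop the LSB of every pixel
    diff = np.abs(high[:, :-1] - high[:, 1:]).ravel()
    counts = np.bincount(diff, minlength=max_m + 1)
    return counts[: max_m + 1]


rng = np.random.default_rng(0)
# Synthetic smooth test image (an assumption, not the paper's test data).
base = np.cumsum(rng.integers(-2, 3, size=(64, 64)), axis=1) + 128
cover = np.clip(base, 0, 255).astype(np.uint8)

# Plain LSB embedding: flip the LSB of a random half of the pixels.
mask = rng.random(cover.shape) < 0.5
stego = cover ^ mask.astype(np.uint8)

counts_cover = c_m_counts(cover)
counts_stego = c_m_counts(stego)
print(counts_cover)
print(counts_stego)
```

Because $C_m$ depends only on the bits above the LSB, plain LSB flipping leaves every $|C_m|$ unchanged in this sketch. Any change in $|C_m|$ discussed above must therefore come from the compensation step, which is not modeled here.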
Figure 7. (a) The distribution of $|C_m|$ in the image before compensation; (b) the distribution of $|C_m|$ in the image after compensation
4.3. Effect on DCLS method caused by preprocessing. Figure 8 shows the
embedding ratio estimated by SPA steganalysis for the color image Lena after it is
preprocessed by the method described in Section 3, LSB embedded, and compensated.
Figure 8 shows that the SPA attack also fails on the preprocessed image
once the stego-image is compensated.
4.4. Defeating RS steganalytic approach. Further experiments show that the DCLS
method can also defeat the RS attack. Table 3 gives the average estimates of the embedding
ratio p obtained by RS steganalysis before and after the twenty-four stego-images in
Part A are dynamically compensated, where DC means dynamic compensation.
Table 3 shows that all the embedding ratios estimated from
the compensated stego-images are small: the average estimate stays at or below 2.52% even at a 90% embedding ratio. This shows that the DCLS method also performs well against RS
steganalysis.
Figure 8. Detection result before and after compensation of the preprocessed image Lena
Table 3. Effect on RS steganalysis result caused by dynamic compensation

                   Average estimate value (%)
Embedding ratio    Before DC    After DC
0                   1.05        1.05
3%                  3.33        0.79
5%                  5.26        0.87
10%                10.89        0.84
20%                21.14        1.04
30%                31.30        1.27
40%                41.28        1.55
50%                51.56        1.72
60%                61.64        1.90
70%                71.33        2.00
80%                81.47        2.44
90%                89.75        2.52
4.5. Defeating other steganalytical approaches. To test how well the DCLS method
defeats the χ2 -statistical test, DIH, Pairs and the improved versions of Pairs,
RS and SPA, 2000 JPG images (from a digital camera) are used in the experiment. These
images are embedded with 10%, 50% and 90% LSB messages, then dynamically compensated, and finally tested with each of these steganalytic methods. In the experiment,
the χ2 -statistical test is from [3], DIH steganalysis is from [12], and the improved methods
of Pairs, RS, and SPA are from [13], [16] and [18] (a uniform threshold of
0.03 is assumed). The results show that all of these attacks are
effectively defeated by our DCLS approach.
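The embedding step of this experiment can be sketched as follows. This is a minimal illustration and not the paper's implementation: a seeded NumPy generator stands in for the chaotic system, the function name `embed_lsb` and its parameters are hypothetical, and neither the dynamic compensation nor any detector is modeled.

```python
import numpy as np


def embed_lsb(cover, ratio, seed=0):
    """Replace the LSB of a random fraction `ratio` of pixels with message bits.

    The seeded generator is an assumed stand-in for the chaotic system;
    the seed plays the role of the secret initial value.
    """
    rng = np.random.default_rng(seed)
    stego = np.array(cover, dtype=np.uint8, copy=True)
    flat = stego.reshape(-1)  # view onto stego, so writes modify it in place
    n_bits = int(round(ratio * flat.size))
    positions = rng.choice(flat.size, size=n_bits, replace=False)
    bits = rng.integers(0, 2, size=n_bits, dtype=np.uint8)  # random message bits
    flat[positions] = (flat[positions] & 0xFE) | bits
    return stego, positions, bits


# Synthetic cover image (an assumption, not one of the 2000 test images).
cover = np.random.default_rng(1).integers(0, 256, size=(32, 32), dtype=np.uint8)

# The three embedding ratios used in the experiment.
results = {ratio: embed_lsb(cover, ratio, seed=42) for ratio in (0.10, 0.50, 0.90)}
for ratio, (stego, positions, bits) in results.items():
    changed = int(np.count_nonzero(stego != cover))
    print(f"ratio {ratio:.2f}: {len(positions)} pixels carry bits, {changed} changed")
```

Only about half of the selected pixels actually change, because a random message bit already matches the cover LSB half of the time.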
5. Conclusions. A new LSB steganography method, the DCLS approach, is presented. This
approach first embeds a message in the LSB plane of the carrier image randomly via a chaotic
system, then applies dynamic compensation to the stego-image. Theoretical analysis and
experiments show that this approach is effective against sample pairs steganalysis.
Some parameters of the chaotic system and of the compensation are used as part of the system's
secret key, which further improves the security of the hiding system. Further
empirical results show that this method also has rather strong resistance to other
steganalysis methods, such as RS, DIH, the χ2 -statistical test, Pairs
and the improved versions of Pairs, RS and SPA detection. Deeper research will be
made to develop more robust steganography systems.
Acknowledgment. This work is supported partially by the National Natural Science
Foundation of China (No.60673082, 90304014 and 60374004), Special Funds of Authors
of Excellent Doctoral Dissertation in China (No.200084), National High Technology Research and Development Program of China (“863” Program, No.2006AA10Z409), Henan
Science Fund for Distinguished Young Scholar (No.0412000200) and HAIPURT (No.
2001KYCX008). The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.
REFERENCES
[1] Fridrich, J., R. Du and L. Meng, Steganalysis of LSB encoding in color images, Proc. of the IEEE
International Conference on Multimedia and Expo, CD-ROM, IEEE Press, Piscataway, N. J., 2000.
[2] Westfeld, A., Detecting low embedding rates, Proc. of the Information Hiding Workshop, Springer
LNCS, vol.2578, pp.324-339, 2002.
[3] Westfeld, A. and A. Pfitzmann, Attacks on steganographic systems, in Information Hiding, A.
Pfitzmann (ed.), New York: Springer-Verlag, pp.61-76, 1999.
[4] Provos, N., Defending against statistical steganalysis, Proc. of the 10th USENIX Security Symposium,
Washington, DC, 2001.
[5] Chandramouli, R. and N. Memon, Analysis of LSB based image steganography techniques, Proc. of
the IEEE International Conference on Image Processing, pp.1019-1022, 2001.
[6] Chandramouli, R. and N. D. Memon, On sequential watermark detection, IEEE Transactions on
Signal Processing, vol.51, no.4, pp.1034-1044, 2003.
[7] Chandramouli, R. and N. D. Memon, A distributed detection framework for steganalysis, Proc. of
the ACM Multimedia Workshop Marina, pp.123-126, 2000.
[8] Fridrich, J. and M. Goljan, Practical steganalysis of digital images — State of the art, in Security
and Watermarking of Multimedia Contents IV., E. J. Delp III and P. W. Wong (eds.), Proc. SPIE,
vol.4675, pp.1-13, 2002.
[9] Fridrich, J., M. Goljan and R. Du, Reliable detection of LSB steganography in color and grayscale
images, Proc. of the ACM Workshop Multimedia Security, Ottawa, ON, Canada, pp.27-30, 2001.
[10] Dumitrescu, S., X. Wu and Z. Wang, Detection of LSB steganography via sample pair analysis,
IEEE Transactions on Signal Processing, vol.51, no.7, pp.1995-2007, 2003.
[11] Dumitrescu, S., X. Wu and Z. Wang, Detection of LSB steganography via sample pair analysis,
Springer LNCS, vol.2578, pp.355-372, 2003.
[12] Zhang, T. and X. Ping, Reliable detection of LSB steganography based on the difference image
histogram, Proc. of the IEEE ICSAAP 2003, Part III, pp. 545-548, 2003.
[13] Ker, A. D., Improved detection of LSB steganography in grayscale images, Proc. of the 6th
Information Hiding Workshop, Springer LNCS, vol.3200, pp.97-115, 2004.
[14] Ker, A. D., Quantitative evaluation of pairs and RS steganalysis, in Security, Steganography,
and Watermarking of Multimedia Contents VI., E. J. Delp III and P. W. Wong (eds.), Proc. SPIE,
vol.5306, pp.83-97, 2004.
[15] Fridrich, J., M. Goljan and D. Soukal, Higher-order statistical steganalysis of palette images, in
Security and Watermarking of Multimedia Contents V., E. J. Delp III and P. W. Wong (eds.), Proc.
of the SPIE, vol.5020, pp.178-190, 2003.
[16] Lu, P., X. Luo et al., An improved sample pairs method for detection of LSB embedding, Proc. of
the 6th Information Hiding Workshop, Springer LNCS, vol.3200, pp.116-128, 2004.
[17] Yu, J. J., J. W. Han et al., A secure steganographic scheme against statistical analyses, Proc. of the
IWDW 2003, Springer LNCS, vol.2939, pp.497-507, 2004.
[18] Luo, X., B. Liu and F. Liu, Improved RS method for detection of LSB steganography, Proc. of
the 2005 International Conference on Computational Science and Its Applications, Springer LNCS,
vol.3481, pp.508-516, 2005.
[19] Liu, B., X. Luo and F. Liu, Perturbing scheme of digital chaos, Journal of Shanghai Jiaotong University
(Science), English version, vol.11, no.2, pp.172-176, 2006.