Steganography

CSEP 590 TU Project
Steganography and Digital Watermarking
Section 1: Steganography
1.1: Introduction
Steganography literally means ‘covered writing’. It is also described as hiding information in
plain sight: the carrier is out in the open where anyone can look at it, yet the message goes
undetected because its very existence is a secret. A popular example comes from the Fox drama
Prison Break¹. The protagonist, Michael Scofield, is tattooed with the blueprints of the prison he
is attempting to escape from, but the jail officials fail to recognize the tattoo as a blueprint.
1.2: History
A variety of methods have been used throughout history to hide information. According to the
Greek historian Herodotus, a Greek named Histiaeus² wanted to send instructions to a remote
city by human courier. To pass these instructions securely, Histiaeus shaved the head of his
messenger, wrote the message on his bare scalp and then waited for the hair to grow back. This
method of communication was time consuming but effective, as the messenger was able to pass
through inspections without raising any suspicion. At the destination the intended recipient
shaved the head of the messenger and read the message. A more recent use of steganography
was to help slaves escape prior to the American Civil War. The Underground Railroad was one of
the main escape routes used by slaves. Quilts hung out to dry were used to display information
inconspicuously: the quilts had special patterns that provided information to the slaves to aid in
their escape³.
1.3: Steganography and Cryptography
Although steganography and cryptography are related, there are some differences between
them. Steganography hides a message within another message and tries to give the impression
that nothing but a normal text, audio or video message is being exchanged. In cryptography the
message is encrypted, so it looks like a meaningless jumble of characters. Despite the
differences, techniques can be borrowed from cryptography, a more thoroughly researched
science, and applied to steganography. A founding principle of cryptography, enunciated by
Auguste Kerckhoffs, is the assumption that the method of encryption is known to the opponent
and that security must lie only in the choice of the key. For a secure steganographic system we
can therefore assume that an opponent understands the system. If she does not know the key,
however, the opponent can obtain no evidence (or, better, not even grounds for suspicion) that
a communication has taken place.
1.4: Steganographic Techniques
In steganography a message is concealed inside a cover. There are different ways to classify
steganographic methods. We could categorize them according to the type of coversⁱ used for
secret communication. Another approach is based on the modifications that are applied to the
cover object in order to embed the message. Some of these techniques are described in the
following sections.
ⁱ Cover: the object that is actually going to be seen out in the open, that is, the text, picture or
sound that will be used to carry the message right under everyone’s nose.
Somnath Banerjee
1.4.1: Substitution System
This type of system replaces redundant bits of a cover with bits from the secret message. A
surprising amount of information can be hidden with little impact on a digital cover such as a
picture or an audio clip. A particular method of substitution is Least Significant Bit (LSB)
substitution.
Let us call the cover $c$ and the secret message $m$, and assume that any cover is represented
by a sequence of numbers $c_j$ of length $l(c)$. The embedding process consists of choosing a
subset $\{j_1, \ldots, j_{l(m)}\}$ of cover elements and performing the substitution operation,
which replaces the LSB of $c_{j_i}$ with the message bit $m_i$ (either 1 or 0). In order to decode
the message the recipient must have access to the sequence of element indices used in the
embedding process. In the simplest case the sender uses all the cover elements for information
transfer, starting with the first element. In most cases the secret message will have fewer bits
than $l(c)$ and the sender may leave the cover elements that are not required unchanged. This
leads to a security flaw, as the part that was modified will have different statistical properties
from the part that was not. One solution is to pad the secret message with random bits so that
its length is equal to that of the cover. A better approach is to use a pseudorandom number
generator to spread the secret message over the cover in a random manner. If the
communicating parties share a key $k$ that seeds the generator, they can create a random
sequence $k_1, \ldots, k_{l(m)}$ and use the elements with indices
$j_1 = k_1$
$j_i = j_{i-1} + k_i, \quad i \geq 2$
for information transfer. As the receiver has access to the seed $k$, she can reconstruct the
$k_i$ and so the entire sequence of element indices $j_i$.
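The keyed index sequence and the LSB substitution described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a secure implementation: the function names, the 0-based indexing, the step bound `max_step` and the use of Python's non-cryptographic `random` module are all choices made for this example, and the receiver is assumed to know the message length in bits.

```python
import random


def index_sequence(key, n_bits, n_cover):
    """Derive indices j_1 = k_1, j_i = j_{i-1} + k_i from a shared integer key.

    Indices are 0-based, so every step k_i is at least 1. The step bound is
    chosen so that the last index always fits inside the cover.
    """
    if n_bits < 1:
        raise ValueError("message must contain at least one bit")
    max_step = n_cover // n_bits
    if max_step < 1:
        raise ValueError("cover is too short for this message")
    rng = random.Random(key)  # illustration only: not cryptographically secure
    indices, j = [], -1
    for _ in range(n_bits):
        j += rng.randint(1, max_step)
        indices.append(j)
    return indices


def embed(cover, bits, key):
    """Return a copy of the cover whose selected elements carry the bits in their LSB."""
    stego = list(cover)
    for j, bit in zip(index_sequence(key, len(bits), len(cover)), bits):
        stego[j] = (stego[j] & ~1) | bit  # clear the LSB, then set it to the message bit
    return stego


def extract(stego, n_bits, key):
    """Regenerate the same index sequence from the key and read the LSBs back."""
    return [stego[j] & 1 for j in index_sequence(key, n_bits, len(stego))]


cover = list(range(100, 164))            # 64 made-up sample values
message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed(cover, message, key=1234)
recovered = extract(stego, len(message), key=1234)
```

Because both sides seed the generator with the same key, they derive the same indices, and no cover element changes by more than 1.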
1.4.2: Statistical Methods
Statistical methods use a so-called “1 bit” steganographic scheme, which embeds one bit of
information in a digital carrier. This is accomplished by modifying the cover in such a way that
certain statistical characteristics change significantly if a “1” is transmitted. A cover that is left
unchanged indicates a “0”. The receiver must be able to distinguish between modified and
unmodified covers in order to receive the secret message.
Let $m$ be the secret message and $l(m)$ its length in bits. The cover is divided into $l(m)$
disjoint blocks $B_1, \ldots, B_{l(m)}$. A secret bit $m_i$ is inserted into the $i$-th block by
modifying $B_i$ if $m_i = 1$; otherwise the block is left unchanged. The detection of a specific bit
is done via a test function that distinguishes between modified and unmodified blocks:
\[ f(B_i) = \begin{cases} 1 & \text{if block } B_i \text{ was modified} \\ 0 & \text{otherwise} \end{cases} \]
The receiver successively applies $f$ to all cover blocks to restore the secret message.
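A toy version of this block scheme might look like the sketch below. The statistic used here (the fraction of even-valued samples in a block) and the 0.9 threshold are assumptions chosen only to keep the example short; a real scheme would use a statistic that is far less conspicuous, and a natural block that happens to be all even would be misread as a 1.

```python
def split_blocks(samples, n_blocks):
    """Divide the samples into n_blocks disjoint, equally sized blocks."""
    size = len(samples) // n_blocks
    if size < 1:
        raise ValueError("cover is too short for this many blocks")
    return [samples[i * size:(i + 1) * size] for i in range(n_blocks)]


def embed_bits(cover, bits):
    """Modify block i (force every sample even) if bit i is 1, else leave it alone."""
    blocks = split_blocks(cover, len(bits))
    stego = []
    for block, bit in zip(blocks, bits):
        stego.extend([s & ~1 for s in block] if bit else block)
    stego.extend(cover[len(stego):])  # samples beyond the last block stay unchanged
    return stego


def f(block, threshold=0.9):
    """Test function: 1 if the block looks modified (almost all samples even), else 0."""
    even = sum(1 for s in block if s % 2 == 0)
    return 1 if even / len(block) >= threshold else 0


def extract_bits(stego, n_bits):
    """Apply the test function to every block in turn."""
    return [f(block) for block in split_blocks(stego, n_bits)]


cover = [(i * 37 + 11) % 256 for i in range(64)]  # made-up samples, half of them even
stego = embed_bits(cover, [1, 0, 0, 1])
recovered = extract_bits(stego, 4)
```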
1.4.3: Distortion Techniques
Distortion techniques require knowledge of the original cover in the decoding process. The
sender applies a series of modifications to a cover in order to get the stego object. The
sequence of modifications corresponds to the specific secret message the sender wants to
transmit. The recipient measures the differences from the original cover in order to reconstruct
the sequence of modifications, which corresponds to the secret message. A flaw in this system is
that the receiver must have access to the original covers. If an eavesdropper has access to them,
she can easily detect the cover modifications and has evidence of a secret communication. An
assumption therefore is that the original covers have been distributed through a secret channel.
Text based hiding methods are of the distortion type. One technique for text distortion is
modulating the position of lines and words. Adding spaces and “invisible” characters to text
provides another way to pass hidden information. HTML files can be used for including extra
spaces, tabs and line breaks. These are ignored by web browsers and go unnoticed until the
source of the web page is inspected.
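The idea of hiding bits in extra whitespace and recovering them by comparison with the original cover can be sketched as below. The encoding (one trailing space on a line means 1, an untouched line means 0) and the function names are assumptions made for this example.

```python
def embed(cover_lines, bits):
    """Append one trailing space to line i if bit i is 1; leave the line alone for 0."""
    if len(bits) > len(cover_lines):
        raise ValueError("cover text has too few lines for this message")
    stego_lines = list(cover_lines)
    for i, bit in enumerate(bits):
        if bit:
            stego_lines[i] = cover_lines[i] + " "
    return stego_lines


def extract(cover_lines, stego_lines, n_bits):
    """Distortion-type decoding: a line that differs from the original cover is a 1."""
    return [1 if stego_lines[i] != cover_lines[i] else 0 for i in range(n_bits)]


cover = ["<html>", "<body>", "<p>Hello</p>", "</body>", "</html>"]
stego = embed(cover, [1, 0, 1, 1])
recovered = extract(cover, stego, 4)
```

A browser would render both versions identically, since trailing whitespace in the markup is ignored.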
1.4.4: Cover generation Techniques
In contrast to systems where secret information is added to a specific cover by applying an
embedding algorithm, cover generation techniques generate a digital object solely for the
purpose of being a cover for secret communication. Because of the tremendous volume of
information in circulation, it is impossible for a human to observe all communications around
the world. Mimic functions can be used to hide the identity of a message by changing its
statistical profile so that it matches the profile of an innocent looking text. The English language
possesses several statistical properties. For instance, the distribution of characters is not
uniform: e occurs a lot more frequently than z. This fact is used in data compression schemes
such as Huffman encoding, and a mimic function can be constructed out of Huffman
compression functions. These functions can only fool machines; to a human observer the
mimicked text will look completely meaningless because of grammatical errors. To overcome
this limitation, mimicry has been enhanced by the application of context free grammars. A
context free grammar describes the rules for constructing sentences in a language from different
parts of speech, so it can be used to create grammatically correct English text that hides a
message. Spam Mimic⁴ provides a good example of a cover generation method.
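As a rough illustration of grammar-based cover generation (not of the Huffman-based mimic function, and not of how Spam Mimic itself works), the sketch below lets every two-way choice in a tiny context free grammar carry one message bit. The grammar, the zero padding of the last sentence and the function names are all assumptions made for this example.

```python
# Toy grammar: symbols with two productions carry one bit (index 0 or 1).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"], ["a", "N"]],
    "N": [["dog"], ["cat"]],
    "VP": [["V", "NP"]],
    "V": [["sees"], ["likes"]],
}


def generate(symbol, bits):
    """Expand a symbol into words, consuming one bit at every two-way choice."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal word
    options = GRAMMAR[symbol]
    if len(options) == 2:
        choice = options[bits.pop(0) if bits else 0]  # pad with 0 when bits run out
    else:
        choice = options[0]
    words = []
    for part in choice:
        words.extend(generate(part, bits))
    return words


def parse(symbol, words, bits):
    """Consume words that match the symbol and append the choices made as bits."""
    if symbol not in GRAMMAR:
        if words and words[0] == symbol:
            words.pop(0)
            return True
        return False
    options = GRAMMAR[symbol]
    for index, option in enumerate(options):
        saved_words, saved_bits = list(words), list(bits)
        if len(options) == 2:
            bits.append(index)
        if all(parse(part, words, bits) for part in option):
            return True
        words[:], bits[:] = saved_words, saved_bits  # undo and try the next option
    return False


def encode(bits):
    """Turn a bit list into a list of grammatical sentences."""
    remaining, sentences = list(bits), []
    while remaining:
        sentences.append(" ".join(generate("S", remaining)))
    return sentences


def decode(sentences, n_bits):
    """Recover the first n_bits bits from the generated sentences."""
    bits = []
    for sentence in sentences:
        if not parse("S", sentence.split(), bits):
            raise ValueError("sentence was not produced by this grammar")
    return bits[:n_bits]


cover_text = encode([1, 0, 1, 1, 0, 0, 1])
recovered = decode(cover_text, 7)
```

Each sentence of this grammar carries five bits, so a practical system would need a much larger grammar to produce text that does not repeat conspicuously.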
1.5: Steganalysis
The goal of steganography is to avoid detection, or even the suspicion, that a secret message
is being passed on. Steganalysis is the art of detecting these covert messages. The types of
steganalysis attacks are similar to those of cryptanalysis. In a stego only attack, only the stego
objectⁱⁱ is available for analysis. In a known cover attack, the original cover object and the stego
object are both available. In a known message attack, the attacker knows the hidden message
and analyzes the stego object for patterns in order to determine the hiding algorithm. In a
known stego attack, the algorithm is known and both the original cover object and the stego
object are available.
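A known cover attack can be as simple as comparing the two objects element by element. The sketch below is only an illustration with made-up sample values; the function name and the "differences confined to the least significant bit" heuristic are assumptions for this example, not a general detector.

```python
def known_cover_attack(cover, stego):
    """Return the positions that differ and whether every difference is LSB-only."""
    diffs = [i for i, (c, s) in enumerate(zip(cover, stego)) if c != s]
    # XOR of two values equals 1 exactly when they differ in the lowest bit only.
    lsb_only = all((cover[i] ^ stego[i]) == 1 for i in diffs)
    return diffs, lsb_only


cover = [10, 11, 12, 13]
stego = [10, 10, 13, 13]
positions, lsb_only = known_cover_attack(cover, stego)
```

If changes are present and all of them touch only the lowest bit, the analyst has grounds to suspect LSB substitution.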
Section 2: Digital Watermarking
2.1: Introduction
Digital watermarks are imperceptible or barely perceptible transformations of digital data.
Watermarking principles are used whenever the cover workⁱⁱⁱ is available to parties who may be
interested in removing the watermark. A watermarking scheme should therefore be robust
against manipulations that attempt to remove it. A principal application of watermarking is to
provide proof of ownership of digital data by embedding copyright statements. Digital
watermarking may also be used for fingerprinting applications in order to distinguish distributed
data sets.
ⁱⁱ Stego object: a cover that has an embedded secret message in it.
ⁱⁱⁱ Cover work: a picture, music, video or other object that can have watermarks embedded in it.
2.2: Watermark Properties
Watermarks can be characterized by a number of properties. The relative importance of
properties depends on the application and the role that the watermark will play in that application.
Some properties are associated with the watermark embedding process, some with detection and
others with keys.
2.2.1: Effectiveness
The effectiveness of a watermark is the probability of detection immediately after
watermarking. Although one hundred percent effectiveness is desirable, it may not be feasible in
practice without sacrificing other properties such as fidelity. Some applications may need to
give up some effectiveness so that other, more important characteristics are realized.
2.2.2: Fidelity
Fidelity is the closeness between the original and watermarked versions of the cover. As the
watermarked cover may degrade during transmission before being seen by a person, fidelity may
also be defined as the closeness between the original and watermarked covers at the moment
they are viewed by the consumer.
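One way to put a number on this closeness is a signal-level measure such as mean squared error or peak signal-to-noise ratio (PSNR). Neither measure is named in this report; they are used here as an assumption, and they only approximate what a human viewer or listener perceives. The sample values and the peak of 255 (8-bit samples) are likewise made up for the sketch.

```python
import math


def mse(original, watermarked):
    """Mean squared error between two equally long sample sequences."""
    if len(original) != len(watermarked) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(original, watermarked)) / len(original)


def psnr(original, watermarked, peak=255):
    """Peak signal-to-noise ratio in decibels; higher means closer to the original."""
    error = mse(original, watermarked)
    if error == 0:
        return math.inf  # identical covers
    return 10 * math.log10(peak ** 2 / error)


original = [52, 55, 61, 66, 70, 61, 64, 73]
watermarked = [53, 55, 60, 66, 71, 61, 64, 72]
closeness_db = psnr(original, watermarked)
```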
2.2.3: Payload
Payload refers to the number of bits a watermark encodes within a unit of time or within a
cover. For a photograph the data payload would refer to the number of bits encoded within the
image. In audio transmissions data payload refers to the number of embedded bits per second
that are transmitted. For video transmissions the data payload may refer to the number of bits per
frame or the number of bits per second. The requirements for data payload are application
dependent.
2.2.4: Blind/Informed Detection
In some applications the original cover may be available during watermark detection, while in
others, for example a copy control application, it may not make sense to have the original
available. Detectors that need the original cover in order to detect a watermark are called
informed detectors. Conversely, detectors that do not need information about the original cover
are blind detectors. Informed detection is also referred to as a private watermarking system,
where detection is available only to a select group of individuals, whereas blind detection is
referred to as a public watermarking system, where everyone must be allowed to detect the
watermark.
2.2.5: False Positive Rate
A false positive is the detection of a watermark in a cover that does not contain one. For a
detector application, the false positive rate is the number of false positives we expect to occur
in a given number of runs of the detector. The acceptable rate depends on the application. In a
proof of ownership application a small number of false positives may be acceptable. In a copy
control application, however, where huge numbers of detectors are active, we may not want any
false positives during the lifetime of the cover.
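To see why a large number of active detectors tightens this requirement, suppose (as an assumption for this illustration) that each detector run on an unwatermarked cover reports a false positive independently with probability $p$, and that $N$ runs take place. The symbols $p$ and $N$ are introduced here and do not appear elsewhere in this report.

```latex
\[
  \mathbb{E}[\text{false positives}] = N p,
  \qquad
  \Pr[\text{at least one false positive}] = 1 - (1 - p)^{N} \approx N p
  \quad \text{when } N p \ll 1 .
\]
```

Under this assumption, keeping false positives unlikely over the whole lifetime of a cover requires $p$ to be much smaller than $1/N$.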
2.2.6: Robustness
Robustness is the ability of the watermark to remain detectable after common signal
processing operations. Some operations on images that test robustness are spatial filtering,
lossy compression, scanning and geometric distortions. Audio watermarks need to be robust to
recording on audio tape and to variations in playback speed.
2.2.7: Security
The security of a watermark is its resistance to hostile attacks. The types of attack are
unauthorized detection, unauthorized embedding and unauthorized removal. Removal of a
watermark prevents its detection; it may be the total elimination of the watermark or a change
to it so that it escapes detection. Unauthorized embedding and removal are active attacks,
because they modify the cover. In contrast, unauthorized detection, where an adversary can
detect and distinguish watermarks, is a passive attack.
2.2.8: Cipher and Keys
In modern cryptographic algorithms, security is derived from the key as the algorithm is made
public. Similarly it should not be possible to detect the watermark in a cover without the key even
if the watermarking algorithm is known. Also, without the key, an adversary should not be able to
remove or damage the watermark without significant degradation of fidelity. In some systems the
watermark message is first encrypted with a cryptographic key and then embedded with a
watermark key.
2.3: Attacks on Watermarks
In the previous section on security it was mentioned that adversaries are interested in
impairing or damaging a watermark. The following sections describe some of the attacks.
2.3.1: Collusion
In this type of attack the adversary obtains many copies of the cover with the same
watermark. The adversary studies these copies to determine the operation of the watermarking
algorithm. With sufficient insight an adversary will be able to isolate and remove the watermark.
2.3.2: Stirmark
This attack tries to distort the watermark with geometric distortions. Some of the distortions
used are scaling, rotation, cropping, flipping, bending and aspect ratio changing. Watermarks can
resist some types of modifications, but tend to break when everything is thrown at them at once.
2.3.3: Mosaic
This is a type of attack in which the samples of the watermarked object are scrambled before
it is presented to a watermark detector and descrambled afterwards. A watermarked image is
broken into many small rectangular patches, so small that a watermark cannot be detected in
any one of them. These image segments are displayed in a table with the segment edges
adjacent. The HTML table with all portions of the original image adjacent to one another is
visually identical to the image before it was split. This technique can be used in web
applications: the scrambling consists of breaking up the image and the descrambling is
performed by the web browser.
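The splitting and the table layout can be sketched as follows. The image is modeled as a plain list of pixel rows so that the example needs no imaging library; the function names, the tile file naming scheme and the table attributes are hypothetical choices for this illustration.

```python
def split_into_tiles(image, tile_h, tile_w):
    """Cut an image (list of pixel rows) into a grid of small rectangular tiles."""
    grid = []
    for top in range(0, len(image), tile_h):
        row_of_tiles = []
        for left in range(0, len(image[0]), tile_w):
            tile = [row[left:left + tile_w] for row in image[top:top + tile_h]]
            row_of_tiles.append(tile)
        grid.append(row_of_tiles)
    return grid


def reassemble(grid):
    """What the browser effectively does: place the tiles edge to edge again."""
    image = []
    for row_of_tiles in grid:
        for y in range(len(row_of_tiles[0])):
            image.append([pixel for tile in row_of_tiles for pixel in tile[y]])
    return image


def html_table(grid, name="tile"):
    """Build a borderless HTML table with one image cell per tile (hypothetical file names)."""
    rows = []
    for r, row_of_tiles in enumerate(grid):
        cells = "".join(
            f'<td><img src="{name}_{r}_{c}.png"></td>' for c in range(len(row_of_tiles))
        )
        rows.append(f"<tr>{cells}</tr>")
    return '<table cellspacing="0" cellpadding="0">' + "".join(rows) + "</table>"


image = [[y * 6 + x for x in range(6)] for y in range(4)]  # made-up 6x4 "image"
grid = split_into_tiles(image, tile_h=2, tile_w=3)
page = html_table(grid)
```

Reassembling the grid gives back the original pixels exactly, which is why the page looks unchanged to a viewer while a detector that inspects each small tile separately finds nothing.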
References
[1] Steganography: Seeing the Unseen by Neil F. Johnson and Sushil Jajodia
[2] Steganalysis of Images Created Using Current Steganography Software by Neil F. Johnson
and Sushil Jajodia
[3] The Steganographic File System by Ross Anderson, Roger Needham and Adi Shamir
[4] Techniques for Data Hiding by W. Bender, D. Gruhl, N. Morimoto and A. Lu
[5] Digital Watermarking by I. J. Cox, M. L. Miller and J. A. Bloom
[6] A Review of Data Hiding in Digital Images by Eugene T. Lin and Edward J. Delp
[7] Practical Challenges for Digital Watermarking Applications by Ravi K. Sharma and Steve Decker
¹ http://tvdramas.about.com/od/prisonbreak/ss/prisonbrphotos.htm
² http://www.randomhouse.com/features/thecodebook/excerpt.html
³ Hidden in Plain View: A Secret Story of Quilts and the Underground Railroad
by Jacqueline L. Tobin and Raymond G. Dobard
⁴ http://www.spammimic.com/