EE4414 Multimedia Communication System II, Fall 2005, Yao Wang
Homework 10 Solution (Data Hiding, Watermarking, Steganography)
___________________________________________________________________________________
Review questions
1.
Describe the differences between data hiding, watermarking, and steganography. What are their respective applications and requirements? Is one a special case of another?
Data hiding refers to techniques that embed data in a cover media. Watermarking is a special type of data hiding, in which the embedded data carries information that can be used to prove the ownership or integrity of the cover media. Steganography is another special type of data hiding, in which the embedded data is a secret message that is typically unrelated to the cover media. The cover media can be chosen randomly or chosen to facilitate the conveyance of the secret message. Because multimedia documents (audio, image, or video) have a lot of redundancy, they are often chosen as the cover media for embedding secret messages (which themselves could be an image, plain text, or encrypted text). With watermarking, the embedding algorithm should be designed so that the embedded watermark is hard to remove (except for applications using fragile watermarks). For steganography, on the other hand, this is usually not a concern.
2.
The following are some possible applications of watermarking. Describe what they are and their requirements.
a. Ownership assertion
b. Fingerprinting
c. Copy prevention and control
d. Detection of unauthorized alteration (authenticate integrity of the data)
Ownership assertion: the embedded watermark is used to prove the ownership of the cover media. The owner or its trusted delegates should be able to detect the presence of the watermark (with the help of the original and the secret key used to generate the watermark) even if the watermarked document is subjected to accidental or malicious alterations (such as compression/decompression, cropping, printing/scanning, or shrinking followed by zooming).
Fingerprinting: a different watermark is embedded in each sold copy of the original media. If a copy is altered or reproduced illegally, the owner can use the embedded watermark to track down the person who purchased that copy.
Copy prevention and control: the watermark contains information on how many copies can be made. A compliant recorder/copier will detect such a watermark before making copies.
Detection of unauthorized alteration: the watermark should be such that if the media with the watermark is altered, even if slightly, then the extracted watermark is different from the original watermark.
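The alteration-detection requirement can be illustrated with a minimal sketch of one common fragile scheme. This construction is an assumption chosen for illustration and is not taken from the lecture notes: a keyed digest (HMAC-SHA256) of the image with its least significant bits cleared is written into the LSBs of the first 256 pixels. The function names and the flat list-of-integers image representation are hypothetical.

```python
import hashlib
import hmac

DIGEST_BITS = 256  # SHA-256 digest length in bits


def _digest_bits(pixels, key):
    """Keyed digest of the pixels with their LSBs cleared, as a list of bits."""
    cleared = bytes(p & 0xFE for p in pixels)
    digest = hmac.new(key, cleared, hashlib.sha256).digest()
    return [(byte >> (7 - i)) & 1 for byte in digest for i in range(8)]


def embed_fragile(pixels, key):
    """Return a copy of `pixels` (ints in 0..255) carrying the fragile mark."""
    if len(pixels) < DIGEST_BITS:
        raise ValueError("need at least 256 pixels to hold the digest")
    marked = list(pixels)
    # Clearing the LSBs before hashing means embedding does not change the hash input.
    for i, bit in enumerate(_digest_bits(pixels, key)):
        marked[i] = (marked[i] & 0xFE) | bit
    return marked


def verify_fragile(pixels, key):
    """True if the LSBs of the first 256 pixels match the recomputed digest."""
    extracted = [p & 1 for p in pixels[:DIGEST_BITS]]
    return extracted == _digest_bits(pixels, key)
```

Changing any of the upper seven bits of any pixel changes the recomputed digest, so verification fails. A limitation of this particular sketch is that changes confined to the LSBs of pixels beyond the first 256 are not detected.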
3.
What is the difference between robust and fragile watermarks? Which technique would you use for the above applications (a-d in Prob. 2)?
Robust watermarks are those that can withstand accidental or malicious manipulations of the watermarked document.
Fragile watermarks are those that will change if the watermarked document is somewhat manipulated.
Robust watermarks are useful for ownership assertion, fingerprinting, and copy control, whereas fragile watermarks are used for detection of unauthorized alteration of a document.
4.
Propose one possible technique, for embedding a watermark (invisible) in an image for ownership assertion. Also describe the corresponding watermark detection scheme. Remember that the watermarked image should look almost identical to the original image, and that the watermark should be detectable even after the watermarked image is modified somewhat (saved with a different compression ratio; printed, copied and then scanned; etc.)
See slides 29 and 30 in lecture note “memon_F05.pdf” for two possible techniques: one adds a pseudo random sequence in image sample domain directly, another in DCT coefficient domain.
More specifically, in the transform domain technique described on Slide 20, the image is subjected to a block-wise DCT. Then the DCT coefficients with intermediate frequencies (say, those corresponding to rows 3-4 and columns 3-4 in the 8x8 DCT), in all blocks or in blocks selected based on some secret key, are modified by adding a random sequence (the watermark signal). For example, if the chosen DCT coefficients are arranged into a 1D sequence, {f_1, f_2, …, f_N}, and the watermark signal is represented by {w_1, w_2, …, w_N}, where the w_i are zero-mean random variables with small magnitudes (e.g., only -1 and 1 are possible values), these DCT coefficients are changed to {f_1+w_1, f_2+w_2, …, f_N+w_N}. The image with the changed DCT coefficients is then inverse-transformed back to the image domain. Because the w_i have small magnitudes, they should not induce visible changes in the watermarked image. Only the mid-frequency DCT coefficients are modified, to satisfy both the perceptual similarity and robustness requirements. If the very low frequency DCT coefficients are modified, the watermarked image is likely to have visible artifacts. On the other hand, if very high frequency coefficients are modified, the added watermark can be easily removed by low-pass filtering.
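The embedding step for a single 8x8 block can be sketched as follows. This is a minimal illustration and not the exact scheme from the slides. It assumes an orthonormal DCT-II, reads "rows 3-4, columns 3-4" as 1-based (zero-based indices 2 and 3), and keeps pixel values as floats without rounding. The names `dct2`, `idct2`, `MID` and `embed_block` are hypothetical.

```python
import math
import random

B = 8  # block size

# Orthonormal DCT-II basis matrix: D[u][x]
D = [[(math.sqrt(1 / B) if u == 0 else math.sqrt(2 / B))
      * math.cos((2 * x + 1) * u * math.pi / (2 * B))
      for x in range(B)] for u in range(B)]


def _matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(B)) for j in range(B)]
            for i in range(B)]


def _transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(block):
    """2-D DCT of an 8x8 block: F = D * block * D^T."""
    return _matmul(_matmul(D, block), _transpose(D))


def idct2(coef):
    """Inverse 2-D DCT: block = D^T * F * D (D is orthonormal)."""
    return _matmul(_matmul(_transpose(D), coef), D)


# Assumed mid-frequency positions: rows 3-4, columns 3-4 (1-based).
MID = [(u, v) for u in (2, 3) for v in (2, 3)]


def embed_block(block, key):
    """Add a key-dependent +/-1 watermark to the mid-frequency coefficients."""
    rng = random.Random(key)
    w = [rng.choice((-1, 1)) for _ in MID]
    coef = dct2(block)
    for (u, v), wi in zip(MID, w):
        coef[u][v] += wi
    return idct2(coef), w
```

With these four positions and a watermark amplitude of 1, no pixel changes by more than 1 gray level. A practical scheme would also have to round the result back to integer pixels, which is one reason real systems scale the watermark amplitude.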
To detect the watermark, one can do the block-wise DCT on the watermarked image, select the same set of DCT coefficients, $\{f'_n, n = 1, 2, \ldots, N\}$, and correlate these coefficients with the watermark signal (the owner knows which coefficients are modified and the watermark signal), by calculating $C = \sum_{n=1}^{N} f'_n w_n$. If the watermarked image is not altered at all, then $f'_n = f_n + w_n$ and $C \approx N$.
Even if the watermarked image has been subjected to some manipulations, it is likely that C is a large number greater than T=N/2. Therefore, as long as C>=T, one can verify the presence of the watermark.
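The threshold test can be tried on synthetic data with a short sketch. The data here are hypothetical: zero-mean Gaussian values stand in for the selected DCT coefficients, and a mild manipulation is modeled as additive noise. The function names are illustrative only.

```python
import random


def make_watermark(n, key):
    """Pseudo-random +/-1 sequence, reproducible from the owner's secret key."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]


def correlation(coeffs, w):
    """C = sum over n of f'_n * w_n."""
    return sum(f * wi for f, wi in zip(coeffs, w))


def watermark_present(coeffs, w):
    """Declare the watermark present when C >= T, with T = N/2."""
    return correlation(coeffs, w) >= len(w) / 2


# Hypothetical test data (assumption: Gaussian host coefficients, std 5).
rng = random.Random(0)
N = 10000
f = [rng.gauss(0, 5) for _ in range(N)]
w = make_watermark(N, key=42)
marked = [fi + wi for fi, wi in zip(f, w)]
attacked = [m + rng.gauss(0, 2) for m in marked]  # manipulation modeled as noise
```

In this setup `watermark_present` should hold for `marked` and `attacked` and fail for the unmarked `f` or for a watermark generated from a different key. The margin depends on N and on how large the host coefficients are relative to the watermark.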
Note that in this example, we have assumed that $\sum_{n=1}^{N} f_n w_n = 0$, which is true if $\{w_n, n = 1, 2, \ldots, N\}$ is independent of $\{f_n, n = 1, 2, \ldots, N\}$ and has zero mean.
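The step from the correlation sum to the value N can be written out explicitly, assuming the unaltered case $f'_n = f_n + w_n$ and watermark values $w_n \in \{-1, +1\}$ (so $w_n^2 = 1$), as in the example above:

```latex
C = \sum_{n=1}^{N} f'_n w_n
  = \sum_{n=1}^{N} (f_n + w_n)\, w_n
  = \underbrace{\sum_{n=1}^{N} f_n w_n}_{\approx\, 0}
  + \underbrace{\sum_{n=1}^{N} w_n^2}_{=\, N}
  \approx N
```

The result is exactly N when the cross term vanishes. For an image without the watermark only the cross term remains, so C stays near 0 and falls below T = N/2.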
Instead of correlating the received coefficients directly, if we allow the use of the original document, then one can calculate the correlation between the difference signal and the watermark, i.e. $C = \sum_{n=1}^{N} (f'_n - f_n) w_n$, which is more reliable. This is the method described in Slide 30.
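A minimal sketch of detection with the original available, on hypothetical data (Gaussian host coefficients with a large spread, manipulation modeled as additive noise); the function name is illustrative:

```python
import random


def difference_correlation(received, original, w):
    """C = sum over n of (f'_n - f_n) * w_n, using the original coefficients."""
    return sum((r - o) * wi for r, o, wi in zip(received, original, w))


# Hypothetical data: the host coefficients are much larger than the watermark.
rng = random.Random(1)
N = 1000
w = [rng.choice((-1, 1)) for _ in range(N)]
original = [rng.gauss(0, 20) for _ in range(N)]
received = [o + wi + rng.gauss(0, 1) for o, wi in zip(original, w)]

c_blind = sum(r * wi for r, wi in zip(received, w))    # host term still present
c_diff = difference_correlation(received, original, w)  # host term cancelled
```

Subtracting the original removes the term involving f_n, so `c_diff` deviates from N only because of the manipulation noise. `c_blind` additionally fluctuates with the host coefficients, which is why the threshold test on it can fail when those coefficients are large and N is small.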
5.
What is steganography? What are the design objectives for a steganography system?
Steganography refers to techniques that embed a secret message in a cover media. The design objectives include: 1) the stegoed media (the cover media after embedding) should still look like a typical, normal document; 2) it should be difficult for a steganalysis system to detect the presence of a secret message in the stegoed media; 3) the intended recipient of the stegoed media should be able to detect and decipher the embedded message reliably.
6.
Propose one possible technique, to embed a secret message in an image or a song or a video, and a corresponding algorithm for deciphering the message. Remember that the image (or song or video) after embedding (the stegoed media) should look indistinguishable from the original one, and the intended recipient of the stegoed media should be able to decipher the secret message (with the knowledge of a secret key and stego algorithm) exactly, and that it should be difficult for a steganalysis system (the Warden Wendy) to detect the presence of the secret message.
A simple technique is to change the least significant bits (LSBs) of randomly chosen pixels (based on a secret key) according to the secret message to be embedded. The secret message is first encrypted and converted to a binary sequence of 0s and 1s. Then the LSBs of the chosen pixels are set to the corresponding bits of the secret message. The recipient can retrieve the message by collecting the LSBs of the same set of pixels (selected with the same key) and then decrypting. Because only the LSBs are changed, the modified image should look very much like the original image.
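The LSB scheme can be sketched as follows. This is a minimal illustration under stated assumptions: the image is a flat list of 8-bit pixel values, the message passed in is already encrypted, the recipient knows the message length, and Python's `random.Random(key)` stands in for a proper keyed pseudo-random generator. The function names are hypothetical.

```python
import random


def _positions(n_pixels, n_bits, key):
    """Key-dependent choice of distinct pixel positions, one per message bit."""
    return random.Random(key).sample(range(n_pixels), n_bits)


def embed_lsb(pixels, message, key):
    """Return a copy of `pixels` whose selected LSBs carry the bytes in `message`."""
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("message too long for this cover image")
    stego = list(pixels)
    for pos, bit in zip(_positions(len(pixels), len(bits), key), bits):
        stego[pos] = (stego[pos] & 0xFE) | bit  # overwrite only the LSB
    return stego


def extract_lsb(stego, n_bytes, key):
    """Collect the LSBs of the same key-selected pixels and repack them into bytes."""
    bits = [stego[pos] & 1 for pos in _positions(len(stego), n_bytes * 8, key)]
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )
```

For example, `extract_lsb(embed_lsb(cover, b"meet at ten", key=2005), 11, key=2005)` returns the original bytes, and every pixel differs from the cover by at most 1. Extraction with a different key reads a different set of pixels and yields unrelated bits.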
7.
What is steganalysis? What are the design objectives for a steganalysis system? What are the two types of steganalysis techniques?
Steganalysis refers to the detection of the presence of embedded data in a medium. The design objective is to maximize the “hit” rate (equivalently, to minimize the “miss” rate) while at the same time minimizing the false detection rate.
One may develop steganalysis techniques for a given known embedding method. Such methods tend to have a higher detection rate, but are useless against a new embedding method. Universal steganalysis techniques, on the other hand, aim to detect the presence of embedded data without knowing the embedding technique. This is possible because the embedded data are likely to change the statistics of the original media, no matter what embedding algorithm is used.
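As an illustration of a technique-specific detector, here is a minimal sketch of the pairs-of-values chi-square statistic, a known attack on LSB replacement. It is not taken from the lecture notes. Overwriting LSBs with message bits tends to equalize the histogram counts of each value pair (2k, 2k+1), so an unusually small statistic suggests embedding. The synthetic cover (rounded Gaussian pixel values) and all names are assumptions for the demonstration.

```python
import random


def pov_chi_square(pixels):
    """Chi-square distance between the even-value counts and the pair averages."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    stat = 0.0
    for k in range(128):
        even, odd = hist[2 * k], hist[2 * k + 1]
        expected = (even + odd) / 2  # count expected if LSBs were random
        if expected > 0:
            stat += (even - expected) ** 2 / expected
    return stat


# Hypothetical cover: pixel values clustered around 128 (not a real image).
rng = random.Random(7)
cover = [min(255, max(0, round(rng.gauss(128, 4)))) for _ in range(50000)]
# Full-capacity LSB replacement with random (e.g. encrypted) message bits.
stego = [(p & 0xFE) | rng.getrandbits(1) for p in cover]

stat_cover = pov_chi_square(cover)
stat_stego = pov_chi_square(stego)
```

In this setup `stat_stego` comes out much smaller than `stat_cover`. The statistic is tuned to LSB replacement and says little about other embedding methods, which is the limitation of technique-specific steganalysis noted above.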