On the Design of Copyright Protection Protocols for Multimedia Distribution Using Symmetric and Public-Key Watermarking¹

Stefan Katzenbeisser
Institute for Information Systems, Vienna University of Technology
Favoritenstraße 9–11/184-2, A–1040 Wien, Austria
skatzenbeisser@acm.org

¹ This work was supported by the Austrian Science Fund (FWF) under Program Nr. Z29-INF.

Abstract

The advent of the Web, electronic commerce and the creation of electronic distribution channels for multimedia content have brought new challenges regarding the protection of intellectual property. As it has become increasingly difficult to protect the distribution medium against copying, techniques for asserting the copyright on information have gained in importance. We review the requirements of practical copyright protection schemes and survey protocols that use symmetric and public-key watermarking algorithms.

1. Introduction

With the increasing availability and distribution of media in digital form, the protection of intellectual property faces new challenges. The possibility to easily and cheaply reproduce content without loss of quality is undermining the film, music and entertainment industries. As a consequence, the question of how to effectively protect the copyright holder’s interests is critical to a wide-spread acceptance of electronic multimedia-distribution channels by content providers.

Most implementations try to counteract the increased risk of copyright infringements by copy protection. They attempt to find ways to limit the access to copyrighted material and/or inhibit the copy process itself. Examples include encrypted digital TV broadcast or copy protection techniques on DVDs. As recent examples show, copy protection is very difficult (if not impossible) to achieve in open systems. Copyright protection, on the other hand, does not restrict the use of copyrighted material, but attempts to resolve the copyright situation once an act of infringement has occurred. The idea is to insert copyright information (so-called copyright tokens) into the digital object without loss of quality. Whenever the copyright of a digital object is in question, this information may be extracted to identify the rightful owner. The most prominent way of embedding information in multimedia data is the use of digital watermarking [5]. Due to their practicability, it is likely that copyright protection systems will appear in future multimedia applications; as a consequence, considerable interest in digital watermarking exists for electronic commerce applications. However, watermarking protocols are yet to experience wide-spread use.

2. Application Scenarios and Requirements

In this paper, we consider the following distributed application for delivering digital videos to customers, shown in Figure 1. Several content providers (CPi) provide digital video streams to a broadcast organization (B), which in turn delivers the content upon request to different customers (Ci). Examples of this kind of application include electronic-commerce systems (using the Internet as transfer medium), pay-per-view digital television or information systems based on multimedia databases (here, the “broadcast organization” could simply be a service provider).

Although we do not intend to provide copy protection facilities, a copyright protection mechanism should be available to resolve the copyright situation in case an illegal copy of a multimedia object is found. In order to be useful in practice, such a mechanism has to fulfill several requirements. First, the mechanism should uniquely identify the copyright owner of the multimedia object; it should not be possible for an attacker to falsely identify a person as copyright holder (this includes e.g. forgery or copying of copyright tokens in the multimedia stream).
In general, the overall system must be robust against intentional attacks; copyright tokens should be difficult to detect in and remove from the multimedia stream.

[Figure 1. Application scenario for multimedia distribution: content providers CP1, CP2, ..., CPn; broadcast organization B; customers C1, ..., Cn.]

Furthermore, the copyright owner wants to be able to trace unauthorized copies; given one illegal copy of a multimedia object, it should be possible to identify the original buyer. To be fair towards the customer, false claims of infringement must not be possible. On the other hand, one wants non-repudiation, i.e. the property that customers cannot later deny that they bought the illegally distributed objects (fairness towards the content provider). When we speak of a secure protocol, we mean that the above requirements are satisfied.

3. Symmetric Watermarking

The most prominent way to protect content is the use of symmetric watermarks. Basically, a symmetric watermarking scheme consists of two algorithms, an embedding and an extraction algorithm. The embedding algorithm inserts a watermark into digital media using a secret key, thereby generating the watermarked media. Depending on the nature of the extraction algorithm, two types of watermarking schemes can be identified. The extraction process of private watermarking systems takes the watermarked media, the original media, the watermark and the secret key and outputs TRUE if the watermark is actually present. In the case of blind watermarking systems, the extractor extracts the watermark given only the watermarked media and the key; semiblind watermarking schemes also require the original, unmarked object in the extraction process. Clearly, blind systems are preferable, as they do not require disclosure of the original unmarked media in the verification process. Watermark extraction should also be possible in case small modifications have been applied to the marked media, i.e. the embedding process should be robust.
Such modifications can be the result of intentional attacks aimed at removing the mark, or the result of coding schemes (e.g. lossy compression) and errors during transmission [5].

Example: Symmetric watermarking [4]. Let $a_j \in \{-1, 1\}$ be the watermark (encoded as a string of $-1$ and $1$) to be hidden in a linearized video stream $v_i$. The sequence $a_j$ is upsampled by a factor $r$, called chip-rate, to obtain a sequence $b_i = a_j$ for $j \cdot r \le i < (j+1) \cdot r$. The new sequence $b_i$ is modulated by a pseudo-noise signal $p_i$ ($p_i \in \{-1, 1\}$), scaled by a constant $\alpha$ and added to the video stream to be watermarked:

$$\hat{v}_i = v_i + \alpha\, b_i\, p_i .$$

Here, $\hat{v}_i$ denotes the watermarked video stream. Due to the noisy appearance of $p_i$, the watermark $\alpha\, b_i\, p_i$ is also noise-like and therefore difficult to detect and remove. In order to verify the mark, the signal $p_i$ used in the embedding process must be known. The possibly modified datum $\hat{v}_i$ is multiplied by the same noise-like signal $p_i$ that was used in the embedding process. After multiplication, all samples containing one specific watermark bit are added:

$$s_j = \sum_{j \cdot r \le i < (j+1) \cdot r} p_i\, \hat{v}_i = \sum_{j \cdot r \le i < (j+1) \cdot r} p_i\, v_i + \sum_{j \cdot r \le i < (j+1) \cdot r} p_i^2\, \alpha\, b_i .$$

If the pseudo-noise signal $p_i$ and the video stream $v_i$ are uncorrelated, the first sum is close to zero; since $p_i^2 = 1$, the sum should be close to $s_j \approx \alpha\, r\, a_j$, and $a_j$ can be recovered by $a_j = \mathrm{sign}(s_j)$. Clearly, $p_i$ acts as a watermarking key in this system.

A simple copyright protection protocol could be outlined as follows: the content provider watermarks his multimedia objects and forwards them to the broadcaster, who in turn sells them to the customers. In order to provide tracing of unauthorized copies, the broadcaster watermarks the object a second time with the identity of the buyer. If a customer sells media illegally and the content provider finds such a copy, he is able to extract the watermark and (with the help of the broadcaster) to accuse the original customer of infringing his copyright in court. Unfortunately, such a protocol does not fulfill most of the requirements outlined in Section 2.
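For concreteness, the spread-spectrum example of this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of [4]: the function names (`pn_sequence`, `embed`, `extract`), the use of an integer key to seed a standard pseudo-random generator, and the zero-mean host signal are all choices made here for illustration only.

```python
import random


def pn_sequence(key, length):
    """Pseudo-noise signal p_i in {-1, +1}; the integer key seeds the generator
    (an assumption for this sketch, not a cryptographically secure choice)."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(length)]


def embed(video, bits, key, chip_rate, alpha):
    """Return the marked stream v_hat_i = v_i + alpha * b_i * p_i.

    bits holds the watermark a_j in {-1, +1}; each bit is spread over
    chip_rate consecutive samples (b_i = a_j for j*r <= i < (j+1)*r).
    """
    if len(video) < len(bits) * chip_rate:
        raise ValueError("video stream too short for this watermark")
    p = pn_sequence(key, len(video))
    marked = list(video)
    for j, a in enumerate(bits):
        for i in range(j * chip_rate, (j + 1) * chip_rate):
            marked[i] = video[i] + alpha * a * p[i]
    return marked


def extract(marked, n_bits, key, chip_rate):
    """Recover a_j = sign(s_j), where s_j sums p_i * v_hat_i over one bit's samples."""
    p = pn_sequence(key, len(marked))
    bits = []
    for j in range(n_bits):
        s = sum(p[i] * marked[i] for i in range(j * chip_rate, (j + 1) * chip_rate))
        bits.append(1 if s >= 0 else -1)
    return bits
```

Calling `extract` with the same key and chip-rate that were passed to `embed` recovers the embedded bits as long as the correlation term between the host signal and the pseudo-noise signal stays well below `alpha * chip_rate`. The sketch assumes a roughly zero-mean host; a host with a large constant offset would need that offset removed before correlation.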
Although current watermarking systems are quite robust against intentional and unintentional modifications of the multimedia stream, the goal of resolving the copyright situation can be subverted entirely by attacking the protocol rather than the watermarking algorithm itself. Examples include inversion attacks, where an attacker inserts a second watermark into an already marked object in such a way that no third party can determine the actual order of watermark insertion, and copy attacks, in which watermarks are copied between multimedia objects without knowledge of the secret keys. In order to thwart these attacks, secure protocols require that watermarks are generated in a standard one-way manner (e.g. using a cryptographically secure one-way hash function) from the original, unmarked multimedia object.

Another difficult problem is the symmetric nature of traditional watermarking algorithms. When a watermark is to be verified, the symmetric key must be disclosed. Since in most watermarking schemes the key is naturally coupled with the location of the watermark in the digital media, attackers can remove the mark completely once the key is known (in the previous example, the scaled sequence $b_i p_i$ can be subtracted from the multimedia stream once both $b_i$ and $p_i$ are revealed). In general, watermark verification works securely only once; thereafter, the watermark can no longer provide copyright protection.

To cope with the last problem, specialized tamper-resistant hardware might be used in closed systems, such as pay-TV applications. By concealing the entire watermark verification process in trusted hardware, the key can be kept secret from the verifier; the presence of a watermark is then proved by relying on the trusted hardware. Similar to the procedure applied in [6], a simple copyright protection protocol for the application scenario of Section 2 can be outlined as follows:

1. The content provider uses tamper-resistant hardware to generate a watermarking key envelope, containing an encrypted random watermarking key and a proof of his identity (a cryptographically signed identity string). Furthermore, he watermarks his objects (again in the trusted device) using the key envelope and forwards the marked object to the broadcaster. In order to avoid various protocol attacks, the watermark must contain a cryptographically secure hash of the original, unmarked media.

2. The user requests a document from the broadcaster. To make the request legally binding, he signs it cryptographically.

3. The broadcaster assigns a unique number to every request and stores it in a database. In order to provide tracing of copies, the broadcaster watermarks the requested document with the request identification number. For this purpose, he again uses a special hardware facility that outputs an encrypted version of the marked multimedia object. He then sends the encrypted marked object back to the customer.

4. The customer decrypts the marked object and continues to use it.

If the copyright situation is in question, the content provider uses his own hardware device and the watermarking key envelope to prove the presence of the mark. Furthermore, the broadcaster is able to extract the mark identifying the customer; as his sales database contains the original signed request, the customer cannot later deny that he bought the document in question. Provided that a robust watermarking algorithm is used as a basis, this protocol fulfills most requirements of Section 2; refer to [6] for details.

The above protocol only allows verifying the presence of a watermark in a multimedia object, which is in general not sufficient to resolve rightful ownership. Consider an attacker who inserts his own watermark into the already marked object; he can then perform the above protocol, thereby pretending that he is the rightful owner.
To resolve the situation, a dispute resolution process must take place, in which all alleged owners participate. This protocol tries to establish a strict precedence order on the claims, similar in spirit to the ordering system used for patent rights. The actual copyright holder can only be determined if he/she participates in the protocol. In its simplest form, a judge checks the presence of watermarks in the multimedia objects that the participating parties claim to be the original, unmarked media stream. If one party watermarked an already watermarked object, his “original” media stream should contain a watermark from the other party; continuing this process should eventually reveal the order of watermark insertion.

Note that the outlined protocol is fair towards the customer. As the watermarked media object does not leave the trusted hardware unencrypted and only the customer can decrypt the media stream, he cannot claim that any other person illegally distributed an object containing his identity.

If one actually attempts to provide proofs of ownership with watermarking protocols, one needs a central registration facility (see e.g. [1]). This registration office keeps a copy of all marked objects, along with a timestamp. When a content provider wants to register a new multimedia object, the registration office checks its database to see whether a perceptually similar object (according to some similarity measure) has already been registered by a different content provider. In this case, the request is denied. This is necessary to prohibit a trivial attack: an attacker slightly modifies an already registered multimedia object and registers it under his own name. He can then accuse the true copyright owner of stealing and modifying his object. However, there are several problems with such an approach. First, it requires a central facility, which is clearly impractical if the number of multimedia objects gets large.
Second, it is not clear how “perceptual similarity” could be defined formally; furthermore, there is no reason to assume that the actual copyright holder will be the first one to register his object. Even worse, what happens with perceptually similar objects (e.g. photographs of the same object) that really originate from different content providers?

4. Public-Key Watermarking

Although specialized hardware might be applicable in closed systems to prevent the publication of secret watermarking keys, an algorithmic solution would clearly be preferable. A promising approach is the use of public-key watermarking schemes. Similar to public-key cryptography, a private key is used to embed a watermark in a multimedia object. However, the presence of the mark can be verified using a public key; therefore one also speaks of asymmetric detection. Furthermore, knowledge of a public key must not enable an attacker to compute the corresponding private key, nor may it allow copying or forgery of the mark. In the watermark verification process, the unmarked original is not required; otherwise its publication would again allow subsequent attacks (the algorithm may, however, use a “derived” multimedia object, which is computed from the unmarked object; in this case it must be computationally infeasible to reconstruct the “true” original from published information).

Example: Public-Key Watermarking [3]. Several watermarking schemes with an asymmetric detection process have been proposed. Among them are systems that use properties of Legendre sequences, “one-way signal processing” techniques or eigenvectors of linear transforms. Unfortunately, none of these schemes is yet sufficiently robust against malicious attacks. See [3] for an analysis; most of these systems are not yet ready for practical use.

Several issues must be considered in the construction of copyright protection protocols involving public-key watermarks.
Most importantly, the original object is not available in the (public) detection process. This implies that protocol attacks are much more difficult to prevent, as the simple prescription of a one-way function is no longer possible. Consider the following simple protocol attack: an attacker takes an arbitrary video stream, collects information about it and constructs a purported watermark that can be detected in the media at any given strength. He then falsely claims that this constructed mark is actually his watermark and that he previously inserted the mark into the multimedia object. Such attacks work as long as the watermark is not required to be generated in a standard one-way manner.

Another issue lies in the fact that watermarks are verified only with the public key. During watermark verification, knowledge of the corresponding private key must be proved without revealing its value. Different types of public-key watermarking systems use ideas adopted from cryptographic zero-knowledge proofs to fulfill the requirements listed at the beginning of this section.

Example: Zero-knowledge watermarking [2]. The main idea is to verify the watermark in a scrambled version of the multimedia object. Let $\sigma$ be any permutation on $n$ elements and $G$ be a graph with $n$ nodes. The public key of the content provider consists of $G$ and $\sigma(G)$, whereas $\sigma$ is the private key and is therefore kept secret. The verification process consists of several interactive rounds in which the content provider proves to a third person the presence of his watermark and knowledge of his own secret key; in each round he is able to cheat with a probability of $1/2$. By performing several rounds, the verifier can gain any desired degree of certainty that the mark is actually present. Before the protocol starts, the content provider publishes $\sigma(O)$ and $\sigma(W)$, the scrambled versions of the watermarked object $O$ and the watermark $W$.

In each round $i$, the content provider chooses two permutations $\tau_i$ and $\rho_i$ with the property that $\rho_i \circ \tau_i = \sigma$ and computes scrambled versions of the graph $G$ and the watermarked multimedia object $O$ with respect to $\tau_i$. He then constructs an ownership ticket, containing commitments (cryptographic primitives that allow a person to commit to a sequence of bits but keep them secret until some time later) to both $\tau_i$ and $\rho_i$; furthermore, the ticket contains hashes of the scrambled object $\tau_i(O)$ and graph $\tau_i(G)$. The verifier then flips a coin and asks the content provider to open one of the commitments (but not both). If the commitment containing $\tau_i$ is opened, the verifier is able to compute scrambled versions of the document and graph; he then hashes the scrambled object and checks whether the hash value agrees with the bits contained in the ownership ticket. If, however, the commitment containing $\rho_i$ is opened, the verifier applies the inverse permutation $\rho_i^{-1}$ to both the scrambled watermarked document $\sigma(O)$ and the scrambled mark $\sigma(W)$. He then checks the presence of the scrambled watermark $\rho_i^{-1}(\sigma(W))$ in $\rho_i^{-1}(\sigma(O))$. Although opening only one commitment does not reveal any information about the secret key $\sigma$, it allows the verifier to check the presence of the scrambled watermark in the scrambled multimedia object and the knowledge of the private watermarking key. The system does not allow forgery of the mark, nor does it allow falsely claiming the presence of a watermark; for security considerations, refer to [2].

Although zero-knowledge watermarking systems seem quite secure from a cryptographic viewpoint, they require large amounts of data to be transmitted in the verification process; this amount could actually be several times larger than the size of the document in which the watermark must be verified.

Both public-key watermarking and zero-knowledge watermarking can be used to construct secure copyright protection protocols that do not require hardware support.
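A single round of the zero-knowledge example can be illustrated with a toy Python sketch. It is a simplification under several assumptions and makes no security claims: the graph $G$ is omitted, commitments are modelled as SHA-256 hashes over a random nonce and the permutation, the watermark detector is a plain correlation test, and all function names are hypothetical. Permutations are stored as index lists, with `apply_perm(perm, seq)[i] == seq[perm[i]]`.

```python
import hashlib
import random
import secrets


def apply_perm(perm, seq):
    """Scramble seq: element i of the result is seq[perm[i]]."""
    return [seq[k] for k in perm]


def invert(perm):
    """Return the inverse index list, so that perm[inv[k]] == k."""
    inv = [0] * len(perm)
    for i, k in enumerate(perm):
        inv[k] = i
    return inv


def digest(value):
    return hashlib.sha256(repr(value).encode()).hexdigest()


def commit(perm):
    """Hash-based commitment (an assumption); returns (commitment, opening)."""
    opening = (secrets.token_hex(16), perm)
    return digest(opening), opening


def detect(obj, mark, threshold):
    """Toy correlation detector; consistent scrambling leaves it unchanged."""
    return sum(o * w for o, w in zip(obj, mark)) / len(obj) > threshold


def prover_round(sigma, marked, rng):
    """Split the secret sigma into tau and rho (tau[rho[i]] == sigma[i])."""
    tau = list(range(len(sigma)))
    rng.shuffle(tau)
    tau_inv = invert(tau)
    rho = [tau_inv[s] for s in sigma]
    c_tau, open_tau = commit(tau)
    c_rho, open_rho = commit(rho)
    ticket = {"c_tau": c_tau, "c_rho": c_rho,
              "h_obj": digest(apply_perm(tau, marked))}
    return ticket, {"tau": open_tau, "rho": open_rho}


def verifier_check(ticket, which, opening, marked, pub_obj, pub_mark, threshold):
    """Check the one commitment ('tau' or 'rho') that the coin flip selected."""
    if digest(opening) != ticket["c_" + which]:
        return False
    perm = opening[1]
    if which == "tau":
        # Recompute the scrambled object and compare with the ticket hash.
        return digest(apply_perm(perm, marked)) == ticket["h_obj"]
    # Undo rho on the published sigma(O), sigma(W) and look for the mark.
    inv = invert(perm)
    return detect(apply_perm(inv, pub_obj), apply_perm(inv, pub_mark), threshold)
```

Here `pub_obj` and `pub_mark` stand for the published $\sigma(O)$ and $\sigma(W)$. Since a cheating prover succeeds in one round with probability $1/2$, repeating the round $k$ times with fresh permutations reduces that probability to $2^{-k}$.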
Instead of concealing the watermark verification process in hardware, the public watermark detection facility is used. The following protocol uses both public-key and symmetric watermarking: the former identifies the copyright holder, whereas the latter is used to trace unauthorized copies.

1. The content provider publishes his public watermarking key, cryptographically signed, in a public database and uses his private key to watermark all multimedia objects he is going to distribute. Finally, the content provider forwards his marked objects to the broadcast organization.

2. The user requests a document from the broadcaster. To make the request legally binding, he signs it cryptographically.

3. The broadcast organization creates a unique number for the request and stores the signed customer request along with its number in its sales database. A random watermarking key is generated and the requested multimedia object is watermarked with the request number (and the generated watermarking key) using a symmetric watermarking algorithm.

4. The broadcast organization then sends the watermarked object back to the user in encrypted form.

If an illegal copy of the multimedia object is found, both watermarks must be extracted. The symmetric watermark will reveal the original customer, whereas the public-key watermark determines the copyright holder; it is possible to use either a traditional “public-key watermark” or a zero-knowledge watermarking scheme (in the latter case, the content provider and a judge engage in the probabilistic verification protocol). Note that in this application it is not so important to keep the symmetric watermarking key secret; as each key is used only once, its knowledge enables the attacker only to remove the customer identification from one copy of the multimedia object. The detection of the public mark is not impaired. Thus, the overhead of public-key detection is not justified for the customer-identifying mark.
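The broadcaster's part of this protocol (step 3 and the later tracing of an illegal copy) can be sketched as follows. This is a hedged illustration only: the class `Broadcaster`, the constants `CHIP_RATE`, `ALPHA` and `ID_BITS`, and the spread-spectrum embedding borrowed from the example of Section 3 are assumptions of this sketch. Signature verification of the request and encryption of the delivered object are omitted, and tracing by trying every stored one-time key is one possible reading of the text, not a procedure it prescribes.

```python
import random
import secrets

CHIP_RATE = 250   # samples per watermark bit (hypothetical value)
ALPHA = 4.0       # embedding strength (hypothetical value)
ID_BITS = 32      # bits used to encode the request number


def _pn(key, n):
    """Pseudo-noise signal in {-1, +1} derived from the one-time key."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]


def _embed(media, number, key):
    """Embed the request number with a spread-spectrum mark."""
    if len(media) < ID_BITS * CHIP_RATE:
        raise ValueError("media object too short")
    bits = [1 if (number >> k) & 1 else -1 for k in range(ID_BITS)]
    p = _pn(key, len(media))
    out = list(media)
    for j, a in enumerate(bits):
        for i in range(j * CHIP_RATE, (j + 1) * CHIP_RATE):
            out[i] += ALPHA * a * p[i]
    return out


def _extract(media, key):
    """Read back a candidate request number using the given key."""
    p = _pn(key, len(media))
    number = 0
    for j in range(ID_BITS):
        s = sum(p[i] * media[i] for i in range(j * CHIP_RATE, (j + 1) * CHIP_RATE))
        if s > 0:
            number |= 1 << j
    return number


class Broadcaster:
    def __init__(self):
        self.sales = {}        # request number -> (signed request, one-time key)
        self.next_number = 1

    def sell(self, media, signed_request):
        """Step 3: number the request, store it, mark the object with a fresh key."""
        number = self.next_number
        self.next_number += 1
        key = secrets.randbits(128)
        self.sales[number] = (signed_request, key)
        return _embed(media, number, key)

    def trace(self, copy):
        """Try each stored key; a key that reproduces its own request number
        identifies the sale. Returns (number, signed request) or None."""
        for number, (signed_request, key) in self.sales.items():
            if _extract(copy, key) == number:
                return number, signed_request
        return None
```

Because every sale uses its own key, disclosing one key during a dispute exposes only the mark in that single copy, which matches the observation above that the symmetric keys need not be protected as strictly as the copyright holder's key.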
Similar to symmetric watermarking systems, the last protocol does not provide a proof of ownership of a multimedia object, as the object might contain other watermarks. Again, a dispute resolution protocol is necessary.

Note that the previously outlined protocol is not fair towards the customer, since it allows the broadcaster to distribute illegal copies containing watermarks identifying arbitrary buyers. By increasing the complexity of the protocol, this problem can be avoided. More specifically, the customer sends a watermark $W$, encrypted in a public-key cryptosystem with his own public key, to the broadcaster, who scrambles it according to a permutation known only to the broadcaster. The broadcaster then encrypts the multimedia object to be distributed with the customer's public key and inserts the encrypted permuted watermark into the encrypted media. As long as watermark and object are encrypted in chunks of the same size, this produces an encrypted media object containing the permuted mark. In this case, the broadcaster does not have access to the marked media object (since the watermark is encrypted by the customer); furthermore, the customer has no knowledge of the embedded watermark, since it was permuted by the broadcaster.

5. Conclusions

We have surveyed the use of symmetric and public-key watermarking systems in copyright protection protocols. Whereas solutions using symmetric algorithms and specialized hardware are feasible today, protocols relying on public-key watermarking algorithms are still a subject of research. We argue that watermarking alone is not sufficient to resolve rightful ownership of digital data; a protocol relying on the existing public-key infrastructure (which is also used for digital signatures) is necessary. It seems that the primary vulnerability of current watermarking protocols is the watermarking algorithm itself; most known watermarking systems are sensitive to intentional distortions of the digital data and do not merge the digital data and the watermark completely, as e.g. copy attacks show.
Nevertheless, we believe that protocols relying on both public-key and symmetric watermarking algorithms may in the future be implemented in various multimedia distribution systems. Whereas copy protection might never be feasible in open systems, applications of digital watermarking technology could provide a mechanism supporting claims of rightful ownership once watermarking technology is standardized.

References

[1] A. Adelsbach, B. Pfitzmann, A. Sadeghi, “Proving Ownership of Digital Content,” in Proceedings of the Third International Workshop on Information Hiding, Springer Lecture Notes in Computer Science, vol. 1768, 2000, pp. 117–133.

[2] S. Craver, S. Katzenbeisser, “Copyright Protection Protocols Based on Asymmetric Watermarking: The Ticket Concept,” in Communications and Multimedia Security Issues of the New Century, Kluwer Academic Publishers, 2001.

[3] J. J. Eggers, J. K. Su, B. Girod, “Asymmetric Watermarking Schemes,” in Sicherheit in Mediendaten, GMD Jahrestagung, Proceedings, 2000.

[4] F. Hartung, B. Girod, “Watermarking of Uncompressed and Compressed Video,” in Signal Processing, vol. 66, no. 3, May 1998, pp. 283–301.

[5] S. Katzenbeisser, F. A. P. Petitcolas (eds.), Information Hiding Techniques for Steganography and Digital Watermarking, Boston, London: Artech House, 2000.

[6] P. Tomsich, S. Katzenbeisser, “Copyright Protection Protocols for Multimedia Distribution Based on Trusted Hardware,” in Protocols for Multimedia Systems (PROMS 2000), Proceedings, Cracow (Poland), 2000, pp. 249–256.