Cryptography: Security for the digital society
Gerard Tel, Utrecht University

Preface

Cryptography has received a lot of attention in recent years, and there is ample choice in courses and books about the subject. This book differs from most of the others in some respects, which makes me believe that it can hold its own among the existing material. Many books, written by mathematicians, treat number theory and coding in depth but pay little attention to the computing and algorithmic aspects of cryptography. Other books, written by computer scientists, do treat the computations and protocols but fail to explain the mathematical background. This book, written by an author who is both a mathematician and a computer scientist, tries to find a middle road between a mathematical and a computer science book. The book contains many examples and applications from practice: sometimes historical, sometimes modern, sometimes practical, sometimes philosophical, sometimes funny. I hope to address a wide audience with this style.

Cryptography and this book. The usual definitions of cryptography are too narrow to capture all the current applications of this exciting branch of science. Therefore I make a new attempt to delimit the scope of this book:

Definition 0.1 Cryptography is the art of protection using information.

Observe that information is mentioned here as the means of protection, and not as the object of protection. Of course information can itself be protected in a cryptographic way, and indeed, secrecy of information by encrypting messages is the oldest and best-known application of cryptography. But other applications include access control for buildings, signatures, digital money, and more.

Table of Contents:

1. Introduction to Symmetric Encryption
   1.1. Symmetric and Asymmetric Cryptography
   1.2. Elementary Examples and Attacks
   1.3. Information-theoretic Security
   1.4. Unicity Distance and Computational Security
   Summary and Conclusions
   Exercises to Chapter 1
2. Data Encryption Standard
   2.1. Operation of DES
   2.2. Breaking DES with Brute Force
   2.3. Differential Cryptanalysis
   2.4. Rise and Decline of a Standard
   Summary and Conclusions
   Exercises to Chapter 2
3. Symmetric Encryption Continued
   3.1. Hellman’s Cubic-root Attack
   3.2. Product Systems
   3.3. Stream Ciphers
   3.4. Key Management
   3.5. AES: Advanced Encryption Standard
   Summary and Conclusions
   Exercises to Chapter 3
4. Numbers: Properties
   4.1. The Integers
   4.2. Input Blinding
   4.3. Modular Arithmetic
   4.4. Multiplicative Group: Prime Modulus
   4.5. Modular Ring: Composite Modulus
   Summary and Conclusions
   Exercises to Chapter 4
5. Numbers and Public Key Cryptography
   5.1. Computable Functions in Modular Groups
   5.2. Non-computable Functions in Modular Groups
   5.3. Public Key Cryptosystems
   Summary and Conclusions
   Exercises to Chapter 5
6. Digital Signatures
   6.1. Requirements and Applications
   6.2. Signature Schemes
   6.3. Hash Functions
   6.4. Blind Signatures
   Summary and Conclusions
   Exercises to Chapter 6
7. Identification
   7.1. Definitions and Attacks, Examples
   7.2. Cryptographic Authentication
   7.3. Zero Knowledge Proofs for Authentication
   7.4. More on Zero Knowledge Proofs
   7.5. Biometry
   Summary and Conclusions
   Exercises to Chapter 7
8. Secret Sharing and Threshold Cryptography
   8.1. Introduction: Schemes and Use
   8.2. An Additive Scheme
   8.3. Shamir’s Threshold Scheme
   Summary and Conclusions
   Exercises to Chapter 8
9. Applications
   9.1. Kerberos
   9.2. Universal Mobile Telecommunication System (UMTS)
   9.3. Secure Socket Layer
   Summary and Conclusions
   Exercises to Chapter 9
10. Regulation of Cryptography
   10.1. The Legislation Process
   10.2. Sale and Export of Cryptography
   10.3. Legislation Regarding Confidentiality
   10.4. Further Legislation
   Summary and Conclusions
   Exercises to Chapter 10
11. Smart Cards
   11.1. Introduction
   11.2. Attacks on Smart Cards
   11.3. Opening Smart Cards
   Summary and Conclusions
   Exercises to Chapter 11
12. Electronic Money
   12.1. Money and Requirements on Electronic Money
   12.2. Coin Systems: Ecash
   12.3. Register Systems
   Summary and Conclusions
   Exercises to Chapter 12
13. Secure Computing
   13.1. Introduction
   13.2. Homomorphic Encryption
   13.3. Secret Elections
   13.4. Beyond Function Evaluation
   Summary and Conclusions
   Exercises to Chapter 13
14. The Discrete Logarithm
   14.1. Groups and the Logarithm Problem
   14.2. Groups for Cryptography
   14.3. Group Algorithms
   14.4. The Index Calculus
   Summary and Conclusions
   Exercises to Chapter 14
Appendix A: Probability Calculus
   a. Experiments and Expectation
   b. Bernoulli Sequences
   c. Collisions and the Birthday Theorems
Appendix B: Images
Bibliography
Index

Chapter 1: Introduction to Symmetric Encryption

In symmetric systems, sender and receiver use the same key for encrypting and decrypting a message, respectively. The encryption process uses reversible steps, so that knowledge of this process (the key) is sufficient to retrieve the original message from the enciphered one.

An attack on a system is any attempt by an outsider to retrieve the messages or the key. It is always assumed (Kerckhoffs’ Law) that the attacker knows all details of the system, but does not know the key. In a cipher text only attack, the attacker has only the enciphered messages. In a known plaintext attack, he knows some corresponding pairs of plaintext and cipher text. In a chosen cipher text attack, he can learn the plaintext corresponding to some cipher texts chosen by him. An attack that systematically searches through all possible keys is called a brute force attack.

A cryptographic algorithm is perfectly secure when, given the cipher text, all a priori possible plaintexts are still possible. Perfect secrecy is attractive, but requires the keys to be at least as long as the plaintext. Shorter keys always allow the attacker to get information about the message by searching for a key that maps the cipher text onto a readable plaintext. Surprisingly short messages already reveal the entire message in this way.
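The key search described above can be illustrated on a toy cipher. The following Python sketch is not taken from the book: the shift cipher, the tiny word list that stands in for a "readable plaintext" test, and all function names are assumptions chosen only for illustration.

```python
import string

ALPHABET = string.ascii_uppercase

def shift_encrypt(plaintext: str, key: int) -> str:
    """Toy symmetric cipher: shift every letter by `key` positions."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) + key) % 26] if c in ALPHABET else c
        for c in plaintext
    )

def shift_decrypt(ciphertext: str, key: int) -> str:
    """Decryption is the same reversible step with the opposite shift."""
    return shift_encrypt(ciphertext, -key)

# Hypothetical stand-in for "readable": a real attack would use
# language statistics or a full dictionary.
COMMON_WORDS = {"THE", "AND", "ATTACK", "AT", "DAWN"}

def looks_readable(text: str) -> bool:
    return any(word in COMMON_WORDS for word in text.split())

def brute_force(ciphertext: str) -> list[tuple[int, str]]:
    """Try all 26 keys and keep those giving a readable plaintext."""
    candidates = []
    for key in range(26):
        guess = shift_decrypt(ciphertext, key)
        if looks_readable(guess):
            candidates.append((key, guess))
    return candidates

ciphertext = shift_encrypt("ATTACK AT DAWN", 3)
print(brute_force(ciphertext))
```

With only 26 keys, even this short message leaves a single readable candidate, which is the effect the summary describes.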
The unicity distance is the message length at which a key search leaves only one readable plaintext; it can be computed from the key length and the redundancy of the language. Compression of messages (before encryption) increases the unicity distance and makes attacks more difficult.

Chapter 2: Data Encryption Standard

The Data Encryption Standard (DES) is the most influential symmetric encryption algorithm. Its use will decrease over time, but it remains of interest to cryptographers because it contains many important principles and has stimulated ingenious attacks. DES manipulates parts of the data by a bitwise XOR with a bit string that depends on the key and on the rest of the data. This happens in 16 rounds; the data blocks are 64 bits and the key has 56 bits.

A very powerful attack, differential cryptanalysis, was developed to attack DES. This attack considers two data blocks and the difference in their intermediate results. Though it was developed against DES, it cannot crack DES, because the designers of DES anticipated the attack and protected DES very well against it.

A much simpler attack is possible against DES: the number of keys is so small that DES is vulnerable to a brute force attack. A key searching machine was demonstrated to the public in 1998, but it is considered likely that such machines were built in secret long before. DES is still used, but in its triple version, where data is encrypted using three successive keys. A brute force attack is then infeasible.

Chapter 3: Symmetric Encryption Continued

Symmetric encryption has many applications and many attacks are possible; some are shown in this chapter. Trying all keys, as in a brute force attack, is usually too costly to do during an attack. Hellman’s attack offers the possibility of doing most of this work as a pre-computation, before the actual attack begins. Large tables are set up with information about many keys, and the actual attack searches these tables in a smart way.
The more memory is available, the faster the attack will be, and the tables can be used for many attacks (on the same system).

In multiple encryption, data is encrypted several times with the same algorithm, but with different keys, thus enlarging the effective key length. Because multiple encryption allows a special attack, the meet-in-the-middle attack, it is less effective than it appears. Doubling the effective key length requires triple encryption, and tripling the effective key length requires five-fold encryption.

The cryptographic algorithm used in mobile phones (GSM) is A5/1, a stream cipher. An effective attack on this algorithm is known. The attacker must know a few seconds of the unencrypted conversation, and this is very difficult to achieve in practice. Thus the attack is probably not practical yet, but its existence is considered a severe weakness.

Key management is a difficult matter for symmetric cryptography. The communicating parties must share a key, without this key being known to other parties. In various systems, the keys used are derived from master keys using additional information. Another well-known solution to the key problem is the Diffie-Hellman protocol.

Chapter 4: Numbers: Properties

Because modern cryptographic systems rely largely on arithmetic, studying numbers and their properties is important for cryptography. The well-known number systems (integers, fractions, or real numbers) have important shortcomings for use in cryptography. The most important defect of the real numbers is that most functions are easy to invert. Consequently, an attacker who receives a cipher text Y and knows how it was computed from the plaintext X is able to invert the computation and retrieve X from Y. Protection of the plaintext thus requires that the encryption algorithm is kept secret, a requirement that contradicts the aims of public key cryptography.
For this reason, cryptographic algorithms compute in modular arithmetic. A fixed number m is chosen as the modulus, and after each operation the result is divided by m and the remainder is taken. Modular number systems are finite (there are only the numbers 0 through m-1). Even though m is large (typically 155 to 310 digits), all calculations can be carried out exactly, without rounding errors.

Furthermore, in modular arithmetic there are functions that are easy to compute but whose inverse is hard to compute. The best known of these functions is exponentiation. It is easy to compute a power, but it is hard to compute a higher-order root or a logarithm. Chapter 5 will show how these operations can be used to construct public key cryptographic algorithms.

Sometimes modular computing is accelerated using the Chinese Remainder Theorem. If the modulus is composite, say m = p · q, the computation can be carried out separately with p and q as moduli, and the results combined into the final result. The computation is faster because the operations are performed for smaller moduli, and some operations can only be carried out for a prime modulus.

Chapter 5: Numbers and Public Key Cryptography

Some operations on numbers can be computed efficiently and others cannot. Computing time for addition, multiplication, division, and exponentiation is polynomial in the length of the number; that is, if the number is expressed in k bits, the computing time is bounded by a polynomial in k, such as k^2 or k^3. For the inverses of exponentiation, extracting roots or finding logarithms, this is not the case. The computing time of the best-known algorithms is expressed as a kth power, as in a^k. This means that for numbers of appropriate size, powers can be computed, but roots and logarithms cannot.
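As a small illustration of why modular powers are cheap to compute, and of the Chinese Remainder Theorem speed-up mentioned in the summary of Chapter 4, here is a hedged Python sketch. It is not the book's code: the function names are assumptions, the square-and-multiply loop is the standard textbook method, and the recombination step uses Python's built-in `pow(q, -1, p)` (available from Python 3.8) for the modular inverse.

```python
def mod_exp(base: int, exponent: int, modulus: int) -> int:
    """Square-and-multiply: one squaring per exponent bit, so the
    number of multiplications is linear in the bit length k."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                  # current bit is 1: multiply
            result = (result * base) % modulus
        base = (base * base) % modulus    # square for the next bit
        exponent >>= 1
    return result

def crt_combine(r_p: int, r_q: int, p: int, q: int) -> int:
    """Return the unique x in [0, p*q) with x = r_p (mod p) and
    x = r_q (mod q), for distinct primes p and q."""
    q_inv = pow(q, -1, p)                 # inverse of q modulo p
    h = (q_inv * (r_p - r_q)) % p
    return r_q + h * q

def mod_exp_crt(base: int, exponent: int, p: int, q: int) -> int:
    """Compute base**exponent mod p*q via the two smaller moduli."""
    return crt_combine(mod_exp(base, exponent, p),
                       mod_exp(base, exponent, q), p, q)

# Toy parameters (assumed for illustration; real moduli have hundreds of digits).
p, q = 61, 53
print(mod_exp(65, 17, p * q), mod_exp_crt(65, 17, p, q))
```

Both calls print the same value; the second one works only with the smaller moduli p and q and combines the two results afterwards.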
Complexity theory is the branch of computer science that studies which problems can or cannot be computed efficiently, including the question whether polynomial algorithms for roots and logarithms might be found in the future. The complexity of these problems is as yet an open problem. Finding such algorithms has not been proven impossible, although experts consider it unlikely.

In the number theoretic cryptosystems, the sender interprets the message as a number x and carries out some arithmetic operations (most often including exponentiation). The result y is the encrypted message. The attacker sees y and knows how it is computed from x, but because of the exponentiation, finding x requires the computation of a root or logarithm and cannot be done efficiently. The receiver possesses some additional secret numbers, which allow him to find x from y efficiently.

The best-known algorithms of this sort are RSA (based on higher-order roots) and ElGamal (based on logarithms). Because arithmetic on large numbers takes much time, these algorithms are often combined with symmetric cryptography when long messages are sent. This happens for example in the popular email encryption program Pretty Good Privacy.

Chapter 6: Digital Signatures

The digital signature ensures authenticity and non-repudiation of information; this means that the origin of information can be verified by all parties. Signatures are used for key certificates and electronic money, but also for signing emails.

The best-known implementations are the RSA signature and the ElGamal signature, derived from the cryptosystems with the same names. For RSA, verifying a signature is much faster than signing, while for ElGamal the verification is somewhat more expensive than signing. The signatures can be shorter for ElGamal, which makes the ElGamal system more suitable for smart cards.

To sign long messages, a hash function is used that maps any message to a bit string of fixed length.
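The hash-then-sign idea can be sketched with textbook RSA. This is an illustration under loud assumptions, not the book's scheme: the modulus N = 61 · 53 = 3233 with exponents E = 17 and D = 2753 is a toy key far too small for any security, SHA-256 from Python's `hashlib` is chosen only as an example hash function, and reducing the hash modulo N is needed only because the toy modulus is so small.

```python
import hashlib

# Toy RSA key (assumption for illustration): N = 61 * 53,
# and E * D = 1 modulo (61 - 1) * (53 - 1) = 3120.
N, E, D = 3233, 17, 2753

def fingerprint(message: bytes) -> int:
    """Map a message of any length to a fixed-length hash value."""
    digest = hashlib.sha256(message).digest()   # always 32 bytes
    # Reduced modulo N only because the toy modulus is tiny.
    return int.from_bytes(digest, "big") % N

def sign(message: bytes) -> int:
    """Sign the fingerprint, not the message, with the secret exponent."""
    return pow(fingerprint(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Anyone can check the signature with the public exponent."""
    return pow(signature, E, N) == fingerprint(message)

long_message = b"a long contract text " * 1000
signature = sign(long_message)
print(verify(long_message, signature))
```

However long the message is, only one modular exponentiation on the short fingerprint is needed for signing, and one for verifying.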
It must be infeasible to recover a message from its bit string, or to find two messages that are hashed to the same bit string. Instead of the message, the bit string (fingerprint or hash value) of the message is signed.

Sometimes blinding is used in combination with signatures. A party signs a message prepared by another party, without seeing it. Blinding is used in various cryptographic protocols, for example to enforce anonymity.

Chapter 7: Identification

Identification means that a person (or party) proves its identity or its right to use something. Identification is one of the most important applications of cryptography. Cryptographic identification always uses a secret that must be known by the person identifying himself. A problem with this is that showing the secret decreases its value, because after the identification an eavesdropper or the verifier will also know the secret. Sometimes this is acceptable (if the verifier is trusted); sometimes cryptographic solutions are needed. Protection against eavesdroppers can be obtained by using one-time passwords, or by not sending the secret, but only using it in some computation and sending the result.

Identification without revealing any information to the verifier is possible using a zero knowledge proof. Here the person sends a number, after which the verifier responds with a challenge and the person must give a reply. The person gives only one reply (for the actual challenge), but because he does not know the challenge beforehand, a single good answer gives confidence that he actually knows the answers for all possible challenges. Zero knowledge proofs are often used in cryptographic protocols because a party can prove that it follows a protocol fairly, without revealing its secret information.

In recent years there has also been increased interest in biometric identification. Here some physical characteristic of a person is sampled and compared to a registered template.
The fingerprint is especially popular because it is so easy to sample.

Chapter 8: Secret Sharing and Threshold Cryptography

Secret sharing schemes are used in cryptography to protect vital information against misuse by one or several persons. The information is shared over various parties in such a way that at least a certain number of the shares is necessary in order to retrieve or use the secret. An important application is threshold cryptography, where a decryption key is shared in this way. Decrypting or signing a message then requires the presence (and consent) of a number of people.

Addition scheme and polynomial scheme. Two sharing schemes are treated in this chapter, the addition scheme and the polynomial scheme. The first is quite simple to understand and implement, but does not allow more than the necessary number of shares to be distributed. Consequently, all distributed shares are needed to use the secret, which makes the scheme a veto scheme. The polynomial scheme allows an arbitrarily large number of shares to be created and distributed, independent of the number of shares needed to use the secret. The polynomial scheme is a threshold scheme.

Verification. Secret sharing schemes can be extended with additional information that allows verifying whether a shareholder uses his share in a fair way, so that cheating is detected. There are two important verification techniques. A public share is information computed from the share and published; it allows verifying computations done with the secret share. A zero knowledge proof allows proving that a player is fair, without revealing his share.

The use of verification is more interesting for threshold schemes than for veto schemes. In a veto scheme, verification only protects against cheaters afraid of being exposed. But a shareholder may, in important cases, “sacrifice” himself by refusing to participate, or by sending faulty values. The shareholder will be caught, but the secret is lost.
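The two sharing schemes named above can be sketched as follows. This is a minimal illustration, not the book's presentation: the prime 2^61 - 1 used as field size, the function names, and the use of Python's `secrets` module for randomness are all assumptions, and the polynomial scheme is written in the usual Shamir form with Lagrange interpolation at x = 0.

```python
import secrets

PRIME = 2**61 - 1   # assumed field size (a Mersenne prime); secrets must be smaller

def additive_shares(secret: int, n: int) -> list[int]:
    """Veto scheme: n random-looking numbers that sum to the secret."""
    shares = [secrets.randbelow(PRIME) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def additive_recover(shares: list[int]) -> int:
    """All shares are needed; a single missing one blocks recovery."""
    return sum(shares) % PRIME

def shamir_shares(secret: int, threshold: int, n: int) -> list[tuple[int, int]]:
    """Threshold scheme: points on a random polynomial of degree
    threshold - 1 whose constant term is the secret."""
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def f(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):        # Horner evaluation
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, f(x)) for x in range(1, n + 1)]

def shamir_recover(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 from at least `threshold` shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = shamir_shares(123456789, threshold=3, n=5)
print(shamir_recover(shares[:3]), shamir_recover(shares[2:]))
```

Any three of the five polynomial shares give back the secret, whereas the additive scheme needs every share it handed out.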
In a threshold scheme the secret will remain available even if some shareholders cheat, provided that sufficiently many fair shareholders remain.

Threshold RSA and Threshold Signatures. This chapter treats threshold cryptography for the ElGamal system, and not without reason. Threshold cryptography in the RSA system is much more difficult. Sharing the secret decryption exponent d would require revealing to all shareholders the modulus under which this secret is computed, and unfortunately knowing this modulus already allows computing the secret. Signatures are also more complicated. It is quite easy to compute an ElGamal signature from shares of the secret key, but the partial results would allow the shares to be recovered, and then any shareholder could compute the secret key by itself.

Chapter 9: Applications

In this chapter we study the authentication system Kerberos, the mobile communication system UMTS, and the Secure Socket Layer (SSL) protocol for the protection of Internet sites.

Most large scale cryptographic systems are completely or largely built on symmetric cryptography. In mobile phones, but also in heavily loaded Internet servers, requirements on speed are too severe to be met by asymmetric cryptography. The current symmetric algorithms can be considered safe, but studying these systems reveals some inherent shortcomings of symmetric cryptography:

1. Authentication or identification. The key used for authentication is also stored in the authentication device (or in some other place), allowing misuse. The user must trust the central authority. In identification using asymmetric cryptography, the user is the only party knowing its secret key, and misuse is impossible.

2. Interaction with third party. Authentication protocols based on symmetric cryptography require that interacting parties connect to the central authority during the authentication (such as the Kerberos server, or the Home Environment of the subscriber). This interaction makes the authentication more expensive, and implies that authentication is impossible when this server is down or unreachable. Secure Socket Layer uses asymmetric cryptography for identification. The necessary certificates are issued by a third party, but its participation is not required during authentication.

3. Integrity and non-repudiation. Message Authentication Codes allow a receiver to verify that the message indeed comes from the intended sender, but this cannot be proved to a third party. A conflict about the content of the exchanged messages is then not easy to resolve. Because Secure Socket Layer claims to facilitate e-commerce, the lack of non-repudiation must be considered an omission.

Because cheap hardware is getting more powerful, it is to be expected that the use of asymmetric cryptography will increase in the future. A lot of research is also being done on faster asymmetric cryptography, for example, based on elliptic curves.

Chapter 10: Regulation of Cryptography

Cryptography and law enforcement are often entangled in a love-hate relationship. Cryptography can prevent crimes and helps police fight crime by keeping their operations secret from criminals. On the other hand, criminals employ cryptography to plan operations or to prevent the police from gathering evidence.

Sometimes legal measures are taken, like restriction of the distribution of cryptographic products, restriction of the use of cryptography, or the obligation to deposit keys with the public administration (key escrow). Over the past years, governments have become aware that these measures are ineffective (because they are easy to circumvent), expensive, a danger to the security of systems, and a violation of civil rights. Consequently, measures against cryptography are being relaxed in all European countries where they have been in force. In The Netherlands and Belgium there are no restrictions on the distribution and use of cryptography.
Only export outside the Benelux countries requires a permit.

Chapter 11: Smart Cards

Smart cards are a very helpful tool for users of cryptographic systems, executing the protocols for them, and also an important technique to improve the security of these systems. Data (such as secret keys or a cash balance) stored in smart cards can only be used in the way anticipated by the designer. Use of the keys for other purposes or copying of data is considered to be impossible.

The protection against misuse or leaking of information offered by smart cards is significant, but not absolute. A card reader can be manipulated to sample electrical data, such as the current consumption of the card, in order to retrieve secret information from the card. It is also possible to open a card and read or modify the data inside.

Chapter 12: Electronic Money

Money plays an important role in our society, but has done so in many different ways throughout history. Nowadays money is a token to facilitate payments, namely, to move amounts from one bank account to another. Bank notes and coins enjoy a sharp legal definition and protection, not shared with electronic money.

Electronic money and paying with it can be implemented with cryptographic means. Electronic money, like cash, consists of discrete objects, each with a particular value. Like physical money it has a mark of genuineness, namely a digital signature of the bank. Unlike its physical counterpart, it is extremely easy to copy, including the mark of genuineness, which creates the problem of potential double spending.

Double spending is easy to track (after the fact) in a non-anonymous system. The payer identifies himself, and a certificate of transfer of coin ownership is signed with every payment. Double spending can be made impossible if a payment is online, and the coin is checked with the bank during the payment.
In anonymous offline payment systems, the problem is solved by having the payer give a share of his identity with every payment. A single payment respects the anonymity, but double spending a coin reveals two shares, and thereby the identity itself. Advanced identity sharing, verification, and zero knowledge proofs are necessary to fulfil all requirements of security and anonymity.

Goodwill cards, telephone cards, and the Dutch Chipknip use a much simpler and cheaper technique. Money is not represented by distinct objects; instead, the card maintains a balance register internally. Cryptography supports the identification and communication between the parties.

Chapter 13: Secure Computing

By secure computing we mean that a computation is carried out while all inputs, and intermediate and final results, remain secret to some of the participants. Computing a function, where only the result is made known to all participants, is the most common example.

Sometimes a protocol can be designed specifically for a given problem; the millionaires’ problem and the Christmas gift problem are discussed in the chapter. Each problem would require intensive study, and the solutions are not very flexible with respect to small changes in the problem statement.

Several general techniques for solving these problems are known, based on either homomorphic secret sharing or homomorphic encryption. The inputs are protected using either a sharing scheme or encryption, after which the computation is carried out on the shares, or the cipher texts, respectively. Only the final result is reconstructed, or decrypted, respectively. The chapter demonstrates how these techniques can be used for certain classes of functions.

The techniques can be extended generically to situations where participants may lie in order to influence the outcome of the protocol. Secret sharing schemes can be made verifiable with public shares, which must of course be modified with every step of the computation.
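Computing on shares can be illustrated with the simplest case, a sum. The sketch below is an assumption-laden toy, not a protocol from the book: it simulates all parties in one process, uses additive sharing modulo an assumed prime, omits the verification step with public shares, and all names are made up for illustration.

```python
import secrets

MODULUS = 2**61 - 1   # assumed modulus; the true sum must be smaller than this

def share(value: int, n: int) -> list[int]:
    """Split a private input into n additive shares modulo MODULUS."""
    parts = [secrets.randbelow(MODULUS) for _ in range(n - 1)]
    parts.append((value - sum(parts)) % MODULUS)
    return parts

def secure_sum(inputs: list[int]) -> int:
    """Each party shares its input among all parties; each party adds the
    shares it received; only these partial sums are combined."""
    n = len(inputs)
    # distributed[i][j] is the share of party i's input sent to party j.
    distributed = [share(value, n) for value in inputs]
    # Party j computes on shares only and never sees another party's input.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % MODULUS
        for j in range(n)
    ]
    # Only the final result is reconstructed.
    return sum(partial_sums) % MODULUS

print(secure_sum([3, 5, 9]))
```

The sharing is homomorphic for addition: adding shares position by position yields shares of the sum, so the individual inputs never have to be reconstructed.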
When homomorphic encryption is used, the possibilities for cheating are limited because the computation itself is carried out in public.

An important instance of secure computing is the organization of secret elections. The chapter discusses an old protocol based on anonymous channels. Modern protocols, like CyberVote and VoteHere, use homomorphic encryption.

Chapter 14: The Discrete Logarithm

There is a variety of cryptographic techniques that are based on the discrete logarithm problem. Besides the ElGamal algorithm and other protocols based on exponentiation of numbers, these include elliptic curve cryptography and the recent system XTR. This chapter introduces the mathematical notion of a group and explains the common structure behind all these cryptographic systems.

We consider algorithms to compute the discrete logarithm. They can be used to attack logarithm-based systems, so the parameters of these systems should be chosen such that these algorithms require too much time for an attack. We distinguish between group algorithms and special algorithms.

1. Group algorithms. These algorithms only use the common group structure of the cryptographic systems, which makes them applicable against all logarithm-based cryptography, regardless of the underlying structure. Fortunately, for a group of size q (chosen to be a prime number), these algorithms all take time proportional to the square root of q, which makes attacks with them infeasible.

2. Special algorithms. These algorithms use not only the group structure, but also other properties of the numbers involved. A successful special algorithm can therefore only be employed against one particular cryptographic system. The most successful one is the Index Calculus, which allowed moduli of 512 bits to be attacked successfully (in 2000).

About the Author:

Gerard Tel was born on August 9, 1962, in Amsterdam. He studied Mathematics and Computer Science at Utrecht University from 1981 to 1986.
From 1986 to 1989 he was employed as a research assistant at Utrecht University, financially supported by the Netherlands Organization for Scientific Research (NWO). He received his Ph.D. in Computer Science in 1989, after which he spent six months at Carleton University (Ottawa, Canada) as a research fellow and summer lecturer. In October 1989 he returned to Utrecht University as a senior researcher and lecturer in the area of distributed algorithms. Gerard Tel has since taught on many algorithmic topics, including graph algorithms, cryptography, and compression.

Gerard Tel is the author of several books:
Topics in Distributed Algorithms (Cambridge University Press, 1991).
Introduction to Distributed Algorithms (Cambridge University Press, 1994/2000).

The author was an organizer of several international conferences:
Distributed Algorithms, 8th International Workshop, Terschelling 1994, with Paul Vitányi. Proceedings were published by Springer-Verlag as Lecture Notes in Computer Science vol. 857.
SOFSEM’99: Theory and Practice of Informatics, Milovy (Czechia) 1999, with Jan Pavelka. Proceedings were published by Springer-Verlag as Lecture Notes in Computer Science vol. 1725.

Married in 1990, the author is a happy husband and father of three children. The Tel family belongs to the Reformed Church in The Netherlands.

Email: gerard@cs.uu.nl
Website: http://www.cs.uu.nl/~gerard/