Digital Signatures from Probabilistically Checkable Proofs

by Raymond M. Sidney
A.B., Harvard College, 1991

Submitted to the Department of Mathematics in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Mathematics at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY, June 1995.

© Massachusetts Institute of Technology 1995. All rights reserved.

Author: Department of Mathematics, May 5, 1995

Certified by: Silvio Micali, Professor of Computer Science and Electrical Engineering, Thesis Supervisor

Certified by: Hartley Rogers, Jr., Professor of Mathematics, Department Advisor

Accepted by: David A. Vogan, Professor of Mathematics, Chairman, Committee on Graduate Studies

Digital Signatures from Probabilistically Checkable Proofs
by Raymond M. Sidney

Submitted to the Department of Mathematics on May 5, 1995, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Mathematics

Abstract

We prove a very strong soundness result for CS proofs which enables us to use them as efficient, noninteractive proofs of knowledge for NP statements. We then apply CS proofs to digital signature schemes, obtaining a general method for modifying digital signature schemes so as to shorten the signatures they produce, for sufficiently large security parameters. Under reasonable complexity assumptions, applying our methods to factoring-based signature schemes yields schemes with Θ(k^{1/2} log^{-1/2} k) bits of security from Θ(k)-bit signatures; asymptotically, this compares favorably with the Θ(k^{1/3} · log^{2/3} k) bits of security currently obtainable from traditional factoring-based Θ(k)-bit signatures. Our technique can also be used to shorten the public keys needed to attain a given level of security.
Thesis Supervisor: Silvio Micali Title: Professor of Computer Science and Electrical Engineering Acknowledgments I'd like to thank some of the people who helped to make me the man I am today. From a purely physical perspective, I guess that would be my parents, my grandparents, my great-grandparents, and so on, plus any role models I've had in mind when I go to the weightroom. More importantly, from a mental perspective, there're all the people who've had the (possibly undesired) opportunity to mold my intellectual so-called capabilities and my mental traits. Of course, my immediate forebears and other relatives all figure prominently here. Thanks for everything, Mom, Dad, Dan, Larry, and Jenny! You guys have supplied me with a lot of the motivation needed to get where I am today (wherever that is). My wife, Satomi Okazaki, has also been a big influence on me. She's been a great source of support and encouragement, and she's helped me a whole lot with my defense and other pleasant parts of grad school. My advisor (well, "thesis supervisor," technically), Silvio Micali, has been a big source of information, discussion, and inspiration. He's the guy who really got me interested in theoretical cryptography in the first place. Thanks for everything, Silvio! The other members of my thesis committee, Hartley Rogers, Jr. and Mauricio Karchmer, have not only helped me make this document more understandable than it would have been without them, but have also been willing to talk with me about various half-baked complexity-theoretic notions I've gotten during my years at MIT. 
Other professorial types who I feel have had a lot of influence on my thoughts include (in roughly reverse temporal order) Dan Stroock, whose course on stochastic processes almost made a probabilist out of me; Tom Leighton, whose class on parallel algorithms gave me the idea of joining MIT's Theory of Computation group; Sy Friedman, who teaches a mean introductory logic class; Alexander Kechris, whose mathematical analysis class gave me some notion of what a rigorous proof should be; and Yaser Abu-Mostafa, who gave me a chocolate bar and taught me all about information theory. Thanks also to Ron Rivest for the occasional little talk, particularly the one in which he suggested the contents of section 7.4. Some non-professorial (at present) friends of mine that I'd like to thank are such luminaries as the Reverend Stanley F. Chen; Andrew "Chou-man" Chou; Andreas Coppi; my office-mate and colleague, Rosario Gennaro, who (among other things) has given me a lot of advice and feedback on my Meisterwerk; my office-mate, Marcos Kiwi, who has been very willing to share his knowledge in matters complexity-theoretic; David Moews; Bjorn Poonen; Alex Russell; Dan Spielman; Ravi "Koods" Sundaram; Marc Spraragen; and Ethan Wolf. Finally, thanks to Phyllis Ruby for her patient assistance in dealing with MIT's considerable bureaucracy and helping me in many other ways. Thanks also to Anne Conklin, Bruce Dale, Be Hubbard, and David Jones. I am extremely grateful to the Fannie and John Hertz Foundation for funding me throughout my graduate career with a very excellent Fellowship for Graduate Study. Without their support, I'd be a poor soul.

This thesis is dedicated in fond memory of Hu-bang.

Contents

1 Introduction
1.1 A cursory look at proofs of knowledge
1.2 Contributions of this thesis

2 Preliminaries
2.1 Notation
2.2 NP
2.3 Computing with circuits
2.3.1 Special nodes
2.3.2 Execution of circuits
2.3.3 Subcircuits
2.3.4 Random variables and events

3 Probabilistically checkable proofs
3.1 Definitions and results
3.2 PCPs as proofs of knowledge
3.3 Credits

4 Introducing CS Proofs
4.1 The basic idea behind CS proofs
4.1.1 Committing with mailboxes
4.1.2 Committing with random oracles
4.1.3 Decommitting part of a committed proof
4.2 3-round pseudo-CS proofs
4.3 CS proofs, at last

5 CS proofs of knowledge
5.1 What about soundness?
5.2 Setting the scene
5.2.1 f-parents and implicit proofs
5.2.2 Anthropomorphization of circuits
5.2.3 Other events in Ω_C and Ψ_C
5.3 Strong computational soundness
5.3.1 The heuristics behind our proof
5.3.2 The proof
5.4 Extracting NP witnesses

6 Defining digital signature schemes
6.1 The purpose of digital signature schemes
6.2 Digital signature schemes without security
6.3 Security of digital signature schemes
6.4 Signing-oracle nodes
6.5 Forging circuits and security levels
6.6 An example of a digital signature scheme

7 Derived digital signature schemes
7.1 Defining derived digital signature schemes
7.2 Security of derived signature schemes
7.3 Signature length for signature schemes
7.4 Public key length for signature schemes
7.5 Practical considerations
7.5.1 Implementing random oracles
7.5.2 Asymptotics

8 Conclusion

Chapter 1

Introduction

Much of computational complexity theory is related to proof systems of various sorts. The class NP (languages recognizable in nondeterministic polynomial time) has long been characterized as the set of languages whose members have short noninteractive proofs of membership. Relatively recently, it has been shown that PSPACE = IP (the set of languages recognizable in polynomial space coincides with the set of languages with short interactive proofs; see Lund, Fortnow, Karloff, and Nisan [20] and Shamir [27]) and NEXPTIME = MIP (the set of languages recognizable in nondeterministic exponential time coincides with the set of languages with short multiple-prover interactive proofs; see Babai, Fortnow, and Lund [5]). Also recently, in [22], Micali introduced "CS proofs," a proof system which is in some ways more practical than previous proof systems. CS proofs are based on the probabilistically checkable proofs of Babai, Fortnow, Levin, and Szegedy [4] and Feige, Goldwasser, Lovasz, Safra, and Szegedy [13] in a way which we shall describe later. In all of these proof systems, we have the model of a prover (or several provers) trying to convince a verifier of some fact.

1.1 A cursory look at proofs of knowledge

At the same time that complexity theorists have been busy trying to use proof systems to prove that various complexity classes are equal or unequal, cryptographers and other researchers have been trying to put proof systems of various sorts to practical uses.
Uses of zero-knowledge proof systems for NP languages in different protocols reveal that in addition to zero-knowledge proofs of membership, a second type of zero-knowledge proof exists: zero-knowledge proofs of knowledge. These are proof systems in which one party in a protocol wishes to "prove" to another party that it "knows" something (in particular, we imagine that the Prover wishes to prove to the Verifier that it knows a short proof, or NP witness, of some fact). See Tompa and Woll [30], Feige, Fiat, and Shamir [12], and De Santis and Persiano [9] for more information about zero-knowledge proofs of knowledge.

Now, it is more or less trivial to see exactly when and how normal NP proof systems for language membership can also be used as proofs of knowledge. And since probabilistically checkable proofs are really just NP witnesses which have been encoded for error-correction, the same statement holds for probabilistically checkable proof systems: a given probabilistically checkable proof essentially contains the NP witness that it proves knowledge of (we shall explain all this in much more detail later on). However, for CS proofs of knowledge, as with zero-knowledge proofs of knowledge, the situation is more complicated. This is because these types of proofs do not explicitly contain the NP witness whose existence they purport to prove. Indeed, zero-knowledge proofs of knowledge should contain no information about what the NP witness is (in a technical sense); and CS proofs, as we shall see later, can be much too short to actually reveal much information about an NP witness.

1.2 Contributions of this thesis

In this thesis, we define what it means to use CS proofs as proofs of knowledge. We prove that a property which we call strong computational soundness of CS proofs of knowledge holds. We are guided in doing this not only by results in the area of zero-knowledge proofs, but also by Micali's original intent in developing CS proofs.
In particular, if we downsize CS proofs so that they only prove NP statements, we see that Micali wanted CS proofs to have the following two basic properties (among others):

1. It should be easy for the Prover to convince the Verifier of a true fact, if the Prover has an NP witness of that fact.

2. Unless the Prover has tremendous computational power, it should be very unlikely that it can convince the Verifier of any false fact.

As we see, these properties, while desirable, do not suffice for using CS proofs as proofs of knowledge. It could conceivably be easy to provide CS proofs of true statements, even if the Prover doesn't know "why" the statements are true. Our proof of strong computational soundness of CS proofs of knowledge can be viewed as a kind of converse to property 1. Essentially, we show that:

It is not much easier to find a CS proof of a statement than it is to find an NP witness of that statement.

In addition to giving meaning to CS proofs of knowledge, we give an application of CS proofs of knowledge. We show how to use them to make existing digital signature schemes¹ more efficient. To do this, we present a general transformation which modifies digital signature schemes. In essence, we take a scheme in which one signs a message with some particular kind of signature string, and we change it into a scheme in which one signs a message by giving a CS proof of knowledge of that signature string. Thus, instead of supplying the original signature outright, the Signer supplies a different string, one which would be computationally difficult to find without knowledge of the original signature. As we shall see, the benefit of signing by providing a CS proof of knowledge of a signature, rather than by giving an actual signature, is that the proofs of knowledge can be much shorter than the strings that they prove knowledge of. Hence our approach can lead to digital signature schemes with smaller signature lengths.
¹Digital signature schemes are a way of providing authentication for electronic messages. We shall discuss them at length in chapter 6.

Chapter 2

Preliminaries

In this chapter, we shall establish some notations, definitions, and conventions which we shall make use of. Once we have done this, we can enter into the technical matters at hand.

2.1 Notation

By "1^k" we mean the number k written in unary, that is, a string of k 1's. If S, T, ... are probabilistic algorithms, and p(x, y, ...) is a predicate, then we denote by Pr(p(x, y, ...) : x ← S; y ← T; ...) the probability that p(x, y, ...) holds after the execution of the assignments x ← S, y ← T, ..., respectively. Furthermore, if H is a finite set, we denote by x ∈_R H the act of setting x equal to a randomly chosen element of H.

If v ∈ {0,1}* is a binary string, we denote by v^R the string obtained by reversing the order of the bits in v, and we denote by |v| the length of v. In addition, if v is not the empty string, we define ṽ to be the string obtained by performing a logical NOT operation on the last (least significant) bit of v. For example, 01011^R = 11010, |10101| = 5, and ṽ = 01010 when v = 01011.

If A and B are events (i.e., subsets of a probability space), then A^c is the complement of A and A \ B = A ∩ B^c. ℕ = {0, 1, 2, ...}.

We shall make free use of O(·), o(·), and related notations. Recall that if D is an infinite subset of ℕ, and f(·) and g(·) are nonnegative real-valued functions such that D = domain(f) ⊆ domain(g), then we write:

* f = o(g) if lim_{n→∞} f(n)/g(n) = 0 (the limit is taken over n ∈ D).
* f = O(g) if there exists a constant c > 0 such that for all sufficiently large n ∈ D, f(n) ≤ c · g(n).
* f = Ω(g) if there exists a constant c′ > 0 such that for all sufficiently large n ∈ D, f(n) ≥ c′ · g(n).
* f = ω(g) if lim_{n→∞} f(n)/g(n) = ∞ (the limit is taken over n ∈ D).
* f = Θ(g) if f = O(g) and f = Ω(g).

We can also write statements like "f(n) = g(ω(h(n)))," which means that there is some function j(n) such that j(n) = ω(h(n)) and f(n) = g(j(n)). In addition, we shall feel free to extend these asymptotic notations to multivariable functions whose domain D is such that for all n ∈ ℕ, D contains elements all of whose coordinates exceed n (if this condition on D doesn't hold, we cannot interpret the notion of the function's arguments all going to infinity).

If D ⊆ ℕ, then any function f(·) : D → ℕ such that f(k) is computable in time polynomial in k and f(k) = k^{O(1)} is called a length function for D.
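The string notations above are easy to exercise concretely. Here is a minimal Python sketch (the helper names `unary`, `reverse`, and `flip_last` are ours, not the thesis's), reproducing the worked examples from the text:

```python
def unary(k):
    """The string 1^k: the number k written in unary."""
    return "1" * k

def reverse(v):
    """v^R: the bits of v in reverse order."""
    return v[::-1]

def flip_last(v):
    """Logical NOT applied to the last (least significant) bit of v;
    defined only for nonempty strings."""
    assert v != ""
    return v[:-1] + ("0" if v[-1] == "1" else "1")

# The examples from the text:
assert unary(5) == "11111"              # 1^5
assert reverse("01011") == "11010"      # 01011^R = 11010
assert len("10101") == 5                # |10101| = 5
assert flip_last("01011") == "01010"    # NOT the last bit of 01011
```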
We can also write statements like "f(n) = g(w(h(n)))," which means that there is some function j(n) such that j(n) = w(h(n)) and f(n) = g(j(n)). In addition, we shall feel free to extend these asymptotic notations to multivariable functions whose domain D C IN is such that for all n E N, D contains elements all of whose coordinates exceed n (if this condition on D doesn't hold, we cannot interpret the notion of the function's arguments all going to infinity). If C IN, then any function f(.) : polynomial in k and f(k) = k0°( 2.2 ) IN such that f(k) is computable in time is called a length function for C. NP Let WL(.) be a length function for IN, and let P(., ) be a deterministic polynomial time predicate (i.e., algorithm outputting either 0 or 1) such that P(x, w) = 0 whenever Iwl f WL(lxl). We define the language L(P, WL) to be the following set of strings: L(P, WL) = {x E {0,1}* : 3w such that P(x,w) = 1}. 13 We say that NP is the set of all such languages L(P, WL). For a given x E L(P, WL), any w such that P(x, w) = 1 is called an NP witness that (3w: P(x,w) = 1) (the name "WL(.)" was chosen to stand for "witness length"). Intuitively, NP is the class of languages whose members all have proofs of mem- bership which can be quickly verified by a deterministic Turing machine. Suppose we have a Prover and a Verifier, and we fix an NP language L(P, WL). If x E L(P, WL), then the Prover can clearly convince the Verifier of this fact by sending it an NP witness that (3w: P(x, w) = 1)- upon receipt of any such w, the Verifier can quickly check that it is a genuine NP witness. However, the simple protocol above actually convinces the Verifier of more than just the fact that x E L(P, WL) (i.e., the fact that (3w : P(x,w) = 1)). It also convinces the Verifier that the Prover knows an NP witness that (3w: P(x, w) = 1). It may appear that this is a meaningless distinction (and it is, in some cases). But consider the following: EXAMPLE. 
Let h : {0,1}* → {0,1}* be a length-preserving, bijective mapping which can be computed in polynomial time, but which cannot be inverted quickly. Let WL(n) = n, and let P(x, w) = 1 iff x = h(w). Then it is easy to see that L(P, WL) = {0,1}*. So for any x ∈ {0,1}*, the Prover need not actually do anything to convince the Verifier that x ∈ L(P, WL); in a sense, there is nothing to prove. In contrast, it's not trivial for the Prover to convince the Verifier that it knows an NP witness that (∃w : P(x, w) = 1) (although it's not especially difficult, either; it suffices for the Prover simply to send the witness to the Verifier).

In general, let L(P, WL) be any NP language, and take x ∈ L(P, WL). Any proof of knowledge of an NP witness that (∃w : P(x, w) = 1) can also be considered to be a proof that x ∈ L(P, WL); however, as our example shows, the converse does not always hold. When we consider other types of "proof systems" for NP languages, we will also have to be careful about the distinction between proofs of membership and proofs of knowledge of NP witnesses, and things will get more complicated than they are with the simple "just send over an NP witness" protocol above.

2.3 Computing with circuits

Although we shall make some use of ordinary polynomial time deterministic and randomized algorithms, our basic model of computation is a probabilistic circuit: an ordinary Boolean circuit containing special nodes which allow it to make random coin tosses. In other words, a probabilistic circuit is a directed, acyclic graph with five kinds of nodes:

1. Input nodes, with indegree 0 and outdegree at least 1.

2. Constant nodes, with indegree 0 and outdegree at least 1.

3. Gate nodes, which can be any of:
* 2-input, 1-output AND gates, with indegree 2 and outdegree at least 1.
* 2-input, 1-output OR gates, with indegree 2 and outdegree at least 1.
* 1-input, 1-output NOT gates, with indegree 1 and outdegree at least 1.

4.
Randomized nodes, with indegree 0 and outdegree at least 1, and which output either 0 or 1 with probability 1/2 each, independently from what any other nodes do.

5. Output nodes, with indegree 1 and outdegree 0.

The size of a circuit is the number of edges it has. If we give labels to its inputs and outputs (so that we can distinguish them), a probabilistic circuit is readily seen to compute a probabilistic function from its inputs to its outputs. That is, if a probabilistic circuit has a input nodes and b output nodes, then each possible input value x ∈ {0,1}^a determines a probability distribution on the set {0,1}^b of possible output values in a natural way. For convenience, we shall generally omit the word "probabilistic," and refer to probabilistic circuits simply as "circuits."

The reason circuits are interesting is that they seem more or less to capture most notions of computing in a very concrete way. In particular, if an algorithm runs in t steps, its computation can be simulated in a reasonably natural way by a circuit of size polynomial in t.

2.3.1 Special nodes

We shall also make use of more specialized circuits which possess other types of nodes (in addition to the five kinds we mentioned earlier). We view the inputs and outputs of a "special node" as being labelled, so that we can make a distinction between different input bits or different output bits of the node. If f : {0,1}^a → {0,1}^b is any probabilistic function, a circuit could have nodes for evaluating f(·); such nodes have a distinct inputs, each with exactly one edge leading into it, and b distinct outputs, each with at least one edge leading out of it. Special nodes can be much more general than this, however. A collection of several special nodes might exhibit a linked probabilistic behavior, whereby the probabilistic functions computed by the nodes are not independent.
The most general possible behavior of a collection v₁, v₂, ..., v_k of special nodes, where each v_i has a_i inputs and b_i outputs, is specified by a probability distribution on the Cartesian product F = ∏_i F_i, where F_i = {all functions f_i : {0,1}^{a_i} → {0,1}^{b_i}}.

A circuit can also have random oracle nodes. These are more or less a specific case of what we just discussed; they are nodes computing some true random function mapping {0,1}^a to {0,1}^b, for some a and b. They differ from a collection of b randomized nodes in that any two nodes for computing the same random oracle which take the same input must also have the same output. In other words, the nodes should be seen as providing oracle access to some particular randomly-chosen function. A circuit can have nodes for evaluating more than one random oracle. There are two reasons that random oracles are not quite like other (possibly linked) probabilistic special nodes:

1. Any random oracle that a circuit has nodes for evaluating is independent (in the sense of probability) of
* any other random oracles.
* the behavior of any randomized nodes.
* the behavior of any more exotic probabilistic special nodes.

2. All parties participating in a protocol have access to each random oracle.

Random oracles are therefore a source of common randomness in protocols. For a particular a-input, b-output random oracle, we envision all parties in a protocol as having access to a particular "black box"; when the protocol is begun, a specific function f : {0,1}^a → {0,1}^b is chosen uniformly at random from all such functions, and is put into the box. As we have indicated, circuits have nodes for evaluating such a random oracle; each such node will have a size "cost" of at least (a + b) associated with it (since each input and each output has at least one edge attached to it). To allow an algorithm to have access to a random oracle, we give it a special "oracle tape" which it can write its oracle queries on.
After writing a query, the algorithm then enters some special state; after one step, it leaves this state with the answer to its question written on the tape. The exact details of implementation are unimportant; all we really require is that a random oracle evaluation takes poly(a, b) steps.

2.3.2 Execution of circuits

Let C be a circuit with no input bits, containing any kinds of nodes. Make a list of all the special nodes in C, not including the random oracle nodes: we have nodes v₁, v₂, ..., v_k, where v_i has a_i inputs and b_i outputs. We recall that the behavior of all the v_i's can be specified by a probability distribution p on F = ∏_i F_i, where F_i = {all [deterministic] functions f_i : {0,1}^{a_i} → {0,1}^{b_i}}. Let's say that C has nodes for the k′ random oracles h₁ : {0,1}^{a′₁} → {0,1}^{b′₁}, h₂ : {0,1}^{a′₂} → {0,1}^{b′₂}, ..., h_{k′} : {0,1}^{a′_{k′}} → {0,1}^{b′_{k′}}, and that C has ℓ randomized nodes. Let (S_i, p_i) be the probability space of all functions h_i : {0,1}^{a′_i} → {0,1}^{b′_i} under the uniform distribution, and let (S_*, p_*) be the probability space of all ℓ-bit strings under the uniform distribution. Set Ω_C to be the product probability space

Ω_C = (F, p) × ∏_i (S_i, p_i) × (S_*, p_*).

DEFINITION: Global view. An element of Ω_C is called a global view of C.

Let Ψ_C be the set of all functions mapping the edges of C to values in {0,1}. We see that there is a natural mapping from Ω_C to Ψ_C. This induces a probability measure on Ψ_C, enabling us to view Ψ_C as a probability space. In other words, if we choose particular instantiations for all random oracles in C, and we specify behaviors for all other nodes with probabilistic behavior, we determine everything "visible" that occurs in C.

DEFINITION: Execution. An element of Ψ_C is called an execution of C.

Note that an execution of C contains exactly the information we get if we look to see what goes into and out of every node in C.
A global view of C contains all that and much more; for example, it contains every value of every random oracle in C (not just the values that C happens to evaluate).

If C is a circuit, then C can be executed by choosing an execution from the probability space Ψ_C. However, C can also be executed in a more on-line fashion. List the nodes of C in any order such that no node comes before any of its predecessors. Then we can evaluate the nodes of C one at a time, in this order, in a natural way. It's clear how to evaluate constant nodes, Boolean nodes, and output nodes. To evaluate a randomized node, output a random choice of either 0 or 1. To evaluate a random oracle node, check whether or not the same oracle query has already been asked. If so, output the same answer as was already returned; if not, output a randomly chosen binary string of the proper length. And finally, to evaluate any other special node, just choose its output according to the probability space (F, p), conditioned on what has been output previously.

The above procedure specifies precisely an execution of C (i.e., an element of Ψ_C).

2.3.3 Subcircuits

It is sometimes useful to be able to consider part of a circuit as being a circuit in its own right.

DEFINITION: Predecessor. One node of a circuit is a predecessor of another if there is a directed path from the first node to the second.

DEFINITION: Subcircuit. If v is a node of C with nonzero indegree, and v is not an input node or a constant node, then let D be the subgraph of C induced by {v} ∪ {all predecessors of v}. Obtain a new circuit Z by replacing v in D with a collection of output nodes, one node for each edge entering v. We call Z the subcircuit of C induced by v, and we write Z ⪯ C. We adopt the convention that C ⪯ C as well, since C can (informally) be thought of as the subcircuit of C induced by C's outputs.
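The on-line execution procedure of section 2.3.2 can be sketched as a single pass over a topologically ordered node list. The encoding below (triples of node name, kind, and predecessor list) and all helper names are our own illustrative choices, not the thesis's; oracle nodes here return a single bit for simplicity:

```python
import random

def execute_online(nodes):
    """One on-line execution of a circuit with no input nodes.
    `nodes` is a list of (name, kind, predecessors) triples, listed so
    that no node comes before any of its predecessors."""
    oracle = {}   # memoized random-oracle answers: query -> bit
    val = {}      # value carried on each node's output so far
    for name, kind, preds in nodes:
        ins = [val[p] for p in preds]
        if kind in ("const0", "const1"):
            val[name] = int(kind[-1])
        elif kind == "rand":          # fair coin, independent of everything else
            val[name] = random.randint(0, 1)
        elif kind == "oracle":        # the same query must get the same answer
            q = tuple(ins)
            val[name] = oracle.setdefault(q, random.randint(0, 1))
        elif kind == "and":
            val[name] = ins[0] & ins[1]
        elif kind == "or":
            val[name] = ins[0] | ins[1]
        elif kind == "not":
            val[name] = 1 - ins[0]
        elif kind == "out":           # output node: passes its input through
            val[name] = ins[0]
    return val   # an execution: the value on every evaluated node
```

Running the same node list twice will generally give different executions, but within a single execution two oracle nodes fed the same query always agree, exactly as the procedure requires.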
If we focus on some particular node v of C with nonzero indegree, and Z is the subcircuit of C induced by v, then we can view the execution of C (i.e., the choosing of an element from Ψ_C) as occurring in three stages as follows:

1. Z is executed.
2. v's output is evaluated.
3. All remaining nodes' outputs are evaluated.

This is really just a slightly different way of viewing an on-line execution of C.

2.3.4 Random variables and events

Set Σ = {0, 1, *}. We call the elements of Σ "components" in analogy with the way we call the elements of {0, 1} "bits." For example, an element of Σ^k is a "k-component string." A random variable on a probability space S (also called an S-random variable) is a function mapping S to either ℝ or Σ*. If Z ⪯ C, then any Ψ_Z-random variable can also be viewed as an Ω_Z-random variable and a Ψ_C-random variable; furthermore, any Ω_Z-random variable can also be viewed as an Ω_C-random variable.

We recall that an event in a probability space S is just a subset of S. Events can also be thought of as being predicates on S: we can say that an S-event is an S-random variable with range {0, 1}. Any event has some nonnegative probability of occurring. Since S-events are S-random variables, if Z ⪯ C, then any Ψ_Z-event can also be considered to be an Ω_Z-event and a Ψ_C-event; furthermore, any Ω_Z-event can also be considered to be an Ω_C-event.

Chapter 3

Probabilistically checkable proofs

We now introduce a new kind of proof system which has been the subject of much important research in recent years.
The setting for these proof systems is the same as for normal NP proofs of language membership (i.e., the Prover should send a single proof string which the Verifier can quickly check on its own); however, the proof string is encoded redundantly in such a way that the Verifier actually only needs to look at a little bit of it to judge whether or not it's a valid proof (consult Spielman [28] to learn about the interesting connections between probabilistically checkable proofs and error-correcting codes).

Let us clarify this. If L(P, WL) is an NP language, and x ∈ L(P, WL), then the Verifier checks a probabilistically checkable proof π that x ∈ L(P, WL) by performing the following procedure:

1. The Verifier flips some coins.

2. The Verifier uses its coin flips and the value x to determine some small number of bit positions in π to view.

3. The Verifier decides whether or not to accept π as a valid proof, based on everything it has seen so far (i.e., its coin flips; x; and some small number of bits of π).

3.1 Definitions and results

It will be well worth our while to be very precise in our discussion here of probabilistically checkable proofs. For our purposes, the definitions and results of Babai, Fortnow, Levin, and Szegedy [4] or Feige, Goldwasser, Lovasz, Safra, and Szegedy [13] would essentially suffice; however, for the sake of efficiency, we follow the later work of Arora and Safra [3] and Arora, Lund, Motwani, Sudan, and Szegedy [2].

DEFINITION: PCP(r(n), q(n)). Let r(n) and q(n) be length functions for ℕ. The class PCP(r(n), q(n)) consists of all languages L ⊆ {0,1}* such that there exist a length function R(n) = O(r(n)) and a deterministic algorithm T(·,·,·) with random access to its third argument¹ such that:

1. ∀x ∈ L, ∃π such that Pr(T(x, r, π) = 1 : r ∈_R {0,1}^{R(|x|)}) = 1.

2. ∀x ∉ L, ∀π, Pr(T(x, r, π) = 1 : r ∈_R {0,1}^{R(|x|)}) ≤ 1/2.

3. ∀x ∈ {0,1}*, ∀r ∈ {0,1}^{R(|x|)}, T(x, r, π) examines only O(q(|x|)) bits of π.
By this, we mean that the process of computing T(x, r, π) could be performed in the following way:

(a) Compute T₁(x, r), which is a list p₁, p₂, ..., p_Q of O(q) bit positions in π.

(b) Set π̂ = π_{p₁} ∘ ... ∘ π_{p_Q} (doing this takes time O(q), since the Verifier has random access to π). In other words, π̂ contains the values of O(q(|x|)) specific bits of π.

(c) Set T(x, r, π) = T₂(x, r, π̂).

Here, T₁(·,·) and T₂(·,·,·) are both deterministic algorithms which run sufficiently quickly that T(x, r, π) runs in time poly(|x|, r(|x|), q(|x|)). We make the natural convention that if any of the bit positions computed in step (a) is past the end of the string π, then T(x, r, π) = 0.

¹We do not elaborate on this random access for the following reason: for our purposes, the third argument of T(·,·,·) will always have size polynomial in the size of the first argument; hence, random access to it can be accomplished in polynomial time anyway, even if only normal sequential access is supplied. In other words, our uses of the definition of PCP(·,·) here would be unaffected by leaving out this "random access" qualifier.

If we have some L ∈ PCP(r(n), q(n)), and a Prover is trying to convince a Verifier that some x ∈ {0,1}* is in L, then straightforward parsing of the definition of PCP(r(n), q(n)) gives us a protocol PCP-PROOF(·) for proving that x ∈ L.

Protocol PCP-PROOF(x):

1. The Prover computes a "probabilistically checkable proof," π, that x ∈ L, and sends π to the Verifier.

2. The Verifier flips R(n) coins to produce a random string r. Together, x and r determine O(q(n)) bit positions in π to examine.

3. The Verifier checks if T(x, r, π) holds (looking only at those O(q(n)) bit positions in π). If so, the Verifier accepts the Prover's claim that x ∈ L; if not, the Verifier rejects the claim.

If x ∈ L, and the Prover sends an appropriate proof of this fact (which is guaranteed to exist), then the Verifier always accepts.
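The mechanics of PCP-PROOF (coin flips determine a few positions; the Verifier reads only those) can be mimicked in a toy Python sketch. To be clear, this is nothing like a real probabilistically checkable proof: the honest "proof" below is just the all-ones string, and all names are ours. It only exhibits the access pattern, and how repeating steps 2 and 3 shrinks the chance of accepting a bad proof:

```python
import random

def t1(x, r, proof_len, q=3):
    """Step 2: from the input x and coin flips r, derive q bit positions
    of pi to inspect (a stand-in for the real position-selection rule)."""
    rng = random.Random(hash((x, r)))        # positions depend only on x and r
    return [rng.randrange(proof_len) for _ in range(q)]

def verify(x, pi, trials=20):
    """Toy verifier: in each trial, flip coins, pick a few positions of pi,
    and run a local check on just those bits (here: are they all 1?).
    A 'proof' wrong in half its positions survives one trial with
    probability (1/2)^3, so `trials` repetitions leave an acceptance
    probability of about 8^(-trials)."""
    for _ in range(trials):
        r = random.getrandbits(32)           # step 1: flip coins
        positions = t1(x, r, len(pi))        # step 2: choose positions
        if not all(pi[p] == "1" for p in positions):
            return False                     # step 3: a local check failed
    return True

honest = "1" * 64
cheating = "1" * 32 + "0" * 32
assert verify("x", honest)        # an appropriate proof is always accepted
assert not verify("x", cheating)  # a bad proof is rejected (w.h.p.)
```

The assertion on `honest` mirrors the guarantee just stated: when the Prover sends an appropriate proof, every local check passes, so the Verifier always accepts.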
This genre of property is usually referred to as a "completeness" property of the proof system under consideration. On the other hand, if x ∉ L, then no matter what "proof" the Prover sends, the Verifier will reject it at least half of the time. This is called a "soundness" property of the proof system. Observe that if we repeat steps 2 and 3 κ times each, then if x ∉ L, the Verifier will reject with probability at least (1 − 2^{−κ}).

Of course, PCP-PROOF(·) requires the Verifier to receive the entire string π, which could have size superpolynomial in |x|. However, note that there are only 2^{R(|x|)} possible values for r, and for each one of them, O(q(|x|)) bits of π are examined. Hence we can assume that any probabilistically checkable proof π that x ∈ L has length O(2^{R(|x|)} · q(|x|)).

One of the major results shown by Arora, Lund, Motwani, Sudan, and Szegedy [2] is that NP = PCP(log n, 1). By the above remark, this means that if L ∈ NP, then the statement x ∈ L can be "proven" to a Verifier who flips only O(log |x|) coins and then examines only a constant number of bits in a proof of size polynomial in |x| (indeed, it is shown by Polishchuk and Spielman [23] that a proof of size O(|x|^{1+ε}) suffices). For a readable and well-annotated presentation of the NP = PCP(log n, 1) theorem and related work and applications, see Sudan [29].

3.2 PCPs as proofs of knowledge

We actually need a slightly stronger statement than is explicit in NP = PCP(log n, 1). In particular, we are interested in using probabilistically checkable proofs as proofs of knowledge. Let W_L(·) be a length function, and let P(·,·) be a deterministic polynomial-time predicate such that P(x, w) = 0 whenever |w| ≠ W_L(|x|). To use probabilistically checkable proofs as proofs of knowledge, it would suffice if, given any probabilistically checkable proof that (∃w : P(x, w) = 1), we could easily compute an NP witness of that fact. This is indeed the case, as is noted in Khanna, Motwani, Sudan, and Vazirani [18].
In the other direction, we note that from an NP witness that (∃w : P(x, w) = 1), we can quickly compute a probabilistically checkable proof of the same fact. So knowing an NP witness of some fact permits one to compute a probabilistically checkable proof of that fact, and conversely. We sum up everything we need to know about probabilistically checkable proofs in the following theorem, which is implicitly proved in the literature.

Theorem 1 ([3], [2]): Let W_L(·) and P(·,·) be as above. Then there exist a constant q ∈ ℕ; a length function R(n) such that 0 < R(n) = O(log n); a length function π_L(·) for ℕ; a polynomial-time program π(·,·) such that |π(x, w)| = π_L(|x|); a polynomial-time program W(·,·); and a deterministic program T(·,·,·) such that:

• ∀x ∈ L(P, W_L), if w is an NP witness that (∃w : P(x, w) = 1), then Pr(T(x, r, π) = 1 : π ← π(x, w); r ∈_R {0,1}^{R(|x|)}) = 1.
• ∀x ∀π, if Pr(T(x, r, π) = 1 : r ∈_R {0,1}^{R(|x|)}) > 1/2, then W(x, π) is an NP witness that (∃w : P(x, w) = 1).
• T(·,·,·) runs in time polynomial in the length of its first input.
• ∀x ∈ {0,1}*, ∀r ∈ {0,1}^{R(|x|)}, T(x, r, π) examines exactly q bits of π.

Note that for a given W_L(·) and P(·,·), we have fixed the length of probabilistically checkable proofs that (∃w : P(x, w) = 1) to be π_L(|x|). We shall use the notation from the above theorem (i.e., q, R(·), T(·,·,·), W(·,·)) in later chapters.

3.3 Credits

In this chapter, we have not presented any results of our own; instead, we have just explained the probabilistically checkable proof results in the literature which are needed to construct CS proofs of knowledge. The characterization of NP as PCP(log n, 1) is the result of much research by many people in the areas of self-testing/correcting programs, interactive proof systems, and approximation of NP-hard functions. The papers most directly involved in proving this theorem include:

• Fortnow, Rompel, and Sipser [15], which first connects proof systems and oracle machines.
• Babai, Fortnow, and Lund [5], which proves that any NEXPTIME language has a multi-prover interactive proof system, thereby showing that (in modern terminology) NEXPTIME = PCP(n^{O(1)}, n^{O(1)}).
• Babai, Fortnow, Levin, and Szegedy [4], which "scales down" the results of [5] to produce easily verified proofs of essentially any computation, demonstrating that NP ⊆ PCP(log^{O(1)} n, log^{O(1)} n).
• Feige, Goldwasser, Lovasz, Safra, and Szegedy [13], which expands upon the results of [5] to demonstrate that NP ⊆ PCP(log n log log n, log n log log n).
• Arora and Safra [3], which introduces the notation "PCP(·,·)" and recursive proof-checking, and proves that NP = PCP(log n, log^{O(1)} log n).
• Arora, Lund, Motwani, Sudan, and Szegedy [2], which improves this further to NP = PCP(log n, 1).

We stress that this is not meant to be a complete list; for much more historical information, see Sudan [29].

Chapter 4
Introducing CS Proofs

For our ultimate goal (reducing the lengths of digital signatures), we need to have a proof system which uses very short proofs. Let W_L(·) and P(·,·) be as usual. Then a probabilistically checkable proof that (∃w : P(x, w) = 1) generally has size at least as big as an NP witness w of this fact (i.e., size at least W_L(|x|)). With an eye towards shorter proofs (both of language membership and of knowledge of NP witnesses), we shall present the construction of Micali [22] for producing CS proofs, or computationally sound proofs, out of probabilistically checkable proofs¹. A CS proof of a statement is meant to be a noninteractive, easily verified, short proof. As we shall see, the brevity of CS proofs is not obtained without a price: it is possible for CS proofs of false statements to exist. However, CS proofs have what Micali calls a computational soundness property, meaning that finding a CS proof of a false statement, although possible, is computationally infeasible.
We shall discuss different varieties of computational soundness in chapter 5; in the present chapter, we are more concerned with definitions and motivation than with theorems.

Our scenario is as follows. Fix some language L(P, W_L) ∈ NP. Let x ∈ L(P, W_L), and let w be an NP witness that (∃w : P(x, w) = 1). Set n = |x|. We wish to create a low-communication proof system for knowledge of a witness that (∃w : P(x, w) = 1), based on probabilistically checkable proofs.

¹Actually, we modify the original construction a little bit to make slightly longer CS proofs; this will make it easier to prove our soundness theorem in section 5.3.

Actually, we should point out that CS proofs, like probabilistically checkable proofs, are useful for much more than just proving membership (or knowledge of witnesses) for NP languages; however, here, we present just the aspects of CS proofs that we shall require for our purposes.

4.1 The basic idea behind CS proofs

The starting point for CS proofs is our simple protocol PCP-PROOF(·) from section 3.1. We want to take this protocol and modify it so that the Verifier doesn't need to perform as much communication.

4.1.1 Committing with mailboxes

Suppose that the Prover and the Verifier have a [physical] mailbox, for which the Verifier has the only key. Consider the protocol PCP-MAILBOX-PROOF(·,·).

Protocol PCP-MAILBOX-PROOF(x, w):
1. The Prover computes π = π(x, w).
2. For each of the |π| = π_L(n) bits π_i, the Prover writes on a separate piece of paper "Bit #i = π_i." It then puts that piece of paper in the mailbox.
3. The Verifier flips R(n) coins to determine O(q(n)) bit positions in π to examine. It asks the Prover for the values of those bits.
4. The Prover sends those bits to the Verifier.
5. The Verifier opens the mailbox, takes out the pieces of paper corresponding to the proof bits that it wanted to see, and checks two things:
(a) If any bit that the Prover sent has a different value than the corresponding bit in the mailbox, the Verifier rejects the proof.
(b) Otherwise, the Verifier decides whether to accept or reject based on the values of those bits, as in an ordinary probabilistically checkable proof.

In a sense, PCP-MAILBOX-PROOF(·,·) and PCP-PROOF(·) barely differ (except that PCP-MAILBOX-PROOF(·,·) is a protocol for performing proofs of knowledge). However, if we don't count all the effort the Verifier expends in sifting through many pieces of paper to find specific pieces as "communication," then the Verifier for PCP-MAILBOX-PROOF(·,·) doesn't perform much communication at all. Essentially, the Prover starts out by using the mailbox to commit to a probabilistically checkable proof; after this, there's no way for it to fool the Verifier later by pretending it had a different probabilistically checkable proof in mind. The Prover can decommit any part of the probabilistically checkable proof by sending its value to the Verifier; the Verifier then just needs to check that it agrees with what's "on file" in the mailbox.

4.1.2 Committing with random oracles

We need to have some way of simulating the committal above without the use of rather specialized postal hardware. As an alternative to a mailbox, let us have a security parameter² κ and a random oracle f : {0,1}^{2κ} → {0,1}^κ.

Function COMMIT(1^κ, π):
1. Append 0's to the end of π until a string ρ whose length is κ times a power of 2 is obtained. Say |ρ| = 2^a · κ. Note that |ρ| < 2|π|, and so a = O(log n).
2. Divide ρ into 2^a segments of length κ each, ρ = ρ^0_0 ∘ ρ^0_1 ∘ ... ∘ ρ^0_{2^a − 1} (we call the ρ^0_j's the segments of π, or of ρ).
3. Compress pairs of adjacent ρ^0_j's together, meaning: compute ρ^1_0 = f(ρ^0_0, ρ^0_1), ρ^1_1 = f(ρ^0_2, ρ^0_3), ..., ρ^1_{2^{a−1} − 1} = f(ρ^0_{2^a − 2}, ρ^0_{2^a − 1}).
4. Compress pairs of adjacent ρ^1_j's together as above to compute ρ^2_0 = f(ρ^1_0, ρ^1_1), ρ^2_1 = f(ρ^1_2, ρ^1_3), ..., ρ^2_{2^{a−2} − 1} = f(ρ^1_{2^{a−1} − 2}, ρ^1_{2^{a−1} − 1}).
5. Continue in this vein until a complete 2^a-leaf binary tree has been created, with R = ρ^a_0 at the root (see figure 4-1).
6. Output the value R ∈ {0,1}^κ as a committal to the probabilistically checkable proof π.

²Many cryptographic protocols make use of security parameters. Exactly what a security parameter means varies from protocol to protocol; in general, the larger the security parameter used is, the more resistant the protocol is to any kind of adversarial behavior. On the flip side, larger security parameters also require more resources (computation, storage, communication, etc.) from the participants.

[Figure 4-1: The tree computed to commit to π, if a = 3]

Suppose the Prover sends the Verifier a κ-bit string R as a committal. Later on, the Prover sends the Verifier a string π, claiming, "My value R was a committal to π." For COMMIT(·,·) to be worth anything as a committal scheme, we would like it to be impossible for the Prover to decommit any string other than a string π which the Prover was actually thinking of, and on which it ran COMMIT(·,·) to obtain its committal. This is not quite the case; however, it is infeasible for the Prover to pull off a crooked decommittal. Essentially, if the Prover can decommit to two distinct values relative to the committal R, then it must have found a collision of f(·); to have probability 2^{−o(κ)} of finding such a collision, the Prover must evaluate 2^{Ω(κ)} different values of f(·).

The type of commitment scheme we use here was first used in Merkle [21], for constructing a digital signature scheme. It was later used by Kilian in [19] for zero-knowledge proof protocols very similar to the "3-round pseudo-CS proofs" we present in section 4.2. Benaloh and de Mare [8] present an alternative method of committing to long strings which can be used in situations similar to ours.

4.1.3 Decommitting part of a committed proof

It's not enough for us that COMMIT(·,·) is a good way to produce a short committal to a string π. We need something more: a way to decommit a single bit of π without having to perform much communication. Let 0 ≤ p < |π| be the index of a bit of π, and consider the function DECOMMIT(·,·,·) and the predicate CHECK-DCM(·,·,·,·).

Function DECOMMIT(1^κ, π, p):
1. Compute the values ρ^u_j which were computed in evaluating COMMIT(1^κ, π).
2. Set i = ⌊p/κ⌋.
3. Write the path (in the binary tree) from ρ^0_i to the root ρ^a_0 as ρ^0_{j_0}, ρ^1_{j_1}, ..., ρ^a_{j_a}, where j_0 = i and j_{u+1} = ⌊j_u / 2⌋.
4. Set D = ρ^0_{j_0} ∘ ρ^0_{j_0 ⊕ 1} ∘ ρ^1_{j_1} ∘ ρ^1_{j_1 ⊕ 1} ∘ ... ∘ ρ^{a−1}_{j_{a−1}} ∘ ρ^{a−1}_{j_{a−1} ⊕ 1} (i.e., each node on the path below the root, together with its sibling).
5. Output D as a decommittal of the bit π_p.

Predicate CHECK-DCM(1^κ, D, p, R):
1. Write D as D = A_0 ∘ B_0 ∘ A_1 ∘ B_1 ∘ ... ∘ A_{a−1} ∘ B_{a−1}, where each A_u and each B_u has length κ.
2. Check that for u = 0, 1, ..., a − 2:
A_{u+1} = f(A_u, B_u) if bit #u of ⌊p/κ⌋ is a 0;
A_{u+1} = f(B_u, A_u) if bit #u of ⌊p/κ⌋ is a 1.
3. Check that:
R = f(A_{a−1}, B_{a−1}) if bit #(a − 1) of ⌊p/κ⌋ is a 0;
R = f(B_{a−1}, A_{a−1}) if bit #(a − 1) of ⌊p/κ⌋ is a 1.
4. If all a of the f(·) values computed check out as above, output 1 (accept the decommittal as valid); otherwise, output 0 (reject the decommittal).

The bottom line here is that it's about as difficult for a crooked Prover to convincingly decommit a bit that it wasn't thinking of when it made the committal R as it would be to decommit an entire string π that it wasn't thinking of. The basic reason for the difficulty is identical, too: the Prover would more or less have to find an f-collision. We won't go into more detail about this here, since chapter 5 will provide about all the detail one can handle. We see that a single decommittal has length 2aκ.
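The COMMIT/DECOMMIT/CHECK-DCM trio above is exactly a Merkle tree over κ-bit segments. The following sketch mirrors the three procedures; the truncated SHA-256 standing in for the random oracle f, the byte-granularity segments, and the toy parameter KAPPA are all assumptions of ours, not part of the thesis's construction.

```python
import hashlib

KAPPA = 16  # segment length in bytes; toy stand-in for the kappa-bit parameter

def f(left: bytes, right: bytes) -> bytes:
    """Stand-in for the random oracle f: {0,1}^{2k} -> {0,1}^k."""
    return hashlib.sha256(left + right).digest()[:KAPPA]

def commit(pi: bytes):
    """COMMIT: pad pi to 2**a segments of KAPPA bytes, hash pairwise to a root."""
    padded = pi + b"\x00" * (-len(pi) % KAPPA)
    leaves = [padded[i:i + KAPPA] for i in range(0, len(padded), KAPPA)]
    while len(leaves) & (len(leaves) - 1):     # round leaf count up to 2**a
        leaves.append(b"\x00" * KAPPA)
    levels = [leaves]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([f(prev[j], prev[j + 1]) for j in range(0, len(prev), 2)])
    return levels[-1][0], levels               # (root committal R, whole tree)

def decommit(levels, p: int):
    """DECOMMIT: the path node A_u and its sibling B_u at every level below the root."""
    i, D = p // KAPPA, []                      # i = floor(p / kappa)
    for level in levels[:-1]:
        D.append((level[i], level[i ^ 1]))
        i //= 2
    return D

def check_dcm(D, p: int, root: bytes) -> bool:
    """CHECK-DCM: recompute the a values of f along the path; compare to R."""
    i, cur = p // KAPPA, D[0][0]
    for A, B in D:
        if A != cur:                           # path must be internally consistent
            return False
        cur = f(A, B) if i % 2 == 0 else f(B, A)   # bit #u of floor(p/kappa)
        i //= 2
    return cur == root
```

A decommittal carries 2a segments, matching the 2aκ-bit length claimed in the text; convincingly decommitting a different value at position p would require an f-collision somewhere along the path.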
4.2 3-round pseudo-CS proofs

We can take all the things we've built so far and put them together in a 3-round proof system which needs very little communication: the protocol 3-ROUND-PROOF(·,·,·), which is just a natural "conversion" of PCP-MAILBOX-PROOF(·,·).

Protocol 3-ROUND-PROOF(1^κ, x, w):
1. The Prover computes π = π(x, w).
2. The Prover computes R = COMMIT(1^κ, π) and sends R to the Verifier.
3. The Verifier flips R(n) coins, and uses the result to pick q bit positions p_1, p_2, ..., p_q in the proof that it wishes to examine. It sends p_1, p_2, ..., p_q to the Prover.
4. The Prover computes the decommittals D_i = DECOMMIT(1^κ, π, p_i) for i = 1, 2, ..., q. It sends the Verifier D_1 ∘ D_2 ∘ ... ∘ D_q.
5. The Verifier checks that CHECK-DCM(1^κ, D_i, p_i, R) = 1 for i = 1, 2, ..., q. If any of these does not hold, the Verifier rejects the proof.
6. Otherwise, the Verifier decides whether to accept or reject based on the values of the bits decommitted, as in an ordinary probabilistically checkable proof.

Like PCP-PROOF(·), 3-ROUND-PROOF(·,·,·) can be repeated multiple times to obtain more security. For the sake of efficiency, the same committal R can be used for each repetition (i.e., only steps 3-6 need to be repeated). Furthermore, all the repetitions can be done in parallel, so that the entire protocol still takes only three rounds.

4.3 CS proofs, at last

It's now just a short step to CS proofs. In effect, CS proofs are obtained by taking the 3-ROUND-PROOF(·,·,·) protocol (repeated κ times) and replacing the Verifier's coin-flipping with a random oracle (to get rid of the interaction in 3-ROUND-PROOF(·,·,·)). We elaborate on this. Producing a CS proof with security parameter κ (a "κ-CS proof") that (∃w : P(x, w) = 1) requires two random oracles. One of them we have already seen, the random oracle f : {0,1}^{2κ} → {0,1}^κ. We now add in an additional random oracle, g : {0,1}^n × {0,1}^κ → {0,1}^{R(n)·κ} (recall that n = |x|). Then the function CS-PROOF(·,·,·) can be used to create CS proofs.
Function CS-PROOF(1^κ, x, w):
1. Compute π = π(x, w).
2. Compute R = COMMIT(1^κ, π).
3. Evaluate g(x ∘ R) and divide it into κ "challenges" of length R(n) each, g(x ∘ R) = r_1 ∘ r_2 ∘ ... ∘ r_κ.
4. For each s = 1, 2, ..., κ:
(a) Compute p^s_1, p^s_2, ..., p^s_q, the positions of the bits in π that T(·,·,·) would wish to examine, given inputs x and r_s.
(b) Compute D^s_i = DECOMMIT(1^κ, π, p^s_i) for i = 1, 2, ..., q.
(c) Set h_s = D^s_1 ∘ D^s_2 ∘ ... ∘ D^s_q.
5. Output the string R ∘ r_1 ∘ h_1 ∘ r_2 ∘ h_2 ∘ ... ∘ r_κ ∘ h_κ as a κ-CS proof that (∃w : P(x, w) = 1).

Carefully going through the code for CS-PROOF(·,·,·) reveals one other difference from 3-ROUND-PROOF(·,·,·). Namely, CS proofs explicitly contain the random challenge bits, g(x ∘ R). In view of the fact that the string R is also contained in the CS proof, this might seem wasteful; however, it will simplify matters somewhat in section 5.3. With everything we've put together so far, it's not hard to come up with a predicate CHECK-CS-PROOF(·,·,·) for the Verifier to use to check whether a CS proof is valid.

Predicate CHECK-CS-PROOF(1^κ, x, M):
1. Write M as a concatenation M = R̃ ∘ r̃_1 ∘ h_1 ∘ r̃_2 ∘ h_2 ∘ ... ∘ r̃_κ ∘ h_κ.
2. Divide g(x ∘ R̃) into κ pieces of length R(n) each, g(x ∘ R̃) = r_1 ∘ r_2 ∘ ... ∘ r_κ.
3. For each s = 1, 2, ..., κ:
(a) Check that r̃_s = r_s.
(b) Write h_s as a concatenation h_s = D^s_1 ∘ D^s_2 ∘ ... ∘ D^s_q.
(c) Compute p^s_1, p^s_2, ..., p^s_q, the positions of the bits in π that T(·,·,·) would wish to examine, given inputs x and r_s.
(d) Check that CHECK-DCM(1^κ, D^s_i, p^s_i, R̃) = 1, for i = 1, 2, ..., q.
(e) Check that the values decommitted for π_{p^s_1}, π_{p^s_2}, ..., π_{p^s_q} would cause T(·,·,·) to accept the probabilistically checkable proof, given inputs x and r_s.
4. If M passes all the checks above, accept the CS proof M (output a 1). Otherwise, reject M by outputting a 0.

If CHECK-CS-PROOF(1^κ, x, M) = 1, we say that M is a valid κ-CS proof that (∃w : P(x, w) = 1). Note that this can hold even if there actually is no such witness (i.e., valid CS proofs are not necessarily correct).
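The interaction is removed in Fiat-Shamir style: the κ challenges are read off g(x ∘ R), so anyone can recompute them from the instance and the committal alone. Here is a small sketch of the challenge derivation, together with the proof-length bookkeeping used below; the SHA-256 counter-mode stand-in for g and the toy parameters R_N and KAPPA are our own assumptions.

```python
import hashlib

R_N = 8     # R(n): number of coins per challenge, in bits (toy value)
KAPPA = 4   # number of repetitions / security parameter (toy value)

def g(x: bytes, committal: bytes) -> bytes:
    """Stand-in for the random oracle g; yields at least R(n)*kappa bits."""
    out, ctr = b"", 0
    while 8 * len(out) < R_N * KAPPA:
        out += hashlib.sha256(x + committal + bytes([ctr])).digest()
        ctr += 1
    return out

def challenges(x: bytes, committal: bytes):
    """Split g(x o R) into kappa challenges r_1, ..., r_kappa of R(n) bits each."""
    bits = "".join(format(byte, "08b") for byte in g(x, committal))
    return [int(bits[s * R_N:(s + 1) * R_N], 2) for s in range(KAPPA)]

def cs_proof_length(R_n: int, a: int, q: int, kappa: int) -> int:
    """Lambda_{P,W_L}(n, kappa): the committal (kappa bits), kappa challenges of
    R(n) bits, and kappa*q decommittals of 2*a*kappa bits each, i.e.
    kappa * [1 + R(n) + 2*a*q*kappa] bits in total."""
    return kappa * (1 + R_n + 2 * a * q * kappa)
```

Since the Verifier recomputes g(x ∘ R̃) itself, a cheating Prover cannot choose its challenges; all it can do is "shop" over committals, which is exactly what the soundness analysis of chapter 5 has to account for.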
A little bit of simple arithmetic reveals that the length of a κ-CS proof that (∃w : P(x, w) = 1) is precisely κ[1 + R(n) + 2aqκ] = O(κ² log n). For convenience later on, we define the "CS proof length" function Λ_{P,W_L}(n, κ) = κ[1 + R(n) + 2aqκ].

Chapter 5
CS proofs of knowledge

As with other proof systems, there are two distinct parts needed to demonstrate that CS proofs can actually serve as proofs:

1. Some kind of completeness property, which ensures that an honest Prover can convince a Verifier of any particular true statement with very high probability.
2. Some kind of soundness property, which ensures that a dishonest Prover can't convince a Verifier of any false statement with nontrivial probability.

As mentioned earlier, in [22], Micali is concerned with CS proofs as proofs of membership, and he demonstrates versions of these properties which he calls "feasible completeness" and "computational soundness." Feasible completeness means that not only can an honest Prover possessing an NP witness that x ∈ L always successfully convince a Verifier that x ∈ L, but in addition, it can do so in time poly(|x|, κ). Computational soundness means that a dishonest Prover, trying to convince a Verifier of some false statement x ∉ L, can only succeed with an exponentially small (in κ) probability, unless it has so much computational power that it can make exponentially many (in κ) oracle evaluations of the functions f(·) and g(·).

In this chapter, we shall present our results on how CS proofs can be used as proofs of knowledge. As always, this requires proving a completeness property and a soundness property. Feasible completeness is sufficient for use in CS proofs of knowledge. We here present Micali's definition in a way which we feel clarifies how it can be applied to CS proofs of knowledge. Let W_L(·) and P(·,·) be as usual (we fix W_L(·) and P(·,·) for the duration of this chapter). Then feasible completeness means that:
1. ∀x ∈ {0,1}*, if w is an NP witness that (∃w : P(x, w) = 1), then Pr(CHECK-CS-PROOF(1^κ, x, M) = 1 : M = CS-PROOF(1^κ, x, w)) = 1.
2. The algorithms CS-PROOF(·,·,·) and CHECK-CS-PROOF(·,·,·) both run in polynomial time.

5.1 What about soundness?

Micali's computational soundness is not strong enough to show that CS proofs can be used as proofs of knowledge. We shall therefore propose and prove a new notion of soundness for CS proofs. In addition to being useful for its applications to digital signature schemes, as will be demonstrated in chapter 7, the notion is interesting in its own right.

If we consult the literature on zero-knowledge proofs, we see that there is a nontrivial difference between zero-knowledge proofs of membership and zero-knowledge proofs of knowledge (see Tompa and Woll [30], Feige, Fiat, and Shamir [12], and De Santis and Persiano [9]) which is similar to the distinction we face with CS proofs. It is not immediately obvious what it means for a protocol to "prove" that one of the parties "knows" an NP witness of some fact. The three papers above all define a soundness property for zero-knowledge proofs of knowledge by saying that there must exist a "knowledge extractor," which is essentially an efficient algorithm which, given some form of control over running the Prover, can produce an NP witness of the fact to be proved. The reason for defining soundness this way is that it's generally very difficult to formalize in a natural way what it means for an algorithm to "know" something.

Given this, we might be tempted to make a "knowledge extractor" definition for CS proofs as well. Since [9] is about noninteractive proofs of knowledge, and we find ourselves in a similar situation, it might seem reasonable to adapt their definition to our needs. De Santis and Persiano present two types of "knowledge extractor" definitions for noninteractive zero-knowledge proofs of knowledge.
The weaker definition involves the Prover having some particular x ∈ {0,1}* in mind, and trying to convince the Verifier that (∃w : P(x, w) = 1). The stronger definition allows the Prover to "shop around" and generate an x ∈ {0,1}* and a proof together, which could conceivably give the Prover more opportunity to falsely prove that it knows an NP witness of some fact. In analogy with their definitions, we shall make two similar definitions of soundness later in this chapter: "computational soundness of CS proofs of knowledge" and "strong computational soundness of CS proofs of knowledge."

However, the "knowledge extractor" approach is not quite the best way to define soundness for CS proofs of knowledge. This is because, for CS proofs, as we shall see, performing committals via random-oracle tree-hashing makes it possible to actually have some idea of what a Prover is "thinking." Given this, it is more intuitive to define soundness in terms of the Prover "having in mind" an actual NP witness. The resulting definitions of soundness, although slightly differently motivated, turn out to be essentially equivalent to the "knowledge extractor" definitions. In addition, Micali's original "computational soundness" property follows easily from either of our versions of soundness. Most of the remainder of this chapter is devoted to formalizing and proving our new notion of soundness for CS proofs.

5.2 Setting the scene

Fix n, b ≥ 0 and κ ≥ 1. Let C be a circuit with no input bits and (n + Λ_{P,W_L}(n, κ) + b) output bits. Suppose that C has s random-oracle nodes for f : {0,1}^{2κ} → {0,1}^κ, t random-oracle nodes for g : {0,1}^n × {0,1}^κ → {0,1}^{R(n)·κ}, and any number of other special nodes of any types. Call C's g(·)-nodes G_1, G_2, ..., G_t.
Writing the output of C as x ∘ M ∘ E, where |x| = n, |M| = Λ_{P,W_L}(n, κ), and |E| = b, we think of C as a circuit which tries to output a triple consisting of a value x ∈ {0,1}^n (we call x an instance); a valid κ-CS proof that (∃w : P(x, w) = 1); and some b-bit auxiliary output, E.

DEFINITION: f_i-node. An f_i-node is an f(·)-node which is a predecessor of G_i.

Let C_i be the subcircuit of C induced by the predecessors of G_i. Since G_i has (n + κ) inputs, C_i is a circuit with no inputs and exactly (n + κ) output nodes. We observe that two such subcircuits C_i and C_j need not be disjoint.

5.2.1 f-parents and implicit proofs

For the following definitions, let Z ≤ C, and let η be a κ-component Ω_Z-random variable (i.e., any function mapping Ω_Z to Σ^κ, where Σ = {0, 1, *}).

DEFINITION: f-left parent in Z and f-right parent in Z. The f-left parent of η in Z and the f-right parent of η in Z are defined as follows:

• If η is the output of at least one f(·)-node in Z, and if any two f(·)-nodes in Z which output η both have the same input, then the f-left parent and the f-right parent of η in Z are the first and last κ bits of that common input, respectively.
• If η is the output of at least two f(·)-nodes in Z with at least two distinct inputs, then the f-left parent and the f-right parent of η in Z are both equal to *_κ (i.e., a sequence of κ *'s).
• If η is not the output of any f(·)-node in Z, then the f-left parent and the f-right parent of η in Z are both equal to *_κ.

The f-left parent of η and the f-right parent of η in Z are κ-component Ω_Z-random variables. For any e ∈ Ω_Z, if η(e) = *_κ, then the f-left parent and the f-right parent of η in Z also equal *_κ.

DEFINITION: Implicit proof Z committed to with η.

1. Set η^a_0 = η.
2. Define η^{a−1}_0 and η^{a−1}_1 to be the f-left and f-right parents of η^a_0 in Z, respectively.
3. Define η^{a−2}_0 and η^{a−2}_1 to be the f-left and f-right parents of η^{a−1}_0 in Z, and define η^{a−2}_2 and η^{a−2}_3 to be the f-left and f-right parents of η^{a−1}_1 in Z.
4. Continue in this fashion, defining η^{u−1}_{2j} and η^{u−1}_{2j+1} to be the f-left and f-right parents of η^u_j in Z, until a tree similar to the one pictured in figure 4-1 is obtained, with η's instead of ρ's.

We say that the string η̃ = η^0_0 ∘ η^0_1 ∘ ... ∘ η^0_{2^a − 1} is the implicit proof Z committed to with η. η̃ is an Ω_Z-random variable of the same length that a probabilistically checkable proof that (∃w : P(x, w) = 1) should be (assuming that |x| = n), once that probabilistically checkable proof is padded to have length equal to κ times some power of 2. Indeed, as suggested by our terminology, the random variable η̃ is an attempt to reconstruct the probabilistically checkable proof that Z committed to via the short string η. Of course, for a particular random variable η, in a given execution e ∈ Ω_Z, it is quite possible that there is no such committed proof (i.e., that η(e) was not computed by tree-hashing any kind of string), in which case η̃(e) is likely to be just a string of *'s; more generally, there can be gaps in the committed proof, which are filled in with *'s in η̃(e). Note that a string of *'s in an implicit proof can mean either of two very different things:

1. It was impossible to "trace back" the committal string η(e) to a consistent value for that part of the implicit proof, using only values of f(·) which Z had evaluated.
2. At some point in the process of trying to "trace back" from η(e) to the implicit proof, there were at least two distinct possibilities of how to trace back.

The former event means that Z does not "know" some probabilistically checkable proof which tree-hashes to the committal η(e). The latter event means that, later on, if η(e) is used as a committal, then it might be feasible to decommit more than one value for a given proof bit. Because the latter event implies that Z has found an f-collision, it occurs very infrequently.
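The "trace back" process can be made concrete: given the f-evaluations a circuit Z actually performed, one reconstructs the implicit proof top-down, writing * wherever no unique preimage is known. A minimal sketch, where the dictionary `known_f` (mapping each oracle output to the list of distinct input pairs Z evaluated it on) and the use of `None` for * are our own modeling assumptions:

```python
def implicit_proof(root, known_f, depth):
    """Reconstruct the implicit proof committed to with `root`.
    known_f maps an f-output to the list of distinct input pairs that Z
    evaluated f on to produce it.  Positions that cannot be traced back
    uniquely (never evaluated, or an f-collision) become None ('*')."""
    STAR = None

    def parents(node):
        if node is STAR:
            return (STAR, STAR)        # parents of *_kappa are *_kappa
        inputs = known_f.get(node, [])
        if len(inputs) == 1:           # unique preimage known to Z
            return inputs[0]
        return (STAR, STAR)            # unknown output, or >= 2 distinct inputs

    level = [root]
    for _ in range(depth):             # descend a levels of f-left/f-right parents
        nxt = []
        for node in level:
            left, right = parents(node)
            nxt.extend([left, right])
        level = nxt
    return level                       # the 2**depth leaf segments of the proof
```

Running this on a committal whose tree was fully evaluated recovers every segment; deleting one internal f-evaluation from `known_f` leaves a gap of *'s exactly under that node, which is the "gaps in the committed proof" phenomenon described above.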
At long last, we define some particular random variables on Ω_C and Ω_{C_i}:

• Write the output of C as x ∘ R ∘ r_1 ∘ h_1 ∘ r_2 ∘ h_2 ∘ ... ∘ r_κ ∘ h_κ ∘ E, where |x| = n, |R| = κ, |E| = b, and, for each s = 1, 2, ..., κ, |r_s| = R(n) and |h_s| = 2aqκ (this breaks up C's output into an n-bit x; the pieces that a valid κ-CS proof that (∃w : P(x, w) = 1) should contain; and an extra b-bit output). Recall that we already defined M = R ∘ r_1 ∘ h_1 ∘ r_2 ∘ h_2 ∘ ... ∘ r_κ ∘ h_κ to be the CS proof output by C.
• For i = 1, 2, ..., t:
  – Write the output of C_i (the input to G_i) as x_i ∘ y_i, where |x_i| = n and |y_i| = κ.
  – Let ξ_i be the implicit proof C_i committed to with y_i.

Intuitively, we think of C_i as somehow computing an (instance, committal) pair (i.e., C_i's output), which will be used as input to the node G_i. Of course, the candidate κ-CS proof may not be a valid CS proof for the candidate instance. Nonetheless, each g(·)-node does a priori give C a chance to output an instance and a valid κ-CS proof of knowledge for that instance. x, M, E, R, the r_s's, and the h_s's are Ω_C-random variables; and the y_i's and the ξ_i's are Ω_{C_i}-random variables.

5.2.2 Anthropomorphization of circuits

For the duration of section 5.2.2, fix Z ≤ C. Let β, β′, β_0, and β_1 be κ-component Ω_Z-random variables. Let the Ω_Z-event "Z knows f(β_0 ∘ β_1) = β" be {e ∈ Ω_Z : Z has an f(·)-node which takes input β_0(e) ∘ β_1(e) and outputs β(e)}. We similarly define the Ω_Z-events "Z knows the value of f(β_0 ∘ β_1)" and "Z knows β′ is an f-parent of β". Note that

Pr(β = f(β_0 ∘ β_1) | (Z knows the value of f(β_0 ∘ β_1))^C) = 2^{−κ}

(the left-hand side of this equation is the conditional probability of an Ω_Z-event). If λ and λ′ are κ-component Ω_Z-random variables, we define an Ω_Z-event by saying that, for any e ∈ Ω_Z, "Z knows that λ′ is an ancestor of λ" holds if there is a sequence of κ-bit strings λ′(e) = Λ_0, Λ_1, ..., Λ_J = λ(e) such that Z knows that Λ_j is an f-parent of Λ_{j+1} for each j = 0, 1, ..., J − 1.
Based on this event, we can define an Ω_Z-random variable, "the number of ancestors of λ that Z knows." Since any ancestor of λ that Z knows must be either the left half or the right half of the input to one of Z's f(·)-nodes, and there are at most s such nodes, it follows that in any execution of Z, Z cannot know more than 2s ancestors of λ.

Finally, if D is a 2aκ-component Ω_Z-random variable, 0 ≤ p < 2^a · κ, and R is a κ-component Ω_Z-random variable, we define another Ω_Z-event by saying that, for any e ∈ Ω_Z, "Z knows that D(e) is a valid decommittal of bit #p relative to R(e)," or equivalently, "Z knows CHECK-DCM(1^κ, D(e), p, R(e)) = 1." This event is interpreted as meaning that two conditions hold:

1. Z knows all the values of f(·) that need to be evaluated to check the decommittal D(e) relative to R(e) (i.e., to run CHECK-DCM(1^κ, D(e), p, R(e))); recall that there are a such values.
2. These values of f(·) are such that if CHECK-DCM(1^κ, D(e), p, R(e)) were actually run, it would output 1.

Note that

Pr(CHECK-DCM(1^κ, D, p, R) = 1 | (Z knows CHECK-DCM(1^κ, D, p, R) = 1)^C) ≤ 2^{−κ}

(the left-hand side of this equation is the conditional probability of an Ω_Z-event). We can see this by taking e ∈ Ω_Z and examining the two ways that Z can fail to know that D(e) is a valid decommittal:

1. Z knows all a of the values of f(·) that will be evaluated to check the decommittal D(e) relative to R(e), and these values will not cause a Verifier to accept the decommittal as valid. In this case, the probability that D(e) is a valid decommittal is 0.
2. Z does not know all a of the values of f(·) that will be evaluated to check the decommittal D(e) relative to R(e). In this case, the probability that any value of f(·) that Z doesn't know will equal the "correct" value for it (i.e., the appropriate substring of D(e), or R(e) itself, as the case may be) is 2^{−κ}.

5.2.3 Other events in Ω_C, P_C, and P_{C_i}

We define a few interesting events:

• VALID = {CHECK-CS-PROOF(1^κ, x, M) = 1}.
This is the event that C outputs an instance x and a valid κ-CS proof that (∃w : P(x, w) = 1).
• SAME_i = {x = x_i} ∩ {R = y_i}. This is the event that C's output begins with the same instance and short committal that were given as input to G_i (i.e., the output of C_i).
• DIFF = (∪_i SAME_i)^C.
• COLL = {some f(·)-nodes in C have the same output, but distinct inputs}. This is the event that C finds an f-collision.
• COLL_i = {some f_i-nodes have the same output, but distinct inputs}. This is the event that C_i finds an f-collision. Note that COLL_i ⊆ COLL.
• MORE_i = {C knows more ancestors of y_i than C_i does}. This is the event that C knows more useful values of f(·) than C_i does (useful in the sense of being potentially helpful in decommitting bits relative to y_i).

VALID is an Ω_C-event, but the other events defined here are P_C-events (which means they can also be viewed as Ω_C-events, of course). COLL_i is actually a P_{C_i}-event.

5.3 Strong computational soundness

For any π̃ ∈ Σ^{π_L(n)} and x ∈ {0,1}^n, define

Π(x, π̃) = Pr(T(x, r, π̃) = 1 : r ∈_R {0,1}^{R(n)}).

We assume that if T examines any component of π̃ which is equal to *, then T rejects π̃ (i.e., outputs 0) outright. For convenience, we extend the notation Π(·,·) to cover the case where π̃ has been extended to have length 2^a · κ.

Let C be a circuit of the type we have been discussing. We now ask ourselves again what it means for C to know an NP witness that (∃w : P(x, w) = 1). Certainly it seems intuitive that C knows such a witness if the event

KNOWS_i = VALID ∩ SAME_i ∩ {Π(x_i, ξ_i) > 1/2}

holds. After all, from an execution of C, one can easily compute x_i and ξ_i; and if Π(x_i, ξ_i) > 1/2, one can run W(x_i, ξ_i) to find an NP witness that (∃w : P(x, w) = 1). There might be other, more subtle ways that C could know an NP witness that (∃w : P(x, w) = 1). However, as it turns out, this is the only way of knowing an NP witness that we need to consider. That is, let us define the event KNOWS = ∪_i KNOWS_i (⊆ VALID).
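For toy parameters, the acceptance probability Π(x, π̃) defined above can be computed by brute force over all 2^{R(n)} coin strings. In this sketch, T is any verifier predicate in the two-phase style of chapter 3, and the convention that T rejects on a * component is modeled by rejecting when a queried entry is `None` (our own stand-in for *); the one-query `toy_T` is invented for illustration.

```python
def Pi(T, x, pi_tilde, R_bits: int) -> float:
    """Pr(T(x, r, pi_tilde) = 1 : r uniform in {0,1}^{R_bits}), by enumeration."""
    accepts = sum(bool(T(x, r, pi_tilde)) for r in range(2 ** R_bits))
    return accepts / 2 ** R_bits

def toy_T(x, r, pi_tilde):
    """Toy verifier: query one position derived from r; reject on '*' (None)."""
    bit = pi_tilde[r % len(pi_tilde)]
    return bit is not None and bit == 1
```

The threshold Π(x_i, ξ_i) > 1/2 is exactly the hypothesis of Theorem 1's second bullet, which is what licenses running the witness-extraction program W on the implicit proof ξ_i.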
i Then we shall show in section 5.3.2 that the following property holds for CS proofs of knowledge: DEFINITION: Strong computational soundness of CS proofs of knowledge. There exist positive constants cl and c2 such that for sufficiently large rK,for any circuit C of the type we have been considering, if the total number of f(.)-nodes and 43 g(.)-nodes in C (i.e., s + t) is at most 2c"' , then Pr(VALID \ KNOWS) < 2- C2 In effect, no circuit with fewer than exponentially many (in nE)random oracle nodes can output a valid pair (instance, K-CS proof) without knowing a genuine NP witness (in the sense that the event KNOWS occurs), except with probability exponentially small (in ). Our strong computational soundness property clearly implies the following: DEFINITION: Computational soundness of CS proofs of knowledge. This property is identical to strong computational soundness of CS proofs of knowledge, except that instead of quantifying over all circuits C with sufficiently few oracle nodes, we quantify over all such circuits C such that the instance x output by C is a constant (instead of permitting it to vary from execution to execution). That is, in effect, the value x is hard-wired into C. 5.3.1 The heuristics behind our proof Before stating and proving our soundness results, we give in words the basic ideas behind them. Let C be any circuit of the usual type. * It is very unlikely that C will find an f-collision (and hence, very unlikely that Ci will find an f-collision). * As long as Ci doesn't find an f-collision, the value xi o yi is a committal to the value 5i as being a possible probabilistically checkable proof from which to create a r;-CSproof that (3w : P(xi, w) = 1). * It is possible for C to output an instance x and a valid (3w : P(x,w) -CS proof M that = 1), even if x o M was never supplied as the input to any g(.)-node. However, it is very unlikely. 
* It is possible for C to output an instance x and a valid κ-CS proof M that (∃w : P(x, w) = 1) such that the committal r of the CS proof is y_i, but such that some of the decommittals in the CS proof are to segments which were not actually "committed to" (relative to y_i) by C_i. However, assuming that C_i did not find any f-collisions, this is also unlikely. There are two ways it can occur:
1. C can be lucky and know some ancestors of y_i which C_i doesn't know, and which enable it to output a decommittal which it knows is valid.
2. C can be lucky and output a decommittal which it doesn't know is valid (but which is in fact valid).

5.3.2 The proof

Fix a circuit C of the usual type. Throughout section 5.3.2, probabilities are to be evaluated over the probability space Ω_C. We intend to obtain our soundness results by progressing step by step, proving that the various ways of outputting an instance and a valid κ-CS proof of knowledge of an NP witness for it without "knowing" such an NP witness all have a very low probability of occurring.

Lemma 1. Pr(VALID ∩ DIFF) ≤ 2^(-κ).

Proof. For C's output to be a valid κ-CS proof of anything, the strings r_1 ∘ r_2 ∘ ... ∘ r_κ and g(x ∘ r) must be identical. However, if we condition on the event DIFF occurring, then these strings are independently distributed, and moreover, g(x ∘ r) has a uniform distribution on strings of length R(n) · κ. So Pr(VALID ∩ DIFF) ≤ Pr(VALID | DIFF) = 2^(-R(n) · κ) ≤ 2^(-κ).

Lemma 1 says that it's unlikely for C to output a pair (x, M), where M is a valid κ-CS proof of knowledge that (∃w : P(x, w) = 1), unless x and the committal part of M were actually given as input to one of C's g(·)-nodes.

Lemma 2. Pr(COLL_i) ≤ Pr(COLL) ≤ s^2 · 2^(-κ).

Proof. Consider an on-line execution of C in which C's f(·)-nodes are evaluated, one after the other, in such an order that each f(·)-node is evaluated after all its predecessors. At the time when the jth node is evaluated, C knows at most j - 1 < s distinct values of f(·), and so the probability that evaluating the jth node lets C know a collision of f(·) is less than s · 2^(-κ). Since there are s nodes, the probability of finding an f(·)-collision is less than s^2 · 2^(-κ).

Lemma 3. If A is any Ψ_{C_i}-event, then Pr(MORE_i | A) ≤ s(2s + 1) · 2^(-κ).

Proof. The proof of this is similar to the proof of lemma 2. To find a new ancestor of y_i that C_i doesn't know, C must find a new f-parent either of y_i itself, or of one of its ancestors known to C_i. Since there are at most 2s such ancestors, and C has at most s f(·)-nodes outside of C_i with which to find a new f-parent, our bound follows.

Lemma 4. If X ∈ {0, 1}^n and Γ ∈ Σ^(2^a · κ), then Pr(VALID ∩ SAME_i | COLL_i^c ∩ (x_i = X, π̃_i = Γ)) ≤ s(2s + 1) · 2^(-κ) + [Π(X, Γ) + 2^(-κ)]^κ.

Proof. This is essentially a formalization of what was said in section 5.3.1. Given the event we are conditioning on in the above probability, for M to be a valid κ-CS proof of knowledge that (∃w : P(X, w) = 1), one of two things must happen:
1. C must know more ancestors of y_i than C_i does; in other words, the event MORE_i must occur.
2. C must successfully answer the κ independent random challenges in g(X ∘ y_i). Each challenge can be answered successfully in two ways:
(a) The challenge happens to be something that C knows how to answer; the probability of this is Π(X, Γ).
(b) The challenge is not something that C knows how to answer; in other words, at least one of the decommittals that C outputs is something that C does not know is valid. Such a decommittal is in fact valid with probability at most 2^(-κ), we recall.

Corollary 1. Pr(VALID ∩ SAME_i | COLL_i^c ∩ (Π(x_i, π̃_i) < 1/2)) ≤ (2s^2 + s + 3) · 2^(-κ).

Proof. This follows from the fact that for κ ∈ N, (1/2 + 2^(-κ))^κ ≤ 3 · 2^(-κ).

Lemma 5. For any 1 ≤ i ≤ t, Pr(VALID ∩ SAME_i ∩ (Π(x_i, π̃_i) < 1/2)) ≤ (3s^2 + s + 3) · 2^(-κ).
Proof.
Pr(VALID ∩ SAME_i ∩ (Π(x_i, π̃_i) < 1/2))
≤ Pr(VALID ∩ SAME_i ∩ (Π(x_i, π̃_i) < 1/2) ∩ COLL_i) + Pr(VALID ∩ SAME_i ∩ (Π(x_i, π̃_i) < 1/2) ∩ COLL_i^c)
≤ Pr(COLL) + Pr(VALID ∩ SAME_i ∩ (Π(x_i, π̃_i) < 1/2) ∩ COLL_i^c)
≤ s^2 · 2^(-κ) + (2s^2 + s + 3) · 2^(-κ)
= (3s^2 + s + 3) · 2^(-κ).

Lemma 5 bounds the probability that a valid (instance, κ-CS proof) pair is output for which the CS proof is not derived from a probabilistically checkable proof that C has "in mind."

Theorem 2. CS proofs of knowledge enjoy strong computational soundness.

Proof. Let C be a circuit of the type we have been discussing. Then

VALID = (VALID ∩ DIFF) ∪ ∪_i (VALID ∩ SAME_i ∩ (Π(x_i, π̃_i) ≥ 1/2)) ∪ ∪_i (VALID ∩ SAME_i ∩ (Π(x_i, π̃_i) < 1/2))
⊆ (VALID ∩ DIFF) ∪ KNOWS ∪ ∪_i (VALID ∩ SAME_i ∩ (Π(x_i, π̃_i) < 1/2)).

So

Pr(VALID \ KNOWS) ≤ Pr(VALID ∩ DIFF) + Pr(∪_i (VALID ∩ SAME_i ∩ (Π(x_i, π̃_i) < 1/2)))
≤ 2^(-κ) + Σ_{i=1..t} (3s^2 + s + 3) · 2^(-κ)
= [t(3s^2 + s + 3) + 1] · 2^(-κ).

5.4 Extracting NP witnesses

We now wish to use our proof of strong computational soundness to edge even closer to soundness definitions which are based on "knowledge extractors." Let C be a circuit of the type we have been discussing. That is, C has s f(·)-nodes, t g(·)-nodes, no inputs, and outputs x ∘ M ∘ E ∈ {0, 1}^n × {0, 1}^(APWL(n, κ)) × {0, 1}^b. We define the Ψ_C-random variable w ∈ {0, 1}^(WL(n)) as follows:
1. For i = 1, 2, ..., t, let w_i = W(x, π̃_i).
2. If there is a value i such that w_i is an NP witness that (∃w : P(x, w) = 1), then let w = w_{i_0}, where i_0 is the first such value.
3. If there is no such i, let w = 0^(WL(n)).

Let WITNESS be the Ψ_C-event {P(x, w) = 1}.

Lemma 6. Pr(VALID \ WITNESS) ≤ [t(3s^2 + s + 3) + 1] · 2^(-κ).

Proof. From the definition of w, KNOWS ⊆ WITNESS. Hence this follows from our proof of theorem 2.

Corollary 2. Let A be any Ω_C-event. Then Pr((A ∩ VALID) \ (A ∩ WITNESS)) ≤ [t(3s^2 + s + 3) + 1] · 2^(-κ).

Now, starting with C, we can construct a circuit C' which does nothing more than execute C and then compute w.
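The rule defining w is just a first-match scan over the t candidate witnesses produced by the extractor W; as a minimal sketch (the function and its names are ours, purely illustrative):

```python
# First-match witness selection, mirroring the definition of w above.
# P is the NP predicate, candidates are the strings w_1, ..., w_t obtained
# by running W on each candidate proof; WL_n plays the role of WL(n).
def select_witness(P, x, candidates, WL_n):
    for w in candidates:
        if P(x, w) == 1:
            return w          # w_{i_0} for the first qualifying index i_0
    return "0" * WL_n         # default: the all-zero string
```

If no candidate satisfies P, the all-zero default is output, and the event WITNESS fails exactly when this default is not itself a witness.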
C' outputs x ∘ M ∘ E ∘ w, and C' can be built "on top of" C by adding poly(s, t, n, κ) edges and ordinary deterministic Boolean nodes to C (this upper bound on how much new circuitry needs to be added is easy, but somewhat painful, to verify). There are clear bijections between global views of C and global views of C', and between executions of C and executions of C'. In particular, we can think of any event or random variable associated with C (i.e., any Ψ_C- or Ω_C-event or -random variable) as also being associated with C', and conversely.

For the sake of our ultimate goal in chapter 7, we now need to put all of the above together, and also observe that the construction and results above possess a certain uniformity.

Corollary 3. Let C(·) be a circuit of the type which we have been considering, except that, instead of having no inputs, C(·) takes an a-bit input, y. C(·) can be viewed as being a collection of 2^a different circuits, {C(y) : y ∈ {0, 1}^a}, each of which takes no inputs. For each of these 2^a circuits, C(y), we can define Ω_{C(y)} and events such as VALID(y), WITNESS(y), etc. Then we can construct a circuit C'(·) "on top of" C(·), which inputs an a-bit value, y, and outputs x ∘ M ∘ E ∘ w as above; which has size no more than poly(s, t, n, κ) more than the size of C(·); and such that for each y ∈ {0, 1}^a, for each Ω_{C(y)}-event A(y), Pr((A(y) ∩ VALID(y)) \ (A(y) ∩ WITNESS(y))) ≤ [t(3s^2 + s + 3) + 1] · 2^(-κ).

(The above probability can be thought of as being taken over Ω_{C(y)}; over Ψ_{C(y)}; over all global views of C(y); or over all global views of C'(y). All of these have the same meaning.)

Fortunately for the reader, this is as far as we have to go with proofs of knowledge.

Chapter 6
Defining digital signature schemes

At this point, we can briefly steer away from proof systems.
We wish to set the stage for our application of CS proofs of knowledge, and so this chapter is concerned with formalizing exactly what digital signature schemes are, and what it means for them to be "secure."

6.1 The purpose of digital signature schemes

Suppose that two parties, Alice and Bob, are communicating in some fashion other than a face-to-face conversation. Whenever Bob receives a message m which was allegedly sent by Alice, he would like to have some method of assuring himself of the validity of the communication. Furthermore, Bob might want a method of holding Alice accountable for what she says: if Alice later denies that she sent Bob m, Bob wants to have some way of proving to a third party (such as a judge) that Alice really did send him m. In short, there are essentially three (not necessarily disjoint) kinds of security Bob might want to have:
1. Sender authentication: Bob can be certain that m really did originate with Alice.
2. Message authentication: Bob can verify that the message m is exactly what was sent by Alice.
3. Accountability: Bob can convince a third party of the above two facts, namely, that none other than Alice sent him precisely m.

Traditionally, if Alice sends Bob a handwritten note, her signature on the note is considered sufficiently difficult to forge that it is a "proof" (both to Bob and to a third party) that the note is really from her. It's somewhat debatable how well Alice's signature actually works for this purpose, and in any case, it's certainly questionable that it gives very trustworthy message authentication. Digital signatures were first introduced by Diffie and Hellman in [10] as a digital analog to handwritten signatures. Digital signatures are intended to provide all of the types of security mentioned above (sender authentication, message authentication, and accountability) for electronic communications.
Precise procedures for producing and verifying signatures vary greatly among different digital signature schemes, but at a very high level, most schemes work essentially as follows:
1. Alice somehow obtains or generates a matching public key and secret key. Alice's secret key is something that only she knows, but her public key is required to be common knowledge.
2. To sign a string m, Alice uses m, her public key, and her secret key to compute a "signature" string.
3. To check a signature, Bob (or anyone else) runs some sort of verification algorithm on the message, the signature, and Alice's public key. The result of this computation tells him whether to accept or reject the signed message.

Because Alice is the only person who knows her secret key, only she can (hopefully!) compute a valid-looking signature, i.e., solve the difficult problem arising from the message m and her public key. This is how a digital signature scheme gets its security, and is why digital signature schemes tend to be based on a family of difficult problems.

6.2 Digital signature schemes without security

As indicated, methods for producing and checking digital signatures tend to have certain important elements in common. We shall more or less follow the formalization of Goldwasser, Micali, and Rivest [17] in our definitions here. However, initially, our definition will be almost completely syntactic in nature: in particular, our definition of a digital signature scheme makes no mention whatsoever of security. We shall define the notion of security for a digital signature scheme soon afterward.

A digital signature scheme has a few components:
1. A security parameter is a number k ∈ N. We permit the set K of legal values for k to be a proper subset of N, as long as the set {1^k : k ∈ K} is recognizable in polynomial time. We shall often tacitly assume that K is an infinite set, so that we may consider asymptotic results about security.
2.
For each k ∈ K, the message space M_k is the set of all messages which can be signed for that particular value of k. Without any real loss of generality, we shall assume that M_k = {0, 1}^(ML(k)) for all k ∈ K, where ML(k) is a length function for K.
3. A key generator is a randomized algorithm G(·) which, on input 1^k for k ∈ K, runs in expected time polynomial in k, and outputs a public key P_k of length PL(k) and a secret key S_k of length SL(k), where PL(k) and SL(k) are length functions for K.
4. A signing algorithm is a (possibly randomized) algorithm σ(·, ·, ·, ·) such that if P_k and S_k are matching public and secret keys output by G(1^k), and m ∈ M_k, then σ(1^k, P_k, S_k, m) runs in expected time polynomial in k, and outputs a signature σ_m ∈ {0, 1}^(σL(k)) for m, where σL(k) is a length function for K.
5. A verifying algorithm is a deterministic algorithm V(·, ·, ·, ·) such that if P_k is a public key output by G(1^k), m ∈ M_k, and |σ| = σL(k), then V(1^k, P_k, m, σ) runs in time polynomial in k and outputs a 0 or 1 verdict on the validity of the signature σ of the message m. 1 indicates a proper signature, and 0 indicates an improper one.

DEFINITION: Digital signature scheme. A digital signature scheme is a triple (G(·), σ(·, ·, ·, ·), V(·, ·, ·, ·)), as above, such that if k ∈ K and m ∈ M_k, then Pr(V(1^k, P_k, m, σ_m) = 1 : (P_k, S_k) ← G(1^k); σ_m ← σ(1^k, P_k, S_k, m)) = 1.

In other words, the essential part of a digital signature scheme is that a legitimate signature should always be recognized as being such. Recall that we have yet to say anything about security considerations; our definition as it stands does not preclude a rather useless digital signature scheme which produces the empty string as the signature for every m ∈ M_k. We also permit digital signature schemes to have access to random oracles.
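To fix ideas, the triple (G, σ, V) and its correctness requirement can be rendered in a few lines of Python. This is our own toy instantiation, purely syntactic and deliberately insecure (the public key equals the secret key); it illustrates only the interface and the requirement that honest signatures always verify.

```python
# Toy (G, sigma, V) triple: syntactically a digital signature scheme,
# but with no security whatsoever, since the public key IS the secret key.
import hashlib
import hmac
import os

def G(k):
    # "key generator": outputs a matching (public, secret) key pair
    key = os.urandom(k)
    return key, key          # insecure on purpose: P_k = S_k

def sigma(k, pk, sk, m):
    # "signing algorithm": deterministic here; sigma_L(k) = 256 bits
    return hmac.new(sk, m, hashlib.sha256).digest()

def V(k, pk, m, s):
    # "verifying algorithm": outputs a 0/1 verdict
    return 1 if hmac.new(pk, m, hashlib.sha256).digest() == s else 0
```

The defining property holds by construction: for every key pair output by G(1^k) and every message m, V(1^k, P_k, m, σ(1^k, P_k, S_k, m)) = 1.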
It may seem that a verifying algorithm with access to random oracles is no longer deterministic; however, we make a distinction between the "free randomness" of a randomized algorithm and the "common randomness" used by a deterministic algorithm with access to random oracles. We view an algorithm of the latter type as being deterministic, because anyone else with access to the same random oracles can predict exactly how the algorithm will behave. In contrast, a randomized algorithm's coin flips are private: unless it specifically sends them to someone else, no other parties will know their values.

Some important measures of a digital signature scheme's efficiency include the running times of these three programs (especially the last two, since a given Signer only needs to generate keys once), the lengths of the keys produced by G(·), the lengths of the signatures produced by σ(·, ·, ·, ·), and the sizes of the actual programs for σ(·, ·, ·, ·) and V(·, ·, ·, ·) (for applications where memory is at a premium, such as smart cards).

We shall often be somewhat informal in referring to digital signature schemes. For example, if S is a digital signature scheme, then S_k is "S with security parameter k," and an "S_k-signature of m" is what a Signer obtains by running σ(1^k, P_k, S_k, m).

6.3 Security of digital signature schemes

We use the strongest natural model of security for digital signature schemes: security against an existential forger implementing an adaptive chosen-message attack. In simpler terms, a forger, Chet, starts out knowing only Alice's public key. He performs some computations on it, and comes up with a message m_1 whose signature he feels might be helpful to him. He gives m_1 to Alice, who gives him a (valid) signature σ_{m_1} for it. Chet then computes some more, and gets Alice to sign another message m_2 for him.
He continues computing and gathering signatures in this fashion until, using all the information at his disposal, he finally computes a message m and a (hopefully valid) signature σ_m for it. Chet is successful in forging if his output (m, σ_m) is a valid signed message which he never asked Alice to sign.

Note that there are many other notions of security for digital signature schemes. [17] lists numerous other types of attacks. For example, Chet might not be permitted to obtain valid signatures from Alice, and he might have to be able to forge any message m ∈ M_k, rather than just some single message; in that case, Chet would be a universal forger implementing a key-only attack.

For any reasonable notion of what constitutes successful forgery, we can make the informal definition that a digital signature scheme achieves b bits of security with security parameter k if no adversary with at most 2^(b/2) "units" of computational power can successfully forge with probability higher than 2^(-b/2). Now we shall formalize this definition.

6.4 Signing-oracle nodes

For the following definition, we assume that we have some particular digital signature scheme in mind.

DEFINITION: Signing-oracle node. A signing-oracle node for security parameter k is a special node which takes as input a message m ∈ M_k and outputs a signature σ_m ∈ {0, 1}^(σL(k)) of m relative to a public key P_k, which is supplied as input to the circuit containing the node. In other words, a circuit containing a signing-oracle node should have exactly PL(k) input bits, and the values of those bits are partially responsible for defining the node's computation. We describe in detail exactly how this computation works:
1. A secret key S_k which is compatible with P_k is chosen at random. That is, the probability distribution on the secret keys is the same as the probability distribution on secret keys output by G(1^k), conditioned on G(1^k)'s public key output being P_k.
2.
σ_m is then computed exactly as if σ(1^k, P_k, S_k, m) were run.
3. If the circuit contains more than one signing-oracle node, then the same secret key S_k is used for all of them. That is, first, an S_k is chosen from the appropriate probability distribution. Then, for each signing-oracle node in the circuit, that same S_k, along with 1^k and P_k, is supplied to σ(·, ·, ·, ·), together with whatever input m the node has, to obtain a signature σ_m, which is the node's output.

6.5 Forging circuits and security levels

DEFINITION: Forging circuit. A forging circuit for security parameter k is a probabilistic circuit which, in addition to having the usual Boolean computation nodes (as well as nodes for any random oracles used by the digital signature scheme), can also have signing-oracle nodes. The forging circuit takes a public key, P_k, as input, does some computation with P_k (which can include obtaining signatures of messages relative to P_k, one signature per signing-oracle node), and outputs a forged message and signature (m, σ_m) ∈ {0, 1}^(ML(k)) × {0, 1}^(σL(k)) (which may or may not be accepted as legitimate by V(·, ·, ·, ·)). As indicated previously, the public key used by any signing-oracle nodes in the circuit is the same as the P_k passed as input to the circuit. We also call such a circuit a "k-forging circuit." A k-forging circuit successfully forges if its output (m, σ_m) satisfies:
* V(1^k, P_k, m, σ_m) = 1.
* The circuit did not use m as the input to any of its signing-oracle nodes. In other words, the circuit did not explicitly receive a signature of m for free.

With these preliminaries in place, we are ready to define what security means in the context of digital signature schemes.

DEFINITION: Security level. Let S = (G(·), σ(·, ·, ·, ·), V(·, ·, ·, ·)) be a digital signature scheme. We say that S achieves security level (N, p) for security parameter k if for any k-forging circuit C(·) of size at most N, Pr(C(P_k) successfully forges : (P_k, S_k) ← G(1^k)) ≤ p.
In other words, the probability of successful forgery with a circuit of size at most N is at most p. Naturally, this probability is taken over all random oracles, as well as over the values output by any randomized nodes and signing-oracle nodes in C(·). For convenience, we say that a digital signature scheme achieves b bits of security for security parameter k if it achieves security level (2^(b/2), 2^(-b/2)). We may also be lazy and say things like, "S_k achieves (N, p) security" or "S_k has b bits of security."

Observe how the above definition may be easily modified to obtain alternative notions of security. For example, if we did not permit forging circuits to contain signing-oracle nodes, then our definition would characterize security against existential forgers implementing a key-only attack: forgers who may not interact with the legitimate Signer to obtain valid signatures, and who must come up with a single valid forgery to be successful.

REMARK. At this point, many presentations of digital signature schemes might make an additional definition, calling a digital signature scheme secure if it achieves ω(log k) bits of security for security parameter k. After all, if ω(log k) bits of security are achieved, then the effort to forge grows superpolynomially with k, even though the efforts expended by the legitimate Signer and the Verifier remain polynomial in k. We do not make this definition, because we feel that it is extremely important to focus on the precise amount of effort a forger needs to expend. From a security viewpoint, there is a huge difference between a system which can be broken only with 2^k steps of computation, and a system which can be broken with k^(log log k) steps of computation.

It is clear that any signature scheme which achieves security level (N, p) for a security parameter also achieves security level (N', p'), whenever N' < N and p' > p.
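The forging game underlying these definitions can be sketched as a small harness, parameterized over any (G, σ, V) triple. The harness and the deliberately forgeable demo scheme below are our own illustrations, not part of the formal circuit model.

```python
# Adaptive chosen-message game: the forger gets the public key and a signing
# oracle, and wins iff it outputs a valid signature on a never-queried message.
def forging_game(G, sigma, V, forger, k):
    pk, sk = G(k)
    queried = []
    def oracle(m):
        queried.append(m)
        return sigma(k, pk, sk, m)
    m, s = forger(pk, oracle)
    return V(k, pk, m, s) == 1 and m not in queried

# A scheme whose "signatures" ignore the secret key entirely, so a forger
# wins without even consulting the oracle.
G0 = lambda k: (b"pk", b"sk")
sigma0 = lambda k, pk, sk, m: m[::-1]
V0 = lambda k, pk, m, s: 1 if s == m[::-1] else 0

def forger0(pk, oracle):
    m = b"never queried"
    return m, m[::-1]
```

Here forging_game(G0, sigma0, V0, forger0, k) returns True for every k; a scheme achieving security level (N, p) is exactly one for which every size-N forger wins this game with probability at most p.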
Similarly, any signature scheme which does not achieve security level (N, p) also does not achieve security level (N'', p''), whenever N'' > N and p'' < p.

6.6 An example of a digital signature scheme

We give here a convenient example of a digital signature scheme. It is essentially a cross between Rabin's scheme (see [24]) and Williams's scheme (see [31]), with a random oracle thrown in so that it is possible to rigorously prove security results; both of the above schemes are variants of the RSA scheme presented by Rivest, Shamir, and Adleman in [26]. The security of all of these schemes is based on the presumed intractability of factoring a product of two large primes. For our scheme, we will require a random oracle f : {0, 1}^(2k-2) → {0, 1}^(2k-2). We also need to know a few simple number-theoretic facts which have been observed in [31], [17], and elsewhere:

Let Π_k = {All k-bit prime numbers}. Define the subsets F_k, G_k ⊆ Π_k by setting F_k = {p ∈ Π_k : p ≡ 3 (mod 8)} and G_k = {q ∈ Π_k : q ≡ 7 (mod 8)}, and take H_k = {p · q : p ∈ F_k and q ∈ G_k}. If n ∈ H_k and x is any residue modulo n (i.e., x ∈ Z_n^*), then precisely one of the residues {x, -x, 2x, -2x} is a square modulo n. Furthermore, any quadratic residue modulo n has exactly one square root which is itself a quadratic residue.

1. Legal security parameters. K = {All sufficiently large integers}.
2. Message space. M_k = {0, 1}^(2k-2).
3. Key generation. Upon input 1^k, G(·) chooses, uniformly at random, p ∈ F_k and q ∈ G_k. The public key is the 2k-bit product n = p · q ∈ H_k, and the secret key is the concatenation p ∘ q.
4. Signatures. We sign the message m ∈ {0, 1}^(2k-2) as follows:
(a) Let r be the [unique] element of {f(m), -f(m), 2f(m), -2f(m)} which is a quadratic residue modulo n.
(b) Set σ_m to be the [unique] square root of r modulo n which is itself a quadratic residue.
5. Verification.
To verify a signature σ_m of a message m relative to a public key n, we check that some element of {f(m), -f(m), 2f(m), -2f(m)} is congruent to σ_m^2 modulo n.

For the above digital signature scheme, the capability of signing messages obviously requires the ability to compute square roots modulo n. It turns out that knowing the factorization of n enables one to do this quickly, and conversely: if one can compute square roots modulo n quickly a reasonable fraction of the time, then one can factor n quickly. The way the above scheme is set up, being able to compute square roots modulo n turns out to be more or less the only way to produce a signed message which will be accepted. It doesn't even do Chet any good to have Alice sign messages for him, unless Chet is fortunate enough to be able to use the same signature for two distinct messages. For example, if Chet can find m_1 ≠ m_2 such that f(m_1) = f(m_2), then he can ask Alice to sign m_1, and use the signature he obtains as a signature of m_2. However, given that f(·) is a random function, if Chet wants to be able to find such an m_1, m_2 with probability at least 2^(-2k/3), he will have to perform at least 2^(2k/3-1) evaluations of f(·).

Although we shall not do so here, it is not hard to formalize arguments like the above to show that complexity-theoretic assumptions about the difficulty of factoring imply that forging signatures in this scheme is difficult. A typical assumption to make about the difficulty of factoring is that for some particular 0 < c < 1, for all sufficiently large k, for all probabilistic circuits C(·) of size at most 2^(k^c) which have exactly 2k inputs and k outputs, Pr(C(n) ∈ {p, q} : p, q ∈_R Π_k; n = p · q) ≤ 2^(-k^c). Under such an intractability assumption, it is not difficult to prove rigorously that this scheme has Ω(k^c) bits of security.

Chapter 7
Derived digital signature schemes

Now we can put together everything we've seen so far.
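As a concrete reference point before we modify it, the section 6.6 scheme can be sketched in Python. The parameters below are tiny and insecure, SHA-256 reduced mod n stands in for the random oracle f(·), and square roots are taken with the standard exponent (p+1)/4 trick for primes ≡ 3 (mod 4); treat this as an illustration of the algebra, not as the construction at scale.

```python
# Toy sketch of the Rabin/Williams-style scheme of section 6.6.
# p ≡ 3 (mod 8) and q ≡ 7 (mod 8), as required; sizes are illustrative only.
import hashlib

p, q = 11, 7   # 11 ≡ 3 (mod 8), 7 ≡ 7 (mod 8)
n = p * q      # public key; (p, q) is the secret key

def f(m):
    # stand-in for the random oracle f(.)
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % n

def is_qr(x, prime):
    # Euler's criterion (0 counted as a square)
    x %= prime
    return x == 0 or pow(x, (prime - 1) // 2, prime) == 1

def sqrt_mod_n(r):
    # CRT square root; r^((p+1)/4) works because p, q ≡ 3 (mod 4)
    rp = pow(r % p, (p + 1) // 4, p)
    rq = pow(r % q, (q + 1) // 4, q)
    for sp in (rp, p - rp):
        for sq in (rq, q - rq):
            # combine via CRT; keep the root that is itself a QR mod n
            x = (sp * q * pow(q, -1, p) + sq * p * pow(p, -1, q)) % n
            if is_qr(x, p) and is_qr(x, q):
                return x
    raise ValueError("r is not a quadratic residue mod n")

def sign(m):
    fm = f(m)
    # one of {f(m), -f(m), 2f(m), -2f(m)} is a QR mod n
    for r in (fm, -fm, 2 * fm, -2 * fm):
        r %= n
        if is_qr(r, p) and is_qr(r, q):
            return sqrt_mod_n(r)
    raise ValueError("no quadratic residue found")

def verify(m, sig):
    fm = f(m)
    return any((sig * sig - r) % n == 0
               for r in (fm, n - fm, 2 * fm, n - 2 * fm))
```

Signing uses the factorization (p, q); verification uses only n and the oracle, which is exactly the asymmetry the factoring assumption is meant to protect.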
Looking at how we defined digital signature schemes, we realize that the valid signature for a message m produced by the Signer is more or less just an NP witness that (∃w : V(1^k, P_k, m, w) = 1). Derived signature schemes are obtained by having a signature be a CS proof of knowledge of such a witness, rather than the witness itself.

7.1 Defining derived digital signature schemes

Let S = (G(·), σ(·, ·, ·, ·), V(·, ·, ·, ·)) be a digital signature scheme such that the computation of V(·, ·, ·, ·) can be "factored" into a polynomial-time oracle computation followed by a polynomial-time oracle-free computation. By this, we mean that there exist algorithms V_1(·, ·, ·) and V_2(·, ·) such that:
* V_1 runs in deterministic polynomial time, and can make calls to any random oracles used by the signature scheme.
* V_2 also runs in deterministic polynomial time, but does not consult any random oracles.
* V(1^k, P_k, m, σ) = V_2(V_1(1^k, P_k, m), σ).

Without loss of generality, we require that there be a non-decreasing length function for K, VL(·), such that |V_1(1^k, P_k, m)| = VL(k).

REMARK. It's not difficult to see that the sample digital signature scheme we presented in section 6.6 possesses this curious "factoring" property, as do many other schemes. In fact, the "factoring" property of V(·, ·, ·, ·) required above is trivially seen to hold for any deterministic polynomial-time verifying algorithm which does not consult any random oracles.

Set
WL(n) = σL(k) if there is a k such that VL(k) = n, and WL(n) = 0 otherwise;
P(x, w) = 1 if |w| = WL(|x|) and V_2(x, w) = 1, and P(x, w) = 0 otherwise.

We (slightly informally) define the derived digital signature scheme, S', to be a "2-parameter digital signature scheme" as follows:
1. Legal security parameters. A security parameter for S' is an ordered pair (k, κ), where k ∈ K and κ ≥ 1.
2. Message space. M_(k,κ) = M_k.
3. Key generation. G'(1^k, 1^κ) has the same distribution as G(1^k).
4. Signatures. Set m̂ = V_1(1^k, P_k, m). Then σ'(1^k, 1^κ, P_k, S_k, m) = a κ-CS proof that (∃σ : P(m̂, σ) = 1) (with σ_m = σ(1^k, P_k, S_k, m) as the NP witness).
5. Verification. V'(1^k, 1^κ, P_k, m, σ') = 1 if and only if σ' is a valid κ-CS proof that (∃σ : P(m̂, σ) = 1), where m̂ = V_1(1^k, P_k, m).

Note that since V_1(·, ·, ·) is deterministic, there is a unique m̂ which both the Signer and the Verifier can compute, and so they can each perform the computations they're supposed to be able to do. In fact, G'(1^k, 1^κ), σ'(1^k, 1^κ, P_k, S_k, m), and V'(1^k, 1^κ, P_k, m, σ') all run in time poly(k, κ). Of course, technically, S' is not a proper signature scheme (according to our definition). Nonetheless, we shall effectively consider it to be one, and shall freely talk about things such as the security it achieves for security parameter (k, κ).

7.2 Security of derived signature schemes

We now investigate the security of these variants of digital signature schemes. In particular, we would like to determine some relationship between the security of a digital signature scheme, and the security of the corresponding derived scheme. Suppose we have a signature scheme S and a corresponding derived scheme S'. Say there is a forging circuit F'(·) (with access to the random oracles f(·) and g(·), as well as to any random oracles that S_k uses) of size N', such that Pr(F'(P_k) successfully forges an S'_(k,κ)-signature : (P_k, S_k) ← G'(1^k, 1^κ)) = p'. What we shall do is construct a circuit F(·), which has size "not too much bigger than N'," and which forges S_k-signatures "about as well as" F'(·) forges S'_(k,κ)-signatures. This will imply that forging S'-signatures isn't much easier than forging S-signatures.

Set s and t equal to the number of f(·)-nodes and g(·)-nodes, respectively, in F'(·). In addition, let μ be the number of signing-oracle nodes in F'(·). Our construction of F(·) shall take place in several steps, during which we construct intermediate circuits F_1(·), F_2(·), .... To help keep things straight, the outputs and random variables of circuit F_i(·) shall have the subscript "i," as shall events pertaining to F_i(·).
STEP 1. Let F_1(·) = F'(·). Admittedly, this step isn't a very big step forward; it's just so we can take stock of our situation. In particular, F_1(·) takes a PL(k)-bit input, P_k, and produces an (ML(k) + APWL(VL(k), κ))-bit output, m_1 ∘ σ_1'. For a given P_k, define the following events on Ω_{F_1(P_k)}:
VALID_1(P_k) = {CHECK-CS-PROOF(1^κ, m̂_1, σ_1') = 1}, where m̂_1 = V_1(1^k, P_k, m_1).
NEW_1(P_k) = {m_1 was not the input to any of F_1(P_k)'s signing-oracle nodes}.
Note that (VALID_1(P_k) ∩ NEW_1(P_k)) is the event that F_1(P_k) successfully forges.

STEP 2. Take each signing-oracle node in F_1(·) and replace it with:
1. A signing-oracle node with the same inputs, and which computes S_k-signatures (as opposed to S'_(k,κ)-signatures).
2. Circuitry to compute an S'_(k,κ)-signature from the S_k-signature output by the S_k-signing-oracle node.
Call the resulting circuit F_2(·). Set s' and t' equal to the number of f(·)-nodes and g(·)-nodes, respectively, in F_2(·). Note that s' ≥ s and t' ≥ t, unless μ = 0, since we need to use random-oracle nodes to compute S'_(k,κ)-signatures. In any case, however, s' is no more than poly(μ, k, κ) bigger than s, and t' is no more than poly(μ, k, κ) bigger than t.

STEP 3. Add to F_2(·) circuitry to compute m̂_2 from m_2. In addition, reorder the outputs of F_2(·) to produce a circuit F_3(·), which inputs P_k and outputs m̂_3 ∘ σ_3' ∘ m_3, where m̂_3 = V_1(1^k, P_k, m_3). Regard F_3(·) as being a circuit C(·) of the type discussed in corollary 3. That is, F_3(·) is a collection of 2^(PL(k)) circuits, each taking no input, and each of which, when executed, outputs a candidate (instance, κ-CS proof) pair, m̂_3 ∘ σ_3', together with an "extra" output, m_3. We can define the events NEW_3(P_k), VALID_3(P_k), and WITNESS_3(P_k) on Ω_{F_3(P_k)} in a natural way, exactly as in section 5.4.

STEP 4. Apply corollary 3 to C(·) = F_3(·), obtaining a circuit C'(·) (which we shall call F_4(·), instead). F_4(·)
takes an input Pk, and outputs m̃4 ∘ σ4' ∘ m4 ∘ σ4. Notice that, for a given Pk, there is not an obvious isomorphism mapping global views of F1(Pk) to global views of F4(Pk), or mapping executions of F1(Pk) to executions of F4(Pk). Nonetheless, it's clear that all of the circuits Fi(·) are actually very closely related. The events VALID4(Pk), NEW4(Pk), and WITNESS4(Pk), defined on the space of global views of F4(Pk), have extremely close parallels with corresponding events defined for previous circuits Fi(Pk).

Now, by corollary 3, Pr(NEW4(Pk) ∩ WITNESS4(Pk)) is at least

Pr(NEW4(Pk) ∩ VALID4(Pk)) − [t'(3s'² + s' + 3) + 1]·2^{−κ}.

Saying that F'(·) forges successfully with probability exceeding p' is equivalent to saying that

Σ_{Pk} Pr(Pk)·Pr(NEW1(Pk) ∩ VALID1(Pk)) > p',

where Pr(Pk) is the probability that G'(·,·), on input (k, κ), outputs Pk as a public key. This statement, in turn, is equivalent to a similar statement about F3(·):

Σ_{Pk} Pr(Pk)·Pr(NEW3(Pk) ∩ VALID3(Pk)) > p'.

Everything comes together when we take two more steps:

STEP 5. Modify F4(·) so that it no longer outputs m̃ and σ'. Call the result F5(·). F5(·) can be seen to be a forging circuit for Sk, except that it contains random-oracle nodes for evaluating f(·) and g(·). Furthermore, F5(·) successfully forges with probability exceeding p' − [t'(3s'² + s' + 3) + 1]·2^{−κ}.

STEP 6. Replace the f(·)-nodes and g(·)-nodes in F5(·) with simulated oracle nodes to construct a circuit F6(·). In other words, each simulated random-oracle node should just output a random value, except that whenever two simulated random-oracle nodes receive the same input, they should have the same output. At long last, we have our final circuit, F(·) = F6(·). The size of F(·) is no more than poly(s, t, μ, k, κ) more than N'; and the probability that F(·) successfully forges [an Sk-signature] is less than p' by at most poly(s, t, μ, k, κ)·2^{−κ} (the polynomial bound on the size of F(·)
can be easily, but painfully, verified).

Theorem 3 If S'_{(k,κ)} does not achieve (N', p') security, then Sk does not achieve (poly(N', k), p' − poly(N', k)·2^{−κ}) security.

Proof This follows from the fact that s, t, μ, and κ are all less than N'.

It's a simple matter now to restrict our derived signature schemes to being ordinary, 1-parameter schemes.

Corollary 4 Let S be a digital signature scheme, with associated derived digital signature scheme S'. Let b(k) = ω(log k) be such that for all k ∈ K, Sk achieves b(k) bits of security; and let b'(k) ≥ b(k). Define the digital signature scheme Tk = S'_{(k,b'(k))}. Then Tk achieves Ω(b(k)) bits of security.

Proof By theorem 3, we can find a constant c such that for sufficiently large N' and k, if the scheme S'_{(k,κ)} does not achieve (N', p') security, then Sk does not achieve ((N'k)^c, p' − (N'k)^c·2^{−κ}) security. Setting N = (N'k)^c and p = p' − (N'k)^c·2^{−κ}, and taking the contrapositive of this, we obtain: if N^{1/c}·k^{−1} and k are large enough, and if Sk achieves (N, p) security, then S'_{(k,κ)} achieves (N^{1/c}·k^{−1}, p + N·2^{−κ}) security. We know that Sk achieves (2^{b(k)/2}, 2^{−b(k)/2}) security. Any 2^{ω(log k)} function grows faster than k does; hence we can apply our contrapositive statement. We find that for large enough k, S'_{(k,κ)} achieves (2^{b(k)/2c}·k^{−1}, 2^{−b(k)/2} + 2^{b(k)/2}·2^{−κ}) security. Since κ ≥ b(k), we see that for large enough k, Tk achieves (2^{b(k)/2c}·k^{−1}, 2·2^{−b(k)/2}) security. Given that b(k) = ω(log k), this implies that Tk achieves Ω(b(k)) bits of security.

7.3 Signature length for signature schemes

Recalling that a κ-CS proof that x ∈ L(P, WL) has length O(κ²·log |x|), we see that a Tk-signature of a message m has length O(b'(k)²·log k). That is, our 1-parameter derived digital signature scheme produces signatures of length O(b'(k)²·log k) to achieve Ω(b(k)) bits of security. Consider once more our sample digital signature scheme from section 6.6, which we shall call S.
For a given security parameter k, all permissible messages have length Θ(k); public keys, secret keys, and signatures have length Θ(k) as well. How many bits of security do these signatures provide? At present, the best available factoring algorithms can factor k-bit numbers in exp(O(k^{1/3}·log^{2/3} k)) steps (see Adleman's survey paper [1] for more information). Thus Sk has O(k^{1/3}·log^{2/3} k) bits of security for Θ(k)-bit signatures. If we set b'(k) = Θ(k^{1/3}·log^{2/3} k), we see that the scheme Tk has the same bit security (up to a constant factor) as Sk, but produces signatures of length O((k^{1/3}·log^{2/3} k)²·log k) = O(k^{2/3}·log^{7/3} k). Of course, the easier factoring turns out to be, the more it pays to use CS proofs as signatures, and the more savings can be gained in signature lengths. The "breakeven" point occurs if Θ(k)-bit signatures supply Θ(k^{1/2}·log^{−1/2} k) bits of security; however, as mentioned in the previous paragraph, it is already known that factoring is easier than that. We can examine the security/signature-length relationship in the other direction as well: what security do Sk and Tk have for signatures of a given size Θ(k)? If we assume that Sk actually does achieve Θ(k^{1/3}·log^{2/3} k) bits of security, then we can see that Tk achieves Θ(k^{1/2}·log^{−1/2} k) bits of security.

7.4 Public key length for signature schemes

As suggested by Rivest [25], in addition to decreasing signature length, CS proofs can also be used to decrease the size of users' public keys. We show here what we mean. Let S be a digital signature scheme such that V(·,·,·,·) can be "factored," in the sense that there exist algorithms V1(·,·) and V2(·,·,·) such that:

* V1 runs in deterministic polynomial time, and can make calls to random oracles.
* V2 runs in deterministic polynomial time, and does not consult any random oracles.
* V(1^k, Pk, m, σ) = V2(V1(1^k, m), Pk, σ).
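The factoring condition on V can be made concrete with a small sketch (our names, toy RSA numbers, insecure and for illustration only): V1 may consult oracles but sees only the message, never Pk, while V2 is a pure, oracle-free computation.

```python
import hashlib

n, e, d = 3233, 17, 2753          # toy RSA parameters (61 * 53); illustration only

def V1(m):
    # May call random oracles -- a strong hash stands in here -- but its
    # input is the message alone, never the public key Pk.
    dig = hashlib.sha256(m).digest()
    return int.from_bytes(dig, "big") % n

def V2(m_tilde, pk, sigma):
    # Deterministic and oracle-free: a pure computation on its inputs,
    # with pk playing the role of the modulus.
    return pow(sigma, e, pk) == m_tilde

def V(pk, m, sigma):
    # The "factored" verifier: V(1^k, Pk, m, sigma) = V2(V1(1^k, m), Pk, sigma).
    return V2(V1(m), pk, sigma)

sigma = pow(V1(b"a message"), d, n)   # a valid signature on the hashed instance
assert V(n, b"a message", sigma)
```

Because V1 is independent of Pk, the statement proved in the scheme S'' below, that some pair (σ, Pk) satisfies V2(m̃, Pk, σ) = 1 and H(Pk) = P̃k, contains no oracle calls and so remains an NP statement.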
This is a slightly stronger condition on V(·,·,·,·) than we encountered for constructing derived schemes, because it does not permit any oracle calls to depend on the value Pk. However, once again, both our sample digital signature scheme and any trapdoor scheme with a deterministic, oracle-free verifying algorithm satisfy the conditions here. Let b(k) = ω(log k) be such that Sk achieves b(k) bits of security, and let b'(k) satisfy b'(k) ≥ b(k). Let H(·) be a polynomial-time computable collision-free hash function which, for each k ∈ K, maps PL(k)-bit inputs to b'(k)-bit outputs (the intent is to use H(·) to hash public keys to obtain shorter public keys). We modify the scheme S in the following way to produce a scheme S'':

* G''(1^k) works just like G(1^k), and outputs the same secret key Sk, but instead of outputting the public key Pk, it outputs P̃k = H(Pk) as the Signer's public key.
* Set m̃ = V1(1^k, m). Then σ''(1^k, P̃k, Sk, m) produces a b'(k)-CS proof that (∃σ ∃Pk: (V2(m̃, Pk, σ) = 1 and H(Pk) = P̃k)).
* V''(1^k, P̃k, m, σ'') tests whether or not σ'' is a valid CS proof.

If H(·) were an actual random oracle, then it would be very unlikely that any forger could compute any value other than Pk which is mapped to P̃k by H(·). This is ideal from a security point of view. However, with a random oracle for H(·), we find ourselves unable to write a program for σ''(·,·,·,·), since the statement that it would be trying to produce a CS proof for would contain an oracle call, and hence would not be an NP statement. Indeed, this is precisely the reason we need our verifying algorithms to possess a "factoring" property. So instead of a random oracle, we use a collision-free hash function for H(·). We discuss this substitution and related matters more in section 7.5.1.

7.5 Practical considerations

There are two important points to be addressed about how feasible it actually is to use our construction of derived signature schemes.
7.5.1 Implementing random oracles

One difficulty is that our derived schemes require the use of random oracles. This is fine in theory, but to actually implement our schemes, we need to replace the random oracles with something a bit more concrete. In practice, in situations like this (see e.g. Fiat and Shamir [14]), a cryptographically strong hash function is usually substituted for a random oracle; a good hash function of this type should more or less appear to behave like a "random" function. We can also use a "poly-random function" constructed from a one-way function, as defined and constructed by Goldreich, Goldwasser, and Micali in [16]. However, to use a poly-random function in the setting of digital signatures, we must make public the "seed" of the poly-random function we use, which may mean that the function no longer appears to be random for our purposes. This would open up the possibility of new methods of forging which were impossible against random oracles. There are more thoughts and philosophy on these matters in Micali [22]; also, see Bellare and Rogaway [7] for a more general discussion of the role of random oracles in cryptography. In any event, the basic idea is to replace the random oracles with functions from the "cryptographer's toolbox" which appear (to a computationally-bounded adversary) to be random. The hope is that the type of functions selected will behave sufficiently like random oracles that very little security (in some appropriate sense) will be lost in the transition from random oracles to "random" functions.

Digital signature schemes of the type we consider in this thesis depend not merely on "random" functions, but on trapdoor functions: families of one-way functions which each have some secret information associated with them, enabling them to be inverted quickly. The trapdoor function our sample scheme uses is squaring modulo a particular type of composite number.
Unfortunately, this trapdoor function only gives b(k) = O(k^{1/3}·log^{2/3} k) bits of security when mapping inputs of size Θ(k) to outputs of size Θ(k), since a circuit of size 2^{b(k)} can completely "break" the function by factoring a Θ(k)-bit integer (at best, of course, we could naively cross our fingers and hope to achieve Θ(k) bits of security in this sense). It seems quite possible that any trapdoor function has this kind of security inefficiency. In a sense, the CS proof mechanism allows us to use very secure one-way functions, such as cryptographically strong hash functions, to partially bypass one of the unpleasant consequences of the insecurity present in trapdoor functions. In particular, the unpleasant consequence we are worried about here is the need to use inputs for trapdoor functions which are significantly longer than the amount of security that they provide.

The definition of derived signature schemes is actually somewhat reminiscent of how, for encryption of data, one often uses a public-key cryptosystem only to agree on a common key to use for a conventional cryptosystem, which is then used to encrypt the bulk of the message at hand. In other words, for convenient encryption and decryption, some kind of public-key mechanism is needed (just as, in our setting, some kind of trapdoor function is needed), but it would be inefficient to use it for encrypting an entire long message. Thus it is combined with some kind of "random" function technology to make the entire transaction much more efficient.

Of course, when using one-way functions to obtain short public keys, as in section 7.4, some of the above thoughts again apply. However, in this situation, our chosen function H(·) need not actually behave in a "random" manner; it suffices for it to be difficult to invert, so that it's infeasible for an adversary to obtain a spurious public key which also hashes to P̃k.
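The two options just discussed can be sketched side by side: a lazily-sampled table gives an exact simulated random oracle of the kind used in Step 6 of the security reduction, while a cryptographically strong hash function is the practical stand-in. SHA-256 is our choice for illustration, not the thesis's.

```python
import hashlib
import os

class LazyRandomOracle:
    """Exact simulation of a random oracle: each fresh query receives an
    independent random answer; repeated queries answer consistently."""
    def __init__(self, out_bytes=32):
        self.out_bytes = out_bytes
        self.table = {}

    def query(self, x: bytes) -> bytes:
        if x not in self.table:
            self.table[x] = os.urandom(self.out_bytes)
        return self.table[x]

def practical_oracle(x: bytes) -> bytes:
    """The practical substitution: a strong hash function whose behavior we
    hope looks random to a computationally bounded adversary."""
    return hashlib.sha256(x).digest()

ro = LazyRandomOracle()
assert ro.query(b"pk || m") == ro.query(b"pk || m")   # consistent on repeats
assert len(practical_oracle(b"pk || m")) == 32
```

The table-based oracle is only usable inside a proof or an experiment, since it cannot be published; the hash function is a fixed public program, and that gap is exactly where new forging methods could in principle appear.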
7.5.2 Asymptotics

A more serious problem (from a practical perspective) is that our results are of an asymptotic nature. That is, the improvements in signature length we produce do not evidence themselves except for "sufficiently large" security parameters. It is all too possible that the constants hidden in the "O(·)" notation of our results are too big to make derived signature schemes readily useful. Let us do some rough calculations. Suppose we wish to use a derived signature scheme S'_{(k,κ)} to sign a message with b bits of security. Essentially, there are two orthogonal ways that an adversary, Chet, can successfully forge a message:

1. Chet can break the original signature scheme, Sk.
2. Chet can break the κ-CS proof which was "layered" on top of Sk to make S'_{(k,κ)}.

We want to get a lower bound on κ. To do this, we need only worry about Chet forging via the second possibility. In addition, we need only consider forging circuits with no signing-oracle gates. Let's examine this in detail. Assume with some minor loss of generality that V1(1^k, Pk, m) = m. This means that a properly signed message in Sk consists of an instance (the message) and an NP witness (the signature). To have b bits of security from S'_{(k,κ)}, we need to be able to claim that any circuit C(·) of size at most 2^{b/2} forges successfully with probability at most 2^{−b/2}. Since a successful forgery is a valid (instance, κ-CS proof) pair, to obtain this conclusion from corollary 3, we need the inequality

[t(3s² + s + 3) + 1]·2^{−κ} ≤ 2^{−b/2}

to hold for any circuit C(·) of size at most 2^{b/2}. This implies, roughly, that 3s²t ≤ 2^κ·2^{−b/2} must hold. Since each f(·)-gate has 2κ inputs, and each g(·)-gate has κ inputs, it follows that s·2κ + t·κ ≤ size(C(·)). However, there's not much more we can say to relate s and t to the size of C(·) than this. So we more or less need to be able to say that whenever s·2κ + t·κ ≤ 2^{b/2} holds, 3s²t ≤ 2^κ·2^{−b/2} holds as well.
3s²t is maximized, subject to the constraint that s·2κ + t·κ ≤ 2^{b/2}, when s = t = 2^{b/2}/(3κ). Thus we require that

3·(2^{b/2}/(3κ))³ = 2^{3b/2}/(9κ³) ≤ 2^κ·2^{−b/2}, or equivalently, 9κ³·2^κ ≥ 2^{2b}.

If we are trying to obtain b = 100 bits of security from our signatures, this means that κ ≥ 175. How long are the resulting S'_{(k,κ)}-signatures? We remember that Λ_{P,WL}(n, κ) ≥ 2qaκ². Current research in complexity theory (see Bellare, Goldwasser, Lund, and Russell [6]) indicates that a value of q somewhere around 20 is sufficient for probabilistically checkable proofs (recall that q is the number of proof bits of a probabilistically checkable proof that the Verifier examines). Even if the height of the tree we tree-hash with to create CS proofs is only a = 1, this means that signatures are 1.2 megabits long. In contrast, the best factoring algorithms known at present factor k-bit numbers in expected running time approximately exp(1.9·(k ln 2)^{1/3}·[ln(k ln 2)]^{2/3}) (see [1]). Let us assume that this means that our sample factoring-based digital signature scheme has (2k ln 2)^{1/3}·[ln(2k ln 2)]^{2/3} bits of security (to break Sk, Chet needs to factor a 2k-bit number). It turns out that k = 8500 provides more than 100 bits of security for Sk, if all our assumptions hold. In other words, 17-kilobit signatures suffice to provide 100 bits of security with the original digital signature scheme. This is less than 1/70 as long as the derived signatures we considered. Obviously, these numbers do not speak well for derived digital signature schemes. However, times change. Ongoing progress in both hardware and software design continually decreases the time needed to factor large numbers and compute discrete logarithms [1], and since most digital signature schemes are based on one of these two problems, users are always finding it necessary to increase their security parameters. At some point, keys could become long enough that it becomes worthwhile to use derived signature schemes.
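These rough calculations are easy to check mechanically. The sketch below takes the stated assumptions at face value (q = 20, tree height a = 1, a 2qaκ² length bound, and reading the exponent of the factoring running time directly as bits of security) and reproduces the κ ≥ 175 threshold, the 1.2-megabit CS signatures, and the 17-kilobit comparison point.

```python
import math

b = 100                                           # target bits of security

# Smallest kappa with 9 * kappa^3 * 2^kappa >= 2^(2b), checked in log2 form:
kappa = next(k for k in range(1, 1000)
             if math.log2(9) + 3 * math.log2(k) + k >= 2 * b)
assert kappa == 175

# CS-proof signature length, with q = 20 query bits and tree height a = 1:
q, a = 20, 1
cs_sig_bits = 2 * q * a * kappa ** 2              # 1,225,000 bits, about 1.2 Mbit

# Bits of security of the factoring-based scheme: Chet must factor a
# 2k-bit number, and we read the running-time exponent as bits directly.
def factoring_bits(k):
    x = 2 * k * math.log(2)
    return x ** (1 / 3) * math.log(x) ** (2 / 3)

k = 8500
assert factoring_bits(k) > 100                    # just over 100 bits
orig_sig_bits = 2 * k                             # 17-kilobit signatures
assert cs_sig_bits / orig_sig_bits > 70           # derived sigs are >70x longer
```

The κ search confirms the knife-edge nature of the bound: κ = 174 falls about half a bit short of 2^{200}, while κ = 175 just clears it.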
Chapter 8

Conclusion

We have proven that CS proofs satisfy a stronger, more constructive notion of soundness than was originally presented in [22]. Our proof enables us to use CS proofs as noninteractive proofs of knowledge; that is, they can be an efficient method for a Prover to convince a Verifier that it "knows" some NP fact. We have demonstrated how to apply these proofs of knowledge to digital signature schemes. Starting with a digital signature scheme, we have created a derived digital signature scheme from it, in which the signer signs a message by providing a proof that it knows a signature of the message in the original signature scheme. For large enough security parameters, these new signatures can be much shorter than the original signatures. It is also possible to use this approach to create digital signature schemes with shorter public keys than would otherwise be needed to achieve a given security level.

Our method of modifying digital signature schemes is very general. Although (for demonstration purposes) we concentrated especially on a particular simple variant of the RSA signature scheme (see [26], [24], and [31]), our construction can be applied to any of a broad category of signature schemes, for example, the El-Gamal digital signature scheme [11].

In addition to our results about CS proofs of knowledge and how to apply them to digital signature schemes, we have also explicitly modelled security for both of these settings against a non-uniform adversary (i.e., against circuits), instead of against a more customary uniform adversary (i.e., a Turing machine). We feel that this is the proper way to model adversaries in most cryptographic settings, because it more accurately takes into account the attacks that an adversary might make: to successfully attack a cryptographic protocol, an adversary only needs to be able to "break" it for whatever value of security parameter is actually in use.

Bibliography

[1] L. M. Adleman.
Algorithmic number theory - the complexity contribution. In Proceedings of the 35th Annual IEEE Symposium on Foundations of Computer Science, pages 88-113. IEEE, 1994.

[2] S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. Proof verification and hardness of approximation problems. In Proceedings of the 33rd Annual IEEE Symposium on Foundations of Computer Science, pages 14-23. IEEE, 1992.

[3] S. Arora and S. Safra. Probabilistic checking of proofs: a new characterization of NP. In Proceedings of the 33rd Annual IEEE Symposium on Foundations of Computer Science, pages 2-13. IEEE, 1992.

[4] L. Babai, L. Fortnow, L. Levin, and M. Szegedy. Checking computations in polylogarithmic time. In Proceedings of the 23rd Annual ACM Symposium on Theory of Computing, pages 21-31. ACM, 1991.

[5] L. Babai, L. Fortnow, and C. Lund. Non-deterministic exponential time has two-prover interactive protocols. Computational Complexity, 1:3-40, 1991.

[6] M. Bellare, S. Goldwasser, C. Lund, and A. Russell. Efficient probabilistically checkable proofs and applications to approximation. In Proceedings of the 25th Annual ACM Symposium on Theory of Computing, pages 294-304. ACM, 1993.

[7] M. Bellare and P. Rogaway. Random oracles are practical: a paradigm for designing efficient protocols. In Proceedings of the 1st ACM Conference on Computer and Communications Security. ACM, 1993.

[8] J. Benaloh and M. de Mare. One-way accumulators: a decentralized alternative to digital signatures. In Advances in Cryptology - EUROCRYPT '93, pages 274-285. Springer-Verlag, 1993.

[9] A. De Santis and G. Persiano. Zero-knowledge proofs of knowledge without interaction. In Proceedings of the 33rd Annual IEEE Symposium on Foundations of Computer Science, pages 427-436. IEEE, 1992.

[10] W. Diffie and M. Hellman. New directions in cryptography. IEEE Transactions on Information Theory, IT-22(6):644-654, November 1976.

[11] T. El-Gamal. A public key cryptosystem and a signature scheme based on discrete logarithms.
In Advances in Cryptology - CRYPTO '84, pages 10-18. Springer-Verlag, 1984.

[12] U. Feige, A. Fiat, and A. Shamir. Zero-knowledge proofs of identity. Journal of Cryptology, 1(2):77-94, 1988.

[13] U. Feige, S. Goldwasser, L. Lovász, S. Safra, and M. Szegedy. Approximating clique is almost NP-complete. In Proceedings of the 32nd Annual IEEE Symposium on Foundations of Computer Science, pages 2-12. IEEE, 1991.

[14] A. Fiat and A. Shamir. How to prove yourself: practical solutions to identification and signature problems. In Advances in Cryptology - CRYPTO '86, pages 186-194. Springer-Verlag, 1986.

[15] L. Fortnow, J. Rompel, and M. Sipser. On the power of multi-prover interactive protocols. In Proceedings of the 3rd Annual IEEE Symposium on Structure in Complexity Theory, pages 156-161. IEEE, 1988.

[16] O. Goldreich, S. Goldwasser, and S. Micali. How to construct random functions. Journal of the Association for Computing Machinery, 33(4):792-807, October 1986.

[17] S. Goldwasser, S. Micali, and R. L. Rivest. A digital signature scheme secure against chosen-message attacks. SIAM Journal on Computing, 17(2):281-308, April 1988.

[18] S. Khanna, R. Motwani, M. Sudan, and U. Vazirani. On syntactic versus computational views of approximability. In Proceedings of the 35th Annual IEEE Symposium on Foundations of Computer Science, pages 819-830. IEEE, 1994.

[19] J. Kilian. A note on efficient zero-knowledge proofs and arguments. In Proceedings of the 24th Annual ACM Symposium on Theory of Computing, pages 723-732. ACM, 1992.

[20] C. Lund, L. Fortnow, H. Karloff, and N. Nisan. Algebraic methods for interactive proof systems. In Proceedings of the 31st Annual IEEE Symposium on Foundations of Computer Science, pages 2-10. IEEE, 1990.

[21] R. C. Merkle. A certified digital signature. In Advances in Cryptology - CRYPTO '89, pages 218-238. Springer-Verlag, 1989.

[22] S. Micali. CS proofs.
In Proceedings of the 35th Annual IEEE Symposium on Foundations of Computer Science, pages 436-453. IEEE, 1994.

[23] A. Polishchuk and D. A. Spielman. Nearly linear-size holographic proofs. In Proceedings of the 26th Annual ACM Symposium on Theory of Computing, pages 194-203. ACM, 1994.

[24] M. O. Rabin. Digitalized signatures and public-key functions as intractable as factorization. Technical Report TR-212, MIT, January 1979.

[25] R. L. Rivest. Personal communication, March 1995.

[26] R. L. Rivest, A. Shamir, and L. Adleman. A method for obtaining digital signatures and public-key cryptosystems. Communications of the ACM, 21(2):120-126, 1978.

[27] A. Shamir. IP = PSPACE. In Proceedings of the 31st Annual IEEE Symposium on Foundations of Computer Science, pages 11-15. IEEE, 1990.

[28] D. A. Spielman. Computationally Efficient Error-Correcting Codes and Holographic Proofs. PhD thesis, Massachusetts Institute of Technology, 1995.

[29] M. Sudan. Efficient Checking of Polynomials and Proofs and the Hardness of Approximation Problems. PhD thesis, University of California at Berkeley, 1992.

[30] M. Tompa and H. Woll. Random self-reducibility and zero knowledge interactive proofs of possession of information. In Proceedings of the 28th Annual IEEE Symposium on Foundations of Computer Science, pages 472-482. IEEE, 1987.

[31] H. C. Williams. A modification of the RSA public-key cryptosystem. IEEE Transactions on Information Theory, IT-26(6):726-729, 1980.