>>: Now we're going to have Vinod from Microsoft Research here at Redmond. In case you don't know him, here he is. He's going to speak about fully homomorphic encryption over the integers. Enjoy.

>> Vinod Vaikuntanathan: Thanks. This talk's about fully homomorphic encryption over the integers. This paper appeared in EuroCrypt this year, and it is joint work with Marten van Dijk from RSA Labs and my ex-colleagues Craig and Shai from IBM Watson.

So over the past two days, we have heard a lot of very interesting talks about the problems of computing on encrypted data: outsourcing computation, fully homomorphic encryption, and so on and so forth. Perhaps too much. And this talk is going to be about more of the same.

So let's refresh our minds a little bit. The setting that we are looking at is a two-party setting. There's a client, Alice, who is computationally bounded; she's resource bounded. She has an input X, and she wants to compute a program P on her input X and get the result. The program P is computationally intensive, so she wants to delegate her computation to a powerful server, and she wants to do it while preserving the privacy of her input. She's a paranoid cryptographer, so she doesn't want to give the server her input; she wants to protect its privacy.

The economical way to achieve privacy is to encrypt your input. So she encrypts it and sends it to the cloud, and now we want a magic box which takes the ciphertext, an encryption of X, and the program, and somehow computes the result of the program on the input X inside the encryption. So it takes two things, the program and the ciphertext, and outputs the encryption of the result. Assume that you have an encryption scheme with such a magic box. The server can compute the result and send it back to the client, she decrypts, everyone's happy. That's good.

A fully homomorphic encryption is exactly an encryption scheme which comes bundled with this magic box: something that takes a program and a ciphertext and computes the program inside the ciphertext. We're all cryptographers, so let me be a little more formal -- yes?

>>: [inaudible].

>> Vinod Vaikuntanathan: That's okay. I mean, people have short memories. Not too much, though. Okay. So, again, this is a refresher. Fully homomorphic encryption is exactly like a regular encryption: you have a key generation algorithm, encryption and decryption. And the extra thing here is this magic box, which is exactly an evaluation algorithm, which takes a program and a ciphertext and computes the program inside the ciphertext.

So we need two properties, which are critical. One is what we will call compactness, which says that no matter how big a program you compute on the encrypted data, the resulting ciphertext doesn't grow. It's not like the resulting ciphertext is just as big as the program; in that case, it would be completely pointless for our application. So we need the ciphertext to be small. This is part of the correctness property. And on the other hand, we need security, which is exactly standard semantic security. Okay. So this is fully homomorphic encryption.

A little bit of history, which you already heard in Craig's talk yesterday: this notion was defined by Rivest, Adleman and Dertouzos in '78. Before Craig's work, we knew very limited classes of homomorphic encryption: either you could do addition or multiplication, but not both.
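Before going on, here is a minimal sketch of the syntax described a moment ago, to pin down the interface. This is illustrative Python-style pseudocode; all names are placeholders, not the paper's notation.

    class FHE:
        def keygen(self, n):
            """Return (pk, sk) for security parameter n."""
            raise NotImplementedError

        def encrypt(self, pk, bit):
            """Return a ciphertext hiding bit; semantic security applies."""
            raise NotImplementedError

        def decrypt(self, sk, ct):
            """Recover the encrypted bit."""
            raise NotImplementedError

        def evaluate(self, pk, program, cts):
            """The magic box: return a ciphertext encrypting program applied
            to the underlying bits. Compactness: the output's size is bounded
            independently of the size of program."""
            raise NotImplementedError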
And there were some extensions where you can do an unlimited number of additions but just one multiplication; in other words, you can evaluate quadratic formulas on an encrypted input. That's what we knew before. And it turns out that if you don't insist on this compactness property, if you allow the ciphertext to grow with the function that you're computing, you have a whole bunch of constructions. In fact, in a recent result, also with Craig and Shai, we show that you can construct such an encryption scheme generically from garbled circuits, so it can be based on DDH or any of these standard assumptions. So really, the hard part in constructing homomorphic encryption is compactness: ensuring that the ciphertext stays small.

This problem was open for a while, until Craig constructed a fully homomorphic encryption based on algebraic number theory and ideal lattices and all kinds of heavy machinery. The question we will ask in this talk is: is there an elementary construction of fully homomorphic encryption? In particular, I don't want to use ideal lattices. I don't know ideal lattices, so basically I can't use them. I want to construct an encryption scheme which uses just addition and multiplication, okay? It has to be relatively simple, easy to understand. That's exactly what we will show in this talk. We'll construct a fully homomorphic public-key encryption scheme which uses only addition and multiplication over the integers, and the security is based on what we call the approximate greatest common divisor (approximate GCD) assumption and the sparse subset sum assumption. I'll define both in the course of the talk.

Okay. So let's jump right into the construction. The construction will proceed in a number of steps, and the basic building block is a construction of a secret-key, somewhat homomorphic encryption scheme. I'll tell you what "somewhat" means in a little bit. It's different from what we want in the end in two respects. One, it's a secret-key encryption, as opposed to a public-key encryption. It turns out this can be very easily fixed: there's a generic transformation that takes a secret-key scheme of this form and converts it into a public-key encryption. The second deficiency is that it's only somewhat homomorphic. Roughly speaking, this means that it can evaluate a lot of circuits, a lot of programs, but not every program, not every polynomial-time computable function. And that can be fixed using techniques borrowed from Gentry, like bootstrapping and squashing. So what I'm going to focus on in this talk is the basic building block, namely the construction of a secret-key, somewhat homomorphic encryption scheme.

Okay, so let's jump right into it. How do we construct a secret-key somewhat homomorphic encryption? The secret key is going to be a large odd number P: if N is the security parameter, P is roughly an N-squared-bit odd number. To encrypt a bit B -- I'll describe bit encryption for simplicity, but you can extend this to larger domains as well -- I pick a random, very large multiple of P. The secret P is roughly N squared bits, but the multiplier Q is roughly N to the 5 bits, so this is a pretty large multiple of P. That's number one.
Number two, pick a random small even number, two times R. Pick a random small number R, roughly N bits, much smaller than the secret, and multiply by two; you get a random small even number. And sum all these things together. So it's Q times P, plus two times R, plus the bit that you want to encrypt. This is the ciphertext. This is the encryption procedure.

How do you decrypt? Well, the first thing I do is, I know the secret P, so I can remove the multiple of P from the ciphertext: take the ciphertext mod P. What I get is the remaining part, two times R plus B, mod P. Now, the key thing to observe is that this number is so small that taking it mod P doesn't do anything to it. It remains the same. It's like taking ten mod 17, right? Ten mod 17 is ten. So it remains the same. And now, the bit that I'm interested in is simply the lowest-order bit, the least significant bit, of two times R plus B. So take this mod 2, or simply read off the least significant bit, and I get the bit that I'm interested in. This is the decryption procedure. So far, I used a little bit of modular arithmetic, a little bit of integer addition and multiplication. That's all.

Okay. So this is the encryption scheme. The only thing I have to show you now is how to add and multiply encrypted bits. Given an encryption of B1 and B2, how do I compute the encryption of B1 plus B2 and B1 times B2? Given this, you can write any program as a combination of additions and multiplications, so you can put these building blocks together to compute any program that you want. So this is what I will illustrate. This actually turns out to be pretty simple. Very simple, actually. And the idea is that a ciphertext is a near multiple of P. If I add or multiply two near multiples of P, I get another near multiple of P. It's not going to be as near, as close, as before, but you can set the parameters so that it still remains pretty close.

Okay. So concretely, how does this work? Let's say we have two ciphertexts, a ciphertext of B1 and a ciphertext of B2, and we want to compute a ciphertext that encrypts B1 plus B2. How do I do this? I simply add these two numbers as integers: C1 plus C2. That's simply a multiple of P plus a small enough number. It's a little bit bigger than before, but its least significant bit is the XOR, B1 XOR B2. And the key thing to note is that if I set the original noise to be small enough, then this noise doesn't grow too much. It still remains pretty small compared to the secret key.

Multiplication is the same thing, except that the expression gets a little bit more complicated. If you write it down, it turns out to be a multiple of P again, plus a number which remains small enough, although it grows much faster than when you add. If you multiply two numbers, the noise grows much faster than if you add. But still, its least significant bit is the product of the two bits you are interested in, and we can make sure that it stays within bounds.

Okay, so this is how you add and multiply bits. Once you have this, you can compose them together to compute any program. So this is what you do. So happy, right? I mean, everything's great. Except for two problems. One, the ciphertext grows with each operation. Every time you add two ciphertexts, the bit length of the resulting ciphertext grows by one. When you multiply, the bit length grows by a factor of two, right?
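To make the scheme concrete, here is a toy Python sketch of what was just described. This is purely illustrative: the parameter sizes below are tiny stand-ins for the talk's actual settings (P roughly N-squared bits, Q roughly N-to-the-5 bits, R roughly N bits), and nothing here is meant to be secure.

    import random

    def keygen(eta=64):
        # secret key: a large random odd number p (top and bottom bits set)
        return random.getrandbits(eta) | 1 | (1 << (eta - 1))

    def encrypt(p, bit, gamma=320, rho=16):
        q = random.getrandbits(gamma)      # large random multiplier
        r = random.getrandbits(rho)        # small random noise
        return q * p + 2 * r + bit         # ciphertext: a near multiple of p

    def decrypt(p, c):
        # reduce mod p into the centered range, then read off the LSB
        m = c % p
        if m > p // 2:
            m -= p
        return m % 2

    # homomorphic operations are just integer addition and multiplication
    def add(c1, c2): return c1 + c2        # decrypts to b1 XOR b2
    def mul(c1, c2): return c1 * c2        # decrypts to b1 AND b2

    p = keygen()
    c1, c2 = encrypt(p, 1), encrypt(p, 1)
    assert decrypt(p, add(c1, c2)) == 0    # 1 XOR 1 = 0
    assert decrypt(p, mul(c1, c2)) == 1    # 1 AND 1 = 1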
So this is not any good, because what this means is that the resulting ciphertext of a computed program is essentially going to be as big as the program. So why would the client bother? It's not outsourcing anymore. So you want to keep the ciphertext bounded. That's one thing we want to do. The second problem is that the noise contained within the ciphertext, this number R that I talked about, also grows with each operation. This is also problematic. Let's see why that's the case.

A ciphertext, you can write it as a point on the number line, right? These are all the multiples of P, and a ciphertext is a multiple of P plus a little bit of noise, two times R plus B. And I claim that if this noise grows larger than half of P, then you're done; the decryption doesn't succeed. Why is that the case? Well, you're going to first compute the ciphertext mod P, and when I say mod, I mean the operation that gives you the distance to the nearest multiple of P. Now the distance between the ciphertext and the nearest multiple of P is some number R prime, which is different from two times R plus B, different from what I used for encryption. In particular, the least significant bit of R prime is going to be something different; in fact, it's going to be the negation of the bit that I encrypted. So the decryption is not going to succeed anymore. The lesson we learn is that if the noise becomes larger than half of P, decryption fails. So we have to keep it under this bound.

Okay. So these are two issues for the encryption scheme, and they're serious issues. The first issue prevents the usage of this encryption scheme totally; we really have to handle it. And the second issue is also problematic. So how do we solve these issues? I'm going to just sketch the solution.

How do we solve the issue of the ciphertext growing? The solution is fairly simple. You publish in the public key, or publish, you know, in the sky, a large random multiple of P, X zero, without any noise. It's just a multiple of P. And after every operation, every addition or multiplication of ciphertexts, you take the result mod this number in the sky. So what does this buy us? We ensure that the ciphertext stays less than X zero, because you're taking it mod X zero, so it stays within a fixed bound. The encrypted bit remains the same, because taking something mod X zero simply shifts the ciphertext by an exact multiple of P. And the noise remains exactly the same; in particular, it does not increase, and its least significant bit is exactly the same, so the encrypted bit doesn't change. So this is how you solve the issue of the ciphertext growing. (There's a toy snippet of this fix just below.)

And the second issue is that the noise grows with each operation. I have two answers to this. One is that, you know, that's fine: you can already perform a fairly large number of homomorphic operations before the noise hits this bound, and that might already be useful for many low-complexity functions out there. That's one answer. But what we were shooting for is a fully homomorphic encryption scheme. We're not happy with evaluating small programs; we want to evaluate every program. Okay? And this, it turns out, can be handled by a very beautiful theorem of Gentry, who shows how to take a somewhat homomorphic encryption scheme which can evaluate large enough programs -- by large enough, I mean larger than the decryption circuit of the scheme itself -- and turn it into a fully homomorphic encryption. So this is the blueprint that Craig was talking about yesterday. Once you apply this, you get a fully homomorphic encryption.
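Going back to the first fix, the noiseless multiple in the sky: continuing the toy sketch from before (again, purely illustrative), it might look like this.

    # x0 = q0 * p is published; reducing mod x0 after each operation shifts
    # the ciphertext by an exact multiple of p, so the noise and the
    # encrypted bit are untouched while the bit length stays bounded.
    def public_modulus(p, gamma=320):
        return random.getrandbits(gamma) * p   # exact multiple, no noise

    def add_mod(c1, c2, x0): return (c1 + c2) % x0
    def mul_mod(c1, c2, x0): return (c1 * c2) % x0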
Okay. So that is the sketch of the entire construction of the fully homomorphic encryption scheme. What I want to tell you about very briefly is the security argument. Why is this scheme secure? It seems so simple, it can't be secure, right? But it is secure under a reasonable assumption, and I'll try to convince you of this. I'll talk about the security of the secret-key somewhat homomorphic encryption scheme that I presented. The security of the public-key version of it is equivalent to the security of this scheme, because the transformation is generic and it doesn't add any more complexity assumptions. And for the security of the final scheme, we need to assume that the sparse subset sum problem remains hard; that's an extra assumption we need to make. Let's focus on the security of the somewhat homomorphic encryption scheme.

Okay. So what's the complexity assumption that we're going to use? It's something we call the approximate greatest common divisor assumption, the approximate GCD assumption. This is a name that Howgrave-Graham coined in 2002 or 2003, I think, but the problem was known even before that, under different names. So what's the problem? It's going to be very similar to the learning parity with noise and learning with errors problems that Christoph and Chris will talk about next, and it is the following. You have a box which contains a large random odd number P, chosen from an appropriate domain. And you have an adversary who has oracle access to this box. What the adversary gets is the following. The first time he presses the button, he gets an exact multiple of P: a large random number Q zero, multiplied by P. That's what he gets. But he's not happy with that. He has this red button; he wants to push it many, many times. So he's going to push it as many times as he likes, a polynomial number of times, and every time he does, he gets a near multiple of P: a large multiple of P plus some noise. It's essentially the distribution of the ciphertexts in our scheme. So he gets a polynomial number of these near multiples, plus one exact multiple. And our assumption, the approximate GCD assumption, is that given all of this, and polynomial computing power, you cannot find the number that's hidden within the box, namely P. Okay? This is the assumption. Questions? Okay.

>>: [inaudible]. I was wondering.

>> Vinod Vaikuntanathan: Sorry?

>>: [inaudible].

>> Vinod Vaikuntanathan: So you're saying that in the ciphertexts, the noise terms are even, and here the noise terms are just random? Is that what you're saying?

>>: [inaudible].

>> Vinod Vaikuntanathan: No. So the assumption is different from the security. The security of the scheme doesn't follow directly from the assumption, and you'll see it in a little bit. There are a number of differences between these two.
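As a toy illustration of this red button, here is a sketch of the sampler (illustrative bit lengths standing in for the talk's N-to-the-5 / N-squared / N sizes; note one of the differences just discussed: the noise here is R, not two times R plus a bit).

    import random

    def approx_gcd_instance(eta=64, gamma=320, rho=16):
        p = random.getrandbits(eta) | 1 | (1 << (eta - 1))  # hidden odd number

        def exact_button():
            return random.getrandbits(gamma) * p            # q0 * p, no noise

        def noisy_button():
            q = random.getrandbits(gamma)
            r = random.getrandbits(rho)
            return q * p + r                                # near multiple

        return p, exact_button, noisy_button

    # The assumption: given one exact_button() output, polynomially many
    # noisy_button() outputs, and polynomial time, no adversary finds p.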
>>: [inaudible].

>> Vinod Vaikuntanathan: It's hard. I'll talk about this a little bit. The security parameter is N, right? It's the dimension for which lattice problems are hard, roughly speaking. And the numbers are of size roughly N to the 5. I don't know what a reasonable value of N is; maybe something like a thousand. And the numbers are of that size. So it's polynomial time, but, you know.

>>: So you're saying, so, for instance, Q zero P would be like --

>> Vinod Vaikuntanathan: Q zero P is of size N to the 5. N to the 5 plus N squared, but, you know, N to the 5 really. Okay? So this is the assumption.

So how does this relate to the security of the scheme? Well, the security of the scheme says that given an exact multiple and a number of near multiples of P with random noise, plus the bit that I want to encrypt, you can't guess this bit. So the adversary can't guess this bit, given one exact multiple and many approximate multiples. Clearly, there is a syntactic difference between the assumption and what we need for security. Namely, the assumption is that the adversary cannot solve a search problem, guessing the number P, whereas breaking semantic security a priori seems much easier: it just has to guess this one bit of the noise. So they seem to be different problems a priori, and what we will show is that they are actually equivalent. In other words, if you can break the semantic security, you can actually come up with this entire number P; you can break the approximate GCD assumption. So this is something that we need to show, and what I'll show in the rest of the talk is sort of an idea of how the security proof goes -- a taste of the security proof. Okay? Good.

So again, what do we need to show? We have an encryption breaker, which breaks semantic security by guessing this bit, and we want to show that that guy solves the approximate GCD problem, which is a search problem, right? The proof proceeds in a number of steps. The first thing I will show, by a sequence of steps, is how to transform the encryption breaker into an adversary that does the following. It gets an exact multiple; it gets a whole bunch of near multiples of P; and it gets this distinguished number, which is also a near multiple, but which I want to write separately. And it predicts the least significant bit of Q in this distinguished multiple. You'll see why I'm writing this in two different parts in a little bit. So I claim that an encryption breaker can be transformed into this algorithm. And I want a worst-case property from this adversary: I want the adversary to predict the least significant bit of Q for every ciphertext C, for every number C which is of this form. So in some sense, the proof involves a worst-case to average-case argument over C. The adversary that we started with, the one that breaks the encryption, only works for a random near multiple, whereas we want to transform it into something that works on every near multiple. It also involves an argument that says computing the least significant bit of Q is the same as computing the least significant bit of R. That's actually easy to see, because P is odd, so the least significant bit of Q plus the LSB of R is equal to the LSB of C, mod 2. So computing one is equivalent to computing the other, right? And a hybrid argument and so forth.
But, you know, this is something we can show using standard tricks. In other words, we can show that an encryption breaker results in an adversary which succeeds in this game, in predicting the LSB of Q. Okay. The next thing you can show is that to break the approximate GCD problem, it's enough to predict Q from Q times P plus R. And this is actually very easy to see, because if you can predict Q, you can divide C by Q. What you will get is P plus a very small number. So you can take that small number off, and what you get is P. So predicting Q is equivalent to predicting P. Okay?

So what I wanted to show was an arrow between the encryption breaker and the approximate GCD solver: I wanted to show that the encryption breaker yields an approximate GCD solver. I've simplified my problem by doing these reductions, and the only thing I need to show now is an arrow between these two problems, right? I want to show that an adversary that succeeds in this game results in an adversary that succeeds in this other game. Yes?

>>: [inaudible].

>> Vinod Vaikuntanathan: No. R is small compared to Q, right? So if I can predict Q, I'm going to take C and divide it by Q, so what I'll get is P, which is an integer, plus R over Q, which is very, very small. Okay? So I can take this, round it to the nearest integer, and that should give me P. Okay?

So all I need to show is how to go from this adversary that predicts least significant bits to one that predicts Q itself. So look at the left and the right. The game that the adversary plays in both cases is completely the same -- exactly the same -- except that here he predicts one bit, and there he predicts the entire number. So how do we show this implication? Well, the main idea is to use this adversary, which gives me one bit, again and again, on different numbers C. This adversary is going to give me one bit, the least significant bit of Q, every time, and I'm going to use this bit to make the number C, and therefore the number Q, smaller and smaller, until I get the entire Q. Okay. So that's, roughly speaking, the idea.

So how could you do it? What's the natural way to make this work? Here's what I'm going to do. I'm going to take my number C and feed it to the adversary, and he's going to give me the least significant bit of Q. Now there are two possibilities. One is that the LSB is zero, in which case I'm going to take this number, divide it by two, and round it. What happens is that I compute Q over two, times P -- Q over two is an integer, because I know that the least significant bit of Q is zero -- plus noise which is also smaller by a factor of two. Okay. So what did I do? I made the Q smaller by one bit. That's progress, right? What do I do in the other case, if the least significant bit is one? I'm going to take C, subtract P from it, and then divide by two. So again, the new C is Q minus one over two, which is an integer, times P, plus R over two. Again, I made the effective Q smaller by one bit. Then I do this again and again, I learn the bits of Q one by one, and I'm done. Okay? So everyone happy? People who haven't heard the talk before -- do you see a problem? Okay, good. So the problem with this blueprint is that to do this step, I'm assuming that I know P. But what am I trying to find out by running this whole algorithm? It's P itself, right? So it's circular, and that's not any good.
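Here is that natural loop as a sketch. lsb_oracle is the hypothetical adversary from the previous step, and the comment marks the circular step that uses the very P we are trying to find.

    def recover_q_bits(c, p, lsb_oracle, num_bits):
        # Peel off the bits of the hidden q in c = q*p + r, one per round.
        bits = []
        for _ in range(num_bits):
            b = lsb_oracle(c)
            bits.append(b)
            if b == 0:
                c = (c + 1) // 2   # round(c/2): q halves, noise roughly halves
            else:
                c = (c - p) // 2   # circular! this step needs p itself
        return bits                # low-order bits of q, in order

    # Once q is known, p = round(c / q), since c/q = p + r/q and r/q is tiny.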
So what do we do? We're going to replace this step by something more sophisticated, and we'll do this by proving the following. We want to keep the overall structure of the reduction the same, because it's so nice, right? We don't want to deviate from that, but we want to replace the second step by something more sophisticated. So what do we do? We'll show the following lemma. The lemma says that given two near multiples, Z1 and Z2 -- Q1 times P plus R1, and Q2 times P plus R2 -- plus this least-significant-bit oracle, I can compute a number Z prime which is the following quantity: a multiple of P plus a small noise, where the multiplier is the GCD of the multipliers Q1 and Q2, the GCD of the numbers Q1 and Q2. So this is the lemma that we are going to show.

Why does this help us? Well, because of two things. One, if I choose Q1 and Q2 at random, the GCD of these two numbers is going to be one with very high probability. So what I get here is one times P, which is P, plus R prime, which is much smaller compared to P. So I get essentially P, plus a little bit of noise. So now I am very happy, because I can take this second step, which didn't work because I needed to know P, and replace the P there by this number Z prime, which is essentially as good as P. It's P plus a little bit of noise, so it's as good as P, and everything goes through. Okay. So this is what we are going to do, and the only thing that remains to be shown is how to prove this lemma. This is the crux of the proof.

>>: [indiscernible].

>> Vinod Vaikuntanathan: So remember, this is the key thing that maybe I glossed over: I transformed the encryption breaker into this algorithm, and this algorithm succeeds for an arbitrary C. So it doesn't care about the nice distributions; it works for any C. This is the worst-case to average-case part of the reduction, which I hid inside that box, but this is guaranteed to succeed for any C. Okay? And then I don't need to care about the distribution of C at all. Okay?

So the only thing that remains to show is how you prove this lemma. At a very high level, 50,000 feet, the main idea is that you want to run the GCD algorithm, the binary GCD algorithm, on Q1 and Q2. But the point is that you don't have Q1 and Q2; you only have Q1 times P plus R1, some form of encryption of Q1, right? The main observation is that you can still run the GCD algorithm under the hood, even if you're not given the inputs by themselves, and the algorithm works just fine, because it's a nice algorithm in some sense. The proof is very similar to -- say we go 25 years back -- the hard-core bit proof of [indiscernible]; it's very similar in spirit.

Okay. So let me tell you at a very high level how to show this lemma. You're given two near multiples, Z1 and Z2, and your goal is to reduce the Qs inside the Zs to get the GCD of Q1 and Q2. The first thing you do is you compute the LSBs of Q1 and Q2. This you can do using the oracle that you have. So now there are two cases. Either at least one of the LSBs is zero, in which case we're in the good case: you can divide that Z by two and everything goes through. The bad case is that both the bits are one, in which case you cannot divide by two; you would have to subtract P and divide by two, and so forth. In that case, you use the fact that both the LSBs are one: you can subtract the two numbers, right? You're essentially subtracting two numbers whose hidden multipliers both end in a one, and you get a new number whose hidden multiplier ends in a zero. Now you can take that, divide it by two, and continue. So after these two steps, the point is that at least one of the numbers has an even multiplier, and you can keep reducing on that even multiplier. And then everything goes through. What you get at the end is the GCD of the two numbers inside the encryption. Okay? So that's the security proof.
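Very roughly, and glossing over signs, rounding, noise growth, and termination (all handled carefully in the paper), the shape of the lemma's algorithm is something like this:

    def near_gcd(z1, z2, lsb_oracle, rounds):
        # Binary GCD on the hidden q1, q2, touching only the near multiples
        # z1 = q1*p + r1 and z2 = q2*p + r2, plus the LSB-of-q oracle.
        for _ in range(rounds):
            b1, b2 = lsb_oracle(z1), lsb_oracle(z2)
            if b1 == 1 and b2 == 1:
                z1, b1 = z1 - z2, 0    # both q's odd: q1 - q2 is even
            if b1 == 0:
                z1 = (z1 + 1) // 2     # halve the hidden q1 (round the noise)
            elif b2 == 0:
                z2 = (z2 + 1) // 2     # halve the hidden q2
        # For random q1, q2, gcd(q1, q2) = 1 with high probability, so the
        # surviving value approaches 1 * p + (small noise): essentially p.
        return z1, z2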
The question that remains to be asked: what we showed is that if you can break the semantic security of the scheme, then you can solve the approximate GCD problem. The question is, how hard is the approximate GCD problem? Here I'm going to wave my hands very vigorously, though you can't see it. The point is that this problem was studied by Howgrave-Graham, and also earlier by Lagarias, Coppersmith, and Nguyen and Stern, under various different names. It turns out that this problem is equivalent to the simultaneous Diophantine approximation problem.

So what do we know about attacks on this problem? The attacks are lattice based. Lagarias himself had an algorithm that solves this problem for a range of parameters, and then there's Coppersmith's algorithm for finding small roots of polynomial equations, which you can adapt to solve this problem. Then there are algorithms of Nguyen and Stern and [indiscernible], and so on and so forth. The point is that all of these methods run out of steam when I set the parameters appropriately. In particular, when the bit length of the multiples is larger than the square of the bit length of the secret -- and this is satisfied in our case, because the bit length of Q is N to the 5 and the bit length of P is N squared -- these algorithms stop working. So this is evidence that the problem is hard, although we need to work more to understand the hardness of this problem.

Okay. So this is all I wanted to say. Future directions: the obvious one is to construct an efficient fully homomorphic encryption scheme. One thing Craig mentioned which I want to repeat is that all fully homomorphic encryption schemes use the same blueprint: they construct a somewhat homomorphic encryption scheme and then they bootstrap, using this bootstrapping trick, to make it fully homomorphic. And the overhead of the whole process comes from the bootstrapping procedure. The question is, is it really necessary?

>>: [inaudible].

>> Vinod Vaikuntanathan: Sorry? Which of the three? Okay. Well, there is Craig's. There's the integer stuff that I showed you. And the third one is this encryption scheme of Smart and Vercauteren, which is similar to Craig's original encryption scheme, except that I would say it lies between the two schemes: the assumption is still lattice-based, but the ciphertexts are integers. So it somewhat lies between the two schemes. Okay. So, not quite efficient. The question is, can you make it efficient? There are some recent improvements.
The first reference is [indiscernible] -- I forget the abbreviation -- and GH is Gentry and Halevi. They implement the scheme and come up with some performance numbers. A second, and probably more exciting for me, direction is to evaluate the security of the approximate GCD assumption. The thing is, the scheme says nothing about ideal lattices, right? All we are doing is manipulating numbers, adding and multiplying numbers. Whereas the original scheme explicitly used ideals and polynomials and so forth, this doesn't scream out ideal lattices, at the very least. The question is, can you use some version of this scheme to construct a fully homomorphic encryption based on regular lattice problems, or something that doesn't depend on ideal lattices? Ideally, we want to get a fully homomorphic encryption scheme based on the shortest vector problem on regular lattices. The question is, can you do this? Okay. So that's that. Questions? Yes?

>>: [indiscernible] security parameter, if it's 80 bits currently, [indiscernible] only numbers about 3 billion, you're saying it's more like --

>> Vinod Vaikuntanathan: No, 80 is not good, because 80 is the --

>>: It's a different security --

>> Vinod Vaikuntanathan: Yes, it's a different security.

>>: You're talking 10 to 15 bits each number.

>> Vinod Vaikuntanathan: Yes, yes, something like that. My arithmetic is really bad.

>>: [indiscernible] works for the analysis, the dimension of the lattices that you were using can also serve in [inaudible].

>> Vinod Vaikuntanathan: So [indiscernible].

>>: Yes.

>>: Is there any kind of like decoding encryption or [indiscernible] that I should know?

>> Vinod Vaikuntanathan: Good point. So the search-to-decision reduction for LWE is related, essentially. It uses some of those ideas.

>>: [inaudible].

>> Vinod Vaikuntanathan: It is; it uses essentially Goldreich-Levin.

>>: [inaudible].

>> Vinod Vaikuntanathan: Well, then, I don't know.

>>: [inaudible] connecting your question is --

>> Vinod Vaikuntanathan: The most interesting question of the day, right? Yes.

>>: Any other questions?

>>: It is a pleasure to present Chris, who comes from Georgia Tech. Joint work on bi-deniable encryption.

>> Chris Peikert: So you can ask questions; I'll just make you feel really guilty about it. How about that? Okay. This is bi-deniable encryption. It's joint work with Adam O'Neill, also at Georgia Tech as of a couple days ago; now he's at Texas. Okay. So, clouds -- now that we have that out of the way, I want to talk about a totally different concept called deniable encryption, which kind of started with some ideas of Benaloh and was formalized by Canetti, Dwork, Naor and Ostrovsky in '97.

Imagine for the moment that Alice and Bob are siblings, and they're trying to prepare a surprise party for their big brother. So they want to keep it a secret, and Alice and Bob are going to use some encryption. The way they do that is: Bob will have his public key, which Alice knows, and Bob is going to receive a message from her. She's going to encrypt it by taking his public key and some random local coins of her own, encrypting the message "surprise party," and shipping the ciphertext over to Bob. But big brother is watching them, and he decides, using the methods that big brothers often have, to ask them: what are they talking about? What were you saying behind my back?
And he can kind of twist their arm, basically, and say: Alice, tell me the coins that you used to send this message. Or: Bob, give me your secret key. And if he does that, then the surprise is spoiled. So Alice and Bob really don't want that to happen. What they would prefer is some way for them to encrypt the message in what we're going to call a deniable way. So maybe Alice uses a slightly different encryption algorithm, called deniable Enc, and Bob should still get the message that she intends for him to receive; but in preparation for the case that big brother comes along and tries to coerce them, they're able to create some fake coins and maybe a fake secret key. So Alice can do this locally and come up with some fake-looking coins, and Bob a fake key, so that when big brother comes along and coerces them, it looks like something innocuous that they can all agree on. So basically, the point is that the fake coins that Alice provides and the fake key that Bob provides should look completely consistent, as if another message had been encrypted. There should be nothing suspicious at all about these coins and keys. That's very important. Yes?

>>: [inaudible].

>> Chris Peikert: Yeah, we'll talk about that. One issue is, you know, how do they know -- of course, it has to look like a consistent story, so the coins should correspond to the same message that the fake secret key is going to decrypt to. So there's this question of coordination; I'll get to that in a moment.

So, you know, it's not clear that you can do this kind of thing. It seems a bit paradoxical, although not outright. And if you could do it, then there are a lot of nice applications that you might imagine using it for. They all pretty much come down to this anti-coercion property. A lot of professionals will often have ethical or professional obligations. For example, journalists may have to promise their whistleblowers that they won't be able to be coerced into revealing who they are, or what exactly their whistleblower delivered to them. Lawyers have ethical obligations to their clients. Whistleblowers, things like that. Or if you find yourself dealing with a different kind of big brother entirely, you might want a feature like this.

Another application that has traditionally been in connection with deniable encryption is voting, where the idea is simply that if you want to vote for somebody, you encrypt their name as the ballot, and you encrypt it deniably, so that if someone comes to you later and tries to coerce you into revealing who you voted for, you can reveal whatever coins you like, showing them that you voted for whoever they happened to want you to vote for. You have to be a little bit careful when thinking about this, because in voting, sometimes a person wants to be coerced -- maybe they'll be paid a hundred dollars for voting for Smith instead of Johnson -- and so you have to be careful about whether the person doing the encrypting is trying to be protected from outside coercion or is willingly able to be coerced. So this takes some care, and it depends a bit on what exactly you're trying to achieve.

And then there's a different application entirely, which doesn't a priori have to do with coercion per se, which is secure protocols that tolerate adaptive break-ins.
What I mean by this is that a deniable scheme in particular has a property called non-committing encryption, and that's something that can be used to design adaptively secure multiparty computation protocols. In fact, deniability is a strictly stronger notion than non-committing encryption, but it's one way to achieve it.

So you have to go back a little ways to find the state of the art. The only paper really that I know of in the theoretical literature is the one I mentioned before, by Canetti et al., and the main construction in that paper was what I'll call a sender-deniable public-key encryption scheme. This means that the coercer can go and force Alice to reveal the coins that she used to perform the encryption. So that's the main construction there. Then, if you add some interaction and basically flip the roles of Alice and Bob, so you now have an interactive encryption protocol, you can get receiver deniability, where the receiver can be coerced, but not the sender. And then they get what I'll call bi-deniability, which is shorthand for: sender and receiver can both be coerced, via some interaction with third parties, but the catch is that at least one of those third parties must remain uncoerced. So whoever is in the system, there is at least one person who must remain uncoerced and whose coins remain secret. Okay. Is that an accurate description? Yeah, okay. Good.

And then this is not just desirable in theory; practitioners seem to care about this problem quite a bit. There's this product called TrueCrypt, which advertises pretty highly on its list of features that it provides something called plausible deniability, and there's an old, apparently defunct piece of software called the Rubberhose file system which is supposed to give you some deniability as well. What these actually give you is some kind of limited deniability. They allow you to say: in this part of my hard drive, I'm actually not storing anything; it's just gibberish that I kind of padded out, and there's nothing on this part of the hard drive. Nothing to look at; move along. And this is a plausible story. I think you can make the case that it could work, say, if you're at a border crossing or something like that. It's good for storage, but I'm not sure it's so convincing an argument for communication, because if Bob and Alice are sending messages to each other, something's going on. I mean, they're speaking. I think it's not really plausible to say, well, I just happened to have sent Bob a bunch of random bits and there was really nothing behind it.

>>: [inaudible].

>> Chris Peikert: What's that?

>>: [inaudible].

>> Chris Peikert: Yeah, that's kind of how TrueCrypt does it. The ciphertexts are random looking, and they make a huge buffer on the hard drive full of random bits, and those random bits slowly turn into ciphertexts that have messages underneath them, but who knows where the ciphertexts stop and the noise begins. So that's roughly how they work.

Okay, so that's kind of the state of the art. Let me tell you now about what we're able to accomplish in this work. We get, again, what I call bi-deniable encryption, which means that the sender and receiver can simultaneously be coerced by big brother, and in the most general case, they can reveal any message underneath the ciphertext, which can be chosen as late as the time of coercion.
Or the moment, maybe, just before coercion. So that's the most general thing that we can achieve. I just want to point out that it is a true public-key scheme: there's no interaction, no third parties. Bob publishes his public key, and then Alice can send a non-interactive message by encrypting to that public key. It's going to use some special properties of lattice-based encryption schemes. It won't be until the last slide that we'll actually see a lattice, but we'll see abstractly what it is about these encryption schemes that makes it possible.

One down side is that the scheme in its full generality has large keys. The secret key in particular is large, and the public keys are accordingly as large. Unfortunately, a pretty easy information-theoretic argument says that this has to be so, if you really want to be able to reveal any message chosen at the time of coercion. The point is that if I've got a public key and a ciphertext, and I want to be able to reveal a secret key that opens the ciphertext as any of 2 to the K possible messages, then there have to be at least 2 to the K possible secret keys that I can reveal, because decryption has to work and decrypt that fixed ciphertext to any of the 2 to the K possible messages. I need 2 to the K possible keys; that means the keys are of length K. So, large keys: a down side. It's inherent, but you can get around it.

And the way of getting around it is something called plan-ahead deniability, which is a notion also from the CDNO paper. Here, we're able to use short keys, and the difference is that there's a bounded number of alternative messages, and these are decided at encryption time. So like I showed you on the first slide: "surprise party, here's the plans" and "data was lame today." Those are the two messages that Alice and Bob will be able to equivocate between when coercion time comes. The advantage of this is that the secret key is now short, independent of the message length. And another plus about this -- for [indiscernible], who is not here; you can tell him later -- is that there's no issue with agreement now on which message they need to consistently open it as. The fake message is baked into the ciphertext that was sent over, and assuming that one of these is a bombshell whistle-blowing document and the other is something innocuous, it's pretty clear which one they're going to want to open it up to.

>>: [indiscernible].

>> Chris Peikert: Yeah, you can encrypt arbitrary messages. So the two messages -- the bombshell whistle-blowing one and the more innocuous plans for the party -- are both arbitrary, but then they can open the ciphertext to either of the two later on. So is it pretty clear? Almost all of this is without a formal definition yet, which we'll get to.
Actually, one automatic consequence of this is that we get improved constructions of non-committing encryption, in particular in the lattice scenario, because the best known lattice-based non-committing encryption is three round or, rather, three flow so it's an interactive one and now we have noninteractive. So that's just a plus. So let me give you a definition of sender den ability. There's a few different ways you can imagine defining this, so we'll just do one for the moment and then talk about some variations you might consider. So as with any public key encryption scheme, we have a normal key gen encrypt and decrypt algorithms, and then I'm also going to ask for what we'll call a deniable encryption algorithm and a faking algorithm and these are for the sender. So just as warm-up, we're going to do just a sender deniable scheme, a sender deniable definition. So the definition's pretty easy to write down. Let's just say for the moment that the messages are bits. So we're asking for any two bits, B and B prime, on the one side we have the actual experiment where a key gets generated and the bit B gets encrypted and then the view of the adversary is everything. Well, not everything, but it's everything from the sender side. So you get to see the public key, the ciphertext and then you get to see the coins R that the sender used. So this corresponds to the coercer came to or the receiver to the -- rather to the sender. The coercer coerced the sender and got their real coins. Okay? And then on the right-hand side, we have something like we generate the key pair, but then 21 the sender now will deniably encrypt the bit B prime which could be the same or could be different than what's on the left. Then at the time of coercion, the sender will now create some fake coins, R star, using its knowledge of the previous coins and the previous bit. And then give out -- well, now it hands the fake coins to the coercer. So what this -- the nice trick about this is this definition by itself is strong enough to give you semantic security and other nice properties. So this is kind of all you need for ->>: [inaudible]. >> Chris Peikert: Yeah, so if you want to fake, in this case, I mean, for the definition itself, you would keep the original coins around if you want ->>: I meant ->> Chris Peikert: Sorry? >>: [inaudible] for operation. >> Chris Peikert: It's actually nice. In operation, you don't need, depending on the scheme, you don't need to remember the coins you used for encryption, but you can then get them from the ciphertext itself. The answer is it depends. Sometimes you might have to store them and other times you might not. >>: [indiscernible] random coins. >> Chris Peikert: Yes. >>: [indiscernible]. >> Chris Peikert: If you can say I erased them -- >>: [indiscernible]. >> Chris Peikert: Right. That's just the thing. If you weren't going to keep your coins in the first place, you might tell the coercer you didn't keep your coins around and not give them anything. Actually, we'll talk about that in a second, what that means. 22 Okay, so this is a pretty simple definition to wrap your head around. Just a variant you could consider is one thing I just want to point out is the encryption here is one algorithm and then you have a possibly separate, different algorithm, deniable encrypt, that you're running if you want to create a fakable ciphertext. 
So you can also consider requiring just a single encryption algorithm and there's basically a generic transformation that will just do a mix and match of these two so you can create a joined single algorithm from joining up the Enc here and the den-enc. It's a generic transformation. One caveat is you only get a one over poly distinguishing advantage there. >>: [inaudible]. >> Chris Peikert: The generic, it's a parity trick. So to, I mean, to -- what I would do is if -- it's a bit hard to say in words. If I wanted to encrypt a bit B, then I will do a random -- I'll do, let's say, K times, I'll run either this or this each time a random choice, one of the two, K times, and then I'll encrypt bits which may be X or to the real bit that I want to send. And then all I need to do is flip one of these DenEncs to an Enc. So that's basically the generic trick there. Okay. So that's just the warmup sender den ability. So let's think, the real definition that we want to achieve is bi-deniability. We again have three, the three normal algorithms as before and then we have the deniable ken gen as well as the deniable Enc encrypt and we have a faking algorithm for both the sender and the receiver. Okay. So again, we ask that these two experiments be indistinguishable. This is exactly the one before, except now the coercer owns the whole world. So this experiment happens and then the coercer gets not only the public key but the secret key, not only the ciphertext, but also the randomness of the sender and that's everything in the world. Okay? And then what we ask on the right side is now the sender and receiver are going to run the deniable algorithms instead, and the deniable generator will output a PK, but then something we call a faking key. So this might be a different behaving thing. It's a faking key, deniable encryption is the same. It takes the public key and the bit, and then these two are going to fake some things. So the receiver will now use his faking key and the ciphertext that he's received to fake -- give a fake secret key, which should look like this secret key in the real honest experiment and the sender does exactly the same thing as before. And note that the inputs to these two faking algorithms are independent. So there's not -- I mean, they can 23 run them locally. >>: [inaudible]. >> Chris Peikert: Sorry? >>: [inaudible]. >> Chris Peikert: What's that? Yeah, so the B here has to be the same thing, but that's it. >>: [inaudible]. >> Chris Peikert: Oh, yeah, and the ciphertext, right. So the ciphertext is public information, yeah. So the B is the coordination here. As I mentioned before, if you're doing plan-ahead, then there's no coordination needed. I mean, the B is implicit. It's already in the ciphertext. So with plan ahead den ability, there's no B here in either of those. Okay. And then I mean, just a technical point is the receiver faking algorithm, here I've got it outputting a fake secret key. You could ask it to output fake coins for the key generator which is strictly -- which is at least as powerful. In fact, that's what our scheme does, it outputs coins for the key jen, but just thinking about it as a secret key is the most convenient thing to do. And we can consider merging key gen and deniable gen together into one and that all works as well, just generically. So for clarity, I think it's a bit easier to think about two different algorithms, you know, actual deniable algorithms and then using the faking to produce the fake secret key and coins. >>: [inaudible]. 
>>: [inaudible].

>> Chris Peikert: Yeah, and this will have negligible indistinguishability. And then if you join them, you get the one-over-poly thing, which is a little bit of an annoyance. Okay. So I just want to spend a couple slides -- yeah?

>>: So the one over poly comes from [inaudible].

>> Chris Peikert: It comes -- well, it's the distinguishing advantage between the whole two experiments if you were to use a unified key gen and a unified encrypt. And it comes even with one side -- even if you only do it for one side. And it's the same if you do it for both sides. Yeah.

>>: Even if only the receiver gets [inaudible]?

>> Chris Peikert: Correct, that's right. Yeah. This is kind of an artifact of this; the parity thing kind of only goes one way. So now you know.

Okay. So maybe a slight philosophical digression, or discussion, on whether any of this is meaningful. The first thing people often reply when we talk about deniability is: well, everybody knows that you're running a deniable algorithm, and everyone knows the coins and keys could be fake. So who do you think you're fooling? Don't be a jerk. And you're absolutely right to be thinking this. We're not fooling anybody. The point is not to convince your coercer that you actually sent something, but instead to preempt them from being able to coerce you in the first place.

So let me unpack that a little bit. Think about the most perfectly secure secret communication you could imagine, like talking through a lead pipe, where only the person on the other end can hear it. This is inherently deniable: whatever went through the pipe is gone, it's ephemeral, and nobody can pin you down on what was said through it. But encryption, on the other hand, is typically entirely coercible. If you send a ciphertext across, there's only one possible meaning that that ciphertext could have had, okay? So this is trying to undo that undesirable side effect of encryption as a tool for confidentiality. The purpose is, again, not to convince the coercer of anything, but to use a scheme which does not open you up to being coerced in the first place -- to preempt coercion, and to be able to give a completely non-falsifiable, credible explanation for what happened, one that can't be disproved and that explains everything you could have possibly done.

A second objection, and something people think about: can't the user just say, well, I erased my coins; I don't have them when the coercer comes calling. In fact, it was part of my protocol to erase them. So sorry, I have nothing for you. And isn't that morally equivalent to saying I don't have them, even if you do? I think not. I think there's a qualitative difference between erasing, or claiming that you don't have something, and being able to give a fake explanation for it. One is that you might be in a regime where you're subject to audit or something like that, and so you're required to keep a record of all your actions. If you're in the financial industry or something like that, then you have to keep all your randomness, and if you get a subpoena, you have to surrender it. And another is that, for a sender creating a ciphertext, it is kind of plausible that they chose the random coins they needed to produce the ciphertext and then threw them away. They were ephemeral; there's no reason to keep them around -- which is the point we were mentioning earlier.
But for a receiver, it's really essential that the receiver keeps a secret key around. The whole point is that he has a secret key to be able to decrypt. So it's not really credible to say, oh, I just happen to have erased my secret key, even though I'm supposed to use it to decrypt. Anyway, that's my justification for why I think this is a meaningful notion.

All right. So let's do some technical stuff. This is a tool for deniability called translucent sets, which was introduced in CDNO. The idea is that you've got a public description of what's called a translucent set, and it comes with a secret key, or secret trapdoor. You have a universe, let's say it's all K-bit strings, and within this universe there's a set P of special strings. And this is the property of a translucent set: if you're given the public key, then you can efficiently sample from this little set P. Of course, you can sample from U just by picking a random K-bit string. And the main property is that a P sample is pseudorandom: it looks like a U sample. Given only the public key, you can't distinguish whether something came from P or from all of U. This immediately means a P sample can be passed off as a U sample. If you pick from P, you can claim it came from U: I have this K-bit string, it's K bits, and I say these are just the K random bits that I chose when I was sampling from U. And nobody can be convinced otherwise. So this allows you to claim that even though you picked from P, you picked from U. Nice idea. Now, given the secret key, you can actually distinguish: the trapdoor allows you to distinguish between a P and a U. That's what's going to allow the decrypter to get the true meaning rather than the faked one. There are many instantiations of this idea, from pretty much general assumptions: trapdoor permutations, DDH, lattices, pretty much any assumption you like.

So let's see how this gives us deniability. We define the normal encryption algorithm as follows. If you want to encrypt a zero bit, you send two U samples. If you want to encrypt a one bit, you send a U and then a P. If you want to encrypt deniably, then a zero becomes a P and a P, and a one is exactly the same as before, a U and a P. Okay? And Bob can use his secret key to tell exactly which is a U and which is a P, so he can decode the right answer: if he sees two of the same thing, he calls it a zero; if he sees a U and a P, he calls it a one. That's all.

So let's see how this allows deniability. It allows Alice, when she gets coerced, to fake things, because she can always claim a P sample is a U sample, wherever she wants. You can check that if she created a deniable ciphertext, either a zero or a one, she can always go to a zero or a one up top, just by turning Ps into Us and claiming them. So that's an easy check. The problem is what happens when Bob gets coerced, because his secret key actually reveals the true values. Okay? So this is precisely the technical problem we're going to address.
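In pseudocode, the encoding just described looks like this. sample_U, sample_P and the trapdoor test in_P are placeholders for the translucent-set operations.

    def enc(pk, bit):       # normal: 0 -> (U, U), 1 -> (U, P)
        return (sample_U(), sample_U()) if bit == 0 \
               else (sample_U(), sample_P(pk))

    def den_enc(pk, bit):   # deniable: 0 -> (P, P), 1 -> (U, P)
        return (sample_P(pk), sample_P(pk)) if bit == 0 \
               else (sample_U(), sample_P(pk))

    def dec(sk, ct):
        x, y = ct
        return 0 if in_P(sk, x) == in_P(sk, y) else 1  # same kind -> 0

    # Alice fakes by claiming any P sample was just the K random bits of a
    # U draw: a deniable zero (P, P) can open as (U, P) = 1, or as (U, U) = 0.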
So the main properties -- there are a couple of main properties. The first is that a public key now has many different secret keys, and each of them is going to give you slightly different behavior when you test whether a sample is a U sample or a P sample, okay? So each key's test is just slightly different. If I have one secret key, it might call these things P values. But if I have a different secret key, it might call those P values. So my own test is slightly fuzzy, and it depends on exactly which secret key I hold. And now if I'm given a P sample that was chosen by the sender, then most secret keys that I could have are going to classify it correctly. So they'll call it a P sample with very high -- with overwhelming probability. Okay? But there are some very rare secret keys which will misclassify this value X here. Moreover, if I have the faking key for this translucent set, then I can actually generate a good-looking secret key, a perfectly valid-looking secret key, that's going to cause this misclassification. Okay. So it's crucial here that I'm looking at X and I'm using my faking key to actually find a particular SK star that's going to give me exactly the wrong decryption. It's going to call it a U, even though it really is a P. And this is essentially what's going to allow the receiver Bob to fake the secret key. And it's really crucial here that this be a good-looking secret key. I mean, you could come up with some totally bogus secret key that just causes the decryption algorithm to crash, and that, you know, that would be something. But such a secret key looks abnormal. It's not something that would be encountered in real life. So that's not going to convince a coercer. So we're really going to have to work pretty hard to get a good-looking secret key here. So is everything clear, at least abstractly? All right. So now I think we're going to have a picture of a lattice. Yeah? >>: [inaudible]. >> Chris Peikert: Yeah, I wish I could do it both ways, but I don't know how to do it both ways. So if I could take a U sample and come up with a secret key that makes it look like a P sample, then we would have solved this one over poly issue. And I mean, you can imagine possibly there's a way to do it, but so far, no luck. So maybe. Maybe if we keep working at it. But in other words, the point here is that if you have the faking key, you can induce what we'll call an oblivious misclassification of a P sample. Oblivious meaning when you do the decryption using that fake secret key, it gives you the wrong answer, but you don't know it's the wrong answer. It's crucial that you don't recognize when it's giving you the wrong answer. Okay. >>: [inaudible]. >> Chris Peikert: Right, right, right. So if my secret key is chosen independently of the ciphertext, then with overwhelming probability, it will give you the right classification. Yeah. Okay. So here we'll see how to do it with lattices. We'll just kind of do it proof by picture here. Good, okay. So on the left side, we have a lattice. That's the green points. And the secret is, roughly speaking, going to be a random short vector within some bounded radius in this lattice. So the public key is a description of the lattice, and the secret key is a short vector; formally speaking, it's drawn from a Gaussian distribution over that lattice. And you can generate these two things together.
And then the way one sends a ciphertext is kind of the standard way that goes back all the way to [indiscernible] of roughly how these work. If you want to send a U sample, you just send a uniform X in what's called the dual space. And what this means is that if you take the secret key and do a dot product with X, you're going to get something that's uniform. You take the dot product, you get a real value, and it will be distributed uniformly modulo one. So its fractional part will be uniform. The reason that's so is, well, X is uniform, and if you take the dot product of any primal vector with any dual vector, you get an integer -- that's going to be one fact that we're using. So you get an integer, and the remaining part, the fractional part, is uniform. Okay. So that's how you send a U sample. A P sample is an X that's very close to this dual lattice. So you take a point in the dual and you add a little bit of Gaussian error, and you send that over as the ciphertext. And the nice property here is that if you take the dot product of the secret key with X, it's going to be very close to an integer. And the reason is simply that SK dotted with this point is exactly an integer, and then SK dotted with that point has to be still very close to an integer. So you can distinguish a U sample from a P sample, and yet an X chosen close to the dual lattice versus uniformly in the dual space -- these two things are indistinguishable. And you can show this using learning with errors. Good. So this is the basic translucency. This says that you can define P samples and U samples which are indistinguishable, but which can be distinguished if you know a short vector over here on the left. Okay. Now, we have to do the hard work of the receiver side, the receiver deniability -- the receiver faking, rather. So the faking key in this case is not going to be just one short vector, as in the normal case, but an entire short basis of this primal lattice. And everything I'm about to say, you'll have to take my word for it: we have algorithms for generating these things and doing the steps that I'm about to describe. So what we have now is we want to be able to fake this P sample X. We'd like to be able to come up with a fake secret key that makes it look like a uniform U sample. So the way that it works is we're given this X, and we're going to choose a secret key, which should be a short vector in this primal lattice, but we're going to choose it highly correlated with this red vector, this little red offset here. So we're going to use the faking key to extract out that red offset, and then we're just going to, you know, go some random distance in that exact direction and then pick a secret key somewhere from that neighborhood. Okay? So along the dotted line, I'm going, you know, north/northeast for some random number of steps, and then I pick the secret key vector from around that neighborhood. Okay. So we're just inducing this big correlation between the secret key and the ciphertext. And the reason this is going to work is that, because these two things are so correlated, their dot product is going to be very large, and because I walked out a random distance, it will be random modulo one. So basically, when you take the dot product here, you're going to get a uniform value, and so were you to decrypt it that way, it would look like a U sample, okay?
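Here is a small numeric illustration of that U/P test. To keep the primal/dual pairing transparent, this sketch takes the primal lattice to be Z^n, so the dual lattice is also Z^n and the pairing is automatically an integer. That choice is purely pedagogical: with Z^n the two sample types are easy to tell apart even without the key, whereas the real construction uses q-ary LWE lattices for the hiding; the dimension, error width, and threshold below are made-up parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 64, 0.003          # dimension and Gaussian error width (illustrative)

# Toy: primal lattice = Z^n, so <primal vector, dual vector> is an integer.
sk = rng.integers(-1, 2, size=n)        # a short vector in the primal lattice

def sample_U():
    # uniform point in the dual space, taken mod 1
    return rng.random(n)

def sample_P():
    # dual lattice point (an integer vector, i.e. 0 mod 1) plus small error
    return rng.normal(0.0, sigma, n) % 1.0

def near_integer(t, thresh=0.1):
    f = t % 1.0
    return min(f, 1.0 - f) < thresh

# Secret-key test: <sk, x> is close to an integer for P samples,
# and uniform mod 1 for U samples.
p_hits = sum(near_integer(float(sk @ sample_P())) for _ in range(1000))
u_hits = sum(near_integer(float(sk @ sample_U())) for _ in range(1000))
print(p_hits, u_hits)   # roughly 1000 vs roughly 200 (= 2 * thresh fraction)
```

The print line shows exactly the gap the talk describes: P samples land near an integer essentially always, while U samples do so only with probability proportional to the threshold.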
Now, I'm not really done. I have to show that the secret key looks good -- you know, that it doesn't look suspicious in any way. So from 30,000 feet, the main issue is we've got this fake secret key, and we chose it in a way that's super highly correlated with X. Why should it look normal, okay? So the main step here is going to be some statistical trickery. Imagine now an alternative experiment where, instead of choosing X first and then a secret key correlated with X, we're going to flip around the order in which we choose things. So now I'm going to choose my secret key in a completely normal way, just a short vector centered at zero, and now I'm going to choose my X to be very close to the lattice, but then I'm going to walk out again some random distance in the direction of my secret key. Okay. So whereas before I was choosing my ciphertext and then choosing a fake secret key in the same direction, now I'm choosing a normal secret key and a ciphertext that's kind of pointing in the same direction as well. And what you can show is that in these two experiments, you get the exact same joint distribution between the ciphertext and the key. They're basically just two Gaussians with the same amount of correlation between them, and there's some nice math that allows you to show that they're exactly the same. Yeah? >>: [inaudible]. >> Chris Peikert: Statistical, yeah. It's just two Gaussians that have a certain random amount of correlation between them, and you can view it as, you know, one depending on the other, or as the two being jointly chosen from a Gaussian that has some covariance. So now, what we've done here is we've got a normal secret key, and then we've generated our ciphertext by this process where I'm taking a little bit of Gaussian stuff and then walking in the direction of the secret key some random distance. And under LWE now, I can replace that little red offset with a totally random X. Okay. So with my random X now, I'm still walking in that direction, but if my random X is uniform, my X star is also uniform. So now I'm left in this world where I've actually got a normal secret key and a U sample over on the right, and that's exactly where I wanted to arrive. So basically, the point is that were you to fake a secret key to make this thing look bad, it's indistinguishable from having had a normal secret key and someone sending you a random ciphertext in the first place. So yeah, that's the 30,000-foot explanation. Any questions so far? Okay. So I'm just going to wrap it up. The basic scheme that I showed you just does bit-by-bit encryption, and it needs to use a fresh public key and a fresh secret key for every bit. Pretty big keys, but again, that's inherent if you want this full, like, [indiscernible] ability. So what we do for plan-ahead deniability is just a hybrid encryption, basically. You encrypt a short symmetric key, and then you can use that to encrypt an arbitrarily long message. It's pretty easy to see how to get this. But it only allows you to open this ciphertext as one of two, or some plan-ahead -- you know, predetermined -- number of fake messages. So an open question really is to get this full deniability, like a unified key gen and encrypt algorithm with the full, you know, negligible distinguishing advantage. We're pretty close; you know, maybe there's some way to do it, to fake a U sample as a P sample, but we don't quite see it yet.
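To make the "two experiments" symmetry concrete, here is a scalar caricature: whether you sample the offset first and then a key correlated with it, or a normal key first and then an offset correlated with it, the joint distribution comes out the same. The real argument is over discrete Gaussians on lattices; this one-dimensional version, and its correlation parameter, are just for intuition.

```python
import numpy as np

rng = np.random.default_rng(1)
rho, N = 0.9, 200_000       # correlation and sample count (illustrative)

# Experiment 1 (faking order): pick the offset e first, then a "secret key"
# biased in the direction of e.
e1  = rng.normal(size=N)
sk1 = rho * e1 + np.sqrt(1 - rho**2) * rng.normal(size=N)

# Experiment 2 (normal order): pick a normal secret key first, then an
# offset biased in the direction of the key.
sk2 = rng.normal(size=N)
e2  = rho * sk2 + np.sqrt(1 - rho**2) * rng.normal(size=N)

# Same joint Gaussian either way: covariance ~ [[1, rho], [rho, 1]].
print(np.cov(e1, sk1))
print(np.cov(e2, sk2))
```

Both empirical covariance matrices converge to the same thing, which is the one-dimensional shadow of the claim that the faked key is distributed exactly like a normal key paired with a random ciphertext.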
And then the other thing is kind of weird. We worked pretty hard, and we only know how to do this with lattices. It seems like a pretty nice abstract definition, at least at the level of translucency, and it should be possible to get it from other kinds of assumptions. We have maybe a candidate based on pairings, but it's unclear whether it works yet. But I don't know. You somehow need to have this oblivious decryption error -- like a very rare decryption error that you can't recognize when it happens, but that you can force to happen if you have a -- so I don't know. I think it's a nice primitive; it should be instantiable from other assumptions and should have other applications as well. But the search goes on. Thanks. Yes? >>: So I actually worked on [indiscernible] from a practical side, and I have a paper that it -- sorry. It gets rejected all the time -- >> Chris Peikert: I know the feeling. >>: The reason is what you mentioned, that it lacks the motivation. So the main [indiscernible] by that is the following. If you are going to use a specialized system, like yours, I mean, you already achieved it, because you are somehow implicitly using the [indiscernible] because you want to send deniable messages, okay? So in the [indiscernible] you mentioned before, like you have a regime that can force you to do [indiscernible] but use this system, but use other models so you can -- >> Chris Peikert: Right. >>: So how would you respond to that? I mean, I have a hard time responding to this. >> Chris Peikert: I mean, the thing that I would argue, and will argue in the paper, is that if you're in a regime that's going to force you to use a committing encryption scheme, provably committing, then that's, you know, that's where you're looking. And you might as well posit a regime where encryption is not allowed at all. So the question really is not so much whether you're cheating just by using this. Someone may say, why are you using crypto in the first place if you have nothing to hide? And I think this argument is kind of bogus and has been shown to be so. But the point is not to really say that you're out to cheat or anything like that. It's really to prevent coercion in the first place. And you can't say, you know, what's going to happen in the future. You know, if you were some human rights worker in China or something like that, or you needed to cross the borders between regimes, it's entirely within your, you know, prerogative to use these kinds of things, and I don't know -- >>: So their solution would be something like using some standard encryption scheme. >> Chris Peikert: Right. That's actually one thing I wanted to point out. The encryption scheme that we define here is actually quite natural. It's not absurdly, you know, pathological in how it operates. I mean, you can view it as kind of a twist on [indiscernible] or something like that. And look, it just happens to have this deniability feature to it. So you can argue that you're using the scheme for other properties you may like of it, and [indiscernible] deniability. >>: [indiscernible] it's a different definition, but deniable encryption based on different assumptions. We can talk about that. >> Chris Peikert: Yeah, yeah. So I think David was next. >>: Yeah, just in particular, electronic voting would be a place where we actually do have deniability as a first-order requirement, and two, you actually have plan-ahead. This is a list -- >> Chris Peikert: Yeah, the candidates. >>: So let me ask you something.
It was a fake [indiscernible] which is [indiscernible]. Then you can actually find -- >> Chris Peikert: Then he can see what the real message was. That's right. >>: Okay. >> Chris Peikert: You say, look, I don't have it. I never had it in the first place. >>: But, I mean, if you are using this -- by using this system, probably we can generate this [indiscernible]. >> Chris Peikert: I never generated it. >>: So if he gets access to your computer -- >> Chris Peikert: I used the recommended key generation procedure, which is the software default. That's what I used. So -- >>: So fully understand -- >> Chris Peikert: Right. >>: So okay. To be able to do this faking, you need the basis. >> Chris Peikert: To be able to fake a secret key, I need a basis. >>: Okay. If you do not have the basis, you cannot fake. So if you want to fake, then you have to generate the basis? >> Chris Peikert: I never needed -- I never intended to fake anything, so I didn't generate it in the first place. >>: But okay. >> Chris Peikert: Your Honor -- >>: You're using the system because you want to fake, so you have the basis somewhere. >> Chris Peikert: The system just seemed really natural. It's fast -- >>: If someone corrupts your system -- >> Chris Peikert: Yes, absolutely. I want to distinguish between coercions and corruptions, because if someone breaks into your system before you have a chance to, you know, create the fake thing that you really want, then the jig is up. So this is for when there's a protocol involved in being coerced, whether it's a subpoena or -- >>: A subpoena is usually a subpoena of your hard drive, right? >> Chris Peikert: Maybe. I mean, in one case, you can actually prepare a hard drive ahead of time. >>: [indiscernible] the one-time pad is not just inherently bi-deniable; there was a real-life case of equivocation under coercion back in the '40s. >> Chris Peikert: Okay, good. That's good to know. Yeah, so the one-time pad meets all this. It's a secret key, of course, but who -- >>: It was Israeli [indiscernible] before there was a state of Israel. So it was illegal. >>: [inaudible] comes to you and says, I want to see some [inaudible]. There are two possibilities. One, you generate the [indiscernible] using the real key -- >> Chris Peikert: The ciphertext? >>: Ciphertext, ciphertext. >> Chris Peikert: So using the real encryption algorithm, okay. >>: Using the real encryption algorithm, in which case you cannot [inaudible]. >> Chris Peikert: That's right. >>: [indiscernible] in which case you can equivocate, but then they know you know the secret key -- >> Chris Peikert: Wait, you're confusing something. The basis is on the key gen side. The encrypt is on the sender side. There's no basis or anything like that. So again, I mean, I want to be clear that I don't think, you know, I'm not fooling anybody. I'm not convincing the coercer that this is actually the message that was sent or received. It's just that I want to protect myself by preempting the ability to be, you know, coerced and pinned down on any particular message. And so it's really more that, if you view it game theoretically, the coercer is going to know that you're not going to go along with it, so maybe they won't even try in the first place. Or if they do -- and you can't, I don't think you can give a subpoena saying give me the short basis for this if no such thing ever existed. It would be like having a subpoena for the uniform [indiscernible]. Craig? >>: So to flip the U to the P, intuitively you would receive [indiscernible].
>> Chris Peikert: Yeah. >>: [indiscernible] parallel fashion. Why didn't that work? >> Chris Peikert: To flip a U to a P, you somehow have to get -- yeah, you want to get a vector that is pointing very orthogonal to the thing. Roughly speaking, it seems hard to make the proof go through when X is uniform out there. It's got to look properly distributed. And you go orthogonally, and I don't know, somehow it just didn't seem to work out. >>: If you go out -- >> Chris Peikert: Yeah -- >>: -- secret key [indiscernible]. >> Chris Peikert: The problem is, somehow, even though you walk out in that direction, you've still got to sample a secret key from around that region, and just the radius of that is already too large. It's going to interact badly with the random X, so you can't kind of keep it under control. I mean, that seemed to be the [indiscernible]. It's no argument that it's [inaudible]. Yes? >>: [indiscernible] doesn't the ciphertext have to be as long as the possible [inaudible]. >> Chris Peikert: Yes, it will be -- yeah. >>: Isn't that fishy in and of itself, if I'm sending a ciphertext that's twice as long as the [indiscernible]? >> Chris Peikert: Right. >>: [indiscernible]. >> Chris Peikert: Actually, the prescribed encryption algorithm is also telling you to send a message, a ciphertext, that's that long. >>: I see, so you're intentionally padding out -- >> Chris Peikert: Yeah. >>: So you can only equivocate based on how much padding you decided on. >> Chris Peikert: Or send a new key every time, yeah. >>: We had this discussion before, but maybe by analogy, if you agree, at some point a fake encryption and fake [indiscernible] more functionality. >> Chris Peikert: That's right. >>: In some sense. So in some sense, of course, people are skeptical, and they come to you and they say use a [indiscernible]. >> Chris Peikert: They use [indiscernible]. >>: Maybe, like, situations where, you know, you come to a service and the service [indiscernible] five bucks, and, you know, you say yes, you get it; no, you don't get it. Suddenly, when you come out, somebody will tell you, give me five dollars, and you say, I chose no, or something. So it's kind of [indiscernible] why won't you pick up five bucks. But, you know, they can prove [indiscernible]. >> Chris Peikert: Right. I mean, that's what deniability is about: having a plausible, you know, non-falsifiable story that accounts for everything. >>: [indiscernible] you know the story about, you know, whatever life gives you, gives you. But this technique [indiscernible] you can do it. >>: Another thing is with deniable encryption, [indiscernible] makes a simulation [indiscernible] uses a technique, part of the larger protocol. You might end up finding somewhere this makes certain, you know, implementations of functionalities that you like much more efficient, and then you get bi-deniability [indiscernible] for free. Have you thought about particular functionalities that this would make easy -- >> Chris Peikert: I guess one consequence of this is you now get a two-round non-committing encryption scheme where before we only had three rounds. >>: Right. >> Chris Peikert: So that's a possible side effect. Which, again, even if you don't care about deniability or faking or anything, you've got a better -- >>: I see. >>: Just a -- dumb question, but CCA security is off the table, right? >> Chris Peikert: Yeah, pretty much. Actually, it's -- >>: Provably, I mean, just purely from the definition? >> Chris Peikert: It is a really good question.
I didn't think about it too much, but it's not clear whether it rules out CCA entirely. >>: [indiscernible]. >> Chris Peikert: You could give him one query. And then the other -- >>: You can use the public key for one question. But you can use encryption [indiscernible] you can use many messages, just as long as you -- >> Chris Peikert: [indiscernible] yeah. So this CCA question is -- I don't know that it's, like, ruled out. >>: [indiscernible]. >>: So this is the last talk, and David is going to talk about: we have the technology, now where next? David?

>> David Molnar: Thank you very much. What I want to talk to you about today is a little bit different from the talks we've had so far. I want to instead go over some of the things that we can do now that we've had such progress in cloud cryptography. So this workshop and the other work in the last five, ten years has seen breakthroughs in cryptography. We now have homomorphic encryption. We now have efficient secure multiparty computation, at least for some functionalities, and we have tamper-resistant cryptography. These are all problems motivated by practice that are making serious theoretical strides. And what we want to do with this, I suppose, is make things that were previously impossible possible. So for example, I recently heard we had a customer who wanted to run some inference algorithms over data held by our search engine. And they didn't want to reveal what that algorithm was. And we, of course, don't want to reveal all of our search logs to everyone in the world. So that sounds impossible, but with the techniques that we have today, we know that's becoming more practical. And in addition, there is a whole class of people who would like to move to the cloud, but there are things standing in their way, security concerns foremost among them. And one of the opportunities we have here is that these techniques we're talking about can perhaps get rid of some of these roadblocks and move people onto the cloud who otherwise wouldn't have gone. So I just want to share a few things I've run into that have actually stopped me from running things on the cloud, and I want to sort of match those up with the kinds of things that we talk about in workshops like this. This is the Amazon Elastic Compute Cloud terms of service as of last December. Just to get a feeling, how many people actually run something on EC-2? Okay. Well, we know you do, yes. Tom, of course, is our resident EC-2 internals expert. But I wanted to run an internal tool on EC-2 for testing purposes, to see how well it ran on the cloud. And so this is fine. But then I had to actually go read the terms of service. Does anyone want to guess which of these terms is objectionable? Any bets, anyone? >>: 4.4.1, 4.4.2. >> David Molnar: Yes, actually, both of those. 4.4.1 and 4.4.2 are the answer. In particular, this one that says you agree to give Amazon a copy of your application so they can make sure that it's verified in compliance with this agreement. I didn't include all the context, but really this agreement is saying you're not running gambling, you're not running malware, you're not trying to hack other people's computers. So, of course, you see that and you say, well, wait a second, what if my application is proprietary, like a pre-release version of, you know, some product? Or what if it's not even to be released, but it has proprietary information? What if I don't know what verifying compliance means? That's fairly vague.
Who is going to see this application? And it's unclear from this whether the data on which my application works is included or not. So this actually was a blocker. This stopped me from running something on Amazon. So in the context of the stuff we're talking about today, well, you know, we have some tools now we can address this concern with. We could obfuscate -- >>: [inaudible]. >> David Molnar: Yes, exactly. And that's the kind of -- so the question was, could we give them a universal Turing machine as the application? The answer is yes, of course we could do that. That's sort of a crude form of obfuscation, right? Because it wouldn't give them anything useful. But how do we actually convince ourselves that's the sufficient thing to do? And so these are all different things we could think of using in computing on encrypted data, to alleviate the issue of having to give away the data set we use. Or one thing that I've heard happens in practice is people only ship small amounts of data at a time to the application in the cloud. So if they lose some of it, it's not a big deal. But reasoning rigorously about all of these is one thing. And another thing that comes up is, how do you convince yourself and convince your peers that this is the right thing to do? So another thing that comes up when you start talking about whether you can run very high security jobs in the cloud is these physical security requirements. So here on the left, this is part of a best practices checklist for this standard called SAS-70. SAS-70 is not super interesting. It's just a list of things you have to do in order to get a certain stamp which then tells you, yeah, my data center is good to handle super secure material. And these are all kind of very interesting sort of physical security things. Like, you must have bullet-resistant glass. You must cut your vegetation so that intruders cannot hide behind it. You must have a man trap. Oh, a man trap is a sort of entrance device where there are two doors and only one of them can be opened at once. So you enter the first door, it shuts behind you, and then if you're a bad person, you can't get out. So, you know, we laugh, but this is actually fairly serious, because the people who wrote this standard were thinking in terms of that picture over there, where there are these giant cages and there's one cage around each property that holds different classified or highly sensitive data. This is not very elastic, right? You're not going to have people in Azure or EC-2 going around moving cages on the time scales it takes to spin up VMs. And this leads us to another issue, which is that several standards, including some we have here, require proof of deletion. If I give you sensitive data for processing, I need you to tell me you deleted it. Or I need to have some way of knowing that it was deleted. Now, of course, if you have your own machine, that's fairly straightforward, although maybe not quite as drastic as this. But this imposes a cost on a cloud provider which some of them aren't willing to take. And what do we do about this, from the point of view of people who are familiar with cryptography and know how to use it? One thing that's sort of working in our favor is that not every piece of data needs these kinds of protections. There's actually a movement towards labeling data, at least inside Microsoft, and some of the people are using our stuff. There are different ways to spell it, but it's kind of low, medium and high.
So around here, we're actually asked to sort of say, oh, is this data low impact, medium impact, or high impact. What does that mean? Well, low impact are things like, you know, public web pages, public keys, stuff where it's pretty much okay for it to be distributed to anyone in the entire company. We don't really care if it gets out that much. We care, but we don't care that much. Medium are things like addresses, phone numbers, or personally identifiable information. And then in high, you have things like credit cards, passwords, pre-release software, and that's where we start getting these physical security requirements, where people start saying, hey, wait a second, you know, I'm not going to give you this data unless you can tell me that it's got guards around it 24/7. And we actually have this thing now where the new SharePoint, that's our Wiki solution, will actually ask you to do mandatory labeling of data. So you can know, at any given point in time, is this, like, high security, medium security or not. And it just dovetails with the work we're talking about here: it says this is where you want to apply some of the techniques we've been talking about. And some of those things may have to stay on premises, and other things may go into the cloud. So the other thing that comes up that's interesting is we actually scan for high security data. We actually use an RSA product that searches for high security -- you know, data that's been identified as high impact. And we look for cases where people accidentally posted this data in places that they shouldn't. This is actually interesting when you talk about encrypted data, because now you have a tension between this product, which we like, because it sort of finds the Social Security numbers people left lying around on that Wiki page, and the fact that the data is encrypted, which means it's harder to scan. So this is a case where, you know, searchable encryption might be a way out. But we do have these processes. Intrusion detection is another one, actually, where people say, wait a second, I have encrypted packets going through; what do I do if I want my intrusion detection system to actually look at these packets and tell me if they're good or not? So figuring out a way to do that, without completely giving away your keys to everyone who says they need them for a middleware box, is a difficult problem.
So the other thing I just wanted to bring up is that there are these audit requirements that are starting to emerge for cloud services. This is one effort, this A6 group. They have an automated way of querying your cloud provider to ask, does it meet certain audit standards? So if you're a cloud provider, you can put up a little text file saying, yeah, I have this certification, and yes, I do have cages around my data center, and yes, I do have this kind of encryption in use. And the goal of these guys is to have you, as a customer of cloud providers, be able to write little scripts that say, okay, I want to pick whichever cloud provider meets these minimum requirements, and I want to just go ahead and post my data to whoever's cheapest today. Well, this is interesting from the point of view of cloud cryptography, because now the question is, can we actually use cryptography to assist this auditing? Can we use the cryptographic mechanisms we've talked about in this workshop to sort of prove we really are meeting the guarantees we say we are, at lower cost than having hordes of people from PricewaterhouseCoopers come in and inspect us every week? So for example, and this is just one example -- this project we have here; there are many other examples you can think of -- we have something we built on top of Azure blob storage, which is just a simple key-value store, very similar to Amazon S3. And what we do in this system is every time you end up storing data or getting data, we ask you to write a little, what we call an attestation. Think of it as a simple log message that chains back with all your previous log messages. And we built this in a way, using public key signatures, such that anyone in the world can verify these logs. So what this means is a third party can audit their use of this system, and say, okay, if there was a problem, if the table storage or the blob storage ends up corrupting bits or trying to replay stale data, a third party can tell, and then you can sort of say, okay, we screwed up, and here's some money back for your trouble. And that's just, you know, one example where you could use some cryptography to produce an audit trail that could dovetail with some of the work people are already doing in the community on cloud auditing solutions. And we also did some, you know, performance measurements and showed that you can get this at modest cost. So if you're interested in this, we have an implementation. Talk to me and I can show you the paper and everything. (A toy sketch of this kind of chained, signed log appears at the end of this transcript.) But where this is all going is, I've talked about some places where we have some [indiscernible] synergies in terms of what people are doing in terms of policy and what we've talked about in this workshop. But the truth is there's a huge catch-up left to be had between the people that are making these policies and those of you here in this room. So I'll give two examples. One is sort of a corporate policy, and the other is sort of European data protection, which I was lucky enough to talk to Casper about -- he was here yesterday, if any of you got a chance to talk to him. Both of these have a notion of sensitive data, where it's difficult to move the data outside of some perimeter. In the corporate case, we have this high impact data. We're not really allowed to move it outside of the premises. We're not allowed to give it even to Azure or EC-2. And in the European case, we have personally identifiable information. There's a whole set of things which are not supposed to leave the European Union. There are complicated exceptions to this, but in general it's not supposed to leave. Now, the question is, what if I encrypt this data and the key is not available to the cloud? Right? So I take an encrypted blob and I put it on the cloud. Is that still high impact data? Is that still data that's governed by these rules? And the answer is yes, actually. You do not get a free pass just by encrypting, according to the rules as they are now. Now, we know that if the keys are handled properly, that doesn't make any sense. And we know, based on the stuff we've talked about here, that you can still get a lot of high functionality by leveraging the new breakthroughs in cryptography that we've had. And you may even get better guarantees by having things in the cloud than by storing them on premises. But the fact is, the teams that set requirements today don't understand what we're talking about yet. Inside a company, there will be a bunch of teams -- legal, engineering, IT -- who get together and hash this out over months and months and months. I had a chance to talk to Casper and asked him who in the EU is making these rules.
There are about 2,500 people who are civil servants in the EU who make these rules, and about ten of them know what they're doing from a technical point of view. So this is the thing I just wanted to sum up here and say: look, we have this amazing technology, but there's a bit of work left to do before we can make the policy rules catch up with it. To sum up, we have incredible progress. Everything we've heard about here today, even if it's not completely practical yet -- we've seen in the last five years a lot of emphasis on making really practical technologies that can be deployed. Yesterday, it's Danish sugar beets. Tomorrow, it will be the world. But the question we have now is, how do we make them, you know, match up with the kinds of rules people are putting in place, which are preventing these applications from happening on the cloud, or happening at all. And then our opportunity is to say, how can we dovetail these new techniques with the kinds of audit requirements and data security requirements that we see people wanting to have? And finally, I just wanted to say that there's an opportunity for us, or an obligation for us, to sort of educate policymakers that another world is possible. That we can do some of these things that we're talking about, but we have to actually engage and figure out, you know, who exactly is the right person to talk with, and what it will take to change their minds. So thank you.
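As an aside on the attestation-log idea David described -- every store or get writes a small log entry that chains back to the previous one and is publicly verifiable with a signature -- here is a minimal sketch of that chaining pattern. It is not the actual system built on Azure blob storage; the class and function names are hypothetical, and it assumes the third-party Python 'cryptography' package for Ed25519 signatures:

```python
import hashlib, json
# Requires the third-party 'cryptography' package for Ed25519 signatures.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

GENESIS = b"\x00" * 32

class AttestationLog:
    """Each entry commits to the previous one by hash and is signed, so a
    third party holding only the public key can detect tampering anywhere
    in the chain."""

    def __init__(self):
        self._sk = Ed25519PrivateKey.generate()
        self.public_key = self._sk.public_key()
        self.entries = []                    # list of (body_bytes, signature)
        self._prev = GENESIS

    def attest(self, op, key, value_digest):
        body = json.dumps(
            {"op": op, "key": key, "value": value_digest, "prev": self._prev.hex()},
            sort_keys=True).encode()
        self.entries.append((body, self._sk.sign(body)))
        self._prev = hashlib.sha256(body).digest()

def audit(entries, public_key):
    prev = GENESIS
    for body, sig in entries:
        try:
            public_key.verify(sig, body)     # each signature must check out...
        except InvalidSignature:
            return False
        if json.loads(body)["prev"] != prev.hex():
            return False                     # ...and the hash chain must be unbroken
        prev = hashlib.sha256(body).digest()
    return True

log = AttestationLog()
log.attest("put", "photos/1", hashlib.sha256(b"data").hexdigest())
log.attest("get", "photos/1", hashlib.sha256(b"data").hexdigest())
assert audit(log.entries, log.public_key)
```

Because each entry's signature covers the hash of its predecessor, an auditor holding only the public key can detect both corrupted entries and replayed, stale prefixes of the history, which is the property the talk's storage example relies on.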