>> Kristin Lauter: Okay. Welcome. So thank you all for coming. This afternoon we are very pleased to have Nigel Smart visiting us from the University of Bristol. Nigel is a recipient of the Royal Society Wolfson Merit Award. He is a long time colleague who has worked in many fields of cryptography, including elliptic curve cryptography, and did early work on the standardization of elliptic curve cryptography. He has also been a member of the research staff at Hewlett-Packard Laboratories, and he is well known for giving very entertaining talks. So welcome, Nigel. Today he is going to talk to us about multi-party computation from theory to practice. Thank you. >> Nigel Smart: Okay. So I'm going to disappoint you after that intro, maybe not give such an entertaining talk, okay? So what I'm going to do, right, is a non-entertaining talk about multi-party computation, theory to practice. I'll set the scene at the beginning, which you can ignore as just the usual guff you put at the beginning of a talk to say it's really relevant; you've seen crypto talks before, yeah? Then what we'll do is look at the protocol. There will be zero complicated maths in the talk, so if anyone wants a really, really heavy math section, fall asleep now. The maths here you could teach a high school student. There is a lot of hidden math in the talk, okay? There is a lot of hidden stuff, but we're not going to actually talk about that. And at the end I'm going to show you some graphs and some other things, and if we've got time I'm going to show you a demo. Okay. So here is an example. If you talk to drug companies, they've got databases of molecules and toxicology results. Drug company B wants to know whether drug company A has tested some drug for toxicology before. So one drug company might be doing stuff to do with cancer, another might be doing something to do with Alzheimer's, and the cancer company has already tested a gazillion drugs and found out a few kill people. And then the other one, drug company B, says oh, I've got this new chemical, I want to know whether this would be good for Alzheimer's, and I wonder if someone's done a toxicology test on it before, because that would save them lots of money and time. And so if they could do this without revealing what drug they're looking at, and without the first company revealing its information, that would be a possibly good thing to do. It turns out this is actually really easy. It just sounds complicated. If you want to compare chemicals there's a standard technique: you represent each chemical as a bit string 880 bits long, and you just compute some statistical distance between these two bit strings, which you could do with any technology. It sounds like a really complicated thing but it's actually quite trivial to implement. So any old technique would do this. Another one -- actually, these slides are really old. This slide was from when I first gave this talk about six months ago to another company down south, and it's great how much I've predicted the future here. [laughter] So imagine there's some agency that wants to look at traffic, and maybe some other organization doesn't want to give this information away. Is there a way to solve this problem? Maybe there is, I don't know, but there we go. So you kind of get the idea. So I've made it even relevant for the current month. So there we go. Okay.
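To make the drug example concrete, here is a minimal sketch of the kind of fingerprint comparison being described. The talk doesn't name the similarity measure, so Tanimoto similarity, a common choice for chemical fingerprints, stands in here as an assumption; only the 880-bit length comes from the talk.

```python
# Toy sketch: compare two 880-bit chemical fingerprints.
# Tanimoto similarity is an assumed stand-in for the unnamed metric.
import random

FP_BITS = 880

def tanimoto(fp1: int, fp2: int) -> float:
    """Similarity of two fingerprints held as 880-bit integers."""
    both = bin(fp1 & fp2).count("1")    # bits set in both
    either = bin(fp1 | fp2).count("1")  # bits set in either
    return both / either if either else 1.0

# In the real setting each company holds its fingerprint privately and
# this comparison runs inside MPC; here we just compute it in the clear.
a = random.getrandbits(FP_BITS)
b = random.getrandbits(FP_BITS)
print(f"similarity: {tanimoto(a, b):.3f}")
```

The point of the example is exactly this: the function itself is trivial; the interesting part is evaluating it on inputs neither party will reveal.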
So these are both examples of what is called computing on encrypted data. We have two or three or four parties who have some data, and they want to compute a function on that data. And essentially there are two ways of computing on encrypted data that we know about. The first is fully homomorphic encryption, which is this new super-duper thing that everyone is really interested in, and here the basic paradigm is that party A sends encrypted data to party B, party B does some computation on it and returns the result to party A, who then decrypts and gets the result. So in some sense it acts like an outsourcing of computation: party B is computing on the encrypted data in some way. And then there is another way of computing on encrypted data, which has been around since the dark ages, which is the 1980s in cryptography land, and this is called multi-party computation. Here different parties put their inputs into a protocol, and the protocol itself computes the function they want to compute, okay? And then they get the output out. So one is kind of like fire and forget, and the other is an interactive protocol, okay? So that's two ways we can do this. In theory both technologies are brilliant. In theory we can compute everything. In theory, you know, this is some nirvana land where everything works and we're all happy. Okay, so the problem is that both technologies have an issue. Fully homomorphic encryption has a huge computational cost: computing anything apart from very simple functionalities is really, really expensive, okay? But the benefit is there is zero communication cost, which is good. In MPC the opposite holds: there is essentially no computational cost for computing on the encrypted data, but you pay in communication, because the parties have to engage in a protocol and transmit data between each other. In theory we can also make both technologies tolerate misbehavior: even if players want to deviate from the protocols, we can ensure that they actually follow them. We'll touch on that a bit later. Okay. So that was theory. What happens in practice? In practice FHE runs like a snail, okay? It's totally impractical for all but the simplest functions. But you can do some useful things with it, as the group at Microsoft here has demonstrated. So there is useful stuff you can do with it. MPC has actually been deployed for some operations in the real world. The Danish sugar beet auctions, okay? [laughter] Okay. Yeah. So this got a laugh without me having to insult the Danish, which is really good; if you Google the video of this talk on YouTube you can find the part where I insult my Danish colleagues. But you've already laughed, so it's fine. [laughter] Okay. So MPC has been deployed for the Danish sugar beet auction, and there is also a company in Estonia which has the Sharemind system. But these things are basically three parties, you can only have semi-honest adversaries -- the adversaries are assumed to follow the protocol, like just out of the goodness of their own hearts -- and you can only tolerate one out of three bad guys, okay? So this is a bad situation.
MPC is slightly practical, we can do real stuff with it, but we would like to have more than or fewer than three parties; the killer application is actually two parties. We'd like to tolerate all but one of the parties being bad, and if a bad guy decides to arbitrarily deviate from the protocol we'd like to be able to detect this, all without giving up efficiency. And we're going to do this. And the way we're going to not give up efficiency is, wait for it, we're going to use fully homomorphic encryption to put the MPC on steroids so it runs faster, okay? So this fully homomorphic encryption, which is kind of a slow thing, we're going to use to make MPC run faster, okay? That is what the whole point of the talk is going to be. Okay. So now that's the set up. Now we can do some math, very simple math, so it's all going to be easy to understand. Okay. So we're going to have N parties; N could be 2, 3, 4, 5, 6, 8, 9, 10, or whatever. The protocol is going to be linear in the number of parties in terms of its complexity, so N could really be anything you want, but we've run it with 10 parties and we haven't run it with any more than 10 just because we can't be bothered to open up more than 10 terminal windows to do it, yeah? So 10 is enough on my screen, so that's fine. Okay. All but one of them can be bad. So as long as one guy is okay, we're fine. Now, the set up is going to be some global secret key that no one knows. The global secret key is alpha, and it is secret shared as values alpha_1 up to alpha_N. Say we have two parties, Kristin and me. So I have alpha_1, Kristin has alpha_2, okay? And how do we come up with alpha? I just generate a random alpha_1, you generate a random alpha_2, and their sum is an alpha that neither of us knows, so that's fine. Okay. So that's just the set up. Okay, now we're going to need a secret sharing scheme. All data in the protocol is going to be secret shared between me and Kristin, and we're going to represent the data as elements of the finite field of p elements. A secret value x is going to be shared so that everybody holds two pieces of information: a value x_i, which is their share of x, and another value, which we'll call the MAC share, gamma_i(x). These are defined such that x is the sum of x_1 up to x_n (so with two parties, x_1 + x_2), and the gamma_i(x) sum to the value alpha times x, okay? Since no one knows alpha, and we never reconstitute this MAC value, even if we know x we can't work out alpha or the MAC value, yeah? So it's kind of nice and simple. Just for future reference, if you have a public constant v we can always form a trivial sharing of it: party one takes v as their data share, everyone else takes zero, and each party takes alpha_i times v as their MAC share. That's easy. So if you've got a public value we can always produce a sharing of it. How the parties actually get their input values into the protocol I won't cover in this talk, but that can all be done, yeah? See last year's Crypto paper for how that happens. But this is all nice and simple, yeah? If I was just computing on the value in the clear, I'd hold x itself, yeah? But because I'm computing on a hidden value, which is secret shared, instead of holding one value I hold two: my share of x and my share of the MAC. Okay, it's nice and simple. Okay. We're going to work in what's called the preprocessing model.
Most modern MPC protocols are in the preprocessing model, and the idea is that we have two phases of computation. In the first phase we just compute random stuff, and the random stuff is independent of the function we're going to evaluate and independent of the inputs that the parties are going to put into the function. So imagine you're a bank: this is the stuff you process overnight, and then when everything comes online your customer asks you to do X with some input, you have some other input, and then you actually start processing, yeah? So this is kind of the model of how a business would work. Okay. So in its basic form, and because this is an expository talk I want you to learn something from, I will give you the basic idea, okay? In the basic idea we will evaluate an arithmetic circuit, okay? In reality you don't want to think of arithmetic circuits as the way of describing functions, yeah? That is not a good idea. But just so you can see that this is universal computation, we are going to evaluate arithmetic circuits for the purposes of this talk. And because we're only interested in arithmetic circuits, the offline phase will simply compute triples. Each one is called a Beaver triple: a sharing of a random value a, a sharing of a random value b, and a sharing of c such that c equals a times b. For the first part of this talk just imagine that this happens magically, okay? So magically we generate these triples. We'll come back to this piece of magic later on. Okay. So how are we going to compute stuff? Well, from the point of view of this talk we're going to evaluate arithmetic circuits, and for those circuits addition and multiplication over F_p are universal gates. So we can do everything. As I said, we can assume the inputs are already shared, so all we have to do is work out how to add and multiply, and then we're done. We can go home, okay? It's nice and simple. We just have to work out how to add and multiply. It turns out addition is easy and multiplication is hard, okay? A bit like fully homomorphic encryption: addition is easy, multiplication is hard. Okay. So let's look at addition. Addition really is trivial. This is the second most complicated mathematical slide in the entire talk, right? See, here the mathematics is really very simple. We have two shared values x and y and we want to compute a sharing of z = x + y, so you just sum up your shares locally: z_i = x_i + y_i and gamma_i(z) = gamma_i(x) + gamma_i(y). And magically, by the power of addition, the equations are satisfied, yes? This is complex math here, yeah? A high school student would really have problems, yeah? Complex maths. But look at what happens here. If I were computing on the data in the clear I would have to do one addition. Here I just have to do two. So my blow up in computational time is just two. This is very, very trivial, yes? It's very easy to do and very, very fast. It's local computation; we don't have to do any communication. Okay. Now, this addition trick works because we have a linear secret sharing scheme, and a linear secret sharing scheme means that we can locally compute, without interaction, any linear function of our shares. To put that in more concrete terms, for those who are mathematically challenged: if we have public constants v_1, v_2 and v_3, and sharings of x and y, we can compute a sharing of v_1 times x plus v_2 times y plus v_3 without any interaction.
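As a concrete illustration, here is a minimal sketch of this share representation and the local linear operations, in Python. The class and function names and the toy prime are made up for the example; a real system uses cryptographic-size parameters and real networking.

```python
# Sketch of the sharing just described: party i holds a data share x_i
# and a MAC share gamma_i(x), with sum(x_i) = x and sum(gamma_i) = alpha*x.
# Toy parameters and a test-only dealer; an illustration, not an implementation.
import random

P = 2**61 - 1  # toy prime; a real system uses a suitably chosen large prime

class Share:
    def __init__(self, x_i, mac_i):
        self.x_i, self.mac_i = x_i % P, mac_i % P

    def __add__(self, other):        # <x> + <y>: purely local, no messages
        return Share(self.x_i + other.x_i, self.mac_i + other.mac_i)

    def mul_public(self, v):         # v * <x> for a public v: purely local
        return Share(v * self.x_i, v * self.mac_i)

def add_public(share, v, alpha_i, party):
    # <x> + v for public v, via the trivial sharing of a constant:
    # party 0 adds v to its data share; everyone adjusts the MAC share.
    return Share(share.x_i + (v if party == 0 else 0),
                 share.mac_i + alpha_i * v)

def deal(x, alpha, n):
    """Test-only dealer; real inputs enter via the input protocol."""
    xs = [random.randrange(P) for _ in range(n - 1)]
    xs.append((x - sum(xs)) % P)
    ms = [random.randrange(P) for _ in range(n - 1)]
    ms.append((alpha * x - sum(ms)) % P)
    return [Share(a, m) for a, m in zip(xs, ms)]

# Quick check: adding sharings of 3 and 4 reconstructs 7, with a valid MAC.
alpha = random.randrange(P)
zs = [a + b for a, b in zip(deal(3, alpha, 2), deal(4, alpha, 2))]
assert sum(s.x_i for s in zs) % P == 7
assert sum(s.mac_i for s in zs) % P == 7 * alpha % P
```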
Okay, so it's nice and simple. And we're going to use this fact in our method for multiplication, okay? First, a little note on notation: if I have a shared value x and I just reveal the x_i values, we're going to call this a partial opening, because we haven't opened the whole thing; we haven't revealed the MAC shares, we've only revealed the data shares. We never want to reveal a MAC share. That's a bad idea, so we're never going to do that. Okay. So, happy so far? Good. This is the most complicated mathematical slide in the entire talk, and if you look at it for 10 seconds you'll see it's not very complicated at all. I want to take a sharing of x and a sharing of y and multiply them to get a sharing of z. So how do I do this? Well, I take one of these magically precomputed triples off the list, call it a, b, and c, and then I partially open x minus a to obtain a value epsilon, okay? Now, this reveals no information about x, because you can treat a as a key, and this is a one-time pad encryption of the value x under key a. So this is information theoretically secure, yeah? >>: X minus a, or x_i minus a_i? >> Nigel Smart: What's that? Sorry. >>: So is this one party's [inaudible]? >> Nigel Smart: Yeah, yeah. So what happens is all parties compute shares of the value x minus a, and then they partially open the result. So from that we can all compute x minus a; everybody knows x minus a publicly. I send my x_i minus a_i to you, you send your x_i minus a_i back to me, okay? Then we partially open y minus b to get rho, and again rho is a one-time pad encryption of y under key b. And now we just compute the linear function z = c + epsilon times b + rho times a + epsilon times rho. Ta-da! And why this works is because of this long piece of math here. Look, you compute that linear function, you stick in the things you first thought of, you simplify, and you get x times y. So it's magic, okay? A beautifully magic protocol. So what happens here is that to multiply I have to consume something off my list, I have to engage in a little bit of communication, and then I do a local computation. And that's it. In some sense, the reason why I say computing arithmetic circuits is the wrong notion is because the basic operations are: take something off the list, do some communication, compute a linear function. It just happens that we build a multiplication gate out of those three basic operations. If you want to evaluate more complex functions, like I'll have at the end of the talk, it turns out that these three basic operations are what you use, not arithmetic circuits, okay? But we want to keep it simple, so we're just doing multiplication and addition at the moment. So we can add, we can multiply, we can do everything. We can go home. We've done everything, yeah? Nice and simple, yeah? Okay, so the problem is that we could have cheaters. How do we know that all the parties have followed the protocol correctly? That's where the MACs come in. Just go back and see what could go wrong. Well, I'm computing on local data, just computing along, blah, blah, blah, so the only time I communicate with someone else, and could possibly send a wrong piece of information, is in the partial openings, yeah? All the time I'm computing on shared secret values, and every time I add shared secret values or form a linear combination, my shares of the MACs get computed correctly.
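Before getting to how those opened values are checked, here is a minimal sketch of the triple-based multiplication just described, continuing the Python example above (it reuses P, Share, add_public, and deal from that sketch; the test-only dealer stands in for the magic preprocessing).

```python
# Beaver-style multiplication: consume a triple (a, b, c = a*b), partially
# open epsilon = x - a and rho = y - b, then compute a linear function.
def partial_open(shares):
    # each party broadcasts only its data share; MAC shares stay secret
    return sum(s.x_i for s in shares) % P

def beaver_mul(x_sh, y_sh, triple, alpha_sh):
    a_sh, b_sh, c_sh = triple
    eps = partial_open([Share(x.x_i - a.x_i, x.mac_i - a.mac_i)
                        for x, a in zip(x_sh, a_sh)])       # x - a
    rho = partial_open([Share(y.x_i - b.x_i, y.mac_i - b.mac_i)
                        for y, b in zip(y_sh, b_sh)])       # y - b
    z_sh = []
    for i in range(len(x_sh)):
        z = c_sh[i] + b_sh[i].mul_public(eps) + a_sh[i].mul_public(rho)
        # eps*rho is public, so add it with the trivial-constant trick
        z_sh.append(add_public(z, eps * rho, alpha_sh[i], i))
    return z_sh   # sharing of c + eps*b + rho*a + eps*rho = x*y

# Usage: dealer-generated triple, two parties, 3 * 5 = 15.
alpha_sh = [random.randrange(P) for _ in range(2)]
alpha = sum(alpha_sh) % P
a, b = random.randrange(P), random.randrange(P)
triple = (deal(a, alpha, 2), deal(b, alpha, 2), deal(a * b % P, alpha, 2))
z_sh = beaver_mul(deal(3, alpha, 2), deal(5, alpha, 2), triple, alpha_sh)
assert sum(s.x_i for s in z_sh) % P == 15
assert sum(s.mac_i for s in z_sh) % P == 15 * alpha % P
```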
So all I have to do is check that the partially opened values are correct, yeah? Okay. So how are we going to do this? This is a trick that's going to be in ESORICS this year. What we do is very trivial -- now that it's been accepted I can say it's trivial, yeah? Okay, so let's say that the set of all partially opened values is the set of values a_j, for j equals one up to t, where t is the number of things we've opened; all the players agree on this set. We can batch these: we can take t to be 10,000, say, and every 10,000 openings we do this check, okay? And for each one of these partially opened values, each individual has a sharing of the MAC on that value. So we want to know whether these MACs correspond to these partially opened values, yeah? Without revealing the MAC shares, and without revealing the shares of the MAC key. Okay. So this is how we're going to do it. We generate, from some jointly produced random string, a set of random values r_j, one for every value we want to check. Then we all compute the value a, which we can do locally because we all know the values that have been partially opened: we just multiply them by the r_j and take the sum, so a is the sum of r_j times a_j, okay? This is public, but it is random because it depends on the r_j. Now, we can also locally compute our share of what we think is the MAC of a by applying the same linear function to our MAC shares, okay? Then we compute sigma_i, which is that MAC share minus alpha_i, our MAC key share, times a. If everything is correct, the sigma_i will sum up to zero, okay? So all we have to do is broadcast sigma_i -- we make sure we commit to it first, then broadcast it, yeah? So we don't have a problem with who goes first or who goes last. So we just commit and then broadcast sigma_i, and add them up. If we get zero we're happy; if we don't get zero we know someone at some point gave us a wrong piece of information, and we abort the protocol and say someone is a cheater. But we cannot determine who the cheater is, okay? Because we haven't got an honest majority we kind of can't -- no, that's not true. That's not true. Yeah, there are some honest majority protocols for which you can identify cheaters, but we can't in this dishonest majority setting. Okay. So we can verify correctness. So we're done. Yes. >>: So we do this once every t multiplications. >> Nigel Smart: Yes. Or every t times you've opened something, yeah? However many you want you can batch up, and you can look at the performance metrics of your network, but we've taken about 100,000: every 100,000 openings you do this, plop, plop, plop, and it just checks. And at the end of the computation -- you've got to make sure you always do it at the end. Actually, at the end you have to do it, then you do the opening of the output, and then you do it again, yeah? So at the end you actually have to do it twice. You do it that way because I don't want to reveal the output of the function unless I know everything has been going correctly, yeah? So I do it once, then I do the opening, then I do it again to check that the opening of the actual final answer is correct. Yeah. Okay.
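Here is a minimal sketch of that batched check, again with illustrative names; the jointly generated r_j and the commit-then-open step are stubbed out, since those need their own subprotocols.

```python
# Batched MAC check: opened[j] are the public values a_j; mac_shares[i][j]
# is party i's MAC share on a_j; alpha_sh[i] is party i's MAC key share.
import random

def mac_check(opened, mac_shares, alpha_sh, p):
    r = [random.randrange(p) for _ in opened]          # joint randomness
    a = sum(rj * aj for rj, aj in zip(r, opened)) % p  # public combination
    sigma = []
    for i in range(len(alpha_sh)):                     # each party, locally
        gamma_i = sum(rj * g for rj, g in zip(r, mac_shares[i])) % p
        sigma.append((gamma_i - alpha_sh[i] * a) % p)
    # each party commits to its sigma_i, then all open; abort on nonzero
    return sum(sigma) % p == 0
```

If all the MACs are honest, the gamma_i sum over the parties to alpha times a, so the sigma_i sum to zero; a wrongly opened value slips past only if the random r_j combination happens to cancel it, which occurs with probability about 1/p.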
>>: In practice does it make much sense to check intermediate values? Because you know you'll catch it at the end if people have been cheating. >> Nigel Smart: But you have to check all of them. So you can't just check the final value; all of the partially opened values you have to check. At some point you have to check all of them, so you might as well do them as you go through, in batches. You don't store them all up to the end, because you could have a very big computation. Yeah. Okay. So the thing I haven't told you is how to do the preprocessing. And I haven't told you anything about fully homomorphic encryption yet, so you should be going: why is he talking about fully homomorphic encryption? Well, we are going to use fully homomorphic encryption in the preprocessing. The other thing you should notice is that the online phase involves only information theoretic primitives, apart from creating joint random values and checking that broadcast works and blah, blah, blah, which involve hash functions; everything else is very simplistic, no complicated math. Okay. Here is where some complicated math comes in, but we're going to hide it. Okay. So what I'm going to do is again explain a very naive version; in the real world you add lots of bells and whistles to what I'm going to describe to make it run a lot, lot faster, okay? But let's just assume we're doing a simple version of the preprocessing. So assume I have a fully homomorphic encryption scheme, which is a public key scheme -- so we've got a public key and a secret key -- whose plaintext space is the finite field with p elements. In practice it's not going to be the finite field of p elements; the plaintext space is going to be some mega ring mod p, so I can pack things in a SIMD manner and do a gazillion at once, yeah? But we'll just keep things simple. And it's fully homomorphic, in that if I take a message m_1 and encrypt it to get ciphertext one, and take a message m_2 and encrypt it to get ciphertext two, then if I add the two ciphertexts, whatever that means, and decrypt, I get the sum of the messages mod p. And there is another procedure, which is multiplying ciphertexts: if I multiply the ciphertexts and decrypt, I get the product of the messages. Okay. Now, we are only going to need to evaluate circuits of multiplicative depth one; the circuits we are going to evaluate with fully homomorphic encryption just have multiplicative depth one. That means it's not really fully homomorphic encryption, it's somewhat homomorphic encryption. But it's more cool to call it fully homomorphic encryption, so I'll continue to call it that, just to add to my coolness. Because it's only multiplicative depth one, we can run things very fast, because we don't need this bootstrapping technique of Gentry's, which makes things go very slow. Okay. We're going to need something slightly more complicated from the fully homomorphic encryption scheme: we're going to need to be able to do distributed decryption, okay? That actually makes things slightly less efficient, because distributed decryption in lattice-based schemes is not as nice as it is in, like, normal discrete log-based schemes, for example. So we assume the secret key is shared: no one actually knows the secret key for the FHE scheme, but each party holds a share sk_i of the secret key, and together they can decrypt any old ciphertext. So we assume there is a procedure for doing this.
I'm not going to explain what that procedure is; you can easily define it for the BGV FHE scheme, and for most FHE schemes you can define it. So let's just assume it, magically, as a magic box. Now, remember those alpha_i's we had earlier? Those are the key shares; we each generated a random alpha_i. We're going to encrypt each alpha_i with respect to the public key and then broadcast it. Because everybody's got the encryptions of the alpha_i's, everybody can compute the encryption of alpha, even though we don't know alpha, because the scheme is additively homomorphic and alpha is the sum of the alpha_i's. So we can assume that everyone's also got an encryption of alpha, yeah? Okay. Now, we're going to need a clever protocol here, which is quite funky, so I'll go through it. We're given a ciphertext, ct, which encrypts a data value m. And what we want to do, given that encryption, is create an additive sharing of m, because additive sharings are what we really need in our protocol, yeah? So we obtain an encryption of something in some way, which we'll explain in a minute, and then we want to obtain an additive sharing of the value that was encrypted. And if need be, we also want to come up with a fresh ciphertext -- in other words, a ciphertext which doesn't have very much noise in it -- which also encrypts m, yeah? Because the input ciphertext could be quite noisy, so we want to clean it at the same time. Okay, so we do this with this reshare protocol. It's very simple, so we might as well go through it. Each party generates a random value f_i and transmits a ciphertext which encrypts f_i. Given those, everybody can now compute a ciphertext which encrypts m plus f, where f is the sum of the f_i's, because they can take the input ciphertext and add in all the ciphertexts of the f_i's, okay? So this ciphertext is an FHE encryption of a one-time pad encryption of m, okay? So it's perfectly legit to decrypt it, because no one is going to work out what m is if we decrypt this. So we decrypt it with our shared decryption thing to give us m plus f, and now we create the sharing: party one sets m_1 to be m plus f minus the thing they first thought of, so minus f_1, and every other party sets m_i equal to minus the thing they first thought of, minus f_i; and then we see that the m_i's sum up to m, which is what we wanted. And if we want a new fresh encryption, we just encrypt m plus f with respect to some default randomness, like zero, and subtract the encryptions of the f_i's. This is now a new encryption of m, but at multiplicative depth zero, yeah? Because no multiplications were required to compute this new ciphertext. So we can get a fresh ciphertext out, and we can do the additive sharing with this. Happy? Okay. So this is how we're going to generate the preprocessing. First we have to generate a and b, and then we have to come onto c. Okay. So how are we going to generate a? Well, as follows: everybody generates their own a_i randomly, which guarantees that a, the sum of the a_i's, is random, because at least one of us is honest and will come up with a truly random a_i. Then we transmit the encryptions of these a_i's and we add them all up, okay? That gives us an encryption of a.
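Here is a minimal sketch of that reshare step. The Ct class below is an insecure stand-in that just carries the plaintext, and dist_dec stands in for the real distributed decryption; the point is only to show the data flow under those stated stand-ins.

```python
# Toy stand-in for the somewhat homomorphic scheme: Ct carries the
# plaintext mod P, supporting homomorphic add and one multiplication.
import random

P = 2**61 - 1

class Ct:
    def __init__(self, m): self.m = m % P
    def __add__(self, o): return Ct(self.m + o.m)   # homomorphic addition
    def __mul__(self, o): return Ct(self.m * o.m)   # depth-one multiply

def enc(m): return Ct(m)
def dist_dec(ct): return ct.m   # stand-in for joint threshold decryption

def reshare(ct_m, n, new_ct=False):
    f = [random.randrange(P) for _ in range(n)]   # party i's secret f_i
    ct_f = [enc(fi) for fi in f]                  # broadcast enc(f_i)
    ct_mf = ct_m
    for c in ct_f:
        ct_mf = ct_mf + c                         # enc(m + f), f = sum f_i
    m_plus_f = dist_dec(ct_mf)   # safe to open: f one-time-pads m
    # party 0 takes (m+f) - f_0; everyone else takes -f_i; shares sum to m
    shares = [(m_plus_f - f[0]) % P] + [(-fi) % P for fi in f[1:]]
    if not new_ct:
        return shares
    # Fresh ciphertext: re-encrypt m+f with default randomness and
    # homomorphically subtract the enc(f_i)'s; a depth-zero encryption of m.
    fresh = enc(m_plus_f)
    for fi in f:
        fresh = fresh + enc(-fi)
    return shares, fresh
```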
Now, remember that we also want a sharing of the MAC of a. The way we do that is: we've got this encryption of alpha and we've got an encryption of a, so we multiply these two together, and that gives us an encryption of alpha times a, which we then reshare, with the protocol on the previous slide, to get the sharing of the MAC. Yeah? Very simple. And we do the same thing to compute b. So how are we going to compute c? >>: [inaudible] >> Nigel Smart: What's that? Sorry. >>: [inaudible] >> Nigel Smart: No. >>: [inaudible] >> Nigel Smart: We multiply the ciphertext for a and the ciphertext for b. The ciphertext for a needed no multiplications to produce it, and the ciphertext for b needed no multiplications to produce it, so with a circuit of multiplicative depth one we compute a ciphertext which encrypts a times b, and then to get the sharings we just reshare. We apply the reshare protocol, obtaining a fresh ciphertext which hasn't had any multiplications in it and which encrypts the same value, because we want to multiply again and we don't want to use a depth two FHE scheme -- that would just be too much hassle, yeah? So this trick is just to avoid having to use a depth two circuit. So now multiplying the ciphertext for alpha by this new encryption of c gives us an encryption of alpha times c, and then we reshare it and out pop the sharings of the MAC on c. Okay. So this is efficient, very efficient, because we only compute with depth one circuits. Yeah? >>: I mean, if you're only doing one multiplication you could just use like [inaudible], right? >> Nigel Smart: No. [laughter] Good. Everyone, every talk, asks that. That's the standard question at this point. >>: Because some of the resharing stuff seems like maybe -- >> Nigel Smart: The problem with schemes like that is that the plaintext space is really small. >>: Right. >> Nigel Smart: And you can't do the packing. So you would be sending a 1024 bit or 2000 bit thing to get one bit of information. Here I'm sending, like, huge ciphertexts, but I'm packing 10,000 things in at one go, so it's kind of -- yeah, yeah. It doesn't work, alas, okay? So we only compute depth one circuits, and everything's very fast. We can also do other preprocessing. Later on I'm going to explain some different functions that I can compute; we don't just do arithmetic circuits, we do other preprocessing. We compute shared random bits. We compute sharings of a and sharings of one over a as preprocessing. Yes, there is a lot of preprocessing we can do, which allows us to leverage a lot of prior work on MPC protocols and make it more efficient, because we can put a lot of the computation into the preprocessing phase, where we can use the power of FHE, whereas prior protocols had to do all this extra computation in the online phase. So there's quite a big benefit we get out of this.
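Continuing the toy interface from the reshare sketch (enc, Ct, reshare, P), here is how the triple generation just described would look; ct_alpha is the public encryption of the MAC key, and everything stays within multiplicative depth one.

```python
# Triple generation sketch: only depth-one circuits, with reshare used
# both to extract additive sharings and to refresh the ciphertext of c.
import random

def enc_random(n):
    """Each party picks a random share and broadcasts its encryption."""
    shares = [random.randrange(P) for _ in range(n)]
    ct = enc(0)
    for s in shares:
        ct = ct + enc(s)        # encryption of the sum = the random value
    return shares, ct

def make_triple(n, ct_alpha):
    a_sh, ct_a = enc_random(n)             # sharing and encryption of a
    b_sh, ct_b = enc_random(n)             # likewise for b
    mac_a = reshare(ct_alpha * ct_a, n)    # sharing of alpha * a
    mac_b = reshare(ct_alpha * ct_b, n)    # sharing of alpha * b
    # c = a*b: one multiplication, then reshare with a fresh ciphertext so
    # we can multiply by enc(alpha) again without a depth-two circuit
    c_sh, ct_c = reshare(ct_a * ct_b, n, new_ct=True)
    mac_c = reshare(ct_alpha * ct_c, n)    # sharing of alpha * c
    return (a_sh, mac_a), (b_sh, mac_b), (c_sh, mac_c)

# Sanity check with the toy scheme: the data shares multiply up correctly.
alpha = random.randrange(P)
(a_sh, _), (b_sh, _), (c_sh, _) = make_triple(2, enc(alpha))
assert sum(a_sh) * sum(b_sh) % P == sum(c_sh) % P
```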
Okay. So the protocol we've been describing is called the SPDZ protocol. The first presentation was at Crypto last year. There have been all sorts of bells and whistles added since, and we can compute all sorts of different stuff. It's very efficient and very practical for some applications. Very, very, very practical for some applications, okay? It's got better security properties than other MPC implementations: you can have a dishonest majority, you can have any number of parties N, from two up to whatever -- 10, because that's how many windows I can put on my screen -- and it's actively secure. So it's good, yeah? This is everything you've ever wanted from MPC but were afraid to ask. It's really flexible in terms of parameters. It's not really suited to evaluating binary circuits, though. If you've got a function that is best described as a binary circuit, or a circuit over a small finite field, then SPDZ is probably not the protocol for you. There's another protocol called Tiny-OT, which is better for that, which is the other one presented at Crypto last year. And it turns out that there are trade-offs between SPDZ and Tiny-OT. Even if you were evaluating, like, a hash function, where a binary circuit is the natural way of describing it, it's not necessarily clear which one is fastest. But they are pretty comparable; they have roughly the same sort of performance, except Tiny-OT is focused on two-party binary circuits, and SPDZ is for any number of parties and things that you describe with operations mod p. Okay. Let's look at the performance. We've implemented all sorts of things here for various people, to investigate different applications. As I said before, the real key is that we don't evaluate circuits; the key operation is opening data. And it turns out that if you're actually doing computation, what matters is not the amount of data you open. So here's another thing for theoreticians: theoreticians often count the total amount of data transferred as the complexity measure. I don't really care. The thing that really slows these protocols down is the total number of connections you make -- how many packets you send -- not the total amount of data in the packets, yeah? So what we do is we have compilers which minimize the number of times we actually call open, and we open, like, maybe a thousand things at once. So we reorder the program so that we do the same amount of communication, but in a smaller number of rounds. We can also offload stuff into the offline phase. We expect more impressive results to come out in the next few months; obviously this talk is ages old, so some of the data can be a bit out of date, but we are continually improving it. So here are our current run times -- sorry, our run times as of about February this year. Okay. So here we have one thread, four threads, seven threads on the machine. We have one control thread, which is why we haven't gone to eight, okay? These are the total numbers of multiplications per second. What we found interesting is to plot latency versus throughput. The reason we do this is that most MPC performance figures give you throughput and hide the important figure. You know, if I say I can get a throughput of a gazillion a second, but actually getting the first one out of my pipeline takes ten seconds, I'm not really interested in the fact that you can get a gazillion a second of throughput, yeah? I'm interested in the fact that I have to wait ten seconds. So it's important to see how you can trade latency against throughput; these latency versus throughput graphs are much more important for practice than raw throughput figures. Interestingly, [inaudible] has done the same graphs for Tiny-OT and it really is boom, boom all the time. So this is kind of nice. Okay, so what do we have? For multiplication we can get very, very small latency, yeah?
And we can really push up to 600,000 multiplications a second. That's about a 286, yeah? Then you go, wait a minute, okay? But yeah, for MPC a 286 is really good performance, okay? At 286 performance we can do really well. And the issue here is that seven threads is not better than four, because at this point we really are maxing out the network card, not the network, yeah? We need more network cards in the machine. The threads are just pumping the network card and it can't keep up. Okay. Now comparison; here's an interesting thing. In a normal programming language, integer comparison is faster than integer multiplication. Not so with MPC: integer comparison is more expensive than multiplication, okay? Here we can do about 6,000 per second, with latencies of about 10. It depends on whether we're comparing 32 bit or 64 bit numbers. So we work mod p, where p could be 120 bits long, but we can say we know the number inside is 32 bits, because we've got a range analysis which says the thing inside is 32 bits, yeah? So that's quite reasonable. Nowhere near a 286 -- a 286 could do however many million of these a second, okay? So we're getting worse. Sorting. You want to sort some numbers? That's just comparisons. One of my postdocs can now sort a list of a million entries. It takes many hours, but he can sort a list of a million using this. This slide is some early stuff: in 2.5 seconds we can sort a list of 400 32-bit numbers. Yeah. There's three of you; let's go with you first. >>: In the MPC context where you already have half the inputs -- >> Nigel Smart: So in this context we're assuming that the values have already been shared. If each party held half the inputs, they'd only have to merge, so it's much easier, yeah? We're assuming the worst possible case, where people just dump shared data and now we want to sort it. Yeah. >>: And so what does it mean to have one thread in a multiparty multiplication? >> Nigel Smart: Okay, so what we're doing is, in our system we can have multiple threads playing the part of each party, so we can do many, many things in parallel. It's just sorting we haven't managed to -- >>: So one thread doesn't mean -- >> Nigel Smart: No, one thread means that one party is running one thread rather than maxing out his processor. The processor has got eight cores on it, so he's only maxing out one core. Yeah. >>: That's independent of the number of participants? >> Nigel Smart: It's independent, yeah. Okay, so these figures are for two participants. If you've got three participants there's like a 10 percent extra cost, with four participants maybe 25 percent extra; we don't really see much difference, so we just give the two party case. Yeah. >>: Oh, it was about what does [inaudible] mean? >> Nigel Smart: What does [inaudible] mean, okay, yeah. Okay. So here we have fixed and floating point multiplication. Now, if you want to do something interesting, you're going to want more than floating-point multiplication. So currently I've got students over the summer who are implementing a whole floating point library for me, so we can do sine, cosine, sinh, square root. Yeah? >>: Single or double precision?
>> Nigel Smart: These are single precision. Yeah. We can go to double precision, it's just much more complicated; these are single precision values. And here we see we can do multiplication at about 6,000 a second -- so we've got kind of worse again, okay? Addition is really bad. Okay, so Dan Bernstein wasn't surprised when I showed him this, because he said it turns out Intel say that the addition on their processors uses up more energy than the multiplication, okay? In MPC land, if you think about it for five seconds you realize why addition could be more complicated, right? I've got floating point numbers, each with a mantissa and an exponent. Yeah, suddenly Josh just went, oh yeah. Multiplication is just trivial, whereas for addition I have to shift the mantissa around dependent on the exponents, and I don't know what the exponents are, yeah? So it's much more tricky. So there we go, you've got floating point addition. Fixed and floating-point comparison we can do at about 4,000 a second. So yeah, we've got numbers, loads of numbers. Yes? >>: What's, like, the storage cost? >> Nigel Smart: Well, in terms of offline processing? >>: Well, I mean just in terms of what you have to keep sitting around. Because I mean, for example, like the thing from [inaudible], it seemed like you were keeping around like a couple hundred thousand -- >> Nigel Smart: Okay, so for these things it's not very much at all. It turns out that for the floating point operations you are storing maybe like a hundred times more data than you would normally, right? In files. But really it's peanuts compared to what you're actually doing. There is stuff, like AES, that can be quite expensive. >>: Is there kind of a space time trade off, where if I were willing to store a little more I could get a better output? >> Nigel Smart: No, this is assuming the best. >>: This is assuming I've taken -- >> Nigel Smart: Yeah, you've taken a whole data set. But in practice what you would do is have, like, two coupled computers, one doing the offline phase pushing data into the online phase, or you could have one core on the computer doing the offline phase. It depends how you want to integrate them together as a system. >>: Okay. >>: When it comes to [inaudible] share the same order except the negative ones are [inaudible]? >> Nigel Smart: [inaudible] because in MPC land we encrypt the exponent in floating point, whereas in fixed point it's not encrypted, so you've actually got less to do, much, much less to do. Much, much easier. Yeah, which is why a lot of prior work here has always looked at fixed-point numbers. These floating point numbers are using algorithms from a paper, I think by some Japanese authors earlier in the year, I can't remember. We took their algorithms and tweaked them to our setting. Okay. [inaudible]. So here's a small finite field example. Okay, yeah, we've got some time. So imagine you want to verify passwords. This is a static password solution that Ari Juels explained at the Real World Crypto workshop. We've got some static password we want to verify, and we're going to do it with two servers, without storing the password. Okay. This protocol is stupidly simple, right? Okay. So just to set the scene.
So I have some password P. Okay, so we assume that we've got two servers, and one server has stored one value and the other server has stored the other value, such that the XOR of them is equal to the password we want to verify. So to check a password attempt without revealing it, what we do is generate a random XOR sharing of the attempt, and send P1 to server one and P2 to server two. Server one just combines its bits with its stored value, server two combines its bits with its stored value, and then all we have to do is a secure comparison check. Okay, so this is nice and trivial. Dumb. Okay, so the problem is [inaudible]. Okay. The reason that you might want to evaluate the AES function in MPC, which is something that we introduced in an Asiacrypt 2009 paper -- and everyone went, why the hell would you want to do this? -- the reason you might want to do this, which you've probably just discovered, is that you might want to verify dynamic passwords. So think SecurID tokens, or EMV CAP. If you don't know what EMV is, come to my talk tomorrow. If you don't know what EMV is, you're going to be using it in a few years; if you go to Europe, it's why you can't buy stuff on your credit cards. Okay. So the password could typically be the AES encryption of some one-time message M, like the date or whatever, yeah, under some key: you get the password and then you type it in. And you can replace AES with any other keyed PRF: for example DES, or HMAC-MD5, or HMAC-SHA-1, or whatever. Yeah. Dumb. So we've been looking at doing this for a dynamic password thing. I'll show you a demo in a few minutes, at the end, okay? So we need to be able to evaluate AES, or we need to be able to evaluate MD5 or SHA-1. So we can evaluate AES. We're getting latencies now of about 10-20 milliseconds to evaluate AES between two servers, which is really quite [inaudible]. Or we can do a thousand a second. You see the difference here? You can either get the latency down here, or you can push the throughput up here. You cannot get both at the same time, which is why, when people claim they can do a thousand a second -- yeah, you can, but at the latencies people care about we can run ten or twenty times faster than you. And we're actively secure, okay? So there we go. This is AES; if you zoom in you get more and more details, okay. You might want to do DES, especially if you're an old bank. You know, it's still pretty reasonable. If you talk to a major European bank, they're looking at latencies for their verifications of about 300 milliseconds, and a peak throughput of about 300 verifications per second across the day. So they want to be at this point here, which is not far off, yeah? So it's pretty close to where you want to be. So yeah, increase the number of cores and you're done; make a better network card and you're done, yeah? So it's kind of very easy. MD5: pretty bad. SHA-1: really bad. It turns out SHA-2 is better than SHA-1, yeah? So it turns out that [inaudible], or whoever it is, is designing cryptographic algorithms now to be friendly towards MPC. AES is a beautiful function for MPC, yeah. You'd go to a block cipher designer and they'd say a cipher should be a really complicated Boolean circuit, blah, blah, blah, with very little mathematical structure. Yeah, right. AES really has no mathematical structure, does it? I mean, it's very, very, very nice. It's parallel, it's got beautiful mathematical properties.
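As an aside, here is a minimal sketch of the two-server static scheme described above, before the AES discussion continues; the XOR-sharing part is all the servers do locally, and the final secure equality test (the actual MPC step) is elided.

```python
# Two-server static password check via XOR sharing. Neither server ever
# stores or sees the password; function names here are illustrative.
import secrets

def register(password: bytes):
    s1 = secrets.token_bytes(len(password))
    s2 = bytes(x ^ y for x, y in zip(password, s1))
    return s1, s2            # server 1 stores s1, server 2 stores s2

def attempt_shares(attempt: bytes):
    p1 = secrets.token_bytes(len(attempt))
    p2 = bytes(x ^ y for x, y in zip(attempt, p1))
    return p1, p2            # fresh random sharing of the attempt

# Each server locally XORs its stored value with its received share:
#   t1 = s1 ^ p1,  t2 = s2 ^ p2.
# Then t1 == t2 exactly when attempt == password, so the servers finish
# with a secure equality test on t1 and t2, never reconstructing either.
s1, s2 = register(b"9d5ee72c")
p1, p2 = attempt_shares(b"9d5ee72c")
t1 = bytes(x ^ y for x, y in zip(s1, p1))
t2 = bytes(x ^ y for x, y in zip(s2, p2))
assert t1 == t2   # in the clear here; in reality this test runs inside MPC
```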
AES is super-duper fast for MPC evaluation. When we chose it in 2009, we chose it as the benchmark for MPC calculations because we thought it would be a complex function for MPC to cope with. It actually turns out to be an easy function for MPC to cope with. SHA-1 is the worst so far, okay? SHA-2, it turns out, does a lot more in parallel, so SHA-2 runs faster than SHA-1. Okay. Yeah. Haven't looked at SHA-3 yet. Okay, so I'm going to show you a demo. How do I do this, right? I go escape? Yes. Then it was on exploder. Okay. So, assuming that all of the networks work, and the machines back at base -- where it's some god-forsaken time in the morning -- haven't fallen over; okay, so it's a bit dodgy, but it normally works. So what I'm going to do is type something into a website. It's going to go from this webpage over the ether to Bristol, to the Bristol web server. The Bristol web server is going to capture the data and send it to two back end servers, who are going to engage in two AES calculations. The reason they have to engage in two AES calculations is that we don't want to store a per-user AES key. We're going to use a PRF to derive the per-user AES key from a master key, and only the master key is going to be stored. So there are two AES calculations to be performed. The two servers do the calculation, decide whether I should be allowed into the system or not, okay, return the value to the web server, and then it's going to come back over the ether to here, display on this terminal, and then go up and display on the screen there. So I mean, this is magic! Yes? So I'm going to pull things out of my hat. Now, I would like to point out that when we first looked at AES in 2009, actively secure two-party AES would take between 40-80 minutes per AES evaluation, okay? So I'm going to do two AES evaluations for you, actively secure, and you can therefore sit here for another 80-160 minutes, okay? Ready? Good. Okay, so I want to log in to this website, and I have my one-time password token, which is magically this thing over here. So I'm going to type in my username, which is Nigel -- let's get it right. And then, just because every time I type in Nigel I always type in Smart afterward, my one-time information is going to be smart, which is not very one-time. And then my magic token here has told me my password is 9d5ee72c. For those who use the CAP system to log onto banks in Europe, you'll understand why I've got eight characters and it looks like this. And I click submit. Two AES calculations, plus the delay of going over the Internet. Impressed? Yeah? Look, password is verified. Okay, let's go back. You may say I've cheated, so let's just push in a random password; I'll put in P there. Password not verified. There we go. Very cool, huh? So that's the speed of MPC these days: you can do real stuff, for real, in real time, across the inter-web. So there we go. Thank you. [applause] >>: More questions? >> Nigel Smart: Oh, my gosh. Do you want to say who goes first? >>: No, you go ahead. >> Nigel Smart: All right. Yeah? >>: So the two servers doing MPC in your demo, are they in the same data center? >> Nigel Smart: Yeah, [inaudible] University. They've got dedicated extra network cards, and we don't even go through a hub; they're directly connected.
>>: So in terms of the wish list and everything you would want, can you threshold this? >> Nigel Smart: Yeah, you can threshold it if you want. You have to tweak the fully homomorphic stuff, which is not too much hassle, but yeah, theoretically you can threshold the whole shebang. You change the threshold of the SPDZ stuff and it should all just go through. If you want to go down to [inaudible] then you can get rid of the MACs, yeah. So, woo-hoo, you save an extra addition every time you want to add. Yeah. Okay, who's -- yeah? >>: The trade-off between latency and throughput is very hard to read. I need binoculars. Are the numbers published anywhere? >> Nigel Smart: Yeah, yeah. Well, assuming the shepherd allows it, the paper will appear at CCS this year. Assuming the shepherd lets it through -- I have no idea, but I suspect they will. But basically the real message there is: don't worry about what the numbers actually are. If anyone ever only tells you the throughput of an MPC calculation, which is the normal thing people tell you, just go, yeah, and what's the latency? Because they always go, oh, we can do 2,000 AESes a second -- yeah, and it takes you ten seconds to get an answer out. That's much better. >>: Isn't the latency dependent on the network? Is that why they don't say it? >> Nigel Smart: Yeah, the latency does depend on the network. If we slowed the network down we'd get worse latency. If you put these things in different places, like one in California and one in New York, you get much worse. The ping times for our set up are, oh god, 0.1 milliseconds, and the ping time from Bristol to London is 4 milliseconds. So if you did it Bristol to London you'd get a 40-fold slowdown. Yeah, that's really scientific, isn't it? But you get the idea. Yeah? >>: [inaudible] >> Nigel Smart: Okay, so the thing is, with [inaudible] you really are focused on two parties only, and you're really focused on binary circuits. So for AES calculations [inaudible] might win, okay? For floating point calculations we are going to just beat them into the ground, because we're not evaluating binary circuits; we have a far richer ability to express how to evaluate functions. So for anything involving statistics or integer calculations we're going to beat Yao hands down. So again, you have to think about what you're comparing, but in terms of general MPC we're pretty close to the Yao stuff; we're not as good as them, but we can do far more stuff. In terms of the network, oh yeah, you can do Yao in all sorts of different ways. There could be like three papers on Yao at this year's Crypto with different optimizations, and no one knows how they actually play out. So I think that's an open question. >>: So you must be expecting this question, but what can you say about offline performance? For example, for these two AES calculations, how much precomputation -- >> Nigel Smart: [inaudible]. Oh god, I should know this. A few thousand multiplications. So AES is particularly bad with SPDZ: with SPDZ we're talking about 20 seconds of precomputation per AES, which is a bit of a problem. With Tiny-OT we can get exactly the same online performance figures, and precomputation times of well under a second for AES. So again, it's a trade-off. I just talked about speed here; if you want to talk about different metrics you can do different things.
But for online performance SPDZ and Tiny-OT are roughly the same. >>: The slow down for comparisons is [inaudible]? >> Nigel Smart: Yeah, this is the key, right? So what this does is open up a huge amount of optimization. What we're doing at the moment is just spending our time thinking about different ways of doing things. So we can think about different ways of doing comparison. And, just as a side point, this opens up a whole new question in numerical analysis. For example, we want to do matrix factorizations, which require us to do square roots. So how are you going to do square roots? Well, you're probably going to use [inaudible] method. And if you're going to do matrix calculations you're probably going to use principal component analysis, which is also an iterative method. Now, there are two ways to do an iterative method: iterate a thousand times and hope, or iterate until the epsilon is really small. If you iterate until the epsilon is really small, then on every iteration you have to test whether the epsilon is really, really small -- whether you've converged or not, yeah? So this will reveal how many iterations you have to do to perform the calculation. And what does that reveal about the inputs to the algorithm? So we've got some numerical analysts in Bristol interested in this, and they've been doing some work for us. It turns out that if you do a Newton style method it reveals very little information, so you can actually work out dynamically how many times you're going to do the loop; whereas for a more linearly converging method, like the [inaudible] method for matrices, it does reveal something: an approximation to the first eigenvalue of the matrix is revealed by the number of times you execute the algorithm, which is kind of really funky. Yeah. >>: [inaudible] for fixed and floating, then you would [inaudible] square or cube root by halving or dividing the exponent by three, and then you need a very small and fixed number of iterations. [inaudible] >> Nigel Smart: Yeah. I couldn't remember that one, but I knew there was one. There are a number of things we can do, which are all cool. The division by three is quite nice because that's for free, because it's just a constant, so it doesn't involve anything. >>: Any other questions? So let's thank Nigel again. >> Nigel Smart: Thank you. [applause]