>> Kristin Lauter: Okay. Welcome. So thank you all for coming. So this afternoon
we are very pleased to have Nigel Smart visiting us from the University of Bristol.
Nigel is a recipient of the Royal Society Wolfson Merit Award. He is a long-time
colleague who has worked in many fields of cryptography, including elliptic
curve cryptography, and did early work on the standardization of elliptic curve
cryptography.
He has also been a member of the research staff at Hewlett-Packard Laboratories, and he
is well known for giving very entertaining talks. So welcome, Nigel. And today he is
going to talk to us about multi-party computation from theory to practice. Thank
you.
>> Nigel Smart: Okay. So I'm going to disappoint you after that intro; maybe not
give such an entertaining talk. Okay? So, right, a non-entertaining talk about
multi-party computation, from theory to practice. I'll set the scene at the
beginning, which you can ignore as just the usual guff you put at the beginning of a
talk to say it's really relevant; you've seen crypto talks before, yeah?
Then what we'll do is look at the protocol. There will be zero complicated
maths in the talk, so if anyone wants a really, really heavy math section, fall asleep
now. The math here you could teach a high school student. There is a lot of
hidden math in the talk, okay? There is a lot of hidden stuff, but we're not going to
actually talk about that.
And at the end I'm going to show you some graphs and some other things, and if
we've got time I'm going to show you a demo. Okay. So here is an example. If you talk to
drug companies, they've got databases of molecules and toxicology results. Drug
company B wants to know whether drug company A has already tested a given drug for
toxicology, that sort of thing.
So one drug company might be doing stuff to do with cancer, another might be doing
something to do with Alzheimer's. The cancer company has already tested a
gazillion drugs and found out a few kill people, and then the other one, drug
company B, says: oh, I've got this new chemical, I want to know whether this
would be good for Alzheimer's; I wonder if someone's done a toxicology test on it
before, because that would save them lots of money and time.
And so if they could do this without revealing what drug they're looking at, and
without the first company revealing its information, that would be a possibly good
thing to do. It turns out this is actually really easy; it just sounds complicated.
If you want to compare chemicals, there's a standard technique: encode each one as a
bit string 880 bits long, and you just compare some statistical distance between the
two bit strings, which you could do with any MPC technology.
It sounds like a really complicated thing, but it's actually quite trivial to
implement. So any old technique would do this.
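To make that concrete, here is a minimal cleartext sketch of the kind of comparison being described; the 880-bit fingerprint encoding and the Tanimoto-style similarity score are standard cheminformatics choices assumed here for illustration, and in the real setting this arithmetic would run inside the MPC protocol rather than in the clear.

```python
# Toy sketch of the fingerprint comparison behind the drug example.
# Assumes 880-bit molecular fingerprints and a Tanimoto-style score;
# in the real setting this arithmetic runs inside an MPC protocol.
import random

FINGERPRINT_BITS = 880

def random_fingerprint():
    return [random.randint(0, 1) for _ in range(FINGERPRINT_BITS)]

def tanimoto(fp_a, fp_b):
    """Similarity = |A AND B| / |A OR B| over the two bit strings."""
    both = sum(a & b for a, b in zip(fp_a, fp_b))
    either = sum(a | b for a, b in zip(fp_a, fp_b))
    return both / either if either else 1.0

company_a = random_fingerprint()  # company A's already-tested molecule
company_b = random_fingerprint()  # company B's new chemical
print("similarity:", tanimoto(company_a, company_b))
```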
Another example. Actually, these slides are really old; this slide was from when I
first gave this talk, about six months ago, to another company down south, and it's
great how much I've predicted the future here. [laughter]
So imagine there's some agency that wants to look at traffic data, and maybe some
other organization doesn't want to give this information away. Is there a way to
solve this problem? Maybe there is, I don't know, but there we go. So you kind of
get the idea. I've even made it relevant for the current month. So there we go.
Okay.
So these are both examples of what is called computing on encrypted data. We
have two parties, or three parties, or four parties, who have some data, and they
want to compute a function on that data. And essentially there are two ways of
computing on encrypted data that we know about. One is fully homomorphic
encryption, which is this new super-duper thing that everyone is really
interested in, and here the basic paradigm is that party A sends encrypted data to
party B, party B does some computation on it and returns the result to party A, who
then decrypts and gets the result.
So in some sense it acts like an outsourcing of computation. Party B is computing
on the encrypted data in some way. And then there is another way of computing on
encrypted data, it’s been around since the dark ages, which is the 1980s in
cryptography land, and this is called multi-party computation. And here different
parties put their inputs into a protocol and then the protocol itself computes the
function they want to compute.
Okay? And then they get the output out. So one is kind of like a fire and forget, and
the other is kind of like an interactive protocol. Okay? So that’s two ways we can do
this. In theory both technologies are brilliant. In theory we can compute
everything. In theory, you know, this is some nirvana land where everything works
and we’re all happy.
Okay, so the problem is that both technologies have an issue. Fully homomorphic
encryption has a huge computational cost: computing anything apart from very
simple functionalities is really, really expensive. Okay? But the benefit is that
there is zero communication cost, which is good. In MPC you have the opposite:
there is essentially no computational cost for computing on the encrypted data, but
you pay in communication, because the parties have to engage in a protocol and they
have to transmit data between each other.
In theory we could also make both technologies tolerant of cheating: even if
players want to deviate from the protocols, we can ensure that they actually follow
them. But we'll touch on that a bit later.
Okay. So that was theory. What happens in practice? In practice FHE runs like a
snail. Okay? It's totally impractical for all but the simplest functions. But
you can do some useful things with it, as the group at Microsoft here has
demonstrated. So there is useful stuff you can do with it. MPC, on the other hand,
has actually been deployed for some operations in the real world: the Danish sugar
beet auctions, okay?
[laughter]
Okay. Yeah. So this got a laugh without me having to insult the Danish, which is
really good; if you Google the video on YouTube you'll find the part where I insult
my Danish colleagues. But you've already laughed, so it's fine.
[laughter]
Okay. So MPC has been deployed for the Danish sugar beet auction, and [inaudible] is
also a company in Estonia, which has the Share[inaudible] system. But these things
are basically three parties, and you can only have semi-honest adversaries: the
adversaries are forced to follow the protocol in some way, like just out of the
goodness of their own hearts they follow the protocol, and then you can only
tolerate one out of three bad guys, okay?
So this is a bad situation. MPC is slightly practical, we can do real stuff with it, but
we would like to have more than, or fewer than, three parties; the killer application is
actually two parties. We'd like to tolerate all but one of the parties being bad, and
if a bad guy decides to arbitrarily deviate from the protocol we'd like to be able to
detect this, all without giving up efficiency.
And we're going to do this. And the way we're going to avoid giving up efficiency is,
wait for it, we're going to use fully homomorphic encryption to put the MPC on
steroids so it runs faster. Okay? So fully homomorphic encryption, which is kind of a
slow thing, we're going to use to make MPC run faster, okay? So this is what the
whole point of the talk is going to be.
Okay. So now that's the setup. Now we can do some math, very simple math, so it's
all going to be easy to understand. Okay. So we're going to have N parties; N could
be 2, 3, 4, 5, 6, 8, 9, 10, or whatever. The protocol is going to be linear in the
number of parties in terms of its complexity, so N could really be anything you want,
but we've run it with 10 parties and we haven't run it with any more than 10 parties
just because we can't be arsed to open up more than 10 terminal windows to do it,
yeah? So 10 is enough on my screen, so that's fine.
Okay. All but one of them can be bad. So as long as one guy is okay, we're fine. Now,
the setup is going to be some global secret key that no one knows. The global
secret key is alpha, and it's secret shared as values alpha_1 up to alpha_N. So we
have N parties; say it's Kristin and me. I have alpha_1, Kristin has alpha_2. Okay?
And we come up with alpha like this: I just generate a random alpha_1, you generate
a random alpha_2, and there is an alpha, their sum, that neither of us knows. So
that's fine.
Okay. So that's just the setup. Okay, now we're going to need a secret sharing
scheme. So all data in the protocol is going to be secret shared between me and
Kristin, and we're going to represent the data as elements of the finite field of p
elements. A secret value x is going to be shared so that everybody holds two pieces
of information: one value x_i, which is their share of x, and another value, which
we'll call the MAC share, gamma_i(x). These are defined such that x is the sum of
x_1 up to x_n (so x_1 + x_2 in our case), and the gamma_i(x) sum to gamma(x) =
alpha times x. Okay?
So as no one knows alpha, and we never reconstitute this MAC value, even if we
know x we can't work out alpha or the MAC value, yeah? So it's kind of nice and
simple.
Just for future reference: if you have a public constant v, we can always form a
trivial sharing of the public constant v; that's easy. So if you've got a public value
we can always produce a sharing of it. How the parties actually get their input
values into the protocol I won't cover in this talk, but that can all be done, yeah?
See last year's Crypto paper for how that can happen. But this is kind of nice and
simple, yeah?
So, instead of holding one value: if I were just computing on the value in the clear
I'd just hold x myself, yeah? But I'm computing on an encrypted value, which is
secret shared, so instead of holding just one value I hold two values: my share of x
and my share of the MAC. Okay, it's nice and simple.
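A minimal sketch of this representation, assuming a toy prime and two parties; the names here (share, share_public) are illustrative, not the actual SPDZ codebase.

```python
# Toy sketch of SPDZ-style shares: party i holds (x_i, gamma_i) with
# sum(x_i) = x mod p and sum(gamma_i) = alpha * x mod p.
import random

P = 101  # toy prime; real deployments use much larger primes

def share(x, alpha, n=2):
    """Split x into n additive shares plus MAC shares of alpha*x."""
    xs = [random.randrange(P) for _ in range(n - 1)]
    xs.append((x - sum(xs)) % P)
    gs = [random.randrange(P) for _ in range(n - 1)]
    gs.append((alpha * x - sum(gs)) % P)
    return list(zip(xs, gs))

def share_public(v, alphas):
    """Trivial sharing of a public constant v: party 0's data share is v,
    the rest are 0; each MAC share alpha_i * v is computable locally."""
    return [((v if i == 0 else 0), (a * v) % P) for i, a in enumerate(alphas)]

# Global MAC key alpha = alpha_1 + alpha_2, known to no single party.
alphas = [random.randrange(P) for _ in range(2)]
alpha = sum(alphas) % P

shares = share(42, alpha)
assert sum(x for x, _ in shares) % P == 42
assert sum(g for _, g in shares) % P == (alpha * 42) % P
```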
Okay. We're going to work in what's called the preprocessing model. Most modern
MPC protocols are in the preprocessing model, and the idea is that we have two
phases of computation. In the first phase of computation we just compute random
stuff, and the random stuff is independent of the function we're going to evaluate
and independent of the inputs that the parties are going to put into the function.
So imagine you're a bank: this is the stuff you process overnight, and then when
everything comes online your customer asks you to do X with some input, you have
some other input, and then you actually start processing, yeah? So this is kind of
that model of how a business would work. Okay. So, in its basic form, and because
this is an expository talk I want you to learn something from, I will give you the
basic idea. Okay?
In the basic idea we will evaluate an arithmetic circuit, okay? In reality you don't
think of arithmetic circuits as the right way of describing functions, yeah? That is
not a good idea. But just so you can see that this is universal computation, we are
going to evaluate arithmetic circuits for the purposes of this talk. And because we're
only interested in arithmetic circuits, the offline phase will simply compute triples,
what are called Beaver triples: a sharing of a random value a, a sharing of a random
value b, and a sharing of c such that c equals a times b. Now, for the first part of
this talk just imagine that this happens magically, okay?
So magically we generate these triples; we'll come back to this piece of magic later
on. Okay. So how are we going to compute stuff? Well, from the point of view of this
talk we're going to evaluate arithmetic circuits, and if we evaluate arithmetic
circuits then plus and multiply over F_p are universal gates for those circuits. So we
can do everything. As I said, we can assume the inputs are already shared, so all we
have to do, all we have to do, is work out how to add and multiply, and then we're
done. We can go home. Okay? It's nice and simple.
We just have to work out how to add and multiply. It turns out addition is easy,
multiplication is hard. Okay? So that's the kind of idea. A bit like fully
homomorphic encryption: addition is easy, multiplication is hard. Okay. So let's look
at addition. Addition really is trivial. This is the second most complicated
mathematical slide in the entire talk. Right? See, the mathematics here is really very
simple.
We have two shared values x and y, and we want to compute a sharing of z = x + y,
so you just sum up your shares locally. And magically, by the power of addition, the
equations are satisfied, yes? It's really quite... yeah, this is complex math here,
yeah? So really, a high school student would have problems, yeah? Complex maths.
But look at what happens here. If I were computing on the data in the clear I would
have to do one addition. Here I just have to do two. So my blow-up in computational
time is just two. This is very, very trivial, yes? It's very easy to do and very, very
fast. It's local computation; we don't have to do any communication. Okay.
Now, this addition trick works because we have a linear secret sharing scheme. A
linear secret sharing scheme means that we can locally compute, without interaction,
any linear function of our shares. So to put that in more concrete terms, for those
who are mathematically challenged: if we have public constants v_1, v_2, v_3, and
sharings of x and y, we can compute a sharing of the function v_1 times x plus v_2
times y plus v_3 without any interaction. Okay, so it's nice and simple. And we're
going to use this fact in our method for multiplication. Okay?
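Here is a minimal sketch of that local linearity, again over a toy prime; the linear function v1*x + v2*y + v3 is a reconstruction of the slide's example, and the toy share values below are made up.

```python
# Toy sketch: linear functions of shared values are purely local.
# Each party holds (x_i, gamma_i) pairs; v1, v2, v3 are public constants.
P = 101

def lin_combine(share_x, share_y, v1, v2, v3, party_index):
    """Party's share of z = v1*x + v2*y + v3, computed with no interaction.
    The public constant v3 is folded in by party 0 only (trivial sharing)."""
    (xi, gxi), (yi, gyi) = share_x, share_y
    zi = (v1 * xi + v2 * yi + (v3 if party_index == 0 else 0)) % P
    # MAC share: v1*gamma_i(x) + v2*gamma_i(y), plus v3*alpha_i in the
    # full scheme (alpha_i omitted here to keep the sketch short).
    gzi = (v1 * gxi + v2 * gyi) % P
    return zi, gzi

# Two parties holding additive shares of x = 10 and y = 20 (the MAC
# shares here are arbitrary placeholders).
x_shares = [(3, 7), (7, 4)]
y_shares = [(15, 2), (5, 9)]
z_shares = [lin_combine(x_shares[i], y_shares[i], 2, 3, 5, i) for i in range(2)]
assert sum(z for z, _ in z_shares) % P == (2 * 10 + 3 * 20 + 5) % P
```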
So here's a little note on notation. If I have a shared value x and I just reveal the
x_i values, we're going to call this a partial opening, because we haven't opened the
whole thing: we haven't revealed the MAC shares, we've only revealed the data
shares. This is called a partial opening, okay? We never want to reveal a MAC share.
That's a bad idea.
So we're never going to do that. Okay. So, happy so far? Good. This is the most
complicated mathematical slide in the entire talk, and if you look at it for 10
seconds you'll see it's not very complicated at all. So I want to multiply: given a
sharing of x and a sharing of y, I want to get a sharing of z. So how do I do this?
Well, I take one of these magically precomputed triples off the list, we'll call it a, b,
and c, and then I partially open x minus a to obtain a value epsilon, which is x
minus a, okay?
Now, this reveals no information about x, because you can treat a as a key, and this
is a one-time pad encryption, under the key a, of the value x. So this is
information-theoretically secure, yeah?
>>: X minus a, or x_i minus a_i?
>> Nigel Smart: What's that? Sorry.
>>: So is this one part [inaudible]?
>> Nigel Smart: Yeah, yeah. So what happens is all parties compute shares of the
value x minus a, and then they partially open the output. So from that we can all
compute x minus a; everybody knows x minus a publicly. So I send my x_i minus a_i
to you, you send your x_i minus a_i back to me, okay?
Then we have y minus b, and again rho then becomes a one-time pad encryption of
y under the key b. And now we just compute this linear function, z = c + epsilon
times b + rho times a + epsilon times rho. Ta-da! Okay. And why this works is
because of this long piece of math here. Look, you compute that linear function,
you stick in the things that you first thought of up here, you simplify, and you get
x times y. So it's magic. Okay?
Beautifully magic protocol. So what happens here is: to multiply, I have to consume
something off my list, I have to engage in a little bit of communication, and then I
do a local computation. And that's it. In some sense, the reason why I say computing
arithmetic circuits is the wrong notion is that the basic operations are: take
something off the list, do some communication, compute a linear function. It just
happens that we build a multiplication gate out of those three basic operations. If
you want to evaluate more complex functions, like I'll have at the end of the talk, it
turns out that these three basic operations are what you use, not arithmetic
circuits. Okay?
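A minimal sketch of this multiplication step over a toy prime, with the MAC shares omitted so the data flow stays visible; the helper names are illustrative.

```python
# Toy sketch of Beaver multiplication on additive shares mod p.
# One preprocessed triple (a, b, c) with c = a*b is consumed per multiply.
import random

P = 101

def additive_share(x, n=2):
    xs = [random.randrange(P) for _ in range(n - 1)]
    xs.append((x - sum(xs)) % P)
    return xs

def beaver_multiply(x_sh, y_sh, a_sh, b_sh, c_sh):
    n = len(x_sh)
    # Partial openings: everyone learns epsilon = x - a and rho = y - b,
    # which are one-time-pad encryptions of x and y.
    eps = sum(x_sh[i] - a_sh[i] for i in range(n)) % P
    rho = sum(y_sh[i] - b_sh[i] for i in range(n)) % P
    # Local linear function: z = c + eps*b + rho*a + eps*rho.
    z_sh = [(c_sh[i] + eps * b_sh[i] + rho * a_sh[i]) % P for i in range(n)]
    z_sh[0] = (z_sh[0] + eps * rho) % P  # public constant added by one party
    return z_sh

a, b = random.randrange(P), random.randrange(P)
triple = (additive_share(a), additive_share(b), additive_share(a * b % P))
x_sh, y_sh = additive_share(6), additive_share(7)
assert sum(beaver_multiply(x_sh, y_sh, *triple)) % P == 42
```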
So yeah. But we want to keep it simple and just do multiplication and addition at
the moment. So we can add, we can multiply, we can do everything. We can go
home. We've done everything, yeah? That's nice and simple. Yeah?
Okay, so the problem is that we could have cheaters. How do we know that all the
parties have followed the protocol correctly? And that's where the MACs come in.
Just go back and see what could go wrong. Well, we're computing on local data, so
I'm just computing along, blah, blah, blah, blah, blah, blah, blah; the only time I
communicate with someone else, and could possibly send a wrong piece of
information, is in these partial openings. Yeah?
So all the time I'm computing on shared secret values: every time I add shared secret
values or form a linear combination, my shares of the MACs get computed correctly.
So all I have to do is check that the partially opened values are correct, yeah? Okay.
So how are we going to do this?
So this is a trick that's going to be in ESORICS this year. So what we do is actually
very trivial; now it's been accepted I can say it's trivial, yeah? Okay, so let's just
say that the set of all partially opened values is the values a_j, for j equals one up
to t, where t is the number of things we've opened, and all of us agree on this set.
We can actually batch these: we can take t as 10,000, and every 10,000 openings we
do this check. Okay?
And for each one of these partially opened values, each individual has a sharing of
the MAC on that value. So we want to know whether these MACs correspond to
these partially opened values, yeah? Without revealing the MAC shares; and each
player has a share of the MAC key. So given these opened values, and given our
sharings of the MACs, without revealing them, and without revealing the MAC key,
we want to check that they're all okay. Okay.
So this is how we're going to do it. We generate, by [inaudible] some random
string, a set of random values r_j, one for every value we want to check. Then we
all compute the value a, which we can do locally, because we all know the values
that have been partially opened: we just multiply them by our r_j's and take the
sum. Okay? So this a is now public, but is random, depending on these r_j's.
Now, we can also locally compute our share of what we think is the MAC of a, by
applying the same linear function to our MAC shares, okay? Then we compute
sigma_i, which is that MAC share minus alpha_i, our share of the MAC key, times a.
Call that sigma_i. If everything is correct, the sigma_i will sum up to zero, okay? So
all we have to do is broadcast sigma_i; we make sure we commit to it first, then
broadcast it, yeah? So we don't have a problem with who goes first or who goes last.
So we just commit and broadcast the sigma_i, and add them up. If we get zero we're
happy; if we don't get zero, we know someone at some point gave us a wrong piece
of information, and we abort the protocol and say someone is a cheater, but we
cannot determine who the cheater is, okay?
So because we haven't got an honest majority we kind of can't... no, that's not true.
That's not true. Yeah, there are some dishonest-majority protocols for which you
can identify cheaters, but we can't in this one. Okay. So we can verify correctness.
So we're done. Yes?
>>: So we do this once every t multiplications?
>> Nigel Smart: Yes. Or every t times you've opened something, yeah? However
many you want to do, you can batch up; you can look at the performance
characteristics of your network. But we've been taking about 100,000: every
100,000 openings you do this check, plop, plop, plop, and then at the end of the
computation. You've got to make sure you always do it at the end.
Yes. Actually, at the end you have to do it, then you do the opening, and then you
do it again, yeah? So at the end you have to do it twice. You do it because I don't
want to reveal the output of the function unless I know everything has been going
correctly, yeah? So I do it once, then I do the opening, then I do it again to check
that the opening of the actual final answer is correct. Yeah.
Okay.
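A minimal sketch of the batched check over a toy prime, with honestly generated MAC shares and the commit-then-broadcast step elided; the names are illustrative.

```python
# Toy sketch of the batched MAC check: random coefficients r_j compress
# all partially opened values into one public value a, and the parties
# check their MAC shares against a without ever revealing alpha.
import random

P = 101
N_PARTIES = 2

alphas = [random.randrange(P) for _ in range(N_PARTIES)]
alpha = sum(alphas) % P

# Suppose some values were partially opened; each party kept a share of
# the MAC on each opened value (generated honestly for this demo).
opened = [random.randrange(P) for _ in range(5)]
mac_shares = []
for a_j in opened:
    gs = [random.randrange(P) for _ in range(N_PARTIES - 1)]
    gs.append((alpha * a_j - sum(gs)) % P)
    mac_shares.append(gs)

# Public random coefficients (in practice derived from a jointly
# generated seed).
r = [random.randrange(P) for _ in opened]
a = sum(rj * aj for rj, aj in zip(r, opened)) % P  # public value a

# Each party locally computes sigma_i = gamma_i(a) - alpha_i * a,
# commits to it, then broadcasts; the sigmas must sum to zero.
sigmas = []
for i in range(N_PARTIES):
    gamma_i_a = sum(r[j] * mac_shares[j][i] for j in range(len(opened))) % P
    sigmas.append((gamma_i_a - alphas[i] * a) % P)

assert sum(sigmas) % P == 0  # nonzero means someone cheated: abort
```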
>>: In practice does it make much sense to check intermediate values? Because you
know you'll catch it at the end, so people aren't going to cheat.
>> Nigel Smart: But you have to check all of them. So you can't just check the final
value: all of the partially opened values you have to check. At some point you have
to check all of them, so you might as well do them as you go through, in batches.
You don't store them all up to the end, because you could have a very big
computation.
Yeah. Okay. So the thing I haven't told you is how to do the preprocessing. So that's
next. And I haven't told you anything about fully homomorphic encryption yet, so
you should be asking: why is he talking about fully homomorphic encryption? Well,
we are going to do the preprocessing with fully homomorphic encryption.
The other thing you should notice is that the online phase involves only
information-theoretic primitives, plus creating shared random values and checking
that broadcast works and blah, blah, blah, which involves hash functions. Everything
else is very simplistic: no complicated math.
Okay. Here's going to be some complicated math, but we're going to hide it. Okay.
So what I'm going to do is again explain a very naive version; in the real world you
actually add lots of bells and whistles to what I'm going to describe to make it run a
lot, lot faster. Okay? But let's just assume we're doing a simple version of the
preprocessing.
So assume I have a fully homomorphic encryption scheme, which is a public key
scheme, so we've got a public key and a secret key, whose plaintext space is the
finite field with p elements. In practice it's not going to be the finite field of p
elements: the plaintext space is going to be some mega ring modulo p, so I can pack
things in a SIMD manner and do a gazillion at once, yeah?
But we'll just keep things simple. And it's homomorphic because if I take a message
m_1 and encrypt it to get ciphertext one, and take a message m_2 and encrypt it to
get ciphertext two, then if I add the two ciphertexts, whatever that means, and
decrypt, I get the sum of the messages mod p.
And there's another procedure, which is multiplying ciphertexts: if I multiply the
ciphertexts and decrypt, I get the product of the messages. Okay.
Now, we are only going to need to evaluate circuits of multiplicative depth one; the
circuits we are going to evaluate with fully homomorphic encryption just have
multiplicative depth one. That means it's not really fully homomorphic encryption,
it's somewhat homomorphic encryption. But it's kind of more cool to call it fully
homomorphic encryption, so I'll continue to call it fully homomorphic encryption,
just to add to my cool stakes.
Because it's only multiplicative depth one, we can run things very fast, because we
don't need Gentry's bootstrapping technique, which makes things go very slowly.
Okay. We're going to need something slightly more complicated from the math of
the fully homomorphic encryption scheme: we're going to need to be able to do
distributed decryption, okay? That actually makes things slightly less efficient, okay,
because distributed decryption in lattice-based schemes is not as nice as it is in, like,
normal discrete-log-based schemes, for example.
So we assume that no one actually knows the secret key for the FHE scheme, but
each party holds a share sk_i of the secret key, and then together they can decrypt
any old ciphertext. They can take any old ciphertext and decrypt it. So we assume
there is a procedure for doing this. I'm not going to explain what that is; you can
easily define it for the BGV FHE scheme, and for most FHE schemes you can define
it. So let's just assume this works magically, as a magic box.
So we assume that. Now, remember those alpha_i's we had earlier? Those are the
MAC key shares; we just generated a random alpha_i each. We're going to encrypt
each alpha_i with respect to the public key and then broadcast it. Because
everybody's got the encryptions of the alpha_i's, everybody can compute the
encryption of alpha, even though we don't know alpha, because the scheme is
additively homomorphic and alpha is the sum of the alpha_i's. So we assume that
everyone's also got the encryption of alpha. Yeah?
Okay. Now, we're going to need a clever protocol here, which is quite funky. So I'll
go through it. What we're going to do is: we're given a ciphertext, ct, which
encrypts the data value m. And given that encryption, we want to be able to create
an additive sharing of m, because that's what we really need in our protocol:
additive sharings, yeah?
So we obtain an encryption of something in some way, which we'll explain in a
minute, and then we want to obtain an additive sharing of the value that was
encrypted. And if need be, we also want to come up with a fresh ciphertext, in other
words a ciphertext which doesn't have very much noise in it, which also encrypts m.
Yeah? Because the input ciphertext could be quite noisy, so we want to clean it at
the same time. Okay, so we do this with this reshare protocol. It's very trivial, so
we might as well go through it. Each party generates a random value f_i and
transmits a ciphertext which encrypts f_i. Given those, everybody can now compute
a ciphertext which encrypts m plus f, because they can take the input ciphertext
and add in the sum of the ciphertexts of the f_i's, where f is the sum of the f_i's.
Okay? So this ciphertext is an FHE encryption of a one-time pad encryption of m.
Okay? So it's perfectly legit to decrypt it, because no one is going to work out what
m is if we decrypt this. So we decrypt it with our shared decryption thing to give us
m plus f, and now we create the sharing: party one sets m_1 to be m plus f minus
the thing they first thought of, f_1. Every other party sets m_i equal to minus the
thing they first thought of, and then we see that the m_i's sum up to m, which is
what we wanted. And if we want to find a new fresh encryption, we just encrypt m
plus f with respect to some fresh noise and subtract the ciphertexts of the f_i's here.
And this is now the new encryption of f, sorry, the new encryption of m, but at
multiplicative depth zero, yeah? Because there have been no multiplications
required to compute this new ciphertext.
And so we can get a fresh ciphertext out, and we can do the distributed sharing
with this. We use some default randomness here, like zero. Yeah. Happy?
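A minimal sketch of the reshare data flow. The "ciphertexts" here are bare values mod p and the distributed decryption is immediate, so this mock shows only who computes what, with no security whatsoever; all names are illustrative.

```python
# Toy sketch of the reshare protocol's structure, with FHE mocked out.
import random

P = 101
N_PARTIES = 3

def mock_enc(m):        # stand-in for FHE encryption
    return m % P

def mock_dist_dec(ct):  # stand-in for shared-key distributed decryption
    return ct % P

def reshare(ct_m):
    """Given a ciphertext of m, output additive shares of m plus a fresh
    (multiplication-free) ciphertext that also encrypts m."""
    fs = [random.randrange(P) for _ in range(N_PARTIES)]  # each party's f_i
    ct_fs = [mock_enc(f) for f in fs]                     # broadcast Enc(f_i)
    ct_m_plus_f = (ct_m + sum(ct_fs)) % P                 # homomorphic add
    m_plus_f = mock_dist_dec(ct_m_plus_f)  # safe: m is one-time padded by f
    shares = [(-f) % P for f in fs]        # party i > 0 keeps -f_i
    shares[0] = (m_plus_f - fs[0]) % P     # party 1 corrects with m + f
    # Fresh ciphertext: re-encrypt m + f with default randomness, then
    # subtract the Enc(f_i)'s.
    fresh_ct = (mock_enc(m_plus_f) - sum(ct_fs)) % P
    return shares, fresh_ct

shares, fresh = reshare(mock_enc(42))
assert sum(shares) % P == 42 and mock_dist_dec(fresh) == 42
```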
Okay. So this is how we're going to generate the preprocessing. First we have to
generate a and b, and then we have to come on to c. Okay. So how are we going to
generate a? Well, we generate a as follows: everybody generates their own a_i
randomly, which guarantees that the sum of the a_i's, which is a, is random, because
at least one of us is honest, so at least one of us is going to come up with a random
a_i.
Then we transmit the encryptions of these a_i's and we add them all up, okay? That
gives us an encryption of a. Now, remember, what we want is some sharing of the
MAC of a. The way we do that is: we've got this encryption of alpha, we've got an
encryption of a, so we multiply these two together, and that gives us an encryption
of alpha times a, which we then reshare to get the MAC shares.
And we reshare with the protocol on the previous slide, yeah? Very simple. And we
do the same thing to compute b. So how are we going to compute c?
>>: [inaudible]
>> Nigel Smart: What’s that? Sorry.
>>: [inaudible]
>> Nigel Smart: No.
>>: [inaudible]
>> Nigel Smart: Multiply the ciphertext for a and the ciphertext for b. The
ciphertext for a had no multiplications to produce it, and the ciphertext for b had
no multiplications to produce it, so we've got ciphertext a and ciphertext b; with a
circuit of multiplicative depth one we compute a ciphertext which encrypts a times
b, and then to get the sharings we just reshare.
We apply the reshare protocol, obtaining a fresh ciphertext which hasn't had any
multiplications in it and which encrypts the same value, because we want to
multiply again, and we don't want to use a depth-two FHE scheme, because that
would just be too much hassle, yeah?
So this trick is just to avoid having to use a depth-two circuit. So now the ciphertext
of alpha times this new encryption of c gives us the encryption of alpha times c,
and then we reshare it and out pop the sharings of, you know, the sharings of the
MAC on c.
Okay. So this is efficient, very efficient, because we only compute with depth-one
circuits.
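A minimal sketch of the offline triple generation on the same kind of mock "FHE" (bare values mod p), just to show the one depth-one multiplication and where the reshares happen; it demonstrates data flow only, and the names are illustrative.

```python
# Toy sketch of triple generation: Enc(a) and Enc(b) are sums of the
# parties' contributions, one ciphertext multiplication gives Enc(a*b),
# and resharing produces the shares of c and of its MAC alpha*c.
import random

P = 101
N_PARTIES = 2

def mock_enc(m):
    return m % P

def mock_reshare(ct):
    """Trivial mock of the reshare protocol: returns additive shares of
    the encrypted value plus a 'fresh' ciphertext of the same value."""
    m = ct % P
    shares = [random.randrange(P) for _ in range(N_PARTIES - 1)]
    shares.append((m - sum(shares)) % P)
    return shares, mock_enc(m)

alphas = [random.randrange(P) for _ in range(N_PARTIES)]
ct_alpha = sum(mock_enc(v) for v in alphas) % P  # public Enc(alpha)

# Each party contributes random a_i and b_i; nobody knows a or b.
a_is = [random.randrange(P) for _ in range(N_PARTIES)]
b_is = [random.randrange(P) for _ in range(N_PARTIES)]
ct_a = sum(mock_enc(v) for v in a_is) % P
ct_b = sum(mock_enc(v) for v in b_is) % P

ct_c = (ct_a * ct_b) % P                       # the one depth-one multiply
c_shares, fresh_ct_c = mock_reshare(ct_c)      # shares of c + clean Enc(c)
mac_c_shares, _ = mock_reshare((ct_alpha * fresh_ct_c) % P)  # alpha * c

a, b = sum(a_is) % P, sum(b_is) % P
assert sum(c_shares) % P == (a * b) % P
assert sum(mac_c_shares) % P == (sum(alphas) * a * b) % P
```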
Yeah?
>>: I mean, if you're only doing one multiplication you could just use, like,
[inaudible], right?
>> Nigel Smart: No.
[laughter]
Good. Every talk, someone says that. That's a standard question at this point.
>>: Because some of the resharing stuff seems like maybe...
>> Nigel Smart: The problem with doing the product on the sums that way is that
the plaintext space is really small.
>>: Right.
>> Nigel Smart: And you can't do the packing. Okay, so you would be sending a
1024-bit or 2000-bit thing to get one bit of information. Here, okay, I'm sending
huge ciphertexts, but I'm packing 10,000 things in at one go, so it's kind of... yeah,
yeah. It doesn't work, alas.
Okay? So we only compute depth-one circuits; everything's very fast. We can do
other preprocessing too. Later on I'm going to explain some different functions that
we can compute; we don't just do arithmetic circuits, we do other preprocessing. We
compute shared random bits. We compute sharings of a and sharings of one over a
as preprocessing. Yes, there is a lot of preprocessing we can do, which allows us to
leverage a lot of prior work on MPC protocols and actually make them more
efficient, because we can put a lot of computation into the preprocessing phase,
since we can use the power of FHE there, whereas prior protocols had to do all this
extra computation in the online phase.
So there's quite a big benefit we get out of this. Okay. So this is what's called the
SPDZ protocol. The first presentation was at Crypto last year. There have been all
sorts of bells and whistles added since, and we can compute all sorts of different
stuff. It's very efficient and very practical for some applications. Very, very, very
practical for some applications. Okay?
It's got better security properties than other MPC implementations. You can have a
dishonest majority, you can have any N you want, from two up to whatever (10,
because that's how many windows I can put on my screen), and it's actively secure.
So it's good, yeah? This is everything you've ever wanted from MPC but were afraid
to ask. So it does that.
It's really flexible in terms of parameters. It's not quite suited to evaluating binary
circuits. If you've got a function that is best described as a binary circuit, or a
circuit over a small finite field, then SPDZ is probably not the protocol for you.
There's another protocol called Tiny-OT, which is better for that; that's the other
one that was presented at Crypto last year. And it turns out that there are trade-offs
between SPDZ and Tiny-OT.
It's not necessarily clear, even if you were evaluating, say, a hash function, which
one is fastest: even though the natural way of describing a hash function is as a
binary circuit, it's not necessarily so clear that evaluating it as a binary circuit is
best. So it's unclear which one is the fastest.
But they are pretty comparable; they have roughly the same sort of performance,
except that Tiny-OT is two-party binary circuits, and SPDZ is any number of parties
and things that you describe with operations mod p.
Okay.
Let's look at it performance-wise. So we've implemented all sorts of things here, for
various people, to investigate different applications. As I said before, the real key is
that we don't evaluate circuits; the key operation is opening data. And it turns out
that if you're actually doing computation, it's not the amount of data you open that
matters. So again, here's another thing for theoreticians: theoreticians often count
the total amount of data transferred as a complexity measure. I don't really care.
The thing that really slows these protocols down is the total number of connections
you make, how many packets you send, not the total amount of data in the packets,
yeah?
So what we do is we have compilers which minimize the number of times we
actually call open, and we actually open, like, maybe a thousand things at once. So
we kind of reorder the program so we do the same amount of communication, but
in a smaller number of rounds.
We can offload stuff into the offline phase. We expect more impressive results to
come out in the next few months. Obviously this talk is ages old, so some of the
data can be a bit out of date, but we are continually improving it.
So here are our current run times; sorry, our run times as of about February this
year. Okay. So here we have one thread, four threads, seven threads on the
machine. We have one control thread, which is why we haven't gone to eight, okay?
So these are the total numbers of multiplications per second. What we found
interesting is to plot latency versus throughput.
The reason we do this is because most MPC performance figures give you
throughput and hide the important figure. You know, if I say I can get a throughput
of a gazillion a second, but actually getting the first one out of my pipeline takes ten
seconds, I'm not really interested in the fact that you can get a gazillion a second of
throughput, yeah? I'm interested in the fact that I have to wait ten seconds.
So it's kind of important to actually see how you can trade latency against
throughput.
So these latency-versus-throughput graphs are much more important for practice
than throughput figures alone. Interestingly, [inaudible] has done the same graphs
for Tiny-OT and it really is boom, boom all the time. So this is kind of nice. Okay, so
what do we have? For multiplication we can have very, very small latency, yeah?
And we can really push up to 600 thousand multiplications a second. That's about a
286, yeah? Yeah, so then you go, wait a minute, okay? But yeah, a 286: that's really
good performance, okay?
So at 286 performance we can do really well. And the issue here is that seven
threads is not better than four, because at this point we really are maxing out the
network card, not the network, yeah? We need more network cards in the machine.
The threads are just pumping the network card and it can't keep up.
Okay. But there we go. Now, you might be interested in comparison. Here's an
interesting thing: integer comparison in a normal programming language is faster
than integer multiplication. Not so with MPC: integer comparison is more expensive
than multiplication, okay?
So here we can do about 6,000 per second, and we can get latencies of about 10. It
depends on whether we're comparing 32-bit or 64-bit numbers. We work mod p,
where p could be 120 bits long, but we can say, well, we know the number inside it
is 32 bits, because we've got a range analysis which says the thing inside it is 32
bits, yeah?
So that's quite reasonable. Nowhere near a 286; a 286 could do whatever, millions
of these a second. Okay? So we're kind of getting worse. Sorting. You want to sort
some numbers? Because that's just comparisons. Okay, one of my postdocs can now
sort a list of a million. It takes many hours, but he can sort a list of a million using
this.
So this is kind of early stuff: like the 400 there, in 2.5 seconds we can sort a list of
400 32-bit numbers. Yeah. Yeah. There's three of you; let's go with you first.
>>: In an MPC context where you already have half the inputs...
>> Nigel Smart: So in this context we're assuming that the values have already been
shared. If they hadn't been... yeah, so if each of you had one half, they'd only have
to merge, so it's much easier, yeah?
So we're assuming the worst possible case, where people just dump shared data and
now we want to sort it. Yeah.
>>: And so what does it mean to have one thread in multiparty multiplication?
>> Nigel Smart: Okay, so one thread means that, in our system, we can have
multiple threads playing the part of each party. So we can do many, many things in
parallel. It's just sorting we haven't managed to...
>>: So one thread doesn't mean...
>> Nigel Smart: No, one thread means that one party is running one thread rather
than maxing out his microprocessor. So the processor has got eight threads on it;
sorry, the processor has eight cores on it, so he's only maxing out one core.
Yeah.
>>: That's independent of the number of participants?
>> Nigel Smart: It's independent, yeah. Okay, so these figures are for two
participants. If you've got three participants it scales: there's like a 10 percent extra
cost; with four participants we maybe have 25 percent extra cost. We don't really
see much difference, so we just give the two-party case. Yeah.
>>: Oh, it was about what does [inaudible] mean?
>> Nigel Smart: What does [inaudible] mean, okay yeah.
Okay. So here we have fixed and floating point multiplication. Now, if you want to
do something interesting, you're going to want to do more than just floating-point
multiplication. So currently I've got students over the summer who are
implementing a whole floating-point library for me, so we can do sine, cosine, sinh,
square root. Yeah?
>>: Single or double precision?
>> Nigel Smart: These are single precision. Yeah, these are single precision.
Double precision, yeah, we can go to double precision; it's just much more
complicated. These are single-precision values. And here we see we can do
multiplication: we can do about 6,000 a second. So we've got, see, we've got kind of
worse, okay?
Addition is really bad. Okay, so Dan Bernstein wasn't surprised when I showed him
this, because he said it turns out Intel say that the addition on their processor uses
up more energy than the multiplication. Okay? In MPC land, I don't know why, but
it turns out if you think about it for five seconds you realize why addition could be
more complicated, right? So I've got floating point numbers, I've got an exponent.
Yeah, suddenly Josh just went, oh yeah.
Yeah. Multiplication is just trivial, whereas for addition I have to shift things around
dependent on the exponent, and I don't know what that value is, yeah? So it's much
more tricky.
So there we go: you've got floating point addition. Fixed and floating-point
comparison we can do at about 4 thousand a second. So yeah, we've got numbers,
loads of numbers.
Yes?
>>: What's, like, the storage cost?
>> Nigel Smart: Well, in terms of offline processing?
>>: Well, I mean just in terms of what you have to keep sitting around. Because, I
mean, for example, like the thing from [inaudible], it seemed like you were keeping
around like a couple hundred thousand...
>> Nigel Smart: Okay, so for these things it's not very much at all. For the floating
point operations it turns out that you are storing maybe, like, a hundred times more
data than you would normally, right, in files, but really it's peanuts compared to
what you're actually doing. There is stuff, like AES, that can be quite expensive.
>>: So is there, like, a space-time trade-off? If I were willing to store a little more,
could I get a better output?
>> Nigel Smart: No, this is assuming the best.
>>: This is assuming I've taken...
>> Nigel Smart: Yeah, you've taken the whole data set. But in terms of practice,
what you would do is you could have, like, two coupled computers, one doing the
offline phase pushing data into the online phase; or you could have one core on the
computer doing the offline phase. It depends how you want to integrate them
together as a system.
>>: Okay.
>>: When it comes to [inaudible] share the same order except the negative ones are
[inaudible]?
>> Nigel Smart: [inaudible] because in MPC land we encrypt the exponent in
floating point, whereas in fixed point it's not encrypted, so you've actually got less
to do, much, much less to do. Much, much easier. Yeah, which is why a lot of prior
work here has always looked at fixed-point numbers. These floating-point numbers
are using algorithms from a paper, I think by some Japanese authors, earlier in the
year; I can't remember. We took their algorithms and tweaked them to our setting.
Okay. [inaudible]. So here's a small finite
field example. Okay, yeah. We’ve got some time. So imagine you want to verify
passwords.
So this is a static password solution that Ari Juels explained at the Real World
Crypto workshop. So we've got some static password we want to verify, and we're
going to do it with two servers, without storing the password. Okay.
This protocol is stupid, right? Okay. So just to set the scene.
So I have some password P. Okay, so we assume that we've got two servers, and
one server has stored one value and the other server has stored the other value,
such that the XOR of them is equal to the password we want to verify. So to check
the password without revealing it, what we do is we generate a random XOR
sharing of the password, send P1 to server one, P2 to server two. Server one just
combines his bits, server two combines his bits, and then all we have to do is a
secure comparison check.
Okay, so this is nice and trivial. Dumb. Okay, so the problem is static passwords.
Okay. The reason that you might want to evaluate the AES function in MPC, which
is something that we introduced in an Asiacrypt 2009 paper, and everyone went,
why the hell would you want to do this?, the reason you might want to do this,
which you've probably just discovered, is that you might want to verify dynamic
passwords. So if I'm doing SecurID tokens or EMV CAP; if you don't know what EMV
is, come to my talk tomorrow. If you don't know what EMV is, you're going to be
using it in a few years; if you go to Europe, that's why you can't buy stuff on your
credit cards. Okay. So the password could typically be, like, the AES encryption of
some one-time message M, like some data or whatever, yeah, under some key: you
get the password and then you type it in.
And you can replace AES with any other keyed PRF, for example DES, or
HMAC-MD5, or HMAC-SHA-1 or whatever. Yeah. Dumb.
So [inaudible] we've been looking at doing this for a dynamic password thing; I'll
show you a demo in a few minutes, at the end, okay? So we need to be able to
evaluate AES, or we need to be able to evaluate MD5 or SHA-1. So we can evaluate
AES. We're kind of getting latencies now of about 10-20 milliseconds to evaluate
AES between two servers, which is really quite [inaudible]. Or we can do a thousand
a second.
You see the difference here? You can either get the latency here, or you can push
the throughput up here. You cannot get both at the same time, which is kind of
why, when people claim "I can do a thousand a second": yeah, you can, but we can
run ten or twenty times faster than you on latency. So, no. Okay.
And we're actively secure, okay? So there we go. This is AES; if you zoom in you get
more and more detail, okay. If you want to do DES, and you might want to do DES,
especially if you're an old bank, it's still pretty reasonable. If you talk to a major
European bank, they're looking at latencies for their verification of about 300
milliseconds, and a maximum throughput across the day of about 300 verifications
per second.
So they want to be at this point here, which is not far off, yeah? So it's pretty close
to where you want to be. So yeah, increase the number of cores and you're done;
make a better network card and you're done, yeah? So it's kind of very easy.
MD5: pretty bad. SHA-1: really bad. It turns out SHA-2 is better than SHA-1, yeah?
So it turns out that [inaudible], or whoever it is, is designing cryptographic
algorithms now to be friendly towards MPC. AES is a beautiful function for MPC,
yeah. So you'd think a block cipher designer would say it should be a really
complicated Boolean circuit, blah, blah, blah, blah, with very little mathematical
structure. Yeah, right. AES really has no mathematical structure, does it? I mean,
it's kind of very, very, very nice. It's parallel, it's got beautiful mathematical
properties. AES is super-duper fast for MPC evaluation. So when we chose it in
2009, we chose it as the benchmark for MPC calculations because we thought it
would be a complex function for MPC to cope with. It actually turns out to be an
easy function for MPC to cope with. SHA-1 is the worst so far, okay?
SHA-2, it turns out, does a lot more in parallel. So SHA-2 runs faster than SHA-1.
Okay. Yeah, we haven't looked at SHA-3 yet. Okay, so I'm going to show you a
demo. How do I do this, right? I go escape? Yes. Then it was on Exploder. Okay.
So, assuming that all of the networks work, and the machines' backups at some
god-forsaken time in the morning haven't fallen over... okay, so it's a bit dodgy.
Okay?
It normally works. So what I'm going to do is I'm going to type something into a
website; it's going to go from this webpage over the ether to Bristol, to the Bristol
web server. The Bristol web server is going to capture the data and send it to two
back-end servers, who are going to engage in two AES calculations. The reason they
have to engage in two AES calculations is that we don't want to store a per-user
AES key: we're going to use a PRF to derive the per-user AES key from a master
key, and only the master key is going to be stored.
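A minimal cleartext sketch of that key-derivation chain. HMAC-SHA256 stands in for the PRF and for AES so the sketch stays standard-library-only; in the actual demo both calls are AES, both run inside the two-server MPC, and all names here are made up.

```python
# Toy sketch of deriving a per-user key from one stored master key and
# then computing the one-time password; two PRF calls, matching the two
# AES evaluations in the demo.
import hashlib
import hmac

MASTER_KEY = b"only-this-is-stored-on-the-servers"

def prf(key, message):
    return hmac.new(key, message, hashlib.sha256).digest()

def one_time_password(username, challenge):
    user_key = prf(MASTER_KEY, username.encode())  # first "AES" call
    otp = prf(user_key, challenge)                 # second "AES" call
    return otp[:4].hex()                           # eight hex characters

print(one_time_password("nigel", b"token-counter-0001"))
```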
So there are two AES calculations going to be performed. It's then going to these
two servers, who do the calculation, decide whether I should be allowed into the
system or not, okay, return the value to the web server, and then it's going to come
back over the ether to here and display on this terminal, which is then going to go
up and display on the screen there.
So I mean, this is magic! Yes? So I'm going to pull things out of my hat. Now, I
would like to point out that when we first looked at AES in 2009, actively secure
two-party AES would take between 40 and 80 minutes per AES evaluation, okay? So
I'm going to do two AES evaluations for you, actively secure, and you can therefore
sit here for another 80 to 160 minutes, okay?
Ready? Good. Okay, so I want to log in to this website, and I have my one-time
password token, which is magically this thing over here. So I'm going to type in my
username, which is nigel; let's get it right. And then my other information, just
because every time I type in nigel I always type in smart afterwards, is going to be
smart, which is not very one-time. And then my magic token here has told me my
password is 9d5ee72c. For those who use the CAP system to log on to banks in
Europe, you'll understand why I've got eight characters and it looks like this. And I
click submit.
Two AES calculations, plus the delay of going over the Internet. Impressed? Yeah?
Look: password is verified. Okay, let's go back. You may say I've cheated, so let's
just push in a random password; I'll put a P there. Password not verified.
There we go. Very cool, huh? So that's the speed of MPC these days: you can do
real stuff, for real, in real time, across the inter-web. So there we go. Thank you.
Thank you.
[applause]
>>: More questions?
>> Nigel Smart: Oh, my gosh.
Do you want to say who goes first?
>>: No, you go ahead.
>> Nigel Smart: All right. Yeah?
>>: So the two servers doing MPC in your demo, are they in the same data center?
>> Nigel Smart: Yeah, [inaudible] University.
Yeah, they're kind of connected. They've got dedicated... they've got extra network
cards and we just kind of... we don't even go through a hub; they're directly
connected.
>>: So in terms of the wish list and everything you would want, can you threshold
this? So you know how to use it and if you...
>> Nigel Smart: Yeah, you can threshold it if you want. You have to tweak the fully
homomorphic stuff, which is not too much hassle, but yeah, theoretically you can
just threshold the whole shebang. Yeah, you change the threshold of the SPDZ stuff
and it should all just go through.
If you want to go down to [inaudible] then you can get rid of the MACs, yeah. So,
woo-hoo, you save an extra addition every time you want to add. Yeah.
Okay, who's next? Yeah?
>>: The trade-off between latency and throughput is very hard to read; I need
binoculars. Are the numbers published anywhere? Because...
>> Nigel Smart: Yeah, yeah. Oh well, assuming the shepherd allows us, the paper
will appear at CCS this year. Assuming the shepherd lets it through; I have no idea,
but I suspect they will. Yeah. But basically the real message there is: don't worry
about what the numbers actually are. If anyone ever tells you they've got a certain
throughput for an MPC implementation, which is the normal thing people tell you,
just go, yeah, and what's the latency? Because they always kind of go, oh, we can do
2,000 AES's a second; yeah, and it takes you ten seconds to get an answer out.
That's much better, right?
>>: Isn't the latency dependent on the network? So is that why they don't say it?
>> Nigel Smart: Yeah, the latency does depend on the network. So if we slowed the
network down we'd get worse latency. So if you put these things in different
places, like you put one in California and you put one in New York, you get much
worse. So the ping times for these machines are, oh god, 0.1 milliseconds, and the
ping time from Bristol to London is 4 milliseconds.
So if you did it Bristol to London you'd get a 40-fold decrease. Yeah, that's really
scientific, isn't it? But you get the idea. Yeah?
>>: [inaudible]
>> Nigel Smart: Okay, so the thing is, with [inaudible] you really are focused on two
parties only, and you're really focused on binary circuits. So for AES calculations
[inaudible] might win, okay? For floating point calculations we are going to just
beat them into the ground, because we're not evaluating binary circuits; we have a
far richer ability to express how to evaluate functions.
So for anything that involves statistics or integer calculations we're going to beat
Yao hands down. So again, you have to think about the application; in terms of
general MPC we're pretty close to the Yao stuff, but we're not as good as them at
what they do, though we can do far more stuff. So, yeah.
In terms of the network, oh yeah, you can do Yao in all sorts of different ways.
Yeah, there are so many. There could be, like, three papers on Yao at this year's
Crypto with different optimizations, and no one knows how they actually play out.
So I think that's an open question.
>>: So you must be expecting this question, but what can you say about offline
performance? For example, for these two AES calculations, how much
precomputation...
>> Nigel Smart: [inaudible]. Oh god, I should know this. A few thousand
multiplications. So for AES it's particularly bad with SPDZ: with SPDZ we're talking
about 20 seconds of precomputation per AES, which is a bit of a problem. With
Tiny-OT we can get exactly the same online performance figures, and we can do
precomputation in well under a second per AES with Tiny-OT. So again it's a
trade-off. I just talked about speed here; if you care about different things you can
do different things.
But for online performance SPDZ and Tiny-OT are roughly the same.
>>: The slow-down for comparisons is [inaudible]?
>> Nigel Smart: Yeah, this is the key, right? So what this does is it opens up a huge
amount of optimization. So what we're doing at the moment is we're just spending
our time thinking about different ways of doing things. So we can think about
different ways of doing comparison.
And, just as a simple example, this opens up a whole new set of questions in
numerical analysis. So for example, we want to do matrix calculations, which
require us to do square roots. So how are you going to do square roots? Well,
you're probably going to use Newton's method. And if you're going to do matrix
calculations you're probably going to use principal component analysis, which is
also an iterative method.
If you do iterative methods, there are two ways to do an iterative method: iterate a
thousand times and hope, or iterate until the epsilon is really small. If you have to
iterate until the epsilon is really small, then every time you have to do that test of
whether the epsilon is really, really small, whether you've converged or not, yeah?
So this will reveal how many iterations you have to do to perform the calculation.
So what does that reveal about the inputs to the algorithm? So it turns out, we've
got some numerical analysts in Bristol interested in this, and they've been doing
some prior work for us. It turns out that if you do a Newton-style method it reveals
very little information, so you can actually work out dynamically how many times
you're going to do the loop; whereas for a more linearly convergent method, like
the [inaudible] method for matrices, it does reveal something: an approximation of
the first eigenvalue of the matrix is revealed by the number of times you execute
the algorithm, which is kind of really funky. Yeah.
>>: [inaudible] for fixed and floating, you could do square or cube root by halving
the exponent or dividing it by three, and then you need a very small and fixed
number of iterations. [inaudible]
>> Nigel Smart: Yeah. So I couldn't remember that one, but I knew there was one.
But there are a number of things we can do, which are all cool. The division by
three is quite nice, because that's for free: it's just a constant. So that doesn't
involve anything.
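A minimal cleartext sketch of the data-oblivious style being discussed: halve the exponent for the initial guess, then run a fixed number of Newton steps for 1/sqrt(a) using only multiplications, so the iteration count leaks nothing about the input.

```python
# Toy sketch of a fixed-iteration (data-oblivious) square root: the
# exponent is halved to seed the guess, then Newton steps for 1/sqrt(a)
# run a constant number of times, with no convergence test.
import math

def oblivious_sqrt(a, iterations=6):
    _, exponent = math.frexp(a)          # a = mantissa * 2**exponent
    y = 2.0 ** (-exponent // 2)          # initial guess for 1/sqrt(a)
    for _ in range(iterations):          # fixed count: nothing leaks
        y = y * (1.5 - 0.5 * a * y * y)  # Newton step, multiplies only
    return a * y                         # sqrt(a) = a * (1/sqrt(a))

print(oblivious_sqrt(2.0), math.sqrt(2.0))
```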
>>: Any other questions?
So let’s thank Nigel again.
>> Nigel Smart: Thank you.
[applause]