>> Kristin Lauter: Okay. So thank you all for coming. This afternoon we're very
pleased to have Matt Green visiting us from Johns Hopkins University. And he will be
talking to us about Charm, a Framework For Rapidly Prototyping Cryptosystems.
>> Matthew Green: Thanks. Is this microphone working? Well, okay. So, yeah, I'm
going to be talking about a new framework called Charm. Charm is a collection of
ideas, some of which are new and a few of them are not terribly new but to paraphrase
somebody yesterday, I think that there's a compelling case for a lot of them. Let's pray
this works.
So a few people have asked me if Charm stands for something. This is kind of the stuff
that comes up when you start searching for Charm. There are many different things.
My favorite is a fundamental quark that has an electric charge of plus two-thirds. That's
not what Charm is.
Baltimore is actually Charm city. And that's part of the reason that we chose the name.
For those of you who are fans of The Wire, you should recognize the place a little bit.
That's what life is like on a daily basis.
So let me explain a little bit about what Charm is. And to do that, I have to explain how I
got to a place where I needed to build something like this. So for the last several years I
have been working on cryptography from a kind of a systems point of view. And what
that means is a bunch of different things.
A few years ago I was working on attacking cryptographic RFID devices like the Exxon
Mobil Speedpass. More recently I've been working on things like outsourcing
attribute-based encryption, which we've seen a lot of today, to the cloud to make it faster on
mobile devices.
Somewhere in the middle of my PhD I actually kind of dropped out for a while. And I
started a software security consulting business which I won't talk about, but it does have
clients, including Microsoft. So we've done pretty well with that. In my spare time, what
I like to do is I like to develop new cryptographic protocols, things that provide privacy
for database searches, things like adaptive oblivious transfer and things like anonymous
credential schemes that let you do access control.
Then I made one terrible mistake somewhere in the middle of my PhD, which is that I
told a bunch of my cryptographer colleagues that I knew how to write code. And the
end result was that they'd like me to implement things.
I've written a few cryptographic libraries. The most recent is this thing called libfenc,
which stands for the Functional Encryption Library. Now, the genesis of this library is
about two years ago when Brent Waters came to me and said hey it would be a huge
win if we could develop a new attribute-based encryption library that supported many
new schemes and was very extensible. And I had a good motivation for this, which was
that I was actually working on a new grant doing medical record security, and we
thought maybe attribute-based encryption would be useful here.
So we came up with this new library. And the problem is that in order to do this, I chose
the worst possible library -- language to do it in. I decided to write a new library in C. I
thought this would be a few weeks worth of work. Ended up being three months and
about 15,000 lines of code. And somewhere in the middle of that, I decided that I would
never ever write a new cryptographic library at least in C again, that there was
something wrong with what we do.
So why -- the question that kind of came to me in the middle of this is why are -- why is
it that people are coming to me to write libraries? Why do we so infrequently do this?
Why is it so hard?
And the reason is that for a large part the knowledge isn't there. The people who know
how to write these cryptosystems are cryptographers. We have better things to do.
And we're not incentivized to implement any of our work. So that's why you see maybe
three percent of all the cryptoschemes that show up in EUROCRYPT and CRYPTO
ever actually make it into code. We assume that somebody out there, maybe the
systems community, is going to do it for us. I think that industry does do it, but if they
do, they don't like to tell people about it. They don't like to release their code back into
the wild. And I'm kind of looking at Kristin here, hoping that maybe I can kind of
influence that. But it really doesn't happen very often.
Another reason is that it's just really hard. The tools that we have are not right. We
have some libraries out there, but not enough of them. For example, if you're looking to
do lattice based crypto, what tool is there to do that? I hear people using SAGE. But
there's no lattice-based crypto library. There are pairing-based crypto libraries, but they're
not always well supported, and they're not always kind of there in the right way that you
need them.
But the really, really big problem that makes this so painful is that there's almost no
reuse of code. And so I was talking to Sara today about a project she did last year, and
I said, so how much -- you know, how much work have you done on that project in the
last year? And the answer was, well, we fixed something a few months ago when
somebody asked us. And the thing is when people write this code for research purposes, they tend to kind of do it until it's not necessary
anymore. And then they drop it.
And it's very, very hard to take somebody else's library and use it as a building block in
your own code for reasons that we can get into. So the question is how do we solve
these problems.
So with all these new cryptosystems, just in case you're not convinced they exist, there
are things like IBE and CP-ABE, KP-ABE, group signatures, anonymous credentials, all
sorts of computations on encrypted data that we want to do.
So what we'd like to know is is there a better way that we can make it easy for
cryptographers to actually implement this code and, to paraphrase one of the big talks
that we had yesterday, is there a better way for cryptographers to interface with their
computers, to go directly from their research to working code with minimal effort?
To show you what I mean by working code, in the worst possible way, here is a
snapshot of the implementation of RSA encryption taken from OpenSSL. This is page
1, by the way. Okay. So you can see OpenSSL is not really a fair example, because
it's probably the worst library ever made. If you search for OpenSSL and code quality, one
of the first results that comes up is one called OpenSSL is written by monkeys. And it's
really terrible. And we can talk about that later.
But, you know, it's not something that looks at all like what you think of when you think
of RSA encryption. Let's see. We have page 2 on here. Oh, there we go.
So the question is why are we writing these libraries in languages like C anyway?
That's not the way that other industries have gone. If you look at graphics, for
example, and gaming, typically what happens is we use C when we have to. We even
use assembly when we have to. We have a lower layer that does sort of optimized
native engines that do performance-critical tasks. And then we have a higher layer that
gives you a nice API that people who are not experts in that kind of work can use to
develop new things.
So we want to follow that framework, and we want to build something that's kind of
motivated by that idea. So our approach is this: Let's start by taking many of the
complex and performance-intensive mathematical routines, let's encapsulate them,
write them in C, and then provide them back to the user in a very simple way. Let's
expose them to a high-level language. We happened to choose Python here. You may
disagree with that. We're not married to it. But Python's a nice language for a bunch of
reasons.
And then, in this third bullet point is a really, really big bullet point that hides a lot of
detail. Let's create reusable and extensible tools that people can use to then go out and
implement cryptosystems and then reuse those cryptosystems and build more
complicated protocols on top of them. So the theory here is that we can rapidly develop
new cryptosystems instead of three months, we can do it in an afternoon.
This slide I sometimes get a little static on. Why Python? Not because I'm a Python
fanatic. I think Python has a lot of nice features, it's object oriented, blah, blah, blah,
blah, blah. One thing that's really important is that it's an interpreted language, which
means that you can dynamically generate code. We actually use this in a few of our
systems.
But there are many other things that make it useful. Many of the data structures are
very, very close maps to what cryptographers think about when they say let's output a
ciphertext. They don't say, well, let's define a big ciphertext data structure, they say let's
just output these group elements, and we expect them to be packed together. Python's
very nice for that.
So the benefits of this approach to cryptographers is that we can really focus on the
crypto algorithm, rather than all those implementation details we can reduce time. We
can basically get to something that looks as close to a LaTeX paper as we can. So
here's kind of textbook RSA in Charm. Now, this may look a lot like something you'd
see in SAGE, but this is just kind of the example of how we want things to look in the
end.
So this is actually the full RSA protocol. So this is what RSA looks like. When you
compare that to the C code that we saw from OpenSSL you can see it's kind of a big
reduction in code size.
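The slide itself isn't reproduced here, but a minimal sketch of what textbook RSA can look like in plain Python -- illustrative only, with no padding, so not secure as written -- is roughly:

    # Textbook RSA with plain Python integers (illustrative; no padding, not secure as-is).
    def keygen(p, q, e=65537):
        n = p * q
        d = pow(e, -1, (p - 1) * (q - 1))        # modular inverse (Python 3.8+)
        return {'n': n, 'e': e}, {'n': n, 'd': d}

    def encrypt(pk, m):
        return pow(m, pk['e'], pk['n'])           # c = m^e mod n

    def decrypt(sk, c):
        return pow(c, sk['d'], sk['n'])           # m = c^d mod n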
Just a --
>>: [inaudible].
>> Matthew Green: Yeah?
>>: So what happens to performance then?
>> Matthew Green: Right. So the question is how much can we bury those
performance optimizations so that you don't have to know about them?
For example, how much can we abstract into the -- basically the group, the ring interface so
that when you do this kind of exponentiation you don't have to explicitly have separate
modes that do CRT optimizations and so on, that that kind of thing can happen for you,
and you can flip switches.
So a lot of the idea of this is that we can kind of hide those details so that no matter how
much optimization you have, you get something that looks like readable code. And
that's kind of the basic starting point for what we did.
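One way to read that answer: the optimization can live inside the ring abstraction rather than in the scheme code. A hedged sketch, with invented class and method names, of how an RSA-style ring might quietly switch to CRT when the factors are known:

    # Hypothetical sketch: the scheme just calls ring.exp(c, d); the ring decides
    # whether to use plain modular exponentiation or the CRT-based shortcut.
    class RSARing:
        def __init__(self, n, p=None, q=None):
            self.n, self.p, self.q = n, p, q

        def exp(self, base, d):
            if self.p and self.q:                               # factors known: CRT path
                dp, dq = d % (self.p - 1), d % (self.q - 1)
                mp, mq = pow(base, dp, self.p), pow(base, dq, self.q)
                h = (pow(self.q, -1, self.p) * (mp - mq)) % self.p
                return mq + h * self.q
            return pow(base, d, self.n)                         # otherwise: plain path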
>>: But what about [inaudible] also extra cost just [inaudible].
>> Matthew Green: Is there extra cost from using an interpreted language?
>>: Yes.
>> Matthew Green: And then -- so actually the interesting thing is that for many of the
schemes that we use, not really very much. Now, somebody who is doing very, very
tight loops in Python will object that, yes, there's a call overhead in Python and there's
some overhead from going from Python to C. But the truth is in many of the schemes
that we use, particularly bilinear map schemes where we're doing pairings, a lot of the actual
work is taken up by the mathematical operations. The interpretation -- and, you know, it's a fairly optimized interpreter -- really makes up a few percent
of the total cost.
So are you willing to sacrifice maybe five percent of your efficiency for something that's
usable and you could develop quickly? That's a big question. I don't know the answer
to that. I am, but maybe not everybody else is. So I'm not going to claim this solves
every problem. Yes?
>>: [inaudible].
>> Matthew Green: We will get to numbers, definitely. A few numbers.
>>: [inaudible].
>> Matthew Green: Yeah?
>>: [inaudible] be fair to [inaudible] not a big fan of, you're not showing here the nasty
details that OpenSSL [inaudible] because you're hiding them inside for example encode
and decode which has to deal with length, which is what OpenSSL [inaudible] it has to
do with very fine parameters match.
>> Matthew Green: Sure.
>>: Length and this and that. So it's buried in there somewhere. There's encode and
decode you have to write.
>> Matthew Green: Let me put it another way. It's buried in there somewhere, but it
does not have to be buried in your RSA encrypt, which --
>>: Code does.
>> Matthew Green: But it doesn't have to be there. I mean, the way that -- let me put it
even a different way. If you wanted to extend OpenSSL to implement a new
cryptosystem, would you be able to do that with the building blocks they provide for
you? Is it modular enough that you can take those systems, you can extend them
easily?
And the answer that I've seen looking at most crypto libraries, including OpenSSL, is that it's
not; it's not designed from the point of view that you want to reduce things to the
smallest point.
>>: I agree.
>> Matthew Green: Yes?
>>: Let me see if I understand this. That description, you know, the code of the
previous thing, it's using no C code, it's using just directly Python or is it [inaudible] slide
or is it making calls out to things that are --
>> Matthew Green: Yes. So as I said before, what we want to do is we want
something that actually can perform properly. So we actually do use
C code for many of the computationally intensive operations.
>>: So Python is [inaudible].
>> Matthew Green: Exactly. But the truth is that if you look at many modern
cryptosystems, if you look at many of the research papers out there, the truth is that
that's what you're writing, is you're writing the glue. Most of the operations, even in the
lattice world, are really standard operations that are being put together in different
combinations. There are a very, very small number of exceptions.
So why are we writing in C? Why are we writing entire cryptosystems in C when what
we really want is those operations, that small set of operations to be efficient? I'm not
saying anything really new and controversial here. I'm just pointing something out that
seems pretty obvious. Yes?
>>: So just looking at your code, so what [inaudible] that N is the product of P and Q?
>> Matthew Green: Uh-huh. Yeah. This is -- yeah. Sorry, paramgen is actually
a separate [inaudible].
>>: Oh, specific for RSA [inaudible].
>> Matthew Green: Yes. I think there was -- yeah. Yeah. So -- so this is -- exactly.
So there are a lot of different things. This code is actually a couple months old. So I'm
not even sure how well it's been optimized. But this is pretty much what we want it to
look like.
>>: So can you comment either now or maybe later on how different Charm is, say for
example, from SAGE?
>> Matthew Green: Right. So the idea is that -- so SAGE does a lot of the same things
at a very low level. And I'm going to talk about some of the extensions. So what I want
to do is I don't want anyone to get the idea that this is it. I mean, we're not just taking
Python and throwing in a few exponentiations. If that was really all we were doing, it
would just be SAGE. We are actually adding some extra tools that are not available in
SAGE, some extra crypto-specific tools. And then we're adding some other support for
other capabilities that are really not there in SAGE.
We could have started with SAGE. And we could have built on top of that. And I'm not
totally opposed to that. We just felt that it was cleaner not to have to pull in all that
overhead of what SAGE is. So we kind of start with the basic building blocks.
I'm just going to give another example, because I wanted to have a couple of code
examples right up front. This is one of the nicest C implementations of a cryptosystem
I've ever seen. It's by John Bethencourt. It's a Paillier library. It's beautiful. I mean, he
writes -- there should be a museum in which John Bethencourt's code is. Because
when I write code, it doesn't look like this. But it's C code. And he wrote it around
GMP. And this is what it looks like. This is page 1. And this is page 2. And this is
page 3. And this is page 4.
So there's a lot of code involved in doing something very simple, even though he's using
a library to do all the fundamental operations. So this is what we can get to in Charm.
Not saying there's anything radical and new here, but the goal is to make something
that looks like what you see when you see a description of the Paillier protocol as close
as you possibly can. There are a lot of things that are going on in this code. For
example, we have basic routines for generating primes, for checking them. We have
some routines that allow us just some basic Python features that allow us to pack things
up. A lot of his code is allocating and deallocating memory and serializing it and putting
it into data structures. We can eliminate all of that, because we're working in a
high-level language. So just very simple things you can do once you move away from
C.
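For comparison, the core of Paillier encryption really is only a few lines once big-integer arithmetic, memory management, and serialization are taken off your plate. A rough sketch in plain Python, illustrative only, assuming g = n + 1 and no parameter validation:

    import secrets

    # Illustrative Paillier encryption/decryption with plain Python integers.
    def paillier_encrypt(n, m):
        n2 = n * n
        r = secrets.randbelow(n - 1) + 1                    # random r in [1, n-1]
        return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2     # (1+n)^m * r^n mod n^2

    def paillier_decrypt(n, lam, c):                        # lam = lcm(p-1, q-1)
        n2 = n * n
        u = pow(c, lam, n2)                                 # c^lambda mod n^2
        return ((u - 1) // n * pow(lam, -1, n)) % n         # L(u) * lambda^-1 mod n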
So there are a lot of previous approaches to this. Many people will say, well, there are
a lot of cryptographic libraries. What's the point of having yet another? There are also
some very -- some things like Cryptix. There are some other tools I'll talk about in a
second.
The problem is that these tools are really designed to work with an API. They're not
designed to generate new cryptosystems. They're very, very hard to extend. And
they're really not designed, you know, they're not designed to be used together. If
you've ever tried to combine OpenSSL with a pairing-based crypto library you'll find that
even if you're dealing with non-pairing curves or doing anything like that, or with MIRACL,
it's very hard to use those data structures together. They're not compatible.
I wanted to give one more painful OpenSSL example, just to drive this point home. This
is a piece of active OpenSSL code. I just wanted to point out the #if 0 structure. Okay.
Some of you guys have seen this before. This code was not designed to be looked at.
This code was designed to be used from the point of view of a very nice API. It was
never designed to be updated. At least not by the people outside of OpenSSL.
There are many existing implementations of cryptosystems. For example, Brent hosts a
page called the Advanced Crypto Software Collection where a bunch of people have
sort of dropped their implementations of things like the Paillier scheme and ABE.
The problem is that many of these things aren't reusable. There's no unifying
framework for using these and so on. So I'm just covering the same things.
There are many other related projects that are very close but not quite what we need.
There's Cryptol, which is really designed for symmetric encryptions, very, very high
performance. But the goal is to produce FPGAs, not to produce code. There's SAGE.
And SAGE has many, many things. But it has kind of too much. There's a
lot of material in SAGE that you don't necessarily want to import when you're building a
cryptolibrary.
Yehuda Lindell is actually building something that's related to this. But he's using Java
as his base language. We're working with him on some of that. But he's taking a
different approach than us, which is that he's actually designing his software. And we
are building our software kind of as we go. And we're sort of trying to see where those
two approaches end up.
So I want to give a quick overview of what we have in Charm. So these are the basic
components. At the very bottom of Charm we have these C math libraries which I
mentioned, which do the -- kind of the basic computational operations. Sorry. And a
little bit above that we have some Python wrappers for these, which give you some
abstractions for dealing with groups and rings and fields. And we have some
crypto base operations: if we need to access ciphers, to do encryption efficiently
in some mode of operation, we can use those.
We have a benchmark module which is built into the code that allows you to benchmark
not only the performance of a piece of Python code, but also to count the number of
operations that have been used in a particular run. You can actually specify at a very
granular level what you want to count. It will also allow you to measure only the
performance of the C code. So if you feel that the Python code is actually some huge
overhead in your particular application, you just want to see how much time has been
spent down here, you can limit that in a benchmark module.
But above that, we have a lot of reusable components. We have a library of schemes.
And these schemes are designed so that you can build protocols out of them. We have
a protocol engine which is designed to sort of do all the basic infrastructure, if you need
to implement interactive protocols. We have a kind of a reusable toolkit of Python code
that does a lot of things like linear secret sharing, all these basic tools that you need to
implement things like attribute-based encryption. And at the top we have a -- an
architecture called the adapter architecture that deals with cases where you have two
schemes that are not directly compatible but you want to use them together in the same
protocol. So you have a signature scheme that has one input space and you have an
encryption scheme that has a very different input space or output space, and you want
to use them together.
So this is our answer for dealing with all of those cases where you would have to
normally write ugly glue code to get different libraries to work.
So I mentioned base modules. We use a lot of standard code to do the underlying math
routines. We use the PBC pairing-based crypto library. We're moving to MIRACL. We use GMP and so
on. We provide Python extensions to do most of these things. And we are currently,
and I say in progress, meaning that we're currently thinking about what to do in terms of
adding lattice support.
We have this benchmark module which is designed not to induce a lot of overhead. It's
basically designed to make it as easy as possible for a developer to add to a program.
You can basically say okay, let's say I want to measure the number of multiplications,
exponentiations and so on, and we have this ability to sort of compare schemes with
that.
The way that you would use it is you say in your code, okay, init the benchmark module,
start the benchmark module now and then specify all the things you want to measure, and then end the benchmark module. And you can basically dump
those results after the fact. Essentially keeps a database of all the various counts that it
has.
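As a hedged sketch of that usage pattern -- the function and counter names here are illustrative, not necessarily Charm's exact API -- it looks something like:

    # Hypothetical usage pattern: init, start, run the code under test, end, dump.
    # Assumes some scheme, pk, and msg are already set up.
    group = PairingGroup('SS512')                               # some group object
    group.InitBenchmark()                                       # init the benchmark module
    group.StartBenchmark(["Mul", "Exp", "Pair", "RealTime"])    # what to count and measure
    ct = scheme.encrypt(pk, msg)                                # the code being measured
    group.EndBenchmark()
    print(group.GetGeneralBenchmarks())                         # dump counts and timings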
The toolbox has a lot of different things in it. We have code for handling things like
certificates, padding encryption schemes. We have key and ciphertext serialization
which tends to be very specific for different formats. All of the secret sharing routines and
encodings. Padding schemes, for example, are also in there too. We're really hoping to
build this toolbox. And our approach right now, because we're working with a language
like Python, is that we implement a scheme. We see which parts of it are likely to be
reused. And we move those into the toolbox.
And over time we've actually built up a toolbox that's now to the point where we rarely
have to add new code but when we do it grows a little bit. We have these abstractions
for algebraic groups and so on.
So kind of the meat of this is that in a very, very short time we've been able to
implement a huge number of schemes. Now, if you think about the total number of
schemes that have been implemented now, it's probably -- probably on the order of two
dozen. If you think about the various libraries out there.
So in the course of a week, we're now able to implement a dozen schemes. Basically
just find a research paper. If it's in one of the settings we support, we can translate it.
We have this library that we're sort of building right now with some of the basic systems,
group signatures, identity-based encryption. Because of Brent, we've now been forced
to implement three different attribute-based encryption schemes, including a distributed
attribute-based encryption scheme that he recently came out with. When I found out he
was giving a talk today, I was considering spending the weekend implementing his
latest scheme, but I did not do that. All these various pieces, including protocols and so
on.
Just to give you an idea of what it looks like to implement a scheme, here is a -- just
kind of brief listing of a few of those things that shows the number of lines of code we
had to write to make these actually happen. It's usually very small.
So I want to give a quick example of what it means to implement a scheme. And that
kind of will give an overview of some of the basic features. So we want to implement a
new scheme. Let's say we want to implement the Cramer-Shoup encryption scheme,
which most people are familiar with. So we create a class. And we -- this class inherits
from the standard base class called PKEnc, which has a defined API. We define an
initialization method which is responsible for setting up the parameters.
And then we define a set of standard encryption scheme algorithms, keygen, encrypt,
decrypt. These are things you see in every modern research paper. And then you
write the algorithms.
Again, there's nothing radical here, but this just gives you an idea of kind of what you
have to do. We have the ability to grab random elements from a group, random
elements from different groups, and we can basically put these things together, do a few
exponentiations, pack the results into ciphertext and then return those things in a data
structure without having to do any -- anything painful.
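To make the shape of that concrete, here is a hedged skeleton of such a scheme class -- ElGamal-style rather than full Cramer-Shoup, and with illustrative group calls rather than a verbatim Charm listing:

    # Illustrative skeleton: inherit from a base class with a fixed API, then write
    # keygen/encrypt/decrypt almost exactly as they appear in the paper.
    class ElGamalLike(PKEnc):
        def __init__(self, groupObj):
            self.group = groupObj

        def keygen(self):
            g = self.group.randomGen()          # random generator
            x = self.group.random()             # secret exponent
            return {'g': g, 'h': g ** x}, {'x': x}

        def encrypt(self, pk, M):
            r = self.group.random()
            m = self.group.encode(M)            # encode the message into the group
            return {'c1': pk['g'] ** r, 'c2': (pk['h'] ** r) * m}

        def decrypt(self, sk, ct):
            m = ct['c2'] / (ct['c1'] ** sk['x'])
            return self.group.decode(m)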
Here's another thing where we have a built-in routine for encoding messages to different
groups. Again, most of the same stuff. We're taking in a public key and a message here
and we're returning a ciphertext in this particular format. We use Python dictionaries to
encode most of the elements so that it's very easy to recognize a ciphertext when we
get it back. We don't have to worry about what order it's been put together in. We can
just basically say here are the elements that we want. Yes?
>>: [inaudible].
>> Matthew Green: ZR is -- in this particular implementation we are actually using R as
the order of the group. So we're using a prime order group here. This is just our
abstraction for prime order groups. And this is it. And again, there's not too much here.
The only thing that's really interesting here is that we are actually hashing something.
So we have two different groups here. We're hashing something to an integer group
and we're hashing these values.
This is the kind of stuff that when you write it in C, again, it's not super novel when you
think about it from a high level. The minute you have to write this, the minute you have
to deal with how do I serialize three group elements into a buffer, hash them, take the
results and transform it into an integer, it's extremely painful. You're looking at 20 lines
of code. We can do that in one. So just straightforward things. Yes?
>>: How do you guarantee that these encodings from [inaudible] you want to make
sure that the serialization is uniquely parsable, right, because otherwise you can get
collisions [inaudible] how does that [inaudible].
>> Matthew Green: So we -- right now, I mean, there's nothing -- it's not terribly
sophisticated. This notation here indicates a particular way that we're going to serialize
these group elements. So essentially it's basically take them, write them out into a
particular string of bytes, we know how that will be, concatenate them as a buffer.
>>: So you [inaudible].
>> Matthew Green: That's exactly right. So there are potentially ways that we can
make this --
>>: [inaudible] variable length, so, I mean, I had to struggle with this, have a variable
length message and a fixed length something, another variable length field, once you
have two variable length fields, you actually have to think about it.
>> Matthew Green: We do.
>>: Do you take care of the thinking?
>> Matthew Green: We don't take care of that kind of issue yet, but we can in the
future. It's a good thing to know about. So again, decoding and so on is all here.
I want to give one more example, just to show the kind of things we have in the toolbox.
We look at an ABE scheme -- yes?
>>: So sorry. I have to leave in a second so one question is have you submitted this
work anywhere yet or what is your plan?
>> Matthew Green: We have a tech report on this one.
>>: So just one comment. Like actually I really do like the idea, like, you know, I think
it's -- I like the scheme of course [laughter]. Like I wonder, though, this just kind of a
general thing, I don't think it's -- like for a conference are you going to go for a security
conference or a crypto conference? Because for a security conference, I could guess
that they're like oh, all these schemes -- I think that's really cool is that you cover all
these schemes, right, and you can cover them easily. But I'm not sure if, like, you know,
the [inaudible] reviewer knows --
>> Matthew Green: There's no way this is getting into a crypto conference. This is not
something that cryptographers want to publish. I mean, this is not the kind of work that
exists there.
>>: You say it's not getting into a crypto --
>> Matthew Green: I mean, this is not going into, you know, a major crypto conference.
This is -- you very rarely see this work.
>>: [inaudible] into a major security conference.
>>: Yes.
>> Matthew Green: That's the question, yeah. That is -- it's a big problem.
>>: I think you can -- I mean --
>>: [inaudible] program [inaudible].
>>: No. I just mean -- I mean, we did something fairly similar. And I think you can
really sell it.
>>: [inaudible] you think --
>>: I mean --
>> Matthew Green: I think this would have been a great USENIX submission. We just
didn't have it ready by USENIX.
>>: Yeah, I mean, I do too. I just wonder --
>> Matthew Green: Yeah.
>>: [inaudible].
>> Matthew Green: So now you're asking a more philosophical question, which is -- what's the venue for this kind of crypto engineering?
>>: Yeah.
>> Matthew Green: I don't know what it is. I wish I did.
>>: I don't mean it in a negative way, I mean it --
>> Matthew Green: No, it's a big problem.
>>: [inaudible] figure it out way.
>> Matthew Green: It's a big problem. Yes?
>>: Other communities that might be interested is program languages, software
engineering. Actually which begs the question, I was wondering by choosing Python
here isn't it a little unfortunate there's a whole line of work on taking languages like ML
and these things and doing formal verifications of engines for analysis, and why didn't -- you know, it seems like you just missed what could have been a really nice opportunity
there.
>> Matthew Green: So let me --
>>: Having another dimensional analysis --
>> Matthew Green: So first of all, the answer to that is I don't think using Python
precludes formal analysis. I have a lot of people say Python is not a strongly typed
language, but that's -- you can deal with that. But --
>>: [inaudible].
>> Matthew Green: No, there --
>>: [inaudible] people work on other languages.
>> Matthew Green: Sure. There are. But a lot of the engines that are out there are
working on very specific functional -- things like -- you have one at Microsoft here, which
is they used to do DKM, which was --
>>: [inaudible].
>> Matthew Green: Yeah. So let me put it another way. The times -- the places where
I've seen these formally verified languages used to deploy actual shipping
cryptosystems, the products, they could fit, you know, in this cup. This never happens.
The reason is because when people actually want to ship things, what they do is they
do a formal analysis, they implement it over here, and then they reimplement it in
another language. They reimplement in C#, they reimplement in something that you
can use and developers know how to work with.
And so our philosophy is that -- maybe I'm being a little blunt about this, but our
philosophy is that these languages are very, very useful. But if you don't employ a
language that people are willing, developers are willing to use and companies are
willing to ship and people are actually comfortable with and are well supported, you're
really doing academic work that isn't going to advance the field very much. I mean, the
formal verification itself is very useful. When it comes to applying it to a language, we
should be applying it to languages that are likely to be used. And maybe that's not nice,
but I really think that's important.
And this is the problem with DKM is they have two implementations. They have a -- I
guess it's ML and they have a C# implementation. One hopes that these are the same
thing. One hopes that there's a direct map between what's been analyzed and what's
going to ship. Can anyone here tell me that's the case? And there's a --
>>: [inaudible].
>> Matthew Green: Would I like to do that for Python? I would like to take that kind of
formal verification work and move it to languages like Python. I don't think it's
impossible. I don't think it's also required for many applications, but for those
applications where it is, we should be able to support it in a language we can use.
>>: [inaudible].
>> Matthew Green: Sure. Formal verification is --
>>: [inaudible] you have to go down a path no one's ever gone down before.
>> Matthew Green: Yeah.
>>: Whereas the DKM thing is an example of where that's been shown possible with a
--
>> Matthew Green: Yes. And I think that there's some great feasibility results out
there, but I -- I'm probably offending a lot of people, this is going to get out, and -- but I
look at these more as feasibility results. They're not going to be practical results until we
can do them for languages that people are willing to use on a broad, wide scale.
Anyway --
>>: I just wanted to comment. So Brent's question in some sense is getting at this
issue of incentive. So you said in the beginning, you know, why do people do this and
this question of incentives. So the issue with, you know, industry is that obviously there is a
great incentive to build these libraries.
>> Matthew Green: Right.
>>: Then you're not generally allowed to share them with anyone else. And so one of
my questions is kind of what is your intention with this -- why were you like -- are you
going to release it publically and if so under what license if you're building on those
other pieces like GMP and stuff, where it could be restricted by what licenses they
release? I mean, there's this whole huge mess that SAGE has dealt with -- the SAGE
community has come to blows over these issues. I mean, they're not easy problems to
solve.
>> Matthew Green: So our goal has always been to release this publically. Under
probably LGPL. Which means that the library itself will be GPLed but you can use it
within another product without having to worry about that product also being GPLed.
And so far, with a couple of exceptions we've been able to do that. I'm going to have to
plug my computer in before it dies. But that is -- that's, in fact, our major goal.
Another thing that I'm going to say about this is that our goal here is to build a system
on top of which we can do research. The system itself is really a basic level, and we'd
like to continue to reuse it. Again, it's not so much that we're trying to publish this; we would like to have a system that's kind of usable and that everybody else can
use as well.
It's very necessary right now for the work that we're doing with the SHARPS grant that
we have a system like this. So if we don't have a system that allows us to build
cryptosystems easily we -- we are going to be kind of in trouble.
>>: Oh, I see. So there's actually a huge incentive there, right, which is proof of
concept in a particular setting, which is an enormously important setting [inaudible].
>>: SHARPS also gives us some funding to do this with so that we can push this out and
then [inaudible] project objectives is things [inaudible].
>> Matthew Green: Okay. So there is -- I'll skip past this. I wanted to quickly give one
-- at least one slide from encryption in an attribute-based encryption scheme. My
implementation of this in C was only about a thousand lines. So this is a big
improvement. A couple of things you'll notice is that we have some very important
library routines. We have this create policy, which takes in a string. Attribute-based
encryption policies are typically strings separated by ands and ors, and we use that
across many ABE schemes. So why not just put that in our toolbox if we're going to use it?
We have this get attribute list. We have the ability to calculate shares. And this is all
standard secret sharing code that we just reuse over and over again for every ABE
scheme we have.
We could use it if I was implementing Brent's hide. We'd use that again. So there are
many things that we could use. Again, the objection here is that we're hiding pieces of
code. But these are pieces of code that are themselves not very long. And we have
used them for many different schemes. So it's not a totally obvious objection.
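Put together, the shape of an ABE-style encrypt routine built out of those toolbox pieces is roughly the following. This is a hedged sketch, not the slide itself; helper names like createPolicy and calculateShares stand in for the reusable routines described above:

    # Hedged sketch of ABE-style encryption reusing toolbox helpers for policy
    # parsing and linear secret sharing (names illustrative).
    def encrypt(self, pk, M, policy_str):
        policy = self.util.createPolicy(policy_str)            # parse "(A and B) or C"
        s = self.group.random()                                # top-level secret
        shares = self.util.calculateShares(s, policy)          # one share per leaf attribute
        C = {attr: pk[attr] ** shares[attr] for attr in shares}
        C0 = pk['g'] ** s
        Cm = (pk['e_gg_alpha'] ** s) * self.group.encode(M)    # blind the message
        return {'policy': policy_str, 'C0': C0, 'C': C, 'Cm': Cm}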
And this is decryption, which I won't go into in too much detail. But again, we have a few
library routines. And what I want to just kind of get at is that much of what you see up
here in LaTeX maps directly to what you see in this code. We prune the tree a little bit,
we get the coefficients, and then we have this: for everything in the prune list we do this,
which kind of maps to this product up here, I think. And then we did the final line, which
is just this decryption.
>>: So --
>> Matthew Green: Yes?
>>: So I'm sort of curious who -- who you're expecting to be writing this code. So, I
mean.
>> Matthew Green: You.
>>: Okay. So there's sort of -- well, I mean, there's the cryptographer, right, and this is
certainly making their job easier for them, you know, they wrote up their paper and they
don't have to look like go hack around in C which they haven't used in, you know,
however many years. But, I mean, the incentive still isn't there necessarily, right, for the
cryptographer. I mean, they publish their paper. It's like this theoretical contribution and
then they're on to the next one.
>> Matthew Green: Sure.
>>: And so the question is, I mean, so --
>> Matthew Green: Who's going to do it?
>>: So you would think it would be the person who wants to plug it into their thing but
then my question -- and I don't know if there's a good answer, I mean, is this really -- is
it safe in some sense for someone who really doesn't know crypto? You know, they
could say, okay, well, they can go look up the paper and they can do this mapping
themselves. But, you know, maybe they messed something up and then it's completely
insecure. I mean, so how do you sort of --
>> Matthew Green: So there are two answers to that. One is that I think that the grad
students of the world hopefully, you know, the crypto grad students of the world will
hopefully help a little bit with this.
In the short term we're actually working on it. If you have a scheme and you'd like to
see it implemented, just for vanity's sake you'd like to see it implemented, come to us.
We're happy to do it. It takes a couple hours. It's not a big deal.
In the longer term, though, there are two levels of this. One is implementing these basic
schemes which I agree takes a lot of knowledge. Now, the next thing is let's say you
want to build a protocol. You're not an expert. You want to build a protocol. Maybe it's
a simple protocol. It's encryption and signing put together. How do you do that
securely? How do you know, you know, what these schemes provide in terms of
security definitions? Can I build something that does these things?
That's an area where we think less expert people will be available. So we do have
some ideas there on how to make that easier for non-experts so they can't screw it up.
And I'll come back to that.
But, yeah, I agree, at some point you have to understand the scheme. But let me put it
this way. If you could implement this scheme in an hour instead of -- how long did it
take you to write your verifiable encryption scheme in -- last year?
>>: You know, just in C++, not --
>> Matthew Green: Sure.
>>: [inaudible] rewrote it further?
>> Matthew Green: Sure.
>>: [inaudible] months.
>> Matthew Green: So that's what I'm saying. There's -- we're just trying to lower the
barrier of entry, the barrier of having to [inaudible] kind of stuff to the point where maybe
it's not such a problem. Maybe that won't be the solution. We'd like it to be, but who
knows.
So let me get to the point where -- let me get to the next aspect, which is --
>>: [inaudible] generate C code out of this?
>> Matthew Green: Yes. So that's one of the things that we're working on now is we're
using some Python to C compilers along with our code to actually translate that Python
code to efficient C. We think that's going to be important in terms of actually deploying
things to products.
>>: Python is only this upper level --
>> Matthew Green: Yeah.
>>: [inaudible].
>> Matthew Green: Yeah. Well, that's the other thing. So you could also just import
the Python interpreter into your application. It's not that big. We've been experimenting
with that too. There are different ways to get this. So when you see our performance
numbers you can decide whether or not you believe in it. But we'd like to be able to do
that.
So we have a few things that we have put in place to make things easier and to make it
difficult for people to make mistakes. And one of them is we have this scheme class
hierarchy, which we just start with a few standard APIs. So we have these things that
everyone knows.
But beyond that, we want to annotate schemes. So once you annotate a scheme, we'd
like to annotate it with various things that are useful like what complexity assumptions
does that scheme use in its security proof? What computational model does the proof
use? What security definition? And this is probably the most important thing. You're
implementing a signature scheme. Is it strongly unforgeable, weakly unforgeable,
exactly what is the definition? Is it a one-time signature? So these things actually are
carried with the objects that you create.
We also have very important things like the input and the output space. So you have
Waters' IBE, which takes in a vector of bits as the ID space. You have other IBE
schemes which take in strings. So you'd like to be able to map from one to the other.
How do you do that? Well, we have to annotate the scheme and say this is our input
space and this is our output space. Then you can combine the schemes.
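As a hedged illustration -- the attribute names are invented for the example -- those annotations might simply be fields carried by the scheme object:

    # Hypothetical annotation sketch: the scheme object carries its own metadata,
    # so other components can query it before trying to compose with it.
    class BB04IBE(IBEnc):
        def __init__(self, groupObj):
            self.group = groupObj
            self.assumption = 'DBDH'                 # complexity assumption in the proof
            self.model = 'standard'                  # computational model
            self.security = 'selective-IND-ID-CPA'   # security definition
            self.idSpace = ZR                        # identities are elements of Z_q
            self.messageSpace = GT                   # plaintexts live in G_T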
Now, the question is, what do you do with this information? So you want to combine a
bunch of different schemes to build an application. How do you do that?
So, you know, you have all these different data structures and you could have
incompatible inputs and outputs. How do you make it so that you can actually connect
these things without writing glue code? The answer we came up with is something
called adapters. Adapters are largely responsible for the task of taking different kinds of
data type, like for example I have an identity that's a string and I want to encode it to
something I can use with Waters' IBE. Its job is to actually actively query that
scheme and say what is your input type for the ID space? Find out what it is and then
figure out exactly what it needs to do to translate whatever I want to put in as the ID into
something that can be fed into the Waters scheme.
So we have this idea of kind of code that is -- I wouldn't say intelligent but is at least
aware of other components and can adapt itself properly.
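A hedged sketch of that idea, again with invented names: the adapter inspects the underlying scheme's declared ID space and does the conversion on the fly.

    # Hypothetical adapter: wraps an IBE scheme whose ID space is Z_q and exposes
    # an interface that accepts arbitrary strings, hashing them into the right type.
    class HashIDAdapter(IBEnc):
        def __init__(self, ibe, groupObj):
            self.ibe, self.group = ibe, groupObj
            self.idSpace = str                       # new outward-facing ID space
            self.messageSpace = ibe.messageSpace     # everything else passes through

        def encrypt(self, mpk, identity, M):
            hashed_id = self.group.hash(identity, self.ibe.idSpace)  # str -> Z_q
            return self.ibe.encrypt(mpk, hashed_id, M)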
Other things we can do with adapters. We can actually use them to implement more
complex protocols. There's a whole class of transforms called IBE-to-PKE transforms
that take an identity based encryption scheme and sometimes a Mac or a signature and
actually build a CCA secure public key encryption scheme. There are many protocols
like that.
So we can build generic adapters that basically just feed them two schemes, signature
and an IBE, and they will decide whether or not those schemes have the necessary
security properties to build a higher-level scheme. So there are many different things
you can do. We can do hybrid encryption with an adapter. If I want to encrypt long
messages -- well, I have an IBE scheme or a public key encryption scheme that takes
short messages -- this adapter then handles all of the work of taking a large message,
generating a random session key, and using a cipher to actually do the payload
encryption. So there are many different things that you can do with these adapters.
And the idea is that ultimately we're going to work towards a place where we can make
these more automated.
So I'll give you one example. I already gave you the hybrid encryption example. So I'm
not going to talk about this. But so let's give a very specific one, actually. So we have
this application that needs to do hybrid encryption. Bulk data needs to be encrypted.
And so what we're going to do is we want to use an IBE scheme to encrypt it. So we
want to basically build an adapter. The example I'm going to start with is the
Boneh-Boyen scheme from 2004, which we've already talked about today.
Two of the details of the scheme are that the plaintext space is in a group. It's actually
GT bilinear group. And the ID space is an integer.
Now, that's not exactly what I want to use for my application. So I can put an ID
space adapter in place on top of that. And all this does is it applies a target collision resistant hash to the identity
that I've given it. So the identity is now a string -- an unlimited string -- that
gets hashed down into this integer. There's nothing really interesting happening here.
But it has a couple of kind of weird side effects, one of which is that if you're willing to
analyze this new scheme in the random oracle model, you've actually taken a
selectively secure IBE scheme and you've converted it into a fully secure IBE scheme.
Not a big deal. But now if you actually query this and you say what are your security
properties, this will report that it's a fully secure IBE, as well as the changed interface.
Next step is we could just bolt on another adapter to this. So we want to do hybrid
encryption. Well, this will take in an arbitrary plaintext space, the same ID, pass the ID
through but do the hybrid encryption to move that data. So this is a very simple
example.
And this is what it looks like to instantiate these things. This is our IBE adapter. So
this is what encryption is for -- actually, sorry, this is hybrid encryption, I think. So this
basically does cipher encryption, instantiates the cipher and we say CCM -- this doesn't
specify it's AES. But that's what we use right now as a low level.
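Since the slide itself isn't in the transcript, here is a hedged reconstruction of what stacking those adapters can look like, using the hypothetical class names from the sketches above:

    # Hypothetical composition: BB04 IBE -> string identities -> hybrid encryption of bulk data.
    group = PairingGroup('SS512')
    ibe = BB04IBE(group)                      # base scheme: G_T plaintexts, Z_q identities
    ibe = HashIDAdapter(ibe, group)           # now accepts arbitrary string identities
    ibe = HybridIBEncAdapter(ibe, group)      # now accepts arbitrary-length plaintexts

    (mpk, msk) = ibe.setup()
    ct = ibe.encrypt(mpk, 'alice@example.com', b'a long bulk message ...')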
So here is an example of the CHK04 transform that takes a selectively secure IBE and
a one time signature and converts it into a PKE, a CCA secure public key encryption
scheme. So we're actually using a couple of things here. So we have this issue -- if
you're familiar -- how many people are familiar with a CHK transform? So the basic
idea is that you generate a key pair for a one time signature and you use the public key
as the identity for an IBE scheme when you encrypt. Now, the problem is if you
look at the Boneh-Boyen scheme itself, it can't necessarily take whatever that public key
is and use it directly because it expects the input to be an element of ZQ. So we apply
this ID space adapter and that allows us to take those arbitrary strings. Then we have
the one time signature. And this code is very, very, very small. All it does is the actual
work of implementing that eight or nine lines that represent the CHK scheme. It
generates a key pair from this, uses the public key to encrypt the message and it signs
the resulting ciphertext using this, which involves another conversion of data types. And
it returns the ciphertext and it returns the signature as well as the public key, which is
the ID. Yes?
>>: I guess I'm going to ask the serialization question again [inaudible] conversion of
data types has to be [inaudible] parsable so that, you know, public key for this one time
signature didn't also look like a public key in [inaudible] play around with [inaudible].
What did you find -- I mean, the industry level is [inaudible] encoded with some horrible
things.
>> Matthew Green: Well, in this case it's not as much of an issue because in this case
you're actually allowed to query on just about anything. If you could make it look like a
different public key, it will just look like a different identity. When you try to decrypt with
that IBE scheme, it won't decrypt. So this is not a case where you have to worry as
much.
But you're right --
>>: You're relying on a very specific sort of security proof [inaudible].
>> Matthew Green: Exactly. Exactly. So in this case, that wouldn't be an issue. In
general, we do a very specific set of serialization rules that, again, right now is fixed,
but we basically say, you know, if you put these things together in this order --
>>: Okay, so --
>> Matthew Green: It will be serialized in --
>>: You don't do something like [inaudible].
>> Matthew Green: We don't. Not now. But we could in the future. Right now we
don't.
So, okay, so this is again -- so we have this public key. This is just an example of how
you can keep throwing these things together. We add hybrid encryption. So now we
have a CCA secure scheme. Okay. So this just goes on forever.
So we also have this dictionary. So many of these security definitions imply other
security definitions. So a strongly unforgeable one time signature implies a weakly
unforgeable one time signature, a strongly unforgeable many time signature implies a
one time signature and so on. To make these rules work we actually have to tell Charm
about all of these things. So we have rule engines that allow you to compare and see if
things work. We also have complexity assumption equivalences and so on. So we had
to make the system aware of these notions in order to build these systems.
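A hedged sketch of what such a rule table can look like; the entries shown are just the examples mentioned in the talk, with invented notation:

    # Hypothetical rule table: each security notion lists weaker notions it implies.
    IMPLIES = {
        'sUF-OTS':  ['UF-OTS'],                # strongly unforgeable one-time signature
        'sUF-CMA':  ['UF-CMA', 'sUF-OTS'],     # strongly unforgeable many-time signature
        'IND-CCA2': ['IND-CCA1', 'IND-CPA'],
    }

    def satisfies(provided, required):
        """True if the provided notion is at least as strong as the required one."""
        if provided == required:
            return True
        return any(satisfies(p, required) for p in IMPLIES.get(provided, []))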
Now, the vision of this is that right now we may have 20 different schemes. But those
20 schemes, when used with adapters and other systems, combine to make many
different protocols and many different additional schemes. So our kind of vision is that
some day you're going to be able to say I need a scheme that has this functionality
under these complexity assumptions, these are my input and output requirements and
security definitions and it's going to basically output a list of schemes that it can build
from known adapters.
The applications of this may or may not be useful, but this is kind of where we'd like to
go. So you don't have to go out of your way to put these things together, they just get
assembled for you when you need them.
I like this, but most people may not. So interactive protocols. This is a very, very
important thing. I work in the area of interactive protocols. Very, very hard to deal with
this stuff when you have to write everything from scratch. What does it take to build an
interactive protocol? Well, you have to have some kind of database or internal memory
to maintain state. You have to serialize data, which, as you just mentioned, is a horrifyingly
awful process.
You have to do network transmission, which is not so bad, but it takes a lot of work.
Then what happens if you have a subprotocol? So when I write an oblivious transfer
protocol, it has a zero knowledge proof in the middle of it. Well, then I have to jump out
into that protocol, complete the execution, come back with the result and then move on.
Writing that is not a lot of fun. So we need some way of making that transparent to the
developer. The developer doesn't -- I mean, these protocols mostly look the same. So
let's provide some templates for people.
So let's talk about what you have to do to set up the protocol. I'm going to start with -- there's an example here. Very simple protocol that most people know, the Schnorr
HVZK. So we set up the protocol. Looks a little complicated, but it's not really that
complicated. We just say here are the states and here are the transitions and
basically here are the parties. The one thing that we have to provide to the system is
basically a socket interface that says the person you want to talk to is at the end of the
socket.
So this protocol, incidentally, describes both the states for the prover and the verifier.
This class that I'm building describes the entire protocol. When I instantiate it, I will say
now I want to be the prover, and I want to talk to this person. I want to actually run this
protocol.
Okay. So this is what you actually have to implement. So all you have to do is give a
list of the states and the functions that are going to implement the core of them. And
then this is kind of all that you have to do in terms of actual code writing to implement
the guts of this first state up here. You use built-in store functions to keep values that
you want to use later. And then you return from this function with the stuff that needs to
be pushed over the wire. So the entire protocol is these few lines. And you have to do
this for every single state. Okay. I'll skip the states. They all kind of look like this.
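As a hedged sketch of one such state function -- the store and return conventions here are illustrative -- the prover's first move in Schnorr might look like:

    # Hypothetical prover state for Schnorr: pick r, remember it, send t = g^r.
    def prover_state_1(self, inputs):
        g = self.db['g']                 # public generator, stored at setup time
        r = self.group.random()
        self.store(('r', r))             # keep r around for the response state
        t = g ** r
        return {'t': t}                  # returned values get pushed over the wire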
So that's fine. But there are a lot of protocols where, you know, you have to actually
implement the protocol yourself. There are others where there are subprotocols that are
not given to you in the paper. This is a classic example of an anonymous credential
protocol. The protocol is kind of here. You go through these steps, and then you hit this
point where you say execute this zero knowledge proof.
I mean, can anyone, off the top of your head, tell me exactly what the protocol for
implementing this is? It's not easy. And Sara actually worked on this last year. She
came up with a compiler that does this for you automatically. Well, this is not a new
idea. We can use that as part of our system as well. So we can take this complicated
proof, and we can spit out something that directly maps to that -- basically implements
that protocol, that full interactive protocol. The only thing that's different from what she
did is that where she had to write her own interpreter to actually handle the compiling
and the interpretation of that proof, we have an interpreter. We have Python. So all we
have to do is basically when we hit this line in the protocol, we just build a bunch of
Python code and we execute it dynamically right here in the compiler. In the interpreter.
So this is what it looks like from the prover's point of view. You set up some public
variables. You set up some secret variables. You tell it who you want to talk to. And
you make one call. This call will handle all of the work of compiling that zero knowledge
proof into Python code which itself is in that protocol structure which you can then
execute. It will then run it interactively with the remote party and come back with a
result. So this is a much closer map to what you see in a research paper than what you
might see if you have to set up all this other information.
Now, why do we do this dynamically? Why not compile this proof one time before we
run the protocol and use it. Well, the answer is that there is information available to us
now, now that we have the protocol running where we have all of these values where
we can basically say well, what kind of setting are we running, what kind of -- what kind
of group elements are these? Are they integers, are they elliptic curve group elements?
There's a lot of information that's not available to you before you run the protocol -- the
program -- that will be available when you get to this point.
Your ZKPDL, Sara's ZKPDL actually requires you to specify a lot of this stuff in
advance. So we just wanted to get rid of that and say, well, you're already running this
protocol. Why specify it twice? So let's just have that information available to you by
introspection.
And this is what the compiled code looks like. It's not pretty. But this is a very simple
proof. And this is kind of what you get at the end. So I already showed you this.
So I want to point out that there is work. So Sara's ZKPDL is something that does this.
There's also the CACE, which is a sophisticated compiler which does stuff in advance.
It uses a much more complicated language to specify the proofs. They do have a nice
formal verification component, which we hope to kind of leverage in the
future.
So people asked me about performance measurements. And I think this is kind of the
key: is this something you want to do? The answer is sometimes. This is our
implementation of EC-DSA. So this is OpenSSL's native code optimized and we use
the built-in speed test program to see how fast it runs. Okay? We have C right here
and we have the Python version that uses some C code as the optimization but is
written in Python. Not very good, is it? It's terrible.
When you look at it, we're about half as fast on verify and we're even maybe four times
fast -- four times slower when it comes to signing.
But if you look at the actual timings here, they're not very high. So if you have to
generate a hundred thousand DSA signatures, EC-DSA signatures, don't use this
implementation. This is not the right tool to use. But if you have to generate one or two in the
course of a protocol, why are you going to take on all sorts of extra overhead just to
generate a few signatures?
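To see the kind of effect being described, here is a tiny generic measurement you could run yourself. It has nothing to do with Charm or EC-DSA specifically; pow() just stands in for a fast native primitive, and the wrapper simulates the argument handling and serialization a high-level layer adds. When the core operation is this cheap, that bookkeeping is a visible fraction of the total; when the core operation is many pairings, it disappears into the noise.

    # Rough illustration: Python-level bookkeeping around a fast native operation.
    import timeit

    p = (1 << 255) - 19              # a large prime modulus
    g, x = 5, (1 << 250) + 12345     # base and ~250-bit exponent

    def bare():
        return pow(g, x, p)          # the "native" operation by itself

    def wrapped():
        # simulate what a high-level wrapper does: pack arguments, serialize, then call
        args = {'base': g, 'exp': x, 'mod': p}
        blob = repr(args).encode()   # stand-in for serialization
        return pow(args['base'], args['exp'], args['mod']), len(blob)

    print('bare   :', timeit.timeit(bare, number=5000))
    print('wrapped:', timeit.timeit(wrapped, number=5000))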
But if you look at more complex protocols -- and the difference here is that these
protocols use much more mathematically intensive operations. This over here is the
Bethencourt attribute-based encryption scheme working with a policy of size 50, 50
nodes.
The differences here are not that big. Really, there's no difference in
key generation time between our code and the native optimized C code. There is a
difference in encryption time, and we found that that difference is entirely due to a
parsing library that we use to parse policies. We're actually trying to fix that.
But there's not very much difference in the amount of time that we spend
doing the actual math operations. Decryption is almost the same. So it depends on what kind of
operation you're doing. Are you building something that does a mode of operation for a
cipher? Do not do it in Python. But if you're doing something that uses more powerful
operations, like lots of pairings, then this is fine with very, very minimal cost. Yes?
>>: The EC-DSA is a big surprise. If the dominant operation is [inaudible], you're
doing it in C either way, right?
>> Matthew Green: Yes.
>>: Because the Python code doesn't do the loops.
>> Matthew Green: Yup.
>>: The Python code just [inaudible].
>> Matthew Green: It doesn't do the loops inside the exponentiation. But when you
think about the call overhead -- the exponentiation and point multiplication are so fast that
when you're in Python, the call overhead in Python
actually becomes a significant part of the cost. Dealing with work like serialization and
all that other stuff that we have to do in a very simple routine really does become
significant. When you're over here, the costs are still there, but they're not
relevant. So, yes?
>>: [inaudible] what is the reason why we, like you said, have to specify a lot of
stuff ahead of time is that we need to do optimizations for, like, multi-base
exponentiation?
>> Matthew Green: Yes.
>>: Can you guys do anything like that?
>> Matthew Green: Sure.
>>: I mean -- okay.
>> Matthew Green: So the only thing that's different -- so we have all the information
that you have. The only difference is that we get it by looking at the variables that have
already been allocated. So what information do you need that would allow you to make
that observation?
>>: Well, for example, so -- I'm obviously [inaudible] library, but if I'm doing like 20
proofs using the same, you know, G and H bases every time, right, then in ZKPDL you
can cache all those powers, and the first proof is going to be more time
intensive, but then it's -- I mean, it's a huge speedup. So if [inaudible] dynamically,
it seems like --
>> Matthew Green: That's true.
>>: Okay. So you can't magically do that? Or can you?
>> Matthew Green: You're right.
>>: You can do the optimizations like at the time, right?
>> Matthew Green: Yes.
>>: But I'm saying you can't do it like from one thing to the next in some sense.
>> Matthew Green: You're right. So basically you have to say that I'm going to do 20
more of these, and you have to say do this operation. That's not something -- we can't
know that you're going to do another 19 after that.
>>: Okay.
>> Matthew Green: So there are some optimizations you can't do. That's a good
point. There are some things we can do and some things we can't. But, again, I don't
know -- if you have to do a thousand of them, that's going to be an issue. But it's not every
case that you have to do that.
Again, the other thing that's worth mentioning is that you can take our compiler and
you can use it in advance. You can specify optimizations. You can do those in
advance and just generate code. But that's not really what we've done.
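For reference, the fixed-base caching optimization described in the question above is simple to express. This is a generic integer-group sketch of the idea, not Charm or ZKPDL code: you pay once to build a table of g^(2^i), and every later exponentiation with the same base only multiplies cached entries.

    # Generic fixed-base precomputation: amortize repeated exponentiations with one base.
    class FixedBase:
        def __init__(self, g, p, bits=256):
            self.p = p
            self.table = []          # table[i] = g^(2^i) mod p, built once
            cur = g % p
            for _ in range(bits):
                self.table.append(cur)
                cur = cur * cur % p

        def exp(self, e):
            # later calls only multiply the cached entries selected by e's bits
            acc, i = 1, 0
            while e:
                if e & 1:
                    acc = acc * self.table[i] % self.p
                e >>= 1
                i += 1
            return acc

    # usage: fb = FixedBase(g, p); fb.exp(e1); fb.exp(e2); ...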
So, a couple of other benchmark module results. I just want to point these out. This
is the kind of stuff that gets spit out of our benchmark module: the number of
exponentiations and so on. This is Waters' encryption from his '09 paper: 152
exponentiations for a 50-element policy, 345 multiplications. And in the decryption
operation, 101 pairings is what we had to compute. And we have the total time that's spent -- I
believe this is total time spent with the benchmark on and with the benchmark off. So this is
the kind of output we can produce.
A few statistics before I end, just to give you an idea of how much smaller this is in terms of
code size than anything that we've produced before. All of our schemes and all of our
adapters are less than 2,000 lines of code, and I think that probably overestimates a little
bit. Our benchmark module is a few hundred lines of code. Our toolbox is less than a
thousand. So really it takes much, much less code and much, much less work, and it's
much, much more readable, to produce this kind of stuff.
So we're using it right now in these medical record research projects, where we found
that, again, the Python overhead is negligible in all of the
use cases that we've come up with. We expect to use it in all kinds of other
applications we come up with. We have a request from SRI to implement a distributed
scheme, distributed ABE, which we worked out a little while ago. And we're actually
hoping to hire a full-time developer. This is one of these tricky situations where having
grad students do work that is largely engineering may not be the best use of grad
students, so we're hoping to hire a full-time developer to support this and keep
it supported, so that other people can use it and it won't end up being one of these
orphan projects.
The last thing I will say -- and a lot of people have asked about this -- is: what is left to do here?
There's a lot. We can optimize things. We can add more schemes. But really, what I
don't have on the slide that I do want to talk about is that we're now working on adding a
formal verification component. And the first step is to make sure that things are strongly
typed and that we can derive types automatically and then have you correct them. So
that's a big thing.
We're also working on signature batch verification,
where we can take a signature verification algorithm, programmatically analyze it, apply
some rules to it, and generate a batch verifier, which gets automatically coded into Python on
the fly, so that depending on the input we take in, we can actually produce these things
and execute them. The hope is that, using this whole system as a base, we're going to
be able to do a lot of more fundamental research that we think will be useful.
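To give a flavor of the batching idea -- not the generator described above, just the standard small-exponents test such a generator would emit -- here is a sketch for BLS-style signatures under a single public key, written against a Charm-style pairing API. The exact imports and the hash call are assumptions.

    # Sketch of small-exponents batch verification for BLS-style signatures, one signer.
    # Individually: e(sig_i, g) == e(H(m_i), pk). Batched with random exponents d_i:
    #   e(prod_i sig_i^d_i, g) == e(prod_i H(m_i)^d_i, pk)   -- two pairings total.
    from functools import reduce
    from charm.toolbox.pairinggroup import PairingGroup, ZR, G1, pair  # assumed API

    def batch_verify(group, g, pk, msgs, sigs):
        # random blinding exponents (short ones in practice) defeat signatures
        # that would only cancel out in aggregate
        deltas = [group.random(ZR) for _ in sigs]
        sig_prod = reduce(lambda a, b: a * b,
                          (s ** d for s, d in zip(sigs, deltas)))
        hash_prod = reduce(lambda a, b: a * b,
                           (group.hash(m, G1) ** d for m, d in zip(msgs, deltas)))
        return pair(sig_prod, g) == pair(hash_prod, pk)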
And here is the URL if you ever want to check it out; I'll put the slides up. And that's it.
[applause].
>> Kristin Lauter: Questions?
>> Matthew Green: Yeah?
>>: [inaudible] other Charms, if you will, the ability to [inaudible] is there any way that
would be beneficial here [inaudible] having these protocols interacting in [inaudible].
>> Matthew Green: That's a really good idea. But no. It's a great idea, though. Yeah?
>>: You touched on this very briefly but [inaudible] fact was that Charm is a type of
top-level language whereas the libraries here might need to be kind of [inaudible]
something else. Right? So, you know, kind of using a glue language at this level has
some consistency with the way one [inaudible] languages so, you know, but maybe
that's okay. But to test that, you need to use this in applications that intensively use
cryptography.
>> Matthew Green: That's correct.
>>: And there I think the concern would be that going in and out of the Python runtime in
loops would be very costly.
>> Matthew Green: So that's something that we're actually working on. We're looking
at the cost of that. It's not as bad as you think it would be. If your purpose here is to
really do intensive crypto, maybe this isn't the tool for it. But if
you're trying to prototype something -- you want to prototype a server that does interactive
protocols, oblivious transfer, any kind of anonymous credential -- this is a really nice
tool. And that's also the reason we're looking at using Python-to-C
compilation, so we can eventually move that interpreter out of things.
>>: [inaudible] particular parameter choices and just use any two params.
>> Matthew Green: Sure.
>>: Especially two params.
>> Matthew Green: Yup.
>>: Where do you [inaudible].
>> Matthew Green: It depends.
>>: [inaudible].
>> Matthew Green: Typically we have an initialization routine where you can specify the
group setting you're using. Right now the schemes don't actively check what you
have -- so, for example, if you give it a bilinear group where DDH is easy, many of
these schemes become insecure. We don't actively check for that kind of thing. But
we're hoping to annotate the groups and say you can provide any group
where these assumptions are believed to hold. That's going to take a little bit of
work, but it's the kind of thing we want to do to sort of bulletproof it and make it easy to
switch from setting to setting.
>>: I have a couple of comments, and I'd be curious about your reaction to them. So I
apologize if one seems kind of positive and one seems kind of negative.
>> Matthew Green: No, no, that's okay.
>>: First of all, I want to say, I mean, this is totally awesome work; it's really great. For
example, William Stein and the SAGE community have been asking, you know, who's
going to use SAGE to build crypto and what do you want to see built
and stuff like that. So in some sense, you know, you've answered that call already.
And both of these comments are kind of around the question of incentives. So I think
it's really cool that we see this confluence of the application of the crypto with the
healthcare scenarios -- and I'm not saying that all of your funding necessarily came from
SHARP or whatever.
>> Matthew Green: Don't tell Carl.
>>: But here's a setting where there was a huge incentive to actually have these crypto
protocols fielded. And so federal funding may have come kind of roundabout to this
project, and yet now that this exists, it could
potentially have a huge impact not only in this setting but in other industries, and also a
huge public good or benefit. For example, I can think of scenarios where two companies
are negotiating over crypto protocols and it's difficult for people to, you know, release
their proprietary code, or they may want to share it, or they may
want to have some independent people doing evaluations or figuring out how to
negotiate bringing two different solutions together from different
companies -- so for interoperability, which is a huge challenge, you know, for the
application of crypto protocols.
So I see that the application of this is potentially much broader than the setting in which
it was, you know, able to kind of spring up. And it just seems like, more broadly, it
could be an argument for federal funding -- I mean, you might ask why crypto
development hasn't been federally funded before. And you might speculate that it might
have something to do with NSA or something like that.
>>: But I think this is a great argument for, you know, like --
>> Matthew Green: We would --
>>: -- this type of project.
>> Matthew Green: Yeah. I mean, this project was born of necessity. It's not
how I would be using my research time if I wasn't facing the prospect of building a
lot of cryptosystems. It's not the kind of work that's going to get you crypto
papers. But it's really important anyway. And so I think that we're just going to keep
going with it and make sure that the tool is solid and that people can use it, and hope
that the next time I give this talk you guys are saying thank you for building this thing
-- but we'll see.
>>: A little on this incentives question. One of the shifts here is that ordinarily these
research grants are driving, like, publication in conference proceedings, whereas we don't
really get much credit for publishing conference proceedings; we get credit for
influencing the --
>> Matthew Green: It's awful.
>>: So, you know, that's the kind of thing that actually looks good or that [inaudible]
assuming that what you just described actually happened. That would be, you know, a
sign of like [inaudible].
>> Matthew Green: We want to build things, I mean, we have to make them [inaudible]
happy at this point.
>>: Well, on my second comment, I made a strategic error. I should have made the
negative one first.
So on the negative side, I mean, one thing that I am kind of fascinated by is that in the
industry what you see is this huge lag between proposals of, you know, cryptographic
protocols and systems and adoption, right? Like 15 years
minimum. Some of it is accounted for by, you know, inertia, but also by a lack of trust in new
assumptions and giving the community time to whack away at crazy assumptions that
are being introduced all over the place.
>> Matthew Green: Sure.
>>: And so one danger that I actually see here is that with the ability to quickly -- so
Brent publishes a paper and tomorrow he can have an implementation of it. Somebody
else can grab this and start using it. And that, I see as an actual real danger --
there was actually something very sensible about having
this huge [inaudible] lag in adoption.
>> Matthew Green: So I agree that maybe a year or two lag. Maybe that's a comment
more on the assumptions that we're using.
>>: Yeah, it is.
>> Matthew Green: Than on the [inaudible] itself. I think a year or two lag would make
sense. But I really think that it would be nice to see the open source community start to
use these things. I mean, IBE has been out now for ten years. It's used by one
company. And the reason for that is not entirely because Voltage owns the patent; it's
because nobody outside of this community has access to code or the knowledge to use
it. I'd like to see that change. And maybe there will be some conflicting forces
there.
>>: [inaudible].
>> Matthew Green: Some of it's patents. But, I mean, what's stopping some open
source project from -- what's stopping, you know, Bitcoin or some open source project
like that from using IBE? Are they going to get sued? I think the reason is that people
truly just don't know how to use the technology. I'd like to -- I'd like to change that.
>>: Well, and if the technology gets used then more hackers can hack and whack
[inaudible].
>>: So in terms of design rationale [inaudible] Python [inaudible] about trying to find a
language that cryptographers would use, or the developers would use, or the grad
students who have been [inaudible] for stipends to use?
>> Matthew Green: I have been meaning to rename this a programmer-focused
framework for application development, because everything about this has been from
my point of view of, you know, every single tool should be based on what
makes it easy to develop things. And we looked at a lot of languages. Yehuda Lindell
is using Java. I think that's great, but I think that Java is the wrong choice. I think that
things need to be as simple as possible to make them [inaudible]. That's why we chose
Python. We looked at Lua, too. That was not a good idea.
>>: [inaudible] second question, obviously [inaudible] adoption [inaudible] one of the
schemes ends up being compromised, development that doesn't work [inaudible] capable
of just switching that configuration or something, versus, well, great, now Python doesn't
work [inaudible].
>> Matthew Green: I think that adapters might be a way to go there. Because if you
need to switch out this signature scheme for another one, right now that's awful and
painful. But here you should literally be able to change one term in your code to another
signature scheme. It may not be as efficient, but the code will function.
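Roughly what that means, as a generic sketch rather than Charm's actual class names: as long as both schemes expose the same keygen/sign/verify interface, the "one term" you change is just which class you bind. The two toy schemes below are placeholders, not real signatures.

    # Generic sketch: two signature schemes behind one interface, swapped by one name.
    import hashlib, secrets

    class SchemeA:                       # placeholder scheme #1
        def keygen(self):
            sk = secrets.token_bytes(32)
            return sk, hashlib.sha256(sk).digest()
        def sign(self, sk, msg):
            return hashlib.sha256(sk + msg).digest()
        def verify(self, pk, msg, sig):
            return len(sig) == 32        # toy check only, stands in for a real equation

    class SchemeB:                       # placeholder scheme #2, same interface
        def keygen(self):
            sk = secrets.token_bytes(32)
            return sk, hashlib.blake2b(sk).digest()
        def sign(self, sk, msg):
            return hashlib.blake2b(sk + msg).digest()
        def verify(self, pk, msg, sig):
            return len(sig) == 64        # toy check only

    Signature = SchemeA                  # <- the one term you change to swap schemes

    sig_scheme = Signature()
    sk, pk = sig_scheme.keygen()
    sig = sig_scheme.sign(sk, b"hello")
    assert sig_scheme.verify(pk, b"hello", sig)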
>> Kristin Lauter: Any further questions?
>>: [inaudible] [laughter].
>> Matthew Green: Thanks.
[applause]