>> Nishanth Chandran: All right. So it's our pleasure to have Tom Ristenpart
speaking today. Tom is an assistant professor at University of Wisconsin.
He's done a lot of great work in cryptography and security, looking at the use
of randomness in cryptography, hash functions, format preserving encryption and
all sorts of things. And in security, a lot of great work in cloud security, in particular looking at vulnerabilities in EC2. Today he'll be talking about cryptographic theory.
>> Thomas Ristenpart: Thanks for the introduction. So everyone done reading?
Is everyone enjoying the quote? Whether you agree or disagree -- or if you
actually wrote the quote, that would be awesome, actually. You can step
forward at the end of the talk. I'd love to hear who you were and have a
discussion.
So this talk today will be about cryptography, particularly this kind of loose viewpoint of what I'm calling practice-driven cryptographic theory. This is joint work with lots of fine people, so I won't read their names, but recognize that they did all the hard work, of course, and I'm just taking the credit.
So we hear a lot about provable-security crypto and doing formal analysis, having lots of rigor, and the usual story we hear around that is: we develop theory, and this educates how we should implement and deploy cryptographic algorithms. Or at least that's what you hear in the academic community.
But what we know in practice is that we have lots of cryptographic standards, particularly, that have vulnerabilities in them. So there's a bit of a disconnect between the powerful tools we have in theory to avoid vulnerabilities and the reality of a relatively poor track record of completely secure standards.
And these aren't just implementation bugs. There are plenty of implementation bugs out there. We haven't solved the problem of [indiscernible] bugs, but these are problems in the cryptographic designs as well. And there's lots of explanations for this about how standards are developed. I think most importantly, in terms of how it relates to theory, we often find that standards were developed without knowledge of applicable theory that might help, but equally often that these standards were developed before any suitable theory existed. And this, I think, is really a comment addressed to theoreticians
who, perhaps, think theory is all there and ready and happily will help us
solve these types of problems.
So we definitely have this -- this is my favorite picture, particularly in light of the Olympics going on right now. We have a lot of pressure on our athletes to not repeat this. And we might think that in the crypto community, we also have a little bit of a baton-dropping situation. You have theorists on the right trying to hand off some information or knowledge about how to design crypto to practitioners on the left, and something gets dropped.
>>: He's got a baton.
>> Thomas Ristenpart: Uh-oh, baton.
>>: He's got a baton? No baton. No baton. No baton. What are the green --
>> Thomas Ristenpart: Oh, oh. So this guy is -- I think NBC or the Olympic
commission copyrights all the high res pictures of this, so the only ones I can
steal from free for online -- I shouldn't say that, we're being recorded.
>>: That's malware.
>> Thomas Ristenpart: I think this is the guy who won the race, actually. So maybe that's like the static analysis community doing better than cryptographers. There's a baton being dropped.
So today, I'm going to talk about trying to change the viewpoint a little bit. Instead of thinking about theory educating standards, let's look at it the completely opposite way and think about how we might derive new theory surrounding, in particular, cryptographic standards. So we'll take standards as ground truth and use theory to understand better what they're giving us or what they're not giving us.
So in particular, I'll basically apply this very high-level approach to a bunch of different projects, and we'll talk about each project in turn. The idea is that we're going to first start by looking at standards and try to understand from them what are the often implicit security goals. And then match them to existing formal definitions from cryptographic theory, or provide new definitions, since we'll often see we don't have appropriate formal definitions of security for these types of tasks.
And then use our theory as a post facto tool to analyze the cryptographic schemes in these standards. This will help us clarify the security posture of these systems, either by helping us suss out vulnerabilities that were previously not known, or by providing proofs that can help us improve confidence in the cryptographic schemes.
Okay. So with that viewpoint, I'm going to talk through a few recent works that follow this regime. One is a bit older: the TLS 1.2 protocol, which I talked about a bit last year. I'm going to talk about it again for those of you who haven't seen it. And then I'll talk about newer work that will be appearing soon on password-based cryptography and the HMAC construction.
Okay. So let's just get right into it with TLS, and kind of skip over some of the basics. TLS is used everywhere to protect web communications. And it consists of two main cryptographic portions. One is the key exchange protocol, used to derive a shared secret, and that secret is used to send data via the TLS record protocol, which is aimed at providing confidentiality and authenticity for communications.
So we'll be focusing on just the record protocol portion of this and not concerning ourselves further with the key exchange. So inside the record protocol is a particular cryptographic encryption scheme that we call MAC-Encode-Encrypt, also sometimes called the MAC-then-encrypt algorithm. And the idea is that we're going to provide an all-in-one solution via two parts. First we're going to take our payload data and authenticate it with a message authentication code, and then we're going to encrypt the result of concatenating the payload with the tag and some padding if needed, using an encryption mechanism, and then send the result along the public channel.
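To make that structure concrete, here's a minimal Python sketch of the MAC-Encode-Encrypt composition just described. The function names and the choice of HMAC-SHA256 are my own for illustration; the encryption step is a placeholder argument (assumed to handle padding), and real TLS also involves sequence numbers, headers, and so on.

```python
import hmac
import hashlib

def mee_encrypt(mac_key: bytes, encrypt, payload: bytes) -> bytes:
    """MAC-Encode-Encrypt sketch: MAC the payload, append the tag,
    then hand payload || tag to the encryption step (which pads)."""
    tag = hmac.new(mac_key, payload, hashlib.sha256).digest()
    return encrypt(payload + tag)

def mee_decrypt(mac_key: bytes, decrypt, ciphertext: bytes) -> bytes:
    """Decrypt, split off the 32-byte tag, and verify it."""
    plaintext = decrypt(ciphertext)
    payload, tag = plaintext[:-32], plaintext[-32:]
    expected = hmac.new(mac_key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("MAC verification failed")
    return payload
```

For example, with the identity function standing in for encryption, a round trip returns the original payload, and any tampering makes verification fail.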
So there's lots of different algorithms [indiscernible] that are used inside TLS. We're just going to look particularly at the CBC mode algorithms underlying this MAC-Encode-Encrypt, okay? So what happens there? Well, CBC mode within TLS works as follows. We take the payload and the MAC tag that we computed already, and then we have to pad out the resulting string in order to get a bit string that's a multiple of the block size that's used inside CBC mode. So if we're using AES, this is 128 bits. And so we need to have a multiple of 128 bits, and then we use standard CBC mode on the result.
So the most recent versions of the standard use explicit initialization vectors. This is where you choose C0 randomly each time you encrypt a new message and send it across. This is not what they did in prior versions of the standard. Before, they used the last ciphertext block of the previous encryption.
Okay. And then of relevance to our results, particularly, are going to be the padding choices that can be used in TLS. Basically it's this regime where you pad out with bytes, and the byte value that you use is indicative of how many bytes of padding minus one that you have.
So if you have to pad out by one byte, you use 00. If you pad out by two bytes, you use 0101. Three bytes, 020202. The idea being, of course, from a practical perspective, it's nice, right, because you just read the last byte; it tells you how many more bytes you have to remove when you're decrypting.
And moreover, TLS allows extra padding. So you don't have to do just minimal-length padding to get out to the next block-length boundary; you can also pad out further. And the reason they do that, as indicated in the standard, is that this is a countermeasure thought to help prevent traffic analysis attacks that analyze the lengths of particular messages. So by padding out extra amounts, you can hide how long the message was. Or at least that's the idea.
So to summarize on the padding choices, there's two that the standard supports. One is minimal-length padding: you just fill up enough to get to the next block boundary. Or you can have extra padding, with extra blocks of padding beyond what would be minimally needed.
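The two padding choices can be sketched as follows. `tls_pad` and `tls_unpad` are hypothetical helper names of mine, and this ignores the rest of the record format; `extra_blocks > 0` gives the non-minimal variant.

```python
def tls_pad(data: bytes, block_len: int = 16, extra_blocks: int = 0) -> bytes:
    """TLS CBC padding: append p bytes, each of value p - 1.
    extra_blocks > 0 gives the optional non-minimal padding."""
    p = block_len - (len(data) % block_len) + extra_blocks * block_len
    if p > 256:
        raise ValueError("TLS padding cannot exceed 256 bytes")
    return data + bytes([p - 1]) * p

def tls_unpad(padded: bytes) -> bytes:
    """Read the last byte to learn how many bytes to strip, and
    check that all padding bytes carry that same value."""
    p = padded[-1] + 1
    if len(padded) < p or padded[-p:] != bytes([padded[-1]]) * p:
        raise ValueError("bad padding")
    return padded[:-p]
```

So padding 15 bytes up to a 16-byte boundary appends a single 00 byte, and one extra block of padding on a 5-byte input yields a 32-byte padded string.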
And most implementations use minimal-length padding, but some do use extra padding. GnuTLS by default chooses a random amount of extra padding for each message to, again, try to thwart traffic analysis attacks.
So the first thing we noticed when we sat down to look at TLS was that this length-hiding goal that is explicit inside the TLS specification isn't well addressed by existing definitions. So we rectified that. This was technically straightforward: we took existing definitions and extended them to a length-hiding setting that seeks to hide not just the content of messages from adversaries, but also the lengths.
And then given that tool, we wanted to do an analysis of the TLS record layer. In this first work, which is the older stuff that I've already talked about and will explain again, we point out that there actually exists an attack in the case that you use truncated MACs with TLS MAC-Encode-Encrypt and this variable-length padding.
Then we show that if you don't meet the preconditions of this attack -- basically at least one short message and a short tag with variable-length padding -- then everywhere else, we get positive results about security. So I'll mostly talk about the attack. It's cute and simple.
And then more recently, this somewhat prompted us to try to understand the benefits of this type of padding-based approach, and we recently had a paper discussing traffic analysis attacks against these types of countermeasures. I'll just briefly mention the results about that and maybe pique your curiosity to go look at the more extensive results in the paper.
So the very short take-away is that the tag size matters when using MAC-Encode-Encrypt, not just here in TLS, but if you use it elsewhere. And if you have a setting like this, where the tag is short, you have a message that's short, and some of the padding ends up in the first block, then there can be distinguishing attacks. And otherwise, you are okay. There's a pretty thin line between security and insecurity.
Okay. So just to give the attack setting more concretely. We have an attacker, the man in the middle. We have a client on the left and a server on the right. And the client's going to encrypt some message X; let's say it's either yes or no, just for concreteness. Note that these are two different-length messages, if you assume that they're encoded as a byte per letter.
And the goal of the attacker now is to determine which of the two messages was encrypted. So he gets this ciphertext, and what he's going to do is maul the ciphertext. He's going to modify it, forward the result of mauling it on to the server, and then look at how the server responds to see whether it was yes or no. It's a classic active man-in-the-middle attack.
So let's set some numbers. We have a tag length of t = 80. This is a classic choice for truncated tags. A block length of 128, which corresponds to AES. And now I'll just draw out the two cases we have for C, based on whether it's yes or no being encrypted. And we'll have one extra block of padding, which is -- I assume we're trying to hide lengths beyond the first n bits.
So if no is encrypted, then we end up with encryption and padding that looks like this. We have the two-byte no, the 80-bit tag, and then we have four bytes of padding filling out that block, and 16 bytes of padding in the following block. So that's in hex, I guess. So yeah?
>>: What do you mean by hiding the tag, the length? You're hiding just one block.
>> Thomas Ristenpart: So if you extend out another n bits, in theory, if it's a secure scheme, the adversary shouldn't be able to tell whether you have encrypted a message of any length between, like, one byte all the way up to 2n minus one bytes. So all of those should look the same to the adversary, because he'll just see three random-looking blocks of ciphertext.
So if you choose to do this, and one of the messages you end up encrypting is just two bytes, then this is the result you would have. Does that answer your question?
>>: I think I'm just looking for a simpler answer. Are you just trying to hide how many bytes in the last block are message and not message?
>> Thomas Ristenpart: Yeah, so the adversary shouldn't be able to tell whether
there's, like, you know, 20 bytes of padding or one byte of padding, right.
Yeah, because it's all encrypted by CBC mode.
Okay. So then the other setting is if we have a yes message being encrypted, and that means we have one less block of padding -- byte of padding, excuse me. So you subtract 1 from 13 and get 12, which makes the slide generation easy.
So we have these two situations. It could have been one of these two situations, and the adversary wants to figure out which. So the attacker gets C0, C1, C2, and he's going to apply a mauling procedure to it that will do two different things, based on which of the two messages was actually encrypted.
So here's what the attacker does. He takes the first block that he gets, C0, and he XORs into it four copies of the hex byte 10 -- that means four bytes, each with the fifth bit set to one -- and the rest all zeroes. And we'll see what that does in a second. And then he basically deletes the last block, doesn't care about it, and forms a new ciphertext C0', C1 and sends that on.
So in the first case, we've thrown away the last block. We've flipped four bits in C0, and that ends up flipping four bits of padding when the recipient does decryption. So, in fact, this will decrypt just fine on the receiver's end. The MAC will check out, the padding is correct, and the server will be happy with it and continue on with its communications.
Now, in the case that the message was yes, what happens? We've thrown away C2. We've flipped four bits here. That changes the padding to be the correct values here, okay, but we've also flipped one bit of the tag, and -- the tag's a deterministic function, so it's not going to verify, and the server will reject this decryption as invalid.
And then particularly, when the [indiscernible], the connection will get torn down. And so the attacker will see whether it was no or yes.
So this attack works particularly because you can use the IV to flip bits in that padding, but that only works if some padding lands inside the first block of CBC encryption. Fortunately, with the cipher suites that are chosen within TLS, that doesn't work, because the MACs are always larger than one block in length. If you use truncated MACs, which is a common thing to do, as indicated in RFC 6066 for TLS, then the attack could apply. Fortunately, we looked around quite a bit to make sure nothing was actually vulnerable in practice right now, and it looks like that's the case for the moment, so no one's using this particular combination that's vulnerable.
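The bit-flipping property the attack relies on -- XOR a delta into C0 (here playing the role of the IV), and exactly those bits flip in the decrypted first block -- can be demonstrated with a toy CBC implementation. The "block cipher" below is just XOR with a key, which is of course completely insecure, but it has the same CBC wiring, which is all the demonstration needs; this is an illustration of CBC malleability, not the full TLS attack.

```python
import os

BLOCK = 16
key = os.urandom(BLOCK)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Toy involutive "block cipher": XOR with the key. Insecure, but CBC-shaped.
E = D = lambda block: xor(block, key)

def cbc_encrypt(iv, blocks):
    out, prev = [], iv
    for p in blocks:
        c = E(xor(p, prev))   # C_i = E(P_i XOR C_{i-1})
        out.append(c)
        prev = c
    return out

def cbc_decrypt(iv, blocks):
    out, prev = [], iv
    for c in blocks:
        out.append(xor(D(c), prev))   # P_i = D(C_i) XOR C_{i-1}
        prev = c
    return out

iv = os.urandom(BLOCK)
block0 = b"no" + b"\x03" * 14          # toy first block: short message plus padding
ct = cbc_encrypt(iv, [block0])

# Maul: XOR a delta into the IV; exactly those bits flip in the
# decryption of the first block, as in the attack above.
delta = b"\x10" * 4 + b"\x00" * 12
mauled = cbc_decrypt(xor(iv, delta), ct)[0]
assert mauled == xor(block0, delta)
```

The final assertion is the whole point: the attacker controls, bit for bit, the change to the first plaintext block without knowing the key.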
Of course, that leaves the question of what happens with all these other cases.
I won't go into the details, but we analyzed the security and we show that,
okay, if you avoid this one corner case, you basically can show security.
So as I said before, this prompted us -- we were kind of curious -- to try to understand how far this length-hiding security actually takes you in practice.
And in particular, there's been this long line of work on a particular traffic analysis setting that TLS, in theory, should be able to help protect against. The idea is you have an attacker who is again a man in the middle, but this time passive, and is going to be viewing ciphertexts being sent back and forth between some user and an HTTPS proxy. That means you're going to funnel your website requests over a TLS tunnel, so the IP address and everything about the destination website is hidden, and the proxy is going to forward on your requests to some server.
And the idea is that the attacker shouldn't be able to figure out what website you're visiting. If you're using Tor or any of these privacy-enhancing technologies, this is an attack setting of interest.
And so this is called website fingerprinting, because the attacker wants to fingerprint what the destination website looks like in terms of ciphertexts flowing back and forth, and then later figure out if this is the same website that you're visiting.
And the usual thing in [indiscernible] is to look at so-called closed-world settings, where the attacker knows the set of possible websites that you're going to visit. The state of the art is to use supervised machine learning algorithms trained on labeled data. So, you know, you set up your own system to visit websites, train a machine learning algorithm to classify whether it's website one or two or three, and then, in theory, you can take that machinery and apply it in an active attack setting.
So there's been a lot of work on this, as I said. And we couldn't really figure out what the conclusions of this work were, particularly because most of the works had been in a setting where there were no countermeasures, like the length-hiding authenticated encryption being used in GnuTLS. So we wanted to understand how much these countermeasures, like the LHAE countermeasure in TLS, can help.
So we had a poor student go and implement all sorts of different attacks from the literature, including some new ones, simulate a variety of these countermeasures, including things like the TLS countermeasure, gather some datasets that had been used in the literature previously, and then, for all these variables, run extensive tests to see how well these attacks can actually work.
And the results were, we thought, pretty pessimistic. So this chart shows the average accuracy, taken over a bunch of runs, of figuring out which of K websites a particular user visited when using a variety of countermeasures. "None" is just raw encryption without any length-hiding or padding mechanisms. And there's a lot of other ones, like pad-to-MTU, session random 255, packet random -- this corresponds to what you can do in TLS, like what GnuTLS does -- and a bunch of other ones, including more advanced things that I won't explain, like traffic morphing.
And what we found is that you basically can set up a very simple machine
learning algorithm -- yes?
>>: Why is the exponential countermeasure worse than no countermeasure at all?
Isn't that what it's saying?
>> Thomas Ristenpart: Where is this?
>>: Say the exponential thing.
>> Thomas Ristenpart: Yeah, I mean, so it's not clear that it's a linear thing, right. The countermeasure might actually increase -- so let me explain the classifier, and then I'll get right back to your question.
So what we show is a simple naive Bayes classifier that really just took a few features from the traffic. Instead of looking at individual ciphertext lengths, what it did is look at much coarser information, like the total bandwidth used in either direction by the connection, the size of bursts and the number of them, and the total time for the connection. That's a bit of a difference from what the prior works had been doing; they'd been really focusing on individual ciphertext lengths as the important feature.
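As a rough illustration, coarse features of the kind just described -- total bandwidth per direction, bursts, total time -- could be extracted from a packet trace like this. The trace format (time, direction, size) and the feature names are my assumptions for the sketch, not the paper's actual code.

```python
def coarse_features(trace):
    """trace: list of (time, direction, size); direction +1 = upstream,
    -1 = downstream. Returns coarse traffic features: total bytes each
    way, connection duration, and burst statistics."""
    up = sum(s for _, d, s in trace if d > 0)
    down = sum(s for _, d, s in trace if d < 0)
    total_time = trace[-1][0] - trace[0][0] if trace else 0.0
    # A burst is a maximal run of packets in the same direction.
    bursts, cur_dir, cur = [], 0, 0
    for _, d, s in trace:
        if d == cur_dir:
            cur += s
        else:
            if cur_dir != 0:
                bursts.append(cur)
            cur_dir, cur = d, s
    if cur_dir != 0:
        bursts.append(cur)
    return {"up": up, "down": down, "time": total_time,
            "num_bursts": len(bursts), "max_burst": max(bursts, default=0)}
```

The point of features this coarse is that per-packet padding barely moves them, which is one intuition for why the padding countermeasures help so little here.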
And so this is what this VNG++ classifier did. And to get back to your question before I give you the closing thing here -- the punch line is that these countermeasures, like pad-to-MTU -- which one were you looking at? Exponential -- can actually exaggerate some of these features for us. Because you're padding, it might separate better the total bandwidth used by one website compared to another website, as compared to using no countermeasure at all.
That's a rare case. Generally speaking, the countermeasures do a bit better than no countermeasure. But it's not necessarily always the case. Does that answer your question?
>>: [inaudible] I thought it was the other way around. Isn't higher better in this graph?
>> Thomas Ristenpart: For the attacker, yeah. So the attacker's doing better
against exponential over here than none, yeah. So it's a great question. And
the answer is that, you know, just for these features, it turned out that that
actually exacerbates the phenomenon.
Okay. So the point being that length hiding can be helpful in settings where you're really worried about these fine-grained lengths being revelatory, like yes versus no, but it seemed to be less helpful, maybe not helpful at all, in settings such as website fingerprinting. And we did a lot more analysis in the paper. It was a survey paper looking at all these things and doing new analysis of all these various algorithms.
Okay. So that was looking at TLS 1.2. We gave new definitions I didn't talk about, but they're in the paper; a new attack that wasn't known before; and also, I guess, this traffic analysis stuff, which I didn't explain in too much detail, but you can go check out the paper.
So now I'll move on to password-based cryptography, unless there are further questions about TLS. Okay. So PKCS#5 is basically the standard that helps us answer the following question. We have an encryption mechanism, like CBC mode, that expects an encryption key that's uniformly selected at random from some space of bit strings. So if you're using AES, you expect a random 128-bit key. In practice, of course, we'd often like to encrypt our data with a human-memorizable password, right? Like mine's 12345. Clearly not a uniformly selected 128-bit string.
So how do we do it? PKCS#5 is a standard that was developed and published in '96 -- so it's been around for a long time -- that speaks to how to deal with this. And the solution is very natural. It's to use what's called a password-based key derivation function, or PBKDF. So the idea is we'll take our password, 12345, and we'll choose a random salt, which is a uniform but public bit string of some length, say 64 bits or 128 bits. We'll concatenate those together and then hash the result with a cryptographic hash function like SHA-1 or, nowadays, SHA-256.
Then we'll take that output and we'll hash it again, and we'll hash it again
and again and again. We'll do that C times to get a bit string and then we're
going to use this as our key for CBC mode. Of course, if it's a little bit
longer, we can truncate K if we need to.
Okay. So the algorithm in pseudo code is as such. You've got a password, you get a message, you choose a salt, you hash the password concatenated with the salt C times, you encrypt the message with whatever your underlying encryption mechanism is, and you output both the salt and the ciphertext as the resulting ciphertext.
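That pseudo code might look like the following sketch, using the iterated-hash PBKDF as described in the talk, with the underlying encryption mechanism passed in as a placeholder. Note that the PBKDF2 function standardized in PKCS#5 v2.0 is actually HMAC-based and is available in Python's standard library as `hashlib.pbkdf2_hmac`; the version below is just the talk's simplified construction.

```python
import hashlib
import os

def pbkdf(password: bytes, salt: bytes, c: int, key_len: int = 16) -> bytes:
    """Iterated-hash PBKDF sketch: hash password || salt, then re-hash
    the result c - 1 more times, truncating to the desired key length."""
    h = hashlib.sha256(password + salt).digest()
    for _ in range(c - 1):
        h = hashlib.sha256(h).digest()
    return h[:key_len]

def pb_encrypt(password: bytes, message: bytes, encrypt, c: int = 10000):
    """Choose a random public salt, derive the key, encrypt, and output
    (salt, ciphertext) -- the salt travels in the clear."""
    salt = os.urandom(8)
    key = pbkdf(password, salt, c)
    return salt, encrypt(key, message)
```

Derivation is deterministic given the password and salt, different salts yield different keys, and the iteration count c is the knob that slows down brute force.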
So this, or variants of it, is used really widely in tools like WinZip and OpenOffice; even WiFi WPA uses one of the PBKDF functions from PKCS#5 to derive session keys. And despite that, there hasn't been much analysis of this standard. There is one paper from a few years ago which did a little bit of work, but it wasn't very satisfying. And otherwise, there hasn't been much to say. So we wanted to get a better understanding of it.
So the first thing we realized -- and I'll explain this more -- is that, again, we had no theoretical notions that were really satisfying in terms of measuring the efficacy of the mechanisms underlying PBKDFs. In particular, we needed a new framework for defining what we're calling multi-instance security notions, which measure the hardness of breaking many instances of cryptographic objects in parallel.
And then we analyzed PKCS#5 using this new framework and give some proofs of security amplification -- we basically prove what you would expect to be able to prove. Nevertheless, we're the first to offer such proofs, and we have what I think are some nice simulation-based techniques that we think will be applicable beyond just password-based encryption.
So the first question is really: why do we need new theory? It seems like this is a pretty straightforward thing to analyze. It's encryption. We have lots of encryption definitions. So why can't we just use one of them?
Well, let's see. I'll give you the most obvious approach to defining security for a password-based encryption scheme, and this is basically the classic indistinguishability under chosen-plaintext attack notion. So let's fix a set D. That's the set of possible passwords. It's not all 128-bit strings anymore, but it could be some set of possible passwords that people choose.
And we're going to define security this way. We have an attacker that's going to play a game with a challenger, which is the gray box on the right, and the attacker queries two messages, M0, M1, of his choice. What the challenger does is pick a password -- for simplicity in this talk, let's assume it's uniformly chosen from this set. He also picks a random bit b, and then he encrypts one of the two messages, based on the bit b, under the password using our password-based encryption mechanism, and returns the result.
Okay. And the adversary wants to figure out which of the two messages it is. So he guesses a bit b' and wins if he got it right. So this is nothing exciting if you're a cryptographer; maybe it's super exciting if you're not. This is the classic IND-CPA definition, just with passwords as opposed to keys. And of course, this is just the single-query version for simplicity, but it's easy to extend to multiple tries by the adversary.
Okay. So under this notion -- and this is a corollary of our later results -- you can prove the security of PB-encrypt using this hash-based construction. In particular, you get a statement in the random oracle model, assuming H is ideal, that says basically that the advantage of any attacker in this game is bounded by q/(cN), where N is the size of the dictionary of passwords, c is, again, the iteration count of the hashes, and q is the number of hash queries the attacker can make.
Okay. And this is what we would expect, right, because this matches a brute-force attack that works in time close to cN. What do you do? You just try all of the passwords: you hash each password c times, and you try decrypting the ciphertext to see which of the two messages it was.
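That brute-force attack might be sketched like this, with `try_key` standing in for trial decryption of the challenge ciphertext; the helper names are mine. The inner derivation costs c hashes, so trying the whole dictionary costs about cN hash calls, matching the bound.

```python
import hashlib

def derive(password: bytes, salt: bytes, c: int) -> bytes:
    """Same iterated-hash key derivation as the scheme under attack."""
    h = hashlib.sha256(password + salt).digest()
    for _ in range(c - 1):
        h = hashlib.sha256(h).digest()
    return h[:16]

def brute_force(salt: bytes, c: int, dictionary, try_key):
    """Try every candidate password: c hashes each, ~ c * N total."""
    for pw in dictionary:
        key = derive(pw, salt, c)
        if try_key(key):   # e.g. trial-decrypt and check plausibility
            return pw
    return None
```

With N around 2^31 and c = 10,000, this is roughly 2^44 hash computations, the number quoted below.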
So that's good. The bound is tight with the known attack, and so this seems to be the best we can do. So there's two issues that remain. One is the kind of obvious thing that everyone knows about passwords, which is that passwords suck and N is small. That's fundamental in this setting. But even if we have, you know, N approximately 2 to the 31, we can set c pretty large. A common choice is 10,000. We can't set it too large, because it might slow down functionality.
So something like 10,000 is what's used in WPA, I think, for example. But still, this means -- and our proof shows -- that the best attack is the brute-force attack, which requires time about 2 to the 44, which by today's standards is pretty low. So this is a fundamental limitation of passwords, and we're not going to hope to do better than that.
But there is something else that's kind of suspicious about this whole analysis, which is that we never said anything about salts. Remember, a slide ago, I said there's a salt being chosen randomly, s bits, but you notice nothing here about s. Nothing about salts arises in the theorem statement.
So from the point of view of IND-CPA security, salting doesn't matter. Which stands in stark contrast to the fact that we all know well that salting matters a lot in practice, and so let me explain why.
And the reason is that salting helps quite a bit in a multi-instance setting. There's two reasons salting is very important; I'll talk about one of them. One is that precomputation attacks become harder if you don't know the salt a priori. But we're giving the salt to the adversary right away, so he gets it right away. So we'll talk about the other issue, which is that salting helps against what we're calling multi-instance attacks.
Now let's think about a new attack model in which the attacker gets access to M different challenges. Say he's getting C1 through CM, which are ciphertexts of independent messages M1 through MM, encrypted under independent passwords chosen from this set D. And his job is to recover the messages. So what do we know? We know the best known attacks are as follows. With salts, the best known attack takes time approximately McN, because the attack brute-forces all M passwords, and each of the M passwords takes time cN.
Without salts, there's a faster attack, which takes time just cN plus MN. Because you can use something like rainbow tables: you compute the iterated hash for each possible password once -- you can use time-space trade-offs if you want -- and reuse that work across all the ciphertexts. The point is that you save the multiplicative factor of M in the attack.
Okay. So these are the best known attacks. We'd like to prove, again, that we get what we expect, which is that breaking M encryptions requires McN time. But to prove something, we need a definition of security, and I just showed you that IND-CPA doesn't speak to this notion of security at all. So we need a new one.
>>: Breaking M encryptions -- does that mean break one of them or all of them?
>> Thomas Ristenpart: Means breaking all of them. Breaking one of them would be the traditional single-instance security setting, what we're calling single-instance security -- so, IND-CPA. But here, we're really saying you need to break all of them. And this is one benefit of salting: it gives you a second line of defense, so to speak. Even if you can brute-force one password, you may not be able to brute-force a million passwords. And we'd like to be able to show formally that that benefit is actually enjoyed.
Okay. So we need new definitions. So I'm going to walk you through defining
not multi-instance security, but two-instance security, because it fits on the
slide nicely with having two challenge boxes as opposed to many, many.
And so our first attempt at capturing this role of salting -- and forcing this McN multiplicative-factor improvement in security -- was to use what's called the multi-user setting, which is from Bellare et al. Now we have two boxes with two different passwords, and the attacker can query both of them with message pairs, and he gets back from either one the same left-or-right message encrypted under that particular password.
And his goal is to guess what the bit was. So this is defining
indistinguishability when you have access to multiple different instances of
the cryptographic primitive.
So this doesn't work for us, okay, because actually the best attack, again,
here works in just CN time as opposed to 2 CN time, because as soon as you
brute force one of the passwords, you learn the bit that was used for both
boxes.
So that doesn't help us. So the next natural thing is that, okay, we need
independence completely between these games so that they're totally separate.
So what we do is have an independent challenge bit as well as an independent password for each box. And the attacker again gets to interact with both boxes. Now, this becomes what I guess people in the hardness-amplification literature would call a direct product game: you have two independent copies of the game that the adversary gets to interact with.
And so the question now is how do we gauge success -- there are two challenge bits. The most obvious choice, which could be fine, is that the adversary tries to guess both of the bits. So it's like an AND measure: he has to win by guessing both, and we subtract off the trivial probability of winning, which is a quarter here, since there are two random bits.
We find this not particularly convenient to work with. Mathematically, it's a fine definition, but it doesn't really capture breaking both instances, because as soon as you break one, you can just guess the other bit and you get constant advantage, like one quarter -- where we would expect the advantage to be zero, because you haven't really beaten both instances. You've just guessed against one of them.
So we can solve that by changing the advantage measure a little bit to make it more convenient to work with, which is to use an XOR. This basically gets rid of the weirdness of having constant advantage when you've only broken one of the instances, so that goes to zero. And this is basically our suggestion -- we think this is the nice one to work with, it's called the XOR measure -- and you can, of course, extend it to multiple instances. And actually, this is just for encryption, but you can do it for lots of other settings, and in the paper we talk -- yeah, is there a question?
>>: No, question. So doesn't the Bellare paper discuss why they chose the
same bit? [inaudible].
>> Thomas Ristenpart: Right. So I should have said that this was not a suggestion for our problem. This is just something that seemed like it could be useful in our setting. The multi-user setting was suggested for a different reason, which was to look at efficiency improvements for, like, public-key encryption schemes.
And you're absolutely right. There's something about whether you have a bit that's joint across the games or independent, which dictates something about what you're learning from each of the individual encryptions. If there's a joint bit, the information you're potentially learning from the boxes is correlated across the boxes. If there are independent bits, it's like there's independent information you're learning from the two different challenges.

So to measure the hardness of this type of multi-instance task, where you really have to break M things, you really need to have the independence. Otherwise, there's nothing to do for the M plus first instance if you already know what the message was. There's nothing left to do, so you don't have to break the password. It's a great question.
>>: [inaudible].

>> Thomas Ristenpart: Right.

>>: You're looking at --
>> Thomas Ristenpart: Right. And another point of clarification, which this is a great opportunity to bring up: we're not saying that breaking just one instance is okay, right. Obviously, that's a bad thing too. What we're trying to get at is why salting also helps with this other measure. So ideally, we would like to have that breaking any single instance is very hard. But with passwords, that's not the case. So we'd like to have this second setting where at least you can't break many instances.
>>: [inaudible].

>> Thomas Ristenpart: Is there actually?

>>: [inaudible].

>> Thomas Ristenpart: Oh, I disagree.

>>: Oh. Okay, it's gone.
>> Thomas Ristenpart: Not quite. I thought I saw the thing floating earlier.
I thought I was just freaking out. Somebody put something in my coffee this
morning.
Okay. So that's the XOR measure. There's lots more in the paper. This is a very general thing, right -- you can define multi-instance security in many places, and games like this have, you know, XOR-type results that are very related to but not the same as what we're targeting. So anyway, there's a lot to explore in this setting, and I won't talk about it much more.
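The two advantage measures just discussed can be sketched in symbols. This is my reconstruction from the discussion above -- the paper's exact normalization may differ -- for the two-instance game with independent challenge bits b_1, b_2:

```latex
% AND measure: guess both bits; subtract the trivial winning probability 1/4.
\mathrm{Adv}^{\mathrm{and}}(A) \;=\; \Pr\bigl[\,A \text{ outputs } (b_1, b_2)\,\bigr] \;-\; \tfrac{1}{4}

% XOR measure: guess the XOR of the bits; the trivial probability is 1/2.
\mathrm{Adv}^{\mathrm{xor}}(A) \;=\; 2 \cdot \Pr\bigl[\,A \text{ outputs } b_1 \oplus b_2\,\bigr] \;-\; 1
```

Note that an adversary who breaks one instance and guesses the other bit still has XOR advantage zero, matching the point made above.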
And the thing that's interesting, and maybe not surprising to people who work in amplification settings, is that things get technically very challenging from a formal proof standpoint here. So, for example, one thing that surprised us is that the very classic technique of the hybrid argument, which is very easy to use in the single-instance setting, becomes very messy in the multi-instance setting. If that was meaningful to you, you can ask me more about it after the talk.
Okay. So back to the practically relevant thing, which is how to apply this to PKCS#5. Well, we go on to give an indifferentiability-styled notion, which I won't talk about, that helps us with some of the proofs, and we're able to show that, indeed, we get the type of phenomenon in this multi-instance setting that we expect. So we've gone from this Q over CN to this Q over MCN, which matches the attack that we know works, right, in MCN time when you have salts that are sufficiently long.

So as long as your salts provide independence between all the instances, then the attacker really does need to brute-force all M passwords.
Okay. So any questions? Yeah.

>>: Is there a way of getting rid of the random [inaudible]?
>> Thomas Ristenpart: It's a great question. It's hard to say it's inherent, but it would require a huge -- I mean, basically, this whole setting is in a place where you can think of N here as a flat set, so it's like 2 to the min-entropy, right. The passwords have low min-entropy, so we're talking about something very small. Which means that you can't use the LHL or any of the standard-model techniques we know. They just won't work, and provably won't work for any existing known universal hashing constructions.
So that doesn't work. The other thing is the times-C amplification, which we don't know how to prove if you go into the standard model. We can show it here, and it's not trivial, because you kind of have M different chains of random oracle queries that the adversary needs to fill out, and you need to show they don't collide and overlap and do weird stuff. We definitely wouldn't be able to show anything like that in the standard model.
And, yeah, likewise, I think the M factor is also something that would be hard. So it's a great question. Basically, I think at this point, for low-entropy inputs, for which you're trying to do things like extraction, we have no idea how to do anything in the standard model. Hugo had some nice work on this, kind of pushing the boundary in this direction, but I don't think it yet says much about this. But maybe he'll have more to say about it soon.
And in practice, this is matching what we expect the hash functions are giving
us.
Any other questions? Okay. So -- yeah?

>>: You can have related passwords? Is there a difference?
>> Thomas Ristenpart: So yeah, that's glossed over here. In terms of related passwords, we don't say much about this set N. So actually, this is a simplification: in the paper, we don't even fix any particular distribution over the passwords. They can be totally related. And what we show via this PBKDF indifferentiability style is that you basically extract all of the entropy from the password space. So you wouldn't get a nice, closed-form analysis here, but you show that even if you have related passwords, you get all of the security. So if you have one guy whose password is, like, 12345, and the next guy has 123456, where that six was chosen uniformly from zero to nine, then we get, you know, whatever entropy there is: the first thing, which is zero, plus the one over ten.
Anyway, so the point is, yes, the answer is yes. Yeah. And it's kind of one
of the benefits of doing this simulation style approach, which it helps with
these types of things, yeah.
>>: Are there two versions of the PBKDF function? Does this result apply to both or one?
>> Thomas Ristenpart: I'll mention that in the next -- maybe I won't, actually. So the other one uses HMAC, and I'm going to talk about HMAC next. And the result works if HMAC is a random oracle, which we'll talk about next. Fantastic segue, thank you very much.
So okay, let's talk about HMAC. Just to set the stage, HMAC, for those of you who don't know, is a keyed hash function. It was designed originally to deal with length-extension attacks in the setting of secret-key cryptography -- designed with the idea that K is going to be a secret key, like uniformly chosen, and M is some message that you're trying to authenticate.

It's used widely as a message authentication code: TLS, PKCS#5, SSH, IPSec, pretty much anywhere you look. And there have been quite a few analyses of its security in this uniform-secret-key setting from the provable security point of view. There's also cryptanalytic work analyzing it.
So let's take a little bit closer look at it. Most often in papers, including some of mine, I kind of stop at this view of HMAC, but for the purposes of this work, it's helpful to go a little bit deeper. In particular, HMAC has structure to it; it's not just a hash function -- it uses a hash function twice. What it does is it takes this key K, does something I'll mention in a sec to transform it a little bit, XORs it with ipad, which is a constant bit string, concatenates the message and hashes that; then it takes the modified K, XORs it with a different constant value, concatenates that with the result of the first hash, and hashes the final thing, and that is the output of HMAC, okay?
So I glossed over some details, which is consistent with most of the academic literature on this. In particular, the actual pseudocode for HMAC is this, right. What you first do is check whether the length of your key is greater than D bits. D is a parameter of the underlying hash function -- the block size of your underlying hash function. So with SHA-256, it's -- oh, boy, just talked myself into a corner there. 256 bits, 512 bits. Someone who actually knows something --
>>: It's 64 bytes.
>> Thomas Ristenpart: 64 bytes. Anyway, let's forget about it. It's D, okay. And if the key is longer than D bits, you first hash it with the hash function, and this gives you this K prime. Otherwise, we just set K prime to K. And then what do we do? We pad out K prime with zeroes to get D bits, and that's K double prime, as shown up there. Then we XOR it with ipad, which is D bits, up here, and down here with opad, which is also D bits -- two different D-bit strings. Hopefully that's consistent with people who have actually looked at HMAC.
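The construction just described is easy to check against a real implementation. Here's a from-scratch sketch for SHA-256 (so D is 64 bytes), using the standard 0x36/0x5c ipad/opad constants, which can be compared with Python's hmac module:

```python
import hashlib
import hmac

D = 64  # block size of SHA-256 in bytes (the "d" in the pseudocode)

def hmac_sha256_from_scratch(key: bytes, msg: bytes) -> bytes:
    # Key massaging: hash keys longer than one block, then zero-pad to D bytes.
    if len(key) > D:
        key = hashlib.sha256(key).digest()
    key = key.ljust(D, b"\x00")
    ipad_block = bytes(b ^ 0x36 for b in key)  # inner key block
    opad_block = bytes(b ^ 0x5C for b in key)  # outer key block
    inner = hashlib.sha256(ipad_block + msg).digest()
    return hashlib.sha256(opad_block + inner).digest()
```

For any key, short or longer than one block, this should agree with `hmac.new(key, msg, hashlib.sha256).digest()`.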
Okay. So this is really nice -- HMAC is a great, you know, kind of Swiss Army knife. It's got two inputs, and both can be arbitrary lengths: arbitrary-length keys and arbitrary-length messages. This is great because it makes it very useful in a broad variety of settings.
And indeed, it's become used in many, many different settings. So, for example, the PBKDF question, right: there's a password-based key derivation function based on HMAC where we use the password as the key for HMAC -- which is where the arbitrary length is nice, since passwords are arbitrary length -- and we use the salt as the input. There's some other information too, but think of it as the salt. And then we hash once, and then hash again and again, using the password each time as the key for HMAC. This is actually a simplification of PBKDF2; there's some other stuff going on, but this is the basic structure.
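That structure can be written down directly. Below is a minimal sketch of PBKDF2-HMAC-SHA256 restricted to a single 32-byte output block (the real function also handles multiple blocks and arbitrary output lengths), checked against the standard library:

```python
import hashlib
import hmac

def pbkdf2_one_block(password: bytes, salt: bytes, c: int) -> bytes:
    """Simplified PBKDF2 with HMAC-SHA256, first 32-byte output block only.

    U_1 = HMAC(password, salt || INT(1)); U_i = HMAC(password, U_{i-1});
    the block is the XOR of U_1 .. U_c.  The password is the HMAC key at
    every step -- exactly the non-uniform-key use of HMAC discussed here.
    """
    u = hmac.new(password, salt + b"\x00\x00\x00\x01", hashlib.sha256).digest()
    t = u
    for _ in range(c - 1):
        u = hmac.new(password, u, hashlib.sha256).digest()
        t = bytes(a ^ b for a, b in zip(t, u))
    return t
```

For output lengths up to one hash block this matches `hashlib.pbkdf2_hmac("sha256", ...)`.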
Okay. And, of course, this is not a uniform secret key anymore, which is important -- so those other analyses don't apply. There have been lots of other suggestions for using HMAC in ways outside this traditionally envisioned use. So Hugo has been suggesting this very nice construction, HKDF, for doing kind of general-purpose key extraction, where the key input to HMAC is a public salt value. TLS 1.2 actually already does things somewhat like HKDF -- well, sorry, it uses HMAC as an extraction mechanism, not like HKDF, but for the same type of purpose as what HKDF would solve.
And in there, K is again a non-uniform secret like a password, as in PKCS#5. And then I've used HMAC in some other places as well where K is a non-uniform secret.
So the question that came up, okay, we have analyses when K is a uniform
secret. What's going on when we don't have a uniform secret? And we'd like to
understand if these other applications are good.
So applying our approach, the first thing is we actually had a notion of security that we thought was good for this; it's called indifferentiability from a random oracle, and it was offered by Maurer, et al. This basically says that HMAC altogether acts like a random function on its two inputs. So if you change the key or you change the message, you get a totally different, independent-looking output. That's the very, very high level.
So then we did some analysis to try to show that HMAC is indifferentiable from a random oracle. And what popped out is that there are some issues. One is that there are -- I say weak keys, but we really should call them weak key pairs -- in HMAC. And this relates to something that I've been calling the second iterate paradox, which my co-authors hate as a name, but I think it's really cute, so you can decide for yourself. I'll explain that too.

And then we recover some positive results in the cases where HMAC avoids these weak key pairs.
Okay. So let's see -- running low on time. Let me just say briefly why we want indifferentiability from a random oracle. Well, I said before that we had proven that HMAC, when it's used inside PBKDF2, is good, and that's true if we model HMAC as a random oracle -- a function that outputs random things. Indifferentiability formalizes this notion of behaving like a random oracle, and would, in particular, allow us to show that this theorem still works when you take into account the structure underlying HMAC, this two-hash construction plus the key massaging.
Okay. So let me skip some of the high-level discussion and give a little bit of history, because there's been some confusion about indifferentiability of HMAC, both by myself and others. In particular, the first paper that suggested indifferentiability as the analysis tool for hash functions analyzed a construction that they called HMAC, which is not HMAC. I've yelled at Yevgeniy about this, and he just yells back; we have a nice time.
Anyway, it's not HMAC as it's standardized, so it didn't apply. But people -- you know, I read that, and I was like, oh, looks like it's fine. And it actually
looks like the techniques that they use for that proof should work for real
HMAC too so everything should be good. And so there's been lots of works,
myself included, assuming that HMAC is a random oracle.
Krawczyk suggested that the CDMP05 proof extends to HMAC, but didn't get into details. And certainly, it does look like it should. So we were curious to figure out whether this was actually the case and to do the due diligence of figuring it out. And this is where we came across these weak key pairs.
Okay. So if we go back to our description of HMAC -- which is not how people typically think of HMAC, but is really HMAC -- we can see that there's immediately a troubling issue with the way key handling is done. And I hear laughter in the back.
So I've highlighted the troubling issue, which is that if you have keys of different lengths, this is an ambiguous encoding, right. In particular, this gives rise to what we call colliding keys in HMAC: any two keys K1, K2 such that they're not equal, but HMAC of K1 on a message is equal to HMAC of K2 on the same message.

So an example of a colliding key pair: take a key K1, append a zero to it, and you get a key K2, and you can verify for yourself that HMAC is going to treat those two keys as the same value -- the output will be the same.
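This collision is easy to reproduce with any standard HMAC implementation: appending a zero byte (not the ASCII character "0") to a key shorter than one block changes nothing after the zero-padding step. A small sketch with illustrative key and message values:

```python
import hashlib
import hmac

k1 = b"secret-key"
k2 = k1 + b"\x00"   # append a zero BYTE: both keys pad to the same block
msg = b"attack at dawn"

t1 = hmac.new(k1, msg, hashlib.sha256).digest()
t2 = hmac.new(k2, msg, hashlib.sha256).digest()
assert t1 == t2     # colliding key pair: distinct keys, identical MACs

# The ASCII character "0" encodes as 0x30, not 0x00, so password-style
# keys like "12345" vs. "123450" do NOT collide -- which is one way
# applications dodge this issue by accident.
t3 = hmac.new(k1 + b"0", msg, hashlib.sha256).digest()
assert t3 != t1
```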
And this is clearly not what you would expect if you thought HMAC was going to give you random-looking outputs for any distinct (K1, M) and (K2, M). And this does, indeed, give a very trivial indifferentiability distinguisher. I guess if you had settings in which you had related keys with these types of relationships, then you'd have a related-key attack against HMAC. So it's not that great.

But you can definitely avoid this problem, and I'll talk more about the practical implications in a couple of slides, and you can ask questions. The sky is not falling yet, but this is very interesting.
So the colliding keys don't exist if you use fixed-length keys, okay -- assuming H is collision resistant, I suppose. The colliding keys are keys of different lengths, so with fixed-length keys it doesn't work. And you remember that was the original design goal: uniform keys of a particular length.
Even if you do this, there's another slight issue that arises in the way the construction differentiates between the internal application of H and the external application of H. In particular, we get what we call ambiguous key pairs: any keys K1, K2 such that, after the pre-processing to get K1 double prime and K2 double prime, you have the equivalence that K1 double prime plus ipad is equal to K2 double prime plus opad. I probably never mentioned it, but hopefully everyone picked up that the plus here is XOR in reality.
Okay. So what does this mean? It means that there's no strict domain separation between the internal application of H and the external application of H. Examples of ambiguous key pairs are, obviously, any K1 such that you define a K2 equal to K1 plus ipad plus opad. This in particular works when the sizes of K1 and K2 equal D, and you can also do it when the sizes of K1 and K2 equal D minus 1. I don't think you can do it at D minus 2, because that's the first bit that differs between ipad and opad.
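For full-block keys the ambiguity is mechanical to check. In this byte-level sketch (D = 64 for SHA-256), setting K2 = K1 XOR ipad XOR opad makes K1's inner pad block identical to K2's outer pad block:

```python
import os

D = 64  # SHA-256 block size in bytes
IPAD = bytes([0x36]) * D
OPAD = bytes([0x5C]) * D

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

k1 = os.urandom(D)             # any key of exactly one block: no massaging
k2 = xor(k1, xor(IPAD, OPAD))  # k2 = k1 ^ ipad ^ opad

# Ambiguous key pair: the inner hash input block for k1 coincides with the
# outer hash input block for k2, so the two applications of H that HMAC
# makes are not cleanly domain-separated.
assert xor(k1, IPAD) == xor(k2, OPAD)
assert k1 != k2
```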
And there are other settings too, but it's kind of a narrow corner case in terms of the key space of HMAC. So: no domain separation. This doesn't give rise to trivial issues the way the colliding key pairs do, but it does cause problems for proving indifferentiability of HMAC, and in particular, we gave a lower bound on the indifferentiability of HMAC. It's a bit technical, and I haven't even told you what indifferentiability is technically, but the implication is the most important thing: the lower bound rules out having indifferentiability-based proofs that give good, concrete security bounds.
And I can tell you more offline about the details underlying the technical issue. Instead, I'll talk about my favorite term, which is this second iterate paradox. It has similar structural issues to what happens with ambiguous key pairs in HMAC, but for a much simpler construction.
So this is a construction that was suggested in 2003 in a textbook by Ferguson and Schneier, and it basically says: okay, let's take a message M, hash it once, and then hash it again -- the second iterate of a function. And again, this was a suggestion to prevent length-extension attacks, which it does seem to do.
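The construction itself is one line; the point is that the outer hash's input is a fixed-length digest, so an attacker who knows the output has no message suffix to extend. A minimal sketch, with SHA-256 standing in for H:

```python
import hashlib

def H(m: bytes) -> bytes:
    return hashlib.sha256(m).digest()

def H2(m: bytes) -> bytes:
    """Second iterate H(H(m)): the Ferguson-Schneier suggestion.

    A plain Merkle-Damgard hash leaks its internal chaining state in its
    output, enabling length extension; H2 hides that state behind a
    second application of H over a fixed-length (32-byte) input.
    """
    return H(H(m))

tag = H2(b"hello")
```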
But the thing that's confusing is that, okay, you would expect, of course, that
if H is a random oracle, it's ideal. If you iterated H twice, you might lose
like a constant factor of two in the security or something, because there's two
opportunities for collisions. But you certainly would expect nothing worse
than that, right?
When it comes to indifferentiability, you can actually show that you can only have indifferentiability with very poor bounds. In particular, if you tried to conclude using indifferentiability that this is collision resistant, the best you'd be able to show is that it's collision resistant up to about 2 to the N over 4, where N is the output size, which is much less than 2 to the N over 2.
Anyway, this again is a little bit technical. I have some slides to describe the underlying intuition, but I'm running out of time, so I'm going to skip all that. If you want to stick around, we can talk about it after the hour's up. There's a lot of them. Uh-oh -- oh, here's some more implications. You skip all the technical stuff, you get to the implications.
So H squared doesn't behave like a random oracle; indifferentiability can't hold with good bounds. We did a lot of work to figure out whether this is something that's practically damaging, because this is all proof stuff that I've been talking about. The closest we came is a fairly natural setting called mutual proofs of work, where two parties are trying to prove that they've been computing some hash chains, and you can abuse the properties that arise from H squared, and also the ambiguous pairs in HMAC, in this mutual proofs-of-work setting. But I'll refer you to the paper for details.

We don't know -- I mean, hash [indiscernible] -- I don't know if it actually gives rise to any vulnerabilities anywhere. It's a good question.
And so the most important practical question is what all this implies about HMAC, which is used all over the place. Weak key pairs could cause problems, but they really don't seem to -- they're kind of, very loosely, like the weak keys in DES, right. They exist, but they don't really seem to come up in practice much.
So, for example, with colliding key pairs: if you had a password and another password which was that password concatenated with a zero -- if you went 123450 -- then these would be treated as the same password by the PBKDF, which would speed up brute-force attacks by a factor of two. But that doesn't quite work, because in ASCII, of course, the character zero doesn't get encoded as the zero byte.
So anyway, a lot of applications seem to serendipitously avoid them. Certainly, HKDF and some of these other things are using fixed-length keys that are uniformly chosen, or that are public so it's not a big deal. And there we can show stronger positive results about the security of HMAC in the sense of indifferentiability, which should then allow showing security of these other applications with the good bounds you would expect, close to the [indiscernible] bound.
So the take-away is that this is something to be concerned about and aware of
but doesn't seem to be threatening any actual deployed thing that we're aware
of. And certainly makes doing formal analysis a huge pain.
Okay. So with that, I'll just briefly reiterate. We looked at three different projects here. The TLS 1.2 record protocol: we had a new attack that hadn't been observed before when you have short MACs, and did some new proofs. Password-based crypto, which gave rise to this new definitional framework that was needed to explain the benefits of salting -- from a theoretical perspective, we didn't quite have the right tools to explain some of the benefits of longstanding mechanisms used in practice -- and finally did some proofs for PKCS#5.
And we looked at HMAC and realized that in these newer settings -- HMAC is not that new, but in these other, non-originally-envisioned settings HMAC is being used in -- there are potentially issues with weak key pairs, and we can overcome these in some cases to recover nice, positive results.
So from this lens of practice-driven theory, I think the message is: while it's great to -- and it's certainly a useful perspective I like, theory educating practice -- I hope I convinced you that you can also turn it around a bit and use theory as a post facto tool, basically as a vulnerability-finding tool. When we try to prove things in this complexity-theoretic way, we really have to nail down all the corner cases. And as we saw, the corner cases come up, and apparently these weren't realized before. So it's a great way to suss these subtle issues out.
Of course, it helps give some rigor to intuition and informal security goals.
In the best case, we can prove something -- which isn't the final word on security, of course; that only counts if it's secure when deployed in a system -- but it can certainly improve our confidence when we can show proofs.
And then, you know, maybe also surprisingly -- contrary maybe to some people's opinions -- cryptographic practice is an excellent inspiration, I think, for new theory, things that we hadn't really considered before. So if you go look at standards, you're not just doing a community service trying to understand the security of deployed products, which is maybe what theoreticians say, but you can actually pop out new theory that can be quite compelling.
So with that, I'll take more questions and thanks. Just waiting for somebody
to ask me to go back through the slides, yeah. No, I'm just kidding. Go
ahead.
>>: Is there an easy way to see why the single-instance [indiscernible] does not apply [inaudible]?
>> Thomas Ristenpart: Well, so the hybrid argument would -- right, yeah. So
good. So this is a great question. The question, you know, can we -- why
doesn't single instance security prove multi-instance security, because you'd
expect a hybrid argument would do it.
And that's true. You could show a hybrid argument that would take the Q over CN bound that we had before and give you a new bound in this setting, right. But that new bound would have a factor of M in front of it -- in the numerator, not in the denominator. So what we would be showing is MQ over CN, which is exactly the opposite of what we want to show, which is Q over MCN. So the short answer is that a hybrid argument is too loose. And more generally, in this multi-instance security setting, we're really worried about very small advantages, right, because it's no longer just the security of one instance but the security of many instances.
So if you, I don't know, think about -- well, don't think about what I was
thinking about. Doesn't make any sense.
But we're dealing with very small advantages, and we have to be very careful to
make sure we preserve them in our reductions. And this is where some of the
other technical complexities come up too in doing the proofs.
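Sketching the bounds from that answer in symbols (my reconstruction; the paper's exact statement may differ): with q queries, iteration count c, and dictionary size N, a standard hybrid over m instances only yields the loose bound on the left, whereas the multi-instance result needs the bound on the right:

```latex
% Hybrid over m instances: the loss lands in the numerator -- too loose.
\mathrm{Adv}^{\mathrm{mi}}(A) \;\le\; m \cdot \frac{q}{cN}
% Target multi-instance bound: the m belongs in the denominator.
\mathrm{Adv}^{\mathrm{mi}}(A) \;\le\; \frac{q}{m \cdot cN}
```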
>>: So why do you think the multi-instance security definitions never caught on in the academic community? It seems like it was presented in 2000 and then people moved on.

>> Thomas Ristenpart: The multiuser instance?

>>: Multiuser, yeah.
>> Thomas Ristenpart: Well, so, that's a good question. I should point out that multiuser and multi-instance are, despite the perhaps poor naming conventions we chose in this paper, different.

I think the way to conceptualize it is that multiuser is like weakest-link security: as soon as the weakest of all of the instances or users is broken, then all of the security goes away.
And in multi-instance security, we're instead asking that the attacker has to break not just the weakest link but all the links. So this is how you get this ability to measure things like Q over MCN here.
So I don't know. I think the reason the multiuser security notion, which is the weakest-link notion, didn't catch on is that in most settings it doesn't really matter whether you analyze in the single-instance setting and then use a general hybrid argument, or use multiuser.

My recollection -- I'd have to go back and read the original paper from 2000 again -- is that they were doing that to analyze constructions that use, like, the same randomness across multiple users, or do other things that are kind of non-standard for efficiency purposes, and you needed that framework.
But for like normal situations where you have independent instances, you do
fine using the normal single instance security notion. But yeah, we'll talk
more later if you want. Yeah?
>>: Okay. So I'm just wondering, this notion of multi-instance, I feel like it's a little bit similar to, like, hardness [indiscernible], like you first do sharing --

>> Thomas Ristenpart: Okay, I'm cutting you off. Go ahead.
>>: So yeah. And another maybe -- so in your definition, I feel like hardness amplification is somewhere inside this, because you're, like, going from Q over CN and making it stronger, to Q over M times CN.
>> Thomas Ristenpart: Yeah, so if I may, the question is: what's the relationship between this and the traditional hardness-amplification literature? The answer is very involved, actually.

So, for example, this XOR measure smacks very directly of, like, XOR-lemma-type constructions. But in an XOR-lemma-type construction, the construction itself XORs the bits from these hard-core predicates, and the adversary's goal is to guess that bit. Whereas here, we're doing something slightly different: we have independent challenge bits that are separate from the construction, and the goal is to guess the XOR of all those bits, okay.
>>: [inaudible] and then I share to B1 and B2.

>> Thomas Ristenpart: Oh, share it. I guess you could. Those would be distributed the same.

>>: Yeah. So that's what I was --
>> Thomas Ristenpart: Yeah, but the construction is not -- yeah. So you're absolutely right. There are a lot of connections, and we've only scratched the surface on exploring them. And indeed, there's a lot of prior work, so you can use the same types of techniques that have been used. Like, Unger has this paper about analyzing whether AND security is implied by XOR security in a kind of general setting, which does apply here, actually. But it's a partial result, and you have to do some mapping between the two.
So yeah, we're not trying to claim that, like, this isn't related to hardness
amplification. In fact, it is. And lots of the same techniques probably
arise. Maybe there's connections we haven't even seen yet.
>>: Another slightly related question is, so can you go back to that [indiscernible] --

>> Thomas Ristenpart: Oh, the other way? This way? This is what's wrong: too many definitions.
>>: In this setting, the dictionary size is small. Like, traditionally, we were thinking of something negligible. But here, there couldn't be anything negligible.
>> Thomas Ristenpart: Yeah, so I mean, clearly if N is really, really big, like 2 to the thousand, it hardly matters whether you're attacking a million or ten instances. But here, we're really worried about concrete security. So asymptotically, it's not clear you get any benefit from these.

It's, again, somewhat subtle to think about this stuff asymptotically. So really, we're trying to preserve concrete security, right. You can think of it like, if you're trying to break all M instances -- M DH instances or whatever -- we would expect again that the best you could do is, like, two times the time -- I'm sorry, M times the time of the best known attack; that would be the obvious approach to breaking lots of instances. And there each individual instance has negligible advantage, sure, but we could also hope to gain that extra, like, log-in-the-security-parameter improvement from having to break many in parallel. That probably made no sense. We should probably talk about it offline, yeah.