>> Cormac Herley: So let’s get started. It is a pleasure to introduce Paul Van Oorschot, a repeat
visitor, who is going to talk to us about password aging policies and quantifying security
advantages. Paul is well known to many of us. He used to work on Crypto for a long time, but
then saw the error of his ways and decided to start working on passwords.
>> Paul Van Oorschot: Thank you Cormac. So a very apropos intro, because what I am going to
do in this talk is actually go back to Crypto. So we are all familiar with password aging policies
and I don’t need to ask people how excited they are about changing their passwords every 30, 70
or 90 days. Generally it’s not that people get up in the morning and say, “Today’s the day I get to
change my password and I am very excited.” The problem here that I am looking at is the
expectation is that somehow this helps and improves security and we are somehow convinced
that this is going to help security.
But the question to ask is: How much does it help and in which ways? So that’s the question I
want to explore. If you are interested in looking at the details of this work, it’s in a paper that will
appear shortly in a special issue of Designs, Codes and Cryptography; Scott Vanstone was
actually my PhD advisor. So the thing about password aging policies is we have this general
notion or we come to believe that it helps security in some vague ways and I will talk about a
couple of them later, but quantitatively what is the benefit? What gain and what benefit do we
get?
And in order to answer that question the easiest thing to do is to try and formalize things. Now if
you formalize things you end up making assumptions in models that don’t actually map onto the
real world and so then what’s the value of your model? Well I do start here by formalizing and
going to a Crypto security model where rather than passwords which are chosen in a very
skewed distribution by users we are going to start out looking at cryptographic keys which are
randomly chosen and equally probable in order to figure out what the bounds are. Then I want to
take that because we can get precise answers and try and get some insight into mapping that onto
real user passwords. It will turn out that will give us some lower bounds and with skewed
distributions we can then see what those bounds can tell us. It turns out it gives us some insights
as to how bad things are and even in the idealized setting things aren’t great in terms of the
advantages.
I am happy to take questions throughout. I don’t think I have an awful lot of slides here so it’s
probably best to keep the context of a particular slide. So jump out if you have some questions.
Okay so here’s the background thing to think about during the talk. Suppose you change your
password continuously, basically as often as an interface allows. Does that prevent successful
guessing attacks? So just try and think of that in the background and then we will try and put
some precise numbers to that question.
So here’s the model we are going to start out with. I’m not claiming that this maps well onto
user passwords, but this is where we start. So we have a crypto key search problem, you have a
randomly chosen key, we are assuming we are going to do an exhaustive search. So there is a
fixed key space, assume there are capital R keys and we assume that an attacker actually is going
to be able to guess through all R keys, and we are going to have two different periods we are
going to talk about; one is the exhaustive guessing time, exhaustive period time, that’s capital T
and the second time period we are going to talk about is the password expiration policy period
which is capital P. We are going to go through a set of different sequences here. We are
going to look at the relationship between the exhaustive guessing period and the expiration
period and then consider the different cases as well.
So we are going to assume for simplicity an online guessing attack. I will have some comments
about offline later on. It is a deterministic finite search. So the attacker will be successful if you
do not change your key by the time he has guessed every single key through the key space
somewhere in there will be your key and he will have guessed it. The policy period is known.
So for example you have to change your key every 30 days or 90 days. So that period is known,
but the attacker does not know at which precise moment within that period a user changes his
key. If the attacker gets to the end of his key space we assume that he will just start guessing
again.
Now when he guesses through the key space a second time he might guess through it in a
different deterministic order, but it is a deterministic order again that he is going to go through.
So he will guess through the key sequence once, every single one once and then in a possibly
different order the next time. And we will assume these are random orderings and that allows us
to get the probabilities [indiscernible] precisely.
>>: [inaudible].
>> Paul Van Oorschot: Okay, so it turns out that’s a good question. So the analysis is simplest if
we assume it’s an online guessing attack. It turns out there is a little hitch in this particular
model that if it’s an offline guessing attack how does that map onto the real world? Well it turns
out if it’s offline and you actually change your key then the attacker actually can’t be verifying
offline unless he grabs another copy of the key file. So just to avoid that issue right now let’s
assume it’s online guessing and we don’t have to worry about that little glitch.
>>: There’s another time period to consider which is: how long between the attacker breaking
your key and him selling it to the black market to the bad guy in Ukraine who is stealing your
money?
>> Paul Van Oorschot: That’s another question too, exactly.
>>: Is there any sense of a lockout sort of thing, a time period of defense for the online aspect?
>> Paul Van Oorschot: No, in this case we are just going to assume these are defensive measures
which can help you stop attacks, but let’s just say that they don’t do that. Surprisingly there are a
lot of online properties that don’t do that and actually don’t even do throttling. So, I want to get
through the simple model first, get some precise answers and equations and then see how that
maps onto passwords.
>>: Sorry, the attacker is only focused on one account at a time, is that right?
>> Paul Van Oorschot: Well again, for this analysis let’s assume that these are guesses on one
account and then we can go through and see if he –. In the real world the attacker is probably
going to guess the most probable password against all accounts and then the second most
probable against all accounts. Some attackers are looking to attack your account, but if it’s a
generic attack they would probably do most probable guesses against all the accounts for which
they have access.
Okay, so here is the key space K with k sub 1, k sub 2, up to k sub capital R. Let’s assume that’s
a visual representation of the key space and the user’s key is going to just be lower case k without
a subscript. The attacker is going to guess key k sub i at time t sub i and bingo, that value
there would be where he gets the correct key. So the question then is if you change your user
key k to k*, so after lower case u units of time, if you change your key what security advantage
results? So what is the change in the probability of a successful guess over all R guesses? We
are assuming it’s over all R guesses. So that’s the model we are going to set up. We are going
to answer this question precisely and then we want to map it to our real problem of interest.
Okay, so for the first analysis let’s assume that the exhaustive guessing period, the time it takes
for the attacker to guess all R keys is capital T and that’s going to be less than or equal to the
policy change period, which is capital P, which is your 90 days or whatever. Now for the
attacker to be successful, if we assume that there are two periods, call it W1 and W2, think about
the user’s key being k and then k* and the attacker is successful if he guesses k while it is active
in period W1 or k* in W2 and the only time he is unsuccessful is if it’s neither k within W1 nor
k* in W2. We can go through and figure out the exact probabilities here because, if the key
is changed after u units of time, the proportion of keys in W1 is u divided
by capital R and let’s call that Q and that leaves a proportion of 1 minus Q in W2. And simply
from the sizes of those partitions you get the probabilities because we are assuming it’s a
randomly chosen key and they are all equally probable.
So just from that analysis and what’s on that slide the only case of attacker failure is if k is not in
W1 and k* is not in W2, down at the bottom here, and the probability is simply going to be (1 minus Q)
times Q. So the probability of success is going to be adding up Q1, and Q2, and Q3, and the
probability of attacker failure is Q4. So it turns out that our probability of success is 1 minus Q
plus Q squared. The probability of failure is 1 minus that, which is Q minus Q squared. Does anybody know how to figure
out a min/max here? Take the derivative of Q minus Q squared and set it equal to zero: Q equals one-half.
Probability of failure one-quarter, probability of success is three-quarters.
So this is what it looks like. You are going to get the minimum here. The probability of success
gets minimized if the user changes exactly halfway through the period. The probability of
success is still greater than or equal to 75 percent. If you didn’t change it at all there was a 100
percent probability of success for the attacker. If you do change it at the optimal time it drops
to 75 percent. So the X axis is the proportion of the key space searched before the user changes a
key and Y is probability of attacker success. So that’s the simple base case and then we are
going to go through some more realistic cases.
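As a rough check of the base case above, here is a minimal Python sketch. It assumes, as the talk does, a uniformly random key and a single exhaustive search with T less than or equal to P. The function name and the 0.01 grid step are illustrative choices, not something from the talk.

```python
def success_probability(q: float) -> float:
    """Attacker success over one exhaustive search when the key is changed
    after a fraction q of the key space has been searched.

    The attack fails only if k is not in W1 (probability 1 - q)
    and k* is not in W2 (probability q).
    """
    failure = (1.0 - q) * q
    return 1.0 - failure


# Scan q over [0, 1] in steps of 0.01 to find the change point that is
# best for the defender, i.e. the one that minimizes attacker success.
grid = [i / 100 for i in range(101)]
best_q = min(grid, key=success_probability)

print("best change point q =", best_q)
print("attacker success at that point =", success_probability(best_q))
print("attacker success if the key is never changed =", success_probability(0.0))
```

The scan simply confirms the calculus: the defender does best by changing halfway through, and even then the attacker succeeds three times out of four.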
So if the search fails again what’s the attacker going to do? They are going to repeat the search
starting through the key space in a possibly different deterministic order. If you then go through
i different searches, lower case i, then the probability of success is going to be 1 minus the
probability that he fails on all the searches. So it’s going to be 1 minus one-quarter to the i. So
this is what happens: across the bottom now we have the number of searches that the attacker
gets and already it is .75 probability of success here and goes up to .9375, then it’s 98 percent
and 99.6 percent. So even with a small number of total periods T the attacker is going to have
nearly guaranteed success.
So that’s not surprising because what you are doing here is you are giving the attacker many
more guesses. What turns out to be important is how many guesses does the attacker get?
Changing the key ends up not changing the overall probability of success over here, so our
summary from this is that when the exhaustive period t is less than or equal to policy expiration
period capital P, the policy doesn’t really help us. It doesn’t give us a lot of hope or certainly in
any sort of environment you would hope that the attacker is going to have less than a 98 or 99
percent chance of success.
All right. So, that was our base analysis where t was less than or equal to p. It gives us the exact
intuition for the case where t is equal to twice the policy period p; well, the optimal thing here for the
defender to do is to change their key halfway through the period. In fact that was kind of what
we modeled. If you don’t have to change your key until 90 days, most users
will not change it until 90 days, and so you will almost always have it pushed out to the end.
But, the optimal thing to do here if t was equal to 2p the optimal thing would still be to change it
after 45 days. We can’t expect the users to do that, but that’s what they would be best off doing
if we knew exactly what the attacker’s search time was. We don’t know that so it’s very hard. So
all the analysis we do here is actually assuming the defender changes the key at the optimal time.
We couldn’t even achieve that with real users because we don’t know what the real attack is.
So let’s go forward a little bit more. So suppose we generalize, suppose rather than T equal to P
or T equal to 2P we say capital T equals lower case t times P. So for example now the policy expiration
period capital P maybe is 30 days, one month, and the search time might be 24 months. So the
little lower case t would be 24 here. So this models more of the space. So what happens here?
Well the analysis is almost exactly the same. The easiest way to go through the analysis is to
assume that lower case t equals 3. So now we have got 3 periods here: W1, W2 and W3 rather
than just W1 and W2, go through it in your head for 3, but every time you want to use 3 stick in
a t instead and you get exactly the same analysis. The only time the attack will fail is if the key
for each period fails to be in that period’s window, in all 3 of those. So the probability of failure is going to be (1 minus 1
over t) to the t, and the probability of success is 1 minus that: 1 minus (1 minus 1 over t) to the t.
And magically this little e comes out here. I don’t know how that happens in mathematics.
Somebody jigged the system many years ago and this number comes up. So it turns out that if t
gets very, very large in this case –. So the exhaustive search period for the attacker is very, very
large, the probability of success is still going to be 63 percent or so. So this is what graphically
we have. We start out here; along the X axis is lower case t, and t basically tells us the relationship
between capital T and capital P here. So that’s how many full, exhaustive searches the attacker
gets within one –. Sorry, that’s how many policy periods there are in an exhaustive search.
Then we come down here and very rapidly we hit a lower limit. So you can think of this as the
attacker’s search power getting weaker; the expectation of attack success falls, but it doesn’t get
any lower than .632. So intuitively you might have thought it would fall further than that. The
numbers just tell us it doesn’t fall any further than that. So that was for the case capital T is equal
to lower case t times P.
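The following Python sketch evaluates the closed form for one exhaustive search with T = t times P and cross-checks it with a small seeded simulation. It assumes the idealized model from the talk: uniformly random keys, one key change per policy period, and an attacker who covers an equal slice of the key space in each period. The key-space size, trial count, and seed are arbitrary illustrative values.

```python
import math
import random


def success_one_search(t: int) -> float:
    """Closed form: the attack fails only if the key misses the attacker's
    window in every one of the t periods, each with probability 1 - 1/t."""
    return 1.0 - (1.0 - 1.0 / t) ** t


def simulate_one_search(t: int, R: int = 1200, trials: int = 20000, seed: int = 1) -> float:
    """Monte Carlo estimate under the same assumptions (R must be divisible by t)."""
    rng = random.Random(seed)
    window = R // t  # number of keys the attacker covers in one policy period
    wins = 0
    for _ in range(trials):
        for j in range(t):
            key = rng.randrange(R)  # fresh uniformly random key for period j
            if j * window <= key < (j + 1) * window:
                wins += 1  # the attacker guessed the key while it was active
                break
    return wins / trials


limit = 1.0 - math.exp(-1.0)  # value approached as t grows, roughly 0.632
for t in (1, 2, 3, 24):
    print(t, round(success_one_search(t), 4), round(simulate_one_search(t), 4))
print("limit:", round(limit, 4))
```

The simulation is only a sanity check on the formula; with a different seed the estimates will wobble by a fraction of a percent.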
Now the next thing we want to consider is suppose we have multiple full exhaustive searches, i
periods of full exhaustive search. Then the probability of success is going to be 1 minus the
probability of failure in each of those and we just get the exponent i on that right hand side here
again. Now if we graph this one I think this tells really the full story we want to get. So the
difference here is that on the X axis I have got how many full exhaustive search periods we have and
now the Y axis here is still the probability of attacker success, although this axis is kind of what
you saw before on the other graphs. Here’s our probability of success, 1.0 with just one search
period. It’s 1.0 if you don’t change your key at all and it’s .75 if you change it once, so that’s t
equals 2. It hits the limit here of .632. So this is all for one search period, but then you go across
here and in this graph we are showing t as increasing with these different curves. So t equals 1 is
this curve, t equals 2 is this curve, then it comes down, t equals 128 is this curve, it’s not much
different than t equals 32, then if you go to 256 and 512 you get essentially the same curve. So the
curve doesn’t change much over that.
So with t increasing, meaning the attacker’s search power is weaker, that helps the defender, but
it gets bounded again essentially by this curve and even when the attacker is in his worst case he
does not do too poorly. The sad news is that the probability of success is at least 95 percent if you
get three or more search periods independent of lower case t and the probability of success is
approaching 1.0 even for small i independent of t. So again it boils down to the important point:
how many guesses does the attacker get? Not how often you are changing your key.
Changing the key helps a little bit.
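A short Python sketch of the multi-search case, again under the idealized uniform-key model. It tabulates attacker success for i full exhaustive searches at several values of t; the function name and the chosen t values are illustrative.

```python
def success_multi(t: int, i: int) -> float:
    """Attacker success after i full exhaustive searches when each search
    spans t policy periods (T = t * P) and the key changes every period."""
    fail_one_search = (1.0 - 1.0 / t) ** t
    return 1.0 - fail_one_search ** i


# Rows are t (policy periods per exhaustive search), columns are i = 1..4.
for t in (1, 2, 32, 128, 512):
    row = [round(success_multi(t, i), 4) for i in range(1, 5)]
    print(f"t={t:>3}:", row)
```

Reading across any row shows the point made above: the number of searches i dominates, while increasing t barely moves the numbers once t is past a few dozen.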
>>: This seems to get very close to the [inaudible] case, which is a game where I pick a random
password in the space and you try to guess it and then I change it and repeat.
>> Paul Van Oorschot: Right, exactly. So this same analysis goes back to DES key search
machines and the people who did the crypto work on this 20 to 30 years ago they would say,
“Yeah, that will make sense.” So now we are mapping that just to the password guessing
problem.
>>: [indiscernible].
>> Paul Van Oorschot: Well this is the online attack. So he has to go find the key and then use it
and go online right away. So if, for example you’re targeting one individual this is probably
what they would do. If you are trying to harvest a bunch of keys and then sell them online in
some underground system to someone else then it’s a different issue.
Okay, so now let me make a comment on the exhaustive search time capital T being much longer than
the policy period capital P. So here you would actually have the key changing very often before the full
exhaustive search. And again, the important part is not the number of changes, it is how many
guesses the attacker gets. So the limiting case we saw in that slide. So now a comment again about
offline attacks. So if you go back into some of the original documents and I guess these would
be US government documents, probably the ones which most people would find before others,
the original goal of password expiration policies was stated as being to limit the risk of
password compromise in one year to one in a million.
So that was what they stated their original goal was. Now this is entirely unobtainable today if
we think about offline attacks and we assume there is no iterated hashing, lots of people still
don’t do iterated hashing as a defense, because modern processing resources easily allow 7 to 10
billion guesses per second. And if you take for example an 8 character totally random password,
passwords aren’t really totally random, but let’s just say they were, and you had a 93 symbol
alphabet, then you would be able to search through that full 52 bit space in 9 days, which would
mean to reduce the probability of success to 1 in a million you would be changing your password
every 800 milliseconds.
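The arithmetic in that example can be reproduced with a few lines of Python. The alphabet size, password length, and guess rate are the figures quoted in the talk (using the low end, 7 billion guesses per second); the variable names are illustrative, and a single search with no iterated hashing is assumed.

```python
import math

ALPHABET = 93            # symbols in the alphabet, as quoted in the talk
LENGTH = 8               # characters in the password, as quoted in the talk
GUESSES_PER_SEC = 7e9    # low end of the quoted 7 to 10 billion guesses per second
TARGET_RISK = 1e-6       # the "one in a million" goal

keyspace = ALPHABET ** LENGTH
bits = math.log2(keyspace)

search_seconds = keyspace / GUESSES_PER_SEC
search_days = search_seconds / 86400

# To let the attacker cover only a one-in-a-million fraction of the space
# before the password changes, the change interval must be that fraction
# of the full search time.
change_interval_ms = search_seconds * TARGET_RISK * 1000

print(f"key space: about 2^{bits:.1f}")
print(f"full search: about {search_days:.1f} days")
print(f"required change interval: about {change_interval_ms:.0f} ms")
```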
>>: So it is one processor you are guessing?
>> Paul Van Oorschot: Well actually yeah and so attackers would typically have not just like
quad core processors, these are GPUs, you would have a thousand of these or you rent them in
the cloud. So this is one processor which is very, very cheap today.
>>: I think you once convinced me that it’s not per processor, it’s per dollar.
>> Paul Van Oorschot: Yes, per dollar right. Okay so what this tells us is that the password
expiration period, the aging policies give absolutely no protection against offline attacks.
>>: Well I generally agree that these policies are useless. You are setting the context by
assuming the dumbest defender possible who is not iterating at all, not doing anything and in that
case if you actually had something that’s a space of 52 bits and you are doing things
sensibly you are in pretty good shape. To say, “This is never good and we are going to start by
assuming that you hash in the dumbest way possible,” and then assume that this argument carries
over for all other reasonable ways of hashing, that logically doesn’t follow.
>> Paul Van Oorschot: Well the argument I am trying to make here is: what does the password
expiration policy grant you? If you put in place a lot of these other things, which you are
mentioning, which are very useful; throttling, hashing, salt and the other stuff, well if you do
those things you probably don’t need to do password expiration at all. So that’s the question.
>>: [inaudible].
>> Paul Van Oorschot: So if we can look and first decide: what does the changing itself give
you? Then we can try and weave together what the best defense is.
>>: But your argument is like saying, “Well if everyone has to set their password to either dog or
cat and they have to go back and forth between either dog or cat, well clearly password reset
doesn’t help you in this context; therefore it helps you in no context.” I don’t buy that argument.
>> Paul Van Oorschot: Well, I mean I would love to chat with you over the next couple of days
while I am here about this, but what I first want to do right here is say: if we are
looking at offline, if we are not iterating, what does it buy you? So that’s what I want to
establish. All right?
>>: Okay.
>> Paul Van Oorschot: So here’s a partial summary of what we got. We can look at first saying
the exhaustive search period is about the same as the password expiration period or a little bit
less. Then we can have a multiple, that the password exhaustive search period is a multiple of
the expiration period and then we can talk about going through multiple searches over this space
T, i times T. And in all cases that alone, without these other defenses, is not going to help you an
awful lot. But this is just for equiprobable random keys; this is not for user passwords. So can we
use these bounds, these equations that we have and say, “Okay, well what happens when we
actually have user chosen passwords? Do things get better or worse?” So the answer is: they
don’t get better and we shouldn’t be surprised by that.
So just as background, there are two empirical studies here that help us understand where we go in
practice with user chosen passwords. So the first one is Joe Bonneau’s study; he had access to
70 million Yahoo passwords and a couple of the data points from his study, for online guessing,
if you were to try the most popular passwords on each of a large number of accounts, so say you
got 10 guesses per account, you could get about 1 percent of the full space of passwords with
simply 10 guesses per account. So that shows you there is a lot of skew in real user chosen
passwords. And if you had the optimal attacker and they were able to carry out a massive search,
so there was no throttling and a lot of other things weren’t done which you should do, if you got
1 million guesses per account, 10 to the 6 guesses per account, you would get about 50 percent of
user passwords. And this is surprising; this shows really big skew on the front end of these
distributions.
There was also a study published at CCS by Weir and colleagues. They looked at 32 million
passwords and most of that was the RockYou data set. Of the 32 million passwords they took a
sub-list of 5 million and then they took the 50,000 most popular passwords from that sub-list of 5
million and they took that 50,000 and guessed it on the remaining –.
>>: [inaudible].
>> Paul Van Oorschot: Well, but of the 32 million I think there were only 13 or 15 million that
were unique. So I think there was maybe 12 million or 15 million left, take out 5 million, and of
the remaining sub-list of 1 million they were able to get over 25 percent on those other sub-lists,
so a big skew again there. So what this means is that the results are going to be worse for the
defender when you are guessing against user chosen passwords rather than crypto keys, because
what’s the attacker going to do? He is not going to guess randomly. He is going to try and use
some empirical data and guess what are approximated to be the more popular passwords.
Question?
>>: So the point was brought up that users choose passwords differently. Some are really smart
and careful and then there is my grandmother who really doesn’t choose very well.
>> Paul Van Oorschot: And the majority of the people are the grandmothers, because the rest of
us who are smart are all in the room here because we are working at Microsoft and we are
computer security experts.
>>: Right, I don’t want to tell you what my password was before I got into computer science. So
you are suggesting that password expiration policy gives more benefits to the smart password
choosers than to the dumb password choosers?
>> Paul Van Oorschot: I don’t remember saying that, but let me think. The expiration policies,
well if we go through the crypto model, if the attacker gets to make a sufficient number of
guesses, he is going to guess through all the keys, but in practice he is not going to guess the
totally random ones because he won’t get to those ones; mainly because most attackers would
rather get through more accounts of weaker users than try to get the tougher
passwords. So I guess that’s what I am talking about here on the next slide.
The two main differences when you are trying to analyze for user chosen passwords rather than
crypto keys are: one, there is variation in password length. So we don’t even know what the full
set of passwords is because most systems don’t [indiscernible] you can have 16, or 32, or you
know maybe 64 character passwords, or longer. But we can model that by saying the vast
majority of users don’t choose passwords longer than 16 characters or 12 characters and that’s
not a bad approximation, but the skew in these distributions is the hardest thing to model. So
what real attackers do, and this is what’s important, is that they are what we call early quitters.
They are going to not guess through full exhaustive spaces. They are going to guess the most
probable keys against a larger number of accounts and never guess the very infrequent keys, unless
they are doing a targeted attack on just a particular user, that’s not the way they are going to get
access to the greatest number of accounts.
>>: [indiscernible].
>> Paul Van Oorschot: Oh, so more about that in the last slide, but there is a side effect of the
policies to take into account as well. So what happens as a result of the skew is that if we
looked at what the attacker’s success would be over guessing full spaces the results from the
idealized model would be the same. But in fact rather than over the full exhaustive period if you
look at how early in the period the attacker gets success, the more skewed the distribution is, the
earlier we expect within the period the attacker is successful. So this makes online attackers very
happy, because the online attacker gets immediate feedback when they have guessed the right
password.
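To illustrate how skew moves attacker success earlier, here is a small Python sketch comparing a uniform distribution with a Zipf-like one. The Zipf-like shape (probability proportional to 1/rank) and the size of 100,000 candidate passwords are assumptions chosen purely for illustration; they are not the distributions measured in the studies mentioned above.

```python
def guesses_to_reach(probs, target):
    """Number of guesses an attacker needs, guessing in descending order of
    probability, before cumulative success reaches the target."""
    total = 0.0
    for n, p in enumerate(sorted(probs, reverse=True), start=1):
        total += p
        if total >= target:
            return n
    return len(probs)


N = 100_000  # hypothetical number of distinct candidate passwords

# Idealized crypto-key case: every candidate is equally likely.
uniform = [1.0 / N] * N

# Hypothetical skewed case: probability proportional to 1 / rank.
weights = [1.0 / rank for rank in range(1, N + 1)]
norm = sum(weights)
zipf_like = [w / norm for w in weights]

print("guesses for 50% success, uniform:  ", guesses_to_reach(uniform, 0.5))
print("guesses for 50% success, Zipf-like:", guesses_to_reach(zipf_like, 0.5))
```

Over the full space both attackers eventually succeed with the same probability; the difference is how few guesses the skewed case needs to get halfway there.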
Okay, so the attack success rates from the idealized crypto model will be just as we said over the
full period t, but it will be earlier within the periods. The offline attack, the full searches, are
basically going to happen pretty quickly with today’s hardware. For the online attack the skew is
going to be very helpful. So let’s come back then to say, “What does password expiration do?”
Does it actually stop guessing attacks? I think the answer is no, if a password is “guessable”, if
it’s within reach, then the password is “guessable”. If it’s guessable, it’s guessable and the
attacker is going to guess it and if he doesn’t guess it in this period he is going to guess it next
period. It’s not a good defensive mechanism if you are really trying to rely on it. It’s like
playing hide and seek, it’s the peas and the shells, and I’m going to move the shells around a
couple of times and if you guess it wrong the first time you are going to guess it right the second
time. I mean there are only so many places I can move the shell, right.
If the attack vector is not a guessing attack then expiration periods can temporarily stop an
ongoing access. If I have a password and I am using that password continually to access your
account and you change your password, changing the password is going to help. But the
expiration period won’t stop continued access if there is a backdoor you put in the first time that
you got access to the account, which some people do. It doesn’t undo any damage that happens
on your original access to the account. If you stole a bunch of files changing the password
doesn’t un-steal the files. It doesn’t stop attack vectors which are persistent or which can
re-execute. So for example, persistent client-side malware or persistent man-in-the-middle attacks,
or things like that.
>>: [indiscernible].
>> Paul Van Oorschot: Well, if it’s an offline guessing attack I guess the way I’m thinking about
that is first of all that happened so fast that he’s probably got access. The little glitch that came
in with the offline attack is that if it is an offline attack and you change your password then we
get into this whole thing of, “Okay, so that assumed that the attacker somehow got a hold of a
leaked password file,” and then if it is an offline attack and you change your password you
have to assume that he gets a hold of the leaked password file again, right.
>>: Well I don’t know, I’m not sure if this is one of the reasons that people have these policies,
but I can imagine that if the password file gets leaked and people have to change their
passwords relatively quickly, by the time the attacker cracks all of these passwords it’s already too
late.
>> Paul Van Oorschot: Right, so this gets back to the earlier point that was made that if they are
cracked then the period between them being “cracked” and them actually being, if you are going
to call it “cracked” or “compromised”, but them actually being exploited as a result of that, if
there is a lag and you change it in that lag period you are okay. But if it turns out that you can go
through the full exhaustive search in 9 days I wouldn’t want to be relying on this for my
defensive protection that we just happen to be a day ahead of the attacker using it.
>>: My assumption is that the reason for the policy is the second one that you listed; that it is
trying to put some bound on, when the password does get compromised, how long the
attacker has access. And you list the reasons why that’s not valuable, but
presumably those reasons are not enough to convince people to make these policies?
>> Paul Van Oorschot: So I think it’s hard to say it gives no protection at all. So I think we have
to acknowledge that it provides some advantage. The question is at what cost? What we haven’t
measured at all in any of these equations is: what is the cost and the impact to the users and the
user’s time? The typical assumption is that they are unhappy, but that’s free. And in enterprise
environments you guys are paid well, you should be made to suffer a little bit for all that pay you
get. So you do some things that we tell you to do and that’s fine. That doesn’t work as well in
the online environment. Of course we don’t see password expiration periods at all on the web,
right, very few.
>>: You have to quantify the harm that you are preventing through this policy and argue that it’s
less than the time/money we are spending on users or you have to argue that there is some other
mechanism working, because you have these changing policies that are actually hurting
[indiscernible].
>> Paul Van Oorschot: But I mean it is interesting to think about the question of: if the main
advantage is that you are limiting the period of access –. Well I guess there are two cases: one is
that you are hoping that the change happens before the exploit actually –. Well you are hoping –.
If it’s that you are stopping the continual access then a clever attacker may already have what he
wants or maybe he has got another back door in. So it would seem that you failed and you are
trying to salvage something, but it doesn’t seem like the policy –. You would really like to have
a defense that works instead, that stops access originally.
Okay, so I am pretty much done here. So in the end what do the policies provide? I think it can
temporarily stop ongoing access. Now of course if you temporarily stop it then you have to
worry about the next time that it’s guessed did you stop it in that time too and then the next time
it’s guessed did you stop it? So cumulatively again this is not a winning game, but I think it is
useful to revoke some sort of a delegated access. You know you are on vacation and you give
the password to your colleague, or your secretary, or someone and that’s very useful.
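The "not a winning game" point can be illustrated with a back-of-the-envelope calculation. This is a minimal sketch, not the model from the paper: it assumes that in each expiration period an attacker independently succeeds with the same probability `p_per_period`. The function name and the 5 percent figure are hypothetical.

```python
def prob_compromised_at_least_once(p_per_period: float, periods: int) -> float:
    """Chance of at least one successful guess across independent periods.

    Assumption (illustrative only): each expiration period is an independent
    trial in which the attacker succeeds with probability p_per_period.
    """
    return 1.0 - (1.0 - p_per_period) ** periods


if __name__ == "__main__":
    # Hypothetical 5 percent chance per period, over a growing number of periods.
    for n in (1, 4, 12, 40):
        print(n, round(prob_compromised_at_least_once(0.05, n), 3))
```

Under these assumptions a password change resets the attacker's progress for one period, but the chance of at least one compromise keeps climbing as periods accumulate.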
Now the second thing is you guys are on shared projects while you are an intern here and you
come back and if there is not a password change policy you come back two years from now and
you still have access to all the accounts you had two years ago when you were here. So this
limits that, but if these are the two most useful benefits is there a cheaper way of addressing
those issues than getting 100 percent of the population to change their password every 90 days?
The other thing it does, as I mentioned, is it does force an offline attacker into having to grab a
new copy of the hash file, but if grabbing a copy of the hash file is easy to begin with that’s
probably a problem. So we can talk about why that was possible to begin with and why there
weren’t other mechanisms in place to stop that from happening. Do you have a question?
>>: [indiscernible].
>> Paul Van Oorschot: This is my last slide. So from what I have shown you I think it is fair to
say there is limited value, not no value, but limited value and this is ignoring the usability
impact, usability cost and perhaps one of the most damaging factors here is this result from
Zhang and his colleagues at North Carolina in their CCS 2010 paper. So if you know the current
password, which is the case that you are trying to disrupt an ongoing access, if you know the
current password then it turns out, because people have these little algorithms that they use,
because they have to update it every month or three months, it’s actually for many accounts or a
significant percentage of accounts, it’s actually possible to guess the next password.
So they were able to guess for 17 percent of user accounts with less than 5 guesses, if they knew
the current password, they could guess the next password. Now that’s pretty devastating,
especially if you are thinking, “Well we are going to change the password before it gets used,”
but if you actually know it’s the current password and you were able to verify this offline, and
then you try to go online and it doesn’t work, well you know the old one and now you have to
guess from that older one and the next one. So that’s a pretty severe attack against this password
expiration policy as well.
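To make the idea of guessing the next password from the current one concrete, here is a minimal sketch. It is not the algorithm from the Zhang et al. paper: the four transforms below are hand-picked, hypothetical examples of the "little algorithms" users apply, and `guess_next` is an illustrative name.

```python
import re


def increment_trailing_number(pw):
    """Fluffy07 -> Fluffy08; returns None if there is no trailing number."""
    match = re.search(r"\d+$", pw)
    if not match:
        return None
    digits = match.group(0)
    bumped = str(int(digits) + 1).zfill(len(digits))  # keep leading zeros
    return pw[:match.start()] + bumped


def append_one(pw):
    return pw + "1"


def append_bang(pw):
    return pw + "!"


def toggle_first_case(pw):
    return pw[:1].swapcase() + pw[1:]


# Hypothetical transform list, tried in order of assumed likelihood.
TRANSFORMS = [increment_trailing_number, append_one, append_bang, toggle_first_case]


def guess_next(current, max_guesses=5):
    """Return up to max_guesses candidate successors of a known password."""
    guesses = []
    for transform in TRANSFORMS:
        candidate = transform(current)
        if candidate and candidate != current and candidate not in guesses:
            guesses.append(candidate)
        if len(guesses) == max_guesses:
            break
    return guesses


if __name__ == "__main__":
    print(guess_next("Fluffy07"))
```

An attacker who has verified the old password offline only needs a short list like this to try online, which is why a small guess budget can be enough.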
So I guess my view of this is that the benefits are at best partial and minor. They are not zero,
but they are partial and minor and there is no concrete evidence how useful they are. I am trying
to come up with a bound on them not being useful, but we haven’t seen any evidence showing
how useful they are. So I would like to see some of that evidence from someone before they
continue to use these and then I will leave it with: what alternatives can deliver equivalent or
almost equivalent benefits at lower costs to users? So that’s all I had and I am happy to take
questions and discuss other cases.
[Applause]
>>: I think you missed the most damaging thing about password expiration policies, which is
that every restriction you put on a user causes them to choose from a smaller set of possible
passwords. And when you have to choose a password that’s going to be easy for you to modify
in 70 or 90 days you have to choose from the set that you know how to easily modify to make it
memorable so you can make the minimal change to it again. You are going to make it even
easier to guess passwords and you are going to create popular passwords that weren’t popular
before, that are far more popular than some of the ones you have to worry about.
>> Paul Van Oorschot: So I understand that and I think there is validity to that, although I have
heard a counter argument to that as well. I have heard this from [indiscernible] as well, but there
is a possibility, and we haven’t seen the studies, but there is the possibility that by forcing users
to change passwords the base password that they have in their algorithm, the base password
together with the little variation they have, whether it’s sticking the month on the end or some
number on the end that rotates from 1 to 12, that might be harder to guess than the password they
would have. So I would love to see a study that you or your colleagues do, but we don’t know.
We don’t know which of those play out.
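One part of this open question can be bounded without a study: how much a rotating suffix can add on its own. This is a minimal sketch that assumes the suffix is one of `n_variants` equally likely values, which is an optimistic assumption for the defender. `suffix_bits` is a hypothetical helper name.

```python
import math


def suffix_bits(n_variants: int) -> float:
    """Upper bound, in bits, on what a suffix with n_variants values can add.

    Assumption (illustrative only): every variant is equally likely, so an
    attacker must try up to n_variants suffixes per base password.
    """
    return math.log2(n_variants)


if __name__ == "__main__":
    # A number on the end that rotates from 1 to 12.
    print(round(suffix_bits(12), 2))
```

A suffix cycling through 12 values adds at most about 3.6 bits, that is, it multiplies the attacker's work by at most 12. Whether the base password chosen under an aging policy is stronger or weaker than it would otherwise be is the part that needs data.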
>>: So I agree with that to some extent. I have often thought we would make it to a point where
our IT department sends out a note saying, “After careful study we have decided the very best
password is foo and now everybody should just be using that, because it satisfies all of our
constraints.” But seriously I don’t agree with the premise you are saying and it’s because I don’t
change my password because I fear there is some long-term, ongoing, slow, exhaustive search
attack that somebody is mounting against my password and if I don’t change it eventually it will
get hit. I change it because of leakage and leakage is not represented in this model.
>> Paul Van Oorschot: Are you worried about leakage, like man in the middle, or a password
hash file?
>>: Neither, I am worried about exposure.
>>: [indiscernible].
>>: Yeah, exactly. I am not a big fan of crypto key rolling because I don’t fear a leakage there so
much, but here I see all the time where somebody is in a meeting and they are logging into
a display and they think they are typing their user name, they think they are in the password
field, but they are actually in the user field, so it’s displaying, they get half way through it, they
realize it and they suddenly erase. And if they are supposed to change their password every 90
days they might walk out of a meeting saying, “Oh no, I guess I better change it now,” but if this
is a password that they have had for five years, they chose very carefully, they think it’s really
good they are probably going to go away saying, “Oh I hope nobody really saw when I did that,
because I don’t really want to change this password that I have had for five years.”
>> Paul Van Oorschot: But to pick up on the point, what you are saying this analysis misses is
basically the observation attack: the longer you are using a password, the greater the
accumulated probability that someone has seen you use that password –.
>>: Or seen parts of it. Maybe this meeting they saw the first half and then because of
something else, six months later somebody saw the second half.
>> Paul Van Oorschot: I think that’s valid. Of course, if the whole space of all the attackers
collapses down to just this tiny percentage that has physically seen you enter your
password, that’s a smaller subset than attackers from the other space.
>>: Sure, but there are other leakage challenges.
>> Paul Van Oorschot: Yeah, fair enough.
>>: So I mean I open it up that people at MSR give talks externally sometimes and if they are
giving a talk on the projector and it’s recorded then pretty much anybody that watches it or cares
to watch it could possibly see that slip up.
>> Paul Van Oorschot: If you have a video of someone typing the keys, that’s a bad leakage.
>>: That’s exactly a place where there is a partial leakage, so that over time, over six
months, maybe the camera was in the right place.
>> Paul Van Oorschot: I think that’s fair, but again this reminds me too of this great fear that
everyone has about sticking a yellow sticky on your machine and well your risk there is your
family members at home, your spouse, the cleaning ladies, all the rest, but again that’s
worrisome, but it’s a tiny fraction of all the attackers that are attacking from the other side of the
world as well.
>>: It’s also not scalable right. I mean I could have the entire population of China trying to do
visualization of passwords and there wouldn’t be enough people to get the passwords.
>> Paul Van Oorschot: It does get us to an interesting other discussion about generic attacks
versus targeted attacks. And if we start to talk about targeted attacks or someone wants to get
into your laptop I think we need a whole lot better than passwords securing our accounts if we
are worried about targeted attacks. That’s a very difficult thing to defend against. I have got one
question on my monitor here and the question is: how about the benefit that aging policies
prevent users from using the same password for all their properties or accounts. And certainly
having an aging policy, which causes you to rotate passwords, probably causes
that. You know no one really wants, when this password has to get rotated, to rotate all
your other passwords at the same time. So that is a side benefit, of course from a usability point
of view you might actually want to be sharing your password amongst a bunch of low value
properties, but anyone that actually has an expiration policy in place probably considers
themselves not a low value property. So I think that is a potential benefit of the policy.
>>: [inaudible].
>> Paul Van Oorschot: I would guess we haven’t seen the studies on this. I don’t know if there
are studies, but my guess would be it does flatten out the passwords, but if you are rotating
between, fluffy, and dog, and marshmallow, if it’s flatter, but not flat enough to withstand attack
then we don’t know that it helps. I guess the real issue is: if it is as predictable as the Zhang
study says that if I know one of your passwords then I can get the other ones with reasonable
probability that might be a more interesting question than how flat the distribution is.
>>: [inaudible].
>> Paul Van Oorschot: I think a lot of the online attackers would just go to a big database, and it’s
growing all the time, and take the password distribution that they had. So that won’t map
exactly onto the Microsoft Research set of passwords, but there is probably some overlap. And
it’s not that you want to guess 100 percent of the passwords, but you would be happy if you
could guess 1, or 2, or 5 percent, because that might get you in and allow you to do some privilege
escalation attack or something else. So I don’t really have an answer to your first question.
>>: We have heard a lot of good math and theory. Is it possible to conduct an empirical study on
this to find out if it’s harder to break into say, you have got two equal companies, one has got a
password expiration policy and one doesn’t, is that something that is possible?
>> Paul Van Oorschot: Let me ask Mr. [indiscernible] and Mr. [indiscernible] here, your
Microsoft colleagues, whether it would be possible to carry out an experiment like that.
>>: You mean white hats or black hats? I don’t know which data would be more helpful.
>>: I think black hats probably do it all the time right, but they don’t publish so much and white
hats it’s getting it past your legal team. That might be a little bit challenging, but the short
answer is I guess I don’t think there are good data sets. I don’t think we even have a good data set
of –. There hasn’t been a large scale breach of a data set that had a password expiration policy in
place. So we don’t know the distribution. My guess would also be that it’s somewhat flat, but
that’s kind of speculation. We don’t know by how much and we certainly don’t have any data
that ones who do expiration have fared better than ones who do not.
>> Paul Van Oorschot: Yeah, I mean it’s really only been in the last 5 years or so that we have
had some of these large breaches and some of the studies, for example the Bonneau study with
the 70 million Yahoo passwords that Yahoo cooperated in letting him analyze in a privacy
preserving way, but it’s only in the last 5 years that we have had the benefit of some large data
sets like this. This is one of the challenges in research in this area, is you want to do ecologically
valid research with real data sets rather than made up ones. I mean that’s the ideal case and it’s
hard to get exactly those passwords. There have been some studies, you know studies
with Mechanical Turk and others, that have certain properties that are quite useful for studying,
but it’s a hard area to get real data on and very large data sets is what you want.
>>: You mentioned as an example an 8 character data set [indiscernible]. That is an excellent
number if people have real keyboards, like even uppercase is easy. Now that we all type on little
postage stamps shouldn’t you count key strokes? This means going from lower case a, to the
number 1, to upper case B, to lower case c, is god knows how many key strokes.
>> Paul Van Oorschot: Yeah, and I think the mobile phones –. I would like to see a study and I
can’t recall having seen one, but I would like to see a study of how passwords chosen from
mobile devices, what their key spaces are compared to passwords chosen on desktop machines. I
don’t think that would be a happy result either.
>> Cormac Herley: Any other questions? Let’s thank the speaker again.
[Applause]