>> Cormac Herley: So let’s get started. It is a pleasure to introduce Paul Van Oorschot, a repeat visitor, who is going to talk to us about password aging policies and quantifying security advantages. Paul is well known to many of us. He used to work on Crypto for a long time, but then saw the error of his ways and decided to start working on passwords. >> Paul Van Oorschot: Thank you, Cormac. So a very apropos intro, because what I am going to do in this talk is actually go back to Crypto. So we are all familiar with password aging policies and I don’t need to ask people how excited they are about changing their passwords every 30, 70 or 90 days. Generally it’s not that people get up in the morning and say, “Today’s the day I get to change my password and I am very excited.” The problem here that I am looking at is that the expectation is that somehow this helps and improves security, and we are somehow convinced that this is going to help security. But the question to ask is: How much does it help and in which ways? So that’s the question I want to explore. This work, if you are interested in looking at the details, is in a paper that will appear shortly in a special issue of Designs, Codes and Cryptography; Scott Vanstone was actually my PhD advisor. So the thing about password aging policies is we have this general notion, or we come to believe, that it helps security in some vague ways, and I will talk about a couple of them later, but quantitatively what is the benefit? What gain and what benefit do we get? And in order to answer that question the easiest thing to do is to try and formalize things. Now if you formalize things you end up making assumptions in models that don’t actually map onto the real world, and so then what’s the value of your model? 
Well I do start here by formalizing and going to a Crypto security model where, rather than passwords, which are chosen in a very skewed distribution by users, we are going to start out looking at cryptographic keys, which are randomly chosen and equally probable, in order to figure out what the bounds are. Then I want to take that, because we can get precise answers, and try and get some insight into mapping that onto real user passwords. It will turn out that will give us some lower bounds, and with skewed distributions we can then see what those bounds can tell us. It turns out it gives us some insights as to how bad things are, and even in the idealized setting things aren’t great in terms of the advantages. I am happy to take questions throughout. I don’t think I have an awful lot of slides here so it’s probably best to keep the context of a particular slide. So jump in if you have some questions. Okay, so here’s the background thing to think about during the talk. Suppose you change your password continuously, basically as often as an interface allows. Does that prevent successful guessing attacks? So just try and think of that in the background and then we will try and put some precise numbers to that question. So here’s the model we are going to start out with. I’m not claiming that this maps well onto users’ passwords, but this is where we start. So we have a crypto key search problem: you have a randomly chosen key, and we are assuming we are going to do an exhaustive search. So there is a fixed key space, assume there are capital R keys, and we assume that an attacker actually is going to be able to guess through all R keys, and we are going to have two different periods we are going to talk about: one is the exhaustive guessing time, the exhaustive period time, that’s capital T, and the second time period we are going to talk about is the password expiration policy period, which is capital P. We are going to go through a set of different sequences here. 
We are going to look at what the relationship between the exhaustive guessing period and the expiration period is and then consider the different cases as well. So we are going to assume for simplicity an online guessing attack. I will have some comments about offline later on. It is a deterministic finite search. So the attacker will be successful if you do not change your key: by the time he has guessed every single key through the key space, somewhere in there will be your key and he will have guessed it. The policy period is known. So for example you have to change your key every 30 days or 90 days. So that period is known, but the attacker does not know at which precise moment within that period a user changes his key. If the attacker gets to the end of his key space we assume that he will just start guessing again. Now when he guesses through the key space a second time he might guess through it in a different deterministic order, but it is a deterministic order again that he is going to go through. So he will guess through the key sequence once, every single one once, and then in a possibly different order the next time. And we will assume these are random orderings and that allows us to get the probabilities [indiscernible] precisely. >>: [inaudible]. >> Paul Van Oorschot: Okay, so it turns out that’s a good question. So the analysis is simplest if we assume it’s an online guessing attack. It turns out there is a little hitch in this particular model: if it’s an offline guessing attack, how does that map onto the real world? Well it turns out if it’s offline, if you actually change your key then the attacker actually can’t be verifying offline unless he grabs another copy of the key file. So just to avoid that issue right now let’s assume it’s online guessing and we don’t have to worry about that little glitch. 
>>: There’s another time period to consider which is: how long between the attacker breaking your key and him selling it to the black market to the bad guy in Ukraine who is stealing your money? >> Paul Van Oorschot: That’s another question too, exactly. >>: Is there any sense of a lockout sort of thing, a time period of defense for the online aspect? >> Paul Van Oorschot: No, in this case we are just going to assume these are defensive measures which can help you stop attacks, but let’s just say that they don’t do that. Surprisingly there are a lot of online properties that don’t do that and actually don’t even do throttling. So, I want to get through the simple model first, get some precise answers and equations and then see how that maps onto passwords. >>: Sorry, the attacker is only focused on one account at a time, is that right? >> Paul Van Oorschot: Well again, for this analysis let’s assume that these are guesses on one account and then we can go through and see if he –. In the real world the attacker is probably going to guess the most probable password against all accounts and then the second most probable against all accounts. Some attackers are looking to attack your account, but if it’s a generic attack they would probably do most probable guesses against all the accounts for which they have access. Okay, so here is the key space K with k sub 1, k sub 2, up to k sub capital R. Let’s assume that’s a visual representation of the key space and the user’s key is going to just be lower case k without a subscript. The attacker is going to guess key k sub i at time t sub i and bingo, that value there would be where he gets the correct key. So the question then is if you change your user key k to k*, so after lower case u units of time, if you change your key what security advantage results? So what is the change in the probability of a successful guess over all R guesses? We are assuming it’s over all R guesses. 
So that’s the model we are going to set up. We are going to answer this question precisely and then we want to map it to our real problem of interest. Okay, so for the first analysis let’s assume that the exhaustive guessing period, the time it takes for the attacker to guess all R keys, is capital T and that’s going to be less than or equal to the policy change period, which is capital P, which is your 90 days or whatever. Now for the attacker to be successful, if we assume that there are two periods, call them W1 and W2, think about the user’s key being k and then k*; the attacker is successful if he guesses k while it is active in period W1 or k* in W2, and the only time he is unsuccessful is if it’s neither k within W1 nor k* in W2. We can go through and figure out the exact probabilities here, because if the key is changed after u units of time, the proportion of keys in W1 is u divided by capital R, and let’s call that Q, and that leaves a proportion of 1 minus Q in W2. And simply from the sizes of those partitions you get the probabilities because we are assuming it’s a randomly chosen key and they are all equally probable. So just from that analysis and what’s on that slide the only case of attacker failure is if k is not in W1 and k* is not in W2, down at the bottom here, and the probability is simply going to be 1 minus Q times Q. So the probability of success is going to be adding up Q1, and Q2, and Q3, and the probability of attacker failure is Q4. So it turns out that our probability of success is 1 minus Q plus Q squared. The probability of failure is 1 minus that, which is Q minus Q squared. Does anybody know how to figure out a min/max here? Take Q minus Q squared and ask where its derivative equals zero. The min/max is at Q equals one-half. Probability of failure is one-quarter, probability of success is three-quarters. So this is what it looks like. You are going to get the minimum here. The probability of success gets minimized if the user changes exactly halfway through the period. 
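The two-window result above can be checked by brute force. The following is a minimal sketch, not something shown in the talk: it assumes a toy key space of R = 10 keys guessed in a fixed order (equivalent to a random order when keys are uniformly random), and the function names are illustrative only.

```python
from fractions import Fraction


def success_closed_form(q):
    """Attacker success over one exhaustive search when the key changes
    after a fraction q of the key space has been guessed: 1 - q + q^2."""
    return 1 - q + q * q


def success_by_enumeration(R, u):
    """Count the (k, k*) pairs the attacker catches when guesses are made
    in order 0..R-1 and the key switches from k to k* after u guesses."""
    hits = 0
    for k in range(R):           # key active during window W1
        for k_star in range(R):  # key active during window W2
            if k < u or k_star >= u:
                hits += 1
    return Fraction(hits, R * R)


R = 10  # toy key-space size, chosen only to keep the enumeration small
for u in range(R + 1):
    q = Fraction(u, R)
    assert success_by_enumeration(R, u) == success_closed_form(q)
    print(f"q = {float(q):.1f}  P(success) = {float(success_closed_form(q)):.2f}")
```

The enumeration and the closed form agree for every change point, and the smallest value in the printed table occurs at q = 0.5, which is the three-quarters figure quoted above.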
The probability of success is still greater than or equal to 75 percent. If you didn’t change it at all there was a 100 percent probability of success for the attacker. If you do change it at the optimal time it drops to 75 percent. So the X axis is the proportion of the key space searched before the user changes a key and Y is the probability of attacker success. So that’s the simple base case and then we are going to go through some more realistic cases. So if the search fails, again what’s the attacker going to do? They are going to repeat the search, starting through the key space in a possibly different deterministic order. If you then go through i different searches, lower case i, then the probability of success is going to be 1 minus the probability that he fails on all the searches. So it’s going to be 1 minus one-quarter to the i. So this is what happens: across the bottom now we have the number of searches that the attacker gets, and already it is .75 probability of success here and it goes up to .9375, then it’s 98 percent and 99.6 percent. So even with a small number of total periods T the attacker is going to have near-guaranteed success. So that’s not surprising, because what you are doing here is you are giving the attacker many more guesses. What turns out to be important is how many guesses does the attacker get? Changing the key ends up not changing the overall probability of success over here, so our summary from this is that when the exhaustive period capital T is less than or equal to the policy expiration period capital P, the policy doesn’t really help us. It doesn’t give us a lot of hope, or certainly in any sort of environment you would hope that the attacker is going to have less than a 98 or 99 percent chance of success. All right. So, that was our base analysis where T was less than or equal to P. 
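The repeated-search figures quoted above follow directly from the one-quarter failure probability. A minimal sketch, assuming the defender always changes the key at the optimal halfway point and that successive searches are independent (the function name is illustrative):

```python
def success_after_searches(i, fail_per_search=0.25):
    """Probability that at least one of i independent exhaustive searches
    succeeds, given the per-search failure probability."""
    return 1 - fail_per_search ** i


for i in range(1, 5):
    print(f"searches = {i}  P(success) = {success_after_searches(i):.4f}")
```

For one to four searches this gives 0.7500, 0.9375, 0.9844 and 0.9961, matching the values read off the slide.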
It gives us the exact intuition for if T was equal to twice the policy period P; well, the optimal thing here for the defender to do is to change their key halfway through the period. In fact that was kind of what we modeled. If you don’t have to change your key until 90 days, most users will not change it until 90 days, and so you will almost always have it pushed out to the end. But the optimal thing to do here, if T was equal to 2P, would still be to change it after 45 days. We can’t expect the users to do that, but that’s what they would be best off doing if we knew exactly what the attacker’s search time was. We don’t know that, so it’s very hard. So all the analysis we do here is actually assuming the defender changes the key at the optimal time. We couldn’t even achieve that with real users because we don’t know what the real attack is. So let’s go forward a little bit more. So suppose we generalize; suppose rather than T equal to P or T equal to 2P we say capital T equals lower case t times P. So for example now the policy expiration period capital P maybe is 30 days, one month, and the search time might be 24 months. So the little lower case t would be 24 here. So this models more of the space. So what happens here? Well the analysis is almost exactly the same. The easiest way to go through the analysis is to assume that lower case t equals 3. So now we have got 3 periods here: W1, W2 and W3 rather than just W1 and W2. Go through it in your head for 3, but every time you want to use 3 stick in a t instead and you get exactly the same analysis. The only time the attack will fail is if the key for each period fails to be in its window in all 3 of those. So the probability of failure is going to be (1 minus 1 over t) to the t, and the probability of success is 1 minus (1 minus 1 over t) to the t. And magically this little e comes out here. I don’t know how that happens in mathematics. Somebody jigged the system many years ago and this number comes up. 
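The appearance of e can be seen numerically. This is a small sketch under the talk's idealized assumptions (uniformly random keys, one exhaustive search split into t equal windows, one fresh key per window); the function name is illustrative.

```python
import math


def success_one_search(t):
    """Attacker success over one exhaustive search that spans t policy
    periods of equal length: 1 - (1 - 1/t)^t."""
    return 1 - (1 - 1 / t) ** t


limit = 1 - 1 / math.e  # large-t limit, about 0.632

for t in (1, 2, 3, 24, 1000):
    print(f"t = {t:5d}  P(success) = {success_one_search(t):.4f}")
print(f"limit      P(success) = {limit:.4f}")
```

Since (1 - 1/t)^t increases toward 1/e as t grows, the attacker's success probability decreases toward 1 - 1/e but never drops below it, however often the key is changed within a single search.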
So it turns out that if t gets very, very large in this case –. So the exhaustive search period for the attacker is very, very large, and the probability of success is still going to be 63 percent or so. So this is what graphically we have. We start out here; this along the X axis is lower case t, and t basically tells us the relationship between capital T and capital P here. So that’s how many full, exhaustive searches the attacker gets within one –. Sorry, that’s how many policy periods there are in an exhaustive search. Then we come down here and very rapidly we hit a lower limit. So you can think of this as the attacker search power getting weaker; the expectation of attack success falls, but it doesn’t get any lower than .632. So intuitively you might have thought it would fall further than that. The numbers just tell us it doesn’t fall any further than that. So that was for the case where capital T is equal to lower case t times P. Now the next thing we want to consider is suppose we have multiple full exhaustive searches, i periods of full exhaustive search. Then the probability of success is going to be 1 minus the probability of failure in each of those and we just get the exponent i on that right hand side here again. Now if we graph this one I think this tells really the full story we want to get. So the difference here is that on the X axis I have got how many full exhaustive search periods we have, and now the Y axis here is still the probability of attacker success, although this axis is kind of what you saw before on the other graphs. Here’s our probability of success, 1.0 with just one search period. It’s 1.0 if you don’t change your key at all and it’s .75 if you change it once, so that’s t equals 2. It hits the limit here of .632. So this is all for one search period, but then you go across here and in this graph we are showing t as increasing with these different curves. 
So t equals 1 is this curve, t equals 2 is this curve, then it comes down, t equals 128 is this curve, it’s not much different than t equals 32, then if you go to 256 and 512 you get essentially the same curve. So the curve doesn’t change much over that. So t increasing means the attacker search power is weaker. That helps the defender, but it gets bounded again essentially by this curve, and even when the attacker is in his worst case he does not do too poorly. The sad news is that the probability of success is at least 95 percent if you get three or more search periods, independent of lower case t, and the probability of success is approaching 1.0 even for small i, independent of t. So again it boils down to the important point, which is: how many guesses does the attacker get? Not how often you are changing your key. Changing the key helps a little bit. >>: This seems to get very close to the [inaudible] case, which is a game where I pick a random password in the space and you try to guess it and then I change it and repeat. >> Paul Van Oorschot: Right, exactly. So this same analysis goes back to DES key search machines and the people who did the crypto work on this 20 to 30 years ago, they would say, “Yeah, that will make sense.” So now we are mapping that just to the password guessing problem. >>: [indiscernible]. >> Paul Van Oorschot: Well this is the online attack. So he has to go find the key and then use it and go online right away. So if, for example, you’re targeting one individual this is probably what they would do. If you are trying to harvest a bunch of keys and then sell them online in some underground system to someone else then it’s a different issue. Okay, so now let me make a comment on the exhaustive search time T being much longer than the policy period P. So here you would actually have the key changing very often before the full exhaustive search. And again, the important part is not the number of changes, it is how many guesses the attacker gets. 
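The family of curves described above can be tabulated directly. A minimal sketch, assuming i independent exhaustive searches that each span t policy periods, so that the per-search failure probability (1 - 1/t)^t is simply raised to the power i; the function names are illustrative.

```python
import math


def success(t, i):
    """Attacker success after i full exhaustive searches, each spanning
    t policy periods: 1 - (1 - 1/t)^(t*i)."""
    return 1 - (1 - 1 / t) ** (t * i)


def floor_for(i):
    """Large-t limit, the best case for the defender: 1 - e^(-i)."""
    return 1 - math.exp(-i)


for i in (1, 2, 3, 4):
    row = "  ".join(f"{success(t, i):.4f}" for t in (1, 2, 32, 128, 512))
    print(f"i = {i}: {row}   floor {floor_for(i):.4f}")
```

Each row moves toward its floor as t grows, and the floor itself depends only on i. At i = 3 the floor is already just above 0.95, which is the "at least 95 percent, independent of t" statement.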
So that’s the limiting case we saw in that slide. So now a comment again about offline attacks. So if you go back into some of the original documents, and I guess these would be US government documents, probably the ones which most people would find before others, the original goal of password expiration policies was stated as being to limit the risk of password compromise in one year to one in a million. So that was what they stated their original goal was. Now this is entirely unobtainable today if we think about offline attacks and we assume there is no iterated hashing, and lots of people still don’t do iterated hashing as a defense, because modern processing resources easily allow 7 to 10 billion guesses per second. And if you take for example an 8 character totally random password, passwords aren’t really totally random, but let’s just say they were, and you had a 93 symbol alphabet, then you would be able to search through that full 52 bit space in 9 days, which would mean to reduce the probability of success to 1 in a million you would be changing your password every 800 milliseconds. >>: So it is one processor you are guessing? >> Paul Van Oorschot: Well actually yeah, and so attackers would typically have not just like quad core processors, these are GPUs, you would have a thousand of these or you rent them in the cloud. So this is one processor, which is very, very cheap today. >>: I think you once convinced me that it’s not per processor, it’s per dollar. >> Paul Van Oorschot: Yes, per dollar, right. Okay, so what this tells us is that the password expiration period, the aging policies, give absolutely no protection against offline attacks. >>: Well I generally agree that these policies are useless. You are putting the context in assuming the dumbest defender possible who is not iterating at all, not doing anything, and in that case if you actually had something that’s a space of 52 bits and you are doing things sensibly you are in pretty good shape. 
To say, “This is never good and we are going to start by assuming that you hash in the dumbest way possible,” and then assume that this argument carries over for all other reasonable ways of hashing, that logically doesn’t follow. >> Paul Van Oorschot: Well the argument I am trying to make here is: what does the password expiration policy grant you? If you put in place a lot of these other things, which you are mentioning, which are very useful, throttling, hashing, salt and the other stuff, well if you do those things you probably don’t need to do password expiration at all. So that’s the question. >>: [inaudible]. >> Paul Van Oorschot: So if we can look and first decide: what does the changing itself give you? Then we can try and weave together what the best defense is. >>: But your argument is like saying, “Well if everyone has to set their password to either dog or cat and they have to go back and forth between either dog or cat, well clearly password reset doesn’t help you in this context; therefore it helps you in no context.” I don’t buy that argument. >> Paul Van Oorschot: Well, I mean I would love to chat with you over the next couple of days while I am here about this, but what I first want to do is say right here: if we are looking at offline, if we are not iterating, what does it buy you? So that’s what I want to establish. All right? >>: Okay. >> Paul Van Oorschot: So here’s a partial summary of what we got. We can look at first saying the exhaustive search period is about the same as the password expiration period or a little bit less. Then we can have a multiple, that the exhaustive search period is a multiple of the expiration period, and then we can talk about going through multiple searches over this space T, i times T. And in all cases that alone, without these other defenses, is not going to help you an awful lot. But this is just for equiprobable random keys; this is not for user passwords. 
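The offline figures quoted earlier (a 93-symbol alphabet, 8 random characters, a 52-bit space searched in 9 days, a change every 800 milliseconds) can be reproduced with a few lines of arithmetic. This is a rough sketch that assumes the low end of the quoted guessing rate, 7 billion guesses per second with no iterated hashing, and reads "one in a million" as a cap on the fraction of the key space an attacker can cover within one key lifetime.

```python
import math

ALPHABET = 93            # symbols, as quoted in the talk
LENGTH = 8               # characters
GUESSES_PER_SEC = 7e9    # assumed: low end of the quoted 7 to 10 billion
TARGET_RISK = 1e-6       # "one in a million"

keyspace = ALPHABET ** LENGTH
bits = math.log2(keyspace)
search_seconds = keyspace / GUESSES_PER_SEC
search_days = search_seconds / 86_400

# To cap the chance of a hit within one key lifetime at TARGET_RISK, the key
# may live only long enough for that fraction of the space to be searched.
max_lifetime_seconds = TARGET_RISK * search_seconds

print(f"key space: {keyspace:.3e} keys ({bits:.1f} bits)")
print(f"exhaustive search: {search_days:.1f} days")
print(f"maximum key lifetime: {max_lifetime_seconds * 1000:.0f} ms")
```

At the high end of the quoted rate the search time and the permitted lifetime both shrink further, so the assumption made here is the one most favorable to the defender.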
So can we use these bounds, these equations that we have, and say, “Okay, well what happens when we actually have user chosen passwords? Do things get better or worse?” So the answer is: they don’t get better, and we shouldn’t be surprised by that. So just as background, there are two empirical studies here that help us understand where we go in practice with user chosen passwords. So the first one is Joe Bonneau’s study; he had access to 70 million Yahoo passwords, and a couple of the data points from his study: for online guessing, if you were to try the most popular passwords on each of a large number of accounts, so say you got 10 guesses per account, you could get about 1 percent of the full space of passwords with simply 10 guesses per account. So that shows you there is a lot of skew in real user chosen passwords. And if you had the optimal attacker and they were able to carry out a massive search, so there was no throttling and a lot of other things weren’t done which you should do, if you got 1 million guesses per account, 10 to the 6 guesses per account, you would get about 50 percent of user passwords. And this is surprising; this shows really big skew on the front end of these distributions. There was also a study published at CCS by Weir and colleagues. They looked at 32 million passwords and most of that was the RockYou data set. Of the 32 million passwords they took a sub-list of 5 million and then they took the 50,000 most popular passwords from that sub-list of 5 million and they took that 50,000 and guessed it on the remaining –. >>: [inaudible]. >> Paul Van Oorschot: Well, but of the 32 million I think there were only 13 or 15 million that were unique. So I think there was maybe 12 million or 15 million left, take out 5 million, and of the remaining sub-lists of 1 million they were able to get over 25 percent on those other sub-lists, so a big skew again there. 
So what this means is that the results are going to be worse for the defender when you are guessing against user chosen passwords rather than crypto keys, because what’s the attacker going to do? He is not going to guess randomly. He is going to try and use some empirical data and guess what are approximated to be the more popular passwords. Question? >>: So the point was brought up that users choose passwords differently. Some are really smart and careful and then there is my grandmother who really doesn’t choose very good ones. >> Paul Van Oorschot: And the majority of the people are the grandmothers, because the rest of us who are smart are all in the room here because we are working at Microsoft and we are computer security experts. >>: Right, I don’t want to tell you what my password was before I got into computer science. So you are suggesting that password expiration policy gives more benefits to the smart password choosers than to the dumb password choosers? >> Paul Van Oorschot: I don’t remember saying that, but let me think. The expiration policies, well if we go through the crypto model, if the attacker gets to make a sufficient number of guesses, he is going to guess through all the keys, but in practice he is not going to guess the totally random ones because he won’t get to those ones, mainly because most attackers would rather get through more accounts of weaker users than try and get the tougher passwords. So I guess that’s what I am talking about here on the next slide. The two main differences when you are trying to analyze for user chosen passwords rather than crypto keys are: one, there is variation in password length. So we don’t even know what the full set of passwords is because most systems don’t [indiscernible] you can have 16, or 32, or you know maybe 64 character passwords, or longer. 
But we can model that by saying the vast majority of users don’t choose passwords longer than 16 characters or 12 characters and that’s not a bad approximation, but the skew in these distributions is the hardest thing to model. So what real attackers do, and this is what’s important, is that they are what we call early quitters. They are not going to guess through full exhaustive spaces. They are going to guess the most probable keys against a larger number of accounts and never guess the very infrequent keys; unless they are doing a targeted attack on just a particular user, that’s not the way they are going to get access to the greatest number of accounts. >>: [indiscernible]. >> Paul Van Oorschot: Oh, so more about that in the last slide, but there is a side effect of the policies to take into account as well. So what happens as a result of the skew is that if we looked at what the attacker’s success would be over guessing full spaces, the results from the idealized model would be the same. But in fact rather than over the full exhaustive period, if you look at how early in the period the attacker gets success, the more skewed the distribution is, the earlier we expect within the period the attacker is successful. So this makes online attackers very happy, because the online attacker gets immediate feedback when they have guessed the right password. Okay, so the attack success rates from the idealized crypto model will be just as we said over the full period T, but it will be earlier within the periods. The offline attack, the full searches, are basically going to happen pretty quickly with today’s hardware. For the online attack the skew is going to be very helpful. So let’s come back then to say, “What does password expiration do?” Does it actually stop guessing attacks? I think the answer is no; if a password is “guessable”, if it’s within reach, then the password is “guessable”. 
If it’s guessable, it’s guessable, and the attacker is going to guess it, and if he doesn’t guess it in this period he is going to guess it next period. It’s not a good defensive mechanism if you are really trying to rely on it. It’s like playing hide and seek, it’s the peas and the shells, and I’m going to move the shells around a couple of times and if you guess it wrong the first time you are going to guess it right the second time. I mean there are only so many places I can move the shell, right. If the attack vector is not a guessing attack then expiration periods can temporarily stop an ongoing access. If I have a password and I am using that password continually to access your account and you change your password, changing the password is going to help. But the expiration period won’t stop continued access if there is a backdoor you put in the first time that you got access to the account, which some people do. It doesn’t undo any damage that happens on your original access to the account. If you stole a bunch of files, changing the password doesn’t un-steal the files. It doesn’t stop attack vectors which are persistent or which can re-execute. So for example, persistent clients like malware or persistent man-in-the-middle attacks, or things like that. >>: [indiscernible]. >> Paul Van Oorschot: Well, if it’s an offline guessing attack I guess the way I’m thinking about that is first of all that happened so fast that he’s probably got access. The little glitch that came in with the offline attack is that if it is an offline attack and you change your password then we get into this whole thing of, “Okay, so that assumed that the attacker somehow got a hold of a leaked password file,” and then if there is an online attack and you change your password you have to assume that he gets a hold of the leaked password file again, right. 
>>: Well I don’t know, I’m not sure if this is one of the reasons that people have these policies, but I can imagine that if the password file gets leaked and people have to change their passwords relatively quickly, by the time the attacker cracks all of these passwords it’s already too late. >> Paul Van Oorschot: Right, so this gets back to the earlier point that was made that if they are cracked then the period between them being “cracked” and them actually being, if you are going to call it “cracked” or “compromised”, but them actually being exploited as a result of that, if there is a lag and you change it in that lag period you are okay. But if it turns out that you can go through the full exhaustive search in 9 days I wouldn’t want to be relying on this for my defensive protection, that we just happen to be a day ahead of the attacker using it. >>: My assumption is that the reason for the policy is the second one that you listed; that it is trying to put some bound on when the password does get compromised and how long does the attacker have access? And you list the reasons why that’s not valuable, but presumably those reasons are not enough to convince people to make these policies? >> Paul Van Oorschot: So I think it’s hard to say it gives no protection at all. So I think we have to acknowledge that it provides some advantage. The question is at what cost? What we haven’t measured at all in any of these equations is: what is the cost and the impact to the users and the user’s time? The typical assumption is that they are unhappy, but that’s free. And in enterprise environments you guys are paid well, you should be made to suffer a little bit for all that pay you get. So you do some things that we tell you to do and that’s fine. That doesn’t work as well in the online environment. Of course we don’t see password expiration periods at all on the web, right, very few. 
>>: You have to quantify the harm that you are preventing through this policy and argue that it’s less than the time/money we are spending on users, or you have to argue that there is some other mechanism working, because you have these changing policies that are actually hurting [indiscernible]. >> Paul Van Oorschot: But I mean it is interesting to think about the question of: if the main advantage is that you are limiting the period of access –. Well I guess there are two cases: one is that you are hoping that the change happens before the exploit actually –. Well you are hoping –. If it’s that you are stopping the continual access then a clever attacker may already have what he wants, or maybe he has got another back door in. So it would seem that you failed and you are trying to salvage something, but it doesn’t seem like the policy –. You would really like to have a defense that works instead, that stops access originally. Okay, so I am pretty much done here. So in the end what do the policies provide? I think it can temporarily stop ongoing access. Now of course if you temporarily stop it then you have to worry about the next time that it’s guessed, did you stop it in that time too, and then the next time it’s guessed did you stop it? So cumulatively again this is not a winning game, but I think it is useful to revoke some sort of a delegated access. You know, you are on vacation and you give the password to your colleague, or your secretary, or someone, and that’s very useful. Now the second thing is you guys are on shared projects while you are interns here, and you come back, and if there is not a password change policy you come back two years from now and you still have access to all the accounts you had two years ago when you were here. So this limits that, but if these are the two most useful benefits, is there a cheaper way of addressing those issues than getting 100 percent of the population to change their password every 90 days? 
The other thing it does, as I mentioned, is it does force an offline attacker into having to grab a new copy of the hash file, but if grabbing a copy of the hash file is easy to begin with that’s probably a problem. So we can talk about why that was possible to begin with and why there weren’t other mechanisms in place to stop that from happening. Do you have a question? >>: [indiscernible]. >> Paul Van Oorschot: This is my last slide. So from what I have shown you I think it is fair to say there is limited value, not no value, but limited value, and this is ignoring the usability impact, the usability cost. And perhaps one of the most damaging factors here is this result from Zhang and his colleagues at North Carolina in their CCS 2010 paper. So if you know the current password, which is the case where you are trying to disrupt an ongoing access, then it turns out that, because people have these little algorithms that they use since they have to update it every month or three months, for a significant percentage of accounts it’s actually possible to guess the next password. So for 17 percent of user accounts, if they knew the current password, they could guess the next password with fewer than 5 guesses. Now that’s pretty devastating, especially if you are thinking, “Well we are going to change the password before it gets used.” If you know what was the current password and you were able to verify this offline, and then you try to go online and it doesn’t work, well you know the old one and now you just have to guess the next one from that old one. So that’s a pretty severe attack against this password expiration policy as well. So I guess my view of this is that the benefits are at best partial and minor. They are not zero, but they are partial and minor and there is no concrete evidence how useful they are.
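The kind of attack described above, guessing the next password from a known current one, can be sketched as a handful of simple transforms. This is an illustration only: the specific transforms (incrementing a trailing number, advancing a month abbreviation, appending a character) and the function name are assumptions chosen for the example, not the algorithm or the transform set used in the Zhang et al. study.

```python
import re

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def next_password_candidates(current: str, limit: int = 5) -> list[str]:
    """Return up to `limit` guesses for the password that follows `current`."""
    candidates = []

    # Transform 1: bump a trailing number, keeping its zero padding.
    match = re.search(r"(\d+)$", current)
    if match:
        digits = match.group(1)
        stem = current[:match.start()]
        candidates.append(stem + str(int(digits) + 1).zfill(len(digits)))

    # Transform 2: advance the first month abbreviation found.
    lowered = current.lower()
    for index, month in enumerate(MONTHS):
        pos = lowered.find(month)
        if pos != -1:
            following = MONTHS[(index + 1) % 12]
            if current[pos].isupper():
                following = following.capitalize()
            candidates.append(current[:pos] + following + current[pos + 3:])
            break

    # Transform 3: append a common suffix character.
    candidates.append(current + "1")
    candidates.append(current + "!")

    # Drop duplicates while keeping the order of the guesses.
    unique = list(dict.fromkeys(candidates))
    return unique[:limit]


if __name__ == "__main__":
    print(next_password_candidates("Fluffy7"))
```

An attacker who has verified an expired password offline would try a short list like this online, which is why a small guess budget can be enough when users update by minimal modification.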
I am trying to come up with a bound on them not being useful, but we haven’t seen any evidence showing how useful they are. So I would like to see some of that evidence from someone before they continue to use these, and then I will leave it with: what alternatives can deliver equivalent or almost equivalent benefits at lower costs to users? So that’s all I had and I am happy to take questions and discuss other cases. [Applause] >>: I think you missed the most damaging thing about password expiration policies, which is that every restriction you put on a user causes them to choose from a smaller set of possible passwords. And when you have to choose a password that’s going to be easy for you to modify in 70 or 90 days you have to choose from the set that you know how to easily modify to make it memorable so you can make the minimal change to it again. You are going to make it even easier to guess passwords and you are going to create popular passwords that weren’t popular before, that are far more popular than some of the ones you have to worry about. >> Paul Van Oorschot: So I understand that and I think there is validity to that, although I have heard a counter argument to that as well. I have heard this from [indiscernible] as well, but there is a possibility, and we haven’t seen the studies, that by forcing users to change passwords the base password that they have in their algorithm, together with the little variation they have, whether it’s sticking the month on the end or some number on the end that rotates from 1 to 12, might be harder to guess than the password they would otherwise have. So I would love to see a study that you or your colleagues do, but we don’t know. We don’t know which of those play out. >>: So I agree with that to some extent.
I have often thought we would make it to a point where our IT department sends out a note saying, “After careful study we have decided the very best password is foo and now everybody should just be using that, because it satisfies all of our constraints.” But seriously I don’t agree with the premise you are stating, and it’s because I don’t change my password because I fear there is some long-term, ongoing, slow, exhaustive search attack that somebody is mounting against my password and if I don’t change it eventually it will get hit. I change it because of leakage and leakage is not represented in this model. >> Paul Van Oorschot: Are you worried about leakage like man in the middle, or a password hash file? >>: Neither, I am worried about exposure. >>: [indiscernible]. >>: Yeah, exactly. I am not a big fan of crypto key rolling because I don’t fear a leakage there so much, but here I see it all the time, where somebody is in a meeting and they are logging in on a display and they think they are in the password field, but they are actually in the user name field, so it’s displaying, they get half way through it, they realize it and they suddenly erase. And if they are supposed to change their password every 90 days they might walk out of a meeting saying, “Oh no, I guess I better change it now,” but if this is a password that they have had for five years, that they chose very carefully, that they think is really good, they are probably going to go away saying, “Oh I hope nobody really saw when I did that, because I don’t really want to change this password that I have had for five years.” >> Paul Van Oorschot: But to pick up on the point, what you are saying that this analysis misses is basically the observation attack: the longer you are using a password, the greater the accumulated probability that someone has seen you use that password –. >>: Or seen parts of it.
Maybe in this meeting they saw the first half and then because of something else, six months later somebody saw the second half. >> Paul Van Oorschot: I think that’s valid. Of course the whole space of all the attackers then collapses down to just this tiny percentage that has physically seen you enter your password; that’s a smaller subset than attackers from the other space. >>: Sure, but there are other leakage challenges. >> Paul Van Oorschot: Yeah, fair enough. >>: So I mean I open it up that people at MSR give talks externally sometimes and if they are giving a talk on the projector and it’s recorded then pretty much anybody that watches it or cares to watch it could possibly see that slip up. >> Paul Van Oorschot: If you have a video of someone typing the keys that’s a bad leakage. >>: That’s exactly a place where there is a partial leakage; sometimes over time, over six months, maybe the camera was in the right place. >> Paul Van Oorschot: I think that’s fair, but again this reminds me too of this great fear that everyone has about sticking a yellow sticky on your machine, and well your risk there is your family members at home, your spouse, the cleaning ladies, all the rest. That’s worrisome, but it’s a tiny fraction compared to all the attackers that are attacking from the other side of the world as well. >>: It’s also not scalable right. I mean I could have the entire population of China trying to do visualization of passwords and there wouldn’t be enough people to get the passwords. >> Paul Van Oorschot: It does get us to an interesting other discussion about generic attacks versus targeted attacks. And if we start to talk about targeted attacks or someone wants to get into your laptop I think we need a whole lot better than passwords securing our accounts if we are worried about targeted attacks. That’s a very difficult thing to defend against.
I have got one question on my monitor here and the question is: how about the benefit that aging policies prevent users from using the same password for all their properties or accounts? And certainly having an aging policy which causes you to rotate passwords probably causes that. You know no one really wants to have this password, when it has to get rotated, force you to rotate all your other passwords at the same time. So that is a side benefit. Of course from a usability point of view you might actually want to be sharing your password amongst a bunch of low value properties, but anyone that actually has an expiration policy in place probably considers themselves not a low value property. So I think that is a potential benefit of the policy. >>: [inaudible]. >> Paul Van Oorschot: I would guess we haven’t seen the studies on this. I don’t know if there are studies, but my guess would be it does flatten out the passwords, but if you are rotating between fluffy, and dog, and marshmallow, if it’s flatter, but not flat enough to withstand attack, then we don’t know that it helps. I guess the real issue is: if it is as predictable as the Zhang study says, that if I know one of your passwords then I can get the other ones with reasonable probability, that might be a more interesting question than how flat the distribution is. >>: [inaudible]. >> Paul Van Oorschot: I think a lot of the online attackers would just go to a big database, and it’s growing all the time, and you take the password distribution that it has. So that won’t map exactly onto the Microsoft Research set of passwords, but there is probably some overlap. And it’s not that you want to guess 100 percent of the passwords, but you would be happy if you could guess 1, or 2, or 5 percent, because that might get you in or allow you to do some privilege escalation attack or something else. So I don’t really have an answer to your first question. >>: We have heard a lot of good math and theory.
Is it possible to conduct an empirical study on this to find out if it’s harder to break into, say, you have got two equal companies, one has got a password expiration policy and one doesn’t, is that something that is possible? >> Paul Van Oorschot: Let me ask Mr. [indiscernible] and Mr. [indiscernible] here, your Microsoft colleagues, whether it would be possible to carry out an experiment like that. >>: You mean white hats or black hats? I don’t know which data would be more helpful. >>: I think black hats probably do it all the time right, but they don’t publish so much, and for white hats it’s getting it past your legal team. That might be a little bit challenging, but the short answer is I guess I don’t think there are good data sets. I don’t think we even have a good data set of –. There hasn’t been a large scale breach of a data set that had a password expiration policy in place. So we don’t know the distribution. My guess would also be that it’s somewhat flat, but that’s kind of speculation. We don’t know by how much and we certainly don’t have any data that ones who do expiration have fared better than ones who do not. >> Paul Van Oorschot: Yeah, I mean it’s really only been in the last 5 years or so that we have had some of these large breaches and some of the studies, for example the Bonneau study with the 70 million Yahoo passwords that they cooperated with letting him analyze in a privacy preserving way, but it’s only in the last 5 years that we have had the benefit of some large data sets like this. This is one of the challenges in research in this area: you want to do ecologically valid research with real data sets rather than made up ones. I mean that’s the ideal case and it’s hard to get exactly those passwords. There have been some studies, you know studies with the Mechanical Turk and others, that have certain properties that are quite useful for studying, but it’s a hard area to get real data on and very large data sets are what you want.
>>: You mentioned as an example an 8 character data set [indiscernible]. That is an excellent number if people have real keyboards, where even uppercase is easy. Now that we all type on little postage stamps shouldn’t you count key strokes? This means going from lower case a, to the number 1, to upper case B, to lower case c, is god knows how many key strokes. >> Paul Van Oorschot: Yeah and I think the mobile phones –. I would like to see a study, and I can’t recall having seen one, of how passwords chosen on mobile devices compare, in terms of their key spaces, to passwords chosen on desktop machines. I don’t think that would be a happy result either. >> Cormac Herley: Any other questions? Let’s thank the speaker again. [Applause]