>>: Good morning, everyone. It is my great pleasure to welcome Collin Jackson to come here and interview with us. Collin is interviewing with my group, the security group, for either a researcher or postdoc position. Collin is a Ph.D. candidate from Stanford University. He has been publishing prolifically in the area of web and browser security. Actually, Collin also interned with us as well as worked with us for quite some time. Collin also worked on [indiscernible] as a developer, both full-time and part-time. He's also been consulting for many potentially successful start-ups in Silicon Valley. So Collin will tell us about his dissertation, Securing the Web Platform. >>Collin Jackson: Thanks. It's great to be here. I see a lot of familiar faces, people I met during my internship here in summer 2006. I was working with Dan Simon on antiphishing research, then I met Hun and we got really interested in treating the browser as sort of an operating system, as a platform for building applications. And it's really worked out well for me. We wrote a bunch of papers after that internship, I guess culminating in the paper I wrote with John and Hun. I think a lot of my research since then has been inspired by ideas we were kicking around while I was there for the summer. I really am glad that I had a chance to meet everyone. So I'm an optimist about the web. We hear about all of these horrible attacks that are going on on the web every day. But I think fundamentally it's becoming an extremely useful platform for building really compelling applications. You're seeing web pages become more interactive. People are making more use of JavaScript, of Flash, lots of technologies that are bringing near-to-desktop experiences.
At the same time you're seeing people getting more interested in software that doesn't ship on a lifecycle of four years but rather on a lifecycle of every few hours or every week: you can push out a new update to your users. They don't have to agree to the update. They don't have to have their update be managed by some IT staff. The web page simply runs the latest and the greatest that the website has to offer. So that allows a much faster iteration cycle where you can be in much closer connection with your users. You can see immediate changes. I think there's a lot of advantages to seeing programs move onto the web and web pages becoming more exciting and dynamic. So how do we secure the web as a platform for applications? The browser security policy was essentially born in 1996 when Netscape introduced JavaScript. This is when cookies were also being introduced. You see a lot of the decisions that we're having to live with today are a result of the very initial policy that Netscape came up with. And ever since then, all the decisions that we're making are being constrained by backwards compatibility with existing applications. So you have to be very delicate with the changes you make to browser security policy because you don't want to break any existing websites. People will stop using your browser. I think the security policy was really designed when browsers didn't have multiple tabs. Users typically had one browser window open at a time. Each web page they went to, typically that web page was providing all the content they were interacting with. Maybe some of the images were coming from different servers, but for the most part you had one principal that the user was interacting with, and you had to design your security model around that. The goals, if you look at security papers from 1996, were how do you protect the file system from a rogue Java applet. Those were some of the security problems we were facing.
If you look at the web today, you have lots of tabs open. You have lots of different principals that are contributing content to a particular page, so you have these Flash movies being brought in from different domains. You have these iframes that allow one web page to import content from another web page. Also different data communication and mashup features are being added to browsers. Users are going to have lots of these concurrent sessions going on. You might be logged in at multiple sites, and this adage that you should always log out of one site before you start your session with another is really not realistic for the kinds of browsing patterns we're seeing today. So I think there's a new kind of security model that you need for thinking about users who are browsing in this kind of a web. So this is an idea that I've been trying to develop at Stanford called the 'web attacker threat model'. It's basically a set of assumptions for what we're going to give the attacker. If your feature is not secure in the web attacker threat model, it's not a very useful feature. The web attacker is an extremely conservative set of assumptions about what the attacker can do. First we're going to give the attacker a web server. It's extremely easy to get a web server today. You just find a hosting company and you pay them a little bit of money, or you can set up your own. We give the web attacker a domain. Because you own the domain, the certificate authorities are willing to give a certificate to the web attacker. So we're not going to make the assumption that certificate authorities will only issue certificates to good people. We're only going to make the domain-validated assumption about certificates. So the certificate authority is confirming that the attacker does actually own this domain. And then the most critical part of the web attacker threat model is we assume the attacker gets an introduction to the user.
So you can imagine trying to design a feature that works as long as the user never goes to a bad site. If you had a whitelist of all the good sites on the Internet, maybe that feature would work. In reality, users are constantly interacting with bad content. So we want to design the browser's sandbox to be robust in that model. So this is a basic set of assumptions. I'm going to talk a little bit later about how we might want to adjust these for particular features where this set of assumptions doesn't make sense. Basically the attacker needs to get their attack code to run in the user's browser. It's a pull model. The user has to go to a web page before the web page can interact with the user. So in order for us to reason about the security features of the browser, we need to assume that the attacker got their code onto the browser somehow, otherwise the attacker could never do anything interesting to the user. So here is the really key thing with the web attacker threat model. We're going to be conservative: we're not going to make the assumption that the user is confused about where they are. So why do I say this is a conservative non-assumption? Because if you have a security feature that doesn't work even when the user is not confused about what's going on, when they actually understand how to read domain names or how to use extended validation, there's no way you can build a user interface on top of it that the user will be able to use correctly, because the browser itself is confused about what's going on. There's a whole other body of research, some of which I did at Microsoft, which is how do you make sure the user isn't confused about what principal they're interacting with? How do you let them know, this is the real bank website and this is some other fake, impostor bank website. I think this is a really important research problem.
There's no way we can make headway on this problem if the browser itself is confused about what's going on inside this rectangle of content. What the security indicators in the browser are telling you is, here is the domain that was in charge of filling this rectangle with pixels. But what pixels appear there actually need to be under the control of the Bank of the West, otherwise there's nothing that the certificate UI can do. This is sort of an attempt to separate out separate areas of research in browser security. So if you're a web attacker, you can think of it intuitively as some other window or tab that the user visited and then forgot about; now they're trying to do something important. So whenever you're looking at a web feature, always assume that in the background there is a window that's got malicious code that came from the attacker, and it really wants to attack your session with a site that you trust. The user is not confused about where they are, so the attacker wants to do things like override the pixels of this area. They may want to know what the user is typing into the password field. They might want to attack the storage associated with this principal. So in the browser you can store cookies, and there's even a local storage database now that's associated with a particular principal; in this case the URL of the site determines the principal that the cookies and local storage use. So the attacker wants to get access to that storage. They may want to know what you're doing in this other window. They want to know what you're clicking on, what you're typing in. Another resource that the attacker might want is the IP address of the user that's accessing this web page. So at Microsoft you're probably familiar with the idea that some web pages are not accessible to the outside world. You have this firewall. You have lots of interesting web pages that live inside the firewall. Then you have web pages that are external that everybody on the web can access.
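Since the URL of the site determines the principal that owns the cookies and local storage, the principal boundary can be sketched as the (scheme, host, port) triple of the URL. This is a minimal sketch in JavaScript; the example domains are hypothetical, and it normalizes default ports away the way the URL parser does.

```javascript
// Sketch: derive the (scheme, host, port) origin that determines which
// principal owns a page's cookies, local storage, and DOM.
// Example URLs below are hypothetical.
function originOf(urlString) {
  const u = new URL(urlString);
  // Default ports are normalized away by the URL parser, so u.port is ""
  // for https://example.com/ and "8443" for https://example.com:8443/.
  return `${u.protocol}//${u.hostname}${u.port ? ":" + u.port : ""}`;
}

function sameOrigin(a, b) {
  return originOf(a) === originOf(b);
}
```

For example, `sameOrigin("https://bank.example/login", "https://bank.example/transfer")` is true, while an HTTP page and an HTTPS page on the same host are different principals, which is why the attacker's ability to get an SSL certificate matters.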
So you should be able to go to a web page in one tab and have this be an external evil web page. You're not confused about what web page that is. But that web page wants to use your IP address to connect to a website that the attacker can't directly connect to; because of the way the connectivity is set up, there are web pages inside your firewall that the attacker needs your browser in order to connect to, using your browser as a proxy essentially into the firewall. These are some different types of resources that you have in your browser that the attacker might want to gain access to. So I've looked at all of these different resources and tried to find cases where the web attacker might be able to jump in and steal something that the user has. So being able to access the pixels and keystrokes while you're on the bank website, that's something that mashup [indiscernible] was trying to solve. >>: It seems like it's using another [indiscernible], which is the memory? >>Collin Jackson: That's right. >>: Pixels is like [indiscernible]? >>Collin Jackson: That's right. So the storage, you can think of it as being both in-memory storage and also the persistent storage in the browser. There are other types of resources; these are just five I've written papers on. I think there are more opportunities for applying the web attacker to things you might want to steal from the user. So I'm going to talk about this in detail today. I'll also talk about this one. Briefly: the browser uses your history to determine how it renders hyperlinks on web pages, so depending on whether you've been to a particular URL or not, a hyperlink renders either purple or blue. We looked at how a web attacker might use that to determine what you're doing in another window and proposed a solution for that. And also there are some tricks that the attacker can play with DNS that allow a web attacker to essentially hijack your IP address and connect to resources inside your firewall.
We've been working on that problem as well. But I'm going to talk about the first three next. So here's an outline for what I'd like to talk about. First I'm going to talk about how we understand whether this web attacker threat model is realistic. How expensive is it to get an introduction? Are large-scale attacks possible just by setting up a web server and trying to attract users to it? Then I'll talk about how we take various web features, analyze them, find vulnerabilities, and propose solutions that get deployed by vendors. The deployment problem is a particularly hard problem on the web because of this backwards compatibility issue. And then I'll talk about different additional capabilities that we might give the web attacker, if you want a feature that's a little bit more secure, a little bit more powerful. So one potential approach to the web attacker: you know the web attacker is this website out there that's got some evil JavaScript code on it. You could say we're going to find all those bad websites, categorize them as being bad, and throw up a warning if the user ever goes to that website. There's actually been a lot of work in trying to figure out which websites are trying to attack you and which websites aren't. Google has been building up this big database. There's also this phishing filter in Internet Explorer that tries to warn you when you go to a bad site. I mean, this is a warning that I captured just a few days ago on a real government website, because the attacker had injected some JavaScript code via cross-site scripting into the government website, and that evil JavaScript code was trying to execute various types of malware, various types of attacks. So the problem with trying to block all the content on the web that's evil is that the barrier to entry is so very, very low that the attacker, if they found this warning on their site, wouldn't have to go through very much effort in order to set up another one, and another one.
And you could keep doing this with stolen credit cards until the blacklist had to store huge amounts of content in the browser in order to warn the user in real time when they went to one of these bad sites. The spider that's going out and crawling all the bad sites might see the bad JavaScript but not be able to find it in time, or the website might serve up different JavaScript to the spider than to the actual user who's using a real browser. So I think this is an important approach. I think it works better for phishing and malware than it does for cross-site request forgery and other types of web attacks that work in the web attacker threat model. So I said that setting up the server is really cheap. You can pay $10 and get a domain and hosting. You can get a domain-validated SSL certificate for zero dollars. We found a CA that will, just by virtue of you owning that domain, give you SSL. So that's why we assume that the web attacker can talk over the HTTPS protocol. But we weren't sure how much it would cost to actually get introductions to the user. And this is important because we need to figure out just what the scale of an attack might be if an attacker knew about a vulnerability in the browser. So we ran a series of experiments at Stanford where we used ad networks to try to test our attack code. We wrote some evil JavaScript that used some vulnerability in the browser, and then we found an ad network, we placed an iframe on that ad network that ran our attack code, and checked to see whether it worked. You're probably wondering how I got this past the [indiscernible] at Stanford. And the answer is that we only attacked ourselves. We were trying to bypass the browser's security policy, the same policy that prevents you from attacking other sites. So we set up two servers at Stanford, and essentially all we did was try to take some data from one server with some code that was served by the other.
And so this way we weren't harming any other websites on the Internet; we were only running the test attack against ourselves. So there was nothing that we could steal, but we were able to demonstrate the viability of the attack with our attack code that was on the user's browser. So what this graph shows is the amount of time that we had on the user's browser while some ad was showing. So this is an example of a web page. It's a news site. This is an example of the iframe that we had. You can't see it, but there's a bunch of JavaScript running in this iframe while you're reading the news. And that JavaScript is running the experiment to find out whether or not the attack worked. So what we found is that a lot of users navigated away very quickly, after one or two seconds. They typed in the URL they wanted to go to and the ad was shut off. But there were a lot of users who stayed there for days or even weeks. And I think that this is really a testament to the fact that users keep a lot of tabs open, and they don't reboot very often anymore. And there's really no reason to close a tab unless you need to restart your browser. And so if you're a web attacker and you want to do an attack on a long time scale, $1 will buy you two thousand impressions. So if you spend $1,000, you can get two million impressions. And that's actually a large number of browsers; it's enough that you could get exposure to a lot of browser vulnerabilities even if they are only specific to a particular platform. So this is understanding whether or not an attack works and how many different browsers you could hack at once. But what if you want to run an experiment that determines whether or not a security feature is working? Yeah, go ahead. >>: [Indiscernible]? >>Collin Jackson: So there's actually a big business in trying to determine whether or not an advertiser is serving up evil iframes or not. This was a big problem for MySpace.
Several times now someone has taken an iframe on MySpace and then used it to try to exploit ActiveX or some other browser plug-in vulnerability to install malware on the user's machine. And this actually got viewed by tens of thousands of users before someone realized what was going on and shut it off. So I can give you some news articles that talk about this attack happening in practice. But there is evidence that real attackers are looking at this model and are finding it to be financially attractive. $1 for two thousand users is pretty good, even if your attack only has a 1% success rate. So we can also use this ad network methodology as a platform for experimenting with different browsers' security features and figuring out how they work, treating the web as sort of an ecosystem rather than a particular product, because there are lots of different browsers out there, lots of different versions of those browsers, lots of different configurations. You've got firewalls and other things affecting the HTML that people view. You might think a particular security assumption holds on the web, but without actually testing it, seeing whether or not the browser's security policy is being enforced correctly, you can't know whether or not it works the way you think it does. So you can test your hypotheses on a large scale like this. So we had a hypothesis about how the browser's Referer header had to work. We wanted to test it. We got about 200,000 users to run our JavaScript code by virtue of displaying our iframe and came up with some very interesting observations about the way browser Referer headers are handled. We had a hypothesis that the browser blocks the Referer when you have an HTTPS page requesting an HTTP image. We found that to be true most of the time. What was interesting is the Referer was also being blocked from HTTP to HTTP about 3% of the time, which is a significant fraction of the time.
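The hypothesized browser behavior being measured here can be written down as a tiny piece of pure logic: the browser itself suppresses the Referer only on the HTTPS-page-to-HTTP-request downgrade case. This is a sketch of that hypothesis, not of any particular browser's implementation; the example URLs are hypothetical.

```javascript
// Sketch of the hypothesized Referer-suppression rule: a browser omits
// the Referer header when an HTTPS page requests an HTTP resource, and
// sends it otherwise. (Blocking observed beyond this rule would be due
// to proxies and other middleboxes, not the browser.)
function browserSendsReferer(pageUrl, requestUrl) {
  const pageIsHttps = new URL(pageUrl).protocol === "https:";
  const requestIsHttps = new URL(requestUrl).protocol === "https:";
  // The only case the browser itself suppresses: HTTPS page -> HTTP request.
  return !(pageIsHttps && !requestIsHttps);
}
```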
And we did some more experiments that found that that blocking was not happening over HTTPS. The X and Y here are just two servers we set up. If you see X connecting to Y, that means the Referer was X and the server you're connecting to is Y. So we found this interesting observation that when you have HTTPS requests, the Referer is very unlikely to be blocked, but it's likely to be blocked if you have HTTP. From this we formed our hypothesis that most of the Referer blocking that's going on in the network is happening as a result of proxies and other things that live outside of your computer and only have access to your HTTP traffic. They have a hard time modifying your HTTPS traffic in transit. So as a result of this experiment, we made a bunch of recommendations to vendors about how they should handle the Origin header proposal, which is a proposed replacement for the Referer header. We're trying to understand what privacy problems people could have with the Referer header that cause them to block it, and how we can provide a new header that provides the security that people want from the Referer header without having all the privacy problems. I'll talk a little bit more later about why you might want to use the Referer header to prevent attacks. So, anyway, this is the measurement approach we've taken at Stanford. Now I'd like to talk a little bit about how we've actually tried to find vulnerabilities in features in the web attacker threat model, now that we understand the web attacker threat model is a realistic thing for an attacker to use. So here is a very simple web attack. The user goes to a login page and, you can't tell, but there's an iframe on this page. The iframe is the login box, because this is a module that's used on multiple Google properties. So right now it looks like the user is only interacting with Google.
You have the security indicators that are all in the right place and they say Google on them, so there really doesn't seem to be any interaction with some untrusted third party. In the web attacker threat model, we don't look at just this one window. We have to assume the user has lots of windows open, they're all sitting around in the background, and the attacker at any moment might try to jump in and steal some pixels or keystrokes from this page. Here is how the attacker might do it. They want to replace this login frame with an evil login frame, so when you type in your user name and password, it will submit the password to the attacker instead of submitting it to Google. So what the attacker can do in some other tab in the background is use the window.open API. Essentially they name the frame that they want to access and they pick a URL that will populate this with content that's controlled by the attacker. Now, the browser has this mixed content warning system which tries to warn you when you have an HTTPS page rendering content from an HTTP URL. So this is why it's important we make this assumption that the attacker can get an SSL certificate, because if they can, they can make sure this iframe will still be served over HTTPS and the user won't get any warnings telling them it's a bad idea to log into this frame. So once the attacker has replaced that frame with their own content, you'll never know you're actually talking to the attacker when you type in your password. >>: [Indiscernible]? >>Collin Jackson: Right. So the browser has this window.open API that allows you to name a window anywhere on your system. You specify a new location for that window. This API is designed to be cross-origin. So you can name a window that's not under your control. The browser is supposed to navigate that window to a new location. It's a very dangerous API. >>: [Indiscernible]? >>Collin Jackson: Yeah. So how did the attacker get introduced to the user?
The web attacker threat model says the attacker gets an introduction for free. In practice, what they're doing is, you've got some tab open underneath this one where there's an ad. >>: The ad is a script? >>Collin Jackson: The ad -- yes, it's an iframe that actually has JavaScript running. Some ad networks limit what content you can put in the iframe. >>: The browser might limit JavaScript in an iframe? >>Collin Jackson: That's right. So the security=restricted flag on iframes, for example, is a Microsoft proposal that would allow you to limit what can happen inside of an iframe. There's some effort underway to standardize that in HTML 5 so other browser vendors can adopt it as well. The problem with limiting what can go on in the iframe is that some of that functionality is intended. It may be that the ad provider actually needs to run some JavaScript to figure out which ad to show on your system, and so there's a tradeoff there. For the moment, the financial aspects of advertising seem to be winning over the security aspects. So I haven't seen a whole lot of people using security=restricted yet. Yeah? >>: So here you have a malicious ad [indiscernible] this page. >>Collin Jackson: So the ones we've looked at typically put your ad in a cross-domain frame. So they have some throwaway domain that they don't care about. So this is, you know, Boston Metro News. Often what happens is Boston Metro News will iframe the ad provider, and the ad provider makes a second iframe, just some domain that they don't care about. And that's where your actual ad lives. So it has no privileges in the browser other than living at this strange domain and the ability to run JavaScript in your browser. >>: My next question is, can such an ad be run across the iframe, how can it [indiscernible]? Does it really allow this kind of cross-domain navigation? >>Collin Jackson: Yeah, so it used to be that this attack worked, and it worked in almost every browser.
IE independently fixed this particular issue, but unfortunately if you had Flash Player installed, it hadn't gotten the memo that this is a bad thing, and so it still allowed you to navigate any window on the system across origins. There was no origin checking going on when you tried to make a navigation. So what we did is we contacted all the vendors and said, look, this attack is bad, and we provided a bunch of different variants on the attack to show why you might want to lock down this API. And we got everyone to agree -- it was a lot of work, but we got everyone to agree -- on what's called the descendant policy; it's actually based on IE. It says you can navigate a frame if you control a frame that is one of the parents or grandparents of that frame. So if you want to navigate this frame, you'd better be in control of Google.com. You don't actually have to be the direct parent of this frame, but you need to be in the same origin as this one. And so we came up with some arguments why the policy made sense based on pixel delegation, the idea that if you give someone some pixels, you can take them away at any time. And so that's why that got adopted. So this is an example of the web attacker trying to steal your keystrokes. And the key assumption that we had to make was that you weren't just interacting with this page; you had some other guy in the background that was trying to jump in. So let me give you another example of a web attack. Any questions about this before I move on? Okay, so let's talk about cross-site request forgery. I'm sure this is a topic that's near and dear to some of our hearts. So in the basic cross-site request forgery attack, the user gets introduced to the attacker, because that's how the web attacker works. And the web attacker is going to create an HTML form that's going to submit some HTTP request to a target server.
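Going back to the descendant policy for a moment: it can be sketched as a short check over the frame tree. This is a minimal sketch, not any browser's implementation; frames are modeled as `{ origin, parent }` records and the origins below are hypothetical.

```javascript
// Sketch of the descendant frame-navigation policy: a script may navigate
// a frame if it is same-origin with that frame or with any of the frame's
// ancestors. This captures the pixel-delegation argument: whoever gave a
// frame its screen real estate may take it back.
// Frames are modeled as { origin, parent } records; parent === null at top.
function mayNavigate(scriptOrigin, targetFrame) {
  for (let f = targetFrame; f !== null; f = f.parent) {
    if (f.origin === scriptOrigin) return true;
  }
  return false;
}
```

Under this check, a script same-origin with the top-level page may replace the login iframe nested inside it, but a script in an unrelated attacker tab may not, which is exactly what blocks the login-frame replacement attack described earlier.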
And in this case, we're relying on the fact that users keep concurrent sessions open with a lot of different sites at the same time. And so they're already authenticated to this bank website using cookies. So the attacker can automatically, in JavaScript, without any user interaction, submit this form -- they serve up this HTML document, the HTML document submits the form to the bank website, and then the bank sees that the user appears to be trying to transfer some money into the attacker's bank account. So because the user's already authenticated using cookies, the bank allows the request. This is a very basic CSRF attack. So we'd like to prevent this attack. But before I talk about how to prevent it, let me describe a variant on it. This is something that we discovered at Stanford called the login cross-site request forgery attack. And the idea is that when we go to the web attacker's website, rather than trying to transfer money out of your account, or do something with your existing session, all the attacker wants to do is create a new session for you. And so the attacker submits this form to the login page of a particular website, and then you become logged in under the attacker's account. Essentially it's a CSRF attack, but instead of targeting the transfer page, we target the login page. So now you're authenticated under the attacker's account using the attacker's password. Why would that be a bad thing? So there's a lot of examples that we came up with. One example is, when you do a web search, if Google thinks that you're logged in as the attacker, it gets stored in the attacker's search history. So now the attacker knows what you're searching for even if you're only using the search box in your browser. So this is a variant on the CSRF attack. >>: How does the hacker know that your browser [indiscernible]? >>Collin Jackson: So the attacker knows the user name and password for the attacker. They use this web attack.
The user gets introduced to the attacker via some iframe ad or whatever. The attacker silently logs you into Google under the attacker's account. And now the attacker, who knows the attacker's user name and password, can go to Google, log in, go to the web search history page, and watch as entries start getting added to it. Every time the user does a search, Google thinks that they're logged in under the attacker's account. So it gets added to the attacker's search history. So this sort of turned our CSRF understanding on its head, because people used to call it a session hijacking attack, or a session riding attack, the idea being that you already had to have a session, and the attacker was just trying to jump in and join it. But what's fundamentally happening here is that the attacker's making a form post, abusing the fact that the form post API in JavaScript is cross-origin, and confusing the web server as to where this request is really coming from. So let me give you a more subtle example of this attack. So let's say you go to a web page that has a PayPal donate button. This is how a lot of websites make money. Users, out of the goodness of their heart, make a donation. But maybe you only want to donate $5 to [indiscernible].com and you don't want to give away everything you've got. So you click the donate button and you get sent to the PayPal login page. In this case, the attacker opened it up in a new tab. So this is nice because the attacker's now sitting here in the background being a happy web attacker, trying to join into your session with PayPal at just the right moment. So this is actually what it looks like when you're about to make a donation. You're supposed to enter in your user name and password, click login, and for a few seconds you see this sort of logging-in screen, and then you land on some page that assumes you're a logged-in user. So in this case, PayPal wants me to add a checking account to my PayPal account.
So let's say I fill in my bank details and click continue. What do you think happens? >>: [Indiscernible]. >>Collin Jackson: Exactly. So while this logging-in prompt was running, the attacker is sitting there in the other tab and can submit any form post it wants to the PayPal site. So that wouldn't be possible if the attacker didn't have a foothold in your browser. There's something about the attacker being able to do things concurrently with PayPal that allowed this attack to work. And as a result, when the user clicks continue, they've just saved their banking information into the wrong account. The attacker can now log into their PayPal account, which they have the user name and password for, and make transactions using the user's credit card. So we've been working with a variety of browser vendors and websites to try to build better defenses for cross-site request forgery. Go ahead. >>: So basically you assume the user is visiting the attacker page first? >>Collin Jackson: Right. So with the web attacker, we always assume the user visits the web attacker's page. If you don't make that assumption, a lot of the attacks get a lot harder. But we've done these measurements that show it's actually quite common for users to get introduced to attackers. So we want to make sure that we are securing PayPal against that. >>: Do you think the problem here is that because these two tabs are sharing a session, should the two tabs share the session? >>Collin Jackson: Well, I think you're hitting on a good point, which also happened with the frame example before, which is that there are all these cross-origin APIs in the browser. Frame navigation, form submission. It's very easy to mess up if you're a website and get hosed by these cross-origin APIs. These APIs I think are the most important APIs to analyze for security. Unfortunately, I think a lot of them were designed in a simpler era when people were really not interacting with as many principals at a time.
So, yeah, you could design a browser that didn't have any cross origin APIs, but a variety of web pages designed sometime between 1996 and 2009 might be relying on those APIs. So now you have this dilemma: do we break existing web pages, or do we allow the current web pages to run but somehow adjust the protocol so that just the attacks are blocked and the different types of legitimate activity are still allowed? I think that's really something that a lot of the [indiscernible] bodies are wrestling with right now with respect to this cross-site request forgery attack. They want to find some way that we can still allow cross origin form posts, because so many websites are using them. Every website that adds a Google search box to its web page is using a cross origin form post to connect to Google. In fact, you can think of any cross origin hyperlink as somehow connecting two domains together in a cross origin way. So people are trying to design a solution to cross-site request forgery that doesn't break the web, but somehow makes the problem of defending your website against CSRF a little bit easier. The way that PayPal does it today, they have these secret tokens on their transfer forms. When you click submit, they check to make sure that the page submitting the transfer details also has the secret token that they put in the original form. If it's not there, they reject the request. It turns out it's quite hard to build a web page that does this correctly. We've analyzed a lot of different frameworks that are trying to do this automatically, and we found vulnerabilities, or they leaked the token in various ways. So we'd like to come up with something that's a little bit simpler.
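The secret-token defense described here can be sketched as a toy synchronizer-token scheme. This is an illustrative Python sketch, not PayPal's actual implementation; all names are made up, and a real system would persist tokens per session rather than in a module-level dict.

```python
import hmac
import secrets

# Toy synchronizer-token defense against CSRF (illustrative only).
_session_tokens = {}  # session_id -> token issued with the form

def issue_form_token(session_id):
    """Generate a fresh random token when rendering the form.

    The server embeds this value as a hidden <input> in the form it
    serves, tied to the user's session.
    """
    token = secrets.token_hex(16)
    _session_tokens[session_id] = token
    return token

def validate_submission(session_id, submitted_token):
    """Reject the request unless the token round-tripped intact.

    A cross-origin attacker can make the victim's browser submit the
    form, but cannot read the token out of the victim's copy of the
    page, so a forged request fails this check.
    """
    expected = _session_tokens.get(session_id)
    if expected is None:
        return False
    # Constant-time comparison to avoid leaking the token via timing.
    return hmac.compare_digest(expected, submitted_token or "")
```

The fragility Collin mentions is real: if the token ever leaks (for example, into a URL that ends up in a Referer header sent cross-origin), the whole defense collapses.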
One thing that I do like is that Rails and other frameworks, like ASP.NET, are building CSRF protection into the framework so that each web page doesn't have to solve this problem individually; they can just pick their framework of choice and get protection automatically. But we've been looking at the Referer header as a possible alternative to the secret tokens. Unfortunately it's not usable over HTTP because of all the blocking that goes on, and we didn't really know that until we ran our experiment to see just how bad it was. But now that we know that Referer blocking over HTTP is really bad, the silver lining is that over HTTPS, Referer accuracy is very good. And so if PayPal wanted to prevent this login CSRF attack, they could do it with only a few lines of code by essentially just doing Referer checking on the login page. They wouldn't even have to implement a secret token defense, because their site is served completely over SSL. All right. So I talked about a few examples of things that we've worked on at Stanford to take the basic web attacker threat model, where the attacker just gets an introduction, see what kinds of attacks we could find, and then work with vendors to deploy a solution. But sometimes the web attacker threat model isn't strong enough to describe the attacks that we're worried about. So I'd like to give you a few examples of where web security research departs from the web attacker threat model. One thing that you might be interested in is an attacker who is a rogue network access point, or an attacker who's sitting next to you at Starbucks. And you might want to protect your sessions with websites against that kind of attacker. It's very hard to do this with basic HTTP, because HTTP was not designed for security against the active network attacker. But you may be able to protect yourself if the website's using SSL.
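The strict Referer check described above for an all-HTTPS login page might look like this minimal Python sketch. The allowed host is a placeholder, and a real deployment would have to decide deliberately how to treat requests that arrive with no Referer at all; this sketch takes the strict option of rejecting them, which is only practical because Referer suppression is rare over HTTPS.

```python
from urllib.parse import urlsplit

def referer_allows_login(headers, allowed_host="www.paypal.com"):
    """Strict Referer check for a login endpoint served over HTTPS.

    Over HTTPS the Referer header is rarely stripped by proxies or
    privacy tools, so a strict policy (reject missing or cross-site
    referers) blocks login CSRF without any secret-token machinery.
    The allowed_host default is an illustrative placeholder.
    """
    referer = headers.get("Referer")
    if not referer:
        return False  # strict mode: a suppressed Referer is rejected
    parts = urlsplit(referer)
    # Require both the HTTPS scheme and the expected host, so a forged
    # post from an attacker page (a different origin) is refused.
    return parts.scheme == "https" and parts.hostname == allowed_host
```

This is the "few lines of code" trade-off Collin describes: no token state to manage, at the cost of rejecting the small fraction of legitimate HTTPS requests whose Referer is suppressed.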
We found a bunch of attacks on sites that use HTTPS, ranging from mixed content vulnerabilities to problems with cookie integrity. And we've been proposing solutions that try to make it easier for websites like PayPal, that want to do everything over SSL, to protect themselves against these active attacks. But it turns out to be a deceptively hard problem. So that's one thing you might do: essentially give the web attacker also the ability to corrupt your sessions with the website. But the attacker still gets to live in a separate tab, running JavaScript code simultaneously while you're at the SSL site. That turns out to be very important for some of the attacks. Another thing that you might want to give the web attacker, and this is a threat model that I think Helen is very interested in with the Gazelle browser, is knowledge of a vulnerability in your browser, where luckily the vulnerability is somewhere in a sandboxed part of the browser. So how do you design the browser such that an attacker who knows about a vulnerability still can't bypass the browser's security policies and do bad things, like install malware on your system, read all the files on your system, or even bypass the same origin policy and connect to other sites? There's been a lot of work on this. I worked on the Chrome project at Google; our initial version was really focusing on protecting your file system from being read or written by a website that found a vulnerability in your browser.
But I think there's a lot of work in trying to go even further and enforce more of the browser's same origin policy in some sort of privileged component, under the theory that most of the vulnerabilities live in the image renderers or the HTML and JavaScript engines, which have most of the complexity of the browser; essentially building the browser like an operating system, where you have a small privileged component that's the trusted computing base and a large component that has all the complexity of the web. The attacker still needs some way to get their code onto your system, so I think the introduction aspect of the web attacker is a very important part of analyzing the effectiveness of these solutions. Another example that we've been looking at is the web attacker who gets to have a widget in your mashup. So the user knows that this widget is controlled by the evil gadget guy, and that's fine as long as it lives within its little rectangle; the web attacker shouldn't be able to affect all the other components of your mashup. iGoogle is a very [indiscernible] version of a mashup, but you can really think of any ad network that serves ads in iframes as a form of mashup, because they're bringing together content from multiple sources. So how do you do things like design a communication scheme where the integrator, Google in this case, can talk to the evil gadget without being attacked by all the other frames? Microsoft designed a protocol that does this using fragment identifier messaging. The idea is that you attach messages to the end of your URLs, and because the navigation API in browsers is cross origin, this works even if you're sending a message to an iframe that's been put in a throwaway domain. So this allows you to build a mashup that has lots of different iframes, where the code is written by lots of different people, but they can still communicate using these messages.
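Fragment identifier messaging can be modeled with a toy simulation. The sketch below is in Python with hypothetical origins; real implementations set a frame's location from JavaScript. It shows the two properties discussed next: the channel is confidential (the sender names the recipient's URL) but completely unauthenticated (the receiver has no idea who navigated it).

```python
from urllib.parse import quote, unquote

class Frame:
    """Toy model of a browser frame for fragment identifier messaging."""
    def __init__(self, origin, path="/"):
        self.origin = origin
        self.url = origin + path
        self.inbox = []  # messages read out of the fragment

    def navigate(self, url):
        # The cross-origin navigation API is write-only: any frame may
        # set this frame's URL, but only this frame can read it back.
        _, _, fragment = url.partition("#")
        self.url = url
        if fragment:
            # Note: the frame cannot tell who performed the navigation.
            self.inbox.append(unquote(fragment))

def send_fragment_message(target_frame, target_url, message):
    """Deliver a message by navigating the target to target_url#message.

    Confidentiality: the sender chose target_url, so only that origin
    can read the fragment.  Authentication: none.
    """
    target_frame.navigate(target_url + "#" + quote(message))

gadget = Frame("https://gadget.example")
send_fragment_message(gadget, "https://gadget.example/listen", "hello")
```

The missing sender identity is exactly the gap that the multi-stage authentication protocols discussed next tried to fill.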
So the key thing about navigation is that you can navigate a frame to a URL, but you can't read the URL of a cross origin frame. It's a write-only API. And when you navigate, you get to know what URL you're sending it to, including the domain. So you get confidentiality: if you ever set the URL of a frame, you know that the message is only going to be readable by the origin you're trying to send it to. However, there's no authentication. There's no way using basic fragment messaging to know where you got a message from. It could have come from anyone. The message doesn't say, in any sort of secure way, "This message was sent to you by the integrator." So it's kind of analogous to having your messages encrypted with a public key. So IBM tried to build something like this. Microsoft also tried to build something like this. They're essentially trying to add authentication to the existing confidential API, some way of figuring out where a message came from using a multi-stage protocol. So we looked at Microsoft's protocol. We had to reverse engineer it a little bit. We found that it was very similar to the Needham-Schroeder protocol for network authentication. Unfortunately, there's this anomaly in the Needham-Schroeder protocol that basically says that if the good guy ever tries to initiate an interaction with the bad guy, then the authentication property of the protocol can be broken. And there was an analogous attack on Windows Live Channels that did exactly that. So here is an example: you have a gadget that thinks it's talking to the integrator, but it's actually talking to another gadget in the same page that was written by someone other than the integrator. So we [indiscernible] the fix to the protocol that's based on the Needham-Schroeder-Lowe protocol. And this got adopted by both the SMash team as well as the Windows Live Channels library. So that's an example of where we took the web attacker and gave them something else: a foothold in the mashup.
And that ended up being very useful for finding an attack that otherwise wouldn't work. I should note that there are certain threat models that don't even make sense for analyzing Windows Live Channels. For example, Windows Live Channels is not designed to resist a network attacker. So a network attacker could potentially fake messages to the integrator and you'd never know, because that's not part of what the protocol's trying to do. But if you pick the gadget threat model, where the attacker gets a gadget in somebody else's mashup, then you can actually find this attack. So fragment identifier messaging is kind of a hack. We've been advocating for a new primitive, postMessage, that adds browser support for cross origin communication. The idea is that if you want to send a message from one frame to another, whether it be in a different window or the same window, all you have to do is call the postMessage method of that frame and specify whatever your message is. And then the browser will let the receiver know where the message came from. Unfortunately, the original postMessage proposal had a problem: it didn't really provide confidentiality. It had the reverse problem that fragment identifier messaging had. You could send messages and you'd know exactly where the message came from, so there was no problem with that. But when you received a message, there were attacks using frame navigation where the attacker could jump in and receive your confidential message without you intending it. So we wrote a paper at Microsoft describing a communication mechanism between origins called MashupOS. And the MashupOS [indiscernible], which was for cross origin communication, had this second argument where you specified the origin that you were trying to send the message to. So what we did is we proposed that postMessage adopt that second argument: when you send a message, you're not only saying what you want the message to be, but also who you want it to be received by.
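The effect of that second argument can be modeled with a toy simulation. The sketch below is Python standing in for browser behavior; the real API is JavaScript's frame.postMessage(message, targetOrigin), and the origins here are hypothetical. The browser withholds the message unless the target frame's current origin matches the one the sender named, and the delivered event carries the sender's true origin.

```python
class Frame:
    """Toy model of a frame receiving postMessage events."""
    def __init__(self, origin):
        self.origin = origin
        self.received = []  # (data, source_origin) pairs

def post_message(sender, target, message, target_origin):
    """Simulate postMessage with the mandatory targetOrigin argument.

    Confidentiality: if the target frame has meanwhile been navigated
    to some other origin, the message is silently dropped rather than
    handed to the wrong principal.
    Authentication: the delivered event carries the sender's true
    origin, supplied by the browser, not by the sender's own claim.
    """
    if target_origin != "*" and target.origin != target_origin:
        return False  # browser withholds the message
    target.received.append((message, sender.origin))
    return True

integrator = Frame("https://integrator.example")
gadget = Frame("https://gadget.example")
post_message(integrator, gadget, "ping", "https://gadget.example")   # delivered
post_message(integrator, gadget, "secret", "https://other.example")  # dropped
```

Together the two checks give the channel both properties that fragment identifier messaging and the original postMessage proposal each only had half of.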
And luckily we got this in just in time, before it was adopted in Firefox. So the initial implementation in Firefox shipped with this mandatory second argument, and now all of the browsers, including IE, have adopted the second argument that lets you say who you're sending a message to. Yeah? >>: [Indiscernible]? >>Collin Jackson: Very good question. The question mark in the circle is the WHATWG, the Web Hypertext Application Technology Working Group. They are the working group behind HTML 5. And this is a standards group that's moving very fast, uncomfortably fast for some people, at standardizing a lot of the de facto features that web browsers implement that no other working group is taking on. Things like content sniffing in browsers, something that every browser has to do: it's not standardized, and it's extremely important for security. So we've been working with the WHATWG to try to get those features written down, so that if you're building a new browser, and, by the way, there's been a lot of competition in the browser market lately, so I'm very excited to see that people are interested in building new browsers, you don't have to invent your security policy from scratch. There's a document out there somewhere that tells you all the hairy details of how you implement these policies. All right. I think I'm running out of time, so I'm going to wrap up. I think the web attacker threat model is a nice basis for research because it separates out the user interface problems, how do you figure out where the user thinks they are and how do you protect the password they're entering into the page, from how do you make sure the browser's not confused about how it's allocating the resources that a web page has. Things like your pixels, your cookies: those are problems that really don't involve the user, and you shouldn't need a user interface in order to protect those resources.
They're completely separate from the UI problems. You don't need a user study to figure out whether or not you've successfully prevented the attack. What you need instead is an analysis, based on the capabilities of the threat model, of whether or not it's possible for the attack to work. So I think as the web progresses, we're going to find more and more ways for the attacker to get introduced to the user. I think we can't necessarily blacklist every bad website on the Internet. These ad networks are going to cause a lot of introductions to happen. And that's good; I think it's good for there to be a very low barrier to entry to get your JavaScript code running in the user's browser. But with these cheaper introductions comes the responsibility of building a security policy that can deal with all of the different principals that are throwing code into the user's system. I think that very slowly we're making progress on the authentication problem. There are a lot of promising options out there, things like password-authenticated key exchange and CardSpace, and I think it's good that people are experimenting with them. I think eventually we'll make some headway on the phishing problem. But as we do, it will become more and more important that the browser's security policy not be confused, because the user will be less confused, and so the browser will become the greatest point of weakness. And I think that we may not use exactly the weakest web attacker threat model that I introduced at the beginning, but we'll keep giving the attacker some very minor tweaks to their capabilities: maybe give them an introduction into a mashup like I showed earlier, or knowledge of some bug in the browser. So we're going to keep tweaking the threat model, but this cheap introduction I think is going to remain the same. That's why I think this is a better way of thinking about web security than just assuming the user will only go to good websites.
So that's my research talk. Any questions? You guys have been a really good audience. Thanks for the questions. Yeah? >>: Stepping back, do you think the web is getting more secure? >>Collin Jackson: I think the web is getting more useful. I think that's what matters. >>: That's not the question I asked. >>Collin Jackson: If you look at what is happening with fraud as a percentage of web transactions, it's been going down, even though in absolute terms it's on the rise. I talk to a lot of the guys at the Anti-Phishing Working Group. >>: How do they know these things? >>Collin Jackson: The credit card companies often eat a lot of the fraud themselves. They want to eat it; that's their business model. When someone's credit card number gets used in a malicious transaction, the user complains, and the credit card company's job is to pay for the damage. They try to push it off onto the store owners as much as they can. They keep very good logs of what's going on there. They're the ones who know that web fraud is something like a nickel out of every hundred dollars that's spent on the web, which is actually nothing compared to the fees that Visa charges you on every transaction. So it's under control. I think the web is becoming more useful. I think attacks are still on the rise. If you look at certain types of web attacks, things like malware resulting from websites getting hacked via SQL injection, there are particular sectors growing at alarming rates. I think there's going to be a lot of work to do for web security researchers for many years to come. But I'm perpetually an optimist that the web is going to continue to be used as a platform for building really interesting web applications. We'll probably see a lot more native code running; I know there have been a lot of proposals to do this. The browser is going to become a more powerful thing.
But the idea that you can go to a web page, and the web page lives in a sandbox that doesn't have complete control over your computer, I think is an extremely powerful way to build websites, and it allows users to interact with people they don't trust quite as much as they would have to if they wanted to install a program onto their local system. So I think it's really good that browsers are being used, because the alternative of just downloading arbitrary code from people and running it is really bad. >>: What do you think is the main pain point in today's web applications that's actually preventing web applications [indiscernible] same as desktop applications, to be treated the same? What do you think the major hurdles for web applications are? >>Collin Jackson: So the question was, what are the main hurdles for web applications to offer near-to-desktop experiences and be as powerful as a desktop application. I guess there are two approaches to that question: you can ask what the development challenges are or what the security challenges are. I think that as a web application developer, there's a lot of trivia that you need to learn in order to build a secure web application. And you're constantly going to be running up against different security restrictions in browsers that prevent you from doing things that you want, like uploading files from the user's file system or going to full-screen mode; lots of things that you can't do for security reasons. So I think there needs to be some way to solve that problem, and one approach that you might want to take is to really focus on these frameworks, things like ASP.NET, that will solve all of the uninteresting, trivial web problems, things like cross-site scripting and cross-site request forgery, so the web developer doesn't really have to think about that.
They can code in a higher-level language that allows them to abstract away all the details of how the web forms work and how the JavaScript works. So there's been some effort in this direction, but you still see a lot of people hand coding web applications. So there's a lot more work to do. >>: Do you think we have the solutions to these problems today? Do you think those are [indiscernible]? >>Collin Jackson: When you look at problems like SQL injection, you know the answer to SQL injection: you use parameterized SQL. Somehow, this is still the biggest source of vulnerability on the web. A lot of that has to do with the fact that legacy web applications make up the majority of the web. And so we need to provide a solution for those people, or we need to provide really, really powerful frameworks that encourage them to go rewrite their application 'cause it's just so easy to do that. And I think that we're making progress. It's not as fast as I'd like, but that is happening. >>: One of the problems with frameworks is that if you're lucky, the people that wrote them have done everything you need, but they're not always anticipating what you're doing. >>Collin Jackson: That's right. >>: And so you no longer understand whether you're getting certain protections, or under what circumstances, if you modify that framework in some way, you will no longer get those protections. And whatever you provide in a framework is always going to be some delta behind the cutting edge, because you can never ship a framework at the same rate as the people who are on the cutting edge. >>Collin Jackson: Yeah. >>: So there's something fundamentally unsatisfying about solving things in the framework, because you still have this horribly complex system underneath that is very brittle and that you're hoping is holding together because of this framework. >>Collin Jackson: Yeah.
>>: Can't you have something like a Windows Update mechanism where, when we know the scripting is messed up and there's some fraud, a couple of weeks down the road you make a fix for the API, and anyone using the framework gets a notification saying this framework has changed, do you want to take the risk of taking the patch [indiscernible]? What I'm saying is, I think one thing that's really difficult is what you said: there's so much trivia; security is so complicated. Security people don't even know all the things they need to do. How do we expect a developer to understand all the nuances they have to follow to build a secure website? >>Collin Jackson: Yeah, I mean, I definitely think that the complexity is slowly getting out of control. Browser vendors are constantly under pressure to add new features. They want to make the web more usable, closer to desktop experiences. With each new feature they add, there's a risk it's going to create a pitfall for some website to fall into. You can't treat the web as a static platform where we'll just fix all the bugs and then everything will be perfect and we can build a really awesome framework on it. The web is always going to continue to evolve. People are going to start using plug-ins; they're going to use different types of technologies. We have to treat it as a platform that adapts over time. I think it's possible that you could have it adapt in a way that it actually gets simpler to reason about. There have been some efforts to do this. Just to finish my thought: there have been some efforts on content restrictions, for example, a way for you to basically say, clean slate, I want you to get rid of all these features that I'm not using so I don't have to worry about whether I'm using them wrong. Let's bring the web page back to a simpler web where there are only a few things that can happen, and those are the features I know that I'm using, and you can declare as a website what features you actually want out of the browser.
So there's been some effort to get that deployed. There's a lot of complexity in building a content restriction system that makes sense, but I think there is some hope in that regard. >>: So a web developer today is trying to write a program that runs on this dynamic platform where the rules are changing out from under the developer. One can also write a server-only program where you log into [indiscernible]. >>Collin Jackson: Yeah. >>: Do you think writing web applications is strictly harder than writing a server-only application? It seems like the web has all these anomalies, the web [indiscernible], strictly harder than writing the server piece of it. If you care about security, that's right. Is it fundamental? [Indiscernible] solve all these same problems in the end, or maybe there's something wrong about the way the web evolved, and if you look in the other direction, we know how to write web certifications that's not [indiscernible]. >>Collin Jackson: I think that's going to end up pushing a lot of the security problems of the web onto the server, where you'd have to solve them again. >>: So it's the same problems but just [indiscernible]? >>Collin Jackson: Yeah. >>: I disagree. I think it's different. When you write a server application, you're concerned with the security of the server. When you write a web application, not only do you have to be concerned about your server, you have to be concerned about attacks that are going to occur on the client. That's the difference between [indiscernible]. >>Collin Jackson: I think you're both right. I think there is something fundamentally different about the way the browser works versus what a server could do if it knew that it was isolated. But there are a lot of interesting problems about delegation. Let's say I want to know whether or not you're signed into Google in order for [indiscernible]. If they're completely isolated from each other, it's very hard to do that.
If you're going to have to solve that problem anyway, you might as well have everyone solve it the same way instead of solving it in 10,000 different ways on the server. >>: I think the lines are blurring anyway [indiscernible]? >>: Agreed. >>: You almost have to go to the lowest common denominator. When you write a [indiscernible] application, I'm not worried about another [indiscernible] application attacking the application that my user's going to be using. >>: In this process you've lost all the interesting [indiscernible]. >>Collin Jackson: If we had understood web security all along, we might actually be able to do that. But it's still sort of being revealed as we go. All right. Thanks, everyone. >>: Thank you.