>>: Good morning, everyone. It is my great pleasure to welcome Collin Jackson to
come here and interview with us. Collin is interviewing with my group, the security group,
for either a research or postdoc position. Collin is a Ph.D. candidate from Stanford
University. He has been publishing prolifically in the area of web and browser security.
Collin also interned with us and worked with us for quite some time.
Collin also worked on [indiscernible] as a developer, both as a full-time and a
part-time developer. He's also been consulting for many potentially successful start-ups
in Silicon Valley. So Collin will tell us about his dissertation, Securing the Web
Platform.
>>Collin Jackson: Thanks. It's great to be here. I see a lot of familiar faces, people I
met during my internship here in summer 2006. I was working with Dan Simon on anti-phishing research, and then I met Hun, and we got really interested in treating the browser as
sort of an operating system, as a platform for building applications. And it's really
worked out well for me. We wrote a bunch of papers after that internship, I guess
culminating in the paper I wrote with John and Hun.
I think a lot of my research since then has been inspired by ideas we were kicking around
while I was there for the summer. I really am glad that I had a chance to meet everyone.
So I'm an optimist about the web. We hear about all of these horrible attacks that are
going on on the web every day. But I think fundamentally it's becoming an extremely
useful platform for building really compelling applications. You're seeing web pages are
becoming more interactive. People are making more use of JavaScript, of Flash, lots of
technologies that are bringing sort of near-to-desktop experiences.
At the same time you're seeing people getting more interested in software that doesn't
ship on a lifecycle of four years but rather on a lifecycle of every few hours or every
week you can push out a new update to your users. They don't have to agree to the
update. They don't have to have their update be managed by some IT staff. The web
page simply runs the latest and the greatest that the website has to offer.
So that allows a much faster interactive cycle where you can be in much more connection
with your users. You can see immediate changes. I think there's a lot of advantages to
seeing programs move onto the web and web pages becoming more exciting and
dynamic.
So how do we secure the web as a platform for applications? The browser security
policy, it was essentially I think born in 1996 when Netscape introduced JavaScript. This
is when cookies were also being introduced. You see a lot of the decisions that we're
having to live with today are a result of the very initial policy that Netscape came up
with. And ever since then, all the decisions that we're making are constrained by
backwards compatibility with existing applications. So you have to be very delicate with the
changes you make to browser security policy because you don't want to break any
existing websites. People will stop using your browser.
I think the security policy was really designed when browsers didn't have multiple tabs.
Users typically had one browser window open at a time. Each web page they went to
typically that web page was providing all the content they were interacting with. Maybe
some of the images were coming from different servers, but for the most part you were
having one principal that the user was interacting with and you had to design your
security model around that.
The goals, if you look at security papers from 1996, were how do you protect the file
system from a rogue Java applet. Those were some of the security problems we were facing.
If you look at the web today, you have lots of tabs open. You have lots of different
principals that are contributing content to a particular page, so you have these Flash
movies being brought in from different domains. You have these iframes that allow one
web page to import content from another web page. Also different data communication
and mashup features are being added to browsers. The users are going to have lots of
these concurrent sessions going on. You might be logged in at multiple sites, and the
adage that you should always log out of one site before you start your session with
another is really not realistic for the kinds of browsing patterns we're seeing today.
So I think there's a new kind of security model that you need for thinking about users
who are browsing in this kind of a web. So this is an idea that I've been trying to develop
at Stanford called the 'web attacker threat model'. It's basically a set of assumptions for
what we're going to give the attacker. If your feature is not secure in the web attacker
threat model, it's not a very useful feature. The web attacker is an extremely conservative
set of assumptions about what the attacker can do.
First we're going to give the attacker a web server. It's extremely easy to get a web server
today. You just find a hosting company and you pay them a little bit of money or you
can set up your own. We give the web attacker a domain. Because you own the domain,
the certificate authorities are willing to give a certificate to the web attacker. So we're not
going to make the assumption that certificate authorities will only issue certificates to
good people. We're only going to make the domain validated assumption about
certificates.
So the certificate authority is confirming that the attacker does actually own this domain.
And then the most critical part of a web attacker threat model is we assume the attacker
gets an introduction to the user. So you can imagine trying to design a feature that works
as long as the user never goes to a bad site. If you had a white list of all the good sites on
the Internet, maybe that feature would work. In reality, users are constantly interacting
with bad content. So we want to design the browser to be robust in that model.
So this is a basic set of assumptions. I'm going to talk a little bit later about how we
might want to adjust these for particular features where these set of assumptions don't
make sense.
Basically the attacker needs to get their attack code to run on the user's browser. It's a
pull model. The user has to go to a web page before the web page can interact with the
user. So in order for us to reason about the security features of the browser, we need to
assume that the attacker got their code onto the browser somehow, otherwise the attacker
could never do anything interesting to the user.
So here is the really key thing with the web attacker threat model. We're going to be
conservative: we're not going to make the assumption that the user is confused about
where they are. So why do I say this is a conservative non-assumption? Because if you have a security feature that doesn't work, even when the
user is not confused about what's going on, they actually understand how to read domain
names or how to use extended validation, there's no way you can build a user interface on
top of it that the user will be able to use correctly because the browser itself is confused
about what's going on.
There's a whole other body of research, some of which I did at Microsoft, which is how
do you make sure the user isn't confused about what principal they're interacting with?
How do you let them know, this is the real bank website and this is some other fake,
impostor bank website. I think this is a really important research problem. There's no
way we can make headway on this problem if the browser itself is confused about what's
going on inside this rectangle of content.
What the security indicators in the browser are telling you is here is the domain that was
in charge of filling this rectangle with pixels, but what pixels appear there actually need
to be under the control of the Bank of the West, otherwise there's nothing that the
certificate UI can do. This is sort of an attempt to separate out separate areas of research
in browser security.
So if you're a web attacker, you can think of it intuitively as some other window or tab
that the user visited and then forgot about, now they're trying to do something important.
So whenever you're looking at a web feature, always assume that in the background there
is a window that's got malicious code that came from the attacker and it really wants to
attack your session with a site that you trust. The user is not confused about where they
are, so the attacker wants to do things like override the pixels of this area. They may
want to know what the user is typing into the password field. They might want to attack
the storage associated with this principal.
So in the browser you can store cookies, and there's even a local storage database now
that's associated with a particular principal; in this case the URL of the site determines
the principal that the cookies and local storage use. So the attacker wants to get access to
that storage. They may want to know what you're doing in this other window. They
want to know what you're clicking on, what you're typing in.
Another resource that the attacker might want is the IP address of the user that's accessing
this web page. So at Microsoft you're probably familiar with the idea that some web
pages are not accessible to the outside world. You have this firewall. You have lots of
interesting web pages that live inside the firewall. Then you have web pages that are
external that everybody on the web can access. So you should be able to go to a web
page in one tab and have this be an external evil web page. You're not confused about
what web page that is. But that web page wants to use your IP address to connect to a
website that the attacker can't directly connect to. Because of the way the connectivity is
set up, there are web pages inside your firewall that the attacker needs your browser in
order to connect to, essentially using your browser as a proxy into the firewall. These are
some different types of resources that you have in your browser that the attacker might
want to gain access to.
So I've looked at all of these different resources and tried to find cases where the web
attacker might be able to jump in and steal something that the user has. So being able to
access the pixels and keystrokes while you're on the bank website, that's something that
mashup [indiscernible] was trying to solve.
>>: It seems like it's using another [indiscernible], which is the memory?
>>Collin Jackson: That's right.
>>: Pixels is like [indiscernible]?
>>Collin Jackson: That's right. So the storage, you can think of it as being both in-memory storage and also the persistent storage in the browser. There are other types of
resources, these are just five I've written papers on. I think there's more opportunities for
applying the web attacker to things you might want to steal from the user. So I'm going
to talk about this in detail today. I'll also talk about this one.
Briefly, right now the browser renders hyperlinks on web pages differently depending on
whether you've been to a particular URL or not: a hyperlink renders either purple or
blue. We looked at how a web attacker might use that to determine what you're doing in
another window and proposed a solution for that.
And also there are some tricks that the attacker can play with DNS that allow a web
attacker to essentially hijack your IP address and connect to resources inside your
firewall. We've been working on that problem as well. But I'm going to talk about the
first three next.
So here's an outline for what I'd like to talk about. First I'm going to talk about how
we understand whether this web attacker threat model is realistic. How expensive is it to
get an introduction? Are there large-scale attacks possible just by setting up a web server
and trying to attract users to it? Then I'll talk about how we take various web features,
analyze them, find vulnerabilities, propose solutions that get deployed by vendors. The
deployment problem is a particularly hard problem on the web because of this backwards
compatibility issue.
And then I'll talk about different additional capabilities that we might give the web attacker,
if you want a feature that's a little bit more secure, a little bit more powerful.
So one potential approach to the web attacker, you know the web attacker is this website
out there that's got some evil JavaScript code on it. You could say we're going to find all
those bad websites, categorize them as being bad, and throw up a warning if the user ever
goes to that website. There's actually been a lot of work in trying to figure out which
websites are trying to attack you and which websites aren't. Google has been building up
this big database. There's also this phishing filter in Internet Explorer that tries to warn
you when you go to a bad site. I mean, this is a warning that I took just a few days ago
on a real government website, because the attacker had injected some JavaScript code via
cross-site scripting into the government website, and that evil JavaScript code was trying to
execute various types of malware, various types of attacks.
So the problem with trying to block all the content on the web that's evil is that it's just
the barrier to entry is so very, very low that the attacker, if they found this warning on
their site, wouldn't have to go through very much effort in order to set up another one,
and another one. And you could keep doing this with stolen credit cards until the
blacklist had to store huge amounts of content in the browser in order to warn the user in
real-time when they went to one of these bad sites.
The spider that's going out and crawling all the bad sites, might see the bad JavaScript but
not be able to find it in time or the website might serve up different JavaScript to the
spider than the actual user who's using a real browser.
So I think this is an important approach. I think it works better for phishing and malware
than it does for cross-site request forgery and other types of web attacks that work
in the web attacker threat model.
So I said that setting up the server is really cheap. You can pay $10 and get a domain and
hosting. You can get a domain-validated SSL certificate for zero dollars. We found a
CA that will, just by virtue of you owning that domain, give you an SSL certificate. So
that's why we assume that the web attacker can talk over the HTTPS protocol.
But we weren't sure how much it would cost to actually get introductions to the user.
And this is important because we need to figure out just what the scale of an attack might
be if an attacker knew about a vulnerability in the browser.
So we ran a series of experiments at Stanford where we used ad networks to try to test
our attack code. We wrote some evil JavaScript that used some vulnerability in the
browser and then we found our ad network, we placed an iframe on that ad network that
ran our attack code and checked to see whether it works.
You're probably wondering how I got this past the [indiscernible] at Stanford. And the
answer is that we only attacked ourselves. We were trying to bypass the browser's
security policy, the same policy that prevents you from attacking other sites.
So we set up two servers at Stanford, and essentially all we did was try to take some data
from one server with some code that was served by the other. And so this way we
weren't harming any other websites on the Internet, we were only running the test attack
against ourselves. So there was nothing that we could steal, but we were able to
demonstrate the viability of the attack with our attack code that was on the user's
browser.
So what this graph shows is the amount of time that we had on the user's browser while
some ad was showing. So this is an example of a web page. It's a news site. This is an
example of the iframe that we had. You can't see it, but there's a bunch of JavaScript
running in this iframe while you're reading the news. And that JavaScript is running the
experiment to find out whether or not the attack worked or not.
So what we found is that a lot of users navigated away very quickly, after one or two
seconds. They clicked in the URL bar, typed where they wanted to go, and the ad was shut off.
But there were a lot of users who stayed there for days or even weeks. And I think that
this is really a testament to the fact that users keep a lot of tabs open, and they don't
reboot very often anymore. And there's really no reason to close a tab unless you need to
restart your browser.
And so if you're a web attacker and you want to do an attack on a long time scale, $1 will
buy you two thousand impressions. So if you spend $1,000, you can get two million
impressions. And that's actually a large number of browsers; it's enough that you could
get exposure to a lot of browser vulnerabilities even if they are only specific to a particular
platform.
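As a back-of-the-envelope sketch of that arithmetic (only the $1-per-2,000-impressions figure comes from the talk; the function names and the 1% success rate are mine):

```python
# Rough cost model for reaching browsers through an ad network,
# using the figure quoted above: $1 buys about 2,000 impressions.
IMPRESSIONS_PER_DOLLAR = 2000

def impressions_for_budget(budget_dollars):
    """How many ad impressions (introductions) a budget buys."""
    return budget_dollars * IMPRESSIONS_PER_DOLLAR

def cost_per_compromise(success_rate):
    """Expected spend to compromise one browser, for an attack that
    succeeds on the stated fraction of impressions."""
    return 1.0 / (IMPRESSIONS_PER_DOLLAR * success_rate)

print(impressions_for_budget(1000))   # 2000000 introductions
print(cost_per_compromise(0.01))      # dollars per compromised browser
```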
So this is understanding whether or not an attack works and how many different browsers
you could hack at once. But what if you want to run an experiment that determines
whether or not a security feature is working?
Yeah, go ahead.
>>: [Indiscernible]?
>>Collin Jackson: So there's actually a big business in trying to determine whether or not
an advertiser is serving up evil iframes or not. This was a big problem for MySpace.
Several times now someone has taken an iframe on MySpace and then used it to try to
exploit ActiveX or some other browser plug-in vulnerability to install malware on the
user's machine. And this actually got viewed by tens of thousands of users before
someone realized what was going on and shut it off.
So I can give you some news articles that talk about this attack happening in practice.
But there is evidence that real attackers are looking at this model and are finding it to be
sort of financially attractive. $1 for two thousand users is pretty good, even if your attack
only has a 1% success rate.
So we can also use this sort of ad network methodology as a platform for experimenting
with different browser's security features and figuring out how they work, treating the
web as sort of an ecosystem rather than a particular product because there are lots of
different browsers out there, lots of different versions of those browsers, lots of different
configurations. You've got firewalls, and other things in the network affecting the HTML that
people view.
You might think a particular security assumption holds on the web, but without actually
testing it, seeing whether or not the browser's security policy is being enforced correctly,
you can't know whether or not it works the way you think it does. So you can test your
hypotheses on a large scale like this.
So we had a hypothesis about how the browser's Referer header ought to work. We wanted to test it.
We got about 200,000 users to run our JavaScript code by virtue of displaying our iframe
and came up with some very interesting observations about the way Referer headers are
handled.
We had a hypothesis that the browser blocks the Referer when you have an HTTPS page
requesting an HTTP image. We found that to be true most of the time. What was
interesting is the Referer was also being blocked from HTTP to HTTP about 3% of the time,
which is a significant fraction of the time. And we did some more experiments that
found that that blocking was not happening over HTTPS. The X and Y here are just two
servers we set up. If you see X connecting to Y, that means the Referer was X and the
server you're connecting to is Y.
So we found this interesting observation that when you have HTTPS requests, the Referer
is very unlikely to be blocked, but it's likely to be blocked if you have HTTP. From this
we formed our hypothesis that most of the Referer blocking that's going on in the
network is happening as a result of proxies and other things that live outside of your
computer, and only have access to your HTTP traffic. They have a hard time modifying
your HTTPS traffic in transit.
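The kind of aggregate that experiment reports can be sketched like this; the observation counts below are invented to mirror the roughly-3% HTTP blocking figure, not our actual data:

```python
# Estimate Referer-blocking rates separately per URL scheme from
# per-request observations of (scheme, referer_seen).
def blocking_rate(observations, scheme):
    """Fraction of requests on the given scheme whose Referer was
    stripped before it reached the server."""
    total = blocked = 0
    for obs_scheme, referer_seen in observations:
        if obs_scheme == scheme:
            total += 1
            blocked += not referer_seen
    return blocked / total if total else 0.0

# Synthetic sample: ~3% of HTTP requests blocked, none over HTTPS.
observations = ([("http", True)] * 97 + [("http", False)] * 3
                + [("https", True)] * 100)
print(blocking_rate(observations, "http"))   # ~0.03
print(blocking_rate(observations, "https"))  # 0.0
```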
So as a result of this experiment, we made a bunch of recommendations to vendors about
how they should handle the Origin header proposal, which is a new proposal for a
replacement for the Referer header. We're trying to understand what privacy problems
people could have with the Referer header that cause them to block it, and how we can
provide a new header that provides the security that people want from the Referer header
without having all the privacy problems. I'll talk a little bit more later about why you
might want to use the Referer header to prevent attacks.
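To make the intent concrete, here is one way a server-side check in the spirit of the Origin header proposal might look. The endpoint and origin names are made up, and a real deployment would have to decide what to do with legacy browsers that send no Origin at all:

```python
# Hypothetical server-side guard: accept a state-changing request only
# if its Origin header names a site we trust, instead of relying on
# the ambient cookies alone (which is what request forgery exploits).
TRUSTED_ORIGINS = {"https://bank.example.com"}

def allow_state_change(method, headers):
    """Return True if a request with this method and header dict
    should be allowed to change server-side state."""
    if method in ("GET", "HEAD"):   # safe methods: no state change
        return True
    origin = headers.get("Origin")
    if origin is None:              # header stripped or absent:
        return False                # fail closed in this sketch
    return origin in TRUSTED_ORIGINS

print(allow_state_change("POST", {"Origin": "https://bank.example.com"}))  # True
print(allow_state_change("POST", {"Origin": "https://attacker.example"}))  # False
```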
So, anyway, this is the measurement approach we've taken at Stanford. Now I'd like to
talk a little bit about how we've actually tried to find vulnerabilities in features in the web
attacker threat model now that we understand the web attacker threat model is a realistic
thing for an attacker to use.
So here is a very simple web attack. The user goes to a login page and you can't tell, but
there's an iframe on this page. The iframe is the login box, because this is a module that's
used on multiple Google properties. So right now it looks like the user is only interacting
with Google. You have the security indicators that are all in the right place and they say
Google on them, so there really doesn't seem to be any interaction with some untrusted
third party.
The web attacker threat model, we don't look at this one window. We have to assume the
user has lots of windows open, they're all sitting around in the background, and the
attacker at any moment might try to jump in and steal some pixels or keystrokes from this
page. Here is how the attacker might do it. They want to replace this login frame with an
evil login frame, so when you type in your user name and password, it will submit the
password to the attacker instead of submitting it to Google.
So what the attacker can do in some other tab in the background is use the window.open
API. Essentially they name the frame that they want to access and they pick a URL
that will populate this with content that's controlled by the attacker.
Now, the browser has this mixed content warning system which tries to warn you when
you have an HTTPS page rendering content from an HTTP URL. So this is why it's
important we make this assumption the attacker can get an SSL certificate, because if
they can, they can make sure this iframe will still be served over HTTPS and the user
won't get any warnings telling them it's a bad idea to log into this frame.
So once the attacker has replaced that frame with their own content, you'll never know
you're actually talking to the attacker when you type in your password.
>>: [Indiscernible]?
>>Collin Jackson: Right. So the browser has this window.open API that allows you
to name a window anywhere on your system. You specify a new location for that
window. This API is designed to be cross origin. So you can name a window that's not
under your control. The browser is supposed to navigate that window to a new location.
It's a very dangerous API.
>>: [Indiscernible]?
>>Collin Jackson: Yeah. So how did the attacker get introduced to the user? The web
attacker threat model says the attacker gets an introduction for free. In practice, what
they're doing is you've got some tab open underneath this one where there's an ad.
>>: The ad is a script?
>>Collin Jackson: The ad -- yes, it's an iframe that actually has JavaScript running. Half
of the networks limit what content you can put in the iframe.
>>: The browser might limit JavaScript in an iframe?
>>Collin Jackson: That's right. So the security="restricted" flag on iframes, for example, is
a Microsoft proposal that would allow you to limit what can happen inside of an iframe.
There's some effort underway to standardize that in HTML 5 so other browser vendors
can adopt it as well. The problem with limiting what can go on in the iframe is some of
that functionality is intended. It may be that the ad provider actually needs to run some
JavaScript to figure out which ad to show on your system, and so there's a tradeoff
there.
For the moment, the sort of finance aspects of advertising seem to be winning over the
security aspects. So I haven't seen a whole lot of people using security="restricted" yet.
Yeah?
>>: So here you have a malicious ad [indiscernible] this page.
>>Collin Jackson: So the ones we've looked at typically put your ad in a cross domain
frame. So they have some throw-away domain that they don't care about. So this is, you
know, Boston Metro News. Often what happens is Boston Metro News will iframe the
ad provider, and the ad provider makes a second iframe, just some domain that they don't
care about. And that's where your actual ad lives. So it has no privileges to the browser
other than living at this strange domain and the ability to run JavaScript in your browser.
>>: My next question is, can such an ad be run across the iframe, how can it
[indiscernible]? Does it really allow this kind of cross domain navigation?
>>Collin Jackson: Yeah, so it used to be that this attack worked, and it worked in almost
every browser. Browsers independently fixed this particular issue, but unfortunately if
you had Flash Player installed, Flash hadn't gotten the memo that this is a bad thing, and so
it still allowed you to navigate any window on the system across origins. There was
no origin checking going on when you tried to make a navigation.
So what we did is we contacted all the vendors and said, Look, this attack is bad, and we
provided a bunch of different variants on the attack to show why you might want to lock
down this API. And we got everyone to agree, it was a lot of work, but we got everyone
to agree on what's called the descendant policy, it's actually based on IE, that says you
can navigate a frame if you control a frame that is one of the parents or grandparents of
that frame.
So if you want to navigate this frame, you better be in control of Google.com. You don't
actually have to be the direct parent of this frame, but you need to be in the same origin
as this one.
And so we came up with some arguments why the policy made sense based on pixel
delegation, the idea that if you give someone some pixels, you can take them away at any
time. And so that's why that got adopted.
So this is an example of the web attacker trying to steal your keystrokes. And the key
assumption that we had to make was that you weren't just interacting with this page, you
had some other guy in the background that was trying to jump in.
So let me give you another example of a web attack. Any questions about this before I
move on? Okay, so let's talk about cross-site request forgery. I'm sure this is a topic
that's near and dear to some of our hearts. In the basic cross-site request forgery attack, the
user gets introduced to the attacker, because that's how the web attacker works. And the web
attacker is going to create an HTML form that's going to submit some HTTP request to a target
server. And in this case, we're relying on the fact that users keep concurrent sessions
open with a lot of different sites at the same time. So they're already authenticated to
this bank website using cookies. The attacker can automatically, in JavaScript, without
any user interaction, submit this form -- they serve up this HTML document, the
HTML document submits the form to the bank website, and then the bank sees that the
user appears to be trying to transfer some money into the attacker's bank account. So
because the user's already authenticated using cookies, the bank allows the request.
This is a very basic CSRF attack.
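A minimal simulation of why the basic attack works (all names and amounts here are illustrative): the bank authenticates purely by session cookie, and the browser attaches that cookie to cross-site form posts automatically, so the forged request is indistinguishable from a legitimate one.

```python
# Toy CSRF-vulnerable bank: the transfer endpoint trusts the session
# cookie alone, with no check of where the form post came from.
sessions = {"cookie123": "alice"}        # session cookie -> user
balances = {"alice": 100, "attacker": 0}

def handle_transfer(cookie, to, amount):
    """Bank endpoint: authenticate by cookie, then move money."""
    user = sessions.get(cookie)
    if user is None:
        return "not logged in"
    balances[user] -= amount
    balances[to] += amount
    return "ok"

# The attacker's page auto-submits a form to the bank; the victim's
# browser attaches her cookie to the cross-site request for free.
result = handle_transfer("cookie123", to="attacker", amount=100)
print(result, balances)   # ok {'alice': 0, 'attacker': 100}
```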
So we'd like to prevent this attack. But before I talk about how to prevent it, let me
describe a variant on it. This is something that we discovered at Stanford called the login
cross-site request forgery attack. And the idea is that when you go to the web
attacker's website, rather than trying to transfer money out of your account, or do
something with your existing session, all the attacker wants to do is create a new session
for you. And so the attacker submits this form to the login page of a particular web page,
and then you become logged in under the attacker's account. Essentially it's a CSRF
attack, but instead of targeting the transfer page, we target the login page.
So now you're authenticated under the attacker's account using the attacker's password.
Why would that be a bad thing? So there's a lot of examples that we came up with. One
example is, when you do a web search, if the -- if Google thinks that you're logged in as
the attacker, it gets stored in the attacker's search history. So now the attacker knows
what you're searching for even if you're only using the search box in your browser. So
this is a variant on the CSRF attack.
>>: How does the hacker know that your browser [indiscernible]?
>>Collin Jackson: So the attacker knows the attacker's own user name and password.
They use this web attack: the user gets introduced to the attacker via some iframe ad or
whatever. The attacker silently logs you into Google under the attacker's account. And
now the attacker, who knows the attacker's own user name and password, can go to Google, log in,
go to the web search history page, and watch as entries start getting added to it.
Every time the user does a search, Google thinks that they're logged in under the
attacker's web account. So it gets added to the attacker's search history.
So this sort of turned our CSRF understanding on its head because people used to call it a
session hijacking attack, or a session riding attack. The idea being that you already had
to have a session, and the attacker was just trying to jump in and join it.
But what's fundamentally happening here is that the attacker's making a form post
abusing the fact that the form post API in JavaScript is cross origin, and confusing the
web server as to where this request is really coming from.
So let me give you a more subtle example of this attack. So let's say you go to a web
page that has a PayPal donate button. This is how a lot of websites make money. Users,
out of the goodness of their heart, make a donation. But maybe you only want to donate
$5 to [indiscernible].com and you don't want to give away everything you've got. So you
click the donate button and you get sent to the PayPal login page. In this case, the
attacker opened it up in a new tab. So this is nice because the attacker's now sitting here
in the background being a happy web attacker trying to join into your session with PayPal
at just the right moment.
So this is actually what it looks like when you're about to make a donation. You're
supposed to enter in your user name and password, click login, and for a few seconds you
see this sort of logging in screen, and then you land on some page that assumes you're a
logged-in user.
So in this case, PayPal wants me to add a checking account to my PayPal account. So
let's say I fill in my bank details and click continue. What do you think happens?
>>: [Indiscernible].
>>Collin Jackson: Exactly. So while this logging-in prompt was running, the attacker is
sitting there in the other tab and can submit any form request it wants to the PayPal site.
So that wouldn't be possible if the attacker didn't have a foothold in your browser.
There's something about the attacker being able to do things concurrently with PayPal
that allowed this attack to work. And as a result, when the user clicks continue, they've
just saved their banking information into the wrong account. The attacker can now log
into their PayPal account, which they have the user name and password for, and make
transactions using the user's credit card.
So we've been working with a variety of browser vendors and websites to try to build
better defenses for cross-site request forgery. Go ahead.
>>: So basically you assume the user is visiting the attacker page first?
>>Collin Jackson: Right. So with the web attacker, we always assume the user visits the web
attacker's page. If you don't make that assumption, a lot of the attacks get a lot harder. But
we've done these measurements that show it's actually quite common for users to get
introduced to attackers. So we want to make sure that we are securing PayPal against
that.
>>: Do you think the problem here is that because these two tabs are sharing a session,
should the two tabs share the session?
>>Collin Jackson: Well, I think you're hitting on a good point, which also came up with
the frame example before, which is that there are all these cross origin APIs in the
browser: frame navigation, form submission. It's very easy to mess up if you're a
website and get hosed by these cross origin APIs. These APIs I think are the most
important APIs to analyze for security. Unfortunately, I think a lot of them were
designed in a simpler era when people were really not interacting with as many
principals at a time.
So, yeah, you could design a browser that didn't have any cross origin APIs, but a variety
of web pages which were designed sometime between 1996 and 2009 might be relying on
those APIs. So now you have this dilemma: do we break existing web pages or do we
allow the current web pages to run, but somehow adjust the protocol so that just the
attacks are blocked and different types of legitimate activity are still allowed?
I think that's really something that a lot of the standards bodies are wrestling with
right now with respect to this cross origin request forgery attack. They want to find some
way that we can still allow cross origin form posts, because so many websites are using
them. Every website that adds a Google search box to their web page is using a cross
origin form to connect to Google. In fact, you can think of any cross origin hyperlink as
somehow connecting two domains together in a cross origin way.
So people are trying to design a solution to cross origin request forgery that doesn't break
the web, but somehow makes this problem of how do you defend your website against
CSRF a little bit easier.
The way that PayPal is doing it today, they have these secret tokens that are on their
transfer forms. And when you click submit, they check to make sure that the web page
that has the transfer details also has the secret token that they put in the original form. If
it's not there, they reject the request.
It turns out it's quite hard to build a web page that does this correctly. We've analyzed a
lot of different frameworks that are trying to do this automatically, and we found
vulnerabilities, or they leaked the token in various ways. So we'd like to come up with
something that's a little bit simpler.
One thing that I do like is that Rails and other web frameworks, like ASP.NET, are
building CSRF protection into the framework so that each web application doesn't
have to solve this problem individually; developers can just pick their framework of choice and
then get protection automatically.
But we've been looking at the Referer header as a possible alternative to the secret tokens.
Unfortunately it's not usable over HTTP because of all the blocking that goes on, and we
didn't really know that until we ran our experiment to see just how bad it was.
But now that we know that blocking over HTTP is really bad, the silver lining is that over
HTTPS Referer accuracy is very good. And so if PayPal wanted to prevent this login CSRF
attack, they could do it with only a few lines of code by essentially just doing Referer
checking on the login page. They wouldn't even have to implement a secret token
defense because their site is served completely over SSL.
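A sketch of what that Referer check might look like, with illustrative names (it assumes the WHATWG `URL` parser that Node.js provides):

```javascript
// Sketch of login-page Referer checking for an all-HTTPS site.
// Over HTTPS the Referer header is rarely stripped, so a missing
// or cross-origin Referer on a login POST is treated as a forgery.
// Function and parameter names are illustrative.
function refererAllows(refererHeader, expectedOrigin) {
  if (!refererHeader) return false;      // strict: no header, no login
  try {
    return new URL(refererHeader).origin === expectedOrigin;
  } catch (e) {
    return false;                        // malformed header
  }
}
```

A login handler would reject the POST when this returns false, which blocks the login CSRF shown earlier without any token plumbing.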
All right. So I talked about a few examples of things that we've worked on at Stanford to
try to take the basic web attacker threat model, the attacker just gets an introduction, and
see what kinds of attacks we could find and then work with vendors to deploy a solution.
But sometimes the web attacker threat model isn't strong enough to describe the attacks
that we're worried about. So I'd like to give you a few examples of where web security
research departs from the web attacker threat model.
So one thing that you might be interested in is the attacker is like a rogue network access
point or the attacker is someone who's sitting next to you at Starbucks. And you might
want to protect your sessions with websites against that kind of attack.
It's very hard to do this with basic HTTP because HTTP was not designed for security
against an active network attacker. But you may be able to protect yourself if the website's
using SSL.
We found a bunch of attacks on sites that use HTTPS ranging from mixed content
vulnerabilities to problems with cookie integrity. And we've been proposing solutions
that try to make it easier for websites like PayPal that want to do everything over SSL to
protect themselves against these active attacks.
But it turns out to be a deceptively hard problem. So that's one thing you might do, is
essentially give the web attacker also the ability to corrupt your sessions with the website.
But the attacker still gets to live in a separate tab, running JavaScript code running
simultaneously while you're at the SSL site. That turns out to be very important for some
of the attacks.
So another thing that you might want to give the web attacker, and this is a threat model that
I think Helen is very interested in with the Gazelle browser, is an attacker knows about a
vulnerability in your browser, but luckily the vulnerability is somewhere in a sandboxed
part of the browser. So how do you design the browser such that the attacker who knows
about a vulnerability still can't bypass the browser's security policies and do bad things,
like install malware on your system, read all the files in your system, or even bypass the
same origin policy and connect to other sites?
So there's been a lot of work in doing this. I worked on the Chrome Project at Google,
which was trying to focus on -- our initial version was really focusing on protecting your
file system from being read or written by a website that found a vulnerability in your
browser. But I think there's a lot of work in trying to go even further and enforce more of
the browser's origin policy in some sort of privileged component under the theory that
most of the vulnerabilities live in the image renderers or the HTML and JavaScript engines,
which have most of the complexity of the browser, essentially building the browser like
an operating system where you have a small privileged component that's the trusted
computing base, and then a large component that has all the complexity of the web.
So it's important that the attacker be able to get their code onto your system. So I think the introduction
aspect of the web attacker is a very important part of analyzing the effectiveness of these
solutions.
Another example that we've been looking at is the web attacker gets to have a widget in
your mashup. So the user knows that this widget is controlled by the evil gadget guy.
And that's fine, as long as it lives within its little rectangle; the web attacker
shouldn't be able to affect all the other components of your mashup.
So iGoogle is a very [indiscernible] version of a mashup, but you can really think of any
ad network that serves ads and iframes as a form of mashup, because they're bringing
together content from multiple sources.
So how do you do things like design a communications scheme where the integrator,
Google in this case, can talk to the evil gadget without being attacked by all the other
frames?
So Microsoft designed a protocol that does this using fragment identifier messaging. And the idea
is that you attach messages to the end of your URLs, and because the navigation API in
browsers is cross origin, this works even if you're sending a message to an iframe that's
been put in a throwaway domain. So this allows you to build the mashup that has lots of
different iframes. The code is written by lots of different people, but they can still
communicate using these messages.
So the key thing about navigation is that you can navigate a frame to a URL, but you can't read the
URL of a cross origin frame. It's a write-only API. And when you navigate, you get to
know what URL you're sending the message to, including the domain. So you get confidentiality:
if you ever set the URL of a frame, you know that it's only going to be readable
by the origin you're trying to send a message to.
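The write-only fragment channel can be sketched outside the browser like this. The href strings stand in for `frame.location` and `location.hash`, which a real page would use; this models the channel itself, not any vendor's library:

```javascript
// Sketch of fragment identifier messaging. Setting a cross origin
// frame's URL is allowed (write), reading it back is not, so putting
// the payload after '#' delivers it confidentially: only scripts
// running inside that frame can read their own location.hash.
function fragmentSend(frameHref, message) {
  const base = frameHref.split('#')[0];            // keep page, replace fragment
  return base + '#' + encodeURIComponent(message); // navigate the frame here
}

// Receiver side: parse our own URL's fragment. Note there is no way
// to tell WHO navigated us here -- the channel has confidentiality
// but no authentication, which is what the multi-stage protocols
// discussed next try to add.
function fragmentReceive(ownHref) {
  const i = ownHref.indexOf('#');
  return i === -1 ? null : decodeURIComponent(ownHref.slice(i + 1));
}
```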
However, there's no authentication. There's no way using basic messaging to know
where you got a message from. It could have come from anyone. It doesn't say in the
message in any sort of secure way, This message was sent to you by the integrator.
So it's kind of analogous to having your messages encrypted with the recipient's public key: anyone can send, but only the intended recipient can read. So
IBM tried to build something like this. Microsoft also tried to build something like this.
And they're essentially trying to add authentication to the existing confidential API.
Some way of essentially figuring out where a message came from using a multi-stage
protocol.
So we looked at Microsoft's protocol. We had to reverse engineer it a little bit. We found
that it was very similar to the Needham-Schroeder protocol for authentication over a network.
Unfortunately, there's this anomaly in the Needham-Schroeder protocol that basically
says if the good guy ever tries to initiate an interaction with the bad guy, then the
authentication property of the protocol can be broken. And there was an analogous attack
on Windows Live channels that did exactly that.
So here is an example. You have a gadget that thinks it's talking to the integrator, when
actually it's talking to another gadget in the same page that was written by someone other
than the integrator.
So we proposed a fix to the protocol that's based on the Needham-Schroeder-Lowe protocol. And this got adopted by both the SMash team as well as the Windows
Live Channels library. So that's an example of where we took the web attacker and we
gave them something else: a foothold into the mashup. And that ended up being very
useful for finding an attack that otherwise wouldn't work.
I should note that there are certain threat models that don't even make sense for analyzing
the Windows Live Channels. For example, Windows Live Channels is not designed to
resist a network attacker. So a network attacker could potentially fake messages to the
integrator and you'd never know because that's not part of what the protocol's trying to
do.
But if you pick the gadget threat model where the attacker gets a gadget in somebody
else's mashup, then you can actually find this attack.
So fragment identifier messaging is kind of a hack. We've been advocating for a new
protocol that adds support for cross origin communication. The idea is that if you want to
send a message from one frame to another, whether it be in a different window or the
same window, all you have to do is just call the postMessage method of that frame and
specify whatever your message is. And then the browser will let this guy know where the
message came from.
So, unfortunately, the original postMessage proposal had a problem. It didn't really
provide confidentiality. It had the reverse problem that fragment identifier messaging had.
So you could send messages and you'd know exactly where each message came from, so
there was no problem with that. But when you received a message, there were attacks
using frame navigation where the attacker could jump in and receive your confidential
message without you intending it.
So we wrote a paper at Microsoft describing a communication mechanism
between origins called MashupOS. And the MashupOS primitive for
communication across origins had this second argument where you specified the
origin that you were trying to send the message to. And so what we did is we proposed
that postMessage adopt this second argument. So when you send a message, you're not
only saying what you want the message to be, but also who you want it to be received by.
And luckily we got this in just in time before it was adopted in Firefox. And so the initial
implementation in Firefox shipped with this mandatory second argument. And now all of
the browsers, including IE, have adopted this second argument that lets you say who
you're sending a message to.
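The shape of the resulting API can be sketched with a tiny stand-in for a browser window. The `makeWindow` helper here is purely illustrative; in a real page the browser supplies `window.postMessage` and delivers `message` events with a trusted `origin` property:

```javascript
// Toy model of HTML5 postMessage with the mandatory targetOrigin
// argument. makeWindow stands in for a browser window; the real
// API is window.postMessage(data, targetOrigin).
function makeWindow(origin) {
  const win = { origin, received: [] };
  win.postMessage = function (data, targetOrigin, senderOrigin) {
    // Confidentiality: if the frame has been navigated elsewhere since
    // the sender grabbed its handle, its origin no longer matches
    // targetOrigin and the browser silently drops the message.
    if (targetOrigin !== '*' && targetOrigin !== win.origin) return;
    // Authentication: the browser, not the sender, attaches the true
    // sender origin, so the receiver can check event.origin.
    win.received.push({ data, origin: senderOrigin });
  };
  return win;
}
```

The gadget then handles only events whose origin matches the integrator, and the integrator always passes the gadget's origin as `targetOrigin` rather than `'*'`.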
Yeah?
>>: [Indiscernible]?
>>Collin Jackson: Very good question. The question mark in the circle is the
WHATWG, the Web Hypertext Application Technology Working Group.
They are the working group behind HTML 5. And this is a standards group that's moving
very fast, uncomfortably fast for some people, at standardizing a lot of the de facto
features that web browsers implement that no other working group is taking on. So
things like content sniffing in browsers, something that every browser has to do, it's not
standardized. It's extremely important for security. And so we've been working with the
working group to try to get those features written down, so if you're building a new
browser -- and, by the way, there's been a lot of competition in the browser market lately,
so I'm very excited to see that people are interested in building new browsers.
If you're building a new browser, you don't have to invent your security policy from
scratch. There's a document out there somewhere that tells you all the hairy details of
how you implement these policies.
All right. So I think I'm running out of time, so I'm going to wrap up.
I think the web attacker threat model is a nice basis for research because it separates out
the user interface problems of how do you figure out where the user thinks they are and
how do you protect their password that they're entering into the page from how do you
make sure the browser's not confused about how it's allocating the resources that this web
page has.
Things like your pixels, your cookies, those are problems that really don't require the
user. And you shouldn't need a user interface in order to protect those resources. They're
completely outside of the UI problems. You don't need a user study in order to
figure out whether or not you've successfully prevented the attack. What you need
instead is an analysis based on what the capabilities of the threat model are to show
whether or not it's possible for the attack to work.
So I think as the web progresses, we're gonna find more and more ways for the attacker
to get introduced to the user. I think we can't necessarily blacklist every bad website on the
Internet. These ad networks are going to cause a lot of introductions to happen. And
that's good. I think it's good for there to be a very low barrier to entry to get your
JavaScript code running in the user's browser. But there's a responsibility with these
cheaper introductions of building a security policy that can deal with all of the different
principals that are throwing code into the user's system.
I think that very slowly we're making progress on the authentication problem. I think
there are a lot of promising options out there, things like password authenticated key
exchange, CardSpace. I think it's good that people are experimenting with that. I think
eventually we'll make some headway on the phishing problem.
But as we do, it will become more and more important that the browser's security policy not be
confused, because the user will be less confused. And so the browser will become the
greatest point of weakness.
And I think that we may not use exactly the weakest threat model, the web attacker that I
introduced at the beginning, but we'll keep giving the attacker some very minor tweaks to
their capabilities. Maybe give them an introduction into a mashup like I showed earlier,
or some -- know about some bug in the browser.
So we're gonna keep tweaking the threat model, but this cheap introduction I think is
going to remain the same. That's why I think this is a good way for thinking about web
security rather than just assuming the user will only go to good websites.
So that's my research talk. Any questions? You guys have been a really good audience.
Thanks for the questions. Yeah?
>>: I'm stepping back. Do you think the web is getting more secure?
>>Collin Jackson: I think the web is getting more useful. I think that's what matters.
>>: That's not the question I asked.
>>Collin Jackson: If you look at what is happening with fraud as a percentage of web
transactions, it's been going down, even though in absolute terms it's on the rise. So I talk to a lot of guys
at the anti-phishing working group.
>>: How do they know these things?
>>Collin Jackson: The credit card companies often see a lot of the fraud themselves
because they eat it. They want to eat it. That's their business model. When someone's
credit card number gets used in a malicious transaction, the user complains, the credit
card company's job is to pay for the damage.
They try to push it off on to the store owners as much as they can. They keep very good
logs of what's going on there. They're the ones who know web fraud is something like a
nickel out of every hundred dollars that's going on on the web, which is actually nothing
compared to the fees that Visa charges you with every transaction.
So it's under control. I think the web is becoming more useful. I think attacks are still on
the rise. If you look at certain types of web attacks, things like malware resulting from
websites getting hacked via SQL injection, there are particular sectors growing at
alarming rates. I think there's going to be a lot of work to do for web security researchers
for many years to come.
But I'm perpetually an optimist that the web is going to continue to be used as a platform
for building really interesting web applications. We'll probably see a lot more native
code running. I know there's been a lot of proposals to do this. The browser is going to
become a more powerful thing. But the idea you can go to a web page and the web page
lives in a sandbox that doesn't have complete control over your computer I think is an
extremely powerful way to build websites, and it allows users to interact with people they
don't trust quite as much as they would if they wanted to install a program onto their local
system.
So I think it's really good that browsers are being used because the alternative of just
downloading arbitrary code from people and running it is really bad.
>>: What do you think is the main pain point in today's web applications that's actually
preventing web applications from being [indiscernible] the same as desktop applications, to be
treated the same? What do you think the major hurdles for web applications are?
>>Collin Jackson: So the question was, what are the main hurdles for web applications,
for them to offer near-to-desktop experiences and be as powerful as a desktop
application.
So I guess there's two approaches to that question. You can ask what the development
challenges or what are the security challenges.
I think that as a web application developer, there's a lot of trivia that you need to learn in
order to build a secure web application. And you're constantly going to be running up
against different security restrictions in browsers that prevent you from doing things that
you want, like uploading files from the user's file system, going to full-screen mode, lots of
things that you can't do for security reasons.
So I think there needs to be some way to solve that problem, and I think that one
approach that you might want to take is to really focus on these frameworks, things like
ASP.NET, Ruby on Rails, that will solve all of the uninteresting, trivial web problems,
things like cross-site scripting, cross-site request forgery, so the web developer doesn't
really have to think about that. They can code in a higher level language that allows them
to abstract away all the details of how the web forms work and how the JavaScript
works.
works. So there's been some effort in this direction, but I think it's still -- you still see a
lot of people hand coding web applications. So there's a lot more work to do.
>>: Do you think we have the solutions to these problems today? Do you think those are
[indiscernible]?
>>Collin Jackson: When you look at problems like SQL injection, you know the
answer to SQL injection: you use parameterized SQL. Somehow, this is still the biggest
source of vulnerability on the web. A lot of that has to do with the fact that legacy web
applications make up the majority of the web. And so we need to provide a solution for
those people or we need to provide really, really powerful frameworks that encourage
them to go rewrite their application 'cause it's just so easy to do that.
And I think that we're making progress. It's not as fast as I'd like. But that is happening.
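The contrast behind that answer is easy to see in a sketch. The query-object shape here mirrors what common drivers accept (for example node-postgres); the names are illustrative and no real database is involved:

```javascript
// Why parameterized SQL closes the injection hole: the statement
// text is fixed and values travel separately, so user input can
// never become SQL syntax.
function unsafeQuery(name) {
  // String concatenation: attacker input is spliced into the SQL.
  return "SELECT * FROM users WHERE name = '" + name + "'";
}

function parameterizedQuery(name) {
  // The value rides alongside the constant statement text.
  return { text: 'SELECT * FROM users WHERE name = $1', values: [name] };
}
```

With the classic `' OR '1'='1` input, the concatenated version gains an always-true clause, while the parameterized version's statement text never changes.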
>>: One of the problems with frameworks is that if you're lucky, the people that wrote
them have done everything you need, but they're not always anticipating what you're
doing.
>>Collin Jackson: That's right.
>>: And so you no longer understand whether you're getting certain protections or under
what circumstances, if you want to modify that framework in some way, that you will no
longer get those protections, and whatever you provide in a framework is always going to
be some delta behind the cutting edge because you can never ship a framework at the
same rate as the people who are on the cutting edge.
>>Collin Jackson: Yeah.
>>: So I find the framework -- there's something fundamentally unsatisfying about solving
things in the framework, because you still have this horribly complex system underneath
that is very brittle, that you're hoping holds together because of this framework.
>>Collin Jackson: Yeah.
>>: Can't you have, like, a Windows Update mechanism? When we know the scripting is
messed up, there's some flaw, a couple weeks down the road you make a fix for the API,
and anyone using the framework gets a notification saying this framework has changed, do
you want to take the risk of taking the patch [indiscernible]? I'm saying I think one thing
that's really difficult is one thing you said: there's so much trivia, security is so
complicated. Security experts don't even know all the things they need to do. How do we
expect a developer to understand all the nuances they have to follow to build a secure
website?
>>Collin Jackson: Yeah, I mean, I definitely think that the complexity is slowly getting
out of control. Browser vendors are constantly under pressure to add new features. They
want to make the web more usable, closer to desktop features. With each new feature
they add, there's a risk it's going to create a pitfall for some website to fall into.
You can't treat the web as a static platform where we'll just fix all the bugs and then
everything will be perfect and we can build a really awesome framework on it. The web
is always going to continue to evolve forward. People are going to start using plug-ins,
they're going to use different types of technologies. We have to treat it as a platform that
does adapt over time.
I think it's possible that you could have it adapt in a way that it actually gets simpler to
reason about. There have been some efforts to do this. Just to finish my thought: there have
been some efforts on content restriction, for example, a way for you to basically just say,
Clean slate, I want you to get rid of all these features that I'm not using so I don't have to
worry about whether I'm using them wrong. Let's bring the web page back to sort of a
simpler web where there are only a few things that can happen, and those are the features
I know that I'm using, and you can declare as a website what features you actually want
out of the browser.
So there's been some effort to get that deployed. There's a lot of complexity in building a
content restriction system that makes sense. But I think there is some hope in that regard.
>>: So a web developer today is trying to write a program that runs on this dynamic platform
where the rules are changing out from under the developer. One can also write a server-only program where you log into [indiscernible].
>>Collin Jackson: Yeah.
>>: Do you think writing web applications is strictly harder than writing a server-only
application? It seems like the web has all these anomalies, the web [indiscernible],
strictly harder than writing the server piece of it. If you care about security, that's right.
Is it fundamental? Will [indiscernible] solve all these same problems in the end, or maybe
there's something wrong about the way the web evolved, and if you look at the other
direction, we know how to write server applications that's not [indiscernible].
>>Collin Jackson: I think that's going to end up pushing a lot of the security problems of
the web onto the server where you'd have to solve them again.
>>: So it's the same problems but just [indiscernible]?
>>Collin Jackson: Yeah.
>>: I disagree. I think it's different. When you write a server application, you're
concerned with the security of the server. When you write a web application, not only do
you have to be concerned about your server, you have to be concerned about attacks that
are going to occur on the client. That's the difference between [indiscernible].
>>Collin Jackson: I think you're both right. I think there is something fundamentally
different about the way the browser works versus what a server could do if
it knew that it was isolated. But there are a lot of interesting problems about delegation.
Let's say I want to know whether or not you're signed into Google in order for
[indiscernible]. If they're completely isolated from each other, it's very hard to do that. If
you're going to have to solve that problem anyway, you might as well have everyone
solve it the same way instead of solving it in 10,000 different ways on the server.
>>: I think the lines are blurring anyway [indiscernible]?
>>: Agreed.
>>: You almost have to go to the lowest common denominator. When you write a
[indiscernible] application, I'm not worried about another [indiscernible] application
attacking the application that my user's going to be using.
>>: But in this process you've lost all the interesting [indiscernible].
>>Collin Jackson: If we understood the web security all along, we might actually be able
to do that. But it's still sort of being revealed as we go. All right. Thanks, everyone.
>>: Thank you.