>> Helen Wang: Good morning, everyone. It's my great pleasure to introduce
Charlie Reis, who is a Ph.D. candidate from the University of Washington. His
advisors are Hank Levy and Steve Gribble.
Charlie has done a lot of work on securing the web, whether it's securing the
browser or securing the web services or investigating, you know, the bad things
that ISPs may do to people. So Charlie is going to interview with both the
security and the networking research group, and he's going to talk about his
experience of building a safer web.
>> Charlie Reis: Thank you, Helen.
So hello. Thanks for having me here today.
As Helen mentioned, I'm going to talk today about some of my dissertation work
to try to make it safer to browse the web for what it's actually being used for in
many cases today, which is running program code. And this includes projects
both to try to improve web browsers and web content to make them more robust
and more secure against some of the problems that we're seeing today.
So our work is motivated by the fact that the web has really changed significantly
over the last decade or so. We're no longer just looking at static documents that
need to be rendered by the browser, we're looking at active programs that are
running within the browser itself, things like WebMail clients that have significant
complexity on the client side but that are written in languages like HTML and
JavaScript code, and so they're running right inside the browser itself.
Now, this ends up putting the browser in a new role, something that it wasn't
necessarily designed for, which is sort of like an operating system. It has to run
these independent programs side by side and prevent them from interfering with
each other or causing damage.
But browsers weren't architected for this to begin with. They were sort of
designed for rendering documents. And web content also isn't really designed
for expressing programs. It's designed for expressing a marked-up document.
So publishers also face real challenges in trying to get programs to run in this
space.
So before I dive into any of the problems here in any detail, I want to just give
you a sense of the types of contributions that I've had in this space.
So I've worked on a project to build a new type of browser architecture using
multiple processes in an effort to prevent interference between programs. And
this will appear at the Eurosys conference next month. I've also done a study to
see how web pages are changed between the server and the client's browser
using a mechanism called a Web Tripwire, and that appeared at NSDI in 2008.
I've laid out architectural principles for what we might need to do to have better
support for programs within the browser. That was at the HotNets workshop in
'07. And I've done some work to try and prevent cross-site scripting attacks and
browser exploits using a whitelisting mechanism for JavaScript code at UW and
then a research prototype called BrowserShield here at Microsoft Research,
which interposed on web content. And that appeared at OSDI in 2006.
Now, in this talk I'm just going to focus mainly on these first two projects, but that
will give you a sense for some of the things I've done both on the browser side
and on the publisher side. But I also want to point out just the range of different
types of projects that this represents.
So it ranges from the more concrete side, the browser architecture project whose
code is actually deployed in Google Chrome today, to a measurement study
looking at 50,000 different web clients and how their views of web content
change, but also a position paper and two research prototypes, including a
prototype that we've recently seen has gone on to help influence the Live Labs
Web Sandbox project, which we were excited to hear about.
Okay. So all of these projects are united in the sense that they're trying to do a
good job of supporting complex program code within the browser itself. And in
that sense it's acting like an operating system, so we can sort of take a step back
and consider how operating systems do this today, what sort of properties do we
have in a platform that's been thinking about this for a long time.
So you might have lots of programs running side by side, and we have things like
performance isolation between these programs so if one of them slows down or
locks up, you can continue using the other programs on your desktop. We have
good resource accounting. So we have things like the Windows Task Manager
that can tell you how much CPU and memory your programs are using, and
you can diagnose problems when something goes wrong.
We also have failure isolation in that if one of these programs crashes or
becomes corrupted, you can throw out that program and its address space and
continue using the rest of your programs. It doesn't necessarily interfere with the
other programs that you're using.
And all of this gets at the fact that we have a clear program abstraction within the
operating system. We have some sense of what code and data are related, and
we can load them in separate processes. We have mechanisms that support
that.
Now, if we move over to the browser world, turns out we have to give all of that
up. We're still running programs side by side, we still have things like WebMail
clients and video applications, other services you might be using, but now we've
given up things like performance isolation. So in many browsers, if something
slows down in the browser, the whole browser locks up in a way and you're
unable to interact with other programs that you're using or the browser's user
interface itself.
We've also given up good resource accounting. So there's no Task Manager
that can tell us which one of these programs to blame if the browser is slowing
down. We just see that, oh, geez, Firefox is using 90 percent of my CPU again,
and you have to sort of guess which one is to blame and go hunting and pecking
and closing tabs.
We've also given up failure isolation, that if one of these programs triggers a bug
in one of the rendering engines or the plug-ins, the runtime environments in
these programs, you can end up losing the entire browser and all of the
programs that you're working on.
So in that sense we don't have good support for programs. And in fact we don't
even know really what a program is in a browser. You might think it's something
like a page or a window, but it's not really that in all cases. So you might look at
this example of Gmail where you've got a Gmail window, but then you've also got
a chat window and a tasks window and a composition window that have all
popped out from there, and in many cases you can use some JavaScript code to
pop those things back into the window. They're all very tightly integrated and
part of the same program. But there's no abstractions in the browser to capture
this. We just sort of have sets of windows and pages. So what we need to do is
find a way to build up those properties that we had from the operating system
and support them again for these programs on the web.
So my thesis here is that we can actually learn from some of the things we've
developed in the operating system world and use that to improve the browser
and web content to provide a safer platform for running programs. And we've
laid out four different architectural principles in our HotNets paper to try and get
at what's required for this. And these include defining exactly what a program is
within the browser, the precise program abstraction,
being able to isolate these programs from each other in the browser's
architecture so that we don't get that sort of interference that we saw, making it
possible for publishers to authorize what code gets to run in their own programs
so they can reason about what their programs do, and then having ways to
interpose on program behaviors so you can enforce policies that you didn't
anticipate to begin with when the web was first designed.
So in the rest of this talk I'm going to focus on two of the projects that I've
worked on, primarily ways to improve browser architecture. And this is work that
is appearing in the Google Chrome browser, as well as in Chromium, the open
source browser that shares the same code base as Chrome. And it
includes finding program abstractions within the browser and then isolating them
from each other.
After that I'll give you a brief overview of our Web Tripwire study about how
content has changed between the server and the browser, I'll talk a little bit about
the previous work that I've done, and then conclude with some future directions.
So we're trying to do a better job of supporting programs within the browser. The
best way to start here is to sort of look at an example about how someone might
use their browser. So consider a case where you're looking at a WebMail client,
and within there you use some integrated functionality to pull up a list of
documents and a document that you're working on within the browser itself. And
there's lots of suites that can do this. There's Zoho, there's Google Docs and so
on, where you're actually editing things within the browser itself.
Now, while you're working on one of these documents, maybe you pull up a
second one from your document list, decide that you don't need that right now,
and you navigate that to a different website, like a blog. Now, this is already
something different from the operating system world where you can take an
existing window and navigate it to a completely different program.
While you're here, maybe you get distracted and you start reading a news
website, pull open an article in another window, and then while you're working on
a mail message, you pull up a second copy of your mail client so you can read a
different message while you're writing one.
So I don't know about you, but my desktop looks like this pretty much all the time.
I always have lots of windows open. Some of them are related to each other,
some of them aren't. And the key point here is that we've got several
independent programs that are running side by side here.
Now, in many of today's browsers we end up with a monolithic architecture that
puts all of these programs in the same process. So that's where we get things
like poor performance isolation, poor failure isolation, even poor security,
where if something does go wrong in one of these, it can
spread to the rest of the browser. So we should do a better job of architecting
the browser to prevent these things from happening.
So how might we do that? We want to carve it up so these different programs
are running in different processes, for example. So maybe we create a process
for each window in the browser. Now, this is nice because we would get some of
those benefits back. We'd get performance isolation and failure isolation.
But the trouble here is that we'd end up breaking some pages that do directly
communicate, and the Gmail window and chat window is one example. Maybe
the document list and the document are part of a single program and they're
talking to each other with JavaScript, and by talking to each other in the browser,
I mean, they actually have shared access to each other's data structures, each
other's functions. It's very tightly integrated. And it can only happen in certain
cases. It can happen when pages are connected to each other, so they have
some handle to refer to each other's contents, and it happens when they're from
the same origin, so the browser allows them to talk to each other.
But because the process-per-window approach doesn't reflect that, it sort of fails
as a program abstraction. It doesn't capture what we think of as a program, like a
document editing program that has multiple windows or the Gmail program, and
it can break compatibility with pages that do interact in that type of fashion.
So we're after some sort of program abstraction that can capture these
constraints. We want something that will match our intuitions about what a
program is on the web, as well as preserve compatibility so that pages that did
communicate before can continue to communicate. And we'll see that we can
get there by taking some cues from the browser's existing rules and mechanisms
about how pages communicate in the browser, and then we can isolate these
groupings in separate OS processes to get that isolation that we're after.
And I'll show that we can get performance and failure isolation with the
abstractions that we get, but not necessarily secure isolation of sites from each
other, and that's because in this project we're very focused on preserving
compatibility as one of our constraints. So we were looking at getting this into a
browser like Google Chrome that we could put out on the web today and be able
to support content that's out there. So we're focusing on compatibility a little bit
more than security in this situation.
>> Helen Wang: Do you think that compatibility and security are [inaudible].
>> Charles Reis: I don't think they are necessarily mutually exclusive. I think
that you will often have to give up some of the things that you like to get to be
able to support the content that's out there. And it would be nice to change these
things, but given the amount that's out on the web today already, it's tough to
change everything.
Yes?
>>: So far you've been sort of separating at the page level. What about the situation
where you've got multiple applications within the same web page, like the
[inaudible] or a gadget [inaudible]?
>> Charles Reis: Sure. So there are situations where you do have mashups,
multiple applications in the same page. I think that that's also a good opportunity
to look at how you want to isolate these different applications from each other
and allow them to communicate. And I think there's been some great work done
in this space -- for example, the MashupOS project has tried to expose isolation
and communication abstractions for these different communicating applications within
a page. That's not something that I'm going to get into here, but I think that's also
a very important direction for supporting programs.
Okay. So first I'm going to talk about how we can find program abstractions
within the browser, then how we can isolate them from each other and what
benefits that gives us for robustness and performance.
So at a high level, we're looking for some sort of abstraction that represents a
program in the browser, and that's what I'm going to call the web program. It's
some set of pages and the sub-resources for those pages that make up a
common service. And this is just like in the desktop world where you have code
and data that makes up some application that you're using.
Now, we also need an abstraction for an instance of one of these web programs
which is a live copy of that running within the browser itself, and you may have
multiple web program instances of the same web program in a given browser; for
example, two copies of the WebMail program.
Now, the important thing here is that this is the unit that we'll end up isolating
within the browser's architecture. You have some web program instance that can
run in a process.
Now, this is intuitive, but it's vague. This isn't something that the browser can look
at and say, oh, okay, I know what a web program is. You need to figure out how
to define this.
Yes?
>>: So would there be any sharing between the web programs [inaudible].
>> Charles Reis: There will be certain types of things shared. For example, they
share the same set of cookies from the same website. And I'll touch a little bit
about how strongly they can be isolated. But they won't have direct access to
each other's contents.
>>: Do you do navigations across different [inaudible]?
>> Charles Reis: Navigation -- so you can have a web program instance and
then navigate to a new web program instance. So if you're looking at two
different web programs, you can navigate between them, but then they would be
separate web program instances.
Okay. So let's look at how we can define these in a more concrete sense, and
specifically we want to find abstractions that are compatible with the content
that's out there on today's web. And what we found is that there's three different
ways that you could actually carve up the content within the browser to put them
into separate processes and isolate them without breaking pages, but they're
going to match our intuitions about what a program is to various degrees.
Some are better than others.
So first we could look at the browser access control policies. We could look at
things like the same origin policy that says that if two pages are coming from
different origins, they're not allowed to communicate. So maybe we can isolate
them in separate processes.
We can also look at the communication channels between pages and come up
with what we call the browsing instance abstraction, where only pages that have
handles to each other -- and thus have the ability to communicate --
need to be in the same isolation unit, and if a page doesn't have a handle, then maybe
you can put it in an isolated unit.
What we will find is that none of these capture quite what we want with a
program. We actually want the intersection of these two, and this is what we'll
call a site instance, which is where we're headed. So I'll explain a little bit more
about how we define each one of these abstractions.
So first we can look at the access control rules in the browser. This is where we
get the site abstraction. So the same origin policy says that two pages are
allowed to communicate if they come from the same host, protocol, and port,
which is the origin. So, for example, the two pages here from Docs.Zoho.com
can access each other's data structures, call each other's functions. Essentially
they're running in the same address space within the browser.
But the pages from different origins are not allowed to communicate with each
other, except it's not quite true. There's actually some ways to bend the rules
here. Pages can modify a variable in their own name space called
document.domain, and the browser will then change how it views these pages in
its access control rules. So a page can change its document.domain to drop off
some of the subdomains. So you could say rather than treating me like
docs.zoho.com, treat me like zoho.com.
And the mail example could also do that. And in that case all of these pages
from Zoho.com could communicate with each other. So we don't want to break
pages that are using this already. And as it turns out, Zoho is one of these pages
that uses this today, as is CNN, as is MySpace. There's cases where pages
from different origins do end up communicating with each other even if they're
just coming from the same site.
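To make that mechanism concrete, here is a minimal JavaScript sketch of the
document.domain relaxation just described. The host names are the ones from the
example above; the snippet is illustrative, not code from any of these sites.

    // Running in a page served from docs.zoho.com.
    // Before this assignment, a page from mail.zoho.com cannot script this page.
    document.domain = "zoho.com";   // drop the "docs" subdomain

    // If the mail.zoho.com page also executes:
    //   document.domain = "zoho.com";
    // the browser then treats both pages as the same origin for scripting,
    // so each can read the other's variables and call the other's functions.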
>>: Do you think this is more of a broad legacy of lack of communication in the
past, but now, you know, we have like [inaudible] existing in the [inaudible] as far
as [inaudible]. I'm not sure about Chrome. So do you think, you know -- do you
think that's kind of a domain, setting subdomain which usage is -- I think that's
actually [inaudible]. Do you think those will go away?
>> Charles Reis: I'm hoping that they will go away. I'm hoping that having better
communication mechanisms between pages will allow pages to not rely on this
feature as much. I think that it's difficult to require websites to use those
channels today, particularly because so much of the web-using population is still
using older browsers that don't support those features yet. In those cases you do
still need to rely on some of these.
>>: Can you give some examples of [inaudible].
>> Charles Reis: That's a good question. I don't have a lot of specific examples
off the top of my head. I know, for example, that you might end up passing
updates between the mail and the document program if something changed
there. I'm not sure -- I think some of the, like, CNN examples might have frames
that communicate across even though they're hosted on different origins, but I
don't have this off the top of my head.
Okay. So thankfully there's at least a limit to how much pages can change the
document.domain, and that's up to the registry-controlled domain name. This is the
part of the domain name just before the .com or the co.uk that is essentially
identifying the website that it's coming from.
And so what we can end up with is a site abstraction that represents the
registry-controlled domain name plus the protocol, whether it's HTTP or HTTPS,
and pages from different sites will never be allowed to communicate with each
other based on the same origin policy. So we can end up with three different
sites in this example, one for Zoho, one for Blogger, and one for bbc.co.uk, and
we can put these in isolated units and they would never be able
to access each other's contents. So that's one approach. It's the first proposal
we've seen that doesn't end up breaking the content that's out on today's web.
But we can do better than this, because in this case we end up putting everything
from Zoho.com in the same process even if you have independent copies of your
mail program that aren't going to be directly communicating with each other. And
if you have lots of copies, they may end up interfering with each other
performance-wise, and so on. So we can do better than this.
So, second, we can take a look at the communication channels between pages
within the browser, and this will give us what I call a browsing instance
abstraction. So we can recognize that not all pages in the browser can talk. You
need a handle to another page to be able to access its data structures. And
those handles come about as a result of the relationships between windows
in the browser. So it eventually depends on how you got to the page in a
sense.
So because the document list opened these two documents using JavaScript
code, using something like window.open, it has a handle to it, and it can say
something like, you know, this w gets window.open, so it has a
handle to this new child window that it just created.
And the child windows can also talk about their parent by saying something like
window.opener. And the interesting thing here is that this is a property of the
window, the container of the page, and not the page itself. So it outlasts the
lifetime of the page. So, remember, we navigated this window to a different
website entirely, but it can still talk about window.opener. It's just that the same
origin policy won't allow it to reach into the content of that page. It can only do
things at a very high level, like maybe navigate that window to a different website
or possibly traverse the frame hierarchy within that window, but the same origin
policy won't let it do anything more fine-grained. So in that sense we end up with
a set of connected windows within the browser.
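As a rough JavaScript sketch of these handles (the URLs are illustrative):

    // In the document list page: open a child window and keep a handle to it.
    var w = window.open("http://docs.zoho.com/doc2");

    // The child page can refer back to the page that created it via
    //   window.opener
    // Even if the parent later navigates the child somewhere else entirely,
    w.location = "http://www.blogger.com/someblog";
    // the handle "w" still exists, but the same-origin policy now only permits
    // coarse actions on it, like navigating it again. These handles are what
    // tie windows together into a connected set.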
And we can call this a browsing instance, and we can look at all the different sets
that have references to each other, and we can put these in isolated units
because pages in different sets won't be able to communicate with each other
directly.
And this is also not something you'd want to use as a program abstraction
because it ends up conflating pages from different websites that are clearly not
related to each other and won't be able to communicate based on the same
origin policy.
So we can finally end up with this site instance abstraction that goes for both of
these properties. It's the intersection of a site and a browsing instance, and it's a
set of connected pages from the same site. So it reflects that the browsing
instance here, pages from the same browsing instance and the same site, are in
the same site instance. And it's always safe to isolate things in the site instances
from other site instances because either it's in the same browsing instance but
from a different website, so the same origin policy says, no, you can't talk to it, or
it's from the same site and a different browsing instance, so there's no ability,
there's no channel to talk to the other pages. So this is what we use as our
compatible notion of a web program instance.
Yes?
>>: What if in the popped-out window I end up going to a Zoho site, or the popped-out
window itself references some -- or part of it is running a program on Zoho?
>> Charles Reis: That's a great question.
So if you start navigating these windows to a different site, those windows will
need to reflect the site instance of the page they're showing. So this window
actually started out in this site instance when it showed a document. When we
navigate it to the blog, it will switch to a new site instance for the blog, and if
you went back to the Zoho site, you would have to go back to this existing site
instance, because there's a reference, and they could now talk to each other's
contents.
So that brings up a point that there's only ever one site instance per site within a
browsing instance. For example, you couldn't have two Zoho site instances
within that browsing instance.
Okay. So just to recap these, we have a site abstraction that says where pages
come from, a browsing instance abstraction that captures how windows in the
browser are related to each other regardless of what site they're showing, and a
site instance abstraction that captures the connected set of pages from the same
site.
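Putting those together, a simplified sketch of the grouping rule might look like
this. The field names are made up for illustration, and real registry-controlled
domain extraction needs the public suffix list rather than just the last two labels.

    // "Site" = protocol plus registry-controlled domain name.
    function site(protocol, hostname) {
      var labels = hostname.split(".");
      return protocol + "//" + labels.slice(-2).join(".");  // e.g. "http://zoho.com"
    }

    // Two pages belong to the same site instance only if they are in the
    // same browsing instance (connected windows) AND come from the same site.
    function sameSiteInstance(pageA, pageB) {
      return pageA.browsingInstance === pageB.browsingInstance &&
             site(pageA.protocol, pageA.hostname) === site(pageB.protocol, pageB.hostname);
    }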
>>: So let's consider a specific scenario where you visit, for example,
attacker.com, in which you have a CNN IFrame, and from the CNN IFrame you
open a CNN tab. Are these all in one process or in different processes?
>> Charles Reis: So this is a good question.
The way IFrames work, in an ideal sense when we're talking about these
abstractions, an IFrame is an independent page and should be treated
independently of its parent. So that CNN IFrame inside attacker.com would
ideally be considered a separate site instance running in a separate process, one
that would share a process with the new CNN tab. We'll see that in the
implementation of this, we haven't quite gotten there in the Google Chrome
implementation, so I'll talk about that more later.
Okay. So we did have to give some things up to get here.
>>: I'm sorry [inaudible] you said there is a [inaudible] inside the attacker.com's
[inaudible], there's a new tab that's also CNN.com. Are these CNN.coms two site
instances or one site instance?
>> Charles Reis: It depends on whether they're part of the same browsing
instance. So if that IFrame opened a new CNN tab or if something else in the
browsing instance opened a new CNN tab, they would be part of the same site
instance. If you got to that CNN page independently, then it's a separate site
instance.
Okay. So what did we have to give up to get here? It turns out this is somewhat of
a coarse granularity, in that, if you notice, we ended up putting one copy of the mail
program in the same site instance as the document program even though
intuitively those are sort of independent of each other, and that's because we do
need to preserve the fact that sometimes those pages will end up communicating
if the browser lets them, and they could do that by changing the document.domain
variable.
Now, thankfully, because we can put separate instances of web programs in
different isolated units, that will help provide some isolation that we wouldn't
otherwise have gotten.
There's also some imperfect isolation. As Helen asked earlier, there are some
things that are shared between site instances.
Cookies are one example. If you pull up a second copy of your WebMail client,
you're still logged in, so they're still sharing the same set of stored credentials in
the browser, but they won't be able to directly access each other's contents.
>>: Do you think that's imperfect or you think that's -- so, for example, even the
two program instances in today's desktop, two program instances of a program
share -- they can access the same files.
>> Charles Reis: Yes.
>>: That seems to be reasonable.
>> Charles Reis: I don't think that this is necessarily something that needs to
change. I think there's room for improvement here. I think there's some types of
cookies or storage in a program that might be instance specific and that shouldn't
necessarily be shared across site instances, but there should also be room for
more persistent things that can be shared.
>>: You're referring to the [inaudible].
>> Charles Reis: Possibly that would be one way to defend against certain types
of request forgery attacks. Yes.
>>: Also, I think that the window-level JS calls are not
imperfect isolation; I think they are just limitations of the DOM
specification. You cannot [inaudible].
>> Charles Reis: Sure. So this is something that -- again, it's an example of how
site instances aren't entirely isolated from each other. So the windows in
different site instances can sometimes make calls that interact with just the
window object. So this is like navigating the window to a different website. And
that's something that, yes, is sort of a limitation or something inherent to the
[inaudible] standard, which won't be going away.
>>: [inaudible] the standard.
>> Charles Reis: You could change the standard, but that would be, again,
something that's harder to get to [inaudible]. So this is something where you want
to support these sorts of high-level calls between site instances.
And then there's the fact that at least the way the abstractions are defined right
now, this isn't providing a secure boundary between websites, and that's
because we still do need to rely on the rendering engine within a site instance
to provide certain types of security. For example, if you're logged into your bank
in one site instance, another site instance might be pulling in objects from that
same website, and they carry your cookies. So the bank still thinks you're
authenticated, so there's some information that can end up in the site instance,
and you're relying on more subtle logic in the renderer to protect you there.
And this is something I'd be happy to talk about more afterwards in more detail
and that I'm hoping to look at more in the future and has been looked at some in
related work as well.
Okay. So now that we have these abstractions in hand we can try to isolate
them from each other in the browser's architecture, and this gets at the fact that,
you know, most of today's browsers have a monolithic architecture that puts not
just all web program instances but all of their components in one process,
including all the rendering logic, the HTML, CSS layout, and so on, as well as
things like the storage functionality for cookies, cache, stored passwords, user
interface for the browser itself, the chrome around the browser, and the network
stack.
So we'd like to do a better job than this by isolating some of these components in
separate modules that can then be placed in separate OS processes. So from
there we'll end up getting all these nice isolation properties from the OS for free.
We don't need to reinvent an isolation mechanism here.
So what we end up with is a multi-process browser architecture. And here we
can put some of the more privileged components inside a browser kernel
module, and it acts similarly to an operating system that provides the resources
that the web program instances need. So it provides storage logic, it provides
access to the network, it provides the user interface around the web programs
that you're using, but it doesn't need to handle any of the untrusted code from the
web directly.
We can also have a separate rendering engine module that can run in a separate
process that handles both a web program instance and the underlying runtime
environment. So it's the HTML rendering, the JavaScript engine, the parsing
logic; all of that can live in a separate process.
We can also carve out plug-ins, things like Silverlight or Flash that don't need to
run within the browser kernel, and have isolation between them as well.
I've actually had the chance to implement this architecture in two different
projects. One was a prototype at the University of Washington based on the
Linux browser called Konqueror, where I was able to carve out the rendering
engine part of the browser into separate processes. And we have a technical
report describing this experience, but this helped me to get involved with the
Google Chrome project for building a new browser, which was released last
September. And it has a multi-process architecture that I just described.
And I was able to go in and help add support for isolating these program
abstractions from each other by providing support for site instance isolation,
including, if you navigated a tab from one site instance to another, being able to
switch between those in different processes.
Now, what's interesting here is that we can actually support multiple process
models within a given browser. And in fact all of the different compatible options
that I mentioned can be supported. Within Chrome you can do it just by
specifying a different command line flag. So it's got a command line flag that can
give you a monolithic architecture that loads all of these components in the same
process. This is actually useful for evaluation purposes so you can see how it
might have behaved, how much overhead or what properties it would have if it
was monolithic.
You can also have a process-per-browsing instance model, which is sort of the
next implementation step. If you were building a multi-process browser, you
might put each new window or set of connected windows in its own renderer
process. From there you can go on to support a process-per-site instance model
where even if you navigate one of those windows to a different website, you can
swap it to an appropriate process for that site instance. And this is actually the
default in Chrome, and if you don't do anything else, you'll be running a
process-per-site-instance model.
And if you do a little extra work, you can actually support a process-per-site
model where you group all of the site instances from the same website into the
same process. Now, this might be nice if you're trying to reduce the amount of
overhead that you have, the number of processes, but it also will give you less
isolation between the instances of each one of the programs you're using from a
given website.
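For reference, Chromium exposes these models through command-line flags, roughly
as follows; the flag names are as commonly documented for Chromium and worth
double-checking, and the default process-per-site-instance model needs no flag.

    chrome --single-process      (monolithic: everything in one process, for evaluation)
    chrome --process-per-tab     (roughly the process-per-browsing-instance model)
    chrome --process-per-site    (group all site instances of a site into one process)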
So we've found that the process-per-site instance model had the best robustness
benefits and that the overhead was reasonable, so that was why it ended up as
the default.
Yes?
>>: So browser kernel controls the actual pixels? In other words, in order to go
from the rendering engine to the actual OS, you have to go through the browser
kernel?
>> Charles Reis: That's correct. The browser kernel is providing a way for
rendering engines to display things to the user, and it does it at the granularity of
a bitmap. It doesn't know anything about what's running inside there. The
rendering engine just ships the bitmap to the browser kernel and it can be
displayed on the screen.
>>: And that was never a performance issue?
>> Charles Reis: It actually works out pretty well. The blitting operations are
pretty quick. There's some shared memory that you can have between these
processes to make it fast. So we didn't find that to be a constraint.
Okay. Now, there's a few caveats in the implementation here. We hinted at
those earlier. So there are some cases in Chrome's implementation in which
pages from different sites will end up sharing processes. So, for example, at the
moment, not all types of cross site navigations will swap you to a new process.
And there's some implementation reasons for that.
We also haven't yet pulled out frames from their parent process if they're
showing a different website. So if you have that attacker.com and it loads an
IFrame with CNN.com, it will be part of the same site instance process. That's
something that, architecturally, the browser could support, and I'm hoping it will
get to the point where that can be supported.
There's also a limit on the number of processes that Chrome currently creates.
If you have more than 20 rendering engine processes, it will start reusing
existing ones. So that's another reason why this isn't targeted at security right
now, it's just trying to provide robustness in the common case. After that they
start getting randomly reused. So this is something, again, you could lift if you
felt you had the resources to support arbitrary numbers of processes.
>>: So without the frame, the limitation about IFrame, is there a benefit of this
architecture when we consider [inaudible] attacks? Because you can always visit
-- there is always a chance that you visit an attacker site, right?
>> Charles Reis: That's correct.
So I'm not arguing that this architecture is currently providing ways to enforce the
same origin policy. We'll see in just a second that it can provide other types of
security, but not security between websites. And that's something that -- I think
there's room for future work there. There's room to get to where that might be
the case. Good question.
Okay. So let's look at how it can help in these other areas of robustness and
performance.
So, first, we do end up getting failure isolation between web program instances,
that if one of the web programs you're using crashes, you don't have to lose the
entire browser, you end up with what Google has implemented as a sad tab icon
for the program that you lost and then the others you can keep using, and you
can just refresh that tab to get back the program you were using.
We also get better accountability. The browser actually has its own Task
Manager now that can show you how much CPU, memory, and network each of
your web program instances is using, so you can track something down
when the browser becomes slow or you start noticing your CPU usage being
high.
>>: [Inaudible] kill off?
>> Charles Reis: You can kill them off. There's a button down at the bottom that
you can't see right now because it's disabled, but it says "end process," and then
all those pages will just become a sad tab.
This also has benefits for just the memory management of the browser in
general, that if some web program instance allocated tons of memory and you
wanted to get rid of it, the browser just has to throw out that address space. It's
just killing the process. So it doesn't have to go back through the heap and find
the objects that were allocated to free those.
Now, it's also worth pointing out that you can get some additional security from
this architecture. And we have a technical report describing this. I'm not going to
go into too much detail here, but you can sandbox each one of these rendering
engines to reduce the amount of privilege that they have, so that they never need
any direct access to the disk or to the network or to other resources locally. So
you can make sure that the rendering engine processes need to go through the
browser kernel to get access to them. So we're not isolating websites from each
other necessarily because of the frame problem and some other issues, but we
can at least say that web content is hopefully held within this sandbox.
Question?
>>: I'm curious how much you believe this sandbox. Would you let me run
arbitrary [inaudible] inside the rendering engine process container?
>> Charles Reis: So I think that the sandbox is designed to try to allow that sort
of arbitrary code, but I'm not sure how much it's achieving that in practice. I think
that's sort of an open question right now. So maybe it could be improved. But I
think there's room for a sandboxing mechanism in general in the architectures to
reduce this sort of privilege.
>>: How does this sandbox mechanism compare with IE?
>> Charles Reis: Yeah, that's a good question. So IE also has a multi-process
architecture that has a low-rights mode for these processes, and the low-rights
mode, as I'm familiar with it, tries to prevent those processes from writing to disk,
having access like that, but it doesn't prevent them from reading from disk. So
that means that those processes could still go in and look for confidential files
and leak those back out to the web, and Chrome's sandbox is trying to go that
extra step further and prevent reading. Whether or not you could run arbitrary
code in there I'm not familiar enough to know.
Yes?
>>: What are the functionalities in the browser kernel?
>> Charles Reis: So the browser kernel is providing an API, a messaging API, to
the rendering engines that is designed to be sufficient for rendering a web page.
So it provides network access -- the ability to make HTTP requests -- the ability to
display images -- which, again, are shipped over as bitmaps, and the browser
decides where to put them -- and storage for cookies and other things like that.
It's sort of a
limited API, and the full list of messages is online on the chromium.org website if
you're interested in seeing what's there.
>>: [inaudible] direct access [inaudible] having the rendering engines using a
bitmap to communicate with a browser kernel, if I have a fancy graphics card
that's using a direct access [inaudible] basically you are sacrificing performance
because the rendering is not [inaudible] right?
>> Charles Reis: That's right.
>>: It's not going to leverage that fancy graphics card.
>> Charles Reis: That's right. Being able to take advantage of hardware support
or other devices is a challenge here, and maybe the browser kernel needs to
have a way of exposing some of that. At the moment some of the plug-ins that
do take advantage of that are running outside the sandbox, so they get direct
access. It's not great, but that's -- that's a good point.
>>: So does your rendering engine use any of the windows running functionality
like GDI or fonts or anything like that or is all that stuff [inaudible].
>> Charles Reis: I believe that -- I believe that's implemented in the rendering
engine.
>>: There's no reason you couldn't use those same bits and run them behind the
operating system [inaudible].
>> Charles Reis: Right. Yes. So the rendering engine inside there is using
Windows features -->>: It's using Windows features to render, say, fonts and [inaudible]?
>> Charles Reis: Yes. Because -->>: So that stuff is -- like the sandbox is permeable in that way, and if there were a
[inaudible].
>> Charles Reis: That's probably correct.
>>: Okay. I was trying to get a handle on how self-contained the
rendering engines are.
>> Charles Reis: So then the rendering engine -- yes, it uses Windows libraries,
but it has a reduced security token that prevents it from accessing other types of
resources.
>>: Okay. Thanks.
>> Helen Wang: But it would still need to issue some of the Windows [inaudible]
to achieve some of those functions.
>> Charles Reis: I believe so.
>> Helen Wang: I see.
Another question about the browser kernel: what other security decisions
do you make in the browser kernel?
>> Charles Reis: So the security decisions in the browser kernel are essentially
trying to look at each rendering engine at a coarse grain and see if it's behaving in
ways that it should. So we know that a web page from the internet, for example,
should not be embedding images or resources from your local disk, that only
pages that are fully loaded from your local disk should be able to do that. So the
browser kernel can look at that and say, no, we're going to deny that request
because we know this rendering engine instance is not allowed to do that; you'd
get a separate rendering engine instance for files. There's a few policies
like that.
>>: Mostly to protect the local host from the browser [inaudible].
>> Charles Reis: Correct. Correct. It's a sort of second line of defense. It's not
enforcing the same origin policy in the current implementation.
Okay. On the performance side, we do end up getting performance isolation
between these web program instances. So the other web programs can remain
responsive while a given one is working. And we measured that by looking at the
click latency in the browser. We ended up comparing the different architectures
of Chromium for our evaluation so that we could look at architectural differences
rather than the implementation differences in browsers.
So in the monolithic mode we have a situation where if you are trying to time how
long it takes for the browser to respond to a click, like a right click menu, while
the top five pages are loading in other tabs or while Gmail is loading, you can
end up waiting for several seconds on average because the browser is tied up
running JavaScript code and rendering and so on, and that introduces just these
hiccups into your browsing experience. And if you've ever used Firefox or other
browsers while lots of things are going on, you may have noticed some of these
hiccups. And in the multi-process architecture, that's gone. You don't see any
green bars here because it just takes six milliseconds to respond to a click, as
you would expect.
>>: I have a question, sir. I'm just trying to understand. So I don't know, I
started using [inaudible]. When I right click my button in IE, it doesn't take three
seconds with Gmail. So what exactly is going on with the [inaudible] that's so
slow.
>> Charles Reis: So the right click example might be one example, but there are
lots of cases where pages that are doing significant amounts of computation will
cause other pages to respond slowly, and this is true even in other browsers.
Now, IE is a different case, and I'll be able to talk about that some on the related
work slide.
>>: All browsers doesn't take three seconds when I [inaudible].
I'm not sure exactly [inaudible].
>>: [inaudible] In the meantime, there's all this other stuff going on in other
pages. Like normally that's not true for you, but in this experiment there are five
other pages that are busy doing [inaudible]. So this isn't the common case. He's
saying under certain circumstances [inaudible].
>>: And what's the underlying OS here?
>>: This is all Windows XP.
>> Charles Reis: So there is some other impact on performance. Because
we've introduced the opportunity for more parallelism, you can take advantage of
multi-core machines more effectively. So you get concurrency between each
web program instance now. It's not all running on the same thread. So, for
example, things like restoring a session of several tabs can be sped up
significantly if each one is going to a different core.
There's also some slowdown in terms of process creation latency. You've got to
create a new process when you're creating a new tab or when you're starting the
browser, and for a blank tab we find that's about a hundred milliseconds. But in
fact in practice that's often masked by the fact that there are tasks that can be
parallelized between the browser kernel and the rendering engine when you're
rendering a
real web page. So you often don't end up seeing this penalty at all.
Yes?
>>: Why don't you just spin off the process ahead of time, with an extra process?
>> Charles Reis: You could absolutely do that. That's a tradeoff just between sort
of time and space, so you've got more memory sitting around. And there's no
reason you couldn't mask that entirely. You can still see it if you were starting the
browser from scratch with a blank page.
Now, the real cost here is on the memory side, and this is where we're paying to
have extra copies of each rendering engine and the cache that each one has and
so on. So we've found that when rendering realistic web pages, it just about
doubled the memory use of the browser, that with 10 popular pages it went from
about 65 megabytes to about 130 megabytes. And we think that this is a
reasonable price to pay on most realistic machines today in terms of the other
performance benefits that you're getting, especially given the responsiveness
and the speedups that you're seeing.
We're also taking a look at the impact on compatibility. There's currently no
known compatibility bugs that are due to this architecture. There's always lots of
compatibility bugs on the web, so -- but we don't think that they are introduced by
this particular architecture. Google also has a distributed framework for testing
versions of Chrome against the top million pages to try and look for compatibility
problems, so they're actively looking for things like this.
Now, there are some minor behavior changes that you get just whenever you
change anything in a system, and one of them in this case is that we have a
narrower scope for window names. Another channel
between windows is that a window can establish a name and other browser windows
can find it. Generally that's a global name space throughout the browser, but the
HTML5 specification says that browsers can sort of determine the scope of these
names appropriately, and we've narrowed it to within a browsing instance. So
you can only find a window of the same name within your browsing instance now.
And that will only affect sites that -- for example, like Pandora, where you open a
music player window from the given web page and then if you went to a separate
instance of the Pandora website and tried to open it again, it's a question of
would it refresh the existing player or would it open a second player. And we
actually think that the latter is sort of more logical because they're independent to
begin with. But that's one change that was made.
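A small JavaScript sketch of that window-name channel (Pandora is just the
example from above; the code and URL are illustrative):

    // Open (or re-target) a window named "player".
    window.open("http://www.pandora.com/player", "player");
    // With a browser-wide name scope, a second, unrelated visit to the site
    // running the same line would reuse the existing "player" window; with the
    // narrower per-browsing-instance scope, it opens a second player instead.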
There's been some related work in this space. It's sort of exciting to see how
much is going on here. IE8 is also introducing a multi-process browser
architecture that puts different windows in different processes, but there's not an
attempt to try and identify a program abstraction here. So there's cases where
related and communicating pages will end up getting put in separate processes.
Now, they've gone on ahead and implemented support for JavaScript calls
between these, but they're on different threads, which means that you can
introduce race conditions between different parts of programs running in
JavaScript, and JavaScript is a language without any concurrency
primitives, so you're creating real problems for programmers.
The Gazelle browser here out of Microsoft Research is proposing -- actually, it's
sort of a similar architecture, carving up the browser into different program
instances, but they're looking at trying to provide secure isolation of websites.
And sometimes to get there, you have to look at the features on the web that we
don't want that are causing problems that we would like to go away, and Gazelle
is saying, hey, if these were gone we could have a very good security story for
the browser. And we may be able to get here even in a compatible sense, but I
think that it's worth exploring both of these directions to say where could we get
with security and where can we get today with the stuff that's out there on the
web.
There have been some other research projects that have also looked at modular
browser architectures, including OP out of Illinois, Tahoma out of UW,
which looked at having several virtual machines for web programs, and the SubOS
browser, which are all looking at trying to carve up the browser into different pieces,
but without as much emphasis on supporting today's web content.
Okay. So to summarize this part of the talk, we've seen that browsers need to do
a better job of supporting the programs that are out there on today's web, but to
do that, they need to recognize what a program is. And we've introduced the site
instance abstraction to capture this in a way that's compatible with today's web
content. And once you have this abstraction, you can then isolate these with OS
processes to prevent interference between programs.
Okay. So now I'm going to give a much briefer overview of the Web Tripwire work,
but before I go on, we can take questions about this part of the talk.
Yes?
>>: So Chrome is a big product, obviously. It's the work of many people. Can
you explain which parts were your contributions to it?
>> Charles Reis: I can -- yeah. Sort of a general sense.
So the Chrome engineers were working on a multi-process architecture, and I
helped to do essentially some of the stuff that I described in this talk about
identifying the abstractions for what a web program is and supporting that within
the browser's architecture.
Any other questions?
All right.
So now we can take a look at things from the publisher's perspective. Now that
we have some idea about what a web program is, how can we do a better job of
supporting them, say, with some simple integrity checks for programs.
And when we're thinking about a site instance, we now know sort of what the
boundaries are, but can either users or publishers trust what's running within
that? And as we all know, most web pages are sent over HTTP, a plain-text
protocol that can be modified in flight.
And what's interesting about this is that if one of these changes happened, that
change becomes part of the site instance, becomes part of whatever is being
governed by the same origin policy in that web program.
And so, for example, if an internet service provider had looked at the page and
injected something like an advertisement into it, that would become part of the
site instance that the user is running.
Now, is this a concern in practice? Well, we did a measurement study that
actually says it kind of is. We looked at 50,000 different web clients, and we
found that over one percent of them reported that a web page was changed in
some way before it got to their browser. So we were pretty surprised by this, and
we saw a variety of different things that you might not want to see happening to
web pages, including injected ads, injected exploits, and changes that either
broke web pages or introduced security vulnerabilities, so stuff that people might
want to know about.
>>: How do you identify an in-flight change?
>> Charles Reis: This is exactly what I'll get to on this slide.
Yes?
>>: You can't tell who actually injected it -->> Charles Reis: Not for certain, no. So we know that it happened in between, and
there's often enough evidence to narrow down what made the change. It could have
been faked, but we think that we can generally narrow down what caused the
change in many cases.
Okay. So the way that we detect this is we realized we could put a piece of
JavaScript code on a web page on a server and then ship that to the client's
browser to do an integrity check, and this is what we call a web Tripwire. We
ship that to the client's browser, and it looks at the HTML source code of the
page that arrived and compares it to what was expected.
So, for example, if an internet service provider had injected an ad, we'd see
some change to the HTML, and we could display a message to the user, as well
as sending the change back to our server for our measurement study. Now,
we've put this online at a website called vancouver.cs.washington.edu, and it's
still online. The measurement study was about two years ago, but you're free to go
to that if you want to test whatever network you're on and see if things are being
changed.
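A minimal sketch of what such a tripwire check might look like, assuming the
server has embedded the page's expected HTML (or a digest of it) into the script
when generating the page; the deployed implementation differs in its details.

    // Filled in by the server at page-generation time (shown elided here).
    var expectedHtml = "...";

    // Re-fetch this page's own HTML and compare it to what the server sent.
    var xhr = new XMLHttpRequest();
    xhr.open("GET", window.location.href, true);
    xhr.onreadystatechange = function () {
      if (xhr.readyState === 4 && xhr.responseText !== expectedHtml) {
        // The HTML that reached the browser differs from what was expected:
        // warn the user and report the difference back for the study.
        alert("This page appears to have been modified in flight.");
      }
    };
    xhr.send();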
But we wanted to use this for a measurement study, so we wanted to get a view
of lots of different users on many different networks to see if pages are being
changed in some situations and not others.
So we ended up getting this posted to Slashdot and Digg and a couple of other
news outlets, and that drew in sort of a flash crowd that gave us a good
measurement study. So we had visits from 50,000 unique IP addresses, and
that was over about two weeks, and 653 of those reported some change back to
us. So we were pretty surprised to see that many. We were sort of expecting
maybe we'd see a handful, but to see over 1 percent of people having some sort
of change was pretty surprising.
So I just want to give you a brief sense of the types of things that we saw. It was
injected by many different types of parties. For example, we did see some
internet service providers that were injecting ads. Some of these were on free
wireless networks where the network is supported by ads that users view within
web pages. Some of them are done by ISPs that work with companies like
NebuAd or possibly Phorm in the UK that inject targeted advertisements to users
based on their browsing history, and they're doing this by looking at the user's
browsing history and then injecting a targeted ad into the web page.
We saw some evidence that some firewalls were injecting security checks. So
maybe there's positive changes you can get out of this as well.
So there's products like Blue Coat WebFilter that try to enhance security for the
enterprise by injecting JavaScript code into web pages.
>> Helen Wang: How do they do it?
>> Charles Reis: They look for suspicious behavior, so a page that might try to
spoof the address bar, for example. And if they see evidence of that, they'll pop
up a message to the user saying you probably don't want to trust this page, and
maybe they report that back, but it's tough to tell exactly what they do.
We saw some injected exploits in the page which we found were consistent with
reports of ARP poisoning where some infected client on the user's local area
network was sending out ARP requests saying "I'm the router. Send all your
traffic through me." And that means that when the user requests a web page,
that infected client has a chance to inject exploits into it and infect other clients
on his local area network, which was kind of scary. We've actually heard reports
of this happening in server rooms, as well, where something on the local area
network of the server sticks itself between the server and the outside world, so all
outgoing pages get exploits injected.
And, finally, we saw evidence of lots of user proxies. So this is software that the
user obviously knows about designed for things like blocking ads or pop-ups, and
this accounted for about 70 percent of the changes that we saw. So it's a big
portion.
But what we found was that even these sort of well-intentioned changes can
have some negative consequences. And in fact these are where we ended up
seeing bugs and security vulnerabilities introduced.
So we saw some pop-up blockers that introduced bugs into certain web pages. For example, web forums. You can find many comment boards on the web where a user's comments are preceded by a little bit of garbage JavaScript code, and that's because the pop-up blocker they were using was injecting JavaScript code into the wrong part of the web page, injecting it into the comment form on the forum, and so that got copied into the website and displayed in the user's posts. You can actually find some sort of amusing posts online where users are like, "Why is my computer putting garbage here? I don't understand it."
Unfortunately, we also saw some more severe problems, including vulnerabilities
introduced into web pages based on the ad-blocking and pop-up blocking proxies
the user had installed.
So, for example, we saw a program called Proxomitron and a program called Ad
Muncher that both injected ad-blocking JavaScript code into each web page that
the user visited, or most web pages, and that code was vulnerable to a cross-site
scripting attack, and that was because it copied the URL of the page into the
body of the page itself, and the URL is something that an attacker has influence
over. An attacker might send you a link to a web page with some code at the
end of it or they might redirect you from a web page, and when the user visited
that page, the proxy would copy the URL into the body, and then that code at the
end of the URL would run as part of the page.
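To illustrate this class of bug, here is a hypothetical fragment of the kind of injected code that creates the vulnerability; the function names and URLs are invented for the example, not taken from either product, and whether raw characters survive in location.href varies by browser.

    // Hypothetical injected proxy code that echoes the raw URL into the page.
    // Anything after "#" becomes part of the page's HTML, so a link like
    //   http://example.com/page#--><img src=x onerror=attackerCode()>
    // closes the comment and runs attacker-controlled code in the page's origin.
    document.write("<!-- filtered: " + window.location.href + " -->");

    // A safer variant escapes the URL before writing it into the document.
    function escapeHtml(s) {
      return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
    }
    document.write("<!-- filtered: " + escapeHtml(window.location.href) + " -->");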
Now, the interesting part about this is that the victim of this attack is whatever
page the user happens to be looking at, and in many cases that was all of the
pages that the user visited. These proxies would often modify all of the user's web traffic, not just most of it.
And so in the world where browsers are acting like operating systems, this is sort
of like a root exploit. Suddenly every program that you're using becomes
vulnerable to an attack. So it's a pretty severe problem. And we were able to
report these vulnerabilities to the proxy vendors and had them fixed, but it's the
sort of thing that you have to be very careful, when you're modifying a program
on the fly, not to introduce security problems.
So what can publishers do about this? Well, you could just switch to https.
You'd have confidentiality and integrity for your programs, but we recognized that
that's not necessarily a practical solution for every website on the internet.
So one alternative is to deploy something like a web Tripwire on your own page,
which is just a lightweight piece of JavaScript code that does this integrity check,
and it's very easy to deploy.
>>: [inaudible].
>> Charles Reis: That's right. So there's -- it's not cryptographically secured. It's not as good as https. But you could at least obfuscate the Tripwire to make it difficult to detect on the fly. It's sort of -- it's an arms race there.
>>: [inaudible] I mean, it never ends.
>> Charles Reis: So the flip side is -- another aspect of that is that it's not
necessarily an alternative to https in that you're not preventing these changes,
right? You may be able to detect when they're happening, but it's something that
the publisher could know and maybe notify the user "This isn't what I wanted."
>>: [inaudible].
>> Charles Reis: So https is -- if you're very concerned about the security of your
program, you should be using that. I don't want to imply that this is an alternative
to https.
Now, if you're just looking to detect these things, we do have a toolkit that can help you deploy web tripwires on your page, and we have a service where all you need to do is include a line of JavaScript on your web page, the Tripwire will be fetched from our server, and we'll have a reporting interface for publishers to see how their users' pages were changed in flight.
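As a rough sketch of what that inclusion could look like, a publisher might load a hosted tripwire script with something like the following; the URL here is a placeholder, not the service's actual address.

    // Hypothetical inclusion of a hosted tripwire script; the URL is a placeholder.
    var tripwireScript = document.createElement("script");
    tripwireScript.src = "https://tripwire.example.org/tripwire.js";
    document.getElementsByTagName("head")[0].appendChild(tripwireScript);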
Okay. So to summarize this, we found that it's not safe in general to blindly patch
the code of a web program just like you wouldn't blindly patch a binary installer
without really knowing what you're doing. Unfortunately, there are many parties with incentives to do so, and it is happening on today's web. And you can at least
tell when it's going on in most cases using web tripwires.
So, finally, I want to give just a brief summary of other work that I've done both
from the browser space and elsewhere and talk about some future directions.
So I've also worked on the BrowserShield project here at Microsoft Research, which is designed to block exploits of known browser vulnerabilities, and it does this by interposing on web content within the browser.
>>: [inaudible].
>> Charles Reis: That's right. As it turns out, we implemented this as a proxy,
which modifies pages in flight. So it's something that is looking at web content.
And we did that mainly for the deployability story. So this is something where
you could also deploy it on the server to try and rewrite pages there or you could
deploy it within the browser itself where the browser really has the context of
what's going on.
And in fact we've seen that it's gone on to help influence the Live Labs
WebSandbox project, which is trying to enforce policies in a different way, but it's
doing it on the server side. So the server knows what's going into the page and is doing some rewriting. But the goal in general is to rewrite web content to enforce flexible policies, and BrowserShield used that to prevent exploits of vulnerabilities.
Before I got into the browser space, I also did some work in wireless networking research. I looked at some low-level 802.11 behavior from traces collected at SIGCOMM, and I've also looked at how you might be able to predict wireless behavior in a static network based on previous measurements; that work appeared at SIGCOMM in 2006. And when I was at Rice University I did some computer science education research on the DrJava development environment, where we had undergraduate students learning production programming practices by maintaining a shipping product, fixing bug reports and adding feature requests to something that was used in introductory courses at other schools.
Okay. So, finally, I just want to wrap up with some future directions that I could see this work heading in.
I think there are actually lots of opportunities to have secure isolation of websites
in a compatible way, and I think that both Chrome and Gazelle are exploring this
space in very interesting ways, and I think there's ways to move forward here.
But I think all of this sort of fits in this theme of doing a better job of supporting
programs within web content and web browsers.
I think there's also a need for better ways to understand what it means to be compatible on the web, that it's not necessarily just whether you've changed the rendering of a page or introduced a JavaScript error on load; there are bigger questions about what features users rely on, and so on. So do we need benchmarks, do we need larger surveys? There are interesting problems there.
I think there's lots of cool opportunities for opt-in mechanisms for how you might
let publishers provide more information to the browser to let it enforce more
interesting policies and maybe even be able to get away from things like the
same origin policy in certain cases to do a better job of saying here's what my
web program is, and then maybe being able to do things like BrowserShield-style
interposition within the browser itself to enforce policies on how things behave.
Longer term, I think all of this is getting at where applications and network applications are headed. We've got a world where we have some understanding of desktop applications now, and then various sandboxed web applications, but what will that look like, and how will browsers and operating systems support these: what APIs do they need, what services do they need, what isolation mechanisms do they need?
And in that sense, how will the trust models change? If a program coming in from the web wants access to more storage or more devices, can we grant that to certain web programs without granting it to everything, in ways that users might understand? There are very big challenges there.
And all of this ties into my interest in just finding ways to design robust and
secure systems in general.
So with that, I'll conclude the talk. I think the web is becoming a very interesting
application platform that is giving us chances to figure out how applications
should be supported, and it requires changes to both browser architectures and
the way that web content is defined. And I think it really is sort of an exciting
opportunity to be working in this space where the web is starting to change and
browsers are changing and we're getting a chance to define what that will look
like.
So thanks, and I'm happy to take questions.
>>: So I have two kinds of questions. First question is, when I came in, like, you know, the [inaudible] of your talk is to make the web safer, and when I talk to people -- like, you know, when people look at the problems of the web, robustness and performance are not at the top of the list, and your entire talk was about robustness. So I had a hard time reconciling how, from making the web safer, you kind of got to robustness and performance as being the right goals for it.
>> Charles Reis: So I think that -- so I think robustness and performance are
absolutely part of safety. The things that are on people's minds now are probably
things like phishing and security exploits and so on. And I think that there's
certainly room for security improvements there, and some of the work on browser
exploit prevention, trying to prevent cross-site scripting attacks are getting at that
aspect of it.
But I think looking at robustness issues in particular is looking at this inflection point where programs are becoming more popular. As you get more and more programs, what are the problems going to be in the browser space? And we need to do a better job supporting these before users are like, geez, all of my programs are always crashing because I visit some page with an ad and it just takes the browser down. So being able to provide a safer platform for programming also includes providing isolation between programs.
>> Helen Wang: I think in this sense --
>>: Can I ask my second question?
My second question is, you started by saying, you know, like what we need to do
is basically take the browser and make it look more like the operating system. It
strikes me as, you know, like -- I don't know. I'm not very happy with my
operating system [laughter] and [inaudible] problems. So I want to have you kind
of tell us exactly why that's the right approach. Or maybe why not just look at the
operating system and actually, you know, forget about the browser and make the
operating system kind of support the application that the browser does. Why is
turning the browser into the operating system the right approach here?
>> Charles Reis: So I agree with you that it may not -- turning the browser into
the operating system isn't necessarily the goal. I think that part of the goal is
learning from the operating system and the things that it's done well and getting
those properties in the browser space.
So, for example, we're not trying to invent a new isolation mechanism, like a new
type of process for the browser. We're saying that the OS can provide that to us,
and I think there's room for this evolution where the OS and the browser are both
providing this platform for running programs, and they both really need to reflect
these changes.
So, for example, some of the changes in the trust models, the OS might be a
better place to consider how you support these programs. So I think there's a need to change both of them, and the browser space is where you can start and get the most benefit right now.
>>: Okay. So you're saying that that's how we did it, basically [inaudible].
>> Charles Reis: Yes, but I'm also saying that --
>>: [inaudible].
>> Charles Reis: No, no. What I'm arguing is that -- I'm not saying we need to
turn the browser into an operating system, I'm saying we should learn from the
things that the operating system has done well and get those properties in the
browser. It's not -- it's very different from saying the browser should re-implement its own operating system on top of the operating system.
>>: So, you know, the sense is that the web is an application platform that is
evolving and that we're kind of early on. Can you give us a couple senses? One
is, what's your sense of where the really exciting new developments in web
applications are? So what are some examples of those? And the other thing, I
guess, is do you see, like, the rate of change? Are they getting more and more interesting quickly, or have they kind of plateaued out, you know, such that Gmail or Google Maps are the best things we'll ever see?
>> Charles Reis: So I think in terms of the trends, where we're heading, I think
that we -- that things are sort of accelerating here. For a long time we've hit this
plateau about what you can get into the browser and run, and maybe Gmail was
just sort of, you know, pushing that envelope, and it was this big chunk of code
running within the browser.
But we're starting to see the browser wars be reignited, and you're seeing faster JavaScript engines, you're seeing more support being added for new features like postMessage, for example, and so we're starting to see applications that get the
ability to use more features. And that's why I think the really interesting things
are just starting.
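As a quick illustration of that kind of feature, a minimal postMessage exchange between a page and a cross-origin frame might look like the following; the origins and the "widget" element id are made-up examples.

    // Minimal postMessage sketch: a page sends a message to a cross-origin
    // iframe, and the framed page checks the sender's origin before acting.

    // In the embedding page (e.g., https://portal.example.com):
    var frame = document.getElementById("widget");
    frame.contentWindow.postMessage("refresh", "https://widget.example.com");

    // In the framed page (https://widget.example.com):
    window.addEventListener("message", function (event) {
      if (event.origin !== "https://portal.example.com") return;  // ignore unexpected senders
      if (event.data === "refresh") {
        // ...update the widget's contents...
      }
    }, false);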
I think in terms of off-line access, in terms of access to certain devices, there are certain web applications that are saying, you know, we want just that little bit more. We want access to your webcam, so, here, install a browser plug-in or, you know, let me out of the browser sandbox in this way. And there's a question about, you know, what does the browser need to provide for these applications that we don't really know yet, and can we do that without giving up all of the trust that we currently understand about the web. So I think in that sense things are accelerating, and we're starting to see applications start to take advantage of that.
>> Helen Wang: I agree.
>> Charles Reis: Thank you, Helen.
>>: I was trying to add a comment, which is about the title, the word safe. Usually that means quite a different thing versus liveness. So in this --
>> Helen Wang: [inaudible].
>>: But this talk is more on the liveness side of that.
>> Charles Reis: Well, so in casting the talk, I'm trying to capture the work that I've done in my dissertation, which has included work on both the security side and the robustness side, keeping programs running safely.
>> Helen Wang: Okay. Well, with that, let's thank our speaker.
>> Charles Reis: So thank you very much.
[applause]