>>: ...event. So Lorrie Cranor is a faculty member at the CMU Institute for Software
Research. She directs the CyLab Usable Privacy and Security Lab. She's on the EFF
Board of Directors. She's an advisor to Microsoft as part of our Trustworthy Computing
Academic Advisory Board. She's also the co-founder of Wombat Security Technology,
commercializing some of the anti-phishing technology that she's built, and she's a
co-author of the book on usable security and privacy.
So I am very excited. Please join me in welcoming Lorrie.
[applause].
>> Lorrie Cranor: Thank you. Thanks. Well, I'm really glad to be here today. And Rob
asked me to give a talk. And he said: You can talk about any of your research. And I
said: Well any preferences? And he gave me a whole long list of things. I said I can't
put all of that in one talk.
So I'm going to try to actually cram two talks into one. So at least you get some flavor of
multiple things. But I have a few slides that just have brief references and mentions of
other things. If you want to know more about those you can ask me about them later.
So today we're going to focus on cyber trust indicators. So what are they? This is kind
of my umbrella term for any sort of symbol of security, privacy or trust that will show up
in a browser or in software.
And so within scope here are things like pop-up warnings. Symbols in the browser
chrome. Privacy seals. Privacy policies. These all fall under this umbrella.
All right. There are a lot of uses for cyber trust indicators. One use is if there's a
potential hazard, it's an alert about the hazard. Another use has to do with indicating
some reputation trust, validation, something along those lines.
And sometimes these are used to facilitate decision making. So today I'm going to talk
about two projects that we've done at CMU. And one of them deals with alerting about
potential hazards. This is SSL certificate warnings.
And the other one is about facilitating decision making, and this has to do with privacy
policies. So that's what you'll hear about today.
Okay. So talking about hazards. This is one of my favorite hazard photos. This is
actually the sidewalk in front of my house. It actually -- it used to be the sidewalk in front
of my house. The city of Pittsburgh actually informed me that this is a hazard. And I got
a court summons, failure to repair my sidewalk. I had to go in front of a judge and plead
not guilty. Anyway, by the time I went in front of the judge, I, of course, had done
something about this hazard. What should we do about hazards? The best solution is
what I actually did. I removed the hazard.
But there are a lot of other things you can do when you have a hazard. The next best
solution, and this is what I did until I had the contractor come out, was I just guarded
against the hazard. [laughter].
And that's often a good solution. Not as good as getting rid of it, but still reasonably
good. It still can be costly. These guys don't want to sit out there 24/7. So the other
solution, of course, is just to warn about the hazard.
>>: It's Pittsburgh, so they're union, right? [laughter].
>> Lorrie Cranor: Yeah, so you can warn about the hazard. And obviously I just
PhotoShopped that on. But I was walking around my neighborhood and I discovered
that in fact people are doing that. [laughter].
So, yeah, that is real. That's a few blocks from my house. Okay. So this is what people
do about real life hazards. And I've actually spent a bit of time reading the hazard
literature. There's actually a whole area of psychology that deals with hazards. And
these people study like what color should railroad crossings be, and what font should
you put on medicine bottles and things like that.
One of the things they talk about is this kind of hazard hierarchy, exactly like I've shown
you. All right. So how does this apply to computer security warnings?
Well, so the warnings that we see in our software indicate that there's a possible security
hazard, and it triggers a warning. And often our warnings are confusing, and there are a
lot of false positives and people tend to ignore them.
So sometimes when there's really a hazard it's dangerous, and people actually are
injured by the hazard. So just to give you an example. Here is a typical security
warning. And when users see that, this is what they actually read. [laughter].
So they swat it away. Okay. So what we've been looking at in our research is processes that can help us do something better. And we've come up with a process which is in a paper called the Human in the Loop, which you can take a look at. It's the human threat identification and mitigation process. The idea here is if you have a
secure system that's relying on humans to do something useful in order to protect their
security, we'd like to model that human as a threat. The good human as a threat, not
just the evil attacker.
So in order to do this, the first thing we need to do is to identify what exactly is the
security task that we're expecting humans to do. And I think this is a step that people
often don't really think about. And so I think it's really important. And another person in
the security community tells me when she consults with companies she has them do
this. She says get out a piece of paper, blank eight and a half by 11 piece of paper and
write down what are all the security tasks you're asking people to do.
When they fill up that piece of paper, she tells them they're in trouble and they need to
start over. Because you can't rely on humans to do this many things. But the first step is
to enumerate what are these tasks that you're expecting humans to do. Once you've
done that, you can go through them and say: Do I really need a human to do each of
these things. Are there tasks here that I could actually automate? Or partially automate
to somehow lessen the burden on the user to do these tasks.
And then once I've done that, I see what's left. And I want to see, okay, are these tasks
that humans are good at or things they're likely going to fail to do. So we want to identify
failures. And the Human in the Loop model has a whole set of steps to try to identify
those failures. And you might need to do some user studies to find out whether people
are good or actually not so good at some of these things. Then once you know where
the failures are you try to figure out how to mitigate the failures and you may do some
looping. Where there may be some things that you felt were too hard to automate at
first, but once you realized how bad people were at them, now automation suddenly
looks a lot better.
Okay. So now we can apply this process to warnings. So in this case the task is to
decide whether I should stop doing what I'm doing and pay attention to this warning, or
whether I should swat it away and keep doing what I'm doing. That's the task we're
asking the human.
And if we wanted to, we could automate. We could get rid of that warning and have the
computer decide either there's a hazard or there's not. If there's a hazard, then the
computer just doesn't let the human proceed. If there's not a hazard, then the computer
lets the human proceed and doesn't say anything.
Hopefully, there's some good reason why we're bothering to interrupt the human here
and there's something that we can't completely automate here. Okay. So the current
situation looks something like this. Unfortunately, often what happens is software
developers are in a situation where there might be something dangerous. And they
decide, well, I want the human to decide. I don't know if it's really dangerous or not.
So we have this big gray area. And so what we'd like to have happen is that we want to
use some automation to make this gray area smaller. If we can kind of cut away at the
edges and find the cases where we really can automate and decide it's safe or it's not
safe.
So we make it smaller. So now we have some situations where we don't have to bother
the user, because it's really probably not a hazard, or where we just want to block and
really not let them decide.
And then for what's left in that gray area, now we want to have better messaging to the
user so that they can actually understand what questions they're being asked and make
a really sensible decision. So an example of how you might change the question -- okay. Here's a bad question. Your Web browser thinks this is a phishing website, do
you want to go there any way? And your typical user may not know what a phishing site
is and of course they want to go to that site. This is a silly question. And they're just
going to click through and go to the site.
You could rephrase this. You are trying to go to evilsite.com. Do you really want to go
there, or would you rather go to yourbank.com? I want to go to my bank. I don't want to
go to the evil site. That's a much better question. In order to ask this question, we have
to do more automation behind the scenes, because we have to figure out where the user
really wants to go. This may be a nontrivial thing to figure out and this process itself may
be attackable by the attacker. So we need to deal with that.
But this is kind of what we're aiming for, is we want to be able to actually reduce when
we need to ask the user at all and when we ask them, ask a better question.
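A minimal sketch of that triage idea in Python -- the names and the blacklist here are hypothetical, not any real browser's logic:

    # Shrink the gray area: auto-allow the clearly safe cases, auto-block the
    # clearly dangerous ones, and only interrupt the user for what's left,
    # with a question phrased in terms of where they were trying to go.

    KNOWN_BAD = {"evilsite.com"}        # e.g. a confirmed phishing blacklist (illustrative)

    def ask_user(question):
        return input(question + " [y/n] ").strip().lower() == "y"

    def triage(site, cert_ok, likely_intended_site):
        if cert_ok:
            return "proceed"            # clearly safe: no warning at all
        if site in KNOWN_BAD:
            return "block"              # clearly dangerous: don't let the user decide
        # gray area: ask a question the user can actually answer
        wants_it = ask_user("You are trying to go to %s. Do you really want to go "
                            "there, or would you rather go to %s?"
                            % (site, likely_intended_site))
        return "proceed" if wants_it else "redirect"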
So this led us to study browser certificate warnings and to see if we could apply this sort
of process to improving them. So this was work that was done by a bunch of my
students, including Serge Egelman, who many of you know.
So looking at browser certificate warnings. There's a whole bunch of different types of
things that Web browsers warn about related to certificates, and these include things like
domain mismatch, unknown certificate authority and expired certificates. And on the one
hand most of the time, when you get a certificate warning, there is not actually a hazard
that you need to worry about. On the other hand, sometimes there is.
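As a rough illustration of those different failure types, here is a small Python sketch that performs a TLS handshake and reports which kind of certificate problem it ran into; the category strings are mine, and the message matching depends on the OpenSSL build, so treat it as a sketch rather than production code:

    import socket, ssl

    def check_certificate(host, port=443):
        """Try a TLS handshake and say roughly why it failed, if it did."""
        ctx = ssl.create_default_context()    # verifies the chain and hostname by default
        try:
            with socket.create_connection((host, port), timeout=5) as sock:
                with ctx.wrap_socket(sock, server_hostname=host):
                    return "certificate accepted"
        except ssl.SSLCertVerificationError as e:
            msg = (getattr(e, "verify_message", "") or str(e)).lower()
            if "expired" in msg:
                return "expired certificate"
            if "self signed" in msg or "self-signed" in msg or "unable to get local issuer" in msg:
                return "unknown certificate authority / self-signed"
            if "hostname mismatch" in msg or "doesn't match" in msg:
                return "domain mismatch"
            return "other certificate problem: " + msg

    # e.g. check_certificate("expired.badssl.com") would report an expired certificate.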
And so we would like to be able to distinguish those cases. So this is an example of the
Firefox 2 certificate warning. You can see there's a lot of text. And the most natural
thing is to just say, okay, and to ignore the warning. And that gives you what people
actually read. Okay. This is the IE 7 warning, and this has less text. But still not really all that
informative to people.
The Firefox 3 warning is actually really interesting. So there's no buttons. You can't
easily swat it away. What you have to do is you have to click on that link on the bottom
which says or you can add an exception. And then that opens it up and then you can
add an exception. And then you get this box and you have to click on the "get
certificate", and then you finally get to say confirm security exception and then you can
finally go to your website.
So that's a multi-step process. So we decided to conduct a laboratory study to see how
people reacted to these three different warnings. And my students also tried their hand
at coming up with their own warnings. They invented two warnings I'll show you in a
minute. We had five different warnings we were testing.
We had 100 participants. And each participant was assigned to a condition where they
saw one of these five warnings. And they were assigned to do a bunch of Web surfing
tasks, and during that process the warnings were triggered twice. Once at a high
security site and once at a low security site.
All right. So these are the two warnings that my students made up. This is the first one.
So here they decided to really focus on making the risk itself very obvious. So they
turned the warning red so that it would really stand out.
And they made the headline high risk of security compromise. And if you read the -- if
you read the text of the warning, in detail, it actually explains in reasonably easy to
understand language exactly what's going on here.
The second warning that they came up with is this ask a user a question warning. So
here it says: The website responding to your request failed to provide verifiable
identification. What type of website are you trying to reach? And there's a choice of
different types of websites. And if they say bank or other financial institution, then we
show them this. If they say any of the other things, then we just let them proceed to the
website. So the idea is to try to get at: Are they actually in danger? If not, we're not
going to bother them. If they are, now we want to stop them in their tracks.
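The decision logic behind that second warning is tiny; here is a sketch (the function name is mine), which also makes the weakness obvious -- it only works if the user answers honestly:

    def handle_unverified_site(site_type):
        # site_type is whatever the user picked from the list in the warning dialog
        if site_type == "bank or other financial institution":
            return "block"     # high risk: stop them in their tracks
        return "proceed"       # anything else: probably low risk, don't bother them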
So the way the laboratory study worked is that the people were given -- oops. Now my
phone decides to come back. Okay. People were given a bunch of tasks one at a time.
And most of these were actually distracter tasks. But the two we cared about were, they
had to go to their online bank account, log in and find their bank balance and just tell us
the last two digits. And they needed to look up a book in the CMU library catalog. When
they did those two tasks, a warning was going to appear.
For each of those tasks we also provided an alternate task, and the idea was that if we
told them your task is to go find your bank account number, then they were likely to one
way or another find their bank account number, even if they were concerned about the
warning.
So we wanted to have another route so if they were concerned they could do
something else instead. And so in this case the alternate task was we gave them the
phone number for online banking at their bank and there was a phone provided right
there, and we said you should use the phone to find out.
So here is an example of how the task worked. We gave them a separate sheet of
paper that had the task on it. It had the alternate task, so you can see the phone
number is right there.
And we asked them to think aloud so we could see what they were doing as they did the
task. So it looked something like this. They opened up the Web browser. They typed in
their bank's URL. And so this was one of our custom warning conditions. So they would
get the warning. And then if they chose to ignore the warning, they would go to their
bank account and log in.
And if they chose not to ignore the warning, they would click the get me out of here
button and they would pick up the phone and call their bank. We had a number of
hypotheses we were testing here. The one is IE 7 and Firefox 2 warnings we thought
would largely be ignored by users. We thought they were more likely to obey the Firefox
3 warning because it was so complicated as well as the warnings that we created.
And we thought that the multi-page warning, they would be more likely to obey on the
bank website where it actually was going to put up that false stop. And then on the
library website, where they were going to see it, that was going to be a low risk website,
and so they would probably just go ahead and go to the library.
And incidentally, at the CMU library, they have a self-signed certificate, so CMU students are used to swatting away a certificate warning whenever they go to the CMU library.
So here are the results in the bank account. So what we found was that in the risky
situation at online bank account, that the Firefox 2 and IE 7 users for the most part
ignored the warning, which is actually quite dangerous, because this was their real
online bank account, and they had no way of knowing what we might have done to the
computer that they were using.
In reality, all we did was actually we removed the root certificate from the browser. So it
triggered warnings. But they didn't know what we had done.
So then for the library, what we see is that actually people largely ignored the warning
just about everywhere, except in Firefox 3. And Firefox 3 was very interesting. They
wanted to ignore the warning in Firefox 3. They just couldn't figure out how. [laughter].
>>: Over the years we actually trained our users to look for the okay button. They've
been trained very well.
>> Lorrie Cranor: Yes. When we did this test, Firefox 3 had just been out a short
amount of time and people hadn't been trained what to do to override it. And it's
significantly more complicated.
So we compared the library versus the bank, and what we can see is that in the three
native warning conditions that we tested, there's actually no statistically significant
difference in behavior between the library and the bank. So users are not differentiating
between a risky situation and a nonrisky situation.
But with the new warnings that we created, we actually got people to make that
differentiation. So we also dug down a little bit to see if we could understand more of
what was going on. And this was a think aloud so we have a lot of information about
what they said as they went. And then we also surveyed them after they finished all the
tasks.
So one of the things we asked them about was why did they choose to do what they did
with the warnings. And we coded their answers. And what we found was that in the
three native warnings, very few people mentioned anything about risk.
But in our single page warning condition, almost everybody mentioned risk. And so
there was clearly a difference as far as whether the warning was communicating that
there was something risky or not.
We also asked them: What did you think the warning wanted you to do? Now, you
would think this is obvious what the warning wanted you to do, but in fact it wasn't. And
most people did not say that they thought the warning wanted them to not proceed,
which just seems like such an obvious choice, and yet they didn't say that.
>>: Doesn't the IE 7 have a recommended with a little green check and a not
recommended with a red?
>> Lorrie Cranor: Uh-huh.
>>: That's in IE 7 and people didn't see it?
>> Lorrie Cranor: Yeah, so we didn't have the warning in front of them when they
answered the question. It's just what they remembered. Yeah. So, yeah, these
warnings were not clear about what it is that they should do.
We did note that the Firefox 3 approach of making it difficult seemed somewhat
effective; that people couldn't figure out how to do it. Although, half of them could figure
out how to override it. So there was only so far that that went.
Another interesting thing about it is that there was a lot of confusion. So this is the
Firefox 3 warning, and this is the 404 warning in Firefox. And you'll notice that there's some
similarities. They both have yellow things in the corner. Apparently that was enough to
confuse users.
A lot of people just said, oh, that looks kind of like that. And they didn't actually realize
what the warning was. Okay. What about our approach of asking a question?
So this had mixed results. So at the bank only 15 out of 20 people correctly answered.
Did they not know that they were at a bank? Yeah, yeah, they knew, three of them
knowingly gave the wrong answer because they realized if they gave the right answer,
we were going to block them. [laughter].
And two of them were just confused and didn't actually read the question itself and just
randomly clicked something. Okay. Now, it also turns out that there are some vulnerabilities in this approach. In our paper we describe a finer-grained origins attack. We submitted the paper to USENIX Security and they accepted it conditionally -- the program committee had identified this finer-grained origin attack, and they said you must write about it in your paper, which we did. So there are some other problems as well.
So on the one hand, you know, if you look at the numbers, wow, we had a 50 percent improvement. But on the other hand, half the people, had this been real, would still have been victims of an attack at their online bank, which suggests this isn't good enough. A mixed result. It suggests having a context-sensitive approach is going to be useful, but this isn't it.
>>: Well, I'm not sure that I agree this is an improvement. What is the base rate, you know, the occurrence of people being attacked with bad certificates? You have three people here giving the right answer, and it would seem to me -- I know from experience that my computer does random stuff all the time. It blocks me from doing stuff where I know better than it a lot of the time, and this is still right.
>> Lorrie Cranor: Our feeling was that at websites that are not their bank, they're
probably right. But really if you -- you should not see a warning when you're going to
your bank.
>>: But my point would be that your average user has seen certificate errors in his
lifetime. He's never once ever seen --
>> Lorrie Cranor: I absolutely agree with that. What our argument is that -- that once in
a lifetime, when you see the warning at your bank, it should stop you and these warnings
are not stopping you.
>>: But it's not unreasonable that people have been trained to --
>> Lorrie Cranor: And that's why we're trying to change the warnings and say: Can we,
by using a different kind of warning, actually get people to stop when it happens at their
bank. And what we found is we can improve the odds but not enough.
>>: But that's the constraint. If you just want to do that, then the answer is clear, you
just lost the connection.
>> Lorrie Cranor: That may be the right thing for developers to do. And that's
something you all should think about.
>>: But is it possible that it's just the novelty of the ones you designed? If you had them
use it for a year, they would have started ignoring it.
>> Lorrie Cranor: It's entirely possible. Yeah. It is entirely possible that -- so we're
already saying up front that we saw only a small improvement and we don't think it's
good enough. And it may be that over time our small improvement would even
minimize. Yeah.
I suspect that at the end of the year you would still see a small improvement but it would
actually be even smaller than what we showed. Okay. So our conclusion from this
study is that basically we have a problem. Yes, you can improve the wording of
warnings, but in many cases, that's probably not good enough.
And so think about the sidewalk. We could spray paint the word "warning" in bigger
letters, but is this really the right solution to the problem? So I think we need to have
some rethinking about the whole solution and not just focus on, okay, let's just reword
the warning.
There are some research systems that are out there that are trying different approaches.
So in our paper we talk about Perspectives and ForceHTTPS, which are kind of academic research solutions. Perspectives is a system that other people at CMU (not me) have been working on, where there's basically a server that keeps track of the certificates it has seen from websites.
So basically your browser contacts a server and says: Well, I just got this certificate; is it
the same one that you've seen? And if lots of other people have seen that certificate, it's
assumed to be a good one.
But if suddenly you're getting a different certificate than other people have gotten, then
you should question the validity of the certificate. So that's an interesting approach.
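In sketch form, the client-side check is just a comparison of fingerprints; this assumes hypothetical notary responses and is not Perspectives' actual protocol:

    import hashlib
    from collections import Counter

    def notary_agrees(cert_der, notary_fingerprints, quorum=0.75):
        """cert_der: the certificate we just received, as DER bytes.
        notary_fingerprints: SHA-256 fingerprints the notary servers report having
        seen for this host. True if most notaries saw the same certificate we did."""
        ours = hashlib.sha256(cert_der).hexdigest()
        if not notary_fingerprints:
            return False
        seen, count = Counter(notary_fingerprints).most_common(1)[0]
        return seen == ours and count / len(notary_fingerprints) >= quorum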
There hasn't really been research into the false positive rates you get with that or the
usability of it.
But it seems like an interesting direction to look at. Okay. So now I'm going to switch
gears and talk about privacy policies, unless there are any other questions before I -- okay.
So a nutrition label for privacy. So this is again, work done with a bunch of students at
CMU. Patrick Kelley kind of took the lead on this.
So we start with the notion that lots of privacy surveys have been done. And the privacy
surveys always point to privacy as a value that people really want. They say it's very
important.
And then if you observe behavior, you find that people often take steps that seem to
contradict how much they value privacy. That they often seem to be quite willing to give
up their privacy for small rewards.
And so there's been a lot written about why that is. And some people say well maybe
people don't really care much about privacy after all. And that's certainly a possibility.
There are other people who say, well, you know, maybe there's this immediate
gratification thing that I want that candy bar now, and the privacy is a long-term thing. I
won't actually see the impact of the privacy invasion for some time far out in the future.
That's a possibility, too.
There are two other possibilities, and that's the focus of our work here. One is that they
don't actually understand the privacy implications of their behavior. And, two, is that they
don't actually understand the privacy policy that they're dealing with.
It's actually very time-consuming to read and understand privacy policies and to inform
yourself about what's going on. So the theory behind privacy policies is that they inform consumers, and that consumers can read them and then vote with their
feet.
If they're at a company that has a bad privacy policy, they should just not shop there.
They should go somewhere else. The problem is that it's been well documented that
most privacy policies require very advanced reading skills to actually understand. They
take a long time to read. And so nobody is reading them. And so they're pretty much
ineffective for this role.
Actually, there's still a lot of good things about privacy policies but if you're expecting
them to be used by consumers to make decisions, that's not happening.
One of my students actually did a study where she said, okay, let's just pretend
everybody really does read privacy policies. Let's see what would happen here. And so
she did some calculations and got some data about the average number of websites that
people visit. Average salaries, things like that.
And she found that it would take you 244 hours a year to read a privacy policy at every
site you visit once a month. That's ridiculous. And then if you put the dollar amount on
it, it gets even more ridiculous.
So even if people wanted to read privacy policies, we can see that it's just ridiculous to
think that people would actually do that.
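The back-of-the-envelope arithmetic behind a number like that is simple. With placeholder inputs of my own -- say, on the order of 1,500 unique sites a year and about ten minutes per policy -- you land in the same ballpark:

    sites_per_year = 1500        # assumed: unique sites whose policy you'd have to read
    minutes_per_policy = 10      # assumed: ~2,500 words at ~250 words per minute
    hours_per_year = sites_per_year * minutes_per_policy / 60
    print(hours_per_year)        # 250.0 -- roughly the 244-hour figure from the study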
Okay. So then we looked at, all right, is there anything we can do to put privacy policies
in a format that would make them more accessible. So we did a study where we looked
at a bunch of different privacy policies formats. We looked at the full standard long
privacy policy formats. We looked at the layered notice privacy policy. We looked at
some things my students came up with.
And we did this for a number of different real websites. We basically found a bunch of
real websites. Some had long privacy policies. Some had short ones. We took out all the brand names to anonymize them. We put them all in the same font. We put them in these different formats, and we asked people to read one policy in one format and answer a bunch of reading comprehension questions. And then a bunch
of questions about how much they enjoyed the experience they just had.
What we found is that people had a really tough time with this. The very simple
question -- so all of the companies we renamed Acme. So, does Acme use cookies? 98 percent of the people could actually answer that question. But anything that required
any reading comprehension was very difficult for people across the board.
And even the well-written policies, we picked some that we felt like this company has a
really nice, concise, easy-to-understand policy, and we threw those in the mix. And
even those, people didn't like all that much, and they still found the tasks difficult to do.
And we found that this idea of layered policies wasn't really helping either. So this was
kind of depressing. Lots of different efforts to try to improve privacy policies, and we're
finding that they're pretty much universally hated.
So then we did some studies to look at, well, what if you give people privacy information
in the search engine? So we built this thing called Privacy Finder. You can check it out
at privacyfinder.org. Although, we just tried it earlier today and it's very slow right now
because its cache is broken. But next week hopefully it will be faster.
Anyway, we built this search engine that the search results come back annotated with a
privacy meter. You can see at a glance when you do a search which of the search
results has good privacy and which one has bad privacy.
And we did some lab studies where we paid people to go shopping, and we set up the
studies so that the search results that had good privacy also sold products that
were more expensive. And so we could test whether people would actually pay more for
good privacy.
And we actually found that they will. As long as it's not too much. But in the 50 cents
range we actually found people would pay more for good privacy.
>>: 50 cents?
>> Lorrie Cranor: On a $10 purchase. So this actually was one of the first studies to
demonstrate that people may actually be willing to pay for privacy if you make it really
blatantly obvious to them which company has a better privacy policy.
>>: Did they translate privacy to safety -- where they thought, because it's more private, it was safer?
>> Lorrie Cranor: Yes. So one of the things we did in the study, there was an exit
interview, and we asked them: What do you think it meant for them to have a good
score on the privacy meter? And we did find that there were some people who
associated it with safety, security. I'm not going to have my credit card stolen, things like
that.
Now, they did have an opportunity, if they wanted to, to click on the privacy meter and
find out what it meant. But very few people actually did that. Most of them just
accepted, high privacy, great, didn't actually click through what high privacy actually
means.
We did a bunch of these studies. We tried it with different items. We tried it with
different types of conditions. We also checked to see was the meter itself attractive.
Maybe it had nothing to do with privacy but it's a high level on some meter. So we
tested it with meters that mean other things, and we found that actually, no, privacy was
scoring more highly than other things.
So then we said, all right, let's go back to this privacy policy problem and ask: can we actually
make a better one. And we looked to the nutrition label literature for some insights here.
And nutritional labels are not perfect either. They have their criticism as well. But there
are a lot of really nice things about nutrition labels.
So I can take two products and put them side by side, and I can easily see which one
has more calories and which one has more fat and calcium or whatever. And if I have
particular dietary concerns, I've gone to the doctor and the doctor says you need to
watch your cholesterol now, I can now start looking at the cholesterol box on the nutrition
label.
It may be that I never paid any attention to cholesterol before, but now that I know that
that's something I need to look at, I can very easily learn how to find cholesterol on there
and I can do that comparison.
And the reason that this works is because it's a very standardized format. It uses
standardized language, and it's also fairly brief. And it's linked to more information.
So I can actually get all of the ingredients. Now, this is not a very interesting example,
because the only ingredient is milk. But there are other examples where there's actually
a very lengthy ingredient list.
And it may be that after you've seen the summary, you still want to know, well, there's
some particular ingredient is in there and you can actually go and read the whole
ingredient list and find it.
So we looked at a bunch of other types of consumer labels. Energy labels. Drug facts.
Water ratings, all sorts of things like that. We looked at the literature in this area about
how these types of things were developed. And that gave us a lot of insights.
Now, the privacy case, we had a number of challenges. One is that people are not
actually familiar with privacy terminology. Now, of course the same thing could be said
about nutrition. Before nutrition labels, most people didn't know what a trans fat is and
still people may not know what a trans fat is. But we're still able to use a nutrition label.
There were also issues of sort of the cross-product problem. It's not enough to know
what data they collect, what they use data for. You need to know they use this data
element for this purpose. So that cross-product becomes actually very important.
Another challenge is that privacy policies tend to actually be fairly complex. And people
don't really understand the implications of these privacy practices. So we did an iterative
design process that actually was a multi-year process. It involved doing focus groups,
lab studies, online studies, and in these studies we measured reading comprehension,
how accurately they could answer questions.
How long it took to find information, how easy it was to compare multiple policies and
then subjective opinions about things like ease of use, trust and enjoyment. So this was
one of our first attempts.
This was actually designed by Rob Reeder. And this was in Rob's thesis. What we did
here is we took the P3P language and we took every element in the P3P language and
tried to cram it into one table. And it may look like that to you.
This is actually the collapsed view. As some of you may know, Rob did work on expandable grids. So you could expand this and then it was even more crazy.
It has I think no fewer than 11 different symbols in various shades of teal. [laughter]. And it won't surprise you that when we did the user study to test it, it was a complete disaster.
And we wrote a paper about it anyway. And this was a lessons from failure kind of
paper.
But we did actually learn a lot from doing this. And actually I must say that there actually
are a few good things about this. One of which is that people who wanted to create P3P policies, policy authors, actually really liked this. They said it would be a great authoring tool: I can find all the elements I want to put in my P3P policy, and if I could toggle them I could create my P3P policy. But for readers, people who wanted to find out about
a site's policy, this was just kind of overwhelming.
So we did a lot of interviews with people about this and tried to understand what their
problems were with it. And then we did an iterative process, where we tried lots of
different approaches.
So after this, the first reaction was to go to the complete opposite end of the pendulum: let's do something very, very simple. So Patrick Kelley had come up with that one, which is
based very closely on nutrition facts.
Then we said, okay, we've gone too far here. Let's go back to a grid but a simpler grid.
That's what we had here. There was a lot of feedback about this grid, and one thing is
people wanted color. So color came back here. Lots of different changes. And then we
finally ended up here. The evolution.
And there were actually a bunch of others in between. There were lots of minor variants
that we tested separately. And our tests were done with informal focus groups for some
of these small variants. We also used Mechanical Turk and that was really good to be
able to take, okay, we have two subtle variations. Let's throw them both up on
Mechanical Turk, get 50 users to use them and we can very quickly decide which
variants we should pursue and which ones not.
So zooming in so you can actually see this a bit more. This is actually not the very final
one, but this is the one that we tested. So you can see across the top we have how we
use your information and who we share your information with. And down the side we
have types of information. And then we have these colored cells to show you whether
the information is going to be used in that way all the time or in an opt-in or opt-out
basis. This is all generated automatically from P3P policies.
But it doesn't have to be. You could manually generate this for a company. But in
Privacy Finder, we actually are automatically generating these now. And then if you scroll
down, we actually have the key to show what the symbols mean. And then there's a
glossary. And the glossary actually goes on beyond this.
But all of the terms used on the page are in the glossary. Okay. So first we did some
focus groups. And that helped us zero in on the actual terminology that we used and
some of the issues with symbols and colors and things like that.
And then we did a laboratory study with 24 participants. We had them basically go
through and do a bunch of reading comprehension tests with the policy in the lab, doing
it in the lab allowed us to observe exactly how people were manipulating the policies and
do that think aloud and get their feedback that way.
That was a SOUPS paper. Then we refined the nutrition label based on that feedback,
and we did an online study with 763 participants. And this was done using M Turk. In
this case, we had five conditions that we tested.
And I'll show you them in a minute. And for all conditions we measured the same things
that we've been measuring before. So this was our standardized label. And so this
looks very similar to what I just showed you, except the exclamation points are gone now
and there's some minor wording changes.
And so this was -- our hypothesis was this was going to be the best solution. One thing
that we were interested in was do we really need the empty rows. So we did a version
where we took out all the empty rows so we could test that as well. So that was our
short standardized label.
Then somebody asked us, well, now that you've standardized it, maybe the grid isn't so
exciting. Maybe text standardized would be just as good. In fact, that was some
criticism we had gotten on the SOUPS paper: you actually didn't test text. So we then
generated our standardized short text. Basically we took the standardized short table
and turned it into text. Every column and row became a sentence and we put it all
together here. So that's our standardized short text.
We also tested the layered text. So this is something that's being done in the industry.
There are a number of companies, including Microsoft, who have layered notices.
Sometimes known as highlights notices. And so we just took an actual one that was out
there and we just changed the name of the company to Acme. But we used all their
formatting and everything.
And again we also tested the full policy, which I don't have an example here. We had
four real companies. And we created five versions of their policy using all these different
versions. So in our M Turk study, we started by asking people demographic questions.
We asked them some questions about their views of their use of the Internet and their
views of privacy.
Then we had simple tasks where they could answer it by looking at one row or one
column in the table. So does Acme use cookies is an example. And then we had
complex tasks where they had to look at the combination of a row and column in order to
answer the question. Then we asked them how much they enjoyed this experience.
Then we gave them comparison tasks where they looked at two policies side by side.
And we actually set up a browser so they could actually look at two policies, and
answer some questions about that. And then we asked them about their enjoyment of
doing the comparison test. So overall, if you look at all the accuracy tests we did, what
you see is the three standardized formats all do much better than the full policy and the
layered text. And there's actually very few statistically significant differences between
the three standardized formats and very few significant differences between the other
two formats.
But between each other, the blue is statistically different than the red. So what's going
on? Okay. So overall we see standardized formats outperform text and layered. It
seems like there's a big advantage to having standardization, having a structured
presentation and having clear terminology, having the glossary. So having definitions.
All of these things seem to contribute to people being able to actually understand the
policy and use it more quickly.
Between the standardized formats we saw only minor differences. The table versus the
text, we see that the table does give you a more holistic view, and so for some of the
questions we saw people were able to perform a bit better, because of the holistic view.
We found that the short table takes up less space. And for some questions, therefore,
you can actually answer it faster, because there's less you can ignore. But there were
some questions that it tripped people up, especially if it was about those missing rows, it
was then harder for people to find that information.
We found that the text one in general worked pretty well, but there were some questions
where the answer was in the middle of a paragraph. And people didn't do as well on
those questions. Because they had to find text in the middle of a paragraph. And for a
complicated policy, the text one -- you're going to have more sentences in a complicated
policy. So it's going to get longer. So it actually doesn't scale well for a complicated
policy.
That's not a problem you have with the table because the table is always going to be the
same size.
>>: Does having a very constrained data format like that limit the amount of
communication you can have? It seems in this test that the privacy things you're asked to look at are very well known, sort of mainstream technology. For new privacy risks that come up, like Flash cookies, something like that -- Flash cookies, there's not a box for it, there's not a standard for it. It seems like just open text at least gives you the opportunity to put in
arbitrary data.
>> Lorrie Cranor: In our standardized text you don't have that opportunity but in full
open text you do. So this is a trade-off. So if you want to have everything, it's not going
to be standardized and it's not going to be compact. And our notion is that you use the
standardized format as the first look, and that you allow people to drill down.
One of the things that we're currently working on is making it so you can click on the cell
and then that would tell you more information. So if you say, wanted to know is it regular
cookies or is it flash cookies, you could click on the cookie cell and get more information.
>>: Going back to the previous presentation, there's sort of an interesting interaction with your experiment where you're asking the user whether they think they should be going to yourbank.com or evilsite.com, that example. There are interesting privacy implications around that: if you do something funny, you're going to need to send me your URL or path or something like that, but if you're not doing something funny you won't. Describing those more complicated cases -- maybe you'll send me your URLs, maybe you'll send me your IP -- those sort of conditional privacy considerations, is there any way to represent those?
>> Lorrie Cranor: So we have looked at that some. So as I said, this was actually
generated from a P3P policy. And the P3P policy actually allows you to have multiple statements on a site. For when you're logged in, this is what we do. When
you're not logged in this is what we do. So these are separate statements.
So you could actually generate a separate grid for each statement. And we've
experimented with that and a typical site has like three to five statements. And now you
have three to five statements on the screen.
So one of the things that we're looking at is having tabs. And so when you hit the site,
you see the composite statement, which is what we've been showing you. But then you
see the tabs if you want to look at -- you see the tabs for when you're not logged in, and I
can click over to that say what are they doing when I'm not logged in. There's something
along those lines. It's something we're experimenting with.
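A small sketch of what building that composite view from several statements might look like; the data types, purposes, and the "show the strongest use" rule are simplifications of mine, not the P3P spec:

    # Each statement maps (data type, purpose) -> how the data is used in that context.
    # The composite grid takes the strongest use across statements; the per-statement
    # grids would become the tabs (logged in, not logged in, and so on).

    LEVELS = {"not used": 0, "opt-in": 1, "opt-out": 2, "always": 3}

    def composite(statements, data_types, purposes):
        grid = {}
        for d in data_types:
            for p in purposes:
                uses = [s.get((d, p), "not used") for s in statements]
                grid[(d, p)] = max(uses, key=LEVELS.get)
        return grid

    logged_in  = {("contact information", "marketing"): "opt-out",
                  ("cookies", "profiling"): "always"}
    logged_out = {("cookies", "profiling"): "opt-in"}

    print(composite([logged_in, logged_out],
                    ["contact information", "cookies"],
                    ["marketing", "profiling"]))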
>>: Tell me again, who did you test this with --
>> Lorrie Cranor: For this study it was Mechanical Turkers. So Amazon has this system
where you can crowd source -- basically you can put up tasks and say I'll pay you 50
cents to do a task and people who are bored come to this site and get paid 50 cents to
spend --
>>: Educational --
>> Lorrie Cranor: We actually -- we ask demographic questions so we could find out.
>>: So what was the demographics?
>> Lorrie Cranor: It's actually a pretty good cross-section of people skewed towards a
younger demographic. It's mostly people under 50, but you get a pretty good range of
occupations and --
>>: Well educated versus not? I ask that simply because of the difference between the table versus the text, and privacy text has to get complex all by itself.
>> Lorrie Cranor: Yeah, we had a mix of educational backgrounds. If you look at US
census data versus this, this is more educated than US census. But if you look at
Internet users in general, I think this is a reasonable cross-section of Internet users.
>>: [indiscernible].
>> Lorrie Cranor: Yeah.
>>: Very small [indiscernible], click on the link to get the privacy statement. Have you ever asked: if the privacy statement had one of these formats, would you be more apt to
click on the privacy statement to learn more about it?
>> Lorrie Cranor: We had something to the effect if all privacy policies looked like this, I
would be more likely to read privacy policies. And I think the only useful data from that is
the comparison between formats because I don't believe it when people say they're
going to actually look at it. But we want to compare across formats. So the people who
had the standardized formats said they would be more likely to read it than the other
people. But that's about all I could take away from it, because until you actually put
people in that situation, I don't think you'd really know.
>>: One of the things I think works on nutrition labels, and that almost works here: for the nutrition label stuff, I scan only the two or three fields that I'm really interested in, and there's very little wiggle room in interpretation of how much cholesterol there is in this. Whereas with the table here, I look at the information you're giving, and I guess you're just parsing what they say. But what I really want to know when I'm looking at the thing is, okay, you're going to use my e-mail for marketing -- does that mean like once a week, once a month? Three times a day? How much of this am I going to be seeing from you? And I guess that's not really in the policy, if it's not there.
>> Lorrie Cranor: I don't know any website privacy policy that tells you that information.
>>: Yeah. I guess that's the kind of --
>> Lorrie Cranor: There are other things people say: I want to know what company are
they going to share my data with. Nobody will tell you that.
>>: Which is -- yeah.
>> Lorrie Cranor: So we're only going with what we have. But I agree. I'd like to have
that, too.
>>: Combining the first presentation with this presentation: if you can pull the data out of the privacy statements, then the browser says, this is my minimum. Now block me
from sites that don't meet that minimum.
>> Lorrie Cranor: Yes. Actually, that's where we started. If we wind back in history. So
I was involved in developing the P3P specification at W3C. And we talked in the working
group about what could people do with P3P and we came up with a sketch of what it
should look like, and then Microsoft implemented something else in IE.
And so we decided to go ahead and build a prototype of what we thought P3P might do.
I worked for AT&T at the time. And so we built an IE plug-in called Privacy Bird. And
Privacy Bird let you set your personal privacy preferences and then every time you went
to a website it looked for the P3P policy. If it matched your preferences then you got a
green singing happy bird in the corner of your browser. If it didn't match you got the red
angry bird in the corner of your browser.
And we didn't actually implement this option, but the intention was that you could also
say just put up a block page, don't even let me go there. But the red bird cawing at
you was pretty disconcerting to people.
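In sketch form, the matching Privacy Bird did amounts to comparing a site's declared practices against the practices the user said they object to (the data layout here is mine; the real plug-in evaluated full P3P policies against the user's preference settings):

    def bird_color(site_practices, user_objections):
        """site_practices: set of (data type, purpose) pairs the site's policy declares.
        user_objections: the pairs the user has said they don't want.
        Green, happy bird if nothing objectionable appears; red, angry bird otherwise."""
        return "red" if site_practices & user_objections else "green"

    policy = {("contact information", "telemarketing"), ("cookies", "profiling")}
    prefs  = {("contact information", "telemarketing")}
    print(bird_color(policy, prefs))    # "red" -- the angry bird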
So that's kind of how we started with this. We have a notion with Privacy Finder Search
Engine that you could do something similar. So the privacy meter is actually interpreting
the privacy policy and is giving you this privacy score. You could actually personalize
that and we had a version at one point where we did personalize it and say, well, there
are lots of things that Carnegie Mellon says are bad privacy but I don't care about those,
these are what I care about and I want to see a big red X next to it if it has any of these
things.
Yeah?
>>: What about a high-level difference between the nutrition one and privacy? There's recommendations in one. Because the government's made recommendations for --
>> Lorrie Cranor: For good nutrition.
>>: Whatever the recommendations are. And I don't think until someone makes the
[indiscernible] you can tell someone's telling me hey this isn't recommended. I don't
know if you can ever get to agree on this issue [indiscernible] because I just scan down,
I look if anything is above ten percent, I'm like, huh, I look to see what it's telling me.
Nine grams of [indiscernible] I know that's ten percent of what I have in the day.
>> Lorrie Cranor: I see what you're saying. And that's part of why we've paired it with
the privacy meter because you can look at the privacy meter. And the privacy meter
says this is great privacy. And then you can go look through and say, okay, it was great
privacy, I guess this is good. Or the privacy meter says this is bad privacy, now you can
see why it's bad privacy. But it's a little bit less immediate than the nutritional label
where you put it right there.
>>: And who made -- is the meter just a static thing for testing?
>> Lorrie Cranor: No, we actually are computing it dynamically by analyzing the P3P
policy against -- we have our own opinion as to what's good privacy and we just
compare it against that.
>>: So to follow up on, like, the nutritional label, that's quite measurable. But this is your opinion, which presumably would differ depending on who you ask. So I think that's a very big --
>> Lorrie Cranor: Actually, the nutritional label differs depending on who you ask. When
they tell you this is ten percent of the recommended daily value that's for a person of a
particular size and weight which you probably aren't.
>>: The percentage varies, right? If it's 2,000 versus 5,000 calories, the percentage just
goes along with it.
>>: If you're a weight lifter it does. You want more protein.
>>: Multiply the percentage by whatever --
>> Lorrie Cranor: I think the point is that you should personalize the nutrition
information. You shouldn't just take it as is. Same for privacy. You might say I don't
care about junk mail; telemarketing is the only thing I care about. So I'm going to ignore the junk mail column and only focus on telemarketing.
>>: So when you're looking at making a determination whether it is a good or bad
practice, I think you could take a particular type of information collection and you could
say, well, is that information collection good or bad? Is the configurable choice by the user good or bad? Is there a choice? And what is the default option? So are all three of those aspects going into the analysis of what's a good or bad privacy practice, or are you sort of siloing each of those, or maybe just looking at the information collection?
>> Lorrie Cranor: We're doing the cross-product. So we're looking at what information
is collected and how it's used. So let me just kind of wrap up here and then I'll continue
talking with those of you who have questions here.
All right. So the layered policy, as I mentioned before, did not perform well in our study.
We found that it performed very similarly to the full policy. Some of the big problems
were that the layered policy actually doesn't have all the information. And people didn't
realize when information was missing.
And so they would answer a question that required actually clicking through to the full
policy without clicking through to the full policy. And yet they would still guess at it. I'll
hold your question until the end.
And basically the layered policy is only standardized a little bit. The layered headings
are standardized, but the actual rest of the content of the layer is not standardized. So
it's really not actually helping very much.
So our ongoing work at this point is that we actually have now finished the integration
into Privacy Finder, and we're now working on trying to make the label interactive so you
can come and click on this cell and now find out exactly what's going on at that website.
So what are they doing with this combination of purpose and type of data. And we're
also experimenting with the things like the tabs that I mentioned. We'd like to do some
field studies to see how people use this when this is their search engine. And we would
really love to test this with a major search engine. Hint, hint. We can test it in our lab
with a few hundred people. Maybe if we find ways to bribe them, we can get maybe a
thousand people to come use our search engine, but it would be great if we had 20,000
people, a very small fraction of a search engine's population that we could test it with.
So I will end it there. This URL, this is our lab website. And all of the papers that I
mentioned and many, many more are available there. So you can check those out. And
now I'll take a question in the back.
>>: So beyond the work you've done here, which is really good work, you actually -- I
don't know if you used this word or not. But I would agree with you, using whatever word
you did, that it's mitigation for I think what are fundamentally broken models in the first
place.
I mean, you know just the whole concept of a certificate and certificate provider. We've
been struggling with that since the dawn of time with regard to the Internet. Are your
folks going to do anything in that area about, well, what if the world were completely
different and we weren't dependent on these mechanisms we have?
>> Lorrie Cranor: I think we're starting to think along those lines as well. It's always
harder to think out of the box that way. But that is I think where I'm trying to get my
students to think and trying to come up with some of these more creative different types
of approaches.
>>: So in the list of things on your nutrition label are things that probably mean a lot to
users like my personal information, my financial. But there's also an element cookies.
What do they think that means?
>> Lorrie Cranor: Yeah, we actually are doing another study right now that's looking at
that. So the choice of what those terms are in the grid, you know, there were many
more P3P terms we could have thrown in there and we collapsed things down. And
what ended up staying were either concepts that were clearly important to people or if
we found a bunch of things that were very similar, we put them together. Cookies is
something that it seems like people who know about cookies, they want to know. Do
they have cookies or not. Seemed like it needed to be there.
We've been doing studies where we ask people what do you think a cookie is and
getting some very interesting responses. There is actually a large fraction of the
population that now knows what a cookie is. But it's not by any means like 90 percent or
anything like that.
And then when you start getting the things like third-party cookies or flash cookies, all
bets are off. They don't know at all.
>>: I think you're going to have more questions but let's thank Lorrie.