16824
>>: Well, good morning, everybody. Thanks for coming, and everybody streaming live, as well.
So it's my great pleasure today to actually welcome Deborah Estrin to MSR to spend the day with
us. And to most of us, like, Deborah actually needs no introduction. She's been -- she's a very
renowned researcher in networking and systems and beyond, and she started out I think in her
career doing Internet research, scalable multicast, robust [phonetic] transport. And I actually
remember, like, she stopped doing Internet research at some point and moved over to
sensors.
I interned with Sally, Sally Floyd, in 2000 -- I think 2000, and she would constantly lament that
Deborah has stopped doing Internet research, and by that time I think it had been five or six
years, but Sally could never get over it, that Deborah had stopped doing Internet research. Well,
anyway, in retrospect, it seems to have turned out really well for everybody involved, both
Deborah and us, who have seen like a stream of really good work come out of her group at UCLA
and USC. And along the way she has picked up a bunch of awards, too many to name, but the
ones that stand out for me are like the Anita Borg Institute Women of Vision Award for Innovation, the
Brilliant 10 by "Popular Science" and the WITI hall of fame in 2008.
So, without much ado, I hand it over to Deborah.
>> Deborah Estrin: Well, I stopped doing Internet research and then you can possibly decide by
the end of the talk or somewhere in the middle that I stopped doing research altogether, but
that's another story. One of the things I miss about not doing Internet research, one of the things
I missed from the very beginning was Sally, so the feeling has always been mutual.
So I do this work with lots of other people, some of whom are named up here, and I'm sure I
forgot some. And just to give you a little bit of background, I moved to UCLA in 2000, took
Jeremy with me, and before that I was at USC for 15 years. And around the time of moving to
UCLA, I applied for this Multidisciplinary Research Center from the National Science Foundation
that we called CENS, the Center for Embedded Networked Sensing, and the whole idea was to do
distributed sensing in the context of driving applications, because, as Jeremy will remember, we
were tired of finding out that we were making up and solving the wrong problems, because we
weren't working in the context of applications at the time.
So that was the beginning of a long perhaps slippery slope, or possibly a more positive term of
my moving increasingly and increasingly towards sort of doing what I hope are innovative or -- not
useless -- or useful applications, or both, and from that looking at technical problems that still
need solving.
And so CENS is UCLA, USC, UC Riverside, UC Merced, a tiny bit at Caltech, and we're in our
eighth year, so we only have two more years after this, and in fact those last two years will be
spent largely sort of transitioning and refocusing to what we're likely to do next. We did a lot of
things in the past on what people would call, but it's always a funny term, traditional wireless
sensor networks, as if wireless sensor networks were ever mature enough to be called traditional.
So we did applications in environmental engineering, particularly contaminated water transport, in
terrestrial ecology, in coastal marine biology and in seismology, seismology being really like the
granddaddy of distributed sensing. And work continues in those areas at CENS and we largely
expect that those will sort of fork off and be domain-specific activities.
For seismology, we introduced some new mechanisms to allow them to do wireless and online
sensing. In the other applications, they really didn't do much distributed sensing at all, and so
there were even greater impacts there in terms of their methodologies. What I'm likely to do and
we're likely to do with a bulk of CENS over these next few years and going off into the future is
what I'll talk to you about today, which is really leveraging a device that if you stretch you can call
it and think about it as a wireless sensor.
It's, again, having Jeremy here, sorry, I'm going to be -- because I talk while I think, or sometimes
think while I talk, you'll have to hide yourself to not have me continuously reminded. But he'll
remember back in the wireless sensor network days when we were -- the database people came
to distributed sensing and thought of distributed sensing or wireless sensor networks as a
database problem.
And the programming language people came to it and thought it was a programming language
problem and we came from networking and we thought of it as a networking problem. Of all
those, we were clearly I think the most wrong in the sense that the networking was the easy part
of distributed sensing.
It's probably really a statistics problem or a data fusion problem, and so forth. And in the process
of going through those -- building these applications and going through those discoveries and
building in the mechanisms that needed to be built in, one of the things that we clearly came to
understand was that mobility was key to any sort of economical form of sensing.
And I think one of the things that frustrated Jeremy, Lew Girod, who you might know, and I, who
stretched all the way back from Internet work and moved into distributed sensing is that in doing
that we lost all of that scale and economics that you have when you work on things related to the
Internet, because there's so much out there that's deployed. There's so much that you can
leverage. You do an innovation and it lights up a whole bunch of things.
And we started doing this work in sort of statically placed distributed sensors, there's just this
economics to it that every sensor you place costs something, even if it's cheap, to maintain it, to
calibrate it, every point of measurement. And so at every point of measurement, if it's a contact
sensor, is like getting -- it's measuring like a cubic centimeter of the globe, and so it's a -- just the
basic economics of it are a different situation.
It doesn't mean it's not worth doing. It just means that you want to do it where those data points
are really powerful, where they fill in some parameters in a model, where they tell you something
about the world, and that by and large our scientists wanted to go out, deploy distributed sensing,
study the hell out of a particular area of the globe and then pick up their stuff and go someplace
else and study it.
And so mobility happened in that it wasn't just a static sensor network that needed to live for a
whole year or 10 years without ever being touched. There are those, but that wasn't the bulk of
what the scientists wanted to do, so they moved in that portability sense. And then, within any
distributed sensing system, you still had this problem that statically placed sensors leave all this
space in between.
And, often, the whole problem is you don't actually know the spatial variability that you're trying to
measure, and so where you could, which was not in the soil below ground, but in water and
aboveground, the ability to move a sensor through the environment and take multiple
measurements, as opposed to being stuck in one place, had great economy and scalability to it.
So from that story of looking at mobile sensing, and from a random invitation from Nokia
Research Center to a workshop that they held back in 2005 and a project that they launched
called Sensor Planet, where they gave out a bunch of phones -- so instead of having to pay for
every mote that I then had to package and put out somewhere, suddenly I had 300 relatively
powerful mobile phones, and what could we do with these things?
So we started doing -- we started really with that question of what can we do with these things,
and from that perspective, being relatively application driven. So now what I want to tell you
about for a bulk of the talk is about some of these applications we've built and that we've been
using in an exploratory way, as very much compelled by this availability of the technology, that it's
not just that the infrastructure's out there in the cell towers and all of that, but that people are very
attached to their phones, carry them in their pockets and therefore the power of leveraging that
existing infrastructure and adopted technology is very compelling.
And, looking at problems that seem important to solve at this particular point in time on this
planet, and that's one of the excuses I have for being in academia. I don't do theoretical work. I
don't do particularly far-out work in terms of being very futuristic, because I tend to do things with
all sorts of components that exist. So one of the reasons I stay in academia is because I try to
work on problems that don't have any clear revenue stream.
So while some of that might change with some of these things, that's sort of a common theme.
So a lot of my slides will be talking about technology. Often, when people give research talks,
they're talking about what's difficult. And when I look at this sort of body of work or sets of
activities that we're doing, it looks to me like all things that are easy, not difficult, and that's part of
what I find so compelling and exciting about it, is so much is so easy to do.
I do think that there are some hard problems to be solved that follow on that, so I don't scare all
graduate students away, but really what I think is most exciting about it is how easy everything is.
And also, in sort of reflecting back on this set of slides -- and this is by no means a complete list.
This was really just from recall.
I just wrote down there are so many building blocks and precedents that come from this place.
Well, not this building, because I guess you weren't here for a lot of that time, but this place,
organizationally, so using GPS traces, spatial cloaking, radio as sensor, thinking of Victor's work,
reality sampling, map crunching, activity classification, many, many things.
So I don't know if you guys are in the audience, because I'm very bad with names and faces, but
lots of pieces that are very related. I will also use this sort of blue variable font, because I'm not
careful enough to get it the right size font, down at the bottom to talk about some of the research
challenges that underlie taking the applications to the next step.
So let me start by talking about three -- not start. I'll spend a fair amount of my time talking about
three types of applications, and I'll be fairly specific about them. Many of them have -- this is my,
like, mark for something being real these days, is do we have an active URL that represents that
project. It isn't just a static webpage. And many of them do and I'll try to be coordinated enough
to flip over to the webpage.
So the first class of applications I'll talk about are really the absolutely simplest and easiest in
terms of the underlying technology, and this is just that incredible thing that we know as the
geocoded image. It's just that thing that's possible on all smartphones now, is you take a
high-quality image of something, it's automatically geocoded, automatically timestamped and
automatically uploaded, the most mundane thing possible.
People do it all the time with Flickr and other photo repositories, and the idea here is just that
instead of having basically images be organized by who took them, okay, which is how people
tend to organize their photo albums now, think about a data campaign as a particular purpose for
which you want evidence. It's really just a make-a-case technology. And so the basic model of
operation here is that you have a -- it's a combination of mobile and web.
You have a downloadable app, just like that you get from the app store or the market or whatever
your favorite app type store is. You download that app to your phone, it has a very visible,
recognizable icon. You click on that icon. It has you do a prompted capture. It gives you an offer
of some tag information, some specified tags.
You capture it in basically a click, possibly tag it. It's uploaded and automatically then curated
and presented on a live website, along with everybody else who's contributing to the campaign.
So one of the first campaigns we did was with a class on campus, a group of kids who wanted to
do -- students who are part of a sustainability initiative on campus. And they were trying to make
a case on campus to put out more recycling bins.
And so it's a lovely set of images, I guess we are relatively close to lunch, where you can go and
basically the campaign that the students had was go around campus, every time you see a
garbage bin with recycled -- every time you see a garbage bin, take a picture of it, tag it whether it
has recyclable items in it or not. And they actually ended up creating a map. I mean, the map
was automatically created and they ended up writing a whole report that they gave to the facilities
office at UCLA about placement of garbage bins and overflowing garbage bins and a whole
bunch of other things.
So what's interesting about it is it's no computer vision, not a single difficult technical problem in
the way of it, not something for any one campaign that you would go to a lot of effort to do, right?
This is not something that's important enough for you to bother investing a lot of money in.
But, at this point, we have this thing that we call the Rapid Campaign Platform, and it's just
probably an hour or so of changing some scripts on the website and changing some code on your
downloadable app, mostly in terms of the visuals and the tags that are offered to you, and you
have yet a different campaign.
So a subset of the same class was looking at food waste in the dorms, where they have a
cafeteria plan. They're trying to get the students to take less food on their plate so that they
waste less food, and so here they're taking pictures of food as it goes back on the conveyor belt,
equally appetizing.
Also, just the same basic idea, right? Do something on the web that's there to automatically
curate, that indexes into the images, either by time or by spatially on a map or by tagged value.
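The automatic curation described here, indexing each upload by time, by place on a map, and by tag, can be sketched roughly as follows. This is only an illustration: the `Observation` record, the `curate` function, the sample uploads, and the choice to bucket locations by rounding coordinates to three decimal places are all assumptions, not the Rapid Campaign Platform's actual schema or code.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Observation:
    """One geocoded, timestamped, tagged image upload (hypothetical schema)."""
    image_id: str
    lat: float
    lon: float
    taken_at: datetime
    tag: str


def curate(observations):
    """Build three indexes over the uploads: by tag, by day, and by map cell."""
    by_tag = defaultdict(list)
    by_day = defaultdict(list)
    by_cell = defaultdict(list)
    for obs in observations:
        by_tag[obs.tag].append(obs.image_id)
        by_day[obs.taken_at.date().isoformat()].append(obs.image_id)
        # Assumed map bucketing: round to 3 decimal places (roughly 100 m).
        cell = (round(obs.lat, 3), round(obs.lon, 3))
        by_cell[cell].append(obs.image_id)
    return dict(by_tag), dict(by_day), dict(by_cell)


# Made-up sample uploads for a garbage-bin style campaign.
uploads = [
    Observation("img1", 34.0689, -118.4452, datetime(2009, 5, 4, 12, 0), "recyclables"),
    Observation("img2", 34.0689, -118.4452, datetime(2009, 5, 4, 12, 5), "no_recyclables"),
    Observation("img3", 34.0722, -118.4441, datetime(2009, 5, 5, 9, 30), "recyclables"),
]
by_tag, by_day, by_cell = curate(uploads)
```

Under these assumptions, a new campaign would only need a different tag vocabulary; the indexing itself stays the same.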
Now, I showed this to a guy who came to visit me on Friday who runs a bunch of the DWP, the
Department of Water and Power, the primary utility down in Los Angeles, and he
immediately wanted this for his employees who tool around town all the time, looking for things
having to do with wasted water use, broken electric utilities -- excuse me, broken electric meters,
things like that.
And now they write down reports, they come back, they enter into a database. So he wants to
just be able to generate quick campaigns for the top five or 10 things of the week that they're
looking for, and they go to that particular app. And the point is that the data is automatically,
because of those tagged values, geocoding, timestamping, automatically curated.
They can then do triage once a week or every morning, for that matter, where are the primary
problems, should they go out and allocate resources. Just very, very simple.
Now, all kinds of things you'd like to be able to do, a lot of work here for image analysis, computer
vision, some of which is zero research, just hard work, applying some of the existing techniques
that already are there, but others of it that is research, right? Category recognition in computer
vision is not a solved problem, by any means, and you'd love to actually be able to apply that
problem over time, not just get rid of bad images and give people data integrity feedback.
And then, as these things get successful, having campaign management tools of where you're
missing data, reminders and things like that. Okay, so that same story applies, as I sort of think
of as civic engagement, same story applies to citizen science.
I never really believed in citizen science. I thought citizen science, many activities in citizen
science, were acts of marketing, okay, and that there was something false about them, because
you would dumb down the doing of science enough to be able to get the public to engage, but the
data they would generate had no way of being validated or really used. And so it felt to me like
very false pretenses, sort of in the same way that sometimes when they try to teach kids
math with sort of cartoon-y simulation things that take away all the messiness of it. I have that
same ambivalence about it.
But using geocoded images in this way, as a way of citizens contributing data, is interesting
because it has an automated validation -- sets of validation points, because it's geocoded and
timestamped, because the image speaks for itself, even though it might be tagged incorrectly.
And so we've been working on a couple of different -- let me try this. A couple of different
applications in citizen science. Let me see if I manage to do this. Now, what am I supposed to
do that?
Okay, and I couldn't get on your open wi-fi. Not that one, this one. So these are all live URLs,
whatsinvasive.com. And so it was the basic idea of doing this citizen science thing, and we
started talking to the National Park Service, so LA, yes, it's environmentally a disaster zone and a
ridiculously designed city, barely a city, and all of those things, but we do have the Santa Monica
Mountains, which are nice for biking and hiking and those sorts of things, and they're an
important ecosystem and much of the park area is managed by the National Park Service.
And one of the things that they need to do to try to maintain the ecosystem, aside from trying to
keep it from always burning down or being developed over, is to manage invasive species.
Invasive species are a problem because they come in, they take up all the water and the nutrients
and they crowd out the local plants and then the rest of the ecosystem, the bees, the insects, et
cetera, change as a result.
There are many invasive species growing all over the Santa Monica Mountains that are already
so prevalent that there's nothing you could do about them. You'd have to eradicate everything to
get rid of them. But they have always the top six to 10 weeds, their top 10 most wanted list that
they're trying to spot early incursions of them so that they can eradicate them because they're still
at a manageable level.
Think about any kind of preventative disease problem. This is just an ecosystem disease. And
so we said, okay, that same thing we did for GarbageWatch -- first of all, we'll have some prettier
pictures than garbage and food waste, and we just went out to their facilities staff and said, "Here
are 10 phones, and here's how you use them." And we set up this website, and according to the
-- and we got from them their top six weeds, and we gave them tags. And then this is the map
that just gets automatically generated by them clicking on it. It's geocoded, it's timestamped, it's
tagged by the weed, and then you end up with this kind of a map in real time.
Now, normally, they do -- the last invasive species survey they did was three years ago. It took
like a year to collect the data and another year to get it online and process it and do whatever,
and now by the time they did that they were about ready to start the next one. We actually have
some nice plots that Mark did that sort of overlays this on top of their three years ago invasive
species data, from three years ago.
And they did this in 10 days, without any coercion from them. We just gave them the phones,
and just as they're going around as part of their daily work, patrolling the parklands, they would
go ahead and do this.
Now, what we want to do is also make it available to hikers and things like that, to really engage
citizen science. Interesting question, we almost were only looking to do this for sort of the public
and hikers, but these employees who are always out and around, whether it's the DWP's guys or
these guys, is actually an even more interesting target.
In the same way that you were talking about going to small enterprise before you go to the home,
I think that there is some really interesting market there, if you will, not necessarily revenue
market, but in terms of early adopters. Whoops.
So, to do all this, and as you were saying, as well, it's really not to publish papers that we are
trying to sort of develop the basic framework, just that we're doing lots of these. And you go and
talk to somebody and you have five new ideas every day about data campaigns that people want
to do. And so the things that have let us now do like a dozen and moving on, numbers of
campaigns, is just building a very simple, making a lot of use of the cloud.
We happen to store these images on Flickr and pull them back down, and then we have some
analytic stuff going on. And for the most part you get feedback on the web and then we do a
feedback on the phone, as well, particularly in terms of your participation, contributions. And lots
more to be done here, but this is all done using the same basic set of software and just modifying
some PHP and Python scripts and then a little bit on the either -- we're doing this mostly on
Android phones, or still we have some that are working on Symbian.
And we've done some of our work in Windows Mobile, but basically in this way I'm -- whoever
gives me phones, that's what I work on. I mean, the pragmatics of it, I can't do big campaigns if I
have to buy the phones, so if you give me phones, that's what I work on. Good phones.
>>: How many?
>> Deborah Estrin: Not less than 40. Okay, four phones don't do me any good. We have like
four or five Windows Mobile phones, and that's just about how -- so we did some parallel
implementations. Or you give me a customer base. So nobody is ever going to give me an
iPhone, but we are building an iPhone client for some of these things, because that's the only
smartphone that I can count on people having out there in the real world.
So this has sort of grown up into -- some of the things that is maybe not as completely mundane,
still very doable and where some of this is going, is tying this much more into all of the remote
sensing and GIS data and other both model-based and empirically based data that exists. So we
came up with this idea of sort of what we call OurPixel.
And the big picture is, if you think of every pixel of a global satellite image, to not only have the
reflectance data that it has now, as well as other sophisticated sensors that are going on
satellites, but also to encompass the local knowledge about the land, about the practices, about
what's going on. So we often would talk about how the tag line behind CENS was about spatial
resolution and that every pixel in a remote sensing image represents an average over many,
many kilometers, and so we wanted to get to something finer spatial resolution.
But when you think about what people can capture with mobile phones, it's not just about visible
spatial resolution. It's there are also multiple perspectives about what's going on on that piece of
land. If you think about even that invasive species story, and that's what we're sort of using as
some initial concrete fodder for this, we're -- there's a lot that you can use if you look at remote
sensing and use that as a guide to where you're doing data collection, but also you can then
support investigating places where there are interesting different perspectives, such as related to
land use.
So there's lots of controversy about building -- there's a school actually in my neighborhood that's been
held up through court proceedings with the local neighborhood for years now, because it's
claimed that it's going to have too much encroachment on the local wild land. Many issues
between neighbors and the park and neighbors and one another. And, if you think about it, the
ability to document what's going on in a particular place with local perspective is interesting.
And OurPixel does that, as well as tie together in a more integrated online way what's available
from remote sensing and GIS data with these local observations. And I wanted to mention this in
particular, once again, we're sort of building it out of as many existing and open-source
components as we can.
Thanks to the Jim Gray funding that we received from you folks, we're hoping to make some
significant progress on this, this summer in particular. It came just in time. Money made
available right before the summer is much more valuable than money made available --
particularly small amounts of money, you can do a lot, compressed time, in the summer, more so
than you can like in the fall or winter quarter.
And we also have a collaboration brewing with Conservation International. We actually did some
early work with them on using the mobiles in something called an EcoPDA for their teams of
semiprofessional data collectors that they hire in the tropics to monitor biodiversity. So instead of
them writing on Rite in the Rain paper, and coming back and entering that online, they actually put it
into a PDA and very similar to the rest of the story.
And, with Sandy Andelman, we're actually trying to get some funding to make some serious data
campaign happen in time to possibly demonstrate for the Copenhagen Climate Change
Conference in December, although that's getting a little iffy. The idea being that nations, similar
to Kyoto, having to do with carbon reduction, have also made commitments to maintaining
ecosystems in order to do the balance of CO2 generation and sequestration, but there's no way
in which -- they don't really have monitoring capacity for that, for those commitments that they've
made.
Moreover, there's a lot of interest these days -- every time you go and make a plane reservation,
they ask you if you want to pay for your offset. And so there are lots of schemes -- I will call them
schemes -- coming up as to how you can -- what those carbon offsets can pay for. And some
people have been trying to tie carbon offsets to poverty alleviation, so that instead of just paying a
government to plant a big forest, you pay subsistence farmers to plant things, but you want them
to plant the right things, and how in the hell are you going to monitor it?
The cost of monitoring that kind of a thing is incredible. So can we think of creating a kind of
challenge response, verifiable type of campaign, where with images over a course of time you
can verify that it's the right kind of tree in the right place?
Yes.
>>: I mean, it seems like if all of this is driven by volunteers or something, that would be kind of
like a self-selected set. So you'll get documentary evidence in only one direction of the debate.
Is there any way of getting around that?
>> Deborah Estrin: Yes, so, for example, TEAM, at Conservation International, pays people. It
might not be their full-time job, but they pay local populations to go out and do data capture. And
because they see where they've been and when they've been doing that data capture, they can
identify where they're getting coverage. And a lot of that is driven, then, by remote sensing,
GIS-based analysis of where they want to be.
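Identifying coverage from where and when collectors have been can be sketched as a comparison between the cells an experimental design asks for and the cells actually visited. The function name `coverage_gaps`, the integer grid-cell representation, and the `min_visits` parameter are hypothetical choices for illustration, not anything described in the talk.

```python
def coverage_gaps(target_cells, visits, min_visits=1):
    """Return the target cells that were visited fewer than min_visits times.

    target_cells: iterable of (row, col) grid cells chosen by the experimental design.
    visits: iterable of (row, col) cells, one entry per recorded data capture.
    """
    counts = {cell: 0 for cell in target_cells}
    for cell in visits:
        # Captures outside the designed sample are ignored for coverage purposes.
        if cell in counts:
            counts[cell] += 1
    return sorted(cell for cell, n in counts.items() if n < min_visits)


# Made-up example: three designed cells, four captures (one outside the design).
design = {(0, 0), (0, 1), (1, 0)}
captures = [(0, 0), (0, 0), (1, 0), (5, 5)]
missing_once = coverage_gaps(design, captures)
missing_twice = coverage_gaps(design, captures, min_visits=2)
```

A campaign manager could use a list like `missing_once` to decide where to send paid collectors next.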
You're not going to cover every cubic meter of the globe. You do experimental design, you
decide where you need data from, and so there's the notion of paid. In this context, where you're
talking about doing verification, auditing these mitigation activities, there it's tied to you're having
somebody plant something in a particular place. They're getting paid for your offset and this is
verification of that over time. It's not volunteers. It's going to be a requirement.
If it happens at all, it's going to be a requirement associated with receiving that offset funding.
And that question applies in many different contexts, so if you think about in the civic context that
I was mentioning before, you can't -- people will use it to make a case that's the case they want to
make, and you're going to be hearing from the people who complain, as opposed to the people
who don't.
But that's why sort of this DWP guy wanting to use it for his employees is an interesting
alternative model, because they sort of peruse the county as part of what they do, and then this
just becomes a very easy vehicle for them to do that.
Okay, so that was my story for lots of things we're doing with just the simplest of things to
do, which is geocoded images, that are automatically curated and automatically processed, that
have behind each of them -- I'd say the only difference is, is that there's a particular purpose, a
particular theme, a particular thing you're trying to do with it and every campaign is freshly
generated with that.
Okay, so now, second of this set of things I want to tell you about, are making use of a different,
as readily available, but a little more interesting processing to do, which is that wonderful thing,
the location trace.
Okay, so how we got to this was we were actually doing things with automated capture of images
for this nutritional, sort of dietician, epidemiology study that somebody was interested in where we
had these phones that were slung around our necks, a programmed, outward-facing camera
automatically capturing images every 10 seconds as a form of input to like a three-day dietary
assessment.
You're not Gordon Bell, you're not going to walk around like that all day and all the time, but if
somebody's doing a dietary assessment as part of a food elimination or what have you, instead of
filling out a three-day retrospective report, you would capture these images. They would be
curated. You'd get a sample for during your day. The individual would look at them and use that
to help to trigger their self report that they're filling out.
Why were they looking at them, as opposed to nutritionists? Because outward facing camera
taken throughout your day has some serious privacy issues with respect to not the person so
much, but everyone around them, including their half-dressed family running around while they're
at the breakfast table. Or, as Feng Zhao, the best thing he ever did for me was when he came to
a research review of mine and he enacted a story that I had told as an imagine-if, which is that he
walked in the men's room with his outward-facing camera on. And we were in real time, popping
up all the images. Just, anyway, it was great.
[laughter]
So where was I? I got off my tangent. I try to run all of these systems that we use, particularly
now that we're working in applications in this sort of urban sensing space that's easier for me to
be a guinea pig. So I was driving around -- driving around. I was going around with this camera
around my neck and, as Jeremy will know, first of all, I don't spend a lot of time at meals, right?
So all it would see maybe is every once in a while is a protein bar coming in or out of the image,
or it wouldn't be caught. Second of all, I spend a lot of time in my car, because I have a long
commute, so most of my images were of my steering wheel, every 10 seconds, and some protein
bars or fruit coming in and out of the image. Not particularly interesting.
And it was sort of distressing, and then I also decided that that technology, in general, while I
think it has its niche applications, just doesn't have a lot of scalability because the privacy stuff
about taking images all the time is just too much.
But the geolocation trace -- forget the images. My images were completely boring. But just my
GPS time series, and you guys had already been doing things with this before, but it wasn't until I
was doing it myself that I realized the power of that. So I started thinking about, okay, this GPS
time series, well, first of all, you do something, you sort of do in some sense it's just a mash-up on
steroids, or just a crunching together of maps. So, first of all, what happens when I just see my
GPS time series throughout the day? Okay, so that spatially tells me something, and it's
sometimes interesting to see just how repeatable your patterns are.
But I started thinking more about this as what can I infer from this GPS time series? So, of
course, as you know, and work done here long before and at Intel and many other places, from a
GPS time series you can do activity classification. So now I have an activity, time, location time
series that's easily collected throughout my day, throughout my week, throughout my year.
So that by itself is of interest for different things. I'll mention an application when I get over to
health. But what we did with it was build a system called PEIR, where it's this Personal
Environmental Impact Report, and so from that GPS location time series, we did coarse activity
classification, are you indoors, are you outdoors, are you driving on a highway, are you driving on
a surface street?
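
A minimal sketch of this kind of coarse classification from a GPS time series, assuming speed between consecutive fixes is the only feature. The `Fix` type, the thresholds and the label names are illustrative assumptions, not PEIR's actual classifier, and the indoor/outdoor distinction mentioned above is not attempted here.

```python
from dataclasses import dataclass
from math import radians, sin, cos, asin, sqrt


@dataclass
class Fix:
    t: float    # seconds since the start of the trace
    lat: float  # degrees
    lon: float  # degrees


def haversine_m(a, b):
    """Great-circle distance in meters between two fixes."""
    earth_radius_m = 6371000.0
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(h))


def classify(speed_mps):
    """Map a speed to a coarse label. Thresholds are hypothetical."""
    if speed_mps < 0.5:
        return "still"
    if speed_mps < 3.0:
        return "walking"
    if speed_mps < 20.0:
        return "surface_street"
    return "highway"


def label_trace(fixes):
    """Return (time, label) for each pair of consecutive fixes."""
    labels = []
    for prev, cur in zip(fixes, fixes[1:]):
        dt = cur.t - prev.t
        if dt <= 0:
            continue  # skip duplicate or out-of-order fixes
        labels.append((cur.t, classify(haversine_m(prev, cur) / dt)))
    return labels
```

A real classifier would smooth over several fixes and use map data to tell a congested highway from a surface street; speed alone cannot do that.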
And by using that location time series as an index into models that the Air Resources Board maintains
for things like air quality levels, PM2.5 is the most harmful particulates. I'm not measuring PM2.5.
It's not a personal measurement of that, because -- I won't go into that rant.
But the state of miniature sensors that are going to do high-quality particulate measurement is
far from where you would want it to be, but they maintain dynamic models of what the PM2.5
concentrations are, and it's based on climate. It's based on some coarse-grain static
measurements, big instruments that are out there, as well as some occasional drive-around maps
that they do with also significant instruments on the electric vehicles, as well as traffic reports.
So there is an exposure map for the area that's dynamic, and now I take my location trace and I
use that as an index into that map and I get an estimate of my PM2.5 exposure, just like I get an
estimate of a personal carbon calculator. So that's what we did with PEIR. We took this location
time series that is going up to a website. It's a private account on that website, activity annotated,
run through these models, and the models that have been most useful have been the CO2
calculator and the PM2.5.
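
The idea of using a location trace as an index into an exposure map can be sketched as follows. This assumes a static grid keyed by latitude/longitude cell, whereas the models described above are dynamic; the cell size, the grid values and the function names are all hypothetical.

```python
from math import floor

CELL_DEG = 0.01  # hypothetical grid resolution, in degrees


def cell_of(lat, lon):
    """Grid cell containing a point."""
    return (floor(lat / CELL_DEG), floor(lon / CELL_DEG))


def mean_exposure(trace, pm25_grid):
    """Time-weighted mean PM2.5 concentration along a trace.

    trace: list of (t_seconds, lat, lon), sorted by time.
    pm25_grid: dict mapping a cell to a modeled concentration
               (micrograms per cubic meter).
    Intervals that fall in cells the model does not cover are skipped.
    Returns None if no interval could be looked up.
    """
    weighted_sum = 0.0
    covered_time = 0.0
    for (t0, lat, lon), (t1, _, _) in zip(trace, trace[1:]):
        concentration = pm25_grid.get(cell_of(lat, lon))
        if concentration is None:
            continue
        dt = t1 - t0
        weighted_sum += concentration * dt
        covered_time += dt
    return weighted_sum / covered_time if covered_time else None
```

The result is an estimate from a model, not a personal measurement, which matches the caveat made above.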
We also did fast food exposure, how much time do you spend sort of still around places that have
many fast food establishments? And there are standard measures for that, because food
availability is one of the contributors to health disparities.
And we actually have a trial that was running with some high school students in the Bay Area in
San Francisco where they were doing -- these high school classes were competing against one
another as to who could go greener by changing their transportation modes. And instead of
doing it through self report, Nokia gave them a bunch of phones and I think AT&T gave them a
bunch of SIM cards and they were running PEIR on it and we made a special Facebook widget
for them because we didn't have the air quality model integrated for San Francisco.
But it's just a personal carbon calculator and it shows what their scores are relative to their
group average and they got a lot of good press and they're hoping to continue it.
So one of the interesting things here is that the processing pipeline isn't trivial. Not big research,
but serious software engineering, and, in fact, PEIR as we have it can handle a few thousand
people. I actually think it's an interesting tool for like a city to use, to say, "okay, this is commuting
optimization month." Run the PEIR tool. If you're going to choose to work at home a day a week
or ride share a couple days a week or do a bus commute one day a week, how might you shift
your schedule and which day you might do that to improve your overall impact on the
environment. There's things that you can imagine, that an employer might do that for their set of
employees.
Again, it's not something you run every day of every year, but if you want to do some analysis to
see how to optimize your commute around these things and raise awareness, it's an interesting
example. But, to do that, we're now this summer spending some time retooling the processing
pipeline. And, as you think of other models that you might want to run these things through, this
becomes a nice, challenging just sort of software engineering effort.
And, of course, many of the models you might want to use, as opposed to being nice live models
up on the web with good APIs and things like that, they're not. They're like chunks of FORTRAN
code sitting someplace that you have to figure out how to get up and live and be able to index
into, so that's even true for this [Air Resources] Board data that you have out there.
One of the derivative projects that sort of combines some of that geocoded imagery and some of
what you saw in PEIR is a project that came up through some of the students, and this is about
bicycle commuting, not the weekend cyclists who go out for exercise or weekend activities, but
trying to promote bicycle commuting in Los Angeles, which, if you know Los Angeles is a bit of a
challenge. Los Angeles is built for cars. It doesn't particularly have any buses. It has no trains
and it certainly isn't a great thing for bicyclists.
But, as gas prices for a while there were getting up to where they should be, people were starting
to look more and more at cycle commuting. And so the idea here is just you take your phone, it's
doing the GPS time series and it's timing your route from start to finish, but it's not just doing that.
It also has, because of an accelerometer, it's capturing a measure of the bumpiness of the ride
and it's also giving you back statistics on how much of your ride are you continuously moving,
versus stopped, assuming at an intersection breathing in exhaust fumes and probably more a
source of safety concerns.
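
The per-ride statistics described here could be computed along these lines. This is a sketch under stated assumptions: speed and vertical acceleration are sampled at a fixed period, bumpiness is taken to be the standard deviation of vertical acceleration, and the names and the stop threshold are hypothetical rather than taken from the actual application.

```python
from statistics import pstdev


def ride_summary(speeds_mps, accel_z, sample_period_s=1.0, stop_threshold_mps=0.5):
    """Summarize one ride from evenly sampled speed and vertical acceleration."""
    n = len(speeds_mps)
    moving = sum(1 for s in speeds_mps if s > stop_threshold_mps)
    return {
        "duration_s": n * sample_period_s,
        # Share of samples spent moving rather than stopped.
        "moving_fraction": moving / n if n else 0.0,
        # Spread of vertical acceleration as a rough bumpiness measure.
        "roughness": pstdev(accel_z) if accel_z else 0.0,
    }
```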
So the cycle commuting community, so far we just have a couple of users using it, but we're
hoping to do more this summer, particularly once we get the iPhone client, because that's the
only smartphone that people have in large numbers out there. It's an interesting -- I don't know if
this is the most up-to-date URL, but I'll get it to you.
It's an interesting application that makes use of both location time series and makes a couple of
inferences out of that, not just duration of ride, but, as I said, quality of ride. So the last set of
applications I want to talk about are in the context of health and wellness. And one of them is
basically the same thing as, if you will, as PEIR or as Biketastic is using -- I did not come up with
my name, but I showed my, whatever, maturity as a boss by accepting that that name be used,
even though I can barely get myself to say it.
So this was actually -- this is just about using that location trace in other sorts of contexts. What
does your mobility trace tell you if you're managing some kind of chronic disease? So that
chronic disease might be aging, or that chronic disease might be Parkinson's or MS, or muscular
dystrophy or many other diseases that introduce not just neuromuscular -- have neuromuscular
impact, but also just diseases that introduce fatigue, either the disease itself or the treatment.
And so ambulation, which is a term -- so I was giving some talk about some of this stuff and
somebody at the NIH put me in touch with somebody who does a lot of IT work for the Muscular
Dystrophy Foundations. And they explained that their kids who have muscular dystrophy, the
way they're evaluated is that they go into the doctor's office once a year and do a six-minute walk
and that's the evaluation. Can you walk for six minutes in the doctor's office?
And so the woman who heard the talk said, "It seems like it would be relevant to this guy Kyle's
community. Why don't they run this kind of activity classification?" Then they can have, even if
they don't need it 365 days a year, even if you do it for a week once a month, or two weeks,
twice a year, you get a picture that's a much better sample of what your actual life looks like and
what your mobility patterns look like.
So this is just doing GPS plus accelerometer, to do activity classification of walking versus slow
driving and other things. Accelerometer helps a tremendous amount. We have a lot more work
to do to perfect the activity classifier, and for example they want to be able to do wheelchair
versus car versus walking. They want to be able to do steps versus not, and there's been lots of
prior art [phonetic] in that area.
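
A small illustration of why the accelerometer helps: at low speeds GPS alone cannot separate walking from slow driving, but walking produces much larger swings in acceleration. The sketch below assumes a short window of acceleration magnitudes and a mean GPS speed; both thresholds are hypothetical and would need tuning on real data.

```python
from statistics import pvariance

WALK_ACCEL_VAR = 1.0   # hypothetical threshold, (m/s^2)^2
WALK_MAX_SPEED = 3.0   # hypothetical upper bound on walking speed, m/s
MOVING_MIN_SPEED = 0.5  # hypothetical lower bound on "moving", m/s


def classify_window(mean_speed_mps, accel_magnitudes):
    """Label one window as walking, vehicle, or still."""
    variance = pvariance(accel_magnitudes)
    if variance >= WALK_ACCEL_VAR and mean_speed_mps < WALK_MAX_SPEED:
        return "walking"   # slow, with strong step-like motion
    if mean_speed_mps >= MOVING_MIN_SPEED:
        return "vehicle"   # moving, but the phone itself is steady
    return "still"
```

Distinguishing wheelchair from car, or counting steps, would need more features than this two-threshold rule.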
But it's just an example of the very simple application of the same technology. Now, in some
cases, it might be this kind of diagnostic do it for weeks. In other cases, it might really be
something that because it's so easy to be passively collecting it, you might just want to do it all
the time. Because then, as you're looking at shifts in medications, or how was I relative to where
I was last year, you can go back and look at your mobility statistics.
And there are indicators also that might generate sort of events. For example, one of the
indicators for these populations is when people become less confident in their mobility, they stop
going out as much. And it's not something you announce to anybody. It's not something that
happens over the course of a day, it's just something that sort of gradually sneaks up on you, the
same thing as something that I noticed in my aging parents.
When I found out like that six months before they'd stopped cooking at home, that they were
always taking their main sort of meal outside the home, and it's just an indicator of something. It
has dietary and sort of nutrition implications, and it's an interesting indicator, and it's something
that their location trace would tell me, or tell them over time. I'll come back to the privacy issue.
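
One way to surface this kind of gradual change is to fit a trend line to a weekly summary of the location trace. The sketch assumes the trace has already been reduced to hours spent away from home per week; the function names and the alert threshold are hypothetical.

```python
def weekly_trend(hours_out_per_week):
    """Least-squares slope, in hours per week, of time spent away from home."""
    n = len(hours_out_per_week)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(hours_out_per_week) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(hours_out_per_week))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def going_out_less(hours_out_per_week, min_drop_per_week=0.5):
    """True when the trend falls by at least min_drop_per_week hours each week."""
    return weekly_trend(hours_out_per_week) <= -min_drop_per_week
```

A slope over many weeks picks up a slow decline that no single day would reveal.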
And then the other sort of dominant mode in which we're doing some initial health applications is
a combination of some of that location trace, but it's going back to what I started with, which is the
geocoded image. But added to the geocoded image is something even more mundane than the
image, which is asking you a few questions. So the term ecological momentary assessment is
what some of the social scientists use.
You guys call it -- is it you guys who call it reality sampling? Is that you, or am I confusing you
with Intel? Probably, sorry.
So reality sampling is another term. Robert Wood Johnson Foundation calls it observations of
daily living, which is actually I think the nicest of the terms that I've heard. And this is, again,
when you have that nice really good quality displays that we have now, touchscreens and such.
It's really easy to add sort of very quick sets of questions.
And so this application, which we call generally -- AndWellness, this particular instantiation of it is
just -- this is instead of that automated image capture around the neck, right, with all its issues?
This is this location and time-triggered query about what you're eating.
Okay, now, all that's happening here is that that geocoded image is being tagged with some quick
information about what the health folks told us to ask, which is how full are you? Is it good
quality? Were you on or off plan? Because it doesn't matter what diet you're on. It only matters
if you stick to it, and what was the duration, and who are you eating with, possibly.
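
A location and time-triggered query could be decided by a check like the one below. This is a sketch, assuming the trigger fires either inside a configured meal window or near a place previously tagged as somewhere the person eats; the question list paraphrases the ones mentioned above, and every name and default here is hypothetical.

```python
from math import radians, cos, sqrt

# Paraphrased from the questions listed in the talk.
PROMPT_QUESTIONS = [
    "How full are you?",
    "Is it good quality?",
    "Were you on or off plan?",
    "How long did the meal last?",
    "Who are you eating with?",
]


def distance_m(lat1, lon1, lat2, lon2):
    """Equirectangular approximation; adequate over a few hundred meters."""
    x = radians(lon2 - lon1) * cos(radians((lat1 + lat2) / 2))
    y = radians(lat2 - lat1)
    return 6371000.0 * sqrt(x * x + y * y)


def should_prompt(hour, lat, lon, meal_hours, eating_places, radius_m=50.0):
    """Prompt during a meal window or near a known eating place.

    meal_hours: list of (start_hour, end_hour) windows.
    eating_places: list of (lat, lon) points.
    """
    in_window = any(start <= hour < end for start, end in meal_hours)
    near_place = any(
        distance_m(lat, lon, plat, plon) <= radius_m for plat, plon in eating_places
    )
    return in_window or near_place
```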
Yes.
>>: Is there anything about people eating better when they're forced to take photos every time
they eat?
>> Deborah Estrin: So, first of all, in case anyone from my IRB ever listens to this recording, you
will know that this is a technical trial, and we were not studying anything about human behavior,
nor would we ever.
And then, more seriously, we haven't had the numbers to know whether reactivity acts [phonetic]
comes into play. So for the epidemiologist, that's like a huge issue, and for the health behavior
people, that's great. That's part of what you want, and so we have like a zillion different versions
of this into NIH with our health applications folks and our Robert Wood Johnson Foundation,
because we'd love to be able to answer those questions in terms of the reactivity and then most
likely that reactivity starts to fall off, as does participation.
But there are all kinds of fun things you can do with this, because this is the device that's in
somebody's pocket. You can do location reminders. You can do different incentive things. You
can do social networking things, and it just all is just a small matter of really probably very simple
programming, but not so simple user interface design.
Yes.
>>: So is taking the picture just a trigger, because you don't have to take a picture.
>> Deborah Estrin: No, you don't. And, in fact, the prompt doesn't make you take a picture. It's
just one possibility. And sometimes people find it interesting, because it helps them remember
and it tells you something about portion and things like that, and sometimes you just don't bother.
Now, this could also be about medication adherence. And, again, it's just that it's very easy to
use and it's automatically curated and in real time. And the point is to be able to go back and see
your trends over time.
This is automatically generated for you, go and see where are you off plan. It's not going to
automatically change your behavior. There's a little reactivity. More than that, it's to say where
do I tend to be off plan and then maybe talk to a diet coach or whatever about, okay, what can I
do to try to change that behavior, one little step at a time, because that's how behaviors change?
So if the problem is eating -- Mary Jane Rotheram-Borus, who runs this big NIH-funded center at
UCLA, I like her example best. She explains that at 10:00 she eats ice cream, but if she
remembers to buy strawberries on her way home and eats strawberries at 9:00, she doesn't eat
ice cream at 10:00. That kind of thing, the Powerbar at 10:00, instead of the pastries off the office
cart at 11:00.
And so we've done some -- I actually hired my first user interface sort of person from actually
somebody who graduated from the DMA, Design Media Arts Program at UCLA. So, here, if
distributed sensing was a statistics problem, in this domain it may be that everything is a user
interface, user experience design problem in the end, and lots of questions about how you chunk
this stuff off to make it useful, incentives and feedback and customization.
And so we just put in a proposal to the Robert Wood Johnson Foundation, as I said, they call this
observations of daily living, and they wanted people to target populations that have two diseases.
So we did hypertension and diabetes that often occur in common, usually along with obesity, and
there we have physiological measures, so blood pressure, glucose. It might be Bluetooth, but it
could just be a regular, standard device that you then enter in whatever the reading was or take a
picture of it, what have you.
There are then patient reporting of things that there aren't measures for yet or we don't expect
them to be in the near future. So medication adherence, yes, people are coming out with the
automated pillboxes and all of that.
But in the meantime, you can also ask people to take pictures of that. Do any of you like spread
your vitamins out into a Monday through Sunday pillbox kind of a thing? Just think about taking
pictures of that, really mundane, cheap ways of doing similar kinds of reporting about adherence,
about symptoms. So one of the biggest problems in those disease populations and with chronic
diseases is this actually adhering to medication, taking your medication.
There's like a 60 percent drop-off rate in hypertension cases, because the disease is relatively
invisible next to the symptoms -- next to the side effects of the medication. The medication can
cause dizziness and all kinds of things that just make you feel crappy, and the disease at that
day-to-day level is relatively invisible.
So you really want to allow people to titrate their medications and get it down to the right dose for
them, but it's difficult because there are delays in the system and it really requires this kind of a
monitoring to be able to make that correlation. And so the idea is if people could do much -- not
all the time, but as they're trying to get their medication to the right level and then it's not
something that's steady, right?
As people's situation changes, there are either -- their disease can change or other things can
change that make their dosage change. Whenever you do one of those shifts, you go through
this period of trying to record your symptoms as you're adjusting your medication. And this along
with location traces, because somebody can talk about fatigue, or you can ask somebody
retrospectively in their monthly visit to the doctor how have they been feeling?
But how have they been feeling is all through how they're feeling right now, whereas if you just go
look at their location traces, even if it's just every Monday or twice a week or it could be all the
time, you can see fatigue and depression and changes in fatigue and depression just in when
people get up and go to work and how much they go outside and what they do on the weekend,
which is something that is a good segue.
So all of these things have this common kind of framework, data capture, processing. It might be
on the mobile, might be on the website. It might be back and forth, depending on how much
you're training it or not, and then applications based on models and archive data.
This isn't something that I want to run just on my phone, because I want to be able to go back
and look at data from a year ago and look at trends and those sorts of things. So even if my
phone gets very, very powerful, the web is an important component of this.
So, finally, I just want to end the talk by talking about some of the architectural opportunities --
architecture is a big word, some of the things that are common across these civic applications,
citizen science applications, sort of personal sustainability, transportation applications and health.
And in particular, as I was just starting to say, there really is a common workflow that we see
here. And so this should lend itself to middleware APIs, whatever the hell you want to call it, and
certainly lots of uses of things like activity classification modules, something that is definitely
something we don't do well. That is at least some very serious design and development and
application of machine learning.
I don't expect that it's necessarily research in machine learning in terms of needing new machine
learning algorithms, but making activity classification something very personalizable, so that I can
increase its accuracy. It's just calling out for some serious attention. So in the same way that
you used to -- I'm using all kinds of politically incorrect analogies here, but what the hell.
So when you open up your Palm Pilot and start up Graffiti, you would click -- are any of you old
enough to know this? You would click on the four corners of the screen and in the middle, right,
to do the calibration to your Touch. A better analogy is voice-recognition software, where you go
through some scripted phrases to have it adjust to your particular accent and pronunciation. And
you really want your phone to do the same thing.
And so Hossein Falaki will be here this summer, and you know it's a good thing when an adviser
who's only been working with the guy for a year quotes their graduate student. So Hossein has
this line that he says, a month or year after you've gotten your phone, it shouldn't behave the
same way as it did when you pulled it out of the box, right? It should adapt to you. And that's
true in terms of power management, and it's clearly true in terms of the kinds of activity
classification.
So, like, activity classifiers that it used for PEIR or Biketastic or what have you, as I said to my
students many times, if it ever classifies me on a bicycle -- bicycle shouldn't be even in the realm
of possibilities. My only relationship with bicycles is that I try to avoid them when I'm running, but
I don't go on bicycles.
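
The simplest form of this personalization is to remove activities a person never does from the classifier's output and renormalize what is left. The sketch assumes a generic classifier that returns a score per activity; the function name and the score format are hypothetical.

```python
def personalize(scores, never):
    """Drop activities the user never does and renormalize the rest.

    scores: dict mapping an activity label to a non-negative score.
    never: set of labels to exclude for this user.
    """
    allowed = {label: s for label, s in scores.items() if label not in never}
    total = sum(allowed.values())
    if total == 0:
        return allowed  # nothing left to renormalize
    return {label: s / total for label, s in allowed.items()}
```

For example, a window scored as `{"bicycle": 0.5, "running": 0.3, "driving": 0.2}` becomes a clear "running" once "bicycle" is ruled out for that user.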
So how much better can my classifier be if you just know that I'm never on a bicycle? So simple
things, as well as more complex. There's a lot of role for common modules, mechanisms,
algorithms and such in that context. But perhaps the most challenging of these things, or
challenging of the commonalities across these applications has to do with privacy, again,
something that was identified here early, in all the work that was done here on spatial
cloaking.
And is here when we're talking about on a -- and all of that was like way before everyone had
these GPS devices in their pockets all the time. And here we're talking about applications where
it's not just that you can go and subpoena information from the cell phone provider, but in fact
people, for their own use, or getting discounts on insurance, for being part of some social
networking application, what have you, will be archiving information continuously or at least for
significant periods of time about their location traces.
And so many potential consequences -- we all have visceral reactions. Some people have a
visceral reaction, which is that I don't -- I have nothing to hide, there's no problem. Most people
who have that reaction will though agree to the fact that just because they feel that way doesn't
mean everybody should feel that way.
And so if you look at things like GINA, which is the legislation that said that you can't be
discriminated against based on genetic data that might be available about you, that was put out
there in part because even if somebody chooses to make their information available, they
shouldn't be able to set their precedent that other people should be coerced into making that
information available and be discriminated against accordingly.
And the same kind of thing can play out about your location trace. There is no information that is
so easily collected and that tells so much as one's location trace, right? Images all around town,
all around a city, all around an airport, are still relatively difficult to process in large numbers. It's
a much more technically challenging job, and they're not always with the person.
This data is so easy to mine, and so consequences having to do with, like, location-based
discrimination, okay, being denied insurance because you spent five years of your life living within
300 meters of a major highway, which is later associated with a threefold increase in getting
something. Or, for that matter, not all of this requires location traces. It could just be based
on your address.
So, for example, if you live in the Long Beach or San Pedro Port areas, you're exposed to
particulate matter levels that are three orders of magnitude higher than when you're sitting on the
101 Freeway at a crowded time. Not three times, three orders of magnitude, because of all the
port truck traffic that comes in and out of the port area. So many issues having to do with health
that could end up relating to that kind of discrimination, but coming to things that are nearer at
hand. Just think about those professional boundaries, and even those personal boundaries, that
have to do with complete accountability about where you are and when you're there.
Some of them are just sort of in the category of trivial, tiny white lies. Some of them are about
how we manage our busy lives and juggle all the aspects of it, that notion that I can now be
accountable for where I am at every time is a change. That DWP manager who I talked to who I
had this great conversation with and immediately wanted to go use this application was also very
excited that he would now have complete information about where his workers were actually all
spending their time all the time.
So we have been looking at this question of how can we build into the architecture to give
individuals some control over their time-space accountability, to still want to be able to share it,
right? The answer of just never sharing isn't interesting. The problem is that people do want to
share subsets of this information, and as you guys showed many, many years ago with spatial
cloaking, you can't just make location trace data anonymized, because you lose so much of your
data.
So we've been looking at an idea of using something that we call a personal data vault. Recently
have been talking to Monica Lam at Stanford and she has something that she calls a Butler,
which I told her that I thought was really strange language, and then she reminded me that we
talk about master-slave all the time, so why was I giving her this hard time about using a
politically incorrect term?
If anyone could give me a good answer to that, I owe her the next retort in the e-mail, so I would
appreciate it. But so their Butler concept is a similar one. That's a big project they have, Monica
and Nick McKeown, and the idea being that my location trace should be owned by me and in its
raw form should be available to me historically. Again, I don't want to just keep it on my phone,
because I want it available for this kind of analysis, but I don't want to necessarily -- I want the
ability to store it independent of anybody who has commercial interest in mining it.
Okay, so this is something we started talking about before Latitude was announced. But in some
sense it's a no, let's not have that be the only way in which you can go about things, in which you
are sharing your location traces directly with a third party, who has, even if not anything at this
current moment -- will in the future or could in the future -- have an interest in mining that data.
So I want my raw data, my raw personal data streams, coming into a private data store, and then
the hard part becomes actually negotiating to whom I release those streams, what statistics am I
going to release? I can reduce the spatial resolution. I can provide it for certain periods of time of
day and so forth. And so we're building with Iram Eshkavindin [phonetic] is a collaborator on this,
and we're also working with a legal scholar colleague, Jerry Kang at UCLA, on what kind of
protection we might be able to have.
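
The two release controls mentioned here, reducing spatial resolution and limiting the time of day, can be sketched as a filter applied before any data leaves the private store. The function name, the tuple layout and the defaults are hypothetical; this is not the vault's actual interface.

```python
def release(trace, decimals=2, allowed_hours=(9, 17)):
    """Return a filtered copy of a trace for one third party.

    trace: list of (hour_of_day, lat, lon).
    decimals: coordinate precision to release; 2 decimal places is
              roughly one kilometer of latitude.
    allowed_hours: (start, end) window; fixes outside it are withheld.
    """
    start, end = allowed_hours
    released = []
    for hour, lat, lon in trace:
        if start <= hour < end:
            released.append((hour, round(lat, decimals), round(lon, decimals)))
    return released
```

The raw trace stays in the vault; only the coarsened, time-limited copy is handed over.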
Ideally, you'd like sort of this to be subject to privilege. In the same way that your doctor can't be
subpoenaed to give out information on what you've told them because there's privilege in that
relationship, I would like it to be that this sort of passive data that I'm capturing about myself is not
automatically necessarily subpoenable, as well.
So many precedents for this. Some aspects of this are just very simple. I'm not talking about a
physically centralized vault. This would be something that could live out there on the web with all
kinds of existing encryption mechanisms and such to protect it. And of course the hard problem
has to do with when you start to release information, and so this is all just very much work in
progress and some of what we're trying to support then is not so much trying to make sure that I
know what happens to the data that I release, and that I've already released, but giving me back
enough diagnostic information over time so that I can decide what kinds of data to release in the
future.
So if I think about these data streams as sort of sources or spigots that I turn off or direct or
release filtered versions of them, the data I've already released, okay, I've already released it.
But how can I get back information from the system so that I can make better decisions in the
future? What is a better decision? It might be sort of times at which I choose not to release
information. It might be down-coding that data.
And so one of the mechanisms we're playing with is the notion of something called a trace audit,
sort of like traceroute. Just like traceroute, people can lie when they're responding to your
request for a traceroute, but you can imagine third-party services that would agree as part of
their label as running responses to requests about where your data has been released and when.
And, of course, they can lie and be fraudulent, but you have all kinds of legal recourse in that
case.
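
As a rough illustration of the trace audit idea, suppose each party answers the question "to whom did you forward this stream?" and the audit follows those answers outward from the first recipient. The data structure and function below are hypothetical, and the sketch simply trusts the self-reported answers, which is exactly the limitation noted above.

```python
from collections import deque


def trace_audit(first_recipient, reported_forwards):
    """List every party the data reached, according to self-reports.

    reported_forwards: dict mapping a party to the list of parties it
                       says it forwarded the data to.
    Returns parties in the order they are discovered (breadth-first).
    """
    reached = [first_recipient]
    queue = deque([first_recipient])
    while queue:
        party = queue.popleft()
        for downstream in reported_forwards.get(party, []):
            if downstream not in reached:  # also guards against cycles
                reached.append(downstream)
                queue.append(downstream)
    return reached
```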
This is just the idea of starting to bring some instrumentation back into the process so that when
you release your data streams to particular third parties to be passed on, you have some
feedback as to where that data is going. And the hard problems there are then having to do with
making that data you get back be something that anyone can do anything with, that anyone will
pay any attention to, that you know how to actually process. So, as is common knowledge now,
the problems with security and computer security has to do with the fact of the user interface, to
setting of those security mechanisms, to checking of them, to verifying of them. And so here we
have all those problems in spades, as well.
And, again, so many of these problems seem to become user interface design problems and
good data visualization. So just to give you an example, when we designed PEIR, in many ways
PEIR is designed in the wrong way, and one of the things we're doing this summer, because my
location trace goes directly up to a private account in the PEIR system.
So this summer we're pulling it apart, we're putting in an instantiation of a personal data vault and
then I will release streams of my data to PEIR, which is sort of that model we'd like to have for
that model of indirection to third parties. But something we did design into PEIR from earlier is
the ability to go and see automatically what computation led to the inference that the system is
making.
And so it's that kind of interface that you design into your system so that it is self-reporting as to
how particular inferences are made that seems important to design into this kind of system, as
well. And so privacy is one of the -- and these privacy-supporting architectures is one of the
areas that I think is most important for us to do, particularly in the university setting, where we're
not as -- we are not -- there are a lot of commercial applications coming out and there's lots of
reasons to mine the data. And so I'm not expecting somebody who's in the business of
optimizing for profit to necessarily take a look at this question about how should we introduce
this level of indirection into the problem.
But I think it behooves us all in the end to at least look into providing an alternative, more
privacy-preserving way of going about doing these types of applications. There will still be plenty
of commercial services to be offered off of this, but this seems to be a more responsible way of
building the applications.
So just to finish up, I talked about a wide range of applications, relatively mundane components of
technology that go into them, and hopefully these applications will be commonly moved forward,
because there are so many common components to them and some issues, such as privacy, that
really pervade all of them. And so just to finish up, this is very much a case of if you can't go to
the field with the sensor you want, go with the sensor you have, and about really leveraging that
combined power of the Internet and the mobile device.
And, with that, I'm done. And I opened up a bunch of other URLs here, if anyone's interested.
So, ask me some questions or I'll be terribly insulted.
Yes.
>>: Some of the applications are immediately useful to the first person that uses them, whereas
others you have to sort of get over this effect.
>> Deborah Estrin: That's my line.
>>: I'm wondering what your experiences have been, trying to advise people to get over that.
>> Deborah Estrin: Good, good, good. So, not a single one of them depends on large numbers,
because, as you were still around then, there's sort of this -- along with being traumatized from
work in distributed sensing because of the actual cost of each point of measurement, it's related
to it is that's sort of traumatized by it's just so slow to get out there and deploy.
And so I didn't want applications where I had to imagine that I had 3 percent, 5 percent or 10
percent penetration in a particular area before it became useful. So, for example, one of the first
applications we did with GPS time series was doing traffic monitoring on side streets that aren't
instrumented, because you can look at the progression of the GPS and you can get an indication
of how traffic is moving.
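As a rough illustration of the idea just described, the sketch below estimates speeds from a GPS time series and flags slow-moving traffic. It is not the code used in the project; the function names, the trace format (timestamp in seconds, latitude, longitude) and the 15 km/h threshold are all assumptions made for the example.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def segment_speeds_kmh(trace):
    """Speed in km/h between consecutive fixes.

    trace: list of (t_seconds, lat, lon) tuples sorted by time (assumed format).
    """
    speeds = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(trace, trace[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order fixes
        speeds.append(haversine_m(la0, lo0, la1, lo1) / dt * 3.6)
    return speeds

def looks_congested(trace, threshold_kmh=15.0):
    """True if the middle value of the sorted segment speeds is below the threshold.

    The threshold is a hypothetical value chosen only for illustration.
    """
    speeds = segment_speeds_kmh(trace)
    if not speeds:
        return False
    middle = sorted(speeds)[len(speeds) // 2]  # upper median for even lengths
    return middle < threshold_kmh
```

A single phone's trace is enough to run this, which is the point about the prototype being easy to build; the hard part is getting enough traces on enough side streets.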
And you can go and build a little version of that application. It's all very easy to do, but then,
okay, so how was I going to get some percentage penetration around Los Angeles so that there
would be enough people out there on side streets who were generating this information. That
seemed like a good thing for a commercial endeavor to go do, which is now what's happening,
where you have the Millennium Project.
I did not have the creative idea, nor the chutzpah, to go and rent 1,000 rental cars and have
people -- right, the Millennium Project that Alex Bayen from Berkeley did with Nokia, which is he
rented 100, and then 1,000, I think, rental cars and had students drive up and down the highways
there to show -- because he also does this very complex modeling. Very cool work, but that's an
example of something that doesn't scale down.
So each of these applications, I was rather stuck or obsessed with the notion of choosing
applications that scale down. So PEIR is an example of an application that scales down. I'm using
it all by myself, nobody else using it. I'm using my location trace as an index into existing models.
I'm not depending on data that's being contributed to create the model or the data set. This map
was created by 10 people who happened to already be going around the park area, because
that's part of what they do in their every day.
Some of the other applications that will happen in the future could be, if this stuff is widely
adopted, because it will be, but all of these campaigns are in some sense all about that, making it
be very narrow, sort of narrow-band in that sense. So you don't have to have large populations to
have some chance that it will work.
So they're all intended that they scale down. They don't just scale up. If something scales down,
then hopefully it can build up to numbers that eventually it scales up. The AndWellness, the
wellness applications are all very individually focused as opposed to collecting data from people
to build some model.
I think building those bigger models and the epidemiology story and all of that will all be
tremendously exciting, but I can't figure out how to incrementally sort of get there. Yeah.
>>: One will use the scaled number [phonetic] version that I use on my data now. Why do we
need to share it with our [inaudible] or with our data board and have it audited and [inaudible].
>> Deborah Estrin: Good, good. Yes, yes. So just keeping it on your device is just a reliability
issue. I want it off my device. There are two reasons. The very minimal reason is reliability. So I
don't know if you've ever dropped or lost your -- okay, so then you say not on my device, but
right.
So my home computer has the same problem as my device, right? And if I'm talking about
wanting to have and having health applications that build up more seriously over time, I
really want it sitting someplace in the cloud.
>>: [inaudible].
>> Deborah Estrin: Exactly. So, at some level, this is just institutionalizing that. It's just an
encrypted thing in the cloud. At a minimum, that's all the data vault is. However, a lot of these
applications, even though they're not about aggregation, they are about services. So I want to
run, I want to be a part of -- one of the interesting things about this Robert Wood Johnson
Foundation proposal we submitted, we actually submitted it with a clinic that's in South Central.
We went to them and talked to them about this mobile personal sensing and all of this stuff and
this feedback about looking at your own health behavior. And they sort of looked at us and said,
we get it, but for our populations there's not this sense of what they referred to as sort of agency,
of I'm going to do my self monitoring and change my health behavior.
These are people who are struggling in many respects. This isn't their -- that's just not culturally
how they work, so they have community health workers that work with subsets of the people
there to help them do behavior change. They go, and they're that sort of -- they fill this huge gap
that exists between the clinician and actually doing the kinds of health behavior that they need to
do in order to change their health situation.
So you want to be running a coaching application, a health monitoring application, tie into a social
network or set of people who are going to want to help you do that behavior change, whether it's
an Alcoholics Anonymous or whatever your analogy might be, Weight Watchers, what have you.
So if you're never doing any sharing, you just need a private container in the cloud. But I think
the power of these things will come through sharing.
Some of it will be through those sorts of things. Some of it will be through things we might
consider more coercive or just simply capitalism at work, which is that I'll get discounts on my
insurance or rebates on my health insurance or what have you. The city officer in New York
wants to make it mandatory to do a certain level of monitoring and reporting, because he sees it
as the only way that he's going to keep health.
I mean, if we just dealt with obesity in this country, there is a whole set of follow-on conditions
-- hypertension, diabetes, cancer. If you think about the impact on the health
care system of just dealing with that, there are reasons why if one was into Draconian measures,
that would be the place to put it. So long answer.
>>: Think about the space of applications that are possibly in this architecture. So the ones you
describe seem to have this flavor that this personal sensing lets you do certain things way more
efficiently than would be otherwise possible. I wonder if there are a second class of applications
that would actually not even be possible without forward [phonetic] personal sensing or not. Is
that a distinction there in your mind?
>> Deborah Estrin: I'm glad you asked that question in part because I should have said -- there
are people who do -- what's the guy, Seth Roberts or something? That self analyzing -- all of
these people who obsessively document -- who document all of their behaviors on websites, this
self-analyzed -- come on, hasn't anyone been to these sites? There's some better term than self
analyzed, the self-examined life, something like that.
And so he's a really interesting guy, I think a stat professor at Berkeley, who started off, he had a
whole set of health conditions and he started doing this very, very detailed measurement of what
he ate and what soap he used, and started tracking all these things and just using the web as a
vehicle for that and for sharing and things of that sort.
So, clearly, there are a set of these things that if you are focused on it enough, that you can do
without the device, that the device, as you said, makes easier. I think in some contexts it makes
it easier to the point at which it wouldn't have been done. For many people, if you take into
account human behavior, they simply wouldn't have done it.
Now, the case still has to be made whether they'll do it this way either, right? It's too early to
know that. But it has a -- this is not quantitative or systematic, but it has a feel to it because
people carry these devices around with them, that for some people it might be that qualitative
difference, not just the quantitative.
In general, this area of like retrospective self report, switching retrospective self report to
in-the-moment self report, it's still self report, I can still lie, but the nature of bias and lying that
happens in in-the-moment self report versus retrospective self report is very different for many
different populations of people.
I'm sure there are people who are completely honest and have perfect recollection in both cases,
right? So in that context I think it's a matter of a qualitative difference. It's yet to be seen how we
can do the user interface design and the reminders. Remember, these are also devices that can
be triggered by location and time to remind you to do things and prompt you to do things, to
notice that you're not doing them. And, as the machinery and the learning part of it gets better,
they can get more and more sophisticated as to when they decide they actually need information,
versus when you seem to be having the same kind of day you had before.
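To make the location-and-time trigger concrete, here is a minimal sketch of a reminder rule that fires only when the device is inside a circular area during a given time window. The rule format, field names and coordinates are hypothetical; nothing here is taken from the actual applications discussed in the talk.

```python
import math
from datetime import datetime, time

def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in meters."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def should_prompt(now, lat, lon, rule):
    """True if (lat, lon) is inside the rule's area and `now` is in its time window.

    rule: dict with keys "lat", "lon", "radius_m", "start", "end"
    (an assumed structure, for illustration only).
    """
    in_area = distance_m(lat, lon, rule["lat"], rule["lon"]) <= rule["radius_m"]
    in_window = rule["start"] <= now.time() <= rule["end"]
    return in_area and in_window

# Hypothetical rule: prompt for a lunchtime self-report near a workplace.
lunch_rule = {
    "lat": 34.07, "lon": -118.44, "radius_m": 200.0,
    "start": time(12, 0), "end": time(13, 0),
}
```

A learning component of the kind mentioned above would sit on top of a check like this, suppressing the prompt on days that look like ones already reported.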
All of those kinds of things can get better and better. I think all of my examples, you could find
another way of doing them if you can imagine people saying, for example, go back and say they
were at home and at work and these are the other things they did during the day, so you have an
equivalent of their location trace.
Yes.
>>: You've made a pretty compelling argument for using sort of mobile personal devices for
these sort of sensor applications. I was wondering, do you see sort of other classes of
applications for mobile devices, or do you have any opinions on, like, people using mobile
devices for computation and storage augmentation? I carry my life around in my phone?
>> Deborah Estrin: That particular one, I mean, I have an observation that's not a particularly
educated observation, which is it's stupid. Before that, I was about to say I don't know, but that
one just strikes me as stupid. It's just why would I only carry my -- and the thing that has to be
powered is the place that you would do that?
I think the days of sort of the SETI@home notion, for many reasons, and we joked with sort of a SETI
on phone kind of a thing, it's not a great model because it's also not even the most energy-efficient or
reliable or anything way of actually doing things, to leave all of our computers always plugged in
in case we want to run some computation for somebody.
But I'm not an expert on all that's been uploaded to and downloaded from and used by people off
of the app store. There are clearly a lot of these apps that are about location based. I didn't
really talk about location-based services. These are more location trace based. It's not just
instantaneous location base. Clearly, lots of just location based everything that tends to be
possibly useful, but I don't have anything particularly useful to say about that, otherwise. Yeah.
>>: Are your tools freely available and open source?
>> Deborah Estrin: So everything we do is thus far open source. I'm supposed to say, my
colleagues want me to say thus far, because they make the point that as we transition out of
being a center we shouldn't cut ourselves off from possibly doing something else. But everything
is open source. Whether the particular set of things is currently up there in an easily
downloadable way without sending me e-mail is just a matter of where the project is in its phase
right now.
But if you send me e-mail and say where is this and can we get a version of it, the answer will
almost always be yes, and somebody else has probably asked for it, so it's pretty reasonably
available. So this Rapid Campaign Platform that's succinct [phonetic], Randy has been the
primary person, but with others as an example of that, so we've given out to a few other people,
and we're delighted to see that happen.
[applause]
Thank you.