>> K. Shriraghav: It's my pleasure to introduce Aditya Parameswaran, who has been a
former intern with us, from Stanford University, and is here to give a talk. Aditya has a lot of
publications in [inaudible]. His [inaudible] count is more than 25 I think actually. Some
pretty large number. And -- Sorry?
>>: A crowd of publications.
>> K. Shriraghav: A crowd of publications.
>> Aditya Parameswaran: You know my secret.
>> K. Shriraghav: In fact two of his papers have been among the best papers of their
respective conferences, so he has a very distinguished record for a graduating PhD
student. And he's here to tell us about human-powered data management. So on to you.
>> Aditya Parameswaran: All right. Thank you [inaudible] and thank you for inviting me.
It's a pleasure to be here, a pleasure to be back actually. All right. So I'm Aditya
Parameswaran from Stanford University. I'm going to be talking about human-powered
data management.
All right? So we are now in the midst of the big data age. Every minute we have 48
hours of video uploaded to YouTube, 100,000 tweets and so on. Understanding this data
would help us power a whole range of data driven applications. Unfortunately an
estimated 80 percent of this data is unstructured so it is images, videos and text. Fully
automated processing of unstructured data is not yet a solved problem. Humans, on the
other hand, are very good at understanding unstructured data. We are very good at
understanding abstract concepts. We are very good at understanding images, videos
and text.
So incorporating humans doing small tasks into computation can significantly improve
the gathering, processing and understanding of data. So the question is how do we
combine human and traditional computation for data-driven applications?
So let me illustrate the challenges using a simple example, the task that I actually
wanted to do for this presentation. So I want five clipart images of a student studying of
college age. There must not be a watermark, and it must be suitable for a presentation.
Okay? So it's a simple enough task.
So the first option is I do it myself, right? But this could take very long. I need to figure
out which queries to issue to Google -- or Bing Images. [laughter] Faux pas right at the
start. So I need to figure out which queries to issue to Bing Images, and for each of
those queries I need to go through hundreds and hundreds of results so it's really
painful.
The second option is I ask a friend. Once again it could take very long. The person might
not do a good job, right?
The third option is to orchestrate many humans. By orchestrating many humans I could
get results faster because I'm [inaudible], and because I'm using many humans I may
have low error. All right? Of course orchestrating many humans gives rise to many
different challenges. First, how should this task be broken up? Presumably I need to
gather images from an image search engine. Which query should I issue? How many
images should I gather? How should I check if these images obey properties? To
guarantee correctness I may want to check one property at a time. In what order should I
check these properties? Since humans make mistakes, I may want to ask multiple
humans. How many humans should I ask? How do I rank these images? How do I
optimize a workflow? How do I guarantee correctness? So these are the kinds of
challenges that one needs to grapple with when orchestrating humans for human-powered data management.
All right? So these challenges boil down to a fundamental three-way tradeoff that holds
in this scenario: a tradeoff between latency -- How long can I wait? -- cost -- How much am
I willing to pay? -- and quality -- What is my desired quality? So recall that in traditional
database query optimization the focus is on latency, in traditional parallel computation
the focus is on latency and cost, and in traditional uncertain databases the focus is on
latency and quality. In this case we have a three-way tradeoff.
So if there's one thing I'd like you to take away from this talk, it's our focus on this three-way tradeoff that permeates all of my research on human-powered data management.
So to get access to humans we need to crowdsource. All right, so here is a diagram of
the landscape of crowdsourcing in the industry by an organization called
crowdsourcing.org. As you can see it's a very active area. Each of these tiny icons refers
to a separate company. Crowdsourcing means a lot of different things to different
people. It could mean generating funding using the crowd, generating designs using the
crowd, solving really hard problems using the crowd and so on.
In my work I focused on cloud labor or paid crowdsourcing. To get access to cloud labor
or paid crowdsourcing, one uses marketplaces. So all of the icons in this figure refer to a
marketplace. Marketplaces allow users to post tasks via low-level API [inaudible] people
who are online can pick up these tasks and solve them. And the canonical example of
a marketplace which I'm sure a lot of you have heard of is Amazon's Mechanical Turk.
These marketplaces are growing rapidly. The market size quadrupled from 2010 to 2011 and the
total revenue reached 400 million dollars in 2011. So here's an example of a task that I
could post to one of these marketplaces asking people, "Is this an image of a student
studying?" People who are online can pick up this task and solve it and will get the fivecent reward, in this case.
All right? Okay, so now let me draw a diagram of the landscape of research in cloud
labor or paid crowdsourcing and tell you how my work fits in. So there are lots of
humans; there are lots of marketplaces like I described earlier to reach humans. And
once again the canonical example of a marketplace is Mechanical Turk. Then, there has
been work on platforms making it easier to post tasks to these marketplaces dealing with
issues like how one should interact with humans, what kind of human issues arise, what kind
of interfaces should one use, and so on.
Then, there is work on algorithms that leverage these platforms having humans do the
data processing operations, operations like comparisons, filtering, ranking, rating and so
on. Then there are systems that call these algorithms asking these algorithms to sort,
cluster and clean data. Of course these systems can also directly leverage the platforms
by having humans get or verify data. My focus has been on designing algorithms and
systems. So the focus of my thesis has been on designing efficient algorithms and
systems for human-powered data management.
There are four aspects that I've studied: data processing, data gathering, data extraction
and data quality. I've also worked on other research that doesn't fit under the umbrella of
human-powered data management. If there is time, I'll tell you about that too at the end
of the talk.
All right. So here's the outline of the rest of the talk. I'm going to tell you about two
human-powered systems or applications that both motivate and are influenced by my
research. I'll tell you about one of them immediately and the other one interspersed with
the second topic which is filtering.
Filtering is a critical data processing algorithm that applies to both of the systems that I
will talk about. I'll tell you about the other research I've done in crowdsourcing and in
other topics and then conclude with future research or open problems.
All right, so the first application or system that I'm going to be talking about is the
DataSift Toolkit. So the DataSift Toolkit is a toolkit for efficiently executing a specific kind
of query, the gather-filter-rank query on any corpus. So the idea is you gather items from
the corpus. You filter them. You rank them and then, you produce the result.
And humans may be involved in all three steps: the gather step, the filter step as well as
the rank step. And what we've built is a general purpose toolkit that can be efficiently
deployed on any corpus. All right, so let me dive into a quick demo.
So a user of DataSift will see a screen like this. They'll select what they're looking for, in
this case Google Images. Sorry, I don't have one for Bing Images. But they'll select the
corpus that they're looking for, type in what they're interested in. The conditions the
items must satisfy -- These are filtering predicates -- and how the items must be ranked
so the ranking predicate.
They can also specify how many results they want and how much they are willing to
spend. All right? Now I'm going to play a video of how I would use DataSift to ask my
query.
So currently DataSift is implemented over four corpora: Google Images, YouTube,
Amazon Products and Shutterstock, but it could be over any corpus. For my clipart of
student studying example, I would type in, "Give me a clipart of student studying. Must
be one student of college age," and so on and so forth. Right? Unlike a traditional search
engine, notice that I can use as many words as I want to describe each of these
predicates. Let's say I want ten results, and I say that my budget is, let's say, five dollars.
Okay, so this is how a user would post a query to DataSift. DataSift will translate this
specification into questions that are asked to the crowd.
So there are three steps: the gather, filter and rank step. So it'll actually gather items, in
this case images, by issuing keyword search queries to the corpus, in this case Google
Images. So in this case it'll gather items by issuing the keyword search query "clipart of
student studying," so the crowd is not involved in this gather step. But the crowd could
also be involved in the gather step, and I'll show you an example of that next.
Then, DataSift checks if the images retrieved actually obey the filters using humans.
Then, DataSift ranks the items using humans and then, presents the results to the user.
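To make the flow concrete, here is a minimal, hypothetical sketch of that gather-filter-rank loop in Python. The helper names (search_corpus, ask_crowd, crowd_rank) and the 10x over-gathering factor are assumptions for illustration, not DataSift's actual API; the stubs merely stand in for the corpus search engine and the crowd marketplace.

```python
# A hypothetical sketch of a gather-filter-rank loop, not DataSift's actual implementation.
# search_corpus, ask_crowd and crowd_rank are assumed placeholder functions.

def search_corpus(corpus, query, limit):
    # Placeholder: issue a keyword search against the chosen corpus (e.g. an image search API).
    return [f"{corpus} result {i} for '{query}'" for i in range(limit)]

def ask_crowd(question, item):
    # Placeholder: post a yes/no task to a marketplace and aggregate the workers' answers.
    return True

def crowd_rank(items, rank_predicate):
    # Placeholder: have the crowd order items by the ranking predicate (e.g. via comparisons).
    return items

def gather_filter_rank(topic, filters, rank_predicate, corpus, n_results):
    # Gather: issue keyword queries to the corpus; the crowd can also suggest the queries
    # when the system cannot derive them from the topic (as in the cable-photo example).
    candidates = search_corpus(corpus, query=topic, limit=10 * n_results)
    # Filter: check one property at a time with the crowd; discarded items get "rank -1".
    for predicate in filters:
        candidates = [item for item in candidates
                      if ask_crowd(f"Does this item satisfy: {predicate}?", item)]
    # Rank: have the crowd order the surviving items by the ranking predicate.
    return crowd_rank(candidates, rank_predicate)[:n_results]

results = gather_filter_rank("clipart of student studying",
                             ["one student of college age", "no watermark"],
                             "most suitable for a presentation",
                             corpus="Google Images", n_results=10)
```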
All right. Now I'm going to show you results from previous runs for this query. So this is
one such result. This is just a portion of the result; there are lots more below. So the first
column here refers to the rank given by DataSift for this query. The second column
refers to the rank given by Google for this query for clipart of student studying.
As you can see the first four results are all fairly good. This is one student of college age
sitting with books. There is no watermark and so on. Another interesting thing to note is
that the item at rank four was actually ranked seventy-eight in Google. So DataSift
managed to pull it up.
Now I'm going to play a video of me scrolling down so that we look at the rest of the
results. Okay, so then you have items with DataSift rank five, six and seven which are
also fairly good. Then you get to items with DataSift rank minus one. So these are items
that DataSift discarded during processing because it felt that they did not satisfy one or
more of the filtering predicates. These items were fairly high up in the Google Search
results for clipart of student studying. Right? So they were ranked three all the way to
sixteen. If you go and manually inspect each of these images, you'll indeed find that they
do not satisfy one or more of the filtering predicates. The typical one that is not satisfied
is the "no watermark" restriction, and sometimes it's not even a clipart of student
studying. All right, so the results are fairly good for this example.
Now let me try to give you an even more compelling example. So in this case I'm looking
for a type of a cable that connects to a socket that I took a photo of. So notice that I can
add a photo as part of my query. And in this case I'm searching over the Amazon
Products catalogue. So in this case DataSift does not even -- Yes?
>>: Sorry. Just to understand: how would that photo be leveraged during the gather
phase? Right.
>> Aditya Parameswaran: I'm getting to that.
>>: Okay.
>> Aditya Parameswaran: Yeah. So in this case DataSift does not even know what
keyword queries to issue to the Amazon Products corpus, as you rightly pointed out. So
it asks the crowd for keyword query suggestions in the gather step. Right? And then it
retrieves items corresponding to those keyword query suggestions, all in the gather step.
Then, in the filter step for those items retrieved it checks whether it satisfies this query or
not. So the items -- And there is no rank step in this case, right, because you're not
ranking based on any predicate.
So the items are fairly good. All of these are indeed cables that would satisfy my query.
Now I'm going to scroll down so that we can look at the results that were discarded by
DataSift. So here you can see an item with DataSift rank minus one. This is in fact a
Mini-B cable. It is not a printer cable. This is a scanner PC interface cable, which I don't
know what that is. Then there's a micro cable. Then there's a printer which is not even a
cable. And the further down you go, the more strange the results get. Right? They're not
even cables beyond a point.
All right. So once again DataSift does a fairly good job even for this query. To
summarize, DataSift is a toolkit for efficiently executing gather-filter-rank workflows on
any corpus. There are lots and lots of applications of DataSift. How do you help school
teachers find appropriate articles to assign to students? How do you help shoppers find
desired products? How do you help journalists find supporting data or supporting
images?
And there are lots of challenges in building DataSift. How do you make it flexible and
extensible? How do you optimize the individual operators of gather, filter and rank? And
how do you optimize across the entire work flow which is something we haven't yet
addressed. So I won't have the time to get into the detailed design of DataSift; I'm going
to talk to you about just the optimization of just one of the operators, specifically the filter
operator.
All right, at this point are there any questions? Yes?
>>: Yeah, just the two examples you just mentioned, right, so for example how do I find
the right reading materials for students? With image recognition that's something anyone
can do, so people can look at them and say, "That's a student." Finding out what is an
appropriate book for a classroom is something that almost no one can do if you pick out
sort of random people off...
>> Aditya Parameswaran: True.
>>: ...the Internet. So is that really an adequate task here?
>> Aditya Parameswaran: True. So it is true that in some cases if the material is too
specialized it might not be appropriate to use a general purpose crowd. But the thing that
I had in mind, the use-case that I had in mind was a little simpler. Let's say I want to
assign articles on global warming, and I want articles that are well written that have, let's
say, neither a liberal bias nor a conservative bias. It sort of has a very neutral bias. It
considers the pros and cons of both arguments, and it is from a reputable source. Right?
So these are things that anyone can check, I think, anyone who has a background in
English.
Of course if it's a very detailed technical task, for instance using DataSift to ask for related
work to my publication, that's something I can't use it for as of now. But I suspect once
we have a better sense for skilled workers or skill sets of workers, I think we can get
there eventually but not right now. Any other questions? Yeah?
>>: Why do you divide your task into gather-filter? [inaudible] I thought you [inaudible]
the system as just small, general purpose query, but you are choosing this particular
division called gather-filter-rank. Tell us your examples.
>> Aditya Parameswaran: So I will talk about another system briefly which does general
purpose computation, but this was a specific enough task. Even though this is very
specific there are lots and lots of applications that fit under this model so it [inaudible]
like detailed investigation. So one aspect that is different -- Although the other system
does have filter and rank components, it does not have a gather component. The
gather component is very new to the system, so I don't think there are corresponding
components in the other system. But I'll get to that. Yeah?
>>: My related question is, if you look at databases that relational [inaudible].
>> Aditya Parameswaran: Yes.
>>: These look sort [inaudible] those relational [inaudible].
>> Aditya Parameswaran: [inaudible]
>>: Do you have language? Because typically the interface for databases align with
[inaudible]...
>> Aditya Parameswaran: Yes.
>>: Everything is implemented using composing operators. So these are the operators
you used to compose them? Is that a language interface or --?
>> Aditya Parameswaran: No, in this case this is the interface. It's only gather, a
sequence of filters and a rank step; that is the restricted language that I can handle. But
our toolkit is flexible enough that you can plug and play these operators if you really want
to, but we are not supporting queries.
>>: But then it's almost like the relational [inaudible].
>> Aditya Parameswaran: But only for filtering and ranking, that's it. No complex
operations like...
>>: These are the set of operators.
>> Aditya Parameswaran: Yeah, these are the set of operators.
>>: Well the gather is like the [inaudible].
>> Aditya Parameswaran: Yes, in a sense. In a sense, yeah.
>>: [inaudible] very interesting about all this is I mean do you sort of envision kind of a
marketplace of -- I mean, in some sense both gather and filter, depending on how specialized
the question is, you might be willing to pay more to get people with...
>> Aditya Parameswaran: Yeah.
>>: ...different levels. It's almost like a matchmaking service between people who have
specialized kinds of knowledge and people who have questions that need answering.
>> Aditya Parameswaran: Absolutely. Yeah, so I see this as just the first steps towards
a very specialized marketplace where most of the people who are currently sort of going
to day jobs are -- And this is certainly happening. A lot more people are looking for
employment online, and this has certainly transitioned to skilled labor of the kinds like
programming, design, virtual assistant. All of this has already moved a lot to these
crowdsourcing marketplaces, not enough to replace existing companies. But people
really like this service, right? I mean it's a flexi-time, flexi-cost. They can choose
whichever project they are interested in. It's great. And I think there will be a need in the
future to optimize the use of these humans even for skilled labor, and that's precisely the
point.
>>: It's the ultimate outsourcing.
>>: Yeah.
>> Aditya Parameswaran: Yeah.
>>: It's scary.
[laughing]
>>: Depends on who you are.
>> Aditya Parameswaran: It is giving opportunities to everyone, right? Anyway, that's a
[inaudible]. All right, so let me move on to filtering. Filtering is an algorithm that forms
part of the core of the applications that I just mentioned as well as the other applications
that I'll talk about later. It's also one of the fundamental data processing algorithms.
So in filtering you have a dataset of items. I don't need to tell you this but you have a
dataset of items, you have a predicate and you want to find all the items that satisfy the
predicate. So in our case items could be images. The Boolean predicate could be, "Is
this image a cat?" and I may want to find all the cat images in this dataset. Right?
>>: The Boolean predicate is an English [inaudible]. It's an English sentence?
>> Aditya Parameswaran: Yeah, it's an English sentence much like the predicates that I
had earlier. Yeah. So this is not something I can automatically evaluate. That's the
[inaudible].
So since I can't automatically evaluate I need to ask humans, right, does an item satisfy
this predicate or not? And since humans may make mistakes, I may need to ask multiple
humans. So the question is: how many humans should I ask? When should I ask them?
How should I ask them? These are the kinds of questions that come up in this part of the
talk.
>>: Are you...
>> Aditya Parameswaran: Yes?
>>: ...stopping with the dataset? Is it something -- [inaudible] can be the web that
searches, right?
>> Aditya Parameswaran: Sure. But in my scenario I have a restricted dataset. So the
way to think about this is let's say I did an initial gather set. I have a set of images that I
consider as...
>>: Because why I'm asking [inaudible] several queries...
>> Aditya Parameswaran: Yeah.
>>: ...[inaudible].
>> Aditya Parameswaran: Yeah.
>>: Is this image a cat? If you could just do a Google or Bing Image search on cat.
>> Aditya Parameswaran: Yeah.
>>: You already get a bunch of -- So some parts of predicates in your task, you can
probably push to the gather phase if you're working off a live feed in a certain [inaudible]
or something.
>> Aditya Parameswaran: So are you suggesting that I move -- So I think what you are
mentioning is the option of moving some of the predicates from the filter step to the
gather step.
>>: No, so...
>> Aditya Parameswaran: Is that the step?
>>: So you are doing crowdsourcing.
>> Aditya Parameswaran: Yes.
>>: And the questions you are [inaudible] fairly general.
>> Aditya Parameswaran: Right.
>>: So what I'm trying to understand is if [inaudible] a set of precomputed items or -- The
most [inaudible] source is the web itself. Right? So if you...
>> Aditya Parameswaran: Okay.
>>: So if you think of the source as the web...
>> Aditya Parameswaran: Yeah.
>>: ...and you have some predicates in mind -- Let's say you have five predicates -- some of those could be pushed to simple search predicates which already filter the
images from the web.
>> Aditya Parameswaran: Okay.
>>: It's sort of optimizing the gap, [inaudible] some of the filtering with [inaudible].
>>: In some sense I mean gather already has a filtering operator there, right?
>>: Yes.
>>: Exactly, yeah.
>>: So gather-filter-rank...
>> Aditya Parameswaran: Sure.
>>: ...that is not [inaudible] sometimes.
>> Aditya Parameswaran: I agree.
>>: Right?
>> Aditya Parameswaran: I agree. And that is one of the reasons why we haven't been
able to optimize the entire workflow yet. I'm just talking about this one individual operator
and trying to optimize that. There are very complex interactions between gathering and
filtering and ranking in the sense that one of the versions of the system that we are
building involves gathering keyword query suggestions from the crowd, retrieving a few
items for each of those query suggestions, then filtering them and then going back to the
gather step to gather even more for the keyword query suggestions that did well.
>>: So maybe I could rephrase the question a different way.
>> Aditya Parameswaran: Yeah.
>>: As a user of the system [inaudible] like a researcher, dataset of items on the web.
There are all these images on the web and I want to know what are the images of a cat.
>> Aditya Parameswaran: Yeah.
>>: I can think of it in two ways, right? I just pose this query to the [inaudible] system.
The [inaudible] system breaks it down into a gather phase.
>>: Yeah.
>>: Because you can't handle millions of images. It [inaudible] down to thousands of
images.
>> Aditya Parameswaran: Okay.
>>: So it extracts something [inaudible] and poses a query to the web and gets
[inaudible]. This is the first step.
>> Aditya Parameswaran: Yeah.
>>: And it shows the [inaudible]. Or I could have a different system [inaudible] me to
[inaudible].
>> Aditya Parameswaran: Who is you?
>>: I'm the user of DataSift. The user of DataSift has to come up with the gather predicate.
>> Aditya Parameswaran: Okay.
>>: So [inaudible] is on the user, right?
>> Aditya Parameswaran: In...
>>: It doesn't automatically come up with the predicate for the gather phase.
>> Aditya Parameswaran: So the toolkit is general enough that it could have both
options. So one option is -- So let me go back to DataSift, right?
>>: [inaudible] better than mention nothing and gather and just say search using Google
Images and just fulfill the predicates. What happens?
>> Aditya Parameswaran: Say that again.
>>: My gather phase [inaudible]...
>> Aditya Parameswaran: The topic is empty.
>>: Empty?
>> Aditya Parameswaran: Yeah.
>>: So search all Booleans [inaudible]...
>> Aditya Parameswaran: Right.
>>: ...[inaudible].
>> Aditya Parameswaran: Right.
>>: And then I say image of a cat.
>> Aditya Parameswaran: Right.
>>: What will happen?
>> Aditya Parameswaran: So as a filtering predicate. So there are different versions of
the system. One version of the system takes the entire query and asks the crowd for
keyword query suggestions.
>>: It will ask the crowd for keyword...
>> Aditya Parameswaran: Keyword query suggestions. So that will be used in the
gather step to retrieve initial items. Then, you will filter those items based on the
predicate.
>>: So [inaudible].
>> Aditya Parameswaran: Of course if I use the version that uses the topic to ask
keyword query suggestions, that's obviously not going to work in this case. Yeah?
>>: So just to sort of try to put this in a perspective that I can understand: one way to
think about the gather step is when you give something crude, it's a way to specify what
the set is. And then, the filtering predicates are actually a validation stage.
>> Aditya Parameswaran: Exactly. Exactly. Yes, perfect.
>>: So the other step also has a ranking, right? For example I want cat.
>> Aditya Parameswaran: Yes.
>>: Now in the Google Image there are...
>> Aditya Parameswaran: Yes.
>>: ...[inaudible].
>> Aditya Parameswaran: Yes.
>>: But [inaudible].
>> Aditya Parameswaran: Yes.
>>: So how do you even decide like how many to start with in the gather phase?
>> Aditya Parameswaran: Great question. So while we have not yet done anything
sophisticated in that step, all that we do is take the number of results requested, say the top ten, multiply it
by a factor K, retrieve that many results and then process them. That's all that we've done so
far.
So there are many ways of thinking about this question. One is that the search results
are somewhat correlated with the final results, right, so beyond a point going down the
search results is not a good idea. If you are searching, for instance, for let's say -- I don't
know -- clipart of student studying, beyond the thousandth image you're not going to get
student studying at all. You're going to get very noisy images.
>>: So you are making some assumption about the data source, that Google Images is
doing a good job.
>> Aditya Parameswaran: I am making some assumption about the data source. I
agree.
>>: But shouldn't the size of an initial set -- Okay, so here's your query. For Google
Images you get three million of them. You have a budget of five dollars, so shouldn't you
use that budget as a guide as well. So how big should my initial set be?
>> Aditya Parameswaran: Certainly.
>>: Restrict that five million down to twenty.
>> Aditya Parameswaran: Certainly.
>>: Because that's all you can afford to ask.
>> Aditya Parameswaran: Yeah. So that is something we haven't yet done, right? The
entire workflow optimization is something we haven't yet done. So right now I have some
ad hoc rules that govern how I use my budget -- I mean I have a rule that says gather so
many items for how many items I actually need. But that is a great point, yes.
Overall that's what I need to do. I need to think about how much I'm spending in the
gather step, how much I'm going to spend in the filter step. And the set of items
constantly shrinks as you go from the gather step to the filter step to the rank step. Right? So
I need to think about how much I'm spending in each of these steps. It's a very complex
problem. And hopefully by just talking about filtering itself, I'll convince you that it's
complex enough. All right?
So should I get into filtering?
>>: Yes.
>> Aditya Parameswaran: All great questions. Please keep asking.
All right, so in this part of the talk I'll focus on the tradeoff between quality and cost. I will
not consider latency, although we also have results for that case. And for now I will
assume that all humans have the same error rate. This is an assumption I'll get rid of
later on in the talk. All right?
So how do we filter? Well, we use a strategy. So this is how we visualize strategies in a
two-dimensional grid: the number of no answers gotten so far for an item along the Y axis;
the number of yes answers gotten so far for an item along the X axis. At all yellow points
we continue asking questions. At all blue points we stop and decide that the item has
passed the filter. At all the red points we stop and decide that the item has failed the
filter. Okay? So this is just one example of a strategy, let me emphasize that.
An item will begin at the origin. Let's say we ask a question to a human. We get a no
answer; the item moves up. We ask an additional question. We get a yes answer; item
moves to the right. We ask an additional question. We get a no answer; item moves up.
And let's say I get a sequence of yes answers; we stop and decide that the item has
passed the filter. All right?
So the key insight here is that since I'm making the assumption that all workers are alike,
the way I get to a point is not as important as the fact that I am there. So these strategies
are Markovian. And for those of you who are familiar with stochastic control, this is in
fact an instance of a Markov Decision Process so this might be familiar to some of you.
So this is just one example of a strategy. Here are other strategies: always ask five
questions and then take the majority. Wait until you have three yes answers or three no
answers and until then keep asking questions. So other examples of strategies.
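As a concrete illustration of the grid, here is a small Python sketch that encodes these two example strategies as colorings of the grid points; the dictionary encoding and the function names are illustrative choices, not notation from the talk.

```python
# Illustrative encoding of a filtering strategy as a colouring of grid points (x yes answers,
# y no answers): 'continue' (yellow), 'pass' (blue) or 'fail' (red). Bounded by x + y <= m.

def majority_of(n):
    # "Always ask n questions and then take the majority."
    strategy = {}
    for x in range(n + 1):
        for y in range(n + 1 - x):
            if x + y < n:
                strategy[(x, y)] = 'continue'
            else:
                strategy[(x, y)] = 'pass' if x > y else 'fail'
    return strategy

def first_to(k):
    # "Wait until you have k yes answers or k no answers; until then keep asking."
    m = 2 * k - 1                        # at most 2k - 1 questions are ever needed
    strategy = {}
    for x in range(m + 1):
        for y in range(m + 1 - x):
            if x >= k:
                strategy[(x, y)] = 'pass'
            elif y >= k:
                strategy[(x, y)] = 'fail'
            else:
                strategy[(x, y)] = 'continue'
    return strategy

print(majority_of(5)[(3, 2)])   # 'pass': 3 yes vs 2 no after 5 questions
print(first_to(3)[(1, 3)])      # 'fail': three no answers arrived first
```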
Now let me move on to the optimization problem. So in the optimization problem -- this is
one of many variants -- I'm given, or I estimate via sampling using a gold standard if I have
one, or I can approximately estimate it if I don't, the per-question human error probability.
So this is the probability that a human answers yes given that an item does not satisfy
the filter and the probability that a human answers no given that an item satisfies the
filter. And I also know the a-priori probability of an item satisfying or not satisfying the
filter.
So I know these quantities...
>>: [inaudible]
>> Aditya Parameswaran: Huh?
>>: [inaudible] A-priori probability?
>> Aditya Parameswaran: So if I have a gold standard then it's easy in the sense that -- assuming the gold standard is a sample of the actual dataset -- it would be an
accurate estimation. The fraction of true yes's versus true no's would be the estimate of
the a-priori probability.
>>: So A-priori probability that any [inaudible] image is an image of a cat?
>> Aditya Parameswaran: Exactly. So in the DataSift case, I estimate these quantities
approximately as part of the processing. So since it's a completely unsupervised system
I actually estimate these quantities while doing processing. So I do a little bit of
sampling, approximate sampling to estimate these quantities.
>>: How sensitive are the results to how [inaudible]?
>> Aditya Parameswaran: I haven't really checked, so I don't know how sensitive it is.
But my understanding is that these strategies are fairly robust, so even if the estimates
are off -- And we've done this using synthetic experiments: even if the strategies are
slightly off you still get fairly good results. Yes?
>>: So have you done anything on -- Sorry. Have you done anything to filter out what
should I call sloppy users or malicious users, somebody who just clicks no on every
image or yes on every image?
>> Aditya Parameswaran: Yeah, so there is...
>>: How do you recognize that? Or you could also seed something, seed the input set
with things that you know are correct.
>> Aditya Parameswaran: True.
>>: Right?
>> Aditya Parameswaran: Yes. Perfect. So in the DataSift case it's completely
unsupervised so I can't do this apart from sort of using the other workers, other humans,
estimates to check if a given human is good or not. Right? That is a disagreement based
scheme. We also have work on dealing with data quality but it's not integrated into this
current system. So right now the way I think about it -- In this part of the talk I'm
assuming that all humans are alike. And since Mechanical Turk is such a, I mean,
rapidly changing pool of people that's an accurate assumption to make because I don't
have the reliable error rate estimate for people over time.
Because the pool of people that I have access to rapidly changes, so at any given time I
won't have a worker who I've seen before.
>>: So Mechanical Turk maintain the accuracy of those users?
>> Aditya Parameswaran: No.
>>: [inaudible]
>> Aditya Parameswaran: No. So all they maintain is the number of tasks that these
users have attempted in the past and their approval rate. And the approval rate is not much
use, because if you do not approve their work then all the workers will boycott you. So
you just approve their work typically. That's just something you do.
So Mechanical Turk does not have a good reputation system.
All right. So quality is an important aspect. Right now we are sidestepping quality by
assuming that all workers are alike; that's one. The other -- if we do have estimates of
worker quality, those can be taken into account while filtering by
suitably down-voting, in some sense, the bad workers. And I will tell you about that later
on.
>>: Maybe you can take the users, the workers through some test to make sure that,
you know, they're of some decent quality.
>> Aditya Parameswaran: Great. Yes that is an option that is often used in practice.
Unfortunately in DataSift, because the task is new every time a user uses my system I'm
getting a completely new task. So testing the user on something I have information
about earlier is not going to help.
>>: But you can probably use...
>>: [inaudible]...
>>: ...the results from previous tasks and give it to your system to sort of judge the
quality of previous workers.
>> Aditya Parameswaran: But...
>>: [inaudible]
>>: I mean, suppose you are doing this task for, you know, thousands of things.
>> Aditya Parameswaran: Yes.
>>: Someone can just do ten of those and, you know, those things could be a test. Even
that would be good enough probably [inaudible]...
>>: Guys we can generate lots and lots of ideas.
>> Aditya Parameswaran: Yeah, all good ideas.
>>: Why don't we let you continue with what you actually did.
>> Aditya Parameswaran: All right. Thank you. So, yeah, and my goal is to find the
strategy with minimum possible expected cost. In this case, since I'm paying the same
amount for every question the expected cost is nothing but the expected number of
questions. And I want my expected error to be less than a threshold, so this is the
second objective.
The last constraint is that I want my strategies to be bounded. So I don't want to spend
too much money on any single item. So what the last constraint means is that the
strategies fit within the two axes and X plus Y is equal to M. Okay, so I don't spend more
than, say, twenty questions on any single item which is reasonable.
All right, so how do we estimate expected cost and error? So given a strategy, the overall
expected cost of a strategy is nothing but the sum, over the red and blue points (x, y), of x
plus y -- which is a proxy for the cost -- times the probability of reaching (x, y). All right? And
the overall expected error is the probability of reaching a red point and the item satisfying
the filter plus the probability of reaching a blue point and the item not satisfying the filter.
So these are the two ways you can go wrong.
And how do I compute these probabilities? Well, I can compute them iteratively. So the
probability of reaching a point is the probability of reaching the point to its left and getting a yes
answer plus the probability of reaching the point below it and getting a no answer. So I can
compute these probabilities iteratively.
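Here is a minimal sketch of that iterative computation for a given bounded strategy, assuming the simple model above with a single pair of error rates; the parameter names and the example numbers are illustrative, not values from the talk.

```python
# Illustrative dynamic program: evaluate the expected cost and expected error of a given
# bounded strategy over the (x yes answers, y no answers) grid. Assumes every point on the
# boundary x + y == m is a terminating (pass/fail) point.

def evaluate(strategy, e0, e1, prior, m):
    """e0 = Pr(human says yes | item fails the filter), e1 = Pr(human says no | item passes),
    prior = a-priori Pr(item passes). strategy maps (x, y) -> 'continue' | 'pass' | 'fail'."""
    reach1 = {(0, 0): 1.0}   # Pr(reach (x, y) | item truly passes the filter)
    reach0 = {(0, 0): 1.0}   # Pr(reach (x, y) | item truly fails the filter)
    cost = error = 0.0
    for asked in range(m + 1):                      # process points in order of questions asked
        for x in range(asked + 1):
            y = asked - x
            p1, p0 = reach1.get((x, y), 0.0), reach0.get((x, y), 0.0)
            if strategy[(x, y)] == 'continue':
                # a yes answer moves right, a no answer moves up
                reach1[(x + 1, y)] = reach1.get((x + 1, y), 0.0) + p1 * (1 - e1)
                reach1[(x, y + 1)] = reach1.get((x, y + 1), 0.0) + p1 * e1
                reach0[(x + 1, y)] = reach0.get((x + 1, y), 0.0) + p0 * e0
                reach0[(x, y + 1)] = reach0.get((x, y + 1), 0.0) + p0 * (1 - e0)
            else:
                cost += asked * (prior * p1 + (1 - prior) * p0)   # questions spent on this item
                if strategy[(x, y)] == 'fail':
                    error += prior * p1              # rejected an item that truly passes
                else:
                    error += (1 - prior) * p0        # accepted an item that truly fails
    return cost, error

# e.g. the "wait for three yes answers or three no answers" strategy from earlier
strat = {(x, y): 'pass' if x >= 3 else 'fail' if y >= 3 else 'continue'
         for x in range(6) for y in range(6 - x)}
print(evaluate(strat, e0=0.1, e1=0.1, prior=0.5, m=5))
```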
So I now have a way of computing expected cost and error of any strategy. So here's a
naïve approach to compute the best strategy: for all strategies, evaluate cost and error
and -- Yes?
>>: I have [inaudible] question. Is column identical to the [inaudible]?
>> Aditya Parameswaran: No, so I assume that I have my estimates already.
>>: No, each picture is a [inaudible].
>> Aditya Parameswaran: Okay.
>>: Each picture has a probability of satisfying the requirements [inaudible].
>> Aditya Parameswaran: No, in my case I know that each picture is either a zero or a
one. I'm not given that each picture is a probability. It's not a bias. Each picture is a zero
or a one. Given that an item is a zero or a one, I have probabilities of getting wrong
answers.
Okay, so the naïve approach: for all strategies that fit within the two axes and X plus Y is
equal to M, evaluate expected cost and error, and return the best one. How do I
enumerate all strategies? That's easy. For each grid point you can assign it one of three
colors -- red, yellow and blue -- run through all possible strategies and pick the best one.
Right? Of course this is exponential in the number of grid points. If you have 20 grid
points -- Anyway, so if you have 20 grid points that is on the order of 3 to the 20, and in other
cases it gets even worse. So it is exponential in the number of grid points, and this is not
an approach we would like to take.
So I have given you a naïve approach to find the best strategy, and I'll call these
deterministic strategies for reasons that will become clear shortly. Computing the best
strategy is simply not feasible. It takes too long. But the resulting strategy is fairly good.
It has low monetary cost. I have another algorithm that also gives me a deterministic
strategy. Once again this is exponential but it's feasible; I'm able to execute it for a fairly
large M. The resulting strategy is slightly worse. It has slightly higher monetary cost. But
I'm not going to talk about this algorithm either; I'm going to tell you about a different
algorithm. In order to do that I need to introduce a new kind of strategy. As some of you
may have guessed, the new strategy is a probabilistic strategy. So in addition to having
yellow, blue and red points, I have points that are probabilistic like this point. So with
probability 0.2 you continue asking questions. With probability 0.8 you stop and return
that the item has passed the filter. With probability zero you return that the item has
failed the filter.
All right. So these are probabilistic strategies. We have an algorithm that gives us the
best probabilistic strategy in polynomial time. And since probabilistic strategies are a
generalization of deterministic strategies, they are in fact the best strategy. Period.
Okay? And we can get that in polynomial time. Since it is the best strategy, it has the
lowest possible monetary cost.
So over the next four slides I'm going to give you the key insight behind this algorithm
and then tell you about the algorithm.
Okay, so the key insight necessary is the insight of path conservation. So you have, for
any point, a fractional number of paths reaching that point. And what that point does is to
split the paths. Some of the paths continue onward. For some of the paths you stop and
return that the item either passes or fails the filter. So pictorially let's say there are two
paths coming into this point. This point decides to split the paths 50/50 so one path
continues onward to ask; one path you decide to stop. For the path that continues onward
to asking an additional question, this path moves to the point above as well as the point
on the right.
Okay, so this is how path conservation works for a single point. Now how does path
conservation work for strategies? You have one path coming into the origin. Since it is a
continue point it lets the paths continue onward, so one path goes to this point and to
this point. Once again since this is a continue point, it lets the paths flow onward.
This is a probabilistic point; let's say the split is 50/50, so half a path flows
onward from here. So overall you have one path ending here, one and a half ending
here and half a path ending there. All right, so this is how path conservation works in
strategies.
Now finding the optimal strategy is easy. We simply use linear programming on the
number of paths. And so you have a number of paths coming into each point; those are
the variables. The only decision that needs to be made at each point is how these
variables are split. Everything else is a constant multiple. So the probability of reaching a
point is a constant times the number of paths reaching that point. The probability of
reaching a point and the item satisfying the filter is a different constant times the number
of paths. And whether you return pass or fail at a point does not depend on the number of paths.
All right?
So finding the optimal strategy for this scenario is easy; you just use linear programming.
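To make the construction concrete, here is a hedged sketch of such a linear program using scipy for the simple single-error-rate model. It follows the path-conservation idea described above, but the variable layout, parameter names and example values are assumptions for illustration rather than the exact formulation from the talk.

```python
# Illustrative LP for the best probabilistic filtering strategy via path conservation.
# Variables per grid point (x yes answers, y no answers): the fractional number of paths
# that continue, terminate with "pass", or terminate with "fail" at that point.
import numpy as np
from scipy.optimize import linprog

def best_strategy(e0, e1, s, m, tau):
    """e0 = Pr(yes | item fails the filter), e1 = Pr(no | item passes), s = prior Pr(item passes),
    m = maximum questions per item, tau = allowed expected error."""
    pts = [(x, y) for x in range(m + 1) for y in range(m + 1 - x)]
    idx = {p: i for i, p in enumerate(pts)}
    n = len(pts)
    CONT, PASS, FAIL = 0, n, 2 * n                    # variable blocks: [cont | pass | fail]

    # Probability of one *specific* sequence of x yes and y no answers, given the truth.
    c1 = lambda x, y: (1 - e1) ** x * e1 ** y         # item truly passes
    c0 = lambda x, y: e0 ** x * (1 - e0) ** y         # item truly fails

    cost = np.zeros(3 * n)                            # objective: expected number of questions
    err = np.zeros(3 * n)                             # expected-error coefficients
    for (x, y), i in idx.items():
        reach = s * c1(x, y) + (1 - s) * c0(x, y)     # Pr(one specific path to this point occurs)
        cost[PASS + i] = cost[FAIL + i] = (x + y) * reach
        err[FAIL + i] = s * c1(x, y)                  # rejecting an item that truly passes
        err[PASS + i] = (1 - s) * c0(x, y)            # accepting an item that truly fails

    # Path conservation: paths terminating or continuing at a point equal the paths flowing in.
    A_eq, b_eq = [], []
    for (x, y), i in idx.items():
        row = np.zeros(3 * n)
        row[CONT + i] = row[PASS + i] = row[FAIL + i] = 1.0
        if x > 0:
            row[CONT + idx[(x - 1, y)]] -= 1.0        # a continuing path sends one copy right (yes)
        if y > 0:
            row[CONT + idx[(x, y - 1)]] -= 1.0        # ...and one copy up (no)
        A_eq.append(row)
        b_eq.append(1.0 if (x, y) == (0, 0) else 0.0)  # one path enters at the origin

    # No continuing past the budget boundary x + y = m; everything is non-negative.
    bounds = [(0, 0) if x + y == m else (0, None) for (x, y) in pts] + [(0, None)] * (2 * n)

    res = linprog(cost, A_ub=[err], b_ub=[tau], A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    assert res.success, "no strategy meets the error threshold within the budget"
    splits = {}
    for (x, y), i in idx.items():
        parts = [res.x[blk + i] for blk in (CONT, PASS, FAIL)]
        total = sum(parts)
        # Normalizing the path split at a point gives its (continue, pass, fail) probabilities;
        # unreachable points get None.
        splits[(x, y)] = tuple(p / total for p in parts) if total > 1e-12 else None
    return splits

strategy = best_strategy(e0=0.1, e1=0.1, s=0.5, m=10, tau=0.02)
print(strategy[(0, 0)])   # the split at the origin
```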
Now I'm sure you thought of many issues with the current simple model. We have
generalizations that can hopefully handle all the issues that you've thought of. All right?
So let me pick a few of them to explain further. All right, so the first generalization is that
of multiple answers. So instead of having a Boolean predicate, yes or no, say you
want to categorize an image as either being a dog image, a pig image or a cat image, or
you want to rate an item as being either 0 out of 5, 1 out of 5, all the way up to 5 out of 5.
In this case we simply record the state as the number of answers of each category that
I've gotten so far. Once again I can use path conservation and linear programming to
find the best strategy. The second generalization is multiple filters.
So, so far we considered a single filter. What if we have a Boolean combination of
multiple independent filters like in my DataSift example? In this case we simply record
the state as the number of yes and no answers for each of those filters, and at any point
you can choose to ask any one of those filters or you can stop and return that the
Boolean predicate is either satisfied or not satisfied.
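As a small illustration of that state for a conjunction of filters, here is a sketch of the bookkeeping and the early stopping it allows; the per-filter decision rule (a "wait for three answers" threshold) is an assumed example, not something prescribed in the talk.

```python
# Illustrative bookkeeping for a conjunction of independent filters: the state is a per-filter
# (yes, no) answer count, and the conjunction can be decided early once any filter fails.
# The "wait for three answers" rule used to settle a single filter is an assumed example.
def conjunction_state(counts, decide=3):
    """counts: dict filter_name -> (yes_count, no_count). Returns 'pass', 'fail', or the name
    of a filter that still needs questions (i.e. 'ask any one of those filters, or stop')."""
    if any(no >= decide for _, no in counts.values()):
        return 'fail'                       # one failed filter fails the whole conjunction
    for name, (yes, no) in counts.items():
        if yes < decide:
            return name                     # this filter is still undecided; ask about it next
    return 'pass'                           # every filter individually decided as satisfied

# e.g. the watermark filter already got three no answers, so the item fails regardless
print(conjunction_state({'clipart of one student studying': (2, 0), 'no watermark': (0, 3)}))
print(conjunction_state({'clipart of one student studying': (3, 0), 'no watermark': (1, 1)}))
```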
Then the last generalization is that of difficulty. So far we assumed that all items are
equally easy or equally difficult, so they all had the same error rates. What if they're not?
What if there is a hidden element of difficulty? We captured that using a latent difficulty
variable and the error rate of each item is dependent on that latent difficulty variable.
Once again, we can capture that in our current setup.
Now let me move on to a harder generalization. So this is a generalization of worker
abilities. So let's say I have three items whose actual scores are 0, 1 and 0. Worker 1
who is a very good worker decides to answer 0, 1 and 0 for these three items. Worker 2
decides to answer 1, 1 and 1 for each of the three items, so he's a fairly poor worker.
Worker 3 is adversarial so he flips a bit for each of the items. So he's a fairly bad worker
but he gives us a lot of useful information. We can just flip his bit. Anyway, so how do we
handle such a case? We are losing a lot of key information by assuming that all workers
are alike. So we can reuse the trick for multiple filters. We can certainly record the
number of yes and no answers corresponding to each of the workers, and this certainly
works.
Unfortunately if we have many workers with varying abilities, we have an exponential
number of grid points and, therefore, our approach does not scale. All right? So over the
next three or four slides I'm going to tell you about a new representation that helps us
solve this problem. Any questions at this point? All right.
So instead of recording the number of yes answers and the number of no answers
gotten so far, we record the posterior probability of an item satisfying the filter given the
answers that you've seen so far along the Y axis and the cost that you've spent so far
along the X axis. Okay?
So now to make it clearer I'm going to map the points from the previous representation
to the new representation. So the point at the origin maps precisely to the A-priori
probability of an item satisfying the filter and cost is equal to zero. All right? These two
points map to points above and below that point at costs equal to one. Right? And the
remaining points would map to their respective points in the new representation.
Now as an approximation, I'm going to discretize the posterior probability of an item
satisfying the filter given the current answers into one of a small number of buckets. And
as a result multiple points in the old representation may map to the same point in the
new representation. All right? Notice that I can discretize it as finely as I want, as finely
as my application needs.
So if we have many workers with varying abilities, we can once again map this 2N-dimensional representation to that two-dimensional representation. And as an interesting
property: as we reduce the size of the discretization, make it smaller and smaller, the
optimal strategy in the new representation tends to the optimal strategy in the old, more
expensive representation.
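Here is a small sketch of that mapping, assuming per-worker error rates and a uniform bucketing of the posterior; the names and numbers are illustrative only.

```python
# Illustrative mapping into the new representation: take the answers gathered so far (possibly
# from workers with different error rates) and produce a (discretized posterior, cost) point.
def posterior_bucket(answers, error_rates, prior, n_buckets=20):
    """answers: list of 0/1 worker answers; error_rates: per-worker (false_pos, false_neg);
    prior: a-priori Pr(item satisfies the filter). Returns (posterior bucket, cost so far)."""
    like1 = like0 = 1.0
    for ans, (fp, fn) in zip(answers, error_rates):
        like1 *= (1 - fn) if ans == 1 else fn      # Pr(this answer | item passes the filter)
        like0 *= fp if ans == 1 else (1 - fp)      # Pr(this answer | item fails the filter)
    post = prior * like1 / (prior * like1 + (1 - prior) * like0)
    bucket = min(int(post * n_buckets), n_buckets - 1)   # discretize into n_buckets bins
    return bucket, len(answers)                          # cost = one question per answer

# e.g. two "yes" answers, one from a careful worker and one from a sloppier worker
print(posterior_bucket([1, 1], [(0.05, 0.05), (0.3, 0.3)], prior=0.5))
```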
All right? So what changes in the new representation? Well instead of starting at the
origin, you now start at the a-priori probability of an item satisfying the filter with one path
entering the strategy at that point. If you have all workers having the same error rates,
you have two possible transitions, one above and one below, each spending one unit of
cost. So you always transition to the right. If you have N workers with varying abilities,
you have order of N transitions. So the size of each linear equation scales up by order of
N.
And once again everything else works. You can use the path conservation property and
linear programming to find the optimal strategy. So let me quickly tell you about one
more generalization that works well in the new representation then I'll move on to
experiments.
So the other generalization that works well in the new representation is the
generalization of a-priori scores. So what if I have a machine learning algorithm that
provides for every item a probability estimate of that item satisfying the filter? For
instance I may have a dog classifier and the probability estimate may be proportional to
the distance from the classifier, as well as which side of the classifier the item lies.
This is easy to use as part of my strategy computation. Let's say I have 50 percent of
items with probability 0.6 and 50 percent of items with probability 0.4. I simply start half a
path at 0.6 and half a path at 0.4, and the strategy computation proceeds as before.
When running a strategy on an item, the item will begin at its a-priori score. And this a-priori score sort of would capture the intuition for when the input dataset already comes
as a ranked list of results. Right? We could certainly help in that case as well.
All right, so [inaudible] about a number of generalizations. We have other generalizations
that I did not have time to cover. Yes?
>>: Yeah, so I have like one question on the [inaudible]. So in this case you studied the
[inaudible].
>> Aditya Parameswaran: Yeah.
>>: But in practice I would argue for your application. A much more natural operator is
[inaudible]. I mean I want to pick 50 images [inaudible]. Right? If my ultimate goal is to
pick 50 images [inaudible], it might not be optimal for me to go through each image and
get it graded, right? It's much better for me to consider an image, and if an image is
marked yes I want to pilfer those images. But if an image is starting to have some
variation, like variance in the marking, it's probably not useful because I want to focus on
images [inaudible]. So it seems that the nature of the problem with fundamentally
change if you incorporate [inaudible]....
>> Aditya Parameswaran: Certainly. Certainly. So that's another problem we have
studied. And the key insight in that scenario when you want a fixed number of items from
a dataset that satisfies the predicate, is that as soon as an item falls below the average
item in the dataset you would rather pick the average item of the dataset. That's an
intuition that you had as well, and we have a separate paper on that. I'm not going to be
focusing on that in this talk.
In addition to systems like DataSift, filtering also appears in lots of natural scenarios, things like -- companies do this all the time -- things like content moderation. A lot of companies
have a content moderation phase before the user-uploaded images go on the live site.
They have a content moderation phase where they use crowdsourcing services. In that
case you need to go and manually check every single image. And the second
application that I'm going to tell you about, also you need to go and manually inspect
every single item.
All right. So finding K items that satisfy the predicate, that's a natural algorithm that
we've studied. Yeah?
So now I'm now going to tell you about experiments. I'll use that as an excuse to tell you
about the second application that we've been studying that is MOOCs. So I'm sure
you've heard of MOOCs, massive open online courses. They're very trendy. There's in
fact even a poster of the movie The Blob which has been photoshopped to
read The MOOC, which I thought was quite cute.
MOOCs are revolutionizing the world of education. There are hundreds and hundreds of
courses each being taken by thousands and thousands of students. There are lots of
courses that require subjective evaluation, courses like psychology, sociology, literature,
HCI and so on. And there's no way TA's can go and evaluate all the assignments in all of
these courses. So what we need, therefore, is peer evaluation.
So peer evaluation is crowdsourcing but with an important twist. The important twist is
that the evaluators are also the people being evaluated. Okay, so now the key question
is how do you assign evaluations for submission so that you can accurately determine
the actual grade of each submission?
And notice these were the images that DataSift gave me for my initial example. Okay, so
deciding whether or not to get additional evaluations for each submission is a
generalization of filtering that I considered where I want to rate an item as being either 0
out of 5, 1 out of 5, all the way until 5 out of 5. So we are very lucky to have the dataset
from one of the early MOOCs offered at Stanford. This is the Stanford HCI course. In
this case you have 1500 students with 5 assignments each having 5 parts. These are
graded by random peers whose error rates we know because we've had them go and
evaluate assignments for which we know the true grade.
Okay, so we know their error rates. And our goal is to study how much we can reduce
error for fixed cost or vice versa. So here is one sample result. I'm plotting the average
error. In this case the average error is the average distance from the actual grade. And
remember actual grades are between 0 and 5. And along the X axis I have the cost; in
this case the cost is the average number of evaluations for each submission. And I'm
plotting three separate algorithms. The first is the median algorithm that requests a fixed
number of evaluations for each submission and then takes a median. The one-class
algorithm that assumes that all workers are alike, have the same error rates, and uses
the old representation. And the two-class algorithm that puts workers into two buckets
based on their variance, high variance and low variance workers, and uses the new
representation.
So for now let me focus on the median algorithm and evaluations equals five. So in that
case the median algorithm has an error of 0.3. So this is in fact the heuristic that is
currently being used in the Coursera system for a range of courses like psychology,
sociology and so on. So we can get to the same error using just 60 percent of the cost,
using the one-class algorithm, and just 40 percent of the cost using the two-class
algorithm.
From the perspective of error, if I fix the cost at three, I can reduce the error by
40 percent if I use the one-class algorithm and by 60 percent if I use the two-class
algorithm. So either way I can significantly reduce both cost and error using our
strategies. All right? So at this point I'm happy to take questions because this is -- All
right. Moving on.
So let me tell you about other work in the crowdsourcing space that I've worked on, other
research that I've done and then conclude by talking about open problems. Yes?
>>: So [inaudible] filtering: so if you have multiple filters like in your example you
showed, like four or five filter [inaudible]...
>> Aditya Parameswaran: Right.
>>: How do you handle them? Do you ask [inaudible] questions for each of them or...
>> Aditya Parameswaran: Yes. Yeah. So currently the way we handle them in the
filtering operator is by having a separate question for each of those filters.
>>: Okay, but you could also consider kind of combinations and...
>> Aditya Parameswaran: True. True. The reason why we decided to go with separate
questions for each of the filters is because it's not very clear -- humans are more likely to
make mistakes with a combined question because it's not clear what question they are
answering. If it is a single-unit question, it's much more clear what they are answering.
It's like does it satisfy this and this and this and this?
They might say no even if it does satisfy or the other way around. But, yeah, good point.
Okay so we have studied other aspects of data processing in addition to filtering: finding
the best item out of a set of items; categorizing an item into a taxonomy of concepts;
identifying a good classifier for imbalanced datasets; also the search problem, which is
the one that you mentioned, finding K items that satisfy a predicate. Determining the
optimal set of questions to ask humans in a lot of these cases is NP-Hard even for very
simple error models; therefore, we need to resort to approximation algorithms.
And recently we've started looking into some of the data quality issues as well which are
common to all of these algorithms. Let me move on to Deco. So DataSift is, in my mind,
an information retrieval-like system powered by the crowd. Deco on the other hand is a
database system that's powered by the crowd. So I don't need to tell you this but
database systems are very good at answering declarative queries over stored relational
data. But what if the data is missing? What if I don't have the data?
So Deco can actually tap into the tiny databases that exist in people's heads so it can
answer declarative queries over stored relational data as well as data computed on the
fly by the crowd. So if you ask a query like this, asking for the cuisine of Bytes Café at
Stanford, Deco will gather the fact that the cuisine of Bytes is French and return that as a
result for the query.
So -- Yeah?
>>: This system works perfectly as something [inaudible].
>> Aditya Parameswaran: Not the gather step.
>>: I thought this is something that we're gathering.
>> Aditya Parameswaran: It is gathering missing data. The keyword query suggestions -- I mean it would take a lot of sort of mangling Deco to fit under DataSift. But it is true
that this is a much more general purpose system than DataSift.
All right? So here are key elements of Deco's design. It has a principled and general data
model and query language. You have user configurable fetch rules for gathering data
from the crowd. This is sort of like access methods if you will. User configurable
resolution rules for removing mistakes or resolving inconsistencies from data gathered
by the crowd. One such resolution rule could be the filter operator.
Due to the three-way tradeoff between latency, cost and quality we need to completely
revisit query processing and optimization in this scenario. So we have a working
prototype which was developed by a colleague and myself at Stanford. And this is also a
web interface. So the prototype supports Deco's query language, data model, query
processing and optimization, as well as a web interface where you can post queries,
visualize query plans and get results.
All right? So let me move onto data extraction. Let's say I have a web site like Amazon,
and I want to extract an attribute like price from all the web pages on Amazon. I can ask
humans to provide pointers to where the attribute is present on a given web page. Then,
I can extract from all the pages on that site. But what if the web page is modified? So my
pointers are no longer valid and I may end up extracting incorrect information. Like in
this case, I may end up extracting the fact that the Kindle costs ten dollars. So what do I
do in such a case?
We built a robust wrapper toolkit that can reverse engineer where the pointers have
gone in the modified versions of the website so you can continue to extract from
modified pages accurately. So you can significantly reduce the cost of having humans
provide pointers once again. So our robust wrapper toolkit had some very nice
theoretical guarantees and over an internship I deployed this in Yahoo's internal
information extraction pipeline.
Okay, so there's lots of related work that we build on in the crowdsourcing space, work
on workflows, games and apps. Whenever I give talks, I typically get questions on the
first four topics, although my work is more similar to the last two topics. Deco and
DataSift are similar to the recent work happening around the same time on CrowdDB
and Qurk. And [inaudible] there's been a development of a number of other algorithms,
sorts and joins, clustering and so on. Okay, now let me tell you about some of the other
research I've done. I've also worked on course recommendations for a course
recommendation site called CourseRank. Course recommendations pose a number of
interesting and challenging aspects. So you need to deal with things like temporality
because courses are typically taken in sequence. You need to deal with requirements
because courses need to be recommended that are not just interesting but also help the
student meet graduation requirements. My course recommendation engine had some
nice theoretical guarantees, and this was deployed into CourseRank.
CourseRank was spun off as a startup by these four undergrads and deployed at about
500 universities. And I think a year ago it was purchased by a company called
[inaudible].com.
All right, so...
>>: So you don't need a job, right?
[laughing]
>> Aditya Parameswaran: There's this T-shirt, right, that says, "My friends had a startup
and all I got were the lousy research papers," right?
Yeah, so I've worked on human-powered data management and recommendation
systems. In addition I've also worked on information extraction and search but I won't
have time to cover that in this talk.
So in all of my work I've followed a sort of end-to-end approach to research. I model
scenarios conceptually starting from simple error models and then generalizing, like in
the filtering case. I formulate optimization questions that make sense in the real world. I
find optimized solutions using techniques from optimization, inference, approximation
algorithms and so on. And I build systems with these solutions, systems like DataSift,
Deco, the robust wrapper toolkit and so on.
Of course in research it's never really a linear path; there are lots and lots of iterations.
But I intend to continue using this end-to-end approach in research. All right? So I
think human-powered data management is only going to get more and more important in
the future. There are more and more people looking for employment online, so there's a
need to manage and optimize the interaction of this giant pool of people who are
interacting online in a seamless manner. And of course more and more data is being
accumulated. However, many fundamental issues in crowdsourcing remain. Issues like it
takes too long, sometimes the work is badly specified, sometimes workers are error-prone, sometimes humans don't like the tasks that we give them, and sometimes it costs
too much.
So I have initial angles of attack for all of these issues. Let me a pick a few of them to
explain further. So latency can be addressed by having systems produce partial results
as they do their computation. But this requires revisiting the computation models of
systems and algorithms. Given two algorithms, how do we pick the one that produces
interesting partial results faster?
So to deal with poorly specified work, how can we use a crowd to decompose -- So one
more point about the eager computation: this is related to some prior work in the
database community on online aggregation. To deal with poorly specified work, how can
we use a crowd to decompose a query to a workflow? What should the intermediate
representation be? How can we verify the correctness of this intermediate
representation?
To deal with error-prone workers how can we monitor their performance and see that
their performance doesn't start to drop? How can we be sure that our estimate of their
performance is correct? And how and when should we provide feedback to workers that
they're doing a good job or a bad job?
All right, so the steps that I'm arguing for go beyond looking at systems and algorithms to
the other steps in the pipeline, the interaction with humans, as well as the interaction
with platforms.
So once we solve some of the fundamental issues in crowdsourcing, I think there is no
end to what we can do. There are many, many hard data management challenges that
could benefit by plugging in humans as component. And of course designing some of
these systems would bring about a whole range of additional challenges as well.
For instance, how can we impact interactive analytics using humans? Can humans help
formulate queries? Can humans help visualize query results? How can we build better
consumer-facing applications powered by humans? By combining human and computer
expertise, can I build a newspaper, a personalized newspaper that beats Google News,
for instance? Can I build human-powered recommendation systems? How can I impact
data integration, a problem database people have been working on for decades now,
using humans? Overall I think there are lots of interesting problems in redesigning data
management systems by combining the best of humans and algorithms. At this point I'd
like to mention that a lot of the work that I've done in my PhD is in collaboration with a
number of collaborators both at Stanford and outside Stanford.
In particular I'd like to call out my advisor, Hector Garcia-Molina, my unofficial co-advisor
Jennifer [inaudible], as well as frequent collaborator [inaudible]. At this point I'm happy to
take questions.
[applause]
>>: [inaudible]
>> Aditya Parameswaran: You guys are there. You guys are there.
>>: Who's the collaborator [inaudible]?
>> Aditya Parameswaran: That is Ming Han. I couldn't find a photo of him.
[multiple comments simultaneously]
>>: Collaborator humans and sometimes non-humans?
>> Aditya Parameswaran: Yeah, I should have mentioned the crowd of Mechanical Turk
workers. [inaudible] Any other questions? All right, thank you so much for attending.
[applause]