>> Michael Gamon: Hi. My name is Michael Gamon. I'm from the NLP Group here
in MSR, and it's my pleasure today to introduce Professor Bing Liu from the
University of Illinois at Chicago. And I've known Bing for a long time. I
actually don't remember now how long we go back, but Bing has done a lot of
work on web mining and, specifically, on reviews, including aspect detection,
product aspect detection, sentiment detection and he was also, as far as I can
tell, really the first person to start research on fake reviews and the problem
of review spam. And you might actually even have seen some public media
attention to that problem. It's obviously a big one; as we were just
discussing, it can really affect a small company tremendously.
So this is what Bing's going to be talking about and also this is part of a
book that just came out on sentiment and opinion mining, which is sort of an
overview of the field.
>> Bing Liu: Thank you very much, yeah. So this problem is getting pretty bad
now. Previously, it was not as bad, but there are these so-called reputation
management companies that have started doing this, writing things on the web.
And opinions in social media have been useful for quite a while. About three
years ago, there were already some market research companies doing surveys on
what percentage of people read reviews when they buy something or use some
service.
And I think three or four years ago it was about 75 percent, and now it's
getting a lot bigger. Also, the number of companies doing analysis of reviews
has increased quite dramatically, and they also use opinions from other kinds
of forums, Twitter and blogs. I think McKinsey did a survey some years ago,
also about three years ago, saying the U.S. has about 30 to 60 companies
working on that. And this May, somebody else did a survey that said there are
now about 350 companies doing something related to this, getting opinions and
reviews from blogs and from Twitter.
So we use this for decision making in purchasing or in elections, and also, of
course, for marketing and branding. And afterwards, after you've seen what
people are saying, you want to take some sort of action. This gives people a
very strong incentive to try to promote their products or services, and this
has basically become a business in itself.
Previously, it was mostly individuals writing the fake reviews. In recent
years, companies have gotten middlemen in India to write them, and now even in
the U.S. small companies, reputation management companies, are doing the
writing of fake reviews.
We started studying this particular problem in the 2005, 2006 time frame.
Actually, it was related to a project with Microsoft in the beginning. What we
were trying to do with Microsoft at that time was to figure out whether people
on different websites were the same person or not. We did some surveys on what
kind of names people use on different websites. Later on, we became more
interested in this particular topic of detecting what we call opinion spam. So
we first started with the opinions in reviews and how to detect fake reviews.
Okay. So there are lots of people who write these fake reviews, which we call
opinion spam. Some write undeserving positive reviews to promote products, and
some write unfair, malicious negative reviews to target some products and try
to demote them.
As I said, it has become a business in recent years, customers are getting
worried about it, and lots of things are being reported on the web. There are
lots of reports; this is just a subset of them. On my website, I compiled
quite a few links.
The first one, Amazon Glitch Unmasks War of Reviewers, was from 2004. What
happened was that Amazon Canada made a mistake with a system and released the
real names of the reviewers, and then people discovered that book authors
write reviews for themselves, even some very famous ones. When a reporter went
to confront them, they said, oh, everybody does it. And there was one woman, a
writer, who was even complaining about her friends: she had asked her friends
and family to write reviews, and one friend only wrote, oh, this is a good
book. She was like, what kind of friend is this? Only writing a single line
for my book.
So that was a very early report.
And the second one was about a company selling printouts on Amazon. They gave
the customer the product for free if you wrote a positive review, and it was
small, costing about ten dollars, so you can see that's two dollars per star.
Suddenly you could see a burst of reviews from the customers.
Of course, this kind of thing has been done for quite a long time, especially
by hotels: they give you a little bit of a discount and ask you to write
something positive. And another one, published recently, was about a person
who wrote these fake reviews and talked to the New York Times.
And then another one, which is where it gets interesting: Google Places has
lots of fake reviews. In this particular case, a reporter went to investigate
something he felt was fishy, and it turned out it was being done by some sort
of reputation management company.
Then he went to look at the businesses in his surrounding area and found out
around 60 of them were using this particular service. Another interesting one:
ABC 7 News in Colorado or somewhere also found a company that was writing a
lot of reviews about a few small businesses -- lots of people, each only
reviewing those few businesses.
So when you have a reputation management company that writes fake reviews for
you like this, they're actually fairly easy to detect. I want to talk about an
algorithm which can detect this type of situation. There are also services you
can find on the web; if you search, you'll find lots of companies that will do
it for you. Some are expensive, some are much cheaper.
People have also put out some interesting clues to help individual users
figure out whether a review is fake or not. But that is really hard. We did
try some experiments with students, and it was very, very difficult, almost
impossible to figure out, okay?
What we did was ask students to write some fake reviews on a product they're
familiar with, and then fake reviews on a product they have never used and
know nothing about. The second kind is relatively easy to detect. If they know
the product, it's almost impossible to detect. Very, very hard to detect.
Then there are sock puppets, where one person registers multiple -- yes, go
ahead.
>>: I have a question about the experiment.
>> Bing Liu: All right.
>>: The students who knew about the product, so like how fake they could, or
did you ask -- did you give them some [indiscernible] as to how they might want
to fake the review?
>> Bing Liu: Yeah, I did.
>>: [indiscernible] or some kind of feature they didn't like that they say
they liked? That kind of --
>> Bing Liu: I did not give them any further instruction except to say just
try to write it so it looks and sounds like it's real. That's the only thing I
said.
>>: But if they knew about the product, right?
>> Bing Liu: Yeah.
>>: And it could be their own opinion, right? How can you --
>> Bing Liu: What I'm saying is, they know that category of product, for
example, cameras.
>>: But have not reviewed that product.
>> Bing Liu: Yeah, it's not your own review of that exact product. But the
category -- for example, cameras, or games, those kinds of things -- they
know, so they can write pretty real stuff.
And companies are doing it too, and in China it's very bad. I've talked to
some companies that host reviews. They say if you don't remove these guys, the
reviews are useless; probably 50 percent of them are fake. Quite bad.
This was something somebody put on the web: Belkin International asked people
on the web to write reviews for them. Use your best possible grammar, write in
U.S. English, always give a 100 percent rating, as high as possible, and keep
your entry not too long. This is a real one, okay; it was caught by somebody.
So who writes these fake reviews? Business owners do quite a lot: restaurant
owners, book authors, songwriters. These two are particularly bad, the book
authors -- now everybody can publish a book if you pay some money, so they
have to promote their stuff -- and also the small bands, who have songs and
have to promote them. Restaurant owners, also quite a bit.
But now other types of small businesses do it too. Just now I was talking to
Michael about this. A few weeks ago, a dentist from Charlotte, North Carolina
sent us an email. What happened is he owns three practices, and one of the
practices has one dentist who got a negative review. The person who contacted
us, the owner, went to talk to that dentist and confronted him: what's going
on? How can you get something so bad? Did you treat a patient very badly?
So they got very angry with each other. Then a few days later, his other two
practices got very nasty negative reviews. Those practices had been there for
a long time with no reviews at all, and then suddenly two very nasty reviews
showed up.
So then the owner was very angry; he thought, wow, this must be from the other
guy. What happened is we did an investigation, and we found that those two
reviews were written with two different IDs, but they were written on exactly
the same day, and each of those two people also wrote another review on that
same day, and both of those other reviews were about food -- one about
cookies, the other about crackers or something like that.
So it's very interesting. It looks like these two must be the same guy. The
writing was pretty good and very nasty, saying this doctor just wants to
charge you, he really has nothing to do, he's just going to wash your teeth
[indiscernible] do something else, and the cupboards are full of coffee. It
was very bad.
So this dentist got very upset. He contacted Yelp and did all kinds of things
to get the reviews removed, saying these must be from the same person, but
Yelp didn't do anything, okay. They didn't do anything.
And while we were talking about this, a few days later suddenly a few positive
reviews showed up. The negative reviews were still there, and then suddenly a
few positive reviews appeared. Before that, there had been no reviews at all.
So my guess is those positive reviews probably came from the dentist himself
or from his friends. At least two of them were so obvious, written so
formally: Doctor so-and-so, full name, has been very caring, treats us very
well. It's very interesting. So now I really don't know who's telling the
truth -- whether the first review was written by this dentist himself or by
somebody else who just doesn't like the doctor. What is going on?
So it's very interesting. Anyway, this is one case, and there are also some
other cases.
>>: [indiscernible].
>> Bing Liu: Yes? Okay. Yeah, we looked at the apps.
>>: [indiscernible] apps as well?
>> Bing Liu: Yeah, but that's just some of the results; many of them are
written by the person who actually wrote those apps.
>>: But [indiscernible] more focused on the [indiscernible] to receive a fake
review, or it's all over the place because all the [indiscernible] seem
similar.
>> Bing Liu: Yeah, we looked at many, many categories, and they are fairly
similar. The only thing is that very big, pretty reputable companies tend to
have a little bit less.
But even that doesn't seem to be very true, because we thought that was the
case, okay, but then we investigated Yelp, and there are lots of very
reputable restaurants with lots of reviews, and they have lots of fakes as
well. What happens is they want to maintain a kind of steady state, to make
sure there are always some reviews coming in -- not, oh, the last review was
two years ago; what happened in those two years, did nobody come to the
restaurant? So they do fakes as well.
Another case my students found was an Apple app. I think there was one Chinese
guy who wrote a bunch of apps, and every time he wrote an app, there would be
five reviews saying good, good, good, good, good -- exactly the same five guys
every time, saying it's a good app, all five stars. We also talked to Google,
which has the Android apps. They are also trying to do something but haven't
done anything yet. So there are the freelance individuals, the middlemen, and
even customers who get a discount to write fake reviews.
And there are quite a number of random people who just write them for fun. For
example, they see lots of good reviews and think, oh, I'm going to write a bad
one. There are some guys like that.
So here is one of the reviews. You can take a look and decide whether you
think it is fake or not, okay? These were written by students in my class.
They're all [indiscernible] students, not undergraduate students.
>>: So that's fake, because of the way it regards Royal Caribbean. No, that
was just a joke.
>> Bing Liu: You think so? What do you think?
>>: It's fake. It's all kind of abstract. It's very good, people are nice.
Doesn't say anything specific.
>> Bing Liu: What about this one?
>>: That's true.
>> Bing Liu: Okay. This one true?
>>: Skeptical.
>> Bing Liu: You got it wrong. The previous one is real; this one is fake.
What about this one? This one I got from the internet. I was trying to show
students some reviews, I opened up PriceGrabber, somehow got this page, and I
saw this review. What do you think about it? It sounds fake, but I don't know
whether it is or not. Sounds fake -- this guy seems to know the shop too well,
right? This one I don't know about. But these two I do know, because they were
written by a student.
She did go to Royal Caribbean, that was true; she went with her family. And
for this one, she said this restaurant is not far from her home, but it's
expensive and she has never been there. She knows the restaurant, and it
really looks real. I would have thought this one is more likely to be real,
because she says something specific: however, I'm not the one for clubbing and
drinking; also, that night was pretty slow for me because there was not much
else I could do. I thought that sounds real, right? And this one seems more
likely to be fake -- he doesn't say anything bad about the place.
Anyway, it's very difficult to really figure out, and that shows the
difficulty of this particular problem: how do you detect them? The first
problem is that when you see a fake review, you don't know it, which makes it
very difficult to build models. If you want to build a model, you have nothing
to evaluate it against.
And also, logically, if you only look at the content, this is an impossible
problem to solve. For example, I can stay in a hotel, have a great, real
experience, write a review about it, and then post it for somebody else's
hotel. So a review can be fake and truthful at the same time, because hotels
are all similar, right? If I don't mention a particular location, there's not
much of a difference.
So in practice, if you think about it logically, it is impossible to solve
this problem from the content alone; you have to look at the behavior of the
person who wrote it, okay. So we studied this problem from 2006. We crawled
pretty much all of the reviews on Amazon at that time, or at least a big chunk
of them. We crawled for about two months and got 2.8 million reviews on 1.2
million products. Amazon has more products, but many of them are just a change
of color -- postcards, you know, you change the color or the image of a
postcard, so there are lots of those. But most of the postcards have no
reviews.
The reviews all come with a product ID, reviewer ID, rating and date. These
are the standard things you've probably seen yourself. So this is an
interesting plot of the number of reviews per reviewer. There are a few
reviewers who have written a lot of reviews. Yes?
>>: Did you also collect whether that person was an Amazon validated -- if it
was Amazon validated?
>> Bing Liu: Yes. At that time, there were not so many. Now there are many
more. And also [indiscernible] investigate their own reviewers: when they have
a new product, they can send it to the person for testing and writing a
review.
So there was one guy who actually wrote more than 15,000 reviews. But now
there are a few guys with more than 20,000 reviews. We are trying to
[indiscernible] of Amazon. We counted that this guy would have had to write
ten reviews a day, on average, since the day Amazon started business. His
reviews are all about books. All about books.
And this is the products. Again it's a similar distribution: a very small
number of products got lots and lots of reviews. On average, I think we
computed less than one review per product. And, of course, there are lots of
tiny products we did not count in the 1.2 million; the 1.2 million are the
products which have reviews.
So this is an interesting graph: what is the distribution of the different
review stars? You can see almost 60 percent are five stars, and then about 20
percent are four stars. So if you consider these two as positive reviews, it's
almost 80 percent positive, which is a little bit strange, because people
normally don't do that. When you are reasonably happy, you're probably not
going to write a review; it's only when you're really unhappy. For example, I
myself have written only one review in my life, because I was very unhappy
with a cell phone, and that was quite a long time ago.
So this is something very fishy. What is happening? Why is this happening?
So we did some study on this. There are a few types of spam reviews. Besides
the fake ones, there are also quite a number of reviews which only comment on
the brand. They'll say, oh, I don't like HP, and I've never bought anything
from them. This kind of review is also very biased.
And then there are the non-reviews. There are a few types of non-reviews: some
are advertisements and some are just random text. These two types are fairly
easy to detect; you can do some analysis using the product descriptions
together with other information.
Here I can give you an example of what a fake review is.
>>: [indiscernible].
>> Bing Liu: At that time, Amazon was, I think, 95 percent like that. Anybody
can write it.
>>: [indiscernible].
>> Bing Liu: No, they did not. No. They might have bought it, but they don't
really have to verify it and say, I bought this.
>>: [indiscernible] that this person bought this product?
>> Bing Liu: That information is actually shown on the site; it was called
Amazon Verified Purchase. But it's a very small proportion. Very, very tiny
proportion.
Here we're trying to identify which reviews are really harmful. For example,
suppose we have some way to find the good quality products, the bad quality
products and the average quality products, and you can look at positive and
negative reviews for each. If it's a good product and you write a fake
positive review, it's not so bad, okay; it happens, but it's not so harmful.
Also, if it's a bad product and somebody writes a fake negative review, that's
not so harmful either. But these other cells are pretty harmful. This is
essentially one of the clues people normally use: what is the deviation of
your review from the other guys'?
And so, in terms of detection, this will be a focus as well: trying to detect
these.
Now, there are two types of spammers. There are many individuals who just
write reviews by themselves, for example restaurant owners and those kinds of
people, and they don't really work with anybody. And there are also group
spammers, okay, which you see especially when a small business, or a single
person using multiple user IDs, is involved. This is what the reputation
management companies are doing quite a bit of, and it's getting a little bit
more popular than the individual situation.
As for the types of data you can use in this domain: the first thing is the
actual review content, which includes the title and the review text. Then
there is the metadata, or what somebody called the site data: the reviewer ID,
the star rating -- these are not text -- and the time when the review was
posted. I marked these as public; this is the public data which everybody can
see.
There is also some private data, which is basically internal data the website
collects: the IP address, MAC address and cookie information. From that, they
can somehow figure out the location you write from or the location you come
from, and there is also the sequence of clicks. This becomes sort of a web
usage mining domain.
I talked to some companies who do this detection, and they said that private
data is very useful. For example, if somebody writes a review after coming
directly to that product page, it is quite likely to be fake, because the
company soliciting the fake reviews will say, oh, anybody who wants to write a
review for us, here's the link, we'll give you something. So many people who
write a fake review click that link and go straight there. People who directly
post like that are fairly likely to be fake.
Another very interesting thing is that when people write reviews, especially
owners writing reviews, you'll see the same [indiscernible] page visited all
the time, and then one day a positive review appears. That is not quite right,
okay. If you're not the owner of this restaurant, you are unlikely to come to
this restaurant's page all the time and never go anywhere else. So we can use
this to detect them.
All this information is very, very useful. Another interesting example: if you
have a hotel and all the reviews seem to come from the surrounding area,
that's not quite right, because local people are not going to stay at a hotel
for no particular reason.
There are also text clues, but those are very difficult to use. For example, I
heard from one company about a guy who wrote many reviews; at one point he
wrote, my wife loved this product, and a few days later, oh, my husband loved
this product. Those things are not easy to detect from text without deep text
analysis.
There is also quite a bit of information, private and public, about the
products. You can use the product descriptions to check whether the reviewer
is writing something related, and also when the product was sold, which is
pretty useful as well, and the sales volume and sales rank -- especially the
sales volume.
For example, Amazon can do this very easily. You see a product, reviews come
in, and you look at what's happening with the sales volume. Normally you have
a reasonable idea of the proportion of buyers who might write reviews, and
suddenly this product has far more reviews than that.
But the bad thing is, although these companies could do a lot to detect it,
they don't do it. Amazon is not doing much unless they get caught by, for
example, the New York Times, and then they go and remove it.
The reason is that these five-star reviews are good for them; they help sell
products. If you go to the website and see no reviews, you're not going to
buy, right? If you see something negative, forget about it. If you see
something good, wow, this is probably a reasonable product. So the retailers
are not doing much, okay. They're not doing much.
We also have the reviewer information. You have the profile, which in many
cases is not so trustworthy. This can be public, and there is also some
private information; for example, Amazon has lots of private information,
credit cards and so on, and whether you have purchased before. And, of course,
there are the reviews the person wrote: you can analyze the string of reviews
they have been writing, and what kinds of products and services they wrote
about.
So the first work we tried to do was, in the beginning, really, really hard;
we couldn't figure out how to approach it. Then I asked one of my master's
students, why don't you just go and see whether you can manually find
something we can use? If you can manually find some, we'll spend some time to
find more and use that for training.
The second day, he came back to me: oh yes, I found many. What happened is he
found all these reviews that were very near duplicates, with just a little bit
of change. He found me quite a number of them. You can see the guy changed a
little bit at the end, a few words at the beginning, a few words somewhere in
the middle.
Then we said, oh, that may be interesting; we can do this automatically. You
can just have an algorithm do the comparison, and then you can find a lot of
things like this.
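To give a flavor of that comparison step, here is a minimal sketch, not the
exact procedure from our study: it turns each review into word-bigram shingles
and flags pairs whose Jaccard similarity passes a threshold as near-duplicates.
The shingle size and the 0.6 cutoff are assumptions for illustration.

```python
# Minimal near-duplicate check (illustrative only): word-bigram shingles plus
# Jaccard similarity; the original study's exact measure and cutoff may differ.
def shingles(text, n=2):
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a, b):
    return len(a & b) / len(a | b) if (a | b) else 0.0

reviews = [
    "Great camera, sharp pictures, battery lasts all day.",
    "Great camera, sharp pictures, battery lasts all day long.",
    "The lens felt cheap and the autofocus hunts in low light.",
]

THRESHOLD = 0.6  # assumed cutoff for calling two reviews near-duplicates
for i in range(len(reviews)):
    for j in range(i + 1, len(reviews)):
        sim = jaccard(shingles(reviews[i]), shingles(reviews[j]))
        if sim >= THRESHOLD:
            print(f"near-duplicate pair ({i}, {j}): similarity {sim:.2f}")
```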
And then we found lots of interesting things. For example, for some products
the same guy wrote seven or eight reviews about the same product -- the same
camera. Why on earth would you want to do that? Everything is all positive. If
a few weeks or a few months later you feel the camera is no longer as good as
you thought, you would go back and change your review; there's no point in
writing so many of them, all positive, all good, okay.
So this is very interesting, and then we did this study. We did not consider
duplicates with the same user ID on the same product written on the same day,
because it might just be that you clicked the submit button multiple times.
Those are not really reliable, okay, not reliable.
Another kind which is not quite reliable is Amazon's copied reviews -- not
exactly copied reviews, but different versions of a product sharing the same
review. For example, you have a pair of shoes in blue and in red, and they
share reviews. You have to remove those, and they are easy to remove because
they have the same review ID, okay, the same ID.
Then we had a set of them. We only worked on one category of products,
manufactured products, and there were quite a few [indiscernible] of these
things. We used that as our training and test data.
Then we used logistic regression as the model. So what features did we use?
The review-centric features are essentially the text n-grams and the rating;
we used the ratings as well. I think we only used up to bigrams, okay,
probably bigrams. And the reviewer-centric features are about the reviewers
and their different sorts of unusual behaviors -- for example, a person who
wrote many reviews which are the first review of the product, which is fishy
-- a bunch of these things.
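As a rough, hedged illustration of that setup, and not the exact feature set
from the paper, the sketch below trains a logistic regression on unigram and
bigram text counts plus two simple metadata features (the star rating and a
first-review flag) and reports AUC. The toy data and the particular behavioral
features are assumptions.

```python
# Sketch: logistic regression over text n-grams plus simple metadata features.
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# toy rows: (text, rating, is_first_review_of_product, label) -- label 1 = spam
rows = [
    ("great camera works perfectly buy it now", 5, 1, 1),
    ("the lens is soft in low light so i returned it", 2, 0, 0),
    ("great camera works perfectly buy now", 5, 1, 1),
    ("solid build but the battery life is only average", 4, 0, 0),
]
texts  = [r[0] for r in rows]
meta   = np.array([[r[1], r[2]] for r in rows], dtype=float)
labels = np.array([r[3] for r in rows])

vec = CountVectorizer(ngram_range=(1, 2))                 # unigrams + bigrams
X = hstack([vec.fit_transform(texts), csr_matrix(meta)])  # text + metadata

clf = LogisticRegression(max_iter=1000).fit(X, labels)
print("AUC:", roc_auc_score(labels, clf.predict_proba(X)[:, 1]))
```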
>>: [inaudible].
>> Bing Liu: Group spam?
>>: Yeah.
>> Bing Liu: I'm going to talk about that later. Group spam is now getting
very bad because of the small businesses and the reputation management
companies doing this; they have to do group spam, okay.
So here we're looking at the reviews; we're not looking at the reviewers yet.
We did classification and got an AUC of 78 percent predicting the fakes, that
is, the duplicate and near-duplicate reviews, which is reasonable. It's not
easy, okay. This is obviously a very difficult problem.
>>: A 50/50 split?
>> Bing Liu: No. I don't remember the exact numbers, but I remember the
distribution roughly followed the natural distribution, yeah. Because doing
50/50 is not reliable. For example, people do 50/50, but under the natural
distribution -- say the proportion of spam is only about 10, 15, 20 percent --
with 50/50 the accuracy is always very high, and when you go to the low
natural proportion, it's going to drop dramatically.
So as data miners, we do this roughly at the natural distribution. Of course,
it's also interesting to ask: is it reasonable to treat the duplicates as spam
reviews? We thought it was probably reasonable, because what we're doing turns
out reasonably okay. It's not that bad, okay.
After we got this result, we analyzed some other types of reviews. Negative
outliers -- essentially reviews that are much more negative than the others --
tend to be heavily spammed. And reviews that are the only review of a product
are also likely to be spammed. The second one is obvious; the first one is not
so obvious. I would have thought positive reviews are more likely to be
spammed, because from my experience -- I talk to my students, for example, I
have lots of students from India, and they do it. They write them.
Last year, I had two students who said it had been done before -- one guy said
he did it himself, the other guy said his friends did -- so I'm not sure. And
this year, in the first class, I asked, has anybody done it, written reviews
for a website? Two raised their hands and said they had done it themselves.
Two or three weeks ago, I asked them again: no hands. The first time I asked,
two people raised their hands. Now that they know I'm doing this kind of
research, they probably don't want to get into it.
Also, spam reviews can get quite good helpfulness feedback. If you write
carefully, you can get fairly good feedback. One of my students wrote a camera
review that was completely fake -- he wrote it very nicely, very detailed, but
he never used the product -- and it got lots of helpful votes. It was
completely fake.
Which basically means the helpful/not-helpful votes are not very useful. And
just like web spam, the feedback itself can be faked too.
So we don't really know whether these observations are true. We have no way to
get the ground truth, so we depend on the duplicates, and then we test on the
other reviews and think these might be the case.
There are a few more; we have some charts showing the situation.
On supervised techniques, other groups have recently been doing this too. Some
people used Mechanical Turk to have fake reviews written, but it turns out,
from some analysis we did, that these are not really true fakes.
And this other group labeled the data themselves; they used Epinions data to
label it. They did pretty much the same kind of thing, but the first work is
much simpler, just bigrams, nothing else, while this one used bigrams, opinion
sentiment and every other kind of information. They labeled it themselves
using the Epinions data -- Epinions has the trust relationships, the comments
and everything -- and they got a bunch of people to read the reviews and judge
which ones are probably fake. I mean, nobody knows exactly, okay, nobody knows
exactly.
So those are supervised techniques. There are also many unsupervised
techniques. Here we essentially try to go behind the scenes to see what the
behavior of the person is, and also what the behavior of the reviews is.
Somehow we try to uncover whether there are any interesting secrets in the way
a person behaves strangely. The first two pieces of work we published in 2010,
and then we stopped this research. Actually, in 2009 we didn't really do much,
because we found it really, really hard. Every time we submitted a paper, we
got the same comments: how do you know this is fake? Are you sure you're doing
it correctly? How do you know? I would get very, very upset, always very
disappointed.
So we sort of stopped this research. Then I was in Singapore giving a talk on
this particular topic, and there was a person doing something about research
paper reviews -- not paper ranking, but analyzing the reviews of research
papers. They had been doing that kind of thing and wanted to try it on product
reviews, and it's the same kind of rating setup, not much of a difference.
And if you're looking at the behaviors, you don't really have to look at the
content, not the text. So we came up with a few things and did this work, and
my students also worked on something else which I'll briefly introduce. The
key idea is to try to identify some unusual behavior patterns. They could be
patterns of reviewers or patterns of reviews.
What we're doing is trying to find unexpected [indiscernible], so this is
basically a data mining type of thing. The first one, in this particular
paper, was about targeting. We try to determine whether products are spammed
or not spammed, so you're looking for spammers targeting some specific
products.
You could also be targeting a group of products, for example a brand, so the
whole brand is getting faked. There are also rating deviations -- this is the
same type of problem you have when rating paper reviews, figuring out how the
ratings deviate -- and early rating deviation, where people write things right
at the beginning, and then the product's rating gets lower and lower.
Now, this actually happens. My [indiscernible] student, who is from India,
said his friend is doing this. His friend is in India working for one of these
reputation kind of companies, and he was in charge of a product. That means,
no matter what, you've got to keep this product positive: either you write
reviews or you watch what other people write, and you make sure it stays
positive. So normally he'll start with something and then watch -- hm,
somebody wrote a negative -- and he'll make up for it: he'll start to write,
and he can create quite a few accounts to do this type of thing.
This is true, because I verified it; the student said his friend is doing it.
He himself didn't do it, he says, because it's too troublesome -- you've got
to watch the thing all the time over a long period -- and it's not that
interesting.
So from this bunch of scores you can do different combinations, for example a
linear combination, and finally do a ranking of the reviews, and then we built
a user interface to let users do an evaluation. Again, we have no ground
truth; the users do the evaluation, looking at these kinds of features.
This next work was pure data mining, in the sense that we try to find
interesting patterns. We create a database whose records have the reviewer ID,
the brand ID -- Amazon also has brands -- the product ID, the individual
product under the brand, and a class. It looks like a classification type of
database. The class is positive or negative: four or five stars is positive,
three and below is negative.
Then we try to find what are called class association rules. My previous
slides were mostly about other data mining; this is what we call class
association rules -- not exactly classification, but trying to find all the
rules. Classification only finds a subset of the rules, just enough for the
purpose of classification; if it can classify, that's good enough. But this
type of algorithm tries to find all of them, all of the rules. For example, a
rule might say: reviewer one, brand one, is positive. That's what we call a
class association rule. Then, how do we know whether something is expected or
unexpected? We have to define typical behaviors somehow.
Let me just give you some examples; if you're interested, in the paper there
are quite a few definitions. For example, in this case, a reviewer wrote all
positive reviews on products of one brand but all negative reviews on a
competing brand. You can see that when you have one condition -- this reviewer
equals one, then positive -- the measure is the conditional probability: given
reviewer one, what is the probability of a positive review? In data mining
this is called confidence, and here it's 60 percent.
Then when you extend it with one more condition, you get 100 percent, for
example. Then something is fishy.
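Here is a tiny sketch of that counting, with made-up records and assumed field
names; it only shows confidence as a conditional probability for one- and
two-condition rules, not the full rule miner.

```python
# Confidence of class association rules as conditional probabilities over a
# toy (reviewer, brand, class) table; a real miner would enumerate all rules
# above a support threshold.
rows = [("r1", "b1", "pos"), ("r1", "b1", "pos"), ("r1", "b1", "pos"),
        ("r1", "b2", "neg"), ("r1", "b2", "neg"), ("r2", "b1", "neg")]

def confidence(condition, cls):
    """P(class | condition); condition maps attribute name -> value."""
    idx = {"reviewer": 0, "brand": 1}
    match = [r for r in rows if all(r[idx[a]] == v for a, v in condition.items())]
    return sum(1 for r in match if r[2] == cls) / len(match) if match else 0.0

print(confidence({"reviewer": "r1"}, "pos"))                 # 0.6
print(confidence({"reviewer": "r1", "brand": "b1"}, "pos"))  # 1.0 -- the fishy jump
```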
But how do you find this thing?
It turns out you can define this probabilistically. In this particular case,
we can use entropy: you compute the entropy of the class, and also the entropy
when you split on the extra attribute, say multiple brands, just like a
decision tree branching on the brand attribute. You can compute the entropy on
those extra attributes, and you can see the entropy changes dramatically, so
you get a sort of information gain that tells you something strange is going
on. Okay.
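For reference, the decision-tree analogy corresponds to the usual entropy and
information-gain quantities below; the exact unexpectedness definition in the
paper may normalize these differently.

```latex
% Class entropy, and the gain from splitting one reviewer's reviews by brand B;
% a large gain means the reviewer's ratings are unusually tied to the brand.
H(C) = -\sum_{c \in \{+,-\}} \Pr(c) \log_2 \Pr(c), \qquad
\mathrm{Gain}(C; B) = H(C) - \sum_{b} \Pr(b)\, H(C \mid B = b)
```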
Then you can analyze further: for example, other people don't think these two
brands are great, so why does this guy rate them so high? That's the entropy
side. Another measure you can use is what we call confidence unexpectedness.
For example, you have the rule: reviewer one, brand one, is positive. The
support is actually the joint probability -- the proportion of data records in
the database that contain all three of them, the conditions and the
consequence.
So both the conditions and the consequence are contained. In this case you can
define unexpectedness fairly easily, probabilistically. I don't really have it
on the slide, but the idea is: you have the one-condition rules, with their
joint and conditional probabilities, and you ask what would happen if the two
conditions had nothing to do with each other -- if they were completely
independent. Assume conditional independence and compute what the confidence
would be if they were really unrelated; that is essentially the formula. Then
you compare with what actually happens when they are not independent --
essentially you compute how far they are from independent. And with this, you
can find many reviews which are probably not right, okay, not right.
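Since the slide formula isn't in the transcript, here is a hedged
reconstruction of the idea: estimate what the confidence of the two-condition
rule would be if the conditions were unrelated, and measure how far the
observed confidence exceeds that.

```latex
% Expected confidence of a two-condition rule a_1, a_2 -> c under the
% independence assumption, built from the one-condition rules:
E\!\left[\Pr(c \mid a_1, a_2)\right] \;\approx\;
  \frac{\Pr(c \mid a_1)\,\Pr(c \mid a_2)}{\Pr(c)}
% Confidence unexpectedness: relative lift of the observed confidence over it.
u(a_1 a_2 \to c) \;=\;
  \frac{\Pr(c \mid a_1, a_2) - E\!\left[\Pr(c \mid a_1, a_2)\right]}
       {E\!\left[\Pr(c \mid a_1, a_2)\right]}
```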
You can also define support unexpectedness, which is a different measure:
reviewer one, product one, positive -- and there are five of them, five
positive reviews. This one was discovered automatically, and it was very
interesting to find: how come the same guy is writing many reviews on the same
product? You go and look at it, and the guy basically changed a little bit and
posted quite a few reviews on the same thing. Again, you can define this
probabilistically -- I don't really have it here -- and with that you can
define what is unexpected.
And this can be quite flexible, because it's not something we come up with
manually ourselves; it's completely mined from the data. You can also find
brand-to-product relationships -- all kinds of relationships you can discover
-- and define for each what is expected and what is unexpected. Okay.
We also did some manual evaluation on this. We also did detection using a
graph, where you need some sort of relational model. And this is a different
type of review data: a set of ratings from resellerratings.com, which are
short store reviews.
Store reviews are a little bit different from product reviews. For example, if
you buy a camera, you should probably write one review about that camera, but
for a store you can write multiple reviews: I've been there a few times, and
every time my experience is different.
So we crawled the whole site. Why did we do this? At that time Google was
interested in these things -- Google was probably working with these companies
-- so they were interested in this particular site. We crawled the whole thing
and had the data mining class work on it.
Wow, some guy wrote 30 or 40 reviews within half an hour.
Anyway, that was easy to detect. We represent this as a review graph, in the
sense that you have [indiscernible] nodes; it's a heterogeneous graph. You
have the reviewers, they have reviews, and you have stores. You use this graph
to capture the relationships among these three [indiscernible] entities. And
we define a few concepts.
The trustiness of reviewers, the honesty of reviews and the reliability of
stores. You can see all these things are related to each other. For example, a
reviewer is more trustworthy if he or she has written more honest reviews. A
store is more reliable if it has more positive reviews from trustworthy
reviewers. And a review is more honest if it is supported by many other honest
reviews.
So you can see this is sort of like PageRank or the HITS algorithm; the
quantities are mutually related, and you can model them in a relational
manner.
We use a logistic function just to make sure the values stabilize somewhere.
This equation is about honesty, and all these things are related; this one is
the reliability. You can see the relational definitions going on here. And
then you can run this with power iteration: you iterate many times and the
values stabilize.
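To make the iteration concrete, here is a minimal sketch of the
mutual-reinforcement idea. The update rules, the logistic squashing and the
toy data are simplified assumptions, not the exact equations from the paper.

```python
# Sketch: reviewer trustiness, review honesty and store reliability reinforce
# each other and are updated with power-iteration-style passes until stable.
import math

# each review: (reviewer_id, store_id, rating mapped to [-1, +1])
reviews = [("r1", "s1", 1.0), ("r1", "s2", 1.0),
           ("r2", "s1", -1.0), ("r3", "s1", 1.0)]

trust  = {r: 0.5 for r, _, _ in reviews}         # reviewer trustiness
relia  = {s: 0.5 for _, s, _ in reviews}         # store reliability
honest = {i: 0.5 for i in range(len(reviews))}   # review honesty

sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))

for _ in range(20):  # iterate until the values roughly stabilize
    # a review is more honest if its rating agrees with the store's reliability
    for i, (r, s, rating) in enumerate(reviews):
        honest[i] = sigmoid(rating * (2 * relia[s] - 1))
    # a reviewer is more trustworthy if his or her reviews are honest
    for r in trust:
        scores = [honest[i] for i, (rr, _, _) in enumerate(reviews) if rr == r]
        trust[r] = sum(scores) / len(scores)
    # a store is more reliable if trusted reviewers rate it positively
    for s in relia:
        relia[s] = sigmoid(sum(trust[r] * rating
                               for r, ss, rating in reviews if ss == s))

print(trust)
print(relia)
```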
In this particular evaluation, we tried to compare with the Better Business
Bureau -- the U.S. has a Better Business Bureau -- and you can see what they
say versus what our ranking [indiscernible] says, and it's fairly consistent.
If the Better Business Bureau says a store is a good guy, we're giving it a
fairly good score too.
The students also tried to go to the web and search for each particular
company to see whether people trust that company or not.
So here is how we detect groups -- group detection. Group detection is
actually easier. For individual spammers, the pattern is not so obvious, but
when people do things in groups, the pattern becomes fairly obvious.
Previously, we tried to manually label individual reviews as fake or not,
which was very, very difficult. But if you try to label groups, it's
relatively easy, because of the collective behaviors.
All right. So in this case a group of people work in collusion; they write to
demote some products, and such spam can be quite damaging -- they can take
total control. In this case we have an algorithm with three steps.
The first step is to find frequent patterns, because you can see whether the
same bunch of guys has been reviewing multiple products, and that is fairly
easy to detect with frequent pattern mining.
Some of these are just coincidence -- for example, for Apple products there
might be a bunch of guys who all review the same ones; that is probably okay,
they are very popular products. So then you have a set of features, which are
essentially indicators or clues.
Then we also use a relational model to rank the groups. In this case, we had a
bunch of people actually label the data, because for groups it was not that
difficult to label. And this was the story from ABC that I mentioned: one of
their reporters discovered a chain of positive endorsements among a group of
six metro area businesses, ranging from auto care to others, and ABC found 50
Google user accounts that only posted reviews about those same businesses.
So they behaved like a group: the same people posting about the same set of
businesses. This fits group detection exactly. Another one, which I mentioned
a little earlier, was a similar case. This is one of the things we discovered:
it's about three different pieces of software, and this is a person called Big
John, and this is for real.
This guy reviewed only these three products at that time, all five stars, and
the dates are December 4, December 4, December 4 -- the same day. For the
second person you see December 8, December 8, December 8, again exactly the
same products. And another guy, December 4 also, exactly the same products.
Actually there was a fourth guy as well. Four of them, and at that time they
reviewed only those three products.
If you look at the contents, they are very different, but this kind of
behavior, if you do data mining, is fairly easy to find.
The only problem, of course, is that when you do frequent [indiscernible]
mining, you need reasonably high support, which does cause some trouble. When
we did this analysis, it was only for the manufactured products, which was on
the order of 100,000 reviews a long time ago; now we crawl [indiscernible] a
few million. Then you've got to set the support very high, otherwise you can't
find anything, which is not good. And the number of [indiscernible] grows
exponentially. We are also trying to deal with that problem.
Then there's another problem. We were talking to companies, and some companies
tried to implement this algorithm. They said it's okay to run it periodically,
for example every few weeks, but is it possible to do it online, so that when
a guy writes and submits a review you can say, oh, this is no good? That is a
problem, because this kind of algorithm cannot run like that, in an online
fashion. One of the companies in the D.C. area was asking how you can do this
in an online situation.
The mining itself was fairly simple to do. For the frequent [indiscernible]
mining, for each product you get all the reviewer IDs; every product with its
set of reviewer IDs is what's called a transaction, and then you can mine the
transactions to find the bunches of guys who review several of the same
products -- not necessarily at the same time, but all reviewing the same ones.
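A tiny sketch of that first step, with made-up transactions: each product
becomes a transaction of reviewer IDs, and we keep reviewer sets that
co-review at least a minimum number of products. A real implementation would
use Apriori or FP-growth rather than this brute-force enumeration.

```python
# Sketch: find reviewer groups that co-review many of the same products.
from itertools import combinations
from collections import Counter

transactions = {            # product -> set of reviewers who reviewed it
    "p1": {"a", "b", "c"},
    "p2": {"a", "b", "c", "d"},
    "p3": {"a", "b", "c"},
    "p4": {"d", "e"},
}
minsup, group_size = 3, 3   # a group of 3 must co-review at least 3 products

counts = Counter()
for reviewers in transactions.values():
    for group in combinations(sorted(reviewers), group_size):
        counts[group] += 1

candidate_groups = [g for g, n in counts.items() if n >= minsup]
print(candidate_groups)     # [('a', 'b', 'c')]
```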
Then you also have a few clues to filter the candidates further. There is the
group time window: when a bunch of guys are doing this, they probably do it in
a similar time frame. There is group deviation: are their ratings similar to
each other and deviating from everyone else? And content similarity: are their
reviews similar? Also member content similarity, for example whether the same
guy writes similar things across different products, and the early time frame,
and so on. Then we can again compute relational models of how the groups
relate to the individual products, the members and the products, and the group
and the members -- especially the group and the members, because they are
related: if the group is bad, the members are probably bad. You can define
this mathematically, in a sort of heuristic way.
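For two of those indicators, here is a rough sketch with assumed field names
and normalizations; the actual definitions in the paper differ in detail.

```python
# Sketch: two group indicators -- how tightly in time the group posted on a
# product (time window) and how far its ratings sit from everyone else's
# (group rating deviation). Scales and field names are assumptions.
from statistics import mean

# member reviews of one product: (reviewer, product, rating, posting day)
group_reviews = [("a", "p1", 5, 100), ("b", "p1", 5, 101), ("c", "p1", 5, 103)]
other_ratings = [2, 3, 1, 2]   # ratings on the same product from non-members

def time_window(revs, scale=30.0):
    days = [d for _, _, _, d in revs]
    return max(0.0, 1 - (max(days) - min(days)) / scale)   # 1.0 = tight burst

def rating_deviation(revs, others):
    return abs(mean(r for _, _, r, _ in revs) - mean(others)) / 4.0  # 1..5 scale

print(time_window(group_reviews), rating_deviation(group_reviews, other_ratings))
```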
Then it ends up being matrix manipulation, and you compute a final ranking and
do an evaluation. This was a pretty good evaluation: we did manual labeling by
a bunch of judges.
And you can do this fairly accurately. So that's roughly what I've been doing.
My research was mainly focused on sentiment analysis, which also involves
Microsoft, and whenever I give talks on that, people always ask: you do this
sentiment analysis, but are you sure the reviewer is trustworthy?
So, okay, let me do something else and see whether I can try to catch some of
those people. And now, if you've seen the news, there's quite a bit of public
interest in this. In the next few weeks, some reporters still want to talk
about these issues. They're still not over this; they're still quite
interested.
We have been moving into other types of social media. For example, on Twitter
there are also lots of people doing this. The current methods are still fairly
simple, and we have not really spent a lot of time on this, because we got
very, very disappointed in the first few years: whatever we submitted always
came back with something not very good.
We always got comments about evaluation: how can you be sure, how can you be
sure. Even the WWW paper from this year we submitted two or three times. Every
time, my student said, okay, I'm going to label it. So he went and labeled it,
with eight of his friends, and then it got accepted. And it turns out groups
are much easier to label, because the behavior is so obvious.
So that was basically it. If you're interested, we have a chapter, a sort of
survey, that talks about this. Quite a number of people are working on this
now.
And this is going to be a problem forever. It's going to be, what is it called
-- you do something and then I've got to do something in response.
[indiscernible] whether you can. But this has to be dealt with in some sense.
I've been working with -- not working with, but talking to -- people in China
and here. They're all doing this. But none of them are the retailers; I've not
seen a single retailer doing this, because it's actually good for them to have
these reviews. And some other companies, for example American Express, are
trying to do this too.
What they are trying to do is quite interesting: they want to verify the
purchase. You have a credit card, so they have the credit card information. If
you purchase something and write something on Amazon, but you actually bought
it from Best Buy, they can figure that out. They talked to me, and they tried
to work with Yelp, but Yelp said, oh no, we're not interested in working with
you.
Now they say they have got a retailer; they didn't tell me which one. They are
also working with other credit card issuers. I said, if you just do it
yourself, it's probably not good enough; you've got to work with the other
credit card issuers to verify whether somebody purchased this particular
product somewhere.
Okay.
And that's roughly about it.
Yeah?
>>: It's a very interesting talk. So you talked a lot about how difficult it
is to detect the fake reviews right. Have you thought about looking at
internet reviews?
26
>> Bing Liu: Say again?
>>: So rather than detecting fake reviews, trying to detect [indiscernible]
reviews, like good reviewers and good reviews.
>> Bing Liu: Yeah, there are some sentiment analysis researchers working on
the quality or the utility of reviews. You can rank those things too. But the
problem there is, if a guy writes it really well, you don't know: they can
cover all the product features and give really nice comments, and so --
>>: [indiscernible] reviewers' behaviors, like the way that they looked at the
fake reviewers' behaviors, right?
>> Bing Liu: Yeah, we did some analysis. Especially if you read this paper,
the WWW paper, because we have labeled data this time, so we can analyze the
behaviors -- whether this kind of feature does somehow separate people,
separate groups. Yeah, we did some analysis on that. For individuals, we
couldn't do it, because you have no good data for that.
>>: And then do you have any idea of how to sort of raise the bar so that --
>> Bing Liu: Yeah, we --
>>: [indiscernible] economically challenging or how [indiscernible] --
>> Bing Liu: Yeah, yeah --
>>: Verify purchase is one thing, right?
>> Bing Liu: Yeah, yeah, yeah. That's what one [indiscernible] is saying.
Another thing is, I personally think this can be done if the companies do it
themselves, because what we are doing here uses completely public information.
If you bring in the private information, I think it's doable; at least you can
catch quite a few of them. For example, Yelp is doing that. Yelp is not doing
a great job in the sense -- I mean, they're doing quite a good job, actually,
but there are still lots of fakes out there.
For example, take the dentist situation I described: very clearly fake, but
Yelp did not filter it. Yelp did filter two of them. The first review, as I
said, about that particular doctor at one of the three practices -- that got
removed, filtered. The dentist asked us, how come that one is gone? I saw it
at first, and a few days later it disappeared. He thought the other guy must
have talked to Yelp, or paid money to Yelp, to get it removed.
I don't think so. I don't think so. But then another very positive review,
which I believe this dentist wrote himself, got filtered also. And there are a
few more still there.
>>: So [indiscernible] because, let's say there are other fake reviews that
people really believe, and after a while a lot of people are going to figure
out that these are not good products. So eventually, the product is going to
not have --
>> Bing Liu: Yeah, yeah, but it seems lots of people don't bother to write.
And also, the product may not be that great but may not be that bad, so you
probably don't go there and say, oh, I'm going to write something -- unless
it's really bad.
>>: Okay. It's more of local businesses?
>> Bing Liu: Yeah, yeah. For example, the Google Places now is a good place
people fake quite a lot, yeah.
>>: [indiscernible] if we just use individual [indiscernible] technique, one
person [indiscernible] identify to a new [indiscernible].
>> Bing Liu: For the individual ones, we don't know. We really have no idea.
We just do some evaluation on 100 or something like that; we can't go much
further because we do manual evaluation. The groups have very clear features,
I think, probably.
>>: But by [indiscernible] identify a set of fake reviews, [indiscernible]
versus the group and what is the individuals --
>> Bing Liu: We have not done that kind of analysis. That would be interesting
too, yeah. And people ask me to estimate -- for example, the New York Times
people asked me to estimate what percentage of reviews are fake. That is a
very difficult question. I told them 30 percent. Why? I have a reason, okay.
If you look at Yelp, they filter out about 15 percent. And I have a good
reason beyond that: I have also seen fake reviews that I know are fake that
did not get filtered out; they are still there. So I told them that, although
I can't really do this very accurately. And recently, [indiscernible] had a
report that said by 2014 it will be about 15 percent, and lots of people are
saying that's an underestimate.
Because, in their estimate, they counted fake reviews, and also fake clicks,
fake friends and fake followers, things like that. One of the guys who wrote a
comment said he has dealt with that: in his experience, 90 percent are fake,
especially the followers, and those followers are created. You try something
and look at the followers, and all those followers have nothing to do with
your business. They say they got a lot of followers, but he believes 90
percent are not real.
>>: [indiscernible]. Clicks for the ads, specifically, like Google ads.
>> Bing Liu: Right, we don't really know about that, yeah. So there's lots of
interesting research that needs to be done in this area. And when we only use
the external information, we're always very much handicapped. Thank you.