>> Michael Gamon: Hi. My name is Michael Gamon. I'm from the NLP Group here in MSR, and it's my pleasure today to introduce Professor Bing Liu from the University of Illinois at Chicago. I've known Bing for a long time; I actually don't remember now how far we go back. Bing has done a lot of work on web mining and, specifically, on reviews, including product aspect detection and sentiment detection, and as far as I can tell he was also really the first person to start research on fake reviews and the problem of review spam. You might even have seen some public media attention to that problem. It's obviously a big one because, as we were just discussing, it can really affect a small company tremendously. So this is what Bing is going to be talking about, and it is also part of a book that just came out on sentiment analysis and opinion mining, which is an overview of the field.

>> Bing Liu: Thank you very much. So this problem is getting pretty bad now. Previously it was not as bad, but there are these so-called reputation management companies that have started doing this, writing things on the web. Opinions in social media have been useful for quite a while. About three years ago there were already market research companies doing surveys on what percentage of people read reviews when they buy something or use some service. Three or four years ago it was about 75 percent, and now it's a lot higher. The number of companies doing analysis of reviews has also increased quite dramatically, and they also use opinions from other kinds of forums, from Twitter and blogs. I think McKinsey did a survey some years ago, also about three years ago, that said the U.S. had about 30 to 60 companies working on this, and this May somebody else did a survey that said there are now about 350 companies doing something related to this, getting opinions and reviews from blogs and Twitter. We use all this for decision making in purchasing or in elections, and of course for marketing and branding, and after you've seen the opinions you want to take some action. This gives people a very strong incentive to try to promote their products or services, and it has basically become a business in itself. Previously it was mostly individuals who wrote fake reviews. In recent years companies get middlemen in India to write them, and now even in the U.S. small reputation management companies are writing fake reviews. We started studying this particular research problem around 2005, 2006. It was actually related to a project with Microsoft at the beginning: what we were trying to do with Microsoft at that time was to figure out whether people on different websites are the same person or not, and we did some surveys on what kinds of names people use on different websites. Later on, we became more interested in this particular topic of detecting what we call opinion spam, and so we first started with opinions in reviews and how to detect fake reviews. Okay.
So there are lots of people who write these fake reviews, which we call opinion spam. They write undeserving positive reviews to promote certain products, and unfair, malicious negative reviews to target other products and demote them. As I said, it has become a business in recent years, customers are getting worried about it, and lots of things are being reported on the web. There are many reports; this is just a subset of them, and on my website I have compiled quite a few links. The first one, "Amazon Glitch Unmasks War of Reviewers," was from 2004. What happened was that Amazon Canada made a mistake with its system and released the real names of reviewers, and then people discovered that book authors were writing reviews for their own books, even very famous ones. When a reporter confronted them, they said, oh, everybody does it. There was also one woman, a writer, who was complaining about her friends: she had asked her friends and family to write reviews, and one friend only wrote, "oh, this is a good book." She said, what kind of friend is this, only writing a single line for my book? So that was a very early report. The second one was a company selling printouts on Amazon: they gave the product to the customer for free if you wrote a positive review, and it was a small item, about ten dollars, so you can see that's two dollars per star. Suddenly you could see a burst of reviews from the customers. Of course, this kind of thing has been going on for quite a long time, especially with hotels: they give you a little bit of a discount and ask you to write something positive. Another one, published recently, was about a person who wrote these fake reviews and talked to the New York Times. And then another one, and now it becomes interesting: Google Places has lots of fake reviews. In this particular case, a reporter from ABC 7 News somewhere in Colorado investigated something he felt was fishy and found it was being done by a reputation management company. Looking around at the area's businesses, he found around 60 accounts that were writing a lot of reviews but only about a few of the same businesses. So when you have a reputation management company writing fake reviews for you, they're actually fairly easy to detect, and I'm going to talk about an algorithm that can detect this type of situation. Then there are the services you can find on the web; if you search for them, you'll find lots of companies that will do it for you, some expensive, some much cheaper. People have also put out some clues to help individual users figure out whether a review is fake or not, but that is really hard. We did an experiment with students, and it was very, very difficult, almost impossible, to figure out. What we did was ask students to write fake reviews on a product they are familiar with, and also fake reviews for a product they have never used and know nothing about.
For the second part, where they know nothing about the product, it's relatively easy to detect. For products they do know, it's almost impossible, very, very hard to detect. Then there are the sock puppets, where one person registers multiple -- yes, go ahead.

>>: I have a question about the experiment.

>> Bing Liu: All right.

>>: The students who knew about the product, how fake could they be, or did you give them some [indiscernible] as to how they might want to fake the review?

>> Bing Liu: Yeah, I did.

>>: [indiscernible] or some kind of feature they didn't like that they say they liked? That kind of --

>> Bing Liu: I did not give them any further instruction, except to say: just try to write it so it looks like, sounds like, it's real. That's the only thing I said.

>>: But they knew about the product, right? It could be their own opinion, right? How can you --

>> Bing Liu: What I'm saying is they know that category of product, for example cameras, but have not reviewed that particular product.

>>: So not their own opinion about a product they own.

>> Bing Liu: Yeah, not your own review of your own product, but the category. For example cameras, or games, those kinds of things, so they can write pretty real-sounding stuff. Companies are doing this too, and China is very bad. I've been talking to some companies there which host reviews, and they say if you don't remove these guys, the reviews are useless; they said probably 50 percent of them are fake. Quite bad.

So this was something somebody put on the web: Belkin International asked people on the web to write reviews for them. Use your best possible grammar, write in U.S. English, always give a 100 percent rating, as high as possible, and keep your entry not too long. And this is a real one, okay; it was caught by somebody. So who writes these fake reviews? Business owners do it quite a lot: restaurant owners, book authors, songwriters. These two are particularly bad, the book authors and the songwriters. Now everybody can publish a book -- if you pay some money, you can publish a book -- so they have to promote their stuff. And the small bands have their songs, their music, and they have to promote it. Restaurant owners too, quite a bit. But now other types of small businesses also do it. Just now I was talking to Michael about this. A few weeks ago, a dentist from Charlotte, North Carolina sent us an email. He owns three practices, and one of the practices has one guy who got a negative review. The person who contacted us, the owner, went to talk to that person, confronted him: what's going on? How can you get something which is so bad? Did you treat a patient very badly? So they got very angry with each other. And then a few days later, his other two practices got very nasty negative reviews. His practices had been there for a long time with no reviews at all, and suddenly two very nasty reviews show up. So he was very angry: wow, this must be from the other guy. So we did an investigation, and we found out those two reviews were written under two different IDs, but they were both written on the same day, exactly the same day, and each of those two people also wrote another review on that same day, and both of those reviews were about food -- one about cookies, the other about crackers or something like that. So it's very interesting. It looks like these two must be the same guy.
The writing style was pretty good and the content was very nasty, saying this doctor just wants to charge you, he really has nothing to do, he just washes your teeth [indiscernible] does something else, and the cupboards are full of coffee. It was very bad. So this dentist got very upset. He contacted Yelp and did all kinds of things to get it removed, saying these must be from the same person, but Yelp didn't do anything, okay, they didn't do anything. And then, while we were talking about this, a few days later suddenly a few positive reviews showed up. The negative reviews were still there, and then suddenly a few positive reviews appeared; before that, there had been no reviews at all. So my guess is those positive reviews probably came from the dentist himself or from his friends. At least two of the positive reviews were so obvious, with such formal writing: Doctor so-and-so, full name, has been very caring, treated us very well. It's very interesting. So now I really don't know who's telling the truth -- whether the first negative review was written by this dentist himself or by somebody else who doesn't like the other doctor, or what is going on. It's very interesting. Anyway, so this is one case, and there are some other cases.

>>: [indiscernible]

>> Bing Liu: Yes?

>>: [indiscernible] apps as well?

>> Bing Liu: Okay. Yeah, we looked at the apps. But this is just some of the results; many of them are written by the person who actually wrote those apps.

>>: But is it [indiscernible] more focused on the [indiscernible] to receive a fake review, or is it all over the place, because all the [indiscernible] seem similar?

>> Bing Liu: Yeah, we looked at many, many categories, and they are fairly similar. The only thing is that very big, pretty reputable companies tend to have a little bit less. But even that doesn't seem to be very true, because we thought that was the case, okay, but then we investigated Yelp, and there are lots of very reputable restaurants with lots of reviews, and they have lots of fakes as well. What happens is they want to maintain a kind of steady state, to make sure there is always some review activity going on -- not a situation where the last review was two years ago and people wonder, what happened in those two years, did nobody come to the restaurant? So they do fakes as well. Another case was something my students found with an Apple app. I think there was one Chinese guy who wrote a bunch of apps, and every time he wrote an app there would be five reviews saying good, good, good, good, good. Exactly the same five guys every time, saying it's a good app, all five stars. We also talked to Google -- Google has the Android apps -- and they are also trying to do something, but they haven't done anything yet. So there are the freelance individuals and the middlemen, and even customers who get a discount to write fake reviews. And there are quite a number of random people who just write them for fun: for example, I see lots of good reviews, so oh, I'm going to write a bad one. There are some guys like that. So here is one of the reviews. You can take a look and decide whether you think this is fake or not fake, okay? These were written by students in my class. They're all [indiscernible] students, not undergraduate students.

>>: So that one's fake, because it gives such high regard to Royal Caribbean.

>> Bing Liu: You think so? What do you think?
>>: It's fake. It's all kind of abstract: it's very good, people are nice, but it doesn't say anything specific.

>> Bing Liu: What about this one?

>>: That's true.

>> Bing Liu: Okay. This one, true?

>>: Skeptical. No, that was just a joke.

>> Bing Liu: You got it wrong. The previous one is real; this one is fake. What about this one? This one I got from the internet. I was trying to show students some reviews, I opened up PriceGrabber, and somehow I got this page and saw this review. What do you think about this? It sounds fake, but I don't know whether it's fake or not. It sounds fake; this guy seems to know the shop too well, right? This one I don't know about. But these two I do know, because they were written by a student. She did go on Royal Caribbean -- that one was true, she went with her family. And for this one, she said this restaurant is not far from her home but it's expensive, and she has never been there. She knows of the restaurant, and the review really looks real. I would have thought this one is more likely to be real, because she says something concrete: "However, I'm not the one for clubbing and drinking. Also, that night it was pretty slow for me because there was not much else I could do." I thought that sounds real, right? And this one seems more likely to be fake; he doesn't say anything bad about the place. Anyway, it's very difficult to really figure out, and that shows the difficulty of this particular problem: how do you detect them? The first problem is that when you see a fake review, you pretty much don't know it, which makes it very difficult to build models -- if you want to build a model, you have nothing to evaluate it with. And also, logically, if you only look at the contents, this is an impossible problem to solve. For example, I can stay in a hotel and write: this is a great hotel, staying there was a real experience. But then I post the same thing for some other hotel. The same text cannot be judged fake or truthful from the content alone, because hotels are all similar; if I don't mention a particular location, there is not much difference. So in practice, if you think about it logically, it is impossible to solve this problem from the content alone; you have to look at the behavior of the person who wrote it, okay.

So we studied this problem from 2006. We crawled pretty much all, or at least a big chunk, of the reviews on Amazon at that time. We crawled for about two months and got 2.8 million reviews on 1.2 million products. Amazon has more products, but many of them are just a change of color -- postcards, you know, you change the color of a postcard, change an image, so there are lots of those -- but most of the postcards have no reviews. The reviews all come with a product ID, reviewer ID, rating and date; these are the standard things you've probably seen yourself. So this is an interesting plot of the reviewers against the number of reviews: there are a few reviewers who have written a lot of reviews. Yes?

>>: Did you also collect whether that person was Amazon validated -- whether it was an Amazon validated purchase?

>> Bing Liu: Yes. At that time there were not so many; now there are many more. Amazon also [indiscernible] their own reviewers: when they have a new product, they can send it to a person for testing and writing a review. So there was one guy who actually wrote more than 15,000 reviews; now there are a few guys with more than 20,000 reviews.
We counted that this guy would have had to write ten reviews a day, on average, since the day Amazon started business, and his reviews were all about books. And this is the products. Again, it's a similar pattern: most products got a very small number of reviews and a small number of products got lots and lots of reviews. On average, I think we computed less than one review per product across all of Amazon's products, and of course there are lots of tiny products we did not count in the 1.2 million; the 1.2 million are the products which have reviews.

So this is an interesting graph: what is the distribution of the different review stars? You can see almost 60 percent are five stars, and then about 20 percent are four stars. If you consider those two as positive reviews, it's almost 80 percent positive, which is a little bit strange, because people normally don't do that. When you are reasonably happy, you're probably not going to write a review; it's only when you're really unhappy. For example, I myself have written only one review in my life, when I was very unhappy with a cell phone, quite a long time ago. So this is something very fishy. What is happening? Why is this happening?

So we did some study on this. There are a few types of spam reviews. The first type is the fake reviews. Then there are quite a number of reviews which just give an opinion on the brand: they'll say, I don't like HP, because I've never bought anything from them. I've never bought anything from them. So that kind of review is also very biased. And then there are the non-reviews; there are a few types of those, some that are advertisements and some that are just random text. These last two types are fairly easy to detect: you can do some analysis using the product descriptions together with other information, and they're fairly easy to detect. For the first type, I can give you an example of what a fake review is.

>>: [indiscernible]

>> Bing Liu: At that time, Amazon was, I think, 95 percent like that -- anybody can write one.

>>: Anybody [indiscernible].

>> Bing Liu: No, they did not. They might have bought it, but they don't have to show exactly that they bought it.

>>: [indiscernible] that this person bought this product?

>> Bing Liu: That information is actually on the web; it's called Amazon Verified Purchase. But it's a very small proportion, very, very tiny.

So here we are trying to figure out which fake reviews are the most harmful. For example, suppose we have some way of knowing the good quality products, the bad quality products and the average quality products, and then you look at positive and negative fake reviews for each. If it's a good product and somebody writes a fake positive review, it's not so bad, okay -- this happens too, but it's not so damaging. And if it's a bad product and somebody writes a fake negative review, it's also not so harmful. But these other few cases are pretty harmful. So this is essentially one of the clues people normally use: what is the deviation of your review from the other guys. In terms of detection, that will be a focus as well.
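As a tiny illustration of that deviation clue, here is a sketch in Python on made-up records; the field layout and the threshold are hypothetical, and deviation is of course only one weak signal among many, not a spam detector by itself.

```python
from collections import defaultdict

# Toy records: (reviewer_id, product_id, star_rating). Field layout is hypothetical.
reviews = [
    ("r1", "p1", 5), ("r2", "p1", 5), ("r3", "p1", 4),
    ("r4", "p1", 1),                      # deviates strongly from the others
    ("r1", "p2", 2), ("r5", "p2", 2),
]

# Average rating per product.
totals = defaultdict(lambda: [0, 0])      # product -> [sum, count]
for _, product, rating in reviews:
    totals[product][0] += rating
    totals[product][1] += 1
avg = {p: s / n for p, (s, n) in totals.items()}

# Flag reviews whose rating deviates from the product average by more than a
# (hypothetical) threshold; this is one clue to combine with others, not proof.
THRESHOLD = 2.5
for reviewer, product, rating in reviews:
    deviation = abs(rating - avg[product])
    if deviation >= THRESHOLD:
        print(f"suspicious: {reviewer} on {product}, rating {rating}, "
              f"product average {avg[product]:.2f}")
```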
So there are two types of spammers. There are many individual guys who just write fake reviews themselves, for example restaurant owners, and they don't really work with anybody. And there are also group spammers, which is especially what you see with small businesses, or when one person writes using multiple user IDs. This is what the reputation management companies are doing quite a bit of, and it's getting more common than the individual situation.

So what type of data can you use in this domain? The first thing is the actual review content, which includes the title and the review text. Then there's the metadata, or what some people call the side data: the reviewer ID, the star rating -- these are not text -- and the time when the review was posted. I've marked what is public here: this is the public data which everybody can see. There is also private data, basically internal data which the website collects: the IP address, the MAC address, the cookie information. From that they can somehow figure out the location you're writing from, and there is also the sequence of clicks; this becomes a web usage mining kind of domain. I talked to some companies who do this detection, and they said it's actually very useful. For example, if somebody writes a review after coming directly to that product page, it is quite likely to be fake, because the companies advertising for fake review writers just say, anybody who wants to write a review for us, here's the link, we'll give you something. So many people who write a fake review click the link and go directly there, and those direct posts are fairly likely to be fake. Another interesting thing is that when owners write reviews, you'll see [indiscernible] page views on that page from them all the time, and then one day something positive appears -- and that's not quite right, okay. If you're not the owner of this restaurant, you're unlikely to visit this restaurant's page all the time and never go anywhere else. So you can use these things to detect; all this information is very, very useful. Another example: if you have a hotel and all the reviews seem to come from the surrounding area, that's not quite right, because local people are not going to stay at a local hotel for no reason. There are also text clues, but those are very difficult to determine. For example, I heard from one company about a guy who wrote many reviews, and at one time it was "my wife loved this product," and a few days later, "my husband loved this product." Those things are not easy to detect from text; you'd need deep text analysis. There is also quite a bit of public and private information about the products themselves. You can use the product descriptions to see whether the person is writing something related, and also when the product was sold, which is pretty useful as well, and the sales volume and sales rank. Especially the sales volume: Amazon can do this very easily. You see reviews coming in for this product, and you ask what's happening with the sales volume.
Normally you have a reasonable idea of the proportion of buyers who might write reviews, and suddenly this product is getting more reviews than that, okay. But the bad thing is that although these companies can do a lot of work to detect it, they don't do it. Amazon is not doing much, unless they get caught by, for example, the New York Times, and then they find out and go remove it. The reason is that these five-star reviews are good for them; they help sell products. If you go to a website and see a product with no reviews, you're not going to buy it, right? If you see something negative, forget about it. If you see something good, wow, that's probably a reasonable product. So the retailers are not doing much, okay, not doing much. We also have the reviewer information. You have the profile, which in many cases is not so trustworthy, and this can be public and also partly private -- for example, Amazon has lots of private information: credit card, whether you have purchased before, all of that. And of course there are the reviews the person wrote: you can analyze the string of reviews he has been writing and what kinds of products and services he wrote about.

So the first work we tried to do, at the beginning, was really, really hard; we couldn't figure out how to do this thing at all. Then I asked one of my masters students: why don't you just go and see whether you can manually find something we can use? If you can manually find some fake reviews, let's spend some time finding more, and we'll use those for training to get some information. The second day he came back to me: oh yes, I found many, I found many. What happened is he found all those reviews that are near duplicates, with just a little bit of change: the guy changes a few words at the end, a few words in the beginning, a few words somewhere in the middle. He found me quite a number of them. Then we said, oh, that may be interesting -- we can do this automatically. You can just have an algorithm do the comparison, and then you find a lot of things like this. And we found lots of interesting things. For example, for some products the same guy wrote seven or eight reviews about the same product, the same camera, and everything is all positive. Why on earth would you want to do that? If a few weeks or months later you feel the camera is no longer what you thought at the beginning, you would go back and change your review; there's no point writing so many of them, all positive, all good. So that was very interesting, and then we did this study. We did not consider the case of the same user ID and the same product, especially written on the same day, because it might be possible that you clicked the submit button multiple times and it submitted the same review that day; those are not really reliable. Another case which is not quite reliable is Amazon copied reviews, which are not exactly copies: different variants of a product can share the same reviews. For example, you have a pair of shoes in blue and the same pair in red, so they share reviews. You have to remove those things.
Those are easy to remove because they have the same review ID, okay, the same review ID. So then we got a set of them. We only worked on one category of products, the manufactured kind of products, and there were quite a few [indiscernible] of these things. We used those as our training data and test data, and we used logistic regression as the model. So what are the features? The review-centric features are essentially the text n-grams -- I think we only used up to bigrams -- plus the rating; we used the ratings as well. And the reviewer-centric features are about the reviewers: different sorts of unusual behaviors. For example, a person who wrote many reviews which are the first review of the product, which is fishy -- a bunch of things like that.

>>: [inaudible] group spam?

>> Bing Liu: I'm going to talk about that later. The group spam is now getting very bad because of the small businesses and the reputation management companies doing this; they have to do group spam. So here we are looking at the reviews; we're not looking at the reviewers yet. And then we did the classification, and we got an AUC of 78 percent predicting the fakes, meaning the duplicate and near-duplicate kind of reviews, which is reasonable. It's not easy, okay; this is obviously a very difficult problem.

>>: A 50/50 split?

>> Bing Liu: No. I don't remember exactly what the split was, but I remember it roughly followed the natural distribution. We know that 50/50 is not reliable: people do 50/50, but if the natural proportion is only about 10 or 15 or 20 percent, then with a 50/50 split the accuracy looks very high, and when you go to the low natural proportion it drops dramatically. So as data miners, we use roughly the natural distribution. Of course, it's also interesting to ask: is it reasonable to treat the duplicates as spam reviews? We thought it was probably reasonable; we're doing something reasonably okay, it's not that bad. Then, after we got this result, we analyzed some other types of reviews. Negative outliers -- the guys who write very negative reviews that deviate from everyone else -- tend to be heavily spammed. And reviews that are the only review of a product are also likely to be spam. The second one is obvious; the first one is not so obvious. I would have thought the positive reviews are more likely to be spammed, because from my experience talking to my students -- for example, I have lots of students from India, and they do it, they write these. Last year I had two students who said it had been done before: one said he did it himself, the other said his friends did, so I'm not sure. And this year, in the first class, I asked: has anybody done this, have you written reviews for a website? Two raised their hands and said they had done it themselves. Two or three weeks ago I asked them again: no hands. The first time, nobody knew I was doing this kind of research; now they probably don't want to get into it.
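To make that setup concrete, here is a minimal sketch of this kind of pipeline using scikit-learn: unigram/bigram text features plus the rating, a logistic regression classifier, and AUC measured on a split that keeps the natural class distribution. The toy data and labels (label 1 meaning the review was flagged by the duplicate heuristic) are made up, and the real system used many more reviewer-centric features than this.

```python
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Toy data: (review text, star rating, label); label 1 = flagged as a
# (near-)duplicate and treated as spam for training purposes.
data = [
    ("great camera, great pictures, highly recommend", 5, 1),
    ("great camera great pictures highly recommended buy it", 5, 1),
    ("battery died after two weeks, support was unhelpful", 2, 0),
    ("decent lens but the autofocus hunts in low light", 3, 0),
    ("best product ever, five stars, amazing, buy now", 5, 1),
    ("solid build quality, a bit heavy for travel", 4, 0),
] * 50  # repeat the toy rows so the split has enough data to fit a model

texts = [t for t, _, _ in data]
ratings = np.array([[r] for _, r, _ in data], dtype=float)
labels = np.array([y for _, _, y in data])

# Review-centric features: unigrams and bigrams of the text, plus the rating.
vectorizer = CountVectorizer(ngram_range=(1, 2))
X_text = vectorizer.fit_transform(texts)
X = hstack([X_text, csr_matrix(ratings)])

# Keep the natural class distribution in the split (no artificial 50/50).
X_tr, X_te, y_tr, y_te = train_test_split(
    X, labels, test_size=0.3, stratify=labels, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
scores = model.predict_proba(X_te)[:, 1]
print("AUC:", roc_auc_score(y_te, scores))
```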
Also, spam reviews can get quite good helpfulness feedback. If you write carefully, you can get fairly good helpfulness votes. One of my students, [indiscernible], wrote a camera review that was completely fake -- he wrote it very nicely, very detailed stuff, but he had never used the product -- and it got lots of helpful votes. It was completely fake. So this basically means the helpful/not-helpful votes are not very useful, and just like web spam, the votes themselves can be faked too, okay, this can be fake as well. So we don't really know whether this is true; we have no way to get the ground truth. We depend on the duplicates, and then we test on the other reviews that we think might be spam. There are a few more charts showing this situation.

So these are the supervised techniques, and other groups have recently been doing this too. [indiscernible] and colleagues used Mechanical Turk to have people write fake reviews, but when we did some analysis on this, it turned out those are not really true fakes. And this other group labeled the data themselves; they did pretty much the same idea, but ours was much simpler, just bigrams and nothing else, while this one used bigrams, opinion sentiment and every other kind of information. They labeled the data themselves using the Epinions data -- Epinions has the trust relationships and the comments and everything -- and they got a bunch of people to read the reviews and judge which ones are probably fake. Nobody knows exactly, of course, nobody knows exactly. So those are the supervised techniques. Then there are also many unsupervised techniques. Here we essentially try to go behind the scenes and look at the behavior of the person and the behavior of the reviews: somehow we try to uncover whether there are interesting secrets in the way he behaves strangely. The first ones, basically two papers, we published in 2010, and then we stopped this research. Actually, in 2009 we didn't really do much, because we found it was really, really hard, and every time we submitted a paper we got the same comments: how do you know this is fake? Are you sure you're doing it correctly? I would get very upset, very disappointed, always. So we sort of stopped this research, and then I was in Singapore giving a talk on this particular topic, and there was a person there who had been doing something on research paper reviews -- not exactly paper ranking, but analyzing the reviews of research papers. They had been doing that kind of thing and wanted to try it on product reviews; it's the same kind of rating data, not much difference. And if you look at the behaviors, you don't really have to look at the contents, not the text. So then we came up with a few things and did this work, and my students also worked on something else that I'll introduce a little bit.
So the key idea is to try to uncover, to identify, unusual behavior patterns; they could be patterns of reviewers or patterns of reviews. What we're doing is basically a data mining kind of thing: finding unexpected [indiscernible] patterns. The first one, in this particular paper, was about targeting. We try to see whether the reviewers are targeting some specific products -- whether they are spamming or not spamming -- and you can also be targeting a group of products, for example a brand, so this brand is getting people to fake. And also rating deviations -- this is the same type of problem as when you rate paper reviews, how you figure out the rating -- and also early rating deviation: people try to write things right at the beginning, and then over time the rating drifts lower and lower. This is actually true. My [indiscernible] student from India said his friend is doing this: his friend is in India working for one of these reputation kind of companies, and he monitors, he is in charge of, a particular product. What that means is that no matter what, you've got to keep this product positive: either you write it yourself, or you watch what other people write and make sure it stays positive. So he'll normally start with something and watch: hm, somebody wrote a negative one, so he'll start writing to make up for it, and he can create quite a few accounts to do this type of thing. This is true, because I verified it; the student said his friend is doing it. He said he didn't do it himself because it's too troublesome -- you've got to watch this thing all the time over a period of time, and it's not that interesting. So then you have this bunch of scores, and you can do different combinations, a linear combination, and finally do a ranking of the reviews, and then we built a user interface to let users do an evaluation. Again, we have no ground truth, so the users do the evaluation; you show them these kinds of features and let people see. This was complete data mining, in the sense that we try to find the interesting patterns. So we create a database, where each record has the reviewer ID, the brand ID (Amazon also has the brand), the product ID (the individual product under the brand) and a class. It looks like a classification type of database, and the class is positive or negative: four or five stars is positive, and three and below is negative. Then we try to find what are called class association rules. My previous slide was mostly about the data mining; here I was trying to figure out [indiscernible]. These are what we call class association rules -- not exactly classification rules, but trying to find all the rules. Classification only finds a subset of the rules, just enough for the purpose of classification; if it can classify, that's good enough. But this type of algorithm tries to find all of them, all the rules. So, for example, a rule might say: reviewer one, brand one, is positive. Then how do we know something is expected or unexpected? We have to define, in some way, what the typical behaviors are. Let me give you some examples.
If you're interested, in the paper there are quite a few things that we defined. For example, in one case, a reviewer wrote all positive reviews on products of one brand but all negative reviews on a competing brand. You can see, when you have one condition -- this reviewer equals one, then positive -- the [indiscernible] is a conditional probability: given reviewer one, what is the probability that his review is positive? In data mining this is called the confidence; say it is 60 percent. Then when you extend the rule with one more condition, you get, for example, 100 percent. Then something is fishy. But how do you find this? It turns out you can define it probabilistically. In this particular case we can use entropy: you compute the entropy of this particular rule, and you also compute the entropy when you branch out over multiple brands, which is essentially like a decision tree branching on the extra attribute. There you can see the entropy changing dramatically, so you get a kind of information gain that tells you something strange is going on. And then you can analyze further: for example, these two brands, other people don't think they're great, so why does this guy rate them so highly? So that's the entropy one. Another one we call confidence unexpectedness. For example, you have the rule: reviewer one, brand one, is positive. The support is actually the joint probability: in your database, what is the proportion of data records that contain all three of them -- the conditions and the consequence. In this case you can define unexpectedness fairly easily, probabilistically. Basically -- I don't have the formula here -- given the one-condition rules, you have the joint and the conditional probabilities, and you ask: what happens if these two conditions have nothing to do with each other, if they're completely independent? You assume conditional independence and compute what the confidence would be if they were really not related; that is essentially the formula. Then, when they are not independent, you are essentially measuring how far they are from independent. This is just one of the measures, and with it you can find many reviews which are probably not right. You can also define support unexpectedness. Here I'm using a different measure: reviewer one, product one, positive, and there are five of them, five positive reviews. This one was discovered automatically, and it was very interesting when we found it: how come the same guy is writing many reviews on the same product?
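A small numerical sketch of the confidence-unexpectedness idea on a toy table of (reviewer, brand, class) records. The independence baseline used here, conf(reviewer -> pos) * conf(brand -> pos) / P(pos), is just one simple way of writing down what the two-condition confidence would be if the two conditions acted independently; it is meant to illustrate the idea, not to reproduce the exact formula in the paper.

```python
from itertools import product

# Toy records: (reviewer, brand, class) where class is "pos" or "neg".
records = [
    ("r1", "b1", "pos"), ("r1", "b1", "pos"), ("r1", "b1", "pos"),
    ("r1", "b2", "neg"), ("r1", "b2", "neg"),
    ("r2", "b1", "pos"), ("r2", "b1", "neg"),
    ("r2", "b2", "pos"), ("r2", "b2", "neg"),
]
n = len(records)

def conf(cond, cls):
    """Confidence of rule cond -> cls, i.e. P(cls | all conditions hold)."""
    match = [r for r in records if all(r[i] == v for i, v in cond)]
    if not match:
        return 0.0
    return sum(r[2] == cls for r in match) / len(match)

p_pos = sum(r[2] == "pos" for r in records) / n

for reviewer, brand in product({r[0] for r in records}, {r[1] for r in records}):
    observed = conf([(0, reviewer), (1, brand)], "pos")
    # Independence-style baseline built from the two one-condition rules.
    expected = conf([(0, reviewer)], "pos") * conf([(1, brand)], "pos") / p_pos
    unexpectedness = observed - min(expected, 1.0)  # large positive = fishy
    print(f"{reviewer},{brand} -> pos  observed={observed:.2f} "
          f"expected={expected:.2f} unexpectedness={unexpectedness:+.2f}")
```

On this toy table, reviewer r1 alone gives positive reviews 60 percent of the time, but together with brand b1 the confidence jumps to 100 percent, well above the independence baseline, which is exactly the kind of jump described above.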
When you go and look at those, the guy basically changes a little bit and posts quite a few reviews of the same thing. Again, you can define this probabilistically -- I don't have it here, but you can define it probabilistically -- and from that you can define what is unexpected. This can be quite flexible, because it's not something we came up with manually ourselves; it is completely mined from the data. You can also look at brand-product relationships -- all kinds of relationships can be discovered, and you can define for each of them what is expected and what is unexpected. We also did some manual evaluation of this.

We also did detection using a graph, a review graph, where you need some sort of relational model. This was a different type of review data: resellerratings.com, which has short reviews, store reviews. Store reviews are a little bit different from product reviews. For example, if you buy a camera, you should probably write one review about that camera, but for a store you can write multiple reviews: I've been there a few times, and every time my experience is different. So we crawled the whole site. The reason we did this was that at the time Google was interested in these things -- Google was probably working with these companies and was interested in this particular site -- so we crawled the whole thing and had the data mining class work on it. Wow, some guy wrote 30 or 40 reviews within half an hour. Anyway, that kind of thing was easy to detect. Then we represent the data as a review graph, a heterogeneous graph with three kinds of nodes: the reviewers, the reviews they wrote, and the stores. You use this graph to capture the relationships among these three kinds of entities. And we define a few concepts: the trustiness of reviewers, the honesty of reviews, and the reliability of stores. All of these are related to each other. For example, a reviewer is more trustworthy if he or she has written more honest reviews; a store is more reliable if it has more positive reviews from trustworthy reviewers; and a review is more honest if it is supported by many other honest reviews. So you can see this is something like PageRank or the HITS algorithm: the scores are related, and you can model them in a relational manner. We use a logistic function just to make sure the values stay bounded and stabilize. So this equation is about honesty, and all these things are related; this one is the reliability; you can see the relational structure here. And then you can run this with power iteration: you iterate many times and the values stabilize.
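Here is a minimal sketch of that kind of mutually reinforcing computation on toy data. The update rules below are a simplification for illustration -- in particular, the honesty update here just compares a review's rating with the store's current reliability, rather than the agreement-with-other-reviews definition used in the actual model -- and all names and values are made up.

```python
import math

# Toy data: each review is (reviewer, store, rating) with rating +1 or -1.
reviews = [
    ("alice", "storeA", +1), ("bob", "storeA", +1), ("carol", "storeA", +1),
    ("spammy", "storeA", -1),                     # disagrees with everyone
    ("spammy", "storeB", +1), ("spammy", "storeC", +1),
    ("alice", "storeB", -1), ("bob", "storeB", -1),
]
reviewers = {r for r, _, _ in reviews}
stores = {s for _, s, _ in reviews}

def squash(x):
    """Logistic-style squashing into (-1, 1) so the scores stay bounded."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0

trust = {r: 0.5 for r in reviewers}              # trustiness of reviewers
reliab = {s: 0.0 for s in stores}                # reliability of stores
honesty = {i: 0.0 for i in range(len(reviews))}  # honesty of each review

for _ in range(30):  # power-iteration style updates until the values settle
    # A store is more reliable if trusted reviewers rate it positively.
    for s in stores:
        reliab[s] = squash(sum(rt * trust[rv] for rv, st, rt in reviews if st == s))
    # A review is more honest if its rating agrees with the store's reliability
    # (a simplification of the agreement-with-other-reviews definition).
    for i, (rv, st, rt) in enumerate(reviews):
        honesty[i] = squash(rt * reliab[st])
    # A reviewer is more trustworthy if he or she writes more honest reviews.
    for r in reviewers:
        trust[r] = squash(sum(honesty[i] for i, (rv, _, _) in enumerate(reviews) if rv == r))

print({r: round(t, 2) for r, t in sorted(trust.items())})
```

On this toy data the reviewer who always disagrees with everyone else ends up with a negative trustiness score, while the consistent reviewers end up positive.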
For the evaluation, we tried to compare with the Better Business Bureau -- the U.S. has a Better Business Bureau -- and you can see what happens there versus what happens with our ranking [indiscernible]; it's fairly consistent. If the Better Business Bureau says this is a good guy, we're giving them a fairly good score too. The students also tried to go to the web and search for particular companies to see whether people trust those companies or not.

So here is how we detect the groups, group detection. Group detection is actually easier. For individuals the pattern is not so obvious, but when you do things in groups, the pattern becomes fairly obvious. Previously we tried to manually label individual reviews as fake or not, which was very, very difficult, but if you try to label groups, it's relatively easy, because of the collective behaviors. In this case, a group of people work in collusion to promote or demote some products, and such spam can be quite damaging; they can take total control of a product's reviews. Here we have an algorithm with three steps. Let me see. Okay. The first step is to find frequent patterns, because you can see whether the same bunch of guys has been reviewing multiple products, and that's fairly easy to detect using frequent pattern mining. Some of those are just coincidence: for example, for Apple products there might be a bunch of guys who all reviewed the same ones, and that's probably okay, they are very popular products. Then you have a set of features, which are essentially indicators or clues, and later on we also use a relational model to rank the groups. In this case we had a bunch of people actually label the data, because for groups it was not that difficult to label. And this was the review story from ABC that I mentioned to you: one of the ABC reporters discovered a chain of positive endorsements from a group of six metro-area businesses, ranging from auto care to this one, and ABC found 50 Google user accounts that only posted reviews about these same businesses. So they behaved like a group: the same guys posting about multiple businesses. This fits group detection exactly. And another one I mentioned a little bit earlier is also such a case. You can see this is one of the things we discovered: it's about three different software products, and this is a person called Big John -- this is for real. At that time, this guy had reviewed only these three products, all five stars, and the dates are December 4, December 4, December 4 -- the same day, for this particular person. Then you see the second one: December 8, December 8, December 8, again exactly the same products. And another guy, December 4 as well, exactly the same products. Actually there was yet another one, so four of them, and at that time they had each reviewed only those three products. If you look at the contents, they are very different -- the content is very different -- but this kind of behavior, if you do data mining, is fairly easy to detect, to find.
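As a brute-force sketch of that first step, you can treat each product's set of reviewer IDs as one transaction and count how often the same small set of reviewers shows up together; the names and thresholds below are made up, and a real implementation would use an algorithm like Apriori or FP-growth rather than enumerating combinations.

```python
from collections import Counter
from itertools import combinations

# Toy data: product -> set of reviewer IDs who reviewed it. Each product is
# one "transaction" whose items are the reviewers. Names are made up.
transactions = {
    "software1": {"BigJohn", "u2", "u3", "u9"},
    "software2": {"BigJohn", "u2", "u3", "u7"},
    "software3": {"BigJohn", "u2", "u3"},
    "camera1":   {"u4", "u5"},
    "camera2":   {"u5", "u6"},
}

MIN_SUPPORT = 3   # a candidate group must co-review at least this many products
MAX_GROUP = 3     # only enumerate small groups in this brute-force sketch

support = Counter()
for reviewers in transactions.values():
    for size in range(2, MAX_GROUP + 1):
        for group in combinations(sorted(reviewers), size):
            support[group] += 1

candidate_groups = {g: c for g, c in support.items() if c >= MIN_SUPPORT}
for group, count in sorted(candidate_groups.items(), key=lambda kv: -kv[1]):
    print(f"{group} co-reviewed {count} products")
```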
The only problem, of course, is that when you do frequent [indiscernible] mining, you need reasonably high support, and that does cause some trouble. When we did this analysis it was only for the manufactured products, which was on the order of 100,000 reviews, a long time ago; now we have crawled [indiscernible] a few million. Then you have to make the support very high, otherwise you can't find anything, which is not good, because the number of [indiscernible] grows exponentially. We are also trying to deal with that problem. Then there's another problem. We were talking to companies, and some companies tried to implement this algorithm. They said it's okay to run this periodically, for example every few weeks, but is it possible to do it online -- when this guy submits a review, immediately say, oh, this is no good? That is a problem, because this kind of algorithm cannot run like that, it cannot run online as reviews come in. One of the companies in the D.C. area was asking how you can do this in an online kind of situation. The mining itself was fairly simple to do: for the frequent [indiscernible] mining, for each product you get all the reviewer IDs, so every product gives you what is called a transaction, and then you mine the transactions to find all the groups of guys who reviewed the same few products -- not necessarily at the same time, but all of them reviewed the same ones. And then you have a few clues to filter further. There's the group time window: when a bunch of guys are doing this, they're probably doing it in a similar time frame. Group rating deviation: are their ratings similar to each other? Group content similarity: are their reviews similar? Member content similarity: is the same guy writing similar text across the different products? Also the early time frame, and things like that. And then we again compute relational models of how the groups relate to the individual members and the products: the group and the products, the group and the members, and especially the group and the members, because they are related -- if your group is bad, the members are probably bad. You can define this mathematically, in a somewhat heuristic way, and then it comes down to manipulating matrices and computing a final ranking, and you can do an evaluation. This was a pretty good evaluation: we did manual labeling by a bunch of people, and you can do this fairly accurately.
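As an illustration of two of those indicators, the group time window and the group rating deviation, here is a tiny sketch on made-up records; the real system combines several such indicators with the relational model, so this only shows the flavor of the feature computation.

```python
from datetime import date

# Toy review records: (reviewer, product, rating, date). Hypothetical data.
reviews = [
    ("BigJohn", "software1", 5, date(2011, 12, 4)),
    ("u2",      "software1", 5, date(2011, 12, 4)),
    ("u3",      "software1", 5, date(2011, 12, 4)),
    ("u9",      "software1", 2, date(2012, 3, 15)),   # an unrelated reviewer
]
group = {"BigJohn", "u2", "u3"}
product = "software1"

in_group = [r for r in reviews if r[0] in group and r[1] == product]
outside = [r for r in reviews if r[0] not in group and r[1] == product]

# Group time window: how close together the group members posted.
dates = [d for _, _, _, d in in_group]
window_days = (max(dates) - min(dates)).days

# Group rating deviation: how far the group's average rating sits from the
# average rating of everyone else on the same product.
group_avg = sum(r for _, _, r, _ in in_group) / len(in_group)
other_avg = sum(r for _, _, r, _ in outside) / len(outside) if outside else group_avg
deviation = abs(group_avg - other_avg)

print(f"time window: {window_days} days, rating deviation: {deviation:.1f} stars")
```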
So that's roughly what I've been doing. My research mainly focused on sentiment analysis, which also involved Microsoft, and when I give talks on that, people always ask: you do this sentiment analysis, but are you sure the person writing is trustworthy? So, okay, let me do something else and see whether I can catch some of those people. And now, if you've seen the news, there's quite a bit of public interest in this, and in the next few weeks some reporters still want to talk about these issues; they're not over it yet, still quite interested. We have also been moving into other types of social media: for example, on Twitter there are also lots of people doing this. The current methods are still fairly simple, and we have not really focused a lot of time on this, in the sense that we got very, very disappointed in the first few years: whatever we submitted, we always got back something not very good, always comments about evaluation, how can you be sure, how can you be sure. Even the WWW paper we got in this year was submitted two or three times. Every time it was the same, so my student said, okay, I'm going to label it. He went and labeled it -- he got eight of his friends to label it -- and then it got accepted. It turns out the groups are much easier to label, because the behavior is so obvious. So that was basically the story. If you're interested, we have a chapter that surveys this topic, and now there are quite a number of people working on it. And this is going to be a problem forever; it's going to be, what is it called, you do something and then I have to do something -- [indiscernible] whether you can. But it has to be dealt with in some sense. I've been working with -- not working with, but talking to -- people in China and here, and they're all doing it. But not the retailers: I've not seen a single retailer doing this, because it's actually good for them to have these reviews. And some other companies, for example American Express, are trying to do this too. What they're trying to do is quite interesting: they want to verify a purchase. You have a credit card, so they have the credit card information. If you purchase something and write about it on Amazon but you actually bought it from Best Buy, they can figure that out. So they talked to me, and they tried to work with Yelp, and Yelp said no, we're not interested in working with you. So now they've got a retailer, they said -- they didn't tell me which one -- and they're also working with other credit card issuers. I told them, if you do it just by yourself it's probably not good enough; you've got to work with the other credit card issuers to verify whether a person actually purchased a particular product somewhere. Okay, and that's roughly about it. Yeah?

>>: It's a very interesting talk. You talked a lot about how difficult it is to detect the fake reviews, right. Have you thought about looking at genuine reviews?

>> Bing Liu: Say again?

>>: So rather than detecting fake reviews, trying to detect [indiscernible] reviews, like good reviewers and good reviews.

>> Bing Liu: Yeah, some sentiment analysis researchers work on the quality or the utility of reviews, and you can rank those things too. But the problem there, also, is that if a guy writes it really well, you don't know: they can list all the product features, give really nice comments, and so --

>>: [indiscernible] reviewers' behaviors, like the way that you looked at the fake reviewers' behaviors, right?

>> Bing Liu: Yeah, we did some analysis. Especially if you read this WWW paper -- because we had labeled data this time, with that labeled data we could somehow analyze the behaviors, whether this kind of feature does separate people, separate groups. Yeah, we did some analysis on that. For individuals we couldn't do it, because you have no good data for that.
>>: And then do you have any idea of how to sort of raise the bar, so that --

>> Bing Liu: Yeah, we --

>>: -- [indiscernible] economically challenging, or how [indiscernible] --

>> Bing Liu: Yeah, yeah --

>>: Verified purchase is one thing, right?

>> Bing Liu: Yeah, yeah, yeah. That's what one [indiscernible] is saying. Another thing is, if you think about it, I personally think this can be done if the companies do it themselves, because what we are doing here uses completely public information. If you can use the private information, I think it's doable -- at least you can catch quite a bit of it. For example, Yelp is doing that. Yelp is not doing a perfect job -- I mean, they're actually doing quite a good job -- but there are still lots of fakes out there. For example, take the dentist situation I gave you. Very clearly fake, but Yelp did not filter it. Yelp did filter two of them: the first one, as I said, the first review about this particular doctor at one of his three shops -- that got removed, filtered. The dentist told us: how come that one is gone? The first day I saw it, and a few days later it disappeared. He thought this guy must have talked to Yelp, paid money to Yelp or whatever, to get it removed. I don't think so, I don't think so. And then there was another very positive review, which I believe the dentist wrote himself, that got filtered also. But there are a few more still there.

>>: So [indiscernible] because, let's say there are fake reviews and people really believe them, but after a while a lot of people are going to figure out that these are not good products. So eventually the product is not going to have --

>> Bing Liu: Yeah, yeah, but it seems lots of people don't bother to write. And also the product may not be that great but may not be that bad either, so you probably don't go and say, oh, I'm going to write something -- unless it's really bad.

>>: Okay. So it's more of an issue for local businesses?

>> Bing Liu: Yeah, yeah. For example, Google Places is now a place where people fake quite a lot, yeah.

>>: [indiscernible] if we just use the individual [indiscernible] technique, one person [indiscernible] identify to a new [indiscernible]?

>> Bing Liu: For the individual ones, we don't know; we really have no idea. We just do some evaluation on 100 or so; we can't go beyond that, because we do manual evaluation. The groups have very clear features, we think.

>>: But by [indiscernible] identifying a set of fake reviews, [indiscernible] versus the group, and what are the individuals --

>> Bing Liu: We have not done that kind of analysis. That would be interesting too, yeah. People ask me to estimate -- for example, the New York Times people asked me to estimate what percentage of reviews are fake. That's a very difficult question, very difficult. I told them 30 percent. Why? I have a reason, okay, I have a reason. If you go and look at Yelp, they filter out about 15 percent, about 15 percent. And I have also seen fake reviews that I know are fake that did not get filtered out; they're still there. So I gave them that number, but I told them I can't really do this accurately. And recently [indiscernible] had a report that said by 2014 it will be about 15 percent. Lots of people are saying that's an underestimate.
And it probably is an underestimate, because that estimate covers fake reviews, and there are also the fake clicks and the fake friends and the fake followers, things like that. One of the people who wrote a comment on that said that in his experience 90 percent are fake, especially the followers. Those followers are created: he tried something, looked at his followers, and all those followers had nothing to do with his business. He got a lot of followers, but he believes 90 percent of them are not real.

>>: [indiscernible]. Clicks for the ads, specifically, like Google ads?

>> Bing Liu: Right, we don't really know about that, yeah. So there's lots of interesting research that needs to be done in this area, and when we only use the external, public information, we are always very much handicapped. Thank you.