CLICK FRAUD
Alexander Tuzhilin
By Vinny Rey

Why was the study done?
• Advertisers were suing Google over click fraud.
• Google agreed to have a third party review how it combated click fraud and determine whether its methods were reasonable.

Development of the Internet
• The Internet was developed a long time ago.
• Before the 1990s, it was used mainly by people with strong technical skills.
• The World Wide Web (WWW) is a globally connected network of Web servers and browsers that allows web pages and other types of documents to be transferred over the Internet.
• The development of the World Wide Web, Web documents, and Web browsers that display these documents in a user-friendly fashion made the Internet much more accessible.

Development of the Pay-per-Click Advertising Model
• “Targeted ads” can be based on:
1. Personal characteristics of a web page visitor known to the party delivering the ad
2. Keywords of a search query launched by the user
3. Content of a web page visited by the user
• For what exactly should advertisers pay, and when?
1. When the ad is shown to the user
2. When the ad is clicked by the user
3. When the ad has “influenced” the user, in the sense that its presentation led to a “conversion event”
• Conversion event: the actual purchase of the product advertised in the ad
• Why wouldn’t an advertiser favor number 1? Because the user may not even look at the ad and may simply ignore it.

Development of the Pay-per-Click Advertising Model
• Two measures of the effectiveness of an advertisement:
• Click-Through Rate (CTR): measures how often visitors click on the ad. CTR = X/Y, where X is the number of clicks and Y is the number of times the ad was shown.
• Conversion Rate: gives a sense of how often visitors actually act on a given ad, which makes it a better measure of an ad’s effectiveness than CTR.
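As a rough illustration of the two effectiveness measures, the sketch below computes both for two made-up ads. It assumes that X in CTR = X/Y is the number of clicks and Y the number of impressions, and that conversion rate is conversions divided by clicks. The function names and all numbers are hypothetical.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """CTR = clicks / impressions (assumed reading of CTR = X/Y)."""
    if impressions == 0:
        return 0.0  # no impressions means no meaningful rate
    return clicks / impressions


def conversion_rate(conversions: int, clicks: int) -> float:
    """Conversion rate = conversions / clicks (an assumed definition)."""
    if clicks == 0:
        return 0.0
    return conversions / clicks


# Made-up figures for two ads, each shown 1000 times.
ads = {
    "A": {"impressions": 1000, "clicks": 50, "conversions": 1},
    "B": {"impressions": 1000, "clicks": 20, "conversions": 4},
}

for name, stats in ads.items():
    ctr = click_through_rate(stats["clicks"], stats["impressions"])
    conv = conversion_rate(stats["conversions"], stats["clicks"])
    print(f"Ad {name}: CTR={ctr:.3f}, conversion rate={conv:.3f}")
```

With these made-up numbers, ad A has the higher CTR (0.05 vs. 0.02) but the lower conversion rate (0.02 vs. 0.2), so a good click-through rate alone says little about how many clicks turn into purchases.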
• Cost per Mille (CPM): an advertiser pays per one thousand impressions of the ad
• Cost per Click (CPC): an advertiser pays only when a visitor clicks on the ad
• Cost per Action (CPA): an advertiser pays only when a certain conversion action takes place

Cost Per Click
• Two problems with the cost-per-click model:
• Although the two are correlated, a good click-through rate does not guarantee a good conversion rate
• It offers no “built-in” fundamental protection against click fraud

Google’s Pay-per-Click Advertising Model
• AdWords: a program allowing advertisers to purchase CPC-based advertising that targets ads based on the keywords in users’ search queries.
• Ad Rank = CPC x QualityScore
• QualityScore: a measure of the “quality” of the keyword and the ad combined
• The more the advertiser is willing to pay (CPC) and the higher the click-through rate on the ad (CTR), the higher the ad’s position in the listing.
• “The actual amount of money paid when the user clicks on an ad is determined by the lowest cost needed to maintain the clicked ad’s position on the results page.”
• Why not just bid so high that other companies cannot match the price? Because advertisers don’t know how much other advertisers are willing to pay.

Click fraud in AdWords
1. Make the competitor pay more.
2. If you’re second, click on the competitor’s advertisement often enough that he hits his budget for the day.
3. If you knock the second-place advertiser out, you only have to pay as much as the third-place advertiser.

The AdSense Program
• AdSense: a program for website owners (publishers) to display Google’s ads on their websites and earn money from Google.
• AdSense for Search (AFS): relevant ads are displayed as links sponsored by Google; the links are produced using the same method as on Google.com.
• AdSense for Content (AFC): ads are based on the content of the visited pages, geographical location, and some other factors.
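To make the Ad Rank formula and the pricing quote from the AdWords slide more concrete, here is a minimal sketch. It is not Google's implementation: it assumes that "the lowest cost needed to maintain the clicked ad's position" means the next ad's rank divided by the clicked ad's own quality score, capped at the advertiser's bid, and that the last ad pays a hypothetical reserve price. The `Ad` class, `run_auction`, the quality-score scale, and all bids are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ad:
    advertiser: str
    max_cpc: float        # the most the advertiser is willing to pay per click
    quality_score: float  # hypothetical positive scale


def ad_rank(ad: Ad) -> float:
    """Ad Rank = CPC x QualityScore, as stated on the slide."""
    return ad.max_cpc * ad.quality_score


def run_auction(ads, reserve_cpc=0.0):
    """Return (advertiser, price per click) pairs in display order.

    Assumption: each ad pays just enough to keep its rank at or above
    the rank of the ad directly below it.
    """
    ranked = sorted(ads, key=ad_rank, reverse=True)
    results = []
    for position, ad in enumerate(ranked):
        if position + 1 < len(ranked):
            needed = ad_rank(ranked[position + 1]) / ad.quality_score
        else:
            needed = reserve_cpc  # nobody below: assumed reserve price
        results.append((ad.advertiser, min(ad.max_cpc, needed)))
    return results


a = Ad("A", max_cpc=2.0, quality_score=0.5)  # rank 1.0
b = Ad("B", max_cpc=1.5, quality_score=1.0)  # rank 1.5
c = Ad("C", max_cpc=1.0, quality_score=0.5)  # rank 0.5

print(run_auction([a, b, c]))  # B outranks A despite bidding less
print(run_auction([b, c]))     # A knocked out, e.g. daily budget exhausted
```

Under these assumptions, B pays 1.0 per click while A is in the auction and 0.5 once A is gone, which is the incentive behind tactic 3 in the "Click fraud in AdWords" list.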
• How can a publisher cheat through AdSense?
• The publisher can repeatedly click on the Google ads on their own site in order to receive more money from Google.
• Unethical users:
• AdWords = hurt other advertisers
• AdSense = enrich themselves

The Google Network
• Does most invalid clicking come from direct publishers or online publishers?
• Online publishers
• AdSense for Content (AFC) is most prone to invalid clicking.

Data Google Can Collect
• When a user has visited a conversion page
• Google’s main weakness?
• The inability to get full access to all the clicking activities of the visitors of the advertised website

Invalid Clicks and Google’s Definition
• “Click Fraud occurs in pay per click online advertising when a person, automated script or computer program imitates a legitimate user of a web browser clicking on an ad, for the purpose of generating an improper charge per click”

Invalid vs. Fraudulent
• Publishers subscribing to paid-traffic websites that artificially bring extra traffic to the site, including extra clicking on the ads:
a. invalid but not fraudulent
b. fraudulent but not invalid
c. both invalid and fraudulent
d. neither fraudulent nor invalid
• Answer: c
• Problem with identifying fraudulent clicks: was the click generated “artificially” or not, and what exactly does “artificial” mean in this case?

Click Quality Team
• The goal of the Click Quality Team is to identify all invalid clicks, regardless of their nature and origin, and make sure advertisers do not pay for them.

Operational Definitions of Invalid Clicks
• Anomaly-based (or deviation-from-the-norm-based): defines not what an invalid click is but what a “normal” click is. Invalid clicks are then those that deviate significantly from the established norms.
• Rule-based: a specified set of rules identifying invalid clicking activities; experts define what counts as valid and invalid.
• The main challenge with this approach is to demonstrate that these conditions are “reasonable.”
• Operational definitions: a double-edged sword?
• They cannot be fully released to the general public because unethical users would take advantage of them. However, if the public doesn’t know them, how can advertisers know exactly what they are being charged for?

Conclusions about Definitions of Invalid Clicks
• The Fundamental Problem: the Cost per Click model is inherently vulnerable to click fraud, which makes the problem impossible to solve.
• Possible solutions?
• The “trust us” approach
• Third-party auditors
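The two operational definitions of invalid clicks discussed earlier (rule-based and anomaly-based) can be sketched roughly as follows. This is only an illustration: Google's actual filters are not public, so the rule (a repeat click from the same source on the same ad within a few seconds), the norm (the median number of clicks per source), and every name and threshold below are assumptions.

```python
import statistics
from collections import Counter
from typing import NamedTuple


class Click(NamedTuple):
    timestamp: float  # seconds since some reference point
    source: str       # e.g. an IP address
    ad_id: str


def rule_based_invalid(clicks, min_gap_seconds=10.0):
    """Hypothetical rule: a repeat click by the same source on the same
    ad within min_gap_seconds of its previous click is invalid."""
    last_seen = {}
    invalid = []
    for click in sorted(clicks, key=lambda c: c.timestamp):
        key = (click.source, click.ad_id)
        previous = last_seen.get(key)
        if previous is not None and click.timestamp - previous < min_gap_seconds:
            invalid.append(click)
        last_seen[key] = click.timestamp
    return invalid


def anomaly_based_suspects(clicks, k=3.0):
    """Hypothetical norm: the median click count per source. A source is
    suspect if it exceeds the median by more than k times the median
    absolute deviation (falling back to 1.0 when that deviation is 0)."""
    counts = Counter(click.source for click in clicks)
    if not counts:
        return set()
    norm = statistics.median(counts.values())
    spread = statistics.median(abs(n - norm) for n in counts.values()) or 1.0
    return {source for source, n in counts.items() if n - norm > k * spread}


# Made-up click log: four ordinary sources and one very active one.
per_source = {"a": 3, "b": 4, "c": 5, "d": 4, "e": 40}
log = [
    Click(timestamp=60.0 * i, source=source, ad_id="ad-1")
    for source, count in per_source.items()
    for i in range(count)
]
print(anomaly_based_suspects(log))
print(rule_based_invalid([Click(0.0, "x", "ad-1"), Click(2.0, "x", "ad-1")]))
```

The sketch also shows the disclosure dilemma in miniature: anyone who knows `min_gap_seconds` or `k` can stay just under the threshold, yet without those values an advertiser cannot verify which clicks were filtered.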