SmartAds: Bringing Contextual Ads to Mobile Apps
Mike Lin

Outline
  Authors
  Introduction
  Background
  Characterization
  SmartAds Architecture
  Results
  Discussion
  Comment

Authors
Suman Nath
  Ph.D. student at the School of Computer Science, Carnegie Mellon University.
  Master's student at the University of Illinois at Urbana-Champaign.
  Works at Microsoft Research.

Suman Nath’s Work

Authors
Felix Xiaozhu Lin
  PhD student in Computer Science at Rice University
  Software Developer at Collabera
  Research Intern at Microsoft Research
  Visiting Student at IBM Research
  Intern at Nokia
  Intern at Baidu, Inc.
  Education: Rice University; Tsinghua University; Tsinghua University

Authors
Lenin Ravindranath Sivalingam
  PhD student at CSAIL; joined MIT in the fall of 2008.
  Received a Microsoft Research Graduate Fellowship in 2011.
  On leave from May 2011 to Jan 2013, building AppInsight and related systems at Microsoft Research.

Authors
Jitendra Padhye
  Principal Researcher at Microsoft Research.
  Interested in all aspects of computer networking and networked systems; his recent work has focused on data center networks and mobile computing.
  Has published numerous research papers in top conferences and holds over 25 US patents.
  Recipient of the ACM SIGCOMM Test of Time award.
  Received his PhD in Computer Science from the University of Massachusetts Amherst in 2000.

Introduction
  US consumers spent 30% more time on mobile apps than on the traditional web.
  Today's mobile ads are not contextual and are mostly irrelevant.
  Approach: scrape the page content at runtime, extract keywords, and fetch contextually relevant ads.

Introduction
  i. PHYSICAL CONTEXT-AWARE ADVERTISING: based on users' locations, activities, and other physical contexts.
  ii. BEHAVIORAL TARGETING: based on each individual user's past web usage history.

Background
Contextual Advertising: matching ads to web pages
  (1) Offline ad labeling: ads in the ad network's inventory are labeled with keywords, called bidding keywords.
  (2) Offline keyword extraction: the ad network employs a bot that crawls web pages and uses machine learning algorithms (such as KEX) to extract keywords.
  (3) Online web page to ad matching.

Background
Mobile Ads
  (1) Main signal today: the app metadata.
  (2) Focus of this paper: contextual advertising. This is challenging because apps often transform the content in a variety of ways, sometimes combining content from multiple sources, as a news aggregator app does.
  Two challenges:
    a. limited resources
    b. user privacy

Characterization
Page Data: a rich source of ad keywords
  Yields more keywords than the metadata.
  Changes significantly over time.
  Example: if a user uses the app for finding local restaurants, they should be shown restaurant-related ads.

Characterization
Methodology: Apps
  PhoneMonkey emulates various user interactions (touch, swipe, etc.).
  App session (each run): each of the top 1200 apps is run 30 times with PhoneMonkey.

Methodology: Keyword Extraction
  A modified version of the well-known KEX keyword extractor.
  Example: the words “is” and “zebra” both get a low score; the word “pipe” gets a higher weight because plumbing businesses bid on it.

Methodology
  Page data is a good source of ad keywords.

Methodology
  Page data yields more keywords than metadata.

Methodology
  Page data is dynamic and requires online keyword extraction.

SmartAds Architecture
  Utility. Efficiency. Privacy.
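
To make the three goals concrete, here is a minimal sketch of how a client and an ad server might divide the work, using the Bloom filter and the local/global feature split that the architecture slides describe. Everything specific in it is an assumption for illustration rather than the SmartAds implementation: the class and function names, the two local features chosen, the weight values, the SHA-256-based hashing, and the use of a log-scaled bid frequency as the single global feature.

```python
import hashlib
import math


class BloomFilter:
    """Tiny Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions by salting one hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Hypothetical weights; a real system would learn these from training data.
LOCAL_WEIGHTS = {"anywhere_count": 0.5, "capitalization": 0.3}
GLOBAL_WEIGHT = 1.0


def client_extract(page_words, bidding_filter):
    """Client: keep words that may be bidding keywords, compute wl . xl."""
    counts, caps = {}, {}
    for word in page_words:
        key = word.lower()
        if key not in bidding_filter:
            continue  # not a bidding keyword, so it never leaves the device
        counts[key] = counts.get(key, 0) + 1
        caps[key] = caps.get(key, 0) + (1 if word[:1].isupper() else 0)
    return {
        key: LOCAL_WEIGHTS["anywhere_count"] * counts[key]
        + LOCAL_WEIGHTS["capitalization"] * caps[key]
        for key in counts
    }


def server_rank(local_scores, bid_frequency):
    """Server: add wg . xg (assumed: log-scaled bid frequency) and rank."""
    totals = {}
    for keyword, local_score in local_scores.items():
        if keyword not in bid_frequency:
            continue  # Bloom filter false positive; drop it
        global_score = GLOBAL_WEIGHT * math.log1p(bid_frequency[keyword])
        totals[keyword] = local_score + global_score
    return sorted(totals, key=totals.get, reverse=True)
```

In this sketch the client uploads only a small map from candidate keyword to local score, which is how the design could keep both the upload size and the information revealed to the server small. The server never sees the rest of the page, and it can discard the filter's occasional false positives because it holds the exact keyword list.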

SmartAds Architecture
Achieving Good Utility: local features
  AnywhereCount: total number of times the word appears in the page.
  NearBeginningCount: number of times the word appears near the beginning of the page.
  SentenceBeginningCount: number of times the word starts a sentence.
  PhraseLengthInWord: number of words in the phrase.
  PhraseLengthInChar: number of characters in the phrase.
  MessageLength: length, in characters, of the line containing the word.
  Capitalization: number of times the word is capitalized in the page.
  Font size: font size of the word.

SmartAds Architecture
Achieving Good Utility: global knowledge
  Knowledge of how often advertisers bid on a keyword.
  The frequency is the number of times the word appears in the bidding keyword trace.

SmartAds Architecture
Achieving Efficiency: addressing memory overhead
  We partition the feature vector x into a vector of local features xl (e.g., AnywhereCount) and a vector of global features xg (e.g., global knowledge of a keyword), with weight vectors wl and wg respectively, so that the weighted sum w·x splits into wl·xl + wg·xg.

SmartAds Architecture
Achieving Efficiency: addressing communication overhead
  Bloom filter: a space-efficient probabilistic data structure.
  Bloom filter size at the client.

SmartAds Architecture
Achieving Efficiency: addressing communication overhead
  Dynamics of bidding keywords.

SmartAds Architecture
Achieving Privacy
  The ad server learns only the ad keywords in the page and nothing else.

End-to-end workflow in SmartAds

Results
  Relevance

Results
  End-to-end performance

Results
Overheads
  CPU Overhead: the combined runtime. As seen from Table 1, this overhead is minimal.
  Memory Overhead: the Bloom filter is 1MB. Based on our measurements, the SmartAds control consumes around 2.8MB of memory on average.
  Network Overhead: from our analysis of 1200 apps, the average number of keywords extracted per page is 7.6, so the average extra data sent is 91 bytes. For comparison, various ad controls upload 1.5KB and download 5KB of data on average.
  Battery Overhead: the increase in power consumed was less than 1% and well within experimental noise.

Discussion
Optimizations
  Addressing lack of text:
    Level 1: keywords from the current page.
    Level 2: keywords from all the pages.
    Level 3: keywords for each app, learned offline from that app's metadata.
  Handling related keywords: {HDTV} expands to {HDTV; LED TV; LCD TV}.
  Sources: analyzing Bing web queries and click logs; http://veryrelated.com

Discussion
Dealing with Tail Bidding Keywords
  (1) by serving them when their keywords are semantically related to any keyword in the Bloom filter
  (2) by prioritizing them when the current page does not contain any important ad keywords
  (3) by occasionally serving them even if the contextual signals do not match them.

Comment
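
As a side note on the "addressing lack of text" optimization from the Discussion slides, the three-level fallback could be sketched roughly as below. The function name, its parameters, and the `min_keywords` threshold are hypothetical, and "all the pages" is assumed here to mean the pages seen so far in the session. The slides do not say how SmartAds decides when to fall back.

```python
def choose_keywords(current_page, session_pages, app_metadata, min_keywords=1):
    """Return (level, keywords) from the first level with enough keywords.

    current_page:  keywords extracted from the page on screen (Level 1)
    session_pages: list of keyword lists, one per page seen so far (Level 2)
    app_metadata:  keywords learned offline from the app's metadata (Level 3)
    """
    session = [kw for page in session_pages for kw in page]
    levels = ((1, current_page), (2, session), (3, app_metadata))
    for level, candidates in levels:
        unique = list(dict.fromkeys(candidates))  # dedupe, keep first-seen order
        if len(unique) >= min_keywords:
            return level, unique
    return 0, []  # nothing usable at any level
```

For example, a page with no extractable text falls through to the keywords gathered from earlier pages, and a freshly launched app with no page history falls through to its metadata keywords.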