THE COMPLEX TASK OF MAKING SEARCH SIMPLE
Jaime Teevan (@jteevan), Microsoft Research
UMAP 2015

THE WORLD WIDE WEB 20 YEARS AGO
Content: 2,700 websites (14% .com)
Tools: Mosaic only 1 year old; pre-Netscape, IE, Chrome; 4 years pre-Google
Search engines: 54,000 pages indexed by Lycos; 1,500 queries per day

THE WORLD WIDE WEB TODAY
Trillions of pages indexed. Billions of queries per day.
We assume information is static. But web content changes!

SEARCH RESULTS CHANGE
New, relevant content
Improved ranking
Personalization
General instability
Can change during a query!

BIGGEST CHANGE ON THE WEB
Behavioral data.

BEHAVIORAL DATA MANY YEARS AGO
Marginalia adds value to books; students prefer annotated texts
Do we lose marginalia when we move to digital documents?
No! Scale makes it possible to look at experiences in the aggregate, and to tailor and personalize
"It is impossible to separate a cube into two cubes, or a fourth power into two fourth powers, or in general, any power higher than the second, into two like powers. I have discovered a truly marvellous proof of this, which this margin is too narrow to contain." (Fermat's marginal note)

PAST SURPRISES ABOUT WEB SEARCH
Early log analysis: Excite logs from 1997 and 1999 (Silverstein et al. 1999; Jansen et al. 2000; Broder 2002)
Queries are not 7 or 8 words long
Advanced operators not used or "misused"
Nobody used relevance feedback
Lots of people search for sex
Navigational behavior common
Prior experience was with library search

SEARCH IS A COMPLEX, MULTI-STEP PROCESS
Typical query involves more than one click
  59% of people return to the search page after their first click
  Clicked results often not the endpoint; people orienteer from results using context as a guide
  Not all information needs can be expressed with current tools; recognition is easier than recall
Typical search session involves more than one query
  40% of sessions contain multiple queries
  Half of all search time is spent in sessions of 30+ minutes
Search tasks often involve more than one session
  25% of queries are from multi-session tasks

IDENTIFYING VARIATION ACROSS INDIVIDUALS
[Figure: normalized DCG (0.75 to 1) by number of people in the group (1 to 6), comparing a shared group ranking with individual rankings]

WHICH QUERY HAS LESS VARIATION?
campbells soup recipes v. vegetable soup recipe
tiffany's v. tiffany
nytimes v. connecticut newspapers
www.usajobs.gov v. federal government jobs
singaporepools.com v. singapore pools

NAVIGATIONAL QUERIES WITH LOW VARIATION
Use everyone's clicks to identify queries with low click entropy (a code sketch follows below)
  12% of the query volume
  Only works for popular queries
Clicks predicted only 72% of the time
  Double the accuracy for the average query
  But what is going on the other 28% of the time?
Many typical navigational queries are not identified
  People visit interior pages: craigslist – 3% visit http://geo.craigslist.org/iso/us/ca
  People visit related pages: weather.com – 17% visit http://weather.yahoo.com

INDIVIDUALS FOLLOW PATTERNS
Getting ready in the morning. Getting to a webpage.
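Below is a minimal sketch (not from the talk) of the click-entropy idea mentioned above: treat each query's clicks across all users as a distribution over URLs and keep the queries whose entropy falls below a threshold. The log structure, the 0.5-bit cutoff, and the sample queries are illustrative assumptions.

```python
from collections import Counter
from math import log2

def click_entropy(clicked_urls):
    """Entropy (in bits) of the click distribution observed for one query.

    Low entropy means most users click the same result, so the query
    behaves navigationally and its top click is easy to predict.
    """
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

# Hypothetical aggregate log: query -> URLs clicked by all users
log = {
    "singapore pools": ["singaporepools.com"] * 95 + ["straitstimes.com"] * 5,
    "vegetable soup recipe": ["allrecipes.com", "foodnetwork.com",
                              "epicurious.com", "bbcgoodfood.com",
                              "allrecipes.com"],
}

low_variation = {q for q, urls in log.items() if click_entropy(urls) < 0.5}
print(low_variation)  # {'singapore pools'}
```

In practice the entropy threshold and a minimum query frequency would be tuned on held-out clicks, which is why an aggregate approach like this only covers popular queries.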
FINDING OFTEN INVOLVES REFINDING
Query: umap
Repeat query (33%), e.g. user modeling, adaptation, and personalization
Repeat click (39%), e.g. http://umap2015.com/
Lots of repeats (43%): queries with a repeat query, a repeat click, or both (33% + 39% − 29% = 43%)

                        Repeat Click (39%)   New Click (61%)
  Repeat Query (33%)          29%                  4%
  New Query (67%)             10%                 57%

IDENTIFYING PERSONAL NAVIGATION
Use an individual's clicks to identify repeat (query, click) pairs (a code sketch follows below)
  15% of the query volume
  Most occur fewer than 25 times in the logs
Queries are more ambiguous
  Rarely contain a URL fragment
  Click entropy the same as for general Web queries
Examples:
  Multiple meanings – enquirer (95% National Enquirer; also Cincinnati Enquirer)
  Found navigation – bed bugs (http://www.medicinenet.com/bed_bugs/article.htm)
  Serendipitous encounters – etsy (Etsy.com [informational]; Regretsy.com (parody))

SUPPORTING PERSONAL NAVIGATION
Example result, shown with two different snippets as the page changed:
Tom Bosley - Wikipedia, the free encyclopedia
  "Thomas Edward 'Tom' Bosley (October 1, 1927 – October 19, 2010) was an American actor, best known for portraying Howard Cunningham on the long-running ABC sitcom Happy Days. Bosley was born in Chicago, the son of Dora and Benjamin Bosley."
  "Bosley died at 4:00 a.m. of heart failure on October 19, 2010, at a hospital near his home in Palm Springs, California. … His agent, Sheryl Abrams, said Bosley had been battling lung cancer."
  en.wikipedia.org/wiki/tom_bosley
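A minimal sketch of the personal-navigation idea above, under assumed data structures: one user's history as (query, clicked URL) pairs and an arbitrary repeat threshold. It simply surfaces the pairs an individual keeps going back to.

```python
from collections import Counter

def personal_navigation_pairs(user_log, min_repeats=2):
    """Find the (query, url) pairs this user re-finds again and again.

    `user_log` is one user's search history as (query, clicked_url) tuples.
    Pairs clicked repeatedly are candidates for personal navigation: the
    engine can route the user straight back to "their" result, even when
    the query looks ambiguous in the aggregate.
    """
    counts = Counter(user_log)
    return {pair for pair, n in counts.items() if n >= min_repeats}

# Hypothetical history for one user
history = [
    ("enquirer", "cincinnati.com"),
    ("umap", "http://umap2015.com/"),
    ("enquirer", "cincinnati.com"),
    ("bed bugs", "http://www.medicinenet.com/bed_bugs/article.htm"),
    ("enquirer", "cincinnati.com"),
]
print(personal_navigation_pairs(history))
# {('enquirer', 'cincinnati.com')}
```

Because the pairs come from a single person's history, this catches ambiguous queries such as enquirer that aggregate click entropy would miss.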
PATTERNS: A DOUBLE-EDGED SWORD
Patterns are predictable. Changing a pattern is confusing.

CHANGE INTERRUPTS PATTERNS
Example: dynamic menus
  Putting commonly used items at the top slows menu item access
Does search result change interfere with refinding?

CHANGE INTERRUPTS REFINDING
When search result ordering changes, people are
  Less likely to click on a repeat result
  Slower to click on a repeat result when they do
  More likely to abandon their search
Even happens when the repeat result moves up!
Happens within a query and across sessions
How to reconcile the benefits of change with the interruption?
[Figure: time to click a repeat result on the second search, S2 (2–9 secs), vs. the first search, S1 (0–20 secs), broken out by whether the result moved Down, was Gone, Stayed, or moved Up]

USE MAGIC TO MINIMIZE INTERRUPTION
ABRACADABRA. Magic happens. YOUR CARD IS GONE!

CONSISTENCY ONLY MATTERS SOMETIMES

BIAS PERSONALIZATION BY EXPERIENCE

CREATE CHANGE-BLIND WEB EXPERIENCES

THE COMPLEX TASK OF MAKING SEARCH SIMPLE
Challenge: the web is complex
  Tools change, content changes
  Different people use the web differently
Fortunately, individuals are simple
  We are predictable and follow patterns
  Predictability enables personalization
Beware of breaking expectations!
  Bias personalization by expectations
  Create magic personal experiences

REFERENCES
Broder. A taxonomy of web search. SIGIR Forum, 2002.
Donato, Bonchi, Chi & Maarek. Do you want to take notes? Identifying research missions in Yahoo! Search Pad. WWW 2010.
Dumais. Task-based search: A search engine perspective. NSF Task-Based Information Search Systems Workshop, 2013.
Jansen, Spink & Saracevic. Real life, real users, and real needs: A study and analysis of user queries on the web. IP&M, 2000.
Kim, Cramer, Teevan & Lagun. Understanding how people interact with web search results that change in real-time using implicit feedback. CIKM 2013.
Lee, Teevan & de la Chica. Characterizing multi-click search behavior and the risks and opportunities of changing results during use. SIGIR 2014.
Mitchell & Shneiderman. Dynamic versus static menus: An exploratory comparison. SIGCHI Bulletin, 1989.
Selberg & Etzioni. On the instability of web search engines. RIAO 2000.
Silverstein, Marais, Henzinger & Moricz. Analysis of a very large web search engine query log. SIGIR Forum, 1999.
Somberg. A comparison of rule-based and positionally constant arrangements of computer menu items. CHI 1986.
Svore, Teevan, Dumais & Kulkarni. Creating temporally dynamic web search snippets. SIGIR 2012.
Teevan. The Re:Search Engine: Simultaneous support for finding and refinding. UIST 2007.
Teevan. How people recall, recognize and reuse search results. TOIS, 2008.
Teevan, Alvarado, Ackerman & Karger. The perfect search engine is not enough: A study of orienteering behavior in directed search. CHI 2004.
Teevan, Collins-Thompson, White & Dumais. Viewpoint: Slow search. CACM, 2014.
Teevan, Collins-Thompson, White, Dumais & Kim. Slow search: Information retrieval without time constraints. HCIR 2013.
Teevan, Cutrell, Fisher, Drucker, Ramos, Andrés & Hu. Visual snippets: Summarizing web pages for search and revisitation. CHI 2009.
Teevan, Dumais & Horvitz. Potential for personalization. TOCHI, 2010.
Teevan, Dumais & Liebling. To personalize or not to personalize: Modeling queries with variation in user intent. SIGIR 2008.
Teevan, Liebling & Geetha. Understanding and predicting personal navigation. WSDM 2011.
Tyler & Teevan. Large scale query log analysis of re-finding. WSDM 2010.
More at: http://research.microsoft.com/~teevan/publications/

THANK YOU!
Jaime Teevan (@jteevan)
teevan@microsoft.com

EXTRA SLIDES
How search engines can make use of change to improve search.

CHANGE CAN IDENTIFY IMPORTANT TERMS
Divergence from norm: cookbooks, frightfully, merrymaking, ingredient, latkes (a code sketch follows below)
Staying power in page
[Figure: term staying power in the page over time, September through December]
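The talk does not spell out the scoring, so the following is only an illustrative sketch of one way to compute a "divergence from norm" for terms in newly added page content: a smoothed log-ratio of the term's frequency in the change versus a background corpus. The function name, smoothing, and sample data are assumptions.

```python
from collections import Counter
from math import log

def divergent_terms(added_text, background_counts, background_total, top_k=5):
    """Rank terms in newly added page content by how far their frequency
    diverges from the background "norm" (add-one smoothed log-ratio).

    Seasonal or topical words (e.g. 'latkes', 'merrymaking') that suddenly
    appear on a page score high; everyday words score near zero.
    """
    counts = Counter(added_text.lower().split())
    total = sum(counts.values())
    scores = {
        term: log(((c + 1) / (total + 1)) /
                  ((background_counts.get(term, 0) + 1) / (background_total + 1)))
        for term, c in counts.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Hypothetical background counts and a December update to a cooking page
background = Counter({"the": 1000, "and": 800, "recipe": 50, "soup": 40})
added = "latkes recipe and merrymaking the latkes"
print(divergent_terms(added, background, sum(background.values())))
# ['latkes', 'merrymaking', 'recipe', 'and', 'the']
```

Combining such a score with how long a term then stays in the page (its staying power) would distinguish durable, important terms from transient noise.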
CHANGE CAN IDENTIFY IMPORTANT SEGMENTS
Page elements change at different rates
Pages are revisited at different rates
Resonance can serve as a filter for important content

EXTRA SLIDES
Impact of change on refinding behavior.

BUT CHANGE HELPS WITH FINDING!
Change to click
  Unsatisfied initially: Gone > Down > Stay > Up
  Satisfied initially: Stay > Down > Up > Gone

          NSAT   SAT
  Up      2.00   4.65
  Stay    2.08   4.78
  Down    2.20   4.75
  Gone    2.31   4.61

Changes around the click
  Always benefit NSAT users
  Best below the click for satisfied users

                     Changes          Static
                     NSAT    SAT      NSAT    SAT
  Above the click    2.30    4.93     2.21    4.93
  Below the click    2.09    4.79     1.99    4.61

EXTRA SLIDES
Privacy issues and behavioral logs.

PUBLIC SOURCES OF BEHAVIORAL LOGS
Public Web service content: Twitter, Facebook, Digg, Wikipedia
Research efforts to create logs: Lemur Community Query Log Project, http://lemurstudy.cs.umass.edu/
  1 year of data collection = 6 seconds of Google logs
Publicly released private logs: DonorsChoose.org (http://developer.donorschoose.org/the-data), Enron corpus, AOL search logs, Netflix ratings

EXAMPLE: AOL SEARCH DATASET
August 4, 2006: Logs released to academic community
  3 months, 650 thousand users, 20 million queries
  Logs contain anonymized user IDs
August 7, 2006: AOL pulled the files, but they were already mirrored

Sample records:
  AnonID   Query                         QueryTime            ItemRank  ClickURL
  1234567  jitp                          2006-04-04 18:18:18  1         http://www.jitp.net/
  1234567  jipt submission process       2006-04-04 18:18:18  3         http://www.jitp.net/m_mscript.php?p=2
  1234567  computational social scinece  2006-04-24 09:19:32
  1234567  computational social science  2006-04-24 09:20:04  2         http://socialcomplexity.gmu.edu/phd.php
  1234567  seattle restaurants           2006-04-24 09:25:50  2         http://seattletimes.nwsource.com/rests
  1234567  all perlman montreal          2006-04-24 10:15:14  4         http://oldwww.acm.org/perlman/guide.html
  1234567  jitp 2006 notification        2006-05-20 13:13:13
  …

August 9, 2006: New York Times identified Thelma Arnold
  "A Face Is Exposed for AOL Searcher No. 4417749"
  Queries for businesses and services in Lilburn, GA (pop. 11k)
  Queries for Jarrett Arnold (and others of the Arnold clan)
  NYT contacted 14 people in Lilburn with the Arnold surname
  When contacted, Thelma Arnold acknowledged her queries
August 21, 2006: 2 AOL employees fired, CTO resigned
September 2006: Class action lawsuit filed against AOL

EXAMPLE: AOL SEARCH DATASET
Other well-known AOL users
  User 927: how to kill your wife
  User 711391: i love alaska (http://www.minimovies.org/documentaires/view/ilovealaska)
Anonymous IDs do not make logs anonymous
  Contain directly identifiable information: names, phone numbers, credit cards, social security numbers
  Contain indirectly identifiable information, e.g. Thelma's queries
  Birthdate, gender, zip code identifies 87% of Americans (see the sketch at the end of this section)

EXAMPLE: NETFLIX CHALLENGE
October 2, 2006: Netflix announces contest
  Predict people's ratings for a $1 million prize
  100 million ratings, 480k users, 17k movies
  Very careful with anonymity post-AOL:
  "All customer identifying information has been removed; all that remains are ratings and dates. This follows our privacy policy. … Even if, for example, you knew all your own ratings and their dates you probably couldn't identify them reliably in the data because only a small sample was included (less than one tenth of our complete dataset) and that data was subject to perturbation."
  Ratings data:
    1:                     [Movie 1 of 17770]
    12, 3, 2006-04-18      [CustomerID, Rating, Date]
    1234, 5, 2003-07-08    [CustomerID, Rating, Date]
    2468, 1, 2005-11-12    [CustomerID, Rating, Date]
    …
  Movie titles:
    10120, 1982, "Bladerunner"
    17690, 2007, "The Queen"
    …
May 18, 2008: Data de-anonymized
  Paper published by Narayanan & Shmatikov
  Uses background knowledge from IMDB
  Robust to perturbations in data
December 17, 2009: Doe v. Netflix
March 12, 2010: Netflix cancels second competition
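Returning to the AOL slide's point that birthdate, gender, and zip code identify 87% of Americans: below is a minimal illustrative sketch of that idea, counting how many records in an "anonymized" dataset are unique on those quasi-identifiers. The record format and sample data are assumptions.

```python
from collections import Counter

def unique_by_quasi_identifiers(records, keys=("birthdate", "gender", "zip")):
    """Fraction of records uniquely identified by a quasi-identifier combination.

    Fields that look harmless on their own (birthdate, gender, zip code)
    pin down most individuals once combined; any record that is unique on
    the combination can be re-identified by joining against an outside
    source that shares those fields.
    """
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)

# Hypothetical "anonymized" records
records = [
    {"id": "A", "birthdate": "1942-07-09", "gender": "F", "zip": "30047"},
    {"id": "B", "birthdate": "1942-07-09", "gender": "F", "zip": "30047"},
    {"id": "C", "birthdate": "1985-01-23", "gender": "M", "zip": "98103"},
    {"id": "D", "birthdate": "1991-11-02", "gender": "F", "zip": "10027"},
]
print(unique_by_quasi_identifiers(records))  # 0.5 - records C and D are unique
```

The same linking logic, applied to ratings and dates rather than demographics, is what Narayanan & Shmatikov used to de-anonymize the Netflix data against IMDB.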