Evaluating ASSOCIATIVE BROWSING by Simulation
Jin Y. Kim / W. Bruce Croft / David Smith

* What do you remember about your documents?
[Screenshot: example personal documents labeled "Registration" and "James"]
Use search if you recall keywords!

* What if keyword search is not enough?
Associative browsing to the rescue!

* Probabilistic User Modeling
• Query generation model
  • Term selection from a target document [Kim & Croft 09]
• State transition model
  • Use browsing when a result looks marginally relevant
• Link selection model
  • Click on browsing suggestions based on perceived relevance

* Simulating Interaction using Probabilistic User Model
[Flow diagram: the target document yields an initial query (e.g. "James Registration"), which is searched.
If the result is not relevant (Rank_D > 50), the query is reformulated (e.g. "Two Dollar Registration");
if it is marginally relevant (11 < Rank_D < 50), the user clicks on a result and browses from it;
once the target document appears in the top 10, the session ends.]

* A User Model for Link Selection
• User's browsing behavior [Smucker & Allan 06]
  • Fan-out 1-3: the number of clicks per ranked list
  • BFS vs. DFS: the order in which documents are visited

* A User Model for Link Selection
• User's level of knowledge
  • Random: randomly click on items in a ranked list
  • Informed: more likely to click on a more relevant item
  • Oracle: always click on the most relevant item
• Relevance is estimated using the position of the target item
[Figure: example ranked lists illustrating the three click strategies]

* Evaluation Results
• Simulated interaction was generated using the CS collection
  • 63,260 known-item finding sessions in total
• The value of browsing
  • Browsing was used in 15% of all sessions
  • Browsing saved 42% of the sessions in which it was used
• Comparison with user study results
  • Roughly matches in terms of overall usage and success ratio

  Evaluation Type   Total     Browsing used     Successful
  Simulation        63,260    9,410 (14.8%)     3,957 (42.0%)
  User Study        290       42 (14.5%)        15 (35.7%)

* Evaluation Results
• Success Ratio of Browsing
[Chart: success ratio of browsing (roughly 0.30-0.48) for random, informed, and oracle user models
across fan-out levels FO1-FO3 (more exploration)]

* Summary
Associative Browsing Model Evaluation by Simulation
• Simulated evaluation showed very similar statistics to the user study in when, and how successfully, associative browsing is used
• Simulated evaluation reveals a subtle interaction between the level of knowledge and the degree of exploration

Any Questions?
Jin Y. Kim / W. Bruce Croft / David Smith
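To make the simulated session flow on the preceding slides concrete, here is a minimal Python sketch of the probabilistic user model (query generation, state transition, and link selection). The rank thresholds (target in the top 10 ends the session, ranks 11-50 trigger browsing, higher ranks trigger reformulation) and the random / informed / oracle click strategies follow the slides; the function names `search`, `browse`, and `generate_query`, the default fan-out, and the weighting used for the informed user are illustrative assumptions, not the authors' implementation.

```python
"""Sketch of one simulated known-item finding session (assumptions noted above)."""
import random

TOP_K = 10             # target found if it appears in the top 10 results
MARGINAL_CUTOFF = 50   # ranks 11..50 count as marginally relevant
MAX_STEPS = 10         # give up after this many query reformulations


def select_links(suggestions, target, knowledge="informed", fan_out=2):
    """Link selection model: which browsing suggestions the simulated user clicks.

    random   -- click uniformly at random
    informed -- more likely to click items nearer the target's position
    oracle   -- always click the most relevant (nearest-to-target) items
    """
    if not suggestions:
        return []
    k = min(fan_out, len(suggestions))
    if knowledge == "random":
        return random.sample(suggestions, k)
    # Perceived relevance is approximated by distance from the target's rank.
    pos = suggestions.index(target) if target in suggestions else len(suggestions)
    by_relevance = sorted(range(len(suggestions)), key=lambda i: abs(i - pos))
    if knowledge == "oracle":
        return [suggestions[i] for i in by_relevance[:k]]
    # Informed user: sample proportionally to closeness to the target's position.
    weights = [1.0 / (1 + abs(i - pos)) for i in range(len(suggestions))]
    return random.choices(suggestions, weights=weights, k=k)


def simulate_session(search, browse, generate_query, target,
                     knowledge="informed", fan_out=2):
    """State transition model: run one session, return True if the target is found."""
    for _ in range(MAX_STEPS):
        query = generate_query(target)           # query generation model
        results = search(query)                  # ranked list from the system
        rank = results.index(target) + 1 if target in results else float("inf")
        if rank <= TOP_K:
            return True                          # target in top 10: session ends
        if rank <= MARGINAL_CUTOFF:
            # Marginally relevant: click a top result and browse from it.
            suggestions = browse(results[0])     # associative browsing links
            if target in select_links(suggestions, target, knowledge, fan_out):
                return True
        # Not relevant (rank > 50): fall through and reformulate the query.
    return False
```

Running this loop over many target documents yields the kind of simulated interaction log summarized in the evaluation tables above.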
* Simulation of Known-item Finding using a Memory Model
[Figure: the user's memory as a graph of terms (t1-t5) and their associations]
• Build a model of the user's memory
  • Model how the memory degrades over time
• Generate search and browsing behavior from the model
  • Query-term selection from the memory model
  • Use information scent to guide browsing choices [Pirolli, Fu, Chi]
• Update the memory model during the interaction
  • New terms and associations are learned

* OPTIONAL SLIDES

* Evaluation Results
• Lengths of Successful Sessions
[Charts: average length of successful sessions (0-2.5) for random, informed, and oracle user models,
comparing FO1, FO2, and FO3 under BFS (left) and DFS (right) browsing orders]

* Summary of Previous Evaluation
• User study by the DocTrack Game [Kim & Croft 11]
  • Collect public documents in the UMass CS department
  • Build a web interface by which participants can find documents
  • Department members were asked to join and compete
• Limitations
  • Fixed collection, with a small set of target tasks
  • Hard to evaluate with varying system parameters
• Simulated evaluation as a solution
  • Build a model of user behavior
  • Generate simulated interaction logs
  • If search accuracy improves by X%, how will it affect user behavior?
  • How would its effectiveness vary for diverse groups of users?

* Building the Associative Browsing Model
1. Document Collection
2. Concept Extraction
3. Link Extraction
4. Link Refinement
• Link features: term similarity, temporal similarity, co-occurrence

* DocTrack Game
[Screenshot: the DocTrack Game interface showing a target item and a "Find It!" prompt]

* Community Efforts based on the Datasets
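As a companion to the "Building the Associative Browsing Model" slide, the sketch below shows one plausible way to score and refine links from the three named features (term similarity, temporal similarity, co-occurrence). The cosine and exponential-decay formulas, the feature weights, and the document/session data layout are assumptions made for illustration; the slide itself only lists the features.

```python
"""Sketch of link extraction and refinement for the browsing model (assumptions noted above)."""
import math
from collections import Counter


def term_similarity(doc_a, doc_b):
    """Cosine similarity between term-frequency vectors (assumed measure)."""
    ta, tb = Counter(doc_a["terms"]), Counter(doc_b["terms"])
    dot = sum(ta[t] * tb[t] for t in ta.keys() & tb.keys())
    norm = (math.sqrt(sum(v * v for v in ta.values()))
            * math.sqrt(sum(v * v for v in tb.values())))
    return dot / norm if norm else 0.0


def temporal_similarity(doc_a, doc_b, scale_days=30.0):
    """Closeness in time, decaying with the gap between timestamps (seconds)."""
    gap_days = abs(doc_a["time"] - doc_b["time"]) / 86400.0
    return math.exp(-gap_days / scale_days)


def cooccurrence(doc_a, doc_b, session_log):
    """Fraction of logged sessions in which both documents were accessed."""
    both = sum(1 for s in session_log if doc_a["id"] in s and doc_b["id"] in s)
    return both / len(session_log) if session_log else 0.0


def link_score(doc_a, doc_b, session_log, weights=(0.5, 0.3, 0.2)):
    """Combined score used to rank browsing suggestions (weights are assumed)."""
    w_term, w_time, w_cooc = weights
    return (w_term * term_similarity(doc_a, doc_b)
            + w_time * temporal_similarity(doc_a, doc_b)
            + w_cooc * cooccurrence(doc_a, doc_b, session_log))


def extract_links(docs, session_log, top_n=5):
    """Link refinement: keep only the top-n highest-scoring links per document."""
    links = {}
    for a in docs:
        scored = sorted((b for b in docs if b["id"] != a["id"]),
                        key=lambda b: link_score(a, b, session_log),
                        reverse=True)
        links[a["id"]] = [b["id"] for b in scored[:top_n]]
    return links
```

The ranked lists produced by `extract_links` are the browsing suggestions that the simulated (or real) user clicks through in the evaluation described earlier.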