User Interfaces and Information Retrieval
Dina Reitmeyer
WIRED (i385d)
10.14.04

Presentation Outline
• Past problems with interfaces & IR
• Why we need good systems & interfaces
• How to make things better:
  – User-friendly systems (using statistical ranking) – D. Harman
  – WebMate – L. Chen & K. Sycara
  – Dynamic queries – C. Williamson & B. Shneiderman
  – Relevance profiling within documents – D. Harper, S. Coulthard, & S. Yixing
• Whatever happened to … ?
• Conclusion

Interfaces & IR systems: Problems
• Past emphasis on a working system (not on a user-friendly system)
• Usually tacked on a front-end for users (not necessarily user-friendly)
• Users without training were out of luck
• Interfaces using Boolean queries were common

Why do we need a good system & a good interface?
• The user of the system is “the customer”!
• If your system doesn’t work well, no one will use it!
• If your interface doesn’t work or is too confusing/difficult to interact with, no one will use it!
• If you want your system to be useful, how the user views it (and its interface) must be a major concern.

“User-Friendly Systems Instead of User-Friendly Front-Ends” – D. Harman
• Problem: systems not built for users (even with a front-end, still hard to use well)
• Need for user-friendly systems
• One solution: use systems with statistical ranking
  – Simple definition: a system compares a document & a query and estimates the likelihood of the document’s relevance to the query

4 Prototypes of Statistical Retrieval Systems
• PRISE - uses a weighting formula, adds the weights of matching terms, and ranks the results
• CITE - similar ranking system to PRISE, uses relevance feedback*
• MUSCAT - different weighting system, but does use ranking & relevance feedback
• News Retrieval Tool (NRT) - user can use a slider to weight a term, relevance feedback

Harman’s Results
• Users often preferred this type of natural language search to Boolean
• Searches were comparable to Boolean in speed & retrieval of relevant documents
• Easier for novice users; little training required
The Future of Statistical Ranking:
  – add more complex term weighting
  – use IDF (inverse document frequency, a measure of how rare a term is across the collection)
  – use relevance feedback

WebMate: A Personal Agent for Browsing & Searching
L. Chen & K. Sycara
• What is it? How does it work?
  – A stand-alone proxy between the user’s browser & the web
  – Monitors the user’s actions & updates itself constantly with this info
  – Uses TF-IDF (term frequency-inverse document frequency) with multiple vector representations to judge whether a document is relevant, based on what the user has already deemed relevant
  – Uses a trigger pairs model to automatically provide more search terms based on the one(s) the user provides

What does WebMate do?
• It can compile a personal “newspaper” by:
  – Parsing user-designated URLs
  – Extracting links from each headline
  – Fetching the pages from those links
  – Analyzing TF-IDF vectors for each page
  – Comparing a page’s similarity with the user’s current profile
  Results: 50-60% accuracy in articles returned
• It can refine a search by:
  – Taking the user’s search term
  – Using the trigger pairs model to find related terms
  – Searching using those terms together
  Results: WebMate gives many more relevant results than if you simply type the word into AltaVista or Lycos.

The Dynamic HomeFinder: Evaluating Dynamic Queries in a Real-Estate Information Exploration System
C. Williamson & B. Shneiderman
• Direct Manipulation – Benefits
  + graphic representation of objects/actions
  + buttons/sliders instead of query syntax
  + rapid & reversible operations (immediate results)
  + permits use with little/no training

The Experiment
• Users were asked to find answers to different types of queries using:
  – HomeFinder (a dynamic query interface)
  – Q&A (a natural language query interface)
  – Paper listings sorted by different fields

The Results
• HomeFinder rocked the house because:
  – Its use took little/no training.
  – There were no error messages.
  – It was good for viewing trends.
  – It took less time.
  – It was easy for novices to use.
  – No complex query formulation was necessary.
  – People just liked it better!

A Language Modeling Approach to Relevance Profiling for Document Browsing
D. Harper, S. Coulthard, & S. Yixing
• With longer documents available online, users need a tool for within-document retrieval.
• SmartSkim: builds a relevance profile of a document for a given query, using language modeling (a statistical model of text that captures the distribution of text features)

Why SmartSkim could be cool:
• It highlights all query terms in the text
• It shows a histogram depicting the places in the text where relevant passages are to be found
• It is color coded to show where you have & have not looked

Whatever Happened to…?
• WebMate
• Dynamic HomeFinder
• Ben Shneiderman - Spotfire

Conclusion
• If we want our users to utilize our information retrieval systems, we have to have both user-friendly systems & user-friendly interfaces!
• Tools like statistical ranking, personal software agents, relevance profiling, & dynamic queries help us provide what the user needs to interact with a system.
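
To make the statistical ranking idea from the Harman slides concrete, here is a minimal sketch in Python. It scores each document by adding up the weights of the query terms it contains and then sorts the results, roughly the behaviour the slides attribute to PRISE. The weighting used here (plain log(N / df) IDF), the whitespace tokenizer, the function names, and the three sample documents are all assumptions made for illustration; they are not taken from any of the four prototype systems.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def idf_weights(docs):
    """Map each term to log(N / df), where df is the number of docs containing it."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log(n / count) for term, count in df.items()}


def rank(query, docs):
    """Return (score, doc_index) pairs, best first.

    A document's score is the sum of the IDF weights of the query terms
    that appear in it. Ties are broken by document index.
    """
    idf = idf_weights(docs)
    query_terms = set(tokenize(query))
    scored = []
    for i, doc in enumerate(docs):
        matches = query_terms & set(tokenize(doc))
        scored.append((sum(idf[t] for t in matches), i))
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical mini-collection, made up for this example.
docs = [
    "boolean query syntax for expert searchers",
    "ranking documents for a natural language query",
    "relevance feedback improves ranking",
]
results = rank("natural language ranking", docs)
for score, index in results:
    print(f"{score:.3f}  {docs[index]}")
```

The user types plain words with no Boolean operators, and rarer terms ("natural", "language") count for more than a term that appears in most documents ("ranking").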
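
The WebMate step "comparing a page’s similarity with the user’s current profile" can be sketched along the following lines. This is an illustration only: the slides do not say which similarity measure WebMate uses, so cosine similarity is an assumption here, as are the smoothed IDF formula, the choice of one profile vector per liked page (standing in for the "multiple vector representations"), and the sample page texts.

```python
import math
from collections import Counter


def build_idf(corpus):
    """Smoothed IDF, so a term found in every page keeps a small positive weight."""
    n = len(corpus)
    df = Counter()
    for text in corpus:
        df.update(set(text.lower().split()))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(text, idf):
    """Sparse TF-IDF vector for one text, stored as {term: weight}."""
    counts = Counter(text.lower().split())
    return {t: c * idf.get(t, 0.0) for t, c in counts.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical data: pages the user marked as relevant, and new candidates.
liked_pages = [
    "stock market news and analysis",
    "football scores and league tables",
]
candidates = [
    "market analysis for tech stock",
    "recipe for apple pie",
]

idf = build_idf(liked_pages + candidates)
profile = [tfidf_vector(page, idf) for page in liked_pages]


def profile_similarity(page):
    """Similarity of a page to its closest profile vector."""
    vec = tfidf_vector(page, idf)
    return max(cosine(vec, p) for p in profile)


scores = {page: profile_similarity(page) for page in candidates}
for page, score in scores.items():
    print(f"{score:.3f}  {page}")
```

Taking the maximum over several profile vectors lets a user with unrelated interests (finance and football in this made-up profile) match a page on either one.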
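
The dynamic query idea behind the HomeFinder (sliders instead of query syntax, with rapid and reversible operations) reduces to re-running a simple filter every time a slider moves. The sketch below shows only that logic, with no graphical display. The field names, the two sliders, and the listings are hypothetical and are not taken from the HomeFinder itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Home:
    address: str
    price: int
    bedrooms: int


@dataclass
class SliderState:
    """Current positions of two hypothetical sliders."""
    max_price: int
    min_bedrooms: int


def visible_homes(homes, sliders):
    """Return the homes that satisfy every slider; no query syntax involved."""
    return [
        h for h in homes
        if h.price <= sliders.max_price and h.bedrooms >= sliders.min_bedrooms
    ]


# Made-up listings for illustration.
homes = [
    Home("12 Oak St", 150_000, 2),
    Home("4 Elm Ave", 220_000, 3),
    Home("9 Pine Rd", 310_000, 4),
]

sliders = SliderState(max_price=250_000, min_bedrooms=2)
before = visible_homes(homes, sliders)

sliders.min_bedrooms = 3          # the user drags the bedrooms slider up
after = visible_homes(homes, sliders)

sliders.min_bedrooms = 2          # and drags it back: the change is reversible
restored = visible_homes(homes, sliders)

print([h.address for h in before])
print([h.address for h in after])
print([h.address for h in restored])
```

Because every slider position is a valid filter, the user can never produce a malformed query, which fits the "no error messages" result reported for the HomeFinder.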
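
A relevance profile of the kind SmartSkim draws as a histogram can be approximated by sliding a window across the document and scoring each window with a simple language model. The sketch below is a rough illustration and not SmartSkim’s actual method: the unigram model, the linear interpolation with a whole-document background model, the window size, the mixing weight `lam`, and the sample sentence are all assumptions.

```python
import math
from collections import Counter


def relevance_profile(doc_tokens, query_tokens, window=5, lam=0.5):
    """Return one log-likelihood score per window position.

    Each window is treated as a unigram language model. The probability of
    a query term mixes the window estimate with an add-one smoothed
    whole-document estimate, so it is never zero.
    """
    if not doc_tokens:
        return []
    background = Counter(doc_tokens)
    total = len(doc_tokens)
    vocab = len(background) + 1  # one extra slot for unseen terms
    profile = []
    for start in range(max(1, total - window + 1)):
        win = Counter(doc_tokens[start:start + window])
        win_len = sum(win.values())
        log_p = 0.0
        for q in query_tokens:
            p_win = win[q] / win_len
            p_bg = (background[q] + 1) / (total + vocab)
            log_p += math.log(lam * p_win + (1 - lam) * p_bg)
        profile.append(log_p)
    return profile


# Made-up document text for illustration.
doc = ("the interface was hard to use but statistical ranking made "
       "search results easier to scan for every new user").split()
query = ["statistical", "ranking"]

profile = relevance_profile(doc, query, window=4)
best = max(range(len(profile)), key=profile.__getitem__)
print("best window:", " ".join(doc[best:best + 4]))
```

Plotting `profile` against window position gives a curve whose peaks mark the passages most likely to be relevant, which is the information the SmartSkim histogram is described as showing.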