User Interfaces and Information Retrieval Dina Reitmeyer WIRED (i385d)

10.14.04
Presentation Outline
• Past problems with interfaces & IR
• Why we need good systems & interfaces
• How to make things better:
– User-friendly systems (using statistical ranking) – D. Harman
– WebMate – L. Chen & K. Sycara
– Dynamic queries – C. Williamson & B. Shneiderman
– Relevance profiling within documents – D. Harper, S. Coulthard, & S. Yixing
• Whatever happened to … ?
• Conclusion
Interfaces & IR systems: Problems
• Past emphasis on a working system (not on a
user-friendly system)
• Usually tacked on a front-end for users (not
necessarily user-friendly)
• Users without training were out of luck
• Interfaces using Boolean queries were common
Why do we need a good system &
a good interface?
• The user of the system is “the customer”!
• If your system doesn’t work well, no one will use it!
• If your interface doesn’t work or is too confusing/difficult
to interact with, no one will use it!
• If you want your system to be useful, how the user views
it (and its interface) must be a major concern.
“User-Friendly Systems Instead of User-Friendly
Front-Ends”
–D. Harman
• Problem: systems not built for users
(even with a front-end, still hard to use well)
• Need for user-friendly systems
• One solution: use systems with statistical
ranking
Simple definition: the system compares a document
with a query and estimates the likelihood that the
document is relevant to the query
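A minimal sketch of statistical ranking, assuming a weighting formula based on inverse document frequency (the slides do not give the formulas these systems actually used). The names `tokenize` and `rank` and the three sample documents are hypothetical. Each document's score is the sum of the weights of the query terms it contains, and results are sorted by score.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def rank(query, docs):
    """Return (doc_index, score) pairs, highest score first.

    Assumed weighting: each matching query term contributes
    log(N / df), where df is the number of documents containing it.
    """
    n = len(docs)
    doc_terms = [set(tokenize(d)) for d in docs]
    df = Counter(term for terms in doc_terms for term in terms)
    query_terms = set(tokenize(query))
    scores = []
    for i, terms in enumerate(doc_terms):
        score = sum(math.log(n / df[t]) for t in query_terms if t in terms)
        scores.append((i, score))
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


docs = [
    "boolean query syntax for expert searchers",
    "ranking documents by statistical term weights",
    "statistical ranking helps novice users search",
]
results = rank("statistical ranking novice users", docs)
for index, score in results:
    print(f"{score:.3f}  {docs[index]}")
```

The user types plain words with no Boolean operators, and every document gets a score, so partial matches still appear lower in the list.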
4 Prototypes of Statistical Retrieval
Systems
• PRISE
- uses weighting formula, adds weights of matching terms, results
are ranked
• CITE
- similar ranking system to PRISE, uses relevance feedback
• MUSCAT
- different weighting system, but does use ranking, relevance
feedback
• News Retrieval Tool (NRT)
- user can use slider to weight a term, relevance feedback
Harman’s Results
• Users often preferred this type of natural
language search to Boolean
• Searches were comparable to Boolean in speed
& retrieval of relevant documents
• Easier for novice users; little training required
The Future of Statistical Ranking:
-add more complex term weighting
-use IDF (inverse document frequency; gives more weight to terms that are rare in the collection)
-use relevance feedback
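Relevance feedback can be sketched with a Rocchio-style update, one common textbook formulation and not necessarily what CITE, MUSCAT, or NRT implemented. The function name `rocchio`, the coefficients, and the toy term vectors are assumptions. The query vector moves toward documents the user marked relevant and away from those marked non-relevant.

```python
from collections import Counter


def rocchio(query_vec, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return an updated {term: weight} query vector.

    alpha, beta, and gamma are conventional illustrative values,
    not taken from the slides.
    """
    updated = Counter()
    for term, weight in query_vec.items():
        updated[term] += alpha * weight
    for docs, coeff in ((relevant, beta), (nonrelevant, -gamma)):
        if not docs:
            continue
        for doc in docs:
            for term, weight in doc.items():
                # Average each group's contribution over its size.
                updated[term] += coeff * weight / len(docs)
    # Drop terms whose weight fell to zero or below.
    return {term: w for term, w in updated.items() if w > 0}


query = {"jaguar": 1.0}
relevant_docs = [{"jaguar": 1.0, "cat": 1.0}]
nonrelevant_docs = [{"jaguar": 1.0, "car": 1.0}]
new_query = rocchio(query, relevant_docs, nonrelevant_docs)
print(new_query)
```

After one round of feedback the query gains the term "cat" and never picks up "car", without the user rewriting anything.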
WebMate: A Personal Agent for Browsing &
Searching
L. Chen & K. Sycara
• What is it? How does it work?
– A stand-alone proxy between the user’s browser &
the web
– Monitors user’s actions & updates itself constantly
with this info.
– Uses TF-IDF (term frequency-inverse document
frequency) with multiple vector representations to
judge whether a document is relevant based on what
you have already deemed relevant
– Uses trigger pairs model to automatically provide
more search terms based on the one(s) the user
provides
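A rough sketch of matching a page against a multi-vector profile with TF-IDF and cosine similarity. This illustrates the general technique only; WebMate's actual weighting and profile-update rules are not given in the slides, and the names `tfidf_vectors`, `cosine`, `best_match`, and the sample texts are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(texts):
    """Build one {term: tf * idf} vector per text, with idf = log(N / df)."""
    token_lists = [t.lower().split() for t in texts]
    n = len(token_lists)
    df = Counter(term for tokens in token_lists for term in set(tokens))
    return [
        {term: count * math.log(n / df[term]) for term, count in Counter(tokens).items()}
        for tokens in token_lists
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(page_vec, profile_vecs):
    """Return (profile_index, similarity) for the closest profile vector."""
    return max(
        ((i, cosine(page_vec, p)) for i, p in enumerate(profile_vecs)),
        key=lambda pair: pair[1],
    )


# Two interest vectors in the user's profile, then two candidate pages.
profile_texts = ["stock market shares trading", "football match league goals"]
page_texts = ["market shares fall in early trading", "recipe for lemon cake"]
vectors = tfidf_vectors(profile_texts + page_texts)
profile_vecs, page_vecs = vectors[:2], vectors[2:]
matches = [best_match(page, profile_vecs) for page in page_vecs]
print(matches)
```

Keeping several vectors lets one user have unrelated interests (finance and football here) without blending them into a single averaged profile.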
What does WebMate do?
• It can compile a personal
“newspaper” by:
– Parsing user-designated URLs
– Extracting links from each
headline
– Fetching the pages from those
links
– Analyzing TF-IDF vectors for
each page
– Comparing a page’s similarity
with the user’s current profile
Results: 50-60% accuracy in
articles returned
• It can refine a search by:
– Taking the user’s search term.
– Using the trigger pairs model
to find related terms
– Searching using those terms
together
Results: WebMate gives many
more relevant results than if
you simply type the word into
AltaVista or Lycos.
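The query-refinement idea can be sketched as follows. As a simplification, this version ranks related terms by raw co-occurrence counts within a small window, standing in for whatever scoring the trigger pairs model applies; `build_cooccurrence`, `expand`, and the tiny corpus are hypothetical.

```python
from collections import Counter, defaultdict


def build_cooccurrence(docs, window=3):
    """Count how often each pair of distinct words appears within `window` tokens."""
    co = defaultdict(Counter)
    for doc in docs:
        tokens = doc.lower().split()
        for i, term in enumerate(tokens):
            for other in tokens[max(0, i - window): i + window + 1]:
                if other != term:
                    co[term][other] += 1
    return co


def expand(term, co, k=2):
    """Return the original term plus its k most frequent companions."""
    return [term] + [other for other, _ in co[term].most_common(k)]


corpus = ["stock market prices", "stock market crash", "stock exchange"]
co = build_cooccurrence(corpus)
expanded_query = expand("stock", co, k=1)
print(" ".join(expanded_query))
```

The expanded term list would then be sent to the search engine in place of the single word the user typed.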
The Dynamic HomeFinder: Evaluating Dynamic
Queries in a Real-Estate Information Exploration
System
C. Williamson & B. Shneiderman
• Direct Manipulation
– Benefits
+ graphic representation of objects/actions
+ buttons/sliders instead of query syntax
+ rapid & reversible operations (immediate
results)
+ permits use with little/no training
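A minimal sketch of the filtering behind a dynamic query, assuming each slider maps to one numeric bound. The `Home` fields, the `dynamic_query` function, and the listings are invented for illustration; the real HomeFinder redrew a map display as the sliders moved.

```python
from dataclasses import dataclass


@dataclass
class Home:
    address: str
    price: int
    bedrooms: int
    miles_to_work: float


def dynamic_query(homes, max_price, min_bedrooms, max_miles):
    """Return the homes that satisfy every current slider setting."""
    return [
        h for h in homes
        if h.price <= max_price
        and h.bedrooms >= min_bedrooms
        and h.miles_to_work <= max_miles
    ]


homes = [
    Home("12 Oak St", 180_000, 3, 4.0),
    Home("7 Elm Ave", 250_000, 4, 9.5),
    Home("3 Pine Ct", 140_000, 2, 2.0),
]

# Each call stands in for one slider movement; results update immediately.
first = dynamic_query(homes, max_price=200_000, min_bedrooms=2, max_miles=10)
tighter = dynamic_query(homes, max_price=200_000, min_bedrooms=3, max_miles=10)
looser = dynamic_query(homes, max_price=300_000, min_bedrooms=3, max_miles=10)
print([h.address for h in first])
print([h.address for h in tighter])
print([h.address for h in looser])
```

Because every setting produces a valid result set, there is no query syntax to get wrong and no error message to show, and moving a slider back undoes the change.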
The Experiment
• Users were asked to find answers to
different types of queries using:
– HomeFinder (a dynamic query interface)
– Q&A (a natural language query interface)
– Paper listings sorted by different fields
The Results
• HomeFinder rocked the house because:
– Its use took little/no training.
– There were no error messages.
– It was good for viewing trends.
– It took less time.
– It was easy for novices to use.
– No complex query formulation was necessary.
– People just liked it better!
A Language Modeling Approach to Relevance
Profiling for Document Browsing
D. Harper, S. Coulthard, & S. Yixing
• With longer documents available online,
users need a tool for within-document
retrieval.
• SmartSkim: creates a relevance profile for
a query about a document using language
modeling (a statistical model of text that
captures the distribution of text features)
Why SmartSkim could be cool:
• It highlights all query
terms in the text
• It shows a histogram
depicting the places in
the text where relevant
passages are to be found
• It is color coded to show
where you have & have
not looked
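A rough sketch of a relevance profile, assuming a sliding window over the document and a unigram language model smoothed against the whole document. This is not the authors' exact model; `relevance_profile`, the window size, and the mixing weight `lam` are hypothetical choices. Each window gets the log-likelihood of the query under its smoothed model, and plotting those values along the document gives the histogram.

```python
import math
from collections import Counter


def relevance_profile(doc_tokens, query_terms, window=5, lam=0.5):
    """Return one query log-likelihood score per window position."""
    if not doc_tokens:
        return []
    background = Counter(doc_tokens)
    total = len(doc_tokens)
    profile = []
    for start in range(max(1, total - window + 1)):
        win = Counter(doc_tokens[start:start + window])
        win_len = sum(win.values())
        log_p = 0.0
        for q in query_terms:
            # Mix the window estimate with the whole-document estimate.
            p = lam * win[q] / win_len + (1 - lam) * background[q] / total
            # Floor avoids log(0) for terms absent from the document.
            log_p += math.log(max(p, 1e-12))
        profile.append(log_p)
    return profile


text = (
    "the interface shows a histogram of relevant passages so the reader "
    "can skip ahead while the ranking model scores each passage against "
    "the query"
)
tokens = text.split()
profile = relevance_profile(tokens, ["ranking", "model"])
peak = max(range(len(profile)), key=profile.__getitem__)
print(peak, " ".join(tokens[peak:peak + 5]))
```

The highest-scoring windows are the ones containing both query terms, which is where the reader would be directed to look first.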
Whatever Happened to…?
• WebMate
• Dynamic HomeFinder
• Ben Shneiderman - Spotfire
Conclusion
• If we want our users to utilize our
information retrieval systems, we have to
have both user-friendly systems & user-friendly interfaces!
• Tools like statistical ranking, personal
software agents, relevance profiling, &
dynamic queries help us provide what the
user needs to interact with a system.