Augmenting Web Search Valerie Gomez de la Torre INF 385F: WIRED Dr. Turnbull

Augmenting Web Search
Valerie Gomez de la Torre
INF 385F: WIRED
Dr. Turnbull
November 4, 2004
Overview

How people search for information

Filtering Approaches
• Content-Based Filtering
• Collaborative Filtering

Recommendation Systems
• PHOAKS
• FAB
• Tapestry
• Siteseer
User models: How users search the web
Maglio and Barrett

Routines
• users have a standard pattern of search behavior
• will recall previous searches in terms of their standard routine
Waypoints
• nodes along a search path of unbroken links
• rely on key nodes and not the actual path they took
Web agents based on user models
Maglio and Barrett

Agents – programs that collaborate with human users to try to mimic their expected behavior

Short-cut agent

Waypoint agent
• used to extract repeated patterns
• tracks specific parts of a user’s history
• no backtracking
• connected by a sequence of links
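
The idea of extracting repeated patterns from a user's history can be sketched as counting link sequences that recur. This is only an illustration, not the agents' actual method: the function name `repeated_paths`, the window `length`, and the `min_count` threshold are all assumptions.

```python
from collections import Counter

def repeated_paths(history, length=2, min_count=2):
    """Return page sequences of `length` that recur in a browsing history.

    `history` is an ordered list of visited pages (hypothetical format).
    """
    # Slide a fixed-size window over the history to get every sub-path.
    windows = [tuple(history[i:i + length])
               for i in range(len(history) - length + 1)]
    counts = Counter(windows)
    # Keep only the sub-paths the user followed at least `min_count` times.
    return {path: n for path, n in counts.items() if n >= min_count}

history = ["home", "search", "results", "home", "search", "docs"]
shortcuts = repeated_paths(history)
```

A sub-path that shows up repeatedly is a candidate for a shortcut link offered to the user.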
WBI Tool Bar: using the Short-cut Agent
Maglio and Barrett
Content-Based Filtering
Balabanovic and Shoham

Recommendations are made for a user based solely on a profile built up by analyzing the content of items which that user has rated in the past.
Cons:
• very shallow analysis
• over-specialization
• hard to elicit user feedback
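
A minimal sketch of content-based filtering, assuming items are plain text and a profile is a weighted bag of words. The helper names (`build_profile`, `score`) and the cosine measure are illustrative choices, not taken from Balabanovic and Shoham.

```python
import math
from collections import Counter

def build_profile(rated_items):
    """Build a word-weight profile from (text, rating) pairs."""
    profile = Counter()
    for text, rating in rated_items:
        for word in text.lower().split():
            profile[word] += rating
    return profile

def cosine(a, b):
    """Cosine similarity between two word-weight mappings."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def score(profile, text):
    """Score an unseen item against the user's profile."""
    return cosine(profile, Counter(text.lower().split()))

# Ratings: +1 for liked, -1 for disliked (an assumed scale).
profile = build_profile([
    ("python web search agents", 1),
    ("cooking pasta recipes", -1),
])
```

An item that shares no words with anything the user has rated scores zero, which is one way the over-specialization problem shows up: the system can only recommend more of what the user has already seen.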
Collaborative Filtering
Balabanovic and Shoham

Recommendations are made solely on the basis of similarities to other users
• “nearest neighbors”

Cons:
• new items in database won’t be recommended until rated
• new users begin with a “cold start”
• lack of access to content can also prevent the matching of “nearest neighbors”
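
A minimal sketch of the "nearest neighbors" idea, assuming each user is represented only by the set of items they liked. Jaccard overlap is a stand-in similarity measure chosen for brevity, and `recommend` and `k` are hypothetical names.

```python
def jaccard(a, b):
    """Share of items two users have in common, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(user, likes, k=1):
    """Recommend items liked by the k users most similar to `user`."""
    others = [(jaccard(likes[user], items), name)
              for name, items in likes.items() if name != user]
    # Users with zero overlap are not neighbors at all.
    neighbors = [name for sim, name in sorted(others, reverse=True)[:k]
                 if sim > 0]
    recs = set()
    for name in neighbors:
        recs |= likes[name] - likes[user]
    return recs

likes = {
    "ana": {"a", "b", "c"},
    "ben": {"a", "b", "d"},
    "cy": {"x", "y"},
    "new": set(),  # a new user with no ratings yet
}
```

Here `recommend("ana", likes)` draws on ben's ratings, while the user named `new` gets nothing back because no neighbor can be found, which is the "cold start" problem. An item nobody has rated can never appear in any result.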
PHOAKS: System for sharing recommendations
Terveen, Hill, Amento, McDonald & Creter

“People Helping One Another Know Stuff”
Uses collaborative filtering to sift through Usenet newsgroup messages
Principles of role specialization and reuse
PHOAKS: Rules for selecting recommendations
Terveen, Hill, Amento, McDonald & Creter

Message cannot be cross-posted to too many newsgroups
URL cannot be a part of a poster’s signature or signature file
URL cannot be found in quoted sections of previous text
URL cannot be part of an advertisement
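
The four rules above can be sketched as a filter over one message. Everything about the message format here is an assumption: the dictionary fields, the `max_groups` threshold, the `>` quoting convention, and especially the `is_advertisement` flag, which stands in for whatever test a real system would use to detect advertisements.

```python
import re

URL_RE = re.compile(r"https?://\S+")

def recommended_urls(message, max_groups=3):
    """Return URLs in a message that pass the four selection rules."""
    # Rule 1: skip messages cross-posted to too many newsgroups.
    if len(message["newsgroups"]) > max_groups:
        return []
    # Rule 4: skip advertisements (hypothetical precomputed flag).
    if message.get("is_advertisement", False):
        return []
    # Rule 2: URLs from the poster's signature do not count.
    signature_urls = set(URL_RE.findall(message.get("signature", "")))
    urls = []
    for line in message["body"].splitlines():
        # Rule 3: skip text quoted from a previous message.
        if line.lstrip().startswith(">"):
            continue
        for url in URL_RE.findall(line):
            if url not in signature_urls and url not in urls:
                urls.append(url)
    return urls

msg = {
    "newsgroups": ["rec.music"],
    "body": ("> see http://old.example/quoted\n"
             "Try http://example.org/guide for chords.\n"
             "http://example.org/me"),
    "signature": "http://example.org/me",
}
```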
FAB: A hybrid recommendation system
Balabanovic & Shoham

Combines content-based & collaborative filtering approaches
Components:
• collection agent: pages for a specific topic
• selection agent: pages for a specific user
• central router
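
A loose sketch of how the three components listed above might pass pages along. It covers only the content-matching flow and leaves out the collaborative side; the page format, the word-overlap matching, and all function signatures are assumptions rather than Fab's actual design.

```python
def collection_agent(topic_words, pages):
    """Gather pages that mention any word of the agent's topic."""
    return [p for p in pages if topic_words & p["words"]]

def central_router(pages, user_profiles):
    """Forward each collected page to every user whose profile it overlaps."""
    routed = {user: [] for user in user_profiles}
    for page in pages:
        for user, profile in user_profiles.items():
            if profile & page["words"]:
                routed[user].append(page)
    return routed

def selection_agent(profile, candidates, top_n=1):
    """Pick the candidates that best match one user's profile."""
    ranked = sorted(candidates,
                    key=lambda p: len(profile & p["words"]),
                    reverse=True)
    return [p["url"] for p in ranked[:top_n]]

pages = [
    {"url": "http://example.org/jazz", "words": {"jazz", "music", "history"}},
    {"url": "http://example.org/guitar", "words": {"guitar", "music", "chords"}},
    {"url": "http://example.org/soup", "words": {"soup", "recipes"}},
]
profiles = {"ana": {"guitar", "chords"}, "ben": {"recipes"}}

collected = collection_agent({"music"}, pages)
routed = central_router(collected, profiles)
picks = selection_agent(profiles["ana"], routed["ana"])
```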
FAB: Overview of how it works
Balabanovic & Shoham
Tapestry: Experimental mail system
Goldberg, Nichols, Oki, and Terry

Uses content-based and collaborative filtering for mailing lists
Content-based
• allows a user to create filters that scan across various lists for items of interest

Collaborative
• records user’s reactions to items in annotations
• these annotations can then be searched by other users
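
The two kinds of filtering described above can be sketched side by side. This is an illustration under assumed data structures, not Tapestry's query language: the message and annotation dictionaries and the function names are hypothetical.

```python
def content_filter(messages, keyword):
    """Content-based: scan messages from all lists for a keyword."""
    kw = keyword.lower()
    return [m["id"] for m in messages if kw in m["text"].lower()]

def annotate(annotations, user, message_id, note):
    """Collaborative: record one user's reaction to a message."""
    annotations.append({"user": user, "message_id": message_id, "note": note})

def find_annotated(annotations, by_user=None, note_contains=""):
    """Let other users search the recorded annotations."""
    return [a["message_id"] for a in annotations
            if (by_user is None or a["user"] == by_user)
            and note_contains.lower() in a["note"].lower()]

messages = [
    {"id": 1, "list": "ir-list", "text": "New paper on collaborative filtering"},
    {"id": 2, "list": "misc", "text": "Lunch menu for Friday"},
    {"id": 3, "list": "hci-list", "text": "Filtering email with rules"},
]
annotations = []

matches = content_filter(messages, "filtering")
annotate(annotations, "ana", 1, "excellent overview")
annotate(annotations, "ana", 2, "skip")
endorsed = find_annotated(annotations, by_user="ana", note_contains="excellent")
```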
Tapestry: The flow of documents
Goldberg, Nichols, Oki, and Terry
Siteseer: Bookmarks for collaborative search
Rucker and Polanco

Recommendation system that uses a user’s bookmarking system (bookmarks and folders) to predict and recommend relevant pages to other users.
Once Siteseer has learned each user’s preferences and category scheme, users can become ‘reviewers’ for other users
Siteseer: How it works
Rucker and Polanco

Selects “reviewers”
• compares user’s folders and bookmarks looking for overlap
• gives additional weight to obscure links

Does not focus on semantics

Forms “communities of interest”
• determines similarity by the overlap in content
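
A minimal sketch of overlap-based reviewer selection, assuming each bookmark folder is a set of URLs. Weighting a link by the inverse of how many folders contain it is one assumed way to give "additional weight to obscure links"; the source does not say how Siteseer computes the weight.

```python
from collections import Counter

def url_weights(all_folders):
    """Weight each URL by 1 / (number of folders containing it)."""
    counts = Counter(url for folder in all_folders for url in folder)
    return {url: 1.0 / n for url, n in counts.items()}

def folder_overlap(mine, theirs, weights):
    """Sum the weights of the URLs two folders share."""
    return sum(weights[url] for url in mine & theirs)

def recommend(mine, theirs):
    """Pages a reviewer's folder holds that mine does not."""
    return theirs - mine

mine = {"popular.example", "rare.example"}
folder_a = {"popular.example", "x.example"}
folder_b = {"rare.example", "y.example"}
folder_c = {"popular.example"}

weights = url_weights([mine, folder_a, folder_b, folder_c])
```

Sharing the rarely bookmarked `rare.example` makes `folder_b` a stronger match than `folder_a`, even though each shares exactly one link with `mine`. No page text is examined, consistent with the point that the approach does not focus on semantics.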
Siteseer: Reviewers and Communities of Interest
Rucker and Polanco
Conclusion

Augmenting search systems with collaborative and content-based filtering will provide more interesting results for users

Focusing on users’ search behaviors and patterns will help build systems that think like they do

All we really need to know we learned in kindergarten... Share everything!
References

Terveen, L., W. Hill, et al. (1997). PHOAKS: A System for Sharing Recommendations. Communications of the ACM 40(3): 59-62.
Balabanovic, M. and Y. Shoham (1997). Fab: Content-Based, Collaborative Recommendation. Communications of the ACM 40(3): 66-72.
Goldberg, D., D. Nichols, et al. (1992). Using Collaborative Filtering to Weave an Information Tapestry. Communications of the ACM 35(12): 61-70.
Rucker, J. and M. J. Polanco (1997). Siteseer: Personalized Navigation for the Web. Communications of the ACM 40(3): 73-75.
Maglio, P. and R. Barrett (1996). How to Build Modeling Agents to Support Web Searchers. Paper presented at the Sixth International Conference on User Modeling, New York.