Augmenting Web Search
Valerie Gomez de la Torre
INF 385F: WIRED
Dr. Turnbull
November 4, 2004

Overview
How people search for information
Filtering Approaches
Recommendation Systems
• Content-Based Filtering
• Collaborative Filtering
• PHOAKS
• FAB
• Tapestry
• Siteseer

User models: How users search the web
Maglio and Barrett
Routines
• users have a standard pattern of search behavior
• users recall previous searches in terms of their standard routine
Waypoints
• nodes along a search path of unbroken links
• users rely on key nodes, not the actual path they took

Web agents based on user models
Maglio and Barrett
Agents: programs that collaborate with human users and try to mimic their expected behavior
Short-cut agent and Waypoint agent
• used to extract repeated patterns
• tracks specific parts of a user’s history
• no backtracking
• connected by a sequence of links

WBI Tool Bar: using the Short-cut Agent
Maglio and Barrett

Content-Based Filtering
Balabanovic and Shoham
Recommendations are made for a user based solely on a profile built by analyzing the content of items that the user has rated in the past.
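The profile-building idea in content-based filtering can be sketched in a few lines. This is a minimal illustration, not the method of any system covered in these slides: the bag-of-words representation, the +1/-1 ratings, the cosine similarity measure, and the function names are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def term_vector(text):
    """Lowercased bag-of-words term counts for one item (assumed representation)."""
    return Counter(text.lower().split())


def build_profile(rated_items):
    """Sum the term vectors of rated items, weighted by the user's rating."""
    profile = Counter()
    for text, rating in rated_items:
        for term, count in term_vector(text).items():
            profile[term] += rating * count
    return profile


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(profile, candidates, top_n=3):
    """Return up to top_n candidates whose content matches the profile."""
    scored = [(cosine(profile, term_vector(text)), text) for text in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for score, text in scored[:top_n] if score > 0]


# Hypothetical ratings: +1 for liked, -1 for disliked.
rated = [("jazz guitar lessons", 1), ("stock market news", -1)]
candidates = ["guitar chords for jazz", "market stock report", "cooking pasta"]
print(recommend(build_profile(rated), candidates, top_n=2))
```

Because the profile only knows terms from items the user has already rated, an unrelated item such as "cooking pasta" scores zero and is never recommended, which is one way to see why this approach tends to over-specialize.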
Cons (content-based filtering):
• very shallow analysis
• over-specialization
• hard to elicit user feedback

Collaborative Filtering
Balabanovic and Shoham
Recommendations are made solely on the basis of similarities to other users
• “nearest neighbors”
Cons:
• new items in the database won’t be recommended until rated
• new users begin with a “cold start”
• lack of access to content can also prevent the matching of “nearest neighbors”

PHOAKS: System for sharing recommendations
Terveen, Hill, Amento, McDonald & Creter
“People Helping One Another Know Stuff”
Uses collaborative filtering to sift through Usenet newsgroup messages
Principles of role specialization and reuse

PHOAKS: Rules for selecting recommendations
Terveen, Hill, Amento, McDonald & Creter
A message containing a URL counts as a recommendation only if:
• the message is not cross-posted to too many newsgroups
• the URL is not part of the poster’s signature or signature file
• the URL is not found in quoted sections of previous text
• the URL is not part of an advertisement

FAB: A hybrid recommendation system
Balabanovic & Shoham
Combines content-based & collaborative filtering approaches
Components:
• collection agent: pages for a specific topic
• selection agent: pages for a specific user
• central router

FAB: Overview of how it works
Balabanovic & Shoham

Tapestry: Experimental mail system
Goldberg, Nichols, Oki, and Terry
Uses content-based and collaborative filtering for mailing lists
Content-based
• allows a user to create filters that scan across various lists for items of interest
Collaborative
• records users’ reactions to items in annotations
• these annotations can then be searched by other users

Tapestry: The flow of documents
Goldberg, Nichols, Oki, and Terry

Siteseer: Bookmarks for collaborative search
Rucker and Polanco
Recommendation system that uses a user’s bookmarking system (bookmarks and folders) to predict and recommend relevant pages to other users.
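A bookmark-overlap recommender in the spirit of Siteseer might look like the sketch below. It is a simplification under stated assumptions: each user is reduced to a flat list of bookmarked URLs (folders are ignored), and shared URLs that few users have bookmarked count for more through a 1/popularity weight. That weight, the function names, and the example data are hypothetical and are not taken from Rucker and Polanco.

```python
from collections import Counter


def url_popularity(all_bookmarks):
    """Count how many users have bookmarked each URL."""
    counts = Counter()
    for urls in all_bookmarks.values():
        counts.update(set(urls))
    return counts


def overlap_score(mine, theirs, popularity):
    """Sum over shared URLs; rarely bookmarked URLs weigh more (assumed 1/n weight)."""
    return sum(1.0 / popularity[url] for url in set(mine) & set(theirs))


def recommend(user, all_bookmarks, top_n=3):
    """Suggest URLs held by overlapping users that `user` does not have yet."""
    popularity = url_popularity(all_bookmarks)
    mine = set(all_bookmarks[user])
    scores = Counter()
    for other, urls in all_bookmarks.items():
        if other == user:
            continue
        weight = overlap_score(mine, urls, popularity)
        if weight == 0:
            continue  # no shared bookmarks, so this user is not a useful reviewer
        for url in set(urls) - mine:
            scores[url] += weight
    return [url for url, _ in scores.most_common(top_n)]


# Hypothetical bookmark collections.
bookmarks = {
    "ana": ["a.example", "b.example"],
    "ben": ["a.example", "b.example", "c.example"],
    "cy": ["a.example", "d.example"],
    "dee": ["e.example"],
}
print(recommend("ana", bookmarks))
```

Note that the sketch never inspects page content; similarity comes only from which URLs users have saved.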
Once Siteseer has learned each user’s preferences and category scheme, those users can become “reviewers” for other users.

Siteseer: How it works
Rucker and Polanco
Selects “reviewers”
• compares users’ folders and bookmarks, looking for overlap
• gives additional weight to obscure links
Does not focus on semantics
Forms “communities of interest”
• determines similarity by the overlap in content

Siteseer: Reviewers and Communities of Interest
Rucker and Polanco

Conclusion
Augmenting search systems with collaborative and content-based filtering will provide more interesting results for users
Focusing on users’ search behaviors & patterns will help build systems that think like they do
All we really need to know we learned in kindergarten... Share everything!

References
Terveen, L., W. Hill, et al. (1997). PHOAKS: A System for Sharing Recommendations. Communications of the ACM 40(3): 59-62.
Balabanovic, M. and Y. Shoham (1997). Fab: Content-Based, Collaborative Recommendation. Communications of the ACM 40(3): 66-72.
Goldberg, D., D. Nichols, et al. (1992). Using Collaborative Filtering to Weave an Information Tapestry. Communications of the ACM 35(12): 61-70.
Rucker, J. and M. J. Polanco (1997). Siteseer: Personalized Navigation for the Web. Communications of the ACM 40(3): 73-75.
Maglio, P. and R. Barrett (1996). How to Build Modeling Agents to Support Web Searchers. Paper presented at the Sixth International Conference on User Modeling, New York.