Collaborative Filtering: Possibilities for Digital Libraries
Jon Herlocker, Janet Webster, Seikyung Jung
Oregon State University
CNI 2003/Herlocker, Jung, and Webster

Current search engines are insufficient.

Two important search engine problems
• They don’t understand:
  – Quality
  – Context

But First: Our Context
• Why are we standing up here?
• We think we can improve the digital library experience.

Today’s Context
1. Research questions & hypotheses
2. Collaborative filtering
3. Our approach to CF in the Library
4. Challenges of collaborative filtering for library search
5. Initial lessons learned

The Librarian’s Questions
• As electronic information increases in amount and value, how to provide access to it?
• How to change digital libraries from disconnected collections to integrated systems?
• How to integrate the expertise of librarians into the development process?
• How to adapt traditional library values to new opportunities?

The Computer Scientist’s Questions
• What is the next big leap in document search technology?
• How to overcome the limitations of software’s ability to understand language?
• How can we build a search engine that learns by observing searchers?

Our Research Hypotheses
• Enabling the entire community to participate in organizing and recommending information will add value to the digital library
• In other words: Collaborative Filtering will increase the value of a digital library

What is Collaborative Filtering?
• Communities of people sharing their evaluations of content
• Recommendations are transferred between people of like interest
• Examples:
  – MovieLens.org
  – Epinions.com
  – Launchcast (launch.yahoo.com)
  – Amazon.com

CF and Libraries
• Search is central to the user experience of a digital library
• Collaborative Filtering:
  – Could overcome the limitations of current search technology
  – CF already exists in libraries.
    • Not search, but cataloguing (OCLC)
• Adapting CF for document searching is not trivial.
  – Information needs are dynamic.

Our Approach
• OSU Libraries Recommender System
  – Perform CF at the query level
    • Match similar queries in addition to similar users
  – Generate results based on past user recommendations
  – Infer recommendations from user behavior
  – Integrate with existing library systems and traditions

The Benefits of CF
• Quality is considered.
  – Recommendations are based on human evaluations.
• Context is considered.
• The system gets better as it’s used.
• Doesn’t require significant, centralized human resources

CS Challenges
• How to collect evaluations?
• How to identify the “useful” element of recommendations?
• How to represent the information needs of searchers?
• How to rank results?
• How to design the interface?

Library Challenges
• How to balance privacy with personalization & involvement?
• How to maintain authority of recommended information?
• How to deal with timeliness of information?
• How to integrate with existing library systems?
• How to fund research in the library setting?
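
Illustration: query-level CF
The idea of matching similar queries and ranking documents by past recommendations can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the OSU system's actual method: the word-overlap (Jaccard) similarity, the similarity-weighted average, the 0-to-1 rating scale, the min_similarity threshold, and all function names and sample data are hypothetical choices made for this example.

```python
from collections import defaultdict


def tokenize(query):
    """Lowercase a query and split it into a set of words."""
    return set(query.lower().split())


def jaccard(a, b):
    """Overlap between two word sets: 0.0 (disjoint) to 1.0 (identical)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(new_query, past_sessions, min_similarity=0.2):
    """Rank documents by a similarity-weighted average of past ratings.

    past_sessions is a list of (query_text, {doc_id: rating}) pairs.
    Ratings are assumed to lie in [0, 1] and may be explicit votes or
    values inferred from behavior (an assumption of this sketch).
    """
    new_terms = tokenize(new_query)
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for past_query, ratings in past_sessions:
        sim = jaccard(new_terms, tokenize(past_query))
        if sim < min_similarity:
            continue  # ignore past queries that are too different
        for doc_id, rating in ratings.items():
            weighted_sum[doc_id] += sim * rating
            weight_total[doc_id] += sim
    scores = {doc: weighted_sum[doc] / weight_total[doc] for doc in weighted_sum}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical history of earlier searches and the ratings they produced.
history = [
    ("library hours", {"hours.html": 1.0, "contact.html": 0.2}),
    ("valley library hours", {"hours.html": 0.8}),
    ("interlibrary loan", {"ill.html": 1.0}),
]

for doc_id, score in recommend("library hours today", history):
    print(doc_id, round(score, 2))
```

In this sketch a document is recommended only if an earlier searcher with a similar query rated it, which mirrors the "quality" and "context" benefits listed above; how similarity and ratings should really be defined is exactly what the CS Challenges slide leaves open.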
What We’ve Learned
• Weakness of “old” search technology affects perception of new
• Wrapper technology minimizes IT commitment
• Existing internal data can be used to jumpstart system
• Controlled experiments show
  – Increased performance
  – Increased perception of non-tangibles

CF and Digital Libraries
• Helps handle more electronic information
• Improves search results
• Shapes direction of digital libraries
• Supports collaboration on many levels
Nothing ventured, nothing gained.

Funding
• OSU Libraries Gray Chair for Innovative Technologies
• National Partnership for Advanced Computing Infrastructure (NSF)
• Georgia Pacific HMSC internship

More information
– The Science of the Sleeper
  • Malcolm Gladwell, The New Yorker, October 4th, 1999 (gladwell.com)
– System for Electronic Recommendation Filtering Prototype (SERF) for OSU Libraries
  • http://dl.nacse.org/osu

Contacts
Janet Webster
– Oregon State University Libraries, Hatfield Marine Science Center
– janet.webster@oregonstate.edu
Jon Herlocker
– Oregon State University, School of Electrical Engineering & Computer Science
– herlock@eecs.oregonstate.edu