Collaborative Filtering: Searching and Retrieving Web Information Together
Huimin Lu
December 2, 2004
INF 385D Fall 2004
Instructor: Don Turnbull

Outline
• Introduction
• Collaborative Search Family
• Collaborative Filtering
• Systems
• Process
• Algorithm
• Problems & Solutions
• Privacy

Collaborative Search in the IR World
• Inverted index
• Yellow-pages-like information gateway & Internet search engine (Sun, 1999)
• Needs for collaborative retrieval
• Information-resources-focused systems
  - By CSCW: structuring mechanisms & recommendation techniques
• User-preferences-focused systems

Collaborative Search Types
• Collaborative browsing
• Mediated searching
• Collaborative information filtering
• Collaborative agents
  - Meta-search engines
• Collaborative re-use of results (Setten, 2000)

Collaborative Filtering
• User-based filtering
• Collects taste information from users who are willing to collaborate in the search process, then automatically predicts or filters the relevant information for users (Wikipedia, 2004).
• Stores user profiles & preferences
• Builds a database of users
• Produces a recommended list through the collaborative filter

Collaborative Filtering Systems
• Commercial
  - Amazon
  - Barnes and Noble
  - Netflix
• Non-commercial
  - Moonranker
  - MovieLens
  - AmphetaRate
  - Audioscrobbler
  - Findory
  - Gnomoradio
  - iRATE radio

System Example I: Amazon.com recommendation page

System Example II: Moonranker.com ranking page

System Example III: Movielens.com rating page

Collaborative Filtering Process

Collaborative Filtering Algorithm
• Goal
  - Suggest new items or predict the utility of an item for a user, based on the user's previous likings (Sarwar, 2001)
• Memory-based
  - Uses the entire user-item database
  - Pearson-correlation-based approach, vector-similarity-based approach, the extended generalized vector space model
• Model-based
  - Develops a model of user ratings
  - Bayesian network approach, the aspect model

Problems and Solutions
• Memory-based algorithm problems
  - Sparsity: insufficient user rating information
  - Scalability: nearest-neighbor computation grows with the number of users and the number of items
  - Solution: automatic weighting scheme by MSU & CMU (Jin et al., 2004)
• Model-based algorithm problem
  - Inherent static structure: the model is hard to update, and it requires learning the exact number of clusters and specifying user classes
• Systems problems
  - Scarcity: few ratings for some items
  - Early-rater: no recommendations for new items
  - Solution: collaborative information filtering (communicating agents, correlating profiles, and filterbots, i.e. automated rating robots)

Privacy
• Unsafe server-based systems
• Monopolies
• Peer-to-peer architecture
  - Multi-party computation

Conclusion
Computing environments are becoming more ubiquitous and pervasive. To meet IR users' needs, future collaborative filtering systems should be easy to maintain, with well-designed algorithms and strong protection of user privacy.

References
Balabanovic, M., & Shoham, Y. (1997). Fab: Content-Based Collaborative Recommendation. Communications of the ACM, 40, 3, March 1997, 66-72.
Blackwell, A.F., Stringer, M., Toye, E.F., & Rode, J.A. (2004). Tangible Interface for Collaborative Information Retrieval. Extended abstracts of the 2004 conference on Human factors and computing systems. April 2004.
Canny, J. (2002). Collaborative Filtering with Privacy. Proceedings of the 2002 IEEE Symposium on Security and Privacy. May 2002.
Canny, J. (2002). Collaborative Filtering with Privacy via Factor Analysis. Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval. August 2002.
Coster, R., & Svensson, M. (2002). Inverted File Search Algorithms for Collaborative Filtering. Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval. August 2002.
InfoVis.net (2004). Collaborative Filtering. Retrieved November 09, 2004, from http://www.infovis.net/E-zine/2004/num_155.htm.
Jin, R., Chai, J.Y., & Si, L. (2004). An Automatic Weighting Scheme for Collaborative Filtering. Proceedings of the 27th annual international conference on Research and development in information retrieval. July 2004.
Klein, M., Sayama, H., Faratin, P., & Bar-Yam, Y. (2002). A Complex Systems Perspective on Computer-Supported Collaborative Design Technology. Communications of the ACM, 45, 11, November 2002.
Maes, P. (1994). Agents that Reduce Work and Information Overload. Communications of the ACM, 37, 7, 1994, 31-40.
Manber, U. (1992). Foreword, Information Retrieval: Data Structures and Algorithms. Englewood Cliffs, NJ: Prentice Hall.
Pennock, D.M., Horvitz, E., Lawrence, S., & Giles, C.L. (2000). Collaborative Filtering by Personality Diagnosis: A Hybrid Memory- and Model-Based Approach. Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI). 2000.
Roscheisen, M., & Winograd, T. (n.d.). Generalized Annotations for Shared Commenting, Content Rating, and Other Collaborative Usages. Retrieved November 9, 2004, from http://www.w3.org/Collaboration/Workshop/Proceedings/P11.html.
Sarwar, B., Karypis, G., Konstan, J., & Riedl, J. (2001). Item-Based Collaborative Filtering Recommendation Algorithms. Proceedings of the tenth international conference on World Wide Web. April 2001.
Sarwar, B.M., Konstan, J.A., Borchers, A., Herlocker, J., Miller, B., & Riedl, J. (1998). Using Filtering Agents to Improve Prediction Quality in the GroupLens Research Collaborative Filtering System. Proceedings of CSCW'98, 1998, 345-354.
Setten, M.V., & Hadidy, F.M. (2000). Collaborative Search and Retrieval: Finding Information Together. Retrieved November 8, 2004, from https://doc.telin.nl/dscgi/ds.py/Search.
Sun, M., Bakis, N., & Watson, I. (1999). Intelligent agent based collaborative construction information network. International Journal of Construction Information Technology. Vol. 7, No. 2, pp. 35-46.
Twidale, M.B., Nichols, D.M., Smith, G., & Trevor, J. (1995). Supporting Collaborative Learning during Information Searching. Proceedings of Computer Support for Collaborative Learning '95 (CSCL'95), (Eds.) Schnase, J.L., & Cunnius, E.L., Bloomington, Indiana, 367-74.
Walkerdine, J., & Rodden, T. (2001). Sharing Searches: Developing Open Support for Collaborative Searching. Proceedings of Interact 2001, Japan, July 9th-13th, 2001.
Wikipedia. (2004). Collaborative Filtering. Retrieved November 09, 2004, from http://en.wikipedia.org/wiki/Collaborative_filtering.

Questions or Comments?
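
Backup: Memory-Based Prediction Sketch
The Collaborative Filtering Algorithm slide names a Pearson-correlation-based approach as one memory-based method. The following is a minimal sketch of that idea, not the implementation used by any system listed in this talk. The toy `ratings` data, the user and item names, and the helper functions `mean`, `pearson`, and `predict` are all hypothetical. The sketch assumes a 1-5 rating scale, computes the correlation over co-rated items only, and predicts with a mean-centered, correlation-weighted average over every other user who rated the item.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on an assumed 1-5 scale}.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob":   {"A": 4, "B": 2, "C": 3, "D": 4},
    "carol": {"A": 1, "B": 5, "D": 2},
}

def mean(user):
    """Average of all ratings given by one user."""
    vals = ratings[user].values()
    return sum(vals) / len(vals)

def pearson(u, v):
    """Pearson correlation over items both users rated; 0.0 if undefined."""
    common = set(ratings[u]) & set(ratings[v])
    if len(common) < 2:
        return 0.0
    mu_u = sum(ratings[u][i] for i in common) / len(common)
    mu_v = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mu_u) * (ratings[v][i] - mu_v) for i in common)
    den = (sqrt(sum((ratings[u][i] - mu_u) ** 2 for i in common))
           * sqrt(sum((ratings[v][i] - mu_v) ** 2 for i in common)))
    return num / den if den else 0.0

def predict(user, item):
    """Predict a rating as the user's mean plus a weighted neighbor offset."""
    base = mean(user)
    num = den = 0.0
    for other in ratings:
        if other == user or item not in ratings[other]:
            continue
        w = pearson(user, other)
        # Each neighbor contributes how far above or below their own mean
        # they rated the item, weighted by similarity to the target user.
        num += w * (ratings[other][item] - mean(other))
        den += abs(w)
    # Fall back to the user's own mean when no usable neighbor exists.
    return base + num / den if den else base

if __name__ == "__main__":
    print(round(predict("alice", "D"), 2))
```

Every call to `predict` scans the whole user-item table, which is the scalability problem listed under Problems and Solutions; an item with no ratings falls back to the user's mean, which illustrates the early-rater problem.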