Collaborative Filtering in iCAMP
Max Welling
Professor of Computer Science & Statistics

Example I: Movie Recommendation
http://www.netflix.com/RecommendationsHome?lnkctr=mh2rh&lnkce=sntRc

Example II: Book Recommendation
http://www.amazon.com/Data-Mining-Practical-Techniques-Management/dp/0120884070/ref=sr_1_1?ie=UTF8&s=books&qid=1273092289&sr=1-1

Example III: Internet Search
http://www.google.com/search?hl=en&client=firefox-a&hs=gSR&rls=org.mozilla%3Aen-US%3Aofficial&q=max+welling&aq=f&aqi=g2g-m1&aql=&oq=&gs_rfai=

Back to The Movies: Data
[Figure: ratings matrix of users (approx. 240,000) by movies (approx. 17,770), with a total of approx. 400,000,000 nonzero entries (99% sparse)]

Demo Matlab

[Figure: ratings matrix of users (approx. 240,000) by movies (approx. 17,770), shown with a K x K block]
“K” is the number of factors, or topics.

Conclusion
• We will implement a number of collaborative filtering algorithms in Matlab.
• You will learn: clustering; matrix factorization & principal components analysis; regression; classification (naive Bayes classifier, decision trees, neural networks).
• We will work with real-world data from Netflix, stock portfolio management, and more.
• But most of all: this will be fun!
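
To make the idea of "K" factors concrete, here is a minimal sketch of matrix factorization on a toy ratings matrix. It is written in Python rather than the Matlab used in the course, and it is not taken from the slides: the tiny data set, the function names (factorize, predict, rmse), the learning rate, and the regularization strength are all assumptions chosen for illustration. It also uses the simplest two-matrix form (one K-vector per user, one per movie) and leaves out any K x K matrix in between.

```python
import random

# Toy data (hypothetical): (user, movie) -> rating on a 1-5 scale.
# Most of the 4 x 5 matrix is unobserved, mimicking a sparse ratings matrix.
ratings = {
    (0, 0): 5, (0, 1): 4, (0, 3): 1,
    (1, 0): 4, (1, 2): 5, (1, 4): 1,
    (2, 1): 1, (2, 3): 5, (2, 4): 4,
    (3, 0): 1, (3, 3): 4, (3, 4): 5,
}

def predict(U, V, u, m):
    """Predicted rating: inner product of user and movie factor vectors."""
    return sum(uf * vf for uf, vf in zip(U[u], V[m]))

def rmse(ratings, U, V):
    """Root-mean-square error over the observed entries only."""
    sq = [(r - predict(U, V, u, m)) ** 2 for (u, m), r in ratings.items()]
    return (sum(sq) / len(sq)) ** 0.5

def factorize(ratings, n_users, n_movies, k=2, epochs=2000,
              lr=0.01, reg=0.02, seed=0):
    """Fit K factors per user and per movie by stochastic gradient descent.

    Only observed ratings contribute to the loss; missing entries are ignored.
    """
    rng = random.Random(seed)
    U = [[rng.random() for _ in range(k)] for _ in range(n_users)]
    V = [[rng.random() for _ in range(k)] for _ in range(n_movies)]
    observed = list(ratings.items())
    for _ in range(epochs):
        rng.shuffle(observed)
        for (u, m), r in observed:
            err = r - predict(U, V, u, m)
            for f in range(k):
                uf, vf = U[u][f], V[m][f]
                # Gradient step on squared error plus L2 penalty.
                U[u][f] += lr * (err * vf - reg * uf)
                V[m][f] += lr * (err * uf - reg * vf)
    return U, V

U, V = factorize(ratings, n_users=4, n_movies=5, k=2)
print("training RMSE:", round(rmse(ratings, U, V), 3))
print("predicted rating, user 0 on unseen movie 2:",
      round(predict(U, V, 0, 2), 2))
```

The point of the sketch is that a user never has to rate a movie for it to get a prediction: once every user and every movie has a K-vector, any cell of the matrix can be filled in from an inner product.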