Efο¬cient Retrieval of Recommendations in a Matrix Factorization Framework Noam Koenigstein Parikshit Ram Yuval Shavitt School of Electrical Engineering Computational Science & Engineering School of Electrical Engineering Tel Aviv University Georgia Institute of Technology Tel Aviv University Motivation • In the field of Recommender System, Matrix Factorization (MF) models have shown superior accuracy for recommendation tasks. E.g., The Netflix Prize, KDD-Cup’11, etc. • Training is fast. Computing test scores is fast. But… Retrieval of Recommendations (RoR) is s--l--o--w ! • This problem is well known in the industry, yet never been approached before in academia! ITEMS 2 Yahoo! Music: 2 1M Users 625K Items 4 5 U S E R S 3 3 Naïve Multithreading: High latency + wasteful 2 .... 4 1 2 2 5 6 Tera elements ~300 multiplications ~5 days CPU Reduction to Inner Product ππ’π = π©ππ’ πͺπ = π©π’ πͺπ cos ππ’π ππ = ππππππ₯π∈π πͺT π©π’ π©π’ =1 Core problem: Given a user vector π©π’ and a set π of item, find an item vector ππ that will maximize π©ππ’ πͺπ π©π (π) πͺπ’ (π) π©π (π) πͺπ’ (π) π©π (π) πͺπ’ (π) π©π (π) πͺπ’ (π) . . . . . . π©π (π) πͺπ’ (π) 1 ππ Best Matches Algorithms • Metric Space • Cosine Similarity • Locality Sensitive Hashing Metric Trees R R Branch-and-bound Algorithm Bounding Inner Product Similarity Approximate Solution Users vectors can be normalized ο Users can be clustered based on their spherical angle! ππ’π = π©ππ’ πͺπ = πͺπ cos ππ’π π©π’ =1 Relative Error Bound What is the error when recommendations are retrieved based on an approximate user vector? ππ π ππ − ππ’ π ππ πππ = πππ‘ ππ π ππ ≤1− cos πππ ππ + Δ cos πππ ππ Adaptive Approximate Solution Experimentations Set-up MovieLens Netflix Yahoo! Music 1,000,206 100,480,507 252,800,275 Users 6,040 480,189 1,000,990 Items 3,952 17,770 624,961 95.81% 98.82% 99.96% Ratings Sparsity Yahoo! Music Recommendations: Modeling Music Ratings with Temporal Dynamics and Item Taxonomy Gideon Dror, Noam Koenigstein, Yehuda Koren (RecSys-11`) Exact Alg. Speedup Approximate Alg. Speedup Speedup vs. Precision Speedup vs. MedianRank Conclusions • We introduce a new and relevant research problem • An exact solution with limited speedup • An approximate solution with a tradeoff between accuracy and speedup • Much room for further research…