pptx - Georgia Institute of Technology

advertisement
Efficient Retrieval of Recommendations
in a Matrix Factorization Framework
Noam Koenigstein
Parikshit Ram
Yuval Shavitt
School of Electrical
Engineering
Computational Science &
Engineering
School of Electrical
Engineering
Tel Aviv University
Georgia Institute of
Technology
Tel Aviv University
Motivation
• In the field of Recommender System, Matrix Factorization
(MF) models have shown superior accuracy for
recommendation tasks.
E.g., The Netflix Prize, KDD-Cup’11, etc.
• Training is fast.
Computing test scores is fast.
But… Retrieval of Recommendations (RoR) is s--l--o--w !
• This problem is well known in the industry, yet never been
approached before in academia!
ITEMS
2
Yahoo! Music:
2
1M Users
625K Items
4
5
U
S
E
R
S
3
3
Naïve Multithreading:
High latency +
wasteful
2
....
4
1
2
2
5
6 Tera elements
~300 multiplications
~5 days CPU
Reduction to Inner Product
π‘Ÿπ‘’π‘– = 𝐩𝑇𝑒 πͺ𝑖 = 𝐩𝑒
πͺ𝑖 cos πœƒπ‘’π‘–
π‘žπ‘– = π‘Žπ‘Ÿπ‘”π‘šπ‘Žπ‘₯π‘ž∈𝑆 πͺT 𝐩𝑒
𝐩𝑒
=1
Core problem:
Given a user vector 𝐩𝑒 and a set 𝑆 of item,
find an item vector π‘žπ‘– that will maximize 𝐩𝑇𝑒 πͺ𝑖
𝐩𝒖 (𝟏)
πͺ𝐒 (𝟏)
𝐩𝒖 (𝟐)
πͺ𝐒 (𝟐)
𝐩𝒖 (πŸ‘)
πͺ𝐒 (πŸ‘)
𝐩𝒖 (πŸ’)
πͺ𝐒 (πŸ’)
.
.
.
.
.
.
𝐩𝒖 (𝒏)
πͺ𝐒 (𝒏)
1
π’ƒπ’Š
Best Matches Algorithms
• Metric Space
• Cosine Similarity
• Locality Sensitive Hashing
Metric Trees
R
R
Branch-and-bound Algorithm
Bounding Inner Product Similarity
Approximate Solution
Users vectors can be
normalized
οƒ 
Users can be clustered
based on their
spherical angle!
π‘Ÿπ‘’π‘– = 𝐩𝑇𝑒 πͺ𝑖 = πͺ𝑖 cos πœƒπ‘’π‘–
𝐩𝑒
=1
Relative Error Bound
What is the error when recommendations are
retrieved based on an approximate user vector?
𝑝𝑐 𝑇 π‘žπ‘– − 𝑝𝑒 𝑇 π‘žπ‘–
π‘’π‘Ÿπ‘Ÿ
=
π‘œπ‘π‘‘
𝑝𝑐 𝑇 π‘žπ‘–
≤1−
cos πœƒπ‘π‘ π‘žπ‘– + Δ
cos πœƒπ‘π‘ π‘žπ‘–
Adaptive Approximate Solution
Experimentations Set-up
MovieLens
Netflix
Yahoo!
Music
1,000,206
100,480,507
252,800,275
Users
6,040
480,189
1,000,990
Items
3,952
17,770
624,961
95.81%
98.82%
99.96%
Ratings
Sparsity
Yahoo! Music Recommendations: Modeling Music Ratings with Temporal Dynamics
and Item Taxonomy
Gideon Dror, Noam Koenigstein, Yehuda Koren
(RecSys-11`)
Exact Alg. Speedup
Approximate Alg. Speedup
Speedup vs. Precision
Speedup vs. MedianRank
Conclusions
• We introduce a new and relevant research
problem
• An exact solution with limited speedup
• An approximate solution with a tradeoff
between accuracy and speedup
• Much room for further research…
Download