Optimal Lower Bounds for Locality Sensitive Hashing (Except When q is Tiny)

Ryan O'Donnell (CMU), Yi Wu (CMU, IBM), Yuan Zhou (CMU)

Locality Sensitive Hashing [Indyk–Motwani '98]

H: a family of hash functions h : objects → sketches, s.t. "similar" objects collide w/ high prob. and "dissimilar" objects collide w/ low prob.

Abbreviated history

- Min-wise hash functions [Broder '97]. For sets A, B, the Jaccard similarity is |A ∩ B| / |A ∪ B|. Broder invented a simple H s.t. Pr[h(A) = h(B)] = |A ∩ B| / |A ∪ B|.
- Indyk–Motwani '98. Defined LSH. Invented a very simple H good for {0,1}^d under Hamming distance. Showed that good LSH implies good nearest-neighbor-search data structures.
- Charikar '02, STOC. Proposed an alternate H ("simhash") for Jaccard similarity. Patented by Google.

Many papers about LSH

Practice: free code base [AI '04]; sequence comparison in bioinformatics; association-rule finding in data mining; collaborative filtering; clustering nouns by meaning in NLP; pose estimation in vision; ...

Theory: [Broder '97], [Indyk–Motwani '98], [Gionis–Indyk–Motwani '98], [Charikar '02], [Datar–Immorlica–Indyk–Mirrokni '04], [Motwani–Naor–Panigrahy '06], [Andoni–Indyk '06], [Terasawa–Tanaka '07], [Andoni–Indyk '08, CACM], [Neylon '10], ...

Definition of LSH

Given: a distance space (X, dist), a "radius" r > 0, and an "approximation factor" c > 1.

Goal: a family H of functions X → S (S can be any finite set) s.t. ∀ x, y ∈ X:

- dist(x, y) ≤ r  ⟹  Pr_{h∼H}[h(x) = h(y)] ≥ p = q^ρ
- dist(x, y) ≥ cr ⟹  Pr_{h∼H}[h(x) = h(y)] ≤ q

Theorem [IM '98, GIM '98]. Given an LSH family for (X, dist), can solve "(r, cr)-near-neighbor search" for n points with a data structure of size O(n^{1+ρ}) and query time Õ(n^ρ) hash-function evaluations.

Example

X = {0,1}^d, dist = Hamming, r = εd, c = 5; we must distinguish dist ≤ εd from dist ≥ 5εd. [IM '98]: take H = {h_1, h_2, ..., h_d}, h_i(x) = x_i, i.e., "output a random coordinate."

Analysis: Pr_{h∼H}[h(x) = h(y)] = 1 − dist(x, y)/d, so

- dist(x, y) ≤ εd  ⟹  Pr_{h∼H}[h(x) = h(y)] ≥ 1 − ε = q^ρ
- dist(x, y) ≥ 5εd ⟹  Pr_{h∼H}[h(x) = h(y)] ≤ 1 − 5ε = q

Since (1 − 5ε)^{1/5} ≤ 1 − ε, this gives ρ ≤ 1/5. In general this family achieves ρ ≤ 1/c, ∀c (∀r).

An "optimal" upper bound

For ({0,1}^d, Ham), let S ≝ {0,1}^d ∪ {✔} and H ≝ {h_ab : dist(a, b) ≤ r}, where h_ab(x) = ✔ if x = a or x = b, and h_ab(x) = x otherwise. Then:

- dist(x, y) ≤ r  ⟹  Pr_{h∼H}[h(x) = h(y)] > 0 (positive, though tiny)
- dist(x, y) ≥ cr ⟹  Pr_{h∼H}[h(x) = h(y)] = 0

So ρ = 0 is "achievable." The End. Any questions?

Wait, what?

Theorem [IM '98, GIM '98]. Given an LSH family for (X, dist), can solve "(r, cr)-near-neighbor search" for n points with a data structure of size O(n^{1+ρ}) and query time Õ(n^ρ) hash-function evaluations — provided q ≥ n^{−o(1)} ("not tiny"). The family above only "achieves" ρ = 0 by driving its collision probabilities exponentially small in d, which makes the data structure useless.

More results

- For R^d with ℓ_p-distance: ρ ≤ 1/c^p, achieved for p = 1 [IM '98], for 0 < p < 1 [DIIM '04], and for p = 2 [AI '06].
- For Jaccard similarity: ρ ≤ 1/c [Bro '98].
- Lower bound, for {0,1}^d with Hamming distance: ρ ≥ .462/c − o_d(1) (assuming q ≥ 2^{−o(d)}) [MNP '06]; immediately gives ρ ≥ .462/c^p for ℓ_p-distance.

Our Theorem

For {0,1}^d with Hamming distance (∃ r s.t.): ρ ≥ 1/c − o_d(1) (assuming q ≥ 2^{−o(d)}); immediately gives ρ ≥ 1/c^p for ℓ_p-distance. The proof also yields ρ ≥ 1/c for Jaccard.

Proof

Proof: noise stability is log-convex. More precisely: a definition, and two lemmas.

Definition (noise stability). Fix any function h : {0,1}^d → S. Pick x ∈ {0,1}^d uniformly at random, and form y by flipping each bit of x independently with probability (1 − e^{−2τ})/2; write x ∼_τ y. Define

K_h(τ) ≝ Pr_{x ∼_τ y}[h(x) = h(y)].
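To make the definition concrete, here is a minimal Monte Carlo sketch in Python (illustrative only; names like sample_correlated_pair and estimate_K are ours, not from the talk). It estimates K_h(τ) for the [IM '98] bit-sampling hash h(x) = x_0, whose exact noise stability is (1 + e^{−2τ})/2, since h(x) = h(y) exactly when coordinate 0 is not flipped.

    import math
    import random

    def sample_correlated_pair(d, tau):
        # x uniform in {0,1}^d; y flips each bit of x independently
        # with probability (1 - e^(-2*tau))/2, i.e. x ~_tau y.
        flip_p = (1.0 - math.exp(-2.0 * tau)) / 2.0
        x = [random.randint(0, 1) for _ in range(d)]
        y = [b ^ (random.random() < flip_p) for b in x]
        return x, y

    def estimate_K(h, d, tau, trials=200_000):
        # Monte Carlo estimate of K_h(tau) = Pr_{x ~_tau y}[h(x) = h(y)].
        hits = 0
        for _ in range(trials):
            x, y = sample_correlated_pair(d, tau)
            hits += h(x) == h(y)
        return hits / trials

    # Bit-sampling hash h(x) = x_0; exact K_h(tau) = (1 + e^(-2*tau))/2.
    d, tau = 24, 0.1
    print(estimate_K(lambda x: x[0], d, tau))  # ~0.909, up to sampling noise
    print((1 + math.exp(-2 * tau)) / 2)        # 0.90936...

Applying the same estimator with a fresh random coordinate per trial would estimate K_H(τ) for the whole family, which is the quantity the proof below works with.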
Lemma 1. For x ∼_τ y, dist(x, y) = (1 − e^{−2τ})d/2 ± o(d) w.v.h.p.; this is ≈ τd when τ ≪ 1. Proof: Chernoff bound and Taylor expansion.

Lemma 2. K_h(τ) is a log-convex function of τ, for any h. (Figure: plot of log K_h(τ) against τ.) Proof: Fourier analysis of Boolean functions.

Theorem. LSH for {0,1}^d requires ρ ≥ 1/c − o_d(1).

Proof. Say H is an LSH family for {0,1}^d with parameters (εd + o(d), cεd − o(d), q^ρ, q), i.e., radius r ≈ εd and approximation factor c − o(1). Define

K_H(τ) ≝ E_{h∼H}[K_h(τ)] = E_{h∼H}[Pr_{x∼_τ y}[h(x) = h(y)]] = E_{x∼_τ y}[Pr_{h∼H}[h(x) = h(y)]].

(K_H is a non-negative linear combination of log-convex functions, so K_H(τ) is also log-convex.)

By Lemma 1, w.v.h.p. a pair x ∼_τ y has dist(x, y) ≈ (1 − e^{−2τ})d/2 ≈ τd. Taking τ = ε puts the pair within the radius, and taking τ = cε puts it beyond it, so the LSH guarantees give:

- ln K_H(ε) ≳ ρ ln q
- ln K_H(cε) ≲ ln q
- ln K_H(0) = ln 1 = 0

(The assumption q ≥ 2^{−o(d)} enters here: the 2^{−Ω(d)} failure probability from Lemma 1 must be negligible compared with q.)

(Figure: ln K_H(τ) against τ; the chord from (0, 0) to (cε, ln q) lies on or above the convex curve ln K_H at τ = ε.)

Since ln K_H(τ) is convex, ln K_H(ε) ≤ (1 − 1/c) · ln K_H(0) + (1/c) · ln K_H(cε) ≤ (1/c) · ln q.

∴ ρ ln q ≤ (1/c) · ln q (up to o(1) terms), and dividing by ln q < 0 gives ρ ≥ 1/c − o_d(1).

The End. Any questions?
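For readers who want to poke at Lemma 2 numerically, here is a small self-contained check (illustrative only; the hash h below is an arbitrary hypothetical choice). It computes K_h(τ) exactly by enumerating all pairs, which keeps Monte Carlo noise out of the test, and then verifies midpoint convexity of ln K_h on an evenly spaced grid of τ values.

    import math
    from itertools import product

    def K_exact(h, d, tau):
        # Exact K_h(tau): average over x uniform in {0,1}^d, weighting each
        # y by flip_p^dist(x,y) * (1 - flip_p)^(d - dist(x,y)).
        p = (1.0 - math.exp(-2.0 * tau)) / 2.0
        total = 0.0
        for x in product((0, 1), repeat=d):
            for y in product((0, 1), repeat=d):
                dist = sum(a != b for a, b in zip(x, y))
                total += p**dist * (1.0 - p)**(d - dist) * (h(x) == h(y))
        return total / 2**d

    # Arbitrary example hash: first coordinate together with overall parity.
    d = 6
    h = lambda x: (x[0], sum(x) % 2)

    taus = [0.1, 0.2, 0.3, 0.4, 0.5]
    logK = [math.log(K_exact(h, d, t)) for t in taus]
    for i in range(1, len(taus) - 1):
        # Midpoint convexity: ln K_h at the midpoint <= average of neighbors.
        assert 2 * logK[i] <= logK[i - 1] + logK[i + 1] + 1e-9
    print("ln K_h is midpoint-convex on the grid, as Lemma 2 predicts")

Lemma 2 asserts log-convexity for every h, so any other small hash substituted above should pass the same check.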