Multi-Query Computationally-Private Information Retrieval with Constant Communication Rate

Jens Groth, University College London
Aggelos Kiayias, University of Athens
Helger Lipmaa, Cybernetica AS and Tallinn University

Information retrieval
[Figure: the client holds an index i, the server holds records x1,...,xn, and the client obtains xi.]

Privacy
[Figure: the server asks "Index i?"; it should not learn the client's index i.]

Example of a trivial PIR protocol
[Figure: the server sends x1,...,xn to the client, who picks out xi.]
Perfectly private: the client reveals nothing
Communication: nℓ bits with ℓ-bit records

Communication bits        Protocol
nℓ                        Trivial protocol
O(nk1/-1ℓ)                Kushilevitz-Ostrovsky 97
O(kℓ)                     Cachin-Micali-Stadler 99
O(k log^2 n + ℓ log n)    Lipmaa 05
O(k+ℓ)                    Gentry-Ramzan 05

Database size: n records
Record size: ℓ bits
Security parameter: k bits (size of RSA modulus)

Multi-query information retrieval
[Figure: the client holds indices i1,...,im, the server holds records x1,...,xn, and the client obtains xi1,...,xim.]

Privacy
[Figure: the server asks "i1,...,im?"; it should not learn the client's indices.]

Our contribution
• Lower bound (information theoretic): Ω(mℓ+m log(n/m)) bits
• Upper bound (CPIR protocol): O(mℓ+m log(n/m)+k) bits

Lower bound Ω(mℓ+m log(n/m)) bits
[Figure: the client with indices i1,...,im and output xi1,...,xim; the server with records x1,...,xn.]
Client and server have unlimited computational power
We do not require the protocol to be private
We assume perfect correctness
We assume worst-case indices and records

Lower bound for 2-move CPIR
Query: (n choose m) possible indices, so Ω(m log(n/m)) bits
Response: m records, so Ω(mℓ) bits

Lower bound for many-move CPIR
Proof overview:
At the loss of a factor 2, assume 1-bit messages are exchanged
View the function as a tree with the client at a leaf choosing an output
We will prove the tree has at least (n choose m) (leaf, output) pairs
Input to the tree-function: I=(i1,...,im) and X=(x1,...,xn)
[Figure: binary protocol tree. The root is C(i1,...,im); its edges 0 and 1 lead to S(x1,...,xn,0) and S(x1,...,xn,1); their edges 0 and 1 lead to C(i1,...,im,0,0), C(i1,...,im,0,1), C(i1,...,im,1,0) and C(i1,...,im,1,1); at a leaf the client outputs xi1,...,xim.]
Observation: If (I,X) and (I´,X´) lead to the same leaf and output, then (I,X´) also leads to this leaf and output
Define F = { (I,X)=(i1,...,im,x1,...,xn) | xi = 1^ℓ if i ∈ I and else xi = 0^ℓ }
If (I,X) ∈ F and (I´,X´) ∈ F with I ≠ I´,
then (I,X´) ∉ F
This means each (I,X) ∈ F leads to a different (leaf, output) pair
For each (I,X) ∈ F the output is 1^ℓ,...,1^ℓ
There are (n choose m) pairs in F, so the tree must have at least (n choose m) leaves
This means the height is at least log (n choose m) ≥ m log(n/m)
So the client and server risk sending ½m log(n/m) bits
For the general case we then get a lower bound of max(mℓ, ½m log(n/m)) = Ω(mℓ+m log(n/m)) bits

Four cases
[Figure: the parameter space is split into regions 1 to 4 by the boundaries ℓ = log(n/m), m = k^(2/3) and m = n/9; label: Trivial PIR (nℓ bits).]

Tool: Restricted CPIR protocol
• Perfect correctness
• A constant δ>0 (e.g. δ=1/25) such that the CPIR uses k bits of communication for parameters satisfying mℓ+m log n ≤ δk
• m = poly(k), n = poly(k), ℓ = poly(k)

Example: Gentry-Ramzan CPIR
Primes: p1,…,pn with |pi| = O(log n)
Prime powers: π1,…,πn with |πi| > ℓ
• Query: N, g with πi1…πim | ord(g)
• Response: c = g^x mod N, where x = xi mod πi for every i
• Extract: c^(ord(g)/(πi1…πim)) = (g^(ord(g)/(πi1…πim)))^x; compute x mod πi1…πim; extract xi1,…,xim

Three remaining cases
ℓm/k CPIRs with record size k/m in parallel
Restricted CPIR: mℓ+m log n ≤ δk
[Figure: regions 2, 3 and 4 remain, with boundaries ℓ = log(n/m), m = k^(2/3) and m = n/9.]

Two remaining cases
mℓ/log(n/m)-out-of-n CPIR with record size log(n/m)
[Figure: regions 3 and 4 remain, with boundaries ℓ = log(n/m), m = k^(2/3) and m = n/9.]

One remaining case
Restricted CPIR: mℓ+m log n ≤ δk
[Figure: region 3 remains, with boundaries ℓ = log(n/m), m = k^(2/3) and m = n/9.]

Parallel extraction
[Figure: four Res-CPIR instances run side by side.]

The problem
• If ℓ = Ω(log n) we could use parallel repetition of the restricted CPIR for mℓ+m log n ≤ δk on blocks of the database to get a constant rate
• But if ℓ is small and m is large, we may lose a multiplicative factor (mℓ+m log n)/(mℓ+m log(n/m)) = 1+log m/(ℓ+log(n/m)) by parallel repetition of the restricted CPIR

Solution
aℓ-bit records
[Figure: each block of three records is turned into tuples: x1,x2,x3 give (x1,x2), (x1,x3), (x2,x3); x4,x5,x6 give (x4,x5), (x4,x6), (x5,x6); x7,x8,x9 give (x7,x8), (x7,x9), (x8,x9).]
ℓ’ = aℓ, m’ = m/a, n’ = … n/a
Restricted CPIR: mℓ+m log n ≤ δk

Summary
[Figure: the client holds indices i1,...,im, the server holds records x1,...,xn, and the client obtains xi1,...,xim.]
• Lower bound: Ω(mℓ+m log(n/m)) bits
• CPIR protocol: O(mℓ+m log(n/m)+k) bits
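
The counting step behind the lower bound can be checked numerically. The sketch below is only an illustration, not part of the talk: the function names and the sample parameters are hypothetical, and logarithms are taken base 2. It computes log (n choose m), the number of bits needed to name one of the possible index sets, compares it with m log(n/m), and evaluates the general-case bound max(mℓ, ½m log(n/m)) next to the nℓ bits of the trivial protocol.

```python
from math import comb, log2


def index_set_bits(n: int, m: int) -> float:
    """log2 of the number of m-element index sets, i.e. log2 (n choose m)."""
    return log2(comb(n, m))


def simple_index_bound(n: int, m: int) -> float:
    """The weaker closed form m * log2(n/m) used on the slides."""
    return m * log2(n / m)


def general_lower_bound(n: int, m: int, ell: int) -> float:
    """max(m*ell, 0.5 * m * log2(n/m)), the general-case bound in bits."""
    return max(m * ell, 0.5 * simple_index_bound(n, m))


def trivial_pir_bits(n: int, ell: int) -> int:
    """Communication of the trivial protocol: the whole database."""
    return n * ell


if __name__ == "__main__":
    # Hypothetical sample parameters: n records of ell bits, m queries.
    n, m, ell = 1024, 16, 8
    print("log2 (n choose m):", index_set_bits(n, m))
    print("m log2(n/m):      ", simple_index_bound(n, m))
    print("lower bound bits: ", general_lower_bound(n, m, ell))
    print("trivial PIR bits: ", trivial_pir_bits(n, ell))
```

With long records the mℓ term dominates the bound; with very short records (for example ell = 1 in the sample above) the index term ½m log(n/m) takes over.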