When Are LDCs a False Promise? Moni Naor Weizmann Institute of Science

Talk Based on:
• The Complexity of Online Memory Checking
[Naor and Rothblum]
• Fault Tolerant Storage And Quorum Systems
[Nadav and Naor]
• On the Compressibility of NP Instance and
Cryptographic Applications
[Harnik and Naor]
Theme: cases where LDC should be helpful but
•Either provably not helpful
•Or open problem
Authentication
Verifying a string has not been modified
– Central problem in cryptography
– Many variants
Our Setting:
• User works on large file residing on a remote server
• User stores a small secret `fingerprint’ (hash) of file
– Used to detect corruption
• What is the size of the fingerprint?
– A well understood problem
Online Memory Checking
Problem with the model:
What if we don’t want to read the entire file?
What if we only want small part? Read entire file?!
Idea: Don’t verify the entire file, verify what you need!
– How much of the file do you read per
authenticated bit?
– How large a fingerprint do you need?
Online Memory Checkers
User makes store and retrieve requests to memory
a vector in {0,1}^n under adversary’s control
Checker Checks: answer to retrieve = last stored value
Checker:
– Has secret reliable memory: space complexity s(n)
– Makes its own reads/writes: query complexity q(n)
Want small s(n) and small q(n)!
[Figure: the user sends store(i,b) and retrieve(i) requests to the memory checker C and gets back a bit b. C keeps s(n) bits of secret memory and reads/writes q(n) bits of public memory per request.]
Memory Checker Requirements:
For ANY sequence of user requests and ANY responses from public memory:
Completeness: If every read from public memory = last write
Guarantee: user retrieve = last store (w.h.p)
Soundness: If some read from public memory ≠ last write
Guarantee: user retrieve = last store or BUG (w.h.p)
[Figure: on retrieve(i), the memory checker C, with s(n) bits of secret memory, reads from public memory and returns to the user either b or BUG.]
Past Results:
[Blum, Evans, Gemmel, Kannan and Naor 1991]
Offline Memory Checkers:
Detect errors only at end of long request sequence
– q(n)=O(1) (amortized)
– s(n)=O(log n)
– Very Simple
– No Crypto assumptions!
Online Memory Checkers:
• With One-Way Functions (are they necessary?!)
– q(n)=O(log n)
– s(n)=n^ε (for any ε > 0)
• No Computational Assumptions
– q(n) (any query complexity, in chunks)
– s(n) = O(n/q(n))
– s(n) x q(n) = O(n)
Other Results:
• Optimal [Gemmel Naor 92]
• Must be invasive [Ajtai 2003]
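To make the online setting concrete, here is a minimal sketch of a hash-tree checker with q(n) = O(log n) reads per request and a single hash value of secret memory. It is an illustration only, not the construction from the cited papers: it uses SHA-256 in a Merkle tree, which relies on collision resistance (a stronger assumption than the one-way functions named on the slide), and the class name `MerkleChecker` and its layout are assumptions made for this example.

```python
import hashlib


def _h(*parts):
    """SHA-256 over length-prefixed parts (stand-in hash, an assumption)."""
    digest = hashlib.sha256()
    for p in parts:
        digest.update(len(p).to_bytes(4, "big") + p)
    return digest.digest()


class MerkleChecker:
    """Toy online memory checker for n = 2**depth bits.

    self.public is the public memory (tree nodes in heap layout) and is
    assumed to be fully under the adversary's control.
    self.root is the only secret, reliable state.
    """

    BUG = "BUG"

    def __init__(self, depth):
        self.n = 1 << depth
        self.public = [b""] * (2 * self.n)
        for i in range(self.n):                      # leaves hold the bits
            self.public[self.n + i] = b"\x00"
        for v in range(self.n - 1, 0, -1):           # inner nodes hold hashes
            self.public[v] = _h(self.public[2 * v], self.public[2 * v + 1])
        self.root = self.public[1]

    def _read_path(self, i):
        """Read leaf i and the siblings on its path: 1 + log n public reads."""
        v = self.n + i
        leaf, sibs = self.public[v], []
        while v > 1:
            sibs.append(self.public[v ^ 1])
            v //= 2
        return leaf, sibs

    def _fold(self, i, leaf, sibs):
        """Hashes of the nodes from the parent of leaf i up to the root."""
        v, val, out = self.n + i, leaf, []
        for sib in sibs:
            val = _h(val, sib) if v % 2 == 0 else _h(sib, val)
            v //= 2
            out.append(val)
        return out

    def retrieve(self, i):
        leaf, sibs = self._read_path(i)
        if self._fold(i, leaf, sibs)[-1] != self.root:
            return self.BUG
        return leaf[0]

    def store(self, i, b):
        leaf, sibs = self._read_path(i)
        if self._fold(i, leaf, sibs)[-1] != self.root:
            return self.BUG                          # never absorb bad siblings
        new_path = self._fold(i, bytes([b]), sibs)   # reuse the verified reads
        v = self.n + i
        self.public[v] = bytes([b])
        for val in new_path:
            v //= 2
            self.public[v] = val
        self.root = new_path[-1]
```

Replaying an old snapshot of the public memory after a later store changes the recomputed root, so `retrieve` answers BUG rather than a stale bit.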
Authenticators
Memory Checkers allow reliable local decodability,
What about reliable local testability?
Authenticators:
• Encode the file x ∈ {0,1}^n into:
• a large public encoding p_x
• a small secret encoding s_x. Space complexity: s(n)
• Decoding Algorithm D:
– Receives a public encoding p and decodes it into a vector x ∈ {0,1}^n
• Consistency verifier (repeatedly) checks the public encoding: was it (significantly) corrupted? Reads only a few bits: t(n).
– If not corrupted: verifier should output “Ok”
– If verifier outputs “Ok”, decoder can (whp) retrieve the file
Pretty Good Authenticator
with computational assumptions
• Idea: encode file X using a good error correcting code C (good example: Reed Solomon)
– Actually erasures are more relevant
– As long as a certain fraction of the symbols of C(X) is available, can decode X
• Add to each symbol a tag F_k(a,i), a function of
• secret information k ∈ {0,1}^s, seed of a PRF
• symbol a ∈ Σ
• location i
• Verifier picks a random location i, reads symbol a and tag t
– Checks whether t = F_k(a,i) and rejects if not
• Decoding process removes all symbols with inappropriate tags and uses the decoding procedure of C
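The tagged-codeword idea above can be sketched as follows. This is a toy under stated assumptions, not the talk's exact scheme: HMAC-SHA256 stands in for the PRF, the code is a small systematic Reed-Solomon code over GF(257) decoded by Lagrange interpolation, and all function names (`encode`, `spot_check`, `verify`, `decode`) are hypothetical.

```python
import hashlib
import hmac
import random

P = 257  # smallest prime above 255, so every byte is a field element


def lagrange_eval(pts, x):
    """Evaluate at x the unique polynomial of degree < len(pts) through pts (mod P)."""
    total = 0
    for j, (xj, yj) in enumerate(pts):
        num = den = 1
        for m, (xm, _) in enumerate(pts):
            if m != j:
                num = num * (x - xm) % P
                den = den * (xj - xm) % P
        total = (total + yj * num * pow(den, -1, P)) % P
    return total


def tag(key, a, i):
    """Tag of symbol a at location i; HMAC-SHA256 is an assumed stand-in PRF."""
    return hmac.new(key, f"{i}:{a}".encode(), hashlib.sha256).digest()


def encode(key, msg, m):
    """Public encoding: m code symbols of msg (k = len(msg) <= m <= 257), each tagged."""
    pts = list(enumerate(msg))            # message bytes are the values at 0..k-1
    symbols = [lagrange_eval(pts, x) for x in range(m)]
    return [(a, tag(key, a, i)) for i, a in enumerate(symbols)]


def spot_check(key, public, i):
    """Check the tag at one location: a single symbol and tag are read."""
    a, t = public[i]
    return hmac.compare_digest(t, tag(key, a, i))


def verify(key, public, rng=random):
    """Consistency verifier: test one uniformly random location."""
    return spot_check(key, public, rng.randrange(len(public)))


def decode(key, public, k):
    """Drop symbols with bad tags (erasures), then interpolate from k good ones."""
    good = [(i, a) for i, (a, _) in enumerate(public) if spot_check(key, public, i)]
    if len(good) < k:
        return None                        # too many erasures to decode
    return bytes(lagrange_eval(good[:k], x) for x in range(k))
```

Each call to `verify` catches a corruption with probability equal to the fraction of corrupted locations, so repeated checks either notice heavy corruption or leave enough good symbols for `decode`.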
Memory Checker ⇒ Authenticator
If there exists an online memory checker with
– space complexity s(n)
– query complexity t(n)
then there exists an authenticator with
– space complexity O(s(n))
– query complexity O(t(n))
Idea: Use a high-distance code
Improve the Information Theoretic Upper
Bound(s)?
Maybe we can use:
Locally Decodable Codes?
Locally Testable Codes?
PCPs of proximity?
The Lower Bound
Theorem 1 [Tight lower bound]:
For any online memory checker secure against a
computationally unbounded adversary
s(n) x q(n) = (n)
True also for authenticators
Memory Checkers and One-Way Functions
Breaking the lower bound implies one-way functions.
Theorem 2:
If there exists an online memory checker:
– Working in polynomial time
– Secure against polynomial time adversaries
– With query and space complexity:
s(n) x q(n) < c · n (for a constant c > 0)
Then there exist functions that are hard to invert for
infinitely many input lengths
(“almost one-way” functions)
This Talk:
• Will not say much about the proof
– It is involved
• Initial insight: connection to the simultaneous
message model
Simultaneous Messages Protocols [Yao 1979]
[Figure: Alice holds x ∈ {0,1}^n and Bob holds y ∈ {0,1}^n. Alice sends message mA and Bob sends message mB to Carol, who must output f(x,y); here, x=y?]
• For the equality function:
– |mA| + |mB| = Θ(√n)
– |mA| x |mB| = Ω(n)
[Newman Szegedy 1996]
[Babai Kimmel 1997]
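For intuition on the √n upper bound, here is a sketch of the well-known row/column protocol for equality with private coins. Both parties encode their input with the same fixed code, view the codeword as a square matrix, and send one random row (Alice) or one random column (Bob); Carol compares the single shared entry. The code used here, a random linear code drawn from a fixed public seed, and all function names are assumptions of this sketch.

```python
import math
import random


def encode(x, n, side, seed=0):
    """Encode the n-bit integer x into side*side bits, laid out as a square matrix.

    Each code bit is the parity of x masked by a pseudo-random n-bit mask.
    The masks come from a fixed public seed, so both parties use the same code.
    """
    gen = random.Random(seed)
    bits = [bin(gen.getrandbits(n) & x).count("1") & 1 for _ in range(side * side)]
    return [bits[r * side:(r + 1) * side] for r in range(side)]


def alice(x, n, side, rng):
    """Alice's message: a random row index and that row (about sqrt(m) bits)."""
    r = rng.randrange(side)
    return r, encode(x, n, side)[r]


def bob(y, n, side, rng):
    """Bob's message: a random column index and that column."""
    c = rng.randrange(side)
    return c, [row[c] for row in encode(y, n, side)]


def carol(m_a, m_b):
    """Carol accepts iff the row and the column agree on their common entry."""
    (r, row), (c, col) = m_a, m_b
    return row[c] == col[r]
```

If x = y Carol always accepts. If x ≠ y the shared entry is a uniformly random codeword position, so she rejects with probability equal to the relative distance between the two codewords; a constant number of repetitions drives the error down while keeping both messages at O(√n) bits.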
Ingredients for Full Proof:
• Consecutive Messages Model:
Generalized communication complexity lower bound.
• Adversary “learns” public memory access distribution:
Learning Adaptively Changing Distributions [NR06].
• “Bait and Switch” technique:
Handle adaptive checkers.
• One-Way functions:
Breaking the generalized communication complexity
lower bound in a computational setting requires one-way functions.
Conclusions for OMC
Settled the complexity of online memory checking
Characterized the computational assumptions
required for good online memory checkers
Open Questions:
Do we need logarithmic query complexity for online
memory checking with computational assumptions?
Understanding relationships of crypto/complexity
objects
Quantum Memory Checkers?
LDC
Goal
• Distributed file storage system
– Peer-to-peer environment
– Processors join and leave the system continuously
Want to be able to store and retrieve files distributively
• Partial Solutions
– Distributed File sharing applications [Gnutella, Kazaa]
– Distributed Hash Tables [DH, Chord, Viceroy]
• Store (key, value) pairs and perform lookup on key
Fault-Tolerant Storage System
• Censor
– Aims to eliminate access to some files
– Can take down some servers
• Design Goal:
– A reader should be able to reconstruct each file with
high probability even after faults have occurred
Probability taken over coins of the writer and reader
Adversarial Behavior
• How are the faulty processors chosen?
What is the influence of the adversary?
• Type of faults
– Complete/Partial control
Adversarial Model
• Adversary chooses the set of processors to crash
• Different degrees of adaptiveness
– Non adaptive adversary
• Choice of faulty processors is not based on their content
– Adversary with a limited number of queries
• May query some processors
• fail-stop failures
– We do not consider Byzantine failures
Other Fault Models
• Random faults model:
– Examples: Distance Halving DHT, Chord
– Standard technique:
• Replication to log(n) processors
• Assures survival with high probability
• Adversarial faults
[Fiat, Saia]
– Large fraction accessible after adversary crashes a
linear fraction of the processors
• Still, a censor can target a specific file
Measures of Quality
• Read/Write complexity:
– Average number of processors accessed during a read/write
operation
• Number of rounds:
– Number of rounds required from an adaptive reader
• Blowup Ratio:
– Ratio between the total number of bits used for the storage of a
file and its size
Connection to LDC
• If you are willing to have high write complexity:
• Can encode ALL the data with an LDC
• Parameters of the LDC determine how good the
data storage is
Probabilistic Storage system
based on ε-intersecting quorum system
• Storage System:
– To store a file: pick a set of size Θ(√n) uniformly at random
• replicate the file to all members of the quorum set
– Retrieval: Choose a random set of size Θ(√n) and probe its members
– Intersection follows from the birthday paradox
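A small simulation can illustrate why random write and read sets of size on the order of √n intersect on a live processor with high probability. It is a sketch under assumptions: the constant `c`, the function name `read_success_rate`, and the modelling of a non-adaptive adversary as a fixed crashed set chosen without looking at the contents are all choices made for this example.

```python
import math
import random


def read_success_rate(n, c, crashed, trials, seed=0):
    """Fraction of trials in which a random read set hits a live replica.

    n        number of processors
    c        write and read sets both have size ceil(c * sqrt(n))
    crashed  set of crashed processors, fixed before the file is written
    """
    rng = random.Random(seed)
    k = min(n, math.ceil(c * math.sqrt(n)))
    hits = 0
    for _ in range(trials):
        write_set = set(rng.sample(range(n), k))   # replicas of the file
        read_set = rng.sample(range(n), k)         # processors the reader probes
        if any(p in write_set and p not in crashed for p in read_set):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n = 10_000
    crashed = set(range(n // 2))                   # half of the processors are down
    print(read_success_rate(n, c=4, crashed=crashed, trials=200))
```

With n = 10,000 and c = 4 the two sets share about 16 processors in expectation, roughly 8 of them live when half the system is crashed, so a failed read is rare. The price is visible in the parameters: both read and write touch about 4√n processors, and the file is replicated that many times.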
Properties of the Probabilistic Storage System
• Pros:
– Simplicity
– Resilient against linear number of faults
• Even if the processors are chosen by the adversary adaptively
– Adapted to a dynamic environment [Abraham, Malkhi]
•Cons:
•High read/write complexity
•High blowup-ratio
Want a storage system with better parameters
Non-adaptive readers are wasteful!
• Non-adaptive reader:
– Processors are chosen without accessing any processor
Theorem:
A fault tolerant storage system, in the non-adaptive reader model, resilient against Θ(n) faults, cannot do better than the intersecting storage system example:
• Read Complexity · Write Complexity is Ω(n)
• Blowup Ratio is Ω(√n)
Open Question
• Do the lower bounds for the case when both the
reader and the adversary are non-adaptive hold
when both are fully adaptive?
For Effort
The Problem
Is it possible to have an efficient procedure:
• Given CNF formulae φ1 and φ2 on same variables and same length
come up with a CNF formula ψ that is:
1. Satisfiable if and only if φ1 ∨ φ2 is satisfiable
2. Shorter than |φ1|+|φ2|: sufficiently short to apply recursively, (1-ε)(|φ1|+|φ2|)
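To see why the length requirement is the hard part, consider the textbook way of taking the OR of two CNF formulae with a fresh selector variable. The sketch below (formula names `phi1`, `phi2`, `psi` and all helper names are assumptions of this example) meets requirement 1 but not requirement 2, since the output is longer than the two inputs together. It is also witness retrievable: a satisfying assignment for either input extends directly to one for the output.

```python
from itertools import product


def satisfiable(cnf, num_vars):
    """Brute-force SAT for tiny formulae. Literals are +v or -v with v in 1..num_vars.

    Returns a satisfying assignment as a tuple of booleans, or None.
    """
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause) for clause in cnf):
            return bits
    return None


def naive_or(phi1, phi2, num_vars):
    """CNF for (phi1 OR phi2) using a fresh selector variable s.

    Setting s = False forces all clauses of phi1; s = True forces all of phi2.
    """
    s = num_vars + 1
    psi = [clause + [s] for clause in phi1] + [clause + [-s] for clause in phi2]
    return psi, s


def size(cnf):
    """Length of a formula, counted in literal occurrences."""
    return sum(len(clause) for clause in cnf)
```

For example, with `phi1 = [[1], [-1]]` (unsatisfiable) and `phi2 = [[1, 2], [-2]]`, the output of `naive_or` is satisfiable and has size 9, against a combined input size of 5.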
If yes:
• There is a construction of Collision Resistant Hash functions from any one-way function
– No “black box” construction of CRH from OWF [Simon98]
– Construction uses the code of the one-way function
• Efficient everlasting encryption in the hybrid bounded storage model
• Forward-Secure-Storage [Dziembowski]
If no: there is hope for:
• Derandomization of Sampling [Dubrov-Ishai]
No Witness Retrievable Compression
• Given CNF formulae φ1 and φ2 on same variables
come up with a formula ψ that is:
1. Satisfiable if and only if φ1 ∨ φ2 is satisfiable
2. Shorter than |φ1|+|φ2|
Claim: if one-way functions exist, then a witness (satisfying assignment) for either φ1 or φ2 cannot yield a witness for ψ efficiently.
Most natural ideas are witness retrievable
Proof intuition based on broadcast encryption lower bounds
[Cartoon, after Garey and Johnson, 1979: “I can’t find an algorithm for the problem”, with possible ways out: “Find an algorithm that usually works?”, “Solve it approximately?”, “Maybe I can solve it in sub-exponential time?”, “Solve it for some fixed parameters?”, “Could we just postpone it?”]
Approaches for dealing with NP-complete problems:
• Approximation algorithms
• Sub-exponential time algorithms
• Parameterized complexity
• Average case complexity
• Save it for the future
Verdict on LDCs?
Uncompressed paper on compressibility:
www.wisdom.weizmann.ac.il/~naor/PAPERS/compressibility.html
Compressed version FOCS 2006
THE END
Thank You
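Slides for the Proof of OMC
Consecutive Messages Protocols
[Figure: as in the simultaneous messages model, Alice holds x ∈ {0,1}^n, Bob holds y ∈ {0,1}^n, and Carol must decide x=y? from Alice’s message mA and Bob’s message mB. In addition, Alice publishes a public message mP.]
Theorem (lower bound for CM protocols):
For any equality protocol, as long as |mP| ≤ n/100,
|mA| x |mB| = Ω(n)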
Program for This Talk:
• Define online memory checkers
• Review some past results
• Describe new results
• Proof sketch:
– Define communication complexity model
– Sketch lower bound for a simple case
– Ideas for extending to the general case
The Reduction
Use online memory checker
to construct a consecutive messages equality protocol
Online Checker (Space: s(n), Query: q(n))
→ Reduction →
Equality Protocol (Alice msg: s(n), Bob msg: O(q(n)))
Conclusion: s(n) x q(n) = Ω(n)
(From communication complexity lower bound)
Simplifying Assumption
(With loss of generality)
Assumption: checker chooses indices to read from
public memory independently of secret memory
Checker Operation:
1. Get an index i in the original file
2. Choose which indices to read from the public memory,
and read them.
3. Get the secret memory
4. Retrieve i-th bit or say BUG
The Reduction: Outline
Use online memory checker
Construct “random index” protocol, Bob chooses random index i:
If x = y, then Carol accepts
If xi ≠ yi, then Carol rejects
Use online checker to build this protocol
Use error correcting code
Go from “random index” to equality testing:
Alice, Bob encode inputs and run “random index” protocol
If Alice’s and Bob’s inputs differ at even one index, their encodings
differ at many indices.
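The role of the error correcting code in this step can be written out as a short calculation. The notation is introduced here for illustration and is not from the slides: C maps {0,1}^n to {0,1}^m with relative distance δ, and d_H denotes Hamming distance.

```latex
% Without encoding: inputs that differ in a single bit are caught
% by a random index only rarely.
\Pr_{i \in [n]}\bigl[x_i \neq y_i\bigr] \;=\; \frac{d_H(x,y)}{n} \;\ge\; \frac{1}{n}
\qquad (x \neq y)

% With encoding: any two distinct inputs are caught with constant probability.
\Pr_{i \in [m]}\bigl[C(x)_i \neq C(y)_i\bigr] \;=\; \frac{d_H\bigl(C(x),C(y)\bigr)}{m} \;\ge\; \delta
\qquad (x \neq y)
```

So running the “random index” protocol on the encodings detects x ≠ y with probability at least δ (up to the protocol's own error), and a constant number of independent repetitions brings the error below any fixed constant while changing the message lengths only by constant factors.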
[Figure: the reduction. Alice, holding x ∈ {0,1}^n, runs store(x) on the checker, producing public memory P(x) and secret memory S(x), and sends S(x) (s(n) bits) to Carol. Bob, holding y ∈ {0,1}^n, runs store(y) to get public memory P(y), gets a random index i, runs retrieve(i) on P(y), and sends Carol i, y_i and the bits read (q(n)+1 bits). Carol runs the checker with secret memory S(x) to obtain C_i = x_i / BUG and accepts if y_i = C_i. If x = y: accept. If x_i ≠ y_i: reject.]
WANT: An adversary that can find bad x, y for the protocol can be used to find bad x, P(y), i for the memory checker.
PROBLEM: Protocol adversary sees the randomness!
SOLUTION: Re-Randomize!
Alice re-computes S(x) with different randomness.
New S(x) is independent of the public randomness (given P(x)).
Requires exponential time.
Conclusion [Weak Theorem]:
For “restricted” online memory checkers, s(n) x q(n) = Ω(n)
Recall Simplifying Assumption
Assumption: checker chooses indices to read from
public memory independently of secret memory
Do we really need the assumption?
Idea: If checker uses secret memory to choose
indices, Adversary learns something about the
secret memory from indices the checker reads.
Access Pattern Distribution
For a retrieve request
Access Pattern:
Bits of public memory accessed by checker
Access Pattern Distribution:
Distribution of the checker’s access pattern
(given its secret memory)
Randomness: over checker’s coin tosses
Where Do We Go From Here?
Observation:
If adversary doesn’t know the access pattern
distribution, then the checker is “home free”.
Lesson for adversary:
Activate checker many times, “learn” its access
pattern distribution!
[NR05]: Learning to Impersonate.
Learning The Access Pattern Distribution
Theorem (Corollary from [NR05])
Learning algorithm for adversary:
– Adversary stores x, secret memory s
– Adversary makes O(s(n)) retrieves,
p: Final public memory (after the stores and retrieves)
– Adversary learns L, can generate distribution DL(p).
– “Real” distribution is DS(p)
Guarantee: With high probability, the distributions
DL(p) and DS(p) are ε-close.
L is of size O(q(n) x s(n)) bits.
Guarantee is only for the public memory p reached by checker!
[Figure: the protocol with a learner. Alice, holding x ∈ {0,1}^n, runs store(x) and retrieves on the checker (public memory P(x), secret memory S(x)), runs the Learner with public coins to obtain L, publishes the learned L (O(s(n) x q(n)) bits) and sends S(x) (s(n) bits) to Carol. Bob, holding y ∈ {0,1}^n, runs store(y) and retrieves, runs the Learner with the same coins, gets a random index i, runs retrieve(i) on public memory P(y) using the learned L, and sends Carol i, y_i and the bits read (q(n)+1 bits). Carol computes C_i = x_i / BUG and accepts if y_i = C_i. If x = y: accept. If x_i ≠ y_i: reject.]
Completeness: An adversary that finds x s.t. Carol rejects when Alice AND Bob’s inputs are x, also fools the memory checker.
Soundness: An adversary that finds x≠y s.t. Carol doesn’t reject, also fools the memory checker.
Does this work???
PROBLEM: Access pattern distributions by “real” S and “learned” L are close on the original P(x)! They may be very far on P(y)!
Protocol adversary sees L, checker adversary learns it!
Does it Work?
• Will the protocol work when y≠x?
• No! Big problem for the adversary:
Can learn the access pattern distribution on correct and
unmodified public memory…
but really wants the distribution on different, modified memory!
• Learned information L may be:
– Good on unmodified memory (DL(P(x)), DS(P(x)) close)
– Bad on modified memory (DL(P(y)), DS(P(y)) far)
• Can’t hope to learn distribution on modified public memory
Bait and Switch
Carol knows S and L, if only she could check whether
DL(P(y)), DS(P(y)) are ε-close…
If far:
P(y)≠P(x) (not “real” public memory)! Reject!
If close:
OK for Bob to use L for access pattern!
Bob always uses L to determine access pattern.
This is a “weakening” of the checker.
Bait and Switch:
Carol Approximates the Distance
Main Observation:
Carol (computationally unbounded) can compute
probabilities of any access pattern for which all the bits
read from P(y) are known.
(Probabilities by both DL(P(y)) and DS(P(y)))
Solution:
Sample O(1) access patterns by DL(P(y)), use them to
approximate distance between the distributions.
In the protocol Bob sends these samples to Carol,
she approximates the distance.
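One way such a sampling-based distance estimate can work is sketched below. It uses the identity Δ(L,S) = E over a drawn from L of max(0, 1 − S(a)/L(a)) for the statistical distance Δ, which needs samples from one distribution and the ability to compute probabilities under both. Whether the actual proof uses this exact estimator is not stated on the slides; the dictionaries `d_l` and `d_s` (access pattern to probability) and the function names are assumptions of this example.

```python
import random


def exact_distance(d_l, d_s):
    """Statistical (total variation) distance between two finite distributions."""
    support = set(d_l) | set(d_s)
    return 0.5 * sum(abs(d_l.get(a, 0.0) - d_s.get(a, 0.0)) for a in support)


def estimate_distance(d_l, d_s, samples, rng):
    """Estimate the distance from `samples` access patterns drawn from d_l.

    Each sample a contributes max(0, 1 - d_s(a) / d_l(a)), a value in [0, 1]
    whose expectation under d_l equals the statistical distance.
    """
    patterns = [a for a in d_l if d_l[a] > 0]
    weights = [d_l[a] for a in patterns]
    total = 0.0
    for a in rng.choices(patterns, weights=weights, k=samples):
        total += max(0.0, 1.0 - d_s.get(a, 0.0) / d_l[a])
    return total / samples
```

Because every term lies in [0, 1], a constant number of samples already gives a constant additive error with constant confidence, which is all that is needed to tell “ε-close” from “far”.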
Putting It Together
From any memory checker, we get a CM protocol for
equality with:
• Public message:
length O(s(n) x q(n))
• Alice message:
length s(n)
• Bob message:
length O(q(n))
Conclusion: s(n) x q(n) = (n)