CSE 533: The PCP Theorem and Hardness of Approximation (Autumn 2005)

Lecture 13: “Confuse/Match” Games (II)
Nov. 14, 2005
Lecturer: Ryan O’Donnell
Scribe: Ryan O’Donnell and Sridhar Srinivasan

1 Reminders from Lecture 12
Our goal is to show the following theorem:
Theorem 1.1. Let s < 1 be a constant. Suppose G is a “confuse/match”-style game with ω(G) ≤ s. Then if k = poly(1/ε), ω(G^k) < ε.
If G is repeated in parallel poly(1/ε) times, with overwhelming probability (as a function of ε) there will be poly(1/ε) many “confuse rounds” and poly(1/ε) many “match rounds”. These rounds will be randomly ordered. Further, it only helps the provers if we fix the questions in some rounds and tell them everything chosen in those rounds. For all of these reasons, it suffices (and indeed is equivalent) to prove the following theorem:
Theorem 1.2. Let s < 1 be constant. Let C = (1/ε)^39, m = (1/ε)^11, and write C′ = C + m. Given any 2P1R game G (with the projection property), let G′ be the game with C′ parallel rounds, in which C confuse rounds of G and m match rounds of G are played, in a random order. Then ω(G′) ≤ ε.
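It is worth recording the relative magnitudes of these parameters once and for all; the following quick checks (assuming ε is small) are exactly the inequalities invoked in Sections 3–5:

\[
C = \varepsilon^{-39}, \qquad m = \varepsilon^{-11}, \qquad C' = C + m \le 2\varepsilon^{-39},
\]
\[
\frac{1}{\sqrt[3]{C}} = \varepsilon^{13} \ll \frac{\varepsilon}{2}, \qquad
\frac{m}{2}\cdot\frac{1}{C} = \frac{\varepsilon^{28}}{2} \ll 1, \qquad
\frac{4}{m} = 4\varepsilon^{11} \ll \varepsilon^{10}.
\]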
To prove this theorem we will use the following two facts, proved in the last class:
Fact 1.3. Let X be a set and γ be a probability distribution on X. Let f : X^{C′} → {0, 1}, C′ ≥ 1, where we think of X^{C′} as having the product probability distribution γ^{C′}. Let

μ = Pr_{~q∼γ^{C′}}[f(~q) = 1].
Suppose we pick i ∈ [C′] at random and then pick qi ← γ at random. Let

μ̃ = μ̃i,qi = Pr[f | i, qi].

Then,

Pr_{i,qi}[ |μ̃ − μ| ≥ 1/(C′)^{1/3} ] ≤ 1/(C′)^{1/3}.
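Fact 1.3 is easy to sanity-check numerically. Below is a minimal Python sketch (our own toy instance, with illustrative parameters far smaller than Theorem 1.2’s): X = {0, 1} with γ uniform, C′ = 8, and f a random Boolean function on X^{C′}.

    import itertools, random

    # Toy numerical check of Fact 1.3: X = {0,1}, gamma uniform, C' = 8,
    # and f a random Boolean function on X^{C'}.
    random.seed(0)
    Cp = 8
    f = {q: random.random() < 0.3 for q in itertools.product([0, 1], repeat=Cp)}

    mu = sum(f.values()) / 2 ** Cp              # mu = Pr[f(q) = 1], q uniform

    def mu_cond(i, qi):                         # Pr[f(q) = 1 | q_i = qi]
        hits = [f[q] for q in f if q[i] == qi]
        return sum(hits) / len(hits)

    thresh = Cp ** (-1 / 3)                     # 1 / cube-root(C')
    bad = sum(abs(mu_cond(i, qi) - mu) >= thresh
              for i in range(Cp) for qi in (0, 1))
    print(f"mu = {mu:.3f}; fraction of (i, qi) deviating >= {thresh:.3f}: "
          f"{bad / (2 * Cp):.3f}  (Fact 1.3 bound: {thresh:.3f})")

For a random f the conditional means barely move, so the observed fraction of deviating pairs (i, qi) is well under the 1/(C′)^{1/3} bound.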
Fact 1.4. Let R ⊆ [C′] be any nonempty set, and P : Q^{C′} → A^{C′}. Suppose i and qi are chosen randomly as in Fact 1.3; then

0 ≤ E_{i,qi}[PredictabilityR(P | i, qi)] − PredictabilityR(P) ≤ 1/C′.
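Since this lecture never restates the definition of Predictability, we record the form in which it is used in Section 4 below (this is our reading, inferred from those computations; see Lecture 12 for the official definition): it is the collision probability of P’s answer string on R,

\[
\mathrm{Predictability}_R(P \mid \cdot\,) \;=\; \sum_{\vec a \in A^R} \Pr[\vec a \mid \cdot\,]^2,
\]

where the probability is over the questions not fixed by the conditioning.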
Remark 1.5. Note that the above two facts are lemmas about general functions; i.e., they have
nothing to do with 2P1R games.
2 Intuition for the proof
Let P : Q^{C′} → A^{C′} be the strategy of Prover 2. The key component of our overall proof is
codifying the intuition that P2’s strategy is either “mostly serial” or “highly dependent on many
coordinates”. (Note that this key theorem still has nothing to do with games; it is just a theorem
about the structure of prover strategies.) To prove a rigorous statement along these lines takes some
work. Roughly speaking, our key theorem will say something like the following:
High-level version of the key theorem. Suppose we pick a block of coordinates R ⊆ [C′] and the questions for that block, ~qR, randomly. Then either:

• It’s very likely (over the choice of R, ~qR) that P’s answers on the coordinates of R are highly unpredictable (over the choice of the remaining questions). OR,

• [There’s a decent chance that P’s answers on R are decently predictable, BUT. . . ] Conditioned on any plausible set of answers in the coordinates of R, P is forced into a highly serial strategy on the remaining coordinates [C′] \ R.
Once we rigorize this statement, it is not too hard to use it to show that the provers cannot succeed in G′ with probability more than ε.
3 Finding a “good block size”

Fix once and for all the strategy of Prover 2, P2 : Q^{C′} → A^{C′}. (We have switched notation, Q instead of Y.) Our first task is to determine a “good block size” r∗ for the block R discussed in the previous section on intuition. This will be a number between 1 and m/2. To find r∗, we consider the following “thought experiment” regarding P2:
We imagine filling in up to m/2 questions in Q^{C′} at random, one by one. I.e., we pick i1, i2, . . . , im/2 at random from [C′] (all distinct) and also qi1, qi2, . . . , qim/2, independently from γ, P2’s natural distribution on questions (inputs). Given 1 ≤ r ≤ m/2, let Rr denote the (random) set {i1, . . . , ir}, and write ~qRr for the associated questions. We now look at the following Predictabilities:
(1) PredictabilityR1(P2 | R1, ~qR1)
(1′) PredictabilityR1(P2 | R2, ~qR2)
(2) PredictabilityR2(P2 | R2, ~qR2)
(2′) PredictabilityR2(P2 | R3, ~qR3)
(3) PredictabilityR3(P2 | R3, ~qR3)
· · ·
(m/2) PredictabilityRm/2(P2 | Rm/2, ~qRm/2)
I.e., we see how Predictability of P2 varies as we do the following two things: a) Give a new
random question in a random coordinate (this makes Predictability go up); b) Require prediction
on this new coordinate (this makes Predictability go down).
Consider now the list of expectations of the above quantities, (1), (1′), etc., where the expectation is over the choice of Rm/2 and ~qRm/2.
Lemma 3.1. There exists some “special block size” 1 ≤ r∗ < m/2 such that in going from the (r∗) expectation to the (r∗ + 1) expectation, the expected Predictability goes down by at most O(1)/m.
Proof. The expected value of any of the quantities (r) or (r′) is a number in the range [0, 1], since Predictabilities are always in this range. In the r → r′ steps, Predictability goes up. However, by Fact 1.4, it goes up by at most 1/(C′ − r) ≤ 1/C. This is very small, and so the total amount the expected Predictability goes up over all m/2 steps is at most (m/2)(1/C) ≪ 1, using the fact that C ≫ m. Since all the numbers are in the range [0, 1], and the total increase from beginning to end is at most 1, the total decrease, from the r′ → r + 1 steps, must be at most 2. Thus there must be at least one step (r∗)′ → r∗ + 1 where the decrease in the Predictabilities’ expectation is at most 2/(m/2) = 4/m. (And, going from r∗ to r∗ + 1 only makes the decrease less.)
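In the parameters of Theorem 1.2, the arithmetic of this proof reads:

\[
\text{total increase} \;\le\; \frac{m}{2}\cdot\frac{1}{C} \;=\; \frac{\varepsilon^{28}}{2} \;\ll\; 1,
\qquad
\text{total decrease} \;\le\; 1 + \frac{m}{2C} \;\le\; 2,
\]

and averaging the total decrease over the m/2 decrease steps yields some step of size at most 4/m = 4ε^{11}.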
Thus we have identified a “special block size” 1 ≤ r∗ < m/2 with the following property:
Corollary 3.2. Let R ⊆ [C′] be a random set of cardinality r∗ and let ~qR be random questions for P2 on these coordinates. Further, let i be a random coordinate from [C′] \ R and let qi be a random question for this coordinate. Then

E[PredictabilityR(P2 | ~qR)] − E[PredictabilityR∪{i}(P2 | ~qR, qi)] ≤ O(1)/m.
Notation. In the remainder of the proof, we will often use the following notation:

• (R, ~qR) will be a “random block”, meaning R is a randomly chosen subset of [C′] of size r∗, and ~qR is a random set of questions for those coordinates.

• ~a will denote a string of answers in A^R.

• Notation like “Pr[~a | (R, ~qR)]” will mean “the probability, over the choice of questions to P2 outside of R, that P2 will output the string of answers ~a in the positions R, given that it gets the questions ~qR in the coordinates R”.

• In particular, we will also use the notation Pr[~a | (R, ~qR), (i, qi)] and Pr[~a, a | (R, ~qR), (i, qi)], where (i, qi) is a coordinate and a question outside R, and a is a single answer associated with the coordinate i.
4 The key theorem

In this section we use Corollary 3.2 to prove the key theorem, giving a dichotomy of the possibilities for P2’s strategy.
Definition 4.1. Given (R, ~qR), we say answer string ~a ∈ A^R is dead if Pr[~a | (R, ~qR)] ≤ ε. Otherwise it is alive. We further say that the whole block (R, ~qR) is dead if all answer strings ~a ∈ A^R are dead for it.
Recall that ε is our target value for the repeated game. Define also η = ε^3. We now make the following slightly tricky definition:
Definition 4.2. We say that the block (R, ~qR) “(1 − η)-induces serial strategies” if:

• (R, ~qR) is alive.

• For every associated live answer ~a ∈ A^R, there is a serial strategy

S~a : ([C′] \ R) × Q → A

such that with probability ≥ 1 − η over the choice of an additional random question (i, qi), it holds that

Pr[~a, S~a(i, qi) | (R, ~qR), (i, qi)] ≥ (1 − η) Pr[~a | (R, ~qR), (i, qi)].

In other words, for almost all additional questions, the answer P2 gives in the associated coordinate is essentially forced.
The idea for the key theorem is this. From Corollary 3.2 we know that for a typical r∗-block (R, ~qR), there is almost no loss in Predictability between predicting on R and predicting on R plus one more random coordinate. By the definition of Predictability, this can only happen in one of two ways: either Predictability was negligible to begin with (i.e., almost all answer strings are dead); or Predictability was non-negligible, but every answer string on R forces a mostly serial way of answering most other questions.
Theorem 4.3 (Key Theorem). Given the strategy P2, one of the following two cases holds:

• (Case 1) When (R, ~qR) is a random r∗-block, with probability at least 1 − ε, (R, ~qR) is dead.

• (Case 2) [At least an ε fraction of (R, ~qR) are alive, but. . . ] If (R, ~qR) is a random live block, then with probability at least 1 − ε, (R, ~qR) (1 − η)-induces serial strategies.
Proof. The proof essentially just involves unpacking the definitions involved in Corollary 3.2 and the phrase “(1 − η)-induces serial strategies”. The proof is by contradiction. Assuming that neither Case 1 nor Case 2 holds, we get that if you pick the block (R, ~qR) at random, with probability at least ε^2 it is alive and fails to (1 − η)-induce serial strategies. This means that there is some particular live answer string ~a such that, conditioned on P2 answering ~a, at least an η fraction of future (i, qi) pairs are not “(1 − η)-determined”. This is enough to show that in fact there is a slightly substantial loss in expected Predictability in predicting on r∗-sized blocks versus (r∗ + 1)-sized blocks, contradicting Corollary 3.2.
To do the details, we need to overcome a slight technicality: answer strings that are alive (i.e., have probability at least ε) for (R, ~qR) might become significantly less probable once the extra question (i, qi) is picked. However, by Fact 1.3, this is extremely unlikely. Leaving this detail for later, what we have is the following:

Proposition 4.4. Assuming the statement of the theorem does not hold, let (R, ~qR) be a random r∗-block and let (i, qi) be an additional random question. Then with probability at least ηε^2/2, there exists some answer string ~a ∈ A^R such that:

1. For all a ∈ A, Pr[~a, a | (R, ~qR), (i, qi)] ≤ (1 − η) Pr[~a | (R, ~qR), (i, qi)].

2. Pr[~a | (R, ~qR), (i, qi)] ≥ ε/2.
We show that this proposition leads to a loss of η^2ε^4/8 in expected Predictability. From Point 1 above it is easy to conclude that

Σ_{a∈A} Pr[~a, a | (R, ~qR), (i, qi)]^2 ≤ (1 − 2η + 2η^2) Pr[~a | (R, ~qR), (i, qi)]^2,

because, subject to Point 1, the left-hand side is maximized if there is one answer contributing a (1 − η)-fraction of the probability, another contributing an η-fraction, and the rest contributing 0.
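Spelled out (with abbreviations p and pa that are ours): write p = Pr[~a | (R, ~qR), (i, qi)] and pa = Pr[~a, a | (R, ~qR), (i, qi)], so that Σa pa = p and, by Point 1, each pa ≤ (1 − η)p. A sum of squares with fixed total is maximized by making its terms as extreme as the constraints allow, so

\[
\sum_{a} p_a^2 \;\le\; \big((1-\eta)p\big)^2 + (\eta p)^2 \;=\; (1 - 2\eta + 2\eta^2)\,p^2.
\]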
Using 2η − 2η^2 ≥ η and Pr[~a | (R, ~qR), (i, qi)] ≥ ε/2, we get

Pr[~a | (R, ~qR), (i, qi)]^2 − Σ_{a∈A} Pr[~a, a | (R, ~qR), (i, qi)]^2 ≥ η(ε/2)^2.

From the definition of Predictability, we conclude that whenever the events of the Proposition occur, at least η(ε/2)^2 is contributed to

E[PredictabilityR(P2 | ~qR)] − E[PredictabilityR∪{i}(P2 | ~qR, qi)].
Thus the total loss is at least (ηε^2/2) · η(ε/2)^2 = ε^10/8 ≫ O(1)/m, contradicting Corollary 3.2. To complete the proof we now only need to use Fact 1.3 to overcome the technicality mentioned earlier.
Fix any (R, ~qR) and any live answer string ~a ∈ A^R, so Pr[~a | (R, ~qR)] ≥ ε. Since 1/C^{1/3} ≪ ε/2, Fact 1.3 tells us that Pr[~a | (R, ~qR), (i, qi)] ≥ ε/2 except with probability at most 1/C^{1/3} over the choice of (i, qi). (We are using C ≤ C′ − r∗ here.) Union-bounding over all live ~a (of which there are at most 1/ε), we get that for a random choice of (R, ~qR) and (i, qi), except with probability (1/ε) · (1/C^{1/3}) = ε^{12}, every answer string ~a that is alive for (R, ~qR) still satisfies Pr[~a | (R, ~qR), (i, qi)] ≥ ε/2. Since ε^{12} ≪ ηε^2/2, we are done.
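In terms of ε (unwinding C = (1/ε)^39 and η = ε^3), the quantities in this union bound are:

\[
\frac{1}{\sqrt[3]{C}} = \varepsilon^{13}, \qquad
\frac{1}{\varepsilon}\cdot\varepsilon^{13} = \varepsilon^{12}, \qquad
\frac{\eta\varepsilon^2}{2} = \frac{\varepsilon^{5}}{2} \gg \varepsilon^{12}.
\]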
5 Bounding the success probability of the provers
We now turn to dealing with 2P1R games; specifically, Theorem 1.2. Fix now also Prover 1’s strategy, P1 : X^{C′} → A^{C′}.

In the theorem, the way the questions are picked is that a random m of our C′ coordinates are “match rounds” and the rest are “confuse rounds”. We will equivalently imagine the round types and questions to be picked as follows:
Step 1: Pick R ⊆ [C′] of size r∗ at random. These will be match rounds. Pick also both provers’ questions for these rounds.

Step 2: Pick m − r∗ more random coordinates to be match rounds, and also pick the provers’ questions for these rounds. (Note there are at least m/2 such rounds, as r∗ ≤ m/2.)

Step 3: All other coordinates are “confuse rounds”; pick the provers’ questions for these rounds randomly. Note that the two provers get independent questions in these rounds, since they are confuse rounds.
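As a concrete illustration of Steps 1–3, here is a minimal Python sketch of this equivalent sampling process (our own illustration; draw_match_pair and draw_single are placeholder samplers: a match round draws a correlated question pair, while a confuse round draws each prover’s question independently):

    import random

    def sample_rounds(C_prime, m, r_star, draw_match_pair, draw_single):
        coords = list(range(C_prime))
        random.shuffle(coords)
        R = coords[:r_star]          # Step 1: the special block of match rounds
        extra = coords[r_star:m]     # Step 2: the other m - r_star match rounds
        x, y = {}, {}                # questions for Prover 1 and Prover 2
        for i in R + extra:          # match rounds: a correlated question pair
            x[i], y[i] = draw_match_pair()
        for i in coords[m:]:         # Step 3: confuse rounds -- the provers'
            x[i] = draw_single()     # questions are drawn independently
            y[i] = draw_single()
        return set(R), set(extra), x, y

    # E.g.: 48 rounds total, 8 match rounds, of which |R| = 3.
    R, extra, x, y = sample_rounds(48, 8, 3,
                                   lambda: (0, 0),
                                   lambda: random.randrange(10))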
The proof that the provers succeed with probability at most ε (actually, we will just prove O(ε)) now divides into two cases, the cases from Key Theorem 4.3.
Case 1: In this case, after Step 1 is complete, the question block (R, ~qR) for P2 is dead with probability at least 1 − ε. We give up the ε here to the provers’ success probability. So assuming it’s dead, no answer string ~a ∈ A^R has more than ε probability of ultimately being given by P2, over the choice of its remaining questions.
We would like to argue that this is still true after Step 2, when P2 has gotten its remaining match round questions. This follows pretty easily from Fact 1.3. We can’t quite union-bound over all answer strings (there may be too many), but we can do the following: group all possible strings in A^R into “clusters”, where the clusters have total probability between ε/2 and ε. There are at most 2/ε such clusters. By Fact 1.3, for each additional random match round question, the probability that a cluster gets more than 1/C^{1/3} more probable is at most 1/C^{1/3}. Union-bounding over all remaining match round questions (at most m) and all clusters, we get that even after Step 2, except with probability at most m · (2/ε) · (1/C^{1/3}) = 2ε, all clusters — and therefore all answer strings in A^R — have probability at most ε + m/C^{1/3} ≤ 2ε.
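Both displayed bounds are parameter arithmetic, with m = (1/ε)^11 and C^{1/3} = (1/ε)^13:

\[
m \cdot \frac{2}{\varepsilon} \cdot \frac{1}{\sqrt[3]{C}}
= \varepsilon^{-11} \cdot 2\varepsilon^{-1} \cdot \varepsilon^{13} = 2\varepsilon,
\qquad
\varepsilon + \frac{m}{\sqrt[3]{C}} = \varepsilon + \varepsilon^{2} \le 2\varepsilon.
\]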
Again, we give up the 2ε chance of answer strings becoming significantly more likely to the provers’ success probability. So assuming this doesn’t happen, we are through Steps 1 and 2 and we know that for every answer string ~a ∈ A^R in P2’s R coordinates, the probability P2 will answer with that string — over the choice of its remaining confuse round questions — is at most 2ε. But now in Step 3, the provers’ questions are chosen independently. So first imagine filling in Prover 1’s questions. Then this prover has all of its questions, and so its complete answer string is decided. In particular, its answer string on the coordinates R — call it ~b — is decided. Since these coordinates are match coordinates, and since G has the projection property, this means that there is a unique string ~a that Prover 2 must answer on these R coordinates (namely π(~b)) in order for the provers to win. But we have already argued that the probability P2 will answer this string, over the choice of its remaining confuse round questions, is at most 2ε.
We have therefore shown that the overall success probability of the provers in Case 1 is at most ε + 2ε + 2ε = O(ε).
Case 2: In this case, let us first do Step 1, producing (R, ~qR) (as well as Prover 1’s questions on R). If (R, ~qR) is dead then the analysis from Case 1 shows that the provers succeed with probability at most 4ε. Giving up this probability, we can condition on (R, ~qR) being alive. Since we are in Case 2 of Theorem 4.3, we have that except with probability ε, (R, ~qR) (1 − η)-induces serial strategies. We assume this is the case, giving up the remaining ε probability.
We now proceed to analyze the provers’ success probability as follows:

Pr[success] = Σ_{~a∈A^R} Pr[success AND P2 eventually answers ~a on R].
We break up the sum above into the cases when ~a is alive and when it is dead. For the dead ~a’s, we can again use the analysis from Case 1 to show that in total they contribute at most 4ε to the overall sum. Giving up this 4ε probability, we proceed to analyze the contribution from live ~a’s, each of which we know (1 − η)-induces a serial strategy S~a. We will show that for each live ~a, the probability that the provers succeed AND P2 eventually answers ~a is at most O(ε^2). Thus we conclude that the overall success probability is at most 4ε + ε + 4ε + (1/ε) · O(ε^2) ≤ O(ε), using the fact that there are at most 1/ε live answers. This will complete the proof.
So let us fix a live answer ~a and its associated serial strategy S := S~a. By definition of inducing serial strategies, we know that, conditioned on P2 answering with ~a, the probability that it answers a random additional question (i, qi) in accordance with S is at least 1 − 2η. Hence the expected fraction of coordinates answered in accordance with S is at least 1 − 2η. Indeed, this is true just of the coordinates in the remaining match rounds. Using Markov’s inequality, we conclude:

Fact 5.1. Conditioned on P2 ultimately answering ~a in the coordinates R, except with probability ε^2, P2 will answer at least a 1 − 2η/ε^2 = 1 − 2ε fraction of the coordinates in the Step 2 rounds according to the serial strategy S.
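The Markov step, spelled out (D is our shorthand): let D be the fraction of Step 2 coordinates on which P2 disagrees with S, so that E[D] ≤ 2η by the previous paragraph. Then

\[
\Pr\big[D \ge 2\eta/\varepsilon^2\big] \;\le\; \frac{\mathbb{E}[D]}{2\eta/\varepsilon^2} \;\le\; \varepsilon^2,
\qquad
\frac{2\eta}{\varepsilon^2} = \frac{2\varepsilon^3}{\varepsilon^2} = 2\varepsilon.
\]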
Thus

Pr[success AND P2 eventually answers ~a]
  ≤ ε^2 + Pr[success AND P2 eventually answers ~a
       AND P2 answers a ≥ 1 − 2ε fraction of remaining match coordinates according to S].
But as soon as we know P2 plays a large fraction of coordinates according to a serial strategy, we can upper-bound its success probability. Since succeeding on all coordinates while using a serial strategy on many coordinates is even harder than succeeding on many coordinates while using a serial strategy on all coordinates, the probability on the right above is at most

Pr[P2 matches on a ≥ 1 − 2ε fraction of the Step 2 match coordinates
   | P2 answers all Step 2 coordinates according to S].   (1)

But S is a serial strategy, and even with all questions in the Step 1 match rounds fixed, P2 can match in any given Step 2 match round with probability at most s < 1. Thus it is expected to match in at most an s fraction of these coordinates — of which there are at least m/2. By a Chernoff bound, we easily get that (1) is at most exp(−Ω(ε(m/2))) ≪ O(ε^2). This completes the proof.
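To see the final comparison (our unwinding, treating s < 1 as a fixed constant and taking ε small enough that 1 − 2ε > s): the expected number of matches among the ≥ m/2 Step 2 rounds is at most s · (m/2), so matching on a 1 − 2ε fraction requires exceeding this mean by Ω(m). Since m = (1/ε)^11,

\[
\exp\big(-\Omega(\varepsilon\, m/2)\big) \;=\; \exp\big(-\Omega(\varepsilon^{-10})\big) \;\ll\; \varepsilon^2.
\]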