Electronic Colloquium on Computational Complexity, Report No. 190 (2010)

Improved Constructions of Three Source Extractors
Xin Li ∗
University of Texas at Austin
lixints@cs.utexas.edu
Abstract
We study the problem of constructing extractors for independent weak random sources. The
probabilistic method shows that there exists an extractor for two independent weak random
sources on n bits with only logarithmic min-entropy. However, previously the best known explicit
two source extractor only achieves min-entropy 0.499n [Bou05], and the best known three source extractor only achieves min-entropy n^{0.9} [Rao06]. It is a long standing open problem to construct extractors that work for smaller min-entropy.
In this paper we construct an extractor for three independent weak random sources on n bits with min-entropy n^{1/2+δ}, for any constant 0 < δ < 1/2. This improves the previous best result
of [Rao06]. In addition, we consider the problem of constructing extractors for three independent
weak sources, such that one of the sources is significantly shorter than the min-entropy of the
other two, in the spirit of [RZ08]. We give an extractor that works in the case where the longer,
n-bit sources only have min-entropy polylog(n), and the shorter source also has min-entropy
polylog(n). This improves the result of [RZ08].
We also study the problem of constructing extractors for affine sources over GF(2). Previously the best known deterministic extractor for n-bit affine sources in this case achieves entropy n/√(log log n) [Yeh10, Li10]. In this paper we construct an extractor for two independent affine sources with entropy n^{1/2+δ}, for any constant 0 < δ < 1/2.
Our constructions mainly use the extractors for somewhere random sources in [Rao06, Rao09]
and the lossless condenser in [GUV07].
∗ Supported in part by NSF Grant CCF-0634811, NSF Grant CCF-0916160 and THECB ARP Grant 003658-01132007.
1 Introduction
Randomness has played an important role in computer science, with applications in many areas
such as algorithm design, distributed computing and cryptography. These applications generally
either provide solutions that are much more efficient than their deterministic counterparts (e.g., in
algorithm design), or solve problems that would otherwise be impossible in the deterministic case
(e.g., in distributed computing and cryptography).
However, to ensure their performance these applications need to rely on the assumption that
they have access to uniformly random bits, while in practice it is not clear how to obtain such high
quality sources of randomness. Instead, it may be the case that only imperfect random sources
with highly biased random bits are available. Therefore an important problem is to study how to
use imperfect random sources in these applications.
The study of this problem has led to the introduction of randomness extractors [NZ96]. Informally, a randomness extractor is a function Ext : {0, 1}^n → {0, 1}^m such that given any imperfect
random source as the input, the output is statistically close to uniform. If such functions can be
computed efficiently then they can be used to convert imperfect random sources into nearly uniform
random bits, and the problem of using imperfect randomness in randomized applications can be
solved. Since the introduction of extractors, a lot of research work has been conducted on this
object, and extractors have been found to have applications in many other problems in computer science.
We refer the reader to [FS02] for a survey on this subject.
In the context of extractors, an imperfect random source is modeled by an arbitrary probability
distribution with a certain amount of entropy. The entropy used here is the min-entropy. A
probability distribution (or equivalently, a random variable) on n bits is said to have min-entropy
k if the probability of getting any particular string is at most 2^{−k}, and the distribution is called an
(n, k)-weak random source. Unfortunately, it is not hard to show that no deterministic extractor
exists even for (n, n − 1) weak random sources. Given this negative result, the study of extractors
has been pursued in two different directions.
One direction is to give the function Ext an additional independent small seed of uniform random
bits. Such extractors are thus called seeded extractors. Seeded extractors provide an optimal solution
to the problem of simulating randomized algorithms using weak random sources, and a long line
of research has resulted in seeded extractors with almost optimal parameters [LRVW03, GUV07,
DW08, DKSS09].
Another direction is to study extractors for special classes of sources. These include for example
samplable sources [TV00], bit-fixing sources [KZ07, GRS04], affine sources [GR05, Bou07], independent sources [BIW04, BKS+ 05, Raz05, Rao06, BRSW06] and small space sources [KRVZ06].
The results of this paper fall into the second category. Specifically, in this paper we study the
problem of constructing extractors for independent sources and affine sources.
1.1 Extractors for Independent Sources
Using the probabilistic method, it is not hard to show that there exists an extractor for just
two independent weak random sources, with only logarithmic min-entropy. However, despite considerable efforts on this problem [BIW04, BKS+ 05, Raz05, Bou05, Rao06, BRSW06], the known
constructions are far from achieving these parameters. Currently the best explicit extractor for two
independent (n, k) sources only achieves min-entropy k = 0.499n [Bou05], and the best explicit extractor for independent (n, n^α) sources requires O(1/α) sources [Rao06, BRSW06]. Given this
embarrassing situation, in this paper we ask a slightly less ambitious question: how well can we
do with three independent sources? That is, we want to construct an explicit extractor for three
independent (n, k) sources, with k as small as possible.
Currently, the best known extractor for three independent sources achieves min-entropy k = n^{0.9}
[Rao06]. In this paper we improve this result. In addition, we consider the problem of building
extractors for three independent sources with uneven lengths as in [RZ08]. There the authors gave
an extractor for three independent sources, where two of them can have any polynomially small
min-entropy and the third can have polylogarithmic entropy, as long as the length of the third
source is much smaller than the min-entropy of the other two. In this paper we also improve this
result in various aspects, and our construction is simpler than that of [RZ08].
1.2 Extractors for Affine Sources
An affine source is the uniform distribution over some affine subspace of a vector space.
Definition 1.1. (affine source) Let F_q be the finite field with q elements. Denote by F_q^n the n-dimensional vector space over F_q. A distribution X over F_q^n is an (n, k)_q affine source if there exist linearly independent vectors a_1, · · · , a_k ∈ F_q^n and another vector b ∈ F_q^n s.t. X is sampled by choosing x_1, · · · , x_k ∈ F_q uniformly and independently and computing

X = \sum_{i=1}^{k} x_i a_i + b.
In the case of affine sources, the min-entropy coincides with the standard Shannon entropy, and
we will just call it entropy.
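To make the sampling in Definition 1.1 concrete, here is a small illustrative sketch (ours, not part of the paper) over GF(2), using a hypothetical (4, 2)_2 source; addition over GF(2) is bitwise XOR.

```python
import random

def sample_affine_source(basis, b):
    """Sample X = sum_i x_i a_i + b over GF(2), with the x_i uniform bits.

    basis: list of k linearly independent n-bit vectors (encoded as ints)
    b:     an n-bit shift vector (encoded as int)
    """
    x = b
    for a in basis:
        if random.getrandbits(1):  # x_i chosen uniformly in GF(2)
            x ^= a                 # addition over GF(2) is XOR
    return x

# A hypothetical (4, 2)_2 affine source: basis {1100, 0011}, shift 1010.
# It is uniform over the 2^2 = 4 points {1010, 0110, 1001, 0101}.
print(format(sample_affine_source([0b1100, 0b0011], 0b1010), '04b'))
```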
An affine extractor is a deterministic function such that given any affine source as the input,
the output of the function is statistically close to the uniform distribution.
Definition 1.2. (affine extractor) A function AExt : F_q^n → {0, 1}^m is a deterministic (k, ε)-affine extractor if for every (n, k)_q affine source X,

|AExt(X) − U_m| ≤ ε.

Here U_m is the uniform distribution over {0, 1}^m and | · | stands for the statistical distance.
In this paper we focus on the case where q = 2 and we will just write (n, k) affine sources instead of (n, k)_2 affine sources. Using the probabilistic method, it is not hard to show that there exists a deterministic affine extractor, as long as k > 2 log n and m < k − O(1). However, again the known constructions are far from achieving these parameters. Currently the best explicit extractor for an (n, k) affine source only achieves entropy k = n/√(log log n) [Yeh10, Li10]. Thus in this paper we
also ask a slightly less ambitious question: how well can we do with two independent affine sources?
That is, we want to construct an explicit extractor for two independent (n, k) affine sources, with
k as small as possible. To the best of our knowledge there has been no result on this problem before,
although a result in [Li10] implies such an extractor with k = δn for any constant 0 < δ < 1. In
this paper we also improve this result.
1.3 Our Results
For the problem of constructing three source extractors, we achieve min-entropy k = n^{1/2+δ}, for any constant 0 < δ < 1/2. This improves the previous best result of [Rao06], where the min-entropy is required to be at least n^{0.9}. Specifically, we have the following theorem.
Theorem 1.3. For every constant 0 < δ < 1/2, there exists a polynomial time computable function THExt : ({0, 1}^n)^3 → {0, 1}^m such that if X, Y, Z are three independent (n, k) sources with k = n^{1/2+δ}, then

|THExt(X, Y, Z) − U_m| < n^{−Ω(δ)}

with m = Ω(k).
The following table summarizes recent results on extractors for independent sources.

Number of Sources | Min-Entropy | Output | Error | Ref
O(poly(1/δ)) | δn | Θ(n) | 2^{−Ω(n)} | [BIW04]
3 | δn, any constant δ | Θ(1) | O(1) | [BKS+05]
3 | One source: δn, any constant δ. Other sources may have k ≥ polylog(n). | Θ(1) | O(1) | [Raz05]
2 | One source: (1/2 + δ)n, any constant δ. Other source may have k ≥ polylog(n). | Θ(k) | 2^{−Ω(k)} | [Raz05]
2 | (1/2 − α0)n for some small universal constant α0 > 0 | Θ(n) | 2^{−Ω(n)} | [Bou05]
O(1/δ) | k = n^δ | Θ(k) | k^{−Ω(1)} | [Rao06]
O(1/δ) | k = n^δ | Θ(k) | 2^{−k^{Ω(1)}} | [BRSW06]
3 | One source: δn, any constant δ. Other sources may have k ≥ polylog(n). | Θ(k) | 2^{−k^{Ω(1)}} | [Rao06]
3 | k = n^{1−α0} for some small universal constant α0 > 0 | Θ(k) | 2^{−k^{Ω(1)}} | [Rao06]
3 | k = n^{1/2+δ}, any constant δ | Θ(k) | k^{−Ω(1)} | This work

Table 1: Summary of Results on Extractors for Independent Sources.
For three independent sources with uneven lengths, we have the following theorem.
Theorem 1.4. For every constant 0 < γ < 1, there exists a polynomial time computable function UExt : {0, 1}^{n1} × {0, 1}^{n2} × {0, 1}^{n3} → {0, 1}^m such that if X is an (n1, k1) source, Y is an (n2, k2) source, Z is an (n3, k3) source and X, Y, Z are independent, then

|UExt(X, Y, Z) − U_m| < 2^{−k1^{Ω(1)}} + 2^{−k2^{Ω(1)}}

with m = Ω(k3), as long as the following hold:
• n1 < k2^γ and k2 ≤ k3.
• k1 > 2 log^2 n1, k1 > 2 log^2 n2 and k1 > 2 log^2 n3.
• k2^{(1−γ)/4} > log^2 n2 and k2^{(1−γ)/4} > log^2 n3.
By setting the parameters appropriately, we get the following two corollaries.
Corollary 1.5. For all constants 0 < β, γ < 1 there is a polynomial time computable function UExt : {0, 1}^{n^{βγ}} × {0, 1}^n × {0, 1}^n → {0, 1}^{Ω(n^β)} which is an extractor for three independent sources with min-entropy requirement k = log^3 n, n^β, n^β, and error 2^{−log^{Ω(1)} n}.
This result improves that of [RZ08] in the sense that the length of the short source can be n^{βγ} for any constant 0 < γ < 1. In particular, γ can be arbitrarily close to 1, while in [RZ08] it has to be 1/h for some constant h > 1.
Corollary 1.6. For every constant c > 3 there is a constant d > c and a polynomial time computable function UExt : {0, 1}^{log^c n} × {0, 1}^n × {0, 1}^n → {0, 1}^{Ω(log^d n)} which is an extractor for three independent sources with min-entropy requirement k = log^3 n, log^d n, log^d n, and error 2^{−log^{Ω(1)} n}.
This result gives an extractor for three independent sources, such that two of them have only polylogarithmic min-entropy, and the third one has min-entropy polynomially small in its length, while its length is only polylogarithmic in that of the other two. It appears that the result in [RZ08] cannot handle this situation.
For two independent affine sources, we have the following theorem.
Theorem 1.7. For every constant 0 < δ < 1/2, there exists a polynomial time computable function TAExt : {0, 1}^n × {0, 1}^n → {0, 1}^m such that if X, Y are two independent affine (n, k) sources with k = n^{1/2+δ}, then

|TAExt(X, Y) − U_m| < n^{−Ω(δ)}

with m = Ω(k).
Remark 1.8.
• In this paper we do not optimize the length of the output, but it is not hard to show that in all our constructions the output length can be made k − o(k), by using a seeded extractor that extracts almost all randomness from a weak source in our constructions.
• In fact, our extractor for two independent affine sources works for one affine block source with two blocks of length n, as long as the first block has entropy n^{1/2+δ}, and the second has entropy n^{1/2+δ} even conditioned on the fixing of the first one.
1.4 Overview of the Constructions and Techniques
Here we give a brief description of our constructions and the techniques used. For clarity and simplicity we shall sometimes be imprecise.
1.4.1 The three source extractor
Our construction of the three source extractor mainly uses somewhere random sources and the
extractor for such sources in [Rao06]. Informally, a somewhere random source is a matrix of
random variables such that at least one of the rows is uniform. If we have two independent
somewhere random sources X = X1 ◦ · · · ◦ Xt and Y = Y1 ◦ · · · ◦ Yt with the same number of rows
t, and there exists an i such that both Xi and Yi are uniform, then we call X and Y independent
aligned somewhere random sources. In [Rao06] it is shown that if we have two independent aligned
somewhere random sources X, Y with t rows and each row has n bits, such that t < nγ for some
arbitrary constant 0 < γ < 1, then we can efficiently extract random bits from X and Y .
Now given three independent (n, k) sources X, Y, Z with k = n^{1/2+δ}, our construction uses X
to convert Y and Z into two somewhere random sources, such that with high probability over
the fixing of X (and some additional random variables), they are independent aligned somewhere
random sources, and the number of rows is significantly smaller than the length of each row. Then
we will be done by using the extractor in [Rao06] described above.
To illustrate how we do this, first assume that we have a strong seeded extractor that only uses
log n additional random bits and can extract almost all the entropy from an (n, k)-source with error
1/100. A strong seeded extractor is a seeded extractor such that with high probability over the
fixing of the seed, the output is still close to uniform. We now try this extractor on X, Y and Z with
all 2^{log n} = n possibilities of the seed and output n^{δ/2} bits. Thus we obtain three n × n^{δ/2} matrices. Now we divide each matrix into √n blocks, with each block consisting of √n rows. Therefore we get X^1 ◦ · · · ◦ X^t, Y^1 ◦ · · · ◦ Y^t and Z^1 ◦ · · · ◦ Z^t, where t = √n and each X^i, Y^i, Z^i is a block. By
a standard property of strong seeded extractors, with probability 1 − 1/10 = 9/10 over the fixing
of the seed, the output is 1/10-close to uniform. Therefore in each matrix, at least 9/10 fraction of
the rows are close to uniform. We say a block is “good” if it contains at least one such row. Thus
in each matrix at least 9/10 fraction of the blocks are good.
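As a data-flow illustration only (the extractor `ext` below is a placeholder argument, not one of the components used in the paper), the matrix-and-blocks step just described looks as follows; n is assumed to be a perfect square.

```python
import math

def seed_matrix_blocks(source_bits, ext, n):
    """Try all 2^{log n} = n seeds of a strong seeded extractor on one source,
    then split the resulting n-row matrix into sqrt(n) blocks of sqrt(n) rows."""
    rows = [ext(source_bits, seed) for seed in range(n)]
    t = math.isqrt(n)
    return [rows[i * t:(i + 1) * t] for i in range(t)]
```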
Now it’s easy to see that there exists an i such that all X i , Y i and Z i are good. In other
words, in some sense the matrices are already “aligned”. Next, for each i we compute an output
Ri from (X i , Y i , X, Y ) and an output Ri′ from (X i , Z i , X, Z), with the property that if X i , Y i are
good, then Ri is (close to) uniform, and if X i , Z i are good, then Ri′ is (close to) uniform. We then
concatenate {Ri } to form a matrix SRy and concatenate {Ri′ } to form a matrix SRz . Since there
exists an i such that all X i , Y i and Z i are good, SRy and SRz are (close to) aligned somewhere
random sources.
In the analysis below we consider a particular i such that all X i , Y i and Z i are good (though
we may not know what i is).
Now let’s consider computing Ri from (X i , Y i , X, Y ) (Ri′ is computed the same way from
i
(X , Z i , X, Z)). Here we use a two-source extractor Raz in [Raz05]. This extractor is strong and
it works as long as one of the source has min-entropy rate (the ratio between min-entropy and the
length of the source) > 1/2, and even if the two sources have different lengths. We first apply
Raz to Y i and each row of X i (note Y i is treated as a whole) to obtain a matrix M . Note that
if X i , Y i are good then they are both somewhere random, and thus Y i has min-entropy at least
nδ/2 . Thus M is also a somewhere random source. Since Raz is a strong two-source extractor,
we can fix X i , and conditioned on this fixing M is still a somewhere random source. Moreover
now M is a deterministic function of Y i and is thus independent of X. Next note that the size
√
of X i is n · nδ/2 = n1/2+δ/2 while the min-entropy of X is n1/2+δ . Thus with high probability
over the fixings of X i , X still has min-entropy at least 0.9n1/2+δ . Therefore now we can apply a
strong seeded extractor to X and each row of M and output 0.8n1/2+δ bits. Thus we obtain a
√
( n × 0.8n1/2+δ ) somewhere random source X̄ i . Furthermore, since we applied a strong seeded extractor and now M is a deterministic function of Y i , we can further fix Y i and X̄ i is still somewhere
random, meanwhile it is now a deterministic function of X.
Similarly, we can compute a somewhere random source Ȳ i . Specifically, Since Raz is a strong
two-source extractor, we can fix Y i , and conditioned on this fixing M is still a somewhere random
source. Moreover now M is a deterministic function of X i and is thus independent of Y . Next note
√
that the size of Y i is n · nδ/2 = n1/2+δ/2 while the min-entropy of Y is n1/2+δ . Thus with high
probability over the fixings of Y i , Y still has min-entropy at least 0.9n1/2+δ . Therefore now we
5
can apply a strong seeded extractor to Y and each row of M and output 0.8n1/2+δ bits. Thus we
√
obtain a ( n × 0.8n1/2+δ ) somewhere random source Ȳ i . Furthermore, since we applied a strong
seeded extractor and now M is a deterministic function of X i , we can further fix X i and Ȳ i is still
somewhere random, meanwhile it is now a deterministic function of X.
Therefore now after the fixings of (X^i, Y^i), we get two independent (√n × 0.8n^{1/2+δ}) somewhere random sources (X̄^i, Ȳ^i). It is easy to check that they are aligned. Note that the number of rows is significantly less than the length of each row, thus we can apply the extractor in [Rao06] to get a random string Ri with, say, 0.7n^{1/2+δ} bits. Further notice that the extractor in [Rao06] is strong, thus we can fix X and Ri is still (close to) uniform. This means that we can fix X and SRy is still somewhere random (recall that SRy is the concatenation of {Ri}); moreover it is now a deterministic function of Y.
Similarly we can compute SRz, and by the same argument we can fix X and SRz is still somewhere random; moreover it is now a deterministic function of Z. Therefore now after we fix (X, Y^i, Z^i), we get two independent aligned (√n × 0.7n^{1/2+δ}) somewhere random sources, and again the extractor in [Rao06] can be used to obtain an output that is (close to) uniform.
The above argument works even if the seed length of the strong seeded extractor that we use on X, Y, Z (trying all possibilities of the seed) is (1 + α) log n instead of log n, as long as α can be an arbitrarily small constant. However, currently we don't have such extractors for min-entropy k = n^{1/2+δ}. Fortunately, we have condensers with such short seed length. A (seeded) condenser is a generalization of a seeded extractor, such that the output is close to having high min-entropy instead of being uniform. In this paper we use the condenser built in [GUV07]. For any constant α > 0 and any (n, k′) source, this condenser uses d = (1 + 1/α) · (log n + log k′ + log(1/ε)) + O(1) additional random bits to convert the source roughly into a ((1 + α)k′, k′) source with error ε. Now we can take α to be a sufficiently large constant, say 10/δ, take k′ to be small, say n^{δ/10} (note that an (n, n^{1/2+δ}) source is also an (n, n^{δ/10}) source), and take ε to be something like n^{−δ/10}. This gives us a small seed length, such that 2^d = O(n^{1+δ/3}). Therefore the number of blocks is O(n^{1/2+δ/3}), which is significantly less than n^{1/2+δ}.
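Written out, the seed-length arithmetic behind these numbers is the following routine calculation from the parameters just chosen (constants suppressed inside the O(1) term, and using δ < 1/2 in the last inequality):

```latex
d = \Big(1+\tfrac{1}{\alpha}\Big)\Big(\log n + \log k' + \log\tfrac{1}{\epsilon}\Big) + O(1)
  = \Big(1+\tfrac{\delta}{10}\Big)\Big(1 + \tfrac{\delta}{10} + \tfrac{\delta}{10}\Big)\log n + O(1)
  \le \Big(1+\tfrac{\delta}{3}\Big)\log n + O(1),
\qquad \text{so } 2^{d} = O\big(n^{1+\delta/3}\big).
```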
Now we can repeat the argument before. The condenser can also be shown to be strong, in the sense that with probability 1 − 2√ε over the fixing of the seed, the output is √ε-close to having min-entropy k′ − d (intuitively, this is because the seed length is d, thus conditioned on the seed the output can lose at most d bits of entropy). Now define a block to be “good” if it contains at least one “good” row that is √ε-close to having min-entropy k′ − d. Again we can show there is an i such that X^i, Y^i and Z^i are all good.
To finish the argument, we need to apply Raz to Y^i and each row of X^i. However, now the good row in X^i is not uniform; in fact it may not even have min-entropy rate > 1/2. On the other hand, it does have a constant min-entropy rate. Therefore we now first apply the somewhere condenser from [BKS+05, Zuc07] to each row of X^i to boost the min-entropy rate to 0.9. The somewhere condenser outputs another constant number of rows for each row of X^i, and if the row of X^i is good, then one of the outputs is close to having min-entropy rate 0.9. Now we can apply Raz to Y^i and each output of the somewhere condenser, and proceed as before. Since this only increases the number of rows in (X̄^i, Ȳ^i) by a constant factor, it does not affect our analysis. Thus we obtain a three source extractor for min-entropy k = n^{1/2+δ}.
1.4.2 The extractor for three sources with uneven lengths
If we do not try to optimize the parameters here, our construction for this extractor is quite simple.
Assume that we have three independent sources X, Y, Z such that X is an (n1 , k1 ) source, Y is an
(n2 , k2 ) source, Z is an (n3 , k3 ) source and n1 is significantly smaller than k2 and k3 . We can do the
following. First take a strong seeded extractor that uses O(log n1) additional random bits, and try this extractor on X with all possibilities of the seed. Then we get a somewhere random source X̄ with n1^c rows for some constant c > 1. Now we apply a strong seeded extractor to Y and each row of X̄ and output Ω(k2) bits. Thus we get an n1^c × Ω(k2) somewhere random source. We do the same thing to Z and each row of X̄ and we get an n1^c × Ω(k3) somewhere random source. Note that these two somewhere random sources are aligned. Moreover since the extractor is strong, we can fix X and conditioned on this fixing, the two sources are still aligned somewhere random sources, and they are now independent (because they are now deterministic functions of Y and Z respectively). Thus if n1^c < min[k2, k3], we can use the extractor in [Rao06] to extract random bits from these two sources.
There are two small disadvantages with the above construction. First, it requires that n1 < min[k2, k3]^{1/c} for some constant c > 1 instead of n1 < min[k2, k3]^γ for any constant 0 < γ < 1. Second, the error of the construction is only 1/poly(n1). We deal with these problems as follows.
For the first problem, as before, instead of using a seeded extractor on X (trying all possibilities of the seed), we use the seeded condenser in [GUV07] and the somewhere condenser in [BKS+05, Zuc07]. This gives us a short seed whose length is arbitrarily close to log n1, thus we can achieve n1 < min[k2, k3]^γ for any constant 0 < γ < 1. However, now the “good” row in X̄ only has min-entropy rate 0.9 and we cannot apply a seeded extractor to Y and each row of X̄. We can use the strong two source extractor Raz here, but then the length of the output is at most k1, which is smaller than n1. So instead we do the following.
For each row X̄i in X̄, assume the length is m1 . We first take a substring Si of X̄i with length
0.3m1 , and apply Raz to Si and Y to get Vi . We then apply a strong seeded extractor to X̄i and Vi to
get Ri , and apply a strong seeded extractor to Y and Ri to get the final output Hi . We concatenate
{Hi } to get a matrix SRy . Now if X̄i has min-entropy rate 0.9, then Si has min-entropy rate at
least 2/3 and thus the two source extractor Raz works. Since Raz is strong, we can fix Si and Vi is
still (close to) uniform, and it is a deterministic function of Y and thus independent of X̄i . Note
that Si has length 0.3m1, thus with high probability over the fixing of Si, X̄i still has min-entropy at least, say, 0.5m1. Therefore Ri is (close to) uniform. Since the seeded extractor is also strong,
we can further fix Vi and Ri is still uniform, and it is now a deterministic function of X̄i and thus
independent of Y . Note that the length of Vi is at most k1 < n1 and is much smaller than k2 , thus
with high probability over the fixing of Vi , Y still has min-entropy at least 0.9k2 . Therefore now
Hi is (close to) uniform and can output 0.8k2 bits. Moreover since the seeded extractor is strong,
Hi is (close to) uniform even conditioned on the fixing of X. Thus we obtain a somewhere random
source SRy with the number of rows significantly smaller than the length of each row, and it is
somewhere random even conditioned on the fixing of X.
We can now do the same thing to X and Z to obtain a somewhere random source SRz , and
apply the extractor in [Rao06] to (SRy , SRz ) to extract random bits.
For the second problem, we use similar techniques as those in [BRSW06] to show that, when
the extractor is run on three sources with larger min-entropy, the error must become much smaller.
1.4.3 The affine two source extractor
Here we mainly use affine somewhere random sources, the extractor for such sources in [Rao09], and strong linear seeded extractors. An affine somewhere random source is an affine source that is also a somewhere random source. In [Rao09] it is shown that if we have an affine somewhere random source X with t rows, where each row has n bits and t < n^γ for some arbitrary constant 0 < γ < 1, then we can efficiently extract random bits from X. A strong linear seeded extractor is a strong seeded extractor such that for any fixing of the seed, the output is a linear function of the weak source. Such extractors are known to exist, for example Trevisan's extractor [Tre01].
Given two independent (n, k) affine sources X, Y with k = n^{1/2+δ} for any constant 0 < δ < 1/2, our extractor works as follows. We first convert X into a somewhere random source X̄ (not necessarily affine) with √n rows. We then apply a strong linear seeded extractor to Y and each row of X̄ and output Ω(n^{1/2+δ}) bits. Thus we obtain a (√n × Ω(n^{1/2+δ})) somewhere random source H. By the property of the strong linear seeded extractor, we can fix X, and with high probability over this fixing, H is still somewhere random. Moreover now it is affine, since conditioned on the fixing of X, the extractor is a linear function. Thus we can now apply the extractor in [Rao09] to extract random bits from H.
Now let us see how we can convert X into a somewhere random source with √n rows. To do this, we first divide X into √n blocks X^1 ◦ · · · ◦ X^t, where t = √n and each block has √n bits. By Lemma 2.15, the sum of the entropies of these blocks is at least k = n^{1/2+δ}. Therefore at least one block must have entropy at least n^δ. Now for each X^i we do the following. We take the condenser from [GUV07] and try it on X^i with all possibilities of the seed. Next we take a strong seeded extractor that uses O(log n) additional random bits, and for each possible output of the condenser, try the extractor with all possibilities of the seed. We concatenate all these outputs to get a somewhere random source M^i. Note that X^i has length √n, the condenser has seed length d1 = (1 + 1/α) · (log n1 + log k1 + log(1/ε1)) + O(1) and the extractor has seed length d2 = O(log n2). Thus M^i has 2^{d1+d2} rows. Now note that n1 = √n, and n2 is the length of the output of the condenser, which is roughly (1 + α)k1. Thus we can choose α to be a large enough constant, choose k1 to be n^{δ1} for a small enough constant δ1 and choose ε1 = n^{−δ2} for a small enough constant δ2, such that d1 + d2 ≤ (1/2 + δ/2) log n. Hence M^i has at most n^{1/2+δ/2} rows.
Now for each M^i, we apply a strong linear seeded extractor to X and each row of M^i to get a matrix X̄^i. Now assume X^i has entropy at least n^δ (recall that we know there exists such a block). Then M^i is a somewhere random source. We want to argue that X̄^i is also a somewhere random source. At first this may seem unreasonable because M^i and X are correlated. However, note that M^i is a deterministic function of X^i, and X^i is a linear function of X. Also note that since X^i only has √n bits, conditioned on the fixing of X^i, X has entropy at least n^{1/2+δ} − √n > 0.9n^{1/2+δ} by Lemma 2.14. Now the property of the strong linear seeded extractor and the structure of affine sources guarantee that in this case, with high probability over the fixing of X^i, X̄^i is also a somewhere random source. Moreover, each row in X̄^i can have length Ω(n^{1/2+δ}). Furthermore, conditioned on the fixing of X^i, X is still an affine source. Thus conditioned on the fixing of X^i, X̄^i is also an affine source, since the extractor is a linear seeded extractor. Thus now conditioned on the fixing of X^i, X̄^i is an affine somewhere random source with at most n^{1/2+δ/2} rows and each row has length Ω(n^{1/2+δ}). Therefore we can use the extractor in [Rao09] to extract random bits from X̄^i. Since we try this for every X^i, we end up with a somewhere random source with √n rows.
Roadmap. The rest of the paper is organized as follows. In Section 2 we give the preliminaries
and previous work that we use. Section 3 gives the formal description of our three source extractor,
and its analysis. Section 4 gives the formal description of our extractor for three independent
sources with uneven lengths, and the analysis of the extractor. Section 5 gives our extractor for
two independent affine sources, and the analysis of the extractor. Finally in Section 6 we conclude
with some open problems.
2 Preliminaries
We use common notations such as ◦ for concatenation and [n] for {1, 2, · · · , n}. All logarithms are
to the base 2. We often use capital letters for random variables and corresponding small letters for
their instantiations.
2.1 Basic Definitions
Definition 2.1 (statistical distance). Let D and F be two distributions on a set S. Their statistical distance is

|D − F| \overset{def}{=} \max_{T ⊆ S} |D(T) − F(T)| = \frac{1}{2} \sum_{s ∈ S} |D(s) − F(s)|.

If |D − F| ≤ ε we say that D is ε-close to F.
Definition 2.2. The min-entropy of a random variable X is defined as

H_∞(X) = \min_{x ∈ Supp(X)} {− log_2 Pr[X = x]}.

We say X is an (n, k)-source if X is a random variable on {0, 1}^n and H_∞(X) ≥ k. When n is understood from the context we simply say that X is a k-source.
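The two quantities just defined are easy to compute for explicit distributions; the following small Python illustration (ours, not the paper's) treats a distribution as a dict from outcomes to probabilities.

```python
import math

def statistical_distance(D, F):
    """|D - F| = (1/2) * sum_s |D(s) - F(s)| (Definition 2.1)."""
    support = set(D) | set(F)
    return 0.5 * sum(abs(D.get(s, 0.0) - F.get(s, 0.0)) for s in support)

def min_entropy(D):
    """H_inf(X) = min_x {-log2 Pr[X = x]} (Definition 2.2)."""
    return -math.log2(max(D.values()))

# A 2-bit source putting weight 1/2 on one string is a (2, 1)-source,
# and it is 1/4-close to the uniform distribution U_2:
X = {'00': 0.5, '01': 0.25, '10': 0.25}
U = {s: 0.25 for s in ('00', '01', '10', '11')}
assert min_entropy(X) == 1.0
assert statistical_distance(X, U) == 0.25
```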
2.2 Somewhere Random Sources, Extractors and Condensers
Definition 2.3 (Somewhere Random sources). A source X = (X1, · · · , Xt) is a (t × r) somewhere-random source (SR-source for short) if each Xi takes values in {0, 1}^r and there is an i such that Xi is uniformly distributed.
Definition 2.4. An elementary somewhere-k-source is a vector of sources (X1, · · · , Xt), such that some Xi is a k-source. A somewhere k-source is a convex combination of elementary somewhere-k-sources.
Definition 2.5. A function C : {0, 1}^n × {0, 1}^d → {0, 1}^m is a (k → l, ε)-condenser if for every k-source X, C(X, U_d) is ε-close to some l-source. When convenient, we call C a rate-(k/n → l/m, ε)-condenser.
Definition 2.6. A function C : {0, 1}^n × {0, 1}^d → {0, 1}^m is a (k → l, ε)-somewhere-condenser if for every k-source X, the vector (C(X, y))_{y ∈ {0,1}^d} is ε-close to a somewhere-l-source. When convenient, we call C a rate-(k/n → l/m, ε)-somewhere-condenser.
Definition 2.7. A function Ext : {0, 1}^n × {0, 1}^d → {0, 1}^m is a strong seeded extractor for min-entropy k and error ε if for every min-entropy k source X,

|(Ext(X, R), R) − (U_m, R)| < ε,

where R is the uniform distribution on d bits independent of X, and U_m is the uniform distribution on m bits independent of R.
Definition 2.8. A function TExt : {0, 1}^{n1} × {0, 1}^{n2} → {0, 1}^m is a strong two source extractor for min-entropy k1, k2 and error ε if for every independent (n1, k1) source X and (n2, k2) source Y,

|(TExt(X, Y), X) − (U_m, X)| < ε

and

|(TExt(X, Y), Y) − (U_m, Y)| < ε,

where U_m is the uniform distribution on m bits independent of (X, Y).
Definition 2.9. (aligned SR-source) [Rao06] We say that a collection of SR-sources X1 , · · · , Xu is
aligned if there is some i such that the i’th row of every SR-source in the collection is uniformly
distributed.
We also need the definition of a subsource.
Definition 2.10 (Subsource). Given random variables X and X′ on {0, 1}^n we say that X′ is a deficiency-d subsource of X and write X′ ⊆ X if there exists a set A ⊆ {0, 1}^n such that (X|A) = X′ and Pr[x ∈ A] ≥ 2^{−d}.
We have the following lemma.
Lemma 2.11 ([BRSW06]). Let X be a random variable over {0, 1}^n such that X is ε-close to an (n, k) source with ε ≤ 1/4. Then there is a deficiency 2 subsource X′ ⊆ X such that X′ is an (n, k − 3) source.
2.3 Strong Linear Seeded Extractors
We need the following definition and property of a specific kind of extractors.
Definition 2.12. A function Ext : {0, 1}^n × {0, 1}^d → {0, 1}^m is a strong seeded extractor for min-entropy k and error ε if for every min-entropy k source X,

Pr_{u ←_R U_d} [|Ext(X, u) − U_m| ≤ ε] ≥ 1 − ε,

where U_m is the uniform distribution on m bits. We say that the function is a linear strong seeded extractor if the function Ext(·, u) is a linear function over GF(2), for every u ∈ {0, 1}^d.
Proposition 2.13 ([Rao09]). Let Ext : {0, 1}^n × {0, 1}^d → {0, 1}^m be a linear strong seeded extractor for min-entropy k with error ε < 1/2. Let X be any affine source with entropy k. Then

Pr_{u ←_R U_d} [|Ext(X, u) − U_m| = 0] ≥ 1 − ε.
2.4 The Structure of Affine Sources
The following lemma explains the behavior of a linear function acting on an affine source.
Lemma 2.14 (Affine Conditioning, [Rao09, Li10]). Let X be any affine source on {0, 1}^n. Let L : {0, 1}^n → {0, 1}^m be any linear function. Then there exist independent affine sources A, B such that:
• X = A + B.
• For every b ∈ Supp(B), L(b) = 0.
• H(A) = H(L(A)) and there exists an affine function L^{−1} : {0, 1}^m → {0, 1}^n such that A = L^{−1}(L(A)).
We have the following lemma that exhibits a nice structure of affine sources.
Lemma 2.15 ([Li10]). Let X be any affine source on {0, 1}^n. Divide X into t arbitrary blocks X = X1 ◦ X2 ◦ · · · ◦ Xt. Then there exist positive integers k1, . . . , kt such that:
• ∀j, 1 ≤ j ≤ t and ∀(x1, . . . , x_{j−1}) ∈ Supp(X1, . . . , X_{j−1}), H(Xj | X1 = x1, . . . , X_{j−1} = x_{j−1}) = kj.
• \sum_{i=1}^{t} ki = H(X).
2.5 Previous Work that We Use
We are going to use condensers recently constructed based on the sum-product theorem. The following construction is due to Zuckerman [Zuc07].
Theorem 2.16 ([BKS+05, Zuc07]). For any constants β, δ > 0, there is an efficient family of rate-(δ → 1 − β, ε = 2^{−Ω(n)})-somewhere condensers Zuc : {0, 1}^n → ({0, 1}^m)^D where D = O(1) and m = Ω(n).
We need the following two source extractor from [Raz05].
Theorem 2.17 ([Raz05]). For any n1, n2, k1, k2, m and any 0 < δ < 1/2 with
• n1 ≥ 6 log n1 + 2 log n2,
• k1 ≥ (0.5 + δ)n1 + 3 log n1 + log n2,
• k2 ≥ 5 log(n1 − k1),
• m ≤ δ min[n1/8, k2/40] − 1,
there is a polynomial time computable strong 2-source extractor Raz : {0, 1}^{n1} × {0, 1}^{n2} → {0, 1}^m for min-entropy k1, k2 with error 2^{−1.5m}.
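Since we will instantiate Raz with several different parameter settings, it can be handy to sanity-check a candidate setting numerically. The helper below is a hypothetical utility of ours (not part of any cited construction); it simply evaluates the four conditions of Theorem 2.17.

```python
import math

def raz_params_ok(n1, n2, k1, k2, m, delta):
    """Check the four numerical conditions of Theorem 2.17."""
    return (0 < delta < 0.5
            and n1 >= 6 * math.log2(n1) + 2 * math.log2(n2)
            and k1 >= (0.5 + delta) * n1 + 3 * math.log2(n1) + math.log2(n2)
            and k2 >= 5 * math.log2(n1 - k1)
            and m <= delta * min(n1 / 8, k2 / 40) - 1)

# E.g. a short 0.7-rate source against a much longer low-entropy source:
print(raz_params_ok(n1=1024, n2=2**20, k1=717, k2=4000, m=9, delta=0.1))  # True
```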
We need the following theorem from [Rao06].
Theorem 2.18 ([Rao06]). For every constant γ < 1 and integers n, n′, t s.t. t < n^γ and t < n′^γ, there exists a constant α < 1 and a polynomial time computable function 2SRExt : {0, 1}^{tn} × {0, 1}^{tn′} → {0, 1}^m s.t. if X is a (t × n) SR-source and Y is an independent aligned (t × n′) SR-source,

|(2SRExt(X, Y), Y) − (U_m, Y)| < ε

and

|(2SRExt(X, Y), X) − (U_m, X)| < ε,

where U_m is independent of X, Y, m = min[n, n′] − min[n, n′]^α and ε = 2^{−min[n,n′]^{Ω(1)}}.
We use the following lossless condenser constructed in [GUV07].
Theorem 2.19 ([GUV07]). For all constants α > 0, and every n ∈ N, k ≤ n and ε > 0, there is an explicit (k → k + d, ε) (lossless) condenser Cond : {0, 1}^n × {0, 1}^d → {0, 1}^m with d = (1 + 1/α) · (log n + log k + log(1/ε)) + O(1) and m ≤ 2d + (1 + α)k.
We use the following strong seeded extractor in [GUV07].
Theorem 2.20 ([GUV07]). For every constant α > 0, and all positive integers n, k and ε > exp(−n/2^{O(log* n)}), there is an explicit construction of a strong (k, ε) extractor Ext : {0, 1}^n × {0, 1}^d → {0, 1}^m with d = O(log n + log(1/ε)) and m ≥ (1 − α)k.
We need the following simple lemma about statistical distance.
Lemma 2.21 ([MW97]). Let X and Y be random variables and let \mathcal{Y} denote the range of Y. Then for all ε > 0,

Pr_Y [ H_∞(X | Y = y) ≥ H_∞(X) − log |\mathcal{Y}| − log(1/ε) ] ≥ 1 − ε.
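As a quick numerical illustration (ours) of the lemma: for a small explicit joint distribution one can check directly that, except with probability ε over y, conditioning on Y = y costs at most log|\mathcal{Y}| + log(1/ε) bits of min-entropy. In this tiny example the bound is quite loose and every y satisfies it.

```python
import math

joint = {('00', 'a'): 0.20, ('01', 'a'): 0.20, ('10', 'b'): 0.25,
         ('11', 'b'): 0.25, ('00', 'b'): 0.10}
eps = 0.25

px = {}
for (x, y), p in joint.items():
    px[x] = px.get(x, 0.0) + p
h_x = -math.log2(max(px.values()))                    # H_inf(X)
threshold = h_x - math.log2(2) - math.log2(1 / eps)   # range {a, b}: log|Y| = 1

good = 0.0
for y in ('a', 'b'):
    py = sum(p for (_, yy), p in joint.items() if yy == y)
    pmax = max(p for (_, yy), p in joint.items() if yy == y) / py
    if -math.log2(pmax) >= threshold:                 # H_inf(X | Y = y)
        good += py
assert good >= 1 - eps                                # Lemma 2.21's guarantee
```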
We need the following lemma about conditioning on the seed of a condenser.
Lemma 2.22. Let Cond : {0, 1}^n × {0, 1}^d → {0, 1}^m be a (k → l, ε)-condenser. For any (n, k)-source X, let R be the uniform distribution over d bits independent of X. With probability 1 − 2√ε over the fixings of R = r, Cond(X, r) is √ε-close to being an (l − 2d)-source.
Proof. Let W = Cond(X, R). We know that W is ε-close to having min-entropy l. Now for a fixed R = r, let S_r = {w ∈ Supp(W) : Pr[W = w | R = r] > 2^{−l+2d}}. Note that if Pr[W = w | R = r] > 2^{−l+2d} then Pr[W = w] ≥ Pr[W = w | R = r] Pr[R = r] > 2^{−l+d}. Pick ε1 > 0 and let Pr_R[Pr[W ∈ S_r | R = r] > ε1] = ε2; then Pr_W[Pr[W = w] > 2^{−l+d}] > ε1 ε2. Thus the statistical distance between W and any l-source is at least (1 − 2^{−d})ε1 ε2 > ε1 ε2 / 2. Therefore ε1 ε2 < 2ε.
Therefore with probability 1 − 2√ε over R, ε1 < √ε. This implies that W|_{R=r} is √ε-close to having min-entropy l − 2d.
Our extractor for affine sources uses strong linear seeded extractors as ingredients. Specifically, we use the construction of Trevisan [Tre01] and the improvement by Raz et al. [RRV02].
Theorem 2.23 ([Tre01, RRV02]). For every n, k ∈ N with k < n and any 0 < ε < 1 there is an explicit (k, ε)-strong linear seeded extractor LExt : {0, 1}^n × {0, 1}^d → {0, 1}^{Ω(k)} with d = O(log^2(n/ε)).
We need to use the following extractor for an affine somewhere random source.
Theorem 2.24 ([Rao09]). For every constant γ < 1 and integers n, t s.t. t < n^γ there exists a constant α < 1 and a polynomial time computable function ASRExt : {0, 1}^{tn} → {0, 1}^{n−n^α} s.t. for every (t × n) affine-SR-source X, ASRExt(X) is 2^{−n^{Ω(1)}}-close to uniform.
3 The Three Source Extractor
In this section we present our three source extractor. We have the following algorithm.
Algorithm 3.1 (THExt(x, y, z)).
Input: x, y, z — a sample from three independent (n, k)-sources with k = n^{1/2+δ}, for some arbitrary constant 0 < δ < 1/2.
Output: w — an m bit string.

Sub-Routines and Parameters:
Let Cond be a (k1 → k1 + d, ε1) condenser from Theorem 2.19, such that k1 = n^{δ/10}, ε1 = n^{−δ/10} and α = 10/δ, where α is the parameter α in Theorem 2.19.
Let Zuc be a rate-(0.09δ → 0.9, 2^{−Ω(n)})-somewhere-condenser from Theorem 2.16.
Let Raz be the strong 2-source extractor from Theorem 2.17.
Let Ext be the strong extractor from Theorem 2.20.
Let 2SRExt be the extractor for two independent aligned SR-sources from Theorem 2.18.

1. For every s ∈ {0, 1}^d compute x_s = Cond(x, s). Concatenate {x_s} in the binary order of s to form a matrix of 2^d rows. Divide the rows of the matrix sequentially into blocks x^1, · · · , x^t, with each block consisting of √n rows. Do the same thing to y and z and obtain blocks y^1, · · · , y^t and z^1, · · · , z^t.
2. (Compute an SR-source from x and y.) For i = 1 to t do the following.
• For each row x^i_j in block x^i (there are √n rows), apply Zuc to get a constant number of outputs {x^i_{jℓ}}.
• For each x^i_{jℓ} compute v^i_{jℓ} = Raz(x^i_{jℓ}, y^i) and output m2 = Ω(k1) bits.
• For each v^i_{jℓ} compute Ext(x, v^i_{jℓ}), output 0.9n^{1/2+δ} bits and concatenate these strings to form a matrix x̄^i. Similarly for each v^i_{jℓ} compute Ext(y, v^i_{jℓ}), output 0.9n^{1/2+δ} bits and concatenate these strings to form a matrix ȳ^i.
• Compute r^i = 2SRExt(x̄^i, ȳ^i) and output m3 = 0.8n^{1/2+δ} bits.
3. Concatenate {r^i, i = 1, · · · , t} to form a matrix sry.
4. Repeat step 2 and step 3 above for x and z to obtain a matrix srz.
5. Output w = 2SRExt(sry, srz).
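The control flow of Algorithm 3.1 can be summarized in the following structural sketch. The five sub-routines are passed in as arguments and replaced by toy stand-ins below; they are placeholders for the components of Theorems 2.16–2.20, so only the wiring between steps, not the extraction itself, is meaningful here.

```python
import math

def thext(x, y, z, n, d, cond, zuc, raz, ext, two_sr_ext):
    """Structural sketch of THExt (Algorithm 3.1); sub-routines are stubs."""
    def blocks(w):                                       # step 1
        rows = [cond(w, s) for s in range(2 ** d)]       # try every seed s
        t = math.isqrt(n)                                # sqrt(n) rows per block
        return [rows[i:i + t] for i in range(0, len(rows), t)]

    def sr_rows(w):                                      # steps 2-3 for (x, w)
        out = []
        for xi, wi in zip(blocks(x), blocks(w)):
            vs = [raz(c, wi) for row in xi for c in zuc(row)]
            xbar = [ext(x, v) for v in vs]               # matrix \bar{x}^i
            wbar = [ext(w, v) for v in vs]               # matrix \bar{y}^i / \bar{z}^i
            out.append(two_sr_ext(xbar, wbar))           # row r^i
        return out

    return two_sr_ext(sr_rows(y), sr_rows(z))            # steps 4-5

# Toy stand-ins (NOT the real components) just to exercise the data flow:
toy = dict(cond=lambda w, s: hash((w, s)), zuc=lambda r: [r],
           raz=lambda a, b: hash((a, tuple(b))),
           ext=lambda w, v: hash((w, v)),
           two_sr_ext=lambda a, b: hash((tuple(a), tuple(b))))
w = thext('x-bits', 'y-bits', 'z-bits', n=16, d=4, **toy)
```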
3.1 Analysis of the extractor
In this section we analyze the three source extractor. Specifically, we prove the following theorem.
Theorem 3.2. For any constant 0 < δ < 1/2, let X, Y, Z be three independent (n, k) sources with k = n^{1/2+δ}. Then

|THExt(X, Y, Z) − U_m| < n^{−Ω(δ)}

with m = Ω(k).
Proof. Our goal is to show that SRy and SRz are (close to) a convex combination of independent aligned SR-sources with few rows. Then we'll be done by Theorem 2.18.
First note that k > k1, thus an (n, k)-source is also an (n, k1) source. Let S be the uniform distribution over d bits independent of (X, Y, Z). By Theorem 2.19 we have that X_S = Cond(X, S) is ε1-close to being an (m1, k1 + d) source with m1 ≤ 2d + (1 + α)k1 < (2 + α)k1, since d = (1 + 1/α) · (log n + log k1 + log(1/ε1)) + O(1) = O(log n) = O(log k1).
Now by Lemma 2.22, with probability 1 − 2√ε1 over the fixings of S = s, X_s is √ε1-close to being an (m1, k1 − d) source. We say that a row X_s is good if it is √ε1-close to being an (m1, k1 − d) source, and we say that a block X^i is good if it contains at least one good row. It's easy to see that the fraction of “bad” blocks in {X^i} is at most 2√ε1. Similarly this is also true for the blocks {Y^i} and {Z^i}.
Now since 2√ε1 < 1/3, by the union bound there exists an i s.t. X^i, Y^i and Z^i are all good blocks. Without loss of generality assume that X^1, Y^1 and Z^1 are all good blocks. We are going to show that the first rows of SRy and SRz are close to uniform, thus SRy and SRz are aligned somewhere random sources.
We first show this for SRy. Note that X^1 is a good block and Y^1 is also a good block. Therefore at least one row in X^1 is good. Without loss of generality assume that X^1_1 is a good row. Thus X^1_1 is √ε1-close to being an (m1, k1 − d) source. Note that k1 − d > 0.99k1 since d = O(log k1), and m1 < (2 + α)k1. Thus X^1_1 is close to having min-entropy rate 0.99/(2 + α) = 0.99/(2 + 10/δ) = 0.99δ/(10 + 2δ) > 0.09δ since δ < 1/2.
Therefore by Theorem 2.16, Zuc(X^1_1) is (√ε1 + 2^{−Ω(m1)}) = n^{−Ω(δ)}-close to being a somewhere rate 0.9 source with O(1) rows, and the length of each row is Ω(m1) = Ω(k1).
We now have the following claim.
Claim 3.3. With probability 1 − n^{−Ω(δ)} over the fixings of X^1 and Y^1, X̄^1 is a deterministic function of X, Ȳ^1 is a deterministic function of Y, and they are 2^{−n^{Ω(1)}}-close to being two aligned (O(√n) × 0.9n^{1/2+δ}) SR-sources.
proof of the claim. Note that Zuc(X^1_1) is n^{−Ω(δ)}-close to being a somewhere rate 0.9 source with O(1) rows, and each row has length Ω(k1). For simplicity, consider the case where Zuc(X^1_1) is close to an elementary somewhere rate 0.9 source (since Zuc(X^1_1) is n^{−Ω(δ)}-close to being a convex combination of such sources, this increases the error by at most n^{−Ω(δ)}). Without loss of generality assume that the first row X^1_{11} is n^{−Ω(δ)}-close to having rate 0.9. Since Y^1 is a good block, Y^1 is √ε1-close to having min-entropy at least k1 − d > 0.99k1, and Y^1 has length m1√n = poly(k1). Therefore by Theorem 2.17 we have

|(V^1_{11}, X^1_{11}) − (U_{m2}, X^1_{11})| < n^{−Ω(δ)} + 2^{−Ω(k1)} = n^{−Ω(δ)}

and

|(V^1_{11}, Y^1) − (U_{m2}, Y^1)| < n^{−Ω(δ)} + 2^{−Ω(k1)} = n^{−Ω(δ)}.

Therefore with probability 1 − n^{−Ω(δ)} over the fixing of X^1_{11}, V^1_{11} is n^{−Ω(δ)}-close to uniform. Since X^1_{11} is a deterministic function of X^1, this also implies that with probability 1 − n^{−Ω(δ)} over the fixing of X^1, V^1_{11} is n^{−Ω(δ)}-close to uniform. Note that after this fixing, V^1_{11} is a deterministic function of Y^1, and is thus independent of X. Moreover, note that the length of X^1 is m1√n = O(k1√n) = O(n^{1/2+δ/10}). Thus by Lemma 2.21, with probability 1 − 2^{−n^{δ/10}} over the fixing of X^1, X has min-entropy at least n^{1/2+δ} − O(n^{1/2+δ/10}) − n^{δ/10} > 0.99n^{1/2+δ}.
Therefore, now by the strong extractor property of Ext from Theorem 2.20, with probability 1 − n^{−Ω(δ)} over the fixing of V^1_{11}, Ext(X, V^1_{11}) is 2^{−n^{Ω(1)}}-close to uniform. Since now V^1_{11} is a deterministic function of Y^1, this also implies that with probability 1 − n^{−Ω(δ)} over the fixing of Y^1, Ext(X, V^1_{11}) is 2^{−n^{Ω(1)}}-close to uniform. Note also that after this fixing Ext(X, V^1_{11}) is a deterministic function of X. Therefore we have shown that with probability 1 − n^{−Ω(δ)} over the fixing of X^1 and Y^1, X̄^1 is a deterministic function of X and is 2^{−n^{Ω(1)}}-close to an SR-source.
By symmetry we can also show that with probability 1 − n^{−Ω(δ)} over the fixing of X^1 and Y^1, Ȳ^1 is a deterministic function of Y and is 2^{−n^{Ω(1)}}-close to an SR-source.
Since in both X̄^1 and Ȳ^1 the first row is close to uniform, they are close to being aligned SR-sources.
Now we have the following claim.
Claim 3.4. With probability 1 − n^{−Ω(δ)} over the fixing of X, Y^1, Z^1, SRy and SRz are two independent aligned somewhere random sources.
proof of the claim. Note that X̄^1 and Ȳ^1 each has √n rows, and each row has 0.9n^{1/2+δ} bits. Thus by Claim 3.3 and Theorem 2.18, we have

|(R^1, X̄^1) − (U_{m3}, X̄^1)| < n^{−Ω(δ)}.

This means that with probability 1 − n^{−Ω(δ)} over the fixing of X̄^1, R^1 is n^{−Ω(δ)}-close to uniform. Since we have fixed X^1 and Y^1 before, now X̄^1 is a deterministic function of X. Thus this also implies that with probability 1 − n^{−Ω(δ)} over the fixing of X, R^1 is n^{−Ω(δ)}-close to uniform. Moreover, now R^1 (and all the other R^i's) is a deterministic function of Y. Therefore with probability 1 − n^{−Ω(δ)} over the fixing of X, SRy is n^{−Ω(δ)}-close to an SR-source. Moreover, after this fixing SRy is a deterministic function of Y.
By the same argument, it follows that with probability 1 − n^{−Ω(δ)} over the fixings of X and Z^1, SRz is n^{−Ω(δ)}-close to an SR-source, and SRz is a deterministic function of Z. Thus SRy and SRz are independent.
Since in both SRy and SRz the first row is close to uniform, they are close to being aligned independent SR-sources.
Now note that

2^d = O((nk1/ε1)^{1+1/α}) = O(n^{(1+δ/5)(1+δ/10)}) = O(n^{1+δ/3}).

Thus SRy and SRz each has 2^d/√n = O(n^{1/2+δ/3}) rows, and each row has 0.8n^{1/2+δ} bits. Therefore again by Theorem 2.18 we have

|THExt(X, Y, Z) − U_m| < n^{−Ω(δ)} + 2^{−n^{Ω(1)}} = n^{−Ω(δ)},

and m = Ω(k).
4 Extractor for Three Sources with Uneven Lengths
In this section we give our extractor for three independent sources with uneven lengths. Our
extractor improves that of [RZ08].
First we have the following algorithm.
Algorithm 4.1 (UExt(x, y, z)).
Input: x, y, z — a sample from three independent sources X, Y, Z, where X is an (n1, k1) source, Y is an (n2, k2) source and Z is an (n3, k3) source s.t. k2 ≤ k3 and n1 < k2^γ for some arbitrary constant 0 < γ < 1.
Output: w — an m bit string.

Sub-Routines and Parameters:
Let Cond be a (k4 → k4 + d, ε4) condenser from Theorem 2.19, with parameters α = (1 + 3γ)/(1 − γ), k4 = min[k1, k2^{(1−γ)/4}] and ε4 = 1/100.
Let Zuc be a rate-(0.9/(2 + α) → 0.9, 2^{−Ω(n)})-somewhere-condenser from Theorem 2.16.
Let Raz be the strong 2-source extractor from Theorem 2.17.
Let Ext be the strong extractor from Theorem 2.20.
Let 2SRExt be the extractor for two independent aligned SR-sources from Theorem 2.18.

1. For every s ∈ {0, 1}^d compute x_s = Cond(x, s). Concatenate {x_s} to form a matrix of t = 2^d rows.
2. (Compute an SR-source from x and y.) For i = 1 to t do the following.
• For each row x_i, apply Zuc to get a constant number of outputs {x_{ij}}. Each x_{ij} has m1 bits.
• For each x_{ij}, take a substring x̄_{ij} with 0.3m1 bits. Compute v_{ij} = Raz(x̄_{ij}, y), r_{ij} = Ext(x_{ij}, v_{ij}) and h_{ij} = Ext(y, r_{ij}).
3. Concatenate {h_{ij}} to form a matrix sry.
4. Repeat step 2 and step 3 above for x and z to obtain a matrix srz.
5. Compute w̄ = 2SRExt(sry, srz).
6. Output w = Ext(srz, w̄).
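For comparison with Algorithm 3.1, here is the analogous structural sketch of UExt, again with the sub-routines left abstract; it spells out the Raz → Ext → Ext chain of step 2. The string-valued toy stand-ins at the bottom (not the real components) let the sketch run end to end.

```python
def uext(x, y, z, d, cond, zuc, raz, ext, two_sr_ext):
    """Structural sketch of UExt (Algorithm 4.1); sub-routines are stubs."""
    rows = [cond(x, s) for s in range(2 ** d)]              # step 1: t = 2^d rows

    def sr_rows(w):                                         # steps 2-3 for (x, w)
        out = []
        for xi in rows:
            for xij in zuc(xi):                             # O(1) outputs per row
                prefix = xij[:max(1, int(0.3 * len(xij)))]  # substring \bar{x}_{ij}
                v = raz(prefix, w)                          # v_{ij}: strong 2-source step
                r = ext(xij, v)                             # r_{ij}: back to the x side
                out.append(ext(w, r))                       # h_{ij}: Omega(k) bits of w
        return out

    sry, srz = sr_rows(y), sr_rows(z)                       # steps 3-4
    wbar = two_sr_ext(sry, srz)                             # step 5
    return ext(srz, wbar)                                   # step 6

# String-valued toy stand-ins (NOT the real components) so the sketch runs:
toy = dict(cond=lambda w, s: f'{w}:{s}', zuc=lambda r: [r],
           raz=lambda a, b: f'v({a})', ext=lambda w, v: f'e({v})',
           two_sr_ext=lambda a, b: f'w({len(a)},{len(b)})')
print(uext('x' * 8, 'y' * 32, 'z' * 32, d=3, **toy))
```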
4.1 Analysis of the extractor
First we prove the following theorem.
Theorem 4.2. For every constant 0 < γ < 1, assume that X is an (n1, k1) source, Y is an (n2, k2) source and Z is an (n3, k3) source such that X, Y, Z are independent, and the following hold:
• n1 < k2^γ and k2 ≤ k3.
• k1 > log^2 n1, k1 > log^2 n2 and k1 > log^2 n3.
• k2^{(1−γ)/4} > log^2 n2 and k2^{(1−γ)/4} > log^2 n3.
Then there exists a deficiency 2 subsource X′ ⊆ X such that

|(UExt(X′, Y, Z), X′) − (U_m, X′)| < 2^{−k1^{Ω(1)}} + 2^{−k2^{Ω(1)}}

with m = Ω(k3).
Proof. As usual, the goal is to show that SRy and SRz are (close to) a convex combination of aligned SR-sources with few rows.
First note that k4 ≤ k1, thus X is also an (n1, k4) source. Therefore by Lemma 2.22, with probability 1 − 2√ε4 = 4/5 over the fixings of S = s, Cond(X, s) is √ε4 = 1/10-close to having min-entropy k4 − d. Without loss of generality assume that the first row X1 is 1/10-close to having min-entropy k4 − d.
Now note that

d = (1 + 1/α) · (log n1 + log k4 + log(1/ε4)) + O(1).

Since we take k4 = min[k1, k2^{(1−γ)/4}], we have d = O(log n1). Note that k1 > log^2 n1 and n1 < k2^γ, thus d = o(k4). Now by Lemma 2.11 there is a deficiency 2 subsource X′ ⊆ X such that X′1 has min-entropy k4 − d − 3 > 0.9k4 while the length of X′1 is at most 2d + (1 + α)k4 < (2 + α)k4. Thus X′1 has min-entropy rate at least 0.9/(2 + α). Note that k2^{(1−γ)/4} = n1^{Ω(1)} = k1^{Ω(1)}. Thus k4 = k1^{Ω(1)}. Therefore now by Theorem 2.16, Zuc(X′1) is 2^{−k1^{Ω(1)}}-close to a somewhere rate 0.9 source.
For simplicity, consider the case where Zuc(X′1) is close to an elementary somewhere rate 0.9 source (since Zuc(X′1) is 2^{−k1^{Ω(1)}}-close to being a convex combination of such sources, this increases the error by at most 2^{−k1^{Ω(1)}}). Without loss of generality assume that X′11 is 2^{−k1^{Ω(1)}}-close to having min-entropy rate 0.9. Note that k1 > log^2 n2 and k2^{(1−γ)/4} > log^2 n2. Thus X′11 has length m1 = Ω(k4) = Ω(log^2 n2). Since X′11 has min-entropy 0.9m1, X̄′11 has min-entropy at least 0.2m1. Therefore X̄′11 has entropy rate at least 2/3. Thus by Theorem 2.17, with probability 1 − 2^{−k1^{Ω(1)}} over the fixings of X̄′11, V11 = Raz(X̄′11, Y) is 2^{−k1^{Ω(1)}}-close to uniform and has length Ω(m1).
Note that after the fixing of X̄′11, V11 is a deterministic function of Y and is thus independent of X′11. Moreover by Lemma 2.21, with probability 1 − 2^{−0.1m1} = 1 − 2^{−k1^{Ω(1)}} over the fixings of X̄′11, X′11 has min-entropy at least 0.9m1 − 0.3m1 − 0.1m1 = 0.5m1. Thus by Theorem 2.20, with probability 1 − 2^{−k1^{Ω(1)}} over the further fixings of V11, R11 = Ext(X′11, V11) is 2^{−k1^{Ω(1)}}-close to uniform and has length Ω(m1). Note that after this fixing, R11 is a deterministic function of X′ and is thus independent of Y. Moreover since V11 has length at most m1 = O(k1) = O(n1) = o(k2), by Lemma 2.21, with probability 1 − 2^{−m1} = 1 − 2^{−k1^{Ω(1)}} over the fixings of V11, Y has min-entropy at least k2 − m1 − m1 > 0.9k2. Therefore again by Theorem 2.20, with probability 1 − 2^{−k1^{Ω(1)}} over the further fixings of R11, H11 = Ext(Y, R11) is 2^{−k1^{Ω(1)}}-close to uniform and has length Ω(k2). Since now R11 is a deterministic function of X′, this also implies that with probability 1 − 2^{−k1^{Ω(1)}} over the fixings of X′, H11 is 2^{−k1^{Ω(1)}}-close to uniform. Note that after the fixing of X′, SRy is a deterministic function of Y. Thus we have shown the following claim.
Claim 4.3. With probability 1 − 2^{−k1^{Ω(1)}} over the fixings of X′, V11, SRy is a deterministic function of Y and is 2^{−k1^{Ω(1)}}-close to an SR-source.
Similarly, we can also show that with probability 1 − 2^{−k1^{Ω(1)}} over the fixings of X′ and some other random variable (which is a deterministic function of Z), SRz is a deterministic function of Z and is 2^{−k1^{Ω(1)}}-close to an SR-source.
Since the first rows in both SRy and SRz are close to uniform (we assumed that X′11 is close to having min-entropy rate 0.9), SRy and SRz are 2^{−k1^{Ω(1)}}-close to a convex combination of two independent aligned SR-sources.
Now note that

2^d = (O(n1 k4))^{1+1/α} < (O(k2^γ · k2^{(1−γ)/4}))^{1+(1−γ)/(1+3γ)} = O(k2^{(1+γ)/2}).

Thus the number of rows in SRy and SRz is O(2^d) = O(k2^{(1+γ)/2}). Since γ < 1, (1+γ)/2 < 1. Note that each row in SRy has length Ω(k2) and each row in SRz has length Ω(k3) with k3 ≥ k2. Therefore by Theorem 2.18,

|(W̄, SRz) − (U_{m′}, SRz)| < 2^{−k1^{Ω(1)}} + 2^{−k2^{Ω(1)}},

where m′ = Ω(k2). Therefore by Theorem 2.20, W = Ext(SRz, W̄) is (2^{−k1^{Ω(1)}} + 2^{−k2^{Ω(1)}})-close to uniform with m = Ω(k3). Since this is true with probability 1 − 2^{−k1^{Ω(1)}} over the fixings of X′, we have that

|(UExt(X′, Y, Z), X′) − (U_m, X′)| < 2^{−k1^{Ω(1)}} + 2^{−k2^{Ω(1)}}.
Now we have the following theorem.
Theorem 4.4. For every constant 0 < γ < 1, there exists a polynomial time computable function UExt : {0, 1}^{n1} × {0, 1}^{n2} × {0, 1}^{n3} → {0, 1}^m such that if X is an (n1, k1) source, Y is an (n2, k2) source, Z is an (n3, k3) source and X, Y, Z are independent, then

|UExt(X, Y, Z) − U_m| < 2^{−k1^{Ω(1)}} + 2^{−k2^{Ω(1)}}

with m = Ω(k3), as long as the following hold:
• n1 < k2^γ and k2 ≤ k3.
• k1 > 2 log^2 n1, k1 > 2 log^2 n2 and k1 > 2 log^2 n3.
• k2^{(1−γ)/4} > log^2 n2 and k2^{(1−γ)/4} > log^2 n3.
Proof. Let UExt be the algorithm from Algorithm 4.1, set up to work for an (n1, k1/2) source, an (n2, k2) source and an (n3, k3) source. Let m = Ω(k3) be the output length and ε = 2^{−k1^{Ω(1)}} + 2^{−k2^{Ω(1)}} be the error in Theorem 4.2. Now run UExt on X, Y, Z. Define

B_X = {x ∈ Supp(X) : |UExt(x, Y, Z) − U_m| ≥ ε}.

We have the following claim:
Claim 4.5. |B_X| < 2^{k1/2}.
The proof of the claim is by contradiction. Assume that |B_X| ≥ 2^{k1/2}. Then we define a weak random source X̄ to be the uniform distribution on B_X. Thus X̄ is an (n1, k1/2) source and is independent of Y, Z. Note that k1/2 > log^2 n1, k1/2 > log^2 n2 and k1/2 > log^2 n3. Thus by Theorem 4.2 there exists a deficiency 2 subsource X′ ⊆ X̄ such that

|(UExt(X′, Y, Z), X′) − (U_m, X′)| < ε.

However by the definition of B_X any subsource X′ ⊆ X̄ must have

|(UExt(X′, Y, Z), X′) − (U_m, X′)| ≥ ε,

which is a contradiction. Therefore we must have that |B_X| < 2^{k1/2}.
Thus, since each point of B_X has probability at most 2^{−k1} under X,

|UExt(X, Y, Z) − U_m| < 2^{k1/2} · 2^{−k1} + ε = 2^{−k1^{Ω(1)}} + 2^{−k2^{Ω(1)}}.

5 The Affine Two Source Extractors
In this section we give an extractor for two independent affine sources with entropy k = n^{1/2+δ}.
We have the following algorithm.
Algorithm 5.1 (TAExt(x, y)).
Input: x, y — a sample from two independent (n, k) affine sources with k = n^{1/2+δ}, for some constant 0 < δ < 1/2.
Output: w — an m bit string.

Sub-Routines and Parameters:
Let Cond be a (k1 → k1 + d1, ε1) condenser from Theorem 2.19, with parameters α = 10/δ and k1 = n^{Ω(δ)} < n^δ, ε1 = n^{−Ω(δ)} to be chosen later. The output has length m1 ≤ 2d1 + (1 + α)k1.
Let Ext be the strong extractor from Theorem 2.20, set up to extract from an (m1, k1 − d1) source using d2 bits, with error ε2 = n^{−Ω(δ)} to be chosen later.
Let LExt be the strong linear seeded extractor from Theorem 2.23.
Let ASRExt be the extractor for an affine SR-source from Theorem 2.24.

1. Divide x into √n blocks x^1, · · · , x^t, where t = √n and each block has √n bits.
2. (Compute an SR-source from x and y.) For i = 1 to t do the following.
• For every s ∈ {0, 1}^{d1}, v ∈ {0, 1}^{d2} compute x^i_s = Cond(x^i, s) and x^i_{sv} = Ext(x^i_s, v) with output length n^{Ω(1)}. Concatenate {x^i_{sv}} to form a matrix of 2^{d1+d2} rows.
• For each row x^i_{sv} (there are 2^{d1+d2} rows), compute z^i_{sv} = LExt(x, x^i_{sv}) and output m2 = Ω(k) bits. Concatenate all z^i_{sv} to form a matrix x̄^i of 2^{d1+d2} rows.
• Compute r^i = ASRExt(x̄^i) and output Ω(k) bits.
• Compute h^i = LExt(y, r^i) and output m3 = Ω(k) bits with error ε3 = n^{−Ω(δ)}.
3. Concatenate {h^i} to form a matrix hxy with t = √n rows.
4. Output w = ASRExt(hxy).
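As with the previous two algorithms, the block structure of TAExt can be captured in a short sketch; Cond, Ext, LExt and ASRExt are abstract parameters standing in for the components of Theorems 2.19, 2.20, 2.23 and 2.24 (toy stand-ins like those shown earlier make it runnable).

```python
import math

def taext(x, y, n, d1, d2, cond, ext, lext, asrext):
    """Structural sketch of TAExt (Algorithm 5.1); sub-routines are stubs."""
    t = math.isqrt(n)
    xblocks = [x[i * t:(i + 1) * t] for i in range(t)]   # step 1: sqrt(n) blocks
    hxy = []
    for xi in xblocks:                                   # step 2
        rows = [ext(cond(xi, s), v)                      # 2^{d1+d2} rows x^i_{sv}
                for s in range(2 ** d1) for v in range(2 ** d2)]
        xbar = [lext(x, row) for row in rows]            # z^i_{sv} = LExt(x, x^i_{sv})
        ri = asrext(xbar)                                # r^i = ASRExt(\bar{x}^i)
        hxy.append(lext(y, ri))                          # h^i = LExt(y, r^i)
    return asrext(hxy)                                   # steps 3-4
```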
5.1 Analysis of the extractor
In this section we prove the following theorem.
Theorem 5.2. For any constant 0 < δ < 1/2, let X, Y be two independent affine (n, k) sources with k = n^{1/2+δ}. Then

|TAExt(X, Y) − U_m| < n^{−Ω(δ)},

and m = Ω(k).
Proof. Our goal is to show that Hxy is (close to) a convex combination of affine SR-sources with few rows. Then we'll be done by Theorem 2.24.
First note that by Lemma 2.15, the sum of the entropies of the X^i is at least k = n^{1/2+δ}. Thus at least one block has entropy at least n^δ. Without loss of generality assume that H(X^1) ≥ n^δ.
Note that we choose k1 < n^δ. Thus an affine source with entropy n^δ is also a source with min-entropy k1. Now by Lemma 2.22, with probability 1 − 2√ε1 over S, X^1_S is √ε1-close to being an (m1, k1 − d1) source. Therefore there exists an i such that X^1_i is √ε1-close to being an (m1, k1 − d1) source. Without loss of generality assume X^1_1 is √ε1-close to being an (m1, k1 − d1) source. Now by the strong extractor property of Ext (Theorem 2.20), there exists a j such that X^1_{1j} is (√ε1 + √ε2)-close to uniform. Without loss of generality assume that X^1_{11} is (√ε1 + √ε2)-close to uniform.
Now we count the number of elements in {X^i_{sv}}. This number is equal to 2^{d1+d2}. Note that

d1 = (1 + 1/α) · (log(√n) + log k1 + log(1/ǫ1)) + O(1)

and

d2 = O(log m1 + log(1/ǫ2)),

where m1 ≤ 2d1 + (1 + α)k1.

Since we choose k1 = n^{Ω(δ)} and ǫ1, ǫ2 = n^{−Ω(δ)}, we have d1 = O(log n) = O(log k1). Thus m1 ≤ (2 + α)k1 and k1 − d1 > 0.99k1. Therefore

d1 + d2 = (1 + 1/α) log(√n) + (1/α + O(1)) log k1 + (1 + 1/α) log(1/ǫ1) + O(log(1/ǫ2)) + O(1).

Since α = 10/δ, we can choose k1 = n^{Ω(δ)} and ǫ1, ǫ2 = n^{−Ω(δ)} such that

d1 + d2 ≤ (1/2 + δ/2) log n.

Thus 2^{d1+d2} ≤ n^{1/2+δ/2}, and each X^i_{sv} outputs Ω(k1) = n^{Ω(δ)} bits by Theorem 2.20.
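To see that such a choice exists, here is one concrete (purely illustrative) instantiation; the exponent δ/(20(c+2)) below is our own choice, where c denotes the absolute constant hidden in the O(·) terms of d2:

```latex
% Let c be the absolute constant hidden in d_2 = O(\log m_1 + \log(1/\epsilon_2)).
% Take (illustratively)
%   \alpha = 10/\delta, \qquad
%   k_1 = n^{\delta/(20(c+2))}, \qquad
%   \epsilon_1 = \epsilon_2 = n^{-\delta/(20(c+2))}.
\begin{align*}
d_1 + d_2
  &\le \Big(1+\tfrac{\delta}{10}\Big)\tfrac{1}{2}\log n
     + (c+2)\log k_1 + (c+2)\log\tfrac{1}{\epsilon_1} + O(1) \\
  &\le \tfrac{1}{2}\log n + \tfrac{\delta}{20}\log n
     + \tfrac{\delta}{20}\log n + \tfrac{\delta}{20}\log n + O(1)
  \;\le\; \Big(\tfrac{1}{2}+\tfrac{\delta}{2}\Big)\log n ,
\end{align*}
% using 1 + \delta/10 + c \le c + 2 for \delta < 1/2; hence 2^{d_1+d_2} \le n^{1/2+\delta/2}.
```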
Now we know that X^1_{11} is n^{−Ω(δ)}-close to uniform. We have the following claim.

Claim 5.3. With probability 1 − n^{−Ω(δ)} over the fixing of X^1 = x^1, Z^1_{11} = LExt(X, x^1_{11}) is uniform.
Proof of the claim. At first it seems that X and X^1_{11} are correlated, so it is not clear that LExt can work. However, in our case we are going to use the fact that X is an affine source and LExt is a strong linear seeded extractor.
Specifically, note that X^1_{11} is a deterministic function of X^1, which is in turn a linear function of X. Let L stand for this linear function, i.e., L(x) = x^1. By Lemma 2.14, there exist independent affine sources A and B such that X = A + B and for every b ∈ Supp(B), L(b) = 0. Thus X^1 = L(X) = L(A + B) = L(A). Therefore by Lemma 2.14, H(A) = H(X^1) ≤ √n. Thus the entropy of B is

H(B) = H(X) − H(A) ≥ n^{1/2+δ} − n^{1/2} > 0.9n^{1/2+δ} = 0.9k.
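The decomposition X = A + B invoked from Lemma 2.14 is elementary linear algebra over GF(2). As an illustration (our own sketch, with bit vectors encoded as Python ints and L an arbitrary GF(2)-linear map), the following routine splits a basis of the linear part of X into a part on which L is injective and a part on which L vanishes:

```python
def decompose(basis, L):
    """Split span(basis) into span(a_basis) + span(b_basis), where L is
    injective on span(a_basis) and L vanishes on span(b_basis).
    Vectors are ints over GF(2); L is any GF(2)-linear map int -> int.
    Our own illustration of the decomposition behind Lemma 2.14."""
    a_basis, b_basis = [], []
    pivots = {}  # pivot bit position -> (image, preimage) for elimination
    for v in basis:
        img = L(v)
        # Reduce L(v) against earlier pivots, mirroring each row op on v,
        # so the invariant img == L(v) holds throughout.
        while img:
            p = img.bit_length() - 1
            if p not in pivots:
                break
            pi, pv = pivots[p]
            img ^= pi
            v ^= pv
        if img:                       # L(v) independent of earlier images
            pivots[img.bit_length() - 1] = (img, v)
            a_basis.append(v)
        elif v:                       # reduced v lies in the kernel of L
            b_basis.append(v)
    return a_basis, b_basis

# Example: L projects an 8-bit vector onto its 4 low-order bits.
L = lambda v: v & 0b1111
a_b, b_b = decompose([0b1010_0001, 0b0110_0001, 0b0100_0000], L)
assert all(L(b) == 0 for b in b_b)    # the B-part lies in ker L
assert len(a_b) + len(b_b) == 3       # dimensions add up: H(A) + H(B) = H(X)
```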
Note that X^1_{11} is a deterministic function of X^1, thus it is also a deterministic function of A, and is therefore independent of B. Since X^1_{11} has n^{Ω(δ)} bits and ǫ2 = n^{−Ω(δ)}, by Theorem 2.23 and Proposition 2.13 we have

Pr_{u ←_R X^1_{11}} [|LExt(B, u) − U_{m2}| = 0] ≥ 1 − ǫ2 − n^{−Ω(δ)} = 1 − n^{−Ω(δ)}.

Since X^1_{11} is a deterministic function of A, this also implies that

Pr_{a ←_R A, u = x^1_{11}(a)} [|LExt(B, u) − U_{m2}| = 0] ≥ 1 − n^{−Ω(δ)}.
Now note that conditioned on a fixed A = a, u is also fixed, and

LExt(X, u) = LExt(a + B, u) = LExt(a, u) + LExt(B, u),

since LExt is a linear seeded extractor. Thus if for a fixed A = a and u = x^1_{11}(a) we have LExt(B, u) = U_{m2}, then LExt(X, u) is also uniform. Therefore

Pr_{a ←_R A, u = x^1_{11}(a)} [|LExt(X, u) − U_{m2}| = 0] ≥ 1 − n^{−Ω(δ)}.
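The only feature of LExt used in this step is that, for every fixed seed u, it is a GF(2)-linear function of its source. The toy seeded map below (our own stand-in, with no extractor guarantee whatsoever) has this shape, and the final assertion checks the shift identity LExt(a + b, u) = LExt(a, u) + LExt(b, u) that the argument relies on:

```python
import random

def toy_linear_seeded(x, u, n=32, m=8):
    """A toy seeded GF(2)-linear map {0,1}^n x seeds -> {0,1}^m.
    For each fixed seed u it outputs m inner products of x (an int,
    viewed as a bit vector) with seed-derived rows, so it is linear
    in x over GF(2). Purely illustrative; not a real extractor."""
    rng = random.Random(u)                 # rows determined by the seed
    rows = [rng.getrandbits(n) for _ in range(m)]
    out = 0
    for i, r in enumerate(rows):
        bit = bin(x & r).count("1") & 1    # <x, r> over GF(2)
        out |= bit << i
    return out

# Linearity in the source for every fixed seed: shifting the source by a
# fixed vector a just shifts the output by toy_linear_seeded(a, u).
a, b, u = 0xDEADBEEF, 0x12345678, 7
assert toy_linear_seeded(a ^ b, u) == toy_linear_seeded(a, u) ^ toy_linear_seeded(b, u)
```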
Note that by Lemma 2.14 there exists an affine function L^{−1} such that A = L^{−1}(L(A)) = L^{−1}(X^1). Thus A is also a deterministic function of X^1 (in fact, there is a bijection between X^1 and A). Therefore this also implies that

Pr_{x ←_R X^1, u = x^1_{11}(x)} [|LExt(X, u) − U_{m2}| = 0] ≥ 1 − n^{−Ω(δ)}.
Therefore, with probability 1 − n^{−Ω(δ)} over the fixing of X^1 = x^1, X̄^1 is an SR-source. Moreover, since X^1 is a linear function of X, conditioned on X^1 = x^1, X is still an affine source. Next note that conditioned on X^1 = x^1, LExt(X, x^1_{sv}) is a linear function of X. Thus conditioned on X^1 = x^1, X̄^1 is an affine source. Therefore with probability 1 − n^{−Ω(δ)} over the fixing of X^1 = x^1, X̄^1 is an affine SR-source. Since the number of rows is 2^{d1+d2} ≤ n^{1/2+δ/2} and each row has length m2 = Ω(k) = Ω(n^{1/2+δ}), by Theorem 2.24 we have that R^1 is n^{−Ω(δ)}-close to uniform.
Note that R^1 has Ω(k) bits, is a deterministic function of X, and is independent of Y. Thus, again by Theorem 2.23 and Proposition 2.13, we have

Pr_{r ←_R R^1} [|LExt(Y, r) − U_{m3}| = 0] ≥ 1 − ǫ3 − n^{−Ω(δ)} = 1 − n^{−Ω(δ)}.

Since R^1 is a deterministic function of X, this also implies that

Pr_{x ←_R X, r = r^1(x)} [|LExt(Y, r) − U_{m3}| = 0] ≥ 1 − n^{−Ω(δ)}.
Note that conditioned on X = x, H_{xy} is a linear function of Y, thus it is an affine source. Therefore with probability 1 − n^{−Ω(δ)} over the fixing of X = x, H_{xy} is an affine SR-source. Note that it has n^{1/2} rows and each row has length m3 = Ω(k) = Ω(n^{1/2+δ}). Therefore by Theorem 2.24,

|TAExt(X, Y) − U_m| < n^{−Ω(δ)} + 2^{−n^{Ω(1)}} = n^{−Ω(δ)},

and m = Ω(k).
Remark 5.4. Note that to compute H_{xy} we apply a strong linear seeded extractor to Y and each R^i, and R^i is a deterministic function of X. Thus we can use the same argument as before (the property of strong linear seeded extractors and the structure of affine sources) to show that our construction works even if (X, Y) is an affine block source instead of a pair of independent sources.
6 Conclusions and Open Problems
In this paper we study the problem of constructing extractors for independent weak random sources and affine sources. In the case of independent sources, we give an extractor for three independent (n, k) sources with k = n^{1/2+δ} for any constant 0 < δ < 1/2. This improves the previous best result of [Rao06], where the min-entropy is required to be at least n^{0.9}. We also give extractors for three independent sources with uneven lengths, where two of them can have any polynomially small min-entropy, or even polylogarithmic min-entropy, while the length of the third source is significantly smaller than the min-entropy of the other two. This improves the result of [RZ08].
In the case of affine extractors, we give an extractor for two independent affine (n, k) sources with k = n^{1/2+δ} for any constant 0 < δ < 1/2. In fact, our extractor works even if the input is an affine block source, with the first block being an (n, k) affine source and the second block being an (n, k) affine source conditioned on the fixing of the first one. We hope that this result can help us understand the nature of affine sources and build better deterministic extractors for affine sources.
The obvious open problem here is to build extractors for sources with smaller min-entropy. For example, it would be very interesting to construct extractors for three independent sources with any polynomially small min-entropy. In the case of affine sources, an extractor for two independent affine sources with any polynomially small entropy would also be interesting.
Another open problem is to reduce the error in our three source extractor and affine two source
extractor. We only achieve an error of 1/poly(n) in both of these cases, and it would be interesting
to find a way to decrease the error to be exponentially small.
Acknowledgements
We thank David Zuckerman for useful discussions.
References
[BIW04] Boaz Barak, Russell Impagliazzo, and Avi Wigderson. Extracting randomness using few independent sources. In Proc. of 45th FOCS, pages 384–393, 2004.

[BKS+05] Boaz Barak, Guy Kindler, Ronen Shaltiel, Benny Sudakov, and Avi Wigderson. Simulating independence: New constructions of condensers, Ramsey graphs, dispersers, and extractors. In Proc. of 37th STOC, pages 1–10, 2005.

[BRSW06] Boaz Barak, Anup Rao, Ronen Shaltiel, and Avi Wigderson. 2-source dispersers for n^{o(1)} entropy and Ramsey graphs beating the Frankl-Wilson construction. In Proc. of 38th STOC, 2006.

[Bou05] Jean Bourgain. More on the sum-product phenomenon in prime fields and its applications. International Journal of Number Theory, 1:1–32, 2005.

[Bou07] Jean Bourgain. On the construction of affine-source extractors. Geometric and Functional Analysis, 1:33–57, 2007.

[DKSS09] Zeev Dvir, Swastik Kopparty, Shubhangi Saraf, and Madhu Sudan. Extensions to the method of multiplicities, with applications to Kakeya sets and mergers. In Proc. of 50th FOCS, pages 181–190, 2009.

[DW08] Zeev Dvir and Avi Wigderson. Kakeya sets, new mergers and old extractors. In Proc. of 49th FOCS, 2008.

[FS02] Lance Fortnow and Ronen Shaltiel. Recent developments in explicit constructions of extractors, 2002.

[GR05] Ariel Gabizon and Ran Raz. Deterministic extractors for affine sources over large fields. In Proc. of 46th FOCS, 2005.

[GRS04] Ariel Gabizon, Ran Raz, and Ronen Shaltiel. Deterministic extractors for bit-fixing sources by obtaining an independent seed. In Proc. of 45th FOCS, 2004.

[GUV07] Venkatesan Guruswami, Christopher Umans, and Salil Vadhan. Unbalanced expanders and randomness extractors from Parvaresh-Vardy codes. In Proc. of 22nd IEEE Conference on Computational Complexity, 2007.

[KRVZ06] Jesse Kamp, Anup Rao, Salil Vadhan, and David Zuckerman. Deterministic extractors for small space sources. In Proc. of 38th STOC, 2006.

[KZ07] Jesse Kamp and David Zuckerman. Deterministic extractors for bit-fixing sources and exposure-resilient cryptography. SIAM Journal on Computing, 36(5):1231–1247, 2007.

[Li10] Xin Li. A new approach to affine extractors and dispersers. Technical Report TR10-064, ECCC: Electronic Colloquium on Computational Complexity, 2010.

[LRVW03] Chi-Jen Lu, Omer Reingold, Salil Vadhan, and Avi Wigderson. Extractors: Optimal up to constant factors. In Proc. of 35th STOC, pages 602–611, 2003.

[MW97] Ueli M. Maurer and Stefan Wolf. Privacy amplification secure against active adversaries. In Proc. of CRYPTO '97, 1997.

[NZ96] Noam Nisan and David Zuckerman. Randomness is linear in space. Journal of Computer and System Sciences, 52(1):43–52, 1996.

[Rao06] Anup Rao. Extractors for a constant number of polynomially small min-entropy independent sources. In Proc. of 38th STOC, 2006.

[Rao09] Anup Rao. Extractors for low-weight affine sources. In Proc. of 24th IEEE Conference on Computational Complexity, 2009.

[RZ08] Anup Rao and David Zuckerman. Extractors for three uneven-length sources. In Proc. of RANDOM 2008, 2008.

[Raz05] Ran Raz. Extractors with weak random seeds. In Proc. of 37th STOC, pages 11–20, 2005.

[RRV02] Ran Raz, Omer Reingold, and Salil Vadhan. Extracting all the randomness and reducing the error in Trevisan's extractors. Journal of Computer and System Sciences, 65(1):97–128, 2002.

[Tre01] Luca Trevisan. Extractors and pseudorandom generators. Journal of the ACM, 48(4):860–879, 2001.

[TV00] Luca Trevisan and Salil Vadhan. Extracting randomness from samplable distributions. In Proc. of 41st FOCS, pages 32–42, 2000.

[Yeh10] Amir Yehudayoff. Affine extractors over prime fields. Manuscript, 2010.

[Zuc07] David Zuckerman. Linear degree extractors and the inapproximability of max clique and chromatic number. Theory of Computing, 3:103–128, 2007.