Course Notes for Greedy Approximations
Math 663-601
Th. Schlumprecht
March 10, 2015
Contents

1 The Threshold Algorithm
  1.1 Greedy and Quasi Greedy Bases
  1.2 The Haar basis is greedy in L_p[0,1] and L_p(R)
  1.3 Quasi greedy but not unconditional

2 Greedy Algorithms In Hilbert Space
  2.1 Introduction
  2.2 Convergence
  2.3 Convergence Rates

3 Greedy Algorithms in general Banach Spaces
  3.1 Introduction
  3.2 Convergence of the Weak Dual Chebyshev Greedy Algorithm
  3.3 Weak Dual Greedy Algorithm with Relaxation
  3.4 Convergence Theorem for the Weak Dual Algorithm

4 Open Problems
  4.1 Greedy Bases
  4.2 Greedy Algorithms

5 Appendix A: Bases in Banach spaces
  5.1 Schauder bases
  5.2 Markushevich bases

6 Appendix B: Some facts about L_p[0,1] and L_p(R)
  6.1 The Haar basis and Wavelets
  6.2 Khintchine's inequality and Applications
Signals or images are often modeled as elements of some Banach space of functions, for example C(D), L_p(D), or more generally a Sobolev space W^{r,p}(D), for a domain D ⊂ R^d. These functions need to be "processed": approximated, converted into an object which is storable, such as a sequence of numbers, and then reconstructed.
This means finding an appropriate basis of the Banach space, or more generally a dictionary, and computing as many coordinates of the given function with respect to this basis as are necessary to satisfy the given error estimates. The question one then needs to solve is which coordinates to use, given a restriction on the budget.
Definition 0.0.1. Let X (always) be a separable and real Banach space. We call D ⊂ S_X a dictionary of X if span(D) is dense and x ∈ D implies that −x ∈ D.
An approximation algorithm is a map

  G : X → span(D)^N,  x ↦ G(x) = (G_n(x)),

with the property that for n ∈ N and x ∈ X there is a set Λ_{(n,x)} ⊂ D of cardinality at most n so that G_n(x) ∈ span(Λ_{(n,x)}). For n ∈ N we call G_n(x) the n-term approximation of x.
Usually G_n(x) is computed inductively by maximizing a certain quantity; therefore these algorithms are often called greedy algorithms.
Remark. If X has a basis (e_n) with biorthogonal functionals (e*_n), then G = (P_n), where P_n is the n-th canonical projection, is an example of an approximation algorithm. Nevertheless, the point is to be able to adapt the set Λ_n to the vector x, instead of letting it be independent of x.
The main questions are:
1) Does (G_n(x)) converge to x?
2) If so, how fast does it converge? How fast does it converge for certain x?
3) How does ‖x − G_n(x)‖ compare to the best n-term approximation defined by

  σ_n(x) = σ_n(x, D) = inf_{Λ⊂D, #Λ=n} inf_{z∈span(Λ)} ‖z − x‖?
Chapter 1
The Threshold Algorithm

1.1 Greedy and Quasi Greedy Bases
We start with the Threshold Algorithm:
Definition 1.1.1. Let X be a separable Banach space with a normalized M-basis ((e_i, e*_i) : i ∈ N); by normalized we mean that ‖e_i‖ = 1 for i ∈ N. For n ∈ N and x ∈ X let Λ_n ⊂ N be such that

  min_{i∈Λ_n} |e*_i(x)| ≥ max_{i∈N\Λ_n} |e*_i(x)|,

i.e. we reorder (e*_i(x)) into (e*_{σ_i}(x)) so that

  |e*_{σ_1}(x)| ≥ |e*_{σ_2}(x)| ≥ |e*_{σ_3}(x)| ≥ ...,

and put for n ∈ N

  Λ_n = {σ_1, σ_2, ..., σ_n}.

Then define for n ∈ N

  G^T_n(x) = Σ_{i∈Λ_n} e*_i(x) e_i.

(G^T_n) is called the Threshold Algorithm.
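In coordinate form the Threshold Algorithm simply keeps the n largest coefficients. A minimal sketch for a finitely supported vector (the function name and list representation are our own, not from the notes):

```python
def threshold_approximation(coeffs, n):
    """n-term threshold approximation G^T_n(x): keep the n coordinates
    with the largest absolute coefficients (the set Λ_n), zero out the rest."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:n])  # Λ_n
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]

x = [0.1, -2.0, 0.5, 1.5, -0.2]
print(threshold_approximation(x, 2))  # [0.0, -2.0, 0.0, 1.5, 0.0]
```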
Definition 1.1.2. A normalized M-basis (e_i) is called quasi-greedy if for all x ∈ X

  x = lim_{n→∞} G^T_n(x).   (QG)

A basis is called greedy if there is a constant C so that

  ‖x − G^T_n(x)‖ ≤ C σ_n(x),   (G)

where we define

  σ_n(x) = σ_n(x, (e_j)) = inf_{Λ⊂N, #Λ=n} inf_{z∈span(e_j : j∈Λ)} ‖z − x‖.
In that case we say that (ei ) is C-greedy. We call the smallest constant C for
which (G) holds the greedy constant of (en ) and denote it by Cg .
Remarks. Let ((e_i, e*_i) : i ∈ N) be a normalized M-basis.
1. From the property that (e_n) is fundamental we obtain that for every x ∈ X

  σ_n(x) → 0 as n → ∞;

it follows therefore that every greedy basis is quasi-greedy.
2. If (e_j) is an unconditional basis of X and x = Σ_{i=1}^∞ a_i e_i ∈ X, then

  x = lim_{n→∞} Σ_{j=1}^n a_{π(j)} e_{π(j)}

for any permutation π : N → N and thus, in particular, also for a greedy permutation, i.e. a permutation such that

  |a_{π(1)}| ≥ |a_{π(2)}| ≥ |a_{π(3)}| ≥ ....
Thus, an unconditional basis is always quasi-greedy.
3. Schauder bases come with a special order and might be reordered so that they cease to be a basis. But
• unconditional bases,
• M-bases,
• quasi-greedy M-bases,
• greedy bases
keep their properties under any permutation, and can therefore be indexed
by any countable set.
4. In order to obtain a quasi-greedy M-basis which is not a Schauder basis, one could take a quasi-greedy Schauder basis which is not unconditional (its existence will be shown later) and which admits a suitable reordering under which it is no longer a Schauder basis. Nevertheless, by the observations in (3), it will still be a quasi-greedy M-basis. But it seems unknown whether or not there is a quasi-greedy M-basis which cannot be reordered into a Schauder basis.
Examples 1.1.3.
1. If 1 ≤ p < ∞, then the unit vector basis (e_i) of ℓ_p is 1-greedy.
2. The unit vector basis (e_i) of c_0 is 1-greedy.
3. The summing basis (s_n) of c_0 (s_n = Σ_{j=1}^n e_j) is not quasi-greedy.
4. The unit vector basis of (ℓ_p ⊕ ℓ_q)_1 is not greedy (but 1-unconditional and thus quasi-greedy).
Proof. To prove (1) let x = Σ_{j=1}^∞ x_j e_j ∈ ℓ_p, let Λ_n ⊂ N be of cardinality n so that

  min{|x_j| : j ∈ Λ_n} ≥ max{|x_j| : j ∈ N \ Λ_n},

let Λ ⊂ N be any subset of cardinality n, and let z = Σ z_i e_i ∈ ℓ_p with

  supp(z) = {i ∈ N : z_i ≠ 0} ⊂ Λ.

Then

  ‖x − z‖_p^p = Σ_{j∈Λ} |x_j − z_j|^p + Σ_{j∈N\Λ} |x_j|^p
             ≥ Σ_{j∈Λ} |x_j − z_j|^p + Σ_{j∈N\Λ_n} |x_j|^p
             ≥ Σ_{j∈N\Λ_n} |x_j|^p = ‖G^T_n(x) − x‖_p^p,

since Λ_n contains n coordinates of maximal modulus. Thus

  σ_n(x) = inf{‖z − x‖_p : #supp(z) ≤ n} = ‖G^T_n(x) − x‖_p.
(2) can be shown in the same way as (1).
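In small dimensions the identity σ_n(x) = ‖G^T_n(x) − x‖_p can be verified by brute force over all n-element supports (the helper names are ours):

```python
from itertools import combinations

def lp_norm(v, p):
    return sum(abs(t) ** p for t in v) ** (1 / p)

x, p, n = [3.0, -1.0, 2.0, 0.5], 1.5, 2
# the best z supported on Λ matches x on Λ, so the error is the norm of x off Λ
errors = {A: lp_norm([xj for j, xj in enumerate(x) if j not in A], p)
          for A in combinations(range(len(x)), n)}
best = min(errors, key=errors.get)
print(sorted(best))  # [0, 2], the indices of the two largest |x_j|
```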
In order to show (3) we choose sequences (ε_j) ⊂ (0, 1) and (n_j) ⊂ N as follows:

  ε_{2j} = 2^{−j} and ε_{2j−1} = 2^{−j}(1 + 1/j³), for j ∈ N,

and

  n_j = j·2^j and N_j = Σ_{i=1}^j n_i, with N_0 = 0.

Note that the series

  x = Σ_{j=1}^∞ Σ_{i=N_{j−1}+1}^{N_j} (ε_{2j−1} s_{2i−1} − ε_{2j} s_{2i})
    = Σ_{j=1}^∞ Σ_{i=N_{j−1}+1}^{N_j} ((ε_{2j−1} − ε_{2j}) s_{2i−1} − ε_{2j} e_{2i})

converges, because

  Σ_{j=1}^∞ Σ_{i=N_{j−1}+1}^{N_j} ε_{2j} e_{2i}

converges in c_0 (its coefficients tend to 0), and

  Σ_{j=1}^∞ Σ_{i=N_{j−1}+1}^{N_j} (ε_{2j−1} − ε_{2j}) s_{2i−1}

converges absolutely, since

  Σ_{j=1}^∞ n_j (ε_{2j−1} − ε_{2j}) = Σ_{j=1}^∞ 1/j² < ∞.
Now we compute for l ∈ N_0 the vector x − G^T_{2N_l+n_{l+1}}(x):

  x − G^T_{2N_l+n_{l+1}}(x) = −Σ_{i=N_l+1}^{N_{l+1}} ε_{2l+2} s_{2i} + Σ_{j=l+2}^∞ Σ_{i=N_{j−1}+1}^{N_j} (ε_{2j−1} s_{2i−1} − ε_{2j} s_{2i}).

From the monotonicity of (s_i) we deduce that

  ‖Σ_{j=l+2}^∞ Σ_{i=N_{j−1}+1}^{N_j} (ε_{2j−1} s_{2i−1} − ε_{2j} s_{2i})‖ ≤ ‖x‖.

However,

  ‖Σ_{i=N_l+1}^{N_{l+1}} ε_{2l+2} s_{2i}‖ = Σ_{i=N_l+1}^{N_{l+1}} ε_{2l+2} = n_{l+1} ε_{2l+2} = l + 1 → ∞ as l → ∞,

which implies that (G^T_n(x)) is not convergent.
To show (4) assume w.l.o.g. that p < q, denote the unit vector basis of ℓ_p by (e_i) and the unit vector basis of ℓ_q by (f_j), and for n ∈ N put

  x(n) = Σ_{j=1}^n (1/2) e_j + Σ_{j=1}^n f_j.

Then

  G^T_n(x(n)) = Σ_{j=1}^n f_j, and thus ‖G^T_n(x(n)) − x(n)‖ = (1/2) n^{1/p}.

Nevertheless,

  σ_n(x(n)) ≤ ‖x(n) − Σ_{j=1}^n (1/2) e_j‖ = n^{1/q},

and since (1/2) n^{1/p} / n^{1/q} ↗ ∞ as n ↗ ∞, the basis {e_j : j ∈ N} ∪ {f_j : j ∈ N} cannot be greedy.
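The divergence of this ratio is elementary arithmetic; with illustrative exponents:

```python
p, q = 1.5, 3.0  # any 1 <= p < q will do
# the ratio (1/2) n^(1/p) / n^(1/q) = (1/2) n^(1/p - 1/q) grows without bound
ratios = [0.5 * n ** (1 / p) / n ** (1 / q) for n in (10, 1000, 100000)]
print(ratios)  # strictly increasing, since 1/p - 1/q > 0
```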
Remarks. With the arguments used in (4) of Examples 1.1.3 one can show that the usual bases of (⊕_{n=1}^∞ ℓ_q^n)_{ℓ_p} and ℓ_p(ℓ_q) = (⊕_{n=1}^∞ ℓ_q)_{ℓ_p} are also not greedy, but of course unconditional.
Now in [BCLT] it was shown that ℓ_p ⊕ ℓ_q has, up to permutation and up to isomorphic equivalence, a unique unconditional basis, namely the one indicated above. Since, as will be shown later, every greedy basis must be unconditional, the space does not have any greedy basis.
Due to a result in [DFOS], however, (⊕_{n=1}^∞ ℓ_q^n)_{ℓ_p} has a greedy basis if 1 < p, q < ∞. More precisely, the following was shown:
Let 1 ≤ p, q ≤ ∞.
a) If 1 < q < ∞ then the Banach space (⊕_{n=1}^∞ ℓ_p^n)_{ℓ_q} has a greedy basis.
b) If q = 1 or q = ∞, and p ≠ q, then (⊕_{n=1}^∞ ℓ_p^n)_{ℓ_q} does not have a greedy basis. Here we take the c_0-sum if q = ∞.
The question whether or not ℓ_p(ℓ_q) has a greedy basis is open and quite interesting.
The following result by Wojtaszczyk can be seen as the analogue, for quasi-greedy bases, of the characterization of Schauder bases by the uniform boundedness of the canonical projections.
Theorem 1.1.4. [Wo2] A bounded M-basis (e_i, e*_i), with ‖e_i‖ = 1, i ∈ N, of a Banach space X is quasi-greedy if and only if there is a constant C so that for any x ∈ X and any m ∈ N

  ‖G^T_m(x)‖ ≤ C‖x‖.   (1.1)

We call the smallest constant so that (1.1) is satisfied the greedy projection constant.
Remark. Theorem 1.1.4 is basically a uniform boundedness result. Nevertheless, since the G^T_m are nonlinear projections we need a direct proof.
We first need the following lemma:
Lemma 1.1.5. Assume there is no positive number C so that ‖G^T_m(x)‖ ≤ C‖x‖ for all x ∈ X and all m ∈ N. Then the following holds:
For every finite A ⊂ N and every K > 0 there is a finite B ⊂ N, which is disjoint from A, and a vector x = Σ_{j∈B} x_j e_j, such that ‖x‖ = 1 and ‖G^T_m(x)‖ ≥ K for some m ∈ N.
Proof. For a finite set F ⊂ N define P_F to be the coordinate projection onto span(e_i : i ∈ F) generated by the (e*_i), i.e.

  P_F : X → span(e_i : i ∈ F),  x ↦ P_F(x) = Σ_{j∈F} e*_j(x) e_j.

Since there are only finitely many subsets of A we can put

  M = max_{F⊂A} ‖P_F‖ = max_{F⊂A} sup_{x∈B_X} ‖Σ_{j∈F} e*_j(x) e_j‖ ≤ Σ_{j∈A} ‖e*_j‖·‖e_j‖ < ∞.

Let K_1 > 1 be such that (K_1 − M)/(M + 1) > K, and choose x_1 ∈ S_X ∩ span(e_j : j ∈ N) and k ∈ N so that ‖G^T_k(x_1)‖ ≥ K_1. We assume without loss of generality (after a suitable small perturbation) that all the nonzero numbers |e*_n(x_1)| are different from each other.
Then let x_2 = x_1 − P_A(x_1), and note that ‖x_2‖ ≤ M + 1 and G^T_k(x_1) = G^T_m(x_2) + P_F(x_1) for some m ≤ k and F ⊂ A. Thus ‖G^T_m(x_2)‖ ≥ K_1 − M, and if we define x_3 = x_2/‖x_2‖, we have ‖G^T_m(x_3)‖ ≥ (K_1 − M)/(M + 1) > K.
It follows that the support B of x = x_3 is disjoint from A and that ‖G^T_m(x)‖ > K.
Proof of Theorem 1.1.4. Let b = sup_i ‖e*_i‖.
“⇒” Assume there is no positive number C so that ‖G^T_m(x)‖ ≤ C‖x‖ for all x ∈ X and all m ∈ N.
Applying Lemma 1.1.5 we can choose recursively vectors y_1, y_2, ... in S_X ∩ span(e_j : j ∈ N) and numbers m_n ∈ N, so that the supports of the y_n, which we denote by B_n, are pairwise disjoint (recall that for z = Σ_{i=1}^∞ z_i e_i we call supp(z) = {i ∈ N : e*_i(z) ≠ 0} the support of z), and so that

  ‖G^T_{m_n}(y_n)‖ ≥ 2^n b^n Π_{j=1}^{n−1} ε_j^{−1},   (1.2)

where

  ε_j = min(2^{−j}, min{|e*_i(y_j)| : i ∈ B_j}).

Then we let

  x = Σ_{n=1}^∞ (Π_{j=1}^{n−1} (ε_j/b)) y_n

(which clearly converges) and write x as x = Σ_{j=1}^∞ x_j e_j. Since |e*_i(y_j)| ≤ b for i, j ∈ N,

  min{|x_i| : i ∈ ∪_{j=1}^n B_j} ≥ (Π_{j=1}^{n−1} (ε_j/b)) ε_n = (Π_{j=1}^n (ε_j/b)) b ≥ max{|x_i| : i ∈ N \ ∪_{j=1}^n B_j}.

We may assume w.l.o.g. that m_n ≤ #B_n for n ∈ N. Letting k_j = m_j + Σ_{i=1}^{j−1} #B_i, it follows that

  G^T_{k_j}(x) = Σ_{i=1}^{j−1} (Π_{s=1}^{i−1} (ε_s/b)) y_i + G^T_{m_j}((Π_{i=1}^{j−1} (ε_i/b)) y_j),

and thus by (1.2)

  ‖G^T_{k_j}(x)‖ ≥ ‖G^T_{m_j}((Π_{i=1}^{j−1} (ε_i/b)) y_j)‖ − Σ_{i=1}^{j−1} (Π_{s=1}^{i−1} (ε_s/b)) ‖y_i‖ ≥ 2^j b,

which implies that (G^T_n(x)) does not converge.
“⇐” Let C > 0 be such that ‖G^T_m(x)‖ ≤ C‖x‖ for all m ∈ N and all x ∈ X. Let x ∈ X and assume w.l.o.g. that supp(x) is infinite. For ε > 0 choose x′ with finite support A so that ‖x − x′‖ < ε. Using small perturbations we can assume that A ⊂ supp(x) and that A ⊂ supp(x − x′). We can therefore choose m ∈ N large enough so that G^T_m(x) and G^T_m(x − x′) are of the form

  G^T_m(x) = Σ_{i∈B} e*_i(x) e_i and G^T_m(x − x′) = Σ_{i∈B} e*_i(x − x′) e_i,

with B ⊂ N such that A ⊂ B. It follows therefore that

  ‖x − G^T_m(x)‖ ≤ ‖x − x′‖ + ‖x′ − G^T_m(x)‖ = ‖x − x′‖ + ‖G^T_m(x′ − x)‖ ≤ (1 + C)ε,

which implies our claim by choosing ε > 0 arbitrarily small.
Definition 1.1.6. An M-basis (e_j, e*_j) is called unconditional for constant coefficients if there is a positive constant C so that for all finite sets A ⊂ N and all (σ_n : n ∈ A) ⊂ {±1} we have

  (1/C) ‖Σ_{n∈A} e_n‖ ≤ ‖Σ_{n∈A} σ_n e_n‖ ≤ C ‖Σ_{n∈A} e_n‖.
Proposition 1.1.7. A quasi-greedy M basis (en , e∗n ) is unconditional for constant
coefficients. Actually the constant in Definition 1.1.6 can be chosen to be equal
to twice the projection constant in Theorem 1.1.4.
Remark. We will show later that there are quasi-greedy bases which are not
unconditional. Actually there are Banach spaces which do not contain any unconditional basic sequence, but in which every normalized weakly null sequence
contains a quasi-greedy subsequence.
Proof of Proposition 1.1.7. Let A ⊂ N be finite and (σ_n : n ∈ A) ⊂ {±1}. Let δ ∈ (0, 1) and put m = #{j ∈ A : σ_j = +1}. Then we obtain

  ‖Σ_{n∈A, σ_n=+1} e_n‖ = ‖G^T_m( Σ_{n∈A, σ_n=+1} e_n + Σ_{n∈A, σ_n=−1} (1 − δ) e_n )‖
    ≤ C ‖Σ_{n∈A, σ_n=+1} e_n + Σ_{n∈A, σ_n=−1} (1 − δ) e_n‖.

By taking δ > 0 arbitrarily small, we obtain that

  ‖Σ_{n∈A, σ_n=+1} e_n‖ ≤ C ‖Σ_{n∈A} e_n‖.

Similarly,

  ‖Σ_{n∈A, σ_n=−1} e_n‖ ≤ C ‖Σ_{n∈A} e_n‖,

and thus

  ‖Σ_{n∈A} σ_n e_n‖ ≤ 2C ‖Σ_{n∈A} e_n‖.
We now present a characterization of greedy bases obtained by Konyagin and
Temliakov. We need the following notation.
Definition 1.1.8. We call a normalized basic sequence democratic if there is a constant C so that for all finite E, F ⊂ N with #E = #F it follows that

  ‖Σ_{j∈E} e_j‖ ≤ C ‖Σ_{j∈F} e_j‖.   (1.3)

In that case we call the smallest constant so that (1.3) holds the constant of democracy of (e_i) and denote it by C_d.
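The unit vector basis of ℓ_p is the model case: ‖Σ_{j∈A} e_j‖_p = (#A)^{1/p} for every finite A, so it is 1-democratic. A quick check (helper names ours):

```python
def lp_norm(v, p):
    return sum(abs(t) ** p for t in v) ** (1 / p)

def indicator(A, dim):
    # the vector sum_{j in A} e_j, viewed in R^dim
    return [1.0 if j in A else 0.0 for j in range(dim)]

p = 2.5
norms = [lp_norm(indicator(A, 12), p) for A in ({0, 5, 9}, {2, 3, 7}, {1, 4, 11})]
print(norms)  # all equal 3**(1/2.5) ≈ 1.552, independent of the index set
```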
Theorem 1.1.9. [KT1] A normalized basis (e_n) is greedy if and only if it is unconditional and democratic. In this case

  max(C_s, C_d) ≤ C_g ≤ C_d C_s C_u² + C_u,   (1.4)

where C_u is the unconditional constant and C_s is the suppression constant.
Remark. The proof will show that the first inequality is sharp. Recently it was
shown in [DOSZ1] that the second inequality is also sharp.
Proof of Theorem 1.1.9. “⇐” Let x = Σ e*_i(x) e_i ∈ X, n ∈ N, and let η > 0. Choose x̃ = Σ_{i∈Λ*_n} a_i e_i with #Λ*_n = n which is up to η the best n-term approximation to x (since we allow the a_i to be 0, we can assume that #Λ*_n is exactly n), i.e.

  ‖x − x̃‖ ≤ σ_n(x) + η.   (1.5)

Let Λ_n be a set of n coordinates for which

  b := min_{i∈Λ_n} |e*_i(x)| ≥ max_{i∈N\Λ_n} |e*_i(x)| and G^T_n(x) = Σ_{i∈Λ_n} e*_i(x) e_i.

We need to show that

  ‖x − G^T_n(x)‖ ≤ (C_d C_s C_u² + C_u)(σ_n(x) + η).

Now

  x − G^T_n(x) = Σ_{i∈N\Λ_n} e*_i(x) e_i = Σ_{i∈Λ*_n\Λ_n} e*_i(x) e_i + Σ_{i∈N\(Λ*_n∪Λ_n)} e*_i(x) e_i.

But we have

  ‖Σ_{i∈Λ*_n\Λ_n} e*_i(x) e_i‖ ≤ b C_u ‖Σ_{i∈Λ*_n\Λ_n} e_i‖   (by Proposition 5.1.11)   (1.6)
    ≤ b C_u C_d ‖Σ_{i∈Λ_n\Λ*_n} e_i‖   [note that #(Λ_n \ Λ*_n) = #(Λ*_n \ Λ_n)]
    ≤ C_u² C_d ‖Σ_{i∈Λ_n\Λ*_n} e*_i(x) e_i‖   [note that |e*_i(x)| ≥ b if i ∈ Λ_n \ Λ*_n]
    ≤ C_s C_u² C_d ‖Σ_{i∈Λ*_n} (e*_i(x) − a_i) e_i + Σ_{i∈N\Λ*_n} e*_i(x) e_i‖
    = C_s C_u² C_d ‖x − x̃‖ ≤ C_s C_u² C_d (σ_n(x) + η),

and

  ‖Σ_{i∈N\(Λ*_n∪Λ_n)} e*_i(x) e_i‖ ≤ C_s ‖Σ_{i∈Λ*_n} (e*_i(x) − a_i) e_i + Σ_{i∈N\Λ*_n} e*_i(x) e_i‖ = C_s ‖x − x̃‖ ≤ C_s (σ_n(x) + η).   (1.7)

This shows that (e_i) is greedy and, since η > 0 is arbitrary, we deduce that C_g ≤ C_s C_u² C_d + C_s.
“⇒” Assume that (e_i) is greedy. In order to show that (e_i) is democratic let Λ_1, Λ_2 ⊂ N with #Λ_1 = #Λ_2. Let η > 0, put m = #(Λ_2 \ Λ_1), and

  x = Σ_{i∈Λ_1} e_i + (1 + η) Σ_{i∈Λ_2\Λ_1} e_i.

Then it follows that

  ‖Σ_{i∈Λ_1} e_i‖ = ‖x − G^T_m(x)‖
    ≤ C_g σ_m(x)   (since (e_i) is C_g-greedy)
    ≤ C_g ‖x − Σ_{i∈Λ_1\Λ_2} e_i‖ = C_g ‖Σ_{i∈Λ_1∩Λ_2} e_i + (1 + η) Σ_{i∈Λ_2\Λ_1} e_i‖.

Since η > 0 can be taken arbitrarily small, we deduce that

  ‖Σ_{i∈Λ_1} e_i‖ ≤ C_g ‖Σ_{i∈Λ_2} e_i‖.

Thus it follows that (e_i) is democratic and C_d ≤ C_g.
In order to show that (e_i) is unconditional let x = Σ e*_i(x) e_i ∈ X have finite support S. Let Λ ⊂ S and put

  y = Σ_{i∈Λ} e*_i(x) e_i + b Σ_{i∈S\Λ} e_i,

with b > max_{i∈S} |e*_i(x)|. For n = #(S \ Λ) it follows that

  G^T_n(y) = b Σ_{i∈S\Λ} e_i,

and since (e_i) is greedy we deduce (note that #supp(y − x) = n) that

  ‖Σ_{i∈Λ} e*_i(x) e_i‖ = ‖y − G^T_n(y)‖ ≤ C_g σ_n(y) ≤ C_g ‖y − (y − x)‖ = C_g ‖x‖,

which implies that (e_i) is unconditional with C_s ≤ C_g.
1.2 The Haar basis is greedy in L_p[0,1] and L_p(R)
Theorem 1.2.1. For 1 < p < ∞ there are constants c_p ≤ C_p, depending only on p, so that for all n ∈ N and all A ⊂ T with #A = n

  c_p n^{1/p} ≤ ‖Σ_{t∈A} h_t^{(p)}‖ ≤ C_p n^{1/p}.

In particular (h_t^{(p)})_{t∈T} is democratic in L_p[0,1].
With Theorem 1.1.9 and Theorem 6.1.1 we deduce that
Corollary 1.2.2. The Haar basis of L_p[0,1], 1 < p < ∞, is greedy.
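For Haar functions with pairwise disjoint supports the bound is exact: each L_p-normalized h_t satisfies ∫|h_t|^p = 1, so ‖Σ_{t∈A} h_t‖_p = n^{1/p} whatever the scales involved. A grid-based sanity check (the discretization and the index convention t = (n, i), with support [i2^{−n}, (i+1)2^{−n}), are our own):

```python
import numpy as np

K = 10  # dyadic grid of 2**K points on [0, 1)

def haar(n, i, p):
    """L_p-normalized Haar function h_{(n,i)} sampled on the grid."""
    h = np.zeros(2 ** K)
    w = 2 ** (K - n)                         # grid points in the support
    h[i * w: i * w + w // 2] = 2 ** (n / p)
    h[i * w + w // 2: (i + 1) * w] = -(2 ** (n / p))
    return h

def lp_norm(f, p):
    return (np.sum(np.abs(f) ** p) / 2 ** K) ** (1 / p)

p = 3.0
# two 4-element families with pairwise disjoint supports, at very different scales
A = [(2, 0), (2, 1), (2, 2), (2, 3)]
B = [(3, 0), (4, 2), (5, 8), (6, 24)]
for S in (A, B):
    total = sum(haar(n, i, p) for n, i in S)
    print(round(lp_norm(total, p), 4))  # both print 1.5874 = 4**(1/3) rounded
```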
The proof will follow from the following three lemmas.

Lemma 1.2.3. For any 0 < q < ∞ there is a d_q > 0 so that the following holds: Let n_1 < n_2 < ... < n_k be integers and let E_j ⊂ [0,1] be measurable for j = 1, ..., k. Then

  ∫_0^1 (Σ_{j=1}^k 2^{n_j/q} 1_{E_j}(x))^q dx ≤ d_q Σ_{j=1}^k 2^{n_j} m(E_j).
j=1
Proof. Define
f (x) =
k
X
2nj /q 1Ej (x).
j=1
For j = 1, . . . k write Ej0 = Ej \
f (x) ≤
j
X
2ni /q ≤
i=1
Sk
i=j+1 Ei .
nj
X
2i/q =
i=1
It follows that for x ∈ Ej0
2(nj +1)/q − 1
21/q
2nj /q .
≤
1/q − 1
21/q − 1
2
| {z }
1/q
dq
Thus
Z
1
f (x)q dx ≤ dq
0
k
X
2ni m(Ei0 ) ≤ dq
i=1
k
X
2nj m(Ej ),
j=1
which finishes the proof.
Lemma 1.2.4. For 1 < p < ∞ there is a C_p > 0 so that for all n ∈ N, A ⊂ T with #A = n, and (ε_t) ⊂ {−1, 1} it follows that

  ‖Σ_{t∈A} ε_t h_t^{(p)}‖_p ≤ C_p n^{1/p}.
Proof. Abbreviate h_t = h_t^{(p)} for t ∈ T. Let n_1 < n_2 < ... < n_k be all the integers n_i for which there is a t ∈ A so that m(supp(h_t)) = 2^{−n_i}. For j = 1, ..., k put

  E_j = ∪ { supp(h_{(n_j,i)}) : i ∈ {0, 1, ..., 2^{n_j} − 1}, (n_j, i) ∈ A }.

Since

  m(E_j) = 2^{−n_j} #{i ∈ {0, 1, ..., 2^{n_j} − 1} : (n_j, i) ∈ A},

we have

  #{i ∈ {0, 1, ..., 2^{n_j} − 1} : (n_j, i) ∈ A} = 2^{n_j} m(E_j).

It follows therefore that

  n = Σ_{j=1}^k #{i ∈ {0, 1, ..., 2^{n_j} − 1} : (n_j, i) ∈ A} = Σ_{j=1}^k 2^{n_j} m(E_j)  if 0 ∉ A,
  n = 1 + Σ_{j=1}^k 2^{n_j} m(E_j)  if 0 ∈ A.

Assume without loss of generality that 0 ∉ A. It follows that

  ‖Σ_{t∈A} ε_t h_t‖_p ≤ [∫_0^1 (Σ_{j=1}^k 2^{n_j/p} 1_{E_j})^p dx]^{1/p} ≤ d_p^{1/p} [Σ_{j=1}^k 2^{n_j} m(E_j)]^{1/p} = d_p^{1/p} n^{1/p},

with d_p as in Lemma 1.2.3.
Lemma 1.2.5. For 1 < p < ∞ there is a c_p > 0 so that for all n ∈ N, A ⊂ T with #A = n, and (ε_t) ⊂ {−1, 1} it follows that

  ‖Σ_{t∈A} ε_t h_t^{(p)}‖_p ≥ c_p n^{1/p}.
Proof. Note that for 1 < p, q < ∞ with 1/p + 1/q = 1 and s, t ∈ T it follows that

  ⟨h_t^{(p)}, h_s^{(q)}⟩ = δ_{(t,s)};

thus the claim follows from the fact that the h_t^{(p)} are normalized in L_p[0,1], from Lemma 1.2.4, and from the duality between L_p[0,1] and L_q[0,1]. Indeed,

  ‖Σ_{t∈A} ε_t h_t^{(p)}‖_p ≥ ⟨ Σ_{t∈A} ε_t h_t^{(p)}, (Σ_{t∈A} ε_t h_t^{(q)}) / ‖Σ_{t∈A} ε_t h_t^{(q)}‖_q ⟩ = n / ‖Σ_{t∈A} ε_t h_t^{(q)}‖_q ≥ n / (C_q n^{1/q}) = n^{1/p} / C_q,

where C_q is the constant from Lemma 1.2.4 applied with q in place of p. Our claim follows therefore by letting c_p = 1/C_q.
1.3 A quasi greedy basis of L_p[0,1] which is not unconditional
In this section we make the general assumption that the separable Banach space X has a normalized basis (e_n) which is Besselian, meaning that for some constant C_B

  ‖x‖ = ‖Σ_{j=1}^∞ e*_j(x) e_j‖ ≥ (1/C_B) (Σ_{j=1}^∞ |e*_j(x)|²)^{1/2}  for all x ∈ X,   (1.8)

where (e*_j) denote the coordinate functionals of (e_j). Secondly, we assume that (e_j) has a subsequence (e_{m_j} : j ∈ N) which is Hilbertian, which means that for some constant C_H

  ‖Σ_{j=1}^∞ e*_{m_j}(x) e_{m_j}‖ ≤ C_H (Σ_{j=1}^∞ |e*_{m_j}(x)|²)^{1/2}  for all x ∈ span(e_{m_j} : j ∈ N).   (1.9)
Example 1.3.1. An example of such a basic sequence is the sequence of trigonometric polynomials (t_n : n ∈ Z) in L_p[0,1] with p > 2. Indeed, for (a_n : |n| ≤ N) ⊂ C it follows from Hölder's (or Jensen's) inequality that

  (∫_0^1 |Σ_{n=−N}^N a_n e^{2πinξ}|^p dξ)^{1/p} ≥ (∫_0^1 |Σ_{n=−N}^N a_n e^{2πinξ}|² dξ)^{1/2} = (Σ_{n=−N}^N |a_n|²)^{1/2}.

Secondly, it follows from the complex version of Khintchine's inequality (Theorem 6.2.4) that the subsequence (t_{2^n} : n ∈ N) of the trigonometric polynomials is equivalent to the ℓ_2-unit vector basis.
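The first inequality above (the Besselian property (1.8) for the trigonometric system) is just the power-mean inequality ‖f‖_p ≥ ‖f‖_2 for p > 2, combined with Parseval. A numerical sanity check on a random trigonometric polynomial (the discretization is our own; the equispaced grid makes Parseval exact here because the degree is far below the grid size):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 8, 4096
grid = np.arange(M) / M
a = rng.standard_normal(2 * N + 1) + 1j * rng.standard_normal(2 * N + 1)
f = sum(a[k] * np.exp(2j * np.pi * (k - N) * grid) for k in range(2 * N + 1))

p = 4.0
Lp = np.mean(np.abs(f) ** p) ** (1 / p)   # L_p norm on [0, 1]
L2 = np.mean(np.abs(f) ** 2) ** 0.5       # L_2 norm on [0, 1]
l2 = np.sqrt(np.sum(np.abs(a) ** 2))      # l_2 norm of the coefficients
print(Lp >= L2, np.isclose(L2, l2))       # True True
```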
We recall the 2^n × 2^n matrices A^{(n)} = (a^{(n)}_{(i,j)} : 1 ≤ i, j ≤ 2^n), n ∈ N, which were introduced in Section 5.2. We will need the following two properties:

  A^{(n)} is a unitary operator on ℓ_2^{2^n},   (1.10)
  a^{(n)}_{(j,1)} = 2^{−n/2} for all j.   (1.11)

For k ∈ N we put n_k = 2^{2^k} and B^{(k)} = (b^{(k)}_{(i,j)} : 1 ≤ i, j ≤ n_k) = A^{(2^k)} (acting on ℓ_2^{n_k}), which implies that n_{k+1} = n_k². We let

  (h_j : j ∈ N) = (e_{m_j} : j ∈ N) and (f_i : i ∈ N) = (e_s : s ∈ N \ {m_i : i ∈ N}),

so that if f_i = e_s and f_j = e_t then i < j if and only if s < t. For k ∈ N we define a family (g_j^{(k)} : j = 1, 2, ..., n_k) as follows:

  g_1^{(k)} = f_k and g_i^{(k)} = h_{S_{k−1}+i−1} for i = 2, 3, ..., n_k,
where S_0 = 0 and, inductively, S_j = S_{j−1} + n_j − 1. If we order (g_j^{(k)} : k ∈ N, j = 1, 2, ..., n_k) lexicographically, the sequence

  g_1^{(1)}, g_2^{(1)}, ..., g_{n_1}^{(1)}, g_1^{(2)}, ..., g_{n_2}^{(2)}, g_1^{(3)}, ...

is equal to the sequence

  f_1, h_1, h_2, ..., h_{n_1−1}, f_2, h_{n_1}, ..., h_{n_2−2}, f_3, ....

Then we define for k ∈ N a new system of elements (ψ_j^{(k)} : j = 1, 2, ..., n_k) by

  (ψ_1^{(k)}, ψ_2^{(k)}, ..., ψ_{n_k}^{(k)})^T = B^{(k)} ∘ (g_1^{(k)}, g_2^{(k)}, ..., g_{n_k}^{(k)})^T,   (1.12)

or, in other words,

  ψ_i^{(k)} = Σ_{j=1}^{n_k} b^{(k)}_{(i,j)} g_j^{(k)}  for i = 1, 2, ..., n_k.
Our goal is now to prove the following result.

Theorem 1.3.2. Ordered lexicographically, the system (ψ_j^{(k)} : k ∈ N, j = 1, 2, ..., n_k) is a quasi-greedy basis of X.

Proposition 1.3.3. Ordered lexicographically, (g_j^{(k)} : k ∈ N, j = 1, 2, ..., n_k) is a Besselian basis of X.

Proof. Given that (g_j^{(k)} : k ∈ N, j = 1, 2, ..., n_k) is a reordering of (e_j), which was assumed to be a Besselian basis of X, we only need to show that it is a basic sequence.
To do so we need to show that there is a constant C ≥ 1 so that for all N ∈ N, all M ∈ {1, 2, ..., n_N}, and all scalars (c_j^{(k)} : k ∈ N, j = 1, 2, ..., n_k) with

  #{(k, j) : k ∈ N, j = 1, 2, ..., n_k, c_j^{(k)} ≠ 0} < ∞,

it follows that

  ‖Σ_{k=1}^{N−1} Σ_{j=1}^{n_k} c_j^{(k)} g_j^{(k)} + Σ_{j=1}^M c_j^{(N)} g_j^{(N)}‖ ≤ C ‖Σ_{k=1}^∞ Σ_{j=1}^{n_k} c_j^{(k)} g_j^{(k)}‖.   (1.13)
Since the g_j^{(k)} are a reordering of the original basis (e_j) we can write

  x = Σ_{k=1}^∞ Σ_{j=1}^{n_k} c_j^{(k)} g_j^{(k)}  as  x = Σ_{i=1}^∞ c_i e_i,

where c_i = c_j^{(k)} if e_i = g_j^{(k)} (for each i ∈ N there is exactly one such choice of k and j ∈ {1, 2, ..., n_k}). From (1.8) and (1.9) we deduce that

  ‖Σ_{k=1}^{N−1} Σ_{j=2}^{n_k} c_j^{(k)} g_j^{(k)} + Σ_{j=2}^M c_j^{(N)} g_j^{(N)}‖ ≤ C_H (Σ_{k=1}^{N−1} Σ_{j=2}^{n_k} |c_j^{(k)}|² + Σ_{j=2}^M |c_j^{(N)}|²)^{1/2}   (1.14)
    ≤ C_H (Σ_{k=1}^∞ Σ_{j=2}^{n_k} |c_j^{(k)}|²)^{1/2} ≤ C_H C_B ‖x‖.
Since g_1^{(j)} = f_j = e_{s_j}, where (s_j) consists of the elements of N \ {m_j : j ∈ N} ordered increasingly, we can write

  Σ_{k=1}^N c_1^{(k)} g_1^{(k)} = Σ_{j=1}^{s_N} c_j e_j − Σ_{i∈{1,2,...,s_N}\{s_j : j≤N}} c_i e_i = Σ_{j=1}^{s_N} c_j e_j − Σ_{(k,j)∈A} c_j^{(k)} g_j^{(k)},

for some set A ⊂ {(k, j) : k ∈ N, j = 2, 3, ..., n_k}. If C_e is the basis constant of (e_j) we deduce therefore that

  ‖Σ_{j=1}^{s_N} c_j e_j‖ ≤ C_e ‖x‖,

and thus, using (1.14),

  ‖Σ_{k=1}^N c_1^{(k)} g_1^{(k)}‖ ≤ ‖Σ_{j=1}^{s_N} c_j e_j‖ + ‖Σ_{(k,j)∈A} c_j^{(k)} g_j^{(k)}‖ ≤ (C_e + C_B C_H) ‖x‖.
This implies that

  ‖Σ_{k=1}^{N−1} Σ_{j=1}^{n_k} c_j^{(k)} g_j^{(k)} + Σ_{j=1}^M c_j^{(N)} g_j^{(N)}‖
    ≤ ‖Σ_{k=1}^N c_1^{(k)} g_1^{(k)}‖ + ‖Σ_{k=1}^{N−1} Σ_{j=2}^{n_k} c_j^{(k)} g_j^{(k)} + Σ_{j=2}^M c_j^{(N)} g_j^{(N)}‖
    ≤ (C_e + C_B C_H) ‖x‖ + C_B C_H ‖x‖,

which implies our claim with C = C_e + 2 C_B C_H.
Proposition 1.3.4. Under the lexicographical order, (ψ_j^{(k)} : k ∈ N, j = 1, 2, ..., n_k) is a Besselian basis of X with the same constant C_B.
Proof. We first note that for k ∈ N

  X_k := span(ψ_j^{(k)} : j = 1, 2, ..., n_k) = span(g_j^{(k)} : j = 1, 2, ..., n_k),

and thus (ψ_j^{(k)} : k ∈ N, j = 1, 2, ..., n_k) spans, as does (g_j^{(k)} : k ∈ N, j = 1, 2, ..., n_k), a dense subspace of X.
Secondly, let (d_j^{(k)} : k ∈ N, j = 1, 2, ..., n_k) ⊂ K with

  #{(k, j) : k ∈ N, j = 1, 2, ..., n_k, d_j^{(k)} ≠ 0} < ∞,

and let

  x = Σ_{k=1}^∞ Σ_{j=1}^{n_k} d_j^{(k)} ψ_j^{(k)},  or, in g-coordinates,  x = Σ_{k=1}^∞ Σ_{j=1}^{n_k} c_j^{(k)} g_j^{(k)}.

We write x = Σ_{k=1}^∞ x_k with

  x_k = Σ_{j=1}^{n_k} d_j^{(k)} ψ_j^{(k)} = Σ_{j=1}^{n_k} c_j^{(k)} g_j^{(k)}.
Since

  x_k = Σ_{i=1}^{n_k} d_i^{(k)} ψ_i^{(k)} = Σ_{i=1}^{n_k} d_i^{(k)} Σ_{j=1}^{n_k} b^{(k)}_{(i,j)} g_j^{(k)} = Σ_{j=1}^{n_k} g_j^{(k)} Σ_{i=1}^{n_k} b^{(k)}_{(i,j)} d_i^{(k)} = Σ_{j=1}^{n_k} c_j^{(k)} g_j^{(k)},

this means that

  (c_j^{(k)} : j = 1, 2, ..., n_k) = (B^{(k)})^{−1} (d_j^{(k)} : j = 1, 2, ..., n_k),

or

  (d_j^{(k)} : j = 1, 2, ..., n_k) = B^{(k)} (c_j^{(k)} : j = 1, 2, ..., n_k).
If we project x onto its first L coordinates in the lexicographical order of (ψ_j^{(k)} : k ∈ N, j = 1, ..., n_k), say L = Σ_{k=1}^{N−1} n_k + M with N ∈ N and M ≤ n_N, the projected vector equals

  Σ_{k=1}^{N−1} Σ_{j=1}^{n_k} d_j^{(k)} ψ_j^{(k)} + Σ_{j=1}^M d_j^{(N)} ψ_j^{(N)} = Σ_{k=1}^{N−1} Σ_{j=1}^{n_k} c_j^{(k)} g_j^{(k)} + Σ_{j=1}^M d_j^{(N)} ψ_j^{(N)}.
Therefore we only need to show that there is a constant C ≥ 1 so that for all k and all M ≤ n_k

  ‖Σ_{j=1}^M d_j^{(k)} ψ_j^{(k)}‖ ≤ C ‖Σ_{j=1}^{n_k} d_j^{(k)} ψ_j^{(k)}‖,   (1.15)

and that (ψ_j^{(k)}) is Besselian.
It follows from the assumption that the matrices B^{(k)} are unitary, and from Proposition 1.3.3, that

  ‖x‖ = ‖Σ_{k=1}^∞ x_k‖ ≥ (1/C_B)(Σ_{k=1}^∞ Σ_{j=1}^{n_k} |c_j^{(k)}|²)^{1/2} = (1/C_B)(Σ_{k=1}^∞ Σ_{j=1}^{n_k} |d_j^{(k)}|²)^{1/2},

which proves that (ψ_j^{(k)}) is Besselian. Secondly we note that (1.11) yields

  ‖Σ_{i=1}^M d_i^{(k)} ψ_i^{(k)}‖ ≤ Σ_{i=1}^M |d_i^{(k)}| n_k^{−1/2} ‖g_1^{(k)}‖ + ‖Σ_{i=1}^M d_i^{(k)} Σ_{j=2}^{n_k} b^{(k)}_{(i,j)} g_j^{(k)}‖
    ≤ (Σ_{i=1}^M |d_i^{(k)}|²)^{1/2} + C_H (Σ_{j=2}^{n_k} |Σ_{i=1}^M d_i^{(k)} b^{(k)}_{(i,j)}|²)^{1/2}   (by Cauchy–Schwarz and (1.9))
    ≤ (Σ_{i=1}^M |d_i^{(k)}|²)^{1/2} + C_H (Σ_{i=1}^M |d_i^{(k)}|²)^{1/2}   (by (1.10)).

Therefore, applying (1.10) and then (1.8), it follows that

  ‖Σ_{i=1}^M d_i^{(k)} ψ_i^{(k)}‖ ≤ (1 + C_H)(Σ_{i=1}^{n_k} |d_i^{(k)}|²)^{1/2} = (1 + C_H)(Σ_{i=1}^{n_k} |c_i^{(k)}|²)^{1/2} ≤ (1 + C_H) C_B ‖x_k‖,

which proves our claim.
The last step in proving Theorem 1.3.2 is the following.

Proposition 1.3.5. (ψ_j^{(k)} : k ∈ N, j = 1, 2, ..., n_k) is quasi-greedy.
Proof. Let

  x = Σ_{k=1}^∞ Σ_{i=1}^{n_k} d_i^{(k)} ψ_i^{(k)} ∈ X,

with ‖x‖ = 1, and suppose that the m-th greedy approximant is given by

  G^T_m(x, Ψ) = Σ_{k∈J} Σ_{i∈I_k} d_i^{(k)} ψ_i^{(k)},

where m = Σ_{k∈J} #I_k. We need to show that there is a constant C ≥ 1 (of course not depending on x and m) so that

  ‖G^T_m(x, Ψ)‖ ≤ C ‖x‖.   (1.16)
We write G^T_m(x, Ψ) as

  G^T_m(x, Ψ) = Σ_{k∈J} Σ_{i∈I_k} d_i^{(k)} (ψ_i^{(k)} − b^{(k)}_{(i,1)} f_k) + Σ_{k∈J} Σ_{i∈I_k} d_i^{(k)} b^{(k)}_{(i,1)} f_k =: Σ_1 + Σ_2

(recall that g_1^{(k)} = f_k). From the definition of the ψ_j^{(k)} we get that

  Σ_1 = Σ_{k∈J} Σ_{i∈I_k} d_i^{(k)} Σ_{j=2}^{n_k} b^{(k)}_{(i,j)} g_j^{(k)} = Σ_{k∈J} Σ_{j=2}^{n_k} g_j^{(k)} Σ_{i∈I_k} d_i^{(k)} b^{(k)}_{(i,j)},

which yields, by the choice of the g_j^{(k)} and properties (1.9) and (1.10), that

  ‖Σ_1‖ ≤ C_H (Σ_{k∈J} Σ_{j=2}^{n_k} |Σ_{i∈I_k} d_i^{(k)} b^{(k)}_{(i,j)}|²)^{1/2}   (1.17)
    ≤ C_H (Σ_{k∈J} ‖(B^{(k)})^{−1} ∘ (d_i^{(k)} : i ∈ I_k)‖_2²)^{1/2}
    = C_H (Σ_{k∈J} ‖(d_i^{(k)} : i ∈ I_k)‖_2²)^{1/2}   (by (1.10))
    ≤ C_H C_B ‖x‖   (by Proposition 1.3.4),

where (d_i^{(k)} : i ∈ I_k) denotes the vector in ℓ_2^{n_k} whose i-th coordinate is d_i^{(k)} for i ∈ I_k and 0 otherwise.
In order to estimate Σ_2 we split each I_k, k ∈ J, into the following subsets:

  I_k^{(1)} = {i ∈ I_k : |d_i^{(k)}| ≤ n_k^{−1}},
  I_k^{(2)} = {i ∈ I_k : |d_i^{(k)}| ≥ n_k^{−1/2}},
  I_k^{(3)} = {i ∈ I_k : n_k^{−1} < |d_i^{(k)}| < n_k^{−1/2}},

and let

  Σ_2^{(s)} = Σ_{k∈J} Σ_{i∈I_k^{(s)}} d_i^{(k)} b^{(k)}_{(i,1)} f_k  for s = 1, 2, 3.

From the definition of I_k^{(1)} and (1.11) it follows that

  |Σ_{i∈I_k^{(1)}} d_i^{(k)} b^{(k)}_{(i,1)}| ≤ n_k^{−1/2},

and thus

  ‖Σ_2^{(1)}‖ ≤ Σ_{k∈J} n_k^{−1/2} ≤ 1.   (1.18)
In order to estimate Σ_2^{(2)} we first note that the definition of I_k^{(2)} yields

  (#I_k^{(2)}) n_k^{−1} ≤ Σ_{i∈I_k^{(2)}} |d_i^{(k)}|² ≤ Σ_{i=1}^{n_k} |d_i^{(k)}|²,

and thus

  ‖Σ_2^{(2)}‖ = ‖Σ_{k∈J} Σ_{i∈I_k^{(2)}} d_i^{(k)} b^{(k)}_{(i,1)} f_k‖   (1.19)
    ≤ Σ_{k∈J} n_k^{−1/2} Σ_{i∈I_k^{(2)}} |d_i^{(k)}|   (by (1.11))
    ≤ Σ_{k∈J} n_k^{−1/2} (#I_k^{(2)})^{1/2} (Σ_{i∈I_k^{(2)}} |d_i^{(k)}|²)^{1/2}   (by Hölder's inequality)
    ≤ Σ_{k∈J} Σ_{i∈I_k^{(2)}} |d_i^{(k)}|²
    ≤ C_B² ‖x‖² = C_B²   (by Proposition 1.3.4).
Finally we have to estimate Σ_2^{(3)}. Before that, let us make some observations. We first note that in the estimation of ‖Σ_1‖ we did not use any specific properties of the sets I_k. Replacing in (1.17) the sets I_k by arbitrary sets I′_k ⊂ {1, 2, ..., n_k} and J by an arbitrary set J′ ⊂ N, we obtain

  ‖Σ_{k∈J′} Σ_{i∈I′_k} d_i^{(k)} Σ_{j=2}^{n_k} b^{(k)}_{(i,j)} g_j^{(k)}‖ ≤ C_H C_B ‖x‖.   (1.20)
Taking I′_k to be all of {1, 2, ..., n_k} and J′ = [1, K] for some K ∈ N, we deduce from (1.20) and Proposition 1.3.4 that

  ‖Σ_{k=1}^K Σ_{i=1}^{n_k} d_i^{(k)} b^{(k)}_{(i,1)} f_k‖ ≤ ‖Σ_{k=1}^K Σ_{i=1}^{n_k} d_i^{(k)} ψ_i^{(k)}‖ + ‖Σ_{k=1}^K Σ_{i=1}^{n_k} d_i^{(k)} (ψ_i^{(k)} − b^{(k)}_{(i,1)} f_k)‖   (1.21)
    ≤ C_Ψ ‖x‖ + C_H C_B ‖x‖ = C_Ψ + C_H C_B,

where C_Ψ denotes the basis constant of (ψ_j^{(k)} : k ∈ N, j = 1, 2, ..., n_k). Secondly we note that in the estimation of ‖Σ_2^{(1)}‖ we could replace the set J by any subset J′ ⊂ N and I_k^{(1)} by any subset

  I′_k ⊂ {i ≤ n_k : |d_i^{(k)}| ≤ n_k^{−1}},

to obtain

  ‖Σ_{k∈J′} Σ_{i∈I′_k} d_i^{(k)} b^{(k)}_{(i,1)} f_k‖ ≤ 1.   (1.22)
Thirdly we note that in the estimation of ‖Σ_2^{(2)}‖ in (1.19) we could also have replaced J by any subset J′ ⊂ N, and for k ∈ N the set I_k^{(2)} by any subset

  I′_k ⊂ {i ∈ I_k : |d_i^{(k)}| ≥ n_k^{−1/2}},

to obtain

  ‖Σ_{k∈J′} Σ_{i∈I′_k} d_i^{(k)} b^{(k)}_{(i,1)} f_k‖ ≤ C_B².   (1.23)
In order to estimate ‖Σ_2^{(3)}‖ we define

  K = max{k ∈ J : I_k^{(3)} ≠ ∅},

which means in particular that for some i ∈ I_K^{(3)} we have |d_i^{(K)}| < n_K^{−1/2}. Note that for any k ∈ [1, K − 1] either k ∈ J, or k ∉ J and (here we use for the first time that we are dealing with the threshold algorithm) no coordinate of the k-th block was selected, which implies

  |d_i^{(k)}| < n_K^{−1/2} ≤ n_k^{−1}  for all i ∈ {1, 2, ..., n_k}   (1.24)

(here we are using that n_{k+1} = n_k²); thus for such a k the sets I_k^{(2)} and I_k^{(3)} are empty.
We now compute

  Σ_2^{(3)} = Σ_{i∈I_K^{(3)}} d_i^{(K)} b^{(K)}_{(i,1)} f_K + Σ_{k∈J, k<K} Σ_{i∈I_k^{(3)}} d_i^{(k)} b^{(k)}_{(i,1)} f_k
    = Σ_{i∈I_K^{(3)}} d_i^{(K)} b^{(K)}_{(i,1)} f_K + Σ_{k=1}^{K−1} Σ_{i=1}^{n_k} d_i^{(k)} b^{(k)}_{(i,1)} f_k
      − Σ_{k=1}^{K−1} Σ_{i∈I_k^{(1)}} d_i^{(k)} b^{(k)}_{(i,1)} f_k − Σ_{k∈J, k<K} Σ_{i∈I_k^{(2)}} d_i^{(k)} b^{(k)}_{(i,1)} f_k

(here, for k ∉ J we read I_k^{(1)} as all of {1, 2, ..., n_k}, which is justified by (1.24) and covered by (1.22)). The first term we estimate using Hölder's inequality:

  ‖Σ_{i∈I_K^{(3)}} d_i^{(K)} b^{(K)}_{(i,1)} f_K‖ ≤ n_K^{−1/2} Σ_{i∈I_K^{(3)}} |d_i^{(K)}| ≤ n_K^{−1/2} n_K^{1/2} (Σ_{i=1}^{n_K} |d_i^{(K)}|²)^{1/2} ≤ C_B.

It follows therefore from (1.21), (1.22) and (1.23) that

  ‖Σ_2^{(3)}‖ ≤ C_B + (C_Ψ + C_H C_B) + 1 + C_B²,

which, combined with the estimates (1.17), (1.18) and (1.19) for Σ_1, Σ_2^{(1)} and Σ_2^{(2)}, implies our claim: ‖G^T_m(x, Ψ)‖ is bounded by a constant independent of x and m.
Corollary 1.3.6. Apply Theorem 1.3.2 to the trigonometric polynomials (t_n) = (e^{−i2πn(·)} : n ∈ Z), which form a basis of L_p[0,1] and, by Example 1.3.1, satisfy the assumptions if p > 2. This leads to a quasi-greedy basis (Ψ_n : n ∈ N) of L_p[0,1].
Secondly, note that since (t_n) is uniformly bounded by 1 (in L_∞[0,1]), and since the matrices B^{(k)}, which were used in the construction of the basis (Ψ_n), are uniformly bounded as linear operators on ℓ_∞^{n_k}, it follows that (Ψ_n : n ∈ N) is also bounded in L_∞[0,1].
This implies by Corollary 6.2.7 that (Ψ_n) cannot be unconditional.
Chapter 2
Greedy Algorithms In Hilbert Space

2.1 Introduction
We will now replace, in our greedy algorithms, bases by more general and possibly redundant systems.
Let H (always) be a separable and real Hilbert space. Recall that D ⊂ S_H is a dictionary of H if span(D) is dense and x ∈ D implies that −x ∈ D.
An n-term approximation algorithm is a map

  G : H → span(D)^N,  x ↦ G(x) = (G_n(x)),

with the property that for n ∈ N and x ∈ H there is a set Λ_n ⊂ D of cardinality at most n so that G_n(x) ∈ span(Λ_n); G_n(x) is then called an n-term approximation of x.
Perhaps the first example was considered by Schmidt [Schm]:
Example 2.1.1. [Schm] Let f ∈ L2 [0, 1]2 , i.e. f is a square integrable function
in two variables. By the Theorem of Arcela and Ascoli we know that the set
D=
n
nX
o
uj ⊗ vj : n ∈ N, ui , vj ∈ C[0, 1]
j=1
is dense in C [0, 1]2 . Here we denote for two functions f, g : [0, 1] → K
f ⊗ g : [0, 1]2 → K, (x, y) 7→ f (x)g(y).
Since C [0, 1]2 is dense in L2 [0, 1]2 it follows that
D=
n
nX
o
uj ⊗ vj : n ∈ N, ui , vj ∈ L2 [0, 1]
j=1
27
28
CHAPTER 2. GREEDY ALGORITHMS IN HILBERT SPACE
is dense in L2 [0, 1]2 .
The question is now how to find a good approximation to f from D. E. Schmidt considered the following procedure and showed that it works.
Let f ∈ L_2([0,1]^2) and define f_0 = f. Then choose u_1, v_1 ∈ L_2[0,1] so that
\[
\|f_0 - u_1 \otimes v_1\|_2 = \inf\big\{\|f_0 - u \otimes v\|_2 : u, v \in L_2[0,1]\big\}.
\]
Since this infimum might be hard to achieve, he also considered a weaker condition: he fixed some weakening factor t ∈ (0,1) and chose u_1, v_1 ∈ L_2[0,1] so that
\[
\|f_0 - u_1 \otimes v_1\|_2 \le \frac1t \inf\big\{\|f_0 - u \otimes v\|_2 : u, v \in L_2[0,1]\big\}.
\]
Then he let f_1 = f_0 - u_1 ⊗ v_1. After n steps he obtained u_1, v_1, u_2, v_2, ..., u_n, v_n ∈ L_2[0,1], let
\[
f_n = f - \sum_{j=1}^n u_j \otimes v_j,
\]
and chose u_{n+1} and v_{n+1} in L_2[0,1] so that
\[
\|f_n - u_{n+1} \otimes v_{n+1}\|_2 = \Big\|f - \sum_{j=1}^{n+1} u_j \otimes v_j\Big\|_2 \le \frac1t \inf\big\{\|f_n - u \otimes v\|_2 : u, v \in L_2[0,1]\big\}.
\]
Finally he proved that f_n converges in L_2([0,1]^2) to 0, and thus G_n(f) = \sum_{j=1}^{n} u_j ⊗ v_j converges to f.
He asked whether there is some general principle behind this, and how and whether it generalizes.
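Schmidt's procedure can be illustrated numerically. In the following minimal sketch (an assumption of this illustration, not part of the text), f is sampled on a grid, so an element of L_2([0,1]^2) becomes a matrix, and the best approximation by a single u ⊗ v is the leading singular triple of the current residual:

```python
import numpy as np

# A discretized sketch of Schmidt's procedure: f sampled on a grid becomes a
# matrix F, and the best rank-one (u ⊗ v) approximation of the residual is its
# leading singular triple. The random 40x40 F is a hypothetical stand-in.
rng = np.random.default_rng(0)
F = rng.standard_normal((40, 40))

residual = F.copy()
norms = [np.linalg.norm(residual)]
for n in range(40):
    U, s, Vt = np.linalg.svd(residual)
    residual -= s[0] * np.outer(U[:, 0], Vt[0])   # f_{n+1} = f_n - u_{n+1} ⊗ v_{n+1}
    norms.append(np.linalg.norm(residual))

# The residual norms decrease, and 40 rank-one steps fully resolve a 40x40 matrix.
assert all(b <= a + 1e-12 for a, b in zip(norms, norms[1:]))
assert norms[-1] < 1e-8
```

Each step here is exact (via the SVD); Schmidt's weakened condition would allow any rank-one term whose error is within a factor 1/t of the optimum.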
(PGA) The Pure Greedy Algorithm.
For x ∈ H we define G_n = G_n(x), for each n ∈ N_0, by induction. G_0 = 0 and assuming that G_0, G_1, ..., G_{n-1} have been defined for some n ∈ N we proceed as follows:
1) Choose z_n ∈ D and a_n ∈ R so that
\[
\|x - G_{n-1} - a_n z_n\| = \inf_{z\in D,\ a\in\mathbb R} \|x - G_{n-1} - a z\|.
\]
2) Put G_n = G_{n-1} + a_n z_n.
Note that for any x ∈ H it follows that
\[
(2.1)\quad \inf_{z\in D,\,a\in\mathbb R}\|x - az\|^2
= \inf_{z\in D,\,a\in\mathbb R}\big[\|x\|^2 - 2a\langle x,z\rangle + a^2\|z\|^2\big]
= \inf_{z\in D}\big[\|x\|^2 - \langle x,z\rangle^2\big]
= \|x\|^2 - \sup_{z\in D}\langle x,z\rangle^2,
\]
since for fixed z the map a ↦ \|x\|^2 - 2a⟨x,z⟩ + a^2\|z\|^2 is minimal for a = ⟨x,z⟩.
So condition (1) in (PGA) can be replaced by the following condition (1'):
1') Choose z_n ∈ D so that
\[
\langle x - G_{n-1}, z_n\rangle = \sup_{z\in D}\langle x - G_{n-1}, z\rangle,
\]
and (2) by
2') Put G_n = G_{n-1} + ⟨x - G_{n-1}, z_n⟩ z_n.
As already noted in Example 2.1.1, the "sup" in (1') of (PGA), respectively the "inf" in (1), might not be attained or might be hard to attain. In this case we might consider the following modification.
(WPGA) The Weak Pure Greedy Algorithm.
We are given a sequence τ = (t_n) ⊂ (0,1). For x ∈ H we define G_n = G_n(x), for each n ∈ N_0, by induction. G_0 = 0 and assuming that G_0, G_1, ..., G_{n-1} have been defined for some n ∈ N we proceed as follows:
1) Choose z_n ∈ D so that
\[
\langle x - G_{n-1}, z_n\rangle \ge t_n \sup_{z\in D}\langle x - G_{n-1}, z\rangle.
\]
2) Put G_n = G_{n-1} + ⟨x - G_{n-1}, z_n⟩ z_n.
For the WPGA we call the sequence (t_n) the weakness factors.
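The pure greedy step (1')/(2') can be sketched in a few lines. The finite symmetric dictionary below is an assumption made only to keep the example computable; since it contains the coordinate basis, each step reduces the squared residual by at least the factor (1 - 1/d):

```python
import numpy as np

# A minimal sketch of (PGA) in R^d using form (1')/(2'). The dictionary is a
# finite symmetric set of unit vectors (hypothetical choice) containing ±e_i.
rng = np.random.default_rng(1)
d = 8
D = np.vstack([np.eye(d), rng.standard_normal((40, d))])
D /= np.linalg.norm(D, axis=1, keepdims=True)
D = np.vstack([D, -D])                      # z in D implies -z in D

x = rng.standard_normal(d)
G = np.zeros(d)
norms = [np.linalg.norm(x)]
for n in range(200):
    r = x - G
    z = D[np.argmax(D @ r)]                 # (1'): maximize <x - G_{n-1}, z>
    G = G + (r @ z) * z                     # (2'): G_n = G_{n-1} + <r, z> z
    norms.append(np.linalg.norm(x - G))

assert all(b <= a + 1e-12 for a, b in zip(norms, norms[1:]))
assert norms[-1] < 1e-4 * norms[0]
```

The WPGA would only require the selected z to realize a fraction t_n of the maximal inner product, which matters when D is infinite and the supremum is not attained.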
A possibly faster (but computationally more laborious) algorithm is the following Orthogonal Greedy Algorithm.
(OGA) The Orthogonal Greedy Algorithm.
For x ∈ H we define G^o_n = G^o_n(x), for each n ∈ N_0, by induction. G^o_0 = 0 and assuming that G^o_0, ..., G^o_{n-1} and vectors z_1, ..., z_{n-1} have been defined for some n ∈ N we proceed as follows:
1) Choose z_n ∈ D so that
\[
\langle x - G^o_{n-1}, z_n\rangle = \sup_{z\in D}\langle x - G^o_{n-1}, z\rangle.
\]
2) Define Z_n = span(z_1, z_2, ..., z_n) and let G^o_n be the best approximation of x in Z_n, i.e.
\[
\|G^o_n - x\| = \inf\{\|z - x\| : z \in Z_n\},
\]
which means that G^o_n = P_{Z_n}(x), where P_{Z_n} denotes the orthogonal projection of H onto Z_n.
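A sketch of (OGA), under the same finite-dictionary assumption as before; computing the projection P_{Z_n}(x) by least squares is just one implementation choice:

```python
import numpy as np

# A sketch of (OGA): after selecting z_n as in step (1), G^o_n is the orthogonal
# projection of x onto Z_n = span(z_1,...,z_n), computed here via least squares.
rng = np.random.default_rng(2)
d, m = 8, 30
D = rng.standard_normal((m, d))
D /= np.linalg.norm(D, axis=1, keepdims=True)
D = np.vstack([D, -D])

x = rng.standard_normal(d)
G, selected = np.zeros(d), []
for n in range(d):
    r = x - G
    selected.append(D[np.argmax(D @ r)])           # greedy selection
    Z = np.array(selected).T                       # d x n matrix of chosen z_j
    coeffs, *_ = np.linalg.lstsq(Z, x, rcond=None) # best approximation in Z_n
    G = Z @ coeffs

# Each new z_n has positive inner product with the residual, which is orthogonal
# to Z_{n-1}; so the selected vectors are independent and d steps recover x.
assert np.linalg.norm(x - G) < 1e-8
```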
(GAR) The Greedy Algorithm with free Relaxation.
For x ∈ H we define G^r_n = G^r_n(x), for each n ∈ N_0, by induction. G^r_0 = 0 and assuming that G^r_0, ..., G^r_{n-1} have been defined for some n ∈ N we proceed as follows:
1) Choose z_n ∈ D so that
\[
\langle x - G^r_{n-1}, z_n\rangle = \sup_{z\in D}\langle x - G^r_{n-1}, z\rangle.
\]
2) Put G^r_n = a_n G^r_{n-1} + b_n z_n, where G^r_n is the best approximation of x by an element of the two dimensional space span(G^r_{n-1}, z_n).
(GAFR) The Greedy Algorithm with fixed Relaxation.
Let c > 0. For x ∈ H we define G^f_n = G^f_n(x), for each n ∈ N_0, by induction. G^f_0 = 0 and assuming that G^f_0, ..., G^f_{n-1} have been defined for some n ∈ N we proceed as follows:
1) Choose z_n ∈ D so that
\[
\langle x - G^f_{n-1}, z_n\rangle = \sup_{z\in D}\langle x - G^f_{n-1}, z\rangle.
\]
2) Put
\[
G^f_n = \Big(1 - \frac1n\Big) G^f_{n-1} + \frac{c}{n}\, z_n.
\]
Similar to the weak pure greedy algorithm there are also weak versions of the orthogonal greedy algorithm and the greedy algorithms with relaxation, and we denote them by (WOGA), (WGAR) and (WGAFR).
2.2 Convergence
Proposition 2.2.1. Assume that we consider the WPGA, WOGA or WGAR, and assume that the weakness factors (t_n) satisfy
\[
(2.2)\quad \sum_{k\in\mathbb N} t_k^2 = \infty.
\]
For x we let x_n = x - G_n(x), x_n = x - G^o_n(x) or x_n = x - G^r_n(x), respectively. If the sequence (x_n) converges, then it converges to 0.
Proof. Assume that (x_n) converges to some u ∈ H with u ≠ 0. Then, since D is a dictionary, there is a d ∈ D so that δ = ⟨d, u⟩ > 0, and thus we find a large enough N ∈ N so that ⟨d, x_n⟩ ≥ δ/2 for all n > N.
In the case that we consider the WPGA we obtain for n ≥ N
\[
\|x_{n+1}\|^2 = \big\|x_n - \langle z_{n+1}, x_n\rangle z_{n+1}\big\|^2 = \|x_n\|^2 - \langle z_{n+1}, x_n\rangle^2 \le \|x_n\|^2 - t_{n+1}^2\delta^2/4,
\]
and thus for k = 1, 2, 3, ...
\[
\|x_N\|^2 - \|x_{N+k}\|^2 = \sum_{j=N}^{N+k-1}\big(\|x_j\|^2 - \|x_{j+1}\|^2\big) \ge \sum_{j=N}^{N+k-1} t_{j+1}^2\,\delta^2/4 \to \infty \text{ as } k \to \infty.
\]
But this is a contradiction.
In the case of the WOGA we similarly have for n ≥ N
\[
\|x_{n+1}\|^2 = \min\Big\{\Big\|x - \sum_{j=1}^{n+1} a_j z_j\Big\|^2 : a_1, a_2, \dots, a_{n+1} \in \mathbb R\Big\}
\le \big\|x_n - \langle z_{n+1}, x_n\rangle z_{n+1}\big\|^2 = \|x_n\|^2 - \langle z_{n+1}, x_n\rangle^2 \le \|x_n\|^2 - t_{n+1}^2\delta^2/4,
\]
and we obtain a contradiction as in the WPGA case. Similarly, in the case that we consider the WGAR we estimate:
\[
\|x_{n+1}\|^2 = \min\big\{\|x - aG^r_n(x) - bz_{n+1}\|^2 : a, b \in \mathbb R\big\}
\le \big\|x - G^r_n(x) - \langle z_{n+1}, x_n\rangle z_{n+1}\big\|^2 = \|x_n\|^2 - \langle z_{n+1}, x_n\rangle^2 \le \|x_n\|^2 - t_{n+1}^2\delta^2/4.
\]
Theorem 2.2.2. Assume that condition (2.2) of Proposition 2.2.1 holds. Then (G^o_n(x) : n ∈ N) (as defined in WOGA) converges for all x ∈ H to x.
Proof. Let x ∈ H. For n ∈ N let Z_n be the space defined in WOGA; then G^o_n(x) = P_{Z_n}(x). Since Z_1 ⊂ Z_2 ⊂ Z_3 ⊂ ... it follows that G^o_n(x) converges to P_Z(x), where Z = \overline{\bigcup_{n\in\mathbb N} Z_n}. Thus the claim follows from Proposition 2.2.1.
Theorem 2.2.3. Assume that the sequence (t_k) ⊂ (0,1) satisfies
\[
(2.3)\quad \sum_{k\in\mathbb N} \frac{t_k}{k} = \infty.
\]
For x ∈ H consider the WPGA (G_n(x)) with weakness factors (t_n). Then (G_n(x)) converges.
Remark. Since by Hölder's inequality
\[
\sum_{k=1}^\infty \frac{t_k}{k} \le \Big[\sum_{k=1}^\infty t_k^2\Big]^{1/2}\Big[\sum_{k=1}^\infty \frac{1}{k^2}\Big]^{1/2},
\]
condition (2.3) implies that \sum_{k\in\mathbb N} t_k^2 = \infty.
We will need the following Lemma first.
Lemma 2.2.4. Assume y = (y_j) ∈ \ell_2 and (t_k) ⊂ (0,1) satisfies (2.3). Then
\[
\liminf_{n\to\infty} \frac{|y_n|}{t_n}\sum_{j=1}^n |y_j| = 0.
\]
Proof. (An alternate, shorter proof due to Sheng Zhang will be given below.) We will prove the following claim:
Claim. If f ∈ L_2[0,∞) and we define
\[
F(x) = \int_0^x |f(t)|\,dt,
\]
then
\[
(2.4)\quad \int_0^\infty \frac{F^2(x)}{x^2}\,dx \le 4\int_0^\infty f^2(x)\,dx.
\]
If we apply the claim to the function f(\cdot) = \sum_{j=1}^\infty 1_{(j-1,j]}|y_j|, it follows that
\[
\sum_{n=1}^{\infty}\Big[\frac1n\sum_{j=1}^{n}|y_j|\Big]^2
\le \sum_{n=1}^{\infty}\Big[\frac1n\sum_{j=1}^{n-1}|y_j|\Big]^2+\sum_{j=1}^{\infty}|y_j|^2
\le \int_0^{\infty}\Big[\frac1x\int_0^x f(t)\,dt\Big]^2dx+\sum_{j=1}^{\infty}|y_j|^2
\le 4\int_0^{\infty}f^2(t)\,dt+\sum_{j=1}^{\infty}|y_j|^2
= 5\sum_{j=1}^{\infty}|y_j|^2.
\]
It follows therefore from the Cauchy–Schwarz inequality that
\[
\sum_{n=1}^{\infty}\frac{t_n}{n}\cdot\frac{|y_n|}{t_n}\sum_{j=1}^{n}|y_j|
=\sum_{n=1}^{\infty}|y_n|\cdot\frac1n\sum_{j=1}^{n}|y_j|
\le\Big[\sum_{n=1}^{\infty}|y_n|^2\Big]^{1/2}\Big[\sum_{n=1}^{\infty}\Big(\frac1n\sum_{j=1}^{n}|y_j|\Big)^2\Big]^{1/2}<\infty.
\]
Since
\[
\sum_{n=1}^\infty \frac{t_n}{n} = \infty,
\]
it follows that
\[
\liminf_{n\to\infty}\frac{|y_n|}{t_n}\sum_{j=1}^{n}|y_j| = 0.
\]
In order to prove the claim we can assume that f is a positive function. We note first that, by Hölder's inequality,
\[
F(x) = \int_0^x f(t)\,dt \le x^{1/2}\Big(\int_0^x f^2(t)\,dt\Big)^{1/2},
\]
and thus
\[
(2.5)\quad \frac{F(x)}{x^{1/2}} \le \Big(\int_0^x f^2(t)\,dt\Big)^{1/2} \to 0 \text{ as } x\to 0.
\]
For a fixed x_0 > 0 we also deduce from Hölder's inequality, for x > x_0, that
\[
F(x) - F(x_0) = \int_{x_0}^x f(t)\,dt \le (x-x_0)^{1/2}\Big(\int_{x_0}^x f^2(t)\,dt\Big)^{1/2} \le x^{1/2}\Big(\int_{x_0}^\infty f^2(t)\,dt\Big)^{1/2},
\]
and thus
\[
\frac{F(x)}{x^{1/2}} \le \frac{F(x_0)}{x^{1/2}} + \Big(\int_{x_0}^\infty f^2(t)\,dt\Big)^{1/2}.
\]
By choosing, for a given ε > 0, first x_0 far enough out so that (\int_{x_0}^\infty f^2(t)\,dt)^{1/2} < ε/2, and then x_1 > x_0 so that x_1^{-1/2} F(x_0) < ε/2, it follows that F(x)/x^{1/2} < ε whenever x > x_1, and thus
\[
(2.6)\quad \frac{F(x)}{x^{1/2}} \to 0 \text{ as } x \to \infty.
\]
Using integration by parts, it follows for any 0 < a < b < ∞ that
\[
\int_a^b \frac{F^2(x)}{x^2}\,dx = -\frac{F^2(x)}{x}\Big|_{x=a}^{b} + 2\int_a^b F(x) f(x) x^{-1}\,dx
\le \Big[\frac{F(a)}{a^{1/2}}\Big]^2 + \Big[\frac{F(b)}{b^{1/2}}\Big]^2 + 2\Big(\int_a^b \frac{F^2(x)}{x^2}\,dx\Big)^{1/2}\Big(\int_a^b f^2(x)\,dx\Big)^{1/2}
\]
[by Hölder's inequality], and thus, in case a is chosen small enough and b large enough so that F does not vanish a.e. on [a,b], we have
\[
\Big(\int_a^b \frac{F^2(x)}{x^2}\,dx\Big)^{1/2} \le \Big[\Big(\frac{F(a)}{a^{1/2}}\Big)^2 + \Big(\frac{F(b)}{b^{1/2}}\Big)^2\Big]\Big(\int_a^b \frac{F^2(x)}{x^2}\,dx\Big)^{-1/2} + 2\Big(\int_a^b f^2(x)\,dx\Big)^{1/2}.
\]
Our claim follows now by letting a → 0 and b → ∞.
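The claim (a form of Hardy's inequality) is easy to check numerically. The specific test function and grid below are assumptions of the illustration:

```python
import numpy as np

# Numeric sanity check of the Claim (Hardy's inequality): with
# F(x) = ∫_0^x |f(t)| dt one has ∫_0^∞ F(x)^2/x^2 dx <= 4 ∫_0^∞ f(t)^2 dt.
# Tested for the (arbitrary) choice f(t) = exp(-t), discretized on a grid.
dt = 1e-4
t = np.arange(1, 400001) * dt            # grid on (0, 40]
f = np.exp(-t)
F = np.cumsum(f) * dt                    # Riemann sums approximating F(t_k)
lhs = np.sum((F / t) ** 2) * dt          # ≈ ∫ F^2/x^2 dx (≈ 2 ln 2 here)
rhs = 4 * np.sum(f ** 2) * dt            # ≈ 4 ∫ f^2 dt    (≈ 2 here)
assert lhs <= rhs
```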
Proof by Sheng Zhang. Suppose, to the contrary, that
\[
\delta = \liminf_{n\to\infty}\frac{|y_n|}{t_n}\sum_{j=1}^n |y_j| > 0,
\]
and thus for some n_0 ∈ N
\[
\frac{|y_n|}{t_n}\sum_{j=1}^n |y_j| > \delta/2 \text{ whenever } n \ge n_0.
\]
For n ≥ n_0 we deduce from Hölder's inequality
\[
\frac{\delta}{2} < \frac{1}{t_n}\sum_{j=1}^n |y_n||y_j| \le \frac{1}{t_n}\Big[\sum_{j=1}^n |y_j|^2\cdot n|y_n|^2\Big]^{1/2},
\]
and thus
\[
\frac{|y_n|}{t_n/n} \ge \frac{\delta}{2}\,\frac{1}{\sum_{j=1}^n |y_j|^2},
\]
which yields
\[
\liminf_{n\to\infty}\frac{|y_n|}{t_n/n} \ge \frac{\delta}{2}\,\frac{1}{\sum_{j=1}^\infty y_j^2} =: \varepsilon.
\]
Thus there is an n_1 ≥ n_0 so that |y_n| ≥ ε t_n/2n for all n ≥ n_1. But this contradicts the assumptions that y = (y_n) ∈ \ell_2 and \sum_{n=1}^\infty t_n/n = \infty.
Proof of Theorem 2.2.3. Let x ∈ H and put for n ∈ N, G_n = G_n(x) with
\[
G_n(x) = \sum_{j=1}^n \langle x - G_{j-1}, z_j\rangle z_j,
\]
where z_n ∈ D satisfies
\[
(2.7)\quad \langle z_n, x - G_{n-1}\rangle = \langle z_n, x_{n-1}\rangle \ge t_n \sup_{z\in D}\langle z, x_{n-1}\rangle.
\]
Define
\[
(2.8)\quad x_n = x - G_n(x) = x - \sum_{j=1}^n \langle z_j, x - G_{j-1}\rangle z_j = x_{n-1} - \langle z_n, x - G_{n-1}\rangle z_n.
\]
By induction we show that for every n ∈ N
\[
(2.9)\quad \|x_n\|^2 = \|x\|^2 - \sum_{j=1}^n \langle z_j, x_{j-1}\rangle^2.
\]
Indeed, for n = 1 the claim is clear, and assuming that (2.9) is true for n ∈ N we compute
\[
\|x_{n+1}\|^2 = \big\|x_n - \langle x_n, z_{n+1}\rangle z_{n+1}\big\|^2 = \|x_n\|^2 - \langle x_n, z_{n+1}\rangle^2\|z_{n+1}\|^2 = \|x\|^2 - \sum_{j=1}^{n+1}\langle z_j, x_{j-1}\rangle^2.
\]
It follows therefore from (2.9) that
\[
(2.10)\quad \sum_{j=1}^\infty \langle z_j, x_{j-1}\rangle^2 \le \|x\|^2.
\]
For m < n we compute
\[
(2.11)\quad \|x_n - x_m\|^2 = \|x_m\|^2 - \|x_n\|^2 - 2\langle x_m - x_n, x_n\rangle,
\]
and
\[
\langle x_m - x_n, x_n\rangle = \sum_{j=m+1}^n \langle x_{j-1} - x_j, x_n\rangle
= \sum_{j=m+1}^n \langle z_j, x_n\rangle\cdot\langle z_j, x_{j-1}\rangle
\le \frac{\langle z_{n+1}, x_n\rangle}{t_{n+1}}\sum_{j=m+1}^n \langle z_j, x_{j-1}\rangle
\qquad\big[\langle z_j, x_n\rangle \le \max_{d\in D}\langle d, x_n\rangle \le t_{n+1}^{-1}\langle z_{n+1}, x_n\rangle\big]
\]
\[
\le \frac{\langle z_{n+1}, x_n\rangle}{t_{n+1}}\sum_{j=1}^{n+1}\langle z_j, x_{j-1}\rangle.
\]
We can therefore apply Lemma 2.2.4 to (t_n) and y_n = ⟨z_{n+1}, x_n⟩, for n ∈ N, and deduce that
\[
\liminf_{n\to\infty}\max_{m<n}\langle x_m - x_n, x_n\rangle = 0.
\]
Together with the fact that (\|x_n\|) is decreasing, (2.11) implies that there is a subsequence (x_{n_k}) which converges to some \bar x ∈ H. We claim that the whole sequence (x_n) converges to that \bar x, which, together with Proposition 2.2.1, would finish the proof. Note that for any n ∈ N and any k ∈ N so that n_k > n we have
\[
\|x_n - \bar x\| \le \|x_n - x_{n_k}\| + \|x_{n_k} - \bar x\|
= \big(\|x_n\|^2 - \|x_{n_k}\|^2 - 2\langle x_n - x_{n_k}, x_{n_k}\rangle\big)^{1/2} + \|x_{n_k} - \bar x\|
\le \big(\|x_n\|^2 - \|x_{n_k}\|^2\big)^{1/2} + \big(2\max_{m\le n_k}\langle x_m - x_{n_k}, x_{n_k}\rangle\big)^{1/2} + \|x_{n_k} - \bar x\|.
\]
So, given ε > 0, we can choose n_0 large enough so that (\|x_n\|^2 - \|x_{n_k}\|^2)^{1/2} < ε/3 for all n ≥ n_0 and k with n_k > n. Then we choose k_0 so that for all k > k_0, (2\max_{m\le n_k}\langle x_m - x_{n_k}, x_{n_k}\rangle)^{1/2} < ε/3 and \|x_{n_k} - \bar x\| < ε/3. For any n ≥ n_0 we can therefore choose k ≥ k_0 so that also n_k > n, and from the above inequalities we deduce that \|x_n - \bar x\| < ε.
The next Theorem shows that, at least among the decreasing weakness factors τ, the condition (2.3) is optimal in order to imply convergence of the WPGA.
Theorem 2.2.5. In the class of monotone decreasing sequences τ = (t_k), the condition (2.3) is necessary for the WPGA to converge.
In other words, if (t_n) is a decreasing sequence for which
\[
(2.12)\quad \sum_{n\in\mathbb N}\frac{t_n}{n} < \infty,
\]
then there are a dictionary D of H, an x ∈ H and sequences (G_n) ⊂ H and (z_n) ⊂ D, with G_0 = 0, so that for x_n = x - G_n the following is satisfied:
\[
(2.13)\quad x_n = x_{n-1} - \langle x_{n-1}, z_n\rangle z_n
\]
\[
(2.14)\quad \langle x_{n-1}, z_n\rangle \ge t_n \max_{z\in D}\langle x_{n-1}, z\rangle,
\]
but so that x_n does not converge to 0.
We will need the following notation:
Definition 2.2.6. Assume D_0 ⊂ S_H has the property that z ∈ D_0 implies that -z ∈ D_0, and assume that τ = (t_n)_{n=1}^N ⊂ (0,1], with N ∈ N ∪ {∞}, is a finite or infinite sequence of positive numbers. A pair of sequences (x_n)_{n=0}^N ⊂ H and (z_n)_{n=1}^N ⊂ D_0 is called a pair of WPGA-sequences with weakness factors τ and dictionary D_0 if x_0 ∈ span(D_0) and for all n = 1, 2, ..., N
\[
(2.15)\quad x_n = x_{n-1} - \langle x_{n-1}, z_n\rangle z_n
\]
\[
(2.16)\quad \langle x_{n-1}, z_n\rangle \ge t_n \max_{z\in D_0}\langle x_{n-1}, z\rangle.
\]
Remark. For a given sequence τ = (t_n)_{n=1}^∞ ⊂ (0,1] satisfying (2.12) we will choose the elements of a dictionary D, as well as the elements x_n and z_n of a pair of WPGA-sequences with weakness factors τ and dictionary D, recursively.
To achieve that we will choose inductively elements x_n, n ≥ 0, and z_n, n ≥ 1, so that for all n ∈ N
\[
(2.17)\quad x_n = x_{n-1} - \langle x_{n-1}, z_n\rangle z_n
\]
\[
(2.18)\quad \langle x_{n-1}, z_n\rangle \ge t_n \max\Big(\max_{j=1,2,\dots,n-1}|\langle x_{n-1}, z_j\rangle|,\ \sup_{i\in\mathbb N}|\langle x_{n-1}, e_i\rangle|\Big)
\]
\[
(2.19)\quad \langle x_j, z_{j+1}\rangle \ge t_j\langle x_j, z_n\rangle, \text{ for all } j = 0, 1, 2, \dots, n-1.
\]
Here (e_j) denotes an orthonormal basis of H.
We deduce then that (x_n)_{n=0}^∞ and (z_n)_{n=1}^∞ is a pair of WPGA-sequences with weakness factors τ and dictionary D = {±e_j, ±z_j : j ∈ N}.
Proof of Theorem 2.2.5. The following procedure is the key observation towards inductively producing our example. We let (e_j) be an orthonormal basis of H. For given t ∈ (0, 1/3] and i ≠ j in N we define elements x_n ∈ span(e_i, e_j), n ≥ 0, and z_n ∈ span(e_i, e_j), \|z_n\| = 1, n ≥ 1, and α_n ∈ [0,1] recursively, until we stop at some n = N when a certain criterion is satisfied, as follows:
We put x_0 = e_i. Now assume that for some n ∈ N we have defined x_s = a_s e_i + b_s e_j and α_s ∈ (0,1) and z_s ∈ S_H for all 1 ≤ s ≤ n-1, so that for all 1 ≤ s < n we have
\[
(2.20)\quad \langle x_{s-1}, z_s\rangle = t, \text{ as long as } s \le N-1,
\]
\[
(2.21)\quad z_s = \alpha_s e_i - (1-\alpha_s^2)^{1/2} e_j,
\]
\[
(2.22)\quad a_s, b_s \ge 0, \text{ and } a_s - b_s \ge \sqrt2\,t, \text{ as long as } s \le N,
\]
\[
(2.23)\quad x_s = x_{s-1} - \langle x_{s-1}, z_s\rangle z_s.
\]
(Conditions (2.20) and (2.22) become vacuous once we have defined N, for s = N.) Then we first define \tilde z_n as
\[
\tilde z_n = \tilde\alpha_n e_i - (1-\tilde\alpha_n^2)^{1/2} e_j,
\]
where \tilde\alpha_n is defined so that ⟨x_{n-1}, \tilde z_n⟩ = t. Secondly we define \tilde x_n to be
\[
\tilde x_n = x_{n-1} - \langle x_{n-1}, \tilde z_n\rangle \tilde z_n,
\]
and write \tilde x_n as \tilde x_n = \tilde a_n e_i + \tilde b_n e_j.
Case 1: \tilde a_n - \tilde b_n ≥ \sqrt2\,t. In that case we choose α_n = \tilde α_n and z_n = α_n e_i - (1-α_n^2)^{1/2} e_j. Thus (2.20) and (2.21) are satisfied for s = n. Then we let x_n = \tilde x_n, and have therefore satisfied (2.22) and (2.23).
Case 2: \tilde a_n - \tilde b_n < \sqrt2\,t. Then we let N = N_t = n and put α_N = 1/\sqrt2, z_N = (e_i - e_j)/\sqrt2 and x_N = x_{N-1} - ⟨x_{N-1}, z_N⟩z_N. Then (2.21) and (2.23) are satisfied while (2.20) is vacuous.
From the definition of x_N in Case 2 we observe that
\[
(2.24)\quad \langle x_{N-1}, z_N\rangle = 2^{-1/2}(a_{N-1} - b_{N-1}) \ge t,
\]
and it follows therefore that
\[
(2.25)\quad a_N = b_N = \tfrac12(a_{N-1} + b_{N-1}).
\]
In particular (2.22) is also satisfied for n = N, assuming that N is finite, which we will see later (here the second part of (2.22) is vacuous).
Once the second case happens we finish the definition of our sequences. We still have to show that eventually Case 2 happens and that N is finite; for the moment we think of N as being an element of N ∪ {∞}.
We make the following observations. From (2.20) and (2.21) we deduce that
\[
(2.26)\quad a_{n+1} = a_n - t\alpha_{n+1} \ \text{ and }\ b_{n+1} = b_n + t(1-\alpha_{n+1}^2)^{1/2}, \ \text{ if } n < N-1,
\]
which implies that
\[
(2.27)\quad a_{n+1} - b_{n+1} = a_n - b_n - t\big(\alpha_{n+1} + (1-\alpha_{n+1}^2)^{1/2}\big)
\begin{cases} \ge a_n - b_n - \sqrt2\,t \\ \le a_n - b_n - t \end{cases}
\quad \text{if } n < N-1.
\]
This yields
\[
1 = a_0 - b_0 \ge \sum_{s=0}^{N-2}\big[(a_s - b_s) - (a_{s+1} - b_{s+1})\big] \ge (N-1)t,
\]
and therefore we showed that N is finite. Since by definition of N and \tilde a_N and \tilde b_N
\[
\sqrt2\,t > \tilde a_N - \tilde b_N = a_{N-1} - b_{N-1} - t\big(\tilde\alpha_N + (1-\tilde\alpha_N^2)^{1/2}\big) \ge a_{N-1} - b_{N-1} - \sqrt2\,t,
\]
it follows that
\[
(2.28)\quad \langle x_{N-1}, z_N\rangle = 2^{-1/2}\big(a_{N-1} - b_{N-1}\big)
\begin{cases} \le 2t \\ \ge t. \end{cases}
\]
It follows therefore from (2.27), (2.24) and (2.25) that
\[
1 = a_0 - b_0 = \sum_{s=0}^{N-1}\big[(a_s - b_s) - (a_{s+1} - b_{s+1})\big]
\begin{cases} \ge tN \\ \le 2\sqrt2\,tN, \end{cases}
\]
and thus
\[
(2.29)\quad \frac{1}{2\sqrt2\,t} \le N \le \frac1t.
\]
From the definition of x_N and (2.28) we deduce that
\[
\|x_N\|^2 = \|x_{N-1}\|^2 - \langle x_{N-1}, z_N\rangle^2 \ge \|x_{N-1}\|^2 - 4t^2 \quad\text{(by (2.28))}
\]
\[
= \|x_0\|^2 + \sum_{s=1}^{N-1}\big(\|x_s\|^2 - \|x_{s-1}\|^2\big) - 4t^2
= \|x_0\|^2 - (N-1)t^2 - 4t^2 \ge \|x_0\|^2 - t - 3t^2 \quad\text{(by (2.29))},
\]
and thus, since t ≤ 1/3,
\[
(2.30)\quad \|x_N\|^2 \ge \|x_0\|^2 - 2t.
\]
Finally note that the sequence (x_n)_{n=0}^{N_t} is a WPGA-sequence for the dictionary D_0 = {z_n : n = 1, 2, ..., N_t} ∪ {e_i} with weakness factor t. We call (x_n)_{n=0}^{N_t} together with the sequence (z_n)_{n=1}^{N_t} the WPGA-sequences generated by t and the pair (e_i, e_j).
Now we assume that (t_n)_{n=1}^∞ is a decreasing sequence in (0,1] so that \sum_{n=1}^\infty t_n/n < \infty. We first treat the case in which we additionally assume that
\[
\sum_{n=1}^\infty \frac{t_n}{n} < \varepsilon = \frac{3}{16}.
\]
It follows that
\[
\sum_{s=0}^\infty t_{2^s} = t_1 + t_2 + t_4 + t_8 + \dots
\le t_1 + t_2 + \frac12(t_3+t_4) + \frac14(t_5+t_6+t_7+t_8) + \frac18(t_9+t_{10}+\dots+t_{16}) + \dots
\]
\[
\le 2\Big[t_1 + \frac{t_2}{2} + \frac14(t_3+t_4) + \frac18(t_5+t_6+t_7+t_8) + \frac1{16}(t_9+t_{10}+\dots+t_{16}) + \dots\Big]
\le 2\sum_{n=1}^\infty \frac{t_n}{n} \le 2\varepsilon.
\]
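The dyadic comparison above is easy to verify numerically; the particular decreasing sequence used below is an arbitrary assumption of the illustration:

```python
import numpy as np

# Numeric check of the dyadic bound: for a decreasing (t_n) in (0,1],
# sum_s t_{2^s} <= 2 * sum_n t_n/n (indices 1-based as in the text).
N = 2 ** 16
n = np.arange(1, N + 1)
t = 1.0 / np.log(n + 2)                  # an arbitrary decreasing sequence in (0,1]
lhs = sum(t[2 ** s - 1] for s in range(17))   # t_1 + t_2 + t_4 + ... + t_{2^16}
rhs = 2 * np.sum(t / n)
assert lhs <= rhs
```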
We will construct recursively sequences (x_n : n = 0, 1, 2, ...) and (z_n : n = 1, 2, ...) with x_0 = e_1 and x_n = x_{n-1} - ⟨x_{n-1}, z_n⟩z_n, so that for every n ∈ N
\[
(2.31)\quad \langle x_{n-1}, z_n\rangle \ge t_n \max_{j=1,\dots,n-1}\langle x_{n-1}, z_j\rangle \ \text{ and }\ \langle x_{n-1}, z_n\rangle \ge t_n \sup_{j\in\mathbb N}\langle x_{n-1}, e_j\rangle
\]
and
\[
(2.32)\quad t_j\langle x_j, z_n\rangle \le \langle x_j, z_{j+1}\rangle \ \text{ for all } j = 0, 1, 2, \dots, n-1.
\]
As noted in the remark before the proof, these two conditions will ensure that for each n the vector x_n is of the form x_n = x - G_n(x), where (G_n(x) : n ∈ N_0) is the result of a WPGA with weakness factors (t_n) and dictionary D = {±z_n, ±e_n : n ∈ N}.
We start with x = x_0 = e_1 and let (x_n^{(1)} : n = 0, 1, ..., N_{t_1}) and (z_n^{(1)} : n = 1, 2, ..., N_{t_1}) be the WPGA-sequences generated by t_1 and the pair (e_1, e_2); then we put x_n = x_n^{(1)} and z_n = z_n^{(1)} for n = 1, 2, ..., N_{t_1}.
Note that we have satisfied our required conditions (2.31) and (2.32) so far, since by construction ⟨x_{n-1}, z_n⟩ = t_1 ≥ t_n ≥ t_n\|x_{n-1}\| for all n = 1, 2, ..., N_1 = N_{t_1}. By (2.25), x_{N_1} is of the form x_{N_1} = c_1(e_1 + e_2), and we deduce from (2.30) and the fact that \|x_{N_1}\| ≤ 1 that
\[
c_1^2 \le 1/2, \quad N_1 \ge 1 \quad\text{and}\quad \|x_{N_1}\|^2 \ge 1 - 2t_1.
\]
Then we let (x_n^{(2,1)} : n = 0, 1, ..., N_{t_2}) and (z_n^{(2,1)} : n = 1, 2, ..., N_{t_2}) be the WPGA-sequences generated by t_2 and the pair (e_1, e_3), and (x_n^{(2,2)} : n = 0, 1, ..., N_{t_2}) and (z_n^{(2,2)} : n = 1, 2, ..., N_{t_2}) be the WPGA-sequences generated by t_2 and the pair (e_2, e_4). We put N_2 = N_{t_2} and for n = 1, 2, ..., N_2 we define
\[
x_{N_1+n} = c_1 x_n^{(2,1)} + c_1 e_2 \quad\text{and}\quad z_{N_1+n} = z_n^{(2,1)},
\]
\[
x_{N_1+N_2+n} = c_1 x_{N_2}^{(2,1)} + c_1 x_n^{(2,2)} \quad\text{and}\quad z_{N_1+N_2+n} = z_n^{(2,2)}.
\]
We observe that for n = 1, 2, ..., N_2
\[
\langle x_{N_1+n-1}, z_{N_1+n}\rangle = c_1 t_2 \ge t_2 \max_{s\in\mathbb N}\langle x_{N_1+n-1}, e_s\rangle \quad\text{and}\quad
\langle x_{N_1+n-1}, z_{N_1+n}\rangle = c_1 t_2 \ge t_2 \max_{s=1,2,\dots,N_1}\langle x_{N_1+n-1}, z_s\rangle
\]
(the first inequality follows from the fact that the coordinates of x_{N_1+n}, n = 1, 2, ..., N_2, are in absolute value not larger than c_1; the second inequality follows from (2.21) and the fact that moreover the coordinates of x_{N_1+n}, n = 1, 2, ..., N_2, are nonnegative, while each z_s, s = 1, 2, ..., N_1, has a positive and a negative coordinate). Secondly we note that for j = 1, 2, ..., N_1 and n = 1, 2, ..., N_2 - 1 it follows from (2.20) and (2.24) that
\[
t_j\langle x_j, z_{N_1+n}\rangle \le t_1 \le \langle x_j, z_{j+1}\rangle.
\]
This implies that the conditions (2.31) and (2.32) hold for all N_1 ≤ n ≤ N_1 + N_2. Similarly we can show that they also hold for all N_1 + N_2 ≤ n ≤ N_1 + 2N_2. Finally (2.30) implies that x_{N_1+2N_2} is of the form
\[
x_{N_1+2N_2} = c_2(e_1 + e_2 + e_3 + e_4)
\]
with
\[
c_2^2 \le 1/4 \quad\text{and}\quad \|x_{N_1+2N_2}\|^2 \ge \|x_{N_1}\|^2 - c_1^2\,2t_2 - c_1^2\,2t_2 \ge 1 - 2t_1 - 2t_2.
\]
Now assume that for some r ∈ N we have chosen
\[
(x_n : n = 1, 2, \dots, M_r), \quad\text{with } M_r = \sum_{j=1}^r 2^{j-1} N_j \text{ and } N_j = N_{t_{2^{j-1}}},\ j = 1, 2, \dots, r,
\]
and (z_n : n = 1, 2, ..., M_r), so that (2.31) and (2.32) hold for all n ≤ M_r, and so that
\[
x_{M_r} = c_r \sum_{i=1}^{2^r} e_i
\]
for some c_r with c_r^2 ≤ 2^{-r}, and so that
\[
\|x_{M_r}\|^2 \ge 1 - 2t_1 - 2t_2 - 2t_4 - \dots - 2t_{2^{r-1}}.
\]
Then we let, for j = 1, 2, ..., 2^r, (x_n^{(r+1,j)} : n = 0, ..., N_{r+1}) and (z_n^{(r+1,j)} : n = 1, ..., N_{r+1}), with N_{r+1} = N_{t_{2^r}}, be the WPGA-sequences generated by t_{2^r} and the pair (e_j, e_{2^r+j}), and finally put for i = 1, 2, ..., 2^r and n = 1, 2, ..., N_{r+1}
\[
x_{M_r+(i-1)N_{r+1}+n} = \sum_{s=1}^{i-1} c_r x_{N_{r+1}}^{(r+1,s)} + c_r x_n^{(r+1,i)} + c_r \sum_{s=i+1}^{2^r} e_s, \quad\text{and}\quad z_{M_r+(i-1)N_{r+1}+n} = z_n^{(r+1,i)}.
\]
We deduce as in the case r = 1 that the conditions (2.31) and (2.32) hold for all n ≤ M_r + 2^r N_{r+1} = \sum_{s=1}^{r+1} 2^{s-1} N_s = M_{r+1}, that x_{M_{r+1}} = c_{r+1}\sum_{s=1}^{2^{r+1}} e_s for some c_{r+1} with c_{r+1}^2 ≤ 2^{-r-1}, and that
\[
\|x_{M_{r+1}}\|^2 \ge 1 - 2t_1 - 2t_2 - \dots - 2t_{2^r}.
\]
This finishes the choice of the x_n and z_n. Since
\[
\|x_{M_r}\|^2 \ge 1 - 2\sum_{s=0}^{r-1} t_{2^s} \ge 1 - 4\varepsilon = 1 - \tfrac{12}{16} = \tfrac14,
\]
it follows that (x_n) does not converge to 0. We therefore proved our claim under the additional assumption that \sum_{n=1}^\infty (t_n/n) < 3/16.
In the general case we proceed as follows. We first find an n_0 so that
\[
\sum_{s=n_0}^\infty t_{2^s} < 3/16,
\]
and let
\[
x = \sum_{j=1}^{2^{n_0}} e_j.
\]
Then we choose z_i = e_i, i = 1, 2, ..., 2^{n_0}-1, and thus x_0 = x and recursively
\[
x_n = x_{n-1} - \langle x_{n-1}, z_n\rangle z_n = \sum_{j=n+1}^{2^{n_0}} e_j
\]
for n = 1, 2, ..., 2^{n_0}-1. In particular x_{2^{n_0}-1} = e_{2^{n_0}}; from then on we choose x_{2^{n_0}-1+n} = \tilde x_n, n = 1, 2, ..., and z_{2^{n_0}-1+n} = \tilde z_n, where the \tilde x_n and \tilde z_n are chosen like the x_n and the z_n in the special case, but in the Hilbert space \tilde H = \overline{\mathrm{span}}(e_j : j \ge 2^{n_0}).
2.3 Convergence Rates
Note that without any special conditions on the starting point in the Pure Greedy Algorithm (or the other algorithms), we cannot expect to be able to estimate the convergence rate.
Indeed, let (ξ_n) be any sequence of positive numbers which decreases to 0 and let D = {±e_n : n ∈ N}, where (e_n) is an orthonormal basis of our Hilbert space H, and take
\[
x = \sum_{j=1}^\infty \sqrt{\xi_j - \xi_{j+1}}\, e_j.
\]
Then it follows for the n-th approximants G_n = G_n(x), defined as in (PGA), that
\[
G_n = \sum_{j=1}^n \sqrt{\xi_j - \xi_{j+1}}\, e_j
\]
(assuming the differences ξ_j - ξ_{j+1} are nonincreasing, so that (PGA) picks the coordinates in order), and thus
\[
\|x - G_n\|^2 = \sum_{j=n+1}^\infty (\xi_j - \xi_{j+1}) = \xi_{n+1}.
\]
Thus, no matter how slowly (ξ_n) converges to 0, there is an x so that G_n(x) converges at least as slowly as (ξ_n).
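This telescoping identity is easy to confirm numerically; the particular choice ξ_j = 1/j and the truncation to finitely many coordinates are assumptions of the illustration:

```python
import numpy as np

# The example above, truncated to J coordinates: with xi_j = 1/j the
# coefficients sqrt(xi_j - xi_{j+1}) are decreasing, so (PGA) over D = {±e_j}
# picks the coordinates in order, and after n steps
# ||x - G_n||^2 = xi_{n+1} - xi_{J+1}  (the second term is the truncation).
J = 1000
xi = 1.0 / np.arange(1, J + 2)              # xi_1, ..., xi_{J+1}
coeff = np.sqrt(xi[:-1] - xi[1:])           # coefficients of x in the basis
for n in range(20):
    err2 = np.sum(coeff[n:] ** 2)           # ||x - G_n||^2
    assert abs(err2 - (xi[n] - xi[-1])) < 1e-9
```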
In order to state our first result we introduce for a dictionary D of H the following linear subspace:
\[
(2.33)\quad A_1 = A_1(D) = \Big\{\sum_{z\in D} c_z z : (c_z) \subset K \text{ and } \sum_{z\in D}|c_z| < \infty\Big\}.
\]
For x ∈ A_1 we put
\[
(2.34)\quad \|x\|_{A_1} = \inf\Big\{\sum_{z\in D}|c_z| : (c_z) \subset K \text{ and } x = \sum_{z\in D} c_z z\Big\}.
\]
Theorem 2.3.1 ([DT]). Assume D is a dictionary of a separable Hilbert space H. Let x ∈ A_1(D), assume that (G_n) = (G_n(x)) is defined as in (PGA), and let x_n = x - G_n, for n ∈ N. Then
\[
(2.35)\quad \|x_n\| \le \|x\|_{A_1}\, n^{-1/6} \ \text{ for } n \in \mathbb N.
\]
For the proof of Theorem 2.3.1 we need the following observation.
Lemma 2.3.2. Assume that (ξ_m) is a sequence of positive numbers so that for some number A > 0
\[
(2.36)\quad \xi_1 \le A \ \text{ and }\ \xi_{m+1} \le \xi_m\big(1 - \xi_m/A\big), \ \text{ for } m \ge 1.
\]
Then
\[
(2.37)\quad \xi_m \le \frac{A}{m}, \ \text{ for all } m \in \mathbb N.
\]
Proof. We may assume A = 1 (pass to \tilde\xi_m = \xi_m/A). We prove the claim by induction on m ∈ N. For m = 1, (2.37) follows from the assumption. Assume that the claim is true for m ∈ N. If ξ_m ≤ \frac{1}{m+1}, then also ξ_{m+1} ≤ \frac{1}{m+1}, since from (2.36) it follows that the sequence (ξ_i) is decreasing. If \frac{1}{m+1} < ξ_m ≤ \frac1m, we deduce that
\[
\xi_{m+1} \le \xi_m(1 - \xi_m) \le \frac1m\Big(1 - \frac{1}{m+1}\Big) = \frac1m\cdot\frac{m}{m+1} = \frac{1}{m+1},
\]
which implies the claim for m+1 and finishes the induction step.
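The bound of Lemma 2.3.2 can be checked by iterating the extremal recursion (where (2.36) holds with equality); the starting value is an arbitrary assumption:

```python
# Numeric check of Lemma 2.3.2: iterating xi_{m+1} = xi_m (1 - xi_m / A)
# from xi_1 <= A stays below the bound A/m for every m.
A = 2.5
xi = 0.9 * A
for m in range(1, 10001):
    assert xi <= A / m + 1e-12
    xi = xi * (1 - xi / A)
```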
Proof of Theorem 2.3.1. For x ∈ H \ {0} we put
\[
\rho(x) = \sup_{z\in D}\frac{\langle x, z\rangle}{\|x\|}.
\]
Note that if x ∈ A_1, η > 0 and (c_z)_{z\in D} ⊂ R^+ is such that x = \sum_{z\in D} c_z z and \sum_{z\in D} c_z ≤ η + \|x\|_{A_1}, it follows that
\[
\|x\|^2 = \Big\langle x, \sum_{z\in D} c_z z\Big\rangle \le \sum_{z\in D} c_z \sup_{z\in D}\langle z, x\rangle \le \big(\|x\|_{A_1} + \eta\big)\|x\|\rho(x),
\]
and thus, since η > 0 was arbitrary,
\[
(2.38)\quad \rho(x) \ge \frac{\|x\|}{\|x\|_{A_1}}.
\]
Let x ∈ A_1 and let us assume that there is a representation x = \sum_{z\in D} c_z z so that \|x\|_{A_1} = \sum_{z\in D} c_z (otherwise we use arbitrarily good approximations). Let (z_m) and (G_m) be defined as in (PGA) and x_n = x - G_n, for n ∈ N. We note that for m ∈ N_0
\[
(2.39)\quad \|x_{m+1}\|^2 = \big\|x_m - \langle x_m, z_{m+1}\rangle z_{m+1}\big\|^2 = \|x_m\|^2 - \langle x_m, z_{m+1}\rangle^2 = \|x_m\|^2\big(1 - \rho^2(x_m)\big).
\]
Putting a_m = \|x_m\|^2, b_0 = \|x_0\|_{A_1} = \|x\|_{A_1} and, assuming that b_m has been defined, we let b_{m+1} = b_m + ρ(x_m)\|x_m\| = b_m + ρ(x_m)a_m^{1/2}. First we observe that
\[
(2.40)\quad \|x_m\|_{A_1} \le b_m.
\]
Indeed, for m = 0 this simply follows from the definition of b_0, and assuming that (2.40) holds for m ∈ N_0, it follows that
\[
\|x_{m+1}\|_{A_1} = \big\|x_m - \langle x_m, z_{m+1}\rangle z_{m+1}\big\|_{A_1} \le \|x_m\|_{A_1} + |\langle x_m, z_{m+1}\rangle| = \|x_m\|_{A_1} + \rho(x_m)\|x_m\| \le b_{m+1}.
\]
Secondly we compute, using (2.39), (2.38) and (2.40),
\[
a_{m+1} = \|x_{m+1}\|^2 = a_m\big(1 - \rho^2(x_m)\big) \le a_m\Big(1 - \frac{\|x_m\|^2}{\|x_m\|_{A_1}^2}\Big) \le a_m\Big(1 - \frac{a_m}{b_m^2}\Big),
\]
and thus, since b_{m+1} ≥ b_m,
\[
\frac{a_{m+1}}{b_{m+1}^2} \le \frac{a_{m+1}}{b_m^2} \le \frac{a_m}{b_m^2}\Big(1 - \frac{a_m}{b_m^2}\Big).
\]
Note that \frac{a_0}{b_0^2} = \frac{\|x\|^2}{\|x\|_{A_1}^2} \le 1. We can therefore apply Lemma 2.3.2 to the sequence ξ_n = \frac{a_{n-1}}{b_{n-1}^2} (with A = 1) and deduce that
\[
(2.41)\quad a_m b_m^{-2} \le \frac1m, \ \text{ whenever } m \in \mathbb N.
\]
Since by the recursive definition of (b_j), (2.40) and (2.38) we get
\[
b_{m+1} = b_m\big(1 + \rho(x_m)a_m^{1/2}b_m^{-1}\big) \le b_m\big(1 + \rho(x_m)a_m^{1/2}\|x_m\|_{A_1}^{-1}\big) \le b_m\big(1 + \rho^2(x_m)\big),
\]
we obtain, together with (2.39),
\[
a_{m+1} b_{m+1} \le a_m b_m\big(1 - \rho^2(x_m)\big)\big(1 + \rho^2(x_m)\big) \le a_m b_m.
\]
The sequence (a_m b_m) is therefore decreasing and a_m b_m ≤ a_0 b_0 = \|x\|^2\cdot\|x\|_{A_1}. Multiplying both sides of (2.41) by a_m^2 b_m^2 we obtain therefore
\[
a_m^3 \le \frac{a_m^2 b_m^2}{m} \le \frac{\|x\|^4\cdot\|x\|_{A_1}^2}{m},
\]
which implies our claim after taking sixth roots on both sides (and using \|x\| ≤ \|x\|_{A_1}).
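The rate (2.35) can be observed numerically for a finite dictionary (a sketch under assumptions: the random dictionary, and the sum of coefficients serving as an upper bound for the A_1-norm, so the asserted inequality is implied by the theorem):

```python
import numpy as np

# Numeric check of (2.35): for x in A1(D) the (PGA) residuals satisfy
# ||x_n|| <= ||x||_A1 * n^(-1/6); here c.sum() >= ||x||_A1 is used as a bound.
rng = np.random.default_rng(4)
d, m = 15, 60
D = rng.standard_normal((m, d))
D /= np.linalg.norm(D, axis=1, keepdims=True)
D = np.vstack([D, -D])

c = rng.random(8)
x = c @ D[:8]                               # x in A1(D), ||x||_A1 <= c.sum()
G = np.zeros(d)
for n in range(1, 101):
    r = x - G
    z = D[np.argmax(D @ r)]                 # exact greedy step of (PGA)
    G = G + (r @ z) * z
    assert np.linalg.norm(x - G) <= c.sum() * n ** (-1.0 / 6.0) + 1e-9
```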
The next example, due to DeVore and Temlyakov, gives a lower bound for the convergence rate of (PGA).
Example 2.3.3 ([DT]). Let H be a separable Hilbert space and (h_j) an orthonormal basis of H. We will define a dictionary D ⊂ H and a vector x ∈ H for which \|x\|_{A_1(D)} = 2 and so that
\[
\|x_m\| = \|x - G_m\| \ge \frac{c}{\sqrt m}, \ \text{ for } m \in \mathbb N.
\]
Define
\[
a = \Big(\frac{23}{11}\Big)^{1/2} \quad\text{and}\quad A = \Big(\frac{33}{89}\Big)^{1/2},
\]
and
\[
z = A(h_1 + h_2) + aA\sum_{k=3}^\infty \big(k(k+1)\big)^{-1/2} h_k.
\]
Note that
\[
\|z\|^2 = 2A^2 + a^2A^2\sum_{k=3}^\infty \frac{1}{k(k+1)}
= 2A^2 + a^2A^2\sum_{k=3}^\infty\Big(\frac1k - \frac{1}{k+1}\Big)
= 2A^2 + \frac13 a^2A^2 = \frac{33}{89}\Big(2 + \frac{23}{33}\Big) = 1.
\]
Put D = {±z} ∪ {±h_j : j ∈ N}, let x = h_1 + h_2, and apply (PGA) to x.
Claim: In Step 1 of (PGA) we have z_1 = z and
\[
x_1 = x - \langle x, z\rangle z = (1 - 2A^2)(h_1 + h_2) - 2aA^2\sum_{k=3}^\infty\big(k(k+1)\big)^{-1/2} h_k.
\]
Indeed,
\[
\langle x, z\rangle = 2A > 1, \quad \langle x, h_1\rangle = \langle x, h_2\rangle = 1, \quad\text{and}\quad \langle x, h_j\rangle = 0, \text{ if } j > 2.
\]
Thus z_1 = z and
\[
x_1 = x - \langle x, z\rangle z = h_1 + h_2 - 2A\Big(A(h_1 + h_2) + aA\sum_{k=3}^\infty\big(k(k+1)\big)^{-1/2}h_k\Big)
= (1 - 2A^2)(h_1 + h_2) - 2aA^2\sum_{k=3}^\infty\big(k(k+1)\big)^{-1/2}h_k.
\]
Claim: In Steps 2 and 3 of (PGA) we have z_2 = h_1 and z_3 = h_2, or vice versa.
Indeed,
\[
\langle x_1, z\rangle = 0 \ \text{(by construction of } x_1\text{)}, \quad
\langle x_1, h_1\rangle = \langle x_1, h_2\rangle = 1 - 2A^2 = \frac{23}{89}, \quad\text{and}\quad
\langle x_1, h_j\rangle = 2aA^2\big(j(j+1)\big)^{-1/2} < 1 - 2A^2 \ \text{ if } j > 2.
\]
So take w.l.o.g. z_2 = h_1 and thus
\[
x_2 = x_1 - \langle x_1, h_1\rangle h_1 = (1 - 2A^2)h_2 - 2aA^2\sum_{k=3}^\infty\big(k(k+1)\big)^{-1/2}h_k.
\]
Then we observe that
\[
\langle x_2, z\rangle = \langle x_1, z\rangle + \langle x_2 - x_1, z\rangle = -(1 - 2A^2)\langle h_1, z\rangle = -A(1 - 2A^2),
\]
\[
\langle x_2, h_1\rangle = 0, \quad \langle x_2, h_2\rangle = 1 - 2A^2 > |\langle x_2, z\rangle|, \quad
\langle x_2, h_j\rangle = 2aA^2\big(j(j+1)\big)^{-1/2} < 1 - 2A^2 \ \text{ if } j > 2,
\]
which implies that z_3 = h_2 and that
\[
x_3 = x_2 - \langle x_2, h_2\rangle h_2 = -2aA^2\sum_{k=3}^\infty\big(k(k+1)\big)^{-1/2}h_k.
\]
From now on we prove by induction that
\[
z_m = h_{m-1} \quad\text{and}\quad x_m = -2aA^2\sum_{k=m}^\infty\big(k(k+1)\big)^{-1/2}h_k \quad\text{whenever } m \ge 3.
\]
Indeed, for m = 3 this was already shown. Assuming that our claim is true for some m ≥ 3, we compute that
\[
\langle x_m, z\rangle = -2a^2A^3\sum_{k=m}^\infty\frac{1}{k(k+1)} = -\frac{2a^2A^3}{m},
\]
\[
\langle x_m, h_\ell\rangle = 0 \ \text{ for } \ell < m, \quad\text{and}\quad \langle x_m, h_\ell\rangle = -2aA^2\big(\ell(\ell+1)\big)^{-1/2} \ \text{ for } \ell \ge m.
\]
Thus z_{m+1} = h_m and x_{m+1} = -2aA^2\sum_{k=m+1}^\infty\big(k(k+1)\big)^{-1/2}h_k, which finishes the induction step.
Finally note that for m ≥ 3
\[
\|x_m\|^2 = 4a^2A^4\sum_{k=m}^\infty\frac{1}{k(k+1)} = \frac{4a^2A^4}{m}.
\]
Thus \|x\|_{A_1} = 2 and (\|x_m\|) is of the order m^{-1/2}.
Remark. In [KT2] the rate of convergence in Theorem 2.3.1 was slightly improved to C n^{-11/62}, where C is some universal constant. And in [LT] a dictionary D of H was constructed for which there is an x ∈ A_1(D) with
\[
\|x - G_n(x)\| \ge C n^{-0.27}, \ \text{ whenever } n \in \mathbb N.
\]
It is conjectured that the rate of convergence should be of the order of n^{-1/4}.
Theorem 2.3.4 ([Jo]). Consider the Greedy Algorithm with fixed relaxation (G^f_m(x)), x ∈ H, and let x ∈ A_1(D) with \|x\|_{A_1} ≤ c, where c is the constant in (GAFR). Then
\[
(2.42)\quad \|x - G^f_m\| \le \frac{2c}{\sqrt m}, \ \text{ for all } m \in \mathbb N.
\]
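The rate of Theorem 2.3.4 can be observed numerically. This sketch assumes the update rule G_n = (1 - 1/n)G_{n-1} + (c/n)z_n as in (GAFR), a finite random dictionary, and uses the sum of the coefficients of x as an admissible constant c ≥ \|x\|_{A_1}:

```python
import numpy as np

# A sketch of (GAFR) with z_n maximizing <x - G_{n-1}, z>. Theorem 2.3.4 then
# asserts ||x - G_m|| <= 2c/sqrt(m) whenever ||x||_A1 <= c.
rng = np.random.default_rng(3)
d, m = 20, 80
D = rng.standard_normal((m, d))
D /= np.linalg.norm(D, axis=1, keepdims=True)
D = np.vstack([D, -D])

coeffs = rng.random(10)
x = coeffs @ D[:10]                       # ||x||_A1 <= coeffs.sum() =: c
c = coeffs.sum()

G = np.zeros(d)
for n in range(1, 301):
    z = D[np.argmax(D @ (x - G))]
    G = (1 - 1 / n) * G + (c / n) * z     # the fixed-relaxation update
    assert np.linalg.norm(x - G) <= 2 * c / np.sqrt(n) + 1e-9
```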
We first need the following elementary Lemma.
Lemma 2.3.5. Let (a_m) be a sequence of non-negative numbers for which there is an A > 0 so that
\[
(2.43)\quad a_1 \le A \quad\text{and}\quad a_m \le \Big(1 - \frac2m\Big)a_{m-1} + \frac{A}{m^2} \ \text{ if } m \ge 2.
\]
Then
\[
(2.44)\quad a_m \le \frac{A}{m} \ \text{ for all } m \in \mathbb N.
\]
Proof. We prove (2.44) by induction. For m = 1, (2.44) is part of our assumption, and assuming (2.44) is true for m-1, we deduce from the second part of our assumption that
\[
a_m \le \Big(1 - \frac2m\Big)a_{m-1} + \frac{A}{m^2}
\le \Big(1 - \frac2m\Big)\frac{A}{m-1} + \frac{A}{m^2}
= A\Big[\frac{1}{m-1} - \frac{m+1}{m^2(m-1)}\Big]
= A\,\frac{m^2 - m - 1}{m^2(m-1)}
= \frac{A}{m}\cdot\frac{m^2 - m - 1}{m(m-1)} < \frac{A}{m},
\]
which finishes the induction step and the proof of our claim.
Proof of Theorem 2.3.4. W.l.o.g. we can assume that c = 1. Let G^f_n = G^f_n(x) and z_n be as in (GAFR) and let x_n = x - G^f_n, for n ∈ N. We compute:
\[
(2.45)\quad \|x_n\|^2 = \big\|x - G^f_n\big\|^2 = \Big\|x - \Big(1 - \frac1n\Big)G^f_{n-1} - \frac1n z_n\Big\|^2
= \big\|x - G^f_{n-1}\big\|^2 + \frac2n\big\langle x - G^f_{n-1},\, G^f_{n-1} - z_n\big\rangle + \frac{1}{n^2}\big\|G^f_{n-1} - z_n\big\|^2
\]
\[
\le \big\|x - G^f_{n-1}\big\|^2 + \frac2n\big\langle x - G^f_{n-1},\, G^f_{n-1} - z_n\big\rangle + \frac{4}{n^2}
\]
(for the inequality notice that \|G^f_{n-1}\| ≤ \|G^f_{n-1}\|_{A_1} ≤ 1 and \|z_n\| = 1), and
\[
\big\langle x - G^f_{n-1}, G^f_{n-1} - z_n\big\rangle
\le \inf_{z\in D}\big\langle x - G^f_{n-1}, G^f_{n-1} - z\big\rangle
= \inf_{z\in A_1(D),\,\|z\|_{A_1}\le 1}\big\langle x - G^f_{n-1}, G^f_{n-1} - z\big\rangle
\]
\[
= \big\langle x - G^f_{n-1}, G^f_{n-1}\big\rangle - \sup_{z\in A_1(D),\,\|z\|_{A_1}\le 1}\big\langle x - G^f_{n-1}, z\big\rangle
\le \big\langle x - G^f_{n-1}, G^f_{n-1} - x\big\rangle = -\big\|x - G^f_{n-1}\big\|^2.
\]
Inserting this inequality into (2.45) yields
\[
\|x_n\|^2 \le \Big(1 - \frac2n\Big)\big\|x - G^f_{n-1}\big\|^2 + \frac{4}{n^2} = \Big(1 - \frac2n\Big)\|x_{n-1}\|^2 + \frac{4}{n^2},
\]
which together with Lemma 2.3.5 yields our claim.
Theorem 2.3.6. If we consider the Orthogonal Greedy Algorithm (OGA), then for each x ∈ H with \|x\|_{A_1} = \|x\|_{A_1(D)} < ∞ it follows that
\[
(2.46)\quad \|x - G^o_n(x)\| \le \frac{\|x\|_{A_1(D)}}{\sqrt n}.
\]
Proof. Assume that \|x\|_{A_1} = 1, and that G^o_n = G^o_n(x) and z_n are given as in (OGA), i.e.
\[
\langle x - G^o_{n-1}, z_n\rangle = \sup_{z\in D}\langle x - G^o_{n-1}, z\rangle,
\]
and G^o_n is the orthogonal projection P_{Z_n}(x) of x onto Z_n = span(z_1, z_2, ..., z_n). As usual we put x_n = x - G^o_n. Since G^o_n is the best approximation of x by elements of Z_n, it follows that
\[
(2.47)\quad \|x_n\|^2 \le \big\|x_{n-1} - \langle x_{n-1}, z_n\rangle z_n\big\|^2
= \|x_{n-1}\|^2 - \langle x_{n-1}, z_n\rangle^2
= \|x_{n-1}\|^2\Big(1 - \Big(\frac{\langle x_{n-1}, z_n\rangle}{\|x_{n-1}\|}\Big)^2\Big).
\]
(W.l.o.g. x_{n-1} ≠ 0, otherwise we would be done.) Write x as x = \sum_{z\in D} c_z z, with \sum c_z = 1 and c_z ≥ 0 for z ∈ D. Then
\[
\|x_{n-1}\|^2 = \langle x - G^o_{n-1}, x - G^o_{n-1}\rangle = \langle x - G^o_{n-1}, x\rangle \quad (\text{since } x - G^o_{n-1} \perp G^o_{n-1})
\]
\[
= \Big\langle x - G^o_{n-1}, \sum_{z\in D} c_z z\Big\rangle
\le \sum_{z\in D} c_z\,\langle x - G^o_{n-1}, z_n\rangle
= \langle x_{n-1}, z_n\rangle
= \|x_{n-1}\|\,\frac{\langle x_{n-1}, z_n\rangle}{\|x_{n-1}\|},
\]
and thus by (2.47)
\[
\|x_n\|^2 \le \|x_{n-1}\|^2\big(1 - \|x_{n-1}\|^2\big).
\]
Our claim follows therefore again from Lemma 2.3.2 (with A = 1).
Chapter 3
Greedy Algorithms in general Banach Spaces
3.1 Introduction
The algorithms from Chapter 2 can be generalized to separable Banach spaces X. Again let D ⊂ S_X be a dictionary of X, i.e. X = \overline{\mathrm{span}}(D), and with z ∈ D it also follows that -z ∈ D.
(XGA) The X-Greedy Algorithm.
For x ∈ X we define G_n = G_n(x), for each n ∈ N_0, by induction. G_0 = 0 and assuming that G_0, G_1, ..., G_{n-1} have been defined for some n ∈ N we proceed as follows:
1) Choose z_n ∈ D and a_n ∈ R so that
\[
\|x - G_{n-1} - a_n z_n\| = \inf_{z\in D,\ a\ge 0}\|x - G_{n-1} - a z\|.
\]
2) Put G_n = G_{n-1} + a_n z_n.
As in the Hilbert space case, the "inf" in the above defined algorithm (XGA) may not be attained. In this case we can consider the following modification.
(WXGA) The Weak X-Greedy Algorithm.
We are given a sequence τ = (t_n) ⊂ (0,1). For x ∈ X we define G_n = G_n(x), for each n ∈ N_0, by induction. G_0 = 0 and assuming that G_0, G_1, ..., G_{n-1} have been defined for some n ∈ N we proceed as follows:
1) Choose z_n ∈ D and a_n ∈ R so that
\[
t_n\|x - G_{n-1} - a_n z_n\| \le \inf_{z\in D,\ a\in\mathbb R}\|x - G_{n-1} - a z\|.
\]
2) Put G_n = G_{n-1} + a_n z_n.
The following example shows that without any further conditions one can "get stuck" pretty easily:
Example 3.1.1. On R^2 consider the \ell_\infty^2 norm
\[
\|(x,y)\|_\infty = \max(|x|, |y|), \ \text{ for } x, y \in \mathbb R,
\]
and let D = {±e_1, ±e_2} be the dictionary. Now for the vector x = (1,1) we have
\[
\inf_{a\ge 0,\ z\in D}\|x - az\|_\infty = 1 = \|x\|_\infty.
\]
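The stuck configuration of Example 3.1.1 can be verified by brute force over a grid of step sizes (the grid is an assumption of this small check):

```python
import numpy as np

# Example 3.1.1: in (R^2, ||.||_inf) with D = {±e1, ±e2} and x = (1, 1),
# no single greedy step a*z can reduce the norm below ||x||_inf = 1.
x = np.array([1.0, 1.0])
dictionary = [np.array([1.0, 0.0]), np.array([0.0, 1.0]),
              np.array([-1.0, 0.0]), np.array([0.0, -1.0])]
best = min(
    np.max(np.abs(x - a * z))
    for z in dictionary
    for a in np.linspace(-3, 3, 601)
)
assert abs(best - 1.0) < 1e-9
```

The obstruction disappears in smooth spaces, which is why smoothness is assumed next.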
In order to avoid cases like in Example 3.1.1 we will assume that our space
X is at least smooth:
Definition 3.1.2. A Banach space X is called smooth if for every x ∈ X \ {0} there is a unique support functional f_x ∈ X^*, i.e. with \|f_x\| = 1 and f_x(x) = \|x\|.
Remark. As shown for example in [Schl, Theorem 4.1.3], the assumption that X is smooth is equivalent to the condition that the norm is Gateaux differentiable on X \ {0}. In that case it follows for x_0 ∈ X \ {0} that
\[
f_{x_0}(y) = \frac{\partial}{\partial\lambda}\|x_0 + \lambda y\|\Big|_{\lambda=0} = \lim_{h\to 0}\frac{\|x_0 + hy\| - \|x_0\|}{h}, \quad\text{for all } y \in S_X.
\]
This implies that the X-greedy algorithm or the weak X-greedy algorithm cannot become stationary at a point x_0 ≠ 0.
Indeed, if for all z ∈ D and all λ
\[
\|x_0 - \lambda z\| \ge \|x_0\|,
\]
it follows that for all z ∈ D
\[
f_{x_0}(z) = \lim_{h\to 0}\frac{\|x_0 + hz\| - \|x_0\|}{h} = 0.
\]
Since span(D) is dense, this would mean that f_{x_0} = 0, which is a contradiction.
In the Hilbert space case, minimizing ‖x − az‖ over all z ∈ D and all a ∈ R is equivalent to maximizing ⟨x, z⟩ over all z ∈ D. Generalizing this to Banach spaces will lead to a different algorithm, i.e. to an algorithm which does not coincide with (XGA).
(DGA) The Dual Greedy Algorithm.
For x ∈ X we define Gdn = Gdn(x), for each n ∈ N0, by induction as follows. Gd0 = 0 and assuming Gd0, Gd1, …, Gdn−1 have been defined,
1) choose zn ∈ D so that
fx−Gdn−1(zn) = sup_{z∈D} fx−Gdn−1(z),
2) and then an so that
‖(x − Gdn−1) − an zn‖ = min_{a∈R} ‖(x − Gdn−1) − az‖.
Then put Gdn = Gdn−1 + an zn.
Similar to XGA we can also define the weak version of (DGA) and denote it by
(WDGA).
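In finite dimensions both steps of (DGA) can be carried out directly. Below is a sketch in R^d with the ℓp norm (p = 3 here), using the explicit form of the support functional, fx(y) = ‖x‖^{1−p} Σ sign(xi)|xi|^{p−1} yi, recalled later in these notes for Lp; the dictionary, the starting point and the ternary-search line minimization are illustrative choices, not part of the algorithm's definition.

```python
import math

P = 3.0  # an illustrative choice of p in (1, oo)

def norm(v):
    return sum(abs(c) ** P for c in v) ** (1.0 / P)

def f(x, y):
    """Support functional f_x applied to y (requires x != 0)."""
    s = sum(math.copysign(abs(c) ** (P - 1), c) * w for c, w in zip(x, y))
    return s / norm(x) ** (P - 1)

def line_min(r, z, lo=-10.0, hi=10.0, iters=200):
    """Ternary search for argmin_a ||r - a z||; the map is convex in a."""
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        n1 = norm([c - m1 * w for c, w in zip(r, z)])
        n2 = norm([c - m2 * w for c, w in zip(r, z)])
        if n1 < n2:
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

def dga(x, D, steps):
    G = [0.0] * len(x)
    for _ in range(steps):
        r = [c - g for c, g in zip(x, G)]        # residual x - G_{n-1}
        if norm(r) < 1e-12:
            break
        z = max(D, key=lambda z: f(r, z))        # 1) maximize f_{x-G}(z)
        a = line_min(r, z)                       # 2) best step size a_n
        G = [g + a * w for g, w in zip(G, z)]    # G_n = G_{n-1} + a_n z_n
    return G

D = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
x = [2.0, -1.0]
G = dga(x, D, 10)
print(norm([c - g for c, g in zip(x, G)]))  # residual norm is ~ 0
```

With the symmetric unit vector dictionary the residual here vanishes after two steps, up to the precision of the line search.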
(XGDAR) The X-Greedy Dual Algorithm with Relaxation.
We are given a sequence ρ = (rn) ⊂ [0, 1). For x ∈ X we define Grn(ρ), for each n ∈ N0, by induction. Gr0(ρ) = 0 and assuming that Gr0(ρ), Gr1(ρ), …, Grn−1(ρ) have been defined for some n ∈ N we proceed as follows:
1) Choose zn ∈ D and an ∈ R so that
‖x − (1 − rn)Grn−1(ρ) − an zn‖ ≤ inf_{z∈D, a∈R} ‖x − (1 − rn)Grn−1(ρ) − az‖.
2) Put Grn(ρ) = (1 − rn)Grn−1(ρ) + an zn.
The following algorithm is a generalization of the Orthogonal Greedy Algorithm for Hilbert spaces.
(CDGA) The Chebyshev Dual Greedy Algorithm.
For x ∈ X we define GCn, for each n ∈ N0, by induction. GC0 = 0 and assuming that GC0, GC1, …, GCn−1, and vectors z1, …, zn−1, have been defined for some n ∈ N we proceed as follows:
1) Choose zn ∈ D so that
fx−GCn−1(zn) ≥ sup_{z∈D} fx−GCn−1(z).
2) Define Zn = span(z1, z2, …, zn) and let GCn be the (or a) best approximation of x to Zn.
Similar to (WXGA) there are weak versions of (XGDAR) and (CDGA), which we denote by (WXGDAR) and (WCDGA), respectively.
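In a Hilbert space (CDGA) reduces to the Orthogonal Greedy Algorithm: the best approximation from Zn is the orthogonal projection onto Zn. A minimal sketch in Euclidean R^3; taking an orthonormal (symmetric) dictionary is an illustrative choice that keeps the projection trivial.

```python
# Chebyshev step in Hilbert space: after each greedy selection we
# re-project x onto the span of all chosen dictionary elements.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

D = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
D = D + [tuple(-c for c in z) for z in D]           # D = -D
x = (3.0, -2.0, 1.0)

chosen = []
G = (0.0, 0.0, 0.0)
for n in range(3):
    r = tuple(c - g for c, g in zip(x, G))          # residual x - G_{n-1}
    chosen.append(max(D, key=lambda z: dot(r, z)))  # f_{x-G}(z) = <x-G, z>
    # best approximation of x from span(chosen) = orthogonal projection
    # (valid here because the chosen vectors are orthonormal):
    G = tuple(sum(dot(x, z) * z[i] for z in chosen) for i in range(3))
print(G)  # (3.0, -2.0, 1.0): x is recovered after three steps
```

For a general (non-orthonormal) dictionary the projection would require solving the normal equations; the point of (WDGAFR) below is precisely to avoid this growing subproblem.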
The following algorithm is of a different nature than the previous ones. We will assume that (ei) is a semi normalized basis of X.
Before discussing these algorithms and their convergence properties we will need to introduce the following strengthening of smoothness of Banach spaces.
Definition 3.1.3. For a Banach space X define the modulus of uniform smoothness by
(3.1) ρ(u) = ρX(u) = sup_{x,y∈SX} [ (‖x + uy‖ + ‖x − uy‖)/2 − 1 ].
We say that X is uniformly smooth if
(3.2) lim_{u→0} ρ(u)/u = 0.
Remark. We will use the modulus of uniform smoothness as follows. For some x, y ∈ X \ {0} (not necessarily in SX) we would like to find an upper estimate for ‖x − y‖, and write
‖x − y‖ = ‖x‖ · ‖ x/‖x‖ − (‖y‖/‖x‖) · y/‖y‖ ‖ ≤ 2‖x‖ (1 + ρ(‖y‖/‖x‖)) − ‖x + y‖.
Example 3.1.4. As was shown in [Schl], the spaces Lp[0, 1] are uniformly smooth if 1 < p < ∞; more precisely, for X = Lp[0, 1] and u ≥ 0,
(3.3) ρX(u) ≤ cp u^p if 1 < p ≤ 2, and ρX(u) ≤ (p − 1)u²/2 if 2 ≤ p < ∞.
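For the Euclidean plane (a Hilbert space) the supremum in (3.1) can be evaluated in closed form, ρ(u) = √(1 + u²) − 1 ≤ u²/2, which makes a convenient sanity check; the sampling grid below is an illustrative discretization of SX × SX.

```python
import math

def rho_estimate(u, samples=200):
    """Sample definition (3.1) over pairs x, y on the unit circle of R^2."""
    best = 0.0
    for i in range(samples):
        for j in range(samples):
            a = 2 * math.pi * i / samples
            b = 2 * math.pi * j / samples
            x0, x1 = math.cos(a), math.sin(a)
            y0, y1 = math.cos(b), math.sin(b)
            plus = math.hypot(x0 + u * y0, x1 + u * y1)
            minus = math.hypot(x0 - u * y0, x1 - u * y1)
            best = max(best, (plus + minus) / 2 - 1)
    return best

u = 0.5
print(rho_estimate(u), math.sqrt(1 + u * u) - 1)  # the two values agree
```

The sampled supremum is attained at orthogonal pairs x ⟂ y, where ‖x + uy‖ = ‖x − uy‖ = √(1 + u²).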
Proposition 3.1.5. For any Banach space X, ρ is a convex and even function and
max(0, u − 1) ≤ ρ(u) ≤ u, for u ≥ 0.
Lemma 3.1.6. Let x ≠ 0 and y ∈ X. Then for u ∈ R
(3.4) 0 ≤ ‖x + uy‖ − ‖x‖ − u fx(y) ≤ 2‖x‖ ρ(u‖y‖/‖x‖).
Proof. First we note that
‖x + uy‖ ≥ fx(x + uy) = ‖x‖ + u fx(y),
which implies the first inequality in (3.4). Secondly, from the definition of ρ, and assuming w.l.o.g. that y ≠ 0, it follows that
‖x + uy‖ + ‖x − uy‖ = ‖x‖ · ‖ x/‖x‖ + (u‖y‖/‖x‖) · y/‖y‖ ‖ + ‖x‖ · ‖ x/‖x‖ − (u‖y‖/‖x‖) · y/‖y‖ ‖ ≤ 2‖x‖ (1 + ρ(u‖y‖/‖x‖)),
and
‖x − uy‖ ≥ fx(x − uy) = ‖x‖ − u fx(y),
and, thus,
‖x + uy‖ ≤ 2‖x‖ (1 + ρ(u‖y‖/‖x‖)) − ‖x − uy‖ ≤ ‖x‖ + 2‖x‖ ρ(u‖y‖/‖x‖) + u fx(y),
which implies our claim.
Corollary 3.1.7. If X is a uniformly smooth Banach space and x ∈ X \ {0}, then
(3.5) fx(y) = d/du ‖x + uy‖ |_{u=0} = lim_{u→0} (‖x + uy‖ − ‖x‖)/u,
and this convergence is uniform in x, y ∈ SX. The norm is therefore Fréchet differentiable.
Proof. Note that by (3.4)
0 ≤ (‖x + uy‖ − ‖x‖)/u − fx(y) ≤ 2 (‖x‖/u) ρ(u‖y‖/‖x‖) →_{u→0} 0.
If ‖y‖ = 1 and ‖x‖ = 1 it follows therefore that
0 ≤ (‖x + uy‖ − ‖x‖)/u − fx(y) ≤ 2ρ(u)/u →_{u→0} 0,
which implies the claimed uniform convergence.
3.2 Convergence of the Weak Dual Chebyshev Greedy Algorithm
Recall the Weak Dual Chebyshev Greedy Algorithm:
We are given a sequence of weakness factors τ = (tn) ⊂ (0, 1) and a dictionary D ⊂ X. For x ∈ X we choose (zn) ⊂ D and Gcn = Gcn(x) as follows. Gc0 = 0 and assuming Gcj, j = 0, 1, …, n − 1, and zj, j = 1, 2, …, n − 1, have been chosen, we let zn ∈ D be so that
fx−Gcn−1(zn) ≥ tn sup_{z∈D} fx−Gcn−1(z).
Then let Zn = span(z1, z2, …, zn) and let Gcn be the (or a) best approximation of x inside Zn.
The main goal of this section is to prove two results of Temlyakov. We will first need the following technical definition.
Definition 3.2.1. Let ρ be an even convex function on [−2, 2] with lim_{u→0} ρ(u)/u = 0 and ρ(2) ≥ 1, let τ = (tn) be a sequence of numbers in (0, 1] and let Θ ∈ (0, 1/2]. Then let ξm = ξm(ρ, τ, Θ) > 0 be the (by the Intermediate Value Theorem uniquely existing) number for which
(3.6) ρ(ξm) = Θ tm ξm.
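Since s(u) = ρ(u)/u increases from 0 at u = 0 to ρ(2)/2 ≥ 1/2 ≥ Θtm at u = 2, the number ξm is the root of s(ξ) = Θtm and can be found by bisection. A sketch, with the illustrative choice ρ(u) = u², for which ξm = Θtm exactly.

```python
def xi(rho, theta, t, lo=1e-12, hi=2.0, iters=100):
    """Bisection for the root of s(u) = rho(u)/u = theta * t on (0, 2]."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if rho(mid) / mid < theta * t:  # s(mid) too small: move right
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(xi(lambda u: u * u, 0.25, 0.5))  # ~ 0.125, since here rho(u)/u = u
```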
Theorem 3.2.2. Let X be a uniformly smooth Banach space and ρ its modulus of uniform smoothness, and let τ = (tn) be a sequence of numbers in (0, 1]. Assume that for any Θ > 0 we have
∑_{m=1}^∞ tm ξm(ρ, τ, Θ) = ∞.
We consider the Weak Chebyshev Greedy Algorithm (WCDGA). This means for x ∈ X, Gcn = Gcn(x) and zn ∈ D are such that
fx−Gcn−1(zn) ≥ tn sup_{z∈D} fx−Gcn−1(z),
and Gcn is the best approximation of x to Zn = span(z1, z2, …, zn). Let xn = x − Gcn, for n ∈ N. Then lim_{n→∞} ‖xn‖ = 0.
Theorem 3.2.3. Let X be a uniformly smooth Banach space and assume that its modulus of uniform smoothness ρ satisfies ρ(u) ≤ γu^q for some q ∈ (1, 2] and γ ≥ 1. Let τ = (tn) be a sequence of numbers in (0, 1]. For x ∈ X, assume that x ∈ A1(D) and let Gcn = Gcn(x) be defined as in Theorem 3.2.2. Then there is a constant C(q, γ), only depending on q and γ, so that for all n ∈ N
(3.7) ‖x − Gcn‖ ≤ C(q, γ) ‖x‖A1(D) (1 + ∑_{k=1}^n t_k^p)^{−1/p},
where p = q/(q − 1).
We will first need some lemmas.
Lemma 3.2.4. Let X be a uniformly smooth Banach space and let Z ⊂ X be a finite dimensional subspace. If y is the best approximation of some x ∈ X \ Z from Z, then
fx−y(z) = 0, for all z ∈ Z.
Proof. Assume to the contrary that there is a z ∈ Z, ‖z‖ = 1, so that β = fx−y(z) > 0. By the definition of ρ it follows for any λ > 0 that
(3.8) ‖x − y − λz‖ + ‖x − y + λz‖ ≤ 2‖x − y‖ (1 + ρ(λ/‖x − y‖)),
and secondly
(3.9) ‖x − y + λz‖ ≥ fx−y(x − y + λz) = ‖x − y‖ + λβ.
It follows therefore from (3.8) and (3.9) that
‖x − y − λz‖ ≤ 2‖x − y‖ (1 + ρ(λ/‖x − y‖)) − ‖x − y + λz‖ ≤ ‖x − y‖ + 2‖x − y‖ ρ(λ/‖x − y‖) − λβ = ‖x − y‖ − λ ( β − 2 ρ(λ/‖x − y‖)/(λ/‖x − y‖) ).
Since ρ(u)/u →_{u→0} 0, it follows that for λ > 0 small enough ‖x − y − λz‖ < ‖x − y‖, and since y + λz ∈ Z this contradicts the choice of y as best approximation.
Lemma 3.2.5. For any x* ∈ X* we have
(3.10) sup_{z∈D} x*(z) = sup_{x∈A1(D), ‖x‖A1 ≤ 1} x*(x).
Proof. For x = ∑_{z∈D} cz z, with cz ≥ 0, for z ∈ D, and ∑_{z∈D} cz ≤ 1, it follows that
x*(x) = ∑_{z∈D} cz x*(z) ≤ sup_{z∈D} x*(z),
and thus
sup_{z∈D} x*(z) ≥ sup_{x∈A1(D), ‖x‖A1 ≤ 1} x*(x).
The reverse inequality is trivial.
Lemma 3.2.6. Let X be a uniformly smooth Banach space and ρ its modulus of uniform smoothness, and let τ = (tn) be a sequence of numbers in (0, 1]. For x ∈ X let Gcn = Gcn(x) be defined as in (WCDGA).
Assume that x ∈ X and that for some ε ≥ 0 there is an xε ∈ X so that ‖x − xε‖ ≤ ε and xε ∈ A1(D). Then it follows for all n ∈ N
(3.11) ‖x − Gcn‖/‖x − Gcn−1‖ ≤ inf_{λ≥0} [ 1 − (λtn/‖xε‖A1)(1 − ε/‖x − Gcn−1‖) + 2ρ(λ/‖x − Gcn−1‖) ].
Proof. Abbreviate A = ‖xε‖A1. Let zn ∈ D be chosen as in (WCDGA) and put xn = x − Gcn, for n ∈ N.
From the definition of ρ, it follows for every λ ≥ 0 that
(3.12) ‖xn−1 − λzn‖ + ‖xn−1 + λzn‖ ≤ 2‖xn−1‖ (1 + ρ(λ/‖xn−1‖)),
and by the definition of (WCDGA) and Lemma 3.2.5 it follows that
fxn−1(zn) ≥ tn sup_{z∈D} fxn−1(z) = tn sup_{z∈A1(D), ‖z‖A1≤1} fxn−1(z) ≥ (tn/A) fxn−1(xε).
From Lemma 3.2.4 (note that fxn−1(Gcn−1) = 0, since Gcn−1 ∈ Zn−1) we deduce that
fxn−1(xε) = fxn−1(x + xε − x) ≥ fxn−1(x) − ε = fxn−1(xn−1) − ε = ‖xn−1‖ − ε,
and thus
‖xn−1 + λzn‖ ≥ ‖xn−1‖ + λ fxn−1(zn) ≥ ‖xn−1‖ + (λtn/A) fxn−1(xε) ≥ ‖xn−1‖ (1 + λtn/A) − (λtn/A) ε.
Finally (3.12) yields
‖xn‖ ≤ inf_{λ≥0} ‖xn−1 − λzn‖ ≤ inf_{λ≥0} [ 2‖xn−1‖ (1 + ρ(λ/‖xn−1‖)) − ‖xn−1 + λzn‖ ] ≤ inf_{λ≥0} ‖xn−1‖ [ 1 + 2ρ(λ/‖xn−1‖) − (λtn/A)(1 − ε/‖xn−1‖) ],
which proves our assertion.
Proof of Theorem 3.2.2. Let xn = x − Gcn, for n ∈ N. By construction, the sequence (‖xn‖ : n ∈ N) is decreasing and thus α = lim_{n→∞} ‖xn‖ exists, and we have to show that α = 0.
We assume that α > 0 and will deduce a contradiction. Let ε = α/2. Since span(D) is dense in X we find an xε ∈ span(D) so that ‖x − xε‖ < ε, and denote A = ‖xε‖A1.
From Lemma 3.2.6 we deduce that
‖xn‖ ≤ ‖xn−1‖ inf_{λ≥0} [ 1 + 2ρ(λ/‖xn−1‖) − (λtn/A)(1 − ε/‖xn−1‖) ] ≤ ‖xn−1‖ inf_{λ≥0} [ 1 + 2ρ(λ/α) − λtn/(2A) ],
since ‖xn−1‖ ≥ α and ε/‖xn−1‖ ≤ 1/2. We let Θ = α/(8A) and take λ = αξn, with ξn = ξn(ρ, τ, Θ) (recall that this means that ρ(ξn) = Θtnξn), and obtain that
‖xn‖ ≤ ‖xn−1‖ [ 1 + 2Θtnξn − 4Θtnξn ] = ‖xn−1‖ [ 1 − 2Θtnξn ],
and thus
‖xn‖ = ‖x‖ + ∑_{j=1}^n (‖xj‖ − ‖xj−1‖) ≤ ‖x‖ − ∑_{j=1}^n ‖xj−1‖ 2Θtjξj ≤ ‖x‖ − α ∑_{j=1}^n 2Θtjξj.
But this contradicts our assumption that ∑ tnξn = ∞.
Before proving Theorem 3.2.3 we need one more lemma.
Lemma 3.2.7. Assume that (an) and (sn) are sequences of positive numbers satisfying for some A > 0 the following assumptions:
(3.13) a1 ≤ A/(1 + s1) and an ≤ an−1 (1 − (sn/A) an−1), for n ≥ 2.
Then it follows for all n ∈ N
(3.14) an ≤ A (1 + ∑_{j=1}^n sj)^{−1}.
Proof. We prove (3.14) by induction. For n = 1 this is the assumption. Assuming our claim is true for n − 1 we obtain
an ≤ an−1 (1 − (sn/A) an−1) ≤ A (1 + ∑_{j=1}^{n−1} sj)^{−1} ( 1 − sn/(1 + ∑_{j=1}^{n−1} sj) ) ≤ A (1 + ∑_{j=1}^n sj)^{−1}
(the last inequality follows from cross multiplication).
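The conclusion of Lemma 3.2.7 is easy to check numerically: iterate the recursion at equality and compare against the bound (3.14). The particular sequence (sn) below is an illustrative choice.

```python
# Numerical check of Lemma 3.2.7 along one orbit of the recursion.
A = 2.0
s = [0.5 + 0.4 * ((-1) ** n) for n in range(1, 21)]  # s_n in (0, 1]
a = A / (1 + s[0])                                   # equality for n = 1
S = s[0]
ok = True
for sn in s[1:]:
    a = a * (1 - (sn / A) * a)                       # recursion at equality
    S += sn
    ok = ok and (a <= A / (1 + S) + 1e-12)
print(ok)  # True: the bound (3.14) holds along the whole orbit
```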
Proof of Theorem 3.2.3. W.l.o.g. we assume ‖x‖A1 = 1. Let the sequences (zn) and (Gcn) = (Gcn(x)) be given as in (WCDGA) and let xn = x − Gcn, for n ∈ N. By Lemma 3.2.6 with ε = 0 we obtain
(3.15) ‖xn‖ ≤ ‖xn−1‖ inf_{λ≥0} [ 1 − λtn + 2γ (λ/‖xn−1‖)^q ].
We choose λ such that
(1/2) λtn = 2γ (λ/‖xn−1‖)^q, i.e. λ = ‖xn−1‖^{q/(q−1)} (4γ)^{−1/(q−1)} tn^{1/(q−1)}.
Abbreviating Aq = 2(4γ)^{1/(q−1)} and p = q/(q − 1), and inserting this choice of λ into (3.15), we obtain
‖xn‖ ≤ ‖xn−1‖ (1 − (1/2) λtn) = ‖xn−1‖ (1 − ‖xn−1‖^p tn^p / Aq).
Taking the pth power on each side and using the fact that x ≥ x^p if 0 < x ≤ 1, we obtain
‖xn‖^p ≤ ‖xn−1‖^p (1 − ‖xn−1‖^p tn^p / Aq).
Since γ ≥ 1, and thus Aq > 2, and since ‖x‖ ≤ ‖x‖A1 = 1, we can apply Lemma 3.2.7 to an = ‖xn‖^p, sn = tn^p, and A = Aq, and deduce that for all n ∈ N
‖xn‖^p ≤ Aq (1 + ∑_{j=1}^n tj^p)^{−1},
which implies the claim of our Theorem.
3.3 Weak Dual Greedy Algorithm with Relaxation
For the Weak Chebyshev Dual Greedy Algorithm we need to approximate x by an element of the n-dimensional space Zn, which might be computationally complicated. The following algorithm is a compromise between the Chebyshev Dual Greedy Algorithm and the Dual Greedy Algorithm. Here we only have to find a good approximation in a two dimensional subspace:
(WDGAFR) The Weak Dual Greedy Algorithm with Free Relaxation.
As usual we are given a sequence of weakness factors τ = (tn) ⊂ (0, 1] and a dictionary D ⊂ SX. For x ∈ X we define Grn = Grn(x), n ∈ N, as follows: Gr0 = 0 and assuming Grn−1 has been defined we choose zn ∈ D so that
fx−Grn−1(zn) ≥ tn sup_{z∈D} fx−Grn−1(z),
and then we let wn and λn be so that
‖x − (1 − wn)Grn−1 − λn zn‖ = inf_{λ,w} ‖x − (1 − w)Grn−1 − λzn‖,
and define Grn = (1 − wn)Grn−1 + λn zn.
Proposition 3.3.1. For all x ∈ X, ‖x − Grn(x)‖ is decreasing in n ∈ N.
We will need the following analog of Lemma 3.2.6.
Lemma 3.3.2. Assume that X is a uniformly smooth Banach space and denote its modulus of uniform smoothness by ρ. Let x ∈ X, ‖x‖ ≥ ε ≥ 0 and xε ∈ A1, so that ‖x − xε‖ < ε. For n ∈ N put xn = x − Gdrn, where Gdrn = Grn(x) is given by (WDGAFR). Then
(3.16) ‖xn‖/‖xn−1‖ ≤ inf_{λ≥0} [ 1 − (λtn/‖xε‖A1)(1 − ε/‖xn−1‖) + 2ρ(5λ/‖xn−1‖) ].
Proof. From the definition of ρ we deduce for w ∈ R and λ ≥ 0 that
(3.17) ‖xn−1 + wGdrn−1 − λzn‖ + ‖xn−1 − wGdrn−1 + λzn‖ ≤ 2‖xn−1‖ ( 1 + ρ(‖wGdrn−1 − λzn‖/‖xn−1‖) ).
For all w ∈ R and λ ≥ 0 we estimate
(3.18) ‖xn−1 − wGdrn−1 + λzn‖ ≥ fxn−1(xn−1 − wGdrn−1 + λzn)
≥ ‖xn−1‖ − fxn−1(wGdrn−1) + λtn sup_{z∈D} fxn−1(z)
= ‖xn−1‖ − fxn−1(wGdrn−1) + λtn sup_{z∈A1, ‖z‖A1≤1} fxn−1(z)   (by Lemma 3.2.5)
≥ ‖xn−1‖ − fxn−1(wGdrn−1) + (λtn/‖xε‖A1) fxn−1(xε)
≥ ‖xn−1‖ − fxn−1(wGdrn−1) + (λtn/‖xε‖A1) fxn−1(x) − λtnε/‖xε‖A1.
Letting w* = λtn/‖xε‖A1 we deduce that
(3.19) ‖xn−1 − w*Gdrn−1 + λzn‖ ≥ ‖xn−1‖ + (λtn/‖xε‖A1) fxn−1(x − Gdrn−1) − λtnε/‖xε‖A1 ≥ ‖xn−1‖ + (λtn/‖xε‖A1) ‖xn−1‖ − λtnε/‖xε‖A1.
Thus we obtain
‖xn‖ = inf_{λ≥0, w∈R} ‖xn−1 + wGdrn−1 − λzn‖
≤ inf_{λ≥0, w∈R} [ 2‖xn−1‖ ( 1 + ρ(‖wGdrn−1 − λzn‖/‖xn−1‖) ) − ‖xn−1 − wGdrn−1 + λzn‖ ]
≤ inf_{λ≥0} [ 2‖xn−1‖ ( 1 + ρ(‖w*Gdrn−1 − λzn‖/‖xn−1‖) ) − ‖xn−1‖ − (λtn/‖xε‖A1) ‖xn−1‖ + λtnε/‖xε‖A1 ]
= ‖xn−1‖ inf_{λ≥0} [ 1 − (λtn/‖xε‖A1)(1 − ε/‖xn−1‖) + 2ρ(‖w*Gdrn−1 − λzn‖/‖xn−1‖) ].
In order to achieve (3.16) we need to estimate ‖w*Gdrn−1 − λzn‖. First we note that
‖Gdrn−1‖ = ‖x − xn−1‖ ≤ 2‖x‖ ≤ 2‖xε‖A1 + 2ε ≤ 4‖xε‖A1,
and thus
‖w*Gdrn−1 − λzn‖ ≤ 4w*‖xε‖A1 + λ ≤ 5λ,
which implies our claim since ρ(·) is increasing on [0, ∞).
Remark. Before stating the next theorem, let us note that if ρ is an even and convex function on R, with ρ(0) = 0 and lim_{u→0} ρ(u)/u = 0, then the function s: u ↦ ρ(u)/u is increasing on [0, ∞), and thus has an inverse function s−1 which is also increasing and satisfies s−1(0) = 0.
Theorem 3.3.3. Assume that X is a separable and uniformly smooth Banach space, and denote its modulus of uniform smoothness by ρ. Let s−1(·) be the inverse function of s: u ↦ ρ(u)/u, u ≥ 0. We consider the (WDGAFR) with a sequence of weakness factors τ = (tn) ⊂ (0, 1] satisfying
(3.20) ∑_{n=1}^∞ tn s−1(Θtn) = ∞ for all Θ > 0.
Then for any x ∈ X
(3.21) lim_{n→∞} ‖x − Gdrn(x)‖ = 0.
Proof. Let Gdrn = Gdrn(x) and xn = x − Gdrn, for n ∈ N. Since (‖xn‖) decreases,
β = lim_{n→∞} ‖xn‖
exists and we need to show that β = 0.
We assume that β > 0 and will derive a contradiction. We set ε = β/2, choose xε ∈ A1(D) with ‖x − xε‖ < ε, and abbreviate A = ‖xε‖A1. Note that ‖x‖ ≥ β ≥ ε.
By Lemma 3.3.2 we have
‖xn‖ ≤ ‖xn−1‖ inf_{λ≥0} [ 1 − λtn/(2A) + 2ρ(5λ/β) ].
Putting Θ = β/(40A) and λ = βs−1(Θtn)/5, we obtain
‖xn‖ ≤ ‖xn−1‖ [ 1 − βtn s−1(Θtn)/(10A) + 2ρ(s−1(Θtn)) ]
= ‖xn−1‖ [ 1 − 4Θtn s−1(Θtn) + 2s−1(Θtn) ρ(s−1(Θtn))/s−1(Θtn) ]
= ‖xn−1‖ [ 1 − 4Θtn s−1(Θtn) + 2Θtn s−1(Θtn) ] = ‖xn−1‖ [ 1 − 2Θtn s−1(Θtn) ].
We can iterate this inequality and obtain
‖xn‖ ≤ ‖xn−1‖ − ‖xn−1‖ 2Θtn s−1(Θtn)
≤ ‖xn−1‖ − 2βΘ tn s−1(Θtn)
≤ ‖xn−2‖ − 2βΘ tn−1 s−1(Θtn−1) − 2βΘ tn s−1(Θtn)
⋮
≤ ‖x‖ − 2βΘ ∑_{j=1}^n tj s−1(Θtj),
and thus, letting n → ∞,
β ≤ ‖x‖ − 2βΘ ∑_{j=1}^∞ tj s−1(Θtj),
which is a contradiction since we assumed that ∑_{j=1}^∞ tj s−1(Θtj) = ∞.
There is also a result on the rate of convergence in Theorem 3.3.3. Since the proof is similar to the proof of the corresponding result for the Chebyshev Greedy Algorithm, we omit it.
Theorem 3.3.4. Let X be a uniformly smooth Banach space with modulus of uniform smoothness ρ, which satisfies for some q ∈ (1, 2]
(3.22) ρ(u) ≤ γu^q.
Then there is a number C only depending on q and γ so that the following holds. If x ∈ X, ε > 0 and xε ∈ A1 are so that ‖x − xε‖ < ε, and if τ = (tn) ⊂ (0, 1], then it follows for the (WDGAFR) (Gdrn) = (Gdrn(x)) with weakness factors (tn) that
(3.23) ‖x − Gdrn‖ ≤ max( 2ε, C(‖xε‖A1 + ε)(1 + ∑_{j=1}^n tj^p)^{−1/p} ),
where p = q/(q − 1).
Remark. We would like to point out something about the proof of Theorem 3.3.4 which will be useful when we consider the next greedy algorithm.
In the proof of Theorem 3.3.4 it was only used that the sequence (‖xn‖ : n ∈ N) is decreasing and that inequality (3.16) holds. Of course, in order to prove this inequality we needed the specific choice of zn, namely in the second “≥” of (3.18). Thus if (Gn(x)) is any algorithm for which ‖x − Gn(x)‖ is decreasing and inequality (3.16) holds, then Gn(x) converges to x.
Keeping that remark in mind we now turn to the X-Greedy Algorithm with Free Relaxation.
(XGAFR) The X-Greedy Algorithm with Free Relaxation.
As usual we are given a dictionary D ⊂ SX. For x ∈ X we define Grn = Grn(x), n ∈ N, as follows: Gr0 = 0 and assuming Grn−1 has been defined we choose λn ≥ 0, wn ∈ R and zn ∈ D so that
‖x − (1 − wn)Grn−1 − λn zn‖ = inf_{z∈D, λ≥0, w∈R} ‖x − (1 − w)Grn−1 − λz‖,
and then define
Grn = (1 − wn)Grn−1 + λn zn.
We notice that at each step the value of
‖x − Grn(x)‖ / ‖x − Grn−1(x)‖
is at most as large as the value we would have obtained if we had computed Gdrn(x) from xn−1. We deduce therefore that the conclusion of Lemma 3.3.2 still holds, and that if 0 ≤ ε ≤ ‖x‖ and if xε ∈ A1 with ‖x − xε‖ ≤ ε, it follows that
(3.24) ‖x − Grn(x)‖/‖x − Grn−1(x)‖ ≤ inf_{λ≥0} [ 1 − (λtn/‖xε‖A1)(1 − ε/‖xn−1‖) + 2ρ(5λ/‖xn−1‖) ].
From the previously made remark we deduce therefore the following convergence result.
Theorem 3.3.5. Assume that X is a separable and uniformly smooth Banach space. We consider the (XGAFR) (Grn(x) : n ∈ N), for x ∈ X. Then for any x ∈ X
(3.25) lim_{n→∞} ‖x − Grn(x)‖ = 0.
3.4 Convergence Theorem for the Weak Dual Algorithm
For a Banach space X assume that f(·): X \ {0} → SX*, x ↦ fx, is a support map, i.e. for every x ∈ X \ {0} we have fx(x) = ‖x‖. Recall from the remark after Definition 3.1.2 that every x ∈ X \ {0} has a unique support functional fx if and only if the norm is Gateaux differentiable, in which case
fx0(y) = (1/‖x0‖) ∂/∂λ ‖x0 + λy‖ |_{λ=0} = (1/‖x0‖) lim_{h→0} (‖x0 + hy‖ − ‖x0‖)/h,
for all y ∈ SX.
Let D ⊂ SX be a dictionary for X and put for x ∈ X:
ρ(x) = ρD(x) = sup_{z∈D} fx(z).
We consider the Weak Dual Greedy Algorithm with fixed weakness factor as in Section 3.1, but slightly reformulated:
(WDGA) Fix c ∈ (0, 1). For x ∈ X we choose sequences (xn)n≥0 ⊂ X, (zn)n≥1 ⊂ D and (tn) ⊂ [0, ∞) recursively as follows.
x0 = x, and assuming that xn−1 ∈ X has been chosen, we choose zn ∈ D arbitrary and tn = 0 if xn−1 = 0, and otherwise we choose zn ∈ D and tn ≥ 0 so that
a) fxn−1(zn) ≥ c ρD(xn−1) = c sup_{z∈D} fxn−1(z),
b) ‖xn−1 − tn zn‖ = min_{t≥0} ‖xn−1 − tzn‖,
and in both cases we finally let
c) Gn = Gn−1 + tn zn and xn = x − Gn = xn−1 − tn zn.
We say that the weak dual greedy algorithm converges for D if, for all x ∈ X,
lim_{n→∞} xn = 0 or, equivalently, lim_{n→∞} ∑_{i=1}^n ti zi = x.
Lemma 3.4.1. Let X have Gateaux differentiable norm, 0 < c < 1, and assume that for x ∈ X the sequences (xn), (tn), and (zn) satisfy (a) and (c) of (WDGA), but instead of (b) the following condition:
d) (‖xn−1‖ − ‖xn‖)/tn ≥ c ρ(xn−1), for all n ∈ N.
Then, if ∑_{n=1}^∞ tn = ∞, we have x = ∑_{n=1}^∞ tn zn.
Proof. Define sn = ∑_{i=1}^n ti, for n ∈ N. Then
exp( ∑_{j=2}^n ln((sj − tj)/sj) ) = ∏_{j=2}^n sj−1/sj = s1/sn →_{n→∞} 0,
and thus
lim_{n→∞} ∑_{j=2}^n ln((sj − tj)/sj) = −∞.
It follows that
∞ = −∑_{j=2}^∞ ln((sj − tj)/sj) = −∑_{j=2}^∞ ln(1 − tj/sj) ≤ ∑_{j=2}^∞ ( tj/sj + tj²/sj² ),
and thus ∑_{j=2}^∞ tj/sj = ∞.
We note that if (an) and (bn) are two positive sequences and ∑ an < ∞, while ∑ bn = ∞, then there is a subsequence (nk) of N so that lim_{k→∞} ank/bnk = 0. Indeed, for every k ∈ N the set Nk = {n ∈ N : k an < bn} must be infinite, and we can therefore choose n1 < n2 < n3 < …, with nk ∈ Nk, for k ∈ N.
Thus we can find n1 < n2 < n3 < … so that
lim_{k→∞} snk+1 (‖xnk‖ − ‖xnk+1‖)/tnk+1 = 0.
It follows from (d), applied with n = nk + 1, that
0 ≤ snk ρD(xnk) ≤ (1/c) snk (‖xnk‖ − ‖xnk+1‖)/tnk+1 ≤ (1/c) snk+1 (‖xnk‖ − ‖xnk+1‖)/tnk+1 →_{k→∞} 0,
and thus, in particular,
(3.26) lim_{k→∞} ρD(xnk) = 0.
For 1 ≤ l ≤ nk − 1 we have (using D = −D):
(3.27) ‖xnk‖ − fxnk(xl) = fxnk( −∑_{j=l+1}^{nk} tj zj ) ≤ ∑_{j=l+1}^{nk} tj ρD(xnk) ≤ snk ρD(xnk) →_{k→∞} 0.
Now assume that x* is a w*-cluster point of the sequence (fxnk) and let L = lim_{n→∞} ‖xn‖. Then, by (3.27), it follows that x*(xl) = L for all l ∈ N. We now claim that this implies that L = 0. Indeed, otherwise x* ≠ 0, and thus, since span(D) = X and D = −D, θ = sup_{z∈D} x*(z) > 0, and thus
lim sup_{k→∞} sup_{z∈D} fxnk(z) ≥ θ,
which contradicts (3.26).
Lemma 3.4.2. Suppose 1 < p < ∞. Then there is a Cp > 0 such that for any a, b ∈ R
(3.28) b|a + b|^{p−1} sign(a + b) − b|a|^{p−1} sign(a) ≤ Cp ( |a + b|^p − pb|a|^{p−1} sign(a) − |a|^p ).
Proof. First, note that replacing a and b simultaneously by −a and −b, the inequality does not change. We can therefore assume that a ≥ 0. Also note that for b = 0 we have equality if we let Cp = 1. Thus we can assume that a > 0 and, since both sides of (3.28) are p-homogeneous, we can assume that a = 1, and also that b ≠ 0. We need therefore to show that
φ(b) = b ( |1 + b|^{p−1} sign(1 + b) − 1 ) / ( |1 + b|^p − pb − 1 ), b ≠ 0,
has an upper bound. We note that, by L'Hôpital's rule,
lim_{b→0} φ(b) = lim_{b→0} ( b(1 + b)^{p−1} − b ) / ( (1 + b)^p − pb − 1 )
= lim_{b→0} ( (1 + b)^{p−1} + (p − 1)b(1 + b)^{p−2} − 1 ) / ( p(1 + b)^{p−1} − p )
= lim_{b→0} ( 2(p − 1)(1 + b)^{p−2} + (p − 1)(p − 2)b(1 + b)^{p−3} ) / ( p(p − 1)(1 + b)^{p−2} ) = 2/p,
and
lim_{b→±∞} φ(b) = 1,
which implies the claim since φ is continuous.
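Inequality (3.28) is also easy to probe numerically. The sketch below checks it on a grid for p = 3; the constant Cp = 2 happens to suffice on this grid (it is not claimed to be the optimal constant of the lemma).

```python
# Grid check of (3.28) for p = 3 with the empirical constant Cp = 2.
def sgn(t):
    return (t > 0) - (t < 0)

p, Cp = 3.0, 2.0
worst = 0.0
for i in range(-100, 101):
    for j in range(-100, 101):
        a, b = i / 10.0, j / 10.0
        lhs = b * abs(a + b) ** (p - 1) * sgn(a + b) - b * abs(a) ** (p - 1) * sgn(a)
        rhs = Cp * (abs(a + b) ** p - p * b * abs(a) ** (p - 1) * sgn(a) - abs(a) ** p)
        worst = max(worst, lhs - rhs)
print(worst)  # <= 0: the inequality holds on the whole grid
```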
Definition 3.4.3. A Banach space X with Gateaux differentiable norm is said to have property Γ if there is a constant 0 < γ ≤ 1 so that for all x, y ∈ X for which fx(y) = 0 it follows that
‖x + y‖ ≥ ‖x‖ + γ fx+y(y).
Remark. For x, y ∈ Lp[0, 1], fx(y) = 0 means that
(3.29) ∫_0^1 sign(x(t)) |x(t)|^{p−1} y(t) dt = ∫_0^1 sign(x(t)) |x(t)|^{p/q} y(t) dt = 0,
where q = p/(p − 1) is the conjugate exponent.
Proposition 3.4.4. If 1 < p < ∞, every quotient of Lp[0, 1] has property Γ.
Proof. We first show that Lp[0, 1] itself has property Γ. So let x, y ∈ Lp[0, 1] with fx(y) = 0. We can assume that y ≠ 0, and, after dividing x and y by ‖y‖p, that ‖y‖p = 1.
By Lemma 3.4.2,
y(s)|x(s) + y(s)|^{p−1} sign(x(s) + y(s)) ≤ Cp ( |x(s) + y(s)|^p − |x(s)|^p ) + (1 − pCp) y(s)|x(s)|^{p−1} sign(x(s)).
Integrating both sides and using (3.29) yields
∫_0^1 y(s)|x(s) + y(s)|^{p−1} sign(x(s) + y(s)) ds ≤ Cp ( ‖x + y‖p^p − ‖x‖p^p ).
It follows from the fact that fx(y) is a positive multiple of d/dt ‖x + ty‖p |_{t=0}, and the convexity of the function t ↦ ‖x + ty‖p, that ‖x‖p ≤ ‖x + y‖p. Moreover we have
fz = ‖z‖p^{1−p} sign(z(·)) |z(·)|^{p−1} ∈ SLq, for z ∈ Lp[0, 1],
and thus
‖x + y‖p^{p−1} fx+y(y) = ∫_0^1 y(s)|x(s) + y(s)|^{p−1} sign(x(s) + y(s)) ds
≤ Cp ( ‖x + y‖p^p − ‖x‖p^p )
= Cp ( ‖x + y‖p − ‖x‖p ) p ξ^{p−1}, for some ξ ∈ [‖x‖p, ‖x + y‖p] (by the Mean Value Theorem applied to u ↦ u^p)
≤ Cp ( ‖x + y‖p − ‖x‖p ) p ‖x + y‖p^{p−1},
which proves our claim if we let γ = 1/(pCp). From the following more general proposition it will follow that every quotient of Lp[0, 1] has property Γ.
Proposition 3.4.5. The quotient of a reflexive space X with property Γ and Gateaux differentiable norm also has property Γ (with respect to the same constant γ).
Proof. Assume that Y = X/Z, where X is a reflexive space with property Γ and Z ⊂ X is a closed subspace of X.
Let x̄, ȳ ∈ Y and let x, y ∈ X with x̄ = x + Z and ȳ = y + Z be representatives under the quotient map Q: X → Y. Since X is reflexive we can assume that ‖x̄‖X/Z = ‖x‖X, and we can find an element w ∈ X with Q(w) = x̄ + ȳ so that ‖x̄ + ȳ‖X/Z = ‖w‖X.
Note that fx̄ ∘ Q = fx (since fx̄(Q(x)) = fx̄(x̄) = ‖x̄‖X/Z = ‖x‖X) and fx̄+ȳ ∘ Q = fw. Hence if fx̄(ȳ) = 0 it follows that fx(w − x) = 0, and thus
‖x̄ + ȳ‖ = ‖w‖ = ‖x + (w − x)‖ ≥ ‖x‖ + γ fx+(w−x)(w − x) = ‖x̄‖ + γ fx̄+ȳ(ȳ).
Now we are ready to show the final result, which implies in particular that the (WDGA) converges in Lp[0, 1] for any dictionary D.
Theorem 3.4.6. Suppose X is a Banach space with property Γ and Fréchet differentiable norm. If D is a dictionary of X and 0 < c ≤ 1, then the (WDGA) converges.
Proof. By Proposition 4.2.2 (Class Notes in Functional Analysis), the map x ↦ fx is a norm continuous map between X \ {0} and SX*.
Let x = x0 be in X and let (xn), (zn) and (tn) be as in (WDGA). If tn = 0 for some n ∈ N then ρD(xn−1) = 0, and since D is total it follows that xn−1 = 0 and thus xk = 0, for all k ≥ n. So we can assume without loss of generality that tn > 0 for all n ∈ N.
By condition (b) it follows that
d/dt ‖xn−1 − tzn‖ |_{t=tn} = 0,
and thus fxn(zn) = 0, which yields, using property Γ, that
‖xn−1‖ = ‖xn + tn zn‖ ≥ ‖xn‖ + γ tn fxn−1(zn),
and thus
(‖xn−1‖ − ‖xn‖)/tn ≥ γ fxn−1(zn) ≥ cγ ρD(xn−1).
Using Lemma 3.4.1, with cγ in place of c, we only need to show that lim_{n→∞} ‖xn‖ = 0 if ∑_{n=1}^∞ tn < ∞.
If ∑_{n=1}^∞ tn < ∞ then (xn) converges to some x∞ ∈ X. To deduce a contradiction assume that x∞ ≠ 0, which implies that lim_{n→∞} ‖fxn − fx∞‖ = 0. Now since, as observed previously, fxn(zn) = 0, we have that lim_{n→∞} fxn−1(zn) = 0, and thus by (WDGA)(a) lim_{n→∞} ρD(xn−1) = 0, and thus for any z ∈ D
fx∞(z) = lim_{n→∞} fxn−1(z) ≤ lim_{n→∞} ρD(xn−1) = 0,
and similarly, since D = −D,
−fx∞(z) = fx∞(−z) = lim_{n→∞} fxn−1(−z) ≤ lim_{n→∞} ρD(xn−1) = 0,
which implies that fx∞(z) = 0 for all z ∈ D, and thus x∞ = 0, which is a contradiction and proves our claim.
Remark. Assume that the Banach space X has Gateaux differentiable norm and let x ∈ X. As in (WXGA), (xn) ⊂ X, (zn) ⊂ D and (tn) ⊂ [0, ∞) are chosen so that x0 = x, xn = xn−1 − tn zn, for n ∈ N, and so that for some c ∈ (0, 1]
‖xn−1‖ − ‖xn‖ ≥ c sup_{g∈D} sup_{t≥0} ( ‖xn−1‖ − ‖xn−1 − tg‖ ).
We note that, if (xn) has a convergent subsequence (for example if ∑n tn < ∞), then (xn) has to converge to 0.
Indeed, assume that x∞ = lim_{k→∞} xnk exists for some subsequence (nk). We claim x∞ = 0, and this would imply that (xn) converges to 0, since (‖xn‖) is decreasing.
Assume that x∞ ≠ 0. Then, since D is total in X, it follows that sup_{g∈D} fx∞(g) > 0, and thus there exist a g ∈ D and a t > 0 so that ε = ‖x∞‖ − ‖x∞ − tg‖ > 0. But this would imply that for some k0 ∈ N
‖xnk‖ − ‖xnk+1‖ ≥ c ( ‖xnk‖ − ‖xnk − tg‖ ) ≥ cε/2, whenever k ≥ k0.
But this is a contradiction since (‖xn‖ − ‖xn+1‖) is a non negative and summable sequence.
Chapter 4
Open Problems
4.1 Greedy Bases
Problem 4.1.1. Does every infinite dimensional Banach space contain a quasi greedy basis?
Comments to Problem 4.1.1: First of all, there are separable Banach spaces which have a basis but do not have quasi greedy bases (for the whole space). Indeed, in [DKK] it was shown that an L∞-space (for example any C(K), K compact) which is not isomorphic to c0 does not have a quasi greedy basis.
Secondly, Gowers and Maurey solved the unconditional basis problem and showed that not every separable Banach space contains an unconditional basic sequence (which is stronger than being quasi greedy). Nevertheless, in [DKK], Dilworth, Kalton and Kutzarova proved that all known counterexamples to the unconditional basis problem actually contain quasi greedy basic sequences. They showed the following:
Theorem 4.1.2. Let (xn) be a semi normalized weakly null sequence in a Banach space X with spreading model (en), and suppose that (en) has the property that
‖ ∑_{i=1}^n ei ‖ → ∞, as n → ∞.
Then (xn) has a subsequence which is quasi greedy and whose quasi greedy constant does not exceed 3 + ε (for given ε > 0).
Here the spreading model of a semi normalized sequence (xn) ⊂ X is defined as follows:
Assume that for all k ∈ N and all scalars (aj)_{j=1}^k
||| ∑_{j=1}^k aj ej ||| = lim_{n1→∞} lim_{n2→∞} … lim_{nk→∞} ‖ ∑_{j=1}^k aj xnj ‖
exists. It is clear that ||| · ||| is a semi norm on c00. Using Ramsey's Theorem (some kind of generalized pigeonhole principle) one can prove that every semi normalized sequence has a subsequence so that the above limit exists for all (aj) ∈ c00, and that if (xn) is weakly null or basic, then ||| · ||| is a norm on c00 and (en) is a basis of the completion of c00 with respect to ||| · |||. In this case we call this completion together with its basis (en) the spreading model of (xn).
For more comments on the problem and its relation to the problem whether or not the Elton number has a universal upper bound see also [DOSZ2].
Problem 4.1.3. Does `p(`q), 1 < p, q < ∞ and p ≠ q, have a greedy basis?
Comments to Problem 4.1.3: Besov spaces are function spaces defined on the real line or on some subset of it and are of importance, for example, in Partial Differential Equations and Approximation Theory. It can be shown that Besov spaces of functions defined on R are isomorphic to `p(`q), 1 < p ≠ q < ∞, where
`p(`q) = { (xn) : xn ∈ `q, for n ∈ N, and ∑_n ‖xn‖_q^p < ∞ },
with the norm
‖(xn)‖ = ( ∑_n ‖xn‖_q^p )^{1/p}, for (xn) ∈ `p(`q).
It was shown in [EW] that for the space `p ⊕ `q, p ≠ q, every unconditional basis (xn) of `p ⊕ `q splits into a basis of `p and a basis of `q, i.e. there is a partition of N into N1 and N2, so that (xn : n ∈ N1) is a basis of `p and (xn : n ∈ N2) is a basis of `q. From that result it is easy to see that `p ⊕ `q cannot have a greedy basis. On the other hand the space (⊕_{n=1}^∞ `q^{mn})_p, with 1 < p < ∞ and 1 ≤ q ≤ ∞, and mn → ∞, for n → ∞, which is isomorphic to Besov spaces of functions defined on the torus (or any closed bounded interval in R), has a greedy basis [DFOS]. In the case p = 1, ∞ it was shown in [BCLT] that (⊕_{n=1}^∞ `2^{mn})_p has a unique unconditional basis up to permutation, and thus this space cannot have a greedy basis (the usual one is clearly not democratic). From the proof in [BCLT] we can also deduce that for general q ∈ [1, ∞] the spaces (⊕_{n=1}^∞ `q^{mn})_1 and (⊕_{n=1}^∞ `q^{mn})_{c0} have no greedy bases, unless, of course, in the trivial case that p = q = ∞ or p = q = 1.
Problem 4.1.4. Given any ε > 0, can a Banach space with a normalized greedy basis (en) be renormed so that (en) is (1 + ε)-greedy?
Comments to 4.1.4: First Albiac and Wojtaszczyk [AW] asked whether or not every Banach space with a normalized greedy basis (en) can be renormed so that (en) is 1-greedy. This was solved negatively (recall that by Theorem 1.1.9 every 1-greedy basis must be 1-democratic) in [DOSZ2], where the following was shown:
Proposition 4.1.5. Assume that X is a Banach space with a normalized suppression 1-unconditional basis (ei) and that there is a sequence (ρn) ⊂ (0, 1] with ρ = inf_{n∈N} ρn > 0 so that
‖ ∑_{i∈E} ei ‖ = ρn n, whenever n ∈ N and E ⊂ N with #E = n.
Then (ei) is equivalent, with a constant depending only on ρ, to the unit vector basis of `1.
Corollary 4.1.6.
1. The Hardy space H1 cannot be renormed so that the Haar basis in H1 (which is greedy) is 1-greedy.
2. Tsirelson space T1 cannot be renormed so that it has a 1-greedy basis.
Problem 4.1.7. Can Lp[0, 1], 1 < p < ∞, be renormed so that the Haar basis becomes 1-greedy, or at least (1 + ε)-greedy?
Comments to Problem 4.1.7: In Corollary 1.2.2 it was shown that the Haar basis in Lp[0, 1] is greedy, and in [DOSZ2] it was shown that one can renorm Lp[0, 1] so that the Haar basis is 1-democratic and 1-unconditional. But the fact that a basis is 1-democratic and suppression 1-unconditional does not imply that it is 1-greedy. From Theorem 1.1.9 it follows that every suppression 1-unconditional and 1-democratic basis is at least 2-greedy, and in [DOSZ2] it was shown that this is optimal: for any ε > 0 there is a basic sequence which is 1-democratic and suppression 1-unconditional (even 1-unconditional) but which is not (2 − ε)-greedy.
4.2 Greedy Algorithms
For Hilbert space it is still open to find better convergence rates of the pure greedy algorithm, but for general Banach spaces the questions about the convergence of the X-Greedy Algorithm are quite basic and wide open.
Problem 4.2.1. Find one infinite dimensional Banach space X, other than Hilbert space, on which (X-PGA) converges for any dictionary.
Problem 4.2.2. Does (X-PGA) converge on `p, 1 < p < ∞, p ≠ 2, for any dictionary?
Comments on Problem 4.2.2: In [DKSTW] at least the weak convergence of
the Pure Greedy Algorithm was shown:
Theorem 4.2.3. [DKSTW, Theorem 3.2] Suppose that for each n ∈ N, X_n is a finite dimensional space whose norm is Gâteaux differentiable. Then the weak pure greedy algorithm with fixed weakness factor converges weakly.
Problem 4.2.4. Does (X-PGA) converge in L_p[0, 1], 1 < p < ∞, p ≠ 2, at least if one takes the Haar basis as dictionary?
Comments on Problem 4.2.4: The following finite dimensional version of Problem 4.2.4 was shown in [DOSZ3].
Theorem 4.2.5. Let 1 < p < ∞ and let (h_j : j ∈ N) be the Haar basis of L_p (ordered consistently with the usual partial order). For each m there is a number N = N(p, m) so that the X-PGA terminates after N steps, assuming that the starting point was chosen in span(h_j : j ≤ m).
Problem 4.2.6. Are there examples of separable and uniformly smooth Banach
spaces X with dictionaries for which the (weak) dual greedy algorithm does not
converge?
Chapter 5
Appendix A: Bases in Banach
spaces
5.1 Schauder bases
In this section we recall some of the notions and results presented in the course
on Functional Analysis in Fall 2012 [Schl]. Like every vector space, a Banach space X admits an algebraic or Hamel basis, i.e. a subset B ⊂ X, so that every
x ∈ X is in a unique way the (finite) linear combination of elements in B. This
definition does not take into account that we can take infinite sums in Banach
spaces and that we might want to represent elements in X as converging series.
Hamel bases are also not very useful for Banach spaces, since (see Exercise 1),
the coordinate functionals might not be continuous.
Definition 5.1.1. [Schauder bases of Banach Spaces]
Let X be an infinite dimensional Banach space. A sequence (e_n) ⊂ X is called a Schauder basis of X, or simply a basis of X, if for every x ∈ X there is a
unique sequence of scalars (a_n) ⊂ K so that
\[ x = \sum_{n=1}^{\infty} a_n e_n. \]
Examples 5.1.2. For n ∈ N let
\[ e_n = (\underbrace{0, \dots, 0}_{n-1 \text{ times}}, 1, 0, \dots) \in \mathbb K^{\mathbb N}. \]
Then (e_n) is a basis of ℓ_p, 1 ≤ p < ∞, and of c_0. We call (e_n) the unit vector basis of ℓ_p and c_0, respectively.
Remarks. Assume that X is a Banach space and (en ) a basis of X. Then
a) (e_n) is linearly independent.
b) span(en : n ∈ N) is dense in X, in particular X is separable.
c) Every element x is uniquely determined by the sequence (a_n) so that $x = \sum_{n=1}^{\infty} a_n e_n$. So we can identify X with a space of sequences in $\mathbb K^{\mathbb N}$.
Proposition 5.1.3. Let (e_n) be a Schauder basis of a Banach space X. For n ∈ N and x ∈ X define e^*_n(x) ∈ K to be the unique scalar so that
\[ x = \sum_{n=1}^{\infty} e^*_n(x)\, e_n. \]
Then e^*_n : X → K is linear.
For n ∈ N let
\[ P_n : X \to \operatorname{span}(e_j : j\le n), \qquad x \mapsto \sum_{j=1}^{n} e^*_j(x)\, e_j. \]
Then Pn : X → X are linear projections onto span(ej : j ≤ n) and the following
properties hold:
a) dim(Pn (X)) = n,
b) Pn ◦ Pm = Pm ◦ Pn = Pmin(m,n) , for m, n ∈ N,
c) limn→∞ Pn (x) = x, for every x ∈ X.
Pn , n ∈ N, are called the Canonical Projections for (en ) and (e∗n ) the Coordinate
Functionals for (en ) or biorthogonals for (en ).
Theorem 5.1.4. Let X be a Banach space with a basis (e_n) and let (e^*_n) be the corresponding coordinate functionals and (P_n) the canonical projections. Then P_n is bounded for every n ∈ N,
\[ b = \sup_{n\in\mathbb N} \|P_n\|_{L(X,X)} < \infty, \]
and thus e^*_n ∈ X^* with
\[ \|e^*_n\|_{X^*} = \frac{\|P_n - P_{n-1}\|}{\|e_n\|} \le \frac{2b}{\|e_n\|}. \]
We call b the basis constant of (ej ). If b = 1 we say that (ei ) is a monotone
basis.
Furthermore,
\[ |||\cdot||| : X \to \mathbb R_0^+, \qquad \sum_{j=1}^{\infty} a_j e_j \mapsto \Big|\Big|\Big| \sum_{j=1}^{\infty} a_j e_j \Big|\Big|\Big| = \sup_{n\in\mathbb N} \Big\| \sum_{j=1}^{n} a_j e_j \Big\|, \]
is an equivalent norm under which (e_j) is a monotone basis.
Definition 5.1.5. [Basic Sequences]
Let X be a Banach space. A sequence (xn ) ⊂ X \ {0} is called a basic sequence
if it is a basis for span(xn : n ∈ N).
If (ej ) and (fj ) are two basic sequences (in possibly two different Banach
spaces X and Y ), we say that (ej ) and (fj ) are isomorphically equivalent if the
map
\[ T : \operatorname{span}(e_j : j\in\mathbb N) \to \operatorname{span}(f_j : j\in\mathbb N), \qquad \sum_{j=1}^{n} a_j e_j \mapsto \sum_{j=1}^{n} a_j f_j, \]
extends to an isomorphism between the Banach spaces $\overline{\operatorname{span}}(e_j : j\in\mathbb N)$ and $\overline{\operatorname{span}}(f_j : j\in\mathbb N)$.
Note that this is equivalent to saying that there are constants 0 < c ≤ C so that for any n ∈ N and any sequence of scalars (λ_j)_{j=1}^n it follows that
\[ c \Big\| \sum_{j=1}^{n} \lambda_j e_j \Big\| \le \Big\| \sum_{j=1}^{n} \lambda_j f_j \Big\| \le C \Big\| \sum_{j=1}^{n} \lambda_j e_j \Big\|. \]
Proposition 5.1.6. Let X be a Banach space and (x_n : n ∈ N) ⊂ X \ {0}. Then (x_n) is a basic sequence if and only if there is a constant K ≥ 1, so that for all m < n and all scalars (a_j)_{j=1}^n ⊂ K we have
\[ (5.1)\qquad \Big\| \sum_{i=1}^{m} a_i x_i \Big\| \le K \Big\| \sum_{i=1}^{n} a_i x_i \Big\|. \]
In that case the basis constant is the smallest of all K ≥ 1 so that (5.1) holds.
Theorem 5.1.7. [The Small Perturbation Lemma]
Let (x_n) be a basic sequence in a Banach space X, let (x^*_n) be the coordinate functionals (they are elements of $\overline{\operatorname{span}}(x_j : j\in\mathbb N)^*$), and assume that (y_n) is a sequence in X such that
\[ (5.2)\qquad c = \sum_{n=1}^{\infty} \|x_n - y_n\| \cdot \|x^*_n\| < 1. \]
Then
a) (y_n) is also basic in X and isomorphically equivalent to (x_n); more precisely,
\[ (1-c) \Big\| \sum_{n=1}^{\infty} a_n x_n \Big\| \le \Big\| \sum_{n=1}^{\infty} a_n y_n \Big\| \le (1+c) \Big\| \sum_{n=1}^{\infty} a_n x_n \Big\|, \]
for all in X convergent series $x = \sum_{n\in\mathbb N} a_n x_n$.
b) If span(xj : j ∈ N) is complemented in X, then so is span(yj : j ∈ N).
c) If (xn ) is a Schauder basis of all of X, then (yn ) is also a Schauder basis
of X and it follows for the coordinate functionals (yn∗ ) of (yn ), that yn∗ ∈
span(x∗j : j ∈ N), for n ∈ N.
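The quantitative content of part a) can be tested numerically. In the following sketch (our own illustration) we work in ℓ_2 realized as R^d, take x_n = e_n (so that ‖x^*_n‖ = 1), and perturb to y_n = e_n + 4^{−(n+1)} e_1, which makes the constant c of (5.2) equal to Σ_n 4^{−(n+1)} < 1:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 6
X = np.eye(d)                                 # columns x_n = e_n; here x_n* = e_n as well
pert = np.array([4.0 ** -(n + 1) for n in range(d)])
Y = X + np.outer(X[:, 0], pert)               # columns y_n = e_n + 4^{-(n+1)} e_1
c = pert.sum()                                # c = sum ||x_n - y_n|| ||x_n*|| < 1

for _ in range(100):
    a = rng.normal(size=d)
    nx = np.linalg.norm(X @ a)                # || sum a_n x_n ||
    ny = np.linalg.norm(Y @ a)                # || sum a_n y_n ||
    # the two-sided bound from part a):
    assert (1 - c) * nx <= ny + 1e-9 and ny <= (1 + c) * nx + 1e-9
print("perturbation bounds hold, c =", round(c, 4))
```

Of course this only checks the inequality in a finite dimensional Hilbert space; the point of the lemma is that the same bound holds for arbitrary basic sequences in arbitrary Banach spaces.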
Now we recall the notion of an unconditional basis. First we need the following proposition.
Proposition 5.1.8. For a sequence (x_n) in a Banach space X the following statements are equivalent.
a) For any reordering (also called permutation) σ of N (i.e. σ : N → N is bijective) the series $\sum_{n\in\mathbb N} x_{\sigma(n)}$ converges.
b) For any ε > 0 there is an n ∈ N so that whenever M ⊂ N is finite with min(M) > n, then $\big\| \sum_{m\in M} x_m \big\| < \varepsilon$.
c) For any subsequence (n_j) the series $\sum_{j\in\mathbb N} x_{n_j}$ converges.
d) For any sequence (ε_j) ⊂ {±1} the series $\sum_{j=1}^{\infty} \varepsilon_j x_j$ converges.
In the case that the above conditions hold we say that the series $\sum x_n$ converges unconditionally.
Definition 5.1.9. A basis (e_j) of a Banach space X is called unconditional if for every x ∈ X the expansion $x = \sum_j \langle e^*_j, x\rangle e_j$ converges unconditionally, where (e^*_j) are the coordinate functionals of (e_j).
A sequence (xn ) ⊂ X is called an unconditional basic sequence if (xn ) is an
unconditional basis of span(xj : j ∈ N).
Proposition 5.1.10. For a sequence of nonzero elements (x_j) in a Banach space X the following are equivalent.
a) (x_j) is an unconditional basic sequence,
b) There is a constant C, so that for all finite B ⊂ N, all scalars (a_j)_{j∈B} ⊂ K, and all A ⊂ B,
\[ (5.3)\qquad \Big\| \sum_{j\in A} a_j x_j \Big\| \le C \Big\| \sum_{j\in B} a_j x_j \Big\|. \]
c) There is a constant C′, so that for all finite sets B ⊂ N, all scalars (a_j)_{j∈B} ⊂ K, and all (ε_j)_{j∈B} ⊂ {±1}, if K = R, or (ε_j)_{j∈B} ⊂ {z ∈ C : |z| = 1}, if K = C,
\[ (5.4)\qquad \Big\| \sum_{j\in B} \varepsilon_j a_j x_j \Big\| \le C' \Big\| \sum_{j\in B} a_j x_j \Big\|. \]
In this case we call the smallest constant C = C_s which satisfies (5.3) for all n, all A ⊂ {1, 2, …, n} and all scalars (a_j)_{j=1}^n ⊂ K the suppression-unconditional constant of (x_n), and we call the smallest constant C′ = C_u so that (5.4) holds for all n, all (ε_j)_{j=1}^n ⊂ {±1}, or (ε_j)_{j=1}^n ⊂ {z ∈ C : |z| = 1}, and all scalars (a_j)_{j=1}^n ⊂ K the unconditional constant of (x_n).
Moreover, it follows that
\[ (5.5)\qquad C_s \le C_u \le 2 C_s. \]
Proposition 5.1.11. Let (x_n) be an unconditional basic sequence. Then
\[ (5.6)\qquad C_u = \sup\Big\{ \Big\| \sum_{i=1}^{\infty} a_i b_i x_i \Big\| : x = \sum_{i=1}^{\infty} a_i x_i \in B_X \text{ and } |b_i| \le 1 \Big\}. \]
Remark. While for Schauder bases it is in general important how we order
them, the ordering is not relevant for unconditional bases. We can therefore
index unconditional bases by any countable set.
5.2 Markushevich bases
Not every separable Banach space has a Schauder basis [En]. But every separable Banach space has at least a bounded and norming Markushevich basis, according to a result of Ovsepian and Pelczyński [OP]. We present this result in this section.
Definition 5.2.1. A countable family (en , e∗n )n∈N ⊂ X × X ∗ is called
• biorthogonal, if e^*_n(e_m) = δ_{(m,n)}, for all m, n ∈ N,
• fundamental, or complete, if span(e_n : n ∈ N) is dense in X,
• total, if for any x ∈ X with e^*_n(x) = 0, for all n ∈ N, it follows that x = 0,
• norming, if for some constant c > 0,
\[ \sup_{x^* \in \operatorname{span}(e^*_n : n\in\mathbb N)\,\cap\, B_{X^*}} |x^*(x)| \ge c\|x\|, \quad\text{for all } x\in X, \]
and in that case we also say that (e_n, e^*_n)_{n∈N} is c-norming,
• shrinking, if $\overline{\operatorname{span}}(e^*_n : n\in\mathbb N) = X^*$, and
• bounded, or uniformly minimal, if C = sup_{n∈N} ‖e_n‖ · ‖e^*_n‖ < ∞; in that case we say that (e_n, e^*_n)_{n∈N} is C-bounded and call C the bound of (e_n, e^*_n)_{n∈N}.
A biorthogonal, fundamental and total sequence (e_n, e^*_n)_{n∈N} is called a Markushevich basis, or simply an M-basis.
Remark. Assume (e_n, e^*_n) is an M-basis. It follows from totality that span(e^*_n : n ∈ N) is w^*-dense in X^*. Thus in every reflexive space M-bases are shrinking, and shrinking M-bases are 1-norming.
Our goal is to prove the following.
Theorem 5.2.2. [OP] Every separable Banach space X admits a bounded, norming M-basis, which can be chosen to be shrinking if X^* is (norm) separable. Moreover, the bound of that M-basis can be chosen arbitrarily close to $4(1+\sqrt2)^2$.
Remark. Pelczyński [Pe] later improved the above result and showed that for every separable Banach space and every ε > 0 there exists a bounded M-basis whose bound does not exceed 1 + ε.
It is an open question whether or not every separable Banach space has a 1-bounded M-basis. But it is not hard to show that a space X with a bounded and norming M-basis can be renormed so that this basis becomes 1-bounded and 1-norming.
Remark. It is nice to know that every separable Banach space has a bounded and norming Markushevich basis (e_i, e^*_i). Nevertheless, given z ∈ X, we do not have any procedure (let alone a good one) to approximate z by finite linear combinations of the e_i; we only know that such an approximation exists. This is precisely the difference to Schauder bases, for which we know that the canonical projections converge pointwise.
Lemma 5.2.3. [LT, Lemma 1.a.6] Assume that X is an infinite dimensional space and that F ⊂ X and G^* ⊂ X^* are finite dimensional subspaces of X and X^*, respectively. Let ε > 0. Then there are x ∈ X with ‖x‖ = 1 and x^* ∈ X^* so that ‖x^*‖ ≤ 2 + ε, x^*(x) = 1, z^*(x) = 0 for all z^* ∈ G^*, and x^*(z) = 0 for all z ∈ F.
Proof. Let (y^*_i)_{i=1}^m ⊂ S_{X^*} be finite and 1/(1+ε)-norming the space F. Pick x with ‖x‖ = 1 in
\[ {}^{\perp}\big( \{ y^*_j : j=1,\dots,m \} \cup G^* \big) = \{ z\in X : y^*_j(z) = 0,\ j=1,\dots,m,\ \text{and } z^*(z) = 0 \text{ for all } z^*\in G^* \}. \]
It follows for all λ ∈ R and all y ∈ F that
\[ \|y + \lambda x\| \ge \max_j |y^*_j(y + \lambda x)| = \max_j |y^*_j(y)| \ge \frac{\|y\|}{1+\varepsilon}. \]
Then define
u∗ : span(F ∪ {x}) → R,
y + λx 7→ λ.
We claim that ‖u^*‖ ≤ 2 + ε. Indeed, let y ∈ F, y ≠ 0, and λ ∈ R. Then
\[ u^*\Big( \frac{\lambda x + y}{\|\lambda x + y\|} \Big) = \frac{|\lambda|}{\|\lambda x + y\|} \le \begin{cases} \dfrac{2\|y\|}{\|\lambda x + y\|} \le 2(1+\varepsilon) & \text{if } |\lambda| \le 2\|y\|, \\[2mm] \dfrac{2\|y\|}{2\|y\| - \|y\|} = 2 & \text{if } |\lambda| > 2\|y\|, \end{cases} \]
where in the second case we used $\|\lambda x + y\| \ge |\lambda| - \|y\|$. Letting now x^* be a Hahn-Banach extension of u^* to all of X, our claim is proved.
Lemma 5.2.4. ([Ma], see also [HMVZ, Lemma 1.21])
Let X be an infinite dimensional Banach space. Suppose that (z_n) ⊂ X and (z^*_n) ⊂ X^* are sequences so that span(z_n : n ∈ N) and span(z^*_n : n ∈ N) are both infinite dimensional and so that
(M1) (z_n) separates the points of span(z^*_n : n ∈ N),
(M2) (z^*_n) separates the points of span(z_n : n ∈ N).
Let N ⊂ N be co-infinite, and ε > 0.
Then we can choose a biorthogonal system
(xn , x∗n ) ⊂ span(zn : n ∈ N) × span(zn∗ : n ∈ N)
with
(5.7) span(z_n : n ∈ N) ⊂ span(x_n : n ∈ N) and span(z^*_n : n ∈ N) ⊂ span(x^*_n : n ∈ N),
(5.8) $\sup_{n\in\mathbb N} \|x_n\| \cdot \|x^*_n\| < 2 + \varepsilon$.
Remark. Note that (x_n, x^*_n) is an M-basis if span(z_n : n ∈ N) is dense in X and if (z^*_n) separates the points of X. It is norming if $B_{X^*} \cap \operatorname{span}(z^*_n : n\in\mathbb N)$ is norming for X.
Proof. Choose s_1 = min{n ∈ N : z_n ≠ 0} and x_1 = z_{s_1}/‖z_{s_1}‖. If 1 ∈ N choose x^*_1 ∈ S_{X^*} with x^*_1(x_1) = 1. Otherwise choose x^*_1 = z^*_{t_1} with t_1 = min{m ∈ N : z^*_m(x_1) ≠ 0} (which exists by (M2)).
Write N \ N = {k_1, k_2, …} and proceed by induction to choose x_1, x_2, …, x_n and x^*_1, …, x^*_n as follows.
Assume that x_1, x_2, …, x_n and x^*_1, …, x^*_n have been chosen.
Case 1: n + 1 ∈ N . Then let F = span(xi : i ≤ n) and G∗ = span(x∗i : i ≤ n);
and choose xn+1 and x∗n+1 by Lemma 5.2.3.
Case 2: n + 1 = k_{2j−1} ∈ N \ N. Then let
\[ s_{2j-1} = \min\{ s : z_s \not\in \operatorname{span}(x_i : i\le n) \} \]
and define
\[ x_{n+1} = z_{s_{2j-1}} - \sum_{i=1}^{n} x^*_i(z_{s_{2j-1}})\, x_i. \]
This implies that x^*_i(x_{n+1}) = 0 for i = 1, 2, …, n. Next choose (using (M2))
\[ t_{2j-1} = \min\{ t : z^*_t(x_{n+1}) \ne 0 \} \]
and let
\[ x^*_{n+1} = \frac{ z^*_{t_{2j-1}} - \sum_{i=1}^{n} z^*_{t_{2j-1}}(x_i)\, x^*_i }{ z^*_{t_{2j-1}}(x_{n+1}) }, \]
which yields that x^*_{n+1}(x_i) = 0, for i = 1, 2, …, n, and x^*_{n+1}(x_{n+1}) = 1.
Case 3: n + 1 = k_{2j} ∈ N \ N. Then we choose
\[ t_{2j} = \min\{ s : z^*_s \not\in \operatorname{span}(x^*_i : i\le n) \}. \]
Let
\[ x^*_{n+1} = z^*_{t_{2j}} - \sum_{i=1}^{n} z^*_{t_{2j}}(x_i)\, x^*_i, \]
and hence x^*_{n+1}(x_i) = 0, for i = 1, 2, …, n. Then let (using (M1))
\[ s_{2j} = \min\{ s : x^*_{n+1}(z_s) \ne 0 \}, \]
and
\[ x_{n+1} = \frac{ z_{s_{2j}} - \sum_{i=1}^{n} x^*_i(z_{s_{2j}})\, x_i }{ x^*_{n+1}(z_{s_{2j}}) }, \]
which implies that x^*_i(x_{n+1}) = 0, for i = 1, 2, …, n, and x^*_{n+1}(x_{n+1}) = 1.
We ensured by this choice that ((x_i, x^*_i) : i ∈ N) is a biorthogonal sequence in X × X^* which satisfies (5.8), and, since for any m ∈ N we have span(z_i : i ≤ m) ⊂ span(x_{k_{2j−1}} : j ≤ m) and span(z^*_i : i ≤ m) ⊂ span(x^*_{k_{2j}} : j ≤ m), it also satisfies (5.7).
For n ∈ N we consider on $\ell_2^{2^n}$ the discrete Haar basis
\[ \{h_0\} \cup \{ h_{(r,s)} : r = 0,1,\dots,n-1,\ s = 0,1,\dots,2^r-1 \}, \]
with
\[ h_0 = 2^{-n/2}\, \chi_{\{1,2,\dots,2^n\}} \]
and, for r = 0, 1, …, n−1 and s = 0, 1, …, 2^r − 1,
\[ h_{(r,s)} = 2^{-(n-r)/2} \big( \chi_{\{2s\cdot2^{n-r-1}+1,\,\dots,\,(2s+1)2^{n-r-1}\}} - \chi_{\{(2s+1)2^{n-r-1}+1,\,\dots,\,(2s+2)2^{n-r-1}\}} \big). \]
The unit vector basis $(e_i)_{i=1}^{2^n}$ as well as the Haar basis $\{h_0\} \cup \{h_{(r,s)}\}$
are orthonormal bases of $\ell_2^{2^n}$. Thus the matrix A = A^{(n)} with the property that A(e_1) = h_0 and A(e_{2^r+s+1}) = h_{(r,s)} is a unitary matrix. If we write
\[ A^{(n)} = \big( a^{(n)}_{(i,j)} : 0 \le i,j \le 2^n - 1 \big), \]
then it follows for k = 0, 1, …, 2^n − 1 that $a^{(n)}_{(k,0)} = h_0(k) = 2^{-n/2}$, and if r = 0, 1, …, n−1 and s = 0, 1, …, 2^r − 1 that
\[ a^{(n)}_{(k,2^r+s)} = h_{(r,s)}(k) = \begin{cases} 2^{-(n-r)/2} & \text{if } k \in \{2s\cdot2^{n-r-1}+1, \dots, (2s+1)2^{n-r-1}\}, \\ -2^{-(n-r)/2} & \text{if } k \in \{(2s+1)2^{n-r-1}+1, \dots, (2s+2)2^{n-r-1}\}, \\ 0 & \text{if } k \le 2s\cdot2^{n-r-1} \text{ or } k > (2s+2)2^{n-r-1}. \end{cases} \]
Thus every row of A contains the entry $2^{-n/2}$ in the first column and, leaving out the first column, for each r ∈ {0, 1, …, n−1} exactly one entry of absolute value $2^{-(n-r)/2}$, all other entries being 0. It follows therefore that for all k = 0, 1, …, 2^n − 1 we have
\[ (5.9)\qquad \sum_{j=1}^{2^n-1} \big| a^{(n)}_{(k,j)} \big| = \sum_{r=0}^{n-1} 2^{-(n-r)/2} = \sum_{i=1}^{n} \Big( \frac{1}{\sqrt2} \Big)^i \le \frac{1}{\sqrt2-1} = 1 + \sqrt2. \]
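The matrix A^{(n)} and the bound (5.9) can be checked directly; a small numerical verification (our own illustration):

```python
import numpy as np

def discrete_haar_matrix(n):
    """Matrix A^(n): column 0 is h_0, column 2^r + s is h_(r,s)."""
    N = 2 ** n
    A = np.zeros((N, N))
    A[:, 0] = 2.0 ** (-n / 2)
    for r in range(n):
        block = 2 ** (n - r - 1)          # half the support length of h_(r,s)
        for s in range(2 ** r):
            col = 2 ** r + s
            A[2 * s * block:(2 * s + 1) * block, col] = 2.0 ** (-(n - r) / 2)
            A[(2 * s + 1) * block:(2 * s + 2) * block, col] = -2.0 ** (-(n - r) / 2)
    return A

n = 4
A = discrete_haar_matrix(n)
assert np.allclose(A.T @ A, np.eye(2 ** n))      # A is unitary
row_sums = np.abs(A[:, 1:]).sum(axis=1)          # left-hand side of (5.9)
assert row_sums.max() <= 1 + np.sqrt(2)
print("A^(n) unitary, max row sum =", round(row_sums.max(), 4))
```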
This implies the following:
Corollary 5.2.5. If ((x_i, x^*_i) : i = 0, 1, …, 2^n − 1) is a biorthogonal sequence of length 2^n in a Banach space X and we let
\[ (5.10)\qquad e_k = \sum_{j=0}^{2^n-1} a^{(n)}_{(k,j)} x_j \quad\text{and} \]
\[ (5.11)\qquad e^*_k = \sum_{j=0}^{2^n-1} a^{(n)}_{(k,j)} x^*_j \quad\text{for } k = 0, 1, \dots, 2^n-1, \]
then
\[ (5.12)\qquad \max_{0\le k<2^n} \|e_k\| < (1+\sqrt2) \max_{1\le k<2^n} \|x_k\| + 2^{-n/2}\|x_0\|, \]
\[ (5.13)\qquad \max_{0\le k<2^n} \|e^*_k\| < (1+\sqrt2) \max_{1\le k<2^n} \|x^*_k\| + 2^{-n/2}\|x^*_0\|, \]
\[ (5.14)\qquad ((e_j, e^*_j) : j = 0, 1, \dots, 2^n-1) \text{ is biorthogonal,} \]
\[ (5.15)\qquad \operatorname{span}(e_j : 0\le j<2^n) = \operatorname{span}(x_j : 0\le j<2^n), \text{ and} \]
\[ (5.16)\qquad \operatorname{span}(e^*_j : 0\le j<2^n) = \operatorname{span}(x^*_j : 0\le j<2^n). \]
Proof of Theorem 5.2.2. Let δ > 0 and put M = 2 + δ. We start with a fundamental sequence (z_i) ⊂ X and a w^*-dense sequence (z^*_i) ⊂ B_{X^*} (note that (B_{X^*}, w^*) is separable if X is norm separable), which we choose norm dense if X^* is norm separable. Then we use Lemma 5.2.4 to choose a norming (resp. shrinking) M-basis ((x_n, x^*_n) : n ∈ N) of X which satisfies, with N being the odd numbers, the conditions (5.7) and (5.8). Without loss of generality we assume that ‖x_n‖ = 1, for n ∈ N. Now we will define a reordering ((x̃_n, x̃^*_n) : n ∈ N) of ((x_n, x^*_n) : n ∈ N) as follows:
By induction we choose for ℓ ∈ N a number m_ℓ ∈ N, define $q_\ell = \sum_{j=1}^{\ell} 2^{m_j}$ and q_0 = 0, and choose (x̃_{q_{ℓ−1}}, x̃^*_{q_{ℓ−1}}), (x̃_{q_{ℓ−1}+1}, x̃^*_{q_{ℓ−1}+1}), …, (x̃_{q_ℓ−1}, x̃^*_{q_ℓ−1}) as follows.
Assume that for all 0 ≤ r < ℓ, ℓ ≥ 1, the numbers m_r and (x̃_0, x̃^*_0), (x̃_1, x̃^*_1), …, (x̃_{q_r−1}, x̃^*_{q_r−1}) have been chosen. Put
\[ s_\ell = \min\big\{ s : (x_{2s}, x^*_{2s}) \not\in \{ (\tilde x_t, \tilde x^*_t) : t \le q_{\ell-1}-1 \} \big\} \]
(recall that ((x_{2s}, x^*_{2s}) : s ∈ N) are the elements of ((x_s, x^*_s) : s ∈ N) for which we do not control the norm) and choose m_ℓ ∈ N so that
\[ \big( (1+\sqrt2) + 2^{-m_\ell/2} \big) \cdot \big( (1+\sqrt2)M + \|x^*_{2s_\ell}\|\, 2^{-m_\ell/2} \big) < (1+\sqrt2)^2 M + \delta. \]
Then let (x̃_{q_{ℓ−1}}, x̃^*_{q_{ℓ−1}}) = (x_{2s_ℓ}, x^*_{2s_ℓ}), while (x̃_{q_{ℓ−1}+1}, x̃^*_{q_{ℓ−1}+1}), …, (x̃_{q_ℓ−1}, x̃^*_{q_ℓ−1}) consist of those elements of ((x_{2t−1}, x^*_{2t−1}) : t ∈ N) which are not in the set {(x̃_t, x̃^*_t) : t ≤ q_{ℓ−1} − 1} and have the lowest 2^{m_ℓ} − 1 indices.
By this choice we made sure that all elements of ((x_t, x^*_t) : t ∈ N) appear exactly once in the sequence ((x̃_t, x̃^*_t) : t ∈ N).
Then we apply Corollary 5.2.5 and define for k = 0, 1, 2, …, 2^{m_ℓ} − 1
\[ e_{q_{\ell-1}+k} = \sum_{j=0}^{2^{m_\ell}-1} a^{(m_\ell)}_{(k,j)}\, \tilde x_{q_{\ell-1}+j} \qquad\text{and}\qquad e^*_{q_{\ell-1}+k} = \sum_{j=0}^{2^{m_\ell}-1} a^{(m_\ell)}_{(k,j)}\, \tilde x^*_{q_{\ell-1}+j}. \]
It then follows from (5.12) and (5.13) that for k = 0, 1, 2, …, 2^{m_ℓ} − 1
\[ \| e_{q_{\ell-1}+k} \| \cdot \| e^*_{q_{\ell-1}+k} \| \le (1+\sqrt2)^2 M + \delta. \]
Choosing δ > 0 small enough we can ensure that $(1+\sqrt2)^2 M + \delta < 2(1+\sqrt2)^2 + \varepsilon$. Since ((x_n, x^*_n) : n ∈ N) is a norming M-basis, it follows from (5.14), (5.15) and (5.16) that ((e_n, e^*_n) : n ∈ N) is a norming M-basis, which is shrinking if ((x_n, x^*_n) : n ∈ N) is shrinking.
Chapter 6
Appendix B: Some facts about
Lp[0, 1] and Lp(R)
6.1 The Haar basis and Wavelets
We recall the definition of the Haar basis of Lp [0, 1]. Let
\[ T = \{ (n,j) : n\in\mathbb N_0,\ j = 0,1,\dots,2^n-1 \} \cup \{0\}. \]
Let 1 ≤ p < ∞ be fixed. We define the Haar basis (h_t)_{t∈T} and the normalized Haar basis (h^{(p)}_t)_{t∈T} in L_p[0,1] as follows.
$h_0 = h_0^{(p)} \equiv 1$ on [0, 1], and for n ∈ N_0 and j = 0, 1, 2, …, 2^n − 1 we put
\[ h_{(n,j)} = 1_{[j2^{-n},\,(j+\frac12)2^{-n})} - 1_{[(j+\frac12)2^{-n},\,(j+1)2^{-n})}, \]
and we let
\[ \Delta_{(n,j)} = \operatorname{supp}(h_{(n,j)}) = \big[ j2^{-n}, (j+1)2^{-n} \big), \]
\[ \Delta^+_{(n,j)} = \big[ j2^{-n}, \big(j+\tfrac12\big)2^{-n} \big), \qquad \Delta^-_{(n,j)} = \big[ \big(j+\tfrac12\big)2^{-n}, (j+1)2^{-n} \big). \]
We let $h^{(\infty)}_{(n,j)} = h_{(n,j)}$, and for 1 ≤ p < ∞
\[ h^{(p)}_{(n,j)} = \frac{h_{(n,j)}}{\| h_{(n,j)} \|_p} = 2^{n/p} \big( 1_{[j2^{-n},\,(j+\frac12)2^{-n})} - 1_{[(j+\frac12)2^{-n},\,(j+1)2^{-n})} \big). \]
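A quick numerical check (our own illustration) that the normalization above indeed gives ‖h^{(p)}_{(n,j)}‖_p = 1; since all functions involved are dyadic step functions, midpoint Riemann sums on a fine dyadic grid are exact:

```python
import numpy as np

def haar_p(n, j, p, t):
    """The L_p-normalized Haar function h^(p)_(n,j) evaluated at points t."""
    x = 2.0 ** n * t - j
    return 2.0 ** (n / p) * (((0 <= x) & (x < 0.5)).astype(float)
                             - ((0.5 <= x) & (x < 1)).astype(float))

# Midpoints of a dyadic grid: exact Riemann sums for dyadic step functions.
t = (np.arange(2 ** 14) + 0.5) / 2 ** 14

for (n, j, p) in [(0, 0, 1.0), (2, 3, 2.0), (3, 5, 4.0)]:
    norm = ((np.abs(haar_p(n, j, p, t)) ** p).mean()) ** (1 / p)
    assert abs(norm - 1.0) < 1e-12        # ||h^(p)_(n,j)||_p = 1
print("normalization verified")
```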
Theorem 6.1.1. [Schl, Theorems 3.2.2, 5.5.1]
We order (h^{(p)}_t : t ∈ T) into a sequence (h^{(p)}_n : n ∈ N), with h^{(p)}_1 = h_0^{(p)}, and with the property that if 2 ≤ m < n, then either supp(h_n) ⊂ supp(h_m) or supp(h_m) ∩ supp(h_n) = ∅. Then (h^{(p)}_n) is a monotone basis of L_p[0,1].
For 1 < p < ∞, (h^{(p)}_t : t ∈ T) is an unconditional basis of L_p[0,1], but it is not unconditional for p = 1. In fact, L_1[0,1] does not embed into a Banach space with an unconditional basis.
Define $h = 1_{[0,1/2]} - 1_{(1/2,1]}$. Then we can write for n ∈ N_0 and j = 0, 1, …, 2^n − 1
\[ h_{(n,j)}(t) = h(2^n t - j) \quad\text{and}\quad h^{(p)}_{(n,j)}(t) = 2^{n/p}\, h(2^n t - j), \qquad\text{for } t\in[0,1]. \]
We now define for all n ∈ Z and all j ∈ Z a function h_{(n,j)} on the whole line as follows:
\[ (6.1)\qquad h_{(n,j)}(t) = h(2^n t - j), \quad\text{for } t\in\mathbb R, \qquad\text{and} \]
\[ (6.2)\qquad h^{(p)}_{(n,j)}(t) = 2^{n/p}\, h(2^n t - j), \quad\text{for } t\in\mathbb R. \]
For all n, j ∈ Z we have
\[ \operatorname{supp}(h_{(n,j)}) := \{ t : h_{(n,j)}(t) \ne 0 \} = \{ t : 0 \le 2^n t - j < 1 \} = [j2^{-n}, (j+1)2^{-n}) =: \Delta_{(n,j)} \]
and
\[ \{ h_{(n,j)} > 0 \} = \big[ j2^{-n}, \big(j+\tfrac12\big)2^{-n} \big) =: \Delta^+_{(n,j)}, \qquad \{ h_{(n,j)} < 0 \} = \big[ \big(j+\tfrac12\big)2^{-n}, (j+1)2^{-n} \big) =: \Delta^-_{(n,j)}. \]
We note for (m, i) and (n, j) in Z × Z that
\[ (6.3)\qquad \text{either } \Delta_{(m,i)} \subset \Delta_{(n,j)} \text{ or } \Delta_{(n,j)} \subset \Delta_{(m,i)} \text{ or } \Delta_{(m,i)} \cap \Delta_{(n,j)} = \emptyset. \]
Theorem 6.1.2. Let 1 ≤ p < ∞.
1. $\{ h^{(p)}_{(n,j)} : n\in\mathbb N_0,\ j\in\mathbb Z \} \cup \{ 1_{[j,j+1)} : j\in\mathbb Z \}$, when appropriately ordered, is a monotone basis of L_p(R), which is unconditional if 1 < p < ∞.
2. $\{ h^{(p)}_{(n,j)} : n\in\mathbb Z,\ j\in\mathbb Z \}$ is an unconditional basis of L_p(R) if 1 < p < ∞.
Remark. Note that the second part of Theorem 6.1.2 is wrong for p = 1. Indeed, the integral functional
\[ I : L_1(\mathbb R) \to \mathbb R, \qquad f \mapsto \int_{-\infty}^{\infty} f(x)\,dx, \]
is a bounded linear functional on L_1(R) which is not identical to the zero functional, but for all n, j ∈ Z the function $h^{(1)}_{(n,j)}$ is in the kernel of I, and thus the span of the $h^{(1)}_{(n,j)}$ cannot be dense in L_1(R). The same argument fails for 1 < p < ∞, since I is unbounded on L_p(R) if p > 1.
Proof. First note that L_p(R) is isometrically isomorphic to the ℓ_p-sum $(\oplus_{i\in\mathbb Z} L_p[i,i+1])_{\ell_p}$ via the map
\[ L_p(\mathbb R) \to \big( \oplus_{i\in\mathbb Z} L_p[i,i+1] \big)_{\ell_p}, \qquad f \mapsto \big( f|_{[i,i+1]} : i\in\mathbb Z \big), \]
and that for i ∈ Z, by Theorem 6.1.1, the shifted Haar basis
\[ \big( h^{(p,i)}_{(n,j)} : n\in\mathbb N_0,\ j = 0,1,\dots,2^n-1 \big) \cup \{ 1_{[i,i+1)} \} = \big( h^{(p)}_{(n,j)} : n\in\mathbb N_0,\ j = 2^n i,\, 2^n i+1,\, \dots,\, 2^n(i+1)-1 \big) \cup \{ 1_{[i,i+1)} \}, \]
if ordered appropriately, is a monotone basis of L_p[i, i+1], which is unconditional if 1 < p < ∞. Since L_p(R) is the 1-unconditional sum of the spaces L_p[i, i+1), i ∈ Z, the union of these families over all i ∈ Z is, if ordered appropriately, a monotone basis of L_p(R), which is unconditional if p > 1.
In order to show (2) we assume that B ⊂ Z × Z is finite and A ⊂ B, and then verify condition (5.3). Since B is finite, there is n_1 ∈ N so that
\[ \Delta_{(n,j)} \subset \Delta_{(-n_1,0)} = [0, 2^{n_1}) \quad\text{for all } (n,j) \in B^+ = \{ (m,i) \in \mathbb Z\times\mathbb N_0 \} \cap B, \]
and there is n_2 ∈ N so that
\[ \Delta_{(n,j)} \subset \Delta_{(-n_2,-1)} = [-2^{n_2}, 0) \quad\text{for all } (n,j) \in B^- = \{ (m,i) \in \mathbb Z\times(-\mathbb N) \} \cap B. \]
Since the L_p-norm is shift invariant, it is enough to consider the case B^- = ∅ and B = B^+.
Consider the rescaling map
\[ \varphi : \Delta_{(0,0)} = [0,1] \to \Delta_{(-n_1,0)}, \qquad t \mapsto 2^{n_1} t, \]
and the map
\[ T : L_p(\Delta_{(-n_1,0)}) \to L_p[0,1], \qquad f \mapsto 2^{n_1/p}\, f\circ\varphi, \]
which is an isometry between $L_p(\Delta_{(-n_1,0)})$ and $L_p[0,1]$, mapping the family
\[ \{ h^{(p)}_{(n,j)} : \Delta_{(n,j)} \subset \Delta_{(-n_1,0)} \} \cup \{ 1_{\Delta_{(-n_1,0)}} \} \]
into the Haar basis of L_p[0,1],
\[ \{ h^{(p)}_{(n,j)} : \Delta_{(n,j)} \subset \Delta_{(0,0)} \} \cup \{ 1_{[0,1]} \}. \]
(p)
It is left to show that the closed linear span of {h(n,j) : n ∈ Z, j ∈ Z} is all
of Lp (R). By part (1) and using shifts, it is enough to show that 1[0,1] is in the
(p)
closed linear span of {h(n,j) : n ∈ Z, j ∈ Z}.
Notice that for N ∈ N we have
\begin{align*}
\sum_{n=0}^{N} 2^{-(n+1)} \big( 1_{[0,2^n)} - 1_{[2^n,2^{n+1})} \big)
&= 1_{[0,1)} \sum_{n=0}^{N} 2^{-(n+1)}
+ 1_{[1,2)} \Big( \sum_{n=1}^{N} 2^{-(n+1)} - 2^{-1} \Big)
+ 1_{[2,4)} \Big( \sum_{n=2}^{N} 2^{-(n+1)} - 2^{-2} \Big) \\
&\qquad + \dots + 1_{[2^{N-1},2^N)} \big( 2^{-(N+1)} - 2^{-N} \big)
- 1_{[2^N,2^{N+1})}\, 2^{-(N+1)} \\
&= 1_{[0,1)} \big( 1 - 2^{-(N+1)} \big) - 1_{[1,2^N)}\, 2^{-(N+1)} - 1_{[2^N,2^{N+1})}\, 2^{-(N+1)}.
\end{align*}
For the last equality note that for k = 1, 2, …, N − 1
\[ \sum_{n=k}^{N} 2^{-(n+1)} - 2^{-k} = \big( 2^{-k} - 2^{-(N+1)} \big) - 2^{-k} = -2^{-(N+1)}. \]
Since we assumed that p > 1, it follows that
\[ L_p\text{-}\lim_{N\to\infty} \sum_{n=0}^{N} 2^{-(n+1)} \big( 1_{[0,2^n)} - 1_{[2^n,2^{n+1})} \big) = 1_{[0,1)}, \]
which finishes the proof of our claim.
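The identity used for the last equality above can be verified exactly with rational arithmetic (our own illustration):

```python
from fractions import Fraction

# Exact check of the identity used for the last equality above:
#   sum_{n=k}^{N} 2^{-(n+1)} - 2^{-k} = -2^{-(N+1)}   for k = 1, ..., N-1.
N = 12
for k in range(1, N):
    s = sum(Fraction(1, 2 ** (n + 1)) for n in range(k, N + 1)) - Fraction(1, 2 ** k)
    assert s == -Fraction(1, 2 ** (N + 1))
print("telescoping identity verified for N =", N)
```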
Definition 6.1.3. A function Ψ ∈ L_2(R) is called a wavelet if the family (Ψ_{(n,j)} : n, j ∈ Z), defined by
\[ \Psi_{(n,j)}(t) = 2^{n/2}\, \Psi(2^n t - j), \quad\text{for } t\in\mathbb R \text{ and } n,j\in\mathbb Z, \]
is an orthonormal basis of L_2(R).
Definition 6.1.4. A Multi Resolution Analysis (MRA) of L_2(R) is a sequence of closed subspaces (V_n : n ∈ Z) of L_2(R) such that
(MRA1) … ⊂ V_{−2} ⊂ V_{−1} ⊂ V_0 ⊂ V_1 ⊂ V_2 ⊂ …,
(MRA2) $\overline{\bigcup_{n\in\mathbb Z} V_n} = L_2(\mathbb R)$,
(MRA3) $\bigcap_{n\in\mathbb Z} V_n = \{0\}$,
(MRA4) V_n = { f(2^n(·)) : f ∈ V_0 }, for n ∈ Z,
(MRA5) there is a compactly supported function Φ ∈ V_0, so that (Φ((·) − m) : m ∈ Z) is an orthonormal basis of V_0.
In this case we call Φ a scaling function of the MRA (V_n : n ∈ Z).
Note that (MRA5) implies
(MRA6) V_0 (and thus any V_n) is invariant under integer translations, i.e. f ∈ V_0 ⟺ f((·) − j) ∈ V_0, for all j ∈ Z.
For h ∈ R we put
\[ T_h : L_2(\mathbb R) \to L_2(\mathbb R), \qquad f \mapsto f((\cdot) - h) \quad\text{(shift to the right by } h \text{ units)}, \]
\[ J_h : L_2(\mathbb R) \to L_2(\mathbb R), \qquad f \mapsto 2^{h/2} f(2^h(\cdot)) \quad\text{(scaling)}. \]
Remark. For h ∈ R the operators T_h and J_h are isometries and
\[ (6.4)\qquad T_h^{-1} = T_{-h} \quad\text{and}\quad J_h^{-1} = J_{-h}. \]
We can rephrase (MRA4) and (MRA6) equivalently as follows:
(MRA4') V_n = J_n(V_0), for n ∈ Z, and
(MRA6') V_0 = T_n(V_0), for all n ∈ Z.
Finally note that (MRA4), (MRA5) and (MRA6) imply that for each j ∈ Z
(MRA5') $\{ 2^{j/2}\, \Phi(2^j(\cdot) - k) : k\in\mathbb Z \}$ is an orthonormal basis of V_j.
Example 6.1.5. Take Φ = 1_{[0,1)}, and for n ∈ Z put
\[ V_n = \overline{\operatorname{span}}\big( J_n\circ T_j(\Phi) : j\in\mathbb Z \big) = \overline{\operatorname{span}}\big( 1_{[j2^{-n},(j+1)2^{-n})} : j\in\mathbb Z \big). \]
(Note that $1_{[j2^{-n},(j+1)2^{-n})}(2^{-n}(\cdot)) = 1_{[j,j+1)}(\cdot) \in V_0$.) Then (V_n) is an MRA.
We now discuss how to produce a wavelet Ψ starting with an MRA (Vn : n ∈
Z) with scaling function Φ.
We denote the orthogonal complement of V_j inside V_{j+1} by W_j; this means that any f ∈ V_{j+1} can be written as f = g + h with g ∈ V_j, h ∈ W_j, and $\|f\|_2 = \sqrt{ \|g\|_2^2 + \|h\|_2^2 }$. We write V_{j+1} = W_j ⊕ V_j. Since J_j is a unitary operator (and hence preserves orthogonality), we deduce that
\[ V_j \oplus W_j = V_{j+1} = J_j(V_1) = J_j(V_0 \oplus W_0) = J_j(V_0) \oplus J_j(W_0) = V_j \oplus J_j(W_0). \]
Since the orthogonal complement of a subspace W of a Hilbert space is unique (namely $W^\perp = \{ h\in H : \langle h, w\rangle = 0 \text{ for all } w\in W \}$), we obtain
\[ (6.5)\qquad W_j = J_j(W_0) \quad\text{for all } j\in\mathbb Z. \]
Next we observe that
(6.6) W_j and W_i are orthogonal if j ≠ i.
Indeed, assume w.l.o.g. that i < j. Then Wi ⊂ Vi+1 ⊂ Vj and Wj is orthogonal
to Vj .
From (MRA2) we deduce that every f ∈ L_2(R) can be approximated arbitrarily well by some g ∈ V_n for large enough n ∈ N, and (MRA3) yields that
\[ \lim_{k\to\infty} P_{V_{-k}}(f) = 0, \]
where $P_{V_{-k}}$ is the orthogonal projection of L_2(R) onto V_{−k}. Thus, choosing k ∈ N large enough, we can arbitrarily well approximate g by an element h in the orthogonal complement of V_{−k} inside V_n. But since
\[ V_n = W_{n-1} \oplus V_{n-1} = W_{n-1} \oplus W_{n-2} \oplus V_{n-2} = \dots = (W_{n-1} \oplus W_{n-2} \oplus \dots \oplus W_{-k}) \oplus V_{-k}, \]
it follows that
\[ h \in W_{n-1} \oplus W_{n-2} \oplus \dots \oplus W_{-k}. \]
As a consequence we deduce that L_2(R) is the orthogonal sum of the W_j, j ∈ Z.
Together with our observations (6.5) and (6.6) this yields the following result.
Proposition 6.1.6. Every Ψ ∈ W0 for which {Ψ((·) − j) : j ∈ Z} is an orthonormal basis of W0 is a wavelet. We say in that case that Ψ is the wavelet
associated to the MRA (Vn ).
Example 6.1.7. We continue Example 6.1.5. Then
\[ W_0 = \{ f\in V_1 : \forall g\in V_0\ \langle g, f\rangle = 0 \} = \Big\{ f\in V_1 : \int_j^{j+1} f(t)\,dt = 0 \text{ for all } j\in\mathbb Z \Big\}. \]
Thus we can take
\[ \Psi = 1_{[0,1/2)} - 1_{[1/2,1)} \]
as a wavelet associated to (V_n).
We would like to explain how to construct the wavelet Ψ associated to an
MRA (Vn : n ∈ Z) with scaling function Φ.
Theorem 6.1.8. Suppose that (V_n : n ∈ Z) is an MRA with scaling function Φ which is integrable and satisfies
\[ \int_{-\infty}^{\infty} \Phi(t)\,dt \ne 0. \]
1. The following Scaling Relation holds:
\[ (6.7)\qquad \Phi = \sum_{k\in\mathbb Z} p_k\, \Phi(2(\cdot) - k) \quad\text{with}\quad p_k = 2\int_{-\infty}^{\infty} \Phi(x)\,\Phi(2x-k)\,dx. \]
More generally,
\[ (6.8)\qquad \Phi(2^{j-1}(\cdot) - l) = \sum_{k\in\mathbb Z} p_{k-2l}\, \Phi(2^j(\cdot) - k) \quad\text{for all } j,l\in\mathbb Z. \]
2. The sequence (p_k : k ∈ Z) satisfies
\[ (6.9)\qquad \sum_{k\in\mathbb Z} p_{k-2l}\, p_k = 2\delta_{0,l} \quad\text{for all } l\in\mathbb Z, \]
\[ (6.10)\qquad \sum_{k\in 2\mathbb Z} p_k = \sum_{k\in 2\mathbb Z+1} p_k = 1. \]
3. The function Ψ defined by
\[ (6.11)\qquad \Psi = \sum_{k\in\mathbb Z} (-1)^k p_{1-k}\, \Phi(2(\cdot) - k) \]
is a wavelet associated to (V_n : n ∈ Z).
Proof. Since Φ ∈ V_0 ⊂ V_1 and since $(2^{1/2}\,\Phi(2(\cdot)-k) : k\in\mathbb Z)$ is an orthonormal basis of V_1, we can write Φ as in (6.7). For j, l ∈ Z it follows therefore that
\[ \Phi(2^{j-1}(\cdot) - l) = \sum_{k\in\mathbb Z} p_k\, \Phi\big( 2(2^{j-1}(\cdot) - l) - k \big) = \sum_{k\in\mathbb Z} p_k\, \Phi\big( 2^j(\cdot) - 2l - k \big) = \sum_{k\in\mathbb Z} p_{k-2l}\, \Phi(2^j(\cdot) - k). \]
Since (Φ((·) − l) : l ∈ Z) is an orthonormal sequence, we obtain from (6.7) and (6.8) (with j = 1) that
\begin{align*}
\delta_{(0,l)} = \langle \Phi((\cdot)-l), \Phi \rangle
&= \sum_{m,k\in\mathbb Z} p_{m-2l}\, p_k\, \big\langle \Phi(2(\cdot)-m), \Phi(2(\cdot)-k) \big\rangle \\
&= \sum_{k\in\mathbb Z} p_{k-2l}\, p_k \int_{-\infty}^{\infty} |\Phi(2t)|^2\, dt = \frac12 \sum_{k\in\mathbb Z} p_{k-2l}\, p_k,
\end{align*}
which proves (6.9) and implies (replacing l by −l and summing over l ∈ Z) that
\begin{align*}
2 = 2\sum_{l\in\mathbb Z} \delta_{(0,-l)}
&= \sum_{l,k\in\mathbb Z} p_{k+2l}\, p_k \\
&= \sum_{l\in\mathbb Z}\sum_{k\in\mathbb Z} p_{2k+2l}\, p_{2k} + \sum_{l\in\mathbb Z}\sum_{k\in\mathbb Z} p_{2k+1+2l}\, p_{2k+1} \\
&= \sum_{k\in\mathbb Z} p_{2k} \sum_{l\in\mathbb Z} p_{2k+2l} + \sum_{k\in\mathbb Z} p_{2k+1} \sum_{l\in\mathbb Z} p_{2k+1+2l} \\
&= \sum_{k\in\mathbb Z} p_{2k} \sum_{l\in\mathbb Z} p_{2l} + \sum_{k\in\mathbb Z} p_{2k+1} \sum_{l\in\mathbb Z} p_{2l+1}
= \Big( \sum_{k\in\mathbb Z} p_{2k} \Big)^2 + \Big( \sum_{k\in\mathbb Z} p_{2k+1} \Big)^2 =: A^2 + B^2.
\end{align*}
Moreover, if we integrate the scaling relation (6.7) we obtain
\[ (6.12)\qquad \int_{-\infty}^{\infty} \Phi(x)\,dx = \frac12 \sum_{k\in\mathbb Z} p_k \int_{-\infty}^{\infty} \Phi(x)\,dx. \]
By our assumption on the integral of Φ, we can cancel the integral on both sides of (6.12) and obtain that
\[ A + B = \sum_{k\in\mathbb Z} p_k = 2. \]
The only solution of A^2 + B^2 = 2 and A + B = 2 is A = B = 1, which proves (6.10) (draw a picture: the circle of radius $\sqrt2$ touches the line A + B = 2 only at the point (1,1)).
In order to prove (3) we first note that Ψ ∈ V_1. Secondly, we note that the sequence (Ψ((·) − k) : k ∈ Z) is orthonormal. Indeed, it follows from (6.8) and (6.9) for l, m ∈ Z that
\begin{align*}
\langle \Psi((\cdot)-l), \Psi((\cdot)-m) \rangle
&= \Big\langle \sum_{k\in\mathbb Z} (-1)^k p_{1-k}\, \Phi(2((\cdot)-l)-k),\ \sum_{k\in\mathbb Z} (-1)^k p_{1-k}\, \Phi(2((\cdot)-m)-k) \Big\rangle \\
&= \Big\langle \sum_{k\in\mathbb Z} (-1)^k p_{1-k+2l}\, \Phi(2(\cdot)-k),\ \sum_{k\in\mathbb Z} (-1)^k p_{1-k+2m}\, \Phi(2(\cdot)-k) \Big\rangle \\
&= \frac12 \sum_{k\in\mathbb Z} p_{1-k+2l}\, p_{1-k+2m}
= \frac12 \sum_{k\in\mathbb Z} p_{1-k+2l-2m}\, p_{1-k} = \delta_{(l,m)}.
\end{align*}
Thirdly, it follows from (6.8) for any l, m ∈ Z that
\begin{align*}
\langle \Phi((\cdot)-l), \Psi((\cdot)-m) \rangle
&= \Big\langle \sum_{k\in\mathbb Z} p_{k-2l}\, \Phi(2(\cdot)-k),\ \sum_{k\in\mathbb Z} (-1)^k p_{1-k}\, \Phi(2((\cdot)-m)-k) \Big\rangle \\
&= \Big\langle \sum_{k\in\mathbb Z} p_{k-2l}\, \Phi(2(\cdot)-k),\ \sum_{k\in\mathbb Z} (-1)^k p_{1-k+2m}\, \Phi(2(\cdot)-k) \Big\rangle \\
&= \frac12 \sum_{k\in\mathbb Z} (-1)^k p_{k-2l}\, p_{1-k+2m}
= \frac12 \sum_{k\in\mathbb Z} (-1)^k p_k\, p_{1-k+r} \quad\text{with } r = 2m-2l \\
&= \frac12 \sum_{k\in\mathbb Z} p_{2k}\, p_{1-2k+r} - \frac12 \sum_{k\in\mathbb Z} p_{2k+1}\, p_{-2k+r} \\
&= \frac12 \sum_{k\in\mathbb Z} p_{2k}\, p_{1-2k+r} - \frac12 \sum_{l\in\mathbb Z} p_{1-2l+r}\, p_{2l} = 0,
\end{align*}
where in the last step we substituted 2l = r − 2k in the second sum.
Finally, in order to show that (Ψ((·) − k) : k ∈ Z) together with (Φ((·) − k) : k ∈ Z) spans all of V_1, we need to show that for given j ∈ Z the projection of Φ(2(·) − j) onto the closed span of (Ψ((·) − k) : k ∈ Z) and (Φ((·) − k) : k ∈ Z) has the same norm as Φ(2(·) − j) (whose square is 1/2). Let us denote the projected vector by Φ̃_j. By the orthonormality relations shown above we can write
\[ \tilde\Phi_j = \sum_{k\in\mathbb Z} \big( a_k\, \Phi((\cdot)-k) + b_k\, \Psi((\cdot)-k) \big), \]
with
\[ a_k = \langle \Phi(2(\cdot)-j), \Phi((\cdot)-k) \rangle = \langle \Phi(2(\cdot)+2k-j), \Phi \rangle = \frac12\, p_{j-2k} \]
and
\[ b_k = \langle \Phi(2(\cdot)-j), \Psi((\cdot)-k) \rangle = \sum_{l\in\mathbb Z} (-1)^l p_{1-l}\, \langle \Phi(2(\cdot)-j), \Phi(2(\cdot)-l-2k) \rangle = \frac12 (-1)^j p_{1-j+2k}, \]
for k ∈ Z. It follows therefore that
\[ \|\tilde\Phi_j\|^2 = \frac14 \sum_{k\in\mathbb Z} |p_{j-2k}|^2 + \frac14 \sum_{k\in\mathbb Z} |p_{1-j+2k}|^2 = \frac14 \sum_{k\in\mathbb Z} |p_k|^2 = \frac12. \]
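For the Haar MRA of Example 6.1.5, where Φ = 1_{[0,1)}, everything in Theorem 6.1.8 can be computed explicitly: p_0 = p_1 = 1, all other p_k vanish, and (6.11) returns the Haar wavelet of Example 6.1.7. A numerical sanity check (our own illustration):

```python
import numpy as np

phi = lambda x: ((0 <= x) & (x < 1)).astype(float)   # Haar scaling function 1_[0,1)

# p_k = 2 * integral phi(x) phi(2x - k) dx, by an (exact, dyadic) midpoint sum.
dx = 1.0 / 2 ** 10
x = (np.arange(2 ** 12) + 0.5) * dx - 2.0            # grid on [-2, 2)
p = {k: 2 * np.sum(phi(x) * phi(2 * x - k)) * dx for k in range(-3, 4)}

assert abs(p[0] - 1) < 1e-9 and abs(p[1] - 1) < 1e-9            # scaling relation (6.7)
assert all(abs(p[k]) < 1e-9 for k in p if k not in (0, 1))
assert abs(sum(p[k] for k in p if k % 2 == 0) - 1) < 1e-9       # (6.10), even part
assert abs(sum(p[k] for k in p if k % 2 != 0) - 1) < 1e-9       # (6.10), odd part

# (6.11): Psi = sum_k (-1)^k p_{1-k} phi(2t - k) = phi(2t) - phi(2t - 1),
# i.e. the Haar wavelet 1_[0,1/2) - 1_[1/2,1).
t = (np.arange(2 ** 10) + 0.5) / 2 ** 10
psi = sum((-1) ** k * p.get(1 - k, 0.0) * phi(2 * t - k) for k in range(-3, 4))
assert np.allclose(psi, np.where(t < 0.5, 1.0, -1.0), atol=1e-9)
print("Haar scaling relation verified")
```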
The proof of the following result goes beyond the scope of these notes, since it requires several tools from harmonic analysis.
Theorem 6.1.9. [Wo1, Theorem 8.13] Assume that Ψ is a wavelet for L_2(R) satisfying the following two conditions for some constant C > 0:
\[ (6.13)\qquad |\Psi(x)| \le C (1+|x|)^{-2} \quad\text{for all } x\in\mathbb R, \]
\[ (6.14)\qquad \Psi \text{ is differentiable and } |\Psi'(x)| \le C (1+|x|)^{-2} \quad\text{for all } x\in\mathbb R. \]
Then for every 1 < p < ∞ the family $\Psi^{(p)}_{j,k} = \big( 2^{j/p}\,\Psi(2^j(\cdot)-k) : j,k\in\mathbb Z \big)$ is a basis of L_p(R) which is isomorphically equivalent to $\big( h^{(p)}_{j,k} : j,k\in\mathbb Z \big)$.
Finally, we want to present, without proof, another basis of L_p[0,1]. Recall that $(e^{2\pi i n(\cdot)} : n\in\mathbb Z)$ is an orthonormal basis of L_2[0,1]. A deep theorem by M. Riesz states the following.
Theorem 6.1.10. (cf. [Ka, Chapters II and III]) The sequence of trigonometric polynomials (t_n : n ∈ Z), with
\[ t_n(\xi) = e^{2\pi i n \xi}, \quad\text{for } \xi\in[0,1], \]
is a Schauder basis of L_p[0,1], 1 < p < ∞, when ordered as (t_0, t_1, t_{−1}, t_2, t_{−2}, …).
In the next section we prove that for p ≠ 2, (t_n) cannot be unconditional.
6.2 Khintchine's inequality and Applications
Theorem 6.2.1. [Khintchine's Theorem, see Theorem 5.3.1 in [Schl]]
L_p[0,1], 1 ≤ p ≤ ∞, contains subspaces isomorphic to ℓ_2. If 1 < p < ∞, then L_p[0,1] contains a complemented subspace isomorphic to ℓ_2.
Definition 6.2.2. The Rademacher functions are the functions:
\[ r_n : [0,1]\to\mathbb R, \qquad t \mapsto \operatorname{sign}(\sin(2^n \pi t)), \quad\text{for } n\in\mathbb N. \]
Lemma 6.2.3. [Khintchine's inequality, see Lemma 5.3.3 in [Schl]]
For every $p\in[1,\infty)$ there are numbers $0<A_p\le 1\le B_p$ so that for any $m\in\mathbb{N}$ and any scalars $(a_j)_{j=1}^m$,
$$(6.15)\quad A_p\Big(\sum_{j=1}^m |a_j|^2\Big)^{1/2} \le \Big\|\sum_{j=1}^m a_j r_j\Big\|_{L_p} \le B_p\Big(\sum_{j=1}^m |a_j|^2\Big)^{1/2}.$$
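Since the sign vector $(r_1(t),\dots,r_m(t))$ runs through every pattern in $\{-1,1\}^m$ on a set of measure $2^{-m}$, the middle norm in (6.15) can be computed exactly by averaging over all $2^m$ sign patterns. A small Python sketch (illustrative only, not part of the notes; the function name is ours):

```python
import itertools
import math

def rademacher_lp_norm(a, p):
    """Exact L_p[0,1] norm of sum_j a_j r_j: the signs (r_1(t),...,r_m(t))
    take each pattern in {-1,1}^m on a set of measure 2^{-m}."""
    m = len(a)
    moment = sum(abs(sum(x * e for x, e in zip(a, eps))) ** p
                 for eps in itertools.product((-1.0, 1.0), repeat=m)) / 2 ** m
    return moment ** (1 / p)

a = [0.5, -1.0, 2.0, 0.25]
l2 = math.sqrt(sum(x * x for x in a))
# For p = 2, orthonormality of (r_j) gives equality with the l_2 norm;
# for other p the norm stays comparable to l2, as (6.15) asserts.
assert abs(rademacher_lp_norm(a, 2) - l2) < 1e-12
assert rademacher_lp_norm(a, 1) <= l2 <= rademacher_lp_norm(a, 4)
```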
There is a complex version of the Rademacher functions, namely the sequence $(g_n : n\in\mathbb{N})$ with
$$g_n(t) = e^{i2^n\pi t} \quad\text{for } t\in[0,1].$$
Theorem 6.2.4. [Complex Version of Khintchine's Theorem] For every $p\in[1,\infty)$ there are numbers $0<A'_p\le 1\le B'_p$ so that for any $m\in\mathbb{N}$ and any scalars $(a_j)_{j=1}^m$,
$$(6.16)\quad A'_p\Big(\sum_{j=1}^m |a_j|^2\Big)^{1/2} \le \Big\|\sum_{j=1}^m a_j g_j\Big\|_{L_p} \le B'_p\Big(\sum_{j=1}^m |a_j|^2\Big)^{1/2}.$$
Moreover, if 1 < p < ∞ then (gj ) generates a copy of `2 inside Lp [0, 1] which is
complemented.
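For the sequence $(g_n)$ the $p=2$ case of (6.16) can be checked numerically: by orthonormality, the $L_2$ norm of $\sum_j a_j g_j$ equals $(\sum_j |a_j|^2)^{1/2}$ exactly. A hedged Python sketch (the quadrature grid and function name are our choices, not from the notes):

```python
import cmath
import math

def complex_rademacher_lp_norm(a, p, grid=4096):
    """Quadrature approximation of the L_p[0,1] norm of sum_j a_j g_j,
    with g_j(t) = exp(i 2^j pi t) as above; for p = 2 the uniform grid
    sum is exact, since |f|^2 is a trigonometric polynomial of low degree."""
    total = 0.0
    for i in range(grid):
        t = i / grid
        s = sum(aj * cmath.exp(1j * (2 ** j) * math.pi * t)
                for j, aj in enumerate(a, start=1))
        total += abs(s) ** p
    return (total / grid) ** (1 / p)

a = [0.5, -1.0, 2.0]
l2 = math.sqrt(sum(abs(x) ** 2 for x in a))
assert abs(complex_rademacher_lp_norm(a, 2) - l2) < 1e-9
```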
Theorem 6.2.5 (The square-function norm). Let $1\le p<\infty$ and let $(f_n)$ be a $\lambda$-unconditional basic sequence in $L_p[0,1]$ for some $\lambda\ge 1$.
Then there is a constant $C=C(p,\lambda)\ge 1$, depending only on the unconditionality constant of $(f_i)$ and the constants $A_p$ and $B_p$ in Khintchine's inequality (Lemma 6.2.3), so that for any $g=\sum_{i=1}^\infty a_i f_i\in \mathrm{span}(f_i : i\in\mathbb{N})$ it follows that
$$\frac1C \Big\|\Big(\sum_{i=1}^\infty |a_i|^2 |f_i|^2\Big)^{1/2}\Big\|_p \le \|g\|_p \le C \Big\|\Big(\sum_{i=1}^\infty |a_i|^2 |f_i|^2\Big)^{1/2}\Big\|_p,$$
which means that $\|\cdot\|_p$ is on $\mathrm{span}(f_i : i\in\mathbb{N})$ equivalent to the norm
$$|||f||| = \Big\|\Big(\sum_{i=1}^\infty |a_i|^2 |f_i|^2\Big)^{1/2}\Big\|_p = \Big\|\sum_{i=1}^\infty |a_i|^2 |f_i|^2\Big\|_{p/2}^{1/2}.$$
Proof. For two positive numbers $A$ and $B$ and $c>0$ we write $A\sim_c B$ if $\frac1c A\le B\le cA$. Let $K_p$ be the Khintchine constant for $L_p$, i.e. the smallest number so that for the Rademacher sequence $(r_n)$
$$\Big\|\sum_{i=1}^\infty a_i r_i\Big\|_p \sim_{K_p} \Big(\sum_{i=1}^\infty |a_i|^2\Big)^{1/2} \quad\text{for } (a_i)\subset\mathbb{K},$$
and let $C_u$ be the unconditionality constant of $(f_i)$, i.e.
$$\Big\|\sum_{i=1}^\infty \sigma_i a_i f_i\Big\|_p \sim_{C_u} \Big\|\sum_{i=1}^\infty a_i f_i\Big\|_p \quad\text{for } (a_i)\subset\mathbb{K} \text{ and } (\sigma_i)\subset\{\pm1\}.$$
We consider $L_p[0,1]$ in a natural way as a subspace of $L_p([0,1]^2)$, with $\tilde f(s,t):=f(s)$ for $f\in L_p[0,1]$. Then let $r_n(t)=r_n(s,t)$ be the $n$th Rademacher function acting on the second coordinate, i.e.
$$r_n(s,t) = \operatorname{sign}(\sin(2^n\pi t)), \quad (s,t)\in[0,1]^2.$$
It follows from the $C_u$-unconditionality, for any $(a_j)_{j=1}^m\subset\mathbb{K}$, that
$$\Big\|\sum_{j=1}^m a_j f_j(\cdot)\Big\|_p^p \sim_{C_u^p} \Big\|\sum_{j=1}^m a_j f_j(\cdot)\, r_j(t)\Big\|_p^p = \int_0^1 \Big|\sum_{j=1}^m a_j f_j(s)\, r_j(t)\Big|^p\,ds \quad\text{for all } t\in[0,1],$$
and integrating over all $t\in[0,1]$ implies
$$\Big\|\sum_{j=1}^m a_j f_j(\cdot)\Big\|_p^p \sim_{C_u^p} \int_0^1\!\!\int_0^1 \Big|\sum_{j=1}^m a_j f_j(s)\, r_j(t)\Big|^p\,ds\,dt$$
$$= \int_0^1\!\!\int_0^1 \Big|\sum_{j=1}^m a_j f_j(s)\, r_j(t)\Big|^p\,dt\,ds \quad\text{(by Fubini's theorem)}$$
$$= \int_0^1 \Big\|\sum_{j=1}^m a_j f_j(s)\, r_j(\cdot)\Big\|_p^p\,ds$$
$$\sim_{K_p^p} \int_0^1 \Big(\sum_{j=1}^m |a_j f_j(s)|^2\Big)^{p/2}\,ds = \Big\|\Big(\sum_{j=1}^m |a_j f_j|^2\Big)^{1/2}\Big\|_p^p,$$
which proves our claim with $C=K_p C_u$.
Theorem 6.2.6. Assume that $1<p<\infty$ and that $(f_j)$ is a normalized $\lambda$-unconditional sequence in $L_p[0,1]$ for some $\lambda\ge 1$. Let $C=C(p,\lambda)$ be as in Theorem 6.2.5.
Then for all scalars $(a_j)_{j=1}^n\subset\mathbb{K}$,
$$(6.17)\quad \frac1C\Big(\sum_{j=1}^n |a_j|^2\Big)^{1/2} \le \Big\|\sum_{j=1}^n a_j f_j\Big\|_p \le C\Big(\sum_{j=1}^n |a_j|^p\Big)^{1/p} \quad\text{if } 1\le p\le 2, \text{ and}$$
$$(6.18)\quad \frac1C\Big(\sum_{j=1}^n |a_j|^p\Big)^{1/p} \le \Big\|\sum_{j=1}^n a_j f_j\Big\|_p \le C\Big(\sum_{j=1}^n |a_j|^2\Big)^{1/2} \quad\text{if } 2\le p<\infty.$$
Proof. We will show the inequalities (6.17) and (6.18), but with their middle terms replaced by the square-function norm. Having done that, the claim will follow from Theorem 6.2.5.

Let $(a_i)_{i=1}^n\subset\mathbb{K}$. First we note that, since for $1\le s<t\le\infty$ the $\ell_s$-norm of a vector $(b_j)_{j\le n}$ is at least the $\ell_t$-norm, we deduce
$$\Big\|\Big(\sum_{j=1}^n |a_j|^2|f_j|^2\Big)^{1/2}\Big\|_p^p = \int_0^1 \Big(\sum_{j=1}^n |a_j|^2|f_j(t)|^2\Big)^{p/2}\,dt
\;\begin{cases} \le \displaystyle\int_0^1 \sum_{j=1}^n |a_j|^p|f_j(t)|^p\,dt & \text{if } 1\le p\le 2,\\[4pt] \ge \displaystyle\int_0^1 \sum_{j=1}^n |a_j|^p|f_j(t)|^p\,dt & \text{if } 2\le p<\infty, \end{cases}$$
and the right-hand side equals $\sum_{j=1}^n |a_j|^p$ (using that $\|f_j\|_p=1$),
which implies the second inequality of (6.17) and the first of (6.18). In order to
verify the other two inequalities we observe that
$$\Big\|\Big(\sum_{j=1}^n |a_j|^2|f_j|^2\Big)^{1/2}\Big\|_p = \Big\|\Big(\sum_{j=1}^n \frac{|a_j|^2}{\sum_{i=1}^n |a_i|^2}\,|f_j|^2\Big)^{1/2}\Big\|_p\, \Big(\sum_{i=1}^n |a_i|^2\Big)^{1/2}$$
$$= \Big(\int_0^1 \Big(\sum_{j=1}^n \frac{|a_j|^2}{\sum_{i=1}^n |a_i|^2}\,|f_j(t)|^2\Big)^{p/2}\,dt\Big)^{1/p}\, \Big(\sum_{i=1}^n |a_i|^2\Big)^{1/2}$$
$$\begin{cases} \ge \Big(\displaystyle\int_0^1 \sum_{j=1}^n \frac{|a_j|^2}{\sum_{i=1}^n |a_i|^2}\,|f_j(t)|^p\,dt\Big)^{1/p}\, \Big(\sum_{i=1}^n |a_i|^2\Big)^{1/2} & \text{if } 1\le p\le 2,\\[6pt] \le \Big(\displaystyle\int_0^1 \sum_{j=1}^n \frac{|a_j|^2}{\sum_{i=1}^n |a_i|^2}\,|f_j(t)|^p\,dt\Big)^{1/p}\, \Big(\sum_{i=1}^n |a_i|^2\Big)^{1/2} & \text{if } 2\le p<\infty, \end{cases}$$
since $\xi\mapsto\xi^{p/2}$ is convex if $p>2$ and concave if $1\le p<2$, and
$$\Big(\int_0^1 \sum_{j=1}^n \frac{|a_j|^2}{\sum_{i=1}^n |a_i|^2}\,|f_j(t)|^p\,dt\Big)^{1/p}\, \Big(\sum_{i=1}^n |a_i|^2\Big)^{1/2} = \Big(\sum_{i=1}^n |a_i|^2\Big)^{1/2},$$
which implies the first inequality of (6.17) and the second of (6.18).
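The $\ell_p$-terms in (6.17) and (6.18) are attained: for disjointly supported normalized functions, for instance $f_j = n^{1/p}\mathbf{1}_{[j/n,(j+1)/n)}$, $j=0,\dots,n-1$ (a hypothetical concrete choice, not from the notes; these are normalized and $1$-unconditional), one has $\|\sum_j a_j f_j\|_p = (\sum_j |a_j|^p)^{1/p}$ exactly. A short Python check, with a quadrature grid aligned to the breakpoints so the Riemann sum for these step functions is exact:

```python
import math

def lp_norm_step(a, p, m=1000):
    """L_p[0,1] norm of sum_j a[j] * f_j for the step functions
    f_j = n^{1/p} * 1_{[j/n,(j+1)/n)}; the grid of n*m points is aligned
    with the breakpoints, so the Riemann sum is exact here."""
    n = len(a)
    total = sum(abs(a[i // m] * n ** (1 / p)) ** p for i in range(n * m))
    return (total / (n * m)) ** (1 / p)

a = [0.3, -1.2, 0.7]
p = 3.0
exact_lp = sum(abs(x) ** p for x in a) ** (1 / p)
l2 = math.sqrt(sum(x * x for x in a))
assert abs(lp_norm_step(a, p) - exact_lp) < 1e-9
assert exact_lp <= l2 + 1e-9   # consistent with (6.18): the norm here is the l_p term
```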
Corollary 6.2.7. Let 1 ≤ p < ∞. Every normalized unconditional sequence
(fj ) ⊂ Lp [0, 1] which consists of uniformly bounded functions is equivalent to the
`2 -unit vector basis. In particular, if p 6= 2, (fj ) cannot span all of Lp [0, 1].
We will need Jensen’s inequality.
Theorem 6.2.8. [Jensen's Inequality] If $f:\mathbb{R}\to\mathbb{R}$ is convex and $g:[0,1]\to\mathbb{R}$ is such that $g$ and $f\circ g$ are integrable, then
$$f\Big(\int_0^1 g(\xi)\,d\xi\Big) \le \int_0^1 f(g(\xi))\,d\xi.$$
(Here $[0,1]$, together with the Lebesgue measure, can be replaced by any probability space.)
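In the discrete setting of a uniform partition (a probability space with equal weights), Jensen's inequality reads $f(\frac1N\sum_k g_k) \le \frac1N\sum_k f(g_k)$ for convex $f$. A minimal Python sketch with the convex function $f(x)=x^2$ (illustrative only; the sample values are our choice):

```python
# Discrete Jensen: for convex f, f(mean of g) <= mean of f(g).
g = [0.1 * k for k in range(11)]               # sample values of some g on [0, 1]
mean_g = sum(g) / len(g)                       # plays the role of the integral of g
mean_f_of_g = sum(x * x for x in g) / len(g)   # plays the role of the integral of f(g)
assert mean_g ** 2 <= mean_f_of_g
```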
Proof of Corollary 6.2.7. Assume that $(f_j)$ is uniformly bounded, normalized and $\lambda$-unconditional. Let $C=\sup_{j\in\mathbb{N}}\|f_j\|_{L_\infty}$. If $1\le p\le 2$ we deduce for $(a_j)\in c_{00}$ from the proof of Theorem 6.2.6 that
$$\Big(\sum_{i=1}^\infty |a_i|^2\Big)^{1/2} \le \Big\|\Big(\sum_{i=1}^\infty |a_i f_i|^2\Big)^{1/2}\Big\|_{L_p} \le C\Big(\sum_{i=1}^\infty |a_i|^2\Big)^{1/2}.$$
Thus our claim follows in the case that $1\le p\le 2$.
If $p\ge 2$ we obtain for $(a_j)\in c_{00}$ first that
$$\Big\|\Big(\sum_{i=1}^\infty |a_i f_i|^2\Big)^{1/2}\Big\|_{L_p} \le C\Big(\sum_{i=1}^\infty |a_i|^2\Big)^{1/2}.$$
Since $\|f_i\|_{L_p}=1$ and $\|f_i\|_\infty\le C$ for $i\in\mathbb{N}$, we deduce that
$$1 = \|f_i\|_p^p = \int_0^1 |f_i(t)|^p\,dt \le C^p\, m(\{|f_i|\ge 1/(2C)\}) + \frac{1}{(2C)^p}\, m(\{|f_i|< 1/(2C)\}) \le C^p\, m(\{|f_i|\ge 1/(2C)\}) + \frac12,$$
and thus
$$m(\{|f_i|\ge 1/(2C)\}) \ge \frac{1}{2C^p}.$$
We deduce that
$$\Big\|\Big(\sum_{i=1}^\infty |a_i f_i|^2\Big)^{1/2}\Big\|_{L_p} = \Big(\int_0^1 \Big(\sum_{j=1}^\infty |a_j f_j(t)|^2\Big)^{p/2}\,dt\Big)^{1/p}$$
$$\ge \frac{1}{2C}\Big(\int_0^1 \Big(\sum_{j=1}^\infty |a_j|^2\, \mathbf{1}_{\{|f_j|\ge 1/(2C)\}}(t)\Big)^{p/2}\,dt\Big)^{1/p}$$
$$\ge \frac{1}{2C}\Big(\int_0^1 \sum_{j=1}^\infty |a_j|^2\, \mathbf{1}_{\{|f_j|\ge 1/(2C)\}}(t)\,dt\Big)^{1/2} \quad\text{(by Jensen's inequality)}$$
$$\ge \frac{1}{2C}\Big(\frac{1}{2C^p}\Big)^{1/2}\Big(\sum_{j=1}^\infty |a_j|^2\Big)^{1/2}.$$
Our claim follows therefore from the equivalence between the Lp -norm and the
square function norm in Lp (Theorem 6.2.5).
Bibliography
[AW]
F. Albiac and P. Wojtaszczyk, Characterization of 1-greedy bases. J. Approx. Theory 138 (2006), no. 1, 65–86.
[EW]
I. S. Edelstein and P. Wojtaszczyk, On projections and unconditional bases in direct sums of Banach spaces. Studia Math. 56 (1976), no. 3, 263–276.
[An]
A. D. Andrew, On subsequences of the Haar system in C(Δ), Israel J. Math. 31 (1978), no. 1, 85–90.
[BCLT] J. Bourgain, P.G. Casazza, J. Lindenstrauss, and L. Tzafriri, Banach
spaces with a unique unconditional basis, up to permutation, Memoirs
Amer. Math. Soc. No. 322, 1985.
[DT]
R. A. DeVore and V. N. Temlyakov, Some remarks on greedy algorithms, Adv. Comput. Math. 5 (1996), 173–187.
[DFOS]
S. J. Dilworth, D. Freeman, E. Odell, and Th. Schlumprecht, Greedy bases for Besov spaces. Constr. Approx. 34 (2011), no. 2, 281–296.
[DKSTW]
S. Dilworth, D. Kutzarova, K. Shuman, V. Temlyakov, and P. Wojtaszczyk, Weak convergence of greedy algorithms in Banach spaces. J. Fourier Anal. Appl. 14 (2008), no. 5-6, 609–628.
[DOSZ1]
S. Dilworth, E. Odell, Th. Schlumprecht, and A. Zsak, Renormings and symmetry properties of one-greedy bases, J. Approx. Theory 163 (2011), no. 9, 1049–1075.
[DOSZ2]
S. Dilworth, E. Odell, Th. Schlumprecht, and A. Zsak, Partial Unconditionality.
[DOSZ3]
S. Dilworth, E. Odell, Th. Schlumprecht, and A. Zsak, On the convergence of greedy algorithms for initial segments of the Haar basis. Math. Proc. Cambridge Philos. Soc. 148 (2010), no. 3, 519–529.
[1] S. Dilworth, D. Kutzarova, E. Odell, Th. Schlumprecht, and P. Wojtaszczyk, Weak thresholding greedy algorithms in Banach spaces. J. Funct. Anal. 263 (2012), no. 12, 3900–3921.
[2] S. Dilworth, D. Kutzarova, E. Odell, Th. Schlumprecht, and A. Zsak, Renorming spaces with greedy bases, J. Approx. Theory 188 (2014), 39–56.
[DKK]
S. J. Dilworth, N. J. Kalton, and D. Kutzarova, On the existence of almost greedy bases in Banach spaces. Dedicated to Professor Aleksander Pelczyński on the occasion of his 70th birthday. Studia Math. 159 (2003), no. 1, 67–101.
[En]
P. Enflo, A counterexample to the approximation problem in Banach spaces. Acta Math. 130 (1973), 309–317.
[GK]
M. Ganichev and N. J. Kalton, Convergence of the weak dual greedy algorithm in Lp-space, J. Approx. Theory 124 (2003), 89–95.
[HMVZ]
P. Hájek, V. Montesinos Santalucía, J. Vanderwerff, and V. Zizler, Biorthogonal systems in Banach spaces. CMS Books in Mathematics/Ouvrages de Mathématiques de la SMC, 26. Springer, New York, (2008) xviii+339 pp.
[Jo]
L. K. Jones, A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neural network training, Ann. Statist. 20 (1992), 608–613.
[Ka]
Y. Katznelson, An Introduction to Harmonic Analysis.
[KT1]
S. V. Konyagin and V. N. Temlyakov, A remark on greedy approximation in Banach spaces, East J. Approx. 5 (1999), no. 3, 365–379.
[KT2]
S. V. Konyagin and V. N. Temlyakov, Rate of convergence of pure greedy algorithm. East J. Approx. 5 (1999), no. 4, 493–499.
[LT]
J. Lindenstrauss and L. Tzafriri, Classical Banach Spaces I: Sequence Spaces. Springer-Verlag, Berlin, 1979.
[Ma]
A.I. Markushevich, On a basis in the wide sense for linear spaces, Dokl.
Akad. Nauk. 41 (1943), 241– 244.
[OP]
R.I. Ovsepian and A. Pelczyński, On the existence of a fundamental total and bounded biorthogonal sequence in every separable Banach space,
and related constructions of uniformly bounded orthonormal systems in
L2 . Studia Math. 54 (1975), no. 2, 149 – 159.
[Pe]
A. Pelczyński, All separable Banach spaces admit for every ε > 0 fundamental total and bounded by 1 + ε biorthogonal sequences. Studia Math.
55 (1976), no. 3, 295 – 304.
[Schl]
Th. Schlumprecht, Course Notes for Functional Analysis, Fall 2012,
http://www.math.tamu.edu/∼schlump/course notes FA2012.pdf
[Sch2]
Th. Schlumprecht, Embedding Theorem of Banach spaces into Banach
spaces with bases. Adv. Math. 274 (2015), 833 – 880
[Schm]
E. Schmidt, Zur Theorie der linearen und nichtlinearen Integralgleichungen. I, Math. Ann. 63 (1906), 433–476.
[T1]
V. Temlyakov, Nonlinear methods of approximation. Found. Comput. Math. 3 (2003), no. 1, 33–107.
[T2]
V. Temlyakov, Relaxation in greedy approximation, preprint.
[T3]
V. Temlyakov, Greedy algorithms in Banach spaces, Adv. Comp. Math.,
14 (2001) 277 – 292.
[Wo1]
P. Wojtaszczyk, A mathematical introduction to wavelets. London Mathematical Society Student Texts 37 (1997).
[Wo2]
P. Wojtaszczyk, Greedy algorithm for general biorthogonal systems. J. Approx. Theory 107 (2000), no. 2, 293–314.