Course Notes for Greedy Approximations Math 663-601

Th. Schlumprecht

March 10, 2015

Contents

1 The Threshold Algorithm
   1.1 Greedy and Quasi Greedy Bases
   1.2 The Haar basis is greedy in Lp[0, 1] and Lp(R)
   1.3 Quasi greedy but not unconditional

2 Greedy Algorithms In Hilbert Space
   2.1 Introduction
   2.2 Convergence
   2.3 Convergence Rates

3 Greedy Algorithms in general Banach Spaces
   3.1 Introduction
   3.2 Convergence of the Weak Dual Chebyshev Greedy Algorithm
   3.3 Weak Dual Greedy Algorithm with Relaxation
   3.4 Convergence Theorem for the Weak Dual Algorithm

4 Open Problems
   4.1 Greedy Bases
   4.2 Greedy Algorithms

5 Appendix A: Bases in Banach spaces
   5.1 Schauder bases
   5.2 Markushevich bases

6 Appendix B: Some facts about Lp[0, 1] and Lp(R)
   6.1 The Haar basis and Wavelets
   6.2 Khintchine’s inequality and Applications

Signals or images are often modeled as elements of some Banach space consisting of functions, for example $C(D)$, $L_p(D)$, or more generally Sobolev spaces $W^{r,p}(D)$, for a domain $D \subset \mathbb{R}^d$. These functions need to be “processed”: approximated, converted into an object which is storable, like a sequence of numbers, and then reconstructed.

This means finding an appropriate basis of the Banach space, or more generally a dictionary, and computing as many coordinates of the given function with respect to this basis as are necessary to satisfy the given error estimates. The question one then needs to answer is which coordinates to use, given a restriction on the budget.

Definition 0.0.1. Let $X$ (always) be a separable and real Banach space. We call $D \subset S_X$ a dictionary of $X$ if $\mathrm{span}(D)$ is dense and $x \in D$ implies that $-x \in D$.

An approximation algorithm is a map

\[ G : X \to \mathrm{span}(D)^{\mathbb{N}}, \quad x \mapsto G(x) = (G_n(x)), \]

with the property that for $n \in \mathbb{N}$ and $x \in X$ there is a set $\Lambda_n = \Lambda_{(n,x)} \subset D$ of cardinality at most $n$ so that $G_n(x) \in \mathrm{span}(\Lambda_n)$. For $n \in \mathbb{N}$ we call $G_n(x)$ the $n$-term approximation of $x$.

Usually $G_n(x)$ is computed inductively by maximizing a certain value; therefore these algorithms are often called greedy algorithms.

Remark. If $X$ has a basis $(e_n)$ with biorthogonals $(e_n^*)$, then $G = (P_n)$, where $P_n$ is the $n$-th canonical projection, is an example of an approximation algorithm. Nevertheless, the point is to be able to adapt the set $\Lambda_n$ to the vector $x$, rather than letting it be independent of $x$.

The main questions are

1) Does (Gn(x)) converge to x?

2) If so, how fast does it converge? How fast does it converge for certain x?

3) How does $\|x - G_n(x)\|$ compare to the best $n$-term approximation defined by

\[ \sigma_n(x) = \sigma_n(x, D) = \inf_{\Lambda \subset D,\ \#\Lambda = n}\ \inf_{z \in \mathrm{span}(\Lambda)} \|z - x\| \,? \]

Chapter 1

The Threshold Algorithm

1.1 Greedy and Quasi Greedy Bases

We start with the Threshold Algorithm:

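Definition 1.1.1. Let $X$ be a separable Banach space with a normalized M-basis $\big((e_i, e_i^*) : i \in \mathbb{N}\big)$ (by normalized we mean that $\|e_i\| = 1$ for $i \in \mathbb{N}$). For $n \in \mathbb{N}$ and $x \in X$ let $\Lambda_n \subset \mathbb{N}$ be of cardinality $n$ so that

\[ \min_{i \in \Lambda_n} |e_i^*(x)| \ge \max_{i \in \mathbb{N} \setminus \Lambda_n} |e_i^*(x)|, \]

i.e. we are reordering $(e_i^*(x))$ into $(e_{\sigma_i}^*(x))$, so that

\[ |e_{\sigma_1}^*(x)| \ge |e_{\sigma_2}^*(x)| \ge |e_{\sigma_3}^*(x)| \ge \dots, \]

and put for $n \in \mathbb{N}$

\[ \Lambda_n = \{\sigma_1, \sigma_2, \dots, \sigma_n\}. \]

Then define for $n \in \mathbb{N}$

\[ G^T_n(x) = \sum_{i \in \Lambda_n} e_i^*(x) e_i. \]

$(G^T_n)$ is called the Threshold Algorithm.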
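The definition can be tried out on finitely supported vectors. The following is only a minimal sketch under stated assumptions: `x` is represented by the list of its coefficients $e_i^*(x)$ (indexed from 0 rather than 1), ties are broken in favour of the smaller index, and the function names `threshold_support` and `threshold_approximation` are illustrative, not taken from the notes.

```python
def threshold_support(coeffs, n):
    """Return a greedy index set Lambda_n: n indices carrying the largest |coefficient|."""
    # sorted() is stable, so ties are broken in favour of the smaller index.
    order = sorted(range(len(coeffs)), key=lambda i: -abs(coeffs[i]))
    return sorted(order[:n])


def threshold_approximation(coeffs, n):
    """Coefficients of G^T_n(x): keep the entries indexed by Lambda_n, zero out the rest."""
    keep = set(threshold_support(coeffs, n))
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


x = [0.5, -3.0, 0.25, 2.0, -1.0]      # coefficients e_i^*(x), hypothetical data
print(threshold_support(x, 2))        # indices of the two largest |e_i^*(x)|
print(threshold_approximation(x, 2))  # coefficients of G^T_2(x)
```

Note that $\Lambda_n$ is in general not unique when several coefficients have the same absolute value; the tie-breaking rule above is just one admissible choice.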

Definition 1.1.2. A normalized M-basis $(e_i)$ is called quasi-greedy if for all $x \in X$

\[ x = \lim_{n \to \infty} G^T_n(x). \tag{QG} \]

A basis is called greedy if there is a constant $C$ so that for all $x \in X$ and all $n \in \mathbb{N}$

\[ \|x - G^T_n(x)\| \le C \sigma_n(x), \tag{G} \]

where we define

\[ \sigma_n(x) = \sigma_n\big(x, (e_j)\big) = \inf_{\Lambda \subset \mathbb{N},\ \#\Lambda = n}\ \inf_{z \in \mathrm{span}(e_j : j \in \Lambda)} \|z - x\|. \]

In that case we say that $(e_i)$ is $C$-greedy. We call the smallest constant $C$ for which (G) holds the greedy constant of $(e_n)$ and denote it by $C_g$.

Remarks. Let $\big((e_i, e_i^*) : i \in \mathbb{N}\big)$ be a normalized M-basis.

1. From the property that $(e_n)$ is fundamental we obtain that for every $x \in X$

\[ \sigma_n(x) \to 0 \quad \text{as } n \to \infty; \]

it follows therefore that every greedy basis is quasi-greedy.

2. If $(e_j)$ is an unconditional basis of $X$, and $x = \sum_{i=1}^\infty a_i e_i \in X$, then

\[ x = \lim_{n \to \infty} \sum_{j=1}^n a_{\pi(j)} e_{\pi(j)} \]

for any permutation $\pi : \mathbb{N} \to \mathbb{N}$ and thus, in particular, also for a greedy permutation, i.e. a permutation so that

\[ |a_{\pi(1)}| \ge |a_{\pi(2)}| \ge |a_{\pi(3)}| \ge \dots. \]

Thus, an unconditional basis is always quasi-greedy.

3. Schauder bases have a special order and might be reordered so that they cease to be bases. But

• unconditional bases,

• M-bases,

• quasi-greedy M-bases,

• greedy bases

keep their properties under any permutation, and can therefore be indexed by any countable set.

4. In order to obtain a quasi-greedy M-basis which is not a Schauder basis, one could take a quasi-greedy Schauder basis which is not unconditional (its existence will be shown later), but which admits a suitable reordering under which it is no longer a Schauder basis. Nevertheless, by the observations in (3), it will still be a quasi-greedy M-basis. But it seems to be unknown whether or not there is a quasi-greedy M-basis which cannot be reordered into a Schauder basis.

Examples 1.1.3.

1. If $1 \le p < \infty$, then the unit vector basis $(e_i)$ of $\ell_p$ is 1-greedy.

2. The unit vector basis $(e_i)$ in $c_0$ is 1-greedy.

3. The summing basis $(s_n)$ of $c_0$ ($s_n = \sum_{j=1}^n e_j$) is not quasi-greedy.

4. The unit vector basis of $(\ell_p \oplus \ell_q)_1$, $p \ne q$, is not greedy (but 1-unconditional and thus quasi-greedy).

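Proof. To prove (1) let $x = \sum_{j=1}^\infty x_j e_j \in \ell_p$, and let $\Lambda_n \subset \mathbb{N}$ be of cardinality $n$ so that

\[ \min\{|x_j| : j \in \Lambda_n\} \ge \max\{|x_j| : j \in \mathbb{N} \setminus \Lambda_n\}, \]

and let $\Lambda \subset \mathbb{N}$ be any subset of cardinality $n$ and $z = \sum z_i e_i \in \ell_p$ with

\[ \mathrm{supp}(z) = \{i \in \mathbb{N} : z_i \ne 0\} \subset \Lambda. \]

Then

\begin{align*}
\|x - z\|_p^p &= \sum_{j \in \Lambda} |x_j - z_j|^p + \sum_{j \in \mathbb{N} \setminus \Lambda} |x_j|^p \\
&\ge \sum_{j \in \Lambda} |x_j - z_j|^p + \sum_{j \in \mathbb{N} \setminus \Lambda_n} |x_j|^p \\
&\ge \sum_{j \in \mathbb{N} \setminus \Lambda_n} |x_j|^p = \|G^T_n(x) - x\|_p^p.
\end{align*}

Thus

\[ \sigma_n(x) = \inf\{\|z - x\|_p : \#\mathrm{supp}(z) \le n\} = \|G^T_n(x) - x\|_p. \]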

(2) can be shown in the same way as (1).
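The equality $\sigma_n(x) = \|G^T_n(x) - x\|_p$ can be checked numerically on short vectors. The sketch below is an illustration only: it works in a finite-dimensional $\ell_p^N$, uses the observation from the proof that for a fixed support the optimal $z$ agrees with $x$ on that support, and minimises by brute force over all supports of size $n$. The helper names are hypothetical.

```python
from itertools import combinations


def lp_norm(v, p):
    """The l_p norm of a finite sequence."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def greedy_error(x, n, p):
    """|| x - G^T_n(x) ||_p : drop the n entries of largest absolute value."""
    order = sorted(range(len(x)), key=lambda i: -abs(x[i]))
    keep = set(order[:n])
    return lp_norm([t for i, t in enumerate(x) if i not in keep], p)


def best_n_term_error(x, n, p):
    """sigma_n(x) by brute force: for a fixed support S the best z equals x on S."""
    errors = []
    for support in combinations(range(len(x)), n):
        chosen = set(support)
        errors.append(lp_norm([t for i, t in enumerate(x) if i not in chosen], p))
    return min(errors)


x = [0.3, -2.0, 1.5, 0.1, -0.7, 1.5]   # hypothetical test vector
for n in range(1, len(x) + 1):
    print(n, greedy_error(x, n, 3.0), best_n_term_error(x, n, 3.0))
```

The two printed columns agree up to rounding, as the proof of (1) predicts; the brute-force search is exponential in the length of `x` and is meant for small examples only.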

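In order to show (3) we choose sequences $(\varepsilon_j) \subset (0, 1)$ and $(n_j) \subset \mathbb{N}$ as follows:

\[ \varepsilon_{2j} = 2^{-j} \quad \text{and} \quad \varepsilon_{2j-1} = 2^{-j}\Big(1 + \frac{1}{j^3}\Big), \quad \text{for } j \in \mathbb{N}, \]

and

\[ n_j = j 2^j \quad \text{and} \quad N_j = \sum_{i=1}^{j} n_i \quad \text{for } j \in \mathbb{N}_0 \]

(so $N_0 = 0$). Note that the series

\begin{align*}
x &= \sum_{j=1}^\infty \sum_{i=N_{j-1}+1}^{N_j} \big(\varepsilon_{2j-1} s_{2i-1} - \varepsilon_{2j} s_{2i}\big) \\
&= \sum_{j=1}^\infty \sum_{i=N_{j-1}+1}^{N_j} \big((\varepsilon_{2j-1} - \varepsilon_{2j}) s_{2i-1} - \varepsilon_{2j} e_{2i}\big)
\end{align*}

converges, because

\[ \sum_{j=1}^\infty \sum_{i=N_{j-1}+1}^{N_j} \varepsilon_{2j} e_{2i} \in c_0 \]

and

\[ \sum_{j=1}^\infty \sum_{i=N_{j-1}+1}^{N_j} \big\|(\varepsilon_{2j-1} - \varepsilon_{2j}) s_{2i-1}\big\| = \sum_{j=1}^\infty n_j (\varepsilon_{2j-1} - \varepsilon_{2j}) = \sum_{j=1}^\infty \frac{1}{j^2} < \infty. \]

Since $\varepsilon_1 > \varepsilon_2 > \varepsilon_3 > \dots$, the threshold algorithm exhausts the coefficients block by block. We compute for $l \in \mathbb{N}_0$ the vector $x - G^T_{2N_l + n_{l+1}}(x)$:

\[ x - G^T_{2N_l + n_{l+1}}(x) = - \sum_{i=N_l+1}^{N_{l+1}} \varepsilon_{2l+2} s_{2i} + \sum_{j=l+2}^\infty \sum_{i=N_{j-1}+1}^{N_j} \big(\varepsilon_{2j-1} s_{2i-1} - \varepsilon_{2j} s_{2i}\big). \]

From the monotonicity of $(s_i)$ we deduce that

\[ \Big\| \sum_{j=l+2}^\infty \sum_{i=N_{j-1}+1}^{N_j} \big(\varepsilon_{2j-1} s_{2i-1} - \varepsilon_{2j} s_{2i}\big) \Big\| \le \|x\|. \]

However,

\[ \Big\| \sum_{i=N_l+1}^{N_{l+1}} \varepsilon_{2l+2} s_{2i} \Big\| = \sum_{i=N_l+1}^{N_{l+1}} \varepsilon_{2l+2} = n_{l+1} 2^{-(l+1)} = l + 1 \to \infty \quad \text{as } l \to \infty, \]

which implies that $(G^T_n(x))$ is not convergent.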
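The blow-up of the residuals can be observed on a finite truncation of the series. The following sketch makes several assumptions that are not in the notes: the series is cut off after `J` blocks, vectors are stored by their coefficients with respect to the summing basis, and the $c_0$-norm is evaluated through the identity that the $m$-th coordinate of $\sum_k a_k s_k$ equals $\sum_{k \ge m} a_k$. The function names are illustrative.

```python
def c0_norm_in_summing_basis(a):
    """sup-norm of sum_k a[k] * s_{k+1}, where s_n = e_1 + ... + e_n."""
    best, tail = 0.0, 0.0
    for c in reversed(a):          # the m-th coordinate is the tail sum of the a_k
        tail += c
        best = max(best, abs(tail))
    return best


def build_coefficients(J):
    """Coefficients of the truncated x (blocks j = 1..J) and the block sizes n_j."""
    a, sizes = [], []
    for j in range(1, J + 1):
        n_j = j * 2 ** j
        eps_even = 2.0 ** (-j)                 # epsilon_{2j}
        eps_odd = eps_even * (1 + 1 / j ** 3)  # epsilon_{2j-1}
        a += [eps_odd, -eps_even] * n_j        # coefficients of s_{2i-1}, s_{2i}
        sizes.append(n_j)
    return a, sizes


def residual_norm(a, m):
    """|| x - G^T_m(x) || for the vector with summing-basis coefficients a."""
    order = sorted(range(len(a)), key=lambda k: -abs(a[k]))
    keep = set(order[:m])
    return c0_norm_in_summing_basis([0.0 if k in keep else c for k, c in enumerate(a)])


a, sizes = build_coefficients(J=7)
print("norm of x:", c0_norm_in_summing_basis(a))
for l in range(6):
    m = 2 * sum(sizes[:l]) + sizes[l]          # m = 2 N_l + n_{l+1}
    print(l, m, residual_norm(a, m))
```

The printed residual norms grow roughly like $l + 1$ while the norm of the truncated $x$ stays bounded, in line with the computation above.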

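To show (4) assume w.l.o.g. $p < q$, and denote the unit vector basis of $\ell_p$ by $(e_i)$ and the unit vector basis of $\ell_q$ by $(f_j)$. For $n \in \mathbb{N}$ we put

\[ x(n) = \sum_{j=1}^n \frac{1}{2} e_j + \sum_{j=1}^n f_j. \]

Thus

\[ G^T_n(x(n)) = \sum_{j=1}^n f_j, \quad \text{and thus} \quad \big\|G^T_n(x(n)) - x(n)\big\| = \frac{1}{2} n^{1/p}. \]

Nevertheless

\[ \Big\| x(n) - \sum_{j=1}^n \frac{1}{2} e_j \Big\| = n^{1/q}, \]

and since $\frac{1}{2} n^{1/p} / n^{1/q} \nearrow \infty$ for $n \nearrow \infty$, the basis $\{e_j : j \in \mathbb{N}\} \cup \{f_j : j \in \mathbb{N}\}$ cannot be greedy.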
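The growth of the ratio in the proof of (4) is easy to tabulate. The sketch below simply evaluates the two norms computed above for the vectors $x(n)$; the function name `greedy_ratio` and the sample values of $p$ and $q$ are assumptions made for illustration.

```python
def greedy_ratio(n, p, q):
    """Ratio of the greedy error to a competing n-term error for x(n), assuming p < q."""
    # || x(n) - G^T_n(x(n)) ||: the n coefficients 1/2 in the l_p part remain.
    greedy_error = 0.5 * n ** (1.0 / p)
    # || x(n) - sum_j (1/2) e_j ||: the n coefficients 1 in the l_q part remain.
    competitor_error = n ** (1.0 / q)
    return greedy_error / competitor_error


for n in (10, 10 ** 3, 10 ** 6):
    print(n, greedy_ratio(n, p=1.5, q=3.0))
```

Since the ratio equals $\frac12 n^{1/p - 1/q}$, no constant $C$ can satisfy (G) for all $n$.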


Remarks. With the arguments used in Examples 1.1.3 (4) one can show that the usual bases of $\big(\oplus_{n=1}^\infty \ell_q^n\big)_{\ell_p}$ and $\ell_p(\ell_q) = \big(\oplus_{n=1}^\infty \ell_q\big)_{\ell_p}$ are also not greedy, but of course unconditional.

Now in [BCLT] it was shown that $\ell_p \oplus \ell_q$ has, up to permutation and up to isomorphic equivalence, a unique unconditional basis, namely the one indicated above. Since, as will be shown later, every greedy basis must be unconditional, the space does not have any greedy basis.

Due to a result in [DFOS], however, $\big(\oplus_{n=1}^\infty \ell_q^n\big)_{\ell_p}$ has a greedy basis if $1 < p, q < \infty$. More precisely, the following was shown: Let $1 \le p, q \le \infty$.

a) If $1 < q < \infty$ then the Banach space $\big(\oplus_{n=1}^\infty \ell_p^n\big)_{\ell_q}$ has a greedy basis.

b) If $q = 1$ or $q = \infty$, and $p \ne q$, then $\big(\oplus_{n=1}^\infty \ell_p^n\big)_{\ell_q}$ does not have a greedy basis. Here we take the $c_0$-sum if $q = \infty$.

The question whether or not $\ell_p(\ell_q)$ has a greedy basis is open and quite interesting.

The following result by Wojtaszczyk can be seen as the analogue, for quasi-greedy bases, of the characterization of Schauder bases by the uniform boundedness of the canonical projections.

Theorem 1.1.4. [Wo2] A bounded M-basis $(e_i, e_i^*)$, with $\|e_i\| = 1$, $i \in \mathbb{N}$, of a Banach space $X$ is quasi-greedy if and only if there is a constant $C$ so that for any $x \in X$ and any $m \in \mathbb{N}$ it follows that

\[ \|G^T_m(x)\| \le C \|x\|. \tag{1.1} \]

We call the smallest constant so that (1.1) is satisfied the Greedy Projection Constant.

Remark. Theorem 1.1.4 is basically a uniform boundedness result. Nevertheless, since the $G^T_m$ are nonlinear projections we need a direct proof.

We need first the following Lemma:

Lemma 1.1.5. Assume there is no positive number $C$ so that $\|G^T_m(x)\| \le C \|x\|$ for all $x \in X$ and all $m \in \mathbb{N}$. Then the following holds:

For all finite $A \subset \mathbb{N}$ and all $K > 0$ there is a finite $B \subset \mathbb{N}$, which is disjoint from $A$, and a vector $x$, with $x = \sum_{j \in B} x_j e_j$, such that $\|x\| = 1$ and $\|G^T_m(x)\| \ge K$ for some $m \in \mathbb{N}$.

Proof. For a finite set $F \subset \mathbb{N}$, define $P_F$ to be the coordinate projection onto $\mathrm{span}(e_i : i \in F)$ generated by the $(e_i^*)$, i.e.

\[ P_F : X \to \mathrm{span}(e_i : i \in F), \quad x \mapsto P_F(x) = \sum_{j \in F} e_j^*(x) e_j. \]

Since there are only finitely many subsets of $A$ we can put

\[ M = \max_{F \subset A} \|P_F\| = \max_{F \subset A} \sup_{x \in B_X} \Big\| \sum_{j \in F} e_j^*(x) e_j \Big\| \le \sum_{j \in A} \|e_j^*\| \cdot \|e_j\| < \infty. \]

Let $K_1 > 1$ so that $(K_1 - M)/(M + 1) > K$, and choose $x_1 \in S_X \cap \mathrm{span}(e_j : j \in \mathbb{N})$ and $k \in \mathbb{N}$ so that $\|G^T_k(x_1)\| \ge K_1$. We assume without loss of generality (after a suitable small perturbation) that all the non-zero numbers $|e_n^*(x_1)|$ are different from each other.

Then let $x_2 = x_1 - P_A(x_1)$, and note that $\|x_2\| \le M + 1$ and $G^T_k(x_1) = G^T_m(x_2) + P_F(x_1)$ for some $m \le k$ and $F \subset A$. Thus $\|G^T_m(x_2)\| \ge K_1 - M$, and if we define $x_3 = x_2 / \|x_2\|$, we have $\|G^T_m(x_3)\| \ge (K_1 - M)/(M + 1) > K$.

It follows that the support $B$ of $x = x_3$ is disjoint from $A$ and that $\|G^T_m(x)\| > K$.

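Proof of Theorem 1.1.4. Let $b = \sup_i \|e_i^*\|$.

“⇒” Assume there is no positive number $C$ so that $\|G^T_m(x)\| \le C \|x\|$ for all $x \in X$ and all $m \in \mathbb{N}$.

Applying Lemma 1.1.5 we can choose recursively vectors $y_1, y_2, \dots$ in $S_X \cap \mathrm{span}(e_j : j \in \mathbb{N})$ and numbers $m_n \in \mathbb{N}$, so that the supports of the $y_n$, which we denote by $B_n$, are pairwise disjoint (recall that for $z = \sum_{i=1}^\infty z_i e_i$ we call $\mathrm{supp}(z) = \{i \in \mathbb{N} : e_i^*(z) \ne 0\}$ the support of $z$), and so that

\[ \|G^T_{m_n}(y_n)\| \ge 2^n b^n \prod_{j=1}^{n-1} \varepsilon_j^{-1}, \tag{1.2} \]

where

\[ \varepsilon_j = \min\big\{ 2^{-j}, \min\{|e_i^*(y_j)| : i \in B_j\} \big\}. \]

Then we let

\[ x = \sum_{n=1}^\infty \Big( \prod_{j=1}^{n-1} (\varepsilon_j / b) \Big) y_n \]

(which clearly converges) and write $x$ as

\[ x = \sum_{j=1}^\infty x_j e_j. \]

Since $|e_i^*(y_j)| \le b$ for $i, j \in \mathbb{N}$,

\[ \min\Big\{ |x_i| : i \in \bigcup_{j=1}^n B_j \Big\} \ge \Big( \prod_{j=1}^{n-1} \frac{\varepsilon_j}{b} \Big) \varepsilon_n = \Big( \prod_{j=1}^{n} \frac{\varepsilon_j}{b} \Big) b \ge \max\Big\{ |x_i| : i \in \mathbb{N} \setminus \bigcup_{j=1}^n B_j \Big\}. \]

We may assume w.l.o.g. that $m_n \le \#B_n$ for $n \in \mathbb{N}$. Letting $k_j = m_j + \sum_{i=1}^{j-1} \#B_i$, it follows that

\[ G^T_{k_j}(x) = \sum_{i=1}^{j-1} \Big( \prod_{s=1}^{i-1} (\varepsilon_s / b) \Big) y_i + G^T_{m_j}\Big( \Big( \prod_{i=1}^{j-1} (\varepsilon_i / b) \Big) y_j \Big), \]

and thus by (1.2), using $\|y_i\| = 1$ and $\varepsilon_s / b \le 2^{-s}$,

\[ \big\|G^T_{k_j}(x)\big\| \ge \Big\| G^T_{m_j}\Big( \Big( \prod_{i=1}^{j-1} (\varepsilon_i / b) \Big) y_j \Big) \Big\| - \sum_{i=1}^{j-1} \Big( \prod_{s=1}^{i-1} (\varepsilon_s / b) \Big) \|y_i\| \ge 2^j b - 2, \]

which implies that $(G^T_{k_j}(x))$ is unbounded and hence does not converge.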

“⇐” Let $C > 0$ be such that $\|G^T_m(x)\| \le C \|x\|$ for all $m \in \mathbb{N}$ and all $x \in X$. Let $x \in X$ and assume w.l.o.g. that $\mathrm{supp}(x)$ is infinite. For $\varepsilon > 0$ choose $x_0$ with finite support $A$ so that $\|x - x_0\| < \varepsilon$. Using small perturbations we can assume that $A \subset \mathrm{supp}(x)$ and that $A \subset \mathrm{supp}(x - x_0)$. We can therefore choose $m \in \mathbb{N}$ large enough so that $G^T_m(x)$ and $G^T_m(x - x_0)$ are of the form

\[ G^T_m(x) = \sum_{i \in B} e_i^*(x) e_i \quad \text{and} \quad G^T_m(x - x_0) = \sum_{i \in B} e_i^*(x - x_0) e_i \]

with $B \subset \mathbb{N}$ such that $A \subset B$. It follows therefore that

\[ \|x - G^T_m(x)\| \le \|x - x_0\| + \|x_0 - G^T_m(x)\| = \|x - x_0\| + \|G^T_m(x_0 - x)\| \le (1 + C)\varepsilon, \]

which implies our claim, since $\varepsilon > 0$ can be chosen arbitrarily small.

Definition 1.1.6. An M-basis $(e_j, e_j^*)$ is called unconditional for constant coefficients if there is a positive constant $C$ so that for all finite sets $A \subset \mathbb{N}$ and all $(\sigma_n : n \in A) \subset \{\pm 1\}$ we have

\[ \frac{1}{C} \Big\| \sum_{n \in A} e_n \Big\| \le \Big\| \sum_{n \in A} \sigma_n e_n \Big\| \le C \Big\| \sum_{n \in A} e_n \Big\|. \]

Proposition 1.1.7. A quasi-greedy M-basis $(e_n, e_n^*)$ is unconditional for constant coefficients. Actually the constant in Definition 1.1.6 can be chosen to be equal to twice the projection constant in Theorem 1.1.4.

Remark. We will show later that there are quasi-greedy bases which are not unconditional. Actually there are Banach spaces which do not contain any unconditional basic sequence, but in which every normalized weakly null sequence contains a quasi-greedy subsequence.


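Proof of Proposition 1.1.7. Let $A \subset \mathbb{N}$ be finite and $(\sigma_n : n \in A) \subset \{\pm 1\}$. Then if we let $\delta \in (0, 1)$ and put $m = \#\{j \in A : \sigma_j = +1\}$ we obtain

\begin{align*}
\Big\| \sum_{n \in A,\ \sigma_n = +1} e_n \Big\| &= \Big\| G^T_m\Big( \sum_{n \in A,\ \sigma_n = +1} e_n + \sum_{n \in A,\ \sigma_n = -1} (1 - \delta) e_n \Big) \Big\| \\
&\le C \Big\| \sum_{n \in A,\ \sigma_n = +1} e_n + \sum_{n \in A,\ \sigma_n = -1} (1 - \delta) e_n \Big\|.
\end{align*}

By taking $\delta > 0$ to be arbitrarily small, we obtain that

\[ \Big\| \sum_{n \in A,\ \sigma_n = +1} e_n \Big\| \le C \Big\| \sum_{n \in A} e_n \Big\|. \]

Similarly we have

\[ \Big\| \sum_{n \in A,\ \sigma_n = -1} e_n \Big\| \le C \Big\| \sum_{n \in A} e_n \Big\|, \]

and thus,

\[ \Big\| \sum_{n \in A} \sigma_n e_n \Big\| \le 2C \Big\| \sum_{n \in A} e_n \Big\|. \]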

We now present a characterization of greedy bases obtained by Konyagin and Temliakov. We need the following notation.

Definition 1.1.8. We call a normalized basic sequence $(e_j)$ democratic if there is a constant $C$ so that for all finite $E, F \subset \mathbb{N}$ with $\#E = \#F$ it follows that

\[ \Big\| \sum_{j \in E} e_j \Big\| \le C \Big\| \sum_{j \in F} e_j \Big\|. \tag{1.3} \]

In that case we call the smallest constant so that (1.3) holds the Constant of Democracy of $(e_i)$ and denote it by $C_d$.

Theorem 1.1.9. [KT1] A normalized basis $(e_n)$ is greedy if and only if it is unconditional and democratic. In this case

\[ \max(C_s, C_d) \le C_g \le C_d C_s C_u^2 + C_u, \tag{1.4} \]

where $C_u$ is the unconditional constant and $C_s$ is the suppression constant.

Remark. The proof will show that the first inequality is sharp. Recently it was shown in [DOSZ1] that the second inequality is also sharp.


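Proof of Theorem 1.1.9. “⇐” Let $x = \sum e_i^*(x) e_i \in X$, $n \in \mathbb{N}$ and let $\eta > 0$. Choose $\tilde{x} = \sum_{i \in \Lambda_n^*} a_i e_i$, with $\#\Lambda_n^* = n$, which is up to $\eta$ the best $n$-term approximation to $x$ (since we allow $a_i$ to be 0, we can assume that $\#\Lambda_n^*$ is exactly $n$), i.e.

\[ \|x - \tilde{x}\| \le \sigma_n(x) + \eta. \tag{1.5} \]

Let $\Lambda_n$ be a set of $n$ coordinates for which

\[ b := \min_{i \in \Lambda_n} |e_i^*(x)| \ge \max_{i \in \mathbb{N} \setminus \Lambda_n} |e_i^*(x)| \quad \text{and} \quad G^T_n(x) = \sum_{i \in \Lambda_n} e_i^*(x) e_i. \]

We need to show that

\[ \|x - G^T_n(x)\| \le (C_d C_s C_u^2 + C_u)(\sigma_n(x) + \eta). \]

We have

\[ x - G^T_n(x) = \sum_{i \in \mathbb{N} \setminus \Lambda_n} e_i^*(x) e_i = \sum_{i \in \Lambda_n^* \setminus \Lambda_n} e_i^*(x) e_i + \sum_{i \in \mathbb{N} \setminus (\Lambda_n^* \cup \Lambda_n)} e_i^*(x) e_i. \]

But we also have

\begin{align*}
\Big\| \sum_{i \in \Lambda_n^* \setminus \Lambda_n} e_i^*(x) e_i \Big\| &\le b C_u \Big\| \sum_{i \in \Lambda_n^* \setminus \Lambda_n} e_i \Big\| && \text{(by Proposition 5.1.11)} \tag{1.6} \\
&\le b C_u C_d \Big\| \sum_{i \in \Lambda_n \setminus \Lambda_n^*} e_i \Big\| && \text{[note that } \#(\Lambda_n \setminus \Lambda_n^*) = \#(\Lambda_n^* \setminus \Lambda_n)\text{]} \\
&\le C_u^2 C_d \Big\| \sum_{i \in \Lambda_n \setminus \Lambda_n^*} e_i^*(x) e_i \Big\| && \text{[note that } |e_i^*(x)| \ge b \text{ if } i \in \Lambda_n \setminus \Lambda_n^*\text{]} \\
&\le C_s C_u^2 C_d \Big\| \sum_{i \in \Lambda_n^*} (e_i^*(x) - a_i) e_i + \sum_{i \in \mathbb{N} \setminus \Lambda_n^*} e_i^*(x) e_i \Big\| \\
&= C_s C_u^2 C_d \|x - \tilde{x}\| \le C_s C_u^2 C_d (\sigma_n(x) + \eta)
\end{align*}

and

\begin{align*}
\Big\| \sum_{i \in \mathbb{N} \setminus (\Lambda_n^* \cup \Lambda_n)} e_i^*(x) e_i \Big\| &\le C_s \Big\| \sum_{i \in \Lambda_n^*} (e_i^*(x) - a_i) e_i + \sum_{i \in \mathbb{N} \setminus \Lambda_n^*} e_i^*(x) e_i \Big\| \tag{1.7} \\
&= C_s \|x - \tilde{x}\| \le C_s (\sigma_n(x) + \eta).
\end{align*}

This shows that $(e_i)$ is greedy and, since $\eta > 0$ is arbitrary, we deduce that $C_g \le C_s C_u^2 C_d + C_s \le C_d C_s C_u^2 + C_u$ (recall that $C_s \le C_u$).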


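“⇒” Assume that $(e_i)$ is greedy. In order to show that $(e_i)$ is democratic let $\Lambda_1, \Lambda_2 \subset \mathbb{N}$ with $\#\Lambda_1 = \#\Lambda_2$. Let $\eta > 0$ and put $m = \#(\Lambda_2 \setminus \Lambda_1)$ and

\[ x = \sum_{i \in \Lambda_1} e_i + (1 + \eta) \sum_{i \in \Lambda_2 \setminus \Lambda_1} e_i. \]

Then it follows that

\begin{align*}
\Big\| \sum_{i \in \Lambda_1} e_i \Big\| &= \|x - G^T_m(x)\| \\
&\le C_g \sigma_m(x) && \text{(since } (e_i) \text{ is } C_g\text{-greedy)} \\
&\le C_g \Big\| x - \sum_{i \in \Lambda_1 \setminus \Lambda_2} e_i \Big\| \le C_g \Big\| \sum_{i \in \Lambda_1 \cap \Lambda_2} e_i + (1 + \eta) \sum_{i \in \Lambda_2 \setminus \Lambda_1} e_i \Big\|.
\end{align*}

Since $\eta > 0$ can be taken arbitrarily small, we deduce that

\[ \Big\| \sum_{i \in \Lambda_1} e_i \Big\| \le C_g \Big\| \sum_{i \in \Lambda_2} e_i \Big\|. \]

Thus, it follows that $(e_i)$ is democratic and $C_d \le C_g$.

In order to show that $(e_i)$ is unconditional let $x = \sum e_i^*(x) e_i \in X$ have finite support $S$. Let $\Lambda \subset S$ and put

\[ y = \sum_{i \in \Lambda} e_i^*(x) e_i + b \sum_{i \in S \setminus \Lambda} e_i, \]

with $b > \max_{i \in S} |e_i^*(x)|$. For $n = \#(S \setminus \Lambda)$ it follows that

\[ G^T_n(y) = b \sum_{i \in S \setminus \Lambda} e_i, \]

and since $(e_i)$ is greedy we deduce that (note that $\#\mathrm{supp}(y - x) = n$)

\[ \Big\| \sum_{i \in \Lambda} e_i^*(x) e_i \Big\| = \|y - G^T_n(y)\| \le C_g \sigma_n(y) \le C_g \|y - (y - x)\| = C_g \|x\|, \]

which implies that $(e_i)$ is unconditional with $C_s \le C_g$.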

1.2 The Haar basis is greedy in Lp[0, 1] and Lp(R)

Theorem 1.2.1. For $1 < p < \infty$ there are two constants $c_p \le C_p$, depending only on $p$, so that for all $n \in \mathbb{N}$ and all $A \subset T$ with $\#A = n$

\[ c_p n^{1/p} \le \Big\| \sum_{t \in A} h_t^{(p)} \Big\| \le C_p n^{1/p}. \]

In particular $(h_t^{(p)})_{t \in T}$ is democratic in $L_p[0, 1]$.

With Theorem 1.1.9 and Theorem 6.1.1 we deduce that

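Corollary 1.2.2. The Haar basis of $L_p[0, 1]$, $1 < p < \infty$, is greedy.

Lemma 1.2.3. For each admissible $q$ (the range is truncated in the source; the proof below only needs $0 < q < \infty$) there is a constant $d_q > 0$ so that the following holds. Let $n_1 < n_2 < \dots < n_k$ be integers and let $E_j \subset [0, 1]$ be measurable for $j = 1, \dots, k$. Then we have

\[ \int_0^1 \Big( \sum_{j=1}^k 2^{n_j/q} 1_{E_j}(x) \Big)^q dx \le d_q \sum_{j=1}^k 2^{n_j} m(E_j). \]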

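Proof. Define

\[ f(x) = \sum_{j=1}^k 2^{n_j/q} 1_{E_j}(x). \]

For $j = 1, \dots, k$ write $E_j' = E_j \setminus \bigcup_{i=j+1}^k E_i$. It follows that for $x \in E_j'$

\[ f(x) \le \sum_{i=1}^j 2^{n_i/q} \le \sum_{i=0}^{n_j} 2^{i/q} = \frac{2^{(n_j+1)/q} - 1}{2^{1/q} - 1} \le \underbrace{\frac{2^{1/q}}{2^{1/q} - 1}}_{d_q^{1/q}} 2^{n_j/q}. \]

Since the sets $E_j'$ are pairwise disjoint and $f$ vanishes outside of their union, we obtain

\[ \int_0^1 f(x)^q dx \le d_q \sum_{i=1}^k 2^{n_i} m(E_i') \le d_q \sum_{j=1}^k 2^{n_j} m(E_j), \]

which finishes the proof.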

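Lemma 1.2.4. For $1 < p < \infty$ there is a $C_p > 0$ so that for all $n \in \mathbb{N}$, $A \subset T$ with $\#A = n$, and $(\varepsilon_t) \subset \{-1, 1\}$ it follows that

\[ \Big\| \sum_{t \in A} \varepsilon_t h_t^{(p)} \Big\|_p \le C_p n^{1/p}. \]

Proof. Abbreviate $h_t = h_t^{(p)}$ for $t \in T$. Let $n_1 < n_2 < \dots < n_k$ be all the integers $n_i$ for which there is a $t \in A$ so that $m(\mathrm{supp}(h_t)) = 2^{-n_i}$. For $j = 1, \dots, k$ put

\[ E_j = \bigcup_{i \in \{0, 1, \dots, 2^{n_j} - 1\},\ (n_j, i) \in A} \mathrm{supp}(h_{(n_j, i)}). \]

Then

\[ m(E_j) = 2^{-n_j} \#\big\{ i \in \{0, 1, \dots, 2^{n_j} - 1\} : (n_j, i) \in A \big\} \]

and thus

\[ \#\big\{ i \in \{0, 1, \dots, 2^{n_j} - 1\} : (n_j, i) \in A \big\} = 2^{n_j} m(E_j). \]

It follows therefore that

\[ n = \begin{cases} \sum_{j=1}^k \#\{ i \in \{0, 1, \dots, 2^{n_j} - 1\} : (n_j, i) \in A \} = \sum_{j=1}^k 2^{n_j} m(E_j) & \text{if } 0 \notin A, \\ 1 + \sum_{j=1}^k 2^{n_j} m(E_j) & \text{if } 0 \in A. \end{cases} \]

Assume without loss of generality that $0 \notin A$. It follows that

\[ \Big\| \sum_{t \in A} \varepsilon_t h_t \Big\|_p \le \Big[ \int_0^1 \Big[ \sum_{j=1}^k 2^{n_j/p} 1_{E_j} \Big]^p dx \Big]^{1/p} \le d_p^{1/p} \Big[ \sum_{j=1}^k 2^{n_j} m(E_j) \Big]^{1/p} = d_p^{1/p} n^{1/p} \]

[$d_p$ as in Lemma 1.2.3].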

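Lemma 1.2.5. For $1 < p < \infty$ there is a $c_p > 0$ so that for all $n \in \mathbb{N}$, $A \subset T$ with $\#A = n$, and $(\varepsilon_t) \subset \{-1, 1\}$ it follows that

\[ \Big\| \sum_{t \in A} \varepsilon_t h_t^{(p)} \Big\|_p \ge c_p n^{1/p}. \]

Proof. Note that for $1 < p, q < \infty$ with $\frac{1}{p} + \frac{1}{q} = 1$ and $s, t \in T$ it follows that

\[ \langle h_t^{(p)}, h_s^{(q)} \rangle = \delta(t, s); \]

thus the claim follows from the fact that the $h_t^{(p)}$'s are normalized in $L_p[0, 1]$ and from Lemma 1.2.4, using the duality between $L_p[0, 1]$ and $L_q[0, 1]$. Indeed,

\[ \Big\| \sum_{t \in A} \varepsilon_t h_t^{(p)} \Big\|_p \ge \Big\langle \sum_{t \in A} \varepsilon_t h_t^{(p)}, \frac{\sum_{t \in A} \varepsilon_t h_t^{(q)}}{\big\| \sum_{t \in A} \varepsilon_t h_t^{(q)} \big\|_q} \Big\rangle = \frac{n}{\big\| \sum_{t \in A} \varepsilon_t h_t^{(q)} \big\|_q} \ge \frac{n^{1/p}}{C_q}, \]

where $C_q$ is chosen as in Lemma 1.2.4. Our claim follows therefore by letting $c_p = 1/C_q$.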

1.3 A quasi greedy basis of Lp[0, 1] which is not unconditional

In this section we make the general assumption on a separable Banach space $X$ that $X$ has a normalized basis $(e_n)$ which is Besselian, meaning that for some constant $C_B$

\[ \|x\| = \Big\| \sum_{j=1}^\infty e_j^*(x) e_j \Big\| \ge \frac{1}{C_B} \Big( \sum_{j=1}^\infty |e_j^*(x)|^2 \Big)^{1/2} \quad \text{for all } x \in X, \tag{1.8} \]

where $(e_j^*)$ denote the coordinate functionals for $(e_j)$. We secondly assume that $(e_j)$ has a subsequence $(e_{m_j} : j \in \mathbb{N})$ which is Hilbertian, which means that for some constant $C_H$

\[ \Big\| \sum_{j=1}^\infty e_{m_j}^*(x) e_{m_j} \Big\| \le C_H \Big( \sum_{j=1}^\infty |e_{m_j}^*(x)|^2 \Big)^{1/2} \quad \text{for all } x \in \mathrm{span}(e_{m_j} : j \in \mathbb{N}). \tag{1.9} \]

Example 1.3.1. An example of such a basic sequence is given by the trigonometric polynomials $(t_n : n \in \mathbb{Z})$ in $L_p[0, 1]$ with $p > 2$. Indeed, for $(a_n : |n| \le N) \subset \mathbb{C}$ it follows from Hölder's (or Jensen's) inequality that

\[ \Big( \int_0^1 \Big| \sum_{n=-N}^N a_n e^{2\pi i n \xi} \Big|^p d\xi \Big)^{1/p} \ge \Big( \int_0^1 \Big| \sum_{n=-N}^N a_n e^{2\pi i n \xi} \Big|^2 d\xi \Big)^{1/2} = \Big( \sum_{n=-N}^N |a_n|^2 \Big)^{1/2}. \]

Secondly it follows from the complex version of Khintchine's inequality (Theorem 6.2.4) that the subsequence $(t_{2^n} : n \in \mathbb{N})$ of the trigonometric polynomials is equivalent to the $\ell_2$-unit vector basis.

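We recall the $2^n$ by $2^n$ matrices $A^{(n)} = (a^{(n)}_{(i,j)} : 1 \le i, j \le 2^n)$, for $n \in \mathbb{N}$, which were introduced in Section 5.2. Let us recall the following two properties which we will need here:

\[ A^{(n)} \text{ is a unitary operator on } \ell_2^{2^n}, \text{ and} \tag{1.10} \]

\[ a^{(n)}_{(j,1)} = 2^{-n/2}. \tag{1.11} \]

For $k \in \mathbb{N}$ we put $n_k = 2^{2^k}$, which implies that $n_{k+1} = n_k^2$, and $B^{(k)} = (b^{(k)}_{(i,j)} : 1 \le i, j \le n_k) = A^{(2^k)}$ (acting on $\ell_2^{n_k}$). We let

\[ (h_j : j \in \mathbb{N}) = (e_{m_j} : j \in \mathbb{N}) \quad \text{and} \quad (f_i : i \in \mathbb{N}) = (e_s : s \in \mathbb{N} \setminus \{m_i : i \in \mathbb{N}\}), \]

so that if $f_i = e_s$ and $f_j = e_t$ then $i < j$ if and only if $s < t$. For $k \in \mathbb{N}$ we define a family $(g^{(k)}_j : j = 1, 2, \dots, n_k)$ as follows:

\[ g^{(k)}_1 = f_k \quad \text{and} \quad g^{(k)}_i = h_{S_{k-1} + i - 1}, \quad \text{for } i = 2, 3, \dots, n_k, \]

where $S_0 = 0$ and, inductively, $S_j = S_{j-1} + n_j - 1$. If we order $(g^{(k)}_j : k \in \mathbb{N}, j = 1, 2, \dots, n_k)$ lexicographically we note that the sequence

\[ \big( g^{(1)}_1, g^{(1)}_2, \dots, g^{(1)}_{n_1}, g^{(2)}_1, \dots, g^{(2)}_{n_2}, g^{(3)}_1, \dots \big) \]

is equal to the sequence

\[ \big( f_1, h_1, h_2, \dots, h_{n_1 - 1}, f_2, h_{n_1}, \dots, h_{n_1 + n_2 - 2}, f_3, \dots \big). \]

Then we define for $k \in \mathbb{N}$ a new system of elements $(\psi^{(k)}_j : j = 1, 2, \dots, n_k)$ by

\[ \begin{pmatrix} \psi^{(k)}_1 \\ \psi^{(k)}_2 \\ \vdots \\ \psi^{(k)}_{n_k} \end{pmatrix} = B^{(k)} \circ \begin{pmatrix} g^{(k)}_1 \\ g^{(k)}_2 \\ \vdots \\ g^{(k)}_{n_k} \end{pmatrix} \tag{1.12} \]

or, in other words,

\[ \psi^{(k)}_i = \sum_{j=1}^{n_k} b^{(k)}_{(i,j)} g^{(k)}_j \quad \text{for } i = 1, 2, \dots, n_k. \]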
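The matrices $A^{(n)}$ themselves are defined in Section 5.2 and are not reproduced here. As an illustration only, the sketch below uses the normalized Sylvester-Hadamard matrices as a stand-in: this choice is an assumption, made because these matrices visibly satisfy the two properties (1.10) and (1.11) that the construction relies on. The function names are hypothetical.

```python
def sylvester_hadamard(n):
    """2^n x 2^n Sylvester-Hadamard matrix scaled by 2^{-n/2}, as a list of rows.

    Assumption: used as a stand-in for the matrix A^(n) of Section 5.2.
    """
    H = [[1.0]]
    for _ in range(n):
        # One doubling step: H -> [[H, H], [H, -H]].
        H = [row + row for row in H] + [row + [-t for t in row] for row in H]
    scale = 2.0 ** (-n / 2)
    return [[scale * t for t in row] for row in H]


def is_orthogonal(A, tol=1e-12):
    """Check that the rows of A are orthonormal (real unitary matrix), cf. (1.10)."""
    size = len(A)
    return all(
        abs(sum(A[i][k] * A[j][k] for k in range(size)) - (1.0 if i == j else 0.0)) < tol
        for i in range(size)
        for j in range(size)
    )


k = 2
B = sylvester_hadamard(2 ** k)   # stand-in for B^(k) = A^(2^k), of size n_k = 2^(2^k)
n_k = len(B)
print(n_k, is_orthogonal(B), all(abs(row[0] - n_k ** -0.5) < 1e-15 for row in B))
```

With such a matrix, each $\psi^{(k)}_i$ carries the same coefficient $n_k^{-1/2}$ on $g^{(k)}_1 = f_k$, which is exactly the feature (1.11) exploited in the estimates below.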

Our goal is now to prove the following result

Theorem 1.3.2. Ordered lexicographically, the system $(\psi^{(k)}_j : k \in \mathbb{N}, j = 1, 2, \dots, n_k)$ is a quasi-greedy basis of $X$.

Proposition 1.3.3. Ordered lexicographically, $(g^{(k)}_j : k \in \mathbb{N}, j = 1, 2, \dots, n_k)$ is a Besselian basis of $X$.

Proof. Given that $(g^{(k)}_j : k \in \mathbb{N}, j = 1, 2, \dots, n_k)$ is a reordering of $(e_j)$, which was assumed to be a Besselian basis of $X$, we only need to show that $(g^{(k)}_j : k \in \mathbb{N}, j = 1, 2, \dots, n_k)$ is a basic sequence.

To do so we need to show that there is a constant $C \ge 1$ so that for all $N \in \mathbb{N}$, all $M \in \{1, 2, \dots, n_N\}$ and all $(c^{(k)}_j : k \in \mathbb{N}, j = 1, 2, \dots, n_k)$, with

\[ \#\{(k, j) : k \in \mathbb{N},\ j = 1, 2, \dots, n_k,\ c^{(k)}_j \ne 0\} \]

being finite, it follows that

\[ \Big\| \sum_{k=1}^{N-1} \sum_{j=1}^{n_k} c^{(k)}_j g^{(k)}_j + \sum_{j=1}^{M} c^{(N)}_j g^{(N)}_j \Big\| \le C \Big\| \sum_{k=1}^\infty \sum_{j=1}^{n_k} c^{(k)}_j g^{(k)}_j \Big\|. \tag{1.13} \]


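Since the $g^{(k)}_i$ are a reordering of the original basis $(e_j)$ we can write

\[ x = \sum_{k=1}^\infty \sum_{j=1}^{n_k} c^{(k)}_j g^{(k)}_j \quad \text{as} \quad x = \sum_{i=1}^\infty c_i e_i, \]

where $c_i = c^{(k)}_j$ if $e_i = g^{(k)}_j$ (and for each $i \in \mathbb{N}$ there is exactly one such choice of $k$ and $j \in \{1, 2, \dots, n_k\}$). From (1.8) and (1.9) we deduce that

\begin{align*}
\Big\| \sum_{k=1}^{N-1} \sum_{j=2}^{n_k} c^{(k)}_j g^{(k)}_j + \sum_{j=2}^{M} c^{(N)}_j g^{(N)}_j \Big\| &\le C_H \Big( \sum_{k=1}^{N-1} \sum_{j=2}^{n_k} |c^{(k)}_j|^2 + \sum_{j=2}^{M} |c^{(N)}_j|^2 \Big)^{1/2} \tag{1.14} \\
&\le C_H \Big( \sum_{j=1}^\infty |c_j|^2 \Big)^{1/2} \le C_H C_B \|x\|.
\end{align*}

Since $g^{(k)}_1 = f_k = e_{s_k}$, where $(s_j)$ consists of the elements of $\mathbb{N} \setminus \{m_j : j \in \mathbb{N}\}$, ordered increasingly, it follows that we can write

\[ \sum_{k=1}^{N} c^{(k)}_1 g^{(k)}_1 = \sum_{j=1}^{s_N} c_j e_j - \sum_{i \in \{1, 2, \dots, s_N\} \setminus \{s_j : j \le N\}} c_i e_i = \sum_{j=1}^{s_N} c_j e_j - \sum_{(k,j) \in A} c^{(k)}_j g^{(k)}_j, \]

for some set $A \subset \{(k, j) : k \in \mathbb{N},\ j = 2, 3, \dots, n_k\}$. If $C_e$ is the basis constant of $(e_j)$ we deduce therefore that

\[ \Big\| \sum_{j=1}^{s_N} c_j e_j \Big\| \le C_e \|x\|, \]

and thus, using the argument of (1.14),

\[ \Big\| \sum_{k=1}^{N} c^{(k)}_1 g^{(k)}_1 \Big\| \le \Big\| \sum_{j=1}^{s_N} c_j e_j \Big\| + \Big\| \sum_{(k,j) \in A} c^{(k)}_j g^{(k)}_j \Big\| \le (C_e + C_B C_H) \|x\|. \]

This implies that

\begin{align*}
\Big\| \sum_{k=1}^{N-1} \sum_{j=1}^{n_k} c^{(k)}_j g^{(k)}_j + \sum_{j=1}^{M} c^{(N)}_j g^{(N)}_j \Big\| &\le \Big\| \sum_{k=1}^{N} c^{(k)}_1 g^{(k)}_1 \Big\| + \Big\| \sum_{k=1}^{N-1} \sum_{j=2}^{n_k} c^{(k)}_j g^{(k)}_j + \sum_{j=2}^{M} c^{(N)}_j g^{(N)}_j \Big\| \\
&\le (C_e + C_B C_H) \|x\| + C_B C_H \|x\|,
\end{align*}

which implies our claim with $C = C_e + 2 C_B C_H$.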

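Proposition 1.3.4. Under the lexicographical order, $(\psi^{(k)}_j : k \in \mathbb{N}, j = 1, 2, \dots, n_k)$ is a Besselian basis of $X$ with the same constant $C_B$.

Proof. We first note that for $k \in \mathbb{N}$

\[ X_k = \mathrm{span}(\psi^{(k)}_j : j = 1, 2, \dots, n_k) = \mathrm{span}(g^{(k)}_j : j = 1, 2, \dots, n_k) \]

and thus it follows that $(\psi^{(k)}_j : k \in \mathbb{N}, j = 1, 2, \dots, n_k)$ spans, as $(g^{(k)}_j : k \in \mathbb{N}, j = 1, 2, \dots, n_k)$ does, a dense subspace of $X$.

Secondly, let

\[ (d^{(k)}_j : k \in \mathbb{N},\ j = 1, 2, \dots, n_k) \subset \mathbb{K} \]

with

\[ \#\{(k, j) : k \in \mathbb{N},\ j = 1, 2, \dots, n_k,\ d^{(k)}_j \ne 0\} \]

being finite, and let

\[ x = \sum_{k=1}^\infty \sum_{j=1}^{n_k} d^{(k)}_j \psi^{(k)}_j, \]

or in $g$-coordinates:

\[ x = \sum_{k=1}^\infty \sum_{j=1}^{n_k} c^{(k)}_j g^{(k)}_j. \]

We write $x = \sum_{k=1}^\infty x_k$ with

\[ x_k = \sum_{j=1}^{n_k} d^{(k)}_j \psi^{(k)}_j = \sum_{j=1}^{n_k} c^{(k)}_j g^{(k)}_j. \]

Since

\[ x_k = \sum_{i=1}^{n_k} d^{(k)}_i \psi^{(k)}_i = \sum_{i=1}^{n_k} d^{(k)}_i \sum_{j=1}^{n_k} b^{(k)}_{(i,j)} g^{(k)}_j = \sum_{j=1}^{n_k} g^{(k)}_j \sum_{i=1}^{n_k} b^{(k)}_{(i,j)} d^{(k)}_i = \sum_{j=1}^{n_k} c^{(k)}_j g^{(k)}_j, \]

this means that

\[ (c^{(k)}_i : i = 1, 2, \dots, n_k) = (B^{(k)})^{-1} (d^{(k)}_i : i = 1, 2, \dots, n_k) \]

or

\[ (d^{(k)}_i : i = 1, 2, \dots, n_k) = B^{(k)} (c^{(k)}_i : i = 1, 2, \dots, n_k). \]

If we project $x$ onto its first, say $L$, coordinates in the lexicographical order of $(\psi^{(k)}_j : k \in \mathbb{N}, j = 1, \dots, n_k)$, where $L = \sum_{k=1}^{N-1} n_k + M$ for some $N \in \mathbb{N}$ and $M \le n_N$, this projected vector equals

\[ \sum_{k=1}^{N-1} \sum_{j=1}^{n_k} d^{(k)}_j \psi^{(k)}_j + \sum_{j=1}^{M} d^{(N)}_j \psi^{(N)}_j = \sum_{k=1}^{N-1} \sum_{j=1}^{n_k} c^{(k)}_j g^{(k)}_j + \sum_{j=1}^{M} d^{(N)}_j \psi^{(N)}_j. \]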

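Therefore, since $(g^{(k)}_j)$ is a basis by Proposition 1.3.3, we only need to show that there is a constant $C \ge 1$ so that for all $k$ and all $M \le n_k$

\[ \Big\| \sum_{j=1}^{M} d^{(k)}_j \psi^{(k)}_j \Big\| \le C \Big\| \sum_{j=1}^{n_k} d^{(k)}_j \psi^{(k)}_j \Big\| \tag{1.15} \]

and that $(\psi^{(k)}_j)$ is Besselian.

It follows from the assumption that the matrices $B^{(k)}$ are unitary and from Proposition 1.3.3 that

\[ \|x\| = \Big\| \sum_{k=1}^\infty x_k \Big\| \ge \frac{1}{C_B} \Big( \sum_{k=1}^\infty \sum_{j=1}^{n_k} |c^{(k)}_j|^2 \Big)^{1/2} = \frac{1}{C_B} \Big( \sum_{k=1}^\infty \sum_{j=1}^{n_k} |d^{(k)}_j|^2 \Big)^{1/2}, \]

which proves that $(\psi^{(k)}_j)$ is Besselian. Secondly we note that (1.11) yields

\begin{align*}
\Big\| \sum_{i=1}^{M} d^{(k)}_i \psi^{(k)}_i \Big\| &= \Big\| \sum_{i=1}^{M} d^{(k)}_i \sum_{j=1}^{n_k} b^{(k)}_{(i,j)} g^{(k)}_j \Big\| \\
&\le \sum_{i=1}^{M} |d^{(k)}_i| n_k^{-1/2} \|g^{(k)}_1\| + \Big\| \sum_{i=1}^{M} d^{(k)}_i \sum_{j=2}^{n_k} b^{(k)}_{(i,j)} g^{(k)}_j \Big\| \\
&\le \Big( \sum_{i=1}^{M} |d^{(k)}_i|^2 \Big)^{1/2} + \Big\| \sum_{j=2}^{n_k} g^{(k)}_j \sum_{i=1}^{M} d^{(k)}_i b^{(k)}_{(i,j)} \Big\| \\
&\le \Big( \sum_{i=1}^{M} |d^{(k)}_i|^2 \Big)^{1/2} + C_H \Big( \sum_{j=2}^{n_k} \Big| \sum_{i=1}^{M} d^{(k)}_i b^{(k)}_{(i,j)} \Big|^2 \Big)^{1/2} \\
&\le \Big( \sum_{i=1}^{n_k} |d^{(k)}_i|^2 \Big)^{1/2} + C_H \Big( \sum_{i=1}^{n_k} |d^{(k)}_i|^2 \Big)^{1/2} && \text{(by (1.10))}.
\end{align*}

Therefore, applying (1.10) and then (1.8), it follows that

\[ \Big\| \sum_{i=1}^{M} d^{(k)}_i \psi^{(k)}_i \Big\| \le (1 + C_H) \Big( \sum_{i=1}^{n_k} |d^{(k)}_i|^2 \Big)^{1/2} = (1 + C_H) \Big( \sum_{i=1}^{n_k} |c^{(k)}_i|^2 \Big)^{1/2} \le (1 + C_H) C_B \|x_k\|, \]

which proves our claim.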

Our last step of proving Theorem 1.3.2 is the following


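Proposition 1.3.5. $(\psi^{(k)}_j : k \in \mathbb{N}, j = 1, 2, \dots, n_k)$ is quasi-greedy.

Proof. Let

\[ x = \sum_{k=1}^\infty \sum_{i=1}^{n_k} d^{(k)}_i \psi^{(k)}_i \in X, \]

with $\|x\| = 1$, and suppose that the $m$-th greedy approximant is given by

\[ G^T_m(x, \Psi) = \sum_{k \in J} \sum_{i \in I_k} d^{(k)}_i \psi^{(k)}_i, \]

where $m = \sum_{k \in J} \#I_k$. We need to show that there is a constant $C \ge 1$ (of course not dependent on $x$ and $m$) so that

\[ \|G^T_m(x, \Psi)\| \le C \|x\|. \tag{1.16} \]

We write $G^T_m(x, \Psi)$ as

\[ G^T_m(x, \Psi) = \underbrace{\sum_{k \in J} \sum_{i \in I_k} d^{(k)}_i \big( \psi^{(k)}_i - b^{(k)}_{(i,1)} f_k \big)}_{\Sigma_1} + \underbrace{\sum_{k \in J} \sum_{i \in I_k} d^{(k)}_i b^{(k)}_{(i,1)} f_k}_{\Sigma_2} \]

(recall that $g^{(k)}_1 = f_k$). From the definition of the $\psi^{(k)}_j$ we get that

\[ \Sigma_1 = \sum_{k \in J} \sum_{i \in I_k} d^{(k)}_i \Big( \sum_{j=2}^{n_k} b^{(k)}_{(i,j)} g^{(k)}_j \Big) = \sum_{k \in J} \sum_{j=2}^{n_k} g^{(k)}_j \Big( \sum_{i \in I_k} d^{(k)}_i b^{(k)}_{(i,j)} \Big), \]

which yields by the choice of the $g^{(k)}_j$ and properties (1.9) and (1.10) that

\begin{align*}
\|\Sigma_1\| &\le C_H \Big( \sum_{k \in J} \sum_{j=2}^{n_k} \Big| \sum_{i \in I_k} d^{(k)}_i b^{(k)}_{(i,j)} \Big|^2 \Big)^{1/2} \tag{1.17} \\
&\le C_H \Big( \sum_{k \in J} \big\| [B^{(k)}]^{-1} \circ (d^{(k)}_i : i \in I_k) \big\|_2^2 \Big)^{1/2} \\
&= C_H \Big( \sum_{k \in J} \big\| (d^{(k)}_i : i \in I_k) \big\|_2^2 \Big)^{1/2} && \text{(by (1.10))} \\
&\le C_H C_B \|x\| && \text{(by Proposition 1.3.4)}.
\end{align*}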

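In order to estimate $\Sigma_2$ we split $I_k$, $k \in \mathbb{N}$, into the following subsets:

\begin{align*}
I^{(1)}_k &= \big\{ i \in I_k : |d^{(k)}_i| \le n_k^{-1} \big\}, \\
I^{(2)}_k &= \big\{ i \in I_k : |d^{(k)}_i| \ge n_k^{-1/2} \big\}, \\
I^{(3)}_k &= \big\{ i \in I_k : n_k^{-1} < |d^{(k)}_i| < n_k^{-1/2} \big\},
\end{align*}

and let

\[ \Sigma^{(s)}_2 = \sum_{k \in J} \sum_{i \in I^{(s)}_k} d^{(k)}_i b^{(k)}_{(i,1)} f_k \quad \text{for } s = 1, 2, 3. \]

From the definition of $I^{(1)}_k$ and (1.11) it follows that

\[ \Big| \sum_{i \in I^{(1)}_k} d^{(k)}_i b^{(k)}_{(i,1)} \Big| \le \#I^{(1)}_k \cdot n_k^{-1} \cdot n_k^{-1/2} \le n_k^{-1/2}, \]

and thus

\[ \big\| \Sigma^{(1)}_2 \big\| \le \sum_{k \in J} n_k^{-1/2} \le 1. \tag{1.18} \]

In order to estimate $\big\| \Sigma^{(2)}_2 \big\|$ we first note that the definition of $I^{(2)}_k$ yields that

\[ (\#I^{(2)}_k) n_k^{-1} \le \sum_{i \in I^{(2)}_k} |d^{(k)}_i|^2 \le \sum_{i=1}^{n_k} |d^{(k)}_i|^2, \]

and, thus,

\begin{align*}
\big\| \Sigma^{(2)}_2 \big\| &= \Big\| \sum_{k \in J} \sum_{i \in I^{(2)}_k} d^{(k)}_i b^{(k)}_{(i,1)} f_k \Big\| \tag{1.19} \\
&\le \sum_{k \in J} n_k^{-1/2} \sum_{i \in I^{(2)}_k} |d^{(k)}_i| && \text{(by (1.11))} \\
&\le \sum_{k \in J} n_k^{-1/2} (\#I^{(2)}_k)^{1/2} \Big( \sum_{i \in I^{(2)}_k} |d^{(k)}_i|^2 \Big)^{1/2} && \text{(by Hölder's inequality)} \\
&\le \sum_{k \in J} \sum_{i \in I^{(2)}_k} |d^{(k)}_i|^2 \\
&\le C_B^2 \|x\|^2 = C_B^2 && \text{(by Proposition 1.3.4)}.
\end{align*}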

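Finally we have to estimate $\big\| \Sigma^{(3)}_2 \big\|$. Before that let us make some observations. We first note that in the estimation of $\|\Sigma_1\|$ we did not use specific properties of the sets $I_k$. Replacing in (1.17) the sets $I_k$ by any sets $I_k' \subset \{1, 2, \dots, n_k\}$ and $J$ by any set $J' \subset \mathbb{N}$ we obtain

\[ \Big\| \sum_{k \in J'} \sum_{i \in I_k'} d^{(k)}_i \Big( \sum_{j=2}^{n_k} b^{(k)}_{(i,j)} g^{(k)}_j \Big) \Big\| \le C_H C_B \|x\|. \tag{1.20} \]

Taking $I_k'$ to be all of $\{1, 2, \dots, n_k\}$ and $J' = [1, K]$ for some $K \in \mathbb{N}$ we deduce from Proposition 1.3.4 that

\begin{align*}
\Big\| \sum_{k=1}^{K} \sum_{i=1}^{n_k} d^{(k)}_i b^{(k)}_{(i,1)} f_k \Big\| &\le \Big\| \sum_{k=1}^{K} \sum_{i=1}^{n_k} d^{(k)}_i \psi^{(k)}_i \Big\| + \Big\| \sum_{k=1}^{K} \sum_{i=1}^{n_k} d^{(k)}_i \big( \psi^{(k)}_i - b^{(k)}_{(i,1)} f_k \big) \Big\| \tag{1.21} \\
&\le C_\Psi \|x\| + C_H C_B \|x\| = C_\Psi + C_H C_B,
\end{align*}

where $C_\Psi$ denotes the basis constant of $(\psi^{(k)}_j : k \in \mathbb{N}, j = 1, 2, \dots, n_k)$. Secondly we note that in the estimation of $\|\Sigma^{(1)}_2\|$ we could replace the set $J$ by any subset $J' \subset \mathbb{N}$ and $I^{(1)}_k$ by any subset

\[ I_k' \subset \big\{ i \le n_k : |d^{(k)}_i| \le n_k^{-1} \big\}, \]

to obtain

\[ \Big\| \sum_{k \in J'} \sum_{i \in I_k'} d^{(k)}_i b^{(k)}_{(i,1)} f_k \Big\| \le 1. \tag{1.22} \]

Thirdly we note that in the estimation of $\big\| \Sigma^{(2)}_2 \big\|$ in (1.19) we could have also replaced $J$ by any subset $J'$ of $\mathbb{N}$, and for $k \in \mathbb{N}$ the set $I^{(2)}_k$ by any subset

\[ I_k' \subset I^{(2)}_k = \big\{ i \in I_k : |d^{(k)}_i| \ge n_k^{-1/2} \big\} \]

to obtain

\[ \Big\| \sum_{k \in J'} \sum_{i \in I_k'} d^{(k)}_i b^{(k)}_{(i,1)} f_k \Big\| \le C_B^2. \tag{1.23} \]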

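In order to estimate $\|\Sigma^{(3)}_2\|$ we define

\[ K = \max\{k \in J : I^{(3)}_k \ne \emptyset\}, \]

which means that for some $i \in I^{(3)}_K$ it follows that $|d^{(K)}_i| < n_K^{-1/2}$. Note that for any $k \in [1, K - 1]$ either $k \in J$ or $k \notin J$; in the latter case (here we use for the first time that we are dealing with the threshold algorithm)

\[ |d^{(k)}_i| < n_K^{-1/2} \le n_k^{-1} \quad \text{for all } i \in \{1, 2, \dots, n_k\} \tag{1.24} \]

(here we are using that $n_{k+1} = n_k^2$), and thus for such a $k$ the sets $I^{(3)}_k$ and $I^{(2)}_k$ are empty.

We compute now

\begin{align*}
\Sigma^{(3)}_2 &= \sum_{i \in I^{(3)}_K} d^{(K)}_i b^{(K)}_{(i,1)} f_K + \sum_{k \in J,\ k < K} \sum_{i \in I^{(3)}_k} d^{(k)}_i b^{(k)}_{(i,1)} f_k \\
&= \sum_{i \in I^{(3)}_K} d^{(K)}_i b^{(K)}_{(i,1)} f_K + \sum_{k=1}^{K-1} \sum_{i=1}^{n_k} d^{(k)}_i b^{(k)}_{(i,1)} f_k \\
&\qquad - \sum_{k=1}^{K-1} \sum_{i \in I^{(1)}_k} d^{(k)}_i b^{(k)}_{(i,1)} f_k - \sum_{k \in J,\ k < K} \sum_{i \in I^{(2)}_k} d^{(k)}_i b^{(k)}_{(i,1)} f_k,
\end{align*}

where in the third term $I^{(1)}_k$ is understood as in (1.22), i.e. as the set $\{i \le n_k : |d^{(k)}_i| \le n_k^{-1}\}$. The first term we estimate, using Hölder's inequality:

\[ \Big\| \sum_{i \in I^{(3)}_K} d^{(K)}_i b^{(K)}_{(i,1)} f_K \Big\| \le n_K^{-1/2} \sum_{i=1}^{n_K} \big| d^{(K)}_i \big| \le n_K^{-1/2} n_K^{1/2} \Big( \sum_{i=1}^{n_K} \big| d^{(K)}_i \big|^2 \Big)^{1/2} \le C_B. \]

It follows therefore from (1.21), (1.22) and (1.23) that

\[ \big\| \Sigma^{(3)}_2 \big\| \le C_B + C_\Psi + C_H C_B + 1 + C_B^2, \]

which, together with (1.17), (1.18) and (1.19), implies our claim.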

Corollary 1.3.6. Apply Theorem 1.3.2 to the trigonometric polynomials $(t_n) = (e^{-i 2\pi n (\cdot)} : n \in \mathbb{Z})$, which are a basis of $L_p[0, 1]$ and satisfy, by Example 1.3.1, the assumptions if $p > 2$. This leads to a quasi-greedy basis $(\Psi_n : n \in \mathbb{N})$ of $L_p[0, 1]$.

Secondly note that since $(t_n)$ is bounded in absolute value by 1 (in $L_\infty[0, 1]$), and since the matrices $B^{(k)}$, which were used in the construction of the basis $(\Psi_n)$, are uniformly bounded as linear operators on $\ell_\infty^{n_k}$, it follows that $(\Psi_n : n \in \mathbb{N})$ is also bounded in $L_\infty[0, 1]$.

This implies by Corollary 6.2.7 that $(\Psi_n)$ cannot be unconditional.


Chapter 2

Greedy Algorithms In Hilbert Space

2.1 Introduction

We will now replace, in our greedy algorithms, bases by more general and possibly redundant systems.

Let $H$ (always) be a separable and real Hilbert space. Recall that $D \subset S_H$ is a dictionary of $H$ if $\mathrm{span}(D)$ is dense and $x \in D$ implies that $-x \in D$.

An $n$-term approximation algorithm is a map

\[ G : H \to \mathrm{span}(D)^{\mathbb{N}}, \quad x \mapsto G(x) = (G_n(x)), \]

with the property that for $n \in \mathbb{N}$ and $x \in H$ there is a set $\Lambda_n \subset D$ of cardinality at most $n$ so that $G_n(x) \in \mathrm{span}(\Lambda_n)$; $G_n(x)$ is then called an $n$-term approximation of $x$.

Perhaps the first example was considered by Schmidt [Schm]:

Example 2.1.1. [Schm] Let f ∈ L2

([0, 1]2

), i.e. f is a square integrable function

in two variables. By the Theorem of Arcela and Ascoli we know that the set

D ={ n∑j=1

uj ⊗ vj : n ∈ N, ui, vj ∈ C[0, 1]}

is dense in C([0, 1]2

). Here we denote for two functions f, g : [0, 1]→ K

f ⊗ g : [0, 1]2 → K, (x, y) 7→ f(x)g(y).

Since C([0, 1]2

)is dense in L2

([0, 1]2

)it follows that

D ={ n∑j=1

uj ⊗ vj : n ∈ N, ui, vj ∈ L2[0, 1]}

27

28 CHAPTER 2. GREEDY ALGORITHMS IN HILBERT SPACE

is dense in L2

([0, 1]2

).

The question is now how to find a good approximation to f from D. E. Schmidt considered the following procedure and showed that it works:

Let $f \in L_2([0,1]^2)$ and define $f_0 = f$. Then choose $u_1, v_1 \in L_2[0,1]$ so that

$$\|f_0 - u_1 \otimes v_1\|_2 = \inf\{\|f_0 - u \otimes v\|_2 : u, v \in L_2[0,1]\}.$$

Since this infimum might be hard to achieve, he also considered a weaker condition, fixed some weakening factor t ∈ (0, 1) and chose $u_1, v_1 \in L_2[0,1]$ so that

$$\|f_0 - u_1 \otimes v_1\|_2 \le \frac{1}{t} \inf\{\|f_0 - u \otimes v\|_2 : u, v \in L_2[0,1]\}.$$

Then he let $f_1 = f_0 - u_1 \otimes v_1$.

After n steps he obtained $u_1, v_1, u_2, v_2, \ldots, u_n, v_n \in L_2[0,1]$, let

$$f_n = f - \sum_{j=1}^{n} u_j \otimes v_j,$$

and chose $u_{n+1}$ and $v_{n+1}$ in $L_2[0,1]$ so that

$$\|f_n - u_{n+1} \otimes v_{n+1}\|_2 = \Big\|f_0 - \sum_{j=1}^{n+1} u_j \otimes v_j\Big\|_2 \le \frac{1}{t} \inf\{\|f_n - u \otimes v\|_2 : u, v \in L_2[0,1]\}.$$

Finally he proved that $f_n$ converges in $L_2([0,1]^2)$ to 0 and thus $G_n(f) = \sum_{j=1}^{n} u_j \otimes v_j$ converges to f.

He asked whether there is some general principle behind this, and how and whether it generalizes.
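Schmidt's procedure can be tried out on a discretized function. The following is only a sketch under stated assumptions: f is sampled on a 6 × 6 grid, the L2 norm is replaced by the Frobenius norm of the sample matrix, and the best rank-one term u ⊗ v is approximated by power iteration. The helper names `frobenius`, `best_rank_one` and `schmidt_residual_norms` and the sample function are illustrative choices, not part of the notes.

```python
import math

def frobenius(M):
    """Frobenius norm, a discrete stand-in for the L2([0,1]^2) norm."""
    return math.sqrt(sum(c * c for row in M for c in row))

def best_rank_one(M, iters=200):
    """Approximate a minimiser u (x) v of ||M - u (x) v|| by power iteration."""
    rows, cols = len(M), len(M[0])
    v = [1.0 + 0.1 * j for j in range(cols)]  # arbitrary non-zero start vector
    u = [0.0] * rows
    for _ in range(iters):
        u = [sum(M[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        nu = math.sqrt(sum(c * c for c in u))
        if nu == 0.0:  # M v = 0: fall back to the zero approximation
            return [0.0] * rows, [0.0] * cols
        u = [c / nu for c in u]
        # for a unit vector u, the optimal partner v is M^T u
        v = [sum(M[i][j] * u[i] for i in range(rows)) for j in range(cols)]
    return u, v

def schmidt_residual_norms(F, n_steps):
    """Return [||f_0||, ..., ||f_n||] where f_k = f_{k-1} - u_k (x) v_k."""
    R = [row[:] for row in F]
    norms = [frobenius(R)]
    for _ in range(n_steps):
        u, v = best_rank_one(R)
        R = [[R[i][j] - u[i] * v[j] for j in range(len(v))] for i in range(len(u))]
        norms.append(frobenius(R))
    return norms

grid = [(k + 0.5) / 6 for k in range(6)]
# sample function f(s, t) = s t + (s t)^2 / 2, a sum of two tensors u (x) v
F = [[s * t + 0.5 * (s * t) ** 2 for t in grid] for s in grid]
norms = schmidt_residual_norms(F, 2)
```

Since the sample function is a sum of two elementary tensors, the residual should be numerically zero after two steps; for a general f the residual norms only decrease.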

(PGA) The Pure Greedy Algorithm.

For x ∈ H we define Gn = Gn(x), for each n ∈ N0, by induction.

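G0 = 0 and assuming that G0, G1, . . . , Gn−1 have been defined for some n ∈ N we proceed as follows:

1) Choose zn ∈ D and an ∈ R so that

$$\|x - G_{n-1} - a_n z_n\| = \inf_{z \in D,\, a \in \mathbb{R}} \|x - G_{n-1} - a z\|.$$

2) Put Gn = Gn−1 + anzn.

Note that for any x ∈ H it follows that

$$(2.1)\qquad \inf_{z \in D,\, a \in \mathbb{R}} \|x - a z\|^2 = \inf_{z \in D,\, a \in \mathbb{R}} \big[\|x\|^2 - 2a\langle x, z\rangle + a^2\|z\|^2\big] = \inf_{z \in D} \big[\|x\|^2 - \langle x, z\rangle^2\big] = \|x\|^2 - \sup_{z \in D} \langle x, z\rangle^2.$$

For the second equality note that, since ‖z‖ = 1, the map $a \mapsto \|x\|^2 - 2a\langle x, z\rangle + a^2\|z\|^2$ is minimal for $a = \langle x, z\rangle$.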

So condition (1) in (PGA) can be replaced by the following condition (1’)

1’) Choose zn ∈ D so that

〈x−Gn−1, zn〉 = supz∈D〈x−Gn−1, z〉

and (2) by

2’) Put Gn = Gn−1 + 〈x−Gn−1, zn〉zn.
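Steps (1') and (2') are straightforward to run for a finite dictionary, where the supremum is attained. The sketch below assumes H = R^2 and a small symmetric dictionary of unit vectors; the function name `pure_greedy` and the sample data are illustrative only.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pure_greedy(x, dictionary, n_steps):
    """Run steps (1') and (2') of (PGA); return G_n and the norms ||x - G_k||."""
    G = [0.0] * len(x)
    residual = list(x)
    history = []
    for _ in range(n_steps):
        # (1') pick z_n maximising <x - G_{n-1}, z> over the dictionary
        z = max(dictionary, key=lambda d: dot(residual, d))
        c = dot(residual, z)
        # (2') G_n = G_{n-1} + <x - G_{n-1}, z_n> z_n
        G = [g + c * zi for g, zi in zip(G, z)]
        residual = [r - c * zi for r, zi in zip(residual, z)]
        history.append(math.sqrt(dot(residual, residual)))
    return G, history

s = 1.0 / math.sqrt(2.0)
dictionary = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0), (s, s), (-s, -s)]
x = (1.0, 0.5)
G, history = pure_greedy(x, dictionary, 3)
```

In this example the first step picks the diagonal vector, and the remaining residual is removed by the two coordinate directions, so three steps are needed although H is two dimensional.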

As already noted in Example 2.1.1, the “sup” in (1’) of (PGA), respectively the “inf” in (1), might not be attained or might be hard to attain. In this case we might consider the following modification.

(WPGA) The Weak Pure Greedy Algorithm.

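We are given a sequence τ = (tn) ⊂ (0, 1). For x ∈ H we define Gn = Gn(x), for each n ∈ N0, by induction.

G0 = 0 and assuming that G0, G1, . . . , Gn−1 have been defined for some n ∈ N we proceed as follows: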


1) Choose zn ∈ D, so that

〈x−Gn−1, zn〉 ≥ tn supz∈D〈x−Gn−1, z〉

2) Put Gn = Gn−1 + 〈x − Gn−1, zn〉zn.

For WPGA we call the sequence (tn) the weakness factors.
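The weak selection rule can be sketched in the same finite setting. The code below assumes H = R^3, a finite symmetric dictionary, and it resolves the freedom in step 1 by taking the first dictionary element that meets the threshold; this tie-breaking rule and the names `weak_pure_greedy`, `weakness` are assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weak_pure_greedy(x, dictionary, weakness, n_steps):
    """Run WPGA; return the coefficients <x_{n-1}, z_n> and squared residual norms."""
    residual = list(x)
    coefficients, sq_norms = [], []
    for n in range(n_steps):
        best = max(dot(residual, d) for d in dictionary)
        # step 1: any z with <x_{n-1}, z> >= t_n * sup is admissible; take the first
        z = next(d for d in dictionary if dot(residual, d) >= weakness[n] * best)
        c = dot(residual, z)
        # step 2: subtract the component along z
        residual = [r - c * zi for r, zi in zip(residual, z)]
        coefficients.append(c)
        sq_norms.append(dot(residual, residual))
    return coefficients, sq_norms

r3 = 1.0 / math.sqrt(3.0)
atoms = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (r3, r3, r3)]
dictionary = atoms + [tuple(-a for a in z) for z in atoms]
x = (3.0, 2.0, 1.0)
coefficients, sq_norms = weak_pure_greedy(x, dictionary, [0.5] * 20, 20)
```

Because every chosen z has norm one, each step lowers the squared residual norm by exactly the square of the chosen coefficient, whatever admissible element is picked.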

A possibly faster (but computationally more laborious) algorithm is the following Orthogonal Greedy Algorithm.

(OGA) The Orthogonal Greedy Algorithm.

For x ∈ H we define $G^o_n = G^o_n(x)$, for each n ∈ N0, by induction.

$G^o_0 = 0$ and assuming that $G^o_0, G^o_1, \ldots, G^o_{n-1}$ and vectors $z_1, \ldots, z_{n-1}$ have been defined for some n ∈ N we proceed as follows:

1) Choose zn ∈ D so that

$$\langle x - G^o_{n-1}, z_n\rangle = \sup_{z \in D} \langle x - G^o_{n-1}, z\rangle.$$

2) Define $Z_n = \mathrm{span}(z_1, z_2, \ldots, z_n)$ and let $G^o_n$ be the best approximation of x in $Z_n$, i.e.

$$\|G^o_n - x\| = \inf\{\|z - x\| : z \in Z_n\},$$

which means that $G^o_n = P_{Z_n}(x)$, where $P_{Z_n}$ denotes the orthogonal projection of H onto $Z_n$.
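A minimal sketch of (OGA) for a finite dictionary in R^2 follows. It assumes that the projection onto Z_n is computed by Gram-Schmidt orthonormalization of the selected vectors; the names `orthogonal_greedy` and `tol` are illustrative and not taken from the notes.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthogonal_greedy(x, dictionary, n_steps, tol=1e-12):
    """Run (OGA); return the norms ||x - P_{Z_n}(x)|| for n = 1, ..., n_steps."""
    basis = []            # orthonormal basis of Z_n
    residual = list(x)    # x - P_{Z_n}(x), orthogonal to Z_n
    norms = []
    for _ in range(n_steps):
        # step 1: maximise <x - G_{n-1}, z> over the dictionary
        z = max(dictionary, key=lambda d: dot(residual, d))
        # Gram-Schmidt: remove from z its component in the current span
        w = list(z)
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        nw = math.sqrt(dot(w, w))
        if nw > tol:      # z enlarges the span
            q = [wi / nw for wi in w]
            basis.append(q)
            # step 2: project the residual off the new direction
            c = dot(residual, q)
            residual = [r - c * qi for r, qi in zip(residual, q)]
        norms.append(math.sqrt(dot(residual, residual)))
    return norms

s = 1.0 / math.sqrt(2.0)
dictionary = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0), (s, s), (-s, -s)]
norms = orthogonal_greedy((1.0, 0.5), dictionary, 2)
```

For x = (1, 0.5) the residual vanishes after two steps, because two linearly independent dictionary elements already span R^2, whereas the pure greedy iteration with the same data needs a third step.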

(GAR) The Greedy Algorithm with free Relaxation.

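For x ∈ H we define $G^r_n = G^r_n(x)$, for each n ∈ N0, by induction.

$G^r_0 = 0$ and assuming that $G^r_0, G^r_1, \ldots, G^r_{n-1}$ have been defined for some n ∈ N we proceed as follows:

1) Choose zn ∈ D so that

$$\langle x - G^r_{n-1}, z_n\rangle = \sup_{z \in D} \langle x - G^r_{n-1}, z\rangle.$$

2) Put $G^r_n = a_n G^r_{n-1} + b_n z_n$, where $a_n, b_n \in \mathbb{R}$ are chosen so that $G^r_n$ is the best approximation of x by an element of the two dimensional space $\mathrm{span}(G^r_{n-1}, z_n)$.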

(GAFR) The Greedy Algorithm with fixed Relaxation.

Let c > 0. For x ∈ H we define $G^f_n = G^f_n(x)$, for each n ∈ N0, by induction.

$G^f_0 = 0$ and assuming that $G^f_0, G^f_1, \ldots, G^f_{n-1}$ have been defined for some n ∈ N we proceed as follows:

1) Choose zn ∈ D so that

$$\langle x - G^f_{n-1}, z_n\rangle = \sup_{z \in D} \langle x - G^f_{n-1}, z\rangle.$$

2) Put

$$G^f_n = c\Big(1 - \frac{1}{n}\Big) G^f_{n-1} + \frac{c}{n} z_n.$$

Similar to the weak pure greedy algorithm there are also weak versions of the orthogonal greedy algorithm and of the greedy algorithms with free and with fixed relaxation. We denote them by WOGA, WGAR and WGAFR.

2.2 Convergence

Proposition 2.2.1. Assume that we consider the WPGA, WOGA or WGAR and assume for the weakness factors (tn) that

$$(2.2)\qquad \sum_{k \in \mathbb{N}} t_k^2 = \infty.$$

For x ∈ H we let $x_n = x - G_n(x)$, $x_n = x - G^o_n(x)$ or $x_n = x - G^r_n(x)$, respectively.

If the sequence (xn) converges, it converges to 0.

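Proof. Assume that xn converges to some u ∈ H and u ≠ 0. Then, since D is a dictionary, there is a d ∈ D so that δ = 〈d, u〉 > 0, and thus we find a large enough N ∈ N so that 〈d, xn〉 ≥ δ/2 for all n ≥ N.

In the case that we consider WPGA we obtain for n ≥ N, using $\langle z_{n+1}, x_n\rangle \ge t_{n+1} \sup_{z \in D}\langle z, x_n\rangle \ge t_{n+1}\delta/2$,

$$\|x_{n+1}\|^2 = \|x_n - \langle z_{n+1}, x_n\rangle z_{n+1}\|^2 = \|x_n\|^2 - \langle z_{n+1}, x_n\rangle^2 \le \|x_n\|^2 - t_{n+1}^2\delta^2/4,$$

and thus for k = 1, 2, 3, . . .

$$\|x_N\|^2 - \|x_{N+k}\|^2 = \sum_{j=N}^{N+k-1} \big(\|x_j\|^2 - \|x_{j+1}\|^2\big) \ge \sum_{j=N}^{N+k-1} t_{j+1}^2\delta^2/4 \to \infty \quad \text{as } k \to \infty.$$

But this is a contradiction, since the left-hand side is bounded by $\|x_N\|^2$.

In the case of the WOGA we similarly have for n ≥ N

$$\|x_{n+1}\|^2 = \min\Big\{\Big\|x - \sum_{j=1}^{n+1} a_j z_j\Big\|^2 : a_1, a_2, \ldots, a_{n+1} \in \mathbb{R}\Big\} \le \|x_n - \langle z_{n+1}, x_n\rangle z_{n+1}\|^2 = \|x_n\|^2 - \langle z_{n+1}, x_n\rangle^2 \le \|x_n\|^2 - t_{n+1}^2\delta^2/4,$$

and we obtain a contradiction as in the WPGA case. Similarly, in the case that we consider the WGAR we estimate:

$$\|x_{n+1}\|^2 = \min\big\{\|x - a G^r_n(x) - b z_{n+1}\|^2 : a, b \in \mathbb{R}\big\} \le \|x - G^r_n(x) - \langle z_{n+1}, x_n\rangle z_{n+1}\|^2 = \|x_n\|^2 - \langle z_{n+1}, x_n\rangle^2 \le \|x_n\|^2 - t_{n+1}^2\delta^2/4.$$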

Theorem 2.2.2. Assume that condition (2.2) of Proposition 2.2.1 holds. Then $(G^o_n(x) : n \in \mathbb{N})$ (as defined in WOGA) converges for all x ∈ H to x.

Proof. Let x ∈ H. For n ∈ N let $Z_n$ be the space defined in WOGA, so that $G^o_n(x) = P_{Z_n}(x)$. Since $Z_1 \subset Z_2 \subset Z_3 \subset \ldots$, it follows that $G^o_n(x)$ converges to $P_Z(x)$, where $Z = \overline{\bigcup_{n \in \mathbb{N}} Z_n}$. In particular $x - G^o_n(x)$ converges, and thus the claim follows from Proposition 2.2.1.

Theorem 2.2.3. Assume that the sequence (tk) ⊂ (0, 1) satisfies

$$(2.3)\qquad \sum_{k \in \mathbb{N}} \frac{t_k}{k} = \infty.$$

For x ∈ H consider the WPGA (Gn(x)) with weakness factors (tn). Then (Gn(x)) converges.


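Remark. Since by Hölder’s inequality

$$\sum_{k=1}^{\infty} \frac{t_k}{k} \le \Big(\sum_{k=1}^{\infty} t_k^2\Big)^{1/2}\Big(\sum_{k=1}^{\infty} \frac{1}{k^2}\Big)^{1/2},$$

condition (2.3) implies that $\sum_{k \in \mathbb{N}} t_k^2 = \infty$, i.e. condition (2.2).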

We will need the following Lemma first.

Lemma 2.2.4. Assume that $y = (y_j) \in \ell_2$ and that (tk) ⊂ (0, 1) satisfies (2.3). Then

$$\liminf_{n \to \infty} \frac{|y_n|}{t_n} \sum_{j=1}^{n} |y_j| = 0.$$
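To get a feeling for the quantity in Lemma 2.2.4, it can be evaluated for concrete sequences. The sketch below assumes the sample choices y_j = 1/j (which lies in ℓ2) and t_n = 1/log(n + 2) (which lies in (0, 1) and satisfies (2.3)); both choices and the name `lemma_quantity` are illustrative.

```python
import math

def lemma_quantity(n):
    """(|y_n| / t_n) * sum_{j <= n} |y_j| for y_j = 1/j and t_n = 1/log(n + 2)."""
    partial_sum = sum(1.0 / j for j in range(1, n + 1))
    y_n = 1.0 / n
    t_n = 1.0 / math.log(n + 2)
    return (y_n / t_n) * partial_sum

# the quantity behaves like (log n)^2 / n for these sample sequences
values = {n: lemma_quantity(n) for n in (10, 100, 1000, 10000)}
```

For these particular sequences the whole sequence tends to 0; the lemma only asserts that the limit inferior is 0 in general.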

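Proof. (An alternate and shorter proof due to Sheng Zhang will be given below.) We will prove the following claim:

Claim. If f ∈ L2[0, ∞) and we define

$$F(x) = \int_0^x |f(t)|\, dt,$$

then

$$(2.4)\qquad \int_0^\infty \frac{F^2(x)}{x^2}\, dx \le 4 \int_0^\infty f^2(x)\, dx.$$

We apply the claim to the function $f = \sum_{j=1}^{\infty} 1_{(j-1,j]} |y_j|$, for which $F(n) = \sum_{j=1}^{n} |y_j|$ and $\int_0^\infty f^2(t)\,dt = \sum_{n=1}^{\infty} |y_n|^2$. Since $(a + b)^2 \le 2a^2 + 2b^2$, and since $F(n-1)/n \le F(x)/x$ for $x \in (n-1, n]$, it follows that

$$\sum_{n=1}^{\infty} \Big[\frac{1}{n}\sum_{j=1}^{n} |y_j|\Big]^2 \le 2\sum_{n=1}^{\infty} \Big[\frac{1}{n}\sum_{j=1}^{n-1} |y_j|\Big]^2 + 2\sum_{n=1}^{\infty} |y_n|^2 \le 2\int_0^\infty \Big[\frac{1}{x}\int_0^x f(t)\, dt\Big]^2 dx + 2\sum_{n=1}^{\infty} |y_n|^2 \le 8\int_0^\infty f^2(t)\, dt + 2\sum_{n=1}^{\infty} |y_n|^2 = 10\sum_{n=1}^{\infty} |y_n|^2.$$

It follows therefore from the Cauchy-Schwarz inequality that

$$\sum_{n=1}^{\infty} \frac{t_n}{n} \cdot \frac{|y_n|}{t_n} \sum_{j=1}^{n} |y_j| = \sum_{n=1}^{\infty} |y_n| \frac{1}{n}\sum_{j=1}^{n} |y_j| \le \Big[\sum_{n=1}^{\infty} |y_n|^2\Big]^{1/2}\Big[\sum_{n=1}^{\infty} \Big[\frac{1}{n}\sum_{j=1}^{n} |y_j|\Big]^2\Big]^{1/2} < \infty.$$

Since

$$\sum_{n=1}^{\infty} \frac{t_n}{n} = \infty,$$

it follows that

$$\liminf_{n \to \infty} \frac{|y_n|}{t_n} \sum_{j=1}^{n} |y_j| = 0.$$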

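In order to prove the claim we can assume that f is a non-negative function. We note first that by Hölder’s inequality,

$$F(x) = \int_0^x f(t)\, dt \le x^{1/2}\Big(\int_0^x f^2(t)\, dt\Big)^{1/2},$$

and thus

$$(2.5)\qquad \frac{F(x)}{x^{1/2}} \le \Big(\int_0^x f^2(t)\, dt\Big)^{1/2} \to 0 \quad \text{as } x \to 0.$$

For a fixed $x_0 > 0$ we also deduce from Hölder’s inequality for $x > x_0$ that

$$F(x) - F(x_0) = \int_{x_0}^x f(t)\, dt \le (x - x_0)^{1/2}\Big(\int_{x_0}^x f^2(t)\, dt\Big)^{1/2} \le x^{1/2}\Big(\int_{x_0}^\infty f^2(t)\, dt\Big)^{1/2},$$

and thus

$$\frac{F(x)}{x^{1/2}} \le \frac{F(x_0)}{x^{1/2}} + \Big(\int_{x_0}^\infty f^2(t)\, dt\Big)^{1/2}.$$

By choosing, for a given ε > 0, first $x_0$ far enough out so that $\big(\int_{x_0}^\infty f^2(t)\, dt\big)^{1/2} < \varepsilon/2$ and then $x_1 > x_0$ so that $x_1^{-1/2} F(x_0) < \varepsilon/2$, it follows that

$$\frac{F(x)}{x^{1/2}} < \varepsilon \quad \text{whenever } x > x_1,$$

and thus

$$(2.6)\qquad \lim_{x \to \infty} \frac{F(x)}{x^{1/2}} = 0.$$

Using integration by parts, it follows for any 0 < a < b < ∞ that

$$\int_a^b \frac{F^2(x)}{x^2}\, dx = -\frac{F^2(x)}{x}\Big|_{x=a}^{b} + 2\int_a^b F(x) f(x) x^{-1}\, dx \le \Big[\frac{F(a)}{a^{1/2}}\Big]^2 + \Big[\frac{F(b)}{b^{1/2}}\Big]^2 + 2\Big(\int_a^b \frac{F^2(x)}{x^2}\, dx\Big)^{1/2}\Big(\int_a^b f^2(x)\, dx\Big)^{1/2}$$

(by Hölder’s inequality), and thus, in case that a is chosen small enough and b large enough so that F(x) does not a.e. vanish on [a, b], we have

$$\Big(\int_a^b \frac{F^2(x)}{x^2}\, dx\Big)^{1/2} \le \Big[\Big[\frac{F(a)}{a^{1/2}}\Big]^2 + \Big[\frac{F(b)}{b^{1/2}}\Big]^2\Big]\Big(\int_a^b \frac{F^2(x)}{x^2}\, dx\Big)^{-1/2} + 2\Big(\int_a^b f^2(x)\, dx\Big)^{1/2}.$$

Our claim follows now from (2.5) and (2.6) by letting a → 0 and b → ∞.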

Proof by Sheng Zhang. Suppose, to the contrary, that

$$\delta = \liminf_{n \to \infty} \frac{|y_n|}{t_n} \sum_{j=1}^{n} |y_j| > 0,$$

and thus for some $n_0 \in \mathbb{N}$

$$\frac{|y_n|}{t_n} \sum_{j=1}^{n} |y_j| > \delta/2 \quad \text{whenever } n \ge n_0.$$

For $n \ge n_0$, we deduce from Hölder’s inequality

δ

2<

1

tn

n∑j=1

|yn||yj | ≤1

tn

( n∑j=1

|yj |2)n|yn|2,

and thus|yn|tnn

≥ δ

2

1∑nj=1 |yj |2

,

which yields

lim infn→∞

|yn|tnn

≥ δ

2

1∑∞j=1 y

2j

=: ε

Thus there is an $n_1 \ge n_0$ so that for all $n \ge n_1$, $|y_n|^2 \ge \varepsilon t_n/(2n)$. But this contradicts the assumption that $y = (y_n) \in \ell_2$ and $\sum_{n=1}^{\infty} t_n/n = \infty$.

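Proof of Theorem 2.2.3. Let x ∈ H and put for n ∈ N, Gn = Gn(x) with

$$G_n(x) = \sum_{j=1}^{n} \langle x - G_{j-1}, z_j\rangle z_j,$$

where zn ∈ D satisfies

$$(2.7)\qquad \langle z_n, x - G_{n-1}\rangle = \langle z_n, x_{n-1}\rangle \ge t_n \sup_{z \in D} \langle z, x_{n-1}\rangle.$$

Define

$$(2.8)\qquad x_n = x - G_n(x) = x - \sum_{j=1}^{n} \langle z_j, x - G_{j-1}\rangle z_j = x_{n-1} - \langle z_n, x_{n-1}\rangle z_n.$$

By induction we show that for every n ∈ N

$$(2.9)\qquad \|x_n\|^2 = \|x\|^2 - \sum_{j=1}^{n} \langle z_j, x_{j-1}\rangle^2.$$

Indeed, for n = 1 the claim is clear, and assuming that (2.9) is true for n ∈ N we compute, using $\|z_{n+1}\| = 1$,

$$\|x_{n+1}\|^2 = \|x_n - \langle x_n, z_{n+1}\rangle z_{n+1}\|^2 = \|x_n\|^2 - \langle x_n, z_{n+1}\rangle^2 = \|x\|^2 - \sum_{j=1}^{n+1} \langle z_j, x_{j-1}\rangle^2.$$

It follows therefore from (2.9) that

$$(2.10)\qquad \sum_{j=1}^{\infty} \langle z_j, x_{j-1}\rangle^2 \le \|x\|^2.$$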

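For m < n we compute

$$(2.11)\qquad \|x_n - x_m\|^2 = \|x_m\|^2 - \|x_n\|^2 - 2\langle x_m - x_n, x_n\rangle,$$

and

$$|\langle x_m - x_n, x_n\rangle| = \Big|\sum_{j=m+1}^{n} \langle x_{j-1} - x_j, x_n\rangle\Big| \le \sum_{j=m+1}^{n} |\langle x_{j-1} - x_j, x_n\rangle| = \sum_{j=m+1}^{n} |\langle z_j, x_n\rangle| \cdot |\langle z_j, x_{j-1}\rangle|$$

$$\le \frac{|\langle z_{n+1}, x_n\rangle|}{t_{n+1}} \sum_{j=m+1}^{n} |\langle z_j, x_{j-1}\rangle| \le \frac{|\langle z_{n+1}, x_n\rangle|}{t_{n+1}} \sum_{j=1}^{n+1} |\langle z_j, x_{j-1}\rangle|,$$

where the first of the last two inequalities uses that

$$|\langle z_j, x_n\rangle| \le \sup_{d \in D} |\langle d, x_n\rangle| \le t_{n+1}^{-1} |\langle z_{n+1}, x_n\rangle|.$$

By (2.10) the sequence $y_n = |\langle z_n, x_{n-1}\rangle|$, n ∈ N, lies in $\ell_2$. We can therefore apply Lemma 2.2.4 to (tn) and (yn), and deduce that

$$\liminf_{n \to \infty} \max_{m < n} |\langle x_m - x_n, x_n\rangle| = 0.$$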

Together with the fact that ‖xn‖ is decreasing and (2.11), this implies that there is a subsequence $(x_{n_k})$ which converges to some $\tilde{x} \in H$. We claim that the whole sequence (xn) converges to $\tilde{x}$, which, together with Proposition 2.2.1, would finish the proof. Note that for any n ∈ N and any k ∈ N so that $n_k > n$ we have

$$\|x_n - \tilde{x}\| \le \|x_n - x_{n_k}\| + \|x_{n_k} - \tilde{x}\| = \big(\|x_n\|^2 - \|x_{n_k}\|^2 - 2\langle x_n - x_{n_k}, x_{n_k}\rangle\big)^{1/2} + \|x_{n_k} - \tilde{x}\|$$

$$\le \big(\|x_n\|^2 - \|x_{n_k}\|^2\big)^{1/2} + 2\max_{m < n_k} |\langle x_m - x_{n_k}, x_{n_k}\rangle|^{1/2} + \|x_{n_k} - \tilde{x}\|.$$

So, given ε > 0 we can choose $n_0$ large enough so that $\big(\|x_n\|^2 - \|x_{n_k}\|^2\big)^{1/2} < \varepsilon/3$ for all $n \ge n_0$ and k with $n_k > n$. Then we choose $k_0$ so that for all $k \ge k_0$, $2\max_{m < n_k} |\langle x_m - x_{n_k}, x_{n_k}\rangle|^{1/2} < \varepsilon/3$ and $\|x_{n_k} - \tilde{x}\| < \varepsilon/3$. For any $n \ge n_0$ we can therefore choose $k \ge k_0$ so that also $n_k > n$, and from the above inequalities we deduce that $\|x_n - \tilde{x}\| < \varepsilon$.

The next Theorem proves that, at least among the decreasing weakness factors τ, the condition (2.3) is optimal in order to imply convergence of the WPGA.

Theorem 2.2.5. In the class of monotone decreasing sequences τ = (tk), the condition (2.3) is necessary for the WPGA to converge.

In other words, if (tn) is a decreasing sequence for which

$$(2.12)\qquad \sum_{n \in \mathbb{N}} \frac{t_n}{n} < \infty,$$

then there is a dictionary D of H, an x ∈ H and sequences (Gn) ⊂ H and (zn) ⊂ D, with G0 = 0, so that for xn = x − Gn the following is satisfied:

$$(2.13)\qquad x_n = x_{n-1} - \langle x_{n-1}, z_n\rangle z_n,$$

$$(2.14)\qquad \langle x_{n-1}, z_n\rangle \ge t_n \max_{z \in D} \langle x_{n-1}, z\rangle,$$

but so that xn does not converge to 0.

We will need the following notation:

Definition 2.2.6. Assume D′ ⊂ S_H has the property that z ∈ D′ implies that −z ∈ D′, and assume that $\tau = (t_n)_{n=1}^{N} \subset (0, 1]$, with N ∈ N ∪ {∞}, is a finite or infinite sequence of positive numbers. A pair of sequences $(x_n)_{n=0}^{N} \subset H$ and $(z_n)_{n=1}^{N} \subset D'$ is called a pair of WPGA-sequences with weakness factor τ and dictionary D′ if x0 ∈ span(D′) and for all n = 1, 2, . . . , N

$$(2.15)\qquad x_n = x_{n-1} - \langle x_{n-1}, z_n\rangle z_n,$$

$$(2.16)\qquad \langle x_{n-1}, z_n\rangle \ge t_n \max_{z \in D'} \langle x_{n-1}, z\rangle.$$

Remark. For a given sequence $\tau = (t_n)_{n=1}^{\infty} \subset (0, 1]$ satisfying (2.12), we will choose the elements of a dictionary D as well as the elements xn and zn of a pair of WPGA-sequences with weakness factor τ and dictionary D recursively.

To achieve that, we will choose inductively elements xn, n ≥ 0, and zn, n ≥ 1, so that for all n ∈ N

$$(2.17)\qquad x_n = x_{n-1} - \langle x_{n-1}, z_n\rangle z_n,$$

$$(2.18)\qquad \langle x_{n-1}, z_n\rangle \ge t_n \max\Big(\sup_{i \in \mathbb{N}} |\langle x_{n-1}, e_i\rangle|,\ \max_{j=1,2,\ldots,n-1} |\langle x_{n-1}, z_j\rangle|\Big),$$

$$(2.19)\qquad \langle x_j, z_{j+1}\rangle \ge t_j \langle x_j, z_n\rangle \quad \text{for all } j = 0, 1, 2, \ldots, n-1.$$

Here (ej) denotes an orthonormal basis of H.

We deduce then that $(x_n)_{n=0}^{\infty}$ and $(z_n)_{n=1}^{\infty}$ is a pair of WPGA-sequences with weakness factor τ and dictionary D = {±ej, ±zj : j ∈ N}.

Proof of Theorem 2.2.5. The following procedure is the key observation towards inductively producing our example. We let (ej) be an orthonormal basis of H.

Let t ∈ (0, 1/3] and i ≠ j in N be given. We define elements xn ∈ span(ei, ej), n ≥ 0, and zn ∈ span(ei, ej), ‖zn‖ = 1, n ≥ 1, and αn ∈ [0, 1] recursively until we stop at some n = N, when some criterion is satisfied, as follows:

We put x0 = ei. Now assume that for some n ∈ N we defined $x_s = a_s e_i + b_s e_j$, $\alpha_s \in (0, 1)$ and $z_s \in S_H$ for all 1 ≤ s ≤ n − 1 so that for all 1 ≤ s < n we have

$$(2.20)\qquad \langle x_{s-1}, z_s\rangle = t, \quad \text{as long as } s \le N - 1,$$

$$(2.21)\qquad z_s = \alpha_s e_i - (1 - \alpha_s^2)^{1/2} e_j,$$

$$(2.22)\qquad a_s, b_s \ge 0, \ \text{and } a_s - b_s \ge \sqrt{2}\,t, \quad \text{as long as } s \le N - 1,$$

$$(2.23)\qquad x_s = x_{s-1} - \langle x_{s-1}, z_s\rangle z_s.$$

(Condition (2.20) and the second part of (2.22) become vacuous for s = N, once we have defined N.) Then we first define a candidate $\tilde{z}_n$ as

$$\tilde{z}_n = \tilde{\alpha}_n e_i - (1 - \tilde{\alpha}_n^2)^{1/2} e_j,$$

where $\tilde{\alpha}_n$ is defined so that $\langle x_{n-1}, \tilde{z}_n\rangle = t$. Secondly define $\tilde{x}_n$ to be

$$\tilde{x}_n = x_{n-1} - \langle x_{n-1}, \tilde{z}_n\rangle \tilde{z}_n$$

and write $\tilde{x}_n$ as $\tilde{x}_n = \tilde{a}_n e_i + \tilde{b}_n e_j$.

Case 1. $\tilde{a}_n - \tilde{b}_n \ge \sqrt{2}\,t$. In that case we choose $\alpha_n = \tilde{\alpha}_n$ and $z_n = \tilde{z}_n = \alpha_n e_i - (1 - \alpha_n^2)^{1/2} e_j$. Thus (2.20) and (2.21) are satisfied for s = n. Then we let $x_n = \tilde{x}_n$, and have therefore satisfied (2.22) and (2.23).

Case 2. $\tilde{a}_n - \tilde{b}_n < \sqrt{2}\,t$. Then we let $N = N_t = n$ and put $\alpha_N = 1/\sqrt{2}$, $z_N = (e_i - e_j)/\sqrt{2}$ and $x_N = x_{N-1} - \langle x_{N-1}, z_N\rangle z_N$. Then (2.21) and (2.23) are satisfied while (2.20) is vacuous.

From the definition of $x_N$ in Case 2, we observe that

$$(2.24)\qquad \langle x_{N-1}, z_N\rangle = 2^{-1/2}(a_{N-1} - b_{N-1}) \ge t,$$

and it follows therefore that

$$(2.25)\qquad a_N = b_N = \frac{1}{2}(a_{N-1} + b_{N-1}).$$

In particular, the first part of (2.22) is also satisfied for n = N, assuming that N is finite, which we will see later (here the second part of (2.22) is vacuous).

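Once the second case happens we finish the definition of our sequences. We still have to show that Case 2 will eventually happen and that N is finite; for the moment we think of N as being an element of N ∪ {∞}.

We make the following observations. From (2.20) and (2.21) we deduce that

$$(2.26)\qquad a_{n+1} = a_n - t\alpha_{n+1} \quad \text{and} \quad b_{n+1} = b_n + t(1 - \alpha_{n+1}^2)^{1/2}, \quad \text{if } n < N - 1,$$

which implies that

$$(2.27)\qquad a_{n+1} - b_{n+1} = a_n - b_n - t\big(\alpha_{n+1} + (1 - \alpha_{n+1}^2)^{1/2}\big) \ \begin{cases} \ge a_n - b_n - \sqrt{2}\,t \\ \le a_n - b_n - t \end{cases} \quad \text{if } n < N - 1.$$

This yields

$$1 = a_0 - b_0 \ge \sum_{s=0}^{N-2} \big[(a_s - b_s) - (a_{s+1} - b_{s+1})\big] \ge (N - 1)t,$$

and therefore we showed that N is finite. Since by the definition of N and of the candidate coefficients $\tilde{a}_N$ and $\tilde{b}_N$ in Case 2

$$\sqrt{2}\,t > \tilde{a}_N - \tilde{b}_N = a_{N-1} - b_{N-1} - t\big(\tilde{\alpha}_N + (1 - \tilde{\alpha}_N^2)^{1/2}\big) \ge a_{N-1} - b_{N-1} - \sqrt{2}\,t,$$

it follows that

$$(2.28)\qquad \langle x_{N-1}, z_N\rangle = 2^{-1/2}(a_{N-1} - b_{N-1}) \ \begin{cases} \le 2t \\ \ge t. \end{cases}$$

It follows therefore from (2.27), (2.24) and (2.25) that

$$1 = a_0 - b_0 = \sum_{s=0}^{N-1} \big[(a_s - b_s) - (a_{s+1} - b_{s+1})\big] \ \begin{cases} \ge tN \\ \le \sqrt{2}\,tN \end{cases}$$

and thus

$$(2.29)\qquad \frac{1}{\sqrt{2}\,t} \le N \le \frac{1}{t}.$$

From the definition of $x_N$ and (2.28) we deduce that

$$\|x_N\|^2 = \|x_{N-1}\|^2 - \langle x_{N-1}, z_N\rangle^2 \ge \|x_{N-1}\|^2 - 4t^2 = \|x_0\|^2 + \sum_{s=1}^{N-1} \big(\|x_s\|^2 - \|x_{s-1}\|^2\big) - 4t^2$$

$$= \|x_0\|^2 - (N - 1)t^2 - 4t^2 \ge \|x_0\|^2 - t - 3t^2 \quad \text{(by (2.29))},$$

and thus, since t ≤ 1/3,

$$(2.30)\qquad \|x_N\|^2 \ge \|x_0\|^2 - 2t.$$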

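Finally note that the sequence $(x_n)_{n=0}^{N}$, together with $(z_n)_{n=1}^{N}$, is a pair of WPGA-sequences for the dictionary $D' = \{\pm z_n : n = 1, 2, \ldots, N_t\} \cup \{\pm e_i\}$ with the constant weakness factor t. We call $(x_n)_{n=0}^{N_t}$ together with the sequence $(z_n)_{n=1}^{N_t}$ the WPGA sequence generated by t and the pair (ei, ej).

Now we assume that $(t_n)_{n=1}^{\infty}$ is a decreasing sequence in (0, 1] so that $\sum_{n=1}^{\infty} t_n/n < \infty$. We first require the additional assumption that $\sum_{n=1}^{\infty} t_n/n < \varepsilon = \frac{3}{16}$. Since (tn) is decreasing, it follows that

$$\sum_{s=0}^{\infty} t_{2^s} = t_1 + t_2 + t_4 + t_8 + \ldots \le t_1 + t_2 + \frac{1}{2}(t_3 + t_4) + \frac{1}{4}(t_5 + t_6 + t_7 + t_8) + \frac{1}{8}(t_9 + t_{10} + \ldots + t_{16}) + \ldots$$

$$\le 2\Big[t_1 + \frac{t_2}{2} + \frac{1}{4}(t_3 + t_4) + \frac{1}{8}(t_5 + t_6 + t_7 + t_8) + \frac{1}{16}(t_9 + t_{10} + \ldots + t_{16}) + \ldots\Big] \le 2\sum_{n=1}^{\infty} \frac{t_n}{n} \le 2\varepsilon.$$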

We will construct recursively sequences (xn : n = 0, 1, 2, . . .) and (zn : n = 1, 2 . . .)so that x0 = e1, xn = xn−1 − 〈xn−1, zn〉zn, so that for every n ∈ N

〈xn−1, zn〉 ≥ tn maxj=1,...n−1

〈xn−1, zj〉 and 〈xn−1, zn〉 ≥ tn supj∈N〈xn−1, ej〉(2.31)


and

tj〈xj , zn〉 ≤ 〈xj , zj+1〉 for all j = 0, 1, 2, . . . n− 1.(2.32)

As noted in the remark before the proof, these two conditions will ensure that thatfor each n the vector is of the form xn = x−Gn(x), where (Gn(x) : n ∈ N0) is theresult of a WPGA with weakness factors (tn) and dictionaryD = {zn, en : n ∈ N}.

We start with x = x0 = e1, and let (x(1)n : n = 0, 1, 2, . . . Nt1) and (z

(1)n : n =

1, 2, . . . Nt1) be the WAGD sequence generated by t and the pair (e1, e2), then

we put xn = x(1)n and zn = z

(1)n for n = 1, 2, 3 . . . Nt1 .

Note that we satisfied so far our required conditions (2.31) and (2.32) sinceby construction 〈xn−1, zn〉 = t1 ≥ tn ≥ tn‖xn−1‖ for all n = 1, 2 . . . , N1 = Nt1 .By (2.25) xN1 is of the form xN1 = c1(e1 − e2), and we deduce from (2.30) andthe fact that ‖xN1‖ ≤ 1 that

c21 ≤ 1/2, N1 ≥ 1 and ‖xN1‖2 ≥ 1− 2t1.

Then we consider let (x(2,1)n : n = 0, 1, . . . Nt2) and (z

(2,1)n : n = 1, 2 . . . Nt2) be

the WAGD sequence generated by t2 and the pair (e1, e3), and (x(2,2)n : n =

0, 1, . . . Nt2) and (z(2,2)n : n = 1, 2 . . . Nt2) be the WAGD sequence generated by

t2 and the pair (e2, e4). We put N2 = Nt2 and for n = 1, 2, . . . N2 we define

xN1+n = c1x(2,1)n + c1e2 and zN1+n = z(2,1)

n

xN1+N2+n = c1x(2,1)N1

+ c1x(2,2)n and zN1+N2+n = z(2,2)

n .

We observe that for n = 1, 2 . . . N2

〈xN1+n−1, zN1+n〉 = c1t2 ≥ t2 maxs∈N〈xN1+n−1, es〉 and

〈xN1+n−1, zN1+n〉 = c1t2 ≥ t2 maxs=1,2,...N1

〈xN1+n−1, zs〉

(the first inequality follows from the fact that the coordinates of xN1+n, n =1, 2 . . . N2 are absolutely, not larger than c1, the second inequality follows from(2.21) and the fact moreover the coordinates of xN1+n, n = 1, 2 . . . N2 are notnegative while zs, s = 1, 2 . . . N1, has a positive and negative coordinate). Sec-ondly we note that for j = 1, 2, . . . , N1 and n = 1, 2, . . . , N2 − 1, it follows from(2.20) and (2.24) that

tj〈xj , zN1+n〉 ≤ t1 ≤ 〈xj , zj+1〉.

This implies that the conditions (2.31) and (2.32) hold for all N1 ≤ n ≤ N1 +N2.Similarly we can show that they also hold for all N1 +N2 ≤ n ≤ N1 + 2N2.

Finally (2.30) implies that XN1+2N2 is of the form

XN1+2N2 = c2(e1 + e2 + e3 + e4)


with

c22 ≤ 1/4 and ‖xN1+2N2‖2 ≥ ‖xN1‖2 − c2

12t2 − c212t2 ≥ 1− 2t1 − 2t2.

Now assume that for some r ∈ N we have chosen

(xn : n = 1, 2, . . .Mr), with Mr =

r∑j=1

2j−1Nj and Nj = Nt2j−1 , j = 1, 2 . . . r,

and

(zn : n = 1, 2, . . .Mr)

so that (2.31) and (2.32) hold for all n ≤Mr, and so that

xMr = cr

2r∑i=1

ei

for some cr with c2r ≤ 2−r, and so that

‖xMr‖2 ≥ 1− 2t1 − 2t2 − 2t4 − . . .− 2t2r−1

then we let for j = 1, 2 . . . 2r (x(r+1,j)n : n = 0, . . . Nr+1) and (z

(r+1,j)n : n =

0, . . . Nr+1),with Nr+1 = Nt2r , be the WPGA sequences generated by by t2r andthe pair (ej , e2r+j), and finally put for i = 1, 2 . . . 2r and n = 1, 2, . . . Nr

xMr+(i−1)Nr+1+n =

i−1∑s=1

crx(r+1,s)Nr

+ crx(r+1,i)n + cr

2r∑s=i+1

es, and

zMr+(i−1)Nr+1+n = z(r+1,i)n .

We deduce as in the case r = 1 that the conditions (2.31) and (2.32) hold for

all n ≤ Mr + 2rNr+1 =∑r+1

s=1 2s−1Ns = Mr+1, that xMr+1 = cr+1∑2r+1

s=1 es , forsome cr+1 ≤ 2−r−1, and that

‖xMr+1‖ ≥ 1− 2t1 − 2t2, . . . 2tr.

This finishes the choice of the the xn and zn.

Since $\|x_{M_r}\|^2 \ge 1 - 2\sum_{s=0}^{r-1} t_{2^s} \ge 1 - 4\varepsilon > 1 - \frac{12}{16} = \frac{1}{4}$, it follows that (xn) does not converge to 0. We therefore proved our claim under the additional assumption that $\sum_{n=1}^{\infty} (t_n/n) < 3/16$.

In the general case we proceed as follows. We first find an n0 so that

∞∑s=n0

t2s < 3/16,


and let

x =2n0∑j=1

ej .

Then we choose zi = ei, i = 1, 2 . . . , 2n0 − 1, and thus x0 = x and recursively

xn = xn−1 − 〈xn−1, zn〉 =∑

j = n+ 12n0ej

for n = 1, 2, . . . 2n0 − 1. In particular x2n0−1 = e2n0 from then on we choosex2n0−1+n = xn, n = 1, 2 . . . and z2n0−1+n = zn, where the xn and zn are chosenlike the xn and the zn in the special case, but in the Hilbertspace H = span(ej :j ≥ 2n0).

2.3 Convergence Rates

Note that without any special conditions on the starting point in the pure greedy algorithm (or the others) we cannot expect to be able to estimate the convergence rate.

Indeed, let (ξn) be any sequence of positive numbers which decreases to 0, and let D = {±en : n ∈ N}, where (en) is an orthonormal basis of our Hilbert space H. We assume in addition that the differences ξj − ξj+1 are decreasing in j, so that (PGA) picks e1, e2, . . . in this order. Then take

$$x = \sum_{j=1}^{\infty} \sqrt{\xi_j - \xi_{j+1}}\, e_j.$$

It follows for the n-th approximants Gn = Gn(x), defined as in (PGA), that

$$G_n = \sum_{j=1}^{n} \sqrt{\xi_j - \xi_{j+1}}\, e_j$$

and thus

$$\|x - G_n\|^2 = \sum_{j=n+1}^{\infty} (\xi_j - \xi_{j+1}) = \xi_{n+1}.$$

Thus no matter how slowly (ξn) converges to 0, there is an x so that Gn(x) converges at least as slowly as (ξn).

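In order to state our first result we introduce for a dictionary D of H the following linear subspace:

$$(2.33)\qquad A_1 = A_1(D) = \Big\{\sum_{z \in D} c_z z : (c_z) \subset \mathbb{K} \ \text{and} \ \sum_{z \in D} |c_z| < \infty\Big\}.$$

For x ∈ A1 we put

$$(2.34)\qquad \|x\|_{A_1} = \inf\Big\{\sum_{z \in D} |c_z| : (c_z) \subset \mathbb{K} \ \text{and} \ x = \sum_{z \in D} c_z z\Big\}.$$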

Theorem 2.3.1. [DT] Assume D is a dictionary of a separable Hilbert space. Let x ∈ A1(D), assume that (Gn) = (Gn(x)) is defined as in (PGA), and let xn = x − Gn for n ∈ N. Then

$$(2.35)\qquad \|x_n\| \le \|x\|_{A_1} n^{-1/6} \quad \text{for } n \in \mathbb{N}.$$

For the proof of Theorem 2.3.1 we need the following observation.

Lemma 2.3.2. Assume that (ξm) is a sequence of positive numbers so that for some number A > 0

$$(2.36)\qquad \xi_1 \le A \quad \text{and} \quad \xi_{m+1} \le \xi_m(1 - \xi_m/A), \quad \text{for } m \ge 1.$$

Then

$$(2.37)\qquad \xi_m \le \frac{A}{m}, \quad \text{for all } m \in \mathbb{N}.$$

Proof. We assume A = 1 (pass to ξm/A). We prove the claim by induction for each m ∈ N. For m = 1, (2.37) follows from the assumption. Assume that the claim is true for m ∈ N. If $\xi_m \le \frac{1}{m+1}$ then also $\xi_{m+1} \le \frac{1}{m+1}$, since from (2.36) it follows that the sequence (ξi) is decreasing. If $\frac{1}{m+1} < \xi_m \le \frac{1}{m}$ we deduce that

$$\xi_{m+1} \le \xi_m(1 - \xi_m) \le \frac{1}{m}\Big(1 - \frac{1}{m+1}\Big) = \frac{1}{m} \cdot \frac{m}{m+1} = \frac{1}{m+1},$$

which implies the claim for m + 1 and finishes the induction step.
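The bound (2.37) can be checked numerically on the extremal recursion, in which the inequality in (2.36) is replaced by an equality. The sketch below assumes the sample values A = 3 and ξ1 = A/2; the function name `extremal_sequence` is illustrative.

```python
def extremal_sequence(A, xi_1, n_terms):
    """Iterate xi_{m+1} = xi_m * (1 - xi_m / A), the equality case of (2.36)."""
    xi = [xi_1]
    for _ in range(n_terms - 1):
        xi.append(xi[-1] * (1.0 - xi[-1] / A))
    return xi

A = 3.0
xi = extremal_sequence(A, A / 2, 1000)
# xi[m - 1] plays the role of xi_m in (2.37)
bound_holds = all(xi[m - 1] <= A / m for m in range(1, 1001))
```

In this run the products m ξm stay below A, in line with (2.37).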

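Proof of Theorem 2.3.1. For x ∈ H, x ≠ 0, we put

$$\rho(x) = \sup_{z \in D} \frac{\langle x, z\rangle}{\|x\|}.$$

Note that if x ∈ A1, η > 0 and $(c_z)_{z \in D} \subset \mathbb{R}_0^+$ is such that $x = \sum_{z \in D} c_z z$ and $\sum_{z \in D} c_z \le \eta + \|x\|_{A_1}$, it follows that

$$\|x\|^2 = \Big\langle x, \sum_{z \in D} c_z z\Big\rangle \le \sum_{z \in D} c_z \sup_{z \in D} \langle z, x\rangle \le \big(\|x\|_{A_1} + \eta\big)\|x\|\rho(x),$$

and thus, since η > 0 was arbitrary,

$$(2.38)\qquad \rho(x) \ge \frac{\|x\|}{\|x\|_{A_1}}.$$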

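Let x ∈ A1 and let us assume that there is a representation $x = \sum_{z \in D} c_z z$ so that $\|x\|_{A_1} = \sum_{z \in D} c_z$ (otherwise we use arbitrary approximations). Let (zm) and (Gm) be defined as in (PGA) and xn = x − Gn, for n ∈ N. We note that for m ∈ N0

$$(2.39)\qquad \|x_{m+1}\|^2 = \|x_m - \langle x_m, z_{m+1}\rangle z_{m+1}\|^2 = \|x_m\|^2 - \langle x_m, z_{m+1}\rangle^2 = \|x_m\|^2(1 - \rho^2(x_m)).$$

We put $a_m = \|x_m\|^2$, $b_0 = \|x_0\|_{A_1} = \|x\|_{A_1}$ and, assuming that $b_m$ has been defined, we let $b_{m+1} = b_m + \rho(x_m)\|x_m\| = b_m + \rho(x_m) a_m^{1/2}$. First we observe that

$$(2.40)\qquad \|x_m\|_{A_1} \le b_m.$$

Indeed, for m = 0 this simply follows from the definition of $b_0$, and assuming (2.40) holds for m ∈ N0 it follows that

$$\|x_{m+1}\|_{A_1} = \|x_m - \langle x_m, z_{m+1}\rangle z_{m+1}\|_{A_1} \le \|x_m\|_{A_1} + |\langle x_m, z_{m+1}\rangle| = \|x_m\|_{A_1} + \rho(x_m)\|x_m\| \le b_{m+1}.$$

Secondly we compute, using (2.39), (2.38) and (2.40),

$$a_{m+1} = \|x_{m+1}\|^2 = a_m(1 - \rho^2(x_m)) \le a_m\Big(1 - \frac{\|x_m\|^2}{\|x_m\|_{A_1}^2}\Big) \le a_m\Big(1 - \frac{a_m}{b_m^2}\Big),$$

and thus, since $b_{m+1} \ge b_m$,

$$\frac{a_{m+1}}{b_{m+1}^2} \le \frac{a_{m+1}}{b_m^2} \le \frac{a_m}{b_m^2}\Big(1 - \frac{a_m}{b_m^2}\Big).$$

Note that $\frac{a_0}{b_0^2} = \frac{\|x\|^2}{\|x\|_{A_1}^2} \le 1$. We therefore apply Lemma 2.3.2 (with A = 1) to the sequence (ξn) with $\xi_n = a_{n-1}/b_{n-1}^2$ and deduce that

$$(2.41)\qquad a_m b_m^{-2} \le \frac{1}{m}, \quad \text{whenever } m \in \mathbb{N}.$$

Since by the recursive definition of $(b_j)$, (2.40) and (2.38) we get

$$b_{m+1} = b_m\big(1 + \rho(x_m) a_m^{1/2} b_m^{-1}\big) \le b_m\big(1 + \rho(x_m) a_m^{1/2} \|x_m\|_{A_1}^{-1}\big) \le b_m\big(1 + \rho^2(x_m)\big),$$

we obtain together with (2.39)

$$a_{m+1} b_{m+1} \le a_m b_m (1 - \rho^2(x_m))(1 + \rho^2(x_m)) \le a_m b_m.$$

$(a_m b_m)$ is therefore decreasing and $a_m b_m \le a_0 b_0 = \|x\|^2 \cdot \|x\|_{A_1}$. Multiplying both sides of (2.41) by $a_m^2 b_m^2$ we therefore obtain

$$a_m^3 \le \frac{a_m^2 b_m^2}{m} \le \frac{\|x\|^4 \cdot \|x\|_{A_1}^2}{m},$$

which, since $\|x\| \le \|x\|_{A_1}$, implies our claim after taking the sixth root on both sides.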


The next Example due to DeVore and Temlyakov gives a lower bound for theconvergence rate of (PGA)

Example 2.3.3. [DT] Let H be a separable Hilbert space and (hj) an orthonormal basis of H. We will define a dictionary D ⊂ H and a vector x ∈ H for which $\|x\|_{A_1(D)} = 2$, and so that, for some constant c > 0,

$$\|x_m\| = \|x - G_m\| \ge \frac{c}{\sqrt{m}}, \quad \text{for } m \in \mathbb{N}.$$

Define

a =

(23

11

)1/2

and A =

(33

89

)1/2

and

z = A(h1 + h2) + aA∞∑k=3

(k(k + 1)

)−1/2hk.

Note that

‖z‖2 = 2A2 + a2A2∞∑k=3

1

k

1

k + 1

= 2A2 + a2A2∞∑k=3

1

k− 1

k + 1

= 2A2 +1

3a2A2 =

33

89

(2 +

23

33

)= 1.

Put D = {±z} ∪ {±hj : j ∈ N}, let x = h1 + h2, and apply (PGA) to x.

Claim: In Step 1 of (PGA) we have

z1 = z and x1 = x− 〈x, z〉z = (1− 2A2)(h1 + h2)− 2aA2∞∑k=3

(k(k + 1)

)−1/2hk.

Indeed,

〈x, z〉 = 2A > 1,

〈x, h1〉 = 〈x, h2〉 = 1, and

〈x, hj〉 = 0, if j > 2.

Thus z1 = z and

x1 = x− 〈x, z〉z

= h1 + h2 − 2A(A(h1 + h2) + aA

∞∑k=3

(k(k + 1)

)−1/2hk

)


= (1− 2A2)(h1 + h2)− 2aA2∞∑k=3

(k(k + 1)

)−1/2hk.

Claim: In Step 2 and Step 3 of (PGA) we have z2 = h1 and z3 = h2 or viceversa.

Indeed,

〈x1, z〉 = 0 (by construction of x1)

〈x1, h1〉 = 〈x1, h2〉 = (1− 2A2) =23

89and

〈x1, hj〉 = 2aA2(j(j + 1)

)−1/2 ≤ 1

6aA2 < (1− 2A2) if j > 2.

So take W.l.o.g z2 = h1 and thus

x2 = x1 − 〈x1, h1〉h1 = (1− 2A2)h2 − 2aA2∞∑k=3

(k(k + 1)

)−1/2hk.

Then we observe that

〈x2, z〉 = 〈x1, z〉+ 〈x2 − x1, z〉 = −(1− 2A2)〈h1, z〉 = −A(1− 2A2)

〈x2, h1〉 = 0,

〈x2, h2〉 = (1− 2A2) > |〈x2, z〉|

〈x2, hj〉 = 2aA2(j(j + 1)

)−1/2 ≤ 1

6aA2 < (1− 2A2) if j > 2.

which implies that z3 = h2 and that

x3 = x2 − 〈x2, h2〉h2 = −2aA2∞∑k=3

(k(k + 1)

)−1/2hk.

From now on we prove by induction that

zm = hm−1 and xm = −2aA2∞∑k=m

(k(k + 1)

)−1/2hk whenever m ≥ 3.

Indeed, for m = 3 this was already shown. Assuming that our claim is true forsome m ≥ 3 we compute that

〈xm, z〉 = −2a2A3∞∑k=m

(k(k + 1)

)= − 2a2A3

m(m+ 1)

〈xm, h`〉 = 0 for ` < m, and

〈xm, h`〉 = −2a2A3(`(`+ 1)

)−1/2for ` ≥ m.


Thus zm+1 = hm and xm = −2aA2∑∞

k=m+1

(k(k+ 1)

)−1/2hk, which finishes the

induction step.Finally note that for m ≥ 3

‖xm‖2 = 4a2A4∞∑k=m

1

k(k + 1)=

4a2A4

m.

Thus ‖x‖A1 = 2 and (‖xm‖) is of the order m−1/2.

Remark. In [KT2] the rate of convergence in Theorem 2.3.1 was slightly improved to $C n^{-11/62}$, where C is some universal constant. And in [LT] a dictionary D of H was constructed for which there is an x ∈ A1(D) with $\|x - G_n(x)\| \ge C n^{-0.27}$ whenever n ∈ N.

It is conjectured that the rate of convergence should be of the order of $n^{-1/4}$.

Theorem 2.3.4. [Jo] Consider the Greedy Algorithm with fixed relaxation $(G^f_m(x))$, x ∈ H, and let x ∈ A1(D) with $\|x\|_{A_1} \le c$, where c is the constant in (GAFR). Then

$$(2.42)\qquad \|x - G^f_m\| \le \frac{2c}{\sqrt{m}}, \quad \text{for all } m \in \mathbb{N}.$$

We first need the following elementary Lemma.

Lemma 2.3.5. Let (am) be a sequence of non-negative numbers, for which there is an A > 0 so that

$$(2.43)\qquad a_1 \le A \quad \text{and} \quad a_m \le \Big(1 - \frac{2}{m}\Big) a_{m-1} + \frac{A}{m^2} \quad \text{if } m \ge 2.$$

Then

$$(2.44)\qquad a_m \le \frac{A}{m} \quad \text{for all } m \in \mathbb{N}.$$

Proof. We prove (2.44) by induction. For m = 1, (2.44) is part of our assumption, and assuming (2.44) is true for m − 1 we deduce from the second part of our assumption that

$$a_m \le \Big(1 - \frac{2}{m}\Big) a_{m-1} + \frac{A}{m^2} \le \Big(1 - \frac{2}{m}\Big)\frac{A}{m-1} + \frac{A}{m^2} = A\Big(\frac{1}{m-1} - \frac{m+1}{m^2(m-1)}\Big) = A\Big(\frac{m^2 - m - 1}{m^2(m-1)}\Big) = \frac{A}{m}\Big(\frac{m^2 - m - 1}{m(m-1)}\Big) < \frac{A}{m},$$

which finishes the induction step and the proof of our claim.

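Proof of Theorem 2.3.4. W.l.o.g. we can assume that c = 1. Let $G^f_n = G^f_n(x)$ and zn be as in (GAFR) and let $x_n = x - G^f_n$. For n ∈ N we compute:

$$(2.45)\qquad \|x_n\|^2 = \|x - G^f_n\|^2 = \Big\|x - \Big(1 - \frac{1}{n}\Big) G^f_{n-1} - \frac{1}{n} z_n\Big\|^2$$

$$= \|x - G^f_{n-1}\|^2 + \frac{2}{n}\big\langle x - G^f_{n-1}, G^f_{n-1} - z_n\big\rangle + \frac{1}{n^2}\|G^f_{n-1} - z_n\|^2 \le \|x - G^f_{n-1}\|^2 + \frac{2}{n}\big\langle x - G^f_{n-1}, G^f_{n-1} - z_n\big\rangle + \frac{4}{n^2}$$

(for the inequality notice that $\|G^f_{n-1}\| \le \|G^f_{n-1}\|_{A_1} \le 1$), and

$$\big\langle x - G^f_{n-1}, G^f_{n-1} - z_n\big\rangle = \inf_{z \in D} \big\langle x - G^f_{n-1}, G^f_{n-1} - z\big\rangle = \inf_{z \in A_1(D),\, \|z\|_{A_1} \le 1} \big\langle x - G^f_{n-1}, G^f_{n-1} - z\big\rangle$$

$$= \big\langle x - G^f_{n-1}, G^f_{n-1}\big\rangle - \sup_{z \in A_1(D),\, \|z\|_{A_1} \le 1} \big\langle x - G^f_{n-1}, z\big\rangle \le \big\langle x - G^f_{n-1}, G^f_{n-1} - x\big\rangle = -\|x - G^f_{n-1}\|^2.$$

Inserting this inequality into (2.45) yields

$$\|x_n\|^2 \le \Big(1 - \frac{2}{n}\Big)\|x - G^f_{n-1}\|^2 + \frac{4}{n^2} = \Big(1 - \frac{2}{n}\Big)\|x_{n-1}\|^2 + \frac{4}{n^2},$$

which together with Lemma 2.3.5 (applied to $a_n = \|x_n\|^2$ and A = 4) yields our claim.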
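The estimate (2.42) can be observed on a small example. The sketch below assumes H = R^4, the dictionary {±e_i}, c = 1 and the sample vector x = (0.4, 0.3, 0.2, 0.1), whose A1-norm with respect to this dictionary equals 1; the function name `gafr_residual_norms` is illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gafr_residual_norms(x, dictionary, n_steps, c=1.0):
    """Run (GAFR) with constant c; return the norms ||x - G_n|| for n = 1..n_steps."""
    G = [0.0] * len(x)
    norms = []
    for n in range(1, n_steps + 1):
        residual = [xi - gi for xi, gi in zip(x, G)]
        # step 1: maximise <x - G_{n-1}, z> over the dictionary
        z = max(dictionary, key=lambda d: dot(residual, d))
        # step 2: G_n = c (1 - 1/n) G_{n-1} + (c / n) z_n
        G = [c * (1.0 - 1.0 / n) * gi + (c / n) * zi for gi, zi in zip(G, z)]
        new_residual = [xi - gi for xi, gi in zip(x, G)]
        norms.append(math.sqrt(dot(new_residual, new_residual)))
    return norms

unit = [tuple(1.0 if i == j else 0.0 for i in range(4)) for j in range(4)]
dictionary = unit + [tuple(-a for a in e) for e in unit]
x = (0.4, 0.3, 0.2, 0.1)
norms = gafr_residual_norms(x, dictionary, 200)
```

With c = 1 every G_n is a convex combination of dictionary elements, which is what the proof uses to bound ‖G_{n−1} − z_n‖ by 2.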

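Theorem 2.3.6. If we consider the orthogonal greedy algorithm (OGA), then for each x ∈ H with $\|x\|_{A_1} = \|x\|_{A_1(D)} < \infty$, it follows that

$$(2.46)\qquad \|x - G^o_n(x)\| \le \frac{\|x\|_{A_1(D)}}{\sqrt{n}}.$$

Proof. Assume that $\|x\|_{A_1} = 1$, and that $G^o_n = G^o_n(x)$ and zn are given as in (OGA), i.e.

$$\langle x - G^o_{n-1}, z_n\rangle = \sup_{z \in D} \langle x - G^o_{n-1}, z\rangle$$

and $G^o_n$ is the orthogonal projection $P_{Z_n}(x)$ of x onto $Z_n = \mathrm{span}(z_1, z_2, \ldots, z_n)$. As usual we put $x_n = x - G^o_n$. Since $G^o_n$ is the best approximation of x by elements of $Z_n$, it follows that

$$(2.47)\qquad \|x_n\|^2 \le \|x_{n-1} - \langle x_{n-1}, z_n\rangle z_n\|^2 = \|x_{n-1}\|^2 - \langle x_{n-1}, z_n\rangle^2 = \|x_{n-1}\|^2\Big(1 - \Big(\frac{\langle x_{n-1}, z_n\rangle}{\|x_{n-1}\|}\Big)^2\Big)$$

(w.l.o.g. $x_{n-1} \ne 0$, otherwise we would be done). Write x as $x = \sum_{z \in D} c_z z$, with $\sum_{z \in D} c_z = 1$ and $c_z \ge 0$ for z ∈ D (if no such representation exists, we use arbitrary approximations as before). Then

$$\|x_{n-1}\|^2 = \langle x - G^o_{n-1}, x - G^o_{n-1}\rangle = \langle x - G^o_{n-1}, x\rangle \quad (\text{since } x - G^o_{n-1} \perp G^o_{n-1})$$

$$= \Big\langle x - G^o_{n-1}, \sum_{z \in D} c_z z\Big\rangle \le \Big\langle x - G^o_{n-1}, \sum_{z \in D} c_z z_n\Big\rangle = \langle x_{n-1}, z_n\rangle = \|x_{n-1}\| \frac{\langle x_{n-1}, z_n\rangle}{\|x_{n-1}\|},$$

and thus by (2.47)

$$\|x_n\|^2 \le \|x_{n-1}\|^2(1 - \|x_{n-1}\|^2).$$

Since $\|x_0\|^2 = \|x\|^2 \le \|x\|_{A_1}^2 = 1$, our claim follows therefore from Lemma 2.3.2, applied to $\xi_m = \|x_{m-1}\|^2$ and A = 1.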


Chapter 3

Greedy Algorithms in general Banach Spaces

3.1 Introduction

The algorithms from Chapter 2 can be generalized to separable Banach spaces X. Again let D ⊂ S_X be a dictionary of X, i.e. span(D) is dense in X, and z ∈ D implies that −z ∈ D.

(XGA) The X-Greedy Algorithm.

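For x ∈ X we define Gn = Gn(x), for each n ∈ N0, by induction.

G0 = 0 and assuming that G0, G1, . . . , Gn−1 have been defined for some n ∈ N we proceed as follows:

1) Choose zn ∈ D and an ≥ 0 so that

$$\|x - G_{n-1} - a_n z_n\| = \inf_{z \in D,\, a \ge 0} \|x - G_{n-1} - a z\|.$$

2) Put Gn = Gn−1 + anzn.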

As in the Hilbert space case, the “inf” in the above defined algorithm (XGA) may not be attained. In this case we can consider the following modification.

(WXGA) The Weak X-Greedy Algorithm.

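We are given a sequence τ = (tn) ⊂ (0, 1). For x ∈ X we define Gn = Gn(x), for each n ∈ N0, by induction.

G0 = 0 and assuming that G0, G1, . . . , Gn−1 have been defined for some n ∈ N we proceed as follows:

1) Choose zn ∈ D and an ∈ R so that

$$t_n \|x - G_{n-1} - a_n z_n\| \le \inf_{z \in D,\, a \in \mathbb{R}} \|x - G_{n-1} - a z\|.$$

2) Put Gn = Gn−1 + anzn.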

The following example shows that without any further conditions, one can “get stuck pretty easily”:

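Example 3.1.1. On $\mathbb{R}^2$ consider the $\ell_\infty^2$ norm:

$$\|(x, y)\|_\infty = \max(|x|, |y|), \quad \text{if } x, y \in \mathbb{R},$$

and let D = {±e1, ±e2} be the dictionary.

Now for the vector x = (1, 1) we have

$$\inf_{a \ge 0,\, z \in D} \|x - a z\|_\infty = 1 = \|x\|_\infty.$$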
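The claim of Example 3.1.1 is easy to confirm by brute force. The sketch below assumes a finite grid of coefficients a ∈ [0, 5] in place of all a ≥ 0; the names `sup_norm` and `best_one_step` are illustrative.

```python
def sup_norm(v):
    return max(abs(c) for c in v)

def best_one_step(x, dictionary, coefficients):
    """Smallest value of ||x - a z||_inf over the given dictionary and coefficients."""
    return min(
        sup_norm([xi - a * zi for xi, zi in zip(x, z)])
        for z in dictionary
        for a in coefficients
    )

dictionary = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
x = (1.0, 1.0)
coefficients = [k / 100 for k in range(0, 501)]  # grid for a in [0, 5]
best = best_one_step(x, dictionary, coefficients)
```

Every dictionary element changes only one coordinate of x, so the other coordinate keeps absolute value 1 and no single step can reduce the norm.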

In order to avoid cases like in Example 3.1.1 we will assume that our space X is at least smooth:

Definition 3.1.2. A Banach space X is called smooth if for every x ∈ X thereis a unique support functional fx ∈ X∗, i.e. with ‖fx‖ = 1 and fx(x) = ‖x‖.

Remark. As shown for example in [Schl, Theorem 4.1.3], the assumption that X is smooth is equivalent to the condition that the norm is Gateaux differentiable on X \ {0}. In that case it follows for $x_0 \in X \setminus \{0\}$ that

$$f_{x_0}(y) = \frac{\partial}{\partial\lambda}\|x_0 + \lambda y\|\Big|_{\lambda=0} = \lim_{h \to 0} \frac{\|x_0 + h y\| - \|x_0\|}{h},$$

for all y ∈ S_X.

This implies that the X-greedy algorithm or the weak X-greedy algorithm cannot become stationary at a point $x_0 \ne 0$.

Indeed, if for all z ∈ D and all λ ∈ R

$$\|x_0 - \lambda z\| \ge \|x_0\|,$$

it follows that for all z ∈ D

$$f_{x_0}(z) = \lim_{h \to 0} \frac{\|x_0 + h z\| - \|x_0\|}{h} = 0.$$

Since span(D) is dense, this would mean that $f_{x_0} = 0$, which is a contradiction.

In the Hilbert space case minimizing ‖x − az‖ over all z ∈ D and all a ∈ R is equivalent to maximizing 〈x, z〉 over all z ∈ D. Generalizing this to Banach spaces will lead to a different algorithm, i.e. to an algorithm which does not coincide with (XGA).


(DGA) The Dual Greedy Algorithm.

For x ∈ X we define $G^d_n = G^d_n(x)$, for each n ∈ N0, by induction as follows. $G^d_0 = 0$ and assuming $G^d_0, G^d_1, \ldots, G^d_{n-1}$ have been defined,

1) choose zn ∈ D so that

$$f_{x - G^d_{n-1}}(z_n) = \sup_{z \in D} f_{x - G^d_{n-1}}(z),$$

2) and then an so that

$$\|(x - G^d_{n-1}) - a_n z_n\| = \min_{a \in \mathbb{R}} \|(x - G^d_{n-1}) - a z_n\|.$$

Then put $G^d_n = G^d_{n-1} + a_n z_n$.

Similar to XGA we can also define the weak version of (DGA) and denote it by(WDGA).

(XGDAR) The X-Greedy Dual Algorithm with relaxation.

We are given a sequence ρ = (rn) ⊂ [0, 1). For x ∈ X we define $G^r_n(\rho)$, for each n ∈ N0, by induction.

$G^r_0(\rho) = 0$ and assuming that $G^r_0(\rho), G^r_1(\rho), \ldots, G^r_{n-1}(\rho)$ have been defined for some n ∈ N we proceed as follows:

1) Choose zn ∈ D and an ∈ R so that

$$\|x - (1 - r_n) G^r_{n-1}(\rho) - a_n z_n\| = \inf_{z \in D,\, a \in \mathbb{R}} \|x - (1 - r_n) G^r_{n-1}(\rho) - a z\|.$$

2) Put $G^r_n(\rho) = (1 - r_n) G^r_{n-1}(\rho) + a_n z_n$.

The following algorithm is a generalization of the Orthogonal Greedy Algorithm for Hilbert spaces.

(CDGA) The Chebyshev Dual Greedy Algorithm.

For x ∈ X we define $G^C_n$, for each n ∈ N0, by induction.

$G^C_0 = 0$ and assuming that $G^C_0, G^C_1, \ldots, G^C_{n-1}$ and vectors $z_1, \ldots, z_{n-1}$ have been defined for some n ∈ N we proceed as follows:

1) Choose zn ∈ D so that

$$f_{x - G^C_{n-1}}(z_n) \ge \sup_{z \in D} f_{x - G^C_{n-1}}(z).$$

2) Define $Z_n = \mathrm{span}(z_1, z_2, \ldots, z_n)$ and let $G^C_n$ be the (or a) best approximation of x in $Z_n$.

Similar to (WXGA) there are weak versions of (XGDAR) and (CDGA), which we denote by (WXGDAR) and (WCDGA), respectively.


The following algorithm is of a different nature than the previous ones. Wewill assume that (ei) is a semi normalized basis of X.

Before discussing these algorithms and their convergence properties we will need to introduce the following strengthening of smoothness of Banach spaces.

Definition 3.1.3. For a Banach space X define the modulus of uniform smoothness by

$$(3.1)\qquad \rho(u) = \rho_X(u) = \sup_{x, y \in S_X} \Big(\frac{1}{2}\big(\|x + u y\| + \|x - u y\|\big) - 1\Big).$$

We say that X is uniformly smooth if

$$(3.2)\qquad \lim_{u \to 0} \frac{\rho(u)}{u} = 0.$$

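Remark. We will use the modulus of uniform smoothness as follows. For some $x, y \in X \setminus \{0\}$ (not necessarily in $S_X$) we would like to find an upper estimate for $\|x-y\|$. We write

\[ \|x \pm y\| = \|x\| \cdot \Big\| \frac{x}{\|x\|} \pm \frac{\|y\|}{\|x\|} \frac{y}{\|y\|} \Big\| \]

and deduce from (3.1), applied with $u = \|y\|/\|x\|$, that

\[ \|x-y\| \le 2\|x\| \Big( 1 + \rho\Big( \frac{\|y\|}{\|x\|} \Big) \Big) - \|x+y\|. \]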

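Example 3.1.4. As was shown in [Schl], the spaces $L_p[0,1]$ are uniformly smooth if $1 < p < \infty$; more precisely, for $X = L_p[0,1]$ and $u \ge 0$

\[ \rho_X(u) \le \begin{cases} c_p u^p & \text{if } 1 \le p \le 2, \\ (p-1)u^2/2 & \text{if } 2 \le p < \infty. \end{cases} \tag{3.3} \]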

Proposition 3.1.5. For any Banach space $X$, $\rho$ is a convex and even function and

\[ \max(0, u-1) \le \rho(u) \le u, \quad \text{for } u \ge 0. \]
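The modulus (3.1) can be explored numerically in a two dimensional space. The sketch below (not part of the notes) estimates $\rho(u)$ for $\ell_p^2$ by taking the supremum only over a finite grid on the unit sphere, so it produces a lower estimate of $\rho(u)$. The names `p_norm2`, `unit_sphere`, `modulus_estimate` and the grid size `m` are assumptions of this illustration. For $p = 2$ (Hilbert space) the supremum is attained for orthogonal $x, y$ and equals $\sqrt{1+u^2} - 1$, which the grid reproduces because it contains orthogonal pairs.

```python
import math

def p_norm2(v, p):
    """l_p norm of a vector in R^2."""
    return (abs(v[0]) ** p + abs(v[1]) ** p) ** (1.0 / p)

def unit_sphere(p, m=120):
    """m points on the unit sphere of l_p^2, obtained by normalizing
    equally spaced directions (m divisible by 4 keeps orthogonal pairs)."""
    pts = []
    for k in range(m):
        t = 2.0 * math.pi * k / m
        v = (math.cos(t), math.sin(t))
        n = p_norm2(v, p)
        pts.append((v[0] / n, v[1] / n))
    return pts

def modulus_estimate(u, p, m=120):
    """Grid estimate (from below) of rho(u) = sup over unit x, y of
    (||x + u y|| + ||x - u y||)/2 - 1 in l_p^2."""
    sphere = unit_sphere(p, m)
    best = 0.0
    for x in sphere:
        for y in sphere:
            plus = p_norm2((x[0] + u * y[0], x[1] + u * y[1]), p)
            minus = p_norm2((x[0] - u * y[0], x[1] - u * y[1]), p)
            best = max(best, 0.5 * (plus + minus) - 1.0)
    return best

# Usage: Hilbert space case and a check of the bounds of Proposition 3.1.5.
rho_hilbert = modulus_estimate(0.5, 2.0)
rho_l3_small = modulus_estimate(0.5, 3.0)
rho_l3_large = modulus_estimate(1.5, 3.0)
```

Since the grid contains the pairs $y = x$, the estimate always respects the lower bound $\max(0, u-1)$, and the upper bound $u$ follows from the triangle inequality.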

Lemma 3.1.6. Let $x, y \in X$ with $x \neq 0$. Then for $u \in \mathbb{R}$

\[ 0 \le \|x+uy\| - \|x\| - u f_x(y) \le 2\|x\| \rho\Big( \frac{u\|y\|}{\|x\|} \Big). \tag{3.4} \]

Proof. First we note that

\[ \|x+uy\| \ge f_x(x+uy) = \|x\| + u f_x(y), \]

which implies the first inequality in (3.4). Secondly, from the definition of $\rho(u)$, and assuming w.l.o.g. that $y \neq 0$, it follows that

\[ \|x+uy\| + \|x-uy\| = \|x\| \Big\| \frac{x}{\|x\|} + \frac{u\|y\|}{\|x\|} \frac{y}{\|y\|} \Big\| + \|x\| \Big\| \frac{x}{\|x\|} - \frac{u\|y\|}{\|x\|} \frac{y}{\|y\|} \Big\| \le 2\|x\| \Big( 1 + \rho\Big( \frac{u\|y\|}{\|x\|} \Big) \Big) \]

and

\[ \|x-uy\| \ge f_x(x-uy) = \|x\| - u f_x(y), \]

and, thus,

\[ \|x+uy\| \le 2\|x\| \Big( 1 + \rho\Big( \frac{u\|y\|}{\|x\|} \Big) \Big) - \|x-uy\| \le \|x\| + 2\|x\| \rho\Big( \frac{u\|y\|}{\|x\|} \Big) + u f_x(y), \]

which implies our claim.

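Corollary 3.1.7. If $X$ is a uniformly smooth Banach space and $x \in X \setminus \{0\}$, then

\[ f_x(y) = \Big( \frac{d}{du} \|x+uy\| \Big)(0) = \lim_{u\to 0} \frac{\|x+uy\| - \|x\|}{u}, \tag{3.5} \]

and this convergence is uniform in $x, y \in S_X$. The norm is therefore Fréchet differentiable.

Proof. Note that by (3.4), for $u \neq 0$,

\[ \Big| \frac{\|x+uy\| - \|x\|}{u} - f_x(y) \Big| \le \frac{2\|x\|}{|u|} \rho\Big( \frac{u\|y\|}{\|x\|} \Big) \to 0 \quad \text{as } u \to 0. \]

If $\|y\| = 1$ and $\|x\| \ge \varepsilon > 0$ it follows therefore, since $v \mapsto \rho(v)/v$ is increasing on $(0,\infty)$ ($\rho$ is convex with $\rho(0) = 0$), that

\[ \Big| \frac{\|x+uy\| - \|x\|}{u} - f_x(y) \Big| \le \frac{2\|x\|}{|u|} \rho\Big( \frac{|u|}{\|x\|} \Big) \le 2\, \frac{\rho(|u|/\varepsilon)}{|u|/\varepsilon} \to 0 \quad \text{as } u \to 0, \]

which implies the claimed uniform convergence.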

3.2 Convergence of the Weak Dual Chebyshev Greedy Algorithm

Recall the Weak Dual Chebyshev Greedy Algorithm: We are given a sequence of weakness factors $\tau = (t_n) \subset (0, 1]$ and a dictionary $D \subset X$. For $x \in X$ we choose $(z_n) \subset D$ and $G^c_n = G^c_n(x)$ as follows. $G^c_0 = 0$ and, assuming $G^c_j$, $j = 0, 1, \ldots, n-1$, and $z_j$, $j = 1, 2, \ldots, n-1$, have been chosen, we let $z_n \in D$ so that

\[ f_{x-G^c_{n-1}}(z_n) \ge t_n \sup_{z\in D} f_{x-G^c_{n-1}}(z). \]

Then let $Z_n = \mathrm{span}(z_1, z_2, \ldots, z_n)$ and let $G^c_n$ be the (or a) best approximation of $x$ inside $Z_n$.

The main goal of this section is to prove two results by Temliakov. We will need the following technical definition first.

Definition 3.2.1. Let $\rho$ be an even convex function on $[-2, 2]$ with $\lim_{u\to 0} \rho(u)/u = 0$ and $\rho(2) \ge 1$, let $\tau = (t_n)$ be a sequence of numbers in $(0, 1]$ and $\Theta \in (0, 1/2]$. Then let $\xi_m = \xi_m(\rho, \tau, \Theta) > 0$ be the (by the Intermediate Value Theorem uniquely existing) number for which

\[ \rho(\xi_m) = \Theta t_m \xi_m. \tag{3.6} \]

Theorem 3.2.2. Let $X$ be a uniformly smooth Banach space and $\rho$ its modulus of uniform smoothness, and let $\tau = (t_n)$ be a sequence of numbers in $(0, 1]$. Assume that for any $\Theta > 0$ we have

\[ \sum_{m=1}^{\infty} t_m \xi_m(\rho, \tau, \Theta) = \infty. \]

We consider the Weak Chebyshev Dual Greedy Algorithm (WCDGA). This means that for $x \in X$, $G^c_n = G^c_n(x)$ and $z_n \in D$ are such that

\[ f_{x-G^c_{n-1}}(z_n) \ge t_n \sup_{z\in D} f_{x-G^c_{n-1}}(z), \]

and $G^c_n$ is the best approximation of $x$ from $Z_n = \mathrm{span}(z_1, z_2, \ldots, z_n)$. Let $x_n = x - G^c_n$, for $n \in \mathbb{N}$. Then $\lim_{n\to\infty} \|x_n\| = 0$.

Theorem 3.2.3. Let $X$ be a uniformly smooth Banach space and assume that its modulus of uniform smoothness $\rho$ satisfies $\rho(u) \le \gamma u^q$ for some $q \in (1, 2]$ and $\gamma \ge 1$. Let $\tau = (t_n)$ be a sequence of numbers in $(0, 1]$. For $x \in X$, assume that $x \in A_1(D)$ and let $G^c_n = G^c_n(x)$ be defined as in Theorem 3.2.2. Then there is a constant $C(q, \gamma)$, only dependent on $q$ and $\gamma$, so that for all $n \in \mathbb{N}$

\[ \|x - G^c_n\| \le C(q, \gamma) \|x\|_{A_1(D)} \Big( 1 + \sum_{k=1}^{n} t_k^p \Big)^{-1/p}, \tag{3.7} \]

where $p = q/(q-1)$.

We will first need some lemmas.

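Lemma 3.2.4. Let $X$ be a uniformly smooth Banach space and let $Z \subset X$ be a finite dimensional subspace. If $y$ is the best approximation of some $x \in X \setminus Z$ from $Z$ then

\[ f_{x-y}(z) = 0, \quad \text{for all } z \in Z. \]

Proof. Assume to the contrary that there is a $z \in Z$, $\|z\| = 1$, so that $\beta = f_{x-y}(z) > 0$ (replace $z$ by $-z$ if necessary). By the definition of $\rho(u)$ it follows for any $\lambda > 0$ that

\[ \|x-y-\lambda z\| + \|x-y+\lambda z\| \le 2\|x-y\| \Big( 1 + \rho\Big( \frac{\lambda}{\|x-y\|} \Big) \Big) \tag{3.8} \]

and secondly

\[ \|x-y+\lambda z\| \ge f_{x-y}(x-y+\lambda z) = \|x-y\| + \lambda\beta. \tag{3.9} \]

It follows therefore from (3.8) and (3.9) that

\[ \|x-y-\lambda z\| \le 2\|x-y\| \Big( 1 + \rho\Big( \frac{\lambda}{\|x-y\|} \Big) \Big) - \|x-y+\lambda z\| \le \|x-y\| + 2\|x-y\| \rho\Big( \frac{\lambda}{\|x-y\|} \Big) - \lambda\beta = \|x-y\| - \lambda \Big( \beta - 2\, \frac{\rho(\lambda/\|x-y\|)}{\lambda/\|x-y\|} \Big). \]

Since $\rho(u)/u \to 0$ as $u \to 0$, it follows for sufficiently small $\lambda > 0$ that $\|x - (y + \lambda z)\| < \|x-y\|$, which is a contradiction since $y + \lambda z \in Z$.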

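Lemma 3.2.5. For any $x^* \in X^*$ we have

\[ \sup_{z\in D} x^*(z) = \sup_{x\in A_1(D),\, \|x\|_{A_1} \le 1} x^*(x). \tag{3.10} \]

Proof. For $x = \sum_{z\in D} c_z z$, with $c_z \ge 0$ for $z \in D$ and $\sum_{z\in D} c_z \le 1$, it follows that

\[ x^*(x) = \sum_{z\in D} c_z x^*(z) \le \sup_{z\in D} x^*(z), \]

and thus

\[ \sup_{z\in D} x^*(z) \ge \sup_{x\in A_1(D),\, \|x\|_{A_1} \le 1} x^*(x). \]

The reverse inequality is trivial.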

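Lemma 3.2.6. Let $X$ be a uniformly smooth Banach space and $\rho$ its modulus of uniform smoothness, and let $\tau = (t_n)$ be a sequence of numbers in $(0, 1]$. For $x \in X$ let $G^c_n = G^c_n(x)$ be defined as in (WCDGA) and put $x_n = x - G^c_n$.

Assume that $x \in X$ and that for some $\varepsilon \ge 0$ there is an $x_\varepsilon \in A_1(D)$ so that $\|x - x_\varepsilon\| \le \varepsilon$. Then it follows for all $n \in \mathbb{N}$

\[ \frac{\|x_n\|}{\|x_{n-1}\|} \le \inf_{\lambda\ge 0} \Big[ 1 - \frac{\lambda t_n}{\|x_\varepsilon\|_{A_1}} \Big( 1 - \frac{\varepsilon}{\|x_{n-1}\|} \Big) + 2\rho\Big( \frac{\lambda}{\|x_{n-1}\|} \Big) \Big]. \tag{3.11} \]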


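Proof. Abbreviate $A = \|x_\varepsilon\|_{A_1}$. Let $z_n \in D$ be chosen as in (WCDGA) and put $x_n = x - G^c_n$, for $n \in \mathbb{N}$.

From the definition of $\rho$, it follows for every $\lambda \ge 0$ that

\[ \|x_{n-1} - \lambda z_n\| + \|x_{n-1} + \lambda z_n\| \le 2\|x_{n-1}\| \Big( 1 + \rho\Big( \frac{\lambda}{\|x_{n-1}\|} \Big) \Big), \tag{3.12} \]

and by the definition of (WCDGA) and Lemma 3.2.5 it follows that

\[ f_{x_{n-1}}(z_n) \ge t_n \sup_{z\in D} f_{x_{n-1}}(z) = t_n \sup_{z\in A_1(D),\, \|z\|_{A_1} \le 1} f_{x_{n-1}}(z) \ge \frac{t_n}{A} f_{x_{n-1}}(x_\varepsilon). \]

From Lemma 3.2.4 (which yields $f_{x_{n-1}}(G^c_{n-1}) = 0$) we deduce that

\[ f_{x_{n-1}}(x_\varepsilon) = f_{x_{n-1}}(x + x_\varepsilon - x) \ge f_{x_{n-1}}(x) - \varepsilon = f_{x_{n-1}}(x_{n-1}) - \varepsilon = \|x_{n-1}\| - \varepsilon, \]

and thus

\[ \|x_{n-1} + \lambda z_n\| \ge \|x_{n-1}\| + \lambda f_{x_{n-1}}(z_n) \ge \|x_{n-1}\| + \frac{\lambda t_n}{A} f_{x_{n-1}}(x_\varepsilon) \ge \|x_{n-1}\| \Big( 1 + \frac{\lambda t_n}{A} \Big) - \frac{\lambda t_n}{A} \varepsilon. \]

Finally (3.12) yields for every $\lambda \ge 0$

\[ \|x_n\| \le \|x_{n-1} - \lambda z_n\| \le 2\|x_{n-1}\| \Big( 1 + \rho\Big( \frac{\lambda}{\|x_{n-1}\|} \Big) \Big) - \|x_{n-1} + \lambda z_n\| \]

\[ \le \|x_{n-1}\| \Big( 1 + 2\rho\Big( \frac{\lambda}{\|x_{n-1}\|} \Big) - \frac{\lambda t_n}{A} + \frac{\lambda t_n}{A \|x_{n-1}\|} \varepsilon \Big) = \|x_{n-1}\| \Big( 1 + 2\rho\Big( \frac{\lambda}{\|x_{n-1}\|} \Big) - \frac{\lambda t_n}{A} \Big( 1 - \frac{\varepsilon}{\|x_{n-1}\|} \Big) \Big), \]

which proves our assertion after taking the infimum over $\lambda \ge 0$.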

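Proof of Theorem 3.2.2. Let $x_n = x - G^c_n$, for $n \in \mathbb{N}$, and $x_0 = x$. By construction, the sequence $(\|x_n\| : n \in \mathbb{N})$ is decreasing and thus $\alpha = \lim_{n\to\infty} \|x_n\|$ exists, and we have to show that $\alpha = 0$.

We assume that $\alpha > 0$ and will deduce a contradiction. Let $\varepsilon = \alpha/2$. Since $\mathrm{span}(D)$ is dense in $X$ we find an $x_\varepsilon \in \mathrm{span}(D)$ so that $\|x - x_\varepsilon\| < \varepsilon$, and denote $A = \|x_\varepsilon\|_{A_1}$.

From Lemma 3.2.6, the inequality $\|x_{n-1}\| \ge \alpha = 2\varepsilon$ and the monotonicity of $\rho$ on $[0, \infty)$ we deduce that

\[ \|x_n\| \le \|x_{n-1}\| \inf_{\lambda\ge 0} \Big[ 1 + 2\rho\Big( \frac{\lambda}{\|x_{n-1}\|} \Big) - \frac{\lambda t_n}{A} \Big( 1 - \frac{\varepsilon}{\|x_{n-1}\|} \Big) \Big] \le \|x_{n-1}\| \inf_{\lambda\ge 0} \Big[ 1 + 2\rho\Big( \frac{\lambda}{\alpha} \Big) - \frac{\lambda t_n}{2A} \Big]. \]

We let $\Theta = \alpha/8A$ and take $\lambda = \alpha \xi_n(\rho, \tau, \Theta)$ (recall that this means that $\rho(\xi_n) = \Theta t_n \xi_n$) and obtain that

\[ \|x_n\| \le \|x_{n-1}\| \Big[ 1 + 2\Theta t_n \xi_n - \frac{\alpha t_n \xi_n}{2A} \Big] = \|x_{n-1}\| \big[ 1 - 2\Theta t_n \xi_n \big], \]

and thus

\[ \|x_n\| = \|x\| + \sum_{j=1}^{n} \big( \|x_j\| - \|x_{j-1}\| \big) \le \|x\| - \sum_{j=1}^{n} \|x_{j-1}\|\, 2\Theta t_j \xi_j \le \|x\| - 2\alpha\Theta \sum_{j=1}^{n} t_j \xi_j. \]

Since $\|x_n\| \ge 0$ for all $n$, this contradicts our assumption that $\sum_n t_n \xi_n = \infty$.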

Before proving Theorem 3.2.3 we need one more lemma.

Lemma 3.2.7. Assume that $(a_n)$ and $(s_n)$ are sequences of positive numbers satisfying for some $A > 0$ the following assumption:

\[ a_1 < \frac{A}{1 + s_1} \quad \text{and} \quad a_n \le a_{n-1} \Big( 1 - \frac{s_n}{A} a_{n-1} \Big). \tag{3.13} \]

Then it follows for all $n \in \mathbb{N}$

\[ a_n \le A \Big( 1 + \sum_{j=1}^{n} s_j \Big)^{-1}. \tag{3.14} \]

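Proof. We prove (3.14) by induction. For $n = 1$ this is the assumption. Assume that our claim is true for $n-1$. Since $a_n > 0$, (3.13) implies that $0 < 1 - s_n a_{n-1}/A < 1$, and using $(1-t)^{-1} \ge 1 + t$ for $0 \le t < 1$ we obtain

\[ \frac{1}{a_n} \ge \frac{1}{a_{n-1}} \Big( 1 - \frac{s_n}{A} a_{n-1} \Big)^{-1} \ge \frac{1}{a_{n-1}} + \frac{s_n}{A} \ge \frac{1}{A} \Big( 1 + \sum_{j=1}^{n-1} s_j \Big) + \frac{s_n}{A} = \frac{1}{A} \Big( 1 + \sum_{j=1}^{n} s_j \Big), \]

which is (3.14) for $n$.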
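As a numerical sanity check of Lemma 3.2.7 (a sketch, not part of the notes), one can iterate the recursion (3.13) with equality and compare $a_n$ with the bound (3.14). The function name `check_lemma`, the random weights and the choice $A = 3$ are assumptions of this illustration; the weights are taken in $(0, 1]$, as in the application $s_n = t_n^p$ below, which keeps all $a_n$ positive.

```python
import random

def check_lemma(A, s, a1):
    """Iterate a_n = a_{n-1} * (1 - s_n * a_{n-1} / A), the equality case
    of (3.13), and report whether 0 < a_n <= A / (1 + s_1 + ... + s_n)
    holds for every n (up to a tiny relative rounding tolerance)."""
    assert 0.0 < a1 < A / (1.0 + s[0])
    a, partial = a1, s[0]
    ok = a <= A / (1.0 + partial)
    for sn in s[1:]:
        a = a * (1.0 - sn * a / A)
        partial += sn
        ok = ok and (0.0 < a <= A / (1.0 + partial) * (1.0 + 1e-12))
    return ok

# Usage: 1000 random weights s_n in (0, 1], a_1 just below A / (1 + s_1).
rng = random.Random(0)
s = [rng.uniform(0.01, 1.0) for _ in range(1000)]
result = check_lemma(A=3.0, s=s, a1=0.999 * 3.0 / (1.0 + s[0]))
```

Such an experiment does not replace the induction argument, it only illustrates that the bound $A(1 + \sum_{j\le n} s_j)^{-1}$ is respected along the whole sequence.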

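Proof of Theorem 3.2.3. W.l.o.g. we assume $\|x\|_{A_1} = 1$. Let the sequences $(z_n)$ and $(G^c_n) = (G^c_n(x))$ be given as in (WCDGA) and let $x_n = x - G^c_n$, for $n \in \mathbb{N}$. By Lemma 3.2.6 with $\varepsilon = 0$ and the assumption $\rho(u) \le \gamma u^q$ we obtain

\[ \|x_n\| \le \|x_{n-1}\| \inf_{\lambda\ge 0} \Big[ 1 - \lambda t_n + 2\gamma \Big( \frac{\lambda}{\|x_{n-1}\|} \Big)^q \Big]. \tag{3.15} \]

We choose $\lambda$ such that

\[ \frac12 \lambda t_n = 2\gamma \Big( \frac{\lambda}{\|x_{n-1}\|} \Big)^q, \]

or

\[ \lambda = \|x_{n-1}\|^{q/(q-1)} (4\gamma)^{-1/(q-1)} t_n^{1/(q-1)}. \]

Abbreviating $A_q = 2(4\gamma)^{1/(q-1)}$ and $p = q/(q-1)$, and inserting the choice of $\lambda$ into (3.15), we obtain

\[ \|x_n\| \le \|x_{n-1}\| \Big( 1 - \frac12 \lambda t_n \Big) = \|x_{n-1}\| \big( 1 - \|x_{n-1}\|^p t_n^p / A_q \big). \]

Taking the $p$th power on each side and using the fact that $y^p \le y$ if $0 \le y \le 1$, we obtain

\[ \|x_n\|^p \le \|x_{n-1}\|^p \big( 1 - \|x_{n-1}\|^p t_n^p / A_q \big). \]

Since $\gamma \ge 1$ and thus $A_q > 2$, and since $\|x\| \le \|x\|_{A_1} = 1$, we can apply Lemma 3.2.7 to $a_n = \|x_n\|^p$, $s_n = t_n^p$ and $A = A_q$, and deduce that for all $n \in \mathbb{N}$

\[ \|x_n\|^p \le A_q \Big( 1 + \sum_{j=1}^{n} t_j^p \Big)^{-1}, \]

which implies the claim of our Theorem.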

3.3 Weak Dual Greedy Algorithm with Relaxation

For the Weak Chebyshev Dual Greedy Algorithm we need to approximate $x$ by an element of the $n$ dimensional space $Z_n$, which might be computationally complicated. The following algorithm is a compromise between the Chebyshev Dual Greedy Algorithm and the Dual Greedy Algorithm. Here we only have to find a good approximation from a two dimensional subspace:

(WDGAFR) The weak dual greedy algorithm with free relaxation. As usual we are given a sequence of weakness factors $\tau = (t_n) \subset (0, 1]$ and a dictionary $D \subset S_X$. For $x \in X$ we define $G^r_n = G^r_n(x)$, $n \in \mathbb{N}$, as follows: $G^r_0 = 0$ and, assuming $G^r_{n-1}$ has been defined, we choose $z_n \in D$ so that

\[ f_{x-G^r_{n-1}}(z_n) \ge t_n \sup_{z\in D} f_{x-G^r_{n-1}}(z), \]

and then we let $w_n$ and $\lambda_n$ be so that

\[ \big\| x - (1-w_n) G^r_{n-1} - \lambda_n z_n \big\| \le \inf_{\lambda, w} \big\| x - (1-w) G^r_{n-1} - \lambda z_n \big\|, \]

and define $G^r_n = (1-w_n) G^r_{n-1} + \lambda_n z_n$.

Proposition 3.3.1. For all $x \in X$, $\|x - G^r_n(x)\|$ is decreasing in $n \in \mathbb{N}$.

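We will need the following analog to Lemma 3.2.6.

Lemma 3.3.2. Assume that $X$ is a uniformly smooth Banach space and denote its modulus of uniform smoothness by $\rho$. Let $x \in X$, $\|x\| \ge \varepsilon \ge 0$ and $x_\varepsilon \in A_1(D)$, so that $\|x - x_\varepsilon\| \le \varepsilon$. For $n \in \mathbb{N}$ put $x_n = x - G^{dr}_n$, where $G^{dr}_n = G^r_n(x)$ denotes the $n$th approximant of (WDGAFR). Then

\[ \frac{\|x_n\|}{\|x_{n-1}\|} \le \inf_{\lambda\ge 0} \Big( 1 - \frac{\lambda t_n}{\|x_\varepsilon\|_{A_1}} \Big( 1 - \frac{\varepsilon}{\|x_{n-1}\|} \Big) + 2\rho\Big( \frac{5\lambda}{\|x_{n-1}\|} \Big) \Big). \tag{3.16} \]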

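Proof. From the definition of $\rho$ we deduce for $w \in \mathbb{R}$ and $\lambda \ge 0$ that

\[ \|x_{n-1} + w G^{dr}_{n-1} - \lambda z_n\| + \|x_{n-1} - w G^{dr}_{n-1} + \lambda z_n\| \le 2\|x_{n-1}\| \Big( 1 + \rho\Big( \frac{\|w G^{dr}_{n-1} - \lambda z_n\|}{\|x_{n-1}\|} \Big) \Big). \tag{3.17} \]

For all $w \in \mathbb{R}$ and $\lambda \ge 0$ we estimate

\[ \|x_{n-1} - w G^{dr}_{n-1} + \lambda z_n\| \ge f_{x_{n-1}}(x_{n-1} - w G^{dr}_{n-1} + \lambda z_n) \tag{3.18} \]

\[ \ge \|x_{n-1}\| - f_{x_{n-1}}(w G^{dr}_{n-1}) + \lambda t_n \sup_{z\in D} f_{x_{n-1}}(z) \]

\[ = \|x_{n-1}\| - f_{x_{n-1}}(w G^{dr}_{n-1}) + \lambda t_n \sup_{z\in A_1,\, \|z\|_{A_1} \le 1} f_{x_{n-1}}(z) \quad \text{(by Lemma 3.2.5)} \]

\[ \ge \|x_{n-1}\| - f_{x_{n-1}}(w G^{dr}_{n-1}) + \frac{\lambda t_n}{\|x_\varepsilon\|_{A_1}} f_{x_{n-1}}(x_\varepsilon) \]

\[ \ge \|x_{n-1}\| - f_{x_{n-1}}(w G^{dr}_{n-1}) + \frac{\lambda t_n}{\|x_\varepsilon\|_{A_1}} f_{x_{n-1}}(x) - \frac{\lambda t_n \varepsilon}{\|x_\varepsilon\|_{A_1}}. \]

Letting $w^* = \lambda t_n / \|x_\varepsilon\|_{A_1}$ we deduce that

\[ \|x_{n-1} - w^* G^{dr}_{n-1} + \lambda z_n\| \ge \|x_{n-1}\| + \frac{\lambda t_n}{\|x_\varepsilon\|_{A_1}} f_{x_{n-1}}(x - G^{dr}_{n-1}) - \frac{\lambda t_n \varepsilon}{\|x_\varepsilon\|_{A_1}} = \|x_{n-1}\| + \frac{\lambda t_n}{\|x_\varepsilon\|_{A_1}} \|x_{n-1}\| - \frac{\lambda t_n \varepsilon}{\|x_\varepsilon\|_{A_1}}. \tag{3.19} \]

Thus we obtain from (3.17) and (3.19)

\[ \|x_n\| \le \inf_{\lambda\ge 0,\, w\in\mathbb{R}} \|x_{n-1} + w G^{dr}_{n-1} - \lambda z_n\| \]

\[ \le \inf_{\lambda\ge 0,\, w\in\mathbb{R}} \Big[ 2\|x_{n-1}\| \Big( 1 + \rho\Big( \frac{\|w G^{dr}_{n-1} - \lambda z_n\|}{\|x_{n-1}\|} \Big) \Big) - \|x_{n-1} - w G^{dr}_{n-1} + \lambda z_n\| \Big] \]

\[ \le \inf_{\lambda\ge 0} \Big[ 2\|x_{n-1}\| \Big( 1 + \rho\Big( \frac{\|w^* G^{dr}_{n-1} - \lambda z_n\|}{\|x_{n-1}\|} \Big) \Big) - \Big( \|x_{n-1}\| + \frac{\lambda t_n}{\|x_\varepsilon\|_{A_1}} \|x_{n-1}\| - \frac{\lambda t_n \varepsilon}{\|x_\varepsilon\|_{A_1}} \Big) \Big] \]

\[ = \|x_{n-1}\| \inf_{\lambda\ge 0} \Big[ 1 - \frac{\lambda t_n}{\|x_\varepsilon\|_{A_1}} \Big( 1 - \frac{\varepsilon}{\|x_{n-1}\|} \Big) + 2\rho\Big( \frac{\|w^* G^{dr}_{n-1} - \lambda z_n\|}{\|x_{n-1}\|} \Big) \Big]. \]

In order to achieve (3.16) we need to estimate $\|w^* G^{dr}_{n-1} - \lambda z_n\|$. First we note that

\[ \|G^{dr}_{n-1}\| = \|x - x_{n-1}\| \le 2\|x\| \le 2\|x_\varepsilon\|_{A_1} + 2\varepsilon \le 4\|x_\varepsilon\|_{A_1} \]

and thus

\[ \|w^* G^{dr}_{n-1} - \lambda z_n\| \le 4 w^* \|x_\varepsilon\|_{A_1} + \lambda = 4\lambda t_n + \lambda \le 5\lambda, \]

which implies our claim since $\rho(\cdot)$ is increasing on $[0, \infty)$.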

Remark. Before stating the next Theorem let us note that if $\rho$ is an even and convex function on $\mathbb{R}$, with $\rho(0) = 0$ and $\lim_{u\to 0} \rho(u)/u = 0$, then the function $s : u \mapsto \rho(u)/u$ is increasing on $[0, \infty)$, thus has an inverse function $s^{-1}$ which is also increasing, and $s^{-1}(0) = 0$.

Theorem 3.3.3. Assume that $X$ is a separable and uniformly smooth Banach space, and denote its modulus of uniform smoothness by $\rho$. Let $s^{-1}(\cdot)$ be the inverse function of $s : u \mapsto \rho(u)/u$, $u \ge 0$. We consider the (WDGAFR) with a sequence of weakness factors $\tau = (t_n) \subset (0, 1]$ satisfying

\[ \sum_{n=1}^{\infty} t_n s^{-1}(\Theta t_n) = \infty \quad \text{for all } \Theta > 0. \tag{3.20} \]

Then for any $x \in X$

\[ \lim_{n\to\infty} \|x - G^{dr}_n(x)\| = 0. \tag{3.21} \]

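Proof. Let $G^{dr}_n = G^{dr}_n(x)$ and $x_n = x - G^{dr}_n$, for $n \in \mathbb{N}$. Since $\|x_n\|$ decreases,

\[ \beta = \lim_{n\to\infty} \|x_n\| \]

exists and we need to show that $\beta = 0$. We assume that $\beta > 0$ and will derive a contradiction. We set $\varepsilon = \beta/2$, choose $x_\varepsilon \in A_1(D)$ with $\|x - x_\varepsilon\| < \varepsilon$ and abbreviate $A = \|x_\varepsilon\|_{A_1}$. Note that $\|x\| \ge \beta \ge \varepsilon$. By Lemma 3.3.2, and since $\|x_{n-1}\| \ge \beta = 2\varepsilon$, we have

\[ \|x_n\| \le \|x_{n-1}\| \inf_{\lambda\ge 0} \Big( 1 - \frac{\lambda t_n}{2A} + 2\rho\Big( \frac{5\lambda}{\beta} \Big) \Big). \]

Putting $\Theta = \beta/40A$ and $\lambda = \beta s^{-1}(\Theta t_n)/5$, we obtain

\[ \|x_n\| \le \|x_{n-1}\| \Big( 1 - \frac{\beta t_n s^{-1}(\Theta t_n)}{10A} + 2\rho\big( s^{-1}(\Theta t_n) \big) \Big) \]

\[ = \|x_{n-1}\| \Big( 1 - \frac{\beta t_n s^{-1}(\Theta t_n)}{10A} + 2 s^{-1}(\Theta t_n)\, \frac{\rho\big( s^{-1}(\Theta t_n) \big)}{s^{-1}(\Theta t_n)} \Big) \]

\[ = \|x_{n-1}\| \big( 1 - 4\Theta t_n s^{-1}(\Theta t_n) + 2 s^{-1}(\Theta t_n) \Theta t_n \big) = \|x_{n-1}\| \big( 1 - 2\Theta t_n s^{-1}(\Theta t_n) \big). \]

We can iterate this inequality and obtain

\[ \|x_n\| \le \|x_{n-1}\| - \|x_{n-1}\|\, 2\Theta t_n s^{-1}(\Theta t_n) \le \|x_{n-1}\| - 2\beta\Theta t_n s^{-1}(\Theta t_n) \]

\[ \le \|x_{n-2}\| - 2\beta\Theta t_{n-1} s^{-1}(\Theta t_{n-1}) - 2\beta\Theta t_n s^{-1}(\Theta t_n) \le \ldots \le \|x\| - 2\beta\Theta \sum_{j=1}^{n} t_j s^{-1}(\Theta t_j), \]

and thus, letting $n \to \infty$,

\[ \beta \le \|x\| - 2\beta\Theta \sum_{j=1}^{\infty} t_j s^{-1}(\Theta t_j), \]

which is a contradiction since we assumed that $\sum_{j=1}^{\infty} t_j s^{-1}(\Theta t_j) = \infty$.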

There is also a result on the rate of the convergence in Theorem 3.3.3. Since the proof is similar to the proof of the corresponding result for the Chebyshev Greedy Algorithm we omit it.

Theorem 3.3.4. Let $X$ be a uniformly smooth Banach space with modulus of uniform smoothness $\rho$, which satisfies for some $q \in (1, 2]$

\[ \rho(u) \le \gamma u^q. \tag{3.22} \]

Then there is a number $C$ only depending on $q$ and $\gamma$ so that the following holds. If $x \in X$, $\varepsilon > 0$ and $x_\varepsilon \in A_1$ are such that $\|x - x_\varepsilon\| < \varepsilon$, and if $\tau = (t_n) \subset (0, 1]$, then it follows for the (WDGAFR) $(G^{dr}_n) = (G^{dr}_n(x))$ with weakness factors $(t_n)$ that

\[ \|x - G^{dr}_n\| \le \max\Big( 2\varepsilon,\; C(\|x_\varepsilon\|_{A_1} + \varepsilon) \Big( 1 + \sum_{j=1}^{n} t_j^p \Big)^{-1/p} \Big), \tag{3.23} \]

where $p = q/(q-1)$.

Remark. We would like to point out something about the proof of Theorem 3.3.4 which will be useful when we consider the next greedy algorithm.

In the proof of Theorem 3.3.4 it was only used that the sequence (‖xn‖ : n∈N) is decreasing and that the inequality (3.16) holds. Of course, in order to prove this inequality we needed the specific choice of zn, namely in the second “≥” of (3.18).

Thus, if Gn(x) is any algorithm for which ‖x − Gn(x)‖ is decreasing and which satisfies inequality (3.16), then Gn(x) converges to x.

Keeping that remark in mind we now turn to the X-Greedy Algorithm with Free Relaxation.

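(XGAFR) The X-Greedy Algorithm with Free Relaxation. As usual we are given a dictionary $D \subset S_X$. For $x \in X$ we define $G^r_n = G^r_n(x)$, $n \in \mathbb{N}$, as follows: $G^r_0 = 0$ and, assuming $G^r_{n-1}$ has been defined, we choose $\lambda_n \ge 0$, $w_n \in \mathbb{R}$ and $z_n \in D$ so that

\[ \|x - (1-w_n) G^r_{n-1} - \lambda_n z_n\| = \inf_{z\in D,\, \lambda\ge 0,\, w\in\mathbb{R}} \|x - (1-w) G^r_{n-1} - \lambda z\| \]

and then define

\[ G^r_n = (1-w_n) G^r_{n-1} + \lambda_n z_n. \]

We notice that at each step the value of

\[ \frac{\|x - G^r_n(x)\|}{\|x - G^r_{n-1}(x)\|} \]

is at most as large as the value we would have obtained if we had computed $G^{dr}_n(x)$ from $x_{n-1}$. We deduce therefore that the conclusion of Lemma 3.3.2 still holds and that, if $0 \le \varepsilon \le \|x\|$ and if $x_\varepsilon \in A_1$ with $\|x - x_\varepsilon\| \le \varepsilon$, it follows that

\[ \frac{\|x - G^r_n(x)\|}{\|x - G^r_{n-1}(x)\|} \le \inf_{\lambda\ge 0} \Big( 1 - \frac{\lambda t_n}{\|x_\varepsilon\|_{A_1}} \Big( 1 - \frac{\varepsilon}{\|x_{n-1}\|} \Big) + 2\rho\Big( \frac{5\lambda}{\|x_{n-1}\|} \Big) \Big). \tag{3.24} \]

From the previously made remark we deduce therefore the following convergence result.

Theorem 3.3.5. Assume that $X$ is a separable and uniformly smooth Banach space. We consider the (XGAFR) $(G^r_n(x) : n \in \mathbb{N})$, for $x \in X$. Then for any $x \in X$

\[ \lim_{n\to\infty} \|x - G^r_n(x)\| = 0. \tag{3.25} \]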

3.4 Convergence Theorem for the Weak Dual Algorithm

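For a Banach space $X$ assume that $f_{(\cdot)} : X \setminus \{0\} \to S_{X^*}$, $x \mapsto f_x$, is a support map, i.e. for every $x \in X \setminus \{0\}$ we have $f_x(x) = \|x\|$. Recall from the remark after Definition 3.1.2 that every $x \in X \setminus \{0\}$ has a unique support functional $f_x$ if and only if the norm is Gateaux differentiable, and that in this case, for $x_0 \in X \setminus \{0\}$,

\[ f_{x_0}(y) = \frac{\partial}{\partial\lambda} \|x_0 + \lambda y\| \Big|_{\lambda=0} = \lim_{h\to 0} \frac{\|x_0 + hy\| - \|x_0\|}{h} \quad \text{for all } y \in S_X, \]

in accordance with (3.5).

Let $D \subset S_X$ be a dictionary for $X$ and put for $x \in X \setminus \{0\}$:

\[ \rho(x) = \rho_D(x) = \sup_{z\in D} f_x(z). \]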

We consider the Weak Dual Greedy Algorithm with fixed weakness factor as in Section 3.1, but slightly reformulated:

(WDGA) Fix $c \in (0, 1)$. For $x \in X$ we choose sequences $(x_n)_{n\ge 0} \subset X$, $(z_n)_{n\ge 1} \subset D$ and $(t_n) \subset [0, \infty)$ recursively as follows.

$x_0 = x$, $G_0 = 0$ and, assuming that $x_{n-1} \in X$ has been chosen, we choose $z_n \in D$ arbitrary and $t_n = 0$ if $x_{n-1} = 0$, and otherwise we choose $z_n \in D$ and $t_n \ge 0$ so that

a) $f_{x_{n-1}}(z_n) \ge c\rho_D(x_{n-1}) = c \sup_{z\in D} f_{x_{n-1}}(z)$,

b) $\|x_{n-1} - t_n z_n\| = \min_{t\ge 0} \|x_{n-1} - t z_n\|$,

and in both cases we finally let

c) $G_n = G_{n-1} + t_n z_n$ and $x_n = x - G_n = x_{n-1} - t_n z_n$.

We say that the weak dual greedy algorithm converges for $D$ if for all $x \in X$

\[ \lim_{n\to\infty} x_n = 0 \quad \text{or, equivalently,} \quad \lim_{n\to\infty} \sum_{i=1}^{n} t_i z_i = x. \]

Lemma 3.4.1. Let $X$ have Gateaux differentiable norm, $0 < c < 1$, and assume that for $x \in X$ the sequences $(x_n)$, $(t_n)$ and $(z_n)$ satisfy (a) and (c) of (WDGA), but instead of (b) the following condition:

d) $\dfrac{\|x_{n-1}\| - \|x_n\|}{t_n} \ge c\rho(x_{n-1})$ for all $n \in \mathbb{N}$.

Then, if $\sum_{n=1}^{\infty} t_n = \infty$, we have $x = \sum_{n=1}^{\infty} t_n z_n$.


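Proof. Define $s_n = \sum_{i=1}^{n} t_i$, for $n \in \mathbb{N}$. Then

\[ \exp\Big( \sum_{j=2}^{n} \ln\big( (s_j - t_j)/s_j \big) \Big) = \prod_{j=2}^{n} \frac{s_{j-1}}{s_j} = \frac{s_1}{s_n} \to 0 \quad \text{as } n \to \infty, \]

and thus

\[ \lim_{n\to\infty} \sum_{j=2}^{n} \ln\big( (s_j - t_j)/s_j \big) = -\infty. \]

If $t_j/s_j$ does not tend to $0$, then trivially $\sum_{j=2}^{\infty} t_j/s_j = \infty$. Otherwise $t_j/s_j \le 1/2$ for all large $j$, and since $-\ln(1-u) \le u + u^2 \le 2u$ for $0 \le u \le 1/2$, it follows from

\[ \infty = -\sum_{j=2}^{\infty} \ln\big( (s_j - t_j)/s_j \big) = -\sum_{j=2}^{\infty} \ln\Big( 1 - \frac{t_j}{s_j} \Big) \]

that in both cases $\sum_{j=2}^{\infty} t_j/s_j = \infty$.

We note that if $(a_n)$ and $(b_n)$ are two positive sequences with $\sum a_n < \infty$ and $\sum b_n = \infty$, then there is a subsequence $(n_k)$ of $\mathbb{N}$ so that $\lim_{k\to\infty} a_{n_k}/b_{n_k} = 0$. Indeed, for every $k \in \mathbb{N}$ the set $N_k = \{n \in \mathbb{N} : k a_n < b_n\}$ must be infinite, and we can therefore choose $n_1 < n_2 < n_3 < \ldots$ with $n_k \in N_k$, for $k \in \mathbb{N}$. Applying this to $a_n = \|x_n\| - \|x_{n+1}\|$ (which is summable since $(\|x_n\|)$ is decreasing by (d)) and $b_n = t_{n+1}/s_{n+1}$, we can find $n_1 < n_2 < n_3 < \ldots$ so that

\[ \lim_{k\to\infty} \frac{s_{n_k+1} \big( \|x_{n_k}\| - \|x_{n_k+1}\| \big)}{t_{n_k+1}} = 0. \]

It follows from (d) that

\[ 0 \le s_{n_k} \rho_D(x_{n_k}) \le \frac{1}{c}\, \frac{s_{n_k} \big( \|x_{n_k}\| - \|x_{n_k+1}\| \big)}{t_{n_k+1}} \le \frac{1}{c}\, \frac{s_{n_k+1} \big( \|x_{n_k}\| - \|x_{n_k+1}\| \big)}{t_{n_k+1}} \to 0 \quad \text{as } k \to \infty, \]

and thus, in particular,

\[ \lim_{k\to\infty} \rho_D(x_{n_k}) = 0. \tag{3.26} \]

For $1 \le l \le n_k - 1$ we have

\[ \big| \|x_{n_k}\| - f_{x_{n_k}}(x_l) \big| = \Big| f_{x_{n_k}}\Big( \sum_{j=l+1}^{n_k} t_j z_j \Big) \Big| \le \sum_{j=1}^{n_k} t_j \rho_D(x_{n_k}) = s_{n_k} \rho_D(x_{n_k}) \to 0 \quad \text{as } k \to \infty. \tag{3.27} \]

Now assume that $x^*$ is a $w^*$-cluster point of the sequence $(f_{x_{n_k}})$ and let $L = \lim_{n\to\infty} \|x_n\|$. Then, by (3.27), it follows that $x^*(x_l) = L$ for all $l \in \mathbb{N}$. We now claim that this implies that $L = 0$. Indeed, otherwise $x^* \neq 0$, and thus, since $\overline{\mathrm{span}}(D) = X$ and $D = -D$, $\theta = \sup_{z\in D} x^*(z) > 0$, and thus

\[ \limsup_{k\in\mathbb{N}} \sup_{z\in D} f_{x_{n_k}}(z) \ge \theta, \]

which contradicts (3.26).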


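Lemma 3.4.2. Suppose $1 < p < \infty$. Then there is a constant $C_p > 0$ such that for any $a, b \in \mathbb{R}$

\[ b|a+b|^{p-1}\mathrm{sign}(a+b) - b|a|^{p-1}\mathrm{sign}(a) \le C_p \big( |a+b|^p - pb|a|^{p-1}\mathrm{sign}(a) - |a|^p \big). \tag{3.28} \]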

Proof. First, note that replacing $a$ and $b$ simultaneously by $-a$ and $-b$, the inequality does not change. We can therefore assume that $a \ge 0$. Also note that for $b = 0$ both sides vanish, and that for $a = 0$ the inequality holds whenever $C_p \ge 1$. Thus, we can assume that $a > 0$ and, since both sides of (3.28) are $p$-homogeneous, we can assume that $a = 1$, and also that $b \neq 0$. We need therefore to show that

\[ \varphi(b) = \frac{b \big( |1+b|^{p-1}\mathrm{sign}(1+b) - 1 \big)}{|1+b|^p - pb - 1}, \quad b \neq 0, \]

has an upper bound.

We note that, by applying L'Hôpital's rule twice,

\[ \lim_{b\to 0} \varphi(b) = \lim_{b\to 0} \frac{b(1+b)^{p-1} - b}{(1+b)^p - pb - 1} = \lim_{b\to 0} \frac{(1+b)^{p-1} + (p-1)b(1+b)^{p-2} - 1}{p(1+b)^{p-1} - p} \]

\[ = \lim_{b\to 0} \frac{(p-1)(1+b)^{p-2} + (p-1)(1+b)^{p-2} + (p-1)(p-2)b(1+b)^{p-3}}{p(p-1)(1+b)^{p-2}} = \frac{2}{p} \]

and

\[ \lim_{b\to\pm\infty} \varphi(b) = 1, \]

which implies the claim since $\varphi$ is continuous.
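The behaviour of the auxiliary function from the proof of Lemma 3.4.2 can be inspected numerically. The following sketch (not part of the notes; the function name `phi` and the sample points are assumptions of this illustration) evaluates the quotient for a given $p$ and compares it with the limits $2/p$ at $0$ and $1$ at $\pm\infty$ computed above.

```python
import math

def phi(b, p):
    """phi(b) = b (|1+b|^(p-1) sign(1+b) - 1) / (|1+b|^p - p b - 1), b != 0."""
    t = 1.0 + b
    signed_power = math.copysign(abs(t) ** (p - 1), t)
    return b * (signed_power - 1.0) / (abs(t) ** p - p * b - 1.0)

# Usage for p = 3: values near 0, far out, and the maximum over a finite grid.
p = 3.0
near_zero = phi(1e-3, p)
far_right = phi(1e6, p)
far_left = phi(-1e6, p)
grid_max = max(phi(k / 10.0, p) for k in range(-1000, 1001) if k != 0)
```

A finite grid maximum is of course no proof of boundedness, it merely suggests the size of a constant $C_p$ that works on the sampled range.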

Definition 3.4.3. A Banach space $X$ with Gateaux differentiable norm is said to have property $\Gamma$ if there is a constant $0 < \gamma \le 1$ so that for $x, y \in X$ for which $f_x(y) = 0$ it follows that

\[ \|x+y\| \ge \|x\| + \gamma f_{x+y}(y). \]

Remark. For $x, y \in L_p[0,1]$, $f_x(y) = 0$ means that

\[ \int_0^1 \mathrm{sign}(x(t)) |x(t)|^{p-1} y(t)\, dt = \int_0^1 \mathrm{sign}(x(t)) |x(t)|^{p/q} y(t)\, dt = 0. \tag{3.29} \]

Proposition 3.4.4. If $1 < p < \infty$, every quotient of $L_p[0,1]$ has property $\Gamma$.


Proof. We first show that $L_p[0,1]$ itself has property $\Gamma$. So let $x, y \in L_p[0,1]$ with $f_x(y) = 0$. We can assume that $y \neq 0$, and, after dividing $x$ and $y$ by $\|y\|$, that $\|y\| = 1$.

By Lemma 3.4.2,

\[ y(s)|x(s)+y(s)|^{p-1}\mathrm{sign}(x(s)+y(s)) \le C_p \big( |x(s)+y(s)|^p - |x(s)|^p \big) + (1 - pC_p)\, y(s)|x(s)|^{p-1}\mathrm{sign}(x(s)). \]

Integrating both sides and using (3.29) yields

\[ \int_0^1 y(s)|x(s)+y(s)|^{p-1}\mathrm{sign}(x(s)+y(s))\, ds \le C_p \big( \|x+y\|_p^p - \|x\|_p^p \big). \]

It follows from the fact that $f_x(y)$ is the derivative of $t \mapsto \|x+ty\|_p$ at $t = 0$, and the convexity of the function $t \mapsto \|x+ty\|_p$, that $\|x\|_p \le \|x+y\|_p$. Moreover we have $f_z = \|z\|_p^{1-p}\, \mathrm{sign}(z(\cdot))|z(\cdot)|^{p-1} \in S_{L_q}$ for $z \in L_p[0,1] \setminus \{0\}$, and thus

\[ \|x+y\|_p^{p-1} f_{x+y}(y) = \int_0^1 y(s)|x(s)+y(s)|^{p-1}\mathrm{sign}(x(s)+y(s))\, ds \le C_p \big( \|x+y\|_p^p - \|x\|_p^p \big). \]

By the mean value theorem, applied to $u \mapsto u^p$ on the interval $[\|x\|_p, \|x+y\|_p]$, there is a $\xi$ with $\|x\|_p \le \xi \le \|x+y\|_p$ so that

\[ \|x+y\|_p^p - \|x\|_p^p = p\,\xi^{p-1} \big( \|x+y\|_p - \|x\|_p \big) \le p\, \|x+y\|_p^{p-1} \big( \|x+y\|_p - \|x\|_p \big). \]

Combining both estimates we obtain

\[ f_{x+y}(y) \le p\,C_p \big( \|x+y\|_p - \|x\|_p \big), \]

which proves our claim if we let $\gamma = 1/pC_p$. From the following more general Proposition it will follow that every quotient of $L_p[0,1]$ has property $\Gamma$.

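Proposition 3.4.5. The quotient of a reflexive space $X$ with property $\Gamma$ and Gateaux differentiable norm also has property $\Gamma$ (with respect to the same constant $\gamma$).

Proof. Assume that $Y = X/Z$, where $X$ is a reflexive space with property $\Gamma$ and $Z \subset X$ is a closed subspace of $X$.

Let $x, y \in X$ and let $\bar{x} = x + Z$, $\bar{y} = y + Z$ be the images under the quotient map $Q : X \to Y$. Since $X$ is reflexive we can assume that $\|\bar{x}\|_{X/Z} = \inf_{x' \in x+Z} \|x'\|_X = \|x\|_X$ and find an element $w \in X$ so that $Q(w) = \bar{x} + \bar{y}$ and $\|\bar{x}+\bar{y}\|_{X/Z} = \|w\|_X$.

Note that $f_x = f_{\bar{x}} \circ Q$ (since $f_{\bar{x}}(Q(x)) = f_{\bar{x}}(\bar{x}) = \|\bar{x}\|_{X/Z} = \|x\|_X$) and $f_w = f_{\bar{x}+\bar{y}} \circ Q$. Hence if $f_{\bar{x}}(\bar{y}) = 0$ it follows that $f_x(w - x) = 0$ and thus

\[ \|\bar{x}+\bar{y}\| = \|w\| = \|x + (w-x)\| \ge \|x\| + \gamma f_{x+(w-x)}(w-x) = \|\bar{x}\| + \gamma f_{\bar{x}+\bar{y}}(\bar{y}). \]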

Now we are ready to show the final result, which implies in particular that the (WDGA) converges in Lp[0, 1] for any dictionary D.

Theorem 3.4.6. Suppose X is a Banach space with property Γ and Frechet differentiable norm. If D is a dictionary of X and 0 < c ≤ 1 then the (WDGA) converges.

Proof. By Proposition 4.2.2 (Class Notes in Functional Analysis), the map $x \mapsto f_x$ is a norm continuous map between $X \setminus \{0\}$ and $S_{X^*}$.

Let $x = x_0$ be in $X$ and let $(x_n)$, $(z_n)$ and $(t_n)$ be as in (WDGA). If $t_n = 0$ for some $n \in \mathbb{N}$ then $\rho_D(x_{n-1}) = 0$, and since $D$ is total it follows that $x_{n-1} = 0$ and thus $x_k = 0$ for all $k \ge n$. So we can assume without loss of generality that $t_n > 0$ for all $n \in \mathbb{N}$.

By condition (b) it follows that

\[ \frac{d}{dt} \|x_{n-1} - t z_n\| \Big|_{t=t_n} = 0, \]

and thus $f_{x_n}(z_n) = 0$, which yields, using property $\Gamma$ (applied to $x_n$ and $t_n z_n$), that

\[ \|x_{n-1}\| = \|x_n + t_n z_n\| \ge \|x_n\| + \gamma t_n f_{x_{n-1}}(z_n) \]

and thus

\[ \frac{\|x_{n-1}\| - \|x_n\|}{t_n} \ge \gamma f_{x_{n-1}}(z_n) \ge c\gamma \rho_D(x_{n-1}). \]

Using Lemma 3.4.1 with $c\gamma$ instead of $c$, we only need to show that $\lim_{n\to\infty} \|x_n\| = 0$ if $\sum_{n=1}^{\infty} t_n < \infty$.

If $\sum_{n=1}^{\infty} t_n < \infty$ then $(x_n)$ converges to some $x_\infty \in X$. To deduce a contradiction assume that $x_\infty \neq 0$, which implies that $\lim_{n\to\infty} \|f_{x_n} - f_{x_\infty}\| = 0$. Now since, as observed previously, $f_{x_n}(z_n) = 0$, we have that $\lim_{n\to\infty} f_{x_{n-1}}(z_n) = 0$, and thus by (WDGA)(a) $\lim_{n\to\infty} \rho_D(x_{n-1}) = 0$, and thus for any $z \in D$

\[ f_{x_\infty}(z) = \lim_{n\to\infty} f_{x_{n-1}}(z) \le \lim_{n\to\infty} \rho_D(x_{n-1}) = 0 \]

and similarly, since $D = -D$,

\[ -f_{x_\infty}(z) = f_{x_\infty}(-z) = \lim_{n\to\infty} f_{x_{n-1}}(-z) \le \lim_{n\to\infty} \rho_D(x_{n-1}) = 0, \]

which implies that $f_{x_\infty}(z) = 0$ for all $z \in D$, and thus $x_\infty = 0$, which is a contradiction and proves our claim.


Remark. Assume that the Banach space $X$ has a Gateaux differentiable norm and let $x \in X$. As in (WXGA), $(x_n) \subset X$, $(z_n) \subset D$ and $(t_n) \subset [0, \infty)$ are chosen so that $x_0 = x$, $x_n = x_{n-1} - t_n z_n$, for $n \in \mathbb{N}$, and so that for some $c \in (0, 1]$

\[ \|x_{n-1}\| - \|x_n\| \ge c \sup_{g\in D} \sup_{t\ge 0} \big( \|x_{n-1}\| - \|x_{n-1} - tg\| \big). \]

We note that, if $(x_n)$ has a convergent subsequence (for example if $\sum_n t_n < \infty$), then $(x_n)$ has to converge to $0$.

Indeed, assume that $x_\infty = \lim_{k\to\infty} x_{n_k}$ exists for some subsequence $(n_k)$. We claim $x_\infty = 0$, and this would imply that $(x_n)$ converges to $0$, since $(\|x_n\|)$ is decreasing.

Assume that $x_\infty \neq 0$. Then, since $D$ is total in $X$, it follows that $\sup_{g\in D} f_{x_\infty}(g) > 0$, and thus there exist a $g \in D$ and a $t > 0$ so that $\varepsilon = \|x_\infty\| - \|x_\infty - tg\| > 0$. But this would imply that for some $k_0 \in \mathbb{N}$

\[ \|x_{n_k}\| - \|x_{n_k+1}\| \ge c \big( \|x_{n_k}\| - \|x_{n_k} - tg\| \big) \ge c\varepsilon/2, \quad \text{whenever } k \ge k_0. \]

But this is a contradiction since $(\|x_n\| - \|x_{n+1}\|)$ is a non negative and summable sequence.

Chapter 4

Open Problems

4.1 Greedy Bases

Problem 4.1.1. Does every infinite dimensional Banach space contain a quasi greedy basis?

Comments to Problem 4.1.1: First of all there are separable Banach spaces which have a basis but do not have quasi greedy bases (for the whole space). Indeed in [DKK] it was shown that an L∞-space (for example any C(K), K compact) which is not isomorphic to c0 does not have a quasi greedy basis.

Secondly, Gowers and Maurey solved the unconditional basis problem and showed that not every separable Banach space contains an unconditional basic sequence (which is stronger than being quasi greedy). Nevertheless, in [DKK], Dilworth, Kalton and Kutzarova proved that all known counterexamples to the unconditional basis problem actually contain quasi greedy basic sequences.

They showed the following:

Theorem 4.1.2. Let $(x_n)$ be a semi normalized weakly null sequence in a Banach space $X$ with spreading model $(e_n)$, and suppose that $(e_n)$ has the property that

\[ \Big\| \sum_{i=1}^{n} e_i \Big\| \to \infty, \quad \text{if } n \to \infty. \]

Then $(x_n)$ has a subsequence which is quasi greedy and whose quasi greedy constant does not exceed $3 + \varepsilon$ (for given $\varepsilon > 0$).

Here the spreading model of a semi normalized sequence $(x_n) \subset X$ is defined as follows:

Assume that for all $k \in \mathbb{N}$ and all scalars $(a_j)_{j=1}^{k}$ the limit

\[ \Big|\Big|\Big| \sum_{j=1}^{k} a_j e_j \Big|\Big|\Big| = \lim_{n_1\to\infty} \lim_{n_2\to\infty} \ldots \lim_{n_k\to\infty} \Big\| \sum_{j=1}^{k} a_j x_{n_j} \Big\| \]

exists. It is clear that $|||\cdot|||$ is a semi norm on $c_{00}$. Using Ramsey's Theorem (some kind of generalized pigeonhole principle) one can prove that every semi normalized sequence has a subsequence so that the above limit exists for all $(a_j) \in c_{00}$, and that if $(x_n)$ is weakly null or basic, then $|||\cdot|||$ is a norm on $c_{00}$ and $(e_n)$ is a basis of the completion of $c_{00}$ with respect to $|||\cdot|||$. We call in this case this completion together with its basis $(e_n)$ the spreading model of $(x_n)$.

For more comments on the problem and its relation to the problem whether or not the Elton number has a universal upper bound see also [DOSZ2].

Problem 4.1.3. Does $\ell_p(\ell_q)$, $1 < p, q < \infty$ and $p \neq q$, have a greedy basis?

Comments to Problem 4.1.3: Besov spaces are function spaces defined on the real line or on some subset of it and are of importance for example in Partial Differential Equations and Approximation Theory. It can be shown that Besov spaces of functions defined on $\mathbb{R}$ are isomorphic to $\ell_p(\ell_q)$, $1 < p \neq q < \infty$, where

\[ \ell_p(\ell_q) = \Big\{ (x_n) : x_n \in \ell_q \text{ for } n \in \mathbb{N}, \text{ and } \sum_n \|x_n\|_q^p < \infty \Big\}, \]

with the norm

\[ \|(x_n)\| = \Big( \sum_n \|x_n\|_q^p \Big)^{1/p}, \quad \text{for } (x_n) \in \ell_p(\ell_q). \]

It was shown in [EW] that for the space $\ell_p \oplus \ell_q$, $p \neq q$, every unconditional basis $(x_n)$ of $\ell_p \oplus \ell_q$ splits into a basis of $\ell_p$ and a basis of $\ell_q$, i.e. there is a partition of $\mathbb{N}$ into $N_1$ and $N_2$ so that $(x_n : n \in N_1)$ is a basis of $\ell_p$ and $(x_n : n \in N_2)$ is a basis of $\ell_q$. From that result it is easy to see that $\ell_p \oplus \ell_q$ cannot have a greedy basis. On the other hand the space $\big( \oplus_{n=1}^{\infty} \ell_q^{m_n} \big)_p$, with $1 < p < \infty$, $1 \le q \le \infty$ and $m_n \to \infty$ for $n \to \infty$, which is isomorphic to Besov spaces of functions defined on the torus (or any closed bounded interval in $\mathbb{R}$), has a greedy basis [DFOS]. In the case $p = 1, \infty$ it was shown in [BCLT] that $\big( \oplus_{n=1}^{\infty} \ell_2^{m_n} \big)_p$ has a unique unconditional basis up to permutation, and thus this space cannot have a greedy basis (the usual one is clearly not democratic). From the proof in [BCLT] we can also deduce that for general $q \in [1, \infty]$ the spaces $\big( \oplus_{n=1}^{\infty} \ell_q^{m_n} \big)_1$ and $\big( \oplus_{n=1}^{\infty} \ell_q^{m_n} \big)_{c_0}$ have no greedy bases, unless, of course, in the trivial case that $p = q = \infty$ or $p = q = 1$.

Problem 4.1.4. Given any ε > 0, can a Banach space with a normalized greedy basis (en) be renormed so that (en) is (1 + ε)-greedy?

Comments to Problem 4.1.4: First Albiac and Wojtaszczyk [AW] asked whether or not every Banach space with a normalized greedy basis (en) can be renormed so that (en) is 1-greedy. This was solved negatively (recall that by Theorem 1.1.9 every 1-greedy basis must be 1-democratic) in [DOSZ2], where the following was shown:


Proposition 4.1.5. Assume that $X$ is a Banach space with a normalized suppression 1-unconditional basis $(e_i)$ and that there is a sequence $(\rho_n) \subset (0, 1]$ with $\rho = \inf_{n\in\mathbb{N}} \rho_n > 0$ so that

\[ \Big\| \sum_{i\in E} e_i \Big\| = \rho_n n \quad \text{whenever } n \in \mathbb{N} \text{ and } E \subset \mathbb{N} \text{ with } \#E = n. \]

Then $(e_i)$ is $\frac{2}{\rho}$-equivalent to the unit vector basis of $\ell_1$.

Corollary 4.1.6. 1. Hardy space $H_1$ cannot be renormed so that the Haar basis in $H_1$ (which is greedy) is 1-greedy.

2. Tsirelson space $T_1$ cannot be renormed so that it has a 1-greedy basis.

Problem 4.1.7. Can $L_p[0,1]$, $1 < p < \infty$, be renormed so that the Haar basis becomes 1-greedy, or at least $(1+\varepsilon)$-greedy?

Comments to Problem 4.1.7: In Corollary 1.2.2 it was shown that the Haar basis in $L_p[0,1]$ is greedy, and in [DOSZ2] it was shown that one can renorm $L_p[0,1]$ so that the Haar basis is 1-democratic and 1-unconditional. But the fact that a basis is 1-democratic and suppression 1-unconditional does not imply that it is 1-greedy. From Theorem 1.1.9 it only follows that every suppression 1-unconditional and 1-democratic basis is 2-greedy, and in [DOSZ2] it was shown that this is optimal: for any $\varepsilon > 0$ there is a basic sequence which is 1-democratic and suppression 1-unconditional (even 1-unconditional) but not $(2-\varepsilon)$-greedy.

4.2 Greedy Algorithms

For Hilbert space it is still open to find better convergence rates of the pure greedy algorithm, but for general Banach spaces the questions about the convergence of the X-Greedy Algorithm are quite basic and wide open.

Problem 4.2.1. Find one infinite dimensional Banach space X, other than Hilbert space, on which (X-PGA) converges for any dictionary.

Problem 4.2.2. Does (X-PGA) converge on ℓp, 1 < p < ∞, p ≠ 2, for any dictionary?

Comments on Problem 4.2.2: In [DKSTW] at least the weak convergence of the Pure Greedy Algorithm was shown:

Theorem 4.2.3. [DKSTW, Theorem 3.2] Suppose that for n ∈ N Xn is a finite dimensional space whose norm is Gateaux differentiable. Then the weak pure greedy algorithm with fixed weakness factor converges weakly.

Problem 4.2.4. Does (X-PGA) converge in Lp[0, 1], 1 < p < ∞, p ≠ 2, at least if one takes the Haar basis as dictionary?

Comments on Problem 4.2.4: The following finite dimensional version of Problem 4.2.4 was shown in [DOSZ3]:

Theorem 4.2.5. Let 1 < p < ∞ and let (hj : j ∈ N) be the Haar basis of Lp (ordered consistently with the usual partial order). For each m there is a number N = N(p, m) so that (X-PGA) terminates after N steps, assuming that the starting point was chosen in span(hj : j ≤ m).

Problem 4.2.6. Are there examples of separable and uniformly smooth Banach spaces X with dictionaries for which the (weak) dual greedy algorithm does not converge?

Chapter 5

Appendix A: Bases in Banach spaces

5.1 Schauder bases

In this section we recall some of the notions and results presented in the course on Functional Analysis in Fall 2012 [Schl]. Like every vector space a Banach space X admits an algebraic or Hamel basis, i.e. a subset B ⊂ X, so that every x ∈ X is in a unique way a (finite) linear combination of elements in B. This definition does not take into account that we can take infinite sums in Banach spaces and that we might want to represent elements in X as converging series. Hamel bases are also not very useful for Banach spaces, since (see Exercise 1) the coordinate functionals might not be continuous.

Definition 5.1.1. [Schauder bases of Banach Spaces]

Let $X$ be an infinite dimensional Banach space. A sequence $(e_n) \subset X$ is called a Schauder basis of $X$, or simply a basis of $X$, if for every $x \in X$ there is a unique sequence of scalars $(a_n) \subset \mathbb{K}$ so that

\[ x = \sum_{n=1}^{\infty} a_n e_n. \]

Examples 5.1.2. For $n \in \mathbb{N}$ let

\[ e_n = (\underbrace{0, \ldots, 0}_{n-1 \text{ times}}, 1, 0, \ldots) \in \mathbb{K}^{\mathbb{N}}. \]

Then $(e_n)$ is a basis of $\ell_p$, $1 \le p < \infty$, and of $c_0$. We call $(e_n)$ the unit vector basis of $\ell_p$ and $c_0$, respectively.

Remarks. Assume that X is a Banach space and (en) a basis of X. Then


a) $(e_n)$ is linearly independent.

b) $\mathrm{span}(e_n : n \in \mathbb{N})$ is dense in $X$, in particular $X$ is separable.

c) Every element $x$ is uniquely determined by the sequence $(a_n)$ so that $x = \sum_{j=1}^{\infty} a_j e_j$. So we can identify $X$ with a space of sequences in $\mathbb{K}^{\mathbb{N}}$.

Proposition 5.1.3. Let $(e_n)$ be a Schauder basis of a Banach space $X$. For $n \in \mathbb{N}$ and $x \in X$ define $e^*_n(x) \in \mathbb{K}$ to be the unique element in $\mathbb{K}$ so that

\[ x = \sum_{n=1}^{\infty} e^*_n(x) e_n. \]

Then $e^*_n : X \to \mathbb{K}$ is linear. For $n \in \mathbb{N}$ let

\[ P_n : X \to \mathrm{span}(e_j : j \le n), \quad x \mapsto \sum_{j=1}^{n} e^*_j(x) e_j. \]

Then $P_n : X \to X$ are linear projections onto $\mathrm{span}(e_j : j \le n)$ and the following properties hold:

a) $\dim(P_n(X)) = n$,

b) $P_n \circ P_m = P_m \circ P_n = P_{\min(m,n)}$, for $m, n \in \mathbb{N}$,

c) $\lim_{n\to\infty} P_n(x) = x$, for every $x \in X$.

$P_n$, $n \in \mathbb{N}$, are called the Canonical Projections for $(e_n)$, and $(e^*_n)$ the Coordinate Functionals for $(e_n)$ or biorthogonals for $(e_n)$.

Theorem 5.1.4. Let $X$ be a Banach space with a basis $(e_n)$ and let $(e^*_n)$ be the corresponding coordinate functionals and $(P_n)$ the canonical projections. Then $P_n$ is bounded for every $n \in \mathbb{N}$ and

\[ b = \sup_{n\in\mathbb{N}} \|P_n\|_{L(X,X)} < \infty, \]

and thus $e^*_n \in X^*$ and

\[ \|e^*_n\|_{X^*} = \frac{\|P_n - P_{n-1}\|}{\|e_n\|} \le \frac{2b}{\|e_n\|}. \]

We call $b$ the basis constant of $(e_j)$. If $b = 1$ we say that $(e_i)$ is a monotone basis.

Furthermore

\[ |||\cdot||| : X \to \mathbb{R}^+_0, \quad \sum_{i=1}^{\infty} a_i e_i \mapsto \Big|\Big|\Big| \sum_{i=1}^{\infty} a_i e_i \Big|\Big|\Big| = \sup_{n\in\mathbb{N}} \Big\| \sum_{i=1}^{n} a_i e_i \Big\|, \]

is an equivalent norm under which $(e_i)$ is a monotone basis.


Definition 5.1.5. [Basic Sequences]
Let $X$ be a Banach space. A sequence $(x_n) \subset X \setminus \{0\}$ is called a basic sequence if it is a basis for $\overline{\mathrm{span}}(x_n : n \in \mathbb{N})$.

If $(e_j)$ and $(f_j)$ are two basic sequences (in possibly two different Banach spaces $X$ and $Y$), we say that $(e_j)$ and $(f_j)$ are isomorphically equivalent if the map

\[ T : \mathrm{span}(e_j : j \in \mathbb{N}) \to \mathrm{span}(f_j : j \in \mathbb{N}), \quad \sum_{j=1}^{n} a_j e_j \mapsto \sum_{j=1}^{n} a_j f_j, \]

extends to an isomorphism between the Banach spaces $\overline{\mathrm{span}}(e_j : j \in \mathbb{N})$ and $\overline{\mathrm{span}}(f_j : j \in \mathbb{N})$.

Note that this is equivalent to saying that there are constants $0 < c \le C$ so that for any $n \in \mathbb{N}$ and any sequence of scalars $(\lambda_j)_{j=1}^{n}$ it follows that

\[ c \Big\| \sum_{j=1}^{n} \lambda_j e_j \Big\| \le \Big\| \sum_{j=1}^{n} \lambda_j f_j \Big\| \le C \Big\| \sum_{j=1}^{n} \lambda_j e_j \Big\|. \]

Proposition 5.1.6. Let $X$ be a Banach space and $(x_n : n \in \mathbb{N}) \subset X \setminus \{0\}$. Then $(x_n)$ is a basic sequence if and only if there is a constant $K \ge 1$ so that for all $m < n$ and all scalars $(a_j)_{j=1}^{n} \subset \mathbb{K}$ we have

\[ \Big\| \sum_{j=1}^{m} a_j x_j \Big\| \le K \Big\| \sum_{j=1}^{n} a_j x_j \Big\|. \tag{5.1} \]

In that case the basis constant is the smallest of all $K \ge 1$ so that (5.1) holds.

Theorem 5.1.7. [The small Perturbation Lemma]
Let $(x_n)$ be a basic sequence in a Banach space $X$, let $(x_n^*)$ be the coordinate functionals (they are elements of $\overline{\mathrm{span}}(x_j : j \in \mathbb{N})^*$), and assume that $(y_n)$ is a sequence in $X$ such that

\[ c = \sum_{n=1}^{\infty} \|x_n - y_n\| \cdot \|x_n^*\| < 1. \tag{5.2} \]

Then

a) $(y_n)$ is also basic in $X$ and isomorphically equivalent to $(x_n)$; more precisely

\[ (1 - c) \Big\| \sum_{n=1}^{\infty} a_n x_n \Big\| \le \Big\| \sum_{n=1}^{\infty} a_n y_n \Big\| \le (1 + c) \Big\| \sum_{n=1}^{\infty} a_n x_n \Big\| \]

for every series $x = \sum_{n \in \mathbb{N}} a_n x_n$ that converges in $X$.

b) If $\overline{\mathrm{span}}(x_j : j \in \mathbb{N})$ is complemented in $X$, then so is $\overline{\mathrm{span}}(y_j : j \in \mathbb{N})$.

c) If $(x_n)$ is a Schauder basis of all of $X$, then $(y_n)$ is also a Schauder basis of $X$, and the coordinate functionals $(y_n^*)$ of $(y_n)$ satisfy $y_n^* \in \overline{\mathrm{span}}(x_j^* : j \in \mathbb{N})$ for $n \in \mathbb{N}$.

Now we recall the notion of an unconditional basis. First the following proposition.

Proposition 5.1.8. For a sequence $(x_n)$ in a Banach space $X$ the following statements are equivalent.

a) For any reordering (also called permutation) $\sigma$ of $\mathbb{N}$ (i.e. $\sigma : \mathbb{N} \to \mathbb{N}$ is bijective) the series $\sum_{n \in \mathbb{N}} x_{\sigma(n)}$ converges.

b) For any $\varepsilon > 0$ there is an $n \in \mathbb{N}$ so that whenever $M \subset \mathbb{N}$ is finite with $\min(M) > n$, then $\big\| \sum_{j \in M} x_j \big\| < \varepsilon$.

c) For any subsequence $(n_j)$ the series $\sum_{j \in \mathbb{N}} x_{n_j}$ converges.

d) For any sequence $(\varepsilon_j) \subset \{\pm 1\}$ the series $\sum_{j=1}^{\infty} \varepsilon_j x_j$ converges.

In the case that the above conditions hold we say that the series $\sum x_n$ converges unconditionally.
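The following minimal sketch illustrates, under stated assumptions, how a convergent series can fail the conditions of Proposition 5.1.8. It uses the scalar series $\sum_n (-1)^{n+1}/n$ in $X = \mathbb{R}$, which is an illustrative choice and not an example from these notes; the function names are hypothetical. The natural order converges to $\ln 2$, the reordering "two positive terms, then one negative term" converges to $\tfrac32 \ln 2$, and the subseries of odd-indexed terms has unbounded partial sums, so condition c) fails.

```python
import math

def alternating_harmonic(n_terms):
    """Partial sum of 1 - 1/2 + 1/3 - ... in the natural order."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))

def rearranged(n_blocks):
    """Partial sum after reordering: two positive terms, then one negative term."""
    total = 0.0
    odd, even = 1, 2
    for _ in range(n_blocks):
        total += 1.0 / odd + 1.0 / (odd + 2)  # next two positive terms
        odd += 4
        total -= 1.0 / even                   # next negative term
        even += 2
    return total

def odd_subseries(n_terms):
    """Partial sum of the subseries 1 + 1/3 + 1/5 + ... (condition c) fails)."""
    return sum(1.0 / (2 * k - 1) for k in range(1, n_terms + 1))

if __name__ == "__main__":
    print(alternating_harmonic(100000), math.log(2))
    print(rearranged(100000), 1.5 * math.log(2))
    print(odd_subseries(100000))  # keeps growing like (1/2) ln n
```

The same terms summed in a different order give a different limit, which is exactly what unconditional convergence rules out.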

Definition 5.1.9. A basis $(e_j)$ for a Banach space $X$ is called unconditional if for every $x \in X$ the expansion $x = \sum \langle e_j^*, x \rangle e_j$ converges unconditionally, where $(e_j^*)$ are the coordinate functionals of $(e_j)$.

A sequence $(x_n) \subset X$ is called an unconditional basic sequence if $(x_n)$ is an unconditional basis of $\overline{\mathrm{span}}(x_j : j \in \mathbb{N})$.

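Proposition 5.1.10. For a sequence of non-zero elements $(x_j)$ in a Banach space $X$ the following are equivalent.

a) $(x_j)$ is an unconditional basic sequence.

b) There is a constant $C$ so that for all finite $B \subset \mathbb{N}$, all scalars $(a_j)_{j \in B} \subset \mathbb{K}$, and all $A \subset B$

\[ \Big\| \sum_{j \in A} a_j x_j \Big\| \le C \Big\| \sum_{j \in B} a_j x_j \Big\|. \tag{5.3} \]

c) There is a constant $C'$ so that for all finite sets $B \subset \mathbb{N}$, all scalars $(a_j)_{j \in B} \subset \mathbb{K}$, and all $(\varepsilon_j)_{j \in B} \subset \{\pm 1\}$, if $\mathbb{K} = \mathbb{R}$, or $(\varepsilon_j)_{j \in B} \subset \{z \in \mathbb{C} : |z| = 1\}$, if $\mathbb{K} = \mathbb{C}$,

\[ \Big\| \sum_{j \in B} \varepsilon_j a_j x_j \Big\| \le C' \Big\| \sum_{j \in B} a_j x_j \Big\|. \tag{5.4} \]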


In this case we call the smallest constant $C = C_s$ which satisfies (5.3) for all $n$, all $A \subset \{1, 2, \ldots, n\}$ and all scalars $(a_j)_{j=1}^{n} \subset \mathbb{K}$ the suppression-unconditional constant of $(x_n)$, and we call the smallest constant $C' = C_u$ so that (5.4) holds for all $n$, all $(\varepsilon_j)_{j=1}^{n} \subset \{\pm 1\}$, or $(\varepsilon_j)_{j=1}^{n} \subset \{z \in \mathbb{C} : |z| = 1\}$, and all scalars $(a_j)_{j=1}^{n} \subset \mathbb{K}$ the unconditional constant of $(x_n)$.

Moreover, it follows that

\[ C_s \le C_u \le 2 C_s. \tag{5.5} \]

Proposition 5.1.11. Let $(x_n)$ be an unconditional basic sequence. Then

\[ C_u = \sup \Big\{ \Big\| \sum_{i=1}^{\infty} a_i b_i x_i \Big\| : x = \sum_{i=1}^{\infty} a_i x_i \in B_X \text{ and } |b_i| \le 1 \Big\}. \tag{5.6} \]

Remark. While for Schauder bases it is in general important how we orderthem, the ordering is not relevant for unconditional bases. We can thereforeindex unconditional bases by any countable set.

5.2 Markushevich bases

Not every separable Banach space has a Schauder basis [En]. But it has at least a bounded and norming Markushevich basis, according to a result of Ovsepian and Pełczyński [OP]. We want to present this result in this section.

Definition 5.2.1. A countable family $(e_n, e_n^*)_{n \in \mathbb{N}} \subset X \times X^*$ is called

• biorthogonal, if $e_n^*(e_m) = \delta_{(m,n)}$ for all $m, n \in \mathbb{N}$,

• fundamental, or complete, if $\mathrm{span}(e_n : n \in \mathbb{N})$ is dense in $X$,

• total, if for any $x \in X$ with $e_n^*(x) = 0$ for all $n \in \mathbb{N}$ it follows that $x = 0$,

• norming, if for some constant $c > 0$,

\[ \sup_{x^* \in \mathrm{span}(e_n^* : n \in \mathbb{N}) \cap B_{X^*}} |x^*(x)| \ge c \|x\| \quad \text{for all } x \in X, \]

and in that case we also say that $(e_n, e_n^*)_{n \in \mathbb{N}}$ is $c$-norming,

• shrinking, if $\overline{\mathrm{span}}(e_n^* : n \in \mathbb{N}) = X^*$, and

• bounded, or uniformly minimal, if $C = \sup_{n \in \mathbb{N}} \|e_n\| \cdot \|e_n^*\| < \infty$; we say in that case that $(e_n, e_n^*)_{n \in \mathbb{N}}$ is $C$-bounded and call $C$ the bound of $(e_n, e_n^*)_{n \in \mathbb{N}}$.

A biorthogonal, fundamental and total sequence $(e_n, e_n^*)_{n \in \mathbb{N}}$ is called a Markushevich basis, or simply an M-basis.


Remark. Assume $(e_n, e_n^*)$ is an M-basis. It follows from the totality that $\mathrm{span}(e_n^* : n \in \mathbb{N})$ is $w^*$-dense in $X^*$. Thus in every reflexive space M-bases are shrinking, and shrinking M-bases are 1-norming.

Our goal is to prove the following

Theorem 5.2.2. [OP] Every separable Banach space $X$ admits a bounded, norming M-basis, which can be chosen to be shrinking if $X^*$ is (norm) separable. Moreover, the bound of that M-basis can be chosen arbitrarily close to $4(1 + \sqrt{2})^2$.

Remark. Pełczyński [Pe] later improved the above result and showed that for all separable Banach spaces and all $\varepsilon > 0$ there exists a bounded M-basis whose bound does not exceed $1 + \varepsilon$.

It is an open question whether or not every separable Banach space has a 1-bounded M-basis. But it is not hard to show that a space $X$ with a bounded and norming M-basis can be renormed so that this basis becomes 1-bounded and 1-norming.

Remark. It might be nice to know that every separable Banach space has a bounded and norming Markushevich basis $(e_i, e_i^*)$. Nevertheless, given $z \in X$, we do not have any procedure (let alone a good one) to approximate $z$ by finite linear combinations of the $e_i$; we only know that such an approximation exists. This is precisely the difference to Schauder bases, for which we know that the canonical projections converge pointwise.

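Lemma 5.2.3. [LT, Lemma 1.a.6] Assume that $X$ is an infinite dimensional space and that $F \subset X$ and $G^* \subset X^*$ are finite dimensional subspaces of $X$ and $X^*$, respectively. Let $\varepsilon > 0$. Then there is an $x \in X$, $\|x\| = 1$, and an $x^* \in X^*$ so that $\|x^*\| \le 2 + \varepsilon$, $x^*(x) = 1$, $z^*(x) = 0$ for all $z^* \in G^*$, and $x^*(z) = 0$ for all $z \in F$.

Proof. Let $(y_i^*)_{i=1}^{m} \subset S_{X^*}$ be finite and $1/(1+\varepsilon)$-norming the space $F$. Pick

\[ x \in {}^{\perp}\big( \{y_j^* : j = 1, 2, \ldots, m\} \cup G^* \big) = \big\{ z \in X : y_j^*(z) = 0,\ j = 1, 2, \ldots, m, \text{ and } z^*(z) = 0,\ z^* \in G^* \big\}, \quad \text{with } \|x\| = 1. \]

It follows for all $\lambda \in \mathbb{R}$ and all $y \in F$ that

\[ \|y + \lambda x\| \ge \max_{j \le m} \big| y_j^*(y + \lambda x) \big| = \max_{j \le m} \big| y_j^*(y) \big| \ge \frac{\|y\|}{1 + \varepsilon}. \]

Then define

\[ u^* : \mathrm{span}(F \cup \{x\}) \to \mathbb{R}, \quad y + \lambda x \mapsto \lambda. \]

We claim that $\|u^*\| \le 2 + \varepsilon$. Indeed, let $y \in F$, $y \ne 0$, and $\lambda \in \mathbb{R}$. Then

\[ \Big| u^*\Big( \frac{\lambda x + y}{\|\lambda x + y\|} \Big) \Big| = \frac{|\lambda|}{\|\lambda x + y\|} \le \begin{cases} \dfrac{2\|y\|}{\|\lambda x + y\|} \le 2(1 + \varepsilon) & \text{if } |\lambda| \le 2\|y\|, \\[2ex] \dfrac{|\lambda|}{|\lambda| - \|y\|} \le \dfrac{2\|y\|}{2\|y\| - \|y\|} = 2 & \text{if } |\lambda| > 2\|y\|. \end{cases} \]

Letting now $x^*$ be a Hahn-Banach extension of $u^*$ onto all of $X$, our claim is proved.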

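Lemma 5.2.4. ([Ma], see also [HMVZ, Lemma 1.21])
Let $X$ be an infinite dimensional Banach space. Suppose that $(z_n) \subset X$ and $(z_n^*) \subset X^*$ are sequences so that $\mathrm{span}(z_n : n \in \mathbb{N})$ and $\mathrm{span}(z_n^* : n \in \mathbb{N})$ are both infinite dimensional and so that

(M1) $(z_n)$ separates points of $\mathrm{span}(z_n^* : n \in \mathbb{N})$,

(M2) $(z_n^*)$ separates points of $\mathrm{span}(z_n : n \in \mathbb{N})$.

Let $N \subset \mathbb{N}$ be co-infinite, and $\varepsilon > 0$. Then we can choose a biorthogonal system

\[ (x_n, x_n^*) \subset \mathrm{span}(z_n : n \in \mathbb{N}) \times \mathrm{span}(z_n^* : n \in \mathbb{N}) \]

with

\[ \mathrm{span}(z_n : n \in \mathbb{N}) \subset \mathrm{span}(x_n : n \in \mathbb{N}) \quad \text{and} \quad \mathrm{span}(z_n^* : n \in \mathbb{N}) \subset \mathrm{span}(x_n^* : n \in \mathbb{N}), \tag{5.7} \]

\[ \sup_{n \in N} \|x_n\| \cdot \|x_n^*\| < 2 + \varepsilon. \tag{5.8} \]

Remark. Note that $(x_n, x_n^*)$ is an M-basis if $\mathrm{span}(z_n : n \in \mathbb{N})$ is dense in $X$ and if $(z_n^*)$ separates points of $X$. It is norming if $B_{X^*} \cap \mathrm{span}(z_n^*)$ is norming $X$.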

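Proof. Choose $s_1 = \min\{n \in \mathbb{N} : z_n \ne 0\}$ and put $x_1 = z_{s_1}/\|z_{s_1}\|$. If $1 \in N$ choose $x_1^* \in S_{X^*}$ with $x_1^*(x_1) = 1$. Otherwise choose $x_1^* = z_{t_1}^*/z_{t_1}^*(x_1)$ with $t_1 = \min\{m \in \mathbb{N} : z_m^*(x_1) \ne 0\}$ (which exists by (M2)).

Write $\mathbb{N} \setminus N = \{k_1, k_2, \ldots\}$ and proceed by induction to choose $x_1, x_2, \ldots, x_n$ and $x_1^*, \ldots, x_n^*$ as follows.

Assume that $x_1, x_2, \ldots, x_n$ and $x_1^*, \ldots, x_n^*$ have been chosen.

Case 1: $n + 1 \in N$. Then let $F = \mathrm{span}(x_i : i \le n)$ and $G^* = \mathrm{span}(x_i^* : i \le n)$, and choose $x_{n+1}$ and $x_{n+1}^*$ by Lemma 5.2.3.

Case 2: $n + 1 = k_{2j-1} \in \mathbb{N} \setminus N$. Then let

\[ s_{2j-1} = \min\{ s : z_s \notin \mathrm{span}(x_i : i \le n) \} \]

and define

\[ x_{n+1} = z_{s_{2j-1}} - \sum_{i=1}^{n} x_i^*(z_{s_{2j-1}}) x_i. \]

This implies that $x_i^*(x_{n+1}) = 0$ for $i = 1, 2, \ldots, n$. Next choose (using (M2))

\[ t_{2j-1} = \min\{ t : z_t^*(x_{n+1}) \ne 0 \} \]

and let

\[ x_{n+1}^* = \frac{z_{t_{2j-1}}^* - \sum_{i=1}^{n} z_{t_{2j-1}}^*(x_i) x_i^*}{z_{t_{2j-1}}^*(x_{n+1})}, \]

which yields that $x_{n+1}^*(x_i) = 0$ for $i = 1, 2, \ldots, n$, and $x_{n+1}^*(x_{n+1}) = 1$.

Case 3: $n + 1 = k_{2j} \in \mathbb{N} \setminus N$. Then we choose

\[ t_{2j} = \min\{ t : z_t^* \notin \mathrm{span}(x_i^* : i \le n) \}. \]

Let

\[ x_{n+1}^* = z_{t_{2j}}^* - \sum_{i=1}^{n} z_{t_{2j}}^*(x_i) x_i^*, \]

and hence $x_{n+1}^*(x_i) = 0$ for $i = 1, 2, \ldots, n$, and then let (using (M1))

\[ s_{2j} = \min\{ s : x_{n+1}^*(z_s) \ne 0 \}, \]

and

\[ x_{n+1} = \frac{z_{s_{2j}} - \sum_{i=1}^{n} x_i^*(z_{s_{2j}}) x_i}{x_{n+1}^*(z_{s_{2j}})}, \]

which implies that $x_i^*(x_{n+1}) = 0$ for $i = 1, 2, \ldots, n$, and $x_{n+1}^*(x_{n+1}) = 1$.

We ensured by this choice that $\big( (x_i, x_i^*) : i \in \mathbb{N} \big)$ is a biorthogonal sequence in $X \times X^*$ which also satisfies (5.8). Since for any $m \in \mathbb{N}$ we have $\mathrm{span}(z_i : i \le m) \subset \mathrm{span}(x_{k_{2j-1}} : j \le m)$ and $\mathrm{span}(z_i^* : i \le m) \subset \mathrm{span}(x_{k_{2j}}^* : j \le m)$, the sequence $\big( (x_i, x_i^*) : i \in \mathbb{N} \big)$ also satisfies (5.7).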

For $n \in \mathbb{N}$ we consider on $\ell_2^{2^n}$ the discrete Haar basis

\[ \{h_0\} \cup \{h_{(r,s)} : r = 0, 1, \ldots, n-1, \text{ and } s = 0, 1, \ldots, 2^r - 1\}, \]

with

\[ h_0 = 2^{-n/2} \chi_{\{1, 2, 3, \ldots, 2^n\}}, \]

\[ h_{(r,s)} = \frac{\chi_{\{2s 2^{n-r-1} + 1,\ 2s 2^{n-r-1} + 2,\ \ldots,\ (2s+1) 2^{n-r-1}\}} - \chi_{\{(2s+1) 2^{n-r-1} + 1,\ (2s+1) 2^{n-r-1} + 2,\ \ldots,\ (2s+2) 2^{n-r-1}\}}}{2^{(n-r)/2}} \]

if $r = 0, 1, 2, \ldots, n-1$ and $s = 0, 1, 2, \ldots, 2^r - 1$.

The unit vector basis $(e_i)_{i=1}^{2^n}$ as well as the Haar basis

\[ \{h_0\} \cup \{h_{(r,s)} : r = 0, 1, \ldots, n-1,\ s = 0, 1, \ldots, 2^r - 1\} \]

are orthonormal bases in $\ell_2^{2^n}$. Thus the matrix $A = A^{(n)}$ with the property that $A(e_1) = h_0$ and $A(e_{2^r + s + 1}) = h_{(r,s)}$ is a unitary matrix. If we write

\[ A^{(n)} = \big( a^{(n)}_{(i,j)} : 0 \le i, j \le 2^n - 1 \big) \]

then it follows for $k = 0, 1, 2, \ldots, 2^n - 1$ that $a^{(n)}_{(k,0)} = h_0(k) = 2^{-n/2}$, and if $r = 0, 1, 2, \ldots, n-1$ and $s = 0, 1, \ldots, 2^r - 1$ that

\[ a^{(n)}_{(k, 2^r + s)} = h_{(r,s)}(k) = \begin{cases} 2^{-(n-r)/2} & \text{if } k \in \{2s 2^{n-r-1} + 1,\ 2s 2^{n-r-1} + 2,\ \ldots,\ (2s+1) 2^{n-r-1}\}, \\ -2^{-(n-r)/2} & \text{if } k \in \{(2s+1) 2^{n-r-1} + 1,\ (2s+1) 2^{n-r-1} + 2,\ \ldots,\ (2s+2) 2^{n-r-1}\}, \\ 0 & \text{if } k \le 2s 2^{n-r-1} \text{ or } k > (2s+2) 2^{n-r-1}. \end{cases} \]

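and thus, schematically,

\[ A = \begin{pmatrix}
2^{-n/2} & 2^{-n/2} & 2^{-(n-1)/2} & 0 & \cdots & 2^{-1/2} & \cdots & 0 \\
2^{-n/2} & 2^{-n/2} & 2^{-(n-1)/2} & 0 & \cdots & -2^{-1/2} & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & & 0 & & \vdots \\
\vdots & \vdots & -2^{-(n-1)/2} & \vdots & & \vdots & & \vdots \\
2^{-n/2} & 2^{-n/2} & -2^{-(n-1)/2} & 0 & \cdots & 0 & \cdots & 0 \\
2^{-n/2} & -2^{-n/2} & 0 & 2^{-(n-1)/2} & \cdots & 0 & \cdots & 0 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & & \vdots \\
\vdots & \vdots & \vdots & -2^{-(n-1)/2} & & \vdots & & 0 \\
2^{-n/2} & -2^{-n/2} & 0 & -2^{-(n-1)/2} & \cdots & 0 & \cdots & 2^{-1/2} \\
2^{-n/2} & -2^{-n/2} & 0 & -2^{-(n-1)/2} & \cdots & 0 & \cdots & -2^{-1/2}
\end{pmatrix}. \]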

It follows therefore that for all $k = 0, 1, \ldots, 2^n - 1$ we have

\[ \sum_{j=1}^{2^n - 1} \big| a^{(n)}_{(k,j)} \big| = \sum_{r=0}^{n-1} 2^{-(n-r)/2} = \sum_{i=1}^{n} \Big( \frac{1}{\sqrt{2}} \Big)^i \le \frac{1}{\sqrt{2} - 1} = 1 + \sqrt{2}, \tag{5.9} \]

because, leaving out the first column, in each row and for each $r \in \{0, 1, 2, \ldots, n-1\}$ the absolute value $2^{-(n-r)/2}$ is taken exactly once.
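As a sanity check of the construction above, the following sketch builds the discrete Haar matrix for a small $n$ and verifies numerically that its columns are orthonormal and that the row sums in (5.9) stay below $1 + \sqrt{2}$. The function names are hypothetical, and rows and columns are indexed from 0 (so coordinate $k$ in the code corresponds to the $(k+1)$-st coordinate of $\ell_2^{2^n}$).

```python
import math

def haar_matrix(n):
    """Return the 2^n x 2^n matrix a[k][j]; column 0 is h_0, column 2^r + s is h_(r,s)."""
    size = 2 ** n
    cols = [[2 ** (-n / 2)] * size]            # column 0: h_0
    for r in range(n):
        half = 2 ** (n - r - 1)                # length of the positive (and negative) block
        val = 2 ** (-(n - r) / 2)
        for s in range(2 ** r):
            col = [0.0] * size
            start = 2 * s * half
            for k in range(start, start + half):
                col[k] = val
            for k in range(start + half, start + 2 * half):
                col[k] = -val
            cols.append(col)
    return [[cols[j][k] for j in range(size)] for k in range(size)]

def max_orthonormality_defect(a):
    """Largest deviation of the Gram matrix of the columns from the identity."""
    size = len(a)
    worst = 0.0
    for i in range(size):
        for j in range(size):
            dot = sum(a[k][i] * a[k][j] for k in range(size))
            worst = max(worst, abs(dot - (1.0 if i == j else 0.0)))
    return worst

def off_first_column_row_sums(a):
    """Row sums of absolute values, leaving out the first column, as in (5.9)."""
    return [sum(abs(x) for x in row[1:]) for row in a]

if __name__ == "__main__":
    a = haar_matrix(3)
    print(max_orthonormality_defect(a))
    print(off_first_column_row_sums(a), 1 + math.sqrt(2))
```

For $n = 3$ every row sum equals $2^{-1/2} + 2^{-1} + 2^{-3/2}$, in agreement with the middle expression of (5.9).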

This implies the following:


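Corollary 5.2.5. If $\big( (x_i, x_i^*) : i = 0, 1, \ldots, 2^n - 1 \big)$ is a biorthogonal sequence of length $2^n$ in a Banach space $X$ and we let

\[ e_k = \sum_{j=0}^{2^n - 1} a^{(n)}_{(k,j)} x_j \quad \text{and} \tag{5.10} \]

\[ e_k^* = \sum_{j=0}^{2^n - 1} a^{(n)}_{(k,j)} x_j^* \quad \text{for } k = 0, 1, \ldots, 2^n - 1, \tag{5.11} \]

then

\[ \max_{0 \le k < 2^n} \|e_k\| < (1 + \sqrt{2}) \max_{1 \le k < 2^n} \|x_k\| + 2^{-n/2} \|x_0\|, \tag{5.12} \]

\[ \max_{0 \le k < 2^n} \|e_k^*\| < (1 + \sqrt{2}) \max_{1 \le k < 2^n} \|x_k^*\| + 2^{-n/2} \|x_0^*\|, \tag{5.13} \]

\[ \big( (e_j, e_j^*) : j = 0, 1, \ldots, 2^n - 1 \big) \text{ is biorthogonal}, \tag{5.14} \]

\[ \mathrm{span}(e_j : 0 \le j < 2^n) = \mathrm{span}(x_j : 0 \le j < 2^n) \quad \text{and} \tag{5.15} \]

\[ \mathrm{span}(e_j^* : 0 \le j < 2^n) = \mathrm{span}(x_j^* : 0 \le j < 2^n). \tag{5.16} \]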

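Proof of Theorem 5.2.2. Let $\delta > 0$ and put $M = 2 + \delta$. We start with a fundamental sequence $(z_i) \subset X$ and a $w^*$-dense sequence $(z_i^*) \subset B_{X^*}$ ($(B_{X^*}, w^*)$ is separable if $X$ is norm separable), which we choose norm dense if $X^*$ is norm separable. Then we use Lemma 5.2.4 to choose a norming (resp. shrinking) M-basis $\big( (x_n, x_n^*) : n \in \mathbb{N} \big)$ of $X$ which satisfies, for $N$ being the odd numbers, the conditions (5.7) and (5.8). Without loss of generality we assume that $\|x_n\| = 1$ for $n \in \mathbb{N}$. Now we will define a reordering $\big( (\tilde{x}_n, \tilde{x}_n^*) : n \in \mathbb{N}_0 \big)$ of $\big( (x_n, x_n^*) : n \in \mathbb{N} \big)$ as follows.

By induction we choose for $\ell \in \mathbb{N}$ a number $m_\ell \in \mathbb{N}$, define $q_\ell = \sum_{j=1}^{\ell} 2^{m_j}$ and $q_0 = 0$, and choose

\[ (\tilde{x}_{q_{\ell-1}}, \tilde{x}_{q_{\ell-1}}^*),\ (\tilde{x}_{q_{\ell-1}+1}, \tilde{x}_{q_{\ell-1}+1}^*),\ \ldots,\ (\tilde{x}_{q_\ell - 1}, \tilde{x}_{q_\ell - 1}^*) \]

as follows.

Assume that for all $0 \le r < \ell$, $\ell \ge 1$, the numbers $m_r$ and the pairs $(\tilde{x}_0, \tilde{x}_0^*), (\tilde{x}_1, \tilde{x}_1^*), \ldots, (\tilde{x}_{q_r - 1}, \tilde{x}_{q_r - 1}^*)$ have been chosen. Put

\[ s_\ell = \min\big\{ s : (x_{2s}, x_{2s}^*) \notin \{ (\tilde{x}_t, \tilde{x}_t^*) : t \le q_{\ell-1} - 1 \} \big\} \]

(recall that $\big( (x_{2s}, x_{2s}^*) : s \in \mathbb{N} \big)$ are the elements of $\big( (x_s, x_s^*) : s \in \mathbb{N} \big)$ for which we do not control the norm) and choose $m_\ell \in \mathbb{N}$ so that

\[ \big[ (1 + \sqrt{2}) + 2^{-m_\ell/2} \big] \cdot \big[ (1 + \sqrt{2}) M + \|x_{2 s_\ell}^*\| 2^{-m_\ell/2} \big] < (1 + \sqrt{2})^2 M + \delta. \]

Then let $(\tilde{x}_{q_{\ell-1}}, \tilde{x}_{q_{\ell-1}}^*) = (x_{2 s_\ell}, x_{2 s_\ell}^*)$, while $(\tilde{x}_{q_{\ell-1}+1}, \tilde{x}_{q_{\ell-1}+1}^*), \ldots, (\tilde{x}_{q_\ell - 1}, \tilde{x}_{q_\ell - 1}^*)$ consist of the elements of $\big( (x_{2t-1}, x_{2t-1}^*) : t \in \mathbb{N} \big)$ which are not in the set $\{ (\tilde{x}_t, \tilde{x}_t^*) : t \le q_{\ell-1} - 1 \}$ and have the lowest $2^{m_\ell} - 1$ indices.

By that choice we made sure that all elements of $\big( (x_t, x_t^*) : t \in \mathbb{N} \big)$ appear exactly once in the sequence $\big( (\tilde{x}_t, \tilde{x}_t^*) : t \in \mathbb{N}_0 \big)$.

Then we apply Corollary 5.2.5 and define for $k = 0, 1, 2, \ldots, 2^{m_\ell} - 1$

\[ e_{q_{\ell-1}+k} = \sum_{j=0}^{2^{m_\ell} - 1} a^{(m_\ell)}_{(k,j)} \tilde{x}_{q_{\ell-1}+j} \quad \text{and} \quad e_{q_{\ell-1}+k}^* = \sum_{j=0}^{2^{m_\ell} - 1} a^{(m_\ell)}_{(k,j)} \tilde{x}_{q_{\ell-1}+j}^*. \]

It follows then from (5.12) and (5.13) that for $k = 0, 1, 2, \ldots, 2^{m_\ell} - 1$

\[ \|e_{q_{\ell-1}+k}\| \cdot \|e_{q_{\ell-1}+k}^*\| \le (1 + \sqrt{2})^2 M + \delta. \]

Choosing $\delta > 0$ small enough we can ensure that $(1 + \sqrt{2})^2 M + \delta < 2(1 + \sqrt{2})^2 + \varepsilon$. Since $\big( (x_n, x_n^*) : n \in \mathbb{N} \big)$ is a norming M-basis, it follows from (5.14), (5.15) and (5.16) that $\big( (e_n, e_n^*) : n \in \mathbb{N}_0 \big)$ is a norming M-basis, which is shrinking if $\big( (x_n, x_n^*) : n \in \mathbb{N} \big)$ is shrinking.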


Chapter 6

Appendix B: Some facts about Lp[0, 1] and Lp(R)

6.1 The Haar basis and Wavelets

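We recall the definition of the Haar basis of $L_p[0,1]$. Let

\[ T = \{ (n, j) : n \in \mathbb{N}_0,\ j = 0, 1, \ldots, 2^n - 1 \} \cup \{0\}. \]

Let $1 \le p < \infty$ be fixed. We define the Haar basis $(h_t)_{t \in T}$ and the normalized Haar basis $(h_t^{(p)})_{t \in T}$ in $L_p[0,1]$ as follows.

$h_0 = h_0^{(p)} \equiv 1$ on $[0,1]$, and for $n \in \mathbb{N}_0$ and $j = 0, 1, 2, \ldots, 2^n - 1$ we put

\[ h_{(n,j)} = 1_{[j 2^{-n},\ (j + \frac12) 2^{-n})} - 1_{[(j + \frac12) 2^{-n},\ (j+1) 2^{-n})}, \]

and we let

\[ \Delta_{(n,j)} = \mathrm{supp}(h_{(n,j)}) = \big[ j 2^{-n}, (j+1) 2^{-n} \big), \]

\[ \Delta^+_{(n,j)} = \big[ j 2^{-n}, (j + \tfrac12) 2^{-n} \big), \qquad \Delta^-_{(n,j)} = \big[ (j + \tfrac12) 2^{-n}, (j+1) 2^{-n} \big). \]

We let $h^{(\infty)}_{(n,j)} = h_{(n,j)}$, and for $1 \le p < \infty$

\[ h^{(p)}_{(n,j)} = \frac{h_{(n,j)}}{\|h_{(n,j)}\|_p} = 2^{n/p} \Big( 1_{[j 2^{-n},\ (j + \frac12) 2^{-n})} - 1_{[(j + \frac12) 2^{-n},\ (j+1) 2^{-n})} \Big). \]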
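The normalization factor $2^{n/p}$ and the mutual orthogonality of the Haar functions can be checked exactly on a dyadic grid. The sketch below is only an illustration with hypothetical helper names; it relies on the fact that $h_{(n,j)}$ is constant on dyadic intervals of length $2^{-L}$ whenever $L > n$, so that integrals reduce to finite sums.

```python
from fractions import Fraction

def haar(n, j, t):
    """Value of the unnormalized Haar function h_(n,j) at the point t."""
    u = (2 ** n) * t - j
    if 0 <= u < Fraction(1, 2):
        return 1
    if Fraction(1, 2) <= u < 1:
        return -1
    return 0

def integrate(f, level):
    """Exact integral over [0, 1) of a function constant on dyadic intervals of length 2^-level."""
    step = Fraction(1, 2 ** level)
    return sum(f(i * step) for i in range(2 ** level)) * step

if __name__ == "__main__":
    # |h_(n,j)|^p = |h_(n,j)| for every p, so ||h_(n,j)||_p^p = 2^-n and ||h_(n,j)||_p = 2^(-n/p).
    print(integrate(lambda t: abs(haar(2, 1, t)), 4))               # 2^-2
    # Distinct Haar functions are orthogonal in L_2[0, 1].
    print(integrate(lambda t: haar(1, 0, t) * haar(2, 1, t), 4))    # 0
    print(integrate(lambda t: haar(0, 0, t) * haar(1, 1, t), 4))    # 0
```

Since $|h_{(n,j)}|$ only takes the values 0 and 1, the first integral is the measure of $\Delta_{(n,j)}$, which is why the factor $2^{n/p}$ normalizes $h_{(n,j)}$ in $L_p[0,1]$.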

Theorem 6.1.1. [Schl, Theorems 3.2.2, 5.5.1]

We order $(h_t^{(p)} : t \in T)$ into a sequence $(h_n : n \in \mathbb{N})$, with $h_1 = h_0^{(p)}$, and the property that if $2 \le m < n$, then either $\mathrm{supp}(h_n) \subset \mathrm{supp}(h_m)$ or $\mathrm{supp}(h_m) \cap \mathrm{supp}(h_n) = \emptyset$. Then $(h_n)$ is a monotone basis for $L_p[0,1]$.


For $1 < p < \infty$, $(h_t^{(p)} : t \in T)$ is an unconditional basis for $L_p[0,1]$, but it is not unconditional for $p = 1$. In fact $L_1[0,1]$ does not embed into a Banach space with an unconditional basis.

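Define $h = 1_{[0,1/2)} - 1_{[1/2,1)}$. Then we can write for $n \in \mathbb{N}_0$ and $j = 0, 1, \ldots, 2^n - 1$

\[ h_{(n,j)}(t) = h(2^n t - j) \text{ for } t \in [0,1], \quad \text{and} \quad h^{(p)}_{(n,j)}(t) = 2^{n/p} h(2^n t - j) \text{ for } t \in [0,1]. \]

We define now for all $n \in \mathbb{Z}$ and all $j \in \mathbb{Z}$ a function $h_{(n,j)}$ as follows:

\[ h_{(n,j)}(t) = h(2^n t - j), \quad \text{for } t \in \mathbb{R}, \text{ and} \tag{6.1} \]

\[ h^{(p)}_{(n,j)}(t) = 2^{n/p} h(2^n t - j), \quad \text{for } t \in \mathbb{R}. \tag{6.2} \]

For all $n, j \in \mathbb{Z}$ we have

\[ \mathrm{supp}(h_{(n,j)}) := \{ t : h_{(n,j)}(t) \ne 0 \} = \{ t : 0 \le 2^n t - j < 1 \} = [j 2^{-n}, (j+1) 2^{-n}) =: \Delta_{(n,j)} \]

and

\[ \{ h_{(n,j)} > 0 \} = \big[ j 2^{-n}, (j + \tfrac12) 2^{-n} \big) =: \Delta^+_{(n,j)}, \]

\[ \{ h_{(n,j)} < 0 \} = \big[ (j + \tfrac12) 2^{-n}, (j+1) 2^{-n} \big) =: \Delta^-_{(n,j)}. \]

We note for $(m, i)$ and $(n, j)$ in $\mathbb{Z} \times \mathbb{Z}$ that

\[ \text{either } \Delta_{(m,i)} \subset \Delta_{(n,j)} \text{ or } \Delta_{(n,j)} \subset \Delta_{(m,i)} \text{ or } \Delta_{(m,i)} \cap \Delta_{(n,j)} = \emptyset. \tag{6.3} \]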

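Theorem 6.1.2. Let $1 \le p < \infty$.

1. $\{ h^{(p)}_{(n,j)} : n \in \mathbb{N}_0,\ j \in \mathbb{Z} \} \cup \{ 1_{[j,j+1)} : j \in \mathbb{Z} \}$, when appropriately ordered, is a monotone basis for $L_p(\mathbb{R})$, which is unconditional if $1 < p < \infty$.

2. $\{ h^{(p)}_{(n,j)} : n \in \mathbb{Z},\ j \in \mathbb{Z} \}$ is an unconditional basis for $L_p(\mathbb{R})$ if $1 < p < \infty$.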

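Remark. Note that the second part of Theorem 6.1.2 is wrong for $p = 1$. Indeed, the integral functional

\[ I : L_1(\mathbb{R}) \to \mathbb{R}, \quad f \mapsto \int_{-\infty}^{\infty} f(x)\, dx, \]

is a bounded linear functional on $L_1(\mathbb{R})$ which is not identical to the zero functional, but for all $n, j \in \mathbb{Z}$ the function $h^{(1)}_{(n,j)}$ is in the kernel of $I$, and thus the span of the $h^{(1)}_{(n,j)}$ cannot be dense in $L_1(\mathbb{R})$. The same argumentation is invalid for $1 < p < \infty$, since $I$ is not a bounded functional on $L_p(\mathbb{R})$ if $p > 1$.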


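Proof. First note that $L_p(\mathbb{R})$ is isometrically isomorphic to $\big( \oplus_{i \in \mathbb{Z}} L_p[i, i+1] \big)_{\ell_p}$ via the map

\[ L_p(\mathbb{R}) \to \big( \oplus_{i \in \mathbb{Z}} L_p[i, i+1] \big)_{\ell_p}, \quad f \mapsto (f|_{[i,i+1]} : i \in \mathbb{Z}), \]

and that for $i \in \mathbb{Z}$, by Theorem 6.1.1, the shifted Haar basis

\[ \big( h^{(p,i)}_{(n,j)} : n \in \mathbb{N}_0,\ j = 0, 1, \ldots, 2^n - 1 \big) \cup \{ 1_{[i,i+1)} \} = \big( h^{(p)}_{(n,j)} : n \in \mathbb{N}_0,\ j = 2^n i, 2^n i + 1, \ldots, 2^n (i+1) - 1 \big) \cup \{ 1_{[i,i+1)} \} \]

is a monotone basis for $L_p[i, i+1]$, if ordered appropriately, which is unconditional if $1 < p < \infty$. Since $L_p(\mathbb{R})$ is the 1-unconditional sum of the spaces $L_p[i, i+1)$, $i \in \mathbb{Z}$, the union of $\big( h^{(p,i)}_{(n,j)} : n \in \mathbb{N}_0,\ j = 0, 1, \ldots, 2^n - 1 \big) \cup \{ 1_{[i,i+1)} \}$ over all $i \in \mathbb{Z}$ is a monotone basis of $L_p(\mathbb{R})$, if ordered appropriately, which is unconditional if $p > 1$.

In order to show (2) we assume that $B \subset \mathbb{Z} \times \mathbb{Z}$ is finite and $A \subset B$, and then verify condition (5.3). Since $B$ is finite, there is $n_1 \in \mathbb{N}$ so that

\[ \Delta_{(n,j)} \subset \Delta_{(-n_1, 0)} = [0, 2^{n_1}) \quad \text{for all } (n,j) \in B^+ = \{ (m,i) \in \mathbb{Z} \times \mathbb{N}_0 \} \cap B, \]

and there is $n_2 \in \mathbb{N}$ so that

\[ \Delta_{(n,j)} \subset \Delta_{(-n_2, -1)} = [-2^{n_2}, 0) \quad \text{for all } (n,j) \in B^- = \{ (m,i) \in \mathbb{Z} \times (-\mathbb{N}) \} \cap B. \]

Since the $L_p$-norm is shift invariant, it is enough to assume that $B^- = \emptyset$ and $B = B^+$.

Consider the map (rescaling)

\[ \varphi : \Delta_{(0,0)} = [0, 1) \to \Delta_{(-n_1, 0)}, \quad t \mapsto t\, 2^{n_1}, \]

and the map

\[ T : L_p(\Delta_{(-n_1, 0)}) \to L_p[0,1], \quad f \mapsto 2^{n_1/p} f \circ \varphi, \]

which is an isometry between $L_p(\Delta_{(-n_1, 0)})$ and $L_p[0,1]$, mapping the family

\[ \{ h^{(p)}_{(n,j)} : \Delta_{(n,j)} \subset \Delta_{(-n_1, 0)} \} \cup \{ 1_{\Delta_{(-n_1, 0)}} \} \]

into the Haar basis of $L_p[0,1]$ (up to a scalar multiple of the constant function)

\[ \{ h^{(p)}_{(n,j)} : \Delta_{(n,j)} \subset \Delta_{(0,0)} \} \cup \{ 1_{[0,1]} \}. \]

This proves with Theorem 6.1.1 that $\{ h^{(p)}_{(n,j)} : \Delta_{(n,j)} \subset \Delta_{(-n_1, 0)} \} \cup \{ 1_{\Delta_{(-n_1, 0)}} \}$ is unconditional, and therefore (5.3) is satisfied.

It is left to show that the closed linear span of $\{ h^{(p)}_{(n,j)} : n \in \mathbb{Z},\ j \in \mathbb{Z} \}$ is all of $L_p(\mathbb{R})$. By part (1) and using shifts, it is enough to show that $1_{[0,1)}$ is in the closed linear span of $\{ h^{(p)}_{(n,j)} : n \in \mathbb{Z},\ j \in \mathbb{Z} \}$.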


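Notice that for $N \in \mathbb{N}$ we have

\[ \begin{aligned} \sum_{n=0}^{N} 2^{-(n+1)} \big( 1_{[0, 2^n)} - 1_{[2^n, 2^{n+1})} \big) &= 1_{[0,1)} \sum_{n=0}^{N} 2^{-(n+1)} \\ &\quad + 1_{[1,2)} \Big( \sum_{n=1}^{N} 2^{-(n+1)} - 2^{-1} \Big) \\ &\quad + 1_{[2,4)} \Big( \sum_{n=2}^{N} 2^{-(n+1)} - 2^{-2} \Big) \\ &\quad \ \vdots \\ &\quad + 1_{[2^{N-1}, 2^N)} \big( 2^{-(N+1)} - 2^{-N} \big) \\ &\quad - 1_{[2^N, 2^{N+1})} 2^{-(N+1)} \\ &= 1_{[0,1)} \big( 1 - 2^{-(N+1)} \big) - 1_{[1, 2^N)} 2^{-(N+1)} - 1_{[2^N, 2^{N+1})} 2^{-(N+1)}. \end{aligned} \]

For the last equality note that for $k = 1, 2, \ldots, N$

\[ \sum_{n=k}^{N} 2^{-n-1} - 2^{-k} = 2^{-k} - 2^{-(N+1)} - 2^{-k} = -2^{-(N+1)}. \]

The difference between this sum and $1_{[0,1)}$ equals $-2^{-(N+1)} 1_{[0, 2^{N+1})}$, whose $L_p$-norm is $2^{-(N+1)(1 - 1/p)}$. Since we assumed that $p > 1$, it follows that

\[ L_p\text{-}\lim_{N \to \infty} \sum_{n=0}^{N} 2^{-(n+1)} \big( 1_{[0, 2^n)} - 1_{[2^n, 2^{n+1})} \big) = 1_{[0,1)}, \]

which finishes the proof of our claim.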
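The computation above can be verified with exact arithmetic. The sketch below (hypothetical function names) evaluates the partial sum $S_N = \sum_{n=0}^{N} 2^{-(n+1)} ( 1_{[0,2^n)} - 1_{[2^n,2^{n+1})} )$ on the unit intervals $[k, k+1)$, on which it is constant, and computes the $L_p$-norm of $S_N - 1_{[0,1)}$. It also shows why the argument needs $p > 1$: for $p = 1$ the error norm stays equal to 1.

```python
from fractions import Fraction

def partial_sum_values(N):
    """Values of S_N on the unit intervals [k, k+1), k = 0, ..., 2^(N+1) - 1."""
    values = [Fraction(0)] * (2 ** (N + 1))
    for n in range(N + 1):
        c = Fraction(1, 2 ** (n + 1))
        for k in range(2 ** n):                   # S_N gains c on [0, 2^n)
            values[k] += c
        for k in range(2 ** n, 2 ** (n + 1)):     # and loses c on [2^n, 2^(n+1))
            values[k] -= c
    return values

def error_norm(N, p):
    """L_p norm of S_N - 1_[0,1); each unit interval has measure 1."""
    values = partial_sum_values(N)
    values[0] -= 1
    return float(sum(abs(v) ** p for v in values)) ** (1.0 / p)

if __name__ == "__main__":
    for N in (2, 4, 8):
        print(N, error_norm(N, 1), error_norm(N, 2))
```

For $p = 2$ the printed errors decay like $2^{-(N+1)/2}$, while for $p = 1$ they do not decay at all, in line with the Remark after Theorem 6.1.2.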

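Definition 6.1.3. A function $\Psi \in L_2(\mathbb{R})$ is called a wavelet if the family $(\Psi_{(n,j)} : n, j \in \mathbb{Z})$, defined by

\[ \Psi_{(n,j)}(t) = 2^{n/2} \Psi(2^n t - j), \quad \text{for } t \in \mathbb{R} \text{ and } n, j \in \mathbb{Z}, \]

is an orthonormal basis of $L_2(\mathbb{R})$.

Definition 6.1.4. A Multi Resolution Analysis of $L_2(\mathbb{R})$ (MRA) is a sequence of closed subspaces $(V_n : n \in \mathbb{Z})$ of $L_2(\mathbb{R})$ such that

(MRA1) $\ldots \subset V_{-2} \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \ldots$,

(MRA2) $\overline{\bigcup_{n \in \mathbb{Z}} V_n} = L_2(\mathbb{R})$,

(MRA3) $\bigcap_{n \in \mathbb{Z}} V_n = \{0\}$,

(MRA4) $V_n = \{ f(2^n (\cdot)) : f \in V_0 \}$, for $n \in \mathbb{Z}$,

(MRA5) there is a compactly supported function $\Phi \in V_0$ so that $(\Phi((\cdot) - m) : m \in \mathbb{Z})$ is an orthonormal basis of $V_0$.

In this case we call $\Phi$ a scaling function of the MRA $(V_n : n \in \mathbb{Z})$.

Note that (MRA5) implies

(MRA6) $V_0$ (and thus any $V_n$) is translation invariant by integer shifts, i.e.

\[ f \in V_0 \iff f((\cdot) - j) \in V_0, \quad \text{for all } j \in \mathbb{Z}. \]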

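For $h \in \mathbb{R}$ we put

\[ T_h : L_2(\mathbb{R}) \to L_2(\mathbb{R}), \quad f \mapsto f((\cdot) - h) \quad \text{(shift to the right by } h \text{ units)}, \]

\[ J_h : L_2(\mathbb{R}) \to L_2(\mathbb{R}), \quad f \mapsto 2^{h/2} f(2^h (\cdot)) \quad \text{(scaling)}. \]

Remark. For $h \in \mathbb{R}$ the operators $T_h$ and $J_h$ are isometries and

\[ T_h^{-1} = T_{-h} \quad \text{and} \quad J_h^{-1} = J_{-h}. \tag{6.4} \]

We can rephrase (MRA4) and (MRA6) equivalently as follows:

(MRA4') $V_n = J_n(V_0)$, for $n \in \mathbb{Z}$, and

(MRA6') $V_0 = T_n(V_0)$ for all $n \in \mathbb{Z}$.

Finally note that (MRA4), (MRA5) and (MRA6) imply that for $j \in \mathbb{Z}$

(MRA5') $\{ 2^{j/2} \Phi(2^j (\cdot) - k) : k \in \mathbb{Z} \}$ is an orthonormal basis of $V_j$,

and note that $T_h$ as well as $J_h$ are both isometries on $L_2(\mathbb{R})$.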

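Example 6.1.5. Take $\Phi = 1_{[0,1)}$, and for $n \in \mathbb{Z}$ put

\[ V_n = \overline{\mathrm{span}}(J_n \circ T_j(\Phi) : j \in \mathbb{Z}) = \overline{\mathrm{span}}(1_{[j 2^{-n}, (j+1) 2^{-n})} : j \in \mathbb{Z}). \]

(Note that $1_{[j 2^{-n}, (j+1) 2^{-n})}(2^{-n} (\cdot)) = 1_{[j, j+1)}(\cdot) \in V_0$.)

Then $(V_n)$ is an MRA.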

We now discuss how to produce a wavelet $\Psi$ starting with an MRA $(V_n : n \in \mathbb{Z})$ with scaling function $\Phi$.

We denote the orthogonal complement of $V_j$ inside $V_{j+1}$ by $W_j$; this means that any $f \in V_{j+1}$ can be written as $f = g + h$ with $g \in V_j$ and $h \in W_j$ and $\|f\|_2 = \sqrt{\|g\|_2^2 + \|h\|_2^2}$. We write $V_{j+1} = W_j \oplus V_j$. Since $J_j$ is a unitary operator (preserving orthogonality), we deduce that

\[ V_j \oplus W_j = V_{j+1} = J_j(V_1) = J_j(V_0 \oplus W_0) = J_j(V_0) \oplus J_j(W_0) = V_j \oplus J_j(W_0). \]

Since the orthogonal complement of a subspace $W$ of a Hilbert space $H$ is unique (namely $W^{\perp} = \{ h \in H : \langle h, w \rangle = 0 \text{ for all } w \in W \}$), we obtain

\[ W_j = J_j(W_0) \quad \text{for all } j \in \mathbb{Z}. \tag{6.5} \]

Next we observe that

\[ W_j \text{ and } W_i \text{ are orthogonal if } j \ne i. \tag{6.6} \]

Indeed, assume w.l.o.g. that $i < j$. Then $W_i \subset V_{i+1} \subset V_j$, and $W_j$ is orthogonal to $V_j$.

From (MRA2) we deduce that every $f \in L_2(\mathbb{R})$ can be approximated arbitrarily well by some $g \in V_n$ for large enough $n \in \mathbb{N}$, and (MRA3) yields that

\[ \lim_{k \to \infty} P_{V_{-k}}(f) = 0, \]

where $P_{V_{-k}}$ is the orthogonal projection of $L_2(\mathbb{R})$ onto $V_{-k}$. Thus, choosing $k \in \mathbb{N}$ large enough, we can approximate $g$ arbitrarily well by an element $h$ in the orthogonal complement of $V_{-k}$ inside $V_n$. But since

\[ V_n = W_{n-1} \oplus V_{n-1} = W_{n-1} \oplus W_{n-2} \oplus V_{n-2} = \ldots = (W_{n-1} \oplus W_{n-2} \oplus \ldots \oplus W_{-k}) \oplus V_{-k}, \]

it follows that

\[ h \in W_{n-1} \oplus W_{n-2} \oplus \ldots \oplus W_{-k}. \]

As a consequence we deduce that $L_2(\mathbb{R})$ is the orthogonal sum of the $W_j$, $j \in \mathbb{Z}$. Together with our observations (6.5) and (6.6) this yields the following result.

Proposition 6.1.6. Every $\Psi \in W_0$ for which $\{ \Psi((\cdot) - j) : j \in \mathbb{Z} \}$ is an orthonormal basis of $W_0$ is a wavelet. We say in that case that $\Psi$ is the wavelet associated to the MRA $(V_n)$.

Example 6.1.7. We consider Example 6.1.5. Then

\[ W_0 = \{ f \in V_1 : \langle g, f \rangle = 0 \text{ for all } g \in V_0 \} = \Big\{ f \in V_1 : \int_{j}^{j+1} f(t)\, dt = 0 \text{ for all } j \in \mathbb{Z} \Big\}. \]

Thus we could take

\[ \Psi = 1_{[0,1/2)} - 1_{[1/2,1)} \]

as a wavelet associated to $(V_n)$.

We would like to explain how to construct the wavelet $\Psi$ associated to an MRA $(V_n : n \in \mathbb{Z})$ with scaling function $\Phi$.

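Theorem 6.1.8. Suppose that $(V_n : n \in \mathbb{Z})$ is an MRA with scaling function $\Phi$ which is integrable and satisfies

\[ \int_{-\infty}^{\infty} \Phi(t)\, dt \ne 0. \]

1. The following Scaling Relation holds:

\[ \Phi = \sum_{k \in \mathbb{Z}} p_k \Phi(2(\cdot) - k) \quad \text{with} \quad p_k = 2 \int_{-\infty}^{\infty} \Phi(x) \Phi(2x - k)\, dx. \tag{6.7} \]

More generally

\[ \Phi(2^{j-1}(\cdot) - l) = \sum_{k \in \mathbb{Z}} p_{k-2l} \Phi(2^j (\cdot) - k) \quad \text{for all } j, l \in \mathbb{Z}. \tag{6.8} \]

2. The sequence $(p_k : k \in \mathbb{Z})$ satisfies

\[ \sum_{k \in \mathbb{Z}} p_{k-2l}\, p_k = 2 \delta_{0,l} \quad \text{for all } l \in \mathbb{Z}, \tag{6.9} \]

\[ \sum_{k \in 2\mathbb{Z}} p_k = \sum_{k \in 2\mathbb{Z}+1} p_k = 1. \tag{6.10} \]

3. The function $\Psi$ defined by

\[ \Psi = \sum_{k \in \mathbb{Z}} (-1)^k p_{1-k} \Phi(2(\cdot) - k) \tag{6.11} \]

is a wavelet associated to $(V_n : n \in \mathbb{Z})$.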
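The identities (6.9) and (6.10) and the recipe (6.11) can be tried out on concrete coefficient sequences. The sketch below uses two choices that are assumptions of this illustration rather than part of the theorem: the Haar case $\Phi = 1_{[0,1)}$ of Example 6.1.5, where $p_0 = p_1 = 1$, and the well-known four-coefficient Daubechies filter, normalized so that $\sum_k p_k = 2$. Function names are hypothetical and the coefficients are taken to be real.

```python
import math

def autocorrelation(p, l):
    """Left-hand side of (6.9): sum over k of p_{k-2l} * p_k (real coefficients)."""
    return sum(p.get(k - 2 * l, 0.0) * pk for k, pk in p.items())

def parity_sums(p):
    """The two sums in (6.10): over even and over odd indices."""
    even = sum(pk for k, pk in p.items() if k % 2 == 0)
    odd = sum(pk for k, pk in p.items() if k % 2 != 0)
    return even, odd

def wavelet_coefficients(p):
    """Coefficients q_k = (-1)^k p_{1-k}, so that Psi = sum_k q_k Phi(2x - k) as in (6.11)."""
    q = {}
    for k, pk in p.items():
        idx = 1 - k
        q[idx] = pk if idx % 2 == 0 else -pk
    return q

# Haar scaling function 1_[0,1): Phi(x) = Phi(2x) + Phi(2x - 1).
haar = {0: 1.0, 1: 1.0}

# Daubechies four-coefficient filter, normalized to sum 2 (assumed example).
s3 = math.sqrt(3.0)
daubechies4 = {0: (1 + s3) / 4, 1: (3 + s3) / 4, 2: (3 - s3) / 4, 3: (1 - s3) / 4}

if __name__ == "__main__":
    for name, p in (("haar", haar), ("daubechies4", daubechies4)):
        print(name, [autocorrelation(p, l) for l in (-1, 0, 1)], parity_sums(p))
    print(wavelet_coefficients(haar))
```

For the Haar coefficients the recipe gives $\Psi = \Phi(2(\cdot)) - \Phi(2(\cdot) - 1) = 1_{[0,1/2)} - 1_{[1/2,1)}$, which is the wavelet of Example 6.1.7.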

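Proof. Since $\Phi \in V_0 \subset V_1$ and since $(2^{1/2} \Phi(2(\cdot) - k) : k \in \mathbb{Z})$ is an orthonormal basis of $V_1$, we can write $\Phi$ as in (6.7). For $j, l \in \mathbb{Z}$ it follows therefore that

\[ \Phi(2^{j-1}(\cdot) - l) = \sum_{k \in \mathbb{Z}} p_k \Phi\big( 2(2^{j-1}(\cdot) - l) - k \big) = \sum_{k \in \mathbb{Z}} p_k \Phi\big( 2^j(\cdot) - 2l - k \big) = \sum_{k \in \mathbb{Z}} p_{k-2l} \Phi(2^j(\cdot) - k). \]

Since $(\Phi((\cdot) - l) : l \in \mathbb{Z})$ is an orthonormal sequence we obtain from (6.7) and (6.8) (with $j = 1$) that

\[ \begin{aligned} \delta_{(0,l)} &= \langle \Phi((\cdot) - l), \Phi \rangle \\ &= \sum_{m,k \in \mathbb{Z}} p_{m-2l}\, p_k \langle \Phi(2(\cdot) - m), \Phi(2(\cdot) - k) \rangle \\ &= \sum_{k \in \mathbb{Z}} p_{k-2l}\, p_k \int_{-\infty}^{\infty} |\Phi(2t)|^2\, dt = \frac{1}{2} \sum_{k \in \mathbb{Z}} p_{k-2l}\, p_k, \end{aligned} \]

which proves (6.9) and implies (replacing $l$ by $-l$) that

\[ \begin{aligned} 2 &= 2 \sum_{l \in \mathbb{Z}} \delta_{(0,-l)} = \sum_{l,k \in \mathbb{Z}} p_{k+2l}\, p_k \\ &= \sum_{l \in \mathbb{Z}} \sum_{k \in \mathbb{Z}} p_{2k+2l}\, p_{2k} + \sum_{l \in \mathbb{Z}} \sum_{k \in \mathbb{Z}} p_{2k+1+2l}\, p_{2k+1} \\ &= \sum_{k \in \mathbb{Z}} p_{2k} \Big( \sum_{l \in \mathbb{Z}} p_{2k+2l} \Big) + \sum_{k \in \mathbb{Z}} p_{2k+1} \Big( \sum_{l \in \mathbb{Z}} p_{2k+1+2l} \Big) \\ &= \sum_{k \in \mathbb{Z}} p_{2k} \Big( \sum_{l \in \mathbb{Z}} p_{2l} \Big) + \sum_{k \in \mathbb{Z}} p_{2k+1} \Big( \sum_{l \in \mathbb{Z}} p_{2l+1} \Big) \\ &= \Big| \sum_{k \in \mathbb{Z}} p_{2k} \Big|^2 + \Big| \sum_{k \in \mathbb{Z}} p_{2k+1} \Big|^2 =: A^2 + B^2. \end{aligned} \]

Moreover, if we integrate the scaling relation (6.7) we obtain

\[ \int_{-\infty}^{\infty} \Phi(x)\, dx = \frac{1}{2} \sum_{k \in \mathbb{Z}} p_k \int_{-\infty}^{\infty} \Phi(x)\, dx. \tag{6.12} \]

By our assumption on the integral of $\Phi$, we can cancel the integral on both sides of (6.12) and obtain that

\[ A + B = \sum_{k \in \mathbb{Z}} p_k = 2. \]

The only solution of $A^2 + B^2 = 2$ and $A + B = 2$ is $A = B = 1$, which proves (6.10) (draw a picture!).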

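In order to prove (3) we first note that $\Psi \in V_1$. Secondly, we note that the sequence $(\Psi((\cdot) - k) : k \in \mathbb{Z})$ is orthonormal. Indeed, it follows from (6.8) and (6.9) for $l, m \in \mathbb{Z}$ that

\[ \begin{aligned} \langle \Psi((\cdot) - l), \Psi((\cdot) - m) \rangle &= \Big\langle \sum_{k \in \mathbb{Z}} (-1)^k p_{1-k} \Phi(2((\cdot) - l) - k),\ \sum_{k \in \mathbb{Z}} (-1)^k p_{1-k} \Phi(2((\cdot) - m) - k) \Big\rangle \\ &= \Big\langle \sum_{k \in \mathbb{Z}} (-1)^k p_{1-k+2l} \Phi(2(\cdot) - k),\ \sum_{k \in \mathbb{Z}} (-1)^k p_{1-k+2m} \Phi(2(\cdot) - k) \Big\rangle \\ &= \frac{1}{2} \sum_{k \in \mathbb{Z}} p_{1-k+2l}\, p_{1-k+2m} \\ &= \frac{1}{2} \sum_{k \in \mathbb{Z}} p_{1-k+2l-2m}\, p_{1-k} = \delta_{(l,m)}. \end{aligned} \]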

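Thirdly, it follows from (6.8) for any $l, m \in \mathbb{Z}$ that

\[ \begin{aligned} \langle \Phi((\cdot) - l), \Psi((\cdot) - m) \rangle &= \Big\langle \sum_{k \in \mathbb{Z}} p_{k-2l} \Phi(2(\cdot) - k),\ \sum_{k \in \mathbb{Z}} (-1)^k p_{1-k} \Phi(2((\cdot) - m) - k) \Big\rangle \\ &= \Big\langle \sum_{k \in \mathbb{Z}} p_{k-2l} \Phi(2(\cdot) - k),\ \sum_{k \in \mathbb{Z}} (-1)^k p_{1-k+2m} \Phi(2(\cdot) - k) \Big\rangle \\ &= \frac{1}{2} \sum_{k \in \mathbb{Z}} (-1)^k p_{k-2l}\, p_{1-k+2m} \\ &= \frac{1}{2} \sum_{k \in \mathbb{Z}} (-1)^k p_k\, p_{1-k+r} \quad \text{with } r = 2m - 2l \\ &= \frac{1}{2} \sum_{k \in \mathbb{Z}} p_{2k}\, p_{1-2k+r} - \frac{1}{2} \sum_{k \in \mathbb{Z}} p_{2k+1}\, p_{-2k+r} \\ &= \frac{1}{2} \sum_{k \in \mathbb{Z}} p_{2k}\, p_{1-2k+r} - \frac{1}{2} \sum_{l \in \mathbb{Z}} p_{1-2l+r}\, p_{2l} = 0 \end{aligned} \]

(in the last step substitute $2l = r - 2k$ in the second sum).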

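Finally, in order to show that $(\Psi((\cdot) - k) : k \in \mathbb{Z})$ and $(\Phi((\cdot) - k) : k \in \mathbb{Z})$ span all of $V_1$, we need to show that for given $j \in \mathbb{Z}$ the projection of $\Phi(2(\cdot) - j)$ onto the space spanned by $(\Psi((\cdot) - k) : k \in \mathbb{Z})$ and $(\Phi((\cdot) - k) : k \in \mathbb{Z})$ has the same norm (namely $1/\sqrt{2}$) as $\Phi(2(\cdot) - j)$. Let us denote the projected vector by $\Phi_j$. By the orthonormality of $(\Psi((\cdot) - k) : k \in \mathbb{Z})$ and $(\Phi((\cdot) - k) : k \in \mathbb{Z})$ shown above, we can write

\[ \Phi_j = \sum_{k \in \mathbb{Z}} \big[ a_k \Phi((\cdot) - k) + b_k \Psi((\cdot) - k) \big], \]

with

\[ a_k = \langle \Phi(2(\cdot) - j), \Phi((\cdot) - k) \rangle = \langle \Phi(2(\cdot) + 2k - j), \Phi \rangle = \frac{1}{2} p_{j-2k} \]

and

\[ b_k = \langle \Phi(2(\cdot) - j), \Psi((\cdot) - k) \rangle = \sum_{l \in \mathbb{Z}} (-1)^l \langle \Phi(2(\cdot) - j),\ p_{1-l} \Phi(2(\cdot) - l - 2k) \rangle = \frac{1}{2} (-1)^j p_{1-j+2k} \]

for $k \in \mathbb{Z}$. It follows therefore, using (6.9) with $l = 0$, that

\[ \|\Phi_j\|^2 = \frac{1}{4} \sum_{k \in \mathbb{Z}} |p_{j-2k}|^2 + \frac{1}{4} \sum_{k \in \mathbb{Z}} |p_{1-j+2k}|^2 = \frac{1}{4} \sum_{k \in \mathbb{Z}} |p_k|^2 = \frac{1}{2}. \]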


The proof of the following result goes beyond the scope of these notes, since it requires several tools from harmonic analysis.

Theorem 6.1.9. [Wo1, Theorem 8.13] Assume that $\Psi$ is a wavelet for $L_2(\mathbb{R})$ satisfying the following two conditions for some constant $C > 0$:

\[ |\Psi(x)| \le C (1 + |x|)^{-2} \quad \text{for all } x \in \mathbb{R}, \tag{6.13} \]

\[ \Psi \text{ is differentiable and } |\Psi'(x)| \le C (1 + |x|)^{-2} \quad \text{for all } x \in \mathbb{R}. \tag{6.14} \]

Then for every $1 < p < \infty$ the family $(\Psi^{(p)}_{j,k}) = (2^{j/p} \Psi(2^j (\cdot) - k) : j, k \in \mathbb{Z})$ is a basis of $L_p(\mathbb{R})$ which is isomorphically equivalent to $(h^{(p)}_{(j,k)} : j, k \in \mathbb{Z})$.

Finally, we want to present without proof another basis of $L_p[0,1]$. Recall that $(e^{2\pi i n (\cdot)} : n \in \mathbb{Z})$ is an orthonormal basis of $L_2[0,1]$. A deep theorem by M. Riesz states the following.

Theorem 6.1.10. (cf. [Ka, Chapter II and III]) The sequence of trigonometric polynomials $(t_n : n \in \mathbb{Z})$, with

\[ t_n(\xi) = e^{2\pi i n \xi}, \quad \text{for } \xi \in [0,1], \]

is a Schauder basis of $L_p[0,1]$, $1 < p < \infty$, when ordered as $(t_0, t_1, t_{-1}, t_2, t_{-2}, \ldots)$.

In the next section we prove that for $p \ne 2$ the sequence $(t_n)$ cannot be unconditional.

6.2 Khintchine’s inequality and Applications

Theorem 6.2.1. [Khintchine's Theorem, see Theorem 5.3.1 in [Schl]]
$L_p[0,1]$, $1 \le p \le \infty$, contains subspaces isomorphic to $\ell_2$. If $1 < p < \infty$, then $L_p[0,1]$ contains complemented subspaces isomorphic to $\ell_2$.

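Definition 6.2.2. The Rademacher functions are the functions

\[ r_n : [0,1] \to \mathbb{R}, \quad t \mapsto \mathrm{sign}(\sin 2^n \pi t), \quad \text{whenever } n \in \mathbb{N}. \]

Lemma 6.2.3. [Khintchine inequality, see Lemma 5.3.3 in [Schl]]
For every $p \in [1, \infty)$ there are numbers $0 < A_p \le 1 \le B_p$ so that for any $m \in \mathbb{N}$ and any scalars $(a_j)_{j=1}^{m}$,

\[ A_p \Big( \sum_{j=1}^{m} |a_j|^2 \Big)^{1/2} \le \Big\| \sum_{j=1}^{m} a_j r_j \Big\|_{L_p} \le B_p \Big( \sum_{j=1}^{m} |a_j|^2 \Big)^{1/2}. \tag{6.15} \]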
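For finitely many coefficients the middle term of (6.15) can be computed exactly, which gives a quick way to experiment with the inequality. The sketch below assumes real scalars and uses the fact that $(r_1, \ldots, r_m)$ takes each sign pattern in $\{\pm 1\}^m$ on a set of measure $2^{-m}$; the function name is hypothetical.

```python
from itertools import product

def rademacher_lp_norm(a, p):
    """Exact L_p[0,1] norm of sum_j a_j r_j for real coefficients a_1, ..., a_m."""
    m = len(a)
    total = sum(abs(sum(e * x for e, x in zip(eps, a))) ** p
                for eps in product((1, -1), repeat=m))
    return (total / 2 ** m) ** (1.0 / p)

def l2_norm(a):
    return sum(x * x for x in a) ** 0.5

if __name__ == "__main__":
    a = [3, 4]
    for p in (1, 2, 4):
        print(p, rademacher_lp_norm(a, p) / l2_norm(a))
```

For $p = 2$ the ratio is always 1, because the Rademacher functions are orthonormal in $L_2[0,1]$. For $p = 1$ and $a = (1, 1)$ the ratio is $1/\sqrt{2}$, so necessarily $A_1 \le 1/\sqrt{2}$.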

There is a complex version of the Rademacher functions, namely the sequence $(g_n : n \in \mathbb{N})$ with

\[ g_n(t) = e^{i 2^n \pi t} \quad \text{for } t \in [0,1]. \]


Theorem 6.2.4. [Complex Version of Khintchine's Theorem] For every $p \in [1, \infty)$ there are numbers $0 < A'_p \le 1 \le B'_p$ so that for any $m \in \mathbb{N}$ and any scalars $(a_j)_{j=1}^{m}$,

\[ A'_p \Big( \sum_{j=1}^{m} |a_j|^2 \Big)^{1/2} \le \Big\| \sum_{j=1}^{m} a_j g_j \Big\|_{L_p} \le B'_p \Big( \sum_{j=1}^{m} |a_j|^2 \Big)^{1/2}. \tag{6.16} \]

Moreover, if $1 < p < \infty$ then $(g_j)$ generates a copy of $\ell_2$ inside $L_p[0,1]$ which is complemented.

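Theorem 6.2.5 (The square-function norm). Let $1 \le p < \infty$ and let $(f_n)$ be a $\lambda$-unconditional basic sequence in $L_p[0,1]$ for some $\lambda \ge 1$.

Then there is a constant $C = C(p, \lambda) \ge 1$, depending only on the unconditionality constant of $(f_i)$ and the constants $A_p$ and $B_p$ in Khintchine's Inequality (Lemma 6.2.3), so that for any $g = \sum_{i=1}^{\infty} a_i f_i \in \overline{\mathrm{span}}(f_i : i \in \mathbb{N})$ it follows that

\[ \frac{1}{C} \Big\| \Big( \sum_{i=1}^{\infty} |a_i|^2 |f_i|^2 \Big)^{1/2} \Big\|_p \le \|g\|_p \le C \Big\| \Big( \sum_{i=1}^{\infty} |a_i|^2 |f_i|^2 \Big)^{1/2} \Big\|_p, \]

which means that $\|\cdot\|_p$ is on $\overline{\mathrm{span}}(f_i : i \in \mathbb{N})$ equivalent to the norm

\[ |||g||| = \Big\| \Big( \sum_{i=1}^{\infty} |a_i|^2 |f_i|^2 \Big)^{1/2} \Big\|_p = \Big\| \sum_{i=1}^{\infty} |a_i|^2 |f_i|^2 \Big\|_{p/2}^{1/2}. \]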

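Proof. For two positive numbers $A$ and $B$ and $c > 0$ we write $A \sim_c B$ if $\frac{1}{c} A \le B \le c A$. Let $K_p$ be the Khintchine constant for $L_p$, i.e. the smallest number so that for the Rademacher sequence $(r_n)$

\[ \Big\| \sum_{i=1}^{\infty} a_i r_i \Big\|_p \sim_{K_p} \Big( \sum_{i=1}^{\infty} |a_i|^2 \Big)^{1/2} \quad \text{for } (a_i) \subset \mathbb{K}, \]

and let $C_u$ be the unconditionality constant of $(f_i)$, i.e.

\[ \Big\| \sum_{i=1}^{\infty} \sigma_i a_i f_i \Big\|_p \sim_{C_u} \Big\| \sum_{i=1}^{\infty} a_i f_i \Big\|_p \quad \text{for } (a_i) \subset \mathbb{K} \text{ and } (\sigma_i) \subset \{\pm 1\}. \]

We consider $L_p[0,1]$ in a natural way as a subspace of $L_p[0,1]^2$, with $f(s,t) := f(s)$ for $f \in L_p[0,1]$. Then let $r_n(t) = r_n(s,t)$ be the $n$-th Rademacher function acting on the second coordinate, i.e.

\[ r_n(s,t) = \mathrm{sign}(\sin(2^n \pi t)), \quad (s,t) \in [0,1]^2. \]

It follows from the $C_u$-unconditionality for any $(a_j)_{j=1}^{m} \subset \mathbb{K}$ that

\[ \Big\| \sum_{j=1}^{m} a_j f_j(\cdot) \Big\|_p^p \sim_{C_u^p} \Big\| \sum_{j=1}^{m} a_j f_j(\cdot) r_j(t) \Big\|_p^p = \int_0^1 \Big| \sum_{j=1}^{m} a_j f_j(s) r_j(t) \Big|^p ds \quad \text{for all } t \in [0,1], \]

and integrating over all $t \in [0,1]$ implies

\[ \begin{aligned} \Big\| \sum_{j=1}^{m} a_j f_j(\cdot) \Big\|_p^p &\sim_{C_u^p} \int_0^1 \int_0^1 \Big| \sum_{j=1}^{m} a_j f_j(s) r_j(t) \Big|^p ds\, dt \\ &= \int_0^1 \int_0^1 \Big| \sum_{j=1}^{m} a_j f_j(s) r_j(t) \Big|^p dt\, ds \quad \text{(by the Theorem of Fubini)} \\ &= \int_0^1 \Big\| \sum_{j=1}^{m} a_j f_j(s) r_j(\cdot) \Big\|_p^p ds \\ &\sim_{K_p^p} \int_0^1 \Big( \sum_{j=1}^{m} |a_j f_j(s)|^2 \Big)^{p/2} ds = \Big\| \Big( \sum_{j=1}^{m} |a_j f_j|^2 \Big)^{1/2} \Big\|_p^p, \end{aligned} \]

which proves our claim using $C = K_p C_u$.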

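Theorem 6.2.6. Assume that $1 < p < \infty$ and assume that $(f_j)$ is a normalized $\lambda$-unconditional sequence in $L_p[0,1]$ for some $\lambda \ge 1$. Let $C = C(p, \lambda)$ be as in Theorem 6.2.5.

Then for all scalars $(a_j)_{j=1}^{n} \subset \mathbb{K}$

\[ \frac{1}{C} \Big( \sum_{j=1}^{n} |a_j|^2 \Big)^{1/2} \le \Big\| \sum_{j=1}^{n} a_j f_j \Big\|_p \le C \Big( \sum_{j=1}^{n} |a_j|^p \Big)^{1/p} \quad \text{if } 1 \le p \le 2, \text{ and} \tag{6.17} \]

\[ \frac{1}{C} \Big( \sum_{j=1}^{n} |a_j|^p \Big)^{1/p} \le \Big\| \sum_{j=1}^{n} a_j f_j \Big\|_p \le C \Big( \sum_{j=1}^{n} |a_j|^2 \Big)^{1/2} \quad \text{if } 2 \le p < \infty. \tag{6.18} \]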

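Proof. We will show the inequalities (6.17) and (6.18) but with their middle terms replaced by the square-function norm. Having done that, the claim will follow from Theorem 6.2.5.

Let $(a_i)_{i=1}^{n} \subset \mathbb{K}$. First we note that, since for $1 \le s < t \le \infty$ the $\ell_s$-norm of a vector $(b_j)_{j \le n}$ is at least its $\ell_t$-norm, we deduce

\[ \Big\| \Big( \sum_{j=1}^{n} |a_j|^2 |f_j|^2 \Big)^{1/2} \Big\|_p^p = \int_0^1 \Big( \sum_{j=1}^{n} |a_j|^2 |f_j(t)|^2 \Big)^{p/2} dt \ \begin{cases} \le \displaystyle\int_0^1 \sum_{j=1}^{n} |a_j|^p |f_j(t)|^p\, dt = \sum_{j=1}^{n} |a_j|^p & \text{if } 1 \le p \le 2, \\[2ex] \ge \displaystyle\int_0^1 \sum_{j=1}^{n} |a_j|^p |f_j(t)|^p\, dt = \sum_{j=1}^{n} |a_j|^p & \text{if } 2 \le p < \infty, \end{cases} \]

which implies the second inequality of (6.17) and the first of (6.18). In order to verify the other two inequalities we write $w_j = |a_j|^2 / \sum_{i=1}^{n} |a_i|^2$, so that $\sum_{j=1}^{n} w_j = 1$, and observe that

\[ \begin{aligned} \Big\| \Big( \sum_{j=1}^{n} |a_j|^2 |f_j|^2 \Big)^{1/2} \Big\|_p &= \Big\| \Big( \sum_{j=1}^{n} w_j |f_j|^2 \Big)^{1/2} \Big\|_p \Big( \sum_{i=1}^{n} |a_i|^2 \Big)^{1/2} \\ &= \Big\| \Big( \Big( \sum_{j=1}^{n} w_j |f_j|^2 \Big)^{p/2} \Big)^{1/p} \Big\|_p \Big( \sum_{i=1}^{n} |a_i|^2 \Big)^{1/2} \\ &\begin{cases} \ge \Big\| \Big( \sum_{j=1}^{n} w_j |f_j|^p \Big)^{1/p} \Big\|_p \Big( \sum_{i=1}^{n} |a_i|^2 \Big)^{1/2} & \text{if } 1 \le p \le 2, \\[1.5ex] \le \Big\| \Big( \sum_{j=1}^{n} w_j |f_j|^p \Big)^{1/p} \Big\|_p \Big( \sum_{i=1}^{n} |a_i|^2 \Big)^{1/2} & \text{if } 2 \le p < \infty \end{cases} \\ &\qquad \big[ \xi \mapsto \xi^{p/2} \text{ is convex if } p > 2 \text{ and concave if } 1 \le p < 2 \big] \\ &= \Big( \int_0^1 \sum_{j=1}^{n} w_j |f_j(t)|^p\, dt \Big)^{1/p} \Big( \sum_{i=1}^{n} |a_i|^2 \Big)^{1/2} = \Big( \sum_{i=1}^{n} |a_i|^2 \Big)^{1/2}, \end{aligned} \]

which implies the first inequality of (6.17) and the second of (6.18).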

Corollary 6.2.7. Let 1 ≤ p < ∞. Every normalized unconditional sequence(fj) ⊂ Lp[0, 1] which consists of uniformly bounded functions is equivalent to the`2-unit vector basis. In particular, if p 6= 2, (fj) cannot span all of Lp[0, 1].

We will need Jensen’s inequality.

Theorem 6.2.8. [Jensen’s Inequality] Let $f : \mathbb{R} \to \mathbb{R}$ be convex and let $g : [0,1] \to \mathbb{R}$ be such that $g$ and $f \circ g$ are integrable. Then

\[
f\Big(\int_0^1 g(\xi)\,d\xi\Big) \le \int_0^1 f(g(\xi))\,d\xi.
\]

(Here $[0,1]$, together with Lebesgue measure, can be replaced by any probability space.)
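A quick numerical check of Jensen's inequality can make the direction of the estimate concrete. The sketch below approximates both sides by midpoint Riemann sums for the convex function $\xi \mapsto \xi^{p/2}$ with $p = 4$ and the function $g(t) = 1 + t$; these choices, the helper names and the number of sample points are assumptions made only for illustration.

```python
def midpoint_mean(h, n=10_000):
    """Midpoint Riemann sum approximating the integral of h over [0, 1]."""
    return sum(h((k + 0.5) / n) for k in range(n)) / n


def jensen_sides(f, g, n=10_000):
    """Return (f(integral of g), integral of f(g)), both approximated."""
    left = f(midpoint_mean(g, n))
    right = midpoint_mean(lambda t: f(g(t)), n)
    return left, right


p = 4  # any p >= 2 makes xi -> xi^(p/2) convex on [0, infinity)
left, right = jensen_sides(lambda y: y ** (p / 2), lambda t: 1.0 + t)

# Exact values for comparison: (3/2)^2 = 9/4 on the left, 7/3 on the right.
print("f(integral of g) ~", left)
print("integral of f(g) ~", right)
```

Since the uniform measure on the sample points is itself a probability measure, the discrete averages satisfy Jensen's inequality exactly (up to rounding), which is the situation covered by the remark on general probability spaces.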

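Proof of Corollary 6.2.7. Assume that $(f_j)$ is uniformly bounded, normalized and $\lambda$-unconditional. Let $C = \sup_{j \in \mathbb{N}} \|f_j\|_{L_\infty}$. If $1 \le p \le 2$ we deduce for $(a_j) \in c_{00}$ from the proof of Theorem 6.2.6 that

\[
\Big(\sum_{i=1}^\infty |a_i|^2\Big)^{1/2} \le \Big\|\Big(\sum_{i=1}^\infty |a_i f_i|^2\Big)^{1/2}\Big\|_{L_p} \le C\Big(\sum_{i=1}^\infty |a_i|^2\Big)^{1/2}.
\]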

Thus our claim follows in the case that $1 \le p \le 2$. If $p \ge 2$ we obtain for $(a_j) \in c_{00}$ first that

\[
\Big\|\Big(\sum_{i=1}^\infty |a_i f_i|^2\Big)^{1/2}\Big\|_{L_p} \le C\Big(\sum_{i=1}^\infty |a_i|^2\Big)^{1/2}.
\]

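Since $\|f_i\|_{L_p} = 1$ and $\|f_i\|_\infty \le C$ for $i \in \mathbb{N}$, we deduce that

\begin{align*}
1 = \|f_i\|_p^p &= \int_0^1 |f_i(t)|^p\,dt \\
&\le C^p\, m\big(\{|f_i| \ge \tfrac{1}{2C}\}\big) + \frac{1}{(2C)^p}\, m\big(\{|f_i| < \tfrac{1}{2C}\}\big) \\
&\le C^p\, m\big(\{|f_i| \ge \tfrac{1}{2C}\}\big) + \frac{1}{2},
\end{align*}

and thus

\[
m\big(\{|f_i| \ge \tfrac{1}{2C}\}\big) \ge \frac{1}{2C^p}.
\]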
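The level-set estimate above can be checked on simple step functions. The following sketch is an illustration under stated assumptions: the function `level_set_estimate` and the sample values are hypothetical, and $C$ is taken to be the sup norm of the single normalized function rather than a supremum over a whole sequence.

```python
def level_set_estimate(values, p):
    """For the step function equal to values[k] on the k-th of len(values)
    equal subintervals of [0, 1], normalized in L_p, return the pair
    (m({|f| >= 1/(2C)}), 1/(2 C^p)) with C the sup norm of f."""
    n = len(values)
    norm = (sum(abs(v) ** p for v in values) / n) ** (1.0 / p)
    f = [v / norm for v in values]            # now ||f||_p = 1
    C = max(abs(v) for v in f)                # sup norm of the step function
    measure = sum(1 for v in f if abs(v) >= 1.0 / (2.0 * C)) / n
    bound = 1.0 / (2.0 * C ** p)
    return measure, bound


# f = 2 on [0, 1/4) and 0 elsewhere is already normalized in L_2, with C = 2.
print(level_set_estimate([2.0, 0.0, 0.0, 0.0], 2))   # (0.25, 0.125)

# A less degenerate example with p = 3.
print(level_set_estimate([1.5, 1.0, 0.8, 0.1], 3))
```

In both cases the first entry (the measure of the level set) is at least the second (the lower bound $1/(2C^p)$), as the estimate predicts.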

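We deduce that

\begin{align*}
\Big\|\Big(\sum_{i=1}^\infty |a_i f_i|^2\Big)^{1/2}\Big\|_{L_p}
&= \Big(\int_0^1 \Big(\sum_{j=1}^\infty |a_j f_j(t)|^2\Big)^{p/2}\,dt\Big)^{1/p} \\
&\ge \frac{1}{2C}\Big(\int_0^1 \Big(\sum_{j=1}^\infty |a_j|^2\, 1_{\{|f_j| \ge 1/2C\}}\Big)^{p/2}\,dt\Big)^{1/p} \\
&\ge \frac{1}{2C}\Big(\int_0^1 \sum_{j=1}^\infty |a_j|^2\, 1_{\{|f_j| \ge 1/2C\}}\,dt\Big)^{1/2} \quad\text{(by Jensen’s inequality)} \\
&\ge \frac{1}{2C}\,\frac{1}{2C^p}\Big(\sum_{j=1}^\infty |a_j|^2\Big)^{1/2}.
\end{align*}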

Our claim follows therefore from the equivalence between the $L_p$-norm and the square function norm in $L_p$ (Theorem 6.2.5).

Bibliography

[AW] F. Albiac and P. Wojtaszczyk, Characterization of 1-greedy bases, J. Approx. Theory 138(1):65–86, 2006.

[EW] I. S. Edelstein and P. Wojtaszczyk, On projections and unconditional bases in direct sums of Banach spaces. Stud. Math. 56 (3) (1976) 263–276.

[An] A. D. Andrew, On subsequences of the Haar system in C(∆), Israel Journal of Math. 31, No. 1 (1978) 85–90.

[BCLT] J. Bourgain, P.G. Casazza, J. Lindenstrauss, and L. Tzafriri, Banach spaces with a unique unconditional basis, up to permutation, Memoirs Amer. Math. Soc. No. 322, 1985.

[DT] R. A. DeVore and V. N. Temlyakov, Some remarks on greedy algorithms, Adv. in Comp. Math. 5 (1996) 173–187.

[DFOS] S.J. Dilworth, D. Freeman, E. Odell, and Th. Schlumprecht, Greedy bases for Besov spaces. Constr. Approx. 34 (2011), no. 2, 281–296.

[DKSTW] S. Dilworth, D. Kutzarova, K. Shuman, V. Temlyakov, and P. Wojtaszczyk, Weak convergence of greedy algorithms in Banach spaces. J. Fourier Anal. Appl. 14 (2008), no. 5-6, 609–628.

[DOSZ1] S. Dilworth, E. Odell, Th. Schlumprecht, and A. Zsak, Renormings and symmetry properties of one-greedy bases, J. Approx. Theory 163 (2011), no. 9, 1049–1075.

[DOSZ2] S. Dilworth, E. Odell, Th. Schlumprecht, and A. Zsak, Partial Unconditionality

[DOSZ3] S. Dilworth, E. Odell, Th. Schlumprecht, and A. Zsak, On the convergence of greedy algorithms for initial segments of the Haar basis. Math. Proc. Cambridge Philos. Soc. 148 (2010), no. 3, 519–529.


[1] S. Dilworth, D. Kutzarova, E. Odell, Th. Schlumprecht, and P. Wojtaszczyk, Weak thresholding greedy algorithms in Banach spaces. J. Funct. Anal. 263 (2012), no. 12, 3900–3921.

[2] S. Dilworth, D. Kutzarova, E. Odell, Th. Schlumprecht, and A. Zsak, Renorming spaces with greedy bases, J. Approx. Theory 188 (2014), 39–56.

[DKK] S. J. Dilworth, N. J. Kalton, and D. Kutzarova, On the existence of almost greedy bases in Banach spaces. Dedicated to Professor Aleksander Pełczyński on the occasion of his 70th birthday. Studia Math. 159 (2003), no. 1, 67–101.

[En] P. Enflo, A counterexample to the approximation problem in Banach spaces. Acta Math. 130 (1973), 309–317.

[GK] M. Ganichev and N. J. Kalton, Convergence of the weak dual greedy algorithm in Lp-space, J. Approx. Theory 124 (2003) 89–95.

[HMVZ] P. Hajek, V. Montesinos Santalucía, J. Vanderwerff, and V. Zizler, Biorthogonal systems in Banach spaces. CMS Books in Mathematics/Ouvrages de Mathématiques de la SMC, 26. Springer, New York, (2008) xviii+339 pp.

[Jo] L. K. Jones, A simple lemma on greedy approximation in Hilbert space and convergence rates for projection pursuit regression and neural network training, Annals of Stat. 20 (1992) 608–613.

[Ka] Y. Katznelson, An Introduction to Harmonic Analysis

[KT1] S. V. Konyagin and V. N. Temlyakov, A remark on greedy approximation in Banach spaces, East Journal on Approximation, 5 no. 3 (1999), 365–379.

[KT2] S. V. Konyagin and V. N. Temlyakov, Rate of convergence of pure greedy algorithm. East J. Approx. 5 (1999), no. 4, 493–499.

[LT] J. Lindenstrauss and L. Tzafriri, “Classical Banach Spaces I – Sequence Spaces,” Springer-Verlag, Berlin, 1979.

[Ma] A.I. Markushevich, On a basis in the wide sense for linear spaces, Dokl. Akad. Nauk. 41 (1943), 241–244.

[OP] R.I. Ovsepian and A. Pełczyński, On the existence of a fundamental total and bounded biorthogonal sequence in every separable Banach space, and related constructions of uniformly bounded orthonormal systems in L2. Studia Math. 54 (1975), no. 2, 149–159.


[Pe] A. Pełczyński, All separable Banach spaces admit for every ε > 0 fundamental total and bounded by 1 + ε biorthogonal sequences. Studia Math. 55 (1976), no. 3, 295–304.

[Schl] Th. Schlumprecht, Course Notes for Functional Analysis, Fall 2012, http://www.math.tamu.edu/∼schlump/course notes FA2012.pdf

[Sch2] Th. Schlumprecht, Embedding Theorem of Banach spaces into Banach spaces with bases. Adv. Math. 274 (2015), 833–880.

[Schm] E. Schmidt, Zur Theorie der linearen und nicht linearen Integralgleichungen. I, Math. Annalen 63 (1906), 433–476.

[T1] V. Temlyakov, Nonlinear methods of approximation. Found. Comput. Math. 3 (2003), no. 1, 33–107.

[T2] V. Temlyakov, Relaxation in greedy approximation, preprint.

[T3] V. Temlyakov, Greedy algorithms in Banach spaces, Adv. Comp. Math. 14 (2001) 277–292.

[Wo1] P. Wojtaszczyk, A mathematical introduction to wavelets. London Mathematical Society Student Texts 37 (1997).

[Wo2] P. Wojtaszczyk, Greedy algorithm for general biorthogonal systems. J. Approx. Theory 107 (2000), no. 2, 293–314.
