Variational principle

László Erdős

Dec 21, 2010
1 Introduction
The variational principle is a fundamental method to find solutions to partial differential equa- tions (PDE). It relies on the idea that many PDE’s originate from a basic physical principle that the state of the system is determined by minimizing its energy. Not every PDE is suitable for variational solutions, but many PDE’s with physical origin are.
The primary example is the solution to the eigenvalue equation
(−∆ + V)ψ = Eψ
of the Schrödinger operator H = −∆ + V, but many ideas presented here are applicable in a broader context.
The ground state energy of the system was defined as

E0 = inf{E(ψ) : ‖ψ‖₂ = 1},    E(ψ) = ∫ |∇ψ|² + ∫ V |ψ|²

Modulo (nontrivial) technicalities about operator domains, we have E(ψ) = ⟨ψ, Hψ⟩, thus

E0 = inf{⟨ψ, Hψ⟩ : ‖ψ‖ = 1} = inf_{ψ≠0} ⟨ψ, Hψ⟩/‖ψ‖²
This is exactly the same definition as one has in finite dimensions for the lowest eigenvalue of an N by N hermitian matrix H = H∗. In that case, we have the spectral theorem (unitary diagonalization), i.e.
H = ∑_{j=1}^N λj vj vj∗
with λ1 ≤ λ2 ≤ . . . ≤ λN being the eigenvalues (counted with multiplicity) and vj ∈ CN being the normalized eigenvectors. Then for any x ∈ CN

⟨x, Hx⟩ = ∑_{j=1}^N λj |⟨vj, x⟩|² ≥ λ1 ∑_{j=1}^N |⟨vj, x⟩|² = λ1 ‖x‖²

(in the last step using that {vj} is an orthonormal basis). Thus

inf_{x≠0} ⟨x, Hx⟩/‖x‖² ≥ λ1

and by setting x = v1 it is easy to see that this lower bound can be achieved, thus

inf_{x≠0} ⟨x, Hx⟩/‖x‖² = λ1
This is the variational characterization of the lowest eigenvalue. In finite dimensions the infimum is always achieved (a big difference between finite and infinite dimensions!) and it defines the corresponding eigenvector. It may happen that the minimizer is not unique (even after taking into account the trivial change v1 → (const.)v1), since the eigenspace of λ1 could be degenerate. However, as long as the minimum is achieved, we can use the variational principle to find eigenvectors as well.
The key idea is that to find the solution of the eigenvalue equation

Hv1 = λ1v1

one can instead solve the minimization problem

inf_{x≠0} ⟨x, Hx⟩/‖x‖²
The latter is mathematically and numerically easier to handle (many numerical methods for eigenvalues rely on this). It also gives rise to estimates on λ1. Upper bounds on λ1 are easy; one just needs a good guess for x, since
λ1 ≤ ⟨x, Hx⟩
holds for any normalized x. Lower bounds are typically much harder, since somehow one has to test all possible vectors x.
What about higher eigenvalues? They are also given by a variational principle. For example, for the second lowest eigenvalue
λ2 = inf{⟨x, Hx⟩ : ‖x‖ = 1, ⟨x, v1⟩ = 0}
i.e. it minimizes the quadratic form on the subspace orthogonal to the first eigenvector. It may happen that the lowest eigenvalue is degenerate, in this case λ2 = λ1 (we will use the convention that we list eigenvalues with their multiplicities).
Exercise 1.1 Using the spectral decomposition of H, prove that the k-th lowest eigenvalue is given by
λk = inf{⟨x, Hx⟩ : ‖x‖ = 1, ⟨x, vj⟩ = 0 ∀j ≤ k − 1} and that the infimum is achieved for x = vk.
Note that the variational principle can be used only for hermitian (or, in the real case, symmetric) matrices.
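This finite-dimensional variational characterization is easy to test numerically. A minimal Python sketch (the 2×2 symmetric matrix and its eigendata below are illustrative choices, not from the text): the Rayleigh quotient ⟨x, Hx⟩/‖x‖² never drops below λ1, and attains it at the eigenvector v1.

```python
import math, random

# Illustrative 2x2 real symmetric matrix with known spectrum:
# eigenvalues 1 and 3, lowest eigenvector v1 = (1, -1)/sqrt(2).
H = [[2.0, 1.0], [1.0, 2.0]]
lam1 = 1.0

def rayleigh(H, x):
    """Rayleigh quotient <x, Hx> / ||x||^2."""
    Hx = [sum(H[i][j] * x[j] for j in range(2)) for i in range(2)]
    return sum(x[i] * Hx[i] for i in range(2)) / sum(xi * xi for xi in x)

random.seed(0)
trials = [rayleigh(H, [random.gauss(0, 1), random.gauss(0, 1)]) for _ in range(1000)]
assert min(trials) >= lam1 - 1e-12          # R(x) >= lambda_1 for every trial x
v1 = [1 / math.sqrt(2), -1 / math.sqrt(2)]
assert abs(rayleigh(H, v1) - lam1) < 1e-12  # equality at the eigenvector v1
```

Every trial vector gives an upper bound on λ1; this is exactly why good guesses for x produce cheap upper bounds in the estimates below.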
2 Domination of the kinetic energy
Let us summarize the consequences of the Sobolev inequality for the ground state energy problem of −∆ + V, i.e. for

E(ψ) = ∫ |∇ψ|² + ∫ V |ψ|²    (2.1)
Introduce the notation
Lp + Lq = {f(x) : ∃f1 ∈ Lp, f2 ∈ Lq, f = f1 + f2}
i.e. this is the set of functions that can be written as a sum of an Lp and an Lq function. The decomposition is not unique.
Theorem 2.1 Let the negative part of the potential satisfy
V− ∈ L^{d/2} + L∞, if d ≥ 3

V− ∈ L^{1+ε} + L∞, if d = 2

V− ∈ L¹ + L∞, if d = 1
for some ε > 0. Then there are finite constants C and D, depending only on V, such that
E(ψ) ≥ −C‖ψ‖₂²

(i.e. E0 > −∞) and

∫ |∇ψ|² ≤ 2E(ψ) + D‖ψ‖₂²    (2.2)
i.e. the kinetic energy is dominated by the total energy.
Proof. For simplicity we work in d ≥ 3 dimensions; it is an easy exercise to modify the proof for d = 1, 2. We have
E(ψ) ≥ ∫ |∇ψ|² − ∫ V−|ψ|²    (2.3)
Write V− = V1 + V2, with V1 ∈ L∞, V2 ∈ L^{d/2}. Note that one can choose the decomposition such that ‖V2‖_{d/2} is arbitrarily small. To see this, write
V2 = V2 · 1{x : |V2(x)| ≤M} + V2 · 1{x : |V2(x)| > M}
for any M > 0 (here 1{A} is the characteristic function of the set A). The first term is in L∞, so one can add it to the V1 part for any fixed M . The second term converges to zero in Ld/2 by dominated convergence, therefore by choosing M sufficiently large, the Ld/2-norm of the second term can be made arbitrarily small.
Now by the Hölder and Sobolev inequalities

∫ V−|ψ|² ≤ ‖V1‖∞‖ψ‖₂² + ∫ |V2||ψ|² ≤ ‖V1‖∞‖ψ‖₂² + ‖V2‖_{d/2}‖ψ‖²_{2d/(d−2)} ≤ ‖V1‖∞‖ψ‖₂² + ½ ∫ |∇ψ|²    (2.4)

where C = C_d is the Sobolev constant (‖ψ‖²_{2d/(d−2)} ≤ C ∫ |∇ψ|²), and where we used that ‖V2‖_{d/2} is sufficiently small (smaller than 1/(2C)). Thus, from (2.3) we have
E(ψ) ≥ ½ ∫ |∇ψ|² − ‖V1‖∞‖ψ‖₂²

which proves both statements in the theorem.
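The truncation step can be made concrete numerically. A hedged Python sketch, assuming the illustrative choices d = 3 and V2(x) = 1/|x| (a Coulomb-type singularity, not a potential fixed in the text): the part of V2 above the level M is supported on |x| < 1/M, and its L^{3/2}-mass (8π/3)M^{−3/2} can be made arbitrarily small by taking M large.

```python
import math

# d = 3, V2(x) = 1/|x| (illustrative Coulomb-type singularity, in L^{3/2} locally).
# The piece exceeding the level M lives on {|x| < 1/M}; its L^{3/2}-mass is
#   ∫_{|x|<1/M} |x|^{-3/2} dx = 4π ∫_0^{1/M} r^{1/2} dr = (8π/3) M^{-3/2}.

def tail_mass(M, steps=100000):
    """Midpoint-rule integral of |x|^{-3/2} over the ball {|x| < 1/M} in R^3."""
    R = 1.0 / M
    h = R / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += r ** (-1.5) * 4.0 * math.pi * r ** 2 * h  # spherical shells
    return total

for M in (1.0, 10.0, 100.0):
    exact = (8.0 * math.pi / 3.0) * M ** (-1.5)
    assert abs(tail_mass(M) - exact) < 1e-3 * exact
# the tail shrinks like M^{-3/2}, so the L^{d/2}-part can indeed be made small
```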
3 Minimizing sequence
Now we would like to know if the infimum in (2.1) is achieved, i.e. if there is a minimizer (in this case, it will be called a ground state). A more refined question: what are the properties of the ground state? How regular is it? For example, the ground state of the Hydrogen atom was e^{−Z|x|/2}, i.e. it is Lipschitz continuous, but it fails to be once continuously differentiable. This singularity comes from the singularity of the Coulomb potential, −Z/|x|, but notice that the singularity in the eigenfunction is much less severe than the singularity in the potential (which is not even bounded, let alone continuous). This "regularizing effect" is a general feature of Schrödinger type equations (actually of a much bigger class of so-called elliptic equations) and it is due to the presence of the Laplacian (as the highest order derivative term in the equation). The phenomenon goes under the name of elliptic regularity. We will not have time to discuss it, but it is a very important feature of PDE's.
How do we find a minimizer of (2.1)? Simply take a sequence ψn of normalized wavefunctions, ‖ψn‖ = 1, such that
E(ψn) → E0,
since E0 was an infimum, such a minimizing sequence always exists. The natural guess is then to look for the limit of ψn, i.e. to ask whether there is a ψ such that
ψn → ψ (3.5)
and check that the map ψ → E(ψ) is continuous, i.e. whether
E(ψn) → E(ψ) (3.6)
Both of these steps are problematic in infinite dimensional spaces (like H1). First, the limit may not exist. Second, the functional E(ψ) (as a map from, say, H1 to R) may not be continuous.
Recall an analogous statement from real analysis, namely, that if f : K → R is defined on a compact set K ⊂ Rd and it is continuous, then
inf_{x∈K} f(x)
is attained and it can be found by the following procedure. Choose a minimizing sequence, xn ∈ K, such that
f(xn) → inf_K f,    (3.7)
since K is compact, by Bolzano–Weierstrass one can select a convergent subsequence, x_{n_k} → x, whose limit is also in K, and then by continuity of f, we have f(x_{n_k}) → f(x), which, together with (3.7), means that x is a minimizer.
The key ingredient here was compactness. Our functional E(ψ) is defined on the whole H1, which is not compact. So the first question is how to modify the argument above for functions defined on the whole Rd. We certainly need some extra condition, since in general the infimum of a continuous function on the whole R is not attained (e.g. f(x) = (1 + x²)^{−1}). We need some weak condition on the behaviour of f at infinity. It is sufficient to assume that
lim inf_{|x|→∞} f(x) > inf f
or, in other words, that there exists a constant M > inf f, such that the level set
SM := {x : f(x) ≤M}
is a bounded subset of Rd. Now we can extend the argument above: we choose a minimizing sequence so that f(xn) → inf f. Since M > inf f, we can assume that xn ∈ SM. Since f is continuous, the level set SM is closed, and by assumption it is bounded, thus SM is compact. Then we can use the previous argument with K = SM.
This argument used the finite dimensionality in an essential way, namely that a bounded and closed set is compact. In other words, that from a bounded sequence one can always choose a convergent subsequence. This does not hold in infinite dimensions. Here is an example. Let ψn(x) = e^{2πinx} in the Hilbert space L2[0, 1]. Clearly {ψn} is a bounded sequence (it is bounded in any Lp[0, 1]) with norm 1, but it has no convergent subsequence (HOMEWORK: check!)
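The homework can be probed numerically. A hedged Python sketch (midpoint quadrature, with the weight g(x) = x as an illustrative test function): the ψn are orthonormal, so ‖ψn − ψm‖₂² = 2 for n ≠ m and no subsequence is Cauchy; at the same time ⟨ψn, g⟩ shrinks with n, foreshadowing that ψn converges weakly to 0.

```python
import cmath

def inner(n, m, steps=4096):
    """<psi_n, psi_m> on [0,1] for psi_n(x) = exp(2*pi*i*n*x), midpoint rule."""
    h = 1.0 / steps
    return sum(
        cmath.exp(-2j * cmath.pi * n * x) * cmath.exp(2j * cmath.pi * m * x)
        for x in ((k + 0.5) * h for k in range(steps))
    ) * h

# orthonormality: ||psi_n - psi_m||^2 = 2 - 2 Re<psi_n, psi_m> = 2 for n != m,
# so no subsequence of (psi_n) can be Cauchy in L^2[0,1]
assert abs(inner(3, 3) - 1) < 1e-9
assert abs(inner(3, 7)) < 1e-9

# yet <psi_n, g> -> 0 for the illustrative g(x) = x (Riemann-Lebesgue)
def against_g(n, steps=200000):
    h = 1.0 / steps
    return sum(cmath.exp(-2j * cmath.pi * n * x) * x
               for x in ((k + 0.5) * h for k in range(steps))) * h

vals = [abs(against_g(n)) for n in (1, 4, 16, 64)]
assert all(a > b for a, b in zip(vals, vals[1:]))   # decreasing toward 0
```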
[Remark for mathematicians: whenever we say compactness, we really mean the concept of sequential compactness. These two concepts coincide in all spaces we discuss; there is a deviation for topological spaces with uncountable basis of neighborhoods, i.e. in spaces where the topology is not determined by convergence of sequences.]
The situation will be remedied by changing the topology, i.e. by redefining what we mean by convergence in (3.5). Since E(ψ) was defined on H1, it was natural to work in the topology determined by the H1-norm. Alternatively, one could even come to the idea to work with the other natural norm available, i.e. the L2 norm. But neither of them is good: bounded sequences may have no convergent subsequence.

We want to weaken the topology to give a subsequence of the minimizing sequence a better chance to converge. A weaker topology means there are fewer open sets, so it will be easier to converge. But one cannot overdo it; in a weaker topology, there are fewer functions that are continuous (recall, continuity of a function(al) from a topological space X to R means that all level sets {x ∈ X : f(x) < M} are open for any M ∈ R). So in a too strong topology (3.5) may fail; in a too weak topology, (3.6) may fail. The goal is to find an intermediate one, and this will be the weak convergence.
Since the concept of weak convergence is discussed in sufficient detail in Lieb-Loss, Sections 2.9, 2.10, 2.11, 2.12, 2.13 and 2.18, I will not write it up – read these sections. Although it can be defined for general normed spaces, we will need it only for separable Hilbert spaces (like H1) and for Lp spaces (1 ≤ p ≤ ∞). Recall that a sequence ψj in a normed space X converges weakly to ψ ∈ X, in notation ψj ⇀ ψ, if ℓ(ψj) → ℓ(ψ) for any bounded linear functional ℓ on X, i.e. for any ℓ ∈ X∗ (dual space).
In our main examples: the dual space of Lp is Lq for 1 ≤ p < ∞ (with 1/p + 1/q = 1). The dual space of L∞ contains L1 but it is considerably bigger. The dual space of a Hilbert space is itself (with the appropriate identification of vectors with linear functionals, following the Riesz representation theorem).
We will need the following facts
• Weak convergence is indeed weaker than the usual (norm or strong) convergence
• Weak convergence still separates points, i.e. the limit is unique (if it exists)
• The norm is lower semicontinuous, i.e. it may drop along the weak limit but never increase:

ψj ⇀ ψ =⇒ ‖ψ‖ ≤ lim inf_j ‖ψj‖    (3.8)

• Weakly convergent sequences are bounded. Even more generally, if the sequence ψj satisfies that ℓ(ψj) is a bounded numerical sequence for any bounded linear functional ℓ, then sup_j ‖ψj‖ < ∞ [uniform boundedness principle]

• Mazur's theorem: If fj ∈ Lp converges weakly to f ∈ Lp, then there is a convex combination of the fj that converges strongly, i.e. there are coefficients cjk ≥ 0, 1 ≤ k ≤ j, with ∑_{k=1}^j cjk = 1, such that

Fj := ∑_{k=1}^j cjk fk → f

strongly.
• Banach-Alaoglu theorem: Let fj be a bounded sequence in Lp with 1 < p < ∞ or in a separable Hilbert space. Then fj has a weakly convergent subsequence.
The only remark is that Theorem 2.18 is for Lp spaces, but it can be immediately extended to separable Hilbert spaces, see the remark at the end of Section 2.21.
4 Existence of the ground state
I will follow Theorem 11.5 of Lieb-Loss, but supplement with some more details.
Theorem 4.1 Let the potential V : Rd → R satisfy the conditions
V ∈ L^{d/2} + L∞, if d ≥ 3

V ∈ L^{1+ε} + L∞, if d = 2

V ∈ L¹ + L∞, if d = 1
for some ε > 0 [unlike in Theorem 2.1, we make these assumptions on all of V, not only on its negative part]. Assume that V vanishes at infinity in the sense that for any a > 0 we have
|{x : |V (x)| > a}| <∞
Let

E(ψ) = ∫ |∇ψ|² + ∫ V |ψ|²

for any ψ ∈ H1 as before and let

E0 = inf{E(ψ) : ψ ∈ H1, ‖ψ‖₂ = 1}
be the ground state energy. By Theorem 2.1, E0 > −∞ and it is easy to see [HOMEWORK!] that E0 ≤ 0. Assume that
E0 < 0 (4.9)
Then there exists ψ ∈ H1, ‖ψ‖₂ = 1, such that E(ψ) = E0, i.e. there is a ground state. Moreover, the ground state satisfies the Schrödinger equation

−∆ψ + V ψ = E0ψ    (4.10)

in a weak sense, i.e. for any φ ∈ C∞₀ we have

∫ ψ(−∆φ) + ∫ V ψφ = E0 ∫ ψφ

Proof. Choose a minimizing sequence ψn, i.e.

E(ψn) → E0, ‖ψn‖₂ = 1.
By Theorem 2.1, we have

∫ |∇ψn|² ≤ 2E(ψn) + D‖ψn‖₂²

Since E(ψn) converges and ‖ψn‖₂ = 1, the right hand side is bounded in n, thus

sup_n ‖ψn‖_{H1} < ∞    (4.11)
Using that H1 is a separable Hilbert space, we can apply the Banach–Alaoglu theorem to select a weakly convergent subsequence ψ_{n_j} ⇀ ψ. For notational simplicity, we reindex this subsequence: ψj := ψ_{n_j}. We make three claims that we will prove later:

∫ |∇ψ|² ≤ lim inf_{j→∞} ∫ |∇ψj|²    (4.12)

lim_{j→∞} ∫ V |ψj|² = ∫ V |ψ|²    (4.13)

‖ψ‖₂ = 1    (4.14)

Combining (4.12) and (4.13),

E(ψ) ≤ lim inf_{j→∞} E(ψj) = E0

but E0 was the infimum of all E(ψ) with ‖ψ‖₂ = 1, thus there must be equality in the above inequality: E0 = E(ψ). Thus we have found a ground state (note that the ground state may not be unique; the constructed ψ may depend on the chosen subsequence).

Before we prove the above claims, we show that the ground state satisfies (4.10). Pick any φ ∈ C∞₀ and set ψε = ψ + εφ. Define the Rayleigh quotient:

R(ε) := E(ψε)/‖ψε‖₂²
Note that R(ε) is a rational function (ratio of two quadratic polynomials in ε) that is finite at ε = 0, thus it is differentiable in a small neighborhood of 0. Since R(0) = minR(ε), we have that
0 = dR(ε)/dε |_{ε=0}

Computing the derivative of the rational function R(ε) at ε = 0, and using that ‖ψ‖₂ = 1 and E(ψ) = E0, this gives

Re ∫ [∇ψ · ∇φ + V ψφ − E0 ψφ] = 0

In the first term we integrate by parts,

∫ ∇ψ · ∇φ = ∫ ψ(−∆φ)

which is allowed if ψ ∈ H1 and ∇φ ∈ H1, which holds since φ is a smooth, compactly supported function (see, e.g. Theorem 7.7 in Lieb-Loss). Thus

Re ∫ [ψ(−∆φ) + V ψφ − E0 ψφ] = 0

Replacing φ by iφ in the previous argument, we immediately obtain that

Im ∫ [ψ(−∆φ) + V ψφ − E0 ψφ] = 0

for any φ ∈ C∞₀, thus ψ satisfies (4.10) in a weak sense.
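The stationarity argument has an exact finite-dimensional analogue that can be tested in code. A hedged Python sketch (the matrix and perturbation direction are illustrative choices): along the line v1 + εφ the Rayleigh quotient has vanishing derivative at ε = 0.

```python
import math

H = [[2.0, 1.0], [1.0, 2.0]]                 # illustrative matrix, eigenvalues 1 and 3
v1 = [1 / math.sqrt(2), -1 / math.sqrt(2)]   # eigenvector for the lowest eigenvalue
phi = [0.3, 0.7]                             # arbitrary perturbation direction

def R(eps):
    """Rayleigh quotient of the perturbed vector v1 + eps*phi."""
    x = [v1[i] + eps * phi[i] for i in range(2)]
    Hx = [sum(H[i][j] * x[j] for j in range(2)) for i in range(2)]
    return sum(x[i] * Hx[i] for i in range(2)) / sum(xi * xi for xi in x)

h = 1e-6
deriv = (R(h) - R(-h)) / (2 * h)   # central difference approximates R'(0)
assert abs(deriv) < 1e-6           # stationarity: R'(0) = 0 at the minimizer
assert R(0.1) >= R(0.0)            # and eps = 0 is a minimum along this line
```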
Proof of (4.12). This proof is almost the same as proving that the norm can only drop along the weak limit. Recall the variational characterization of the L2-norm:
‖f‖₂ = sup{|⟨f, g⟩| : g ∈ L2, ‖g‖₂ = 1} = sup{|⟨f, g⟩| : g ∈ C∞₀, ‖g‖₂ = 1}

(the first identity holds in any Hilbert space, the second one uses the fact that C∞₀ is dense in L2, THINK IT OVER!) Thus, by integration by parts (with g running over smooth, compactly supported vector fields),

‖∇ψ‖₂ = sup{|⟨∇ψ, g⟩| : g ∈ C∞₀, ‖g‖₂ = 1}

= sup{|⟨ψ, ∇·g⟩| : g ∈ C∞₀, ‖g‖₂ = 1}

= sup{ lim_{j→∞} |⟨ψj, ∇·g⟩| : g ∈ C∞₀, ‖g‖₂ = 1}

= sup{ lim_{j→∞} |⟨∇ψj, g⟩| : g ∈ C∞₀, ‖g‖₂ = 1}

≤ lim inf_{j→∞} ‖∇ψj‖₂    (4.15)

[THINK OVER the last step when we interchanged sup and lim inf!], where we used that ψj converges to ψ weakly in L2 as well.
Remark: Note that in this proof we did not use that ψj ⇀ ψ in H1, just that ψj ⇀ ψ weakly in L2. The weak convergence in L2 is weaker than the weak convergence in H1, i.e. every H1-weakly convergent sequence converges weakly also in L2. This is because for any g ∈ L2 we have

⟨ψj, g⟩ = ∫ ψ̂j(k) [ĝ(k)/(1 + 4π²k²)] (1 + 4π²k²) dk → ∫ ψ̂(k) [ĝ(k)/(1 + 4π²k²)] (1 + 4π²k²) dk = ⟨ψ, g⟩

In the middle limit we used that ψj weakly converges in H1 and that the function

ĝ(k)/(1 + 4π²k²)

is the Fourier transform of an H1-function, because its H1-norm is given by

∫ [|ĝ(k)|²/(1 + 4π²k²)] dk ≤ ‖g‖₂² < ∞
This statement holds in full generality:
Lemma 4.2 Suppose that the vector space X is equipped with two norms, ‖·‖₁ and ‖·‖₂, such that the second norm is stronger, i.e. there is a constant C such that

‖x‖₁ ≤ C‖x‖₂, ∀x ∈ X,

and (X, ‖·‖₂) is dense in (X, ‖·‖₁). Suppose that a sequence xn converges weakly in the norm ‖·‖₂; then it converges weakly in ‖·‖₁ as well.

Proof: Let X₁ and X₂ denote the normed spaces that are the vector space X equipped with ‖·‖₁ and ‖·‖₂, respectively. Since X₂ ⊂ X₁ is a dense subset, we easily see that X₁∗ ⊂ X₂∗ (CHECK!). Thus, if xn converges to x weakly in X₂, then ℓ(xn) → ℓ(x) for any ℓ ∈ X₂∗, but then ℓ(xn) → ℓ(x) for any ℓ ∈ X₁∗ as well, i.e. xn converges to x weakly in X₁.
Proof of (4.13). Cut off the function V as follows:

V^δ(x) = V(x) · 1{x : |V(x)| ≤ 1/δ}

Then, by the Hölder and Sobolev inequalities as in (2.4),

|∫ (V − V^δ)|ψj|²| ≤ C‖V − V^δ‖_{d/2} ‖ψj‖²_{H1} → 0

as δ → 0, uniformly in j (because of (4.11)). Therefore it is enough to show [THINK IT OVER!] that

lim_{j→∞} ∫ V^δ|ψj|² = ∫ V^δ|ψ|²

for each fixed δ. Fix now δ, choose ε > 0 and define

Aε = {x : |V^δ(x)| > ε}

Since V vanishes at infinity, we have |Aε| < ∞. Write

∫ V^δ|ψj|² = ∫_{Aε} V^δ|ψj|² + ∫_{Aε^c} V^δ|ψj|²,    ∫ V^δ|ψ|² = ∫_{Aε} V^δ|ψ|² + ∫_{Aε^c} V^δ|ψ|²

On the complement of Aε we have |V^δ| ≤ ε, thus

|∫_{Aε^c} V^δ|ψj|²| ≤ ε‖ψj‖² = ε

and similarly

|∫_{Aε^c} V^δ|ψ|²| ≤ ε‖ψ‖² ≤ ε

using the lower semicontinuity of the norm, ‖ψ‖ ≤ lim inf ‖ψj‖ = 1 (see (3.8)). Thus it is sufficient to show that

lim_{j→∞} ∫_{Aε} V^δ|ψj|² = ∫_{Aε} V^δ|ψ|²

for every fixed ε > 0, δ > 0. [Make sure you really understand this argument! Uniformity in j in the truncation bounds guarantees that it is sufficient to show the j → ∞ limit for each fixed value of the truncation!]

We estimate

|∫_{Aε} V^δ(|ψj|² − |ψ|²)| ≤ (1/δ) ∫_{Aε} ||ψj|² − |ψ|²| ≤ (1/δ) ‖ψj − ψ‖_{L2(Aε)} (‖ψj‖₂ + ‖ψ‖₂)

so it is sufficient to show that

lim_{j→∞} ‖ψj − ψ‖_{L2(Aε)} = 0

i.e. that a sequence that is weakly convergent in H1 converges strongly in L2 on any set of finite measure. This is the content of the Rellich-Kondrashev theorem (see Theorem 5.1 later) and this proves (4.13).
Proof of (4.14). Using (4.12) and (4.13), we know that E(ψ) ≤ lim inf E(ψj). Thus
E0 = lim_{j→∞} E(ψj) ≥ E(ψ) ≥ E0‖ψ‖²

where, in the last step, we used that E0 was the infimum of E(ψ) on the unit sphere, ‖ψ‖ = 1, together with the fact that E is quadratic: E(ψ) = ‖ψ‖² E(ψ/‖ψ‖) ≥ ‖ψ‖² E0 for ψ ≠ 0 (and trivially for ψ = 0). Thus

E0 ≥ E0‖ψ‖²

and since E0 < 0, we get ‖ψ‖² ≥ 1. But ψ was the weak limit of a normalized sequence in H1 (thus also in L2), and the norm can only drop along the weak limit, so we have

‖ψ‖ ≤ lim inf_{j→∞} ‖ψj‖ = 1

Thus ‖ψ‖ = 1, which proves (4.14).
5 Rellich-Kondrashev theorem
We have seen that strong (norm) convergence is indeed stronger than weak convergence. So how can it happen that fn ⇀ 0 weakly in Lp (p < ∞) but fn does not converge strongly? There are essentially three qualitative possibilities. First is that the function oscillates to death, second is that it scales up to a spike or down to a flat pancake, and third is that it wanders out to infinity. The oscillation and the spike features can occur only if the derivative blows up (or the function wanders out to infinity in Fourier space!). The pancake and the wandering features can occur only in infinite volumes. Therefore it is natural to guess that if we keep the derivative under control and we restrict ourselves to a finite domain, then weak convergence actually implies strong convergence. This is the content of the following important theorem.
Theorem 5.1 (Rellich-Kondrashev) Let B ⊂ Rd be a subset with finite Lebesgue measure, |B| < ∞. Let fn converge weakly in H1(Rd) to f. Then for any admissible exponent q we have

lim_{n→∞} ∫_B |fn − f|^q = 0

i.e. χB fn converges strongly (in Lq) to χB f. Here the exponent q must satisfy the following:

1 ≤ q < 2d/(d − 2), if d ≥ 3

1 ≤ q < ∞, if d = 2

1 ≤ q ≤ ∞, if d = 1
Corollary 5.2 Let fn converge weakly in H1(Rd) to f. Then there is a subsequence f_{n_j} that converges to f(x) pointwise almost everywhere.

Proof of the Corollary. Consider the sequence of balls Bk centered at the origin with radius k. Recall that by the Riesz–Fischer theorem, one can select a subsequence converging pointwise almost everywhere from a strongly convergent sequence in Lp. Thus, by Rellich-Kondrashev, we can find a subsequence f_{n₁(j)} that converges pointwise almost everywhere on B1. Repeating the argument for the ball B2 and this subsequence instead of the original sequence, we can choose a sub-subsequence f_{n₂(j)} that converges almost everywhere on B2. Etc. Finally, by Cantor diagonalization, the diagonal subsequence f_{n_j(j)} converges for almost every x ∈ Rd.
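The index bookkeeping of this diagonal argument can be sketched in code. A hedged Python sketch, where refine is a hypothetical stand-in for "pass to a subsequence converging a.e. on the ball B_k" (here it simply keeps every other index):

```python
def refine(indices, k):
    """Stand-in for selecting a subsequence that converges a.e. on the ball B_k."""
    return indices[::2]

chains = [list(range(1, 10000))]   # chains[0]: the original index sequence n = 1, 2, ...
for k in range(1, 6):              # nested subsequences for B_1, ..., B_5
    chains.append(refine(chains[-1], k))

# Cantor diagonal: the j-th element of the j-th chain
diagonal = [chains[j][j] for j in range(1, 6)]

# the diagonal is strictly increasing (a genuine subsequence) ...
assert all(a < b for a, b in zip(diagonal, diagonal[1:]))
# ... and from its k-th term on it lies inside the k-th chain, so it inherits
# the a.e. convergence achieved on every ball B_k
for k in range(1, 6):
    assert all(n in chains[k] for n in diagonal[k - 1:])
```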
This theorem is sometimes formulated as a compact embedding. Notice that the allowed range for q is exactly the range of Sobolev exponents in finite volume, i.e. by the Hölder and Sobolev inequalities
‖f‖_{Lq(B)} ≤ C(B, q)‖f‖_{2d/(d−2)} ≤ C′(B, q)‖f‖_{H1}    (5.16)
This means that the restriction operator
iB : f → f |B
(defined as (iBf)(x) = f(x)1B(x)) as a map from H1 to Lq(B)
iB : H1 → Lq(B)
is a compact linear map, i.e. it maps bounded sets (in the H1-topology) into relatively compact ones (in the Lq(B)-topology). [Recall: a set is relatively compact if its closure is compact.] In terms of sequences this means that any bounded sequence in H1 has a convergent subsequence in Lq(B).
To see this, suppose we have a sequence fn such that sup_n ‖fn‖_{H1} < ∞, i.e. it is uniformly bounded in H1. In particular, from the Sobolev inequality (5.16) it follows that {fn} is a bounded subset of Lq(B), since sup_n ‖fn‖_{Lq(B)} < ∞. However, it is even relatively compact, i.e. there is a subsequence f_{n_j} that converges in Lq(B). To see this, just recall that by the Banach–Alaoglu theorem (applied to the Hilbert space H1) there is a subsequence f_{n_j} that converges weakly in H1 to some function f. But then by Rellich-Kondrashev the same sequence, restricted to B, converges strongly in Lq(B).
The theorem also holds if we assume the weak convergence of fn in H1(B) if B is a bounded, sufficiently nice domain (e.g. with piecewise C1 boundary without cusps) so that the Sobolev inequality holds:
‖f‖_{Lq(B)} ≤ Cq‖f‖_{H1(B)}
in other words H1(B) ⊂ Lq(B)
The Rellich-Kondrashev theorem for this case says that this containment is actually compact, i.e. a closed bounded set in H1 is a compact set in Lq. In other words, that the identity operator
id : H1(B) → Lq(B)
is a compact linear map. There are similar compact embeddings for general Sobolev spaces, see Theorem 8.9 of
Lieb-Loss for a more general statement. The most general formulation is given in R. Adams, J. Fournier: Sobolev Spaces, Theorem 6.3.
Recall the Arzelà–Ascoli theorem, which is of similar spirit. It characterizes compact subsets of C(K), where K ⊂ Rd is compact. For compactness it is not sufficient for a sequence {fn} ⊂ C(K) to be simply bounded, because of the possible oscillation. But if, additionally, the sequence is equicontinuous (which, on a compact set, is equivalent to being uniformly equicontinuous), then it is relatively compact.
Proof of Theorem 5.1. We will give the proof in the case d ≥ 3; the other two cases are analogous (HOMEWORK!).
The proof starts with a standard smoothing idea; define a function φ ∈ C∞₀ with ∫ φ = 1 and set

φm(y) := m^d φ(my),

so that ∫ φm = 1 and

∫ |φm(y)||y| dy = (1/m) ∫ |φ(y)||y| dy
First we present the idea, for simplicity assume that q = 2, and we write
‖fn − f‖_{L2(B)} ≤ ‖fn − fn∗φm‖_{L2(B)} + ‖fn∗φm − f∗φm‖_{L2(B)} + ‖f∗φm − f‖_{L2(B)}    (5.17)

We have seen in the proof of the density of C∞₀ in Lp that the last term (even after extending the norm from L2(B) to L2 = L2(Rd)) converges to zero:

lim_{m→∞} ‖f∗φm − f‖₂ = 0

Similarly, the first term converges to zero as m → ∞, but this convergence is in general not uniform in n. Finally, the function in the middle term in (5.17) converges at least pointwise to zero for any fixed m as n → ∞:

(fn∗φm − f∗φm)(x) = ∫ (fn − f)(y)φm(x − y) dy → 0    (5.18)

by the weak convergence fn − f ⇀ 0 in L2, with the uniform bound |(fn∗φm − f∗φm)(x)| ≤ ‖fn − f‖₂‖φm‖₂. Note that the L2 norm of φm is not uniformly bounded in m, since the scaling sets the normalization of φm in L1.
Thus we have to circumvent the lack of uniformity in the double limit. Without further conditions (like boundedness in H1) it will not work; think over what happens if fn = e^{2πinx} in L2[0, 1]!

Since fn converges weakly in H1, it is in particular bounded:

sup_n ‖fn‖_{H1} < ∞
by the uniform boundedness principle. The boundedness in H1 will guarantee that the convergence in the first term in (5.17), lim_{m→∞} ‖fn − fn∗φm‖₂ = 0, is uniform in n. To see this, let h ∈ Rd and f ∈ H1, and compute

∫ |f(x + h) − f(x)|² dx = ∫ |e^{2πik·h} − 1|² |f̂(k)|² dk ≤ 4π²|h|² ∫ |k|²|f̂(k)|² dk = |h|² ‖∇f‖₂²    (5.19)

using |e^{iθ} − 1| ≤ |θ|. Then, for any f ∈ H1,

‖f∗φm − f‖₂ = ‖∫ [f(· − y) − f(·)] φm(y) dy‖₂ ≤ ∫ ‖f(· − y) − f‖₂ |φm(y)| dy ≤ ‖∇f‖₂ ∫ |φm(y)||y| dy    (5.20)

In the first step we used ∫ φm = 1 to "smuggle in" the additional f(x) inside the x-integration. The second step is the triangle inequality: for finitely many functions it reads ‖∑ an fn‖ ≤ ∑ |an|‖fn‖, but the summation can be replaced with an integral and the coefficients an with any weight function φm(y) (Minkowski's integral inequality). Finally, the last step is (5.19).
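The translation bound (5.19) can be sanity-checked in one dimension. A hedged Python sketch (the Gaussian f and the finite integration window are illustrative choices): it verifies ∫ |f(x + h) − f(x)|² dx ≤ |h|² ‖∇f‖₂² by midpoint quadrature.

```python
import math

def f(x):
    return math.exp(-x * x)           # illustrative H^1 function

def fp(x):
    return -2 * x * math.exp(-x * x)  # its derivative

def integral(g, a=-12.0, b=12.0, n=48000):
    """Midpoint-rule integral of g over [a, b] (the Gaussian tails beyond are negligible)."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

grad_sq = integral(lambda x: fp(x) ** 2)
for h in (0.25, 1.0, 2.0):
    lhs = integral(lambda x, h=h: (f(x + h) - f(x)) ** 2)
    assert lhs <= h * h * grad_sq     # the translation bound (5.19)
```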
Applying this inequality to fn and φm, and using that sup_n ‖∇fn‖₂ < ∞, we have

‖fn∗φm − fn‖₂ ≤ ‖∇fn‖₂ ∫ |φm(y)||y| dy ≤ C/m

with some constant C, uniformly in n. Thus, given any ε > 0, we can find an m such that

‖fn∗φm − fn‖₂ ≤ ε/3    (5.21)

and

‖f∗φm − f‖₂ ≤ ε/3    (5.22)

i.e. the first and the third term in (5.17) are under control.

For the middle term, we note that the weak convergence of fn in H1 also implies weak convergence in L2, thus by (5.18) we have that

fn∗φm(x) = ∫ fn(y)φm(x − y) dy → ∫ f(y)φm(x − y) dy = f∗φm(x)

holds for all x and all m (test the weak convergence fn ⇀ f against the L2-function y → φm(x − y) for any fixed x). Moreover,

|fn∗φm(x)| = |∫ fn(y)φm(x − y) dy| ≤ ‖fn‖₂‖φm‖₂ ≤ C‖φm‖₂

with some constant C, uniformly in n. Thus, by dominated convergence and the finiteness of |B|, we have

lim_{n→∞} ‖fn∗φm − f∗φm‖_{L2(B)} = 0    (5.23)

(this limit is not uniform in m). Combining this bound with (5.17), (5.21), (5.22), we obtain the statement of the theorem for q = 2. (THINK it over! For a given ε first choose a sufficiently big m, dictated by (5.21), (5.22), then fix this big m, and choose a sufficiently large n so that the norm in (5.23) is smaller than ε/3.)
We have thus proved the Rellich-Kondrashev theorem for q = 2. For 1 ≤ q < 2 we simply use Hölder's inequality and the fact that B has finite measure:

‖fn − f‖_{Lq(B)} ≤ C(B, q)‖fn − f‖_{L2(B)} → 0

Finally, for 2 < q < 2d/(d − 2) we use the Hölder and Sobolev inequalities:

‖fn − f‖_{Lq(B)} ≤ ‖fn − f‖_{L2(B)}^θ ‖fn − f‖_{2d/(d−2)}^{1−θ} ≤ ‖fn − f‖_{L2(B)}^θ [C(‖∇fn‖₂ + ‖∇f‖₂)]^{1−θ} ≤ C′‖fn − f‖_{L2(B)}^θ → 0

by using the uniform boundedness of ‖∇fn‖₂. Here the exponent θ is dictated by the Hölder inequality (check that θ = d(1/q − (d − 2)/(2d)) > 0 will do the job).
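The exponent arithmetic for θ can be verified exactly with rational arithmetic. A small Python sketch (the sample values of d and q are illustrative): it checks that θ = d(1/q − (d − 2)/(2d)) satisfies the Hölder interpolation relation 1/q = θ/2 + (1 − θ)(d − 2)/(2d) and lies strictly between 0 and 1.

```python
from fractions import Fraction

# Hölder interpolation ||f||_q <= ||f||_2^θ ||f||_p^{1-θ} with p = 2d/(d-2)
# requires 1/q = θ/2 + (1-θ)/p; the notes claim θ = d(1/q - (d-2)/(2d)) works.

def theta(d, q):
    return d * (Fraction(1, 1) / q - Fraction(d - 2, 2 * d))

for d in (3, 4, 5):
    p = Fraction(2 * d, d - 2)                  # Sobolev exponent 2d/(d-2)
    for q in (Fraction(5, 2), 3, Fraction(7, 2)):
        if not (2 < q < p):
            continue                            # only the range 2 < q < 2d/(d-2)
        t = theta(d, q)
        assert Fraction(1, 1) / q == t / 2 + (1 - t) / p  # exponent relation
        assert 0 < t < 1                                  # genuine interpolation
```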
6 Distributions in a nutshell

The concept of distributions is not really necessary for the main arguments in this course, so we will not introduce them in full rigor; however, they will be (mildly) needed in the following proof. So we summarize the basic idea following Chapter 6 of Lieb-Loss (for more details, look up the book).
We have already defined the concept of weak solutions of a PDE. Look at the following very simple differential equation:
f ′(x) = 2H(x) − 1 (6.24)
where f is the unknown function and H(x) = 1{x : x > 0} is the Heaviside function. What could f be?
If we insist on the usual definition of the derivative, then there is no solution, because of the Darboux theorem from real analysis [Recall: if f is differentiable on a closed interval [a, b] and it has a right derivative at a and a left derivative at b, then f′ takes on any value between f′₊(a) and f′₋(b)].

On the other hand, the function f(x) = |x| + c (with some constant c) "should be" a solution in any reasonable sense, since f′(x) = 2H(x) − 1 for all x ≠ 0. Since we are anyway prepared to forget about a zero measure set of points, this "hole" should not really prevent us from declaring f(x) = |x| + c to be a solution.
Now we can be even bolder, and differentiate (6.24):
f ′′(x) = 2H ′(x)
Of course H′(x) = 0 for any x ≠ 0, and we agreed not to care about one single point, so it looks like
f ′′(x) = 0
But we can solve this differential equation, the solution is f(x) = bx + c for some constants b, c. However, now we have a problem, since the solution f(x) = |x| + c is not recovered. So maybe something still happened at x = 0 which we missed? What is allowed and what is not?
The right concept is the weak derivative, as we already mentioned earlier. Recall that f ∈ L¹_loc(Rd) (locally integrable function) if its restriction to every compact set K is L¹. We say that a function f ∈ L¹_loc(Rd) has a weak derivative, Df(x), if there is a function g ∈ L¹_loc(Rd) such that

∫_{Rd} Dφ(x)f(x) dx = −∫_{Rd} φ(x)g(x) dx

for all test functions φ ∈ C∞₀. (D denotes the derivative in general, so in the case of functions on Rd it is just the gradient, but we sometimes use the more general notation as we will differentiate other objects as well.)

It is easy to see that g(x) is unique (as an element of L¹_loc(Rd)) and we denote Df(x) = g(x). It is also easy to see that the weak derivative coincides with the strong one if f is differentiable. In other words, the weak derivative, if it exists, is defined with the requirement that the formal integration by parts holds when tested against a nice (smooth, compactly supported) test function. It is easy to check [HOMEWORK!] that in this sense
(|x| + c)′ = 2H(x) − 1
but H′(x) ≠ 0. So what is H′(x)? We need a g(x) such that

∫ Dφ(x)H(x) dx = −∫ φ(x)g(x) dx

(here, in one dimension, Dφ = φ′). But it is easy to see that

∫ Dφ(x)H(x) dx = ∫₀^∞ φ′(x) dx = −φ(0)

so we need g(x) such that

∫ φ(x)g(x) dx = φ(0)    (6.25)

for any φ ∈ C∞₀. Obviously, there is no such function g(x) ∈ L¹_loc; in particular the above argument, saying that H′ = 0, was wrong. [THINK it over: g = 0 does not do the job, but otherwise there is a set of positive Lebesgue measure A, separated away from zero, dist(A, 0) > 0, such that, say, ∫_A g > 0. We can assume that A is bounded. By the regularity of the measure with density g, there is a sequence of open sets On such that A ⊂ On, ∫_{On} g → ∫_A g > 0. In particular, there is an open interval I, separated away from zero, such that ∫_I g > 0. For any ε > 0 choose a smooth function φε such that 0 ≤ φε ≤ 1_I, and such that φε → 1_I pointwise as ε → 0. Conclude that

0 < ∫_I g = ∫ g φε − ∫ g(φε − 1_I)

The first term is φε(0) = 0 by (6.25), the second goes to zero as ε → 0 by dominated convergence. Contradiction.]
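The computation ∫ Dφ(x)H(x) dx = −φ(0) can be verified numerically. A hedged Python sketch, using the Gaussian φ(x) = e^{−x²} as an illustrative stand-in for a compactly supported test function:

```python
import math

# Check ∫ φ'(x) H(x) dx = -φ(0) for the Heaviside function H.
# φ(x) = exp(-x^2) decays fast enough to stand in for a C_0^∞ test function.

def phi(x):
    return math.exp(-x * x)

def phi_prime(x):
    return -2 * x * math.exp(-x * x)

# ∫_0^∞ φ'(x) dx via midpoint rule on [0, 10] (the tail beyond 10 is negligible)
n, a, b = 200000, 0.0, 10.0
h = (b - a) / n
integral = sum(phi_prime(a + (k + 0.5) * h) for k in range(n)) * h

# ∫ φ' H = φ(∞) - φ(0) = -φ(0) = -1
assert abs(integral + phi(0)) < 1e-6
```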
So it seems that H′(x) cannot be defined with the above definition of the weak derivative. However, we can just define H′(x) to be the "object" that assigns to any function φ(x) its value at zero. In other words, it will not be a function, but a linear functional on the space of test functions. The key idea is thus to depart from the concept of functions, and replace them by linear functionals on C∞₀.

Definition 6.1 For any non-empty open set Ω ⊂ Rd we define the space of test functions, D(Ω), to be the space C∞₀(Ω) equipped with the following notion of convergence. A sequence φn ∈ D(Ω) converges to φ ∈ D(Ω) if there is a (common!) compact set K ⊂ Ω that contains the support of φn − φ,

supp(φn − φ) ⊂ K

and

Dαφn → Dαφ

uniformly on K for any multiindex α = (α1, α2, . . . , αd). Obviously, D(Ω) is a linear space.

For mathematicians: The space D(Ω) is not a normed space; there is not a single norm that generates this topology, but it can be generated by a family of seminorms. It is not metrizable, but complete. In this presentation we avoided defining the topology precisely (via bases of neighborhoods etc.); we only defined convergence. In general, convergence does not determine the topology, but in "not too big" spaces it does. This is the case for D(Ω), so actually the convergence given above defines the topology.

Definition 6.2 A distribution T is a continuous linear functional on D(Ω). The space of distributions is a linear space, denoted by D′(Ω), and it is equipped with the natural convergence:

Tn → T in D′(Ω) iff Tn(φ) → T(φ) for any φ ∈ D(Ω)
For any f ∈ L¹_loc(Rd) we can define a distribution

Tf(φ) = ∫_{Rd} f(x)φ(x) dx

It is easy to see that this is indeed a distribution [CHECK!]. It can be easily shown that if f, g ∈ L¹_loc and Tf = Tg as distributions (i.e. ∫ fφ = ∫ gφ for any φ ∈ C∞₀), then f = g almost everywhere, i.e. f = g as elements of L¹_loc. We can therefore identify f with the distribution Tf, and in the case of such distributions we will not distinguish between the function f and the distribution Tf.
The Dirac delta distribution at the point x ∈ Rd is defined as
δx(φ) = φ(x)
We have shown above that this distribution is not given by a function (the proof above was given only in d = 1; CHECK that it works in any dimension). Despite this fact, it is often called (especially in physics) the Dirac delta function.
Distributions can be differentiated:
Definition 6.3 For any distribution T ∈ D′(Ω) and any multiindex α we define its α-th order partial derivative

DαT = (∂/∂x1)^{α1} (∂/∂x2)^{α2} · · · (∂/∂xd)^{αd} T

via the formula

(DαT)(φ) = (−1)^{|α|} T(Dαφ)

with |α| = Σ_{j=1}^d αj. This concept is called the weak or distributional derivative. [HOMEWORK: check that DαT is indeed a distribution.]
It is easy to check that for smooth functions f the old concept of derivative and the new one coincide:

DαTf = T_{Dαf}.

Actually this is true even in a stronger form [see Theorem 6.10 of Lieb-Loss]: if T ∈ D′(Ω) and ∇T is a continuous function G : Ω → Rd, then T is a C1 function F, i.e. T = TF and of course G = ∇F (in the classical sense).
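As a quick sanity check of Definition 6.3 (an illustrative NumPy grid computation, not part of the notes), one can verify numerically both DTf = T_{Df} for a smooth f and the Heaviside example from the beginning of this section:

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 200001)
dx = x[1] - x[0]
# smooth bump test function phi, supported in (-0.9, 0.9)
phi = np.where(np.abs(x) < 0.9,
               np.exp(-1.0 / np.maximum(1e-300, 0.81 - x**2)), 0.0)
phi_prime = np.gradient(phi, dx)

# smooth f: the weak derivative -T_f(phi') agrees with T_{f'}(phi)
f = np.sin(3 * x)
lhs = -np.sum(f * phi_prime) * dx
rhs = np.sum(3 * np.cos(3 * x) * phi) * dx
assert abs(lhs - rhs) < 1e-6

# Heaviside H: -T_H(phi') = phi(0), i.e. H' = delta_0 in D'(R)
H = (x > 0).astype(float)
assert abs(-np.sum(H * phi_prime) * dx - phi[x.size // 2]) < 1e-4
```

The first assertion is just integration by parts; the second shows the weak derivative quietly producing the delta where the classical derivative does not exist.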
The main advantage of the new definition is that every distribution is infinitely often differentiable! Moreover, taking limits is compatible with taking derivatives:

Tn → T (in D′(Ω)) =⇒ DαTn → DαT (in D′(Ω))

for any multiindex [WHY??]. So any convergent sequence of distributions can be differentiated. This is a big plus; remember how messy it is to interchange limits with derivatives for functions: if fn → f pointwise, then f′n → f′ does not hold in general. First, f′n may not exist, and second, even if f′n and f′ exist, the convergence does not follow.
With this new concept we are able to differentiate non-differentiable (or even discontinuous) functions; one example is

H′ = δ0

as an identity in D′(R). We are also able to make sense of (partial) differential equations for a much bigger class of solutions (i.e. we can talk about solutions in the distributional sense).

There are some facts about functions that easily extend to distributions (for the proofs see, e.g., Lieb-Loss, Section 6):
i) There is a natural formulation of the Fundamental Theorem of Calculus for distributions (see Thm 6.9 of Lieb-Loss); here I just write up one special case: for any y ∈ Rd and a.e. x ∈ Rd we have

f(x + y) − f(x) = ∫_0^1 y · ∇f(x + ty) dt

for any f ∈ W^{1,1}_loc = {f : Rd → C : f ∈ L1_loc, ∇f ∈ L1_loc}, where ∇f is understood in the distributional sense.
ii) If the derivative of a distribution is zero on a connected open set, then the distribution is constant (meaning that it coincides with a constant function).
iii) Distributions can be naturally multiplied by C∞ functions ψ:

(ψT)(φ) := T(ψφ),

and the usual Leibniz (product) rule holds. This definition extends the usual pointwise multiplication of functions.
iv) Distributions can be smoothed out by taking convolutions with a rescaled compactly supported function j ∈ C∞_c with ∫ j = 1. If we define

(j ∗ T)(φ) := T(∫ j(y)φ(· + y) dy),

then this distribution is given by a function, i.e. there exists a function t ∈ C∞ (depending on T and j) such that j ∗ T = Tt; in particular, j ∗ T is smooth. This definition extends the usual convolution of functions. If we rescale jε(x) = ε^{−d} j(x/ε), then

jε ∗ T → T in D′(Ω)

as ε → 0.
v) Let ψ1, ψ2, . . . , ψn be a family of L1_loc functions, and assume that the distribution T satisfies T(φ) = 0 for all φ ∈ D(Ω) such that ∫ φψj = 0, j = 1, 2, . . . , n. Then there exist constants cj such that

T = Σ_{j=1}^n cjψj.
[Let me just sketch the proof for n = 1. Fix a function u1 ∈ D such that ∫ u1ψ1 = 1 (such a function exists, why??). Write any φ ∈ D as

φ = v + (∫ φψ1) u1,    v := φ − (∫ φψ1) u1.

Then ∫ vψ1 = ∫ φψ1 − (∫ φψ1)(∫ u1ψ1) = 0, so T(v) = 0 and hence

T(φ) = (∫ φψ1) T(u1),

i.e. T = c1ψ1 with c1 = T(u1). The generalization to n > 1 is easy linear algebra.]
vi) Suppose that T(φ) = 0 for any φ ∈ D(Ω) such that supp φ ⊂ Ω \ {0}. Then there is a finite number K and constants c0, c1, . . . , cK such that

T = Σ_{j=0}^K cj δ0^{(j)},

i.e. distributions supported at the origin (in the above sense) can only be linear combinations of derivatives of the delta distribution.
vii) [Integration by parts] Similarly to the H1 spaces, in some cases integration by parts is allowed. We will need the following statement, which can be proved by a standard approximation argument. Suppose that u, v ∈ H1(Rd), that Δv ∈ L1_loc, and that Δv is real. Then

∫ uΔv = − ∫ ∇u · ∇v.    (6.26)
viii) A distribution T is called positive if T(φ) ≥ 0 for any nonnegative test function φ ≥ 0. Positive distributions "behave" much more nicely than general ones; in fact, positive distributions are regular Borel measures.
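Property iv) can be watched in action numerically. The sketch below (illustrative NumPy code, not part of the notes; a grid convolution stands in for j ∗ T) mollifies the Heaviside step: the result agrees with the step away from the jump and, for a symmetric bump j, passes through 1/2 at the jump:

```python
import numpy as np

x = np.linspace(-2.0, 2.0, 8001)
dx = x[1] - x[0]
f = (x > 0).astype(float)              # Heaviside step, discontinuous

def mollify(f, eps):
    # j_eps(x) = j(x/eps)/eps for a smooth compactly supported bump j, int j = 1
    u = x / eps
    j = np.where(np.abs(u) < 1,
                 np.exp(-1.0 / np.maximum(1e-300, 1 - u**2)), 0.0)
    j /= np.sum(j) * dx                # normalize so that int j_eps = 1 on the grid
    return np.convolve(f, j, mode="same") * dx

g = mollify(f, 0.1)
interior = (x > 0.2) & (x < 1.5)       # away from the jump and the box edge
assert np.all(np.abs(g[interior] - 1.0) < 1e-10)   # agrees with f off the jump
assert abs(g[x.size // 2] - 0.5) < 1e-2            # smoothed jump through 1/2
```

Shrinking eps reproduces jε ∗ T → T: the smoothed step hugs the original ever more tightly.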
So far it seems that distributions are in every respect superior to functions: we can seemingly do everything with them (limits, differentiation, convolution) and everything seems much easier. Life is not quite so nice: there are natural operations that cannot be performed with distributions. First, distributions form a linear space, so they can be added and subtracted, but in general they cannot be multiplied or divided. E.g. it does not make sense to talk about the square of the Dirac delta distribution. This is the main reason why distributions are very useful for linear PDE's, but one has to be very careful with their usage for nonlinear PDE's.
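To see concretely why δ0² has no meaning, one can square the mollified deltas jε from property iv). The following illustrative NumPy computation (not part of the notes) shows ∫ jε² growing like 1/ε, so the squares jε² cannot converge to any distribution as ε → 0:

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 200001)
dx = x[1] - x[0]

def j_eps(eps):
    # normalized bump j(x/eps)/eps, which converges to delta_0 in D'
    u = x / eps
    j = np.where(np.abs(u) < 1,
                 np.exp(-1.0 / np.maximum(1e-300, 1 - u**2)), 0.0)
    return j / (np.sum(j) * dx)

# int j_eps^2 scales like C/eps: each halving of eps doubles the mass
masses = [float(np.sum(j_eps(eps) ** 2) * dx) for eps in (0.4, 0.2, 0.1, 0.05)]
assert masses[-1] > 1.9 * masses[-2]
assert masses[-2] > 1.9 * masses[-3]
```

Pairing jε² with any test function equal to 1 near the origin therefore diverges, which is the blow-up behind "δ² does not exist".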
The second operation that is in general meaningless is composition: one cannot make sense of T(S(·)) for two distributions T, S ∈ D′. However, for certain functions in Sobolev spaces composition does make sense. An important special case is that of H1 functions (for a more general statement see Lieb-Loss, Theorem 6.16): if G = G(s1, s2, . . . , sd) is a differentiable function with bounded and continuous derivatives and u = (u1, u2, . . . , ud) is a vector-valued function with components uj ∈ H1, then the function

K(x) = (G ∘ u)(x)

belongs to H1 and its weak derivatives are given by the chain rule,

∂K/∂xj = Σ_{i=1}^d (∂G/∂si)(u(x)) ∂ui/∂xj.
Remark on tempered distributions: In the case Ω = Rd one can develop the theory of distributions via the Schwartz functions S(Rd) instead of D(Rd). The definition is the same, but we get a smaller set S′(Rd) ⊂ D′(Rd), called the space of tempered distributions. Note that S′ takes into account the behavior at infinity, while D′(Rd) is a local concept. E.g. any function f ∈ L1_loc(Rd) is in D′(Rd), but if f grows too fast at infinity (faster than any polynomial), then it is not in S′(Rd). One advantage of working with S′(Rd) is that the Fourier transform can be extended from S(Rd) to S′(Rd), and it is a bijection from S′(Rd) to itself. The Fourier transforms of elements of D′ are much harder to characterize. Some of these issues may be discussed in the Tutorium; they are not necessary for this course.
7 Excited states
[We follow Lieb-Loss, Theorem 11.6.] Consider −Δ + V where V satisfies the conditions of Theorem 4.1. Let E0 be the ground state energy and let ψ0 be (one of) the ground state eigenfunctions. We successively define energies E1 ≤ E2 ≤ · · · ≤ Ek ≤ · · · and eigenfunctions ψ1, ψ2, . . . by the following variational principle. Suppose that E0, . . . , Ek−1 and ψ0, . . . , ψk−1 have been defined; then let

Ek := inf { E(ψ) : ψ ∈ H1(Rd), ‖ψ‖2 = 1, ⟨ψ, ψj⟩ = 0, j = 0, 1, . . . , k − 1 }.    (7.27)
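The recursion (7.27) can be tried out on a discretized operator. In the following illustrative NumPy sketch (the grid, box size, and potential are arbitrary choices, not from the notes), minimizing ⟨ψ, Hψ⟩ over unit vectors orthogonal to the first k eigenvectors indeed reproduces the k-th eigenvalue of the matrix:

```python
import numpy as np

n, L = 400, 20.0
x = np.linspace(-L / 2, L / 2, n)
h = x[1] - x[0]
V = -50.0 * np.exp(-x**2)                 # a deep attractive well

# discretized H = -d^2/dx^2 + V with Dirichlet walls at the box ends
H = (np.diag(np.full(n, 2.0)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2 + np.diag(V)
evals, evecs = np.linalg.eigh(H)

# (7.27): with P the projection onto the orthogonal complement of
# psi_0, ..., psi_{k-1}, the lowest eigenvalue of P H P (negative branch)
# is exactly the constrained minimum E_k.
k = 2
assert evals[k] < 0                       # E_2 is still a bound state here
P = np.eye(n) - evecs[:, :k] @ evecs[:, :k].T
E_k_constrained = np.linalg.eigvalsh(P @ H @ P)[0]
assert abs(E_k_constrained - evals[k]) < 1e-6
```

In this finite dimensional toy model the min-max structure is exact linear algebra; the content of Theorem 7.1 below is that the same variational recursion still works in H1(Rd), as long as the infimum stays negative.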
Theorem 7.1 Let V satisfy the conditions of Theorem 4.1. Suppose that for some k ≥ 0 the first k eigenstates exist and assume that Ek defined above is negative, Ek < 0. Then the (k + 1)-th eigenstate ψk, defined as the minimizer in (7.27), also exists, and it satisfies the Schrödinger equation (−Δ + V)ψk = Ekψk in the weak sense. Thus the recursion above stops only when zero energy is reached. Moreover, if Ek < 0, then its multiplicity is finite, i.e. it can be listed at most finitely many times: there can be at most finitely many eigenstates corresponding to Ek. Even more, the increasing sequence E1 ≤ E2 ≤ . . . cannot accumulate at any negative number. The choice of ψk may not be unique, but the finite dimensional space spanned by the eigenfunctions belonging to the same eigenvalue is unique; in particular, the sequence of eigenvalues does not depend on the choice. Moreover, the eigenfunctions can be chosen real.
Sketch of the proof. The proof of the existence of ψk is exactly the same as the proof of Theorem 4.1: we choose a minimizing sequence, pass to a subsequence that converges weakly in H1, and the weak limit will be ψk. The constraints ⟨ψ, ψj⟩ = 0, j ≤ k − 1, survive the weak limit.
To prove that ψk satisfies the Schrödinger equation, we follow the same proof as for the ground state, but the test function φ ∈ C∞_0 must also satisfy the constraints ⟨φ, ψj⟩ = 0, j ≤ k − 1 [WHY???]. Thus we conclude that

T = (−Δ + V − Ek)ψk

is a distribution such that T(φ) = 0 for any φ ∈ C∞_0 with ⟨φ, ψj⟩ = 0, j ≤ k − 1. It follows from this (see property v) of distributions above) that T is a linear combination of the ψj:

T = Σ_{j=0}^{k−1} cjψj.    (7.28)
To prove that all cj = 0, we apply T to some ψi, i ≤ k − 1. This step is only formal, since ψi is not a C∞_0 function, so T(ψi) strictly speaking does not make sense. For the rigorous proof one can use an approximation argument, i.e. approximate ψi by C∞_0 functions. Accepting this formality, and integrating by parts, we have

ci = T(ψi) = ∫ ∇ψi · ∇ψk + ∫ V ψiψk − Ek ∫ ψiψk = ∫ ∇ψi · ∇ψk + ∫ V ψiψk

(we used that ψi and ψk are orthogonal). On the other hand, ψi satisfies (−Δ + V − Ei)ψi = 0 in the distributional sense, and testing its complex conjugate against ψk we obtain

∫ ∇ψi · ∇ψk + ∫ V ψiψk = Ei ∫ ψiψk = 0,
so we have proved that ci = 0 for all i ≤ k − 1.

Finally, assume that Ek < 0 has infinite multiplicity, i.e. Ek = Ek+1 = Ek+2 = . . .. Following the proof of the existence of the successive eigenstates, we find an orthonormal sequence ψk, ψk+1, . . . that all satisfy (−Δ + V)ψ = Ekψ. By (2.2) we see that ‖ψj‖_{H1} is bounded, so there is a weakly convergent subsequence ψnj ⇀ ψ in H1. But then ψnj ⇀ ψ weakly in L2 as well (see Lemma 4.2). However, an orthonormal sequence in L2 converges weakly to 0 [WHY? – THINK IT OVER], thus ψ = 0. But then

lim_{j→∞} ∫ V |ψnj|² = 0,

while

Ek = ∫ |∇ψnj|² + ∫ V |ψnj|².

Now taking the liminf as j → ∞, the first term is nonnegative and the second goes to zero, so we get Ek ≥ 0, which contradicts Ek < 0.
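The [WHY? – THINK IT OVER] above is worth making concrete: for an orthonormal sequence en and any fixed v, ⟨en, v⟩ is the n-th coefficient of v, and Σn |⟨en, v⟩|² < ∞ by Bessel's inequality, so ⟨en, v⟩ → 0. A toy check in ℓ² (illustrative code, not part of the notes):

```python
import numpy as np

# e_n = n-th standard basis vector of l^2; then <e_n, v> = v_n, and
# sum |v_n|^2 < infinity forces v_n -> 0, i.e. e_n -> 0 weakly.
v = 1.0 / np.arange(1, 10001)            # a fixed vector in l^2
inner = [v[n - 1] for n in (10, 100, 1000, 10000)]   # <e_n, v> = 1/n
assert inner == sorted(inner, reverse=True)          # pairings shrink
assert inner[-1] < 1e-3                              # and tend to 0
```

No subsequence of the en converges in norm (they stay at mutual distance √2), which is exactly why only weak limits are available in the proof.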
It is easy to see that all we used in this proof is that we have an infinite sequence of eigenvalues lying below a strictly negative number; i.e., E1 ≤ E2 ≤ . . . ≤ E with some E < 0 is also excluded. Thus the eigenvalues cannot have an accumulation point on the negative axis.
The uniqueness of the eigenspace is an easy exercise in linear algebra (as far as the eigenspaces are concerned, we can now work in finite dimensions). The fact that the eigenfunctions can be chosen real follows from the observation that if ψ is an eigenfunction, then so is its complex conjugate ψ̄; so if ψ is not a constant multiple of a real function, then instead of ψ and ψ̄ one can take their real and imaginary parts, which are also eigenfunctions and span the same space.
Remark. If you are unhappy that the proof was not rigorously formalized in every detail and you do not want to spend a little time going through the approximation argument with general distributions, here is an alternative argument. I will show it only for the first excited state ψ1. The existence of ψ1 was rigorous, and so was the derivation of the relation

∫ ∇ψ1 · ∇φ + ∫ V ψ1φ = E1 ∫ ψ1φ    (7.29)

for any φ ∈ C∞_0 such that ⟨ψ0, φ⟩ = 0. Since ψ1 ∈ H1, by a standard density argument we see that (7.29) holds for all φ ∈ H1 with ⟨ψ0, φ⟩ = 0 [CHECK! – you will need the fact that ∫ V |f|² ≤ C(V)‖f‖²_{H1}, which has been used several times].
Define the linear functional L on H1 by

L(φ) := ∫ ∇ψ1 · ∇φ + ∫ V ψ1φ − E1 ∫ ψ1φ;

it is easy to see that this is a continuous linear functional. We know that L ≡ 0 on the orthogonal complement of the one dimensional space spanned by ψ0. Therefore there exists a number µ (called a Lagrange multiplier) such that

L(φ) = µ⟨ψ0, φ⟩

for all φ ∈ H1; in particular,

µ = L(ψ0) = ∫ ∇ψ1 · ∇ψ0 + ∫ V ψ1ψ0,    (7.30)
where we used ⟨ψ0, ψ1⟩ = 0. But ψ0 was the ground state, i.e.

∫ ∇ψ0 · ∇φ + ∫ V ψ0φ = E0 ∫ ψ0φ

for any φ ∈ C∞_0, which, as above, can be extended to any φ ∈ H1. Plugging in φ = ψ1, we get

∫ ∇ψ0 · ∇ψ1 + ∫ V ψ0ψ1 = E0 ∫ ψ0ψ1 = 0
by orthogonality. Comparing this equation with (7.30), we see that µ = 0, so L ≡ 0, and thus

∫ ∇ψ1 · ∇φ + ∫ V ψ1φ = E1 ∫ ψ1φ

holds for any φ ∈ H1 (without restriction to the orthogonal complement of ψ0). This means that ψ1 is a weak solution to the Schrödinger equation.
8 Properties of eigenfunctions
We do not have time to discuss all important properties in detail, but I would like to list a few key facts.
Theorem 8.1 Assume the conditions of Theorem 4.1 and suppose that E0 < 0. Then the following hold:
i) The ground state is unique (modulo trivial constant multiple)
ii) The ground state can be chosen strictly positive.
iii) The positivity of an eigenfunction characterizes the ground state, i.e. if (−Δ + V)ψ = Eψ for some ψ ∈ H1 and E ∈ R, and ψ ≥ 0, then E = E0 and ψ is the ground state.
iv) If V is spherically symmetric, i.e. V (x) = V (|x|), then so is the ground state.
v) If (−Δ + V)ψ = 0 holds in the distributional sense in some open ball B, and V ∈ Ck(B), then ψ ∈ Ck+2(B), i.e. the regularity of the solution is two orders better than the regularity of the potential.
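Parts i), ii) and iv) are easy to observe numerically for a discretized 1D operator. In the sketch below (illustrative NumPy code, not part of the notes; the well −10/cosh²x is an arbitrary symmetric choice), the lowest eigenvector is simple, strictly positive after fixing the overall sign, and symmetric like the potential:

```python
import numpy as np

n, L = 600, 12.0
x = np.linspace(-L / 2, L / 2, n)
h = x[1] - x[0]
V = -10.0 / np.cosh(x) ** 2               # symmetric attractive well

# discretized -d^2/dx^2 + V with Dirichlet walls at the box ends
H = (np.diag(np.full(n, 2.0)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2 + np.diag(V)
evals, evecs = np.linalg.eigh(H)

psi0 = evecs[:, 0] * np.sign(evecs[n // 2, 0])   # fix the overall sign

assert evals[0] < 0                       # a bound state exists
assert evals[1] - evals[0] > 1e-6         # i) the ground state is simple
assert np.all(psi0 > 0)                   # ii) strictly positive
assert np.allclose(psi0, psi0[::-1], atol=1e-8)  # iv) symmetric, like V
```

The positivity of the discrete ground state is the matrix Perron-Frobenius theorem in disguise; the Harnack inequality plays the analogous role in the continuum proof below.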
We will need the following lemma:
Lemma 8.2 If f ∈ H1 then |f| ∈ H1 and

∫ |∇|f||² ≤ ∫ |∇f|²;    (8.31)

actually, it holds even pointwise that |∇|f|| ≤ |∇f|. Moreover, if |f(x)| > 0, then equality in (8.31) can hold only if there is a complex number λ of unit length, |λ| = 1, such that f(x) = λ|f(x)|.
Sketch of the proof. The short (and almost correct) proof of this fact is the following. Write f(x) = |f(x)|e^{iθ(x)}, i.e. decompose f into modulus and phase. Then by ∇f = e^{iθ}∇|f| + i(∇θ)|f|e^{iθ} we have

|∇f|² = |∇|f||² + |∇θ|²|f|²,

so the inequality is obvious. Equality holds only if |∇θ|²|f|² ≡ 0; but if |f| > 0, then clearly ∇θ = 0, so θ = const.
The little problem with this argument is that the function θ in the decomposition f(x) = |f(x)|e^{iθ(x)} is not unique if f vanishes on some set. In other words, ∇|f| also has to be defined on the set where f(x) = 0. The absolutely honest argument goes through by defining ∇|f| to be 0 where f(x) = 0 and otherwise

(∇|f|)(x) = [R(x)∇R(x) + I(x)∇I(x)] / |f(x)|,

where f(x) = R(x) + iI(x) is the decomposition into real and imaginary parts. The above formula coincides with the chain rule calculation

∇√(R² + I²) = (R∇R + I∇I)/√(R² + I²);

however, the chain rule literally cannot be applied to the absolute value function since it is not C1. So one first has to regularize the absolute value function, apply the chain rule, and check that in the claimed inequality the regularization can be removed (see Theorem 6.17 of Lieb-Loss for more details). In principle, the same care is needed when establishing the cases of equality; see Theorem 7.8 of Lieb-Loss.
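The pointwise bound |∇|f|| ≤ |∇f| from Lemma 8.2, and its strictness where the phase varies, can be checked on a grid (illustrative NumPy code, not part of the notes; the sample f is an arbitrary smooth nonvanishing function):

```python
import numpy as np

x = np.linspace(0.0, 2 * np.pi, 100001)
dx = x[1] - x[0]
# |f| = 1 + 0.5 cos x > 0 everywhere; the phase sin(2x) varies
f = (1 + 0.5 * np.cos(x)) * np.exp(1j * np.sin(2 * x))

df = np.gradient(f, dx)
d_abs = np.gradient(np.abs(f), dx)

# pointwise |d|f|/dx| <= |df/dx|, up to discretization error
assert np.all(np.abs(d_abs) <= np.abs(df) + 1e-6)
# where the phase genuinely varies the inequality is strict
assert np.max(np.abs(df) - np.abs(d_abs)) > 0.5
```

The gap between the two sides is exactly the |∇θ|²|f|² term of the modulus-phase computation above.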
Sketch of the proof of Theorem 8.1. We start by proving i) and ii). Armed with the lemma, we see that

E(|ψ|) ≤ E(ψ)

for any ψ ∈ H1, and if |ψ| > 0, then equality holds if and only if ψ = λ|ψ|. Suppose we had two ground states ψ and φ, i.e. E0 was listed twice among the eigenvalues. We can assume that ψ and φ are orthogonal. [Exercise: the space of all ground states is a linear space, i.e. if ψ and φ are ground states, so is aφ + bψ for any a, b ∈ C.] By the remark above, we can also assume that both are non-negative: since E(|ψ|) ≤ E(ψ), we can always replace ψ with |ψ|, possibly lowering the energy, but the ground state energy cannot be lowered further, so E(|ψ|) = E(ψ).
Now it is a deeper fact of elliptic PDE theory (the Harnack inequality) that if f is a non-negative function such that for some potential W that is bounded from above and for which Wf ∈ L1_loc we have

−Δf + Wf = 0

in the distributional sense, and f is not identically zero, then f is strictly positive (see Lieb-Loss, Theorem 9.10). Using this fact, we obtain the condition |ψ| > 0 needed in the equality case, i.e. we know that ψ = λ|ψ| and φ = µ|φ|. But then these two functions cannot be orthogonal:

∫ ψ̄φ = λ̄µ ∫ |ψ||φ| ≠ 0.
This proves parts i) and ii) of Theorem 8.1.
Remark: For those who have had some PDE, the Harnack inequality is related to the mean value property of harmonic functions: if −Δf = 0, then

f(x) = (1/|B|) ∫_B f(y) dy

for any ball B with center x. If W is present, the mean value property no longer holds, but it remains correct as an inequality: there is a constant C, depending on B and W, such that

f(x) ≥ C ∫_B f(y) dy.    [Harnack inequality]
For the proof of iii) we only need to show that E = E0; then the claim follows from i) and ii). We claim that if E ≠ E0, then ψ must be orthogonal to the ground state ψ0. Indeed, from the eigenvalue equation (−Δ + V)ψ = Eψ we have

∫ ∇ψ · ∇φ + ∫ V ψφ = E ∫ ψφ

for any φ ∈ C∞_0, and by standard approximation we get it also for φ ∈ H1, in particular for φ = ψ0:

∫ ∇ψ · ∇ψ0 + ∫ V ψψ0 = E ∫ ψψ0.

Similarly, from the ground state equation,

∫ ∇ψ0 · ∇ψ + ∫ V ψ0ψ = E0 ∫ ψ0ψ.

Comparing these two equations and using that V is real and that E ≠ E0, we immediately get ∫ ψ0ψ = 0. But this is impossible, since ψ0 > 0 and ψ ≥ 0 (and ψ is not identically zero).
The proof of iv) follows a different path, namely rearrangement inequalities. We will show the proof only for V ≤ 0, although the statement is true in general.

For any nonnegative function f : Rd → R+ one can define its symmetric decreasing rearrangement f*(x) as follows. The function f* is spherically symmetric, i.e. f*(x) = f*(|x|) (this is slightly sloppy notation; strictly speaking, f* is given by a function of one variable composed with the length, F(|x|), where F : R+ → R+ is decreasing). Furthermore, we require that each level set of f* have the same measure as that of f:

|{x : f(x) > t}| = |{x : f*(x) > t}|
for all t > 0. It can be shown that there is a unique function with these properties (see Section 3.3 of Lieb-Loss). For the symmetric rearrangement it can be easily shown that

‖f‖p = ‖f*‖p for any p,

using the formula

‖f‖p^p = p ∫_0^∞ t^{p−1} |{x : f(x) > t}| dt

valid for any non-negative function. It is a bit more involved to prove that

∫ V f ≤ ∫ V* f*

for any two nonnegative functions V and f; moreover, if V = V*, then equality can occur only if f = f* [Lieb-Loss, Thm 3.4]. Even a bit more work is required to show that if f ∈ H1, then f* ∈ H1 and

∫ |∇f*|² ≤ ∫ |∇f|²

(Lemma 7.17 of Lieb-Loss). Putting these pieces of information together, we see that if ψ is the positive ground state, then

E(ψ*) ≤ E(ψ),

thus ψ* is also a ground state. But the ground state is unique (modulo constant multiples), so ψ = (const)ψ*, i.e. it is a spherically symmetric function.
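A discrete analogue of this rearrangement toolkit is easy to experiment with: sorting grid values outward from the center preserves every ℓp norm (the values are merely permuted, so the level sets keep their size) and increases ∫Vf-type pairings, mirroring ‖f‖p = ‖f*‖p and ∫Vf ≤ ∫V*f*. An illustrative NumPy sketch (not part of the notes):

```python
import numpy as np

def rearrange(f):
    # lay the sorted values out by distance from the central grid point
    order = np.argsort(np.abs(np.arange(f.size) - f.size // 2), kind="stable")
    out = np.empty_like(f)
    out[order] = np.sort(f)[::-1]        # largest value at the center
    return out

rng = np.random.default_rng(0)
f = np.abs(rng.normal(size=1001))        # arbitrary nonnegative samples
g = np.abs(rng.normal(size=1001))

fs, gs = rearrange(f), rearrange(g)
for p in (1.0, 2.0, 5.0):
    assert np.isclose((f**p).sum(), (fs**p).sum())   # ||f||_p = ||f*||_p

c = f.size // 2
assert np.all(np.diff(fs[c:]) <= 1e-12)  # symmetric-decreasing profile

# discrete analogue of int V f <= int V* f* (a rearrangement inequality)
assert (f * g).sum() <= (fs * gs).sum() + 1e-9
```

Only the gradient inequality ∫|∇f*|² ≤ ∫|∇f|² has no such one-line discrete proof; that is where the real work of Lemma 7.17 of Lieb-Loss lies.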
There is another proof of the symmetry of the ground state eigenfunction: just show that every rotation of ψ0, i.e. x ↦ ψ0(Rx) for any rotation R, is also a ground state, and use i).
Finally, statement v) follows from standard elliptic regularity theory. It is related to the fact that the solution of the Poisson equation −Δu = f has two more derivatives than f [roughly speaking]. We do not present the proof here [see Lieb-Loss, Theorem 11.7].