Logarithmic Sobolev inequalities in discrete product spaces: proof by a transportation cost distance Katalin Marton Alfréd Rényi Institute of Mathematics of the Hungarian Academy of Sciences
Dec 03, 2021

Page 1: Logarithmic Sobolev inequalities in discrete product ...

Logarithmic Sobolev inequalities in discrete product spaces: proof by a transportation cost distance

Katalin Marton

Alfréd Rényi Institute of Mathematics of the Hungarian Academy of Sciences

Page 2: Logarithmic Sobolev inequalities in discrete product ...

Relative entropy

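Definition
$\mathcal{Z}$: measurable space; $\mu$, $\nu$: probability measures on $\mathcal{Z}$.
$Z$, $U$: random variables with $\mathcal{L}(Z) = \mu$, $\mathcal{L}(U) = \nu$.

Relative entropy:

\[ D(\mu\|\nu) = D(Z\|U) = \int_{\mathcal{Z}} \log\frac{d\mu}{d\nu}\, d\mu \tag{1} \]

If $\mathcal{Z}$ is finite:

\[ D(\mu\|\nu) = D(Z\|U) = \sum_{z\in\mathcal{Z}} \mu(z)\log\frac{\mu(z)}{\nu(z)}. \tag{2} \]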
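A minimal sketch of formula (2) for finite distributions. The dictionary representation and the function name `relative_entropy` are assumptions made for illustration, and the numbers are made up.

```python
import math

def relative_entropy(mu, nu):
    """D(mu || nu) = sum_z mu(z) * log(mu(z) / nu(z)) for finite distributions.

    mu, nu: dicts mapping outcomes to probabilities (hypothetical representation).
    Returns math.inf when mu puts mass on a point where nu has none.
    """
    total = 0.0
    for z, m in mu.items():
        if m == 0.0:
            continue  # convention: 0 * log(0 / x) = 0
        n = nu.get(z, 0.0)
        if n == 0.0:
            return math.inf  # mu is not absolutely continuous w.r.t. nu
        total += m * math.log(m / n)
    return total

# Made-up example distributions on a two-point space.
mu = {"a": 0.5, "b": 0.5}
nu = {"a": 0.25, "b": 0.75}
d = relative_entropy(mu, nu)
```

The natural logarithm is used here; any other base only rescales the value.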

Page 3: Logarithmic Sobolev inequalities in discrete product ...

Entropy contraction of Markov kernels

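Definition
$(\mathcal{Z}, \nu)$: probability space; $\mathcal{P}(\mathcal{Z})$: probability measures on $\mathcal{Z}$;
$\Gamma$: Markov kernel on $\mathcal{Z}$ with invariant measure $\nu$.

Entropy contraction for $(\mathcal{Z}, \nu, \Gamma)$ with rate $1 - c$, $0 < c \le 1$:

\[ D(\mu\Gamma\|\nu) \le (1 - c)\cdot D(\mu\|\nu). \tag{3} \]

Equivalently:

\[ c\cdot D(\mu\|\nu) \le D(\mu\|\nu) - D(\mu\Gamma\|\nu) \quad \text{for all } \mu \in \mathcal{P}(\mathcal{Z}). \tag{4} \]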

Page 4: Logarithmic Sobolev inequalities in discrete product ...

Gibbs sampler governed by local specifications of qn

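Definition
$\Gamma_i$: Markov kernel $\mathcal{X}^n \to \mathcal{X}^n$,

\[ \Gamma_i(z^n|y^n) = \delta(\bar{y}_i, \bar{z}_i)\cdot q_i(z_i|\bar{y}_i), \]

where $\bar{y}_i$ is $y^n$ with the $i$-th coordinate removed and $q_i(\cdot|\bar{y}_i)$ is the conditional distribution of the $i$-th coordinate under $q^n$, given the others.

$\Gamma$: Markov kernel $\mathcal{X}^n \to \mathcal{X}^n$,

\[ \Gamma = \frac{1}{n}\sum_{i=1}^{n} \Gamma_i. \]

Definition
$q^n$ has the entropy contraction property if its Gibbs sampler $\Gamma$ has it.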
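The definition of the Gibbs sampler can be sketched for a small finite space as follows. This is an illustration under stated assumptions: measures are stored as dictionaries keyed by tuples, the names `gibbs_kernel` and `push_forward` are hypothetical, and the example measure is made up. Each single-site kernel resamples one coordinate from its conditional distribution given the others and keeps the rest fixed; the sketch then checks that the measure is invariant.

```python
import itertools

def gibbs_kernel(q, alphabet, n):
    """Return Gamma as nested dicts: gamma[y][z] = probability of moving y -> z.

    q: dict mapping each n-tuple over `alphabet` to its probability.
    Gamma = (1/n) * sum_i Gamma_i, where Gamma_i resamples coordinate i from
    the local specification q_i(. | y-bar_i) and leaves the rest unchanged.
    """
    states = list(itertools.product(alphabet, repeat=n))
    gamma = {y: {} for y in states}
    for y in states:
        for i in range(n):
            # All states that agree with y outside coordinate i.
            fibre = [y[:i] + (a,) + y[i + 1:] for a in alphabet]
            mass = sum(q[z] for z in fibre)
            if mass == 0.0:
                # q_i(. | y-bar_i) is undefined on a q-null set; stay put
                # (an arbitrary choice made for this sketch).
                gamma[y][y] = gamma[y].get(y, 0.0) + 1.0 / n
                continue
            for z in fibre:
                gamma[y][z] = gamma[y].get(z, 0.0) + q[z] / mass / n
    return gamma

def push_forward(p, gamma):
    """Compute the distribution p * Gamma."""
    out = {}
    for y, py in p.items():
        for z, g in gamma[y].items():
            out[z] = out.get(z, 0.0) + py * g
    return out

# A small non-product measure on {0,1}^2 (made-up numbers).
q = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
gamma = gibbs_kernel(q, alphabet=(0, 1), n=2)
q_next = push_forward(q, gamma)  # should reproduce q up to rounding
```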

Page 5: Logarithmic Sobolev inequalities in discrete product ...

Entropy contraction in product space

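$(\mathcal{X}^n, q^n)$: probability space

Question
Which measures $q^n$ have the entropy contraction property with a reasonable constant $c$?

$c$ cannot be larger than $O(1/n)$. Changing notation: WHEN is

\[ \frac{c}{n}\cdot D(p^n\|q^n) \le D(p^n\|q^n) - D(p^n\Gamma\|q^n) \quad \text{for all } p^n \in \mathcal{P}(\mathcal{X}^n) \tag{5} \]

true? Equivalently: WHEN is

\[ D(p^n\Gamma\|q^n) \le \Bigl(1 - \frac{c}{n}\Bigr)\cdot D(p^n\|q^n) \tag{6} \]

true?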

Page 6: Logarithmic Sobolev inequalities in discrete product ...

Conditional relative entropy

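Definition
$\mathcal{Z}$: measurable space;
$\mathcal{V}$: another measurable space; $\pi$: probability measure on $\mathcal{V}$; $V$: random variable with $\mathcal{L}(V) = \pi$.

For $v \in \mathcal{V}$, probability measures on $\mathcal{Z}$: $\mu(\cdot|v) = \mathcal{L}(Z|V = v)$, $\nu(\cdot|v) = \mathcal{L}(U|V = v)$.

Conditional relative entropy:

\[ D\bigl(\mu(\cdot|V)\,\|\,\nu(\cdot|V)\bigr) = D\bigl(Z|V\,\|\,U|V\bigr) \triangleq \int_{\mathcal{V}} D\bigl(\mu(\cdot|v)\,\|\,\nu(\cdot|v)\bigr)\, d\pi(v) \tag{7} \]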

Page 7: Logarithmic Sobolev inequalities in discrete product ...

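Cont'd
If $\mathcal{V}$ is finite:

\[ D\bigl(\mu(\cdot|V)\,\|\,\nu(\cdot|V)\bigr) = D\bigl(Z|V\,\|\,U|V\bigr) = \sum_{v\in\mathcal{V}} \pi(v)\, D\bigl(\mu(\cdot|v)\,\|\,\nu(\cdot|v)\bigr). \tag{8} \]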

Page 8: Logarithmic Sobolev inequalities in discrete product ...

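Sufficient condition for entropy contraction in product space

$(\mathcal{X}^n, q^n, \Gamma)$: probability space with Gibbs sampler, $q^n = \mathcal{L}(X^n)$;
$p^n = \mathcal{L}(Y^n)$: another distribution on $\mathcal{X}^n$.

Recall: $\Gamma = \frac{1}{n}\sum_{i=1}^{n}\Gamma_i$, $\quad \bar{Y}_i = (Y_1, \dots, Y_{i-1}, Y_{i+1}, \dots, Y_n)$.

Proposition

\[ c\cdot D(p^n\|q^n) \le \sum_{i=1}^{n} D\bigl(p_i(\cdot|\bar{Y}_i)\,\|\,q_i(\cdot|\bar{Y}_i)\bigr) \tag{$\ast$} \]

$\Longrightarrow$

\[ D(p^n\Gamma\|q^n) \le \Bigl(1 - \frac{c}{n}\Bigr) D(p^n\|q^n). \tag{9} \]

If $q^n$ is a product measure then $c = 1$.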

Page 9: Logarithmic Sobolev inequalities in discrete product ...

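Notation
For $x^n = (x_1, x_2, \dots, x_n) \in \mathcal{X}^n$:

\[ \bar{x}_i \triangleq (x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n) \]

\[ \bar{p}_i \triangleq \mathcal{L}(\bar{Y}_i), \qquad \bar{q}_i \triangleq \mathcal{L}(\bar{X}_i) \]

Expansion formula

\[ D(p^n\|q^n) = D(\bar{p}_i\|\bar{q}_i) + D\bigl(p_i(\cdot|\bar{Y}_i)\,\|\,q_i(\cdot|\bar{Y}_i)\bigr) \]

$\Longrightarrow$

\[ D(p^n\|q^n) = \frac{1}{n}\sum_{i=1}^{n} D(\bar{p}_i\|\bar{q}_i) + \frac{1}{n}\sum_{i=1}^{n} D\bigl(p_i(\cdot|\bar{Y}_i)\,\|\,q_i(\cdot|\bar{Y}_i)\bigr) \]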
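The expansion formula can be checked numerically on a small example. The sketch below assumes strictly positive measures on {0,1}^3 stored as dictionaries; the helper names (`drop`, `marginal_without`, `conditional_kl`) and the weights defining the two measures are made up for illustration.

```python
import itertools
import math

def kl(p, q):
    """Relative entropy D(p || q) for dict-valued distributions with p << q."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0.0)

def drop(x, i):
    """x-bar_i: the tuple x with coordinate i removed."""
    return x[:i] + x[i + 1:]

def marginal_without(p, i):
    """Distribution of Y-bar_i when Y^n has distribution p."""
    out = {}
    for x, px in p.items():
        out[drop(x, i)] = out.get(drop(x, i), 0.0) + px
    return out

def conditional_kl(p, q, i):
    """D(p_i(.|Y-bar_i) || q_i(.|Y-bar_i)), averaged over Y-bar_i ~ p-bar_i."""
    p_bar, q_bar = marginal_without(p, i), marginal_without(q, i)
    total = 0.0
    for x, px in p.items():
        if px > 0.0:
            # Ratio p_i(x_i | x-bar_i) / q_i(x_i | x-bar_i).
            ratio = (px / p_bar[drop(x, i)]) / (q[x] / q_bar[drop(x, i)])
            total += px * math.log(ratio)
    return total

def normalise(w):
    s = sum(w.values())
    return {x: v / s for x, v in w.items()}

# Two made-up, strictly positive measures on {0,1}^3.
n = 3
states = list(itertools.product((0, 1), repeat=n))
p = normalise({x: 1 + x[0] + 2 * x[1] * x[2] for x in states})
q = normalise({x: 2 + x[0] * x[1] + x[2] for x in states})

total = kl(p, q)
# For each i: D(p-bar_i || q-bar_i) + conditional term should equal D(p || q).
splits = [kl(marginal_without(p, i), marginal_without(q, i)) + conditional_kl(p, q, i)
          for i in range(n)]
```

Averaging the entries of `splits` over i gives the second, averaged form of the formula.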

Page 10: Logarithmic Sobolev inequalities in discrete product ...

Proof of Proposition

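\[ D(p^n\|q^n) - D(p^n\Gamma\|q^n) \]

$\ge$ (by convexity of entropy, using $D(p^n\Gamma_i\|q^n) = D(\bar{p}_i\|\bar{q}_i)$)

\[ D(p^n\|q^n) - \frac{1}{n}\sum_{i=1}^{n} D(\bar{p}_i\|\bar{q}_i) \]

\[ = \frac{1}{n}\sum_{i=1}^{n} D\bigl(p_i(\cdot|\bar{Y}_i)\,\|\,q_i(\cdot|\bar{Y}_i)\bigr) \]

$\ge$ (by assumption)

\[ \frac{c}{n}\, D(p^n\|q^n). \]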
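As a numerical illustration of the Proposition in the product-measure case (c = 1), the sketch below applies one Gibbs sampler step and compares the two sides of (9). The function names and the example measures are assumptions made for this sketch only.

```python
import itertools
import math

def kl(p, q):
    """Relative entropy D(p || q) for dict-valued distributions with p << q."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0.0)

def gibbs_step(p, q, alphabet, n):
    """Return p * Gamma, where Gamma = (1/n) * sum_i Gamma_i is the Gibbs sampler of q.

    Assumes q is strictly positive, so every local specification is defined.
    """
    out = {x: 0.0 for x in q}
    for y, py in p.items():
        for i in range(n):
            # Resample coordinate i from q_i(. | y-bar_i); keep the rest.
            fibre = [y[:i] + (a,) + y[i + 1:] for a in alphabet]
            mass = sum(q[z] for z in fibre)
            for z in fibre:
                out[z] += py * q[z] / mass / n
    return out

n = 3
alphabet = (0, 1)
states = list(itertools.product(alphabet, repeat=n))

# Product measure q with made-up marginals Pr{X_i = 1} = r[i].
r = (0.3, 0.5, 0.8)
q = {x: math.prod(r[i] if x[i] == 1 else 1.0 - r[i] for i in range(n)) for x in states}

# An arbitrary (made-up) non-product measure p.
w = {x: 1 + 3 * x[0] * x[1] + x[2] for x in states}
p = {x: v / sum(w.values()) for x, v in w.items()}

lhs = kl(gibbs_step(p, q, alphabet, n), q)   # D(p Gamma || q)
rhs = (1.0 - 1.0 / n) * kl(p, q)             # (1 - c/n) D(p || q) with c = 1
```

For a product measure the bound `lhs <= rhs` is what (9) predicts with c = 1; a single example of course only illustrates the statement.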

Page 11: Logarithmic Sobolev inequalities in discrete product ...

Entropy contraction in DISCRETE product spaces

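$\mathcal{X}$ finite

$(\mathcal{X}^n, q^n, \Gamma)$

Wanted: inequality

\[ D(p^n\|q^n) \le C\cdot\sum_{i=1}^{n} D\bigl(p_i(\cdot|\bar{Y}_i)\,\|\,q_i(\cdot|\bar{Y}_i)\bigr) \quad \text{for all } p^n \in \mathcal{P}(\mathcal{X}^n) \tag{$\ast$} \]

To get ($\ast$): use a Wasserstein-like distance.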

Page 12: Logarithmic Sobolev inequalities in discrete product ...

A reverse Pinsker’s inequality

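$\mu$, $\nu$: probability measures on $\mathcal{X}$, $\mathcal{X}$ finite

Notation: Variational distance

\[ |\mu - \nu| = \frac{1}{2}\cdot\sum_{x\in\mathcal{X}} |\mu(x) - \nu(x)| \]

Lemma
Set

\[ \mathcal{X}_+ \triangleq \{x \in \mathcal{X} : \nu(x) > 0\}, \qquad \alpha \triangleq \min\{\nu(x) : x \in \mathcal{X}_+\}. \]

Then

\[ D(\mu\|\nu) \le \frac{4}{\alpha}\cdot|\mu - \nu|^2. \]

Follows from the inequality

\[ D(\mu\|\nu) \le \sum_{x\in\mathcal{X}} \frac{|\mu(x) - \nu(x)|^2}{\nu(x)}. \]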
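A short numerical check of the Lemma and of the inequality it follows from. The chain is: the relative entropy is at most the chi-square-type sum (a consequence of log t ≤ t − 1), which is at most (1/α) Σ|µ(x) − ν(x)|², which is at most (1/α)(Σ|µ(x) − ν(x)|)² = (4/α)|µ − ν|². The function names and the two distributions are made up for illustration, and µ is chosen so that it vanishes wherever ν does.

```python
import math

def kl(mu, nu):
    """D(mu || nu) for lists of probabilities, assuming mu = 0 wherever nu = 0."""
    return sum(m * math.log(m / n) for m, n in zip(mu, nu) if m > 0.0)

def total_variation(mu, nu):
    """|mu - nu| = (1/2) * sum_x |mu(x) - nu(x)|."""
    return 0.5 * sum(abs(m - n) for m, n in zip(mu, nu))

def chi_square_sum(mu, nu):
    """sum_x |mu(x) - nu(x)|^2 / nu(x), over the support of nu."""
    return sum((m - n) ** 2 / n for m, n in zip(mu, nu) if n > 0.0)

# Made-up example; the last point has nu = 0 and therefore mu = 0 as well.
mu = [0.7, 0.2, 0.1, 0.0]
nu = [0.4, 0.3, 0.3, 0.0]

alpha = min(n for n in nu if n > 0.0)             # smallest positive nu(x)
d = kl(mu, nu)
chi2 = chi_square_sum(mu, nu)
bound = 4.0 / alpha * total_variation(mu, nu) ** 2
```

Here `d <= chi2 <= bound` is the chain stated above, evaluated on one example.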

Page 13: Logarithmic Sobolev inequalities in discrete product ...

A distance between measures on product spaces

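$\mathcal{X}^n$: product space

$\mu^n = \mathcal{L}(Z^n)$, $\nu^n = \mathcal{L}(U^n)$: probability measures on $\mathcal{X}^n$

Definition (P. Massart)
The square of the $W_2$-distance of $\mu^n$ and $\nu^n$:

\[ W_2^2(\mu^n, \nu^n) \triangleq \min \sum_{i=1}^{n} \Pr{}^2\{Z_i \ne U_i\}, \]

where the minimum is taken over all couplings of $\mu^n$ and $\nu^n$.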

Page 14: Logarithmic Sobolev inequalities in discrete product ...

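Notation
For $I \subset [1, n]$, $p^n = \mathcal{L}(Y^n)$ and $y^n \in \mathcal{X}^n$:

\[ y_I \triangleq (y_i : i \in I), \qquad \bar{y}_I \triangleq (y_i : i \notin I) \]

\[ p_I \triangleq \mathcal{L}(Y_I), \qquad p_I(\cdot|\bar{y}_I) \triangleq \mathcal{L}(Y_I \mid \bar{Y}_I = \bar{y}_I) \]

For Theorem 1 we need the inequality

\[ W_2^2(p^n, q^n) \le C\cdot E\sum_{i=1}^{n} \bigl|p_i(\cdot|\bar{Y}_i) - q_i(\cdot|\bar{Y}_i)\bigr|^2 \]

in a MORE GENERAL FORM:

We require a bound for

\[ W_2^2\bigl(p_I(\cdot|\bar{y}_I),\, q_I(\cdot|\bar{y}_I)\bigr) \]

for ALL subsets $I \subset [1, n]$ (not just $I = [1, n]$), and all $\bar{y}_I$.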

Page 15: Logarithmic Sobolev inequalities in discrete product ...

Main Theorem

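$(\mathcal{X}^n, q^n)$, $\mathcal{X}$ finite!

Theorem 1
Set

\[ \alpha = \min\bigl\{q_i(x_i|\bar{x}_i) : q^n(x^n) > 0,\ 1 \le i \le n\bigr\}. \tag{10} \]

Fix a $p^n = \mathcal{L}(Y^n) \in \mathcal{P}(\mathcal{X}^n)$; assume

\[ q^n(x^n) = 0 \implies p^n(x^n) = 0. \tag{11} \]

Main assumption:

\[ W_2^2\bigl(p_I(\cdot|\bar{y}_I),\, q_I(\cdot|\bar{y}_I)\bigr) \le C\cdot E\Bigl\{\sum_{i\in I} \bigl|p_i(\cdot|\bar{Y}_i) - q_i(\cdot|\bar{Y}_i)\bigr|^2 \Bigm| \bar{Y}_I = \bar{y}_I\Bigr\}, \tag{12} \]

for all $I \subset [1, n]$, $\bar{y}_I \in \mathcal{X}^{[1,n]\setminus I}$.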

Page 16: Logarithmic Sobolev inequalities in discrete product ...

Main Theorem continued

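Assume all the inequalities

\[ W_2^2\bigl(p_I(\cdot|\bar{y}_I),\, q_I(\cdot|\bar{y}_I)\bigr) \le C\cdot E\Bigl\{\sum_{i\in I} \bigl|p_i(\cdot|\bar{Y}_i) - q_i(\cdot|\bar{Y}_i)\bigr|^2 \Bigm| \bar{Y}_I = \bar{y}_I\Bigr\}, \tag{13} \]

where $I \subset [1, n]$ and $\bar{y}_I \in \mathcal{X}^{[1,n]\setminus I}$ is fixed. Then

\[ D(p^n\|q^n) \le \frac{4C}{\alpha}\cdot\sum_{i=1}^{n} E\bigl|p_i(\cdot|\bar{Y}_i) - q_i(\cdot|\bar{Y}_i)\bigr|^2 \le \frac{2C}{\alpha}\cdot\sum_{i=1}^{n} D\bigl(p_i(\cdot|\bar{Y}_i)\,\|\,q_i(\cdot|\bar{Y}_i)\bigr). \tag{14} \]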

Page 17: Logarithmic Sobolev inequalities in discrete product ...

An analogous result for densities in Rn

Theorem
$f(x^n) = \exp(-V(x^n))$: density on $\mathbb{R}^n$;
$q^n$: probability measure with density $f$.

Assume: the conditional densities $f(x_i|\bar{x}_i)$ satisfy a logarithmic Sobolev inequality with constant $\rho$, for all $i$, $\bar{x}_i$. Under some conditions on

\[ \frac{1}{\rho}\cdot \operatorname{Hess} V(x^n) \]

(expressing that $V$ is not too far from being uniformly convex):

\[ D(p^n\|q^n) \le C\cdot\sum_{i=1}^{n} D\bigl(p_i(\cdot|\bar{Y}_i)\,\|\,q_i(\cdot|\bar{Y}_i)\bigr) \quad \text{for all } p^n \]

($C = C(q^n)$)

Page 18: Logarithmic Sobolev inequalities in discrete product ...

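Proof of Theorem 1
By induction on $n$. Assume Theorem 1 holds for $n - 1$.

Notation

\[ \bar{p}_i(\cdot|y_i) \triangleq \mathcal{L}(\bar{Y}_i \mid Y_i = y_i) \]

For every $i \in [1, n]$:

\[ D(p^n\|q^n) = D(Y^n\|X^n) = D(Y_i\|X_i) + D\bigl(\bar{p}_i(\cdot|Y_i)\,\|\,\bar{q}_i(\cdot|Y_i)\bigr) \]

$\Longrightarrow$

\[ D(p^n\|q^n) = \frac{1}{n}\sum_{i=1}^{n} D(Y_i\|X_i) + \frac{1}{n}\sum_{i=1}^{n} D\bigl(\bar{p}_i(\cdot|Y_i)\,\|\,\bar{q}_i(\cdot|Y_i)\bigr) \tag{15} \]

By the induction hypothesis the second term is

\[ \le \Bigl(1 - \frac{1}{n}\Bigr)\cdot\frac{4C}{\alpha}\sum_{i=1}^{n} E\bigl|p_i(\cdot|\bar{Y}_i) - q_i(\cdot|\bar{Y}_i)\bigr|^2. \]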

Page 19: Logarithmic Sobolev inequalities in discrete product ...

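Proof of Theorem 1 Cont'd
Second term:

\[ \le \Bigl(1 - \frac{1}{n}\Bigr)\cdot\frac{4C}{\alpha}\sum_{i=1}^{n} E\bigl|p_i(\cdot|\bar{Y}_i) - q_i(\cdot|\bar{Y}_i)\bigr|^2 \]

First term:

\[ \frac{1}{n}\sum_{i=1}^{n} D(Y_i\|X_i) \le \text{(by the Lemma)}\ \frac{1}{n}\cdot\frac{4}{\alpha}\cdot\sum_{i=1}^{n} \bigl|\mathcal{L}(Y_i) - \mathcal{L}(X_i)\bigr|^2 \]

\[ \le \frac{1}{n}\cdot\frac{4}{\alpha}\cdot\sum_{i=1}^{n} \Pr{}^2\{Y_i \ne X_i\} \quad \text{in any coupling of } p^n, q^n \]

\[ = \frac{1}{n}\cdot\frac{4}{\alpha}\cdot W_2^2(p^n, q^n) \quad \text{for the best coupling} \]

$\le$ (by the assumption of Theorem 1)

\[ \frac{1}{n}\cdot\frac{4C}{\alpha}\cdot\sum_{i=1}^{n} E\bigl|p_i(\cdot|\bar{Y}_i) - q_i(\cdot|\bar{Y}_i)\bigr|^2 \tag{16} \]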

Page 20: Logarithmic Sobolev inequalities in discrete product ...

Entropy contraction

$(\mathcal{X}^n, q^n, \Gamma)$, $\mathcal{X}$ finite
$\Gamma$: Gibbs sampler

Corollary 1
If $q^n$ satisfies the conditions of Theorem 1 then

\[ D(p^n\Gamma\|q^n) \le \Bigl(1 - \frac{\alpha}{2nC}\Bigr)\cdot D(p^n\|q^n). \tag{17} \]
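One way to read (17), sketched from the statements on the earlier slides: the outer inequality in (14) is exactly condition (∗) of the Proposition with the constant c = α/(2C), so the Proposition gives the contraction rate. The choice of c is the only step added here.

```latex
\begin{align*}
D(p^n\|q^n)
  &\le \frac{2C}{\alpha}\sum_{i=1}^{n}
       D\bigl(p_i(\cdot|\bar{Y}_i)\,\|\,q_i(\cdot|\bar{Y}_i)\bigr)
  && \text{(Theorem 1, (14))} \\
\iff\quad \frac{\alpha}{2C}\, D(p^n\|q^n)
  &\le \sum_{i=1}^{n}
       D\bigl(p_i(\cdot|\bar{Y}_i)\,\|\,q_i(\cdot|\bar{Y}_i)\bigr)
  && \text{($\ast$ with } c = \tfrac{\alpha}{2C}\text{)} \\
\implies\quad D(p^n\Gamma\|q^n)
  &\le \Bigl(1 - \frac{c}{n}\Bigr) D(p^n\|q^n)
   = \Bigl(1 - \frac{\alpha}{2nC}\Bigr) D(p^n\|q^n)
  && \text{(Proposition, (9))}
\end{align*}
```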

Page 21: Logarithmic Sobolev inequalities in discrete product ...

Logarithmic Sobolev inequality

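Notation
$\mathcal{E}$: the Dirichlet form associated with $\Gamma$ is the quadratic form

\[ \mathcal{E}(f, g) = \bigl\langle (\mathrm{Id} - \Gamma) f,\, g \bigr\rangle_{q^n} \]

Definition
$q^n$ satisfies a logarithmic Sobolev inequality with constant $c > 0$ if:

\[ c\cdot D(p^n\|q^n) \le \mathcal{E}\Bigl(\sqrt{\frac{p^n}{q^n}},\, \sqrt{\frac{p^n}{q^n}}\Bigr) \quad \text{for all } p^n \in \mathcal{P}(\mathcal{X}^n) \]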

Page 22: Logarithmic Sobolev inequalities in discrete product ...

Logarithmic Sobolev inequality for Gibbs sampler

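$(\mathcal{X}^n, q^n, \Gamma)$, $\mathcal{X}$ finite

Corollary 2
Under the conditions of Theorem 1 the logarithmic Sobolev inequality holds true:

\[ \frac{1}{n}\cdot D(p^n\|q^n) \le \frac{4C}{\alpha}\cdot \mathcal{E}_\Gamma\Bigl(\sqrt{\frac{p^n}{q^n}},\, \sqrt{\frac{p^n}{q^n}}\Bigr) = \frac{4C}{\alpha n}\cdot\sum_{i=1}^{n} E\Bigl(1 - \Bigl(\sum_{y_i\in\mathcal{X}} \sqrt{p_i(y_i|\bar{Y}_i)\cdot q_i(y_i|\bar{Y}_i)}\Bigr)^2\Bigr). \tag{18} \]

$\Longrightarrow$ hypercontractivity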
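The equality in (18) can be unpacked as follows. This is a sketch, writing f for the square root of the density ratio and treating each single-site kernel as the conditional expectation given the other coordinates; the intermediate steps are not on the slides.

```latex
\begin{align*}
f &= \sqrt{p^n/q^n}, \qquad \langle f, f\rangle_{q^n} = \sum_{x^n} p^n(x^n) = 1, \\
(\Gamma_i f)(x^n)
  &= \sum_{z_i\in\mathcal{X}} q_i(z_i|\bar{x}_i)\,
     \sqrt{\frac{p^n(\bar{x}_i, z_i)}{q^n(\bar{x}_i, z_i)}}
   = \sqrt{\frac{\bar{p}_i(\bar{x}_i)}{\bar{q}_i(\bar{x}_i)}}
     \sum_{z_i\in\mathcal{X}} \sqrt{p_i(z_i|\bar{x}_i)\, q_i(z_i|\bar{x}_i)}, \\
\langle \Gamma_i f, f\rangle_{q^n}
  &= \sum_{\bar{x}_i} \bar{q}_i(\bar{x}_i)\,\bigl((\Gamma_i f)(\bar{x}_i)\bigr)^2
   = \sum_{\bar{x}_i} \bar{p}_i(\bar{x}_i)
     \Bigl(\sum_{z_i\in\mathcal{X}} \sqrt{p_i(z_i|\bar{x}_i)\, q_i(z_i|\bar{x}_i)}\Bigr)^2, \\
\mathcal{E}_\Gamma(f, f)
  &= 1 - \frac{1}{n}\sum_{i=1}^{n} \langle \Gamma_i f, f\rangle_{q^n}
   = \frac{1}{n}\sum_{i=1}^{n}
     E\Bigl(1 - \Bigl(\sum_{y_i\in\mathcal{X}}
       \sqrt{p_i(y_i|\bar{Y}_i)\, q_i(y_i|\bar{Y}_i)}\Bigr)^2\Bigr).
\end{align*}
```

The expectation in the last line is over the coordinates of $Y^n$ other than the $i$-th, distributed according to $\bar{p}_i$; the sums are restricted to points where $q^n > 0$, which is harmless under assumption (11).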

Page 24: Logarithmic Sobolev inequalities in discrete product ...

Application: Gibbs measures with Dobrushin's uniqueness condition

$(\mathcal{X}^n, q^n)$, $\mathcal{X}$ finite

Definition
$q^n$ satisfies (an $L_2$-version of) Dobrushin's uniqueness condition with coupling matrix

\[ A = (a_{k,i})_{k,i=1}^{n}, \qquad a_{i,i} = 0, \]

if:

\[ \text{(i)} \quad \max \bigl|q_i(\cdot|\bar{z}_i) - q_i(\cdot|\bar{s}_i)\bigr| \le a_{k,i}, \quad k \ne i, \tag{19} \]

where the max is over all $\bar{z}_i$, $\bar{s}_i$ differing only in the $k$-th coordinate,

and (ii) $\|A\|_2 < 1$.
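To make conditions (i) and (ii) concrete, the sketch below computes the smallest admissible coupling matrix and its spectral norm for a tiny example. Everything specific here is an assumption for illustration: the three-site Ising-type chain with coupling strength `beta = 0.2`, the function names, and the use of power iteration as a simple stand-in for a proper singular value routine.

```python
import itertools
import math

def local_spec(q, alphabet, x, i):
    """q_i(. | x-bar_i) as a list over `alphabet`; x supplies the other coordinates."""
    fibre = [x[:i] + (a,) + x[i + 1:] for a in alphabet]
    mass = sum(q[z] for z in fibre)  # assumes q is strictly positive
    return [q[z] / mass for z in fibre]

def tv(u, v):
    """Variational distance (1/2) * sum |u - v|."""
    return 0.5 * sum(abs(a - b) for a, b in zip(u, v))

def coupling_matrix(q, alphabet, n):
    """Smallest a[k][i] satisfying (i): the max of |q_i(.|z-bar_i) - q_i(.|s-bar_i)|
    over pairs that differ only in coordinate k (k != i)."""
    a = [[0.0] * n for _ in range(n)]
    for k, i in itertools.permutations(range(n), 2):
        for z in itertools.product(alphabet, repeat=n):
            for b in alphabet:
                s = z[:k] + (b,) + z[k + 1:]
                d = tv(local_spec(q, alphabet, z, i), local_spec(q, alphabet, s, i))
                a[k][i] = max(a[k][i], d)
    return a

def spectral_norm(a, iters=500):
    """||A||_2 via power iteration on A^T A."""
    n = len(a)
    v = [1.0] * n
    for _ in range(iters):
        w = [sum(a[r][c] * v[c] for c in range(n)) for r in range(n)]  # A v
        u = [sum(a[r][c] * w[r] for r in range(n)) for c in range(n)]  # A^T (A v)
        norm = math.sqrt(sum(t * t for t in u))
        if norm == 0.0:
            return 0.0
        v = [t / norm for t in u]
    w = [sum(a[r][c] * v[c] for c in range(n)) for r in range(n)]
    return math.sqrt(sum(t * t for t in w))

# Hypothetical example: Ising-type chain on 3 sites with spins in {-1, +1}.
beta, n, alphabet = 0.2, 3, (-1, 1)
weights = {x: math.exp(beta * (x[0] * x[1] + x[1] * x[2]))
           for x in itertools.product(alphabet, repeat=n)}
z_total = sum(weights.values())
q = {x: w / z_total for x, w in weights.items()}

A = coupling_matrix(q, alphabet, n)
norm_A = spectral_norm(A)  # condition (ii) asks for norm_A < 1
```

For this chain the end sites depend only on the middle one, so the entries of `A` can also be written in closed form with hyperbolic tangents, which gives a quick sanity check on the computed norm.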

Page 25: Logarithmic Sobolev inequalities in discrete product ...

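$(\mathcal{X}^n, q^n)$, $\mathcal{X}$ finite

Theorem 2
Assume Dobrushin's uniqueness condition with coupling matrix $A$, $\|A\|_2 < 1$.

Then the conditions of Theorem 1 hold with

\[ C = \frac{1}{(1 - \|A\|_2)^2}. \]

Thus

\[ D(p^n\|q^n) \le \frac{4}{\alpha}\cdot\frac{1}{(1 - \|A\|_2)^2}\cdot\sum_{i=1}^{n} E\bigl|p_i(\cdot|\bar{Y}_i) - q_i(\cdot|\bar{Y}_i)\bigr|^2 \le \frac{2}{\alpha}\cdot\frac{1}{(1 - \|A\|_2)^2}\cdot\sum_{i=1}^{n} D\bigl(p_i(\cdot|\bar{Y}_i)\,\|\,q_i(\cdot|\bar{Y}_i)\bigr). \tag{20} \]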

Page 26: Logarithmic Sobolev inequalities in discrete product ...

Cont’d

Proof
Dobrushin's uniqueness condition implies that $\Gamma$ contracts the $W_2$-distance with rate

\[ 1 - \frac{1}{n}\cdot\bigl(1 - \|A\|_2\bigr). \]

Page 27: Logarithmic Sobolev inequalities in discrete product ...

Application: Gibbs measures on Zd

Notation

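$\mathbb{Z}^d$: $d$-dimensional lattice; $i \in \mathbb{Z}^d$: site

$\rho(k, i) = \max_{1\le\nu\le d} |k_\nu - i_\nu|$: distance on $\mathbb{Z}^d$

$\Lambda \subset\subset \mathbb{Z}^d$: finite set of sites

$\mathcal{X}$ finite: spin space

$x^{\mathbb{Z}^d} = (x_i : i \in \mathbb{Z}^d) \in \mathcal{X}^{\mathbb{Z}^d}$: spin configuration

$\mathcal{X}^{\mathbb{Z}^d}$: configuration space

For $x^{\mathbb{Z}^d}$ and $\Lambda \subset \mathbb{Z}^d$:

\[ x_\Lambda = (x_i : i \in \Lambda), \qquad \bar{x}_\Lambda = (x_i : i \notin \Lambda); \]

$\bar{x}_\Lambda$ is called an outside configuration for $\Lambda$.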

Page 28: Logarithmic Sobolev inequalities in discrete product ...

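Local specifications

Definition

$q_\Lambda(\cdot|\bar{x}_\Lambda)$, $\Lambda \subset\subset \mathbb{Z}^d$: conditional distributions on $\mathcal{X}^\Lambda$. Assume compatibility conditions.

There exists at least one probability measure $q$ on the space of configurations,

\[ q = \mathcal{L}(X) \in \mathcal{P}(\mathcal{X}^{\mathbb{Z}^d}), \]

satisfying

\[ \mathcal{L}(X_\Lambda \mid \bar{X}_\Lambda = \bar{x}_\Lambda) = q_\Lambda(\cdot|\bar{x}_\Lambda) \quad \text{for all } \Lambda \subset\subset \mathbb{Z}^d. \]

$q_\Lambda(\cdot|\bar{x}_\Lambda)$: local specifications of $q$.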

Page 29: Logarithmic Sobolev inequalities in discrete product ...

Finite range interactions

Definition
The local specifications have finite range of interactions if there is an $R > 0$ such that

$q_\Lambda(\cdot|\bar{x}_\Lambda)$ only depends on coordinates $k \notin \Lambda$ with $\rho(k, \Lambda) \le R$.

$q$ may not be uniquely defined by the local specifications, even for finite range interactions.

Page 30: Logarithmic Sobolev inequalities in discrete product ...

Dobrushin-Shlosman’s strong mixing condition

Given local specifications qΛ(·|xΛ).

Assumption

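There exists a function $\varphi(\rho)$ of the distance such that:

(i)

\[ \sum_{i\in\mathbb{Z}^d} \varphi\bigl(\rho(k, i)\bigr) < \infty, \]

and (ii) for every

\[ \Lambda \subset\subset \mathbb{Z}^d, \quad M \subset \Lambda, \quad k \notin \Lambda \]

and every $\bar{y}_\Lambda$, $\bar{z}_\Lambda$ differing only at $k$:

\[ \bigl|q_M(\cdot|\bar{y}_\Lambda) - q_M(\cdot|\bar{z}_\Lambda)\bigr| \le \varphi\bigl(\rho(k, M)\bigr). \]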

Page 31: Logarithmic Sobolev inequalities in discrete product ...

In case of finite range interactions:

If Dobrushin-Shlosman's strong mixing condition holds then one can take

\[ \varphi(\rho) = C\cdot\exp(-\gamma\cdot\rho). \]

Page 32: Logarithmic Sobolev inequalities in discrete product ...

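[Figure: $\rho(k, M)$ = max length of red segments]

\[ \bigl|q_M(\cdot|\bar{y}_\Lambda) - q_M(\cdot|\bar{z}_\Lambda)\bigr| \le \varphi\bigl(\rho(k, M)\bigr) \]

$\bar{y}_\Lambda$, $\bar{z}_\Lambda$ differ only at $k \notin \Lambda$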

Page 33: Logarithmic Sobolev inequalities in discrete product ...

Dobrushin-Shlosman’s strong mixing condition

Cont'd
Meaning: The influence of the spin at $k \notin \Lambda$ on the spins in $M \subset \Lambda$ that are far away from $k$ is small.

Essential:

The spins outside $\Lambda$ (the outside configuration) are fixed in two different ways.

The spins over $\Lambda \setminus M$ are not fixed.

Page 34: Logarithmic Sobolev inequalities in discrete product ...

Logarithmic Sobolev inequality for strongly mixing measures

Earlier results for the case of finite range interactions:
D. Stroock, B. Zegarlinski 1992; F. Cesi 2001; F. Martinelli, E. Olivieri

Theorem 3

$(\mathcal{X}^\Lambda, q_\Lambda(\cdot|\bar{y}_\Lambda))$ for fixed $\Lambda$ and $\bar{y}_\Lambda$

\[ \alpha = \min\bigl\{q_i(x_i|\bar{x}_i) : q_i(x_i|\bar{x}_i) > 0\bigr\}. \]

(Finite range is not assumed.)

\{Dobrushin-Shlosman's strong mixing condition + $\{\alpha > 0\}$\}
$\Longrightarrow$ conditions of Theorem 1 for $q_\Lambda(\cdot|\bar{y}_\Lambda)$, with uniform constant
$\Longrightarrow$ logarithmic Sobolev inequality for $q_\Lambda(\cdot|\bar{y}_\Lambda)$ with uniform constant

Page 35: Logarithmic Sobolev inequalities in discrete product ...

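Logarithmic Sobolev inequality for strongly mixing measures

Cont'd
The proof uses a Gibbs sampler updating cubes of size $m$, with $m$ depending on the dimension and on the function $\varphi(\rho)$.

We get

\[ W_2^2\bigl(p_\Lambda,\, q_\Lambda(\cdot|\bar{y}_\Lambda)\bigr) \le C_m\cdot\sum_{I:\ m\text{-sided cube}} E\bigl|p_{I\cap\Lambda}(\cdot|\bar{Y}_{I\cap\Lambda}) - q_{I\cap\Lambda}(\cdot|\bar{Y}_{I\cap\Lambda})\bigr|^2 \]

\[ \le C_{m,\alpha}\cdot\sum_{i\in\Lambda} E\bigl|p_i(\cdot|Y_{\Lambda\setminus i}) - q_i(\cdot|Y_{\Lambda\setminus i}, \bar{y}_\Lambda)\bigr|^2 \]

for an appropriate $m$ that is good enough for any $\Lambda$ and $\bar{y}_\Lambda$.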