Information and interactive computation
Mark Braverman, Computer Science, Princeton University
January 16, 2012
Transcript
Page 1:

Information and interactive computation

January 16, 2012

Mark Braverman
Computer Science, Princeton University

Page 2:

Prelude: one-way communication

• Basic goal: send a message from Alice to Bob over a channel.

(Diagram: Alice → communication channel → Bob)

Page 3:

One-way communication

1) Encode;
2) Send;
3) Decode.

(Diagram: Alice → communication channel → Bob)

Page 4:

Coding for one-way communication

• There are two main problems a good encoding needs to address:
  – Efficiency: use the least amount of the channel/storage necessary.
  – Error correction: recover from (reasonable) errors.

Page 5:

Interactive computation

Today’s theme: extending information and coding theory to interactive computation.

I will talk about interactive information theory, and Anup Rao will talk about interactive error correction.

Page 6:

Efficient encoding

• Can measure the cost of storing a random variable X very precisely.
• Entropy: H(X) = ∑_x Pr[X=x]·log(1/Pr[X=x]).
• H(X) measures the average amount of information a sample from X reveals.
• A uniformly random string of 1,000 bits has 1,000 bits of entropy.
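A minimal Python sketch of the entropy formula (the distributions below are made-up examples, not from the talk):

```python
import math

def entropy(dist):
    """Shannon entropy H(X) = sum_x Pr[X=x] * log2(1/Pr[X=x]), in bits."""
    return sum(p * math.log2(1.0 / p) for p in dist.values() if p > 0)

# A fair coin carries 1 bit; a biased coin carries less.
print(entropy({"heads": 0.5, "tails": 0.5}))   # 1.0
print(entropy({"heads": 0.9, "tails": 0.1}))   # ~0.469

# 1,000 independent uniform bits: entropies add up to 1,000 bits.
print(1000 * entropy({0: 0.5, 1: 0.5}))        # 1000.0
```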

Page 7:

Efficient encoding

• H(X) = ∑_x Pr[X=x]·log(1/Pr[X=x]).
• The ZIP algorithm works because H(X = typical 1MB file) < 8 Mbits.
• Pr[“Hello, my name is Bob”] >> Pr[“h)2cjCv9]dsnC1=Ns{da3”].
• For one-way encoding, Shannon’s source coding theorem states that

  Communication ≈ Information.
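An illustration of the same point (not from the slides): zlib compresses predictable, low-entropy text far below its raw length, while near-maximum-entropy random bytes barely compress at all.

```python
import os
import zlib

text = b"Hello, my name is Bob. " * 1000    # highly predictable, low entropy
rand = os.urandom(len(text))                # close to maximum entropy

print(len(text), len(zlib.compress(text)))  # raw size vs. a small fraction of it
print(len(rand), len(zlib.compress(rand)))  # raw size vs. roughly the same (or slightly more)
```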

Page 8:

Efficient encoding

• The problem of sending many samples of X can be implemented in H(X) communication per sample on average.
• The problem of sending a single sample of X can be implemented in < H(X)+1 communication in expectation.
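Not from the slides, but one standard way to meet the single-sample bound is Huffman coding, whose expected codeword length L always satisfies H(X) ≤ L < H(X)+1. A minimal sketch with a made-up distribution:

```python
import heapq
import math

def huffman_code(dist):
    """Build a Huffman code for {symbol: probability}; expected length < H(X) + 1."""
    heap = [(p, i, [sym]) for i, (sym, p) in enumerate(dist.items())]
    heapq.heapify(heap)
    code = {sym: "" for sym in dist}
    counter = len(heap)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)   # two least likely groups
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1:
            code[s] = "0" + code[s]          # extend their codewords by one bit
        for s in syms2:
            code[s] = "1" + code[s]
        heapq.heappush(heap, (p1 + p2, counter, syms1 + syms2))
        counter += 1
    return code

dist = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(dist)
H = sum(p * math.log2(1 / p) for p in dist.values())
L = sum(p * len(code[s]) for s, p in dist.items())
print(code, H, L)   # for this dyadic distribution H = L = 1.75 bits
```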

Page 9:

Communication complexity [Yao]

• Focus on the two-party setting.
• Alice (A) holds X, Bob (B) holds Y; A & B implement a functionality F(X,Y), e.g. F(X,Y) = “X=Y?”.

(Diagram: Alice with input X and Bob with input Y jointly compute F(X,Y).)

Page 10:

Communication complexity

Goal: implement a functionality F(X,Y).

A protocol π(X,Y) computing F(X,Y) exchanges messages m1(X), m2(Y,m1), m3(X,m1,m2), …

Communication cost = # of bits exchanged.

(Diagram: Alice with input X and Bob with input Y exchange messages and output F(X,Y).)
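A toy sketch (my own framing, not the talk's notation) of a protocol as alternating message functions m1(X), m2(Y,m1), …, with the communication cost counted in bits:

```python
def run_protocol(x, y, alice_rounds, bob_rounds):
    """Alternate messages m1(X), m2(Y, m1), m3(X, m1, m2), ... and count bits.
    Each round function returns a bit string ('0'/'1' characters)."""
    transcript, bits = [], 0
    for a_round, b_round in zip(alice_rounds, bob_rounds):
        m = a_round(x, transcript)       # Alice speaks: depends on X and past messages
        transcript.append(m); bits += len(m)
        m = b_round(y, transcript)       # Bob speaks: depends on Y and past messages
        transcript.append(m); bits += len(m)
    return transcript, bits

# Toy functionality: Alice sends her single input bit, Bob replies with the XOR.
alice = [lambda x, t: str(x)]
bob   = [lambda y, t: str(int(t[-1]) ^ y)]
print(run_protocol(1, 0, alice, bob))    # (['1', '1'], 2): two bits exchanged
```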

Page 11:

Distributional communication complexity

• The input pair (X,Y) is drawn according to some distribution μ.
• Goal: make a mistake on at most an ε fraction of inputs.
• The communication cost C(F,μ,ε):

  C(F,μ,ε) := min_{π computes F with error ≤ ε} C(π, μ).

Page 12:

Example

• μ is a distribution on pairs of files; F is “X=Y?”.
• Protocol: Alice sends MD5(X) (128 bits); Bob replies with “X=Y?” (1 bit).
• Communication cost = 129 bits. ε ≈ 2^−128.

(Diagram: Alice with file X, Bob with file Y.)
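A hedged sketch of this hash-and-compare protocol (hashlib.md5 is the standard-library call; the file contents are made up for the example):

```python
import hashlib

def alice_message(x: bytes) -> bytes:
    return hashlib.md5(x).digest()                     # Alice's 128-bit message

def bob_answer(y: bytes, alice_hash: bytes) -> bool:
    return hashlib.md5(y).digest() == alice_hash       # Bob's 1-bit reply: "X = Y?"

x = b"the quick brown fox"
y = b"the quick brown fox"
msg = alice_message(x)
print(len(msg) * 8, bob_answer(y, msg))                # 128 bits sent, answer True
# Total cost: 128 + 1 = 129 bits; errors occur only on MD5 collisions.
```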

Page 13:

Randomized communication complexity

• Goal: make a mistake of at most ε on every input.
• The communication cost: R(F,ε).
• Clearly: C(F,μ,ε) ≤ R(F,ε) for all μ.
• What about the converse?
• A minimax(!) argument [Yao]:

  R(F,ε) = max_μ C(F,μ,ε).

Page 14:

A note about the model

• We assume a shared public source of randomness.

(Diagram: Alice with input X and Bob with input Y share public randomness R.)

Page 15:

The communication complexity of EQ(X,Y)

• The communication complexity of equality: R(EQ,ε) ≈ log 1/ε.
• Send log 1/ε random hash functions applied to the inputs. Accept if all of them agree.
• What if ε=0? R(EQ,0) ≈ n, where X,Y ∈ {0,1}^n.
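A minimal sketch of the hashing idea (my own toy implementation, using shared random inner products over GF(2) as the hash functions; any pairwise-independent family would do):

```python
import math
import random

def eq_protocol(x_bits, y_bits, eps, rng):
    """Equality test with error ~eps: about log2(1/eps) shared random
    inner-product hashes, each costing one bit from each party."""
    n = len(x_bits)
    num_hashes = max(1, math.ceil(math.log2(1 / eps)))
    for _ in range(num_hashes):
        r = [rng.randint(0, 1) for _ in range(n)]              # public randomness
        ax = sum(ri & xi for ri, xi in zip(r, x_bits)) % 2     # Alice's hash bit
        ay = sum(ri & yi for ri, yi in zip(r, y_bits)) % 2     # Bob's hash bit
        if ax != ay:
            return False         # certainly X != Y
    return True                  # X = Y, or an unlucky collision (prob <= eps if X != Y)

rng = random.Random(0)
x = [1, 0, 1, 1]
print(eq_protocol(x, x, 0.01, rng))             # True
print(eq_protocol(x, [1, 0, 1, 0], 0.01, rng))  # False, with probability >= 1 - eps
```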

Page 16:

Information in a two-way channel

• H(X) is the “inherent information cost” of sending a message distributed according to X over the channel.
• What is the two-way analogue of H(X)?

(Diagram: Alice sends X to Bob over a communication channel.)

Page 17:

Entropy of interactive computation

• “Inherent information cost” of interactive two-party tasks.

(Diagram: Alice with input X and Bob with input Y share public randomness R.)

Page 18:

One more definition: Mutual Information

• The mutual information of two random variables is the amount of information knowing one reveals about the other:

  I(A;B) = H(A) + H(B) − H(AB).

• If A,B are independent, I(A;B)=0.
• I(A;A)=H(A).

(Diagram: Venn diagram of H(A) and H(B); I(A;B) is the overlap.)
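A small Python sketch computing I(A;B) = H(A)+H(B)−H(AB) from a joint distribution (the joint tables are made-up examples):

```python
import math
from collections import defaultdict

def H(dist):
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

def mutual_information(joint):
    """joint: {(a, b): probability}.  Returns I(A;B) = H(A) + H(B) - H(AB) in bits."""
    pa, pb = defaultdict(float), defaultdict(float)
    for (a, b), p in joint.items():
        pa[a] += p
        pb[b] += p
    return H(pa) + H(pb) - H(joint)

# A and B are equal fair bits: knowing one reveals the other, so I(A;B) = H(A) = 1.
print(mutual_information({(0, 0): 0.5, (1, 1): 0.5}))                        # 1.0
# Independent fair bits: I(A;B) = 0.
print(mutual_information({(a, b): 0.25 for a in (0, 1) for b in (0, 1)}))    # 0.0
```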

Page 19:

Information cost of a protocol

• [Chakrabarti-Shi-Wirth-Yao ’01, Bar-Yossef-Jayram-Kumar-Sivakumar ’04, Barak-B-Chen-Rao ’10].
• Caution: different papers use “information cost” to denote different things!
• Today, we have a better understanding of the relationship between those different things.

Page 20:

Information cost of a protocol

• Prior distribution: (X,Y) ~ μ.
• Alice and Bob run protocol π; let π also denote the protocol transcript.

  I(π, μ) = I(π;Y|X) + I(π;X|Y)
          = what Alice learns about Y + what Bob learns about X.

(Diagram: Alice with input X and Bob with input Y produce the transcript π.)
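A self-contained sketch (my own toy representation, not the talk's) computing the internal information cost I(π;Y|X) + I(π;X|Y) directly from the joint distribution of (X, Y, transcript), using I(A;B|C) = H(A,C)+H(B,C)−H(A,B,C)−H(C):

```python
import math
from collections import defaultdict

def H(dist):
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

def marginal(joint, idx):
    m = defaultdict(float)
    for key, p in joint.items():
        m[tuple(key[i] for i in idx)] += p
    return m

def cond_mi(joint, a, b, c):
    """I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C); a, b, c index the joint's keys."""
    return (H(marginal(joint, a + c)) + H(marginal(joint, b + c))
            - H(marginal(joint, a + b + c)) - H(marginal(joint, c)))

# Keys are (x, y, transcript).  Toy protocol: Alice simply announces x, so transcript = x.
joint = {(x, y, x): 0.25 for x in (0, 1) for y in (0, 1)}
internal = cond_mi(joint, [2], [1], [0]) + cond_mi(joint, [2], [0], [1])
print(internal)   # I(pi;Y|X) + I(pi;X|Y) = 0 + 1 = 1 bit
```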

Page 21:

External information cost

• (X,Y) ~ μ.
• An external observer Charlie sees the transcript π of the protocol.

  Iext(π, μ) = I(π;XY)
             = what Charlie learns about (X,Y).

(Diagram: Alice with input X, Bob with input Y, and an observer Charlie watching the transcript π.)

Page 22:

Another view on I and Iext

• It is always the case that C(π, μ) ≥ Iext(π, μ) ≥ I(π, μ).

• Iext measures the ability of Alice and Bob to compute F(X,Y) in an information theoretically secure way if they are afraid of an eavesdropper.

• I measures the ability of the parties to compute F(X,Y) if they are afraid of each other.

Page 23:

Example

• F is “X=Y?”.
• μ is a distribution where w.p. ½ X=Y and w.p. ½ (X,Y) are random.
• Protocol: Alice sends MD5(X) [128 bits]; Bob replies with “X=Y?” [1 bit].

  Iext(π, μ) = I(π;XY) = 129 bits — what Charlie learns about (X,Y).

Page 24:

Example

• F is “X=Y?”.
• μ is a distribution where w.p. ½ X=Y and w.p. ½ (X,Y) are random.
• Same protocol: Alice sends MD5(X) [128 bits]; Bob replies with “X=Y?” [1 bit].

  I(π, μ) = I(π;Y|X) + I(π;X|Y)
          = what Alice learns about Y + what Bob learns about X
          ≈ 1 + 64.5 = 65.5 bits.
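Roughly where these numbers come from (my own back-of-the-envelope reading of the slide): Alice learns only the one-bit answer “X=Y?”, about 1 bit under this prior. Bob already knows X when X=Y (probability ½), so in that case the 128-bit hash teaches him essentially nothing beyond the equality event; when (X,Y) are independent (probability ½), the hash reveals about 128 bits about X. Averaging gives roughly ½·128, plus a little for the equality event itself, ≈ 64.5 bits for Bob, hence ≈ 65.5 bits in total — far below the 129-bit external cost.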

Page 25:

The (distributional) information cost of a problem F

• Recall: C(F,μ,ε) := min_{π computes F with error ≤ ε} C(π, μ).
• By analogy:

  I(F,μ,ε) := inf_{π computes F with error ≤ ε} I(π, μ).
  Iext(F,μ,ε) := inf_{π computes F with error ≤ ε} Iext(π, μ).

Page 26:

I(F,μ,ε) vs. C(F,μ,ε): compressing interactive computation

• Source Coding Theorem: the problem of sending a sample of X can be implemented in expected cost < H(X)+1 communication – the information content of X.
• Is the same compression true for interactive protocols?
• Can F be solved in I(F,μ,ε) communication? Or in Iext(F,μ,ε) communication?

Page 27:

The big question

• Can interactive communication be compressed?
• Can π be simulated by π’ such that C(π’, μ) ≈ I(π, μ)?
• Does I(F,μ,ε) ≈ C(F,μ,ε)?

Page 28:

Compression results we know

• Let ε, ρ be constants; let π be a protocol that computes F with error ε.
• π’s costs: C, Iext, I.
• Then π can be simulated using:
  – (I·C)^½·polylog(C) communication; [Barak-B-Chen-Rao ’10]
  – Iext·polylog(C) communication; [Barak-B-Chen-Rao ’10]
  – 2^O(I) communication; [B ’11]
  while introducing an extra error of ρ.

Page 29:

The amortized cost of interactive computation

• Source Coding Theorem: the amortized cost of sending many independent samples of X is H(X) per sample.
• What is the amortized cost of computing many independent copies of F(X,Y)?

Page 30:

Information = amortized communication

• Theorem [B-Rao ’11]: for ε>0,

  I(F,μ,ε) = lim_{n→∞} C(F^n, μ^n, ε)/n.

• I(F,μ,ε) is the interactive analogue of H(X).

Page 31:

Information = amortized communication

• Theorem [B-Rao ’11]: for ε>0,

  I(F,μ,ε) = lim_{n→∞} C(F^n, μ^n, ε)/n.

• I(F,μ,ε) is the interactive analogue of H(X).
• Can we get rid of μ? I.e. make I(F,ε) a property of the task F alone?

(Table: distributional measures C(F,μ,ε) and I(F,μ,ε); prior-free counterparts R(F,ε) and “?”.)

Page 32:

Prior-free information cost

• Define:

  I(F,ε) := inf_{π computes F with error ≤ ε} max_μ I(π, μ).

• Want a protocol that reveals little information against all priors μ!
• Definitions are cheap!
• What is the connection between the “syntactic” I(F,ε) and the “meaningful” I(F,μ,ε)?
• I(F,μ,ε) ≤ I(F,ε)…

Page 33:

Prior-free information cost

• I(F,ε) := inf_{π computes F with error ≤ ε} max_μ I(π, μ).
• I(F,μ,ε) ≤ I(F,ε) for all μ.
• Recall: R(F,ε) = max_μ C(F,μ,ε).
• Theorem [B ’11]:

  I(F,ε) ≤ 2·max_μ I(F,μ,ε/2).
  I(F,0) = max_μ I(F,μ,0).

Page 34:

Prior-free information cost

• Recall: I(F,μ,ε) = lim_{n→∞} C(F^n, μ^n, ε)/n.
• Theorem: for ε>0,

  I(F,ε) = lim_{n→∞} R(F^n, ε)/n.

Page 35:

Example

• R(EQ,0) ≈ n.
• What is I(EQ,0)?

Page 36:

The information cost of Equality

• What is I(EQ,0)? Consider the following protocol.
• Alice holds X ∈ {0,1}^n, Bob holds Y ∈ {0,1}^n; let A be a non-singular matrix in Z_2^{n×n}, with rows A1, A2, ….
• In round i, Alice sends Ai·X and Bob sends Ai·Y.
• Continue for n steps, or until a disagreement is discovered.
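A hedged simulation of this protocol (my own code; I take A to be a shared random invertible 0/1 matrix over Z_2, which is one natural reading of the slide, and reveal its rows one round at a time):

```python
import random

def random_invertible_gf2(n, rng):
    """Sample a random n x n invertible 0/1 matrix over GF(2) by rejection."""
    while True:
        A = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n)]
        M = [row[:] for row in A]              # Gaussian elimination to test invertibility
        ok = True
        for col in range(n):
            pivot = next((r for r in range(col, n) if M[r][col]), None)
            if pivot is None:
                ok = False
                break
            M[col], M[pivot] = M[pivot], M[col]
            for r in range(n):
                if r != col and M[r][col]:
                    M[r] = [a ^ b for a, b in zip(M[r], M[col])]
        if ok:
            return A

def equality_protocol(x, y, rng):
    """Exchange Ai·X and Ai·Y round by round; stop at the first disagreement."""
    A = random_invertible_gf2(len(x), rng)
    for i, row in enumerate(A, start=1):
        ax = sum(r & b for r, b in zip(row, x)) % 2
        ay = sum(r & b for r, b in zip(row, y)) % 2
        if ax != ay:
            return False, i      # X != Y, caught after i rounds (O(1) in expectation)
    return True, len(x)          # all n rounds agree; since A is invertible, X = Y

rng = random.Random(1)
x = [1, 0, 1, 1, 0, 0, 1, 0]
print(equality_protocol(x, list(x), rng))                     # (True, 8)
print(equality_protocol(x, [1, 0, 1, 1, 0, 0, 1, 1], rng))    # (False, i) for a small i
```

When X ≠ Y, each fresh row catches the difference with probability about ½, which is what keeps the expected number of rounds, and hence the information revealed, O(1).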

Page 37:

Analysis (sketch)

• If X≠Y, the protocol will terminate in O(1) rounds on average, and thus reveal O(1) information.

• If X=Y… the players only learn the fact that X=Y (≤1 bit of information).

• Thus the protocol has O(1) information complexity.


Page 38:

Direct sum theorems

• I(F,ε) = lim_{n→∞} R(F^n, ε)/n.
• Questions:
  – Does R(F^n, ε) = Ω(n·R(F,ε))?
  – Does R(F^n, ε) = ω(R(F,ε))?

Page 39:

Direct sum strategy

• The strategy for proving direct sum results.
• Take a protocol for F^n that costs C_n = R(F^n, ε), and make a protocol for F that costs ≈ C_n/n.
• This would mean that C < C_n/n, i.e. C_n > n·C (up to lower-order factors), where C = R(F,ε).

(Diagram: a protocol for n copies of F, of cost C_n, is turned into a protocol for 1 copy of F of cost ≈ C_n/n.)

Page 40:

Direct sum strategy

• If life were so simple…

(Diagram: if the protocol for n copies decomposed into independent pieces for Copy 1, Copy 2, …, Copy n, restricting it to a single copy would immediately give a protocol for 1 copy of F of cost C_n/n. Easy!)

Page 41:

Direct sum strategy

• Theorem: I(F,ε) = I(F^n,ε)/n ≤ C_n/n = R(F^n,ε)/n.
• Compression → direct sum!

Page 42:

The information cost angle

• There is a protocol for one copy of F with communication cost C_n, but information cost ≤ C_n/n.

(Diagram: restricting the protocol for Copy 1, Copy 2, …, Copy n to a single copy yields a protocol with communication cost C_n but only ≈ C_n/n bits of information; what remains is compression.)

Page 43:

Direct sum theorems

• Best known general simulation [BBCR ’10]: a protocol with C communication and I information cost can be simulated using (I·C)^½·polylog(C) communication.
• Implies: R(F^n, ε) = Ω̃(n^½·R(F,ε)).

Page 44:

Compression vs. direct sum

• We saw that compression → direct sum.
• A form of the converse is also true.
• Recall: I(F,ε) = lim_{n→∞} R(F^n, ε)/n.
• If there is a problem such that I(F,ε) = o(R(F,ε)), then R(F^n, ε) = o(n·R(F,ε)).

Page 45:

A complete problem

• Can define a problem called Correlated Pointer Jumping – CPJ(C,I).
• The problem has communication cost C and information cost I.
• CPJ(C,I) is the “least compressible problem”.
• If R(CPJ(C,I),1/3) = O(I), then R(F,1/3) = O(I(F,1/3)) for all F.

Page 46:

The big picture

(Diagram: a square with R(F,ε) and R(F^n,ε)/n in the top row and I(F,ε) and I(F^n,ε)/n in the bottom row; the edges are labeled “direct sum for communication?” (top), “direct sum for information” (bottom), “information = amortized communication” (right), and “interactive compression?” (left).)

Page 47:

Partial progress

• Can compress bounded-round interactive protocols.
• The main primitive is a one-shot version of the Slepian-Wolf theorem.
• Alice gets a distribution P_X.
• Bob gets a prior distribution P_Y.
• Goal: both must sample from P_X.

Page 48:

Correlated sampling

• Alice gets P_X and Bob gets P_Y; at the end both should hold the same sample M ~ P_X.
• The best we can hope for is D(P_X||P_Y):

  D(P_X||P_Y) = ∑_{u∈U} P_X(u)·log(P_X(u)/P_Y(u)).

(Diagram: Alice with P_X and Bob with P_Y both output M ~ P_X.)

Page 49:

Proof idea

• Sample using D(P_X||P_Y) + O(log 1/ε + D(P_X||P_Y)^½) communication, with statistical error ε.

(Diagram: public randomness provides ~|U| shared samples (u1,q1), (u2,q2), (u3,q3), …; Alice keeps the points falling under her distribution P_X, Bob those falling under P_Y.)

Page 50:

Proof idea

• Sample using D(P_X||P_Y) + O(log 1/ε + D(P_X||P_Y)^½) communication, with statistical error ε.

(Diagram: Alice’s first accepted point is u4; she sends hashes h1(u4), h2(u4) so that Bob can pick it out from his own accepted points, such as u2.)

Page 51: 1 Information and interactive computation January 16, 2012 Mark Braverman Computer Science, Princeton University.

5151

Proof Idea• Sample using D(PX||PY)+O(log 1/ε+D(PX||PY)½)

communication with statistical error ε.

u4u2

h4(u4)… hlog 1/ ε(u4)

u4

h3(u4)

PX 2PY

PXPYu4

u4

h1(u4), h2(u4)

1 1

0 0

Page 52:

Analysis

• If P_X(u4) ≈ 2^k·P_Y(u4), then the protocol will reach round k of doubling.
• There will be ≈ 2^k candidates.
• About k + log 1/ε hashes.
• The contribution of u4 to the cost:
  – P_X(u4)·(log(P_X(u4)/P_Y(u4)) + log 1/ε).
• Summing over u gives D(P_X||P_Y) = ∑_{u∈U} P_X(u)·log(P_X(u)/P_Y(u)) plus lower-order terms. Done!
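A hedged, much-simplified simulation of this correlated sampling primitive (my own code: shared (u, q) pairs play the role of the public samples, random hash bits stand in for h1, h2, …, and Bob's acceptance region doubles each round; it illustrates the mechanics only, not the exact communication bound):

```python
import math
import random

def correlated_sample(p_x, p_y, eps, seed):
    """Alice knows p_x, Bob knows p_y, both see the same shared randomness.
    Returns (bob_sample, alice_sample, hash_bits_sent); the samples agree w.h.p."""
    rng = random.Random(seed)                             # shared public randomness
    support = sorted(p_x)
    trials = [(rng.choice(support), rng.random()) for _ in range(20000)]
    hashes = [{u: rng.randint(0, 1) for u in support} for _ in range(64)]

    # Alice: the first shared point falling under her curve is distributed ~ p_x.
    star = next(u for u, q in trials if q < p_x[u])

    sent = math.ceil(math.log2(1 / eps))                  # initial batch of hash bits
    for k in range(40):
        # Bob accepts points under 2^k * p_y, then keeps those matching Alice's hashes.
        cand = {u for u, q in trials if q < min(1.0, (2 ** k) * p_y[u])}
        cand = {u for u in cand if all(hashes[j][u] == hashes[j][star] for j in range(sent))}
        if len(cand) == 1:
            return cand.pop(), star, sent
        sent += 1                                         # one more hash per doubling round
    return star, star, sent

p_x = {"a": 0.7, "b": 0.2, "c": 0.1}
p_y = {"a": 0.1, "b": 0.2, "c": 0.7}
print(correlated_sample(p_x, p_y, eps=0.01, seed=3))      # samples agree w.p. >= 1 - eps
```

The number of hash bits sent tracks the "about k + log 1/ε hashes" count in the analysis above: the more P_X(u) exceeds P_Y(u) at the sampled point, the more doubling rounds, and hence hash bits, are needed.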

Page 53:

Directions

• Can interactive communication be fully compressed? R(F,ε) = I(F,ε)?
• What is the relationship between I(F,ε), Iext(F,ε) and R(F,ε)?
• Many other questions in interactive coding theory!

Page 54:

Thank You