Page 1: Computing and Communications 2. Information Theory -Entropy


Computing and Communications

2. Information Theory - Entropy

Ying Cui

Department of Electronic Engineering

Shanghai Jiao Tong University, China

2017, Autumn

1

Page 2: Computing and Communications 2. Information Theory -Entropy

Outline

• Entropy

• Joint entropy and conditional entropy

• Relative entropy and mutual information

• Relationship between entropy and mutual information

• Chain rules for entropy, relative entropy and mutual information

• Jensen’s inequality and its consequences

2

Page 3: Computing and Communications 2. Information Theory -Entropy

Reference

• T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley

3

Page 4: Computing and Communications 2. Information Theory -Entropy

OVERVIEW

4

Page 5: Computing and Communications 2. Information Theory -Entropy

Information Theory

• Information theory answers two fundamental questions in communication theory

– what is the ultimate data compression? -- entropy H

– what is the ultimate transmission rate of communication? -- channel capacity C

• Information theory is often considered a subset of communication theory

5

Page 6: Computing and Communications 2. Information Theory -Entropy

Information Theory

• Information theory has made fundamental contributions to other fields

6

Page 7: Computing and Communications 2. Information Theory -Entropy

A Mathematical Theory of Commun.

• In 1948, Shannon published “A Mathematical Theory of Communication”, founding Information Theory

• Shannon made two major modifications that have had a huge impact on communication design

– the source and channel are modeled probabilistically

– bits became the common currency of communication

7

Page 8: Computing and Communications 2. Information Theory -Entropy

A Mathematical Theory of Commun.

• Shannon proved the following three theorems

– Theorem 1. The minimum compression rate of a source is its entropy rate H

– Theorem 2. The maximum reliable transmission rate over a channel is its mutual information I

– Theorem 3. End-to-end reliable communication is possible if and only if H < I, i.e., there is no loss in performance from using a digital interface between source and channel coding

• Impacts of Shannon's results

– after almost 70 years, all communication systems are designed based on the principles of information theory

– the limits not only serve as benchmarks for evaluating communication schemes, but also provide insights into designing good ones

– the basic information-theoretic limits in Shannon's theorems have now been successfully achieved using efficient algorithms and codes

8

Page 9: Computing and Communications 2. Information Theory -Entropy

ENTROPY

9

Page 10: Computing and Communications 2. Information Theory -Entropy

Definition

• Entropy is a measure of the uncertainty of a r.v.

• Consider a discrete r.v. X with alphabet 𝒳 and p.m.f. p(x) = Pr[X = x], x ∈ 𝒳

• The entropy of X is defined as H(X) = -∑_{x∈𝒳} p(x) log p(x)

– log is to the base 2, and entropy is expressed in bits
  • e.g., the entropy of a fair coin toss is 1 bit

– define 0 log 0 = 0, since x log x → 0 as x → 0
  • adding terms of zero probability does not change the entropy

10
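A minimal sketch of this definition in Python (the example p.m.f.s are made up for illustration), computing entropy in bits from a list of probabilities with the convention 0 log 0 = 0:

```python
from math import log2

def entropy(pmf):
    """Entropy in bits of a p.m.f. given as a list of probabilities.

    Terms with p(x) = 0 are skipped, i.e. the convention 0 log 0 = 0.
    """
    return -sum(p * log2(p) for p in pmf if p > 0)

print(entropy([0.5, 0.5]))   # fair coin toss: 1.0 bit
print(entropy([1.0, 0.0]))   # deterministic outcome: 0.0 bits
print(entropy([0.25] * 4))   # uniform over 4 symbols: 2.0 bits
```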

Page 11: Computing and Communications 2. Information Theory -Entropy

Properties

– entropy is nonnegative: H(X) ≥ 0

– the base of the log can be changed: H_b(X) = (log_b a) H_a(X)

11
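A small numerical check of the change-of-base property H_b(X) = (log_b a) H_a(X), comparing entropy in bits (base 2) and nats (base e) for a made-up p.m.f.:

```python
from math import log, log2

pmf = [0.5, 0.25, 0.125, 0.125]  # hypothetical p.m.f., for illustration only

h_bits = -sum(p * log2(p) for p in pmf if p > 0)  # entropy in bits (base 2)
h_nats = -sum(p * log(p) for p in pmf if p > 0)   # entropy in nats (base e)

# H_e(X) = (ln 2) * H_2(X), so the ratio of the two equals ln 2
print(h_bits, h_nats, h_nats / h_bits, log(2))
```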

Page 12: Computing and Communications 2. Information Theory -Entropy

Example

• Consider a binary r.v. X with Pr[X = 1] = p and Pr[X = 0] = 1 - p, so that H(X) = -p log p - (1 - p) log (1 - p)

– H(X) = 1 bit when p = 0.5
  • maximum uncertainty

– H(X) = 0 bits when p = 0 or p = 1
  • minimum uncertainty

– H(X) is a concave function of p

12
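A quick numerical check of these three observations for the binary entropy function (the test points are chosen arbitrarily):

```python
from math import log2

def binary_entropy(p):
    """Binary entropy H(p) in bits, with the convention 0 log 0 = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

print(binary_entropy(0.5))   # 1.0 bit: maximum uncertainty
print(binary_entropy(0.0))   # 0.0 bits: minimum uncertainty
print(binary_entropy(1.0))   # 0.0 bits: minimum uncertainty

# concavity at a midpoint: H((a + b) / 2) >= (H(a) + H(b)) / 2
a, b = 0.1, 0.6
print(binary_entropy((a + b) / 2) >= (binary_entropy(a) + binary_entropy(b)) / 2)  # True
```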

Page 13: Computing and Communications 2. Information Theory -Entropy

Example

13

Page 14: Computing and Communications 2. Information Theory -Entropy

JOINT ENTROPY AND CONDITIONAL ENTROPY

14

Page 15: Computing and Communications 2. Information Theory -Entropy

Joint Entropy

• Joint entropy is a measure of the uncertainty of a pair of r.v.s

• Consider a pair of discrete r.v.s (X, Y) with alphabets 𝒳, 𝒴 and p.m.f.s p(x) = Pr[X = x], x ∈ 𝒳, p(y) = Pr[Y = y], y ∈ 𝒴, and joint p.m.f. p(x, y) = Pr[X = x, Y = y]

• The joint entropy is defined as H(X, Y) = -∑_{x,y} p(x, y) log p(x, y)

15
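A minimal sketch computing H(X, Y) from a joint p.m.f. table (the joint p.m.f. below is invented for illustration):

```python
from math import log2

# hypothetical joint p.m.f. p(x, y) for X, Y taking values in {0, 1}
p_xy = {
    (0, 0): 0.5,
    (0, 1): 0.25,
    (1, 0): 0.125,
    (1, 1): 0.125,
}

# H(X, Y) = -sum_{x,y} p(x, y) log2 p(x, y)
h_xy = -sum(p * log2(p) for p in p_xy.values() if p > 0)
print(h_xy)  # 1.75 bits
```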

Page 16: Computing and Communications 2. Information Theory -Entropy

Conditional Entropy

• Conditional entropy of a r.v. Y given another r.v. X: H(Y|X) = ∑_x p(x) H(Y|X = x) = -∑_{x,y} p(x, y) log p(y|x)

– expected value of the entropies of the conditional distributions, averaged over the conditioning r.v.

16
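A minimal sketch of this averaging, using the same made-up joint p.m.f. as in the previous sketch; it also verifies the chain rule H(X, Y) = H(X) + H(Y|X) treated on the next slides:

```python
from math import log2

# hypothetical joint p.m.f. p(x, y) for X, Y taking values in {0, 1}
p_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}

def H(pmf):
    """Entropy in bits of an iterable of probabilities (0 log 0 = 0)."""
    return -sum(p * log2(p) for p in pmf if p > 0)

# marginal p(x) and the entropies of the conditional distributions p(y | x)
p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in (0, 1)}
h_y_given_x = sum(
    p_x[x] * H(p_xy[(x, y)] / p_x[x] for y in (0, 1)) for x in (0, 1)
)

# chain rule: H(X, Y) = H(X) + H(Y | X)
print(H(p_xy.values()), H(p_x.values()) + h_y_given_x)  # both 1.75 bits
```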

Page 17: Computing and Communications 2. Information Theory -Entropy

Chain Rule

17

Page 18: Computing and Communications 2. Information Theory -Entropy

Chain Rule

18

Page 19: Computing and Communications 2. Information Theory -Entropy

Example

19

Page 20: Computing and Communications 2. Information Theory -Entropy

Example

20

Page 21: Computing and Communications 2. Information Theory -Entropy

RELATIVE ENTROPY AND MUTUAL INFORMATION

21

Page 22: Computing and Communications 2. Information Theory -Entropy

Relative Entropy

• Relative entropy is a measure of the “distance” between two distributions

– definition: D(p||q) = ∑_x p(x) log ( p(x) / q(x) )

– convention: 0 log (0/0) = 0, 0 log (0/q) = 0, and p log (p/0) = ∞

– if there is any x such that p(x) > 0 and q(x) = 0, then D(p||q) = ∞

22
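A minimal sketch of D(p||q) under these conventions, with two made-up distributions; it also illustrates that relative entropy is not symmetric and becomes infinite when q puts zero probability where p does not:

```python
from math import inf, log2

def kl(p, q):
    """Relative entropy D(p || q) in bits, for p.m.f.s given as aligned lists."""
    d = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue        # 0 log (0/q) = 0
        if qi == 0:
            return inf      # p log (p/0) = infinity
        d += pi * log2(pi / qi)
    return d

p, q = [0.5, 0.5], [0.75, 0.25]
print(kl(p, q), kl(q, p))          # different values: D(p||q) != D(q||p)
print(kl([0.5, 0.5], [1.0, 0.0]))  # inf, since p(x) > 0 where q(x) = 0
```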

Page 23: Computing and Communications 2. Information Theory -Entropy

Example

23

Page 24: Computing and Communications 2. Information Theory -Entropy

Mutual Information

• Mutual information is a measure of the amount of information that one r.v. contains about another r.v.

– for (X, Y) with joint p.m.f. p(x, y) and marginals p(x), p(y): I(X; Y) = ∑_{x,y} p(x, y) log ( p(x, y) / (p(x) p(y)) ) = D( p(x, y) || p(x) p(y) )

24
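A minimal sketch computing I(X; Y) from a made-up joint p.m.f.; it also checks the relation I(X; Y) = H(X) + H(Y) - H(X, Y) discussed in the next section:

```python
from math import log2

# hypothetical joint p.m.f. p(x, y) for X, Y taking values in {0, 1}
p_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}

p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yy), p in p_xy.items() if yy == y) for y in (0, 1)}

def H(pmf):
    """Entropy in bits of an iterable of probabilities (0 log 0 = 0)."""
    return -sum(p * log2(p) for p in pmf if p > 0)

# I(X; Y) = sum_{x,y} p(x, y) log [ p(x, y) / (p(x) p(y)) ]
mi = sum(p * log2(p / (p_x[x] * p_y[y])) for (x, y), p in p_xy.items() if p > 0)

print(mi)                                                     # mutual information in bits
print(H(p_x.values()) + H(p_y.values()) - H(p_xy.values()))   # same value
```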

Page 25: Computing and Communications 2. Information Theory -Entropy

RELATIONSHIP BETWEEN ENTROPY AND MUTUAL INFORMATION

25

Page 26: Computing and Communications 2. Information Theory -Entropy

Relation

26

Page 27: Computing and Communications 2. Information Theory -Entropy

Proof

27

Page 28: Computing and Communications 2. Information Theory -Entropy

Illustration

28

Page 29: Computing and Communications 2. Information Theory -Entropy

CHAIN RULES FOR ENTROPY, RELATIVE ENTROPY, AND MUTUAL INFORMATION

29

Page 30: Computing and Communications 2. Information Theory -Entropy

Chain Rule for Entropy

30

Page 31: Computing and Communications 2. Information Theory -Entropy

Proof

31

Page 32: Computing and Communications 2. Information Theory -Entropy

Alternative Proof

32

Page 33: Computing and Communications 2. Information Theory -Entropy

Chain Rule for Information

33

Page 34: Computing and Communications 2. Information Theory -Entropy

Proof

34

Page 35: Computing and Communications 2. Information Theory -Entropy

Chain Rule for Relative Entropy

35

Page 36: Computing and Communications 2. Information Theory -Entropy

Proof

36

Page 37: Computing and Communications 2. Information Theory -Entropy

JENSEN'S INEQUALITY AND ITS CONSEQUENCES

37

Page 38: Computing and Communications 2. Information Theory -Entropy

Convex & Concave Functions

• Examples:

– convex functions: x², |x|, e^x, and x log x (for x ≥ 0)

– concave functions: log x and √x (for x ≥ 0)

– linear functions ax + b are both convex and concave

38
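A small numerical check of these examples via the midpoint inequality f((a + b)/2) ≤ (f(a) + f(b))/2 for convex f, and the reversed inequality for concave f; the test points are arbitrary:

```python
from math import exp, log, sqrt

a, b = 0.5, 4.0  # arbitrary test points with a, b > 0
mid = (a + b) / 2

# convex examples: x^2, |x|, e^x, x log x
for f in (lambda x: x * x, abs, exp, lambda x: x * log(x)):
    print(f(mid) <= (f(a) + f(b)) / 2)   # True for convex f

# concave examples: log x and sqrt(x) -- the inequality reverses
for f in (log, sqrt):
    print(f(mid) >= (f(a) + f(b)) / 2)   # True for concave f
```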

Page 39: Computing and Communications 2. Information Theory -Entropy

Convex & Concave Functions

39

Page 40: Computing and Communications 2. Information Theory -Entropy

Jensen’s Inequality

40
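Jensen's inequality states that E[f(X)] ≥ f(E[X]) for a convex function f and r.v. X; a minimal numerical check with a made-up p.m.f. and f(x) = x²:

```python
# hypothetical p.m.f. over a few values of X, for illustration only
support = [1.0, 2.0, 5.0]
probs = [0.2, 0.5, 0.3]

def f(x):
    return x * x  # a convex function

e_x = sum(p * x for p, x in zip(probs, support))      # E[X]
e_fx = sum(p * f(x) for p, x in zip(probs, support))  # E[f(X)]

# Jensen's inequality for convex f: E[f(X)] >= f(E[X])
print(e_fx, f(e_x), e_fx >= f(e_x))  # True
```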

Page 41: Computing and Communications 2. Information Theory -Entropy

Information Inequality

41

Page 42: Computing and Communications 2. Information Theory -Entropy

Proof

42

Page 43: Computing and Communications 2. Information Theory -Entropy

Nonnegativity of Mutual Information

43

Page 44: Computing and Communications 2. Information Theory -Entropy

Max. Entropy Dist. – Uniform Dist.

44
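The result behind this slide title is H(X) ≤ log|𝒳|, with equality iff X is uniform over its alphabet 𝒳; a small check comparing an arbitrary made-up p.m.f. with the uniform one on the same alphabet:

```python
from math import log2

def H(pmf):
    """Entropy in bits of a p.m.f. given as a list of probabilities (0 log 0 = 0)."""
    return -sum(p * log2(p) for p in pmf if p > 0)

alphabet_size = 4
arbitrary = [0.4, 0.3, 0.2, 0.1]               # made-up p.m.f.
uniform = [1 / alphabet_size] * alphabet_size  # uniform p.m.f.

# H(arbitrary) < H(uniform) = log2(4) = 2 bits
print(H(arbitrary), H(uniform), log2(alphabet_size))
```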

Page 45: Computing and Communications 2. Information Theory -Entropy

Conditioning Reduces Entropy

45

Page 46: Computing and Communications 2. Information Theory -Entropy

Independence Bound on Entropy

46
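A combined numerical check of the last two results, H(X|Y) ≤ H(X) (conditioning reduces entropy) and H(X, Y) ≤ H(X) + H(Y) (independence bound), on the same made-up joint p.m.f. used earlier:

```python
from math import log2

# hypothetical joint p.m.f. p(x, y) for X, Y taking values in {0, 1}
p_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}

def H(pmf):
    """Entropy in bits of an iterable of probabilities (0 log 0 = 0)."""
    return -sum(p * log2(p) for p in pmf if p > 0)

p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yy), p in p_xy.items() if yy == y) for y in (0, 1)}

h_x, h_y, h_xy = H(p_x.values()), H(p_y.values()), H(p_xy.values())
h_x_given_y = h_xy - h_y  # chain rule: H(X|Y) = H(X, Y) - H(Y)

print(h_x_given_y <= h_x)  # True: conditioning reduces entropy
print(h_xy <= h_x + h_y)   # True: independence bound on entropy
```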

Page 47: Computing and Communications 2. Information Theory -Entropy

Summary

47

Page 48: Computing and Communications 2. Information Theory -Entropy

Summary

48

Page 49: Computing and Communications 2. Information Theory -Entropy

Summary

49