Information Theory
Mohamed Hamada
Software Engineering Lab
The University of Aizu
Email: [email protected]
URL: http://www.u-aizu.ac.jp/~hamada

Evaluation
• Active sheets 20%
• Midterm Exam 30%
• Final Exam 50%

Goals
• Understand the concepts of information entropy and channel capacity
• Understand the digital communication model and its components
• Understand how the components operate
• Understand data compression
• Understand error detection and correction
Course Outline
• Introduction to set theory & probability
• Introduction to information theory
• Coding techniques & data compression
• Information Entropy
• Communication Channel
• Error Detection and Correction
Final Overview
Lecture 1
Introduction to Set Theory and Probability
• 1. Sets, Operations on sets
• 2. Trial, Probability space, Events
• 3. Random variables, Probability distribution
• 4. Expected values, Variance
• 5. Conditional Probability
• 6. Bayes' Theorem
Lecture 2
• Overview of Information Theory
• Digital Communication
• History
• What is Information Theory
OVERVIEW OF INFORMATION THEORY FRAMEWORK
[Figure: Information theory sits at the intersection of communication theory, probability theory, statistics, economics, mathematics, physics, computer science, and other fields.]
DIGITAL COMMUNICATION
Lecture 3
• Watching a Coding Video (50 mins.)
• What is Information Theory
• Information Source
• Introduction to Source Coding
INFORMATION TRANSFER ACROSS CHANNELS
[Figure: Sent messages (symbols) flow from the source through source coding (compression, governed by source entropy) and channel coding (error correction, governed by channel capacity), across the channel, then back through channel decoding and source decoding (decompression) to the receiver as received messages. The diagram contrasts capacity vs. efficiency.]
Digital Communication Systems
[Figure: Information Source → Source Encoder → Channel Encoder → Modulator → Channel → De-Modulator → Channel Decoder → Source Decoder → Destination. The information source is modeled as discrete, memoryless, and stationary.]
Information Source
• Discrete: the source produces individual symbols in distinct unit times.
• Memoryless: the generated symbols (of a source message) are independent of one another.
• Stationary: the statistics of the source do not change with time.
Lecture 4
• What is Entropy
• Information Source
• Measure of Information
• Self-Information
• Unit of Information
• Entropy
• Properties
Measure of Information
• The amount of information gained after observing a symbol $s_k$ that has probability $p_k$:
$$I(s_k) = \log \frac{1}{p_k}$$
Unit of Information
• Depends on the base of the logarithm:
$$I(s_k) = \log_2 \frac{1}{p_k} \quad \text{(bits)}$$
$$I(s_k) = \log_{10} \frac{1}{p_k} \quad \text{(Hartleys)}$$
$$I(s_k) = \log_e \frac{1}{p_k} \quad \text{(nats)}$$
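As a quick illustration, here is a minimal Python sketch (the probability value is hypothetical) that computes self-information in each unit by changing the logarithm base:

```python
import math

def self_information(p, base=2):
    """I = log(1/p); base 2 gives bits, base 10 Hartleys, base e nats."""
    return math.log(1 / p, base)

p = 0.25  # hypothetical symbol probability
print(self_information(p, 2))        # 2.0 bits
print(self_information(p, 10))       # ≈ 0.602 Hartleys
print(self_information(p, math.e))   # ≈ 1.386 nats
```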
Lecture 5
• Entropy
• Conditional Entropy
• Example
• Joint Entropy
• Chain Rule
Entropy H(S)
• Entropy is the average information content of a source:
$$H(S) = E[I(s_k)] = \sum_{k=0}^{K-1} p_k \log_2 \frac{1}{p_k}$$
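A minimal sketch of the entropy formula in Python, using a hypothetical 5-symbol source (the same probabilities reappear in the Huffman example later):

```python
import math

def entropy(probs):
    """H(S) = sum_k p_k * log2(1/p_k), in bits; zero-probability terms are skipped."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

print(entropy([0.4, 0.2, 0.2, 0.1, 0.1]))   # ≈ 2.122 bits per symbol
```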
Conditional Entropy H(Y|X)
The amount of information remaining in Y when X is given:
$$H(Y \mid X) = \sum_j P(X = v_j)\, H(Y \mid X = v_j)$$
Joint Entropy H(X,Y)
The amount of information contained in both events X and Y:
$$H(X, Y) = -\sum_{x,y} p(x,y) \log p(x,y)$$
Chain Rule
Relationship between conditional and joint entropy:
$$H(X, Y) = H(X) + H(Y \mid X)$$
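A small numeric check of the chain rule, using a hypothetical joint distribution p(x, y):

```python
import math

def H(probs):
    """Entropy in bits of a list of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical joint distribution over X in {0, 1} and Y in {0, 1}
p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

H_joint = H(list(p_xy.values()))                         # H(X,Y)
p_x = {x: p_xy[(x, 0)] + p_xy[(x, 1)] for x in (0, 1)}   # marginal of X
H_x = H(list(p_x.values()))                              # H(X)
H_y_given_x = sum(p_x[x] * H([p_xy[(x, y)] / p_x[x] for y in (0, 1)])
                  for x in (0, 1))                       # H(Y|X)
print(H_joint, H_x + H_y_given_x)   # both ≈ 1.846, confirming the chain rule
```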
Lecture 6
• Entropy and Data Compression
• Uniquely decodable codes
• Prefix Code
• Average Code Length
• Shannon’s First Theorem
• Kraft-McMillan Inequality
• Code Efficiency
• Code Extension
Prefix Coding (Instantaneous code)
• A prefix code is defined as a code in which no codeword is the prefix of any other codeword.
• A prefix code is uniquely decodable.
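A small sketch of checking the prefix property for a candidate set of codewords (the example codes are hypothetical):

```python
def is_prefix_code(codewords):
    """True if no codeword is a prefix of any other codeword."""
    return not any(c != d and d.startswith(c)
                   for c in codewords for d in codewords)

print(is_prefix_code(["1", "01", "000", "0010", "0011"]))   # True: a prefix code
print(is_prefix_code(["0", "01", "11"]))                    # False: "0" prefixes "01"
```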
Average Code Length
• Source has K symbols
• Each symbol $s_k$ has probability $p_k$
• Each symbol $s_k$ is represented by a codeword $c_k$ of length $l_k$ bits
• Average codeword length:
$$\bar{L} = \sum_{k=0}^{K-1} p_k l_k$$
[Figure: Information Source → Source Encoder, mapping each symbol $s_k$ to a codeword $c_k$.]
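A one-line sketch of the average-length formula, with hypothetical probabilities and codeword lengths:

```python
probs = [0.4, 0.2, 0.2, 0.1, 0.1]   # p_k for a hypothetical 5-symbol source
lengths = [1, 2, 3, 4, 4]           # l_k: bits in each codeword c_k

L_bar = sum(p * l for p, l in zip(probs, lengths))
print(L_bar)   # 2.2 bits per symbol
```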
Shannon's First Theorem: The Source Coding Theorem
The outputs of an information source cannot be represented by a source code whose average length is less than the source entropy:
$$\bar{L} \ge H(S)$$
Kraft-McMillan Inequality
• If the codeword lengths of a code satisfy the Kraft-McMillan inequality, then a prefix code with these codeword lengths can be constructed:
$$\sum_{k=0}^{K-1} 2^{-l_k} \le 1$$
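A minimal sketch of testing codeword lengths against the inequality (the lengths are hypothetical):

```python
def satisfies_kraft(lengths):
    """Check sum_k 2^(-l_k) <= 1 for binary codeword lengths l_k."""
    return sum(2 ** -l for l in lengths) <= 1

print(satisfies_kraft([1, 2, 3, 4, 4]))   # True: the sum is exactly 1.0
print(satisfies_kraft([1, 1, 2]))         # False: the sum is 1.25 > 1
```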
Code Efficiency η
$$\eta = \frac{H(S)}{\bar{L}}$$
• An efficient code means $\eta \approx 1$.
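Putting the pieces together, a sketch of the efficiency computation, reusing the hypothetical source and code lengths from the sketches above:

```python
import math

probs = [0.4, 0.2, 0.2, 0.1, 0.1]
lengths = [1, 2, 3, 4, 4]

H_S = sum(p * math.log2(1 / p) for p in probs)        # ≈ 2.122 bits
L_bar = sum(p * l for p, l in zip(probs, lengths))    # 2.2 bits

print(H_S / L_bar)   # η ≈ 0.965, close to 1: an efficient code
```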
Lecture 7
• Mid Term Exam
Lecture 8
• Source Coding Techniques
• Huffman Code
• Two-pass Huffman Code
• Lempel-Ziv Encoding
• Lempel-Ziv Decoding
[Figure: Digital communication system block diagram, as before — Information Source → Source Encoder → Channel Encoder → Modulator → Channel → De-Modulator → Channel Decoder → Source Decoder → Destination.]
Data Compression
1. Huffman Code
2. Two-pass Huffman Code
3. Lempel-Ziv Code
4. Fano Code
5. Shannon Code
6. Arithmetic Code
Huffman Code
1. Take together the two smallest probabilities: P(i) + P(j)
2. Replace symbols i and j by a new combined symbol
3. Go to step 1, until only one symbol remains
Application examples: JPEG, MPEG, MP3
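A minimal Python sketch of this merge-the-two-smallest procedure using a heap (the function name and the example probabilities are illustrative, not from the slides):

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a binary Huffman code for a {symbol: probability} dict."""
    tick = count()   # tie-breaker so the heap never compares subtrees
    heap = [(p, next(tick), sym) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)    # the two smallest probabilities
        p2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(tick), (left, right)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: branch on 0 / 1
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"      # lone-symbol edge case
    walk(heap[0][2], "")
    return codes

# Same probabilities as the table below; ties make several codes equally valid:
print(huffman_code({"s0": 0.1, "s1": 0.2, "s2": 0.4, "s3": 0.2, "s4": 0.1}))
```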
Another Solution B

Source Symbol sk   Stage I   Stage II   Stage III   Stage IV   Code
s2                 0.4       0.4        0.4         0.6        1
s1                 0.2       0.2        0.4         0.4        01
s3                 0.2       0.2        0.2                    000
s0                 0.1       0.2                               0010
s4                 0.1                                         0011
Two-pass Huffman Code
This method is used when the probabilities of the symbols in the information source are unknown. We first estimate these probabilities by counting the occurrences of each symbol in the given message, and then construct the possible Huffman codes. This is summarized in the following two passes:
Pass 1: Measure the frequency of occurrence of each character in the message
Pass 2: Build the possible Huffman codes
Source Coding Techniques: 2. Two-pass Huffman Code
Example
Consider the message: M=ABABABABABACADABACADABACADABACAD
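A sketch of the two passes on this message, reusing the huffman_code function from the earlier sketch:

```python
from collections import Counter

M = "ABABABABABACADABACADABACADABACAD"

# Pass 1: estimate symbol probabilities from occurrence counts
counts = Counter(M)
probs = {sym: c / len(M) for sym, c in counts.items()}
print(probs)   # A: 0.5, B: 0.25, C: 0.125, D: 0.125

# Pass 2: build the Huffman code from the estimated probabilities
print(huffman_code(probs))   # e.g. A -> 0, B -> 10, C -> 110, D -> 111
```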
Stochastic, Markov, and Ergodic Sources
• Stochastic process: any sequence of random variables from some probability space.
• Markov information source: an information source with memory, in which the probability of a symbol occurring in a message depends on a finite number of preceding symbols.
• Ergodic: a Markov chain is said to be ergodic if, after a certain finite number of steps, it is possible to go from any state to any other state with a nonzero probability.
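A small sketch of the ergodicity condition stated above: test whether some power of the transition matrix has all entries nonzero (the example chains are hypothetical):

```python
import numpy as np

def is_ergodic(P, max_steps=None):
    """True if some power P^n (n up to max_steps) has all entries nonzero,
    i.e. every state can reach every other state in n steps."""
    P = np.asarray(P, dtype=float)
    steps = max_steps or len(P) ** 2   # a safe default bound
    Q = P.copy()
    for _ in range(steps):
        if np.all(Q > 0):
            return True
        Q = Q @ P
    return False

print(is_ergodic([[0.0, 1.0], [0.5, 0.5]]))   # True: mixes after two steps
print(is_ergodic([[1.0, 0.0], [0.0, 1.0]]))   # False: the two states never mix
```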
Lecture 12
• Communication Channel
• Noiseless binary channel
• Binary Symmetric Channel (BSC)
• Symmetric Channel
• Mutual Information
• Channel Capacity
[Figure: Digital communication system block diagram (as before), with the channel highlighted. Related topics: Noiseless Binary Channel, Binary Symmetric Channel, Symmetric Channel, Channel Capacity, Mutual Information, Conditional Entropy, and Stochastic/Markov/Ergodic sources.]
Noiseless Binary Channel
[Figure: input 0 maps to output 0 and input 1 maps to output 1, each with probability 1.]
Transition matrix:
$$p(y \mid x) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
Binary Symmetric Channel (BSC) (Noisy channel)
[Figure: input 0 passes to output 0 with probability 1-p and flips to 1 with probability p; input 1 passes to output 1 with probability 1-p and flips to 0 with probability p.]
Transition matrix:
$$p(y \mid x) = \begin{pmatrix} 1-p & p \\ p & 1-p \end{pmatrix}$$
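As a numeric illustration, a sketch assuming the standard result (not stated on the slide) that the BSC capacity is C = 1 - H_b(p), where H_b is the binary entropy function:

```python
import math

def binary_entropy(p):
    """H_b(p) = -p*log2(p) - (1-p)*log2(1-p), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

for p in (0.0, 0.1, 0.5):
    print(p, 1 - binary_entropy(p))   # 1.0, ≈0.531, 0.0 bits per channel use
```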
Symmetric Channel (Noisy channel)
[Figure: X → Channel → Y, characterized by the transition matrix p(y | x).]
In the transition matrix of this channel, all the rows are permutations of each other, and so are the columns.
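To connect this to mutual information and channel capacity, here is a sketch that computes I(X;Y) = H(Y) - H(Y|X) for any transition matrix; for a symmetric channel the uniform input distribution achieves capacity (standard results, assumed here rather than taken from the slides):

```python
import numpy as np

def entropy(p):
    """H(p) in bits; zero entries are skipped."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_information(p_x, P):
    """I(X;Y) = H(Y) - H(Y|X), where P[i][j] = p(y_j | x_i)."""
    p_x = np.asarray(p_x, dtype=float)
    P = np.asarray(P, dtype=float)
    p_y = p_x @ P                                            # output distribution
    h_y_given_x = sum(px * entropy(row) for px, row in zip(p_x, P))
    return entropy(p_y) - h_y_given_x

# BSC with p = 0.1 and uniform input: matches the capacity value above
print(mutual_information([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]]))   # ≈ 0.531 bits
```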