Part 1 Markov Models for Pattern Recognition – Introduction
CSE717, SPRING 2008
CUBS, Univ at Buffalo
Textbook
Markov models for pattern recognition: from theory to applications
by Gernot A. Fink, 1st Edition, Springer, Nov 2007
Textbook Contents
- Foundation of Mathematical Statistics
- Vector Quantization and Mixture Density Models
- Markov Models
  - Hidden Markov Model (HMM): model formulation, classic algorithms in the HMM, application domains of the HMM
  - n-Gram models
- Systems
  - Character and handwriting recognition
  - Speech recognition
  - Analysis of biological sequences
Preliminary Requirements
- Familiarity with probability theory and statistics
- Basic concepts in stochastic processes
Part 2 Foundation of Probability Theory, Statistics & Stochastic Process
CSE717, SPRING 2008
CUBS, Univ at Buffalo
Coin Toss Problem
Coin toss result: $X$ is a random variable; head and tail are its states; $S_X$ is the set of states:
$S_X = \{\mathrm{head}, \mathrm{tail}\}$
Probabilities: $\Pr_X(\mathrm{head}) = \Pr_X(\mathrm{tail}) = 0.5$
Discrete Random Variable
A discrete random variable’s states are discrete: natural numbers, integers, etc.
Described by the probabilities of its states: $\Pr_X(s_1), \Pr_X(s_2), \ldots$
$s_1, s_2, \ldots$: discrete states (possible values of $x$)
Probabilities over all the states add up to 1: $\sum_i \Pr_X(s_i) = 1$
Continuous Random Variable
A continuous random variable’s states are continuous: real numbers, etc.
Described by its probability density function (p.d.f.): $p_X(s)$
The probability of $a < X < b$ is obtained by integration:
$\Pr(a < X < b) = \int_a^b p_X(s)\,ds$
The p.d.f. integrates to 1 over all states:
$\int_{-\infty}^{+\infty} p_X(s)\,ds = 1$
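The two integral identities above can be checked numerically; this is a minimal sketch using a midpoint Riemann sum (the choice of a standard normal p.d.f. as the example density, and the truncation of the infinite integral at ±10, are assumptions for illustration):

```python
import math

def normal_pdf(s, mu=0.0, sigma=1.0):
    """p.d.f. of N(mu, sigma^2), used here as an example density."""
    return math.exp(-(s - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def integrate(f, a, b, n=100_000):
    """Midpoint Riemann sum approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# Pr(-1 < X < 1) for a standard normal: about 0.6827
prob = integrate(normal_pdf, -1.0, 1.0)
# Integral over (effectively) all states: should be very close to 1
total = integrate(normal_pdf, -10.0, 10.0)
print(prob, total)
```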
Joint Probability and Joint p.d.f.
Joint probability of discrete random variables:
$\Pr_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$, where $x_i$ is any possible state of $X_i$
Joint p.d.f. of continuous random variables:
$p_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n)$, where $x_i$ is any possible state of $X_i$
Independence condition:
$\Pr_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n) = \Pr_{X_1}(x_1)\Pr_{X_2}(x_2)\cdots\Pr_{X_n}(x_n)$
$p_{X_1, X_2, \ldots, X_n}(x_1, x_2, \ldots, x_n) = p_{X_1}(x_1)p_{X_2}(x_2)\cdots p_{X_n}(x_n)$
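The independence condition for the discrete case can be illustrated by enumeration; a minimal sketch with two independent fair coins (an assumed example, not from the slides):

```python
from itertools import product

# Two independent fair coins: the joint probability of any pair of outcomes
# factorizes into the product of the marginals.
pr_x = {"head": 0.5, "tail": 0.5}
pr_y = {"head": 0.5, "tail": 0.5}

joint = {(x, y): pr_x[x] * pr_y[y] for x, y in product(pr_x, pr_y)}

# Every pair has probability 0.25, and the joint sums to 1
print(joint)
```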
Conditional Probability and p.d.f.
Conditional probability of discrete random variables:
$\Pr_{X_2|X_1}(x_2|x_1) = \Pr_{X_1,X_2}(x_1, x_2) / \Pr_{X_1}(x_1)$
Conditional p.d.f. of continuous random variables:
$p_{X_2|X_1}(x_2|x_1) = p_{X_1,X_2}(x_1, x_2) / p_{X_1}(x_1)$
Statistics: Expected Value and Variance
For a discrete random variable:
$E\{X\} = \sum_i s_i \Pr_X(s_i)$
$\mathrm{Var}\{X\} = \sum_i (s_i - E\{X\})^2 \Pr_X(s_i)$
For a continuous random variable:
$E\{X\} = \int x\,p_X(x)\,dx$
$\mathrm{Var}\{X\} = \int (x - E\{X\})^2 p_X(x)\,dx$
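The discrete-case sums can be computed directly; a short sketch for a fair six-sided die (an assumed example: $E\{X\} = 3.5$ and $\mathrm{Var}\{X\} = 35/12$):

```python
# Expected value and variance of a fair six-sided die,
# computed from the defining sums for the discrete case.
states = [1, 2, 3, 4, 5, 6]
pr = {s: 1 / 6 for s in states}

e_x = sum(s * pr[s] for s in states)                 # E{X} = 3.5
var_x = sum((s - e_x) ** 2 * pr[s] for s in states)  # Var{X} = 35/12
print(e_x, var_x)
```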
Normal Distribution of Single Random Variable
Notation: $X \sim N(\mu, \sigma^2)$
p.d.f.: $p_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
Expected value: $E\{X\} = \mu$
Variance: $\mathrm{Var}\{X\} = \sigma^2$
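A quick sampling check that the sample mean and variance of normal draws approach $\mu$ and $\sigma^2$; the parameter values and sample size below are arbitrary choices for illustration:

```python
import random

# Draw samples from N(mu, sigma^2) and compare the sample mean and
# variance with the stated E{X} = mu and Var{X} = sigma^2.
mu, sigma = 1.0, 2.0
rng = random.Random(42)
samples = [rng.gauss(mu, sigma) for _ in range(100_000)]

mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
print(mean, var)  # close to mu = 1.0 and sigma^2 = 4.0
```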
Stochastic Process
A stochastic process is a time series of random variables:
$\{X_t\} = \{\ldots, X_{t-1}, X_t, X_{t+1}, \ldots\}$
$X_t$: random variable; $t$: time stamp
Examples: audio signal, stock market
Causal Process
A stochastic process $\{X_t\}$ is causal if it has a finite history
A causal process can be represented by $X_1, X_2, \ldots, X_t, \ldots$
Stationary Process
A stochastic process $\{X_t\}$ is stationary if its joint distribution is invariant to time shifts, i.e., for any $n$, any times $t_1, t_2, \ldots, t_n$, and any shift $\tau$:
$\Pr_{X_{t_1}, X_{t_2}, \ldots, X_{t_n}}(x_1, x_2, \ldots, x_n) = \Pr_{X_{t_1+\tau}, X_{t_2+\tau}, \ldots, X_{t_n+\tau}}(x_1, x_2, \ldots, x_n)$
A stationary process is sometimes referred to as strictly stationary, in contrast with weak or wide-sense stationarity
Gaussian White Noise
White noise: $\{X_t\}$ obeys an independent identical distribution (i.i.d.)
Gaussian white noise: $X_t \sim N(\mu, \sigma^2)$, i.i.d.
Gaussian White Noise is a Stationary Process
Proof
For any $n$, any times $t_1, \ldots, t_n$, and any shift $\tau$:
$p_{X_{t_1}, \ldots, X_{t_n}}(x_1, \ldots, x_n) = \prod_{i=1}^{n} p_{X_{t_i}}(x_i) = \frac{1}{(\sqrt{2\pi}\,\sigma)^n}\exp\!\left(-\sum_{i=1}^{n}\frac{(x_i-\mu)^2}{2\sigma^2}\right) = p_{X_{t_1+\tau}, \ldots, X_{t_n+\tau}}(x_1, \ldots, x_n)$
The first equality uses independence; the last holds because every $X_t$ has the same $N(\mu, \sigma^2)$ density, so the joint p.d.f. does not depend on the time stamps.
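An empirical companion to the proof: across many independent realizations of Gaussian white noise, the samples observed at two different fixed times should follow the same distribution. This is a sketch only; the specific times, path length, and sample counts are arbitrary assumptions:

```python
import random

# Simulate many realizations of Gaussian white noise and compare the
# empirical means of the samples at time t = 10 and t = 60 (tau = 50);
# stationarity predicts they agree up to sampling error.
mu, sigma, n_paths, length = 0.0, 1.0, 20_000, 100
rng = random.Random(7)
paths = [[rng.gauss(mu, sigma) for _ in range(length)] for _ in range(n_paths)]

at_t = [p[10] for p in paths]      # samples of X_10 across realizations
at_t_tau = [p[60] for p in paths]  # samples of X_60 across realizations

mean_a = sum(at_t) / n_paths
mean_b = sum(at_t_tau) / n_paths
print(mean_a, mean_b)
```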
Temperature
Q1: Is the temperature within a day a stationary process?
Markov Chains
A causal process $\{X_t\}$ is a Markov chain if, for any $x_1, \ldots, x_t$:
$\Pr_{X_t|X_1,\ldots,X_{t-1}}(x_t|x_1, \ldots, x_{t-1}) = \Pr_{X_t|X_{t-k},\ldots,X_{t-1}}(x_t|x_{t-k}, \ldots, x_{t-1})$
$k$ is the order of the Markov chain
First-order Markov chain:
$\Pr_{X_t|X_1,\ldots,X_{t-1}}(x_t|x_1, \ldots, x_{t-1}) = \Pr_{X_t|X_{t-1}}(x_t|x_{t-1})$
Second-order Markov chain:
$\Pr_{X_t|X_1,\ldots,X_{t-1}}(x_t|x_1, \ldots, x_{t-1}) = \Pr_{X_t|X_{t-2},X_{t-1}}(x_t|x_{t-2}, x_{t-1})$
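Sampling a first-order chain needs only the current state; this minimal sketch uses a hypothetical two-state weather chain (the state names and transition probabilities are illustrative assumptions):

```python
import random

# Hypothetical two-state first-order Markov chain (illustrative numbers).
TRANSITIONS = {
    "Rain": {"Rain": 0.4, "Dry": 0.6},
    "Dry": {"Rain": 0.2, "Dry": 0.8},
}

def sample_path(start, steps, rng):
    """Sample a path; the next state depends only on the current one."""
    path = [start]
    for _ in range(steps):
        p_rain = TRANSITIONS[path[-1]]["Rain"]
        path.append("Rain" if rng.random() < p_rain else "Dry")
    return path

rng = random.Random(0)
path = sample_path("Dry", 10, rng)
print(path)
```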
Homogeneous Markov Chains
A k-th order Markov chain $\{X_t\}$ is homogeneous if the state transition probability is the same over time, i.e., for any $t$, any shift $\tau$, and any states $x_0, x_1, \ldots, x_k$:
$\Pr_{X_t|X_{t-k},\ldots,X_{t-1}}(x_k|x_0, \ldots, x_{k-1}) = \Pr_{X_{t+\tau}|X_{t+\tau-k},\ldots,X_{t+\tau-1}}(x_k|x_0, \ldots, x_{k-1})$
Q2: Does a homogeneous Markov chain imply a stationary process?
State Transition in Homogeneous Markov Chains
Suppose $\{X_t\}$ is a homogeneous k-th order Markov chain and $S$ is the set of all possible states (values) of $x_t$; then for any $k+1$ states $x_0, x_1, \ldots, x_k \in S$, the state transition probability
$\Pr_{X_t|X_{t-k},\ldots,X_{t-1}}(x_k|x_0, \ldots, x_{k-1})$
can be abbreviated to
$\Pr(x_k|x_0, \ldots, x_{k-1})$
Example of Markov Chain
Two states: ‘Rain’ and ‘Dry’. Transition probabilities:
Pr(‘Rain’|‘Rain’) = 0.4, Pr(‘Dry’|‘Rain’) = 0.6, Pr(‘Rain’|‘Dry’) = 0.2, Pr(‘Dry’|‘Dry’) = 0.8
Short Term Forecast
Initial (say, Wednesday) probabilities: Pr_Wed(‘Rain’) = 0.3, Pr_Wed(‘Dry’) = 0.7
What’s the probability of rain on Thursday?
Pr_Thur(‘Rain’) = Pr_Wed(‘Rain’)×Pr(‘Rain’|‘Rain’) + Pr_Wed(‘Dry’)×Pr(‘Rain’|‘Dry’) = 0.3×0.4 + 0.7×0.2 = 0.26
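The one-step forecast above is a two-term sum and is easy to verify directly:

```python
# One-step forecast with the slide's numbers:
# Pr_Thur('Rain') = Pr_Wed('Rain')*Pr('Rain'|'Rain')
#                 + Pr_Wed('Dry')*Pr('Rain'|'Dry')
p_rain_given_rain, p_rain_given_dry = 0.4, 0.2
p_rain_wed = 0.3

p_rain_thur = (p_rain_wed * p_rain_given_rain
               + (1 - p_rain_wed) * p_rain_given_dry)
print(p_rain_thur)  # approximately 0.26
```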
Pt(‘Rain’)= Prt-1(‘Rain’)xPr(‘Rain’|‘Rain’)+Prt-1(‘Dry’)xPr(‘Rain’|‘Dry’)= Prt-
1(‘Rain’)x0.4+(1– Prt-1(‘Rain’)x0.2=0.2+0.2xPrt(‘Rain’)
Pt(‘Rain’)= Prt-1(‘Rain’) => Prt-1(‘Rain’)=0.25, Prt-1(‘Dry’)=1-0.25=0.75
Condition of Stationary
steady state distribution
Rain Dry
0.60.4
0.2 0.8
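The fixed point of the recursion can be solved in closed form:

```python
# Solving the fixed-point condition Pr_t = Pr_{t-1} = p directly:
# p = 0.2 + 0.2 * p  =>  0.8 * p = 0.2  =>  p = 0.25
p_rain = 0.2 / (1 - 0.2)
p_dry = 1 - p_rain
print(p_rain, p_dry)  # 0.25 0.75
```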
Pt(‘Rain’) = 0.2+0.2xPrt-1(‘Rain’)
Pt(‘Rain’) – 0.25 = 0.2x(Prt-1(‘Rain’) – 0.25)
Pt(‘Rain’) = 0.2t-1x(Pr1(‘Rain’)-0.25)+0.25
Pt(‘Rain’) = 0.25 (converges to steady state distribution)
Steady-State Analysis
tlim
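The closed form above can be checked against direct iteration of the recursion (starting from the Wednesday value 0.3 used earlier):

```python
# Iterate p_t = 0.2 + 0.2 * p_{t-1} from p_1 = 0.3 and compare against
# the closed form p_t = 0.2**(t-1) * (p_1 - 0.25) + 0.25.
p1 = 0.3
p_iter = p1
for t in range(2, 31):
    p_iter = 0.2 + 0.2 * p_iter
    p_closed = 0.2 ** (t - 1) * (p1 - 0.25) + 0.25
    assert abs(p_iter - p_closed) < 1e-12  # the two forms agree

print(p_iter)  # converged to the steady state value 0.25
```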
Periodic Markov Chain
Two states ‘Rain’ and ‘Dry’ with Pr(‘Dry’|‘Rain’) = 1 and Pr(‘Rain’|‘Dry’) = 1
A periodic Markov chain never converges to a steady state
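For the deterministic switching chain, the rain probability obeys $p_t = 1 - p_{t-1}$, so it oscillates forever; a short sketch (the starting value 0.3 is an arbitrary choice):

```python
# Periodic chain: Pr('Dry'|'Rain') = Pr('Rain'|'Dry') = 1, so the rain
# probability flips each step and never settles (unless it starts at 0.5).
p = 0.3
history = []
for _ in range(6):
    history.append(round(p, 10))  # round away floating-point noise
    p = 1 - p

print(history)  # alternates between 0.3 and 0.7
```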