Lecture 8: Hidden Markov Models (HMMs)

Prepared by Michael Gutkin and Shlomi Haba

Originally presented at Yaakov Stein's DSPCSP Seminar, spring 2002

Modified by Benny Chor, using also some slides of Nir Friedman (Hebrew Univ.), for the Computational Genomics Course, Tel-Aviv Univ., Dec. 2002
Hidden Markov Models – Computational Genomics
Outline

Discrete Markov Models
Hidden Markov Models
Three major questions:
Q1. Computing the probability of a given observation.
A1. The Forward-Backward (Baum-Welch) DP algorithm.
Q2. Computing the most probable sequence of states, given an observation.
A2. The Viterbi DP algorithm.
Q3. Given an observation, learning the best model.
A3. Expectation Maximization (EM): a heuristic.
Markov Models

A discrete (finite) system:
N distinct states.
Begins (at time t=1) in some initial state.
At each time step (t=1,2,...) the system moves from the current state to the next state (possibly the same as the current state) according to transition probabilities associated with the current state.

This kind of system is called a Discrete Markov Model.
Discrete Markov Model

Example: a Discrete Markov Model with 5 states.
Each aij represents the probability of moving from state i to state j.
The aij are given in a matrix A = {aij}.
The probability of starting in state i is πi; the vector π represents these start probabilities.
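As a concrete illustration of the definition above, here is a minimal Python sketch of a discrete Markov model run for T steps; the three states and all probability values are invented for the example.

```python
import numpy as np

# Illustrative model: start vector pi and transition matrix A = {a_ij},
# where a_ij = P(next state = j | current state = i). All numbers made up.
rng = np.random.default_rng(0)

pi = np.array([0.5, 0.3, 0.2])          # start probabilities (the vector pi)
A = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])

def sample_chain(pi, A, T):
    """Run the system for T time steps: pick the initial state from pi,
    then move according to the row of A for the current state."""
    states = [rng.choice(len(pi), p=pi)]
    for _ in range(1, T):
        states.append(rng.choice(A.shape[0], p=A[states[-1]]))
    return states

path = sample_chain(pi, A, T=10)
print(path)
```

Each row of A sums to 1, so every step is a valid probability distribution over next states.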
Types of Models

Ergodic model: strongly connected, i.e., there is a directed path with positive probabilities from each state i to each state j (but not necessarily a complete directed graph).
Types of Models (cont.)

Left-to-Right (LR) model: the index of the state is non-decreasing with time.
Discrete Markov Model – Example
States – Rainy:1, Cloudy:2, Sunny:3
Matrix A –
Problem – given that the weather on day 1 (t=1) is sunny (3), what is the probability of the observation sequence O:
Discrete Markov Model – Example (cont.)
The answer is -
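The slide's matrix A and the worked answer were lost in extraction, so here is a hedged sketch of the computation: the probability of a state sequence under a discrete Markov model is the product of the transition probabilities along it. The matrix values below are the classic weather-example numbers often used with this model (Rainy/Cloudy/Sunny); treat them as an assumption, not the slide's actual table.

```python
import numpy as np

# Hypothetical transition matrix (rows/cols: 0=Rainy, 1=Cloudy, 2=Sunny);
# the slide's real matrix A was not reproduced in the transcript.
A = np.array([[0.4, 0.3, 0.3],
              [0.2, 0.6, 0.2],
              [0.1, 0.1, 0.8]])

def chain_prob(states, A, start_prob=None):
    """P(O|model) for a state sequence O.

    If start_prob is None, the first state is taken as given (P=1),
    matching the slide's 'given that day 1 is sunny' setup.
    """
    p = 1.0 if start_prob is None else start_prob[states[0]]
    for s, t in zip(states, states[1:]):
        p *= A[s, t]          # multiply one transition probability per step
    return p

# Example: sunny, sunny, rainy, given day 1 is sunny (0-indexed: 2, 2, 0)
p = chain_prob([2, 2, 0], A)
print(p)  # 0.8 * 0.1 = 0.08 (up to float rounding)
```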
Hidden Markov Models (probabilistic finite state automata)
Often we face scenarios where states cannot be directly observed.
We need an extension: Hidden Markov Models.

[Figure: a 4-state HMM; arrows a11, a22, a33, a44, a12, a23, a34 mark state transitions, and arrows b11 ... b14 mark the outputs generating the observed phenomenon.]
aij are state transition probabilities.
bik are observation (output) probabilities.
b11 + b12 + b13 + b14 = 1,
b21 + b22 + b23 + b24 = 1, etc.
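The two ingredients above (transition probabilities aij and output probabilities bik, each row summing to 1) can be sketched as a runnable HMM simulation. The 4-state, 4-symbol sizes mirror the figure, but all numeric values are invented for the example.

```python
import numpy as np

# Hedged HMM sketch: the hidden state moves by A, and at each step an
# observation symbol is emitted by B, where b_ik = P(observe k | state i).
rng = np.random.default_rng(1)

pi = np.array([1.0, 0.0, 0.0, 0.0])     # always start in state 1
A = np.array([[0.6, 0.4, 0.0, 0.0],     # left-to-right transitions
              [0.0, 0.6, 0.4, 0.0],
              [0.0, 0.0, 0.6, 0.4],
              [0.0, 0.0, 0.0, 1.0]])
B = np.array([[0.7, 0.1, 0.1, 0.1],     # each row sums to 1:
              [0.1, 0.7, 0.1, 0.1],     # b_i1 + b_i2 + b_i3 + b_i4 = 1
              [0.1, 0.1, 0.7, 0.1],
              [0.1, 0.1, 0.1, 0.7]])

def sample_hmm(pi, A, B, T):
    """Return (hidden states, observations) for a run of length T.
    Only the observations are visible; the states stay hidden."""
    q = rng.choice(4, p=pi)
    states, obs = [], []
    for _ in range(T):
        states.append(q)
        obs.append(rng.choice(4, p=B[q]))
        q = rng.choice(4, p=A[q])
    return states, obs

states, obs = sample_hmm(pi, A, B, T=8)
print(obs)   # the observed phenomenon; 'states' is what stays hidden
```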
Example: Dishonest Casino
Actually, what is hidden in this model?
Biological Example: CpG Islands
In the human genome, CpG dinucleotides are relatively rare.
CpG pairs undergo a process called methylation that modifies the C nucleotide.
A methylated C can (with relatively high probability) mutate to a T.
Promoter regions are CpG rich.
These regions are not methylated, and thus mutate less often.
Such regions are called CpG islands.
CpG Islands

We construct two Markov chains: one for CpG-rich regions and one for CpG-poor regions.
Using observations from about 60,000 nucleotides, we estimate two models, + and -.
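A common way to use the two chains is to score a sequence by the log-odds of the '+' (CpG-rich) model against the '-' (CpG-poor) model, summed over adjacent nucleotide pairs. This is a minimal sketch of that idea; the transition values below are invented for the example, since the slides' estimated tables were not reproduced in the transcript.

```python
import math

# Hypothetical transition probabilities for a few dinucleotide pairs;
# the real + and - tables would cover all 16 pairs.
plus  = {('C', 'G'): 0.274, ('C', 'A'): 0.2, ('A', 'C'): 0.3}
minus = {('C', 'G'): 0.078, ('C', 'A'): 0.3, ('A', 'C'): 0.2}

def log_odds(seq, plus, minus, default=0.25):
    """Sum of log(a+_st / a-_st) over adjacent pairs in seq.
    A positive score favors the CpG-rich (+) model; pairs missing from
    the toy tables fall back to a uniform default."""
    score = 0.0
    for s, t in zip(seq, seq[1:]):
        score += math.log(plus.get((s, t), default) /
                          minus.get((s, t), default))
    return score

print(log_odds("ACG", plus, minus))  # positive: the C->G step is much
                                     # likelier under the + model
```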
HMMs – Question I

Given an observation sequence O = (O1 O2 O3 … OT) and a model M = {A, B, π}, how do we efficiently compute P(O|M), the probability that the given model M produces the observation O in a run of length T?

This probability can be viewed as a measure of the quality of the model M. Viewed this way, it enables discrimination/selection among alternative models.
HMM – Question II (Harder)

Given an observation sequence O = (O1 O2 O3 … OT) and a model M = {A, B, π}, how do we efficiently compute the most probable sequence(s) of states, Q?

That is, the sequence of states Q = (Q1 Q2 Q3 … QT) which maximizes P(O|Q,M), the probability that the given model M produces the given observation O when it goes through the specific sequence of states Q.

Recall that given a model M, a sequence of observations O, and a sequence of states Q, we can efficiently compute P(O|Q,M) (though we should watch out for numeric underflows).
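The computation just mentioned, P(O|Q,M) for a fixed state path Q, is a simple product of output probabilities; doing it in log space avoids the numeric underflow the slide warns about. A minimal sketch, with a toy output matrix B assumed for the example:

```python
import math

# Hypothetical 2-state, 2-symbol output matrix b_ik = P(observe k | state i).
B = [[0.9, 0.1],
     [0.2, 0.8]]

def log_prob_obs_given_path(O, Q, B):
    """log P(O|Q,M) = sum over t of log b_{Q_t, O_t}.
    Summing logs instead of multiplying raw probabilities keeps long
    sequences from underflowing to 0.0."""
    return sum(math.log(B[q][o]) for q, o in zip(Q, O))

lp = log_prob_obs_given_path(O=[0, 1, 1], Q=[0, 1, 1], B=B)
# exp(lp) equals 0.9 * 0.8 * 0.8, the direct product
```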
HMM – Question III (Hardest)

Given an observation sequence O = (O1 O2 O3 … OT) and a class of models, each of the form M = {A, B, π}, which specific model "best" explains the observations?

A solution to Question I enables the efficient computation of P(O|M) (the probability that a specific model M produces the observation O).

Question III can be viewed as a learning problem: we want to use the sequence of observations in order to "train" an HMM and learn the optimal underlying model parameters (transition and output probabilities).
HMM – Recognition (Question I)

For a given model M = {A, B, π} and a given state sequence Q1 Q2 Q3 … QT, the probability of an observation sequence O1 O2 O3 … OT is
P(O|Q,M) = bQ1O1 bQ2O2 bQ3O3 … bQTOT

For a given hidden Markov model M = {A, B, π}, the probability of the state sequence Q1 Q2 Q3 … QT is (the initial probability of Q1 is taken to be πQ1)
P(Q|M) = πQ1 aQ1Q2 aQ2Q3 aQ3Q4 … aQT-1QT

So, for a given hidden Markov model M, the probability of an observation sequence O1 O2 O3 … OT is obtained by summing the product of these two terms over all possible state sequences Q.
HMM – Recognition (cont.)
P(O|M) = ΣQ P(O|Q,M) P(Q|M)
       = ΣQ πQ1 bQ1O1 aQ1Q2 bQ2O2 aQ2Q3 bQ3O3 …

This requires summing over exponentially many paths (N^T paths for N states and T steps), but it can be made more efficient.
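The standard way to make this efficient is the forward recursion: instead of enumerating all N^T paths, compute column by column the quantities alpha_t(i) = P(O1..Ot, Qt = i | M), which costs O(N^2 T). A minimal sketch with an illustrative 2-state model:

```python
import numpy as np

# Illustrative model parameters (all values assumed for the example).
pi = np.array([0.6, 0.4])               # start probabilities
A = np.array([[0.7, 0.3],               # transition probabilities a_ij
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],               # output probabilities b_ik
              [0.2, 0.8]])

def forward(O, pi, A, B):
    """Return P(O|M) via the forward recursion."""
    alpha = pi * B[:, O[0]]             # alpha_1(i) = pi_i * b_i(O_1)
    for o in O[1:]:
        # alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(O_{t+1})
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()                  # sum over the final hidden state

p = forward([0, 1, 0], pi, A, B)
print(p)
```

Each step reuses the previous column alpha_t, which is exactly how the exponential sum over paths collapses to a polynomial-time computation.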