Transcript
11/1/2007
1
Hidden Markov Models & Their Applications in Bioinformatics
Bioinformatics Group, Electrical and Computer Engineering Department
University of Tehran, 2008
By: Mahdi Pakdaman
Pakdaman@gmail.com, 1387/7/29
- Markov models
- Hidden Markov models
  - Definition
  - Three basic problems
    - Forward/Backward algorithm
    - Viterbi algorithm
    - Baum-Welch estimation algorithm
  - Issues
  - Applications in Bioinformatics
Forecast the weather state, given the current weather variables.
Urn and Ball Model
- N urns containing colored balls
- M distinct colors of balls
- Each urn has a (possibly) different distribution of colors

Sequence generation algorithm:
1. Pick an initial urn according to some random process.
2. Randomly pick a ball from the urn and then replace it.
3. Select another urn according to a random selection process associated with the current urn.
4. Repeat steps 2 and 3.
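The four steps above can be sketched directly in Python; the two-urn, two-color setup and all probability values below are illustrative assumptions, not from the slides:

```python
import random

def generate_sequence(pi, A, B, colors, length, seed=0):
    """Simulate the urn-and-ball process: pick an initial urn from pi,
    draw (and replace) a ball from the current urn's color distribution B,
    then move to the next urn according to the transition matrix A."""
    rng = random.Random(seed)
    urns = list(range(len(pi)))
    q = rng.choices(urns, weights=pi)[0]             # step 1: initial urn
    observations = []
    for _ in range(length):
        ball = rng.choices(colors, weights=B[q])[0]  # step 2: draw a ball, replace it
        observations.append(ball)
        q = rng.choices(urns, weights=A[q])[0]       # step 3: move to the next urn
    return observations                              # step 4: repeated above

# Two urns, two colors: urn 0 favors red, urn 1 favors blue (made-up values).
pi = [0.6, 0.4]
A  = [[0.7, 0.3], [0.4, 0.6]]
B  = [[0.9, 0.1], [0.2, 0.8]]
print(generate_sequence(pi, A, B, ["red", "blue"], 5))
```

Note that only the ball colors are observed; the sequence of urns (states) stays hidden, which is exactly the situation an HMM models.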
Elements of Hidden Markov Models
- N: the number of hidden states; Q: the set of states, Q = {1, 2, ..., N}
- M: the number of symbols; V: the set of symbols, V = {1, 2, ..., M}
- A: the state-transition probability matrix: $a_{ij} = P(q_{t+1} = j \mid q_t = i)$, $1 \le i, j \le N$
- B: the observation probability distribution: $b_j(k) = P(o_t = v_k \mid q_t = j)$, $1 \le j \le N$, $1 \le k \le M$
- π: the initial state distribution: $\pi_i = P(q_1 = i)$, $1 \le i \le N$
- λ = (A, B, π): the entire model
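As a rough sketch, λ = (A, B, π) can be held as plain Python lists, with the constraints above (each distribution sums to 1) checked explicitly; the 2-state, 2-symbol numbers are illustrative:

```python
def validate_hmm(pi, A, B):
    """Check that lambda = (A, B, pi) is a valid HMM parameterisation:
    pi is a distribution over the N states, each row of A is a
    distribution over next states, and each row of B is a distribution
    over the M symbols."""
    ok = abs(sum(pi) - 1.0) < 1e-9
    ok &= all(abs(sum(row) - 1.0) < 1e-9 for row in A)
    ok &= all(abs(sum(row) - 1.0) < 1e-9 for row in B)
    return ok

# N = 2 hidden states, M = 2 symbols (illustrative values).
pi = [0.6, 0.4]                    # initial state distribution
A  = [[0.7, 0.3], [0.4, 0.6]]      # state-transition matrix
B  = [[0.9, 0.1], [0.2, 0.8]]      # per-state observation distributions
print(validate_hmm(pi, A, B))      # True
```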
Three Basic Problems
1. EVALUATION: given observation O = (o_1, o_2, ..., o_T) and model λ = (A, B, π), efficiently compute P(O | λ).
   - Hidden states complicate the evaluation.
   - Given two models λ_1 and λ_2, this can be used to choose the better one.
2. DECODING: given observation O = (o_1, o_2, ..., o_T) and model λ, find the optimal state sequence q = (q_1, q_2, ..., q_T).
   - An optimality criterion has to be decided (e.g., maximum likelihood).
3. LEARNING: given O = (o_1, o_2, ..., o_T), estimate the model parameters λ = (A, B, π) that maximize P(O | λ).
Problem: Compute P(o_1, o_2, ..., o_T | λ).
Algorithm: Let q = (q_1, q_2, ..., q_T) be a state sequence.
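Summing P(O, q | λ) over all N^T state sequences q is intractable; the forward algorithm computes the same quantity in O(N²T) time. A minimal unscaled sketch:

```python
def forward(pi, A, B, obs):
    """Forward algorithm: alpha_t(i) = P(o_1..o_t, q_t = i | lambda).
    Returns P(O | lambda) by dynamic programming in O(N^2 T) time
    instead of summing over all N^T state sequences."""
    N = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(N)]              # initialisation
    for o in obs[1:]:                                             # induction
        alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * B[j][o]
                 for j in range(N)]
    return sum(alpha)                                             # termination

# Toy model: always start in state 0, states emit their own symbol.
p = forward([1.0, 0.0],
            [[0.5, 0.5], [0.5, 0.5]],
            [[1.0, 0.0], [0.0, 1.0]],
            [0, 1])
print(p)  # 0.5
```

On long sequences the alpha values underflow; practical implementations rescale alpha at each step or work in log space.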
Baum-Welch: Update Rules
- $\bar\pi_i = \gamma_1(i)$: expected frequency in state i at time t = 1.
- $\bar a_{ij} = \dfrac{\sum_{t=1}^{T-1} \xi_t(i,j)}{\sum_{t=1}^{T-1} \gamma_t(i)}$: (expected number of transitions from state i to state j) / (expected number of transitions from state i).
- $\bar b_j(k) = \dfrac{\sum_{t=1,\ o_t = v_k}^{T} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}$: (expected number of times in state j observing symbol $v_k$) / (expected number of times in state j).
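These update rules can be sketched in plain Python for a single observation sequence. This unscaled version is for illustration only; practical implementations rescale alpha and beta to avoid underflow on long sequences:

```python
def baum_welch_step(pi, A, B, obs):
    """One Baum-Welch re-estimation step: compute gamma_t(i) and
    xi_t(i,j) via the forward/backward variables, then apply the
    three update rules."""
    N, T = len(pi), len(obs)
    # forward: alpha[t][i] = P(o_1..o_t, q_t = i | lambda)
    alpha = [[0.0] * N for _ in range(T)]
    for i in range(N):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(N):
            alpha[t][j] = sum(alpha[t-1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
    # backward: beta[t][i] = P(o_{t+1}..o_T | q_t = i, lambda)
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(A[i][j] * B[j][obs[t+1]] * beta[t+1][j] for j in range(N))
    p_obs = sum(alpha[T-1][i] for i in range(N))
    # gamma_t(i) = P(q_t = i | O); xi_t(i,j) = P(q_t = i, q_{t+1} = j | O)
    gamma = [[alpha[t][i] * beta[t][i] / p_obs for i in range(N)] for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t+1]] * beta[t+1][j] / p_obs
            for j in range(N)] for i in range(N)] for t in range(T - 1)]
    # the three update rules
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T-1)) /
              sum(gamma[t][i] for t in range(T-1))
              for j in range(N)] for i in range(N)]
    M = len(B[0])
    new_B = [[sum(gamma[t][j] for t in range(T) if obs[t] == k) /
              sum(gamma[t][j] for t in range(T))
              for k in range(M)] for j in range(N)]
    return new_pi, new_A, new_B
```

Iterating this step is guaranteed to not decrease P(O | λ), which is why it converges, possibly to a local maximum (see the issues below).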
Some Issues
- Limitations imposed by the Markov chain
- Scalability
- Learning:
  - Initialisation
  - Model order
  - Local maxima
  - Weighting training sequences
HMM Applications
- Classification (e.g., profile HMMs): build an HMM for each class, then classify a sequence using Bayes' rule.
- Multiple sequence alignment: build an HMM based on a set of sequences, then decode each sequence to find a multiple alignment.
- Segmentation (e.g., gene finding): use different states to model different regions, then decode a sequence to reveal the region boundaries.
HMMs for Classification

$p(C \mid X) = \dfrac{p(X \mid C)\, p(C)}{p(X)}$

$C^* = \arg\max_{C \in \{C_1, \ldots, C_k\}} p(X \mid C)\, p(C)$
p(X|C) is modeled by a profile HMM built specifically for C
Assuming example sequences are available for C
E.g., Protein families
Assign a family to X
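The decision rule can be sketched as follows; the family names and log-likelihood scores are made-up illustrations, and in practice log p(X|C) would come from running each family's profile HMM (e.g., the forward algorithm) on X:

```python
import math

def classify(log_likelihoods, log_priors):
    """Bayes-rule classification: pick the class C maximizing
    p(X|C) * p(C), equivalently log p(X|C) + log p(C).
    log_likelihoods: dict class -> log p(X|C) from that class's profile HMM
    log_priors:      dict class -> log p(C)"""
    return max(log_likelihoods, key=lambda c: log_likelihoods[c] + log_priors[c])

# Hypothetical scores for one query sequence against two protein-family HMMs.
ll = {"globin": -120.5, "kinase": -130.2}
prior = {"globin": math.log(0.5), "kinase": math.log(0.5)}
print(classify(ll, prior))  # globin
```

Working in log space avoids the underflow that multiplying many small probabilities would cause.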
HMMs for Motif Finding
- Given a set of sequences S = {X_1, ..., X_k}
- Design an HMM with two kinds of states:
  - Background states: for outside a motif
  - Motif states: for modeling a motif
- Train the HMM, e.g., using Baum-Welch (finding the HMM that maximizes the probability of S).
- The "motif part" of the HMM gives a motif model (e.g., a PWM).
- The HMM can be used to scan any sequence (including X_i) to figure out where the motif is.
- We may also decode each sequence X_i to obtain a set of subsequences matched by the motif (e.g., a multiset of k-mers).
HMMs for Multiple Alignment
- Given a set of sequences S = {X_1, ..., X_k}
- Train an HMM, e.g., using Baum-Welch (finding the HMM that maximizes the probability of S).
- Decode each sequence X_i.
- Assemble the Viterbi paths to form a multiple alignment: symbols belonging to the same state are aligned to each other.
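Decoding each X_i means running the Viterbi algorithm. A minimal sketch, using integer state indices rather than profile-HMM match/insert/delete states for brevity:

```python
def viterbi(pi, A, B, obs):
    """Viterbi algorithm: find the single most likely state path
    q* = argmax_q P(q, O | lambda) by dynamic programming."""
    N = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(N)]  # best score ending in each state
    back = []                                         # backpointers per time step
    for o in obs[1:]:
        prev = [max(range(N), key=lambda i: delta[i] * A[i][j]) for j in range(N)]
        delta = [delta[prev[j]] * A[prev[j]][j] * B[j][o] for j in range(N)]
        back.append(prev)
    path = [max(range(N), key=lambda i: delta[i])]    # best final state
    for prev in reversed(back):                       # trace backpointers
        path.append(prev[path[-1]])
    return path[::-1]

# Toy model: each state deterministically emits its own symbol.
print(viterbi([0.5, 0.5],
              [[0.9, 0.1], [0.1, 0.9]],
              [[1.0, 0.0], [0.0, 1.0]],
              [0, 0, 1, 1]))  # [0, 0, 1, 1]
```

Like the forward algorithm, a production version works in log space to avoid underflow.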
HMM-based Gene Finding
- Design two types of states: "within gene" states and "outside gene" states.
- Use known genes to estimate the HMM.
- Decode a new sequence to reveal which parts are genes.
- Example software: VEIL.

[VEIL architecture diagram: exit via a 5' splice site or one of the three stop codons (taa, tag, tga).]
Solutions to the Local Maxima Problem
- Repeat with different initializations.
- Start with the most reasonable initial model.
- Simulated annealing (slow down the convergence speed).
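The first remedy can be sketched generically; `train`, `score`, and `make_random_model` are placeholders (assumptions, not from the slides) standing in for a Baum-Welch trainer, a likelihood function, and a random initialiser:

```python
import random

def best_of_restarts(train, score, make_random_model, n_restarts=10, seed=0):
    """Mitigate local maxima by running training from several random
    initialisations and keeping the model with the highest score
    (e.g., log-likelihood on the training set)."""
    rng = random.Random(seed)
    best_model, best_score = None, float("-inf")
    for _ in range(n_restarts):
        model = train(make_random_model(rng))  # e.g., Baum-Welch to convergence
        s = score(model)                       # e.g., log P(O | model)
        if s > best_score:
            best_model, best_score = model, s
    return best_model, best_score
```

This does not guarantee finding the global maximum; it only raises the chance that at least one starting point lies in the global maximum's basin of attraction.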
Local Maxima: Illustration
[Figure: a likelihood surface with a global maximum and local maxima; a good starting point converges to the global maximum, a bad starting point to a local one.]
Optimal Model Construction

$p(HMM \mid X) = \dfrac{p(X \mid HMM)\, p(HMM)}{p(X)}$

$HMM^* = \arg\max_{HMM} p(HMM \mid X) = \arg\max_{HMM} p(X \mid HMM)\, p(HMM)$

Bayesian model selection:
- p(HMM) should prefer simpler models (i.e., more constrained, fewer states, fewer transitions).
- p(HMM) could reflect our prior on the parameters.
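One simple way to realise this preference is an explicit complexity penalty in log p(HMM). The linear penalty and the weight `alpha` below are illustrative assumptions, not from the slides:

```python
def model_selection_score(log_likelihood, n_states, n_transitions, alpha=1.0):
    """log p(X|HMM) + log p(HMM), with a prior that penalises
    complexity: more states / more transitions => lower prior.
    alpha controls how strongly simpler models are preferred."""
    log_prior = -alpha * (n_states + n_transitions)  # assumed penalty form
    return log_likelihood + log_prior

# A larger model must gain enough likelihood to beat a smaller one.
small = model_selection_score(-105.0, n_states=3, n_transitions=6)
large = model_selection_score(-100.0, n_states=6, n_transitions=20)
print(small > large)  # True: -114.0 vs -126.0
```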
Sequence Weighting
- Avoid over-counting similar sequences from the same organisms.
- Typically compute a weight for a sequence based on an evolutionary tree.
- Many ways to incorporate the weights, e.g.:
  - Unequal likelihood
  - Unequal weight contribution in parameter estimation
Toolkits for HMM
- Hidden Markov Model Toolkit (HTK): http://htk.eng.cam.ac.uk/
- Hidden Markov Model (HMM) Toolbox for Matlab: http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html
- Training HMM for ASR: http://cslu.cse.ogi.edu/tutordemos/nnet_training/tutorial.html#1.1_Setup
Reference: L. Rabiner and B. Juang, "An introduction to hidden Markov models," IEEE ASSP Magazine, pp. 4-16, Jan. 1986.