Pairwise Sequence Comparison Stat 246, Spring 2002, Week 5,
Jan 25, 2016
Sequence comparison: topics
General concepts
Dot plots
Global alignments
Scoring matrices
Gap penalties
Dynamic programming
Chance or common ancestry?
Dot Plot
This is the earliest, simplest and most complete method for comparing two sequences
It is possible to filter the plot to minimise noise whilst preserving the obvious relationship
This plot can identify
• regions of similarity
• internal repeats
• rearrangement events
[Figure: dot plot of the two example sequences.]
Sequence 1 (a), down: AGCACACA
Sequence 2 (b), along: ACACACTA
A dot goes where the two sequences match.
(Add a "guard" row and column.)
Connect the dots along diagonals.
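As a minimal sketch of the basic construction, the dot matrix for the two example sequences could be computed as follows. The function names `dot_matrix` and `render`, and the text rendering with `*` and `.`, are illustrative choices and not part of the original notes.

```python
def dot_matrix(seq_down, seq_along):
    """Return a matrix holding True wherever the two sequences match."""
    return [[x == y for y in seq_along] for x in seq_down]


def render(seq_down, seq_along):
    """Draw the matrix as text: '*' marks a dot, '.' marks no match."""
    rows = ["  " + " ".join(seq_along)]
    for x, row in zip(seq_down, dot_matrix(seq_down, seq_along)):
        rows.append(x + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(rows)


a = "AGCACACA"  # sequence 1, down
b = "ACACACTA"  # sequence 2, along
print(render(a, b))
```

Runs of `*` along a diagonal correspond to stretches where the two sequences agree without gaps.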
Extensions to dot plots
Modern dot plots are more sophisticated, using the notions of
window: size of the diagonal strip centered on an entry, over which matching is accumulated, and
stringency: the extent of agreement required over the window before a dot is placed at the central entry.
E.g., for a window of size 5 we might require at least 3 matches, and then we put a dot in the central spot. More complex scoring rules can be used.
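The window and stringency rule can be sketched as below. This is only an illustration under stated assumptions: the window size is taken to be odd, window positions that fall off the end of either sequence are counted as mismatches, and matching is by identity rather than by a scoring rule. The name `windowed_dots` is hypothetical.

```python
def windowed_dots(seq_down, seq_along, window=5, stringency=3):
    """Return the set of (i, j) entries that receive a dot.

    A dot is placed at (i, j) when the diagonal window centered there
    contains at least `stringency` identical positions.
    """
    half = window // 2
    dots = set()
    for i in range(len(seq_down)):
        for j in range(len(seq_along)):
            matches = 0
            for k in range(-half, half + 1):
                p, q = i + k, j + k
                # Assumption: positions outside either sequence never match.
                if 0 <= p < len(seq_down) and 0 <= q < len(seq_along):
                    matches += seq_down[p] == seq_along[q]
            if matches >= stringency:
                dots.add((i, j))
    return dots


a = "AGCACACA"
b = "ACACACTA"
print(sorted(windowed_dots(a, b, window=5, stringency=3)))
```

With `window=1, stringency=1` the rule reduces to the plain dot plot; raising the stringency for a fixed window can only remove dots.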
Human globin vs. human myoglobin
[Figure: dot plot; one axis is beta-human.pep, residues 1 to 146.]
Human LDL receptor vs. itself (w=30, s=9)
[Figure: dot plot; ldlrecep.pep, residues 1 to 860, on both axes.]
Human LDL receptor vs. itself (40, 15)
[Figure: dot plot; COMPARE, window 40, stringency 15.0, 5,287 points.]
Human LDL receptor vs. itself (40, 17.5)
[Figure: dot plot; COMPARE, window 40, stringency 17.5, 3,079 points.]
Human LDL receptor vs. itself (40, 20)
[Figure: dot plot; COMPARE, window 40, stringency 20.0, 2,295 points.]
Plasmodium falciparum MSP3 vs. itself (30,9)
[Figure: dot plot; msp3.pep, residues 1 to 380, on both axes.]
Plasmodium falciparum MSP3 vs. itself (20,9)
[Figure: dot plot; COMPARE, window 20, stringency 9.0, 15,619 points.]
Plasmodium falciparum MSP3 vs. itself (10,9)
[Figure: dot plot; COMPARE, window 10, stringency 9.0, 1,263 points.]
Global alignment
An alignment of two sequences a and b is an arrangement of a and b by position, where a and b can be padded with gap symbols to achieve the same length:
a: AGCACAC-A    or    a: AG-CACACA
b: A-CACACTA          b: ACACACT-A
If we read the alignment column-wise, we have a protocol of edit operations that lead from a to b.
Left:  Match (A,A)      Right: Match (A,A)
       Delete (G,-)            Replace (G,C)
       Match (C,C)             Insert (-,A)
       Match (A,A)             Match (C,C)
       Match (C,C)             Match (A,A)
       Match (A,A)             Match (C,C)
       Match (C,C)             Replace (A,T)
       Insert (-,T)            Delete (C,-)
       Match (A,A)             Match (A,A)
The left-hand alignment shows one Delete and one Insert; the other edit operations are Matches.
The right-hand alignment shows one Insert, one Delete and two Replaces; the remaining operations are Matches.
Cost (scoring) of global alignments; optimal global alignments
Next we turn the edit protocol into a measure of distance by assigning a “cost” or “weight” S to each operation. For example, for arbitrary characters u,v from the alphabet A we may define
S(u,u) = 0; S(u,v) = 1 for u ≠ v; S(u,-) = S(-,v) = 1. (Unit Cost)
This scheme is known as the Levenshtein distance, also called the unit cost model. Its predominant virtue is its simplicity. In general, more sophisticated cost models must be used. For example, replacing an amino acid by a biochemically similar one should weigh less than a replacement by an amino acid with totally different properties. Details shortly. Now we are ready to define the most important notion for sequence analysis:
The cost of an alignment of two sequences a and b is the sum of the costs of all the edit operations that lead from a to b.
An optimal alignment of a and b is an alignment which has minimal cost among all possible alignments.
The edit distance of a and b is the cost of an optimal alignment of a and b under a cost function S. We denote it by d(a,b).
Using the unit cost model for S in our previous example, we obtain the following cost:
a: AGCACAC-A    or    a: AG-CACACA
b: A-CACACTA          b: ACACACT-A
cost: 2               cost: 4
Here it is easily seen that the left-hand alignment is optimal under the unit cost model, and hence the edit distance d(a,b) = 2.
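The cost of a given alignment under the unit cost model can be checked with a few lines of code. This is a minimal sketch; the names `unit_cost` and `alignment_cost` are illustrative, and the two padded rows are assumed to be given as equal-length strings with `-` as the gap symbol.

```python
def unit_cost(u, v):
    """Unit cost model: 0 for a match, 1 for a replace, insert or delete."""
    return 0 if u == v else 1


def alignment_cost(row_a, row_b, cost=unit_cost):
    """Sum the cost of every column of an alignment given as two padded rows."""
    if len(row_a) != len(row_b):
        raise ValueError("padded rows must have the same length")
    return sum(cost(u, v) for u, v in zip(row_a, row_b))


print(alignment_cost("AGCACAC-A", "A-CACACTA"))  # left-hand alignment
print(alignment_cost("AG-CACACA", "ACACACT-A"))  # right-hand alignment
```

Passing a different `cost` function gives the cost under a more general model; finding the alignment of minimal cost is a separate problem.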
More general scores = - costs: see later.
C 9
S -1 4
T -1 1 5
P -3 -1 -1 7
A 0 1 0 -1 4
G -3 0 -2 -2 0 6
N -3 1 0 -2 -2 0 6
D -3 0 -1 -1 -2 -1 1 6
E -4 0 -1 -1 -1 -2 0 2 5
Q -3 0 -1 -1 -1 -2 0 0 2 5
H -3 -1 -2 -2 -2 -2 1 -1 0 0 8
R -3 -1 -1 -2 -1 -2 0 -2 0 1 0 5
K -3 0 -1 -1 -1 -2 0 -1 1 1 -1 2 5
M -1 -1 -1 -2 -1 -3 -2 -3 -2 0 -2 -1 -1 5
I -1 -2 -1 -3 -1 -4 -3 -3 -3 -3 -3 -3 -3 1 4
L -1 -2 -1 -3 -1 -4 -3 -4 -3 -2 -3 -2 -2 2 2 4
V -1 -2 0 -2 0 -3 -3 -3 -2 -2 -3 -3 -2 1 3 1 4
F -2 -2 -2 -4 -2 -3 -3 -3 -3 -3 -1 -3 -3 0 0 0 -1 6
Y -2 -2 -2 -3 -2 -3 -2 -3 -2 -1 2 -2 -2 -1 -1 -1 -1 3 7
W -2 -3 -2 -4 -3 -2 -4 -4 -3 -2 -2 -3 -3 -1 -3 -2 -3 1 2 11
C S T P A G N D E Q H R K M I L V F Y W
134 LQQGELDLVMTSDILPRSELHYSPMFDFEVRLVLAPDHPLASKTQITPEDLASETLLI
    |     |||         |       |        ||||||    |    || ||
137 LDSNSVDLVLMGVPPRNVEVEAEAFMDNPLVVIAPPDHPLAGERAISLARLAEETFVM
D:D = +6
D:R = -2
From Henikoff 1996
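To illustrate how such a matrix is used, the sketch below scores a short gap-free stretch by adding up one matrix entry per column. Only five entries are included, copied from the matrix above (D:D = 6, D:R = -2, L:L = 4, V:V = 4, L:V = 1); the names `MATRIX_EXCERPT`, `pair_score` and `ungapped_score` are hypothetical, and the four columns are positions 6 to 9 of the alignment fragment shown above.

```python
# A few entries copied from the matrix above; a real application would
# need the full 20 x 20 table.
MATRIX_EXCERPT = {
    ("D", "D"): 6,
    ("D", "R"): -2,
    ("L", "L"): 4,
    ("V", "V"): 4,
    ("L", "V"): 1,
}


def pair_score(x, y, matrix=MATRIX_EXCERPT):
    """Look up a symmetric substitution score, whichever order it is stored in."""
    if (x, y) in matrix:
        return matrix[(x, y)]
    return matrix[(y, x)]


def ungapped_score(s, t, matrix=MATRIX_EXCERPT):
    """Score a gap-free alignment of two equal-length fragments."""
    if len(s) != len(t):
        raise ValueError("fragments must have the same length")
    return sum(pair_score(x, y, matrix) for x, y in zip(s, t))


print(ungapped_score("LDLV", "VDLV"))  # 1 + 6 + 4 + 4
```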
Scoring Matrices
Physical/Chemical similarities
comparing two sequences according to the properties of their residues may highlight regions of structural similarity
Identity matrices
by stressing only identities in the alignment, stretches of sequence that may have diverged will not penalise any remaining common features
Scoring Matrices (ctd)
As the direct source of residue-by-residue comparison scores, the scoring matrix you choose will have a major impact on the alignment calculated
The most commonly used will be one of the mutation matrices: PAM or BLOSUM
Von Bing will explain the derivation of these and other mutation matrices next Tuesday.
The matrix that performs best will be the matrix that best reflects the evolutionary separation of the sequences being aligned.
Statistical motivation for alignment scores
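Alignment:
AGCTGATCA...
AACCGGTTA...
Hypotheses:
H = homologous (indep. sites, Jukes-Cantor)
R = random (indep. sites, equal freq.)
\[ \Pr(\text{data}\mid H) = \Pr\!\left(\begin{smallmatrix}A\\A\end{smallmatrix}\mid H\right)\times\Pr\!\left(\begin{smallmatrix}G\\A\end{smallmatrix}\mid H\right)\times\cdots = (1-p)^{a}\,p^{d}, \]
where d = # disagreements, a = # agreements, and $p = \tfrac{3}{4}\left(1-e^{-8\alpha t}\right)$.
\[ \Pr(\text{data}\mid R) = \Pr\!\left(\begin{smallmatrix}A\\A\end{smallmatrix}\mid R\right)\times\Pr\!\left(\begin{smallmatrix}G\\A\end{smallmatrix}\mid R\right)\times\cdots = \left(\tfrac{1}{4}\right)^{a}\left(\tfrac{3}{4}\right)^{d}. \]
\[ \log\left\{\frac{\Pr(\text{data}\mid H)}{\Pr(\text{data}\mid R)}\right\} = a\,\log\frac{1-p}{1/4} + d\,\log\frac{p}{3/4}. \]
Since $p < \tfrac{3}{4}$, $\log\frac{p}{3/4} < 0$ and $\log\frac{1-p}{1/4} > 0$.
\[ \text{score} = a\,\sigma + d\,(-\mu), \qquad \sigma > 0 \text{ match score}, \quad -\mu < 0 \text{ mismatch penalty}. \]
Note that if $t \approx 0$, then $p \approx 6\alpha t$, $1-p \approx 1$ and so $\sigma \approx \log 4$, while $-\mu \approx \log 8\alpha t$ is large and negative: a big difference in the two scores.
Conversely, if t is large, $p = \tfrac{3}{4}(1-\varepsilon)$ with $\varepsilon = e^{-8\alpha t}$ small, so $\frac{p}{3/4} = 1-\varepsilon$ and $\log(1-\varepsilon) \approx -\varepsilon$, while $1-p = \tfrac{1}{4}(1+3\varepsilon)$, $\frac{1-p}{1/4} = 1+3\varepsilon$, and so $\log(1+3\varepsilon) \approx 3\varepsilon$. Thus the scores are about 3:1.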
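The two limiting cases can be checked numerically. The sketch below assumes the Jukes-Cantor form p = (3/4)(1 - exp(-8αt)) for the probability that a site disagrees under H, a match score of log((1-p)/(1/4)) and a mismatch score of log(p/(3/4)); the function name `jc_scores` and the argument `alpha_t` (the product αt) are illustrative.

```python
import math


def jc_scores(alpha_t):
    """Return (match score, mismatch score) as natural-log odds.

    alpha_t is the product of the substitution rate and the time t.
    """
    p = 0.75 * (1.0 - math.exp(-8.0 * alpha_t))  # P(site disagrees | H)
    match = math.log((1.0 - p) / 0.25)           # positive
    mismatch = math.log(p / 0.75)                # negative
    return match, mismatch


for alpha_t in (0.001, 0.01, 0.1, 1.0):
    match, mismatch = jc_scores(alpha_t)
    print(f"alpha*t = {alpha_t}: match = {match:.4f}, mismatch = {mismatch:.4f}")
```

For small αt the match score approaches log 4 while the mismatch score is strongly negative; for large αt the ratio of the two approaches 3:1 in absolute value.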
We can do the same with any other Markov substitution matrix for molecular evolution. E.g. with a PAM or BLOSUM matrix of probabilities, let
data = a gap-free alignment of two a.a. sequence fragments:
a_1 ..... a_m
b_1 ..... b_m
\[ \Pr(\text{data}\mid H) = \prod_{i=1}^{m} \pi_{a_i}\, p_{a_i b_i}(2t), \qquad \Pr(\text{data}\mid R) = \prod_{i=1}^{m} \pi_{a_i}\,\pi_{b_i}, \]
\[ \log\left\{\frac{\Pr(\text{data}\mid H)}{\Pr(\text{data}\mid R)}\right\} = \sum_{i=1}^{m} \log\left\{\frac{p_{a_i b_i}(2t)}{\pi_{b_i}}\right\}. \]
The elements of a log-odds score matrix are typically > 0 on the diagonal and < 0 off the diagonal, but not always.
Also the relative sizes of match scores and mismatch penalties increase as #PAMs (t) decreases. Thus PAM(120) is more stringent than PAM(250), while PAM(360) is less stringent than it.
PAM(0) = the identity matrix is the toughest.
There are plenty of score matrices based on other principles.
Below diagonal: BLOSUM62 substitution matrix
Above diagonal: Difference matrix obtained by subtracting the PAM 160 matrix entrywise.
From Henikoff & Henikoff 1992
C S T P A G N D E Q H R K M I L V F Y W
0 -1 1 0 2 1 1 2 1 2 0 0 2 4 1 5 1 2 -2 5 C
2 0 -2 0 -1 0 0 0 1 0 0 0 1 0 1 -1 1 1 -1 S
C 9 2 -1 -1 -1 0 0 0 0 0 0 -1 0 -1 1 0 1 1 3 T
S -1 4 2 -2 -1 -1 0 0 -1 -1 -1 1 1 0 -1 0 0 2 1 P
T -1 1 5 2 -1 -2 -2 -1 0 0 1 1 0 0 1 0 1 1 2 A
P -3 -1 -1 7 2 0 -1 -2 0 1 1 0 0 -1 0 -1 1 2 4 G
A 0 1 0 -1 4 3 -1 -1 0 0 1 -1 0 -1 0 -1 0 0 0 N
G -3 0 -2 -2 0 6 2 -1 -1 -1 0 -1 0 0 0 0 2 1 3 D
N -3 1 0 -2 -2 0 6 1 0 0 2 2 1 -1 0 0 2 2 4 E
D -3 0 -1 -1 -2 -1 1 6 0 -2 0 1 1 -1 0 0 1 3 3 Q
E -4 0 -1 -1 -1 -2 0 2 5 2 -1 0 1 0 -1 0 1 2 2 H
Q -3 0 -1 -1 -1 -2 0 0 2 5 -1 -1 0 -1 1 0 1 3 -4 R
H -3 -1 -2 -2 -2 -2 1 -1 0 0 8 1 -2 -1 1 1 2 3 1 K
R -3 -1 -1 -2 -1 -2 0 -2 0 1 0 5 -2 -1 -1 0 1 2 4 M
K -3 0 -1 -1 -1 -2 0 -1 1 1 -1 2 5 -1 1 0 0 1 3 I
M -1 -1 -1 -2 -1 -3 -2 -3 -2 0 -2 -1 -1 5 -1 0 -1 1 2 L
I -1 -2 -1 -3 -1 -4 -3 -3 -3 -3 -3 -3 -3 1 4 0 1 2 4 V
L -1 -2 -1 -3 -1 -4 -3 -4 -3 -2 -3 -2 -2 2 2 4 -1 -2 1 F
V -1 -2 0 -2 0 -3 -3 -3 -2 -2 -3 -3 -2 1 3 1 4 -1 2 Y
F -2 -2 -2 -4 -2 -3 -3 -3 -3 -3 -1 -3 -3 0 0 0 -1 6 -1 W
Y -2 -2 -2 -3 -2 -3 -2 -3 -2 -1 2 -2 -2 -1 -1 -1 -1 3 7
W -2 -3 -2 -4 -3 -2 -4 -4 -3 -2 -2 -3 -3 -1 -3 -2 -3 1 2 11
C S T P A G N D E Q H R K M I L V F Y W
Above diagonal: SG scoring system (Feng et al., 1985)
Below diagonal: Log-odds matrix for 250 PAMs (Dayhoff et al., 1978)
C S T P A G N D E Q H R K M I L V F Y W
6 4 2 2 2 3 2 1 0 1 2 2 0 2 2 2 2 3 3 3 C
6 5 4 5 5 5 3 3 3 3 3 3 1 2 2 2 3 3 2 S
C 12 6 4 5 2 4 2 3 3 2 3 4 3 3 2 3 1 2 1 T
S 0 2 6 5 3 2 2 3 3 3 3 2 2 2 3 3 2 2 2 P
T -2 1 3 6 5 3 4 4 3 2 2 3 2 2 2 5 2 2 2 A
P -3 1 0 6 6 3 4 4 2 1 3 2 1 2 2 4 1 2 3 G
A -2 1 1 1 2 6 5 3 3 4 2 4 1 2 1 2 1 3 0 N
G -3 1 0 -1 1 5 6 5 4 3 2 3 0 1 1 3 1 2 0 D
N -4 1 0 -1 0 0 2 6 4 2 2 4 1 1 1 4 0 1 1 E
D -5 0 0 -1 0 1 2 4 6 4 3 4 2 1 2 2 1 2 1 Q
E -5 0 0 -1 0 0 1 3 4 6 4 3 1 1 3 1 2 3 1 H
Q -5 -1 -1 0 0 -1 1 2 2 4 6 5 2 2 2 2 1 1 2 R
H -3 -1 0 0 -1 -2 2 1 1 3 6 6 2 2 2 3 0 1 1 K
R -4 0 0 0 -2 -3 0 -1 -1 1 2 6 6 4 5 4 2 2 3 M
K -5 0 0 -1 -1 -2 1 0 0 1 0 3 5 6 5 5 4 3 2 I
M -5 -2 -1 -2 -1 -3 -2 -3 -2 -1 -2 0 0 6 6 5 4 3 4 L
I -2 -1 0 -2 -1 -3 -2 -2 -2 -2 -2 -2 -2 2 5 6 4 3 3 V
L -6 -3 -2 -3 -2 -4 -3 -4 -3 -2 -2 -3 -3 4 2 6 6 5 3 F
V -2 -1 0 -1 0 -1 -2 -2 -2 -2 -2 -2 -2 2 4 2 4 6 3 Y
F -4 -3 -3 -5 -4 -5 -4 -6 -5 -5 -2 -4 -5 0 1 2 -1 9 6 W
Y 0 -3 -3 -5 -3 -5 -2 -4 -4 -4 0 -4 -4 -2 -1 -1 -2 7 10
W -8 -2 -5 -6 -6 -7 -4 -7 -7 -5 -3 2 -3 -4 -5 -2 -6 0 0 17
C S T P A G N D E Q H R K M I L V F Y W
Gap penalties
Gap penalties are usually composed of two parts:
Gap opening penalty
Opening a gap reduces the alignment score, so a gap is worthwhile only if it leads to a sufficiently better alignment downstream than would be obtained without it
The size of the penalty is usually of the order of one to three times the size of values in the scoring matrix
Gap penalties (ctd)
Gap extension penalty
If a gap has been created then extending it should not be as hard to do
On the other hand we want to limit the size of the gap to practical lengths
A smaller gap extension penalty may allow an alignment to resolve situations where complete loops may be missing between one structure and another
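A two-part gap penalty is often written as an affine function of the gap length. The sketch below is one common convention, assumed here rather than taken from the notes: the first gap position pays the opening penalty and each further position pays the extension penalty. The default values 10.0 and 0.5 are hypothetical.

```python
def affine_gap_penalty(length, gap_open=10.0, gap_extend=0.5):
    """Penalty for a single gap of the given length.

    Assumed convention: gap_open for the first position, gap_extend for
    each additional position. Other programs define the split differently.
    """
    if length < 0:
        raise ValueError("gap length cannot be negative")
    if length == 0:
        return 0.0
    return gap_open + gap_extend * (length - 1)


for k in (1, 2, 5, 20):
    print(k, affine_gap_penalty(k))
```

With a small extension penalty, one long gap costs much less than several short gaps covering the same number of positions, which is the behaviour wanted when a whole loop is missing from one structure.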
Low gap penalty
eclustalw May 24, 1999 18:44
lgb1_pea.pep ck: 2970 from: 1 to: 147 Length: 147
hbhu.pep ck: 3588 from: 1 to: 147 Length: 147
Pairwise similarity parameter:
  K-Tuple length: 1
  Gap Penalty: 3
  Number of diagonals: 5
  Diagonal window size: 5
  Scoring Method: Percentage
Multiple alignment parameter:
  Gap Penalty (fixed): 1.00
  Gap Penalty (varying): 0.05
  Gap separation penalty range: 8
  Percent. identity for delay: 40%
  List of hydrophilic residue: GPSNDQEKR
  Protein Weight Matrix: blosum

                      10        20        30        40        50        60
                       .         .         .         .         .         .
LGB1_PEA.pep  --GFTDKQE-ALVNSSSEFKQNLPGYSILFYTIVLEKAPAAKGLF-SF--LKDTAGVEDS
HBHU.pep      MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVY--PWTQRFFESFGDLSTPDAVMGN

LGB1_PEA.pep  PKLQAHAEQVFGLVRDSAAQLR-TKGEVVLGNATLGAIHVQKGVTNP-HFVVVKEALLQT
HBHU.pep      PKVKAHGKKVLGAFSDGLAHLDNLKGTF----ATLSELHCDKLHVDPENFRLLGNVLVCV

LGB1_PEA.pep  IKKASGNNWSEELNTAWEVAYDGLATAIKKAMKTA
HBHU.pep      LAHHFGKEFTPPVQAAYQKVVAGVANAL--AHKYH
Middling gap penalty
eclustalw May 24, 1999 18:50
lgb1_pea.pep ck: 2970 from: 1 to: 147 Length: 147
hbhu.pep ck: 3588 from: 1 to: 147 Length: 147
Pairwise similarity parameter:
  K-Tuple length: 1
  Gap Penalty: 3
  Number of diagonals: 5
  Diagonal window size: 5
  Scoring Method: Percentage
Multiple alignment parameter:
  Gap Penalty (fixed): 25.00
  Gap Penalty (varying): 0.05
  Gap separation penalty range: 8
  Percent. identity for delay: 40%
  List of hydrophilic residue: GPSNDQEKR
  Protein Weight Matrix: blosum

                      10        20        30        40        50        60
                       .         .         .         .         .         .
LGB1_PEA.pep  ----GFTDKQEALVNSSSEFKQNLPGYSILFYTIVLEKAPAAKGLFSFLKDTAGVEDSPK
HBHU.pep      MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPK

LGB1_PEA.pep  LQAHAEQVFGLVRDSAAQLRTKGEVVLGNATLGAIHVQKGVTNP-HFVVVKEALLQTIKK
HBHU.pep      VKAHGKKVLGAFSDGLAHLDN---LKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAH

LGB1_PEA.pep  ASGNNWSEELNTAWEVAYDGLATAIKKAMKTA
HBHU.pep      HFGKEFTPPVQAAYQKVVAGVANALAHKYH--
Very high gap penalty
eclustalw May 24, 1999 18:52
lgb1_pea.pep ck: 2970 from: 1 to: 147 Length: 147
hbhu.pep ck: 3588 from: 1 to: 147 Length: 147
Pairwise similarity parameter:
  K-Tuple length: 1
  Gap Penalty: 3
  Number of diagonals: 5
  Diagonal window size: 5
  Scoring Method: Percentage
Multiple alignment parameter:
  Gap Penalty (fixed): 50.00
  Gap Penalty (varying): 0.05
  Gap separation penalty range: 8
  Percent. identity for delay: 40%
  List of hydrophilic residue: GPSNDQEKR
  Protein Weight Matrix: blosum

                      10        20        30        40        50        60
                       .         .         .         .         .         .
LGB1_PEA.pep  ----GFTDKQEALVNSSSEFKQNLPGYSILFYTIVLEKAPAAKGLFSFLKDTAGVEDSPK
HBHU.pep      MVHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPK

LGB1_PEA.pep  LQAHAEQVFGLVRDSAAQLRTKGEVVLGNATLGAIHVQKGVTNPHFVVVKEALLQTIKKA
HBHU.pep      VKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPEN--FRLLGNVLVCVLAHH

LGB1_PEA.pep  SGNNWSEELNTAWEVAYDGLATAIKKAMKTA
HBHU.pep      FGKEFTPPVQAAYQKVVAGVANALAHKYH--
Dynamic Programming
This is a mathematical implementation that can be seen as an extension of the dotplot method
Rather than dots, the comparison matrix positions are assigned values that reflect the scores in the scoring matrix
For obtaining optimal alignments
Dynamic Programming
The optimum alignment is obtained by tracing the highest scoring path from the top left-hand corner to the bottom right-hand corner of the matrix
When the alignment steps away from the diagonal this implies an insertion or deletion event, the impact of which can be assessed by the application of a gap penalty
b:     A  C  A  C  A  C  T  A
a:  A  0  1  0  1  0  1  1  0
    G  1  1  1  1  1  1  1  1
    C  1  0  1  0  1  0  1  1
    A  0  1  0  1  0  1  1  0
    C  1  0  1  0  1  0  1  1
    A  0  1  0  1  0  1  1  0
    C  1  0  1  0  1  0  1  1
    A  0  1  0  1  0  1  1  0
Dynamic programming: the formula
Suppose that our two sequences are $a=(a_1,\dots,a_m)$ and $b=(b_1,\dots,b_n)$, and that we denote by $d_{ij}$ the edit distance between the initial segments $(a_1,\dots,a_i)$ and $(b_1,\dots,b_j)$ of a and b.
Extend this to i=j=0 by writing $d_{00}=0$.
Supposing that a deletion or an insertion incurs a penalty of +1, the following formula summarizes our verbal argument:
\[ d_{ij} = \min\left\{\, d_{i-1,j-1} + s(a_i,b_j),\;\; d_{i,j-1} + 1,\;\; d_{i-1,j} + 1 \,\right\}. \]
(More is needed to give a complete algorithm: what is it?)
b:         A  C  A  C  A  C  T  A
        0  1  2  3  4  5  6  7  8
a:  A   1  0  1  2  3  4  5  6  7
    G   2  1  1  2  3  4  5  6  7
    C   3  2  1  2  2  3  4  5  6
    A   4  3  2  1  2  2  3  4  5
    C   5  4  3  2  1  2  2  3  4
    A   6  5  4  3  2  1  2  3  3
    C   7  6  5  4  3  2  1  2  3
    A   8  7  6  5  4  3  2  2  2
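One way to turn the recurrence into a complete table-filling procedure is sketched below. It assumes the unit cost model (s = 0 for a match, 1 for a mismatch) and supplies boundary values d(i,0) = i and d(0,j) = j, which the recurrence alone does not give; the function name `edit_distance_table` is illustrative. Recovering an optimal alignment additionally requires a traceback through the table, which is not shown here.

```python
def edit_distance_table(a, b):
    """Fill the table d[i][j] = edit distance between a[:i] and b[:j]."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    # Boundary values: aligning a prefix against the empty sequence
    # costs one insertion or deletion per character.
    for i in range(1, m + 1):
        d[i][0] = i
    for j in range(1, n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j - 1] + s,  # match or replace
                          d[i][j - 1] + 1,      # insert
                          d[i - 1][j] + 1)      # delete
    return d


table = edit_distance_table("AGCACACA", "ACACACTA")
for row in table:
    print(" ".join(f"{value:2d}" for value in row))
```

The bottom right-hand entry of the table is the edit distance d(a,b).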
Chance or common ancestry?
Idea: calculate optimal alignment scores for pairs of sequences where one is a randomized (shuffled) version of the original. This will give a distribution of random scores, representing chance similarity rather than homology.
The score from our original pair of sequences can be referred to this distribution and assigned a Z-score (subtract mean of randoms and divide by SD of randoms), or (better) a p-value.
Criticism: Such random a.a. sequences might have plausible a.a. compositions but are quite unlike real protein sequences.
Partial reply: a) restrict the randomization to blocks; or, b) create a distribution of chance similarity scores using real a.a. sequences known or assumed not to be homologous to our query sequence. [Other approaches use theory, but this is still subject to the criticism above.]
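The shuffling idea can be sketched as follows. This is only an illustration under stated assumptions: the unit-cost edit distance stands in for the alignment score (so smaller values mean greater similarity and a strongly negative Z-score is the interesting direction), the whole sequence is shuffled rather than blocks, and the function names, the number of shuffles and the seed are hypothetical choices.

```python
import random
import statistics


def edit_distance(a, b):
    """Unit-cost edit distance, keeping only one row of the table at a time."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j - 1] + (x != y), curr[j - 1] + 1, prev[j] + 1))
        prev = curr
    return prev[-1]


def shuffle_test(a, b, n_shuffles=200, seed=0):
    """Compare the observed distance with distances to shuffled copies of b."""
    rng = random.Random(seed)
    observed = edit_distance(a, b)
    letters = list(b)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # keeps the composition of b, destroys the order
        null.append(edit_distance(a, "".join(letters)))
    mean = statistics.mean(null)
    sd = statistics.stdev(null)  # assumed non-zero for realistic inputs
    z = (observed - mean) / sd
    # Empirical p-value: fraction of shuffles at least as similar as observed.
    p = (1 + sum(x <= observed for x in null)) / (n_shuffles + 1)
    return observed, z, p


print(shuffle_test("AGCACACAGGTTCAGT", "ACACACTAGGTCCAGT"))
```

The criticism above applies to this sketch as written: shuffled sequences preserve composition but not the local structure of real sequences.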
Dynamic Programming
Based on notes by George Rudy, formerly WEHI.
“Life must be lived forwards and understood backwards.”
Søren Kierkegaard
What is DP?
Operations research: “A mathematical formalism applicable to problems involving optimization of decisions over time.”
(after R. Bellman and S. Dreyfus)
Bioinformatics: “An algorithm for finding optimal sequence alignments given an additive alignment score.”
(after R. Durbin, et al.)
Computer programming: “An approach to algorithm design whereby the target problem is decomposed into smaller problems that are then solved independently.”
(after R. Sedgewick)
Where did DP come from?
- Richard Bellman
- The RAND Corporation
- “Dynamic” and “Programming”
Where can DP be applied?
- Both discrete and continuous problems concerning deterministic, stochastic, or adaptive processes
- Multiple fields: research, industry, finance,…
- Examples: allocation processes
smoothing and scheduling processes
optimal search and stopping techniques
optimal trajectories
multistage production processes
feedback control processes
Markovian decision processes
DP in biomedical literature (1)
[Figure: chart with a vertical axis from 0 to 25 and a horizontal axis labeled Years.]
DP in biomedical literature (2)
- A symmetric-iterated multiple alignment of protein sequences.
[Brocchieri, L. and Karlin S., J. Mol. Biol. 276(1):249-64, 1998.]
- Sequence assembly validation by multiple restriction digest fragment coverage analysis.
[Rouchka, E.C. and States, D.J., ISMB. 6:140-7, 1998.]
- Automated protein sequence database classification. I. Integration of compositional similarity search, local similarity search, and multiple sequence alignment.
[Gracy, J. and Argos, P., Bioinformatics 14(2):164-73, 1998.]
- A segment-based dynamic programming algorithm for predicting gene structure.
[Wu, T.D., J. Comput. Biol. 3(3):375-94, 1996.]
- Automatic detection of cardiac contours on MR images using fuzzy logic and dynamic programming.
[Lalande A. et al., Proc. AMIA Annu. Fall Symp. :474-8, 1997.]
- Process models for production of beta-lactam antibiotics.
[Bellgardt, K.H., Adv. Biochem. Eng. Biotechnol. 60:153-94, 1998.]
- Dynamic programming approach for newborn’s incubator humidity control.
[Bouattoura, D. et al., IEEE Trans. Biomed. Eng. 45(1):48-55, 1998.]
- Minimum energy trajectories of the swing ankle when stepping over obstacles of different heights.
[Chou L.S. et al., J. Biomech. 30(2):115-20, 1997.]
- A theoretical study of the socioecology of ungulates. II. A dynamic programming study of the stochastic formulation.
[Paveri-Fontana, S.L. and Focardi, S. Theor. Popul. Biol. 46(3):279-99, 1994.]
What problems are suitable for DP?
- Essential components (common to all OR problems):
a decision-maker
access to results of decisions
- Additionally:
decisions are sequential
later decisions are affected by earlier ones
effect of a decision can be calculated independently of other decisions
The Stagecoach Problem (1)
[Figure: a network of 16 vertices labeled A to P, with a value on each edge. After S. E. Dreyfus.]
Some terminology
- Vertex
- Edge
- Path
- Monotonic-to-the-right
- (Admissible) path
- Stage
- State
The Stagecoach Problem (2)
[Figure, built up over five slides: the network of vertices A to P with its edge values, progressively annotated with a value at each vertex, starting from the value 0.]
Some more terminology
- Optimal value function
- Policy
- Optimal policy function
The Stagecoach Problem (3)
[Figure, shown over three slides: the network of vertices A to P with its edge values and the value at each vertex.]
The Stagecoach Problem (4)
[Figure: the network of vertices A to P with its edge values and the value at each vertex.]
Efficiency of the DP approach
- At each of 9 vertices where a real choice existed: 2 additions, 1 binary comparison
- At the other 6 vertices: 1 addition
Total: 24 additions, 9 comparisons
- Compare this with direct evaluation of the original problem by enumeration of all 20 admissible paths: 5 additions/path = 100 additions, 20 comparisons
Efficiency (2), and the Curse of Dimensionality
In general, for the n-stage problem treated here,
DP involves $\frac{n^2}{2} + n$ additions.
Direct enumeration generates
\[ \binom{n}{n/2} = \frac{n!}{\left(\frac{n}{2}\right)!\,\left(\frac{n}{2}\right)!} \]
paths, or
\[ (n-1)\,\frac{n!}{\left(\frac{n}{2}\right)!\,\left(\frac{n}{2}\right)!} \]
additions.
Thus, for n=20, DP requires 220 additions while direct enumeration would demand 3,510,364 additions.
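These counts are easy to verify. The sketch below assumes n is even and uses the closed forms n²/2 + n for DP and (n-1)·C(n, n/2) for enumeration; the function names are illustrative.

```python
from math import comb


def dp_additions(n):
    """Additions needed by DP for the n-stage problem (n assumed even)."""
    return n * n // 2 + n


def enumeration_paths(n):
    """Number of admissible paths: choose which n/2 of the n steps go up."""
    return comb(n, n // 2)


def enumeration_additions(n):
    """Each path has n edges, so summing it takes n - 1 additions."""
    return (n - 1) * enumeration_paths(n)


for n in (6, 20):
    print(n, dp_additions(n), enumeration_paths(n), enumeration_additions(n))
```

For n = 6 this reproduces the 24 additions, 20 paths and 100 additions counted for the stagecoach example.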
The Stagecoach Problem (5)
[Figure: the network of vertices A to P drawn on coordinate axes x and y, with x ticks from 1 to 6 and y ticks from -3 to 3.]
The Principle of Optimality, or Bellman’s Principle
“An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision.” (Bellman)
or, “An optimal sequence of decisions in a multistage decision process problem has the property that whatever the initial stage, state, and decision are, the remaining decisions must constitute an optimal sequence of decisions for the remaining problem, with the stage and state resulting from the first decision considered as initial conditions.” (Dreyfus)
or, “An optimal policy must have the property that no matter what path is taken to enter a particular state, the remaining stages (decisions) taken must constitute an optimal policy for departure from that state.”
or, “An optimal policy is comprised of optimal subpolicies.”
or, “An optimal policy from any state is independent of the path taken to that state, and is made up entirely of optimal subpolicies.”
or, ...
The optimal value function
S(x,y) = the value of the minimum-value admissible path connecting the vertex (x,y) and the terminal vertex (6,0)
e_u(x,y) = the value of the edge connecting the vertices (x,y) and (x+1, y+1)
e_d(x,y) = the value of the edge connecting the vertices (x,y) and (x+1, y-1)
S(x,y) = min {e_u(x,y) + S(x+1, y+1), e_d(x,y) + S(x+1, y-1)}
S(6,0) = 0.
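The recurrence can be sketched in code as below. The edge values of the original figure are not reproduced here, so `e_u` and `e_d` are filled with hypothetical values from an arbitrary formula; only the structure (a 6-stage diamond network with terminal vertex (6,0)) follows the notes. The brute-force function is included solely to check the DP result against enumeration of all paths.

```python
from functools import lru_cache
from itertools import product

N = 6  # number of stages; the terminal vertex is (N, 0)


def admissible(x, y):
    """True if (x, y) is a vertex of the diamond-shaped network."""
    return 0 <= x <= N and abs(y) <= min(x, N - x) and (x + y) % 2 == 0


# Hypothetical edge values, NOT those of the original figure.
def e_u(x, y):
    return (3 * x + y) % 5 + 1


def e_d(x, y):
    return (x + 2 * y) % 4 + 1


@lru_cache(maxsize=None)
def S(x, y):
    """Optimal value function: cheapest way from (x, y) to the terminal vertex."""
    if (x, y) == (N, 0):
        return 0
    options = []
    if admissible(x + 1, y + 1):
        options.append(e_u(x, y) + S(x + 1, y + 1))
    if admissible(x + 1, y - 1):
        options.append(e_d(x, y) + S(x + 1, y - 1))
    return min(options)


def path_values():
    """Value of every admissible path from (0, 0) to (N, 0), by enumeration."""
    values = []
    for steps in product((1, -1), repeat=N):
        if sum(steps) != 0:
            continue  # the path must end at y = 0
        x, y, total = 0, 0, 0
        for step in steps:
            total += e_u(x, y) if step == 1 else e_d(x, y)
            x, y = x + 1, y + step
        values.append(total)
    return values


print(S(0, 0), min(path_values()))
```

Memoization via `lru_cache` ensures each vertex is evaluated once, which is where the saving over enumeration comes from.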
A more formal restatement of common features of DP problems
A physical system characterized at any stage by a small set of parameters, the state variables;
At each stage of the process there is a choice of a number of decisions;
The effect of a decision is a transformation of the state variables;
The past history of the system is of no importance in determining future actions;
The purpose of the process is to maximize some function of the state variables.
The practice of DP
Imbed the specific given problem in a more general family of problems;
Define the optimal value function which associates a value with each of the various possible initial conditions of problems in that family;
Invoke the principle of optimality in order to deduce a recurrence relation characterizing that function;
Seek the solution of the recurrence relation in order to obtain the optimal policy function which furnishes the solution to the specific given problem and all other problems in the more general family as well.
More practically speaking,
Determine the decision-maker and the decisions to be made;
Determine the stages;
Determine the possible states;
Formulate the optimal value function in the form of a recurrence relation;
Calculate and tabulate the optimal value function for each stage and state;
Find the optimal policy(ies) for the problem.
New problem, new terminology
Edit operations: M(atch), R(eplacement), I(nsert), D(elete).
Edit transcript: A string over the alphabet {M, R, I, D} that describes a transformation of one string into another. Example:
R D I M D M
M A - T H S
A - R T - S
Edit (Levens(h)tein) distance: The minimum number of edit operations necessary to transform one string into another. (Note: matches are not counted.) Example: the transcript above uses 4 operations, so it gives an upper bound of 4 on the edit distance:
R D I M D M
1+ 1+ 1+ 0+ 1+ 0 = 4
Once again,
Imbed the problem in the more general family;
Define the optimal value function;
Deduce the recurrence relation;
Solve the recurrence relation to obtain the optimal policy function.
The recurrence
Stage: position in the edit transcript;
State: I, D, M, or R;
Optimal value function: D(i, j)
where D(i, j) = edit distance of Seq1[1...i] and Seq2[1...j]
Recurrence relation:
D(i, j) = min {1 + D(i-1, j), 1 + D(i, j-1), t(i, j) + D(i-1, j-1)},
where t(i, j) = 0 if Seq1(i) = Seq2(j), and = 1 otherwise.
The tabulation, D(i, j)
The table is filled in starting from D(0, 0) = 0, then row 0 and column 0, then the remaining entries row by row.
Seq2(j):          A  R  T  S
          j =  0  1  2  3  4
Seq1(i):
    i = 0      0  1  2  3  4
  M     1      1  1  2  3  4
  A     2      2  1  2  3  4
  T     3      3  2  2  2  3
  H     4      4  3  3  3  3
  S     5      5  4  4  4  3
The traceback
Seq2(j):          A  R  T  S
          j =  0  1  2  3  4
Seq1(i):
    i = 0      0  1  2  3  4
  M     1      1  1  2  3  4
  A     2      2  1  2  3  4
  T     3      3  2  2  2  3
  H     4      4  3  3  3  3
  S     5      5  4  4  4  3
The solutions - #1
1 + 0 + 1 + 1 + 0 = 3
D M R R M
M A T H S
- A R T S
The solutions - #2
1 + 0 + 1 + 0 + 1 + 0 = 3
D M I M D M
M A - T H S
- A R T - S
The solutions - #3
1 + 1 + 0 + 1 + 0 = 3
R R M D M
M A T H S
A R T - S
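The tabulation and traceback can be sketched together as below. The code fills D(i, j) with the recurrence for the unit-cost model and then walks back from the bottom right-hand entry along every move that attains the minimum, so it lists all co-optimal edit transcripts. The function name `optimal_alignments` and the recursive generator are illustrative choices, not the original author's implementation.

```python
def optimal_alignments(s1, s2):
    """Return (edit distance, sorted list of (transcript, row1, row2))."""
    m, n = len(s1), len(s2)
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        D[i][0] = i
    for j in range(n + 1):
        D[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            t = 0 if s1[i - 1] == s2[j - 1] else 1
            D[i][j] = min(1 + D[i - 1][j], 1 + D[i][j - 1], t + D[i - 1][j - 1])

    def walk(i, j):
        """Yield every optimal transcript and alignment for s1[:i], s2[:j]."""
        if i == 0 and j == 0:
            yield "", "", ""
            return
        if i > 0 and D[i][j] == D[i - 1][j] + 1:      # delete s1[i-1]
            for ops, r1, r2 in walk(i - 1, j):
                yield ops + "D", r1 + s1[i - 1], r2 + "-"
        if j > 0 and D[i][j] == D[i][j - 1] + 1:      # insert s2[j-1]
            for ops, r1, r2 in walk(i, j - 1):
                yield ops + "I", r1 + "-", r2 + s2[j - 1]
        if i > 0 and j > 0:
            t = 0 if s1[i - 1] == s2[j - 1] else 1
            if D[i][j] == D[i - 1][j - 1] + t:        # match or replace
                for ops, r1, r2 in walk(i - 1, j - 1):
                    yield ops + ("M" if t == 0 else "R"), r1 + s1[i - 1], r2 + s2[j - 1]

    return D[m][n], sorted(walk(m, n))


distance, solutions = optimal_alignments("MATHS", "ARTS")
print(distance)
for ops, row1, row2 in solutions:
    print(ops, row1, row2)
```

Enumerating every co-optimal path can become expensive for long sequences; in practice a traceback usually follows a single path.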
DP, in general (well, for a discrete, deterministic, additive process, anyway)
F(t, s) = Opt {r(t, s, x) + aF(t´, s´) : x in X(t, s) and s´ = T(t, s, x)}
Need not be additive. For a stochastic process, r and F are expected values; the state transform is random with a probability distribution
P[T(t, s, x) = s´ | s, x], and
F(t´, s´) is replaced by
∑s´ {F(t´, s´) P[T(t, s, x) = s´ | s, x]}