Understanding the Immune Response Through Modeling and Simulation
Steven H. Kleinstein
Department of Computer Science
Princeton University
The Immune System
• Protects the body from damaging pathogens
  – viruses, bacteria, parasites
• Provides basis for vaccines (e.g., flu)
• Implicated in disease:
  – Autoimmune (Lupus, MS, Rheumatoid Arthritis)
  – Sepsis, Cancer
Relatively new science, began with Jenner in 1796
Understanding will lead to better diagnostics and therapies
Why Model the Immune System?
• Immune response involves the collective and coordinated response of ≈10¹² cells and molecules
• Distributed throughout body
  – blood, lymph nodes, spleen, thymus, bone marrow, etc.
• Interactions involve feedback loops and non-linear dynamics
• Experiments often require artificial constructs
• High variability observed in experimental results
Somatic Hypermutation: important component of response
Experiments provide only a static window onto the real dynamics of immunity
B Cell Antibody Receptors “Recognize” Antigens
1. B cells must recognize universe of pathogens (antigens)
2. Response to any specific antigen must be efficient
Hypermutation & Selection ≈ Darwinian Evolution, but in 3 weeks!
Hypermutation and selection lead to affinity increase over time…
Rearrangement creates initial diversity…
B Cell
Antibody Receptor
What might go wrong?
Commonly Accepted: Somatic Hypermutation Restricted to Germinal Centers
Somatic Hypermutation → Antibody Receptors Against Self-antigens → Autoimmune Disease
Autoimmunity is a response against body’s own proteins, DNA, etc.
Germinal Centers Form in the Spleen
Commonly Accepted: Germinal Centers are the Site of Somatic Hypermutation and Selection of Higher-Affinity B Cells
© 2001 by Garland Publishing
http://mcb.berkeley.edu/courses/mcb150/Lect10/Lect10.pdf
Motivating Experiment
In auto-immune mouse model, observed mutating B cells in extra-follicular areas of spleen (not germinal centers)
Extra-Follicular Areas: Auto-immune Mouse, MRL/lpr AM14 heavy chain transgenic (William, Euler, Christensen, and Shlomchik. Science. 2002)
Germinal Centers: Control, Primary anti-hapten response to NP (Jacob et al., 1991; Jacob and Kelsoe, 1992; Jacob et al., 1993; Radmacher et al., 1998)
Microdissection (10 cells)
[Figure labels: B cells, T cells, Dividing B cells, FDC]
Estimate mutation rate to show (hyper?)mutation
What’s hard about estimating the mutation rate?
The number of divisions in vivo is unknown
[Figure: Number of Mutations vs. Number of Cell Divisions, showing a High Mutation Rate line, a Low Mutation Rate line, and the Observed Number of Mutations]
Most recognized in vivo estimates took educated guesses (McKean et al, 1984 and Sablitzky et al, 1985)
Clonal Trees Provide Needed Information
Analyze pattern of shared and unique mutations among sequences from each microdissection

Germline  GGGATTCTC
1         -C-----G-
2         -------G-
3         A------GA
4         A---C--GA

Clonal tree for these sequences (branches labeled position(germline→mutant)):
Germline
  └─ 8(T→G) ─ 2
       ├─ 2(G→C) ─ 1
       └─ 1(G→A), 9(C→A) ─ 3
            └─ 5(T→C) ─ 4

Clonal tree ‘shapes’ reflect underlying dynamics
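The tree for the four example sequences can be rebuilt programmatically. The sketch below is only an illustration, not the author's tree-building procedure: it assumes every ancestor is either the germline or one of the observed sequences, so a sequence's parent is the observed sequence carrying the largest proper subset of its mutations. The function names `mutations` and `build_clonal_tree` are hypothetical.

```python
def mutations(germline, seq):
    """Set of (position, germline base, new base); '-' means 'same as germline'."""
    return frozenset(
        (i + 1, g, s)
        for i, (g, s) in enumerate(zip(germline, seq))
        if s != "-" and s != g
    )


def build_clonal_tree(germline, seqs):
    """Map each sequence name to (parent name, mutations on the branch from the parent).

    Assumption (not from the slides): all ancestors are observed, so the parent
    is the sequence whose mutations form the largest proper subset.
    """
    muts = {name: mutations(germline, s) for name, s in seqs.items()}
    muts["Germline"] = frozenset()
    tree = {}
    for name, m in muts.items():
        if name == "Germline":
            continue
        # '<' on frozensets tests for a proper subset
        candidates = [n for n, pm in muts.items() if n != name and pm < m]
        parent = max(candidates, key=lambda n: len(muts[n]), default="Germline")
        tree[name] = (parent, sorted(m - muts[parent]))
    return tree


germline = "GGGATTCTC"
seqs = {"1": "-C-----G-", "2": "-------G-", "3": "A------GA", "4": "A---C--GA"}
tree = build_clonal_tree(germline, seqs)
for name in sorted(tree):
    parent, branch = tree[name]
    print(name, "<-", parent, branch)
```

Real data would also need inferred (unobserved) intermediate nodes, which this subset rule does not create.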
Relating Tree Shapes to Underlying Dynamics
[Figure: example clonal trees growing from an Initial Sequence, with observed sequences A, B, C, D and branch mutations labeled a, c, g, t]
Investigate with computer simulation of B cell clonal expansion
Parameters: mutation rate (µ), lethal frequency (λ), # divisions (d), pick size (p)
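A minimal sketch of such a simulation, using the four parameters listed above. Several modeling details are assumptions made only for illustration and are not taken from the slides: each daughter cell gains a Poisson(µ) number of new mutations, every mutation is independently lethal with probability λ, division is synchronous, and the pick is a uniform random sample of p surviving cells. The names `poisson` and `simulate_clone` are hypothetical.

```python
import math
import random


def poisson(rng, mean):
    """Knuth's method for a Poisson draw; adequate for the small means used here."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_clone(mu, lam, d, p, seed=None):
    """Expand one founder B cell for d divisions and return a pick of up to p cells.

    Each cell is represented by the frozenset of mutation ids it carries.
    """
    rng = random.Random(seed)
    next_id = 0
    cells = [frozenset()]  # the founder carries no mutations
    for _ in range(d):
        daughters = []
        for cell in cells:
            for _ in range(2):  # every cell divides into two daughters
                n_new = poisson(rng, mu)
                # Assumption: a daughter dies if any of its new mutations is lethal
                if any(rng.random() < lam for _ in range(n_new)):
                    continue
                new_ids = range(next_id, next_id + n_new)
                next_id += n_new
                daughters.append(cell | frozenset(new_ids))
        cells = daughters
    return rng.sample(cells, min(p, len(cells)))


# The two conditions compared on the slide, 5 sequences per tree
for mu, d in ((0.2, 14), (0.4, 7)):
    pick = simulate_clone(mu, lam=0.5, d=d, p=5, seed=0)
    print(mu, d, [len(cell) for cell in pick])
```

Shared mutation ids between picked cells are what define the clonal tree, so the returned sets can be fed directly into a tree-building step.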
Compare: rate of 0.2 division⁻¹ for 14 divisions vs. rate of 0.4 division⁻¹ for 7 divisions
[Figure: tree shape measures for the two conditions (log scale, 0.01 to 100). Measures: edges, vertices, nodes, leaves, leaf cells, leaf repeat cells, intermediate, intermediate cells, repeat nodes, repeat cells, intermediate nodes, intermediate node cells, germline, unique mutations, total mutations]
Relevant shape measures can differentiate similar clones
(Simulation Data, 5 sequences / tree)
Intermediate Vertices Is a Useful Measure
Compare: rate of 0.2 division⁻¹ for 14 divisions vs. rate of 0.4 division⁻¹ for 7 divisions
[Figure: Intermediate Vertices (0.2 to 1.4) vs. Total Mutations Per Sequence (1.5 to 2.5)]
Shape measures can supplement information from mutation counting
(Simulation Data)
Method for Estimating Mutation Rate (µ)
Find mutation rate that produces distribution of tree ‘shapes’ most equivalent to observed set of trees
[Figure: Likelihood (log scale, 1.E-39 to 1.E-25) vs. Mutation Rate (per division), 0.10 to 0.50]
Assumes equivalent mutation rate in all trees, although number of divisions may differ
Also developed analytical method based on same underlying idea (The Journal of Immunology (2003) Vol. 171 No. 9, 4639-4649.)
[Diagram: Experimental Observations → Set of Observed Tree Shapes; Mutation Rate → Simulation of B cell expansion → Distribution of Tree Shapes; the two are matched (=) on number of mutations, intermediate vertices, and sequences at root]
Details of the Simulation Method
For each value of the mutation rate (µ), calculate likelihood by…
1. Run simulation many times to fill in equivalent matrix
   Equivalent Matrix, E(t,d): # simulated trees ‘equivalent’ to observed tree t after d divisions

             # divisions (d)
             1   2   …   D
   Tree 1    0   0   0   0
   Tree 2    0   0   0   0
   …         0   0   0   0
   Tree T    0   0   0   0

2. Likelihood of experimentally observed tree t:
   $L(t \mid \mu, \lambda) = \dfrac{\sum_d E(t,d)}{\sum_d O(t,d)}$
   Sample space is subset of all simulation runs
3. Likelihood of experimental dataset:
   $L(\mu) = \prod_t L(t \mid \mu, \lambda)$
Use Golden Section Search to optimize mutation rate (µ)
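Steps 2 and 3 can be sketched as follows, assuming the matrices E(t,d) and O(t,d) have already been filled by simulation and are stored as lists of rows (one row per observed tree, one column per number of divisions). The slides do not define O(t,d) beyond calling it the sample space, so it is treated here as given counts. Working in log space is a choice made for this sketch, since the likelihoods plotted in the deck are on the order of 1e-27; the function names are hypothetical.

```python
import math


def tree_likelihood(e_row, o_row):
    """L(t | mu, lambda) = sum_d E(t,d) / sum_d O(t,d) for one observed tree."""
    return sum(e_row) / sum(o_row)


def log_likelihood(E, O):
    """log L(mu) = sum over trees of log L(t | mu, lambda).

    Summing logs instead of multiplying probabilities avoids floating-point
    underflow when many small per-tree likelihoods are combined.
    """
    total = 0.0
    for e_row, o_row in zip(E, O):
        lt = tree_likelihood(e_row, o_row)
        if lt == 0.0:
            return float("-inf")  # no simulated tree matched this observed tree
        total += math.log(lt)
    return total


# Toy counts for two observed trees and three division numbers (made-up values)
E = [[0, 2, 3], [1, 1, 0]]
O = [[10, 10, 10], [10, 10, 10]]
print(math.exp(log_likelihood(E, O)))
```

A tree with no equivalent simulated trees drives the whole likelihood to zero, which is one reason a large number of simulation runs per evaluation matters.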
Finding the Optimal Mutation Rate
Golden Section Search works by successive bracketing of minimum/maximum
http://lib-www.lanl.gov/numerical/bookcpdf/c10-1.pdf
Direct Search Method (No Derivative)
Simple Implementation, Linear Convergence
0.38197 (golden mean)
Not tolerant of noise, make sure evaluation is precise
[Figure: Likelihood (2.E-27 to 6.E-27) vs. Number of Simulations per Likelihood Evaluation (0 to 320,000)]
Method is effective with 128,000 simulations per Likelihood
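A compact sketch of golden section search for a maximum, assuming the objective has a single peak inside the starting bracket. It is a generic textbook version, not the code used for the results in this talk; `golden_section_max` and the toy objective are illustrative names. Each iteration shrinks the bracket by the factor 0.61803 (that is, 1 − 0.38197) and reuses one of the two interior evaluations, so only one new likelihood evaluation is needed per step.

```python
import math

INV_PHI = (math.sqrt(5) - 1) / 2  # 0.61803...; 1 - INV_PHI = 0.38197


def golden_section_max(f, a, b, tol=1e-5):
    """Return the approximate location of the maximum of a unimodal f on [a, b]."""
    c = b - INV_PHI * (b - a)  # interior point 0.38197 of the way from a
    d = a + INV_PHI * (b - a)  # interior point 0.61803 of the way from a
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc > fd:
            # Maximum lies in [a, d]; the old c becomes the new d
            b, d, fd = d, c, fc
            c = b - INV_PHI * (b - a)
            fc = f(c)
        else:
            # Maximum lies in [c, b]; the old d becomes the new c
            a, c, fc = c, d, fd
            d = a + INV_PHI * (b - a)
            fd = f(d)
    return (a + b) / 2


# Toy stand-in for a log-likelihood curve peaking at a rate of 0.3 per division
best = golden_section_max(lambda mu: -(mu - 0.3) ** 2, 0.1, 0.5)
print(round(best, 4))
```

Because the comparison `fc > fd` decides which side of the bracket is discarded, noise in the likelihood evaluation can discard the side that holds the true optimum, which is why the evaluation has to be precise.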
Details of the Analytical Method
Formulas to approximate tree shapes…
• The average number of mutations per sequence (M), calculated from λ, µ and d
• The average number of sequences present at the root of the tree (R), calculated from S_t, λ, µ and d
• Total number of sequences in nodes with repeated sequences (P):
  $P = S_t \left[ 1 - \left( 1 - \frac{1}{p} \right)^{S_t - 1} \right]$
Minimize error X(µ) over all experimentally observed trees (t):
  $X(\mu) = \sum_t \min_d \left[ \frac{(M_t - M)^2}{\mathrm{VAR}(M_t)} + \frac{(R_t - R)^2}{\mathrm{VAR}(R_t)} + \frac{(P_t - P)^2}{\mathrm{VAR}(P_t)} \right]$
  (M_t, R_t, P_t: observed shape; M, R, P: calculated shape)
For each observed tree, choose number of divisions to minimize error
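The structure of the error X(µ) can be sketched as below. This is a hedged illustration only: the formula for P is the one given for repeated sequences, with S_t read as the number of sequences in tree t and p as the pick size, while the calculated M and R are supplied by a caller-provided function `predict_shape(mu, d, tree)`, a hypothetical hook that is not defined in the slides. The dictionary keys (`"M"`, `"var_M"`, and so on) and the cap `max_divisions` are likewise assumptions of this sketch.

```python
def repeated_sequences(s_t, p):
    """P = S_t * (1 - (1 - 1/p)**(S_t - 1))."""
    return s_t * (1.0 - (1.0 - 1.0 / p) ** (s_t - 1))


def shape_error(mu, observed_trees, predict_shape, max_divisions=30):
    """X(mu): for each tree take the best-fitting number of divisions d, then sum.

    Each observed tree is a dict with the observed values "M", "R", "P" and
    their variances "var_M", "var_R", "var_P". predict_shape(mu, d, tree) must
    return a dict with the calculated "M", "R", "P" (hypothetical interface).
    """
    total = 0.0
    for tree in observed_trees:
        errors = []
        for d in range(1, max_divisions + 1):
            calc = predict_shape(mu, d, tree)
            errors.append(
                sum(
                    (tree[k] - calc[k]) ** 2 / tree["var_" + k]
                    for k in ("M", "R", "P")
                )
            )
        total += min(errors)  # MIN over d, chosen separately for every tree
    return total


print(repeated_sequences(6, 10))
```

Minimizing `shape_error` over µ, for example with a one-dimensional search, would then give the analytical estimate under whatever shape formulas are plugged in.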
Estimating the Lethal Frequency (λ)
Simulation Model Parameters: mutation rate (µ), # divisions (d), # sequences (s), lethal frequency (λ)
Only replacement mutations can be lethal, so…
Choose λ so expected R/(R+S) equals observed value over all mutations
R/(R+S): fraction of all mutations that are replacements
[Diagram: Experimental Data → Observed R/(R+S); Lethal Frequency (λ) and Germline DNA Sequence → Expected R/(R+S); Observed = Expected]
Example: …CAT… (H) → …CAC… (H) is Silent; …CAT… (H) → …TAT… (Y) is Replacement
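The silent/replacement distinction and the matching of expected to observed R/(R+S) can be illustrated with the standard genetic code. The sketch assumes, purely for illustration, that every single-base change is equally likely, that changes to a stop codon count as replacements, and that a fraction λ of replacement mutations is lost while silent mutations always survive; the actual model behind the slides may weight mutations differently. All function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify(codon, mutant):
    """'silent' if the amino acid is unchanged, otherwise 'replacement'."""
    return "silent" if CODON_TABLE[codon] == CODON_TABLE[mutant] else "replacement"


def count_r_s(germline):
    """Count possible replacement (R) and silent (S) single-base changes."""
    r = s = 0
    for i in range(0, len(germline) - len(germline) % 3, 3):
        codon = germline[i:i + 3]
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                if classify(codon, mutant) == "silent":
                    s += 1
                else:
                    r += 1
    return r, s


def expected_replacement_fraction(r, s, lam):
    """Expected R/(R+S) among surviving mutations when a fraction lam of R is lethal."""
    surviving_r = r * (1.0 - lam)
    return surviving_r / (surviving_r + s)


def lethal_frequency(observed_fraction, r, s):
    """Solve expected R/(R+S) = observed for lam (inverse of the function above)."""
    return 1.0 - observed_fraction * s / ((1.0 - observed_fraction) * r)


print(classify("CAT", "CAC"), classify("CAT", "TAT"), count_r_s("CAT"))
```

For a real germline sequence, `count_r_s` would be run over the whole coding region in frame, and the observed fraction would come from the mutations in the experimental trees.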
Validating the Simulation Method
Use simulation to construct synthetic data sets with limited number of trees/sequences reflecting currently available experimental data
[Figure: Predicted Mutation Rate (per division) vs. Actual Mutation Rate (per division), both 0 to 0.8; fit y = 1.2073x, R² = 0.9149]
Method Precision (SD = 0.035 division⁻¹)
Method works even with limited number of clonal trees and sequences
Validating the Analytical Method
Use simulation to construct artificial data sets with limited number of trees/sequences reflecting currently available experimental data
[Figure: Predicted Mutation Rate (per division) vs. Actual Mutation Rate (per division), both 0 to 0.8; fit y = 1.075x, R² = 0.9435]
Results for analytical method (assumes correct λ)
Method works even with limited number of clonal trees and sequences
Testing Method Assumption…
All cells in single microdissection divided same number of times (i.e., division is synchronous)
[Figure: Estimated Mutation Rate (0 to 0.5) by generation of synthetic data: Asynchronous Division vs. Synchronous Division]
Assumption does not significantly impact rate estimate
Mutation Rate in Autoimmune Response
Experimental data set: 31 trees from 7 mice, ≈6 sequences / tree from extra-follicular areas (William et al, Science, 2002)
Estimated λ ≈ 0.55 based on R/(R+S)
[Figure: Estimated Mutation Rate (µ), 0 to 1.4E-03, vs. Fraction of FWR Replacement Mutations Lethal (λ), 0 to 0.9; series: Simulation Estimate, Analytical Estimate]
Estimated mutation rate is 1.0 ± 0.1 × 10⁻³ base-pair⁻¹ division⁻¹
Mutation Rate in Primary NP Response
Experimental data set: 23 trees, ≈7 sequences / tree from germinal centers (Jacob et al., 1991; Jacob and Kelsoe, 1992; Jacob et al., 1993; Radmacher et al., 1998)
[Figure: Estimated Mutation Rate, 0 to 1.4E-03, vs. Fraction of FWR Replacement Mutations Lethal (λ), 0 to 0.9; series: Simulation Estimate, Analytical Estimate]
Estimated mutation rate is 1.1 ± 0.1 × 10⁻³ base-pair⁻¹ division⁻¹
Testing impact on estimate:
• Data based on larger picks
• Positive selection may be factor
Summary
• Developed simulation and analytical methods to estimate in vivo mutation rates (and lethal frequencies)
  – First rigorous method for in vivo estimates
• Synthetic datasets used to show that…
  – Methods are precise (± 0.1 × 10⁻³ base-pair⁻¹ division⁻¹)
  – Assumption of synchronous division does not impact results
• Extra-follicular B cells in autoimmune mouse hypermutate
  – Mutation rate (0.9 ± 0.1 × 10⁻³) similar to NP response (1.1 ± 0.1 × 10⁻³)
• Future improvements in precision with additional data
Rigorous method to compare mutation rates under varying experimental conditions
Acknowledgements
Yoram Louzoun (Bar-Ilan University)
Mark Shlomchik (Yale University)
Jaswinder Pal Singh (Princeton University)
PICASso: Program in Integrative Information, Computer and Application Sciences
For more information:
[email protected]; www.cs.princeton.edu/~stevenk
Kleinstein, Louzoun and Shlomchik. The Journal of Immunology (2003) Vol. 171 No. 9, 4639-4649.