June 1999 NCSU QTL Workshop © Brian S. Yandell 1 Bayesian Inference for QTLs in Inbred Lines II Brian S Yandell University of Wisconsin- Madison www.stat.wisc.edu/~yandell NCSU Statistical Genetics June 1999
Jan 17, 2016
Bayesian Inference for QTLs in Inbred Lines II
Brian S. Yandell
University of Wisconsin-Madison
www.stat.wisc.edu/~yandell
NCSU Statistical Genetics, June 1999
Many Thanks
Michael Newton, Daniel Sorensen, Daniel Gianola, Jaya Satagopan, Patrick Gaffney, Fei Zou
Tom Osborn, David Butruille, Marcio Ferrera, Josh Udahl, Pablo Quijada
USDA Hatch Grants
Overview II
• quick review of trait model
  – single & multiple QTL
  – details of Gibbs sampler full conditionals
  – vector notation
• reversible jump MCMC
  – multiple regression
  – number of QTLs
• deconstructing Bayesian LODs
Quick Review of trait Model
• single QTL details of Gibbs sampler
  – normal priors & likelihoods
    • mean, additive effects
  – inverse gamma prior for variance
    • or inverse chi-square
  – vague priors lead to usual estimates as posterior means
• multiple QTL trait model
  – model with vector notation
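Single QTL trait Model
• trait = mean + additive + error
• trait = effect_of_geno + error
• prob( trait | geno, effects )
$$y_j = \mu + b^* x^*_j + e_j, \qquad e_j \sim N(0, \sigma^2)$$
$$\pi(y_j \mid x^*_j;\ \mu, b^*, \sigma^2) = \text{normal density with mean } \mu + b^* x^*_j \text{ and variance } \sigma^2$$
[Figure: trait densities for genotypes x = -1, x = 0, and x = 1; trait axis marked 10, 12, 14.]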
Gibbs Sampler updates
[Diagram: Gibbs sampler update cycle linking traits, genos, mean, additive, and variance.]
Full Conditional for mean
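• normal prior with large variance
• leads to normal posterior
$$\pi(\mu \mid \mathbf{y}, \mathbf{x}^*;\ b^*, \sigma^2) \propto \pi(\mu) \prod_{j=1}^{n} \pi(y_j \mid x^*_j;\ \mu, b^*, \sigma^2)$$
• posterior mean
$$E(\mu \mid \mathbf{y}, \mathbf{x}^*;\ b^*, \sigma^2) \approx \frac{\sum_{j=1}^{n} (y_j - b^* x^*_j)}{n}$$
• posterior variance
$$V(\mu \mid \mathbf{y}, \mathbf{x}^*;\ b^*, \sigma^2) \approx \frac{\sigma^2}{n}$$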
Full Conditional for additive Effect
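• normal prior with large variance
• leads to normal posterior
$$\pi(b^* \mid \mathbf{y}, \mathbf{x}^*;\ \mu, \sigma^2) \propto \pi(b^*) \prod_{j=1}^{n} \pi(y_j \mid x^*_j;\ \mu, b^*, \sigma^2)$$
• posterior mean
$$E(b^* \mid \mathbf{y}, \mathbf{x}^*;\ \mu, \sigma^2) \approx \frac{\sum_{j=1}^{n} x^*_j (y_j - \mu)}{\sum_{j=1}^{n} (x^*_j)^2}$$
• posterior variance
$$V(b^* \mid \mathbf{y}, \mathbf{x}^*;\ \mu, \sigma^2) \approx \frac{\sigma^2}{\sum_{j=1}^{n} (x^*_j)^2}$$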
Full Conditional for variance
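• inverse gamma prior with large v/a
$$\sigma^2 \sim \text{Inv-}\Gamma(a, v), \qquad \pi(\sigma^2) \propto (\sigma^2)^{-(a+1)} e^{-v/\sigma^2}$$
• posterior distribution
$$\sigma^2 \mid \mathbf{y}, \mathbf{x}^*;\ \mu, b^* \sim \text{Inv-}\Gamma\!\left(a + \tfrac{n}{2},\ v + \tfrac{n}{2}\hat\sigma^2\right), \qquad \hat\sigma^2 = \frac{1}{n}\sum_{j=1}^{n} (y_j - \mu - b^* x^*_j)^2$$
• posterior mean
$$E(\sigma^2 \mid \mathbf{y}, \mathbf{x}^*;\ \mu, b^*) = \frac{v + \tfrac{n}{2}\hat\sigma^2}{a + \tfrac{n}{2} - 1}$$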
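The three full conditionals (mean, additive effect, variance) can be cycled through as a Gibbs sampler. The following is a minimal sketch under simplifying assumptions, not the workshop's own code: genotypes are treated as known (the full sampler also updates genos), the priors on the mean and additive effect are taken as flat, and the function name `gibbs_single_qtl`, the hyperparameters `a` and `v`, and the simulated data are all hypothetical.

```python
import random

def gibbs_single_qtl(y, x, n_iter=2000, a=0.01, v=0.01, seed=1):
    """Gibbs sampler for y_j = mu + b*x_j + e_j with genotypes x held fixed."""
    rng = random.Random(seed)
    n = len(y)
    sxx = sum(xj * xj for xj in x)
    mu, b, sigma2 = sum(y) / n, 0.0, 1.0
    draws = []
    for _ in range(n_iter):
        # mean | rest ~ N( sum(y - b x) / n, sigma2 / n )
        mean_mu = sum(yj - b * xj for yj, xj in zip(y, x)) / n
        mu = rng.gauss(mean_mu, (sigma2 / n) ** 0.5)
        # additive | rest ~ N( sum x (y - mu) / sum x^2, sigma2 / sum x^2 )
        mean_b = sum(xj * (yj - mu) for yj, xj in zip(y, x)) / sxx
        b = rng.gauss(mean_b, (sigma2 / sxx) ** 0.5)
        # variance | rest ~ Inv-Gamma(a + n/2, v + SS/2):
        # draw 1/sigma2 from a gamma with that shape and scale 1/(v + SS/2)
        ss = sum((yj - mu - b * xj) ** 2 for yj, xj in zip(y, x))
        sigma2 = 1.0 / rng.gammavariate(a + n / 2, 1.0 / (v + ss / 2))
        draws.append((mu, b, sigma2))
    return draws

# Hypothetical simulated data: mean 12, additive effect 2, variance 1.
sim = random.Random(0)
x = [sim.choice([-1, 0, 1]) for _ in range(200)]
y = [12.0 + 2.0 * xj + sim.gauss(0.0, 1.0) for xj in x]
kept = gibbs_single_qtl(y, x)[500:]          # discard a burn-in
post_mean = [sum(d[i] for d in kept) / len(kept) for i in range(3)]
```

With vague priors the posterior means in `post_mean` should sit close to the usual least-squares estimates, which is the point made in the review slide.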
MCMC run for variance
[Figure: trace of the variance over 1000 MCMC runs (values between about 1.0 and 2.0), with a histogram of the sampled variance (frequency 0 to 40).]
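Alternative for Variance: use Inverse Chi-square
• inverse chi-square prior with large d, v
$$\sigma^2 \sim \text{Inv-}\chi^2(d, v), \quad \text{or} \quad dv/\sigma^2 \sim \chi^2_d$$
• posterior distribution
$$\sigma^2 \mid \mathbf{y}, \mathbf{x}^*;\ \mu, b^* \sim \text{Inv-}\chi^2\!\left(d + n,\ \frac{dv + \sum_{j=1}^{n} (y_j - \mu - b^* x^*_j)^2}{d + n}\right)$$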
Multiple QTL model
• trait = mean + add1 + add2 + error
• trait = effect_of_genos + error
• prob( trait | genos, effects )
$$y_j = \mu + b^*_1 x^*_{1j} + b^*_2 x^*_{2j} + e_j$$
$$y_j = \mu + \sum_{r=1}^{m} b^*_r x^*_{rj} + e_j$$
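Vector Notation for QTLs
• inner product for sum
$$\sum_{r=1}^{m} b^*_r x^*_{rj} = \mathbf{b}^{*T} \mathbf{x}^*_j$$
• condense notation
$$\mathbf{b}^* = (b^*_1, \ldots, b^*_m)^T, \qquad \mathbf{x}^*_j = (x^*_{1j}, \ldots, x^*_{mj})^T, \qquad X^* = (\mathbf{x}^*_1, \ldots, \mathbf{x}^*_n)$$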
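Multiple loci
• vector of loci across linkage map
• careful bookkeeping during update
  – identifiability & bump hunting
  – possibility of two loci in one marker interval
• ordered loci are sufficient
$$\lambda = (\lambda_1, \ldots, \lambda_m), \qquad \pi(X^* \mid \lambda) = \prod_{r=1}^{m} \pi(\mathbf{x}^*_r \mid \lambda_r)$$
$$\pi(\mathbf{x}^*_r \mid \lambda_r) = \prod_{j=1}^{n} \pi(x^*_{rj} \mid \lambda_r)$$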
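Posterior: Multiple QTLs
• posterior = likelihood * prior / constant
• posterior( parameters | data )
prob( genos, effects, loci | traits, map )
$$\pi(\lambda, X^*;\ \mu, \mathbf{b}^*, \sigma^2 \mid \mathbf{y})$$
is proportional to
$$\prod_{j=1}^{n} \pi(y_j \mid \mathbf{x}^*_j;\ \mu, \mathbf{b}^*, \sigma^2) \times \pi(\sigma^2)\, \pi(\mu) \prod_{r=1}^{m} \pi(b^*_r)\, \pi(\lambda_r) \prod_{j=1}^{n} \pi(x^*_{rj} \mid \lambda_r)$$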
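MCMC for Multiple QTLs
• construct Markov chain around posterior
• update one (or several) components at a time
  – update effects given genotypes & traits
  – update loci given genotypes & traits
  – update genotypes given loci & effects
• update all terms for each locus at one time?
  – open questions of efficient mixing
$$(\lambda, X^*;\ \mu, \mathbf{b}^*, \sigma^2)_1 \to (\lambda, X^*;\ \mu, \mathbf{b}^*, \sigma^2)_2 \to \cdots \to (\lambda, X^*;\ \mu, \mathbf{b}^*, \sigma^2)_N \sim \pi(\lambda, X^*;\ \mu, \mathbf{b}^*, \sigma^2 \mid \mathbf{y})$$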
MCMC Updates
[Diagram: MCMC update cycle linking traits, genos, effects, and loci.]
MCMC Conditions
• construct Markov chain with stable distribution
• ergodic Markov chain
  – reversible (detailed balance)
  – irreducible (can get from any value to any other)
  – aperiodic (no fixed pattern)
  – positive recurrent (chance to visit all possible values)
$$z_0, z_1, \ldots, z_N \sim \pi(z)$$
$$\pi(z_1)\, \pi(z_2 \mid z_1) = \pi(z_2)\, \pi(z_1 \mid z_2)$$
$$\pi(z_2) = \sum_{\text{all } z} \pi(z)\, \pi(z_2 \mid z) > 0$$
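Detailed balance and the stable distribution can be checked numerically on a toy chain. This is a small illustrative sketch, assuming a three-state target and a Metropolis rule that proposes uniformly among the other states; the target probabilities and the function name `metropolis_matrix` are hypothetical.

```python
def metropolis_matrix(pi):
    """Transition matrix of a Metropolis chain with uniform proposals."""
    k = len(pi)
    P = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i != j:
                # propose j with probability 1/(k-1), accept with min(1, pi_j/pi_i)
                P[i][j] = (1.0 / (k - 1)) * min(1.0, pi[j] / pi[i])
        # remaining probability is the chance of staying put
        P[i][i] = 1.0 - sum(P[i])
    return P

pi = [0.2, 0.5, 0.3]
P = metropolis_matrix(pi)

# reversibility: pi(z1) P(z2 | z1) == pi(z2) P(z1 | z2) for every pair
balance_ok = all(
    abs(pi[i] * P[i][j] - pi[j] * P[j][i]) < 1e-12
    for i in range(3) for j in range(3)
)
# stability: one step of the chain leaves pi unchanged
stationary = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]
```

Because every state can reach every other and the chain can stay in place, this toy chain is also irreducible and aperiodic.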
Reversible Jump MCMC
• basic idea of Green (1995)
• model selection in regression
• how many QTLs?
  – number of QTL is random
  – estimate the number m
• RJ-MCMC vs. Bayes factors
• other similar ideas
Jumping the Number of QTL
• model changes with number of QTL
  – almost analogous to stepwise regression
  – use reversible jump MCMC to change number
• bookkeeping helps in comparing models
• change of variables between models
• prior on number of QTL
  – uniform over some range
  – Poisson with prior mean $\nu$
$$\pi(m \mid \nu) = \frac{e^{-\nu}\, \nu^m}{m!}$$
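As a quick illustration of the Poisson prior on the number of QTL, the sketch below tabulates the prior probabilities for a chosen prior mean. The function name `poisson_prior` and the prior mean of 3 are illustrative choices only.

```python
import math

def poisson_prior(m, prior_mean):
    """Prior probability of m QTL under a Poisson with the given mean."""
    return math.exp(-prior_mean) * prior_mean ** m / math.factorial(m)

# prior probabilities for m = 0, ..., 6 with prior mean 3
prior_table = [poisson_prior(m, 3.0) for m in range(7)]
```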
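Posterior: Number of QTL
• posterior = likelihood * prior / constant
• posterior( parameters | data )
prob( genos, effects, loci, m | traits, map )
$$\pi(\lambda, X^*;\ \mu, \mathbf{b}^*, \sigma^2, m \mid \mathbf{y})$$
is proportional to
$$\prod_{j=1}^{n} \pi(y_j \mid \mathbf{x}^*_j;\ \mu, \mathbf{b}^*, \sigma^2; m) \times \pi(m)\, \pi(\sigma^2)\, \pi(\mu) \prod_{r=1}^{m} \pi(b^*_r)\, \pi(\lambda_r) \prod_{j=1}^{n} \pi(x^*_{rj} \mid \lambda_r)$$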
Reversible Jump Choices
action step: draw one of three choices
• update step with probability 1-b(m+1)-d(m)
  – update current model
  – loci, effects, genotypes as before
• add a locus with probability b(m+1)
  – propose a new locus
  – innovate effect and genotypes at new locus
  – decide whether to accept the “birth” of new locus
• drop a locus with probability d(m)
  – pick one of existing loci to drop
  – decide whether to accept the “death” of locus
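The action step can be sketched as a single uniform draw split into three ranges. The move probabilities below (`birth_prob`, `death_prob`, the cap `M_MAX`, and the value 0.3) are hypothetical placeholders, not values from the workshop.

```python
import random

M_MAX = 6  # hypothetical cap on the number of QTL

def birth_prob(m_next):
    """b(m+1): chance of proposing a birth that would give m_next loci."""
    return 0.3 if m_next <= M_MAX else 0.0

def death_prob(m):
    """d(m): chance of proposing a death when there are m loci."""
    return 0.3 if m > 0 else 0.0

def choose_move(m, rng):
    """Draw 'birth', 'death', or 'update' for a model with m loci."""
    u = rng.random()
    p_birth = birth_prob(m + 1)
    p_death = death_prob(m)
    if u < p_birth:
        return "birth"
    if u < p_birth + p_death:
        return "death"
    return "update"   # probability 1 - b(m+1) - d(m)

rng = random.Random(42)
moves_at_zero = {choose_move(0, rng) for _ in range(1000)}
moves_at_cap = {choose_move(M_MAX, rng) for _ in range(1000)}
```

Setting d(0) = 0 and b(M_MAX + 1) = 0 keeps the chain inside the allowed range of m.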
Markov chain for number m
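• add a new locus
• drop a locus
• update current model
[Diagram: chain of states 0, 1, ..., m-1, m, m+1, ... for the number of QTL m, with moves between neighboring states.]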
Jumping QTL number & loci
[Figure: two panels against MCMC run (0 to 100): locus distance (cM, 20 to 80) with plotting symbols 1, 2, 3, and number of QTL (0 to 3).]
RJ-MCMC Updates
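[Diagram: RJ-MCMC update cycle linking traits, genos, effects, and loci, with three moves: add locus with probability b(m+1), drop locus with probability d(m), and update with probability 1-b(m+1)-d(m).]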
Propose to Add a locus
• propose a new locus
  – similar proposal to ordinary update
    • uniform chance over genome
    • easier to avoid interval with another QTL
  – need genotypes at locus & model effect
• innovate effect & genotypes at new locus
  – draw genotypes based on recombination (prior)
    • no dependence on trait model yet
  – draw effect as in Green’s reversible jump
    • adjust for collinearity
    • modify other parameters accordingly
• check acceptance ...
$$q_b(\lambda) = 1/D$$
Propose to Drop a locus
• choose an existing locus
  – equal weight for all loci?
  – more weight to loci with small effects?
• “drop” effect & genotypes at old locus
  – adjust effects at other loci for collinearity
  – this is reverse jump of Green (1995)
• check acceptance ...
  – do not drop locus, effects & genotypes
  – until move is accepted
$$q_d(r; m) = 1/m$$
Acceptance of Reversible Jump
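• accept birth of new locus with probability min(1, A)
• accept death of old locus with probability min(1, 1/A)
$$A = \frac{\pi(\theta_{m+1}, m+1 \mid \mathbf{y})}{\pi(\theta_m, m \mid \mathbf{y})} \times \frac{d(m+1)}{b(m)} \times \frac{q_d(r; m+1)}{q_b(\lambda)} \times J, \qquad \theta_m = (\lambda, X^*;\ \mu, \mathbf{b}^*, \sigma^2)$$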
Acceptance of Reversible Jump
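• move probabilities
$$\frac{d(m+1)}{b(m)}$$
• birth & death proposals
$$\frac{q_d(r; m+1)}{q_b(\lambda)}$$
• Jacobian between models
  – fudge factor
  – see stepwise regression example
$$J = \frac{\sigma}{s_{r \mid \text{others}} \sqrt{n}}$$
[Diagram: jump between models m and m+1.]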
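The acceptance ratio is a product of a posterior ratio, a move-probability ratio, a proposal ratio, and a Jacobian, so it is convenient to assemble it on the log scale. The sketch below only shows that bookkeeping; the function names and all numeric inputs are hypothetical, and computing the log posterior itself is left out.

```python
import math

def log_birth_ratio(log_post_new, log_post_old,
                    death_move, birth_move, q_drop, q_add, jacobian):
    """log A for a proposed birth: posterior, move, proposal, Jacobian terms."""
    return ((log_post_new - log_post_old)
            + math.log(death_move / birth_move)
            + math.log(q_drop / q_add)
            + math.log(jacobian))

def accept_birth(log_a):
    """Probability min(1, A) of accepting the birth."""
    return math.exp(min(0.0, log_a))

def accept_death(log_a):
    """Probability min(1, 1/A) of accepting the reverse (death) move."""
    return math.exp(min(0.0, -log_a))

# Hypothetical inputs: equal posteriors and move probabilities,
# q_drop = 1/2 (two loci after birth), q_add = 1/100 (uniform over 100 cM),
# Jacobian 0.04, so A = 50 * 0.04 = 2.
log_a = log_birth_ratio(0.0, 0.0, 0.3, 0.3, 0.5, 0.01, 0.04)
p_birth = accept_birth(log_a)
p_death = accept_death(log_a)
```

Working with logs avoids overflow when the posterior ratio is extreme, and the same `log_a` serves both directions of the jump.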
RJ-MCMC: Number of QTL
[Figure: four trace plots of the number of QTL (0 to 6) against MCMC run, MCMC run/10, MCMC run/100, and MCMC run/1000 (each 0 to 800).]
Posterior # QTL for 8-week Data
[Figure: posterior histogram of the number of QTL, m = 0 to 6.]
98% credible region for m: (1,3), based on 1 million steps with prior mean of 3
How Good is RJ-MCMC?
• simulations with 0, 1 or 2 QTL
  – strong effects (additive = 2, variance = 1)
  – linked loci 36 cM apart
• differences with number of QTL
  – clear differences by actual number
  – works well with 100,000, better with 1M
• effect of Poisson prior mean
  – larger prior mean shifts posterior up
  – but prior does not take over
Posterior for Simulated Data
• 0, 1 or 2 large QTL
• prior Poisson mean of 2
• 100,000 RJ-MCMC runs
[Figure: posterior histograms of the number of QTL (0 to 5) in three panels: no QTL present, 1 QTL present, 2 QTL present.]
Effect of Prior Mean
[Figure: prior and posterior histograms of the number of QTL (0 to 5); rows for prior mean = 2 and prior mean = 4, columns for 0, 1, and 2 QTL present.]
# QTL in Brassica Data
• 4-week & 8-week vernalization
  – log( days to flower )
  – 105 lines, 10 markers
  – modest effects
  – evidence of 1 or 2 QTL using Bayes factors
• histograms of posterior number of QTL
  – depends somewhat on prior
  – mode is 1 or 2 QTL
• 90% credible sets
  – all include 2 QTL
  – include 1 QTL if prior not huge
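A credible set for the number of QTL can be read off the posterior histogram by adding values of m in order of decreasing posterior probability until the target level is reached. The sketch below assumes that rule; the function name `credible_set` and the posterior probabilities are hypothetical and are not the Brassica results.

```python
def credible_set(post, level=0.90):
    """Return (lo, hi, attained level) from posterior probabilities of m."""
    chosen, total = [], 0.0
    # add values of m from most to least probable until the level is reached
    for m, p in sorted(post.items(), key=lambda kv: kv[1], reverse=True):
        chosen.append(m)
        total += p
        if total >= level:
            break
    return min(chosen), max(chosen), total

# Hypothetical posterior for the number of QTL.
post = {0: 0.02, 1: 0.60, 2: 0.33, 3: 0.04, 4: 0.01}
lo, hi, attained = credible_set(post)
```

Because m is discrete, the attained level usually exceeds the nominal 90%, which is why a table of credible sets reports the level reached alongside the limits.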
#QTL for Brassica 8-week
[Figure: prior and posterior histograms of the number of QTL (0 to 8) for the 8-week data, one panel each for prior mean = 1, 2, 3, 4, 6, and 10.]
Brassica #QTL 90% Credible Sets
              8-week              4-week
prior mean    lo  hi  level       lo  hi  level
     1         1   2   0.98        1   2   0.99
     2         1   2   0.95        1   2   0.94
     3         1   3   0.98        1   3   0.98
     4         1   3   0.95        1   3   0.93
     6         1   4   0.96        1   4   0.94
    10         2   5   0.90        2   6   0.97
Brassica #QTL Comparison
[Figure: posterior histograms of the number of QTL (0 to 6); rows for 4-week data and 8-week data, columns for prior mean = 1, 2, and 3.]
Reversible Jump II
• reversible jump MCMC details
  – can update model with m QTL
  – have basic idea of jumping models
  – now: careful bookkeeping between models
• RJ-MCMC & Bayes factors
  – Bayes factors from RJ-MCMC chain
  – components of Bayes factors
RJ-MCMC Updates
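[Diagram: RJ-MCMC update cycle linking traits, genos, effects, and loci, with three moves: add locus with probability b(m+1), drop locus with probability d(m), and update with probability 1-b(m+1)-d(m).]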
Reversible Jump Idea
• expand idea of MCMC to compare models
• adjust for parameters in different models
  – augment smaller model with innovations
  – constraints on larger model
• calculus “change of variables” is key
  – add or drop parameter(s)
  – carefully compute the Jacobian
• consider stepwise regression
  – Mallick (1995) & Green (1995)
  – efficient calculation with Householder decomposition
Model Selection in Regression
• known regressors (e.g. markers)
  – models with 1 or 2 regressors
• jump between models
  – centering regressors simplifies calculations
$$m = 1:\quad y_j = \mu + b_1 (x_{1j} - \bar x_1) + e_j$$
$$m = 2:\quad y_j = \mu + b_1 (x_{1j} - \bar x_1) + b_2 (x_{2j} - \bar x_2) + e_j$$
Slope Estimate for 1 Regressor
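recall least squares estimate of slope
note relation of slope to correlation
$$\hat b_1 = \frac{r_{1y}\, s_y}{s_1}, \qquad r_{1y} = \frac{\sum_{j=1}^{n} (x_{1j} - \bar x_1)(y_j - \bar y)/n}{s_1 s_y}$$
$$s_1^2 = \sum_{j=1}^{n} (x_{1j} - \bar x_1)^2 / n, \qquad s_y^2 = \sum_{j=1}^{n} (y_j - \bar y)^2 / n$$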
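The relation between the least squares slope and the correlation can be checked directly on a small data set. The numbers below are hypothetical, and the helper names are illustrative.

```python
def mean(v):
    return sum(v) / len(v)

def sd(v):
    """Standard deviation with divisor n, as in the slide's s_1 and s_y."""
    m = mean(v)
    return (sum((t - m) ** 2 for t in v) / len(v)) ** 0.5

def corr(u, v):
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)
    return cov / (sd(u) * sd(v))

# Hypothetical regressor and trait values.
x1 = [0.0, 1.0, 1.0, 2.0, 0.0, 2.0, 1.0, 0.0]
y = [10.1, 11.9, 12.3, 14.2, 9.8, 13.7, 12.0, 10.4]

m1, my = mean(x1), mean(y)
# direct least squares slope
slope_ls = (sum((a - m1) * (b - my) for a, b in zip(x1, y))
            / sum((a - m1) ** 2 for a in x1))
# slope through the correlation: r * s_y / s_1
slope_r = corr(x1, y) * sd(y) / sd(x1)
```

The two values agree up to rounding, since the divisor n cancels between the covariance and the variances.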
2 Correlated Regressors
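slopes adjusted for other regressors
$$\hat b_1 = \frac{(r_{1y} - r_{12} r_{2y})\, s_y}{(1 - r_{12}^2)\, s_1}, \qquad \hat b_2 = \frac{(r_{2y} - r_{12} r_{1y})\, s_y}{(1 - r_{12}^2)\, s_2}$$
$$c_{21} = \frac{r_{12}\, s_2}{s_1}, \qquad \hat b_1 = \frac{r_{1y}\, s_y}{s_1} - c_{21}\, \hat b_2$$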
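The adjustment for a correlated second regressor can be verified numerically: the joint slope for the first regressor equals its single-regressor slope minus c21 times the second slope. This is a sketch on hypothetical data, solving the two-regressor normal equations by Cramer's rule.

```python
def centered(v):
    m = sum(v) / len(v)
    return [t - m for t in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical regressors and trait values.
x1 = [0.0, 1.0, 1.0, 2.0, 0.0, 2.0, 1.0, 0.0]
x2 = [1.0, 0.0, 1.0, 2.0, 0.0, 1.0, 2.0, 1.0]
y = [10.1, 11.9, 12.3, 14.2, 9.8, 13.7, 12.0, 10.4]

u1, u2, yc = centered(x1), centered(x2), centered(y)
s11, s12, s22 = dot(u1, u1), dot(u1, u2), dot(u2, u2)
s1y, s2y = dot(u1, yc), dot(u2, yc)

b1_alone = s1y / s11                      # slope with one regressor
det = s11 * s22 - s12 ** 2
b1_joint = (s22 * s1y - s12 * s2y) / det  # slopes with both regressors
b2_joint = (s11 * s2y - s12 * s1y) / det
c21 = s12 / s11                           # equals r12 * s2 / s1
```

The identity `b1_joint == b1_alone - c21 * b2_joint` is what lets the jump between models adjust the existing slope without refitting from scratch.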
Gibbs Sampler for Model 1
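• mean
$$\mu \sim N\!\left(\bar y,\ \frac{\sigma^2}{n}\right)$$
• slope
$$b_1 \sim N\!\left(\frac{\sum_{j=1}^{n} (x_{1j} - \bar x_1)\, y_j}{n s_1^2},\ \frac{\sigma^2}{n s_1^2}\right)$$
• variance
$$\sigma^2 \sim \text{Inv-}\Gamma\!\left(a + \tfrac{n}{2},\ v + \tfrac{1}{2}\sum_{j=1}^{n} \big(y_j - \mu - b_1 (x_{1j} - \bar x_1)\big)^2\right)$$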
Gibbs Sampler for Model 2
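• mean
$$\mu \sim N\!\left(\bar y,\ \frac{\sigma^2}{n}\right)$$
• slopes
$$b_2 \sim N\!\left(\frac{\sum_{j=1}^{n} (x_{2j} - \bar x_2)\big(y_j - b_1 (x_{1j} - \bar x_1)\big)}{n s_{2|1}^2},\ \frac{\sigma^2}{n s_{2|1}^2}\right)$$
$$s_{2|1}^2 = \sum_{j=1}^{n} \big(x_{2j} - \bar x_2 - c_{21} (x_{1j} - \bar x_1)\big)^2 / n$$
• variance
$$\sigma^2 \sim \text{Inv-}\Gamma\!\left(a + \tfrac{n}{2},\ v + \tfrac{1}{2}\sum_{j=1}^{n} \Big(y_j - \mu - \sum_{k=1}^{2} b_k (x_{kj} - \bar x_k)\Big)^2\right)$$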
Updates from 2->1
• drop 2nd regressor
$$b_2 \to 0$$
• adjust other regressor
$$b_1 \to b_1 + c_{21} b_2$$
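Updates from 1->2
• add 2nd slope, adjusting for collinearity
$$b_2 = \hat b_2 + zJ, \qquad z \sim N(0, 1), \qquad J = \frac{\sigma}{s_{2|1} \sqrt{n}}$$
$$\hat b_2 = \frac{\sum_{j=1}^{n} (x_{2j} - \bar x_2)\big(y_j - b_1 (x_{1j} - \bar x_1)\big)}{n s_{2|1}^2}$$
• adjust other slope & variance
$$b_1 \to b_1 - c_{21} b_2 = b_1 - c_{21} (\hat b_2 + zJ)$$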
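The two updates are designed to undo each other: jumping from one regressor to two and back returns the original slope and the innovation. A minimal sketch of that round trip, with hypothetical values for the slope, the estimate `b2_hat`, the collinearity term `c21`, and the scale `J`:

```python
def jump_1_to_2(b1, z, b2_hat, c21, J):
    """Add a second slope from innovation z and adjust the first slope."""
    b2 = b2_hat + z * J
    return b1 - c21 * b2, b2

def jump_2_to_1(b1, b2, b2_hat, c21, J):
    """Drop the second slope, restore the first, and recover the innovation."""
    return b1 + c21 * b2, (b2 - b2_hat) / J

# Hypothetical values.
b1_start, z_start = 0.5, 1.3
b2_hat, c21, J = 0.2, 0.7, 0.1

b1_new, b2_new = jump_1_to_2(b1_start, z_start, b2_hat, c21, J)
b1_back, z_back = jump_2_to_1(b1_new, b2_new, b2_hat, c21, J)
```

This invertibility is what makes the jump "reversible": the pair (b1, z) in the smaller model maps one-to-one onto (b1, b2) in the larger model.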
Model Selection in Regression
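• known regressors (e.g. markers)
  – models with 1 or 2 regressors
• jump between models
  – augment with new innovation z

m        parameters                          innovations      transformations
1 -> 2   $(\mu, b_1, \sigma^2; z)$           $z \sim N(0,1)$  $b_2 = \hat b_2 + zJ$, $b_1 \to b_1 - c_{21} b_2$
2 -> 1   $(\mu, b_1, b_2, \sigma^2)$         $z = 0$          $b_1 \to b_1 + c_{21} b_2$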
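Change of Variables
• change variables from model 1 to model 2
• calculus issues for integration
  – need to formally account for change of variables
  – infinitesimal steps in integration (db)
  – involves partial derivatives (next page)
$$\begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = g(b_1; z) = \begin{pmatrix} 1 & -c_{21} J \\ 0 & J \end{pmatrix} \begin{pmatrix} b_1 \\ z \end{pmatrix} + \begin{pmatrix} -c_{21} \hat b_2 \\ \hat b_2 \end{pmatrix}$$
$$\pi(b_1, b_2 \mid \mathbf{y}, \mathbf{x}_1, \mathbf{x}_2)\, db_1\, db_2 = \pi(g(b_1; z) \mid \mathbf{y}, \mathbf{x}_1, \mathbf{x}_2)\, J\, db_1\, dz$$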
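Jacobian & the Calculus
• Jacobian sorts out change of variables
  – careful: easy to mess up here!
$$g(b_1; z) = (b_1 - c_{21} b_2,\ b_2), \qquad b_2 = \hat b_2 + zJ$$
$$\frac{\partial g(b_1; z)}{\partial (b_1, z)} = \begin{pmatrix} 1 & -c_{21} J \\ 0 & J \end{pmatrix}, \qquad \det \begin{pmatrix} 1 & -c_{21} J \\ 0 & J \end{pmatrix} = 1 \cdot J - 0 \cdot (-c_{21} J) = J$$
$$db_1\, db_2 = \left| \det \frac{\partial g(b_1; z)}{\partial (b_1, z)} \right| db_1\, dz = J\, db_1\, dz$$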
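Since the Jacobian is easy to get wrong, a finite-difference check is a cheap safeguard. The sketch below differentiates the map g(b1; z) numerically and compares the determinant with J; the numeric values and the step size `h` are hypothetical.

```python
def g(b1, z, b2_hat, c21, J):
    """Map (b1, z) in model 1 to (b1, b2) in model 2."""
    b2 = b2_hat + z * J
    return b1 - c21 * b2, b2

def numeric_jacobian_det(b1, z, b2_hat, c21, J, h=1e-6):
    """Determinant of the 2x2 matrix of partial derivatives of g."""
    base = g(b1, z, b2_hat, c21, J)
    step_b = g(b1 + h, z, b2_hat, c21, J)
    step_z = g(b1, z + h, b2_hat, c21, J)
    d11 = (step_b[0] - base[0]) / h   # d(new b1)/d(b1)
    d21 = (step_b[1] - base[1]) / h   # d(b2)/d(b1)
    d12 = (step_z[0] - base[0]) / h   # d(new b1)/d(z)
    d22 = (step_z[1] - base[1]) / h   # d(b2)/d(z)
    return d11 * d22 - d12 * d21

det_numeric = numeric_jacobian_det(0.5, 1.3, b2_hat=0.2, c21=0.7, J=0.1)
```

Because g is linear in (b1, z), the numeric determinant matches J up to rounding regardless of where it is evaluated.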
Geometry of Reversible Jump
[Figure: two panels with axes b1 and b2 (each 0.0 to 0.8): "Move Between Models" (c21 = 0.7, showing m=1 and m=2) and "Reversible Jump Sequence".]
QT additive Reversible Jump
[Figure: two panels with axes b1 and b2: "a short sequence" and "first 1000 with m<3".]
Credible Set for additive
[Figure: scatter of b2 against b1 with credible sets.]
90% & 95% sets based on normal
regression line corresponds to slope of updates
Efficient Updating of additive
• more computations when m > 2
• want to avoid matrix inverses
  – decompose matrix instead
  – solve linear system of equations
• use linear algebra
  – Householder (QR) decomposition
  – LAPACK Users' Guide (1995, 2nd ed), Anderson et al., SIAM.
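Householder (QR) Decomposition
• decomposition
  – G is upper triangular
  – F is orthogonal
$$X = F \begin{pmatrix} G \\ 0 \end{pmatrix} = [F_1 : F_2] \begin{pmatrix} G \\ 0 \end{pmatrix} = F_1 G$$
• orthogonality
$$F^T F = F F^T = F_1 F_1^T + F_2 F_2^T = I_n$$
$$F_1^T F_1 = I_m, \qquad F_2^T F_2 = I_{n-m}, \qquad F_1^T F_2 = 0, \qquad F_2^T F_1 = 0$$
• design matrix
$$F_2^T X = 0$$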
QR & Regression
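• model
$$\mathbf{y} = X \mathbf{b} + \mathbf{e}$$
• error piece
$$F_2^T \mathbf{y} = F_2^T X \mathbf{b} + F_2^T \mathbf{e} = F_2^T \mathbf{e}, \qquad SS_{ERROR} = \mathbf{y}^T F_2 F_2^T \mathbf{y}$$
• model piece
$$F_1^T \mathbf{y} = F_1^T X \mathbf{b} + F_1^T \mathbf{e} = G \mathbf{b} + F_1^T \mathbf{e}, \qquad SS_{MODEL} = \mathbf{y}^T F_1 F_1^T \mathbf{y}$$
• estimators
$$\hat{\mathbf{b}} = (X^T X)^{-1} X^T \mathbf{y} = G^{-1} F_1^T \mathbf{y}$$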
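The QR route to the slope estimates can be sketched without any matrix inverse: apply Householder reflections to the design matrix and the trait vector together, then back-substitute through the triangular factor. This is an illustrative pure-Python version, assuming a full-rank design matrix that includes an intercept column; production code would call LAPACK instead, and the function name and data are hypothetical.

```python
import math

def householder_least_squares(X, y):
    """Return (slopes, error sum of squares) via Householder QR."""
    n, m = len(X), len(X[0])
    A = [row[:] + [yi] for row, yi in zip(X, y)]   # augmented matrix [X : y]
    for k in range(m):
        # reflector that zeros column k below the diagonal
        norm = math.sqrt(sum(A[i][k] ** 2 for i in range(k, n)))
        if norm == 0.0:
            raise ValueError("design matrix is rank deficient")
        alpha = -norm if A[k][k] >= 0 else norm
        v = [0.0] * n
        for i in range(k, n):
            v[i] = A[i][k]
        v[k] -= alpha
        vtv = sum(t * t for t in v)
        # apply H = I - 2 v v^T / (v^T v) to remaining columns and to y
        for c in range(k, m + 1):
            f = 2.0 * sum(v[i] * A[i][c] for i in range(k, n)) / vtv
            for i in range(k, n):
                A[i][c] -= f * v[i]
    # rows 0..m-1 now hold G and F1^T y; rows m..n-1 hold F2^T y
    b = [0.0] * m
    for k in reversed(range(m)):
        s = A[k][m] - sum(A[k][c] * b[c] for c in range(k + 1, m))
        b[k] = s / A[k][k]
    ss_error = sum(A[i][m] ** 2 for i in range(m, n))
    return b, ss_error

# Hypothetical exact line y = 1 + 2x, with an intercept column in X.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 3.0, 5.0, 7.0]
b_hat, ss_err = householder_least_squares(X, y)
```

The back-substitution step solves G b = F1^T y, and the leftover rows give the error sum of squares directly, mirroring the model piece and error piece.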
Absorbing Old Model
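• old model
  – m regressors
  – QR decomposition
$$\mathbf{y} = X \mathbf{b}_{old} + \mathbf{e}, \qquad X = F_1 G$$
• new model
  – m+1 regressors
  – use QR to absorb old model
$$\mathbf{y} = X \mathbf{b} + \mathbf{x}_{m+1} b_{m+1} + \mathbf{e}$$
$$F_2^T \mathbf{y} = F_2^T \mathbf{x}_{m+1} b_{m+1} + F_2^T \mathbf{e}$$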
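Adjusted Slope Estimators
• old slopes
  – note m=1 case
$$\hat{\mathbf{b}}_{old} = G^{-1} F_1^T \mathbf{y} \qquad \left(m = 1:\ \hat b_1 = \frac{r_{1y}\, s_y}{s_1}\right)$$
• added slope
  – note sum of squares
$$\hat b_{m+1} = \frac{\mathbf{x}_{m+1}^T F_2 F_2^T \mathbf{y}}{\mathbf{x}_{m+1}^T F_2 F_2^T \mathbf{x}_{m+1}} \qquad \left(m = 1:\ \hat b_2 = \frac{(r_{2y} - r_{12} r_{1y})\, s_y s_2}{s_{2|1}^2}\right)$$
• variance
  – note Jacobian
$$V(\hat b_{m+1}) = \frac{\sigma^2}{\mathbf{x}_{m+1}^T F_2 F_2^T \mathbf{x}_{m+1}} = J^2 \qquad \left(m = 1:\ V = \frac{\sigma^2}{n s_{2|1}^2}\right)$$
• new slopes
$$\hat{\mathbf{b}}_{new} = \hat{\mathbf{b}}_{old} - G^{-1} F_1^T \mathbf{x}_{m+1}\, \hat b_{m+1}$$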
How To Infer loci?
• if m is known, use fixed MCMC
  – histogram of loci
  – issue of bump hunting
• combining loci estimates in RJ-MCMC
  – some steps are from wrong model
    • too few loci (bias)
    • too many loci (variance/identifiability)
  – condition on number of loci
    • subsets of Markov chain
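Conditioning on the number of loci amounts to splitting the saved chain by model size and summarizing each subset separately. A minimal sketch, assuming each saved step is stored as a list of locus positions in cM; the chain values and the function name are hypothetical.

```python
from collections import defaultdict

def loci_by_model_size(chain):
    """Group saved loci vectors by the number of loci m in each step."""
    groups = defaultdict(list)
    for loci in chain:
        groups[len(loci)].append(sorted(loci))
    return dict(groups)

# Hypothetical saved steps from an RJ-MCMC run (positions in cM).
chain = [[41.0], [40.5, 78.2], [80.1, 39.8], [42.3], [40.1, 60.0, 79.5]]
by_m = loci_by_model_size(chain)

# summarize the first (leftmost) locus using only the m = 2 steps
first_locus_m2 = [loci[0] for loci in by_m[2]]
mean_first_locus_m2 = sum(first_locus_m2) / len(first_locus_m2)
```

Sorting within each step uses the fact that ordered loci are sufficient, so the first locus always means the leftmost one.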
Brassica 8-week Data locus MCMC with m=2
[Figure: trace of the two loci (plotting symbols 1 and 2) as distance (cM, 20 to 80) over 1000 MCMC runs, with a histogram of locus positions (frequency 0 to 300).]
Jumping QTL number & loci
[Figure: two panels against MCMC run (0 to 100): locus distance (cM, 20 to 80) with plotting symbols 1, 2, 3, and number of QTL (0 to 3).]
RJ-MCMC loci chain
[Figure: four panels of the RJ-MCMC loci chain against MCMC run, MCMC run/10, MCMC run/100, and MCMC run/1000 (each 0 to 800), vertical axis 0 to 80, with plotting symbols 1 to 6.]
2
2
2
22
2
2
2
2
2
2
2
2
2
2
2
22
2
2
22
22222
2
22
2
2
2
2222
2
2
2
2
22
22
2
2
2
222
2
2
2
2
2
2
2222
22
2
2
2
2
2
2
2
222
2
2
2
2
22
222
2
2
2
2
22
2
2
2
2
2
2
2
2
2
22
2
2222
2
2
2
2
2
2
223
3
333
3
33
3
3
3
3
3
33
3
3333
3
3
333333
3
33
3
33
333
33
3
333
3
3
3
3
3
3
3
3
3333
3
3
3
33
3
3
3
3
3
3
33
3
33
3
3
3
3
3
3333333
3
3
333
3
3
33
3
33
33
3
3333333
3
3
33
3
33
3
3
3
333
333
33
3
33
3
33333
3
33
3
3
3
3
33
3
3
3
3
3
33
33
33
33
3
3
3
3333
3
334
4
44
4
44 444 4 44
4 44 4
4
4
MCMC run/1000
num
ber
of Q
TL
0 400 8000
20
40
60
80
Raw Histogram of loci
[Figure: histogram of sampled loci; x-axis: distance (cM), 0 to 80]
Conditional Histograms
[Figure: histograms of sampled loci conditional on the number of QTL, one panel each for m = 1, m = 2, m = 3 and m > 3; x-axis: distance (cM)]
Bayes Factors
• ratio of posterior odds to prior odds
  – RJ-MCMC gives posterior on number of QTL
  – prior is Poisson

BF(1:2)      2log(BF)    evidence for 1st
< 1          < 0         negative
1 to 3       0 to 2      negligible
3 to 12      2 to 5      positive
12 to 150    5 to 10     strong
> 150        > 10        very strong

$$B_{12} = \frac{\mathrm{pr}(\mathrm{model}_1 \mid \mathbf{y}) \,/\, \mathrm{pr}(\mathrm{model}_2 \mid \mathbf{y})}{\mathrm{pr}(\mathrm{model}_1) \,/\, \mathrm{pr}(\mathrm{model}_2)} = \frac{\mathrm{pr}(\mathbf{y} \mid \mathrm{model}_1)}{\mathrm{pr}(\mathbf{y} \mid \mathrm{model}_2)}$$
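A minimal sketch of this ratio, assuming the posterior odds are estimated by the relative number of RJ-MCMC iterations spent in each model and the prior on the number of QTL is Poisson. The function names are illustrative and are not part of the revjump software. The visit counts are the m=0, m=1, ... tallies shown on the result.write slide; pairing them with a prior mean of 4 is an assumption made only for illustration, as is the treatment of the table boundaries as half-open intervals.

```python
import math

def poisson_pmf(m, mean):
    """Poisson prior probability that the model has m QTL."""
    return math.exp(-mean) * mean ** m / math.factorial(m)

def bayes_factor(counts, prior_mean, m1, m2):
    """BF(m1:m2) = posterior odds / prior odds.

    counts[m] is the number of RJ-MCMC iterations spent in the m-QTL model,
    so counts[m1] / counts[m2] estimates the posterior odds.
    """
    if counts[m2] == 0:
        raise ValueError("model m2 was never visited; odds cannot be estimated")
    posterior_odds = counts[m1] / counts[m2]
    prior_odds = poisson_pmf(m1, prior_mean) / poisson_pmf(m2, prior_mean)
    return posterior_odds / prior_odds

def evidence(bf):
    """Verbal scale from the table above (evidence for the first model)."""
    if bf < 1:
        return "negative"
    if bf < 3:
        return "negligible"
    if bf < 12:
        return "positive"
    if bf < 150:
        return "strong"
    return "very strong"

# Visit counts for m = 0, 1, 2, ... copied from the result.write slide.
counts = [1936, 428306, 425105, 126871, 16703, 1047, 32, 0, 0, 0]
bf12 = bayes_factor(counts, 4.0, 1, 2)   # assumed prior mean of 4
print(round(bf12, 3), evidence(bf12))
```

Because only ratios of prior probabilities enter, truncating the Poisson prior at a maximum number of QTL would not change the result.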
#QTL for Brassica 8-week
[Figure: prior and posterior probabilities for the number of QTL (0 to 8), one panel per prior mean = 1, 2, 3, 4, 6, 10]
RJ-Bayes Factors (8-week Brassica data)

         prior mean
ratio    1          2        3        4        6       10
1:2      2.87       1.91     1.51     1.45     1.12    0.85
1:3      27.62      9.10     5.06     4.22     2.28    1.28
1:4      1743.29    81.30    28.85    18.51    7.17    2.51
2:3      9.63       4.76     3.35     2.91     2.04    1.5
2:4      608.00     42.51    19.09    12.75    6.41    2.95
3:4      63.13      8.93     5.70     4.39     3.15    1.96
Simulation Study of Prior Effect
• how dramatic is the effect of prior?
• simulations of 0, 1 or 2 QTL
  – QTL have large effect
    • additive = 2, variance = 1
  – 2 QTL spaced 36cM apart
  – sample size of 105
• RJ-MCMC runs of 100,000
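A hedged sketch of how data like these could be simulated. The slide gives only the effect size, error variance, spacing and sample size; the genotype coding (-1, 1), the Haldane map function, the zero trait mean and every function name below are assumptions for illustration, not the procedure used in the study.

```python
import math
import random

def haldane_recomb(d_cM):
    """Recombination fraction for a map distance in cM (Haldane map function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cM / 100.0))

def simulate_qtl_data(n=105, n_qtl=2, additive=2.0, variance=1.0,
                      spacing_cM=36.0, mean=0.0, seed=1):
    """Simulate QTL genotypes (coded -1/1) and traits for n individuals."""
    if n_qtl not in (0, 1, 2):
        raise ValueError("this sketch handles 0, 1 or 2 QTL only")
    rng = random.Random(seed)
    r = haldane_recomb(spacing_cM)
    genos, traits = [], []
    for _ in range(n):
        x1 = rng.choice((-1, 1))
        # the second locus differs from the first with probability r
        x2 = -x1 if rng.random() < r else x1
        x = (x1, x2)[:n_qtl]
        y = mean + additive * sum(x) + rng.gauss(0.0, math.sqrt(variance))
        genos.append(x)
        traits.append(y)
    return genos, traits

genos, traits = simulate_qtl_data(n_qtl=2)
print(len(traits), round(haldane_recomb(36.0), 4))
```

Setting `n_qtl=0` gives pure noise around the mean, which corresponds to the "0 QTL" case.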
Effect of Prior Mean
[Figure: prior and posterior probabilities for the number of QTL (0 to 5); rows: prior mean = 2 and prior mean = 4; columns: 0 QTL present, 1 QTL present, 2 QTL present]
Bayes Factor

prior of 2
         m
ratio    0         1        2
0:1      3.85      0        0
0:2      50.93     0        0
0:3      569.11    0.03     0
1:2      13.22     1.87     0
1:3      147.75    30.09    0
2:3      11.17     16.05    2.38

prior of 4
         m
ratio    0         1        2
0:1      0.97      0        0
0:2      3.02      0        0
0:3      15.07     0        0
1:2      3.12      1.32     0
1:3      15.54     3.04     0
2:3      4.99      2.58     0.75
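Computing Bayes Factors
• arithmetic mean
  – using samples from prior
  – mean across Monte Carlo or MCMC runs
  – can be inefficient if prior differs from posterior
• harmonic mean
  – using samples from posterior
  – more efficient but less stable
  – careful choice of weight h() close to posterior

$$\mathrm{pr}(\mathbf{y} \mid \mathrm{model}_k) = \int \mathrm{pr}(\mathbf{y} \mid \theta_k; \mathrm{model}_k)\, \mathrm{pr}(\theta_k \mid \mathrm{model}_k)\, d\theta_k$$

$$\widehat{\mathrm{pr}}(\mathbf{y} \mid \mathrm{model}_k) = \left[ \frac{1}{G} \sum_{g=1}^{G} \frac{h(\theta_k^{(g)})}{\mathrm{pr}(\mathbf{y} \mid \theta_k^{(g)}; \mathrm{model}_k)\, \mathrm{pr}(\theta_k^{(g)} \mid \mathrm{model}_k)} \right]^{-1}$$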
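The two estimators can be compared on a toy problem where the marginal likelihood is known in closed form. The sketch below assumes a conjugate normal model (y_i ~ N(theta, 1), theta ~ N(0, tau2)), which is not the QTL model of these slides, and hypothetical data values; all names are illustrative. The weight h() is taken to be the exact posterior density, the ideal case of "h() close to posterior", in which every term of the weighted harmonic mean equals 1/pr(y).

```python
import math
import random

def log_norm_pdf(x, mean, var):
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (x - mean) ** 2 / var

def log_lik(theta, y):
    """log pr(y | theta) for y_i ~ N(theta, 1)."""
    return sum(log_norm_pdf(yi, theta, 1.0) for yi in y)

def exact_log_marginal(y, tau2):
    """Closed-form log pr(y) under the prior theta ~ N(0, tau2)."""
    n, s, ss = len(y), sum(y), sum(yi * yi for yi in y)
    return (-0.5 * n * math.log(2.0 * math.pi)
            - 0.5 * math.log(1.0 + n * tau2)
            - 0.5 * (ss - tau2 * s * s / (1.0 + n * tau2)))

def arithmetic_mean_estimate(y, tau2, G, rng):
    """Average the likelihood over G draws from the prior."""
    draws = (rng.gauss(0.0, math.sqrt(tau2)) for _ in range(G))
    return sum(math.exp(log_lik(t, y)) for t in draws) / G

def weighted_harmonic_estimate(y, tau2, G, rng):
    """Weighted harmonic mean over G posterior draws, with h = posterior density."""
    n, s = len(y), sum(y)
    post_var = tau2 / (1.0 + n * tau2)
    post_mean = post_var * s
    total = 0.0
    for _ in range(G):
        t = rng.gauss(post_mean, math.sqrt(post_var))
        log_term = (log_norm_pdf(t, post_mean, post_var)
                    - log_lik(t, y) - log_norm_pdf(t, 0.0, tau2))
        total += math.exp(log_term)
    return 1.0 / (total / G)

y = [0.3, -0.1, 0.8, 0.5, 1.2]   # hypothetical data
tau2 = 4.0
exact = math.exp(exact_log_marginal(y, tau2))
am = arithmetic_mean_estimate(y, tau2, 20000, random.Random(1))
whm = weighted_harmonic_estimate(y, tau2, 2000, random.Random(2))
print(exact, am, whm)
```

In practice h() can only approximate the posterior, so the harmonic estimate is no longer exact and can become unstable when h() has heavier tails than the posterior.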
Stable Bayes Factors
• Satagopan, Raftery & Newton (1999)
  – weighted harmonic mean
  – absorb variance (normal to t dist): replace $\mathrm{pr}(\mathbf{y} \mid \mu, b^*, \sigma^2; X^*)$ by $\mathrm{pr}(\mathbf{y} \mid \mu, b^*; X^*)$
Bayes Factors & LODs
• others have tried arithmetic & harmonic mean
• why not geometric mean?
• terms that are averaged are log likelihoods...

$$\widehat{\mathrm{pr}}(\mathbf{y} \mid \mathrm{model}_k) = \exp\left( \frac{1}{G} \sum_{g=1}^{G} \log \mathrm{pr}(\mathbf{y} \mid \theta_k^{(g)}; \mathrm{model}_k) \right), \quad g = 1, \ldots, G \text{ MCMC runs}$$
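Bayesian LOD
• Bayesian “LOD” computed at each step
  – based on LR given sampled genotypes and effects
  – can be larger or smaller than profile LOD
  – informal diagnostic of fit
  – combine to form geometric estimates of Bayes factors

$$\mathrm{LOD}(\lambda) = \log_{10}(e) \sum_{j=1}^{n} \ln \frac{\sum_{x=-1,0,1} \mathrm{pr}(y_j \mid x; \hat\mu, \hat b^*, \hat\sigma^2)\, \mathrm{pr}(x \mid \lambda)}{\mathrm{pr}(y_j \mid \hat\mu, b^* = 0, \hat{\hat\sigma}^2)}$$

$$\mathrm{BLOD} = \log_{10}(e) \sum_{j=1}^{n} \ln \frac{\mathrm{pr}(y_j \mid x_j^*; \mu, b^*, \sigma^2)}{\mathrm{pr}(y_j \mid \mu, b^* = 0, \sigma^2)}$$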
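A small sketch of the BLOD at a single MCMC step, assuming the normal trait model (trait = mean + additive effect times genotype + error) and that the sampled genotypes and effects for that step are already in hand. The function names and the example numbers are hypothetical.

```python
import math

def log_norm_pdf(y, mean, var):
    return -0.5 * math.log(2.0 * math.pi * var) - 0.5 * (y - mean) ** 2 / var

def blod(y, x, mu, b, sigma2):
    """log10 likelihood ratio of the sampled QTL model against b = 0.

    y: trait values; x: sampled QTL genotypes for the same individuals;
    mu, b, sigma2: sampled mean, additive effect and variance.
    """
    log_lr = sum(log_norm_pdf(yj, mu + b * xj, sigma2)
                 - log_norm_pdf(yj, mu, sigma2)
                 for yj, xj in zip(y, x))
    return math.log10(math.e) * log_lr

# Hypothetical step: two individuals fitted exactly by mu = 10, b = 2.
print(blod([12.0, 8.0], [1, -1], 10.0, 2.0, 1.0))
```

Unlike the profile LOD, nothing is maximised here, so a poorly sampled genotype vector or effect can give a negative value.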
Compare LODs
• scatter plot of loci and Bayesian LODs
  – same BLOD for all loci of step
  – overlay LOD from interval mapping (red)
  – overlay LOD from CIM (green)
  – vertical lines at true or inferred loci (blue)
• steps with higher BLOD
  – may have more likely genotypes
  – basis for MCMC step
    • always up to higher likelihood
    • sometimes down to lower likelihood
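The "always up, sometimes down" rule is the Metropolis acceptance step. A minimal sketch, assuming a symmetric proposal so that only the ratio of target densities matters; the function name is illustrative and this is not the revjump implementation.

```python
import math
import random

def accept_move(log_target_current, log_target_proposed, rng):
    """Metropolis rule: always accept uphill, accept downhill with prob. = ratio."""
    log_ratio = log_target_proposed - log_target_current
    if log_ratio >= 0.0:
        return True                       # always up
    return rng.random() < math.exp(log_ratio)   # sometimes down

rng = random.Random(0)
n_trials = 10000
# A proposal one log-unit worse should be accepted about exp(-1) of the time.
rate = sum(accept_move(0.0, -1.0, rng) for _ in range(n_trials)) / n_trials
print(rate)
```

For non-symmetric proposals, and for the birth and death moves of RJ-MCMC, the ratio also needs proposal and Jacobian terms, which this sketch omits.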
BLODs with no QTL
• LOD profile roughly zero
• most BLOD values should be negative
  – no pattern across linkage group
• distribution similar to rescaled chi-square with 1 df
  – -2log(LR) approximately chi-square
  – assignment of genotypes “irrelevant”
• RJ-MCMC skewed with inferred QTL
  – numbers indicate #QTL
Simulated Data with No QTL
[Figure: simulated trait data with no QTL (axis values 7 to 12)]
BLOD: no QTL
[Figure: BLOD (-3 to 0) against distance (cM, 0 to 80), with a marginal frequency histogram of LOD]
RJ-MCMC BLOD: no QTL
[Figure: BLOD (-3 to 2) against distance (cM, 0 to 80), points labelled by number of QTL, with a marginal frequency histogram of LOD]
BLODs with 1 QTL
• LOD peaks at correct locus
• most BLOD values near locus
  – some considerably larger than LOD
  – inferred genotype vs EM average
• non-central distribution of BLOD
  – rescaled non-central chi-square?
• RJ-MCMC dispersed from peak
  – locus proposal has local dispersion
LOD for 1 QTL
[Figure: LOD (5 to 35) against distance (cM, 0 to 80), with a marginal frequency histogram of LOD]
RJ-MCMC BLOD: 1 QTL
[Figure: BLOD (5 to 35) against distance (cM, 0 to 80), points labelled by number of QTL, with a marginal frequency histogram of LOD]
BLODs with 2 QTL
• incorrect fit of 1 QTL model
  – LOD peaks at ghost locus
  – BLOD values near LOD peaks
• correct 2-QTL model
  – IM misses, CIM gets loci
  – BLOD at loci
  – BLOD approx. 2x LOD
    • simultaneous fit of both loci
• RJ-MCMC dispersed from peak
  – need to look conditional on m=2
LOD for 2 QTL with 1 Fit
[Figure: LOD (5 to 35) against distance (cM, 0 to 80), with a marginal frequency histogram of LOD]
LOD for 2 QTL with 2 Fit
[Figure: LOD (10 to 60) against distance (cM, 0 to 80), with a marginal frequency histogram of LOD]
RJ-MCMC BLOD: 2 QTL
[Figure: BLOD (0 to 60) against distance (cM, 0 to 80), points labelled by number of QTL, with a marginal frequency histogram of LOD]
Brassica BLODs
• 4-week clearer than 8-week
  – ghost QTL or smear for 8-week?
• MCMC with m=2 fairly clear
• RJ-MCMC dispersed
  – conditioning on m=2 similar to MCMC
    • not shown
  – mixing models together
  – local proposal moves hamper mixing
Brassica 4-week BLOD Map
[Figure: BLOD (0 to 25) against distance (cM, 0 to 80), with a marginal frequency histogram of LOD]
Brassica 8-week BLOD Map
[Figure: BLOD (0 to 10) against distance (cM, 0 to 80), with a marginal frequency histogram of LOD]
4-week RJ-MCMC BLOD
[Figure: BLOD (0 to 25) against distance (cM, 0 to 80), points labelled by number of QTL, with a marginal frequency histogram of LOD]
8-week RJ-MCMC BLOD
[Figure: BLOD (0 to 10) against distance (cM, 0 to 80), points labelled by number of QTL, with a marginal frequency histogram of LOD]
The Art of MCMC
• convergence issues
  – burn-in period & when to stop
• proper mixing of the chain
  – smart proposals & smart updates
• frequentist approach
  – simulated annealing: reaching the peak
  – simulated tempering: heating & cooling the chain
• Bayesian approach
  – influence of priors on posterior
  – Rao-Blackwell smoothing
• bump-hunting for mixtures (e.g. QTL)
RJ-MCMC Software
• General MCMC software
  – U Bristol links
    • http://www.stats.bris.ac.uk/MCMC/pages/links.html
  – BUGS (Bayesian inference Using Gibbs Sampling)
    • http://www.mrc-bsu.cam.ac.uk/bugs/
• Our MCMC software for QTLs
  – C code using LAPACK
    • ftp://ftp.stat.wisc.edu/pub/yandell/revjump.tar.gz
  – coming soon:
    • perl preprocessing (to/from QtlCart format)
    • Splus post processing
    • Bayes factor computation
RJ-MCMC Software Details
• input files
  – marker.dist: distances between loci (cM)
  – marker.mark: marker genotypes (-1,1), one line per marker
  – trait.y: trait phenotypes
• output files
  – result.write: results
  – result.error: errors if any
trait.y file
missing value = -99.000000 8.097789 7.928476 12.538893 8.370929 8.347937 11.252023 10.687911 6.961958 12.318708 11.985510 11.841586 8.339165 9.347868 11.933833 7.593694 12.248663 13.183897 9.928043 8.326296 ...
marker.dist file
8.8
11.8
6.8
6.8
8.7
10.7
10.5
5.1
14.7
marker.mark file
one row per marker, one column per line
-1 -1 -1 1 -1 1 1 ...
-1 -1 -1 1 -1 1 1 ...
-1 -1 1 1 -1 1 1 ...
-1 -1 1 1 -1 1 1 ...
-1 -1 1 -1 -1 1 1 ...
-1 -1 1 -1 -1 1 1 ...
-1 -1 1 -1 -1 1 1 ...
-1 -1 1 -1 -1 1 1 ...
-1 -1 1 -1 -1 1 -1 ...
-1 1 1 1 -1 1 -1 ...
nval.dat file
1 # 1=revjump,0=no
105 10 # n=individuals markers
10000 10 # N=MCMC_runs skips
2 # m
0 0.5 # mu sigmasq
0 0 # initial b’s
3.5 7.5 # initial loci
0 10 1 1 # prior(mu)~N(0,10) prior(sigmasq)~IG(1,1)
0 10 # prior(b)~N(0,10)
4 # prior(m)~Poisson(4)
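The layout above (whitespace-separated numbers, with everything after # a comment) can be read with a few lines of code. This is a sketch only: it is not the reader used by the C program, the dictionary keys are hypothetical names chosen here, and it assumes the line order is fixed as shown on the slide.

```python
def parse_nval(text):
    """Return one list of numbers per non-empty line, ignoring # comments."""
    rows = []
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if fields:
            rows.append([float(f) for f in fields])
    return rows

# The example file from the slide (comments are stripped by the parser).
NVAL_TEXT = """\
1 # 1=revjump,0=no
105 10 # n=individuals markers
10000 10 # N=MCMC_runs skips
2 # m
0 0.5 # mu sigmasq
0 0 # initial b's
3.5 7.5 # initial loci
0 10 1 1 # prior(mu) prior(sigmasq)
0 10 # prior(b)
4 # prior(m)~Poisson(4)
"""

rows = parse_nval(NVAL_TEXT)
settings = {
    "revjump": bool(rows[0][0]),
    "n_individuals": int(rows[1][0]),
    "n_markers": int(rows[1][1]),
    "mcmc_runs": int(rows[2][0]),
    "initial_m": int(rows[3][0]),
    "initial_loci": rows[6],
    "prior_m_mean": rows[9][0],
}
print(settings)
```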
result.write file
m mu b(1)..b(m) sigmasq lambda(1)..lambda(m) move LOD
2 3.0702 -0.0820 0.0975 0.2085 10.5301 83.7381 1 -0.3943
3 3.0512 0.0343 -0.1166 -0.0600 0.0853 18.4030 51.2329 76...
2 3.0400 -0.0415 -0.0946 0.0619 28.6062 74.0904 3 6.1284
...
1 3.1045 -0.1412 0.0851 76.9420 3 6.1310
2 3.0669 -0.0434 -0.1576 0.1091 17.4566 80.7973 3 4.3184

         propose    accept
birth    384474     100625
death    227999     100625
locus    387527     259705

m=0     m=1       m=2       m=3       m=4      m=5     ...
1936    428306    425105    126871    16703    1047    32    0    0    0
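Each output row has a length that depends on m (2m + 5 fields under the header shown above), so it has to be split using the first field. The sketch below assumes exactly that column order; the function and key names are illustrative. It also turns the propose/accept tallies from the slide into acceptance rates.

```python
def parse_result_row(line):
    """Split one result.write row: m mu b(1..m) sigmasq lambda(1..m) move LOD."""
    f = line.split()
    m = int(f[0])
    if len(f) != 2 * m + 5:
        raise ValueError("expected %d fields for m=%d, got %d" % (2 * m + 5, m, len(f)))
    return {
        "m": m,
        "mu": float(f[1]),
        "b": [float(v) for v in f[2:2 + m]],
        "sigmasq": float(f[2 + m]),
        "loci": [float(v) for v in f[3 + m:3 + 2 * m]],
        "move": int(f[3 + 2 * m]),
        "lod": float(f[4 + 2 * m]),
    }

row = parse_result_row("2 3.0702 -0.0820 0.0975 0.2085 10.5301 83.7381 1 -0.3943")

# Proposal and acceptance tallies copied from the slide.
proposed = {"birth": 384474, "death": 227999, "locus": 387527}
accepted = {"birth": 100625, "death": 100625, "locus": 259705}
rates = {move: accepted[move] / proposed[move] for move in proposed}
print(row["loci"], {k: round(v, 3) for k, v in rates.items()})
```

The equal numbers of accepted births and deaths are what one would expect from a chain that starts and ends in models of similar size.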
Bayes Factor References
• MA Newton & AE Raftery (1994) “Approximate Bayesian inference with the weighted likelihood bootstrap”, J Royal Statist Soc B 56: 3-48.
• RE Kass & AE Raftery (1995) “Bayes factors”, J Amer Statist Assoc 90: 773-795.
• JM Satagopan, MA Newton & AE Raftery (1999) ms in prep.
Reversible Jump MCMC References
• PJ Green (1995) “Reversible jump Markov chain Monte Carlo computation and Bayesian model determination”, Biometrika 82: 711-732.
• S Richardson & PJ Green (1997) “On Bayesian analysis of mixtures with an unknown number of components”, J Royal Statist Soc B 59: 731-792.
• BK Mallick (1995) “Bayesian curve estimation by polynomials of random order”, TR 95-19, Math Dept, Imperial College London.
• L Kuo & B Mallick (1996) “Bayesian variable selection for regression models”, ASA Proc Section on Bayesian Statistical Science, 170-175.
QTL Reversible Jump MCMC: Inbred Lines
• JM Satagopan & BS Yandell (1996) “Estimating the number of quantitative trait loci via Bayesian model determination”, Proc JSM Biometrics Section.
• DA Stephens & RD Fisch (1998) “Bayesian analysis of quantitative trait locus data using reversible jump Markov chain Monte Carlo”, Biometrics 54: 1334-1347.
• MJ Sillanpaa & E Arjas (1998) “Bayesian mapping of multiple quantitative trait loci from incomplete inbred line cross data”, Genetics 148: 1373-1388.
• R Waagepetersen & D Sorensen (1999) “Understanding reversible jump MCMC”.
QTL Reversible Jump MCMC: Pedigrees
• S Heath (1997) “Markov chain Monte Carlo segregation and linkage analysis for oligogenic models”, Am J Hum Genet 61: 748-760.
• I Hoeschele, P Uimari, FE Grignola, Q Zhang & KM Gage (1997) “Advances in statistical methods to map quantitative trait loci in outbred populations”, Genetics 147: 1445-1457.
• P Uimari & I Hoeschele (1997) “Mapping linked quantitative trait loci using Bayesian analysis and Markov chain Monte Carlo algorithms”, Genetics 146: 735-743.
• MJ Sillanpaa & E Arjas (1999) “Bayesian mapping of multiple quantitative trait loci from incomplete outbred offspring data”, Genetics 151: 1605-1619.
MCMC run of loci for 2 QTL
[Figure: sampled loci of QTL 1 and QTL 2 (distance (cM), 40 to 80) against MCMC run (0 to 1000), with a marginal frequency histogram]
Simulated Data effect Correlation
[Figure: scatter plot of effect 2 against effect 1 (both about 1.8 to 2.6); cor = -0.58]
Samples from MCMC
• ergodic Markov chain
  – sample averages tend to mean of stable dist
  – but samples are dependent
• ordinary sample summaries
  – means, quantiles, histograms
  – high probability density (HPD) regions
• sample chain for long run (100,000-1,000,000)
• use time series methods to study chain
  – standard error of estimates
  – Monte Carlo error
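One common time-series device for the Monte Carlo error of a chain average is the method of batch means; the slides do not say which method was used, so this choice is an assumption. The sketch below compares it with the naive standard error, which treats the samples as independent, on a simulated AR(1) chain standing in for MCMC output. All names are illustrative.

```python
import math
import random

def naive_se(chain):
    """Standard error of the mean that ignores dependence between samples."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((v - mean) ** 2 for v in chain) / (n - 1)
    return math.sqrt(var / n)

def batch_means_se(chain, n_batches=20):
    """Standard error of the mean from non-overlapping batch means."""
    batch_size = len(chain) // n_batches
    if n_batches < 2 or batch_size < 1:
        raise ValueError("need at least 2 batches with 1 or more samples each")
    means = [sum(chain[i * batch_size:(i + 1) * batch_size]) / batch_size
             for i in range(n_batches)]
    grand = sum(means) / n_batches
    var_of_means = sum((m - grand) ** 2 for m in means) / (n_batches - 1)
    return math.sqrt(var_of_means / n_batches)

def ar1_chain(n, rho, rng):
    """Dependent chain: value[t] = rho * value[t-1] + N(0, 1) noise."""
    value, chain = 0.0, []
    for _ in range(n):
        value = rho * value + rng.gauss(0.0, 1.0)
        chain.append(value)
    return chain

chain = ar1_chain(20000, 0.9, random.Random(3))
print(naive_se(chain), batch_means_se(chain))
```

With strong positive autocorrelation the batch-means value is several times the naive one, which is why dependent samples call for longer runs.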
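QTL Bayesian Posterior
• posterior = likelihood * prior / constant
• posterior distribution is proportional to
  – likelihood of traits & genos given effects & locus
  – prior distribution of effects & locus
• posterior distribution of genotypes, effects & locus given traits and linkage map

$$\mathrm{pr}(\mathbf{x}^*; \mu, b^*, \sigma^2; \lambda \mid \mathbf{y}) \propto \mathrm{pr}(\mu, b^*, \sigma^2; \lambda) \prod_{j=1}^{n} \mathrm{pr}(y_j, x_j^* \mid \mu, b^*, \sigma^2; \lambda)$$

$$= \mathrm{pr}(\lambda)\, \mathrm{pr}(\mu)\, \mathrm{pr}(b^*)\, \mathrm{pr}(\sigma^2) \prod_{j=1}^{n} \mathrm{pr}(x_j^* \mid \lambda)\, \mathrm{pr}(y_j \mid x_j^*; \mu, b^*, \sigma^2)$$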
Brassica napus Experimental Results
• two QTLs inferred on LG9
• corroborated by Butruille (1998)
• now exploiting synteny with Arabidopsis thaliana
Brassica napus Experimental Design
• 3 vernalization treatments
  – none
  – 4 week
  – 8 week
• only consider 8 week vernalization here
• non-flowering lines for none, 4-week
• skew suggests log transformation
• data averaged over 3 blocks (experimental unit, EU = line)
Scatter Plot of 2 loci Subset
[Figure: scatter plot of the two sampled loci; both axes run from 0 to 80]
RJ-MCMC Changing loci
[Figure: sampled QT loci (20 to 80) against RJ-MCMC run (0 to 100); plotting symbols 1, 2 and 3 label the loci]
Bayesian Inference for QTLs in Inbred Lines I
Brian S Yandell
University of Wisconsin-Madison
www.stat.wisc.edu/~yandell
NCSU Statistical Genetics
June 1999
Overview
• Part I: Single QTL
  – Modeling a trait with a QTL
  – Likelihoods, Frequentists & Bayesians
  – Profile LODs & Bayesian Posteriors
  – Graphical Summaries
  – Testing & Estimation
• Part II: Multiple QTLs
  – Sampling from a Markov Chain
  – Issues for Multiple QTLs
  – Bayes Factors & Model Selection
  – Brassica data analysis
Part I: Single QTL
• Modeling a trait with a QTL
  – linear model for trait given genotype
  – recombination near loci for genotype
• Bayesian Posterior
• Likelihoods
  – EM & MCMC
  – Frequentists & Bayesians
• Review Interval Maps & Profile LODs
• Case Study: Simulated Single QTL
QTL Components
• observed data on individual
  – trait: field or lab measurement
    • log( days to flowering ), yield, …
  – markers: from wet lab (RFLPs, etc.)
    • linkage map of markers assumed known
• unobserved data on individual
  – geno: genotype (QQ=1/Qq=0/qq=-1)
• unknown model parameters
  – effects: mean, difference, variance
  – locus: quantitative trait locus (QTL)
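Single QTL trait Model
• trait = mean + additive + error
• trait = effect_of_geno + error
• prob( trait | geno, effects )
$$ y_j = \mu + b^* x_j^* + e_j $$
$$ \pi(y_j \mid x_j^*; \mu, b^*, \sigma^2) = \text{normal density with mean } \mu + b^* x_j^* \text{ and variance } \sigma^2 $$
[Figure: trait densities for genotypes x=-1, x=0 and x=1; trait axis from 10 to 14]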
Simulated Data with 1 QTL
[Figure: histograms of the simulated trait (6 to 14) for genotypes x=-1 and x=1; frequency axis from 0 to 8]
Recombination and Distance
• no interference--easy approximation
  – Haldane map function
  – no interference with recombination
• all computations consistent in approximation
  – rely on given map
    • marker loci assumed known
  – 1-to-1 relation of distance to recombination
  – all map functions are approximate
• assume marker positions along map are known
$$ r = \tfrac{1}{2}\left(1 - e^{-2\lambda}\right) $$
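The Haldane map function on this slide can be sketched in a few lines. This is a minimal illustration, assuming distance is measured in Morgans (100 cM = 1 Morgan); the function names are hypothetical and not part of the workshop software.

```python
import math

def haldane_recombination(distance_morgans: float) -> float:
    """Recombination rate r = (1 - exp(-2 d)) / 2 for map distance d in Morgans."""
    return 0.5 * (1.0 - math.exp(-2.0 * distance_morgans))

def haldane_distance(r: float) -> float:
    """Inverse map: distance in Morgans for a recombination rate 0 <= r < 0.5."""
    return -0.5 * math.log(1.0 - 2.0 * r)

# Example: markers 10 cM apart (0.10 Morgans)
r_10cm = haldane_recombination(0.10)
```

For short distances r is close to the distance itself, and r approaches 1/2 as the distance grows, which is the 1-to-1 relation the slide relies on.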
markers, QTL & recombination rates
[Figure: markers M1 to M6 placed by distance along chromosome, with recombination rates r1 to r5 between adjacent markers; the QTL genotype x* and its position are unknown (?)]
Interval Mapping of QT genotype
• can express probabilities in terms of distance
  – locus is distance along linkage map
  – flanking markers sufficient if no missing data
  – could consider more complicated relationship
prob( geno | locus, map ) = prob( geno | locus, flanking markers )
$$ \pi(x_j^* \mid \lambda) = \pi(x_j^* \mid \lambda, M_{j,k}, M_{j,k+1}) $$
Basic Idea of Likelihood Use
• build likelihood in steps
  – build from trait & genotypes at locus
  – likelihood for individual j
  – log likelihood over individuals
• maximum likelihood (interval mapping)
  – EM method (Lander & Botstein 1989)
  – MCMC method (Guo & Thompson 1994)
• posterior distribution (Bayesian)
  – analytical methods (e.g. Carlin & Louis 1996)
  – MCMC method (Satagopan et al 1996)
Bayes Theorem
• posteriors and priors
  – prior: prob( parameters )
  – posterior: prob( parameters | data )
• posterior = likelihood * prior / constant
• posterior distribution is proportional to
  – likelihood of parameters given data
  – prior distribution of parameters
$$
\pi(\text{param} \mid \text{data})
= \frac{\pi(\text{param}, \text{data})}{\pi(\text{data})}
= \frac{\pi(\text{data} \mid \text{param})\, \pi(\text{param})}{\pi(\text{data})}
\propto \pi(\text{data} \mid \text{param})\, \pi(\text{param})
$$
Bayesian Prior
• “prior” belief used to infer “posterior” estimates
  – higher weight for more probable parameter values
    • based on prior knowledge
  – use previous study to inform current study
    • weather prediction: tomorrow is much like today
    • previous QTL studies on related organisms
  – historical criticism: can get “religious” about priors
• often want negligible effect of prior on posterior
  – pick non-informative priors
    • all parameter values equally likely
    • large variance on priors
  – always check sensitivity to prior
How to Study Posterior?
• exact methods
  – exact if possible
  – can be difficult or impossible to analyze
• approximate methods
  – importance sampling
  – numerical integration
  – Monte Carlo & other
• Monte Carlo methods
  – easy to implement
  – independent samples
• MCMC methods
  – handle hard problems
  – art to efficient use
  – correlated samples
QTL Bayesian Inference
• study posterior distribution of locus & effects
  – sample joint distribution
    • locus, effects & genotypes
  – study marginal distribution of
    • locus
    • effects
      – overall mean, genotype difference, variance
    • locus & effects together
• estimates & confidence regions
  – histograms, boxplots & scatter plots
  – HPD regions
Posterior for locus & effect
[Figure: posterior samples for QTL 1; additive effect (1.8 to 2.2) against distance (36 to 42 cM), with marginal summaries for distance (cM) and for the additive effect]
QTL Marginal Posterior
• posterior = likelihood * prior / constant
• posterior distribution is proportional to
  – likelihood of traits & genos given effects & locus
  – prior distribution of effects & locus
$$ \pi(b^* \mid \mathbf{y}) \ \text{is proportional to}\ \pi(b^*) \prod_{j=1}^{n} \pi(y_j \mid x_j^*; \mu, b^*, \sigma^2) $$
after summing or integrating over the remaining unknowns (genotypes, mean, variance, locus)
Marginal Posterior Summary
• marginal posterior for locus & effects
• highest probability density (HPD) region
  – smallest region containing a given posterior probability
  – credible region for locus & effects
• HPD with 50,80,90,95%
  – range of credible levels can be useful
  – marginal bars and bounding boxes
  – joint regions (harder to draw)
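For a single parameter, an HPD interval can be approximated directly from MCMC samples as the shortest interval that holds the requested fraction of draws. The sketch below assumes a unimodal marginal posterior (so the HPD region is one interval); the function name is hypothetical.

```python
import math

def hpd_interval(samples, level=0.95):
    """Shortest interval containing a fraction `level` of the samples.

    Assumes a unimodal marginal posterior, so the HPD region is one interval.
    """
    xs = sorted(samples)
    n = len(xs)
    k = max(1, int(math.ceil(level * n)))  # number of draws inside the interval
    # slide a window of k consecutive order statistics; keep the narrowest
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]

# Example with one outlying draw: the 75% interval ignores it
low, high = hpd_interval([1.0, 2.0, 3.0, 50.0], level=0.75)
```

Calling it with levels 0.50, 0.80, 0.90 and 0.95 gives the nested marginal bars mentioned on the slide; joint regions for locus and effect need a two-dimensional density estimate instead.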
HPD Region for locus & effect
[Figure: HPD region for additive effect (1.8 to 2.2) against distance (36 to 42 cM)]
QTL Bayesian Posterior
• posterior = likelihood * prior / constant
• posterior( parameters | data )
  prob( genos, effects, loci | trait, map )
  is proportional to
$$
\pi(\mathbf{x}^*; \mu, b^*, \sigma^2; \lambda \mid \mathbf{y})
\propto \pi(\mu)\,\pi(b^*)\,\pi(\sigma^2)\,\pi(\lambda) \prod_{j=1}^{n} \pi(x_j^* \mid \lambda) \prod_{j=1}^{n} \pi(y_j \mid x_j^*; \mu, b^*, \sigma^2)
$$
Frequentist or Bayesian?
• Frequentist approach
  – fixed parameters
    • range of values
  – maximize likelihood
    • ML estimates
    • find the peak
  – confidence regions
    • random region
    • invert a test
  – hypothesis testing
    • 2 nested models
• Bayesian approach
  – random parameters
    • distribution
  – posterior distribution
    • posterior mean
    • sample from dist
  – credible sets
    • fixed region given data
    • HPD regions
  – model selection/critique
    • Bayes factors
Frequentist or Bayesian?
• Frequentist approach
  – maximize over mixture of QT genotypes
  – locus profile likelihood
    • max over effects
  – HPD region for locus
    • natural for locus
      – 1-2 LOD drop
    • work to get effects
      – approximate shape of likelihood peak
• Bayesian approach
  – joint distribution over QT genotypes
  – sample distribution
    • joint effects & loci
  – HPD regions for
    • joint locus & effects
    • use density estimator
Interval Mapping for Quantitative Trait Loci
• profile likelihood (LOD) across QTL
  – scan whole genome locus by locus
    • use flanking markers for interval mapping
  – maximize likelihood ratio (LOD) at locus
    • best estimates of effects for each locus
    • EM method (Lander & Botstein 1989)
$$
LOD(\lambda) = \log_{10}(e) \sum_{j=1}^{n} \ln \frac{\sum_{x=-1,0,1} \pi(y_j \mid x; \hat\mu, \hat b^*, \hat\sigma^2)\, \pi(x \mid \lambda)}{\pi(y_j \mid \hat{\hat\mu}, 0, \hat{\hat\sigma}^2)}
$$
where single hats are estimates at locus λ and double hats are estimates under no QTL (b* = 0)
Profile LOD for 1 QTL
[Figure: profile LOD (0 to 25) against distance (0 to 90 cM); legend: QTL, IM]
Building trait Likelihood
• likelihood is mixture across possible genotypes
• sum over all possible genotypes at locus
like( effects, locus | trait ) = sum of prob( trait, genos | effects, locus )
$$
L(\mu, b^*, \sigma^2, \lambda \mid y_j) = \pi(y_j \mid \mu, b^*, \sigma^2; \lambda)
= \sum_{x=-1,0,1} \pi(y_j \mid x; \mu, b^*, \sigma^2)\, \pi(x \mid \lambda)
$$
Likelihood over Individuals
• product of trait probabilities across individuals
  – product of sum across possible genotypes
like( effects, locus | traits, map ) = product of prob( trait | effects, locus, map )
$$
L(\mu, b^*, \sigma^2; \lambda \mid \mathbf{y}) = \prod_{j=1}^{n} \pi(y_j \mid \mu, b^*, \sigma^2; \lambda)
= \prod_{j=1}^{n} \sum_{x=-1,0,1} \pi(y_j \mid x; \mu, b^*, \sigma^2)\, \pi(x \mid \lambda)
$$
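The product-of-sums likelihood above can be evaluated on the log scale as follows. This is a sketch under stated assumptions: normal trait densities, and genotype probabilities at the locus supplied by the caller (in practice they would come from the flanking markers and the map). All function and argument names are hypothetical.

```python
import math

def normal_pdf(y, mean, var):
    """Normal density with the given mean and variance."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def log_likelihood(y, geno_probs, mu, b, sigma2, genotypes=(-1, 0, 1)):
    """Log likelihood of effects and locus given traits.

    geno_probs[j][k] is prob(x = genotypes[k] | locus) for individual j,
    assumed to have been computed elsewhere from the marker data.
    """
    total = 0.0
    for y_j, probs_j in zip(y, geno_probs):
        # mixture over possible genotypes for individual j
        mix = sum(p * normal_pdf(y_j, mu + b * x, sigma2)
                  for x, p in zip(genotypes, probs_j))
        total += math.log(mix)
    return total

def lod(loglik_qtl, loglik_null):
    """LOD = log10(e) times the natural-log likelihood difference."""
    return math.log10(math.e) * (loglik_qtl - loglik_null)
```

Maximizing `log_likelihood` over the effects at each locus and passing the result to `lod` together with the no-QTL log likelihood gives one point of a profile LOD curve.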
Interval Mapping Tests
• profile LOD across possible loci in genome
  – maximum likelihood estimates of effects at locus
  – LOD is rescaling of L(effects, locus|y)
• test for evidence of QTL at each locus
  – LOD score (LR test)
  – adjust (?) for multiple comparisons
Interval Mapping Estimates
• confidence region for locus
  – based on inverting test of no QTL
  – 2 LODs down gives approximate CI for locus
  – based on chi-square approximation to LR
• confidence region for effects
  – approximate CI for effect based on normal
  – point estimate from profile LOD
$$ CI_{locus} = \{ \lambda \mid LOD(\hat\lambda) - LOD(\lambda) \le 2 \} $$
$$ CI_{effect} = \hat b^* \pm 1.96\, se(\hat b^*) $$
IM Confidence Region
[Figure: confidence region for additive effect (1.6 to 2.4) against distance (36 to 42 cM)]
How to Proceed?
• figure out how to construct posterior
  – Markov chain Monte Carlo
• run Markov chain
• summarize results
• diagnostics
MCMC Run for 1 locus Data
[Figure: MCMC trace of the sampled locus, distance (36 to 42 cM) against MCMC run/100 (0 to 1000), with a marginal frequency histogram (0 to 60)]
Part II: Multiple QTL
• Sampling from the Posterior
  – Markov chain Monte Carlo
    • Gibbs sampler for effects
    • Metropolis-Hastings for loci
• Issues for 2 QTL
• Bayes factors & Model Selection
• Simulated data for 0,1,2 QTL
• Brassica data on days to flowering
Posterior for effects
• start with joint distribution of effects
  – does not depend on locus given genotypes
  – standard linear model problem
$$
\pi(\mu, b^*, \sigma^2 \mid \mathbf{y}, \mathbf{x}^*) \ \text{is proportional to}\ \pi(\mu)\,\pi(b^*)\,\pi(\sigma^2) \prod_{j=1}^{n} \pi(y_j \mid x_j^*; \mu, b^*, \sigma^2)
$$
How to Study Posterior?
• exact methods
  – exact if possible
  – can be difficult or impossible to analyze
• approximate methods
  – importance sampling
  – numerical integration
  – Monte Carlo & other
• Monte Carlo methods
  – easy to implement
  – independent samples
• MCMC methods
  – handle hard problems
  – art to efficient use
  – correlated samples
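Ordinary Monte Carlo
• independent samples of joint distribution
• chaining (or peeling) of effects
• requires numerical integration
  – possible analytically here
  – very messy in general
$$
\pi(\mu, b^*, \sigma^2 \mid \mathbf{y}, \mathbf{x}^*)
= \pi(b^* \mid \mathbf{y}, \mathbf{x}^*)\; \pi(\mu \mid \mathbf{y}, \mathbf{x}^*; b^*)\; \pi(\sigma^2 \mid \mathbf{y}, \mathbf{x}^*; \mu, b^*)
$$
$$
\pi(b^* \mid \mathbf{y}, \mathbf{x}^*) = \iint \pi(\mu, b^*, \sigma^2 \mid \mathbf{y}, \mathbf{x}^*)\, d\mu\, d\sigma^2
$$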
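Gibbs Sampler for effects
• set up Markov chain around posterior for effects
• sample from posterior by sampling from full conditionals
  – conditional posterior of each parameter given the other
  – update parameter by sampling full conditional
update mean:
$$ \pi(\mu \mid \mathbf{y}, \mathbf{x}^*; b^*, \sigma^2) \propto \pi(\mu)\, \pi(\mathbf{y} \mid \mathbf{x}^*; \mu, b^*, \sigma^2) $$
update additive:
$$ \pi(b^* \mid \mathbf{y}, \mathbf{x}^*; \mu, \sigma^2) \propto \pi(b^*)\, \pi(\mathbf{y} \mid \mathbf{x}^*; \mu, b^*, \sigma^2) $$
update variance:
$$ \pi(\sigma^2 \mid \mathbf{y}, \mathbf{x}^*; \mu, b^*) \propto \pi(\sigma^2)\, \pi(\mathbf{y} \mid \mathbf{x}^*; \mu, b^*, \sigma^2) $$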
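The three updates above can be sketched as a small Gibbs sampler for known genotypes. This is an illustration only: it assumes flat priors on the mean and additive effect and a prior proportional to 1/σ² for the variance, which are convenient vague choices and not necessarily those used in the workshop software. The function name and the simulated data are hypothetical.

```python
import random

def gibbs_effects(y, x, n_iter=2000, seed=1):
    """Gibbs sampler for (mu, b, sigma2) given traits y and known genotypes x.

    Assumes flat priors on mu and b and a prior proportional to 1/sigma2
    (illustrative vague priors, an assumption of this sketch).
    """
    rng = random.Random(seed)
    n = len(y)
    sxx = sum(xj * xj for xj in x)
    mu, b, sigma2 = sum(y) / n, 0.0, 1.0
    draws = []
    for _ in range(n_iter):
        # update mean: normal around the average residual y_j - b x_j
        m = sum(yj - b * xj for yj, xj in zip(y, x)) / n
        mu = rng.gauss(m, (sigma2 / n) ** 0.5)
        # update additive effect: normal around the least-squares slope
        bh = sum(xj * (yj - mu) for yj, xj in zip(y, x)) / sxx
        b = rng.gauss(bh, (sigma2 / sxx) ** 0.5)
        # update variance: residual sum of squares over a chi-square(n) draw
        ss = sum((yj - mu - b * xj) ** 2 for yj, xj in zip(y, x))
        sigma2 = ss / rng.gammavariate(n / 2.0, 2.0)
        draws.append((mu, b, sigma2))
    return draws

# Simulated example: mean 10, additive effect 2, error variance 1
sim = random.Random(0)
x = [sim.choice((-1, 1)) for _ in range(200)]
y = [10.0 + 2.0 * xj + sim.gauss(0.0, 1.0) for xj in x]
draws = gibbs_effects(y, x)[500:]  # drop a burn-in period
post_mean_mu = sum(d[0] for d in draws) / len(draws)
post_mean_b = sum(d[1] for d in draws) / len(draws)
```

With vague priors like these the posterior means should sit close to the usual least-squares estimates, which is the behaviour the review slides describe.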
Markov chain updates
[Diagram: update cycle linking traits and genos to the mean, additive effect and variance]
Markov chain idea
• future events given the present state are independent of past events
• update chain based on current value
• can make chain arbitrarily complicated
• chain converges to stable pattern
[Diagram: two-state chain on states 0 and 1; move 0 to 1 with probability p (stay with 1-p), move 1 to 0 with probability q (stay with 1-q)]
$$ \pi(1) = p/(p+q) $$
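The convergence of the two-state chain to its stable pattern can be checked numerically by iterating the probability of being in state 1. This is a minimal sketch; the function name is hypothetical.

```python
def stationary_prob_state1(p, q, n_steps=10_000):
    """Iterate the distribution of a two-state chain.

    The chain moves 0 -> 1 with probability p and 1 -> 0 with probability q.
    Returns the probability of state 1 after n_steps, starting from state 0.
    """
    prob1 = 0.0  # start in state 0 with certainty
    for _ in range(n_steps):
        # arrive from state 0 with prob p, or stay in state 1 with prob 1 - q
        prob1 = (1.0 - prob1) * p + prob1 * (1.0 - q)
    return prob1

# Example: p = 0.3, q = 0.1 should settle at p / (p + q) = 0.75
limit = stationary_prob_state1(0.3, 0.1)
```

The starting state stops mattering after enough steps, which is why early draws from an MCMC run are treated as burn-in.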
Markov chain Monte Carlo
• can study arbitrarily complex models
  – need only specify how parameters affect each other
  – can reduce to specifying full conditionals
• construct Markov chain with “right” model
  – update some parameters given data and others
  – can fudge on “right” (importance sampling)
  – next step depends only on current estimates
• nice Markov chains have nice properties
  – sample summaries make sense
  – consider almost as random sample from distribution
MCMC run of mean & effect
[Figure: MCMC traces of the mean (9.6 to 10.4) and of the additive effect (1.8 to 2.2) against MCMC run/100 (0 to 1000), each with a marginal frequency histogram (0 to 60)]
Care & Use of MCMC
• sample chain for long run (100,000-1,000,000)
  – longer for more complicated likelihoods
  – use diagnostic plots to assess “mixing”
• standard error of estimates
  – use histogram of posterior
  – compute variance of posterior--just another summary
• studying the Markov chain
  – Monte Carlo error of series (Geyer 1992)
    • time series estimate based on lagged auto-covariances
  – convergence diagnostics for “proper mixing”
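A time series estimate of Monte Carlo error can be sketched from the lagged auto-covariances of the chain. The truncation rule used here (stop at the first non-positive auto-covariance) is a simplification assumed for this sketch; Geyer (1992) discusses more careful window choices. The function name is hypothetical.

```python
def mc_standard_error(chain):
    """Monte Carlo standard error of the chain mean from lagged auto-covariances.

    Sums auto-covariances until the first non-positive one (a simple
    truncation rule assumed for this sketch).
    """
    n = len(chain)
    mean = sum(chain) / n
    dev = [c - mean for c in chain]

    def autocov(k):
        return sum(dev[i] * dev[i + k] for i in range(n - k)) / n

    var_sum = autocov(0)
    for k in range(1, n):
        g = autocov(k)
        if g <= 0.0:
            break
        var_sum += 2.0 * g
    return (var_sum / n) ** 0.5
```

For a dependent chain this is larger than the naive standard error (posterior standard deviation over the square root of the run length), which treats the draws as independent.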
MCMC Idea for QTLs
• construct Markov chain around posterior
  – want posterior as stable distribution of Markov chain
  – in practice, the chain tends toward stable distribution
    • initial values may have low posterior probability
    • burn-in period to get chain mixing well
• update one (or several) components at a time
  – update effects given genotypes & traits
  – update locus given genotypes & traits
  – update genotypes given locus & effects
$$
(\mathbf{x}^*; \mu, b^*, \sigma^2; \lambda)_1, (\mathbf{x}^*; \mu, b^*, \sigma^2; \lambda)_2, \ldots, (\mathbf{x}^*; \mu, b^*, \sigma^2; \lambda)_N \sim \pi(\mathbf{x}^*; \mu, b^*, \sigma^2; \lambda \mid \mathbf{y})
$$
Markov chain updates
[Diagram: update cycle linking traits and genos to the effects and locus]
Full Conditional for genos
• full conditional for genotype depends on
  – effects via trait model
  – locus via recombination model
• can explicitly decompose by individual j
  – binomial (or trinomial) probability
$$
\pi(x_j^* \mid y_j; \mu, b^*, \sigma^2; \lambda)
= \frac{\pi(y_j \mid x_j^*; \mu, b^*, \sigma^2)\, \pi(x_j^* \mid \lambda)}{\sum_{x=-1,0,1} \pi(y_j \mid x; \mu, b^*, \sigma^2)\, \pi(x \mid \lambda)},
\qquad x_j^* = -1, 0, \text{ or } 1
$$
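The genotype update for one individual follows directly from the ratio above. This sketch assumes a normal trait model and takes the genotype probabilities given the locus as an input (they would come from the recombination model); names are hypothetical.

```python
import math
import random

def genotype_full_conditional(y_j, prior_probs, mu, b, sigma2, genotypes=(-1, 0, 1)):
    """prob(x_j = g | y_j, effects, locus) for each genotype g, by Bayes rule.

    prior_probs[k] is prob(x_j = genotypes[k] | locus), assumed given.
    The normal density's constant factor cancels in the ratio.
    """
    weights = [p * math.exp(-0.5 * (y_j - mu - b * g) ** 2 / sigma2)
               for g, p in zip(genotypes, prior_probs)]
    total = sum(weights)
    return [w / total for w in weights]

def sample_genotype(rng, probs, genotypes=(-1, 0, 1)):
    """Draw one genotype from its full conditional (trinomial draw)."""
    return rng.choices(genotypes, weights=probs, k=1)[0]

# Example: a trait value of 12 with mean 10 and additive effect 2 points to x = 1
probs = genotype_full_conditional(12.0, [0.25, 0.5, 0.25], 10.0, 2.0, 1.0)
x_new = sample_genotype(random.Random(0), probs)
```

For doubled haploid lines only two genotypes occur, so the same function can be called with `genotypes=(-1, 1)` and two prior probabilities.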
Prior for locus
• no prior information on locus
  – uniform prior over genome
  – use framework map
    • choose interval proportional to length
    • then pick uniform position within interval
• prior information from other studies
  – concentrate on credible regions
  – use posterior of previous study as new prior
[Figure: prior density (0.0 to 0.2) against distance (0 to 80 cM)]
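The two-step draw from the uniform prior (interval chosen in proportion to its length, then a uniform position inside it) can be sketched as below. Marker positions are assumed to be sorted and given in cM; the function name and example map are hypothetical.

```python
import random

def sample_locus_uniform(marker_positions, rng):
    """Draw a locus uniformly over a framework map.

    marker_positions: sorted marker positions in cM (assumed known).
    """
    lengths = [b - a for a, b in zip(marker_positions, marker_positions[1:])]
    # choose an interval with probability proportional to its length
    k = rng.choices(range(len(lengths)), weights=lengths, k=1)[0]
    # then pick a uniform position within that interval
    return marker_positions[k] + rng.random() * lengths[k]

# Example: four markers spanning 0 to 80 cM
rng = random.Random(0)
prior_draws = [sample_locus_uniform([0.0, 10.0, 30.0, 80.0], rng) for _ in range(1000)]
```

The result is the same as a single uniform draw over the whole map, but the interval index is often needed anyway to look up the flanking markers.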
Full Conditional for locus
• cannot easily sample from locus full conditional
• cannot explicitly determine full conditional
  – difficult to normalize
  – need to consider all possible genotypes over entire map
• Gibbs sampler will not work
  – but can get something proportional ...
$$
\pi(\lambda \mid \mathbf{y}, \mathbf{x}^*; \mu, b^*, \sigma^2) = \pi(\lambda \mid \mathbf{x}^*) \propto \pi(\lambda) \prod_{j=1}^{n} \pi(x_j^* \mid \lambda)
$$
Metropolis-Hastings Step
• pick new locus based upon current locus
  – propose new locus from distribution q( )
    • pick value near current one?
    • pick uniformly across genome?
  – accept new locus with probability a()
• Gibbs sampler is special case of M-H
  – always accept new proposal
• acceptance ensures right stable distribution
$$
a(\lambda_{old}, \lambda_{new}) = \min\left(1,\ \frac{\pi(\lambda_{new} \mid \mathbf{x}^*)\, q(\lambda_{new}, \lambda_{old})}{\pi(\lambda_{old} \mid \mathbf{x}^*)\, q(\lambda_{old}, \lambda_{new})}\right)
$$
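The acceptance rule can be sketched as a generic Metropolis-Hastings loop working on the log scale. The target below is a made-up stand-in for the locus full conditional (a bump near 40 cM on a map from 0 to 80 cM), used only to show the mechanics; the function names and the random-walk proposal are assumptions of this sketch.

```python
import math
import random

def metropolis_hastings(log_target, propose, log_q, start, n_iter, rng):
    """Generic M-H sampler.

    log_target(v): log of something proportional to the target density.
    propose(v, rng): draws a candidate given the current value v.
    log_q(a, b): log proposal density of moving from a to b.
    """
    current, chain = start, []
    for _ in range(n_iter):
        cand = propose(current, rng)
        log_ratio = (log_target(cand) + log_q(cand, current)
                     - log_target(current) - log_q(current, cand))
        # accept with probability min(1, ratio)
        if math.log(1.0 - rng.random()) <= min(0.0, log_ratio):
            current = cand
        chain.append(current)
    return chain

def log_target(v):
    """Hypothetical unnormalised locus posterior on a map of 0 to 80 cM."""
    return -0.5 * ((v - 40.0) / 2.0) ** 2 if 0.0 <= v <= 80.0 else -math.inf

# Random-walk proposal near the current locus; symmetric, so log_q cancels
chain = metropolis_hastings(log_target,
                            lambda v, rng: v + rng.uniform(-5.0, 5.0),
                            lambda a, b: 0.0,
                            40.0, 5000, random.Random(0))
locus_mean = sum(chain[500:]) / len(chain[500:])
```

Only ratios of the target enter the acceptance step, so the normalising constant that makes the locus full conditional hard to sample directly is never needed.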
MCMC run: 2 loci assuming only 1
[Figure: MCMC trace of the sampled locus, distance (40 to 70 cM) against MCMC run/1000 (0 to 1000), with a marginal frequency histogram (0 to 250)]
Multiple QTL model
• trait = mean + add1 + add2 + error
• trait = effect_of_genos + error
• prob( trait | genos, effects )
$$ y_j = \mu + b_1^* x_{1j}^* + b_2^* x_{2j}^* + e_j $$
$$ y_j = \mu + \sum_{r=1}^{m} b_r^* x_{rj}^* + e_j $$
Simulated Data with 2 QTL
[Figure: histograms of the simulated trait (4 to 16) for the genotype classes -1,-1; -1,1; 1,-1; 1,1 (recombinants); frequency axis from 0 to 6]
Issues for Multiple QTL
• how many QTL influence a trait?
  – 1, several (oligogenic) or many (polygenic)?
  – how many are supported by the data?
• searching for 2 or more QTL
  – conditional search (IM, CIM)
  – simultaneous search (MIM)
• epistasis (inter-loci interaction)
  – many more parameters to estimate
  – effects of ignored QTL
LOD for 2 QTL
[Figure: LOD (0 to 30) against distance (0 to 90 cM); legend: QTL, IM, CIM]
Interval Mapping Approach
• interval mapping (IM)
  – scan genome for 1 QTL
• composite interval mapping (CIM)
  – scan for 1 QTL while adjusting for others
  – use markers as surrogates for other QTL
• multiple interval mapping (MIM)
  – search for multiple QTL
MCMC run with 2 loci
[Figure: MCMC traces of the two sampled loci (plotting symbols 1 and 2), distance (40 to 80 cM) against MCMC run/100 (0 to 1000), with a marginal frequency histogram (0 to 300)]
Bayesian Approach
• simultaneous search for multiple QTL
• use Bayesian paradigm
  – easy to consider joint distributions
  – easy to modify later for other types of data
    • counts, proportions, etc.
  – employ MCMC to estimate posterior dist
• study estimates of loci & effects
• use Bayes factors for model selection
  – number of QTL
Effects for 2 Simulated QTL
[Figure: effect (1.8 to 2.6) against distance (40 to 80 cM) for effect 1 and effect 2, with marginal density of distance (0.0 to 0.15) and of effect]
Brassica napus Data
• 4-week & 8-week vernalization effect
  – log(days to flower)
• genetic cross of
  – Stellar (annual canola)
  – Major (biennial rapeseed)
• 105 F1-derived doubled haploid (DH) lines
  – homozygous at every locus (QQ or qq)
• 10 molecular markers (RFLPs) on LG9
  – two QTLs inferred on LG9 (now chromosome N2)
  – corroborated by Butruille (1998)
  – exploiting synteny with Arabidopsis thaliana
Brassica 4- & 8-week Data
[Figure: scatter plot of 8-week against 4-week trait values (2.5 to 4.0), with marginal histograms for 8-week vernalization and 4-week vernalization]
Brassica Data LOD Maps
[Figure: LOD against distance (0 to 90 cM) for 8-week (LOD 0 to 8) and 4-week (LOD 0 to 15) vernalization; legend: QTL, IM, CIM]
4-week vs 8-week vernalization
4-week vernalization
• longer time to flower
• larger LOD at 40cM
• modest LOD at 80cM
• loci well determined
  cM   add
  40   .30
  80   .16
8-week vernalization
• shorter time to flower
• larger LOD at 80cM
• modest LOD at 40cM
• loci poorly determined
  cM   add
  40   .06
  80   .13
Brassica Credible Regions
[Figure: credible regions for additive effect against distance (20 to 80 cM); 4-week panel (additive -0.6 to 0.2) and 8-week panel (additive -0.3 to 0.2)]
Collinearity of QTLs
• multiple QT genotypes are correlated
  – QTL linked on same chromosome
  – difficult to distinguish if close
• estimates of QT effects are correlated
  – poor identifiability of effects parameters
  – correlations give clue of how much to trust
• which QTL to go after in breeding?
  – largest effect?
  – may be biased by nearby QTL
Brassica effect Correlations
[Figure: scatter plots of additive 2 against additive 1; 4-week panel, cor = -0.81; 8-week panel, cor = -0.7]
Bayes Factors
Which model (1 or 2 or 3 QTLs?) is better supported by the data?
  – ratio of posterior odds to prior odds
  – ratio of model likelihoods
  – related to likelihood ratio test of simple hypotheses
$$
B_{12} = \frac{\pi(\text{model}_1 \mid \mathbf{y}) / \pi(\text{model}_2 \mid \mathbf{y})}{\pi(\text{model}_1) / \pi(\text{model}_2)}
$$
$$
B_{12} = \frac{\pi(\mathbf{y} \mid \text{model}_1)}{\pi(\mathbf{y} \mid \text{model}_2)}
$$
Bayes Factors & LR
• equivalent to LR statistic when
  – comparing two nested models
  – simple hypotheses (e.g. 1 vs 2 QTL)
• Bayesian Information Criterion (BIC) in general
  – Schwarz introduced for model selection
  – penalty for different number of parameters p
$$ -2\log(B_{12}) \approx -2\log(LR) - (p_2 - p_1)\log(n) $$
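The BIC-style approximation can be written as a one-line calculation from maximised log likelihoods. This sketch takes LR as the ratio of the model 1 likelihood to the model 2 likelihood, so that it matches the orientation of B12; the function name and the example numbers are hypothetical.

```python
import math

def minus2_log_bayes_factor(loglik1, loglik2, p1, p2, n):
    """BIC-style approximation to -2 log(B12) for model 1 against model 2.

    loglik1, loglik2: maximised log likelihoods of the two models.
    p1, p2: numbers of parameters; n: number of individuals.
    """
    minus2_log_lr = -2.0 * (loglik1 - loglik2)
    return minus2_log_lr - (p2 - p1) * math.log(n)

# Example: model 2 has one extra parameter and fits 2 log-likelihood units better
value = minus2_log_bayes_factor(-100.0, -98.0, 3, 4, 105)
```

A negative value corresponds to B12 above 1, so in this made-up example the penalty for the extra parameter outweighs the gain in fit and the smaller model is favoured.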
Model Determination using Bayes Factors
• pick most plausible model
  – histogram for range of models
  – posterior distribution of models
  – use Bayes theorem
  – often assume flat prior across models
• posterior distribution of number of QTLs
Brassica Bayes Factors
i vs. j    Bayes Factor    -2log(LR)
2 vs. 1    2.49            7.82
3 vs. 1    .005            7.41
3 vs. 2    .002            4.17
• compare models for 1, 2, 3 QTL
• Bayes factor and -2log(LR)
• large value favors first model
• 8-week vernalization only here
Computing Bayes Factors
• using samples from prior
  – mean across Monte Carlo or MCMC runs
  – can be inefficient if prior differs from posterior
• using samples from posterior
  – weighted harmonic mean more efficient but unstable
  – care in choosing weight h() close to posterior
  – increase stability: absorb variance (normal to t dist)
$$
\pi(\mathbf{y} \mid \text{model}_k) = \int \pi(\mathbf{y} \mid \theta_k; \text{model}_k)\, \pi(\theta_k \mid \text{model}_k)\, d\theta_k
$$
$$
\hat\pi(\mathbf{y} \mid \text{model}_k) = \left[ \frac{1}{G} \sum_{g=1}^{G} \frac{h(\theta_k^{(g)})}{\pi(\mathbf{y} \mid \theta_k^{(g)}; \text{model}_k)\, \pi(\theta_k^{(g)} \mid \text{model}_k)} \right]^{-1}
$$
MCMC Software
• General MCMC software
  – U Bristol links
    • http://www.stats.bris.ac.uk/MCMC/pages/links.html
  – BUGS (Bayesian inference Using Gibbs Sampling)
    • http://www.mrc-bsu.cam.ac.uk/bugs/
• Our MCMC software for QTLs
  – C code using LAPACK
    • ftp://ftp.stat.wisc.edu/pub/yandell/revjump.tar.gz
  – coming soon:
    • perl preprocessing (to/from QtlCart format)
    • Splus post processing
    • Bayes factor computation
Bayes & MCMC References
• CJ Geyer (1992) “Practical Markov chain Monte Carlo”, Statistical Science 7: 473-511.
• L Tierney (1994) “Markov Chains for exploring posterior distributions”, The Annals of Statistics 22: 1701-1728 (with discussion: 1728-1762).
• A Gelman, J Carlin, H Stern & D Rubin (1995) Bayesian Data Analysis, CRC/Chapman & Hall.
• BP Carlin & TA Louis (1996) Bayes and Empirical Bayes Methods for Data Analysis, CRC/Chapman & Hall.
• WR Gilks, S Richardson & DJ Spiegelhalter (eds, 1996) Markov Chain Monte Carlo in Practice, CRC/Chapman & Hall.
QTL References
• D Thomas & V Cortessis (1992) “A Gibbs sampling approach to linkage analysis”, Hum. Hered. 42: 63-76.
• I Hoeschele & P VanRaden (1993) “Bayesian analysis of linkage between genetic markers and quantitative trait loci. I. Prior knowledge”, Theor. Appl. Genet. 85: 953-960.
• I Hoeschele & P VanRaden (1993) “Bayesian analysis of linkage between genetic markers and quantitative trait loci. II. Combining prior knowledge with experimental evidence”, Theor. Appl. Genet. 85: 946-952.
• SW Guo & EA Thompson (1994) “Monte Carlo estimation of mixed models for large complex pedigrees”, Biometrics 50: 417-432.
• JM Satagopan, BS Yandell, MA Newton & TC Osborn (1996) “A Bayesian approach to detect quantitative trait loci using Markov chain Monte Carlo”, Genetics 144: 805-816.
Brassica Data: 1 QTL
[Figure: panels with axes distance (cM), 20 to 80, and effect 1]
Brassica 8-week Data: 2 QTL
[Figure: panels for QTL 1 and QTL 2 with axes distance (cM), 20 to 80, and additive effect]
Brassica 8-week Data: 2 QTL
[Figure: panels for QTL 1 and QTL 2 with axes distance (cM), 20 to 80, and additive effect]
Geometry of Reversible Jump
\[ m: 1 \to 2: \quad b_1 = b + z, \quad b_2 = b - z \]
\[ m: 2 \to 1: \quad b = (b_1 + b_2)/2 \]
[Figure: axes b1 and b2, with b, b+z, and b-z marked]
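The two moves can be sketched as a pair of inverse maps. This is an illustration only: the function names are hypothetical, and the innovation recovered in the merge step, z = (b1 - b2)/2, is implied by the split equations rather than stated on the slide.

```python
def split(b, z):
    """Move m: 1 -> 2. Split one effect b into two using innovation z."""
    return b + z, b - z

def merge(b1, b2):
    """Move m: 2 -> 1. Return the merged effect and the implied innovation."""
    return (b1 + b2) / 2.0, (b1 - b2) / 2.0

# Jacobian of (b, z) -> (b1, b2): |det [[1, 1], [1, -1]]| = 2.
# It enters the acceptance ratio of a split move; the merge move uses 1/2.
SPLIT_JACOBIAN = 2.0

b1, b2 = split(0.20, 0.05)
b, z = merge(b1, b2)
print(b1, b2, b, z)
```

Because merge undoes split exactly, the pair is a one-to-one change of variables, which is what reversible jump requires when the dimension changes.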
Simulation of Jumps
[Figure: simulated jumps on axes b1, b2, and b]
Mean & Variance Estimates
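\[ m = 1, 2: \quad \hat{\mu} = \bar{y} = \sum_{j=1}^{n} y_j / n \]
\[ m = 1: \quad \hat{\sigma}^2 = \sum_{j=1}^{n} \left( y_j - \hat{\mu} - \hat{b}\,(x_{1j} - \bar{x}_1) \right)^2 / n \]
\[ m = 2: \quad \hat{\hat{\sigma}}^2 = \sum_{j=1}^{n} \left( y_j - \hat{\mu} - \hat{b}_1 (x_{1j} - \bar{x}_1) - \hat{b}_2 (x_{2j} - \bar{x}_2) \right)^2 / n \]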
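A short sketch of these estimates for any number of loci, assuming genotypes are coded numerically and the effect estimates are already available. The function names and the four-individual data set are hypothetical.

```python
def mean_estimate(y):
    """Sample mean of the trait, used for every model."""
    return sum(y) / len(y)

def variance_estimate(y, effects, genos):
    """Residual variance with divisor n.

    effects[r] is the estimated effect at locus r;
    genos[r][j] is the genotype of individual j at locus r.
    """
    n = len(y)
    mu = mean_estimate(y)
    centers = [sum(x) / n for x in genos]  # mean genotype at each locus
    total = 0.0
    for j in range(n):
        fit = mu + sum(b * (x[j] - c) for b, x, c in zip(effects, genos, centers))
        total += (y[j] - fit) ** 2
    return total / n

# Hypothetical toy data: four individuals, one locus coded -1, 0, 1.
y = [1.0, 2.0, 3.0, 4.0]
x1 = [-1, 0, 0, 1]
var_m0 = variance_estimate(y, [], [])        # no QTL in the model
var_m1 = variance_estimate(y, [1.0], [x1])   # one QTL with effect 1.0
print(var_m0, var_m1)
```

Adding the locus reduces the residual variance in this toy example, which is the quantity a jump between models has to rescale.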
Another Way to Jump
• many ways to parameterize models
• can augment both models
[Table: rows m: 1 → 2 and m: 2 → 1; columns m, parameters, innovations, transformations]
Simulated Data
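• two loci, estimates from data
• Bayes factors for model selection
\[ \hat{\mu} = 2.78, \quad \hat{\sigma}^2 = 0.08 \]
\[ \hat{b}^*_1 = 0.10, \quad \hat{\lambda}_1 = 42.2 \]
\[ \hat{b}^*_2 = 0.26, \quad \hat{\lambda}_2 = 78.2 \]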
Posterior Number of loci (simulated after 8-week)
         prior mean
m      3      4      5      6
0    0.05   0.02   0.00   0.00
1    0.44   0.31   0.26   0.18
2    0.38   0.41   0.44   0.40
3    0.12   0.20   0.23   0.30
4    0.02   0.05   0.06   0.10
5    0.00   0.01   0.01   0.02
distribution across m by prior mean
Bayes Factors (simulated after 8-week)
              prior mean
models      3       4       5       6
1:2       1.74    1.51    1.48    1.35
1:3       5.50    4.13    4.71    3.60
1:4      24.75   16.53   22.57   16.20
2:3       3.17    2.73    3.19    2.67
2:4      14.25   10.93   15.28   12.00
3:4       4.50    4.00    4.79    4.50
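These Bayes factors can be recomputed as posterior odds divided by prior odds. The sketch below assumes a Poisson prior on the number of loci m with the stated prior mean; that assumption is not spelled out on the slide, but it reproduces the tabulated values from the simulated posterior distribution of m. The function names are hypothetical.

```python
import math

# Posterior probability of m loci in the simulation, keyed by prior mean.
posterior = {
    3: {0: 0.05, 1: 0.44, 2: 0.38, 3: 0.12, 4: 0.02, 5: 0.00},
    4: {0: 0.02, 1: 0.31, 2: 0.41, 3: 0.20, 4: 0.05, 5: 0.01},
    5: {0: 0.00, 1: 0.26, 2: 0.44, 3: 0.23, 4: 0.06, 5: 0.01},
    6: {0: 0.00, 1: 0.18, 2: 0.40, 3: 0.30, 4: 0.10, 5: 0.02},
}

def poisson_pmf(m, mean):
    """Poisson probability of m events (assumed prior on number of loci)."""
    return math.exp(-mean) * mean ** m / math.factorial(m)

def bayes_factor(post, i, j, prior_mean):
    """Bayes factor for i loci versus j loci: posterior odds / prior odds."""
    posterior_odds = post[i] / post[j]
    prior_odds = poisson_pmf(i, prior_mean) / poisson_pmf(j, prior_mean)
    return posterior_odds / prior_odds

for mean in (3, 4, 5, 6):
    print(mean, round(bayes_factor(posterior[mean], 1, 2, mean), 2))
```

Dividing by the prior odds is what makes the rows of the table far more stable across prior means than the posterior probabilities themselves.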
Sensitivity of Number to Prior
[Figure: two panels, 4-week vernalization and 8-week vernalization; horizontal axis 0 to 8; legend 2, 3, 4, 6, 10]
Number of loci (Brassica 8-week)
         prior mean
m      3      4      5      6
1    0.43   0.28   0.21   0.12
2    0.43   0.44   0.41   0.34
3    0.12   0.22   0.28   0.34
4    0.02   0.05   0.09   0.16
5    0.00   0.00   0.01   0.04
6    0.00   0.00   0.00   0.005
distribution across m by prior mean
Sensitivity of Prior: Simulations
[Figure: two panels, Prior of 2 and Prior of 4; horizontal axes 0 to 5 and 0 to 6; legend 0, 1, 2]
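Profile LOD & LR
• profile LOD related to LR test at a locus
    – maximum likelihood estimates of effects
    – weighted average over all possible genotypes
\[ LOD(\lambda) = \tfrac{1}{2} \log_{10}(e)\, LR(\lambda), \quad LR(\lambda) = 2 \ln \frac{L(\hat{\mu}, \hat{b}^*, \hat{\sigma}^2; \lambda \mid \mathbf{y})}{L(\hat{\mu}, 0, \hat{\hat{\sigma}}^2 \mid \mathbf{y})} \]
\[ LOD(\lambda) = \log_{10}(e) \ln \prod_{j=1}^{n} \frac{\sum_{x=-1,0,1} \mathrm{prob}(y_j \mid x; \hat{\mu}, \hat{b}^*, \hat{\sigma}^2)\, \mathrm{prob}(x \mid \lambda)}{\mathrm{prob}(y_j \mid \hat{\mu}, 0, \hat{\hat{\sigma}}^2)} \]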
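A minimal sketch of the profile LOD at one locus, assuming normal trait densities with mean mu + b*x and genotype probabilities prob(x | locus) supplied by the caller (in practice they come from flanking markers). All names and the toy data are hypothetical.

```python
import math

def normal_pdf(y, mean, var):
    """Density of N(mean, var) at y."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def lod_from_lr(lr):
    """LOD = (1/2) * log10(e) * LR, where LR = 2 ln(likelihood ratio)."""
    return 0.5 * math.log10(math.e) * lr

def profile_lod(y, geno_probs, mu, b, var, var_null):
    """Profile LOD at a locus.

    geno_probs[j] maps genotype x in (-1, 0, 1) to prob(x | locus)
    for individual j; var_null is the variance estimate with b = 0.
    """
    log_ratio = 0.0
    for yj, probs in zip(y, geno_probs):
        # weighted average of trait densities over possible genotypes
        mixture = sum(p * normal_pdf(yj, mu + b * x, var) for x, p in probs.items())
        log_ratio += math.log(mixture / normal_pdf(yj, mu, var_null))
    return math.log10(math.e) * log_ratio

# Hypothetical toy data: three individuals with uncertain genotypes.
y = [2.6, 2.8, 3.0]
geno_probs = [
    {-1: 0.8, 0: 0.2, 1: 0.0},
    {-1: 0.1, 0: 0.8, 1: 0.1},
    {-1: 0.0, 0: 0.2, 1: 0.8},
]
print(profile_lod(y, geno_probs, mu=2.8, b=0.2, var=0.01, var_null=0.03))
```

With b = 0 and equal variances the mixture collapses to the null density and the LOD is zero, which is a quick sanity check on any implementation.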
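Bayes Factors
Which model (1 or 2 or 3 QTLs?) is best supported by the data?
    – ratio of posterior odds to prior odds
    – ratio of model likelihoods
    – related to likelihood ratio test of simple hypotheses
\[ B_{12} = \frac{\mathrm{prob}(\mathrm{model}_1 \mid \mathbf{y}) / \mathrm{prob}(\mathrm{model}_2 \mid \mathbf{y})}{\mathrm{prob}(\mathrm{model}_1) / \mathrm{prob}(\mathrm{model}_2)} \]
\[ B_{12} = \frac{\mathrm{prob}(\mathbf{y} \mid \mathrm{model}_1)}{\mathrm{prob}(\mathbf{y} \mid \mathrm{model}_2)} \]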
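Computing Bayes Factors
• samples from prior distribution
    – mean across Monte Carlo or MCMC runs
    – can be inefficient if prior differs from posterior
• samples from posterior distribution
    – weighted harmonic mean more efficient but unstable
    – care in choosing weight h() close to posterior
    – increase stability absorbing variance (normal to t dist)
\[ \mathrm{prob}(\mathbf{y} \mid \mathrm{model}_k) = \int \mathrm{prob}(\mathbf{y} \mid \theta_k; \mathrm{model}_k)\, \mathrm{prob}(\theta_k \mid \mathrm{model}_k)\, d\theta_k \]
\[ \widehat{\mathrm{prob}}(\mathbf{y} \mid \mathrm{model}_k) = \left[ \frac{1}{G} \sum_{g=1}^{G} \frac{h(\theta_k^{(g)})}{\mathrm{prob}(\mathbf{y} \mid \theta_k^{(g)}; \mathrm{model}_k)\, \mathrm{prob}(\theta_k^{(g)} \mid \mathrm{model}_k)} \right]^{-1} \]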
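Brassica Bayes Factors
[Diagram: arrows joining "one QTL", "two QTL", and "three QTL", each labeled with a Bayes factor and -2log(LR) in parentheses]
two QTL vs. one QTL: 2.49 (7.82)
three QTL vs. one QTL: 0.005 (7.41)
three QTL vs. two QTL: 0.002 (4.17)
• compare models for 1, 2, 3 QTL
• Bayes factor (and -2log(LR))
• arrows point to favored model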
QTLs and Polygenes
phenotype = design + QTLs + polygenes + error
QTLs: quantitative trait loci
polygenes: many genes of small effect spread throughout genome
distinction is arbitrary, depending on
• sample size
• magnitude of effects
• design/cross/marker polymorphism
analogy to multiple regression
Polygenes and Inbred Lines
• same (raw) genetic correlation across cohort:
    – 1/2 for DH
    – 2/3 for F2
    – 4/7 for F3
    – 1/2 for RI
• modified by specific information:
    – major & minor QTLs
    – marker surrogates for polygenes (CIM)
Composite QTL model
• trait = mean + add + dom + other + error
• trait = effect_of_geno + other + error
• prob( trait | genos, effects, other )
• other ($X_j$): other linked and unlinked QTLs, and polygenes
\[ y_j = \mu + a x_j + d z_j + X_j + e_j \]
\[ y_j = \mu + b_1^* x_{1j}^* + b_2^* x_{2j}^* + e_j \]
\[ y_j = \mu + \sum_{r=1}^{m} b_r^* x_{rj}^* + e_j \]