COMMINGLING ANALYSIS
By: Jeff Berry
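WHAT IT IS
[Figure: distribution of QTp against genotype dosage, labeled with the component means u1, u2, u3, the common standard deviation sigma, and the allele frequency q]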
ASSUMPTIONS
* Families are independent, i.e. parents are independent
* QTp is normally distributed within each genotype
* Variance is the same across genotypes
ASSUMPTIONS
This is only for one person!
ASSUMPTIONS
The goal is to develop an algorithm that maximizes the log-likelihood over a given set of parameters
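A minimal Python sketch of the log-likelihood being maximized (not the author's R code). Assumptions that are not in the slides: the genotype weights follow Hardy-Weinberg proportions with q as the frequency of the allele counted by dosage, individuals are treated as independent (the family structure is ignored), and all function names are hypothetical.

```python
import math

def log_normal_pdf(x, mean, sigma):
    """Log density of a Gaussian with the given mean and standard deviation."""
    z = (x - mean) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def genotype_weights(q):
    """ASSUMPTION: Hardy-Weinberg proportions for dosage 0, 1, 2."""
    p = 1.0 - q
    return (p * p, 2.0 * p * q, q * q)

def person_log_density(x, q, means, sigma):
    """Log of the mixture density for one person, with a common sigma."""
    terms = [math.log(w) + log_normal_pdf(x, m, sigma)
             for w, m in zip(genotype_weights(q), means) if w > 0.0]
    top = max(terms)  # log-sum-exp keeps tiny densities from underflowing
    return top + math.log(sum(math.exp(t - top) for t in terms))

def log_likelihood(phenos, q, means, sigma):
    """Sum of per-person log densities (individuals assumed independent)."""
    return sum(person_log_density(x, q, means, sigma) for x in phenos)
```

When u1 = u2 = u3 the mixture collapses to a single Gaussian, which is the null model.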
THE PROBLEM
Hypothesis
Null is true
* Governing distribution of the phenotype is one Gaussian
Alternative is true
* Governing distribution of the phenotype is a mixture of Gaussians
How different does the fit have to be to be significant?
Null model (one Gaussian) VS alternative model (mixture of Gaussians)
df = difference in the number of freely estimated parameters = 5 - 2 = 3
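A minimal Python sketch of the comparison, assuming it is a likelihood ratio test referred to a chi-square distribution with the 3 degrees of freedom stated above. The function names and the alpha default are hypothetical. The tail probability uses the closed form for 3 df, so only the standard library is needed.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 df (closed form)."""
    if x <= 0.0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

def likelihood_ratio_test(loglik_null, loglik_alt, alpha=0.05):
    """Return (statistic, p-value, reject_null) for the 3-df comparison."""
    stat = 2.0 * (loglik_alt - loglik_null)
    p_value = chi2_sf_df3(stat)
    return stat, p_value, p_value < alpha
```

For example, `likelihood_ratio_test(-120.0, -110.0)` gives a statistic of 20.0 and rejects the null at the 0.05 level.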
METHODS
• Hybrid EM-Grid Search
[Diagram: Alternative Model. Parameters q, u1, u2, u3, sigma; likelihood L(); E and M steps]
METHODS
• Hybrid EM-Grid Search
[Diagram: Null Model. Parameters u, sigma; likelihood L(); E and M steps]
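For orientation, a minimal Python sketch of what the two fits could look like. This is a textbook EM step, not the author's hybrid EM-grid search, whose details are not recoverable from the slides. Assumptions: Hardy-Weinberg genotype weights, independent individuals, a common sigma, and hypothetical function names. The null model has a closed-form MLE (sample mean and the 1/n standard deviation).

```python
import math

def normal_pdf(x, mean, sigma):
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fit_null(phenos):
    """Closed-form MLE for one Gaussian: mean and 1/n standard deviation."""
    n = len(phenos)
    mu = sum(phenos) / n
    return mu, math.sqrt(sum((x - mu) ** 2 for x in phenos) / n)

def mixture_loglik(phenos, q, means, sigma):
    p = 1.0 - q
    weights = (p * p, 2.0 * p * q, q * q)  # ASSUMPTION: HWE proportions
    return sum(math.log(sum(w * normal_pdf(x, m, sigma)
                            for w, m in zip(weights, means)))
               for x in phenos)

def em_step(phenos, q, means, sigma):
    """One EM update of (q, means, sigma) for the three-genotype mixture."""
    p = 1.0 - q
    weights = (p * p, 2.0 * p * q, q * q)
    # E step: posterior probability of each genotype for each person
    resp = []
    for x in phenos:
        joint = [w * normal_pdf(x, m, sigma) for w, m in zip(weights, means)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    # M step: weighted means, pooled variance, expected allele frequency
    n = len(phenos)
    counts = [sum(r[k] for r in resp) for k in range(3)]
    new_means = tuple(sum(r[k] * x for r, x in zip(resp, phenos)) / counts[k]
                      for k in range(3))
    new_var = sum(r[k] * (x - new_means[k]) ** 2
                  for r, x in zip(resp, phenos) for k in range(3)) / n
    new_q = (counts[1] + 2.0 * counts[2]) / (2.0 * n)
    return new_q, new_means, math.sqrt(new_var)
```

Each EM step cannot decrease the log-likelihood, which gives a simple sanity check. The sketch has no guard against a genotype class receiving zero posterior weight, so real data would need one.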
METHODS
• Computational Intensity
I say "approximately" because it takes about four iterations at each step to converge to the MLE.
n_i := number of steps taken to maximize parameter i
MAIN.R
MAXIMIZE.R
Exact same idea for other parameters
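The contents of MAXIMIZE.R are not reproduced in these slides. As a rough illustration only, here is a Python sketch of one way to maximize a single parameter by fixed-size steps and then repeat the same idea for the other parameters. Everything here is an assumption: the function names, the dictionary of parameters, and the stopping rule.

```python
import math

def maximize_one(objective, params, name, step,
                 lower=-math.inf, upper=math.inf):
    """Move one parameter by +/- step while the objective keeps increasing.

    Returns (best_params, best_value, n_steps), where n_steps counts the
    accepted moves for this parameter.
    """
    best = dict(params)
    best_val = objective(best)
    n_steps = 0
    improved = True
    while improved:
        improved = False
        for direction in (1, -1):
            cand = dict(best)
            cand[name] = best[name] + direction * step
            if not (lower <= cand[name] <= upper):
                continue  # stay inside the allowed range
            val = objective(cand)
            if val > best_val:
                best, best_val = cand, val
                n_steps += 1
                improved = True
                break
    return best, best_val, n_steps

def coordinate_search(objective, params, step, bounds=None, max_sweeps=100):
    """Cycle over the parameters until a full sweep moves none of them."""
    bounds = bounds or {}
    current = dict(params)
    value = objective(current)
    for _ in range(max_sweeps):
        moved = 0
        for name in list(params):
            lo, hi = bounds.get(name, (-math.inf, math.inf))
            current, value, n = maximize_one(objective, current, name,
                                             step, lo, hi)
            moved += n
        if moved == 0:
            break
    return current, value
```

With a log-likelihood as `objective`, bounds such as `{"q": (0.0, 1.0)}` keep the allele frequency in range. The answer is only accurate to within one step size.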
INITIALIZATION OF PARAMS
NULLMODEL.R
ALTMODEL.R
COMPAREMODELS.R
OUTPUT
TIME REQUIREMENTS
• About my machine
  • 2.4 GHz Intel Core 2 Duo
  • 4 GB 667 MHz DDR2 SDRAM
  • Mac OS X 10.6.8
  • Using RStudio
TIME REQUIREMENTS
[Figure: Time vs nfams. x-axis: Number of Families Simulated (0 to 1200); y-axis: Computational Time (sec) (0 to 600). Stepsize = 0.05; alternative is TRUE]
TIME REQUIREMENTS
[Figure: Time vs Step-size. x-axis: Step-Size for Grid Search (0.01 to 0.11); y-axis: Computational Time (sec) (0 to 50). Nfams = 90; alternative is TRUE]
LET’S TEST IT
     q      u1      u2     u3    sigma   nfams   output
1    0.4    -1.34    0     1     0.1     75      alt
2    0.34   -1.34   -0.5   1.3   0.15    75      alt
3    0.27    1.5     1.43  1.6   0.2     150     null
4    0.27    1.5     1.5   1.5   0.2     150     null
PERSPECTIVE
Limitations
• On my computer, it takes considerable time once nfams gets moderately large. The same happens as stepsize decreases.
  • Possible Solution: Coarse search, then fine search
• The assumptions behind the starting conditions can be violated in real data
  • Possible Solution: Look at your data! Then adjust starting values accordingly.
• Strictly additive model with HWE
  • Possible Solution: ???
• If assumptions are reasonably met, I would feel comfortable using these functions
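The "coarse search, then fine search" idea can be sketched in Python for a single parameter. This is a hypothetical illustration, not part of the original scripts, and the function names are assumptions. It also assumes the objective has a single peak near the coarse optimum, which a mixture likelihood does not guarantee.

```python
def grid_argmax(objective, lower, upper, step):
    """Evaluate the objective on an even grid and return (best_x, best_value)."""
    n = int(round((upper - lower) / step))
    best_x, best_val = lower, objective(lower)
    for i in range(1, n + 1):
        x = lower + i * step
        val = objective(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val

def coarse_to_fine(objective, lower, upper, coarse_step, fine_step):
    """Coarse grid first, then a fine grid one coarse step either side of the best point."""
    x0, _ = grid_argmax(objective, lower, upper, coarse_step)
    lo = max(lower, x0 - coarse_step)
    hi = min(upper, x0 + coarse_step)
    return grid_argmax(objective, lo, hi, fine_step)
```

On the interval [0, 1] with a coarse step of 0.1 and a fine step of 0.01, this costs 11 + 21 = 32 evaluations instead of the 101 needed for a single fine grid.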
ACKNOWLEDGEMENTS
Thanks to:
• Dr. Province and Dr. Kraja
• All lecturers
• HSG and MSIBS classmates