Presentation

COMMINGLING ANALYSIS
By: Jeff Berry

WHAT IT IS

[Figure: QTp versus Dosage, with u1, u2, u3, q, and sig labeled]

ASSUMPTIONS
* Families are independent, i.e. parents are independent
* QTp is normally distributed in each genotype
* Variance is the same across genotypes

ASSUMPTIONS

This is only for one person!

ASSUMPTIONS

The goal is to develop an algorithm that maximizes the log-likelihood over a given set of parameters
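As a rough illustration of the quantity being maximized, here is a minimal sketch in Python (not the R used for the original scripts). It assumes individuals are treated as independent, that genotype frequencies follow HWE with allele frequency q, and that u1, u2, u3 correspond to dosage 0, 1, 2 of that allele. The function names are hypothetical and the within-family structure is not modeled.

```python
import math

def normal_pdf(x, mean, sigma):
    """Density of a Gaussian with the given mean and standard deviation."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def genotype_weights(q):
    """HWE genotype frequencies for dosage 0, 1, 2 (assumed ordering)."""
    return [(1.0 - q) ** 2, 2.0 * q * (1.0 - q), q ** 2]

def mixture_loglik(pheno, q, means, sigma):
    """Log-likelihood under the alternative: a three-component mixture."""
    weights = genotype_weights(q)
    total = 0.0
    for y in pheno:
        density = sum(w * normal_pdf(y, m, sigma) for w, m in zip(weights, means))
        total += math.log(density)
    return total

def null_loglik(pheno, u, sigma):
    """Log-likelihood under the null: a single Gaussian."""
    return sum(math.log(normal_pdf(y, u, sigma)) for y in pheno)
```

When u1 = u2 = u3, the mixture collapses to the single Gaussian, so the two functions agree. That makes a convenient sanity check.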

THE PROBLEM
Null is true
* Governing distribution of pheno is one Gaussian
Alternative is true
* Governing distribution of pheno is a mixture of Gaussians

Hypothesis

How much different does it have to be to be significant?

VS

df = difference in the number of freely estimated parameters = 5 - 2 = 3
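The comparison with df = 3 can be sketched as a likelihood ratio test. This is a minimal Python sketch (not the original R code) that assumes the statistic 2 * (logL_alt - logL_null) is referred to a chi-square distribution with 3 degrees of freedom. The function names and the 0.05 threshold are illustrative assumptions. For df = 3 the chi-square tail probability has a closed form, so only the standard library is needed.

```python
import math

def chi2_sf_df3(x):
    """Upper tail probability of a chi-square variable with 3 degrees of freedom."""
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

def likelihood_ratio_test(loglik_null, loglik_alt, alpha=0.05):
    """Return (statistic, p-value, decision), where decision is 'alt' or 'null'."""
    stat = 2.0 * (loglik_alt - loglik_null)
    p_value = chi2_sf_df3(stat)
    decision = "alt" if p_value < alpha else "null"
    return stat, p_value, decision
```

With these assumptions, the statistic has to exceed roughly 7.81 for the alternative to be preferred at the 0.05 level.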

METHODS
• Hybrid EM-Grid Search

[Diagram, Alternative Model: parameters q, u1, u2, u3, sigma; likelihood L(); E and M steps]

METHODS
• Hybrid EM-Grid Search

[Diagram, Null Model: parameters u, sigma; likelihood L(); E and M steps]
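For the null model there is a well-known closed form that can serve as a check on any iterative fit: the maximum-likelihood estimates for a single Gaussian are the sample mean and the root of the mean squared deviation (dividing by n, not n - 1). The Python sketch below uses hypothetical function names and is not taken from the original NULLMODEL.R.

```python
import math

def null_mle(pheno):
    """Closed-form MLE (u, sigma) for a single Gaussian."""
    n = len(pheno)
    u = sum(pheno) / n
    sigma = math.sqrt(sum((y - u) ** 2 for y in pheno) / n)
    return u, sigma

def null_loglik_at_mle(pheno):
    """Maximized log-likelihood: -n/2 * (log(2*pi*sigma^2) + 1)."""
    n = len(pheno)
    _, sigma = null_mle(pheno)
    return -0.5 * n * (math.log(2.0 * math.pi * sigma ** 2) + 1.0)
```

If a search-based null fit lands noticeably away from these values, the step size or starting values are worth a second look.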

METHODS
• Computational Intensity

I say approximately because it takes about four iterations at each step to converge to the MLE.

n_i := number of steps taken to maximize parameter i

MAIN.R

MAXIMIZE.R

Exact same idea for other parameters
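One way to read "the same idea for other parameters" is a coordinate-wise step search: each parameter in turn is moved along a grid of width stepsize for as long as the log-likelihood improves, and the sweep over parameters repeats until nothing moves. The Python sketch below is an assumption about the general shape, not a transcription of MAXIMIZE.R. The names maximize_param and coordinate_search are hypothetical, and it covers only the grid part, not the EM part.

```python
def maximize_param(loglik, params, name, stepsize, lower=None, upper=None):
    """Step params[name] along a grid while loglik improves.

    Returns (updated params dict, number of steps taken)."""
    best = dict(params)
    best_ll = loglik(best)
    steps = 0
    for direction in (+1, -1):
        moved = False
        while True:
            trial = dict(best)
            trial[name] = best[name] + direction * stepsize
            if lower is not None and trial[name] < lower:
                break
            if upper is not None and trial[name] > upper:
                break
            trial_ll = loglik(trial)
            if trial_ll <= best_ll:
                break
            best, best_ll = trial, trial_ll
            steps += 1
            moved = True
        if moved:
            break  # uphill direction found; skip the opposite direction
    return best, steps

def coordinate_search(loglik, params, stepsize, bounds=None, max_sweeps=100):
    """Cycle through the parameters until a full sweep makes no move."""
    bounds = bounds or {}
    current = dict(params)
    names = list(current)
    for _ in range(max_sweeps):
        total_steps = 0
        for name in names:
            lo, hi = bounds.get(name, (None, None))
            current, n = maximize_param(loglik, current, name, stepsize, lo, hi)
            total_steps += n
        if total_steps == 0:
            break
    return current
```

Here loglik is any function taking a dict of parameter values. The step count returned by maximize_param plays the role of the number of steps taken to maximize a given parameter.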

INITIALIZATION OF PARAMS

NULLMODEL.R

ALTMODEL.R

COMPAREMODELS.R

OUTPUT

TIME REQUIREMENTS
• About my machine
  • 2.4 GHz Intel Core 2 Duo
  • 4 GB 667 MHz DDR2 SDRAM
  • Mac OS X 10.6.8
• Using RStudio

TIME REQUIREMENTS

[Figure: Time vs nfams. x-axis: Number of Families Simulated (0 to 1200); y-axis: Computational Time (sec) (0 to 600). Stepsize = 0.05, alternative is TRUE]

TIME REQUIREMENTS

[Figure: Time vs Step-size. x-axis: Step-Size for Grid Search (0.01 to 0.11); y-axis: Computational Time (sec) (0 to 50). Nfams = 90, alternative is TRUE]

LET’S TEST IT

#   q     u1     u2     u3    sigma  nfams  output
1   0.4   -1.34  0      1     0.1    75     alt
2   0.34  -1.34  -0.5   1.3   0.15   75     alt
3   0.27  1.5    1.43   1.6   0.2    150    null
4   0.27  1.5    1.5    1.5   0.2    150    null
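Test data for settings like these could be generated along the following lines. This is only a sketch in Python (not the original R simulation). It assumes each simulated person gets a genotype dosage from two independent allele draws with frequency q (HWE), that u1, u2, u3 map to dosage 0, 1, 2, and that the phenotype is Gaussian around the genotype mean. Family structure is not reproduced, and the function name is hypothetical.

```python
import random

def simulate_phenotypes(n, q, means, sigma, seed=None):
    """Draw n phenotypes from the genotype mixture implied by q, means, sigma."""
    rng = random.Random(seed)
    pheno = []
    for _ in range(n):
        # Two independent allele draws give dosage 0, 1, or 2 in HWE proportions.
        dosage = int(rng.random() < q) + int(rng.random() < q)
        pheno.append(rng.gauss(means[dosage], sigma))
    return pheno
```

For example, simulate_phenotypes(75, 0.4, (-1.34, 0.0, 1.0), 0.1, seed=1) draws 75 individuals under the first setting in the table. Passing a seed makes the draw repeatable.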

PERSPECTIVE
Limitations
• On my computer, it takes considerable time once nfams gets moderately large. The same happens as stepsize decreases.
  • Possible Solution: Coarse search, then fine search
• The starting-condition assumptions can be violated in real data
  • Possible Solution: Look at your data! Then adjust starting values accordingly.
• Strictly additive model with HWE
  • Possible Solution: ???
• If assumptions are reasonably met, I would feel comfortable using these functions
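The coarse-then-fine idea mentioned under Limitations can be sketched for a single parameter as follows. This is a Python illustration with hypothetical names and default step sizes, not part of the original R functions. It scans the full range on a coarse grid, then rescans only the neighborhood of the coarse optimum on a fine grid.

```python
def grid_argmax(f, lo, hi, step):
    """Return the grid point in [lo, hi] (spacing step) where f is largest."""
    n = int(round((hi - lo) / step))
    best_x, best_val = lo, f(lo)
    for i in range(1, n + 1):
        x = lo + i * step
        val = f(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x

def coarse_then_fine(f, lo, hi, coarse=0.1, fine=0.01):
    """Coarse scan over [lo, hi], then a fine scan around the coarse optimum."""
    x0 = grid_argmax(f, lo, hi, coarse)
    return grid_argmax(f, max(lo, x0 - coarse), min(hi, x0 + coarse), fine)
```

On the interval [0, 1] with these defaults, the two passes evaluate f at most 32 times (11 coarse, up to 21 fine), compared with 101 for a single fine scan. The trade-off is that a coarse grid can settle on the wrong region if the log-likelihood has several peaks.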

ACKNOWLEDGEMENTS
Thanks to:
• Dr. Province and Dr. Kraja
• All lecturers
• HSG and MSIBS classmates
