Quantitative Genomics and Genetics
BTRY 4830/6830; PBSB.5201.01
Jason Mezey ([email protected])
April 11, 2017 (T) 8:40-9:55
Lecture 16: Population structure and logistic regression I

Page 1: Quantitative Genomics and Genetics - Jason Mezey (mezeylab.cb.bscb.cornell.edu/labmembers/documents/class...)

Jason Mezey ([email protected])

April 11, 2017 (T) 8:40-9:55

Quantitative Genomics and Genetics

BTRY 4830/6830; PBSB.5201.01

Lecture 16: Population structure and logistic regression I

Page 2:

Announcements I: Schedule

Week 11:
• April 4: Spring break, no class!!
• April 6: Spring break, no class!!

Week 12:
• April 11: Genome-Wide Association Studies (GWAS) IV: logistic regression I (the model)
• April 13: Project assigned; Genome-Wide Association Studies (GWAS) V: logistic regression II (IRLS algorithm and GLMs)

Week 13:
• April 18: Genome-Wide Association Studies (GWAS) VI: haplotype testing, alternative tests, and minimum GWAS analysis
• April 20: Advanced topics I: mixed models

Week 14:
• April 25: Advanced topics II: multiple regression (epistasis) and multivariate regression
• April 27: MAPPING LOCI: BAYESIAN ANALYSIS; Bayesian inference I: inference basics / linear models

Week 15:
• May 2: Bayesian inference II: MCMC algorithms
• May 4: PEDIGREE / INBRED LINE ANALYSIS / CLASSIC QUANTITATIVE GENETICS; Basics of linkage analysis / inbred line analysis

Week 16:
• May 9: Project due; Heritability and additive genetic variance

Page 3:

Announcements

• Midterm will be available next week

• No more homeworks (!!) - just a project and final (and computer labs)

• Your PROJECT will be assigned on Thurs.!

• I will have office hours today

• In NY go to the SMALL Genetic Med Conference Room

• In Ithaca, same location as always

Page 4:

Conceptual Overview (flow diagram)

• Genetic system: does A1 -> A2 affect Y?
• Sample or experimental pop: measured individuals (genotype, phenotype)
• Pr(Y|X): model params
• Regression model: F-test
• Reject / DNR

Page 5:

Review: modeling covariates I

• If we have a factor that is correlated with our phenotype and we do not handle it in some manner in our analysis, we risk producing false positives AND/OR reducing the power of our tests!

• The good news is that, assuming we have measured the factor (i.e. it is part of our GWAS dataset), we can incorporate the factor into our model as a covariate(s):

• The effect of this is that we will estimate the covariate model parameter, and this will account for the correlation of the factor with the phenotype (such that we can test for our marker correlation without false positives / lower power!)

To see how this is accomplished in a permutation analysis, let's first describe a permutation. If we write our data in a matrix as follows:

Data = [ z_11 ... z_1k  y_11 ... y_1m  x_11 ... x_1N
         ...
         z_n1 ... z_nk  y_n1 ... y_nm  x_n1 ... x_nN ]

where the latter columns are the genotypes, a permutation is produced by randomizing the phenotype samples y while keeping the genotypes in the same order, e.g.:

Y = β_μ + X_a β_a + X_d β_d + X_{z,1} β_{z,1} + X_{z,2} β_{z,2} + ε (100)
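The covariate model of eq (100) can be sketched numerically. Everything below (the sample size, the effect sizes, the choice of a single continuous covariate, and the use of numpy's least-squares solver) is illustrative, not from the lecture:

```python
import numpy as np

# Sketch: a GWAS linear model with a measured covariate,
# Y = beta_mu + Xa*beta_a + Xd*beta_d + Xz*beta_z + epsilon.
rng = np.random.default_rng(0)
n = 200
xa = rng.choice([-1.0, 0.0, 1.0], size=n)            # additive genotype coding
xd = np.where(xa == 0.0, 1.0, -1.0)                  # dominance coding (het = 1)
xz = rng.normal(size=n)                              # a measured covariate
y = 1.0 + 0.5 * xa + 2.0 * xz + rng.normal(size=n)   # simulated phenotype

X = np.column_stack([np.ones(n), xa, xd, xz])        # design matrix with covariate column
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)     # least squares = MLE under normal error
```

Including the `xz` column means its effect is absorbed by its own parameter estimate rather than contaminating the genotype terms.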

Page 6:

Review: modeling covariates II

• How do we perform inference with a covariate in our linear regression model?

• We perform MLE the same way (!!): our X matrix now simply includes extra columns, one for each of the additional covariates, where for the linear regression we have:

• We perform hypothesis testing the same way (!!) with a slight difference: our LRT includes the covariate in both the null hypothesis and the alternative, but we are testing the same null hypothesis:

2 Hypothesis testing with the regression model

As a reminder, our inference goal in quantitative genomics is to test the following null hypothesis for a multiple regression model: Y = β_μ + X_a β_a + X_d β_d + ε with ε ~ N(0, σ²_ε), which we use to assess whether there is an effect of a polymorphism on a phenotype:

H0: β_a = 0 ∩ β_d = 0 (1)
HA: β_a ≠ 0 ∪ β_d ≠ 0 (2)

To do this, we will construct a likelihood ratio test (LRT) with an exact distribution (in this case, an F-test). We will not go into the details of how this test is derived, but remember that this has the same form as any LRT that we discussed in a previous lecture (and remember that a LRT works like any other statistic, i.e. it is a function on a sample that produces a value that we then use to determine a p-value!!). We will however consider the components of an F-statistic so we know how to calculate it and perform our hypothesis test.

To construct this LRT, we need the maximum likelihood estimates of the regression parameters:

MLE(θ̂) = [β̂_μ, β̂_a, β̂_d]

where, recall from last lecture, this has the following form:

MLE(β̂) = (X^T X)^{-1} X^T Y (3)
MLE(β̂) = (x^T x)^{-1} x^T y (4)

With these estimates, we can construct the predicted phenotypic value ŷ_i for an individual i in a sample:

ŷ_i = β̂_μ + x_{i,a} β̂_a + x_{i,d} β̂_d (5)

where the parameter estimates are the MLE. We will next define two functions of the predicted values. The first is the sum of squares of the model (SSM):

SSM = Σ_{i=1}^n (ŷ_i − ȳ)² (6)

where ȳ = (1/n) Σ_i^n y_i is the mean of the sample. The second is the sum of squares of the error (SSE):

SSE = Σ_{i=1}^n (y_i − ŷ_i)² (7)
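A minimal numeric sketch of eqs (3)-(7); the simulated data and parameter values are made up for illustration:

```python
import numpy as np

# Simulate a phenotype under the multiple regression model, then compute
# the closed-form MLE (eq 3), predicted values (eq 5), SSM (eq 6), SSE (eq 7).
rng = np.random.default_rng(1)
n = 100
xa = rng.choice([-1.0, 0.0, 1.0], size=n)
xd = np.where(xa == 0.0, 1.0, -1.0)
y = 0.2 + 0.8 * xa - 0.3 * xd + rng.normal(size=n)

X = np.column_stack([np.ones(n), xa, xd])
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y   # MLE(beta) = (X^T X)^{-1} X^T y
y_hat = X @ beta_hat                          # predicted phenotypic values
ssm = np.sum((y_hat - y.mean()) ** 2)         # sum of squares of the model
sse = np.sum((y - y_hat) ** 2)                # sum of squares of the error
```

A useful sanity check: with an intercept in the model, SSM + SSE equals the total sum of squares Σ(y_i − ȳ)².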

the other haplotype alleles, this is a reasonable solution for determining the number of alleles. Now, this might not be a very satisfying answer, but it turns out that, for humans at least, if one looks at a haplotype region, it is often relatively easy to identify 3-5 haplotype alleles that account for all observed variation. In sum, there is no hard rule, but we define a collapsing that makes the most sense given the data we observe.

3 Fixed Covariates

Remember that when we are performing a GWAS using a GLM:

Y = γ^{-1}(β_μ + X_a β_a + X_d β_d) (1)

where we are testing:

H0: β_a = 0 ∩ β_d = 0 (2)
HA: β_a ≠ 0 ∪ β_d ≠ 0 (3)

and where another way to consider these hypotheses is that we are actually testing:

H0: Cov(Y, X_a) = 0 ∩ Cov(Y, X_d) = 0 (4)
HA: Cov(Y, X_a) ≠ 0 ∪ Cov(Y, X_d) ≠ 0 (5)

Let's now consider a case where a marker is not linked to a causal polymorphism, so that the null hypothesis is true, but there is another factor, which we could code as an additional variable X_z, that has an effect on Y (which we could describe with a parameter β_z) such that Cov(Y, X_z) ≠ 0. Let's assume that this factor has the following relationship with the genotype: Cov(X_a, X_z) ≠ 0, i.e. X_z is correlated with X_a. In this case, when testing the null hypothesis of equation (4), we should expect to reject the null. While this is not a false positive in the sense that we are getting the right statistical answer, this is the wrong answer from a genetic perspective, so it is a biological false positive, i.e. the result of the test is indicating that the marker is linked to a causal polymorphism although it is not.

Let's now consider a case where there is a factor that has an effect on Y but is not correlated with either X_a or X_d. If we apply our basic GLM, we are actually incorporating the effect of this factor in the error term. For example, for a linear regression model:

Y = β_μ + X_a β_a + X_d β_d + ε_{Xz} (6)

the actual error we are considering is:

ε_{Xz} = X_z β_z + ε (7)
ε ~ N(0, σ²_ε) (8)
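The "biological false positive" above is easy to see in a hypothetical simulation: the marker has no effect on Y, but an unmodeled factor X_z is correlated with both Y and X_a, so Cov(Y, X_a) ends up nonzero anyway. All numbers below are made up for illustration:

```python
import numpy as np

# Xa has no direct effect on Y, but the confounding factor Xz is correlated
# with Xa and drives Y, inducing Cov(Y, Xa) != 0 at a non-causal marker.
rng = np.random.default_rng(2)
n = 5000
xa = rng.choice([-1.0, 0.0, 1.0], size=n)
xz = 0.8 * xa + rng.normal(size=n)   # factor correlated with the genotype
y = 1.5 * xz + rng.normal(size=n)    # Y depends on Xz only, never on Xa directly

cov_y_xa = np.cov(y, xa)[0, 1]       # nonzero despite the marker being non-causal
```

Here the theoretical value is Cov(Y, X_a) = 1.5 · 0.8 · Var(X_a) ≈ 0.8, so a covariance-based test of the marker would reject even though the marker itself does nothing.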

Page 7:

Modeling covariates III

• First, determine the predicted value of the phenotype of each individual under the null hypothesis (how do we set up x?):

• Second, determine the predicted value of the phenotype of each individual under the alternative hypothesis (how do we set up x?):

• Third, calculate the "Error Sum of Squares" for each:

• Finally, we calculate the F-statistic with degrees of freedom [2, n-3] (why two degrees of freedom?):

Y = γ^{-1}(β_μ + X_a β_a + X_d β_d + X_{z,1} β_{z,1} + X_{z,2} β_{z,2}) (217)

l(θ̂_0 | y) = Σ_{i=1}^n [ y_i ln(γ^{-1}(β_μ + x_{i,z} β_z)) + (1 − y_i) ln(1 − γ^{-1}(β_μ + x_{i,z} β_z)) ] (218)

ŷ_{i,θ̂_0} = β̂_μ + Σ_j x_{i,z,j} β̂_{z,j} (219)

ŷ_{i,θ̂_1} = β̂_μ + x_{i,a} β̂_a + x_{i,d} β̂_d + Σ_j x_{i,z,j} β̂_{z,j} (220)

F_{[2,n−3]}(y, x) = [ (SSE(θ̂_0) − SSE(θ̂_1)) / 2 ] / [ SSE(θ̂_1) / (n − 3) ] (221)
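The four steps above can be sketched with two least-squares fits, one for the null model (intercept + covariate) and one for the alternative (plus the genotype terms). The simulated data, effect sizes, and the `sse` helper are hypothetical; the degrees of freedom follow eq (221) as written:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
xa = rng.choice([-1.0, 0.0, 1.0], size=n)
xd = np.where(xa == 0.0, 1.0, -1.0)
xz = rng.normal(size=n)                            # one covariate
y = 0.5 + 1.0 * xa + 0.7 * xz + rng.normal(size=n)

def sse(X, y):
    """Error sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

X0 = np.column_stack([np.ones(n), xz])             # null: covariate only, eq (219)
X1 = np.column_stack([np.ones(n), xa, xd, xz])     # alternative: + genotype, eq (220)
f_stat = ((sse(X0, y) - sse(X1, y)) / 2) / (sse(X1, y) / (n - 3))  # eq (221)
```

Because the covariate appears in both models, the F-statistic isolates the genotype contribution after the covariate effect has been accounted for.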

SSM = Σ_{i=1}^n (ŷ_i − ȳ)² (8)

where ȳ = (1/n) Σ_i^n y_i is the mean of the sample. The second is the sum of squares of the error (SSE):

SSE(θ̂_0) = Σ_{i=1}^n (y_i − ŷ_{i,θ̂_0})² (9)
SSE(θ̂_1) = Σ_{i=1}^n (y_i − ŷ_{i,θ̂_1})² (10)

We will next use these two expressions to define two corresponding functions: the mean square model (MSM) and the mean square error (MSE) terms. These latter functions depend on the concept of degrees of freedom (df). Degrees of freedom have a rigorous justification that you will encounter in an advanced statistics course. In this course, we will not consider this justification or a deep intuition as to what df represent. For our purposes, it is enough to be able to calculate the df for our model and for our error. For our model, we determine df as the total number of β parameters in our model (three in this case: β_μ, β_a, and β_d) minus one for the estimate of ȳ, such that df(M) = 3 − 1 = 2. For our error, the df is the total sample n minus one for each of the three β parameters estimated in the regression model, such that df(E) = n − 3. Note that this approach for determining df works for any model. For example, if we were to consider a regression model with just β_μ and β_a (and no β_d), we would have df(M) = 2 − 1 and df(E) = n − 2.

With these terms for df, we can now define MSM and MSE:

MSM = SSM / df(M) = SSM / 2 (11)
MSE = SSE / df(E) = SSE / (n − 3) (12)

and with these definitions, we can finally calculate our F-statistic:

F_{[2,n−3]} = MSM / MSE (13)
F_{[2,n−3]}(y, x_a, x_d) = MSM / MSE (14)
F_{[2,n−3]}(y, x_a, x_d) = [ (SSE(θ̂_0) − SSE(θ̂_1)) / 2 ] / [ SSE(θ̂_1) / (n − 3) ] (15)
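A quick numeric sketch checking that eq (13) (MSM/MSE) and eq (15) (the SSE-difference form, with the null model taken as intercept-only) give the same F[2, n−3] value; the simulated data are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 150
xa = rng.choice([-1.0, 0.0, 1.0], size=n)
xd = np.where(xa == 0.0, 1.0, -1.0)
y = 1.0 + 0.6 * xa + 0.2 * xd + rng.normal(size=n)

X = np.column_stack([np.ones(n), xa, xd])
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y
y_hat = X @ beta_hat

ssm = np.sum((y_hat - y.mean()) ** 2)              # eq (8)
sse = np.sum((y - y_hat) ** 2)                     # SSE(theta_1)
f_ratio = (ssm / 2) / (sse / (n - 3))              # eq (13): MSM / MSE

sse_null = np.sum((y - y.mean()) ** 2)             # SSE(theta_0): intercept-only fit
f_diff = ((sse_null - sse) / 2) / (sse / (n - 3))  # eq (15)
```

The two forms agree because SSM = SSE(θ̂_0) − SSE(θ̂_1) whenever the null model is the intercept-only fit.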


Page 8:

Modeling covariates VI

• Say you have GWAS data (a phenotype and genotypes) and your GWAS data also includes information on a number of covariates, e.g. male / female, several different ancestral groups (different populations!!), other risk factors, etc.

• First, you need to figure out how to code the X_Z in each case for each of these, which may be simple (male / female) but more complex for others (where how to code them involves fuzzy rules, i.e. it depends on your context!!)

• Second, you will need to figure out which to include in your analysis (again, fuzzy rules!) but a good rule is: if the parameter estimate associated with the covariate is large (= significant individual p-value), you should include it!

• There are many ways to figure out how to include covariates (again a topic in itself!!)

Page 9:

Review: population structure

• “Population structure” or “stratification” is a case where a sample includes groups of people that fit into two or more different ancestry groups (fuzzy def!)

• Population structure is often a major issue in GWAS where it can cause lots of false positives if it is not accounted for in your model

• Intuitively, you can model population structure as a covariate if you know:

• How many populations are represented in your sample

• Which individual in your sample belongs to which population

• QQ plots are good for determining whether there may be population structure

• “Clustering” techniques are good for detecting population structure and determining which individual is in which population (=ancestry group)

Page 10:

Origin of population structure

© Sarver World Cultures

People geographically separate through migration and then the set of alleles present in the population evolves (=changes) over time

Page 11:

Principal Component Analysis (PCA) of population structure

© Nature Publishing

Page 12:

Learning unmeasured population factors

• To learn a population factor, analyze the genotype data

• Apply a Principal Component Analysis (PCA) where the "axes" (features) in this case are individuals and each point is a (scaled) genotype

• What we are interested in are the projections (loadings) of the individual PCs on each of the individual axes, where for each, this will produce n values (i.e. one value for each sample) of a new independent (covariate) variable X_Z

Z_{i,1}, Z_{i,2}

Y = β_μ + X_a β_a + X_d β_d + X_{z,1} β_{z,1} + X_{z,2} β_{z,2} + ε (100)

Page 13:

Applying a PCA population structure analysis (in practice)

• Calculate the n x n (n = sample size) covariance matrix for the individuals in your sample across all genotypes

• Apply a PCA to this covariance matrix; the output will be matrices containing "eigenvalues" and "eigenvectors" (= the Principal Components), where the size of the eigenvalue indicates the ordering of the Principal Components

• Each Principal Component (PC) will be an n-element vector where each element is the "loading" of the PC on the individual axes, where these are the values of your independent variable coding (e.g., if you include the first PC as your first covariate, your coding will be X_{Z,1} = PC loadings)

• Note that you could also get the same answer by calculating an N x N (N = measured genotypes) covariance matrix, applying PCA, and taking the projections of each sample on the PCs (why might this be less optimal?)
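The recipe above can be sketched under a hypothetical two-population simulation; the sample sizes, allele frequencies, and the small guard against monomorphic markers are all assumptions of this example, not prescriptions from the lecture:

```python
import numpy as np

# Build the n x n covariance of standardized genotypes and take the leading
# eigenvector (PC 1) as the loadings for a population covariate X_Z.
rng = np.random.default_rng(5)
n, N = 100, 500                                # n samples, N genotyped markers
pop = np.repeat([0, 1], n // 2)                # two hypothetical ancestry groups
freq = np.where(pop[:, None] == 0, 0.2, 0.8)   # allele frequency differs by pop
G = rng.binomial(2, np.broadcast_to(freq, (n, N))).astype(float)

Gs = (G - G.mean(axis=0)) / (G.std(axis=0) + 1e-12)  # standardize each marker
K = (Gs @ Gs.T) / N                            # n x n covariance across genotypes
evals, evecs = np.linalg.eigh(K)               # eigh returns ascending eigenvalues
pc1 = evecs[:, -1]                             # top PC: one loading per sample
```

In this simulation the PC 1 loadings cleanly separate the two groups, which is exactly why they work as the covariate coding X_{Z,1}.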

Page 14:

Using the results of a PCA population structure analysis

• Once you have detected the populations (e.g. by eye in a PCA = fuzzy!) in your GWAS sample, set your independent variables equal to the loadings for each individual, e.g., for two pop covariates, set X_{Z,1} = Z_1, X_{Z,2} = Z_2

• You could also determine which individual is in which pop and define random variables for pop assignment, e.g. for two populations include a single covariate by setting X_{Z,1}(pop1) = 1, X_{Z,1}(pop2) = 0 (generally less optimal but can be used!)

• Use one of these approaches to model a covariate in your analysis, i.e. for every genotype marker that you test in your GWAS:

• The goal is to produce a good QQ plot (what if it does not?)

Y = β_μ + X_a β_a + X_d β_d + X_{z,1} β_{z,1} + X_{z,2} β_{z,2} + ε (100)

Page 15:

Before (top) and after including a population covariate (bottom)

Page 16:

Review: linear regression

• So far, we have considered a linear regression as a reasonable model for the relationship between genotype and phenotype (where this implicitly assumes a normal error provides a reasonable approximation of the phenotype distribution given the genotype):

and we can write the 'predicted' value ŷ_i of an individual as:

ŷ_i = β̂_0 + x_i β̂_1 (14)

which is the value we would expect y_i to take if there is no error. Note that by convention we write the predicted value of y with a 'hat', which is the same terminology that we use for parameter estimates. I consider this a bit confusing, since we only estimate parameters, but you can see where it comes from, i.e. the predicted value of y_i is a function of parameter estimates.

As an example, let's consider the values that all of the linear regression components would take for a specific value y_i. Let's consider a system where:

Y = β_0 + X β_1 + ε = 0.5 + X(1) + ε (15)
ε ~ N(0, σ²_ε) = N(0, 1) (16)

If we take a sample and obtain the value y_1 = 3.8 for an individual in our sample, the true values of the equation for this individual are:

3.8 = 0.5 + 3(1) + 0.3 (17)

Let's say we had estimated the parameters β_0 and β_1 from the sample to be β̂_0 = 0.6 and β̂_1 = 2.9. The predicted value of y_1 in this case would be:

ŷ_1 = 3.5 = 0.6 + 2.9(1) (18)

Note that we have not yet discussed how we estimate the β parameters, but we will get to this next lecture.

To produce a linear regression model useful in quantitative genomics, we will define a multiple linear regression, which simply means that we have more than one independent (fixed random) variable X, each with their own associated β. Specifically, we will define the two following independent (random) variables:

X_a(A1A1) = −1, X_a(A1A2) = 0, X_a(A2A2) = 1 (19)
X_d(A1A1) = −1, X_d(A1A2) = 1, X_d(A2A2) = −1 (20)

and the following regression equation:

Y = β_μ + X_a β_a + X_d β_d + ε (21)
ε ~ N(0, σ²_ε) (22)
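The codings in eqs (19) and (20) can be written as a small helper; the dictionary and function names below are made up for illustration:

```python
import numpy as np

# Map genotype labels to the (Xa, Xd) codings of eqs (19)-(20):
# Xa: A1A1 -> -1, A1A2 -> 0, A2A2 -> 1; Xd: heterozygote -> 1, homozygotes -> -1.
CODING = {"A1A1": (-1.0, -1.0), "A1A2": (0.0, 1.0), "A2A2": (1.0, -1.0)}

def code_genotypes(genotypes):
    """Return the Xa and Xd vectors for a list of genotype labels."""
    xa, xd = zip(*(CODING[g] for g in genotypes))
    return np.array(xa), np.array(xd)

xa, xd = code_genotypes(["A1A1", "A1A2", "A2A2", "A1A2"])
```

These two vectors become the genotype columns of the design matrix in eq (21).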


Page 17:

Case / Control Phenotypes I

• While a linear regression may provide a reasonable model for many phenotypes, we are commonly interested in analyzing phenotypes where this is NOT a good model

• As an example, we are often in situations where we are interested in identifying causal polymorphisms (loci) that contribute to the risk of developing a disease, e.g. heart disease, diabetes, etc.

• In this case, the phenotype we are measuring is often "has disease" / "does not have disease", or more precisely "case" / "control"

• Recall that such phenotypes are properties of measured individuals and therefore elements of a sample space, such that we can define a random variable such as Y(case) = 1 and Y(control) = 0

Page 18:

Case / Control Phenotypes II

• To contrast the situations, let's compare data we might model with a linear regression model versus case / control data:

Page 19:

Case / Control Phenotypes II

• To contrast the situations, let's compare data we might model with a linear regression model versus case / control data:

Page 20:

Logistic regression I

• Instead, we’re going to consider a logistic regression model

Page 21:

Logistic regression II

• It may not be immediately obvious why we choose a regression "line" function of this "shape"

• The reason is mathematical convenience, i.e. this function can be considered (along with linear regression) within a broader class of models called Generalized Linear Models (GLMs), which we will discuss next lecture

• However, beyond a few differences (the error term and the regression function), we will see that the structure and our approach to inference is the same with this model

Page 22:

Logistic regression III

• To begin, let's consider the structure of a regression model:

• We code the "X's" the same (!!), although a major difference here is the "logistic" function, as yet undefined

• However, the expected value of Y has the same structure as we have seen before in a regression:

• We can similarly write for a population using matrix notation (where the X matrix has the same form as we have been considering!):

• In fact, the two major differences are in the form of the error and the logistic function

phenotypes, and any statistical test that accomplishes this goal is a reasonable approach.For the moment, we will consider a logistic regression approach to modeling case-controlphenotypes. Logistic regression (and related models) provide the most versatile approachto case-control analysis.

As the general framework is the same as we have discussed before, we are still dealingwith a sample space S = {S

g

, SP

}, which contains genotype Sg

and phenotype SP

sub-sets. We will define the same genotypic random variables as before X : (S

g

, ⇤) ! Rusing the same codings: X

a

(A1A1) = �1, Xa

(A1A2) = 0, Xa

(A2A2) = 1 and Xd

(A1A1) =�1, X

d

(A1A2) = �1, Xa

(A2A2) = �1. We will also define a phenotypic random variableY : (⇤, S

P

) ! R which has the following structure: Y (case) = 1, Y (control) = 0. You’ll no-tice that plotting phenotype versus the three genotype classes in this case is a little di↵erentthan for a continuous, normal phenotype because we only have six possible combinations ofgenotype and phenotype. We will therefore use a slightly di↵erent ‘circle’ notation to repre-sent the frequency of observations in each of these categories (see class notes for a diagram).

As with our continuous, normal random variable, we will define a probability modelfor Y under the assumption Pr(Y |X). Now we could in theory continue to use a lin-ear regression to model the relationship between genotype and phenotype and, in fact,you sometimes see this approach (although I would encourage you not to use this strat-egy). However, the distribution of the phenotype has clearly violated a major assumptionof the linear regression model, that the distribution of Y |A

j

Ak

⇠ N(E(Y |Aj

Ak

),�2✏

) =N((�

µ

+ Xa

�a

+ Xd

�d

),�2✏

) = N(G(Y ),�2✏

), i.e. this violates the assumption that thephenotype is normally distributed around the expected (genotypic) value of each geno-type. This error cannot be normal if the phenotype only takes two states: zero and one.What’s more, a linear regression model can lead to genotypic values greater or less thanone, which tends not to match our intuition about how we should model genotypic valuesof case-control phenotypes (as we will see). We therefore need a di↵erent approach and alogistic regression is the model we will consider.

Let’s first consider the structure of a logistic regression:

Y = logistic(�µ

+Xa

�a

+Xd

�d

) + ✏l

(1)

You’ll note this has the same structure as a linear regression with the addition of the, asof yet, undefined function logistic(). The logistic function results in fitting a function tothe data that is close to flat at zero, increases in the middle, and flattens out again nearone (see class notes for a diagram). However, just as E(Y |X) = �

µ

+Xa

�a

+Xd

�d

for alinear regression:

E(Y |X) = logistic(�µ

+Xa

�a

+Xd

�d

) (2)

2

and we can similarly write for an individual i :

E(Yi

|Xi

) = logistic(�µ

+Xi,a

�a

+Xi,d

�d

) (3)

That is, in our genotype-phenotype plot, if we were to find the value of the logistic functionon the Y-axis at the point on the X-axis corresponding to A1A1, this is the expected valueof the phenotype Y for genotype A1A1, etc. Note that this number will be between zeroand one. We can similarly write a equation for a sample of size n using vector notation:

E(Y|X) = logistic(X�) (4)

where Y, X, and the vector � have the same definition as previously.

There is one other difference between equation (1) and a linear regression: the distribution of the error random variable ε. For a given value of the logistic regression for a genotype A_j A_k, this random variable has to make up the difference between a value of Y, which is zero or one, and the value of this function. For a given genotype A_j A_k, this random variable therefore has to take one of two values. If, for a genotype A_j A_k, the value of the phenotype is Y = 1:

ε = 1 − E(Y|X) = 1 − E(Y|A_j A_k) = 1 − logistic(β_µ + X_a β_a + X_d β_d)    (5)

or if, for this same genotype A_j A_k, the value of the phenotype is Y = 0, then:

ε = −E(Y|X) = −E(Y|A_j A_k) = −logistic(β_µ + X_a β_a + X_d β_d)    (6)

The random variable ε therefore takes one of two values, the difference between the value of the function at a genotype and one or zero (see class notes for a diagram).

As ε only has two states, this random variable has a Bernoulli distribution. Note that a Bernoulli distribution is parameterized by a single parameter: ε ∼ bern(p), where the parameter p is the probability that the random variable will take the value 'one'. So what is the parameter p? This takes the following value:

p = logistic(β_µ + X_a β_a + X_d β_d)    (7)

where ε takes the value 1 − logistic(β_µ + X_a β_a + X_d β_d) with probability logistic(β_µ + X_a β_a + X_d β_d) and the value −logistic(β_µ + X_a β_a + X_d β_d) with probability 1 − logistic(β_µ + X_a β_a + X_d β_d). The error distribution therefore differs depending on the expected value of the phenotype (= genotypic value) associated with a specific genotype.

While this may look complicated, this parameter actually allows for a simple interpretation: if the value of the logistic regression function is low (i.e., closer to zero), the probability that Y = 0 is greater than the probability that Y = 1, and vice versa when the function is close to one.
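Constructing the error this way is the same as drawing Y itself as a Bernoulli variable with parameter p = logistic(β_µ + X_a β_a + X_d β_d); a small simulation sketch (the β values are illustrative only):

```python
import math
import random

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

random.seed(1)
beta_mu, beta_a, beta_d = 0.2, 2.2, 0.2  # illustrative values

def draw_phenotype(xa, xd):
    p = logistic(beta_mu + xa * beta_a + xd * beta_d)  # E(Y|X), equation (7)
    y = 1 if random.random() < p else 0                # Y is 1 w.p. p, 0 w.p. 1 - p
    eps = y - p                                        # error is 1 - p or -p
    return y, eps

# For A1A1 (xa = xd = -1), p is near 0.1, so Y = 0 far more often than Y = 1
draws = [draw_phenotype(-1, -1)[0] for _ in range(10000)]
print(sum(draws) / len(draws))  # sample frequency of Y = 1, close to p
```

This makes the interpretation above concrete: when the logistic curve sits near zero for a genotype, the simulated errors are mostly the small negative value −p, with the occasional 1 − p.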


Logistic regression: error term I

• Recall that for a linear regression, the error term accounted for the difference between each point and the expected value (the linear regression line), which we assume follows a normal distribution. For a logistic regression we have the same setup, but the error has to make up the difference to either 0 or 1 (what distribution is this?):

(Diagram: genotype-phenotype plots of Y versus Xa for the linear and logistic cases; see class notes.)

Logistic regression: error term II

• For the error on an individual i, we therefore have to construct an error that takes Y to either "1" or "0", depending on the expected value for the genotype

• For Y = 0:

ε_i = −E(Y_i|X_i) = −E(Y|A_j A_k) = −logistic(β_µ + X_{i,a} β_a + X_{i,d} β_d)

• For Y = 1:

ε_i = 1 − E(Y_i|X_i) = 1 − E(Y|A_j A_k) = 1 − logistic(β_µ + X_{i,a} β_a + X_{i,d} β_d)

• Equivalently, ε_i = Z − E(Y_i|X_i) with Z ∼ bern(p)


• For a distribution that takes two such values, a reasonable distribution is therefore the Bernoulli distribution, with the parameter

ε_{i,l} ∼ bern(p|X), p = logistic(β_µ + X_{i,a} β_a + X_{i,d} β_d)



Logistic regression: error term III

• This may look complicated at first glance, but the intuition is relatively simple

• If the logistic regression line is near zero, the probability distribution of the error term is set up to make the probability of Y being zero greater than the probability of it being one (and vice versa for the regression line near one!):

(Diagram: genotype-phenotype plot of Y versus Xa with the fitted logistic curve; see class notes.)


Logistic regression: link function I

• Next, we have to consider the function for the regression line of a logistic regression (remember, below we are plotting just versus Xa, but this really is a plot versus Xa AND Xd!!):

(Diagram: genotype-phenotype plot of Y versus Xa with the logistic regression curve; see class notes.)

We can therefore write for an individual i:

E(Y_i|X_i) = e^(β_µ + X_{i,a} β_a + X_{i,d} β_d) / (1 + e^(β_µ + X_{i,a} β_a + X_{i,d} β_d))    (13)

and for the observed values of individual i:

E(y_i|x_i) = e^(β_µ + x_{i,a} β_a + x_{i,d} β_d) / (1 + e^(β_µ + x_{i,a} β_a + x_{i,d} β_d))    (14)

Note that equation (12) describes a sample of size n using vector notation. We can write this out as follows:

E(y|x) = γ⁻¹(xβ) = [ e^(β_µ + x_{1,a} β_a + x_{1,d} β_d) / (1 + e^(β_µ + x_{1,a} β_a + x_{1,d} β_d)), ..., e^(β_µ + x_{n,a} β_a + x_{n,d} β_d) / (1 + e^(β_µ + x_{n,a} β_a + x_{n,d} β_d)) ]ᵀ

Note that the logit link function is not the only link function that we could use for analyzing case-control data (there are, in fact, quite a number of functions we could use). However, the logit link (logistic inverse) has some nice properties that have to do with 'sufficiency' of the parameter estimates. As a consequence, the logit link is called the 'canonical' link function for this case and tends to be the most widely used.
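The vectorized form γ⁻¹(xβ) above is easy to compute for a whole sample at once; a minimal sketch with a three-individual design matrix (β values illustrative):

```python
import numpy as np

def inv_logit(z):
    # elementwise logistic inverse link, gamma^{-1}
    return np.exp(z) / (1.0 + np.exp(z))

beta = np.array([0.2, 2.2, 0.2])  # [beta_mu, beta_a, beta_d], illustrative

# design matrix x: a column of 1s, then x_a and x_d, one row per individual
x = np.array([[1.0, -1.0, -1.0],   # A1A1
              [1.0,  0.0,  1.0],   # A1A2
              [1.0,  1.0, -1.0]])  # A2A2

expected = inv_logit(x @ beta)  # the vector E(y|x), each entry in (0, 1)
print(expected)
```

Each entry of the result is the expected phenotype for the corresponding row of the design matrix, exactly the column vector written out above.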

4 Estimation of logistic regression parameters

Now that we have all the components of a logistic regression, we can consider inference with this model. For GWAS applications, our goal will be hypothesis testing and, as in the case of applying a linear regression, we will perform our hypothesis test using a likelihood ratio test (LRT), which requires that we have maximum likelihood estimates (MLEs) of the β parameters in the model, i.e. MLE(β̂). To derive the MLE(β̂) for the β parameters of a logistic regression model, we will use the standard approach for finding MLEs: set the derivative of the (log-)likelihood function dl(β)/dβ equal to zero and solve for the parameters (and use the second derivative to assess whether we are considering a maximum). So, we first need to consider the log-likelihood ln(L(β|Y)) for the logistic regression model. For a sample of size n this is:

l(β) = Σ_{i=1}^{n} [ y_i ln(γ⁻¹(β_µ + x_{i,a} β_a + x_{i,d} β_d)) + (1 − y_i) ln(1 − γ⁻¹(β_µ + x_{i,a} β_a + x_{i,d} β_d)) ]    (15)

Now taking the first and second derivative of this equation is straightforward. However, unlike the case with a linear regression, where we could solve for the parameters and produce a simple equation, the resulting function in the logistic case is a function of the β's, which is a problem, since we are attempting to solve for the β's. We therefore cannot take this direct route to a closed-form solution for MLE(β̂), and must instead find the maximum with an iterative algorithm (the IRLS algorithm, which we consider next lecture).
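Even without a closed-form maximum, the log-likelihood in equation (15) is simple to evaluate for any candidate β; a sketch under simulated data (the seed and parameter values are arbitrary, chosen only for illustration):

```python
import numpy as np

def inv_logit(z):
    return np.exp(z) / (1.0 + np.exp(z))

def log_lik(beta, X, y):
    # l(beta) = sum_i [ y_i ln p_i + (1 - y_i) ln(1 - p_i) ], p_i = inv_logit(x_i . beta)
    p = inv_logit(X @ beta)
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

# simulate a small case-control sample under illustrative parameter values
rng = np.random.default_rng(0)
n = 500
xa = rng.choice([-1.0, 0.0, 1.0], size=n)
xd = 1.0 - 2.0 * np.abs(xa)                  # xd = 1 for heterozygotes, -1 otherwise
X = np.column_stack([np.ones(n), xa, xd])
true_beta = np.array([0.2, 2.2, 0.2])
y = rng.binomial(1, inv_logit(X @ true_beta))

# the likelihood at the generating values should exceed that at beta = 0,
# but no algebraic rearrangement yields the maximizing beta directly
print(log_lik(true_beta, X, y) > log_lik(np.zeros(3), X, y))
```

In practice the maximization over β is handed to an iterative scheme; the point here is only that l(β) can be scored at any candidate value.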

Calculating the components of an individual II

• For example, say we have an individual i that has genotype A1A1 and phenotype Y_i = 0

• We know X_a = −1 and X_d = −1

• Say we also know that for the population, the true parameters (which we will not know in practice! We need to infer them!) are:

β_µ = 0.2, β_a = 2.2, β_d = 0.2

• We can then calculate E(Y_i|X_i) and the error term for i:

Y_i = E(Y_i|X_i) + ε_{i,l} = γ⁻¹(β_µ + X_{i,a} β_a + X_{i,d} β_d) + ε_{i,l} = e^(β_µ + X_{i,a} β_a + X_{i,d} β_d) / (1 + e^(β_µ + X_{i,a} β_a + X_{i,d} β_d)) + ε_{i,l}

0 = e^(0.2 + (−1)(2.2) + (−1)(0.2)) / (1 + e^(0.2 + (−1)(2.2) + (−1)(0.2))) + ε_{i,l}

0 ≈ 0.1 − 0.1
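The arithmetic in this example can be checked directly:

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

beta_mu, beta_a, beta_d = 0.2, 2.2, 0.2   # the true parameters from the example
xa, xd = -1, -1                           # genotype A1A1
y = 0                                     # observed phenotype

e_y = logistic(beta_mu + xa * beta_a + xd * beta_d)  # expected phenotype
eps = y - e_y                                        # error term for individual i

print(round(e_y, 1), round(eps, 1))  # 0.1 -0.1, i.e. 0 = 0.1 - 0.1
```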


Ak

, the value of the phenotype Y = 0, then:

✏i,l

= 1�E(Yi

|Xi

) = 1�E(Yi

|Xi

) = �E(Y |Ai

Aj

) = �logistic(�µ

+Xi,a

�a

+Xi,d

�d

) (15)

3

⇤i = 1� E(Yi|Xi) = 1� E(Y |AiAj) = 1� logistic(�µ +Xi,a�a +Xi,d�d) (61)

⇤i = Z � E(Yi|Xi) (62)

Yi = E(Yi|Xi) + ⇤i (63)

Yi = ⇥�1(Yi|Xi) + ⇤i (64)

Yi =e�µ+xi,a�a+xi,d�d

1 + e�µ+xi,a�a+xi,d�d+ ⇤i (65)

0 =e0.2+(�1)2.2+(�1)0.2

1 + e0.2+(�1)2.2+(�1)0.2+ ⇤i (66)

1 =e0.2+(�1)2.2+(�1)0.2

1 + e0.2+(�1)2.2+(�1)0.2+ ⇤i (67)

0 =e0.2+(0)2.2+(1)0.2

1 + e0.2+(0)2.2+(1)0.2+ ⇤i (68)

0 =e0.2+(1)2.2+(�1)0.2

1 + e0.2+(1)2.2+(�1)0.2+ ⇤i (69)

Pr(Z) ⇥ bern(p) (70)

�[t+1] = �[t] + [xTWx]�1xT(y� ⇥�1(x�[t]) (71)

⇥�1(�µ + xi,a�a + xi,d�d) =e�µ+xi,a�a+xi,d�d

1 + e�µ+xi,a�a+xi,d�d(72)

⇥�1(x�) =ex�

1 + ex�(73)

⇤i = �0.6 (74)

⇤i = 0.4 (75)

⇤i = �0.1 (76)

⇤i = 0.9 (77)

⇤i = 0.1 (78)

⇤i = �0.9 (79)

Pr(⇤i) ⇥ bern(p|X)� E(Y |X) (80)

⇤i|(Yi = 0) = �E(Yi|Xi) (81)

⇤i|(Yi = 1) = 1� E(Yi|Xi) (82)

⇥�1(�[t]µ + xi,a�

[t]a + xi,d�

[t]d ) =

e�[t]µ +xi,a�

[t]a +xi,d�

[t]d

1 + e�[t]µ +xi,a�

[t]a +xi,d�

[t]d

(83)

15

⇤i = 1� E(Yi|Xi) = 1� E(Y |AiAj) = 1� logistic(�µ +Xi,a�a +Xi,d�d) (61)

⇤i = Z � E(Yi|Xi) (62)

Yi = E(Yi|Xi) + ⇤i (63)

Yi = ⇥�1(Yi|Xi) + ⇤i (64)

Yi =e�µ+xi,a�a+xi,d�d

1 + e�µ+xi,a�a+xi,d�d+ ⇤i (65)

0 =e0.2+(�1)2.2+(�1)0.2

1 + e0.2+(�1)2.2+(�1)0.2+ ⇤i (66)

1 =e0.2+(�1)2.2+(�1)0.2

1 + e0.2+(�1)2.2+(�1)0.2+ ⇤i (67)

0 =e0.2+(0)2.2+(1)0.2

1 + e0.2+(0)2.2+(1)0.2+ ⇤i (68)

0 =e0.2+(1)2.2+(�1)0.2

1 + e0.2+(1)2.2+(�1)0.2+ ⇤i (69)

Pr(Z) ⇥ bern(p) (70)

�[t+1] = �[t] + [xTWx]�1xT(y� ⇥�1(x�[t]) (71)

⇥�1(�µ + xi,a�a + xi,d�d) =e�µ+xi,a�a+xi,d�d

1 + e�µ+xi,a�a+xi,d�d(72)

⇥�1(x�) =ex�

1 + ex�(73)

⇤i = �0.6 (74)

⇤i = 0.4 (75)

⇤i = �0.1 (76)

⇤i = 0.9 (77)

⇤i = 0.1 (78)

⇤i = �0.9 (79)

Pr(⇤i) ⇥ bern(p|X)� E(Y |X) (80)

⇤i|(Yi = 0) = �E(Yi|Xi) (81)

⇤i|(Yi = 1) = 1� E(Yi|Xi) (82)

⇥�1(�[t]µ + xi,a�

[t]a + xi,d�

[t]d ) =

e�[t]µ +xi,a�

[t]a +xi,d�

[t]d

1 + e�[t]µ +xi,a�

[t]a +xi,d�

[t]d

(83)

15
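The arithmetic above can be checked with a short script. This is a minimal sketch, not course code, assuming the slide's example parameter values β_µ = 0.2, β_a = 2.2, β_d = 0.2; `logistic` is a helper defined here rather than a library function:

```python
import math

def logistic(z):
    # logistic (inverse logit) function: maps any real z into (0, 1)
    return math.exp(z) / (1.0 + math.exp(z))

# Example "true" parameter values from the slide (assumed known here;
# in practice they must be inferred)
beta_mu, beta_a, beta_d = 0.2, 2.2, 0.2

# Genotype A1A1 is coded X_a = -1, X_d = -1; observed phenotype Y_i = 0
x_a, x_d, y_i = -1, -1, 0

expected = logistic(beta_mu + x_a * beta_a + x_d * beta_d)  # E(Y_i | X_i)
error = y_i - expected                                      # epsilon_i

print(round(expected, 1), round(error, 1))  # 0.1 -0.1
```

The expected value logistic(−2.2) ≈ 0.0997 rounds to the 0.1 used on the slide, and ε_i = 0 − 0.1 = −0.1.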

Page 30

Calculating the components of an individual III

• For example, say we have an individual i that has genotype A1A1 and phenotype Yi = 1

• We know Xa = -1 and Xd = -1

• Say we also know that for the population, the true parameters (which we will not know in practice! We need to infer them!) are:

• We can then calculate E(Yi | Xi) and the error term for individual i:

Using the same model and parameters (β_µ = 0.2, β_a = 2.2, β_d = 0.2), we again have X_a = −1, X_d = −1 for genotype A1A1, but now Y_i = 1:

1 = e^(0.2 + (−1)(2.2) + (−1)(0.2)) / (1 + e^(0.2 + (−1)(2.2) + (−1)(0.2))) + ε_i

1 = 0.1 + 0.9

so E(Y_i | X_i) ≈ 0.1 as before, but the error term is now ε_i = 0.9.

Page 31

Calculating the components of an individual IV

• For example, say we have an individual i that has genotype A1A2 and phenotype Yi = 0

• We know Xa = 0 and Xd = 1

• Say we also know that for the population, the true parameters (which we will not know in practice! We need to infer them!) are:

• We can then calculate E(Yi | Xi) and the error term for individual i:

Using the same parameters (β_µ = 0.2, β_a = 2.2, β_d = 0.2), the heterozygote A1A2 is coded X_a = 0, X_d = 1, and Y_i = 0:

0 = e^(0.2 + (0)(2.2) + (1)(0.2)) / (1 + e^(0.2 + (0)(2.2) + (1)(0.2))) + ε_i

0 = 0.6 − 0.6

so E(Y_i | X_i) ≈ 0.6 and ε_i = −0.6.

Page 32

Calculating the components of an individual V

• For example, say we have an individual i that has genotype A2A2 and phenotype Yi = 0

• We know Xa = 1 and Xd = -1

• Say we also know that for the population, the true parameters (which we will not know in practice! We need to infer them!) are:

• We can then calculate E(Yi | Xi) and the error term for individual i:

Using the same parameters (β_µ = 0.2, β_a = 2.2, β_d = 0.2), the homozygote A2A2 is coded X_a = 1, X_d = −1, and Y_i = 0:

0 = e^(0.2 + (1)(2.2) + (−1)(0.2)) / (1 + e^(0.2 + (1)(2.2) + (−1)(0.2))) + ε_i

0 = 0.9 − 0.9

so E(Y_i | X_i) ≈ 0.9 and ε_i = −0.9.
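The three worked examples above differ only in the genotype coding, so the three genotypic values can be computed in one loop. A sketch, again assuming the example parameters; the `codings` dictionary is an illustrative name:

```python
import math

def logistic(z):
    # logistic function: maps any real number into (0, 1)
    return math.exp(z) / (1.0 + math.exp(z))

# Example parameter values from the slides (assumed known)
beta_mu, beta_a, beta_d = 0.2, 2.2, 0.2

# (X_a, X_d) codings for the three genotypes
codings = {"A1A1": (-1, -1), "A1A2": (0, 1), "A2A2": (1, -1)}

genotypic_values = {
    g: logistic(beta_mu + xa * beta_a + xd * beta_d)
    for g, (xa, xd) in codings.items()
}

for g in ("A1A1", "A1A2", "A2A2"):
    print(g, round(genotypic_values[g], 1))  # 0.1, 0.6, 0.9 respectively
```

These are exactly the expected phenotype values 0.1, 0.6, and 0.9 used in the slides' error-term calculations.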

Page 33

For the entire probability distributions I

• Recall that the error term is either −E(Yi | Xi) when Yi is zero or 1 − E(Yi | Xi) when Yi is one:

• For the entire distribution of the population, recall that

For a given genotype AjAk, the error ε has to make up the difference between the value of Y, which is zero or one, and the value of the logistic function. If the phenotype is Y = 1:

ε = 1 − E(Y | X) = 1 − logistic(β_µ + X_a β_a + X_d β_d)

or if for this same genotype the phenotype is Y = 0, then:

ε = −E(Y | X) = −logistic(β_µ + X_a β_a + X_d β_d)

The random variable ε therefore takes one of two values: the difference between the value of the logistic function at a genotype and one or zero (see class notes for a diagram).

As ε has only two states, it follows a (shifted) Bernoulli distribution, which is parameterized by a single parameter:

Pr(ε) ~ bern(p) − E(Y | X)

where p is the probability that the underlying Bernoulli random variable takes the value one. So what is p? It takes the value:

p = logistic(β_µ + X_a β_a + X_d β_d) = E(Y | X)

That is, ε takes the value 1 − logistic(β_µ + X_a β_a + X_d β_d) with probability logistic(β_µ + X_a β_a + X_d β_d), and the value −logistic(β_µ + X_a β_a + X_d β_d) with probability 1 − logistic(β_µ + X_a β_a + X_d β_d). The error distribution therefore differs depending on the expected value of the phenotype (= genotypic value) associated with a specific genotype.

While this may look complicated, this parameter actually allows a simple interpretation. If the value of the logistic regression function is low (i.e. closer to zero), the expected value of the phenotype is low, and the probability of Y being zero is greater (and vice versa). Thus, the value of the logistic regression function is directly related to the probability of being in one phenotypic state (one) or the other (zero). This also provides a clear biological interpretation of the genotypic value for a case-control phenotype: it is the probability of being a case or a control (sick or healthy) conditional on the genotype of an individual.
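The two-valued error distribution described above can be sketched in code. `error_distribution` is a hypothetical helper for this lecture's model, and the parameters are the slides' example values:

```python
import math

def logistic(z):
    # logistic function: maps a real number into (0, 1)
    return math.exp(z) / (1.0 + math.exp(z))

# Example parameter values from the slides (assumed known)
beta_mu, beta_a, beta_d = 0.2, 2.2, 0.2

def error_distribution(xa, xd):
    """Two possible epsilon values and their probabilities for a genotype
    coded (xa, xd): epsilon = 1 - p when Y = 1 (prob p), -p when Y = 0 (prob 1 - p)."""
    p = logistic(beta_mu + xa * beta_a + xd * beta_d)  # p = E(Y | X)
    return [(1.0 - p, p), (-p, 1.0 - p)]

# Heterozygote A1A2 (X_a = 0, X_d = 1): epsilon is 0.4 or -0.6
dist = error_distribution(0, 1)
assert abs(sum(eps * prob for eps, prob in dist)) < 1e-12  # E(epsilon) = 0
for eps, prob in dist:
    print(round(eps, 1), round(prob, 1))
```

Note the built-in check that the error has mean zero, as any regression error term must: (1 − p)p + (−p)(1 − p) = 0.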

Page 34

• Recall that the error term is either the negative of E(Yi | Xi) when Yi is zero and 1- E(Yi | Xi) when Yi is one:

• For the entire distribution of the population, recall that

Y , which is zero or one, and the value of this function. For a given genotype Aj

Ak

, thisrandom variable has to take one of two values. For a genotype A

j

Ak

, the value of thephenotype Y = 1:

✏ = �E(Y |X) = �E(Y |Ai

Aj

) = �logistic(�µ

+Xa

�a

+Xd

�d

) (19)

or if for this same genotype A_j A_k the value of the phenotype is Y = 0, then:

ε = −E(Y|X) = −E(Y|A_j A_k) = −logistic(β_μ + X_a β_a + X_d β_d)    (20)

For an individual i and marker l, when the phenotype is Y_i = 1:

ε_{i,l} = 1 − E(Y_i|X_i) = 1 − E(Y|A_j A_k) = 1 − logistic(β_μ + X_{i,a} β_a + X_{i,d} β_d)    (21)

or if for this same genotype A_j A_k the value of the phenotype is Y_i = 0, then:

ε_{i,l} = −E(Y_i|X_i) = −E(Y|A_j A_k) = −logistic(β_μ + X_{i,a} β_a + X_{i,d} β_d)    (22)

The random variable ε therefore takes one of two values: the difference between the value of the logistic function at a genotype and either one or zero (see class notes for a diagram).

Since ε has only two states, this random variable has a (shifted) Bernoulli distribution. Note that a Bernoulli distribution is parameterized by a single parameter:

Pr(ε_{i,l}) ∼ bern(p|X) − E(Y|X)

where the parameter p is the probability that the random variable takes the value 'one'. So what is the parameter p? It takes the following value:

p = logistic(β_μ + X_a β_a + X_d β_d)    (23)

p = E(Y|X)    (24)

so that ε takes the value 1 − logistic(β_μ + X_a β_a + X_d β_d) with probability logistic(β_μ + X_a β_a + X_d β_d), and the value −logistic(β_μ + X_a β_a + X_d β_d) with probability 1 − logistic(β_μ + X_a β_a + X_d β_d). The error is therefore different depending on the expected value of the phenotype (= genotypic value) associated with a specific genotype.

While this may look complicated, this parameterization allows a simple interpretation. If the value of the logistic regression function is low (i.e. closer to zero), the expected value of the phenotype is low, and the probability of the phenotype being zero is greater (and vice versa). Thus the value of the logistic regression function is directly related to the probability of being in one phenotypic state (one) or the other (zero). This also gives the genotypic value a clear biological interpretation for a case-control phenotype: it is the probability of being a case or a control (sick or healthy) conditional on the genotype of an individual.
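To make the two error states concrete, here is a minimal Python sketch (the function names and the β values are illustrative assumptions, not taken from the notes). It checks that ε takes the value 1 − p with probability p and −p with probability 1 − p, so that E[ε] = 0 for any genotype:

```python
import math

def logistic(z):
    """logistic(z) = e^z / (1 + e^z): the expected phenotype E(Y|X)."""
    return math.exp(z) / (1.0 + math.exp(z))

def error_states(beta_mu, beta_a, beta_d, x_a, x_d):
    """Return the two possible error values for one genotype.

    With p = E(Y|X) = logistic(beta_mu + x_a*beta_a + x_d*beta_d),
    epsilon = 1 - p when Y = 1 and epsilon = -p when Y = 0.
    """
    p = logistic(beta_mu + x_a * beta_a + x_d * beta_d)
    return 1.0 - p, -p

# Illustrative (hypothetical) parameter values:
eps_one, eps_zero = error_states(0.5, 1.0, -0.25, x_a=1, x_d=-1)
p = 1.0 - eps_one
# E[epsilon] = p(1 - p) + (1 - p)(-p) = 0:
print(round(p * eps_one + (1.0 - p) * eps_zero, 12))  # prints 0.0
```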

For example, if the genotypic values p = E(Y|X) for the three genotypes are

p = 0.1    (25)

p = 0.6    (26)

p = 0.9    (27)

then the corresponding possible values of the error at marker l are:

ε_l = −0.1 or ε_l = 0.9

ε_l = −0.6 or ε_l = 0.4

ε_l = −0.9 or ε_l = 0.1

In each case ε_l = 1 − p occurs with probability p (when Y = 1) and ε_l = −p occurs with probability 1 − p (when Y = 0).
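As a check on these example values: the worked example later in the notes uses β_μ = 0.2, β_a = 2.2, β_d = 0.2 with genotype codings (X_a, X_d) of (−1, −1), (0, 1), and (1, −1). A short sketch showing that these reproduce p ≈ 0.1, 0.6, 0.9 (rounded to one decimal) and the error pairs above:

```python
import math

def logistic(z):
    """logistic(z) = e^z / (1 + e^z)."""
    return math.exp(z) / (1.0 + math.exp(z))

# beta values and (Xa, Xd) codings from the notes' worked example:
beta_mu, beta_a, beta_d = 0.2, 2.2, 0.2
for x_a, x_d in [(-1, -1), (0, 1), (1, -1)]:
    p = logistic(beta_mu + x_a * beta_a + x_d * beta_d)
    # error is 1 - p (when Y = 1) or -p (when Y = 0)
    print(round(p, 1), round(1 - p, 1), round(-p, 1))
# prints:
# 0.1 0.9 -0.1
# 0.6 0.4 -0.6
# 0.9 0.1 -0.9
```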

ε_i = 1 − E(Y_i|X_i) = 1 − E(Y|A_i A_j) = 1 − logistic(β_μ + X_{i,a} β_a + X_{i,d} β_d)    (61)

ε_i = Z − E(Y_i|X_i)    (62)

Y_i = E(Y_i|X_i) + ε_i    (63)

Y_i = γ⁻¹(β_μ + x_{i,a} β_a + x_{i,d} β_d) + ε_i    (64)

Y_i = e^(β_μ + x_{i,a} β_a + x_{i,d} β_d) / (1 + e^(β_μ + x_{i,a} β_a + x_{i,d} β_d)) + ε_i    (65)

0 = e^(0.2 + (−1)2.2 + (−1)0.2) / (1 + e^(0.2 + (−1)2.2 + (−1)0.2)) + ε_i    (66)

1 = e^(0.2 + (−1)2.2 + (−1)0.2) / (1 + e^(0.2 + (−1)2.2 + (−1)0.2)) + ε_i    (67)

0 = e^(0.2 + (0)2.2 + (1)0.2) / (1 + e^(0.2 + (0)2.2 + (1)0.2)) + ε_i    (68)

0 = e^(0.2 + (1)2.2 + (−1)0.2) / (1 + e^(0.2 + (1)2.2 + (−1)0.2)) + ε_i    (69)

Pr(Z) ∼ bern(p)    (70)

β^[t+1] = β^[t] + [xᵀ W x]⁻¹ xᵀ (y − γ⁻¹(x β^[t]))    (71)

γ⁻¹(β_μ + x_{i,a} β_a + x_{i,d} β_d) = e^(β_μ + x_{i,a} β_a + x_{i,d} β_d) / (1 + e^(β_μ + x_{i,a} β_a + x_{i,d} β_d))    (72)

γ⁻¹(x β) = e^(x β) / (1 + e^(x β))    (73)

ε_i = −0.6    (74)

ε_i = 0.4    (75)

ε_i = −0.1    (76)

ε_i = 0.9    (77)

ε_i = 0.1    (78)

ε_i = −0.9    (79)

Pr(ε_i) ∼ bern(p|X) − E(Y|X)    (80)

ε_i|(Y_i = 0) = −E(Y_i|X_i)    (81)

ε_i|(Y_i = 1) = 1 − E(Y_i|X_i)    (82)

γ⁻¹(β_μ^[t] + x_{i,a} β_a^[t] + x_{i,d} β_d^[t]) = e^(β_μ^[t] + x_{i,a} β_a^[t] + x_{i,d} β_d^[t]) / (1 + e^(β_μ^[t] + x_{i,a} β_a^[t] + x_{i,d} β_d^[t]))    (83)
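Equation (71) above is the IRLS update covered in the next lecture. A minimal NumPy sketch of that update, checked against simulated genotypes with the example parameters β = (0.2, 2.2, 0.2) (the simulation setup, sample size, and iteration count are illustrative assumptions):

```python
import numpy as np

def logistic(z):
    """Inverse link gamma^{-1}(z) = e^z / (1 + e^z)."""
    return 1.0 / (1.0 + np.exp(-z))

def irls(x, y, n_iter=25):
    """IRLS update of equation (71):
    beta[t+1] = beta[t] + [x^T W x]^{-1} x^T (y - gamma^{-1}(x beta[t])),
    where W is diagonal with W_ii = p_i (1 - p_i).
    """
    beta = np.zeros(x.shape[1])
    for _ in range(n_iter):
        p = logistic(x @ beta)
        w = p * (1.0 - p)                                 # diagonal of W
        beta = beta + np.linalg.solve(x.T @ (w[:, None] * x), x.T @ (y - p))
    return beta

# Simulated check: genotype codings Xa in {-1, 0, 1}, Xd = 1 for
# heterozygotes and -1 for homozygotes, as in equations (66)-(69).
rng = np.random.default_rng(1)
n = 5000
xa = rng.integers(-1, 2, size=n)
xd = np.where(xa == 0, 1, -1)
x = np.column_stack([np.ones(n), xa, xd])
beta_true = np.array([0.2, 2.2, 0.2])
y = (rng.random(n) < logistic(x @ beta_true)).astype(float)
beta_hat = irls(x, y)
print(beta_hat)  # should be close to (0.2, 2.2, 0.2)
```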


For the entire probability distributions II


• Recall that the error term is the negative of E(Y_i | X_i) when Y_i is zero, and 1 − E(Y_i | X_i) when Y_i is one:

• For the entire distribution of the population, recall that:

ε_i|(Y_i = 0) = −E(Y_i|X_i)    (81)

ε_i|(Y_i = 1) = 1 − E(Y_i|X_i)    (82)

Pr(ε_i) ∼ bern(p|X) − E(Y|X)    (80)

For the entire probability distributions III


That’s it for today

• See you on Thurs.!