Session 1 - Review of Basic Concepts | Session 2 - Linear Mixed Model | Session 3 - Linear Mixed Model | Session 4 - Estimation | Sessions 5-11
Linear Model
The classical linear model is defined by
Y = Xβ + ε
where
Y is an observable data (response variable) vector
β is a vector of unknown parameters
X is the design matrix (for factors and regressors)
ε is a vector of random errors and ε ∼ N(0, σ2I)
Then E(Y) = Xβ and Var(Y) = σ²I
The ordinary least-squares estimator (the same as MLE) of β is
β̂ = (X′X)⁻¹X′Y
Disadvantages
too restrictive for many typical data sets
the error structure in real-world experiments is often more complex than Σ = σ²I
Clarice G.B. Demétrio and Cristian Villegas, Modelos Mistos e Componentes de Variância
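As a quick numerical illustration (not from the slides; the data below are invented), the estimator β̂ = (X′X)⁻¹X′Y can be computed directly in NumPy and cross-checked against the built-in least-squares routine:

```python
import numpy as np

# Hypothetical toy data: n = 5 observations, an intercept and one regressor.
X = np.array([[1., 0.], [1., 1.], [1., 2.], [1., 3.], [1., 4.]])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# beta_hat = (X'X)^{-1} X'y; solve() is preferred to forming an explicit inverse.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# The same estimate via the library routine, as a cross-check.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Both routes give the same fitted coefficients when X has full column rank.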
Expectation and variance - properties
1. Expected value
Definition
The expected value or mean of a random variable Y, denoted by E(Y), is defined for a continuous random variable with density f_Y by

E(Y) = ∫_{−∞}^{+∞} y f_Y(y) dy.

Properties
Let X and Y be two random variables and a, b ∈ R be constants. Then
1. E(a) = a.
2. E(aX) = aE(X).
3. E(aX ± bY) = aE(X) ± bE(Y).
4. E(aX ± b) = aE(X) ± b.
5. E[(X − a)²] = E(X²) − 2aE(X) + a².
6. E(XY) = E(X)E(Y), for X and Y independent random variables.
7. E(∑_{i=1}^n Yᵢ) = ∑_{i=1}^n E(Yᵢ).
2. Variance
Definition
Let Y be a random variable and assume that µ = E(Y) exists. The variance of Y is the number denoted by Var(Y) and defined by Var(Y) = E[(Y − µ)²].
For a continuous random variable Y it is calculated by

Var(Y) = ∫_{−∞}^{+∞} (y − µ)² f_Y(y) dy

or

Var(Y) = ∫_{−∞}^{+∞} y² f_Y(y) dy − [∫_{−∞}^{+∞} y f_Y(y) dy]².
Properties
Let X and Y be two random variables and a, b ∈ R be constants. Then
1. Var(aY + b) = a²Var(Y)
2. Var(a) = 0
3. Var(aY) = a²Var(Y)
4. Var(−Y) = Var(Y)
5. Var(X ± Y) = Var(X) + Var(Y), for X and Y independent random variables (the variances add for both the sum and the difference)
6. Var(∑_{i=1}^n aᵢYᵢ) = ∑_{i=1}^n aᵢ²Var(Yᵢ), for Yᵢ independent random variables.
3. Covariance
Definition
The covariance between Y and Z is defined by
Cov(Y ,Z ) = E(YZ )− E(Y )E(Z ).
Properties
1. Cov(aY, bZ) = ab Cov(Y, Z)
2. ∑_{i=1}^n Cov(aᵢYᵢ, bᵢZᵢ) = ∑_{i=1}^n aᵢbᵢ Cov(Yᵢ, Zᵢ)
3. Var(∑_{i=1}^n aᵢYᵢ) = ∑_{i=1}^n aᵢ²Var(Yᵢ) + 2 ∑_{i<i′} aᵢaᵢ′ Cov(Yᵢ, Yᵢ′)
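Property 3 is easy to verify numerically. The sketch below (illustrative only; the covariance matrix and coefficients are invented) compares Var(∑aᵢYᵢ) = a′Va computed directly against the expansion above:

```python
import numpy as np

# Hypothetical covariance matrix V for Y = (Y1, Y2, Y3) and coefficients a.
V = np.array([[2.0, 0.5, 0.3],
              [0.5, 1.0, 0.2],
              [0.3, 0.2, 3.0]])
a = np.array([1.0, -2.0, 0.5])

# Direct form: Var(a'Y) = a' V a.
direct = a @ V @ a

# Expanded form: sum a_i^2 Var(Y_i) + 2 * sum_{i<i'} a_i a_{i'} Cov(Y_i, Y_{i'}).
expanded = np.sum(a**2 * np.diag(V))
for i in range(3):
    for j in range(i + 1, 3):
        expanded += 2 * a[i] * a[j] * V[i, j]
```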
Explanatory variables
2 types of explanatory variables:
1. factors
↪ interest is in attributing variability in y to the various categories of the factor
Example: corn yields from two replicates of three varieties (A/B/C) in a completely randomized design

Yᵢⱼ = µ + τᵢ + εᵢⱼ,  i = 1, 2, 3,  j = 1, 2

↪ In matrix notation, this model can be expressed as:

[ y₁ ]   [ 1₂ ]     [ 1₂ 0₂ 0₂ ] [ τ₁ ]   [ ε₁ ]
[ y₂ ] = [ 1₂ ] µ + [ 0₂ 1₂ 0₂ ] [ τ₂ ] + [ ε₂ ]
[ y₃ ]   [ 1₂ ]     [ 0₂ 0₂ 1₂ ] [ τ₃ ]   [ ε₃ ]

where yᵢ = [yᵢ₁, yᵢ₂]′ is the vector of observations of variety i; 1₂ and 0₂ are 2-dimensional column vectors of 1′s and 0′s, respectively; and εᵢ = [εᵢ₁, εᵢ₂]′ is the vector of residuals associated with variety i.
↪ parameter values give the impact of the factor's levels on the response variable
factors may be crossed or nested
factors may have main effects and interaction effects
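The block-diagonal τ columns of this design matrix have a convenient Kronecker form, I₃ ⊗ 1₂. A short NumPy sketch of the construction (nothing is fitted here):

```python
import numpy as np

t, r = 3, 2                          # 3 varieties, 2 replicates each
ones_r = np.ones((r, 1))

X_mu = np.ones((t * r, 1))           # column of 1's for mu
X_tau = np.kron(np.eye(t), ones_r)   # block-diagonal 1_2 columns for tau_1..tau_3

X = np.hstack([X_mu, X_tau])         # full design matrix [1 | X_tau]
```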
2. regressors
↪ interest is in attributing variability in Y to changes in the values of a continuous covariable
Example: changes due to weight x

Yᵢ = β₀ + β₁xᵢ + εᵢ

↪ In matrix notation, this model can be expressed as:

[ y₁ ]   [ 1 x₁ ]          [ ε₁ ]
[ y₂ ] = [ 1 x₂ ] [ β₀ ] + [ ε₂ ]
[ ⋮  ]   [ ⋮  ⋮ ] [ β₁ ]   [ ⋮  ]
[ yₙ ]   [ 1 xₙ ]          [ εₙ ]

↪ parameter values give the impact of an increase in x on the response variable
Terminology:
Multiple Linear Regression / ANOVA / ANCOVA
if matrix X contains only regressors, models are called regression models
if matrix X contains only factors, models are called Analysis of Variance (ANOVA) models (X is a matrix of 1's and 0's)
if matrix X contains both regressors and factors, models are called Analysis of Covariance (ANCOVA) models.
Estimation
Let's assume a linear model:

Y = Xβ + ε

Parameters to be estimated are β and σ. In all the following, X is assumed to be of full rank: rank(X) = K.

Least squares approach: min_β ||Y − Xβ||²

β̂_ls = (X′X)⁻¹X′Y

best linear unbiased estimator of β:

β̂_ls ∼ N(β, σ²(X′X)⁻¹)

best quadratic unbiased estimator of σ²:

σ̂²_ls = (1/(n − K)) (Y − Xβ̂_ls)′(Y − Xβ̂_ls)  and  σ̂²_ls ∼ (σ²/(n − K)) χ²_{n−K}
Maximum likelihood approach
Likelihood

L(β, σ; y) = ∏_{i=1}^n (2πσ²)^{−1/2} exp[−(yᵢ − xᵢ′β)²/(2σ²)]

Log-likelihood

ℓ(β, σ; y) = −(n/2) log(2πσ²) − (1/(2σ²)) (y − Xβ)′(y − Xβ)

Maximum log-likelihood

∂ℓ/∂β = 0 and ∂ℓ/∂σ = 0 ⇒

β̂_ml = (X′X)⁻¹X′Y

σ̂²_ml = (1/n) (Y − Xβ̂_ml)′(Y − Xβ̂_ml)
β̂_ls = β̂_ml, unbiased:

E[β̂_ls] = E[β̂_ml] = E[(X′X)⁻¹X′Y] = β

but σ̂²_ls ≠ σ̂²_ml

σ̂²_ls is unbiased
- it is calculated on the orthogonal complement of the column space of X
- it takes into account the difference between Y and its projection Xβ̂ onto that space, and the loss of degrees of freedom due to the estimation of β

σ̂²_ml is biased
- joint estimation of σ² and β
- it does not take into account the loss of degrees of freedom due to the estimation of β
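The bias of σ̂²_ml and the unbiasedness of σ̂²_ls can be seen in a small simulation (a sketch with settings of my own choosing; the seeded generator makes the result reproducible):

```python
import numpy as np

rng = np.random.default_rng(0)
n, K, sigma2 = 10, 2, 4.0
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
beta = np.array([1.0, 0.5])

# Repeatedly simulate Y, fit by OLS, and accumulate the residual sum of squares.
ss = []
for _ in range(20000):
    y = X @ beta + rng.normal(0.0, np.sqrt(sigma2), size=n)
    b = np.linalg.solve(X.T @ X, X.T @ y)
    ss.append(np.sum((y - X @ b) ** 2))

rss_mean = np.mean(ss)
sigma2_ls = rss_mean / (n - K)   # average of unbiased estimator, close to sigma2
sigma2_ml = rss_mean / n         # average of ML estimator, close to sigma2*(n-K)/n
```

With n = 10 and K = 2 the ML estimator is shrunk by the factor (n − K)/n = 0.8 on average.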
Note that T̄ = M_T Y is a vector whose first r₁ elements are equal to the mean of the Yᵢ's for the first treatment, whose next r₂ elements are equal to the mean of those for the second treatment, and so on.

M_T is called the treatment mean operator, as it computes the treatment means from the vector to which it is applied and replaces each element of this vector with its treatment mean.

Ψ̂_T = X_T θ̂ = X_T(X_T′X_T)⁻¹X_T′Y = M_T Y = T̄ = [T̄₁1′_{r₁}, T̄₂1′_{r₂}, …, T̄ₜ1′_{rₜ}]′

For the observed values y of Y, t̄ = M_T y is the estimate of Ψ_T.
Other types of restriction
∑_{i=1}^t τᵢ = 0

θ̂ = [µ̂, τ̂₁, τ̂₂, …, τ̂ₜ]′ = [Ȳ, T̄₁ − Ȳ, T̄₂ − Ȳ, …, T̄ₜ − Ȳ]′
Sums of squares for the analysis of variance
LM theory: Y′Y = Y′HY + Y′(I − H)Y, with Y′HY = (1/n)Y′JY + α̂′X_T′Y

From Chapter XII of Chris Brien's notes, an SSq is the SSq of the elements of a vector and can be written as the product of the transpose of a column vector with the original column vector.

For a completely randomized design, the sums of squares in the analysis of variance for Units, Treatments and Residual are given by quadratic forms, respectively.
Degrees of freedom of the sums of squares for anANOVA
Definition: The trace of a square matrix is the sum of its diagonal elements.
Definition: The degrees of freedom of a sum of squares is the rank of the idempotent matrix of its quadratic form. That is, the degrees of freedom of Y′AY is given by rank(A).
Lemma: For B idempotent, rank(B) = trace(B).
Lemma: Let c be a scalar and A, B and C be matrices. Then, when the appropriate operations are defined, we have
(i) trace(A) = trace(A′);
(ii) trace(cA) = c trace(A);
(iii) trace(A + B) = trace(A) + trace(B);
(iv) trace(AB) = trace(BA);
(v) trace(ABC) = trace(CAB) = trace(BCA);
(vi) trace(A ⊗ B) = trace(A) trace(B);
(vii) trace(A′A) = 0 if and only if A = 0.
Expected mean squares
We have an ANOVA in which we use F (the ratio of MSqs) to decide between models.
But why is this ratio appropriate?
One way of answering this question is to look at what the MSqs measure.
We use the expected values of the MSqs, i.e. the E[MSq]s, to do this.
To derive the expected values, we note that the general form of a mean square is a quadratic form divided by its degrees of freedom, Y′QY/ν.
Expectation of quadratic forms
Definition: A quadratic form in a vector Y is a scalar function of Y of the form Y′AY, where A is called the matrix of the quadratic form.

Expectation: Let Y be an n × 1 vector of random variables with

E[Y] = Ψ and Var[Y] = V,

where Ψ is an n × 1 vector of expected values and V is an n × n matrix. Let A be an n × n matrix of real values. Then

E(Y′AY) = trace(AV) + Ψ′AΨ
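A Monte Carlo check of this identity, with toy matrices of my own choosing and a seeded generator:

```python
import numpy as np

rng = np.random.default_rng(1)
psi = np.array([1.0, -1.0, 2.0])
L = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.2, 0.3, 1.0]])
V = L @ L.T                        # a valid covariance matrix
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])

# Theoretical value: trace(AV) + psi' A psi.
theory = np.trace(A @ V) + psi @ A @ psi

# Simulated value: average of Y'AY over many draws Y = psi + L z, z ~ N(0, I).
Z = rng.standard_normal((200000, 3))
Y = psi + Z @ L.T
sim = np.mean(np.einsum('ni,ij,nj->n', Y, A, Y))
```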
Distribution of a quadratic form
Theorem: Let A be an n × n symmetric matrix of rank ν and Y be an n × 1 normally distributed random vector with E[AY] = 0, Var[Y] = V and E[Y′AY/ν] = λ. Then (1/λ)Y′AY follows a χ²-distribution with ν = rank(A) degrees of freedom if and only if A is idempotent.

The mean and variance of a χ²-distribution with ν degrees of freedom are equal to ν and 2ν, respectively.
Cochran’s theorem (1934)
Theorem: Let Y be an n × 1 normally distributed random vector with E[Y] = Xθ and Var[Y] = V. Let Y′A₁Y, …, Y′AₕY be a collection of h quadratic forms where, for each i = 1, 2, …, h, Aᵢ is symmetric, of rank νᵢ, E[AᵢY] = 0 and E[Y′AᵢY/νᵢ] = λᵢ.
If any two of the following three statements are true:
1. all Aᵢ are idempotent;
2. ∑_{i=1}^h Aᵢ is idempotent;
3. AᵢAⱼ = 0, i ≠ j;
then for each i, Y′AᵢY/νᵢ follows a χ²-distribution with νᵢ degrees of freedom. Furthermore, Y′AᵢY and Y′AⱼY are independent for i ≠ j, and ∑_{i=1}^h νᵢ = ν, where ν denotes the rank of ∑_{i=1}^h Aᵢ.
Distribution of a ratio of independent χ2-distributions
Theorem: Let U₁ and U₂ be two random variables distributed χ² with ν₁ and ν₂ degrees of freedom. Then, provided U₁ and U₂ are independent, the random variable

W = (U₁/ν₁) / (U₂/ν₂)

is distributed as Snedecor's F with ν₁ and ν₂ degrees of freedom.

Note: Two quadratic forms Y′AᵢY and Y′AⱼY are independent if AᵢAⱼ = 0, i ≠ j.
It is possible to show (see Chapter XII of Chris Brien's notes) that for the completely randomized design:

- Y′Q_{U_Res}Y / σ² ∼ χ²_{n−t}
- Y′Q_T Y / σ² ∼ χ²_{t−1}, under H₀
- Y′Q_{U_Res}Y / σ² and Y′Q_T Y / σ² are independent
- F = Treatments MSq / Residual MSq ∼ F_{t−1, n−t}
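The conditions of Cochran's theorem behind these results can be verified directly for a small CRD. Below, Q_T and Q_Res are built from the usual grand-mean and treatment-mean projectors (my construction, for t = 3 treatments with r = 4 replicates):

```python
import numpy as np

t, r = 3, 4
n = t * r

J_n = np.ones((n, n)) / n                      # grand-mean projector
M_T = np.kron(np.eye(t), np.ones((r, r)) / r)  # treatment-mean operator

Q_T = M_T - J_n           # treatment quadratic-form matrix
Q_Res = np.eye(n) - M_T   # residual quadratic-form matrix

# Cochran conditions: idempotency and mutual orthogonality.
idem_T = np.allclose(Q_T @ Q_T, Q_T)
idem_R = np.allclose(Q_Res @ Q_Res, Q_Res)
ortho = np.allclose(Q_T @ Q_Res, 0.0)

# Degrees of freedom = rank = trace for idempotent matrices.
df_T = round(np.trace(Q_T))      # t - 1
df_Res = round(np.trace(Q_Res))  # n - t
```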
Standard errors of sample variances
Consider a random sample Yᵢ, i = 1, 2, …, n, from a normal distribution with mean E(Yᵢ) = µ and variance Var(Yᵢ) = σ².

The sample mean Ȳ = ∑ᵢ Yᵢ/n and the variance S² = ∑ᵢ (Yᵢ − Ȳ)²/(n − 1) are unbiased estimators of µ and σ², respectively.

(n − 1)S²/σ² follows a χ²-distribution with (n − 1) df

E(S²) = σ² and Var(S²) = 2σ⁴/(n − 1)

Let MS denote a mean square with ν df. If νMS/E(MS) ∼ χ²_ν, the variance of MS is Var(MS) = 2E²(MS)/ν. Hence,

Var(MS) = E(MS²) − E²(MS) = E(MS²) − (ν/2)Var(MS).

Thus, (ν + 2)Var(MS)/2 = E(MS²), and an unbiased estimator of Var(MS) is given by

2MS² / (ν + 2)

As an illustration, the estimator of the variance of the sample variance S² is V̂ar(S²) = 2S⁴/(n + 1).
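A seeded simulation illustrating E(S²) = σ² and Var(S²) = 2σ⁴/(n − 1) (the settings are my own):

```python
import numpy as np

rng = np.random.default_rng(2)
n, sigma2 = 8, 2.0

# 100000 samples of size n; S^2 uses the (n - 1) divisor (ddof=1).
samples = rng.normal(0.0, np.sqrt(sigma2), size=(100000, n))
s2 = samples.var(axis=1, ddof=1)

mean_s2 = s2.mean()                 # should be close to sigma2 (unbiased)
var_s2 = s2.var()                   # should be close to 2*sigma2**2/(n-1)
theory_var = 2 * sigma2**2 / (n - 1)
```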
Linear combinations of χ2 variables
Consider mean squares MSᵢ, i = 1, 2, …, k, independent, with νᵢ degrees of freedom, such that νᵢMSᵢ/E(MSᵢ) ∼ χ²_{νᵢ} independently.

Estimators of variance components usually take the form MS = ∑ᵢ aᵢMSᵢ, where the aᵢ are constants.

Following Smith (1938), Satterthwaite (1946) considers νMS/E(MS) ∼ χ²_ν approximately.

As a consequence, Var(MS) = 2E²(MS)/ν.

However, Var(MS) = ∑ᵢ aᵢ²Var(MSᵢ) = 2∑ᵢ [aᵢ²E²(MSᵢ)/νᵢ].

Equating the two expressions for Var(MS),

ν = E²(MS) / ∑ᵢ [aᵢ²E²(MSᵢ)/νᵢ] = [∑ᵢ aᵢE(MSᵢ)]² / ∑ᵢ [aᵢ²E²(MSᵢ)/νᵢ]

In practice ν is obtained from (∑ᵢ aᵢMSᵢ)² / ∑ᵢ (aᵢ²MSᵢ²/νᵢ).
Goodness-of-fit criteria
Adjusted R-square

R²_adj = 1 − [∑_{i=1}^n (yᵢ − xᵢ′β̂)² / (n − K)] / [∑_{i=1}^n (yᵢ − ȳ)² / (n − 1)]

Akaike's Information Criterion

AIC = −2 log L(β̂_ml, σ̂_ml; y) + 2K

Bayesian Information Criterion

BIC = −2 log L(β̂_ml, σ̂_ml; y) + K log(n)
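These criteria follow directly from the ML fit of the normal linear model, since the maximized log-likelihood reduces to −(n/2)log(2πσ̂²_ml) − n/2. A sketch with fabricated data (here K counts only the regression coefficients, matching the slide; some software also counts σ):

```python
import numpy as np

# Fabricated data for illustration only.
X = np.column_stack([np.ones(6), np.array([0., 1., 2., 3., 4., 5.])])
y = np.array([0.9, 2.1, 2.8, 4.2, 5.1, 5.9])
n, K = X.shape

beta_ml = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_ml
sigma2_ml = resid @ resid / n       # ML variance estimate (divisor n)

# Maximized log-likelihood: RSS = n * sigma2_ml, so the second term is n/2.
loglik = -0.5 * n * np.log(2 * np.pi * sigma2_ml) - 0.5 * n

aic = -2 * loglik + 2 * K
bic = -2 * loglik + K * np.log(n)
```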
SAS procedure
Proc GLM data = data;
  class x;  * if x is a factor;
  model y = x;
  output out=Regr p=Predite r=Residu;
run;
Checking
Gaussian hypothesis
Graphical
histogram, QQ-plot

proc univariate data=Regr;
  var Residu;
  histogram Residu / normal;
  qqplot Residu / normal(mu=est sigma=est color=red L=1);
  inset mean std / cfill=blank format=5.2;
run;
Variable selection in multiple regression
The main approaches
Forward selection, which involves starting with no variables in the model, trying out the variables one by one and including them if they are statistically significant.
Backward elimination, which involves starting with all candidate variables and testing them one by one for statistical significance, deleting any that are not significant.
Methods that are a combination of the above, testing at each stage for variables to be included or excluded.
SAS Reg procedure
proc reg;
  model Y = x / selection = adjrsq bic;
  model Y = x / selection = stepwise;
run;
Linear Mixed Models
Linear mixed effects models have been widely used in the analysis of data where responses are clustered around some random effects, such that there is a natural dependence between observations in the same cluster.
For example, consider repeated measurements taken on each subject in longitudinal data, or observations taken on members of the same family in a genetic study.
They can easily accommodate covariances among observations.
They handle correlated data by incorporating random effects and estimating their associated variance components to model variability over and above the residual error.
Because of the estimation procedures usually involved, mixed-model approaches can circumvent the problems associated with unbalanced and incomplete data.
Maize trial
Example
5 progenies of a population of maize progenies were investigated
the trial was conducted by completely randomizing 4 replicates of each progeny
the response variable was the weight of corn-cob (kg/10 m²)
At crossing, genetic effects may reasonably be assumed to be normal random variables.
During the early stages of a selection programme, the nature of genotypic effects may still be regarded as random.
In general, the interest is in the heritability of a trait.
Penicillin yield (Brien, 2009)
Example
The effects of four treatments on the yield of penicillin are to be investigated. It is known that corn steep liquor, an important raw material in producing penicillin, is highly variable from one blending of it to another. To ensure that the results of the experiment apply to more than one blend, five blends (blocks) are to be used in the experiment. The trial was conducted using the same blend in four flasks and randomizing the four treatments to these four flasks.

- interest, of course, in each particular treatment used
- no interest in each blend, since the blends depend strongly on the circumstances
- the blend effect can be viewed as a sample of a random blend effect (levels are chosen at random from an infinite set of blend levels)
- interest in estimating the variance of the blend effect as a source of random variation in the data
- the four flasks with the same blend share something, which presumably violates the assumption of independence
Calf birth weight
Example
In an animal breeding experiment 20 unrelated cows were subjected to superovulation and artificial insemination. Each group of 4 cows was inseminated with a different sire, with a total of 5 unrelated sires. Out of each mating (combination of dam and sire), three calves were generated and their yearling weights were recorded.

- no interest in each sire or dam, since they depend strongly on the circumstances
- the sire effect can be viewed as a sample of a random sire effect (levels are chosen at random from an infinite set of sire levels)
- the dam effect can be viewed as a sample of a random dam effect (levels are chosen at random from an infinite set of dam levels)
- interest in estimating the variance of the sire and dam effects as sources of random variation in the data
- the three calves with the same parents share something, which presumably violates the assumption of independence
Nesting structure (dams within sires):
S1: D1 D2 D3 D4   …   S5: D17 D18 D19 D20
Fixed vs Random effects
Random effect: A factor will be designated as random if it is considered appropriate to use a probability distribution function to describe the distribution of effects associated with the population set of levels.
- random effects influence only the variance of the response variable
- infinite set of levels (only a finite subset present), and interest lies more in the variance induced by these levels than in the estimation of the levels themselves
- blends in the penicillin example, progenies in the maize trial

Fixed effect: A factor will be designated as fixed if it is considered appropriate to have the effects associated with the population set of levels for the factor differ in an arbitrary manner, rather than being distributed according to a regularly-shaped p.d.f.
- fixed effects influence only the mean of the response variable
- finite set of levels, and interest lies in the estimation of each particular level's effect
- treatments in the penicillin example
In practice
Random if
i. large number of population levels, and
ii. random behaviour;
iii. they occur in two contrasting kinds of circumstances:
- observational studies or designed experiments with hierarchical structure (School/Class/Student; Sire/Dam/Calf)
- designed experiments with different spatial or temporal scales (longitudinal studies)

Fixed if
i. small or large number of population levels, and
ii. systematic behaviour

↪ Consequence: data collected within each level of the random-effect factor are linked to the same realization of a random variable. This introduces dependence among these data.
Type of Models
Fixed-effects model - involves only fixed effects
- to make inferences about those particular levels of the classification factor that were used in the experiment
Random-effects model - involves only random effects
- to make inferences about the population from which these levels were drawn
Mixed model - involves both fixed and random effects
Example
Consider a study related to observations of half-sib families of I unrelated sires.
If the interest is in comparing only the I sires, the following fixed model can be used to represent the data:

E(Yᵢⱼ) = µ + sᵢ

where Yᵢⱼ represents the phenotypic trait observation of progeny j, j = 1, …, r, in family i, i = 1, …, I, µ is a mean, and sᵢ is a fixed effect common to all animals having sire i.

If the I sires are considered as a sample of a population of sires, the following random model can be used to represent the data:

E(Yᵢⱼ | sᵢ) = µ + sᵢ

where Sᵢ is a random effect. Two usual assumptions:
1. the sᵢ's are independently and identically distributed
2. the sᵢ's have zero mean and the same variance σ²_s

i.e. Sᵢ ∼ i.i.d.(0, σ²_s)
In matrix notation, this model can be expressed as:

[ y₁ ]   [ 1ᵣ ]     [ 1ᵣ 0ᵣ … 0ᵣ ] [ s₁ ]   [ ε₁ ]
[ y₂ ] = [ 1ᵣ ] µ + [ 0ᵣ 1ᵣ … 0ᵣ ] [ s₂ ] + [ ε₂ ]
[ ⋮  ]   [ ⋮  ]     [ ⋮  ⋮  ⋱  ⋮ ] [ ⋮  ]   [ ⋮  ]
[ y_I ]  [ 1ᵣ ]     [ 0ᵣ 0ᵣ … 1ᵣ ] [ s_I ]  [ ε_I ]

where yᵢ = [yᵢ₁, yᵢ₂, …, yᵢᵣ]′ represents the vector of observations of the progeny of sire i; 1ᵣ and 0ᵣ represent r-dimensional column vectors of 1′s and 0′s, respectively; and εᵢ = [εᵢ₁, εᵢ₂, …, εᵢᵣ]′ is the vector of residuals associated with sire i.
Simulation
Case 1: Consider the simple model yᵢⱼ = µ + sᵢ + eᵢⱼ, with 3 independent sires and 2 replicates:
- fix µ = 50
- get a sample of 3 values for sᵢ from a N(0, σ²_s)
- get a sample of 6 values for eᵢⱼ from a N(0, σ²)

Case 2: We could have a more complex covariance structure for sires (for example, A σ²_s, where A could be the parental relationship matrix). The simulation could be done using the Cholesky decomposition of A, i.e. A = DD′. Then we could get a vector z of dimension 3, with each of its elements drawn from a N(0, 1), and multiply z by D and by the square root of σ²_s; i.e. the s vector for sires is given by s = D z σ_s.
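Both cases can be sketched in a few lines of NumPy (the parameter values and the relationship matrix A are invented; the seeded generator makes the draws reproducible):

```python
import numpy as np

rng = np.random.default_rng(3)
mu, sigma2_s, sigma2 = 50.0, 4.0, 1.0
t, r = 3, 2                            # 3 sires, 2 replicates each

# Case 1: independent sire effects.
s = rng.normal(0.0, np.sqrt(sigma2_s), size=t)
e = rng.normal(0.0, np.sqrt(sigma2), size=t * r)
y = mu + np.repeat(s, r) + e

# Case 2: correlated sire effects with covariance A * sigma2_s,
# simulated via the Cholesky factor D of A (so A = D D').
A = np.array([[1.0, 0.5, 0.25],
              [0.5, 1.0, 0.5],
              [0.25, 0.5, 1.0]])       # hypothetical relationship matrix
D = np.linalg.cholesky(A)
z = rng.standard_normal(t)
s_corr = D @ z * np.sqrt(sigma2_s)     # s = D z sigma_s
y_corr = mu + np.repeat(s_corr, r) + e
```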
Advantages of Linear Mixed Models
flexibility of mixed models for grouped or correlated observations.
models can be used for related individuals (like animal and plantbreeding), longitudinal data, spatial statistics, etc.
generalized linear models with random effects, as, for example,implemented in GLIMMIX of SAS,
non-linear mixed models (NLINMIX of SAS, for example), for growth curves.
Linear Mixed Model
Y = Xβ + Zu + ε
Y is an observable data vector
β is a vector of unknown parameters
u is a vector of unobservable random variables
X and Z are design matrices for the fixed and random effects
ε is a vector of random errors
Generally, it is assumed that u and ε are independent of each other and normally distributed with zero-mean vectors and variance-covariance matrices G and Σ, respectively, i.e.:

[ u ]     ( [ 0 ]   [ G 0 ] )
[ ε ] ∼ N ( [ 0 ] , [ 0 Σ ] )

Inferences regarding mixed effects models refer to the estimation of fixed effects, the prediction of random effects, and the estimation of variance and covariance components, which are briefly discussed next.
Linear Mixed Models
Recall that the general linear mixed model equals Y = Xβ + Zu + ε

The implied marginal model is Y ∼ N(Xβ, V), where V = ZGZ′ + Σ

Note that inferences based on the marginal model do not explicitly assume the presence of random effects representing the natural heterogeneity between subjects (case of longitudinal data)
Some properties of the direct product of matrices
If A_r and B_c are square matrices of order r and c, respectively,

            [ a11 B  . . .  a1r B ]
A_r ⊗ B_c = [ . . .  . . .  . . . ]
            [ ar1 B  . . .  arr B ]

where ⊗ is called the direct (Kronecker) product operator

In general, A ⊗ B ≠ B ⊗ A

If u and v are vectors, then u′ ⊗ v = v ⊗ u′ = vu′

If D = diag(d11, . . . , dnn) is a diagonal matrix and A is any matrix, then:

D ⊗ A = d11 A ⊕ d22 A ⊕ . . . ⊕ dnn A

If matrix dimensions are compatible:

(A ⊗ B)(C ⊗ D) = AC ⊗ BD

(α_A A ⊗ α_B B) = α_A α_B (A ⊗ B)

(A ⊗ B)^T = A^T ⊗ B^T

(A ⊗ B)⁻¹ = A⁻¹ ⊗ B⁻¹

rank(A ⊗ B) = rank(A) rank(B)

tr(A ⊗ B) = tr(A) tr(B)

det(A_r ⊗ B_c) = det(A)^c det(B)^r
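These identities are easy to check numerically. A quick sketch with numpy (the 2×2 matrices below are arbitrary, invertible examples):

```python
import numpy as np

# Arbitrary invertible 2x2 matrices (r = c = 2)
A = np.array([[2.0, 1.0], [0.0, 3.0]])
B = np.array([[1.0, 4.0], [2.0, 1.0]])
C = np.array([[1.0, 0.0], [1.0, 2.0]])
D = np.array([[3.0, 1.0], [0.0, 1.0]])

K = np.kron(A, B)

# (A (x) B)(C (x) D) = AC (x) BD
assert np.allclose(K @ np.kron(C, D), np.kron(A @ C, B @ D))
# (A (x) B)^T = A^T (x) B^T
assert np.allclose(K.T, np.kron(A.T, B.T))
# (A (x) B)^(-1) = A^(-1) (x) B^(-1)
assert np.allclose(np.linalg.inv(K), np.kron(np.linalg.inv(A), np.linalg.inv(B)))
# tr(A (x) B) = tr(A) tr(B) and det(A (x) B) = det(A)^c det(B)^r
assert np.isclose(np.trace(K), np.trace(A) * np.trace(B))
assert np.isclose(np.linalg.det(K), np.linalg.det(A) ** 2 * np.linalg.det(B) ** 2)
# In general A (x) B != B (x) A
assert not np.allclose(K, np.kron(B, A))
```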
Completely Randomized Design (CRD)
Let's suppose a CRD with treatment as a random effect and with the same number of replicates (r) per treatment. The model is

Yij = µ + τi + εij,

where i = 1, 2, · · · , t, j = 1, 2, · · · , r, µ is constant, and τi and εij are random, with

τi ∼ N(0, σ²_T) and εij ∼ N(0, σ²)

τi and εij, τi and τi′, εij and εi′j′ (j ≠ j′ and/or i ≠ i′) are independent

then

Var(Yij) = Var(τi + εij) = σ² + σ²_T

Cov(Yij, Yij′) = Cov(τi + εij, τi + εij′) = σ²_T (observations from the same treatment are correlated)
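For the balanced case this structure is just V = I_t ⊗ (σ²_T J_r + σ² I_r); a small numeric sketch (illustrative values σ²_T = 2, σ² = 1, t = 3, r = 2):

```python
import numpy as np

t, r = 3, 2
sigma2_T, sigma2 = 2.0, 1.0  # illustrative variance components

# V = I_t (x) (sigma2_T * J_r + sigma2 * I_r): block diagonal, one block per treatment
block = sigma2_T * np.ones((r, r)) + sigma2 * np.eye(r)
V = np.kron(np.eye(t), block)

var_Yij = V[0, 0]        # Var(Y_ij) = sigma2 + sigma2_T
cov_same_trt = V[0, 1]   # Cov(Y_ij, Y_ij') = sigma2_T (same treatment)
cov_diff_trt = V[0, 2]   # observations from different treatments are uncorrelated
```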
Expected mean squares for an ANOVA, using matrix notation

Recall that the general linear mixed model equals

Y = Xβ + Zu + ε

U ∼ N(0, G)

ε ∼ N(0, Σ)

u and ε independent. Then E(Y) = Xβ and V = ZGZ^T + Σ.

Theorem: Let Y be an n × 1 vector of random variables with E[Y] = µ and Var[Y] = V, where µ is an n × 1 vector of expected values and V is an n × n matrix. Let A be an n × n matrix of real numbers. Then

E(Y^T A Y) = tr(AV) + µ^T A µ
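The theorem can be checked by simulation; a sketch with arbitrary 3-dimensional µ, V and A (none of these values come from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)

mu = np.array([1.0, -1.0, 2.0])
L = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.2, 0.3, 1.0]])
V = L @ L.T  # a valid covariance matrix by construction
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])

# Theoretical value: E(Y'AY) = tr(AV) + mu'A mu
theory = np.trace(A @ V) + mu @ A @ mu

# Monte Carlo average of Y'AY over draws of Y ~ N(mu, V)
Y = mu + rng.standard_normal((200_000, 3)) @ L.T
mc = np.einsum('ni,ij,nj->n', Y, A, Y).mean()
```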
i) Assuming a fixed CRD model (fixed effect for treatment), we have

Y = X_G µ + X_T τ + ε

with τ fixed and ε ∼ N(0, I_n σ²), that is, E(τ) = τ, G = Var(τ) = 0_{t×t}, E(ε) = 0 and Σ = Var(ε) = I_n σ². Then,
[Figure: Hasse diagrams for the maize CRD. Treatment structure: U (Mean, 1 df) and Progenies (5 levels, 4 df); plot structure: U (Mean, 1 df) and Plots (20 levels, 19 df). The corresponding mean squares are M_G (Mean), M_Pr − M_G (Progenies) and M_Pl − M_G (Plots); under the random model the expected contributions are 4σ²_T for Progenies and σ² for Plots.]
Estimation of Fixed Effects

Recall that the general linear mixed model equals

Y = Xβ + Zu + ε

with U ∼ N(0, G) and ε ∼ N(0, Σ) independent

Then,

E(Y|u) = Xβ + Zu and Var(Y|u) = Σ
E(Y) = E(Xβ + ZU) = Xβ
Var(Y) = Var(Xβ + ZU) + E(Σ) = ZGZ′ + Σ = V

and the marginal model is Y ∼ N(Xβ, V)

Notation:
β: vector of fixed effects (as before)
α: vector of all variance components in G and Σ
θ = (β′, α′)′: vector of all parameters in the marginal model

Marginal likelihood function:

L_ML(θ) = (2π)^(−n/2) |V(α)|^(−1/2) exp[ −(1/2)(Y − Xβ)′ V⁻¹(α) (Y − Xβ) ]

If α were known, the MLE of β equals

β̂(α) = (X′V⁻¹X)⁻¹ X′V⁻¹Y ∼ N(β, (X′V⁻¹X)⁻¹)
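A numerical sketch of β̂(α) for a balanced one-way layout (all values illustrative; with X = 1_n and this V, the GLS estimator reduces to the sample mean):

```python
import numpy as np

rng = np.random.default_rng(7)

t, r = 3, 2
n = t * r
X = np.ones((n, 1))                      # intercept-only fixed part
Z = np.kron(np.eye(t), np.ones((r, 1)))  # incidence matrix of the random effect
G = 2.0 * np.eye(t)                      # Var(u) = sigma2_T I_t (illustrative)
Sigma = 1.0 * np.eye(n)                  # Var(eps) = sigma2 I_n (illustrative)
V = Z @ G @ Z.T + Sigma

# One simulated response vector from the marginal model N(X beta, V)
y = X @ np.array([5.0]) + np.linalg.cholesky(V) @ rng.standard_normal(n)

# GLS / ML estimator of beta for known alpha
Vinv = np.linalg.inv(V)
beta_hat = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
```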
Estimation of Fixed Effects
As G and Σ are generally unknown, an estimate V̂ of V is used instead, such that the estimator becomes β̂ = (X′V̂⁻¹X)⁻¹ X′V̂⁻¹Y.

The variance-covariance matrix of β̂ is approximated by (X′V̂⁻¹X)⁻¹.

Note: (X′V̂⁻¹X)⁻¹ is biased downwards as a consequence of ignoring the variability introduced by working with estimates of the (co)variance components instead of their true (unknown) parameter values.

Approximate confidence regions and test statistics for estimable functions of the type K′β can be obtained by using the result:

(K′β̂ − K′β₀)′ [K′(X′V̂⁻¹X)⁻ K]⁻¹ (K′β̂ − K′β₀) / rank(K) ≈ F[ϕ_N, ϕ_D]

where F[ϕ_N, ϕ_D] refers to an F-distribution with ϕ_N = rank(K) degrees of freedom for the numerator, and ϕ_D degrees of freedom for the denominator, which is generally calculated from the data using, for example, Satterthwaite's approach
Matrix review
X ∼ N_k(µ, Σ)

Consider the partitions:

X = [ X1 ]     µ = [ µ1 ]           [ Σ11  Σ12 ]
    [ X2 ],        [ µ2 ]  and  Σ = [ Σ21  Σ22 ],

then

X1 ∼ N(µ1, Σ11) and X2 ∼ N(µ2, Σ22) (marginal distributions)

and

X1|X2 ∼ N(µ1.2, Σ11.2) and X2|X1 ∼ N(µ2.1, Σ22.1) (conditional distributions),

where µ1.2 = µ1 + Σ12 Σ22⁻¹ (X2 − µ2) and Σ11.2 = Σ11 − Σ12 Σ22⁻¹ Σ21 (and analogously for µ2.1 and Σ22.1).
Prediction of Random Effects
In addition to the estimation of fixed effects, very often in genetics, for example, interest also lies in the prediction of the random effects.

In linear (Gaussian) models such predictions are given by the conditional expectation of U given the data, i.e. E[U|y].
Given the model specifications, the joint distribution of Y and U is:

[ Y ]       ( [ Xβ ]   [ V    ZG ] )
[ U ]  ∼  N ( [ 0  ] , [ GZ′  G  ] )

From the properties of the multivariate normal distribution, we have

E[U|y] = E[U] + Cov[U, Y′] Var⁻¹[Y] (y − E[Y])
       = GZ′V⁻¹(y − Xβ) = GZ′(ZGZ′ + Σ)⁻¹(y − Xβ)

The fixed effects β are typically replaced by their estimates, so that predictions are made based on the following expression:

û = GZ′(ZGZ′ + Σ)⁻¹(y − Xβ̂)
Mixed Model Equations
The solutions β̂ and û discussed before require V⁻¹. As V can be of huge dimensions, especially in plant and animal breeding applications, its inverse is generally computationally demanding, if not unfeasible.

However, Henderson (1950) presented the mixed model equations (MME) to estimate β and u simultaneously, without the need for computing V⁻¹. The MME were derived by maximizing (with respect to β and u) the joint density of Y and U, f(y, u | β, G, Σ) = f(y | u, β, Σ) f(u | G).
Mixed Model Equations
The derivatives of ℓ with respect to β and u are:

[ ∂ℓ/∂β ]   [ X′Σ⁻¹y − X′Σ⁻¹Xβ − X′Σ⁻¹Zu        ]
[ ∂ℓ/∂u ] = [ Z′Σ⁻¹y − Z′Σ⁻¹Xβ − Z′Σ⁻¹Zu − G⁻¹u ]

Equating them to zero gives the following system:

[ X′Σ⁻¹Xβ̂ + X′Σ⁻¹Zû         ]   [ X′Σ⁻¹y ]
[ Z′Σ⁻¹Xβ̂ + Z′Σ⁻¹Zû + G⁻¹û ] = [ Z′Σ⁻¹y ]

which can be expressed as:

[ X′Σ⁻¹X   X′Σ⁻¹Z        ] [ β̂ ]   [ X′Σ⁻¹y ]
[ Z′Σ⁻¹X   Z′Σ⁻¹Z + G⁻¹ ] [ û ] = [ Z′Σ⁻¹y ]

known as the mixed model equations (MME).
BLUE and BLUP
Using the second part of the MME, we have that:

Z′Σ⁻¹Xβ̂ + (Z′Σ⁻¹Z + G⁻¹)û = Z′Σ⁻¹y

so that

û = (Z′Σ⁻¹Z + G⁻¹)⁻¹ Z′Σ⁻¹(y − Xβ̂)

It can be shown that this expression is equivalent to û = GZ′(ZGZ′ + Σ)⁻¹(y − Xβ̂) and, more importantly, that û is the best linear unbiased predictor (BLUP) of u.

Using this result in the first part of the MME, it is shown similarly that the solution is equivalent to β̂ = (X′V⁻¹X)⁻¹X′V⁻¹Y, which is the best linear unbiased estimator (BLUE) of β.
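This equivalence is easy to verify numerically; a sketch with an intercept plus one random factor (illustrative dimensions and variance components):

```python
import numpy as np

rng = np.random.default_rng(3)

t, r = 4, 3
n = t * r
X = np.ones((n, 1))
Z = np.kron(np.eye(t), np.ones((r, 1)))
G = 1.5 * np.eye(t)
Sigma = 0.5 * np.eye(n)
y = rng.normal(10.0, 2.0, size=n)

Si = np.linalg.inv(Sigma)

# Henderson's mixed model equations
C = np.block([[X.T @ Si @ X, X.T @ Si @ Z],
              [Z.T @ Si @ X, Z.T @ Si @ Z + np.linalg.inv(G)]])
rhs = np.concatenate([X.T @ Si @ y, Z.T @ Si @ y])
sol = np.linalg.solve(C, rhs)
beta_mme, u_mme = sol[:1], sol[1:]

# BLUE and BLUP from the marginal variance V = ZGZ' + Sigma
V = Z @ G @ Z.T + Sigma
Vinv = np.linalg.inv(V)
beta_gls = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
u_blup = G @ Z.T @ Vinv @ (y - X @ beta_gls)
```

The MME solution and the marginal-model expressions agree to numerical precision, without V⁻¹ ever appearing in the MME system itself.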
BLUE and BLUP
It is important to note that β̂ and û require knowledge of G and Σ.

These matrices, however, are rarely known.

This is a problem without an exact solution using classical methods.

The practical approach is to replace G and Σ by their estimates (Ĝ and Σ̂) in the MME.

Note that if G and Σ are known, the variance-covariance matrix of the BLUE and BLUP is:

    [ β̂ ]   [ X′Σ⁻¹X   X′Σ⁻¹Z        ]⁻¹
Var [ û ] = [ Z′Σ⁻¹X   Z′Σ⁻¹Z + G⁻¹ ]
BLUE and BLUP
If G and Σ are unknown and their values are replaced in the MME by some sort of point estimates Ĝ and Σ̂, the new solutions β̂ and û of the system:

[ X′Σ̂⁻¹X   X′Σ̂⁻¹Z        ] [ β̂ ]   [ X′Σ̂⁻¹y ]
[ Z′Σ̂⁻¹X   Z′Σ̂⁻¹Z + Ĝ⁻¹ ] [ û ] = [ Z′Σ̂⁻¹y ]

are no longer BLUE and BLUP solutions, as they are not even linear functions of the data y.

It is also shown that generally:

    [ β̂ ]   [ X′Σ̂⁻¹X   X′Σ̂⁻¹Z        ]⁻¹
Var [ û ] > [ Z′Σ̂⁻¹X   Z′Σ̂⁻¹Z + Ĝ⁻¹ ]
Inverse of a nonsingular partitioned matrix
Let A be a nonsingular partitioned matrix and A⁻¹ its inverse, as follows

    [ A11  A12 ]          [ A¹¹  A¹² ]   [ Var(β̂)     Cov(β̂, û) ]
A = [ A21  A22 ],   A⁻¹ = [ A²¹  A²² ] = [ Cov(û, β̂)  Var(û)    ]

where

A¹¹ = (A11 − A12 A22⁻¹ A21)⁻¹

A¹² = (A²¹)^T = −(A11 − A12 A22⁻¹ A21)⁻¹ A12 A22⁻¹

A²² = A22⁻¹ + A22⁻¹ A21 (A11 − A12 A22⁻¹ A21)⁻¹ A12 A22⁻¹
Example
Considering the completely randomized design with random treatment effect and r = 2, t = 3, then β = µ, X = 1_6, G = σ²_T I_3, Σ = σ² I_6,

    [ 1_2  0    0   ]           [ J_2  0    0   ]
Z = [ 0    1_2  0   ]   and V = [ 0    J_2  0   ] σ²_T + σ² I_6
    [ 0    0    1_2 ]           [ 0    0    J_2 ]

Then µ̂ = ȳ (Exercise!)

û = (Z′Σ⁻¹Z + G⁻¹)⁻¹ Z′Σ⁻¹(y − Xβ̂)
  = (Z′Z/σ² + I_3/σ²_T)⁻¹ (Z′/σ²)(y − 1_6 µ̂)
  = ((r/σ²) I_3 + (1/σ²_T) I_3)⁻¹ (Z′/σ²)(y − 1_6 µ̂)
  = ((r σ²_T + σ²)/(σ² σ²_T))⁻¹ (Z′/σ²)(y − 1_6 µ̂)
  = (σ²_T/(σ²_T + σ²/r)) (Z′/r)(y − 1_6 µ̂)

ûi = BLUPi = (σ²_T/(σ²_T + σ²/r)) (ȳi − µ̂) = (shrinkage factor) × BLUEi
The EBLUP for ui is given by

(1 − MSqRes/MSqT)(ȳi − ȳ)

The BLUP for µi = µ + ui is given by

µ̂i = ȳ + (σ²_T/(σ²_T + σ²/r))(ȳi − ȳ) = ȳi − (σ²/(r σ²_T + σ²))(ȳi − ȳ)

and substituting σ²_T and σ² by their estimates we have the EBLUP for µi:

ȳi − (MSqRes/MSqT)(ȳi − ȳ)

The relationship between the shrunk or adjusted means (EBLUPs) and the unadjusted means (BLUEs) can also be illustrated by a scatter diagram. The shrinkage towards the overall mean is indicated by the fact that the points representing treatments that have an estimated mean above µ̂ = ȳ lie below the line

EBLUP = BLUE

whereas those representing treatments with an estimated mean below µ̂ = ȳ lie above the line.
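The shrinkage pattern can be sketched with made-up summaries (hypothetical treatment means and mean squares, not the maize data):

```python
import numpy as np

# Hypothetical balanced CRD summaries
means = np.array([4.2, 5.0, 5.8, 6.4])  # unadjusted treatment means (BLUEs for mu_i)
ybar = means.mean()
MSqT, MSqRes = 2.0, 0.5                 # hypothetical mean squares

shrink = 1.0 - MSqRes / MSqT            # estimated shrinkage factor
eblup_mu = ybar + shrink * (means - ybar)  # EBLUPs of mu_i = mu + u_i
```

Means above the grand mean are pulled down and means below it are pulled up, which is exactly the pattern described for the scatter diagram.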
Using the partitioned inverse with A11 = X′Σ⁻¹X, A12 = X′Σ⁻¹Z, A22 = Z′Σ⁻¹Z + G⁻¹,

Var(Ȳ) = [X′Σ⁻¹X − X′Σ⁻¹Z (Z′Σ⁻¹Z + G⁻¹)⁻¹ Z′Σ⁻¹X]⁻¹

        = [X′X/σ² − (X′Z/σ²)(Z′Z/σ² + I/σ²_T)⁻¹(Z′X/σ²)]⁻¹

        = [n/σ² − (X′Z/σ²)(σ² σ²_T/(r σ²_T + σ²))(Z′X/σ²)]⁻¹

        = [(n/σ²)(σ²/(r σ²_T + σ²))]⁻¹

        = (r σ²_T + σ²)/n

and an estimate of Var(Ȳ) is given by

V̂ar(Ȳ) = MSqT/n
Var(û) = (Z′Σ⁻¹Z + G⁻¹)⁻¹ + (Z′Σ⁻¹Z + G⁻¹)⁻¹ Z′Σ⁻¹X [X′Σ⁻¹X − X′Σ⁻¹Z (Z′Σ⁻¹Z + G⁻¹)⁻¹ Z′Σ⁻¹X]⁻¹ X′Σ⁻¹Z (Z′Σ⁻¹Z + G⁻¹)⁻¹

       = σ² σ²_T/(r σ²_T + σ²)

and an estimate of Var(û) is given by

V̂ar(û) = (1 − MSqRes/MSqT)(MSqRes/r)
Estimation methods for the variance components
Recall that α is the vector of all variance components in G and Σ
In most cases, α is not known and needs to be replaced by an estimate α̂

Three frequently used estimation methods for α are:
Moment method or ANOVA Method (MM)
Maximum likelihood method (ML)
Restricted maximum likelihood method (REML)
Estimation methods for the variance components
Anova Estimation
Fit the model by assuming that the random effects in the model are fixed effects. Obtain the corresponding ANOVA table.

Compute the expected values of the mean squares in the ANOVA table under the true assumptions about the u's and ε.

Equate the observed mean squares to their expected mean squares and solve the resulting system of equations for each of the variance components.

Use the resulting solutions as the estimates of the variance components
Estimation methods for the variance components
Example
Consider the data set below, related to observations of half-sib families of t unrelated sires.

              Sire
   1      2     . . .   t
  y11    y21    . . .  yt1
  y12    y22    . . .  yt2
  . . .  . . .  . . .  . . .
  y1r1   y2r2   . . .  ytrt

The following model can be used to represent these data:

yij = µ + si + εij

where yij represents the phenotypic trait observation of progeny j (j = 1, 2, . . . , ri) in family i, µ is a mean, si is an effect common to all animals having sire i, and εij is a residual term.
Estimation methods for the variance components
Example
The sire effect si is equivalent to the transmitting ability (which is equal to one-half the additive genetic value) of sire i, as one-half of its genes are (randomly) transmitted to each of its ri progeny.

The residual terms εij refer to additional genetic effects (such as the effect of dams) and environmental components.

It is assumed that si ∼ N(0, σ²_s) and εij ∼ N(0, σ²)

The expectation and variance of Yij are

E(Yij) = µ and Var(Yij) = σ²_s + σ²
Estimation methods for the variance components
Example
The ANOVA table with expected mean squares is

Source     df     SSq    MSq    E[MSq]
Units      n − 1
Sire       t − 1  SSSq   SMSq   σ² + k σ²_s
Residual   n − t  RSSq   RMSq   σ²

where k = (1/(t − 1)) [n − (1/n) Σi ri²].

The ANOVA (MM) estimators for σ² and σ²_s are

σ̂² = RSSq/(n − t)   and   σ̂²_s = (SMSq − RMSq)/k = (1/k)[SMSq − σ̂²]

In the specific case of balanced data, i.e. the same progeny size for all sires, ri = r = n/t, and the ANOVA estimators become:

σ̂² = RMSq = RSSq/(t(r − 1))   and   σ̂²_s = (SMSq − RMSq)/r = (1/r)[(1/(t − 1))SSSq − σ̂²]
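For balanced data the method-of-moments step is two lines; a sketch with hypothetical sums of squares (t = 5 sires, r = 4 progeny each; the numbers are made up):

```python
t, r = 5, 4
n = t * r

# Hypothetical ANOVA sums of squares
SSSq, RSSq = 22.0, 9.0
SMSq = SSSq / (t - 1)  # sire mean square
RMSq = RSSq / (n - t)  # residual mean square

sigma2_hat = RMSq                 # ANOVA estimator of sigma^2
sigma2_s_hat = (SMSq - RMSq) / r  # ANOVA estimator of sigma^2_s
```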
Estimation methods for the variance components
Anova Estimation – Advantages
In general, the ANOVA approach works well for simple models (such as a one-way structure) or balanced data (such as data from designed experiments with no missing data).

The estimators of the variance components are unbiased.

One can often approximate the degrees of freedom corresponding to the estimated standard errors of estimators of estimable functions of the fixed effects by using Satterthwaite's method. For the sire example,

σ̂²_s = (SMSq − RMSq)/k

with n_s degrees of freedom given by

n_s = (SMSq − RMSq)² / [ (SMSq)²/(t − 1) + (RMSq)²/(n − t) ]

SAS and R can produce the necessary information to perform these analyses.
Estimation methods for the variance components
Anova Estimation – Disadvantages
It is not indicated for more complex models and data structures, such as those generally found in plant and animal breeding and in longitudinal studies.

There is no unique way in which to form an ANOVA table when the data are not balanced.

The procedure can produce negative estimates of the variance components, which do not make sense.

If some of the expected mean squares of the random effects in the ANOVA table depend on fixed effects, the method cannot be applied. This problem can be avoided by placing all the fixed effects in the model first, followed by the random effects.
Estimation methods for the variance components
A number of methods have been proposed for estimating variance components in more complex scenarios, such as the expected mean squares approach of Henderson (1953) and the minimum norm quadratic unbiased estimation (Rao 1971a, 1971b), but maximum likelihood based methods are currently the most popular ones, especially the restricted (or residual) maximum likelihood (REML) approach, which attempts to correct for the well-known bias in the classical maximum likelihood (ML) estimation of variance components.
These two methods are briefly described next.
Maize trial
Example
5 progenies of a maize population were investigated
the trial was conducted using a completely randomized design with 4 replicates of each progeny

the response variable was the weight of corn-cob (kg/10 m²)
Completely Randomized Design (CRD) with the same number of replicates – Expected mean squares for an ANOVA

Let Y be an n × 1 vector of random variables with E[Y] = µ and Var[Y] = V, where µ is an n × 1 vector of expected values and V is an n × n matrix. Let A be an n × n matrix of real numbers. Then

E(Y^T A Y) = tr(AV) + µ^T A µ

For a fixed CRD model: E(Y) = X_T τ and V = I_n σ²

For a random CRD model: E(Y) = 1_n µ and V = I_n σ² + r σ²_T M_T

where X_T = I_t ⊗ 1_r and M_T = X_T (X_T^T X_T)⁻¹ X_T^T = r⁻¹ I_t ⊗ J_r
The expected mean squares under the fixed and random models are given in the following table.

## BLUE, BLUP (step by step)
summary(lm(Yield ~ Progeny - 1, CRDMaize.dat))  # shows the BLUE's for mu_i
sqrt(MSq_Res/4)  # standard error of the BLUE's for mu_i
mean_T <- tapply(CRDMaize.dat$Yield, CRDMaize.dat$Progeny, mean)
tau_BLUE <- mean_T - ybar
tau_BLUP <- tau_BLUE*sigma2_T/(sigma2_T + sigma2/4)
mean_T_BLUP <- ybar + tau_BLUP
(var_tau_BLUP <- sigma2_T*sigma2/(4*sigma2_T + sigma2))
sqrt(var_tau_BLUP)
data.frame(tau_BLUE, tau_BLUP, mean_T, mean_T_BLUP)
plot(mean_T, mean_T_BLUP, pch='*', xlim=c(4,6), ylim=c(4,6),
     xlab='Unadjusted means', ylab='Adjusted means')
abline(0,1)

R program

require(nlme)
# Restricted Maximum Likelihood Method
CRDMaize.reml <- lme(Yield ~ 1, random = ~1|Progeny, CRDMaize.dat, method="REML")
summary(CRDMaize.reml)
VarCorr(CRDMaize.reml)
VarCorr(CRDMaize.reml)[1]
VarCorr(CRDMaize.reml)[2]
VarCorr(CRDMaize.reml)[3]
VarCorr(CRDMaize.reml)[4]
(summary(CRDMaize.reml)$sigma)^2
random.effects(CRDMaize.reml)  ## tau EBLUP
CRDMaize.reml$coef             ## mean EBLUP
coef(CRDMaize.reml)

# Maximum Likelihood Method
CRDMaize.ml <- update(CRDMaize.reml, method="ML")
# CRDMaize.ml <- lme(Yield ~ 1, random = ~1|Progeny, CRDMaize.dat, method="ML")
summary(CRDMaize.ml, corr = F)
VarCorr(CRDMaize.ml)
(summary(CRDMaize.ml)$sigma)^2
random.effects(CRDMaize.ml)
coef(CRDMaize.ml)

R program

## Restricted Maximum Likelihood Method, using library lme4
library(lme4)
CRDMaize.lmer <- lmer(Yield ~ 1 + (1|Progeny), CRDMaize.dat, REML=TRUE)
summary(CRDMaize.lmer)
summary(CRDMaize.lmer)@coefs
data.frame(summary(CRDMaize.lmer)@REmat)

# Restricted Maximum Likelihood Method using ASReml-R
require(asreml)
CRDMaize.asreml <- asreml(Yield ~ 1, random = ~Progeny, data = CRDMaize.dat)
summary(CRDMaize.asreml)
summary(CRDMaize.asreml)$varcomp
ASReml [email protected]
http://www.vsni.co.uk/products/asreml
ASReml forum: www.vsni.co.uk/forum
Cookbook: http://uncronopio.org/ASReml
Differences between lme4 and nlme
(B. Venables, 2010, personal communication)
1 With nlme the fixed and random parts of the model are specified using two formulae; in lme4 they are specified in the one formula, with the random parts "added on" to the fixed parts.

2 With nlme you have no generalized linear mixed model fitter, though glmmPQL in the MASS library can be used for some GLMMs, and it uses the nlme library. lme4 has a GLMM fitter built in. It allows you to specify families in the glm sense, but not all glm families are supported, yet.

3 nlme offers non-linear mixed effect models; lme4 does not and never will.

4 The nlme package allows you to specify variance heterogeneity and correlation patterns; the only way to do this within lme4 is to use a glm family, which is often not what you want to do.

5 The nlme package has a gls function for "generalized least squares". This allows you to make use of the variance heterogeneity and correlation patterns feature even if the model does not contain any random effects. This is handy.
Differences between lme4 and nlme
(B. Venables, 2010, personal communication, cont.)
6 (Probably the most important difference.) nlme is hard to use with crossed random effects, but is very well developed for nested random effects. lme4 is the opposite: it handles crossed random effects well, and using it with nested random effects is still simple enough, but a bit more work than with nlme.

7 nlme uses an older algorithm which struggles for large data sets. lme4 uses a newer algorithm and can handle quite large data sets very quickly. (I think SAS Proc mixed, though, will handle even bigger ones.)

8 lme4 is, at this stage, relatively under-developed. Some important things are missing.

9 ASREML is wonderful, but it only handles a relatively small set of models (though the most important set, of course)
(C. Brien, 2010, personal communication)
1 ASREML does a wide range of heterogeneous variances and correlations for nested and crossed random effects, although probably not the full range of heterogeneous, nested models that nlme does. ASREML also does GLMMs, similar to glmmPQL. It does not do the non-linear models.

2 ASREML is good for experiments and lme4/nlme are good for large surveys, because that is what they were developed for
Software
SAS procedures
PROC GLM – general linear model
PROC MIXED – linear mixed model
PROC GENMOD – generalized linear model
PROC GLIMMIX
PROC NLMIXED – non-linear mixed model
Basic SAS code

1/ proc mixed data=variety.eval;
2/ class block type dose;
3/ model y = type|dose;
4/ random block block*dose;
5/ ods select Tests3 CovParms; run;

1/ call the procedure and declare the data set
2/ define block, type and dose as factors
3/ define the fixed effects in the model
4/ declare the random effects
5/ output the type 3 tests and the covariance parameters
1/ proc mixed statement <options>;
DATA= SAS data set. Name of the SAS data set to be used by PROC MIXED. The default is the most recently created data set.

METHOD=
  REML (default method)
  ML

COVTEST requests asymptotic standard errors and Wald Z-tests for the variance-covariance structure parameter estimates.
3/ MODEL statement <options>;

describes the linear relation between Y and the fixed covariables

S or SOLUTION requests the fixed effects estimates in the output;

DDFM= method to compute approximate degrees of freedom
  CONTAIN (default)
  RES
  KR
  SATTERTH

outpred=Names1: the output data set Names1 contains the predicted values Xβ̂ + Zû, standard errors, ...

outpredm=Names2: the output data set Names2 contains the predicted values Xβ̂, standard errors, ...

4/ RANDOM statement

random block / Solution;
→ BLUPs and t-tests
CRD – Variance of the Variance Component Estimators

From Session 1: Let MS denote a mean square with ν df. If νMS/E(MS) ∼ χ²_ν, the variance of MS is

Var(MS) = 2E²(MS)/ν.

Hence,

Var(MS) = E(MS²) − E²(MS) = E(MS²) − (ν/2)Var(MS).

Thus (ν + 2)Var(MS)/2 = E(MS²), and an unbiased estimator of Var(MS) is given by

2MS²/(ν + 2)

Then

Var(σ̂²) = Var(MSqRes) = 2σ⁴/(n − t)

and

V̂ar(σ̂²) = V̂ar(MSqRes) = 2MSqRes²/(n − t + 2)

Maize example: σ̂² = 0.3212 and V̂ar(σ̂²) = 2 × 0.3212²/(15 + 2) = 0.0121
We saw that

σ̂²_T = (MSqT − MSqRes)/r = Σi(ȳi − ȳ)²/(t − 1) − s²/r

Since MSqT and MSqRes are independent, the two terms of σ̂²_T are distributed independently. Furthermore,

(t − 1)MSqT/(σ² + r σ²_T) ∼ χ²_(t−1)   and   (n − t)MSqRes/σ² ∼ χ²_(n−t)

From these results,

Var(σ̂²_T) = (1/r²)[Var(MSqT) + Var(MSqRes)] = (2/r²)[(r σ²_T + σ²)²/(t − 1) + σ⁴/(n − t)]

An unbiased estimator of this variance is given by

V̂ar(σ̂²_T) = (2/r²)[MSqT²/(t − 1 + 2) + MSqRes²/(n − t + 2)] = (2/r)[MSqT²/(n + r) + MSqRes²/(r(n − t + 2))]

Maize example: σ̂²_P = 0.2639 and

V̂ar(σ̂²_P) = (2/4²)[1.3770²/(4 + 2) + 0.3212²/(15 + 2)] = 0.0403
Confidence interval for σ²
From $\frac{(n-t)MSqRes}{\sigma^2} \sim \chi^2_{n-t}$ we obtain
$$P\left(\chi^2_{n-t;\alpha/2} < \frac{(n-t)MSqRes}{\sigma^2} < \chi^2_{n-t;1-\alpha/2}\right) = 1-\alpha$$
or equivalently
$$P\left(\frac{(n-t)MSqRes}{\chi^2_{n-t;1-\alpha/2}} < \sigma^2 < \frac{(n-t)MSqRes}{\chi^2_{n-t;\alpha/2}}\right) = 1-\alpha.$$
Then a $100(1-\alpha)\%$ confidence interval for $\sigma^2$ is
$$\left[\frac{(n-t)MSqRes}{\chi^2_{n-t;1-\alpha/2}};\ \frac{(n-t)MSqRes}{\chi^2_{n-t;\alpha/2}}\right].$$
Maize example: a 95% confidence interval for $\sigma^2$ is
$$\left[\frac{15 \times 0.3212}{27.4884};\ \frac{15 \times 0.3212}{6.2621}\right] = [0.1753;\ 0.7693].$$
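The interval can be reproduced with a Python sketch (an illustration, not from the slides); the chi-square quantiles are hard-coded from the slides, and `scipy.stats.chi2.ppf(0.975, 15)` and `scipy.stats.chi2.ppf(0.025, 15)` would reproduce them:

```python
# 100(1-alpha)% CI for sigma^2:
#   [ (n-t)*MSqRes / chi2_upper ; (n-t)*MSqRes / chi2_lower ].
df_res, msq_res = 15, 0.3212
chi2_lower, chi2_upper = 6.2621, 27.4884  # chi2_{15;0.025}, chi2_{15;0.975}
ci = (df_res * msq_res / chi2_upper, df_res * msq_res / chi2_lower)
print(tuple(round(v, 3) for v in ci))  # (0.175, 0.769)
```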
Confidence interval for σ²_T
To get the confidence interval for $\sigma^2_T$ we first need to determine the number of degrees of freedom associated with $\hat\sigma^2_T$, by the Satterthwaite method.
As $\hat\sigma^2_T = \frac{MSqT - MSqRes}{r}$, from Session 1,
$$\nu_T = \frac{\left(\sum_i a_i MS_i\right)^2}{\sum_i \frac{a_i^2 MS_i^2}{\nu_i}} = \frac{(MSqT - MSqRes)^2}{\frac{MSqT^2}{t-1} + \frac{MSqRes^2}{n-t}}.$$
Then a $100(1-\alpha)\%$ confidence interval for $\sigma^2_T$ is
$$\left[\frac{\nu_T \hat\sigma^2_T}{\chi^2_{\nu_T;1-\alpha/2}};\ \frac{\nu_T \hat\sigma^2_T}{\chi^2_{\nu_T;\alpha/2}}\right].$$
Maize example:
$$\nu_T = \frac{(1.37695 - 0.32118)^2}{\frac{1.37695^2}{4} + \frac{0.32118^2}{15}} = 2.32$$
and
$$\left[\frac{2.32 \times 0.2639}{8.0308};\ \frac{2.32 \times 0.2639}{0.0903}\right] = [0.07618;\ 6.7714].$$
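The Satterthwaite degrees of freedom can be checked directly (a Python sketch, not from the slides):

```python
# Satterthwaite approximate degrees of freedom for
# sigma2_T_hat = (MSqT - MSqRes)/r.
msq_t, msq_res = 1.37695, 0.32118
df_t, df_res = 4, 15        # t - 1 and n - t for the maize example
nu_T = (msq_t - msq_res) ** 2 / (msq_t**2 / df_t + msq_res**2 / df_res)
print(round(nu_T, 2))  # 2.32
```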
Inference regarding the mean
It is easy to show that the sample mean $\bar Y = \frac{\sum_{i,j} Y_{ij}}{n}$ is an unbiased estimator of $\mu$ and has variance
$$\mathrm{Var}(\bar Y) = \frac{1}{t}\left(\sigma^2_T + \frac{\sigma^2}{r}\right) = \frac{1}{n}\left(r\sigma^2_T + \sigma^2\right).$$
An unbiased estimator of this variance is
$$\widehat{\mathrm{Var}}(\bar Y) = \frac{1}{n}MSqT.$$
The hypothesis $H_0: \mu = \mu_0$ can be tested using
$$t_{t-1} = \frac{\bar y - \mu_0}{\sqrt{\widehat{\mathrm{Var}}(\bar Y)}},$$
which follows a Student's t-distribution with $(t-1)$ d.f. The $100(1-\alpha)\%$ confidence interval for $\mu$ has limits
$$CI(\mu):\ \left[\bar y - t_{t-1;\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\bar Y)};\ \bar y + t_{t-1;\alpha/2}\sqrt{\widehat{\mathrm{Var}}(\bar Y)}\right].$$
Maize example: $\bar y = 5.07$, $\widehat{\mathrm{Var}}(\bar Y) = \frac{1.3770}{20} = 0.0688$, $t = (5.0705 - 0)/\sqrt{0.0688} = 19.32$ and the $CI(\mu)$: $[4.34,\ 5.80]$.
Expected mean squares for an ANOVA – CRD with subsampling
The model for a CRD with subsampling (k subsamples per plot), with treatment random, is
$$Y_{ijk} = \mu + \tau_i + \varepsilon_{ij} + \varepsilon_{ijk},$$
where $i = 1,\ldots,t$, $j = 1,\ldots,r$, $k = 1,\ldots,k$ (k also denotes the number of subsamples per plot), $\mu$ is a constant, and $\tau_i$, $\varepsilon_{ij}$ and $\varepsilon_{ijk}$ are random. The ANOVA table is

Source                         df          SSq              MSq                         F
Plots                          rt − 1      Y′Q_P Y
  Treatments                   t − 1       Y′Q_T Y          Y′Q_T Y/(t − 1)             MSqT/MSqRes
  Residual                     t(r − 1)    Y′Q_{U_Res} Y    Y′Q_{U_Res} Y/[t(r − 1)]    MSqRes/MSqW
Between samples within plots   rt(k − 1)   Y′Q_{U_W} Y      Y′Q_{U_W} Y/[rt(k − 1)]

$M_U = I_n$, $X_G = 1_n$, $M_G = X_G(X_G'X_G)^{-1}X_G' = n^{-1}J_n$,
$Q_T = M_T - M_G$, $Q_U = M_U - M_G$, $Q_{U_{Res}} = M_U - M_T$.
Then,
$$SSqT = Y'Q_T Y = \frac{1}{rk}\sum_{i=1}^{t} T_i^2 - C, \qquad C = \frac{\left(\sum_{i,j,k} Y_{ijk}\right)^2}{n},$$
$$SSqPlots = \frac{1}{k}\sum_{i,j} Y_{ij.}^2 - C, \qquad SSqRes = SSqPlots - SSqT,$$
$$SSqWithin = \sum_{i,j,k} Y_{ijk}^2 - C - SSqPlots.$$
Assuming that
$$Y_{ijk} = \mu + \tau_i + \varepsilon_{ij} + \varepsilon_{ijk},$$
where $\tau_i \sim N(0, \sigma^2_T)$, $\varepsilon_{ij} \sim N(0, \sigma^2_P)$ and $\varepsilon_{ijk} \sim N(0, \sigma^2_{PS})$. Then
$$E(\tau_i) = 0, \quad \mathrm{Var}(\tau_i) = E(\tau_i^2) = \sigma^2_T,$$
$$E(\varepsilon_{ij}) = 0, \quad \mathrm{Var}(\varepsilon_{ij}) = E(\varepsilon_{ij}^2) = \sigma^2_P,$$
$$E(\varepsilon_{ijk}) = 0, \quad \mathrm{Var}(\varepsilon_{ijk}) = E(\varepsilon_{ijk}^2) = \sigma^2_{PS}.$$
i) E(SSqUnits)
$$E(SSqUnits) = \sum_{i,j,k} E(Y_{ijk}^2) - E(C)$$
$$E(Y_{ijk}^2) = E(\mu^2) + E(\tau_i^2) + E(\varepsilon_{ij}^2) + E(\varepsilon_{ijk}^2) + E(dp) = \mu^2 + \sigma^2_T + \sigma^2_P + \sigma^2_{PS}$$
(dp denotes the double products, i.e. the cross-product terms, which have zero expectation)
$$\sum_{i,j,k} E(Y_{ijk}^2) = n\mu^2 + n\sigma^2_T + n\sigma^2_P + n\sigma^2_{PS}$$
$$C = \frac{1}{trk}\left(\sum_{i,j,k} Y_{ijk}\right)^2 = \frac{1}{trk}\left[\sum_{i,j,k}\left(\mu + \tau_i + \varepsilon_{ij} + \varepsilon_{ijk}\right)\right]^2 = \frac{1}{trk}\left[trk\mu + rk\sum_i \tau_i + k\sum_{i,j}\varepsilon_{ij} + \sum_{i,j,k}\varepsilon_{ijk}\right]^2$$
$$= \frac{1}{trk}\left[(trk\mu)^2 + (rk)^2\left(\sum_i \tau_i\right)^2 + k^2\left(\sum_{i,j}\varepsilon_{ij}\right)^2 + \left(\sum_{i,j,k}\varepsilon_{ijk}\right)^2 + dp\right]$$
$$E(C) = trk\mu^2 + \frac{rk}{t}E\left[\left(\sum_i \tau_i\right)^2\right] + \frac{k}{tr}E\left[\left(\sum_{i,j}\varepsilon_{ij}\right)^2\right] + \frac{1}{trk}E\left[\left(\sum_{i,j,k}\varepsilon_{ijk}\right)^2\right] + \frac{1}{trk}E(dp)$$
But
$$E\left[\left(\sum_i \tau_i\right)^2\right] = \sum_i E(\tau_i^2) + \sum E(dp) = t\sigma^2_T$$
$$E\left[\left(\sum_{i,j}\varepsilon_{ij}\right)^2\right] = \sum_{i,j} E(\varepsilon_{ij}^2) + \sum E(dp) = tr\sigma^2_P$$
$$E\left[\left(\sum_{i,j,k}\varepsilon_{ijk}\right)^2\right] = \sum_{i,j,k} E(\varepsilon_{ijk}^2) + \sum E(dp) = trk\sigma^2_{PS}$$
$$E(dp) = 0$$
Then
$$E(C) = trk\mu^2 + \frac{rk}{t}t\sigma^2_T + \frac{k}{tr}tr\sigma^2_P + \frac{1}{trk}trk\sigma^2_{PS} = trk\mu^2 + rk\sigma^2_T + k\sigma^2_P + \sigma^2_{PS}$$
and
$$E(SSqUnits) = (n - rk)\sigma^2_T + (n - k)\sigma^2_P + (n - 1)\sigma^2_{PS} = rk(t-1)\sigma^2_T + k(tr-1)\sigma^2_P + (n-1)\sigma^2_{PS}.$$
ii) E(SSqPlots)
$$E(SSqPlots) = \frac{1}{k}\sum_{i,j} E(P_{ij}^2) - E(C)$$
$$P_{ij} = \sum_k Y_{ijk} = \sum_k\left(\mu + \tau_i + \varepsilon_{ij} + \varepsilon_{ijk}\right) = k\mu + k\tau_i + k\varepsilon_{ij} + \sum_k \varepsilon_{ijk}$$
$$P_{ij}^2 = \left[\sum_k\left(\mu + \tau_i + \varepsilon_{ij} + \varepsilon_{ijk}\right)\right]^2 = k^2\mu^2 + k^2\tau_i^2 + k^2\varepsilon_{ij}^2 + \left(\sum_k \varepsilon_{ijk}\right)^2 + dp$$
$$E(P_{ij}^2) = k^2\mu^2 + k^2E(\tau_i^2) + k^2E(\varepsilon_{ij}^2) + E\left[\left(\sum_k \varepsilon_{ijk}\right)^2\right] + E(dp) = k^2\mu^2 + k^2\sigma^2_T + k^2\sigma^2_P + k\sigma^2_{PS}$$
$$E(SSqPlots) = \frac{1}{k}\sum_{i,j}\left(k^2\mu^2 + k^2\sigma^2_T + k^2\sigma^2_P + k\sigma^2_{PS}\right) - E(C)$$
$$= \frac{1}{k}\left(trk^2\mu^2 + trk^2\sigma^2_T + trk^2\sigma^2_P + trk\sigma^2_{PS}\right) - E(C)$$
$$= trk\mu^2 + trk\sigma^2_T + trk\sigma^2_P + tr\sigma^2_{PS} - E(C)$$
$$= rk(t-1)\sigma^2_T + k(tr-1)\sigma^2_P + (tr-1)\sigma^2_{PS}$$
iii) E(SSqT) and E(MSqT)
$$SSqT = \frac{1}{rk}\sum_{i=1}^{t} T_i^2 - C$$
$$T_i^2 = \left(rk\mu + rk\tau_i + k\sum_j \varepsilon_{ij} + \sum_{j,k}\varepsilon_{ijk}\right)^2$$
$$= (rk\mu)^2 + (rk\tau_i)^2 + \left(k\sum_j \varepsilon_{ij}\right)^2 + \left(\sum_{j,k}\varepsilon_{ijk}\right)^2 + 2(rk\mu)(rk\tau_i) + 2(rk\mu)\left(k\sum_j \varepsilon_{ij}\right) + 2(rk\mu)\left(\sum_{j,k}\varepsilon_{ijk}\right)$$
$$\quad + 2(rk\tau_i)\left(k\sum_j \varepsilon_{ij}\right) + 2(rk\tau_i)\left(\sum_{j,k}\varepsilon_{ijk}\right) + 2\left(k\sum_{j=1}^{r} \varepsilon_{ij}\right)\left(\sum_{j,k}\varepsilon_{ijk}\right)$$
$$E(T_i^2) = r^2k^2\mu^2 + r^2k^2E(\tau_i^2) + k^2E\left[\left(\sum_j \varepsilon_{ij}\right)^2\right] + E\left[\left(\sum_{j,k}\varepsilon_{ijk}\right)^2\right] = r^2k^2\mu^2 + r^2k^2\sigma^2_T + rk^2\sigma^2_P + rk\sigma^2_{PS}$$
$$\frac{1}{rk}\sum_{i=1}^{t} E(T_i^2) = trk\mu^2 + trk\sigma^2_T + tk\sigma^2_P + t\sigma^2_{PS}$$
$$E(SSqT) = \frac{1}{rk}\sum_{i=1}^{t} E(T_i^2) - E(C) = rk(t-1)\sigma^2_T + k(t-1)\sigma^2_P + (t-1)\sigma^2_{PS}$$
and
$$E(MSqT) = \frac{E(SSqT)}{t-1} = rk\sigma^2_T + k\sigma^2_P + \sigma^2_{PS}.$$
iv) E(SSqRes) and E(MSqRes)
$$SSqRes = SSqPlots - SSqT$$
$$E(SSqRes) = kt(r-1)\sigma^2_P + t(r-1)\sigma^2_{PS} \quad\text{and}\quad E(MSqRes) = \frac{E(SSqRes)}{t(r-1)} = k\sigma^2_P + \sigma^2_{PS}.$$
v) E(SSqWithin) and E(MSqWithin)
$$SSqWithin = SSqUnits - SSqPlots$$
$$E(SSqWithin) = tr(k-1)\sigma^2_{PS} \quad\text{and}\quad E(MSqWithin) = \frac{E(SSqWithin)}{tr(k-1)} = \sigma^2_{PS}.$$
Source      df          SSq              MSq                         E(MSq), T fixed          E(MSq), T random
Plots       rt − 1      Y′Q_P Y
  Treat.    t − 1       Y′Q_T Y          Y′Q_T Y/(t − 1)             σ² + kσ²_P + q_T(Ψ)      σ² + kσ²_P + rkσ²_T
  Res.      t(r − 1)    Y′Q_{U_Res} Y    Y′Q_{U_Res} Y/[t(r − 1)]    σ² + kσ²_P               σ² + kσ²_P
S[Plots]    rt(k − 1)   Y′Q_{U_W} Y      Y′Q_{U_W} Y/[rt(k − 1)]     σ²                       σ²

$$q_T(\Psi) = \frac{\Psi'Q_T\Psi}{t-1} = \frac{\sum_{i=1}^{t} rk(\tau_i - \bar\tau_.)^2}{t-1}$$
Wood shearing strength
Example
The effects of six treatments (a 2 × 3 set of factorial treatments from two types of resin and three wood blade densities) on the shearing strength are to be investigated. The two types of resin were APM (resin of high molecular weight) and BPM (resin of low molecular weight), and the three wood blade densities were VH (Very Hard), H (Hard) and S (Soft). The trial was conducted using a completely randomized design with three wood panels for each treatment, and the shearing strength (kgf/cm²) of five test bodies from each panel was measured.
interest, of course, in each particular treatment used
no interest in each panel, which depends strongly on the circumstances
no interest in each test body, which depends strongly on the circumstances
interest in estimating the variance of the panel effect as a source of random variation in the data
the five test bodies from the same panel share something, which presumably violates the assumption of independence
Penicillin yield (Brien, 2009)
Example
The effects of four treatments on the yield of penicillin are to be investigated. It is known that corn steep liquor, an important raw material in producing penicillin, is highly variable from one blending of it to another. To ensure that the results of the experiment apply to more than one blend, five blends (blocks) are to be used in the experiment. The trial was conducted using the same blend in four flasks and randomizing the four treatments to these four flasks.
interest, of course, in each particular treatment used
no interest in each blend, which depends strongly on the circumstances
the blend effect can be viewed as a sample from a random blend effect (levels are chosen at random from an infinite set of blend levels)
interest in estimating the variance of the blend effect as a source of random variation in the data
the four flasks with the same blend share something, which presumably violates the assumption of independence
Penicillin yield
ANOVA table using R
Source          df    SSq     MSq    F      Prob
Blend           4     264.0   66.0   1.97   0.15
Plots[Blocks]   15
  Treat         3     70.0    23.3   1.24   0.34
  Residual      12    226.0   18.8

ANOVA estimators:
$$\hat\sigma^2 = MSqRes = 18.8, \qquad \hat\sigma^2_B = \frac{MSqB - MSqRes}{t} = \frac{66.0 - 18.8}{4} = 11.8$$

         MM     REML   ML
σ²_B     11.8   11.8   9.4
σ²       18.8   18.8   15.1
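The method-of-moments estimators can be verified with a small Python sketch (an illustration, not part of the slides):

```python
# ANOVA (method-of-moments) estimators for the penicillin RCBD:
# sigma2 = MSqRes and sigma2_B = (MSqBlend - MSqRes)/t, where t = 4 is the
# number of flasks (treatments) per blend.
msq_blend, msq_res, t = 66.0, 18.8, 4
sigma2 = msq_res
sigma2_B = (msq_blend - msq_res) / t
print(sigma2, round(sigma2_B, 1))  # 18.8 11.8
```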
SAS program
data pen;
input Blend Treat$ Yield @@;
cards;
1 A 89 3 C 87
1 B 88 3 D 85
1 C 97 4 A 87
1 D 94 4 B 92
2 A 84 4 C 89
2 B 77 4 D 84
2 C 92 5 A 79
2 D 79 5 B 81
3 A 81 5 C 80
3 B 87 5 D 88
;
* Moment Method;
proc glm data=pen;
class Blend Treat;
model Yield = Blend Treat;
run;
* Restricted Maximum Likelihood Method;
proc mixed data=pen;
class Blend Treat;
model Yield = Treat / solution ddfm=sat;
random Blend / solution ;
run;
* Maximum Likelihood Method;
proc mixed data=pen method=ML;
class Blend Treat;
model Yield = Treat / solution ddfm=sat;
random Blend / solution ;
run;
R program
# set up data.frame with factors Flasks, Blends and Treat and response variable Yield
RCBDPen.dat <- data.frame(Blend=factor(rep(c(1,2,3,4,5), times=c(4,4,4,4,4))),
In breeding experiments, when families or clones are evaluated, the measurements are made at the individual level within the plots in order to estimate the variability within the plot.
For clone evaluation (same genes), as in sugar cane, potato or manioc, the variation within the plot is due only to the environment.
The same is true for homozygous lines.
For segregant families of plants, the phenotypic variation within the plots is due to two components, one genetic and another environmental; that is, the phenotypic variance ($\sigma^2_W$) is equal to the environmental variance within the plots ($\sigma^2_E$) plus the genetic variance within the families ($\sigma^2_G$).
This type of information allows the geneticist to obtain estimates of genetic parameters such as heritability and the expected gain from selection.
Eucalyptus data (Ramalho et al., 2013)
Example
For the evaluation of progenies of Eucalyptus camaldulensis, an RCBD was performed using 10 progenies as treatments and three blocks. The response variable was the wood volume (m³ × 10⁻⁴) of six trees per plot.
[Data table: wood volume per tree (I–VI within plot), by progeny and block (I–III), with progeny means.]
ANOVA at individual level: consider a randomized complete block design with subsampling,
$$Y_{ijk} = \mu + \beta_i + \tau_j + \varepsilon_{ij} + \varepsilon_{ijk}, \quad i = 1,\ldots,r,\ j = 1,\ldots,t,\ k = 1,\ldots,s,$$
where $\mu$ is a constant, $\beta_i$ is the effect of the i-th block (fixed), $\tau_j$ is the effect of the j-th treatment (random), $\varepsilon_{ij}$ is the experimental error at the plot level and $\varepsilon_{ijk}$ is the effect of the individual (e.g. plant) k within plot ij. The ANOVA table is
Source                  df               SSq              MSq                              E(MSq)                   F
Blocks                  r − 1            Y′Q_B Y                                           σ²_W + sσ²_e + q_B(Ψ)
Plots[Blocks]           r(t − 1)         Y′Q_P Y
  Treatments            t − 1            Y′Q_T Y          Y′Q_T Y/(t − 1)                  σ²_W + sσ²_e + rsσ²_T    MSqT/MSqRes
  Residual              (r − 1)(t − 1)   Y′Q_{U_Res} Y    Y′Q_{U_Res} Y/[(r − 1)(t − 1)]   σ²_W + sσ²_e             MSqRes/MSqW
Samples[Blocks∧Plots]   rt(s − 1)        Y′Q_{U_W} Y      Y′Q_{U_W} Y/[rt(s − 1)]          σ²_W
Then,
$$SSqB = \frac{1}{ts}\sum_i B_i^2 - C, \qquad C = \frac{\left(\sum_{i,j,k} Y_{ijk}\right)^2}{n},$$
$$SSqT = \frac{1}{rs}\sum_{j=1}^{t} T_j^2 - C,$$
$$SSqPlots = \frac{1}{s}\sum_{i,j} Y_{ij.}^2 - C,$$
$$SSqPlots[Blocks] = \frac{1}{s}\sum_{i,j} Y_{ij.}^2 - C - SSqB = \frac{1}{s}\sum_{i,j} Y_{ij.}^2 - \frac{1}{ts}\sum_i B_i^2,$$
$$SSqResidual = SSqPlots[Blocks] - SSqT,$$
$$SSqWithin = \sum_{i,j,k} Y_{ijk}^2 - C - SSqPlots[Blocks].$$
Heritability coefficient
The amount of genetic variation among the individuals of a species of crop or domesticated animal can be compared with the amount of variation due to non-genetic causes in a ratio called the heritability.
The heritability of a trait is defined as
$$h^2 = \frac{\sigma^2_G}{\sigma^2_{Ph}}$$
where $\sigma^2_G$ is the genetic component of variance, i.e. the part of the variation in the organism's phenotype (its observable traits) that is due to genetic effects, and $\sigma^2_{Ph}$ is the phenotypic variance, i.e. the variance due to the combined effects of genotype and environment.
For the Eucalyptus example,
$$h^2 = \frac{\sigma^2_P}{\frac{\sigma^2_W}{rs} + \frac{\sigma^2_e}{r} + \sigma^2_P} = \frac{679.78}{\frac{5094.38}{3 \times 6} + \frac{808.06}{3} + 679.78} = 0.5517.$$
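The heritability computation can be checked with a Python sketch (an illustration, not part of the slides), using the variance-component estimates reported above:

```python
# Heritability at the progeny-mean level for the Eucalyptus example:
# h2 = sigma2_P / (sigma2_W/(r*s) + sigma2_e/r + sigma2_P),
# with r = 3 blocks and s = 6 trees per plot.
sigma2_P, sigma2_e, sigma2_W = 679.78, 808.06, 5094.38
r, s = 3, 6
h2 = sigma2_P / (sigma2_W / (r * s) + sigma2_e / r + sigma2_P)
print(round(h2, 4))  # 0.5517
```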
## Components of variance - Moment method
(sigma2_W <- MSqWithin)
(sigma2_Res <- (MSqRes - MSqWithin)/6)
(sigma2_P <- (MSqP - MSqRes)/(3*6))
(sigma2_W/sigma2_Res)
(h2 <- sigma2_P/(sigma2_P + sigma2_Res/3 + sigma2_W/18))
## Estimate of Var(sigma2_P)
(Var_sigma2_P <- 2/(3^2*6^2)*(MSqP^2/(dfP+2) + MSqRes^2/(dfRes+2)))
sqrt(Var_sigma2_P)

## BLUE and EBLUP for tau - calculating step by step
(ybar <- mean(RCBDk_Eucaliptus.dat$volume))
(mean_P <- tapply(RCBDk_Eucaliptus.dat$volume, RCBDk_Eucaliptus.dat$progeny, mean))
mean(mean_P)  ## mean of the progeny means
(mean_B <- tapply(RCBDk_Eucaliptus.dat$volume, RCBDk_Eucaliptus.dat$block, mean))
mean(mean_B)
(tau_BLUE <- mean_P - ybar)
tau_EBLUPc <- tau_BLUE*sigma2_P/(sigma2_P + sigma2_Res/3 + sigma2_W/18)

## REML using library lme4
library(lme4)
RCBDk_Eucaliptus.REML <- lmer(volume ~ block + (1|progeny) + (1|block:plots),
                              data=RCBDk_Eucaliptus.dat, REML=TRUE)
summary(RCBDk_Eucaliptus.REML)
summary(RCBDk_Eucaliptus.REML)@coefs
data.frame(summary(RCBDk_Eucaliptus.REML)@REmat)

## EBLUP for tau and shrunk means
tau_EBLUP <- ranef(RCBDk_Eucaliptus.REML)[[2]]
round(sum(tau_EBLUP), 2)
mm <- model.matrix(terms(RCBDk_Eucaliptus.REML), RCBDk_Eucaliptus.dat)
RCBDk_Eucaliptus.dat$distance <- mm %*% fixef(RCBDk_Eucaliptus.REML)
mu_EBLUP <- RCBDk_Eucaliptus.dat$distance + tau_EBLUP
(Blup <- data.frame(round(tau_BLUE,1), round(tau_EBLUP,1),
                    round(mean_P,1), round(mu_EBLUP,1)))
plot(Blup[[3]], Blup[[4]], pch='*', xlim=c(100,200), ylim=c(100,200),
     xlab='Unadjusted means', ylab='Shrunk means')
abline(0,1)
Randomized Incomplete Block Design
In many situations the number of treatments is large and, given the heterogeneity of the experimental conditions, there is a need to use blocks.
However, blocks with too many plots could also become heterogeneous.
In breeding experiments, for example, it is common to have 100 or more cultivars of corn to evaluate.
In other situations, there is not enough material to use.
In biological work on animals, for example, it will be desirable, if at all possible, to compare several treatments within litters, but the size of the litter will depend on the particular species and will often be such that it is impossible to include all the treatments within a litter.
The randomized incomplete block design can be of three types:
Balanced – here are included the "balanced incomplete block designs (BIBD)" and the "balanced lattice squares"
Partially balanced – here are included the "lattice squares" and the "partially balanced incomplete block designs (PBIBD)"
Unbalanced
Definition: A balanced incomplete block design (BIBD) is one in which each of the t treatments is replicated r times and occurs at most once in each of the b blocks, each of which contains k plots, and the arrangement of treatments in blocks is such that each pair of treatments occurs together in the same number (λ) of blocks. (Brien, 2010)
The first condition means that the total number of units is tr = bk
The second condition implies that the total number of plots with other treatments in the blocks in which a treatment occurs is λ(t − 1) = r(k − 1)
A BIBD cannot exist if these first two conditions are not met
However, both of these conditions being satisfied does not imply that a BIBD must exist.
For example, a BIBD does not exist for t = 15, k = 5, b = 21, r = 7 and λ = 2, even though both conditions are satisfied.
Such designs are not orthogonal; they are, however, balanced.
That is to say, they are not orthogonal because treatments are confounded with both blocks and plots within blocks.
They are balanced because all comparisons between treatments are confounded with blocks to the same extent, as they are with plots within blocks.
It can be shown that for a BIBD the proportion of the information within blocks is $e_2 = \frac{t\lambda}{kr}$ and between blocks is $e_1 = 1 - e_2$.
These proportions are called the canonical efficiency factors; they always take values between zero and one, and sum to one for a particular randomized term, in this case Treatments.
It is desirable that $e_2$ be as close to one as possible; this implies that as much of the information as possible is confounded with plots, which are less variable than blocks.
Designs can be obtained from Cochran and Cox (1957) and Box, Hunter and Hunter (2005), or can be generated as follows.
Suppose t = 4, k = 3, b = 4 ⇒ r = 3 and λ = 2:

Blocks:  I   II   III   IV
         A   A    A     B
         B   B    C     C
         C   D    D     D

$e_2 = \frac{4 \times 2}{3 \times 3} = 0.8889$ and $e_1 = 1 - 0.8889 = 0.1111$; that is, 88.89% of the information about treatments is between plots within blocks.
Randomization: the treatment combinations are randomized to the blocks and the treatments in a block are randomized to the plots (dae).
ANOVA table
E(MSq)
Source                      df
Blocks                      b − 1
  Treatments (interblock)   t − 1    σ² + e₁q_{T_B}(Ψ)    σ² + kσ² …

Note that there are two Treatment lines in the analysis, the first being referred to as the "interblock" Treatment line and the second as the "intrablock" Treatment line.
Generally, one tries to have e₂ as close to one as possible and to base conclusions on the intrablock Treatment effects.
Because, when Blocks are fixed, $q_{T_B}$ involves both β's and τ's, it is not possible to separately test for treatment differences between the blocks in this case – the intrablock test for treatments will be the only test for treatments that can be performed here.
Thus it is preferable to designate Blocks as random, if it is appropriate.
Barley data – unbalanced
Example
The data here are from a field trial of barley breeding lines (Galwey, page 87). The lines studied were derived from a cross between two parent varieties, "Chebec" and "Harrington". They were "double haploid" lines, which means they were obtained by a laboratory technique that ensures that all plants within the same breeding line are genetically identical, so that the line will breed true. This feature improves the precision with which genetic variation among the lines can be estimated. The trial considered here was arranged in two randomized blocks. Within each block, each line occupied a single rectangular field plot. All lines were present in Block I but, due to limited seed stocks, some were absent from Block II. The grain yield (g/m²) was measured in each field plot.
Barley data
[Data table: grain yield (g/m²) of each line in Blocks I and II.]
The model is
$$Y_{jk} = \mu + \beta_j + \tau_k + \varepsilon_{jk},$$
where $Y_{jk}$ is the grain yield of the plot receiving the k-th breeding line in the j-th block; $\mu$ is the grand mean of the grain yield; $\beta_j$ is the effect of the j-th block; and $\tau_k$ is the effect of the k-th breeding line, being the line sown in the jk-th block–plot combination.
It is natural in this case to consider block as a random effect, that is, $\beta_j \sim N(0, \sigma^2_B)$ and $\varepsilon_{jk} \sim N(0, \sigma^2)$.
Note that the cross Chebec × Harrington could produce many lines besides those studied here, and the lines in this field trial may reasonably be considered as a random sample from this population of potential lines.
Thus it is reasonable to consider "line" as a random-effect term, that is, to assume that $\tau_k \sim N(0, \sigma^2_L)$.
Using the R aov function with Error, the ANOVA table is

Source       df   SSq        MSq        F   p-value
Blocks       1
Treatments   1    58079.91   58079.91
The estimate of variance due to breeding lines is about double theresidual variance.
The ML estimates of the variance components are smaller than theREML estimates
Note that the estimates of the fixed parameter µ using ML andREML don’t differ much.
The likelihood ratio test for $\sigma^2_L = 0$ is obtained by fitting the full model and the reduced model from which the term "line" is omitted.
By comparing the deviances of the two models, the contribution made by the term "line" to the fit of the model can be assessed, provided that the deviances were obtained from models with the same fixed-effect terms.
Using the R lme4 library, the deviances for the full and reduced models are, respectively, 1880.37 and 1919.67. The likelihood ratio statistic with 1 d.f. is
$$\mathrm{Dev}_{\text{reduced model}} - \mathrm{Dev}_{\text{full model}} = 1919.67 - 1880.37 = 39.30.$$
Note that the R lme4 library uses deviances from an ML estimation.
In a similar way, the likelihood ratio test for $\sigma^2_B = 0$ is obtained by fitting the full model and the reduced model from which the term "block" is omitted. The likelihood ratio statistic with 1 d.f. is
$$\mathrm{Dev}_{\text{reduced model}} - \mathrm{Dev}_{\text{full model}} = 2 \times 940.18508 - 2 \times 940.18504 = 0.00008.$$
These results are similar but not identical to GENSTAT (see Galwey,page 101-104)
R results
summary(RIBD_barley.REML)
Linear mixed model fit by REML
Formula: yield_g_m2 ~ 1 + (1 | fblock) + (1 | fline)
Heritability. The prediction of genetic advance under selection.
The heritability for the barley data can be calculated by
$$h^2 = \frac{\sigma^2_L}{\sigma^2_L + \frac{\sigma^2}{r^*}} = \frac{30666.89}{30666.89 + \frac{13225.75}{1.55}} = 0.7823$$
where $r^*$ is the number of replications per line.
One way to calculate $r^*$ is to use
$$r^* = \frac{t}{\sum_{k=1}^{t} \frac{1}{r_k}} = \frac{83}{24 \times \frac{1}{1} + 59 \times \frac{1}{2}} = 1.55,$$
similar to the value 1.63 given by Galwey, on page 106.
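The effective replication and heritability can be checked with a Python sketch (an illustration, not from the slides; with the unrounded $r^*$ the heritability comes out at about 0.782, very close to the slides' 0.7823, which uses the rounded $r^* = 1.55$):

```python
# Effective replication r* = t / sum_k(1/r_k) and heritability
# h2 = sigma2_L / (sigma2_L + sigma2/r*) for the barley data:
# 24 lines with one replicate and 59 lines with two (t = 83).
sigma2_L, sigma2 = 30666.89, 13225.75
t = 24 + 59
r_star = t / (24 * (1 / 1) + 59 * (1 / 2))
h2 = sigma2_L / (sigma2_L + sigma2 / r_star)
print(round(r_star, 2), round(h2, 3))  # 1.55 0.782
```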
The heritability can be used to calculate the expected genetic advance under selection in a plant or animal breeding programme.
This is given by the formula
$$G_s = i\,\sigma_{Ph}\,h^2$$
where i is an index of selection.
The index is defined in relation to the standard normal distribution: that is, the distribution of a variable Z such that
$$Z \sim N(0, 1).$$
It is the value of Z that corresponds to the fraction k of the population that is to be selected.
The best linear unbiased predictor or "shrunk" estimate
The adjustment to obtain the random-effect mean is made as follows. The true mean of the k-th breeding line is represented by
$$\mu_k = \mu + \tau_k.$$
In the table of means presented (using the R lme4 library), this value is estimated by
$$\hat\mu_k = \frac{\sum_{j=1}^{r_k} y_{jk}}{r_k},$$
where $y_{jk}$ is the j-th observation of the k-th breeding line and $r_k$ is the number of observations of the k-th breeding line. The overall mean of the population of breeding lines, $\mu$, is estimated by
$$\hat\mu = 572.5.$$
Note that this is not quite the same as the mean of all observations (= 581.1) or the mean of the line means (= 569.1). Then
$$\mu_k = \mu + \tau_k \Rightarrow \tau_k = \mu_k - \mu.$$
An estimate of $\tau_k$ is given by
$$BLUE_k = \hat\tau_k = \hat\mu_k - \hat\mu.$$
To allow for the expectation that high-yielding lines in the present trial will perform less well in a future trial – and that low-yielding lines will perform better – the BLUE is replaced by a "shrunk estimate" called the best linear unbiased predictor (BLUP):
$$BLUP_k = BLUE_k \times \text{shrinkage factor} = (\hat\mu_k - \hat\mu)\,\frac{\sigma^2_L}{\sigma^2_L + \frac{\sigma^2}{r_k}}.$$
This relationship, combined with the constraint
$$\sum_{k=1}^{t} BLUP_k = 0,$$
where t is the number of breeding lines, determines the value of $\hat\mu$ as well as those of the BLUPs.
A new estimate of the mean for the k-th breeding line is then given by
$$\hat\mu'_k = \hat\mu' + BLUP_k.$$
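The shrinkage formula can be sketched in Python (an illustration, not from the slides: the variance components and $\hat\mu = 572.5$ are the barley REML values reported above, while the line mean of 700.0 is a hypothetical value used only to show the effect of the number of replicates):

```python
# BLUP shrinkage of a line's BLUE toward zero:
#   BLUP_k = BLUE_k * sigma2_L / (sigma2_L + sigma2/r_k).
# Variance components and mu_hat are from the barley REML fit in the slides;
# the line mean 700.0 below is a hypothetical illustration.
sigma2_L, sigma2, mu_hat = 30666.89, 13225.75, 572.5

def blup(line_mean, r_k):
    blue = line_mean - mu_hat
    return blue * sigma2_L / (sigma2_L + sigma2 / r_k)

# A line observed twice is shrunk less than a line observed once:
print(round(blup(700.0, 2) / (700.0 - mu_hat), 3))  # 0.823
print(round(blup(700.0, 1) / (700.0 - mu_hat), 3))  # 0.699
```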
## REML considering fblock and fline as random effects
## using library lme4
library(lme4)
RIBD_barley.REML <- lmer(yield_g_m2 ~ 1 + (1|fblock) + (1|fline),
                         data=RIBD_barley, REML=TRUE)
summary(RIBD_barley.REML)
summary(RIBD_barley.REML)@coefs
data.frame(summary(RIBD_barley.REML)@REmat)

## a likelihood ratio test for sigma2_L
RIBD_barley.REML_L0 <- lmer(yield_g_m2 ~ 1 + (1|fblock),
                            data=RIBD_barley, REML=TRUE)
summary(RIBD_barley.REML)
summary(RIBD_barley.REML_L0)
anova(RIBD_barley.REML, RIBD_barley.REML_L0)

## a likelihood ratio test for sigma2_B
(RIBD_barley.REML_B0 <- lmer(yield_g_m2 ~ 1 + (1|fline),
                             data=RIBD_barley, REML=TRUE))
anova(RIBD_barley.REML, RIBD_barley.REML_B0)

## ML considering fblock and fline as random effects
## using library lme4
(RIBD_barley.ML <- lmer(yield_g_m2 ~ 1 + (1|fblock) + (1|fline),
                        data=RIBD_barley, REML=FALSE))
(RIBD_barley.ML_L0 <- lmer(yield_g_m2 ~ 1 + (1|fblock),
                           data=RIBD_barley, REML=FALSE))
summary(RIBD_barley.ML)
summary(RIBD_barley.ML_L0)
anova(RIBD_barley.ML_L0, RIBD_barley.ML)
# means
unad_mean <- tapply(RIBD_barley$yield_g_m2, RIBD_barley$fline, mean)  ## unadjusted line means
mean(unad_mean)                ## mean of the line means
mean(RIBD_barley$yield_g_m2)   ## general mean (mean of all observations)
Latin square designs (LS)
Sometimes we need more than one type of blocking. In general, call one sort of blocks "rows" and the other sort "columns".
Definition: A Latin square design is one in which
each treatment occurs once and only once in each row and each column, so that the numbers of rows, columns and treatments are all equal.
Clearly, the total number of observations is n = t2.
Suppose in a field trial moisture is varying across the field and the stoniness down the field.
A Latin square can eliminate both sources of variability.
Sugarcane experiment
Suppose there are five different varieties of sugarcane to be compared, and suppose that moisture varies across the field and stoniness down the field.
A Latin square design for this would be as follows:
5 × 5 Latin square (varieties A, B, C, D, E):

            Column
Row     1  2  3  4  5
  I     A  B  C  D  E     less stony end of field
 II     C  D  E  A  B
III     E  A  B  C  D     (stoniness increases down the field)
 IV     B  C  D  E  A
  V     D  E  A  B  C     stonier end of field

        less moisture  ⇒  more moisture
Even if one has not identified trends in two directions, a Latin square may be employed to guard against the problem of putting the blocks in the wrong direction.
Latin squares may also be used when there are two different kinds of blocking variables, for example, animals and times.
The general principle is to maximize row and column differences so as to minimize the amount of uncontrolled variation affecting treatment comparisons.
The major disadvantage with the Latin square is that you are restricted to having the number of replicates equal to the number of treatments.
Several fundamentally different Latin squares exist for a particular t; for t = 4 there are three different squares. A collection of Latin squares for t = 3, 4, . . . , 9 is given in Appendix 8A of Box, Hunter and Hunter.
Randomizing these designs appropriately involves the following:
1. randomly select one of the designs for a value of t;
2. randomly permute the rows and then the columns;
3. randomly assign letters to treatments.
Note: there are no nested factors, as Rows and Columns are to be randomized independently.

Hence they are not nested (they are crossed).
Generally we will use R to obtain randomized layouts.
General instructions are given in Appendix B (Chris Brien's notes), Randomized layouts and sample size computations in R.
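The three randomization steps can be sketched as follows (a hypothetical Python helper, not the course's R routines; step 1 is simplified by always starting from the cyclic square):

```python
import random

def random_latin_square(t, seed=None):
    """Randomize a t x t Latin square: permute rows, then columns, then letters."""
    rng = random.Random(seed)
    square = [[(i + j) % t for j in range(t)] for i in range(t)]  # cyclic base square
    rng.shuffle(square)                                  # 2a. permute the rows
    cols = list(range(t))
    rng.shuffle(cols)                                    # 2b. permute the columns
    square = [[row[c] for c in cols] for row in square]
    letters = list("ABCDEFGHI"[:t])
    rng.shuffle(letters)                                 # 3. assign letters to treatments
    return [[letters[x] for x in row] for row in square]

for row in random_latin_square(5, seed=1):
    print(" ".join(row))
```

Permuting rows and columns of a Latin square preserves the Latin property, so every layout produced this way is a valid design; what this sketch skips is the initial random choice among the fundamentally different squares for a given t.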
Latin Square Design
Consider a Latin square design,
yij = µ+ βi + γj + τk(ij) + εij
where µ is a constant, βi is the effect of the i-th row, γj is the effect of the j-th column, τk is the effect of the k-th treatment applied to plot (i, j), and εij is the experimental error associated with the (i, j)-th plot. The ANOVA table is
Source          df               SSq           MSq
Rows            t − 1            Y′Q_R Y       Y′Q_R Y / (t − 1)
Columns         t − 1            Y′Q_C Y       Y′Q_C Y / (t − 1)
Rows:Columns    (t − 1)²
  Treatments    t − 1            Y′Q_T Y       Y′Q_T Y / (t − 1)
  Residual      (t − 1)(t − 2)   Y′Q_Res Y     Y′Q_Res Y / [(t − 1)(t − 2)]
Total           t² − 1

where SSqR = (1/t) Σ_{i=1}^{t} R_i² − C, SSqC = (1/t) Σ_{j=1}^{t} C_j² − C, SSqT = (1/t) Σ_{k=1}^{t} T_k² − C, and C = (Σ_{i,j,k} Y_ijk)² / n.
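These sums of squares can be computed directly from the row, column and treatment totals; a sketch with made-up yields for a 3 × 3 square (the same A B C / C A B / B C A square used below):

```python
import numpy as np

# made-up yields; treatments[i][j] gives the treatment in cell (i, j)
Y = np.array([[10.0, 12.0, 14.0],
              [13.0, 11.0, 12.0],
              [15.0, 10.0, 13.0]])
treatments = [["A", "B", "C"],
              ["C", "A", "B"],
              ["B", "C", "A"]]

t = Y.shape[0]
n = t * t
C = Y.sum() ** 2 / n                        # correction factor C = (sum Y)^2 / n
SSqR = (Y.sum(axis=1) ** 2).sum() / t - C   # rows:    (1/t) sum R_i^2 - C
SSqC = (Y.sum(axis=0) ** 2).sum() / t - C   # columns: (1/t) sum C_j^2 - C
T = {}
for i in range(t):
    for j in range(t):
        T[treatments[i][j]] = T.get(treatments[i][j], 0.0) + Y[i, j]
SSqT = sum(v ** 2 for v in T.values()) / t - C  # treatments: (1/t) sum T_k^2 - C
SSqTotal = (Y ** 2).sum() - C
SSqRes = SSqTotal - SSqR - SSqC - SSqT      # residual on (t - 1)(t - 2) df
print(SSqR, SSqC, SSqT, SSqRes)
```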
Considering t = 3:

        Column
Row    1  2  3
  1    A  B  C
  2    C  A  B
  3    B  C  A
In matrix notation, for a fixed model,
Y = XGµ + XRβ + XCγ + XTτ + ε
[y11]   [1]        [1 0 0]        [1 0 0]        [1 0 0]
[y12]   [1]        [1 0 0]        [0 1 0]        [0 1 0]
[y13]   [1]        [1 0 0]        [0 0 1]        [0 0 1]
[y21]   [1]        [0 1 0]        [1 0 0]        [0 0 1]
[y22] = [1] µ  +   [0 1 0] β  +   [0 1 0] γ  +   [1 0 0] τ  +  ε
[y23]   [1]        [0 1 0]        [0 0 1]        [0 1 0]
[y31]   [1]        [0 0 1]        [1 0 0]        [0 1 0]
[y32]   [1]        [0 0 1]        [0 1 0]        [0 0 1]
[y33]   [1]        [0 0 1]        [0 0 1]        [1 0 0]
In general we can have at least five types of models:
1 Fixed model with βi, γj and τk as fixed effects and εij ∼ N(0, σ²);

2 Mixed model with γj and τk as fixed effects, βi ∼ N(0, σ²R) and εij ∼ N(0, σ²);

3 Mixed model with βi and γj as fixed effects, τk ∼ N(0, σ²T) and εij ∼ N(0, σ²);

4 Mixed model with βi as a fixed effect, γj ∼ N(0, σ²C), τk ∼ N(0, σ²T) and εij ∼ N(0, σ²);

5 A random model with βi ∼ N(0, σ²R), γj ∼ N(0, σ²C), τk ∼ N(0, σ²T) and εij ∼ N(0, σ²).
1. Fixed model
βi , γj and τk as fixed effects and εij ∼ N(0, σ2)
It is likely to be necessary to use either the each or times arguments to generate the replicate combinations.

The syntax of fac.gen and examples are given in Appendix B, Randomized layouts and sample size computations in R.

In Yates order, as opposed to standard order, the first factor changes fastest and the last slowest, whereas in standard order the first factor changes slowest and the last fastest.
Summary of advantages of factorial experiments
To summarize, relative to one-factor-at-a-time experiments, factorial experiments have the advantages that:

1. if the factors interact, factorial experiments allow this to be detected and estimates of the interaction effect can be obtained; and
2. if the factors are independent, factorial experiments result in the estimation of the main effects with greater precision.
CRD Factorial
yijk = µ+ αi + βj + τij + εijk
where i = 1, ..., a; j = 1, ..., b; k = 1, ..., r; µ is a constant; αi is the effect of the i-th level of factor A; βj is the effect of the j-th level of factor B; τij is the interaction effect of the i-th level of A with the j-th level of B; εijk is the experimental error associated with the (i, j, k)-th plot.
Considering a = 3, b = 4 and r = 2:
A2B1 A3B4 A1B1 A3B1
A2B4 A3B3 A1B3 A2B3
A2B2 A1B1 A3B2 A1B2
A1B2 A3B1 A2B2 A1B4
A2B1 A1B4 A3B2 A2B4
A1B3 A2B3 A3B3 A3B4
[Randomization diagram (unrandomized and randomized tiers) for the CRD factorial: the abr plots, Plot P (abr − 1 df), receive the randomized factors A (a − 1 df), B (b − 1 df) and A∧B, A#B ((a − 1)(b − 1) df); the expected contributions are σ²p for plots and qA(ψ), qB(ψ), qAB(ψ) for A, B and A#B.]
Source       d.f.             E(QM)
Plots        rab − 1
  A          a − 1            σ²p + qA(ψ)
  B          b − 1            σ²p + qB(ψ)
  A#B        (a − 1)(b − 1)   σ²p + qAB(ψ)
  Residual   ab(r − 1)        σ²p
Total        rab − 1
CRDB Factorial
yijk = µ+ γk + αi + βj + τij + εijk
where i = 1, ..., a; j = 1, ..., b; k = 1, ..., r; µ is a constant; γk is the effect of the k-th block; αi is the effect of the i-th level of factor A; βj is the effect of the j-th level of factor B; τij is the interaction effect of the i-th level of A with the j-th level of B; εijk is the experimental error associated with the (i, j, k)-th plot.
Block I:    A3B4  A1B2  A2B2  A1B3
            A2B4  A2B1  A1B1  A2B3
            A1B4  A3B1  A3B2  A3B3
Block II:   A2B3  A1B3  A3B1  A3B2
            A3B4  A2B1  A1B4  A2B2
            A1B1  A2B4  A1B2  A3B3
[Randomization diagram (unrandomized and randomized tiers) for the RCBD factorial: Block Bl (r − 1 df) and Plot∧Block, P[Bl] (r(ab − 1) df), receive the randomized factors A (a − 1 df), B (b − 1 df) and A∧B, A#B ((a − 1)(b − 1) df); the expected contributions are σ²PB + qBl(ψ) for blocks, σ²PB for plots within blocks, and qA(ψ), qB(ψ), qAB(ψ) for A, B and A#B.]
Source           d.f.              E(QM)
Block            r − 1             σ²PB + qBl(ψ)
Plots[Blocks]    r(ab − 1)
  A              a − 1             σ²PB + qA(ψ)
  B              b − 1             σ²PB + qB(ψ)
  A#B            (a − 1)(b − 1)    σ²PB + qAB(ψ)
  Residual       (ab − 1)(r − 1)   σ²PB
Total            rab − 1
A Design of split-plot experiments
Designs in which some main effects are confounded with more variable units, such as large plots.

Their defining attribute is that there is randomization to two different physical entities, such that some main effects are randomized to the more variable entities.
Definition: The standard split-plot design is one in which two factors, say A and B with a and b levels respectively, are assigned as follows:

one of the factors, A say, is randomized according to an RCBD with say r blocks; and
each of its ra plots, called the main plots, is split into b subplots (or split-plots), and the levels of B are randomized independently in each main plot.

Altogether the experiment involves n = rab subplots.
That is, the generic factor names for this design are Blocks, MainPlots, SubPlots, A and B.
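The standard split-plot randomization can be sketched as follows (a hypothetical Python helper, shown instead of the course's R code):

```python
import random

def split_plot_layout(a, b, r, seed=None):
    """A in an RCBD with r blocks; B randomized independently within each main plot."""
    rng = random.Random(seed)
    layout = []
    for block in range(1, r + 1):
        a_levels = [f"A{i}" for i in range(1, a + 1)]
        rng.shuffle(a_levels)              # randomize A to the a main plots of this block
        for main_plot in a_levels:
            b_levels = [f"B{j}" for j in range(1, b + 1)]
            rng.shuffle(b_levels)          # independent randomization of B within the main plot
            layout.append((block, main_plot, b_levels))
    return layout

for block, a_level, subplots in split_plot_layout(a=3, b=4, r=2, seed=7):
    print(block, a_level, subplots)
```

Each tuple is one main plot: its block, its level of A, and the randomized order of the b levels of B over its subplots.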
Split-plot principle
Very flexible principle that can be used to generate a large number of different types of experiments.

For example, the main plots could be arranged in any of a CRD, RCBD, Latin square, BIBD or Youden square, and each plot of the design is subdivided into subplots.

The subplots may utilize more complicated designs as well.

For example, the main plots may be arranged in an RCBD, each of which is subdivided in such a way as to allow a Latin square to be placed in each main plot.

Also, subplots can be split into subsubplots and subsubplots into ...

Nor is one restricted to applying just one factor to each type of unit.

More than one factor can be randomized to main plots, more than one to subplots, and so on.

The standard split-plot design is nearly the simplest possibility; only a CRD in the main plots would be simpler.
When to use a split-plot design
1 When the physical attributes of a factor require the use of larger units of experimental material than other factors.

For example, land preparation treatments usually need to be performed on larger areas of land than does the sowing of different varieties (due to the different pieces of equipment).
Temperature control for, say, storage purposes involves the use of relatively large chambers in which several samples can usually be stored.
Different processing runs are often of a minimum size such that their produce can be readily subdivided for the application of further treatments.
Also, some factors are relatively hard to change. For example, the temperature of a production operation is often difficult to change, so it might be better to change it less often by making it a main-plot factor.
2 When it is desired to incorporate an additional factor into an experiment.

3 When it is expected that differences amongst the levels of certain factors are larger than amongst those of other factors.

The levels of the factors with larger differences are randomized to main plots. One effect of this may be to increase the precision of comparisons between the levels of the other factors.

4 When it is desired to ensure greater precision for some factors than for others.

Irrespective of the size of the differences between the main-plot treatment factors, it is desired to increase the precision of some factors by assigning them to subplots. One may be less interested in the main effects of some factors; a particular example of such factors is "noise" factors.
Notes
Note that the last two of these situations utilise the anticipated greater variability of main plots relative to subplots.

That is, we are expecting the larger units to be more variable than the smaller units. This will be expressed in the models and E[MSq]s for these experiments.

In describing the type of study, you need to identify the main-plot and subplot designs.
Ravioli data

Here we will illustrate the design with data from an evaluation of four commercial brands of ravioli by nine trained assessors (Guillermo Hough, DESA-ISETA, Argentina). The purpose of the study was to identify differences in taste and texture between the brands. Knowledge of such differences is of great commercial importance to food manufacturers, but difficult to obtain: these sensory characteristics must ultimately be assessed by the subjective impressions of a human observer, which vary among individuals, and over occasions in the same individual. However, if the subjective assessment of some aspect of taste or texture (such as saltiness or gumminess) is consistent, for a particular brand, among individuals and over occasions (that is, if the perceived differences between brands are statistically significant), it is safe to conclude that these differences are real.

Differences among assessors are of less interest. Different individuals may simply be using different parts of the assessment scale to describe the same sensations: who can say whether food tastes saltier to you than it does to me? However, if there are significant interactions between brand and assessor (for example, if the assessor ANA consistently perceives Brand A as saltier than Brand B, whereas GUI consistently ranks these brands in the opposite order), this is of interest to the investigator.
CRD Split-plot
yijk = µ+ αi + eik + βj + τij + εijk
where i = 1, ..., a; j = 1, ..., b; k = 1, ..., r; µ is a constant; αi is the effect of the i-th level of factor A; eik is the experimental error associated with the i-th level of A on the k-th main plot; βj is the effect of the j-th level of factor B; τij is the interaction effect of the i-th level of A with the j-th level of B; εijk is the experimental error associated with the (i, j, k)-th subplot.
Considering a = 3, b = 4 and r = 2:
A3:   B1  B2  B4  B1
      B4  B3  B3  B2
A1:   B3  B4  B1  B2
      B2  B1  B3  B4
A2:   B1  B4  B2  B4
      B2  B3  B3  B1
[Randomization diagram (unrandomized and randomized tiers) for the CRD split-plot: the ar main plots, Plot P (ar − 1 df), and the subplots, Subplot∧Plot S[P] (ar(b − 1) df), receive A (a − 1 df) on main plots and B (b − 1 df) and A∧B, A#B ((a − 1)(b − 1) df) on subplots; the expected contributions are σ² + bσ²p for main plots, σ² for subplots, and qA(ψ), qB(ψ), qAB(ψ) for A, B and A#B.]
yijk = µ+ γk + αi + eik + βj + τij + εijk
where i = 1, ..., a; j = 1, ..., b; k = 1, ..., r; µ is a constant; γk is the effect of the k-th block; αi is the effect of the i-th level of factor A; eik is the experimental error associated with the i-th level of A on the k-th main plot; βj is the effect of the j-th level of factor B; τij is the interaction effect of the i-th level of A with the j-th level of B; εijk is the experimental error associated with the (i, j, k)-th subplot.
Considering a = 3, b = 4 and r = 2:
Block I:    A3:  B1  B2  B4  B3
            A1:  B4  B3  B1  B2
            A2:  B3  B4  B1  B2
Block II:   A2:  B1  B3  B4  B2
            A1:  B2  B3  B1  B4
            A3:  B3  B4  B2  B1
[Randomization diagram (unrandomized tier) for the RCBD split-plot: 1 (µ), Block Bl (r − 1 df), Plot∧Block P[Bl] (r(a − 1) df) and Subplot∧Plot∧Block S[P∧Bl] (ar(b − 1) df), with A (a − 1 df) randomized to main plots and B (b − 1 df) and A∧B, A#B ((a − 1)(b − 1) df) randomized to subplots.]
[Randomization diagram (randomized tier) for the RCBD split-plot: Block carries σ² + bσ²p + qBl(ψ); Plot∧Block, P[Bl], carries σ² + bσ²p and receives qA(ψ); Subplot∧Plot∧Block, S[P∧Bl], carries σ² and receives qB(ψ) and qAB(ψ).]
Source            d.f.              E(QM)                 F
Blocks            r − 1             σ² + bσ²P + qBl(ψ)
Plots             r(a − 1)
  A               a − 1             σ² + bσ²P + qA(ψ)     QMA/QMResA
  Residual A      (a − 1)(r − 1)    σ² + bσ²P
Subplots[Plots]   ra(b − 1)
  B               b − 1             σ² + qB(ψ)            QMB/QMResB
  A#B             (a − 1)(b − 1)    σ² + qAB(ψ)           QMAB/QMResB
  Residual B      a(b − 1)(r − 1)   σ²
Total             rab − 1
If βj ∼ N(0, σ²B):

H01: σ²AB = 0
H02: σ²B = 0
H03: µA1 = µA2 = . . . = µAa

Source            d.f.              E(QM)                        F
Blocks            r − 1             σ² + bσ²P + qBl(ψ)
Plots             r(a − 1)
  A               a − 1             σ² + bσ²P + rσ²AB + qA(ψ)    z (under H03)
  Residual A      (a − 1)(r − 1)    σ² + bσ²P
Subplots[Plots]   ra(b − 1)
  B               b − 1             σ² + rσ²AB + raσ²B           QMB/QMAB (under H02)
  A#B             (a − 1)(b − 1)    σ² + rσ²AB                   QMAB/QMResB (under H01)
  Residual B      a(b − 1)(r − 1)   σ²
Total             rab − 1

z: F = (QMA + QMResB) / (QMResA + QMA#B) ∼ F(ν1, ν2) approximately, with Satterthwaite degrees of freedom

ν1 = (QMA + QMResB)² / [QMA²/(a − 1) + QMResB²/(a(b − 1)(r − 1))]

and

ν2 = (QMResA + QMA#B)² / [QMResA²/((r − 1)(a − 1)) + QMA#B²/((a − 1)(b − 1))]
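The approximate test and its Satterthwaite degrees of freedom can be written as a small function (the mean squares below are made-up numbers, not from a real analysis):

```python
def satterthwaite_F(QMA, QMResA, QMAB, QMResB, a, b, r):
    """F = (QMA + QMResB) / (QMResA + QMA#B) with Satterthwaite df nu1, nu2."""
    F = (QMA + QMResB) / (QMResA + QMAB)
    nu1 = (QMA + QMResB) ** 2 / (
        QMA ** 2 / (a - 1) + QMResB ** 2 / (a * (b - 1) * (r - 1)))
    nu2 = (QMResA + QMAB) ** 2 / (
        QMResA ** 2 / ((r - 1) * (a - 1)) + QMAB ** 2 / ((a - 1) * (b - 1)))
    return F, nu1, nu2

# hypothetical mean squares for a = 3, b = 4, r = 3
F, nu1, nu2 = satterthwaite_F(QMA=40.0, QMResA=8.0, QMAB=6.0, QMResB=4.0, a=3, b=4, r=3)
print(F, nu1, nu2)
```

The observed F would then be referred to an F distribution with (ν1, ν2) degrees of freedom, which are generally non-integer.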
The brands of ravioli were cooked, served into small dishes and presented hot to the assessors. Three replicate evaluations were made, each being completed on a single day; hence each day comprised a block. There may have been uncontrolled and unobserved variation from day to day in the cooking and serving conditions; for example, the temperature of the room may have changed. On each day, the order in which the four brands were presented to the assessors was randomised. However, on any given day, all the assessors received the brands in the same order: for this type of product it is complicated to randomise the order of presentation among assessors. Hence each presentation of a brand comprised a main plot: the brand varied only between presentations, but the whole set of assessors received the brand within each presentation. Each serving, in a single dish, comprised a sub-plot. During each presentation, the servings were shuffled before being taken to the assessors; thus the assessors were informally randomised over the sub-plots within each main plot. (It would have been cumbersome to follow a formal randomisation at this stage: it was more important to get the servings to the assessors while they were still hot.)

Each assessor gave the serving presented to him or her a numerical score for saltiness.
Hierarchical classification or nested classification model
Experimental designs with hierarchical classification are frequently used in agricultural, genetic, industrial, medical and other types of research.

Cochran (1939) described a sampling scheme for estimating wheat production: samples of farms were selected from six districts; at the next stage, samples of fields were selected from each of the selected farms; at the final stage, measurements on the yield of wheat were obtained from sample "paths" in each of the selected fields.

For demographic, political and socioeconomic studies, samples of geographic regions, counties, districts and towns are selected in succession.

Similar procedures are employed in geographical studies on rock formation, mineral deposition and soil erosion.

These types of designs are also used in studies related to water and atmospheric pollution, and in environmental and ecological studies.

Nested designs can be balanced or unbalanced, the classification factors can be fixed or random, and the variances can be equal or unequal.
Further illustrations (Rao, 1997)
A number of applications of the nested designs appeared in the literature,as, for example,
Fabric differences: Tippett (1931) presents an experiment for examining the properties of four fabrics. Three tests were performed on each of the fabrics, and each test was repeated four times.

Blood pressure measurements: Canner et al. (1991) used data from the Hypertension Prevention Trial, conducted in the U.S. during 1983-1986, and a nested model to examine errors made in measuring blood pressure of individuals. The effects of the participants in the study, their visits, and duplicate measurements on each visit were included in the model and were all assumed to be random.

Eye examination: Rosner (1982) used a nested model to study items such as "intraocular pressures in persons". The model consists of groups, individuals in the groups, and the measurements on both eyes. The groups were considered to be fixed, and the remaining two factors random. For some items, the measurement on one of the eyes was missing. For some other items, the condition being examined by the ophthalmologist existed in only one of the eyes.
Asparagus clones: For patenting asparagus clones, a plant producer used estimates of means and variances of their important characteristics. For future experimentation, estimates of the variance components of the cladophylls, "the tiny leaves located on the asparagus branches," were also obtained (Trout, 1985). The study was conducted by selecting, in stages, five stalks from a clone, two branches from each stalk, five nodes on each branch, and three cladophylls from each node. Variance components were obtained from the lengths of the 150 cladophylls at the nodes.

Experimental drugs: Patients with certain diseases are hospitalized and administered suitable medical treatments. Experimental drugs can be examined by administering them to the patients receiving each of the treatments. In a split-plot experiment, the treatments and drugs are considered to be the main- and sub-plot treatments, respectively.

Spectral density: Jackson and Lawton (1969) examined the consequences of estimating a spectral density through a nested classification.
Textile production: Bainbridge (1965) suggests a staggered design for detecting sources of variation occurring in industrial production, and illustrates it through a chemical test on a specific textile. From a large number of machines, two were selected on each of forty-two days; the sample from one machine was tested by two analysts on different shifts, one of them obtaining duplicate measurements. The sample from the second machine was analysed only once, by one analyst. The data from this experiment were used for studying the variation arising from (1) changes in the raw material over the days, (2) differences in the machines, (3) long-term tests at the different shifts, and (4) short-term tests through the duplicate measurements. This four-stage design is unbalanced.

Animal breeding: In several experiments on animal breeding, each of a sample of sires is randomly mated to a sample of dams. The observations from the offspring are analyzed through the model for a nested design.
Calf birth weight
Example
In an animal breeding experiment, 20 unrelated cows were subjected to superovulation and artificial insemination. Each group of 4 cows was inseminated by a different sire, with a total of 5 unrelated sires. From each mating (combination of dam and sire), three calves were obtained and their yearling weights were recorded.
no interest in each individual sire or dam, which depend very much on the circumstances

the sire effect can be viewed as a sample of a random sire effect (levels are chosen at random from an infinite set of sire levels)

the dam effect can be viewed as a sample of a random dam effect (levels are chosen at random from an infinite set of dam levels)

interest lies in estimating the variances of the sire and dam effects as sources of random variation in the data

the three calves with the same parents share something, which presumably violates the assumption of independence
[Diagram: nested structure of the design: sire S1 with dams D1, D2, D3, D4, ..., sire S5 with dams D17, D18, D19, D20.]
[Table layout: for each sire, Dams 1-4 each with calves C1, C2, C3.]
Hierarchical classification model
The three stages of the illustration can be represented by the model
yijk = µ+ αi + β(i)j + ε(ij)k
where µ is the grand mean, αi is the effect of the i-th sire, β(i)j is the effect of the j-th dam inseminated by the i-th sire, and ε(ij)k is the effect of the k-th calf born to the j-th dam mated with the i-th sire. Assuming
αi ∼ N(0, σ²s), β(i)j ∼ N(0, σ²d) and ε(ij)k ∼ N(0, σ²)
αi, β(i)j and ε(ij)k are mutually independent; moreover αi and αi′ (i ≠ i′), β(i)j and β(i′)j′ (i ≠ i′ and/or j ≠ j′), and ε(ij)k and ε(i′j′)k′ (i ≠ i′, j ≠ j′ and/or k ≠ k′) are independent
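For the balanced design (s sires, d dams per sire, n calves per dam), ANOVA (method-of-moments) estimators of σ²s, σ²d and σ² follow from the expected mean squares E(MS_sire) = σ² + nσ²d + dnσ²s, E(MS_dam) = σ² + nσ²d and E(MS_within) = σ². A sketch with simulated data (the true variance components below are chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)
s, d, n = 5, 4, 3                           # sires, dams per sire, calves per dam
sigma2_s, sigma2_d, sigma2 = 9.0, 4.0, 1.0  # true components used to simulate

# simulate y_ijk = mu + alpha_i + beta_(i)j + eps_(ij)k
alpha = rng.normal(0.0, np.sqrt(sigma2_s), s)
beta = rng.normal(0.0, np.sqrt(sigma2_d), (s, d))
y = 100.0 + alpha[:, None, None] + beta[:, :, None] \
    + rng.normal(0.0, np.sqrt(sigma2), (s, d, n))

# balanced nested ANOVA mean squares
grand = y.mean()
MS_sire = d * n * ((y.mean(axis=(1, 2)) - grand) ** 2).sum() / (s - 1)
MS_dam = n * ((y.mean(axis=2) - y.mean(axis=(1, 2))[:, None]) ** 2).sum() / (s * (d - 1))
MS_within = ((y - y.mean(axis=2)[:, :, None]) ** 2).sum() / (s * d * (n - 1))

# method-of-moments estimators of the variance components
sigma2_hat = MS_within
sigma2_d_hat = (MS_dam - MS_within) / n
sigma2_s_hat = (MS_sire - MS_dam) / (d * n)
print(sigma2_s_hat, sigma2_d_hat, sigma2_hat)
```

These estimators can be negative in small samples; REML, as used earlier with lmer, constrains the variance components to be nonnegative.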