Statistical Modelling Chapter V

V. Latin square designs (LS)
V.A Design of Latin squares
V.B Indicator-variable models and estimation for a Latin square
V.C Hypothesis testing using the ANOVA method for a Latin square
V.D Diagnostic checking
V.E Treatment differences
V.F Design of sets of Latin squares
V.G Hypothesis tests for sets of Latin squares
V.A Design of Latin squares
• Definition V.1: A Latin square design is one in which
  – each treatment occurs once and only once in each row and each column
  – so that the numbers of rows, columns and treatments are all equal.
• Clearly, the total number of observations is $n = t^2$.
• Suppose that, in a field trial, moisture varies across the field and stoniness varies down the field.
• A Latin square can eliminate both sources of variability.
Example V.1 Fertilizer experiment: 5 × 5 Latin square

                    Column
             1    2    3    4    5
       I     A    B    C    D    E      (less stony end of field)
       II    C    D    E    A    B
  Row  III   E    A    B    C    D
       IV    B    C    D    E    A
       V     D    E    A    B    C      (stonier end of field)

       Less moisture   →   More moisture

(Fertilizers A, B, C, D, E)
Notes
• Even if one has not identified trends in two directions, a LS may be employed to guard against the problem of putting the blocks in the wrong direction.
• LSs may also be used when there are two different kinds of blocking variables, for example animals and times.
• The general principle is to maximize row and column differences so as to minimize the uncontrolled variation affecting treatment differences.
• A problem is the restriction that no. of replicates = no. of treatments.
• Several fundamentally different LSs exist for a particular t
  – for t = 4 there are three different squares.
  – Latin squares for t = 3, 4, ..., 9 are given in Appendix 8A of Box, Hunter and Hunter.
• To randomize these designs appropriately involves:
  1. randomly select one of the Latin squares available for t;
  2. randomly permute the rows and then the columns;
  3. randomly assign letters to treatments.
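The randomization steps above can be sketched in base R. This is only a minimal sketch, not the procedure of Appendix B: the function name `randomize.ls` is hypothetical, and step 1 is simplified by starting from a single cyclic standard square instead of choosing at random among all the squares available for t.

```r
# Hypothetical helper: randomize a t x t Latin square using base R only.
# Assumption: step 1 is simplified to a single cyclic standard square.
randomize.ls <- function(treats) {
  n.t <- length(treats)
  # Cyclic standard square: cell (i, j) holds letter number ((i + j - 2) mod t) + 1
  std <- outer(seq_len(n.t), seq_len(n.t), function(i, j) (i + j - 2) %% n.t + 1)
  # Step 2: randomly permute the rows and then the columns
  sq <- std[sample(n.t), , drop = FALSE]
  sq <- sq[, sample(n.t), drop = FALSE]
  # Step 3: randomly assign the letters to treatments
  lay <- matrix(sample(treats)[sq], nrow = n.t)
  dimnames(lay) <- list(Row = seq_len(n.t), Column = seq_len(n.t))
  lay
}

set.seed(5101)  # arbitrary seed so that the layout can be reproduced
ls.layout <- randomize.ls(c("A", "B", "C", "D"))
print(ls.layout)
```

Permuting whole rows and whole columns, and relabelling the letters, cannot break the Latin property, so every layout produced this way still has each treatment once in each row and each column.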
a) Obtaining a layout for a Latin square in R
• General instructions are given in Appendix B, Randomized layouts and sample size computations in R.

Example V.2 Pollution effects of petrol additives
• 4 cars and 4 drivers are used in a study of the effects of 4 petrol additives on pollution.
• It is desirable to isolate both car-to-car and driver-to-driver differences.
• A 4 × 4 Latin square would enable this to be done.
• The names for rows, columns and treats for this example are Cars, Drivers and Additives, respectively.
• Also, t = 4 and a design is obtained from BH2.
• Note that, for the general systematic layout, $X_R = I_t \otimes 1_t$ and $X_C = 1_t \otimes I_t$, but $X_T$ cannot be written as a direct product.
Estimators of expected values under max. model

$\hat{\psi}_{R+C+T} = \bar{R} + \bar{C} + \bar{T} - 2\bar{G}$

where $\bar{R}$, $\bar{C}$, $\bar{T}$ and $\bar{G}$ are the $t^2$-vectors of row, column, treatment and grand means, respectively.

Also, note that $\bar{R} = M_R Y$, $\bar{C} = M_C Y$, $\bar{T} = M_T Y$ and $\bar{G} = M_G Y$. That is, $M_R$, $M_C$, $M_T$ and $M_G$ are the row, column, treatment and grand mean operators, respectively. So once again the estimators are functions of means. Further, if the data in the vector Y have been arranged in standard order for Rows then Columns, the operators are:

$M_G = t^{-2} J_t \otimes J_t$
$M_R = t^{-1} I_t \otimes J_t$
$M_C = t^{-1} J_t \otimes I_t$

In this case it is not possible to write $M_T$ as a direct product of I and J matrices, as the treatments will not be in a systematic order expressible in this form.
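The direct-product form of the mean operators can be checked numerically. The following is a minimal sketch in base R, assuming t = 4, data in standard order for Rows then Columns, and a cyclic treatment layout chosen purely for illustration; the object names and the simulated responses are hypothetical.

```r
n.t <- 4
I.t <- diag(n.t)
J.t <- matrix(1, n.t, n.t)

# Grand, row and column mean operators as direct (Kronecker) products
M.G <- kronecker(J.t, J.t) / n.t^2
M.R <- kronecker(I.t, J.t) / n.t
M.C <- kronecker(J.t, I.t) / n.t

# Treatments are not in a systematic order, so the treatment mean
# operator is built from the indicator matrix X.T instead
rows  <- rep(1:n.t, each = n.t)
cols  <- rep(1:n.t, times = n.t)
treat <- factor(LETTERS[(rows + cols - 2) %% n.t + 1])  # hypothetical layout
X.T <- unname(model.matrix(~ treat - 1))
M.T <- X.T %*% solve(t(X.T) %*% X.T) %*% t(X.T)

set.seed(1)
Y <- rnorm(n.t^2, mean = 20, sd = 3)  # hypothetical responses

# Each operator replaces an observation by the corresponding mean
R.bar <- drop(M.R %*% Y)
C.bar <- drop(M.C %*% Y)
T.bar <- drop(M.T %*% Y)
print(all.equal(R.bar, ave(Y, rows)))
print(all.equal(C.bar, ave(Y, cols)))
print(all.equal(T.bar, ave(Y, treat)))

# All four operators are symmetric and idempotent
is.idempotent <- function(M) isTRUE(all.equal(M %*% M, M))
print(sapply(list(G = M.G, R = M.R, C = M.C, T = M.T), is.idempotent))
```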
b) Alternative expectation models
• 8 possible different models for the expectation when Rows, Columns and Treatments are considered fixed:

  $\psi_G = X_G\mu$                                  no treatment, row or column differences
  $\psi_R = X_R\beta$                                row differences only
  $\psi_C = X_C\gamma$                               column differences only
  $\psi_{R+C} = X_R\beta + X_C\gamma$                additive row and column differences
  $\psi_T = X_T\tau$                                 treatment differences only
  $\psi_{R+T} = X_R\beta + X_T\tau$                  additive row and treatment differences
  $\psi_{C+T} = X_C\gamma + X_T\tau$                 additive column and treatment differences
  $\psi_{R+C+T} = X_R\beta + X_C\gamma + X_T\tau$    additive row, column and treatment differences
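Marginality relations between the models

  $\psi_G$ is marginal to $\psi_R$, $\psi_C$, $\psi_{R+C}$, $\psi_T$, $\psi_{R+T}$, $\psi_{C+T}$, $\psi_{R+C+T}$
  $\psi_R$ is marginal to $\psi_{R+C}$, $\psi_{R+T}$, $\psi_{R+C+T}$
  $\psi_C$ is marginal to $\psi_{R+C}$, $\psi_{C+T}$, $\psi_{R+C+T}$
  $\psi_T$ is marginal to $\psi_{R+T}$, $\psi_{C+T}$, $\psi_{R+C+T}$
  $\psi_{R+C}$, $\psi_{R+T}$, $\psi_{C+T}$ are marginal to $\psi_{R+C+T}$

• Estimators

  $\hat{\psi}_G = \bar{G}$
  $\hat{\psi}_R = \bar{R}$
  $\hat{\psi}_C = \bar{C}$
  $\hat{\psi}_{R+C} = \bar{R} + \bar{C} - \bar{G}$
  $\hat{\psi}_T = \bar{T}$
  $\hat{\psi}_{R+T} = \bar{R} + \bar{T} - \bar{G}$
  $\hat{\psi}_{C+T} = \bar{C} + \bar{T} - \bar{G}$
  $\hat{\psi}_{R+C+T} = \bar{R} + \bar{C} + \bar{T} - 2\bar{G}$

  All are functions of the four mean vectors for this design.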
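That the estimators are functions of the four mean vectors can be checked against least-squares fits. The following is a minimal sketch in base R, assuming a hypothetical 4 × 4 cyclic Latin square and simulated responses; none of the names or values come from the examples in these notes.

```r
n.t <- 4
dat <- data.frame(Rows    = factor(rep(1:n.t, each = n.t)),
                  Columns = factor(rep(1:n.t, times = n.t)))
# Hypothetical cyclic layout: each treatment once per row and per column
dat$Treats <- factor(LETTERS[(as.integer(dat$Rows) +
                              as.integer(dat$Columns) - 2) %% n.t + 1])
set.seed(2)
dat$Y <- rnorm(n.t^2, mean = 20, sd = 3)  # simulated responses

# The four mean vectors, each of length t^2
G.bar <- rep(mean(dat$Y), n.t^2)
R.bar <- ave(dat$Y, dat$Rows)
C.bar <- ave(dat$Y, dat$Columns)
T.bar <- ave(dat$Y, dat$Treats)

# Estimators built from the means alone
psi.RC  <- R.bar + C.bar - G.bar
psi.RCT <- R.bar + C.bar + T.bar - 2 * G.bar

# Least-squares fitted values for the same two models
fit.RC  <- unname(fitted(lm(Y ~ Rows + Columns, data = dat)))
fit.RCT <- unname(fitted(lm(Y ~ Rows + Columns + Treats, data = dat)))

print(all.equal(psi.RC, fit.RC))
print(all.equal(psi.RCT, fit.RCT))
```

The agreement relies on the balance of the Latin square: rows, columns and treatments are mutually orthogonal, so each set of means can be estimated separately.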
V.C Hypothesis testing using the ANOVA method for a Latin square
• An ANOVA will be used to choose between the 8 alternative expectation models for a Latin square.
a) Analysis of an example
• Example V.2 Pollution effects of petrol additives (continued)

Results for the Latin square experiment

                          Cars
  Drivers      1       2       3       4
  I          B 20    D 20    C 17    A 15
  II         A 20    B 27    D 23    C 26
  III        D 20    C 25    A 21    B 26
  IV         C 16    A 16    B 15    D 13

(Additives A, B, C, D)
Hypothesis test for the example

Step 1: Set up hypotheses

a) H0: $\tau_A = \tau_B = \tau_C = \tau_D$ (or $X_A\tau$ not required in model)
   H1: not all population Additives means are equal
b) H0: $\beta_I = \beta_{II} = \beta_{III} = \beta_{IV}$ (or $X_D\beta$ not required in model)
   H1: not all population Drivers means are equal
c) H0: $\gamma_1 = \gamma_2 = \gamma_3 = \gamma_4$ (or $X_C\gamma$ not required in model)
   H1: not all population Cars means are equal

Set $\alpha = 0.05$.
Hypothesis test for the example (continued)

Step 2: Calculate test statistics

  Source          df     MSq     E[MSq]                    F      Prob
  Drivers          3    72.00    $\sigma^2 + q_D(\psi)$    27.0   <0.001
  Cars             3     8.00    $\sigma^2 + q_C(\psi)$     3.0    0.117
  Drivers#Cars     9
    Additives      3    13.33    $\sigma^2 + q_A(\psi)$     5.0    0.045
    Residual       6     2.67    $\sigma^2$
  Total           15

• Note that Drivers#Cars refers to the "interaction between Drivers and Cars"
  – this contrasts with Cars[Drivers] or Drivers[Cars];
  – explained in chapter VII;
  – R does not distinguish between them, as all are Drivers:Cars.
Hypothesis test for the example (continued)

Step 3: Decide between hypotheses

There are differences between the drivers and between the additives, but not between the cars.

The model that best describes the data would appear to be $\psi_{D+A} = X_D\beta + X_A\tau$, an additive model for Driver and Additive effects.
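The test statistics for this example can be reproduced in base R. In this sketch the data frame is typed in from the results table for the experiment rather than loaded from a file, the names `polut` and `polut.aov` are hypothetical, and the model is fitted without an Error term so that all three F statistics appear in a single table.

```r
# Data entered by hand from the results table (Drivers are rows, Cars are columns)
polut <- data.frame(
  Drivers   = factor(rep(c("I", "II", "III", "IV"), each = 4)),
  Cars      = factor(rep(1:4, times = 4)),
  Additives = factor(c("B", "D", "C", "A",
                       "A", "B", "D", "C",
                       "D", "C", "A", "B",
                       "C", "A", "B", "D")),
  Reduct.NO = c(20, 20, 17, 15,
                20, 27, 23, 26,
                20, 25, 21, 26,
                16, 16, 15, 13))

# Additive model for Drivers, Cars and Additives; no Error term
polut.aov <- aov(Reduct.NO ~ Drivers + Cars + Additives, data = polut)
tab <- summary(polut.aov)[[1]]
print(tab)

# F statistics and p-values for Drivers, Cars and Additives (rows 1 to 3)
F.values <- tab[1:3, "F value"]
p.values <- tab[1:3, "Pr(>F)"]
# Terms significant at the 0.05 level
print(p.values < 0.05)
```

With the data as tabulated, the F values are 27, 3 and 5, in agreement with the ANOVA table for Step 2; only Drivers and Additives are significant at the 0.05 level.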
b) Sums of squares for the analysis of variance

• In this section we will use the generic names of Rows, Columns and Treatments for the factors in a Latin square.
• The estimators of the SSqs for the Latin square ANOVA are the SSqs of the following vectors:

  Rows SSq:          $R_e = \bar{R} - \bar{G}$
  Columns SSq:       $C_e = \bar{C} - \bar{G}$
  Rows#Columns SSq:  $D_{R+C} = Y - \bar{R} - \bar{C} + \bar{G}$
  Treatments SSq:    $T_e = \bar{T} - \bar{G}$
  Residual SSq:      $D_{R+C+T} = Y - \bar{R} - \bar{C} - \bar{T} + 2\bar{G} = Y - R_e - C_e - T_e - \bar{G}$

where
• Ds are n-vectors of deviations from Y and
• vectors with the e subscripts are n-vectors of effects.
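b) SSq for the ANOVA (continued)
• From section V.B, Models and estimation for a Latin square,

  $\bar{G} = M_G Y = t^{-2}(J_t \otimes J_t)Y$
  $\bar{R} = M_R Y = t^{-1}(I_t \otimes J_t)Y$
  $\bar{C} = M_C Y = t^{-1}(J_t \otimes I_t)Y$
  $\bar{T} = M_T Y$

• Can be shown that

  $D_G'D_G = (Y - \bar{G})'(Y - \bar{G}) = Y'Q_UY$ with $Q_U = M_U - M_G$
  $R_e'R_e = (\bar{R} - \bar{G})'(\bar{R} - \bar{G}) = Y'Q_RY$ with $Q_R = M_R - M_G$
  $C_e'C_e = (\bar{C} - \bar{G})'(\bar{C} - \bar{G}) = Y'Q_CY$ with $Q_C = M_C - M_G$
  $D_{R+C}'D_{R+C} = (Y - \bar{R} - \bar{C} + \bar{G})'(Y - \bar{R} - \bar{C} + \bar{G}) = Y'Q_{RC}Y$ with $Q_{RC} = M_U - M_R - M_C + M_G$
  $T_e'T_e = (\bar{T} - \bar{G})'(\bar{T} - \bar{G}) = Y'Q_TY$ with $Q_T = M_T - M_G$
  $D_{R+C+T}'D_{R+C+T} = (Y - \bar{R} - \bar{C} - \bar{T} + 2\bar{G})'(Y - \bar{R} - \bar{C} - \bar{T} + 2\bar{G}) = Y'Q_{RC_{Res}}Y$ with $Q_{RC_{Res}} = M_U - M_R - M_C - M_T + 2M_G$

All the Ms and Qs are symmetric and idempotent.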
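The quadratic forms can be evaluated directly for Example V.2. The following is a minimal sketch in base R, assuming the responses are typed in from the results table in standard order for Drivers (rows) then Cars (columns); the object names are hypothetical.

```r
n.t <- 4
# Responses and additives, row by row from the results table
Y <- c(20, 20, 17, 15,
       20, 27, 23, 26,
       20, 25, 21, 26,
       16, 16, 15, 13)
Additives <- factor(c("B", "D", "C", "A",
                      "A", "B", "D", "C",
                      "D", "C", "A", "B",
                      "C", "A", "B", "D"))

I.t <- diag(n.t)
J.t <- matrix(1, n.t, n.t)
M.U <- diag(n.t^2)                        # identity operator for units
M.G <- kronecker(J.t, J.t) / n.t^2        # grand mean operator
M.R <- kronecker(I.t, J.t) / n.t          # row (Drivers) mean operator
M.C <- kronecker(J.t, I.t) / n.t          # column (Cars) mean operator
X.T <- unname(model.matrix(~ Additives - 1))
M.T <- (X.T %*% t(X.T)) / n.t             # each additive occurs n.t times

# Q matrices as differences of mean operators
Q <- list(Rows       = M.R - M.G,
          Columns    = M.C - M.G,
          Treatments = M.T - M.G,
          Residual   = M.U - M.R - M.C - M.T + 2 * M.G,
          Total      = M.U - M.G)

# Sums of squares as quadratic forms Y'QY
SSq <- sapply(Q, function(Qm) drop(t(Y) %*% Qm %*% Y))
print(SSq)

# Each Q is symmetric and idempotent
idem <- sapply(Q, function(Qm) isTRUE(all.equal(Qm %*% Qm, Qm)))
symm <- sapply(Q, function(Qm) isTRUE(all.equal(Qm, t(Qm))))
print(rbind(idem, symm))
```

The sums of squares come out as 216, 24, 40 and 16 for Drivers, Cars, Additives and Residual, with a Total of 296; dividing the first four by their degrees of freedom (3, 3, 3 and 6) gives the mean squares 72, 8, 13.33 and 2.67 of the example's ANOVA table.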
ANOVA table is constructed as follows:

  Source          df              SSq                     MSq                                                         F                         p
  Rows            t − 1           $Y'Q_RY$                $s^2_R = Y'Q_RY/(t-1)$                                      $s^2_R/s^2_{RC_{Res}}$    $p_R$
  Columns         t − 1           $Y'Q_CY$                $s^2_C = Y'Q_CY/(t-1)$                                      $s^2_C/s^2_{RC_{Res}}$    $p_C$
  Rows#Columns    (t − 1)²        $Y'Q_{RC}Y$
    Treatments    t − 1           $Y'Q_TY$                $s^2_T = Y'Q_TY/(t-1)$                                      $s^2_T/s^2_{RC_{Res}}$    $p_T$
    Residual      (t − 1)(t − 2)  $Y'Q_{RC_{Res}}Y$       $s^2_{RC_{Res}} = Y'Q_{RC_{Res}}Y/\{(t-1)(t-2)\}$
  Total           t² − 1          $Y'Q_UY$

• See notes for an example of the computation of the vectors and their geometrical interpretation.
c) Expected mean squares

• To justify our choice of test statistics, we want to work out the E[MSq]s in the ANOVA table under the 8 alternative expectation models.
• However, to save space, we work out the E[MSq]s under the maximal model and identify which terms in the E[MSq]s go to zero under the alternative models.
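E[MSq]s with fixed Rows and Columns effects

  Source          df              MSq                                                   E[MSq]                    F
  Rows            t − 1           $s^2_R = Y'Q_RY/(t-1)$                                $\sigma^2 + q_R(\psi)$    $s^2_R/s^2_{RC_{Res}}$
  Columns         t − 1           $s^2_C = Y'Q_CY/(t-1)$                                $\sigma^2 + q_C(\psi)$    $s^2_C/s^2_{RC_{Res}}$
  Rows#Columns    (t − 1)²
    Treatments    t − 1           $s^2_T = Y'Q_TY/(t-1)$                                $\sigma^2 + q_T(\psi)$    $s^2_T/s^2_{RC_{Res}}$
    Residual      (t − 1)(t − 2)  $s^2_{RC_{Res}} = Y'Q_{RC_{Res}}Y/\{(t-1)(t-2)\}$     $\sigma^2$
  Total           t² − 1

where

  $q_R(\psi) = \dfrac{\psi'Q_R\psi}{t-1} = \dfrac{t\sum_{i=1}^{t}(\beta_i - \bar{\beta}_.)^2}{t-1}$,
  $q_C(\psi) = \dfrac{\psi'Q_C\psi}{t-1} = \dfrac{t\sum_{j=1}^{t}(\gamma_j - \bar{\gamma}_.)^2}{t-1}$ and
  $q_T(\psi) = \dfrac{\psi'Q_T\psi}{t-1} = \dfrac{t\sum_{k=1}^{t}(\tau_k - \bar{\tau}_.)^2}{t-1}$.

• Given the expressions in the above table, the population means of the mean squares could be computed if we knew the $\beta_i$s, $\gamma_j$s, $\tau_k$s and $\sigma^2$.
• Each of $q_R(\psi)$, $q_C(\psi)$ and $q_T(\psi)$ equals 0 when the terms $X_R\beta$, $X_C\gamma$ and $X_T\tau$, respectively, are removed from the model.
• Hence a significant F value for a line indicates that the corresponding term should be included in the model.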
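The two forms of a q function, as a quadratic form in the expectation vector and as a scaled sum of squared deviations of the effects, can be compared numerically. The following is a minimal sketch in base R for the Rows function, assuming t = 4, an expectation vector in standard order that contains only row effects, and hypothetical effect values.

```r
n.t <- 4
row.effects <- c(-2, 4, 3, -5)              # hypothetical row effects
psi <- rep(20 + row.effects, each = n.t)    # expectation vector, Rows then Columns

# Q.R = M.R - M.G for data in standard order
M.R <- kronecker(diag(n.t), matrix(1, n.t, n.t)) / n.t
M.G <- matrix(1, n.t^2, n.t^2) / n.t^2
Q.R <- M.R - M.G

# Quadratic-form version and sum-of-squares version of q for Rows
q.quad <- drop(t(psi) %*% Q.R %*% psi) / (n.t - 1)
q.sum  <- n.t * sum((row.effects - mean(row.effects))^2) / (n.t - 1)
print(c(q.quad = q.quad, q.sum = q.sum))

# With no row differences the function is zero
psi.null <- rep(20, n.t^2)
q.null <- drop(t(psi.null) %*% Q.R %*% psi.null) / (n.t - 1)
print(q.null)
```

The second computation illustrates why the term drops out of the E[MSq] once the corresponding effects are removed from the model.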
Alternative analysis
• Both Rows and Columns are random
• The model in this case would be that

  $E[Y] = \psi_T = X_T\tau$ and
  $V = \sigma^2 M_U + t\sigma^2_R M_R + t\sigma^2_C M_C$
  $\;\;\; = \sigma^2 I_t \otimes I_t + t\sigma^2_R t^{-1} I_t \otimes J_t + t\sigma^2_C t^{-1} J_t \otimes I_t$
  $\;\;\; = \sigma^2 I_t \otimes I_t + \sigma^2_R I_t \otimes J_t + \sigma^2_C J_t \otimes I_t$

• It allows for equal covariance between units from the same row and also between units from the same column.
• This exactly parallels what happens when both are fixed.
• Alternative variance models involve setting $\sigma^2_R = 0$ and/or $\sigma^2_C = 0$, and this will result in the one(s) set to zero being dropped from the expected mean squares.
d) Summary of the hypothesis test
• See notes
e) Comparison with traditional Latin-square ANOVA table
• Differences symbolic – see notes for details
f) Computation of ANOVA and diagnostic checking in R

• Diagnostic checking is the same as for the RCBD.

Example V.2 Pollution effects of petrol additives (continued)

• First set up and attach the data.frame and do initial boxplots.
• Then, use the aov function, either with or without the Error as part of the model.
  – In this experiment the uncontrolled variation is made up of Drivers, Cars and Drivers:Cars.
  – The R shorthand for this is Drivers*Cars, which expands to Drivers + Cars + Drivers:Cars, the latter being equivalent to Drivers#Cars.
• Outputs for the analysis with Error and for diagnostic checking are given below.
R output

> load("LSPolut.dat.rda")
> attach(LSPolut.dat)
> boxplot(split(Reduct.NO, Drivers), xlab="Drivers", ylab="Reduction in NO")
> boxplot(split(Reduct.NO, Cars), xlab="Cars", ylab="Reduction in NO")
> boxplot(split(Reduct.NO, Additives), xlab="Additives", ylab="Reduction in NO")
Boxplots for initial graphical exploration of the data

[Figure: three boxplots of Reduction in NO (vertical axis, ticks from 14 to 26), one each against Drivers (1 to 4), Cars (1 to 4) and Additives (A to D).]
R output (continued)

> LSPolut.aov <- aov(Reduct.NO ~ Drivers + Cars + Additives +
+                    Error(Drivers*Cars), LSPolut.dat)
> summary(LSPolut.aov)

Error: Drivers
        Df Sum Sq Mean Sq
Drivers  3    216      72

Error: Cars
     Df Sum Sq Mean Sq
Cars  3     24       8

Error: Drivers:Cars
          Df Sum Sq Mean Sq F value Pr(>F)
Additives  3 40.000  13.333       5 0.0452
Residuals  6 16.000   2.667

R output (continued)

> #
> # Diagnostic checking
> #
> res <- resid.errors(LSPolut.aov)
> fit <- fitted.errors(LSPolut.aov)
> data.frame(Drivers,Cars,Additives,Reduct.NO,res,fit)
  Drivers Cars Additives Reduct.NO res fit
V.F Design of sets of Latin squares

• To overcome the small residual df problem, several squares can be used.
• In the case of Example V.2, Pollution effects of petrol additives, the Latin square could be repeated:
  1. using the same drivers and cars in each replicate;
  2. using the same drivers but new cars (or the same cars but new drivers); or
  3. using new cars and drivers.
• In general, one can have as many (r) squares as one likes.
  – However, we will only present layouts for 2 squares.
• General expressions for randomizing the various cases are given in Appendix B, Randomized layouts and sample size computations in R.
Case 1: same Drivers and Cars
• This case involves a complete repetition of the experiment, say on consecutive mornings, with the same 4 Drivers and 4 Cars on the two occasions.
• There is no re-randomization of the square for the second occasion; this preserves the crossed relationships between Occasions and the other factors.
• Layout (r=2)

  Occasions          1              2
  Cars            1  2  3  4     1  2  3  4
  Drivers   1     A  B  C  D     A  B  C  D
            2     C  D  A  B     C  D  A  B
            3     D  C  B  A     D  C  B  A
            4     B  A  D  C     B  A  D  C
Case 2: same cars, different drivers
• The experiment is repeated on a different occasion with
  – the same 4 cars on both occasions,
  – but with different drivers on the second occasion.
• As a result the rows of the square, but not the columns, are rerandomized on the second occasion.
• Layout (r=2)

  Occasions          1              2
  Cars            1  2  3  4     1  2  3  4
  Drivers   1     C  A  B  D     D  B  A  C
            2     A  C  D  B     A  C  D  B
            3     B  D  C  A     C  A  B  D
            4     D  B  A  C     B  D  C  A

• Note that the order in which the additives are tested by the third driver on occasion 1 is the same as for the fourth driver on occasion 2.
  – That is, the third row of the square on occasion 1 is the same as the fourth row on occasion 2.
Case 3: different drivers and cars

• In this case,
  – not only are the drivers on different occasions unconnected,
  – but so are the cars, as the cars used on the second occasion are completely different to those used on the first occasion.
• As a result the rows and columns of the square are rerandomized on the second occasion.
• Layout (r=2)

  Occasions          1              2
  Cars            1  2  3  4     1  2  3  4
  Drivers   1     B  A  C  D     D  B  C  A
            2     C  D  B  A     A  C  B  D
            3     A  B  D  C     B  D  A  C
            4     D  C  A  B     C  A  D  B
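The difference between the three cases lies in what is rerandomized on each occasion, and this can be sketched in base R. This is a minimal sketch rather than the expressions of Appendix B: the helper names `is.latin` and `ls.sets` are hypothetical, the starting square is the Case 1 layout, and for Cases 2 and 3 the rows (and columns) are permuted afresh on every occasion, including the first.

```r
# Hypothetical check: each letter once in every row and every column
is.latin <- function(sq) {
  n <- nrow(sq)
  all(apply(sq, 1, function(x) length(unique(x)) == n)) &&
    all(apply(sq, 2, function(x) length(unique(x)) == n))
}

# Hypothetical helper: layouts for r occasions from one starting square
#   case 1: the same square on every occasion (no re-randomization)
#   case 2: rows rerandomized on each occasion
#   case 3: rows and columns rerandomized on each occasion
ls.sets <- function(square, r = 2, case = 1) {
  n <- nrow(square)
  lapply(seq_len(r), function(occ) {
    rows <- if (case >= 2) sample(n) else seq_len(n)
    cols <- if (case == 3) sample(n) else seq_len(n)
    square[rows, cols, drop = FALSE]
  })
}

base.sq <- matrix(c("A", "B", "C", "D",
                    "C", "D", "A", "B",
                    "D", "C", "B", "A",
                    "B", "A", "D", "C"), nrow = 4, byrow = TRUE)

set.seed(47)  # arbitrary seed so that the layouts can be reproduced
case1 <- ls.sets(base.sq, r = 2, case = 1)
case2 <- ls.sets(base.sq, r = 2, case = 2)
case3 <- ls.sets(base.sq, r = 2, case = 3)

print(case2)
print(sapply(c(case1, case2, case3), is.latin))
```

In Case 2 the two occasions always contain the same set of rows in a different order, which is the feature noted for the Case 2 layout; in Case 3 this no longer holds in general because the columns are permuted as well.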
V.G Hypothesis tests for sets of Latin squares

• The previous section discussed the use of several squares to overcome the residual df problem.
  – e.g. a 4 × 4 Latin square has 6 (< 10) Residual df
• It gave 3 cases for Example V.2, Pollution effects of petrol additives:
  1. using the same drivers and cars in each replicate;
  2. using the same drivers but new cars (or the same cars but new drivers); or
  3. using new cars and drivers.
• We shall determine the ANOVA for each of these cases.
• In determining the E[MSq]s it will be assumed that
  – unrandomized factors are to be classified as random factors
  – randomized factors are to be classified as fixed factors.
• While the layouts were for 2 squares, the DF will be given for the general case of r squares.
a) Case 1: same Drivers and Cars
• There is no re-randomization of the square for the second occasion.
• Layout (r=2)

  Occasions          1              2
  Cars            1  2  3  4     1  2  3  4
  Drivers   1     A  B  C  D     A  B  C  D
            2     C  D  A  B     C  D  A  B
            3     D  C  B  A     D  C  B  A
            4     B  A  D  C     B  A  D  C
A. Description of pertinent features of the study

1. Observational unit
   – a car with a driver on an occasion
2. Response variable
   – Reduction
3. Unrandomized factors
   – Occasions, Drivers, Cars
4. Randomized factors
   – Additives
5. Type of study
   – Sets of Latin Squares
B. The experimental structure

  Structure       Formula
  unrandomized    2 Occasions*4 Drivers*4 Cars
  randomized      4 Additives

• For this structure to be appropriate requires that the same square, without re-randomization, be used for each occasion; otherwise, some factors would be nested (as would be the case when randomizing within Occasions).

C. Sources derived from the structure formulae

Occasions*Drivers*Cars
  = (Occasions + Drivers + Occasions#Drivers)*Cars
  = Occasions + Drivers + Occasions#Drivers
    + Cars + Occasions#Cars + Drivers#Cars
    + Occasions#Drivers#Cars

Additives = Additives
D. Degrees of freedom and sums of squares
• Hasse diagrams, with degrees of freedom, for this study are:

[Figure: Hasse diagram for the unrandomized factors; its recoverable content is tabulated here.]

  Term               No. of levels    Source    DF
  μ                  1                          1
  Occ                r                O         r − 1
  Driver             4                D         3
  Car                4                C         3
  Occ∧Driver         4r               O#D       3(r − 1)
  Occ∧Car            4r               O#C       3(r − 1)
  Driver∧Car         16               D#C       9
  Occ∧Driver∧Car     16r              O#D#C     9(r − 1)
Alternative degrees of freedom calculation

• As all factors in the unrandomized structure are crossed, the rule for a set of crossed factors can be used.
• The DF of any source is the no. of levels minus one for each factor in the source, multiplied together.
• For example, since Occasions has r levels and Drivers has 4 levels,
  – the DF of Occasions#Drivers is (r − 1)(4 − 1) = 3(r − 1).
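The rule for crossed factors can be applied mechanically to every source in the unrandomized structure. The following is a minimal sketch in base R, assuming r = 2 occasions; the helper name `crossed.df` and the object names are hypothetical.

```r
# Hypothetical helper: DF of a source formed from crossed factors is the
# product of (no. of levels - 1) over the factors in the source
crossed.df <- function(levels) prod(levels - 1)

r <- 2
lev <- c(Occasions = r, Drivers = 4, Cars = 4)

# Every source is a non-empty subset of the three crossed factors
sources <- unlist(lapply(1:3, function(k) combn(names(lev), k, simplify = FALSE)),
                  recursive = FALSE)
dfs <- sapply(sources, function(s) crossed.df(lev[s]))
names(dfs) <- sapply(sources, paste, collapse = "#")
print(dfs)

# The DFs of all sources add to the Total DF, 16r - 1
print(sum(dfs) == prod(lev) - 1)
```

Changing `r` gives the DF for any number of squares; for example, Occasions#Drivers always has 3(r − 1) DF.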
Hasse diagrams, with M and Q matrices
E. The analysis of variance table

  Source                      df          SSq
  Occasions                   r − 1       $Y'Q_OY$
  Drivers                     3           $Y'Q_DY$
  Cars                        3           $Y'Q_CY$
  Occasions#Drivers           3(r − 1)    $Y'Q_{OD}Y$
  Occasions#Cars              3(r − 1)    $Y'Q_{OC}Y$
  Drivers#Cars                9           $Y'Q_{DC}Y$
    Additives                 3           $Y'Q_AY$
    Residual                  6           $Y'Q_{DC_{Res}}Y$
  Occasions#Drivers#Cars      9(r − 1)    $Y'Q_{ODC}Y$
  Total                       16r − 1
F. Maximal expectation and variation models

• Assume the randomized factor is a fixed factor and that all the unrandomized factors are random factors.
• Then the expectation term is Additives.
• The variation terms are:

G. The expected mean squares
• The Hasse diagram, with contributions to E[MSq]s, for the unrandomized factors in this study is:

• The single randomized factor Additives will contribute $q_A(\psi)$ to the E[MSq] for its source.
ANOVA table with E[MSq]

  Source              df          SSq                  E[MSq]
  Occ                 r − 1       $Y'Q_OY$             $\sigma^2_{ODC} + 4\sigma^2_{OC} + 4\sigma^2_{OD} + 16\sigma^2_O$
  Drivers             3           $Y'Q_DY$             $\sigma^2_{ODC} + r\sigma^2_{DC} + 4\sigma^2_{OD} + 4r\sigma^2_D$
  Cars                3           $Y'Q_CY$             $\sigma^2_{ODC} + r\sigma^2_{DC} + 4\sigma^2_{OC} + 4r\sigma^2_C$
  Occ#Drivers         3(r − 1)    $Y'Q_{OD}Y$          $\sigma^2_{ODC} + 4\sigma^2_{OD}$
  Occ#Cars            3(r − 1)    $Y'Q_{OC}Y$          $\sigma^2_{ODC} + 4\sigma^2_{OC}$
  Drivers#Cars        9           $Y'Q_{DC}Y$
    Additives         3           $Y'Q_AY$             $\sigma^2_{ODC} + r\sigma^2_{DC} + q_A(\psi)$
    Residual          6           $Y'Q_{DC_{Res}}Y$    $\sigma^2_{ODC} + r\sigma^2_{DC}$
  Occ#Drivers#Cars    9(r − 1)    $Y'Q_{ODC}Y$         $\sigma^2_{ODC}$
  Total               16r − 1

• Hypothesis tests for Additives, Occ#Drivers and Occ#Cars are straightforward.
• Tests for Occ, Drivers and Cars are more difficult.
  – For example, to test Occ, we need the ratio of 2 sums of MSqs.
  – DF are a problem.
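To see why a ratio of two sums of mean squares is needed for Occ, the lines of the E[MSq] column can be combined. The following is a worked sketch derived only from those expectations; the notation $\mathrm{MSq}_O$, $\mathrm{MSq}_{OD}$ and so on for the individual mean squares is introduced here purely for illustration.

```latex
\begin{align*}
E[\mathrm{MSq}_{O}] + E[\mathrm{MSq}_{ODC}]
  &= 2\sigma^2_{ODC} + 4\sigma^2_{OC} + 4\sigma^2_{OD} + 16\sigma^2_{O} \\
E[\mathrm{MSq}_{OD}] + E[\mathrm{MSq}_{OC}]
  &= 2\sigma^2_{ODC} + 4\sigma^2_{OD} + 4\sigma^2_{OC}
\end{align*}
```

The two sums differ only by $16\sigma^2_O$, so the ratio $(\mathrm{MSq}_O + \mathrm{MSq}_{ODC})/(\mathrm{MSq}_{OD} + \mathrm{MSq}_{OC})$ is a candidate statistic for testing $\sigma^2_O = 0$. No single line of the table has an expectation equal to that of Occ without the $16\sigma^2_O$ term, and because the numerator and denominator are sums of mean squares rather than single mean squares, the degrees of freedom to use with the ratio are not obvious.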
b) Case 2: same cars, different drivers

• The rows of the square, but not the columns, are rerandomized on the second occasion.
• Layout (r=2)

  Occasions          1              2
  Cars            1  2  3  4     1  2  3  4
  Drivers   1     C  A  B  D     D  B  A  C
            2     A  C  D  B     A  C  D  B
            3     B  D  C  A     C  A  B  D
            4     D  B  A  C     B  D  C  A
A. Description of pertinent features of the study

1. Observational unit
   – a car with a driver on an occasion
2. Response variable
   – Reduction
3. Unrandomized factors
   – Occasions, Drivers, Cars
4. Randomized factors
   – Additives
5. Type of study
   – Sets of Latin Squares
B. The experimental structure

  Structure       Formula
  unrandomized    (2 Occasions/4 Drivers)*4 Cars
  randomized      4 Additives

C. Sources derived from the structure formulae

(Occasions/Drivers)*Cars
  = (Occasions + Drivers[Occasions])*Cars
  = Occasions + Drivers[Occasions] + Cars
    + Occasions#Cars + Drivers#Cars[Occasions]

Additives = Additives

D. Degrees of freedom and sums of squares

Left as an exercise
E. The analysis of variance table

  Source                       df          SSq
  Occasions                    r − 1       $Y'Q_OY$
  Drivers[Occasions]           3r          $Y'Q_{OD}Y$
  Cars                         3           $Y'Q_CY$
  Occasions#Cars               3(r − 1)    $Y'Q_{OC}Y$
  Drivers#Cars[Occasions]      9r          $Y'Q_{ODC}Y$
    Additives                  3           $Y'Q_AY$
    Residual                   9r − 3      $Y'Q_{ODC_{Res}}Y$
  Total                        16r − 1
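The degrees of freedom in this table can be checked for any number of squares. The following is a minimal sketch in base R with a hypothetical helper `case2.df`; it assumes the usual rule that a source with nested factors has (levels − 1) for each factor outside the brackets multiplied by the number of levels of each factor inside the brackets.

```r
# Hypothetical helper: DF for the Case 2 sources with r squares of size n.t
case2.df <- function(r, n.t = 4) {
  c("Occasions"               = r - 1,
    "Drivers[Occasions]"      = r * (n.t - 1),
    "Cars"                    = n.t - 1,
    "Occasions#Cars"          = (r - 1) * (n.t - 1),
    "Drivers#Cars[Occasions]" = r * (n.t - 1)^2)
}

r <- 2
dfs2 <- case2.df(r)
print(dfs2)

# Additives (3 DF) is estimated from Drivers#Cars[Occasions];
# the Residual is what remains of that source
residual.df <- dfs2[["Drivers#Cars[Occasions]"]] - (4 - 1)
print(residual.df)

# The sources add to the Total DF, 16r - 1
print(sum(dfs2) == 16 * r - 1)
```

With r = 2 the Residual has 15 DF, compared with 6 DF for a single 4 × 4 square.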
F. Maximal expectation and variation models

• Assume the randomized factor is a fixed factor and that all the unrandomized factors are random factors.
• Then the expectation term is Additives.
• The variation terms are:
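
c) Case 3: different drivers and cars

E. The analysis of variance table

  Source                       df          SSq
  Occasions                    r − 1       $Y'Q_OY$
  Drivers[Occasions]           3r          $Y'Q_{OD}Y$
  Cars[Occasions]              3r          $Y'Q_{OC}Y$
  Drivers#Cars[Occasions]      9r          $Y'Q_{ODC}Y$
    Additives                  3           $Y'Q_AY$
    Residual                   9r − 3      $Y'Q_{ODC_{Res}}Y$
  Total                        16r − 1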
F. Maximal expectation and variation models

• Assume the randomized factor is a fixed factor and that all the unrandomized factors are random factors.
• Then the expectation term is Additives.
• The variation terms are: