Statistical Modelling Chapter V

V. Latin squares designs (LS)
V.A Design of Latin squares
V.B Indicator-variable models and estimation for a Latin square
V.C Hypothesis testing using the ANOVA method for a Latin square
V.D Diagnostic checking
V.E Treatment differences
V.F Design of sets of Latin squares
V.G Hypothesis tests for sets of Latin squares


V.A Design of Latin squares

• Definition V.1: A Latin square design is one in which
– each treatment occurs once and only once in each row and each column
– so that the numbers of rows, columns and treatments are all equal.

• Clearly, the total number of observations is $n = t^2$.

• Suppose in a field trial moisture is varying across the field and the stoniness down the field.

• A Latin square can eliminate both sources of variability.


Example V.1 Fertilizer experiment

5 x 5 Latin square (Fertilizers A, B, C, D, E)

                 Column
Row      1   2   3   4   5
I        A   B   C   D   E      (less stony end of field)
II       C   D   E   A   B
III      E   A   B   C   D
IV       B   C   D   E   A
V        D   E   A   B   C      (stonier end of field)

         Less moisture  ...  More moisture


Notes

• Even if one has not identified trends in two directions, a LS may be employed to guard against the problem of putting the blocks in the wrong direction.

• LSs may also be used when there are two different kinds of blocking variables, for example animals and times.

• General principle is to maximize row and column differences so as to minimize uncontrolled variation affecting treatment differences.

• Problem is the restriction that no. replicates = no. treatments.

• Several fundamentally different LSs exist for a particular t
– for t = 4 there are three different squares.
– Latin squares for t = 3, 4, ..., 9 are given in Appendix 8A of Box, Hunter and Hunter.

• To randomize these designs appropriately involves:
1. randomly select one of the Latin squares available for t;
2. randomly permute the rows and then the columns;
3. randomly assign letters to treatments.
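The randomization steps above can be sketched in base R. This is only an illustration under stated assumptions: the helper names `randomize_ls` and `is_latin` are hypothetical, and the cyclic starting square stands in for step 1 (choosing one of the available squares), which is not implemented here.

```r
# Sketch: randomize a t x t Latin square (steps 2 and 3 of the recipe above).
randomize_ls <- function(square, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  nt <- nrow(square)
  square <- square[sample(nt), , drop = FALSE]       # randomly permute the rows
  square <- square[, sample(nt), drop = FALSE]       # then the columns
  letters_used <- sort(unique(as.vector(square)))
  relabel <- setNames(sample(letters_used), letters_used)  # random letter assignment
  matrix(relabel[as.vector(square)], nrow = nt,
         dimnames = list(Row = as.character(1:nt), Column = as.character(1:nt)))
}

# TRUE when every symbol occurs exactly once in each row and each column
is_latin <- function(sq) {
  nt <- nrow(sq)
  all(apply(sq, 1, function(x) length(unique(x)) == nt)) &&
    all(apply(sq, 2, function(x) length(unique(x)) == nt))
}

# Hypothetical starting square: the cyclic 4 x 4 square stands in for step 1
start <- matrix(LETTERS[(outer(0:3, 0:3, "+") %% 4) + 1], nrow = 4)
layout <- randomize_ls(start, seed = 941)
print(layout, quote = FALSE)
is_latin(layout)
```

Permuting rows, columns and letters never breaks the Latin property, so `is_latin` should return TRUE for any seed.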


a) Obtaining a layout for a Latin Square in R

• General instructions given in Appendix B, Randomized layouts and sample size computations in R.

Example V.2 Pollution effects of petrol additives

• 4 cars and 4 drivers in a study of effects of 4 petrol additives on pollution.

• Desirable to isolate both car-to-car and driver-to-driver differences.

• A 4 × 4 Latin square would enable this to be done.

• Names for rows, columns and treats for this example are Drivers, Cars and Additives, respectively.

• Also, t = 4 and a design obtained from BH2.


Expressions to be used for example

> t <- 4

> n <- t*t

> LSPolut.unit <- list(Drivers=t, Cars=t)

> Additives <- factor(c(1,2,3,4, 4,3,2,1, 2,4,1,3, 3,1,4,2),

+ labels=c("A","B","C","D"))

> LSPolut.lay <- fac.layout(unrandomized=LSPolut.unit,

+ randomized=Additives,seed=941)

> remove("Additives")

> LSPolut.lay

• Note: no nested.factors as Drivers and Cars are to be randomized independently

• Hence they are not nested (are crossed)


Randomized layout

> LSPolut.lay
   Units Permutation Drivers Cars Additives
1      1          11       1    1         B
2      2          12       1    2         D
3      3          10       1    3         C
4      4           9       1    4         A
5      5           7       2    1         A
6      6           8       2    2         B
7      7           6       2    3         D
8      8           5       2    4         C
9      9          15       3    1         D
10    10          16       3    2         C
11    11          14       3    3         A
12    12          13       3    4         B
13    13           3       4    1         C
14    14           4       4    2         A
15    15           2       4    3         B
16    16           1       4    4         D

4 x 4 Latin square (Additives A, B, C, D)

                 Car
Drivers    1   2   3   4
I          B   D   C   A
II         A   B   D   C
III        D   C   A   B
IV         C   A   B   D


V.B Indicator-variable models and estimation for a Latin square

• Have to decide whether each of the factors Rows, Columns and Treatments is to be regarded as fixed or random.

• As for the RCBD, it happens that the analysis of the Latin square is essentially unaffected by which model is used.

• Generally, the Latin square involves t rows and columns so that there are $n = t^2$ observations in all.


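a) Maximal model

• The maximal model, when all are fixed, is

$$\boldsymbol{\psi}_{R+C+T} = E[\mathbf{Y}] = \mathbf{X}_R\boldsymbol{\beta} + \mathbf{X}_C\boldsymbol{\gamma} + \mathbf{X}_T\boldsymbol{\tau} \quad\text{and}\quad \mathbf{V} = \sigma^2\mathbf{I}_{t^2}$$

where

$\mathbf{Y}$ is the $t^2$-vector of random variables for the response variable observations,

$\boldsymbol{\beta}$ is the $t$-vector of parameters specifying a different mean response for each row,

$\mathbf{X}_R$ is the $t^2 \times t$ matrix indicating the row from which an observation came,

$\boldsymbol{\gamma}$ is the $t$-vector of parameters specifying a different mean response for each column,

$\mathbf{X}_C$ is the $t^2 \times t$ matrix indicating the column from which an observation came,

$\boldsymbol{\tau}$ is the $t$-vector of parameters specifying a different mean response for each treatment,

$\mathbf{X}_T$ is the $t^2 \times t$ matrix indicating the observations that received each of the treatments.

• Our model also assumes $\mathbf{Y} \sim N(\boldsymbol{\psi}_{R+C+T}, \mathbf{V})$.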


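Example V.3 A 3 × 3 Latin square

• Suppose that a 3 × 3 Latin square with the following arrangement of treatments was being considered:

3 x 3 Latin square

            Column
Row      1   2   3
I        A   B   C
II       C   A   B
III      B   C   A

• Then, for this example,

$$\mathbf{X}_R = \begin{bmatrix} 1&0&0\\ 1&0&0\\ 1&0&0\\ 0&1&0\\ 0&1&0\\ 0&1&0\\ 0&0&1\\ 0&0&1\\ 0&0&1 \end{bmatrix},\quad \mathbf{X}_C = \begin{bmatrix} 1&0&0\\ 0&1&0\\ 0&0&1\\ 1&0&0\\ 0&1&0\\ 0&0&1\\ 1&0&0\\ 0&1&0\\ 0&0&1 \end{bmatrix},\quad \mathbf{X}_T = \begin{bmatrix} 1&0&0\\ 0&1&0\\ 0&0&1\\ 0&0&1\\ 1&0&0\\ 0&1&0\\ 0&1&0\\ 0&0&1\\ 1&0&0 \end{bmatrix}$$

• Note that for a general systematic layout $\mathbf{X}_R = \mathbf{I}_t \otimes \mathbf{1}_t$ and $\mathbf{X}_C = \mathbf{1}_t \otimes \mathbf{I}_t$, but $\mathbf{X}_T$ cannot be written as a direct product.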
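As a check on the direct-product expressions, the three indicator matrices of Example V.3 can be built in base R. This is a minimal sketch: the object names (`X.R`, `X.C`, `X.T`, `trt`) are illustrative choices, and the treatments are coded A = 1, B = 2, C = 3.

```r
nt <- 3
I.t   <- diag(nt)
one.t <- matrix(1, nrow = nt, ncol = 1)

X.R <- kronecker(I.t, one.t)   # I_t (x) 1_t: row indicators
X.C <- kronecker(one.t, I.t)   # 1_t (x) I_t: column indicators

# Treatments read row by row from the square (A = 1, B = 2, C = 3);
# X.T has to be built from the layout because it is not a direct product.
trt <- c(1, 2, 3,  3, 1, 2,  2, 3, 1)
X.T <- outer(trt, 1:nt, "==") * 1

X <- cbind(X.R, X.C, X.T)      # columns 1-3 rows, 4-6 columns, 7-9 treatments
print(X)
```

Each row of the printed matrix corresponds to one unit, taken in standard order for Rows then Columns.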


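Estimators of expected values under max. model

$$\hat{\boldsymbol{\psi}}_{R+C+T} = \bar{\mathbf{R}} + \bar{\mathbf{C}} + \bar{\mathbf{T}} - 2\bar{\mathbf{G}}$$

where $\bar{\mathbf{R}}$, $\bar{\mathbf{C}}$, $\bar{\mathbf{T}}$ and $\bar{\mathbf{G}}$ are the $t^2$-vectors of row, column, treatment and grand means, respectively.

Also, note that

$$\bar{\mathbf{R}} = \mathbf{M}_R\mathbf{Y},\quad \bar{\mathbf{C}} = \mathbf{M}_C\mathbf{Y},\quad \bar{\mathbf{T}} = \mathbf{M}_T\mathbf{Y},\quad \bar{\mathbf{G}} = \mathbf{M}_G\mathbf{Y}$$

That is, $\mathbf{M}_R$, $\mathbf{M}_C$, $\mathbf{M}_T$ and $\mathbf{M}_G$ are the row, column, treatment and grand mean operators, respectively. So once again the estimators are functions of means. Further, if the data in the vector $\mathbf{Y}$ has been arranged in standard order for Rows then Columns, the operators are:

$$\mathbf{M}_G = t^{-2}\,\mathbf{J}_t \otimes \mathbf{J}_t,\qquad \mathbf{M}_R = t^{-1}\,\mathbf{I}_t \otimes \mathbf{J}_t,\qquad \mathbf{M}_C = t^{-1}\,\mathbf{J}_t \otimes \mathbf{I}_t$$

In this case it is not possible to write $\mathbf{M}_T$ as a direct product of I and J matrices as the treatments will not be in a systematic order expressible in this form.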


Example V.3 A 3 × 3 Latin square (continued)

$\mathbf{M}_G = \tfrac{1}{9}\,\mathbf{J}_3 \otimes \mathbf{J}_3$ is the 9 × 9 matrix with every element equal to 1/9.

$$\mathbf{M}_R = \tfrac{1}{3}\,\mathbf{I}_3 \otimes \mathbf{J}_3 = \frac{1}{3}\begin{bmatrix} 1&1&1&0&0&0&0&0&0\\ 1&1&1&0&0&0&0&0&0\\ 1&1&1&0&0&0&0&0&0\\ 0&0&0&1&1&1&0&0&0\\ 0&0&0&1&1&1&0&0&0\\ 0&0&0&1&1&1&0&0&0\\ 0&0&0&0&0&0&1&1&1\\ 0&0&0&0&0&0&1&1&1\\ 0&0&0&0&0&0&1&1&1 \end{bmatrix}$$

$$\mathbf{M}_C = \tfrac{1}{3}\,\mathbf{J}_3 \otimes \mathbf{I}_3 = \frac{1}{3}\begin{bmatrix} 1&0&0&1&0&0&1&0&0\\ 0&1&0&0&1&0&0&1&0\\ 0&0&1&0&0&1&0&0&1\\ 1&0&0&1&0&0&1&0&0\\ 0&1&0&0&1&0&0&1&0\\ 0&0&1&0&0&1&0&0&1\\ 1&0&0&1&0&0&1&0&0\\ 0&1&0&0&1&0&0&1&0\\ 0&0&1&0&0&1&0&0&1 \end{bmatrix}$$

$$\mathbf{X}_T = \begin{bmatrix} 1&0&0\\ 0&1&0\\ 0&0&1\\ 0&0&1\\ 1&0&0\\ 0&1&0\\ 0&1&0\\ 0&0&1\\ 1&0&0 \end{bmatrix},\qquad \mathbf{M}_T = \frac{1}{3}\begin{bmatrix} 1&0&0&0&1&0&0&0&1\\ 0&1&0&0&0&1&1&0&0\\ 0&0&1&1&0&0&0&1&0\\ 0&0&1&1&0&0&0&1&0\\ 1&0&0&0&1&0&0&0&1\\ 0&1&0&0&0&1&1&0&0\\ 0&1&0&0&0&1&1&0&0\\ 0&0&1&1&0&0&0&1&0\\ 1&0&0&0&1&0&0&0&1 \end{bmatrix}$$
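The mean operators of this example can be reproduced numerically. The sketch below assumes the data are in standard order for Rows then Columns and forms $\mathbf{M}_T$ from the projection formula $\mathbf{X}_T(\mathbf{X}_T'\mathbf{X}_T)^{-1}\mathbf{X}_T'$; the object and helper names are illustrative only.

```r
nt <- 3
I.t <- diag(nt)
J.t <- matrix(1, nt, nt)

M.G <- kronecker(J.t, J.t) / nt^2   # grand mean operator
M.R <- kronecker(I.t, J.t) / nt     # row mean operator
M.C <- kronecker(J.t, I.t) / nt     # column mean operator

# Treatment mean operator built from the layout (A = 1, B = 2, C = 3)
trt <- c(1, 2, 3,  3, 1, 2,  2, 3, 1)
X.T <- outer(trt, 1:nt, "==") * 1
M.T <- X.T %*% solve(crossprod(X.T)) %*% t(X.T)

# A mean operator should be symmetric and idempotent
is_projector <- function(M) {
  isTRUE(all.equal(M, t(M))) && isTRUE(all.equal(M %*% M, M))
}
sapply(list(G = M.G, R = M.R, C = M.C, T = M.T), is_projector)
round(3 * M.T)   # the 0/1 pattern displayed above
```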


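b) Alternative expectation models

• 8 possible different models for the expectation when Rows, Columns and Treatments are considered fixed:

$\boldsymbol{\psi}_G = \mathbf{X}_G\mu$   (no treatment, row or column differences)

$\boldsymbol{\psi}_R = \mathbf{X}_R\boldsymbol{\beta}$   (row differences only)

$\boldsymbol{\psi}_C = \mathbf{X}_C\boldsymbol{\gamma}$   (column differences only)

$\boldsymbol{\psi}_{R+C} = \mathbf{X}_R\boldsymbol{\beta} + \mathbf{X}_C\boldsymbol{\gamma}$   (additive row and column differences)

$\boldsymbol{\psi}_T = \mathbf{X}_T\boldsymbol{\tau}$   (treatment differences only)

$\boldsymbol{\psi}_{R+T} = \mathbf{X}_R\boldsymbol{\beta} + \mathbf{X}_T\boldsymbol{\tau}$   (additive row and treatment differences)

$\boldsymbol{\psi}_{C+T} = \mathbf{X}_C\boldsymbol{\gamma} + \mathbf{X}_T\boldsymbol{\tau}$   (additive column and treatment differences)

$\boldsymbol{\psi}_{R+C+T} = \mathbf{X}_R\boldsymbol{\beta} + \mathbf{X}_C\boldsymbol{\gamma} + \mathbf{X}_T\boldsymbol{\tau}$   (additive row, column and treatment differences)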


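Marginality relations between the models

$\boldsymbol{\psi}_G \le \boldsymbol{\psi}_R,\ \boldsymbol{\psi}_C,\ \boldsymbol{\psi}_{R+C},\ \boldsymbol{\psi}_T,\ \boldsymbol{\psi}_{R+T},\ \boldsymbol{\psi}_{C+T},\ \boldsymbol{\psi}_{R+C+T}$

$\boldsymbol{\psi}_R \le \boldsymbol{\psi}_{R+C},\ \boldsymbol{\psi}_{R+T},\ \boldsymbol{\psi}_{R+C+T}$

$\boldsymbol{\psi}_C \le \boldsymbol{\psi}_{R+C},\ \boldsymbol{\psi}_{C+T},\ \boldsymbol{\psi}_{R+C+T}$

$\boldsymbol{\psi}_T \le \boldsymbol{\psi}_{R+T},\ \boldsymbol{\psi}_{C+T},\ \boldsymbol{\psi}_{R+C+T}$

$\boldsymbol{\psi}_{R+C},\ \boldsymbol{\psi}_{R+T},\ \boldsymbol{\psi}_{C+T} \le \boldsymbol{\psi}_{R+C+T}$

• Estimators

$\hat{\boldsymbol{\psi}}_G = \bar{\mathbf{G}}$

$\hat{\boldsymbol{\psi}}_R = \bar{\mathbf{R}}$

$\hat{\boldsymbol{\psi}}_C = \bar{\mathbf{C}}$

$\hat{\boldsymbol{\psi}}_{R+C} = \bar{\mathbf{R}} + \bar{\mathbf{C}} - \bar{\mathbf{G}}$

$\hat{\boldsymbol{\psi}}_T = \bar{\mathbf{T}}$

$\hat{\boldsymbol{\psi}}_{R+T} = \bar{\mathbf{R}} + \bar{\mathbf{T}} - \bar{\mathbf{G}}$

$\hat{\boldsymbol{\psi}}_{C+T} = \bar{\mathbf{C}} + \bar{\mathbf{T}} - \bar{\mathbf{G}}$

$\hat{\boldsymbol{\psi}}_{R+C+T} = \bar{\mathbf{R}} + \bar{\mathbf{C}} + \bar{\mathbf{T}} - 2\bar{\mathbf{G}}$

All are functions of the four mean vectors for this design.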


V.C Hypothesis testing using the ANOVA method for a Latin square

• An ANOVA will be used to choose between the 8 alternative expectation models for a Latin square.


a) Analysis of an example

• Example V.2 Pollution effects of petrol additives (continued)

Results for the Latin square experiment (Additives A, B, C, D)

                        Car
Drivers    1       2       3       4
I          B 20    D 20    C 17    A 15
II         A 20    B 27    D 23    C 26
III        D 20    C 25    A 21    B 26
IV         C 16    A 16    B 15    D 13
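For readers without the course data file, the results table can be entered directly and the mean vectors of section V.B computed with `ave`. This is a sketch only: the data frame name `LSPolut` and the `.bar` object names are illustrative; the response name `Reduct.NO` follows the R output later in this chapter.

```r
# Data in standard order: Drivers (rows) then Cars (columns)
LSPolut <- data.frame(
  Drivers   = factor(rep(1:4, each = 4)),
  Cars      = factor(rep(1:4, times = 4)),
  Additives = factor(c("B", "D", "C", "A",
                       "A", "B", "D", "C",
                       "D", "C", "A", "B",
                       "C", "A", "B", "D")),
  Reduct.NO = c(20, 20, 17, 15,
                20, 27, 23, 26,
                20, 25, 21, 26,
                16, 16, 15, 13)
)

# Vectors of means, each of length n = 16
G.bar <- rep(mean(LSPolut$Reduct.NO), nrow(LSPolut))
R.bar <- ave(LSPolut$Reduct.NO, LSPolut$Drivers)
C.bar <- ave(LSPolut$Reduct.NO, LSPolut$Cars)
T.bar <- ave(LSPolut$Reduct.NO, LSPolut$Additives)

# Fitted values under the maximal model R + C + T
fit.max <- R.bar + C.bar + T.bar - 2 * G.bar
cbind(LSPolut, fit = fit.max, res = LSPolut$Reduct.NO - fit.max)
```

The fitted values and residuals obtained this way can be compared with the diagnostic-checking output given later for the same example.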


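Hypothesis test for the example

Step 1: Set up hypotheses

a) H0: $\tau_A = \tau_B = \tau_C = \tau_D$ (or $\mathbf{X}_A\boldsymbol{\tau}$ not required in model)
   H1: not all population Additives means are equal

b) H0: $\beta_I = \beta_{II} = \beta_{III} = \beta_{IV}$ (or $\mathbf{X}_D\boldsymbol{\beta}$ not required in model)
   H1: not all population Drivers means are equal

c) H0: $\gamma_1 = \gamma_2 = \gamma_3 = \gamma_4$ (or $\mathbf{X}_C\boldsymbol{\gamma}$ not required in model)
   H1: not all population Cars means are equal

Set $\alpha = 0.05$.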


Hypothesis test for the example (continued)

Step 2: Calculate test statistics

Source           df    MSq      E[MSq]                                  F       Prob
Drivers           3    72.00    $\sigma^2 + q_D(\boldsymbol{\psi})$     27.0    <0.001
Cars              3     8.00    $\sigma^2 + q_C(\boldsymbol{\psi})$      3.0    0.117
Drivers#Cars      9
  Additives       3    13.33    $\sigma^2 + q_A(\boldsymbol{\psi})$      5.0    0.045
  Residual        6     2.67    $\sigma^2$
Total            15

• Note Drivers#Cars refers to the "interaction between Drivers and Cars"
– contrasts with Cars[Drivers] or Drivers[Cars];
– explained in chapter VII;
– R does not distinguish as all are Drivers:Cars.



Step 3: Decide between hypotheses

There is evidence of differences between drivers and between additives, but not between cars.

The model that best describes the data would appear to be $\boldsymbol{\psi}_{D+A} = \mathbf{X}_D\boldsymbol{\beta} + \mathbf{X}_A\boldsymbol{\tau}$, an additive model for Driver and Additive effects.


b) Sums of squares for the analysis of variance

• In this section we will use the generic names of Rows, Columns and Treatments for the factors in a Latin square.

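• The estimators of the SSqs for the Latin square ANOVA are the SSqs of the following vectors:

Rows SSq:           $\mathbf{R}_e = \bar{\mathbf{R}} - \bar{\mathbf{G}}$

Columns SSq:        $\mathbf{C}_e = \bar{\mathbf{C}} - \bar{\mathbf{G}}$

Rows#Columns SSq:   $\mathbf{D}_{R+C} = \mathbf{Y} - \bar{\mathbf{R}} - \bar{\mathbf{C}} + \bar{\mathbf{G}}$

Treatments SSq:     $\mathbf{T}_e = \bar{\mathbf{T}} - \bar{\mathbf{G}}$

Residual SSq:       $\mathbf{D}_{R+C+T} = \mathbf{Y} - \bar{\mathbf{R}} - \bar{\mathbf{C}} - \bar{\mathbf{T}} + 2\bar{\mathbf{G}}$

where
• Ds are n-vectors of deviations from Y and
• vectors with the e subscripts are n-vectors of effects.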


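b) SSq for the ANOVA (continued)

• From section V.B, Models and estimation for a Latin square,

$$\bar{\mathbf{G}} = \mathbf{M}_G\mathbf{Y} = t^{-2}\,(\mathbf{J}_t \otimes \mathbf{J}_t)\mathbf{Y}$$
$$\bar{\mathbf{R}} = \mathbf{M}_R\mathbf{Y} = t^{-1}\,(\mathbf{I}_t \otimes \mathbf{J}_t)\mathbf{Y}$$
$$\bar{\mathbf{C}} = \mathbf{M}_C\mathbf{Y} = t^{-1}\,(\mathbf{J}_t \otimes \mathbf{I}_t)\mathbf{Y}$$
$$\bar{\mathbf{T}} = \mathbf{M}_T\mathbf{Y}$$

• Can be shown that

$\mathbf{D}_G'\mathbf{D}_G = (\mathbf{Y} - \bar{\mathbf{G}})'(\mathbf{Y} - \bar{\mathbf{G}}) = \mathbf{Y}'\mathbf{Q}_U\mathbf{Y}$ with $\mathbf{Q}_U = \mathbf{M}_U - \mathbf{M}_G$

$\mathbf{R}_e'\mathbf{R}_e = (\bar{\mathbf{R}} - \bar{\mathbf{G}})'(\bar{\mathbf{R}} - \bar{\mathbf{G}}) = \mathbf{Y}'\mathbf{Q}_R\mathbf{Y}$ with $\mathbf{Q}_R = \mathbf{M}_R - \mathbf{M}_G$

$\mathbf{C}_e'\mathbf{C}_e = (\bar{\mathbf{C}} - \bar{\mathbf{G}})'(\bar{\mathbf{C}} - \bar{\mathbf{G}}) = \mathbf{Y}'\mathbf{Q}_C\mathbf{Y}$ with $\mathbf{Q}_C = \mathbf{M}_C - \mathbf{M}_G$

$\mathbf{D}_{R+C}'\mathbf{D}_{R+C} = (\mathbf{Y} - \bar{\mathbf{R}} - \bar{\mathbf{C}} + \bar{\mathbf{G}})'(\mathbf{Y} - \bar{\mathbf{R}} - \bar{\mathbf{C}} + \bar{\mathbf{G}}) = \mathbf{Y}'\mathbf{Q}_{RC}\mathbf{Y}$ with $\mathbf{Q}_{RC} = \mathbf{M}_U - \mathbf{M}_R - \mathbf{M}_C + \mathbf{M}_G$

$\mathbf{T}_e'\mathbf{T}_e = (\bar{\mathbf{T}} - \bar{\mathbf{G}})'(\bar{\mathbf{T}} - \bar{\mathbf{G}}) = \mathbf{Y}'\mathbf{Q}_T\mathbf{Y}$ with $\mathbf{Q}_T = \mathbf{M}_T - \mathbf{M}_G$

$\mathbf{D}_{R+C+T}'\mathbf{D}_{R+C+T} = (\mathbf{Y} - \bar{\mathbf{R}} - \bar{\mathbf{C}} - \bar{\mathbf{T}} + 2\bar{\mathbf{G}})'(\mathbf{Y} - \bar{\mathbf{R}} - \bar{\mathbf{C}} - \bar{\mathbf{T}} + 2\bar{\mathbf{G}}) = \mathbf{Y}'\mathbf{Q}_{RC_{Res}}\mathbf{Y}$ with $\mathbf{Q}_{RC_{Res}} = \mathbf{M}_U - \mathbf{M}_R - \mathbf{M}_C - \mathbf{M}_T + 2\mathbf{M}_G$

All the Ms and Qs are symmetric and idempotent.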
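The quadratic forms above can be evaluated for the petrol-additive data of Example V.2. The following is a sketch under the assumption that the data are in standard order for Drivers (rows) then Cars (columns); the object names are illustrative.

```r
nt <- 4
y   <- c(20, 20, 17, 15,  20, 27, 23, 26,  20, 25, 21, 26,  16, 16, 15, 13)
add <- c(2, 4, 3, 1,  1, 2, 4, 3,  4, 3, 1, 2,  3, 1, 2, 4)  # A = 1, ..., D = 4

I.t <- diag(nt)
J.t <- matrix(1, nt, nt)
M.U <- diag(nt^2)
M.G <- kronecker(J.t, J.t) / nt^2
M.R <- kronecker(I.t, J.t) / nt          # Drivers (rows)
M.C <- kronecker(J.t, I.t) / nt          # Cars (columns)
X.T <- outer(add, 1:nt, "==") * 1
M.T <- X.T %*% t(X.T) / nt               # Additives (treatments)

Q <- list(
  Rows         = M.R - M.G,
  Columns      = M.C - M.G,
  Rows.Columns = M.U - M.R - M.C + M.G,
  Treatments   = M.T - M.G,
  Residual     = M.U - M.R - M.C - M.T + 2 * M.G,
  Total        = M.U - M.G
)

# Sums of squares as quadratic forms Y'QY
ssq <- sapply(Q, function(Qi) drop(t(y) %*% Qi %*% y))
print(ssq)

# Each Q should be idempotent
idem <- sapply(Q, function(Qi) isTRUE(all.equal(Qi %*% Qi, Qi)))
print(idem)
```

The Rows, Columns, Treatments and Residual values should agree with the Sum Sq column of the R output shown later for this example.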


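ANOVA table is constructed as follows:

Source          df             SSq                                                MSq                                                                                      F                        p
Rows            t − 1          $\mathbf{Y}'\mathbf{Q}_R\mathbf{Y}$                $s_R^2 = \mathbf{Y}'\mathbf{Q}_R\mathbf{Y}/(t-1)$                                        $s_R^2/s_{RC_{Res}}^2$   $p_R$
Columns         t − 1          $\mathbf{Y}'\mathbf{Q}_C\mathbf{Y}$                $s_C^2 = \mathbf{Y}'\mathbf{Q}_C\mathbf{Y}/(t-1)$                                        $s_C^2/s_{RC_{Res}}^2$   $p_C$
Rows#Columns    (t − 1)^2      $\mathbf{Y}'\mathbf{Q}_{RC}\mathbf{Y}$
  Treatments    t − 1          $\mathbf{Y}'\mathbf{Q}_T\mathbf{Y}$                $s_T^2 = \mathbf{Y}'\mathbf{Q}_T\mathbf{Y}/(t-1)$                                        $s_T^2/s_{RC_{Res}}^2$   $p_T$
  Residual      (t − 1)(t − 2) $\mathbf{Y}'\mathbf{Q}_{RC_{Res}}\mathbf{Y}$       $s_{RC_{Res}}^2 = \mathbf{Y}'\mathbf{Q}_{RC_{Res}}\mathbf{Y}/\{(t-1)(t-2)\}$
Total           t^2 − 1        $\mathbf{Y}'\mathbf{Q}_U\mathbf{Y}$

• See notes for example of computation of vectors and geometrical interpretation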


c) Expected mean squares

• To justify our choice of test statistics, we want to work out the E[MSq]s in the ANOVA table under the 8 alternative expectation models.

• However, to save space work out E[MSq]s under the maximal model and identify which terms in E[MSq]s go to zero under alternative models.


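E[MSq]s with fixed Rows and Columns effects

Source          df               MSq                                                                                E[MSq]                                  F
Rows            t − 1            $s_R^2 = \mathbf{Y}'\mathbf{Q}_R\mathbf{Y}/(t-1)$                                  $\sigma^2 + q_R(\boldsymbol{\psi})$     $s_R^2/s_{RC_{Res}}^2$
Columns         t − 1            $s_C^2 = \mathbf{Y}'\mathbf{Q}_C\mathbf{Y}/(t-1)$                                  $\sigma^2 + q_C(\boldsymbol{\psi})$     $s_C^2/s_{RC_{Res}}^2$
Rows#Columns    (t − 1)^2
  Treatments    t − 1            $s_T^2 = \mathbf{Y}'\mathbf{Q}_T\mathbf{Y}/(t-1)$                                  $\sigma^2 + q_T(\boldsymbol{\psi})$     $s_T^2/s_{RC_{Res}}^2$
  Residual      (t − 1)(t − 2)   $s_{RC_{Res}}^2 = \mathbf{Y}'\mathbf{Q}_{RC_{Res}}\mathbf{Y}/\{(t-1)(t-2)\}$       $\sigma^2$
Total           t^2 − 1

where

$$q_R(\boldsymbol{\psi}) = \frac{(\mathbf{X}_R\boldsymbol{\beta})'\mathbf{Q}_R(\mathbf{X}_R\boldsymbol{\beta})}{t-1} = \frac{t\sum_{i=1}^{t}(\beta_i - \bar{\beta}_.)^2}{t-1},$$

$$q_C(\boldsymbol{\psi}) = \frac{(\mathbf{X}_C\boldsymbol{\gamma})'\mathbf{Q}_C(\mathbf{X}_C\boldsymbol{\gamma})}{t-1} = \frac{t\sum_{j=1}^{t}(\gamma_j - \bar{\gamma}_.)^2}{t-1}$$

and

$$q_T(\boldsymbol{\psi}) = \frac{(\mathbf{X}_T\boldsymbol{\tau})'\mathbf{Q}_T(\mathbf{X}_T\boldsymbol{\tau})}{t-1} = \frac{t\sum_{k=1}^{t}(\tau_k - \bar{\tau}_.)^2}{t-1}.$$

• Given the expressions in the above table, the population means of the mean squares could be computed if we knew the $\beta_i$s, $\gamma_j$s, $\tau_k$s and $\sigma^2$.

• Each of $q_R(\boldsymbol{\psi})$, $q_C(\boldsymbol{\psi})$ and $q_T(\boldsymbol{\psi})$ equals 0 when the terms $\mathbf{X}_R\boldsymbol{\beta}$, $\mathbf{X}_C\boldsymbol{\gamma}$ and $\mathbf{X}_T\boldsymbol{\tau}$, respectively, are removed from the model.

• Hence a significant F value for a line indicates that the corresponding term should be included in the model.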


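Alternative analysis

• Both Rows and Columns are random

• The model in this case would be that

$$\boldsymbol{\psi}_T = E[\mathbf{Y}] = \mathbf{X}_T\boldsymbol{\tau} \quad\text{and}$$

$$\mathbf{V} = \sigma^2\mathbf{M}_U + t\sigma_R^2\mathbf{M}_R + t\sigma_C^2\mathbf{M}_C = \sigma^2\,\mathbf{I}_t \otimes \mathbf{I}_t + \sigma_R^2\,\mathbf{I}_t \otimes \mathbf{J}_t + \sigma_C^2\,\mathbf{J}_t \otimes \mathbf{I}_t$$

• It allows for equal covariance between units from the same row and also between units from the same column.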


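Example V.3 A 3 × 3 Latin square (continued)

$$E[\mathbf{Y}] = \mathbf{X}_T\boldsymbol{\tau} = \begin{bmatrix} 1&0&0\\ 0&1&0\\ 0&0&1\\ 0&0&1\\ 1&0&0\\ 0&1&0\\ 0&1&0\\ 0&0&1\\ 1&0&0 \end{bmatrix}\begin{bmatrix} \tau_1\\ \tau_2\\ \tau_3 \end{bmatrix}$$

$$\mathbf{V} = \sigma^2\mathbf{M}_U + 3\sigma_R^2\mathbf{M}_R + 3\sigma_C^2\mathbf{M}_C = \sigma^2\,\mathbf{I}_9 + \sigma_R^2\begin{bmatrix} \mathbf{J}_3 & \mathbf{0} & \mathbf{0}\\ \mathbf{0} & \mathbf{J}_3 & \mathbf{0}\\ \mathbf{0} & \mathbf{0} & \mathbf{J}_3 \end{bmatrix} + \sigma_C^2\begin{bmatrix} \mathbf{I}_3 & \mathbf{I}_3 & \mathbf{I}_3\\ \mathbf{I}_3 & \mathbf{I}_3 & \mathbf{I}_3\\ \mathbf{I}_3 & \mathbf{I}_3 & \mathbf{I}_3 \end{bmatrix}$$

where $\mathbf{J}_3$ is the 3 × 3 matrix of ones, so that the second matrix is $\mathbf{I}_3 \otimes \mathbf{J}_3$ and the third is $\mathbf{J}_3 \otimes \mathbf{I}_3$.

Shows:

$\mathrm{var}[Y_{ijk}] = \sigma^2 + \sigma_R^2 + \sigma_C^2$

$\mathrm{cov}[Y_{ijk}, Y_{ij'k'}] = \sigma_R^2$   (same row, different columns)

$\mathrm{cov}[Y_{ijk}, Y_{i'jk'}] = \sigma_C^2$   (same column, different rows)

$\mathrm{cov}[Y_{ijk}, Y_{i'j'k'}] = 0$   (different rows and columns)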
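The covariance pattern can be confirmed numerically by building $\mathbf{V}$ from its direct-product form. In the sketch below the three variance components are arbitrary, hypothetical values chosen only to make the entries easy to recognise.

```r
nt <- 3
sigma2   <- 1.00   # hypothetical unit variance component
sigma2.R <- 0.50   # hypothetical Rows variance component
sigma2.C <- 0.25   # hypothetical Columns variance component

I.t <- diag(nt)
J.t <- matrix(1, nt, nt)

V <- sigma2   * kronecker(I.t, I.t) +
     sigma2.R * kronecker(I.t, J.t) +
     sigma2.C * kronecker(J.t, I.t)

# Units are in standard order: unit 1 = (row 1, col 1), unit 2 = (row 1, col 2),
# unit 4 = (row 2, col 1), unit 5 = (row 2, col 2)
V[1, 1]   # variance: sigma2 + sigma2.R + sigma2.C
V[1, 2]   # same row: sigma2.R
V[1, 4]   # same column: sigma2.C
V[1, 5]   # different row and column: 0
```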


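E[MSq]s under alternative model

• Alternative expectation model is $\boldsymbol{\psi}_G = \mathbf{X}_G\mu$ and under this model $q_T(\boldsymbol{\psi}) = 0$.

Source          df               MSq                                                                                E[MSq]                                  F
Rows            t − 1            $s_R^2 = \mathbf{Y}'\mathbf{Q}_R\mathbf{Y}/(t-1)$                                  $\sigma^2 + t\sigma_R^2$                $s_R^2/s_{RC_{Res}}^2$
Columns         t − 1            $s_C^2 = \mathbf{Y}'\mathbf{Q}_C\mathbf{Y}/(t-1)$                                  $\sigma^2 + t\sigma_C^2$                $s_C^2/s_{RC_{Res}}^2$
Rows#Columns    (t − 1)^2
  Treatments    t − 1            $s_T^2 = \mathbf{Y}'\mathbf{Q}_T\mathbf{Y}/(t-1)$                                  $\sigma^2 + q_T(\boldsymbol{\psi})$     $s_T^2/s_{RC_{Res}}^2$
  Residual      (t − 1)(t − 2)   $s_{RC_{Res}}^2 = \mathbf{Y}'\mathbf{Q}_{RC_{Res}}\mathbf{Y}/\{(t-1)(t-2)\}$       $\sigma^2$
Total           t^2 − 1

• This exactly parallels what happens when both are fixed.

• Alternative variance models involve setting $\sigma_R^2 = 0$ and/or $\sigma_C^2 = 0$, and this will result in the one(s) set to zero being dropped from the expected mean square.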


d) Summary of the hypothesis test

• See notes

e) Comparison with traditional Latin-square ANOVA table

• Differences symbolic – see notes for details


f) Computation of ANOVA and diagnostic checking in R

• Diagnostic checking is the same as for the RCBD

Example V.2 Pollution effects of petrol additives (continued)

• First set up and attach data.frame and do initial boxplots.

• Then, use the aov function, either with or without the Error as part of the model.
– In this experiment uncontrolled variation is made up of Drivers, Cars and Drivers:Cars.
– R shorthand for this: Drivers*Cars, which expands to Drivers + Cars + Drivers:Cars, the latter being equivalent to Drivers#Cars.

• Outputs for analysis with Error & diagnostic checking are given below


R output

> load("LSPolut.dat.rda")
> attach(LSPolut.dat)
> boxplot(split(Reduct.NO, Drivers), xlab="Drivers", ylab="Reduction in NO")
> boxplot(split(Reduct.NO, Cars), xlab="Cars", ylab="Reduction in NO")
> boxplot(split(Reduct.NO, Additives), xlab="Additives", ylab="Reduction in NO")


Boxplots for initial graphical exploration of the data

[Figure: three boxplots of Reduction in NO (y-axis, 14 to 26) against Drivers (1 to 4), Cars (1 to 4) and Additives (A to D)]


R output (continued)

> LSPolut.aov <- aov(Reduct.NO ~ Drivers + Cars + Additives + 
+                    Error(Drivers*Cars), LSPolut.dat)
> summary(LSPolut.aov)

Error: Drivers
        Df Sum Sq Mean Sq
Drivers  3    216      72

Error: Cars
     Df Sum Sq Mean Sq
Cars  3     24       8

Error: Drivers:Cars
          Df Sum Sq Mean Sq F value Pr(>F)
Additives  3 40.000  13.333       5 0.0452
Residuals  6 16.000   2.667

> #Compute Drivers and Cars Fs and p-values
> Drivers.F <- 72/2.667
> Drivers.p <- 1-pf(Drivers.F, 3, 6)
> Cars.F <- 8/2.667
> Cars.p <- 1-pf(Cars.F, 3, 6)
> data.frame(Drivers.F,Drivers.p,Cars.F,Cars.p)
  Drivers.F    Drivers.p   Cars.F    Cars.p
1  26.99663 0.0006989578 2.999625 0.1169842
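As a cross-check on the hand-computed F and p-values, the same model can be fitted without the Error term, in which case `aov` tests every line against the Residual mean square. This is a sketch: the data frame `LSPolut` is entered by hand from the results table and its name is illustrative.

```r
LSPolut <- data.frame(
  Drivers   = factor(rep(1:4, each = 4)),
  Cars      = factor(rep(1:4, times = 4)),
  Additives = factor(c("B", "D", "C", "A",  "A", "B", "D", "C",
                       "D", "C", "A", "B",  "C", "A", "B", "D")),
  Reduct.NO = c(20, 20, 17, 15,  20, 27, 23, 26,
                20, 25, 21, 26,  16, 16, 15, 13)
)

# Analysis without Error(): all three terms are tested against the Residual
fit.fixed <- aov(Reduct.NO ~ Drivers + Cars + Additives, data = LSPolut)
tab <- summary(fit.fixed)[[1]]
print(tab)
```

Because the design is orthogonal, the sums of squares do not depend on the order in which the terms are entered.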


R output (continued)

> #
> # Diagnostic checking
> #
> res <- resid.errors(LSPolut.aov)
> fit <- fitted.errors(LSPolut.aov)
> data.frame(Drivers,Cars,Additives,Reduct.NO,res,fit)
   Drivers Cars Additives Reduct.NO res fit

1 1 1 B 20 1 19

2 1 2 D 20 1 19

3 1 3 C 17 -1 18

4 1 4 A 15 -1 16

5 2 1 A 20 -1 21

6 2 2 B 27 -1 28

7 2 3 D 23 1 22

8 2 4 C 26 1 25

9 3 1 D 20 -1 21

10 3 2 C 25 -1 26

11 3 3 A 21 1 20

12 3 4 B 26 1 25

13 4 1 C 16 1 15

14 4 2 A 16 1 15

15 4 3 B 15 -1 16

16 4 4 D 13 -1 14


R output (continued)

> plot(fit, res, pch=16)
> qqnorm(res, pch=16)
> qqline(res)
> tukey.1df(LSPolut.aov, LSPolut.dat, error.term = "Drivers:Cars")
$Tukey.SS
[1] 4.54224

$Tukey.F
[1] 1.982167

$Tukey.p
[1] 0.2181923

$Devn.SS
[1] 11.45776
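The one-degree-of-freedom test can also be sketched in base R by adding the squared fitted values as an extra covariate and comparing the two fits. This is an illustrative reconstruction, not the code behind `tukey.1df`; the data frame `LSPolut` and the column `fit.sq` are names introduced here.

```r
LSPolut <- data.frame(
  Drivers   = factor(rep(1:4, each = 4)),
  Cars      = factor(rep(1:4, times = 4)),
  Additives = factor(c("B", "D", "C", "A",  "A", "B", "D", "C",
                       "D", "C", "A", "B",  "C", "A", "B", "D")),
  Reduct.NO = c(20, 20, 17, 15,  20, 27, 23, 26,
                20, 25, 21, 26,  16, 16, 15, 13)
)

# Additive model, then the same model plus the squared fitted values
fit.add <- lm(Reduct.NO ~ Drivers + Cars + Additives, data = LSPolut)
LSPolut$fit.sq <- fitted(fit.add)^2
fit.1df <- lm(Reduct.NO ~ Drivers + Cars + Additives + fit.sq, data = LSPolut)

# Row 2 gives the nonadditivity SSq (1 df), its F and p-value;
# the RSS in row 2 is the deviation SSq on 5 df
nonadd <- anova(fit.add, fit.1df)
print(nonadd)
```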


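Hypothesis test for the example

Step 1: Set up hypotheses (as before)

a) H0: $\tau_A = \tau_B = \tau_C = \tau_D$ (or $\mathbf{X}_A\boldsymbol{\tau}$ not required in model)
   H1: not all population Additives means are equal

b) H0: $\beta_I = \beta_{II} = \beta_{III} = \beta_{IV}$ (or $\mathbf{X}_D\boldsymbol{\beta}$ not required in model)
   H1: not all population Drivers means are equal

c) H0: $\gamma_1 = \gamma_2 = \gamma_3 = \gamma_4$ (or $\mathbf{X}_C\boldsymbol{\gamma}$ not required in model)
   H1: not all population Cars means are equal

Set $\alpha = 0.05$.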



Step 2: Calculate test statistics

Source               df    MSq      E[MSq]                                  F       Prob
Drivers               3    72.00    $\sigma^2 + q_D(\boldsymbol{\psi})$     27.0    <0.001
Cars                  3     8.00    $\sigma^2 + q_C(\boldsymbol{\psi})$      3.0    0.117
Drivers#Cars          9
  Additives           3    13.33    $\sigma^2 + q_A(\boldsymbol{\psi})$      5.0    0.045
  Residual            6     2.67    $\sigma^2$
    Nonadditivity     1     4.54                                             1.98   0.218
    Deviation         5     2.29
Total                15

• Note inclusion of Nonadditivity



Step 3: Decide between hypotheses

As before, there is evidence of differences between drivers and between additives, but not between cars.

The model that best describes the data would appear to be $\boldsymbol{\psi}_{D+A} = \mathbf{X}_D\boldsymbol{\beta} + \mathbf{X}_A\boldsymbol{\tau}$, the additive model for Driver and Additive effects.

The test for transformable nonadditivity is nonsignificant.


Diagnostic checking

• The residuals-versus-fitted-values plot indicates that the residuals are either -1 or 1 (must be artificial data).

• The Normal Probability Plot indicates that the residuals are not normally distributed.

• Clearly the example can only be considered illustrative.

[Figure: plot of res (y-axis, -1.0 to 1.0) against fit (x-axis, 14 to 26), and Normal Q-Q Plot of Sample Quantiles (-1.0 to 1.0) against Theoretical Quantiles (-2 to 2)]


V.D Diagnostic checking

• Again, we have assumed $\mathbf{Y} \sim N(\boldsymbol{\psi}, \sigma^2\mathbf{I})$ where, for the maximal model, $\boldsymbol{\psi}_{R+C+T} = E[\mathbf{Y}] = \mathbf{X}_R\boldsymbol{\beta} + \mathbf{X}_C\boldsymbol{\gamma} + \mathbf{X}_T\boldsymbol{\tau}$

• For this model to be appropriate requires a similar set of behaviours as for the RCBD:
a) response is operating additively: a treatment has about the same additive effect on each unit;
b) variability of the units is the same for all row-column combinations;
c) each observation displays the covariance implied by the model (independence for Rows and Columns fixed; equal correlation within rows (columns) for Rows (Columns) random); and
d) that the response of the units is normally distributed.

• As noted before, diagnostic checking is the same as for the RCBD


V.E Treatment differences

• For the purposes of the scientist the effects of rows and columns are not of primary interest

• Rather, focus on treatment differences.

• Same as for CRD and RCBD.

• As Additives significant, use Tukey's HSD procedure.


> # multiple comparisons
> #
> model.tables(LSPolut.aov, type="means")
Tables of means
Grand mean
20

 Drivers
Drivers
 1  2  3  4
23 24 15 18

 Cars
Cars
 1  2  3  4
19 19 22 20

 Additives
Additives
 A  B  C  D
18 22 21 19
> q <- qtukey(0.95, 4, 6)
> q
[1] 4.895599


• Comparing the differences in the additive means with Tukey's HSD, it is concluded that only the difference between A and B is significant.

$$w(5\%) = \frac{4.895599}{\sqrt{2}}\sqrt{\frac{2.667 \times 2}{4}} = \frac{4.895599}{\sqrt{2}} \times 1.15482 = 4.00$$
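The HSD calculation and the pairwise comparisons can be sketched in R as follows. The additive means, Residual mean square and degrees of freedom are taken from the output above; the object names are illustrative.

```r
add.means <- c(A = 18, B = 22, C = 21, D = 19)   # Additives means
s2 <- 16 / 6     # Residual MSq (2.667)
r  <- 4          # replicates of each additive
nu <- 6          # Residual df

q <- qtukey(0.95, nmeans = length(add.means), df = nu)
w <- q / sqrt(2) * sqrt(s2 * 2 / r)              # Tukey's HSD at 5%
w

# Absolute pairwise differences and which of them exceed w
diffs <- abs(outer(add.means, add.means, "-"))
diffs
diffs > w
```

The B versus A difference of 4 only just exceeds w, so the conclusion is sensitive to rounding of w to 4.00.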


Bar chart of Additives differences

> # Plotting Treat means
> library(lattice)   # barchart is a lattice function
> LSPolut.tab <- model.tables(LSPolut.aov, type="means")
> LSPolut.Adds.Mean <- data.frame(Adds.lev = levels(Additives), 
+                        Adds.Mean = as.vector(LSPolut.tab$tables$Additives))
> LSPolut.Adds.Mean <- LSPolut.Adds.Mean[order(LSPolut.Adds.Mean$Adds.Mean),]
> LSPolut.Adds.Mean$Adds.lev <- factor(LSPolut.Adds.Mean$Adds.lev, 
+                        levels=LSPolut.Adds.Mean$Adds.lev)
> barchart(Adds.Mean ~ Adds.lev, xlab="Additives", 
+          ylim=c(0,25), ylab="NO Reduction", 
+          main="Fitted values for Nitrous Oxide Reduction", 
+          data=LSPolut.Adds.Mean)

Note use of ylim to include 0 on y-axis

[Figure: bar chart "Fitted values for Nitrous Oxide Reduction"; x-axis Additives in the order A, D, C, B; y-axis NO Reduction, ticks 5 to 20]


V.F Design of sets of Latin squares

• To overcome the small residual df problem several squares can be used.

• In the case of Example V.2, Pollution effects of petrol additives, the Latin Square could be repeated by:
1. using the same drivers and cars in each replicate;
2. using the same drivers but new cars (or the same cars but new drivers); or
3. using new cars and drivers.

• In general, one can have as many (r) squares as one likes.
– However, will only present layouts for 2 squares.

• General expressions for randomizing the various cases are given in Appendix B, Randomized layouts and sample size computations in R.


Case 1: same Drivers and Cars

• This case involves a complete repetition of the experiment, say on consecutive mornings, with the same 4 Drivers and 4 Cars on the two occasions.

• There is no re-randomization of the square for the second occasion; this preserves the crossed relationships between Occasions and other factors.

• Layout (r=2)

Occasions          1                 2
Cars           1  2  3  4        1  2  3  4
Drivers
   1           A  B  C  D        A  B  C  D
   2           C  D  A  B        C  D  A  B
   3           D  C  B  A        D  C  B  A
   4           B  A  D  C        B  A  D  C


Case 2: same cars, different drivers

• Experiment repeated on a different occasion with
– same 4 cars on both occasions,
– but with different drivers on second occasion.

• As a result the rows of the square, but not the columns, are rerandomized on the second occasion.

• Layout (r=2)

Occasion           1                 2
Cars           1  2  3  4        1  2  3  4
Drivers
   1           C  A  B  D        D  B  A  C
   2           A  C  D  B        A  C  D  B
   3           B  D  C  A        C  A  B  D
   4           D  B  A  C        B  D  C  A

• Note order in which additives are tested by third driver on occasion 1 is same as for fourth driver on occasion 2.
– That is, the third row of the square on occasion 1 is the same as the fourth row on occasion 2.


Case 3: different drivers and cars

• In this case,
– not only are the drivers on different occasions unconnected,
– but so are the cars, as the cars used on the second occasion are completely different to those used on the first occasion.

• As a result the rows and columns of the square are rerandomized on the second occasion.

• Layout (r=2)

Occasions          1                 2
Cars           1  2  3  4        1  2  3  4
Drivers
   1           B  A  C  D        D  B  C  A
   2           C  D  B  A        A  C  B  D
   3           A  B  D  C        B  D  A  C
   4           D  C  A  B        C  A  D  B


V.G Hypothesis tests for sets of Latin squares

• In previous section discussed the use of several squares to overcome the residual df problem.
– e.g. a 4 × 4 Latin square has only 6 (< 10) Residual df

• Gave 3 cases for Example V.2, Pollution effects of petrol additives:
1. using the same drivers and cars in each replicate;
2. using the same drivers but new cars (or the same cars but new drivers); or
3. using new cars and drivers.

• Shall determine ANOVA for each of these cases.

• In determining the E[MSq]s it will be assumed that
– unrandomized factors are to be classified as random factors
– randomized factors as fixed factors.

• While the layouts were for 2 squares, will give DF for the general case of r squares.


a) Case 1: same Drivers and Cars

• no re-randomization of the square for the second occasion

• Layout (r=2)

Occasions          1                 2
Cars           1  2  3  4        1  2  3  4
Drivers
   1           A  B  C  D        A  B  C  D
   2           C  D  A  B        C  D  A  B
   3           D  C  B  A        D  C  B  A
   4           B  A  D  C        B  A  D  C


A. Description of pertinent features of the study

1. Observational unit
– a car with a driver on an occasion

2. Response variable
– Reduction

3. Unrandomized factors
– Occasions, Drivers, Cars

4. Randomized factors
– Additives

5. Type of study
– Sets of Latin Squares


B. The experimental structure

• For this structure to be appropriate requires that the same square without re-randomization be used for each occasion; otherwise, some factors would be nested (as would be randomizing within Occasions).

Structure        Formula
unrandomized     2 Occasions*4 Drivers*4 Cars
randomized       4 Additives

C. Sources derived from the structure formulae

Occasions*Drivers*Cars
  = (Occasions + Drivers + Occasions#Drivers)*Cars
  = Occasions + Drivers + Occasions#Drivers
    + Cars + Occasions#Cars + Drivers#Cars
    + Occasions#Drivers#Cars

Additives = Additives


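D. Degrees of freedom and sums of squares

• Hasse diagrams, with degrees of freedom, for this study are:

[Figure: Hasse diagram for the unrandomized factors; each term is shown with its number of levels and its degrees of freedom, as listed below]

Unrandomized factors

Term (source)                 Levels    df
Grand mean                    1         1
Occ (O)                       r         r − 1
Driver (D)                    4         3
Car (C)                       4         3
Occ∧Driver (O#D)              4r        3(r − 1)
Occ∧Car (O#C)                 4r        3(r − 1)
Driver∧Car (D#C)              16        9
Occ∧Driver∧Car (O#D#C)        16r       9(r − 1)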


Alternative degrees of freedom calculation

• As all factors in unrandomized structure are crossed, the rule for a set of crossed factors can be used.

• DF of any source is no. of levels minus one for each factor in the source, multiplied together.

• For example, since Occasions has r levels and Drivers has 4 levels,
– DF of Occasions#Drivers is (r − 1)(4 − 1) = 3(r − 1).
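The crossed-factor rule is easy to apply mechanically. The sketch below uses a hypothetical helper, `crossed_df`, that enumerates every source generated by a set of crossed factors and multiplies (levels − 1) over the factors in each source; r = 2 is used as in the layouts.

```r
# Degrees of freedom for all sources from a set of crossed factors:
# product of (number of levels - 1) over the factors in each source.
crossed_df <- function(levels) {
  factors <- names(levels)
  out <- c()
  for (k in seq_along(factors)) {
    for (combo in combn(factors, k, simplify = FALSE)) {
      out[paste(combo, collapse = "#")] <- prod(levels[combo] - 1)
    }
  }
  out
}

r <- 2
dfs <- crossed_df(c(Occasions = r, Drivers = 4, Cars = 4))
print(dfs)
sum(dfs)    # should equal the Total df, 16r - 1
```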


Hasse diagrams, with M and Q matrices


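E. The analysis of variance table

Source                     df          SSq
Occasions                  r − 1       $\mathbf{Y}'\mathbf{Q}_O\mathbf{Y}$
Drivers                    3           $\mathbf{Y}'\mathbf{Q}_D\mathbf{Y}$
Cars                       3           $\mathbf{Y}'\mathbf{Q}_C\mathbf{Y}$
Occasions#Drivers          3(r − 1)    $\mathbf{Y}'\mathbf{Q}_{OD}\mathbf{Y}$
Occasions#Cars             3(r − 1)    $\mathbf{Y}'\mathbf{Q}_{OC}\mathbf{Y}$
Drivers#Cars               9           $\mathbf{Y}'\mathbf{Q}_{DC}\mathbf{Y}$
  Additives                3           $\mathbf{Y}'\mathbf{Q}_A\mathbf{Y}$
  Residual                 6           $\mathbf{Y}'\mathbf{Q}_{DC_{Res}}\mathbf{Y}$
Occasions#Drivers#Cars     9(r − 1)    $\mathbf{Y}'\mathbf{Q}_{ODC}\mathbf{Y}$
Total                      16r − 1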


F. Maximal expectation and variation models

• Assume the randomized factor is a fixed factor and that all the unrandomized factors are random factors.

• Then the expectation term is Additives.

• The variation terms are:
– Occasions, Drivers, Occasions∧Drivers, Cars,
– Occasions∧Cars, Drivers∧Cars and
– Occasions∧Drivers∧Cars.

• The expectation model is
E[Y] = Additives

• and the variation model is
var[Y] = Occasions + Drivers + Occasions∧Drivers + Cars + Occasions∧Cars + Drivers∧Cars + Occasions∧Drivers∧Cars


G. The expected mean squares

• The Hasse diagram, with contributions to E[MSq]s, for the unrandomized factors in this study is:

[Figure: Hasse diagram with contributions to E[MSq]s]

• The single randomized factor Additive will contribute $q_A(\boldsymbol{\psi})$ to the E[MSq] for its source.

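ANOVA table with E[MSq]

Source               df          SSq                                             E[MSq]
Occ                  r − 1       $\mathbf{Y}'\mathbf{Q}_O\mathbf{Y}$             $\sigma_{ODC}^2 + 4\sigma_{OC}^2 + 4\sigma_{OD}^2 + 16\sigma_O^2$
Drivers              3           $\mathbf{Y}'\mathbf{Q}_D\mathbf{Y}$             $\sigma_{ODC}^2 + r\sigma_{DC}^2 + 4\sigma_{OD}^2 + 4r\sigma_D^2$
Cars                 3           $\mathbf{Y}'\mathbf{Q}_C\mathbf{Y}$             $\sigma_{ODC}^2 + r\sigma_{DC}^2 + 4\sigma_{OC}^2 + 4r\sigma_C^2$
Occ#Drivers          3(r − 1)    $\mathbf{Y}'\mathbf{Q}_{OD}\mathbf{Y}$          $\sigma_{ODC}^2 + 4\sigma_{OD}^2$
Occ#Cars             3(r − 1)    $\mathbf{Y}'\mathbf{Q}_{OC}\mathbf{Y}$          $\sigma_{ODC}^2 + 4\sigma_{OC}^2$
Drivers#Cars         9           $\mathbf{Y}'\mathbf{Q}_{DC}\mathbf{Y}$
  Additives          3           $\mathbf{Y}'\mathbf{Q}_A\mathbf{Y}$             $\sigma_{ODC}^2 + r\sigma_{DC}^2 + q_A(\boldsymbol{\psi})$
  Residual           6           $\mathbf{Y}'\mathbf{Q}_{DC_{Res}}\mathbf{Y}$    $\sigma_{ODC}^2 + r\sigma_{DC}^2$
Occ#Drivers#Car      9(r − 1)    $\mathbf{Y}'\mathbf{Q}_{ODC}\mathbf{Y}$         $\sigma_{ODC}^2$
Total                16r − 1

• Hypothesis tests for Additive, Occ#Drivers and Occ#Cars are straightforward.

• Tests for Occ, Drivers and Cars are more difficult.
– For example, to test Occ, need ratio of 2 sums of MSqs.
– DF a problem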

b) Case 2: same cars, different drivers

• rows of the square, but not the columns, are rerandomized on the second occasion.

• Layout (r=2)

Occasion           1                 2
Cars           1  2  3  4        1  2  3  4
Drivers
   1           C  A  B  D        D  B  A  C
   2           A  C  D  B        A  C  D  B
   3           B  D  C  A        C  A  B  D
   4           D  B  A  C        B  D  C  A









B. The experimental structure

Structure        Formula
unrandomized     (2 Occasion/4 Drivers)*4 Cars
randomized       4 Additives

C. Sources derived from the structure formulae

(Occasions/Drivers)*Cars
  = (Occasions + Drivers[Occasions])*Cars
  = Occasions + Drivers[Occasions] + Cars
    + Occasions#Cars + Drivers#Cars[Occasions]

Additives = Additives

D. Degrees of freedom and sums of squares

Left as an exercise



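Source                       df          SSq
Occasions                    r − 1       $\mathbf{Y}'\mathbf{Q}_O\mathbf{Y}$
Drivers[Occasions]           3r          $\mathbf{Y}'\mathbf{Q}_{OD}\mathbf{Y}$
Cars                         3           $\mathbf{Y}'\mathbf{Q}_C\mathbf{Y}$
Occasions#Cars               3(r − 1)    $\mathbf{Y}'\mathbf{Q}_{OC}\mathbf{Y}$
Drivers#Cars[Occasions]      9r          $\mathbf{Y}'\mathbf{Q}_{ODC}\mathbf{Y}$
  Additives                  3           $\mathbf{Y}'\mathbf{Q}_A\mathbf{Y}$
  Residual                   9r − 3      $\mathbf{Y}'\mathbf{Q}_{ODC_{Res}}\mathbf{Y}$
Total                        16r − 1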





– Occasions, Occasions∧Drivers, Cars, Occasions∧Cars, Occasions∧Drivers∧Cars.

• and the variation model is
var[Y] = Occasions + Occasions∧Drivers + Cars + Occasions∧Cars + Occasions∧Drivers∧Cars


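G. The expected mean squares

• The Hasse diagram, with contributions to E[MSq]s, for the unrandomized factors in this study is left as an exercise.

• The single randomized factor Additive will contribute $q_A(\boldsymbol{\psi})$ to the E[MSq] for its source.

ANOVA table with E[MSq]

Source                       df          SSq                                              E[MSq]
Occasions                    r − 1       $\mathbf{Y}'\mathbf{Q}_O\mathbf{Y}$              $\sigma_{ODC}^2 + 4\sigma_{OC}^2 + 4\sigma_{OD}^2 + 16\sigma_O^2$
Drivers[Occasions]           3r          $\mathbf{Y}'\mathbf{Q}_{OD}\mathbf{Y}$           $\sigma_{ODC}^2 + 4\sigma_{OD}^2$
Cars                         3           $\mathbf{Y}'\mathbf{Q}_C\mathbf{Y}$              $\sigma_{ODC}^2 + 4\sigma_{OC}^2 + 4r\sigma_C^2$
Occasions#Cars               3(r − 1)    $\mathbf{Y}'\mathbf{Q}_{OC}\mathbf{Y}$           $\sigma_{ODC}^2 + 4\sigma_{OC}^2$
Drivers#Cars[Occasions]      9r          $\mathbf{Y}'\mathbf{Q}_{ODC}\mathbf{Y}$
  Additives                  3           $\mathbf{Y}'\mathbf{Q}_A\mathbf{Y}$              $\sigma_{ODC}^2 + q_A(\boldsymbol{\psi})$
  Residual                   9r − 3      $\mathbf{Y}'\mathbf{Q}_{ODC_{Res}}\mathbf{Y}$    $\sigma_{ODC}^2$
Total                        16r − 1

• Hypothesis tests for Drivers[Occasions], Cars, Occasions#Cars and Additives are straightforward

• The test for Occasions is more difficult.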


c) Case 3: different drivers and cars

• rows and columns of the square are rerandomized on the second occasion.

• Layout (r=2)

Occasions          1                 2
Cars           1  2  3  4        1  2  3  4
Drivers
   1           B  A  C  D        D  B  C  A
   2           C  D  B  A        A  C  B  D
   3           A  B  D  C        B  D  A  C
   4           D  C  A  B        C  A  D  B









B. The experimental structure

Structure        Formula
unrandomized     2 Occasions/(4 Drivers*4 Cars)
randomized       4 Additives

C. Sources derived from the structure formulae

Occasions/(Drivers*Cars)
  = Occasions/(Drivers + Cars + Drivers#Cars)
  = Occasions + Drivers[Occasions] + Cars[Occasions]
    + Drivers#Cars[Occasions]

Additives = Additives

D. Degrees of freedom and sums of squares

Left as an exercise



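Source                       df          SSq
Occasions                    r − 1       $\mathbf{Y}'\mathbf{Q}_O\mathbf{Y}$
Drivers[Occasions]           3r          $\mathbf{Y}'\mathbf{Q}_{OD}\mathbf{Y}$
Cars[Occasions]              3r          $\mathbf{Y}'\mathbf{Q}_{OC}\mathbf{Y}$
Drivers#Cars[Occasions]      9r          $\mathbf{Y}'\mathbf{Q}_{ODC}\mathbf{Y}$
  Additives                  3           $\mathbf{Y}'\mathbf{Q}_A\mathbf{Y}$
  Residual                   9r − 3      $\mathbf{Y}'\mathbf{Q}_{ODC_{Res}}\mathbf{Y}$
Total                        16r − 1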





– Occasions, Occasions∧Drivers, Occasions∧Cars, Occasions∧Drivers∧Cars.

• and the variation model is
var[Y] = Occasions + Occasions∧Drivers + Occasions∧Cars + Occasions∧Drivers∧Cars


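G. The expected mean squares

• The Hasse diagram, with contributions to E[MSq]s, for the unrandomized factors in this study is left as an exercise.

• The single randomized factor Additive will contribute $q_A(\boldsymbol{\psi})$ to the E[MSq] for its source.

ANOVA table with E[MSq]

Source                       df          SSq                                              E[MSq]
Occasions                    r − 1       $\mathbf{Y}'\mathbf{Q}_O\mathbf{Y}$              $\sigma_{ODC}^2 + 4\sigma_{OC}^2 + 4\sigma_{OD}^2 + 16\sigma_O^2$
Drivers[Occasions]           3r          $\mathbf{Y}'\mathbf{Q}_{OD}\mathbf{Y}$           $\sigma_{ODC}^2 + 4\sigma_{OD}^2$
Cars[Occasions]              3r          $\mathbf{Y}'\mathbf{Q}_{OC}\mathbf{Y}$           $\sigma_{ODC}^2 + 4\sigma_{OC}^2$
Drivers#Cars[Occasions]      9r          $\mathbf{Y}'\mathbf{Q}_{ODC}\mathbf{Y}$
  Additives                  3           $\mathbf{Y}'\mathbf{Q}_A\mathbf{Y}$              $\sigma_{ODC}^2 + q_A(\boldsymbol{\psi})$
  Residual                   9r − 3      $\mathbf{Y}'\mathbf{Q}_{ODC_{Res}}\mathbf{Y}$    $\sigma_{ODC}^2$
Total                        16r − 1

• Hypothesis tests for Drivers[Occasions], Cars[Occasions] and Additives are straightforward

• The test for Occasions is more difficult.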


d) Comparison of Latin Square Analyses

One Square only
Unrandomized structure: Car*Driver

Source              df
Car                 3
Driver              3
Driver#Car          9
  Additives         3
  Residual          6

Case 1: Same drivers and cars
Unrandomized structure: 2 Occasions*4 Driver*4 Car

Source              df
Occ                 r − 1
Driver              3
Car                 3
Occ#Driver          3(r − 1)
Occ#Car             3(r − 1)
Driver#Car          9
  Additives         3
  Residual          6
Occ#Driver#Car      9(r − 1)
Total               16r − 1

Case 2: Same cars, different drivers
Unrandomized structure: (2 Occ/4 Driver)*4 Car

Source              df
Occ                 r − 1
Driver[Occ]         3r
Cars                3
Occ#Car             3(r − 1)
Drive#Car[Occ]      9r
  Additives         3
  Residual          9r − 3
Total               16r − 1

Case 3: Different drivers and cars
Unrandomized structure: 2 Occ/(4 Driver*4 Car)

Source              df
Occ                 r − 1
Driver[Occ]         3r
Car[Occ]            3r
Drive#Car[Occ]      9r
  Additives         3
  Residual          9r − 3
Total               16r − 1


e) Computation of ANOVA in R

• The analysis of one of these experiments in R is obtained by using a model formula in which the explanatory variables are specified to be

• Additives + Error(unrandomized structure formula)

• In addition, fixed terms in the unrandomized structure formula need to be also included with the terms outside the Error function.

• For further information see Appendix C, Analysis of designed experiments in R.
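To make the model formulae concrete, the sketch below fits the three cases with `aov`, using the layouts given earlier for r = 2. The response is simulated, so the numbers themselves are meaningless; only the strata and degrees of freedom are of interest. The helper names `make_data` and `sq` are hypothetical, and all unrandomized factors are treated as random, so none is repeated outside `Error`.

```r
set.seed(2024)  # arbitrary seed for the simulated response

# Convert a list of 4 x 4 squares (one per occasion) into a data frame in
# standard order: Occasions, then Drivers, then Cars.
make_data <- function(squares) {
  dat <- expand.grid(Cars = factor(1:4), Drivers = factor(1:4),
                     Occasions = factor(seq_along(squares)))
  dat$Additives <- factor(unlist(lapply(squares, function(s) as.vector(t(s)))))
  dat$Reduct.NO <- rnorm(nrow(dat), mean = 20, sd = 2)  # simulated, not real data
  dat
}
# Build a square from strings, one string per driver (row)
sq <- function(rows) do.call(rbind, strsplit(rows, ""))

case1 <- make_data(list(sq(c("ABCD", "CDAB", "DCBA", "BADC")),
                        sq(c("ABCD", "CDAB", "DCBA", "BADC"))))
case2 <- make_data(list(sq(c("CABD", "ACDB", "BDCA", "DBAC")),
                        sq(c("DBAC", "ACDB", "CABD", "BDCA"))))
case3 <- make_data(list(sq(c("BACD", "CDBA", "ABDC", "DCAB")),
                        sq(c("DBCA", "ACBD", "BDAC", "CADB"))))

# Additives + Error(unrandomized structure formula) for each case
fit1 <- aov(Reduct.NO ~ Additives + Error(Occasions*Drivers*Cars), data = case1)
fit2 <- aov(Reduct.NO ~ Additives + Error((Occasions/Drivers)*Cars), data = case2)
fit3 <- aov(Reduct.NO ~ Additives + Error(Occasions/(Drivers*Cars)), data = case3)

print(summary(fit1))   # Additives tested in the Drivers:Cars stratum
print(summary(fit2))   # Additives tested in the Occasions:Drivers:Cars stratum
print(summary(fit3))
```

With r = 2 the Residual lines should show 6 df for Case 1 and 9r − 3 = 15 df for Cases 2 and 3, matching the comparison table in d).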


V.I Exercises

• Ex. V-1 asks for a randomized layout for a Latin square

• Ex. V-2 looks at random versus fixed effects

• Ex. V-3 asks for the complete analysis of an LSD with a qualitative treatment factor

• Ex. V-4 investigates the design and analysis of Sets of Latin squares
