STAT3010: Lecture 7

Repeated Measures Analysis of Variance (Section 9.6, Page 436)

Repeated measurements are measurements made on the same subjects over a period of time. The purpose of these repeated measurements is to assess changes in the subjects' measurements over time. For example, a hypothesis might be that the measurements decrease over time.

Here's the data layout:

Subject    Time 1      Time 2      ...    Time k
1          $x_{11}$    $x_{12}$    ...    $x_{1k}$
2          $x_{21}$    $x_{22}$    ...    $x_{2k}$
.          .           .                  .
.          .           .                  .
n          $x_{n1}$    $x_{n2}$    ...    $x_{nk}$

In these applications, we have n subjects, and we take repeated measurements on each of the n subjects.

Outline of the Procedure:

1. Set up the hypothesis:

2. Compute the test statistic:

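Repeated Measures Analysis of Variance Table

Source of Variation    Sums of Squares (SS)    Degrees of Freedom (df)    Mean Squares (MS)    F
Between Subjects       $SS_{subj}$             n-1
Between Trtmts         $SS_b$                  k-1                        $MS_b$               $F = MS_b / MS_w$
Within                 $SS_w$                  (n-1)(k-1)                 $MS_w$
Total                  $SS_{total}$            nk-1

where

$SS_{subj} = k \sum (\bar{X}_{s\cdot} - \bar{X}_{\cdot\cdot})^2$
$SS_b = n \sum (\bar{X}_{\cdot j} - \bar{X}_{\cdot\cdot})^2$
$SS_w = SS_{total} - SS_{subj} - SS_b$
$SS_{total} = SS_{subj} + SS_b + SS_w$
$s_b^2 = MS_b = SS_b / (k-1)$
$s_w^2 = MS_w = SS_w / ((n-1)(k-1))$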
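For reference, the pieces of the table fit together as sketched below, in the table's notation (n subjects, k treatments). The double-sum definition of $SS_{total}$ is the standard one and is an assumption here, since the table itself gives only the decomposition; the last line states the reference distribution of F when the treatment means are all equal.

```latex
\begin{align*}
SS_{total} &= \sum_{s=1}^{n} \sum_{j=1}^{k} (X_{sj} - \bar{X}_{\cdot\cdot})^2 \\
(n-1) + (k-1) + (n-1)(k-1) &= nk - 1 \\
F = \frac{MS_b}{MS_w} &\sim F_{\,k-1,\;(n-1)(k-1)} \quad \text{under } H_0
\end{align*}
```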

3. Decision Rule:

4. Conclusion.

Example 9.10: Repeated Measures ANOVA to Test Difference in Mean Completion Times Among 3 Training Courses

We're comparing the cardiovascular fitness of elite runners on three different training courses, each of which covers 10 miles. Course 1 is flat, Course 2 has graded inclines, and Course 3 includes steep inclines. Each runner's heart rate is monitored at mile 5 of the run on each course. Ten runners are involved, and their heart rates measured on each course are shown:

Runner Number    Course 1    Course 2    Course 3
1                132         135         138
2                143         148         148
3                135         138         141
4                128         131         139
5                141         141         150
6                150         156         161
7                131         134         138
8                150         156         162
9                142         145         151
10               139         165         160

Is there a significant difference in the mean heart rates of runners on the three courses? Run the appropriate test at a 5% level of significance.

First of all, how is this different from our old examples?

Treatment 1    Treatment 2    Treatment 3
29.0           25.1           20.1
29.2           25.0           20.0
29.1           25.0           19.9
28.9           24.9           19.8
28.8           25.0           20.2

Here, we have a total of 15 people who are randomly assigned to one of these 3 treatments, whereas in Example 9.10 above we have a total of 10 runners, each of whom is measured repeatedly, once on each course.

1. Set up the hypotheses:

2. Compute the test statistic:

Construct an ANOVA table, but first:

Runner #    Course 1    Course 2    Course 3
1           132         135         138
2           143         148         148
3           135         138         141
4           128         131         139
5           141         141         150
6           150         156         161
7           131         134         138
8           150         156         162
9           142         145         151
10          139         165         160

The between subjects sums of squares is

The between treatments sums of squares is

The total sums of squares is

The within sums of squares is

Analysis of Variance Table

Source of Variation    Sums of Squares (SS)    Degrees of Freedom (df)    Mean Squares (MS)    F
Between Subjects
Between Trtmts
Within
Total
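The sums of squares for Example 9.10 can be checked numerically. The following is a minimal sketch, not the textbook's worked solution: it applies the formulas from the repeated measures ANOVA table to the runner data, and it assumes that the total sum of squares is the sum of squared deviations of all observations from the grand mean. The variable names and the constant `F_CRIT` (the tabled 5% critical value of F with 2 and 18 degrees of freedom, about 3.55) are introduced here for illustration.

```python
# Heart rates from Example 9.10: one row per runner, one column per course.
heart_rates = [
    [132, 135, 138],
    [143, 148, 148],
    [135, 138, 141],
    [128, 131, 139],
    [141, 141, 150],
    [150, 156, 161],
    [131, 134, 138],
    [150, 156, 162],
    [142, 145, 151],
    [139, 165, 160],
]

n = len(heart_rates)       # number of subjects (runners)
k = len(heart_rates[0])    # number of treatments (courses)

grand_mean = sum(sum(row) for row in heart_rates) / (n * k)
subject_means = [sum(row) / k for row in heart_rates]
course_means = [sum(row[j] for row in heart_rates) / n for j in range(k)]

# Sums of squares, following the ANOVA table.
ss_subj = k * sum((m - grand_mean) ** 2 for m in subject_means)
ss_b = n * sum((m - grand_mean) ** 2 for m in course_means)
# Assumption: SS_total is the squared deviation of every observation
# from the grand mean.
ss_total = sum((x - grand_mean) ** 2 for row in heart_rates for x in row)
ss_w = ss_total - ss_subj - ss_b

# Degrees of freedom, mean squares, and the F statistic.
df_b = k - 1
df_w = (n - 1) * (k - 1)
ms_b = ss_b / df_b
ms_w = ss_w / df_w
f_stat = ms_b / ms_w

# Assumption: tabled 5% critical value of F with (2, 18) df.
F_CRIT = 3.55

print(f"SS_subj  = {ss_subj:.4f} (df = {n - 1})")
print(f"SS_b     = {ss_b:.4f} (df = {df_b}), MS_b = {ms_b:.4f}")
print(f"SS_w     = {ss_w:.4f} (df = {df_w}), MS_w = {ms_w:.4f}")
print(f"SS_total = {ss_total:.4f} (df = {n * k - 1})")
print(f"F = {f_stat:.2f}; reject H0 at the 5% level: {f_stat > F_CRIT}")
```

Run on these data, the sketch gives sums of squares of about 2224.53 (subjects), 476.47 (treatments), 274.87 (within) and 2975.87 (total), and an F statistic of about 15.60 on 2 and 18 degrees of freedom.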

STAT3010: Lecture 7

6

3. Decision Rule:

4. Conclusion:

Using SAS:

SAS CODE:

options ps=62 ls=80;
/* One record per runner; subjmean is that runner's mean over the three courses. */
data repeatedmeasurments;
  input runner course1 course2 course3;
  subjmean=mean(course1,course2,course3);
cards;
1 132 135 138
2 143 148 148
3 135 138 141
4 128 131 139
5 141 141 150
6 150 156 161
7 131 134 138
8 150 156 162
9 142 145 151
10 139 165 160
run;
/* List the data together with the subject means. */
proc print;
  var course1 course2 course3 subjmean;
run;

/* Summary statistics for each course and for the subject means. */
proc means;
  var course1 course2 course3 subjmean;
run;
/* Repeated measures ANOVA; NOUNI suppresses the separate univariate analysis of each course. */
proc glm;
  model course1 course2 course3=/nouni;
  repeated course;
run;

SAS OUTPUT:

The SAS System

Obs    course1    course2    course3    subjmean
  1      132        135        138       135.000
  2      143        148        148       146.333
  3      135        138        141       138.000
  4      128        131        139       132.667
  5      141        141        150       144.000
  6      150        156        161       155.667
  7      131        134        138       134.333
  8      150        156        162       156.000
  9      142        145        151       146.000
 10      139        165        160       154.667

The SAS System

The MEANS Procedure

Variable     N           Mean        Std Dev        Minimum        Maximum
course1     10    139.1000000      7.6077446    128.0000000    150.0000000
course2     10    144.9000000     11.2195266    131.0000000    165.0000000
course3     10    148.8000000      9.6930674    138.0000000    162.0000000
subjmean    10    144.2666667      9.0769005    132.6666667    156.0000000

The SAS System

The GLM Procedure

Number of Observations Read    10
Number of Observations Used    10

The SAS System

The GLM Procedure
Repeated Measures Analysis of Variance

Repeated Measures Level Information

Dependent Variable    course1    course2    course3
Level of course             1          2          3

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no course Effect
H = Type III SSCP Matrix for course
E = Error SSCP Matrix

S=1 M=0 N=3

Statistic        Value         F Value    Num DF    Den DF    Pr > F
Wilks' Lambda    0.09790213    36.86      2         8         [...]

[...]

                 Adj Pr > F
Source           G - G     H - F
course           0.0013    0.0009
Error(course)

Greenhouse-Geisser Epsilon    0.6350
Huynh-Feldt Epsilon           0.6922

Once a significant F-test has been obtained for a repeated-measures ANOVA, post-hoc (pairwise) tests can be done to determine which of the means differ. Commonly used tests include the least significant difference (LSD) test, the modified Bonferroni t-test, and the Sidak test.
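As an illustration of pairwise comparisons for the runner data of Example 9.10, here is a minimal sketch that evaluates a modified Bonferroni statistic for each pair of courses. It assumes the statistic has the form t = (difference of two course means) / sqrt(2 MS_E / n), with MS_E the within (error) mean square on (n-1)(k-1) degrees of freedom; the variable names are introduced here for illustration, and the critical value still has to be looked up separately in the t table.

```python
from itertools import combinations
from math import sqrt

# Heart rates from Example 9.10: one row per runner, one column per course.
heart_rates = [
    [132, 135, 138], [143, 148, 148], [135, 138, 141], [128, 131, 139],
    [141, 141, 150], [150, 156, 161], [131, 134, 138], [150, 156, 162],
    [142, 145, 151], [139, 165, 160],
]
n, k = len(heart_rates), len(heart_rates[0])

grand_mean = sum(sum(row) for row in heart_rates) / (n * k)
subject_means = [sum(row) / k for row in heart_rates]
course_means = [sum(row[j] for row in heart_rates) / n for j in range(k)]

# Error mean square: total SS minus the subject and treatment SS,
# divided by the error degrees of freedom.
ss_total = sum((x - grand_mean) ** 2 for row in heart_rates for x in row)
ss_subj = k * sum((m - grand_mean) ** 2 for m in subject_means)
ss_b = n * sum((m - grand_mean) ** 2 for m in course_means)
df_error = (n - 1) * (k - 1)
ms_e = (ss_total - ss_subj - ss_b) / df_error

# Assumed form of the statistic: t = (mean_i - mean_j) / sqrt(2 * MS_E / n).
std_err = sqrt(2 * ms_e / n)
t_stats = {
    (i + 1, j + 1): (course_means[i] - course_means[j]) / std_err
    for i, j in combinations(range(k), 2)
}

for (i, j), t in t_stats.items():
    print(f"Course {i} vs Course {j}: t = {t:.3f} (error df = {df_error})")
```

Each statistic is then compared with the tabled critical value at the error degrees of freedom, and all pairs are tested.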

The Modified Bonferroni t-test:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{2\,MS_E / n}}$$

with the critical value of t taken from Table B.3 using the error degrees of freedom.

Note: All comparisons must be completed (similar to the Scheffe Procedure).

Randomized Complete Block Designs

Another type of repeated measurements ANOVA is the randomized complete block design. The most straightforward of the randomized block designs is one in which we randomly assign each treatment once to every block, with each block constituting a single replication of the treatments. A typical layout for the randomized complete block design (RCB) using 3 treatments in 4 blocks is as follows:

Block 1    Block 2    Block 3    Block 4
$t_2$      $t_1$      $t_3$      $t_2$
$t_1$      $t_3$      $t_2$      $t_1$
$t_3$      $t_2$      $t_1$      $t_3$

The t's denote the assignment to blocks of each of the 3 treatments. Of course, the true allocation of treatments to units within blocks is done at random. Once the experiment has been completed, the data can be recorded as in the following 3 x 4 array:

             Block
Treatment    1           2           3           4
1            $y_{11}$    $y_{12}$    $y_{13}$    $y_{14}$
2            $y_{21}$    $y_{22}$    $y_{23}$    $y_{24}$
3            $y_{31}$    $y_{32}$    $y_{33}$    $y_{34}$

where $y_{11}$ represents the response obtained by using treatment 1 in block 1, $y_{12}$ represents the response obtained by using treatment 1 in block 2, ..., and $y_{34}$ represents the response obtained by using treatment 3 in block 4.

Example: Four different machines are being considered for assembling a particular product. It is decided that 6 different operators are to be used in a randomized block experiment to compare the machines. The machines are assigned in a random order to each operator. Operating the machines requires physical dexterity, and it is anticipated that the operators will differ in the speed with which they operate the machines. The times (in seconds) required to assemble the product were recorded:

           Operator
Machine    1        2        3        4        5        6        Total
1          42.5     39.3     39.6     39.9     42.9     43.6     247.8
2          39.8     40.1     40.5     42.3     42.5     43.1     248.3
3          40.2     40.5     41.3     43.4     44.9     45.1     255.4
4          41.3     42.2     43.5     44.2     45.9     42.3     259.4
Total      163.8    162.1    164.9    169.8    176.2    174.1    1010.9

Test the hypothesis at the 0.05 level of significance that the machines perform at the same mean rate of speed.

SAS CODE:

options ps=62 ls=80;
/* One record per operator (block); blockmean is that operator's mean over the four machines. */
data randomizedblock;
  input block machine1 machine2 machine3 machine4;
  blockmean=mean(machine1,machine2,machine3,machine4);
cards;
1 42.5 39.8 40.2 41.3
2 39.3 40.1 40.5 42.2
3 39.6 40.5 41.3 43.5
4 39.9 42.3 43.4 44.2
5 42.9 42.5 44.9 45.9
6 43.6 43.1 45.1 42.3
run;
/* List the data together with the block means. */
proc print;
  var machine1 machine2 machine3 machine4 blockmean;
run;
/* Summary statistics for each machine and for the block means. */
proc means;
  var machine1 machine2 machine3 machine4 blockmean;
run;
/* Treat the four machine times as repeated measurements on each operator. */
proc glm;
  model machine1 machine2 machine3 machine4=/nouni;
  repeated machine;
run;

SAS OUTPUT:

The SAS System

Obs    machine1    machine2    machine3    machine4    blockmean
  1      42.5        39.8        40.2        41.3        40.950
  2      39.3        40.1        40.5        42.2        40.525
  3      39.6        40.5        41.3        43.5        41.225
  4      39.9        42.3        43.4        44.2        42.450
  5      42.9        42.5        44.9        45.9        44.050
  6      43.6        43.1        45.1        42.3        43.525

The SAS System

The MEANS Procedure

Variable     N          Mean       Std Dev       Minimum       Maximum
machine1     6    41.3000000     1.9047310    39.3000000    43.6000000
machine2     6    41.3833333     1.4119726    39.8000000    43.1000000
machine3     6    42.5666667     2.1924112    40.2000000    45.1000000
machine4     6    43.2333333     1.6609234    41.3000000    45.9000000
blockmean    6    42.1208333     1.4506392    40.5250000    44.0500000

The SAS System

The GLM Procedure

Number of Observations Read    6
Number of Observations Used    6

The SAS System

The GLM Procedure
Repeated Measures Analysis of Variance

Repeated Measures Level Information

Dependent Variable    machine1    machine2    machine3    machine4
Level of machine             1           2           3           4

MANOVA Test Criteria and Exact F Statistics for the Hypothesis of no machine Effect
H = Type III SSCP Matrix for machine
E = Error SSCP Matrix

S=1 M=0.5 N=0.5

Statistic                 Value         F Value    Num DF    Den DF    Pr > F
Wilks' Lambda             0.17064463    4.86       3         3         0.1133
Pillai's Trace            0.82935537    4.86       3         3         0.1133
Hotelling-Lawley Trace    4.86013150    4.86       3         3         0.1133
Roy's Greatest Root       4.86013150    4.86       3         3         0.1133

The SAS System

The GLM Procedure
Repeated Measures Analysis of Variance
Univariate Tests of Hypotheses for Within Subject Effects

Source            DF    Type III SS    Mean Square    F Value    Pr > F
machine            3    15.92458333    5.30819444     3.34       0.0479
Error(machine)    15    23.84791667    1.58986111

                  Adj Pr > F
Source            G - G     H - F
machine           0.0820    0.0482
Error(machine)

Greenhouse-Geisser Epsilon    0.6283
Huynh-Feldt Epsilon           0.9963
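The univariate test in this output can be reproduced by hand from the machine data. The following is a minimal sketch, independent of SAS, that treats operators as blocks and machines as treatments and partitions the total sum of squares into block, machine, and error components; the variable names are introduced here for illustration.

```python
# Assembly times in seconds: one row per operator (block),
# one column per machine (treatment), as entered in the SAS data step.
times = [
    [42.5, 39.8, 40.2, 41.3],
    [39.3, 40.1, 40.5, 42.2],
    [39.6, 40.5, 41.3, 43.5],
    [39.9, 42.3, 43.4, 44.2],
    [42.9, 42.5, 44.9, 45.9],
    [43.6, 43.1, 45.1, 42.3],
]

b = len(times)       # number of blocks (operators)
t = len(times[0])    # number of treatments (machines)

grand_mean = sum(sum(row) for row in times) / (b * t)
block_means = [sum(row) / t for row in times]
machine_means = [sum(row[j] for row in times) / b for j in range(t)]

# Partition the total sum of squares into blocks, machines, and error.
ss_total = sum((y - grand_mean) ** 2 for row in times for y in row)
ss_block = t * sum((m - grand_mean) ** 2 for m in block_means)
ss_machine = b * sum((m - grand_mean) ** 2 for m in machine_means)
ss_error = ss_total - ss_block - ss_machine

df_machine = t - 1
df_error = (b - 1) * (t - 1)
ms_machine = ss_machine / df_machine
ms_error = ss_error / df_error
f_stat = ms_machine / ms_error

print(f"machine: df = {df_machine}, SS = {ss_machine:.8f}, MS = {ms_machine:.8f}")
print(f"error:   df = {df_error}, SS = {ss_error:.8f}, MS = {ms_error:.8f}")
print(f"F = {f_stat:.2f}")
```

The machine and error lines agree with the Type III SS, Mean Square, and F Value columns of the univariate table above. With the unadjusted p-value of 0.0479 reported there, the hypothesis of equal machine means is rejected at the 0.05 level, although the Greenhouse-Geisser adjusted p-value (0.0820) is not below 0.05.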

One more example in SAS:

Example 9.4: Analysis of Variance (ANOVA): Unequal Sample Sizes. Compare Mean Ages among Three Groups of Students

The following data reflect ages of students at completion of eighth grade. Test if there is a significant difference in the mean age at completion of eighth grade for rural, suburban, and urban students using SAS and a 5% level of significance. The following data were collected from randomly selected students at rural, suburban, and urban schools.

Rural:    14 14 14 14 13 13 13 12
Suburban: 14 14 14 13 13 13 13 13 12 12
Urban:    16 16 15 15 15 14 14 14 13 12

SAS CODE:

options ps=62 ls=80;
/* One record per student: school type (character) and age. */
data unequal;
  input school $ age;
cards;
rural 14
rural 14
rural 14
rural 14
rural 13
rural 13
rural 13
rural 12
suburban 14
suburban 14
suburban 14
suburban 13
suburban 13
suburban 13
suburban 13
suburban 13
suburban 12
suburban 12
urban 16
urban 16
urban 15
urban 15
urban 15
urban 14
urban 14
urban 14
urban 13
urban 12
run;
/* One-way ANOVA of age on school type. */
proc glm;
  class school;
  model age=school;
run;

SAS OUTPUT:

The SAS System

The GLM Procedure

Class Level Information

Class     Levels    Values
school    3         rural suburban urban

Number of Observations Read    28
Number of Observations Used    28

The SAS System

The GLM Procedure

Dependent Variable: age

                         Sum of
Source             DF    Squares        Mean Square    F Value    Pr > F
Model               2     9.25357143    4.62678571     4.99       0.0150
Error              25    23.17500000    0.92700000
Corrected Total    27    32.42857143

R-Square    Coeff Var    Root MSE    age Mean
0.285352    7.057234     0.962808    13.64286

Source    DF    Type I SS     Mean Square    F Value    Pr > F
school     2    9.25357143    4.62678571     4.99       0.0150

Source    DF    Type III SS    Mean Square    F Value    Pr > F
school     2    9.25357143     4.62678571     4.99       0.0150
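The ANOVA table in this output can also be checked directly from the raw ages. The following is a minimal sketch, independent of SAS, of a one-way ANOVA with unequal group sizes: each group's squared deviation from the grand mean is weighted by that group's sample size. The variable names are introduced here for illustration.

```python
from math import sqrt

# Ages at completion of eighth grade, by school type (Example 9.4).
ages = {
    "rural": [14, 14, 14, 14, 13, 13, 13, 12],
    "suburban": [14, 14, 14, 13, 13, 13, 13, 13, 12, 12],
    "urban": [16, 16, 15, 15, 15, 14, 14, 14, 13, 12],
}

all_ages = [a for group in ages.values() for a in group]
n_total = len(all_ages)
k = len(ages)
grand_mean = sum(all_ages) / n_total
group_means = {name: sum(g) / len(g) for name, g in ages.items()}

# Between-groups (model) SS: squared deviations weighted by group size.
ss_model = sum(
    len(g) * (group_means[name] - grand_mean) ** 2 for name, g in ages.items()
)
# Within-groups (error) SS: deviations of each age from its own group mean.
ss_error = sum(
    (a - group_means[name]) ** 2 for name, g in ages.items() for a in g
)
ss_total = ss_model + ss_error

df_model = k - 1
df_error = n_total - k
ms_model = ss_model / df_model
ms_error = ss_error / df_error
f_stat = ms_model / ms_error
r_square = ss_model / ss_total
root_mse = sqrt(ms_error)

print(f"Model: df = {df_model}, SS = {ss_model:.8f}, MS = {ms_model:.8f}")
print(f"Error: df = {df_error}, SS = {ss_error:.8f}, MS = {ms_error:.8f}")
print(f"F = {f_stat:.2f}, R-Square = {r_square:.6f}, Root MSE = {root_mse:.6f}")
```

The results match the Model and Error rows, the R-Square, and the Root MSE in the listing above. Since the reported p-value (0.0150) is below 0.05, the hypothesis of equal mean ages is rejected at the 5% level.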
