Chapter 14 – Repeated-Measures Designs
[As in previous chapters, there will be substantial rounding in
these answers. I have attempted to make the answers fit with the
correct values, rather than the exact results of the specific
calculations shown here. Thus I may round cell means to two
decimals, but calculation is carried out with many more
decimals.]
14.1 Does taking the GRE repeatedly lead to higher scores?
a. Statistical model:
X_ij = μ + π_i + τ_j + πτ_ij + e_ij   or   X_ij = μ + π_i + τ_j + e_ij
b. Analysis:
SS_total = ΣX² − (ΣX)²/N = 7,811,200 − 13,520²/24 = 194,933.33
SS_subj = t·Σ(X̄_i. − X̄..)² = 3[(566.67 − 563.33)² + ... + (573.33 − 563.33)²] = 3(63,222.22) = 189,666.67
SS_test = n·Σ(X̄_.j − X̄..)² = 8[(552.50 − 563.33)² + (563.75 − 563.33)² + (573.75 − 563.33)²] = 8(226.04) = 1808.33
SS_error = SS_total − SS_subj − SS_test = 194,933.33 − 189,666.67 − 1808.33 = 3458.33
Source            df        SS         MS        F
Subjects           7   189,666.67
Within subj       16     5,266.67
  Test session     2     1,808.33    904.17    3.66 ns
  Error           14     3,458.33    247.02
Total             23   194,933.33
14.2 Data on first two Test Sessions in Exercise 14.1:
a. Related-sample t test:

Subj    First   Second   Diff
 1       550     570      20
 2       440     440       0
 3       610     630      20
 4       650     670      20
 5       400     460      60
 6       700     680     -20
 7       490     510      20
 8       580     550     -30
Mean                     11.25
s_D = sqrt[(ΣD² − (ΣD)²/n)/(n − 1)] = sqrt[(6500 − 90²/8)/7] = 27.999
t = (D̄ − 0)/(s_D/√n) = (11.25 − 0)/(27.999/√8) = 1.14
[t.025(7) = ±2.365]
Do not reject H0
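The paired t above is easy to verify with a short script; this is a sketch using only the Python standard library, with the subject scores taken from the table above.

```python
from math import sqrt

# Paired-samples t test on the first two GRE test sessions.
# Each tuple is (first score, second score) for one subject.
scores = [(550, 570), (440, 440), (610, 630), (650, 670),
          (400, 460), (700, 680), (490, 510), (580, 550)]

diffs = [second - first for first, second in scores]
n = len(diffs)
mean_d = sum(diffs) / n
# Unbiased sample standard deviation of the difference scores.
sd_d = sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
t = mean_d / (sd_d / sqrt(n))
```

With these data the script reproduces D̄ = 11.25, s_D = 27.999, and t = 1.14.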
b. Repeated-measures ANOVA:

Source            df        SS        MS       F
Between subj       7   130,793.75
Within subj        8     3,250.00
  Test session     1       506.25    506.25   1.29 ns
  Error            7     2,743.75    391.96
Total             15   134,043.75
F = 1.29 = 1.14² = t² from part a.
14.3 Teaching of self-care skills to severely retarded children:
Cell means:

                Phase
            Baseline   Training   Mean
Exp           4.80       7.00     5.90
Control       4.70       6.40     5.55
Mean          4.75       6.70     5.72
Subject means:

           S1    S2    S3    S4    S5    S6    S7    S8    S9    S10
Exp        8.5   6.0   2.5   6.0   5.5   6.5   6.5   5.5   5.5   6.5
Control    4.0   5.0   9.0   3.5   4.0   8.0   7.5   4.5   5.0   5.0
ΣX² = 1501   ΣX = 229   N = 40   n = 10   g = 2   p = 2
SS_total = ΣX² − (ΣX)²/N = 1501 − 229²/40 = 189.975
SS_subj = p·Σ(X̄_i.. − X̄...)² = 2[(8.5 − 5.72)² + ... + (5.0 − 5.72)²] = 106.475
SS_group = pn·Σ(X̄_.k. − X̄...)² = 2(10)[(5.90 − 5.72)² + (5.55 − 5.72)²] = 1.225
SS_phase = gn·Σ(X̄_.j. − X̄...)² = 2(10)[(4.75 − 5.72)² + (6.70 − 5.72)²] = 38.025
SS_cells = n·Σ(X̄_jk − X̄...)² = 10[(4.80 − 5.72)² + ... + (6.40 − 5.72)²] = 39.875
SS_PG = SS_cells − SS_phase − SS_group = 39.875 − 38.025 − 1.225 = 0.625
Source               df      SS        MS        F
Between Subj         19    106.475
  Groups              1      1.225     1.225    0.21
  Ss w/in Grps       18    105.250     5.847
Within Subj          20     83.500
  Phase               1     38.025    38.025   15.26*
  P × G               1      0.625     0.625    0.25
  P × Ss w/in Grps   18     44.850     2.492
Total                39    189.975
*p < .05 [F.05(1,18) = 4.41]
There is a significant difference between baseline and training,
but there are no group differences nor a group x phase
interaction.
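The SS partition for this exercise can be cross-checked from the reported summary values; a sketch in Python (the raw scores are not reproduced here, so SS_total uses the reported sums ΣX² = 1501 and ΣX = 229 with N = 40).

```python
# Verify the SS partition for Exercise 14.3 from reported summaries.
n, g, p = 10, 2, 2                       # subjects/group, groups, phases
grand = 229 / 40                         # grand mean
ss_total = 1501 - 229 ** 2 / 40

phase_means = [4.75, 6.70]               # Baseline, Training
group_means = [5.90, 5.55]               # Exp, Control
cell_means = [4.80, 7.00, 4.70, 6.40]    # the four Phase x Group cells

ss_phase = g * n * sum((m - grand) ** 2 for m in phase_means)
ss_group = p * n * sum((m - grand) ** 2 for m in group_means)
ss_cells = n * sum((m - grand) ** 2 for m in cell_means)
ss_pg = ss_cells - ss_phase - ss_group   # Phase x Group interaction
```

This reproduces SS_total = 189.975, SS_phase = 38.025, SS_group = 1.225, and SS_PG = 0.625.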
14.4 Independent t test on data in Exercise 14.3:
a. Difference scores (Training − Baseline):

Exper:    1   2  -1   2   7   1   3  -1   3   5
Control:  2   0   2   3  -2   4   3   1   4   0
X̄_E = 2.2   s²_E = 6.1778   n_E = 10
X̄_C = 1.7   s²_C = 3.7889   n_C = 10

t = (X̄_E − X̄_C)/sqrt(s²_E/n_E + s²_C/n_C) = (2.2 − 1.7)/sqrt(6.1778/10 + 3.7889/10) = 0.50
[t.025(18) = ±2.101]
Do not reject H0
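The same computation can be sketched in Python from the difference scores listed above.

```python
from math import sqrt

# Independent-groups t test on the Training - Baseline difference scores.
exp = [1, 2, -1, 2, 7, 1, 3, -1, 3, 5]
ctl = [2, 0, 2, 3, -2, 4, 3, 1, 4, 0]

def mean_and_var(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

m_e, v_e = mean_and_var(exp)
m_c, v_c = mean_and_var(ctl)
t = (m_e - m_c) / sqrt(v_e / len(exp) + v_c / len(ctl))
```

This gives t = 0.50, and t² = 0.25, which is the F for the P × G interaction in Exercise 14.3.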
b. t² = 0.50² = 0.25 = F for the P × G interaction.
c.The t is a test on whether the baseline vs. training
difference is the same for both groups. This is a test of an
interaction, not a test of overall group differences.
14.5 Adding a No Attention control group to the study in Exercise 14.3:
a. Analysis:
Cell means:

                 Phase
             Baseline   Training   Total
Exp             4.8        7.0     5.90
Att Cont        4.7        6.4     5.55
No Att Cont     5.1        4.6     4.85
Total           4.87       6.00    5.43
Subject means:

              S1    S2    S3    S4    S5    S6    S7    S8    S9    S10
Exp           8.5   6.0   2.5   6.0   5.5   6.5   6.5   5.5   5.5   6.5
Att Cont      4.0   5.0   9.0   3.5   4.0   8.0   7.5   4.5   5.0   5.0
No Att Cont   3.5   5.0   7.0   5.5   4.5   6.5   6.5   4.5   2.5   3.0
ΣX² = 2026   ΣX = 326   N = 60   n = 10   g = 3   p = 2
SS_total = ΣX² − (ΣX)²/N = 2026 − 326²/60 = 254.733
SS_subj = p·Σ(X̄_i.. − X̄...)² = 2[(8.5 − 5.43)² + ... + (3.0 − 5.43)²] = 159.733
SS_group = pn·Σ(X̄_.k. − X̄...)² = 2(10)[(5.90 − 5.43)² + (5.55 − 5.43)² + (4.85 − 5.43)²] = 11.433
SS_phase = gn·Σ(X̄_.j. − X̄...)² = 3(10)[(4.87 − 5.43)² + (6.00 − 5.43)²] = 19.267
SS_cells = n·Σ(X̄_jk − X̄...)² = 10[(4.80 − 5.43)² + ... + (4.60 − 5.43)²] = 51.333
SS_PG = SS_cells − SS_phase − SS_group = 51.333 − 19.267 − 11.433 = 20.633
Source               df       SS         MS         F
Between subj         29    159.7333
  Groups              2     11.4333     5.7166    1.04
  Ss w/in Grps       27    148.300      5.4926
Within subj          30     95.0000
  Phase               1     19.2667    19.2667    9.44*
  P × G               2     20.6333    10.3165    5.06*
  P × Ss w/in Grps   27     55.1000     2.0407
Total                59    254.733
*p < .05 [F.05(1,27) = 4.22; F.05(2,27) = 3.36]
b. Plot:
[Figure: plot of the Baseline and Training cell means (y-axis 4.0–7.0) for the Exp, Att Cont, and No Att Cont groups.]
c.There seems to be no difference between the Experimental and
Attention groups, but both show significantly more improvement than
the No Attention group.
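The interaction SS that drives this conclusion can be recomputed from the cell means above; a sketch in Python.

```python
# Recompute the Phase, Group, and P x G sums of squares for the
# three-group design of Exercise 14.5 from the reported cell means.
cell = {("Exp", "Base"): 4.8, ("Exp", "Train"): 7.0,
        ("AttC", "Base"): 4.7, ("AttC", "Train"): 6.4,
        ("NoAtt", "Base"): 5.1, ("NoAtt", "Train"): 4.6}
n, g, p = 10, 3, 2
groups = ["Exp", "AttC", "NoAtt"]
phases = ["Base", "Train"]
grand = sum(cell.values()) / len(cell)   # equal n, so this is the grand mean
g_mean = {gr: sum(cell[(gr, ph)] for ph in phases) / p for gr in groups}
p_mean = {ph: sum(cell[(gr, ph)] for gr in groups) / g for ph in phases}

ss_group = p * n * sum((g_mean[gr] - grand) ** 2 for gr in groups)
ss_phase = g * n * sum((p_mean[ph] - grand) ** 2 for ph in phases)
ss_cells = n * sum((m - grand) ** 2 for m in cell.values())
ss_pg = ss_cells - ss_phase - ss_group
```

This reproduces SS_group = 11.433, SS_phase = 19.267, and SS_PG = 20.633.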
14.6 Summarization of stories by adult and child good and poor readers:
Cell Means for Age × Readers × Items:

Adults
               Setting   Goal   Disp   Mean
Good Readers     6.20    6.0    5.0    5.73
Poor Readers     5.40    4.8    2.0    4.07
Mean             5.80    5.40   3.50   4.90

Children
               Setting   Goal   Disp   Mean
Good Readers     5.80    5.60   3.00   4.80
Poor Readers     3.00    2.40   1.20   2.20
Mean             4.40    4.00   2.10   3.50
Cell Means for Age × Readers:

               Adults   Children   Mean
Good Readers    5.73      4.80     5.27
Poor Readers    4.07      2.20     3.13
Mean            4.90      3.50     4.20
Cell Means for Age × Items:

              Adults   Children   Mean
Setting        5.80      4.40     5.10
Goal           5.40      4.00     4.70
Disposition    3.50      2.10     2.80
Mean           4.90      3.50     4.20
Cell Means for Reader × Items:

              Good Readers   Poor Readers   Mean
Setting           6.00           4.20       5.10
Goal              5.80           3.60       4.70
Disposition       4.00           1.60       2.80
Mean              5.27           3.13       4.20
Subject Means:
Good Adult Readers:   7.00   5.00   5.00   7.00   4.67
Good Child Readers:   4.00   6.33   6.00   4.33   3.33
Poor Adult Readers:   5.33   3.00   4.67   3.00   4.33
Poor Child Readers:   2.00   1.00   3.33   3.33   1.33
SS_total = ΣX² − (ΣX)²/N = 1312 − 252²/60 = 253.600
SS_subj = i·Σ(X̄_subj − X̄..)² = 3[(7.00 − 4.20)² + ... + (1.33 − 4.20)²] = 164.933
SS_age = irn·Σ(X̄_age − X̄..)² = 3(2)(5)[(4.90 − 4.20)² + (3.50 − 4.20)²] = 29.400
SS_reader = ain·Σ(X̄_reader − X̄..)² = 2(3)(5)[(5.27 − 4.20)² + (3.13 − 4.20)²] = 68.267
SS_cells AR = in·Σ(X̄_AR − X̄..)² = 3(5)[(5.73 − 4.20)² + ... + (2.20 − 4.20)²] = 100.933
SS_AR = SS_cells AR − SS_age − SS_reader = 100.933 − 29.400 − 68.267 = 3.267
SS_item = arn·Σ(X̄_item − X̄..)² = 2(2)(5)[(5.10 − 4.20)² + (4.70 − 4.20)² + (2.80 − 4.20)²] = 60.400
SS_cells AI = rn·Σ(X̄_AI − X̄..)² = 2(5)[(5.80 − 4.20)² + ... + (2.10 − 4.20)²] = 89.800
SS_AI = SS_cells AI − SS_age − SS_item = 89.800 − 29.400 − 60.400 = 0.000
SS_cells RI = an·Σ(X̄_RI − X̄..)² = 2(5)[(6.00 − 4.20)² + ... + (1.60 − 4.20)²] = 129.600
SS_RI = SS_cells RI − SS_reader − SS_item = 129.600 − 68.267 − 60.400 = 0.933
SS_cells ARI = n·Σ(X̄_ARI − X̄..)² = 5[(6.20 − 4.20)² + ... + (1.20 − 4.20)²] = 170.800
SS_ARI = SS_cells ARI − SS_age − SS_reader − SS_item − SS_AR − SS_AI − SS_RI
       = 170.800 − 29.400 − 68.267 − 60.400 − 3.267 − 0.000 − 0.933 = 8.533
14.7 From Exercise 14.6:
a. Simple effect of reading ability for children:
SS_R at C = in·Σ(X̄_R at C − X̄_C)² = 3(5)[(4.80 − 3.50)² + (2.20 − 3.50)²] = 50.70
MS_R at C = SS_R at C/df_R at C = 50.70/1 = 50.70
Because we are using only the data from Children, it would be
wise not to use a pooled error term. The following is the relevant
printout from SPSS for the Between-subject effect of Reader.
Tests of Between-Subjects Effects (AGE = Children)
Measure: MEASURE_1   Transformed Variable: Average

Source      Type III Sum of Squares   df   Mean Square      F      Sig.
Intercept         367.500              1     367.500      84.483   .000
READERS            50.700              1      50.700      11.655   .009
Error              34.800              8       4.350
b. Simple effect of items for adult good readers:
SS_I at AG = n·Σ(X̄_I at AG − X̄_AG)² = 5[(6.20 − 5.73)² + (6.00 − 5.73)² + (5.00 − 5.73)²] = 4.133
Again, we do not want to pool error terms. The following is the
relevant printout from SPSS for Adult Good readers. The difference
is not significant, nor would it be for any decrease in the df if
we used a correction factor.
Tests of Within-Subjects Effects (Sphericity Assumed)
Measure: MEASURE_1

Source         Type III Sum of Squares   df   Mean Square     F      Sig.
ITEMS                  4.133              2      2.067       3.647   .075
Error(ITEMS)           4.533              8       .567
14.8 Within-groups covariance matrices for the data in Exercise 14.6:

            | 1.70   1.25   1.00 |             | 1.70   1.90   1.25 |
Σ̂_AG =     | 1.25   2.50   1.25 |    Σ̂_CG =  | 1.90   3.30   1.50 |
            | 1.00   1.25   1.00 |             | 1.25   1.50   1.00 |

            | 1.30   1.10   0.75 |             | 2.00   2.00   0.25 |
Σ̂_AP =     | 1.10   1.70   1.00 |    Σ̂_CP =  | 2.00   2.80   0.40 |
            | 0.75   1.00   1.00 |             | 0.25   0.40   0.70 |

                | 1.6750   1.5625   0.8125 |
Σ̂_pooled =     | 1.5625   2.5750   1.0375 |
                | 0.8125   1.0375   0.9250 |
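With equal group sizes the pooled matrix is just the element-wise average of the four within-group matrices; a sketch in Python.

```python
# Element-wise average of the four within-group covariance matrices
# (equal n per group, so a simple mean pools them).
AG = [[1.70, 1.25, 1.00], [1.25, 2.50, 1.25], [1.00, 1.25, 1.00]]
CG = [[1.70, 1.90, 1.25], [1.90, 3.30, 1.50], [1.25, 1.50, 1.00]]
AP = [[1.30, 1.10, 0.75], [1.10, 1.70, 1.00], [0.75, 1.00, 1.00]]
CP = [[2.00, 2.00, 0.25], [2.00, 2.80, 0.40], [0.25, 0.40, 0.70]]

pooled = [[sum(m[r][c] for m in (AG, CG, AP, CP)) / 4 for c in range(3)]
          for r in range(3)]
```

The result matches the pooled matrix shown above (e.g., the 1,1 element is (1.70 + 1.70 + 1.30 + 2.00)/4 = 1.6750).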
14.9 It would certainly affect the covariances because we would force a high level of covariance among items. As the number of responses classified at one level of Item went up, another item would have to go down.
14.10 Cigarette smoking quitting techniques:
a. Analysis:
Cell Means for Group × Time × Place:

                    Pre             Post
               Home    Work    Home    Work    Mean
Taper          6.80    6.00    5.80    3.60    5.55
Immediate      7.00    6.20    5.80    4.80    5.95
Aversion       7.00    6.20    4.80    2.40    5.10
Mean           6.93    6.13    5.47    3.60    5.53
Means Group × Time:                    Means Group × Place:

            Pre    Post   Mean                     Home   Work   Mean
Taper       6.40   4.70   5.55         Taper       6.30   4.80   5.55
Immediate   6.60   5.30   5.95         Immediate   6.40   5.50   5.95
Aversion    6.60   3.60   5.10         Aversion    5.90   4.30   5.10
Mean        6.53   4.53   5.53         Mean        6.20   4.87   5.53
Time × Place:

        Pre    Post   Total
Home    6.93   5.47   6.20
Work    6.13   3.60   4.87
Total   6.53   4.53   5.53
Subject × Time:

Pre:    6.5   4.5   7.5   8.0   5.5   7.5   5.0   6.5   7.5   6.5   8.5   4.0   7.0   6.0   7.5
Post:   5.0   3.5   5.5   5.5   4.0   6.5   4.5   5.5   5.5   4.5   4.5   2.5   4.0   2.5   4.5
Mean:   5.75  4.00  6.50  6.75  4.75  7.00  4.75  6.00  6.50  5.50  6.50  3.25  5.50  4.25  6.00
Subject × Place:

Home:   6.5   5.0   7.5   7.0   5.5   7.5   5.0   6.5   7.0   6.0   7.0   3.5   6.0   6.0   7.0
Work:   5.0   3.0   5.5   5.5   4.0   6.5   4.5   5.5   6.0   5.0   6.0   3.0   5.0   2.5   5.0
SS_total = ΣX² − (ΣX)²/N = 2024 − 332²/60 = 186.933
SS_subj = tp·Σ(X̄_subj − X̄...)² = 2(2)[(5.75 − 5.53)² + ... + (6.00 − 5.53)²] = 69.433
SS_group = tpn·Σ(X̄_group − X̄...)² = 2(2)(5)[(5.55 − 5.53)² + (5.95 − 5.53)² + (5.10 − 5.53)²] = 7.233
SS_time = gpn·Σ(X̄_time − X̄...)² = 3(2)(5)[(6.53 − 5.53)² + (4.53 − 5.53)²] = 60.000
SS_cells TS = p·Σ(X̄_TS − X̄...)² = 2[(6.50 − 5.53)² + ... + (4.50 − 5.53)²] = 143.933
SS_TS = SS_cells TS − SS_time − SS_subj = 143.933 − 60.000 − 69.433 = 14.500
SS_cells GT = pn·Σ(X̄_GT − X̄...)² = 2(5)[(6.40 − 5.53)² + ... + (3.60 − 5.53)²] = 75.133
SS_GT = SS_cells GT − SS_group − SS_time = 75.133 − 7.233 − 60.000 = 7.900
SS_place = gtn·Σ(X̄_place − X̄...)² = 3(2)(5)[(6.20 − 5.53)² + (4.87 − 5.53)²] = 26.667
SS_cells PS = t·Σ(X̄_PS − X̄...)² = 2[(6.50 − 5.53)² + ... + (5.00 − 5.53)²] = 104.933
SS_PS = SS_cells PS − SS_place − SS_subj = 104.933 − 26.667 − 69.433 = 8.833
SS_cells GP = tn·Σ(X̄_GP − X̄...)² = 2(5)[(6.30 − 5.53)² + ... + (4.30 − 5.53)²] = 35.333
SS_GP = SS_cells GP − SS_group − SS_place = 35.333 − 7.233 − 26.667 = 1.433
SS_cells TP = gn·Σ(X̄_TP − X̄...)² = 3(5)[(6.93 − 5.53)² + ... + (3.60 − 5.53)²] = 90.933
SS_TP = SS_cells TP − SS_time − SS_place = 90.933 − 60.000 − 26.667 = 4.267
SS_cells TPS = SS_total = 186.933
SS_TPS = SS_cells TPS − SS_time − SS_place − SS_subj − SS_TS − SS_PS − SS_TP
       = 186.933 − 60.000 − 26.667 − 69.433 − 14.500 − 8.833 − 4.267 = 3.233
SS_cells GTP = n·Σ(X̄_GTP − X̄...)² = 5[(6.80 − 5.53)² + ... + (2.40 − 5.53)²] = 108.933
SS_GTP = SS_cells GTP − SS_group − SS_time − SS_place − SS_GT − SS_GP − SS_TP
       = 108.933 − 7.233 − 60.000 − 26.667 − 7.900 − 1.433 − 4.267 = 1.433
Source                     df       SS        MS         F
Between subj               14     69.433
  Group                     2      7.233     3.617      0.70
  Ss w/in grp              12     62.200     5.183
Within subj                45    117.500
  Time                      1     60.000    60.000    109.09*
  T × S                    14     14.500
    G × T                   2      7.900     3.950      7.18*
    T × Ss w/in grps       12      6.600     0.550
  Place                     1     26.667    26.667     43.24*
  P × S                    14      8.833
    G × P                   2      1.433     0.717      1.16
    P × Ss w/in grps       12      7.400     0.617
  T × P                     1      4.267     4.267     28.44*
  T × P × S                14      3.233
    T × P × G               2      1.433     0.717      4.78*
    T × P × Ss w/in grps   12      1.800     0.150
Total                      59    186.933
*p < .05 [F.05(1,12) = 4.75; F.05(2,12) = 3.89]
b. There is a significant decrease in desire from Pre to Post, and desire is significantly lower at Work than at Home. There is also a Time × Place interaction, with a greater Place difference after treatment. The Time × Group interaction is the real test of our hypothesis, showing that the Pre–Post difference depends on the treatment group, with the greatest difference in the Aversion condition. The three-way interaction shows that the Time × Group interaction itself differs with Place.
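The Time, Place, and Time × Place components can be cross-checked from the Group × Time × Place cell means; a sketch in Python using the values from the tables above.

```python
g, n = 3, 5  # groups, subjects per group
# Group x Time x Place cell means (order: Taper, Immediate, Aversion).
cells = {"pre_home": [6.8, 7.0, 7.0], "pre_work": [6.0, 6.2, 6.2],
         "post_home": [5.8, 5.8, 4.8], "post_work": [3.6, 4.8, 2.4]}
tp = {k: sum(v) / g for k, v in cells.items()}     # Time x Place means
grand = sum(tp.values()) / 4

ss_cells_tp = g * n * sum((m - grand) ** 2 for m in tp.values())
time_means = [(tp["pre_home"] + tp["pre_work"]) / 2,
              (tp["post_home"] + tp["post_work"]) / 2]
place_means = [(tp["pre_home"] + tp["post_home"]) / 2,
               (tp["pre_work"] + tp["post_work"]) / 2]
ss_time = 2 * g * n * sum((m - grand) ** 2 for m in time_means)
ss_place = 2 * g * n * sum((m - grand) ** 2 for m in place_means)
ss_tp = ss_cells_tp - ss_time - ss_place
```

This reproduces SS_time = 60.000, SS_place = 26.667, and SS_TP = 4.267.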
14.11 Plot of results in Exercise 14.10:
[Figure: plot of the Group × Time means (y-axis 3.0–7.0) from Pre to Post for the Taper, Immediate, and Aversion groups.]
14.12 I will look at the Group × Time interaction by examining the simple effect of Time for each Group. For a more complex design we should run separate analyses for each group, to avoid problems with sphericity. However, with only two levels of the repeated-measures variable we have only one off-diagonal covariance, so we don't have a problem with constant covariances. I will test each simple effect against the same error term that was used to test the Group × Time interaction, MS_G×T×Ss w/in grps. (The pattern of significance would not change with separate analyses. The Fs from separate analyses would be 44.46, 18.78, and 51.43, respectively.)
SS_time at Taper = pn·Σ(X̄_j at Taper − X̄_Taper)² = 2(5)[(6.40 − 5.55)² + (4.70 − 5.55)²] = 14.45
MS_time at Taper = SS_time at Taper/df = 14.45/1 = 14.45
F = MS_time at Taper/MS_error = 14.45/0.55 = 26.27*
===
(
)
(
)
(
)
(
)
2
22
22
.....
256.605.955.305.958.45
timeatimmedj
SSpnXX
=S=
éù
=-+-=
ëû
8.45
8.45
1
8.45
15.36*
0.55
tati
timeatimmed
tati
tati
timeatimmed
error
SS
MS
df
MS
F
MS
===
===
SS_time at Aversion = pn·Σ(X̄_j at Aver − X̄_Aver)² = 2(5)[(6.60 − 5.10)² + (3.60 − 5.10)²] = 45.00
MS_time at Aversion = SS_time at Aversion/df = 45.00/1 = 45.00
F = MS_time at Aversion/MS_error = 45.00/0.55 = 81.82*
*p < .05 [F.05(1,12) = 4.75]
These SS_simple effects sum to the same total as SS_Time + SS_G×T:
14.45 + 8.45 + 45.00 = 67.90 = 60.00 + 7.90
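These simple effects can be sketched in Python from the Group × Time means, tested against the error term from the overall analysis (MS = 0.55 on 12 df).

```python
# Simple effects of Time within each group of Exercise 14.10.
p, n = 2, 5            # places, subjects per group
ms_error = 0.55        # error MS from the overall analysis
gt_means = {"Taper": (6.40, 4.70),
            "Immediate": (6.60, 5.30),
            "Aversion": (6.60, 3.60)}

ss = {}
for grp, (pre, post) in gt_means.items():
    m = (pre + post) / 2
    ss[grp] = p * n * ((pre - m) ** 2 + (post - m) ** 2)
f = {grp: ss[grp] / ms_error for grp in ss}
```

This reproduces SS values of 14.45, 8.45, and 45.00 (summing to 67.90) with Fs of 26.27, 15.36, and 81.82.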
Each of the methods led to a significant reduction in desire to
smoke. If we then look at the effect of Group at Post:
SS_group at Post = pn·Σ(X̄_k at Post − X̄_Post)² = 2(5)[(4.70 − 4.53)² + (5.30 − 4.53)² + (3.60 − 4.53)²] = 14.867
MS_group at Post = SS_group at Post/df = 14.867/2 = 7.433
MS_w/in cell = (SS_Ss w/in grps + SS_G×T×Ss w/in grps)/(df_Ss w/in grps + df_G×T×Ss w/in grps)
             = (66.200 + 6.600)/(12 + 12) = 3.033
F = MS_group at Post/MS_w/in cell = 7.433/3.033 = 2.45
This would not be significant even for the maximum possible value of F′, meaning that we do not have data that would allow us to recommend one method over the others. (If we had run a separate analysis on just the Posttest data, the corresponding F would have been 4.33, with p = .038.)
14.13 Analysis of data in Exercise 14.5 by BMDP:
a. Comparison with results obtained by hand in Exercise 14.5.
b. The F for Mean is a test of H0: μ = 0.
c. MS_w/in cell is the average of the cell variances.
14.14 An analysis of data in Exercise 14.6 by SPSS as if it were a factorial:
From Exercise 14.6:
MS_w/in cell = (SS_Ss w/in groups + SS_I×Ss w/in groups)/(df_Ss w/in groups + df_I×Ss w/in groups)
             = (64.000 + 18.800)/(16 + 32) = 1.725
This equals the MS_Residual in the SPSS printout.
14.15 Source column of the summary table for a 4-way ANOVA with repeated measures on A & B and independent measures on C & D:
Source
Between Ss
C
D
CD
Ss w/in groups
Within Ss
A
AC
AD
ACD
A x Ss w/in groups
B
BC
BD
BCD
B x Ss w/in groups
AB
ABC
ABD
ABCD
AB x Ss w/in groups
Total
14.16 Analysis of Foa et al. (1991) data
All three groups decreased over time, but the Supportive
Counseling group decreased the least and the interaction was
significant.
14.17 Using the mixed models procedure on data from Exercise 14.16
If we assume that sphericity is a reasonable assumption, we
could run the analysis with covtype(cs). That will give us the
following, and we can see that the F’s are the same as they were in
our analysis above.
However, the correlation matrix below would make us concerned
about the reasonableness of a sphericity assumption. (This matrix
is collapsed over groups, but reflects the separate matrices well.)
Therefore we will assume an autoregressive model for our
correlations.
These F values are reasonably close, but certainly not the
same.
14.18 Standard analysis with missing data:
Notice that we have lost considerable degrees of freedom, and our F for Group is no longer significant.
14.19 Mixed model analysis with the unequal-size example.
Notice that we have a substantial change in the F for Time,
though it is still large.
14.20 Analysis of Stress data:

Source               df       SS        MS       F      Pillai F    Prob
Between subj         97    137.683
  Gender              1      7.296     7.296    5.64*
  Role                1      8.402     8.402    6.49*
  G × R               1      0.298     0.298    <1
  Ss w/in Grps       94    121.687     1.294
Within subj          98     87.390
  Time                1      1.064     1.064    1.23      1.23     0.2700
  T × G               1      0.451     0.451    <1        0.52     0.4720
  T × R               1      0.001     0.001    <1        0.00     0.9708
  T × G × R           1      4.652     4.652    5.38*     5.38     0.0225
  T × Ss w/in grps   94     81.222     0.864
Total               195    225.073
*p < .05
The univariate and multivariate F values agree because we have
only two levels of each independent variable.
14.21 Everitt's study of anorexia:
a. SPSS printout on gain scores:

Tests of Between-Subjects Effects
Dependent Variable: GAIN

Source            Type III Sum of Squares   df   Mean Square      F       Sig.
Corrected Model         614.644(a)            2     307.322      5.422    .006
Intercept               732.075               1     732.075     12.917    .001
TREAT                   614.644               2     307.322      5.422    .006
Error                  3910.742              69      56.677
Total                  5075.400              72
Corrected Total        4525.386              71
a. R Squared = .136 (Adjusted R Squared = .111)
b. SPSS printout using pretest and posttest:

Tests of Within-Subjects Effects (Sphericity Assumed)
Measure: MEASURE_1

Source          Type III Sum of Squares   df   Mean Square      F       Sig.
TIME                   366.037             1     366.037      12.917    .001
TIME * TREAT           307.322             2     153.661       5.422    .006
Error(TIME)           1955.371            69      28.339
c.The F comparing groups on gain scores is exactly the same as
the F for the interaction in the repeated measures design.
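That equivalence can be checked numerically from the two SPSS tables; a sketch in Python (each repeated-measures SS is half the corresponding gain-score SS, so the F ratios match).

```python
# Gain-score ANOVA vs. repeated-measures interaction, using the SS and df
# values reported in the SPSS tables for Everitt's anorexia data.
f_gain = (614.644 / 2) / (3910.742 / 69)   # TREAT effect on gain scores
f_rm = (307.322 / 2) / (1955.371 / 69)     # TIME * TREAT interaction
```

Both ratios come out to 5.42, matching the printed F of 5.422 in each table.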
d.
[Figure: scatterplots of POSTTEST against PRETEST, one panel per group (TREAT = 1.00 Cognitive Behavioral; TREAT = 2.00 Control; TREAT = 3.00 Family Therapy).]
The plots show that there is quite a different relationship
between the variables in the different groups.
e. Treatment Group = Control

One-Sample Statistics
        N     Mean     Std. Deviation   Std. Error Mean
GAIN    26   -.4500       7.9887            1.5667

One-Sample Test (Test Value = 0)
                                                       95% Confidence Interval
                                                          of the Difference
         t     df   Sig. (2-tailed)   Mean Difference     Lower      Upper
GAIN   -.287   25        .776             -.4500         -3.6767    2.7767
This group did not gain significantly over the course of the
study. This suggests that any gain we see in the other groups
cannot be attributed to normal gains seen as a function of age.
f.Without the control group we could not separate gains due to
therapy from gains due to maturation.
14.22 When multiple respondents come from the same family, their
data are not likely to be independent. We act as if we have 98
different respondents, but in fact we do not have 98 independent
respondents, which is important. If we had complete data from each
family we could treat Spouse and Patient as a repeated measures
variable—it is a "within-family" variable. Alternatively, we could
delete data so as to have only one respondent per family. In this
situation, numerous studies have shown that there is a remarkably
small degree of dependence between members of the same family, and
many people would ignore the problem entirely.
14.23 a. t = -0.555. There is no difference in Time 1 scores between those who did, and did not, have a score at Time 2.
b. If there had been differences, I would have worried that people did not drop out at random.
14.24 Intraclass correlation:

IC = (MS_subjects − MS_J×S) / [MS_subjects + (j − 1)·MS_J×S + j(MS_judge − MS_J×S)/n]
   = (82.57 − 4.08) / [82.57 + 2(4.08) + 3(70.1 − 4.08)/20]
   = 78.49 / (82.57 + 8.16 + 6.30)
   = .85
The remainder of this exercise raises some questions that anyone interested in the reliability of their data (and we all should be) needs to be prepared to answer.
14.25 Differences due to Judges play an important role.
14.26 I would leave the variability due to Judge out of my calculations entirely.
14.27 If I were particularly interested in differences between subjects, and recognized that judges probably didn't have a good anchoring point, and if this lack was not meaningful, I would not be interested in considering it.
14.28 The fact that the "parent" who supplies the data changes from case to case simply adds additional variability to our data, and this variability is confounded with differences between children.
14.29 Strayer et al. (2006):
b. Contrasts on means:
Because the variances within each condition are so similar, I have used MS_error(within) as my error term. The means are 776.95, 778.95, and 849.00 for the Baseline, Alcohol, and Cell phone conditions, respectively.
ψ̂_1 vs 2 = 776.95 − 778.95 = −2.00
ψ̂_1 vs 3 = 776.95 − 849.00 = −72.05
ψ̂_2 vs 3 = 778.95 − 849.00 = −70.05

denominator = sqrt(Σa_i²·MS_error/n) = sqrt(2(16,303.709)/40) = 28.551

t_1 vs 2 = −2.00/28.551 = −0.07
t_1 vs 3 = −72.05/28.551 = −2.52*
t_2 vs 3 = −70.05/28.551 = −2.45*
The Cell phone condition differs significantly from both the Baseline and Alcohol conditions, but, interestingly, the Baseline and Alcohol conditions do not differ from each other.
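The contrast ts can be sketched in Python from the condition means and the error term quoted above.

```python
from math import sqrt

# Pairwise contrasts among the three condition means, using
# MS_error(within) = 16,303.709 with n = 40 per condition.
means = {"Baseline": 776.95, "Alcohol": 778.95, "Cell": 849.00}
den = sqrt(2 * 16303.709 / 40)   # sqrt(sum(a_i^2) * MS_error / n)

t_base_alc = (means["Baseline"] - means["Alcohol"]) / den
t_base_cell = (means["Baseline"] - means["Cell"]) / den
t_alc_cell = (means["Alcohol"] - means["Cell"]) / den
```

This reproduces a denominator of 28.551 and |t| values of 0.07, 2.52, and 2.45.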
14.30 Study by Teri et al. (1997):
The following are the results from SPSS. As we would expect, the
interaction is significant. In fact, the F for the interaction is
exactly equal to the F for the Group effect on Change in Exercise
11-37, and the sum of squares for both the interaction and the
error term are exactly half of what they were in Exercise
11-37.
Tests of Within-Subjects Effects
Measure: MEASURE_1

Source                                Type III Sum of Squares    df       Mean Square      F       Sig.
PrePost          Sphericity Assumed         164.360               1         164.360      32.557    .000
                 Greenhouse-Geisser         164.360               1.000     164.360      32.557    .000
                 Huynh-Feldt                164.360               1.000     164.360      32.557    .000
                 Lower-bound                164.360               1.000     164.360      32.557    .000
PrePost * Group  Sphericity Assumed         195.695               3          65.232      12.922    .000
                 Greenhouse-Geisser         195.695               3.000      65.232      12.922    .000
                 Huynh-Feldt                195.695               3.000      65.232      12.922    .000
                 Lower-bound                195.695               3.000      65.232      12.922    .000
Error(PrePost)   Sphericity Assumed         343.285              68           5.048
                 Greenhouse-Geisser         343.285              68.000       5.048
                 Huynh-Feldt                343.285              68.000       5.048
                 Lower-bound                343.285              68.000       5.048
Tests of Between-Subjects Effects
Measure: MEASURE_1   Transformed Variable: Average

Source       Type III Sum of Squares   df   Mean Square       F       Sig.
Intercept         26031.127             1    26031.127     911.469    .000
Group                42.771             3       14.257        .499    .684
Error              1942.048            68       28.560
Chapter 15 - Multiple Regression
15.1 Predicting Quality of Life:
a.All other variables held constant, a difference of +1 degree
in Temperature is associated with a difference of –.01 in perceived
Quality of Life. A difference of $1000 in median Income, again all
other variables held constant, is associated with a +.05 difference
in perceived Quality of Life. A similar interpretation applies to
b3 and b4. Since values of 0.00 cannot reasonably occur for all
predictors, the intercept has no meaningful interpretation.
b. Ŷ = 5.37 − .01(55) + .05(12) + .003(500) − .01(200) = 4.92
c. Ŷ = 5.37 − .01(55) + .05(12) + .003(100) − .01(200) = 3.72
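The two predictions above can be wrapped in a small helper; a sketch in Python, where the argument names (temp, income, socser, pop) are labels I am assuming for the four predictors b1–b4.

```python
# Prediction equation from Exercise 15.1 (argument names are assumed
# labels for the four predictors, not names from the original data).
def predict_qol(temp, income, socser, pop):
    return 5.37 - 0.01 * temp + 0.05 * income + 0.003 * socser - 0.01 * pop

y_b = predict_qol(55, 12, 500, 200)   # part b
y_c = predict_qol(55, 12, 100, 200)   # part c
```

The helper reproduces 4.92 for part b and 3.72 for part c.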
15.2 A difference of +1 standard deviation in Temperature is
associated with a difference of ‑.438 standard deviations in
perceived quality of life, while a difference of +1 standard
deviation in Income is associated with about three quarters of a
standard deviation difference in perceived Quality of Life. A
similar interpretation can be made for the other variables, but in
all cases it is assumed that all variables are held constant except
for the one in question.
15.3 The F values for the four regression coefficients would be as follows:

F1 = (b1/s_b1)² = (−0.438/0.397)² = 1.22
F2 = (b2/s_b2)² = (0.762/0.252)² = 9.14
F3 = (b3/s_b3)² = (0.081/0.052)² = 2.43
F4 = (b4/s_b4)² = (−0.132/0.025)² = 27.88

I would thus delete Temperature, since it has the smallest F, and therefore the smallest semi-partial correlation with the dependent variable.
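The four ratios are quick to verify; a sketch in Python, with predictor names assumed from Exercise 15.1.

```python
# F for each coefficient is (b / s_b)^2; the keys are assumed labels.
coef = {"Temp": (-0.438, 0.397), "Income": (0.762, 0.252),
        "SocSer": (0.081, 0.052), "Pop": (-0.132, 0.025)}
f = {name: (b / s) ** 2 for name, (b, s) in coef.items()}
smallest = min(f, key=f.get)   # candidate for deletion
```

This reproduces Fs of 1.22, 9.14, 2.43, and 27.88, with Temperature smallest.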
15.4 Predicting Job Satisfaction:
a. Ŷ = .605·Respon − .334·NumSup + .486·Envir + .070·Yrs + 1.669
b. β = [.624   −.311   .514   .063]
15.5 a. Envir has the largest semi-partial correlation with the criterion, because it has the largest value of t.
b. The gain in prediction (from r = .58 to R = .697) that we obtain by using all the predictors is more than offset by the loss of power we sustain as p becomes large relative to N.
15.6 Adjusted R² for the data in Exercise 15.4:

est R*² = 1 − (1 − R²)(N − 1)/(N − p − 1) = 1 − (1 − .4864)(14)/(15 − 4 − 1) = .2810
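The adjustment is a one-line function; a sketch in Python (the same formula reappears in Exercise 15.13).

```python
# Adjusted (shrunken) R-squared: 1 - (1 - R^2)(N - 1)/(N - p - 1).
def adj_r2(r2, n, p):
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)
```

With R² = .4864, N = 15, p = 4 it returns .281; with R² = .173 it returns the negative value −.158 seen in Exercise 15.13.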
15.7 As the correlation between two variables decreases, the
amount of variance in a third variable that they share decreases.
Thus the higher will be the possible squared semi-partial
correlation of each variable with the criterion. They each can
account for more previously unexplained variation.
15.8 The more highly two predictor variables are intercorrelated,
the more "substitutability" there is between them. Thus for one
data set variable X1 may receive the greater weight, but for a
second data set variable X2 may happen to "substitute" for X1,
leaving X1 with only a minor role to play.
15.9 The tolerance column shows us that NumSup and Respon are
fairly well correlated with the other predictors, whereas Yrs is
nearly independent of them.
15.10 For the data in Exercise 15.4:
Ŷ = .605·Respon − .334·NumSup + .486·Envir + .070·Yrs + 1.669

Satisfaction    Ŷ
     2         3.26
     2         2.85
     3         5.90
     3         4.63
     5         4.18
     5         8.15
     6         4.73
     6         5.59
     6         6.91
     7         5.99
     8         7.86
     8         5.62
     8         5.93
     9         6.86
     9         8.55

s_Y = 2.426   s_Ŷ = 1.693   cov_YŶ = 2.864
r_YŶ = cov_YŶ/(s_Y·s_Ŷ) = 2.864/[(2.426)(1.693)] = .697 = R_0.1234
15.11 Using Y and Ŷ from Exercise 15.10:

MS_residual = Σ(Y − Ŷ)²/(N − p − 1) = 42.322/(15 − 4 − 1) = 4.232 (also calculated by BMDP in Exercise 15.4)
15.12 The effect of sample size on the multiple correlation,
using random data:
For N = 15, R2 = .173
For N = 10, R2 = .402
For N = 6, R2 = .498 Notice that the correlation increases as N
- p decreases.
For N = 5, R2 = 1.000 Here N is equal to the number of
variables.
For N = 4 the matrix is singular and there is no solution.
15.13 Adjusted R² for 15 cases in Exercise 15.12:

R²_0.1234 = .173
est R*² = 1 − (1 − R²)(N − 1)/(N − p − 1) = 1 − (1 − .173)(14)/(15 − 4 − 1) = −.158

Since a squared value cannot be negative, we will declare it undefined. This is all the more reasonable in light of the fact that we cannot reject H0: R* = 0.
15.14 Using the first three variables from Exercise 15.4:
a. The squared semi-partial correlation is .32546 − .31654 = .00892.
The squared partial correlation is .00892/(1 − .31654) = .01305.
b. Venn diagram illustrating squared partial and semi-partial correlations for Satisfaction predicted by Number Supervised, partialling out Responsibility.
[Figure: Venn diagram of overlapping variance among Satisfaction, NumSup, and Respon, showing r²_0(2.1) and R²_0.12.]
15.15 Using the first three variables from Exercise 15.4:
a. Figure comparable to Figure 15.1.
b. Ŷ = 0.4067·Respon + 0.1845·NumSup + 2.3542
The slope of the plane with respect to the Respon axis (X1) = .4067
The slope of the plane with respect to the NumSup axis (X2) = .1845
The plane intersects the Y axis at 2.3542.
15.16 Predicting percentage of low-birthweight live births from Vermont Health Planning statistics:
a. R_0.2 = .6215   R_0.25 = .7748   R_0.254 = .8181
b. r²_0(5.2) = r²_05.2(1 − R²_0.2) = (−.59063)²(1 − .3862) = .2141
R²_0.25 = R²_0.2 + r²_0(5.2) = .3862 + .2141 = .6003 = .7748²
15.17 It has no meaning in that we have the data for the
population of interest (the 10 districts).
15.18 The gain in R² is not sufficient to offset the gain in p
relative to N.
15.19 It plays a major role through its correlation with the
residual components of the other variables.
15.20 For the data in Exercise 15.16:

 Y      Ŷ1     Ŷ2
6.1    34.2   4.48
7.1    37.3   5.12
7.4    43.9   5.57
6.3    36.6   4.74
6.5    41.0   5.22
5.7    29.6   3.87
6.6    33.0   4.51
8.1    43.0   5.40
6.3    30.6   4.32
6.9    41.3   5.48

Ŷ1 = 1·X2 + 1·X4 − 3·X5
Ŷ2 = .10446·X2 + .18972·X4 − .29372·X5

Using the approximate regression coefficients, the correlation between Y and Ŷ would be 0.793, instead of the 0.818 obtained from the actual regression equation.
15.21 Within the context of a multiple-regression equation, we
cannot look at one variable alone. The slope for one variable is
only the slope for that variable when all other variables are held
constant. The percentage of mothers not seeking care until the
third trimester is correlated with a number of other variables.
15.22 Create a set of data illustrating leverage, distance, and influence.
15.23 Create a set of data examining residuals.
15.24 Modeling depression:

DepressT = 0.614·PVLoss − 0.164·SuppTotl − 0.106·AgeAtLoss + 59.961
R² = .2443   F(3,131) = 14.115

Both PVLoss and SuppTotl are significant predictors, but AgeAtLoss is not.
15.25 Rerun of Exercise 15.24 adding PVTotal.
b. The value of R² was virtually unaffected. However, the standard error of the regression coefficient for PVLoss increased from 0.105 to 0.178. Tolerance for PVLoss decreased from .981 to .345, whereas VIF increased from 1.019 to 2.900.
c. PVTotal should not be included in the model because it is redundant with the other variables.
15.26 Vulnerability as a function of social support and age at loss:

PVLoss = .041·SuppTotl + .164·AgeAtLoss + 17.578
R² = .0203; F = 1.42; F.05(2,137) = 3.06

The relationship is not significant. Neither of the predictors is significant.
15.27 Path diagram showing the relationships among the variables in the model:
[Figure: path diagram linking SuppTotl, AgeAtLoss, PVLoss, and DepressT, with standardized path coefficients −.2361 (SuppTotl → DepressT), .0837 (SuppTotl → PVLoss), .4490 (PVLoss → DepressT), .1099, and −.0524.]
15.28 The direct effect of SuppTotl is represented by the arrow
that goes from SuppTotl to DepressT. This is a semi-partial
relationship because PVLoss was in the model to begin with. The
indirect effect of SuppTotl is the path that runs from SuppTotl to
PVLoss and then runs from PVLoss to DepressT. The expected change
in DepressT for a one standard deviation increase in SuppTotl (all
other things equal) is the sum of the direct and indirect effects
as measured by the standardized coefficients. In this case it is
–.2361 + (.0837)(.4490) = –.1985. (Keep in mind that the indirect
path is far from significant, and it is discussed here just to
illustrate a point.)
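The direct-plus-indirect computation can be sketched in Python with the coefficients quoted above.

```python
# Total standardized effect of SuppTotl on DepressT:
# direct path plus the product of the indirect path coefficients.
direct = -0.2361                 # SuppTotl -> DepressT
indirect = 0.0837 * 0.4490       # SuppTotl -> PVLoss -> DepressT
total = direct + indirect
```

This reproduces the total effect of −.1985.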
15.29 Regression diagnostics:
Case # 104 has the largest value of Cook's D (.137) but not a
very large Studentized residual (t = –1.88). When we delete this
case the squared multiple correlation is increased slightly. More
importantly, the standard error of regression and the standard
error of one of the predictors (PVLoss) also decrease slightly.
This case is not sufficiently extreme to have a major impact on the
data.
15.30 Adding error to a predictor:
As we add error to a predictor we should expect to see that the
standard error of that predictor will increase and its significance
decrease. For my particular example the standard error actually
decreased slightly. The most noticeable results in this example
were a substantial decrease in the multiple correlation
coefficient, and a corresponding increase in the residual
variance.
15.31 Logistic regression using Harass.dat:
The dependent variable (Reporting) is the last variable in the
data set.
I cannot provide all possible models, so I am including just the
most complete. This is a less than optimal model, but it provides a
good starting point. This result was given by SPSS.
Block 1: Method = Enter

Omnibus Tests of Model Coefficients
         Chi-square   df   Sig.
Step       35.442      5   .000
Block      35.442      5   .000
Model      35.442      5   .000

Model Summary
Step   -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
 1         439.984                 .098                  .131

Classification Table(a)
                            Predicted REPORT
Observed              No     Yes    Percentage Correct
REPORT    No         111      63          63.8
          Yes         77      92          54.4
Overall Percentage                        59.2
a. The cut value is .500
Variables in the Equation
              B       S.E.     Wald     df   Sig.   Exp(B)
AGE         -.014     .013     1.126     1   .289     .986
MARSTAT     -.072     .234      .095     1   .757     .930
FEMIDEOL     .007     .015      .228     1   .633    1.007
FREQBEH     -.046     .153      .093     1   .761     .955
OFFENSIV     .488     .095    26.431     1   .000    1.629
Constant   -1.732    1.430     1.467     1   .226     .177
a. Variable(s) entered on step 1: AGE, MARSTAT, FEMIDEOL, FREQBEH, OFFENSIV.
From this set of predictors we see that the overall LR chi-square = 35.44,
which is significant on 5 df (p < .0001). The only predictor that
contributes significantly is the Offensiveness of the behavior, which has a
Wald statistic of 26.43. Exponentiating the (nonsignificant) coefficient for
the frequency of the behavior yields 0.955, which would suggest that as the
frequency of the behavior increases, the likelihood of reporting decreases.
That’s an odd result, but remember that we have all variables in the model.
If we simply predict reporting by using Offensiveness, exp(B) = 1.65, which
means that a 1-point increase in Offensiveness multiplies the odds of
reporting by 1.65. Obviously we have some work to do to make sense of these
data. I leave that to you.
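The Exp(B) column is nothing more than e raised to the corresponding B, so the tabled values can be checked directly:

```python
# Exp(B) in the SPSS output is just e**B: the odds multiplier for a
# one-unit increase in the predictor, holding the others constant.
import math

exp_offensiv = math.exp(0.488)    # OFFENSIV
exp_freqbeh  = math.exp(-0.046)   # FREQBEH
exp_age      = math.exp(-0.014)   # AGE
print(round(exp_offensiv, 3), round(exp_freqbeh, 3), round(exp_age, 3))
# → 1.629 0.955 0.986, matching the table
```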
15.32Predicting Reporting from Marital Status:
I requested this because both variables are a dichotomy and it
is easy to see the odds ratios. If we set this up as a contingency
table using the CrossTabs procedure, we get the following from
SPSS:
MARSTAT * REPORT Crosstabulation

Count                      REPORT
                         No     Yes    Total
MARSTAT   Married        84      84     168
          Not Married    90      85     175
Total                   174     169     343
Chi-Square Tests

                                  Value    df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                                 (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square               .070(b)    1      .791
Continuity Correction(a)         .024       1      .876
Likelihood Ratio                 .070       1      .791
Fisher's Exact Test                                              .829         .438
Linear-by-Linear Association     .070       1      .792
N of Valid Cases                 343

a. Computed only for a 2x2 table
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 82.78.
For married women, the odds of reporting are 84/84 = 1.00.
For unmarried women, the odds of reporting are 85/90 = .944.
The odds ratio is 1.00/.944 = 1.059, which means that the odds of reporting
the offense are 1.059 times higher if you are married. Put the other way
around, the odds of reporting the offensive behavior are .944/1.00 = .944
times as high if you are unmarried. Since that ratio is less than 1.00,
being unmarried means that you are slightly less likely to report.
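The same numbers fall out directly from the cell counts in the crosstabulation:

```python
# Odds of reporting within each marital-status row of the 2 x 2 table
# (counts: married 84 No / 84 Yes; unmarried 90 No / 85 Yes).
odds_married   = 84 / 84          # 1.000
odds_unmarried = 85 / 90          # 0.944
odds_ratio = odds_married / odds_unmarried
print(round(odds_married, 3), round(odds_unmarried, 3), round(odds_ratio, 3))
# → 1.0 0.944 1.059
```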
If we run the logistic regression, we obtain:
Classification Table(a)

                                 Predicted
                              REPORT          Percentage
Observed                     No     Yes        Correct
Step 1   REPORT      No      90      84          51.7
                     Yes     85      84          49.7
         Overall Percentage                      50.7

a. The cut value is .500
Variables in the Equation

Step 1(a)      B      S.E.    Wald    df    Sig.    Exp(B)
  MARSTAT    -.057    .216    .070     1    .791      .944
  Constant    .057    .344    .028     1    .868     1.059

a. Variable(s) entered on step 1: MARSTAT.
Notice that the exp(B) = .9444, which is exactly what you
obtained above for the odds ratio given that you are unmarried.
That was the point of the exercise. Notice also how the
classification table here matches with the one in the CrossTabs
procedure, except that the rows and columns are now labeled
“Observed” and “Predicted.”
15.33It may well be that the frequency of the behavior is tied in with its
offensiveness, which in turn is related to the likelihood of reporting. In
fact, the correlation between those two variables is .20, which is
significant (p < .001). (I think my explanation would be more convincing if
Frequency were a significant predictor when used on its own.)
15.34Malcarne’s data on distress in cancer patients,
a.Predicting Distress2 from Distress1 and BlamPer
Both predictors play a significant role in Distress2
b.We need to include Distress1 in our prediction because it is
very reasonable to assume that initial distress would relate to
later distress, and we want to control for that effect when looking
at the effect of BlamPer.
15.35BlamPer and BlamBeh are correlated at a moderate level (r =
.52), and once we condition on BlamPer by including it in the
equation, there is little left for BlamBeh to explain.
15.36I want students to think about what it means when we speak
of “capitalizing on chance.” They also should think about the fact
that stepwise regression is a very atheoretical way of going about
things, and perhaps theory should take more of a role.
15.37Make up an example.
15.38They should see no change in the interaction term when they
center the data, but they should see important differences in the
“main effects” themselves. Have them look at the matrix of
intercorrelations of the predictors.
15.39This should cause them to pause. It is impossible to change
one of the variables without changing the interaction in which that
variable plays a role. In other words, I can’t think of a sensible
interpretation of “holding all other variables constant” in this
situation.
15.40Testing mediation with Jose’s data.
With a = .478 (sa = .022) and b = .321 (sb = .017):

s²ab = b²s²a + a²s²b - s²a s²b = 0.00015

sab = √0.00015 = 0.0123

z = ab/sab = (.478)(.321)/.0123 = 12.47
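The test above packages naturally as a small function. This sketch assumes the variance expression shown (the variant of the Sobel test that subtracts the product of the squared standard errors); the numeric inputs in the check are made-up values, not the coefficients from Jose’s data.

```python
# z test for a mediated (a*b) path. The variance of the product a*b is
# approximated by b^2*s_a^2 + a^2*s_b^2 - s_a^2*s_b^2.
import math

def mediation_z(a, s_a, b, s_b):
    var_ab = b**2 * s_a**2 + a**2 * s_b**2 - s_a**2 * s_b**2
    return (a * b) / math.sqrt(var_ab)

# Made-up coefficients for a quick check (not the Exercise 15.40 values):
z = mediation_z(a=0.5, s_a=0.1, b=0.4, s_b=0.1)
print(round(z, 3))   # → 3.162
```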
15.41Analysis of results from Feinberg and Willer (2011).
The following comes from using the program by Preacher and
Leonardelli referred to in the chapter. I calculated the t values
from the regression coefficients and their standard errors and then
inserted those t values in the program. You can see that the
mediated path is statistically significant regardless of which
standard error you use for that path.
15.42Guber’s results:
This is a computer analysis question and there is no fixed set
of answers.
Chapter 16 - Analyses of Variance and Covariance as General
Linear Models
16.1Eye fixations per line of text for poor, average, and good
readers:
a.Design matrix, using only the first subject in each group:
      [  1    0 ]
X  =  [  0    1 ]
      [ -1   -1 ]
b.Computer exercise:
R² = .608     SSreg = 57.7333     SSresidual = 37.2000
c.Analysis of variance:
X̄1 = 8.2000    X̄2 = 5.6    X̄3 = 3.4    X̄. = 5.733
n1 = 5    n2 = 5    n3 = 5    N = 15    ΣX = 86    ΣX² = 588
SStotal = ΣX² - (ΣX)²/N = 588 - (86)²/15 = 94.933

SSgroup = nΣ(X̄j - X̄.)² = 5[(8.2000 - 5.733)² + (5.6 - 5.733)² + (3.4 - 5.733)²] = 57.733

SSerror = SStotal - SSgroup = 94.933 - 57.733 = 37.200
Source    df      SS        MS        F
Group      2    57.733    28.867    9.312*
Error     12    37.200     3.100
Total     14    94.933
* p < .05   [F.05(2,12) = 3.89]
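The sums of squares here depend only on the summary statistics, so the hand calculation can be checked directly:

```python
# Exercise 16.1 ANOVA from summary statistics alone
# (group means, common n, sum of X, sum of X squared).
means = [8.2, 5.6, 3.4]
n, N = 5, 15
sum_x, sum_x2 = 86, 588

grand = sum_x / N                                  # 5.733
ss_total = sum_x2 - sum_x**2 / N                   # 94.933
ss_group = n * sum((m - grand)**2 for m in means)  # 57.733
ss_error = ss_total - ss_group                     # 37.200
F = (ss_group / 2) / (ss_error / 12)
print(round(ss_group, 3), round(ss_error, 3), round(F, 3))
# → 57.733 37.2 9.312
```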
16.2Continuing with the data in Exercise 16.1:
a.Treatment effects:
X̄1 = 8.2    X̄2 = 5.6    X̄3 = 3.4    X̄. = 5.733

α̂1 = X̄1 - X̄. = 8.2 - 5.733 =  2.467 = b1
α̂2 = X̄2 - X̄. = 5.6 - 5.733 = -0.133 = b2
b.
η² = R² = SStreatment/SStotal = 57.733/94.933 = .608
16.3Data from Exercise 16.1, modified to make unequal ns:
R² = .624     SSreg = 79.0095     SSresidual = 47.6571

Analysis of variance:
X̄1 = 8.2000    X̄2 = 5.8571    X̄3 = 3.3333
X̄. = 5.7968 (the unweighted mean of the group means; the weighted grand mean is 112/21 = 5.3333)
n1 = 5    n2 = 7    n3 = 9    N = 21    ΣX = 112    ΣX² = 724
SStotal = ΣX² - (ΣX)²/N = 724 - (112)²/21 = 126.6666

SSgroup = Σnj(X̄j - X̄.)² = 5(8.2000 - 5.3333)² + 7(5.8571 - 5.3333)² + 9(3.3333 - 5.3333)² = 79.0095

SSerror = SStotal - SSgroup = 126.6666 - 79.0095 = 47.6571
Source    df       SS         MS        F
Group      2     79.0095    39.5048   14.92*
Error     18     47.6571     2.6476
Total     20    126.6666
* p < .05   [F.05(2,18) = 3.55]
16.4Continuing with the data in Exercise 16.3:
a.Treatment effects:
X̄1 = 8.2    X̄2 = 5.8571    X̄3 = 3.3333    X̄. = 5.7968 (unweighted)

α̂1 = X̄1 - X̄. = 8.2 - 5.7968 = 2.4032 = b1
α̂2 = X̄2 - X̄. = 5.8571 - 5.7968 = 0.0603 = b2
μ̂ = X̄. = 5.7968 = b0

b.
η² = R² = SStreatment/SStotal = 79.0095/126.6666 = .624
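Note that with unequal ns the sums of squares are computed around the weighted grand mean (112/21 = 5.3333), while the treatment effects above use the unweighted mean of the group means (5.7968). A quick check of SSgroup and η²:

```python
# Exercises 16.3/16.4 with unequal ns: SS_group uses the weighted grand mean.
means = [8.2, 5.8571, 3.3333]
ns = [5, 7, 9]
sum_x, sum_x2, N = 112, 724, 21

grand = sum_x / N                                   # 5.3333 (weighted)
ss_total = sum_x2 - sum_x**2 / N                    # 126.6666
ss_group = sum(n * (m - grand)**2 for n, m in zip(ns, means))
eta_sq = ss_group / ss_total
print(round(ss_group, 2), round(eta_sq, 3))
# → 79.01 0.624
```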
16.5Relationship between Gender, SES, and Locus of Control:
a.Analysis of Variance:
                            SES
               Low     Average     High      Mean
Gender Male    12.25     14.25     17.25    14.583
       Female   8.25     12.25     16.25    12.250
Mean           10.25     13.25     16.75    13.417

ΣX = 644    ΣX² = 9418    n = 8    N = 48
SStotal  = ΣX² - (ΣX)²/N = 9418 - (644)²/48 = 777.6667

SSgender = sn Σ(X̄i. - X̄..)² = (3)(8)[(14.583 - 13.417)² + (12.250 - 13.417)²] = 65.3333

SSSES    = gn Σ(X̄.j - X̄..)² = (2)(8)[(10.25 - 13.417)² + (13.25 - 13.417)² + (16.75 - 13.417)²] = 338.6667

SScells  = n Σ(X̄ij - X̄..)² = 8[(12.25 - 13.417)² + ... + (16.25 - 13.417)²] = 422.6667

SSGS     = SScells - SSgender - SSSES = 422.6667 - 65.3333 - 338.6667 = 18.6667

SSerror  = SStotal - SScells = 777.6667 - 422.6667 = 355.0000
Source    df      SS        MS         F
Gender     1    65.333    65.333     7.730*
SES        2   338.667   169.333    20.034*
G x S      2    18.667     9.333     1.104
Error     42   355.000     8.452
Total     47   777.667
* p < .05   [F.05(1,42) = 4.08; F.05(2,42) = 3.23]
b.ANOVA summary table constructed from sums of squares
calculated from design matrix:
SSGender = SSreg(α,β,αβ) - SSreg(β,αβ) = 422.6667 - 357.3333 =  65.3333
SSSES    = SSreg(α,β,αβ) - SSreg(α,αβ) = 422.6667 -  84.0000 = 338.6667
SSGxS    = SSreg(α,β,αβ) - SSreg(α,β)  = 422.6667 - 404.0000 =  18.6667
SStotal  = SSY = 777.6667
The summary table is exactly the same as in part a (above).
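The model comparisons in part b can be verified numerically. Because the predictors are constant within cells, fitting the models to the cell means replicated n = 8 times gives the same regression sums of squares as fitting to the raw data; the effect coding below is one conventional choice, not necessarily the matrix used above.

```python
# Reproducing the part-b model comparisons with an effect-coded design matrix.
import numpy as np

cell_means = {(0, 0): 12.25, (0, 1): 14.25, (0, 2): 17.25,
              (1, 0):  8.25, (1, 1): 12.25, (1, 2): 16.25}
n = 8
rows, y = [], []
for (g, s), m in cell_means.items():
    a = 1 if g == 0 else -1            # Gender effect code
    b1 = [1, 0, -1][s]                 # SES effect codes
    b2 = [0, 1, -1][s]
    rows += [[a, b1, b2, a * b1, a * b2]] * n
    y += [m] * n
X, y = np.array(rows, float), np.array(y)

def ss_reg(cols):
    # SS(regression) for a model using the listed columns plus an intercept
    Xm = np.column_stack([np.ones(len(y)), X[:, cols]])
    yhat = Xm @ np.linalg.lstsq(Xm, y, rcond=None)[0]
    return ((yhat - y.mean()) ** 2).sum()

full = ss_reg([0, 1, 2, 3, 4])                 # 422.667
ss_gender = full - ss_reg([1, 2, 3, 4])        #  65.333
ss_ses    = full - ss_reg([0, 3, 4])           # 338.667
ss_gs     = full - ss_reg([0, 1, 2])           #  18.667
print(round(ss_gender, 3), round(ss_ses, 3), round(ss_gs, 3))
```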
16.6
SSSES = SSreg(β)
a.This is because we have equal ns, and therefore the variables are
orthogonal—that is, they do not account for overlapping portions of the
variance.
b.This will not be true with unequal ns.
16.7The data from Exercise 16.5 modified to make unequal ns:
SSerror  = SSY - SSreg(α,β,αβ) = 750.1951 - 458.7285 = 291.467
SSGender = SSreg(α,β,αβ) - SSreg(β,αβ) = 458.7285 - 398.7135 =  60.015
SSSES    = SSreg(α,β,αβ) - SSreg(α,αβ) = 458.7285 - 112.3392 = 346.389
SSGxS    = SSreg(α,β,αβ) - SSreg(α,β)  = 458.7285 - 437.6338 =  21.095
Source    df      SS        MS        F
Gender     1    60.015    60.015    7.21*
SES        2   346.389   173.195   20.80*
G x S      2    21.095    10.547    1.27
Error     35   291.467     8.328
Total     40
* p < .05   [F.05(1,35) = 4.12; F.05(2,35) = 3.27]
16.8
SSSES ≠ SSreg(β):   346.389 ≠ 379.3325
The two values are not the same because (as pointed out in Exercise 16.6)
they will not agree when there are unequal ns. In Exercise 16.7 some of the
variation accounted for by SES was shared with Gender and the interaction,
and thus was not included in SSSES.
16.9Model from data in Exercise 16.5:
Ŷ = 1.1667A1 - 3.1667B1 - 0.1667B2 + 0.8333AB11 - 0.1667AB12 + 13.4167

Means:
                           SES (B)
                 Low      Avg     High
Gender (A)
  Male          12.25    14.25    17.25    14.583
  Female         8.25    12.25    16.25    12.250
μ̂     = X̄..  = 13.4167 = b0 = intercept
α̂1    = X̄A1 - X̄.. = 14.583 - 13.4167 =  1.1667 = b1
β̂1    = X̄B1 - X̄.. = 10.25 - 13.4167 = -3.1667 = b2
β̂2    = X̄B2 - X̄.. = 13.25 - 13.4167 = -0.1667 = b3
(αβ)11 = X̄AB11 - X̄A1 - X̄B1 + X̄.. = 12.25 - 14.583 - 10.25 + 13.4167 =  0.8333 = b4
(αβ)12 = X̄AB12 - X̄A1 - X̄B2 + X̄.. = 14.25 - 14.583 - 13.25 + 13.4167 = -0.1667 = b5
16.10Model from data in Exercise 16.7:
Ŷ = 1.2306A1 - 3.7167B1 + 0.3500B2 + 0.4778AB11 + 0.5444AB12 + 13.6750

Means:
                            SES (B)
                 Low       Avg      High     Weighted   Unweighted
Gender (A)                                    Means       Means
  Male          11.667    15.800   17.250     15.105      14.906
  Female         8.250    12.250   16.833     12.045      12.444
Weighted:        9.714    13.615   17.070     13.463
Unweighted:      9.958    14.025   17.043                 13.675

With unequal ns we need to use the unweighted means in order to reproduce
the values found by the regression model.
μ̂     = X̄..  = 13.675 = b0 = intercept
α̂1    = X̄A1 - X̄.. = 14.906 - 13.675 =  1.231 = b1
β̂1    = X̄B1 - X̄.. =  9.958 - 13.675 = -3.717 = b2
β̂2    = X̄B2 - X̄.. = 14.025 - 13.675 =  0.350 = b3
(αβ)11 = X̄AB11 - X̄A1 - X̄B1 + X̄.. = 11.667 - 14.906 -  9.958 + 13.675 = 0.478 = b4
(αβ)12 = X̄AB12 - X̄A1 - X̄B2 + X̄.. = 15.800 - 14.906 - 14.025 + 13.675 = 0.544 = b5

(All of these means are the unweighted cell and marginal means.)
16.11Does Method III really deal with unweighted means?
Means:
               B1      B2     weighted    unweighted
A1             4       10       8.5          7.0
A2            10        4       8.0          7.0
weighted       8.0      8.5     8.29
unweighted     7.0      7.0     7.0
The full model produced by Method 1:

Ŷ = 0.0A1 + 0.0B1 - 3.0AB11 + 7.0
Effects calculated on weighted means:
μ̂     = X̄.. = 8.29 ≠ intercept (b0)
α̂1    = X̄A1 - X̄.. = 8.50 - 8.29 =  0.21 ≠ b1
β̂1    = X̄B1 - X̄.. = 8.00 - 8.29 = -0.29 ≠ b2
(αβ)11 = X̄AB11 - X̄A1 - X̄B1 + X̄.. = 4.00 - 8.50 - 8.00 + 8.29 = -4.21 ≠ b3
Effects calculated on unweighted means:
μ̂     = X̄.. = 7.00 = intercept (b0)
α̂1    = X̄A1 - X̄.. = 7.00 - 7.00 = 0.00 = b1
β̂1    = X̄B1 - X̄.. = 7.00 - 7.00 = 0.00 = b2
(αβ)11 = X̄AB11 - X̄A1 - X̄B1 + X̄.. = 4.00 - 7.00 - 7.00 + 7.00 = -3.00 = b3
The coefficients found by the model clearly reflect the effects computed on
the unweighted means. Alternatively, carrying out the complete analysis
leads to SSA = SSB = 0.00, again reflecting the equality of the unweighted
means.
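The contrast is easy to recompute. The exercise does not print the cell sizes, so the ns below (1, 3, 2, 1) are my inference from the weighted means shown (e.g., (1·4 + 3·10)/4 = 8.5) and should be treated as an assumption.

```python
# Weighted vs. unweighted marginal means for the 2 x 2 table of Exercise 16.11.
means = [[4.0, 10.0], [10.0, 4.0]]
ns = [[1, 3], [2, 1]]            # inferred cell sizes, not given in the text

row_w = [(means[i][0]*ns[i][0] + means[i][1]*ns[i][1]) / (ns[i][0] + ns[i][1])
         for i in range(2)]                       # weighted row means
row_u = [(means[i][0] + means[i][1]) / 2 for i in range(2)]   # unweighted
grand_w = sum(means[i][j]*ns[i][j] for i in range(2) for j in range(2)) / 7
grand_u = sum(sum(r) for r in means) / 4

# interaction effect for cell (1,1), computed on unweighted means:
ab11 = means[0][0] - row_u[0] - (means[0][0] + means[1][0]) / 2 + grand_u
print(row_w, row_u, round(grand_w, 2), grand_u, ab11)
```

The weighted means (8.5, 8.0, grand 8.29) disagree with the model's coefficients, while the unweighted means (7.0 everywhere, interaction -3.0) reproduce them exactly, which is the point of the exercise.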
16.12Venn diagram representing the sums of squares in Exercise
16.5:
[Venn diagram omitted; it partitions SS(total).]
16.13Venn diagram representing the sums of squares in Exercise
16.7:
[Venn diagram omitted; it shows overlapping SES, Sex, and SxS regions within
SS(total), along with SS(error).]
16.14SAS printout for the data in Exercise 16.7:
The SAS System 13:05 Saturday, November 18, 2000
The GLM Procedure
Dependent Variable: dv
Source DF Type I SS Mean Square F Value Pr > F
A 1 95.4511028 95.4511028 11.46 0.0018
B 2 342.1827043 171.0913522 20.55 <.0001
A*B 2 21.0946481 10.5473241 1.27 0.2944
Source DF Type II SS Mean Square F Value Pr > F
A 1 58.3013226 58.3013226 7.00 0.0121
B 2 342.1827043 171.0913522 20.55 <.0001
A*B 2 21.0946481 10.5473241 1.27 0.2944
Source DF Type III SS Mean Square F Value Pr > F
A 1 60.0149847 60.0149847 7.21 0.0110
B 2 346.3892120 173.1946060 20.80 <.0001
A*B 2 21.0946481 10.5473241 1.27 0.2944
Source DF Type IV SS Mean Square F Value Pr > F
A 1 60.0149847 60.0149847 7.21 0.0110
B 2 346.3892120 173.1946060 20.80 <.0001
A*B 2 21.0946481 10.5473241 1.27 0.2944
16.15Energy consumption of families:
a.Design matrix, using only the first entry in each group for
illustration purposes:
          A1    A2   Cov          Y
       [   1     0    58 ]     [ 75 ]
       [   .     .     . ]     [  . ]
X  =   [   0     1    60 ]     [ 70 ]
       [   .     .     . ]     [  . ]
       [  -1    -1    75 ]     [ 80 ]
b.Analysis of covariance:
SSreg(α,cov,αc) = 2424.6202
SSreg(α,cov)    = 2369.2112
SSresidual      =  246.5221 = SSerror
There is not a significant decrement in SSreg and thus we can
continue to assume homogeneity of regression.
SSreg(α) = 1118.5333

SScov = SSreg(α,cov) - SSreg(α) = 2369.2112 - 1118.5333 = 1250.6779

SSreg(cov) = 1716.2884

SSA = SSreg(α,cov) - SSreg(cov) = 2369.2112 - 1716.2884 = 652.9228
Source       df       SS          MS         F
Covariate     1    1250.6779   1250.6779   55.81*
A (Group)     2     652.9228    326.4614   14.57*
Error        11     246.5221     22.4111
Total        14    2615.7333
* p < .05   [F.05(1,11) = 4.84; F.05(2,11) = 3.98]
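The hierarchical-model arithmetic behind this table is a few subtractions. The homogeneity-of-regression F below assumes the fuller model adds two slope terms (error df = 14 - 5 = 9); that df is my reconstruction, not stated in the printout.

```python
# Exercise 16.15: ANCOVA sums of squares by model comparison.
ss_total    = 2615.7333
ss_full     = 2424.6202    # SSreg(a, cov, a x cov): separate slopes
ss_a_cov    = 2369.2112    # SSreg(a, cov): common slope
ss_a        = 1118.5333    # SSreg(a)
ss_cov_only = 1716.2884    # SSreg(cov)

# Homogeneity of regression: does allowing separate slopes help? (assumed df)
F_homog = ((ss_full - ss_a_cov) / 2) / ((ss_total - ss_full) / 9)   # ~1.30, ns

ss_cov = ss_a_cov - ss_a           # covariate adjusted for groups: 1250.6779
ss_A   = ss_a_cov - ss_cov_only    # groups adjusted for covariate:  652.9228
ms_error = (ss_total - ss_a_cov) / 11
F_A = (ss_A / 2) / ms_error
print(round(F_homog, 2), round(ss_cov, 4), round(ss_A, 4), round(F_A, 2))
```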
16.16Exercise 16.15 expanded into a two-way analysis of
covariance.
a.Analysis of covariance:
First we will test for homogeneity of regression:
SSreg(α,β,αβ,cov, cov x treatment terms) = 4288.5572    (R² = .8931)
SSreg(α,β,αβ,cov)                        = 4283.9008    (R² = .8931)
SSresidual                               =  512.7992
There is a nonsignificant decrement in attributable variation
(in fact, R2 is still the same to four decimal places!), so we will
take our second model as our full model.
SSreg(α,αβ,cov) = 4111.2036
SSB  = SSreg(α,β,αβ,cov) - SSreg(α,αβ,cov) = 4283.9008 - 4111.2036 =  172.6972

SSreg(β,αβ,cov) = 3163.0287
SSA  = SSreg(α,β,αβ,cov) - SSreg(β,αβ,cov) = 4283.9008 - 3163.0287 = 1120.8721

SSreg(α,β,cov)  = 4275.6550
SSAB = SSreg(α,β,αβ,cov) - SSreg(α,β,cov)  = 4283.9008 - 4275.6550 =    8.2458

SSreg(α,β,αβ)   = 1979.1000
SScov = SSreg(α,β,αβ,cov) - SSreg(α,β,αβ)  = 4283.9008 - 1979.1000 = 2304.8008
Source       df       SS          MS          F
Covariate     1    2304.8008   2304.8008   103.37*
A (Group)     2    1120.8721    560.4361    25.14*
B (Meter)     1     172.6972    172.6972     7.75*
AB            2       8.2458      4.1229     <1
Error        23     512.7992     22.2956
Total        29    4796.7000
* p < .05   [F.05(1,23) = 4.28; F.05(2,23) = 3.42]
b.Conclusions: After adjusting for last year’s usage, there are
significant differences between the time-of-day groups in terms of
usage and between those who could check on current usage (the
‘metered’ group) and those who could not. In particular, Group 3
appears to use more electricity than the other two groups, and
those with meters use less than those without meters. (A more
precise statement would require the calculation of adjusted means
as in Exercise 16.17.) There is no interaction between the two
independent variables.
16.17Adjusted means for the data in Exercise 16.16:
(The order of the means may differ depending on how you code the
group membership and how the software sets up its design matrix.
But the numerical values should agree.)
Ŷ = -7.9099A1 + 0.8786A2 - 2.4022B1 + 0.5667AB11 + 0.1311AB12 + 0.7260C + 6.3740
Ŷ11 = -7.9099(1)  + 0.8786(0)  - 2.4022(1)  + 0.5667(1)  + 0.1311(0)  + 0.7260(61.3333) + 6.3740 = 41.1566

Ŷ12 = -7.9099(1)  + 0.8786(0)  - 2.4022(-1) + 0.5667(-1) + 0.1311(0)  + 0.7260(61.3333) + 6.3740 = 44.8276

Ŷ21 = -7.9099(0)  + 0.8786(1)  - 2.4022(1)  + 0.5667(0)  + 0.1311(1)  + 0.7260(61.3333) + 6.3740 = 49.5095

Ŷ22 = -7.9099(0)  + 0.8786(1)  - 2.4022(-1) + 0.5667(0)  + 0.1311(-1) + 0.7260(61.3333) + 6.3740 = 54.0517

Ŷ31 = -7.9099(-1) + 0.8786(-1) - 2.4022(1)  + 0.5667(-1) + 0.1311(-1) + 0.7260(61.3333) + 6.3740 = 54.8333

Ŷ32 = -7.9099(-1) + 0.8786(-1) - 2.4022(-1) + 0.5667(1)  + 0.1311(1)  + 0.7260(61.3333) + 6.3740 = 61.0333
(We enter 61.3333 for the covariate in each case, because we
want to estimate what the cell means would be if the observations
in those cells were always at the mean of the covariate.)
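These six evaluations are mechanical enough to script; the design codes below are simply read off from the substitutions above.

```python
# Exercise 16.17: adjusted cell means = the regression model evaluated at the
# covariate mean (61.3333) with each cell's effect codes.
coef = [-7.9099, 0.8786, -2.4022, 0.5667, 0.1311, 0.7260]  # A1 A2 B1 AB11 AB12 C
intercept = 6.3740
codes = {                       # (A1, A2, B1, AB11, AB12) for each cell
    (1, 1): (1, 0, 1, 1, 0),
    (1, 2): (1, 0, -1, -1, 0),
    (2, 1): (0, 1, 1, 0, 1),
    (2, 2): (0, 1, -1, 0, -1),
    (3, 1): (-1, -1, 1, -1, -1),
    (3, 2): (-1, -1, -1, 1, 1),
}
adj = {cell: round(sum(b * x for b, x in zip(coef, list(c) + [61.3333]))
                   + intercept, 4)
       for cell, c in codes.items()}
print(adj)   # 41.1566, 44.8276, 49.5095, 54.0517, 54.8333, 61.0333
```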
16.18Analysis of difference scores for data in Exercise
16.16:
Source    df       SS          MS        F
Group      2    1086.4667   543.2333   15.50*
Meter      1     197.6333   197.6333    5.64*
G x M      2       6.0667     3.0333    <1
Error     24     841.2000    35.0500
Total     29    2131.3667

* p < .05
16.19Klemchuk, Bond, & Howell (1990)
16.20Analysis of variance on Epinuneq.dat:
Model Includes           SSregression    SSresidual
Dose, Interval, DxI        162.39512      226.7619
Dose, Interval             150.78858
Dose, DxI                  159.44821
Interval, DxI               45.29395

Source       df       SS         MS        F
Dose          2    117.1112    58.5556   28.92*
Interval      2      2.9469     1.4734    0.73
DxI           4     11.6065     2.9016    1.43
Error       112    226.7619     2.0247
Total       120
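Each effect's sum of squares is the drop in SSregression when that effect is removed from the full model (the printed SSDose differs from the raw difference by about .01, presumably rounding in the table):

```python
# Exercise 16.20: Type III sums of squares by model comparison.
ss_full = 162.39512
ss_no_dose, ss_no_interval, ss_no_dxi = 45.29395, 159.44821, 150.78858
ss_error, df_error = 226.7619, 112

ss_dose     = ss_full - ss_no_dose        # ~117.10
ss_interval = ss_full - ss_no_interval    #    2.9469
ss_dxi      = ss_full - ss_no_dxi         #   11.6065
ms_error = ss_error / df_error            #    2.0247
F_dose = (ss_dose / 2) / ms_error
print(round(ss_interval, 4), round(ss_dxi, 4), round(F_dose, 2))
```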
16.21Analysis of GSIT in Mireault.dat:
Tests of Between-Subjects Effects
Dependent Variable: GSIT

Source             Type III SS      df    Mean Square         F        Sig.
Corrected Model       1216.924(a)    5        243.385       2.923      .013
Intercept          1094707.516      1    1094707.516   13146.193      .000
GENDER                 652.727      1        652.727       7.839      .005
GROUP                   98.343      2         49.172        .590      .555
GENDER * GROUP         419.722      2        209.861       2.520      .082
Error                30727.305    369         83.272
Total              1475553.000    375
Corrected Total      31944.229    374

a. R Squared = .038 (Adjusted R Squared = .025)

Estimated Marginal Means
GENDER * GROUP
Dependent Variable: GSIT

                                           95% Confidence Interval
GENDER    GROUP    Mean    Std. Error    Lower Bound    Upper Bound
Male        1     62.367     1.304          59.804         64.931
            2     64.676     1.107          62.500         66.853
            3     63.826     1.903          60.084         67.568
Female      1     62.535      .984          60.600         64.470
            2     60.708      .858          59.020         62.396
            3     58.528     1.521          55.537         61.518
16.22Analysis of covariance for Mireault’s data:
Tests of Between-Subjects Effects
Dependent Variable: GSIT

Source             Type III SS     df    Mean Square        F       Sig.
Corrected Model       1441.155(a)   6        240.192      2.844     .010
Intercept           178328.902     1     178328.902   2111.844     .000
YEARCOLL               496.225     1        496.225      5.877     .016
GENDER                 303.986     1        303.986      3.600     .059
GROUP                  154.540     2         77.270       .915     .402
GENDER * GROUP         325.689     2        162.845      1.928     .147
Error                24319.374   288         84.442
Total              1163091.000   295
Corrected Total      25760.529   294

a. R Squared = .056 (Adjusted R Squared = .036)

Estimated Marginal Means
GENDER * GROUP
Dependent Variable: GSIT

                                             95% Confidence Interval
GENDER    GROUP    Mean(a)    Std. Error    Lower Bound    Upper Bound
Male        1     61.879        1.449          59.026         64.731
            2     64.467        1.169          62.165         66.768
            3     62.841        2.108          58.692         66.991
Female      1     62.699        1.162          60.412         64.986
            2     61.077        1.037          59.036         63.118
            3     58.513        1.650          55.265         61.762

a. Evaluated at covariates appearing in the model: YEARCOLL = 2.68.
16.23Analysis of variance on the covariate from Exercise
16.22.
The following is abbreviated SAS output.
General Linear Models Procedure
Dependent Variable: YEARCOLL

                            Sum of         Mean
Source            DF       Squares        Square    F Value   Pr > F
Model              5    13.3477645     2.6695529       2.15   0.0600
Error            292   363.0012288     1.2431549
Corrected Total  297   376.3489933

R-Square       C.V.      Root MSE    YEARCOLL Mean
0.035466    41.53258      1.11497        2.6845638

Source            DF    Type III SS    Mean Square   F Value   Pr > F
GENDER             1     5.95006299     5.95006299      4.79   0.0295
GROUP              2     0.78070431     0.39035216      0.31   0.7308
GENDER*GROUP       2     2.96272310     1.48136155      1.19   0.3052

GENDER    GROUP    YEARCOLL LSMEAN
  1         1         2.27906977
  1         2         2.53225806
  1         3         2.68421053
  2         1         2.88888889
  2         2         2.85000000
  2         3         2.70967742
These data reveal a significant difference between males and
females in terms of YearColl. Females are slightly ahead of males.
If the first year of college is in fact more stressful than later
years, this could account for some of the difference we found in
Exercise 16.21.
16.24Analysis of Everitt’s data:
a.Analysis on Gain scores:
Dependent Variable: Gain
Sum of
Source DF Squares Mean Square F Value Pr > F
Model 2 614.643667 307.321833 5.42 0.0065
Error 69 3910.742444 56.677427
Corrected Total 71 4525.386111
R-Square Coeff Var Root MSE Gain Mean
0.135821 272.3858 7.528441 2.763889
Source DF Type III SS Mean Square F Value Pr > F
Group 2 614.6436669 307.3218334 5.42 0.0065
b.Analysis of Post scores ignoring Pre scores:
Dependent Variable: Post
Sum of
Source DF Squares Mean Square F Value Pr > F
Model 2 918.986916 459.493458 8.65 0.0004
Error 69 3665.057528 53.116776
Corrected Total 71 4584.044444
R-Square Coeff Var Root MSE Post Mean
0.200475 8.556928 7.288126 85.17222
Source DF Type III SS Mean Square F Value Pr > F
Group 2 918.9869160 459.4934580 8.65 0.0004
c.Analysis of Post scores with Pre score as covariate
Dependent Variable: Post
Sum of
Source DF Squares Mean Square F Value Pr > F
Model 3 1272.781825 424.260608 8.71 <.0001
Error 68 3311.262620 48.695039
Corrected Total 71 4584.044444
R-Square Coeff Var Root MSE Post Mean
0.277655 8.193027 6.978183 85.17222
Source DF Type III SS Mean Square F Value Pr > F
Group 2 766.2728128 383.1364064 7.87 0.0008
Pre 1 353.7949086 353.7949086 7.27 0.0089
d.The analysis of gain scores and the analysis of covariance ask
similar questions, but they would only agree if the relationship
between Pre and Post had a slope of 1. In general the analysis of
covariance will be better. The analysis of Post scores is
confounded, because we can’t discriminate between effects of the
treatments and any pre-existing group differences. In addition, the
analysis of covariance would generally be better because it would
adjust the error term.
e.Because there is variability in the pretest scores that has little to do
with weight gain, I would be tempted to remove that from SStotal to get a
more appropriate denominator for η². If I did that, η² would be
766.27/(4584.04 - 353.795) = 766.27/4230.245 = .18, which strikes me as a
fair amount of explained variance given all of the factors that influence
weight. (A case might well be made for leaving in the variation due to
pretest scores, for the same reasons given when talking about the
matched-sample t.)
f.When we leave out the Control group we can run a standard analysis of
covariance between the remaining groups adjusted for the pretest. This gives
us adjusted means of 85.70 and 91.198 for the CogBehav and Family groups,
respectively. The standard deviations of the groups are close, and a
weighted average of the variances is 72.63, which gives a standard deviation
of 8.5225. Then, using adjusted means,

d = (X̄1 - X̄2)/s = (85.70 - 91.198)/8.5225 = -0.625

Thus, after we adjust for pretest weights, the Family Therapy group gained
about two thirds of a standard deviation more than the Cognitive Behavior
Therapy group.
16.25Everitt compared two therapy groups and a control group in the
treatment of anorexia. The groups differed significantly in posttest weight
when controlling for pretest weight (F = 8.71, p < .0001), with the Control
group weighing the least at posttest. When we examine the difference between
just the two treatment groups at posttest, the F does not reach
significance, F = 3.745, p = .060, though the effect size for the difference
between means (again controlling for pretest weights) is 0.62, with the
Family Therapy group weighing about six pounds more than the
Cognitive/Behavior Therapy group. It is difficult to k