IS 4800 Empirical Research Methods for Information Science Class Notes March 16, 2012 Instructor: Prof. Carole Hafner, 446 WVH [email protected] Tel: 617-373-5116 Course Web site: www.ccs.neu.edu/course/is4800sp12/
Page 1: IS 4800 Empirical Research Methods  for Information Science Class Notes March 16, 2012

IS 4800 Empirical Research Methods for Information Science

Class Notes March 16, 2012

Instructor: Prof. Carole Hafner, 446 WVH [email protected] Tel: 617-373-5116

Course Web site: www.ccs.neu.edu/course/is4800sp12/

Page 2

Outline

• Sampling and statistics (cont.)

• T test for paired samples

• T test for independent means

• Analysis of Variance

• Two-way Analysis of Variance

Page 3

Relationship Between Population and Samples When a Treatment Had No Effect

[Figure: a single Population from which Sample 1 (mean M1) and Sample 2 (mean M2) are both drawn.]

Page 4

Relationship Between Population and Samples When a Treatment Had An Effect

[Figure: a control group population (mean $\mu_c$) with its control group sample (mean $M_c$), and a separate treatment group population (mean $\mu_t$) with its treatment group sample (mean $M_t$).]

Page 5

Population: Mean? Variance? ($\mu$, $\sigma^2$)

Sampling

Sample of size N: $M = \frac{\sum X}{N}$, $SD^2 = \frac{\sum (X - M)^2}{N}$

Mean values from all possible samples of size N, aka “distribution of means”: $\sigma_M^2 = \frac{\sigma^2}{N}$

$Z_M = \frac{M - \mu_M}{\sigma_M}$
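The idea of a "distribution of means" can be checked by brute force on a tiny population. The sketch below uses a made-up population of four scores and enumerates every possible sample of size N = 2 (drawn with replacement, which is an assumption of this illustration). The mean of the sample means should equal the population mean, and their variance should equal $\sigma^2 / N$.

```python
from itertools import product
from statistics import mean, pvariance

population = [2.0, 4.0, 6.0, 8.0]   # hypothetical scores, not from the course
n = 2                               # sample size N

mu = mean(population)               # population mean
sigma2 = pvariance(population)      # population variance (divide by N)

# Every possible sample of size n, drawn with replacement.
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mu_m = mean(sample_means)           # mean of the distribution of means
sigma2_m = pvariance(sample_means)  # variance of the distribution of means

print("population mean:", mu, "mean of sample means:", mu_m)
print("sigma^2 / N:", sigma2 / n, "variance of sample means:", sigma2_m)
```

With larger populations full enumeration is impractical, which is why the formula $\sigma_M^2 = \sigma^2 / N$ is used instead.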

Page 6

Z tests and t-tests

t is like Z:

$Z = \frac{M - \mu}{\sigma_M}$

$t = \frac{M - \mu}{S_M}$, with $\mu = 0$ for paired samples

We use a stricter criterion (t) instead of Z because $S_M$ is based on an estimate of the population variance while $\sigma_M$ is based on a known population variance.

$S^2 = \frac{\sum (X - M)^2}{N - 1} = \frac{SS}{N - 1}$

$S_M^2 = \frac{S^2}{N}$

Page 7

T-test with paired samples

• Given info about the population of change scores and the sample size we will be using (N), we can compute the distribution of means
• Now, given a particular sample of change scores of size N, we compute its mean
• and finally determine the probability that this mean occurred by chance

Population of change scores: $\mu = 0$; variance unknown (?), so $S^2$ = est. $\sigma^2$ from sample = $SS/df$, with $df = N - 1$

Distribution of means: $S_M^2 = S^2/N$

$t = \frac{M}{S_M}$
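The paired-samples steps can be sketched in a few lines of Python. The before/after scores are invented for illustration, and `paired_t` is a hypothetical helper name; the cutoff 2.776 is the standard two-tailed t-table value for df = 4 at the .05 level.

```python
from math import sqrt

def paired_t(before, after):
    """Return (t, df) for paired samples, working on change scores."""
    changes = [a - b for a, b in zip(after, before)]
    n = len(changes)
    m = sum(changes) / n                      # M: mean change score
    ss = sum((x - m) ** 2 for x in changes)   # SS: sum of squared deviations
    df = n - 1
    s2 = ss / df                              # S^2: estimated population variance
    s_m = sqrt(s2 / n)                        # S_M: SD of the distribution of means
    t = (m - 0) / s_m                         # null hypothesis: mu = 0
    return t, df

# Hypothetical scores for five participants.
before = [10, 12, 9, 11, 13]
after = [12, 15, 9, 14, 14]

t, df = paired_t(before, after)
cutoff = 2.776                                # t table: df = 4, two-tailed, alpha = .05
print(f"t({df}) = {t:.3f}, beyond cutoff: {abs(t) > cutoff}")
```

The only difference from a Z test is that the variance is estimated from the sample (dividing SS by N − 1), so the cutoff comes from the t table for the matching df.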

Page 8

t test for independent samples

Given two samples

Estimate population variances (assume same)

Estimate variances of distributions of means

Estimate variance of differences between means (mean = 0)

This is now your comparison distribution

Page 9

Estimating the Population Variance

$S^2$ is an estimate of $\sigma^2$

$S^2 = SS/(N-1)$ for one sample (take sq root for S)

For two independent samples – “pooled estimate”: $S^2_{\text{Pooled}} = \frac{df_1}{df_{\text{Total}}} S_1^2 + \frac{df_2}{df_{\text{Total}}} S_2^2$

$df_{\text{Total}} = df_1 + df_2 = (N_1 - 1) + (N_2 - 1)$

From this calculate variance of sample means: $S_M^2 = S^2/N$

needed to compute t statistic

$S^2_{\text{Difference}} = S^2_{\text{Pooled}} / N_1 + S^2_{\text{Pooled}} / N_2$
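As a rough illustration of the pooled estimate, the sketch below weights each sample's variance estimate by its share of the total degrees of freedom and then derives the variance of the distribution of differences. The two samples are made up, and the function names are hypothetical.

```python
from statistics import variance

def pooled_variance(sample1, sample2):
    """Pooled estimate of the population variance from two samples."""
    df1, df2 = len(sample1) - 1, len(sample2) - 1
    df_total = df1 + df2
    s2_1 = variance(sample1)   # SS / (N - 1) for sample 1
    s2_2 = variance(sample2)   # SS / (N - 1) for sample 2
    return (df1 / df_total) * s2_1 + (df2 / df_total) * s2_2

def variance_of_difference(sample1, sample2):
    """Variance of the distribution of differences between means."""
    s2_pooled = pooled_variance(sample1, sample2)
    return s2_pooled / len(sample1) + s2_pooled / len(sample2)

# Hypothetical data with unequal sample sizes.
group1 = [4.0, 6.0, 8.0]
group2 = [1.0, 2.0, 3.0, 4.0, 5.0]

s2_pooled = pooled_variance(group1, group2)
s2_difference = variance_of_difference(group1, group2)
print(s2_pooled, s2_difference)
```

Because the larger sample has more degrees of freedom, its variance estimate counts for more in the pooled value.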

Page 10

t test for independent samples, continued

Distribution of differences between means

This is your comparison distribution. NOT normal, is a ‘t’ distribution

Shape changes depending on df

df = (N1 – 1) + (N2 – 1)

Compute $t = (M_1 - M_2)/S_{\text{Difference}}$

Determine if beyond cutoff score for test parameters (df, sig, tails) from lookup table.
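Putting the pieces together, a minimal sketch of the whole independent-means test might look like this. The data are invented, `independent_t` is a hypothetical helper, and the cutoff 2.447 is the standard two-tailed t-table value for df = 6 at the .05 level.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(sample1, sample2):
    """Return (t, df) for two independent samples, assuming equal population variances."""
    n1, n2 = len(sample1), len(sample2)
    df1, df2 = n1 - 1, n2 - 1
    df = df1 + df2
    # Pooled variance: each sample's estimate weighted by its degrees of freedom.
    s2_pooled = (df1 * variance(sample1) + df2 * variance(sample2)) / df
    # Standard deviation of the distribution of differences between means.
    s_difference = sqrt(s2_pooled / n1 + s2_pooled / n2)
    t = (mean(sample1) - mean(sample2)) / s_difference
    return t, df

# Hypothetical treatment and control scores.
treatment = [4.0, 6.0, 8.0]
control = [1.0, 2.0, 3.0, 4.0, 5.0]

t, df = independent_t(treatment, control)
cutoff = 2.447                     # t table: df = 6, two-tailed, alpha = .05
significant = abs(t) > cutoff
print(f"t({df}) = {t:.3f}, significant: {significant}")
```

Here the difference between means looks large, but with so few scores the t value does not pass the cutoff.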

Page 11

ANOVA: When to use

• Categorical IV, numerical DV (same as t-test)

• HOWEVER:
  – There are more than 2 levels of IV, so:
  – (M1 – M2) / SM won’t work

Page 12

ANOVA Assumptions

• Populations are normal

• Populations have equal variances

• More or less..

Page 13

Basic Logic of ANOVA

• Null hypothesis
  – Means of all groups are equal.

• Test: do the means differ more than expected given the null hypothesis?

• Terminology
  – Group = Condition = Cell

Page 14

Accompanying Statistics

• Experimental
  – Between-subjects
    • Single factor, N-level (for N>2)
      – One-way Analysis of Variance (ANOVA)
    • Two factor, two-level (or more!)
      – Factorial Analysis of Variance
      – AKA N-way Analysis of Variance (for N IVs)
      – AKA N-factor ANOVA
  – Within-subjects
    • Repeated-measures ANOVA (not discussed)
      – AKA within-subjects ANOVA

Page 15

ANOVA: Single factor, N-level (for N>2)

• The Analysis of Variance is used when you have more than two groups in an experiment
  – The F-ratio is the statistic computed in an Analysis of Variance and is compared to critical values of F
  – The analysis of variance may be used with unequal sample size (weighted or unweighted means analysis)
  – When there are just 2 groups, ANOVA is equivalent to the t test for independent means

Page 16

One-Way ANOVA – Assuming Null Hypothesis is True…

Within-Group Estimate Of Population Variance: the per-group estimates $\sigma^2_{\text{est},1}$, $\sigma^2_{\text{est},2}$, $\sigma^2_{\text{est},3}$ give $\sigma^2_{\text{est, within}}$

Between-Group Estimate Of Population Variance: the group means $M_1$, $M_2$, $M_3$ give $\sigma^2_{\text{est, between}}$

$F = \frac{\sigma^2_{\text{est, between}}}{\sigma^2_{\text{est, within}}}$
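The two variance estimates can be computed directly when every group has the same size. The sketch below assumes equal group sizes (a simplification of this illustration) and uses made-up scores: the within-group estimate is the average of the per-group variance estimates, and the between-group estimate is the variance of the group means multiplied by the group size.

```python
from statistics import mean, variance

def f_ratio(groups):
    """F ratio for a one-way design with equal group sizes."""
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch only handles equal group sizes")
    # Within-group estimate: average of the groups' variance estimates.
    s2_within = mean([variance(g) for g in groups])
    # Between-group estimate: variance of the group means, scaled up by n
    # to turn a variance of means back into a variance of individual scores.
    group_means = [mean(g) for g in groups]
    s2_between = variance(group_means) * n
    return s2_between / s2_within

# Hypothetical scores for three groups of three participants.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]
f = f_ratio(groups)
print(f"F = {f:.2f}")
```

If the null hypothesis is true, both estimates target the same population variance and F should be near 1; group means that are spread far apart inflate only the between-group estimate.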

Page 17

Justification for F statistic

Page 18

Calculating F

Page 19

Example

Page 20

Example

Page 21

Using the F Statistic

• Use a table for F(BDF, WDF)
  – And also α

BDF = between-groups degrees of freedom = number of groups – 1

WDF = within-groups degrees of freedom = Σ df for all groups = N – number of groups
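Both degrees-of-freedom values follow from the group sizes alone. A small sketch, with `anova_df` as a hypothetical helper name and three groups of 8 as an assumed example:

```python
def anova_df(group_sizes):
    """Return (BDF, WDF) for a one-way ANOVA given the size of each group."""
    bdf = len(group_sizes) - 1                 # number of groups - 1
    wdf = sum(n - 1 for n in group_sizes)      # sum of df over groups = N - number of groups
    return bdf, wdf

# Assumed example: 24 participants split into three groups of 8.
bdf, wdf = anova_df([8, 8, 8])
print(f"look up F({bdf}, {wdf}) at the chosen alpha")
```

The within-groups value depends only on the total N and the number of groups, so an uneven split of the same 24 participants gives the same result.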

Page 22

One-way ANOVA in SPSS

Page 23

Data

[Chart: mean Performance (y-axis, 0 to 6) for the 1 Day, 2 Day, and 3 Day groups.]

Page 24

Analyze/Compare Means/One Way ANOVA…

Page 25

SPSS Results…

ANOVA

Performance

Source           Sum of Squares   df   Mean Square   F       Sig.
Between Groups   24.813            2   12.406        9.442   .001
Within Groups    27.594           21    1.314
Total            52.406           23

F(2,21)=9.442, p<.05
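The columns of the SPSS table are linked by simple arithmetic: each mean square is a sum of squares divided by its df, and F is the between-groups mean square divided by the within-groups mean square. The sketch below recomputes those values from the sums of squares and df reported in the table; small differences in the last digit are expected because the printed figures are rounded.

```python
# Values copied from the SPSS one-way ANOVA table.
ss_between, df_between = 24.813, 2
ss_within, df_within = 27.594, 21

ms_between = ss_between / df_between   # Mean Square, Between Groups
ms_within = ss_within / df_within      # Mean Square, Within Groups
f = ms_between / ms_within             # F ratio

df_total = df_between + df_within
print(f"MS between = {ms_between:.4f}, MS within = {ms_within:.4f}")
print(f"F({df_between},{df_within}) = {f:.3f}, total df = {df_total}")
```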

Page 26

Factorial Designs

• Two or more nominal independent variables, each with two or more levels, and a numeric dependent variable.

• Factorial ANOVA teases apart the contribution of each variable separately.

• For N IVs, aka “N-way” ANOVA

Page 27

Factorial Designs

• Adding a second independent variable to a single-factor design results in a FACTORIAL DESIGN

• Two components can be assessed
  – The MAIN EFFECT of each independent variable
    • The separate effect of each independent variable
    • Analogous to separate experiments involving those variables
  – The INTERACTION between independent variables
    • When the effect of one independent variable changes over levels of a second
    • Or, when the effect of one variable depends on the level of the other variable.
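To make the distinction concrete, here is a descriptive sketch for a 2x2 design using invented cell means (sign condition by gender). A main effect shows up as a difference between marginal means; an interaction shows up when the effect of one variable differs across the levels of the other. This only describes patterns in the means and is not a significance test, and the unweighted marginal means assume equal cell sizes.

```python
from statistics import mean

# Hypothetical mean satisfaction per cell: (sign condition, gender).
cell_means = {
    ("no sign", "F"): 4.0, ("no sign", "M"): 6.0,
    ("sign", "F"): 8.0, ("sign", "M"): 6.0,
}

def marginal_means(cells, factor_index):
    """Average the cell means over the other factor (equal cell sizes assumed)."""
    by_level = {}
    for key, m in cells.items():
        by_level.setdefault(key[factor_index], []).append(m)
    return {level: mean(values) for level, values in by_level.items()}

def interaction_contrast(cells, a_levels, b_levels):
    """Effect of A at the first level of B minus the effect of A at the second."""
    (a1, a2), (b1, b2) = a_levels, b_levels
    effect_at_b1 = cells[(a2, b1)] - cells[(a1, b1)]
    effect_at_b2 = cells[(a2, b2)] - cells[(a1, b2)]
    return effect_at_b1 - effect_at_b2

sign_means = marginal_means(cell_means, 0)     # main effect of sign
gender_means = marginal_means(cell_means, 1)   # main effect of gender
contrast = interaction_contrast(cell_means, ("no sign", "sign"), ("F", "M"))
print(sign_means, gender_means, contrast)
```

In this made-up data the sign raises satisfaction for one gender only, so the contrast is far from zero even though the gender marginal means are identical.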

Page 28

Example

Wait Time Sign in Student Center vs. No Sign

Satisfaction

Page 29

Example of An Interaction - Student Center Sign – 2 Genders x 2 Sign Conditions

[Chart: Value of the Dependent Variable (y-axis, 0 to 12) against Level of Independent Variable A (Level 1, Level 2), with separate lines for F and M; the conditions are labeled No Sign and Sign.]

Page 30

Two-way ANOVA in SPSS

Page 31

Analyze/General Linear Model/Univariate

Page 32

Results

Tests of Between-Subjects Effects

Dependent Variable: Performance

Source                   Type III Sum of Squares   df   Mean Square   F         Sig.
Corrected Model          26.507 (a)                 5     5.301         3.685   .018
Intercept                210.855                    1   210.855       146.547   .000
TrainingDays             20.728                     2    10.364         7.203   .005
Trainer                  .002                       1      .002          .001   .974
TrainingDays * Trainer   1.680                      2      .840          .584   .568
Error                    25.899                    18     1.439
Total                    401.250                   24
Corrected Total          52.406                    23

a. R Squared = .506 (Adjusted R Squared = .369)

Page 33

Results

Page 34

Degrees of Freedom

• df for between-group variance estimates for main effects
  – Number of levels – 1

• df for between-group variance estimates for interaction effect
  – Total num cells – df for both main effects – 1
  – e.g. 2x2 => 4 – (1+1) – 1 = 1

• df for within-group variance estimate
  – Sum of df for each cell = N – num cells

• Report: “F(bet-group, within-group)=F, Sig.”
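These rules are easy to encode for a two-factor design. The sketch below uses a hypothetical helper, `factorial_df`, and checks it against a 3x2 design with N = 24, which matches the df column of the SPSS table for TrainingDays by Trainer.

```python
def factorial_df(levels_a, levels_b, n_total):
    """Degrees of freedom for a two-way between-subjects ANOVA."""
    cells = levels_a * levels_b
    df_a = levels_a - 1                          # main effect of factor A
    df_b = levels_b - 1                          # main effect of factor B
    df_interaction = cells - (df_a + df_b) - 1   # total cells - main-effect df - 1
    df_within = n_total - cells                  # N - number of cells
    return {"A": df_a, "B": df_b, "A*B": df_interaction, "within": df_within}

# 3 levels of TrainingDays x 2 levels of Trainer, 24 participants.
print(factorial_df(3, 2, 24))
```

For the 2x2 case the interaction df works out to 4 – (1+1) – 1 = 1, as in the slide's example.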

Page 35

Publication format

Tests of Between-Subjects Effects

Dependent Variable: Performance

Source                   Type III Sum of Squares   df   Mean Square   F         Sig.
Corrected Model          26.507 (a)                 5     5.301         3.685   .018
Intercept                210.855                    1   210.855       146.547   .000
TrainingDays             20.728                     2    10.364         7.203   .005
Trainer                  .002                       1      .002          .001   .974
TrainingDays * Trainer   1.680                      2      .840          .584   .568
Error                    25.899                    18     1.439
Total                    401.250                   24
Corrected Total          52.406                    23

a. R Squared = .506 (Adjusted R Squared = .369)

N=24, 2x3=6 cells => df TrainingDays=2, df within-group variance=24-6=18

=> F(2,18)=7.20, p<.05
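Turning a table row into the publication string is mechanical once the two df values are known. A minimal sketch follows; `report_f` is a hypothetical helper, and the "n.s." label for non-significant effects and the fixed α = .05 threshold are assumptions of this illustration.

```python
def report_f(df_effect, df_within, f, sig, alpha=0.05):
    """Format one ANOVA effect as 'F(df_effect,df_within)=F, p<alpha', or 'n.s.'."""
    if sig >= alpha:
        return "n.s."
    alpha_text = f"{alpha:.2f}".lstrip("0")   # ".05" rather than "0.05"
    return f"F({df_effect},{df_within})={f:.2f}, p<{alpha_text}"

# Rows taken from the Tests of Between-Subjects Effects table.
print("TrainingDays:", report_f(2, 18, 7.203, 0.005))
print("Trainer:", report_f(1, 18, 0.001, 0.974))
print("TrainingDays * Trainer:", report_f(2, 18, 0.584, 0.568))
```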

Page 36

Reporting rule

• IF you have a significant interaction

• THEN
  – If 2x2 study: do not report main effects, even if significant
  – Else: must look at patterns of means in cells to determine whether to report main effects or not.
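The rule can be written as a small decision function. This is a sketch under stated assumptions: the factor names "A" and "B", the function name, and the "(check cell means first)" flag for designs larger than 2x2 are all inventions of this example, and α is fixed at .05.

```python
def effects_to_report(sig_a, sig_b, sig_interaction, is_2x2, alpha=0.05):
    """Apply the reporting rule to the Sig. values of a two-way ANOVA."""
    main = [name for name, p in (("A", sig_a), ("B", sig_b)) if p < alpha]
    if sig_interaction < alpha:
        if is_2x2:
            # Significant interaction in a 2x2 study: main effects are not reported.
            return ["A*B"]
        # Larger designs: the cell means decide whether main effects are reportable.
        return ["A*B"] + [f"{name} (check cell means first)" for name in main]
    # No significant interaction: report whichever main effects are significant.
    return main

print(effects_to_report(0.34, 0.12, 0.41, is_2x2=True))
print(effects_to_report(0.04, 0.12, 0.01, is_2x2=True))
print(effects_to_report(0.04, 0.02, 0.41, is_2x2=False))
```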

Page 37

Results?

Source                   Sig.
TrainingDays             0.34
Trainer                  0.12
TrainingDays * Trainer   0.41

n.s.

Page 38

Results?

Source                   Sig.
TrainingDays             0.34
Trainer                  0.12
TrainingDays * Trainer   0.02

Significant interaction between TrainingDays and Trainer, F(2,18)=.584, p<.05

Page 39

Results?

Source                   Sig.
TrainingDays             0.34
Trainer                  0.02
TrainingDays * Trainer   0.41

Main effect of Trainer, F(1,18)=.001, p<.05

Page 40

Results?

Source                   Sig.
TrainingDays             0.04
Trainer                  0.12
TrainingDays * Trainer   0.01

Significant interaction between TrainingDays and Trainer, F(2,18)=.584, p<.05

Do not report TrainingDays as significant

Page 41

Results?

Source                   Sig.
TrainingDays             0.04
Trainer                  0.02
TrainingDays * Trainer   0.41

Main effects for both TrainingDays, F(2,18)=7.20, p<.05, and Trainer, F(1,18)=.001, p<.05

Page 42

“Factorial Design”

• Not all cells in your design need to be tested
  – But if they are, it is a “full factorial design”, and you do a “full factorial ANOVA”

[Table: 2x2 design grid with the labels Real-Time, Retrospective, Agent, and Text; one cell is marked X.]

Page 43

Higher-Order Factorial Designs

• More than two independent variables are included in a higher-order factorial design
  – As factors are added, the complexity of the experimental design increases
    • The number of possible main effects and interactions increases
    • The number of subjects required increases
    • The volume of materials and amount of time needed to complete the experiment increases
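How quickly the design grows can be counted directly. In a full factorial design with k factors there are k main effects and one interaction for every subset of two or more factors, and a between-subjects study needs participants in every cell. The helper name and the figure of 10 participants per cell are assumptions of this sketch.

```python
from math import comb, prod

def design_size(levels, per_cell):
    """Count effects, cells, and participants for a full between-subjects factorial design."""
    k = len(levels)
    interactions = sum(comb(k, r) for r in range(2, k + 1))   # equals 2**k - k - 1
    cells = prod(levels)
    return {
        "main_effects": k,
        "interactions": interactions,
        "cells": cells,
        "subjects": cells * per_cell,
    }

# Assumed example: 10 participants per cell.
for levels in ([2, 2], [2, 3, 2], [2, 2, 2, 2]):
    print(levels, design_size(levels, per_cell=10))
```

Going from two to four two-level factors raises the number of interactions from 1 to 11 and quadruples the number of cells.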