Symposium: Statistical Power and Optimal Design Principles for Improving the Efficiency of Psychological Research

Dec 17, 2015

Augusta Carroll
Transcript
Page 1

Symposium:

Statistical Power and Optimal Design Principles for

Improving the Efficiency of Psychological Research

Page 2

Jake Westfall (University of Colorado Boulder)

• PANGEA: A web application for power analysis in general ANOVA designs

Daniel Lakens (Eindhoven University of Technology)

• Performing high-powered studies efficiently with sequential analyses

Matthew Fritz (University of Nebraska – Lincoln)

• Issues with increasing statistical power in mediation models

Robert Ackerman (The University of Texas at Dallas)

• Power considerations for the actor-partner interdependence model

Page 3

Power is an old issue

• Methodologists have been preaching about power for over 50 years
  – (Cohen, 1962)

• Yet low-powered studies continue to be the norm in psychology
  – (Sedlmeier & Gigerenzer, 1989)
  – (Maxwell, 2004)

Page 4

Renewed interest in power?

Page 5

Renewed interest in power?

• Lots of recent interest in attempting to replicate results[citation needed]

• But failures to replicate are only informative when statistical power is adequate

Page 6

Fine. But what is left to learn about power?

• A lot

• For one, persistent and widespread intuitions about the sample sizes necessary for adequate power are basically terrible

• n=30 rule??

Page 7

Design     n=30 rule    N=160 rule
2 cells    40%          81%
2×2        69%          81%
2×2×2      94%          81%

Based on power to detect average effect size in social psychology (d=0.45) in between-subjects factorials
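The percentages in this table can be roughly checked with a short calculation. The sketch below is not PANGEA's method. It assumes that the "n=30 rule" means 30 participants per cell, that "N=160" is the total sample size, and that the effect being tested is a two-level contrast (such as a main effect) in a balanced design. It also swaps in a normal approximation for the noncentral t distribution, so the values land close to the table rather than matching it exactly. The function name `approx_power` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def approx_power(d, total_n, alpha=0.05):
    """Approximate two-sided power for a two-level contrast in a
    balanced between-subjects design with total_n participants.

    Uses a normal approximation (assumption), not the exact noncentral t.
    """
    # Each half of the sample has total_n / 2 people, so the standard
    # error of the standardized mean difference is 2 / sqrt(total_n).
    ncp = d * sqrt(total_n) / 2
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(ncp - z_crit) + STD_NORMAL.cdf(-ncp - z_crit)

D = 0.45  # average effect size quoted on the slide
for label, cells in [("2 cells", 2), ("2x2", 4), ("2x2x2", 8)]:
    power_n30 = approx_power(D, 30 * cells)   # 30 per cell (assumed reading)
    power_n160 = approx_power(D, 160)         # 160 in total (assumed reading)
    print(f"{label:8s} n=30 per cell: {power_n30:.2f}   N=160 total: {power_n160:.2f}")
```

Under these assumptions, power under the n=30 rule depends heavily on how many cells the design has, while a fixed total N gives the same power for every design in the table.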

Page 8

Fine. But what is left to learn about power?

• Even among the initiated, power tends to be well understood only for simple designs
  – At most: Factorial ANOVA & multiple regression

• For even moderately more complicated designs (e.g., 2×2 mixed ANOVA), researchers are back to “winging it”

• Some strange things can happen when designs get complicated!
  – Maximum attainable power < 100% ?!

Page 9

This talk

• Two purposes

1. Debut brand new web app. PANGEA: Power ANalysis for GEneral Anova designs

2. Describe in detail a particular, unique application of PANGEA
  • Power analysis with crossed random factors (participants responding to stimuli)

Page 10

JakeWestfall.org/pangea/

Page 11

PANGEA (JakeWestfall.org/pangea/)

• “General ANOVA design” = any design that can be described by some variety of ANOVA model
  – Any number of factors with any number of levels
  – Any factor can be fixed or random (more on that shortly!)
  – Any possible pattern of nesting/crossing allowed

Page 12

PANGEA (JakeWestfall.org/pangea/)

• Examples of designs covered by PANGEA:
  – 2 independent groups (the classic!)
  – Factorial (between-subjects) ANOVA
  – Repeated-measures or mixed ANOVA
  – 3-level (and beyond) hierarchical/multilevel designs
  – Crossed random factors (e.g., participants crossed with stimuli)
  – Dyadic designs (e.g., Social Relations Model)

• All in a single, unified framework

Page 13

PANGEA (JakeWestfall.org/pangea/)

• Limitations:
  – Assumes “balanced” designs only (constant cell size / constant number of observations per unit)
  – Assumes no continuous predictors

Page 14

An example: Crossed random factors

• Studies involving participants responding to stimuli (hypothetical data matrix):

Subject #1:  4 6 7 3 8 8 7 9 5 6
Subject #2:  4 7 8 4 6 9 6 7 4 5
Subject #3:  3 6 7 4 5 7 5 8 3 4
...

Page 15

• Just in the domain of implicit prejudice and stereotyping:
  – IAT (Greenwald et al.)
  – Affective Priming (Fazio et al.)
  – Shooter task (Correll et al.)
  – Affect Misattribution Procedure (Payne et al.)
  – Go/No-Go task (Nosek et al.)
  – Primed Lexical Decision task (Wittenbrink et al.)
  – Many non-paradigmatic studies

Page 16

• “How many stimuli should I use?”

• “How similar or variable should the stimuli be?”

• “When should I counterbalance the assignment of stimuli to conditions?”

• “Is it better to have all participants respond to the same set of stimuli, or should each participant receive different stimuli?”

• “Should participants make multiple responses to each stimulus, or should every response by a participant be to a unique stimulus?”

Hard questions

PANGEA to the rescue!

Page 17

Power analysis in crossed designs

• Power determined by several parameters:
  – 1 effect size (Cohen’s d)
  – 2 sample sizes
    • p = # of participants
    • q = # of stimuli
  – Set of Variance Partitioning Coefficients (VPCs)
    • VPCs describe what proportion of the random variation in the data comes from which sources
    • Different designs depend on different VPCs

Page 18

For power = 0.80, need q ≈ 50

Page 19

For power = 0.80, need p ≈ 20

Page 20

?

Page 21

Maximum attainable power

• In crossed designs, power asymptotes at a maximum theoretically attainable value that depends on:
  – Effect size
  – Number of stimuli
  – Stimulus variability

• Under realistic assumptions, maximum attainable power can be quite low!
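To see why power plateaus, it may help to write out one plausible form of the noncentrality parameter. The sketch below is an illustration, not PANGEA's code. It assumes a design in which stimuli are nested in two conditions and every participant responds to every stimulus. It also assumes the VPCs are scaled so that stimulus variance enters the sampling variance as V/q, participant-by-condition variance as V/p, and residual variance as V/(pq). The VPC values are hypothetical, the function names are made up, and power uses a normal approximation that ignores degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def two_sided_power(ncp, alpha=0.05):
    """Normal approximation (assumption) to two-sided power."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(ncp - z_crit) + STD_NORMAL.cdf(-ncp - z_crit)

def crossed_ncp(d, p, q, v_stim, v_part_by_cond, v_resid):
    """Noncentrality for the assumed design: p participants, q stimuli."""
    return d / (2 * sqrt(v_stim / q + v_part_by_cond / p + v_resid / (p * q)))

def max_ncp(d, q, v_stim):
    """Limit of crossed_ncp as p grows: only the stimulus term remains."""
    return d * sqrt(q) / (2 * sqrt(v_stim))

# Hypothetical inputs, chosen for illustration only (not from the talk).
D, Q = 0.5, 16
V_STIM, V_PART_BY_COND, V_RESID = 0.10, 0.10, 0.60

for p in (20, 200, 2000):
    power = two_sided_power(crossed_ncp(D, p, Q, V_STIM, V_PART_BY_COND, V_RESID))
    print(f"p={p:5d}  approx. power = {power:.2f}")
print(f"ceiling as p grows: {two_sided_power(max_ncp(D, Q, V_STIM)):.2f}")
```

Adding participants shrinks only the terms divided by p. The stimulus term V/q stays fixed, so with a fixed stimulus set the ceiling depends on the effect size, the number of stimuli, and the stimulus variability, the three items listed above.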

Page 22

To obtain max. power = 0.9…

Pessimist: q=86

Realist: q=20 to 50

Optimist: q=11

Page 23

Implications of maximum attainable power

• Think hard about your experimental stimuli before you begin collecting data!
  – Once data collection begins, maximum attainable power is pretty much determined.

• Even the most optimistic assumptions imply that we should use at least 11 stimuli
  – Based on achieving max. power = 0.9 to detect a canonical “medium” effect size (d = 0.5)

Page 24

The end

JakeWestfall.org/pangea/

References:

Westfall, J., Kenny, D. A., & Judd, C. M. (2014). Statistical power and optimal design in experiments in which samples of participants respond to samples of stimuli. Journal of Experimental Psychology: General.

Judd, C. M., Westfall, J., & Kenny, D. A. (invited). Linear mixed models for the analysis of experiments with multiple random factors. To appear in Annual Review of Psychology.

Page 25

Bonus slides!

Page 26

PANGEA (JakeWestfall.org/pangea/)

• Features coming soon to PANGEA
  – Specify desired power, solve for minimum parameter values (effect size, sample sizes, etc.) necessary to yield that power level
  – Sensitivity analysis: Specify distributions of likely parameter values, compute corresponding distribution of likely power values

Page 27

Sensitivity analysis

Distribution of correlations

+

Distribution of effect sizes

+

Range of sample sizes

=

Power curve that includes parameter uncertainty
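The idea on this slide can be sketched as a small Monte Carlo calculation. This is not how PANGEA implements (or will implement) sensitivity analysis. It is a minimal illustration that assumes a simple two-group between-subjects design, a normal distribution of plausible effect sizes with a hypothetical mean and standard deviation, and a normal approximation to power. It leaves out the distribution of correlations mentioned above, which would matter for repeated-measures designs. All function names and parameter values are made up for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()

def approx_power(d, total_n, alpha=0.05):
    """Normal-approximation power for a balanced two-group comparison."""
    ncp = d * sqrt(total_n) / 2
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(ncp - z_crit) + STD_NORMAL.cdf(-ncp - z_crit)

def power_curve_with_uncertainty(sample_sizes, d_mean=0.45, d_sd=0.10,
                                 draws=2000, seed=1):
    """Average power over draws from an assumed effect-size distribution."""
    rng = random.Random(seed)
    # Draw plausible effect sizes once and reuse them at every sample size.
    effect_sizes = [rng.gauss(d_mean, d_sd) for _ in range(draws)]
    return {n: mean(approx_power(d, n) for d in effect_sizes)
            for n in sample_sizes}

curve = power_curve_with_uncertainty(range(40, 241, 40))
for total_n, expected_power in curve.items():
    print(f"N={total_n:3d}  expected power = {expected_power:.2f}")
```

Each point on the resulting curve is power averaged over the uncertainty in the effect size, rather than power at a single assumed value.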

Page 28

What about time-consuming stimulus presentation?

• Assume that responses to each stimulus take about 10 minutes (e.g., film clips).

• Power analysis says we need q=60 to reach power=0.8 (based on having p=60)

• But then it would take over 10 hours for a participant to respond to every stimulus!

• The highest feasible number of responses per participant is, say, 6 (about one hour)

• Are we doomed to have low power? No!

Page 29

Stimuli-within-Block designs

Page 30

Standard error reduced by factor of 2.3!