Power and Sample Size - Vanderbilt - WebHomebiostat.mc.vanderbilt.edu/wiki/pub/Main/ChrisSlaughter/powergi.pdf · Opportunity to plan data analysis before collecting any ... Power

Power and Sample Size

Introduction

Definitions

Relationships

Calculations

Margin of Error

Summary

Power and Sample Size

Chris Slaughter, DrPH

Assistant Professor, Department of Biostatistics
Vanderbilt University School of Medicine

GI Research Conference June 19, 2008


Outline

1 Introduction

2 Definitions

3 Factors that Impact Power

4 Sample Size Calculations

5 Margin of Error

6 Conclusions and Advice


Power and Sample Size

First question asked (and the last answered)

How many subjects do I need in my study?
If I enroll # subjects in a treatment and control group, how likely am I to detect a significant difference between the two groups?

Calculations depend on

Scientific goals
Study design
Analysis method
Practical limitations: budget, time


Expectations

What to expect from a sample size calculation

Estimate of the approximate number of subjects for a given study design
Conduct early at design phase when changes still possible
Opportunity to plan data analysis before collecting any data

What not to expect

High accuracy if inputs (informed guesses) are not accurate
A quick answer
Post-hoc power analysis


Hypothesis Testing

Hypothesis: usually a statement to be judged, of the form “population value = specified constant”

Null hypothesis (H0)

Usually a hypothesis of no effect
H0 is often a straw man; something you hope to disprove
H0: µ1 − µ2 = 0

Alternative hypothesis (H1)

H1: µ1 − µ2 ≠ 0
Power and sample size calculations require you to specify the alternative hypothesis too; e.g. H1: µ1 − µ2 = 10


Errors in Hypothesis Testing

Type I error (α)

Prob. of rejecting your null hypothesis when it is true
Declaring that a significant association exists between X and Y when, in truth, X and Y are not related
α = 0.05 or 0.01 usually

Type II error (β)

Prob. of failing to reject your null hypothesis when it is false
Not finding a significant association between X and Y when, in truth, X and Y are related
Power = 1 − β
β = 0.20 or 0.10 usually


More Definitions

Effect size: How large a difference you expect to see between groups (e.g. a treatment and a control group)

Difference in means, difference in proportions, odds ratios, relative risk
What is a clinically relevant difference?

Precision

Absence of random error
Variable has nearly the same value when measured multiple times
High precision leads to decreased variability and higher power

Accuracy

Degree to which a variable accurately measures what it is supposed to measure
Increases validity of conclusions


Effect Size and Precision


Power and Sample Size Relationships

Power ↑ when

Allow larger type I error (α; tradeoff between type I and II errors)
Larger effect observed
Variability ↓
n ↑ (and n1/n2 = 1)

Required sample size (n) ↓ when

Allow larger type I error
Larger effect observed
Variability ↓
Allow larger type II error (power ↓)
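The relationships above can be checked numerically. The sketch below is only an illustration, not the method used in the talk: it assumes a two-sided, two-sample z-test with a known common standard deviation, and the function name `approx_power` and all input values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(delta, sigma, n1, n2, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Size of the true difference in standard-error units
    shift = abs(delta) / (sigma * sqrt(1 / n1 + 1 / n2))
    # Probability that the test statistic falls in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical baseline: difference of 10, SD of 10, 20 subjects per group
baseline = approx_power(delta=10, sigma=10, n1=20, n2=20)

print("baseline          ", round(baseline, 3))
print("larger alpha      ", round(approx_power(10, 10, 20, 20, alpha=0.10), 3))
print("larger effect     ", round(approx_power(15, 10, 20, 20), 3))
print("less variability  ", round(approx_power(10, 5, 20, 20), 3))
print("larger n          ", round(approx_power(10, 10, 40, 40), 3))
print("unequal allocation", round(approx_power(10, 10, 30, 10), 3))
```

Each of the first four changes raises power relative to the baseline, while splitting the same 40 subjects 30/10 instead of 20/20 lowers it.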


Power versus Sample Size


Types of outcomes and predictors

Specific power calculation will depend on the analysis method

Continuous outcome, binary predictor

Percent of time below pH 4 in a treatment and control group
2-sample t-test, Wilcoxon rank sum test

Binary outcome, binary predictor

Any improvement (yes/no) in the steroid group compared to the steroid plus dilation group
Dichotomize percent of time below pH 4
χ² test, test of proportions, odds ratio

Continuous outcome, continuous predictor

Correlation, linear regression

Lots of other analysis options...


Calculation Methods

Software: PS, web, others

biostat.mc.vanderbilt.edu/PowerSampleSize
www.cs.uiowa.edu/~rlenth/Power/
#1 and #4 on Google search

Formulas

Repeated measures
Unusual designs

Simulation

Study-specific
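As a rough illustration of the simulation approach, power can be estimated by generating many hypothetical studies and counting how often the test rejects. The sketch below is not taken from the talk. It assumes normally distributed outcomes and compares a Welch-type statistic to a normal critical value, which slightly overstates power in small samples. The function name `simulated_power` and the inputs are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance


def simulated_power(mean_treat, mean_ctrl, sd, n_per_group,
                    alpha=0.05, n_sims=2000, seed=1):
    """Estimate power as the fraction of simulated studies that reject H0."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_sims):
        treat = [rng.gauss(mean_treat, sd) for _ in range(n_per_group)]
        ctrl = [rng.gauss(mean_ctrl, sd) for _ in range(n_per_group)]
        # Standard error of the difference in sample means
        se = sqrt(variance(treat) / n_per_group + variance(ctrl) / n_per_group)
        if abs(mean(treat) - mean(ctrl)) / se > z_crit:
            rejections += 1
    return rejections / n_sims


# Hypothetical inputs: means of 40 and 50, SD of 10, 22 subjects per group
power = simulated_power(40, 50, 10, n_per_group=22)
print(f"Estimated power: {power:.2f}")
```

The same loop can be adapted to unusual designs by replacing the data-generating step and the test with whatever the study will actually use.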



Example: Percent of time below pH 4

Continuous outcome, binary predictor

Treatment group spends 40% of time below pH 4
Control group spends 50% of time below pH 4
Standard deviation of 10%
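A minimal sketch of the sample size calculation for these inputs, using the standard normal-approximation formula n = 2(z₁₋α/₂ + z₁₋β)²σ²/δ² per group. The slides do not state the significance level or power used, so α = 0.05 (two-sided) and 90% power are assumptions here, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_means(delta, sigma, alpha=0.05, power=0.90):
    """Subjects per group for a two-sided, two-sample comparison of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = z.inv_cdf(power)           # quantile matching the desired power
    return 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2


raw_n_continuous = n_per_group_means(delta=50 - 40, sigma=10)
n_continuous = ceil(raw_n_continuous)
print(n_continuous)  # → 22
```

Under these assumed settings the result agrees with the 22 subjects per group quoted later in the talk. Dedicated software typically uses the t distribution rather than the normal approximation, so its answer can differ slightly.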


Example: Dichotomize percent of time below pH 4

Binary outcome, binary predictor

“Abnormal” if more than half of time is spent below pH 4

Treatment group: 16% Abnormal
Control group: 50% Abnormal
Standard deviation determined by above percentages
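For the binary version, one common normal-approximation formula for comparing two proportions is sketched below. The talk does not say which formula or software settings were used, so the formula choice, α = 0.05 (two-sided), 90% power, and the function name are all assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_proportions(p1, p2, alpha=0.05, power=0.90):
    """Subjects per group for a two-sided test of two proportions."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    # Variability under H0 (pooled proportion) and under H1 (separate proportions)
    sd_null = sqrt(2 * p_bar * (1 - p_bar))
    sd_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return (z_alpha * sd_null + z_beta * sd_alt) ** 2 / (p1 - p2) ** 2


raw_n_binary = n_per_group_proportions(0.16, 0.50)
n_binary = ceil(raw_n_binary)
print(round(raw_n_binary, 1), n_binary)
```

The unrounded value is just over 38 per group, in line with the 38 quoted later in the talk; rounding up and continuity corrections move the answer by a subject or two. Unlike the continuous case, no separate standard deviation is needed, since the variance p(1 − p) follows from the proportions themselves.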


Comparison of binary and continuous outcomes

Same data assumptions

Treatment: 40% of time below pH 4 (σ = 10%) would give 16% Abnormal
Control: 50% of time below pH 4 (σ = 10%) would give 50% Abnormal

Number of subjects needed (each group)

Continuous outcome: 22 subjects
Binary outcome: 38 subjects

Need estimate of the variability for continuous outcomes

For binary outcomes, variability is largest for p = 50%
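The 16% and 50% figures follow if percent of time below pH 4 is taken to be normally distributed within each group. The slides do not state that assumption; it is used here only because it reproduces their numbers. The short sketch below also checks that the binary variance p(1 − p) peaks at p = 50%. Function and variable names are hypothetical.

```python
from statistics import NormalDist


def fraction_abnormal(mean, sd, cutoff=50.0):
    """Share of subjects above the cutoff, assuming a normal distribution."""
    return 1 - NormalDist(mean, sd).cdf(cutoff)


treatment_abnormal = fraction_abnormal(mean=40, sd=10)  # cutoff is 1 SD above the mean
control_abnormal = fraction_abnormal(mean=50, sd=10)    # cutoff is at the mean
print(f"{treatment_abnormal:.0%} {control_abnormal:.0%}")  # → 16% 50%

# Variance of a binary outcome, p * (1 - p), over a grid of proportions
variances = {p / 100: (p / 100) * (1 - p / 100) for p in range(1, 100)}
p_with_max_variance = max(variances, key=variances.get)
print(p_with_max_variance)  # → 0.5
```

Dichotomizing discards how far each subject is from the cutoff, which is why the binary analysis needs more subjects than the continuous one for the same underlying data.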


Sample Size and Margin of Error

Goal: Plan a study so that the margin of error is sufficiently small

The margin of error is defined to be half of the confidence interval width

Basing the sample size calculations on the margin of error can lead to a study that gives scientifically relevant results even if the results are not statistically significant.


Margin of Error Example

Infection rate in a population is 50% and a reduction to 40% is believed to be “clinically significant”

Enroll enough subjects so that the margin of error is 5%. Consider these two possible outcomes:

1 The new treatment is found to decrease infections by 6% (95% CI: [11%, 1%]).

P-value < 0.05 (“significant”)

2 The new treatment decreases infections by only 4% (95% CI: [9%, −1%]).

P-value > 0.05 (“not significant”)
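A sketch of how the enrollment for a 5% margin of error might be computed. It assumes a Wald-type 95% confidence interval for the difference of two proportions, equal group sizes, and planning values of 50% and 40% for the two infection rates. The talk does not give the calculation, so the formula choice and the function name are assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_for_margin(p1, p2, margin, conf=0.95):
    """Subjects per group so the CI half-width for p1 - p2 is at most `margin`."""
    z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    # Half-width = z * sqrt((p1*q1 + p2*q2) / n); solve for n
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(z_crit ** 2 * variance_sum / margin ** 2)


n_margin = n_per_group_for_margin(p1=0.50, p2=0.40, margin=0.05)
print(n_margin)  # → 753
```

With an interval this narrow, either outcome above is interpretable: a 6% decrease rules out no effect, and a 4% decrease with limits of 9% and −1% rules out a reduction as large as 10%.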


Advantages

Advantages of planning for precision rather than power [1]

Many studies are powered to detect a miracle and nothing less; if a miracle doesn’t happen, the study provides no information
Planning on the basis of precision will allow the resulting study to be interpreted if the P-value is large, because the confidence interval will not be so wide as to include both clinically significant improvement and clinically significant worsening

[1] See Borenstein M: J Clin Epi 1994; 47:1277-1285.


Using Correlation (r) to Compute Sample Size

Continuous outcomes, continuous predictors

Without knowledge of population variances, etc., r can be useful for planning studies

Choose n so that margin for error (half-width of C.L.) for r is acceptable

Precision of r in estimating ρ is generally worst when population correlation is 0

This margin for error is shown in the figure below


Using Correlation (r) to Compute Sample Size

Figure: Margin for error (length of longer side of asymmetric 0.95 confidence interval) for r in estimating ρ, when ρ = 0 (solid line) and ρ = 0.5 (dotted line). Calculations are based on Fisher’s z transformation of r.
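The quantity plotted in the figure can be sketched as follows. This is an illustration of Fisher's z approach, not the code behind the original figure. It treats the observed r as equal to ρ and uses the usual standard error 1/√(n − 3) on the transformed scale. The function names and the target margin are hypothetical.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def r_margin_of_error(rho, n, conf=0.95):
    """Longer side of the Fisher-z confidence interval for a correlation."""
    z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    center = atanh(rho)                  # Fisher's z transformation
    half_width = z_crit / sqrt(n - 3)    # half-width on the z scale
    lower = tanh(center - half_width)    # back-transform the limits
    upper = tanh(center + half_width)
    return max(rho - lower, upper - rho)


def smallest_n(rho, target, conf=0.95):
    """Smallest n whose margin of error does not exceed `target`."""
    n = 4
    while r_margin_of_error(rho, n, conf) > target:
        n += 1
    return n


for rho in (0.0, 0.5):
    print(rho, round(r_margin_of_error(rho, n=100), 3))
print(smallest_n(rho=0.0, target=0.2))
```

For the same n, the margin at ρ = 0 is wider than at ρ = 0.5, which is why planning for ρ = 0 is the conservative choice.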


Other considerations

Other factors that can impact required sample size

Dropouts (missing data)
Correlation: Paired observations or repeated measures
Multiple testing and interim analyses
Equivalence testing
Better analysis options
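As one example of such an adjustment, a common way to allow for dropouts is to inflate the calculated sample size so that the expected number of completers still meets the target. The sketch below is illustrative only: the talk does not prescribe this adjustment, and the 10% dropout rate and function name are hypothetical.

```python
from math import ceil


def inflate_for_dropout(n_required, dropout_rate):
    """Number to enroll so that about `n_required` subjects complete the study."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    return ceil(n_required / (1 - dropout_rate))


# Hypothetical: 22 completers needed per group, 10% expected dropout
n_enroll = inflate_for_dropout(22, 0.10)
print(n_enroll)  # → 25
```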


Bad Ideas

Do not...

Use retrospective power calculations

Calculate standardized effect sizes (Cohen)

Standardize measure: “small”, “medium”, and “large” effects
Ignores important parts of study planning, science


Good Ideas

Do ...

Use power calculations prospectively to plan future studies

Put science before statistics

Design your study to meet scientific goals
Clinically important effect sizes
Statistics help identify a plan that is effective in meeting scientific goals – not the other way around

Conduct pilot studies

Useful for estimating variance

Use continuous variables when possible
