Sample Size Calculation PD Dr. Rolf Lefering IFOM - Institut für Forschung in der Operativen Medizin Universität Witten/Herdecke Campus Köln-Merheim

Mar 26, 2015

Lauren Walton

Page 1:

Sample Size Calculation

PD Dr. Rolf Lefering

IFOM - Institut für Forschung in der Operativen Medizin, Universität Witten/Herdecke, Campus Köln-Merheim

Page 2:

Sample Size Calculation

[Figure: sample size, balancing uncertainty against costs & effort & time]

Page 3:

Sample Size Calculation

Single study group

- continuous measurement

- count of events

Comparative trial (2 or more groups)

- continuous measurement

- count of events

Page 4:

Confidence Interval

Which true value is compatible with the observation?

Confidence interval: the range in which the true value lies with a high probability (usually 95%)

Page 5:

Confidence Interval

Example:

56 patients with open fractures, 9 developed an infection (16%)

Sample (n=56): infection rate 16%

All patients with open fractures: true value ???

Page 6:

Confidence Interval

Formula for event rates

$CI_{95} = p \pm 1.96 \cdot \sqrt{\frac{p \cdot (100 - p)}{n}}$

n = sample size

p = percentage

Example: n = 56, p = 16%

$CI_{95} = 16 \pm 1.96 \cdot \sqrt{\frac{16 \cdot 84}{56}} = 16 \pm 9.6$, i.e. [6.4 - 25.6]
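The event-rate formula can be checked with a few lines of Python. This is only a sketch of the normal approximation used on the slide; the function name `ci95_rate` is made up for illustration.

```python
import math

def ci95_rate(p_percent, n, z=1.96):
    """Confidence interval for an event rate given in percent (normal approximation)."""
    half_width = z * math.sqrt(p_percent * (100 - p_percent) / n)
    return p_percent - half_width, p_percent + half_width

# Example from the slide: 16% infections among 56 patients
lower, upper = ci95_rate(16, 56)
print(f"{lower:.1f} - {upper:.1f}")  # 6.4 - 25.6
```

For very small samples this approximation can produce a lower bound below 0%, so it should be read as a rough guide only.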

Page 7:

Confidence Interval

[Figure: 95% confidence interval around a 20% incidence rate; x-axis: sample size (10 to 150), y-axis: incidence rate (%), 0 to 50]

Page 8:

Confidence Interval

Formula for continuous variables

Mean: $CI_{95} = M \pm 1.96 \cdot SE$

M = mean

SE = standard error

SD = standard deviation

n = sample size

Remember: $SE = SD / \sqrt{n}$

Multiplier: 1.65 for 90%, 1.96 for 95%, 2.58 for 99%
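A minimal sketch of the formula for continuous variables, using the three multipliers listed on the slide. The input numbers (mean 10, SD 3, n = 36) are hypothetical and chosen only to make the arithmetic easy to follow; `ci_mean` is an illustrative name.

```python
import math

# Multipliers from the slide, keyed by confidence level in percent
Z = {90: 1.65, 95: 1.96, 99: 2.58}

def ci_mean(mean, sd, n, level=95):
    """Confidence interval for a mean: M +/- z * SE, with SE = SD / sqrt(n)."""
    se = sd / math.sqrt(n)
    half_width = Z[level] * se
    return mean - half_width, mean + half_width

# Hypothetical data, not taken from the slides: SE = 3 / 6 = 0.5
low, high = ci_mean(mean=10.0, sd=3.0, n=36)
print(f"{low:.2f} - {high:.2f}")  # 9.02 - 10.98
```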

Page 9:

Sample Size Calculation

Comparative trials

Page 10:

"What is the sample size to show that early weight-bearing therapy is better?"

"Which key should I press here now?"

"What is the sample size to show that early weight-bearing therapy, as compared to standard therapy, is able to reduce the time until return to work from 10 weeks to 8 weeks, where time to return to work has an SD of 3?"

36 cases per group!
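The slide does not say how the 36 cases were derived. One common approximation, assuming a two-sided α of 0.05 and 80% power (the defaults used later in the talk), is $n = 2 (z_{1-\alpha/2} + z_{1-\beta})^2 \cdot SD^2 / \Delta^2$ per group, and it reproduces the number. The function name below is an illustration, not part of the talk.

```python
import math
from statistics import NormalDist

def n_per_group_means(diff, sd, alpha=0.05, power=0.80):
    """Cases per group to detect a difference of means (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / diff ** 2)

# Return to work: 10 vs. 8 weeks, SD = 3
print(n_per_group_means(diff=2, sd=3))  # 36
```

A calculation based on the t-distribution or a dedicated program may give a slightly larger number.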

Page 11:

Outcome Measures

- Wound infection
- Wellbeing
- Pain
- Sepsis
- Mobility
- Independence, autonomy
- Hospital stay
- Organ failure
- Recurrence rate
- Blood pressure
- Fear
- Fatigue
- Anxiety
- Social status
- Lab values
- Survival
- Complications
- Depression

Page 12:

Select Outcome Measure

• Relevance: Does this endpoint convince the patient / the scientific community?

• Reliability; measurability: Can the outcome be measured easily, with little variation, even by different people?

• Sensitivity: Does the intervention lead to a significant change in the outcome measure?

• Robustness: How much is the endpoint influenced by other factors?

Page 13:

Select Outcome Measure

• Primary endpoint

Main hypothesis or core question; aim of the study. Statistics: confirmatory

• Secondary endpoints

Other interesting questions, additional endpoints. Statistics: exploratory (could be confirmatory in case of a large difference)

Advantage: prospective selection in the study protocol

• Retrospectively selected endpoints

Selected when the trial is done, based on subgroup differences. Statistics: ONLY exploratory!

Page 14:

Sample Size Calculation

Sample size

Certainty: α error, power

Difference to be detected

Page 15:

Statistical Testing

A statistical test is a method (or tool) to decide whether an observed difference* is really present or just based on variation by chance.

* This is true for a test for difference, which is the most frequently applied one in medicine.

Page 16:

Statistical Testing

Test for difference: "Intervention A is better than B"

Test for equivalence: "Intervention A and B have the same effect"

Test for non-inferiority: "Intervention A is not worse than B"

Page 17:

Statistical Testing

How a test procedure works

1. Want to show: there is a difference

2. Assume: there is NO difference between the groups ("equal effects", null hypothesis)

3. Try to disprove this assumption:
- perform study / experiment
- measure the difference

4. Calculate: the probability that such a difference could occur although the assumption ("no difference") was true = p-value

Page 18:

Statistical Testing

Statistical test for difference:

The p-value is the probability that a difference as large as the observed one occurs just by chance, i.e. although there is truly no difference

Page 19:

Statistical Testing

Statistical test for difference:

p is the probability for "no difference"

Page 20:

Statistical Testing

Null hypothesis: "Germany and Spain are equally strong soccer teams!"

Trial (n=6): Game tonight: 6 : 0 for Germany

Statistical test: p = 0.031

The p-value says: how big is the chance that one of two equally strong teams scores 6 goals, and the other one none?

Spain could still be as strong as Germany, but the chance is small (3.1%)
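The p-value in the soccer example can be reproduced with a simple two-sided sign-test argument. This is a sketch under the assumption that each of the 6 goals is equally likely to be scored by either team if the null hypothesis holds; the slide does not name the test it used.

```python
def p_all_goals_one_team(goals):
    """Probability that all goals go to the same team if both teams are equally strong."""
    one_specific_team = 0.5 ** goals  # e.g. Germany scores every goal
    return 2 * one_specific_team      # either team could be the one scoring

p = p_all_goals_one_team(6)
print(p)  # 0.03125, i.e. about 3.1%
```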

Page 21:

Statistical Testing

[Figure: p-values for combinations of small or large difference with small or large sample; values shown: p=0.68, p=0.05, p=0.05, p<0.001]

Page 22:

Statistical Testing

The more cases are included, the better "equality" can be disproved

Example: drug A has a success rate of 80%, while drug B is better with a success rate of 90%

sample size   drug A (80%)   drug B (90%)   p-value
20            8/10           9/10           0.53
40            16/20          18/20          0.38
100           40/50          45/50          0.16
200           80/100         90/100         0.048
400           160/200        180/200        0.005
1000          400/500        450/500        <0.001
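The slide does not state which test produced these p-values. A chi-square test without continuity correction appears to reproduce them, so the sketch below uses it as an assumption. For one degree of freedom the p-value can be written with `math.erfc`, so no external library is needed.

```python
import math

def chi2_p_value(success_a, n_a, success_b, n_b):
    """Uncorrected chi-square test for a 2x2 table (1 degree of freedom)."""
    a, b = success_a, n_a - success_a  # drug A: successes, failures
    c, d = success_b, n_b - success_b  # drug B: successes, failures
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom: P(chi-square > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2))

for per_group in (10, 20, 50, 100, 200, 500):
    success_a = per_group * 8 // 10  # 80% success with drug A
    success_b = per_group * 9 // 10  # 90% success with drug B
    p = chi2_p_value(success_a, per_group, success_b, per_group)
    print(f"{2 * per_group:5d}  {success_a}/{per_group}  {success_b}/{per_group}  p = {p:.3f}")
```

The same 10-point difference moves from clearly non-significant to highly significant purely because the number of cases grows.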

Page 23:

Statistical Testing

A "significant" p-value does NOT prove the size of the difference, but only excludes equality!

Page 24:

Statistical Testing

p-value

p-value large (> 0.05):

The observed difference is probably caused by chance only, or the sample size is not sufficient to exclude chance

The null hypothesis is maintained

"no difference"

p-value small (≤ 0.05):

Chance alone is not sufficient to explain this difference; there is a systematic difference

The null hypothesis is rejected

"significant difference"

Page 25:

Statistical Testing

Errors

The decision

- for a difference (significance, p ≤ 0.05)

- or against it ("equality", not significant, p > 0.05)

is not certain but only based on a probability (p-value). Therefore, errors are possible:

Type 1 error: decision for a difference although there is none => wrong finding

Type 2 error: decision for "equality" although there is a difference => missed finding

Page 26:

Statistical Testing

Errors

Test says ...      Truth: no difference            Truth: difference
significant        type 1 error (wrong finding)    correct
not significant    correct                         type 2 error (missed finding)

Page 27:

Statistical Testing

                 type 1 error ("wrong finding")              type 2 error ("missed finding")
Fire detector    wrong alarm                                 no alarm in case of fire
Court            conviction of an innocent                   set a criminal free
Clinical study   difference was "significant" by chance      difference was missed

Page 28:

Power

"What is the power of the study?"

Type 2 error β: probability to miss a difference

Power = 1 - β: probability to detect a difference

Power depends on:

- the magnitude of the difference
- the sample size
- the variation of the outcome measure
- the significance level (α)

Page 29:

Power

"What is the power of the study?"

POWER is the probability to detect a certain difference X with the given sample size n as significant (at level α).

"Does the study have enough power to detect a difference of size X?"

Page 30:

Power

When to perform power calculations?

1. Planning phase (sample size calculation): if the assumed difference really exists, what risk would I take to miss this difference?

2. Final analysis (in case of a non-significant result): what size of difference could be rejected with the present data?

Page 31:

Power

Example

Clinical trial: Laparoscopic versus open appendectomy

Endpoint: Maximum post-operative pain intensity (VAS 0-100 points)

Patients: 30 cases per group

Results: lap.: 28 (SD 18); open: 32 (SD 17)

p = 0.38, not significant!

What is the power of the study???
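The transcript does not contain the answer to this question. As a rough sketch, power for a two-group comparison of means can be approximated with the normal distribution. The common SD of 17.5 (between the two reported SDs) and the candidate differences of 5, 10 and 15 VAS points are assumptions made here for illustration, not values from the talk.

```python
import math
from statistics import NormalDist

def power_two_means(diff, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test for a difference of means."""
    se = sd * math.sqrt(2 / n_per_group)           # standard error of the difference
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, about 1.96
    # Chance that the test statistic exceeds the critical value (far tail ignored)
    return NormalDist().cdf(abs(diff) / se - z_alpha)

sd = 17.5  # assumed common SD, between the reported 18 and 17
for diff in (5, 10, 15):  # hypothetical true differences in VAS points
    print(diff, round(power_two_means(diff, sd, n_per_group=30), 2))
```

Under these assumptions, 30 cases per group give good power only for fairly large true differences, which is the point of the slide's question.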

Page 32:

Sample Size Calculation

Sample size

Certainty: α error, power

Difference to be detected

Page 33:

Sample Size Calculation

Sample size

α = 0.05, β = 0.20

Difference to be detected

α error: risk to find a difference by chance

β error: risk to miss a real difference

Page 34:

Sample Size Calculation

Sample size

α = 0.05, β = 0.20

P_T & P_C, or difference & SD

Event rates: percentages in the treatment group (P_T) and the control group (P_C)

Continuous measures: difference of means and standard deviation (SD)

Page 35:

Sample Size Calculation

Continuous Endpoints

SD unknown

If the variation (standard deviation) is not known, the expected advantage can be expressed as an "effect size", which is the difference in units of the (unknown) SD

Example:

• pain values are at least 1 SD below the control group (effect size = 1.0)

• the difference will be at least half an SD (effect size = 0.5)

Page 36:

Sample Size Calculation

Continuous Endpoints

Test with non-parametric rank statistics

• non-normal distribution, or non-metric values

• Mann-Whitney U-test; Wilcoxon test

Use the t-test for sample size calculation and add 10% of cases
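The effect-size idea and the 10% rule for rank tests can be combined in one small helper. This is a sketch using the normal approximation rather than an exact t-test calculation; the function name and the `rank_test` flag are made up for illustration.

```python
import math
from statistics import NormalDist

def n_per_group_effect_size(effect_size, alpha=0.05, power=0.80, rank_test=False):
    """Cases per group for a difference given in SD units (normal approximation)."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    n = 2 * z ** 2 / effect_size ** 2
    if rank_test:
        n *= 1.10  # rule of thumb from the slide: add 10% of cases
    return math.ceil(n)

print(n_per_group_effect_size(1.0))                  # 16
print(n_per_group_effect_size(0.5))                  # 63
print(n_per_group_effect_size(0.5, rank_test=True))  # 70
```

Halving the effect size roughly quadruples the number of cases, because the effect size enters the formula squared.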

Page 37:

Sample Size Calculation

Guess …

How many patients are needed to show that a new intervention is able to reduce the complication rate from 20% to 14%? (α=0.05; β=0.20, i.e. 80% power)
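The transcript does not include the answer to this question. One widely used approximation for two event rates is $n = (z_{1-\alpha/2} + z_{1-\beta})^2 \cdot [p_C (1 - p_C) + p_T (1 - p_T)] / (p_C - p_T)^2$ per group. The sketch below applies it; dedicated programs use slightly different formulas and may return somewhat different numbers.

```python
import math
from statistics import NormalDist

def n_per_group_rates(p_control, p_treatment, alpha=0.05, power=0.80):
    """Cases per group to detect a difference between two event rates."""
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return math.ceil(z ** 2 * variance / (p_control - p_treatment) ** 2)

# Complication rate reduced from 20% to 14%
print(n_per_group_rates(0.20, 0.14))  # 612
```

So roughly 600 patients per group, or about 1200 in total, would be needed under this approximation.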

Page 38:

Sample Size Calculation

Dupont WD, Plummer WD. Power and Sample Size Calculations: A Review and Computer Program. Contr. Clin. Trials (1990) 11:116-128

http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/PowerSampleSize

Page 39:

Sample Size Calculation

Page 40:

Multiple Testing

• More than one experimental / treatment group

• Several outcome measures

• Several follow-up time points

• Interim analyses

• Subgroup analyses

Multiple testing increases the risk of arbitrary significant results

Overall statistical error in 8 tests at the 0.05 level:

$\alpha = 1 - 0.95^{8} = 1 - 0.66 = 0.34$

Page 41:

Multiple Testing

                                  correct    at least 1 error
• 1 test (with 5% error)          95%        5%
• 2 tests (with 5% error each)    90.25%     9.75%
• 3 tests
• 4 tests
• 5 tests
• …..

[Figure: outcomes for 2 tests: both correct 90.25%, error in one test only 4.75%, error in the other test only 4.75%, error in both 0.25%]

Page 42:

Multiple Testing

                                  correct    at least 1 error
• 1 test (with 5% error)          95%        5%
• 2 tests (with 5% error each)    90.2%      9.8%
• 3 tests                         85.7%      14.3%
• 4 tests                         81.5%      18.5%
• 5 tests                         77.4%      22.6%
• …..
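The "at least 1 error" column follows from $1 - (1 - \alpha)^k$ for k tests. The short sketch below recomputes it, assuming the tests are independent; the function name is illustrative.

```python
def familywise_error(k, alpha=0.05):
    """Probability of at least one type 1 error in k independent tests."""
    return 1 - (1 - alpha) ** k

for k in (1, 2, 3, 4, 5, 8):
    all_correct = (1 - 0.05) ** k
    print(f"{k} tests: correct {100 * all_correct:.1f}%, "
          f"at least 1 error {100 * familywise_error(k):.1f}%")
```

With 8 tests the overall error comes out at about 34%, matching the earlier slide.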

Page 43:

Multiple Testing

What could you do?

• Select ONE primary and multiple secondary questions

• Combination of endpoints
- multiple complications: "negative event"
- multiple time points: AUC, maximum value, time to normal
- multiple endpoints: sum score acc. to O'Brien

• Adjustment of p-values, i.e. each endpoint is tested with a "stricter" α level, e.g. Bonferroni: k tests at level α / k (5 tests at the 1% level, instead of 1 test at the 5% level)

• A priori ordered hypotheses: predefine the order of tests (each at 5% level)
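The Bonferroni rule is simple enough to sketch directly. The p-values below are hypothetical and serve only to show how the stricter level is applied; the overall-error check assumes independent tests.

```python
def bonferroni_level(alpha, k):
    """Significance level for each of k tests so that the overall level stays near alpha."""
    return alpha / k

per_test = bonferroni_level(0.05, 5)   # 0.01, i.e. 5 tests at the 1% level
overall = 1 - (1 - per_test) ** 5      # overall error, just under 0.05

# Hypothetical p-values for 5 endpoints, for illustration only
p_values = [0.003, 0.020, 0.049, 0.300, 0.008]
significant = [p <= per_test for p in p_values]
print(per_test, round(overall, 3), significant)
```

Endpoints with p-values between 0.01 and 0.05 would count as significant in a single test but not after the adjustment.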

Page 44:

Interim Analysis

• Fixed sample size: end of trial

• Sequential design: after each case

• Group sequential design: after each step

• Adaptive design: after each step

Page 45:

Interim Analysis

From: TR Fleming, DP Harrington, PC O'Brien. Design of group sequential tests. Contr. Clin. Trials (1984) 5:348-361