University of Sydney Statistics 101: Power, p-values and ………... publications. Dr. Gordon S Doig, Senior Lecturer in Intensive Care, Northern Clinical School [email protected] www.EvidenceBased.net/talks

Dec 19, 2015

Transcript
Page 1:

University of Sydney

Statistics 101: Power, p-values and ………... publications.

Dr. Gordon S Doig, Senior Lecturer in Intensive Care, Northern Clinical School

[email protected]

www.EvidenceBased.net/talks

Page 2:

Analysis 101: The basic tests

• t-test
• paired t-test
• Wilcoxon Rank Sum test (Mann-Whitney U test)
• Wilcoxon Signed Rank Sum test
• Kolmogorov-Smirnov (one and two sample test)
• Chi-square test
• Fisher’s Exact test
• ANOVA
• Kruskal-Wallis rank test
• repeated measures ANOVA

Page 4:

Why do we need statistics???

When we conduct any type of research, we can make at least two major types of errors when we draw our conclusions:

I) we claim to have found an important treatment effect when in reality there is no treatment effect.

II) we claim that no treatment effect exists when in reality there is an important treatment effect.

Page 7:

Why do we need statistics???

Some important definitions:

What is a p-value?

What is power?

P-value: The probability of observing a difference at least as large as the one we observed if chance alone were at work (that is, if there were no real difference).

Power: The probability that if there is a real difference, our experiment will find it.

Page 8:

Why do we need statistics???

When we conduct any type of research, we can make at least two major types of errors when we draw our conclusions:

I) we can claim to have found an important treatment effect when in reality there is no treatment effect (a Type I error).

II) we can claim that no treatment effect exists when in reality there is an important treatment effect (a Type II error).

P-value: The probability of observing a difference at least as large as the one we observed if chance alone were at work (that is, if there were no real difference).

Power: The probability that if there is a real difference, our experiment will find it.
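
The link between the two error types and these two definitions can be illustrated with a small simulation. This is only a sketch: the two-group z-test with a known standard deviation of 1, the group size of 64, the effect size of 0.5 and all function names are illustrative assumptions, not part of the talk.

```python
import random
from statistics import NormalDist

def two_sided_p(group_a, group_b, sd=1.0):
    """Two-sided p-value from a z-test, assuming a known common SD (an assumption)."""
    n_a, n_b = len(group_a), len(group_b)
    se = sd * (1 / n_a + 1 / n_b) ** 0.5
    z = (sum(group_a) / n_a - sum(group_b) / n_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejection_rate(true_effect, n_per_group=64, alpha=0.05, n_trials=2000, seed=1):
    """Fraction of simulated two-group trials that end with p < alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        if two_sided_p(treated, control) < alpha:
            rejections += 1
    return rejections / n_trials

# No real effect: every "significant" result is a Type I error.
type_1_rate = rejection_rate(true_effect=0.0)
# Real effect of 0.5 SD: the rejection rate estimates the power.
power = rejection_rate(true_effect=0.5)
print(f"Type I error rate ~ {type_1_rate:.3f}, power ~ {power:.3f}")
```

Under these assumptions the first rate should sit near the chosen alpha, and the second estimates how often a trial of this size would detect the assumed effect.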

Page 11:

Sample size calculations: The use of Power

Every experiment should start with a sample size calculation.

• Having adequate power protects us from Type II errors.

• Forces us to consider a primary outcome for our experiment.

• primary outcomes can be continuous, categorical (interval, ordered, unordered), dichotomous

• Should consider issues of design in order to simplify analysis.
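
As a rough illustration of a sample size calculation for a continuous primary outcome compared between two groups, the sketch below uses the standard Normal-approximation formula n = 2 (z_(1-alpha/2) + z_power)^2 sd^2 / delta^2 per group. The function name and the example numbers are assumptions chosen for illustration; any real calculation should be confirmed with dedicated software or a statistician.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Subjects needed in each of two groups (Normal approximation).

    delta: smallest difference in means worth detecting (hypothetical input)
    sd:    assumed standard deviation of the outcome in each group
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided Type I error
    z_power = NormalDist().inv_cdf(power)           # 1 - Type II error
    return ceil(2 * (z_alpha + z_power) ** 2 * sd ** 2 / delta ** 2)

# Hypothetical example: detect a 5-unit difference when the SD is 10
n = n_per_group(delta=5.0, sd=10.0)
print(n)  # 63 per group
```

Asking for more power, or for a smaller detectable difference, increases the required sample size.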

Page 12:

Analysis 101: The use of P???

Selection of appropriate study design / analytic technique:

• protects from Type I errors.

• is driven by a combination of study outcome and study design.

Page 15:

Analysis 101: Basics of experimental design

1) Before and after trial
• physiological parameter/outcome measured
• intervention delivered
• physiological parameter/outcome measured again
• compare measurement before with measurement after, usually in same subject

2) Comparison between two groups
• subjects are randomly assigned to one of two groups
• one group receives intervention
• compare outcome between two groups after intervention

3) Comparison between more than two groups
• as above but subjects are assigned to more than two groups
• could compare 3 different drugs or 3 different doses

Page 16:

Analysis 101: Outcome identification

Primary outcomes can be 1) continuous, 2) categorical (interval, ordered, unordered), 3) dichotomous

1) Continuous outcomes:
• most physiological parameters (Hb, pressures, biochemistry)
• usually involves a direct measurement
• often Normally distributed

Page 22:

Analysis 101: Outcome identification

2) Categorical outcomes:

a) interval
• equal unit change between each ordered category
• length of stay, age, time to event, some scoring systems
• may be Normally distributed

b) ordered
• unequal unit change between each ordered category
• most scoring systems, tumor stage or grade, low-medium-high
• not usually Normally distributed

c) unordered
• no sequential order to categories
• type of tumor, location, diagnosis
• re-think outcome selection!!!!

Page 23:

Analysis 101: Outcome identification

3) Dichotomous outcomes:
• only two possible outcome states
• tumor / no tumor
• dead / alive
• follows Binomial distribution

Page 24:

Analysis 101: The basic tests

• t-test
• paired t-test
• Wilcoxon Rank Sum test (Mann-Whitney U test)
• Wilcoxon Signed Rank Sum test
• Kolmogorov-Smirnov (one and two sample test)
• Chi-square test
• Fisher’s Exact test
• ANOVA
• Kruskal-Wallis rank test
• repeated measures ANOVA

Page 28:

Analysis 101: Design and Analysis

1) Before and after trial (same subjects, continuous and interval)

Step 1: Determine if outcome is Normally distributed
• plot histogram with density function line
• could ‘formally’ test using Shapiro-Wilk statistic

[Figure: histograms with density function lines for hina (x-axis 120 to 170, density 0.00 to 0.06) and hicreat (x-axis 0 to 800, density 0.000 to 0.005)]

Page 31:

Analysis 101: Design and Analysis

1) Before and after trial (same subjects, continuous and interval)

• Normally distributed outcome: paired t-test
• not Normally distributed: Wilcoxon Signed Rank Sum Test

NB - if ordered categorical outcome, use one sample Kolmogorov-Smirnov test

[Figure: histograms with density function lines for hina (x-axis 120 to 170, density 0.00 to 0.06) and hicreat (x-axis 0 to 800, density 0.000 to 0.005)]
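
The two before-and-after tests can be sketched with the standard library alone. Everything here is an illustrative assumption rather than material from the talk: the function names, the eight hypothetical before/after measurements, and the choice to obtain the Wilcoxon p-value by enumerating every sign pattern, which is only practical for small samples. A statistics package would normally be used instead.

```python
from itertools import product
from math import sqrt
from statistics import mean, stdev

def paired_t_statistic(before, after):
    """t statistic and degrees of freedom for a paired t-test."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]

def wilcoxon_signed_rank_exact(before, after):
    """W+ and an exact two-sided p-value found by enumerating all sign patterns."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # drop zero differences
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    centre = sum(ranks) / 2                  # expected W+ if there is no effect
    observed = abs(w_plus - centre)
    as_extreme = sum(
        abs(sum(r for r, s in zip(ranks, signs) if s) - centre) >= observed
        for signs in product((0, 1), repeat=len(diffs))
    )
    return w_plus, as_extreme / 2 ** len(diffs)

# Hypothetical measurements before and after an intervention in the same 8 subjects
before = [140, 152, 138, 160, 147, 155, 149, 158]
after = [143, 157, 139, 168, 149, 162, 153, 164]

t_stat, df = paired_t_statistic(before, after)
w_plus, p_exact = wilcoxon_signed_rank_exact(before, after)
print(f"paired t = {t_stat:.2f} on {df} df; W+ = {w_plus}, exact p = {p_exact:.4f}")
```

The t statistic would still need to be compared with a t distribution on the stated degrees of freedom to obtain its p-value.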

Page 34:

Analysis 101: Design and Analysis

2) Comparison between two groups (continuous and interval)

Step 1: Determine if outcome is Normally distributed
• plot histogram (use all data) with density function line
• could ‘formally’ test using Shapiro-Wilk statistic

[Figure: histograms with density function lines for hicreat (x-axis 0 to 800, density 0.000 to 0.005) and apache2 (x-axis 0 to 50, density 0.00 to 0.04)]

Page 37:

Analysis 101: Design and Analysis

2) Comparison between two groups (continuous and interval)

• Normally distributed outcome: t-test
• not Normally distributed: Wilcoxon Rank Sum test

NB - if ordered categorical outcome, use two sample Kolmogorov-Smirnov test

[Figure: histograms with density function lines for hicreat (x-axis 0 to 800, density 0.000 to 0.005) and apache2 (x-axis 0 to 50, density 0.00 to 0.04)]
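
For the non-parametric two-group comparison, the Mann-Whitney U form of the Wilcoxon Rank Sum test can be sketched as below. The function name, the hypothetical creatinine-like values and the use of a Normal approximation without tie correction are all assumptions for illustration; for samples this small a statistics package would usually report an exact p-value.

```python
from math import sqrt
from statistics import NormalDist

def mann_whitney_u(group_a, group_b):
    """U statistic for group_a and a two-sided p-value (Normal approximation)."""
    # Count the pairs in which the group_a value exceeds the group_b value;
    # ties count as half a pair.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0
            for a in group_a for b in group_b)
    n_a, n_b = len(group_a), len(group_b)
    mean_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)   # no correction for ties
    z = (u - mean_u) / sd_u
    return u, 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical long-tailed outcome in two randomly assigned groups
group_a = [62, 70, 75, 81, 88, 95]
group_b = [90, 110, 140, 210, 380, 640]

u_stat, p_approx = mann_whitney_u(group_a, group_b)
print(f"U = {u_stat}, approximate p = {p_approx:.4f}")
```

Because the test uses only the ordering of the values, the single very large value in the second group has no more influence than any other observation.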

Page 39:

Analysis 101: Design and Analysis

3) Comparison between more than two groups

Step 1: Determine if outcome is Normally distributed
• plot histogram (use all data) with density function line
• could ‘formally’ test using Shapiro-Wilk statistic

[Figure: histograms with density function lines for hina (x-axis 120 to 170, density 0.00 to 0.06) and hicreat (x-axis 0 to 800, density 0.000 to 0.005)]

Page 42:

Analysis 101: Design and Analysis

3) Comparison between more than two groups

• Normally distributed outcome: ANOVA
• not Normally distributed: Kruskal-Wallis rank test

NB - could transform (calculate the log or ln of) each outcome value and redo the histogram. If the transformed values are Normally distributed, can now use ‘more powerful’ ANOVA (or t-test if 2 samples).

[Figure: histograms with density function lines for hina (x-axis 120 to 170, density 0.00 to 0.06) and hicreat (x-axis 0 to 800, density 0.000 to 0.005)]
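
A minimal sketch of the Kruskal-Wallis rank test for three groups follows. The function name and the three hypothetical dose groups are assumptions; the statistic omits the correction for ties, and the p-value uses the chi-square approximation with 2 degrees of freedom, which has the closed form exp(-H/2) and therefore applies only when exactly three groups are compared.

```python
from math import exp

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of groups."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)

    def rank(x):
        # Rank within the pooled data; tied values share their average rank.
        return sum(v < x for v in pooled) + (sum(v == x for v in pooled) + 1) / 2

    rank_term = sum(sum(rank(x) for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

# Hypothetical long-tailed outcome measured in three dose groups
low = [58, 64, 71, 80]
medium = [85, 97, 120, 150]
high = [190, 260, 410, 700]

h_stat = kruskal_wallis_h([low, medium, high])
p_approx = exp(-h_stat / 2)   # chi-square survival function for 2 df (3 groups only)
print(f"H = {h_stat:.3f}, approximate p = {p_approx:.4f}")
```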

Page 46:

Analysis 101: Dichotomous outcomes

1) Before and after trial
• rate before intervention compared to rate after intervention
• McNemar’s chi-square

2) Comparison between two groups
• create 2x2 table, calculate rate for each Group

          Dead   Alive
Group A    2      8      20% mortality
Group B    7      3      70% mortality

• compare using chi-square test
NB - if any one cell contains < 5 counts, use Fisher’s Exact test

3) Comparison between more than two groups
• undertake a series of comparisons via 2x2 tables as above
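
The 2x2 table from this slide (Group A: 2 dead, 8 alive; Group B: 7 dead, 3 alive) can be run through both tests with the standard library. The function names are assumptions, the chi-square is computed without a continuity correction, and the two-sided Fisher p-value uses the common convention of summing every table that is no more probable than the observed one; statistics packages may differ on these details.

```python
from math import comb, erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value for [[a, b], [c, d]]."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = sum(
        (observed[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
        for i in range(2) for j in range(2)
    )
    return stat, erfc(sqrt(stat / 2))   # survival function of chi-square with 1 df

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c

    def weight(k):
        # Unnormalised hypergeometric probability of k in the top-left cell
        return comb(row1, k) * comb(row2, col1 - k)

    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    weights = [weight(k) for k in range(k_min, k_max + 1)]
    return sum(w for w in weights if w <= weight(a)) / sum(weights)

# Table from the slide: rows are Group A and Group B, columns are Dead and Alive
chi_stat, p_chi = chi_square_2x2(2, 8, 7, 3)
p_fisher = fisher_exact_2x2(2, 8, 7, 3)
print(f"chi-square = {chi_stat:.2f}, p = {p_chi:.3f}; Fisher exact p = {p_fisher:.3f}")
```

For this table the uncorrected chi-square p-value falls below 0.05 while the exact p-value does not, which shows why the small-cell rule on the slide matters.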

Page 47:

Analysis 101: Special considerations

Transformations:

Sometimes it’s possible to ‘transform’ a long-tailed distribution to a Normal distribution.

Calculate the log or ln of each outcome value and redo histogram.

Allows us to apply ‘more powerful’ tests based on assumption of Normality (paired t-test, t-test, ANOVA).

Try non-parametric test first <- fewer assumptions!!!!

[Figure: histogram with density function line for hicreat (x-axis 0 to 800, density 0.000 to 0.005)]
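
The effect of a log transformation on a long-tailed outcome can be sketched with simulated data. The data below are hypothetical and log-normal by construction, so the transformation is guaranteed to work here; real outcomes may not behave so well. Sample skewness is used only as a quick numerical stand-in for inspecting the histogram and is not a method named in the talk.

```python
import random
from math import log
from statistics import mean, pstdev

def skewness(values):
    """Sample skewness: near 0 for symmetric data, positive for a long right tail."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

rng = random.Random(42)
# Simulated long-tailed outcome (hypothetical values, not the hicreat data)
raw = [rng.lognormvariate(4.5, 0.8) for _ in range(500)]
logged = [log(v) for v in raw]   # natural log of each outcome value

skew_raw, skew_log = skewness(raw), skewness(logged)
print(f"skewness before: {skew_raw:.2f}, after log transform: {skew_log:.2f}")
```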

Page 50:

Analysis 101: Special considerations

The t-test has 3 basic, fundamental underlying assumptions:

1) Outcomes are Normally distributed

• test assumptions of Normality

• use non-parametric tests

2) Outcomes are independent

• if outcomes are from same subjects, use paired t-test

3) The variance of each group is similar

• stats package should formally test equality of variances

• different p-values for each condition
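
The third assumption can be made concrete by computing the t statistic both ways: with a pooled variance (equal variances assumed) and with Welch's version (equal variances not assumed). The function names and the two hypothetical groups are assumptions for illustration; only the statistics and degrees of freedom are computed, since the p-values would come from a t distribution in a statistics package.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x, y):
    """Student's t statistic and df, assuming both groups share one variance."""
    n_x, n_y = len(x), len(y)
    pooled_var = ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / (n_x + n_y - 2)
    t = (mean(x) - mean(y)) / sqrt(pooled_var * (1 / n_x + 1 / n_y))
    return t, n_x + n_y - 2

def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite df, variances not assumed equal."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df

# Hypothetical groups with very different spread and unequal size
group_a = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10]
group_b = [4, 16, 22, 7, 19]

t_pooled, df_pooled = pooled_t(group_a, group_b)
t_welch, df_welch = welch_t(group_a, group_b)
print(f"pooled: t = {t_pooled:.2f} on {df_pooled} df")
print(f"Welch:  t = {t_welch:.2f} on {df_welch:.1f} df")
```

When the variances and group sizes differ this much, the two versions give different statistics and degrees of freedom, and therefore different p-values.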

Page 51:

Analysis 101: Summary

• t-test (two groups, Normally distributed)
• paired t-test (before/after, Normally distributed)
• Wilcoxon Rank Sum test (two groups, non-parametric)
• Wilcoxon Signed Rank Sum test (before/after, non-parametric)
• Kolmogorov-Smirnov (before/after, two groups, ordered categorical)
• Chi-square test (dichotomous outcome)
• Fisher’s Exact test (dichotomous outcome, any cell size < 5)
• ANOVA (more than two groups, Normally distributed)
• Kruskal-Wallis rank test (more than two groups, non-parametric)
• repeated measures ANOVA
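
The summary can be restated as a simple lookup from study design and outcome type to the suggested test. The table keys, the function name and the `min_cell_count` parameter are hypothetical labels introduced here; the McNemar entry comes from the dichotomous outcomes slide rather than this summary, and repeated measures ANOVA is omitted because the talk does not say when to use it.

```python
# (study design, outcome type) -> test suggested in the talk
TEST_TABLE = {
    ("before/after", "normal"): "paired t-test",
    ("before/after", "non-parametric"): "Wilcoxon Signed Rank Sum test",
    ("before/after", "ordered categorical"): "Kolmogorov-Smirnov (one sample)",
    ("before/after", "dichotomous"): "McNemar's chi-square",
    ("two groups", "normal"): "t-test",
    ("two groups", "non-parametric"): "Wilcoxon Rank Sum test",
    ("two groups", "ordered categorical"): "Kolmogorov-Smirnov (two sample)",
    ("two groups", "dichotomous"): "Chi-square test",
    ("more than two groups", "normal"): "ANOVA",
    ("more than two groups", "non-parametric"): "Kruskal-Wallis rank test",
}

def choose_test(design, outcome, min_cell_count=None):
    """Look up the suggested test; switch to Fisher's Exact test for small cells."""
    test = TEST_TABLE[(design, outcome)]
    if test == "Chi-square test" and min_cell_count is not None and min_cell_count < 5:
        return "Fisher's Exact test"
    return test

print(choose_test("two groups", "dichotomous", min_cell_count=2))
print(choose_test("before/after", "normal"))
```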

www.EvidenceBased.net/talks