
Overview of Biostatistical Methods

Dec 31, 2015


Simon McKenzie

Transcript
Page 1: Overview of  Biostatistical  Methods

Overview of Biostatistical Methods

Randomized Clinical Trials (RCT)

Page 2: Overview of  Biostatistical  Methods

Randomized Clinical Trials (RCT)

~ GOLD STANDARD ~ Designed to compare two or more treatment groups for a statistically significant difference between them (i.e., beyond random chance), often measured via a “p-value” (e.g., p < .05).

Examples: Drug vs. Placebo, Drugs vs. Surgery, New Tx vs. Standard Tx

Let X = cholesterol level (mg/dL).

Experiment: Patients satisfying inclusion criteria → RANDOMIZE → Treatment Arm and Control Arm (RANDOM SAMPLES) → End of Study → T-test, F-test (ANOVA)

[Figure: possible expected distributions of X in the Treatment population and the Control population, with means μ1 and μ2]

H0: μ1 = μ2. Is the difference significant?
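As a rough illustration of the T-test step at the end of the study, the sketch below computes the pooled two-sample t statistic for H0: μ1 = μ2. The cholesterol values and the function name pooled_t are made up for illustration and do not come from the slides; the sketch assumes two independent samples with equal population variances.

```python
import math
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Pooled two-sample t statistic and its degrees of freedom.

    Assumes independent samples and equal population variances.
    """
    n1, n2 = len(sample1), len(sample2)
    # Pooled estimate of the common variance (statistics.variance uses n - 1).
    s2_pooled = ((n1 - 1) * variance(sample1)
                 + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    std_error = math.sqrt(s2_pooled * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2)) / std_error
    return t, n1 + n2 - 2


# Hypothetical cholesterol levels (mg/dL), for illustration only.
treatment = [190, 200, 210, 220, 230]
control = [220, 230, 240, 250, 260]

t_stat, df = pooled_t(treatment, control)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

For these made-up numbers t = -3.0 on 8 degrees of freedom. Since |t| exceeds the two-sided 5% critical value of about 2.306, the difference would be called significant at p < .05.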

Page 3: Overview of  Biostatistical  Methods

Randomized Clinical Trials (RCT)

Examples: Drug vs. Placebo, Drugs vs. Surgery, New Tx vs. Standard Tx

Let X = cholesterol level (mg/dL).

Experiment: Patients satisfying inclusion criteria → Pre-Tx Arm and Post-Tx Arm (PAIRED SAMPLES) → End of Study → Paired T-test, ANOVA F-test (“repeated measures”)

[Figure: distributions of X in the Pre-Tx population and the Post-Tx population, with means μ1 and μ2]

H0: μ1 = μ2. Is the difference from baseline, on the same patients, significant?
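For paired samples the test works on the within-patient differences rather than on the two arms separately. A minimal sketch, using made-up pre- and post-treatment values and a hypothetical helper named paired_t:

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """Paired t statistic for H0: mean difference = 0, with its df."""
    if len(pre) != len(post):
        raise ValueError("paired samples must have the same length")
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    # Standard error of the mean difference.
    std_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / std_error, n - 1


# Hypothetical cholesterol levels (mg/dL) on the same five patients.
pre_tx = [240, 250, 260, 270, 280]
post_tx = [230, 235, 250, 265, 270]

t_paired, df_paired = paired_t(pre_tx, post_tx)
print(f"t = {t_paired:.3f} on {df_paired} degrees of freedom")
```

Here every patient serves as their own control, so the only variability entering the standard error is that of the five differences.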

Page 4: Overview of  Biostatistical  Methods

Randomized Clinical Trials (RCT)

Examples: Drug vs. Placebo, Drugs vs. Surgery, New Tx vs. Standard Tx

Let T = survival time (months), with survival probability S(t) = P(T > t).

[Figure: Kaplan-Meier estimates of the population survival curves S1(t) (Treatment) and S2(t) (Control); horizontal axis T, vertical axis survival probability from 0 to 1; the region between the curves is labeled “AUC difference”]

H0: S1(t) = S2(t). Is the AUC difference significant?

End of Study → Log-Rank Test, Cox Proportional Hazards Model
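To make the Kaplan-Meier estimate of S(t) = P(T > t) concrete, here is a small product-limit sketch for a single arm. The follow-up times, the censoring flags, and the function name kaplan_meier are illustrative assumptions, not data from the slides.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of S(t) at each distinct death time.

    times  : follow-up time for each patient (e.g., months)
    events : 1 if the death was observed, 0 if the patient was censored
    Returns a list of (time, estimated survival probability) pairs.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical survival times (months); 0 marks a censored patient.
km_times = [2, 3, 3, 5, 8, 10]
km_events = [1, 1, 0, 1, 0, 1]

km_curve = kaplan_meier(km_times, km_events)
for t, s in km_curve:
    print(f"S({t}) = {s:.3f}")
```

Running the same function on each arm gives the two step curves that the Log-Rank Test then compares.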

Page 5: Overview of  Biostatistical  Methods

Overview of Biostatistical Methods

Case-Control studies

Cohort studies

Page 6: Overview of  Biostatistical  Methods

Observational study designs that test for a statistically significant association between a disease D and exposure E to a potential risk (or protective) factor, measured via “odds ratio,” “relative risk,” etc.

Example: Lung cancer / Smoking

Case-Control studies
PRESENT: D+ vs. D– (cases vs. controls, the reference group)
PAST: E+ vs. E– ?
• relatively easy and inexpensive
• subject to faulty records, “recall bias”

Cohort studies
PRESENT: E+ vs. E–
FUTURE: D+ vs. D– ?
• measures direct effect of E on D
• expensive, extremely lengthy…
• Example: Framingham, MA study

Both types of study yield a 2 × 2 “contingency table” for binary variables D and E:

        D+       D–
E+      a        b        a + b
E–      c        d        c + d
        a + c    b + d    n

where a, b, c, d are the observed counts of individuals in each cell.

H0: No association between D and E.

End of Study → Chi-squared Test, or McNemar Test (for paired case-control study designs)
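The quantities named on this slide can all be computed directly from the four cell counts. The sketch below does so for made-up counts; the function name two_by_two is hypothetical, and the chi-squared statistic is the uncorrected (no continuity correction) version with 1 degree of freedom. Note that the relative risk is only meaningful for cohort-type sampling, while the odds ratio can be used for either design.

```python
import math


def two_by_two(a, b, c, d):
    """Odds ratio, relative risk, chi-squared statistic and p-value.

    Layout follows the slide: rows E+, E-; columns D+, D-.
    """
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    # Risk of disease among the exposed divided by risk among the unexposed.
    relative_risk = (a / (a + b)) / (c / (c + d))
    # Shortcut formula for the 2 x 2 chi-squared statistic (no correction).
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, relative_risk, chi2, p_value


# Hypothetical counts, for illustration only.
or_hat, rr_hat, chi2_stat, p_val = two_by_two(a=30, b=70, c=10, d=90)
print(f"OR = {or_hat:.3f}, RR = {rr_hat:.3f}")
print(f"chi-squared = {chi2_stat:.2f}, p = {p_val:.5f}")
```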

Page 7: Overview of  Biostatistical  Methods

As seen, testing for association between categorical variables, such as disease D and exposure E, can generally be done via a Chi-squared Test.

But what if the two variables, say X and Y, are numerical measurements?

Furthermore, if sample data does suggest that an association exists, what is the nature of that association, and how can it be quantified, or modeled via Y = f(X)?

[Figure: scatterplot of Y against X]

JAMA. 2003;290:1486-1493

Correlation Coefficient r measures the strength of linear association between X and Y, on a scale from –1 to +1:
• r < 0: negative linear correlation
• r > 0: positive linear correlation
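For reference, the sample correlation coefficient r can be computed from sums of squares and cross-products, as in this sketch. The data points and the function name pearson_r are made up for illustration; they are not the JAMA data shown in the scatterplot.

```python
import math


def pearson_r(xs, ys):
    """Sample (Pearson) correlation coefficient between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical measurements, for illustration only.
r_example = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
r_perfect_negative = pearson_r([1, 2, 3], [3, 2, 1])
print(f"r = {r_example:.3f}")
print(f"r = {r_perfect_negative:.1f}")
```

The second call shows the extreme case: points lying exactly on a decreasing line give r = -1.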

Page 10: Overview of  Biostatistical  Methods

Correlation Coefficient r measures the strength of linear association between X and Y.

For this example, r = –0.387 (weak, negative linear correlation).

Page 11: Overview of  Biostatistical  Methods

For this example, r = –0.387 (weak, negative linear correlation).

Regression Methods

Simple Linear Regression gives the “best” line that fits the data: the unique line that minimizes the sum of the squared residuals.

[Figure: scatterplot with a candidate line and its residuals]

Page 12: Overview of  Biostatistical  Methods

For this example, r = –0.387 (weak, negative linear correlation).

Regression Methods

Simple Linear Regression gives the “least squares” regression line, the unique line that minimizes the sum of the squared residuals:

Y = 8.790 – 4.733 X (p = .0055)

[Figure: scatterplot with the fitted line and its residuals]

It can also be shown that the proportion of total variability in the data that is accounted for by the line is equal to r², which in this case is (–0.387)² ≈ 0.150 (15%)... very small.
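The least squares line and the proportion of variability it explains can be sketched as follows. The data are made up for illustration (the slide's own data set is not reproduced in this transcript), and least_squares is a hypothetical helper name.

```python
def least_squares(xs, ys):
    """Intercept, slope and r-squared of the least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    # Residual = observed Y minus the Y predicted by the line.
    ss_residual = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_residual / ss_total
    return intercept, slope, r_squared


# Hypothetical measurements, for illustration only.
b0, b1, r2 = least_squares([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"Y = {b0:.3f} + {b1:.3f} X, r-squared = {r2:.3f}")
```

For simple linear regression the r-squared computed this way equals the square of the correlation coefficient, which is the identity the slide relies on.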

Page 13: Overview of  Biostatistical  Methods

Overview of Biostatistical Methods

Extensions of Simple Linear Regression

• Polynomial Regression – predictors X, X², X³,…

• Multilinear Regression – independent predictors X1, X2,…, w/o or w/ interaction (e.g., X5 × X8)

• Logistic Regression – binary response Y (= 0 or 1)

• Transformations of data, e.g., semi-log, log-log,…

• Generalized Linear Models

• Nonlinear Models

• many more…
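As one example of the "transformations of data" item, a semi-log transformation turns an exponential relationship Y = a·exp(bX) into a straight line, ln Y = ln a + bX, which ordinary least squares can then fit. The sketch below assumes noise-free, made-up data; fit_line and fit_exponential are hypothetical helper names.

```python
import math


def fit_line(xs, ys):
    """Least squares intercept and slope for ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    return mean_y - slope * mean_x, slope


def fit_exponential(xs, ys):
    """Fit Y = a * exp(b * X) by regressing ln(Y) on X (requires Y > 0)."""
    intercept, slope = fit_line(xs, [math.log(y) for y in ys])
    return math.exp(intercept), slope


# Hypothetical data generated exactly from Y = 2 * exp(0.5 * X).
x_vals = [0, 1, 2, 3, 4]
y_vals = [2.0 * math.exp(0.5 * x) for x in x_vals]

a_hat, b_hat = fit_exponential(x_vals, y_vals)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

A log-log transformation works the same way for power laws, with ln X replacing X as the predictor.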

Page 14: Overview of  Biostatistical  Methods

Summary ~ Numerical (Quantitative) data, e.g., $ Annual Income

[Figure: distributions of X in two populations with means μ1, μ2 and standard deviations σ1, σ2; Sample 1: n1, x̄1, s1; Sample 2: n2, x̄2, s2]

H0: μ1 = μ2

Independent samples, e.g., RCT

2 POPULATIONS:
Normally distributed? (check via Q-Q plots, Shapiro-Wilk, Anderson-Darling, others…)
• Yes → Equal variances? (check via F-test, Bartlett, others…)
    • Yes → 2-sample T (w/ pooling)
    • No → “Approximate” T (Satterthwaite, Welch)
• No → n1 ≥ 30, n2 ≥ 30?
    • Yes → 2-sample T (w/o pooling)
    • No → “Nonparametric Tests”: Wilcoxon Rank Sum (aka Mann-Whitney U)

More than 2 POPULATIONS:
• ANOVA F-test
• Regression Methods
• Various modifications
• Kruskal-Wallis (nonparametric)

Paired (Matched) samples, e.g., Pre- vs. Post-

2 POPULATIONS:
Normally distributed?
• Yes → Paired T
• No → “Nonparametric Tests”: Sign Test, Wilcoxon Signed Rank

More than 2 POPULATIONS:
• ANOVA F-test (w/ “repeated measures” or “blocking”)
• Friedman, Kendall’s W, others… (nonparametric)
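The two-population branch of this summary is essentially a small decision procedure, which can be written down as a lookup function. This is only a sketch of one reading of the flowchart: the function name choose_two_sample_test is hypothetical, and the sample-size rule is assumed to mean that both n1 and n2 are at least 30.

```python
def choose_two_sample_test(paired, normal, equal_variances=False,
                           n1=0, n2=0):
    """Suggest a test for H0: mu1 = mu2 with two populations.

    paired          : True for matched / pre-vs-post designs
    normal          : True if the data look normally distributed
    equal_variances : True if the two variances look equal
    n1, n2          : sample sizes (used only when normality fails)
    """
    if paired:
        if normal:
            return "Paired T"
        return "Sign Test or Wilcoxon Signed Rank"
    if normal:
        if equal_variances:
            return "2-sample T (w/ pooling)"
        return "Approximate T (Satterthwaite, Welch)"
    # Assumed reading of the slide: large samples rescue the T-test.
    if n1 >= 30 and n2 >= 30:
        return "2-sample T (w/o pooling)"
    return "Wilcoxon Rank Sum (Mann-Whitney U)"


print(choose_two_sample_test(paired=False, normal=True, equal_variances=True))
print(choose_two_sample_test(paired=False, normal=False, n1=12, n2=15))
print(choose_two_sample_test(paired=True, normal=False))
```

In practice the normality and equal-variance inputs would themselves come from the diagnostic checks listed on the slide (Q-Q plots, Shapiro-Wilk, F-test, Bartlett, and so on).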

Page 15: Overview of  Biostatistical  Methods

Summary ~ Categorical (Qualitative) data, e.g., Income Level: Low, Mid, High

≥ 2 CATEGORIES per each of two variables, I and J: r × c contingency table

              J
        1   2   3   • • •   c
I   1
    2
    3
    •
    •
    r

H0: “There is no association between (the categories of) I and (the categories of) J.”

Chi-squared Tests
• Test of Independence (1 population, 2 categorical variables)
• Test of Homogeneity (2 populations, 1 categorical variable)
• “Goodness-of-Fit” Test (1 population, 1 categorical variable)

Modifications
• McNemar Test for paired 2 × 2 categorical data, to control for “confounding variables,” e.g., case-control studies
• Fisher’s Exact Test for small “expected values” (< 5), to avoid possible “spurious significance”
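The "expected values" referred to here are the cell counts one would expect under H0, computed as (row total × column total) / n. The sketch below computes them for an r × c table along with the chi-squared statistic and its degrees of freedom, and flags any expected count below 5. The counts and the function name chi_squared_statistic are illustrative assumptions.

```python
def chi_squared_statistic(table):
    """Chi-squared statistic for an r x c table of observed counts.

    Returns (statistic, degrees of freedom, expected counts,
    flag that is True if any expected count is below 5).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under H0: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((obs - exp) ** 2 / exp
               for obs_row, exp_row in zip(table, expected)
               for obs, exp in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    any_small = any(exp < 5 for exp_row in expected for exp in exp_row)
    return chi2, df, expected, any_small


# Hypothetical 2 x 3 table (e.g., two groups by Low / Mid / High income).
observed = [[20, 30, 50],
            [30, 30, 40]]

chi2_rc, df_rc, expected_rc, small_rc = chi_squared_statistic(observed)
print(f"chi-squared = {chi2_rc:.3f} on {df_rc} degrees of freedom")
print("any expected count below 5:", small_rc)
```

For these made-up counts the statistic is about 3.11 on 2 degrees of freedom, below the 5% critical value of about 5.991, so H0 would not be rejected. When the flag is True, Fisher's Exact Test is the usual fallback for 2 × 2 tables.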

Page 16: Overview of  Biostatistical  Methods

Introduction to Basic Statistical Methods

Part 1: Statistics in a Nutshell

UWHC Scholarly Forum
May 21, 2014

Ismor Fischer, Ph.D.
UW Dept of [email protected]

Part 2: Overview of Biostatistics: “Which Test Do I Use??”

Sincere thanks to…

• Judith Payne

• Heidi Miller

• Samantha Goodrich

• Troy Lawrence

• YOU!

All slides posted at http://www.stat.wisc.edu/~ifischer/Intro_Stat/UWHC