Overview of Biostatistical Methods
Posted on 23-Feb-2016
Randomized Clinical Trials (RCT)
Experiment (~ GOLD STANDARD ~): designed to compare two or more treatment groups for a statistically significant difference between them, i.e., beyond random chance, often measured via a “p-value” (e.g., p < .05).
Examples: Drug vs. Placebo, Drugs vs. Surgery, New Tx vs. Standard Tx
Let X = decrease (–) in cholesterol level (mg/dL).
Design: Patients satisfying inclusion criteria, then RANDOMIZE into Treatment Arm and Control Arm (RANDOM SAMPLES), followed until End of Study.
H0: μ1 = μ2 (is the observed difference between the group means significant?)
Tests at End of Study: T-test (two groups), F-test (ANOVA)
[Figure: possible expected distributions of X in the treatment and control populations]
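The two-arm comparison above can be sketched in a few lines. This is only an illustration under stated assumptions: the cholesterol decreases are made-up numbers, the helper name `pooled_t_statistic` is hypothetical, and the equal-variance (pooled) form of the t statistic is assumed. Turning the statistic into a p-value requires the t distribution, which the Python standard library does not provide, so that step is left out.

```python
import math
import statistics

def pooled_t_statistic(treatment, control):
    """Two-sample t statistic (equal-variance form) and its degrees of freedom."""
    n1, n2 = len(treatment), len(control)
    mean1, mean2 = statistics.mean(treatment), statistics.mean(control)
    var1, var2 = statistics.variance(treatment), statistics.variance(control)
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, n1 + n2 - 2

# Hypothetical decreases in cholesterol (mg/dL); not data from the slides.
treatment_arm = [22.0, 30.0, 25.0, 28.0, 35.0, 26.0]
control_arm = [10.0, 14.0, 8.0, 15.0, 12.0, 13.0]

t_stat, df = pooled_t_statistic(treatment_arm, control_arm)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

A large |t| relative to the t distribution with `df` degrees of freedom is evidence against H0: μ1 = μ2.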
Randomized Clinical Trials (RCT): survival outcomes
Let T = survival time (months).
Survival function: S(t) = P(T > t), the survival probability at time t.
Population survival curves: S1(t) (Treatment) and S2(t) (Control), estimated at End of Study by Kaplan-Meier estimates.
H0: S1(t) = S2(t) (is the difference between the curves, e.g., the AUC difference, significant?)
Tests at End of Study: Log-Rank Test, Cox Proportional Hazards Model
[Figure: survival probability S(t), from 0 to 1, plotted against time T for the Treatment and Control groups]
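As a rough illustration of how a Kaplan-Meier estimate of S(t) is built, the sketch below uses the standard product-limit formula: at each observed event time, the running survival probability is multiplied by (1 - deaths / number at risk). The function name `kaplan_meier` and the follow-up data are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time.

    times  -- follow-up time for each patient
    events -- 1 if the event (death) was observed, 0 if the patient was censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    survival = 1.0
    curve = []
    for t in event_times:
        # Everyone whose follow-up reaches time t is still at risk at t.
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical survival times in months for one arm; 0 marks a censored patient.
months = [2, 3, 3, 5, 8, 8, 12, 12]
observed = [1, 1, 0, 1, 1, 0, 0, 0]

km_curve = kaplan_meier(months, observed)
for t, s in km_curve:
    print(f"S({t}) = {s:.3f}")
```

Running the same function on each arm gives the two step curves that the Log-Rank Test then compares.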
Case-Control studies and Cohort studies
Observational study designs that test for a statistically significant association between a disease D and exposure E to a potential risk (or protective) factor, measured via “odds ratio,” “relative risk,” etc.
Example: Lung cancer / Smoking

Case-Control studies (PRESENT to PAST): start with D+ vs. D– (cases vs. controls, the reference group) and look back: E+ vs. E–?
• relatively easy and inexpensive
• subject to faulty records, “recall bias”

Cohort studies (PRESENT to FUTURE): start with E+ vs. E– and follow forward: D+ vs. D–?
• measures direct effect of E on D
• expensive, extremely lengthy…
• Example: Framingham, MA study

Both types of study yield a 2 × 2 “contingency table” of data:

          D+        D–
E+        a         b         a + b
E–        c         d         c + d
          a + c     b + d     n

where a, b, c, d are the numbers of individuals in each cell.

H0: No association between D and E.
Tests at End of Study: Chi-squared Test, McNemar Test
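The measures named above can be computed directly from the four cell counts. The sketch below is a minimal illustration: the counts are invented, the function name `two_by_two_summary` is hypothetical, and the chi-squared statistic is the uncorrected Pearson form with 1 degree of freedom (no continuity correction). Note that the relative risk is normally reported for cohort data, while the odds ratio is the usual measure for case-control data.

```python
import math

def two_by_two_summary(a, b, c, d):
    """Odds ratio, relative risk, chi-squared statistic and p-value for a 2 x 2 table.

    Rows are E+ (a, b) and E- (c, d); columns are D+ and D-.
    """
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    relative_risk = (a / (a + b)) / (c / (c + d))
    # Pearson chi-squared statistic for a 2 x 2 table, without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(chi-squared > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, relative_risk, chi2, p_value

# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed develop the disease.
odds_ratio, relative_risk, chi2, p_value = two_by_two_summary(30, 70, 10, 90)
print(f"OR = {odds_ratio:.2f}, RR = {relative_risk:.2f}, "
      f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```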
Correlation Coefficient
As seen, testing for association between categorical variables, such as disease D and exposure E, can generally be done via a Chi-squared Test.
But what if the two variables, say X and Y, are numerical measurements?
Furthermore, if sample data do suggest that an association exists, what is the nature of that association, and how can it be quantified, or modeled via Y = f(X)?
The Correlation Coefficient r measures the strength of linear association between X and Y, on a scale from –1 (negative linear correlation) through 0 to +1 (positive linear correlation).
[Figure: Scatterplot of Y versus X. Source: JAMA. 2003;290:1486-1493]
For this example, r = –0.387 (weak, negative linear correlation).
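To make the definition of r concrete, here is a small sketch of the sample (Pearson) correlation coefficient computed from sums of squares and cross-products. The data points are invented for illustration and are not the JAMA data behind the slide; the function name `pearson_r` is likewise hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient between paired measurements xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements with a downward trend.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.0, 3.0, 4.0, 2.0, 1.0]

r = pearson_r(xs, ys)
print(f"r = {r:.3f}")
```

The sign of r gives the direction of the linear trend, and its magnitude gives the strength.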
Regression Methods
Want the unique line that minimizes the sum of the squared residuals.
Simple Linear Regression gives the “best” line that fits the data, i.e., the “least squares” regression line.
For this example: Y = 8.790 – 4.733 X (p = .0055)
However, the proportion of total variability in the data that is accounted for by the line is only r² = (–0.387)² ≈ 0.15 (about 15%).
[Figure: scatterplot with the fitted regression line and residuals]
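The least squares line and the proportion of variability it explains can be sketched as follows. The closed-form slope and intercept are the standard ones for simple linear regression; the data and the function names `least_squares_line` and `r_squared` are hypothetical and unrelated to the example on the slide.

```python
def least_squares_line(xs, ys):
    """Intercept and slope of the line minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def r_squared(xs, ys, intercept, slope):
    """Proportion of total variability in ys accounted for by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total

# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.0, 3.0, 4.0, 2.0, 1.0]

intercept, slope = least_squares_line(xs, ys)
fit_quality = r_squared(xs, ys, intercept, slope)
print(f"Y = {intercept:.3f} + ({slope:.3f}) X, r^2 = {fit_quality:.3f}")
```

For simple linear regression, the value returned by `r_squared` equals the square of the correlation coefficient r.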
Extensions of Simple Linear Regression
• Polynomial Regression: predictors X, X², X³,…
• Multilinear Regression: independent predictors X1, X2,…, without or with interaction (e.g., X5 X8)
• Logistic Regression: binary response Y (= 0 or 1)
• Transformations of data, e.g., semi-log, log-log,…
• Generalized Linear Models
• Nonlinear Models
• many more…
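As one example from the list above, a log-log transformation turns a power-law relationship Y = a·X^b into a straight line, log Y = log a + b·log X, which can then be fitted by ordinary least squares. The sketch below assumes positive data; the function name `fit_power_law` and the data (generated exactly from Y = 2·X³) are hypothetical.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on the log-log scale (xs, ys > 0)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_lx = sum(log_x) / n
    mean_ly = sum(log_y) / n
    sxy = sum((lx - mean_lx) * (ly - mean_ly) for lx, ly in zip(log_x, log_y))
    sxx = sum((lx - mean_lx) ** 2 for lx in log_x)
    b = sxy / sxx                          # slope on the log-log scale
    a = math.exp(mean_ly - b * mean_lx)    # back-transform the intercept
    return a, b

# Hypothetical data generated from y = 2 * x**3, so the fit should recover a = 2, b = 3.
xs = [1.0, 2.0, 4.0, 8.0]
ys = [2.0 * x ** 3 for x in xs]

a, b = fit_power_law(xs, ys)
print(f"y = {a:.3f} * x^{b:.3f}")
```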
Sincere thanks to
• Judith Payne
• Heidi Miller
• Rebecca Mataya
• Troy Lawrence
• YOU!