COMPARING GROUPS – PART 1 CONTINUOUS DATA
Min Chen, Ph.D.
Assistant Professor
Quantitative Biomedical Research Center
Department of Clinical Sciences
Bioinformatics Shared Resource
Simmons Comprehensive Cancer Center
Lecture 4
July 9, 2013
Min Chen (QBRC/CCBSR) Comparing groups Continuous Data 1 Lec 4 1 / 38
OUTLINE
1 REVIEW
2 INTRODUCTION
3 COMPARISON OF TWO GROUPS
Parametric tests
REVIEW: 100(1−α)% CONFIDENCE INTERVAL OF THE MEAN
Lower Limit:
$L = \bar{X} - z_{\alpha/2} \times \frac{s}{\sqrt{n}}$
Upper Limit:
$U = \bar{X} + z_{\alpha/2} \times \frac{s}{\sqrt{n}}$
Standard Normal Distribution:
$\mu = 0$, $\sigma = 1$
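The limits above can be computed directly. The following is a minimal sketch using only the Python standard library; the function name `z_confidence_interval` and the data are made up for illustration and are not part of the lecture.

```python
import math
from statistics import NormalDist, mean, stdev

def z_confidence_interval(sample, alpha=0.05):
    """Large-sample CI for the mean: x_bar -/+ z_{alpha/2} * s / sqrt(n)."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)  # sample standard deviation (n - 1 in the denominator)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of N(0, 1)
    half_width = z * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Hypothetical data: 40 made-up measurements, for illustration only
sample = [120 + (i * 7) % 23 for i in range(40)]
lower, upper = z_confidence_interval(sample)
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

The interval is symmetric around the sample mean, and its width shrinks at rate 1/√n as the sample grows.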
REVIEW OF CONFIDENCE INTERVAL FROM SMALL SAMPLE
As a rule of thumb, if the sample size n < 30, use the formula below.
100(1−α)% Confidence Interval:
$\bar{X} \pm t_{\alpha/2,\,n-1} \times \frac{s}{\sqrt{n}}$
where $t_{\alpha/2,\,n-1}$ is the upper $\alpha/2$ critical value (the $(1-\alpha/2)$th quantile) of the t-distribution with
$(n-1)$ degrees of freedom.
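As a rough illustration of the small-sample formula, the sketch below computes a 95% interval for n = 10. The data are invented, and the critical value 2.262 (t with 9 df, upper 2.5% point) is copied from a standard t table rather than computed, since the Python standard library has no t quantile function.

```python
import math
from statistics import mean, stdev

# Hypothetical sample of n = 10 measurements (made up for illustration)
sample = [5.1, 4.9, 5.6, 5.8, 4.7, 5.3, 5.0, 5.5, 4.8, 5.2]
n = len(sample)
x_bar = mean(sample)
s = stdev(sample)  # sample standard deviation

# Upper 2.5% critical values, taken from standard tables (assumed inputs)
T_CRIT_DF9 = 2.262  # t-distribution with n - 1 = 9 degrees of freedom
Z_CRIT = 1.960      # standard normal, shown only for comparison

t_half = T_CRIT_DF9 * s / math.sqrt(n)
z_half = Z_CRIT * s / math.sqrt(n)

print(f"t-based 95% CI: ({x_bar - t_half:.3f}, {x_bar + t_half:.3f})")
print(f"z-based 95% CI: ({x_bar - z_half:.3f}, {x_bar + z_half:.3f})")
```

The t-based interval is wider, which reflects the extra uncertainty from estimating σ with s in a small sample.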
REVIEW: INTERPRETATION OF CI
The CI:
$\Pr(L(X) \le \theta \le U(X)) = 1 - \alpha.$
It is tempting to state “the probability that θ lies between two
numbers, L and U, is (1−α)”.
• Wrong, because θ is a fixed number.
• L(X) and U(X) are random variables, not numbers.
• In repeated sampling, about 100(1−α)% of the calculated intervals (95% when α = 0.05) will contain the true population parameter θ.
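The repeated-sampling interpretation can be checked by simulation. This is a sketch under assumed settings (Normal data with true mean 10 and SD 2, n = 50, 2000 repetitions, a fixed seed); the function name `coverage` is hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def coverage(true_mean=10.0, true_sd=2.0, n=50, trials=2000, alpha=0.05, seed=1):
    """Fraction of simulated z-intervals that contain the fixed true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(trials):
        # A new sample gives new random limits L(X) and U(X); theta stays fixed
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        x_bar = mean(sample)
        half_width = z * stdev(sample) / math.sqrt(n)
        if x_bar - half_width <= true_mean <= x_bar + half_width:
            hits += 1
    return hits / trials

rate = coverage()
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

Each individual interval either contains the true mean or it does not; the 95% describes the procedure across repetitions, so the printed share should land near 0.95.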
RELATIONSHIP BETWEEN TYPE I ERROR (α ) AND POWER
PARAMETRIC VS NON-PARAMETRIC
Parametric tests
• Assume data follow some known distribution (e.g., Normal, t, chi-square, Binomial)
• Compare means, variances
Non-parametric tests
• Do not assume a form of distribution
• Compare other measures of central tendency (e.g., median, or location shift)
• Useful for skewed data, small samples, ordinal data
NOTATION
Quantity              Population parameter    Sample value
Mean                  $\mu$                   $\bar{X}$
Standard deviation    $\sigma$                $s$
Variance              $\sigma^2$              $s^2$
Sample size                                   $n$
Sample value                                  $x_i$
ONE SAMPLE t TEST
Recall the one-sample t-test:
$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$
Test statistic for comparing the mean of one group against a fixed
value.
General form of a t-statistic is
$t = \frac{\text{difference of means}}{\text{standard error}}$.
T-statistic follows a t-distribution!
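A minimal sketch of the one-sample t statistic, written to mirror the "difference over standard error" form. The helper name `one_sample_t`, the data, and the null value μ0 = 5.0 are assumptions for illustration.

```python
import math
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return (t, df) where t = (x_bar - mu0) / (s / sqrt(n)) and df = n - 1."""
    n = len(sample)
    standard_error = stdev(sample) / math.sqrt(n)
    difference = mean(sample) - mu0
    return difference / standard_error, n - 1

# Hypothetical measurements and null value, for illustration only
sample = [5.1, 4.9, 5.6, 5.8, 4.7, 5.3, 5.0, 5.5, 4.8, 5.2]
t_stat, df = one_sample_t(sample, mu0=5.0)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

The resulting value is compared against the t-distribution with n − 1 degrees of freedom.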
STUDENT’S t-DISTRIBUTION
Here is how to generate a Student’s t random variable:
$T_\nu = \frac{Z}{\sqrt{V/\nu}}$,
where
Z has a standard normal distribution;
V has a chi-squared distribution with ν degrees of freedom (df), i.e.,
$V = \sum_{i=1}^{\nu} Z_i^2$
where the $Z_i$ are iid standard normal r.v.’s. (Recall $E[Z_i^2] = 1$, so $E[V] = \nu$.)
Z and V are independent.
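The construction above translates almost line by line into a simulation. The sketch below assumes ν = 10 and 20,000 draws with a fixed seed; `simulate_t` is a hypothetical helper name. As a sanity check it compares the sample variance with ν/(ν − 2), the known variance of a t-distribution for ν > 2.

```python
import math
import random
from statistics import variance

def simulate_t(df, size, seed=0):
    """Draw `size` values of T = Z / sqrt(V / df) with Z and V independent."""
    rng = random.Random(seed)
    draws = []
    for _ in range(size):
        z = rng.gauss(0.0, 1.0)  # standard normal numerator
        # V: sum of df squared iid standard normals, i.e. chi-squared with df d.f.
        v = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        draws.append(z / math.sqrt(v / df))
    return draws

nu = 10
draws = simulate_t(df=nu, size=20000)
sample_var = variance(draws)
theory_var = nu / (nu - 2)
print(f"simulated variance {sample_var:.3f} vs theoretical {theory_var:.3f}")
```

Z and V are drawn from separate calls to the generator, which is what makes them independent in this sketch.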
t – A FAMILY OF DISTRIBUTIONS IDENTIFIED BY df
Recall
$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{(\bar{X} - \mu_0)/(\sigma/\sqrt{n})}{\sqrt{s^2/\sigma^2}}$.
The t-distribution approaches the Normal distribution as df increases.
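The rewritten ratio matches the construction $T_\nu = Z/\sqrt{V/\nu}$ with $\nu = n - 1$. A sketch of the step that links them, assuming the $X_i$ are iid Normal with mean $\mu_0$ and variance $\sigma^2$ under the null hypothesis (for Normal samples, $\bar{X}$ and $s^2$ are independent):

```latex
\begin{align*}
Z &= \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \sim N(0, 1), \\
V &= \frac{(n-1)\,s^2}{\sigma^2} \sim \chi^2_{n-1}, \\
t &= \frac{Z}{\sqrt{V/(n-1)}}
   = \frac{(\bar{X} - \mu_0)/(\sigma/\sqrt{n})}{\sqrt{s^2/\sigma^2}}
   \sim t_{n-1}.
\end{align*}
```

The unknown $\sigma$ cancels between numerator and denominator, which is why the statistic can be computed from the sample alone.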
SMALL SAMPLE VS. LARGE SAMPLE
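Recall that for a CI, as a rule of thumb, if the sample size n < 30, use the t statistic for the 100(1−α)% confidence interval:
$\bar{X} \pm t_{\alpha/2,\,n-1} \times \frac{s}{\sqrt{n}}$
while for large samples we have
$\bar{X} \pm z_{\alpha/2} \times \frac{s}{\sqrt{n}}$.
The reason is that when the sample size is large,
$t_{\alpha/2,\,n-1} \approx z_{\alpha/2}$.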
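To see how quickly the t critical value approaches the z critical value, the sketch below tabulates both for several df. Because the Python standard library has no t quantile function, it integrates the t density numerically (Simpson's rule) and inverts the CDF by bisection; the helper names are hypothetical, and a statistics package would normally be used instead.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom (log-gamma avoids overflow)."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0: one half by symmetry plus Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_upper_critical(alpha_half, df):
    """Upper critical value t_{alpha/2, df}, by bisection on [0, 50]."""
    target = 1 - alpha_half
    lo, hi = 0.0, 50.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z_crit = NormalDist().inv_cdf(0.975)
for df in (4, 9, 29, 99, 999):
    print(f"df = {df:4d}  t = {t_upper_critical(0.025, df):.3f}  z = {z_crit:.3f}")
```

The t column decreases toward the z column as df grows, which is the numerical content of the n < 30 rule of thumb.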
OUTLINE
1 REVIEW
2 INTRODUCTION
3 COMPARISON OF TWO GROUPS
Parametric tests
COMPARING MEANS OF PAIRED SAMPLES
In paired samples, each data point in one sample is matched to a
data point in the second sample.
Same subject
• Measured at 2 time points
• Before and after intervention
• Two eyes (Left, Right)
• Two organs (Heart, Liver)