CCEB Topics in Biostatistics Part 2
Sarah J. Ratcliffe, Ph.D.
Center for Clinical Epidemiology and Biostatistics
University of Penn School of Medicine
Steps:
Select a one-sided or two-sided test.
Establish the level of significance (e.g., α = .05).
Select an appropriate test statistic.
Compute test statistic with actual data.
Calculate degrees of freedom (df) for the test statistic.
Hypothesis testing
Steps cont’d:
Obtain a tabled value for the statistical test.
Compare the test statistic to the tabled value.
Calculate a p-value.
Make decision to reject or fail to reject the null hypothesis.
Hypothesis testing
Steps:
Select a one-sided or two-sided test.
Establish the level of significance (e.g., α = .05).
Select an appropriate test statistic.
Compute test statistic with actual data.
Calculate degrees of freedom (df) for the test statistic.
Hypothesis testing: One-sided versus Two-sided
Determined by the alternative hypothesis.
Unidirectional = one-sided
Example: Infected macaques given vaccine or placebo. Higher viral-replication in vaccine group has no benefit of interest.
H0: vaccine has no beneficial effect on viral-replication levels at 6 weeks after infection.
Ha: vaccine lowers viral-replication levels by 6 weeks after infection.
Hypothesis testing: One-sided versus Two-sided
Bi-directional = two-sided
Example: Infected macaques given vaccine or placebo. Interested in whether vaccine has any effect on viral-replication levels, regardless of direction of effect.
H0: vaccine has no effect on viral-replication levels at 6 weeks after infection.
Ha: vaccine affects the viral-replication levels.
Hypothesis testing
Steps:
Select a one-sided or two-sided test.
Establish the level of significance (e.g., α = .05).
Select an appropriate test statistic.
Compute test statistic with actual data.
Calculate degrees of freedom (df) for the test statistic.
Hypothesis testing: Level of Significance
How many different hypotheses are being examined?
How many comparisons are needed to answer this hypothesis?
Are any interim analyses planned? e.g. test data and, depending on results, collect more data and re-test.
=> How many tests will be run in total?
Hypothesis testing: Level of Significance
αtotal = desired total Type-I error (false positives) for all comparisons.
One test: α1 = αtotal
Multiple tests / comparisons: if αi = αtotal for every test, then ∑αi > αtotal
Need to use a smaller α for each test.
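To see why a smaller per-test α is needed, the sketch below computes the chance of at least one false positive when every test is run at the same level. It assumes the tests are independent, which the slide does not state; the function name is illustrative only.

```python
def familywise_error_rate(alpha_per_test: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests tests.

    Assumption (not from the slides): the tests are independent, so the
    chance of no false positive at all is (1 - alpha) ** n_tests.
    """
    return 1.0 - (1.0 - alpha_per_test) ** n_tests


if __name__ == "__main__":
    # Each test is run at alpha = 0.05; only the number of tests changes.
    for n_tests in (1, 3, 10):
        rate = familywise_error_rate(0.05, n_tests)
        print(f"{n_tests:2d} tests at alpha=0.05 -> total Type-I error ~ {rate:.3f}")
```

With a single test the total error equals α; as the number of tests grows, the total error climbs well above the intended αtotal, although it stays below the simple sum ∑αi.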
Hypothesis testing: Level of Significance
Conservative approach: αi = αtotal / number of comparisons
Can give different α’s to each comparison.
Formal methods include: Bonferroni, Tukey-Kramer, Scheffé’s method, Duncan-Walker.
O’Brien-Fleming boundary or a Lan and DeMets analog can be used to determine αi for interim analyses.
Benjamini Y and Hochberg Y (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. JRSSB, 57:289-300.
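As a rough illustration of the two ideas on this slide, the sketch below applies the conservative equal split (Bonferroni) and the step-up false discovery rate procedure from the Benjamini and Hochberg reference. The p-values are made up for the example, and the function names are illustrative rather than taken from any library.

```python
def bonferroni_alpha(alpha_total: float, n_comparisons: int) -> float:
    """Per-test level under the conservative equal split."""
    return alpha_total / n_comparisons


def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: sort the p-values, find the largest rank k with
    p_(k) <= k * q / m, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            largest_k = rank
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    p_values = [0.01, 0.02, 0.03, 0.5]  # hypothetical p-values, not from the slides
    alpha_i = bonferroni_alpha(0.05, len(p_values))
    print("Bonferroni per-test alpha:", alpha_i)
    print("Bonferroni rejections:   ", [p <= alpha_i for p in p_values])
    print("Benjamini-Hochberg:      ", benjamini_hochberg(p_values, q=0.05))
```

Bonferroni controls the total Type-I error, while Benjamini-Hochberg controls the expected proportion of false discoveries, so the second will typically reject at least as many hypotheses on the same p-values.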
Hypothesis testing
Steps:
Select a one-tailed or two-tailed test.
Establish the level of significance (e.g., α = .05).
Select an appropriate test statistic.
Compute test statistic with actual data.
Calculate degrees of freedom (df) for the test statistic.
Hypothesis testing: Selecting an Appropriate test
How many samples are being compared?
  One sample
  Two samples
  Multi-samples
Are these samples independent?
  Unrelated subjects in each sample.
  Subjects in each sample related / same.
Hypothesis testing: Selecting an Appropriate test
Are your variables continuous or categorical?
If continuous, are the data normally distributed?
Normality can be assessed using a P-P (or Q-Q) plot.
Plot should be approximately a straight line for normality.
If not normal, can the data be transformed to normality?
Blindly assuming normality can lead to wrong conclusions!
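The Q-Q check described above can be sketched with the Python standard library alone. The code pairs the sorted data with standard normal quantiles and summarizes straightness with a correlation coefficient. The plotting positions (i - 0.5)/n are one common convention and are an assumption here, the sample values are invented, and the correlation is only a rough numeric stand-in for looking at the plot.

```python
import math
from statistics import NormalDist, mean


def qq_points(data):
    """Pair each sorted observation with a standard normal quantile.

    Assumption: plotting positions (i - 0.5) / n, one of several conventions.
    """
    n = len(data)
    ordered = sorted(data)
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, ordered


def qq_correlation(data):
    """Correlation of the Q-Q points; values near 1 mean a nearly straight line."""
    x, y = qq_points(data)
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    skewed = [1, 2, 2, 3, 4, 6, 9, 15, 40, 120]  # hypothetical right-skewed data
    logged = [math.log(v) for v in skewed]
    print("raw data:        r =", round(qq_correlation(skewed), 3))
    print("log-transformed: r =", round(qq_correlation(logged), 3))
```

To draw the plot itself, the two lists returned by `qq_points` can be passed to any scatter-plot tool; the decision is still best made by eye.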
Hypothesis testing: Selecting an Appropriate test
Approximately a straight line
= normal assumption okay
Hypothesis testing: Selecting an Appropriate test
Not a straight line
= NOT normal
Can it be transformed to normality?
Hypothesis testing: Selecting an Appropriate test
The natural log transform of the data is approximately a straight line
= normal assumption okay
Analyze the transformed data NOT the original data.
Hypothesis testing: Geometric versus Arithmetic mean
Geometric mean of n positive numerical values is the nth root of the product of the n values.
Geometric will always be less than or equal to arithmetic (equal only when all values are the same).
Geometric better when some values are very large in magnitude and others are small.
If geometric is used, log-transform the data before analyzing.
Arithmetic mean of log-transformed data is the log of the geometric mean of the data.
E.g. t-test on log-transformed data = test for location of the geometric mean.
Langley R., Practical Statistics Simply Explained, 1970, Dover Press.
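The link between the log transform and the geometric mean can be checked numerically. This is a minimal sketch with made-up values: it computes the geometric mean as the exponential of the arithmetic mean of the logs and compares it with the arithmetic mean.

```python
import math
from statistics import mean


def geometric_mean(values):
    """Geometric mean via logs: exp(arithmetic mean of log(values))."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(mean([math.log(v) for v in values]))


if __name__ == "__main__":
    values = [1, 10, 100]  # hypothetical data spanning several orders of magnitude
    print("arithmetic mean:", mean(values))
    print("geometric mean: ", geometric_mean(values))
    # The nth root of the product gives the same number, up to rounding.
    print("nth root of product:", math.prod(values) ** (1 / len(values)))
```

Note that the log route requires strictly positive data, so a sample containing zeros cannot be log-transformed without some adjustment.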
Source: Richardson & Overbaugh (2005). Basic statistical considerations in virological experiments. Journal of Virology, 79(2): 669-676.
Variable    Samples  Relationship  Distribution  Test
Continuous  2        Independent   Normal        Two-sample t-test for means, two-sample F test for variances
Continuous  2        Independent   Non-normal    Wilcoxon rank sum test
Continuous  2        Paired        Normal        Paired t-test
Continuous  2        Paired        Non-normal    Wilcoxon signed-rank test, sign test
Continuous  >2       Independent   Normal        One-way ANOVA for means, Bartlett's test of homogeneity for variances
Continuous  >2       Independent   Non-normal    Kruskal-Wallis test
Continuous  >2       Related       Non-normal    Friedman rank sum test
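The table can be read as a lookup from three answers (number of samples, relationship, distribution) to a test. The sketch below encodes only the rows shown for continuous data; the key spellings and the function name are illustrative choices, and combinations absent from the table return None rather than a guess.

```python
# Rows of the table for continuous data, keyed by
# (number of samples, relationship between samples, distribution).
TESTS_FOR_CONTINUOUS_DATA = {
    ("2", "independent", "normal"): "Two-sample t-test",
    ("2", "independent", "non-normal"): "Wilcoxon rank sum test",
    ("2", "paired", "normal"): "Paired t-test",
    ("2", "paired", "non-normal"): "Wilcoxon signed-rank test, sign test",
    (">2", "independent", "normal"): "One-way ANOVA",
    (">2", "independent", "non-normal"): "Kruskal-Wallis test",
    (">2", "related", "non-normal"): "Friedman rank sum test",
}


def choose_test(n_samples: str, relationship: str, distribution: str):
    """Look up a test from the table; None if the table has no matching row."""
    key = (n_samples, relationship.lower(), distribution.lower())
    return TESTS_FOR_CONTINUOUS_DATA.get(key)


if __name__ == "__main__":
    print(choose_test("2", "paired", "normal"))
    print(choose_test(">2", "independent", "non-normal"))
    print(choose_test(">2", "related", "normal"))  # not in the table -> None
```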
Hypothesis testing: Selecting an Appropriate test
Other tests are available for more complex situations. For example,
Repeated measures ANOVA: >2 measurements taken on each subject; usually interested in time effect.
GEEs / Mixed-effects models: >2 measurements taken on each subject; adjust for other covariates.
Hypothesis testing
Steps:
Select a one-tailed or two-tailed test.
Establish the level of significance (e.g., α = .05).
Select an appropriate test statistic.
Run the test.
Example 1
Expression of chemokine receptors on CD14+/CD14- populations of blood monocytes.
Percent of cells positive by FACS.
CCR8
subject CD14+ CD14-
1 5 17
2 9 25
3 13 36
4 2 9
5 5 18
6 0 2
7 6 6
8 21 30
9 5 6
10 36 35
mean 10.2 18.4
st dev 10.9 12.6
st error 3.4 4.0
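The summary rows of the CCR8 table can be reproduced from the ten subject values. The sketch below uses the sample standard deviation (n - 1 in the denominator) and the standard error as standard deviation divided by the square root of n; that these are the definitions behind the table is an assumption, though the numbers agree to one decimal place.

```python
from math import sqrt
from statistics import mean, stdev

# Percent of cells positive for CCR8, subjects 1 to 10 (from the table above).
cd14_pos = [5, 9, 13, 2, 5, 0, 6, 21, 5, 36]
cd14_neg = [17, 25, 36, 9, 18, 2, 6, 30, 6, 35]


def summarize(values):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    sd = stdev(values)  # sample standard deviation, n - 1 denominator
    return mean(values), sd, sd / sqrt(len(values))


if __name__ == "__main__":
    for label, values in (("CD14+", cd14_pos), ("CD14-", cd14_neg)):
        m, sd, se = summarize(values)
        print(f"{label}: mean={m:.1f}  st dev={sd:.1f}  st error={se:.1f}")
```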
Example 1 cont’d
Continuous data, 2 samples
=> t-test, if normal, OR
=> Wilcoxon rank sum or signed-rank test, if non-normal
Are samples independent or paired?
If independent, can test for equality of variances using Levene’s test.
Example 1 cont’d
T-tests in Excel
=TTEST(L6:L15,M6:M15,2,2)
1st argument (L6:L15): cells containing data from sample 1
2nd argument (M6:M15): cells containing data from sample 2
3rd argument: 1-sided (1) or 2-sided (2) test
4th argument: type of t-test:
1: paired
2: independent, equal variance
3: independent, unequal variance
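To show what the three Excel test types compute, the sketch below derives the t statistic and degrees of freedom for each one from the CCR8 data, using only the Python standard library. The function names are illustrative. P-values are left out because they need the t distribution, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance

cd14_pos = [5, 9, 13, 2, 5, 0, 6, 21, 5, 36]
cd14_neg = [17, 25, 36, 9, 18, 2, 6, 30, 6, 35]


def paired_t(x, y):
    """Type 1: t-test on the within-subject differences."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    return mean(diffs) / sqrt(variance(diffs) / n), n - 1


def pooled_t(x, y):
    """Type 2: independent samples, equal variances (pooled estimate)."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    return (mean(x) - mean(y)) / sqrt(pooled_var * (1 / nx + 1 / ny)), df


def welch_t(x, y):
    """Type 3: independent samples, unequal variances (Welch-Satterthwaite df)."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


if __name__ == "__main__":
    for name, test in (("paired", paired_t), ("pooled", pooled_t), ("Welch", welch_t)):
        t, df = test(cd14_pos, cd14_neg)
        print(f"{name:7s} t = {t:.3f}, df = {df:.2f}")
```

The mean difference is identical in all three cases; what changes is the standard error and the degrees of freedom, which is why the choice between paired and independent matters for this data set. If SciPy is available, `scipy.stats.ttest_rel` and `scipy.stats.ttest_ind` report the corresponding p-values.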
Example 1 cont’d
Possible results for different assumptions:
www.cceb.upenn.edu/main/center/becc.html
Hourly fee service
Design and analysis strategies for research proposals;
Selecting and implementing appropriate statistical methods for specific applications to research data;
Statistical and graphical analysis of data;
Statistical review of manuscripts.