Assessing Validity of
Epidemiologic Studies: Bias
and Random Error
Epidemiological Reasoning
• Generate and test hypotheses using an epidemiologic study
• When interpreting the results of these studies, you must rule out alternative explanations for an observed association before concluding that it is valid:
– Bias
– Confounding
– Random error (chance)
About Bias
• A systematic error that results in an incorrect (invalid) estimate of the measure of association
• Bias does not mean that the investigator is “prejudiced”; rather, it results from ignorance or from unavoidable decisions made during the design and conduct of the study.
• It can be evaluated but not fixed in the analysis phase.
• Two main types of bias are selection and observation bias.
Selection Bias
A. Results from the procedures used to select who gets into the study
B. The relationship between the exposure and the disease is different in the study participants than in those who are eligible but do not participate
C. Most likely to occur in case-control and retrospective cohort studies, because both the exposure and the outcome have already occurred at the time subjects are selected
Selection Bias
• In a case-control study: occurs when cases or controls are more (or less) likely to be included in the study if they have been exposed -- that is, when inclusion in the study is not independent of exposure
• In a cohort study: occurs when selection of exposed and unexposed subjects is not independent of the outcome (because the outcome has not yet occurred when subjects enter a prospective cohort, this bias can only occur in a retrospective cohort study)
Selection Bias: What are the solutions?
• Little or nothing can be done to fix this bias once it has occurred.
• You need to avoid it when you design and conduct the study by
– using the same criteria for selecting cases and controls,
– obtaining all relevant subject records and comparing non-participants to participants,
– obtaining high participation rates, and
– taking into account diagnostic and referral patterns of disease (i.e., for certain diseases, patients may be more likely to be diagnosed or referred for admission based on their risk factors)
Observation Bias
• A measurement error that arises from systematic differences in the way information on exposure or disease is obtained from the study groups
• Occurs after subjects have entered the study
• Pertains to how the data are collected
• Results in participants who are incorrectly classified as either exposed or unexposed or as diseased or not diseased
Observation Bias
• Several types of observation bias:
– Recall bias,
– Interviewer bias,
– Loss to follow-up, and
– Differential and non-differential misclassification
When interpreting study results, ask yourself these questions …
• Given the conditions of the study, could bias have occurred?
• Is bias actually present?
• Are the consequences of the bias large enough to distort the measure of association in an important way?
• In which direction is the distortion -- towards the null or away from the null?
EVALUATING THE ROLE OF
RANDOM ERROR
Not a systematic error, but
chance or the luck of the draw
Statistical Inference
• The goal of any epidemiologic study is to learn about the true relation between an exposure and disease based on data from a sample. This is inference.
• The actual study result will vary depending on who happens to be in the study sample. This is known as sampling variability.
Role of chance…
• You may draw a bad (unrepresentative) sample due to chance alone.
– The measure of association you observe in your data may differ from the true measure of association by chance alone.
• You can calculate the probability that the measure of association you observed was due to chance.
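The idea of sampling variability can be illustrated with a short simulation (a hypothetical sketch: the true risks, sample sizes, and the function name below are illustrative, not from the slides):

```python
import random

# Hypothetical example: suppose the TRUE risk of disease is 0.10 in the
# exposed and 0.05 in the unexposed, so the true RR is 2.0. Each call
# draws one study sample of n exposed and n unexposed subjects and
# returns the RR observed in that sample.
def sample_rr(n, risk_exp=0.10, risk_unexp=0.05):
    cases_exp = sum(random.random() < risk_exp for _ in range(n))
    cases_unexp = sum(random.random() < risk_unexp for _ in range(n))
    if cases_unexp == 0:
        return None  # RR is undefined for this particular draw
    return (cases_exp / n) / (cases_unexp / n)

random.seed(42)
# Five repetitions of the "same" study give five different observed RRs,
# even though the true RR is fixed at 2.0 -- that is sampling variability.
for _ in range(5):
    rr = sample_rr(200)
    print(f"observed RR = {rr:.2f}" if rr is not None else "RR undefined")
```

Each printed RR scatters around the true value of 2.0; no single sample is guaranteed to reproduce it.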
Hypothesis Testing
• Hypothesis testing means that you are performing
a statistical test in order to get a P value. A
statistical test quantifies the degree to which
sampling variability or chance may explain the
observed association.
Hypothesis Testing
• The assumption made about the result before you start the test is the null hypothesis (H0): RR=1, OR=1, RD=0. You are assuming that the H0 is true, NOT some alternative hypothesis (HA)
• The H0 is assessed by a statistical test that gives you a P value. The P value tells how likely it is that the observed result would occur, if the null hypothesis is really the truth.
• Definition of P value: Given that H0 is true, the P value is the probability of seeing the observed result, or results more extreme, by chance alone.
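As a sketch of how a statistical test turns data into a P value, the following uses a two-sided Wald test of H0: RR = 1 on the log risk-ratio scale. The 2x2 counts and the function name are hypothetical, and a real analysis would use a dedicated statistics package:

```python
import math

def rr_p_value(a, n1, b, n0):
    """a cases among n1 exposed; b cases among n0 unexposed (hypothetical counts)."""
    rr = (a / n1) / (b / n0)
    # Approximate standard error of log(RR)
    se = math.sqrt(1/a - 1/n1 + 1/b - 1/n0)
    z = math.log(rr) / se                 # how many SEs log(RR) is from the null, log(1) = 0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided P from the normal tail
    return rr, p

rr, p = rr_p_value(30, 100, 20, 100)
print(f"RR = {rr:.2f}, two-sided P = {p:.3f}")
```

With these counts the observed RR is 1.5 and the P value is roughly 0.1: a moderate, not strong, degree of incompatibility with the null.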
Hypothesis Testing
• P value ranges from 0 to 1.
• The particular statistical test that is used depends on type of study, type of measurement, etc.
• A small P value implies that the alternative hypothesis is a better explanation for the data than chance.
• Small P values indicate that chance is an unlikely explanation for the result.
Statistical Conventions
• p<=.05 is an arbitrary cutoff for statistical significance
• If p<= .05, we say results are unlikely to be due to chance, and we reject H0 in favor of HA.
• If p>.05, we say that chance is a likely explanation for the finding and we do not reject H0.
• However, chance can never be excluded, no matter how small the P value is. Likewise, chance can never be established as the explanation, no matter how large the P value is.
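The convention above amounts to a trivial decision rule (a hypothetical helper, for illustration only; as the slides note, the 0.05 cutoff is arbitrary and "not rejecting" H0 does not prove it true):

```python
# Illustrative decision rule for the arbitrary alpha = 0.05 convention.
def significance_decision(p, alpha=0.05):
    if p <= alpha:
        return "reject H0 in favor of HA (chance is an unlikely explanation)"
    return "do not reject H0 (chance remains a likely explanation)"

print(significance_decision(0.02))
print(significance_decision(0.10))
```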
P Value
• EX: Hypothetical study of pesticide exposure and the risk of breast cancer
• RR = 1.4 P value = .10
• These results indicate that the best estimate of the increased breast cancer risk associated with pesticide exposure is 1.4.
• The P value indicates a moderate degree of compatibility of these data with the null hypothesis. Since the P value is not less than .05, these results are not considered "statistically significant.”
Confidence Intervals
• Another approach to quantifying sampling variability is confidence intervals
• The actual measure of association given by the data is the point estimate. Like a mean, which has a variance and standard deviation, the point estimate has variability that can be expressed mathematically. Given sampling variability, it is important to indicate the precision of the point estimate, i.e., to give some indication of the sampling variability.
• This is indicated by the confidence interval.
Confidence Intervals
• One definition of a confidence interval: Range within
which the true magnitude of effect lies with a stated
probability, or a certain degree of assurance (usually 95%)
• The strict statistical definition: if you did the study 100 times and got 100 point estimates and 100 CIs, then in about 95 of the 100 results the given interval would contain the true value; in about 5 instances it would not.
• Note that the point estimate is the RR, OR, or RD
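A minimal sketch of how such an interval is computed for a risk ratio: work on the log scale and take exp(log RR ± 1.96 × SE). The 2x2 counts and function name are hypothetical:

```python
import math

def rr_ci(a, n1, b, n0, z=1.96):
    """95% CI for the RR from hypothetical counts: a/n1 exposed, b/n0 unexposed."""
    rr = (a / n1) / (b / n0)
    # Approximate standard error of log(RR)
    se = math.sqrt(1/a - 1/n1 + 1/b - 1/n0)
    lo = math.exp(math.log(rr) - z * se)
    hi = math.exp(math.log(rr) + z * se)
    return rr, lo, hi

rr, lo, hi = rr_ci(30, 100, 20, 100)
print(f"RR = {rr:.2f}, 95% CI = {lo:.2f} - {hi:.2f}")
```

Note that the interval is built around the point estimate, so the point estimate always lies inside its own CI.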
Confidence Intervals
• EX: Pesticides and breast cancer
• RR = 1.4 95% CI = 0.7 – 2.6
• Again, the results indicate that the best estimate of the increased breast cancer risk associated with pesticide exposure is 1.4. However, we are 95% confident that the true RR lies between 0.7 and 2.6; that is, the data are also consistent with true RRs anywhere from 0.7 to 2.6. Because the interval includes the null value of 1.0, the result is not statistically significant.
P-values And Confidence Intervals
• P values and CIs tell you nothing about the other possible explanations for an observed result: bias and confounding.
• P values and CIs tell you nothing about biological, clinical or public health significance.
Practice Exercise

Study   Sample Size   Relative Risk   P value   95% CI
A       100           2.0             .10       0.8 - 4.2
B       500           2.0             .06       0.9 - 3.3
C       1000          3.5             .02       2.6 - 4.5
D       2000          3.0             .015      2.2 - 3.5
E       2500          3.2             .001      2.8 - 3.6
Practice Exercise
• Interpret each study result. Include interpretations of the relative risk, P value and confidence interval.
• What is the relationship between the sample size and the width of the confidence interval?
• What is the relationship between the sample size and the P value?
• Which gives more information: the P value or the confidence interval?
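The first two questions can be checked numerically: holding the observed risks fixed while increasing the sample size shrinks the standard error of log(RR), so the confidence interval narrows. This is a sketch with hypothetical risks of 0.10 (exposed) vs 0.05 (unexposed), not the data from the table:

```python
import math

# Width of the 95% CI for the RR when the observed risks are fixed at
# 0.10 vs 0.05 (hypothetical) and only the group size n changes.
def ci_width(n):
    a, b = int(0.10 * n), int(0.05 * n)   # cases among n exposed / n unexposed
    rr = (a / n) / (b / n)
    se = math.sqrt(1/a - 1/n + 1/b - 1/n)  # SE of log(RR) shrinks as n grows
    lo = math.exp(math.log(rr) - 1.96 * se)
    hi = math.exp(math.log(rr) + 1.96 * se)
    return hi - lo

for n in (100, 500, 1000, 2500):
    print(f"n = {n:5d} per group: CI width = {ci_width(n):.2f}")
```

The widths shrink steadily as n grows, mirroring the pattern across studies A through E in the table.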
Practice Exercise
• Is there a relationship between the sample size and the relative risk?
• Are the five study results consistent on the basis of statistical significance?
• Are the five study results consistent on the basis of the point estimates and confidence intervals?