
Assessing Validity of Epidemiologic Studies: Bias and Random Error

Epidemiological Reasoning

• Generate and test hypotheses using an epidemiologic study

• When interpreting the results of these studies, you must rule out alternative explanations for an observed association before accepting the study as valid:

– Bias

– Confounding

– Random error (chance)

About Bias

• A systematic error that results in an incorrect (invalid) estimate of the measure of association

• Bias does not mean that the investigator is “prejudiced”; rather, it results from ignorance or from unavoidable decisions made during the design and conduct of the study.

• It can be evaluated but not fixed in the analysis phase.

• Two main types of bias are selection and observation bias.

Selection Bias

A. Results from procedures used to select “who gets into the study”

B. The relationship between the exposure and the disease is different in the study participants than it is in those who are eligible but don’t participate

C. Most likely to occur in case-control or retrospective cohort studies, because both exposure and outcome have already occurred at the time of subject selection


Selection Bias

• In a Case-Control Study: occurs when controls or cases are more (or less) likely to be included in the study if they have been exposed; that is, inclusion in the study is not independent of exposure

• In a Cohort Study: occurs when selection of exposed and unexposed subjects is not independent of outcome (so it can only occur in a retrospective cohort study)


Selection Bias: What are the solutions?

• Little or nothing can be done to fix this bias once it has occurred.

• You need to avoid it when you design and conduct the study by:

– using the same criteria for selecting cases and controls,

– obtaining all relevant subject records and comparing non-participants to participants,

– obtaining high participation rates, and

– taking into account diagnostic and referral patterns of disease (i.e., for certain diseases, physicians may be more likely to diagnose or refer for admission based on known risk factors).


Observation Bias

• A measurement error that arises from systematic differences in the way information on exposure or disease is obtained from the study groups

• Occurs after subjects have entered the study

• Pertains to how the data are collected

• Results in participants who are incorrectly classified as either exposed or unexposed or as diseased or not diseased


Observation Bias

• Several types of observation bias:

– recall bias,

– interviewer bias,

– loss to follow-up, and

– differential and non-differential misclassification


When interpreting study results, ask yourself these questions …

• Given the conditions of the study, could bias have occurred?

• Is bias actually present?

• Are the consequences of the bias large enough to distort the measure of association in an important way?

• In which direction is the distortion: towards the null or away from the null?

EVALUATING THE ROLE OF RANDOM ERROR

Not a systematic error, but chance or the luck of the draw

Statistical Inference

• The goal of any epidemiological study is to learn about the true relation between an exposure and a disease based on data from a sample. This is inference.

• The actual study result will vary depending on who is actually in the study sample. This is known as sampling variability.

Role of chance…

• You may draw a bad (unrepresentative) sample due to chance alone.

– The measure of association you observe in your data may differ from the true measure of association by chance alone.

• You can calculate the probability that the measure of association you observed was due to chance (see the simulation sketch below).
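To make sampling variability concrete, here is a minimal simulation sketch. The population risks and sample sizes are hypothetical, chosen only for illustration: it draws repeated study samples from a population whose true RR is 2.0 and shows that the observed RR scatters around that value by chance alone.

```python
# A minimal sketch of sampling variability (hypothetical numbers):
# the true risk is 20% in the exposed and 10% in the unexposed,
# so the true RR is 2.0. Each "study" draws its own sample, and the
# observed RR varies from study to study by chance alone.
import random

random.seed(42)  # fixed seed so the sketch is reproducible

def sample_rr(n_per_group=100, risk_exposed=0.20, risk_unexposed=0.10):
    """Draw one study sample and return its observed relative risk."""
    cases_exposed = sum(random.random() < risk_exposed for _ in range(n_per_group))
    cases_unexposed = sum(random.random() < risk_unexposed for _ in range(n_per_group))
    if cases_unexposed == 0:  # guard against division by zero in small samples
        return float("inf")
    return (cases_exposed / n_per_group) / (cases_unexposed / n_per_group)

for i in range(10):
    print(f"Study {i + 1}: observed RR = {sample_rr():.2f}")
```

Even though every sample comes from the same population, no two simulated studies report exactly the same RR; that scatter is the random error the following slides quantify.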

Hypothesis Testing

• Hypothesis testing means that you are performing a statistical test in order to get a P value. A statistical test quantifies the degree to which sampling variability, or chance, may explain the observed association.

Hypothesis Testing

• The assumption made about the result before you start the test is the null hypothesis (H0): RR = 1, OR = 1, RD = 0. You are assuming that H0 is true, NOT some alternative hypothesis (HA).

• H0 is assessed by a statistical test that gives you a P value. The P value tells you how likely the observed result would be if the null hypothesis were really the truth (a concrete test is sketched below).

• Definition of P value: given that H0 is true, the P value is the probability of seeing the observed result, or results more extreme, by chance alone.
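As a concrete illustration, here is a hedged sketch of one such statistical test: a chi-square test on a hypothetical 2×2 cohort table. The counts are invented for illustration, and other tests (such as Fisher's exact test) could equally apply.

```python
# Hypothetical 2x2 cohort table: rows are exposed/unexposed,
# columns are cases/non-cases. Under H0 (no association, RR = 1),
# the chi-square test returns the probability of a result at least
# this extreme arising by chance alone.
from scipy.stats import chi2_contingency

table = [[30, 70],   # exposed:   30 cases among 100 subjects
         [15, 85]]   # unexposed: 15 cases among 100 subjects

chi2, p_value, dof, expected = chi2_contingency(table)

rr = (30 / 100) / (15 / 100)  # observed relative risk = 2.0
print(f"Observed RR = {rr:.1f}, P value = {p_value:.3f}")
```

Here a small printed P value would mean that, if H0 were true, data this extreme would rarely arise by chance; the next slides give the conventional cutoff.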

Hypothesis Testing

• The P value ranges from 0 to 1.

• The particular statistical test that is used depends on the type of study, the type of measurement, etc.

• A small P value implies that the alternative hypothesis is a better explanation for the data.

• Small P values indicate that chance is an unlikely explanation for the result.

Statistical Conventions

• P ≤ .05 is an arbitrary cutoff for statistical significance.

• If P ≤ .05, we say the results are unlikely to be due to chance, and we reject H0 in favor of HA.

• If P > .05, we say that chance is a likely explanation for the finding, and we do not reject H0.

• However, you can never exclude chance, no matter how small the P value; likewise, you can never establish chance as the explanation, no matter how large the P value.

P Value

• Example: a hypothetical study of pesticide exposure and the risk of breast cancer

• RR = 1.4, P value = .10

• These results indicate that the best estimate of the breast cancer risk associated with pesticide exposure is a 1.4-fold increase.

• The P value indicates a moderate degree of compatibility between these data and the null hypothesis. Since the P value is not less than .05, these results are not considered “statistically significant.”

Confidence Intervals

• Another approach to quantifying sampling variability is the confidence interval.

• The actual measure of association given by the data is the point estimate. The point estimate has variability that can be expressed mathematically, just as a mean has a variance and a standard deviation.

• Given sampling variability, it is important to indicate the precision of the point estimate, i.e., to give some indication of sampling variability. This is indicated by the confidence interval.

Confidence Intervals

• One definition of a confidence interval: the range within which the true magnitude of effect lies, with a stated probability or degree of assurance (usually 95%)

• The strict statistical definition: if you did the study 100 times and got 100 point estimates and 100 CIs, then in 95 of the 100 results the true measure of association would lie within the given interval, and in 5 instances it would not

• Note that the point estimate is the RR, OR, or RD (a sketch of the computation follows)
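As a sketch of how such an interval is typically computed for an RR, the usual large-sample approach works on the log scale: take ln(RR) ± 1.96 standard errors, then exponentiate back. The counts below are the same hypothetical cohort table used earlier, not data from the lecture.

```python
# A minimal sketch of the standard large-sample 95% CI for a relative risk
# (hypothetical counts). SE of ln(RR) = sqrt(1/a - 1/n1 + 1/c - 1/n0),
# where a of n1 exposed and c of n0 unexposed subjects are cases.
import math

a, n1 = 30, 100   # exposed:   cases, total
c, n0 = 15, 100   # unexposed: cases, total

rr = (a / n1) / (c / n0)                         # point estimate
se_log_rr = math.sqrt(1/a - 1/n1 + 1/c - 1/n0)   # SE on the log scale
lower = math.exp(math.log(rr) - 1.96 * se_log_rr)
upper = math.exp(math.log(rr) + 1.96 * se_log_rr)

print(f"RR = {rr:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```

A 95% CI for an RR that excludes 1 corresponds, at least approximately, to P < .05 for the matching test of H0: RR = 1.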

Confidence Intervals

• Example: pesticides and breast cancer

• RR = 1.4, 95% CI = 0.7 – 2.6

• Again, the results indicate that the best estimate of the breast cancer risk associated with pesticide exposure is a 1.4-fold increase. However, we are 95% confident that the true RR lies between 0.7 and 2.6; that is, the data are also consistent with true RRs anywhere from 0.7 to 2.6.

P-values And Confidence Intervals

• P values and CIs tell you nothing about the other possible explanations for an observed result: bias and confounding.

• P values and CIs tell you nothing about biological, clinical, or public health significance.

Practice Exercise

Study   Sample Size   Relative Risk   P value   95% CI
A       100           2.0             .10       0.8 – 4.2
B       500           2.0             .06       0.9 – 3.3
C       1000          3.5             .02       2.6 – 4.5
D       2000          3.0             .015      2.2 – 3.5
E       2500          3.2             .001      2.8 – 3.6

Practice Exercise

• Interpret each study result. Include interpretations of the relative risk, P value, and confidence interval.

• What is the relationship between the sample size and the width of the confidence interval?

• What is the relationship between the sample size and the P value?

• Which gives more information: the P value or the confidence interval?

Practice Exercise

• Is there a relationship between the sample size and the relative risk?

• Are the five study results consistent on the basis of statistical significance?

• Are the five study results consistent on the basis of the point estimates and confidence intervals?
