EP711
COHORT STUDIES
Types of Epidemiologic Studies
• Observational
– Cohort
– Case-control
• Experimental
– Randomized controlled trials
Cohort Studies
Definition: A study in which two or more groups of people who are free of disease and who differ in their extent of exposure (e.g., exposed and unexposed) are compared with respect to disease incidence.
Cohort studies are the observational equivalent of experimental studies, but the researcher cannot allocate exposure; he or she must locate a natural experiment in which to observe the relationship between the exposure and the disease.
Randomized Controlled Trials
Design:
• Investigator randomly assigns exposure (treatment)
• Then observe over time for subsequent outcome
[Diagram: identify the study population (cohort); active intervention vs. comparison; outcome?]
Cohort Studies
Design:
• Non-diseased subjects grouped based on presence of exposure
• Then determine subsequent outcome (e.g., disease)
[Diagram: identify the cohort; exposed vs. unexposed; outcome?]
Example: Is smoking associated with lung cancer?
[Diagram: smokers vs. non-smokers; outcome?]
Advantages of Cohort Studies
• Temporal sequence between exposure & disease is clear (e.g., smoking preceded cancer)
• Can directly calculate incidence, RD, PRD
• Good for looking at rare exposures or unusual risk factors (e.g., Agent Orange)
• Can evaluate multiple effects of a single factor
Retrospective Cohort Study
[Diagram: timeline from past to future with the start of the study marked; the cohort is divided into those who have the factor and those who don’t, and incidence is compared.]
• Cheaper, faster
• Efficient for diseases with long latent periods
• Exposure data may be inadequate (limitation)
Prospective Cohort Study
[Diagram: timeline from past to future with the start of the study marked; the cohort is divided into those who have the factor and those who don’t, and incidence is compared.]
• More expensive, time consuming
• Not efficient for diseases with long latent periods
• Better exposure and confounder data
• Less vulnerable to bias
Ambidirectional Cohort Study
Contains elements of both types of studies
[Diagram: timeline from past to future with the start of the study marked; a retrospective part and a prospective part each compare incidence between those who have the factor and those who don’t.]
Types of Cohort Populations
• Open or Dynamic
– Changeable characteristic
– Members come and go
– Losses may occur
– Examples: never married; residents of Boston; aged 25-54
• Fixed
– Irrevocable event
– Does not add new members
– Losses may occur
– Examples: baby boomers, 9/11 survivors, RCT participants
• Closed
– Irrevocable event
– Does not add new members
– No losses occur
– Examples: church picnic or wedding attendees
Selection of Study Population
Choice depends upon hypothesis under study and feasibility considerations
• For common risk factors (obesity, HBP):
– A cohort from the general population (e.g., Framingham Heart Study, NHANES)
– A special study group, e.g., doctors or nurses (e.g., the Nurses' Health Study, Black Women’s Health Study)
• For unusual risk factors:
– A special (rare) exposure group (e.g., Agent Orange, Hiroshima, occupational)
The Framingham Heart Study
• Initiated by NHLBI
• Objective was to identify the common factors or characteristics that contribute to CVD by following healthy individuals
• The researchers recruited 5,209 men and women between the ages of 30 and 62 from the town of Framingham, Massachusetts
• Since 1948, the subjects have continued to return to the study every two years for a detailed medical history, physical examination, and laboratory tests
• In 1971, the study enrolled a second generation - the original participants' adult children and their spouses
• In April 2002 the Study entered a new phase: the enrollment of a third generation of participants, the grandchildren of the original cohort.
The Nurses' Health Study
• 2 cohorts; differ by age
• NHS I
– Assembled in 1976
– ~122,000 female nurses aged 30-55 years
• NHS II
– Assembled in 1989
– 117,000 female nurses aged 25-42 years
• Biennial postal questionnaires
The Nurses' Health Study
• The primary goal was to investigate the potential long-term consequences of the use of oral contraceptives in a population of normal women
• Primary outcomes include heart disease & cancer (common endpoints)
• Examines multiple common risk factors (diet, exercise, obesity, vitamin use)
• Subjects able to respond with a high degree of accuracy
• Motivated to participate in a long term study
• Easy to locate
Air Force Ranch Hand Study (“Agent Orange Study”)
• The U.S. military sprayed some 11 million gallons of the defoliant over southern and central Vietnam from 1962 to 1971 in an effort to expose enemy supply lines, sanctuaries and bases.
• Airmen were exposed during spraying flights, while loading the chemical and while performing maintenance on the aircraft and the spraying equipment.
• Agent Orange was named for the orange-striped barrels it was shipped in. It contains dioxin, a cancer-causing byproduct linked to medical ailments in U.S. war veterans and their Vietnamese counterparts.
Selection of the Comparison Group
1) As similar as possible to the exposed group with respect to other factors that could influence outcome
2) Comparable & accurate information
Counterfactual ideal
• The ideal comparison group consists of exactly the same individuals as the exposed group, but without the exposure
• Since that is impossible, epidemiologists must select different sets of people who are as similar as possible
Sources of Comparison Group
• Internal comparison: e.g., obese vs. lean nurses
• Comparison cohort: e.g., road crew/asphalt workers vs. landscapers/grounds crew
• General population: e.g., rubber workers vs. general population
Which of the three comparison groups is best?
“Healthy-Worker Effect”
• Rates of morbidity and mortality among a working population are lower than those of the general population
• Health requirements for workers (especially physical laborers) tend to be stringent
• General population consists of both healthy and ill people
• Leads to underestimation of risk
Sources of Exposure Information
Pre-Existing Records
• Advantages
– Inexpensive
– Recorded before disease occurrence
• Disadvantages
– Inadequate level of detail
– Missing records
– Little or no information on confounders
Questionnaires, Interviews
• Advantages
– Good for information not routinely recorded
• Disadvantages
– Potential for recall bias
Direct Testing (physical exams, tests, environmental monitoring)
• Advantages
– Good for certain exposures
• Disadvantages
– Expensive
– Not feasible in large studies
Sources of Outcome Information
• Death certificates
• Physician, hospital, health plan records
• Questionnaires (verify by records)
• Medical exams
You can use blinding to ensure that there is comparable ascertainment of the outcomes in both groups
Follow-up
Goal is to obtain complete follow-up information on all subjects regardless of exposure status
• Ascertainment of outcome data involves following all subjects from exposure into the future
• Time consuming process
• However, high losses to follow-up raise doubts about the validity of the study (bias)
Loss to Follow Up
If likelihood of loss to follow up is related to the risk factor and the outcome, the estimate of the association will be biased
Example:
                                      OC users     Non-OC users
True incidence of thromboembolism:    20/10,000    10/10,000
Subjects lost to follow up:           1,012        1,008
Subjects with TE lost to follow up:   12           2
Apparent incidence of TE:             8/8,988      8/8,992
True RR = 2.0    Apparent RR = 1.0
Can occur in prospective cohort studies and in experimental studies
Effects: can produce over- or under-estimate of association.
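The arithmetic behind the thromboembolism example can be checked with a short script. This is only a sketch: the counts are the ones given in the example, while the function and variable names are made up for illustration.

```python
def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Ratio of cumulative incidence in the exposed to that in the unexposed."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)

# Full cohort, as given in the example.
n_oc, n_non = 10_000, 10_000
te_oc, te_non = 20, 10

# Losses to follow-up, and how many of the lost subjects had TE.
lost_oc, lost_non = 1_012, 1_008
te_lost_oc, te_lost_non = 12, 2

true_rr = risk_ratio(te_oc, n_oc, te_non, n_non)

# What the investigator can actually observe once the losses are removed.
apparent_rr = risk_ratio(
    te_oc - te_lost_oc, n_oc - lost_oc,
    te_non - te_lost_non, n_non - lost_non,
)

print(f"True RR     = {true_rr:.1f}")
print(f"Apparent RR = {apparent_rr:.1f}")
```

Overall losses are almost identical in the two groups (about 10% each). The bias arises because the OC users who were lost were far more likely to have had TE, so the loss is related to both exposure and outcome.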
Follow-up Resources
• Town lists
• Telephone books, 411
• Vital records (birth, death, marriage)
• Registry of Motor Vehicles (RMV) lists
• MD & hospital records
• Internet
• Credit bureau
• Relatives, friends (“contacts”)
• Professional registries (AMA, RN, ABA, etc.)
Tuberculosis Treatment and Breast Cancer Study
Follow-up Strategies
• Begin with an interested group
• Collect identifiable information
– Full name & address
– DOB, SSN
– Contact information
• Maintain frequent contact with all respondents
– Regular mail (questionnaires, newsletters)
– Telephone calls
– Personal contact (if possible)
• Incentives (gifts, calendars, money)
Analysis of Cohort Study
• Basic analysis involves calculation of incidence of disease among exposed and unexposed groups
• Depending on available data, you can calculate cumulative incidence (CI) or incidence rates (IR)
• Recall set up of 2 x 2 tables
Person-Time in a Prospective Cohort Study
[Figure: follow-up lines for subjects A through L across periods P1 to P13. x = when they got disease, D = death, ? = lost to follow-up. Three subjects are marked x, two D, and two ?.]
Time at risk (person-years), as listed: 8.3, 11.0, 14.0, 14.0, 10.2, 3.0, 7.0, 10.0, 3.0, 9.0, 6.2, 12.0
Total time at risk = 107.7 person-years
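As a rough check on the person-time total, the individual times at risk can be summed and combined with the number of disease events to give an incidence rate. The sketch below assumes three incident cases, read from the three "x" marks in the figure; the pairing of each time value with a particular subject is not needed for the total and is not assumed here.

```python
import math

# Years at risk for the 12 subjects, as listed on the slide.
time_at_risk = [8.3, 11.0, 14.0, 14.0, 10.2, 3.0, 7.0, 10.0, 3.0, 9.0, 6.2, 12.0]

# Assumption: three subjects developed disease (the three "x" marks).
cases = 3

total_person_years = math.fsum(time_at_risk)
incidence_rate = cases / total_person_years  # cases per person-year

print(f"Total person-years: {total_person_years:.1f}")
print(f"Incidence rate: {incidence_rate * 1000:.1f} per 1,000 person-years")
```

Subjects who die or are lost still contribute the time they were observed, which is why person-time rather than a simple head count goes in the denominator.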
Analysis of a Cohort Study
             Cases    Person-Years of follow-up
Exposed      A        PYE
Unexposed    C        PYU
Total        A+C      PYE + PYU
IRE = A/PYE    IRU = C/PYU
RR = IRE/IRU
Interpretation: The RR is the risk of developing the outcome in the exposed relative to the unexposed
Nurses' Health Study
Examine the association between obesity and CHD in a sample of 117,000 RNs w/o cardiovascular disease
[Diagram: from the start of the study into the future, with follow-up surveys; compare incidence of disease between those who have the risk factor (obese) and those who don’t have it (lean).]
Analysis
                    CHD Cases    Woman-Years of follow-up
Obese (Exposed)     85           99,573
Lean (Unexposed)    41           177,356
Total               126          276,929
IR1 = 85/99,573 = 8.54/10,000 woman-years
IR0 = 41/177,356 = 2.31/10,000 woman-years
RR = IR1/IR0 = 3.7
Obese women had 3.7 times the risk of CHD compared to lean women
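The incidence rates and the ratio above can be reproduced in a few lines. This is a minimal sketch using the case counts and woman-years from the analysis; the helper name `incidence_rate` is illustrative only.

```python
def incidence_rate(cases, person_years, per=10_000):
    """Incidence rate expressed per `per` person-years."""
    return cases / person_years * per

ir_obese = incidence_rate(85, 99_573)   # exposed
ir_lean = incidence_rate(41, 177_356)   # unexposed
rate_ratio = ir_obese / ir_lean

print(f"IR1 = {ir_obese:.2f} per 10,000 woman-years")
print(f"IR0 = {ir_lean:.2f} per 10,000 woman-years")
print(f"RR  = {rate_ratio:.1f}")
```

The multiplier (`per`) cancels in the ratio, so the RR is the same whether rates are expressed per 10,000 or per 100,000 woman-years.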
Risk Ratio in the Nurses' Health Study
Obesity → Rate of CHD?
BMI       CHD cases   Person-years of observation   Rate of CHD per 100,000 P-Yrs (incidence)   Risk Ratio
<21       41          177,356                       23.1                                        1.0
21-<23    57          194,243                       29.3
23-<25    56          155,717                       36.0
25-<29    67          148,541                       45.1
>29       85          99,573                        85.4                                        3.7
Risk Ratio = 85.4/100,000 / 23.1/100,000 = 3.7
Risk Difference in the Nurses' Health Study
Obesity → Rate of CHD?
BMI       CHD cases   Person-years of observation   Rate of CHD per 100,000 P-Yrs (incidence)   Risk Difference
<21       41          177,356                       23.1                                        0.0
21-<23    57          194,243                       29.3
23-<25    56          155,717                       36.0
25-<29    67          148,541                       45.1
>29       85          99,573                        85.4                                        62.3
Risk Difference = 85.4/100,000 - 23.1/100,000 = 62.3 excess cases per 100,000 P-Yrs in heaviest group
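The same calculation extends to every BMI category, always comparing against the leanest group (BMI <21). The sketch below uses the case counts and person-years from the slides; the slides report the ratio and difference only for the heaviest group, so the intermediate values it prints are derived here rather than taken from the source.

```python
# (BMI category, CHD cases, person-years of observation) from the slides.
bmi_groups = [
    ("<21", 41, 177_356),      # reference (leanest) group
    ("21-<23", 57, 194_243),
    ("23-<25", 56, 155_717),
    ("25-<29", 67, 148_541),
    (">29", 85, 99_573),
]

PER = 100_000  # express rates per 100,000 person-years


def rate(cases, person_years):
    """Incidence rate per PER person-years."""
    return cases / person_years * PER


_, ref_cases, ref_person_years = bmi_groups[0]
reference_rate = rate(ref_cases, ref_person_years)

# One row per category: (label, rate, ratio vs. reference, difference vs. reference).
results = []
for label, cases, person_years in bmi_groups:
    r = rate(cases, person_years)
    results.append((label, r, r / reference_rate, r - reference_rate))

print(f"{'BMI':<8}{'Rate':>8}{'RR':>6}{'RD':>8}")
for label, r, rr, rd in results:
    print(f"{label:<8}{r:>8.1f}{rr:>6.1f}{rd:>8.1f}")
```

Computed from unrounded rates, the difference for the heaviest group comes out near 62.2 per 100,000 person-years; the slide's 62.3 follows from subtracting the already rounded rates (85.4 - 23.1).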
Strengths of Cohort Studies
• Efficient for rare exposures
• Usually good information on exposures
• Can evaluate multiple effects of an exposure
A Cohort Study Can Look at Multiple Outcomes
[Figure: one cohort classified by obesity (yes/no) and followed for four outcomes. Values as they appear on the slide: reproductive problems 227 per 320,807 P-Yrs and 217 per 310,820 P-Yrs; cardiovascular disease 138 and 169; breast cancer 98 and 119; orthopedic problems 239 3 and 139.]
Disadvantages to Cohort Studies (especially prospective)
• May need large numbers of subjects for long periods of time
• Can be expensive and time consuming
• Not good for rare diseases or those with long latency
• Loss to follow up undermines validity
When Reading A Cohort Study, Ask…
• How were the study groups selected or defined?
• Did they differ in other ways that could influence the outcome?
• Were the data accurate?
• Was data collection comparable for all study groups?
• How complete was the follow-up?
The Black Women’s Health Study (BWHS)
A Follow-up Study of African-American Women
Boston University
Slone Epidemiology Center
Why Is The BWHS Needed?
• Rates of illness and death from many diseases are higher in African-American women
• Lack of health research studies involving African-American women, particularly large studies
Death rate per 100,000 women
         Heart    Stroke    Cancer
Black    156      40        133
White    95       23        109
Exposure and Outcome Information
• Biennial postal questionnaires
• Self-report
1995 Questionnaire Data: Baseline
• Age
• Weight
• Height
• Waist, hip circumference
• Use of medical care
• Occupation
• Education
• Medical history (prevalent disease)
• Reproductive history
• Drugs (OCs, HRT, vitamins, medications)
• Cigarette smoking
• Alcohol use
• Diet (60 item Block-NCI questionnaire)
• Physical activity
• Family care responsibilities
1997-2007 Follow-up Questionnaires
Update “exposures” for previous 2-year period (e.g., OC use, weight, alcohol use, cigarette smoking, physical activity, etc.)
Record “outcomes” for previous 2-year period: incident disease, births, deaths
Additional questions:
• Ancestry (race, ethnicity, where born)
• Lupus symptom list
• Experiences and perceptions of racism
• Hair straightener use
• Family history of disease
• Exposure to violence
• Use of herbal remedies
• Individual health/belief system
• Depression scale (CESD)*
• Household income
• Education*
• Diet*
• Religion/spirituality
• Perceived stress and coping
• Dental health
• Access to car/transportation
• Siblings/birth order
*Repeat question
1995 – 2007 Questionnaire Data
Prevalent and incident diseases and conditions:
Hypertension           Cervical cancer
Diabetes               Rheumatoid arthritis
High cholesterol       Osteoarthritis
Heart attack           Gingivitis
Angina                 Depression
Stroke                 Sarcoidosis
Clot in lung, leg      Asthma
Cyst in breast         Toxemia/Pre-eclampsia
Fibroids               Gastric/duodenal ulcer
Endometriosis          Hydatidiform mole
Lupus                  Polycystic ovary
Sickle cell anemia     Glaucoma
Breast cancer          Multiple sclerosis
Lung cancer            Kidney stones
Colon/rectal cancer    Other - specify
Validation
• Self-reported data
• Important for minimizing bias (misclassification)
• Must confirm:
– Exposures (when feasible)
– Vital status (deaths)
– Outcomes
Validation: Exposures
• Expensive, not feasible in most cohort studies
• Not all exposures can be easily validated (subjective measures)
– Perceptions and experiences of racism/unfair treatment
• Can be accomplished in a sample of the cohort
– Anthropometric (height, weight, hip & waist circumference)
– Physical activity
– Diet
Diet Validation Study
• 408 BWHS participants
• Over a 1-year period (quarterly) provided:
– 3 telephone 24-hour diet recalls
– 1 3-day food diary
• Compared nutrient intake estimates
– FFQ data vs. combined recall & diary data
Kumanyika et al., Ann Epidemiol 2003;13:111-118
Validation: Outcomes
Non-medical cohort:
• Symptoms of illness are nonspecific
• Participants may not know the diagnosis even if it was made
• Direct examinations not feasible in a large cohort study
Validation: Outcomes
Information requested depends on outcome studied
– Breast cancer and other cancers: hospital records, pathology reports, discharge summaries, CA registry data
– Coronary heart disease: hospital records, discharge summaries
– Lupus, RA, MS, Sarcoidosis: hospital records, physician checklists
– Hypertension and Diabetes: self-report plus use of appropriate drug, physician checklists
Challenges
Medical records
Difficulties in obtaining records
• Additional consent process (medical release)
• Incomplete records
• Records from multiple sources
• Physician checklists: a burden on physician
Remedies
• Patient checklists
• Registry data (cancer)
Validation: Deaths
• Important for follow-up (person-time and outcome)
• Reported by:
– Next of kin
– Post office
– Internet
– SSDMF
– NDI
• Confirmed by:
– Death certificate with date and cause(s) of death
• Supplied by:
– State registrar
– Next of kin
Challenges
Obtaining death certificates
– State IRB process (varies by city and state)
– State registry budget cuts
– Takes time to receive certificates
– Cost ranges from $0.37 to $20 per certificate search
Remedy: NDI Plus
– Gives DOD and coded cause of death
– Still requires state IRB approval
Study Results
• ~100 publications (manuscripts and abstracts)
• Full list available at:
www.bu.edu/bwhs/publications
Genomic Studies
• Collection of cheek samples (for extraction of DNA) from all BWHS participants between January 2004 and December 2007
• Samples sent to National Human Genome Center at Howard University
BWHS Genomic Studies
• Participants receive $15 AFTER their consent and sample have been received
• Samples stored at Howard University Human Genome Center for future analyses
• 56% (n~27,000) participation / ~5% refusal
• Non-responders followed up via telephone calls
≥1 Move Between 1995-1997
Age: <30      70%
     30-39    57%
     40-49    45%
     50-69    39%
Russell, et al. AJE. 2001;154:845-53