PHC215
By Dr. Khaled Ouanes Ph.D.
E-mail: [email protected]
Twitter: @khaled_ouanes
INTRODUCTION TO
HEALTHCARE RESEARCH
METHODS
Correlational/Ecological Studies
Correlational studies are also called
ecological or aggregate studies.
This type of study uses population-level
data to examine the relationship between
exposure rates and disease rates.
The units of analysis are therefore populations or groups
of people rather than individuals.
That is, the focus is on comparing
populations or groups rather than individual patients
or participants.
Examples
Does the percentage of adults with multiple sclerosis tend to be higher in countries farther from the equator?
Does the rate of asthma tend to be higher in cities with higher levels of air pollution?
Does the prevalence of diabetes tend to be higher in populations with a higher prevalence of obesity?
Population-level data are used
to look for associations between
two or more group
characteristics
Data Sources
At least one data source (if not more) that contains comparable information about the population characteristics of interest must be identified.
Information about all the variables of interest must be available for a suitable number of populations, which can be grouped by place or by time.
Examples of Populations
All Western European countries
The largest 25 metropolitan areas in the Arab world
All sub-Saharan African countries
A random sample of survey areas in London
Historic data for the past decades from one or more place-based populations
Exposures and Outcomes
At least one characteristic of the populations
being examined is designated as an exposure
Exposures are often environmental measures likely to be fairly consistent across an
entire population
At least one characteristic is designated as an
outcome
Aggregate Data
Population characteristics are in the form of
aggregate (grouped) data, such as:
the proportion of each population with a particular characteristic
the average value of the variable in the population
Examples of Exposures
The percentage of adults older than 30 who have not completed
at least 12 years of education
The mean income in the population
The median age
The number of rainy days over a given year in the population
The average ultraviolet radiation index during midday in the
hottest month of the year
Examples of Diseases
The prevalence of obesity among adults
The mean BMI (body mass index) among adults
The annual mortality rate from asthma
Cautions
Correlational studies are valid only if the data
points are comparable.
A data point is a discrete unit of information. Generally, any single fact is a data point.
In a statistical or analytical context, a data point is usually derived from a
measurement or research and can be represented numerically and/or graphically.
In some populations, exposures and diseases may
be routinely undercounted or routinely
overdiagnosed compared to other populations.
Cautions
If multiple sources of data are used or if the data
were collected over a lengthy period of time,
then the definition of exposure or disease may
differ from one population to another and may
not be comparable.
Data Management Example
Data should be entered into a spreadsheet
Each population (A, B, C, etc.) is in its own row
Each exposure and each outcome is in its own
column
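The layout described above can be sketched in a few lines of Python. This is only an illustration: the column names and all numbers are invented, and in practice the data would come from a spreadsheet file rather than an inline string.

```python
import csv
import io

# Hypothetical aggregate data: one row per population, one column per variable.
# All values are invented for illustration only.
RAW = """population,obesity_prevalence_pct,diabetes_prevalence_pct
A,18.0,6.1
B,24.5,8.0
C,31.2,10.4
"""


def load_populations(text):
    """Parse spreadsheet-style text into one record per population."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append({
            "population": record["population"],
            "exposure": float(record["obesity_prevalence_pct"]),
            "outcome": float(record["diabetes_prevalence_pct"]),
        })
    return rows


populations = load_populations(RAW)
for row in populations:
    print(row["population"], row["exposure"], row["outcome"])
```

Keeping one population per row means each row later becomes one point on the scatterplot.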
Analysis: Correlation
On a scatterplot used to illustrate correlation, each point represents one population in the study.
The exposure is plotted on the x-axis, and the outcome or
disease is plotted on the y-axis.
Do you see a Correlation?
Analysis: Correlation
1. When all the points fall neatly in a line, then the
correlation is strong.
2. When the points are not exactly linear but a trend
line can be drawn, then the correlation is mild or
moderate.
3. When the points appear to be randomly placed
and no obvious line can be drawn through them,
then the correlation is weak or nonexistent.
Analysis: Correlation
If higher levels of exposure are linked to higher rates of disease, then the slope is positive.
If higher levels of exposure are linked to lower rates of disease, then the slope is negative.
Analysis: Correlation
For continuous variables and other variables with
responses that can be plotted on a number line, a
Pearson correlation coefficient (r) should be used to calculate the correlation.
For variables that assign a rank to responses or that have
ordered categories, use the Spearman rank-order
correlation (designated by the letter r or the Greek letter
ρ (rho) in most statistical programs).
Analysis: Correlation
The Pearson method is built on the notion that if
Measurement 1 tracks Measurement 2 (directly or
inversely), you can get an indication of how closely
linked they are by calculating Pearson's r (the
correlation coefficient). It is derived from the
products of the deviations of each M1 value from the M1
average and of each M2 value from the M2 average,
scaled so that the result lies between -1 and +1.
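The calculation described above can be sketched as a short Python function. The function name and the example numbers are illustrative assumptions; statistical software computes the same quantity.

```python
import math


def pearson_r(m1, m2):
    """Pearson's r from deviations of each measurement from its own average."""
    if len(m1) != len(m2) or len(m1) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean1 = sum(m1) / len(m1)
    mean2 = sum(m2) / len(m2)
    dev1 = [a - mean1 for a in m1]
    dev2 = [b - mean2 for b in m2]
    # Sum of the products of paired deviations (the numerator).
    sum_products = sum(a * b for a, b in zip(dev1, dev2))
    # Sums of squared deviations scale the result into the range -1..+1.
    ss1 = sum(a * a for a in dev1)
    ss2 = sum(b * b for b in dev2)
    if ss1 == 0 or ss2 == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sum_products / math.sqrt(ss1 * ss2)


# Invented example: one value per population.
exposure = [1.0, 2.0, 3.0, 4.0, 5.0]
outcome = [2.0, 4.0, 6.0, 8.0, 10.0]
r = pearson_r(exposure, outcome)
print(r)
```

Here the outcome is an exact multiple of the exposure, so every point lies on a line with a positive slope.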
Analysis: Correlation
Spearman's rank coefficient is similar to Pearson's in
producing a value from -1 to +1, but you would
use Spearman when the rank order of the data
is what matters, rather than the exact values.
The Pearson test is more widely used.
r = –1: all points lie perfectly on a line with a negative slope
r = 1: all points lie perfectly on a line with a positive slope
r = 0: no linear association between the exposure and outcome
r² shows how strong a correlation is without indicating the
direction of the association
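One common way to obtain Spearman's coefficient is to replace each value by its rank and then compute Pearson's r on the ranks. The sketch below assumes that approach, gives tied values the average of their ranks, and uses invented numbers and hypothetical function names.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson's r; assumes neither list is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho computed as Pearson's r on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Invented data: the outcome always rises with the exposure, but not linearly.
exposure = [1, 2, 3, 4, 5]
outcome = [1, 4, 9, 16, 200]
rho = spearman_rho(exposure, outcome)
r = pearson_r(exposure, outcome)
print(rho, r, r ** 2)
```

Because the rank order of the two variables matches exactly, rho reaches its maximum, while Pearson's r is lower because the points do not fall on a straight line.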
Analysis
Use linear regression models when the goal is to:
compare more than two variables
understand the relationship between two variables
while controlling or adjusting for the effects of other
variables
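To illustrate what "adjusting for" another variable means, the sketch below estimates the slope of an outcome on an exposure after removing the linear effect of one covariate from both. It uses residualization, which gives the same exposure coefficient as a linear regression with both predictors. The data are invented and the function names are hypothetical. Real analyses would use statistical software.

```python
def simple_fit(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(x, y):
    """What is left of y after removing its linear relationship with x."""
    a, b = simple_fit(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def adjusted_slope(exposure, outcome, covariate):
    """Slope of outcome on exposure, controlling for one covariate."""
    exposure_resid = residuals(covariate, exposure)
    outcome_resid = residuals(covariate, outcome)
    _, slope = simple_fit(exposure_resid, outcome_resid)
    return slope


# Invented population-level data; the outcome is built as 2 + 3*x + 5*z.
exposure = [1, 2, 3, 4, 5, 6]
covariate = [2, 1, 4, 3, 6, 5]
outcome = [2 + 3 * x + 5 * z for x, z in zip(exposure, covariate)]

_, crude = simple_fit(exposure, outcome)
adjusted = adjusted_slope(exposure, outcome, covariate)
print(crude, adjusted)
```

In this constructed example the crude slope is inflated because the covariate rises with the exposure, while the adjusted slope recovers the value used to build the data.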
Age Adjustment
When the populations being compared have very different age structures, age adjustment may be necessary to make a fair comparison among populations.
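One widely used approach is direct standardization: each population's age-specific rates are applied to the same standard population. The sketch below assumes that method, with two age groups and invented counts. The function names are hypothetical.

```python
def crude_rate(cases, people):
    """Overall rate, ignoring age structure."""
    return sum(cases) / sum(people)


def age_adjusted_rate(cases, people, standard_population):
    """Directly standardized rate: age-specific rates weighted by a standard."""
    total_standard = sum(standard_population)
    return sum(
        (c / p) * (s / total_standard)
        for c, p, s in zip(cases, people, standard_population)
    )


# Invented data for two age groups: [younger, older].
standard = [5000, 5000]

people_a, cases_a = [9000, 1000], [90, 100]   # mostly young population
people_b, cases_b = [2000, 8000], [20, 800]   # mostly old population

crude_a = crude_rate(cases_a, people_a)
crude_b = crude_rate(cases_b, people_b)
adjusted_a = age_adjusted_rate(cases_a, people_a, standard)
adjusted_b = age_adjusted_rate(cases_b, people_b, standard)
print(crude_a, crude_b, adjusted_a, adjusted_b)
```

Both invented populations have identical age-specific rates, so their crude rates differ only because of age structure, and the adjusted rates coincide.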
Avoiding the Ecological Fallacy
Correlational studies compare groups rather than individuals.
No individual-level data are included in the analysis, only population-level data.
The incorrect attribution of population-level associations to individuals is called the ecological fallacy.
Even though a population with a higher rate of exposure to something has a higher rate of
disease than populations with lower exposure rates, individuals in that population who have a high level of exposure do not necessarily have
the disease.
Avoiding the Ecological Fallacy
The experience of an individual in a population may vary significantly from the population average.
It would be incorrect to assume that any one individual from a country with a high average body mass index (BMI) will be obese or that an individual from a country with a low average BMI will not be obese.
However, it is appropriate to identify trends in populations and to use those observations to
generate hypotheses for individual-level studies that will test for relationships between the characteristics of interest in individuals.
Key Characteristics of Correlational
(Ecological) Studies
Case Series
Uses of Case Series
Describing the characteristics of and similarities
among a group of individuals with the same signs
and/or symptoms of disease
Identifying new syndromes and refining case
definitions.
Clarifying typical disease progression
Developing hypotheses for future research
Sample Size
Some case series for rare conditions may
require only a few participants
Other studies may include several hundred
individuals
Getting Started…
Select one disease or condition of interest
Determine what will be new and interesting about
the study
Identify an appropriate and available source of
cases
Establish a clear case definition that spells out
inclusion criteria and exclusion criteria.
Case Definitions
Specify characteristics related to:
The disease or procedure
ICD codes (International Classification of Diseases codes) are often used as
part of the definition
Person
Place
Time
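A case definition of this kind can be written down as explicit inclusion criteria and applied to each record. The sketch below is purely hypothetical: the hospital name, age range, dates, and record fields are invented, and J45 is used as the ICD-10 category for asthma.

```python
from datetime import date

# Hypothetical case definition covering disease, person, place, and time.
CASE_DEFINITION = {
    "icd10_prefix": "J45",          # disease: ICD-10 category for asthma
    "min_age": 5,                   # person: children aged 5 to 17
    "max_age": 17,
    "place": "Hospital X",          # place: invented hospital name
    "start": date(2020, 1, 1),      # time: admissions during 2020
    "end": date(2020, 12, 31),
}


def meets_case_definition(record, definition=CASE_DEFINITION):
    """Return True only if the record satisfies every inclusion criterion."""
    return (
        record["icd10"].startswith(definition["icd10_prefix"])
        and definition["min_age"] <= record["age"] <= definition["max_age"]
        and record["place"] == definition["place"]
        and definition["start"] <= record["admitted"] <= definition["end"]
    )


# Invented records for illustration.
records = [
    {"id": "r1", "icd10": "J45.9", "age": 10, "place": "Hospital X",
     "admitted": date(2020, 3, 14)},
    {"id": "r2", "icd10": "J45.9", "age": 42, "place": "Hospital X",
     "admitted": date(2020, 6, 2)},
    {"id": "r3", "icd10": "E11.9", "age": 12, "place": "Hospital X",
     "admitted": date(2020, 9, 30)},
]

included = [r["id"] for r in records if meets_case_definition(r)]
print(included)
```

Writing the criteria out this way makes it easy to check that every case in the series was selected by the same rules.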
Sample Case Definitions
Data Collection
Primary data: interviews of cases using a
questionnaire and/or qualitative techniques
Secondary data from patient charts (medical
records)
It is often helpful to create a questionnaire that guides the extraction of
information from medical records
Be aware that patient charts are often incomplete; missing information
about a symptom does not mean that the patient did not experience it
Most case series do not require any advanced analyses or any numbers beyond simple counts
and frequencies.
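Simple counts and frequencies of this kind can be produced with a few lines of Python. In the sketch below the cases and symptom names are invented, and a symptom that was not recorded in the chart is counted separately rather than treated as absent.

```python
from collections import Counter

# Invented case series: None means the chart says nothing about the symptom.
cases = [
    {"id": 1, "fever": True, "rash": True},
    {"id": 2, "fever": True, "rash": None},
    {"id": 3, "fever": False, "rash": True},
    {"id": 4, "fever": True, "rash": False},
]


def summarize(cases, symptom):
    """Count present/absent/not recorded; percentage uses recorded cases only."""
    counts = Counter()
    for case in cases:
        value = case.get(symptom)
        if value is None:
            counts["not recorded"] += 1
        elif value:
            counts["present"] += 1
        else:
            counts["absent"] += 1
    recorded = counts["present"] + counts["absent"]
    percent = 100 * counts["present"] / recorded if recorded else None
    return counts, percent


fever_counts, fever_percent = summarize(cases, "fever")
rash_counts, rash_percent = summarize(cases, "rash")
print(dict(fever_counts), fever_percent)
print(dict(rash_counts), rash_percent)
```

Reporting the "not recorded" count alongside each percentage keeps the gaps in the medical records visible to the reader.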
Key Characteristics of a
Case Series
Cross-Sectional Surveys
Overview
The goal of a cross-sectional survey, also called a prevalence study, is to measure the proportion of a population with a particular exposure or disease at one point in time based on a representative sample of a population.
Cross-sectional surveys are among the most popular study approaches in the health sciences because they allow for
the relatively rapid collection of new data.
Uses
Cross-sectional surveys are used to:
Describe communities
Assess population needs
Evaluate programs
Establish baseline data prior to the initiation
of longitudinal studies
Representative Populations
Cross-sectional studies use a simple study design:
The researcher asks a few hundred people to
complete a short questionnaire and then analyzes the
data.
However, there is one very important requirement: the
participants must be reasonably representative of
some larger population.
Representative Populations
If a researcher wants the results of a survey to be
generalizable to all town residents, it is NOT acceptable to
use a convenience population such as:
Friends
Fans attending a football game
Shoppers at a store at a given time on a chosen day
Individuals attending a clinic
Pupils attending a neighbourhood school
Representative Populations
If the results of a cross-sectional survey are
intended to reflect the profile of an entire
town (or other population group), then the
study’s sampling strategy must recruit a
population that is as diverse as the town.
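One way to avoid a convenience population is to draw a simple random sample from a list of everyone in the target population. The sketch below assumes such a list (a sampling frame) exists. The resident identifiers, sample size, and seed are invented, and other sampling designs may suit a real study better.

```python
import random


def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Draw sample_size distinct members, each with equal chance of selection."""
    if sample_size > len(sampling_frame):
        raise ValueError("sample size cannot exceed the sampling frame")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(sampling_frame, sample_size)


# Hypothetical sampling frame: an identifier for every town resident.
sampling_frame = [f"resident_{i:04d}" for i in range(1, 5001)]
sample = simple_random_sample(sampling_frame, 300, seed=215)
print(len(sample), sample[:3])
```

Every resident on the list has the same chance of being invited, which is not true of friends, shoppers, or clinic attendees.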
Analysis: Prevalence
Prevalence = the proportion of the population with a given trait at
the time of the survey
Analysis: Comparative Statistics
Prevalence rate ratios (PRRs) compare the prevalence of a characteristic in
two population subgroups by taking a ratio of their prevalence rates
Note: An exposure can be said to be “associated” or “related” to a disease,
but a cross-sectional survey cannot show that an exposure caused a
disease.
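Prevalence and the prevalence rate ratio are both simple proportions, as the sketch below shows. The survey counts are invented and the function names are hypothetical.

```python
def prevalence(cases, sample_size):
    """Proportion of the surveyed group with the trait at the time of survey."""
    if sample_size <= 0:
        raise ValueError("sample size must be positive")
    return cases / sample_size


def prevalence_rate_ratio(cases_a, n_a, cases_b, n_b):
    """Prevalence in subgroup A divided by prevalence in subgroup B."""
    reference = prevalence(cases_b, n_b)
    if reference == 0:
        raise ValueError("ratio is undefined when the reference prevalence is 0")
    return prevalence(cases_a, n_a) / reference


# Invented survey: 600 respondents, 200 exposed and 400 unexposed.
exposed_cases, exposed_n = 50, 200
unexposed_cases, unexposed_n = 50, 400

overall = prevalence(exposed_cases + unexposed_cases, exposed_n + unexposed_n)
prr = prevalence_rate_ratio(exposed_cases, exposed_n,
                            unexposed_cases, unexposed_n)
print(overall, prr)
```

A PRR above 1 means the trait is more common in the first subgroup. As the note above says, this shows an association at one point in time, not that the exposure caused the disease.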
Key Characteristics of
Cross-Sectional Surveys
Based on the textbook Introduction to Health Research Methods by K.H. Jacobsen