Crit Rev Toxicol, 2013; 43(10): 829–849
© 2013 Informa Healthcare USA, Inc. DOI: 10.3109/10408444.2013.837864
http://informahealthcare.com/txc
ISSN: 1040-8444 (print), 1547-6898 (electronic)
REVIEW ARTICLE
Evaluation of the causal framework used for setting National Ambient Air Quality Standards
Julie E. Goodman, Robyn L. Prueitt, Sonja N. Sax, Lisa A. Bailey, and Lorenz R. Rhomberg
Gradient, Cambridge, MA, USA
Abstract
A scientifically sound assessment of the potential hazards associated with a substance requires a systematic, objective and transparent evaluation of the weight of evidence (WoE) for causality of health effects. We critically evaluated the current WoE framework for causal determination used in the United States Environmental Protection Agency’s (EPA’s) assessments of the scientific data on air pollutants for the National Ambient Air Quality Standards (NAAQS) review process, including its methods for literature searches; study selection, evaluation and integration; and causal judgments. The causal framework used in recent NAAQS evaluations has many valuable features, but it could be more explicit in some cases, and some features are missing that should be included in every WoE evaluation. Because of this, it has not always been applied consistently in evaluations of causality, leading to conclusions that are not always supported by the overall WoE, as we demonstrate using EPA’s ozone Integrated Science Assessment as a case study. We propose additions to the NAAQS causal framework based on best practices gleaned from a previously conducted survey of available WoE frameworks. A revision of the NAAQS causal framework so that it more closely aligns with these best practices and the full and consistent application of the framework will improve future assessments of the potential health effects of criteria air pollutants by making the assessments more thorough, transparent, and scientifically sound.
Keywords
Air quality, causal framework, criteria pollutants, risk assessment, systematic review, weight of evidence
History
Received 1 July 2013
Revised 20 August 2013
Accepted 20 August 2013
Published online 1 October 2013
Table of Contents
Abstract
Introduction
The NAAQS causal framework
WoE best practices
  Phase 1: Define the causal question and develop criteria for study selection
  Phase 2: Develop and apply criteria for review of individual studies
  Phase 3: Integrate and evaluate evidence
  Phase 4: Draw conclusions based on inferences
Case study: Ozone ISA
  Phase 1
  Phase 2
    Evaluation of study quality
    Consideration of study limitations
      Measurement error
      Confounders
      Model specification
  Phase 3
    Consideration of modified Bradford Hill criteria
    Weighing alternative views
  Phase 4
Conclusions
Acknowledgements
Declaration of interest
References
Introduction
A scientifically sound assessment of the potential risks associated with a substance requires a systematic, objective and transparent evaluation of the weight of evidence (WoE) for causality of health effects. Many different methods for conducting WoE evaluations have been used by authoritative bodies and reported in the scientific literature (e.g. Adami et al., 2011; Borgert et al., 2011; ECETOC, 2009; IARC, 2006; Rhomberg et al., 2010, 2011a, 2013; Swaen & van Amelsvoort, 2009). The United States Environmental Protection Agency (EPA) uses WoE approaches to assess the risks of health effects from exposures to chemicals in the environment, and these assessments are used to support a wide range of regulatory activities. For example, as part of the process for setting health-based National Ambient Air Quality Standards (NAAQS), EPA periodically evaluates the WoE for potential health risks of exposure to each of six substances that are classified as criteria air pollutants [particulate matter (PM), sulfur dioxide, nitrogen dioxide, carbon monoxide, ozone and lead] because of their national-scale occurrence and public health significance.
These evaluations are presented as an Integrated Science Assessment (ISA), in which
Address for correspondence: Julie E. Goodman, Gradient, 20 University Road, Cambridge, MA 02138, USA. Tel: 617-395-5000. E-mail: [email protected]
Table 1. EPA NAAQS process for causal determination.
Step | Details
Literature searching | Specialized searches on specific topics; Review tables of contents of relevant journals; Identification by expert scientists; Review citations in previous assessments; Identification by the public and advisory committees; Iterative modification to optimize identification of pertinent publications
Selection of studies for inclusion | Peer-reviewed; EPA analyses of publicly-available data; Based on study quality; Considers policy-relevance of studies; Considers relevance to ambient exposures
Consideration of general limitations of each study type | Controlled human exposure; Epidemiology; Toxicology
Use of modified Bradford Hill aspects to aid in judging causality | Consistency; Coherence; Biological plausibility; Biological gradient; Strength of association; Experimental evidence; Temporality; Specificity; Analogy
Evaluate evidence for major health outcome categories |
Integrate evidence from across disciplines (controlled exposure, epidemiology, toxicology) and across health endpoints |
Incorporate peer and public comment and advice |
Weigh alternative views on controversial issues |
Characterize strength of evidence into causal conclusions | Causal relationship; Likely to be a causal relationship; Suggestive of a causal relationship; Inadequate to infer a causal relationship; Not likely to be a causal relationship
strength of the observed association, experimental evidence, temporality, specificity and analogy (Table 2). The original Bradford Hill criteria were developed mainly for the interpretation of epidemiology results and are not meant to be specific rules to follow. EPA modified them in the NAAQS causal framework for use with the broad array of study types that are considered in the ISAs (e.g. epidemiology, toxicology, mechanistic), to be more consistent with EPA’s Guidelines for Carcinogen Risk Assessment (US EPA, 2005). Regarding how the modified criteria, or “aspects” as EPA refers to them, are to be used in the framework, EPA states:
Although these aspects provide a framework for assessing
the evidence, they do not lend themselves to being
considered in terms of simple formulas or fixed rules of
evidence leading to conclusions about causality (Hill,
1965). For example, one cannot simply count the number
of studies reporting statistically significant results or
statistically nonsignificant results and reach credible
conclusions about the relative weight of the evidence and
the likelihood of causality. Rather, these aspects provide a
framework for systematic appraisal of the body of
evidence, informed by peer and public comment and
advice, which includes weighing alternative views on
controversial issues. (US EPA, 2013a)
The NAAQS causal framework suggests that evidence be evaluated for major health outcome categories (e.g. respiratory effects) and conclusions be drawn based on the integration of evidence from across health disciplines and across the spectrum of related health endpoints (e.g. health effects ranging from inflammation of the lungs to respiratory disease mortality). In drawing judgments, there is a focus on evidence of effects in the range of relevant human exposures to a given pollutant, with a general consideration of studies with doses or exposures in the range of no more than one or two orders of magnitude above current or ambient conditions.
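The exposure-range consideration described above amounts to a simple logarithmic comparison. The following minimal Python sketch computes how many orders of magnitude a study exposure lies above an ambient level and checks the result against a cutoff. The function names, the default cutoff of 2.0 and the example values are assumptions made for illustration; the framework presents this range as a general consideration, not as a fixed rule.

```python
import math


def orders_of_magnitude_above(study_exposure: float, ambient_exposure: float) -> float:
    """Return log10(study / ambient). Both values must be in the same units."""
    if study_exposure <= 0 or ambient_exposure <= 0:
        raise ValueError("exposures must be positive")
    return math.log10(study_exposure / ambient_exposure)


def within_relevant_range(
    study_exposure: float,
    ambient_exposure: float,
    max_orders: float = 2.0,  # assumed cutoff: "one or two orders of magnitude"
) -> bool:
    """True if the study exposure is no more than max_orders above ambient."""
    return orders_of_magnitude_above(study_exposure, ambient_exposure) <= max_orders


if __name__ == "__main__":
    # Hypothetical values: exposures expressed as multiples of an ambient level of 1.0.
    for study in (0.5, 50.0, 500.0):
        print(study, within_relevant_range(study, 1.0))
```

With these hypothetical inputs, a study at 50 times the ambient level (about 1.7 orders of magnitude) falls inside the range, whereas one at 500 times (about 2.7 orders) falls outside it.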
In discussing causal determinations, EPA characterizes the evidence on which the judgment is based, including the strength of evidence for individual endpoints within each major health outcome category. Based on these characterizations, conclusions regarding the WoE for causation are classified using a five-level hierarchy: “Causal relationship”; “Likely to be a causal relationship”; “Suggestive of a causal relationship”; “Inadequate to infer a causal relationship” and “Not likely to be a causal relationship” (Table 1). The framework also discusses several factors for consideration in delineating between adverse and non-adverse health effects resulting from exposure to air pollution.
Table 2. EPA aspects to aid in judging causality.
Aspect | Description
Consistency of the observed association | An inference of causality is strengthened when a pattern of elevated risks is observed across several independent studies. The reproducibility of findings constitutes one of the strongest arguments for causality. If there are discordant results among investigations, possible reasons such as differences in exposure, confounding factors, and the power of the study are considered.
Coherence | An inference of causality from one line of evidence (e.g. epidemiologic, controlled human exposure [clinical], or animal studies) may be strengthened by other lines of evidence that support a cause-and-effect interpretation of the association. The coherence of evidence from various fields greatly adds to the strength of an inference of causality. In addition, there may be coherence in demonstrating effects across multiple study designs or related health endpoints within one scientific line of evidence.
Biological plausibility | An inference of causality tends to be strengthened by consistency with data from experimental studies or other sources demonstrating plausible biological mechanisms. A proposed mechanistic linking between an effect and exposure to the agent is an important source of support for causality, especially when data establishing the existence and functioning of those mechanistic links are available.
Biological gradient (exposure–response relationship) | A well-characterized exposure–response relationship (e.g. increasing effects associated with greater exposure) strongly suggests cause and effect, especially when such relationships are also observed for duration of exposure (e.g. increasing effects observed following longer exposure times).
Strength of the observed association | The finding of large, precise risks increases confidence that the association is not likely due to chance, bias or other factors. However, it is noted that a small magnitude in an effect estimate may represent a substantial effect in a population.
Experimental evidence | Strong evidence for causality can be provided through “natural experiments” when a change in exposure is found to result in a change in occurrence or frequency of health or welfare effects.
Temporal relationship of the observed association | Evidence of a temporal sequence between the introduction of an agent, and appearance of the effect, constitutes another argument in favor of causality.
Specificity of the observed association | Evidence linking a specific outcome to an exposure can provide a strong argument for causation. However, it must be recognized that rarely, if ever, does exposure to a pollutant invariably predict the occurrence of an outcome, and that a given outcome may have multiple causes.
Analogy | Structure activity relationships and information on the agent’s structural analogs can provide insight into whether an association is causal. Similarly, information on mode of action for a chemical, as one of many structural analogs, can inform decisions regarding likely causality.
Source: US EPA (2013a, Table I).
As discussed above, the NAAQS causal framework is
largely based on an IOM (2008) WoE framework. The current
IOM framework has four categories of causal determination,
with only the top two designating evidence that establishes a
causal relationship (top category) or a relationship for which
causality is ‘‘at least as likely as not’’ (second category)
(Table 3). In contrast, as shown in Table 4, the NAAQS
causal framework has five categories of causal determination,
with the top three designating evidence that establishes a
causal relationship (i.e. causal, likely causal, or suggestive
of a causal relationship). Further, in contrast to the IOM
approach, the NAAQS causal framework states that only
one positive study is sufficient to establish a suggestive
causal relationship when the results of other studies are
inconsistent; this can greatly increase the frequency of
positive causal conclusions.
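The structural difference between the two hierarchies can be made concrete with a small sketch. The Python enumerations below encode the category names from Tables 3 and 4. The grouping of categories into “positive” determinations follows the characterization given in the paragraph above and is an assumption of this sketch, not a definition supplied by EPA or IOM; the function names are likewise hypothetical.

```python
from enum import IntEnum


class NAAQSDetermination(IntEnum):
    """Five-level NAAQS hierarchy (Table 4), ordered weakest to strongest."""
    NOT_LIKELY = 1
    INADEQUATE = 2
    SUGGESTIVE = 3
    LIKELY = 4
    CAUSAL = 5


class IOMDetermination(IntEnum):
    """Four-level IOM hierarchy (Table 3), ordered weakest to strongest."""
    AGAINST = 1
    BELOW_EQUIPOISE = 2
    EQUIPOISE_AND_ABOVE = 3
    SUFFICIENT = 4


def naaqs_is_positive(d: NAAQSDetermination) -> bool:
    # Assumption from the text: the top three NAAQS categories count as positive.
    return d >= NAAQSDetermination.SUGGESTIVE


def iom_is_positive(d: IOMDetermination) -> bool:
    # Assumption from the text: only the top two IOM categories count as positive.
    return d >= IOMDetermination.EQUIPOISE_AND_ABOVE


if __name__ == "__main__":
    print(sum(naaqs_is_positive(d) for d in NAAQSDetermination), "of", len(NAAQSDetermination))
    print(sum(iom_is_positive(d) for d in IOMDetermination), "of", len(IOMDetermination))
```

Under this grouping, three of the five NAAQS categories but only two of the four IOM categories designate a positive finding, which is the asymmetry the comparison above turns on.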
Table 4. EPA’s weight of evidence for causal determination.
Causal determination | Health effects
Causal relationship | Evidence is sufficient to conclude that there is a causal relationship with relevant pollutant exposures (i.e. doses or exposures generally within one to two orders of magnitude of current levels). That is, the pollutant has been shown to result in health effects in studies in which chance, bias, and confounding could be ruled out with reasonable confidence. For example: (a) controlled human exposure studies that demonstrate consistent effects; or (b) observational studies that cannot be explained by plausible alternatives or are supported by other lines of evidence (e.g. animal studies or mode of action information). Evidence includes multiple high-quality studies.
Likely to be a causal relationship | Evidence is sufficient to conclude that a causal relationship is likely to exist with relevant pollutant exposures, but important uncertainties remain. That is, the pollutant has been shown to result in health effects in studies in which chance and bias can be ruled out with reasonable confidence but potential issues remain. For example: (a) observational studies show an association, but copollutant exposures are difficult to address and/or other lines of evidence (controlled human exposure, animal, or mode of action information) are limited or inconsistent; or (b) animal toxicological evidence from multiple studies from different laboratories that demonstrate effects, but limited or no human data are available. Evidence generally includes multiple high-quality studies.
Suggestive of a causal relationship | Evidence is suggestive of a causal relationship with relevant pollutant exposures, but is limited. For example, (a) at least one high-quality epidemiologic study shows an association with a given health outcome but the results of other studies are inconsistent; or (b) a well-conducted toxicological study, such as those conducted in the National Toxicology Program (NTP), shows effects in animal species.
Inadequate to infer a causal relationship | Evidence is inadequate to determine that a causal relationship exists with relevant pollutant exposures. The available studies are of insufficient quantity, quality, consistency, or statistical power to permit a conclusion regarding the presence or absence of an effect.
Not likely to be a causal relationship | Evidence is suggestive of no causal relationship with relevant pollutant exposures. Several adequate studies, covering the full range of levels of exposure that human beings are known to encounter and considering at-risk populations, are mutually consistent in not showing an effect at any level of exposure.
Source: US EPA (2013a, Table II).
Table 3. IOM recommended categories for the level of evidence for causation.
Causal determination | Evidence
Sufficient | The evidence is sufficient to conclude that a causal relationship exists. For example: (a) replicated and consistent evidence of an association from several high-quality epidemiologic studies that cannot be explained by plausible noncausal alternatives (e.g. chance, bias or confounding); or (b) evidence of causation from animal studies and mechanistic knowledge; or (c) compelling evidence from animal studies and strong mechanistic evidence from studies in exposed humans, consistent with (i.e. not contradicted by) the epidemiologic evidence.
Equipoise and above | The evidence is sufficient to conclude that a causal relationship is at least as likely as not, but not sufficient to conclude that a causal relationship exists. For example: (a) evidence of an association from the preponderance of several high-quality epidemiologic studies that cannot be explained by plausible noncausal alternatives (e.g. chance, bias or confounding) as well as animal evidence and biological knowledge consistent with a causal relationship; or (b) strong evidence from animal studies or mechanistic evidence that is not contradicted by human or other evidence.
Below equipoise | The evidence is not sufficient to conclude that a causal relationship is at least as likely as not, or is not sufficient to make a scientifically informed judgment. For example: (a) consistent human evidence of an association that is limited by the inability to rule out chance, bias, or confounding with confidence, and weak animal or mechanistic evidence; or (b) animal evidence suggestive of a causal relationship, but weak or inconsistent human and mechanistic evidence; or (c) mechanistic evidence suggestive of a causal relationship, but weak or inconsistent animal and human evidence; or (d) the evidence base is very thin.
Against | The evidence suggests the lack of a causal relationship. For example: (a) consistent human evidence of no causal association from multiple studies covering the full range of exposures encountered by humans; or (b) animal or mechanistic evidence supportive of a lack of a causal relationship.
There are many valuable features in the current NAAQS causal framework. These include literature searching that is iterative, consideration of modified Bradford Hill criteria to aid in judgments regarding causal determinations, weighing alternative views on controversial issues, integration of judgments of the evidence across disciplines and endpoints, focusing on evidence of health effects in the range of human-relevant exposures, consideration of the adversity of health outcomes and a hierarchy for categorization of the strength of the evidence.
Some aspects of the NAAQS causal framework could be more specific, however, and the framework does not explicitly include several features that, we would argue, are critical for a thorough and internally consistent WoE evaluation. For example, there is no explicit guidance for literature search strategies or determining study exclusion criteria. There is also no explicit guidance for evaluating the strengths and weaknesses of studies and no information on how to use study quality criteria to assign a quality weight to individual studies. There are no clear statements regarding how the Bradford Hill criteria should be applied (e.g. there is no indication of what constitutes a strong association) or how the causality judgments should consider all criteria jointly. There is also no guidance on how to evaluate alternative hypotheses that are supported by the data (e.g. a given substance is not a causal factor for a health effect and apparent associations are explicable by other factors) or what criteria should be used to determine which hypothesis is most supported by the data. As indicated in the NAS roadmap, EPA needs to state the justification for its judgments explicitly – including the reasoning behind judgments – to aid transparency.
Because of these and other limitations discussed below, EPA’s NAAQS causal framework has not always been applied consistently in evaluations of causality for individual substances – much less across substances – and adequate transparency regarding the justification of specific causal judgments is not achieved in the criteria air pollutant ISAs. We provide specific examples of this with a case study of the ozone ISA (US EPA, 2013a) below.
WoE best practices
We recently conducted a survey of existing WoE frameworks, including the NAAQS causal framework, to evaluate WoE best practices (Rhomberg et al., 2013). From these frameworks, we identified four successive phases to the overall WoE process that are consistent with the critical steps for the development of a scientifically sound assessment of the WoE for causation, as identified by the NAS panel that reviewed EPA’s formaldehyde assessment. These critical steps should be considered by EPA in developing an updated NAAQS framework. Below, we describe these phases; Table 5 presents their general features.
Table 5. Weight-of-evidence best practices.
Phase | Steps
Phase 1: Define the causal question and develop criteria for study selection | Define causal question or hypothesis; Define criteria for study inclusion and exclusion; Plan literature search; Design literature search strategies; Screen and select studies, with justification for exclusion; Record search strategies, results, and decisions
Phase 2: Develop and apply criteria for review of individual studies | Extract study characteristics; Extract data; Assess study quality; Categorize study quality; Assess individual study results; Categorize study relevance and adequacy
Phase 3: Integrate and evaluate evidence | Evaluate data within and across realms of evidence (e.g. toxicology, epidemiology, MoA); Assess MoA; Assess adversity of effects; Compare alternative accounts of observations; Formulate WoE conclusions; Identify data gaps and propose next steps
Phase 4: Draw conclusions based on inferences | Propose categories for causal relationships; Propose recommendations for risk assessment; Propose recommendations for research or policies
Source: Rhomberg et al. (2013).
Phase 1. Define the causal question and develop criteria for study selection. Frame the purpose of the evaluation and the causal questions to be evaluated, and define the criteria for selecting studies relevant to the evaluation to ensure transparency. Prior to this phase, there is a separate role for problem formulation, which could be considered as “Phase 0” and entails the evaluation of what is needed to address the decisions to be made, along with an assessment of the capacity of an analysis of available data to provide answers addressing those needs.
Phase 2. Develop and apply criteria for review of individual studies. Conduct and present a systematic and consistent review of available studies relevant to the causal question. Evaluate the rigor and quality of individual study results using pre-defined criteria applied uniformly across studies.
Phase 3. Integrate and evaluate evidence. Make sound and defensible scientific judgments about the existence and nature of causative processes for the health outcome under consideration. This is one of the more challenging phases for any WoE framework; no matter how one lays out procedures and methods for synthesizing across studies, in the end, the question is about how studies in one setting (e.g. animal or in vitro assays) should affect our assessment of potential causality or risks in another (e.g. the general human population exposed environmentally).
Phase 4. Draw conclusions based on inferences. Apply the results of the WoE evaluation from Phase 3 to make conclusions that can be used to inform regulatory decision-making. Although this phase is not risk management itself, it can be influenced by risk management considerations. In a regulatory setting, decisions about WoE categories (“known” causative agent, “likely” causative agent, etc.) or findings about the science [sufficient evidence for a mode of action (MoA) or to replace a default assumption for developing a toxicity value] are influenced in this stage by policy questions and regulatory consequences for those decisions and, ultimately, by policies and judgments about the sufficiency of evidence to support those decisions.
The most important aspect of a WoE framework is that it
provides specific and transparent guidance for how to work
through the WoE process. At the same time, the framework
needs to be flexible so that it can be applied in a consistent
manner across different types of datasets for evaluations of
causality. A WoE evaluation is only useful and applicable to
constructive scientific debate if the logic behind it is made
clear, and, with that, it is often necessary to take the reader
through alternative interpretations of the data so that the
various interpretations can be compared logically. This
approach does not eliminate the need for scientific judgment,
and often may not lead to a definitive choice of one
interpretation over the other, but it will clearly lay out the
logic for how one weighs the evidence for and against each
interpretation. Only in this way is it possible to have
constructive scientific debate about potential causality that
is focused on an organized, logical “weighing” of the
evidence.
The NAAQS causal framework provides guidance on
many features of the WoE phases noted above. Some of them
are explicit and some are clearly included in the framework,
but specific guidance is not always provided (e.g. study
exclusion criteria). As a result, evaluations are not always
consistent either among studies for a particular criteria
pollutant or among pollutants, as discussed below in the
case study of ozone.
Table 5 outlines WoE best practices generally and Table 6 outlines features of the NAAQS causal framework, as well as additional features that we recommend based on best practices. These additions will enable the framework to include these WoE best practices and help ensure that they are applied consistently. Several of the recommended features may have, in practice, been incorporated into WoE evaluations in the criteria pollutant ISAs; however, if they are not explicitly stated as guidance in the discussion of the NAAQS causal framework in the ISA Preamble, or if they are discussed as occurring in a Phase that does not match where they should occur according to WoE best practices, they are still noted in Table 6 as additional, best practice features. Below, we discuss these features by phase.
Phase 1: Define the causal question and develop criteria for study selection
The causal question and breadth of scope should be explicitly
stated in Phase 1 of a WoE evaluation to provide a clear
purpose for the assessment as well as direction for the
remaining steps. Defining the causal question at the beginning
of the process helps ensure that the correct question is posed,
and it sets the context within which the bearing and utility of
studies can be evaluated. Study selection criteria and the
literature search design should be articulated clearly (Tables 5
and 6). The end product of Phase 1 is a clearly defined causal
question, a documented search strategy with clear inclusion
and exclusion criteria, a list of included and excluded studies
with reasons for inclusion or exclusion, and a record of any
deviations from the original plan.
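One way to keep these Phase 1 end products together is a simple screening log. The Python sketch below is a hypothetical illustration only: the class names, field names and the two example criteria (peer review and study type, both taken from the selection considerations listed in Table 1) are assumptions, and an actual assessment would define its own criteria in advance.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Study:
    citation: str
    peer_reviewed: bool
    study_type: str


@dataclass
class ScreeningDecision:
    citation: str
    included: bool
    reason: str  # justification recorded for inclusion or exclusion


# A criterion is a (name, predicate) pair; a study must pass every predicate.
Criterion = Tuple[str, Callable[[Study], bool]]

# Hypothetical example criteria, not an official EPA list.
ELIGIBLE_TYPES = {"epidemiology", "toxicology", "controlled human exposure"}
EXAMPLE_CRITERIA: List[Criterion] = [
    ("peer-reviewed", lambda s: s.peer_reviewed),
    ("eligible study type", lambda s: s.study_type in ELIGIBLE_TYPES),
]


@dataclass
class ScreeningLog:
    causal_question: str
    search_strategy: str
    decisions: List[ScreeningDecision] = field(default_factory=list)
    deviations: List[str] = field(default_factory=list)

    def screen(self, study: Study, criteria: List[Criterion]) -> ScreeningDecision:
        """Apply the same criteria to every study and record the outcome."""
        for name, passes in criteria:
            if not passes(study):
                decision = ScreeningDecision(
                    study.citation, False, f"did not meet criterion: {name}"
                )
                break
        else:
            decision = ScreeningDecision(study.citation, True, "met all inclusion criteria")
        self.decisions.append(decision)
        return decision

    def record_deviation(self, note: str) -> None:
        """Document any departure from the original search or selection plan."""
        self.deviations.append(note)
```

Because every study passes through the same `screen` call, the log holds a reason for each inclusion and exclusion, and departures from the plan are recorded separately in `deviations`.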
In the NAAQS causal framework, the definition of the causal question is inferred from the purpose of the ISA in the NAAQS process, which is to provide a “concise review, synthesis, and evaluation of the most policy-relevant science to serve as a scientific foundation for the review of the [NAAQS]” (US EPA, 2013a). EPA does not, but in our view should, explicitly resolve the overarching causal question into specific issues for problem formulation. For example, EPA could specify aspects of the analysis regarding what is relevant to existing human exposures, the basis for what constitutes sensitive subpopulations, whether exposure to other agents affects response, and interactions with other stressors. In other words, specifying whether the purpose is a “hazard in context” decision or a more general identification of a potential hazard at any exposure level would help to guide decisions on whether or not it is found in the actual population. Problem formulation should try to identify the specific sub-judgments that need to be made to support the overall goal, with the aim of enabling assessment of whether the available data speak to and enable sound decisions on them.
With regard to the literature search, the Preamble of the
ozone ISA provides an overview of EPA’s general literature
search strategy and the various ways in which studies are
identified for potential inclusion. Some of the study inclusion
criteria are also presented, and an evaluation of study quality
is included in the study selection process. However, the
Preamble does not provide guidance regarding study exclusion
criteria or justification for inclusion or exclusion of
studies. As shown in Table 6, the NAAQS causal framework
should include explicit guidance regarding literature search
strategies for identifying all available evidence, as well as
criteria for both study inclusion and exclusion (e.g. study
type, types of participants, exposures, outcomes). While
relevance to ambient exposures to criteria pollutants is
important to consider at this point, allowances may be made
for scientific information that does not meet inclusion criteria
Declaration of interest
The authors are employed by Gradient, a private environmental
consulting firm. The Gradient staff has strong
expertise in assessing human, experimental animal, and
mechanistic data in WoE analyses (as is evident in recent
evaluations conducted for bisphenol A, naphthalene, formaldehyde,
chlorpyrifos, methanol, styrene, nickel, and toluene
diisocyanate) and has presented several of these analyses to
regulatory bodies. In addition, the Gradient staff, including
the authors of this paper, have carefully evaluated the science
underlying EPA’s review of various NAAQS and offered both
oral and written testimony to EPA. Gradient has also
addressed issues on systematic review and integration of
evidence for a number of clients. The work reported in this
paper was conducted by the authors during the normal course
of employment at Gradient with financial support provided by
the American Petroleum Institute (API). API is the major
trade association of corporations in the petroleum sector, from
discovery through production and refining. Drafts of this
paper were reviewed by members or affiliates of API. The
authors have the sole responsibility for the writing, contents,
and conclusions in this paper. The conclusions are not
necessarily those of API.
References
Abbey DE, Nishino N, McDonnell WF, et al. (1999). Long-term inhalable particles and other air pollutants related to mortality in nonsmokers. Am J Respir Crit Care Med, 159, 373–82.
Adami HO, Berry SC, Breckenridge C, et al. (2011). Toxicology and epidemiology: Improving the science with a framework for combining toxicological and epidemiological evidence to establish causal inference. Toxicol Sci, 122, 223–34.
Adams WC. (2006). Comparison of chamber 6.6-h exposures to 0.04–0.08 ppm ozone via square-wave and triangular profiles on pulmonary responses. Inhal Toxicol, 18, 127–36.
Bachmann J. (2007). Will the circle be unbroken: a history of the US National Ambient Air Quality Standards. J Air Waste Manage Assoc, 57, 652–97.
Bell ML, Kim JY, Dominici F. (2007). Potential confounding of particulate matter on the short-term association between ozone and mortality in multisite time-series studies. Environ Health Perspect, 115, 1591–5.
Boffetta P, McLaughlin JK, La Vecchia C, et al. (2008). False-positive results in cancer epidemiology: a plea for epistemological modesty. J Natl Cancer Inst, 100, 988–95.
Borgert CJ, Mihaich EM, Ortego LS, et al. (2011). Hypothesis-driven weight of evidence framework for evaluating data within the US EPA’s Endocrine Disruptor Screening Program. Regul Toxicol Pharmacol, 61, 185–91.
Brauer M, Blair J, Vedal S. (1996). Effect of ambient ozone exposure on lung function in farm workers. Am J Respir Crit Care Med, 154, 981–7.
Brown JS, Bateson TF, McDonnell WF. (2008). Effects of exposure to 0.06 ppm ozone on FEV1 in humans: a secondary analysis of existing data. Environ Health Perspect, 116, 1023–6.
Brown JS. [US EPA, National Center for Environmental Assessment (NCEA)]. (2007). Memo to Ozone NAAQS Review Docket (OAR-2005-0172) re: the effects of ozone on lung function at 0.06 ppm in healthy adults. 8p., June 14.
Chan CC, Wu TH. (2005). Effects of ambient ozone exposure on mail carriers’ peak expiratory flow rates. Environ Health Perspect, 113, 735–8.
Clyde M. (2000). Model uncertainty and health effect studies for particulate matter. Environmetrics, 11, 745–63.
Dockery DW, Pope CA, Xu X, et al. (1993). An association between air pollution and mortality in six U.S. cities. N Engl J Med, 329, 1753–9.
European Centre for Ecotoxicology and Toxicology of Chemicals (ECETOC). (2009). Framework of the Integration of Human and Animal Data in Chemical Risk Assessment. ECETOC Technical Report No. 104. 130p., January.
Franklin M, Schwartz J. (2008). The impact of secondary particles on the association between ambient ozone and mortality. Environ Health Perspect, 116, 453–8.
Frey HC, Samet JM. (2011). Letter to L. Jackson re: CASAC Review of EPA’s ‘‘Integrated Science Assessment for Lead (First External Review Draft – May 2011)’’. Clean Air Scientific Advisory Committee (CASAC). 120p., December 9. [Online] Available from: http://yosemite.epa.gov/sab/sabproduct.nsf/0/D3E2E8488025344D852579610068A8A1/$File/EPA-CASAC-12-002-unsigned.pdf [last accessed 29 June 2012].
Gent JF, Triche EW, Holford TR, et al. (2003). Association of low-level ozone and fine particles with respiratory symptoms in children with asthma. JAMA, 290, 1859–67.
Girardot SP, Ryan PB, Smith SM, et al. (2006). Ozone and PM2.5 exposure and acute pulmonary health effects: a study of hikers in the Great Smoky Mountains National Park. Environ Health Perspect, 114, 1044–52.
Goodman JE, Dodge DG, Bailey LA. (2010). A framework for assessing causality and adverse effects in humans with a case study of sulfur dioxide. Regul Toxicol Pharmacol, 58, 308–22.
Goodman JE, Sax SN. (2012). Comments on the Integrated Science Assessment for Ozone and Related Photochemical Oxidants (Third External Review Draft). 125p. Submitted to EPA on 15 August 2012. Docket ID: EPA-HQ-ORD-2011-0050.
Health Effects Institute (HEI). (2003). Revised analyses of time-series studies of air pollution and health. 306p., May.
Higgins JPT, Green S. (2011). Cochrane handbook for systematic reviews of interventions (Version 5.1.0). The Cochrane Collaboration. March. Available from: http://www.cochrane-handbook.org/ [last accessed 03 Feb 2012].
Hill AB. (1965). The environment and disease: association or causation? Proc R Soc Med, 58, 295–300.
Hoppe P, Praml G, Rabe G, et al. (1995). Environmental ozone field study on pulmonary and subjective responses of assumed risk groups. Environ Res, 71, 109–21.
Institute of Medicine (IOM). (2008). Improving the presumptive disability decision-making process for veterans. Committee on Evaluation of the Presumptive Disability Decision-Making Process for Veterans, Board on Military and Veterans Health, National Academies Press. 781p.
International Agency for Research on Cancer (IARC). (2006). Preamble to the IARC monographs on the evaluation of carcinogenic risks to humans. 27p. Available from: http://monographs.iarc.fr/ENG/Preamble/CurrentPreamble.pdf [last accessed 20 Sep 2012].
Ito K. (2003). Associations of particulate matter components with daily mortality and morbidity in Detroit, Michigan. In: Revised analyses of time-series studies of air pollution and health. Boston, MA: Health Effects Institute, 143–56.
Ito K, De Leon SF, Lippmann M. (2005). Associations between ozone and daily mortality: analysis and meta-analysis. Epidemiology, 16, 446–57.
Jerrett M, Burnett RT, Pope CA, et al. (2009). Long-term ozone exposure and mortality. N Engl J Med, 360, 1085–95.
Jerrett M, Burnett RT, Ma R, et al. (2005). Spatial analysis of air pollution and mortality in Los Angeles. Epidemiology, 16, 727–36.
Jurek AM, Greenland S, Maldonado G. (2008). How far from non-differential does exposure or disease misclassification have to be to bias measures of association away from the null? Int J Epidemiol, 37, 382–5.
Jurek AM, Greenland S, Maldonado G, Church TR. (2005). Proper interpretation of non-differential misclassification effects: expectations vs. observations. Int J Epidemiol, 34, 680–7.
Kamps AW, Roorda RJ, Brand PL. (2001). Peak flow diaries in childhood asthma are unreliable. Thorax, 56, 180–2.
Katsouyanni K, Samet JM, Anderson HR, et al. (2009). Air pollution and health: a European and North American Approach (APHENA). HEI Research Report 142. 132p., October 29.
Kim CS, Alexis NE, Rappold AG, et al. (2011). Lung function and inflammatory responses in healthy young adults exposed to 0.06 ppm ozone for 6.6 hours. Am J Respir Crit Care Med, 183, 1215–21.
Klimisch HJ, Andreae M, Tillmann U. (1997). A systematic approach for evaluating the quality of experimental toxicological and ecotoxicological data. Regul Toxicol Pharmacol, 25, 1–5.
Koop G, McKitrick R, Tole L. (2010). Air pollution, economic activity and respiratory illness: evidence from Canadian cities, 1974–1994. Environ Model Softw, 25, 873–85.
Koop G, Tole L. (2004). Measuring the health effects of air pollution: to what extent can we really say that people are dying from bad air? J Environ Econ Manage, 47, 30–54.
Korrick SA, Neas LM, Dockery DW, et al. (1998). Effects of ozone and other pollutants on the pulmonary function of adult hikers. Environ Health Perspect, 106, 93–9.
Lavelle KS, Schnatter AR, Travis KZ, et al. (2012). Framework for integrating human and animal data in chemical risk assessment. Regul Toxicol Pharmacol, 62, 302–12.
Lefohn AS, Hazucha MJ, Shadwick D, Adams WC. (2010). An alternative form and level of the human health ozone standard. Inhal Toxicol, 22, 999–1011.
Lipfert FW, Perry HM, Miller JP, et al. (2000). The Washington University-EPRI Veterans’ Cohort Mortality Study: preliminary results. Inhal Toxicol, 12, 41–73.
Lipsett MJ, Ostro BD, Reynolds P, et al. (2011). Long-term exposure to air pollution and cardiorespiratory disease in the California teachers study cohort. Am J Respir Crit Care Med, 184, 828–35.
McClellan RO. (2012). Role of science and judgment in setting national ambient air quality standards: how low is low enough? Air Qual Atmos Health, 5, 243–58.
Miller KA, Siscovick DS, Sheppard L, et al. (2007). Long-term exposure to air pollution and incidence of cardiovascular events in women. N Engl J Med, 356, 447–58.
Moolgavkar SH, McClellan RO, Dewanji A, et al. (2013). Time-series analyses of air pollution and mortality in the United States: a subsampling approach. Environ Health Perspect, 121, 73–8.
Mortimer KM, Neas LM, Dockery DW, et al. (2002). The effect of air pollution on inner-city children with asthma. Eur Respir J, 19, 699–705.
Naeher LP, Holford TR, Beckett WS, et al. (1999). Healthy women’s PEF variations with ambient summer concentrations of PM10, PM2.5, SO42-, H+, and O3. Am J Respir Crit Care Med, 160, 117–25.
National Research Council (NRC). (2011). Review of the Environmental Protection Agency’s Draft IRIS Assessment of Formaldehyde. National Academies Press. 194p., April. Available from: http://www.nap.edu/catalog/13142.html [last accessed 08 April 2011].
Neas LM, Dockery DW, Koutrakis P, Speizer FE. (1999). Fine particles and peak flow in children: Acidity versus mass. Epidemiology, 10, 550–3.
Nicolich M. (2007). Letter Report to US EPA, Air and Radiation Docket. Re: Comments on EPA’s Proposed ‘‘National Ambient Air Quality Standards for Ozone’’, 72 Fed Reg. 37,818 (11 July 2007). Docket No. EPA-HQ-OAR-2005-0172. October 9.
O’Connor GT, Neas L, Vaughn B, et al. (2008). Acute respiratory health effects of air pollution on children with asthma in US inner cities. J Allergy Clin Immunol, 121, 1133–9.
Rhomberg LR, Bailey LA, Goodman JE, et al. (2011a). Is exposure to formaldehyde in air causally associated with leukemia? – a hypothesis-based weight-of-evidence analysis. Crit Rev Toxicol, 41, 555–621.
Rhomberg LR, Bailey LA, Goodman JE. (2010). Hypothesis-based weight of evidence: a tool for evaluating and communicating uncertainties and inconsistencies in the large body of evidence in proposing a carcinogenic mode of action – naphthalene as an example. Crit Rev Toxicol, 40, 671–96.
Rhomberg LR, Chandalia JK, Long CM, Goodman JE. (2011b). Measurement error in environmental epidemiology and the shape of exposure–response curves. Crit Rev Toxicol, 41, 651–71.
Rhomberg LR, Goodman JE, Bailey LA, et al. (2013). A survey of frameworks for best practices in weight-of-evidence analyses. Crit Rev Toxicol. DOI: 10.3109/10408444.2013.832727.
Romieu I, Meneses F, Ramirez M, et al. (1998). Antioxidant supplementation and respiratory functions among workers exposed to high levels of ozone. Am J Respir Crit Care Med, 158, 226–32.
Sarnat JA, Brown KW, Schwartz J, et al. (2005). Ambient gas concentrations and personal particulate matter exposures: implications for studying the health effects of particles. Epidemiology, 16, 385–95.
Schelegle ES, Morales CA, Walby WF, et al. (2009). 6.6-Hour inhalation of ozone concentrations from 60 to 87 parts per billion in healthy humans. Am J Respir Crit Care Med, 180, 265–72.
Smith RL, Xu B, Switzer P. (2009). Reassessing the relationship between ozone and short-term mortality in U.S. urban communities. Inhal Toxicol, 21, 37–61.
Spencer-Hwang R, Knutsen SF, Soret S, et al. (2011). Ambient air pollutants and risk of fatal coronary heart disease among kidney transplant recipients. Am J Kidney Dis, 58, 608–16.
Stieb DM, Szyszkowicz M, Rowe BH, Leech JA. (2009). Air pollution and emergency department visits for cardiac and respiratory conditions: A multi-city time-series analysis. Environ Health, 8, 25.
Stylianou M, Nicolich MJ. (2009). Cumulative effects and threshold levels in air pollution mortality: Data analysis of nine large US cities using the NMMAPS dataset. Environ Pollut, 157, 2216–23.
Swaen G, van Amelsvoort L. (2009). A weight of evidence approach to causal inference. J Clin Epidemiol, 62, 270–7.
Taubes G. (1995). Epidemiology faces its limits. Science, 269, 164–9.
US EPA. (2005). Guidelines for carcinogen risk assessment. Risk Assessment Forum, EPA/630/P-03/001F. March.
US EPA. (2006). Air quality criteria for ozone and related photochemical oxidants. Volume I. EPA 600/R-05/004aF. 821p., February.
US EPA. (2008a). Integrated science assessment for oxides of nitrogen. EPA/600/R-08/071. 260p., July.
US EPA. (2008b). Integrated science assessment for sulfur oxides – health criteria. EPA/600/R-08/047F. September.
US EPA. (2009). Integrated science assessment for particulate matter (final). EPA/600/R-08/139F. 2228p., December.
US EPA. (2010a). Integrated science assessment for carbon monoxide. EPA/600/R-09/019F. 593p., January.
US EPA. (2010b). Toxicological review of formaldehyde – inhalation assessment (CAS No. 50-00-0) in support of summary information on the integrated risk information system (IRIS). Volumes I–IV (Draft). EPA/635/R-10/002A. 1043p., June.
US EPA. (2012). Integrated science assessment for lead (third external review draft). EPA/600/R-10/075C. 1762p., November.
US EPA. (2013a). Integrated science assessment for ozone and related photochemical oxidants (final). EPA/600/R-10/076F. 1251p., February.
US EPA. (2013b). Process of reviewing the national ambient air quality standards. 2p., June. Available from: http://www.epa.gov/ttn/naaqs/review.html [last accessed 19 Aug 2013].
Wacholder S, Hartge P, Lubin JH, Dosemeci M. (1995). Non-differential misclassification and bias towards the null: a clarification [letter to the editor]. Occup Environ Med, 52, 557.
Xia Y, Tong H. (2006). Cumulative effects of air pollution on public health. Stat Med, 25, 3548–59.
Zanobetti A, Schwartz J. (2008). Is there adaptation in the ozone mortality relationship: a multi-city case-crossover analysis. Environ Health, 7, 22.
Zanobetti A, Schwartz J. (2011). Ozone and survival in four cohorts with potentially predisposing diseases. Am J Respir Crit Care Med, 184, 836–41.