Water Quality Models for Supporting Shellfish Harvesting Area Management by Andrew David Gronewold Department of Environmental Sciences and Policy Duke University Date: Approved: Dr. Kenneth Reckhow, co-supervisor Dr. Robert Wolpert, co-supervisor Dr. Rachel Noble Dr. William Kirby-Smith Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Environmental Sciences and Policy in the Graduate School of Duke University 2009
Water Quality Models for Supporting
Shellfish Harvesting Area Management
by
Andrew David Gronewold
Department of Environmental Sciences and Policy
Duke University
Date:
Approved:
Dr. Kenneth Reckhow, co-supervisor
Dr. Robert Wolpert, co-supervisor
Dr. Rachel Noble
Dr. William Kirby-Smith
Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy
in the Department of Environmental Sciences and Policy in the Graduate School of
Duke University
2009
ABSTRACT
Water Quality Models for Supporting
Shellfish Harvesting Area Management
by
Andrew David Gronewold
Department of Environmental Sciences and Policy
Duke University
Date:
Approved:
Dr. Kenneth Reckhow, co-supervisor
Dr. Robert Wolpert, co-supervisor
Dr. Rachel Noble
Dr. William Kirby-Smith
An abstract of a dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy
in the Department of Environmental Sciences and Policy in the Graduate School of
B North Carolina Shellfish Harvesting Area Water Quality Standards
Bibliography
Biography
List of Figures
2.1 Graphical representation of critical environmental system response variables and potential model endpoints. Management decisions are indicated by boxes, and variables are represented by rounded nodes.
2.2 Graphical representation of assumed system variables and causal relationships for terrestrial fate and transport of fecal indicator bacteria. Management decisions are indicated by boxes, and variables are represented by rounded nodes.
2.3 Graphical representation of environmental processes and system variables affecting aquatic fate and transport of fecal indicator organisms. Management decisions are indicated by boxes, and variables are represented by rounded nodes.
2.4 Comprehensive graphical network of fecal contamination in designated resource waters. Management decisions are indicated by boxes, and variables are represented by rounded nodes.
3.2 Graphical representation of Bayes’ theorem indicating prior and posterior probability densities, and the normalized likelihood for a water quality standard.
3.3 Graphical representation of environmental variables and processes associated with fecal contamination in tidal shellfish harvesting areas. Management decisions are indicated by boxes, and variables are represented by rounded nodes.
3.4 Graphical submodel relating precipitation events, tidal dynamics, and water quality. Probabilities for all variable states are based on monitoring data collected between 1994 and 1997 at a cluster of monitoring stations in the upper reaches of the Newport River Estuary, North Carolina.
3.5 Conditional probability distribution table for fecal coliform MPN node. For each of the three states of the MPN node, each row indicates the marginal probability of the node being in that state given the state of the three causal variables. For example, the probability that the MPN is less than 14 organisms per 100 ml, given that the tide is rising, the most recent rainfall was less than one inch, and that it has been less than four days since the most recent rain event, is 0.667.
3.6 Graphical submodel relating precipitation events, tidal dynamics, and water quality. Probabilities for fecal coliform MPN states are conditional upon long-term average precipitation and tidal conditions in the upper reaches of the Newport River Estuary, North Carolina.
4.1 Prior and posterior distributions for σk for five randomly selected stations in the Newport River using the three priors in table 4.4. Each row utilizes the same prior distribution, and each column represents a separate station. Vertical gray lines are added to facilitate comparison between alternative priors for each station.
4.2 Combinations of the mean µc and standard deviation σc of the log-transformed fecal coliform concentration distribution which yielded MPN (solid lines) or CFU (dashed lines) samples in violation of the NSSP median standard (panel a), geometric mean standard (panel b), 90th percentile standard (panel c), or any standard (panel d) with a frequency of either 0.005 or 0.1. The zone of violations is in the upper right of each panel.
4.3 Relationship between the mean µc and standard deviation σc of the log-transformed fecal coliform concentration distribution and simulated violation of any CFU-based water quality standard (dashed lines) and any MPN-based water quality standard (solid lines) for possible values of the negative binomial dispersion parameter α. Panels a and b indicate µc − σc pairs expected to violate standards with a frequency of 0.1 and 0.005, respectively.
4.4 Log-likelihood (solid line) of transformation parameter γ for σc using paired values of µc and σc. Panel a based on values from table 4.2 for σc > 0.65, panel b based on values from table 4.2 for σc ≤ 0.65, panel c based on values from table 4.3 for σc > 0.65, and panel d based on values from table 4.3 for σc ≤ 0.65.
4.5 Violation contour lines overlaid by violation line best-fit regression model fitted values based on model parameters in table 4.5.
4.6 Joint posterior probability density contour lines (solid lines) for four monitoring stations in the Newport River Estuary. Dashed lines indicate combinations of the mean µc and standard deviation σc of the log-transformed fecal coliform concentration distribution which violate concentration-based standards no more than 0.5% of the time using MPN or CFU standards as the reference. Confidences of compliance (CC) are given in the lower left of each panel for both MPN and CFU-based standards.
5.1 Expected values and 95% prediction sets or prediction intervals for observable fecal coliform MPN (panel A) and CFU (panel B) measurements given the true fecal coliform concentration in organisms per 100 ml. For clarity, expected values and 95% prediction sets or intervals are plotted only for every 5th integer-valued concentration c. Maximum true concentrations in each plot are based on maximum MPN and CFU observations in the NCDENR-SSS data set. CFU prediction intervals are based on an MF sample aliquot volume of 100 ml.
5.2 Expected value and 95% credible intervals for the fecal coliform true concentration given MPN (panel A) and CFU (panel B) estimates in organisms per 100 ml. For clarity, panel A includes only the 51 observable MPN estimates presented in standard laboratory analysis MTF conversion tables for the 5-tube serial dilution analysis procedure (see, e.g. Woodward, 1957) and panel B includes only every 5th observable CFU value based on an MF test with a sample aliquot volume of 100 ml.
5.3 Empirical linear regression model (panel A) and theoretical probability model (panel B) of the relationship between fecal coliform MPN and CFU estimates from the same water quality sample.
5.4 Observed values, expected values, and the theoretical probability mass function of the MPN for a CFU measurement from the same water quality sample. Observed values are from recent NCDENR-SSS study.
6.1 Estimated inner quartile (50%, thick black line) and 95% intervals (thin black line) for each model parameter based on samples of size 10, 25, or 100. Vertical gray lines indicate the parameter value used to simulate data. Dots (solid and hollow) indicate median values. For each sample size, the upper line (with solid circle) represents the parameter estimate based on using the MPN point estimate, and the lower line (with hollow circle) represents parameter estimates based on using the pattern of positive tubes for model calibration.
6.2 Estimated inner quartile (50%, thick black line) and 95% intervals (thin black line) for model-predicted FIB concentrations at time t = 1, 4, and 7 days. Vertical gray lines indicate the expected FIB concentration using the “true” parameter values. Dots (solid and hollow) indicate median values. For each sample size, the upper line (with solid circle) represents predicted FIB concentrations using the model calibrated with MPN point estimates, and the lower line (with hollow circle) represents predicted FIB concentrations using the model calibrated using the pattern of positive tubes.
List of Tables
3.1 Marginal distribution of fecal coliform MPN results at a selected grouping of monitoring stations. Newport River, North Carolina.
3.2 Summary of Bayesian analysis results for Newport River, North Carolina fecal coliform MPN data.
4.1 NSSP shellfish harvesting area fecal coliform water quality standards based on a minimum of 30 randomly collected samples.
4.2 Values of µc and σc constituting MPN contour line (for simulated violation frequency = 0.005).
4.6 Estimated confidence of compliance (CC), posterior probability of violating any MPN standard, and observed violations for monitoring stations in the Newport River Estuary during the 2000-2005 assessment period.
6.1 Example of simulated data set with sample size j = 10. Each row represents a simulated grab sample with concentration c collected at time t, a simulated pattern of positive tubes (x1, x2, x3) resulting from standard MTF decimal dilution analysis of the grab sample, and the corresponding MPN (**see Methods section for interpretation of results with all tubes negative, or all tubes positive).
A.1 Water bodies within shellfish growing area E-4 and their status relative to the 303(d) list of impaired waters. “IR Category” refers to 2008 Draft Integrated Report (IR) Category.
Acknowledgments
I would like to thank the members of my doctoral committee, particularly Dr. Ken
Reckhow for agreeing to take me on as a graduate student, and for providing me not
only with the opportunity to work on a focused research project closely linked to
my own interests, but also with the opportunity to explore new research trajectories.
I’m also very grateful to Dr. Robert Wolpert for agreeing to work closely on many of
the detailed statistical aspects of my research. Finally, many thanks to Dr. Rachel
Noble and Dr. Bill Kirby-Smith for your support and friendship; it has been a
pleasure working with you.
Much of the research presented in this dissertation was supported with funds
from the United States Environmental Protection Agency (USEPA) through the
North Carolina Division of Water Quality (NCDWQ) 319 program (Contract No.
EW05049). Additional funding was also provided through grants from the National
Science Foundation (NSF Grant Nos. DMS-0112069 and DMS-0422400) and (through
collaboration with Dr. Mark E. Borsuk) the EPA Office of Research and Develop-
ment’s Advanced Monitoring Initiative (AMI) Pilot Projects Focused on GEOSS
(Global Earth Observation System of Systems). I am also very grateful for scholar-
ship support from the Water Environment Federation (WEF), Quantitative Environ-
mental Analysis (QEA), LLC, and the North Carolina Association of Environmental
Professionals (NCAEP).
I owe tremendous thanks to the staff at NCDENR-SSS, including Patti Fowler,
J.D. Potts, Andrew Haines, Shannon Jenkins, Nadine Stoddard, and Diane Mason,
all of whom were consistently generous with their time in answering questions, sharing
and explaining water quality analysis data, and teaching me about their analytical
procedures. Most, if not all, of this dissertation would not have been possible without
their help. I also owe thanks to Larry Wood and his family not only for their kindness
over the years, but also for teaching me about sailing, shellfishing, and appreciating
the beauty and joy found in natural resources, particularly those of Waquoit Bay on
Cape Cod.
Several colleagues from the Nicholas School, particularly those who are either
current or former members of Ken Reckhow’s laboratory group, provided critical
feedback at various stages of my research. In particular, thanks go to Ben Best, Joe
Sexton, Craig Stow, Song Qian, Sean McMahon, Scott Loarie, Rob Schick, Ibrahim
Alameddine, Lori Bennear, Richard Anderson, and Conrad Lamon. I also owe thanks
to my former colleagues at Stearns & Wheler, LLC, particularly Bill Hall, Jr. and
Nate Weeks, both of whom provided kind assurance both during my graduate studies,
and in my decision to return to graduate school. Many thanks also go out to the
hard work of master’s students who contributed to research on the Newport River
including Tammy Hill, Whan Chunkrua, and Ryan O’Banion.
My family is fortunate enough to live in a wonderful community in Old West
Durham, and the support from our friends and neighbors has been invaluable, par-
ticularly from Cyrus, Michelle, Julie, Nancy, Guy, Colleen, Riley, Patrick and Ally.
Jim and Meg Lister also provided invaluable support, keeping my family fed and on
the right parenting track during the first few months after Michael and John were
born. We could not have survived parenting twins, nor could I have maintained
progress in my Ph.D. program, without your help. I also appreciate the one-on-one
help from many of the staff here at the Nicholas School, including Jacqui Franklin,
Deborah Wilson, Nancy Morgans, Laura Turcotte, Stephen Cash, and Meg Stephens.
I also owe special thanks to Lana BenDavid and the graduate school for their support.
I will be indebted for a very long time (if not forever) to my family, particularly
those who managed to survive staying in our crazy home while we tried to raise
twins, support a doctoral dissertation, and do all the other things families try to do.
In particular, Peter and Marcia provided critical support during a major transition
period. Granny and Pop-pop, I hope it goes without saying that your love and
support have made all the difference in the world. Mom and Dad, thank you for
everything. Most of all, thank you Sara. When I asked you to marry me, I failed
to mention that I planned on quitting my job, going back to graduate school in
North Carolina, buying a house, and immediately having twins. Thanks for keeping
everything (including yourself) afloat.
Chapter 1
Introduction
This doctoral dissertation presents the derivation and application of a series of wa-
ter quality models and modeling strategies which provide critical guidance to water
quality-based management decisions by identifying and explicitly acknowledging un-
certainty and variability in terrestrial and aquatic environments, and in water qual-
ity sampling and analysis procedures. While these modeling tools can be used to
assist management decisions in waters with a wide range of designated uses, my re-
search focuses on developing tools which can be integrated into a probabilistic or
Bayesian network model supporting total maximum daily load (TMDL) assessments
of impaired shellfish harvesting waters. Such a model is currently being developed
through an ongoing 319 project with the North Carolina Division of Water Quality,
and a major goal of this dissertation is to provide tools which will improve the per-
formance of that model. Therefore, the research presented in this dissertation should
be viewed as a component of a more comprehensive modeling and research effort.
While my research provides the foundation for building a Bayesian or probabilistic
network model, the final model is not presented explicitly as part of this dissertation.
Section 303(d) of the United States (US) Clean Water Act requires that states
assess the condition of surface waters and report those which fail to meet ambient
water quality standards (Smith et al., 2001; Houck, 2002). These are added to the
US Environmental Protection Agency (EPA) list of impaired waters (U.S. Environ-
mental Protection Agency, 2005b) and can only be removed after the performance
of a TMDL assessment (National Research Council, 2001; Cooter, 2004) followed by
sample-based verification that the standards are being met. As with any TMDL
assessment, the primary objective of the Newport River TMDL is to determine the
maximum allowable pollutant load from point, non-point, and natural sources, in-
cluding a margin of safety (MOS), which can be discharged into a receiving water
without violating water quality standards (National Research Council, 2001; Houck,
2002). Such predictive assessments are usually based on an empirical or mechanistic
water quality model relating pollutant loading levels to water body concentrations
(Borsuk et al., 2002; Benham et al., 2006). Fecal indicator bacteria (FIB), such
as fecal coliform, are commonly used to assess potential pathogen contamination in
coastal waters, and serve as the pollutant of concern for the models presented in this
dissertation. The Model Ordinance in the Guide for the Control of Molluscan Shellfish, prepared by
the National Shellfish Sanitation Program (NSSP), includes recommended FIB-based
water quality criteria for shellfish-growing waters (Food and Drug Administration
and Interstate Shellfish Sanitation Conference, 2005). States which participate in
the NSSP, and which are also members of the Interstate Shellfish Sanitation Confer-
ence, enforce the Model Ordinance as a minimum requirement for sanitary control of
shellfish (Food and Drug Administration and Interstate Shellfish Sanitation Confer-
ence, 2005). Similar FIB-based water quality standards are enforced in surface waters
with other designated uses, such as recreational use (N.C. Department of Environ-
ment and Natural Resources, 2004) and drinking water supply (U.S. Environmental
Protection Agency, 2005a).
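The TMDL definition given earlier in this chapter (the maximum allowable load from point, non-point, and natural sources, including a margin of safety) is commonly written as a simple load budget. The notation below follows the conventional regulatory form and is not defined in this dissertation: WLA_i (wasteload allocations to point sources) and LA_j (load allocations to non-point and natural background sources) are symbols assumed here for illustration only.

```latex
\[
  \mathrm{TMDL} \;=\; \sum_{i} \mathrm{WLA}_i \;+\; \sum_{j} \mathrm{LA}_j \;+\; \mathrm{MOS}
\]
```

Under this reading, any uncertainty in the relationship between loads and in-stream FIB concentrations has to be absorbed by the MOS term unless it is represented explicitly, for example through a probabilistic model.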
The latest official assessment of US water quality data (U.S. Environmental Pro-
tection Agency, 2002) indicates that pathogens are the leading cause of coastal shore-
line standard violations (275 total miles impaired) and the second leading cause of
violations in rivers and streams (82,100 total miles impaired). The Newport River
Estuary and its tributaries, which are collectively designated as growing area E-4
by the North Carolina Department of Environment and Natural Resources Shellfish
Sanitation and Recreational Water Quality Section (NCDENR-SSS), have historically been a
productive shellfish harvesting area. However, all of its segments and tributaries are
either permanently or conditionally closed to shellfishing based on poor water qual-
ity or proximity to known or potential sources of fecal contamination. As a result,
growing area E-4 comprises forty of the designated shellfish harvesting areas in North
Carolina which are currently included in the USEPA 303(d) list and therefore require
a TMDL assessment (see appendix A). Developing modeling tools which support
TMDL assessments in this area not only addresses an acute need, but also provides
additional context for addressing pathogen water quality problems around the US
and the world.
1.1 Dissertation Organization
My dissertation is divided into 6 chapters. This Chapter (Chapter 1) describes the
rationale for my doctoral research, including overall research objectives and critical
regulatory requirements. Chapter 2 proposes a new graphical structure for either
a probabilistic or Bayesian network model of water quality in shellfish harvesting
waters. USEPA recommends initiating TMDL projects with an evaluation of ap-
propriate water quality indicators and associated target values which can be used
to assess attainment of the designated use (U.S. Environmental Protection Agency,
2001). Therefore, while Chapter 2 defines (rather broadly) the scope of any bacterial
TMDL assessment, it also highlights a poorly defined relationship between water
quality model endpoints and proposed measures of water quality (including alterna-
tive indicator organisms and different testing methods) as well as potential risks to
human and environmental health. Although my dissertation focuses on mitigating
fecal contamination in shellfishing resource areas (and reducing subsequent risk of
the outbreak of shellfish-borne infectious diseases), Chapter 2 serves as a reminder
that pollutants of non-fecal origin (such as red-tide causing ciguatoxins) might be
integrated into ongoing health risk-based management planning (Hackney and Pier-
son, 1994). Chapter 2 indicates a growing need in the microbial analysis and water
quality modeling field to more explicitly quantify the relationship between human
health risks and alternative measures of fecal and non-fecal contamination in coastal
resource waters. Identifying this research need was a major result of the early stages
of my research, and it establishes the context for all subsequent Chapters of my
dissertation. Much of the research presented in Chapter 2 appears in peer-reviewed
proceedings of the International Water Association (IWA) WaterMatex 2007 Confer-
ence (Gronewold and Reckhow, 2007).
In Chapter 3, I develop and apply a simplified version of the conceptual graphical
model from Chapter 2 to water quality monitoring data from the Newport River
using the Bayesian analysis software package Netica®. This analysis identifies how
presumed critical environmental variables impact water quality-based management
decisions, and whether or not those variables are monitored under truly random con-
ditions. Furthermore, the initial modeling effort in Chapter 3 indicates that critical
model variables (such as the model endpoint) should explicitly acknowledge uncer-
tainty and variability (through, for example, probabilistic models) to allow compari-
son between model output and management decision criteria. The work in Chapter
3 also suggests that fecal indicator bacteria concentration forecasting models must
appropriately reflect uncertainty inherent to specific bacteria water quality analysis
procedures, and that the Netica® software package may not be the most appropriate
tool for doing so. The research in Chapter 3 appears in peer-reviewed conference
proceedings of the Water Environment Federation TMDL 2007 Specialty Conference
held in Bellevue, Washington (Gronewold et al., 2007).
In Chapter 4, I develop a novel approach to assessing FIB-based water quality
standards for pathogenically-impaired resource waters and propose new standards
based on distributional parameters of the in situ FIB concentration probability dis-
tribution (as opposed to the current approach of using most probable number (MPN)
or colony-forming unit (CFU) values). This work is motivated by recommendations
of the National Research Council (2001), and an exploratory analysis of historic New-
port River water quality and environmental data, which suggest that several water
bodies in shellfish growing area E-4 either do not appear to violate water quality
standards, or do not have sufficient data to justify being included in the 303(d) list.
Chapter 4 concludes with a re-evaluation of water quality standard violations in the
Newport River based on my proposed water quality standards. Much of the work
(and text) in Chapter 4 was developed in collaboration with Dr. Mark Borsuk, Dr.
Robert Wolpert, and Dr. Kenneth Reckhow and was recently published (as the cover
article) in Environmental Science & Technology (Gronewold et al., 2008).
Chapter 5 compares different FIB water quality metrics in order to determine
whether an ongoing change in NCDENR-SSS standard operating procedure (and
elsewhere, presumably) from a multiple tube fermentation (MTF)-based procedure
to a membrane filtration (MF) procedure might cause a change in the observed fre-
quency of water quality standard violations. This comparison is based on an inno-
vative theoretical model of the MPN probability distribution for any observed CFU
estimate from the same water quality sample, and is applied to recent water quality
samples collected and analyzed by NCDENR-SSS for fecal coliform concentration us-
ing both MTF and MF analysis tests. This research provides important insight into
whether MPN and CFU intra-sample variability stems from human error, laboratory
procedure variability, or is simply a consequence of the probabilistic basis for calcu-
lating the MPN. This research was conducted in close collaboration with Dr. Robert
Wolpert, and was recently published in Water Research (Gronewold and Wolpert,
2008).
Finally, in Chapter 6, I propose a Bayesian strategy for calibrating FIB water
quality models in which the pattern of positive tubes from a multiple-tube fermen-
tation (MTF) serial dilution analysis is used as data. My proposed strategy assumes
that the pattern of positive tubes or wells in a serial dilution analysis experiment
(using, for example, either the MTF test or IDEXX Quanti-Tray®/2000 system),
when modeled as a series of stochastic random variables, reflects variability in serial
dilution analysis procedures and, consequently, uncertainty in the estimate of the true
FIB concentration. I then compare my proposed Bayesian strategy with the common
practice of using MPN point estimates to calibrate FIB water quality models. The
research presented in Chapter 6 highlights how proper acknowledgement (or igno-
rance) of model input uncertainty affects both FIB water quality model parameter
estimates and model-based management decisions. Much of this research was
completed in collaboration with Dr. Song Qian, Dr. Robert Wolpert, Dr. Rachel
Noble and Dr. Kenneth Reckhow, and is currently being revised following submittal
to Water Research.
Chapter 2
Developing a Graphical Model
Note: much of the text from this Chapter appears in peer-reviewed proceedings of the
International Water Association (IWA) WaterMatex 2007 conference (Gronewold
and Reckhow, 2007).
Appropriate graphical representation of assumed relationships between environ-
mental system variables, resource area management actions, and human health im-
pacts is the first and potentially most critical stage in the development of a proba-
bilistic or Bayesian network model designed to protect designated resource waters.
A graphical network establishes the cornerstone on which model algorithms are iden-
tified and applied, monitoring plans are implemented, and management alternatives
are evaluated. Furthermore, a graphical model structure facilitates group model
building and dissemination of these model algorithms and assumptions about system
dynamics (Borsuk et al., 2004).
In this Chapter, I present a step-by-step approach to developing a graphical net-
work relating system variables and management actions associated with fecal con-
tamination of resource waters. This graphical network model serves as a foundation
for future implementation of a probabilistic or Bayesian network model designed
to integrate environmental conditions in bacteriologically impaired surface waters
with management alternatives, and to forecast probability distributions of designated
model endpoints.
2.1 Selecting the Model Endpoint
Long-term water resource management projects, such as those implemented through
the TMDL program, should start with an evaluation of appropriate water quality
indicators and associated target values which can be used to assess designated use
attainment (U.S. Environmental Protection Agency, 2001). Current guidelines for
United States shellfish harvesting waters, for example, indicate fecal coliform most
probable number (MPN) and colony-forming unit (CFU) values are the basis for water
quality standards, and therefore serve as logical model endpoints (Food and Drug Ad-
ministration and Interstate Shellfish Sanitation Conference, 2005). Recent research,
however, indicates that several alternative indicator organisms may more accurately
reflect the potential health risk associated with fecal contamination (National Re-
search Council, 2001; U.S. Environmental Protection Agency, 2001). Potential alter-
native indicators of fecal contamination include the family of coliform bacteria (which
includes total coliform, fecal coliform, and Escherichia coli), fecal streptococci (U.S.
Environmental Protection Agency, 2001; Kashefipour et al., 2005), and Enterococcus
sp. (Sanders et al., 2005). Furthermore, while indicator organism concentrations in
the water column are the standard for assessing water quality and threats of fecal
contamination, human illness and death may also occur from the consumption of
shellfish contaminated with pollutants of non-fecal origin (such as red-tide causing
ciguatoxins), even if the shellfish are properly cooked. Several authors, including
Hackney and Pierson (1994), provide a history of field studies relating human disease
outbreaks with contamination of shellfishing resource areas.
Other human and environmental health measures not directly linked to TMDL
implementation, but of significant concern to public health officials and the public-
at-large, include potential relationships between fecal coliform concentration in the
water column and underlying shellfish tissue, the relationship between fecal indicator
organism concentration in shellfish tissue and risk of human illness, and the relation-
ship (in any media, including waters and shellfish tissue) between fecal indicator and
pathogenic organism concentrations. These environmental and human health vari-
ables are included in the graphical network to improve model flexibility and facilitate
future adaptation to alternative management scenarios, and applications other than
strictly TMDL support.
However, because this dissertation is primarily intended to support the TMDL
assessment process and, in particular, the development of TMDLs in shellfish har-
vesting waters, the model endpoint assumed for most of this dissertation is surface
water fecal coliform concentration assessed using currently approved analytical tech-
niques and standards. I propose this endpoint with implicit understanding that fecal
coliform (or other indicator organism) aquatic and terrestrial transport and transfor-
mation processes used to establish conditional probability relationships in the net-
work model are not likely to accurately represent fate and transport dynamics of the
pathogens they supposedly represent. Developing a network model structure which
can be adapted to a variety of potential model endpoints, including both pathogenic
and non-pathogenic organisms of fecal origin, is an area for additional research. My
proposed graphical representation of critical environmental system response
variables, management decisions, and potential model endpoints is shown in
figure 2.1.
Figure 2.1: Graphical representation of critical environmental system response variables and potential model endpoints. Management decisions are indicated by boxes, and variables are represented by rounded nodes.
Uncertainty associated with some fecal indicator organism laboratory analysis
procedures can range up to an order of magnitude, and has significant impacts on
both management actions and perceptions of threats to human health. Though not
immediately obvious in figure 2.1, network models provide a logical framework for
exposing both intrinsic sources of measurement uncertainty inherent to
bacteriological analytical procedures and extrinsic sources of uncertainty, and for
propagating them into model endpoints. Modeling strategies for these potential
sources of uncertainty are presented in detail in Chapters 4, 5, and 6.
2.2 Terrestrial Fate and Transport
In order to guide long-term management strategies, a fecal coliform pollution network
model must address the relationship between land use practices, loading reduction
measures, and predicted changes in pollutant concentration probability distribution
in the receiving water. The first relationship in this causal chain, therefore, is ter-
restrial fecal pollution deposition and wash-off. Establishing conditional probability
relationships between the variables related to this process allows loading reduction
management actions to be simulated in the model either at the pollution generation
level, or at the watershed-water body interface.
Land use practices and land cover types in the coastal watersheds of North Car-
olina, as with many other watersheds discharging into coastal shellfishing resource
waters, are dominated by agriculture and forested areas in which potential fecal
pollution sources can range from waste management infrastructure to wildlife and
agricultural runoff (Weiskel et al., 1996; White et al., 2000; Shen et al., 2005).
Terrestrial accumulation and decay of fecal indicator bacteria from these and similar
landscapes can be approximated by a first-order decay process coupled with a linear
daily loading rate term (Alley and Smith, 1981; U.S. Environmental Protection
Agency, 2001):
\[ \frac{dL}{dt} = N(s) - k_t(s)\,L(t) \tag{2.1} \]
where
L(t) = number of organisms on the landscape at time t (often in days)
N(s) = seasonal terrestrial FIB deposition rate (organisms per day)
The wide range of factors and high degrees of uncertainty affecting the relation-
ship between pollutant accumulation, and washoff during precipitation events, make
collection of appropriate data for such models an often overwhelming task (National
Research Council, 2001). For example, soil moisture conditions are often considered
a critical variable in watershed runoff processes (Beven, 2001), yet the high spatial
and temporal monitoring resolution required to accurately reflect these conditions
would exhaust the resources of most management groups (National Research Coun-
cil, 2001). As a result, I have combined historic algorithms (represented by equations
2.1 and 2.2) into the following:
\[ \frac{dL}{dt} = N(s) - k_t(s)\,L(t) - \alpha r^{b} L(t) \tag{2.6} \]
This approach has the advantage of minimizing dependency on more detailed,
small-scale terrestrial processes, many of which are poorly understood and relatively
underrepresented in the literature, including (Ferguson et al., 2003):
• transport particle size distribution
• relationship between physical properties of watersheds, microbial cellular-scale
properties, and transport phenomena
• microbial die-off and decay upon initial transfer into aquatic environment
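The behavior of the combined accumulation, decay, and washoff expression (equation 2.6) can be illustrated with a minimal numerical sketch. The Python below uses a forward Euler step with a one-day time increment. It reads the washoff term as α r^b L(t), with r a daily rainfall depth, which is an assumption about the notation; the function name and every numerical value (deposition rate, decay rate, α, b, and the rainfall series) are hypothetical placeholders rather than calibrated values for any watershed.

```python
def simulate_landscape_store(n_dep, k_t, alpha, b, rainfall, l0=0.0, dt=1.0):
    """Forward-Euler integration of dL/dt = N - k_t*L - alpha*r**b*L.

    n_dep    : deposition rate N (organisms per day), held constant here
    k_t      : first-order terrestrial decay rate (1/day)
    alpha, b : washoff coefficients (hypothetical values)
    rainfall : sequence of rainfall depths r, one per time step
    """
    store = [l0]
    for r in rainfall:
        current = store[-1]
        # No washoff on dry days; the guard also avoids 0.0 ** 0 == 1.
        washoff = alpha * (r ** b) * current if r > 0 else 0.0
        change = n_dep - k_t * current - washoff
        store.append(max(current + dt * change, 0.0))
    return store


# Hypothetical inputs: 30 dry days followed by a single wet day.
rain = [0.0] * 30 + [1.5]
trajectory = simulate_landscape_store(
    n_dep=1.0e9, k_t=0.2, alpha=0.3, b=1.0, rainfall=rain
)
dry_equilibrium = 1.0e9 / 0.2  # N / k_t, the dry-weather steady state
```

Under these assumed values the landscape store approaches N/k_t during the dry period and drops on the wet day, which is the qualitative behavior the combined equation is intended to capture.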
A graphical representation of the proposed terrestrial fate and transport compo-
nent is presented in figure 2.2.
Figure 2.2: Graphical representation of assumed system variables and causal relationships for terrestrial fate and transport of fecal indicator bacteria. Management decisions are indicated by boxes, and variables are represented by rounded nodes.
2.3 Aquatic Fate and Transport
Small coastal embayments, including many tributaries of the Newport River Estuary,
are defined as partially enclosed water bodies with a connection to a larger bay or
estuary. Depths in these small coastal embayments can range between 0.5 and 3.0
meters (Fischer, 1979; Thomann and Mueller, 1987), and pollutant concentrations are
heavily influenced by tidal flushing and surface runoff. Advection and other trans-
port processes in coastal resource waters are frequently dominated by tidal activity
(Grant et al., 2001; Kashefipour et al., 2005; Sanders et al., 2005). The relatively
shallow depth and strong advective forces in small coastal embayments often result
in complete or near-complete vertical mixing (Thomann and Mueller, 1987).
In addition to advective forces, governing processes affecting fecal indicator organ-
ism aquatic fate and transport include settling (Chapra, 1997) and natural mortality
(Gameson and Gould, 1974; Auer and Niehaus, 1993). Mortality of biological
organisms is often represented in water quality models by a first-order loss rate $k_a$
(in units of day$^{-1}$), which includes effects of temperature, salinity, and solar radiation
(Chapra, 1997). In addition, coastal embayments are often surrounded by wetlands
which undergo continuous wetting and drying cycles and, as a result, may represent
a non-point source of pollution (Sanders et al., 2005).
In an effort to develop a simple and robust network model structure, I review in
detail (in the following sections) potential approaches to modeling FIB transport and
decay processes. I then extract key model algorithms to be represented in the final
network model.
2.3.1 Bacteria loss models
Bacteria die-off and decay in aquatic environments is typically represented in water
quality models by an effective loss rate, $k_a$ (in units of day$^{-1}$), accounting for natural
mortality, solar radiation, and settling (Chapra, 1997):
\[ k_a = k_{ad} + k_{ai} + k_{as} \tag{2.7} \]
where $k_{ad}$ represents natural mortality, $k_{ai}$ represents mortality due to solar
radiation, and $k_{as}$ represents loss due to settling. Additional environmental variables which
potentially contribute to bacteria loss, but are not typically addressed explicitly in
bacteria water quality models, include (Davies-Colley et al., 1994; U.S. Environmental
Protection Agency, 2001):
• attraction to solids
• water column pH
• starvation and predation
• structural damage
• osmotic pressure induced by salinity gradient following runoff events
• nutrient deficiencies
• turbidity
• variations in spectral quality of sunlight
• oxygen and nutrient concentrations
Natural Die-Off ($k_{ad}$)
Fecal coliform bacteria natural die-off rates can be approximated by a first-order
temperature and salinity-dependent process of the following form (Mancini, 1978;
Thomann and Mueller, 1987; Chapra, 1997):
\[ k_{ad} = (0.8 + 0.006\,P_s)\,\theta^{T-20} \]
where $P_s$ is the percentage of sea water, $T$ is the water temperature in degrees Celsius,
and $\theta$ expresses the temperature dependency of a reaction rate (and is typically
between 1.0 and 1.1). This equation can be modified as a function of measured
salinity $S$, assuming a seawater salinity in the range of 30 to 35 parts per thousand
(ppt):
\[ k_{ad} = (0.8 + 0.02\,S)\,\theta^{T-20} \]
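As a brief worked illustration of the salinity form of the natural die-off expression, the following Python sketch evaluates the rate for a given salinity and temperature. The function name is hypothetical, and the default θ of 1.07 is only an assumed value within the 1.0 to 1.1 range noted above; it is not a value estimated for the Newport River Estuary.

```python
def natural_dieoff_rate(salinity_ppt, temp_c, theta=1.07):
    """Natural die-off rate k_ad (1/day) from salinity (ppt) and temperature (deg C).

    theta defaults to an assumed value inside the 1.0-1.1 range.
    """
    return (0.8 + 0.02 * salinity_ppt) * theta ** (temp_c - 20.0)


# At the 20 deg C reference temperature the theta term equals one.
k_ref = natural_dieoff_rate(30.0, 20.0)
# Warmer water increases the rate whenever theta > 1.
k_warm = natural_dieoff_rate(30.0, 25.0)
# Setting theta = 1 removes the temperature dependence entirely.
k_no_temp = natural_dieoff_rate(30.0, 25.0, theta=1.0)
```

At a salinity of 30 ppt the salinity term 0.02 S equals 0.6, which matches 0.006 P_s for 100 percent sea water, so the two forms of the expression agree at that end of the assumed salinity range.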
Historic studies indicate a wide range of temperature-dependent pathogen and
indicator organism survival rates. For example, in research results summarized
by U.S. Environmental Protection Agency (2001), pathogens have been inactivated
following exposure to temperature extremes, including freezing and boiling (Tzipori,
1983; Badenoch et al., 1990), while pathogen survival times at moderate temperatures
(i.e. between approximately 4 and 20 degrees Celsius) ranged between 2 and 6 months
(Bingham et al., 1979; Adam, 1991; Medema et al., 1997). More recent studies, such
as those conducted and cited by Auer and Niehaus (1993), indicate no significant
relationship between ambient temperature and decay rate (in the absence of solar
effects), implying $\theta = 1$ in the natural die-off expression above (for additional
details, see Mitchell and Chamberlin, 1979; Moeller and Calkins, 1980; Auer and
Niehaus, 1993). Freshwater studies cited in Novotny and Olem (1994) indicate
enteric virus survival times ranging between 2 and 188 days.
Death due to Solar Radiation ($k_{ai}$)
Bacterial loss in aquatic environments due to solar radiation is often approximated
as (Mancini, 1978; Thomann and Mueller, 1987; Auer and Niehaus, 1993; Chapra,
1997):
\[ k_{ai} = \frac{\alpha I_0}{k_e H}\left(1 - e^{-k_e H}\right) \tag{2.8} \]
where $\alpha$ is a proportionality constant often approximated as unity (Thomann and
Mueller, 1987), $I_0$ is surface light energy, $k_e$ is a light extinction coefficient (typically
in units of 1/m) derived from suspended solids concentration or Secchi disk depth
measurements, and $H$ is the depth (in meters) of the layer over which the approximate
decay rate is being applied. Research on effects of solar radiation on bacteria and
pathogen decay rates includes a comparison between Giardia and Cryptosporidium
decay rates in sunlight (see Johnson et al., 1997; Kashefipour et al., 2005), effects of
turbidity on solar penetration in the water column and subsequent increased survival
of microorganisms (Salomon and Pommepuy, 1990), and comparisons between loss of
viral infectivity under various light and substrate concentration conditions, which
indicate that solar radiation is a significant factor in loss of viral infectivity (Noble and
Fuhrman, 1997). While these studies provide insight into the role of environmental variables
in the fate of both pathogenic and indicator organisms, it is likely they could only
be represented in models with a level of detail too high for supporting thousands, and
perhaps tens of thousands, of TMDL assessments.
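The depth-averaged form of the light-induced loss rate (equation 2.8) can be sketched in a few lines of Python. The function name and the input values are hypothetical, and the units of the surface light energy are assumed to be chosen so that, together with the proportionality constant, the result is a rate per day.

```python
import math


def solar_loss_rate(i0, k_e, depth_m, alpha=1.0):
    """Depth-averaged light-induced loss rate k_ai, following equation 2.8.

    i0      : surface light energy (units assumed consistent with alpha)
    k_e     : light extinction coefficient (1/m)
    depth_m : depth H of the layer being averaged over (m)
    """
    optical_depth = k_e * depth_m
    return alpha * i0 / optical_depth * (1.0 - math.exp(-optical_depth))


# Hypothetical comparison of a clear, shallow layer and a turbid, deeper layer.
k_clear_shallow = solar_loss_rate(i0=10.0, k_e=1.0, depth_m=1.0)
k_turbid_deep = solar_loss_rate(i0=10.0, k_e=3.0, depth_m=2.0)
```

As the product of extinction coefficient and depth approaches zero the expression approaches α I_0, and it decreases as turbidity or depth increases, consistent with the role of turbidity in organism survival described above.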
Settling Loss ($k_{as}$)
Bacteria settling rates are believed to be a function of the fraction of organisms
entrapped in settling solids, and can be approximated as (Chapra, 1997):
\[ k_{as} = F_p\,\frac{v_s}{H} \tag{2.9} \]
where $v_s$ is the settling velocity of the solids (in meters per day), $H$ is the depth of
measurement (in meters), and $F_p$ is the fraction of bacteria attached to solids, which
can be approximated by:
\[ F_p \approx \frac{K_d\,m}{1 + K_d\,m} \]
where $K_d$ is a partition coefficient (in m$^3$/g) and $m$ is the suspended solids
concentration (in mg/L).
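The settling loss term and its combination with the other two loss components (equation 2.7) can be illustrated with the following minimal Python sketch. The function names and all input values are hypothetical. The only unit fact relied upon is that 1 mg/L equals 1 g/m^3, so the product of the partition coefficient and the suspended solids concentration is dimensionless.

```python
def attached_fraction(k_d, solids_mg_per_l):
    """F_p = K_d*m / (1 + K_d*m); K_d in m^3/g, m in mg/L (equal to g/m^3)."""
    x = k_d * solids_mg_per_l
    return x / (1.0 + x)


def settling_loss_rate(v_s, depth_m, k_d, solids_mg_per_l):
    """k_as = F_p * v_s / H, in 1/day for v_s in m/day and H in m."""
    return attached_fraction(k_d, solids_mg_per_l) * v_s / depth_m


def effective_loss_rate(k_ad, k_ai, k_as):
    """k_a = k_ad + k_ai + k_as (equation 2.7)."""
    return k_ad + k_ai + k_as


# Hypothetical inputs: half of the organisms attached, 2 m water column.
f_p = attached_fraction(k_d=0.05, solids_mg_per_l=20.0)
k_as = settling_loss_rate(v_s=0.4, depth_m=2.0, k_d=0.05, solids_mg_per_l=20.0)
k_a = effective_loss_rate(k_ad=1.4, k_ai=0.5, k_as=k_as)
```

Keeping the three components as separate functions mirrors the additive structure of the effective loss rate, so any one term can be replaced or set to zero without changing the others.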
Settling velocity $v_s$ can be zero, positive, or negative. Negative settling velocities
account for microorganisms entrapped in resuspended sediment (see, e.g. Thomann
and Mueller, 1987). Recent studies indicate that resuspension of sediment and en-
trapped bacteria can impair water quality in the absence of precipitation events (see
Irvine and Pettibone, 1993; Weiskel et al., 1996). Obiri-Danso and Jones (2000)
found that fecal indicator organisms, in particular, are susceptible to resuspension
during dry weather. Studies supporting these findings indicate that fecal indicator
and pathogenic bacteria may survive longer in sediment than in overlying waters,
often by an order of magnitude (Ashbolt et al., 1993; Nix et al., 1993; Ghinsberg
et al., 1994; Davies-Colley et al., 1994; Obiri-Danso and Jones, 2000; Sanders
et al., 2005). Some pathogenic organisms, such as Campylobacter, survive
for no more than a few hours in cold weather, and only on the order of minutes in the
summer, making their presence in sediment a strong indicator of recent fecal pollution and
potential threats to human health (Obiri-Danso and Jones, 2000). Other potential
factors related to resuspension events include soil characteristics and hydrodynamic
shear forces at the sediment-fluid interface (Blanchard et al., 1997).
2.3.2 Bacteria transport models
Approaches to modeling the transport and fate of fecal indicator organisms in a
shallow tidal estuary range from simple one-dimensional models focusing on first-order
decay and dilution to complex three-dimensional models encompassing diffusivity
gradients, temperature and salinity gradients, and velocity profiles. Some models, for
example, predict continuous advection, dispersion, and die-off throughout tidal cycles
(Sanders et al., 2005). Others, as recommended by Thomann and Mueller (1987),
use a time scale no finer than one point in the tidal cycle to the same point in the
next cycle. Each type of model carries implicit monitoring requirements, with the
more complex models requiring more extensive monitoring networks with a broader
range of environmental parameters.
Regardless of model structure or spatial and temporal scale, microbial transport
models historically address advection, diffusion, and dilution (Fischer, 1979). First-
order decay (or loss) terms appearing in these models can be viewed as an integration
of potential loss factors discussed in section 2.3.1. Fecal coliform concentrations in
tidal estuaries, in particular, are commonly assumed to be governed by river discharge
and tidal range (Grant et al., 2001; Kashefipour et al., 2005). Rarely, however, do
these processes apply equally throughout a single water body. For example, exploratory
analysis of historic water quality data in the Newport River indicates that the main body
of the Newport River estuary acts as a large tidal basin with high tidal exchange rates
and salinity values, and relatively infrequent water quality violations. Tributaries of
the Newport River estuary, however, exhibit relatively high frequency of standard
violations and are typically small enough (both in surface area and volume) that
tidal advection likely outweighs effects of other hydraulic processes.
For the remainder of this section, I review potential hydraulic transport processes
and associated modeling strategies for tidal estuaries and coastal embayments, fol-
lowed by a summary of modeling approaches most applicable to the coastal waters
of the Newport River and its tributaries. Most of these algorithms are unlikely
to be included in the final proposed model; they are presented because they
provide context and guidance for choosing the proposed model
and, if necessary, for making changes to the model in the future.
Some of the most well-known and frequently applied water quality models are
based on solutions to the advective-diffusion equation, which is commonly used for
modeling bacteria and other non-conservative substances undergoing first-order de-
cay (for details, see Fischer, 1979). Similarly, the QUAL2K pathogen model applies
a mass balance approach to solving fecal bacteria concentration on a reach-by-reach
basis (Chapra et al., 2007). Several recent studies, however, provide growing evidence
that the advective-diffusion equation, and similar mechanistic models, require
a level of detail exceeding the limitations of most data collection resources (National
Research Council, 2001; Borsuk et al., 2004). Salomon and Pommepuy (1990), for
example, acknowledge the complexity and cost associated with implementing a
three-dimensional model, and found (in their particular study) that dilution was so
dominant that subsequent detailed investigations of organism mortality were not justified.
Arega and Sanders (2004), while successfully applying the California tidal wetland
modeling system (and providing a comprehensive list of similar studies), demonstrate
the potentially large amounts of data and, in their case, the use of dye studies, required
for complex model support. Such effort is not expected to be practical on the scale
of the TMDL program. Furthermore, it is unclear whether advection-diffusion equations,
and other higher-order differential equations typically applied to hydraulic water
quality problems, apply to cellular transport in water bodies dominated by dilution and
advection.
Most importantly, the water quality standards for shellfish harvesting waters are
based on water quality at the surface at a particular monitoring station. As a result,
detailed two- and three-dimensional models exceed not only the resources, but also
the needs of the TMDL assessments in the Newport River Estuary. Finally, because
this doctoral research is intended to support model implementation on a scale on
the order of thousands of models, and perhaps tens of thousands of surface waters,
the underlying model algorithm should be as simple as possible to facilitate monitoring
and modeling efforts, and to simulate model endpoints within acceptable error
limits (Reckhow, 1999). Tidal flushing models follow a general modeling strategy
recommended for rivers such as the Newport (Thomann and Mueller, 1987), which
combines mass balance theory with volumetric water exchange due to the rise and fall
of the tide, and has origins dating back to the work of Ketchum (1951). Subsequent
efforts to revise and apply Ketchum’s tidal flushing model, which are now commonly
referred to as tidal prism models, include Kuo and Neilson (1988), Sanford et al.
(1992), Luketina (1998), and in a recent coastal North Carolina TMDL, Shen et al.
(2005).
The tidal prism is defined as the difference between the volume of water in an
embayment at high and low tide (Luketina, 1998), and the concentration of a non-
conservative pollutant S in a tidal environment can be modeled as follows:
\[ \frac{dS}{dt} = \frac{W}{V} - k\,S(t) - \frac{(1 - b)\,Q}{V}\bigl(S(t) - S_{amb}\bigr) - \frac{I(t)}{V}\bigl(S(t) - S_i(t)\bigr) \tag{2.10} \]
where
$S(t)$ = pollutant concentration at time $t$ (in ppt or mg/L)
$t$ = time (days)
$W$ = within-estuary source (mg per day)
$V$ = estuary average volume (L)
$k$ = first order decay rate (1/day)
$b$ = return flow factor ($0 < b < 1$)
$Q$ = estuary outflow (L/day)
$S_{amb}$ = salinity in water outside estuary (ppt)
$I(t)$ = estuary inflow at time $t$ (L/day)
$S_i(t)$ = pollutant concentration in estuary inflow at time $t$ (ppt)
In addition to being relatively simple, the tidal prism model has the advantage
of having only one hydrologic calibration parameter, the return flow factor b (Kuo
et al., 2005). This factor has been reported in the literature to range between 0.23
(Sanford et al., 1992) and 0.3 (Kuo et al., 2005), and these sources caution against
using (in the absence of any monitoring data) the often-recommended value of 0.5.
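To make the structure of the tidal prism model (equation 2.10) concrete, the following Python sketch evaluates its right-hand side, the steady-state concentration obtained by setting dS/dt to zero with constant inputs, and a simple forward Euler integration. The function names and all numerical values are hypothetical, and concentration units are left generic on the assumption that W/V is expressed in the same concentration units per day.

```python
def tidal_prism_rhs(s, w, v, k, b, q, s_amb, inflow, s_in):
    """Right-hand side of equation 2.10 for concentration s."""
    return (w / v
            - k * s
            - (1.0 - b) * q / v * (s - s_amb)
            - inflow / v * (s - s_in))


def tidal_prism_steady_state(w, v, k, b, q, s_amb, inflow, s_in):
    """Concentration at which dS/dt = 0 when all inputs are held constant."""
    exchange = (1.0 - b) * q / v   # net tidal exchange rate (1/day)
    flushing = inflow / v          # inflow flushing rate (1/day)
    return (w / v + exchange * s_amb + flushing * s_in) / (k + exchange + flushing)


def integrate(s0, days, dt, **params):
    """Forward-Euler integration of equation 2.10 with constant inputs."""
    s = s0
    for _ in range(int(round(days / dt))):
        s += dt * tidal_prism_rhs(s, **params)
    return s


# Hypothetical embayment: no internal source, return flow factor of 0.3.
params = dict(w=0.0, v=1.0e9, k=1.0, b=0.3, q=5.0e8,
              s_amb=1.0, inflow=1.0e8, s_in=100.0)
s_star = tidal_prism_steady_state(**params)
s_final = integrate(s0=50.0, days=30.0, dt=0.01, **params)
```

Because the return flow factor enters only through the exchange term, its effect on the predicted concentration can be examined by re-evaluating the steady state over the range of values reported in the literature.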
Based on a review of historic pathogen fate and transport models, I propose that
the tidal prism model is most appropriate for the waters of the Newport River Estu-
ary. While a simple zero-dimensional model may be suitable for the Newport River
Estuary tributaries, the central portion of the Newport River is most likely too large
for representation by a zero-dimensional model with a single reference monitoring
point. The loading reduction requirements for Newport River Estuary tributaries
may therefore have to serve as a conservative guide for the loading reduction re-
quirements of the Estuary itself (if it is found to be in violation of water quality
standards). A graphical representation of my proposed aquatic fate and transport
model, including critical environmental processes and system variables related to a
tidal prism model, is included in figure 2.3.
Figure 2.3: Graphical representation of environmental processes and system variables affecting aquatic fate and transport of fecal indicator organisms. Management decisions are indicated by boxes, and variables are represented by rounded nodes.
2.4 Summary
The comprehensive graphical model is developed by combining designated submodels
for each system component. A model simplification step adapted from Borsuk
et al. (2004), in which model variables that are not controllable, predictable, or
observable are removed from the network, then results in the graphical network
presented in figure 2.4.
Figure 2.4: Comprehensive graphical network of fecal contamination in designated resource waters. Management decisions are indicated by boxes, and variables are represented by rounded nodes.
Chapter 3
Developing and Applying a Simple
Bayesian Network Model
Note: the research in this Chapter appears in peer-reviewed conference proceedings
of the Water Environment Federation TMDL 2007 Specialty Conference in Bellevue,
Washington (Gronewold et al., 2007).
Fate and transport processes related to bacteriological contamination of recreational
and shellfish harvesting waters are complicated and often poorly understood,
with a broad range of historic modeling efforts and associated varying degrees of
success. Developing water quality models which reflect implicit causal relationships
between environmental phenomena, land use patterns, and surface water quality is
vital to the long-term success of the USEPA Total Maximum Daily Load (TMDL)
Program. In this Chapter, I develop and apply a simple Bayesian network model
intended to support fecal coliform TMDL assessments in shellfish harvesting waters.
System components are graphically presented and discussed as a critical initial step in
successful model development, followed by establishment of probabilistic relationships
between system components. The resulting model, while only a simplified version
of the more comprehensive model expected to be developed after my dissertation, is
suggested as an innovative tool for successful implementation of future TMDLs for
microbial contaminants. I begin by describing context for this research along with
a technical description of Bayesian networks. A graphical model representing sys-
tem dynamics in shellfish harvesting waters is presented, followed by application of
a submodel with data from the Newport River Estuary in eastern North Carolina.
3.1 Background
The goal of the TMDL process is to determine the maximum pollutant loading which
can enter a water body without exceeding water quality standards (National Research
Council, 2001; Shen et al., 2005). Despite the complications associated with model-
ing the relationship between fecal indicator bacteria (FIB) loading, surface water FIB
concentrations, and ultimately shellfish contamination, shellfish harvesting resource
area managers are charged with protecting human health by closing harvesting areas
immediately following conditions which may increase the risk of exposure to water-
borne pathogens. Shellfish harvesting areas which violate long-term water quality
standards are placed on the USEPA 303(d) list of impaired waters and are required
to undergo a TMDL assessment. Simple models are therefore needed to simultane-
ously support short-term management programs while providing forecast information
which can guide long-term management actions towards water quality standard com-
pliance. Key characteristics of these simple models should include, but not be limited
to, appropriate acknowledgement of uncertainty (in all phases of the process) and ap-
plicability to thousands of shellfish harvesting areas for which a TMDL assessment
is required, but has not been initiated.
Local shellfishing resource area management plans contain conservative criteria
for shellfish growing area closures and openings in order to protect human and envi-
ronmental health. Closure criteria typically include the volume of recent precipitation
events, while reopening criteria may include a subjective analysis of the number of
days since the precipitation event, event intensity, and monitoring to confirm water
quality restoration. Although these criteria are based on historic relationships be-
tween stormwater runoff and high pathogen concentrations in receiving waters, the
implicit causal relationship between precipitation intensity, lag between precipitation
events, land use patterns, receiving water quality and subsequent shellfish contami-
nation is poorly understood. Because short-term protection of human health takes
priority over long-term restoration of impaired shellfishing areas, effective implemen-
tation of a shellfishing resource area management plan does not necessitate explicit
understanding of the runoff-contamination relationship. Current management prac-
tices reflect the assumption that precipitation-based responses in water quality are
similar within neighboring stations and closure decisions are often subsequently ap-
plied to large areas encompassing several stations.
Additional management scenarios in shellfish harvesting areas include short-term
closure and re-opening of resource areas under the authority of local management
agencies. The primary objective of local management scenarios, as opposed to the
long-term remediation goals of the TMDL program, is protecting human health
through restricting or prohibiting shellfish harvesting either during adverse pollution
conditions, such as a recent rainfall event, or due to long-term water quality standard
violations. Due to the close relationship between the criteria and environmental pro-
cesses related to these two management schemes, fecal pollution modeling strategies
need to be developed that address both public health concerns and retention and/or
restoration of the beneficial uses of the waterbody.
3.2 Bayesian Networks
A Bayesian network is a graphical representation of conditional probability distri-
butions relating a set of system variables coupled with their formal statistical and
probabilistic relationships (see Pearl, 1988; Spiegelhalter et al., 1993, for extensive
definitions). Qualitative assessment of graphical model structure, in which system
variables and assumptions about their relationships are identified, represents the first
of three stages in the development of a Bayesian network model (Spiegelhalter et al.,
1993) and was discussed previously in Chapter 2. Each system variable in a Bayesian
network model is represented by a node, and the presence or absence of an arc between
nodes indicates conditional dependence or independence, respectively. Although arcs
between variable nodes typically imply causality in Bayesian networks, the condi-
tional dependence represented by an arc may indicate a more complex relationship
(Borsuk et al., 2004). The graphical model, while providing a framework for identi-
fying system variables and qualitative beliefs regarding their interdependence, does
not by itself carry a probabilistic interpretation (Spiegelhalter et al., 1993).
The second stage of Bayesian network model development acknowledges an im-
plicit joint probability distribution encompassing the proposed model variables and
reflecting the graphical structure of the network (Spiegelhalter et al., 1993). For ex-
ample, fecal contamination of coastal estuaries may be represented using a simple
model which relates rainfall distribution (R) to fecal coliform concentration (F ) as a
function of both non-point (N) and point source (P ) loading, as presented in figure
3.1.
Figure 3.1: Simple network model representing rainfall-induced fecal contamination of a coastal estuary.
The joint probability of system variables in this simplified model can be written
via the chain rule as:
\[ p(R, N, P, F) = p(R)\,p(N \mid R)\,p(P \mid R, N)\,p(F \mid R, N, P) \]
The implied conditional independence indicated by the lack of an arc between
nodes allows us to simplify the joint probability to:
\[ p(R, N, P, F) = p(R)\,p(N \mid R)\,p(P \mid R)\,p(F \mid N, P) \]
This simplification is possible because once the direct causes of a system variable
are observed, other system variables do not influence understanding of the node’s
distribution (Spiegelhalter et al., 1993). The resulting joint probability can therefore
be viewed as a set of several local distributions, each made up of only a node and
its parents (Spiegelhalter et al., 1993; Borsuk et al., 2004). These local distributions,
commonly referred to as belief universes (see, e.g. Jensen et al., 1990), represent
the cornerstones of model decomposition and one of many benefits associated with
modeling an environmental system with a Bayesian network.
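The factorization of the joint distribution into local distributions can be illustrated with a small discrete example. In the Python sketch below each variable is reduced to two states (True for a "high" state), and every probability in the conditional probability tables is a hypothetical value chosen only for illustration; none is estimated from Newport River Estuary data.

```python
from itertools import product

# Hypothetical conditional probability tables; True = "high" state.
p_r = {True: 0.2, False: 0.8}
p_n_given_r = {True: {True: 0.7, False: 0.3},
               False: {True: 0.1, False: 0.9}}
p_p_given_r = {True: {True: 0.4, False: 0.6},
               False: {True: 0.05, False: 0.95}}
p_f_given_np = {
    (True, True): {True: 0.95, False: 0.05},
    (True, False): {True: 0.8, False: 0.2},
    (False, True): {True: 0.6, False: 0.4},
    (False, False): {True: 0.05, False: 0.95},
}


def joint(r, n, p, f):
    """p(R,N,P,F) = p(R) p(N|R) p(P|R) p(F|N,P)."""
    return (p_r[r] * p_n_given_r[r][n] * p_p_given_r[r][p]
            * p_f_given_np[(n, p)][f])


states = (True, False)
# The factorized joint must sum to one over all 16 state combinations.
total = sum(joint(*combo) for combo in product(states, repeat=4))


def prob_f_high_given_r(r):
    """p(F = high | R = r), marginalizing over N and P."""
    numerator = sum(joint(r, n, p, True) for n, p in product(states, repeat=2))
    return numerator / p_r[r]
```

Each table involves only a node and its parents, so the model is specified by four small tables rather than a single table over all sixteen joint states, which is the practical benefit of the decomposition described above.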
The third and final stage (Spiegelhalter et al., 1993) of Bayesian network model
development involves encoding the conditional probability distribution within the
graphical model structure. Conditional probability distributions are often established
using model simulations, in some cases combined with expert opinions on system
dynamics.
The Bayesian component of Bayesian network models addresses how new infor-
mation is used to modify the conditional probability relationships between system
variables in an existing model. Computations relating future conditional probabil-
ity relationships (posterior distributions) with previous or current understanding of
the relationships (prior distributions) and new observations (likelihood) are based on
Bayes’ theorem, which can be expressed as the following:
posterior ∝ likelihood × prior
A graphical representation of Bayes’ theorem is included in figure 3.2.
Figure 3.2: Graphical representation of Bayes’ theorem indicating prior and posterior probability densities, and the normalized likelihood for a water quality standard.
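A minimal numerical sketch of the proportionality between posterior, likelihood, and prior is given below in Python, using a discrete grid of candidate parameter values. The quantity being updated (the probability that a single sample exceeds a threshold), the flat prior, the binomial form of the likelihood, and the monitoring record of 3 exceedances in 20 samples are all assumptions made for illustration; this is not the updating scheme implemented later in this Chapter.

```python
def posterior_on_grid(prior, likelihood):
    """Normalize prior * likelihood over a discrete grid of parameter values."""
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


grid = [i / 100 for i in range(1, 100)]       # candidate exceedance probabilities
prior = [1.0 / len(grid)] * len(grid)         # flat prior (an assumption)
exceed, n = 3, 20                             # hypothetical monitoring record
# Binomial likelihood up to a constant that cancels in the normalization.
likelihood = [t ** exceed * (1 - t) ** (n - exceed) for t in grid]
posterior = posterior_on_grid(prior, likelihood)

posterior_mode = grid[posterior.index(max(posterior))]
prob_above_10pct = sum(p for t, p in zip(grid, posterior) if t > 0.10)
```

With a flat prior the posterior mode coincides with the observed exceedance fraction, and the final line summarizes the posterior probability that the exceedance probability is greater than ten percent.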
3.3 Methods
3.3.1 Study Area and Data Collection
The focus area for this study is the Newport River Estuary (NPRE), located along
the eastern coast of North Carolina in Carteret County. The Newport River and its
tributaries are collectively referred to as shellfish growing area E-4. Shellfish growing
area E-4 is locally managed by the North Carolina Department of Environment and
Natural Resources Shellfish Sanitation and Recreational Water Quality Section (SSS),
and encompasses forty individual harvesting areas currently included in USEPA’s
303(d) list of impaired waters targeted for TMDL assessment (see Appendix A).
Water quality samples from shellfish growing area E-4 are routinely collected by
SSS from 29 sampling stations in accordance with guidelines outlined by the National
Shellfish Sanitation Program (Food and Drug Administration and Interstate Shellfish
Sanitation Conference, 2005). Routine compliance samples are collected roughly
5 to 6 times per year, while adverse condition samples are collected after rainfall
events in order to determine the duration of short-term shellfish harvesting area
closings. The primary data set used for this analysis is the routine monitoring data. In
addition to analyzing samples for fecal coliform concentration, the approximate status
of the tide is recorded during each sampling event. Stations are periodically added to
and removed from the sampling program depending on monitoring needs. The SSS
monitoring data is the longest continuing dataset using consistent station locations
for bacteriological water quality information in the Newport River Estuary and is
the primary source of inference for determining water quality standard violations
and TMDL modeling efforts. Rainfall data within the Newport River Estuary is
obtained from the National Oceanic and Atmospheric Administration’s (NOAA)
National Climatic Data Center (NCDC) weather observation station in Morehead City,
North Carolina.
3.3.2 Model Variables
A comprehensive graphical model representing assumed processes and system compo-
nents in a tidal shellfish harvesting area was developed in Chapter 2. Components of
the graphical model (see figure 3.3) were identified and related to one another based
on a review of historic studies of tidal estuary systems and guidance from USEPA
(Grant et al., 2001; Kashefipour et al., 2005; U.S. Environmental Protection Agency,
2001). Recent research indicates that a wide range of alternative indicator organisms
may reflect the health risks associated with fecal contamination, and therefore may be
considered as potential model endpoints (National Research Council, 2001). Such
organisms include, but are not limited to, the family of coliform bacteria (which includes
total coliform, fecal coliform, and Escherichia coli) and Enterococcus sp. (U.S.
Environmental Protection Agency, 2001). Current guidelines for United States shellfish
harvesting waters indicate fecal coliform most probable number (MPN) and colony
forming unit (CFU) values as a basis for water quality standards.
For the purposes of this study, a simplified network model is derived from the
comprehensive model (figure 3.3) which includes only those variables which are mea-
surable, and which relate precipitation and tidal dynamics with fecal coliform MPN
measurements. Because water quality samples collected for this study were analyzed
by SSS using a 5-tube serial dilution multiple tube fermentation procedure resulting
in MPN estimates of fecal coliform concentration, fecal coliform MPN will serve as the
model endpoint. Exploratory analysis of historical data, local management criteria,
Figure 3.3: Graphical representation of environmental variables and processes associated with fecal contamination in tidal shellfish harvesting areas. Management decisions are indicated by boxes, and variables are represented by rounded nodes.
and conversations with SSS personnel indicate precipitation and tide are two of the
most significant variables affecting bacteriological water quality within the Newport
River Estuary. A similar model simplification process is presented in Borsuk et al.
(2004).
In order to facilitate both graphical representation and Bayesian updating, I
implement the proposed model using the Bayesian network software package Netica.
A critical aspect of implementing a Bayesian network model within most packaged
software programs is variable discretization, and variables in the proposed submodel
are primarily discretized in order to best reflect current local and federal management
criteria. For example, shellfish harvesting areas in the Newport River Estuary are
often closed after a daily rainfall event exceeding one inch. The magnitude of the
most recent rainfall event is therefore selected as a submodel variable with alternative
states of less than one inch and at least one inch. In addition, the shellfish manage-
ment guidelines outlined in Title 15A of the North Carolina Administrative Code
(NCAC), Chapter 18 (Environmental Health), SubChapter A (Sanitation), Sections
.0300 through .0900 (see Appendix B) indicate that the median fecal coliform most
probable number (MPN) or the geometric mean MPN of water shall not exceed 14
organisms per 100 ml, and not more than ten percent of the samples shall exceed
a fecal coliform MPN of 43 organisms per 100 ml (based on the five-tube serial di-
lution analysis procedure used by SSS). A graphical representation of the proposed
submodel, indicating the selected variables and their states, is included as figure 3.4.
Each node in figure 3.4 represents a system variable, and the rows within each node
indicate a variable state along with the associated probability distribution. Where
applicable, the bottom of each node includes the node variable mean and standard
Figure 3.4: Graphical submodel relating precipitation events, tidal dynamics, and water quality. Probabilities for all variable states are based on monitoring data collected between 1994 and 1997 at a cluster of monitoring stations in the upper reaches of the Newport River Estuary, North Carolina.
deviation. The values in figure 3.4 are based on water quality data collected from
monitoring stations in the upper reaches of the Newport River between 1994 and
1997.
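The numeric criteria quoted above (a central value of no more than 14 organisms per 100 ml, and no more than ten percent of samples above 43) can be checked with a short script. The following is a minimal sketch, not the procedure used by SSS: the function name, the returned field names, and the sample values are hypothetical, and each criterion is reported separately because the text does not say how the median and geometric mean criteria are combined.

```python
import math
from statistics import median


def assess_fecal_coliform_standard(mpn_values, central_limit=14.0,
                                   upper_limit=43.0, max_exceed_fraction=0.10):
    """Evaluate each criterion for a list of MPN results (org/100 ml)."""
    if not mpn_values:
        raise ValueError("at least one MPN value is required")
    # Geometric mean computed on the log scale; assumes all values are > 0.
    geo_mean = math.exp(sum(math.log(v) for v in mpn_values) / len(mpn_values))
    # Fraction of samples above the upper limit (43 org/100 ml by default).
    exceed_fraction = sum(v > upper_limit for v in mpn_values) / len(mpn_values)
    med = median(mpn_values)
    return {
        "median": med,
        "geometric_mean": geo_mean,
        "exceed_fraction": exceed_fraction,
        "median_ok": med <= central_limit,
        "geometric_mean_ok": geo_mean <= central_limit,
        "upper_ok": exceed_fraction <= max_exceed_fraction,
    }


# Hypothetical MPN results, for illustration only.
sample = [2, 2, 2, 4, 5, 8, 13, 17, 23, 49]
report = assess_fecal_coliform_standard(sample)
print(report)
```

In this hypothetical sample exactly one of ten values exceeds 43, so the upper criterion is met at the boundary ("not more than ten percent").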
3.3.3 Conditional Probabilities
Representing relationships between variables using conditional probability distribu-
tions facilitates not only model updating (using Bayes’ theorem), but also analysis of
sensitivity of the response variable (fecal coliform MPN) to alternative environmental
states. For example, the probability distributions expressed in figure 3.4 are based
on precipitation and tidal conditions only at the time of sampling. It is therefore
uncertain if the distribution of fecal coliform MPN presented in figure 3.4 is an ap-
propriate indicator of long-term average conditions in the water body and if it can
be used as an accurate tool for assessing impairment of the designated use.
Historic data from the Morehead City NCDC station for this time period indicate
that there are between 0 and 4 days of dryness between rainfall events roughly
84% of the time, and more than 4 days of dryness between rainfall events 16% of
the time. Historic data analysis also indicates that the magnitude of daily rainfall
events is less than 1 inch approximately 90% of the time. Adjusting the distribution
of environmental variables to reflect long-term conditions provides a better
understanding of the long-term distribution of the water quality measurement. Netica
stores relationships between causal and response variables in a conditional
probability table. In this example, the relationship between variables does not change as
we modify marginal probability distributions of (assumed) causal variables. Using
the chain rule, we can demonstrate how Netica calculates the marginal
probability distribution for any state of the fecal coliform MPN given different states of the
causal variables. For example, figure 3.5 (from the Netica graphical user
interface) shows the empirically based conditional probability distribution table for fecal
coliform MPN. Each row corresponds to the conditional probability that the fecal
coliform MPN will be in a given state given the state of all three causal variables.
For example, the first row of the table indicates that there is a 0.67 probability that
the fecal coliform MPN will be below 14 organisms per 100 ml when the tide is rising,
when the most recent rainfall is less than one inch, and when the most recent rainfall
event was less than four days ago. Using the chain rule, the marginal probability
that the fecal coliform MPN is between 0 and 14 organisms per 100 ml (integrated
over all possible states of the three causal variables, expressed here as x1, x2, x3)
can be written as:

$$p(0 \le \mathrm{MPN} < 14) = \sum_{X} p(0 \le \mathrm{MPN} < 14 \mid x_1, x_2, x_3)\, \pi(x_1, x_2, x_3) \tag{3.1}$$
We assume that x1, x2, x3 are independent, and can therefore rewrite equation 3.1
as:
$$p(0 \le \mathrm{MPN} < 14) = \sum_{X} p(0 \le \mathrm{MPN} < 14 \mid x_1, x_2, x_3)\, \pi(x_1)\, \pi(x_2)\, \pi(x_3)$$
We can then combine the conditional probabilities for the fecal coliform MPN in
figure 3.5 with any set of marginal probabilities of environmental (causal) variables.
The marginal probability that the fecal coliform MPN is between 0 and 14 organisms
per 100 ml under long-term environmental conditions (which, as stated previously,
are slightly different than those under which the samples were collected) is:
Figure 3.5: Conditional probability distribution table for fecal coliform MPN node. For each of the three states of the MPN node, each row indicates the conditional probability of the node being in that state given the state of the three causal variables. For example, the probability that the MPN is less than 14 organisms per 100 ml, given that the tide is rising, the most recent rainfall was less than one inch, and that it has been less than four days since the most recent rain event, is 0.667.
$$
\begin{aligned}
p(0 \le \mathrm{MPN} < 14) &= \sum_{X} p(0 \le \mathrm{MPN} < 14 \mid x_1, x_2, x_3)\, \pi(x_1)\, \pi(x_2)\, \pi(x_3) \\
&= (0.67)(0.90)(0.50)(0.84) + (0.55)(0.90)(0.50)(0.16) \\
&\quad + (0.33)(0.10)(0.50)(0.84) + (0.33)(0.10)(0.50)(0.16) \\
&\quad + (0.61)(0.90)(0.50)(0.84) + (0.60)(0.90)(0.50)(0.16) \\
&\quad + (0.33)(0.10)(0.50)(0.84) + (0.55)(0.10)(0.50)(0.16) \\
&= 0.60
\end{aligned}
$$
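The sum above can be reproduced in a few lines. This is a minimal sketch of the marginalization only, not of Netica's internal algorithm. The conditional probabilities and long-term marginals are the values in the worked sum; the state labels, and the assignment of the first four rows to a rising tide and the last four to a falling tide, are assumptions made for illustration (both tide states carry probability 0.50, so the total does not depend on that assignment).

```python
from itertools import product

# P(0 <= MPN < 14 | tide, rainfall magnitude, days since rain), in the order
# the terms appear in the worked sum. State labels are assumed for illustration.
p_low_mpn = {
    ("rising", "<1 in", "0-4 d"): 0.67,
    ("rising", "<1 in", ">4 d"): 0.55,
    ("rising", ">=1 in", "0-4 d"): 0.33,
    ("rising", ">=1 in", ">4 d"): 0.33,
    ("falling", "<1 in", "0-4 d"): 0.61,
    ("falling", "<1 in", ">4 d"): 0.60,
    ("falling", ">=1 in", "0-4 d"): 0.33,
    ("falling", ">=1 in", ">4 d"): 0.55,
}

# Long-term marginal distributions of the (assumed independent) causal variables.
tide = {"rising": 0.50, "falling": 0.50}
rain = {"<1 in": 0.90, ">=1 in": 0.10}
dry_days = {"0-4 d": 0.84, ">4 d": 0.16}

# Sum the conditional probability times the product of the marginals
# over every combination of causal-variable states.
marginal = sum(
    p_low_mpn[(t, r, d)] * tide[t] * rain[r] * dry_days[d]
    for t, r, d in product(tide, rain, dry_days)
)
print(round(marginal, 2))  # 0.6
```

Substituting a different set of marginals for `tide`, `rain`, or `dry_days` gives the adjusted marginal for any other assumed long-term condition without changing the conditional table.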
A summary of marginal distributions for each causal variable revised to reflect
long-term conditions in the Newport River Estuary, along with the revised marginal
distribution for the fecal coliform MPN, is presented in figure 3.6.
Figure 3.6: Graphical submodel relating precipitation events, tidal dynamics, and water quality. Probabilities for fecal coliform MPN states are conditional upon long-term average precipitation and tidal conditions in the upper reaches of the Newport River Estuary, North Carolina.
3.4 Results and Discussion
Results of the analysis of the conditional probability distributions of water quality
data within the upper reaches of the Newport River between 1994 and 2004 using
the proposed Bayesian network submodel are presented in table 3.1. The summary
table divides the data into three time periods, and indicates the distribution of fecal
coliform MPN under the conditions at the time of sampling (e.g. figure 3.4) and
adjusted for long-term average conditions (e.g. figure 3.6). Analysis of the data in table
3.1 indicates little change in the probability distribution of fecal coliform between
the selected time periods and between the long-term average distribution and the
Table 3.1: Marginal distribution of fecal coliform MPN results at a selected grouping of monitoring stations. Newport River, North Carolina.

                              Marginal distribution       Marginal distribution adjusted
                              under sampling conditions   for long-term average conditions
Time period  MPN (org/100mL)  Probability                 Probability
1994-1997    0 to 14          0.58                        0.60
             14 to 43         0.23                        0.20
             ≥ 43             0.19                        0.20
1997-2000    0 to 14          0.64                        0.66
             14 to 43         0.17                        0.17
             ≥ 43             0.19                        0.17
2001-2004    0 to 14          0.53                        0.56
             14 to 43         0.30                        0.28
             ≥ 43             0.17                        0.16

distribution under sampling conditions. Results of the analysis indicate either that
the original monitoring program reflects long-term conditions, or that I do not have
enough data to support alternative conditional scenarios for all possible combinations
of the variable states. Future modeling efforts should include data not initially in-
cluded in the standard SSS monitoring program in order to improve understanding
of the relationship between rainfall events, tide, and bacteria concentrations.
Results of the Bayesian analysis of water quality data are presented in table 3.2,
and indicate that the Bayesian analysis may provide a more representative long-term
indication of water quality in the Newport River. In addition, the results indicate
that a Bayesian analysis provides an opportunity to apply relative weights to current
and historic data based on potential knowledge of changing dynamics within the
contributing watershed.
Table 3.2: Summary of Bayesian analysis results for Newport River, North Carolina fecal coliform MPN data.

                              Prior distribution   Posterior distribution
Time period  MPN (org/100mL)  Probability          Probability
1994-1997    0 to 14          0.33 (2)             0.61
             14 to 43         0.33 (2)             0.20
             ≥ 43             0.33 (2)             0.19
1997-2000    0 to 14          0.61                 0.64
             14 to 43         0.20                 0.18
             ≥ 43             0.19                 0.18
2001-2004    0 to 14          0.64                 0.62
             14 to 43         0.18                 0.22
             ≥ 43             0.18                 0.16

NOTES: 1) All distributions conditional on long-term average conditions.
2) A very low relative weight (effective sample size = 1) was applied to this prior distribution. See text for additional details.

In particular, a Bayesian analysis yields fecal coliform MPN probability
distributions at the end of each selected time period (i.e. 1994-1997, 1997-2000, and
2001-2004) with less between-time-period variance than the marginal probability
distributions. For example, the marginal probability that fecal coliform MPN is below
14 is 0.56 for the 2001-2004 period compared to 0.66 for the 1997-2000 period (see
table 3.1). The Bayesian posterior probability that fecal coliform MPN is below 14
following the 2001-2004 time period is 0.62, compared to 0.64 for the 1997-2000 time
period (see table 3.2). These results imply that a Bayesian analysis is less influ-
enced by potential anomalies in the sampling data from a particular time period,
and perhaps provides a better overall representation of conditions within the water
body.
In addition, Bayesian analysis using the Netica software allows prior and
likelihood information to be weighted in order to reflect possible knowledge that either
historical or current data may serve as a more accurate indication of conditions
assessed for regulatory compliance. As an example, the prior probability distributions
presented for the 1994-1997 time period in table 3.2 are intended to reflect complete
ignorance of water quality conditions. A typical Bayesian analysis would reflect this
ignorance through an improper uniform prior distribution, applying equal probability
to all possible values of the fecal coliform MPN. In Netica, a uniform probability
is applied using equal probabilities for all categories of the selected variable. As a
result, the prior distribution in table 3.2 for the 1994-1997 time period contains a
probability of 0.33 for each variable state. In order to minimize the effect of applying
disproportionate prior probabilities to each of the possible values of the fecal coliform
MPN, I apply a relative weight of 1 (i.e. effective sample size = 1) to the prior
distribution, allowing the likelihood (with a sample size of roughly 90) to dominate the
posterior distribution.
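The effect of a prior with effective sample size 1 can be illustrated with a generic Dirichlet-multinomial update over the three MPN states. This is a sketch under stated assumptions, not a description of how Netica performs its calculations: the function name is hypothetical, and the observed counts (54, 18, and 18 out of 90 samples) are invented for illustration rather than taken from the monitoring data.

```python
def update_state_probabilities(prior_probs, prior_weight, observed_counts):
    """Posterior state probabilities for a categorical variable.

    The prior is treated as `prior_weight` pseudo-observations spread across
    the states according to `prior_probs` (a Dirichlet prior), and the data
    enter as counts per state (a multinomial likelihood).
    """
    total = prior_weight + sum(observed_counts)
    return [
        (prior_weight * p + k) / total
        for p, k in zip(prior_probs, observed_counts)
    ]


# Uniform prior over the three MPN states with effective sample size 1.
prior = [1 / 3, 1 / 3, 1 / 3]

# Hypothetical counts of samples in the states 0 to 14, 14 to 43, and >= 43.
counts = [54, 18, 18]

posterior = update_state_probabilities(prior, 1.0, counts)
print([round(p, 3) for p in posterior])
```

With roughly 90 observations against a single pseudo-observation, the posterior stays within about 0.003 of the observed proportions (0.60, 0.20, 0.20), which is the sense in which the likelihood dominates. Raising `prior_weight` gives the prior correspondingly more influence.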
3.5 Conclusions
I have presented a case study applying conditional probability networks and Bayesian
updating to evaluate short and long-term water quality conditions within the Newport
River Estuary in North Carolina. This case study is intended to support the ongoing
evaluation of fecal contamination in the Newport River, and to serve as a precedent
for other water quality assessments conducted through the USEPA TMDL program.
A noted advantage to evaluating fecal contamination with a Bayesian network
model is the ability to easily adjust conditional probability distributions based on
changing knowledge of existing environmental conditions, and integration of new ev-
idence from ongoing and future water quality monitoring programs. The proposed
submodel serves as a template for a more rigorous analysis using the full
comprehensive Bayesian network model presented in figure 3.3. This research also suggests that
the current sampling scheme represents well the marginal probability distributions of
dominant environmental factors (e.g. precipitation and tide).
Chapter 4
An Assessment of Fecal Indicator Bacteria-Based Water Quality Standards
and Water Quality Model Endpoints
The content of this Chapter is published in Gronewold et al. (2008) and is available
at doi: 10.1021/es703144k. By permission of the American Chemical Society, the
abstract, figures, and tables are included below.
Abstract
Fecal indicator bacteria (FIB) are commonly used to assess the threat of pathogen
contamination in coastal and inland waters. Unlike most measures of pollutant
levels, however, FIB concentration metrics, such as most probable number (MPN) and
colony-forming units (CFU), are not direct measures of the true in situ concentration
distribution. Therefore, there is the potential for inconsistencies among model and
sample-based water quality assessments, such as those used in the Total Maximum
Daily Load (TMDL) program. To address this problem, we present an innovative
approach to assessing pathogen contamination based on water quality standards that
impose limits on parameters of the actual underlying FIB concentration distribution,
rather than on MPN or CFU values. Such concentration-based standards link more
explicitly to human health considerations, are independent of the analytical proce-
dures employed, and are consistent with the outcomes of most predictive water quality
models. We demonstrate how compliance with concentration-based standards can be
inferred from traditional MPN values using a Bayesian inference procedure. This
methodology, applicable to a wide range of FIB-based water quality assessments, is
illustrated here using fecal coliform data from shellfish harvesting waters in the New-
port River Estuary, North Carolina. Results indicate that areas determined to be
compliant according to the current methods-based standards may actually have an
unacceptably high probability of being in violation of concentration-based standards.
Table 4.1: NSSP shellfish harvesting area fecal coliform water quality standards based on a minimum of 30 randomly collected samples.

                                                Standard
Basis for standard                              q50    µgeo   q90
n MPN observations from 5-tube MTF procedure    14     14     43
n CFU observations from MF procedure            14     14     31

Table 4.5: Regression model parameters including transformation parameter (γ),intercept (β0), and slope (β1).
[Figure 4.1 plot not reproduced. Panels show prior and posterior density curves with σ on the horizontal axis and Density on the vertical axis. Panel labels identify the prior: σ ~ U(α = 0, β = 100), or σ defined through φ with φ ~ Γ(α = 1.5, β = 0.375) or φ ~ Γ(α = 1, β = 2).]

Figure 4.1: Prior and posterior distributions for σk for five randomly selected stations in the Newport River using the three priors in table 4.4. Each row utilizes the same prior distribution, and each column represents a separate station. Vertical gray lines are added to facilitate comparison between alternative priors for each station.
[Figure 4.2 plot not reproduced. Panels a) through d) plot σc against µc, with lines distinguished by analytical procedure (MPN or CFU).]

Figure 4.2: Combinations of the mean µc and standard deviation σc of the log-transformed fecal coliform concentration distribution which yielded MPN (solid lines) or CFU (dashed lines) samples in violation of the NSSP median standard (panel a), geometric mean standard (panel b), 90th percentile standard (panel c), or any standard (panel d) with a frequency of either 0.005 or 0.1. The zone of violations is in the upper right of each panel.
[Figure 4.3 plot not reproduced. Panels a) and b) plot σc against µc, with separate lines for MPN analysis and CFU analysis at α = 2, 10, and ∞.]

Figure 4.3: Relationship between the mean µc and standard deviation σc of the log-transformed fecal coliform concentration distribution and simulated violation of any CFU-based water quality standard (dashed lines) and any MPN-based water quality standard (solid lines) for possible values of the negative binomial dispersion parameter α. Panels a and b indicate µc−σc pairs expected to violate standards with a frequency of 0.1 and 0.005, respectively.
[Figure 4.4 plot not reproduced. Panels a) through d) plot log-likelihood against γ.]

Figure 4.4: Log-likelihood (solid line) of transformation parameter γ for σc using paired values of µc and σc. Panel a based on values from table 4.2 for σc > 0.65, panel b based on values from table 4.2 for σc ≤ 0.65, panel c based on values from table 4.3 for σc > 0.65, and panel d based on values from table 4.3 for σc ≤ 0.65.
[Figure 4.5 plot not reproduced. The plot shows σc against µc, with violation frequency contour lines (MPN and CFU), model fit lines (MPN and CFU), and a reference line at σc = 0.65.]

Figure 4.5: Violation contour lines overlaid by violation line best-fit regression model fitted values based on model parameters in table 4.5.
        CC (%)       Posterior probability of      Violated any MPN standard
Stn.    MPN   CFU    size-30 sample violating      during the 2000–2005
                     any MPN standard              assessment period?
3       52    39     5                             no
4       44    33     6                             no

Table 4.6: Estimated confidence of compliance (CC), posterior probability of violating any MPN standard, and observed violations for monitoring stations in the Newport River Estuary during the 2000-2005 assessment period.
[Figure 4.6 plot not reproduced. Four panels plot σc against µc, each showing probability density contour lines, the MPN 0.5% violation standard, and the CFU 0.5% violation standard: Station 25 (CC = 2−3%), Station 27A (CC < 1%), Station 3 (CC = 39−52%), and Station 35 (CC = 73−80%).]

Figure 4.6: Joint posterior probability density contour lines (solid lines) for four monitoring stations in the Newport River Estuary. Dashed lines indicate combinations of the mean µc and standard deviation σc of the log-transformed fecal coliform concentration distribution which violate concentration-based standards no more than 0.5% of the time using MPN or CFU standards as the reference. Confidences of compliance (CC) are given in the lower left of each panel for both MPN and CFU-based standards.
Chapter 5
Modeling the Relationship Between Most
Probable Number (MPN) and Colony
Forming Unit (CFU) Estimates of Fecal
Indicator Bacteria Concentrations
Reproduced in part with permission from Gronewold and Wolpert (2008). Copyright
2008 Elsevier. Available at doi:10.1016/j.watres.2008.04.011
Most probable number (MPN) and colony-forming-unit (CFU) estimates of fe-
cal coliform bacteria concentration are common measures of water quality in coastal
shellfish harvesting and recreational waters. Estimating procedures for MPN and
CFU have intrinsic variability and are subject to additional uncertainty arising from
minor variations in experimental protocol. It has been observed empirically that the
standard multiple-tube fermentation (MTF) decimal dilution analysis MPN proce-
dure is more variable than the membrane filtration CFU procedure, and that MTF-
derived MPN estimates are somewhat higher on average than CFU estimates, on
split samples from the same water bodies. I construct a probabilistic model that
provides a clear theoretical explanation for the variability in, and discrepancy be-
tween, MPN and CFU measurements. I then compare my model to water quality
samples analyzed using both MPN and CFU procedures, and find that the (often
large) observed differences between MPN and CFU values for the same water body
are well within the ranges predicted by my probabilistic model. Results indicate that
MPN and CFU intra-sample variability does not stem from human error or labora-
tory procedure variability, but is instead a simple consequence of the probabilistic
basis for calculating the MPN. These results demonstrate how probabilistic models
can be used to compare samples from different analytical procedures, and to deter-
mine whether transitions from one procedure to another are likely to cause a change
in quality-based management decisions.
5.1 Introduction
Coastal water resource management agencies frequently revise standard water
quality analysis procedures based on the latest available technologies. For example, the
North Carolina Department of Environment and Natural Resources Shellfish
Sanitation and Recreational Water Quality Section (NCDENR-SSS), and similar water
resource management agencies, are considering replacing multiple-tube fermentation
(MTF) fecal coliform analysis procedures with membrane filtration (MF) procedures
because MF results, while variable, are much less so than MTF results (as commonly
implemented) from the same water quality sample. NCDENR-SSS and other agen-
cies are concerned, however, that water quality-based management decisions for a
particular water body (such as approval or prohibition of shellfishing) may change
after MF procedures are implemented.
Here, I derive a theoretical model for the probability distribution of MTF and MF
test results from the same water quality sample. This innovative approach allows a
side-by-side comparison of alternative testing methods, accommodating their intrinsic
differences (rather than assuming that these differences have no effect). Further, I
find the probability distributions for the true fecal coliform concentrations associated
with different possible measurement results from each procedure.
Differences, if observed, between the MTF-MF relationship predicted by my model
and the MTF-MF relationship observed empirically in samples from a particular
laboratory, would suggest significant extrinsic sources of uncertainty and variability
(i.e. unrelated to natural spatial distribution of organisms in a sample aliquot volume)
and, more importantly, an increased chance that changing standard fecal coliform
analysis from MTF to MF might lead to a change in water quality-based management
decisions.
Variability in MTF and MF analysis results can be divided into two categories:
intrinsic stochastic variability due to the natural dispersion of bacteria within sample
containers, and extrinsic variability. Intrinsic sources of variability are mostly a
consequence of procedure design, and are explained later in this section. Extrinsic
sources of variability include departures from expected sampling protocol, microbial
cell damage (during filtration, for example) which may reduce the number of viable
organisms (Kloot et al., 2006), and clumping of bacteria cells (Noble et al., 2003b).
Other potential extrinsic sources of variability relate to environmental conditions
at the time of sampling, including antecedent rainfall, turbidity, and season (Cabelli
et al., 1983; Noble et al., 2003a). These extrinsic sources of variability are not included
in my model and, if they actually contribute to MTF-MF intra-sample variability,
will limit my model’s ability to explain the difference between MTF and MF results.
Fecal and total coliform bacteria are indicators of potential fecal pollution and
water-borne pathogenic threats to human health (Cabelli, 1983; LeClerc et al., 2001).
Other bacterial measures of water quality include Escherichia coli (a subset of fecal
coliforms), and enterococci (Noble et al., 2003a). Extensive definitions of fecal and
total coliform bacteria are presented elsewhere (Rompre et al., 2002; Kloot et al.,
2006). My model is applied to monitoring data from shellfish harvesting areas in
which fecal coliform is a more common measure of water quality. As a result, I
discuss only fecal coliform bacteria concentrations for the rest of this paper; however,
probabilistic models of intra-sample variability can be applied to a
wide range of microbial, physical, and chemical pollutants (see, e.g. Kinzelman et al.,
2003; U.S. Geological Survey, 1996; Horowitz, 1986).
MTF and MF are two common procedures for estimating fecal coliform concen-
trations in coastal resource waters (Eckner, 1998; Buckalew et al., 2006). MTF and
MF fecal coliform analysis results are reported as most probable number (MPN) and
colony-forming unit (CFU) estimates of the true fecal coliform concentration c (typ-
ically in organisms per 100 ml). Detailed descriptions of the MF microbial analysis
procedure are presented in Rose et al. (1975), Rippey et al. (1987), Dufour et al.
(1981), Eckner (1998), and Esham and Sizemore (1998). Similar descriptions of the
MTF procedure are presented in Cochran (1950); Hurley and Roscoe (1983); Beliaeff
and Mary (1993); McBride et al. (2003).
MPN estimates derived from a standard (e.g. 5-tube × 3 dilution series) MTF
analysis are, by definition, the possible values of the concentration at which the
likelihood function (see Appendix, equation 5.2) attains its maximum. The likelihood
function offers an indication of how strongly an observed pattern of positive tube
counts from an MTF analysis support each possible value c of the concentration
(McBride, 2005, pp. 12–13). The MPN estimates are highly variable because this
function has a very broad peak, and so is close to its maximum value over a wide
range of possible concentrations.
Additional discussion of the statistical assumptions inherent in MTF-based MPN
calculations can be found in Eisenhart and Wilson (1943); Beliaeff and Mary (1993);
Klee (1993). CFU estimates are based on the number of distinguishable bacterial
colonies which form on a culture plate after filtration and incubation. CFU variability
is inversely proportional to the volume of sample water filtered, and therefore while
CFU estimates are variable, the variability is often small compared to that of MTF-
derived MPN estimates when large aliquot volumes are used. The broad likelihood
function of MTF positive tube count observations and variability in the number of
distinguishable bacterial growth colonies are both examples of intrinsic variability in
MPN and CFU estimates, and are therefore addressed explicitly in my model.
Several recent studies document empirical relationships between fecal bacteria
analysis results from different testing procedures (e.g. Eckner, 1998; Noble et al.,
2003b; Kloot et al., 2006). The study by Noble et al. (2003b), for example, which
compares beach water quality analysis results using MF, MTF, and the IDEXX
Despite differences between regression model fitted values (panel A of figure 5.3)
and expected values from my theoretical probability model (panel B of figure 5.3),
I expect empirical regression model fitted values to approach expected values of the
MPN for a specific CFU as sample size increases. Differences, if any, between large-
sample empirical regression model fitted values and my theoretical model expected
values might suggest significant non-probabilistic (i.e. extrinsic) sources of variabil-
ity. Exploring comparisons between my proposed probabilistic model and regression
models fit to very large data sets is an area for future research.
5.4 Conclusions
I derived a theoretical model of the MPN probability distribution for any observed
CFU estimate from the same water quality sample. Recent water quality samples
collected and analyzed by NCDENR-SSS for fecal coliform concentration using both
MTF and MF analysis tests yielded MPN and CFU estimates entirely consistent
with my theoretical probabilistic model. My results indicate that MPN and CFU
intra-sample variability does not stem from human error or laboratory procedure
variability, but is instead a simple consequence of the probabilistic basis for calculat-
ing the MPN.
I anticipate this study will serve as a stepping stone towards future research on
whether different fecal coliform analysis procedures might lead to different water
quality standard violation frequencies for the same water body. Method-dependent
differences, if any, might propagate into coastal resource water management decisions
through two undesirable pathways. First, analysis of water quality samples from a
coastal resource water might, depending on the analysis procedure used, result in
different management actions (such as closing or opening a shellfish harvesting area).
Second, if fecal coliform concentration estimates vary depending on whether MTF or
MF procedures are used, potential benefits of merging historic MPN and new CFU
data sets would be limited (Noble et al., 2003b). Future research on the probabilistic
basis for current water quality standard violations, coupled with the modeling tools
presented in this paper, could provide answers to these research questions.
Other suggested studies stemming from this research include, but are not limited
to, quantifying membrane filtration-related fecal coliform thinning and contamination
rates, exploring environmental effects on fecal coliform concentration estimate bias,
and determining how measuring different coliform bacteria metabolic output affects
fecal coliform concentration estimates.
5.5 Calculations
If fecal coliform organisms at concentration $c$ (in organisms per 100 ml) are
well mixed in a water sample, it is commonly assumed that aliquots of volume $v_i$ ml
from the water sample contain a Poisson $\mathrm{Po}(c v_i/100)$ distributed number of fecal
coliform organisms (McCrady, 1915; Greenwood and Yule, 1917; de Man, 1977; Russek
and Colwell, 1983; Best and Rayner, 1985; Woomer et al., 1990; Briones and
Reichardt, 1999). Out of $n_i$ serial dilution analysis tubes, the numbers of positive tubes
$x_i$ are independent binomial $\mathrm{Bi}(n_i, p_i)$ random variables with $p_i = 1 - \exp(-c v_i/100)$
(for more on using Poisson and binomial distributions in environmental data analysis,
see Ott, 1995, pp. 93–113 and 127–137). The MPN for $m$ dilution series can therefore
be expressed as:
\[
\mathrm{MPN} = \arg\max_{c} \left[ \prod_{i=1}^{m} \left(1 - e^{-c v_i/100}\right)^{x_i} \left(e^{-c v_i/100}\right)^{n_i - x_i} \right] \qquad (5.1)
\]
and the conditional probability distribution of positive tube counts X = {xi}, given
true fecal coliform concentration c, is:
\[
f(x \mid c) = \prod_{i=1}^{m} \binom{n_i}{x_i} \left[1 - e^{-c v_i/100}\right]^{x_i} \left[e^{-c v_i/100}\right]^{n_i - x_i} \qquad (5.2)
\]
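Equations 5.1 and 5.2 can be evaluated numerically. The following is a minimal Python sketch, not the procedure used to build standard MPN tables: the function names are hypothetical, and finding the maximum by bisection on the derivative of the log-likelihood is an assumption made here for illustration. Returning 0 and infinity for all-negative and all-positive patterns reflects the unbounded likelihood in those cases rather than any reporting convention.

```python
import math
from itertools import product

def prob_positive(c, v):
    """Probability that a tube with aliquot volume v (ml) is positive."""
    return -math.expm1(-c * v / 100.0)

def tube_pmf(x, c, n, v):
    """Equation 5.2: probability of positive-tube counts x given concentration c."""
    prob = 1.0
    for xi, ni, vi in zip(x, n, v):
        p = prob_positive(c, vi)
        prob *= math.comb(ni, xi) * p ** xi * (1.0 - p) ** (ni - xi)
    return prob

def score(c, x, n, v):
    """Derivative with respect to c of the log of the equation 5.1 likelihood."""
    total = 0.0
    for xi, ni, vi in zip(x, n, v):
        a = vi / 100.0
        if xi > 0:
            total += xi * a / math.expm1(c * a)
        total -= (ni - xi) * a
    return total

def mpn(x, n, v):
    """Equation 5.1 by bisection; the score decreases monotonically in c."""
    if sum(x) == 0:
        return 0.0        # all tubes negative: likelihood is largest at c = 0
    if all(xi == ni for xi, ni in zip(x, n)):
        return math.inf   # all tubes positive: likelihood has no finite maximum
    lo, hi = 1e-9, 1.0
    while score(hi, x, n, v) > 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid, x, n, v) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Five tubes at each of three decimal dilutions (10, 1, and 0.1 ml).
n = (5, 5, 5)
v = (10.0, 1.0, 0.1)
print(mpn((5, 2, 0), n, v))
# The probabilities of all 216 possible patterns should sum to one.
print(sum(tube_pmf(x, 50.0, n, v) for x in product(range(6), repeat=3)))
```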
The Poisson-distributed CFU observation Y ∼ Po(λ), with mean λ = cV/100 for
sample aliquot volume V ml, has conditional probability distribution, given true fecal
coliform concentration c:
\[
f(y \mid c) = \frac{1}{y!} \left(cV/100\right)^{y} e^{-cV/100} \quad \text{for } y \in \{0, 1, 2, \ldots\} \qquad (5.3)
\]
The posterior probability distribution of the true fecal coliform concentration c
for an observed tube count combination x, using Jeffreys’ scale-invariant “reference”
prior distribution π(c) ∝ 1/√c (Jeffreys, 1946; Bernardo and Ramon, 1998), is given
by:
\[
f(c \mid x) \propto c^{-1/2}\, e^{-(c/100) \sum_{i=1}^{m} v_i (n_i - x_i)} \prod_{i=1}^{m} \left(1 - e^{-c v_i/100}\right)^{x_i}, \quad c > 0 \qquad (5.4)
\]
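Equation 5.4 has no closed-form normalizing constant, but summaries such as the posterior mean can be approximated by numerical integration. The sketch below is an illustration under stated assumptions: the function name is hypothetical, the substitution c = u² (which removes the c^(-1/2) singularity at zero) and the trapezoid rule are choices made here, and the upper integration limit is a heuristic. With all tubes positive the exponential term vanishes and the posterior under this prior cannot be normalized, so that case is rejected.

```python
import math

def posterior_mean_c(x, n, v, steps=20000):
    """Approximate E[c | x] under equation 5.4 using the substitution c = u**2."""
    s = sum(vi * (ni - xi) for xi, ni, vi in zip(x, n, v))
    if s <= 0:
        raise ValueError("all tubes positive: posterior under this prior is improper")
    # Heuristic upper limit: beyond it exp(-(c/100) * s) is below exp(-50).
    u_max = math.sqrt(5000.0 / s)
    h = u_max / steps
    norm = 0.0
    first_moment = 0.0
    for j in range(steps + 1):
        u = j * h
        c = u * u
        # c**(-1/2) dc equals 2 du, so the singular factor drops out
        # (the constant 2 cancels in the ratio returned below).
        g = math.exp(-c * s / 100.0)
        for xi, vi in zip(x, v):
            g *= (-math.expm1(-c * vi / 100.0)) ** xi
        weight = 0.5 if j in (0, steps) else 1.0
        norm += weight * g
        first_moment += weight * g * c
    return first_moment / norm

n = (5, 5, 5)
v = (10.0, 1.0, 0.1)
print(posterior_mean_c((5, 2, 0), n, v))
```

For the all-negative pattern the posterior reduces to a Gamma distribution with shape 1/2, which gives an analytical check on the numerical result.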
Using the same Jeffreys’ prior distribution, the posterior distribution of c for a
given CFU observation y is:
\[
f(c \mid y) \propto c^{\,y-1/2}\, e^{-cV/100}, \quad c > 0 \qquad (5.5)
\]
which is a Gamma Ga(α, λ) distribution with shape parameter α = y + 1/2 and rate
parameter λ = V/100.
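Because equation 5.5 is a standard Gamma distribution, its mean and standard deviation follow directly from the shape and rate. A short sketch (the function name is hypothetical) for a given CFU count and aliquot volume:

```python
import math

def cfu_posterior_summary(y, V=100.0):
    """Shape, rate, mean and standard deviation of the Gamma posterior in equation 5.5."""
    alpha = y + 0.5           # shape parameter
    rate = V / 100.0          # rate parameter
    mean = alpha / rate       # mean of a Gamma(shape, rate) distribution
    sd = math.sqrt(alpha) / rate
    return alpha, rate, mean, sd

print(cfu_posterior_summary(6, V=100.0))
```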
Finally I calculate the probability distribution of the positive tube count vector
x = (x1, . . . , xm), 1≤xi≤ni for any CFU observation y, P[X = x | Y = y], by
combining equations 5.2 and 5.5:
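\[
f(x \mid y) = \int_{0}^{\infty} f(x \mid c)\, f(c \mid y)\, dc \qquad (5.6)
\]
\[
= \frac{(V/100)^{y+1/2}}{\Gamma(y + 1/2)} \times \int_{0}^{\infty} c^{\,y-1/2}\, e^{-(c/100)\left[V + \sum_{i=1}^{m} v_i (n_i - x_i)\right]} \prod_{i=1}^{m} \binom{n_i}{x_i} \left(1 - e^{-c v_i/100}\right)^{x_i} dc.
\]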
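The integral in equation 5.6 can be approximated numerically. The following Python sketch is one way to do so and is not the implementation used to produce the figures: the function name, the substitution c = u², the trapezoid rule and the upper integration limit are all assumptions made for illustration.

```python
import math
from itertools import product

def pattern_prob_given_cfu(x, y, n, v, V=100.0, steps=2000):
    """Equation 5.6: P[X = x | Y = y], integrating over c with c = u**2."""
    alpha = y + 0.5                  # Gamma shape from equation 5.5
    rate = V / 100.0                 # Gamma rate from equation 5.5
    log_const = alpha * math.log(rate) - math.lgamma(alpha)
    # Heuristic upper limit placed far into the tail of the Gamma posterior.
    c_max = (alpha + 12.0 * math.sqrt(alpha) + 40.0) / rate
    h = math.sqrt(c_max) / steps
    total = 0.0
    for j in range(steps + 1):
        u = j * h
        c = u * u
        if u == 0.0:
            dens = 2.0 * math.exp(log_const) if y == 0 else 0.0
        else:
            # Gamma density of c times dc/du = 2u, evaluated in log space.
            dens = 2.0 * math.exp(log_const + 2 * y * math.log(u) - rate * c)
        like = 1.0
        for xi, ni, vi in zip(x, n, v):
            p = -math.expm1(-c * vi / 100.0)
            like *= math.comb(ni, xi) * p ** xi * (1.0 - p) ** (ni - xi)
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * dens * like
    return total * h

n = (5, 5, 5)
v = (10.0, 1.0, 0.1)
print(pattern_prob_given_cfu((2, 0, 0), 6, n, v))
```

Summing the result over every possible tube pattern for a fixed y should return a value very close to one, which is a convenient check on the integration settings.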
[Figure 5.1. Panel A: fecal coliform MPN versus true fecal coliform concentration (both in organisms per 100 ml), showing E(MPN|c), the MPN 95% prediction set for each specified true concentration, and the 1:1 line. Panel B: fecal coliform CFU versus true fecal coliform concentration (both in organisms per 100 ml), showing E(CFU|c), the 95% prediction interval for each specified true concentration, and the 1:1 line.]
Figure 5.1: Expected values and 95% prediction sets or prediction intervals for observable fecal coliform MPN (panel A) and CFU (panel B) measurements given the true fecal coliform concentration in organisms per 100 ml. For clarity, expected values and 95% prediction sets or intervals are plotted only for every 5th integer-valued concentration c. Maximum true concentrations in each plot are based on maximum MPN and CFU observations in the NCDENR-SSS data set. CFU prediction intervals are based on an MF sample aliquot volume of 100 ml.
[Figure 5.2. Panel A: true fecal coliform concentration versus fecal coliform MPN (both in organisms per 100 ml), showing E(c|MPN), 95% credible intervals, and the 1:1 line. Panel B: true fecal coliform concentration versus fecal coliform CFU (both in organisms per 100 ml), showing E(c|CFU), 95% credible intervals, and the 1:1 line.]
Figure 5.2: Expected value and 95% credible intervals for the fecal coliform true concentration given MPN (panel A) and CFU (panel B) estimates in organisms per 100 ml. For clarity, panel A includes only the 51 observable MPN estimates presented in standard laboratory analysis MTF conversion tables for the 5-tube serial dilution analysis procedure (see, e.g. Woodward, 1957) and panel B includes only every 5th observable CFU value based on an MF test with a sample aliquot volume of 100 ml.
[Figure 5.3. Both panels plot fecal coliform MPN against fecal coliform CFU (organisms per 100 ml). Panel A: NCDENR-SSS data used in and excluded from the log-linear regression model, regression model fitted values, the 95% MPN prediction interval, and the 1:1 line. Panel B: NCDENR-SSS data, E(MPN|CFU), the 95% MPN prediction set for each specified CFU, and the 1:1 line.]
Figure 5.3: Empirical linear regression model (panel A) and theoretical probability model (panel B) of the relationship between fecal coliform MPN and CFU estimates from the same water quality sample.
[Figure 5.4. Horizontal axis: MPN (organisms per 100 ml). Vertical axes: probability mass and number of observations. Legend: observed CFU = 6 organisms per 100 ml; E(MPN|CFU = 6) = 7.6 organisms per 100 ml; observed MPN values when CFU = 6; f(MPN|CFU = 6).]
Figure 5.4: Observed values, expected values, and the theoretical probability mass function of the MPN for a CFU measurement from the same water quality sample. Observed values are from recent NCDENR-SSS study.
Chapter 6
Improving Parameter Estimation in the
Aquatic Fate and Transport Model
Much of the research in this chapter was completed in collaboration with Dr. Song
Qian, Dr. Robert Wolpert, Dr. Rachel Noble and Dr. Kenneth Reckhow, and was
submitted to Water Research.
Water resource management decisions often depend on mechanistic or empirical
models to predict water quality conditions under future pollutant loading scenarios.
While explicitly acknowledging process, observation, and analytical uncertainty in
these models is considered critical to model-based resource management decisions
and protection of human and environmental health, few tools have been developed
which explicitly propagate analytical uncertainty into fecal indicator bacteria (FIB)
water quality models. Here, I explore how ignoring or acknowledging model
input uncertainty affects model parameter estimates in a simple FIB water quality
model. I present two approaches to calibrating the model using simulated results
of a standard multiple-tube fermentation (MTF) serial dilution analysis. The first
approach uses only the most probable number (MPN) point estimate, while the sec-
ond implements a Bayesian approach to modeling the number of positive tubes in
each MTF dilution series as a stochastic random variable. I find that my proposed
Bayesian approach yields parameter estimates which are asymptotically more accurate
and precise, and model predictions with less uncertainty than those based on
using MPN point estimates. These results suggest a potential new strategy for reduc-
ing uncertainty in model-based water resource management decisions, such as those
implemented through the United States Environmental Protection Agency (USEPA)
Total Maximum Daily Load (TMDL) program.
6.1 Introduction
Explicitly acknowledging analytical uncertainty is a potentially critical component of
water quality modeling and model-based water resources management. Nonetheless,
few tools have been developed and applied to propagate intrinsic analysis uncer-
tainty through coastal shellfish harvesting and recreational water quality models into
model forecasts and management decisions. Water quality standards in designated
recreational and shellfish harvesting areas are often based on the concentration of
fecal indicator bacteria (FIB) such as total coliforms, fecal coliforms, and enterococ-
Table 6.1: Example of simulated data set with sample size j = 10. Each row represents a simulated grab sample with concentration c collected at time t, a simulated pattern of positive tubes (x1, x2, x3) resulting from standard MTF decimal dilution analysis of the grab sample, and the corresponding MPN (**see Methods section for interpretation of results with all tubes negative, or all tubes positive).
Step | Variable(s) simulated or calculated | Model | Parameters | Predictors
1 | c | c = exp(ln c0 − kt + No(0, σm)) | c0, initial FIB concentration (organisms per 100 ml) = 1500; k, first-order FIB decay rate (1/day) = 0.8; σm, model residual s.e. (log-organisms per 100 ml) = 0.3 | t
2 | x1, x2, x3 | xi ∼ Bi(n, pi) | n, number of tubes in each dilution series = 5; pi, probability of positive test sample in series i = 1 − exp(−c vi/100); vi, sample aliquot volume (ml) in series i, ∈ [10, 1, 0.1] | c
3 | MPN | argmax over c of ∏ (i = 1 to m) (1 − exp(−c vi/100))^xi (exp(−c vi/100))^(n − xi) | m, number of dilution series = 3 | x1, x2, x3

Table 6.2: Summary of steps used to simulate hypothetical water quality analysis data including FIB fate and transport in an aquatic environment with first-order decay (step 1), randomly generated pattern of positive serial dilution analysis tubes (step 2), and calculation of the associated MPN (step 3).
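The three simulation steps summarized in Table 6.2 can be sketched in Python as follows. This is an illustration, not the code used to generate the simulated data sets: the function names are hypothetical, the uniform sampling times between 0 and 7 days are an assumption, and patterns with all tubes negative or all tubes positive are returned with an MPN of None rather than a substituted value.

```python
import math
import random

VOLUMES = (10.0, 1.0, 0.1)   # sample aliquot volume (ml) in each dilution series
N_TUBES = 5                  # tubes per dilution series

def simulate_sample(t, rng, c0=1500.0, k=0.8, sigma_m=0.3):
    """Steps 1 and 2: concentration at time t, then a pattern of positive tubes."""
    c = math.exp(math.log(c0) - k * t + rng.gauss(0.0, sigma_m))
    x = []
    for v in VOLUMES:
        p = -math.expm1(-c * v / 100.0)
        x.append(sum(rng.random() < p for _ in range(N_TUBES)))
    return c, tuple(x)

def mpn(x):
    """Step 3: maximize the tube-count likelihood by bisection on its score."""
    if sum(x) in (0, N_TUBES * len(VOLUMES)):
        return None          # no finite, positive maximum for these patterns
    def score(c):
        return sum(xi * (v / 100.0) / math.expm1(c * v / 100.0)
                   - (N_TUBES - xi) * v / 100.0
                   for xi, v in zip(x, VOLUMES))
    lo, hi = 1e-9, 1.0
    while score(hi) > 0.0:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def simulate_set(j, rng, t_max=7.0):
    """One simulated data set of j grab samples (sampling times are assumed)."""
    rows = []
    for _ in range(j):
        t = rng.uniform(0.0, t_max)
        c, x = simulate_sample(t, rng)
        rows.append((t, c, x, mpn(x)))
    return rows

rows = simulate_set(10, random.Random(1))
for t, c, x, m in rows:
    print(round(t, 2), round(c, 1), x, m)
```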
6.2.2 Parameter Estimation
My first approach to estimating parameter values (i.e. c0, k, σm) in the first-order
decay model (equation 6.7) uses an ordinary least-squares (OLS) regression with
ln(MPN) point estimates as the model response variable (see Weisberg, 2005, for
details on OLS regression). The regression model is:
\[
\ln(\mathrm{MPN}) = \beta_0 + \beta_1 t + \mathrm{No}(0, \sigma) \qquad (6.8)
\]
where β0 is an estimate of ln(c0), β1 is an estimate of −k (the slope of ln(c) in t
under first-order decay), and σ is an estimate of σm.
For each of the 100 sample sets of size j (j ∈ {10, 25, 100}), I record the estimated mean
value of ln(c0), k, and σm.
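The regression in equation 6.8 has a closed-form solution. The sketch below is an illustration with a hypothetical function name: it fits ln(MPN) against t by ordinary least squares and reports the decay rate as the negative of the fitted slope, since ln(c) decreases at rate k under first-order decay. Using the residual standard error with n − 2 degrees of freedom for σ is an assumption made here.

```python
import math

def ols_decay_fit(t, mpn_values):
    """Fit ln(MPN) = beta0 + beta1 * t and return estimates of ln(c0), k and sigma."""
    y = [math.log(m) for m in mpn_values]
    n = len(t)
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    beta1 = sxy / sxx
    beta0 = y_bar - beta1 * t_bar
    rss = sum((yi - beta0 - beta1 * ti) ** 2 for ti, yi in zip(t, y))
    sigma = math.sqrt(rss / (n - 2))   # residual standard error
    return {"ln_c0": beta0, "k": -beta1, "sigma": sigma}

# Noise-free check data lying exactly on the decay curve.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
values = [1500.0 * math.exp(-0.8 * ti) for ti in times]
print(ols_decay_fit(times, values))
```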
My second approach implements a Bayesian modeling strategy in which I derive
posterior distributions for each model parameter using Markov-chain Monte Carlo
(MCMC) simulations in the WinBUGS software program (Lunn et al., 2000;
Spiegelhalter et al., 2003). My Bayesian modeling approach is based on the assumption
that the number of positive tubes in an MTF dilution series (xi) can be modeled as
a Binomial Bi(n, pi) random variable evolving from the true FIB concentration c as
follows:
\[
\ln(c) = \ln(c_0) - k t + \mathrm{No}(0, \sigma_m)
\]
\[
x_i \sim \mathrm{Bi}\left(n = 5,\; p_i = 1 - e^{-c v_i/100}\right)
\]
For each of the 100 sample sets, I record an estimated mean value of c0, k, and
σm. Detailed code for implementing this approach in WinBUGS, including selection
of parameter prior distributions, is included in the Appendix.
6.3 Results
Model calibration using both MPN point estimates and the pattern of positive serial
dilution analysis tubes yielded accurate estimates of parameters c0 and k. As shown
in figure 6.1, the inner quartile range (thick black line) contains the “true value” of
c0 and k for both procedures for all three sample sizes. For sample sizes of 10 and
25, however, estimates of c0 and k are more precise in models calibrated using the
MPN point estimate.
Model calibration using the MPN, however, consistently resulted in significant
overestimates of model error (σm) for all sample sizes. Furthermore, the magnitude
of overestimation increased with sample size. In contrast, estimates of σm using
the pattern of positive serial dilution analysis tubes yielded an inner quartile range
containing the true value of σm for samples of size 25 and 100, and parameter 95%
credible intervals containing the true value of σm for samples of size 10. None of the
MPN-based 95% intervals for σm contained its true value, regardless of sample size.
[Figure 6.1. Three panels, one each for c0, k, and σm, showing interval estimates by sample size (10, 25, 100).]
Figure 6.1: Estimated inner quartile (50%, thick black line) and 95% intervals (thin black line) for each model parameter based on samples of size 10, 25, or 100. Vertical gray lines indicate the parameter value used to simulate data. Dots (solid and hollow) indicate median values. For each sample size, the upper line (with solid circle) represents the parameter estimate based on using the MPN point estimate, and the lower line (with hollow circle) represents parameter estimates based on using the pattern of positive tubes for model calibration.
6.4 Discussion
My analysis indicates that using the pattern of positive tubes from an MTF serial
dilution analysis as data provides far more accurate estimates of the model error term
(σm), but provides somewhat less precise and less accurate estimates of model decay
rate k and initial concentration c0 (particularly with a small sample size). I expect
that the relative uncertainty in c0, particularly when the pattern of positive serial
dilution tubes is used for inference, is a simple consequence of the data-generating
process. More specifically, I set the log-linear model intercept term, c0, to 1500
organisms per 100 ml, which is close to the upper detection limit of the standard
5-tube MTF procedure. Water quality grab samples simulated at time t ≈ 0 might
have yielded MTF results with all tubes positive, and because I assigned these results
an MPN value of 1700 organisms/100 ml, the estimate of c0 appears to be accurate,
when in fact it is likely determined by my choice of the upper value of censored data.
When a dilution series yields all positive or all negative results, the underlying
concentration is essentially non-identifiable. Common approaches to addressing these
data points in models, including either removing them before analysis or reporting
them as below or above a certain value, often lead to a loss of information (Qian
et al., 2004). I also explored alternative linear modeling procedures for censored
data, including the EM algorithms presented in Schmee and Hahn (1979) and
Tanner (1991). Differences between parameter values estimated using the EM algorithm
and those presented in my results are insignificant.
As discussed in Qian et al. (2005), using all of the observed serial dilution counts
as data for model inference (including those with all positive and all negative results)
is expected to yield models which outperform those using MPN-based data, regard-
less of whether those using MPN data omit or censor the MPN values associated
with all positive or all negative tube counts. This study has demonstrated potential
effects of using the MPN on model parameter estimates; however, further analysis is
needed to understand potential effects on model forecasts. Here, I demonstrate how
uncertainty in FIB concentration model parameters propagates into predictions of
FIB concentration. I use a Monte Carlo simulation procedure using triplicate values
of c0, k, and σm to simulate the distribution of FIB concentrations (using my original
model in equation 6.7) at t = 1, 4, and 7 days. I find that model prediction uncertainty
is consistently higher in models calibrated using MPN point estimates than
models calibrated using the pattern of positive serial dilution analysis tubes (figure
6.2). These results emphasize how explicitly modeling analytical process uncertainty
improves not only understanding of the relationship between pollutant concentrations
in the water column and laboratory-derived estimates of the concentration, but also
how uncertainty in resource area management decisions might relate to variability in
those estimates.
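The Monte Carlo propagation described above can be sketched as follows. This is a hedged illustration only: the function names are hypothetical, and the parameter draws generated here are placeholders scattered around the simulation values (c0 = 1500, k = 0.8, σm = 0.3) so that the example runs; they are not posterior samples or results from this study.

```python
import math
import random

def predict(draws, t, rng):
    """Simulate c = exp(ln c0 - k t + No(0, sigma_m)) once per parameter triplet."""
    return sorted(
        math.exp(math.log(c0) - k * t + rng.gauss(0.0, sigma_m))
        for c0, k, sigma_m in draws
    )

def percentile(sorted_values, q):
    """Nearest-rank style percentile of an already sorted list."""
    index = round(q * (len(sorted_values) - 1))
    return sorted_values[index]

rng = random.Random(0)
# Placeholder triplets (c0, k, sigma_m); in practice these would come from
# the fitted model, for example MCMC output.
draws = [
    (math.exp(rng.gauss(math.log(1500.0), 0.2)),
     rng.gauss(0.8, 0.05),
     abs(rng.gauss(0.3, 0.05)))
    for _ in range(5000)
]
for t in (1, 4, 7):
    sims = predict(draws, t, rng)
    print(t, percentile(sims, 0.025), percentile(sims, 0.5), percentile(sims, 0.975))
```

Comparing the width of the 2.5% to 97.5% range across calibration approaches, at each prediction time, is one way to express the difference in prediction uncertainty.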
[Figure 6.2. Three panels, for t = 1, 4, and 7 days, showing interval estimates of predicted FIB concentration c by sample size (10, 25, 100).]
Figure 6.2: Estimated inner quartile (50%, thick black line) and 95% intervals (thin black line) for model-predicted FIB concentrations at time t = 1, 4, and 7 days. Vertical gray lines indicate the expected FIB concentration using the “true” parameter values. Dots (solid and hollow) indicate median values. For each sample size, the upper line (with solid circle) represents predicted FIB concentrations using the model calibrated with MPN point estimates, and the lower line (with hollow circle) represents predicted FIB concentrations using the model calibrated using the pattern of positive tubes.
I also explored the choice of parameter prior distributions as a potential source
of bias in the posterior parameter distribution. For example, posterior parameter
distributions for k based on a normal prior distribution, k ∼ No(0, σk²) with σk ∼
U(0,20), were compared to the posterior parameter distribution based on a uniform
prior distribution, k ∼ U(0,20) (see Gelman, 2006, for details on prior distribution
parameterization). Differences between the resulting posterior parameter distribu-
tions were negligible, indicating that my selection of prior distributions was not a
significant source of parameter estimation bias.
Opportunities for applying my modeling approach are found in a broad range
of environmental and public health-related disciplines. For example, Harris et al.
(1998) utilize MPN data in the analysis of planktonic diatom concentrations in sedi-
ment samples and cite similar studies using MPN calculations (e.g. Larrazabal et al.,
1990; An et al., 1992). Eckford and Fedorak (2005) use an MPN method to as-
sess nitrate-reducing bacteria growth in oil fields, and Fegan et al. (2004) present a
series of studies enumerating Escherichia coli O157 in cattle feces using MPN pro-
cedures. Additional examples of MPN-based environmental assessment include soil
and groundwater composition analysis (Menyah and Sato, 1996; Papen and von Berg,
1998) and aquifer contamination studies (Bekins et al., 1999). A specific example of
an MPN-based assessment of fecal contamination in recreational water bodies is the
Oregon Beach Monitoring Program (Neumann et al., 2006). This program, while
acknowledging environmental conditions as potential sources of data variability,
applies MPN point estimates of FIB concentration rather than probabilistic estimates,
and therefore represents the type of study which could utilize, and potentially be
improved by, my modeling strategy.
In light of the many examples of uses of MPN data, I must acknowledge an on-
going transition in FIB water quality monitoring from traditional MTF technologies
B. Prior distributions and WinBUGS code for estimating parameters using
the pattern of positive tubes from a decimal dilution analysis:
Following the approach of Gelman (2006), I use the following parameter prior
distributions:
π(c0) ∼ LN(0, σc0)
π(σc0) ∼ U(0, 20)
π(k) ∼ U(0, 20)
π(σm) ∼ U(0, 20)
and the following WinBUGS code:
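model {
  # Data: J, n.run, id[], t[], t1[], t2[], t3[]
  for (j in 1:J){                      #J = number of samples in each set
    #id[j] = set number (out of n.run) for sample j
    #t1, t2, t3 = number of positive tubes in each dilution series
    t1[j] ~ dbin(p1[j],n)
    t2[j] ~ dbin(p2[j],n)
    t3[j] ~ dbin(p3[j],n)
    #probability of a positive tube in each dilution series
    p1[j] <- 1-exp(-(c[j]/100)*v1)
    p2[j] <- 1-exp(-(c[j]/100)*v2)
    p3[j] <- 1-exp(-(c[j]/100)*v3)
    #first-order decay with log-scale model error
    c[j] <- exp(logc0[id[j]]-k[id[j]]*t[j]+error[j])
    error[j] ~ dnorm(0,tau[id[j]])     #dnorm is parameterized by precision
  }
  #aliquot volumes (ml) and number of tubes per dilution series
  v1 <- 10
  v2 <- 1
  v3 <- .1
  n <- 5
  for (i in 1:n.run){                  #n.run = 100 sets
    logc0[i] ~ dnorm(0,tauc0)
    tau[i] <- pow(sigma[i],-2)         #precision = 1/sigma^2
    sigma[i] ~ dunif(0,20)
    k[i] ~ dunif(0,20)
  }
  tauc0 <- pow(sigmac0,-2)
  sigmac0 ~ dunif(0,20)
}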
Appendix A
Listing of Impaired Waters
Water body name        Assessment Unit    Classification    IR Category
Newport River          21-(17)a           SA                4cs
Newport River          21-(17)b1          SA                4cs
Newport River          21-(17)b2          SA                5
Newport River          21-(17)c           SA                5
Newport River          21-(17)d1          SA                5
Newport River          21-(17)d3          SA                4cs
Newport River          21-(17)e1          SA                4cs
Newport River          21-(17)e2          SA                4cs
Newport River          21-(17)f           SA                4cs
Newport River          21-(17)g1          SA                4cs
Newport River          21-(17)g2          SA                4cs
Newport River          21-(17)h           SA                5
Little Creek Swamp     21-18              SA                4cs
Mill Creek             21-19              SA                4cs
Big Creek              21-20              SA                4cs
Little Creek           21-21              SA                4cs
Harlowe Canal          21-22-1            SA                4cs
Alligator Creek        21-22-2            SA                4cs
Harlowe Creek          21-22a             SA                4cs
Harlowe Creek          21-22b1            SA                4cs
Harlowe Creek          21-22b2            SA                4cs
Harlowe Creek          21-22b3            SA                4cs
Harlowe Creek          21-22c             SA                5
Oyster Creek           21-23a             SA                5
Oyster Creek           21-23b             SA                4cs
Eastman Creek          21-24-1            SA                4cs
Bell Creek             21-24-2a           SA                4cs
Bell Creek             21-24-2b           SA                4cs
Core Creek             21-24a             SA                NA
Core Creek             21-24b1            SA                4cs
Core Creek             21-24b2            SA                4cs
Core Creek             21-24c             SA                4cs
Ware Creek             21-25              SA                5
Russell Creek          21-26a             SA                4cs
Russell Creek          21-26b             SA                4cs
Wading Creek           21-27              SA                4cs
Gable Creek            21-28a             SA                4cs
Gable Creek            21-28b             SA                4cs
Willis Creek           21-29              SA                4cs
Crab Point Bay         21-30              SA                4cs

Table A.1: Water bodies within shellfish growing area E-4 and their status relative to the 303(d) list of impaired waters. “IR Category” refers to 2008 Draft Integrated Report (IR) Category.
Appendix B
North Carolina Shellfish Harvesting Area
Water Quality Standards
Title 15A of the North Carolina Administrative Code (NCAC), Chapter 18 (Environmental Health), SubChapter A (Sanitation), Sections .0300 through .0900 provide rules governing the harvest, growth, distribution and consumption of shellfish. The following is a summary of the four major shellfish growing area classifications as presented in Section .0900 of the pertinent section of the NCAC:
Approved Areas - A shellfish growing area is classified as Approved if the following criteria are met:
1. the shoreline survey has indicated that there is no significant point source contamination;
2. the area is not contaminated with fecal material, pathogenic microorganisms, poisonous and deleterious substances, or marine biotoxins that may render consumption of the shellfish hazardous;
3. the median fecal coliform most probable number (MPN) or the geometric mean MPN of water shall not exceed 14 per 100 milliliters, and not more than ten percent of the samples shall exceed a fecal coliform MPN of 43 per 100 milliliters (per five tube decimal dilution) in those portions of areas most probably exposed to fecal contamination during adverse pollution conditions.
Conditionally Approved Areas - As stated in NCAC, conditionally approved areas are those expected to meet Approved Area criteria for extended periods and the factors determining those periods are known and predictable. Written management plans are developed by the Division of Environmental Health for these areas. When management plan criteria are met, the Division may recommend these areas opened to shellfish harvest on a temporary basis. When management plan criteria are not met, or the public health appears to be jeopardized, the Division recommends immediate closure of the area.
Restricted Areas - An area is classified as Restricted when the sanitary survey indicates a limited degree of pollution, and the area is not contaminated to the extent that indicates that consumption of shellfish could be hazardous after controlled depuration or relaying. According to Shellfish Sanitation Section Staff, shellfish may be transported from restricted areas to other areas for cleansing for a minimum of 14 days.
Prohibited Areas - Areas are classified as Prohibited if there is no current sanitary survey, or if sanitary survey information indicates that the area does not meet criteria for an Approved, Conditionally Approved, or Restricted Area. In addition, areas are classified as Prohibited if the growing area is within a wastewater treatment plant outfall buffer zone or the immediate vicinity of a marina (unless it has fewer than 30 slips, has no boats over 24 feet in length, or has no boats with heads or cabins). Specific growing area limits are included in Section .0900 of NCAC.
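The numeric part of the Approved Area criteria (item 3 above) can be expressed as a simple check on a set of fecal coliform MPN results. The sketch below is an illustration only, not a regulatory tool: the function name and the option to choose between the median and the geometric mean are assumptions, and the survey-based criteria (items 1 and 2) are outside its scope.

```python
import statistics

def meets_fecal_coliform_criterion(mpn_values, central="median"):
    """Check the 14 per 100 ml central-tendency limit and the 10% / 43 per 100 ml limit."""
    if central == "median":
        center = statistics.median(mpn_values)
    else:
        # The geometric mean requires strictly positive values.
        center = statistics.geometric_mean(mpn_values)
    fraction_above_43 = sum(value > 43 for value in mpn_values) / len(mpn_values)
    return center <= 14 and fraction_above_43 <= 0.10

# Hypothetical MPN results (organisms per 100 ml), for illustration only.
print(meets_fecal_coliform_criterion([2, 2, 4, 8, 2, 13, 2, 5, 49, 2]))
```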
Bibliography
Adam, R. D. (1991). The biology of Giardia spp. Microbiological Reviews 55, 4, 706–732.
Alley, W. M. and Smith, P. E. (1981). Estimation of accumulation parameters for urban runoff quality modeling. Water Resources Research 17, 6, 1657–1664.
An, K. H., Lassus, P., Maggi, P., Bardouil, M., and Truquet, P. (1992). Dinoflagellate cyst changes and winter environmental-conditions in Vilaine Bay, Southern Brittany (France). Botanica Marina 35, 1, 61–67.
APHA (2005). Standard methods for the examination of water and wastewater. American Public Health Association, Washington, DC, 20th edn.
Arega, F. and Sanders, B. F. (2004). Dispersion model for tidal wetlands. Journal of Hydraulic Engineering-ASCE 130, 8, 739–754.
Ashbolt, N. J., Grohmann, G. S., and Kueh, C. S. W. (1993). Significance of specific bacterial pathogens in the assessment of polluted receiving waters of Sydney, Australia. Water Science and Technology 27, 3/4, 449–452.
Aspinall, L. J. and Kilsby, D. C. (1979). A microbiological quality-control procedure based on tube counts. Journal of Applied Bacteriology 46, 2, 325–329.
Auer, M. T. and Niehaus, S. L. (1993). Modeling Fecal Coliform Bacteria–I. Field and Laboratory Determination of Loss Kinetics. Water Research 27, 4, 693–701.
Badenoch, J., Bartlett, L., Benton, C., Casemore, D., Cawthorne, R., Earnshaw, F., Ives, K., Jeffery, J., Smith, H., Vaile, M., Warrell, D., and Wright, A. (1990). Cryptosporidium in water supplies. Report of the group experts. Tech. rep., Department of the Environment, Department of Health. London, UK. HMSO.
Barbe, D. E., Cruise, J. F., and Mo, X. (1996). Modeling the buildup and washoff of pollutants on urban watersheds. Water Resources Bulletin 32, 3, 511–519.
Bekins, B. A., Godsy, E. M., and Warren, E. (1999). Distribution of microbial physiologic types in an aquifer contaminated by crude oil. Microbial Ecology 37, 4, 263–275.
Beliaeff, B. and Mary, J.-Y. (1993). The most probable number estimate and its confidence-limits. Water Research 27, 5, 799–805.
Benham, B. L., Baffaut, C., Zeckoski, R. W., Mankin, K. R., Pachepsky, Y. A., Sadeghi, A. M., Brannan, K. M., Soupir, M. L., and Habersack, M. J. (2006). Modeling bacteria fate and transport in watersheds to support TMDLs. Transactions of the ASABE 49, 4, 987–1002.
Bernardo, J. M. and Ramon, J. M. (1998). An introduction to Bayesian reference analysis: inference on the ratio of multinomial parameters 47, 1, 101–135.
Berry, D. A. (1996). Statistics: a Bayesian Perspective. Duxbury Press, Belmont, California.
Best, D. J. and Rayner, J. C. W. (1985). A comparison of the MPN and Fisher-Yates estimators for the density of organisms. Biometrical Journal 27, 2, 167–172.
Beven, K. (2001). How far can we go in distributed hydrological modelling? Hydrology and Earth System Sciences 5, 1, 1–12.
Bingham, A. K., Jarroll, E. L., and Meyer, E. A. (1979). Giardia-sp - physical factors of excystation invitro, and excystation vs eosin exclusion as determinants of viability. Experimental Parasitology 47, 2, 284–291.
Blanchard, G. F., Sauriau, P. G., Gall, V. C. L., Gouleau, D., Garet, M. J., and Olivier, F. (1997). Kinetics of tidal resuspension of microbiota: Testing the effects of sediment cohesiveness and bioturbation using flume experiments. Marine Ecology-Progress Series 151, 17–25.
Bolstad, W. M. (2004). Introduction to Bayesian Statistics. Wiley-Interscience, Hoboken, N.J.
Borsuk, M. E., Stow, C. A., and Reckhow, K. H. (2002). Predicting the frequency of water quality standard violations: A probabilistic approach for TMDL development. Environmental Science & Technology 36, 10, 2109–2115.
Borsuk, M. E., Stow, C. A., and Reckhow, K. H. (2004). A Bayesian network of eutrophication models for synthesis, prediction, and uncertainty analysis. Ecological Modelling 173, 2-3, 219–239.
Bowie, G., Mills, W., Porcella, D., Campbell, C., and Chamberlin, C. (1985). Rates, constants, and kinetics formulations in surface water quality modeling. United States Environmental Protection Agency Office of Research and Development Environmental Research Laboratory, Washington, D.C., 2nd edn.
Cabelli, V. J. (1983). Water-borne Viral Infections In: M. Butler, R. Medlen and R. Morris (eds), “Viruses and Disinfection of Water and Wastewater.”. Surrey Press, Guilford, England.
Cabelli, V. J., Dufour, A. P., McCabe, L. J., and Levin, M. A. (1983). A marine recreational water-quality criterion consistent with indicator concepts and risk analysis. Journal Water Pollution Control Federation 55, 10, 1306–1314.
Casella, G. and Berger, R. L. (2002). Statistical Inference. Duxbury, Pacific Grove, California.
Chapra, S. C. (1997). Surface water-quality modeling. McGraw-Hill series in water resources and environmental engineering index. McGraw-Hill, New York.
Chapra, S. C., Pelletier, G. J., and Tao, H. (2007). QUAL2K: A modeling framework for simulating river and stream water quality, version 2.07: Documentation and user’s manual. Tech. rep., Civil and Environmental Engineering Dept., Tufts University.
Cochran, W. G. (1950). Estimation of bacterial densities by means of the ‘most probable number’. Biometrics 6, 2, 105–116.
Cooter, W. S. (2004). Clean water act assessment processes in relation to changing U.S. Environmental Protection Agency management strategies. Environmental Science & Technology 38, 20, 5265–5273.
Davies-Colley, R. J., Bell, R. G., and Donnison, A. M. (1994). Sunlight inactivation of Enterococci and fecal-coliforms in sewage effluent diluted in seawater. Applied and Environmental Microbiology 60, 6, 2049–2058.
de Man, J. C. (1977). MPN tables for more than one test. European Journal of Applied Microbiology and Biotechnology 4, 4, 307–316.
Dufour, A. P. and Cabelli, V. J. (1975). Membrane-filter procedure for enumerating component genera of coliform group in seawater. Applied Microbiology 29, 6, 826–833.
Dufour, A. P., Strickland, E. R., and Cabelli, V. J. (1981). Membrane-filter method for enumerating Escherichia coli. Appl. Environ. Microbiol. 41, 5, 1152–1158.
Eckford, R. E. and Fedorak, P. M. (2005). Applying a most probable number method for enumerating planktonic, dissimilatory, ammonium-producing, nitrate-reducing bacteria in oil field waters. Canadian Journal of Microbiology 51, 8, 725–729.
Eckner, K. F. (1998). Comparison of membrane filtration and multiple-tube fermentation by the colilert and enterolert methods for detection of waterborne coliform bacteria, Escherichia coli, and enterococci used in drinking and bathing water quality monitoring in southern Sweden. Applied and Environmental Microbiology 64, 8, 3079–3083.
Eisenhart, C. and Wilson, P. W. (1943). Statistical methods and control in bacteriology. Bacteriological Reviews 7, 2, 57–137.
Esham, E. C. and Sizemore, R. K. (1998). Evaluation of two techniques: mFC and mTEC for determining distributions of fecal pollution in small, North Carolina tidal creeks. Water Air and Soil Pollution 106, 1, 179–197.
Fegan, N., Higgs, G., Vanderlinde, P., and Desmarchelier, P. (2004). Enumeration of Escherichia coli O157 in cattle faeces using most probable number technique and automated immunomagnetic separation. Letters in Applied Microbiology 38, 1, 56–59.
Ferguson, C., Husman, A. M. D., Altavilla, N., Deere, D., and Ashbolt, N. (2003). Fate and transport of surface water pathogens in watersheds. Critical Reviews in Environmental Science and Technology 33, 3, 299–361.
Fischer, H. B. (1979). Mixing in inland and coastal waters. Academic Press, New York.
Food and Drug Administration and Interstate Shellfish Sanitation Conference (2005). National Shellfish Sanitation Program - guide for the control of molluscan shellfish.
Gameson, A. and Gould, D. (1974). Effects of solar radiation on the mortality of some terrestrial bacteria in sea water. In International Symposium on Discharge of Sewage from Sea Outfalls, vol. Paper No. 22, London. Pergamon Press.
Garthright, W. E. (1993). Bias in the logarithm of microbial density estimates from serial dilutions. Biometrical Journal 35, 3, 299–314.
Garthright, W. E. (1997). A Bayesian analysis of serial dilutions offers a worse positive bias than the MPN and proposes an inappropriate interval estimate. Food Microbiology 14, 5, 515–517.
Gelman, A. (2006). Prior distributions for variance parameters in hierarchical models (comment on article by Browne and Draper). Bayesian Analysis 1, 3, 515–534.
Ghinsberg, R. C., Dov, L. B., Sheinberg, Y., Nitzan, Y., and Rogol, M. (1994). Monitoring of selected bacteria and fungi in sand and sea-water along the Tel-aviv coast. Microbios 77, 310, 29–40.
Grant, S. B., Sanders, B. F., Boehm, A. B., Redman, J. A., Kim, J. H., Mrse, R. D., Chu, A. K., Gouldin, M., McGee, C. D., Gardiner, N. A., Jones, B. H., Svejkovsky, J., and Leipzig, G. V. (2001). Generation of Enterococci bacteria in a coastal saltwater marsh and its impact on surf zone water quality. Environmental Science & Technology 35, 12, 2407–2416.
Greenwood, M. and Yule, G. U. (1917). On the statistical interpretation of some bacteriological methods employed in water analysis. The Journal of Hygiene 16, 1, 36–54.
Gronewold, A. D., Borsuk, M. E., Wolpert, R. L., and Reckhow, K. H. (2008). An assessment of fecal indicator bacteria-based water quality standards. Environmental Science & Technology 42, 13, 4676–4682.
Gronewold, A. D. and Reckhow, K. H. (2007). Developing a Bayesian network model for bacteriologically impaired surface waters. In proceedings of the 7th International (IWA) Symposium on Systems Analysis and Integrated Assessment in Water Management (Washington, D.C., USA).
Gronewold, A. D. and Wolpert, R. L. (2008). Modeling the relationship between most probable number (MPN) and colony-forming unit (CFU) estimates of fecal coliform concentration. Water Research 42, 13, 3327–3334.
Gronewold, A. D., Wolpert, R. L., Noble, R. T., Coulliette, A. D., and Reckhow, K. H. (2007). Developing a Bayesian network model for supporting fecal coliform TMDL assessments. In proceedings of the Water Environment Federation Specialty Conference - TMDL 2007 (Bellevue, Washington, USA).
Hackney, C. R. and Pierson, M. D. (1994). Environmental indicators and shellfish safety. Chapman & Hall, New York.
Harris, A. S. D., Jones, K. J., and Lewis, J. (1998). An assessment of the accuracy and reproducibility of the most probable number (MPN) technique in estimating numbers of nutrient stressed diatoms in sediment samples. Journal of Experimental Marine Biology and Ecology 231, 1, 21–30.
Horowitz, A. (1986). Comparison of methods for the concentration of suspended sediment in river water for subsequent chemical analysis. Environmental Science & Technology 20, 2, 155–160.
Houck, O. A. (2002). The Clean Water Act TMDL program: law, policy, and implementation. Environmental Law Institute, Washington, D.C., 2nd edn.
Hurley, M. A. and Roscoe, M. E. (1983). Automated statistical analysis of microbial enumeration by dilution series. Journal of Applied Bacteriology 55, 1, 159–164.
Irvine, K. N. and Pettibone, G. W. (1993). Dynamics of indicator bacteria populations in sediment and river water near a combined sewer outfall. Environmental Technology 14, 6, 531–542.
Jakeman, A. J. and Letcher, R. A. (2003). Integrated assessment and modelling: Features, principles and examples for catchment management. Environmental Modelling & Software 18, 6, 491–501.
Jeffreys, H. (1946). An invariant form for the prior probability in estimation problems. Proceedings of the Royal Society of London Series A - Mathematical and Physical Sciences 186, 1007, 453–461.
Jensen, F. V., Olesen, K. G., and Andersen, S. K. (1990). An algebra of Bayesian belief universes for knowledge-based systems. Networks 20, 5, 637–659.
Johnson, D. C., Enriquez, C. E., Pepper, I. L., Davis, T. L., Gerba, C. P., and Rose, J. B. (1997). Survival of Giardia, Cryptosporidium, poliovirus and salmonella in marine waters. Water Science and Technology 35, 11-12, 261–268.
Kashefipour, S. M., Lin, B., and Falconer, R. A. (2005). Neural networks for predicting seawater bacterial levels. Proceedings of The Institution of Civil Engineers-Water Management 158, 3, 111–118.
Ketchum, B. (1951). The exchanges of fresh and salt waters in tidal estuaries. Journal of Marine Research 10, 1, 18–38.
Kinzelman, J., Ng, C., Jackson, E., Gradus, S., and Bagley, R. (2003). Enterococci as indicators of Lake Michigan recreational water quality: Comparison of two methodologies and their impacts on public health regulatory events. Applied and Environmental Microbiology 69, 1, 92–96.
Klee, A. J. (1993). A computer-program for the determination of most probable number and its confidence-limits. Journal of Microbiological Methods 18, 2, 91–98.
Kloot, R. W., Radakovich, B., Huang, X.-Q., and Brantley, D. (2006). A comparison of bacterial indicators and methods in rural surface waters. Environmental Monitoring and Assessment 121, 1, 275–287.
Kuo, A. and Neilson, B. (1988). Modified Tidal Prism Model for Water Quality in Small Coastal Embayments. Water Science and Technology 20, 6/7, 133–142.
Kuo, A., Park, K., Kim, S., and Lin, J. (2005). A Tidal Prism Water Quality Model for Small Coastal Basins. Coastal Management 33, 1, 101–117.
Larrazabal, M. E., Lassus, P., Maggi, P., and Bardouil, M. (1990). Modern dinoflagellate kysts in Vilaine Bay Southern Brittany (France). Cryptogamie Algologie 11, 3, 171–185.
LeClerc, H., Mossel, D. A. A., Edberg, S. C., and Struijk, C. B. (2001). Advances in the bacteriology of the coliform group: Their suitability as markers of microbial water safety. Annual Review of Microbiology 55, 201–234.
Lee, J. H. and Bang, K. W. (2000). Characterization of urban stormwater runoff. Water Research 34, 6, 1773–1780.
Levin, M. A., Fischer, J. R., and Cabelli, V. J. (1975). Membrane filter technique for enumeration of enterococci in marine waters. Applied Microbiology 30, 1, 66–71.
Lunn, D. J., Thomas, A., Best, N., and Spiegelhalter, D. (2000). WinBUGS - A Bayesian modelling framework: Concepts, structure, and extensibility. Statistics and Computing 10, 4, 325–337.
Mancini, J. L. (1978). Numerical estimates of coliform mortality-rates under various conditions. Journal Water Pollution Control Federation 50, 11, 2477–2484.
McBride, G. B. (2003). Preparing exact most probable number (mpn) tables using occupancy theory, and accompanying measures of uncertainty. NIWA Technical Report 121 62.
McBride, G. B. (2005). Using statistical methods for water quality management. Issues, problems and solutions. John Wiley & Sons Ltd Chichester, UK.
McBride, G. B., McWhirter, J. L., and Dalgety, M. H. (2003). Uncertainty in most probable number calculations for microbiological assays. Journal of AOAC International 86, 5, 1084–1088.
McCrady, M. H. (1915). The numerical interpretation of fermentation tube results. Journal of Infectious Diseases 17, 1, 183–212.
McMurry, S. W., Coyne, M. S., and Perfect, E. (1998). Fecal coliform transport through intact soil blocks amended with poultry manure. Journal of Environmental Quality 27, 1, 86–92.
Medema, G. J., Bahar, M., and Schets, F. M. (1997). Survival of Cryptosporidium parvum, Escherichia coli, faecal Enterococci and Clostridium perfringens in river water: Influence of temperature and autochthonous microorganisms. Water Science and Technology 35, 11, 249–252.
Menyah, M. K. and Sato, K. (1996). A proposal for re-evaluating the most probable number procedure for estimating numbers of Bradyrhizobium spp. Biology and Fertility of Soils 23, 2, 110–112.
Mitchell, R. and Chamberlin, C. (1979). Indicators of viruses in water and food (edited by Berg G.). 1–12. Ann Arbor Science Publishers, Inc, Ann Arbor, MI.
Moeller, J. R. and Calkins, J. (1980). Bactericidal agents in waste-water lagoons and lagoon design. Journal Water Pollution Control Federation 52, 10, 2442–2451.
National Research Council (2001). Assessing the TMDL approach to water quality management.
N.C. Department of Environment and Natural Resources (2004). Coastal recreational waters monitoring, evaluation, and notification rules: 15A NCAC 18A .3400.
NCDENR (2007). Study on comparison between CFU and MPN estimates of fecal coliform concentration.
Neumann, C. M., Harding, A. K., and Sherman, J. M. (2006). Oregon Beach monitoring program: Bacterial exceedances in marine and freshwater creeks/outfall samples, October 2002-April 2005. Marine Pollution Bulletin 52, 10, 1270–1277.
Nix, P. G., Daykin, M. M., and Vilkas, K. L. (1993). Sediment bags as an integrator of fecal contamination in aquatic systems. Water Research 27, 10, 1569–1576.
Noble, R. T. and Fuhrman, J. A. (1997). Virus decay and its causes in coastal waters. Applied and Environmental Microbiology 63, 1, 77–83.
Noble, R. T., Moore, D. F., Leecaster, M. K., McGee, C. D., and Weisberg, S. B. (2003a). Comparison of total coliform, fecal coliform, and enterococcus bacterial indicator response for ocean recreational water quality testing. Water Research 37, 7, 1637–1643.
Noble, R. T., Weisberg, S. B., Leecaster, M. K., McGee, C. D., Ritter, K. J., Walker, K. O., and Vainik, P. M. (2003b). Comparison of beach bacterial water quality indicator measurement methods. Environmental Monitoring and Assessment 81, 1, 301–312.
Novotny, V. and Olem, H. (1994). Water quality: Prevention, identification, and management of diffuse pollution. Van Nostrand Reinhold, New York, 1st edn.
Obiri-Danso, K. and Jones, K. (2000). Intertidal sediments as reservoirs for hippurate negative Campylobacters, Salmonellae and faecal indicators in three E.U. recognised bathing waters in northwest England. Water Research 34, 2, 519–527.
Ott, W. (1995). Environmental statistics and data analysis. Lewis Publishers, Boca Raton.
Papen, H. and von Berg, R. (1998). A most probable number method (MPN) for the estimation of cell numbers of heterotrophic nitrifying bacteria in soil. Plant and Soil 199, 1, 123–130.
Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. Morgan Kaufmann Publishers, San Mateo, Calif.
Qian, S. S., Donnelly, M., Schmelling, D. C., Messner, M., Linden, K. G., and Cotton, C. (2004). Ultraviolet light inactivation of protozoa in drinking water: a Bayesian meta-analysis. Water Research 38, 2, 317–326.
Qian, S. S., Linden, K. G., and Donnelly, M. (2005). A Bayesian analysis of mouse infectivity data to evaluate the effectiveness of using ultraviolet light as a drinking water disinfectant. Water Research 39, 17, 4229–4239.
R Development Core Team (2006). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.
Reckhow, K. H. (1994). Water-quality simulation modeling and uncertainty analysis for risk assessment and decision-making. Ecological Modelling 72, 1, 1–20.
Reckhow, K. H. (1999). Water quality prediction and probability network models. Canadian Journal of Fisheries and Aquatic Sciences 56, 7, 1150–1158.
Reeves, R. L., Grant, S. B., Mrse, R. D., Oancea, C. M. C., Sanders, B. F., and Boehm, A. B. (2004). Scaling and management of fecal indicator bacteria in runoff from a coastal urban watershed in Southern California. Environmental Science & Technology 38, 9, 2637–2648.
Rippey, S. R., Adams, W. N., and Watkins, W. D. (1987). Enumeration of fecal-coliforms and Escherichia-coli in marine and estuarine waters - an alternative to the APHA-MPN approach. Journal Water Pollution Control Federation 59, 8, 795–798.
Rompre, A., Servais, P., Baudart, J., de Roubin, M.-R., and Laurent, P. (2002). Detection and enumeration of coliforms in drinking water: current methods and emerging approaches. Journal of Microbiological Methods 49, 1, 31–54.
Rose, R. E., Geldreich, E. E., and Litsky, W. (1975). Improved membrane-filter method for fecal coliform analysis. Applied Microbiology 29, 4, 532–536.
Roussanov, B., Hawkins, D. M., and Tatini, S. R. (1996). Estimating bacterial density from tube dilution data by a Bayesian method. Food Microbiology 13, 5, 341–363.
Russek, E. and Colwell, R. R. (1983). Computation of most probable numbers. Appl. Environ. Microbiol. 45, 5, 1646–1650.
Salomon, J. C. and Pommepuy, M. (1990). Mathematical-model of bacterial-contamination of the Morlaix estuary (France). Water Research 24, 8, 983–994.
Sanders, B. F., Arega, F., and Sutula, M. (2005). Modeling the dry-weather tidal cycling of fecal indicator bacteria in surface waters of an intertidal wetland. Water Research 39, 14, 3394–3408.
Sanford, L., Boicourt, W., and Rives, S. (1992). Model for estimating tidal flushing of small embayments. Journal of Waterway, Port, Coastal and Ocean Engineering 118, 6, 635–654.
Sayler, G. S., Nelson, J. D., Justice, A., and Colwell, R. R. (1975). Distribution and significance of fecal indicator organisms in Upper Chesapeake Bay. Applied Microbiology 30, 4, 625–638.
Schijven, J. F. and Hassanizadeh, S. M. (2000). Removal of viruses by soil passage: Overview of modeling, processes, and parameters. Critical Reviews In Environmental Science and Technology 30, 1, 49–127.
Schijven, J. F. and Hassanizadeh, S. M. (2002). Virus removal by soil passage at field scale and groundwater protection of sandy aquifers. Water Science and Technology 46, 3, 123–129.
Schmee, J. and Hahn, G. J. (1979). A simple method for regression analysis with censored data. Technometrics 21, 4, 417–432.
Shen, J., Sun, S.-C., and Wang, T.-P. (2005). Development of the fecal coliform total maximum daily load using Loading Simulation Program C++ and tidal prism model in estuarine shellfish growing areas: A case study in the Nassawadox coastal embayment, Virginia. J. Environ. Sci. Heal. A 40, 9, 1791–1807.
Smith, E. P., Ye, K. Y., Hughes, C., and Shabman, L. A. (2001). Statistical assessment of violations of water quality standards under section 303(d) of the Clean Water Act. Environmental Science & Technology 35, 3, 606–612.
Spiegelhalter, D., Dawid, A., Lauritzen, S., and Cowell, R. (1993). Bayesian Analysis in Expert Systems. Statistical Science 8, 3, 219–247.
Spiegelhalter, D. J., Thomas, A., Best, N. G., and Lunn, D. J. (2003). WinBUGS version 1.4 user manual. Tech. rep., Medical Res. Counc. Biostat. Unit, Cambridge, UK.
Tanner, M. A. (1991). Tools for Statistical Inference. Springer-Verlag, New York, NY.
Thomann, R. V. and Mueller, J. A. (1987). Principles of surface water quality modeling and control. Harper & Row, New York.
Thomas, G. W. and Phillips, R. E. (1979). Consequences of water-movement in macropores. Journal of Environmental Quality 8, 2, 149–152.
Tillett, H. E. and Coleman, R. (1985). Estimated numbers of bacteria in samples from non-homogeneous bodies of water - how should mpn and membrane filtration results be reported. Journal of Applied Bacteriology 59, 4, 381–388.
Tzipori, S. (1983). Cryptosporidiosis in animals and humans. Microbiological Reviews 47, 1, 84–96.
U.S. Environmental Protection Agency (2001). Protocol for developing pathogen TMDLs. Tech. Rep. EPA 841-R-00-002, Office of Water (4503F), United States Environmental Protection Agency, Washington, DC.
U.S. Environmental Protection Agency (2002). National water quality inventory: Report to Congress (2002 reporting cycle), EPA 841-R-07-001.
U.S. Environmental Protection Agency (2005a). Code of federal regulations: Title 40, chapter 1, part 141.
U.S. Environmental Protection Agency (2005b). Guidance for 2006 assessment, listing and reporting requirements pursuant to sections 303(d), 305(b) and 314 of the Clean Water Act.
U.S. Geological Survey (1996). Water quality of the Lower Columbia River Basin:Analysis of current and historical water-quality data through 1994 (Water-resources investigations report 95-4294), 52-53. Tech. rep., U.S. Geological Survey.
Vandenberghe, V., Bauwens, W., and Vanrolleghem, P. A. (2007). Evaluation of uncertainty propagation into river water quality predictions to guide future monitoring campaigns. Environmental Modelling & Software 22, 5, 725–732.
Weisberg, S. (2005). Applied linear regression. Wiley series in probability and statistics. Wiley-Interscience, Hoboken, N.J., 3rd edn.
Weiskel, P. K., Howes, B. L., and Heufelder, G. R. (1996). Coliform contamination of a coastal embayment: Sources and transport pathways. Environmental Science & Technology 30, 6, 1872–1881.
White, N. M., Line, D. E., Potts, J. D., Kirby-Smith, W., Doll, B., and Hunt, W. F. (2000). Jump Run Creek shellfish restoration project. Journal of Shellfish Research 19, 1, 473–476.
Woodward, R. L. (1957). How probable is the most probable number? Journal of the American Water Works Association 49, 1, 1060–1068.
Woomer, P. L., Bennett, J., and Yost, R. (1990). Overcoming the inflexibility of most-probable-number procedures. Agronomy Journal 82, 2, 349–353.
Biography
My research and career objectives first took shape during my undergraduate education at Cornell University’s School of Civil and Environmental Engineering. After graduating from Cornell in 1995, I was employed as a project manager and licensed professional engineer with the environmental engineering consulting firms Stearns & Wheler, LLC and the Ecological Engineering Group, Inc. Between 1995 and 2003 I initiated and completed over forty planning, design, and construction projects in areas of wastewater, water, and solid waste management. Significant project accomplishments include obtaining grant funding for point and non-point source pollution mitigation projects in small communities through the Massachusetts Coastal Zone Management (CZM) Coastal Pollutant Remediation (CPR) program, and serving as the resident engineer during the closure of a 54-acre municipal solid waste landfill. I also completed a series of comprehensive watershed and wastewater management planning studies for rapidly growing communities in southeastern Massachusetts. Each planning project included a detailed analysis of environmental management infrastructure alternatives, evaluation of public policy and regulatory issues, and extensive field work to determine hydrogeological and surface water quality conditions.
In addition to my work as an environmental engineer, I began supervising and coordinating a wide variety of research projects in 1999 as a scientist and teacher with the Sea Education Association (SEA) based in Woods Hole, Massachusetts. I have since logged over 200 days at sea as a teacher with SEA while advising high school and college-level students during the data gathering and report writing phases of individual research projects, including non-point source pollution analysis of nutrient loading in Samana Bay in the Dominican Republic and distribution of spiny lobster larvae across the Gulf Stream. As a student with SEA, I investigated the impacts of stormwater runoff on eutrophication in St. George’s Harbor, Bermuda, and subsequent implications for outbreak of shellfish-borne diseases such as paralytic shellfish poisoning (PSP) and ciguatera. My experience with SEA provided a unique perspective on global environmental problems through coastal research projects in Nova Scotia, Bermuda, the Lesser Antilles, and Central America. My passion for teaching, research, and pursuing graduate study was confirmed by my experience with SEA, and my enthusiasm persisted through adverse conditions at sea such as severe weather, sleep deprivation, and seasickness. Throughout my experiences in engineering consulting and with SEA, however, I repeatedly questioned traditional approaches to addressing uncertainty in water quality measurements, construction cost estimates, and other critical environmental management decision criteria. These questions inspired my return to graduate school and my work on Bayesian statistical models.
I began graduate studies at the Nicholas School of the Environment at Duke University under the guidance of Drs. Kenneth H. Reckhow and Robert L. Wolpert. My research focused on applying statistical models to help solve environmental resource and infrastructure management problems. I specialize in developing innovative modeling tools which integrate monitoring data from multiple spatial and temporal scales to characterize interrelated meteorological and hydrological processes, as well as ecosystem response dynamics. My doctoral research focuses on developing modeling tools for evaluating climate change, land use, and pollutant mitigation scenarios to restore water quality in impaired shellfish harvesting waters in Eastern North Carolina. Significant contributions from this research include a new set of water quality standards imposing limits on parameters of the true fecal bacteria concentration (the applicable measure of water quality), as opposed to traditional standards which impose limits on most probable number (MPN) and colony-forming unit (CFU) concentration point estimates. This research recently appeared as a cover article in Environmental Science & Technology. I also developed an innovative approach to modeling the relationship between alternative measures of fecal coliform concentration, which provided important guidance to shellfish harvesting area managers currently debating a shift in standard laboratory protocol. This research recently appeared in Water Research. The contributions of my graduate work to the scientific community were acknowledged through several awards and scholarships, including the Water Environment Federation Robert Canham Graduate Scholarship, the North Carolina Association of Environmental Professionals Graduate Scholarship, and the QEA, LLC Graduate Scholarship. In addition, I received an Outstanding Student Paper Award for a presentation of my research at the American Geophysical Union (AGU) Fall 2007 Meeting.
While conducting my research at Duke, I also served as the primary instructor for the Nicholas School of the Environment graduate-level course in water quality management, and periodically served as a guest lecturer for courses in water quality modeling and probability. I consistently received positive evaluations from students at Duke and at SEA, and was awarded the Nicholas School of the Environment teaching assistant of the year award after my first year of graduate study.