
Available online at www.sciencedirect.com

Environmental Modelling & Software 23 (2008) 938-947

www.elsevier.com/locate/envsoft

Lake classification to enhance prediction of eutrophication endpoints in Finnish lakes

E. Conrad Lamon III a,*, Olli Malve b, Olli-Pekka Pietiläinen b

a Nicholas School of the Environment and Earth Sciences, Box 90328, Duke University, Levine Science Research Center, A316, Durham, NC 27708, USA

b Finnish Environment Institute (SYKE), Mechelininkatu 34a, PO Box 140, FIN-00251 Helsinki, Finland

Received 24 October 2006; received in revised form 26 October 2007; accepted 29 October 2007

Available online 26 December 2007

Abstract

We used the Bayesian TREED procedure to determine the efficacy of using an existing trophic status classification scheme for prediction of chlorophyll a in 150 Finnish lakes. Growing season data were log (base e) transformed and averaged by lake and year. We compared regressions of lnChla on lnTP and lnTN based on aggregations of the 9 levels of “Lake Type”, the classification scheme of the Finnish Environment Institute (SYKE), to a new classification scheme identified by the Bayesian TREED regression algorithm that partitioned the data based on geographic, morphometric and chemical properties of the lakes. The classifier identified with the BTREED algorithm had the best resulting model fit as measured by several different metrics. The model identified by the BTREED procedure that was allowed to use the suite of geographic, morphometric and chemical classifiers selected only the morphometric variable mean lake depth as the basis of the classification scheme. This model resulted in separate classes for shallow (<2.6 m), medium (2.6 m < mean depth < 16.3 m) and deep (>16.3 m) lakes corresponding to co-control by N and P (shallow and medium depths) and N-control (deep lakes) of algal productivity as measured by chlorophyll a, as indicated by the regression coefficients for each partition on depth. However, TN:TP ratios indicate clear P limitation in each depth class.

© 2007 Elsevier Ltd. All rights reserved.

Keywords: Bayesian TREED model; Tree based models; Stressor-response relationships; Lake classification; Nutrients; Eutrophication

1. Introduction

In 1998, the USEPA published the National Strategy for the Development of Regional Nutrient Criteria (USEPA, 1998). Nutrient criteria are numerical values for both causal (nutrients) and response (algal biomass, chlorophyll a) variables. Among EPA’s goals were to identify problem areas, to serve as the basis for state and tribal water quality criteria for nutrients, and to evaluate the relative success in reducing cultural eutrophication. Rather than proposing a single set of criteria applicable to the entire continental US, EPA has based this scheme on aggregate level 3 ecoregions (Omernik, 1987).

Similarly, the European Union has adopted the Water Framework Directive (WFD, European Commission, 2000), which requires member states to identify quantitative relationships between chemical stressors (nutrients) and biological status (algal biomass, chlorophyll a) as a part of the goal of improving the ecological status of European lakes. Like EPA, the EU member states have recognized the need for regional variation in setting the criteria for “good” ecological status (and, therefore, quantitative stressor-response relationships). Through the WFD, European lakes have been classified into “intercalibration types”, based on lake morphological (mean depth, surface area) and chemical (color) metrics for eventual determination of their ecological status (European Commission, 2004).

* Corresponding author. Tel.: +1 919 613 8105; fax: +1 919 681 5740.

E-mail address: [email protected] (E.C. Lamon III).

1364-8152/$ - see front matter © 2007 Elsevier Ltd. All rights reserved.

doi:10.1016/j.envsoft.2007.10.008

Classification schemes for lakes may be descriptive or predictive. Both the EPA and EU initiatives are related to quantification of eutrophication problems in lakes, and both claim prediction of eutrophication endpoints (such as chlorophyll a) from stressors (such as nutrient concentrations) as a goal. Omernik et al. (1988) suggested “regional patterns will afford a framework for calibrating mathematical predictive models and will allow more accurate evaluations of the potential success of lake restoration projects”. Though both claim a goal of prediction, the level 3 aggregate ecoregions proposed by EPA, and lake intercalibration types used by the EC (European Commission, 2004), represent descriptive rather than predictive classifiers. For intercalibration types, within-group variation along the axes of the classification variables used is small relative to between-group variation with these classifiers. This produces lake classes that are self-similar in the variables used for classification. However, the metrics used to define the classification schemes do not map seamlessly to stressors (nutrients), eutrophication responses (biomass, chlorophyll a) or the stressor-response relationships (models). For a predictive classifier, we are seeking lake classes that share stressor-response relationships (models), and not just common means on morphometric or chemical measures.

Linear regression models can provide descriptions of simple structure in water quality data that are both more useful and more interpretable than their more complex mechanistic counterparts (Malve et al., 2007; Malve, 2007). They are useful because they provide a predictive tool whose properties are well understood for evaluating proposed management manipulations, or for generating testable hypotheses. They are interpretable in that the relative magnitudes of the linear model coefficients may indicate the relative importance of each covariate in describing variability in the response variable, if regressors (predictor variables) are standardized before model estimation.

Sometimes the simple linear structure of a regression model does not apply to an entire dataset. There are many reasons this may be true for lake water quality data, including regional differences in climate, geology, lake morphometry, land use, land cover, biota, etc. For use in such situations, Chipman et al. (2002) provide an algorithm to obtain models that may better describe this simple structure, by subsetting the data and then fitting separate submodels for each subset.

The objective of our study is to develop a lake classification scheme that is useful for the prediction of the response of algal biomass, as measured by chlorophyll a, to the variation in chemical and physical characteristics in Finnish lakes in order to support the implementation of the Water Framework Directive. More specifically, we are interested in comparing intercalibration types, an existing typology (sensu Peters, 1991), to a new classifier identified in this study for use as a framework for estimating a predictive model for chlorophyll a.

2. Methods

Because intercalibration types are a descriptive classifier, we wanted to determine if there were aggregations of these intercalibration types that could provide the basis for regressions that might improve predictive performance beyond that of either a regression using the entire dataset (the “null model”) or a “dummy variable” regression based on the existing intercalibration type classification scheme. In the case of aggregations of intercalibration types, there are

$$\binom{9}{1} + \binom{9}{2} + \binom{9}{3} + \binom{9}{4} = 9 + 36 + 84 + 126 = 255$$

ways in which we can combine these nine intercalibration type factors into two groups. We will leave the math to the reader for aggregations of three and more groups.
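The count of two-group aggregations can be checked with a few lines of code. The sketch below is only an illustration: the function name `two_group_aggregations` is hypothetical, and it assumes that the two groups are unordered and both non-empty, so that choosing a group and choosing its complement describe the same aggregation.

```python
from math import comb


def two_group_aggregations(n_levels: int) -> int:
    """Count unordered splits of n_levels categories into two non-empty groups."""
    # Sum over the size k of the smaller group. For odd n_levels this is
    # C(n,1) + ... + C(n,(n-1)/2), the binomial sum written in the text.
    total = sum(comb(n_levels, k) for k in range(1, (n_levels - 1) // 2 + 1))
    # For even n_levels, each half-size group is paired with its complement,
    # so those choices are counted once rather than twice.
    if n_levels % 2 == 0:
        total += comb(n_levels, n_levels // 2) // 2
    return total


# Nine lake types: the sum of C(9,k) for k = 1..4, which equals 2**8 - 1.
print(two_group_aggregations(9))
```

The closed form 2^(n-1) - 1 gives the same number and makes the growth rate plain: every additional category roughly doubles the number of candidate two-group schemes.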

We also wanted to determine if there was an alternative to this descriptive classifier (perhaps based on other metrics in addition to the water quality metrics color, depth and area) that could be used to partition the data into subsets for which the resulting terminal node regressions provided a better fit than the null model or the dummy regression. Expert judgment was used by the Finnish Environment Institute (SYKE) to develop the lake types to subset the data along the axes of three continuous variables, mean depth, surface area and color. Based on these three variables, we have 277 unique surface areas, mean depths and color values represented in our annual mean dataset (see data description in Section 2.3). For 277 unique potential values at which we may subset the data, there are

$$\binom{277}{1} + \binom{277}{2} + \cdots + \binom{277}{138}$$

ways in which we can use these values to create just two groups! It seems to be an act of considerable hubris to select one classification scheme over all other possibilities and then continue with nutrient criteria development (model fitting) as if this were the “one true scheme”, particularly if our choice is made based on self-similarity of within-group means and not self-similarity of predictive models (regressions). Faced with this existing classification scheme, we wanted to see if aggregations of these classes made sense for prediction, and if we could find a better classification scheme based on one or all five different continuous metrics available. BTREED was designed for this purpose, simultaneously performing the search for a classification scheme with regression model fitting in order to maximize log integrated likelihood of the total model in the Bayesian framework.

The search for a predictive classifier based on five continuous variables is an even larger problem than either aggregation of the existing intercalibration types, or subsetting based on the three continuous variables used by SYKE to develop the intercalibration types. An algorithm that efficiently explores the enormous space of these potential aggregations (or classification schemes) for use as the basis of stressor-response regression relationships seems necessary. To achieve the goals of this research we used Bayesian TREED models (Chipman et al., 2002), an enhancement to the Bayesian CART procedure (Chipman et al., 1998) that includes Bayesian CART as a model subclass. Bayesian CART and Bayesian TREED models are both enhancements to the more familiar Classification and Regression Tree (CART) procedure (Breiman et al., 1984; Clark and Pregibon, 1992).

2.1. Tree based methods

CART models are fit using a “greedy” algorithm to “grow” a tree, then “prune” it back to avoid over-fitting (Clark and Pregibon, 1992). These algorithms are greedy because they grow a tree by maximizing some fitting criterion at each sequentially chosen split (or subset) of the data (Chipman et al., 1998). This approach produces a sequence of trees, all of which are refinements of the previous tree in the sequence (except the first one). This “nesting”, such that the splits at a given level are only meaningful given the splits that have come at higher levels, severely limits the exploration of the set of all possible trees, or tree space, T. There are several proposed remedies to address this problem, including bagging (Breiman, 1996), bumping (Tibshirani and Knight, 1999), and boosting (Freund and Schapire, 1996), which all use the same greedy algorithm on subsets of the observations in a cross validation procedure to generate a variety of tree topologies. Instead of removing a subset of observations, Random Forests (Breiman, 2001) removes entire variables at random. Rather than using a cross validation scheme or another Bayesian CART algorithm (Dennison et al., 1998), we chose an enhanced approach to Bayesian CART consisting of a prior specification and stochastic search (Chipman et al., 1998) to explore a much richer set of candidate trees than the greedy algorithm.

2.2. Bayesian TREED models

Table 1
Geomorphological typology of Finnish lakes, Finnish Environment Institute (SYKE)

| Lake Type | Name | Details |
|---|---|---|
| I | Large, non-humic lakes | SA >4000 ha, color <30 |
| II | Large, humic lakes | SA >4000 ha, color >30 |
| III | Medium and small, non-humic lakes | SA 50-4000 ha, color <30 |
| IV | Medium area, humic deep lakes | SA 500-4000 ha, color 30-90, D >3 m |
| V | Small, humic, deep lakes | SA 50-500 ha, color 30-90, D >3 m |
| VI | Deep, very humic lakes | Color >90, D >3 m |
| VII | Shallow, non-humic lakes | Color <30, D <3 m |
| VIII | Shallow, humic lakes | Color 30-90, D <3 m |
| IX | Shallow, very humic lakes | Color >90, D <3 m |

Bayesian TREED models are Bayesian hierarchical models that select subsets of observations on partitions of the design matrix, $X$, for which linear models are estimated. Further, they identify subsets of $X$ for which linear model performance is improved as measured by predictive log integrated likelihood (LIL). Chipman et al. (2002) give the full details regarding Bayesian TREED models, while Lamon and Stow (2004) and Freeman et al. (in press) provide applications of the method using water quality data.

Specification of a Bayesian TREED model consists of two model components. First is a tree structure, $T$, with $b$ bottom nodes. Bottom nodes are also known as terminal nodes, leaf nodes or simply leaves. Second is a parameter vector $\theta = (\theta_1, \ldots, \theta_b)$, where $\theta_i$ is associated with the $i$th bottom node. If $x$ is in the $i$th terminal node of $T$, the conditional distribution of $Y$ given $x$ is $Y \mid x \sim f(y \mid x, \theta_i)$, where $f$ is a parametric family indexed by $\theta_i$. Here the parametric family is the linear regression, $Y \mid x \sim N(x'\beta, \sigma^2)$. Use of the linear model in each terminal node is accomplished by letting $\theta_i = (\beta_i, \sigma_i^2)$. It is important to note here that both $\beta_i$ and $\sigma_i$ may vary from node to node, which allows both global nonlinearity and global heteroscedasticity. The tree is fully specified by the pair $(\theta, T)$, and the Bayesian approach requires that we specify a prior $p(\theta, T)$. Because $\theta$ indexes a parametric model for each $T$, we can factor the prior by the product rule as $p(\theta, T) = p(\theta \mid T)\,p(T)$. Consequently, prior specification is done in two stages: first by specifying a prior on the tree space, $p(T)$, and then specifying a prior on the distribution at the bottom nodes, conditional on $T$, $p(\theta \mid T)$.
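To make the pair (tree, leaf parameters) concrete, the following minimal sketch represents a treed regression that splits on a single variable and carries one linear model per bottom node. The class names, cut point and coefficients are hypothetical and chosen only for illustration; they are not fitted values, and the handling of an observation that falls exactly on a cut point is an assumption.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class LeafModel:
    """Linear model (beta_i, sigma_i) attached to one bottom node."""
    intercept: float
    slopes: tuple  # one coefficient per regressor
    sigma: float   # leaf-specific error standard deviation

    def mean(self, x):
        return self.intercept + sum(b * v for b, v in zip(self.slopes, x))


@dataclass(frozen=True)
class TreedModel:
    """Tree T that partitions on one split variable at sorted cut points."""
    cuts: tuple    # e.g. depth thresholds
    leaves: tuple  # len(leaves) == len(cuts) + 1

    def leaf_for(self, split_value):
        # Values equal to a cut point go to the right-hand leaf (assumption).
        return self.leaves[bisect_right(self.cuts, split_value)]

    def predict(self, split_value, x):
        return self.leaf_for(split_value).mean(x)


# Hypothetical two-leaf tree: different slopes and sigma in each leaf.
model = TreedModel(
    cuts=(3.0,),
    leaves=(LeafModel(0.1, (0.6, 0.4), 0.3), LeafModel(-0.1, (0.1, 0.7), 0.4)),
)
shallow = model.predict(1.5, (0.5, -0.2))   # uses the first leaf
deep = model.predict(10.0, (0.5, -0.2))     # uses the second leaf
```

Because each leaf holds its own coefficients and error standard deviation, the same regressor values can yield different predictions depending on where the split variable places the observation, which is the sense in which the overall model is nonlinear and heteroscedastic.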

Markov chain Monte Carlo techniques are used to stochastically search for high posterior probability trees $T$. Specifically, a Metropolis-Hastings algorithm is used to simulate a Markov chain with limiting distribution $p(T \mid Y, X)$. Starting with an initial tree $T^0$, iteratively simulate the transitions from $T^i$ to $T^{i+1}$ by two steps:

1. Generate a candidate value $T^*$ with probability distribution $q(T^i, T^*)$.

2. Set $T^{i+1} = T^*$ with probability

$$\alpha(T^i, T^*) = \min\left\{ \frac{q(T^*, T^i)\, p(Y \mid X, T^*)\, p(T^*)}{q(T^i, T^*)\, p(Y \mid X, T^i)\, p(T^i)},\ 1 \right\} \qquad (1)$$

Else, set $T^{i+1} = T^i$. In Eq. (1), the integrated likelihood $p(Y \mid X, T)$ of a tree $T$ is given by

$$p(Y \mid X, T) = \int p(Y \mid X, \theta, T)\, p(\theta \mid T)\, d\theta = \prod_{i=1}^{b} \int \prod_{j=1}^{n_i} p(y_{ij} \mid x_{ij}, \theta_i)\, p(\theta_i)\, d\theta_i \qquad (2)$$

and $q(T, T^*)$ is the kernel which generates $T^*$ from $T$ by randomly choosing among four steps: GROW, PRUNE, CHANGE, and SWAP. In the GROW step we split a random bottom node (leaf) according to the probability listed below. The PRUNE step collapses a randomly chosen pair of “sibling” nodes to their parent node. The CHANGE step changes the existing splitting rule at a randomly chosen interior node. In the SWAP step, we swap the rule of a randomly chosen child/parent pair. Chipman et al. (2002) note that this algorithm gravitates rapidly to trees of high posterior probability.
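The accept/reject step of Eq. (1) can be sketched generically. The code below is not the BTREED software; it is a minimal illustration that works in log space for numerical stability and assumes the caller supplies two hypothetical callables: `propose`, which returns a candidate together with the forward and reverse log proposal probabilities, and `log_post`, which returns log p(Y | X, T) + log p(T) for a tree.

```python
import math
import random


def acceptance_probability(log_q_reverse, log_q_forward,
                           log_post_candidate, log_post_current):
    """Metropolis-Hastings acceptance probability of Eq. (1), in log space.

    log_q_reverse : log q(T*, T_i), probability of proposing T_i from T*
    log_q_forward : log q(T_i, T*), probability of proposing T* from T_i
    log_post_*    : log p(Y | X, T) + log p(T) for the respective tree
    """
    log_ratio = (log_q_reverse + log_post_candidate) - (
        log_q_forward + log_post_current)
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


def mh_step(current, propose, log_post, rng=random):
    """One transition T_i -> T_{i+1}: accept the candidate or keep the current tree."""
    candidate, log_q_forward, log_q_reverse = propose(current, rng)
    alpha = acceptance_probability(
        log_q_reverse, log_q_forward, log_post(candidate), log_post(current))
    return candidate if rng.random() < alpha else current
```

In a tree search, `propose` would pick one of the GROW, PRUNE, CHANGE or SWAP moves at random and report how likely that move and its reverse are; restarting the chain several times from a fresh initial tree is then a loop around repeated calls to `mh_step`.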

The TREED model’s tendency to grow is defined by the equation:

$$\Pr(\text{node splits} \mid \text{depth} = d) = \alpha (1 + d)^{-\beta} \qquad (3)$$

where $\alpha$ is the base probability of tree growth from splitting a node, and $\beta$ determines the rate at which the propensity to split decreases with increased tree size (Chipman et al., 2002). To assess how the choice of $\alpha$ and $\beta$ in this prior affects the model, we explored a range of $\alpha$ (0.5 and 0.95) and $\beta$ (0.5-2.0, step 0.5) values.
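As a small illustration of Eq. (3), the sketch below tabulates the prior split probability at node depths 0 to 3 for the alpha and beta values listed above. The function and variable names are hypothetical, and taking the root node to be at depth 0 is an assumption.

```python
def split_probability(depth: int, alpha: float, beta: float) -> float:
    """Prior probability that a node at the given depth splits, Eq. (3)."""
    return alpha * (1.0 + depth) ** (-beta)


ALPHAS = (0.5, 0.95)
BETAS = (0.5, 1.0, 1.5, 2.0)

# Map each (alpha, beta) pair to the split probabilities at depths 0..3.
grid = {
    (a, b): [round(split_probability(d, a, b), 4) for d in range(4)]
    for a in ALPHAS
    for b in BETAS
}

for (a, b), probs in grid.items():
    print(f"alpha={a:<4} beta={b:<3} -> {probs}")
```

Larger beta makes the probability fall off faster with depth, so the prior favors smaller trees; alpha sets how likely the root itself is to split.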

Using conjugate priors for the end node model parameters $\theta$ allows the integral in Eq. (2) to be solved analytically. We use the conjugate form $p(\beta, \sigma) = p(\beta \mid \sigma)\,p(\sigma)$ with:

$$\beta \mid \sigma \sim N\left(\bar{\beta},\ \sigma^2 A^{-1}\right), \qquad \sigma^2 \sim \frac{\nu\lambda}{\chi^2_\nu} \qquad (4)$$

where $\chi^2_\nu$ denotes a chi-squared random variable with $\nu$ degrees of freedom. This further reduces the specification problem to the choice of values for the hyperparameters $\nu$, $\lambda$, $\bar{\beta}$ and $A$. The choice of $\bar{\beta}$ and $A$ is done in an “automatic” fashion described by Chipman et al. (2002).

We used the data to choose $\lambda$. We let $s$ be the classical unbiased estimator of $\sigma$ based on the full linear model fit to all the standardized training data. We wish to choose $\lambda$ to express the idea that for terminal node regressions we expect the error standard deviation to be smaller than $s$, but maybe not too much smaller. Since we allow for a different $\sigma$ in each terminal node, we also want to capture the prior belief that $\sigma$ could be bigger than $s$. In this situation Chipman et al. (2002) suggest choosing a quantile $q$ such that $\Pr(\sigma < s) = q$, then using the implied value of $\lambda$. Chipman et al. (2002) report on the use of $q = 0.75$ and $q = 0.95$, which imply $\lambda = 0.404 s^2$ and $\lambda = 0.1173 s^2$, respectively. Here we choose $q$ to be 0.95. We choose $\nu = 3$, giving the prior on $\sigma$ the same weight as 3 observations, which is small in comparison to the total number of observations used.
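The implied value of lambda follows from the prior on the error variance: with the variance distributed as (nu times lambda) divided by a chi-squared variable with nu degrees of freedom, the requirement that the error standard deviation is below s with probability q is equivalent to lambda / s^2 being the (1 - q) quantile of that chi-squared distribution divided by nu. The sketch below checks the two multipliers quoted above for nu = 3. It uses only the standard library: the closed-form chi-squared CDF for 3 degrees of freedom and a simple bisection, both of which are implementation choices made here, not part of the original software.

```python
import math

NU = 3  # degrees of freedom used in the text


def chi2_cdf_3df(x: float) -> float:
    """CDF of a chi-squared variable with 3 degrees of freedom (closed form)."""
    if x <= 0.0:
        return 0.0
    return (math.erf(math.sqrt(x / 2.0))
            - math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


def chi2_quantile_3df(p: float) -> float:
    """Invert the CDF by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_3df(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lambda_over_s2(q: float) -> float:
    """Multiplier m such that lambda = m * s**2 gives Pr(sigma < s) = q."""
    # Pr(nu*lambda/chi2 < s^2) = Pr(chi2 > nu*lambda/s^2) = q
    return chi2_quantile_3df(1.0 - q) / NU


for q in (0.75, 0.95):
    print(q, round(lambda_over_s2(q), 4))
```

A larger q pushes more prior mass toward leaf error standard deviations below the full-data estimate s, which corresponds to a smaller lambda.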

Chipman et al. (2002) note that most methods that flexibly fit data typically have “parameters” that must be chosen. To fit a neural network model, we must specify a decay parameter. Use of kernel smoothers requires choice of a window width. These are often related to penalties on model size to avoid fitting noise (overfitting). Overfitting results in too large a model, which will fit training data well and test data poorly. To some extent, both our prior on $T$ indexed by $\alpha$ and $\beta$, and the choice of conjugate prior are related to the issue of penalizing overly large models.

2.3. Data

National water quality monitoring of lakes in Finland began in 1965 after the implementation of the Water Act in 1962. The pollution monitoring network was based on the polluter pays principle. The current database of the Finnish Environment Institute contains water quality data from about 58,000 sampling sites. The sampling strategy and analysis methods have been described by Niemi et al. (2001). In January 2000, the Regional Environment Centres and the Finnish Environment Institute put into operation the Eurowaternet monitoring network for Finnish inland waters (Niemi et al., 2001), which contains 253 lake sites from the national monitoring network based on the guidelines presented by the European Environment Agency (Nixon et al., 1998). Eurowaternet was designed to produce objective, statistically reliable and comparable information to allow the European Commission, member states and the general public to judge the effectiveness of policy and the needs for policy development. Information is required on the status of Europe’s inland water resources, quality and quantity and how that relates and responds to pressures on the environment (cause-effect relationships, Nixon et al., 1998).

We used 551 observations collected during the productive season from 150 Finnish lakes from 1990 to 2001. These lakes belong mainly to the Eurowaternet but also contain some lakes from the national monitoring network. These lakes have been classified by Lake Type according to morphological characteristics (depth and surface area) and color (European Commission, 2004 and Table 1). The response variable for our models is chlorophyll a (mg L$^{-1}$), a surrogate for algal biomass. The tree portion of the Bayesian TREED model developed here includes the variables altitude (m), latitude (decimal degrees), surface area (km$^2$), mean depth (m), and color (mg Pt L$^{-1}$). Predictor variables used in the endnode models were the nutrient concentrations total nitrogen (TN, mg L$^{-1}$) and total phosphorus (TP, mg L$^{-1}$). We have used TN and TP because they are generally available measures, and are nearly always correlated with the particular nitrogen and phosphorus species that influence algal density more directly. We log (base e) transformed TP, TN and Chla for use in fitting the endnode regressions, then took the annual averages by lake, providing 280 growing season lake-wide averages (one per lake-year). Predictor variables used in the tree portion $T$ of the model were not transformed because the splitting process is invariant to monotone transformations of the tree variables.
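The order of operations in the data preparation matters: values are log transformed first and then averaged, which gives the mean of the logs for each lake-year rather than the log of the mean. A minimal sketch of that step follows; the function name and the (lake, year, value) record layout are assumptions made for illustration, and the sample records are invented.

```python
import math
from collections import defaultdict


def lake_year_log_means(samples):
    """Average ln(value) within each (lake, year) group.

    samples: iterable of (lake, year, value) records with value > 0.
    Returns a dict mapping (lake, year) to the mean of the natural logs.
    """
    totals = defaultdict(lambda: [0.0, 0])  # running sum of logs, count
    for lake, year, value in samples:
        acc = totals[(lake, year)]
        acc[0] += math.log(value)
        acc[1] += 1
    return {key: total / n for key, (total, n) in totals.items()}


# Hypothetical growing-season samples, for illustration only.
records = [
    ("Lake A", 1990, 10.0),
    ("Lake A", 1990, 40.0),
    ("Lake A", 1991, 25.0),
    ("Lake B", 1990, 5.0),
]
means = lake_year_log_means(records)
```

For the two 1990 samples from the hypothetical Lake A, the result is ln(20), the log of the geometric mean of 10 and 40, not ln(25), the log of their arithmetic mean.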

2.4. Modeling approach

We wanted to determine if there were other metrics that could be used to partition the data into subsets or aggregations of the intercalibration types for which the resulting endnode regressions provided a better fit than regressions using the entire dataset (null model) or those based on the existing classification scheme (dummy regression using intercalibration type factor). For this purpose we fit two separate TREED models, along with the full linear regression using all the data for comparison purposes. For TREED Model 1, we allowed the five tree variables described above into the tree portion, and used the MCMC procedure to search the model space using 20 restarts of the algorithm, with 5000 iterations per restart. We then fit TREED Model 2, allowing only the Lake Type variable described above into the tree portion of the model, and ran the same number of restarts and iterations per restart in this configuration. BTREED models were fit using free software (http://gsbwww.uchicago.edu/fac/robert.mcculloch/research/code/CART/index.html) based on Chipman et al. (2002).

Fig. 1. Model 1 dendrogram and partial residuals plots. Splits are on depth and leaf models are for shallow, medium and deep lakes.

[Figure not reproduced. Annotations: lnChla ~ Depth; MSE = 0.1315, MAD = 0.2158, LIL = 492.38. Leaf models: depth <2.6 m, 0.074 + 0.646 lnTN + 0.418 lnTP (n = 16); depth 2.6-16.3 m, -0.002 + 0.128 lnTN + 0.651 lnTP (n = 240); depth >16.3 m, -0.129 + 0.646 lnTN + 0.027 lnTP (n = 24). Panels: partial residuals against logTN and logTP for each leaf.]

Fig. 2. Model 2 dendrogram and partial residuals plots. Tree variables restricted to Lake Type, showing aggregation of the nine levels resulting from this classifier into two groups.

[Figure not reproduced. Annotations: lnChla ~ Lake Type; child nodes Types 3, 5, 8, 9 and Types 1, 2, 4, 6, 7; MSE = 0.1320, MAD = 0.2570, LIL = 489.04. Leaf models: -0.008 + 0.075 lnTN + 0.661 lnTP (n = 237) and 0.043 + 0.589 lnTN + 0.494 lnTP (n = 43). Panels: partial residuals against logTN and logTP for each leaf.]

We investigate the 3 leaves (linear models) of Model 1 using Bayesian model averaging (BMA) to provide estimates of regression coefficients that account for model specification uncertainty. The parameters of linear models depend in part on the model specification, i.e. which predictor variables are included in the model. BMA accounts for model uncertainty in variable selection by averaging over the best models in the model class. Parameter estimates in BMA are a weighted average of estimates, using posterior model probability as the weights. We used the bicreg() function of the BMA package (Raftery et al., 2005) in the R statistical environment (R Development Core Team, 2004) to obtain the BMA estimates of the regression coefficients. For a given dependent variable and set of candidate independent variables, the software finds models in Occam’s window and their posterior probabilities, and the posterior mean and standard deviation of the regression coefficients (Raftery, 1995). Occam’s window is a generalization of Occam’s razor, a maxim for choosing among competing systems of hypotheses (models).
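The weighted average behind the BMA point estimates can be illustrated with the leaf 3 values reported in Table 2. The sketch below is not a reimplementation of bicreg(); it only applies the weighting rule described above, and it assumes that a model which excludes a variable contributes a coefficient of zero for that variable. The function name is hypothetical.

```python
def bma_coefficient(models):
    """Posterior-probability-weighted average of one coefficient.

    models: list of (posterior_model_probability, coefficient) pairs, with
    coefficient set to None when the model excludes the variable.
    """
    return sum(p * (c if c is not None else 0.0) for p, c in models)


# Leaf 3 (deep lakes), values as reported in Table 2:
# "Both" model: posterior probability 0.178, TP 0.0254, TN 0.6478
# "TN only" model: posterior probability 0.822, TN 0.6668 (TP excluded)
tp_bma = bma_coefficient([(0.178, 0.0254), (0.822, None)])
tn_bma = bma_coefficient([(0.178, 0.6478), (0.822, 0.6668)])

print(round(tp_bma, 4), round(tn_bma, 4))
```

Rounded to four decimals, these weighted averages agree with the BMA row for leaf 3 in Table 2 (0.0045 for TP and 0.6634 for TN), which shows how a coefficient shrinks toward zero when most of the posterior weight sits on a model that omits the variable.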

Bayes factors may be compared among competing models M1 and M2 (Jeffreys, 1961; Kass and Raftery, 1995), in a form of hypothesis testing. They provide a means of determining which model is favored as a representation of the data. The Bayes factor is a likelihood ratio if prior model probabilities are assumed equal. Jeffreys (1961) provides guidance for interpretation of results in terms of evidence in favor of a model relative to a competing model. The log of the ratio of two model likelihoods, $\log(L_{M1}/L_{M2})$, is equal to the difference in the log likelihoods, $\log L_{M1} - \log L_{M2}$. Differences in log10 likelihood from 0 to 0.5 are termed insignificant, from 0.5 to 1 substantial, from 1 to 2 strong, and greater than 2 decisive evidence for a model M1 over competing model M2.
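The interpretation scale above amounts to a lookup on the difference in log10 likelihoods. A minimal sketch is given below; the function name is hypothetical, and the treatment of values that fall exactly on a boundary (0.5 and 1 assigned to the higher category, 2 assigned to "strong") is an assumption, since the text does not say which side the endpoints belong to.

```python
def jeffreys_category(log10_bayes_factor: float):
    """Classify evidence from log10(L_M1 / L_M2) = log10 L_M1 - log10 L_M2.

    Returns (category, favored_model). A negative difference is read as
    evidence for M2 with the same strength scale.
    """
    favored = "M1" if log10_bayes_factor >= 0 else "M2"
    x = abs(log10_bayes_factor)
    if x < 0.5:
        category = "insignificant"
    elif x < 1.0:
        category = "substantial"
    elif x <= 2.0:
        category = "strong"
    else:
        category = "decisive"
    return category, favored


# Illustrative differences in log10 likelihood (not values from the study).
for diff in (0.3, 0.7, 1.5, 2.5, -1.2):
    print(diff, jeffreys_category(diff))
```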

3. Results

The log integrated likelihood (LIL) of the full regression (null model) is 459.8 and much less than that of Model 1 (Fig. 1) or Model 2 (Fig. 2), which indicates decisive evidence that regressions based on either classification system are better supported by the data than a multiple regression using all the data. The full regression (n = 280) produces:

$$\ln \mathrm{Chla} = 0.1499\, \ln \mathrm{TN} + 0.6592\, \ln \mathrm{TP} + \varepsilon \qquad (5)$$

with a mean squared error (MSE) of 0.1590 and a median absolute deviation (MAD) of 0.2603. Note that the intercept term is estimated to be exactly zero, the mean of the standardized lnChla response across the entire dataset. Often in statistical inference, the “null model” consists of only an intercept, which corresponds to the mean of the response variable, but we have not considered that naïve hypothesis (model) here, as it leads to uninteresting forecasts.

E.C. Lamon III et al. / Environmental Modelling & Software 23 (2008) 938–947

Table 2
Comparison of regression coefficients determined by the BTREED procedure, BMA and by linear regression with both lnTP and lnTN as simultaneous regressors in a multiple regression (Both), and individually (TP only and TN only) in a simple regression

Leaf      Source    TP coefficient     TN coefficient     Post. model prob.   R2
1         BTREED    0.4176             0.6455
          BMA       0.3348             0.72617
          Both      0.4162 (0.1786)    0.6476 (0.1858)    0.803               0.9434
          TP only   0.9901 (0.0928)    –                  –                   0.8905
          TN only   –                  1.0467 (0.0826)    0.197               0.9198
2         BTREED    0.6506             0.1278
          BMA       0.6728             0.0932
          Both      0.6507 (0.0386)    0.1278 (0.0467)    0.729               0.7909
          TP only   0.7322 (0.0249)    –                  0.271               0.7843
          TN only   –                  0.7350 (0.0440)    –                   0.5401
3         BTREED    0.0270             0.6459
          BMA       0.0045             0.6634
          Both      0.0254 (0.0806)    0.6478 (0.0760)    0.178               0.9085
          TP only   0.5711 (0.1011)    –                  –                   0.5918
          TN only   –                  0.6668 (0.0452)    0.822               0.9080
All data  BTREED    0.6592             0.1499
          BMA       0.6845             0.1408
          Both      0.6800 (0.0351)    0.1479 (0.0432)    0.952               0.8097
          TP only   0.7721 (0.0230)    –                  0.048               0.8016
          TN only   –                  0.7876 (0.0426)    –                   0.5519

A dummy variable regression based on the intercalibration type classification scheme resulted in a regression model consisting of 28 parameters (9 types times 3 regression coefficients = 27, plus the error standard deviation). Fitting the dummy variable regression to our data resulted in a mean squared error (MSE) of 0.1345 and a median absolute deviation (MAD) of 0.2327. However, with only two observations for Lake Type 7, we cannot determine unique estimates of all three regression coefficients for Type 7 lakes, and therefore will have difficulty in predicting Type 7 lakes using this model.
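The parameter count and the Type 7 limitation noted for the dummy variable regression can be checked with a short sketch. Reading the three coefficients per type as an intercept plus slopes on lnTN and lnTP is an assumption, and the function names are illustrative only.

```python
N_LAKE_TYPES = 9
# Assumption: intercept, lnTN slope and lnTP slope for each lake type.
COEFS_PER_TYPE = 3


def n_parameters(n_types=N_LAKE_TYPES, coefs_per_type=COEFS_PER_TYPE):
    """Regression coefficients for every type plus one error standard deviation."""
    return n_types * coefs_per_type + 1


def can_be_identified(n_obs, n_coefs=COEFS_PER_TYPE):
    """Necessary (not sufficient) condition for unique least-squares estimates.

    A design matrix with n_obs rows has rank at most n_obs, so fewer
    observations than coefficients cannot pin down every coefficient.
    """
    return n_obs >= n_coefs


print(n_parameters())          # 9 * 3 + 1
print(can_be_identified(2))    # Lake Type 7 has only two observations
```

The check is only a counting argument: even a type with three or more observations can fail to give unique estimates if its predictors are exactly collinear.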

The dendrograms for Model 1 and Model 2, along with partial residuals plots (Weisberg, 1985) for the end node models, appear as Figs. 1 and 2, respectively. Model 1 used only depth of the five potential tree variables to partition the data into three subsets, shallow (<2.6 m), medium (2.6 m < depth < 16.3 m) and deep (>16.3 m) lakes (Fig. 1), with an MSE of 0.1368 and MAD of 0.2166. Model 2 aggregated the nine lake types into two partitions, corresponding to lake types 3, 5, 8, 9 in the left child and 1, 2, 4, 6, 7 in the right child node (Fig. 2), with an MSE of 0.1370 and a MAD of 0.2461. The difference in log integrated likelihood (LIL) between Model 1 and Model 2 is greater than 3 (M1 − M2 = 492.38 − 489.04 = 3.34), indicative of decisive evidence in the data favoring Model 1 over Model 2 (Kass and Raftery, 1995). Mean squared errors of the two models do not provide much guidance for model selection, though the MSE of Model 1 is slightly less than that of Model 2, while median absolute deviation (MAD) differs in the second decimal place between the two, also favoring Model 1.
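The LIL comparisons reported above can be reproduced with the following minimal sketch, which maps a difference in log integrated likelihood onto the evidence categories quoted earlier in the text. Applying the log10 thresholds directly to the reported LIL values follows the paper's usage; the handling of values that fall exactly on a threshold is an assumption.

```python
# LIL values as reported in the Results section.
LIL = {"full regression": 459.8, "Model 1": 492.38, "Model 2": 489.04}


def evidence_category(lil_difference):
    """Evidence category for the better-supported model.

    Thresholds follow the scale quoted in the text; boundary values are
    assigned to the lower category (an assumption).
    """
    d = abs(lil_difference)
    if d <= 0.5:
        return "insignificant"
    if d <= 1.0:
        return "substantial"
    if d <= 2.0:
        return "strong"
    return "decisive"


for better, worse in [("Model 1", "Model 2"),
                      ("Model 1", "full regression"),
                      ("Model 2", "full regression")]:
    diff = LIL[better] - LIL[worse]
    print(f"{better} vs {worse}: {diff:.2f} ({evidence_category(diff)})")
```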

Collinear predictors typically lead to large variances for estimated regression coefficients. The presence of collinear predictors complicates the interpretability of regression coefficients in linear regression, and the terminal node regressions of the BTREED procedure are no exception. For this reason, we fit separate regressions (one for each predictor lnTP and lnTN) on lnChla for each leaf of Model 1 to determine the strength of linear association between the predictors and the response. The results of this exercise appear in Table 2, along with parameter estimates jointly determined using both predictors in a multiple regression model, parameter estimates estimated using Bayesian model averaging (BMA), and the parameter estimates shown in Fig. 1 determined by the BTREED procedure.
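To illustrate why collinearity inflates coefficient variances, the sketch below computes the variance inflation factor for a regression with two predictors, which reduces to 1/(1 − r²) where r is the correlation between them. This diagnostic is not part of the paper's analysis, and the lnTP and lnTN values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def variance_inflation_factor(r):
    """VIF for either predictor in a two-predictor linear regression."""
    return 1.0 / (1.0 - r * r)


# Hypothetical lake-year values, chosen only to be strongly correlated.
ln_tp = [2.0, 2.5, 3.0, 3.5, 4.0]
ln_tn = [5.9, 6.1, 6.6, 6.8, 7.3]

r = pearson_r(ln_tp, ln_tn)
print(f"r = {r:.3f}, VIF = {variance_inflation_factor(r):.1f}")
```

A VIF of 1 means the predictors are uncorrelated; as r approaches 1 the coefficient variances grow without bound, which is why the single-predictor fits in Table 2 are easier to interpret.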

Based on these results we see that the largest coefficient estimated by the BTREED procedure for each leaf corresponds to the best single separately determined linear predictor of lnChla (as measured by R2), but that in the presence of collinearity the other BTREED coefficient is not a reliable measure of linear association between the corresponding predictor and lnChla. For example, in leaf 1 (Table 2), the “TN only” coefficient (1.0467) is larger than the “TP only” coefficient (0.9901) and has a larger R2, making TN the best separately determined predictor in leaf 1 lakes. The largest coefficient estimated by BTREED for this leaf is likewise on the TN predictor (0.6463). We also note that the BMA and BTREED parameter estimates are similar. Table 2 results indicate that a model containing both lnTN and lnTP as predictors is preferred over a model with only lnTN (4:1 odds) for the shallow lakes of leaf 1, while a model using only lnTP was assigned zero posterior probability. In the medium depth lakes of leaf 2, a model containing both lnTN and lnTP as predictors is preferred over a model with only lnTP (2.7:1 odds), while the model with only lnTN was assigned zero posterior probability. In both leaf 1 and leaf 2, a two-predictor model is to be preferred over a model with only a single predictor. In the deep (leaf 3) lakes, however, a model using only lnTN as a predictor is more likely than a model using both predictors (4.6:1 odds). Separately determined regression coefficients for lnTN and lnTP are statistically different from each other only for deep (leaf 3) lakes.
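The odds quoted in this paragraph follow directly from the posterior model probabilities in Table 2, as the following small sketch shows. The dictionary layout and function name are illustrative only.

```python
# Posterior model probabilities from Table 2: (preferred model, competitor).
POSTERIOR_PROBS = {
    "leaf 1 (shallow): both vs TN only": (0.803, 0.197),
    "leaf 2 (medium): both vs TP only": (0.729, 0.271),
    "leaf 3 (deep): TN only vs both": (0.822, 0.178),
}


def posterior_odds(p_preferred, p_competitor):
    """Ratio of posterior model probabilities."""
    return p_preferred / p_competitor


for comparison, (p_a, p_b) in POSTERIOR_PROBS.items():
    print(f"{comparison}: {posterior_odds(p_a, p_b):.1f}:1")
```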

Positive, separately determined (Table 2) slopes of nearly the same magnitude on both lnTN and lnTP for the shallow and medium depth lakes regression (leaves 1 and 2, Fig. 1) indicate that either may be used to predict lnChla nearly as well as the other. In the deep lakes of leaf 3 the significantly larger magnitude of the positive, separately determined (Table 2) slope on lnTN relative to that for lnTP is indicative that lnTN is a better predictor of lnChla (Fig. 1). One may also notice in Fig. 1 that the lnChla values partitioned by the model into leaf 1 are generally above the mean (0 on the standardized lnChla scale on the vertical axes), while those for deep lakes in leaf 3 are generally below the mean, indicating that shallow lakes are more productive than deep lakes.

We used various biological and physical factors to gain insight into differences between the depth classes defined by Model 1 (Fig. 3). Fig. 3(a) indicates a steady, significant progressive increase in TN:TP (molar) from shallow to deep lakes. It is noteworthy that all molar TN:TP calculated for annual lake average TN and TP for these Finnish lakes greatly exceed the Redfield ratio of 16:1. A decreasing gradient in productivity, as measured by lnChla concentrations, is apparent in Fig. 3(b) from shallow to deep lakes, as mentioned in the discussion of Fig. 1 above. Fig. 3(c) shows lnTN is highest in leaf 1 but the medians are not significantly different between leaf 2 and leaf 3. The decreasing gradient in lnTP from leaf 1 to leaf 3 is evident in Fig. 3(d), just as an increasing gradient is evident for lnVolume (Fig. 3(e)).
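The molar TN:TP ratio compared with the Redfield ratio above can be obtained from mass concentrations as in this minimal sketch. The atomic masses are standard values; the example concentrations are hypothetical and are assumed to be in the same mass units (for instance μg/L).

```python
ATOMIC_MASS_N = 14.007   # g/mol
ATOMIC_MASS_P = 30.974   # g/mol
REDFIELD_N_TO_P = 16.0   # molar N:P


def molar_tn_tp(tn_mass_conc, tp_mass_conc):
    """Molar TN:TP from TN and TP given in identical mass-concentration units."""
    return (tn_mass_conc / ATOMIC_MASS_N) / (tp_mass_conc / ATOMIC_MASS_P)


# Hypothetical annual lake averages (not taken from the paper's data).
tn_ug_l, tp_ug_l = 500.0, 20.0
ratio = molar_tn_tp(tn_ug_l, tp_ug_l)
print(f"molar TN:TP = {ratio:.1f}, exceeds Redfield: {ratio > REDFIELD_N_TO_P}")
```

Because nitrogen is lighter than phosphorus, a molar ratio of 16:1 corresponds to a mass ratio of only about 7:1, so mass ratios should not be compared with the Redfield value directly.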

Fig. 4 illustrates how the variables used in the tree portion T of the BTREED models compare to depth in regard to the partitions defined by Model 1. Median altitude (Fig. 4(a)) and latitude (Fig. 4(b)) do not differ significantly between leaves. However, an increasing gradient in median ln(Surface Area) with increasing depth is apparent (Fig. 4(d)), and a decreasing gradient on color with increasing depth is also discernible (Fig. 4(f)). However, only depth provides this particular disjoint partition of the data, apparent in Fig. 4(e) in the lack of overlapping depth values between leaves.
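The depth partition of Model 1 can be written as a simple lookup, sketched below with the split points from the text and the BTREED slope estimates from Table 2. Leaf intercepts are not reported in this section and are therefore omitted, and assigning a lake whose mean depth equals a split point to the medium class is an assumption.

```python
from typing import NamedTuple


class Leaf(NamedTuple):
    name: str
    tp_coef: float  # BTREED coefficient on lnTP (Table 2)
    tn_coef: float  # BTREED coefficient on lnTN (Table 2)


SHALLOW_MAX_M = 2.6
DEEP_MIN_M = 16.3

SHALLOW = Leaf("leaf 1 (shallow)", 0.4176, 0.6455)
MEDIUM = Leaf("leaf 2 (medium)", 0.6506, 0.1278)
DEEP = Leaf("leaf 3 (deep)", 0.0270, 0.6459)


def assign_leaf(mean_depth_m):
    """Return the Model 1 leaf for a lake with the given mean depth in metres."""
    if mean_depth_m < SHALLOW_MAX_M:
        return SHALLOW
    if mean_depth_m > DEEP_MIN_M:
        return DEEP
    return MEDIUM  # assumption: split-point values fall in the medium class


for depth in (1.5, 7.0, 20.0):
    leaf = assign_leaf(depth)
    print(f"{depth:5.1f} m -> {leaf.name}: TP {leaf.tp_coef}, TN {leaf.tn_coef}")
```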

4. Discussion

We present a way to develop quantitative stressor–response models for the establishment of nutrient criteria that is some compromise in approach along the spectrum from “all lakes are the same” to “all lakes are unique.” The Finnish Environment Institute (SYKE) has proposed using the intercalibration types (or Lake type) to aggregate lakes into groups for developing these stressor–response relationships. Unfortunately, while this “descriptive” classifier subsets Finnish lakes into groups that are self similar in lake depth, surface area and color, our results show that for developing a model linking nutrients to chlorophyll a, combining these 9 lake types into 2 groups improves predictive accuracy. Further, neglecting the intercalibration types altogether in favor of 3 distinct mean depth based classes improves predictive performance further still. Because the TREED procedure searches all linear regression models associated with the entire space of possible subsets produced by splitting rules along the axes of all the tree variables, it is well suited to the task at hand.

Fig. 3. Notched box and whisker plots of (a) N to P ratio, (b) lnChla, (c) lnTN, (d) lnTP, (e) lnVolume by leaf of Model 1. None of these variables were used in the tree portion of Model 1.

Fig. 4. Notched box and whisker plots of (a) Altitude, (b) Latitude, (c) Lake Type, (d) lnSurface Area, (e) Depth, and (f) Color by leaf of TREED Model 1. All these variables except Lake Type were used in the tree portion of Model 1, while Lake Type was the only variable used in the tree portion of Model 2.

It is instructive to use the model to explore the predicted probability of exceeding a chlorophyll criterion. Figs. 5 and 6 are contour plots of the 90th percentile values from our predictive distribution for Finnish lakes associated with Model 1 and Model 2, respectively. The Finnish Environment Institute has established chlorophyll a criteria for lake water quality classification in Finland, resulting in five separate quality classes (http://www.environment.fi/default.asp?node=14912&lan=en). Contours in Figs. 5 and 6 are labeled by the threshold values associated with these five classes, with the units of chlorophyll concentration (μg L−1) associated with each class listed in the captions. The plots may be used as nomographs to determine the combination of nitrogen and phosphorus concentrations that exceed the standard shown on the contour 10% of the time.
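The link between a 90th percentile contour and a 10% exceedance probability can be illustrated with a simplified sketch. It assumes that the predictive distribution of lnChla at a given nutrient combination is normal on the natural-log scale of μg L−1, with hypothetical mean and standard deviation; the paper's own predictive distribution comes from the fitted BTREED model on standardized variables and is not reproduced here.

```python
from math import exp, log
from statistics import NormalDist

# Class thresholds (chlorophyll a, ug/L) as listed in the figure captions.
THRESHOLDS_UG_L = {"Excellent": 4.0, "Good": 10.0,
                   "Satisfactory": 20.0, "Poor": 50.0}


def exceedance_probability(mu_ln, sigma_ln, threshold_ug_l):
    """P(Chla > threshold) under an assumed normal predictive law for lnChla."""
    return 1.0 - NormalDist(mu_ln, sigma_ln).cdf(log(threshold_ug_l))


def percentile_90(mu_ln, sigma_ln):
    """90th percentile of chlorophyll a (ug/L) under the same assumption."""
    return exp(NormalDist(mu_ln, sigma_ln).inv_cdf(0.90))


# Hypothetical predictive mean and standard deviation of lnChla.
mu, sigma = log(8.0), 0.4
print(f"90th percentile: {percentile_90(mu, sigma):.1f} ug/L")
for label, threshold in THRESHOLDS_UG_L.items():
    p = exceedance_probability(mu, sigma, threshold)
    print(f"P(Chla > {label} = {threshold:g}) = {p:.3f}")
```

A nutrient combination lies on a given contour when its 90th percentile equals that contour's threshold, which is the same as saying the threshold is exceeded with probability 0.10.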

The inference from Model 2 is that although the 9 individual lake (intercalibration) types may differ along the morphometric (surface area, depth) and chemical (color) axes, these differences do not translate into differences that would require 9 individual predictive models for lnChla. Aggregation of the lake types into the groups indicated by Model 2 provides a more parsimonious classifier than the “Lake Type” classifier, and as a consequence, predictive models based on this aggregation may outperform models based on the nine levels of the Lake Type classifier. If the goal is simply to classify lakes into subsets along the axes of morphometric or chemical measures such that the subset means differ, the result may be subsets that are sufficiently descriptive of differences in the existing data, but provide little useful information to guide development of a predictive model. The fact that the “Lake Type” variable is derived by an a priori, descriptive approach to classification based on these three metrics leads us to a predictive model (Model 2) that is clearly inferior in terms of predictive ability (as measured by differences in LIL, MSE and MAD) and insight (see Section 3 and discussion above) compared to a model in which subsets and predictive models are determined simultaneously (Model 1).

The reasons to develop a classification scheme for lakes may be descriptive or predictive (or both). The more a priori decisions we make regarding model structure, the less we allow for the uncertainty regarding that model structure to be propagated through to the credible intervals on the quality variable of interest (prediction uncertainty) (Lamon and Stow, 2004). Any aggregation of these subsets for the purpose of developing a predictive model suffers by virtue of the information lost in the process of a priori classification. The split points along the three axes that determined the 9 subsets (intercalibration types) were chosen a priori, and were independent of the estimation of the predictive model. This may be satisfactory if the goal is description. However, to separate lakes into units for the purpose of predicting eutrophication endpoints, we may be less interested in subsets with a homogeneous description (mean value of a water quality measure), and more interested in subsets within which a homogeneous model structure applies. The use of the present procedure provides the dual benefits of improved predictive performance and a richer description. BTREED has been shown to provide enhanced predictive power in the context of nutrient criteria development, and should be useful to anyone wishing to identify subsets of multivariate observational datasets in which simple, well understood models (regressions) fit the data well.

Fig. 5. The 90th percentile from model predictive distribution for Shallow, Medium and Deep Lakes associated with Model 1. Contours denote a 10% probability of exceeding the chlorophyll a concentration classification thresholds indicated by their labels. Excellent = 4 μg L−1, Good = 10 μg L−1, Satisfactory = 20 μg L−1 and Poor = 50 μg L−1.
[Panels: Shallow Lakes, Medium Lakes, Deep Lakes; axes: TP (μg/L) and TN (μg/L).]

Fig. 6. The 90th percentile from model predictive distribution for Lake Types 3, 5, 8, 9, or Lake Types 1, 2, 4, 6, 7 associated with Model 2. Contours denote a 10% probability of exceeding the chlorophyll a concentration classification thresholds indicated by their labels. Excellent = 4 μg L−1, Good = 10 μg L−1, Satisfactory = 20 μg L−1 and Poor = 50 μg L−1.
[Panels: Lake Types 3, 5, 8, 9; Lake Types 1, 2, 4, 6, 7; axes: TP (μg/L) and TN (μg/L).]

Acknowledgements

ECL is grateful for the productive, collegial working atmosphere in the Water Quality Modeling Group at the Nicholas School of the Environment and Earth Sciences at Duke University during the production of this paper. We wish to thank George Arhonditsis, Song Qian and Craig Stow for thoughtful reviews of an earlier version of this manuscript. Mistakes of commission or omission are solely our own. This research has been supported in part by a grant from the US Environmental Protection Agency’s Science to Achieve Results (STAR) program. Although the research described in the article has been funded in part by the US Environmental Protection Agency’s STAR program through grant number RD-83088701-0, it has not been subjected to any EPA review and therefore does not necessarily reflect the views of the Agency, and no official endorsement should be inferred.

References

Breiman, L., 1996. Bagging predictors. Mach. Learn. 24, 123–140.

Breiman, L., 2001. Random forests. Mach. Learn. 45, 5–32.

Breiman, L., Friedman, J., Olshen, R., Stone, C.J., 1984. Classification and Regression Trees. Wadsworth International Group, Belmont, CA.

Chipman, H.A., George, E.I., McCulloch, R.E., 1998. Bayesian CART model search. J. Am. Statist. Assoc. 93, 935–948.

Chipman, H., George, E.I., McCulloch, R.E., 2002. Bayesian treed models. Mach. Learn. 48, 299–320.

Clark, L.A., Pregibon, D., 1992. Tree-based models. In: Chambers, J.M., Hastie, T.J. (Eds.), Statistical Models in S. Wadsworth and Brooks/Cole Advanced Books and Software, Pacific Grove, CA.

Dennison, D., Mallick, B., Smith, A.F.M., 1998. A Bayesian CART algorithm. Biometrika 85, 363–377.

European Commission, 2000. Water Framework Directive (Directive 2000/60/EC). Official Journal of the European Communities L327, 1–72. http://europa.eu.int/comm/environment/water/water-framework/index_en.html.

European Commission, 2004. Overview of common intercalibration types and guidelines for the selection of intercalibration sites. Ecostat Working Group 2.A, Brussels. http://forum.europa.eu.int/Members/irc/env/wfd/library/WorkingGroups/NewWG2a-EcologicalStatus.

Freeman, A., Lamon, E.C., Stow, C.A. Regional nutrient and chlorophyll-a relationships in lakes and reservoirs: a Bayesian TREED model approach. Ecol. Model., in press.

Freund, Y., Schapire, R., 1996. Experiments with a new boosting algorithm. In: Machine Learning: Proceedings of the Thirteenth International Conference, pp. 148–156.

Jeffreys, H., 1961. Theory of Probability. Oxford University Press, London.

Kass, R.E., Raftery, A.E., 1995. Bayes factors. J. Am. Statist. Assoc. 90, 773–795.

Lamon, E.C., Stow, C.A., 2004. Bayesian methods for regional-scale eutrophication models. Water Res. 38 (11), 2764–2774.

Malve, O., 2007. Water quality prediction for river basin management. Doctoral dissertation, Helsinki University of Technology. Espoo, Finland. TKK-DISS-2292. ISBN 978-951-22-8749-9. URL: http://lib.tkk.fi/Diss/2007/isbn9789512287505/.

Malve, O., Laine, M., Haario, H., Kirkkala, T., Sarvala, J., 2007. Bayesian modelling of algal mass occurrences – using adaptive MCMC methods with a lake water quality model. Environ. Model. Software 22 (7), 966–977.

Niemi, J., Heinonen, P., Mitikka, S., Vuoristo, H., Pietiläinen, O.-P., Puupponen, M., Ronka, E., 2001. The Finnish Eurowaternet with information about Finnish water resources and monitoring strategies. In: Finnish Environment Institute, Environmental Protection, The Finnish Environment, No. 445. Edita Ltd, Helsinki, Finland, ISBN 952-11-0827-4, 62 pp.

Nixon, S., Grath, J., Bøgestrand, J., 1998. Eurowaternet, The European Environment Agency’s monitoring and information network for inland water resources. Technical guidelines for implementation. European Environment Agency, Copenhagen, Denmark. Technical Report No. 7.

Omernik, J.M., 1987. Ecoregions of the conterminous United States. Ann. Assoc. Am. Geogr. 77 (1), 118–125.

Omernik, J.M., Larsen, D.P., Rohm, S.M., Clarke, S.E., 1988. Summer total phosphorus in lakes: a map of Minnesota, Wisconsin, and Michigan, USA. Environ. Manag. 12, 815–825.

Peters, R.H., 1991. A Critique for Ecology. Cambridge University Press.

R Development Core Team, 2004. R: a Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, ISBN 3-900051-07-0. URL: http://www.R-project.org.

Raftery, A.E., 1995. Bayesian model selection in social research (with discussion). In: Marsden, P.V. (Ed.), Sociological Methodology. Blackwells, Cambridge, MA, pp. 111–196.

Raftery, A.E., Hoeting, J., Volinsky, C., Painter, I., 2005. BMA: Bayesian Model Averaging. R package version 3.01. http://www.r-project.org, http://www.research.att.com/~volinsky/bma.html.

Tibshirani, R., Knight, K., 1999. Model search by bootstrap “bumping”. J. Comp. Graph. Statist. 8 (4), 671–686.

USEPA, 1998. National Strategy for the Development of Regional Nutrient Criteria, Office of Water. United States Environmental Protection Agency. EPA 822-R-98-002.

Weisberg, S., 1985. Applied Linear Regression. John Wiley and Sons, New York.