This article was published in an Elsevier journal. The attached copy is furnished to the author for non-commercial research and education use, including for instruction at the author’s institution, sharing with colleagues and providing to institution administration. Other uses, including reproduction and distribution, or selling or licensing copies, or posting to personal, institutional or third party websites are prohibited. In most cases authors are permitted to post their version of the article (e.g. in Word or Tex form) to their personal website or institutional repository. Authors requiring further information regarding Elsevier’s archiving and manuscript policies are encouraged to visit: http://www.elsevier.com/copyright

Identification of reliable regression- and correlation-based sensitivity measures for importance ranking of water-quality model parameters

Jan 16, 2023


Gemma Manache a,*, Charles S. Melching b

a MWH, UK Operations, 240 Cygnet Court, Lakeside Drive, Centre Park, Warrington WA1 1RN, UK

b Department of Civil and Environmental Engineering, Marquette University, P.O. Box 1881, Milwaukee, WI 53201-1881, USA

Received 11 October 2006; received in revised form 7 August 2007; accepted 8 August 2007

Available online 18 October 2007

Abstract

Sensitivity analysis methods based on multiple simulations such as Monte Carlo Simulation (MCS) and Latin Hypercube Sampling (LHS) are very efficient, especially for complex computer models. The application of these methods involves successive runs of the model under investigation with different sampled sets of the uncertain model-input variables and (or) parameters. The subsequent statistical analysis based on regression and correlation analysis among the input variables and model output allows determination of the input variables or the parameters to which the model prediction uncertainty is most sensitive. The sensitivity effect of the model-input variables or parameters on the model outputs can be quantified by various statistical measures based on regression and correlation analysis. This paper provides a thorough review of these measures and their properties and develops a concept for selecting the most robust and reliable measures for practical use. The concept is demonstrated through the application of Latin Hypercube Sampling as the sensitivity analysis technique to the DUFLOW water-quality model developed for the Dender River in Belgium. The results obtained indicate that the Semi-Partial Correlation Coefficient and its rank equivalent, the Semi-Partial Rank Correlation Coefficient, can be considered adequate measures to assess the sensitivity of the DUFLOW model to the uncertainty in its input parameters.

© 2007 Elsevier Ltd. All rights reserved.

Keywords: Sensitivity analysis; Latin Hypercube Sampling; Monte Carlo Simulation; Water quality modelling; Linear regression; Correlation coefficients

1. Introduction

Many sources of uncertainty are associated with the development and application of computer models, such as natural uncertainty and variability, model structure uncertainty, parametric uncertainty, and data uncertainty (Beck, 1987). With all these sources of uncertainty, the model output obviously is uncertain. Therefore, the systematic application of sensitivity and uncertainty analysis prior to model application is essential in order to identify the critical aspects of the model that affect model output uncertainty. The benefits of sensitivity analysis are to gain basic insight on the system being modeled, to indicate whether the model operates as intended, to identify the key components of the model that require further calibration and/or study, and to assess the relative importance of input variables for guidance in data collection and model calibration. Saltelli (2002) further observed that sensitivity analysis is increasingly applied for corroboration, quality assurance, and defensibility of model-based analyses. An example of the insight gained from sensitivity analysis occurred when the first author presented the result (from this study) that algal processes dominate dissolved oxygen (DO) variations at a conference in Ghent, Belgium, in 2000, and the pollution managers for the Dender River were surprised as they had previously viewed the Dender as having the classical biochemical oxygen demand (BOD)-DO problem. Similarly, Bouraoui (2007) demonstrated that the application of sensitivity analysis to the PEARL model highlighted the greater importance of the pesticide degradation properties over the soil hydraulic properties. The study of Campolongo et al. (2007) indicated that the results of sensitivity analysis of a chemical reaction model for dimethylsulphide (DMS) open up the ground for model reconsideration. An example of designing data collection programs was that after Melching and Yoon (1996) showed that the reaeration-rate coefficient (K2) dominated DO uncertainty in the Passaic River, the New Jersey Department of Environmental Protection began evaluating the cost of measuring K2 in situ with tracer methods.

* Corresponding author. Tel.: +44 (0) 7724 632602; fax: +44 (0) 1925 658108.

E-mail address: [email protected] (G. Manache).

1364-8152/$ - see front matter © 2007 Elsevier Ltd. All rights reserved.

doi:10.1016/j.envsoft.2007.08.001

Available online at www.sciencedirect.com

Environmental Modelling & Software 23 (2008) 549-562

www.elsevier.com/locate/envsoft

Different strategies and methods can be used to perform sensitivity analyses on computer models. They can be classified into two main types according to the objectives of the analysis: local sensitivity analysis and global sensitivity analysis. Local sensitivity analysis is concerned with output variability in the neighbourhood of a nominal value in the input space while global sensitivity analysis is concerned with the variability of model output over the entire input space (Yeh and Tung, 1993). Sensitivity analysis methods can be classified into the following groups (Campolongo et al., 2000): differential analysis (local), Monte Carlo analysis (global), importance measures including the Fourier Amplitude Sensitivity Test, FAST (global), and response surface methodology (global). A comprehensive review of sensitivity analysis techniques is given in Helton (1993).

Several studies of the relative performance of the various methods have been reported in the literature (e.g., Iman and Helton, 1985, 1988; Saltelli and Marivoet, 1990; Saltelli and Homma, 1992; Saltelli et al., 1993; Pappenberger et al., 2006). Most of these studies illustrate that with the increasing computational power of computers, sensitivity analysis methods based on multiple simulations such as Monte Carlo Simulation (MCS) and Latin Hypercube Sampling (LHS) are very powerful, robust, and flexible compared with the other methods.

Global sensitivity analysis methods based on MCS and LHS consider the investigated model as a black box and the whole range of the input variables is examined. Therefore, this technique is very efficient, especially when the model under investigation is complex and its input parameters range over several orders of magnitude. Helton et al. (2006) note that LHS is very popular for use with computationally demanding models because its efficient stratification properties allow for the extraction of a large amount of uncertainty and sensitivity information with a relatively small sample size. The MCS and LHS analysis methods imply a sampling from a possible range of the input variables followed by model evaluations for the sampled values, after which regression- and correlation-based sensitivity measures can be computed (Saltelli et al., 1995).

In this paper, sensitivity analysis is carried out based on the LHS approach. This analysis is limited to the influence of parameter uncertainty on the model output. Nevertheless, it can also be applied to any other uncertain factors used in the model simulations (e.g., boundary conditions, forcing functions, and input data). The main objective of this paper is to provide a detailed review and theoretical discussion of various correlation- and regression-based measures proposed for sensitivity analysis and to establish a concept for selecting the most relevant measures for this study. The development of this concept is demonstrated through the application of LHS to a water-quality model developed for the Dender River in Belgium.

2. Sensitivity analysis

A key issue in the application of sensitivity analysis is the ranking of importance of the model-input variables or parameters by assessing their contribution to the uncertainty/variability of the model output. The sensitivity contributions of the model parameters/input variables to the model output can be quantified by various measures. These measures allow the estimation of how much the model output is affected by the uncertainty in the model parameters and/or input variables. Due to the considerable analogy between sensitivity and uncertainty analysis and the use of Monte Carlo type analysis in both analyses, it can be noticed that some of the sensitivity measures also can be used as uncertainty measures.

The most frequently used measures based on correlation and regression analysis are described in the following sections. The measures discussed in this paper are appropriate for models that are approximately linear or monotonically nonlinear (rank transform methods). Methods for non-monotonic, nonlinear models are reviewed in Helton et al. (2006). The measures studied here allow the input variables to be ranked as a function of their influence on the model output. They are computed from the matrix $X$ $(N \times P)$, which contains the values of the $P$ sampled parameters for the $N$ model runs, and the corresponding output vector $y = (y_1, y_2, \ldots, y_N)$. The output variable $y$ in the following sections is assumed to be single-valued to keep the notation simple. In practice, there may be more than one predicted variable of interest and sometimes the output variables are time dependent. In this latter case, the output vector $y$ represents the model prediction for the output variable of interest at a specific time point.
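As a minimal sketch of how the sample matrix and the output vector can be assembled, the following Python fragment draws a Latin Hypercube sample with uniform marginals and evaluates a model once per row. The parameter ranges, the uniform distributions, and the function `toy_model` are assumptions made for illustration only; they are not the DUFLOW model or the distributions used in this study.

```python
import numpy as np

def latin_hypercube(n_runs, bounds, rng):
    """Draw an (N x P) Latin Hypercube sample with uniform marginals.

    bounds holds one (low, high) pair per parameter (assumed ranges).
    """
    bounds = np.asarray(bounds, dtype=float)
    n_params = bounds.shape[0]
    # One point per equal-probability stratum, jittered inside the stratum.
    u = (np.arange(n_runs)[:, None] + rng.random((n_runs, n_params))) / n_runs
    # Shuffle the strata independently in every column.
    for j in range(n_params):
        u[:, j] = rng.permutation(u[:, j])
    return bounds[:, 0] + u * (bounds[:, 1] - bounds[:, 0])

def toy_model(x):
    """Hypothetical stand-in for one simulation run."""
    return 3.0 * x[0] - 2.0 * x[1] + 0.5 * x[2] ** 2

rng = np.random.default_rng(0)
X = latin_hypercube(100, [(0.0, 1.0), (0.0, 2.0), (1.0, 3.0)], rng)  # N = 100, P = 3
y = np.array([toy_model(row) for row in X])                          # output vector
```

Each column of `X` contains exactly one value in each of the N equal-probability strata, which is the stratification property referred to above.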

3. Regression-based sensitivity and uncertainty measures

3.1. Ordinary regression coefficient (ORC)

The relation between model parameters and model output can be approximated in a simple way with a linear regression model. Let $y = (y_1, y_2, \ldots, y_N)$ be the LHS output vector resulting from $N$ simulation runs and $X$ $(N \times p)$ the input matrix containing the values of the $p$ sampled input variables used for the $N$ runs:

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1p} \\ x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ x_{N1} & x_{N2} & \cdots & x_{Np} \end{bmatrix}$$

The least-squares regression of the model output $y(k)$ in the $k$-th simulation run $(k = 1, \ldots, N)$ on the associated sampled parameters $x_1(k), \ldots, x_p(k)$ can be written as


$$y(k) = b_0 + \sum_{i=1}^{p} b_i x_i(k) + e(k) = \hat{y}(k) + e(k), \quad k = 1, \ldots, N \tag{1}$$

where $y(k)$ and $x_1(k), \ldots, x_p(k)$ are the values of the model output and parameters in the $k$-th model run, $e(k)$ is the regression residual, $\hat{y}(k)$ is the regression-estimate of the model output for the $k$-th model run, and $b_0, b_1, \ldots, b_p$ are the ordinary regression coefficients (ORCs), obtained by minimising the function

$$F(b) = \sum_{k=1}^{N} \left[ y(k) - b_0 - \sum_{i=1}^{p} b_i x_i(k) \right]^2 \tag{2}$$

The quantity $\hat{y}(k) = b_0 + \sum_{i=1}^{p} b_i x_i(k)$ represents the best linear prediction. The ordinary regression coefficient $b_i$ can be considered as an absolute sensitivity measure as it quantifies approximately the absolute change $\Delta y$ of $y$ due to a change $\Delta x_i$ of $x_i$ while the other parameters $x_j$ ($j \neq i$) remain constant:

$$\mathrm{ORC}_i = \left. \frac{\Delta y}{\Delta x_i} \right|_{x_j = \text{constant}} \tag{3}$$

The goodness of the linear approximation can be assessed by the coefficient of determination (COD; also called $R^2$) of this regression

$$R_y^2 = \frac{S_{\hat{y}}^2}{S_y^2} = 1 - \frac{S_e^2}{S_y^2} \tag{4}$$

where $S_e^2$ is the variance of the regression residuals, $S_y^2$ is the variance of the original simulation model output, and $S_{\hat{y}}^2$ is the variance of the output $\hat{y}(k)$ of the linear regression model. $R_y^2$ represents the fraction of the variance of the output vector explained by the linear approximation. It indicates how well the linear regression model fits the original model output generated by LHS ($R_y^2 = 1$ indicates perfect model performance). The ORCs are not very useful in sensitivity analysis because each ORC is influenced by the units of $x_i$ and also does not incorporate any information on the distribution assigned to $x_i$ (Helton et al., 2006).
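The least-squares fit of Eqs. (1) and (2) and the coefficient of determination of Eq. (4) can be sketched as follows. The sample and the model are hypothetical stand-ins (plain random sampling replaces LHS here for brevity); only NumPy's `linalg.lstsq` is used for the minimisation.

```python
import numpy as np

rng = np.random.default_rng(1)
N, p = 200, 3
# Stand-in sample and hypothetical model (assumptions, not the study's data).
X = rng.uniform([0.0, 0.0, 1.0], [1.0, 2.0, 3.0], size=(N, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 2] ** 2

A = np.column_stack([np.ones(N), X])        # column of ones carries b_0
b, *_ = np.linalg.lstsq(A, y, rcond=None)   # minimises F(b) of Eq. (2)
y_hat = A @ b                               # best linear prediction
resid = y - y_hat                           # regression residuals e(k)

orc = b[1:]                                 # ordinary regression coefficients
r2 = 1.0 - resid.var() / y.var()            # Eq. (4), second form
r2_alt = y_hat.var() / y.var()              # Eq. (4), first form
```

For a regression that includes the intercept $b_0$, the two forms of Eq. (4) agree up to rounding, which is a convenient check on an implementation.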

3.2. Standardised regression coefficient (SRC)

To mitigate the units problem of the ORC, some standardisation or normalisation can be applied. By standardising the quantities in Eq. (1), the regression model can be written as

$$\frac{y - \bar{y}}{S_y} = b_1^{(s)} \frac{x_1 - \bar{x}_1}{S_{x_1}} + \cdots + b_p^{(s)} \frac{x_p - \bar{x}_p}{S_{x_p}} = \sum_{i=1}^{p} \mathrm{SRC}(y, x_i) \frac{x_i - \bar{x}_i}{S_{x_i}} \tag{5}$$

where $\bar{y}$ and $\bar{x}_i$ are the average values of $(y_1, \ldots, y_N)$ and $(x_{1i}, \ldots, x_{Ni})$, respectively, $S_y$ and $S_{x_i}$ are the respective standard deviations, $y$ is the original simulation model output, and $b_1^{(s)}, \ldots, b_p^{(s)}$ are the standardised regression coefficients computed from the $b_i$ as

$$b_i^{(s)} = \mathrm{SRC}(y, x_i) = b_i \frac{S_{x_i}}{S_y} \tag{6}$$

The standardised regression coefficients can be used as relative sensitivity measures (when the input variables are independent), measuring the effect of moving each input variable away from its mean by a fixed fraction of its standard deviation, while the other variables remain constant. When the $x_i$s are independent, the exclusion of an individual $x_i$ from the regression model has no effect on the SRCs for the remaining variables (Helton et al., 2006). In addition to alleviating the units problem of the ORC, the SRC also is influenced by the distribution assigned to $x_i$ (Helton et al., 2006). The SRC is a useful sensitivity measure but it is only meaningful if $S_{x_i}$ reflects a realistic spread of the considered variations and the variables $x$ are independent.
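The following sketch computes the SRC from Eq. (6) and illustrates the point about units: rescaling one input (for example, expressing it in different units) changes its ORC but leaves its SRC unchanged. The data, the coefficients, and the scaling factor of 1000 are hypothetical choices made for illustration.

```python
import numpy as np

def src(X, y):
    """Standardised regression coefficients b_i * S_xi / S_y, Eq. (6)."""
    A = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(A, y, rcond=None)
    return b[1:] * X.std(axis=0, ddof=1) / y.std(ddof=1)

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3))                       # independent inputs (assumed)
y = 4.0 * X[:, 0] + 2.0 * X[:, 1] + 1.0 * X[:, 2] + rng.normal(scale=0.1, size=500)

src_raw = src(X, y)
# Express the first input in other units: its ORC shrinks by 1000, its SRC does not move.
X_scaled = X * np.array([1000.0, 1.0, 1.0])
src_scaled = src(X_scaled, y)

ranking = np.argsort(-np.abs(src_raw))              # most influential input first
```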

Based on the regression, Eq. (1), the uncertainty in $y$ can be expressed in terms of $b_i S_{x_i}$ and the correlation coefficients $r_{x_i x_j}$ between the variables $x_i$ and $x_j$ (Janssen et al., 1992):

$$S_y^2 = \sum_{i=1}^{p} \sum_{j=1}^{p} (b_i S_{x_i})(b_j S_{x_j}) \, r_{x_i x_j} + S_e^2 \tag{7}$$

If $x_i$ is uncorrelated with the other input variables $x_j$ (i.e. $r_{x_i x_j} = 0$ for $i \neq j$), then the quantity $b_i S_{x_i}$ measures the linear uncertainty contribution of the input $x_i$. Consequently, the standardised regression coefficient (SRC) measures the fraction of the uncertainty (variance) in $y$ contributed by $x_i$ if the correlation between the input variables is weak and if the model coefficient of determination $R_y^2$ approaches 1.

If correlation exists between the input variables, the uncertainty contribution of an individual variable cannot be neatly quantified since this contribution also is related to one of the correlated variables. To cope with this problem, various uncertainty measures that account for correlation have been proposed in the literature (Janssen, 1994) such as the relative partial sum of squares (RPSS), the marginal uncertainty contribution (MUC), and the partial uncertainty contribution (PUC). Because this study is restricted to sensitivity analysis, these latter measures are not described here and the reader is referred to Janssen et al. (1992) for a detailed review.
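The variance decomposition of Eq. (7) can be checked numerically on any sample. The sketch below uses hypothetical correlated inputs and a hypothetical linear model with noise; for a least-squares fit with an intercept, the identity holds on the sample itself up to rounding.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 400
# Assumed input correlation structure: x_1 and x_2 correlated, x_3 independent.
cov = np.array([[1.0, 0.6, 0.0],
                [0.6, 1.0, 0.0],
                [0.0, 0.0, 1.0]])
X = rng.multivariate_normal(np.zeros(3), cov, size=N)
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=N)

A = np.column_stack([np.ones(N), X])
b, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ b

contrib = b[1:] * X.std(axis=0)          # the quantities b_i * S_xi
r_xx = np.corrcoef(X, rowvar=False)      # input correlation matrix r_{x_i x_j}
var_from_eq7 = contrib @ r_xx @ contrib + resid.var()   # right-hand side of Eq. (7)
```

The off-diagonal terms of `r_xx` carry the shared contribution of correlated inputs, which is why a single variable's share cannot be isolated when they are nonzero.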

3.3. Normalised regression coefficient (NRC)

The quantities in Eq. (1) also can be standardised by dividing the data by their average values. Thus, the regression model can be written as

$$\frac{y}{\bar{y}} = \sum_{i=1}^{p} \mathrm{NRC}(y, x_i) \frac{x_i}{\bar{x}_i} \tag{8}$$

where NRC is the normalised regression coefficient computed from $b_i$ as

$$\mathrm{NRC}(y, x_i) = b_i \frac{\bar{x}_i}{\bar{y}} \tag{9}$$

The normalised regression coefficients can be used as relative sensitivity measures, expressing the relative change $\Delta y / \bar{y}$ of $y$ with respect to its average, due to a relative change $\Delta x_i / \bar{x}_i$ of $x_i$ with respect to its average value, while the other variables remain constant. The NRC also can be used to quantify uncertainty if the linear regression model is a good approximation of the original model, the standard deviation $S_{x_i}$ reflects the actual variability of the input variable $x_i$, and the variables are independent. The NRC is related to the SRC as follows

$$\mathrm{SRC}_i = \mathrm{NRC}_i \frac{CV_i}{CV_y} \tag{10}$$

where $CV_i$ and $CV_y$ are the coefficients of variation for $x_i$ and $y$, respectively. This means that the standardised regression coefficient (SRC) can be expressed as a multiplication of the relative sensitivity contribution, NRC, and the ratio between the relative uncertainty in the input variables $x_i$ and in the model output $y$, and, thus, illustrates the simple link between sensitivity and uncertainty.
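Eq. (10) follows directly from Eqs. (6) and (9), and a short numerical check makes the link explicit. The ranges, coefficients, and noise level below are hypothetical; inputs with nonzero means are chosen so that the coefficients of variation are well defined.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 300
X = rng.uniform([1.0, 10.0], [2.0, 30.0], size=(N, 2))    # assumed positive inputs
y = 5.0 + 2.0 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.2, size=N)

A = np.column_stack([np.ones(N), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
b = coef[1:]

nrc = b * X.mean(axis=0) / y.mean()                  # Eq. (9)
src = b * X.std(axis=0, ddof=1) / y.std(ddof=1)      # Eq. (6)
cv_x = X.std(axis=0, ddof=1) / X.mean(axis=0)        # coefficients of variation CV_i
cv_y = y.std(ddof=1) / y.mean()                      # CV_y
src_from_nrc = nrc * cv_x / cv_y                     # Eq. (10)
```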

4. Correlation-based sensitivity measures

4.1. Linear correlation coefficient (LCC)

In this approach the association between the parameter $x_i$ and the model output $y$ is measured. The simplest and most widely used measure is the linear correlation coefficient $r_{y x_i}$ (LCC), which reflects the linear relation between $y$ and $x_i$. It can be expressed by

$$r_{y x_i} = \frac{\mathrm{cov}(y, x_i)}{\sqrt{\mathrm{var}(y)\,\mathrm{var}(x_i)}} = \frac{\sum_{k=1}^{N} (x_{ki} - \bar{x}_i)(y_k - \bar{y})}{\sqrt{\sum_{k=1}^{N} (x_{ki} - \bar{x}_i)^2 \sum_{k=1}^{N} (y_k - \bar{y})^2}} \tag{11}$$

where $\mathrm{cov}(y, x_i)$ is the covariance of $y$ and $x_i$, $\mathrm{var}(y)$ is the variance of $y$, $\mathrm{var}(x_i)$ is the variance of $x_i$, and $x_{ki}$ and $y_k$ are the values of the input variable $x_i$ and model output $y$ in the $k$-th model run, respectively.

The LCC indicates the degree of linear relation between input variable $x_i$ and model output $y$, taking into account the influence of the input variables which are correlated with $x_i$. The value of the LCC varies between -1 and 1; a value near 1 or -1 implies that $y$ can be expressed as a linear function of $x_i$. The LCC is a relative sensitivity measure; it quantifies the relative change $\Delta y$ of $y$, in terms of its standard deviation $S_y$, if $x_i$ changes relatively in terms of its standard deviation $S_{x_i}$, while the correlated sources change accordingly (i.e. if $x_i$ changes to $x_i + \alpha S_{x_i}$, then $x_j$ changes, due to correlation, to $x_j + \alpha r_{x_i x_j} S_{x_j}$) (Janssen et al., 1992):

$$\frac{\Delta y}{S_y} = \sum_{j=1}^{p} \mathrm{SRC}_j \frac{\Delta x_j}{S_{x_j}} = \sum_{j=1}^{p} \left( \mathrm{SRC}_j \, r_{x_i x_j} \right) \frac{\Delta x_i}{S_{x_i}} = \mathrm{LCC}_i \frac{\Delta x_i}{S_{x_i}} \tag{12}$$

where $\mathrm{SRC}_j$ is the standardised regression coefficient of the variable $x_j$ which is correlated to the variable $x_i$ and $r_{x_i x_j}$ is the correlation coefficient between the two variables.

The LCC also is used as an uncertainty measure, since it expresses the relative change of a quantity with relation to its standard deviation. If the relation between input variables and model output is almost linear and if correlation between the variables $x_i$ is weak, then the LCC is a measure to quantify the uncertainty, and will be approximately equal to the SRC. The LCC will equal the SRC for linear models with uncorrelated variables. In general the ratio between the LCC and the SRC is a measure of the influence of correlation on the uncertainty contribution. If this ratio is approximately 1, the influence of the correlation can be neglected (Janssen et al., 1990).
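The relation in Eq. (12), $\mathrm{LCC}_i = \sum_j \mathrm{SRC}_j \, r_{x_i x_j}$, and the LCC/SRC ratio can be illustrated with a small sketch. Two strongly correlated inputs (assumed correlation 0.7) with equal coefficients are used as a hypothetical example, so that the ratio departs clearly from 1.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 400
cov = np.array([[1.0, 0.7],
                [0.7, 1.0]])                       # assumed input correlation
X = rng.multivariate_normal(np.zeros(2), cov, size=N)
y = 1.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.3, size=N)

A = np.column_stack([np.ones(N), X])
b, *_ = np.linalg.lstsq(A, y, rcond=None)
src = b[1:] * X.std(axis=0) / y.std()              # Eq. (6)

r_xx = np.corrcoef(X, rowvar=False)
lcc = np.array([np.corrcoef(X[:, i], y)[0, 1] for i in range(X.shape[1])])  # Eq. (11)
lcc_from_src = r_xx @ src                          # sum_j SRC_j * r_{x_i x_j}, Eq. (12)
ratio = lcc / src                                  # influence of correlation
```

In this constructed case the ratio is well above 1 for both inputs, signalling that each LCC absorbs part of the effect of the other, correlated input.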

4.2. Partial correlation coefficient (PCC)

In the case that input variable $x_i$ is correlated to another input variable $x_j$, $j \neq i$, the LCC incorporates the influence of the other correlated parameters (Janssen et al., 1992). To avoid this drawback, Iman and Helton (1988) proposed the use of the partial correlation coefficient (PCC), which measures the degree of linear relation between the input variable $x_i$ and the model output $y$ after making an adjustment to remove the linear effect of all the remaining variables $x_j$, $j \neq i$. The PCC between the variable $x_i$ and the model output $y$ can be determined as

$$\mathrm{PCC}_i = \frac{-c_{iy}}{(c_{ii} \, c_{yy})^{1/2}} \tag{13}$$

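where $c_{iy}$, $c_{ii}$, and $c_{yy}$ are the elements of the inverse of the correlation matrix $C$ between the individual $x_i$s and $y$ based on $N$ simulation runs. The inverse matrix can be written as follows

$$\begin{bmatrix} r_{x_1 x_1} & r_{x_1 x_2} & \cdots & r_{x_1 x_p} & r_{x_1 y} \\ r_{x_2 x_1} & r_{x_2 x_2} & \cdots & r_{x_2 x_p} & r_{x_2 y} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ r_{x_p x_1} & r_{x_p x_2} & \cdots & r_{x_p x_p} & r_{x_p y} \\ r_{y x_1} & r_{y x_2} & \cdots & r_{y x_p} & r_{yy} \end{bmatrix}^{-1} = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1p} & c_{1y} \\ c_{21} & c_{22} & \cdots & c_{2p} & c_{2y} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ c_{p1} & c_{p2} & \cdots & c_{pp} & c_{py} \\ c_{y1} & c_{y2} & \cdots & c_{yp} & c_{yy} \end{bmatrix} \tag{14}$$

where $r_{x_i x_j}$ is the correlation coefficient between the input variables $x_i$ and $x_j$, and $r_{x_i y}$ is the correlation coefficient between the variable $x_i$ and the model output $y$. The PCC can also be expressed as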

$$\mathrm{PCC}_i = r_{\tilde{y}_i \tilde{x}_i} \tag{15}$$

where $\tilde{y}_i$ and $\tilde{x}_i$ result from correcting $y$ and $x_i$ for the linear effects of the other variables using simple regression. The PCC is a relative sensitivity measure; it quantifies the relative change $\Delta \tilde{y}_i$ of $\tilde{y}_i$, in terms of its standard deviation $S_{\tilde{y}_i}$, if $\tilde{x}_i$ changes relatively in terms of its standard deviation $S_{\tilde{x}_i}$ after removing the influence of the correlated variables from $y$ and $x_i$:

$$\frac{\Delta \tilde{y}_i}{S_{\tilde{y}_i}} = \mathrm{PCC}_i \frac{\Delta \tilde{x}_i}{S_{\tilde{x}_i}} \tag{16}$$

This measure also is used as an uncertainty measure, since it expresses the relative change of a quantity with relation to its standard deviation. The PCC tends to exclude the effects of the other elements of $x$, the assumed distribution for $x_i$, and the magnitude of the impact of the uncertainty in $x_i$ on the uncertainty in $y$ (Helton et al., 2006).
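The two expressions for the PCC, Eq. (13) from the inverse correlation matrix and Eq. (15) from the corrected quantities, give the same value on a sample. The sketch below computes both; the helper names `residual`, `pcc_inverse`, and `pcc_residual` and the test data are introduced here for illustration only.

```python
import numpy as np

def residual(target, others):
    """Residual of a least-squares fit of target on others (with intercept)."""
    A = np.column_stack([np.ones(len(target)), others])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return target - A @ coef

def pcc_inverse(X, y):
    """PCC from the inverse correlation matrix of (x_1, ..., x_p, y), Eq. (13)."""
    c = np.linalg.inv(np.corrcoef(np.column_stack([X, y]), rowvar=False))
    d = np.diag(c)
    return -c[:-1, -1] / np.sqrt(d[:-1] * d[-1])

def pcc_residual(X, y):
    """PCC as the correlation between the corrected y and x_i, Eq. (15)."""
    out = []
    for i in range(X.shape[1]):
        others = np.delete(X, i, axis=1)
        out.append(np.corrcoef(residual(y, others), residual(X[:, i], others))[0, 1])
    return np.array(out)

rng = np.random.default_rng(6)
cov = np.array([[1.0, 0.5, 0.2],
                [0.5, 1.0, 0.3],
                [0.2, 0.3, 1.0]])                  # assumed input correlations
X = rng.multivariate_normal(np.zeros(3), cov, size=300)
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=300)

pcc_a = pcc_inverse(X, y)
pcc_b = pcc_residual(X, y)
```

The matrix form needs a single inversion for all inputs, whereas the residual form makes explicit that a different corrected output $\tilde{y}_i$ is used for each input.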

4.3. Semi-partial correlation coefficient (SPC)

Since the model output $y$ also is corrected for the effects of the correlated variables $x_j$, $j \neq i$, the PCC is, therefore, concerned with a different model output $\tilde{y}_i$ for each individual variable $x_i$. This hampers a fair comparison of the contributions of the various input variables. Therefore, Janssen et al. (1990) proposed an alternative measure by correcting only the variable $x_i$, and correlating the corrected quantity $\tilde{x}_i$ with the original model output $y$:

$$\mathrm{SPC}_i = r_{y \tilde{x}_i} \tag{17}$$

SPC denotes the semi-partial correlation coefficient that expresses the linear relation between $y$ and the corrected quantity $\tilde{x}_i$. The SPC is a relative sensitivity measure which quantifies the relative change $\Delta y$ of $y$, in terms of its standard deviation $S_y$, if $\tilde{x}_i$ changes relatively in terms of its standard deviation $S_{\tilde{x}_i}$:

$$\frac{\Delta y}{S_y} = \mathrm{SPC}_i \frac{\Delta \tilde{x}_i}{S_{\tilde{x}_i}} \tag{18}$$

If the correlation between the variables $\tilde{x}_i$ is weak, the SPC will be approximately equal to the LCC and the SRC. In the case of strong correlation between the variables $\tilde{x}_i$, this measure can give a misleading impression of the uncertainty contribution (Janssen et al., 1992).
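A sketch of Eq. (17) is given below: only $x_i$ is corrected for the other inputs, and the corrected quantity is correlated with the unmodified output. The helper names and the data are hypothetical. The final lines compare the SPC with the PCC, computed here through its residual form, Eq. (15).

```python
import numpy as np

def residual(target, others):
    """Residual of a least-squares fit of target on others (with intercept)."""
    A = np.column_stack([np.ones(len(target)), others])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return target - A @ coef

def spc(X, y):
    """Semi-partial correlation: corrected x_i against the original y, Eq. (17)."""
    out = []
    for i in range(X.shape[1]):
        x_tilde = residual(X[:, i], np.delete(X, i, axis=1))
        out.append(np.corrcoef(y, x_tilde)[0, 1])
    return np.array(out)

def pcc(X, y):
    """Partial correlation: corrected x_i against the corrected y, Eq. (15)."""
    out = []
    for i in range(X.shape[1]):
        others = np.delete(X, i, axis=1)
        out.append(np.corrcoef(residual(y, others), residual(X[:, i], others))[0, 1])
    return np.array(out)

rng = np.random.default_rng(11)
cov = np.array([[1.0, 0.5, 0.2],
                [0.5, 1.0, 0.3],
                [0.2, 0.3, 1.0]])                  # assumed input correlations
X = rng.multivariate_normal(np.zeros(3), cov, size=300)
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=300)

spc_vals = spc(X, y)
pcc_vals = pcc(X, y)
```

Because all SPC values refer to the same output $y$, they can be compared across inputs directly; their magnitudes do not exceed those of the corresponding PCC values.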

4.4. Alternative measures computed on the rank-transformed data

The regression- and correlation-based measures mentioned in the previous sections are only valid when the relation between model input and output is approximately linear ($R^2 \approx 1$). The farther $R^2$ is from 1 the less valid the various measures become. Saltelli et al. (2006) suggested $R^2 > 0.7$ as a bound on the usefulness of the linear correlation and regression measures. When non-linearity between model input and output is present, more sophisticated (nonlinear) regression models can be used (Draper and Smith, 1981), some transformations on the data can be applied (Gilbert, 1987; Stoline, 1991), or gridding- and entropy-based methods (among others) can be applied for non-monotonic, nonlinear models (see Helton et al., 2006).

Iman and Conover (1979) considered the rank-transformation method in which the original values of the input variables and the model output are replaced by their rankings (ranking 1 for the smallest value). This technique is used to linearize monotonic nonlinear relations so that linear regression analysis can be applied on the rank-transformed data. For example, the regression equation between the ranks of the output and the ranks of the input variables is

$$R_y(k) = b_0^r + \sum_{i=1}^{p} b_i^r R_{x_i}(k) + e^r(k) = \hat{R}_y(k) + e^r(k), \quad k = 1, \ldots, N \tag{19}$$

where $R_y(k)$ and $R_{x_1}(k), \ldots, R_{x_p}(k)$ are the values of the ranked model output and ranked parameters in the $k$-th model run, $e^r(k)$ is the rank-regression residual, $\hat{R}_y(k)$ is the regression-estimate of the ranked model output for the $k$-th model run, and $b_0^r, b_1^r, \ldots, b_p^r$ are the rank-regression coefficients (RRC). Eqs. (2)-(18) could be similarly modified by replacing $y$ with $R_y$ and $x_i$ with $R_{x_i}$ to obtain the analogous sensitivity measures from the rank-regression analysis. The regression measures are referred to as the standardised rank-regression coefficient (SRRC) and normalised rank-regression coefficient (NRRC). The correlation-based sensitivity measures computed on the rank-transformed data (input variables and model output) are referred to as the linear rank correlation coefficient (LRCC), the partial rank correlation coefficient (PRCC), and the semi-partial rank correlation coefficient (SPRC). A summary of the previously mentioned sensitivity and uncertainty measures is given in Table 1. Because uncertainties in rank-transformed data have no direct relation with those in the original data, it is, therefore, difficult to relate the rank-based measures to statements on the uncertainty contributions of the original input variables (Janssen et al., 1990).
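The effect of the rank transformation on a monotonic but strongly nonlinear relation can be sketched as follows. The exponential model is a hypothetical example, and the `ranks` helper assumes there are no ties, which is reasonable for continuous sampled values but would need a tie-handling rule otherwise.

```python
import numpy as np

def ranks(v):
    """Rank 1 for the smallest value; assumes no ties (continuous samples)."""
    order = np.argsort(v)
    r = np.empty(len(v))
    r[order] = np.arange(1, len(v) + 1)
    return r

def r_squared(X, y):
    """Coefficient of determination of a linear fit with intercept, Eq. (4)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return 1.0 - (y - A @ coef).var() / y.var()

rng = np.random.default_rng(7)
X = rng.uniform(0.0, 1.0, size=(300, 2))
y = np.exp(6.0 * X[:, 0]) + 5.0 * X[:, 1]      # monotonic, nonlinear (hypothetical)

RX = np.column_stack([ranks(X[:, j]) for j in range(X.shape[1])])
Ry = ranks(y)

r2_raw = r_squared(X, y)       # linear fit on the original data
r2_rank = r_squared(RX, Ry)    # linear fit on the ranks, Eq. (19)
```

The measures of Eqs. (6), (11), (13), and (17) can then be evaluated on `RX` and `Ry` in place of `X` and `y` to obtain the SRRC, LRCC, PRCC, and SPRC.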

4.5. Measures related to the linear regression model adequacy and application

When using the regression or rank-regression based measures in sensitivity analysis, it is important to consider the model coefficient of determination $R_y^2$ (on raw values or ranks), which gives the percentage of the variance of the input data reproduced by the regression model (Janssen et al., 1992; Saltelli et al., 1993; Campolongo and Saltelli, 1997). This measure represents the validity of the linear regression model to approximate the model output, and, hence, the validity of the selected sensitivity measures. A value of $R_y^2$ close to 1 indicates an effective linear regression model, and, hence, the possibility of ranking the model parameters based on regression model coefficients. Low $R_y^2$ values (<0.7, Saltelli et al., 2006) indicate a poor regression model; a low percentage of the data variance is accounted for, so that the ranking of the parameters based on their contribution to this fraction loses significance (Saltelli et al., 1993).

Table 1
Summary of various sensitivity and uncertainty measures based on regression and correlation analysis

| | Original data | Rank-transformed data | Sensitivity | Uncertainty |
|---|---|---|---|---|
| Regression | ORC | RRC | $\Delta y / \Delta x_i$ | - |
| | NRC | NRRC | $(\Delta y / \bar{y}) / (\Delta x_i / \bar{x}_i)$ | - |
| | SRC | SRRC | $(\Delta y / S_y) / (\Delta x_i / S_{x_i})$ | $(S_{\Delta y} / S_y) / (S_{\Delta x_i} / S_{x_i})$ |
| Correlation | LCC | LRCC | $(\Delta y / S_y) / (\Delta x_i / S_{x_i})$ | $(S_{\Delta y} / S_y) / (S_{\Delta x_i} / S_{x_i})$ |
| | PCC | PRCC | $(\Delta \tilde{y}_i / S_{\tilde{y}_i}) / (\Delta \tilde{x}_i / S_{\tilde{x}_i})$ | $(S_{\Delta \tilde{y}_i} / S_{\tilde{y}_i}) / (S_{\Delta \tilde{x}_i} / S_{\tilde{x}_i})$ |
| | SPC | SPRC | $(\Delta y / S_y) / (\Delta \tilde{x}_i / S_{\tilde{x}_i})$ | $(S_{\Delta y} / S_y) / (S_{\Delta \tilde{x}_i} / S_{\tilde{x}_i})$ |

When additional variables are added to a regression equation, the value of $R^2$ increases even when these variables do not significantly improve the regression equation. Therefore, the interpretation of $R^2$ only is valid on the population level (Janssen, 1994); for finite sample situations, an alternative measure can be more appropriate, e.g., the adjusted $R^2$ (Seber, 1977; Janssen et al., 1990, 1992). The adjusted $R^2$ ($R_{adj}^2$) can be expressed as follows

$$R_{adj}^2 = 1 - (1 - R^2) \frac{N - 1}{N - (1 + p)} \tag{20}$$

By modifying $R^2$ by the number of parameters used in the regression equation, the adjusted $R^2$ is more likely to only increase if the new parameter results in an improved regression model.
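Eq. (20) is straightforward to evaluate; the sketch below also shows that appending regressors that are unrelated to the output cannot lower $R^2$ on the sample. The data and the five irrelevant regressors are hypothetical.

```python
import numpy as np

def adjusted_r2(r2, n_runs, n_params):
    """Adjusted R^2 of Eq. (20) for N runs and p regressors."""
    return 1.0 - (1.0 - r2) * (n_runs - 1) / (n_runs - (1 + n_params))

def r_squared(X, y):
    """Coefficient of determination of a linear fit with intercept, Eq. (4)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return 1.0 - (y - A @ coef).var() / y.var()

rng = np.random.default_rng(8)
N = 60
X = rng.normal(size=(N, 2))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.0, size=N)
X_extra = np.column_stack([X, rng.normal(size=(N, 5))])   # five irrelevant regressors

r2_base, r2_extra = r_squared(X, y), r_squared(X_extra, y)
adj_base = adjusted_r2(r2_base, N, 2)
adj_extra = adjusted_r2(r2_extra, N, 7)
```

Comparing `adj_base` with `adj_extra` rather than `r2_base` with `r2_extra` applies the penalty for the additional regressors; whether the adjusted value rises or falls depends on the particular sample.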

When input variables are linearly related, the application of a linear regression can lead to an accuracy problem, called the collinearity problem (Hocking, 1983). The variance inflation factor, $\mathrm{VIF}_i$, suggested by Marquardt (1970) is considered to be a useful measure to detect collinearities (Snee, 1983). The $\mathrm{VIF}_i$ is defined as the $i$-th diagonal element of the inverse of the correlation matrix $C_x$ of the $x_i$s

$$\mathrm{VIF}_i = \left[ C_x^{-1} \right]_{ii} = \left( 1 - r_i^2 \right)^{-1} \tag{21}$$

where $r_i$ is the multiple correlation coefficient of the regression of $x_i$ on all other $x_j$s ($i \neq j$). The $\mathrm{VIF}_i$ is a measure for the inflation of the variance in the estimated regression coefficient $b_i$ due to the mutual correlation between the $x_j$s (Janssen et al., 1992). In the case of uncorrelated variables, the variance inflation factor is equal to 1. Snee (1983) reported that the least-squares results are acceptable when the maximum VIF is less than 5 or 10. Larger values indicate that it is appropriate to consider more sophisticated regression techniques for handling the collinearity.
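Both forms of Eq. (21) can be evaluated and compared on a sample. In this sketch two of the three inputs are constructed to be nearly collinear (a hypothetical arrangement), so their VIF values are large while the third stays close to 1.

```python
import numpy as np

def vif_from_inverse(X):
    """Diagonal of the inverse input correlation matrix, Eq. (21), first form."""
    return np.diag(np.linalg.inv(np.corrcoef(X, rowvar=False)))

def vif_from_regression(X):
    """1 / (1 - r_i^2) from regressing each x_i on the others, Eq. (21), second form."""
    out = []
    for i in range(X.shape[1]):
        others = np.column_stack([np.ones(len(X)), np.delete(X, i, axis=1)])
        coef, *_ = np.linalg.lstsq(others, X[:, i], rcond=None)
        r2_i = 1.0 - (X[:, i] - others @ coef).var() / X[:, i].var()
        out.append(1.0 / (1.0 - r2_i))
    return np.array(out)

rng = np.random.default_rng(9)
z = rng.normal(size=(500, 3))
# x_1 and x_2 nearly collinear by construction; x_3 independent (assumed setup).
X = np.column_stack([z[:, 0], z[:, 0] + 0.3 * z[:, 1], z[:, 2]])

vif_a = vif_from_inverse(X)
vif_b = vif_from_regression(X)
```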

In addition to the $R^2$ and VIF, it is also important to specify the accuracy of the estimated regression quantities, enabling a test on their significance. These tests are described in most textbooks on regression analysis (e.g., Draper and Smith, 1981). Janssen et al. (1992) listed the most important of these estimators and tests in sensitivity and uncertainty analyses as follows.

• The standard error of the regression-estimate, defined as the square root of the Mean Squares of Errors (MSE). It represents the standard deviation of the data about the regression line, rather than about the sample mean.

• The F ratio is the test statistic used to decide whether the regression model as a whole has statistically significant predictive capability. F is the ratio of the Mean Squares of Regression (MSR) to the Mean Squares of Errors (MSE), where

$$\mathrm{MSR} = \sum \left( \hat{y}(k) - \bar{y} \right)^2 / p \tag{22}$$

• The t-statistic can be used for measuring the significance of the estimated regression coefficients as described in Section 5.
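The three quantities listed above can be sketched as follows. Eq. (22) is taken from the text; the MSE with $N - p - 1$ degrees of freedom and the coefficient standard errors from the diagonal of $\mathrm{MSE} \cdot (A^T A)^{-1}$ are the usual textbook forms and are assumptions here, since they are not written out in this section. The data are hypothetical, with one input given a true coefficient of zero.

```python
import numpy as np

rng = np.random.default_rng(10)
N, p = 120, 3
X = rng.normal(size=(N, p))
y = 1.5 * X[:, 0] + 0.0 * X[:, 1] - 0.8 * X[:, 2] + rng.normal(scale=1.0, size=N)

A = np.column_stack([np.ones(N), X])
b, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ b
resid = y - y_hat

mse = np.sum(resid ** 2) / (N - p - 1)        # assumed textbook degrees of freedom
msr = np.sum((y_hat - y.mean()) ** 2) / p     # Eq. (22)
std_error = np.sqrt(mse)                      # standard error of the regression-estimate
f_ratio = msr / mse                           # F ratio for the regression as a whole

# Standard errors of b_0..b_p and the corresponding t-statistics (assumed textbook form).
se_b = np.sqrt(mse * np.diag(np.linalg.inv(A.T @ A)))
t_stat = b / se_b
```

Entries of `t_stat` are ordered as the columns of `A`, so index 0 refers to the intercept and indices 1 to 3 to the three inputs.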

4.6. Discussion of various sensitivity measures

In the previous sections, a number of sensitivity and uncer-tainty measures based on regression and correlation analysiswere presented. These measures are computed either on rawdata or rank-transformed data. When the different measuresare used to rank the input variables according to their influenceon the model output uncertainty, it is clear that the ranking willnot be unique. Therefore, it would be useful to identifywhether some techniques perform better than the others andwhen two or more techniques can provide complementaryinformation (Saltelli and Marivoet, 1990).

When the model output can be represented as a linear function of the input variables, the SRC can be used, as well as correlation measures such as the LCC and the PCC (Gomit et al., 1997; Saltelli and Sobol, 1995). The SRC is a very effective measure of the relative importance of the input variables. The validity of this measure is conditional on the degree to which the regression model fits the data, i.e. on $R^2$ (Gomit et al., 1997; Janssen et al., 1990, 1992; Saltelli et al., 1993). The correlation-based measures LCC and PCC reflect the linear relations between the input variables and the model output. The PCC differs from the LCC in that it measures the degree of the linear relation between the input variable $x_i$ and the model output $y$ after removing the linear effect of the other variables (Iman and Conover, 1979; Iman and Helton, 1988). If the modeller is interested in the underlying error-propagation properties of a model, the PCC is of greatest interest (Gardner et al., 1981). Under field conditions, all components of the system are subject to uncertainty and are measured with error. These sources of error cannot be controlled and their influence on prediction reliability cannot be removed as in a PCC analysis. Thus, in the design of field sampling programs, the ranking of parameters according to the LCC is the most relevant (Gardner et al., 1981). An example of using LCC and PCC in sensitivity analysis of model parameters for a linear stream-ecosystem model is presented in Gardner et al. (1981).

Yeh and Tung (1993) reported that the correlation coefficients indicate the strength of the association between inputs and output while regression coefficients represent the intensity of the relation. The study of Saltelli et al. (1993) indicated that SRC and PCC always produce the same ranking, unless significant correlation is imposed on the input variables. Yeh and Tung (1993) found in their study of the sensitivity and uncertainty analysis of a pit migration model that the ranking of the input parameters based on the t-ratio of SRC is practically identical to that using PCC.

As indicated previously, the PCC has the drawback that the considered model output is different for each individual input variable, thus preventing a fair comparison of the various uncertainty contributions to $y$. Therefore, the use of the SPC proposed in Janssen et al. (1990) as an alternative measure would be more efficient. The relation between the PCC and the SPC can be expressed by

$$\mathrm{PCC}_i = \frac{\mathrm{SPC}_i}{\sqrt{(\mathrm{SPC}_i)^2 + \left(1 - R^2\right)}} \qquad (23)$$

Due to this relation, the PCC behaves similarly to the SPC in the ordinal sense (if $R^2 \approx 1$), and, therefore, leads to the same importance ranking. However, the numerical values of the PCC do not always allow clear discrimination between important and unimportant variables, e.g., if $R^2 \approx 1$ all PCC will be close to −1 or 1 (Janssen et al., 1992; Janssen, 1994), and a variable can appear to have a larger effect on the uncertainty in $y$ than is actually the case (Helton et al., 2006). The parametric measures SRC, LCC, and SPC are related as follows

$$\mathrm{LCC}_i = \sum_{j=1}^{p} \mathrm{SRC}_j \, r_{x_i x_j} \qquad (24)$$

$$\mathrm{SPC}_i = \frac{\mathrm{SRC}_i}{\sqrt{\left(C_x^{-1}\right)_{ii}}} = \frac{\mathrm{SRC}_i}{\sqrt{\mathrm{VIF}_i}} \qquad (25)$$

These relations show that SRC, SPC, and LCC are equal if the input variables are uncorrelated and where $R^2 \approx 1$. Thus, they lead to the same importance rankings (Janssen et al., 1992; Janssen, 1994). Under these conditions the use of one of the three measures will be sufficient for the identification of the most important input variables.
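Relations (23) to (25) can be checked numerically. The sketch below uses NumPy on synthetic data and assumes that $C_x$ is the correlation matrix of the inputs. The function name `sensitivity_measures` and the test data are hypothetical, and nothing here reproduces UNCSAM itself.

```python
import numpy as np


def sensitivity_measures(X, y):
    """Return SRC, LCC, SPC, PCC, VIF and R^2 for a linear regression of y on X."""
    n, _ = X.shape
    # Standardise inputs and output to zero mean and unit sample variance.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    ys = (y - y.mean()) / y.std(ddof=1)
    C = (Xs.T @ Xs) / (n - 1)            # correlation matrix of the inputs, C_x
    lcc = (Xs.T @ ys) / (n - 1)          # linear correlation of each input with y
    src = np.linalg.solve(C, lcc)        # standardised regression coefficients
    r2 = float(src @ lcc)                # coefficient of determination
    vif = np.diag(np.linalg.inv(C))      # VIF_i = (C_x^-1)_ii
    spc = src / np.sqrt(vif)             # Eq. (25)
    pcc = spc / np.sqrt(spc**2 + (1.0 - r2))   # Eq. (23)
    return src, lcc, spc, pcc, vif, r2


# Synthetic example: three inputs, the second correlated with the first.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 3))
X[:, 1] += 0.5 * X[:, 0]
y = 2.0 * X[:, 0] - X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=n)

src, lcc, spc, pcc, vif, r2 = sensitivity_measures(X, y)
for name, values in (("SRC", src), ("LCC", lcc), ("SPC", spc), ("PCC", pcc), ("VIF", vif)):
    print(name, np.round(values, 3))
print("R^2", round(r2, 3))
```

With uncorrelated inputs $C_x$ reduces to the identity matrix, so Eqs. (24) and (25) give SRC = LCC = SPC, as stated in the text.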

In the case of substantial correlation between the input variables, the SRC will not be a reliable indicator. If the linear regression model is valid, the SPC would be the most appropriate sensitivity/uncertainty measure (Janssen, 1994). However, additional information on correlation structure and SRC should be used in combination with SPC to avoid erroneous interpretations.

Sensitivity and uncertainty measures computed on the rank-transformed data such as SRRC, LRCC, PRCC, and SPRC normally are used in the presence of non-linearity between model-input variables and output (when $R^2$ computed on the raw values is low). Saltelli et al. (1995) reported that the difference between the $R^2$ computed on the raw data and on the ranks is a useful indicator of the non-linearity of the model. As far as the input-output relation is monotonic, the SRRC, LRCC, PRCC, and SPRC can be considered fairly reproducible and accurate. However, their accuracy becomes dubious in the presence of model non-monotonicity (Saltelli et al., 1993).
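The effect of the rank transformation on a monotonic but non-linear relation can be illustrated with a single input. The sketch below compares the correlation on raw values (an LCC) with the correlation on ranks (an LRCC). The helper names `ranks` and `pearson` and the exponential test function are assumptions made for illustration.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2.0 + 1.0
        for k in order[pos:end + 1]:
            result[k] = average_rank
        pos = end + 1
    return result


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


x = [0.1 * k for k in range(1, 41)]
y = [math.exp(2.0 * v) for v in x]      # monotonic but strongly non-linear

lcc_raw = pearson(x, y)                 # correlation on raw values
lrcc = pearson(ranks(x), ranks(y))      # correlation on ranks
print(f"raw: {lcc_raw:.3f}, ranks: {lrcc:.3f}")
```

For a non-monotonic relation (for example, $y = x^2$ on an interval symmetric about zero) both coefficients would be close to zero, which is the limitation noted at the end of the paragraph above.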

Iman and Helton (1985, 1988) compared three widely used techniques for sensitivity and uncertainty analyses: response surface methodology, LHS with and without regression analysis, and differential analysis. The results obtained indicated that the SRRC and PRCC used in conjunction with LHS are the most robust. Saltelli and Marivoet (1990) compared a number of parametric and non-parametric measures and test statistics used in sensitivity analysis such as LCC, PCC, SRC, LRCC, PRCC, SRRC, the Smirnov test statistic, the Cramér-von Mises test statistic, the Mann-Whitney test statistic, and the two-sample t-test statistic. The main finding of this study was that the SRRC and PRCC were the most stable measures.

Saltelli and Marivoet (1990), Saltelli and Homma (1992), and Saltelli et al. (1993) indicated that SRRC and PRCC methods perform identically and appear to be the most robust and reliable. Moreover, an important observation made by Saltelli and Homma (1992) is that the use of the PRCC and SRRC is somewhat redundant because the two techniques produce identical variable ranking, especially in the case that the input variables are not correlated.

Based on the above discussion, a concept is established for selecting appropriate sensitivity measures in this study. Fig. 1 illustrates schematically the decision tree used in the application of sensitivity/uncertainty analysis to the DUFLOW model. In this figure the decision steps involving the various sensitivity measures (SRC, LCC, SRRC, etc.) are applied to each model parameter or input variable individually, whereas the other decisions are applied to the regression models or groups of the parameters and/or input variables. The criteria for determining a strong correlation, strong rank correlation, and the significance of the various sensitivity measures are developed and tested in this study. As can be seen from this figure, the SRC, LCC, PCC, and SPC are adequate sensitivity/uncertainty measures in the case of a good linear (rank) regression model and uncorrelated input variables. As mentioned previously, SRC, LCC, and SPC are equal for uncorrelated input variables. In general the ratio between the LCC and the SRC is a measure of the influence of correlation on the uncertainty contribution. If this ratio is approximately 1, the correlation between the input variables is weak. The variance inflation factor, VIF, also can be a useful indicator of linear relations between input variables. In the case of uncorrelated variables the variance inflation factor is equal to 1 (and close to 1 for weak correlation), and a value greater than 5 indicates substantial correlation.
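The selection logic described above can be sketched as a small function. The branch structure is a reading of the labels that survive from Fig. 1 together with the surrounding text, so it should be treated as an assumption rather than as the authors' exact procedure. The function name `select_measures` and its arguments are hypothetical.

```python
def select_measures(r2_adj, r2_adj_rank, strong_corr, strong_rank_corr, threshold=0.7):
    """Return the sensitivity measures to test for significance.

    r2_adj           : adjusted R^2 of the regression on raw data
    r2_adj_rank      : adjusted R^2 of the regression on ranks
    strong_corr      : True if the inputs are strongly correlated (raw data)
    strong_rank_corr : True if the inputs are strongly rank-correlated
    threshold        : acceptance limit for the adjusted R^2 (0.7 in Fig. 1)
    """
    if r2_adj >= threshold:
        # A linear model on the raw data is acceptable.
        return ["RPSS", "SPC"] if strong_corr else ["SRC", "LCC", "PCC", "SPC"]
    if r2_adj_rank >= threshold:
        # Otherwise fall back on the rank-transformed data.
        if strong_rank_corr:
            return ["RPRSS", "SPRC"]
        return ["SRRC", "LRCC", "PRCC", "SPRC"]
    return []  # neither regression is adequate: stop


print(select_measures(0.85, 0.80, strong_corr=False, strong_rank_corr=False))
print(select_measures(0.40, 0.80, strong_corr=False, strong_rank_corr=True))
```

Each measure returned would then be tested for significance for every input variable individually, which separates important from unimportant variables.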

When correlation exists between the input variables, the SRC becomes an unreliable sensitivity/uncertainty measure as previously discussed. To cope with this problem, other measures that account for correlation can be used (Janssen et al., 1992; Janssen, 1994). One of these measures is the relative partial sum of squares (RPSS) proposed by Dale et al. (1988). This measure is obtained by discarding $x_i$ from the regression and measuring the associated loss in explained variance

$$\mathrm{RPSS}_i = \frac{S_y^2 - S_{y|i-}^2}{S_y^2} \qquad (26)$$

where $\hat{y}_{|i-}$ is the associated linear predictor, when predicting $y$ on the basis of $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_p$, and $S_{y|i-}^2$ is its variance. A large RPSS$_i$ indicates that the variable $x_i$ is an important source of uncertainty.
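A minimal sketch of this leave-one-variable-out idea is given below. It interprets Eq. (26) as the relative loss in explained variance ($R^2$) when $x_i$ is dropped, following the description of the RPSS given later in Section 5. That interpretation, the function names, and the synthetic data are assumptions.

```python
import numpy as np


def r_squared(X, y):
    """Coefficient of determination of a least-squares fit of y on X (with intercept)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def rpss(X, y):
    """Relative loss in R^2 when each input is removed from the regression in turn."""
    full = r_squared(X, y)
    losses = []
    for i in range(X.shape[1]):
        reduced = r_squared(np.delete(X, i, axis=1), y)   # regression without x_i
        losses.append((full - reduced) / full)
    return np.array(losses)


# Synthetic example: the first input dominates, the third has no effect.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(scale=0.5, size=500)

values = rpss(X, y)
print(np.round(values, 3))
```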

4.7. Sensitivity analysis of the DUFLOW model

In this paper, the effect of parameter uncertainties on a water-quality model prediction is investigated through the application of sensitivity analysis. The LHS technique in combination with regression and correlation analysis has been applied to the DUFLOW model developed for the Dender River in Belgium.

[Figure 1: flowchart. Boxes and decision nodes recoverable from the original figure: determine the uncertain input variables to be considered for sensitivity analysis; specify ranges of variation and distributions; generate Latin Hypercube samples of size N; run the computer model with the sampled values of the input variables (N runs); define a linear regression model between the sampled input variables and the considered model output; linear model? ($R^2_{adj} \geq 0.7$); strong correlation?; RPSS, SPC significant?; SRC, LCC, PCC, SPC significant?; apply rank transform on the data; $R^2_{adjr} \geq 0.7$?; strong rank correlation?; RPRSS, SPRC significant?; SRRC, LRCC, PRCC, SPRC significant?; important variable; unimportant variable; STOP.]

Fig. 1. Schematic view of the sensitivity analysis concept applied in this study. [$R^2_{adj}$, adjusted coefficient of determination; SRC, standardised regression coefficient; PCC, partial correlation coefficient; SPC, semi-partial correlation coefficient; LCC, linear correlation coefficient; $R^2_{adjr}$, adjusted rank coefficient of determination; SRRC, standardised rank-regression coefficient; PRCC, partial rank correlation coefficient; SPRC, semi-partial rank correlation coefficient; LRCC, linear rank correlation coefficient.]

DUFLOW is a computer package for simulating one-dimensional unsteady flow and water quality in open watercourses (DUFLOW, 1992). In the computation of flow hydraulics the DUFLOW model solves the full de Saint-Venant equations of motion for unsteady flow. This hydraulic model can be directly coupled with one of two predefined water-quality models: EUTROF1 and EUTROF2. Also, DUFLOW is an open format model and users may easily add formulations to the model. EUTROF1 is based on the EUTRO4 model from WASP4 developed by the U.S. Environmental Protection Agency (Ambrose et al., 1988). It includes the cycling of nitrogen, phosphorus, and oxygen. The growth of one phytoplankton species also is simulated. In EUTROF2 three algal species are included and interactions between the sediment and the overlying water column are taken into consideration, while the other water-quality kinetics in the water column are simulated in a similar way as in EUTROF1. In this study, EUTROF1 was used with the modification that the coefficient in the O'Connor-Dobbins equation to estimate the reaeration-rate coefficient could be considered an uncertain variable.

Within the DUFLOW model, 37 parameters were used for modelling the water-quality processes in the Dender River. Twenty-nine parameters were considered uncertain, while the other eight parameters were assumed to have a small effect on the model uncertainty. The names, statistics, and assumed distributions for these parameters are listed in Table 2. The basis for the assumptions regarding these parameters is given in Manache and Melching (2004). Extensive details on the application of the DUFLOW model to the Dender River can be found in Manache (2001) and Manache and Melching (2004).

For the application of the LHS procedure, the uncertainty of each of these 29 parameters has been characterized by a specific probability distribution (Table 2). All parameters were assumed to be statistically independent because of a lack of information on possible correlation among parameters. The software package UNCSAM (Janssen et al., 1992) was used to generate the N sets of the random parameter values corresponding to the LHS procedure. The sample size N is taken equal to 4/3 times the number of the uncertain parameters (N = 4/3 p), which was found to give satisfactory results by Iman and Helton (1985). To test the reliability of the use of N = 4/3 p, a second LHS sample was made with N = 3p, and the two samples were found to give nearly identical results for the SPC (Manache, 2001; Manache and Melching, 2007). Thus, for this study DUFLOW has been executed successively with the 40 sets of generated parameters (4/3 × 29 ≈ 39) to simulate DO, carbonaceous BOD (CBOD), ammonia, and algal biomass concentrations along the Dender River.
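The sampling step can be sketched with the Python standard library. This is a generic Latin hypercube scheme for independent parameters and is not the UNCSAM implementation. The function names are hypothetical, and the tabulated central value of a triangular distribution is treated as its mode, which is an assumption.

```python
import math
import random
from statistics import NormalDist


def latin_hypercube(n_samples, n_params, rng):
    """Return n_samples rows of n_params probabilities in [0, 1), one per stratum."""
    columns = []
    for _ in range(n_params):
        # One random point inside each of n_samples equal-probability intervals ...
        column = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(column)   # ... paired at random with the other parameters.
        columns.append(column)
    return [list(row) for row in zip(*columns)]


def triangular_ppf(u, low, mode, high):
    """Inverse cumulative distribution function of a triangular distribution."""
    cut = (mode - low) / (high - low)
    if u < cut:
        return low + math.sqrt(u * (high - low) * (mode - low))
    return high - math.sqrt((1.0 - u) * (high - low) * (high - mode))


rng = random.Random(2008)                    # fixed seed, for repeatability only
unit = latin_hypercube(40, 2, rng)           # N = 40 runs, two parameters shown

# achlc: triangular with limits 10 and 100 and central value 30 (taken as the mode).
achlc = [triangular_ppf(u[0], 10.0, 30.0, 100.0) for u in unit]
# aoc: normal with mean 4 and SD 2; probabilities are clamped away from 0 and 1,
# and no truncation of the normal distribution is applied in this sketch.
aoc = [NormalDist(4.0, 2.0).inv_cdf(min(max(u[1], 1e-12), 1.0 - 1e-12)) for u in unit]

print(min(achlc), max(achlc))
print(min(aoc), max(aoc))
```

Each of the 40 equal-probability intervals of every parameter is sampled exactly once, which is what distinguishes LHS from simple random sampling.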

The sensitivity analysis is restricted to one simulated water-quality variable, the DO, because it is generally considered the primary indicator of aquatic-system health. The sensitivity analysis is performed on the amount of time that DO concentrations are less than a specific value (2, 3, and 4 mg/l) over a period of 1 year (1990). Two locations on the Dender River are considered for model analysis: Denderleeuw and Denderbelle (Manache, 2001; Manache and Melching, 2004). In this paper, the values and rankings of various sensitivity measures calculated on the amount of time during which the DO concentrations are less than 4 mg/l at Denderleeuw are given as an illustration.

Table 2
The mean values, standard deviations (SD), maximum and minimum limits, and distribution types used for the Latin Hypercube Sampling of the DUFLOW parameters (a dash marks a quantity not specified for that distribution type)

Parameter Definition                                        Mean    SD      Min     Max     Distribution
achlc     Chlorophyll to carbon ratio                       30      -       10      100     Triangular
anc       Nitrogen to carbon ratio                          0.16    -       0.05    0.43    Triangular
aoc       Oxygen to carbon ratio                            4       2       -       -       Normal
apc       Phosphorus to carbon ratio                        0.025   -       0       0.05    Triangular
e0        Background light extinction                       1.7     -       1       5       Triangular
fbod      Fraction of dissolved CBOD                        0.4     0.18    -       -       Normal
is        Optimal light intensity                           80      -       10      100     Triangular
kbod      Oxidation rate constant for CBOD                  0.4     0.1     -       -       Normal
kbodo     Monod constant for oxidation of CBOD              2       0.5     -       -       Normal
kden      Denitrification rate constant                     0.1     0.025   -       -       Normal
kdie      Algal die-rate constant                           0.2     -       0.003   0.8     Triangular
kdno      Monod constant for denitrification                0.5     0.1     -       -       Normal
kmin      Mineralisation rate constant for PORG and NORG    0.65    0.1625  -       -       Normal
kn        Monod constant for nitrogen                       0.1     0.02    -       -       Normal
knit      Nitrification rate constant                       0.2     0.05    -       -       Normal
kno       Monod constant for nitrification                  2       0.4     -       -       Normal
kp        Monod constant for phosphorus                     0.014   -       0.001   0.05    Triangular
kres      Algal respiration rate constant                   0.15    -       0.05    0.2     Triangular
krmin     Constant of O'Connor-Dobbins equation for         3.94    1.694   -       -       Lognormal
          estimation of the reaeration-rate coefficient
tbod      Temperature coefficient for oxidation of CBOD     1.045   0.0523  -       -       Normal
tden      Temperature coefficient for denitrification       1.045   0.0523  -       -       Normal
tga       Temperature coefficient for algal growth          1.047   0.0524  -       -       Normal
tmin      Temperature coefficient for mineralisation        1.047   0.0524  -       -       Normal
tnit      Temperature coefficient for nitrification         1.088   0.0544  -       -       Normal
tra       Temperature coefficient for algal respiration     1.047   0.0524  -       -       Normal
trea      Temperature coefficient for reaeration            1.016   0.0508  -       -       Normal
umax      Maximum algal growth rate                         4       0.8     -       -       Normal
vs0       Net settling velocity of organic matter           1.5     0.375   -       -       Normal
vss       Settling velocity of suspended solids             0.1     0.025   -       -       Normal
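The model output analysed here, the amount of time during which DO is below a threshold, can be derived from a simulated concentration series as sketched below. The function name `time_below`, the fixed time step, and the sample values are hypothetical. The time unit used in the study is not stated in this section.

```python
def time_below(do_series, threshold, dt=1.0):
    """Total time during which the DO concentration is strictly below the threshold.

    do_series : simulated DO concentrations (mg/l) at a fixed time step
    threshold : DO limit in mg/l (2, 3 or 4 in the study)
    dt        : length of one time step, in whatever time unit is wanted
    """
    return sum(dt for concentration in do_series if concentration < threshold)


# Illustrative series only; these are not DUFLOW results.
do_values = [5.0, 3.5, 3.9, 4.0, 2.0]
for limit in (2.0, 3.0, 4.0):
    print(limit, time_below(do_values, limit))
```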

5. Results and discussion

Regression and correlation coefficients between the DUFLOW model-input parameters and the considered model output (i.e. the amount of time the DO concentrations are less than 4 mg/l) were computed. The sensitivity and uncertainty measures mentioned in the previous sections computed using the UNCSAM computer package include the SRC, PCC, SPC, LCC, SRRC, PRCC, SPRC, and LRCC. These estimators have been used to rank the importance of the DUFLOW parameters according to their influence on the model output. The values and rankings of various sensitivity measures calculated on the amount of time during which the DO concentration is less than 4 mg/l at Denderleeuw for the simulated current conditions are given in Tables 3 and 4.

Table 4 illustrates that the sensitivity/uncertainty measures SRC, PCC, and SPC lead to nearly identical importance ranking for the DUFLOW model parameters. The ranking of the parameters based on the non-parametric (rank-based) measures SRRC, PRCC, and SPRC also is practically identical. This conclusion is in agreement with other studies (e.g., Saltelli et al., 1993; Yeh and Tung, 1993; Saltelli and Marivoet, 1990; Saltelli and Homma, 1992) where the SRC and PCC (as well as SRRC and PRCC) are found to give similar ranking of the input variables. Saltelli and Homma (1992) reported that the use of PCC and SRC (as well as PRCC and SRRC) is redundant because the two techniques produce the same ranking, especially in the case that the input variables are not correlated.

To determine the most important DUFLOW parameters based on the correlation measures (LCC, SPC, LRCC, and SPRC), the t-statistic given in Morrison (1984) is used to test the significance of the correlation coefficients:

$$t = r \sqrt{\frac{N - 2}{1 - r^2}} \qquad (27)$$

where N is the sample size and r is the considered correlation coefficient. For the partial correlation coefficients (PCC and PRCC), N is replaced by N − (p + 1). The value of the t-statistic associated with regression coefficients (SRC and SRRC) is computed by:

$$t = \frac{\text{regression coefficient } (b_i)}{\text{standard error of } b_i} \qquad (28)$$

Table 3
The values of various sensitivity measures calculated on the amount of time during which the dissolved oxygen concentration is less than 4 mg/l at Denderleeuw for simulated current conditions

          Latin Hypercube Simulation results
Parameter SRC     PCC     SPC     LCC     SRRC    PRCC    SPRC    LRCC
is        0.53*   0.86*   0.47*   0.62*   0.38*   0.77*   0.34*   0.51*
e0        0.42*   0.81*   0.40*   0.36*   0.31*   0.72*   0.29    0.29
kdie      0.37*   0.77*   0.35*   0.39*   0.42*   0.81*   0.39*   0.41*
fbod      0.27*   0.63*   0.24    0.23    0.01    0.02    0.01    0.07
kmin      -0.25*  -0.63*  -0.23   -0.16   -0.03   -0.10   -0.03   0.02
aoc       0.19    0.53    0.18    0.22    0.17    0.46    0.15    0.19
tmin      -0.16   -0.46   -0.15   -0.15   -0.05   -0.17   -0.05   -0.01
vss       0.15    0.42    0.13    -0.04   -0.18   -0.48   -0.16   -0.13
kres      0.13    0.41    0.13    0.16    0.09    0.27    0.08    0.15
kdno      -0.13   -0.40   -0.12   -0.17   -0.01   -0.02   -0.01   0.00
umax      -0.13   -0.38   -0.12   -0.24   -0.34*  -0.74*  -0.32*  -0.29
kn        0.12    0.34    0.10    0.06    -0.02   -0.07   -0.02   0.01
kno       -0.09   -0.29   -0.09   0.02    -0.02   -0.07   -0.02   0.05
vs0       0.09    0.28    0.08    0.07    -0.07   -0.23   -0.07   -0.17
tnit      0.09    0.28    0.08    0.15    -0.01   -0.04   -0.01   -0.07
tden      -0.09   -0.28   -0.08   -0.12   -0.05   -0.18   -0.05   -0.11
kp        0.08    0.24    0.07    0.08    -0.08   -0.25   -0.07   0.03
tra       0.06    0.19    0.05    0.00    -0.09   -0.28   -0.08   -0.05
trea      0.06    0.16    0.05    0.07    0.16    0.41    0.13    0.21
kbodo     -0.05   -0.14   -0.04   -0.09   -0.21   -0.51   -0.17   -0.23
anc       -0.04   -0.14   -0.04   -0.01   -0.16   -0.47   -0.15   -0.02
kden      -0.04   -0.14   -0.04   -0.05   -0.09   -0.29   -0.09   -0.10
tbod      0.03    0.09    0.03    -0.01   -0.08   -0.24   -0.07   -0.24
knit      -0.02   -0.08   -0.02   0.00    0.22*   0.59*   0.21    0.18
tga       0.02    0.06    0.02    0.05    0.01    0.03    0.01    0.03
krmin     0.01    0.04    0.01    0.05    -0.13   -0.39   -0.12   -0.18
kbod      -0.01   -0.04   -0.01   0.04    0.28*   0.67*   0.26    0.32*
apc       0.01    0.02    0.01    0.04    -0.09   -0.29   -0.09   -0.09
achlc     -0.01   -0.02   0.00    -0.05   -0.04   -0.13   -0.04   -0.04

*Indicates that the correlation or regression coefficient is significant at the 5% level on the basis of the t-test statistic.

At the significance level of 5%, t-statistic values with an absolute value greater than 1.96 indicate significant coefficients/contributions. Table 5 gives the values of the t-test statistic for sensitivity/uncertainty measures calculated on the amount of time during which the DO concentration is less than 4 mg/l at Denderleeuw for simulated current conditions. The sensitivity/uncertainty measures significant at the 5% level are marked with an asterisk (*) in Tables 3 and 5.
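The correlation test of Eq. (27) and the 1.96 limit can be sketched as follows. The inputs are rounded coefficients taken from Table 3 with N = 40, so the results agree with Table 5 only to within rounding. The function name `correlation_t` is hypothetical.

```python
import math


def correlation_t(r, n):
    """t-statistic of Eq. (27) for a correlation coefficient r from n samples."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


N = 40  # number of LHS runs in the study
examples = (
    ("LCC of is", 0.62),
    ("SPC of is", 0.47),
    ("SPRC of e0", 0.29),
)
for label, r in examples:
    t = correlation_t(r, N)
    verdict = "significant" if abs(t) > 1.96 else "not significant"
    print(f"{label}: t = {t:.2f} ({verdict} at the 5% level)")
```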

Based on the previous statement about the similarity in ranking the input parameters based on SRC, PCC, and SPC (or SRRC, PRCC, and SPRC), and in order to avoid superfluous interpretation of the sensitivity analysis results based on these measures, it was decided that the selection of one measure from the three raw-data-based measures SRC, PCC, and SPC and one measure from the three rank-based measures SRRC, PRCC, and SPRC would be sufficient to rank the DUFLOW input parameters according to their influence on the considered model output. The question that may arise in this case is which of these measures should be considered and on what basis it should be selected. The answer to this question is given in the following discussion.

As previously mentioned, the validity of the SRCs and SRRCs as measures of sensitivity is conditional on the degree to which the regression model fits the data, i.e. on $R^2$ or on the adjusted $R^2$ ($R^2_{adj}$), which is considered more appropriate than $R^2$ (Janssen, 1994). Table 6 summarises the regression and rank-regression statistics for the considered model output. It can be noticed that the regression model based upon the raw values gives a coefficient of determination $R^2$ of about 0.92. For the ranked data, the regression model yields $R^2$ of about 0.92. These relatively high values of $R^2$ (close to 1 and much greater than the 0.7 boundary suggested by Saltelli et al., 2006) express the effectiveness of the linear regression models (on raw data and on ranks) based upon the input parameters.

Table 4
The rankings of various sensitivity measures calculated on the amount of time during which the dissolved oxygen concentration is less than 4 mg/l at Denderleeuw for simulated current conditions

          Latin Hypercube Simulation results
Parameter SRC   PCC   SPC   LCC   SRRC  PRCC  SPRC  LRCC
is        1     1     1     1     2     2     2     1
e0        2     2     2     3     4     4     4     4
kdie      3     3     3     2     1     1     1     2
fbod      4     4     4     5     28    29    29    19
kmin      5     5     5     8     23    23    23    26
aoc       6     6     6     6     9     10    10    9
tmin      7     7     7     11    21    21    21    27
vss       8     8     8     22    8     8     8     14
kres      9     9     9     9     15    16    16    13
kdno      11    10    10    7     29    28    28    29
umax      10    11    11    4     3     3     3     5
kn        12    12    12    17    25    25    25    28
kno       15    13    13    25    24    24    24    21
vs0       16    14    14    16    19    19    19    12
tnit      13    15    15    10    26    26    26    18
tden      14    16    16    12    20    20    20    15
kp        17    17    17    14    18    17    17    24
tra       19    18    18    28    16    15    15    20
trea      18    19    19    15    11    11    11    8
kbodo     20    20    20    13    7     7     7     7
anc       21    21    21    27    10    9     9     25
kden      22    22    22    21    14    14    14    16
tbod      23    23    23    26    17    18    18    6
knit      24    24    24    29    6     6     6     10
tga       25    25    25    18    27    27    27    23
krmin     26    26    26    19    12    12    12    11
kbod      27    27    27    24    5     5     5     3
apc       28    28    28    23    13    13    13    17
achlc     29    29    29    20    22    22    22    22

Table 5
Values of the t-test statistic for the sensitivity measures calculated on the amount of time during which the dissolved oxygen concentration is less than 4 mg/l at Denderleeuw for simulated current conditions

          t-test statistic (TTST)
Parameter SRC     PCC     SPC     LCC     SRRC    PRCC    SPRC    LRCC
is        5.23*   5.23*   3.28*   4.87*   3.77*   3.77*   2.23*   3.65*
e0        4.38*   4.38*   2.69*   2.38*   3.24*   3.24*   1.87    1.87
kdie      3.85*   3.85*   2.30*   2.61*   4.34*   4.34*   2.61*   2.77*
fbod      2.59*   2.59*   1.52    1.46    0.06    0.06    0.06    0.43
kmin      -2.57*  -2.57*  -1.46   -1.00   -0.31   -0.31   -0.19   0.12
aoc       1.95    1.95    1.13    1.39    1.65    1.65    0.94    1.19
tmin      -1.66   -1.66   -0.94   -0.94   -0.54   -0.54   -0.31   -0.06
vss       1.46    1.46    0.81    -0.25   -1.72   -1.72   -1.00   -0.81
kres      1.40    1.40    0.81    1.00    0.89    0.89    0.49    0.94
kdno      -1.36   -1.36   -0.75   -1.06   -0.06   -0.06   -0.06   0.00
umax      -1.32   -1.32   -0.75   -1.52   -3.48*  -3.48*  -2.08*  -1.87
kn        1.15    1.15    0.62    0.37    -0.21   -0.21   -0.12   0.06
kno       -0.96   -0.96   -0.56   0.12    -0.22   -0.22   -0.12   0.31
vs0       0.93    0.93    0.49    0.43    -0.73   -0.73   -0.43   -1.06
tnit      0.93    0.93    0.49    0.94    -0.14   -0.14   -0.06   -0.43
tden      -0.92   -0.92   -0.49   -0.75   -0.57   -0.57   -0.31   -0.68
kp        0.79    0.79    0.43    0.49    -0.81   -0.81   -0.43   0.19
tra       0.60    0.60    0.31    0.00    -0.92   -0.92   -0.49   -0.31
trea      0.52    0.52    0.31    0.43    1.40    1.40    0.81    1.32
kbodo     -0.46   -0.46   -0.25   -0.56   -1.85   -1.85   -1.06   -1.46
anc       -0.46   -0.46   -0.25   -0.06   -1.70   -1.70   -0.94   -0.12
kden      -0.44   -0.44   -0.25   -0.31   -0.96   -0.96   -0.56   -0.62
tbod      0.28    0.28    0.19    -0.06   -0.79   -0.79   -0.43   -1.52
knit      -0.26   -0.26   -0.12   0.00    2.32*   2.32*   1.32    1.13
tga       0.19    0.19    0.12    0.31    0.08    0.08    0.06    0.19
krmin     0.13    0.13    0.06    0.31    -1.34   -1.34   -0.75   -1.13
kbod      -0.12   -0.12   -0.06   0.25    2.89*   2.89*   1.66    2.08*
apc       0.07    0.07    0.06    0.25    -0.96   -0.96   -0.56   -0.56
achlc     -0.05   -0.05   0.00    -0.31   -0.42   -0.42   -0.25   -0.25

*Significant coefficients/contributions at the 5% significance level.

Table 6
Summary of regression and rank-regression analyses on the considered model output

Regression statistics                                       Raw data  Ranked data
Largest VIF                                                 1.57      1.59
The squared multiple correlation coefficient R^2            0.92      0.92
The percentage variance accounted for (100 x adj. R^2)      67.9      67.8
The F ratio for the regression                              3.9       3.8
The standard error of regression-estimate                   142       6.63

To avoid creating a false impression on the validity of the regression model, the adjusted coefficient of determination ($R^2_{adj}$) also is calculated (on raw data and ranks). As given in Table 6, the adjusted coefficient of determination is approximately 0.7 for original regression and for rank-regression, indicating that the fit of the regression models can be considered good ($R^2_{adj} > 0.7$). Further, the overall utility of the regression model is assessed by the F ratio. The F values given in Table 6 (i.e. 3.9 for original regression and 3.8 for rank-regression) substantially exceed the critical F value at the 5% significance level ($F_{critical}$ = 2.7 with p degrees of freedom for the regression and N − (1 + p) degrees of freedom for the error). This means that the regression models (on raw data and ranks) have significant predictive capability.
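The link between $R^2$, the adjusted $R^2$, and the F ratio can be sketched with the rounded values reported in this section ($R^2$ = 0.92, N = 40, p = 29). The standard textbook formulas are assumed, and the function names are hypothetical. Because $R^2$ is rounded, the results match Table 6 only approximately.

```python
def adjusted_r2(r2, n, p):
    """Adjusted coefficient of determination for n samples and p regressors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def f_ratio(r2, n, p):
    """F ratio MSR / MSE expressed through R^2, with p and n - (p + 1) degrees of freedom."""
    return (r2 / p) / ((1.0 - r2) / (n - p - 1))


r2, n, p = 0.92, 40, 29
r2_adj = adjusted_r2(r2, n, p)
f_value = f_ratio(r2, n, p)
print(f"adjusted R^2 = {r2_adj:.3f}")
print(f"F ratio      = {f_value:.2f}")
```

With these inputs the adjusted $R^2$ is about 0.69 and the F ratio is just under 4, in line with the 67.9% and 3.9 listed in Table 6 for the raw data.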

Moreover, the problem of multi-collinearity between the input parameters is checked by evaluating the variance inflation factor, VIF. The largest VIF value obtained is 1.57 for the original regression and 1.59 for the regression based upon the ranks of the input values (Table 6). These values, which are much less than 5, indicate that the linear and linear rank-regression results can be considered acceptable. It can, therefore, be concluded that the regression models (on raw data and on ranks) do not confront collinearity problems.

Based on the foregoing analysis of the $R^2$, $R^2_{adj}$, F ratio, and VIF values, it can be considered that the linear and the linear rank-regression models are both valid to approximate the model output based upon the DUFLOW input parameters, and, hence, the SRC and SRRC can be used as reliable sensitivity measures.

Based on the t-statistic values for the various sensitivity measures given in Table 5, it can be noticed that the SRC and PCC have the same t-statistic values, and, thus, identify the same important parameters. The SRC and PCC identify five parameters that have a significant effect on the amount of time during which the DO concentration is less than 4 mg/l at Denderleeuw. These parameters are: optimal light intensity (is), background light extinction (e0), algal die-rate constant (kdie), fraction of dissolved CBOD (fbod), and mineralisation rate constant (kmin). The SPC, however, identifies only three parameters that have a significant contribution to model sensitivity: is, e0, and kdie. Similar remarks can be formulated for the rank-based measures. The SRRC and PRCC indicate six sensitive parameters: is, e0, kdie, algal maximum growth rate (umax), nitrification rate constant (knit), and CBOD oxidation rate constant (kbod), while the SPRC identifies only three sensitive parameters: is, kdie, and umax.

Based on this comparison, it can be concluded that the SRC and PCC and their rank equivalents SRRC and PRCC identify approximately twice the number of important parameters that are identified by SPC and SPRC. Therefore, it would be more difficult to eliminate the less important parameters based on SRC and PCC and their rank equivalents SRRC and PRCC. Due to this drawback, it is decided that the use of the SPC and its rank equivalent SPRC to identify the most important parameters of the DUFLOW model would be more appropriate. The rationale of this decision is the similarity of the importance ranking results obtained with this measure to the SRC and PCC (SRRC and PRCC) results and the limited number of the identified important parameters. The fact that the regression coefficients indicate twice as many "significant" parameters as the correlation coefficients may be related to the extra uncertainty resulting from applying a linear approximation to a nonlinear model. That is, even good linear regressions to the raw data and output or the parameter and output ranks (such as those obtained in this study) still introduce extra uncertainty to the overall analysis. The increase in uncertainty may result in the inclusion of more parameters significantly affecting this uncertainty. Therefore, the additional parameters may be more of an artifact of the regression than the original model uncertainty. These results are consistent with the observation of Helton et al. (2006) that the PCC can indicate that a variable has a larger effect on the uncertainty in the output than is actually the case.

The results of sensitivity analysis for the considered model output based on the SPC (SPRC) are further compared with those obtained by the LCC (LRCC). As can be seen from Tables 3 and 5, the LCC and SPC identify three sensitive parameters: is, kdie, and e0. For the rank-based measures, the LRCC identifies the parameters is, kdie, and kbod, while the SPRC identifies the parameters kdie, is, and umax as the most sensitive parameters. Although the LCC is the simplest and most widely used sensitivity/uncertainty measure, it has the drawback that it incorporates the influence of the other correlated parameters, and this might be the explanation for the difference in the results based on SPRC and LRCC. The parameters is, kdie, e0, and umax mainly are used in modelling the algal-growth process as described below (DUFLOW, 1992)

$$\frac{dA}{dt} = \left[ u_{max} F_T F_N F_I \right] A - \left[ k_{res}\, \theta_{ra}^{(T-20)} + k_{die} \right] A \qquad (29)$$

where $A$ is the algal concentration in milligrams per liter, $t$ is time, $F_T$ is the algal growth limitation due to temperature, $F_N$ is the algal growth limitation due to nutrients, $F_I$ is the algal growth limitation due to light energy, $\theta_{ra}$ is the temperature correction coefficient for algal respiration, and $T$ is the temperature in degrees Celsius. The light limitation factor is computed as follows

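$$F_I = \frac{e}{\varepsilon_{tot}\, z} \left[ \exp\left( -\frac{I_0}{i_s} \exp(-\varepsilon_{tot}\, z) \right) - \exp\left( -\frac{I_0}{i_s} \right) \right] \qquad (30)$$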

where $e$ is the Napierian base value, $I_0$ is the incoming surface light intensity, $z$ is the water depth, and $\varepsilon_{tot}$ is the total light extinction, which is the sum of the background light extinction (e0) and shading from algae and suspended solids. The relation among these parameters therefore makes it somewhat difficult to quantify the contribution of each individual parameter. As mentioned previously, some measures can be used to cope with this correlation problem, such as the relative partial sum of squares (RPSS), which measures the relative loss in explained variance (i.e. $R^2$) if an input parameter is removed from the regression. The RPSS values obtained for the parameters is, e0, kdie, and umax are 0.2, 0.16, 0.12, and 0.01, respectively, indicating that the parameter umax makes a small contribution compared with the other three parameters is, e0, and kdie. This result, which is in good agreement with the SPC-based results, confirms that the SPC can be used as a reliable measure to identify the most important parameters while taking into account the linear dependency among the input parameters.
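To make the coupling among is, e0, kdie, and umax in Eqs. (29) and (30) concrete, the two expressions can be evaluated directly. The following is a minimal sketch only: the function names and all numerical values are hypothetical and are not taken from DUFLOW or from the case study.

```python
import math


def light_limitation(i0, i_s, eps_tot, z):
    """Light limitation factor F_I of Eq. (30)."""
    ratio = i0 / i_s
    at_depth = math.exp(-ratio * math.exp(-eps_tot * z))
    at_surface = math.exp(-ratio)
    return math.e / (eps_tot * z) * (at_depth - at_surface)


def algal_growth_rate(a, u_max, f_t, f_n, f_i, k_res, theta_ra, temp, k_die):
    """Right-hand side of Eq. (29), i.e. dA/dt."""
    growth = u_max * f_t * f_n * f_i
    loss = k_res * theta_ra ** (temp - 20.0) + k_die
    return (growth - loss) * a


# Hypothetical inputs, chosen only to exercise the formulas.
e0 = 1.0        # background light extinction
shading = 0.5   # extinction due to algae and suspended solids
f_i = light_limitation(i0=100.0, i_s=40.0, eps_tot=e0 + shading, z=2.0)
rate = algal_growth_rate(a=5.0, u_max=2.0, f_t=1.0, f_n=0.8, f_i=f_i,
                         k_res=0.1, theta_ra=1.04, temp=20.0, k_die=0.2)
print(f"F_I = {f_i:.3f}, dA/dt = {rate:.3f}")
```

In this form, is and e0 act only through $F_I$, which multiplies umax in the growth term, while kdie enters the loss term. Different combinations of the four parameters can therefore produce similar net growth rates.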

Consequently, it was decided that the SPC and its rank equivalent, the SPRC, would be adequate measures to assess the sensitivity of the DUFLOW model output to its input parameters. As illustrated in Table 6, the differences between the values of $R^2$, $R^2_{adj}$, and the F ratio for the linear regression and for the linear rank regression are small; thus, it is difficult to prefer one regression over the other. Both types of regression (original and rank) and their related sensitivity measures may be useful. However, because of the difficulty of relating the rank-based measures to statements on the uncertainty contributions of the original input variables (Janssen et al., 1992), it would be more judicious to use the linear regression-based sensitivity measures for the identification and importance ranking of the DUFLOW model parameters for the model output.
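The comparison of goodness-of-fit statistics for the raw and rank regressions can be illustrated with a small example. The sketch below assumes only two inputs, so that the least-squares fit can be written in closed form from the pairwise correlation coefficients. The synthetic response, the sample size, and the function names are hypothetical and unrelated to the DUFLOW results in Table 6.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank transform (1 = smallest); assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        out[index] = float(rank)
    return out


def fit_summary(x1, x2, y):
    """R2, adjusted R2, and F ratio for the least-squares fit y ~ x1 + x2."""
    n, k = len(y), 2
    r1, r2, r12 = pearson(x1, y), pearson(x2, y), pearson(x1, x2)
    beta1 = (r1 - r2 * r12) / (1.0 - r12 ** 2)  # standardized coefficients
    beta2 = (r2 - r1 * r12) / (1.0 - r12 ** 2)
    r_sq = beta1 * r1 + beta2 * r2
    adj = 1.0 - (1.0 - r_sq) * (n - 1) / (n - k - 1)
    f_ratio = (r_sq / k) / ((1.0 - r_sq) / (n - k - 1))
    return r_sq, adj, f_ratio


rng = random.Random(1)
x1 = [rng.random() for _ in range(100)]
x2 = [rng.random() for _ in range(100)]
# Hypothetical monotonic, nonlinear response with a little noise.
y = [math.exp(2.0 * a) + 2.0 * b + rng.gauss(0.0, 0.2) for a, b in zip(x1, x2)]

raw = fit_summary(x1, x2, y)
ranked = fit_summary(ranks(x1), ranks(x2), ranks(y))
print("raw :", [round(v, 3) for v in raw])
print("rank:", [round(v, 3) for v in ranked])
```

When the two sets of statistics are close, as reported for the case study, the fit quality alone gives little reason to prefer one regression over the other.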

From the preceding discussion, it can be concluded that the SPC, and the SPRC for ranked values, can be used as reliable sensitivity/uncertainty measures to identify the DUFLOW parameters with the greatest influence on the model output. Accordingly, for the model output considered in this paper and for a 5% significance level, the SPC identifies only three parameters that have a significant effect on the amount of time during which the DO concentration is less than 4 mg/l at Denderleeuw, as listed in Tables 3 and 5: the optimal light intensity (is), the background light extinction (e0), and the algal die-rate constant (kdie).

The rank-regression-based measures SPRC and LRCC show slight differences in the identification of the important parameters for the considered model output. The SPRC indicates that the parameters is, kdie, and umax have an important influence at the 5% significance level, whereas the LRCC indicates that the parameters is, kdie, and kbod have an important influence on the model output at the same level. However, the parameters e0 and umax would also have been identified as important by the SPRC and LRCC, respectively, if the significance level of the t-test were raised to 7%.
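The dependence of the selected parameters on the chosen significance level can be illustrated with a t-test on a single coefficient. The sketch below assumes the usual test for a simple correlation coefficient with n − 2 degrees of freedom; for regression-based measures the residual degrees of freedom of the regression would be used instead. The sample size and coefficient value are hypothetical, and the p-value is obtained by numerical integration of the Student t density rather than from a statistics library.

```python
import math


def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2.0 * math.log1p(t * t / df))


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value from Simpson integration of the density on [0, |t|]."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def correlation_p_value(r, n):
    """p-value for the hypothesis that a correlation coefficient is zero."""
    df = n - 2
    t_stat = r * math.sqrt(df / (1.0 - r * r))
    return two_sided_p(t_stat, df)


# Hypothetical coefficient and sample size.
p = correlation_p_value(r=0.19, n=100)
for level in (0.05, 0.07):
    print(f"level {level:.2f}: p = {p:.4f}, significant = {p < level}")
```

A coefficient of this size is rejected at the 5% level but accepted at the 7% level, which mirrors the borderline status of e0 and umax described above.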

The final conclusion is that is, e0, and kdie are the most influential DUFLOW parameters affecting the uncertainty of the considered model output. The identification of the parameter umax by the rank-based measures can be explained by the removal of the correlation effect among the algal growth process parameters is, e0, kdie, and umax when the ranked values of input and output are used, whereas this effect is retained when the raw data are used. However, the identification of the algal maximum growth rate parameter (umax) as important on the basis of the SPRC and LRCC can also be taken into account if more trust in the analysis results is desired.

6. Conclusion

Sensitivity analysis of a continuous water-quality simulation model with respect to uncertain model-input parameters was presented in this paper. The Latin Hypercube Sampling technique was applied to the stream flow, water-quality model DUFLOW to demonstrate the general application of sensitivity analysis to complex stream water-quality models, to develop general procedures for such sensitivity analysis applications, and to identify the most reliable sensitivity measures. In this study, a variance inflation factor of 5 was used to determine strong correlation and strong rank correlation, and the 5% significance level from the t-statistic was used to identify sensitivity measure significance in the procedure outlined in Fig. 1. Although this study focussed on the effect of parameter uncertainty on the model prediction, the developed approach can be applied to any other uncertain factors used in the model simulations (e.g., input data, boundary conditions, etc.) and may be applied to a wide variety of problems. The measures discussed in this paper are appropriate for models that are approximately linear or monotonically nonlinear (rank transform methods). Methods for non-monotonic, nonlinear models are reviewed in Helton et al. (2006).
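The two ingredients named above, Latin Hypercube Sampling and the variance inflation factor check, can be sketched as follows. This is a simplified illustration and not the sampling scheme used in the study: it assumes independent, uniformly distributed parameters, only two inputs (for which the variance inflation factor reduces to 1/(1 − r²)), and hypothetical parameter ranges.

```python
import math
import random


def latin_hypercube(n_samples, bounds, rng):
    """Latin hypercube sample for independent, uniformly distributed inputs.

    Each range in `bounds` is cut into n_samples equal strata, one value is
    drawn per stratum, and the strata are paired at random across inputs.
    """
    columns = []
    for low, high in bounds:
        column = [low + (high - low) * (k + rng.random()) / n_samples
                  for k in range(n_samples)]
        rng.shuffle(column)
        columns.append(column)
    return [list(row) for row in zip(*columns)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_inputs(x1, x2):
    """Variance inflation factor when there are exactly two inputs."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)


rng = random.Random(42)
# Hypothetical parameter ranges, not taken from the paper.
sample = latin_hypercube(50, [(0.1, 0.5), (20.0, 60.0)], rng)
col1 = [row[0] for row in sample]
col2 = [row[1] for row in sample]
vif = vif_two_inputs(col1, col2)
print(f"VIF = {vif:.3f}, strong correlation: {vif > 5.0}")
```

With two inputs, a variance inflation factor of 5 corresponds to an absolute correlation of about 0.89, so the threshold flags only strongly dependent inputs.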

From the extensive discussion of the most commonly used sensitivity analysis measures based on regression and correlation analysis, and from the practical results obtained by applying LHS to the DUFLOW model, a number of conclusions (similar to findings in the literature) can be formulated with regard to the selection of the most adequate sensitivity measures, as follows.

(1) The sensitivity measures SRC, PCC, and SPC (as well as SRRC, PRCC, and SPRC for ranked values) lead to similar rankings of the DUFLOW input parameters. Therefore, selecting one measure from those calculated on the raw data and one from those calculated on the ranked values would be sufficient to identify the most important parameters of the DUFLOW model.

(2) The SRC and PCC and their rank equivalents, the SRRC and PRCC, identify approximately twice as many important parameters as the SPC and SPRC.

(3) The fact that the regression-based measures identify twice as many “important” parameters as the correlation-based measures may be related to the extra uncertainty that results from applying a linear approximation to a nonlinear model (i.e. a regression model is not a completely suitable substitute for a complex physics-based model like DUFLOW). Thus, the identification of more parameters may be more an artifact of the regression inaccuracy than of the original model uncertainty.

(4) The SPC and LCC appear to be, in general, the most robust and reliable sensitivity measures for this study. However, the SPC is preferred to the LCC because it accounts for linear dependency among the input parameters.

(5) The SPC and SPRC were found to be adequate measures for this study to assess the sensitivity of the DUFLOW model output to its input parameters.
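The preference for the SPC over the LCC when inputs are linearly dependent can be illustrated with two inputs, for which all of the measures follow in closed form from the pairwise correlation coefficients. The sketch below assumes that SPC denotes the semi-partial correlation coefficient and that the RPSS is the loss in $R^2$ on removing an input divided by the full-model $R^2$; the normalisation used in the study may differ. The correlation values are hypothetical.

```python
import math


def two_input_measures(r_y1, r_y2, r_12):
    """Sensitivity measures for input 1 of a two-input linear regression.

    r_y1 and r_y2 are the correlations of the output with inputs 1 and 2,
    and r_12 is the correlation between the two inputs.
    """
    numerator = r_y1 - r_y2 * r_12
    src = numerator / (1.0 - r_12 ** 2)
    pcc = numerator / math.sqrt((1.0 - r_y2 ** 2) * (1.0 - r_12 ** 2))
    spc = numerator / math.sqrt(1.0 - r_12 ** 2)
    src_other = (r_y2 - r_y1 * r_12) / (1.0 - r_12 ** 2)
    r_sq = src * r_y1 + src_other * r_y2
    # Assumed normalisation: loss in R2 when input 1 is dropped, divided by R2.
    rpss = spc ** 2 / r_sq
    return {"LCC": r_y1, "SRC": src, "PCC": pcc, "SPC": spc, "RPSS": rpss}


# Hypothetical case: the output responds to input 1 only, but input 2 is
# correlated with input 1 and therefore also correlates with the output.
r_y1, r_12 = 0.8, 0.6
r_y2 = r_y1 * r_12
first = two_input_measures(r_y1, r_y2, r_12)
second = two_input_measures(r_y2, r_y1, r_12)
for name, measures in (("input 1", first), ("input 2", second)):
    print(name, {key: round(value, 3) for key, value in measures.items()})
```

In this constructed case, the LCC of input 2 is clearly non-zero even though the input has no effect of its own, whereas its SRC, PCC, SPC, and RPSS all vanish. Under the stated assumption the RPSS equals the squared SPC divided by $R^2$, so the two measures necessarily rank the inputs in the same order.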


Acknowledgements

The authors thank the Agricultural University of Wageningen, Department of Nature Conservation, Wageningen, The Netherlands, for providing the DUFLOW software. We also thank Mr. Peter H.M. Janssen of the National Institute of Public Health and Environmental Protection, Bilthoven, The Netherlands, for providing us with the UNCSAM software.

References

Ambrose, R.B., Wool, T.A., Connolly, J.P., Schanz, R.W., 1988. WASP4, a Hydrodynamic and Water Quality Model – Model Theory, User’s Manual, and Programmer’s Guide. U.S. Environmental Protection Agency, Athens, GA. EPA/600/3-87-039.

Beck, M.B., 1987. Water quality modelling: a review of the analysis of uncertainty. Water Resources Research 23 (5), 1393–1441.

Bouraoui, F., 2007. Testing the PEARL model in the Netherlands and Sweden. Environmental Modelling & Software 22 (7), 937–950.

Campolongo, F., Saltelli, A., 1997. Sensitivity analysis of an environmental model: an application of different analysis methods. Reliability Engineering and System Safety 57, 49–69.

Campolongo, F., Saltelli, A., Sørensen, T., Tarantola, S., 2000. Hitchhiker’s guide to sensitivity analysis. In: Saltelli, A., Chan, K., Scott, E.M. (Eds.), Sensitivity Analysis. John Wiley & Sons, Chichester, pp. 15–47.

Campolongo, F., Cariboni, J., Saltelli, A., 2007. An effective screening design for sensitivity analysis of large models. Environmental Modelling & Software 22 (10), 1509–1518.

Dale, V.H., Jager, H.I., Rosen, A.E., 1988. Using sensitivity and uncertainty analysis to improve predictions of broad-scale forest development. Ecological Modelling 42, 165–178.

Draper, N.R., Smith, H., 1981. Applied Regression Analysis, second ed. John Wiley & Sons, Inc.

DUFLOW, 1992. A Micro-computer Package for the Simulation of One-dimensional Unsteady Flow and Water Quality in Open Channel Systems. Manual for DUFLOW Version 2.00. ICIM, Rijswijk.

Gardner, R.H., O’Neill, R.V., Mankin, J.B., Carney, J.H., 1981. A comparison of sensitivity and error analysis based on stream ecosystem model. Ecological Modelling 12, 173–190.

Gilbert, R.O., 1987. Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold, New York.

Gomit, J.M., Marivoet, J., Raimbault, P., Recreo, F., 1997. Evaluation of Elements Responsible for the Effective Engaged Dose Rates Associated with the Final Storage of Radioactive Waste: Everest Project. Final Report Published in EUR 17449/1 EN.

Helton, J.C., 1993. Uncertainty and sensitivity analysis techniques for use in performance assessment for radioactive waste disposal. Reliability Engineering and System Safety 42, 327–367.

Helton, J.C., Johnson, J.D., Sallaberry, C.J., Storlie, C.B., 2006. Survey of sampling-based methods for uncertainty and sensitivity analysis. Reliability Engineering and System Safety 91, 1175–1209.

Hocking, R.R., 1983. Developments in linear regression methodology 1959–1982. Technometrics 25, 219–230.

Iman, R.L., Conover, W.J., 1979. The use of rank transform in regression. Technometrics 21 (4), 499–509.

Iman, R.L., Helton, J.C., 1985. A Comparison of Uncertainty and Sensitivity Analysis Techniques for Computer Models. Report NUREG/CR-3904, SAND 84-1461. Sandia National Laboratories, Albuquerque, New Mexico.

Iman, R.L., Helton, J.C., 1988. An investigation of uncertainty and sensitivity analysis techniques for computer models. Risk Analysis 8 (1), 71–90.

Janssen, P.H.M., Slob, W., Rotmans, J., 1990. Sensitivity Analysis and Uncertainty Analysis: An Inventory of Ideas, Methods, and Techniques. RIVM Report No. 958805001, Bilthoven, The Netherlands (in Dutch).

Janssen, P.H.M., Heuberger, P.S.C., Sanders, R., 1992. UNCSAM 1.1: A Software Package for Sensitivity and Uncertainty Analysis. Report No. 959101004. National Institute of Public Health and Environmental Protection, Bilthoven, The Netherlands.

Janssen, P.H.M., 1994. Assessing sensitivities and uncertainties in models: a critical evaluation. In: Grasman, J., Van Straten, G. (Eds.), Predictability and Nonlinear Modelling in Natural Sciences and Economics. Kluwer Academic Publishers, Dordrecht, pp. 344–361 (Proceedings of the 75th Anniversary Conference of WAU, April 5–7, 1993, Wageningen, The Netherlands).

Manache, G., 2001. Sensitivity of a Continuous Water-quality Simulation Model to Uncertain Model-input Parameters. Ph.D. thesis, Chair of Hydrology and Hydraulics, Vrije Universiteit Brussel, Brussels, Belgium.

Manache, G., Melching, C.S., 2004. Sensitivity analysis of a water-quality model using Latin Hypercube Sampling. Journal of Water Resources Planning and Management, ASCE 130 (3), 232–242.

Manache, G., Melching, C.S., 2007. Sensitivity of Latin Hypercube Sampling to sample size and distributional assumptions. In: Proceedings CD-ROM, 32nd Congress of the International Association of Hydraulic Engineering and Research, Venice, Italy, July 1–6, 2007.

Marquardt, D.W., 1970. Generalized inverses, ridge regression, biased linear estimation, and nonlinear estimation. Technometrics 12, 591–612.

Melching, C.S., Yoon, C.G., 1996. Key sources of uncertainty in QUAL2E model of Passaic River. Journal of Water Resources Planning and Management, ASCE 122 (2), 105–113.

Morrison, D.F., 1984. Multivariate Statistical Methods, second ed. McGraw-Hill, Singapore.

Pappenberger, F., Iorgulescu, I., Beven, K.J., 2006. Sensitivity analysis based on regional splits and regression trees (SARS-RT). Environmental Modelling & Software 21 (7), 976–990.

Saltelli, A., 2002. Sensitivity analysis for importance assessment. Risk Analysis 22 (3), 579–590.

Saltelli, A., Marivoet, J., 1990. Nonparametric statistics in sensitivity analysis for model output: a comparison of selected techniques. Reliability Engineering and System Safety 28, 229–253.

Saltelli, A., Homma, T., 1992. Sensitivity analysis for model output. Performance of black box techniques on three international benchmark exercises. Computational Statistics and Data Analysis 13 (1), 73–94.

Saltelli, A., Sobol, I.M., 1995. About the use of rank transformation in sensitivity analysis of model output. Reliability Engineering and System Safety 50, 225–239.

Saltelli, A., Andres, T.H., Homma, T., 1993. Sensitivity analysis of model output: an investigation of new techniques. Computational Statistics and Data Analysis 15, 211–238.

Saltelli, A., Andres, T.H., Homma, T., 1995. Sensitivity analysis of model output: performance of the iterated fractional factorial design method. Computational Statistics and Data Analysis 20, 387–407.

Saltelli, A., Ratto, M., Tarantola, S., Campolongo, F., 2006. Sensitivity analysis practices: strategies for model-based inference. Reliability Engineering and System Safety 91, 1109–1125.

Seber, G.A.F., 1977. Linear Regression Analysis. John Wiley & Sons, Inc, New York.

Snee, R.D., 1983. Discussion on developments in linear regression methodology 1959–1982. Technometrics 25, 230–237.

Stoline, M.R., 1991. An examination of the lognormal and Box and Cox family of transformation in fitting environmental data. Environmetrics 2, 85–106.

Yeh, K.C., Tung, Y.K., 1993. Uncertainty and sensitivity analyses of pit migration model. Journal of Hydraulic Engineering, ASCE 119 (2), 262–283.
