
Mapping soil water retention curves via spatial Bayesian hierarchical models


Journal of Hydrology 524 (2015) 768–779

Contents lists available at ScienceDirect

Journal of Hydrology

journal homepage: www.elsevier.com/locate/jhydrol

Wen-Hsi Yang a,*, David Clifford a, Budiman Minasny b

a CSIRO Digital Productivity Flagship, G.P.O. Box 2583, Brisbane, QLD 4001, Australia
b Faculty of Agriculture & Environment, The University of Sydney, Sydney, NSW 2006, Australia

* Corresponding author. Tel.: +61 7 38335533. E-mail address: [email protected] (W.-H. Yang).

http://dx.doi.org/10.1016/j.jhydrol.2015.03.029
0022-1694/Crown Copyright © 2015 Published by Elsevier B.V. All rights reserved.

Article info

Article history:
Received 7 October 2014
Received in revised form 6 February 2015
Accepted 13 March 2015
Available online 23 March 2015
This manuscript was handled by Peter K. Kitanidis, Editor-in-Chief, with the assistance of Wolfgang Nowak, Associate Editor

Keywords:
Water retention curve
Spatial van Genuchten model
Bayesian hierarchical modeling
Markov chain Monte Carlo
Spatial confounding
WinBUGS

Summary

Soil water retention curves are an important parameter in soil hydrological modeling. These curves are usually represented by the van Genuchten model. Two approaches have previously been taken to predict curves across a field – interpolation of field measurements followed by estimation of the van Genuchten model parameters, or estimation of the parameters according to field measurements followed by interpolation of the estimated parameters. Neither approach is ideal as, due to their two-stage nature, they fail to properly track uncertainty from one stage to the next. In this paper we address this shortcoming through a spatial Bayesian hierarchical model that fits the van Genuchten model and predicts the fields of hydraulic parameters of the van Genuchten model as well as fields of the corresponding soil water retention curves. This approach expands the van Genuchten model to a hierarchical modeling framework. In this framework, soil properties and physical or environmental factors can be treated as covariates and added to the van Genuchten model hierarchically. Consequently, the effects of covariates on the hydraulic parameters of the van Genuchten model can be identified. In addition, our approach takes advantage of Bayesian analysis to account for uncertainty and overcome the shortcomings of other existing methods. The code used to fit these models is available as an appendix to this paper. We apply this approach to data surveyed from part of the alluvial plain of the river Rhône near Yenne in Savoie, France. In this data analysis, we demonstrate how the inclusion of soil type or spatial effects can improve the van Genuchten model's predictions of soil water retention curves.

Crown Copyright © 2015 Published by Elsevier B.V. All rights reserved.

1. Introduction

Soil water retention curves are one of the most important parameters in soil hydrological modeling. These curves characterize water storage and pore distribution in soils and are an essential input to drive models that simulate soil water balance for climate and environmental monitoring as well as models of crop yield management. These curves are usually represented by an equation with several parameters that describe the relationship between water content and potential or pressure head. This parametric representation also allows the calculation of unsaturated hydraulic conductivity based on the assumed pore-distribution models (Mualem, 1976; Collis-George, 2014).

Various parametric equations have been proposed for modeling water retention curves, including Brooks and Corey (1964), van Genuchten (1980), and Kosugi (1996). Bimodal pore-distribution models such as Durner (1994) also have been proposed. The van Genuchten (VG) model is the most commonly used. Using notation similar to Voltz and Goulard (1994), the model is written as

$$W(h) = \frac{W_s - W_r}{[1 + (\alpha h)^n]^m} + W_r, \qquad W_s, W_r, \alpha, m > 0, \; n > 1,$$

where $W(h)$ represents the water content (in g g$^{-1}$) at pressure head $h$ (in m), $W_s$ is the saturated water content (in g g$^{-1}$), $W_r$ is the residual water content (in g g$^{-1}$), and $\alpha$ (in m$^{-1}$), $n$ and $m$ are shape parameters. The parameters $W_s$ and $W_r$ indicate the water content as $h \to 0$ and $h \to \infty$, respectively. The parameters $\alpha$ and $n$ are related to the inverse of air entry suction and the pore-size distribution, respectively. Typically, since $n$ is closely related to $m$, van Genuchten (1980) proposed replacing $m$ with $1 - 1/n$. This special case of the VG model can thus be written as

$$W(h) = \frac{W_s - W_r}{[1 + (\alpha h)^n]^{1 - 1/n}} + W_r, \qquad W_s, W_r, \alpha > 0, \; n > 1. \qquad (1)$$

In this setting, the number of parameters reduces to four. This form of the VG model has been widely used in addressing characteristic


properties of soil water content. Our study also considers this form of the VG model.
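The four-parameter form in Eq. (1) can be evaluated directly. The following is a minimal sketch, not the authors' code: the function name `van_genuchten` and the parameter values are illustrative assumptions, and the absolute value of the pressure head is used because the measured heads in this study are reported as negative numbers.

```python
import math


def van_genuchten(h, w_s, w_r, alpha, n):
    """Water content W(h) from Eq. (1), with m replaced by 1 - 1/n.

    h is the pressure head in m. Its absolute value is used here, which is
    an assumption of this sketch (heads are reported as negative values).
    """
    if not (w_s > 0 and w_r > 0 and alpha > 0 and n > 1):
        raise ValueError("require W_s, W_r, alpha > 0 and n > 1")
    m = 1.0 - 1.0 / n
    return (w_s - w_r) / (1.0 + (alpha * abs(h)) ** n) ** m + w_r


# Pressure heads (m) at which water content was measured in the data set.
heads = [-0.1, -0.5, -1, -2, -4, -9, -30, -150]

# Hypothetical parameter values, chosen only to illustrate the curve shape.
curve = [van_genuchten(h, w_s=0.35, w_r=0.05, alpha=1.5, n=1.4) for h in heads]
```

With these illustrative values the curve starts near the saturated water content and decreases toward the residual water content as the magnitude of the pressure head grows.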

Many methods have been proposed to estimate the hydraulic parameters of the VG model via measured water retention data. In general, they can be divided into two categories:

1. Use optimization techniques to minimize the sum of squared errors between the observed and modeled water retention. In this case, the Levenberg–Marquardt algorithm for nonlinear least squares is used in the RETC program (van Genuchten et al., 1991) and in SWRC Fit (Seki, 2007). Global optimization techniques, such as a genetic algorithm (Vrugt et al., 2001) or simulated annealing (Younes et al., 2013), have also been proposed.

2. Use Bayesian approaches in combination with Markov chain Monte Carlo (MCMC) (Abbaspour et al., 1997; Vrugt et al., 2003). The advantage of this approach is that it provides the posterior distribution of the parameters rather than a set of single values (i.e., point estimators).

Alternatively, pedotransfer functions (PTFs) (see Vila et al., 1999; Vereecken et al., 2010, and the references therein) have also been used to estimate the hydraulic parameters, but this is an indirect method for predicting parameters from other more easily measured soil properties.

Most research on the estimation of hydraulic parameters focuses on finding the best parametric model that can characterize soil hydraulic properties corresponding to water retention curves (Vrugt et al., 2003). Such research typically does not consider spatial variability of soil hydraulic properties, which can cause high variations in water transport processes. In addition, water retention curves are usually expensive to measure. Thus, spatial prediction of these properties is an important topic. Although researchers have been looking into efficient ways to predict water retention parameters across a field or landscape, determining and describing the spatial pattern of soil physical properties remains a difficult task for modeling landscape-scale soil–water processes (Wendroth et al., 2006). Voltz and Goulard (1994) proposed a two-stage approach to address this task. They first interpolated water content at different measured pressure heads and then used a least squares technique based on Marquardt's maximum neighborhood method to estimate VG parameters at each location of interest. Similarly, Saito et al. (2009) used SWRC Fit to estimate VG parameters at each location where data were measured, and conducted ordinary kriging for interpolation at locations in between. Furthermore, they evaluated two procedures: (1) fitting the VG model to the data and then interpolating the parameters, and (2) interpolating individual water retention measurements and then fitting the VG model at each interpolated location. Their results showed that the latter procedure performed better when the mean absolute error of water content was used as the evaluation criterion. However, the approach of Voltz and Goulard (1994) and the latter procedure of Saito et al. (2009) took no advantage of the spatial variability of the VG model parameters to map water retention curves. Moreover, since these two approaches fit local VG models using a small sample of measured water retention data, the estimated hydraulic parameters may be imprecise, which can affect the precision of the water retention curves. For example, the sample size of measured water retention at each location was eleven in the study of Saito et al. (2009). Consequently, abnormal observations would reduce the precision of the first procedure of Saito et al. (2009). Importantly, these approaches use least-squares based techniques to fit the VG model. As such, they lack the ability to account for uncertainty of the hydraulic parameters and could underestimate the uncertainty of predicted water retention curves.

In this paper, we propose a spatial Bayesian hierarchical approach to estimate and predict the VG model parameters and, further, to predict their corresponding water retention curves. This approach allows soil properties and other physical or environmental factors to be incorporated into the VG model hierarchically to interpret and predict the variation of soil water retention curves. If a priori knowledge is available, PTFs can be included in this model as well. In some sense, this approach can be thought of as a hierarchical VG model. It is important to note that although Abbaspour et al. (1997) and Vrugt et al. (2003) used Bayesian approaches to estimate the VG model parameters, their approaches did not consider spatial effects and cannot incorporate useful extraneous variables to improve the performance of the VG model. With our approach it is feasible to infer effects of covariates on hydraulic parameters corresponding to soil water retention curves. In addition, unlike the two-stage approaches of Voltz and Goulard (1994) and Saito et al. (2009), our approach can estimate and predict the hydraulic parameters and water retention curves simultaneously using all data from the study area. Importantly, since our approach is in the Bayesian paradigm, the uncertainty of the hydraulic parameters and water retention curves can be quantified. A comprehensive introduction to the Bayesian paradigm, and the computational techniques within this paradigm, is beyond the scope of this paper. We cite related papers from application journals and, where necessary, from statistical ones. A good resource for interested readers is Carlin and Louis (2001) or Gelman et al. (2003).

We applied this approach to data that were previously surveyed from part of the alluvial plain of the river Rhône near Yenne in Savoie, France. Our example demonstrates how the inclusion of covariate information such as soil type or the inclusion of spatial effects in the model can lead to improvements in the performance of the VG model in predicting water retention curves. The fact that this is achieved while simultaneously accounting for uncertainty both in the VG model and in the spatial interpolation marks an important contribution to this field of research.

2. Data and method

2.1. Data description

Voltz and Goulard (1994) surveyed water retention from part of the alluvial plain of the river Rhône near Yenne in Savoie, France. Fig. 1 illustrates the spatial distribution of 75 sites in the study area from two sampling schemes: 54 are from a rectangular grid with points equally spaced at 100 and 200 m intervals in the x and y direction, respectively, and 21 are from a square grid with points equally spaced at 141 m intervals. As Fig. 1 shows, neither sampling scheme could be sampled completely because of gravel content. At each location, undisturbed topsoil aggregates were collected at a depth of 40 cm. Their gravimetric water contents were measured at eight levels of pressure head: −0.1, −0.5, −1, −2, −4, −9, −30, and −150 m in a pressure plate extractor.

The study area contained six soil types which differ in terms of their soil texture and drainage characteristics. Voltz and Goulard (1994) described these six classes as follows: silt loam over loam (type 1), homogeneous silt loam (type 2), silty clay loam over poorly drained silty clay with marked gleyic features in depth (type 3), homogeneous silt loam with shallow phreatic water (type 4), loam over gravelly sand (type 5), and loam with angular gravel and presence of shallow phreatic water (type 6). However, the 75 sampled locations only covered five of these soil types since sites of type 6 could not be sampled. Fig. 1 illustrates the spatial distribution of five soil classes across the 75 sampled locations. Fig. 2(a) shows the observed water retention curves and associated mean curve as a function of pressure head. The red curve highlighted in Fig. 2(a) comes from location $(x, y) = (200, 400)$, which has a coarse texture soil with low porosity and a narrow range of pore-size distribution that drains rapidly with increasing pressure head. Fig. 2(b) illustrates the trend of water retention in an increasing order from soil types 1, 2, 4, 5, and 3. Soil type 1 typically holds less water at saturation and drains quickly with increasing pressure head. This reflects the fact that soil type 1 is a coarse-textured soil with a narrow range of pore-size distribution. In contrast, soil type 3 is a fine-textured soil with a wider range of pore-size distribution. Finally, note that soil type 5 drains faster than soil type 4 when the potential pressure is less than −30 m (i.e., $\log_{10}(30) = 1.48$).

Fig. 1. Spatial distribution of the observation locations. Black and red symbols express samples from rectangular grid with points spaced at 200 and 141 m intervals, respectively. Numbers represent soil classes: silt loam over loam (type 1), homogeneous silt loam (type 2), silty clay loam over poorly drained silty clay with marked gleyic features in depth (type 3), homogeneous silt loam with shallow phreatic water (type 4), and loam over gravelly sand (type 5). [Axes: x (m) and y (m), each from 0 to 1000.] (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

2.2. Spatial Bayesian hierarchical models

Bayesian hierarchical models use conditional probability to describe uncertainty of various sources and to easily measure the propagation of that uncertainty (Berliner, 1996; Cressie and Wikle, 2011). Cressie and Wikle (2011) describe that sources generally can be divided into three groups: the data ($\mathbf{Y}$), process ($\mathbf{B}$), and parameter ($\Theta$) model. In terms of conditional probability, Bayesian hierarchical models generally can be decomposed into three levels

$$[\mathbf{Y}, \mathbf{B}, \Theta] = [\mathbf{Y} \mid \mathbf{B}, \Theta]\,[\mathbf{B} \mid \Theta]\,[\Theta],$$

where $[G]$ and $[G \mid H]$ are the notation used to denote the probability distribution of $G$ and the conditional probability distribution of $G$ given $H$, respectively. In some cases, Bayesian hierarchical models may have multiple data models or process models. In what follows, we introduce the components of our Bayesian hierarchical approach for the VG model from data model to parameter model.
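As a reading aid, and not part of the original derivation, the three-level decomposition connects to inference through Bayes' theorem: conditioning on the observed data gives the joint posterior of the process and the parameters up to a normalizing constant. The symbols are those of the decomposition above.

```latex
[\mathbf{B}, \Theta \mid \mathbf{Y}]
  = \frac{[\mathbf{Y}, \mathbf{B}, \Theta]}{[\mathbf{Y}]}
  \propto [\mathbf{Y} \mid \mathbf{B}, \Theta]\,[\mathbf{B} \mid \Theta]\,[\Theta].
```

MCMC samplers draw from this posterior without evaluating the normalizing constant $[\mathbf{Y}]$.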

2.2.1. Data models

Usually data are observed with measurement errors. These errors may come from equipment or human behavior in sampling processes. In this respect, we add an error term to the deterministic model (1). Additionally, the interest typically focuses on a finite set of observed water retentions. In this case, we let $y_{q_i,j}$ denote the observed noisy water retention at location $q_i$ at the $j$th pressure head $h_j$. Then we can rewrite (1) as follows

$$y_{q_i,j} = w_{q_i,j} + \varepsilon_{q_i,j} = \frac{W_{s,q_i} - W_{r,q_i}}{[1 + (\alpha_{q_i} h_j)^{n_{q_i}}]^{1 - 1/n_{q_i}}} + W_{r,q_i} + \varepsilon_{q_i,j}, \qquad W_{s,q_i}, W_{r,q_i}, \alpha_{q_i} > 0, \; n_{q_i} > 1,$$

where the error term $\varepsilon_{q_i,j}$ has mean zero and constant variance $\sigma^2_\varepsilon$ for $i = 1, \ldots, n_q$ and $j = 1, \ldots, J$. In general, $\varepsilon_{q_i,j}$ is assumed to follow a Gaussian distribution (i.e., $\varepsilon_{q_i,j} \sim N(0, \sigma^2_\varepsilon)$). Note that specifying a distribution for $\varepsilon_{q_i,j}$ will depend on a priori knowledge about errors. For example, if we know that errors could happen in a specific range around zero, a truncated Gaussian may be considered. In general, we often lack knowledge about errors. This concern leads us to consider a Gaussian distribution for errors as a general assumption.

Fig. 2. (a) 74 observed water retention curves in gray, one observed water retention curve in red at location $(x, y) = (200, 400)$, and the mean curve of 75 observations in blue along with pressure head (m). (b) Mean water retention curves of five soil types: type 1 in black, type 2 in red, type 3 in green, type 4 in blue, and type 5 in light blue. [Axes in both panels: pressure head, $\log_{10}(h)$, against water content (g/g).] (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Now suppose we have a set of locations $\{k_1, \ldots, k_{n_k}\}$ of interest for prediction. For simplicity of notation, the observation locations $\{q_1, \ldots, q_{n_q}\}$ are assumed to be a subset of the prediction locations. Note that it is not necessary for the observation locations to be a subset of the prediction locations. For the observed water retention variables, we have $i = 1, \ldots, n_q$ observations $\mathbf{y}_{q_i} \equiv [y_{q_i,1}, \ldots, y_{q_i,J}]'$ at $J$ pressure heads, which can be written in terms of the $n_q \times J$ matrix $\mathbf{Y} \equiv [\mathbf{y}_{q_1}, \ldots, \mathbf{y}_{q_{n_q}}]'$. Similarly, the true water retention matrix of the prediction locations is $\mathbf{W} \equiv [\mathbf{w}_{k_1}, \ldots, \mathbf{w}_{k_{n_k}}]'$, where $\mathbf{w}_{k_m} \equiv [w_{k_m,1}, \ldots, w_{k_m,J}]'$. The data model in matrix form is as follows

$$\mathbf{Y} = \mathbf{H}\mathbf{W} + \mathbf{E}_y, \qquad (2)$$

where $\mathbf{H}$ is an $n_q \times n_k$ incidence matrix and $\mathbf{E}_y$ is an $n_q \times J$ matrix composed of the errors $\{\varepsilon_{q_i,j}\}$. Since some elements of $\mathbf{W}$ may not be observed, $\mathbf{H}$ is typically composed of 1s and 0s to connect elements of $\mathbf{Y}$ with their corresponding elements of $\mathbf{W}$ (see Cressie and Wikle, 2011, p. 162). For example, if the $k$th row of $\mathbf{W}$ corresponds to the $q$th row of $\mathbf{Y}$, then the $k$th element of the $q$th row of $\mathbf{H}$ is 1 and the remaining elements of that row are 0.
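The role of the incidence matrix in Eq. (2) can be illustrated with a small sketch. The helper names `incidence_matrix` and `matmul`, and the toy numbers, are assumptions made for illustration only; the error term is omitted so that the selection performed by the matrix is visible.

```python
def incidence_matrix(obs_to_pred, n_pred):
    """Build an n_q x n_k matrix of 0s and 1s.

    obs_to_pred[i] is the (0-based) index of the prediction location that
    coincides with observation location i.
    """
    H = [[0] * n_pred for _ in obs_to_pred]
    for i, k in enumerate(obs_to_pred):
        H[i][k] = 1
    return H


def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


# Toy example: 4 prediction locations, J = 2 pressure heads.
W = [[0.30, 0.20],
     [0.28, 0.18],
     [0.35, 0.25],
     [0.32, 0.21]]

# Observations exist only at prediction locations 0 and 2.
H = incidence_matrix([0, 2], n_pred=4)
HW = matmul(H, W)  # rows of W at the observed locations
```

Each row of `H` contains a single 1, so `HW` simply picks out the rows of `W` that correspond to observed locations.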

2.2.2. Process models

Before introducing our process model, we describe the approach used to deal with the fact that the processes $W_{s,k_m}$, $W_{r,k_m}$, $\alpha_{k_m}$, and $n_{k_m}$ have constraints on the range of values they can take. To estimate these processes, two approaches can be considered. One is to work within the current constrained domain. The alternative is to transform these processes into unconstrained domains, to estimate their associated values in that unconstrained domain, and then to transform these estimated values back to their original domains. We follow this alternative by rewriting our process as follows:

$$W_{s,k_m} = e^{\beta_{1,k_m}}, \qquad W_{r,k_m} = e^{\beta_{2,k_m}}, \qquad \alpha_{k_m} = e^{\beta_{3,k_m}}, \qquad n_{k_m} = e^{\beta_{4,k_m}} + 1,$$

where $-\infty < \beta_{\ell,k_m} < \infty$, $\ell = 1, 2, 3, 4$. Our estimation procedure will focus on $\{\beta_{\ell,k_m}\}$, whose domains are unconstrained, instead of $\{W_{s,k_m}, W_{r,k_m}, \alpha_{k_m}, n_{k_m}\}$. Note that other transformations can be considered.
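The transformation between the constrained hydraulic parameters and the unconstrained latent variables can be sketched as a pair of inverse functions. The function names below are hypothetical and the numerical values are arbitrary; only the four mappings themselves come from the text.

```python
import math


def to_constrained(beta):
    """Map unconstrained (beta_1, ..., beta_4) to (W_s, W_r, alpha, n)."""
    b1, b2, b3, b4 = beta
    return (math.exp(b1), math.exp(b2), math.exp(b3), math.exp(b4) + 1.0)


def to_unconstrained(w_s, w_r, alpha, n):
    """Inverse map; requires W_s, W_r, alpha > 0 and n > 1."""
    return (math.log(w_s), math.log(w_r), math.log(alpha), math.log(n - 1.0))


# Any real-valued beta yields parameters that satisfy the VG constraints.
beta = (-1.0, -3.0, 0.5, -0.9)
params = to_constrained(beta)
recovered = to_unconstrained(*params)
```

Because the exponential is strictly positive, every point in the unconstrained domain maps to a valid parameter set, which is what allows estimation to proceed without boundary handling.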

We assume the process model for $\boldsymbol{\beta}_{k_m} \equiv [\beta_{1,k_m}, \beta_{2,k_m}, \beta_{3,k_m}, \beta_{4,k_m}]'$ at location $k_m$ has the form

$$\boldsymbol{\beta}_{k_m}' = \mathbf{x}_{k_m}' \mathbf{D} + \boldsymbol{\varepsilon}_{\beta,k_m}', \qquad (3)$$

where $\mathbf{x}_{k_m}$ denotes a vector with $P$ covariates at location $k_m$, $\mathbf{D}$ is its corresponding $P \times 4$ unknown coefficient matrix, and $\boldsymbol{\varepsilon}_{\beta,k_m} \equiv [\varepsilon_{\beta_1,k_m}, \varepsilon_{\beta_2,k_m}, \varepsilon_{\beta_3,k_m}, \varepsilon_{\beta_4,k_m}]'$ serves as a stationary multivariate spatial process with mean zero and a cross-covariance function $C(g) = [C_{\beta_\ell,\beta_{\ell'}}(g)]^{4}_{\ell,\ell'}$ as a function of lag $g$ (i.e., distance in km), where $C_{\beta_\ell,\beta_{\ell'}}(g) = \mathrm{Cov}(\varepsilon_{\beta_\ell,k_m}, \varepsilon_{\beta_{\ell'},k_m+g})$. Note that $\mathrm{Cov}(\cdot,\cdot)$ denotes a covariance function. For locations $m = 1, \ldots, n_k$, we define an $n_k \times 4$ matrix for latent variables, $\mathbf{B} \equiv [\boldsymbol{\beta}_{k_1}, \ldots, \boldsymbol{\beta}_{k_{n_k}}]'$, and an $n_k \times P$ matrix for covariates, $\mathbf{X} \equiv [\mathbf{x}_{k_1}, \ldots, \mathbf{x}_{k_{n_k}}]'$. We can write (3) in matrix form as follows

$$\mathbf{B} = \mathbf{X}\mathbf{D} + \mathbf{E}_\beta, \qquad (4)$$

where $\mathbf{E}_\beta \equiv [\boldsymbol{\varepsilon}_{\beta,k_1}, \ldots, \boldsymbol{\varepsilon}_{\beta,k_{n_k}}]'$. As mentioned in Section 1, with the support of scientific knowledge, PTFs can be a part of $\mathbf{X}$ with known coefficients specified in the corresponding positions of the coefficient matrix $\mathbf{D}$. Typically, $\mathbf{E}_\beta$ is assumed to follow a multivariate Gaussian process. Under this assumption, $\mathbf{B}$ is also a multivariate Gaussian process so that the spatial process of the constrained hydraulic parameters $W$ of the VG model follows a multivariate log-Gaussian process.

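Through (4), extraneous variables and spatial effects can be added to the VG model to improve the performance for predicting water retention curves. To show the advantages of such a spatial process model we compare five non-spatial and three spatial process models. First, we describe the five non-spatial process models. Without accounting for spatial dependence, the non-diagonal elements of the cross-covariance function of $\boldsymbol{\varepsilon}_{\beta,k_m}$ are all zero; that is, cross-covariance values associated with distance/lag $g$ where $g > 0$ are zero. Hence, the multivariate Gaussian process reduces to a multivariate Gaussian distribution with mean zero and covariance matrix $\Sigma_\beta$. Accordingly, the five models are presented below

$$\begin{aligned}
\mathrm{M1}: \; & \boldsymbol{\beta}_{k_m}' = \mathbf{D}_{\mathrm{M1}}, \\
\mathrm{M2}: \; & \boldsymbol{\beta}_{k_m}' = \mathbf{D}_{\mathrm{M2}} + \boldsymbol{\varepsilon}_{\beta_{\mathrm{M2}},k_m}', \quad \boldsymbol{\varepsilon}_{\beta_{\mathrm{M2}},k_m} \sim N(\mathbf{0}, \mathrm{diag}(\sigma^2_{\beta_{\mathrm{M2}},1}, \sigma^2_{\beta_{\mathrm{M2}},2}, \sigma^2_{\beta_{\mathrm{M2}},3}, \sigma^2_{\beta_{\mathrm{M2}},4})), \\
\mathrm{M3}: \; & \boldsymbol{\beta}_{k_m}' = \mathbf{D}_{\mathrm{M3}} + \boldsymbol{\varepsilon}_{\beta_{\mathrm{M3}},k_m}', \quad \boldsymbol{\varepsilon}_{\beta_{\mathrm{M3}},k_m} \sim N(\mathbf{0}, \Sigma_{\beta_{\mathrm{M3}}}), \\
\mathrm{M4}: \; & \boldsymbol{\beta}_{k_m}' = \mathbf{x}_{k_m}' \mathbf{D}_{\mathrm{M4}} + \boldsymbol{\varepsilon}_{\beta_{\mathrm{M4}},k_m}', \quad \boldsymbol{\varepsilon}_{\beta_{\mathrm{M4}},k_m} \sim N(\mathbf{0}, \mathrm{diag}(\sigma^2_{\beta_{\mathrm{M4}},1}, \sigma^2_{\beta_{\mathrm{M4}},2}, \sigma^2_{\beta_{\mathrm{M4}},3}, \sigma^2_{\beta_{\mathrm{M4}},4})), \\
\mathrm{M5}: \; & \boldsymbol{\beta}_{k_m}' = \mathbf{x}_{k_m}' \mathbf{D}_{\mathrm{M5}} + \boldsymbol{\varepsilon}_{\beta_{\mathrm{M5}},k_m}', \quad \boldsymbol{\varepsilon}_{\beta_{\mathrm{M5}},k_m} \sim N(\mathbf{0}, \Sigma_{\beta_{\mathrm{M5}}}),
\end{aligned}$$

where $\mathrm{diag}(\cdot)$ denotes a diagonal matrix. In addition, the above models can be divided into two groups. M1, M2, and M3 consider a case where no additional covariates are available. In such a case, $\mathbf{D}$ corresponds to the overall mean of $\boldsymbol{\beta}_{k_m}'$. Conversely, M4 and M5 use external covariates to enhance model fitting and prediction. In the data analysis, there is one soil type associated with each sampling location (see Section 2.1). Soil type is included in the models as a categorical covariate via the use of dummy (indicator) variables (see Weisberg, 1985, p. 169, for more details). Furthermore, M2 and M4 are given constrained covariance matrices with diagonal structures, which means the four latent variables are mutually independent. In contrast, M3 and M5 are specified using flexible covariance matrices. Note that M1 reduces to a non-hierarchical VG model. In this case, the Bayesian inferences of M1 will be close to Abbaspour et al. (1997) and Vrugt et al. (2003). M3 is close to the mixed model suggested by Minasny et al. (2013).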

The three models that account for spatial variation are as follows:

$$\begin{aligned}
\mathrm{M6}: \; & \mathbf{B} = \mathbf{D}_{\mathrm{M6}} + \mathbf{E}_{\beta_{\mathrm{M6}}}, \\
\mathrm{M7}: \; & \mathbf{B} = \mathbf{X}\mathbf{D}_{\mathrm{M7}} + \mathbf{E}_{\beta_{\mathrm{M7}}}, \\
\mathrm{M8}: \; & \mathbf{B} = \mathbf{X}\mathbf{D}_{\mathrm{M8}} + (\mathbf{I} - \mathbf{P})\mathbf{E}_{\beta_{\mathrm{M8}}},
\end{aligned}$$

where $\mathbf{I}$ is an identity matrix of size $n_k$ and $\mathbf{P} = \mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'$ is the hat matrix (Weisberg, 1985). Similar to M1, M2, and M3 of the five non-spatial process models, M6 ignores additional covariates that may be available. M7 and M8 are similar to M4 and M5 in that they include soil type as a covariate. Model M8 modifies M7 to remedy spatial confounding when spatial effects are collinear with covariates, following the approach of Hodges and Reich (2010). In Section 3 we show how spatial confounding impacts the inferences of soil type effect corresponding to the hydraulic parameters of the VG model in our data analysis.
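To see what the factor (I − P) in M8 does, consider a design matrix whose columns are soil-type indicator variables. In that special case, which is an assumption of this sketch rather than a statement about the authors' implementation, the hat matrix replaces each value by the mean of its soil-type class, so (I − P) removes the class means from a column of spatial effects. The function name `residual_projection` and the toy values are hypothetical.

```python
def residual_projection(soil_type, effects):
    """Apply (I - P) to one column of spatial effects.

    Assumes the columns of X span the soil-type indicators, in which case
    P maps each entry to the mean of its soil-type class.
    """
    totals, counts = {}, {}
    for t, e in zip(soil_type, effects):
        totals[t] = totals.get(t, 0.0) + e
        counts[t] = counts.get(t, 0) + 1
    return [e - totals[t] / counts[t] for t, e in zip(soil_type, effects)]


# Toy example: five locations, two soil types, one column of spatial effects.
soil = [1, 1, 2, 2, 2]
spatial_effects = [0.2, 0.4, -0.1, 0.0, 0.4]
orthogonalised = residual_projection(soil, spatial_effects)
```

After the projection, the spatial effects sum to zero within each soil type, so they can no longer absorb variation that the soil-type covariate is meant to explain.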

To make the computation easier, we assume that the latent variables that make up $\boldsymbol{\beta}$ are independent. We note that this assumption is not necessary. The inference results of M3 and M5 in our data analysis show that $\beta_1$ and $\beta_4$ are highly correlated according to the posterior means of correlation coefficients; for M3, $\mathrm{Cor}(\beta_1, \beta_4) = -0.95$ and for M5, $\mathrm{Cor}(\beta_1, \beta_4) = -0.85$. However, WinBUGS 1.4.3 (Lunn et al., 2000) (http://www.mrc-bsu.cam.ac.uk/bugs/), which we used to run our MCMC samplers, is inefficient at MCMC sampling for multivariate spatial models. For this reason, we assume that the latent variables are independent. Under the assumption of independence, the components $C_{\beta_\ell,\beta_{\ell'}}(g)$ of $C(g)$ are equal to zero when $\ell \neq \ell'$. When $\ell = \ell'$, we assume $C_{\beta_\ell,\beta_\ell}(g) = \sigma^2_{\beta,\ell} \exp(-g / s_\ell)$, $\ell = 1, 2, 3, 4$, where $\sigma^2_{\beta,\ell}$ and $s_\ell$ are scale and range parameters, respectively. This covariance structure is also known as the exponential covariance function. This structure is applied to the spatial processes of M6, M7 and M8. This covariance function is widely used and can fit soil data well (Minasny and McBratney, 2005). Other types of covariance function can be considered also.
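The exponential covariance function is simple enough to write out. The sketch below is illustrative only: the function names, the coordinates, and the scale and range values are assumptions, with distances expressed in km as in the text.

```python
import math


def exp_cov(g, sigma2, s):
    """Exponential covariance sigma2 * exp(-g / s) at lag g (km)."""
    return sigma2 * math.exp(-g / s)


def cov_matrix(coords_km, sigma2, s):
    """Covariance matrix of one latent variable over a set of locations."""
    return [[exp_cov(math.hypot(xa - xb, ya - yb), sigma2, s)
             for (xb, yb) in coords_km]
            for (xa, ya) in coords_km]


# Three hypothetical locations (km) with scale 2.0 and range 0.5 km.
coords = [(0.0, 0.0), (0.1, 0.0), (0.1, 0.2)]
C = cov_matrix(coords, sigma2=2.0, s=0.5)
```

The diagonal of `C` equals the scale parameter, and the off-diagonal entries decay with distance at a rate controlled by the range parameter.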

2.2.3. Parameter model

In the Bayesian paradigm, we specify prior distributions for the parameters of our model. First, for the data model (2) we specify an inverse-gamma distribution for $\sigma^2_\varepsilon$, given by $\sigma^2_\varepsilon \sim IG(a_\varepsilon, b_\varepsilon)$, where $a_\varepsilon$ and $b_\varepsilon$ are given shape and scale hyperparameters. For the process model (4), we specify a Gaussian distribution for each element $d_{p,\ell}$ of $\mathbf{D}$, given by $d_{p,\ell} \sim N(\mu_{d_{p,\ell}}, \sigma^2_{d_{p,\ell}})$, where $\mu_{d_{p,\ell}}$ and $\sigma^2_{d_{p,\ell}}$ are given hyperparameters. Additionally, for simplicity, we set $\sigma^2_{d_{p,\ell}}$ equal to a common $\sigma^2_d$ (i.e., $\sigma^2_{d_{p,\ell}} = \sigma^2_d$). For M2, M4, M6 and M7, we specify an inverse-gamma distribution for $\sigma^2_{\beta,\ell}$ given by $\sigma^2_{\beta,\ell} \sim IG(a_{\beta,\ell}, b_{\beta,\ell})$, where $a_{\beta,\ell}$ and $b_{\beta,\ell}$ are shape and scale hyperparameters. For M3 and M5, we use a Wishart distribution for $\Sigma_\beta^{-1}$ given by $\Sigma_\beta^{-1} \sim W_p(\mathbf{V}, \nu)$, where $\mathbf{V}$ and $\nu$ are the given scale matrix and degrees of freedom. Also, for simplicity, we give a common value $a$ for $a_\varepsilon$ and $a_{\beta,\ell}$, and $b$ for $b_\varepsilon$ and $b_{\beta,\ell}$.

In terms of hyperparameters, their values are set so that prior probability distributions are proper distributions with large variance (i.e., vague or noninformative priors). Our choice of vague priors is to impart little impact on the analysis. For the inverse gamma prior, the shape and scale parameters are specified equal to $10^{-3}$ and $10^{-3}$, respectively. For each of the Gaussian priors, the mean is set to zero since we have no a priori scientific preference, and the variance is specified equal to $10^{6}$. For the Wishart distribution, $\Sigma_\beta^{-1}$ is specified using $\mathbf{V} = 10^{3}\mathbf{I}_p$, where $\mathbf{I}_p$ is an identity matrix of dimension $p \times p$, and $\nu = p$. Since the variability of the Wishart distribution is inversely proportional to $\nu$, the Wishart distribution has a flat shape and its integral is finite when $\nu = p$.

We treat the range parameters $s_{\mathrm{M6},\ell}$ of M6 as hyperparameters by giving fixed values. This decision was taken as their MCMC samples could not converge appropriately in our data analysis. For example, if uniform priors were specified, the support of their posterior distributions was equal to that of the given priors. If gamma distributions were given, their MCMC samples could not converge. Additionally, for simplicity, we set $s_{\mathrm{M6},\ell}$ equal to a common $s_{\mathrm{M6}}$ (i.e., $s_{\mathrm{M6},\ell} = s_{\mathrm{M6}}$).

Fig. 3. The traces of $\sigma^2_\varepsilon$, $d_{1,1}$, $d_{1,3}$, $d_{1,4}$, $\sigma^2_{\beta,1}$, $\sigma^2_{\beta,3}$, and $\sigma^2_{\beta,4}$ of M6 when $\tau_{M6} = 1$ km.

To choose an appropriate value for $\tau_{M6}$, we did the following preliminary analysis. The average and maximum distance between pairs of the 75 observation locations were 0.513 and 1.141 km, respectively. Also, we estimated the range parameter of an exponential covariance function of measured water retention at each pressure head surface by using the likfit function of the R (R Core Team, 2014) package "geoR" (Ribeiro and Diggle, 2001). The likfit function conducts likelihood-based estimation for Gaussian random fields. The estimated ranges are 0.3353, 0.3763, 0.3708, 0.3668, 0.3049, 0.2176, 0.2870, and 0.3189 km for pressure heads −0.1, −0.5, −1, −2, −4, −9, −30, and −150 m, respectively. Experimental variograms and likelihood-based estimation at each pressure head are illustrated in Supplementary Material A. Accordingly, we conducted sensitivity analysis for $\tau_{M6} \in \{0.5, 1, 1.5, 2, 2.5, 3, 3.5\}$ km for our data analysis. Similar to M6, we set $\tau_{M7,\ell}$ and $\tau_{M8,\ell}$ equal to a common $\tau_{M7}$ and $\tau_{M8}$, respectively, and conducted sensitivity analysis for $\tau_{M7} = \tau_{M8} \in \{0.5, 1, 1.5, 2, 2.5, 3, 3.5\}$ km.
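To give a feel for what the candidate range values imply, the following minimal sketch evaluates an exponential correlation function at the average and maximum inter-location distances quoted above. The parametrization $\exp(-d/\tau)$ and the helper name `exp_correlation` are assumptions made for illustration; the exact parametrization used by GeoBUGS or geoR in the analysis may differ.

```python
import math

def exp_correlation(distance_km, range_km):
    """Exponential correlation exp(-d / tau); this parametrization is an assumption."""
    return math.exp(-distance_km / range_km)

# Candidate range values (km) from the sensitivity analysis described in the text.
candidate_ranges_km = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
# Average and maximum pairwise distances (km) reported in the text.
mean_dist_km, max_dist_km = 0.513, 1.141

correlations = {
    tau: (exp_correlation(mean_dist_km, tau), exp_correlation(max_dist_km, tau))
    for tau in candidate_ranges_km
}

for tau, (at_mean, at_max) in correlations.items():
    print(f"tau = {tau:.1f} km: corr at mean distance = {at_mean:.3f}, "
          f"corr at max distance = {at_max:.3f}")
```

Under this assumed form, larger range values keep distant locations more strongly correlated, which is one way to read the sensitivity analysis over the candidate set.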

Table 1. The sensitivity analysis for the range parameters of spatial (M6–M8) models in terms of deviance information criterion (DIC), mean deviation (MD) (in g g$^{-1}$), root mean squared deviation (RMSD), and mean absolute error (MAE) (in g g$^{-1}$). We denote $\tau_{M6}$, $\tau_{M7}$ and $\tau_{M8}$ (in km) as the range parameters for M6, M7, and M8, respectively.

| Model | Criterion | 1 | 1.5 | 2 | 2.5 | 3 | 3.5 |
|---|---|---|---|---|---|---|---|
| M6 ($\tau_{M6}$) | DIC | −2319 | −2320 | −2320 | −2320 | −2320 | −2321 |
| | MD ($\times 10^{-4}$) | 2.92 | 2.35 | 1.62 | 0.56 | −0.53 | −1.87 |
| | RMSD ($\times 10^{-2}$) | 2.93 | 2.94 | 2.95 | 2.97 | 2.99 | 3.00 |
| | MAE ($\times 10^{-2}$) | 2.33 | 2.34 | 2.35 | 2.36 | 2.37 | 2.39 |
| M7 ($\tau_{M7}$) | DIC | −2318 | −2318 | −2319 | −2319 | −2320 | −2319 |
| | MD ($\times 10^{-4}$) | −18.22 | −18.82 | −19.65 | −20.95 | −21.95 | −23.21 |
| | RMSD ($\times 10^{-2}$) | 3.04 | 3.04 | 3.06 | 3.07 | 3.09 | 3.10 |
| | MAE ($\times 10^{-2}$) | 2.42 | 2.42 | 2.44 | 2.44 | 2.46 | 2.47 |
| M8 ($\tau_{M8}$) | DIC | −2318 | −2318 | −2319 | −2319 | −2319 | −2319 |
| | MD ($\times 10^{-4}$) | −18.66 | −19.25 | −19.62 | −21.28 | −22.39 | −23.83 |
| | RMSD ($\times 10^{-2}$) | 3.04 | 3.05 | 3.06 | 3.07 | 3.09 | 3.11 |
| | MAE ($\times 10^{-2}$) | 2.42 | 2.42 | 2.43 | 2.45 | 2.46 | 2.47 |

2.2.4. Bayesian inference and model selection

We considered the out-of-sample validation of Voltz and Goulard (1994) to evaluate model performance. This out-of-sample validation procedure used the data at 54 locations (i.e., black symbols in Fig. 1) to fit a model and predicted water retention curves at 24 locations (i.e., red symbols in Fig. 1). More specifically, our MCMC samplers were run under WinBUGS 1.4.3 using the spatial correlation functionality of the GeoBUGS package (Thomas et al., 2004). Our MCMC samples consisted of three chains. Each chain collected 3,000 MCMC samples from 950,000 MCMC iterations by discarding the first 50,000 iterations as burn-in and keeping every 300th iteration of the remainder. This process led to final MCMC samples with size 9,000.
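The sample counts above follow from simple arithmetic, which the short sketch below reproduces. The function name `thinned_indices` is hypothetical, and which exact iteration is kept first after burn-in is an assumption; only the counts are taken from the text.

```python
def thinned_indices(total_iterations, burn_in, thin):
    """Zero-based iterations kept after discarding burn-in and thinning."""
    return range(total_iterations)[burn_in::thin]

# Settings quoted in the text: 950,000 iterations, 50,000 burn-in, every 300th kept.
kept = thinned_indices(950_000, 50_000, 300)
samples_per_chain = len(kept)          # (950000 - 50000) / 300
total_samples = 3 * samples_per_chain  # three chains pooled

print(samples_per_chain, total_samples)
```

Heavy thinning of this kind trades computation for weaker autocorrelation between retained draws.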

Convergence was first evaluated through visual inspection of the trace plots of parameters. For example, Fig. 3 illustrates the trace plots of the parameters of M6. These plots indicate convergence of the MCMC samples since no significant trends are apparent. To reduce the subjectivity of visual inspection of trace plots, we also conducted a numerical inspection of R-hat for each parameter (Gelman, 1996). If the MCMC converges properly, R-hat should be close to 1.0. Although a value of R-hat below 1.1 is acceptable (see Gelman et al., 2003, pp. 296–297), we tuned our model until R-hat was below 1.05 to ensure model convergence in the data analysis.
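As an illustration of the numerical check, the sketch below computes the classic potential scale reduction factor from several chains of one parameter. It is a minimal version of the textbook formula and is not necessarily the exact variant reported by WinBUGS; the function name and the simulated chains are assumptions for demonstration only.

```python
import random
import statistics

def gelman_rubin_rhat(chains):
    """Potential scale reduction factor for equal-length chains of one parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    # W: average of the within-chain sample variances.
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5

rng = random.Random(1)
# Three simulated chains sampling the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(3000)] for _ in range(3)]
# Three simulated chains stuck around different values (not converged).
stuck = [[rng.gauss(shift, 1.0) for _ in range(3000)] for shift in (0.0, 5.0, 10.0)]

rhat_mixed = gelman_rubin_rhat(mixed)
rhat_stuck = gelman_rubin_rhat(stuck)
print(f"well-mixed chains: R-hat = {rhat_mixed:.3f}")
print(f"separated chains:  R-hat = {rhat_stuck:.3f}")
```

With the thresholds quoted in the text, the first case would pass the 1.05 criterion and the second would fail even the looser 1.1 criterion.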

Four criteria were considered to evaluate model performance. The first is the deviance information criterion (DIC), designed as a trade-off between the goodness of fit and complexity of Bayesian hierarchical models (Spiegelhalter et al., 2002; Gelman et al., 2003). DIC can be calculated as follows,

$$\mathrm{DIC} = -\frac{4}{T} \sum_{t=1}^{T} \log L(\Theta_t \mid Y) + 2 \log L(\hat{\Theta} \mid Y),$$

where $\log L(\Theta \mid Y)$ is the log-likelihood of $\Theta$ given data $Y$, $\Theta_t$ is the $t$-th MCMC sample of $\Theta$, and $\hat{\Theta}$ is the posterior mean of $\Theta$. In general, a difference of DIC of more than 10 units may indicate that the model with smaller DIC has significantly better performance (Spiegelhalter et al., 2002; Berry et al., 2010; Lesaffre and Lawson, 2012). The other three criteria for evaluating model performance are the mean deviation (MD) as used in Voltz and Goulard (1994), root mean square deviation (RMSD), and mean absolute error (MAE) as used in Saito et al. (2009). These are defined as

$$\mathrm{MD} = \frac{1}{TIJ} \sum_{t=1}^{T} \sum_{i=1}^{I} \sum_{j=1}^{J} (\hat{y}_{ijt} - y_{ij}),$$

$$\mathrm{RMSD} = \sqrt{\frac{1}{TIJ} \sum_{t=1}^{T} \sum_{i=1}^{I} \sum_{j=1}^{J} (\hat{y}_{ijt} - y_{ij})^2},$$

$$\mathrm{MAE} = \frac{1}{TIJ} \sum_{t=1}^{T} \sum_{i=1}^{I} \sum_{j=1}^{J} |\hat{y}_{ijt} - y_{ij}|,$$

where $\hat{y}_{ijt}$ denotes the $t$-th predicted water retention from the MCMC iterations at location $i$ at pressure head $h_j$. MD measures the unbiasedness of predictors of $y_{ij}$ and should be close to zero. MAE measures the precision of predictors of $y_{ij}$ and should have a relatively small value. Alternatively, since RMSD comprises the squared bias and variance of predictors, it measures goodness of prediction and should have a relatively small value (Carroll and Cressie, 1996).

Table 2. We compare our non-spatial (M1–M5) and spatial (M6–M8) models in terms of deviance information criterion (DIC), mean deviation (MD) (in g g$^{-1}$), root mean squared deviation (RMSD), and mean absolute error (MAE) (in g g$^{-1}$).

| Criterion | M1 | M2 | M3 |
|---|---|---|---|
| DIC | −1550 | −2323 | −2334 |
| MD ($\times 10^{-4}$) | −212.15 | −216.25 | −211.95 |
| RMSD ($\times 10^{-2}$) | 4.05 | 5.43 | 5.44 |
| MAE ($\times 10^{-2}$) | 3.41 | 4.41 | 4.40 |
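The four criteria (DIC, MD, RMSD and MAE) can be computed directly from MCMC output. The following is a minimal sketch under the assumption that the log-likelihood at each retained draw, the log-likelihood at the posterior mean, and the predicted values indexed as `pred[t][i][j]` are already available; the function names and the toy numbers are hypothetical and are not taken from the paper's data.

```python
import math

def dic(log_lik_samples, log_lik_at_posterior_mean):
    """DIC = -(4/T) * sum_t log L(Theta_t | Y) + 2 * log L(Theta_hat | Y)."""
    T = len(log_lik_samples)
    return -4.0 / T * sum(log_lik_samples) + 2.0 * log_lik_at_posterior_mean

def prediction_criteria(pred, obs):
    """Return (MD, RMSD, MAE) for predictions pred[t][i][j] and observations obs[i][j]."""
    errors = [
        pred_t[i][j] - obs[i][j]
        for pred_t in pred
        for i in range(len(obs))
        for j in range(len(obs[i]))
    ]
    count = len(errors)  # equals T * I * J
    md = sum(errors) / count
    rmsd = math.sqrt(sum(e * e for e in errors) / count)
    mae = sum(abs(e) for e in errors) / count
    return md, rmsd, mae

# Toy example: 2 locations x 2 pressure heads, 2 MCMC draws (illustrative values only).
obs = [[0.30, 0.20], [0.35, 0.25]]
pred = [
    [[y + 0.01 for y in row] for row in obs],  # draw 1 over-predicts by 0.01
    [[y - 0.01 for y in row] for row in obs],  # draw 2 under-predicts by 0.01
]
md, rmsd, mae = prediction_criteria(pred, obs)
toy_dic = dic([-10.0, -12.0], -10.5)
print(md, rmsd, mae, toy_dic)
```

In the toy example the over- and under-predictions cancel in MD but not in RMSD or MAE, which is why MD is read as a bias measure while the other two reflect overall prediction error.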

3. Results

Our preliminary analysis of M1 showed that the coefficient $d_{1,2}$ corresponding to $W_r$ required a more informative Gaussian prior such as $N(-50, 10^{3})$ instead of our default prior of $N(0, 10^{6})$, otherwise $W_r$ does not converge. Given $d_{1,2} \sim N(-50, 10^{3})$, the posterior mean and median of $W_r$ are both close to zero (i.e., $1.03 \times 10^{-4}$ and $9.18 \times 10^{-24}$, respectively). Therefore, similar to Voltz and Goulard (1994), we fixed the residual water content $W_{r,k_m}$ to zero at each location $k_m$. We found that such a setting has little impact on the inference of the other hydraulic parameters of the VG model as well as little impact on the prediction of water retention curves.

Table 1 presents the sensitivity analysis for the range parameters of M6, M7 and M8. We note that when $\tau_{M6} = \tau_{M7} = \tau_{M8} = 0.5$ km, R-hat indicated convergence (i.e., R-hat < 1.05) but trace plots of some parameters did not show reasonable evidence for convergence. Hence, these results are not shown in Table 1. For M6, DIC values showed no difference since the maximum absolute difference in DIC values was less than 10. On the other hand, the MD value closest to zero occurred at $\tau_{M6} = 3$ km but the smallest RMSD and MAE values occurred at $\tau_{M6} = 1$ km. To choose a range value for M6 to compare with the other models, we considered $\tau_{M6} = 1$ km since RMSD comprises squared bias and variance of predictors. For M7 and M8, DIC values showed no difference but the MD value closest to zero and the smallest RMSD and MAE values all occurred when the range parameters equal 1 km. Hence, we considered $\tau_{M7} = 1$ km and $\tau_{M8} = 1$ km for M7 and M8, respectively, for model comparison. We note that these choices did not affect the performance of M6, M7, and M8 in model comparison.

In model comparison, Table 2 presents the DIC, MD, RMSD and MAE values of each model. For DIC, the smallest value was found for M3, but the absolute difference between M3 and M5 was less than 10. Except for M1, the absolute differences in DIC between M3 and M2, M4, M6, M7, and M8 were just a bit larger than 10. For MD, RMSD and MAE, M6 had superior performance compared with the others, including the other two spatial models M7 and M8. These two spatial models had similar performance in prediction. In particular, the spatial models M6, M7 and M8 performed better than the non-spatial models. DIC preferred M3 because M3 had more complexity (i.e., more parameters) than M6, M7 and M8. This result reflects a well-known issue that DIC usually favors models that overfit (Spiegelhalter et al., 2014). Although alternative information criteria have been proposed in theory to address this issue, none are readily available in practice (Spiegelhalter et al., 2014). From a prediction point of view, spatial models provided more accurate and precise predictors for water retention curves.

Table 2 (continued).

| Criterion | M4 | M5 | M6 | M7 | M8 |
|---|---|---|---|---|---|
| DIC | −2317 | −2330 | −2319 | −2318 | −2318 |
| MD ($\times 10^{-4}$) | −82.86 | −84.70 | 2.92 | −18.22 | −18.66 |
| RMSD ($\times 10^{-2}$) | 4.14 | 4.00 | 2.93 | 3.04 | 3.04 |
| MAE ($\times 10^{-2}$) | 3.30 | 3.20 | 2.33 | 2.42 | 2.42 |

Table 3. Comparison of soil type effect on latent variables $W_s$ (in g g$^{-1}$), $\alpha$ (in m$^{-1}$), and $n$ according to M4, M5, M7 and M8. We treated soil type 1 as the base type and obtained posterior means and 95% credible intervals for exponential of relative effects of the other soil types. The lower and upper bounds of 95% credible intervals are the 2.5% and 97.5% quantiles of a posterior distribution, respectively. Entries are posterior mean (95% credible interval).

| Model | Soil type | $W_s$ | $\alpha$ | $n$ |
|---|---|---|---|---|
| M4 | 2 | −0.001 (−0.021, 0.018) | −0.163 (−0.420, −0.005) | 0.039 (−0.026, 0.102) |
| | 3 | 0.054 (0.032, 0.076) | −0.133 (−0.392, 0.055) | −0.065 (−0.129, −0.002) |
| | 4 | 0.005 (−0.028, 0.039) | −0.202 (−0.466, 0.024) | 0.035 (−0.096, 0.206) |
| | 5 | 0.028 (−0.008, 0.066) | −0.112 (−0.422, 0.318) | −0.009 (−0.118, 0.135) |
| M5 | 2 | −0.002 (−0.022, 0.018) | −0.162 (−0.403, −0.003) | 0.046 (−0.022, 0.111) |
| | 3 | 0.054 (0.030, 0.078) | −0.123 (−0.378, 0.071) | −0.071 (−0.140, −0.006) |
| | 4 | 0.005 (−0.030, 0.041) | −0.194 (−0.448, 0.035) | 0.033 (−0.101, 0.210) |
| | 5 | 0.027 (−0.010, 0.066) | −0.100 (−0.405, 0.351) | −0.016 (−0.129, 0.130) |
| M7 | 2 | 0.003 (−0.025, 0.031) | −0.462 (−2.693, −0.007) | 0.041 (−0.028, 0.113) |
| | 3 | 0.037 (−0.004, 0.081) | −0.275 (−2.145, 0.413) | −0.054 (−0.134, 0.034) |
| | 4 | 0.015 (−0.031, 0.034) | −0.350 (−2.461, 0.368) | 0.048 (−0.098, 0.247) |
| | 5 | 0.012 (−0.037, 0.063) | −0.188 (−2.280, 0.933) | 0.007 (−0.123, 0.176) |
| M8 | 2 | −0.002 (−0.011, 0.007) | −0.168 (−0.333, −0.056) | 0.036 (−0.028, 0.099) |
| | 3 | 0.056 (0.045, 0.066) | −0.140 (−0.313, −0.005) | −0.064 (−0.125, −0.004) |
| | 4 | 0.012 (−0.006, 0.032) | −0.208 (−0.386, −0.053) | 0.036 (−0.094, 0.211) |
| | 5 | 0.028 (0.010, 0.047) | −0.113 (−0.323, 0.192) | −0.006 (−0.115, 0.131) |

Since M6 performed better than M7 and M8, it is important to examine in detail whether soil type can help predict water retention curves. We first compared non-spatial models with and without soil type as a covariate. From Table 2, M4 and M5, which include soil-type covariates, had better performance than M1, M2, and M3, which do not, in terms of MD and MAE. M5 had superior performance in terms of RMSD. These results indicated that soil type could improve accuracy and precision of model performance. Although soil type and spatial effects could individually improve model performance on predicting water retention, M7 and M8, which consider both soil type and spatial effects together, did not perform better than M6. Such a result indicated that the contribution of soil type to predicting water retention curves was smaller than the contribution of spatial effects in our data analysis.

Finally, we looked at inferences on soil type effects on the VG model parameters in M4, M5, M7 and M8. Table 3 presents soil type effects on $W_s$, $\alpha$ and $n$, where soil type 1 served as the reference level for comparison with soil types 2, 3, 4, and 5. For $W_s$, soil types 2, 4, and 5 were not statistically different from soil type 1 in M4, M5, M7 and M8. However, soil type 3 was larger than soil type 1 in M4, M5 and M8 but not statistically different from soil type 1 in M7, with a smaller posterior mean and a wider credible interval than in M4, M5 and M8. For $\alpha$, soil types 3, 4, and 5 were not statistically different from soil type 1 in M4, M5 and M7 but soil types 3 and 4 were statistically smaller than soil type 1 in M8. On the other hand, soil type 2 was statistically smaller than soil type 1 in M4, M5, M7 and M8. However, the posterior mean and credible interval in M7 were smaller and wider than in M4, M5 and M8. In terms of $n$, soil types 2, 4, and 5 were not statistically different from soil type 1 in M4, M5, M7 and M8. However, soil type 3 was smaller than soil type 1 in M4, M5 and M8 but not statistically different from soil type 1 in M7, with a larger posterior mean and a wider credible interval. In summary, these results showed that adding spatial effects to the model (i.e., M7) could change posterior means and inflate credible intervals. Using appropriate methods such as that of Hodges and Reich (2010) to remedy spatial confounding is essential for statistical inference. The spatial model M8, in which spatial confounding is remedied, showed similar inferences about soil type effects as found in the non-spatial models M4 and M5. In particular, M8 indicated that soil types 3 and 5 were 0.056 and 0.028 larger than soil type 1 for $W_s$, respectively. For $\alpha$, soil types 2, 3, and 4 were 0.168, 0.140, and 0.208 smaller than soil type 1, respectively. For $n$, soil type 3 was 0.064 smaller than soil type 1.
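The comparisons above rest on posterior means and 95% credible intervals taken as the 2.5% and 97.5% quantiles of the MCMC draws, with an effect read as different from the reference soil type when the interval excludes zero. The sketch below shows one way such a summary could be computed; the helper names and the simulated draws are hypothetical and do not reproduce the paper's posterior samples.

```python
import random
import statistics

def quantile(sorted_values, prob):
    """Linearly interpolated quantile of an already sorted list."""
    position = prob * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1.0 - weight) + sorted_values[upper] * weight

def summarize_effect(samples, level=0.95):
    """Posterior mean and equal-tailed credible interval from MCMC draws."""
    ordered = sorted(samples)
    tail = (1.0 - level) / 2.0
    return (
        statistics.fmean(ordered),
        quantile(ordered, tail),
        quantile(ordered, 1.0 - tail),
    )

# Simulated stand-in for 9,000 posterior draws of one soil-type effect.
rng = random.Random(0)
draws = [rng.gauss(0.056, 0.005) for _ in range(9000)]

mean, lower, upper = summarize_effect(draws)
excludes_zero = lower > 0.0 or upper < 0.0
print(f"mean = {mean:.3f}, 95% CI = ({lower:.3f}, {upper:.3f}), "
      f"excludes zero: {excludes_zero}")
```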

As a result of the out-of-sample validation, we chose M6 as our final model. Fig. 4 shows the posterior and prior distributions of the parameters in M6 and illustrates how Bayesian analysis combines the data and the prior to reduce parameter uncertainty in the posterior distribution. To predict the fields of the hydraulic parameters $W_s$, $\alpha$, and $n$ of the VG model as well as the corresponding water retention, we created a square grid of points with a 25 m interval and predicted this field using M6 fitted to the data from the 75 locations. It is important to note that we have no information about where gravel occurs in the grid in our example and so we have not eliminated any points of the grid when producing the map of VG parameters and associated water retention curves in our analysis. The R and WinBUGS codes for M6 are included in Supplementary Material B.

Fig. 4. The posterior distributions (black curves) and prior distributions (red dashed curves) of $\sigma^2_\varepsilon$, $d_{1,1}$, $d_{1,3}$, $d_{1,4}$, $\sigma^2_{\beta,1}$, $\sigma^2_{\beta,3}$, and $\sigma^2_{\beta,4}$ of M6 when $\tau_{M6} = 1$ km. Note that since all the priors are vague priors, their density values are relatively smaller than the posterior distributions. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 5 illustrates the predicted mean and standard deviation of $W_s$, $\log(\alpha)$, and $n$, respectively. For $W_s$, soil type 3 (i.e., silty clay loam) in the eastern part of the field showed high values but soil types 1 and 2 on the western part of the field showed small values. These results agreed with the observed water retention curves when pressure head is close to zero in Fig. 2(b). For $\log(\alpha)$ and $n$, soil types 1 and 2 on the western part of the field showed high values. On the other hand, soil type 3 showed small values of $\log(\alpha)$ and $n$ on the eastern part of the field. These results indicated that fine-textured soils with a wider range of pore distribution typically come with small values of $\log(\alpha)$ and $n$ of the VG model. Moreover, the area of soil type 4 surrounded by soil types 2 and 3 had smaller values of $\log(\alpha)$ and $n$ than soil type 5 but the other area of soil type 4 had larger values than soil type 5. This result indicated that soil hydraulic properties of a soil type could vary corresponding to changes of environmental factors in space. Besides, since the highest value of $\log(\alpha)$ (i.e., inverse of air-entry value) occurred at $(x, y) = (200, 400)$, the water retention curve decreases rapidly at this location (i.e., the red curve in Fig. 2(a)). Fig. 6 shows the corresponding predicted mean and standard deviation of water retention at pressure heads −0.5, −1, and −150 m, respectively. Since soil type 3 had high values of $W_s$ and small values of $\log(\alpha)$ and $n$, it held more water. There is lower retention at $(x, y) = (200, 400)$ because of the high value of $\log(\alpha)$.
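The link between a high $\log(\alpha)$ and a rapidly decreasing retention curve can be illustrated with the VG equation itself. The sketch below assumes $W_r = 0$ (as fixed in the analysis), the common constraint $m = 1 - 1/n$, and the use of the absolute value of the pressure head; the constraint on $m$, the absolute value, the function name, and all parameter values are assumptions chosen for illustration, not estimates from the paper.

```python
import math

def vg_water_content(head_m, w_s, alpha, n, w_r=0.0):
    """van Genuchten curve W(h) = (W_s - W_r) / [1 + (alpha*|h|)^n]^m + W_r.

    Assumes m = 1 - 1/n and uses |h| so that negative pressure heads can be passed.
    """
    m = 1.0 - 1.0 / n
    return (w_s - w_r) / (1.0 + (alpha * abs(head_m)) ** n) ** m + w_r

# Pressure heads (m) at which water retention was measured, as listed in the text.
heads_m = [-0.1, -0.5, -1.0, -2.0, -4.0, -9.0, -30.0, -150.0]

# Two hypothetical locations that differ only in log(alpha).
low_alpha = [vg_water_content(h, 0.35, math.exp(-2.5), 1.25) for h in heads_m]
high_alpha = [vg_water_content(h, 0.35, math.exp(-0.5), 1.25) for h in heads_m]

for h, low, high in zip(heads_m, low_alpha, high_alpha):
    print(f"h = {h:7.1f} m: W (low alpha) = {low:.3f}, W (high alpha) = {high:.3f}")
```

With all else equal, the location with the larger $\alpha$ retains less water at every pressure head in this sketch, which matches the qualitative reading of the maps given above.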

4. Discussion and conclusion

We proposed a spatial Bayesian hierarchical approach applied to the VG model. In the data analysis, we demonstrated the capability of predicting the hydraulic parameters at locations of interest as well as the corresponding water retention curves. We demonstrated that our approach could allow useful soil properties and physical or environmental factors to be used as covariates to correlate the hydraulic parameters of the VG model and to help the VG model to predict water retention curves. More importantly, this approach could account for the uncertainty of the hydraulic parameters of the VG model and water retention curve prediction.

Fig. 5. (a), (b), and (c) illustrate posterior mean of $W_s$ (in g g$^{-1}$), $\log(\alpha)$ (in $\log(\mathrm{m}^{-1})$), and $n$, respectively. (d), (e), and (f) depict posterior standard deviation of $W_s$, $\log(\alpha)$, and $n$, respectively. Red points are 75 observation locations. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 6. Predicted water retention on the prediction squared grid equally spaced at 100 m interval. (a), (b), and (c) illustrate posterior mean (in g g$^{-1}$) at pressure heads −0.5, −1 and −150 m, respectively. (d), (e), and (f) depict posterior standard deviation at pressure heads −0.5, −1 and −150 m, respectively. Red points are 75 observation locations. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

The data analysis raised several concerns about our approach. The first deals with the prior for parameters associated with the residual water content, $W_r$. Our preliminary data analysis suggested that $W_r$ was close to zero and so we fixed $W_r$ to zero, which is similar to the approach of Voltz and Goulard (1994). By giving vague priors (i.e., distributions with large variance), we were aware that, in some cases, simultaneously estimating both $W_s$ and $W_r$ might make the model unidentifiable because their related parameters compete to get information from data. When such cases occur, we suggest specifying subjective priors based on prior experience or expert knowledge, or modeling $W_s$ and $W_r$ together with additional conditions such as $W_s > W_r$. We think that the issue of how to appropriately estimate $W_s$ and $W_r$ simultaneously can be an important topic for future research because of the need to account for the joint uncertainty of $W_s$ and $W_r$.

The second concern is how to model or parametrize the covariance function of spatial effects. Although we considered an exponential covariance function in our analysis, any appropriately parametrized covariance functions could be considered to accommodate the data at hand. Also, parametrized covariance functions are not required to be identical for all the hydraulic parameters of the VG model. The parameters of a covariance function could be either estimated by specifying priors or treated as hyperparameters given fixed values. Treating them as unknowns and estimating them would be the first consideration because the uncertainty from them could be accounted for. Trial and error could be used to choose covariance functions and hyperparameters. Moreover, if the study covers a large area, anisotropic or nonstationary covariance structures should be considered.

The third and final concern arose through our consideration of soil type as a covariate to correlate the hydraulic parameters of the VG model in the data analysis. We found that this covariate was confounded with spatial effects explained by the covariance structure. We should be aware of spatial confounding while making statistical inferences according to a spatial model. Finally, although DIC is a widely used criterion for model selection in hierarchical Bayesian modeling, DIC could not select a relatively parsimonious spatial model in our data analysis. Because DIC favors more complicated models, it may not be a suitable criterion for selecting a model for prediction.

Overall, this work presented a spatial Bayesian hierarchical approach to estimate and predict the hydraulic parameters of the VG model and, further, to predict the corresponding water retention curves. This approach can account for uncertainty from various sources in the model. Finally, extending the process model to a nonlinear model will be the subject of our future research.

Acknowledgments

We would like to thank Dr. Marc Voltz from INRA Montpellier for providing the data. We would also like to thank Dr. Dan Gladish and three anonymous referees for their helpful comments and suggestions.

Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.jhydrol.2015.03.029.

References

Abbaspour, K.C., van Genuchten, M.T., Schulin, R., Schläppi, E., 1997. A sequential uncertainty domain inverse procedure for estimating subsurface flow and transport parameters. Water Resour. Res. 33, 1879–1892. http://dx.doi.org/10.1029/97WR01230, <http://onlinelibrary.wiley.com/doi/10.1029/97WR01230/abstract>.

Berliner, L.M., 1996. Hierarchical Bayesian time series models. In: Hanson, K.M., Silver, R.N. (Eds.), Maximum Entropy and Bayesian Methods. Kluwer Academic Publishers, <http://link.springer.com/chapter/10.1007/978-94-011-5430-7_3>.

Berry, S.M., Carlin, B.P., Lee, J.J., Müller, P., 2010. Bayesian Adaptive Methods for Clinical Trials. CRC Press.

Brooks, R.H., Corey, A.T., 1964. Hydraulic properties of porous media. Hydrology Paper No. 3, Civil Engineering Department, Colorado State University, Fort Collins, Colorado, USA.

Carlin, B.P., Louis, T.A., 2001. Bayes and Empirical Bayes Methods for Data Analysis. Chapman & Hall.

Carroll, S.S., Cressie, N., 1996. A comparison of geostatistical methodologies used to estimate snow water equivalent. Water Resour. Bull. 32, 267–278. http://dx.doi.org/10.1111/j.1752-1688.1996.tb03450.x, <http://onlinelibrary.wiley.com/doi/10.1111/j.1752-1688.1996.tb03450.x/abstract>.

Collis-George, N., 2014. A model for the interpretation of the experimental drainage moisture characteristic. Geoderma 213, 124–130. http://dx.doi.org/10.1016/j.geoderma.2013.07.032, <http://www.sciencedirect.com/science/article/pii/S0016706113002760>.

Cressie, N., Wikle, C.K., 2011. Statistics for Spatio-Temporal Data. John Wiley & Sons, New York.

Durner, W., 1994. Hydraulic conductivity estimation for soils with heterogeneous pore structure. Water Resour. Res. 30, 211–223. http://dx.doi.org/10.1029/93WR02676, <http://onlinelibrary.wiley.com/doi/10.1029/93WR02676/abstract>.

Gelman, A., 1996. Inference and monitoring convergence. In: Gilks, W., Richardson, S., Spiegelhalter, D. (Eds.), Markov Chain Monte Carlo in Practice. Chapman and Hall/CRC, pp. 131–143.

Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B., 2003. Bayesian Data Analysis. CRC Press.

Hodges, J.S., Reich, B.J., 2010. Adding spatially-correlated errors can mess up the fixed effect you love. Am. Stat. 64, 325–334. http://dx.doi.org/10.1198/tast.2010.10052, <http://www.tandfonline.com/doi/abs/10.1198/tast.2010.10052>.

Kosugi, K., 1996. Lognormal distribution model for unsaturated soil hydraulic properties. Water Resour. Res. 32, 2697–2703. http://dx.doi.org/10.1029/96WR01776, <http://onlinelibrary.wiley.com/doi/10.1029/96WR01776/abstract>.

Lesaffre, E., Lawson, A.B., 2012. Bayesian Biostatistics. John Wiley & Sons, New York.

Lunn, D.J., Thomas, A., Best, N., Spiegelhalter, D., 2000. WinBUGS – a Bayesian modelling framework: concept, structure, and extensibility. Stat. Comput. 10, 325–337. http://dx.doi.org/10.1023/A:1008929526011, <http://link.springer.com/article/10.1023/A:1008929526011>.

Minasny, B., McBratney, A.B., 2005. The Matérn function as a general model for soil variograms. Geoderma 128, 192–207. http://dx.doi.org/10.1016/j.geoderma.2005.04.003, <http://www.sciencedirect.com/science/article/pii/S0016706105000911>.

Minasny, B., Whelan, B.M., Triantafilis, J., McBratney, A.B., 2013. Pedometrics research in the vadose zone – review and perspectives. Vadose Zone J. 12, 1–20. http://dx.doi.org/10.2136/vzj2012.0141, <https://dl.sciencesocieties.org/publications/vzj/abstracts/12/4/vzj2012.0141>.

Mualem, Y., 1976. New model for predicting the hydraulic conductivity of unsaturated porous media. Water Resour. Res. 12, 513–522. http://dx.doi.org/10.1029/WR012i003p00513, <http://onlinelibrary.wiley.com/doi/10.1029/WR012i003p00513/abstract>.

R Core Team, 2014. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. <http://www.R-project.org/>.

Ribeiro Jr., P.J., Diggle, P.J., 2001. geoR: A package for geostatistical analysis. R-News 1, pp. 15–18. <http://cran.R-project.org/doc/Rnews>.

Saito, H., Seki, K., Šimunek, J., 2009. An alternative deterministic method for the spatial interpolation of water retention parameters. Hydrol. Earth Syst. Sci. 13, 453–465. http://dx.doi.org/10.5194/hess-13-453-2009, <http://www.hydrol-earth-syst-sci.net/13/453/2009/hess-13-453-2009.html>.

Seki, K., 2007. SWRC fit – a nonlinear fitting program with a water retention curve for soils having unimodal and bimodal pore structure. Hydrol. Earth Syst. Sci. Discuss. 4, 407–437, <http://hal.archives-ouvertes.fr/hal-00298817/>.

Spiegelhalter, D.J., Best, N.G., Carlin, B.P., van der Linde, A., 2002. Bayesian measures of model complexity and fit. J. R. Stat. Soc. B Met. 64, 583–639. http://dx.doi.org/10.1111/1467-9868.00353, <http://onlinelibrary.wiley.com/doi/10.1111/1467-9868.00353/abstract>.

Spiegelhalter, D.J., Best, N.G., Carlin, B.P., van der Linde, A., 2014. The deviance information criterion: 12 years on. J. R. Stat. Soc. B Met. http://dx.doi.org/10.1111/rssb.12062, <http://onlinelibrary.wiley.com/doi/10.1111/rssb.12062/abstract>.

Thomas, A., Best, N., Lunn, D., Arnold, R., Spiegelhalter, D., 2004. GeoBUGS Version 1.2 User Manual. MRC Biostatistics Unit.

van Genuchten, M.T., 1980. A closed-form equation for predicting the hydraulic conductivity of unsaturated soils. Soil Sci. Soc. Am. J. 44, 892–898.

van Genuchten, M.Th., Leij, F.J., Yates, S.R., 1991. The RETC code for quantifying the hydraulic functions of unsaturated soils. Report No. EPA/600/2-91/065, Robert S. Kerr Environmental Research Laboratory, US Environmental Protection Agency.

Vereecken, H., Weynants, M., Javaux, M., Pachepsky, Y., Schaap, M.G., van Genuchten, M.T., 2010. Using pedotransfer functions to estimate the van Genuchten-Mualem soil hydraulic properties: a review. Vadose Zone J. 9, 795–820. http://dx.doi.org/10.2136/vzj2010.0045, <https://dl.sciencesocieties.org/publications/vzj/abstracts/9/4/795>.

Vila, J.P., Wagner, V., Neveu, P., Voltz, M., Lagacherie, P., 1999. Neural network architecture selection: new Bayesian perspectives in predictive modelling: application to a soil hydrology problem. Ecol. Model. 120, 119–130. http://dx.doi.org/10.1016/S0304-3800(99)00096-4, <http://www.sciencedirect.com/science/article/pii/S0304380099000964>.

Voltz, M., Goulard, M., 1994. Spatial interpolation of soil moisture retention curves. Geoderma 62, 109–123. http://dx.doi.org/10.1016/0016-7061(94)90031-0, <http://www.sciencedirect.com/science/article/pii/0016706194900310>.

Vrugt, J.A., van Wijk, M.T., Hopmans, J.W., Šimunek, J., 2001. One, two, and three-dimensional root water uptake functions for transient modeling. Water Resour. Res. 37, 2457–2470. http://dx.doi.org/10.1029/2000WR000027, <http://onlinelibrary.wiley.com/doi/10.1029/2000WR000027/abstract>.

Vrugt, J.A., Bouten, W., Gupta, H.V., Hopmans, J.W., 2003. Toward improved identifiability of soil hydraulic parameters. Vadose Zone J. 2, 98–113. http://dx.doi.org/10.2136/vzj2003.9800, <https://dl.sciencesocieties.org/publications/vzj/abstracts/2/1/98>.

Weisberg, S., 1985. Applied Linear Regression, second ed. John Wiley & Sons, New York.

Wendroth, O., Koszinski, S., Pena-Yewtukhiv, E., 2006. Spatial association among soil hydraulic properties, soil texture, and geoelectrical resistivity. Vadose Zone J. 5, 341–355. http://dx.doi.org/10.2136/vzj2005.0026, <https://dl.sciencesocieties.org/publications/vzj/abstracts/5/1/341>.

Younes, A., Mara, T.A., Fajraoui, N., Lehmann, F., Belfort, B., Beydoun, H., 2013. Use of global sensitivity analysis to help assess unsaturated soil hydraulic parameters. Vadose Zone J. 12. http://dx.doi.org/10.2136/vzj2011.0150, <https://dl.sciencesocieties.org/publications/vzj/abstracts/12/1/vzj2011.0150>.