Hydrol. Earth Syst. Sci., 18, 3319–3339, 2014
www.hydrol-earth-syst-sci.net/18/3319/2014/
doi:10.5194/hess-18-3319-2014
© Author(s) 2014. CC Attribution 3.0 License.

Large-scale regionalization of water table depth in peatlands optimized for greenhouse gas emission upscaling

M. Bechtold¹, B. Tiemeyer¹, A. Laggner¹, T. Leppelt¹, E. Frahm¹,*, and S. Belting¹,²

¹ Thünen Institute of Climate-Smart Agriculture, Braunschweig, Germany
² Belting Umweltplanung, Quernheim, Germany
* now at: Physikalisch-Technische Bundesanstalt, Braunschweig, Germany

Correspondence to: M. Bechtold (michel.bechtold@ti.bund.de)

Received: 18 March 2014 – Published in Hydrol. Earth Syst. Sci. Discuss.: 7 April 2014
Revised: 2 July 2014 – Accepted: 20 July 2014 – Published: 1 September 2014

Abstract. Fluxes of the three main greenhouse gases (GHG) CO2, CH4 and N2O from peat and other soils with high organic carbon contents are strongly controlled by water table depth. Information about the spatial distribution of water level is thus a crucial input parameter when upscaling GHG emissions to large scales. Here, we investigate the potential of statistical modeling for the regionalization of water levels in organic soils when data covers only a small fraction of the peatlands of the final map. Our study area is Germany. Phreatic water level data from 53 peatlands in Germany were compiled in a new data set comprising 1094 dip wells and 7155 years of data. For each dip well, numerous possible predictor variables were determined using nationally available data sources, which included information about land cover, ditch network, protected areas, topography, peatland characteristics and climatic boundary conditions. We applied boosted regression trees to identify dependencies between predictor variables and dip-well-specific long-term annual mean water level (WL) as well as a transformed form (WLt). The latter was obtained by assuming a hypothetical GHG transfer function and is linearly related to GHG emissions. Our results demonstrate that model calibration on WLt is superior. It increases the explained variance of the water level in the sensitive range for GHG emissions and avoids model bias in subsequent GHG upscaling. The final model explained 45 % of WLt variance and was built on nine predictor variables that are based on information about land cover, peatland characteristics, drainage network, topography and climatic boundary conditions. Their individual effects on WLt and the observed parameter interactions provide insight into natural and anthropogenic boundary conditions that control water levels in organic soils. Our study also demonstrates that a large fraction of the observed WLt variance cannot be explained by nationally available predictor variables and that predictors with stronger WLt indication, relying, for example, on detailed water management maps and remote sensing products, are needed to substantially improve model predictive performance.

1 Introduction

Greenhouse gas (GHG) emissions from organic soils can be high compared to mineral soils. In Germany, the fraction of organic soils classified as peatland covers only 5 % of the land surface, but does account for 40 % of GHG emissions in the reporting categories “agriculture” and “land use, land use change and forestry” of the UN Framework Convention on Climate Change (UNFCCC) (UBA, 2012). Also, other organic soils with a lower soil organic carbon content (SOC) but still meeting the definition of organic soils according to IPCC (2006) are important sources of persistently high GHG emissions (Leiber-Sauheitl et al., 2014). In our study, we also consider these soils. For simplification, we will refer in the following to the total of peatlands and “other organic soils” as organic soils. Current estimates of GHG emissions from organic soils are fairly uncertain and reporting of most countries relies on IPCC default emission factors (EF) for CO2 emissions which are stratified for land use and climatic region, e.g., 10 t C ha⁻¹ year⁻¹ for arable land in the warm temperate zone.

Published by Copernicus Publications on behalf of the European Geosciences Union.



Artificial drainage turns the function of former natural peatlands from a C sink into a C source. Experimental work with organic soils during the last 2 decades showed that the aerated soil pore space above the water level is one of the key variables explaining the amount of CO2 emissions (Moore and Dalva, 1993). Frequently, the water level relative to soil surface (further simply referred to as “water level”, with negative values below ground) is used as proxy for air-filled porosity, given the simplicity and availability of water level measurements. Additionally, low water levels and oxygen availability are also key drivers of nitrous oxide (N2O) production in organic soils (Regina et al., 1996), which increases the relevance of organic soils for climate change mitigation policy. During anaerobic conditions, when water levels are at or above the land surface, substantial methane (CH4) emissions can occur (Levy et al., 2012).

It is postulated that the GHG budget – the sum of the CO2-equivalents of the three main greenhouse gases (CO2, N2O, CH4) – is at a minimum for annual mean water levels (annual mean further defined by the variable name WL) at about −0.05 to −0.1 m (Drösler et al., 2011). Following atmospheric sign convention, a positive sign stands for net emissions, while a negative sign indicates a net uptake of GHGs. Other parameters such as physical and chemical soil properties and vegetation also influence the amount of the emissions and thus weaken the relation between total GHG budget and WL.

If available, information about the spatial distribution of WL can identify GHG hot spot regions and improve the accuracy of the total GHG budgets at large scales. The application of transfer functions that relate GHG emissions to WL and potential other influencing site characteristics can refine the estimates derived from simple application of IPCC default EFs. However, in many countries and regions, as for example Germany and Europe, a map of WL in organic soils does not exist. The spatial availability of measured WL is much higher than that of measured GHG fluxes, which suggests the use of WL as scaling parameter for upscaling GHG emissions.

Several methods were applied in the past to produce WL maps. Their suitability is strongly related to data availability, which very often decreases in quality and spatial density with increasing scale of the study area. Spatially distributed process-based modeling (Thompson et al., 2009) and semi-physical statistical approaches (Bierkens and Stroet, 2007) are able to reproduce well the water level dynamics in wetland environments including peatlands. However, they heavily rely on spatial information about the system’s physical properties and boundary conditions (peat hydraulic properties, hydraulic conductivity of peat base, drainage system), data that is often only available with sufficient detail at a regional scale (Limpens et al., 2008). Despite this difficulty, there are studies in which process-based models were applied to model peatland water levels at a large scale (national or continental). Gong et al. (2012) adopted a common soil–vegetation–atmosphere transfer model to account for the differing hydrological processes in pristine fens, pristine bogs and drained peatlands and modeled water level fluctuations in boreal peatlands for all Finland. But calibration and validation with data from only three mires does not allow for conclusions about the accuracy and general applicability of the model. Numerous large-scale hydrological wetland models are often developed with a focus on delineating wetland extent (Melton et al., 2013). TOPMODEL-based schemes (Ju et al., 2006) and more advanced large-scale hydrologic frameworks (Fan and Miguez-Macho, 2011) are suited to model WL but do not account for anthropogenic drainage and thus are only applicable to pristine (or nearly pristine) peatland systems.

When detailed physical model input that is needed for a physically based approach is lacking, statistical or machine-learning tools represent a promising alternative (Finke et al., 2004). Potential predictor variables that are available at the final map scale are determined for each location with water level data, and the algorithm identifies dependencies between potential predictors and target variables such as WL or other statistical values that describe water level dynamics. For areas rich in water level data, e.g., the Netherlands, residuals of the statistical model can afterwards be analyzed for spatial correlation. If this is present, it can be used to correct for spatially correlated model bias by kriging. This scheme has been applied to agricultural areas by Finke et al. (2004) and to nature conservation areas by Hoogland et al. (2010). Spatial interpolation approaches can include ancillary data such as mapped geophysical parameters (Buchanan and Triantafilis, 2009). Statistical approaches strongly rely on both the quantity and quality of the data on the target variable itself, i.e., the water level data. An important quality criterion for water level data from organic soils is the measurement depth. It is crucial that there is little or no hydraulic resistance by a low conductive layer between the perforated part of the monitoring well and the fluctuating water level. If hydraulic resistance is too high, the monitoring well acts as a piezometer and water levels may substantially differ from the actual phreatic level, as shown for peatlands by van der Gaast et al. (2009). If such piezometer data is part of a data set and interpreted as phreatic water level data during model calibration, this can lead to an under- or overestimation of predicted water levels in organic soils. An underestimation of water level predictions (too dry) is discussed for Dutch modeling studies in van der Gaast et al. (2009).

At present, in Germany a map of water levels in organic soils that could be used for GHG upscaling is lacking. This fact and current efforts on improving GHG emission estimates for German organic soils were the main drivers for our study. Thus, the major goal of this study was the development of a model concept that produces a water level map at the scale of all organic soils in Germany that is specifically optimized for water level ranges to which GHG emissions react sensitively. We emphasize that the objective of our study was to regionalize annual mean water levels and not the GHG emissions themselves. The latter are influenced by more site characteristics, in particular soil properties. Furthermore, we suppose that annual mean water level is probably not the only or optimal statistical measure to describe the water level effect on annual GHG emissions. However, we are not aware of well-established information about transfer functions that relate more complex statistical measures of water level dynamics to GHG emissions. Therefore, we here focused on the simple and frequently applied “annual mean water level”.

As a first step, we compiled a new data set of phreatic water level time series of organic soils with contributions from numerous data sources. Based on this data, we developed a modeling approach for the annual mean water level that follows the basic idea of the statistical regionalization presented in Finke et al. (2004). However, the data coverage in our study substantially differed from their study. Our data covers only a small fraction of the peatlands of the final map and spatial interpolation of residuals was not possible. We thus extended their approach by

– including additional possible predictor variables,

– using boosted regression trees as a modeling tool to identify the influence of both numerical and categorical variables simultaneously,

– applying a new weighting scheme that balances out heterogeneous water level data sets with highly variable spatial data density,

– transforming the annual mean water level WL into a transformed annual mean water level WLt that shows a linear relationship with the GHG budget and optimizes model calibration for the WL range relevant for GHG emissions, and

– restricting the water level regionalization to phreatic water levels of organic soils.

We present a detailed analysis of the influence of the individual predictor variables on water levels of organic soils as well as their interactions. Furthermore, the manuscript includes the estimation of model uncertainty and possible paths of future model improvement. Finally, the calibrated model is used to derive a map of WLt for all organic soils in Germany and the regionalization results are presented.

2 Data set and methods

2.1 Data set of phreatic water levels in organic soils

Available data of phreatic water levels in organic soils are scarce. In contrast to data of rather deeply drilled observation wells of official groundwater monitoring networks, short peatland observation wells of only 1 or 2 meters length that measure the phreatic water level of the peat layer are currently not collected in central data management systems in Germany or any of its federal states. With a comprehensive questionnaire started in 2011, we collected water level time series of organic soils from local agencies, non-governmental organizations, universities, consultants and other sources, and combined this data with water level data from our projects. Time series included manual and automatic measurements. Years with fewer than six measurements or data gaps of more than 3 months were excluded. Water level time series of each dip well were visually checked for plausible dynamics by comparing with data from neighboring dip wells and weather data time series. Based on auxiliary data and local knowledge, we further identified dip wells that reached down to the underlying aquifer. If dip wells failed these quality checks, they were removed from the data set.

Figure 1. Locations of the 1094 dip wells of the data set. Base map (geological map 1 : 200 000, BGR) shows the distribution of bog and fen peat and other organic soils.
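The two exclusion rules for individual years can be sketched in a few lines. The following Python snippet is only a minimal illustration and not the procedure actually used to build the data set: the function name `year_is_usable`, the 92-day stand-in for "3 months", and the decision to also count gaps at the start and end of the calendar year are assumptions.

```python
from datetime import date


def year_is_usable(dates, min_count=6, max_gap_days=92):
    """Return True if one calendar year of measurement dates passes both checks.

    dates: datetime.date objects that all fall in the same calendar year.
    max_gap_days=92 is an assumed approximation of "3 months".
    """
    if len(dates) < min_count:
        return False  # fewer than six measurements in this year
    ordered = sorted(dates)
    year = ordered[0].year
    # Assumption: gaps before the first and after the last reading count as well.
    points = [date(year, 1, 1)] + ordered + [date(year, 12, 31)]
    gaps = [(later - earlier).days for earlier, later in zip(points, points[1:])]
    return max(gaps) <= max_gap_days
```

A year with one manual reading per month would pass, whereas six readings clustered in January would be rejected because of the long gap that follows.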

The final data set comprised 7155 years of data from 53 German peatlands and 1094 dip wells. On average, time series ranged over 7 years. All time series were collected at some period between 1988 and 2012. Data are well distributed over most of the German peatland regions and cover the three major types of organic soils (Fig. 1). Compared with the distribution of the types of organic soils in Germany, the fraction of dip wells on bogs is overrepresented in the data set by a factor of 2.5, while dip wells on fens and other organic soils are slightly underrepresented. Data also cover the common land-use types (for data sources, see Table 1). However, dip wells on organic soils that are not used for agriculture, forestry or peat mining, further referred to as “unused peatlands”, are overrepresented in the data set by a factor of 6, as data was collected more frequently and in higher spatial data density in the frame of conservation projects. The fraction of unused peatlands of the German organic soils is 6 % and the fraction in the data set is 36 %. In contrast, dip wells on arable land are underrepresented in the data set by a factor of 6. The fraction of arable land on German organic soils is 24 % and the fraction in the data set is 4 %. The other two key land-use types of organic soils in Germany, grassland and forest, are well represented in the data set. The imbalance of the land-use types in the data set is accounted for in the weighting of data (see Sect. 2.3.2).
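The over- and underrepresentation factors quoted above follow directly from the two pairs of percentages. The short Python check below reproduces that arithmetic; the function name `representation_factor` is hypothetical, and the snippet does not implement the weighting scheme of Sect. 2.3.2.

```python
def representation_factor(share_in_data_set, share_in_germany):
    """Ratio > 1 means overrepresented in the data set, < 1 underrepresented."""
    return share_in_data_set / share_in_germany


# Percentages as stated in the text.
unused = representation_factor(36.0, 6.0)   # unused peatlands
arable = representation_factor(4.0, 24.0)   # arable land
print(unused, 1.0 / arable)
```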

If land use changed within the measurement period of a dip well, the time series was split at the moment when the land-use record indicates the transition. For each segment, the annual mean water level WL (here with negative values defined as water levels below ground) was calculated as the multi-year average value over the whole measurement period of the specific land use.
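The splitting step can be illustrated with a small Python sketch. It assumes that annual mean water levels have already been computed per year and that each year carries a single land-use label; both are simplifications, and the function name `long_term_wl_by_land_use` is hypothetical. A dip well that returns to an earlier land use would be pooled into one segment here, which may differ from the original processing.

```python
from collections import defaultdict
from statistics import mean


def long_term_wl_by_land_use(annual_records):
    """Average annual mean water levels separately for each land use.

    annual_records: (year, land_use, annual_mean_wl_m) tuples for one dip well,
    with negative water levels below ground.
    Returns {land_use: multi-year mean WL in m}.
    """
    by_land_use = defaultdict(list)
    for _year, land_use, wl in annual_records:
        by_land_use[land_use].append(wl)
    return {land_use: mean(values) for land_use, values in by_land_use.items()}
```

For a well with two years under arable use (−0.8 m, −0.6 m) followed by two years under grassland (−0.4 m, −0.2 m), the sketch yields one WL value of about −0.7 m and one of about −0.3 m.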

The primary application of the WL map produced in this study is for the upscaling of long-term GHG emissions, as emission reporting may only reflect anthropogenic effects but not interannual climatic effects. As GHG transfer functions are developed on annual data, their application requires both the long-term annual mean water level as well as its interannual variability. Due to the non-linear dependence of GHG emissions on WL, single years with extreme water levels can strongly influence long-term average GHG fluxes. This study is focused on the regionalization of the long-term annual mean water levels. For this objective, model building should be based on long-term water level time series to average out the effect of weather variation within a complete climatic period (commonly 30 years). The existing nationally available data on water level time series of organic soils, however, does not comprise a single time series with complete data coverage over the last 30 years. Due to the lack of sufficient long-term water level time series, we included all time series in the model building process. Average climatic boundary conditions (precipitation, reference evapotranspiration, water balance) of the specific measurement period of each dip well are part of the predictor variables (see Sect. 2.2) and thus are supposed to partly account for the effect of specific weather conditions on WL in case of short measurement periods.

2.2 Predictor variables

Spatial coverage of phreatic water level data of organic soils is too low to obtain WL maps by simple spatial interpolation (Fig. 1). Additional spatial data is needed as basis for regionalization. Ancillary information that covers fully or at least most of the extent of the final map is necessary. It can be used as predictor variables. A comprehensive set of variables (numerical and categorical) with potential indication for the hydrological condition of an organic soil was determined for each dip well (Fig. 2 and Table 1).

The predictor variables, which can partly be found also in Finke et al. (2004), can be divided into seven groups.

2.2.1 Land cover

As certain land use and vegetation require and reflect certain WL, such information can be used as an indicator for the average drainage level around the dip well. Land-use and vegetation information is based on the German Digital Landscape Model (ATKIS Basis-DLM), which is updated continuously by aerial photos as well as sporadic ground mapping and has a temporal accuracy of 3 months to 5 years. It is provided as fine-scaled polygons and represents the best uniform land cover information available in Germany. It contains information on primary land-use type, a few optional vegetation attributes and whether “wet soil” has been observed during mapping. As we noticed that the use of a large number of categorical variables lowers the performance of boosted regression trees, we further aggregated the three information types (i) land use, (ii) vegetation and (iii) wet soil into a set of nine combined land cover classes (Table 1). These land cover classes were a trade-off between fine differentiation and the number of replicates in each class. For grasslands, a “wet grassland” class was separated when grassland was overlaid with wet soil and/or tree or shrub vegetation, which may indicate a less intensive management. Forests overlaid with wet soil were separated as “wet forest”. Further, unused peatlands overlaid with wet soil and showing no coverage with tree attributes were characterized by higher water levels and were thus separated as “wet unused peatland”. The very few dip wells classified as open water (n = 2) and peat cutting (n = 5) were merged to the reed and arable land cover class, respectively. Land-use type and land cover class were extracted at the dip well (point extraction) and as fractions in various buffers around the dip well (Table 1). As using too many weak predictor variables lowers model performance and increases overfitting, the numerous land cover fractions were further aggregated into two classes: the fraction of dry (arable and grassland) and wet (reed, wet grassland, wet forest and wet unused peatland) land cover on organic soils. For the calculation of the fraction of dry land cover, we tested various factors for the reduction of the contribution of grassland compared to arable land, as the grassland class also includes wetter grasslands that could not be detected with the available land cover catalogue. A factor of 0.5 was an optimal value, which was then set fixed.
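The aggregation into a dry and a wet land cover fraction can be written down compactly. The Python sketch below is an illustration under assumptions: the class names are taken from the text, the dictionary layout and the function name `dry_wet_fractions` are hypothetical, and the input fractions are assumed to be area shares within one buffer.

```python
# Weight of each "dry" class; 0.5 is the grassland reduction factor from the text.
DRY_FACTORS = {"arable": 1.0, "grassland": 0.5}
WET_CLASSES = {"reed", "wet grassland", "wet forest", "wet unused peatland"}


def dry_wet_fractions(class_fractions):
    """class_fractions: {land cover class: area fraction within the buffer}."""
    dry = sum(DRY_FACTORS.get(name, 0.0) * share
              for name, share in class_fractions.items())
    wet = sum(share for name, share in class_fractions.items()
              if name in WET_CLASSES)
    return dry, wet
```

A buffer with 20 % arable land, 40 % grassland, 10 % reed, 10 % wet forest and 20 % forest would give a dry fraction of about 0.4 and a wet fraction of about 0.2.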


2.2.2 Drainage network

Locations of ditches that are included as lines in the Digital Landscape Model were used to obtain information about the drainage network. The total ditch length was calculated for various buffer sizes. Further, the distance to the next ditch was calculated for each dip well. A short distance to the next ditch may indicate either lower or higher water levels, depending on whether the ditches are used for drainage or already blocked and used for rewetting measures. Similarly, the indication of the total length of ditches is not unique. Therefore, we defined two different sets of ditch variables: a first set for which we calculated values for all land cover classes, and a second one for which we only calculated values for land cover classes for which ditches are undoubtedly used for drainage, i.e., arable and grassland.
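Both ditch variables are simple geometric quantities. The following Python sketch shows one way to compute them for ditches given as straight line segments in projected coordinates (metres). It is an illustration under these assumptions, not the GIS workflow used in the study, and all function names are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the position to its end points.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_next_ditch(well, ditches):
    """Smallest distance from the dip well to any ditch segment."""
    return min(point_segment_distance(well, a, b) for a, b in ditches)


def ditch_length_in_buffer(well, ditches, radius):
    """Total ditch length that lies inside a circular buffer around the well."""
    total = 0.0
    for (ax, ay), (bx, by) in ditches:
        dx, dy = bx - ax, by - ay
        fx, fy = ax - well[0], ay - well[1]
        qa = dx * dx + dy * dy
        if qa == 0.0:
            continue  # degenerate segment without length
        qb = 2.0 * (fx * dx + fy * dy)
        qc = fx * fx + fy * fy - radius * radius
        disc = qb * qb - 4.0 * qa * qc
        if disc <= 0.0:
            continue  # the line misses (or only touches) the circle
        root = math.sqrt(disc)
        # Segment parameters where the line enters and leaves the circle.
        t_in = max(0.0, (-qb - root) / (2.0 * qa))
        t_out = min(1.0, (-qb + root) / (2.0 * qa))
        if t_out > t_in:
            total += (t_out - t_in) * math.sqrt(qa)
    return total
```

Restricting the `ditches` list to segments on arable land and grassland would correspond to the second set of ditch variables described above.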

2.2.3 Peatland characteristics

The geological map of Germany (scale 1 : 200 000) defined the area for which WL predictions were modeled. It is also the basis for topological peatland predictor variables, i.e., the fraction of organic soils in different buffer sizes as well as the dip well distance to the edge of the peatland. Information about the peatland type and the substrate at the peat base is presented in more detail in a newly compiled raster map of organic soils (Roßkopf et al., 2014) and was thus extracted from this map. Peatland types were aggregated into five classes: lowland bog (North German Plains and Alpine Forelands), upland bog (Central Uplands and Alps), fen neighboring surface water, fen without neighboring surface water, and a class of “other organic soils” that do not fulfill the C content and thickness criteria to be classified as peatland. Substrates at the peat base included loose unconsolidated rock (alluvial sand and gravel deposits), consolidated rock (bedrock) and peat clay layer. The first type may indicate the occurrence of seepage (positive or negative), whereas the latter two types may indicate rather a hydraulic decoupling from the aquifer hydraulic head.

2.2.4 Climatic boundary conditions

Climatic boundary conditions directly influence water level. On the one hand, the typical long-term climatic boundary conditions may indicate the general vulnerability of peatlands in a specific region. On the other hand, given the different lengths of measurement periods of the time series in this study, climatic boundary condition predictor variables may account for the effect of a climatically wetter or drier measurement period compared to the long-term averages on the water level. Climatic boundary conditions were extracted from a 1 × 1 km raster from the German Weather Service. Annual, summer and winter precipitation, FAO56 Penman–Monteith reference evapotranspiration and climatic water balance (difference between precipitation and reference evapotranspiration) were determined for the individual measurement period of each dip well and as long-term averages (30 years).

2.2.5 Relative altitude

Relative altitude was calculated by subtracting the median altitude of various buffer sizes from the absolute altitude at each dip well in the digital elevation model (DEM). Relative altitude is expected to have two different indications depending on the applied buffer size: (i) in many peatlands, the former smooth peatland relief at the scale of approximately > 5 m has been disturbed due to peat cutting and differences in drainage and mineralization rate. As a consequence, the rather smooth phreatic surface often does not follow the uneven and patchy terrain. Relative altitude with respect to smaller buffer sizes (< 250 m) may therefore explain part of the WL variation, e.g., a dip well that is located at a surface much higher than the surroundings may indicate deeper water levels; (ii) for large buffer sizes (> 250 m), relative altitude indicates whether the peatland lies in a larger morphological depression or elevation and thus may indicate whether large-scale lateral inflow of water can be expected or not. Similar indication is provided by the topographic index (see below). The accuracy of relative altitude values depends on the resolution and accuracy of the DEM. The nation-wide available DEM is based on data sets of varying quality, which may lower the influence of this variable.
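The definition of relative altitude translates into a short raster operation. The Python sketch below assumes a DEM stored as a list of rows with square cells and a circular buffer measured between cell centres; these choices and the function name `relative_altitude` are illustrative assumptions.

```python
import math
from statistics import median


def relative_altitude(dem, row, col, cell_size, radius):
    """Elevation at (row, col) minus the median elevation within the buffer.

    dem: list of rows of elevations in m; cell_size and radius in m.
    """
    reach = int(radius // cell_size)  # number of cells to search in each direction
    values = []
    for r in range(max(0, row - reach), min(len(dem), row + reach + 1)):
        for c in range(max(0, col - reach), min(len(dem[0]), col + reach + 1)):
            if math.hypot(r - row, c - col) * cell_size <= radius:
                values.append(dem[r][c])
    return dem[row][col] - median(values)
```

A dip well on a 2 m high remnant peat bank surrounded by cut-over terrain would obtain a relative altitude of about +2 m for a small buffer, which, following the reasoning above, may indicate a deeper water level.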

2.2.6 Topographic wetness index

The topographic wetness index is a common wetness indicator used in hydrology (Beven and Kirkby, 1979). It is a combined measure of catchment area and slope at a given point and indicates the extent of flow accumulation. High values indicate wetter conditions. If calculated at larger scales, higher values may hint at the occurrence of positive seepage, i.e., upward flow of water from the aquifer. Topographic wetness index was calculated for various DEM resolutions using the GRASS 7 module r.watershed.
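In its commonly used form, the index is the natural logarithm of the specific catchment area divided by the tangent of the local slope, ln(a / tan β). The Python sketch below evaluates this expression for a single cell. It is a plain illustration of the formula and not a reimplementation of the GRASS module; the function name and the assumption that flow accumulation has already been computed are hypothetical.

```python
import math


def topographic_wetness_index(specific_catchment_area_m, slope_rad):
    """ln(a / tan(beta)) for one cell.

    specific_catchment_area_m: upslope area per unit contour width (m).
    slope_rad: local slope angle in radians; must be > 0, so flat cells
    need a small minimum slope before calling this function.
    """
    return math.log(specific_catchment_area_m / math.tan(slope_rad))
```

For the same catchment area, a flatter cell yields a higher index, in line with the statement that high values indicate wetter conditions.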

2.2.7 Protection status

The protection status of a peatland area may reflect hydrological conditions. Therefore, we checked for seven types of protection status at each dip well (see Table 1 for details).

2.3 Model building scheme

Model building was performed using boosted regression trees (BRT) implemented in the two R packages “gbm” (Ridgeway, 2013) and “dismo” (Hijmans, 2013). BRT is a machine-learning algorithm in which the final model is derived from the data. Functions that relate target to predictor variables are not predetermined, but freely developed. BRT is based on the decision (or regression) tree concept. In the decision tree concept, the parameter space is searched sequentially for the best split that results in the lowest model mean squared error. The mean responses of the groups that result from the various splits and correspond to certain parameter ranges represent the model. The common procedure is the growth of a large tree, which is subsequently simplified by dropping weak links that are identified with cross-validation. Growing only a single tree has several disadvantages, such as uneven functions that are very sensitive to the specific sample of the data. Therefore, ensemble techniques have been combined with the decision tree concept. These were first the development of multiple models by bootstrapping of the samples (bagging technique) and the random creation of subsets of predictors at each split (random forest technique). Later, with the “boosting” technique of BRT, a sequential procedure was developed in which data is re-weighted after each tree to increase emphasis on data that is poorly modeled by the existing collection of trees (Elith et al., 2008).
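The split search and the sequential character of boosting can be made concrete with a toy example. The Python sketch below is not the gbm implementation: it uses a single numeric predictor, trees with only one split, and fits each new tree to the residuals of the current ensemble, which for a squared-error loss plays the role of the increased emphasis on poorly modeled data. Function names, the number of trees and the learning rate are assumptions.

```python
def best_split(x, y):
    """Find the threshold on one numeric predictor with the lowest squared error.

    Requires at least two distinct values in x.
    Returns (threshold, mean response left, mean response right).
    """
    def sse(values):
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values)

    best = None
    levels = sorted(set(x))
    for lo, hi in zip(levels, levels[1:]):
        threshold = (lo + hi) / 2.0
        left = [yi for xi, yi in zip(x, y) if xi <= threshold]
        right = [yi for xi, yi in zip(x, y) if xi > threshold]
        error = sse(left) + sse(right)
        if best is None or error < best[0]:
            best = (error, threshold, sum(left) / len(left), sum(right) / len(right))
    return best[1:]


def boost_stumps(x, y, n_trees=50, learning_rate=0.1):
    """Fit one-split trees sequentially, each to the residuals of the ensemble."""
    prediction = [sum(y) / len(y)] * len(y)  # start from the overall mean
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, prediction)]
        threshold, left_mean, right_mean = best_split(x, residuals)
        prediction = [
            pi + learning_rate * (left_mean if xi <= threshold else right_mean)
            for xi, pi in zip(x, prediction)
        ]
    return prediction
```

The small learning rate means that each tree only corrects part of the remaining error, which is why many trees are combined in the final model.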

BRT modeling is increasingly applied in spatial modeling of species or numerical environmental variables (Elith et al., 2008; Martin et al., 2011), thereby often showing superior performance compared to other machine-learning algorithms. The increasing application of BRT is related to several of its favorable characteristics. The strength of this method lies in the ability to fit complex functional dependencies, including non-linear relationships and interactions between predictor variables. Based on its flexibility, BRT is invariant to monotonic transformations of predictors. Furthermore, BRT allows for missing values in the predictor variables; thus, predictor variable information does not necessarily need to fully cover the total map extent. The gbm package handles missing values in predictor variables by introducing surrogate splits. The mean target value belonging to the missing predictor values is attributed to these surrogate splits during model building. We observed that the contribution of a predictor variable to the final model decreases with an increasing number of missing values. This is intuitive, as target observations of missing predictor values are mostly supposed to scatter strongly. BRT is further fairly insensitive to outliers and allows estimating the relative contribution of each predictor variable to the model. Due to these characteristics, we expected BRT to be very well suited to the very heterogeneous data set of this study.

BRT model calibration is prone to overfitting, and there are various options to reduce this behavior. Due to the overfitting behavior, cross-validation is generally part of the model building process. However, cross-validation can be performed in several ways and, if performed carelessly, can lead to overly optimistic model performance (De’ath, 2007). Here, cross-validation was performed by leaving out whole peatland areas instead of a random set of dip wells. This represents a stricter cross-validation, and we noticed that it strongly reduced overfitting of the water level data and thus contributed to the development of a more robust model.
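Leaving out whole peatlands amounts to a grouped cross-validation, in which fold membership is assigned per peatland rather than per dip well. The following is a minimal sketch under that reading; the function name, the number of folds and the peatland labels are hypothetical and do not reproduce the fold handling of the dismo package.

```python
import random


def peatland_folds(peatland_ids, n_folds=5, seed=0):
    """Assign every dip well the fold of its peatland, so that no peatland
    is split between the calibration and the validation part of a fold."""
    unique = sorted(set(peatland_ids))
    rng = random.Random(seed)
    rng.shuffle(unique)
    fold_of = {p: k % n_folds for k, p in enumerate(unique)}
    return [fold_of[p] for p in peatland_ids]


# Hypothetical data set: ten dip wells in six peatlands (A to F).
wells = ["A", "A", "A", "B", "B", "C", "D", "D", "E", "F"]
folds = peatland_folds(wells, n_folds=3)
```

With random dip-well folds, spatially autocorrelated neighbors of a held-out well would remain in the calibration data, which is one way the performance estimate can become overly optimistic.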

Figure 2. Illustration of the predictor variables determined for each dip well based on available national maps (see Table 1).

Another option to avoid overfitting is to impose monotonic slopes on the effects of individual parameters, which can even lead to improved prediction performance (De’ath, 2007). For all our numerical variables, we expected monotonic slopes rather than optimum functions. To avoid predefining any expected direction, all numerical variables were added twice to the set of predictors, once constrained to a monotonic increase and once to a monotonic decrease. We let the model decide whether monotonic increase or decrease has higher predictive power.

Models were calibrated using a Gaussian response type aimed at minimizing deviance (squared error) (Ridgeway, 2013). In all calibration runs, we applied the gbm.step function of the dismo package, which assesses the optimal number of boosting trees using cross-validation. We tested various learning rates (0.001–0.01), bag fractions (0.1–0.8) and levels of tree complexity (3 to 7), i.e., the number of nodes in a tree. By trial and error, we determined the most effective algorithm parameters for our data set, being 0.005 for the learning rate, 0.6 for the bag fraction and 5 for the tree complexity.

The final BRT model building is commonly performed as a two-step procedure (Elith et al., 2008), which we basically also followed in our study:

i. In the first step, the whole set of predictor variables is used to calibrate a BRT model.

ii. In a second step, the number of parameters is reduced sequentially to avoid overfitting and to derive a more parsimonious model. We tracked predictive performance criteria during the simplification process. As various variables were calculated for different buffer sizes, our predictors included a large number of correlated variables. Correlation coefficients between predictor variables of > 0.7 are known to severely distort model estimation and subsequent prediction (Dormann et al., 2013). Thus, we performed this simplification process by first dropping those parameters with a correlation > 0.7 (either Pearson or Spearman type) to another parameter with a higher contribution (Clapcott et al., 2011). This ensured that two highly correlated parameters would not remain in the parameter set longer than the last parameter of another group of variables, which may contribute less than the two highly correlated parameters but provides extra information that is not covered by the other parameters. After all highly correlated parameters had been dropped, further parameters with low contribution were dropped progressively.

Table 1. Overview on predictor variables.

| Predictor variable | Variable name | Values | Point/buffers (m) | Data source |
| Land-use type | | Arable, grassland, forest, shrubs, peat-mining, unused peatland, swamp, open water | Point, 100, 500, 1000, 2500 | Digital Landscape Model¹ |
| Vegetation attributes (optional) | | Deciduous forest, mixed forest, coniferous forest, reed, shrubs, grass | Point | Digital Landscape Model¹ |
| “Wet soil observed” | | Yes, no | Point | Digital Landscape Model¹ |
| Combined land cover information (land-use type, veg. and wet soil attr.) | lc | Arable, grassland, wet grassland, deciduous (including mixed) forest, wet forest, coniferous forest, reed, unused peatland, wet unused peatland | Point, 100, 500, 1000, 2500 | Digital Landscape Model¹ |
| Dry land cover fraction | f_dry(X) | Arable + 0.5 × grassland on organic soil area; 0 to 1 | 100, 500, 1000, 2500 | Digital Landscape Model¹ |
| Wet land cover fraction | | Reed, wet grassland, wet forest and wet unused peatland on organic soil area; 0 to 1 | 100, 500, 1000, 2500 | Digital Landscape Model¹ |
| Total length of ditches for all lc and only for arable and grassland (subscr. “dry”) | di_len,dry(X) | ≥ 0 m | Point, 50, 250, 1000, 2500 | Digital Landscape Model¹ |
| Distance to next ditch | | ≥ 0 m | Point | Digital Landscape Model¹ |
| Peatland type | p_type | Lowland bog, upland bog, fen neighboring surface water, fen without neighboring surface water, other “low-C” organic soil | Point | Map of organic soils² |
| Material at peat base | p_base | Unconsolidated rock, peat clay layer, rock, no information | Point | Map of organic soils² |
| Peatland fraction | f_peat(X) | 0 to 1 | Point, 500, 1000, 2500 | Geological map (BGR)³ |
| Distance to edge of peatland | | > 0 m | | Geological map (BGR)³ |
| Ratio of d_peat/f_peat | | > 0 | 2500 | Geological map (BGR)³ |
| Precipitation | | ≥ 0 mm | Point | Raster map 1 × 1 km (DWD)⁴ |
| Evapotranspiration | | ≥ 0 mm | Point | Raster map 1 × 1 km (DWD)⁴ |
| Climatic water balance | wb_summer | < 0 and ≥ 0 mm | Point | Raster map 1 × 1 km (DWD)⁴ |
| Relative height | h_rel(X) | < 0 and ≥ 0 m | Point – median: 25, 50, 100, 250, 500, 1000 | Digital Elevation Model⁵ |
| Topographic index | ti_ras,R(X) | > 0 | Point and 1000 buffer for 10, 25, 250, 1000 raster values | Digital Elevation Model⁵ |
| Protection status | | Nature conservation area, special areas of conservation, special protection area for wild birds, UNESCO biosphere reserve, nature park, national park, landscape protection area | Point | Maps of protected areas⁶ |

¹ ATKIS Basis DLM (Federal Agency for Cartography and Geodesy, BKG); ² map of organic soils (Roßkopf et al., 2014, Humboldt University of Berlin); ³ geological map 1 : 200 000 (GUEK200, BGR – Federal Institute for Geosciences and Natural Resources); ⁴ raster map 1 × 1 km of weather data (German Weather Service); ⁵ BKG; ⁶ Federal Agency for Nature Conservation (BfN). Variable names are indicated for the nine variables in the final model, with (X) indicating buffer size and R indicating raster resolution.

Figure 3. Illustration of the annual mean water level (WL) transformation. (a) Hypothetical transfer function relating GHG budget to WL (m); (b) GHG budget vs. the transformed water level (WLt); (c) WLt vs. WL. The lines along the x axes indicate the data quantiles of the analyzed data set.

Predictor contributions are calculated as proportional contributions to the total error reduction and can be considered as a measure for the influence of the individual predictors. Additionally, a BRT model allows the derivation of partial dependence plots, which indicate how the response is affected by a certain predictor after accounting for the average effects of all other predictors in the model (Elith et al., 2008). These plots do not show the full effect of each parameter on the model response, due to interactions with other parameters that are fixed to derive these plots as well as due to parameter co-correlation. However, they can be used for interpreting model behavior (Elith et al., 2008).
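The correlation-based part of the simplification, which relies on these predictor contributions, can be sketched as follows. This is a simplified one-pass version under stated assumptions: contributions are treated as fixed rather than re-estimated after each drop, only the Pearson coefficient is checked (the study used Pearson or Spearman), and all predictor names, values and contributions are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def drop_correlated(columns, contribution, threshold=0.7):
    """Visit predictors from highest to lowest contribution and keep one only
    if |r| <= threshold to every higher-contribution predictor already kept."""
    kept = []
    for name in sorted(columns, key=lambda k: contribution[k], reverse=True):
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical predictors: the same land cover fraction in two buffer sizes
# (strongly correlated) and an unrelated climatic variable.
columns = {
    "f_dry_2500": [0.10, 0.40, 0.50, 0.80, 0.90],
    "f_dry_1000": [0.20, 0.35, 0.55, 0.75, 0.95],
    "precip_mm": [700.0, 650.0, 800.0, 640.0, 760.0],
}
contribution = {"f_dry_2500": 12.0, "f_dry_1000": 8.0, "precip_mm": 5.0}

kept = drop_correlated(columns, contribution)
```

In this example the 1000 m buffer variant is removed because it duplicates the 2500 m variant, while the weaker but independent climatic predictor survives, which mirrors the rationale given for step (ii).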

2.3.1 WLt transformation of WL

The map of water levels of this study was developed to improve the upscaling of greenhouse gas emissions from organic soils. Therefore, the final map should provide the highest accuracy for the water level range for which the highest differences of greenhouse gas emissions occur. This can be achieved by transforming WL into a transformed variable WLt, which shows a linear relationship with GHG emissions. The sensitivity of greenhouse gas emissions to water level has been analyzed in several laboratory and field experimental and monitoring studies (Berglund and Berglund, 2011; Drösler et al., 2011; Hahn-Schöfl et al., 2011; Leiber-Sauheitl et al., 2014; Moore and Roulet, 1993; Moore and Dalva, 1993; van den Akker et al., 2012). General trends are a strong increase of methane (CH4) emissions for annual mean water levels of approximately > −0.1 m and an increase of CO2 emissions for water levels < −0.1 m, with a trend similar to a saturation function that levels out approximately between −0.4 and −0.8 m (Fig. 3a). While studies agree on these general trends, the exact shape of the transfer function and the maximum levels of emissions, as well as their dependence on soil properties and other environmental parameters, are still controversial. Here, we assume a hypothetical transfer function relating the normalized GHG budget, ranging from 0 to 1, to the water level (see also Fig. 3):

$$
\text{GHG balance} =
\begin{cases}
-e^{3(\mathrm{WL}+0.1)} + 1, & \mathrm{WL} \le -0.1 \\
1 - e^{-3(\mathrm{WL}+0.1)}, & \mathrm{WL} > -0.1
\end{cases}
\tag{1}
$$

As the GHG budget can be positive for both low and high WL, we introduced the transformed water level WLt as (Fig. 3)

$$
\mathrm{WL}_t =
\begin{cases}
e^{3(\mathrm{WL}+0.1)} - 1, & \mathrm{WL} \le -0.1 \\
1 - e^{-3(\mathrm{WL}+0.1)}, & \mathrm{WL} > -0.1
\end{cases}
\tag{2}
$$

By calibrating the model to both WL and WLt, we test whether the optimization of WLt provides the highest model accuracy for the water level range relevant for GHG emissions, and whether it optimizes the map for application to GHG upscaling.
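The two transfer functions are simple enough to evaluate directly. The sketch below implements Eqs. (1) and (2) as read here, with the exponent factor 3 and the offset 0.1 m taken from the equations and WL given in meters relative to the ground surface (negative below ground); the function names are illustrative.

```python
import math


def ghg_budget(wl):
    """Hypothetical normalized GHG budget of Eq. (1), between 0 and 1."""
    if wl <= -0.1:
        return 1.0 - math.exp(3.0 * (wl + 0.1))   # CO2-dominated, dry side
    return 1.0 - math.exp(-3.0 * (wl + 0.1))      # CH4-dominated, wet side


def wl_t(wl):
    """Transformed water level of Eq. (2), between -1 (dry) and 1 (wet)."""
    if wl <= -0.1:
        return math.exp(3.0 * (wl + 0.1)) - 1.0
    return 1.0 - math.exp(-3.0 * (wl + 0.1))


# Evaluate both functions for a few water levels (m).
levels = [-1.0, -0.69, -0.4, -0.1, -0.01, 0.2]
transformed = [wl_t(w) for w in levels]
budgets = [ghg_budget(w) for w in levels]
```

By construction the budget equals the absolute value of WLt, so an error of a given size in WLt maps to the same error in the assumed GHG budget anywhere in the range, whereas the same error in WL matters far less at deep water levels.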

2.3.2 Weighting scheme

When considering possible data weighting schemes, it is worth emphasizing at this point that the goal of this study is the development of a statistical model that can explain the water level variability both within a peatland and among different peatlands. The data of target and predictor variables for building this model are highly heterogeneous. First, the target variable data set contains peatland areas that strongly differ in their spatial extent and in the number of installed dip wells. Second, the predictor variable data set contains categorical and numerical data, and part of the predictor variables predominantly vary from peatland to peatland (e.g., climatic boundary conditions, large-scale topographic wetness index, peatland characteristics), whereas others also show within-peatland variability (e.g., land use, small-scale topographic wetness index, drainage network). As the influence of the individual predictor variables on our target WLt is expected to be rather diffuse due to abundant interactions with other site characteristics, the robustness of derived dependencies will strongly depend on the number of different peatlands in the data set.

There are no universal data weighting rules for similarly heterogeneous data situations, and some degree of expert judgment and subjectivity is inevitably involved when developing an appropriate scheme (Francis, 2011). The need for introducing a data weighting scheme is obvious: without data weighting during calibration, too much influence would be given to small and well-studied peatlands, which would reduce predictive model performance for large, less-well-studied peatland areas. To avoid this in a simple manner, the weight of each dip well could be divided by the number of dip wells in its peatland, which results in each peatland being equally weighted. This scheme, however, does not sufficiently use the high information content provided by well-studied large peatlands, which should have a higher impact on model calibration than a small peatland with only few dip wells.

Here, we propose a new weighting scheme that takes into account both factors, peatland size and local density of dip wells, to derive dip-well-specific weighting factors. It is based on principles of data uncertainty reduction by repeated measurements and of geostatistics. First, we consider our data situation as an analogue of meta-analysis with grouped data. It has been shown for homogeneous problems (all data from the same population) that the optimal group weight for meta-analysis is 1/SE² (Hedges and Olkin, 1985), with SE being the standard error of each group:

$$
\mathrm{SE} = \frac{\sigma_e}{\sqrt{N}}
\tag{3}
$$

where σ_e is the error standard deviation of a measurement and N is the number of measurements in a group. For homogeneous problems and uniform σ_e, this results in weights that are linearly dependent on N, which we here call the first end member of weighting. Heterogeneity (within-group variance) reduces the variation of the group weights, which can be shown by random effect models (Cumming, 2012). At the second end member of weighting, when heterogeneity totally dominates within-group variance, optimal group weights are uniform for all groups, i.e., weights are independent of N. We are not aware of a method that allows the estimation of the degree of heterogeneity for the complex target and predictor data situation in this study, including data errors (spatial and temporal variability, measurement error) and model errors (missing parameters). As a trade-off between 1/SE² (homogeneous end member) and 1 (heterogeneous end member), we decided on a group weight that is the inverse of the standard error, 1/SE, i.e., a weight proportional to √N for uniform σ_e. This choice is, for example, often used in econometric studies (Dickens, 1990). We emphasize that this is a subjective decision.

Figure 4. Sample semivariogram and fitted semivariogram model of the annual mean water level data, WL.

The group weight 1/SE is the basis for the geostatistical part of our weighting scheme. There are two reasons why we cannot directly treat our peatlands as groups. First, there is within-peatland variability, which is related to changing site characteristics. It is one objective of our study to describe this variability by statistical modeling. Thus, dip wells must be treated individually and data cannot be aggregated at the peatland level. Second, we expect the model to learn more when the same number of dip wells is installed in a larger peatland. In a small peatland, spatial autocorrelation between dip wells is higher, i.e., the information content is lower than for large peatlands. As a consequence of the first point, we do not aggregate and keep all dip wells in the target variable data set by attributing to each dip well the fraction 1/N of its group weight, so that the relative weights of the groups remain constant. As a consequence of the second point, we use principles of geostatistics in our weighting scheme. We replace the group size N (positive integer number) by the “statistical” group size n (positive continuous number, being > 1), which we derive from the spatial autocorrelation among the dip wells.

Therefore, we analyze the spatial autocorrelation structure of the data set. A single spherical variogram model was fitted to the sample variogram of all data (Fig. 4 in Sect. 3.1). Variogram models allow the differentiation of the total data variance (called “sill”) into a spatially uncorrelated variance (called “nugget”) and a spatially correlated variance (called “structural variance” and defined as sill − nugget) (Wackernagel, 2003). The variogram model allows the derivation, for any distance between two locations, of the average squared difference of values, here defined as γ. By definition, at distance 0 the average squared difference equals the nugget, and at distances greater than the “range” of spatial autocorrelation the average squared difference equals the sill. Accordingly, the autocorrelated fraction f of the average squared difference between two dip wells i and j is

$$
f_{ij} = \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}}
\tag{4}
$$

We now define the “statistical” group size n of each dip well i to be the sum of one plus the autocorrelated fractions f_ij of all dip wells that are within the range of spatial autocorrelation of i:

$$
n_i = 1 + \sum_{j=1}^{m} \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}}
\tag{5}
$$

According to the discussion above, dip-well-specific weights can then be calculated with

$$
w_i = \frac{1}{n_i\,\mathrm{SE}_i} = \frac{1}{\sigma_{e,i}\sqrt{n_i}}
\tag{6}
$$

where n_i is derived from Eq. (5). The equation shows that with increasing “statistical” group size n, i.e., with increasing spatial data density, the weight of an individual dip well is “down-weighted” to some degree, a behavior that corresponds to our initial intention to lower the influence of small peatlands compared to large ones. The error standard deviation σ_e is dependent on several factors, e.g., the length of the time series, the temporal measurement density and the microtopography around the dip well. For simplicity, we here assumed σ_e to be uniform for all dip wells, which simplifies Eq. (6) to w_i = 1/√n_i.

Only dip wells with the same land-use type were summed up with Eq. (5), which avoids down-weighting by dip wells that have different land-use types. The latter are mostly characterized by fairly different WLt and thus by rather low spatial autocorrelation to dip well i.

After spatial correlation had been accounted for, the sum of the weights of all dip wells of each land-use type was adjusted so that it corresponds to the fraction of this land-use type in Germany. This adjustment accounts for the overrepresentation in the data set of dip wells in unused peatlands and the underrepresentation of dip wells in arable land.
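The weighting scheme of Eqs. (4) to (6), including the land-use restriction and the final rescaling, can be sketched as follows. The sketch assumes the standard spherical semivariogram form and uses the nugget, sill and range reported in Sect. 3.1; σ_e is taken as uniform. The well coordinates and the land-use shares are hypothetical placeholders, not values from the study.

```python
import math

# Fitted variogram parameters as reported in Sect. 3.1 (m^2, m^2, m).
NUGGET, SILL, RANGE_M = 0.012, 0.09, 2700.0


def gamma_spherical(h):
    """Spherical semivariogram with nugget; equals the sill beyond the range."""
    if h >= RANGE_M:
        return SILL
    r = h / RANGE_M
    return NUGGET + (SILL - NUGGET) * (1.5 * r - 0.5 * r ** 3)


def dip_well_weights(wells):
    """wells: list of (x_m, y_m, land_use). Returns w_i = 1 / sqrt(n_i)."""
    weights = []
    for i, (xi, yi, lu_i) in enumerate(wells):
        n_i = 1.0
        for j, (xj, yj, lu_j) in enumerate(wells):
            if i == j or lu_i != lu_j:
                continue  # only wells of the same land-use type are summed
            h = math.hypot(xi - xj, yi - yj)
            if h < RANGE_M:
                n_i += (SILL - gamma_spherical(h)) / (SILL - NUGGET)  # Eq. (4)
        weights.append(1.0 / math.sqrt(n_i))  # Eq. (6) with uniform sigma_e
    return weights


def rescale_to_land_use_shares(wells, weights, shares):
    """Scale weights so that each land-use type sums to its assumed share."""
    totals = {}
    for (_, _, lu), w in zip(wells, weights):
        totals[lu] = totals.get(lu, 0.0) + w
    return [w * shares[lu] / totals[lu] for (_, _, lu), w in zip(wells, weights)]


# Hypothetical wells: two close grassland wells, one isolated grassland well,
# and one arable well next to the first pair.
wells = [(0.0, 0.0, "grassland"), (100.0, 0.0, "grassland"),
         (5000.0, 0.0, "grassland"), (50.0, 0.0, "arable")]
raw = dip_well_weights(wells)
shares = {"grassland": 0.7, "arable": 0.3}  # placeholder land-use fractions
final = rescale_to_land_use_shares(wells, raw, shares)
```

The two neighboring grassland wells share information and are each down-weighted, whereas the isolated grassland well and the arable well keep a raw weight of 1 before rescaling.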

2.3.3 Model performance criteria

Model fit and predictive performance after cross-validation were quantified by the weighted root mean square error:

$$
\mathrm{RMSE} = \sqrt{\frac{1}{\sum_{i=1}^{m} w_i} \sum_{i=1}^{m} \left( w_i \left( x_{o,i} - x_{s,i} \right)^2 \right)}
\tag{7}
$$

where m is the number of dip wells, x_o,i is the observed WL or WLt of dip well i, x_s,i is the simulated WL or WLt of dip well i, and w_i is the data weight of dip well i (see Sect. 2.3.2). We refer to the root mean square error of the predicted data of cross-validation as RMSEcv. Model performance was further quantified by Nash–Sutcliffe efficiency (NSE):

$$
\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{m} w_i \left( x_{o,i} - x_{s,i} \right)^2}{\sum_{i=1}^{m} w_i \left( x_{o,i} - \bar{x}_o \right)^2}
\tag{8}
$$

where x̄_o is the mean of all observed WL or WLt. NSE indicates how well observed vs. predicted values match the 1 : 1 line. It is a good overall indicator of predictive performance because it combines scatter and bias (common offset and/or slope difference from the 1 : 1 line) (Nash and Sutcliffe, 1970). Values greater than 0 signify a model that is better than the reference model based on the data mean. We refer to the NSE of the training data as NSEcal and of the predicted data of cross-validation as NSEcv.

Systematic errors were quantified by calculating the model bias, here defined as

$$
\mathrm{BIAS} = \sum_{i=1}^{m} \left( w_i\, x_{o,i} - w_i\, x_{s,i} \right)
\tag{9}
$$
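The three weighted criteria of Eqs. (7) to (9) can be computed together. The sketch below makes two assumptions that the text does not spell out: the observed mean in the NSE denominator is taken as the weighted mean, and the bias is divided by the sum of the weights, which is identical to Eq. (9) when the weights are normalized to sum to 1. The function name and the example values are hypothetical.

```python
import math


def weighted_metrics(obs, sim, w):
    """Return (RMSE, NSE, bias) for observed and simulated values with weights w."""
    w_sum = sum(w)
    # Assumption: weighted mean of the observations as the reference model.
    obs_mean = sum(wi * o for wi, o in zip(w, obs)) / w_sum
    sq_err = sum(wi * (o - s) ** 2 for wi, o, s in zip(w, obs, sim))
    rmse = math.sqrt(sq_err / w_sum)                                  # Eq. (7)
    nse = 1.0 - sq_err / sum(wi * (o - obs_mean) ** 2
                             for wi, o in zip(w, obs))                # Eq. (8)
    # Assumption: normalization by the weight sum (Eq. 9 with sum(w) = 1).
    bias = sum(wi * (o - s) for wi, o, s in zip(w, obs, sim)) / w_sum
    return rmse, nse, bias


# Hypothetical observed and simulated water levels (m) with equal weights.
obs = [-0.7, -0.4, -0.1]
sim = [-0.6, -0.4, -0.2]
rmse, nse, bias = weighted_metrics(obs, sim, [1.0, 1.0, 1.0])
```

With this sign convention, a positive bias means that the simulated values are on average lower than the observed ones.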

2.4 Model uncertainty and stability evaluation

Uncertainty of the model predictions was assessed by bootstrapping, cross-validation and residual analysis.

For the bootstrapping analysis, we followed the procedure of Leathwick et al. (2006). We estimated the confidence intervals around the predictions and the fitted functions by taking 1000 bootstrap samples of the 53 peatlands. The number of peatlands in each sample was equivalent to that of the data set, but peatlands were selected randomly with replacement. Using the predictor variables of the final model, a BRT model was fitted to each sample. Cross-validation was again performed on peatlands; thus, a peatland in the calibration data set was not part of the cross-validation data set, to avoid overly optimistic results. Variances of the predictions and of the fitted functions of the 1000 models were evaluated.
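Resampling whole peatlands with replacement is a cluster bootstrap. The sketch below shows only the resampling logic: the refitting of a BRT model to each sample is replaced by a trivial stand-in (the mean water level of the sampled wells), so the resulting spread illustrates the mechanics and is not a result of the study. All names and data values are hypothetical.

```python
import random
import statistics


def bootstrap_rows(peatland_ids, rng):
    """Draw as many peatlands as the data set contains, with replacement, and
    return the row indices of all their dip wells (repeats are kept)."""
    unique = sorted(set(peatland_ids))
    drawn = [rng.choice(unique) for _ in unique]
    rows = []
    for p in drawn:
        rows.extend(i for i, pid in enumerate(peatland_ids) if pid == p)
    return rows


# Hypothetical data set: eight dip wells in four peatlands.
peatland_ids = ["A", "A", "B", "B", "B", "C", "D", "D"]
wl = [-0.70, -0.60, -0.30, -0.40, -0.35, -0.10, -0.50, -0.45]

rng = random.Random(1)
boot_means = []
for _ in range(1000):
    rows = bootstrap_rows(peatland_ids, rng)
    # Stand-in for "fit a BRT model to this sample and predict".
    boot_means.append(statistics.fmean(wl[i] for i in rows))

spread = statistics.pstdev(boot_means)
```

Because wells of the same peatland always enter or leave a sample together, the spread reflects between-peatland variability rather than the (autocorrelated) variability among neighboring wells.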

If data sets are relatively small (e.g., n < 1000; De’ath, 2007), then the small size of the training and test data sets lowers model accuracy. Given the fairly small number of peatlands in the data set and the partly high spatial correlation of dip wells within these peatlands, we decided not to split the data set into a training and a test data set. Estimates of model accuracy can then be based on cross-validation, thereby making effective use of all the data (De’ath, 2007). The prediction uncertainty of the final model is estimated by the root mean square error of prediction (RMSEcv, see above) for each land cover class. After testing for near-normal distribution of the residuals, RMSEcv can be used to derive the 68 and 95 % confidence intervals of the predictions with RMSEcv and 2 × RMSEcv, respectively.


Finally, additional residual analysis was performed to evaluate whether the predictions are biased for different land cover classes or geographical regions.

2.5 Regionalization

In the final regionalization step, the predictor variables contributing to the final model were determined at a 25 × 25 m raster for all organic soils in Germany. Predictor variables were determined with the same map input that was used for model building. Land cover information, including information on ditches, was based on the data from the year 2012, and the climatic data was based on the average of the last 30 years. The fine spatial resolution of 25 × 25 m was not chosen to suggest a highly spatially accurate model. Rather, this fairly fine scale was necessary to map the relatively small-scale effects of the topography, land use and peatland geometry variables. The final model was then used to make a prediction for each of these raster cells.

3 Results and discussion

3.1 Spatial correlation structure of the data set

The variogram model fitted to the sample variogram provided a nugget (0.012 m², 0.11 m), a sill (0.09 m², 0.3 m) and a range of spatial correlation (2700 m) for our data set of WL (Fig. 4). The nugget represents the very small-scale soil hydraulic variability and micro-topography effects on WL (van der Ploeg et al., 2012) and measurement error, e.g., by differences in the determination of the ground surface and in the timing of the manual measurements. Furthermore, micro-topography (e.g., hummocks) and oscillating peat surfaces of wet peatlands pose a challenge for an accurate determination of both ground surface and water level. The water level time series in the data set were of different lengths and ranged from 1 to 20 years. Interannual variability of water levels can be large (e.g., Knotters and van Walsum, 1997). For simplicity, data were not harmonized in our analysis by extrapolating WL time series to a 30-year period using weather data. Thus, the nugget also includes errors that are introduced by dip wells with different measurement periods that are located within the range of spatial correlation. In consideration of these error sources, the fitted nugget of 0.11 m appears to be a realistic value. At 0.3 m, the fitted sill matched nearly perfectly with the standard deviation of the data (0.31 m), which indicates consistency between semivariogram model and data set. The fitted range of spatial correlation of 2700 m reflects both physical effects, i.e., the average range of lateral flow due to hydraulic gradients, and the effect of average land-use patterns in Germany on the spatial correlation of WL. Fitted values were used in the calculation of the dip-well-specific weights using Eq. (6).

3.2 Typical water levels for land-use types in German organic soils

The land cover classes are characterized by plausible mean and median water levels, which show consistent differences between each other (Table 2 and Fig. 5a). The mean values of arable land and grassland agree with what can be expected for their agronomic requirements, with slightly lower water levels for arable land. The high variability observed for both classes may be related to the variability of the efficiency of installed drainage systems, for example the presence and condition of tile drains and the depth of ditches. Grasslands can be managed with very variable intensity, which is partly reflected in different water levels. Figure 5a further shows that deciduous forests seem to dominate slightly drier organic soils compared to coniferous forests, which dominate under wetter conditions. A high variability of water levels is observed for the land cover class unused peatland. On the one hand, post peat-cutting topography increases the variability of WL over short distances. It probably contributes to the high variance observed for this class. On the other hand, this class comprises both rather dry unused peatlands and wetter peatlands in which re-wetting measures have already taken place but which do not yet show a wet soil attribute in the ATKIS Digital Landscape Model. This may also cause part of the variance observed in the grassland and forest land cover classes. All wet land cover classes (reed, wet grassland, wet forest and wet unused peatland) that were separated by wetness indication clearly show higher water levels, showing that the wetness attribute of the Digital Landscape Model is a useful attribute.

Figure 5b shows the transformed water level for all classes. It can be observed that the variances of the wetter land cover classes increase relative to the variances of the dry land cover classes. This is due to the highest sensitivity of GHG emissions in the wet range of water levels (> −0.5 m). Consequently, the rather high variance of WL for arable land corresponds to a rather low variance of WLt, i.e., to a rather low assumed effect of WL variability on the GHG budget.

3.3 BRT model calibration and validation: WL vs. WLt

In contrast to land cover class, the other predictor variables showed, if at all, only weak relations to WL and WLt when evaluating them with box plots, 2-D cross plots and simple correlation matrices. Here, we expected BRT to detect the strongest predictor interactions and to identify the most informative predictors.

After model calibration with all predictors, subsequent model simplification successively dropped those parameters with correlation > 0.7 and the lowest contribution. For both WL and WLt, model performance improved during this simplification. For WLt, the highest values of NSEcv of approximately 0.46 were achieved with 21 to 9 model parameters. The development of NSEcv for the last 50 parameters is shown in Fig. 6. Further elimination of parameters led to a pronounced decline of model performance. Similar behavior was observed for the calibration on WL. In favor of a more parsimonious model, we chose the model with the lowest number of parameters before the pronounced decline of model performance occurred. For the calibration on WLt, this corresponded to the model with the lowest number of parameters that still achieved NSEcv values of > 0.45 (Fig. 6). The final WLt model comprised nine predictor variables and the final WL model seven parameters. The percentages of parameter contributions to the final model and their individual influences are discussed for WLt in Sect. 3.4.

Figure 5. Water level relative to ground surface, WL (m), and transformed water level, WLt (−), by land cover class, illustrated as a weighted box plot. WLt = −1 corresponds to maximum CO2 emissions and WLt = 1 to maximum CH4 emissions. On the top horizontal axes, the number of dip wells in each class is indicated.

Figure 6. NSEcv as a function of the number of predictor variables used in the model of WLt during model simplification, shown for the last 50 parameter drops.

Table 3 summarizes the statistical performances of themodels calibrated on WL and WLt For both models NSEcalis considerably higher than NSEcv and shows the commonly

Table 2 Weighted mean and standard deviation of WL and WLtdata and of the WLt map presented in Sect 36 for the nine landcover classes

WL (m) WLt (minus)WLt (minus)

Meanplusmn sd Meanplusmn sd Map meanplusmn sd

Arable land minus069plusmn 030 minus076plusmn 017 minus066plusmn 022Deciduous f minus045plusmn 034 minus049plusmn 037 minus047plusmn 035Grassland minus044plusmn 029 minus052plusmn 032 minus049plusmn 030Unused peatl minus039plusmn 036 minus039plusmn 041 minus037plusmn 040Coniferous f minus036plusmn 036 minus037plusmn 037 minus046plusmn 035Wet unused peatl minus022plusmn 027 minus018plusmn 040 minus017plusmn 036Wet forest minus022plusmn 029 minus017plusmn 043 minus021plusmn 039Wet grassland minus010plusmn 014 minus000plusmn 031 minus015plusmn 039Reed minus001plusmn 017 020plusmn 029 minus006plusmn 032

observed overfitting behavior of BRT models. The different measures that we conducted to minimize overfitting (cross-validation on peatlands, restriction to monotonic responses, and model simplification including elimination of highly correlated variables) lowered the difference between NSEcal and NSEcv but could not totally avoid overfitting. NSEcv of the WLt model (0.453) indicates higher predictive model performance compared to the WL model (0.381). However, as the data ranges differ due to the transformation, this comparison may be misleading. Therefore, we transformed the predictions of the WL model to obtain WLt values from this model and equally calculated the performance criteria (Table 3, second column). Then, NSEcv is slightly increased (0.397) but does not achieve the values of the model that was calibrated on WLt. A better predictive model performance of the model calibrated on WLt is also visible for the RMSEcv values. The total RMSEcv as well as the RMSEcv values for the dry (WL < −0.3 m) and wet range (WL > −0.3 m) show slightly lower values for the WLt model compared to WLt values from the model calibrated on WL. Given our hypothetical transfer function (Fig. 3), in which the GHG budget is linearly related to WLt, the higher accuracy of WLt predictions directly corresponds to a higher accuracy of GHG budget predictions.

Table 3. Performance criteria of the different models; dry range defined as WL < −0.3 m and wet range as WL > −0.3 m.

                 WL (m)             WLt (−)            WLt (−)
                 (calibrated        (calibrated        (calibrated
                 on WL)             on WL)             on WLt)
NSEcal            0.627              0.559              0.642
NSEcv             0.381              0.397              0.453
RMSEcv            0.269              0.299              0.284
RMSEcv (dry)      0.284              0.263              0.259
RMSEcv (wet)      0.222              0.382              0.355
Bias             −0.003              0.083              0.002
Bias (dry)       −0.012              0.070              0.003
Bias (wet)        0.021              0.120              0.000
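The criteria listed in Table 3 can be sketched in a few lines of Python. This is a minimal, unweighted illustration: the function names, the toy values and the handling of WL = −0.3 m exactly (assigned to the wet range here) are assumptions made for the example, and the bias is taken as the mean residual with residual = observation − prediction, following the residual definition given for Fig. 10.

```python
import math

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squared deviations of obs."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst

def rmse(obs, pred):
    """Root mean square error of the predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def bias(obs, pred):
    """Mean residual, with residual = observation - prediction."""
    return sum(o - p for o, p in zip(obs, pred)) / len(obs)

def split_by_range(wl_obs, obs, pred, threshold=-0.3):
    """Split (obs, pred) into dry (WL < threshold) and wet (WL >= threshold)."""
    dry_obs, dry_pred, wet_obs, wet_pred = [], [], [], []
    for wl, o, p in zip(wl_obs, obs, pred):
        if wl < threshold:
            dry_obs.append(o)
            dry_pred.append(p)
        else:
            wet_obs.append(o)
            wet_pred.append(p)
    return (dry_obs, dry_pred), (wet_obs, wet_pred)

# Hypothetical toy values in metres, not data from the study.
wl_obs = [-0.8, -0.5, -0.2, 0.0]
wl_pred = [-0.7, -0.6, -0.1, -0.1]

dry, wet = split_by_range(wl_obs, wl_obs, wl_pred)
print("NSE:", round(nse(wl_obs, wl_pred), 3))
print("RMSE dry:", round(rmse(*dry), 3), "RMSE wet:", round(rmse(*wet), 3))
print("Bias:", round(bias(wl_obs, wl_pred), 3))
```

The same functions apply to WLt once observations and predictions have been transformed; the dry/wet split is always made on the untransformed WL.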

Superior model performance is also evident when evaluating model bias. Only when calibrating directly on WLt are the WLt predictions bias-free. Calibration on WL and subsequent transformation to WLt introduces a model bias towards systematically lower WLt values. In subsequent applications to GHG emission upscaling, lower WLt values would lead to an overestimation of CO2 emissions and to an underestimation of CH4 emissions.

3.4 Influence of predictor variables on WLt

Given the beneficial characteristics of the model calibrated on WLt for GHG upscaling, presentation and discussion of further model results is restricted to the WLt model.

The BRT method allows the analysis of the parameter contributions to and influences on the model (Elith et al., 2008) and thus may contribute to system understanding. The percentages of the contributions of the nine predictor variables to the final model ranged from 25.2 to 5.6 % (Fig. 7). Except for protection status, at least one parameter of each of the seven parameter groups contributed to the final model. All protection status information was dropped early during the simplification process due to low contribution, although WL showed slightly higher values for data from nature protection or special areas of conservation. However, other parameters seem to be able to fully compensate for the information that is lost by dropping this predictor.

Land cover class lc at the dip well was the parameter with the strongest contribution (25.2 %). It basically follows the trend illustrated in Fig. 5b. The bootstrap error, plotted as standard deviation (Fig. 7), shows the variation of this influence over the 1000 bootstrap models. A second land cover parameter, the fraction of dry land cover classes on organic soils in a buffer of 2500 m radius, fdry (2500), contributed 10.3 % to the model. The monotonic decrease of WLt with increasing fdry (2500) is plausible, as higher values reflect intensive land use in the surroundings of the dip well and thus indicate intensive artificial drainage. Together, both parameters contributed 35.5 %, and thus land cover represents the parameter group with the strongest model contribution.

Peatland characteristics are the second most important parameter group. The peatland type contributed 16 %. The model indicates that peatlands without any connection to surface water bodies (river or lake) and the class of other organic soils are characterized by lower WLt compared to the peatland types lowland bog, upland bog and fen neighboring surface water. As the class of other organic soils is generally expected to reflect lower water levels, and as surface water may have a stabilization effect on water levels of organic soils, the influence of the peatland type can be considered plausible. Besides peatland type, the substrate of the peat base contributes 5.6 %. Here, organic soils overlying peat clay layers (e.g., limnic sediments such as calcareous gyttja) or basement rock are characterized by higher WLt compared to organic soils overlying unconsolidated rock. This can be explained by the lower drainage resistance of unconsolidated rocks. This may cause an increased efficiency of anthropogenic drainage and/or a generally higher vulnerability to seepage losses. Finally, slightly lower WLt values are indicated by a high fraction of organic soils for the 500 m buffer, fpeat(500). This may reflect the higher land-use pressure on large peatlands compared to rather small peatlands, which tentatively are more easily preserved by nature protection efforts.

The remaining four parameter groups are represented in the model by only one parameter each. The third most influential parameter was the length of ditches on arable land and grassland for the 250 m buffer, dilendry (250). At first glance, it may be surprising that WLt values tend to be higher with increasing ditch density, as ditches are supposed to drain the water when land is used as arable land and grassland. The fact that the model identifies a rather strong effect in the opposite direction may be caused by incomplete information about the drainage network. There is no detailed information about the spatial distribution of tile drains. Based on expert knowledge, agricultural areas with a lower ditch density are more likely to have tile drains. As these drains, easily installed with a narrow drain spacing, are more effective at draining organic soils, low WLt values for arable land and grassland may be related to low ditch densities. Furthermore, ditches were originally dug at narrow spacing in especially wet areas of organic soils, but there is no information available on whether these ditches still function properly.

The parameters wbsummer, hrel and tiras25 all show expected trends. The model predicts higher WLt for increasing climatic water balance in the summer period (May to October), wbsummer, for dip wells located in depressions (low values of hrel), and for higher small-scale topographic wetness indices calculated on the 25 × 25 digital elevation model (tiras25).


Figure 7. Partial dependence plots for the predictor variables. For an explanation of variables, see Table 1. The y axes are on WLt scale and are centered around the mean WLt. Error bars and grey area indicate standard deviation of the response over 1000 bootstrap models. The relative contribution of each predictor is indicated as percentage. The lines along the x axes of each plot show the distribution of data across that variable in deciles.

The fact that all parameters show expected or explainable responses in the model corroborates the reliability of the calibrated WLt model. The standard deviation of the predictor responses, based on the bootstrap samples, shows the stability of the observed responses.

Further insight into model behavior can be obtained by analyzing parameter interactions. This is done by changing two parameters simultaneously while keeping mean values for all other parameters (Elith et al., 2008). Figure 8 shows the two strongest parameter interactions. Parameter wbsummer strongly interacts with ptype. The generally lower values of WLt of fens without surface water connection and other organic soils show a stronger dependency on the summer climatic water balance. While a summer climatic water balance of > −80 mm shows a rather weak effect on WLt for the wetter peatland types, the two drier peatland types still show a strong effect with increasing wbsummer. The trend for wbsummer > 130 mm for the dry peatland types is supported by seven different peatlands.
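The interaction analysis described above (vary two predictors over a grid, hold all others at their mean) can be sketched generically. The sketch below is only an illustration: `toy_model`, its coefficients and the predictor names `ptype_dry`, `wb_summer` and `h_rel` are hypothetical stand-ins for a fitted model and are not taken from the study.

```python
def interaction_grid(model, means, name_a, values_a, name_b, values_b):
    """Evaluate `model` on a grid of two predictors, others fixed at `means`."""
    grid = {}
    for a in values_a:
        for b in values_b:
            x = dict(means)      # copy so that the reference point stays intact
            x[name_a] = a
            x[name_b] = b
            grid[(a, b)] = model(x)
    return grid

def toy_model(x):
    """Hypothetical response: the slope on wb_summer depends on peatland type."""
    slope = 0.004 if x["ptype_dry"] else 0.001
    return -0.5 + slope * x["wb_summer"] + 0.1 * x["h_rel"]

# Reference point: all predictors at (assumed) mean values.
means = {"ptype_dry": False, "wb_summer": 0.0, "h_rel": 0.0}

grid = interaction_grid(
    toy_model, means,
    "ptype_dry", [False, True],
    "wb_summer", [0.0, 100.0],
)

# Effect of wb_summer within each peatland type; unequal effects = interaction.
effect_wet = grid[(False, 100.0)] - grid[(False, 0.0)]
effect_dry = grid[(True, 100.0)] - grid[(True, 0.0)]
print(round(effect_wet, 3), round(effect_dry, 3))
```

If the two effects were equal, the predictors would act additively; a difference between them is what an interaction plot such as Fig. 8 visualizes.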


Figure 8. Partial dependence plots representing the two strongest interactions in the model: (a) between ptype and wbsummer and (b) between pbase and fdry. Fitted WLt is plotted on the y axis, which is obtained after accounting for the average effect of all other predictor variables.

Another strong interaction is observed for pbase and fdry (2500). While a rather weak effect of the fraction of arable land and grassland is observed for organic soils overlying basement rock and peat clay layer, a strong effect is observed for organic soils overlying unconsolidated rock. This interaction reflects the higher lateral range of drainage effects for organic soils with little flow resistance at the peat base. In these organic soils, intensive land use lowers the water level over large areas.

3.5 Discussion of model uncertainty

Plotting observed vs. predicted WLt from cross-validation (Fig. 9) illustrates the rather large residual variance that cannot be explained by the model. As indicated by the higher RMSEcv for the wet range (Table 3), scatter increases with increasing WLt. Error bars in the y direction indicate data error derived from the nugget of the variogram. It is shown for a few data points as an example. Due to transformation, data error increases for higher WLt. Figure 9 demonstrates that the fraction of unexplainable variance related to data error is much higher for the wet than for the dry range. Bootstrap error, indicating the variation of the model predictions for 1000 bootstrap samples, is shown in the x direction for the same data points. Bootstrap error is lower than the data error for the wet range and slightly higher for the dry range.

Figure 9. Observed vs. predicted transformed annual mean water level (WLt) from cross-validation results. Error bars show selected data and bootstrap model errors as standard deviation. Data points are scaled by their weights.

Bootstrap errors demonstrate the sensitivity of model predictions to changes of the data set used for calibration. When a model possesses structural deficits, such as missing predictor variables, bootstrap errors should not be used to define confidence intervals for the model predictions. Figure 10 shows residuals from cross-validation and standard deviation of bootstrap predictions for all land cover classes. The residuals of each land cover class show near-normal distributions. For five of the nine land cover classes (wet forest, wet unused peatland, arable land, coniferous forest and reed), the Shapiro–Wilk test does not reject normality (p > 0.05). Figure 10a further indicates that residuals of each land cover class scatter fairly well around zero, indicating low bias for the various land cover classes. Land-cover-class-specific confidence intervals of model predictions can thus be derived from the RMSEcv of each land cover class, e.g., 2 × RMSEcv representing the 95 % confidence interval.
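The class-specific interval described above can be sketched as follows. The residual values and class labels in the example are made up for illustration, and the function names are assumptions; the factor of 2 mirrors the 2 × RMSEcv rule stated in the text, which presumes roughly normal, unbiased residuals within each class.

```python
import math

def rmse_by_class(records):
    """Return RMSE of cross-validation residuals per land cover class.

    `records` is an iterable of (land_cover_class, residual) pairs.
    """
    sums = {}
    for land_cover, residual in records:
        total, count = sums.get(land_cover, (0.0, 0))
        sums[land_cover] = (total + residual ** 2, count + 1)
    return {lc: math.sqrt(total / count) for lc, (total, count) in sums.items()}

def confidence_interval(prediction, rmse_cv, factor=2.0):
    """Approximate 95 % interval as prediction +/- factor * RMSEcv."""
    half_width = factor * rmse_cv
    return prediction - half_width, prediction + half_width

# Hypothetical residuals (observation - prediction), not data from the study.
records = [
    ("reed", 0.3), ("reed", -0.3),
    ("arable land", 0.1), ("arable land", -0.1),
]

class_rmse = rmse_by_class(records)
low, high = confidence_interval(0.0, class_rmse["reed"])
print(class_rmse, (round(low, 2), round(high, 2)))
```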

The prediction uncertainty derived from cross-validation is much higher than the bootstrap prediction uncertainty obtained from the bootstrap standard deviation (sd), with 2 × sd corresponding to the 95 % confidence interval (Fig. 10). The large difference between these values indicates that the model has structural deficits that can be attributed to several error sources.

Figure 10. (a) Residuals (observation − prediction) of WLt predictions and (b) standard deviation (sd) of bootstrap predictions shown for the nine land cover classes. In the upper part, the number of dip wells in each class is indicated.

i. Key influences on WLt are missing in the set of predictor variables. None of the predictor variables indicate whether and to what extent water level increase due to re-wetting measures took place in the last years. Wetness indicators (wet soil and/or vegetation attributes) that are obtained from the Digital Landscape Model probably react with a delay of several years. Thus, we expect the occurrence of several observed high WLt values that cannot be explained by any of the predictor variables.

ii. Small-scale topography that is not represented with sufficient detail and accuracy in the DEM may cause several predictions to strongly differ from what would be expected from the other predictor variables. A common example may be a dip well that is located on a narrow peat ridge, which remained after peat-cutting and is absent in the DEM, and that is situated in an area classified as wet soil by the Digital Landscape Model. Then, the model indicates a WLt that is much higher than the observed WLt, as for the observed value the reference surface was the surface of the peat ridge.

iii. Consistent information about tile drains is missing and only exists on the regional scale (Tetzlaff et al., 2009). At the national scale, however, there are no maps on tile drains. Tile drains are known to have a strong effect on WLt for arable land and grassland. As explained above, we expect parameter dilendry (250) to partially compensate for this missing information.

iv. Another source of prediction uncertainty may comprise inconsistent and erroneous land cover classification of the Digital Landscape Model due to the high degree of subjectivity for many of the attributes. Furthermore, the Digital Landscape Model may be out of date by as much as 5 years, which can cause time series with land-use change to be split at the wrong date and vegetation and wetness attributes to be not yet updated to the current conditions.

v. The water balance of fens strongly depends on the size and the hydraulic head of the groundwater catchment, i.e., of the aquifer underlying the peat layer. Unfortunately, there is no consistent map of hydraulic heads or groundwater catchments for all of Germany.

Figure 11. Residuals (observation − prediction) of WLt predictions for the three major geographical peatland regions of Germany. In the upper part, the number of dip wells in each class is indicated.

We checked model predictions for geographical bias. Geographical location was not one of the model parameters. However, the history and policy of land use on organic soils, current ditch water management and climate do show large-scale geographical trends. We divided our data set into the three major German peatland regions (NE, NW and S) and evaluated the model residuals (Fig. 11) to see whether our model is biased due to important missing geographical effects. A serious bias for any of the three major German peatland regions cannot be identified.


Figure 12. Map of predictions of transformed annual mean water level (WLt) for all German organic soils (a) and an enlarged map section (b). Probability distribution in (c) indicates the uncertainty of a specific point prediction for wet grassland as an example. Here, the predicted value is approximately WLt = 0, but note that wet grassland predictions do vary in space depending on the values of the other model parameters. The histogram shows the residuals from cross-validation for wet grassland, to which the probability distribution was fitted.

When applying calibrated statistical models during regionalization, it is important to check model behavior for extrapolation outside the range of the parameter space that is covered by the data upon which the model was built. BRT always extrapolates at a constant value from the most extreme environmental value in the training data. In contrast to other types of statistical models, e.g., generalized linear models, BRT does not continue the fitted trend beyond the last observation. Regarding the categorical variables, the data set covers all classes occurring in Germany with several peatlands. The data set also covers the major range of values occurring in Germany for the numerical predictor variables. Furthermore, Fig. 7 indicates that the constant values at which the model extrapolates the influence of the variables do not raise major concern for any extreme predictions outside the parameter range.
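The extrapolation behavior described above can be illustrated with a single regression stump, the simplest tree. This is a sketch on made-up data and not a boosted model: a BRT sums many such trees, and since each tree is constant beyond the outermost split, the sum is constant there as well, whereas a fitted line keeps following its slope. The helper names are assumptions for the example.

```python
def fit_stump(xs, ys):
    """Fit a one-split regression tree by minimizing the squared error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        sse = (sum((y - mean_left) ** 2 for y in left)
               + sum((y - mean_right) ** 2 for y in right))
        split = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or sse < best[0]:
            best = (sse, split, mean_left, mean_right)
    _, split, mean_left, mean_right = best
    return lambda x: mean_left if x < split else mean_right

def fit_line(xs, ys):
    """Ordinary least-squares line through the data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return lambda x: mean_y + slope * (x - mean_x)

# Hypothetical training data covering x = 0..5 only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

stump = fit_stump(xs, ys)
line = fit_line(xs, ys)

# Inside the training range vs. far outside it.
print(stump(5.0), stump(50.0))   # tree: same value at both points
print(line(5.0), line(50.0))     # line: continues the fitted trend
```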

3.6 Regionalization

The map of WLt resulting from the application of the fitted WLt model to all grid cells shows gradients at the regional scale (Fig. 12a). In the south of Germany, for example, a gradient from wet to dry can be observed for the pre-alpine upland bogs and the peatlands of the moraine plain. In the north of Germany, the map indicates that organic soils in the very NE are wetter than the rest. For the rest of the north, a slight gradient can be observed from less dry to dry from NW to E, which is mainly driven by the higher summer climatic water balance in the NW. As both categorical and numerical predictor variables also vary at the sub-regional scale, the resulting map also shows gradients within peatland areas, e.g., due to small-scale land-use and ditch density gradients and topography effects (Fig. 12b).

We calculated WLt averages of the land cover classes using the regionalized WLt from the map (Table 2, column 3). The given standard deviation comprises both the variability within a land cover class that is explained by the model as well as the uncertainty of each prediction. Resulting means and standard deviations slightly differ from the corresponding values of the data set. The land-cover-specific WLt values obtained from the map can be considered as being more representative, as the regionalization procedure is supposed to partly account for potential bias in the data set.

When applying this map and its predicted WLt values in subsequent GHG upscaling, it is crucial that model uncertainty is propagated properly. An example demonstrates the necessity of uncertainty propagation. For a grid cell classified as wet grassland, the probability distribution of WLt is shown based on a normal distribution that was fitted to the residuals of this land cover class (Fig. 12c). Without propagating the uncertainty, and when only translating the predicted WLt (possibly in combination with other parameters, e.g., soil properties) into a GHG budget, the GHG budget is strongly underestimated, as the WLt prediction is close to zero, indicating neither large CO2 nor CH4 emissions. When translating the full distribution of WLt into a GHG budget, the resulting GHG budget would be much higher, as the GHG budget increases on both sides of the predicted WLt.
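The wet grassland example can be made concrete with a small numerical sketch. Everything specific here is an assumption for illustration: `ghg_budget` is a hypothetical V-shaped stand-in that rises linearly on both sides of WLt = 0 (it is not the transfer function of Fig. 3), the budget is in arbitrary units, and the standard deviation of 0.3 is an invented value rather than a fitted one.

```python
import math
from statistics import NormalDist

def ghg_budget(wlt, slope_dry=1.0, slope_wet=1.0):
    """Hypothetical budget (arbitrary units) rising on both sides of WLt = 0."""
    return slope_dry * (-wlt) if wlt < 0 else slope_wet * wlt

def expected_budget(mean, sd, n=2001, width=6.0):
    """Expected budget under N(mean, sd), using the midpoint rule over +/- width * sd."""
    dist = NormalDist(mean, sd)
    lower = mean - width * sd
    step = 2 * width * sd / n
    total = 0.0
    for i in range(n):
        x = lower + (i + 0.5) * step
        total += ghg_budget(x) * dist.pdf(x) * step
    return total

predicted_wlt = 0.0   # point prediction close to zero, as in the example
assumed_sd = 0.3      # invented prediction uncertainty

point_budget = ghg_budget(predicted_wlt)
propagated_budget = expected_budget(predicted_wlt, assumed_sd)
print(point_budget, round(propagated_budget, 3))
```

For this symmetric stand-in, the propagated value equals sd × sqrt(2/π), so the point estimate of zero understates the expected budget by an amount that grows with the prediction uncertainty.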

3.7 Possible paths for model improvement

The model performance that is achieved by the statistical approach presented in our study raises the question whether collecting more WL data can improve model performance or whether the factor that is constraining the model performance is the limited strength of the nation-wide available predictor variables. To assess this question, additional “hold-out models” were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated, and model performance was evaluated with NSEcv. As expected, results indicate an increase of model performance with increasing number of peatlands used in the model building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when including many more peatlands into the data set. More promising would be the specific collection of more data on the weakly represented and/or important land cover classes arable land and grassland.
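The resampling scheme behind such a learning curve can be outlined as below. This is only scaffolding under stated assumptions: `evaluate` stands for whatever routine fits a model to the selected peatlands and returns its NSEcv, and `placeholder_score` is an invented formula used solely to make the sketch runnable; it does not reproduce any result of the study.

```python
import random
import statistics

def learning_curve(peatland_ids, sizes, n_repeats, evaluate, seed=0):
    """Mean and sd of a score over random peatland subsets of each size."""
    rng = random.Random(seed)
    curve = {}
    for size in sizes:
        scores = [evaluate(rng.sample(peatland_ids, size))
                  for _ in range(n_repeats)]
        curve[size] = (statistics.mean(scores), statistics.stdev(scores))
    return curve

def placeholder_score(selected_ids):
    """Invented stand-in for 'fit on these peatlands and return NSEcv'."""
    return 0.5 * (1.0 - 1.0 / len(selected_ids))

peatland_ids = list(range(53))     # 53 peatlands in the data set
sizes = [10, 20, 30, 40, 50]       # number of peatlands per hold-out model

curve = learning_curve(peatland_ids, sizes, n_repeats=500,
                       evaluate=placeholder_score)
for size in sizes:
    mean_score, sd_score = curve[size]
    print(size, round(mean_score, 3), round(sd_score, 3))
```

In a real application, `evaluate` would dominate the run time, since each call involves a full calibration and cross-validation.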

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nation-wide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that has been successfully conducted in Finke et al. (2004) using the transfer noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, e.g., spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± sd.

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data covers only a small fraction of peatlands of the final map and thus spatial interpolation is not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG-relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. With the analysis of their effect on the water level, it was possible to gain insight into natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions depend linearly. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions, and avoided model bias.

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show, however, that model improvement by increasing the number of WLt data seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively few data are available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, for example water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach is suited to application to any other country where similar data of target and predictor variables are available. It is important that the spatial resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if for specific regions detailed information on peat properties becomes available and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), Landkreis Gifhorn, NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support; Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support; Annette Freibauer for reviewing the paper draft; and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project “Organic soils” funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43, 5, 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirby, M.: A physically based variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De'ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz – Moornutzungsstrategien 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.


Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package “dismo”, version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada's forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8, 5, 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package “gbm”, version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.

Tetzlaff, B., Kuhr, P., and Wendland, F.: A New Method for Creating Maps of Artificially Drained Areas in Large River Basins Based on Aerial Photographs and Geodata, Irrig. Drain., 58, 569–585, doi:10.1002/Ird.426, 2009.

Thompson, J. R., Gavin, H., Refsgaard, A., Sorenson, H. R., and Gowing, D. J.: Modelling the hydrological impacts of climate change on UK lowland wet grassland, Wetl. Ecol. Manage., 17, 503–523, doi:10.1007/s11273-008-9127-1, 2009.

UBA: National Inventory Report for the German Greenhouse Gas Inventory 1990–2008, Submission under the United Nations Framework Convention on Climate Change and the Kyoto Protocol 2012, Dessau, Germany, 2012.

van den Akker, J. J. H., Jansen, P. C., Hendriks, R. F. A., Hoving, I., and Pleijter, M.: Submerged infiltration to halve subsidence and GHG emissions of agricultural peat soils, Proceedings of the 14th International Peat Congress, Stockholm, Sweden, 2012.

van der Gaast, J. W. J., Massop, H. T. L., and Vroon, H. R. J.: Actuele grondwaterstandsituatie in natuurgebieden: Een Pilotstudie, Wettelijke Onderzoekstaken Natuur & Milieu, WOt-rapport 94, Wageningen, 134 pp., 2009.


van der Ploeg, M. J., Appels, W. M., Cirkel, D. G., Oosterwoud, M. R., Witte, J. P. M., and van der Zee, S. E. A. T. M.: Microtopography as a Driving Mechanism for Ecohydrological Processes in Shallow Groundwater Systems, Vadose Zone J., 11, 52–62, doi:10.2136/Vzj2011.0098, 2012.

Wackernagel, H.: Multivariate Geostatistics, Springer, Berlin, Germany, 387 pp., 2003.


Artificial drainage turns the function of former natural peatlands from a C sink into a C source. Experimental work with organic soils during the last 2 decades showed that the aerated soil pore space above the water level is one of the key variables explaining the amount of CO2 emissions (Moore and Dalva, 1993). Frequently, the water level relative to soil surface (further simply referred to as “water level”, with negative values below ground) is used as proxy for air-filled porosity, given the simplicity and availability of water level measurements. Additionally, low water levels and oxygen availability are also key drivers of nitrous oxide (N2O) production in organic soils (Regina et al., 1996), which increases the relevance of organic soils for climate change mitigation policy. During anaerobic conditions, when water levels are at or above the land surface, substantial methane (CH4) emissions can occur (Levy et al., 2012).

It is postulated that the GHG budget – the sum of the CO2-equivalents of the three main greenhouse gases (CO2, N2O, CH4) – is at a minimum for annual mean water levels (annual mean further defined by the variable name WL) at about −0.05 to −0.1 m (Drösler et al., 2011). Following atmospheric sign convention, a positive sign stands for net emissions, while a negative sign indicates a net uptake of GHGs. Other parameters, such as physical and chemical soil properties and vegetation, also influence the amount of the emissions and thus weaken the relation between total GHG budget and WL.

If available, information about the spatial distribution of WL can identify GHG hot spot regions and improve the accuracy of the total GHG budgets at large scales. The application of transfer functions that relate GHG emissions to WL and potential other influencing site characteristics can refine the estimates derived from simple application of IPCC default EFs. However, in many countries and regions, as for example Germany and Europe, a map of WL in organic soils does not exist. The spatial availability of measured WL is much higher than that of measured GHG fluxes, which suggests the use of WL as scaling parameter for upscaling GHG emissions.

Several methods were applied in the past to produce WL maps. Their suitability is strongly related to data availability, which very often decreases in quality and spatial density with increasing scale of the study area. Spatially distributed process-based modeling (Thompson et al., 2009) and semi-physical statistical approaches (Bierkens and Stroet, 2007) are able to reproduce well the water level dynamics in wetland environments including peatlands. However, they heavily rely on spatial information about the system’s physical properties and boundary conditions (peat hydraulic properties, hydraulic conductivity of peat base, drainage system), data that is often only available with sufficient detail at a regional scale (Limpens et al., 2008). Despite this difficulty, there are studies in which process-based models were applied to model peatland water levels at a large scale (national or continental). Gong et al. (2012) adopted a common soil–vegetation–atmosphere transfer model to account for the differing hydrological processes in pristine fens, pristine bogs and drained peatlands and modeled water level fluctuations in boreal peatlands for all Finland. But calibration and validation with data from only three mires does not allow for conclusions about the accuracy and general applicability of the model. Numerous large-scale hydrological wetland models are often developed with a focus on delineating wetland extent (Melton et al., 2013). TOPMODEL-based schemes (Ju et al., 2006) and more advanced large-scale hydrologic frameworks (Fan and Miguez-Macho, 2011) are suited to model WL but do not account for anthropogenic drainage and thus are only applicable to pristine (or nearly pristine) peatland systems.

When detailed physical model input that is needed for a physically based approach is lacking, statistical or machine-learning tools represent a promising alternative (Finke et al., 2004). Potential predictor variables that are available at the final map scale are determined for each location with water level data, and the algorithm identifies dependencies between potential predictors and target variables such as WL or other statistical values that describe water level dynamics. For areas rich in water level data, e.g., the Netherlands, residuals of the statistical model can afterwards be analyzed for spatial correlation. If this is present, it can be used to correct for spatially correlated model bias by kriging. This scheme has been applied to agricultural areas by Finke et al. (2004) and to nature conservation areas by Hoogland et al. (2010). Spatial interpolation approaches can include ancillary data such as mapped geophysical parameters (Buchanan and Triantafilis, 2009). Statistical approaches strongly rely on both the quantity and quality of the data on the target variable itself, i.e., the water level data. An important quality criterion for water level data from organic soils is the measurement depth. It is crucial that there is little or no hydraulic resistance by a low conductive layer between the perforated part of the monitoring well and the fluctuating water level. If hydraulic resistance is too high, the monitoring well acts as a piezometer and water levels may substantially differ from the actual phreatic level, as shown for peatlands by van der Gaast et al. (2009). If such piezometer data is part of a data set and interpreted as phreatic water level data during model calibration, this can lead to an under- or overestimation of predicted water levels in organic soils. An underestimation of water level predictions (too dry) is discussed for Dutch modeling studies in van der Gaast et al. (2009).

At present, in Germany a map of water levels in organic soils that could be used for GHG upscaling is lacking. This fact and current efforts on improving GHG emission estimates for German organic soils were the main drivers for our study. Thus, the major goal of this study was the development of a model concept that produces a water level map at the scale of all organic soils in Germany that is specifically optimized for water level ranges to which GHG emissions react sensitively. We emphasize that the objective of our study was to regionalize annual mean water levels and not the GHG emissions themselves. The latter are influenced by more site characteristics, in particular soil properties. Furthermore, we suppose that annual mean water level is probably not the only or optimal statistical measure to describe the water level effect on annual GHG emissions. However, we are not aware of well-established information about transfer functions that relate more complex statistical measures of water level dynamics to GHG emissions. Therefore, we here focused on the simple and frequently applied “annual mean water level”.

As a first step, we compiled a new data set of phreatic water level time series of organic soils with contributions from numerous data sources. Based on this data, we developed a modeling approach for the annual mean water level that follows the basic idea of the statistical regionalization presented in Finke et al. (2004). However, the data coverage in our study substantially differed from their study. Our data covers only a small fraction of the peatlands of the final map, and spatial interpolation of residuals was not possible. We thus extended their approach by

– including additional possible predictor variables,

– using boosted regression trees as a modeling tool to identify the influence of both numerical and categorical variables simultaneously,

– applying a new weighting scheme that balances out heterogeneous water level data sets with highly variable spatial data density,

– transforming the annual mean water level WL into a transformed annual mean water level WLt that shows a linear relationship with the GHG budget and optimizes model calibration for the WL range relevant for GHG emissions, and

– restricting the water level regionalization to phreatic water levels of organic soils.

We present a detailed analysis of the influence of the individual predictor variables on water levels of organic soils as well as their interactions. Furthermore, the manuscript includes the estimation of model uncertainty and possible paths of future model improvement. Finally, the calibrated model is used to derive a map of WLt for all organic soils in Germany, and the regionalization results are presented.

2 Data set and methods

2.1 Data set of phreatic water levels in organic soils

Figure 1. Locations of the 1094 dip wells of the data set. Base map (geological map 1 : 200 000, BGR) shows the distribution of bog and fen peat and other organic soils.

Available data of phreatic water levels in organic soils are scarce. In contrast to data of rather deeply drilled observation wells of official groundwater monitoring networks, data from short peatland observation wells of only 1 or 2 m length that measure the phreatic water level of the peat layer are currently not collected in central data management systems in Germany or any of its federal states. With a comprehensive questionnaire started in 2011, we collected water level time series of organic soils from local agencies, non-governmental organizations, universities, consultants and other sources, and combined this data with water level data from our projects. Time series included manual and automatic measurements. Years with fewer than six measurements or data gaps of more than 3 months were excluded. Water level time series of each dip well were visually checked for plausible dynamics by comparing with data from neighboring dip wells and weather data time series. Based on auxiliary data and local knowledge, we further identified dip wells that reached down to the underlying aquifer. If dip wells failed these quality checks, they were removed from the data set.
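The two screening rules for individual years can be illustrated with a minimal sketch. It is not the procedure used to build the data set: the function name `usable_years`, the approximation of "3 months" as 92 days, and the choice to measure gaps from 1 January and to 31 December are all assumptions made for illustration.

```python
from datetime import date

MIN_MEASUREMENTS = 6  # from the text: years with fewer than six readings are dropped
MAX_GAP_DAYS = 92     # assumption: "3 months" approximated as 92 days


def usable_years(dates):
    """Return the sorted calendar years that pass both screening rules."""
    by_year = {}
    for d in dates:
        by_year.setdefault(d.year, []).append(d)

    kept = []
    for year, days in sorted(by_year.items()):
        if len(days) < MIN_MEASUREMENTS:
            continue
        # Assumption: gaps are measured between consecutive readings and
        # also from 1 January and to 31 December of the same year.
        points = [date(year, 1, 1)] + sorted(days) + [date(year, 12, 31)]
        longest_gap = max((b - a).days for a, b in zip(points, points[1:]))
        if longest_gap <= MAX_GAP_DAYS:
            kept.append(year)
    return kept
```

With monthly readings in one year and only five readings in the next, the function keeps the first year and drops the second. The visual plausibility check and the aquifer check described above rely on local knowledge and are not covered by this sketch.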

The final data set comprised 7155 years of data from 53 German peatlands and 1094 dip wells. On average, time series ranged over 7 years. All time series were collected at some period between 1988 and 2012. Data are well distributed over most of the German peatland regions and cover the three major types of organic soils (Fig. 1). Compared with the distribution of the types of organic soils in Germany, the fraction of dip wells on bogs is overrepresented in the data set by a factor of 2.5, while dip wells on fens and other organic soils are slightly underrepresented. Data also cover the common land-use types (for data sources see Table 1). However, dip wells on organic soils that are not used for agriculture, forestry or peat mining, further referred to as “unused peatlands”, are overrepresented in the data set by a factor of 6, as data was collected more frequently and in higher spatial data density in the frame of conservation projects. The fraction of unused peatlands of the German organic soils is 6 % and the fraction in the data set is 36 %. In contrast, dip wells on arable land are underrepresented in the data set by a factor of 6. The fraction of arable land on German organic soils is 24 % and the fraction in the data set is 4 %. The other two key land-use types of organic soils in Germany, grassland and forest, are well represented in the data set. The imbalance of the land-use types in the data set is accounted for in the weighting of data (see Sect. 2.3.2).

If land use changed within the measurement period of a dip well, the time series was split at the moment when the land-use record indicates the transition. For each segment, the annual mean water level WL (here with negative values defined as water levels below ground) was calculated as the multi-year average value over the whole measurement period of the specific land use.
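The splitting rule can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `long_term_wl` is hypothetical, land use is assumed to be recorded per calendar year, and WL is taken as the mean of the annual means within each segment (the text does not say whether annual means or individual readings were averaged).

```python
from statistics import mean


def long_term_wl(annual_means, land_use_by_year):
    """Split a dip-well record at land-use transitions.

    annual_means:     {year: annual mean water level in m, negative below ground}
    land_use_by_year: {year: land-use label}
    Returns one (land_use, first_year, last_year, WL) tuple per segment.
    """
    segments = []
    current = None
    for year in sorted(annual_means):
        use = land_use_by_year[year]
        if current is None or current["use"] != use:
            current = {"use": use, "years": []}  # a new segment starts here
            segments.append(current)
        current["years"].append(year)
    return [
        (s["use"], s["years"][0], s["years"][-1],
         mean(annual_means[y] for y in s["years"]))
        for s in segments
    ]
```

A well that was grassland for two years and unused peatland afterwards therefore contributes two WL values, one per land use.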

The primary application of the WL map produced in this study is for the upscaling of long-term GHG emissions, as emission reporting may only reflect anthropogenic effects but not interannual climatic effects. As GHG transfer functions are developed on annual data, their application requires both the long-term annual mean water level and its interannual variability. Due to the non-linear dependence of GHG emissions on WL, single years with extreme water levels can strongly influence long-term average GHG fluxes. This study is focused on the regionalization of the long-term annual mean water levels. For this objective, model building should be based on long-term water level time series to average out the effect of weather variation within a complete climatic period (commonly 30 years). The existing nationally available data on water level time series of organic soils, however, does not comprise a single time series with complete data coverage over the last 30 years. Due to the lack of sufficient long-term water level time series, we included all time series in the model building process. Average climatic boundary conditions (precipitation, reference evapotranspiration, water balance) of the specific measurement period of each dip well are part of the predictor variables (see Sect. 2.2) and thus are supposed to partly account for the effect of specific weather conditions on WL in case of short measurement periods.

2.2 Predictor variables

Spatial coverage of phreatic water level data of organic soils is too low to obtain WL maps by simple spatial interpolation (Fig. 1). Additional spatial data is needed as a basis for regionalization. Ancillary information that covers all, or at least most, of the extent of the final map is necessary and can be used as predictor variables. A comprehensive set of variables (numerical and categorical) with potential indication for the hydrological condition of an organic soil was determined for each dip well (Fig. 2 and Table 1).

The predictor variables, which can partly be found also in Finke et al. (2004), can be divided into seven groups.

2.2.1 Land cover

As certain land use and vegetation require and reflect certain WL, such information can be used as an indicator for the average drainage level around the dip well. Land-use and vegetation information is based on the German Digital Landscape Model (ATKIS Basis-DLM), which is updated continuously by aerial photos as well as sporadic ground mapping and has a temporal accuracy of 3 months to 5 years. It is provided as fine-scaled polygons and represents the best uniform land cover information available in Germany. It contains information on primary land-use type, few optional vegetation attributes and whether “wet soil” has been observed during mapping. As we noticed that the use of a large number of categorical variables lowers the performance of boosted regression trees, we further aggregated the three information types, (i) land use, (ii) vegetation and (iii) wet soil, into a set of nine combined land cover classes (Table 1). These land cover classes were a trade-off between fine differentiation and the number of replicates in each class. For grasslands, a “wet grassland” class was separated when grassland was overlaid with wet soil and/or tree or shrub vegetation, which may indicate a less intensive management. Forests overlaid with wet soil were separated as “wet forest”. Further, unused peatlands overlaid with wet soil and showing no coverage with tree attributes were characterized by higher water levels and were thus separated as “wet unused peatland”. The very few dip wells classified as open water (n = 2) and peat cutting (n = 5) were merged to the reed and arable land cover class, respectively. Land-use type and land cover class were extracted at the dip well (point extraction) and as fractions in various buffers around the dip well (Table 1). As using too many weak predictor variables lowers model performance and increases overfitting, the numerous land cover fractions were further aggregated into two classes: the fraction of dry (arable and grassland) and wet (reed, wet grassland, wet forest and wet unused peatland) land cover on organic soils. For the calculation of the fraction of dry land cover, we tested various factors for the reduction of the contribution of grassland compared to arable land, as the grassland class also includes wetter grasslands that could not be detected with the available land cover catalogue. A factor of 0.5 was an optimal value, which was then set fixed.
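The aggregation into a dry and a wet land cover fraction can be written down compactly. The following is a sketch only: the function names, the dictionary layout and the choice of the total organic soil area in the buffer as the denominator are assumptions; the grassland weight of 0.5 and the class lists come from the text.

```python
GRASSLAND_WEIGHT = 0.5  # from the text: grassland counts half as much as arable land
WET_CLASSES = ("reed", "wet grassland", "wet forest", "wet unused peatland")


def dry_fraction(area_by_class):
    """Weighted share of arable land and grassland on the organic soil area.

    area_by_class maps land cover class -> area on organic soil within one
    buffer (any consistent unit). Assumption: the denominator is the total
    organic soil area in the buffer.
    """
    total = sum(area_by_class.values())
    if total == 0:
        return None  # no organic soil in the buffer
    dry = (area_by_class.get("arable", 0.0)
           + GRASSLAND_WEIGHT * area_by_class.get("grassland", 0.0))
    return dry / total


def wet_fraction(area_by_class):
    """Share of the wet land cover classes on the organic soil area."""
    total = sum(area_by_class.values())
    if total == 0:
        return None
    return sum(area_by_class.get(c, 0.0) for c in WET_CLASSES) / total
```

For a buffer with 2 ha arable land, 4 ha grassland, 2 ha wet forest and 2 ha reed, the dry fraction is (2 + 0.5 × 4) / 10 = 0.4 and the wet fraction is 4 / 10 = 0.4.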


2.2.2 Drainage network

Locations of ditches that are included as lines in the Digital Landscape Model were used to obtain information about the drainage network. The total ditch length was calculated for various buffer sizes. Further, the distance to the next ditch was calculated for each dip well. A short distance to the next ditch may indicate either lower or higher water levels, depending on whether the ditches are used for drainage or already blocked and used for rewetting measures. Similarly, the indication of the total length of ditches is not unique. Therefore, we defined two different sets of ditch variables: a first set for which we calculated values for all land cover classes, and a second one for which we only calculated values for land cover classes for which ditches are undoubtedly used for drainage, i.e., arable and grassland.

2.2.3 Peatland characteristics

The geological map of Germany (scale 1 : 200 000) defined the area for which WL predictions were modeled. It is also the basis for topological peatland predictor variables, i.e., the fraction of organic soils in different buffer sizes as well as the dip well distance to the edge of the peatland. Information about the peatland type and the substrate at the peat base is presented in more detail in a newly compiled raster map of organic soils (Roßkopf et al., 2014) and was thus extracted from this map. Peatland types were aggregated into five classes: lowland bog (North German Plains and Alpine Forelands), upland bog (Central Uplands and Alps), fen neighboring surface water, fen without neighboring surface water, and a class of “other organic soils” that do not fulfill the C content and thickness criteria to be classified as peatland. Substrates at the peat base included loose unconsolidated rock (alluvial sand and gravel deposits), consolidated rock (bedrock) and peat clay layer. The first type may indicate the occurrence of seepage (positive or negative), whereas the latter two types may indicate rather a hydraulic decoupling from the aquifer hydraulic head.

2.2.4 Climatic boundary conditions

Climatic boundary conditions directly influence water level. On the one hand, the typical long-term climatic boundary conditions may indicate the general vulnerability of peatlands in a specific region. On the other hand, given the different lengths of measurement periods of the time series in this study, climatic boundary condition predictor variables may account for the effect of a climatically wetter or drier measurement period, compared to the long-term averages, on the water level. Climatic boundary conditions were extracted from a 1 × 1 km raster from the German Weather Service. Annual, summer and winter precipitation, FAO56 Penman–Monteith reference evapotranspiration and climatic water balance (difference between precipitation and reference evapotranspiration) were determined for the individual measurement period of each dip well and as long-term averages (30 years).

2.2.5 Relative altitude

Relative altitude was calculated by subtracting the median altitude of various buffer sizes from the absolute altitude at each dip well in the digital elevation model (DEM). Relative altitude is expected to have two different indications depending on the applied buffer size: (i) in many peatlands, the former smooth peatland relief at the scale of approximately > 5 m has been disturbed due to peat cutting and differences in drainage and mineralization rate. As a consequence, the rather smooth phreatic surface often does not follow the uneven and patchy terrain. Relative altitude with respect to smaller buffer sizes (< 250 m) may therefore explain part of the WL variation, e.g., a dip well that is located at a surface much higher than the surroundings may indicate deeper water levels; (ii) for large buffer sizes (> 250 m), relative altitude indicates whether the peatland lies in a larger morphological depression or elevation and thus may indicate whether large-scale lateral inflow of water can be expected or not. Similar indication is provided by the topographic index (see below). The accuracy of relative altitude values depends on the resolution and accuracy of the DEM. The nation-wide available DEM is based on data sets of varying quality, which may lower the influence of this variable.
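The definition of relative altitude translates directly into a few lines of code. The sketch below assumes a DEM stored as a list of rows on a regular grid, a circular buffer centred on the dip-well cell, and a median that includes the dip-well cell itself; these details, like the function name, are illustrative assumptions rather than the processing chain used in the study.

```python
from statistics import median


def relative_altitude(dem, row, col, radius_m, cell_size_m):
    """Altitude at (row, col) minus the median altitude within radius_m.

    dem is a list of rows with altitudes in m on a regular grid with
    square cells of edge length cell_size_m.
    """
    reach = int(radius_m // cell_size_m)
    values = []
    for i in range(max(0, row - reach), min(len(dem), row + reach + 1)):
        for j in range(max(0, col - reach), min(len(dem[0]), col + reach + 1)):
            # keep only cells whose centre lies inside the circular buffer
            if ((i - row) ** 2 + (j - col) ** 2) * cell_size_m ** 2 <= radius_m ** 2:
                values.append(dem[i][j])
    return dem[row][col] - median(values)
```

A positive value means that the dip well sits above its surroundings, which for small buffers may indicate deeper water levels as described above.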

2.2.6 Topographic wetness index

The topographic wetness index is a common wetness indicator used in hydrology (Beven and Kirkby, 1979). It is a combined measure of catchment area and slope at a given point and indicates the extent of flow accumulation. High values indicate wetter conditions. If calculated at larger scales, higher values may hint at the occurrence of positive seepage, i.e., upward flow of water from the aquifer. Topographic wetness index was calculated for various DEM resolutions using the GRASS 7 module r.watershed.
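In its common form, the topographic wetness index is ln(a / tan β), with a the upslope contributing area per unit contour width and β the local slope. The sketch below evaluates this expression for a single cell; it does not reproduce the flow routing of the GRASS module, and the function name and the lower slope bound (needed to avoid division by zero on flat peatland) are assumptions.

```python
import math


def topographic_wetness_index(specific_catchment_area_m, slope_rad,
                              min_slope_rad=1e-3):
    """Return ln(a / tan(beta)) for one cell.

    specific_catchment_area_m: upslope area per unit contour width (m)
    slope_rad:                 local slope angle in radians
    min_slope_rad:             assumed lower bound for flat cells
    """
    slope = max(slope_rad, min_slope_rad)
    return math.log(specific_catchment_area_m / math.tan(slope))
```

A larger contributing area or a flatter slope both raise the index, which matches the interpretation of high values as wetter conditions.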

2.2.7 Protection status

The protection status of a peatland area may reflect hydrological conditions. Therefore, we checked for seven protection statuses at each dip well (see Table 1 for details).

2.3 Model building scheme

Model building was performed using boosted regression trees (BRT) implemented in the two R packages “gbm” (Ridgeway, 2013) and “dismo” (Hijmans, 2013). BRT is a machine-learning algorithm in which the final model is derived from the data. Functions that relate target to predictor variables are not predetermined but freely developed. BRT is based on the decision (or regression) tree concept. In the decision tree concept, the parameter space is searched sequentially for the best split that results in the lowest model mean squared error. The mean responses of the groups that result from the various splits and correspond to certain parameter ranges represent the model. The common procedure is the growth of a large tree, which is subsequently simplified by dropping weak links that are identified with cross-validation. Growing only a single tree has several disadvantages, such as uneven functions that are very sensitive to the specific sample of the data. Therefore, ensemble techniques have been combined with the decision tree concept. These were first the development of multiple models by bootstrapping of the samples (bagging technique) and the random creation of subsets of predictors at each split (random forest technique). Later, with the “boosting” technique of BRT, a sequential procedure was developed in which data is re-weighted after each tree to increase emphasis on data that is poorly modeled by the existing collection of trees (Elith et al., 2008).
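The split search at the core of a regression tree can be illustrated for a single numerical predictor. The sketch below is a plain exhaustive search written for clarity, not the algorithm as implemented in the gbm package; the function name is hypothetical.

```python
from statistics import mean


def best_split(x, y):
    """Return (threshold, mse) of the single split on x that minimises the
    mean squared error when each side is predicted by its mean response."""
    pairs = sorted(zip(x, y))
    best_threshold, best_mse = None, float("inf")
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # identical predictor values cannot be separated
        left = [v for _, v in pairs[:k]]
        right = [v for _, v in pairs[k:]]
        m_left, m_right = mean(left), mean(right)
        sse = sum((v - m_left) ** 2 for v in left)
        sse += sum((v - m_right) ** 2 for v in right)
        mse = sse / len(pairs)
        if mse < best_mse:
            # place the threshold halfway between the neighbouring values
            best_threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
            best_mse = mse
    return best_threshold, best_mse
```

A tree is grown by applying this search recursively to the two resulting groups, and boosting then fits many such small trees in sequence to what the earlier trees leave unexplained.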

BRT modeling is increasingly applied in spatial modeling of species or numerical environmental variables (Elith et al., 2008; Martin et al., 2011), thereby often showing superior performance compared to other machine-learning algorithms. The increasing application of BRT is related to several of its favorable characteristics: the strength of this method lies in the ability to fit complex functional dependencies, including non-linear relationships and interactions between predictor variables. Based on its flexibility, BRT is invariant to monotonic transformations of predictors. Furthermore, BRT allows for missing values in the predictor variables; thus, predictor variable information does not necessarily need to fully cover the total map extent. The gbm package handles missing values in predictor variables by introducing surrogate splits. The mean target value belonging to the missing predictor values is attributed to these surrogate splits during model building. We observed that the contribution of a predictor variable to the final model decreases with an increasing number of missing values. This is intuitive, as target observations of missing predictor values are mostly supposed to scatter strongly. BRT is further fairly insensitive to outliers and allows estimating the relative contribution of each predictor variable to the model. Due to these characteristics, we expected BRT to be very well suited to the very heterogeneous data set of this study.

BRT model calibration is prone to overfitting, and there are various options to reduce this behavior. Due to the overfitting behavior, cross validation is generally part of the model building process. However, cross validation can be performed in several ways and, if performed carelessly, can lead to overly optimistic model performance (De’ath, 2007). Here, cross validation was performed by leaving out whole peatland areas instead of a random set of dip wells. This represents a stricter cross validation, and we noticed that it strongly reduced overfitting of the water level data and thus contributed to the development of a more robust model.
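The difference to random cross validation lies only in how the folds are formed. The following sketch assigns whole peatlands to folds so that no peatland contributes dip wells to both the calibration and the validation part; the function name, the round-robin assignment and the number of folds are assumptions, as the text does not specify how peatlands were grouped.

```python
def peatland_folds(peatland_ids, n_folds):
    """Build (train_indices, test_indices) pairs with whole peatlands held out.

    peatland_ids: one peatland identifier per dip well, in data set order.
    """
    unique = sorted(set(peatland_ids))
    # assumption: peatlands are assigned to folds round-robin
    fold_of = {pid: k % n_folds for k, pid in enumerate(unique)}
    folds = []
    for fold in range(n_folds):
        test = [i for i, pid in enumerate(peatland_ids) if fold_of[pid] == fold]
        train = [i for i, pid in enumerate(peatland_ids) if fold_of[pid] != fold]
        folds.append((train, test))
    return folds
```

Because dip wells within one peatland tend to resemble each other, holding out entire peatlands tests the model on conditions it has not seen, which is closer to the intended use of predicting water levels in unmonitored peatlands.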

Figure 2. Illustration of the predictor variables determined for each dip well based on available national maps (see Table 1).

Another option to avoid overfitting is to impose monotonic slopes on the effects of individual parameters, which can even lead to improved prediction performance (De’ath, 2007). For all our numerical variables, we expected monotonic slopes rather than optimum functions. To avoid predefining any expected direction, all numerical variables were added twice to the set of predictors, constraining the slope to a monotonic increase and decrease. We let the model decide whether monotonic increase or decrease has higher predictive power.
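The duplication of numerical predictors can be sketched as the construction of a column list with a matching constraint vector, where +1 stands for a monotonic increase, −1 for a monotonic decrease and 0 for no constraint. The function name and the `_inc`/`_dec` suffixes are hypothetical, and the way the constraint vector is passed on to the boosting software is not shown.

```python
def monotone_design(numeric_vars, categorical_vars):
    """Return (columns, constraints) with every numerical predictor entered
    twice: once constrained to increase (+1), once to decrease (-1).
    Categorical predictors are entered once without constraint (0)."""
    columns, constraints = [], []
    for name in numeric_vars:
        columns += [f"{name}_inc", f"{name}_dec"]  # hypothetical suffixes
        constraints += [1, -1]
    for name in categorical_vars:
        columns.append(name)
        constraints.append(0)
    return columns, constraints
```

Both copies of a variable hold the same values; the copy whose constraint contradicts the data gains little predictive power and can be dropped during model simplification.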

Models were calibrated using a Gaussian response type aimed at minimizing deviance (squared error) (Ridgeway, 2013). In all calibration runs, we applied the gbm.step function of the dismo package, which assesses the optimal number of boosting trees using cross validation. We tested various learning rates (0.001–0.01), bag fractions (0.1–0.8) and levels of tree complexity (3 to 7), i.e., the number of nodes in a tree. By trial and error, we determined the most effective algorithm parameters for our data set, being 0.005 for the learning rate, 0.6 for the bag fraction and 5 for the tree complexity.

The final BRT model building is commonly performed as a two-step procedure (Elith et al., 2008), which we basically also followed in our study:

i. In the first step, the whole set of predictor variables is used to calibrate a BRT model.

ii. In a second step, the number of parameters is reduced sequentially to avoid overfitting and to derive a more parsimonious model. We tracked predictive performance criteria during the simplification process. As various variables were calculated for different buffer sizes, our predictors included a large number of correlated variables. Correlation coefficients between predictor variables of > 0.7 are known to severely distort model estimation and subsequent prediction (Dormann


Table 1. Overview on predictor variables.

Predictor variable | Variable name | Values | Point, buffers (m) | Data source
Land-use type | | Arable, grassland, forest, shrubs, peat-mining, unused peatland, swamp, open water | Point, 100, 500, 1000, 2500 | Digital Landscape Model¹
Vegetation attributes (optional) | | Deciduous forest, mixed forest, coniferous forest, reed, shrubs, grass | Point | Digital Landscape Model¹
“Wet soil observed” | | Yes, no | Point | Digital Landscape Model¹
Combined land cover information (land-use type, veg. and wet soil attr.) | lc | Arable, grassland, wet grassland, deciduous including mixed forest, wet forest, coniferous forest, reed, unused peatland, wet unused peatland | Point, 100, 500, 1000, 2500 | Digital Landscape Model¹
Dry land cover fraction | f_dry(X) | Arable + 0.5 × grassland on organic soil area, 0 to 1 | 100, 500, 1000, 2500 | Digital Landscape Model¹
Wet land cover fraction | | Reed, wet grassland, wet forest and wet unused peatland on organic soil area, 0 to 1 | 100, 500, 1000, 2500 | Digital Landscape Model¹
Total length of ditches for all lc and only for arable and grassland (subscr. “dry”) | di_len,dry(X) | ≥ 0 m | Point, 50, 250, 1000, 2500 | Digital Landscape Model¹
Distance to next ditch | | ≥ 0 m | Point | Digital Landscape Model¹
Peatland type | ptype | Lowland bog, upland bog, fen neighboring surface water, fen without neighboring surface water, other “low-C” organic soil | Point | Map of organic soils²
Material at peat base | pbase | Unconsolidated rock, peat clay layer, rock, no information | Point | Map of organic soils²
Peatland fraction | f_peat(X) | 0 to 1 | Point, 500, 1000, 2500 | Geological map (BGR)³
Distance to edge of peatland | | > 0 m | | Geological map (BGR)³
Ratio of d_peat/f_peat | | > 0 | 2500 | Geological map (BGR)³
Precipitation | | ≥ 0 mm | Point | Raster map 1 × 1 km (DWD)⁴
Evapotranspiration | | ≥ 0 mm | Point | Raster map 1 × 1 km (DWD)⁴
Climatic water balance | wb_summer | < 0 and ≥ 0 mm | Point | Raster map 1 × 1 km (DWD)⁴
Relative height | h_rel(X) | < 0 and ≥ 0 m | Point – median (25, 50, 100, 250, 500, 1000) | Digital Elevation Model⁵
Topographic index | ti_rasR(X) | > 0 | Point and 1000 buffer for 10, 25, 250, 1000 raster values | Digital Elevation Model⁵
Protection status | | Nature conservation area, special areas of conservation, special protection area for wild birds, UNESCO biosphere reserve, nature park, national park, landscape protection area | Point | Maps of protected areas⁶

¹ ATKIS Basis DLM, Federal Agency for Cartography and Geodesy (BKG); ² map of organic soils (Roßkopf et al., 2014, Humboldt University of Berlin); ³ geological map 1 : 200 000 (GUEK200, BGR – Federal Institute for Geosciences and Natural Resources); ⁴ raster map 1 × 1 km of weather data (German Weather Service); ⁵ BKG; ⁶ Federal Agency for Nature Conservation (BfN). Variable name indicated for the nine variables in the final model, with (X) indicating buffer size and R indicating raster resolution.


Figure 3. Illustration of the annual mean water level (WL) transformation. Hypothetical transfer function relating GHG budget to WL (m) (a), GHG budget vs. the transformed water level (WLt) (b), and WLt vs. WL (c). The lines along the x axes indicate the data quantiles of the analyzed data set.

et al., 2013). Thus, we performed this simplification process by first dropping those parameters with a correlation > 0.7 (either Pearson or Spearman type) to another parameter with a higher contribution (Clapcott et al., 2011). This ensured that two highly correlated parameters would not remain in the parameter set longer than the last parameter of another group of variables, which may contribute less compared to the two highly correlated parameters but provides extra information that is not covered by the other parameters. After all highly correlated parameters had been dropped, further parameters with low contribution were dropped progressively.
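The correlation filter described above can be sketched in a few lines of Python. This is only a simplified illustration, not the procedure used in the study: it assumes that predictor contributions are fixed numbers (in the study they come from the fitted BRT model), it uses the Pearson coefficient only, and the names `pearson`, `drop_correlated` and the toy predictors are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)  # fails for constant columns

def drop_correlated(columns, contribution, threshold=0.7):
    """Visit predictors in order of decreasing contribution and keep a
    predictor only if |r| to every already kept predictor is <= threshold."""
    kept = []
    for name in sorted(columns, key=lambda k: contribution[k], reverse=True):
        if all(abs(pearson(columns[name], columns[other])) <= threshold
               for other in kept):
            kept.append(name)
    return kept

# Hypothetical toy data: "f_dry_500" nearly duplicates "f_dry_2500".
columns = {
    "f_dry_2500": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f_dry_500": [2.0, 4.0, 6.0, 8.0, 10.5],
    "h_rel": [5.0, 1.0, 4.0, 2.0, 3.0],
}
contribution = {"f_dry_2500": 30.0, "f_dry_500": 20.0, "h_rel": 10.0}
print(drop_correlated(columns, contribution))
```

In this toy case the weaker of the two highly correlated buffer variables is dropped, while the weakly contributing but uncorrelated predictor is retained.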

Predictor contributions are calculated as proportional contributions to the total error reduction and can be considered as a measure for the influence of the individual predictors. Additionally, a BRT model allows the derivation of partial dependence plots, which indicate how the response is affected by a certain predictor after accounting for the average effects of all other predictors in the model (Elith et al., 2008). These plots do not show the full effect of each parameter on the model response, due to interactions with other parameters that are fixed to derive these plots as well as due to parameter co-correlation. However, they can be used for interpreting model behavior (Elith et al., 2008).

2.3.1 WLt transformation of WL

The map of water levels of this study was developed to improve the upscaling of greenhouse gas emissions from organic soils. Therefore, the final map should provide the highest accuracy for the water level range for which the highest differences of greenhouse gas emissions occur. This can be achieved by transforming WL into a transformed variable, WLt, which shows a linear relationship with GHG emissions. The sensitivity of greenhouse gas emissions to water level has been analyzed in several laboratory and field experimental and monitoring studies (Berglund and Berglund, 2011; Drösler et al., 2011; Hahn-Schöfl et al., 2011; Leiber-Sauheitl et al., 2014; Moore and Roulet, 1993; Moore and Dalva, 1993; van den Akker et al., 2012). General trends are a strong increase of methane (CH4) emissions for annual mean water levels of approximately > −0.1 m and an increase of CO2 emissions for water levels < −0.1 m, with a trend similar to a saturation function that levels out approximately between −0.4 and −0.8 m (Fig. 3a). While studies agree over these general trends, the exact shape of the transfer function and the maximum levels of emissions, as well as their dependence on soil properties and other environmental parameters, are still controversial. Here, we assume a hypothetical transfer function relating the normalized GHG budget, ranging from 0 to 1, to the water level (see also Fig. 3):

$$
\mathrm{GHG\ balance} =
\begin{cases}
-e^{3(\mathrm{WL}+0.1)} + 1, & \mathrm{WL} \le -0.1 \\
1 - e^{-3(\mathrm{WL}+0.1)}, & \mathrm{WL} > -0.1
\end{cases}
\qquad (1)
$$

As the GHG budget can be positive for both low and high WL, we introduced the transformed water level WLt as (Fig. 3)

$$
\mathrm{WL_t} =
\begin{cases}
e^{3(\mathrm{WL}+0.1)} - 1, & \mathrm{WL} \le -0.1 \\
1 - e^{-3(\mathrm{WL}+0.1)}, & \mathrm{WL} > -0.1
\end{cases}
\qquad (2)
$$

By calibrating the model to both WL and WLt, we test if the optimization of WLt provides the highest model accuracy for the water level range relevant for GHG emissions and if it optimizes the map for application to GHG upscaling.
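As a minimal sketch of Eqs. (1) and (2), the transformation can be written as follows. The function and parameter names (`wl_t`, `ghg_budget`, `wl0`, `k`) are illustrative choices; the constants −0.1 m and 3 are those of the hypothetical transfer function. Note that, by comparing the two equations, the normalized GHG budget equals the absolute value of WLt.

```python
import math

def wl_t(wl, wl0=-0.1, k=3.0):
    """Transformed water level, Eq. (2): approaches -1 for deep water
    tables (dry) and +1 for inundated conditions (wet); wl in metres."""
    if wl <= wl0:
        return math.exp(k * (wl - wl0)) - 1.0
    return 1.0 - math.exp(-k * (wl - wl0))

def ghg_budget(wl, wl0=-0.1, k=3.0):
    """Normalized hypothetical GHG budget, Eq. (1); identical to |WLt|."""
    return abs(wl_t(wl, wl0, k))

# Example water levels (m relative to ground surface)
for wl in (-0.8, -0.4, -0.1, 0.0, 0.2):
    print(f"WL = {wl:5.2f} m  WLt = {wl_t(wl):6.3f}  GHG = {ghg_budget(wl):5.3f}")
```

The transformation compresses differences between deep water tables and stretches differences near the surface, which is where the assumed GHG response changes most.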

2.3.2 Weighting scheme

When considering possible data weighting schemes, it is worth emphasizing at this point that the goal of this study is the development of a statistical model that can explain the water level variability both within a peatland and among different peatlands. The data of target and predictor variables for building this model are highly heterogeneous. First, the target variable data set contains peatland areas that strongly differ in their spatial extent and in the number of installed dip wells. Second, the predictor variable data set contains categorical and numerical data, and part of the predictor variables predominantly vary from peatland to peatland (e.g., climatic boundary conditions, large-scale topographic wetness index, peatland characteristics), whereas others also show within-peatland variability (e.g., land use, small-scale topographic wetness index, drainage network). As the influence of the individual predictor variables on our target WLt is expected to be rather diffuse due to abundant interactions with other site characteristics, the robustness of derived dependencies will strongly depend on the number of different peatlands in the data set.

There are no universal data weighting rules for similarly heterogeneous data situations, and some degree of expert judgment and subjectivity is inevitably involved when developing an appropriate scheme (Francis, 2011). The need for introducing a data weighting scheme is obvious, as without data weighting during calibration too much influence would be given to small and well-studied peatlands, which would reduce predictive model performance for large, less-well-studied peatland areas. To avoid this in a simple manner, the weight of each dip well could be divided by the number of dip wells in its peatland, which results in each peatland being equally weighted. This scheme, however, does not sufficiently use the high information content provided by well-studied large peatlands, which should have a higher impact on model calibration than a small peatland with only few dip wells.

Here, we propose a new weighting scheme that takes into account both factors, peatland size and local density of dip wells, to derive dip-well-specific weighting factors. It is based on principles of data uncertainty reduction by repeated measurements and of geostatistics. First, we consider our data situation as an analogue of meta-analysis with grouped data. It has been shown for homogeneous problems (all data from the same population) that the optimal group weight for meta-analysis is 1/SE² (Hedges and Olkin, 1985), with SE being the standard error of each group:

$$
\mathrm{SE} = \frac{\sigma_e}{\sqrt{N}} \qquad (3)
$$

where σe is the error standard deviation of a measurement and N is the number of measurements in a group. For homogeneous problems and uniform σe, this results in weights that are linearly dependent on N, which we here call the first end member of weighting. Heterogeneity (within-group variance) reduces the variation of the group weights, which can be shown by random effect models (Cumming, 2012). As the second end member of weighting, when heterogeneity totally dominates within-group variance, optimal group weights are uniform for all groups, i.e., weights are independent of N. We are not aware of a method that allows the estimation of the degree of heterogeneity for the complex target and predictor data situation in this study, including data (spatial and temporal variability, measurement error) and model errors (missing parameters). As a trade-off

Figure 4. Sample semivariogram and fitted semivariogram model of the annual mean water level data WL.

between 1/SE² (homogeneous end member) and 1 (heterogeneous end member), we decided on a group weight that is the inverse of the standard error, 1/SE, which is, for example, often used in econometric studies (Dickens, 1990). We emphasize that this is a subjective decision.
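To make the three candidate group weights concrete, the following short sketch evaluates them for groups of different size under a uniform σe. The function name `group_weight` and the scheme labels are hypothetical and serve illustration only.

```python
import math

def group_weight(n_wells, sigma_e=1.0, scheme="inv_se"):
    """Group weight as a function of group size N, with SE from Eq. (3)."""
    se = sigma_e / math.sqrt(n_wells)
    if scheme == "inv_se2":   # homogeneous end member, proportional to N
        return 1.0 / se ** 2
    if scheme == "inv_se":    # trade-off chosen here, proportional to sqrt(N)
        return 1.0 / se
    if scheme == "uniform":   # heterogeneous end member, independent of N
        return 1.0
    raise ValueError(f"unknown scheme: {scheme}")

for n in (1, 4, 25, 100):
    print(n, [round(group_weight(n, scheme=s), 2)
              for s in ("inv_se2", "inv_se", "uniform")])
```

With 1/SE, a peatland with 100 dip wells receives 10 times the weight of a peatland with a single dip well, compared to 100 times under 1/SE² and equal weight under the uniform scheme.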

The group weight 1/SE is the basis for the geostatistical part of our weighting scheme. There are two reasons why we cannot directly treat our peatlands as groups. First, there is within-peatland variability, which is related to changing site characteristics. It is one objective of our study to describe this variability by statistical modeling. Thus, dip wells must be treated individually, and data cannot be aggregated at the peatland level. Second, we expect the model to learn more when the same number of dip wells is installed in a larger peatland. In a small peatland, spatial autocorrelation between dip wells is higher, i.e., the information content is lower than for large peatlands. As a consequence of the first point, we do not aggregate and keep all dip wells in the target variable data set by attributing to each dip well the fraction 1/N of its group weight, so that the relative weights of the groups remain constant. As a consequence of the second point, we use principles of geostatistics in our weighting scheme. We replace the group size N (positive integer number) by the “statistical” group size n (positive continuous number being > 1), which we derive from the spatial autocorrelation among the dip wells.

Therefore, we analyze the spatial autocorrelation structure of the data set. A single spherical variogram model was fitted to the sample variogram of all data (Fig. 4 in Sect. 3.1). Variogram models allow the differentiation of the total data variance (called “sill”) into a spatially uncorrelated variance (called “nugget”) and a spatially correlated variance (called “structural variance” and defined as sill − nugget) (Wackernagel, 2003). The variogram model allows the derivation, for any distance between two locations, of the average squared difference of values, here defined as γ. By definition, at distance 0 the average squared difference equals the nugget, and at distances greater than that called the “range” of spatial autocorrelation the average squared difference equals the sill. Accordingly, the autocorrelated fraction f of the average squared difference between two dip wells i and j is

$$
f_{ij} = \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}} \qquad (4)
$$

We now define the “statistical” group size n of each dip well i to be the sum of one plus the autocorrelated fractions fij of all dip wells that are within the range of spatial autocorrelation of i:

$$
n_i = 1 + \sum_{j=1}^{m} \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}} \qquad (5)
$$

According to the discussion above, dip-well-specific weights can then be calculated with

$$
w_i = \frac{1}{n_i \, \mathrm{SE}_i} = \frac{1}{\sigma_{e,i} \sqrt{n_i}} \qquad (6)
$$

where ni is derived from Eq. (5). The equation shows that with increasing “statistical” group size n, i.e., with increasing spatial data density, the weight of an individual dip well is “down-weighted” to some degree, a behavior that corresponds to our initial intention to lower the influence of small peatlands compared to large ones. The error standard deviation σe is dependent on several factors, e.g., the length of the time series, the temporal measurement density and the microtopography around the dip well. For simplicity, we here assumed σe to be uniform for all dip wells, which simplifies Eq. (6) to $w_i = 1/\sqrt{n_i}$.

Only dip wells with the same land-use type were summed up with Eq. (5), which avoids the down-weighting by dip wells that have different land-use types. The latter are mostly characterized by fairly different WLt and thus by rather low spatial autocorrelation to dip well i.
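Equations (4) to (6) can be sketched as below for uniform σe. This is an illustration under stated assumptions rather than the code used in the study: the standard spherical variogram formula is assumed for γ, the nugget, sill and range are the fitted values reported in Sect. 3.1, and the well tuple layout `(x, y, land_use)` and the function names are hypothetical.

```python
import math

# Fitted variogram parameters as reported in Sect. 3.1 (m^2, m^2, m)
NUGGET, SILL, RANGE = 0.012, 0.09, 2700.0

def gamma(h):
    """Standard spherical variogram model: nugget at h = 0, sill at h >= range."""
    if h >= RANGE:
        return SILL
    r = h / RANGE
    return NUGGET + (SILL - NUGGET) * (1.5 * r - 0.5 * r ** 3)

def dip_well_weights(wells):
    """wells: list of (x, y, land_use) with coordinates in metres.
    Returns w_i = 1 / sqrt(n_i), i.e. Eq. (6) with uniform sigma_e."""
    weights = []
    for i, (xi, yi, lu_i) in enumerate(wells):
        n_i = 1.0
        for j, (xj, yj, lu_j) in enumerate(wells):
            if i == j or lu_i != lu_j:
                continue  # only other wells of the same land-use type count
            h = math.hypot(xi - xj, yi - yj)
            if h < RANGE:
                n_i += (SILL - gamma(h)) / (SILL - NUGGET)  # Eq. (4), summed in Eq. (5)
        weights.append(1.0 / math.sqrt(n_i))
    return weights

# Hypothetical layout: two nearby grassland wells, one distant grassland well,
# and one arable well located between the first two.
wells = [(0.0, 0.0, "grassland"), (1350.0, 0.0, "grassland"),
         (9000.0, 0.0, "grassland"), (500.0, 0.0, "arable")]
print([round(w, 3) for w in dip_well_weights(wells)])
```

Isolated wells keep a weight of 1, whereas wells with close neighbors of the same land-use type are down-weighted. The subsequent rescaling of the weight sums per land-use type is not included in this sketch.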

After spatial correlation had been accounted for, the sum of the weights of all dip wells of each land-use type was adjusted so that it corresponds to the fraction of this land-use type in Germany. This adjustment accounts for the overrepresentation in the data set of dip wells in unused peatlands and the underrepresentation of dip wells in arable land.

2.3.3 Model performance criteria

Model fit and predictive performance after cross-validation were quantified by the weighted root mean square error

$$
\mathrm{RMSE} = \sqrt{\frac{1}{\sum_{i=1}^{m} w_i} \sum_{i=1}^{m} \left( w_i \left( x_{o,i} - x_{s,i} \right)^2 \right)} \qquad (7)
$$

where m is the number of dip wells, xo,i is observed WL or WLt of dip well i, xs,i is simulated WL or WLt of dip well i and wi is the data weight of dip well i (see Sect. 2.3.2). We refer to the root mean square error of the predicted data of cross validation as RMSEcv. Model performance was further quantified by Nash–Sutcliffe efficiency (NSE):

$$
\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{m} w_i \left( x_{o,i} - x_{s,i} \right)^2}{\sum_{i=1}^{m} w_i \left( x_{o,i} - \bar{x}_o \right)^2} \qquad (8)
$$

where x̄o is the mean of all observed WL or WLt. NSE indicates how well observed vs. predicted values match the 1:1 line. NSE is a good overall indicator of predictive performance because it combines scatter and bias (common offset and/or slope difference from the 1:1 line) (Nash and Sutcliffe, 1970). Values greater than 0 signify a model that is better than the reference model based on the data mean. We refer to the NSE of the training data as NSEcal and of the predicted data of cross validation as NSEcv.

Systematic errors were quantified by calculating the model bias, here defined as

$$
\mathrm{BIAS} = \sum_{i=1}^{m} \left( w_i \, x_{o,i} - w_i \, x_{s,i} \right) \qquad (9)
$$
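The three weighted criteria of Eqs. (7) to (9) can be computed together, as in the following sketch. Two points are assumptions on our side and not stated in the text: the observed mean in Eq. (8) is taken as the weighted mean, and the bias is divided by the sum of the weights, which matches Eq. (9) only if the weights are normalized to sum to one. The function name `weighted_metrics` is hypothetical.

```python
import math

def weighted_metrics(obs, sim, w):
    """Return (RMSE, NSE, bias) for observed and simulated values with weights w."""
    sum_w = sum(w)
    sse = sum(wi * (o - s) ** 2 for wi, o, s in zip(w, obs, sim))
    rmse = math.sqrt(sse / sum_w)                              # Eq. (7)
    # Assumption: weighted mean of the observations as reference model
    mean_obs = sum(wi * o for wi, o in zip(w, obs)) / sum_w
    sst = sum(wi * (o - mean_obs) ** 2 for wi, o in zip(w, obs))
    nse = 1.0 - sse / sst                                      # Eq. (8)
    # Assumption: normalized by sum_w; equals Eq. (9) when sum_w == 1
    bias = sum(wi * (o - s) for wi, o, s in zip(w, obs, sim)) / sum_w
    return rmse, nse, bias

# Hypothetical WLt observations, predictions and dip-well weights
obs = [-0.8, -0.5, -0.2, 0.1]
sim = [-0.7, -0.5, -0.3, 0.0]
w = [1.0, 0.9, 0.7, 0.7]
print(weighted_metrics(obs, sim, w))
```

A positive bias in this sign convention (observed minus simulated) means that the model predicts systematically lower values than observed.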

2.4 Model uncertainty and stability evaluation

Uncertainty of the model predictions was assessed by bootstrapping, cross-validation and residual analysis.

For the bootstrapping analysis, we followed the procedure of Leathwick et al. (2006). We estimated the confidence intervals around the predictions and the fitted functions by taking 1000 bootstrap samples of the 53 peatlands. The number of peatlands in each sample was equivalent to the data set, but peatlands were selected randomly with replacement. Using the predictor variables of the final model, a BRT model was fitted to each sample. Cross validation was again performed on peatlands; thus, a peatland in the calibration data set was not part of the cross-validation data set, to avoid overly optimistic results. Variances of the predictions and of the fitted functions of the 1000 models were evaluated.
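The resampling at the peatland level (rather than at the dip well level) can be sketched as follows. Fitting a BRT model is outside the scope of this sketch, so a simple statistic (the mean water level of a sample) stands in for the model; the data layout, function names and toy values are hypothetical.

```python
import math
import random

def bootstrap_peatlands(wells_by_peatland, n_boot=1000, seed=0):
    """Yield bootstrap samples: as many peatlands as in the data set are drawn
    with replacement, and all dip wells of each drawn peatland are included."""
    rng = random.Random(seed)
    ids = sorted(wells_by_peatland)
    for _ in range(n_boot):
        drawn = [rng.choice(ids) for _ in ids]
        yield [well for pid in drawn for well in wells_by_peatland[pid]]

def bootstrap_sd(wells_by_peatland, statistic, n_boot=1000, seed=0):
    """Standard deviation of a statistic over the bootstrap samples."""
    values = [statistic(s)
              for s in bootstrap_peatlands(wells_by_peatland, n_boot, seed)]
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))

def mean_wl(sample):
    """Stand-in for a fitted model: mean water level of the sample."""
    return sum(sample) / len(sample)

# Hypothetical data: annual mean WL (m) of dip wells in three peatlands
data = {"A": [-0.6, -0.5], "B": [-0.1], "C": [-0.3, -0.2, -0.4]}
print(bootstrap_sd(data, mean_wl, n_boot=200))
```

Drawing whole peatlands keeps spatially correlated dip wells together, so the spread across samples reflects the limited number of peatlands rather than the number of dip wells.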

If data sets are relatively small (e.g., n < 1000; De’ath, 2007), then the small size of the training and test data sets lowers model accuracy. Given the fairly small number of peatlands in the data set and the partly high spatial correlation of dip wells within these peatlands, we decided not to split the data set into a training and test data set. Estimates of model accuracy can then be based on cross-validation, thereby making effective use of all the data (De’ath, 2007). The prediction uncertainty of the final model is estimated by the root mean square error of prediction (RMSEcv, see above) for each land cover class. After testing for near-normal distribution of the residuals, RMSEcv can be used to derive the 68 and 95 % confidence intervals of the predictions with RMSEcv and 2 × RMSEcv, respectively.


Finally, additional residual analysis was performed to evaluate whether the predictions are biased for different land cover classes or geographical regions.

2.5 Regionalization

In the final regionalization step, the predictor variables contributing to the final model were determined at a 25 × 25 m raster for all organic soils in Germany. Predictor variables were determined with the same map input that was used for model building. Land cover information, including information on ditches, was based on the data from the year 2012, and the climatic data were based on the average of the last 30 years. The fine spatial resolution of 25 × 25 m was not chosen to suggest a highly spatially accurate model. Rather, this fairly fine scale was necessary to map the relatively small-scale effects of the topography, land use and peatland geometry variables. The final model was then used to make a prediction for each of these raster cells.

3 Results and discussion

3.1 Spatial correlation structure of the data set

The variogram model fitted to the sample variogram provided a nugget (0.012 m², 0.11 m), a sill (0.09 m², 0.3 m) and a range of spatial correlation (2700 m) for our data set of WL (Fig. 4). The nugget represents the very small-scale soil hydraulic variability and micro-topography effects on WL (van der Ploeg et al., 2012) and measurement error, e.g., by differences in the determination of the ground surface and in the timing of the manual measurements. Furthermore, micro-topography (e.g., hummocks) and oscillating peat surfaces of wet peatlands pose a challenge for an accurate determination of both ground surface and water level. The water level time series in the data set were of different lengths and ranged from 1 to 20 years. Interannual variability of water levels can be large (e.g., Knotters and van Walsum, 1997). For simplicity, in our analysis data were not harmonized by extrapolating WL time series using weather data to a 30-year period. Thus, the nugget also includes errors that are introduced by dip wells with different measurement periods that are located in the range of spatial correlation. In consideration of these error sources, the fitted nugget of 0.11 m appears to be a realistic value. At 0.3 m, the fitted sill matched nearly perfectly with the standard deviation of the data (0.31 m), which indicates consistency between semivariogram model and data set. The fitted range of spatial correlation of 2700 m reflects both physical effects, i.e., the average range of lateral flow due to hydraulic gradients, and the effect of average land-use patterns in Germany on the spatial correlation of WL. Fitted values were used in the calculation of the dip-well-specific weights using Eq. (6).

3.2 Typical water levels for land-use types in German organic soils

The land cover classes are characterized by plausible mean and median water levels, which show consistent differences between each other (Table 2 and Fig. 5a). The mean values of arable land and grassland agree with what can be expected for their agronomic requirements, with slightly lower water levels for arable land. The high variability observed for both classes may be related to the variability of the efficiency of installed drainage systems, as, for example, the presence and condition of tile drains and the depth of ditches. Grasslands can be managed with very variable intensity, which is partly reflected in different water levels. Figure 5a further shows that deciduous forests seem to dominate slightly drier organic soils compared to coniferous forests, which dominate under wetter conditions. A high variability of water levels is observed for the land cover class unused peatland. On the one hand, post peat-cutting topography increases the variability of WL over short distances. It probably contributes to the high variance observed for this class. On the other hand, this class comprises both rather dry unused peatlands and wetter peatlands in which re-wetting measures already took place, which, however, do not yet show a wet soil attribute in the ATKIS Digital Landscape Model. This may also cause part of the variance observed in the grassland and forest land cover classes. All wet land cover classes (reed, wet grassland, wet forest and wet unused peatland) that were separated by wetness indication clearly show higher water levels, showing that the wetness attribute of the Digital Landscape Model is a useful attribute.

Figure 5b shows the transformed water level for all classes. It can be observed that the variances of the wetter land cover classes increase relative to the variances of the dry land cover classes. This is due to the highest sensitivity of GHG emissions in the wet range of water levels (> −0.5 m). Consequently, the rather high variance of WL for arable land corresponds to a rather low variance of WLt, i.e., to a rather low assumed effect of WL variability on the GHG budget.

3.3 BRT model calibration and validation: WL vs. WLt

In contrast to land cover class, the other predictor variables showed, if at all, only weak relations to WL and WLt when evaluating them with box plots, 2-D cross plots and simple correlation matrices. Here, we expected BRT to detect the strongest predictor interactions and to identify the most informative predictors.

After model calibration with all predictors, subsequent model simplification successively dropped those parameters with correlation > 0.7 and the lowest contribution. For both WL and WLt, model performance improved during this simplification. For WLt, the highest values of NSEcv of approximately 0.46 were achieved with 21 to 9 model parameters. The development of NSEcv for the last 50 parameters is shown in Fig. 6. Further elimination of parameters led to a pronounced decline of model performance. Similar behavior was observed for the calibration on WL. In favor of a more parsimonious model, we chose the model with the lowest number of parameters before the pronounced decline of model performance occurred. For the calibration on WLt, this corresponded to the model with the lowest number of parameters that still achieved NSEcv values of > 0.45 (Fig. 6). The final WLt model comprised nine predictor variables and the final WL model seven parameters. The percentages of parameter contributions to the final model and their individual influences are discussed for WLt in Sect. 3.4.

Figure 5. Water level relative to ground surface, WL (m), and transformed water level, WLt (−), by land cover class, illustrated as a weighted box plot. WLt = −1 corresponds to maximum CO2 emissions and WLt = 1 to maximum CH4 emissions. In the top horizontal axes, the number of dip wells in each class is indicated.

Figure 6. NSEcv as a function of the number of predictor variables used in the model of WLt during model simplification, shown for the last 50 parameter drops.

Table 3 summarizes the statistical performances of the models calibrated on WL and WLt. For both models, NSEcal is considerably higher than NSEcv and shows the commonly observed overfitting behavior of BRT models. The different measures that we conducted to minimize overfitting (cross-validation on peatlands, restriction to monotonic responses, and model simplification including elimination of highly correlated variables) lowered the difference between NSEcal and NSEcv but could not totally avoid overfitting. NSEcv of the WLt model (0.453) indicates higher predictive model performance compared to the WL model (0.381). However, as the data ranges differ due to the transformation, this comparison may be misleading. Therefore, we transformed the predictions of the WL model to obtain WLt values from this model and equally calculated the performance criteria (Table 3, second column). Then, NSEcv is slightly increased (0.397) but does not achieve the values of the model that was calibrated on WLt. A better predictive model performance of the model calibrated on WLt is also visible for the RMSEcv values. The total RMSEcv as well as the RMSEcv values for the dry (WL < −0.3 m) and wet range (WL > −0.3 m) show slightly lower values for the WLt model compared to WLt values from the model calibrated on WL. Given our hypothetical transfer function (Fig. 3), in which the GHG budget is linearly related to WLt, the higher accuracy of WLt predictions directly corresponds to a higher accuracy of GHG budget predictions.

Table 2. Weighted mean and standard deviation of WL and WLt data and of the WLt map presented in Sect. 3.6 for the nine land cover classes.

| Land cover class | WL (m), mean ± sd | WLt (−), mean ± sd | WLt (−), map mean ± sd |
|---|---|---|---|
| Arable land | −0.69 ± 0.30 | −0.76 ± 0.17 | −0.66 ± 0.22 |
| Deciduous f. | −0.45 ± 0.34 | −0.49 ± 0.37 | −0.47 ± 0.35 |
| Grassland | −0.44 ± 0.29 | −0.52 ± 0.32 | −0.49 ± 0.30 |
| Unused peatl. | −0.39 ± 0.36 | −0.39 ± 0.41 | −0.37 ± 0.40 |
| Coniferous f. | −0.36 ± 0.36 | −0.37 ± 0.37 | −0.46 ± 0.35 |
| Wet unused peatl. | −0.22 ± 0.27 | −0.18 ± 0.40 | −0.17 ± 0.36 |
| Wet forest | −0.22 ± 0.29 | −0.17 ± 0.43 | −0.21 ± 0.39 |
| Wet grassland | −0.10 ± 0.14 | −0.00 ± 0.31 | −0.15 ± 0.39 |
| Reed | −0.01 ± 0.17 | 0.20 ± 0.29 | −0.06 ± 0.32 |

Table 3. Performance criteria of the different models; dry range defined as WL < −0.3 m and wet range as WL > −0.3 m.

| Criterion | WL (m) (calibrated on WL) | WLt (−) (calibrated on WL) | WLt (−) (calibrated on WLt) |
|---|---|---|---|
| NSEcal | 0.627 | 0.559 | 0.642 |
| NSEcv | 0.381 | 0.397 | 0.453 |
| RMSEcv | 0.269 | 0.299 | 0.284 |
| RMSEcv,dry | 0.284 | 0.263 | 0.259 |
| RMSEcv,wet | 0.222 | 0.382 | 0.355 |
| Bias | −0.003 | 0.083 | 0.002 |
| Biasdry | −0.012 | 0.070 | 0.003 |
| Biaswet | 0.021 | 0.120 | 0.000 |

Superior model performance is also evident when evaluating model bias. Only when calibrating directly on WLt are the WLt predictions bias-free. Calibration on WL and subsequent transformation to WLt introduces a model bias towards systematically lower WLt values. In subsequent applications to GHG emission upscaling, lower WLt values would lead to an overestimation of CO2 emissions and to an underestimation of CH4 emissions.

3.4 Influence of predictor variables on WLt

Given the beneficial characteristics of the model calibrated on WLt for GHG upscaling, presentation and discussion of further model results is restricted to the WLt model.

The BRT method allows the analysis of the parameter contributions to and influences on the model (Elith et al., 2008) and thus may contribute to system understanding. The percentages of the contributions of the nine predictor variables to the final model ranged from 25.2 to 5.6 % (Fig. 7). Except for protection status, at least one parameter of each of the seven parameter groups contributed to the final model. All protection status information was dropped early during the simplification process due to low contribution, although WL showed slightly higher values for data from nature protection or special areas of conservation. However, other parameters seem to be able to fully compensate for the information that is lost by dropping this predictor.

Land cover class lc at the dip well was the parameter with the strongest contribution (25.2 %). It basically follows the trend illustrated in Fig. 5b. The bootstrap error, plotted as standard deviation (Fig. 7), shows the variation of this influence over the 1000 bootstrap models. A second land cover parameter, the fraction of dry land cover classes on organic soils in a buffer of 2500 m radius, fdry(2500), contributed to the model with 10.3 %. The monotonic decrease of WLt with increasing fdry(2500) is plausible, as higher values reflect intensive land use in the surroundings of the dip well and thus indicate intensive artificial drainage. Together, both parameters contributed 35.5 %, and thus land cover represents the parameter group with the strongest model contribution.

Peatland characteristics are the second most important parameter group. The peatland type contributed 16 %. The model indicates that peatlands without any connection to surface water bodies (river or lake) and the class of other organic soils are characterized by lower WLt compared to the peatland types lowland bog, upland bog and fen neighboring surface water. As the class of other organic soils is generally expected to reflect lower water levels, and as surface water may have a stabilization effect on water levels of organic soils, the influence of the peatland type can be considered plausible. Besides peatland type, the substrate of the peat base contributes 5.6 %. Here, organic soils overlying peat clay layers (e.g., limnic sediments such as calcareous gyttja) or basement rock are characterized by higher WLt compared to organic soils overlying unconsolidated rock. This can be explained by the lower drainage resistance of unconsolidated rocks. This may cause an increased efficiency of anthropogenic drainage and/or a generally higher vulnerability to seepage losses. Finally, slightly lower WLt values are indicated by a high fraction of organic soils for the 500 m buffer, fpeat(500). This may reflect the higher land-use pressure on large peatlands compared to rather small peatlands, which tentatively are more easily preserved by nature protection efforts.

The remaining four parameter groups are represented in the model by only one parameter each. The third most influential parameter was the length of ditches on arable land and grassland for the 250 m buffer, dilendry(250). At first glance, it may be surprising that with increasing ditch density WLt values tend to be higher, as ditches are supposed to drain the water when land is used as arable land and grassland. The fact that the model identifies a rather strong effect in the opposite direction may be caused by incomplete information about the drainage network. There is no detailed information about the spatial distribution of tile drains. Based on expert knowledge, agricultural areas with a lower ditch density are more likely to have tile drains. As these drains, easily installed with a narrow drain spacing, are more effective at draining organic soils, low WLt values for arable land and grassland may be related to low ditch densities. Furthermore, ditches were originally dug at narrow spacing in especially wet areas of organic soils, but there is no information available on whether these ditches still function properly.

The parameters wbsummer, hrel and tiras25 all show expected trends. The model predicts higher WLt for increasing climatic water balance in the summer period (May to October), wbsummer, for dip wells located in depressions (low values of hrel) and for higher small-scale topographic wetness indices calculated on the 25 × 25 m digital elevation model (tiras25).


Figure 7. Partial dependence plots for the predictor variables. For an explanation of variables, see Table 1. The y axes are on the WLt scale and are centered around the mean WLt. Error bars and grey area indicate the standard deviation of the response over 1000 bootstrap models. The relative contribution of each predictor is indicated as a percentage. The lines along the x axes of each plot show the distribution of data across that variable in deciles.

The fact that all parameters show expected or explainable responses in the model corroborates the reliability of the calibrated WLt model. The standard deviation of the predictor responses based on the bootstrap samples shows the stability of the observed responses.

Further insight into model behavior can be obtained by analyzing parameter interactions. This is done by changing two parameters simultaneously while keeping mean values for all other parameters (Elith et al., 2008). Figure 8 shows the two strongest parameter interactions. Parameter wbsummer strongly interacts with ptype. The generally lower values of WLt of fens without surface water connection and other organic soils show a stronger dependency on the summer climatic water balance. While a summer climatic water balance of > −80 mm shows a rather weak effect on WLt for the wetter peatland types, the two drier peatland types still show a strong effect with increasing wbsummer. The trend for wbsummer > 130 mm for the dry peatland types is supported by seven different peatlands.
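The interaction analysis described above can be illustrated with a minimal sketch. It assumes purely numerical predictors and uses a hypothetical `toy_model` with made-up predictor names `a`, `b` and `c` in place of the fitted BRT model; none of these names come from the study.

```python
from itertools import product

def toy_model(x):
    """Hypothetical stand-in for a fitted model; x is a dict of predictor values."""
    # The product term makes the effect of 'a' depend on the level of 'b'.
    return 0.5 * x["a"] + 0.2 * x["b"] + 0.3 * x["a"] * x["b"] - 0.1 * x["c"]

def interaction_surface(model, data, var1, grid1, var2, grid2):
    """Vary var1 and var2 over their grids; hold all other predictors at their mean."""
    names = data[0].keys()
    means = {k: sum(row[k] for row in data) / len(data) for k in names}
    surface = {}
    for v1, v2 in product(grid1, grid2):
        point = dict(means)          # start from the mean of every predictor
        point[var1], point[var2] = v1, v2
        surface[(v1, v2)] = model(point)
    return surface

# Hypothetical calibration data (two records, three predictors)
data = [{"a": 0.0, "b": 1.0, "c": 2.0}, {"a": 2.0, "b": 3.0, "c": 4.0}]
surface = interaction_surface(toy_model, data, "a", [0.0, 1.0], "b", [0.0, 1.0])

# Effect of raising 'a' from 0 to 1, evaluated at a low and a high level of 'b'
effect_low_b = surface[(1.0, 0.0)] - surface[(0.0, 0.0)]
effect_high_b = surface[(1.0, 1.0)] - surface[(0.0, 1.0)]
```

If the two effects differ, as they do here, the response to one predictor depends on the level of the other, which is the kind of pattern shown in Fig. 8.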

Figure 8. Partial dependence plots representing the two strongest interactions in the model: (a) between ptype and wbsummer and (b) between pbase and fdry. Fitted WLt is plotted on the y axis, which is obtained after accounting for the average effect of all other predictor variables.

Another strong interaction is observed for pbase and fdry (2500). While a rather weak effect of the fraction of arable land and grassland is observed for organic soils overlying basement rock and peat clay layer, a strong effect is observed for organic soils overlying unconsolidated rock. This interaction reflects the higher lateral range of drainage effects for organic soils with little flow resistance at the peat base. In these organic soils, intensive land use lowers the water level over large areas.

3.5 Discussion of model uncertainty

Plotting observed vs. predicted WLt from cross-validation (Fig. 9) illustrates the rather large residual variance that cannot be explained by the model. As indicated by the higher RMSEcv for the wet range (Table 3), scatter increases with increasing WLt. Error bars in the y direction indicate data error derived from the nugget of the variogram. It is shown for a few data points as an example. Due to transformation, data error increases for higher WLt. Figure 9 demonstrates that the fraction of unexplainable variance related to data error is much higher for the wet than for the dry range. Bootstrap error, indicating the variation of the model predictions for 1000 bootstrap samples, is shown in the x direction for the same data points. Bootstrap error is lower than the data error for the wet range and slightly higher for the dry range.

Bootstrap errors demonstrate the sensitivity of model predictions to changes of the data set used for calibration. When a model possesses structural deficits, such as missing predictor variables, bootstrap errors should not be used to define confidence intervals for the model predictions. Figure 10 shows residuals from cross-validation and standard deviation of bootstrap predictions for all land cover classes. The residuals of each land cover class show near-normal distributions. For five of the nine land cover classes (wet forest, wet unused peatland, arable land, coniferous forest and reed), the Shapiro–Wilk test of normality is positive (p > 0.05). Figure 10a further indicates that residuals of each land cover class scatter fairly well around zero, indicating low bias for the various land cover classes. Land-cover-class-specific confidence intervals of model predictions can thus be derived from the RMSEcv of each land cover class, e.g., 2 × RMSEcv representing the 95 % confidence interval.

Figure 9. Observed vs. predicted transformed annual mean water level (WLt) from cross-validation results. Error bars show selected data and bootstrap model errors as standard deviation. Data points are scaled by their weights.
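As a rough illustration of how class-specific intervals could be derived from cross-validation residuals, the following sketch computes an unweighted RMSE per land cover class and the corresponding 2 × RMSEcv interval. The residual values are invented, and the data weights used in the study are deliberately ignored here.

```python
import math
from collections import defaultdict

def rmse_by_class(residuals):
    """residuals: iterable of (land_cover_class, observed_minus_predicted)."""
    squared = defaultdict(list)
    for cls, res in residuals:
        squared[cls].append(res ** 2)
    return {cls: math.sqrt(sum(v) / len(v)) for cls, v in squared.items()}

def approx_95_interval(prediction, rmse_cv):
    """Prediction +/- 2 x RMSEcv; assumes near-normal residuals with low bias."""
    return prediction - 2.0 * rmse_cv, prediction + 2.0 * rmse_cv

# Hypothetical cross-validation residuals on the WLt scale (not from the study)
cv_residuals = [("wet grassland", 0.3), ("wet grassland", -0.4),
                ("arable land", 0.1), ("arable land", -0.1)]
rmse = rmse_by_class(cv_residuals)
low, high = approx_95_interval(0.0, rmse["wet grassland"])
```

The interval is only meaningful under the two conditions named in the text: residuals of the class are close to normally distributed and scatter around zero.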

The prediction uncertainty derived from cross-validation is much higher than the bootstrap prediction uncertainty obtained from the bootstrap standard deviation (sd), with 2 × sd corresponding to the 95 % confidence interval (Fig. 10). The large difference between these values indicates that the model has structural deficits that can be attributed to several error sources.

i. Key influences on WLt are missing in the set of predictor variables. None of the predictor variables indicate whether and to what extent water level increase due to re-wetting measures took place in recent years. Wetness indicators (wet soil and/or vegetation attributes) that are obtained from the Digital Landscape Model probably react with a delay of several years. Thus, we expect the occurrence of several observed high WLt values that cannot be explained by any of the predictor variables.

Figure 10. (a) Residuals (observation–prediction) of WLt predictions and (b) standard deviation (sd) of bootstrap predictions shown for the nine land cover classes. In the upper part, the number of dip wells in each class is indicated.

ii. Small-scale topography that is not represented with sufficient detail and accuracy in the DEM may cause several predictions to strongly differ from what would be expected from the other predictor variables. A common example may be a dip well that is located on a narrow peat ridge, which remained after peat-cutting and is absent in the DEM, and that is situated in an area classified as wet soil by the Digital Landscape Model. Then, the model indicates a WLt that is much higher than the observed WLt, as for the observed value the reference surface was the surface of the peat ridge.

iii. Consistent information about tile drains is missing and only exists on the regional scale (Tetzlaff et al., 2009). At the national scale, however, there are no maps on tile drains. Tile drains are known to have a strong effect on WLt for arable land and grassland. As explained above, we expect parameter dilendry (250) to partially compensate for this missing information.

iv. Another source of prediction uncertainty may comprise inconsistent and erroneous land cover classification of the Digital Landscape Model due to the high degree of subjectivity for many of the attributes. Furthermore, the temporal accuracy of the Digital Landscape Model may be as inaccurate as 5 years, which can cause time series with land-use change to be split at the wrong date and vegetation and wetness attributes to be not yet updated to the current conditions.

Figure 11. Residuals (observation–prediction) of WLt predictions for the three major geographical peatland regions of Germany. In the upper part, the number of dip wells in each class is indicated.

v. The water balance of fens strongly depends on the size and the hydraulic head of the groundwater catchment, i.e., of the aquifer underlying the peat layer. Unfortunately, there is no consistent map of hydraulic heads or groundwater catchments for all Germany.

We checked model predictions for geographical bias. Geographical location was not one of the model parameters. However, the history and policy of land use on organic soils, current ditch water management and climate do show large-scale geographical trends. We divided our data set into the three major German peatland regions (NE, NW and S) and evaluated the model residuals (Fig. 11) to see whether our model is biased due to important missing geographical effects. A serious bias for any of the three major German peatland regions cannot be identified.

Figure 12. Map of predictions of transformed annual mean water level (WLt) for all German organic soils (a) and an enlarged map section (b). Probability distribution in (c) indicates the uncertainty of a specific point prediction for wet grassland as an example. Here, predicted value is approximately WLt = 0, but note that wet grassland predictions do vary in space depending on the values of the other model parameters. The histogram shows the residuals from cross-validation for wet grassland, to which the probability distribution was fitted.

When applying calibrated statistical models during regionalization, it is important to check model behavior for extrapolation outside the range of the parameter space that is covered by the data upon which the model was built. BRT always extrapolates at a constant value from the most extreme environmental value in the training data. In contrast to other types of statistical models, e.g., generalized linear models, BRT does not continue the fitted trend beyond the last observation. Regarding the categorical variables, the data set covers all classes occurring in Germany with several peatlands. The data set also covers the major range of values occurring in Germany for the numerical predictor variables. Furthermore, Fig. 7 indicates that the constant values at which the model extrapolates the influence of the variables do not raise major concern for any extreme predictions outside the parameter range.

3.6 Regionalization
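The extrapolation behavior described above can be reproduced with a deliberately simplified sketch: boosted one-split trees (stumps) on a single numerical predictor, compared with a least-squares line. This is not the BRT configuration of the study (which would use deeper trees and stochastic subsampling); the data, the number of rounds and the learning rate are assumptions chosen for illustration.

```python
def fit_stump(xs, residuals):
    """Best single split (threshold, left_mean, right_mean) by squared error."""
    pairs = sorted(zip(xs, residuals))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        thr = 0.5 * (pairs[i - 1][0] + pairs[i][0])  # always inside the data range
        left = [r for x, r in pairs if x <= thr]
        right = [r for x, r in pairs if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    return best[1:]

def fit_boosted_stumps(xs, ys, rounds=50, rate=0.1):
    """Each round fits a stump to the current residuals and adds a shrunken update."""
    base = sum(ys) / len(ys)
    stumps, pred = [], [base] * len(xs)
    for _ in range(rounds):
        resid = [y - p for y, p in zip(ys, pred)]
        thr, lm, rm = fit_stump(xs, resid)
        stumps.append((thr, lm, rm))
        pred = [p + rate * (lm if x <= thr else rm) for x, p in zip(xs, pred)]
    return base, stumps, rate

def predict_boosted(model, x):
    base, stumps, rate = model
    return base + rate * sum(lm if x <= thr else rm for thr, lm, rm in stumps)

def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - slope * mx, slope

xs = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical predictor values
ys = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical response with a linear trend
brt_like = fit_boosted_stumps(xs, ys)
intercept, slope = fit_line(xs, ys)

inside_edge = predict_boosted(brt_like, 4.0)    # largest value in the training data
far_outside = predict_boosted(brt_like, 100.0)  # identical to the value at the edge
linear_far = intercept + slope * 100.0          # the line keeps following the trend
```

Because every split threshold lies between two training values, any input beyond the largest training value falls on the same side of all splits, so the tree-based prediction stays constant there.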

The map of WLt resulting from the application of the fitted WLt model to all grid cells shows gradients at the regional scale (Fig. 12a). In the south of Germany, for example, a gradient from wet to dry can be observed for the pre-alpine upland bogs and the peatlands of the moraine plain. In the north of Germany, the map indicates that organic soils in the very NE are wetter than the rest. For the rest of the north, a slight gradient can be observed from less dry to dry from NW to E, which is mainly driven by the higher summer climatic water balance in the NW. As both categorical and numerical predictor variables do also vary at the sub-regional scale, the resulting map also shows gradients within peatland areas, e.g., due to small-scale land-use, ditch density gradients and topography effects (Fig. 12b).

We calculated WLt averages of the land cover classes using the regionalized WLt from the map (Table 2, column 3). The given standard deviation comprises both the variability within a land cover class that is explained by the model as well as the uncertainty of each prediction. Resulting means and standard deviations slightly differ from the corresponding values of the data set. The land-cover-specific WLt values obtained from the map can be considered as being more representative, as the regionalization procedure is supposed to partly account for potential bias in the data set.

When applying this map and its predicted WLt values in subsequent GHG upscaling, it is crucial that model uncertainty is propagated properly. An example demonstrates the necessity of uncertainty propagation. For a grid cell classified as wet grassland, the probability distribution of WLt is shown based on a normal distribution that was fitted to the residuals of this land cover class (Fig. 12c). Without propagating the uncertainty, and when only translating the predicted WLt (possibly in combination with other parameters, e.g., soil properties) into a GHG budget, the GHG budget is strongly underestimated, as the WLt prediction is close to zero, indicating neither large CO2 nor CH4 emissions. When translating the full distribution of WLt into a GHG budget, the resulting GHG budget would be much higher, as the GHG budget increases on both sides of the predicted WLt.
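The effect described above can be made concrete with a small numerical sketch. The transfer function `ghg_budget` below is a hypothetical V-shaped placeholder (budget proportional to the absolute value of WLt), and the standard deviation of 1.0 is an arbitrary choice; neither is taken from the study.

```python
from statistics import NormalDist

def ghg_budget(wlt):
    """Hypothetical V-shaped transfer: budget rises on both sides of WLt = 0."""
    return abs(wlt)

def propagated_budget(mean_wlt, sd_wlt, n=10_000):
    """Average the budget over n equally probable quantiles of N(mean, sd)."""
    dist = NormalDist(mean_wlt, sd_wlt)
    quantiles = (dist.inv_cdf((i + 0.5) / n) for i in range(n))
    return sum(ghg_budget(q) for q in quantiles) / n

plug_in = ghg_budget(0.0)                 # budget at the point prediction only
propagated = propagated_budget(0.0, 1.0)  # budget averaged over the distribution
```

With a prediction at the minimum of the transfer function, the plug-in budget is zero, while the budget averaged over the prediction distribution is clearly positive (about 0.8 in the arbitrary units of this sketch).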

3.7 Possible paths for model improvement

The model performance that is achieved by the statistical approach presented in our study raises the question whether collecting more WL data can improve model performance, or whether the factor that is constraining the model performance is the limited strength of the nation-wide available predictor variables. To assess this question, additional “hold-out models” were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated and model performance was evaluated with NSEcv. As expected, results indicate an increase of model performance with increasing number of peatlands used in the model building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when including many more peatlands into the data set. More promising would be the specific collection of more data on the weakly represented and/or important land cover classes, arable land and grassland.

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nation-wide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that has been successfully conducted in Finke et al. (2004) using the transfer noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, e.g., spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± sd.
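Two building blocks of such a hold-out experiment can be sketched as follows: the Nash–Sutcliffe efficiency (Nash and Sutcliffe, 1970) and the random selection of whole peatlands. The sketch is unweighted, whereas the study weights its data, and the peatland identifiers and seed are hypothetical.

```python
import random

def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squares around the observed mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst

def random_peatland_subsets(peatland_ids, size, repeats, seed=0):
    """Draw `repeats` random selections of `size` peatlands (without replacement)."""
    rng = random.Random(seed)
    return [rng.sample(peatland_ids, size) for _ in range(repeats)]

obs = [1.0, 2.0, 3.0]
perfect = nse(obs, obs)                    # 1.0 for a perfect prediction
mean_only = nse(obs, [2.0, 2.0, 2.0])      # 0.0 when predicting the observed mean

peatlands = list(range(53))                # hypothetical IDs for 53 peatlands
subsets = random_peatland_subsets(peatlands, size=10, repeats=500)
```

Each subset would then be used to calibrate a model and compute NSEcv; repeating this for increasing subset sizes gives a learning curve of the kind shown in Fig. 13.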

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data cover only a small fraction of the peatlands of the final map and thus spatial interpolation is not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG-relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. With the analysis of their effect on the water level, it was possible to gain insight into natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions depend linearly. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions, and avoided model bias.

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show, however, that model improvement by increasing the number of WLt data seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively few data are available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, for example water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach is suited to application to any other country where similar data of target and predictor variables are available. It is important that the spatial resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if for specific regions detailed information on peat properties becomes available and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), Landkreis Gifhorn, NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support; Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support; Annette Freibauer for reviewing the paper draft; and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project “Organic soils” funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43(5), 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirby, M.: A physically based variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De'ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz – Moornutzungsstrategien 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.

Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package “dismo”, version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada's forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8(5), 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package “gbm”, version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.

Tetzlaff, B., Kuhr, P., and Wendland, F.: A New Method for Creating Maps of Artificially Drained Areas in Large River Basins Based on Aerial Photographs and Geodata, Irrig. Drain., 58, 569–585, doi:10.1002/Ird.426, 2009.

Thompson, J. R., Gavin, H., Refsgaard, A., Sorenson, H. R., and Gowing, D. J.: Modelling the hydrological impacts of climate change on UK lowland wet grassland, Wetl. Ecol. Manage., 17, 503–523, doi:10.1007/s11273-008-9127-1, 2009.

UBA: National Inventory Report for the German Greenhouse Gas Inventory 1990–2008, Submission under the United Nations Framework Convention on Climate Change and the Kyoto Protocol 2012, Dessau, Germany, 2012.

van den Akker, J. J. H., Jansen, P. C., Hendriks, R. F. A., Hoving, I., and Pleijter, M.: Submerged infiltration to halve subsidence and GHG emissions of agricultural peat soils, Proceedings of the 14th International Peat Congress, Stockholm, Sweden, 2012.

van der Gaast, J. W. J., Massop, H. T. L., and Vroon, H. R. J.: Actuele grondwaterstandsituatie in natuurgebieden: Een Pilotstudie, Wettelijke Onderzoekstaken Natuur & Milieu, WOt-rapport 94, Wageningen, 134 pp., 2009.

van der Ploeg, M. J., Appels, W. M., Cirkel, D. G., Oosterwoud, M. R., Witte, J. P. M., and van der Zee, S. E. A. T. M.: Microtopography as a Driving Mechanism for Ecohydrological Processes in Shallow Groundwater Systems, Vadose Zone J., 11, 52–62, doi:10.2136/Vzj2011.0098, 2012.

Wackernagel, H.: Multivariate Geostatistics, Springer, Berlin, Germany, 387 pp., 2003.

was to regionalize annual mean water levels and not the GHG emissions themselves. The latter are influenced by more site characteristics, in particular soil properties. Furthermore, we suppose that annual mean water level is probably not the only or optimal statistical measure to describe the water level effect on annual GHG emissions. However, we are not aware of well-established information about transfer functions that relate more complex statistical measures of water level dynamics to GHG emissions. Therefore, we here focused on the simple and frequently applied “annual mean water level”.

As a first step, we compiled a new data set of phreatic water level time series of organic soils with contributions from numerous data sources. Based on this data, we developed a modeling approach for the annual mean water level that follows the basic idea of the statistical regionalization presented in Finke et al. (2004). However, the data coverage in our study substantially differed from their study. Our data covers only a small fraction of the peatlands of the final map, and spatial interpolation of residuals was not possible. We thus extended their approach by

– including additional possible predictor variables,

– using boosted regression trees as a modeling tool to identify the influence of both numerical and categorical variables simultaneously,

– applying a new weighting scheme that balances out heterogeneous water level data sets with highly variable spatial data density,

– transforming the annual mean water level, WL, into a transformed annual mean water level, WLt, that shows a linear relationship with the GHG budget and optimizes model calibration for the WL range relevant for GHG emissions, and

– restricting the water level regionalization to phreatic water levels of organic soils.

We present a detailed analysis of the influence of the individual predictor variables on water levels of organic soils as well as their interactions. Furthermore, the manuscript includes the estimation of model uncertainty and possible paths of future model improvement. Finally, the calibrated model is used to derive a map of WLt for all organic soils in Germany, and the regionalization results are presented.

2 Data set and methods

2.1 Data set of phreatic water levels in organic soils

Available data of phreatic water levels in organic soils are scarce. In contrast to data of rather deeply drilled observation wells of official groundwater monitoring networks, data from short peatland observation wells of only 1 or 2 meters length that measure the phreatic water level of the peat layer are currently not collected in central data management systems in Germany or any of its federal states. With a comprehensive questionnaire started in 2011, we collected water level time series of organic soils from local agencies, non-governmental organizations, universities, consultants and other sources, and combined this data with water level data from our projects. Time series included manual and automatic measurements. Years with fewer than six measurements or data gaps of more than 3 months were excluded. Water level time series of each dip well were visually checked for plausible dynamics by comparing with data from neighboring dip wells and weather data time series. Based on auxiliary data and local knowledge, we further identified dip wells that reached down to the underlying aquifer. If dip wells failed these quality checks, they were removed from the data set.

Figure 1. Locations of the 1094 dip wells of the data set. Base map (geological map 1 : 200 000, BGR) shows the distribution of bog and fen peat and other organic soils.
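The exclusion rule for individual years can be sketched as follows. Two points are assumptions of this sketch rather than statements from the text: "3 months" is approximated as 92 days, and gaps are also measured to the start and end of the calendar year.

```python
from datetime import date

def year_is_usable(measurement_dates, year, min_count=6, max_gap_days=92):
    """Keep a year only if it has >= min_count readings and no gap > max_gap_days."""
    days = sorted(d for d in measurement_dates if d.year == year)
    if len(days) < min_count:
        return False
    # Assumption: gaps are also measured to the first and last day of the year.
    checkpoints = [date(year, 1, 1)] + days + [date(year, 12, 31)]
    gaps = [(b - a).days for a, b in zip(checkpoints, checkpoints[1:])]
    return max(gaps) <= max_gap_days

# Hypothetical measurement dates for one dip well
bimonthly = [date(2005, m, 15) for m in (1, 3, 5, 7, 9, 11)]
summer_only = [date(2005, 6, d) for d in (1, 5, 10, 15, 20, 25)]

keep_bimonthly = year_is_usable(bimonthly, 2005)       # six readings, gaps of about 2 months
keep_summer_only = year_is_usable(summer_only, 2005)   # six readings, but long gaps
keep_too_few = year_is_usable(bimonthly[:5], 2005)     # only five readings
```

The visual plausibility check and the identification of wells reaching the aquifer rely on expert judgment and are not covered by this sketch.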

The final data set comprised 7155 years of data from 53 German peatlands and 1094 dip wells. On average, time series covered 7 years. All time series were collected within the period from 1988 to 2012. Data are well distributed over most of the German peatland regions and cover the three major types of organic soils (Fig. 1). Compared with the distribution of the types of organic soils in Germany, dip wells on bogs are overrepresented in the data set by a factor of 2.5, while dip wells on fens and other organic soils are slightly underrepresented. Data also cover the common land-use types (for data sources, see Table 1). However, dip wells on organic soils that are not used for agriculture, forestry or peat mining, further referred to as “unused peatlands”, are overrepresented in the data set by a factor of 6, as data were collected more frequently and at higher spatial density in the frame of conservation projects. The fraction of unused peatlands in the German organic soils is 6 %, and the fraction in the data set is 36 %. In contrast, dip wells on arable land are underrepresented in the data set by a factor of 6. The fraction of arable land on German organic soils is 24 %, and the fraction in the data set is 4 %. The other two key land-use types of organic soils in Germany, grassland and forest, are well represented in the data set. The imbalance of the land-use types in the data set is accounted for in the weighting of data (see Sect. 2.3.2).

If land use changed within the measurement period of a dip well, the time series was split at the point at which the land-use record indicates the transition. For each segment, the annual mean water level WL (negative values denote water levels below ground) was calculated as the multi-year average over the whole measurement period of the specific land use.

The primary application of the WL map produced in this study is the upscaling of long-term GHG emissions, as emission reporting may only reflect anthropogenic effects but not interannual climatic effects. As GHG transfer functions are developed on annual data, their application requires both the long-term annual mean water level and its interannual variability. Due to the non-linear dependence of GHG emissions on WL, single years with extreme water levels can strongly influence long-term average GHG fluxes. This study focuses on the regionalization of the long-term annual mean water levels. For this objective, model building should be based on long-term water level time series to average out the effect of weather variation within a complete climatic period (commonly 30 years). The existing nationally available data on water levels of organic soils, however, do not comprise a single time series with complete coverage of the last 30 years. Due to this lack of sufficiently long time series, we included all time series in the model building process. Average climatic boundary conditions (precipitation, reference evapotranspiration, water balance) of the specific measurement period of each dip well are part of the predictor variables (see Sect. 2.2) and are thus expected to partly account for the effect of specific weather conditions on WL in the case of short measurement periods.

2.2 Predictor variables

The spatial coverage of phreatic water level data of organic soils is too low to obtain WL maps by simple spatial interpolation (Fig. 1). Additional spatial data are needed as a basis for regionalization. Ancillary information that covers all, or at least most, of the extent of the final map is necessary and can be used as predictor variables. A comprehensive set of variables (numerical and categorical) with potential indication for the hydrological condition of an organic soil was determined for each dip well (Fig. 2 and Table 1).

The predictor variables, some of which can also be found in Finke et al. (2004), can be divided into seven groups.

2.2.1 Land cover

As certain land uses and vegetation types require and reflect certain WL, such information can be used as an indicator for the average drainage level around the dip well. Land-use and vegetation information is based on the German Digital Landscape Model (ATKIS Basis-DLM), which is updated continuously by aerial photos as well as sporadic ground mapping and has a temporal accuracy of 3 months to 5 years. It is provided as fine-scaled polygons and represents the best uniform land cover information available in Germany. It contains information on the primary land-use type, a few optional vegetation attributes, and whether “wet soil” has been observed during mapping. As we noticed that the use of a large number of categorical variables lowers the performance of boosted regression trees, we further aggregated the three information types, (i) land use, (ii) vegetation and (iii) wet soil, into a set of nine combined land cover classes (Table 1). These land cover classes were a trade-off between fine differentiation and the number of replicates in each class. For grasslands, a “wet grassland” class was separated when grassland was overlaid with wet soil and/or tree or shrub vegetation, which may indicate less intensive management. Forests overlaid with wet soil were separated as “wet forest”. Further, unused peatlands overlaid with wet soil and showing no coverage with tree attributes were characterized by higher water levels and were thus separated as “wet unused peatland”. The very few dip wells classified as open water (n = 2) and peat cutting (n = 5) were merged into the reed and arable land cover classes, respectively. Land-use type and land cover class were extracted at the dip well (point extraction) and as fractions in various buffers around the dip well (Table 1). As using too many weak predictor variables lowers model performance and increases overfitting, the numerous land cover fractions were further aggregated into two classes: the fraction of dry (arable and grassland) and the fraction of wet (reed, wet grassland, wet forest and wet unused peatland) land cover on organic soils. For the calculation of the fraction of dry land cover, we tested various factors for reducing the contribution of grassland compared to arable land, as the grassland class also includes wetter grasslands that could not be detected with the available land cover catalogue. A factor of 0.5 proved optimal and was then kept fixed.
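
The aggregation into a dry and a wet land cover fraction can be illustrated as follows. This is a sketch under stated assumptions: the class labels, the function name and the normalization by the total organic soil area inside the buffer are illustrative choices; only the grassland factor of 0.5 and the membership of the wet classes come from the text.

```python
# Hypothetical labels for the wet land cover classes named in the text.
WET_CLASSES = ("reed", "wet grassland", "wet forest", "wet unused peatland")

def dry_and_wet_fraction(area_by_class, grassland_factor=0.5):
    """Return (f_dry, f_wet) for one buffer around a dip well.

    area_by_class maps a land cover class to its area on organic soil
    inside the buffer (any consistent unit).
    """
    total = sum(area_by_class.values())
    if total <= 0:
        return None, None  # no organic soil inside the buffer
    # Grassland counts only partly as "dry" because the class also
    # contains wetter grasslands that the catalogue cannot separate.
    dry = (area_by_class.get("arable", 0.0)
           + grassland_factor * area_by_class.get("grassland", 0.0))
    wet = sum(area_by_class.get(name, 0.0) for name in WET_CLASSES)
    return dry / total, wet / total

# Hypothetical areas (ha) in a 500 m buffer.
buffer_500m = {"arable": 2.0, "grassland": 4.0, "reed": 1.0,
               "wet forest": 1.0, "coniferous forest": 2.0}
f_dry, f_wet = dry_and_wet_fraction(buffer_500m)
print(f_dry, f_wet)
```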


2.2.2 Drainage network

Locations of ditches that are included as lines in the Digital Landscape Model were used to obtain information about the drainage network. The total ditch length was calculated for various buffer sizes. Further, the distance to the next ditch was calculated for each dip well. A short distance to the next ditch may indicate either lower or higher water levels, depending on whether the ditches are used for drainage or are already blocked and used for rewetting measures. Similarly, the indication of the total length of ditches is not unique. Therefore, we defined two different sets of ditch variables: a first set for which we calculated values for all land cover classes, and a second set for which we only calculated values for land cover classes in which ditches are undoubtedly used for drainage, i.e., arable and grassland.

2.2.3 Peatland characteristics

The geological map of Germany (scale 1 : 200 000) defined the area for which WL predictions were modeled. It is also the basis for topological peatland predictor variables, i.e., the fraction of organic soils in different buffer sizes as well as the dip well distance to the edge of the peatland. Information about the peatland type and the substrate at the peat base is presented in more detail in a newly compiled raster map of organic soils (Roßkopf et al., 2014) and was thus extracted from this map. Peatland types were aggregated into five classes: lowland bog (North German Plains and Alpine Forelands), upland bog (Central Uplands and Alps), fen neighboring surface water, fen without neighboring surface water, and a class of “other organic soils” that do not fulfill the C content and thickness criteria to be classified as peatland. Substrates at the peat base included loose unconsolidated rock (alluvial sand and gravel deposits), consolidated rock (bedrock) and peat clay layer. The first type may indicate the occurrence of seepage (positive or negative), whereas the latter two types may rather indicate a hydraulic decoupling from the aquifer hydraulic head.

2.2.4 Climatic boundary conditions

Climatic boundary conditions directly influence water level. On the one hand, the typical long-term climatic boundary conditions may indicate the general vulnerability of peatlands in a specific region. On the other hand, given the different lengths of measurement periods of the time series in this study, climatic boundary condition predictor variables may account for the effect on the water level of a measurement period that was climatically wetter or drier than the long-term average. Climatic boundary conditions were extracted from a 1 × 1 km raster from the German Weather Service. Annual, summer and winter precipitation, FAO56 Penman–Monteith reference evapotranspiration and climatic water balance (difference between precipitation and reference evapotranspiration) were determined for the individual measurement period of each dip well and as long-term averages (30 years).

2.2.5 Relative altitude

Relative altitude was calculated by subtracting the median altitude of various buffer sizes from the absolute altitude at each dip well in the digital elevation model (DEM). Relative altitude is expected to have two different indications, depending on the applied buffer size: (i) in many peatlands, the formerly smooth peatland relief at the scale of approximately > 5 m has been disturbed by peat cutting and by differences in drainage and mineralization rate. As a consequence, the rather smooth phreatic surface often does not follow the uneven and patchy terrain. Relative altitude with respect to smaller buffer sizes (< 250 m) may therefore explain part of the WL variation, e.g., a dip well located at a surface much higher than its surroundings may indicate deeper water levels; (ii) for large buffer sizes (> 250 m), relative altitude indicates whether the peatland lies in a larger morphological depression or elevation, and thus may indicate whether large-scale lateral inflow of water can be expected or not. A similar indication is provided by the topographic index (see below). The accuracy of relative altitude values depends on the resolution and accuracy of the DEM. The nation-wide available DEM is based on data sets of varying quality, which may lower the influence of this variable.
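
The definition of relative altitude can be sketched on a gridded DEM as follows. This is an illustration only: the DEM is a plain list of rows, the buffer is approximated by all cell centres within the given radius (including the dip well cell itself), and the function name and arguments are assumptions rather than the GIS workflow used in the study.

```python
from statistics import median

def relative_altitude(dem, row, col, radius_m, cell_m):
    """Altitude at (row, col) minus the median altitude inside a circular buffer.

    dem: list of rows with elevations (m); cell_m: cell size (m).
    """
    reach = int(radius_m // cell_m)  # number of cells to search in each direction
    values = []
    for i in range(max(0, row - reach), min(len(dem), row + reach + 1)):
        for j in range(max(0, col - reach), min(len(dem[0]), col + reach + 1)):
            # Keep cells whose centre lies within the buffer radius.
            if ((i - row) ** 2 + (j - col) ** 2) * cell_m ** 2 <= radius_m ** 2:
                values.append(dem[i][j])
    return dem[row][col] - median(values)

# Hypothetical 3 x 3 DEM with a dip well on a small mound.
dem = [[1.0, 1.0, 1.0],
       [1.0, 2.0, 1.0],
       [1.0, 1.0, 1.0]]
print(relative_altitude(dem, 1, 1, radius_m=10, cell_m=10))
```

A positive value means the dip well sits above its surroundings, which, following the text, may indicate deeper water levels for small buffers.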

2.2.6 Topographic wetness index

The topographic wetness index is a common wetness indicator used in hydrology (Beven and Kirkby, 1979). It is a combined measure of catchment area and slope at a given point and indicates the extent of flow accumulation. High values indicate wetter conditions. If calculated at larger scales, higher values may hint at the occurrence of positive seepage, i.e., upward flow of water from the aquifer. The topographic wetness index was calculated for various DEM resolutions using the GRASS 7 module r.watershed.
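
For orientation, the index in its usual form (Beven and Kirkby, 1979) is given below. The symbols a and β do not appear in the text above and are introduced here only for illustration; whether the r.watershed output was post-processed in any way is not stated.

```latex
\[
\mathrm{TWI} = \ln\!\left(\frac{a}{\tan\beta}\right)
\]
% a:     upslope contributing area per unit contour width
% \beta: local slope angle
```

A large contributing area and a gentle slope both raise the index, which matches the reading of high values as wetter conditions.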

2.2.7 Protection status

The protection status of a peatland area may reflect hydrological conditions. Therefore, we checked seven protection statuses at each dip well (see Table 1 for details).

2.3 Model building scheme

Model building was performed using boosted regression trees (BRT) implemented in the two R packages “gbm” (Ridgeway, 2013) and “dismo” (Hijmans, 2013). BRT is a machine-learning algorithm in which the final model is derived from the data. Functions that relate target to predictor variables are not predetermined, but freely developed. BRT is based on the decision (or regression) tree concept. In the decision tree concept, the parameter space is searched sequentially for the best split that results in the lowest model mean squared error. The mean responses of the groups that result from the various splits, and correspond to certain parameter ranges, represent the model. The common procedure is to grow a large tree, which is subsequently simplified by dropping weak links that are identified with cross-validation. Growing only a single tree has several disadvantages, such as uneven functions that are very sensitive to the specific sample of the data. Therefore, ensemble techniques have been combined with the decision tree concept. These were first the development of multiple models by bootstrapping of the samples (bagging technique) and the random creation of subsets of predictors at each split (random forest technique). Later, with the “boosting” technique of BRT, a sequential procedure was developed in which data are re-weighted after each tree to increase emphasis on data that are poorly modeled by the existing collection of trees (Elith et al., 2008).

BRT modeling is increasingly applied in spatial modeling of species or numerical environmental variables (Elith et al., 2008; Martin et al., 2011), often showing superior performance compared to other machine-learning algorithms. The increasing application of BRT is related to several of its favorable characteristics. The strength of this method lies in its ability to fit complex functional dependencies, including non-linear relationships and interactions between predictor variables. Owing to its flexibility, BRT is invariant to monotonic transformations of predictors. Furthermore, BRT allows for missing values in the predictor variables; thus, predictor variable information does not necessarily need to fully cover the total map extent. The gbm package handles missing values in predictor variables by introducing surrogate splits. The mean target value belonging to the missing predictor values is attributed to these surrogate splits during model building. We observed that the contribution of a predictor variable to the final model decreases with an increasing number of missing values. This is intuitive, as target observations with missing predictor values can mostly be expected to scatter strongly. BRT is further fairly insensitive to outliers and allows estimating the relative contribution of each predictor variable to the model. Due to these characteristics, we expected BRT to be very well suited to the very heterogeneous data set of this study.

BRT model calibration is prone to overfitting, and there are various options to reduce this behavior. Because of this, cross-validation is generally part of the model building process. However, cross-validation can be performed in several ways and, if performed carelessly, can lead to overly optimistic estimates of model performance (De’ath, 2007). Here, cross-validation was performed by leaving out whole peatland areas instead of a random set of dip wells. This represents a stricter cross-validation, and we noticed that it strongly reduced overfitting of the water level data and thus contributed to the development of a more robust model.
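
The idea of leaving out whole peatland areas can be sketched as a grouped split of the dip well indices. The example below assumes one fold per peatland; the text does not state whether peatlands were held out one at a time or in groups, and the function name is hypothetical.

```python
def leave_peatland_out_folds(peatland_ids):
    """Yield (held_out_id, train_idx, test_idx), one fold per peatland.

    peatland_ids: sequence giving the peatland of each dip well.
    All dip wells of the held-out peatland go to the test set, so no
    peatland contributes to both calibration and validation.
    """
    for held_out in sorted(set(peatland_ids)):
        test_idx = [i for i, p in enumerate(peatland_ids) if p == held_out]
        train_idx = [i for i, p in enumerate(peatland_ids) if p != held_out]
        yield held_out, train_idx, test_idx

# Hypothetical data set: six dip wells in three peatlands.
ids = ["A", "A", "B", "C", "C", "C"]
for held_out, train_idx, test_idx in leave_peatland_out_folds(ids):
    print(held_out, train_idx, test_idx)
```

In contrast to a random split, neighboring dip wells of the same peatland can never appear on both sides of a fold, which is what makes this form of cross-validation stricter.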

Figure 2. Illustration of the predictor variables determined for each dip well based on available national maps (see Table 1).

Another option to avoid overfitting is to impose monotonic slopes on the effects of individual parameters, which can even lead to improved prediction performance (De’ath, 2007). For all our numerical variables, we expected monotonic slopes rather than optimum functions. To avoid predefining any expected direction, all numerical variables were added twice to the set of predictors, once with the slope constrained to a monotonic increase and once with the slope constrained to a monotonic decrease. We let the model decide whether monotonic increase or decrease has higher predictive power.

Models were calibrated using a Gaussian response type, aimed at minimizing deviance (squared error) (Ridgeway, 2013). In all calibration runs, we applied the gbm.step function of the dismo package, which assesses the optimal number of boosting trees using cross-validation. We tested various learning rates (0.001–0.01), bag fractions (0.1–0.8) and levels of tree complexity (3 to 7), i.e., the number of nodes in a tree. By trial and error, we determined the most effective algorithm parameters for our data set to be 0.005 for the learning rate, 0.6 for the bag fraction and 5 for the tree complexity.

The final BRT model building is commonly performed as a two-step procedure (Elith et al., 2008), which we essentially followed in our study:

i. In the first step, the whole set of predictor variables is used to calibrate a BRT model.

ii. In the second step, the number of parameters is reduced sequentially to avoid overfitting and to derive a more parsimonious model. We tracked predictive performance criteria during the simplification process. As various variables were calculated for different buffer sizes, our predictors included a large number of correlated variables. Correlation coefficients between predictor variables of > 0.7 are known to severely distort model estimation and subsequent prediction (Dormann

Table 1. Overview on predictor variables.

| Predictor variable | Variable name | Values | Point/buffers (m) | Data source |
|---|---|---|---|---|
| Land-use type | | Arable, grassland, forest, shrubs, peat-mining, unused peatland, swamp, open water | Point, 100, 500, 1000, 2500 | Digital Landscape Model (1) |
| Vegetation attributes (optional) | | Deciduous forest, mixed forest, coniferous forest, reed, shrubs, grass | Point | Digital Landscape Model (1) |
| “Wet soil observed” | | Yes, no | Point | Digital Landscape Model (1) |
| Combined land cover information (land-use type, veg. and wet soil attr.) | lc | Arable, grassland, wet grassland, deciduous including mixed forest, wet forest, coniferous forest, reed, unused peatland, wet unused peatland | Point, 100, 500, 1000, 2500 | Digital Landscape Model (1) |
| Dry land cover fraction | f_dry(X) | (Arable + 0.5 × grassland) on organic soil area; 0 to 1 | 100, 500, 1000, 2500 | Digital Landscape Model (1) |
| Wet land cover fraction | | Reed, wet grassland, wet forest and wet unused peatland on organic soil area; 0 to 1 | 100, 500, 1000, 2500 | Digital Landscape Model (1) |
| Total length of ditches for all lc and only for arable and grassland (subscr. “dry”) | di_len,dry(X) | ≥ 0 m | Point, 50, 250, 1000, 2500 | Digital Landscape Model (1) |
| Distance to next ditch | | ≥ 0 m | Point | Digital Landscape Model (1) |
| Peatland type | ptype | Lowland bog, upland bog, fen neighboring surface water, fen without neighboring surface water, other “low-C” organic soil | Point | Map of organic soils (2) |
| Material at peat base | pbase | Unconsolidated rock, peat clay layer, rock, no information | Point | Map of organic soils (2) |
| Peatland fraction | f_peat(X) | 0 to 1 | Point, 500, 1000, 2500 | Geological map (BGR) (3) |
| Distance to edge of peatland | | > 0 m | | Geological map (BGR) (3) |
| Ratio of d_peat/f_peat | | > 0 | 2500 | Geological map (BGR) (3) |
| Precipitation | | ≥ 0 mm | Point | Raster map 1 × 1 km (DWD) (4) |
| Evapotranspiration | | ≥ 0 mm | Point | Raster map 1 × 1 km (DWD) (4) |
| Climatic water balance | wb_summer | < 0 and ≥ 0 mm | Point | Raster map 1 × 1 km (DWD) (4) |
| Relative height | h_rel(X) | < 0 and ≥ 0 m | Point – median; 25, 50, 100, 250, 500, 1000 | Digital Elevation Model (5) |
| Topographic index | ti_rasR(X) | > 0 | Point and 1000 buffer for 10, 25, 250, 1000 raster values | Digital Elevation Model (5) |
| Protection status | | Nature conservation area, special areas of conservation, special protection area for wild birds, UNESCO biosphere reserve, nature park, national park, landscape protection area | Point | Maps of protected areas (6) |

(1) ATKIS Basis DLM, Federal Agency for Cartography and Geodesy (BKG); (2) map of organic soils (Roßkopf et al., 2014; Humboldt University of Berlin); (3) geological map 1 : 200 000 (GUEK200, BGR – Federal Institute for Geosciences and Natural Resources); (4) raster map 1 × 1 km of weather data (German Weather Service); (5) BKG; (6) Federal Agency for Nature Conservation (BfN). Variable names are indicated for the nine variables in the final model, with (X) indicating buffer size and R indicating raster resolution.

Figure 3. Illustration of the annual mean water level (WL) transformation. Hypothetical transfer function relating GHG budget to WL (m) (a); GHG budget vs. the transformed water level (WLt) (b); WLt vs. WL. The lines along the x axes indicate the data quantiles of the analyzed data set (c).

et al., 2013). Thus, we performed this simplification process by first dropping those parameters with a correlation > 0.7 (either Pearson or Spearman type) to another parameter with a higher contribution (Clapcott et al., 2011). This ensured that two highly correlated parameters would not remain in the parameter set longer than the last parameter of another group of variables, which may contribute less than the two highly correlated parameters but provides extra information that is not covered by the other parameters. After all highly correlated parameters had been dropped, further parameters with low contribution were dropped progressively.
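
The correlation-based part of the simplification can be sketched as a single greedy pass. This is a simplified illustration, not the study's procedure: it uses Pearson correlation only, takes the predictor contributions as fixed inputs instead of refitting a BRT model after each removal, and compares each candidate only with predictors that have already been kept. All predictor names and numbers are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation information
    return sxy / sqrt(sxx * syy)

def drop_correlated(columns, contribution, threshold=0.7):
    """Keep a predictor only if |r| <= threshold to every kept predictor
    with a higher contribution.

    columns: dict name -> list of values; contribution: dict name -> influence.
    """
    kept = []
    for name in sorted(columns, key=contribution.get, reverse=True):
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical predictors: the same variable for two buffer sizes plus one other.
columns = {"f_dry_100": [0.1, 0.4, 0.5, 0.9],
           "f_dry_500": [0.2, 0.4, 0.6, 0.8],
           "wb": [3.0, -1.0, 2.0, 0.0]}
contribution = {"f_dry_500": 30.0, "f_dry_100": 20.0, "wb": 10.0}
print(drop_correlated(columns, contribution))
```

In this example the two buffer variants are strongly correlated, so only the one with the higher contribution survives, while the weaker but largely uncorrelated predictor is retained.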

Predictor contributions are calculated as proportional contributions to the total error reduction and can be considered a measure of the influence of the individual predictors. Additionally, a BRT model allows the derivation of partial dependence plots, which indicate how the response is affected by a certain predictor after accounting for the average effects of all other predictors in the model (Elith et al., 2008). These plots do not show the full effect of each parameter on the model response, due to interactions with other parameters that are fixed to derive these plots, as well as due to correlation among parameters. However, they can be used for interpreting model behavior (Elith et al., 2008).

2.3.1 WLt transformation of WL

The map of water levels of this study was developed to improve the upscaling of greenhouse gas emissions from organic soils. Therefore, the final map should provide the highest accuracy for the water level range in which the largest differences in greenhouse gas emissions occur. This can be achieved by transforming WL into a transformed variable WLt, which shows a linear relationship with GHG emissions. The sensitivity of greenhouse gas emissions to water level has been analyzed in several laboratory and field experimental and monitoring studies (Berglund and Berglund, 2011; Drösler et al., 2011; Hahn-Schöfl et al., 2011; Leiber-Sauheitl et al., 2014; Moore and Roulet, 1993; Moore and Dalva, 1993; van den Akker et al., 2012). General trends are a strong increase of methane (CH4) emissions for annual mean water levels of approximately > −0.1 m, and an increase of CO2 emissions for water levels < −0.1 m, with a trend similar to a saturation function that levels out approximately between −0.4 and −0.8 m (Fig. 3a). While studies agree on these general trends, the exact shape of the transfer function and the maximum levels of emissions, as well as their dependence on soil properties and other environmental parameters, are still controversial. Here, we assume a hypothetical transfer function relating the normalized GHG budget, ranging from 0 to 1, to the water level (see also Fig. 3):

\[
\text{GHG Balance} =
\begin{cases}
-e^{3(\mathrm{WL}+0.1)} + 1, & \mathrm{WL} \le -0.1 \\
1 - e^{-3(\mathrm{WL}+0.1)}, & \mathrm{WL} > -0.1
\end{cases}
\tag{1}
\]

As the GHG budget can be positive for both low and high WL, we introduced the transformed water level WLt as (Fig. 3)

\[
\mathrm{WL_t} =
\begin{cases}
e^{3(\mathrm{WL}+0.1)} - 1, & \mathrm{WL} \le -0.1 \\
1 - e^{-3(\mathrm{WL}+0.1)}, & \mathrm{WL} > -0.1
\end{cases}
\tag{2}
\]

By calibrating the model to both WL and WLt, we test whether the optimization of WLt provides the highest model accuracy for the water level range relevant for GHG emissions and whether it optimizes the map for application to GHG upscaling.
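
The hypothetical transfer function and the transformation can be written down directly from Eqs. (1) and (2). The sketch below uses function names chosen for illustration and takes WL in metres with negative values below ground, as defined in Sect. 2.1.

```python
from math import exp

def ghg_budget(wl):
    """Normalized hypothetical GHG budget of Eq. (1); wl in m, negative below ground."""
    if wl <= -0.1:
        return 1.0 - exp(3.0 * (wl + 0.1))
    return 1.0 - exp(-3.0 * (wl + 0.1))

def wl_transformed(wl):
    """Transformed water level WLt of Eq. (2)."""
    if wl <= -0.1:
        return exp(3.0 * (wl + 0.1)) - 1.0
    return 1.0 - exp(-3.0 * (wl + 0.1))

for wl in (-0.8, -0.4, -0.1, 0.0):
    print(wl, round(wl_transformed(wl), 3), round(ghg_budget(wl), 3))
```

Under this reading, WLt increases monotonically with WL and changes sign at −0.1 m, and the normalized GHG budget equals the absolute value of WLt. This is the sense in which WLt is linearly related to the GHG budget on either side of −0.1 m.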

2.3.2 Weighting scheme

When considering possible data weighting schemes, it is worth emphasizing that the goal of this study is the development of a statistical model that can explain the water level variability both within a peatland and among different peatlands. The data of target and predictor variables for building this model are highly heterogeneous. First, the target variable data set contains peatland areas that strongly differ in their spatial extent and in the number of installed dip wells. Second, the predictor variable data set contains categorical and numerical data, and some of the predictor variables predominantly vary from peatland to peatland (e.g., climatic boundary conditions, large-scale topographic wetness index, peatland characteristics), whereas others also show within-peatland variability (e.g., land use, small-scale topographic wetness index, drainage network). As the influence of the individual predictor variables on our target WLt is expected to be rather diffuse due to abundant interactions with other site characteristics, the robustness of derived dependencies will strongly depend on the number of different peatlands in the data set.

There are no universal data weighting rules for similarly heterogeneous data situations, and some degree of expert judgment and subjectivity is inevitably involved when developing an appropriate scheme (Francis, 2011). The need for a data weighting scheme is obvious: without data weighting during calibration, too much influence would be given to small and well-studied peatlands, which would reduce predictive model performance for large, less well-studied peatland areas. To avoid this in a simple manner, the weight of each dip well could be divided by the number of dip wells in its peatland, which results in each peatland being equally weighted. This scheme, however, does not sufficiently use the high information content provided by well-studied large peatlands, which should have a higher impact on model calibration than a small peatland with only a few dip wells.

Here, we propose a new weighting scheme that takes into account both factors, peatland size and local density of dip wells, to derive dip-well-specific weighting factors. It is based on principles of data uncertainty reduction by repeated measurements and of geostatistics. First, we consider our data situation as an analogue of a meta-analysis with grouped data. It has been shown for homogeneous problems (all data from the same population) that the optimal group weights for meta-analysis are 1/SE² (Hedges and Olkin, 1985), with SE being the standard error of each group:

\[
\mathrm{SE} = \frac{\sigma_e}{\sqrt{N}},
\tag{3}
\]

where σ_e is the error standard deviation of a measurement and N is the number of measurements in a group. For homogeneous problems and uniform σ_e, this results in weights that are linearly dependent on N, which we here call the first end member of weighting. Heterogeneity (within-group variance) reduces the variation of the group weights, which can be shown by random effect models (Cumming, 2012). As the second end member of weighting, when heterogeneity totally dominates within-group variance, optimal group weights are uniform for all groups, i.e., weights are independent of N. We are not aware of a method that allows the estimation of the degree of heterogeneity for the complex target and predictor data situation in this study, including data errors (spatial and temporal variability, measurement error) and model errors (missing parameters). As a trade-off between 1/SE² (homogeneous end member) and 1 (heterogeneous end member), we decided on a group weight that is the inverse of the standard error, 1/SE, which is, for example, often used in econometric studies (Dickens, 1990). We emphasize that this is a subjective decision.

Figure 4. Sample semivariogram and fitted semivariogram model of the annual mean water level data WL.
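
To make the three candidate group weights concrete, they can be written in terms of N using the definition of SE in Eq. (3). The group sizes N = 4 and N = 16 in the second line are arbitrary values chosen for illustration, and uniform σ_e is assumed.

```latex
\[
\underbrace{\frac{1}{\mathrm{SE}^2} = \frac{N}{\sigma_e^2}}_{\text{homogeneous end member}}
\qquad
\underbrace{\frac{1}{\mathrm{SE}} = \frac{\sqrt{N}}{\sigma_e}}_{\text{chosen trade-off}}
\qquad
\underbrace{1}_{\text{heterogeneous end member}}
\]
\[
\frac{\text{group weight}(N=16)}{\text{group weight}(N=4)} = 4,\quad 2,\quad 1
\quad\text{for the three schemes, respectively.}
\]
```

The chosen weight thus still rewards larger groups, but only with the square root of the group size.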

The group weight 1/SE is the basis for the geostatistical part of our weighting scheme. There are two reasons why we cannot directly treat our peatlands as groups. First, there is within-peatland variability, which is related to changing site characteristics. It is one objective of our study to describe this variability by statistical modeling. Thus, dip wells must be treated individually, and data cannot be aggregated at the peatland level. Second, we expect the model to learn more when the same number of dip wells is installed in a larger peatland. In a small peatland, spatial autocorrelation between dip wells is higher, i.e., the information content is lower than for large peatlands. As a consequence of the first point, we do not aggregate, and keep all dip wells in the target variable data set by attributing to each dip well the fraction 1/N of its group weight, so that the relative weights of the groups remain constant. As a consequence of the second point, we use principles of geostatistics in our weighting scheme. We replace the group size N (positive integer) by the “statistical” group size n (positive continuous number ≥ 1), which we derive from the spatial autocorrelation among the dip wells.

Therefore, we analyze the spatial autocorrelation structure of the data set. A single spherical variogram model was fitted to the sample variogram of all data (Fig. 4 in Sect. 3.1). Variogram models allow the differentiation of the total data variance (called “sill”) into a spatially uncorrelated variance (called “nugget”) and a spatially correlated variance (called “structural variance” and defined as sill − nugget) (Wackernagel, 2003). For any distance between two locations, the variogram model yields the average squared difference of values, here defined as γ. By definition, at distance 0 the average squared difference equals the nugget, and at distances greater than the so-called “range” of spatial autocorrelation, the average squared difference equals the sill. Accordingly, the autocorrelated fraction f of the average squared difference between two dip wells i and j is

\[
f_{ij} = \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}}.
\tag{4}
\]

We now define the “statistical” group size n of each dip well i to be one plus the sum of the autocorrelated fractions f_ij of all dip wells that are within the range of spatial autocorrelation of i:

\[
n_i = 1 + \sum_{j=1}^{m} \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}}.
\tag{5}
\]

According to the discussion above, dip-well-specific weights can then be calculated with

\[
w_i = \frac{1}{n_i\,\mathrm{SE}_i} = \frac{1}{\sigma_{e,i}\sqrt{n_i}},
\tag{6}
\]

where n_i is derived from Eq. (5). The equation shows that with increasing “statistical” group size n, i.e., with increasing spatial data density, the weight of an individual dip well is “down-weighted” to some degree, a behavior that corresponds to our initial intention to lower the influence of small peatlands compared to large ones. The error standard deviation σ_e depends on several factors, e.g., the length of the time series, the temporal measurement density and the microtopography around the dip well. For simplicity, we here assumed σ_e to be uniform for all dip wells, which simplifies Eq. (6) to \( w_i = 1/\sqrt{n_i} \).

Only dip wells with the same land-use type were summedup with Eq (5) which avoids the down-weighting by dipwells that have different land-use types The latter are mostlycharacterized by fairly different WLt and thus by rather lowspatial autocorrelation to dip welli
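The weighting of Eqs. (4) to (6) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names, the toy coordinates and the single land-use label are hypothetical, the spherical model is written in its standard textbook form, and σ_e is taken as uniform so that w_i = 1/\sqrt{n_i}. The nugget, sill and range are the fitted values reported in Sect. 3.1.

```python
import math

def spherical_variogram(h, nugget, sill, range_):
    """Standard spherical model: tends to the nugget as h -> 0, equals the sill for h >= range_."""
    if h >= range_:
        return sill
    r = h / range_
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

def dip_well_weights(coords, land_use, nugget, sill, range_):
    """Return w_i = 1 / sqrt(n_i), with n_i from Eq. (5) and uniform sigma_e (assumption)."""
    weights = []
    for i, (xi, yi) in enumerate(coords):
        n_i = 1.0
        for j, (xj, yj) in enumerate(coords):
            # Only other wells of the same land-use type contribute to n_i.
            if i == j or land_use[i] != land_use[j]:
                continue
            h = math.hypot(xi - xj, yi - yj)
            if h < range_:
                gamma = spherical_variogram(h, nugget, sill, range_)
                n_i += (sill - gamma) / (sill - nugget)  # Eq. (4)
        weights.append(1.0 / math.sqrt(n_i))
    return weights

# Hypothetical toy layout (coordinates in m): two neighboring wells and one isolated well.
coords = [(0.0, 0.0), (100.0, 0.0), (5000.0, 0.0)]
land_use = ["grassland", "grassland", "grassland"]
weights = dip_well_weights(coords, land_use, nugget=0.012, sill=0.09, range_=2700.0)
```

In this toy layout the two neighboring wells share information and receive weights below one, while the isolated well keeps a weight of one.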

After spatial correlation had been accounted for, the weights of all dip wells of each land-use type were rescaled so that their sum corresponds to the fraction of this land-use type in Germany. This adjustment accounts for the overrepresentation in the data set of dip wells in unused peatlands and the underrepresentation of dip wells in arable land.
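The land-use adjustment amounts to a per-class rescaling of the weights. The sketch below is only an illustration: the function name and the area fractions are hypothetical placeholders, not the national statistics used in the study.

```python
def rescale_to_land_use_fractions(weights, land_use, target_fraction):
    """Rescale weights so that each land-use class sums to its target fraction."""
    class_totals = {}
    for w, lu in zip(weights, land_use):
        class_totals[lu] = class_totals.get(lu, 0.0) + w
    return [w * target_fraction[lu] / class_totals[lu]
            for w, lu in zip(weights, land_use)]

# Hypothetical example: unused peatland is oversampled, arable land undersampled.
raw_weights = [1.0, 0.7, 0.7, 0.5, 0.5, 0.5]
classes = ["arable", "grassland", "grassland", "unused", "unused", "unused"]
fractions = {"arable": 0.3, "grassland": 0.6, "unused": 0.1}  # placeholder values
adjusted = rescale_to_land_use_fractions(raw_weights, classes, fractions)
```

After rescaling, the single arable dip well carries the whole arable share, whereas the three wells in unused peatland together carry only the small share assigned to that class.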

2.3.3 Model performance criteria

Model fit and predictive performance after cross-validation were quantified by the weighted root mean square error,

\mathrm{RMSE} = \sqrt{\frac{1}{\sum_{i=1}^{m} w_i} \sum_{i=1}^{m} \left( w_i \left( x_{o,i} - x_{s,i} \right)^2 \right)}, \quad (7)

where m is the number of dip wells, x_{o,i} is observed WL or WLt of dip well i, x_{s,i} is simulated WL or WLt of dip well i, and w_i is the data weight of dip well i (see Eq. 6). We refer to the root mean square error of the predicted data of cross-validation as RMSEcv. Model performance was further quantified by Nash–Sutcliffe efficiency (NSE),

\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{m} w_i \left( x_{o,i} - x_{s,i} \right)^2}{\sum_{i=1}^{m} w_i \left( x_{o,i} - \bar{x}_o \right)^2}, \quad (8)

where \bar{x}_o is the mean of all observed WL or WLt. NSE indicates how well observed vs. predicted values match the 1:1 line. NSE is a good overall indicator of predictive performance because it combines scatter and bias (common offset and/or slope difference from the 1:1 line) (Nash and Sutcliffe, 1970). Values greater than 0 signify a model that is better than the reference model based on the data mean. We refer to the NSE of the training data as NSEcal and of the predicted data of cross-validation as NSEcv.

Systematic errors were quantified by calculating the model bias, here defined as

\mathrm{BIAS} = \sum_{i=1}^{m} \left( w_i x_{o,i} - w_i x_{s,i} \right). \quad (9)
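The three criteria of Eqs. (7) to (9) can be written compactly as follows. This is a sketch under two assumptions that the text does not settle: the observed mean in Eq. (8) is taken as the weighted mean, and the bias of Eq. (9) is divided by the sum of the weights, which is equivalent to Eq. (9) only if the weights sum to one. The function names and the toy data are hypothetical.

```python
import math

def weighted_rmse(obs, sim, w):
    """Weighted root mean square error, Eq. (7)."""
    sq = sum(wi * (o - s) ** 2 for o, s, wi in zip(obs, sim, w))
    return math.sqrt(sq / sum(w))

def weighted_nse(obs, sim, w):
    """Weighted Nash-Sutcliffe efficiency, Eq. (8)."""
    # Assumption: the reference mean is the weighted mean of the observations.
    mean_obs = sum(wi * o for o, wi in zip(obs, w)) / sum(w)
    num = sum(wi * (o - s) ** 2 for o, s, wi in zip(obs, sim, w))
    den = sum(wi * (o - mean_obs) ** 2 for o, wi in zip(obs, w))
    return 1.0 - num / den

def weighted_bias(obs, sim, w):
    """Weighted bias (observed minus simulated), Eq. (9) with normalized weights (assumption)."""
    return sum(wi * (o - s) for o, s, wi in zip(obs, sim, w)) / sum(w)

# Hypothetical toy data on the WLt scale.
obs = [-0.6, -0.3, 0.1]
sim = [-0.5, -0.4, 0.0]
w = [1.0, 0.5, 0.7]
rmse, nse, bias = weighted_rmse(obs, sim, w), weighted_nse(obs, sim, w), weighted_bias(obs, sim, w)
```

With this sign convention, a positive bias means that the simulated values are on average lower than the observed ones.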

2.4 Model uncertainty and stability evaluation

Uncertainty of the model predictions was assessed by bootstrapping, cross-validation and residual analysis.

For the bootstrapping analysis, we followed the procedure of Leathwick et al. (2006). We estimated the confidence intervals around the predictions and the fitted functions by taking 1000 bootstrap samples of the 53 peatlands. The number of peatlands in each sample was equivalent to the data set, but peatlands were selected randomly with replacement. Using the predictor variables of the final model, a BRT model was fitted to each sample. Cross-validation was again performed on peatlands; thus, a peatland in the calibration data set was not part of the cross-validation data set, to avoid overly optimistic results. Variances of the predictions and of the fitted functions of the 1000 models were evaluated.
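The resampling unit in this procedure is the peatland, not the individual dip well. The sketch below illustrates that idea only: the toy data are hypothetical, and the mean WLt of each sample is a deliberately simple stand-in for fitting a BRT model, which is not reproduced here.

```python
import random
import statistics

def bootstrap_peatlands(peatland_ids, rng):
    """Draw as many peatlands as the data set holds, with replacement,
    and return the drawn peatlands plus the row indices of their dip wells."""
    unique_ids = sorted(set(peatland_ids))
    drawn = [rng.choice(unique_ids) for _ in unique_ids]
    rows = [i for pid in drawn
            for i, p in enumerate(peatland_ids) if p == pid]
    return drawn, rows

# Hypothetical toy data: peatland label and WLt per dip well.
peatland_ids = ["A", "A", "B", "C", "C", "C"]
wlt = [-0.6, -0.5, -0.2, 0.1, 0.0, 0.2]

rng = random.Random(42)
boot_stats = []
for _ in range(1000):
    _, rows = bootstrap_peatlands(peatland_ids, rng)
    # Stand-in for "fit a BRT model and predict": the sample mean of WLt.
    boot_stats.append(statistics.fmean([wlt[i] for i in rows]))

# Spread of the statistic over the bootstrap samples.
boot_sd = statistics.pstdev(boot_stats)
```

A peatland drawn twice contributes all of its dip wells twice, which preserves the within-peatland correlation that motivated sampling on peatlands in the first place.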

If data sets are relatively small (e.g., n < 1000; De'ath, 2007), then the small size of the training and test data sets lowers model accuracy. Given the fairly small number of peatlands in the data set and the partly high spatial correlation of dip wells within these peatlands, we decided not to split the data set into a training and test data set. Estimates of model accuracy can then be based on cross-validation, thereby making effective use of all the data (De'ath, 2007). The prediction uncertainty of the final model is estimated by the root mean square error of prediction (RMSEcv, see above) for each land cover class. After testing for near-normal distribution of the residuals, RMSEcv can be used to derive the 68 and 95 % confidence intervals of the predictions with RMSEcv and 2 × RMSEcv, respectively.
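As a small worked illustration of the last step, the intervals follow directly from the prediction and the class-specific RMSEcv. The function name and the numbers are hypothetical, and the sketch assumes near-normal residuals as stated in the text.

```python
def confidence_interval(prediction, rmse_cv, level=95):
    """Return (lower, upper) using 1 x RMSEcv for 68 % and 2 x RMSEcv for 95 %."""
    factors = {68: 1.0, 95: 2.0}
    half_width = factors[level] * rmse_cv
    return prediction - half_width, prediction + half_width

# Hypothetical prediction on the WLt scale with a hypothetical RMSEcv.
lower68, upper68 = confidence_interval(-0.2, 0.3, level=68)
lower95, upper95 = confidence_interval(-0.2, 0.3, level=95)
```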


Finally, additional residual analysis was performed to evaluate whether the predictions are biased for different land cover classes or geographical regions.

2.5 Regionalization

In the final regionalization step, the predictor variables contributing to the final model were determined at a 25 × 25 m raster for all organic soils in Germany. Predictor variables were determined with the same map input that was used for model building. Land cover information, including information on ditches, was based on the data from year 2012, and the climatic data was based on the average of the last 30 years. The fine spatial resolution of 25 × 25 m was not chosen to suggest a highly spatially accurate model. Rather, this fairly fine scale was necessary to map the relatively small-scale effects of the topography, land use and peatland geometry variables. The final model was then used to make a prediction for each of these raster cells.

3 Results and discussion

3.1 Spatial correlation structure of the data set

The variogram model fitted to the sample variogram provided a nugget (0.012 m², 0.11 m), a sill (0.09 m², 0.3 m) and a range of spatial correlation (2700 m) for our data set of WL (Fig. 4). The nugget represents the very small-scale soil hydraulic variability and micro-topography effects on WL (van der Ploeg et al., 2012) and measurement error, e.g., by differences in the determination of the ground surface and in the timing of the manual measurements. Furthermore, micro-topography (e.g., hummocks) and oscillating peat surfaces of wet peatlands pose a challenge for an accurate determination of both ground surface and water level. The water level time series in the data set were of different lengths and ranged from 1 to 20 years. Interannual variability of water levels can be large (e.g., Knotters and van Walsum, 1997). For simplicity, in our analysis, data were not harmonized by extrapolating WL time series using weather data to a 30-year period. Thus, the nugget also includes errors that are introduced by dip wells with different measurement periods that are located in the range of spatial correlation. In consideration of these error sources, the fitted nugget of 0.11 m appears to be a realistic value. At 0.3 m, the fitted sill matched nearly perfectly with the standard deviation of the data (0.31 m), which indicates consistency between semivariogram model and data set. The fitted range of spatial correlation of 2700 m reflects both physical effects, i.e., the average range of lateral flow due to hydraulic gradients, as well as the effect of average land-use patterns in Germany on the spatial correlation of WL. Fitted values were used in the calculation of the dip-well-specific weights using Eq. (6).

3.2 Typical water levels for land-use types in German organic soils

The land cover classes are characterized by plausible mean and median water levels, which show consistent differences between each other (Table 2 and Fig. 5a). The mean values of arable land and grassland agree with what can be expected for their agronomic requirements, with slightly lower water levels for arable land. The high variability observed for both classes may be related to the variability of the efficiency of installed drainage systems, for example the presence and condition of tile drains and the depth of ditches. Grasslands can be managed with very variable intensity, which is partly reflected in different water levels. Figure 5a further shows that deciduous forests seem to dominate slightly drier organic soils compared to coniferous forests, which dominate under wetter conditions. A high variability of water levels is observed for the land cover class unused peatland. On the one hand, post peat-cutting topography increases the variability of WL over short distances. It probably contributes to the high variance observed for this class. On the other hand, this class comprises both rather dry unused peatlands and wetter peatlands in which re-wetting measures have already taken place but which do not yet show a wet soil attribute in the ATKIS Digital Landscape Model. This may also cause part of the variance observed in the grassland and forest land cover classes. All wet land cover classes (reed, wet grassland, wet forest and wet unused peatland) that were separated by wetness indication clearly show higher water levels, showing that the wetness attribute of the Digital Landscape Model is a useful attribute.

Figure 5b shows the transformed water level for all classes. It can be observed that the variances of the wetter land cover classes increase relative to the variances of the dry land cover classes. This is due to the highest sensitivity of GHG emissions in the wet range of water levels (> −0.5 m). Consequently, the rather high variance of WL for arable land corresponds to a rather low variance of WLt, i.e., to a rather low assumed effect of WL variability on the GHG budget.

3.3 BRT model calibration and validation: WL vs. WLt

In contrast to land cover class, the other predictor variables showed, if at all, only weak relations to WL and WLt when evaluating them with box plots, 2-D cross plots and simple correlation matrices. Here, we expected BRT to detect the strongest predictor interactions and to identify the most informative predictors.

After model calibration with all predictors, subsequent model simplification successively dropped those parameters with correlation > 0.7 and the lowest contribution. For both WL and WLt, model performance improved during this simplification. For WLt, the highest values of NSEcv of approximately 0.46 were achieved with 21 to 9 model parameters. The development of NSEcv for the last 50 parameters is shown in Fig. 6. Further elimination of parameters led to a pronounced decline of model performance. Similar behavior was observed for the calibration on WL. In favor of a more parsimonious model, we chose the model with the lowest number of parameters before the pronounced decline of model performance occurred. For the calibration on WLt, this corresponded to the model with the lowest number of parameters that still achieved NSEcv values of > 0.45 (Fig. 6). The final WLt model comprised nine predictor variables, and the final WL model seven parameters. The percentages of parameter contributions to the final model and their individual influences are discussed for WLt in Sect. 3.4.

Figure 5. Water level relative to ground surface, WL (m), and transformed water level, WLt (−), by land cover class, illustrated as a weighted box plot. WLt = −1 corresponds to maximum CO2 emissions and WLt = 1 to maximum CH4 emissions. In the top horizontal axes, the number of dip wells in each class is indicated.

Figure 6. NSEcv as a function of the number of predictor variables used in the model of WLt during model simplification, shown for the last 50 parameter drops.

Table 3 summarizes the statistical performances of the models calibrated on WL and WLt. For both models, NSEcal is considerably higher than NSEcv and shows the commonly observed overfitting behavior of BRT models. The different measures that we conducted to minimize overfitting (cross-validation on peatlands, restriction to monotonic responses, and model simplification including elimination of highly correlated variables) lowered the difference between NSEcal and NSEcv, but could not totally avoid overfitting. NSEcv of the WLt model (0.453) indicates higher predictive model performance compared to the WL model (0.381). However, as the data ranges differ due to the transformation, this comparison may be misleading. Therefore, we transformed the predictions of the WL model to obtain WLt values from this model and equally calculated the performance criteria (Table 3, second column). Then, NSEcv is slightly increased (0.397), but does not achieve the values of the model that was calibrated on WLt. A better predictive model performance of the model calibrated on WLt is also visible for the RMSEcv values. The total RMSEcv as well as the RMSEcv values for the dry (WL < −0.3 m) and wet range (WL > −0.3 m) show slightly lower values for the WLt model compared to WLt values from the model calibrated on WL. Given our hypothetical transfer function (Fig. 3), in which the GHG budget is linearly related to WLt, the higher accuracy of WLt predictions directly corresponds to a higher accuracy of GHG budget predictions.

Table 2. Weighted mean and standard deviation of WL and WLt data, and of the WLt map presented in Sect. 3.6, for the nine land cover classes.

Land cover class        WL (m)          WLt (−)         WLt (−)
                        Mean ± sd       Mean ± sd       Map mean ± sd
Arable land             −0.69 ± 0.30    −0.76 ± 0.17    −0.66 ± 0.22
Deciduous forest        −0.45 ± 0.34    −0.49 ± 0.37    −0.47 ± 0.35
Grassland               −0.44 ± 0.29    −0.52 ± 0.32    −0.49 ± 0.30
Unused peatland         −0.39 ± 0.36    −0.39 ± 0.41    −0.37 ± 0.40
Coniferous forest       −0.36 ± 0.36    −0.37 ± 0.37    −0.46 ± 0.35
Wet unused peatland     −0.22 ± 0.27    −0.18 ± 0.40    −0.17 ± 0.36
Wet forest              −0.22 ± 0.29    −0.17 ± 0.43    −0.21 ± 0.39
Wet grassland           −0.10 ± 0.14    −0.00 ± 0.31    −0.15 ± 0.39
Reed                    −0.01 ± 0.17     0.20 ± 0.29    −0.06 ± 0.32

Table 3. Performance criteria of the different models; dry range defined as WL < −0.3 m and wet range as WL > −0.3 m.

Criterion       WL (m)                 WLt (−)                WLt (−)
                (calibrated on WL)     (calibrated on WL)     (calibrated on WLt)
NSEcal           0.627                  0.559                  0.642
NSEcv            0.381                  0.397                  0.453
RMSEcv           0.269                  0.299                  0.284
RMSEcv,dry       0.284                  0.263                  0.259
RMSEcv,wet       0.222                  0.382                  0.355
Bias            −0.003                  0.083                  0.002
Bias,dry        −0.012                  0.070                  0.003
Bias,wet         0.021                  0.120                  0.000

Superior model performance is also evident when evaluating model bias. Only when calibrating directly on WLt are the WLt predictions bias-free. Calibration on WL and subsequent transformation to WLt introduces a model bias towards systematically lower WLt values. In subsequent applications to GHG emission upscaling, lower WLt values would lead to an overestimation of CO2 emissions and to an underestimation of CH4 emissions.

3.4 Influence of predictor variables on WLt

Given the beneficial characteristics of the model calibrated on WLt for GHG upscaling, presentation and discussion of further model results is restricted to the WLt model.

The BRT method allows the analysis of the parameter contributions to and influences on the model (Elith et al., 2008) and thus may contribute to system understanding. The percentages of the contributions of the nine predictor variables to the final model ranged from 25.2 to 5.6 % (Fig. 7). Except for protection status, at least one parameter of each of the seven parameter groups contributed to the final model. All protection status information was dropped early during the simplification process due to low contribution, although WL showed slightly higher values for data from nature protection areas or special areas of conservation. However, other parameters seem to be able to fully compensate for the information that is lost by dropping this predictor.

Land cover class, lc, at the dip well was the parameter with the strongest contribution (25.2 %). It basically follows the trend illustrated in Fig. 5b. The bootstrap error, plotted as standard deviation (Fig. 7), shows the variation of this influence over the 1000 bootstrap models. A second land cover parameter, the fraction of dry land cover classes on organic soils in a buffer of 2500 m radius, fdry (2500), contributed to the model with 10.3 %. The monotonic decrease of WLt with increasing fdry (2500) is plausible, as higher values reflect intensive land use in the surroundings of the dip well and thus indicate intensive artificial drainage. Together, both parameters contributed 35.5 %, and thus land cover represents the parameter group with the strongest model contribution.

Peatland characteristics are the second most important parameter group. The peatland type contributed 16 %. The model indicates that peatlands without any connection to surface water bodies (river or lake) and the class of other organic soils are characterized by lower WLt compared to the peatland types lowland bog, upland bog and fen neighboring surface water. As the class of other organic soils is generally expected to reflect lower water levels, and as surface water may have a stabilization effect on water levels of organic soils, the influence of the peatland type can be considered plausible. Besides peatland type, the substrate of the peat base contributes 5.6 %. Here, organic soils overlying peat clay layers (e.g., limnic sediments such as calcareous gyttja) or basement rock are characterized by higher WLt compared to organic soils overlying unconsolidated rock. This can be explained by the lower drainage resistance of unconsolidated rocks. This may cause an increased efficiency of anthropogenic drainage and/or a generally higher vulnerability to seepage losses. Finally, slightly lower WLt values are indicated by a high fraction of organic soils for the 500 m buffer, fpeat (500). This may reflect the higher land-use pressure on large peatlands compared to rather small peatlands, which tend to be more easily preserved by nature protection efforts.

The remaining four parameter groups are represented in the model by only one parameter each. The third most influential parameter was the length of ditches on arable land and grassland for the 250 m buffer, dilendry (250). At first glance, it may be surprising that with increasing ditch density WLt values tend to be higher, as ditches are supposed to drain the water when land is used as arable land and grassland. The fact that the model identifies a rather strong effect in the opposite direction may be caused by incomplete information about the drainage network. There is no detailed information about the spatial distribution of tile drains. Based on expert knowledge, agricultural areas with a lower ditch density are more likely to have tile drains. As these drains, easily installed with a narrow drain spacing, are more effective at draining organic soils, low WLt values for arable land and grassland may be related to low ditch densities. Furthermore, ditches were originally dug at narrow spacing in especially wet areas of organic soils, but there is no information available on whether these ditches still function properly.

The parameters wbsummer, hrel and tiras25all show expected trends. The model predicts higher WLt for increasing climatic water balance in the summer period (May to October), wbsummer, for dip wells located in depressions (low values of hrel), and for higher small-scale topographic wetness indices calculated on the 25 × 25 m digital elevation model (tiras25).


Figure 7. Partial dependence plots for the predictor variables. For an explanation of variables see Table 1. The y axes are on WLt scale and are centered around the mean WLt. Error bars and grey area indicate standard deviation of the response over 1000 bootstrap models. The relative contribution of each predictor is indicated as percentage. The lines along the x axes of each plot show the distribution of data across that variable, in deciles.

The fact that all parameters show expected or explainable responses in the model corroborates the reliability of the calibrated WLt model. The standard deviation of the predictor responses based on the bootstrap samples shows the stability of the observed responses.

Further insight into model behavior can be obtained by analyzing parameter interactions. This is obtained by changing two parameters simultaneously while keeping mean values for all other parameters (Elith et al., 2008). Figure 8 shows the two strongest parameter interactions. Parameter wbsummer strongly interacts with ptype. The generally lower values of WLt of fens without surface water connection and other organic soils show a stronger dependency on the summer climatic water balance. While a summer climatic water balance of > −80 mm shows a rather weak effect on WLt for the wetter peatland types, for the two drier peatland types there is still a strong effect with increasing wbsummer. The trend for wbsummer > 130 mm for the dry peatland types is supported by seven different peatlands.


Figure 8. Partial dependence plots representing the two strongest interactions in the model: (a) between ptype and wbsummer and (b) between pbase and fdry. Fitted WLt is plotted on the y axis, which is obtained after accounting for the average effect of all other predictor variables.

Another strong interaction is observed for pbase and fdry (2500). While a rather weak effect of the fraction of arable land and grassland is observed for organic soils overlying basement rock and peat clay layer, a strong effect is observed for organic soils overlying unconsolidated rock. This interaction reflects the higher lateral range of drainage effects for organic soils with little flow resistance at the peat base. In these organic soils, intensive land use lowers the water level over large areas.

3.5 Discussion of model uncertainty

Plotting observed vs. predicted WLt from cross-validation (Fig. 9) illustrates the rather large residual variance that cannot be explained by the model. As indicated by the higher RMSEcv for the wet range (Table 3), scatter increases with increasing WLt. Error bars in the y direction indicate data error derived from the nugget of the variogram. It is shown for a few data points as an example. Due to transformation, data error increases for higher WLt. Figure 9 demonstrates that the fraction of unexplainable variance related to data error is much higher for the wet than for the dry range. Bootstrap error, indicating the variation of the model predictions for 1000 bootstrap samples, is shown in the x direction for the same data points. Bootstrap error is lower than the data error for the wet range and slightly higher for the dry range.

Bootstrap errors demonstrate the sensitivity of model predictions to changes of the data set used for calibration. When a model possesses structural deficits, such as missing predictor variables, bootstrap errors should not be used to define confidence intervals for the model predictions. Figure 10 shows residuals from cross-validation and standard deviation of bootstrap predictions for all land cover classes. The residuals of each land cover class show near-normal distributions. For five of the nine land cover classes (wet forest, wet unused peatland, arable land, coniferous forest and reed), the Shapiro–Wilk test of normality is positive (p > 0.05). Figure 10a further indicates that residuals of each land cover class scatter fairly well around zero, indicating low bias for the various land cover classes. Land-cover-class-specific confidence intervals of model predictions can thus be derived from the RMSEcv of each land cover class, e.g., 2 × RMSEcv representing the 95 % confidence interval.

Figure 9. Observed vs. predicted transformed annual mean water level (WLt) from cross-validation results. Error bars show selected data and bootstrap model errors as standard deviation. Data points are scaled by their weights.

The prediction uncertainty derived from cross-validation is much higher than the bootstrap prediction uncertainty obtained from the bootstrap standard deviation (sd), with 2 × sd corresponding to the 95 % confidence interval (Fig. 10). The large difference between these values indicates that the model has structural deficits that can be attributed to several error sources.

Figure 10. (a) Residuals (observation − prediction) of WLt predictions and (b) standard deviation (sd) of bootstrap predictions, shown for the nine land cover classes. In the upper part, the number of dip wells in each class is indicated.

i. Key influences on WLt are missing in the set of predictor variables. None of the predictor variables indicate whether and to what extent water level increase due to re-wetting measures took place in the last years. Wetness indicators (wet soil and/or vegetation attributes) that are obtained from the Digital Landscape Model probably react with a delay of several years. Thus, we expect the occurrence of several observed high WLt values that cannot be explained by any of the predictor variables.

ii. Small-scale topography that is not represented with sufficient detail and accuracy in the DEM may cause several predictions to strongly differ from what would be expected from the other predictor variables. A common example may be a dip well that is located on a narrow peat ridge, which remained after peat-cutting and is absent in the DEM, and that is situated in an area classified as wet soil by the Digital Landscape Model. Then, the model indicates a WLt that is much higher than the observed WLt, as for the observed value the reference surface was the surface of the peat ridge.

iii. Consistent information about tile drains is missing and only exists on the regional scale (Tetzlaff et al., 2009). At the national scale, however, there are no maps on tile drains. Tile drains are known to have a strong effect on WLt for arable land and grassland. As explained above, we expect parameter dilendry (250) to partially compensate for this missing information.

iv. Another source of prediction uncertainty may comprise inconsistent and erroneous land cover classification of the Digital Landscape Model due to the high degree of subjectivity for many of the attributes. Furthermore, the Digital Landscape Model may be out of date by as much as 5 years, which can cause time series with land-use change to be split at the wrong date and vegetation and wetness attributes to be not yet updated to the current conditions.

v. The water balance of fens strongly depends on the size and the hydraulic head of the groundwater catchment, i.e., of the aquifer underlying the peat layer. Unfortunately, there is no consistent map of hydraulic heads or groundwater catchments for all of Germany.

Figure 11. Residuals (observation − prediction) of WLt predictions for the three major geographical peatland regions of Germany. In the upper part, the number of dip wells in each class is indicated.

We checked model predictions for geographical bias. Geographical location was not one of the model parameters. However, the history and policy of land use on organic soils, current ditch water management and climate do show large-scale geographical trends. We divided our data set into the three major German peatland regions (NE, NW and S) and evaluated the model residuals (Fig. 11) to see whether our model is biased due to important missing geographical effects. A serious bias for any of the three major German peatland regions cannot be identified.


Figure 12. Map of predictions of transformed annual mean water level (WLt) for all German organic soils (a) and an enlarged map section (b). Probability distribution in (c) indicates the uncertainty of a specific point prediction for wet grassland as an example. Here, the predicted value is approximately WLt = 0, but note that wet grassland predictions do vary in space depending on the values of the other model parameters. The histogram shows the residuals from cross-validation for wet grassland, to which the probability distribution was fitted.

When applying calibrated statistical models during regionalization, it is important to check model behavior for extrapolation outside the range of the parameter space that is covered by the data upon which the model was built. BRT always extrapolates at a constant value from the most extreme environmental value in the training data. In contrast to other types of statistical models, e.g., generalized linear models, BRT does not continue the fitted trend beyond the last observation. Regarding the categorical variables, the data set covers all classes occurring in Germany with several peatlands. The data set also covers the major range of values occurring in Germany for the numerical predictor variables. Furthermore, Fig. 7 indicates that the constant values at which the model extrapolates the influence of the variables do not raise major concern for any extreme predictions outside the parameter range.
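The constant extrapolation of tree-based models follows from their piecewise-constant structure, which the following sketch illustrates with a single one-dimensional tree. The split points, leaf values and linear coefficients are hypothetical and are not taken from the fitted model; a boosted ensemble is a sum of such trees and therefore shares the same behavior outside the training range.

```python
import bisect

# Hypothetical 1-D regression tree on summer water balance (mm):
# split points learned within the training range, one fitted WLt value per leaf.
thresholds = [-50.0, 0.0, 50.0]
leaf_values = [-0.6, -0.4, -0.2, 0.0]

def tree_predict(x):
    """Piecewise-constant prediction: beyond the outermost split the value stays fixed."""
    return leaf_values[bisect.bisect_right(thresholds, x)]

def linear_predict(x, slope=0.004, intercept=-0.3):
    """Hypothetical linear model for contrast: the fitted trend continues indefinitely."""
    return intercept + slope * x

inside = tree_predict(60.0)            # just beyond the last split point
far_outside = tree_predict(1000.0)     # far beyond any plausible training value
linear_far_outside = linear_predict(1000.0)
```

The tree returns the same value for 60 mm and 1000 mm, whereas the linear model keeps increasing, which is why extreme predictions outside the parameter range are of less concern for BRT.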

3.6 Regionalization

The map of WLt resulting from the application of the fitted WLt model to all grid cells shows gradients at the regional scale (Fig. 12a). In the south of Germany, for example, a gradient from wet to dry can be observed for the pre-alpine upland bogs and the peatlands of the moraine plain. In the north of Germany, the map indicates that organic soils in the very NE are wetter than the rest. For the rest of the north, a slight gradient can be observed from less dry to dry from NW to E, which is mainly driven by the higher summer climatic water balance in the NW. As both categorical and numerical predictor variables also vary at the sub-regional scale, the resulting map also shows gradients within peatland areas, e.g., due to small-scale land-use and ditch density gradients and topography effects (Fig. 12b).

We calculated WLt averages of the land cover classes using the regionalized WLt from the map (Table 2, column 3). The given standard deviation comprises both the variability within a land cover class that is explained by the model as well as the uncertainty of each prediction. Resulting means and standard deviations differ slightly from the corresponding values of the data set. The land-cover-specific WLt values obtained from the map can be considered as being more representative, as the regionalization procedure is supposed to partly account for potential bias in the data set.

When applying this map and its predicted WLt values in subsequent GHG upscaling, it is crucial that model uncertainty is propagated properly. An example demonstrates the necessity of uncertainty propagation. For a grid cell classified as wet grassland, the probability distribution of WLt is shown, based on a normal distribution that was fitted to the residuals of this land cover class (Fig. 12c). Without propagating the uncertainty, and when only translating the predicted WLt (possibly in combination with other parameters, e.g., soil properties) into a GHG budget, the GHG budget is strongly underestimated, as the WLt prediction is close to zero, indicating neither large CO2 nor CH4 emissions. When translating the full distribution of WLt into a GHG budget, the resulting GHG budget would be much higher, as the GHG budget increases on both sides of the predicted WLt.
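The effect described in this example can be reproduced with a short Monte Carlo sketch. Everything numeric here is an assumption for illustration: the V-shaped budget function with equal slopes on both sides of WLt = 0 is a hypothetical stand-in for a real transfer function, and the standard deviation of 0.3 is a placeholder for a class-specific prediction uncertainty.

```python
import math
import random

def ghg_budget(wlt):
    """Hypothetical budget (arbitrary units) that grows linearly on both sides of WLt = 0."""
    return abs(wlt)

def propagated_budget(prediction, sd, n=20000, seed=0):
    """Mean budget over a normal distribution of WLt around the prediction."""
    rng = random.Random(seed)
    total = sum(ghg_budget(rng.gauss(prediction, sd)) for _ in range(n))
    return total / n

point_estimate = ghg_budget(0.0)             # budget from the predicted WLt alone
propagated = propagated_budget(0.0, sd=0.3)  # budget from the full distribution
analytic = 0.3 * math.sqrt(2.0 / math.pi)    # E|X| for X ~ N(0, 0.3^2), for comparison
```

The point estimate is zero, while the propagated mean is clearly positive and close to the analytic value of sd·\sqrt{2/\pi} for this particular budget function, which is the underestimation the text warns about.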

3.7 Possible paths for model improvement

The model performance that is achieved by the statistical approach presented in our study raises the question of whether collecting more WL data can improve model performance, or whether the factor that is constraining the model performance is the limited strength of the nation-wide available predictor variables. To assess this question, additional "hold-out models" were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated, and model performance was evaluated with NSEcv. As expected, results indicate an increase of model performance with increasing number of peatlands used in the model building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when including many more peatlands in the data set. More promising would be the specific collection of more data on the weakly represented and/or important land cover classes arable land and grassland.

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nation-wide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that has been successfully conducted in Finke et al. (2004) using the transfer noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, e.g., spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± sd.

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data cover only a small fraction of the peatlands of the final map and spatial interpolation is thus not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG-relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. By analyzing their effect on the water level, it was possible to gain insight into natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions depend linearly. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions, and avoided model bias.

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show, however, that model improvement by increasing the number of WLt data seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively few data are available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, for example water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach is suited to application in any other country where similar data on target and predictor variables are available. It is important that the spatial resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Hydrol. Earth Syst. Sci., 18, 3319–3339, 2014 www.hydrol-earth-syst-sci.net/18/3319/2014/

M. Bechtold et al.: Large-scale regionalization of water table depth in peatlands 3337

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if detailed information on peat properties becomes available for specific regions and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support; Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support; Annette Freibauer for reviewing the paper draft; and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project “Organic soils” funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43, 5, 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirby, M.: A physically based variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De’ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt “Klimaschutz – Moornutzungsstrategien” 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.


Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package “dismo”, version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada’s forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8, 5, 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package “gbm”, version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.

Tetzlaff, B., Kuhr, P., and Wendland, F.: A New Method for Creating Maps of Artificially Drained Areas in Large River Basins Based on Aerial Photographs and Geodata, Irrig. Drain., 58, 569–585, doi:10.1002/Ird.426, 2009.

Thompson, J. R., Gavin, H., Refsgaard, A., Sorenson, H. R., and Gowing, D. J.: Modelling the hydrological impacts of climate change on UK lowland wet grassland, Wetl. Ecol. Manage., 17, 503–523, doi:10.1007/s11273-008-9127-1, 2009.

UBA: National Inventory Report for the German Greenhouse Gas Inventory 1990–2008, Submission under the United Nations Framework Convention on Climate Change and the Kyoto Protocol 2012, Dessau, Germany, 2012.

van den Akker, J. J. H., Jansen, P. C., Hendriks, R. F. A., Hoving, I., and Pleijter, M.: Submerged infiltration to halve subsidence and GHG emissions of agricultural peat soils, Proceedings of the 14th International Peat Congress, Stockholm, Sweden, 2012.

van der Gaast, J. W. J., Massop, H. T. L., and Vroon, H. R. J.: Actuele grondwaterstandsituatie in natuurgebieden: Een Pilotstudie, Wettelijke Onderzoekstaken Natuur & Milieu, WOt-rapport 94, Wageningen, 134 pp., 2009.


van der Ploeg, M. J., Appels, W. M., Cirkel, D. G., Oosterwoud, M. R., Witte, J. P. M., and van der Zee, S. E. A. T. M.: Microtopography as a Driving Mechanism for Ecohydrological Processes in Shallow Groundwater Systems, Vadose Zone J., 11, 52–62, doi:10.2136/Vzj2011.0098, 2012.

Wackernagel, H.: Multivariate Geostatistics, Springer, Berlin, Germany, 387 pp., 2003.


Germany, the fraction of dip wells on bogs is overrepresented in the data set by a factor of 2.5, while dip wells on fens and other organic soils are slightly underrepresented. Data also cover the common land-use types (for data sources see Table 1). However, dip wells on organic soils that are used neither for agriculture, forestry nor peat mining, further referred to as “unused peatlands”, are overrepresented in the data set by a factor of 6, as data were collected more frequently and at higher spatial density in the frame of conservation projects. The fraction of unused peatlands of the German organic soils is 6 % and the fraction in the data set is 36 %. In contrast, dip wells on arable land are underrepresented in the data set by a factor of 6. The fraction of arable land on German organic soils is 24 % and the fraction in the data set is 4 %. The other two key land-use types of organic soils in Germany, grassland and forest, are well represented in the data set. The imbalance of the land-use types in the data set is accounted for in the weighting of data (see Sect. 2.3.2).
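The over- and underrepresentation factors quoted above follow directly from the two fractions. The short Python sketch below reproduces that arithmetic. The inverse-factor weights at the end are only one plausible way to compensate for the imbalance and are an assumption on our part; they are not the weighting scheme of Sect. 2.3.2.

```python
def representation_factor(dataset_fraction, national_fraction):
    """Values > 1 mean overrepresented in the data set, values < 1 underrepresented."""
    return dataset_fraction / national_fraction

# (fraction in the data set, fraction of German organic soils), taken from the text
fractions = {
    "unused peatland": (0.36, 0.06),
    "arable land": (0.04, 0.24),
}

factors = {name: representation_factor(d, n) for name, (d, n) in fractions.items()}

# Hypothetical compensation: weight each class by the inverse of its factor
weights = {name: 1.0 / f for name, f in factors.items()}
```

Under this assumed scheme a dip well on arable land would count 36 times as much as one on an unused peatland.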

If land use changed within the measurement period of a dip well, the time series was split at the moment when the land-use record indicates the transition. For each segment, the annual mean water level WL (here with negative values defined as water levels below ground) was calculated as the multi-year average over the whole measurement period of the specific land use.
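A minimal sketch of this splitting step, assuming the record of a dip well is available as chronologically sorted tuples of year, land use and annual mean water level (this data layout and the function name are hypothetical):

```python
from itertools import groupby
from statistics import mean

def segment_means(records):
    """Split a chronologically sorted record at land-use transitions.

    records: iterable of (year, land_use, annual_mean_wl_m) tuples.
    Returns one (land_use, first_year, last_year, mean_wl_m) tuple per segment.
    """
    segments = []
    for land_use, group in groupby(records, key=lambda r: r[1]):
        group = list(group)
        segments.append(
            (land_use, group[0][0], group[-1][0], mean(r[2] for r in group))
        )
    return segments
```

Because `groupby` only merges consecutive records, a dip well that returns to an earlier land use later on yields a separate segment for each period.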

The primary application of the WL map produced in this study is the upscaling of long-term GHG emissions, as emission reporting may only reflect anthropogenic effects but not interannual climatic effects. As GHG transfer functions are developed on annual data, their application requires both the long-term annual mean water level and its interannual variability. Due to the non-linear dependence of GHG emissions on WL, single years with extreme water levels can strongly influence long-term average GHG fluxes. This study is focused on the regionalization of the long-term annual mean water levels. For this objective, model building should be based on long-term water level time series to average out the effect of weather variation within a complete climatic period (commonly 30 years). The existing nationally available data on water level time series of organic soils, however, do not comprise a single time series with complete data coverage over the last 30 years. Due to the lack of sufficiently long water level time series, we included all time series in the model building process. Average climatic boundary conditions (precipitation, reference evapotranspiration, water balance) of the specific measurement period of each dip well are part of the predictor variables (see Sect. 2.2) and are thus supposed to partly account for the effect of specific weather conditions on WL in the case of short measurement periods.
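The effect of the non-linearity can be made concrete with a small numerical example. The transfer function below is purely hypothetical (a linear rise of the flux down to a saturation water level, constant below it) and serves only to show that the flux at the mean water level differs from the mean of the annual fluxes when single years are extreme.

```python
from statistics import mean

def ghg_flux(wl_m, wl_sat=-0.5, max_flux=1.0):
    """Hypothetical transfer function: flux rises linearly as the water level
    drops from 0 m to wl_sat and stays at max_flux for deeper water levels."""
    depth = min(max(-wl_m, 0.0), -wl_sat)
    return max_flux * depth / -wl_sat

annual_wl = [-0.1, -0.1, -0.9]  # hypothetical annual mean water levels (m)

flux_of_mean_wl = ghg_flux(mean(annual_wl))                  # approx. 0.73
mean_of_annual_flux = mean(ghg_flux(wl) for wl in annual_wl)  # approx. 0.47
```

In this example, applying the function to the long-term mean water level overestimates the long-term mean flux, because the dry year contributes no more than the saturated flux.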

2.2 Predictor variables

Spatial coverage of phreatic water level data of organic soils is too low to obtain WL maps by simple spatial interpolation (Fig. 1). Additional spatial data are needed as a basis for regionalization. Ancillary information that covers the full extent of the final map, or at least most of it, is necessary and can be used as predictor variables. A comprehensive set of variables (numerical and categorical) with potential indication for the hydrological condition of an organic soil was determined for each dip well (Fig. 2 and Table 1).

The predictor variables, which can partly be found also in Finke et al. (2004), can be divided into seven groups.

2.2.1 Land cover

As certain land use and vegetation require and reflect certain WL, such information can be used as an indicator for the average drainage level around the dip well. Land-use and vegetation information is based on the German Digital Landscape Model (ATKIS Basis-DLM), which is updated continuously by aerial photos as well as sporadic ground mapping and has a temporal accuracy of 3 months to 5 years. It is provided as fine-scaled polygons and represents the best uniform land cover information available in Germany. It contains information on primary land-use type, a few optional vegetation attributes and whether “wet soil” has been observed during mapping. As we noticed that the use of a large number of categorical variables lowers the performance of boosted regression trees, we further aggregated the three information types, (i) land use, (ii) vegetation and (iii) wet soil, into a set of nine combined land cover classes (Table 1). These land cover classes were a trade-off between fine differentiation and the number of replicates in each class. For grasslands, a “wet grassland” class was separated when grassland was overlaid with wet soil and/or tree or shrub vegetation, which may indicate a less intensive management. Forests overlaid with wet soil were separated as “wet forest”. Further, unused peatlands overlaid with wet soil and showing no coverage with tree attributes were characterized by higher water levels and were thus separated as “wet unused peatland”. The very few dip wells classified as open water (n = 2) and peat cutting (n = 5) were merged into the reed and arable land cover class, respectively. Land-use type and land cover class were extracted at the dip well (point extraction) and as fractions in various buffers around the dip well (Table 1). As using too many weak predictor variables lowers model performance and increases overfitting, the numerous land cover fractions were further aggregated into two classes: the fraction of dry (arable and grassland) and wet (reed, wet grassland, wet forest and wet unused peatland) land cover on organic soils. For the calculation of the fraction of dry land cover, we tested various factors for the reduction of the contribution of grassland compared to arable land, as the grassland class also includes wetter grasslands that could not be detected with the available land cover catalogue. A factor of 0.5 was an optimal value, which was then fixed.
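The aggregation into dry and wet land cover fractions can be sketched as below. We assume that the class areas and the organic soil area within a buffer have already been extracted from the maps; the function and dictionary names are hypothetical, and normalizing by the organic soil area in the buffer is our reading of Table 1.

```python
# Contribution of each class to the dry fraction; grassland is down-weighted by 0.5
DRY_CLASSES = {"arable": 1.0, "grassland": 0.5}
WET_CLASSES = ("reed", "wet grassland", "wet forest", "wet unused peatland")

def land_cover_fractions(area_by_class, organic_soil_area):
    """Return (dry fraction, wet fraction) of the organic soil area in a buffer.

    area_by_class: land cover class name -> area on organic soil (same unit
    as organic_soil_area).
    """
    dry = sum(w * area_by_class.get(c, 0.0) for c, w in DRY_CLASSES.items())
    wet = sum(area_by_class.get(c, 0.0) for c in WET_CLASSES)
    return dry / organic_soil_area, wet / organic_soil_area
```

For a buffer with 2 ha arable land, 4 ha grassland, 1 ha reed and 1 ha wet forest on 8 ha of organic soil, this gives a dry fraction of 0.5 and a wet fraction of 0.25.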


2.2.2 Drainage network

Locations of ditches that are included as lines in the Digital Landscape Model were used to obtain information about the drainage network. The total ditch length was calculated for various buffer sizes. Further, the distance to the next ditch was calculated for each dip well. A short distance to the next ditch may indicate either lower or higher water levels, depending on whether the ditches are used for drainage or are already blocked and used for rewetting measures. Similarly, the indication of the total length of ditches is not unique. Therefore, we defined two different sets of ditch variables: a first set for which we calculated values for all land cover classes, and a second one for which we only calculated values for land cover classes for which ditches are undoubtedly used for drainage, i.e., arable and grassland.
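As an illustration of the distance variable, the sketch below computes the distance from a dip well to the nearest ditch when ditches are given as polylines in a projected coordinate system (metres). It is a plain-Python stand-in for what a GIS would do; the function names and the data layout are assumptions.

```python
import math

def point_segment_distance(px, py, ax, ay, bx, by):
    """Shortest distance from point P to the line segment AB."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: A and B coincide
        return math.hypot(px - ax, py - ay)
    # Projection of P onto AB, clamped to the segment
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_next_ditch(well, ditches):
    """well: (x, y); ditches: list of polylines, each a list of (x, y) vertices."""
    px, py = well
    return min(
        point_segment_distance(px, py, ax, ay, bx, by)
        for line in ditches
        for (ax, ay), (bx, by) in zip(line, line[1:])
    )
```

Restricting `ditches` to lines on arable land and grassland would give the second, drainage-only variant of the variable.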

2.2.3 Peatland characteristics

The geological map of Germany (scale 1 : 200 000) defined the area for which WL predictions were modeled. It is also the basis for topological peatland predictor variables, i.e., the fraction of organic soils in different buffer sizes as well as the dip well distance to the edge of the peatland. Information about the peatland type and the substrate at the peat base is presented in more detail in a newly compiled raster map of organic soils (Roßkopf et al., 2014) and was thus extracted from this map. Peatland types were aggregated into five classes: lowland bog (North German Plains and Alpine Forelands), upland bog (Central Uplands and Alps), fen neighboring surface water, fen without neighboring surface water, and a class of “other organic soils” that do not fulfill the C content and thickness criteria to be classified as peatland. Substrates at the peat base included loose unconsolidated rock (alluvial sand and gravel deposits), consolidated rock (bedrock) and peat clay layer. The first type may indicate the occurrence of seepage (positive or negative), whereas the latter two types may rather indicate a hydraulic decoupling from the aquifer hydraulic head.

2.2.4 Climatic boundary conditions

Climatic boundary conditions directly influence water level. On the one hand, the typical long-term climatic boundary conditions may indicate the general vulnerability of peatlands in a specific region. On the other hand, given the different lengths of measurement periods of the time series in this study, climatic boundary condition predictor variables may account for the effect on the water level of a climatically wetter or drier measurement period compared to the long-term averages. Climatic boundary conditions were extracted from a 1 × 1 km raster from the German Weather Service. Annual, summer and winter precipitation, FAO56 Penman–Monteith reference evapotranspiration and climatic water balance (difference between precipitation and reference evapotranspiration) were determined for the individual measurement period of each dip well and as long-term averages (30 years).
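A minimal sketch of these two averages for the climatic water balance, assuming annual precipitation and reference evapotranspiration sums (mm) have already been extracted at the dip well as dictionaries keyed by year (hypothetical layout and names):

```python
from statistics import mean

def climatic_water_balance(precip_mm, et0_mm):
    """Climatic water balance: precipitation minus reference evapotranspiration."""
    return precip_mm - et0_mm

def period_and_long_term_mean(precip_by_year, et0_by_year, period_years, long_term_years):
    """Mean water balance over the measurement period and over the long-term period."""
    def average(years):
        return mean(climatic_water_balance(precip_by_year[y], et0_by_year[y]) for y in years)
    return average(period_years), average(long_term_years)
```

A measurement-period mean above the long-term mean indicates a climatically wetter measurement period at that dip well.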

2.2.5 Relative altitude

Relative altitude was calculated by subtracting the median altitude of various buffer sizes from the absolute altitude at each dip well in the digital elevation model (DEM). Relative altitude is expected to have two different indications depending on the applied buffer size: (i) in many peatlands, the formerly smooth peatland relief at the scale of approximately > 5 m has been disturbed due to peat cutting and differences in drainage and mineralization rate. As a consequence, the rather smooth phreatic surface often does not follow the uneven and patchy terrain. Relative altitude with respect to smaller buffer sizes (< 250 m) may therefore explain part of the WL variation; e.g., a dip well that is located at a surface much higher than the surroundings may indicate deeper water levels. (ii) For large buffer sizes (> 250 m), relative altitude indicates whether the peatland lies in a larger morphological depression or elevation, and thus may indicate whether large-scale lateral inflow of water can be expected or not. Similar indication is provided by the topographic index (see below). The accuracy of relative altitude values depends on the resolution and accuracy of the DEM. The nation-wide available DEM is based on data sets of varying quality, which may lower the influence of this variable.
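The calculation can be sketched on a small gridded DEM as follows. The DEM is assumed to be a list of rows with a uniform cell size, and the buffer is taken as circular around the centre of the cell that contains the dip well; both are simplifying assumptions of this illustration.

```python
import math
from statistics import median

def relative_altitude(dem, cell_size_m, row, col, buffer_m):
    """Altitude at (row, col) minus the median altitude of all cells whose
    centres lie within buffer_m of that cell."""
    reach = int(buffer_m // cell_size_m)
    values = [
        dem[i][j]
        for i in range(max(0, row - reach), min(len(dem), row + reach + 1))
        for j in range(max(0, col - reach), min(len(dem[0]), col + reach + 1))
        if math.hypot((i - row) * cell_size_m, (j - col) * cell_size_m) <= buffer_m
    ]
    return dem[row][col] - median(values)
```

A positive value means the dip well sits above its surroundings, which for small buffers may point to deeper water levels.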

2.2.6 Topographic wetness index

The topographic wetness index is a common wetness indicator used in hydrology (Beven and Kirby, 1979). It is a combined measure of catchment area and slope at a given point and indicates the extent of flow accumulation. High values indicate wetter conditions. If calculated at larger scales, higher values may hint at the occurrence of positive seepage, i.e., upward flow of water from the aquifer. The topographic wetness index was calculated for various DEM resolutions using the GRASS 7 module r.watershed.
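In its usual form the index is ln(a / tan β), with a the upslope contributing area per unit contour width and β the local slope. The sketch below evaluates this formula for a single cell; it does not reproduce the flow routing of r.watershed, and the lower bound on the slope is our own assumption to keep the index finite on flat terrain.

```python
import math

def topographic_wetness_index(specific_catchment_area_m, slope_rad, min_slope_rad=1e-3):
    """ln(a / tan(beta)); slopes below min_slope_rad are raised to that value."""
    slope = max(slope_rad, min_slope_rad)
    return math.log(specific_catchment_area_m / math.tan(slope))
```

For a = 100 m and a slope of 1 % the index is ln(10 000), roughly 9.2. Larger contributing areas and flatter slopes both raise the value.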

2.2.7 Protection status

The protection status of a peatland area may reflect hydrological conditions. Therefore, we checked for seven protection statuses at each dip well (see Table 1 for details).

2.3 Model building scheme

Model building was performed using boosted regression trees (BRT) implemented in the two R packages “gbm” (Ridgeway, 2013) and “dismo” (Hijmans, 2013). BRT is a machine-learning algorithm in which the final model is derived from the data. Functions that relate target to predictor variables are not predetermined but freely developed. BRT is based on the decision (or regression) tree concept. In the decision tree concept, the parameter space is searched sequentially for the best split that results in the lowest model mean squared error. The mean responses of the groups that result from the various splits and correspond to certain parameter ranges represent the model. The common procedure is the growth of a large tree, which is subsequently simplified by dropping weak links that are identified with cross-validation. Growing only a single tree has several disadvantages, such as uneven functions that are very sensitive to the specific sample of the data. Therefore, ensemble techniques have been combined with the decision tree concept. These were, first, the development of multiple models by bootstrapping of the samples (bagging technique) and the random creation of subsets of predictors at each split (random forest technique). Later, with the “boosting” technique of BRT, a sequential procedure was developed in which data are re-weighted after each tree to increase emphasis on data that are poorly modeled by the existing collection of trees (Elith et al., 2008).
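The two ingredients, the split search and the sequential boosting, can be illustrated with a deliberately small sketch. It uses a single predictor and trees with one split (“stumps”), and for the squared-error loss the emphasis on poorly modeled data takes the form of fitting each new tree to the current residuals. This is not the gbm implementation; all names and default values are our own.

```python
import random
from statistics import mean

def fit_stump(x, y):
    """Search all thresholds on x for the split with the lowest summed squared
    error; assumes at least two distinct x values in the sample."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical x values
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        m_left, m_right = mean(left), mean(right)
        sse = sum((v - m_left) ** 2 for v in left) + sum((v - m_right) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, (pairs[i - 1][0] + pairs[i][0]) / 2, m_left, m_right)
    return best[1:]  # threshold, mean response left, mean response right

def boost(x, y, n_trees=200, learning_rate=0.1, bag_fraction=0.6, seed=1):
    """Fit stumps sequentially to the residuals of a random subsample (bag)
    and return a prediction function."""
    rng = random.Random(seed)
    base = mean(y)
    pred = [base] * len(y)
    stumps = []
    n_bag = max(2, round(bag_fraction * len(y)))
    for _ in range(n_trees):
        idx = rng.sample(range(len(y)), n_bag)
        thr, m_left, m_right = fit_stump([x[i] for i in idx], [y[i] - pred[i] for i in idx])
        stumps.append((thr, m_left, m_right))
        pred = [p + learning_rate * (m_left if xi <= thr else m_right) for p, xi in zip(pred, x)]

    def predict(x_new):
        return base + sum(learning_rate * (ml if x_new <= thr else mr) for thr, ml, mr in stumps)

    return predict
```

The learning rate shrinks the contribution of each tree, and the bag fraction controls how much of the data each tree sees; both reappear as tuning parameters in the calibration.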

BRT modeling is increasingly applied in spatial modeling of species or numerical environmental variables (Elith et al., 2008; Martin et al., 2011), often showing superior performance compared to other machine-learning algorithms. The increasing application of BRT is related to several of its favorable characteristics. The strength of this method lies in its ability to fit complex functional dependencies, including non-linear relationships and interactions between predictor variables. Owing to its flexibility, BRT is invariant to monotonic transformations of predictors. Furthermore, BRT allows for missing values in the predictor variables; thus, predictor variable information does not necessarily need to fully cover the total map extent. The gbm package handles missing values in predictor variables by introducing surrogate splits. The mean target value belonging to the missing predictor values is attributed to these surrogate splits during model building. We observed that the contribution of a predictor variable to the final model decreases with an increasing number of missing values. This is intuitive, as target observations with missing predictor values are mostly expected to scatter strongly. BRT is further fairly insensitive to outliers and allows estimating the relative contribution of each predictor variable to the model. Due to these characteristics, we expected BRT to be very well suited to the very heterogeneous data set of this study.

BRT model calibration is prone to overfitting, and there are various options to reduce this behavior. Due to the overfitting behavior, cross validation is generally part of the model building process. However, cross validation can be performed in several ways and, if performed carelessly, can lead to overly optimistic model performance (De’ath, 2007). Here, cross validation was performed by leaving out whole peatland areas instead of a random set of dip wells. This represents a stricter cross validation, and we noticed that it strongly reduced overfitting of the water level data and thus contributed to the development of a more robust model.
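The difference to a random split is that all dip wells of one peatland always end up on the same side. A minimal sketch of such a grouped split, assuming each dip well carries a peatland identifier (the function name and the random assignment of peatlands to folds are our own choices):

```python
import random

def leave_group_out_folds(groups, n_folds, seed=0):
    """groups: one peatland identifier per dip well.
    Returns a list of (train_indices, test_indices); no peatland occurs in both."""
    unique = sorted(set(groups))
    random.Random(seed).shuffle(unique)
    fold_of_group = {g: i % n_folds for i, g in enumerate(unique)}
    folds = []
    for k in range(n_folds):
        test = [i for i, g in enumerate(groups) if fold_of_group[g] == k]
        train = [i for i, g in enumerate(groups) if fold_of_group[g] != k]
        folds.append((train, test))
    return folds
```

With a random split of dip wells, neighbouring wells of the same peatland would appear in both training and test data, which is what makes the performance estimate overly optimistic.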

Figure 2. Illustration of the predictor variables determined for each dip well based on available national maps (see Table 1).

Another option to avoid overfitting is to impose monotonic slopes on the effects of individual parameters, which can even lead to improved prediction performance (De’ath, 2007). For all our numerical variables, we expected monotonic slopes rather than optimum functions. To avoid predefining any expected direction, all numerical variables were added twice to the set of predictors, constraining the slope once to a monotonic increase and once to a monotonic decrease. We let the model decide whether monotonic increase or decrease has higher predictive power.
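The duplication of numerical predictors can be sketched as follows. The gbm package accepts a vector of +1, -1 and 0 values through its `var.monotone` argument; whether the constraints were passed in exactly this way in the study is an assumption, and the column suffixes are hypothetical.

```python
def duplicate_with_monotone_constraints(numeric, categorical):
    """Return (column name, source variable, constraint) triples:
    +1 = monotonic increase, -1 = monotonic decrease, 0 = unconstrained."""
    predictors = []
    for name in numeric:
        predictors.append((name + "_inc", name, +1))
        predictors.append((name + "_dec", name, -1))
    for name in categorical:
        predictors.append((name, name, 0))
    return predictors
```

The two copies of a variable hold identical values; only the allowed direction of the effect differs, so the copy with the wrong direction simply receives little or no contribution in the fitted model.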

Models were calibrated using a Gaussian response type, aimed at minimizing deviance (squared error) (Ridgeway, 2013). In all calibration runs we applied the gbm.step function of the dismo package, which assesses the optimal number of boosting trees using cross validation. We tested various learning rates (0.001–0.01), bag fractions (0.1–0.8) and levels of tree complexity (3 to 7), i.e., the number of nodes in a tree. By trial and error, we determined the most effective algorithm parameters for our data set to be 0.005 for the learning rate, 0.6 for the bag fraction and 5 for the tree complexity.
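The parameters in the study were found by trial and error. A systematic variant of the same search could look like the sketch below, in which the grid values within the stated ranges and the `evaluate` callback (which would run a cross-validated calibration and return a score such as NSEcv) are assumptions of this illustration.

```python
from itertools import product

LEARNING_RATES = [0.001, 0.005, 0.01]
BAG_FRACTIONS = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
TREE_COMPLEXITIES = [3, 4, 5, 6, 7]

def grid_search(evaluate):
    """Return the (learning rate, bag fraction, tree complexity) combination
    with the highest score, together with that score."""
    best_params, best_score = None, float("-inf")
    for params in product(LEARNING_RATES, BAG_FRACTIONS, TREE_COMPLEXITIES):
        score = evaluate(*params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

This grid has 120 combinations, each requiring a full cross-validated calibration, so in practice a coarser grid or a manual search may be more economical.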

The final BRT model building is commonly performed as a two-step procedure (Elith et al., 2008), which we basically also followed in our study.

(i) In the first step, the whole set of predictor variables is used to calibrate a BRT model.

(ii) In the second step, the number of parameters is reduced sequentially to avoid overfitting and to derive a more parsimonious model. We tracked predictive performance criteria during the simplification process. As various variables were calculated for different buffer sizes, our predictors included a large number of correlated variables. Correlation coefficients between predictor variables of > 0.7 are known to severely distort model estimation and subsequent prediction (Dormann


Table 1. Overview on predictor variables.

Predictor variable | Variable name | Values | Point/buffers (m) | Data source
Land-use type | | Arable, grassland, forest, shrubs, peat-mining, unused peatland, swamp, open water | Point, 100, 500, 1000, 2500 | Digital Landscape Model¹
Vegetation attributes (optional) | | Deciduous forest, mixed forest, coniferous forest, reed, shrubs, grass | Point | Digital Landscape Model¹
“Wet soil observed” | | Yes, no | Point | Digital Landscape Model¹
Combined land cover information (land-use type, veg. and wet soil attr.) | lc | Arable, grassland, wet grassland, deciduous including mixed forest, wet forest, coniferous forest, reed, unused peatland, wet unused peatland | Point, 100, 500, 1000, 2500 | Digital Landscape Model¹
Dry land cover fraction | fdry(X) | Arable + 0.5 × grassland on organic soil area, 0 to 1 | 100, 500, 1000, 2500 | Digital Landscape Model¹
Wet land cover fraction | | Reed, wet grassland, wet forest and wet unused peatland on organic soil area, 0 to 1 | 100, 500, 1000, 2500 | Digital Landscape Model¹
Total length of ditches for all lc and only for arable and grassland (subscr. “dry”) | dilen_dry(X) (name uncertain, garbled in source) | ≥ 0 m | Point, 50, 250, 1000, 2500 | Digital Landscape Model¹
Distance to next ditch | | ≥ 0 m | Point | Digital Landscape Model¹
Peatland type | ptype | Lowland bog, upland bog, fen neighboring surface water, fen without neighboring surface water, other “low-C” organic soil | Point | Map of organic soils²
Material at peat base | pbase | Unconsolidated rock, peat clay layer, rock, no information | Point | Map of organic soils²
Peatland fraction | fpeat(X) | 0 to 1 | Point, 500, 1000, 2500 | Geological map (BGR)³
Distance to edge of peatland | | > 0 m | | Geological map (BGR)³
Ratio of dpeat/fpeat | | > 0 | 2500 | Geological map (BGR)³
Precipitation | | ≥ 0 mm | Point | Raster map 1 × 1 km (DWD)⁴
Evapotranspiration | | ≥ 0 mm | Point | Raster

map

1times1

km(D

WD

)4

Clim

atic

wat

erba

lanc

ew

b sum

mer

lt0

andge

0m

mP

oint

Ras

ter

map

1times1

km(D

WD

)4

Rel

ativ

ehe

ight

hre

l(X

)lt

0an

dge0

mP

oint

ndashm

edia

n25

50

100

250

Dig

italE

leva

tion

Mod

el5

500

1000

Topo

grap

hic

inde

xti ra

sR(X

)gt

0P

oint

and

1000

buffe

rfo

r10

D

igita

lEle

vatio

nM

odel

5

252

501

000

rast

erva

lues

Pro

tect

ion

stat

usN

atur

eco

nser

vatio

nar

eas

peci

alar

eas

ofco

nser

vatio

nP

oint

Map

sof

prot

ecte

dar

eas

6

spec

ialp

rote

ctio

nar

eafo

rw

ildbi

rds

UN

ES

CO

bios

pher

ere

serv

ena

ture

park

nat

iona

lpar

kla

ndsc

ape

prot

ectio

nar

ea

1AT

KIS

Bas

isD

LMF

eder

alA

genc

yfo

rC

arto

grap

hyan

dG

eode

syB

KG

2

map

ofor

gani

cso

ils(R

oszligko

pfet

al

2014

Hum

bold

tUni

vers

ityof

Ber

lin)

3ge

olog

ical

map

12

0000

0(G

UE

K20

0B

GR

ndashF

eder

alIn

stitu

tefo

rG

eosc

ienc

esan

dN

atur

alR

esou

rces

)4

rast

erm

ap1times

1km

ofw

eath

erda

ta(G

erm

anW

eath

erS

ervi

ce)

5B

KG

var

iabl

ena

me

indi

cate

dfo

rth

eni

neva

riabl

esin

the

final

mod

elw

ith(X

)in

dica

ting

buf

size

andR

indi

catin

gra

ster

reso

lutio

n6F

eder

alA

genc

yfo

rN

atur

eC

onse

rvat

ion

(BfN

)

wwwhydrol-earth-syst-scinet1833192014 Hydrol Earth Syst Sci 18 3319ndash3339 2014

3326 M Bechtold et al Large-scale regionalization of water table depth in peatlands

Figure 3 Illustration of the annual mean water level (WL) transformation Hypothetical transfer function relating GHG budget to WL (m)(a) GHG budget vs the transformed water level (WLt) (b) WLt vs WL The lines along thex axes indicate the data quantiles of theanalyzed data set(c)

et al 2013) Thus we performed this simplificationprocess by first dropping those parameters with a cor-relationgt 07 (either Pearson or Spearman type) to an-other parameter with a higher contribution (Clapcott etal 2011) This ensured that two highly correlated pa-rameters would not remain in the parameter set longerthan the last parameter of another group of variableswhich may contribute less compared to the two highlycorrelated parameters but provides extra informationthat is not covered by the other parameters After allhighly correlated parameters have been dropped fur-ther parameters with low contribution were droppedprogressively

Predictor contributions are calculated as proportional contributions to the total error reduction and can be considered as a measure of the influence of the individual predictors. Additionally, a BRT model allows the derivation of partial dependence plots, which indicate how the response is affected by a certain predictor after accounting for the average effects of all other predictors in the model (Elith et al., 2008). These plots do not show the full effect of each parameter on the model response, due to interactions with other parameters that are fixed to derive these plots as well as due to parameter co-correlation. However, they can be used for interpreting model behavior (Elith et al., 2008).
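The correlation-based part of the simplification step, which relies on the predictor contributions just described, can be sketched as below. This is a simplified illustration under stated assumptions: it uses Pearson correlation only (the study also considered Spearman), it prunes in a single pass without refitting the BRT model after each drop, and all variable names and numbers are invented for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(columns, contribution, threshold=0.7):
    """Keep a predictor only if it is not correlated above `threshold`
    with an already kept predictor that has a higher contribution."""
    ranked = sorted(columns, key=lambda name: contribution[name], reverse=True)
    kept = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Illustrative values only, not data from the study.
columns = {
    "fdry_2500": [0.1, 0.4, 0.5, 0.9, 0.3],
    "fdry_1000": [0.2, 0.5, 0.5, 1.0, 0.3],  # nearly a copy of fdry_2500
    "wb_summer": [-50.0, 20.0, -80.0, 10.0, 60.0],
}
contribution = {"fdry_2500": 10.0, "fdry_1000": 6.0, "wb_summer": 8.0}
kept = drop_correlated(columns, contribution)
```

In the actual procedure the model is recalibrated as predictors are removed, so contributions change along the way; the single-pass ranking here only conveys the selection rule.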

2.3.1 WLt transformation of WL

The map of water levels of this study was developed to improve the upscaling of greenhouse gas emissions from organic soils. Therefore, the final map should provide the highest accuracy for the water level range for which the highest differences of greenhouse gas emissions occur. This can be achieved by transforming WL into a transformed variable, WLt, which shows a linear relationship with GHG emissions. The sensitivity of greenhouse gas emissions to water level has been analyzed in several laboratory and field experimental and monitoring studies (Berglund and Berglund, 2011; Drösler et al., 2011; Hahn-Schöfl et al., 2011; Leiber-Sauheitl et al., 2014; Moore and Roulet, 1993; Moore and Dalva, 1993; van den Akker et al., 2012). General trends are a strong increase of methane (CH4) emissions for annual mean water levels of approximately > −0.1 m, and an increase of CO2 emissions for water levels < −0.1 m with a trend similar to a saturation function that levels out approximately between −0.4 and −0.8 m (Fig. 3a). While studies agree over these general trends, the exact shape of the transfer function and the maximum levels of emissions, as well as their dependence on soil properties and other environmental parameters, are still controversial. Here, we assume a hypothetical transfer function relating the normalized GHG budget, ranging from 0 to 1, to the water level (see also Fig. 3):

$$\mathrm{GHG\ Balance} = \begin{cases} -e^{3(\mathrm{WL}+0.1)} + 1, & \mathrm{WL} \le -0.1 \\ 1 - e^{-3(\mathrm{WL}+0.1)}, & \mathrm{WL} > -0.1 \end{cases} \tag{1}$$

As the GHG budget can be positive for both low and high WL, we introduced the transformed water level WLt as (Fig. 3)

$$\mathrm{WL_t} = \begin{cases} e^{3(\mathrm{WL}+0.1)} - 1, & \mathrm{WL} \le -0.1 \\ 1 - e^{-3(\mathrm{WL}+0.1)}, & \mathrm{WL} > -0.1 \end{cases} \tag{2}$$

By calibrating the model to both WL and WLt, we test if the optimization of WLt provides the highest model accuracy for the water level range relevant for GHG emissions and if it optimizes the map for application to GHG upscaling.
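As a minimal sketch of Eqs. (1) and (2), the two functions below evaluate the hypothetical transfer function and the transformed water level for a given annual mean water level in metres (negative below ground). The function names are chosen here for illustration; the constants 3 and 0.1 are those of the equations.

```python
import math


def ghg_balance(wl):
    """Normalized hypothetical GHG budget, Eq. (1); wl in m."""
    if wl <= -0.1:
        return 1.0 - math.exp(3.0 * (wl + 0.1))
    return 1.0 - math.exp(-3.0 * (wl + 0.1))


def wl_t(wl):
    """Transformed water level WLt, Eq. (2); wl in m."""
    if wl <= -0.1:
        return math.exp(3.0 * (wl + 0.1)) - 1.0
    return 1.0 - math.exp(-3.0 * (wl + 0.1))


# The GHG budget is the absolute value of WLt, i.e. linear in WLt on each branch.
for wl in (-0.8, -0.4, -0.1, 0.0, 0.1):
    assert abs(ghg_balance(wl) - abs(wl_t(wl))) < 1e-12
```

Both branches meet at WL = −0.1 m, where WLt and the GHG budget are zero; WLt approaches −1 for deep water levels and +1 for water levels far above that threshold.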

2.3.2 Weighting scheme

When considering possible data weighting schemes, it is worth emphasizing at this point that the goal of this study is the development of a statistical model that can explain the water level variability both within a peatland and among different peatlands. The data of target and predictor variables for building this model are highly heterogeneous. First, the target variable data set contains peatland areas that strongly differ in their spatial extent and in the number of installed dip wells. Second, the predictor variable data set contains categorical and numerical data, and part of the predictor variables predominantly vary from peatland to peatland (e.g., climatic boundary conditions, large-scale topographic wetness index, peatland characteristics), whereas others also show within-peatland variability (e.g., land use, small-scale topographic wetness index, drainage network). As the influence of the individual predictor variables on our target WLt is expected to be rather diffuse due to abundant interactions with other site characteristics, the robustness of derived dependencies will strongly depend on the number of different peatlands in the data set.

There are no universal data weighting rules for similarly heterogeneous data situations, and some degree of expert judgment and subjectivity is inevitably involved when developing an appropriate scheme (Francis, 2011). The need for introducing a data weighting scheme is obvious, as without data weighting during calibration, too much influence would be given to small and well-studied peatlands, which would reduce predictive model performance for large, less-well-studied peatland areas. To avoid this in a simple manner, the weight of each dip well could be divided by the number of dip wells in its peatland, which results in each peatland being equally weighted. This scheme, however, does not sufficiently use the high information content provided by well-studied large peatlands, which should have a higher impact on model calibration than a small peatland with only a few dip wells.

Here, we propose a new weighting scheme that takes into account both factors, peatland size and local density of dip wells, to derive dip-well-specific weighting factors. It is based on principles of data uncertainty reduction by repeated measurements and of geostatistics. First, we consider our data situation as an analogue of meta-analysis with grouped data. It has been shown for homogeneous problems (all data from the same population) that the optimal group weight for meta-analysis is 1/SE² (Hedges and Olkin, 1985), with SE being the standard error of each group:

$$\mathrm{SE} = \frac{\sigma_e}{\sqrt{N}} \tag{3}$$

where $\sigma_e$ is the error standard deviation of a measurement and $N$ is the number of measurements in a group. For homogeneous problems and uniform $\sigma_e$, this results in weights that are linearly dependent on $N$, which we here call the first end member of weighting. Heterogeneity (within-group variance) reduces the variation of the group weights, which can be shown by random effect models (Cumming, 2012). As the second end member of weighting, when heterogeneity totally dominates within-group variance, optimal group weights are uniform for all groups, i.e., weights are independent of $N$. We are not aware of a method that allows the estimation of the degree of heterogeneity for the complex target and predictor data situation in this study, including data errors (spatial and temporal variability, measurement error) and model errors (missing parameters). As a trade-off between 1/SE² (homogeneous end member) and 1 (heterogeneous end member), we decided on a group weight that is the inverse of the standard error, 1/SE, which is, for example, often used in econometric studies (Dickens, 1990). We emphasize that this is a subjective decision.

Figure 4. Sample semivariogram and fitted semivariogram model of the annual mean water level data, WL.

The group weight 1/SE is the basis for the geostatistical part of our weighting scheme. There are two reasons why we cannot directly treat our peatlands as groups. First, there is within-peatland variability, which is related to changing site characteristics. It is one objective of our study to describe this variability by statistical modeling. Thus, dip wells must be treated individually, and data cannot be aggregated at the peatland level. Second, we expect the model to learn more when the same number of dip wells is installed in a larger peatland. In a small peatland, spatial autocorrelation between dip wells is higher, i.e., the information content is lower than for large peatlands. As a consequence of the first point, we do not aggregate and keep all dip wells in the target variable data set by attributing to each dip well the fraction 1/N of its group weight, so that the relative weights of the groups remain constant. As a consequence of the second point, we use principles of geostatistics in our weighting scheme. We replace the group size $N$ (positive integer number) by the "statistical" group size $n$ (positive continuous number, being > 1), which we derive from the spatial autocorrelation among the dip wells.

Therefore, we analyze the spatial autocorrelation structure of the data set. A single spherical variogram model was fitted to the sample variogram of all data (Fig. 4 in Sect. 3.1). Variogram models allow the differentiation of the total data variance (called "sill") into a spatially uncorrelated variance (called "nugget") and a spatially correlated variance (called "structural variance" and defined as sill – nugget) (Wackernagel, 2003). The variogram model allows the derivation, for any distance between two locations, of the average squared difference of values, here defined as $\gamma$. By definition, at distance 0, the average squared difference equals the nugget, and at distances greater than the so-called "range" of spatial autocorrelation, the average squared difference equals the sill. Accordingly, the autocorrelated fraction $f$ of the average squared difference between two dip wells $i$ and $j$ is

$$f_{ij} = \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}} \tag{4}$$

We now define the "statistical" group size $n$ of each dip well $i$ to be the sum of one plus the autocorrelated fractions $f_{ij}$ of all dip wells that are within the range of spatial autocorrelation of $i$:

$$n_i = 1 + \sum_{j=1}^{m} \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}} \tag{5}$$

According to the discussion above, dip-well-specific weights can then be calculated with

$$w_i = \frac{1}{n_i\,\mathrm{SE}_i} = \frac{1}{\sigma_{e,i}\sqrt{n_i}} \tag{6}$$

where $n_i$ is derived from Eq. (5). The equation shows that with increasing "statistical" group size $n$, i.e., with increasing spatial data density, the weight of an individual dip well is "down-weighted" to some degree, a behavior that corresponds to our initial intention to lower the influence of small peatlands compared to large ones. The error standard deviation $\sigma_e$ is dependent on several factors, e.g., the length of the time series, the temporal measurement density and the microtopography around the dip well. For simplicity, we here assumed $\sigma_e$ to be uniform for all dip wells, which simplifies Eq. (6) to $w_i = 1/\sqrt{n_i}$.

Only dip wells with the same land-use type were summed up with Eq. (5), which avoids the down-weighting by dip wells that have different land-use types. The latter are mostly characterized by fairly different WLt and thus by rather low spatial autocorrelation to dip well $i$.

After spatial correlation had been accounted for, the sum of the weights of all dip wells of each land-use type was adjusted so that it corresponds to the fraction of this land-use type in Germany. This adjustment accounts for the overrepresentation in the data set of dip wells in unused peatlands and the underrepresentation of dip wells in arable land.
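The weighting scheme of Eqs. (4) to (6) with uniform σe, followed by the land-use adjustment, can be sketched as below. Several points are assumptions made for this illustration: the standard form of the spherical variogram is used (the text names the model type but not its formula), the nugget, sill and range are the fitted values reported in Sect. 3.1, and the well coordinates and land-use fractions are invented.

```python
import math

NUGGET, SILL, RANGE = 0.012, 0.09, 2700.0  # m2, m2, m (fitted values, Sect. 3.1)


def spherical_gamma(h):
    """Standard spherical variogram: nugget at h = 0, sill at h >= RANGE."""
    if h >= RANGE:
        return SILL
    r = h / RANGE
    return NUGGET + (SILL - NUGGET) * (1.5 * r - 0.5 * r ** 3)


def autocorrelated_fraction(h):
    """Eq. (4): fraction of the squared difference that is autocorrelated."""
    return (SILL - spherical_gamma(h)) / (SILL - NUGGET)


def dip_well_weights(wells):
    """wells: list of (x, y, land_use). Returns w_i = 1 / sqrt(n_i), Eqs. (5)-(6)."""
    weights = []
    for i, (xi, yi, lu_i) in enumerate(wells):
        n_i = 1.0
        for j, (xj, yj, lu_j) in enumerate(wells):
            if i == j or lu_i != lu_j:
                continue  # only wells with the same land-use type count
            h = math.hypot(xi - xj, yi - yj)
            if h < RANGE:
                n_i += autocorrelated_fraction(h)
        weights.append(1.0 / math.sqrt(n_i))
    return weights


def rescale_to_land_use(wells, weights, target_fraction):
    """Scale weights so that each land-use type sums to its target fraction."""
    totals = {}
    for (_, _, lu), w in zip(wells, weights):
        totals[lu] = totals.get(lu, 0.0) + w
    return [w * target_fraction[lu] / totals[lu]
            for (_, _, lu), w in zip(wells, weights)]


# Invented example: two close grassland wells, one remote, one arable well.
wells = [(0.0, 0.0, "grassland"), (100.0, 0.0, "grassland"),
         (5000.0, 0.0, "grassland"), (50.0, 0.0, "arable")]
weights = dip_well_weights(wells)
scaled = rescale_to_land_use(wells, weights, {"grassland": 0.7, "arable": 0.3})
```

The two neighbouring grassland wells share information and are down-weighted, while the remote grassland well and the arable well keep a weight of 1 before the land-use adjustment.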

2.3.3 Model performance criteria

Model fit and predictive performance after cross-validation were quantified by the weighted root mean square error:

$$\mathrm{RMSE} = \sqrt{\frac{1}{\sum_{i=1}^{m} w_i} \sum_{i=1}^{m} \left( w_i \left( x_{o,i} - x_{s,i} \right)^2 \right)} \tag{7}$$

where $m$ is the number of dip wells, $x_{o,i}$ is observed WL or WLt of dip well $i$, $x_{s,i}$ is simulated WL or WLt of dip well $i$, and $w_i$ is the data weight of dip well $i$ (see Sect. 2.3.2). We refer to the root mean square error of the predicted data of cross validation as RMSEcv. Model performance was further quantified by Nash–Sutcliffe efficiency (NSE):

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{m} w_i \left( x_{o,i} - x_{s,i} \right)^2}{\sum_{i=1}^{m} w_i \left( x_{o,i} - \bar{x}_o \right)^2} \tag{8}$$

where $\bar{x}_o$ is the mean of all observed WL or WLt. NSE indicates how well observed vs. predicted values match the 1 : 1 line. It is a good overall indicator of predictive performance because it combines scatter and bias (common offset and/or slope difference from the 1 : 1 line) (Nash and Sutcliffe, 1970). Values greater than 0 signify a model that is better than the reference model based on the data mean. We refer to the NSE of the training data as NSEcal and of the predicted data of cross validation as NSEcv.

Systematic errors were quantified by calculating the model bias, here defined as

$$\mathrm{BIAS} = \sum_{i=1}^{m} \left( w_i\, x_{o,i} - w_i\, x_{s,i} \right) \tag{9}$$
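The three performance criteria of Eqs. (7) to (9) can be sketched as below. Two choices are assumptions made here rather than statements of the text: the observed mean in the NSE is taken as the plain arithmetic mean ("the mean of all observed" values), and the weights in the bias are normalized to sum to one so that the bias has the unit of the target variable.

```python
import math


def weighted_rmse(obs, sim, w):
    """Eq. (7): weighted root mean square error."""
    return math.sqrt(
        sum(wi * (o - s) ** 2 for o, s, wi in zip(obs, sim, w)) / sum(w)
    )


def weighted_nse(obs, sim, w):
    """Eq. (8): weighted Nash-Sutcliffe efficiency (unweighted observed mean assumed)."""
    mean_obs = sum(obs) / len(obs)
    num = sum(wi * (o - s) ** 2 for o, s, wi in zip(obs, sim, w))
    den = sum(wi * (o - mean_obs) ** 2 for o, wi in zip(obs, w))
    return 1.0 - num / den


def weighted_bias(obs, sim, w):
    """Eq. (9) with weights normalized to sum to one (assumption)."""
    total = sum(w)
    return sum(wi * (o - s) for o, s, wi in zip(obs, sim, w)) / total


# Invented example values on the WLt scale.
obs = [-0.5, -0.2, 0.0]
sim = [-0.4, -0.3, 0.1]
w = [1.0, 0.7, 0.7]
rmse, nse, bias = weighted_rmse(obs, sim, w), weighted_nse(obs, sim, w), weighted_bias(obs, sim, w)
```

With this sign convention (observed minus simulated), a positive bias means that the model predicts systematically lower values than observed.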

2.4 Model uncertainty and stability evaluation

Uncertainty of the model predictions was assessed by bootstrapping, cross-validation and residual analysis.

For the bootstrapping analysis, we followed the procedure of Leathwick et al. (2006). We estimated the confidence intervals around the predictions and the fitted functions by taking 1000 bootstrap samples of the 53 peatlands. The number of peatlands in each sample was equivalent to the data set, but peatlands were selected randomly with replacement. Using the predictor variables of the final model, a BRT model was fitted to each sample. Cross validation was again performed on peatlands; thus, a peatland in the calibration data set was not part of the cross-validation data set, to avoid overly optimistic results. Variances of the predictions and of the fitted functions of the 1000 models were evaluated.
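Resampling whole peatlands rather than individual dip wells can be sketched as below. This is only an illustration of the sampling step under assumed data structures (a list giving the peatland label of each dip well); fitting a BRT model to each sample is not shown, and the function name is hypothetical.

```python
import random


def bootstrap_peatlands(peatland_of_well, rng):
    """Draw one bootstrap sample of whole peatlands, with replacement.

    Returns the drawn peatland labels and the indices of all dip wells
    belonging to them (a peatland drawn twice contributes its wells twice).
    """
    peatlands = sorted(set(peatland_of_well))
    drawn = [rng.choice(peatlands) for _ in peatlands]
    members = {}
    for idx, label in enumerate(peatland_of_well):
        members.setdefault(label, []).append(idx)
    sample = []
    for label in drawn:
        sample.extend(members[label])
    return drawn, sample


rng = random.Random(42)  # fixed seed for reproducibility of the illustration
peatland_of_well = ["A", "A", "B", "C", "C", "C"]  # invented labels
replicates = [bootstrap_peatlands(peatland_of_well, rng) for _ in range(1000)]
```

Keeping all wells of a drawn peatland together preserves the within-peatland correlation structure in each replicate, which is the reason for sampling at the peatland level.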

If data sets are relatively small (e.g., n < 1000; De'ath, 2007), then the small size of the training and test data sets lowers model accuracy. Given the fairly small number of peatlands in the data set and the partly high spatial correlation of dip wells within these peatlands, we decided not to split the data set into a training and a test data set. Estimates of model accuracy can then be based on cross-validation, thereby making effective use of all the data (De'ath, 2007). The prediction uncertainty of the final model is estimated by the root mean square error of prediction (RMSEcv, see above) for each land cover class. After testing for near-normal distribution of the residuals, RMSEcv can be used to derive the 68 and 95 % confidence intervals of the predictions with RMSEcv and 2 × RMSEcv, respectively.


Finally, additional residual analysis was performed to evaluate whether the predictions are biased for different land cover classes or geographical regions.

2.5 Regionalization

In the final regionalization step, the predictor variables contributing to the final model were determined on a 25 × 25 m raster for all organic soils in Germany. Predictor variables were determined with the same map input that was used for model building. Land cover information, including information on ditches, was based on the data from the year 2012, and the climatic data were based on the average of the last 30 years. The fine spatial resolution of 25 × 25 m was not chosen to suggest a highly spatially accurate model. Rather, this fairly fine scale was necessary to map the relatively small-scale effects of the topography, land use and peatland geometry variables. The final model was then used to make a prediction for each of these raster cells.

3 Results and discussion

3.1 Spatial correlation structure of the data set

The variogram model fitted to the sample variogram provided a nugget (0.012 m², 0.11 m), a sill (0.09 m², 0.3 m) and a range of spatial correlation (2700 m) for our data set of WL (Fig. 4). The nugget represents the very small-scale soil hydraulic variability and micro-topography effects on WL (van der Ploeg et al., 2012) and measurement error, e.g., by differences in the determination of the ground surface and in the timing of the manual measurements. Furthermore, micro-topography (e.g., hummocks) and oscillating peat surfaces of wet peatlands pose a challenge for an accurate determination of both ground surface and water level. The water level time series in the data set were of different lengths and ranged from 1 to 20 years. Interannual variability of water levels can be large (e.g., Knotters and van Walsum, 1997). For simplicity, in our analysis, data were not harmonized by extrapolating WL time series to a 30-year period using weather data. Thus, the nugget also includes errors that are introduced by dip wells with different measurement periods that are located within the range of spatial correlation. In consideration of these error sources, the fitted nugget of 0.11 m appears to be a realistic value. At 0.3 m, the fitted sill matched nearly perfectly with the standard deviation of the data (0.31 m), which indicates consistency between semivariogram model and data set. The fitted range of spatial correlation of 2700 m reflects both physical effects, i.e., the average range of lateral flow due to hydraulic gradients, and the effect of average land-use patterns in Germany on the spatial correlation of WL. Fitted values were used in the calculation of the dip-well-specific weights using Eq. (6).

3.2 Typical water levels for land-use types in German organic soils

The land cover classes are characterized by plausible mean and median water levels, which show consistent differences between each other (Table 2 and Fig. 5a). The mean values of arable land and grassland agree with what can be expected for their agronomic requirements, with slightly lower water levels for arable land. The high variability observed for both classes may be related to the variability of the efficiency of installed drainage systems, for example the presence and condition of tile drains and the depth of ditches. Grasslands can be managed with very variable intensity, which is partly reflected in different water levels. Figure 5a further shows that deciduous forests seem to dominate slightly drier organic soils compared to coniferous forests, which dominate under wetter conditions. A high variability of water levels is observed for the land cover class unused peatland. On the one hand, post peat-cutting topography increases the variability of WL over short distances. It probably contributes to the high variance observed for this class. On the other hand, this class comprises both rather dry unused peatlands and wetter peatlands in which re-wetting measures have already taken place, but which do not yet show a wet soil attribute in the ATKIS Digital Landscape Model. This may also cause part of the variance observed in the grassland and forest land cover classes. All wet land cover classes (reed, wet grassland, wet forest and wet unused peatland) that were separated by wetness indication clearly show higher water levels, showing that the wetness attribute of the Digital Landscape Model is a useful attribute.

Figure 5b shows the transformed water level for all classes. It can be observed that the variances of the wetter land cover classes increase relative to the variances of the dry land cover classes. This is due to the highest sensitivity of GHG emissions in the wet range of water levels (> −0.5 m). Consequently, the rather high variance of WL for arable land corresponds to a rather low variance of WLt, i.e., to a rather low assumed effect of WL variability on the GHG budget.

3.3 BRT model calibration and validation: WL vs. WLt

In contrast to land cover class, the other predictor variables showed, if at all, only weak relations to WL and WLt when evaluating them with box plots, 2-D cross plots and simple correlation matrices. Here, we expected BRT to detect the strongest predictor interactions and to identify the most informative predictors.

After model calibration with all predictors, subsequent model simplification successively dropped those parameters with correlation > 0.7 and the lowest contribution. For both WL and WLt, model performance improved during this simplification. For WLt, the highest values of NSEcv of approximately 0.46 were achieved with 21 to 9 model parameters. The development of NSEcv for the last 50 parameters is shown in Fig. 6. Further elimination of parameters led to a pronounced decline of model performance. Similar behavior was observed for the calibration on WL. In favor of a more parsimonious model, we chose the model with the lowest number of parameters before the pronounced decline of model performance occurred. For the calibration on WLt, this corresponded to the model with the lowest number of parameters that still achieved NSEcv values of > 0.45 (Fig. 6). The final WLt model comprised nine predictor variables and the final WL model seven parameters. The percentages of parameter contributions to the final model and their individual influences are discussed for WLt in Sect. 3.4.

Figure 5. Water level relative to ground surface, WL (m), and transformed water level, WLt (−), by land cover class, illustrated as a weighted box plot. WLt = −1 corresponds to maximum CO2 emissions and WLt = 1 to maximum CH4 emissions. In the top horizontal axes, the number of dip wells in each class is indicated.

Figure 6. NSEcv as a function of the number of predictor variables used in the model of WLt during model simplification, shown for the last 50 parameter drops.

Table 3 summarizes the statistical performances of the models calibrated on WL and WLt. For both models, NSEcal is considerably higher than NSEcv and shows the commonly observed overfitting behavior of BRT models. The different measures that we conducted to minimize overfitting (cross-validation on peatlands, restriction to monotonic responses, and model simplification including elimination of highly correlated variables) lowered the difference between NSEcal and NSEcv, but could not totally avoid overfitting. NSEcv of the WLt model (0.453) indicates higher predictive model performance compared to the WL model (0.381). However, as the data ranges differ due to the transformation, this comparison may be misleading. Therefore, we transformed the predictions of the WL model to obtain WLt values from this model and equally calculated the performance criteria (Table 3, second column). Then, NSEcv is slightly increased (0.397), but does not achieve the values of the model that was calibrated on WLt. A better predictive model performance of the model calibrated on WLt is also visible for the RMSEcv values. The total RMSEcv as well as the RMSEcv values for the dry (WL < −0.3 m) and wet range (WL > −0.3 m) show slightly lower values for the WLt model compared to WLt values from the model calibrated on WL. Given our hypothetical transfer function (Fig. 3), in which the GHG budget is linearly related to WLt, the higher accuracy of WLt predictions directly corresponds to a higher accuracy of GHG budget predictions.

Table 2. Weighted mean and standard deviation of WL and WLt data, and of the WLt map presented in Sect. 3.6, for the nine land cover classes.

| Land cover class | WL (m), mean ± sd | WLt (−), mean ± sd | WLt (−), map mean ± sd |
|---|---|---|---|
| Arable land | −0.69 ± 0.30 | −0.76 ± 0.17 | −0.66 ± 0.22 |
| Deciduous f. | −0.45 ± 0.34 | −0.49 ± 0.37 | −0.47 ± 0.35 |
| Grassland | −0.44 ± 0.29 | −0.52 ± 0.32 | −0.49 ± 0.30 |
| Unused peatl. | −0.39 ± 0.36 | −0.39 ± 0.41 | −0.37 ± 0.40 |
| Coniferous f. | −0.36 ± 0.36 | −0.37 ± 0.37 | −0.46 ± 0.35 |
| Wet unused peatl. | −0.22 ± 0.27 | −0.18 ± 0.40 | −0.17 ± 0.36 |
| Wet forest | −0.22 ± 0.29 | −0.17 ± 0.43 | −0.21 ± 0.39 |
| Wet grassland | −0.10 ± 0.14 | −0.00 ± 0.31 | −0.15 ± 0.39 |
| Reed | −0.01 ± 0.17 | 0.20 ± 0.29 | −0.06 ± 0.32 |

Table 3. Performance criteria of the different models; dry range defined as WL < −0.3 m and wet range as WL > −0.3 m.

| Criterion | WL (m) (calibrated on WL) | WLt (−) (calibrated on WL) | WLt (−) (calibrated on WLt) |
|---|---|---|---|
| NSEcal | 0.627 | 0.559 | 0.642 |
| NSEcv | 0.381 | 0.397 | 0.453 |
| RMSEcv | 0.269 | 0.299 | 0.284 |
| RMSEcv,dry | 0.284 | 0.263 | 0.259 |
| RMSEcv,wet | 0.222 | 0.382 | 0.355 |
| Bias | −0.003 | 0.083 | 0.002 |
| Bias,dry | −0.012 | 0.070 | 0.003 |
| Bias,wet | 0.021 | 0.120 | 0.000 |

Superior model performance is also evident when evaluating model bias. Only when calibrating directly on WLt are the WLt predictions bias-free. Calibration on WL and subsequent transformation to WLt introduces a model bias towards systematically lower WLt values. In subsequent applications to GHG emission upscaling, lower WLt values would lead to an overestimation of CO2 emissions and to an underestimation of CH4 emissions.

3.4 Influence of predictor variables on WLt

Given the beneficial characteristics of the model calibrated on WLt for GHG upscaling, presentation and discussion of further model results is restricted to the WLt model.

The BRT method allows the analysis of the parameter contributions to and influences on the model (Elith et al., 2008), and thus may contribute to system understanding. The percentages of the contributions of the nine predictor variables to the final model ranged from 25.2 to 5.6 % (Fig. 7). Except for protection status, at least one parameter of each of the seven parameter groups contributed to the final model. All protection status information was dropped early during the simplification process due to low contribution, although WL showed slightly higher values for data from nature protection areas or special areas of conservation. However, other parameters seem to be able to fully compensate for the information that is lost by dropping this predictor.

Land cover class, lc, at the dip well was the parameter with the strongest contribution (25.2 %). It basically follows the trend illustrated in Fig. 5b. The bootstrap error, plotted as standard deviation (Fig. 7), shows the variation of this influence over the 1000 bootstrap models. A second land cover parameter, the fraction of dry land cover classes on organic soils in a buffer of 2500 m radius, fdry(2500), contributed to the model with 10.3 %. The monotonic decrease of WLt with increasing fdry(2500) is plausible, as higher values reflect intensive land use in the surroundings of the dip well and thus indicate intensive artificial drainage. Together, both parameters contributed 35.5 %, and thus land cover represents the parameter group with the strongest model contribution.

Peatland characteristics are the second most important parameter group. The peatland type contributed 16 %. The model indicates that peatlands without any connection to surface water bodies (river or lake) and the class of other organic soils are characterized by lower WLt compared to the peatland types lowland bog, upland bog and fen neighboring surface water. As the class of other organic soils is generally expected to reflect lower water levels, and as surface water may have a stabilization effect on water levels of organic soils, the influence of the peatland type can be considered plausible. Besides peatland type, the substrate of the peat base contributes 5.6 %. Here, organic soils overlying peat clay layers (e.g., limnic sediments such as calcareous gyttja) or basement rock are characterized by higher WLt compared to organic soils overlying unconsolidated rock. This can be explained by the lower drainage resistance of unconsolidated rocks. This may cause an increased efficiency of anthropogenic drainage and/or a general higher vulnerability to seepage losses. Finally, slightly lower WLt values are indicated by a high fraction of organic soils for the 500 m buffer, fpeat(500). This may reflect the higher land-use pressure on large peatlands compared to rather small peatlands, which tentatively are more easily preserved by nature protection efforts.

The remaining four parameter groups are represented in the model by only one parameter each. The third most influential parameter was the length of ditches on arable land and grassland for the 250 m buffer, dilendry(250). At first glance, it may be surprising that WLt values tend to be higher with increasing ditch density, as ditches are supposed to drain the water when land is used as arable land and grassland. The fact that the model identifies a rather strong effect in the opposite direction may be caused by incomplete information about the drainage network. There is no detailed information about the spatial distribution of tile drains. Based on expert knowledge, agricultural areas with a lower ditch density are more likely to have tile drains. As these drains, which are easily installed with a narrow drain spacing, are more effective at draining organic soils, low WLt values for arable land and grassland may be related to low ditch densities. Furthermore, ditches were originally dug at narrow spacing in especially wet areas of organic soils, but there is no information available on whether these ditches still function properly.

The parameters wbsummer, hrel and tiras25all show expected trends. The model predicts higher WLt for increasing climatic water balance in the summer period (May to October), wbsummer, for dip wells located in depressions (low values of hrel), and for higher small-scale topographic wetness indices calculated on the 25 × 25 m digital elevation model (tiras25).


Figure 7. Partial dependence plots for the predictor variables. For an explanation of variables, see Table 1. The y axes are on WLt scale and are centered around the mean WLt. Error bars and grey area indicate standard deviation of the response over 1000 bootstrap models. The relative contribution of each predictor is indicated as percentage. The lines along the x axes of each plot show the distribution of data across that variable in deciles.

The fact that all parameters show expected or explainable responses in the model corroborates the reliability of the calibrated WLt model. The standard deviation of the predictor responses based on the bootstrap samples shows the stability of the observed responses.

Further insight into model behavior can be obtained by analyzing parameter interactions. This is done by changing two parameters simultaneously while keeping mean values for all other parameters (Elith et al., 2008). Figure 8 shows the two strongest parameter interactions. Parameter wbsummer strongly interacts with ptype. The generally lower values of WLt of fens without surface water connection and other organic soils show a stronger dependency on the summer climatic water balance. Above a summer climatic water balance of −80 mm, WLt of the wetter peatland types responds only weakly, whereas the two drier peatland types still show a strong effect with increasing wbsummer. The trend for wbsummer > 130 mm for the dry peatland types is supported by seven different peatlands.
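The interaction analysis described above can be sketched in a few lines. This is a minimal illustration, not the study's implementation (which relied on boosted regression trees in R): `interaction_surface`, `toy_predict`, the predictor names and all numbers are hypothetical stand-ins. The sketch varies two predictors over a grid while every other predictor is held at its mean.

```python
from statistics import mean

def interaction_surface(predict, rows, var_a, grid_a, var_b, grid_b):
    """Evaluate `predict` on a grid of two predictors, all others held at their mean."""
    # Baseline record: the mean of every predictor over the data set.
    base = {name: mean(r[name] for r in rows) for name in rows[0]}
    surface = {}
    for a in grid_a:
        for b in grid_b:
            record = dict(base)
            record[var_a] = a
            record[var_b] = b
            surface[(a, b)] = predict(record)
    return surface

# Toy stand-in for a fitted model (NOT the paper's model): the summer water
# balance has a stronger effect for the "dry" peatland type (ptype_wet == 0).
def toy_predict(rec):
    slope = 0.004 if rec["ptype_wet"] == 0 else 0.001
    return slope * rec["wb_summer"] + 0.1 * rec["h_rel"]

rows = [
    {"wb_summer": -100.0, "ptype_wet": 0, "h_rel": 0.5},
    {"wb_summer": 100.0, "ptype_wet": 1, "h_rel": -0.5},
]
surface = interaction_surface(
    toy_predict, rows, "wb_summer", [-100.0, 100.0], "ptype_wet", [0, 1]
)
# Response range along wb_summer for each peatland type; unequal ranges
# are the signature of an interaction.
dry_range = surface[(100.0, 0)] - surface[(-100.0, 0)]
wet_range = surface[(100.0, 1)] - surface[(-100.0, 1)]
```

If the two ranges were equal, the effects of the two predictors would be additive; the larger range for the dry type mimics the pattern reported for wbsummer and ptype.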


Figure 8. Partial dependence plots representing the two strongest interactions in the model: (a) between ptype and wbsummer and (b) between pbase and fdry. Fitted WLt is plotted on the y axis, which is obtained after accounting for the average effect of all other predictor variables.

Another strong interaction is observed for pbase and fdry (2500). While a rather weak effect of the fraction of arable land and grassland is observed for organic soils overlying basement rock and peat clay layer, a strong effect is observed for organic soils overlying unconsolidated rock. This interaction reflects the higher lateral range of drainage effects for organic soils with little flow resistance at the peat base. In these organic soils, intensive land use lowers the water level over large areas.

3.5 Discussion of model uncertainty

Plotting observed vs. predicted WLt from cross-validation (Fig. 9) illustrates the rather large residual variance that cannot be explained by the model. As indicated by the higher RMSEcv for the wet range (Table 3), scatter increases with increasing WLt. Error bars in the y direction indicate data error derived from the nugget of the variogram. It is shown for a few data points as an example. Due to transformation, data error increases for higher WLt. Figure 9 demonstrates that the fraction of unexplainable variance related to data error is much higher for the wet than for the dry range. Bootstrap error, indicating the variation of the model predictions for 1000 bootstrap samples, is shown in the x direction for the same data points. Bootstrap error is lower than the data error for the wet range and slightly higher for the dry range.

Figure 9. Observed vs. predicted transformed annual mean water level (WLt) from cross-validation results. Error bars show selected data and bootstrap model errors as standard deviation. Data points are scaled by their weights.

Bootstrap errors demonstrate the sensitivity of model predictions to changes of the data set used for calibration. When a model possesses structural deficits such as missing predictor variables, bootstrap errors should not be used to define confidence intervals for the model predictions. Figure 10 shows residuals from cross-validation and standard deviation of bootstrap predictions for all land cover classes. The residuals of each land cover class show near-normal distributions. For five of the nine land cover classes (wet forest, wet unused peatland, arable land, coniferous forest and reed), the Shapiro–Wilk test does not reject normality (p > 0.05). Figure 10a further indicates that residuals of each land cover class scatter fairly well around zero, indicating low bias for the various land cover classes. Land-cover-class-specific confidence intervals of model predictions can thus be derived from the RMSEcv of each land cover class, e.g., 2 × RMSEcv representing the 95 % confidence interval.
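As a minimal sketch of how such class-specific intervals could be derived, assuming cross-validation results are available as (land cover class, observed, predicted) triples: the function names and the numbers below are invented for illustration, and the 2 × RMSEcv rule presumes near-normal, unbiased residuals as stated above.

```python
from collections import defaultdict
from math import sqrt

def rmse_by_class(records):
    """records: iterable of (land_cover_class, observed, predicted)."""
    squared = defaultdict(list)
    for cls, obs, pred in records:
        squared[cls].append((obs - pred) ** 2)
    return {cls: sqrt(sum(v) / len(v)) for cls, v in squared.items()}

def interval_95(prediction, rmse_cv):
    """Approximate 95 % interval as prediction +/- 2 x RMSEcv."""
    return (prediction - 2 * rmse_cv, prediction + 2 * rmse_cv)

# Invented cross-validation results on the WLt scale (illustration only).
cv_results = [
    ("wet grassland", 0.1, -0.2),
    ("wet grassland", -0.3, 0.0),
    ("arable land", -0.9, -0.5),
    ("arable land", -0.7, -1.1),
]
rmse = rmse_by_class(cv_results)
low, high = interval_95(0.0, rmse["wet grassland"])
```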

Figure 10. (a) Residuals (observation − prediction) of WLt predictions and (b) standard deviation (sd) of bootstrap predictions, shown for the nine land cover classes. In the upper part, the number of dip wells in each class is indicated.

The prediction uncertainty derived from cross-validation is much higher than the bootstrap prediction uncertainty obtained from the bootstrap standard deviation (sd), with 2 × sd corresponding to the 95 % confidence interval (Fig. 10). The large difference between these values indicates that the model has structural deficits that can be attributed to several error sources.

i. Key influences on WLt are missing in the set of predictor variables. None of the predictor variables indicate whether and to what extent water levels have increased due to re-wetting measures in recent years. Wetness indicators (wet soil and/or vegetation attributes) that are obtained from the Digital Landscape Model probably react with a delay of several years. Thus, we expect the occurrence of several observed high WLt values that cannot be explained by any of the predictor variables.

ii. Small-scale topography that is not represented with sufficient detail and accuracy in the DEM may cause several predictions to strongly differ from what would be expected from the other predictor variables. A common example may be a dip well that is located on a narrow peat ridge, which remained after peat-cutting and is absent in the DEM, and that is situated in an area classified as wet soil by the Digital Landscape Model. Then, the model indicates a WLt that is much higher than the observed WLt, as for the observed value the reference surface was the surface of the peat ridge.

iii. Consistent information about tile drains is missing. It only exists on the regional scale (Tetzlaff et al., 2009); at the national scale, there are no maps of tile drains. Tile drains are known to have a strong effect on WLt for arable land and grassland. As explained above, we expect parameter dilendry (250) to partially compensate for this missing information.

iv. Another source of prediction uncertainty may be inconsistent and erroneous land cover classification in the Digital Landscape Model due to the high degree of subjectivity for many of the attributes. Furthermore, the Digital Landscape Model may be out of date by as much as 5 years, which can cause time series with land-use change to be split at the wrong date and vegetation and wetness attributes to be not yet updated to the current conditions.

v. The water balance of fens strongly depends on the size and the hydraulic head of the groundwater catchment, i.e., of the aquifer underlying the peat layer. Unfortunately, there is no consistent map of hydraulic heads or groundwater catchments for all of Germany.

Figure 11. Residuals (observation − prediction) of WLt predictions for the three major geographical peatland regions of Germany. In the upper part, the number of dip wells in each class is indicated.

We checked model predictions for geographical bias. Geographical location was not one of the model parameters. However, the history and policy of land use on organic soils, current ditch water management, and climate do show large-scale geographical trends. We divided our data set into the three major German peatland regions (NE, NW and S) and evaluated the model residuals (Fig. 11) to see whether our model is biased due to important missing geographical effects. A serious bias for any of the three major German peatland regions cannot be identified.


Figure 12. Map of predictions of transformed annual mean water level (WLt) for all German organic soils (a) and an enlarged map section (b). Probability distribution in (c) indicates the uncertainty of a specific point prediction for wet grassland as an example. Here, the predicted value is approximately WLt = 0, but note that wet grassland predictions do vary in space depending on the values of the other model parameters. The histogram shows the residuals from cross-validation for wet grassland, to which the probability distribution was fitted.

When applying calibrated statistical models during regionalization, it is important to check model behavior for extrapolation outside the range of the parameter space that is covered by the data upon which the model was built. BRT always extrapolates at a constant value from the most extreme environmental value in the training data. In contrast to other types of statistical models, e.g., generalized linear models, BRT does not continue the fitted trend beyond the last observation. Regarding the categorical variables, the data set covers all classes occurring in Germany with several peatlands. The data set also covers the major range of values occurring in Germany for the numerical predictor variables. Furthermore, Fig. 7 indicates that the constant values at which the model extrapolates the influence of the variables do not raise major concern for any extreme predictions outside the parameter range.
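The extrapolation behavior can be illustrated with a toy example. The sketch below fits a single small regression tree and a straight line to invented one-dimensional data; it is not a BRT implementation, but a boosted ensemble is a sum of such trees and therefore inherits the same piecewise-constant behavior outside the training range. All names and data are assumptions made for illustration.

```python
def fit_tree(xs, ys, depth=2):
    """Fit a tiny 1-D regression tree; returns a prediction function."""
    if depth == 0 or len(set(xs)) < 2:
        leaf = sum(ys) / len(ys)
        return lambda x: leaf
    best = None
    # Try every split between distinct x values; keep the lowest squared error.
    for split in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < split]
        right = [y for x, y in zip(xs, ys) if x >= split]
        sse = sum((y - sum(left) / len(left)) ** 2 for y in left)
        sse += sum((y - sum(right) / len(right)) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, split)
    split = best[1]
    lo_pairs = [(x, y) for x, y in zip(xs, ys) if x < split]
    hi_pairs = [(x, y) for x, y in zip(xs, ys) if x >= split]
    lo = fit_tree([p[0] for p in lo_pairs], [p[1] for p in lo_pairs], depth - 1)
    hi = fit_tree([p[0] for p in hi_pairs], [p[1] for p in hi_pairs], depth - 1)
    return lambda x: lo(x) if x < split else hi(x)

def fit_line(xs, ys):
    """Ordinary least-squares line; continues the fitted trend indefinitely."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope /= sum((x - mx) ** 2 for x in xs)
    return lambda x: my + slope * (x - mx)

# Invented training data with a linear trend between x = 0 and x = 7.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0.1 * x for x in xs]
tree = fit_tree(xs, ys)
line = fit_line(xs, ys)
```

Here `tree(100.0)` returns the same value as `tree(7.0)`, whereas `line(100.0)` continues the trend to roughly 10.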

3.6 Regionalization

The map of WLt resulting from the application of the fitted WLt model to all grid cells shows gradients at the regional scale (Fig. 12a). In the south of Germany, for example, a gradient from wet to dry can be observed for the pre-alpine upland bogs and the peatlands of the moraine plain. In the north of Germany, the map indicates that organic soils in the very NE are wetter than the rest. For the rest of the north, a slight gradient can be observed from less dry to dry from NW to E, which is mainly driven by the higher summer climatic water balance in the NW. As both categorical and numerical predictor variables also vary at the sub-regional scale, the resulting map also shows gradients within peatland areas, e.g., due to small-scale gradients in land use and ditch density and topography effects (Fig. 12b).

We calculated WLt averages of the land cover classes using the regionalized WLt from the map (Table 2, column 3). The given standard deviation comprises both the variability within a land cover class that is explained by the model as well as the uncertainty of each prediction. Resulting means and standard deviations slightly differ from the corresponding values of the data set. The land-cover-specific WLt values obtained from the map can be considered as being more representative, as the regionalization procedure is supposed to partly account for potential bias in the data set.

When applying this map and its predicted WLt values in subsequent GHG upscaling, it is crucial that model uncertainty is propagated properly. An example demonstrates the necessity of uncertainty propagation. For a grid cell classified as wet grassland, the probability distribution of WLt is shown based on a normal distribution that was fitted to the residuals of this land cover class (Fig. 12c). Without propagating the uncertainty, and when only translating the predicted WLt (possibly in combination with other parameters, e.g., soil properties) into a GHG budget, the GHG budget is strongly underestimated, as the WLt prediction is close to zero, indicating neither large CO2 nor CH4 emissions. When translating the full distribution of WLt into a GHG budget, the resulting GHG budget would be much higher, as the GHG budget increases on both sides of the predicted WLt.
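The effect can be reproduced with a small numerical sketch. It assumes a purely hypothetical V-shaped transfer function (GHG budget proportional to |WLt|, in arbitrary units) and an invented residual standard deviation of 0.5; neither is taken from the study. The expectation over the normal distribution is approximated with evenly spaced quantiles.

```python
from statistics import NormalDist

def ghg_budget(wlt):
    """Hypothetical V-shaped transfer function (illustration only)."""
    return abs(wlt)

def propagated_budget(predicted_wlt, residual_sd, n=1000):
    """Mean GHG budget over a normal distribution around the prediction."""
    dist = NormalDist(mu=predicted_wlt, sigma=residual_sd)
    # Midpoint quantiles give a deterministic approximation of the expectation.
    quantiles = [dist.inv_cdf((i + 0.5) / n) for i in range(n)]
    return sum(ghg_budget(q) for q in quantiles) / n

# Budget from the point prediction alone vs. from the full distribution.
point_estimate = ghg_budget(0.0)
propagated = propagated_budget(0.0, residual_sd=0.5)
```

With these assumed values the point prediction yields a budget of zero, while the propagated budget is clearly positive, which is the underestimation described above.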

3.7 Possible paths for model improvement

The model performance that is achieved by the statistical approach presented in our study raises the question whether collecting more WL data can improve model performance or whether the factor that is constraining the model performance is the limited strength of the nation-wide available predictor variables. To assess this question, additional “hold-out models” were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated and model performance was evaluated with NSEcv. As expected, results indicate an increase of model performance with increasing number of peatlands used in the model building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when including many more peatlands into the data set. More promising would be the specific collection of more data on the weakly represented and/or important land cover classes arable land and grassland.
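Two building blocks of such a hold-out experiment are sketched below under stated assumptions: the Nash–Sutcliffe efficiency in its standard form, and random selections of whole peatlands rather than individual dip wells. Function names, peatland labels and values are hypothetical, and the model fitting and cross-validation steps themselves are omitted.

```python
import random

def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squares around the observed mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst

def draw_peatland_subsets(peatland_ids, n_peatlands, n_draws, seed=0):
    """Draw `n_draws` random selections of `n_peatlands` distinct peatlands."""
    rng = random.Random(seed)
    unique_ids = sorted(set(peatland_ids))
    return [sorted(rng.sample(unique_ids, n_peatlands)) for _ in range(n_draws)]

# Invented example: eight dip wells belonging to five peatlands.
peatland_of_well = ["A", "A", "B", "C", "C", "C", "D", "E"]
subsets = draw_peatland_subsets(peatland_of_well, n_peatlands=3, n_draws=500)
# Dip wells that would enter the first hold-out model.
wells_in_first = [i for i, p in enumerate(peatland_of_well) if p in subsets[0]]

obs = [-0.6, -0.2, 0.1, 0.3]
perfect = nse(obs, obs)
mean_only = nse(obs, [sum(obs) / len(obs)] * len(obs))
```

Sampling by peatland keeps all dip wells of a selected peatland together, so the number of peatlands, not the number of wells, is the quantity varied along the learning curve.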

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nation-wide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that has been successfully conducted in Finke et al. (2004) using the transfer noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, such as spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± sd.

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data covers only a small fraction of peatlands of the final map and thus spatial interpolation is not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG-relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. With the analysis of their effect on the water level, it was possible to gain insight into natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions depend linearly. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions, and avoided model bias.

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show, however, that model improvement by increasing the number of WLt data seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively few data are available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, for example water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach is suited to application to any other country where similar data of target and predictor variables are available. It is important that the spatial resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if for specific regions detailed information on peat properties becomes available and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), Landkreis Gifhorn, NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support; Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support; Annette Freibauer for reviewing the paper draft; and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project “Organic soils” funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43(5), 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirby, M.: A physically based variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De'ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz – Moornutzungsstrategien 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.


Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package “dismo”, version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada's forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8(5), 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models, part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package “gbm”, version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.

Tetzlaff, B., Kuhr, P., and Wendland, F.: A New Method for Creating Maps of Artificially Drained Areas in Large River Basins Based on Aerial Photographs and Geodata, Irrig. Drain., 58, 569–585, doi:10.1002/Ird.426, 2009.

Thompson, J. R., Gavin, H., Refsgaard, A., Sorenson, H. R., and Gowing, D. J.: Modelling the hydrological impacts of climate change on UK lowland wet grassland, Wetl. Ecol. Manage., 17, 503–523, doi:10.1007/s11273-008-9127-1, 2009.

UBA: National Inventory Report for the German Greenhouse Gas Inventory 1990–2008, Submission under the United Nations Framework Convention on Climate Change and the Kyoto Protocol 2012, Dessau, Germany, 2012.

van den Akker, J. J. H., Jansen, P. C., Hendriks, R. F. A., Hoving, I., and Pleijter, M.: Submerged infiltration to halve subsidence and GHG emissions of agricultural peat soils, Proceedings of the 14th International Peat Congress, Stockholm, Sweden, 2012.

van der Gaast, J. W. J., Massop, H. T. L., and Vroon, H. R. J.: Actuele grondwaterstandsituatie in natuurgebieden: Een Pilotstudie, Wettelijke Onderzoekstaken Natuur & Milieu, WOt-rapport 94, Wageningen, 134 pp., 2009.


van der Ploeg, M. J., Appels, W. M., Cirkel, D. G., Oosterwoud, M. R., Witte, J. P. M., and van der Zee, S. E. A. T. M.: Microtopography as a Driving Mechanism for Ecohydrological Processes in Shallow Groundwater Systems, Vadose Zone J., 11, 52–62, doi:10.2136/Vzj2011.0098, 2012.

Wackernagel, H.: Multivariate Geostatistics, Springer, Berlin, Germany, 387 pp., 2003.


2.2.2 Drainage network

Locations of ditches that are included as lines in the Digital Landscape Model were used to obtain information about the drainage network. The total ditch length was calculated for various buffer sizes. Further, the distance to the next ditch was calculated for each dip well. A short distance to the next ditch may indicate either lower or higher water levels, depending on whether the ditches are used for drainage or already blocked and used for rewetting measures. Similarly, the indication of the total length of ditches is not unique. Therefore, we defined two different sets of ditch variables: a first set for which we calculated values for all land cover classes, and a second one for which we only calculated values for land cover classes for which ditches are undoubtedly used for drainage, i.e., arable land and grassland.
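One of the ditch variables, the distance from a dip well to the next ditch, can be sketched as a point-to-polyline distance. This is an illustrative assumption about the geometry only, with ditches given as lists of (x, y) vertices in a projected coordinate system in metres; the function names and coordinates are invented, and the study itself derived these variables with GIS tools.

```python
from math import hypot

def point_segment_distance(px, py, ax, ay, bx, by):
    """Shortest distance from point P to the segment A-B."""
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return hypot(px - ax, py - ay)
    # Projection of P onto the line through A and B, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_to_next_ditch(well, ditches):
    """Minimum distance from a dip well to any segment of any ditch polyline."""
    return min(
        point_segment_distance(*well, *a, *b)
        for ditch in ditches
        for a, b in zip(ditch, ditch[1:])
    )

# Invented coordinates in metres: one east-west and one north-south ditch.
well = (0.0, 10.0)
ditches = [
    [(-50.0, 0.0), (50.0, 0.0)],
    [(30.0, -100.0), (30.0, 0.0), (30.0, 100.0)],
]
nearest = distance_to_next_ditch(well, ditches)
```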

2.2.3 Peatland characteristics

The geological map of Germany (scale 1 : 200 000) defined the area for which WL predictions were modeled. It is also the basis for topological peatland predictor variables, i.e., the fraction of organic soils in different buffer sizes as well as the dip well distance to the edge of the peatland. Information about the peatland type and the substrate at the peat base is presented in more detail in a newly compiled raster map of organic soils (Roßkopf et al., 2014) and was thus extracted from this map. Peatland types were aggregated into five classes: lowland bog (North German Plains and Alpine Forelands), upland bog (Central Uplands and Alps), fen neighboring surface water, fen without neighboring surface water, and a class of “other organic soils” that do not fulfill the C content and thickness criteria to be classified as peatland. Substrates at the peat base included loose unconsolidated rock (alluvial sand and gravel deposits), consolidated rock (bedrock), and peat clay layer. The first type may indicate the occurrence of seepage (positive or negative), whereas the latter two types may rather indicate a hydraulic decoupling from the aquifer hydraulic head.

2.2.4 Climatic boundary conditions

Climatic boundary conditions directly influence water level. On the one hand, the typical long-term climatic boundary conditions may indicate the general vulnerability of peatlands in a specific region. On the other hand, given the different lengths of measurement periods of the time series in this study, climatic boundary condition predictor variables may account for the effect of a climatically wetter or drier measurement period, compared to the long-term averages, on the water level. Climatic boundary conditions were extracted from a 1 × 1 km raster from the German Weather Service. Annual, summer and winter precipitation, FAO56 Penman–Monteith reference evapotranspiration, and climatic water balance (difference between precipitation and reference evapotranspiration) were determined for the individual measurement period of each dip well and as long-term averages (30 years).

2.2.5 Relative altitude

Relative altitude was calculated by subtracting the median altitude of various buffer sizes from the absolute altitude at each dip well in the digital elevation model (DEM). Relative altitude is expected to have two different indications depending on the applied buffer size: (i) in many peatlands, the former smooth peatland relief at the scale of approximately > 5 m has been disturbed due to peat cutting and differences in drainage and mineralization rate. As a consequence, the rather smooth phreatic surface often does not follow the uneven and patchy terrain. Relative altitude with respect to smaller buffer sizes (< 250 m) may therefore explain part of the WL variation, e.g., a dip well that is located at a surface much higher than the surroundings may indicate deeper water levels; (ii) for large buffer sizes (> 250 m), relative altitude indicates whether the peatland lies in a larger morphological depression or elevation, and thus may indicate whether large-scale lateral inflow of water can be expected or not. Similar indication is provided by the topographic index (see below). The accuracy of relative altitude values depends on the resolution and accuracy of the DEM. The nation-wide available DEM is based on data sets of varying quality, which may lower the influence of this variable.
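The relative altitude calculation described above can be illustrated with a minimal sketch. It assumes the DEM is available as a small in-memory grid and that the buffer is a circle measured in raster cells; the function name `relative_altitude` and the toy grid are hypothetical and not taken from the study.

```python
from statistics import median


def relative_altitude(dem, row, col, radius_cells):
    """Altitude at (row, col) minus the median altitude inside a circular buffer.

    dem is a list of rows of altitudes (m); radius_cells is the buffer radius
    expressed in raster cells (an assumption made for this sketch).
    """
    buffer_values = []
    for r in range(max(0, row - radius_cells), min(len(dem), row + radius_cells + 1)):
        for c in range(max(0, col - radius_cells), min(len(dem[0]), col + radius_cells + 1)):
            # keep only cells whose centre lies inside the circular buffer
            if (r - row) ** 2 + (c - col) ** 2 <= radius_cells ** 2:
                buffer_values.append(dem[r][c])
    return dem[row][col] - median(buffer_values)


# Toy DEM: a dip well on a small remnant ridge, 0.6 m above its surroundings.
dem = [
    [10.0, 10.0, 10.0],
    [10.0, 10.6, 10.0],
    [10.0, 10.0, 10.0],
]
rel = relative_altitude(dem, 1, 1, 1)  # positive: well sits above its buffer median
```

A positive value marks a dip well on locally raised ground, which for small buffers is the situation the text associates with deeper water levels.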

2.2.6 Topographic wetness index

The topographic wetness index is a common wetness indicator used in hydrology (Beven and Kirkby, 1979). It is a combined measure of catchment area and slope at a given point and indicates the extent of flow accumulation. High values indicate wetter conditions. If calculated at larger scales, higher values may hint at the occurrence of positive seepage, i.e., upward flow of water from the aquifer. Topographic wetness index was calculated for various DEM resolutions using the GRASS 7 module r.watershed.
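As a hedged illustration of how catchment area and slope combine, the sketch below uses the standard definition of the index, ln(a / tan β), with a the specific catchment area and β the local slope. The lower slope bound `min_slope` is an assumption added here to avoid division by zero on flat peat surfaces; it is not described in the text.

```python
import math


def topographic_wetness_index(specific_catchment_area, slope_rad, min_slope=1e-3):
    """Standard topographic wetness index ln(a / tan(beta)).

    specific_catchment_area: upslope area per unit contour width (m)
    slope_rad: local slope in radians
    min_slope: hypothetical lower bound on the slope (illustrative only)
    """
    beta = max(slope_rad, min_slope)
    return math.log(specific_catchment_area / math.tan(beta))


# Same contributing area, gentle versus steep slope.
twi_gentle = topographic_wetness_index(100.0, math.atan(0.01))
twi_steep = topographic_wetness_index(100.0, math.atan(0.10))
```

The gentle slope yields the larger index, in line with the statement that high values indicate wetter conditions.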

2.2.7 Protection status

The protection status of a peatland area may reflect hydrological conditions. Therefore, we checked for seven protection statuses at each dip well (see Table 1 for details).

2.3 Model building scheme

Model building was performed using boosted regression trees (BRT) implemented in the two R packages “gbm” (Ridgeway, 2013) and “dismo” (Hijmans, 2013). BRT is a machine-learning algorithm in which the final model is derived from the data. Functions that relate target to predictor variables are not predetermined but freely developed. BRT is based on the decision (or regression) tree concept. In the decision tree concept, the parameter space is searched sequentially for the best split that results in the lowest model mean squared error. The mean responses of the groups that result from the various splits and correspond to certain parameter ranges represent the model. The common procedure is the growth of a large tree, which is subsequently simplified by dropping weak links that are identified with cross-validation. Growing only a single tree has several disadvantages, such as uneven functions that are very sensitive to the specific sample of the data. Therefore, ensemble techniques have been combined with the decision tree concept. These were first the development of multiple models by bootstrapping of the samples (bagging technique) and the random creation of subsets of predictors at each split (random forest technique). Later, with the “boosting” technique of BRT, a sequential procedure was developed in which data is re-weighted after each tree to increase emphasis on data that is poorly modeled by the existing collection of trees (Elith et al., 2008).
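The split search at the core of the decision tree concept can be sketched for a single predictor. This is a minimal illustration of finding the one threshold that minimizes the summed squared error of the two resulting groups, not the gbm implementation; the function names and the toy data (distance to the next ditch versus water level) are hypothetical.

```python
def _sse(values):
    """Sum of squared deviations from the group mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(x, y):
    """Return (threshold, sse) of the single split on x with the lowest squared error."""
    pairs = sorted(zip(x, y))
    best_threshold, best_sse = None, float("inf")
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold can separate identical predictor values
        left = [p[1] for p in pairs[:k]]
        right = [p[1] for p in pairs[k:]]
        sse = _sse(left) + _sse(right)
        if sse < best_sse:
            # place the threshold halfway between neighbouring predictor values
            best_threshold = (pairs[k - 1][0] + pairs[k][0]) / 2
            best_sse = sse
    return best_threshold, best_sse


# Invented example: wells close to ditches are drier than wells far away.
ditch_distance_m = [5, 10, 20, 200, 300, 400]
wl_m = [-0.70, -0.60, -0.65, -0.20, -0.25, -0.15]
threshold, sse = best_split(ditch_distance_m, wl_m)
```

In a full tree, the same search is repeated over all predictors and then recursively within each resulting group; the group means form the prediction.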

BRT modeling is increasingly applied in spatial modeling of species or numerical environmental variables (Elith et al., 2008; Martin et al., 2011), thereby often showing superior performance compared to other machine-learning algorithms. The increasing application of BRT is related to several of its favorable characteristics: the strength of this method lies in the ability to fit complex functional dependencies, including non-linear relationships and interactions between predictor variables. Based on its flexibility, BRT is invariant to monotonic transformations of predictors. Furthermore, BRT allows for missing values in the predictor variables; thus, predictor variable information does not necessarily need to fully cover the total map extent. The gbm package handles missing values in predictor variables by introducing surrogate splits. The mean target value belonging to the missing predictor values is attributed to these surrogate splits during model building. We observed that the contribution of a predictor variable to the final model decreases with an increasing number of missing values. This is intuitive, as target observations of missing predictor values are mostly supposed to scatter strongly. BRT is further fairly insensitive to outliers and allows estimating the relative contribution of each predictor variable to the model. Due to these characteristics, we expected BRT to be very well suited to the very heterogeneous data set of this study.

BRT model calibration is prone to overfitting, and there are various options to reduce this behavior. Due to the overfitting behavior, cross validation is generally part of the model building process. However, cross validation can be performed in several ways and, if performed carelessly, can lead to overly optimistic model performance (De'ath, 2007). Here, cross validation was performed by leaving out whole peatland areas instead of a random set of dip wells. This represents a stricter cross validation, and we noticed that it strongly reduced overfitting of the water level data and thus contributed to the development of a more robust model.
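Leaving out whole peatlands rather than random dip wells amounts to building cross-validation folds at the group level. The sketch below shows one way to do this; the function name `grouped_folds`, the number of folds and the random assignment of peatlands to folds are assumptions for illustration, as the text does not specify them.

```python
import random


def grouped_folds(group_ids, n_folds, seed=0):
    """Assign whole groups (peatlands) to folds; return dip-well indices per fold."""
    groups = sorted(set(group_ids))
    rng = random.Random(seed)
    rng.shuffle(groups)
    # every peatland is assigned to exactly one fold
    fold_of_group = {g: k % n_folds for k, g in enumerate(groups)}
    folds = [[] for _ in range(n_folds)]
    for index, g in enumerate(group_ids):
        folds[fold_of_group[g]].append(index)
    return folds


# Seven dip wells in four hypothetical peatlands A to D.
peatland_of_well = ["A", "A", "B", "C", "C", "C", "D"]
folds = grouped_folds(peatland_of_well, n_folds=2)
for held_out in folds:
    training = [i for i in range(len(peatland_of_well)) if i not in held_out]
    # a model would be fitted on `training` and evaluated on `held_out` here
```

Because no peatland contributes wells to both the training and the held-out set, spatially autocorrelated neighbors cannot inflate the cross-validated performance.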

Figure 2. Illustration of the predictor variables determined for each dip well based on available national maps (see Table 1).

Another option to avoid overfitting is to impose monotonic slopes on the effects of individual parameters, which can even lead to improved prediction performance (De'ath, 2007). For all our numerical variables, we expected monotonic slopes rather than optimum functions. To avoid predefining any expected direction, all numerical variables were added twice to the set of predictors, constraining the slope to a monotonic increase and decrease. We let the model decide whether monotonic increase or decrease has higher predictive power.

Models were calibrated using a Gaussian response type, aimed at minimizing deviance (squared error) (Ridgeway, 2013). In all calibration runs, we applied the gbm.step function of the dismo package, which assesses the optimal number of boosting trees using cross validation. We tested various learning rates (0.001–0.01), bag fractions (0.1–0.8) and levels of tree complexity (3 to 7), i.e., the number of nodes in a tree. By trial and error, we determined the most effective algorithm parameters for our data set, being 0.005 for the learning rate, 0.6 for the bag fraction and 5 for the tree complexity.

The final BRT model building is commonly performed as a two-step procedure (Elith et al., 2008), which we basically also followed in our study.

i. In the first step, the whole set of predictor variables is used to calibrate a BRT model.

ii. In a second step, the number of parameters is reduced sequentially to avoid overfitting and to derive a more parsimonious model. We tracked predictive performance criteria during the simplification process. As various variables were calculated for different buffer sizes, our predictors included a large number of correlated variables. Correlation coefficients between predictor variables of > 0.7 are known to severely distort model estimation and subsequent prediction (Dormann et al., 2013). Thus, we performed this simplification process by first dropping those parameters with a correlation > 0.7 (either Pearson or Spearman type) to another parameter with a higher contribution (Clapcott et al., 2011). This ensured that two highly correlated parameters would not remain in the parameter set longer than the last parameter of another group of variables, which may contribute less compared to the two highly correlated parameters but provides extra information that is not covered by the other parameters. After all highly correlated parameters have been dropped, further parameters with low contribution were dropped progressively.

Table 1. Overview on predictor variables.

| Predictor variable | Variable name | Values | Point, buffers (m) | Data source |
| --- | --- | --- | --- | --- |
| Land-use type | | Arable, grassland, forest, shrubs, peat-mining, unused peatland, swamp, open water | Point, 100, 500, 1000, 2500 | Digital Landscape Model¹ |
| Vegetation attributes (optional) | | Deciduous forest, mixed forest, coniferous forest, reed, shrubs, grass | Point | Digital Landscape Model¹ |
| “Wet soil observed” | | Yes/no | Point | Digital Landscape Model¹ |
| Combined land cover information (land-use type, veg. and wet soil attr.) | lc | Arable, grassland, wet grassland, deciduous including mixed forest, wet forest, coniferous forest, reed, unused peatland, wet unused peatland | Point, 100, 500, 1000, 2500 | Digital Landscape Model¹ |
| Dry land cover fraction | f_dry(X) | Arable, 0.5 × grassland on organic soil area; 0 to 1 | 100, 500, 1000, 2500 | Digital Landscape Model¹ |
| Wet land cover fraction | | Reed, wet grassland, wet forest and wet unused peatland on organic soil area; 0 to 1 | 100, 500, 1000, 2500 | Digital Landscape Model¹ |
| Total length of ditches for all lc and only for arable and grassland (subscr. “dry”) | di_len,dry(X) | ≥ 0 m | Point, 50, 250, 1000, 2500 | Digital Landscape Model¹ |
| Distance to next ditch | | ≥ 0 m | Point | Digital Landscape Model¹ |
| Peatland type | p_type | Lowland bog, upland bog, fen neighboring surface water, fen without neighboring surface water, other “low-C” organic soil | Point | Map of organic soils² |
| Material at peat base | p_base | Unconsolidated rock, peat clay layer, rock, no information | Point | Map of organic soils² |
| Peatland fraction | f_peat(X) | 0 to 1 | Point, 500, 1000, 2500 | Geological map (BGR)³ |
| Distance to edge of peatland | | > 0 m | | Geological map (BGR)³ |
| Ratio of d_peat/f_peat | | > 0 | 2500 | Geological map (BGR)³ |
| Precipitation | | ≥ 0 mm | Point | Raster map 1 × 1 km (DWD)⁴ |
| Evapotranspiration | | ≥ 0 mm | Point | Raster map 1 × 1 km (DWD)⁴ |
| Climatic water balance | wb_summer | < 0 and ≥ 0 mm | Point | Raster map 1 × 1 km (DWD)⁴ |
| Relative height | h_rel(X) | < 0 and ≥ 0 m | Point – median: 25, 50, 100, 250, 500, 1000 | Digital Elevation Model⁵ |
| Topographic index | ti_rasR(X) | > 0 | Point and 1000 buffer for 10, 25, 250, 1000 raster values | Digital Elevation Model⁵ |
| Protection status | | Nature conservation area, special areas of conservation, special protection area for wild birds, UNESCO biosphere reserve, nature park, national park, landscape protection area | Point | Maps of protected areas⁶ |

¹ ATKIS Basis DLM, Federal Agency for Cartography and Geodesy (BKG); ² map of organic soils (Roßkopf et al., 2014, Humboldt University of Berlin); ³ geological map 1 : 200 000 (GUEK200, BGR – Federal Institute for Geosciences and Natural Resources); ⁴ raster map 1 × 1 km of weather data (German Weather Service); ⁵ BKG; ⁶ Federal Agency for Nature Conservation (BfN). Variable names are indicated for the nine variables in the final model, with (X) indicating buffer size and R indicating raster resolution.

Figure 3. Illustration of the annual mean water level (WL) transformation. (a) Hypothetical transfer function relating GHG budget to WL (m). (b) GHG budget vs. the transformed water level (WLt). (c) WLt vs. WL. The lines along the x axes indicate the data quantiles of the analyzed data set.
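The correlation-based part of the simplification in step (ii) can be sketched as a greedy filter. This is a simplified illustration under stated assumptions: it uses only the Pearson coefficient (the study considered Pearson or Spearman), takes predictor contributions as given rather than re-estimating them after each drop, and all variable names and values below are invented.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def drop_correlated(predictors, contribution, threshold=0.7):
    """Keep predictors in order of decreasing contribution; skip any predictor
    whose |r| with an already kept (higher-contribution) predictor exceeds threshold."""
    kept = []
    for name in sorted(predictors, key=lambda n: -contribution[n]):
        if all(abs(pearson(predictors[name], predictors[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical predictors: one variable at two buffer sizes plus an unrelated one.
predictors = {
    "f_dry_500": [0.1, 0.4, 0.5, 0.9, 1.0],
    "f_dry_1000": [0.2, 0.4, 0.6, 0.8, 1.0],
    "h_rel_250": [0.3, -0.2, 0.1, -0.1, 0.0],
}
contribution = {"f_dry_1000": 30.0, "f_dry_500": 20.0, "h_rel_250": 10.0}
kept = drop_correlated(predictors, contribution)
```

Of the two buffer sizes of the same variable, only the one with the higher contribution survives, while the weaker but uncorrelated predictor is retained because it carries separate information.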

Predictor contributions are calculated as proportional contributions to the total error reduction and can be considered as a measure for the influence of the individual predictors. Additionally, a BRT model allows the derivation of partial dependence plots, which indicate how the response is affected by a certain predictor after accounting for the average effects of all other predictors in the model (Elith et al., 2008). These plots do not show the full effect of each parameter on the model response, due to interactions with other parameters that are fixed to derive these plots as well as due to parameter co-correlation. However, they can be used for interpreting model behavior (Elith et al., 2008).

2.3.1 WLt transformation of WL

The map of water levels of this study was developed to improve the upscaling of greenhouse gas emissions from organic soils. Therefore, the final map should provide the highest accuracy for the water level range for which the highest differences of greenhouse gas emissions occur. This can be achieved by transforming WL into a transformed variable, WLt, which shows a linear relationship with GHG emissions. The sensitivity of greenhouse gas emissions to water level has been analyzed in several laboratory and field experimental and monitoring studies (Berglund and Berglund, 2011; Drösler et al., 2011; Hahn-Schöfl et al., 2011; Leiber-Sauheitl et al., 2014; Moore and Roulet, 1993; Moore and Dalva, 1993; van den Akker et al., 2012). General trends are a strong increase of methane (CH4) emissions for annual mean water levels of approximately > −0.1 m and an increase of CO2 emissions for water levels < −0.1 m, with a trend similar to a saturation function that levels out approximately between −0.4 and −0.8 m (Fig. 3a). While studies agree over these general trends, the exact shape of the transfer function and the maximum levels of emissions, as well as their dependence on soil properties and other environmental parameters, are still controversial. Here, we assume a hypothetical transfer function relating the normalized GHG budget, ranging from 0 to 1, to the water level (see also Fig. 3):

$$
\mathrm{GHG\ Balance} =
\begin{cases}
-e^{3(\mathrm{WL}+0.1)} + 1, & \mathrm{WL} \le -0.1 \\
1 - e^{-3(\mathrm{WL}+0.1)}, & \mathrm{WL} > -0.1
\end{cases}
\tag{1}
$$

As the GHG budget can be positive for both low and high WL, we introduced the transformed water level WLt as (Fig. 3)

$$
\mathrm{WL_t} =
\begin{cases}
e^{3(\mathrm{WL}+0.1)} - 1, & \mathrm{WL} \le -0.1 \\
1 - e^{-3(\mathrm{WL}+0.1)}, & \mathrm{WL} > -0.1
\end{cases}
\tag{2}
$$
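Equations (1) and (2) can be evaluated directly. The following sketch implements both as plain functions (the names `ghg_balance` and `wl_to_wlt` are chosen here for illustration) and evaluates them for a few water levels to show that WLt is monotonic in WL and that its absolute value equals the normalized GHG budget.

```python
import math


def ghg_balance(wl):
    """Hypothetical normalized GHG budget (0 to 1) as a function of WL in m, Eq. (1)."""
    if wl <= -0.1:
        return -math.exp(3.0 * (wl + 0.1)) + 1.0
    return 1.0 - math.exp(-3.0 * (wl + 0.1))


def wl_to_wlt(wl):
    """Transformed water level WLt (-1 to 1) as a function of WL in m, Eq. (2)."""
    if wl <= -0.1:
        return math.exp(3.0 * (wl + 0.1)) - 1.0
    return 1.0 - math.exp(-3.0 * (wl + 0.1))


# Water levels relative to ground surface (m); negative means below ground.
levels = [-1.0, -0.69, -0.4, -0.1, 0.0, 0.2]
table = [(wl, wl_to_wlt(wl), ghg_balance(wl)) for wl in levels]
```

Both branches meet at WL = −0.1 m, where WLt and the GHG budget are zero; drier sites map to negative WLt and wetter sites to positive WLt.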

By calibrating the model to both WL and WLt, we test if the optimization of WLt provides the highest model accuracy for the water level range relevant for GHG emissions, and if it optimizes the map for application to GHG upscaling.

2.3.2 Weighting scheme

When considering possible data weighting schemes, it is worth emphasizing at this point that the goal of this study is the development of a statistical model that can explain both the water level variability within a peatland as well as among different peatlands. The data of target and predictor variables for building this model is highly heterogeneous. First, the target variable data set contains peatland areas that strongly differ in their spatial extent and in the number of installed dip wells. Second, the predictor variable data set contains categorical and numerical data, and part of the predictor variables predominantly vary from peatland to peatland (e.g., climatic boundary conditions, large-scale topographic wetness index, peatland characteristics), whereas others also show within-peatland variability (e.g., land use, small-scale topographic wetness index, drainage network). As the influence of the individual predictor variables on our target WLt is expected to be rather diffuse due to abundant interactions with other site characteristics, the robustness of derived dependencies will strongly depend on the number of different peatlands in the data set.

There are no universal data weighting rules for similarly heterogeneous data situations, and some degree of expert judgment and subjectivity is inevitably involved when developing an appropriate scheme (Francis, 2011). The need for introducing a data weighting scheme is obvious, as without data weighting during calibration, too much influence would be given to small and well-studied peatlands, which will reduce predictive model performance for large, less-well-studied peatland areas. To avoid this in a simple manner, weight could be reduced by the number of dip wells in each peatland, which results in each peatland being equally weighted. This scheme, however, does not sufficiently use the high information content provided by well-studied large peatlands, which should have a higher impact on model calibration than a small peatland with only few dip wells.

Here, we propose a new weighting scheme that takes into account both factors, peatland size and local density of dip wells, to derive dip-well-specific weighting factors. It is based on principles of data uncertainty reduction by repeated measurements and of geostatistics. First, we consider our data situation as an analogue of meta-analysis with grouped data. It has been shown for homogeneous problems (all data from same population) that the optimal group weight for meta-analysis is 1/SE² (Hedges and Olkin, 1985), with SE being the standard error of each group:

$$
\mathrm{SE} = \frac{\sigma_e}{\sqrt{N}} \tag{3}
$$

where σe is the error standard deviation of a measurement and N is the number of measurements in a group. For homogeneous problems and uniform σe, this results in weights that are linearly dependent on N, which we here call the first end member of weighting. Heterogeneity (within-group variance) reduces the variation of the group weights, which can be shown by random effect models (Cumming, 2012). As the second end member of weighting, when heterogeneity totally dominates within-group variance, optimal group weights are uniform for all groups, i.e., weights are independent of N. We are not aware of a method that allows the estimation of the degree of heterogeneity for the complex target and predictor data situation in this study, including data (spatial and temporal variability, measurement error) and model errors (missing parameters). As a trade-off between 1/SE² (homogeneous end member) and 1 (heterogeneous end member), we decided on a group weight that is the inverse of the standard error, 1/SE, which is, for example, often used in econometric studies (Dickens, 1990). We emphasize that this is a subjective decision.

Figure 4. Sample semivariogram and fitted semivariogram model of the annual mean water level data WL.

The group weight 1/SE is the basis for the geostatistical part of our weighting scheme. There are two reasons why we cannot directly treat our peatlands as groups. First, there is within-peatland variability, which is related to changing site characteristics. It is one objective of our study to describe this variability by statistical modeling. Thus, dip wells must be treated individually, and data cannot be aggregated at the peatland level. Second, we expect the model to learn more when the same number of dip wells is installed in a larger peatland. In a small peatland, spatial autocorrelation between dip wells is higher, i.e., the information content is lower than for large peatlands. As a consequence of the first point, we do not aggregate and keep all dip wells in the target variable data set by attributing to each dip well the fraction 1/N of its group weight, so that the relative weights of the groups remain constant. As a consequence of the second point, we use principles of geostatistics in our weighting scheme. We replace the group size N (positive integer number) by the “statistical” group size n (positive continuous number being > 1), which we derive from the spatial autocorrelation among the dip wells.

Therefore, we analyze the spatial autocorrelation structure of the data set. A single spherical variogram model was fitted to the sample variogram of all data (Fig. 4 in Sect. 3.1). Variogram models allow the differentiation of the total data variance (called “sill”) into a spatially uncorrelated variance (called “nugget”) and a spatially correlated variance (called “structural variance” and defined as sill − nugget) (Wackernagel, 2003). The variogram model allows for derivation, for any distance between two locations, of the average squared difference of values, here defined as γ. By definition, at distance 0, the average squared difference equals the nugget, and at distances greater than that called the “range” of spatial autocorrelation, the average squared difference equals the sill. Accordingly, the autocorrelated fraction f of the average squared difference between two dip wells i and j is

$$
f_{ij} = \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}} \tag{4}
$$

We now define the “statistical” group size n of each dip well i to be the sum of one plus the autocorrelated fractions fij of all dip wells that are within the range of spatial autocorrelation of i:

$$
n_i = 1 + \sum_{j=1}^{m} \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}} \tag{5}
$$

According to the discussion above, dip-well-specific weights can then be calculated with

$$
w_i = \frac{1}{n_i \, \mathrm{SE}_i} = \frac{1}{\sigma_{e,i} \sqrt{n_i}} \tag{6}
$$

where ni is derived from Eq. (5). The equation shows that with increasing “statistical” group size n, i.e., with increasing spatial data density, the weight of an individual dip well is “down-weighted” to some degree, a behavior that corresponds to our initial intention to lower the influence of small peatlands compared to large ones. The error standard deviation σe is dependent on several factors, e.g., the length of the time series, the temporal measurement density and the microtopography around the dip well. For simplicity, we here assumed σe to be uniform for all dip wells, which simplifies Eq. (6) to wi = 1/√ni.
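Equations (4) to (6) can be combined into a short sketch that computes wi = 1/√ni for a handful of dip wells. Several assumptions apply: the spherical variogram is written in its usual textbook form, the default nugget, sill and range are the fitted values reported in Sect. 3.1, neighbors are restricted to the same land-use type as described in the surrounding text, and the final rescaling of weights to national land-use fractions is left out. Function names and coordinates are hypothetical.

```python
import math


def spherical_gamma(h, nugget, sill, rng):
    """Spherical variogram: nugget at h = 0, sill at and beyond the range."""
    if h >= rng:
        return sill
    s = h / rng
    return nugget + (sill - nugget) * (1.5 * s - 0.5 * s ** 3)


def dip_well_weights(coords, land_use, nugget=0.012, sill=0.09, rng=2700.0):
    """Weights w_i = 1 / sqrt(n_i), with n_i from Eq. (5) and uniform sigma_e."""
    weights = []
    for i, (xi, yi) in enumerate(coords):
        n_i = 1.0
        for j, (xj, yj) in enumerate(coords):
            if i == j or land_use[i] != land_use[j]:
                continue  # only other wells with the same land-use type count
            h = math.hypot(xi - xj, yi - yj)
            if h < rng:
                gamma = spherical_gamma(h, nugget, sill, rng)
                n_i += (sill - gamma) / (sill - nugget)  # autocorrelated fraction, Eq. (4)
        weights.append(1.0 / math.sqrt(n_i))
    return weights


# Two co-located grassland wells, one distant grassland well, one nearby arable well.
coords = [(0.0, 0.0), (0.0, 0.0), (10000.0, 0.0), (100.0, 0.0)]
land_use = ["grassland", "grassland", "grassland", "arable"]
weights = dip_well_weights(coords, land_use)
```

The two co-located wells share their information and each receive a weight of 1/√2, whereas the isolated well and the well with a different land-use type keep a weight of 1.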

Only dip wells with the same land-use type were summed up with Eq. (5), which avoids the down-weighting by dip wells that have different land-use types. The latter are mostly characterized by fairly different WLt and thus by rather low spatial autocorrelation to dip well i.

After spatial correlation has been accounted for, the sum of the weights of all dip wells of each land-use type was adjusted so that it corresponds to the fraction of this land-use type in Germany. This adjustment accounts for the overrepresentation in the data set of dip wells in unused peatlands and the underrepresentation of dip wells in arable land.

2.3.3 Model performance criteria

Model fit and predictive performance after cross-validation were quantified by the weighted root mean square error:

$$
\mathrm{RMSE} = \sqrt{\frac{1}{\sum_{i=1}^{m} w_i} \sum_{i=1}^{m} \left( w_i \left( x_{oi} - x_{si} \right)^2 \right)} \tag{7}
$$

where m is the number of dip wells, xoi is observed WL or WLt of dip well i, xsi is simulated WL or WLt of dip well i, and wi is the data weight of dip well i (see Sect. 2.3.2). We refer to the root mean square error of the predicted data of cross validation as RMSEcv. Model performance was further quantified by Nash–Sutcliffe efficiency (NSE):

$$
\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{m} w_i \left( x_{oi} - x_{si} \right)^2}{\sum_{i=1}^{m} w_i \left( x_{oi} - \bar{x}_o \right)^2} \tag{8}
$$

where x̄o is the mean of all observed WL or WLt. NSE indicates how well observed vs. predicted values match the 1 : 1 line. NSE is a good overall indicator of predictive performance because it combines scatter and bias (common offset and/or slope difference from 1 : 1 line) (Nash and Sutcliffe, 1970). Values greater than 0 signify a model that is better than the reference model based on the data mean. We refer to the NSE of the training data as NSEcal and of the predicted data of cross validation as NSEcv.

Systematic errors were quantified by calculating the model bias, here defined as

$$
\mathrm{BIAS} = \sum_{i=1}^{m} \left( w_i x_{oi} - w_i x_{si} \right) \tag{9}
$$
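The three criteria of Eqs. (7) to (9) translate into a few lines of code. In this sketch, the observed mean in the NSE is taken as the plain arithmetic mean, which is an assumption: the text states only "the mean of all observed" values. Function names and example numbers are illustrative.

```python
import math


def weighted_rmse(obs, sim, w):
    """Weighted root mean square error, Eq. (7)."""
    return math.sqrt(sum(wi * (o - s) ** 2 for o, s, wi in zip(obs, sim, w)) / sum(w))


def weighted_nse(obs, sim, w):
    """Weighted Nash-Sutcliffe efficiency, Eq. (8); unweighted observed mean assumed."""
    mean_obs = sum(obs) / len(obs)
    numerator = sum(wi * (o - s) ** 2 for o, s, wi in zip(obs, sim, w))
    denominator = sum(wi * (o - mean_obs) ** 2 for o, wi in zip(obs, w))
    return 1.0 - numerator / denominator


def weighted_bias(obs, sim, w):
    """Model bias as defined in Eq. (9)."""
    return sum(wi * o - wi * s for o, s, wi in zip(obs, sim, w))


# Invented WLt values for three dip wells with equal weights.
observed = [-0.6, -0.4, -0.2]
simulated = [-0.5, -0.4, -0.3]
weights = [1.0, 1.0, 1.0]
rmse = weighted_rmse(observed, simulated, weights)
nse = weighted_nse(observed, simulated, weights)
bias = weighted_bias(observed, simulated, weights)
```

A model that always predicts the observed mean scores an NSE of 0, which is the reference against which values greater than 0 are judged.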

2.4 Model uncertainty and stability evaluation

Uncertainty of the model predictions was assessed by bootstrapping, cross-validation and residual analysis.

For the bootstrapping analysis, we followed the procedure of Leathwick et al. (2006). We estimated the confidence intervals around the predictions and the fitted functions by taking 1000 bootstrap samples of the 53 peatlands. The number of peatlands in each sample was equivalent to the data set, but peatlands were selected randomly with replacement. Using the predictor variables of the final model, a BRT model was fitted to each sample. Cross validation was again performed on peatlands; thus, a peatland in the calibration data set was not part of the cross-validation data set, to avoid overly optimistic results. Variances of the predictions and of the fitted functions of the 1000 models were evaluated.
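Resampling at the peatland level, as described above, means drawing whole peatlands with replacement and carrying along all of their dip wells. The sketch below illustrates only this resampling step; the function name and the tiny example are hypothetical, and the model fitting on each sample is left out.

```python
import random


def bootstrap_peatlands(peatland_ids, n_samples, seed=0):
    """Draw bootstrap samples of whole peatlands with replacement.

    Returns, for each sample, the list of dip-well indices it contains. A
    peatland drawn twice contributes its dip wells twice.
    """
    rng = random.Random(seed)
    unique = sorted(set(peatland_ids))
    members = {p: [i for i, q in enumerate(peatland_ids) if q == p] for p in unique}
    samples = []
    for _ in range(n_samples):
        # same number of peatlands as in the data set, drawn with replacement
        drawn = [rng.choice(unique) for _ in unique]
        samples.append([i for p in drawn for i in members[p]])
    return samples


# Six dip wells in three hypothetical peatlands; the study used 1000 samples of 53 peatlands.
peatland_of_well = ["A", "A", "B", "C", "C", "C"]
samples = bootstrap_peatlands(peatland_of_well, n_samples=5, seed=1)
```

Each sample would then be used to fit one model, and the spread of the resulting predictions provides the confidence intervals.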

If data sets are relatively small (e.g., n < 1000; De'ath, 2007), then the small size of the training and test data sets lowers model accuracy. Given the fairly small number of peatlands in the data set and the partly high spatial correlation of dip wells within these peatlands, we decided not to split the data set into a training and test data set. Estimates of model accuracy can then be based on cross-validation, thereby making effective use of all the data (De'ath, 2007). The prediction uncertainty of the final model is estimated by the root mean square error of prediction (RMSEcv, see above) for each land cover class. After testing for near-normal distribution of the residuals, RMSEcv can be used to derive the 68 and 95 % confidence intervals of the predictions with RMSEcv and 2 × RMSEcv, respectively.


Finally, additional residual analysis was performed to evaluate whether the predictions are biased for different land cover classes or geographical regions.

2.5 Regionalization

In the final regionalization step, the predictor variables contributing to the final model were determined at a 25 × 25 m raster for all organic soils in Germany. Predictor variables were determined with the same map input that was used for model building. Land cover information, including information on ditches, was based on the data from year 2012, and the climatic data was based on the average of the last 30 years. The fine spatial resolution of 25 × 25 m was not chosen to suggest a highly spatially accurate model. Rather, this fairly fine scale was necessary to map the relatively small-scale effects of the topography, land use and peatland geometry variables. The final model was then used to make a prediction for each of these raster cells.

3 Results and discussion

3.1 Spatial correlation structure of the data set

The variogram model fitted to the sample variogram provided a nugget (0.012 m², 0.11 m), a sill (0.09 m², 0.3 m) and a range of spatial correlation (2700 m) for our data set of WL (Fig. 4). The nugget represents the very small-scale soil hydraulic variability and micro-topography effects on WL (van der Ploeg et al., 2012) and measurement error, e.g., by differences in the determination of the ground surface and in the timing of the manual measurements. Furthermore, micro-topography (e.g., hummocks) and oscillating peat surfaces of wet peatlands pose a challenge for an accurate determination of both ground surface and water level. The water level time series in the data set were of different lengths and ranged from 1 to 20 years. Interannual variability of water levels can be large (e.g., Knotters and van Walsum, 1997). For simplicity, in our analysis, data were not harmonized by extrapolating WL time series using weather data to a 30-year period. Thus, the nugget also includes errors that are introduced by dip wells with different measurement periods that are located in the range of spatial correlation. In consideration of these error sources, the fitted nugget of 0.11 m appears to be a realistic value. At 0.3 m, the fitted sill matched nearly perfectly with the standard deviation of the data (0.31 m), which indicates consistency between semivariogram model and data set. The fitted range of spatial correlation of 2700 m reflects both physical effects, i.e., the average range of lateral flow due to hydraulic gradients, as well as the effect of average land-use patterns in Germany on the spatial correlation of WL. Fitted values were used in the calculation of the dip-well-specific weights using Eq. (6).

3.2 Typical water levels for land-use types in German organic soils

The land cover classes are characterized by plausible mean and median water levels, which show consistent differences between each other (Table 2 and Fig. 5a). The mean values of arable land and grassland agree with what can be expected for their agronomic requirements, with slightly lower water levels for arable land. The high variability observed for both classes may be related to the variability of the efficiency of installed drainage systems, as for example the presence and condition of tile drains and the depth of ditches. Grasslands can be managed with very variable intensity, which is partly reflected in different water levels. Figure 5a further shows that deciduous forests seem to dominate slightly drier organic soils compared to coniferous forests, which dominate under wetter conditions. A high variability of water levels is observed for the land cover class unused peatland. On the one hand, post peat-cutting topography increases the variability of WL over short distances. It probably contributes to the high variance observed for this class. On the other hand, this class comprises both rather dry unused peatlands and wetter peatlands in which re-wetting measures already took place, which, however, do not yet show a wet soil attribute in the ATKIS Digital Landscape Model. This may also cause part of the variance observed in the grassland and forest land cover class. All wet land cover classes (reed, wet grassland, wet forest and wet unused peatland) that were separated by wetness indication clearly show higher water levels, showing that the wetness attribute of the Digital Landscape Model is a useful attribute.

Figure 5b shows the transformed water level for all classes. It can be observed that the variances of the wetter land cover classes increase relative to the variances of the dry land cover classes. This is due to the highest sensitivity of GHG emissions in the wet range of water levels (> −0.5 m). Consequently, the rather high variance of WL for arable land corresponds to a rather low variance of WLt, i.e., to a rather low assumed effect of WL variability on the GHG budget.

3.3 BRT model calibration and validation: WL vs. WLt

In contrast to land cover class, the other predictor variables showed, if at all, only weak relations to WL and WLt when evaluating them with box plots, 2-D cross plots and simple correlation matrices. Here, we expected BRT to detect the strongest predictor interactions and to identify the most informative predictors.

Figure 5. Water level relative to ground surface, WL (m), and transformed water level, WLt (−), by land cover class, illustrated as a weighted box plot. WLt = −1 corresponds to maximum CO2 emissions and WLt = 1 to maximum CH4 emissions. On the top horizontal axes, the number of dip wells in each class is indicated.

Figure 6. NSEcv as a function of the number of predictor variables used in the model of WLt during model simplification, shown for the last 50 parameter drops.

After model calibration with all predictors, subsequent model simplification successively dropped those parameters with correlation > 0.7 and the lowest contribution. For both WL and WLt, model performance improved during this simplification. For WLt, the highest values of NSEcv of approximately 0.46 were achieved with 21 to 9 model parameters. The development of NSEcv for the last 50 parameters is shown in Fig. 6. Further elimination of parameters led to a pronounced decline of model performance. Similar behavior was observed for the calibration on WL. In favor of a more parsimonious model, we chose the model with the lowest number of parameters before the pronounced decline of model performance occurred. For the calibration on WLt, this corresponded to the model with the lowest number of parameters that still achieved NSEcv values of > 0.45 (Fig. 6). The final WLt model comprised nine predictor variables and the final WL model seven parameters. The percentages of parameter contributions to the final model and their individual influences are discussed for WLt in Sect. 3.4.
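The selection rule for the final WLt model can be illustrated with a minimal Python sketch. It assumes that the simplification curve is available as pairs of predictor count and NSEcv. The function name `most_parsimonious` and all curve values except the pair (9, 0.453) are hypothetical and serve for illustration only. Picking the smallest model above a threshold is a simplification of "before the pronounced decline" and only matches it when the curve declines steadily towards small models.

```python
def most_parsimonious(curve, threshold=0.45):
    """Return the smallest predictor count whose NSEcv still exceeds `threshold`."""
    eligible = [n for n, nse_cv in curve if nse_cv > threshold]
    if not eligible:
        raise ValueError("no model exceeds the NSEcv threshold")
    return min(eligible)


# Hypothetical simplification curve: (number of predictors, NSEcv).
# Only the pair (9, 0.453) is taken from the text; the other values are illustrative.
curve = [(21, 0.46), (15, 0.46), (10, 0.455), (9, 0.453), (8, 0.43), (7, 0.40)]
best_n = most_parsimonious(curve)
print(best_n)
```

With these illustrative values, the rule selects the nine-predictor model.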

Table 2. Weighted mean and standard deviation of WL and WLt data and of the WLt map presented in Sect. 3.6 for the nine land cover classes.

Land cover class     WL (m)          WLt (−)         WLt (−)
                     Mean ± sd       Mean ± sd       Map mean ± sd
Arable land          −0.69 ± 0.30    −0.76 ± 0.17    −0.66 ± 0.22
Deciduous f.         −0.45 ± 0.34    −0.49 ± 0.37    −0.47 ± 0.35
Grassland            −0.44 ± 0.29    −0.52 ± 0.32    −0.49 ± 0.30
Unused peatl.        −0.39 ± 0.36    −0.39 ± 0.41    −0.37 ± 0.40
Coniferous f.        −0.36 ± 0.36    −0.37 ± 0.37    −0.46 ± 0.35
Wet unused peatl.    −0.22 ± 0.27    −0.18 ± 0.40    −0.17 ± 0.36
Wet forest           −0.22 ± 0.29    −0.17 ± 0.43    −0.21 ± 0.39
Wet grassland        −0.10 ± 0.14    −0.00 ± 0.31    −0.15 ± 0.39
Reed                 −0.01 ± 0.17     0.20 ± 0.29    −0.06 ± 0.32

Table 3 summarizes the statistical performances of the models calibrated on WL and WLt. For both models, NSEcal is considerably higher than NSEcv and shows the commonly observed overfitting behavior of BRT models. The different measures that we conducted to minimize overfitting (cross-validation on peatlands, restriction to monotonic responses, and model simplification including elimination of highly correlated variables) lowered the difference between NSEcal and NSEcv, but could not totally avoid overfitting. NSEcv of the WLt model (0.453) indicates higher predictive model performance compared to the WL model (0.381). However, as the data ranges differ due to the transformation, this comparison may be misleading. Therefore, we transformed the predictions of the WL model to obtain WLt values from this model and equally calculated the performance criteria (Table 3, second column). Then, NSEcv is slightly increased (0.397), but does not achieve the values of the model that was calibrated on WLt. A better predictive model performance of the model calibrated on WLt is also visible for the RMSEcv values. The total RMSEcv as well as the RMSEcv values for the dry (WL < −0.3 m) and wet range (WL > −0.3 m) show slightly lower values for the WLt model compared to WLt values from the model calibrated on WL. Given our hypothetical transfer function (Fig. 3), in which the GHG budget is linearly related to WLt, the higher accuracy of WLt predictions directly corresponds to a higher accuracy of GHG budget predictions.

Table 3. Performance criteria of the different models; dry range defined as WL < −0.3 m and wet range as WL > −0.3 m.

               WL (m)             WLt (−)            WLt (−)
               (calibrated        (calibrated        (calibrated
               on WL)             on WL)             on WLt)
NSEcal          0.627              0.559              0.642
NSEcv           0.381              0.397              0.453
RMSEcv          0.269              0.299              0.284
RMSEcv,dry      0.284              0.263              0.259
RMSEcv,wet      0.222              0.382              0.355
Bias           −0.003              0.083              0.002
Bias,dry       −0.012              0.070              0.003
Bias,wet        0.021              0.120              0.000

Superior model performance is also evident when evaluating model bias. Only when calibrating directly on WLt are the WLt predictions bias-free. Calibration on WL and subsequent transformation to WLt introduces a model bias towards systematically lower WLt values. In subsequent applications to GHG emission upscaling, lower WLt values would lead to an overestimation of CO2 emissions and to an underestimation of CH4 emissions.
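For readers who wish to reproduce criteria of the kind listed in Table 3, the following Python sketch shows unweighted versions of NSE, RMSE and bias together with a split into dry and wet range. It is a simplified illustration, not the evaluation code of this study: the data weights used in the paper are omitted, the function names are hypothetical, and the sign convention of the bias (observation minus prediction) is an assumption that follows the residual definition given in the caption of Fig. 10.

```python
import math


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SS_res / SS_tot (unweighted)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def rmse(obs, pred):
    """Root mean square error (unweighted)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def bias(obs, pred):
    """Mean residual, with residual = observation - prediction (assumed sign)."""
    return sum(o - p for o, p in zip(obs, pred)) / len(obs)


def split_by_range(wl_obs, obs, pred, limit=-0.3):
    """Split (obs, pred) pairs into dry (WL < limit) and wet (WL > limit) range."""
    dry = [(o, p) for w, o, p in zip(wl_obs, obs, pred) if w < limit]
    wet = [(o, p) for w, o, p in zip(wl_obs, obs, pred) if w > limit]
    return dry, wet
```

A prediction that always equals the observed mean yields NSE = 0, and a perfect prediction yields NSE = 1, which gives a quick sanity check for the implementation.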

3.4 Influence of predictor variables on WLt

Given the beneficial characteristics of the model calibrated on WLt for GHG upscaling, presentation and discussion of further model results is restricted to the WLt model.

The BRT method allows the analysis of the parameter contributions to and influences on the model (Elith et al., 2008) and thus may contribute to system understanding. The percentages of the contributions of the nine predictor variables to the final model ranged from 25.2 to 5.6 % (Fig. 7). Except protection status, at least one parameter of each of the seven parameter groups contributed to the final model. All protection status information was dropped early during the simplification process due to low contribution, although WL showed slightly higher values for data from nature protection or special areas of conservation. However, other parameters seem to be able to fully compensate the information that is lost by dropping this predictor.

Land cover class lc at the dip well was the parameter with the strongest contribution (25.2 %). It basically follows the trend illustrated in Fig. 5b. The bootstrap error plotted as standard deviation (Fig. 7) shows the variation of this influence over the 1000 bootstrap models. A second land cover parameter, the fraction of dry land cover classes on organic soils in a buffer of 2500 m radius, fdry (2500), contributed to the model with 10.3 %. The monotonic decrease of WLt with increasing fdry (2500) is plausible, as higher values reflect intensive land use in the surroundings of the dip well and thus indicate intensive artificial drainage. Together, both parameters contributed 35.5 %, and thus land cover represents the parameter group with the strongest model contribution.

Peatland characteristics are the second most important parameter group. The peatland type contributed 16 %. The model indicates that peatlands without any connection to surface water bodies (river or lake) and the class of other organic soils are characterized by lower WLt compared to the peatland types lowland bog, upland bog, and fen neighboring surface water. As the class of other organic soils is generally expected to reflect lower water levels and as surface water may have a stabilization effect on water levels of organic soils, the influence of the peatland type can be considered plausible. Besides peatland type, the substrate of the peat base contributes 5.6 %. Here, organic soils overlying peat clay layers (e.g., limnic sediments such as calcareous gyttja) or basement rock are characterized by higher WLt compared to organic soils overlying unconsolidated rock. This can be explained by the lower drainage resistance of unconsolidated rocks. This may cause an increased efficiency of anthropogenic drainage and/or a general higher vulnerability to seepage losses. Finally, slightly lower WLt values are indicated by a high fraction of organic soils for the 500 m buffer, fpeat (500). This may reflect the higher land-use pressure on large peatlands compared to rather small peatlands, which tentatively are more easily preserved by nature protection efforts.

The remaining four parameter groups are represented in the model by only one parameter each. The third most influential parameter was the length of ditches on arable land and grassland for the 250 m buffer, dilendry (250). At first glance, it may be surprising that with increasing ditch density WLt values tend to be higher, as ditches are supposed to drain the water when land is used as arable land and grassland. The fact that the model identifies a rather strong effect in the opposite direction may be caused by incomplete information about the drainage network. There is no detailed information about the spatial distribution of tile drains. Based on expert knowledge, agricultural areas with a lower ditch density are more likely to have tile drains. As these drains, easily installed with a narrow drain spacing, are more effective at draining organic soils, low WLt values for arable land and grassland may be related to low ditch densities. Furthermore, ditches were originally dug at narrow spacing in especially wet areas of organic soils, but there is no information available on whether these ditches still function properly.

The parameters wbsummer, hrel and tiras25 all show expected trends. The model predicts higher WLt for increasing climatic water balance in the summer period (May to October), wbsummer, for dip wells located in depressions (low values of hrel), and for higher small-scale topographic wetness indices calculated on the 25 × 25 digital elevation model (tiras25).

Figure 7. Partial dependence plots for the predictor variables. For an explanation of variables, see Table 1. The y axes are on WLt scale and are centered around the mean WLt. Error bars and grey area indicate the standard deviation of the response over 1000 bootstrap models. The relative contribution of each predictor is indicated as percentage. The lines along the x axes of each plot show the distribution of data across that variable in deciles.

The fact that all parameters show expected or explainable responses in the model corroborates the reliability of the calibrated WLt model. The standard deviation of the predictor responses based on the bootstrap samples shows the stability of the observed responses.

Further insight into model behavior can be obtained by analyzing parameter interactions. This is obtained by changing two parameters simultaneously while keeping mean values for all other parameters (Elith et al., 2008). Figure 8 shows the two strongest parameter interactions. Parameter wbsummer strongly interacts with ptype. The generally lower values of WLt of fens without surface water connection and other organic soils show a stronger dependency on the summer climatic water balance. While a summer climatic water balance of > −80 mm shows a rather weak effect on WLt for the wetter peatland types, the two drier peatland types still show a strong effect with increasing wbsummer. The trend for wbsummer > 130 mm for the dry peatland types is supported by seven different peatlands.
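The evaluation of a two-way interaction described above can be sketched in Python as follows. This is an illustration under stated assumptions and not the procedure implemented in the BRT software used in this study: `predict` stands for any fitted model that accepts a dictionary of predictor values, the predictor names `u`, `v` and `w` and the function `toy_model` are hypothetical, and both varied predictors are numeric here, whereas ptype in Fig. 8a is categorical.

```python
from itertools import product


def interaction_surface(predict, means, name_a, grid_a, name_b, grid_b):
    """Evaluate `predict` on a grid of two predictors, all others fixed at their means."""
    surface = {}
    for a, b in product(grid_a, grid_b):
        x = dict(means)  # copy, so that the mean values stay untouched
        x[name_a] = a
        x[name_b] = b
        surface[(a, b)] = predict(x)
    return surface


def toy_model(x):
    # Hypothetical response with an interaction: the effect of u depends on v.
    return x["u"] * x["v"] + x["w"]


means = {"u": 0.5, "v": 1.0, "w": 0.5}
surface = interaction_surface(toy_model, means, "u", [0.0, 1.0], "v", [0.0, 2.0])
```

In the toy surface, raising `u` changes the response only when `v` is non-zero, which is the pattern an interaction plot such as Fig. 8 is meant to reveal.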

Figure 8. Partial dependence plots representing the two strongest interactions in the model: (a) between ptype and wbsummer and (b) between pbase and fdry. Fitted WLt is plotted on the y axis, which is obtained after accounting for the average effect of all other predictor variables.

Another strong interaction is observed for pbase and fdry (2500). While a rather weak effect of the fraction of arable land and grassland is observed for organic soils overlying basement rock and peat clay layer, a strong effect is observed for organic soils overlying unconsolidated rock. This interaction reflects the higher lateral range of drainage effects for organic soils with little flow resistance at the peat base. In these organic soils, intensive land use lowers the water level over large areas.

3.5 Discussion of model uncertainty

Plotting observed vs. predicted WLt from cross-validation (Fig. 9) illustrates the rather large residual variance that cannot be explained by the model. As indicated by the higher RMSEcv for the wet range (Table 3), scatter increases with increasing WLt. Error bars in the y direction indicate data error derived from the nugget of the variogram. It is shown for a few data points as an example. Due to transformation, data error increases for higher WLt. Figure 9 demonstrates that the fraction of unexplainable variance related to data error is much higher for the wet than for the dry range. Bootstrap error, indicating the variation of the model predictions for 1000 bootstrap samples, is shown in the x direction for the same data points. Bootstrap error is lower than the data error for the wet range and slightly higher for the dry range.

Figure 9. Observed vs. predicted transformed annual mean water level (WLt) from cross-validation results. Error bars show selected data and bootstrap model errors as standard deviation. Data points are scaled by their weights.

Bootstrap errors demonstrate the sensitivity of model predictions to changes of the data set used for calibration. When a model possesses structural deficits such as missing predictor variables, bootstrap errors should not be used to define confidence intervals for the model predictions. Figure 10 shows residuals from cross-validation and standard deviation of bootstrap predictions for all land cover classes. The residuals of each land cover class show near-normal distributions. For five of the nine land cover classes (wet forest, wet unused peatland, arable land, coniferous forest and reed), the Shapiro–Wilk test of normality is positive (p > 0.05). Figure 10a further indicates that residuals of each land cover class scatter fairly well around zero, indicating low bias for the various land cover classes. Land-cover-class-specific confidence intervals of model predictions can thus be derived from the RMSEcv of each land cover class, e.g., 2 × RMSEcv representing the 95 % confidence interval.
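The derivation of such an interval can be sketched in a few lines of Python. The sketch assumes near-normal residuals with negligible bias, as discussed above. The function names are hypothetical, and since class-specific RMSEcv values are not listed in this section, the overall RMSEcv of the WLt model from Table 3 (0.284) is used as a stand-in.

```python
import math


def rmse_of_residuals(residuals):
    """RMSE computed directly from cross-validation residuals of one class."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def confidence_interval(prediction, rmse_cv, k=2.0):
    """Approximate 95 % interval as prediction +/- k * RMSEcv (k = 2 as in the text)."""
    half_width = k * rmse_cv
    return prediction - half_width, prediction + half_width


# Stand-in value: overall RMSEcv of the WLt model (Table 3), not a class-specific one.
low, high = confidence_interval(0.0, 0.284)
```

For a predicted WLt of 0, this gives an interval of roughly −0.57 to 0.57, which covers more than half of the range between WLt = −1 and WLt = 1.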

The prediction uncertainty derived from cross-validation is much higher than the bootstrap prediction uncertainty obtained from the bootstrap standard deviation (sd), with 2 × sd corresponding to the 95 % confidence interval (Fig. 10). The large difference between these values indicates that the model has structural deficits that can be attributed to several error sources.

Figure 10. (a) Residuals (observation − prediction) of WLt predictions and (b) standard deviation (sd) of bootstrap predictions, shown for the nine land cover classes. In the upper part, the number of dip wells in each class is indicated.

i. Key influences on WLt are missing in the set of predictor variables. None of the predictor variables indicate whether and to which extent water level increase due to re-wetting measures took place in the last years. Wetness indicators (wet soil and/or vegetation attributes) that are obtained from the Digital Landscape Model probably react with a delay of several years. Thus, we expect the occurrence of several observed high WLt values that cannot be explained by any of the predictor variables.

ii. Small-scale topography that is not represented with sufficient detail and accuracy in the DEM may cause several predictions to strongly differ from what would be expected from the other predictor variables. A common example may be a dip well that is located on a narrow peat ridge, which remained after peat-cutting and is absent in the DEM, and that is situated in an area classified as wet soil by the Digital Landscape Model. Then, the model indicates a WLt that is much higher than the observed WLt, as for the observed value the reference surface was the surface of the peat ridge.

iii. Consistent information about tile drains is missing and only exists on the regional scale (Tetzlaff et al., 2009). At the national scale, however, there are no maps on tile drains. Tile drains are known to have a strong effect on WLt for arable land and grassland. As explained above, we expect parameter dilendry (250) to partially compensate for this missing information.

iv. Another source of prediction uncertainty may comprise inconsistent and erroneous land cover classification of the Digital Landscape Model due to the high degree of subjectivity for many of the attributes. Furthermore, the temporal accuracy of the Digital Landscape Model may be as inaccurate as 5 years, which can cause time series with land-use change to be split at the wrong date and vegetation and wetness attributes to be not yet updated to the current conditions.

v. The water balance of fens strongly depends on the size and the hydraulic head of the groundwater catchment, i.e., of the aquifer underlying the peat layer. Unfortunately, there is no consistent map of hydraulic heads or groundwater catchments for all Germany.

Figure 11. Residuals (observation − prediction) of WLt predictions for the three major geographical peatland regions of Germany. In the upper part, the number of dip wells in each class is indicated.

We checked model predictions for geographical bias. Geographical location was not one of the model parameters. However, the history and policy of land use on organic soils, current ditch water management, and climate do show large-scale geographical trends. We divided our data set into the three major German peatland regions (NE, NW and S) and evaluated the model residuals (Fig. 11) to see whether our model is biased due to important missing geographical effects. A serious bias for any of the three major German peatland regions cannot be identified.

Figure 12. Map of predictions of transformed annual mean water level (WLt) for all German organic soils (a) and an enlarged map section (b). The probability distribution in (c) indicates the uncertainty of a specific point prediction for wet grassland as an example. Here, the predicted value is approximately WLt = 0, but note that wet grassland predictions do vary in space depending on the values of the other model parameters. The histogram shows the residuals from cross-validation for wet grassland, to which the probability distribution was fitted.

When applying calibrated statistical models during regionalization, it is important to check model behavior for extrapolation outside the range of the parameter space that is covered by the data upon which the model was built. BRT always extrapolates at a constant value from the most extreme environmental value in the training data. In contrast to other types of statistical models, e.g., generalized linear models, BRT does not continue the fitted trend beyond the last observation. Regarding the categorical variables, the data set covers all classes occurring in Germany with several peatlands. The data set also covers the major range of values occurring in Germany for the numerical predictor variables. Furthermore, Fig. 7 indicates that the constant values at which the model extrapolates the influence of the variables do not raise major concern for any extreme predictions outside the parameter range.
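The different extrapolation behavior of tree-based and linear models can be illustrated with a small Python sketch. It does not fit a BRT model: a single regression stump (one split) stands in for the piecewise-constant trees that a BRT model sums up, and an ordinary least-squares line stands in for a linear model. The function names and the four data points are hypothetical.

```python
def fit_stump(x, y):
    """Fit a one-split regression tree by minimizing the squared error."""
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no split possible between identical x values
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = sum((v - mean_l) ** 2 for v in left) + sum((v - mean_r) ** 2 for v in right)
        split = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or sse < best[0]:
            best = (sse, split, mean_l, mean_r)
    _, split, mean_l, mean_r = best
    return lambda q: mean_l if q < split else mean_r


def fit_line(x, y):
    """Ordinary least-squares line through the data."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sum(
        (a - mean_x) ** 2 for a in x
    )
    return lambda q: mean_y + slope * (q - mean_x)


# Hypothetical training data covering the predictor range 0 to 3.
x_train, y_train = [0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0]
tree = fit_stump(x_train, y_train)
line = fit_line(x_train, y_train)
```

Outside the training range, `tree(10.0)` returns the same value as `tree(3.0)`, whereas `line(10.0)` continues the fitted trend. A sum of such piecewise-constant trees inherits the constant behavior.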

3.6 Regionalization

The map of WLt resulting from the application of the fitted WLt model to all grid cells shows gradients at the regional scale (Fig. 12a). In the south of Germany, for example, a gradient from wet to dry can be observed for the pre-alpine upland bogs and the peatlands of the moraine plain. In the north of Germany, the map indicates that organic soils in the very NE are wetter than the rest. For the rest of the north, a slight gradient can be observed from less dry to dry from NW to E, which is mainly driven by the higher summer climatic water balance in the NW. As both categorical and numerical predictor variables do also vary at the sub-regional scale, the resulting map also shows gradients within peatland areas, e.g., due to small-scale land-use and ditch density gradients and topography effects (Fig. 12b).

We calculated WLt averages of the land cover classes using the regionalized WLt from the map (Table 2, column 3). The given standard deviation comprises both the variability within a land cover class that is explained by the model as well as the uncertainty of each prediction. Resulting means and standard deviations slightly differ from the corresponding values of the data set. The land-cover-specific WLt values obtained from the map can be considered as being more representative, as the regionalization procedure is supposed to partly account for potential bias in the data set.

When applying this map and its predicted WLt values in subsequent GHG upscaling, it is crucial that model uncertainty is propagated properly. An example demonstrates the necessity of uncertainty propagation. For a grid cell classified as wet grassland, the probability distribution of WLt is shown based on a normal distribution that was fitted to the residuals of this land cover class (Fig. 12c). Without propagating the uncertainty and when only translating the predicted WLt (possibly in combination with other parameters, e.g., soil properties) into a GHG budget, the GHG budget is strongly underestimated, as the WLt prediction is close to zero, indicating neither large CO2 nor CH4 emissions. When translating the full distribution of WLt into a GHG budget, the resulting GHG budget would be much higher, as at both sides of the predicted WLt the GHG budget increases.
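The effect of uncertainty propagation in this example can be made concrete with a short numerical sketch in Python. Everything in it is an assumption chosen for illustration: the V-shaped function `ghg_budget` (zero at WLt = 0 and increasing linearly on both sides) is a hypothetical stand-in for a transfer function, the standard deviation of 0.3 is not taken from Fig. 12c, and the bounds of WLt at −1 and 1 are ignored.

```python
from statistics import NormalDist


def ghg_budget(wlt):
    """Hypothetical V-shaped budget: zero at WLt = 0, rising linearly on both sides."""
    return abs(wlt)


def expected_budget(mean, sd, n=20001, width=6.0):
    """Expected budget under a normal WLt distribution (midpoint-rule integration)."""
    dist = NormalDist(mean, sd)
    lower = mean - width * sd
    step = 2.0 * width * sd / n
    total = 0.0
    for i in range(n):
        w = lower + (i + 0.5) * step  # midpoint of the i-th integration cell
        total += ghg_budget(w) * dist.pdf(w) * step
    return total


point_estimate = ghg_budget(0.0)            # budget of the predicted WLt only
propagated = expected_budget(0.0, 0.3)      # budget of the full WLt distribution
```

Under these assumptions the point estimate is zero, while the propagated budget equals sd × sqrt(2/π), i.e., about 0.24 in the units of the hypothetical budget. The qualitative result, a clearly higher budget once the distribution is propagated, does not depend on the exact shape chosen here as long as the budget increases on both sides of the prediction.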

3.7 Possible paths for model improvement

The model performance that is achieved by the statistical approach presented in our study raises the question whether collecting more WL data can improve model performance or whether the factor that is constraining the model performance is the limited strength of the nation-wide available predictor variables. To assess this question, additional “hold-out models” were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated and model performance was evaluated with NSEcv. As expected, results indicate an increase of model performance with increasing number of peatlands used in the model building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when including many more peatlands into the data set. More promising would be the specific collection of more data on the weakly represented and/or important land cover classes arable land and grassland.
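The structure of such a hold-out experiment can be sketched in Python as follows. The sketch only covers the random selection of peatlands and the aggregation of scores. The model fitting and the computation of NSEcv are hidden behind the callable `evaluate`, which is a hypothetical placeholder, and `toy_score` is an arbitrary deterministic function that merely allows the sketch to run; it does not represent results of this study.

```python
import random
from statistics import mean, stdev


def learning_curve(peatland_ids, sizes, evaluate, n_repeats=500, seed=0):
    """Mean and sd of a score over random peatland subsets of each size."""
    rng = random.Random(seed)
    curve = {}
    for k in sizes:
        # Draw k distinct peatlands per repetition and score the resulting model.
        scores = [evaluate(rng.sample(peatland_ids, k)) for _ in range(n_repeats)]
        curve[k] = (mean(scores), stdev(scores))
    return curve


def toy_score(subset):
    # Placeholder for "fit BRT on these peatlands and return NSEcv".
    return 1.0 - 1.0 / len(subset)


peatlands = list(range(53))  # the data set comprises 53 peatlands
curve = learning_curve(peatlands, [10, 30, 50], toy_score, n_repeats=20)
```

Plotting the mean score against the subset size, with the standard deviation as a band, would give a learning curve of the kind shown in Fig. 13.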

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nation-wide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that has been successfully conducted in Finke et al. (2004) using the transfer noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, e.g., spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± sd.

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data covers only a small fraction of peatlands of the final map and thus spatial interpolation is not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. With the analysis of their effect on the water level, it was possible to gain insight into natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions linearly depend. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions, and avoided model bias.

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show that model improvement by increasing the number of WLt data, however, seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively few data are available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, for example water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach is suited to application to any other country where similar data of target and predictor variables are available. It is important that the spatial resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if for specific regions detailed information on peat properties becomes available and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), Landkreis Gifhorn, NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support, Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support, Annette Freibauer for reviewing the paper draft, and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project “Organic soils” funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43, 5, 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirkby, M.: A physically based, variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De'ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz – Moornutzungsstrategien 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.

Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package “dismo”, version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland T Heuvelink G B M and Knotters M Map-ping Water-Table Depths Over Time to Assess Desiccation ofGroundwater-Dependent Ecosystems in the Netherlands Wet-lands 30 137ndash147 doi101007s13157-009-0011-4 2010

IPCC IPCC guidelines for national greenhouse gas inventoriesedited by Eggleston H S Buendia L Miwa K and NgaraT IGES Japan 2006

Ju W M Chen J M Black T A Barr A G MccaugheyH and Roulet N T Hydrological effects on carbon cy-cles of Canadarsquos forests and wetlands Tellus B 58 16ndash30doi101111j1600-0889200500168x 2006

Knotters M and van Walsum P E V Estimating fluctua-tion quantities from time series of water-table depths usingmodels with a stochastic component J Hydrol 197 25ndash46doi101016S0022-1694(96)03278-7 1997

Leathwick J R Elith J Francis M P Hastie T andTaylor P Variation in demersal fish species richness inthe oceans surrounding New Zealand an analysis usingboosted regression trees Mar Ecol-Prog Ser 321 267ndash281doi103354Meps321267 2006

Leiber-Sauheitl K Fuszlig R Voigt C and Freibauer A HighCO2 fluxes from grassland on histic Gleysol along soil car-bon and drainage gradients Biogeosciences 11 749ndash761doi105194bg-11-749-2014 2014

Levy P E Burden A Cooper M D A Dinsmore K J DrewerJ Evans C Fowler D Gaiawyn J Gray A Jones S KJones T Mcnamara N P Mills R Ostle N Sheppard L JSkiba U Sowerby A Ward S E and Zielinski P Methaneemissions from soils synthesis and analysis of a large UK dataset Global Change Biol 18 1657ndash1669 doi101111j1365-2486201102616x 2012

Limpens J Berendse F Blodau C Canadell J G FreemanC Holden J Roulet N Rydin H and Schaepman-StrubG Peatlands and the carbon cycle from local processes toglobal implications ndash a synthesis Biogeosciences 5 1475ndash1491doi105194bg-5-1475-2008 2008

Martin M P Wattenbach M Smith P Meersmans J Jolivet CBoulonne L and Arrouays D Spatial distribution of soil or-ganic carbon stocks in France Biogeosciences 8 5 1053-1065doi105194bg-8-1053-2011 2011

Melton J R Wania R Hodson E L Poulter B Ringeval BSpahni R Bohn T Avis C A Beerling D J Chen GEliseev A V Denisov S N Hopcroft P O Lettenmaier DP Riley W J Singarayer J S Subin Z M Tian H ZuumlrcherS Brovkin V van Bodegom P M Kleinen T Yu Z Cand Kaplan J O Present state of global wetland extent andwetland methane modelling conclusions from a model inter-comparison project (WETCHIMP) Biogeosciences 10 753ndash788 doi105194bg-10-753-2013 2013

Moore T R and Dalva M The Influence of Temperature andWater-Table Position on Carbon-Dioxide and Methane Emis-sions from Laboratory Columns of Peatland Soils J Soil Sci44 651ndash664 doi101111j1365-23891993tb02330x 1993

Moore T R and Roulet N T Methane Flux ndash Water-Table Re-lations in Northern Wetlands Geophys Res Lett 20 587ndash590doi10102993gl00208 1993

Nash J E and Sutcliffe J V River flow forecasting through con-ceptual models part I ndash A discussion of principles J Hydrol 10282ndash290 doi1010160022-1694(70)90255-6 1970

Regina K Nykaumlnen H Silvola J and Martikainen P J Fluxesof nitrous oxide from boreal peatlands as affected by peatlandtype water table level and nitrification capacity Biogeochem-istry 35 401ndash418 doi101007BF02183033 1996

Ridgeway G Generalized boosted regression models Documen-tation on the R Package ldquogbmrdquo version 21httpcranr-projectorgwebpackagesgbmgbmpdf(last access February 2014)2013

Roszligkopf N Fell H and Zeitz J Organic soils in Germany theirdistribution and carbon stocks Catena in review 2014

Sutanudjaja E H van Beek L P H de Jong S M vanGeer F C and Bierkens M F P Using ERS spacebornemicrowave soil moisture observations to predict groundwaterhead in space and time Remote Sens Environ 138 172ndash188doi101016jrse201307022 2013

Tetzlaff B Kuhr P and Wendland F A New Method for CreatingMaps of Artificially Drained Areas in Large River Basins Basedon Aerial Photographs and Geodata Irrig Drain 58 569ndash585doi101002Ird426 2009

Thompson J R Gavin H Refsgaard A Sorenson H R andGowing D J Modelling the hydrological impacts of climatechange on UK lowland wet grassland Wetl Ecol Manage 17503ndash523 doi101007s11273-008-9127-1 2009

UBA National Inventory Report for the German Greenhouse GasInventory 1990ndash2008 Submission under the United NationsFramework Convention on Climate Change and the Kyoto Pro-tocol 2012 Dessau Germany 2012

van den Akker J J H Jansen P C Hendriks R F A HovingI and Pleijter M Submerged infiltration to halve subsidenceand GHG emissions of agricultural peat soils Proceedings of the14th International Peat Congress Stockholm Sweden 2012

van der Gaast J W J Massop H T L and Vroon H RJ Actuele grondwaterstandsituatie in natuurgebieden Een Pi-lotstudie Wettelijke Onderzoekstaken Natuur amp Milieu WOt-rapport 94 Wageningen 134 pp 2009

van der Ploeg, M. J., Appels, W. M., Cirkel, D. G., Oosterwoud, M. R., Witte, J. P. M., and van der Zee, S. E. A. T. M.: Microtopography as a Driving Mechanism for Ecohydrological Processes in Shallow Groundwater Systems, Vadose Zone J., 11, 52–62, doi:10.2136/Vzj2011.0098, 2012.

Wackernagel, H.: Multivariate Geostatistics, Springer, Berlin, Germany, 387 pp., 2003.

decision tree concept, the parameter space is searched sequentially for the best split that results in the lowest model mean squared error. The mean responses of the groups that result from the various splits and correspond to certain parameter ranges represent the model. The common procedure is the growth of a large tree, which is subsequently simplified by dropping weak links that are identified with cross-validation. Growing only a single tree has several disadvantages, such as uneven functions that are very sensitive to the specific sample of the data. Therefore, ensemble techniques have been combined with the decision tree concept. These were first the development of multiple models by bootstrapping of the samples (bagging technique) and the random creation of subsets of predictors at each split (random forest technique). Later, with the “boosting” technique of BRT, a sequential procedure was developed in which data is re-weighted after each tree to increase emphasis on data that is poorly modeled by the existing collection of trees (Elith et al., 2008).

BRT modeling is increasingly applied in spatial modeling of species or numerical environmental variables (Elith et al., 2008; Martin et al., 2011), thereby often showing superior performance compared to other machine-learning algorithms. The increasing application of BRT is related to several of its favorable characteristics. The strength of this method lies in the ability to fit complex functional dependencies, including non-linear relationships and interactions between predictor variables. Because tree splits depend only on the rank order of predictor values, BRT is invariant to monotonic transformations of predictors. Furthermore, BRT allows for missing values in the predictor variables; thus, predictor variable information does not necessarily need to fully cover the total map extent. The gbm package handles missing values in predictor variables by introducing surrogate splits. The mean target value belonging to the missing predictor values is attributed to these surrogate splits during model building. We observed that the contribution of a predictor variable to the final model decreases with an increasing number of missing values. This is intuitive, as target observations of missing predictor values are mostly supposed to scatter strongly. BRT is further fairly insensitive to outliers and allows estimating the relative contribution of each predictor variable to the model. Due to these characteristics, we expected BRT to be very well suited to the very heterogeneous data set of this study.

BRT model calibration is prone to overfitting, and there are various options to reduce this behavior. Due to the overfitting behavior, cross validation is generally part of the model building process. However, cross validation can be performed in several ways and, if performed carelessly, can lead to overly optimistic model performance (De’ath, 2007). Here, cross validation was performed by leaving out whole peatland areas instead of a random set of dip wells. This represents a stricter cross validation, and we noticed that it strongly reduced overfitting of the water level data and thus contributed to the development of a more robust model.
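The leave-whole-peatlands-out cross validation described above can be illustrated with a minimal Python sketch. It is not the code used in the study; the function name `peatland_folds`, the input format (one peatland label per dip well) and the number of folds are assumptions made for illustration only.

```python
import random
from collections import defaultdict


def peatland_folds(peatland_ids, n_folds=5, seed=0):
    """Assign whole peatlands (not single dip wells) to cross-validation folds.

    peatland_ids: one peatland label per dip well (hypothetical input format).
    Returns a list of (train_idx, test_idx) pairs of dip-well indices.
    """
    wells_by_peatland = defaultdict(list)
    for idx, pid in enumerate(peatland_ids):
        wells_by_peatland[pid].append(idx)

    # Shuffle the peatlands, then deal them out to the folds in turn.
    peatlands = sorted(wells_by_peatland)
    random.Random(seed).shuffle(peatlands)

    folds = []
    for k in range(n_folds):
        test_peatlands = set(peatlands[k::n_folds])
        test_idx = [i for p in peatlands if p in test_peatlands
                    for i in wells_by_peatland[p]]
        train_idx = [i for p in peatlands if p not in test_peatlands
                     for i in wells_by_peatland[p]]
        folds.append((sorted(train_idx), sorted(test_idx)))
    return folds


# Ten hypothetical dip wells located in six peatlands (A to F).
ids = ["A", "A", "B", "B", "B", "C", "D", "D", "E", "F"]
folds = peatland_folds(ids, n_folds=3)
```

Because every dip well of a held-out peatland is removed from the training data at once, spatially autocorrelated neighbors cannot leak information into the test fold, which is the reason this variant is stricter than a random split of dip wells.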

Figure 2. Illustration of the predictor variables determined for each dip well based on available national maps (see Table 1).

Another option to avoid overfitting is to impose monotonic slopes on the effects of individual parameters, which can even lead to improved prediction performance (De’ath, 2007). For all our numerical variables, we expected monotonic slopes rather than optimum functions. To avoid predefining any expected direction, all numerical variables were added twice to the set of predictors, with the slope constrained once to a monotonic increase and once to a monotonic decrease. We let the model decide whether monotonic increase or decrease has higher predictive power.
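As a small illustration of adding each numerical variable twice, the following Python sketch builds a predictor list together with a vector of monotonicity codes (+1 increasing, −1 decreasing, 0 unconstrained). The helper name, the `_inc`/`_dec` suffixes and the example variable names are hypothetical; how such a vector is passed to a particular boosting library is not shown here.

```python
def duplicate_with_monotonic_constraints(numeric_names, categorical_names):
    """Return (columns, constraints) with every numeric predictor listed twice.

    Each numeric predictor appears once constrained to a monotonic increase
    (+1) and once to a monotonic decrease (-1); categorical predictors are
    left unconstrained (0).
    """
    columns, constraints = [], []
    for name in numeric_names:
        columns += [name + "_inc", name + "_dec"]
        constraints += [1, -1]
    for name in categorical_names:
        columns.append(name)
        constraints.append(0)
    return columns, constraints


# Hypothetical predictor names, for illustration only.
cols, cons = duplicate_with_monotonic_constraints(
    ["precip", "ditch_dist"], ["land_cover"])
```

During fitting, the copy whose constrained direction matches the data will reduce the error and gain contribution, whereas the opposite copy can only fit a flat response and therefore drops out during model simplification.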

Models were calibrated using a Gaussian response type aimed at minimizing deviance (squared error) (Ridgeway, 2013). In all calibration runs, we applied the gbm.step function of the dismo package, which assesses the optimal number of boosting trees using cross validation. We tested various learning rates (0.001–0.01), bag fractions (0.1–0.8) and levels of tree complexity (3 to 7), i.e., the number of nodes in a tree. By trial and error, we determined the most effective algorithm parameters for our data set, being 0.005 for the learning rate, 0.6 for the bag fraction and 5 for the tree complexity.

The final BRT model building is commonly performed as a two-step procedure (Elith et al., 2008), which we basically also followed in our study.

Table 1. Overview on predictor variables.

Predictor variable | Variable name | Values | Point/buffers (m) | Data source
Land-use type | | Arable, grassland, forest, shrubs, peat-mining, unused peatland, swamp, open water | Point, 100, 500, 1000, 2500 | Digital Landscape Model¹
Vegetation attributes (optional) | | Deciduous forest, mixed forest, coniferous forest, reed, shrubs, grass | Point | Digital Landscape Model¹
“Wet soil observed” | | Yes, no | Point | Digital Landscape Model¹
Combined land cover information (land-use type, veg. and wet soil attr.) | lc | Arable, grassland, wet grassland, deciduous (including mixed) forest, wet forest, coniferous forest, reed, unused peatland, wet unused peatland | Point, 100, 500, 1000, 2500 | Digital Landscape Model¹
Dry land cover fraction | f_dry(X) | Arable, 0.5 × grassland on organic soil area; 0 to 1 | 100, 500, 1000, 2500 | Digital Landscape Model¹
Wet land cover fraction | | Reed, wet grassland, wet forest and wet unused peatland on organic soil area; 0 to 1 | 100, 500, 1000, 2500 | Digital Landscape Model¹
Total length of ditches, for all lc and only for arable and grassland (subscr. “dry”) | di_len,dry(X) | ≥ 0 m | Point, 50, 250, 1000, 2500 | Digital Landscape Model¹
Distance to next ditch | | ≥ 0 m | Point | Digital Landscape Model¹
Peatland type | ptype | Lowland bog, upland bog, fen neighboring surface water, fen without neighboring surface water, other “low-C” organic soil | Point | Map of organic soils²
Material at peat base | pbase | Unconsolidated rock, peat clay layer, rock, no information | Point | Map of organic soils²
Peatland fraction | f_peat(X) | 0 to 1 | Point, 500, 1000, 2500 | Geological map (BGR)³
Distance to edge of peatland | | > 0 m | | Geological map (BGR)³
Ratio of d_peat / f_peat | | > 0 | 2500 | Geological map (BGR)³
Precipitation | | ≥ 0 mm | Point | Raster map 1 × 1 km (DWD)⁴
Evapotranspiration | | ≥ 0 mm | Point | Raster map 1 × 1 km (DWD)⁴
Climatic water balance | wb_summer | < 0 and ≥ 0 mm | Point | Raster map 1 × 1 km (DWD)⁴
Relative height | h_rel(X) | < 0 and ≥ 0 m | Point – median; 25, 50, 100, 250, 500, 1000 | Digital Elevation Model⁵
Topographic index | ti_rasR(X) | > 0 | Point and 1000 buffer for 10, 25, 250, 1000 raster values | Digital Elevation Model⁵
Protection status | | Nature conservation area, special areas of conservation, special protection area for wild birds, UNESCO biosphere reserve, nature park, national park, landscape protection area | Point | Maps of protected areas⁶

¹ ATKIS Basis DLM, Federal Agency for Cartography and Geodesy (BKG); ² map of organic soils (Roßkopf et al., 2014; Humboldt University of Berlin); ³ geological map 1:200 000 (GUEK200, BGR – Federal Institute for Geosciences and Natural Resources); ⁴ raster map 1 × 1 km of weather data (German Weather Service); ⁵ BKG; ⁶ Federal Agency for Nature Conservation (BfN). Variable name indicated for the nine variables in the final model, with (X) indicating buffer size and R indicating raster resolution.

Figure 3. Illustration of the annual mean water level (WL) transformation. Hypothetical transfer function relating GHG budget to WL (m) (a), GHG budget vs. the transformed water level (WLt) (b), and WLt vs. WL (c). The lines along the x axes indicate the data quantiles of the analyzed data set.

i. In the first step, the whole set of predictor variables is used to calibrate a BRT model.

ii. In a second step, the number of parameters is reduced sequentially to avoid overfitting and to derive a more parsimonious model. We tracked predictive performance criteria during the simplification process. As various variables were calculated for different buffer sizes, our predictors included a large number of correlated variables. Correlation coefficients between predictor variables of > 0.7 are known to severely distort model estimation and subsequent prediction (Dormann et al., 2013). Thus, we performed this simplification process by first dropping those parameters with a correlation > 0.7 (either Pearson or Spearman type) to another parameter with a higher contribution (Clapcott et al., 2011). This ensured that two highly correlated parameters would not remain in the parameter set longer than the last parameter of another group of variables, which may contribute less compared to the two highly correlated parameters but provides extra information that is not covered by the other parameters. After all highly correlated parameters have been dropped, further parameters with low contribution were dropped progressively.
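The correlation screening of step ii can be sketched in Python as follows. This is a simplified, single-pass illustration under stated assumptions: it uses only the Pearson coefficient (the study considered Pearson or Spearman), it takes the predictor contributions as fixed input rather than refitting the model after each drop, and all names and numbers in the example are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(data, contribution, threshold=0.7):
    """Keep a predictor only if it is not strongly correlated (|r| > threshold)
    with an already kept predictor that has a higher contribution."""
    kept = []
    for name in sorted(data, key=lambda k: contribution[k], reverse=True):
        if all(abs(pearson(data[name], data[other])) <= threshold
               for other in kept):
            kept.append(name)
    return kept


# Hypothetical predictors: the same land cover fraction for two buffer sizes
# (strongly correlated) and an unrelated climatic variable.
data = {
    "f_dry_100": [0.1, 0.4, 0.5, 0.9, 1.0],
    "f_dry_500": [0.2, 0.35, 0.55, 0.8, 0.95],
    "precip": [700, 650, 900, 720, 810],
}
contribution = {"f_dry_100": 20.0, "f_dry_500": 35.0, "precip": 10.0}
kept = drop_correlated(data, contribution)
```

In this example the 100 m buffer variant is removed because it is nearly collinear with the higher-contributing 500 m variant, while the weaker but independent climatic variable stays in the set, which mirrors the intent described in step ii.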

Predictor contributions are calculated as proportional contributions to the total error reduction and can be considered as a measure for the influence of the individual predictors. Additionally, a BRT model allows the derivation of partial dependence plots, which indicate how the response is affected by a certain predictor after accounting for the average effects of all other predictors in the model (Elith et al., 2008). These plots do not show the full effect of each parameter on the model response, due to interactions with other parameters that are fixed to derive these plots as well as due to parameter co-correlation. However, they can be used for interpreting model behavior (Elith et al., 2008).

2.3.1 WLt transformation of WL

The map of water levels of this study was developed to improve the upscaling of greenhouse gas emissions from organic soils. Therefore, the final map should provide the highest accuracy for the water level range for which the highest differences of greenhouse gas emissions occur. This can be achieved by transforming WL into a transformed variable, WLt, which shows a linear relationship with GHG emissions. The sensitivity of greenhouse gas emissions to water level has been analyzed in several laboratory and field experimental and monitoring studies (Berglund and Berglund, 2011; Drösler et al., 2011; Hahn-Schöfl et al., 2011; Leiber-Sauheitl et al., 2014; Moore and Roulet, 1993; Moore and Dalva, 1993; van den Akker et al., 2012). General trends are a strong increase of methane (CH4) emissions for annual mean water levels of approximately > −0.1 m and an increase of CO2 emissions for water levels < −0.1 m, with a trend similar to a saturation function that levels out approximately between −0.4 and −0.8 m (Fig. 3a). While studies agree over these general trends, the exact shape of the transfer function and the maximum levels of emissions, as well as their dependence on soil properties and other environmental parameters, are still controversial. Here, we assume a hypothetical transfer function relating the normalized GHG budget, ranging from 0 to 1, to the water level (see also Fig. 3):

$$
\mathrm{GHG\ Balance} =
\begin{cases}
-e^{3(\mathrm{WL}+0.1)} + 1, & \mathrm{WL} \le -0.1 \\
1 - e^{-3(\mathrm{WL}+0.1)}, & \mathrm{WL} > -0.1
\end{cases}
\tag{1}
$$

As the GHG budget can be positive for both low and high WL, we introduced the transformed water level WLt as (Fig. 3)

$$
\mathrm{WL_t} =
\begin{cases}
e^{3(\mathrm{WL}+0.1)} - 1, & \mathrm{WL} \le -0.1 \\
1 - e^{-3(\mathrm{WL}+0.1)}, & \mathrm{WL} > -0.1
\end{cases}
\tag{2}
$$

By calibrating the model to both WL and WLt, we test if the optimization of WLt provides the highest model accuracy for the water level range relevant for GHG emissions and if it optimizes the map for application to GHG upscaling.
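For illustration, Eqs. (1) and (2) can be written as two short Python functions. The function names are assumptions; WL is taken in meters, negative below the ground surface, as in the text.

```python
import math


def ghg_balance(wl):
    """Hypothetical normalized GHG budget (0 to 1) as a function of WL, Eq. (1)."""
    if wl <= -0.1:
        return 1.0 - math.exp(3.0 * (wl + 0.1))
    return 1.0 - math.exp(-3.0 * (wl + 0.1))


def wl_transformed(wl):
    """Transformed water level WLt, Eq. (2)."""
    if wl <= -0.1:
        return math.exp(3.0 * (wl + 0.1)) - 1.0
    return 1.0 - math.exp(-3.0 * (wl + 0.1))


# Evaluate both functions for a few annual mean water levels (m).
levels = [-0.8, -0.4, -0.1, 0.0, 0.2]
table = [(wl, ghg_balance(wl), wl_transformed(wl)) for wl in levels]
```

WLt is zero at WL = −0.1 m, negative on the drier side and positive on the wetter side, and its absolute value equals the assumed GHG budget. A given error in WLt therefore corresponds to the same error in the assumed GHG budget anywhere within either branch, which is the linear relationship the transformation is meant to provide.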

2.3.2 Weighting scheme

When considering possible data weighting schemes, it is worth emphasizing at this point that the goal of this study is the development of a statistical model that can explain the water level variability both within a peatland and among different peatlands. The data of target and predictor variables for building this model is highly heterogeneous. First, the target variable data set contains peatland areas that strongly differ in their spatial extent and in the number of installed dip wells. Second, the predictor variable data set contains categorical and numerical data, and part of the predictor variables predominantly vary from peatland to peatland (e.g., climatic boundary conditions, large-scale topographic wetness index, peatland characteristics), whereas others also show within-peatland variability (e.g., land use, small-scale topographic wetness index, drainage network). As the influence of the individual predictor variables on our target WLt is expected to be rather diffuse due to abundant interactions with other site characteristics, the robustness of derived dependencies will strongly depend on the number of different peatlands in the data set.

There are no universal data weighting rules for similarly heterogeneous data situations, and some degree of expert judgment and subjectivity is inevitably involved when developing an appropriate scheme (Francis, 2011). The need for introducing a data weighting scheme is obvious: without data weighting during calibration, too much influence would be given to small and well-studied peatlands, which will reduce predictive model performance for large, less-well-studied peatland areas. To avoid this in a simple manner, the weight of each dip well could be divided by the number of dip wells in its peatland, which results in each peatland being equally weighted. This scheme, however, does not sufficiently use the high information content provided by well-studied large peatlands, which should have a higher impact on model calibration than a small peatland with only few dip wells.

Here, we propose a new weighting scheme that takes into account both factors, peatland size and local density of dip wells, to derive dip-well-specific weighting factors. It is based on principles of data uncertainty reduction by repeated measurements and of geostatistics. First, we consider our data situation as an analogue of meta-analysis with grouped data. It has been shown for homogeneous problems (all data from the same population) that the optimal group weight for meta-analysis is $1/\mathrm{SE}^2$ (Hedges and Olkin, 1985), with SE being the standard error of each group:

$$
\mathrm{SE} = \frac{\sigma_e}{\sqrt{N}},
\tag{3}
$$

where $\sigma_e$ is the error standard deviation of a measurement and $N$ is the number of measurements in a group. For homogeneous problems and uniform $\sigma_e$, this results in weights that are linearly dependent on $N$, which we here call the first end member of weighting. Heterogeneity (within-group variance) reduces the variation of the group weights, which can be shown by random effect models (Cumming, 2012). At the second end member of weighting, when heterogeneity totally dominates within-group variance, optimal group weights are uniform for all groups, i.e., weights are independent of $N$. We are not aware of a method that allows the estimation of the degree of heterogeneity for the complex target and predictor data situation in this study, including data (spatial and temporal variability, measurement error) and model errors (missing parameters). As a trade-off between $1/\mathrm{SE}^2$ (homogeneous end member) and 1 (heterogeneous end member), we decided on a group weight that is the inverse of the standard error, $1/\mathrm{SE}$, which is, for example, often used in econometric studies (Dickens, 1990). We emphasize that this is a subjective decision.

Figure 4. Sample semivariogram and fitted semivariogram model of the annual mean water level data WL.

The group weight $1/\mathrm{SE}$ is the basis for the geostatistical part of our weighting scheme. There are two reasons why we cannot directly treat our peatlands as groups. First, there is within-peatland variability, which is related to changing site characteristics. It is one objective of our study to describe this variability by statistical modeling. Thus, dip wells must be treated individually and data cannot be aggregated at the peatland level. Second, we expect the model to learn more when the same number of dip wells is installed in a larger peatland. In a small peatland, spatial autocorrelation between dip wells is higher, i.e., the information content is lower than for large peatlands. As a consequence of the first point, we do not aggregate and keep all dip wells in the target variable data set by attributing to each dip well the fraction $1/N$ of its group weight, so that the relative weights of the groups remain constant. As a consequence of the second point, we use principles of geostatistics in our weighting scheme. We replace the group size $N$ (positive integer number) by the “statistical” group size $n$ (positive continuous number, being > 1), which we derive from the spatial autocorrelation among the dip wells.

Therefore, we analyze the spatial autocorrelation structure of the data set. A single spherical variogram model was fitted to the sample variogram of all data (Fig. 4 in Sect. 3.1). Variogram models allow the differentiation of the total data variance (called “sill”) into a spatially uncorrelated variance (called “nugget”) and a spatially correlated variance (called “structural variance” and defined as sill − nugget) (Wackernagel, 2003). For any distance between two locations, the variogram model allows the derivation of the average squared difference of values, here defined as $\gamma$. By definition, at distance 0 the average squared difference equals the nugget, and at distances greater than the so-called “range” of spatial autocorrelation, the average squared difference equals the sill. Accordingly, the autocorrelated fraction $f$ of the average squared difference between two dip wells $i$ and $j$ is

$$
f_{ij} = \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}}.
\tag{4}
$$

We now define the “statistical” group size $n$ of each dip well $i$ to be the sum of one plus the autocorrelated fractions $f_{ij}$ of all dip wells that are within the range of spatial autocorrelation of $i$:

$$
n_i = 1 + \sum_{j=1}^{m} \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}}.
\tag{5}
$$

According to the discussion above, dip-well-specific weights can then be calculated with

$$
w_i = \frac{1}{n_i}\,\frac{1}{\mathrm{SE}_i} = \frac{1}{\sigma_{e,i}\sqrt{n_i}},
\tag{6}
$$

where $n_i$ is derived from Eq. (5). The equation shows that with increasing “statistical” group size $n$, i.e., with increasing spatial data density, the weight of an individual dip well is “down-weighted” to some degree, a behavior that corresponds to our initial intention to lower the influence of small peatlands compared to large ones. The error standard deviation $\sigma_e$ is dependent on several factors, e.g., the length of the time series, the temporal measurement density and the microtopography around the dip well. For simplicity, we here assumed $\sigma_e$ to be uniform for all dip wells, which simplifies Eq. (6) to $w_i = 1/\sqrt{n_i}$.

Only dip wells with the same land-use type were summed up with Eq. (5), which avoids the down-weighting by dip wells that have different land-use types. The latter are mostly characterized by fairly different WLt and thus by rather low spatial autocorrelation to dip well $i$.
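The weighting of Eqs. (4) to (6), including the restriction to dip wells of the same land-use type, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names and the planar coordinates in meters are hypothetical, the error standard deviation is taken as uniform (so that the weight reduces to one over the square root of the statistical group size), and the variogram parameters in the example are the fitted values reported in Sect. 3.1.

```python
import math


def spherical_gamma(h, nugget, sill, rng):
    """Spherical variogram model: equals the nugget at h = 0 and the sill for h >= rng."""
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def dip_well_weights(coords, land_use, nugget, sill, rng):
    """Weights w_i = 1 / sqrt(n_i), with n_i from Eq. (5) summed over
    dip wells of the same land-use type within the range."""
    weights = []
    for i, (xi, yi) in enumerate(coords):
        n_i = 1.0
        for j, (xj, yj) in enumerate(coords):
            if j == i or land_use[j] != land_use[i]:
                continue
            h = math.hypot(xi - xj, yi - yj)
            if h < rng:
                gamma = spherical_gamma(h, nugget, sill, rng)
                n_i += (sill - gamma) / (sill - nugget)  # f_ij, Eq. (4)
        weights.append(1.0 / math.sqrt(n_i))
    return weights


# Four hypothetical dip wells: two co-located grassland wells, one distant
# grassland well, and one arable well at the same spot as the first two.
coords = [(0.0, 0.0), (0.0, 0.0), (5000.0, 0.0), (0.0, 0.0)]
land_use = ["grassland", "grassland", "grassland", "arable"]
weights = dip_well_weights(coords, land_use, nugget=0.012, sill=0.09, rng=2700.0)
```

In this example the two co-located grassland wells share their information (statistical group size of 2, weight of about 0.71), whereas the distant grassland well and the arable well each keep a weight of 1.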

After spatial correlation has been accounted for, the sum of the weights of all dip wells of each land-use type was adjusted so that it corresponds to the fraction of this land-use type in Germany. This adjustment accounts for the overrepresentation in the data set of dip wells in unused peatlands and the underrepresentation of dip wells in arable land.

2.3.3 Model performance criteria

Model fit and predictive performance after cross-validation were quantified by the weighted root mean square error

$$
\mathrm{RMSE} = \sqrt{\frac{1}{\sum_{i=1}^{m} w_i} \sum_{i=1}^{m} \left( w_i \left( x_{\mathrm{o},i} - x_{\mathrm{s},i} \right)^2 \right)},
\tag{7}
$$

where $m$ is the number of dip wells, $x_{\mathrm{o},i}$ is observed WL or WLt of dip well $i$, $x_{\mathrm{s},i}$ is simulated WL or WLt of dip well $i$, and $w_i$ is the data weight of dip well $i$ (see Sect. 2.3.2). We refer to the root mean square error of the predicted data of cross validation as RMSEcv. Model performance was further quantified by Nash–Sutcliffe efficiency (NSE)

$$
\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{m} w_i \left( x_{\mathrm{o},i} - x_{\mathrm{s},i} \right)^2}{\sum_{i=1}^{m} w_i \left( x_{\mathrm{o},i} - \bar{x}_{\mathrm{o}} \right)^2},
\tag{8}
$$

where $\bar{x}_{\mathrm{o}}$ is the mean of all observed WL or WLt. NSE indicates how well observed vs. predicted values match the 1:1 line. It is a good overall indicator of predictive performance because it combines scatter and bias (common offset and/or slope difference from the 1:1 line) (Nash and Sutcliffe, 1970). Values greater than 0 signify a model that is better than the reference model based on the data mean. We refer to the NSE of the training data as NSEcal and of the predicted data of cross validation as NSEcv.

Systematic errors were quantified by calculating the model bias, here defined as

$$
\mathrm{BIAS} = \sum_{i=1}^{m} \left( w_i\, x_{\mathrm{o},i} - w_i\, x_{\mathrm{s},i} \right).
\tag{9}
$$
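Equations (7) to (9) translate directly into a few lines of Python. The sketch below is illustrative only; the function names are assumptions, and the mean of the observations in the NSE is computed as an unweighted mean, which is one possible reading of the definition given with Eq. (8).

```python
import math


def weighted_rmse(obs, sim, w):
    """Weighted root mean square error, Eq. (7)."""
    num = sum(wi * (o - s) ** 2 for o, s, wi in zip(obs, sim, w))
    return math.sqrt(num / sum(w))


def weighted_nse(obs, sim, w):
    """Weighted Nash-Sutcliffe efficiency, Eq. (8); unweighted observed mean assumed."""
    mean_obs = sum(obs) / len(obs)
    num = sum(wi * (o - s) ** 2 for o, s, wi in zip(obs, sim, w))
    den = sum(wi * (o - mean_obs) ** 2 for o, wi in zip(obs, w))
    return 1.0 - num / den


def weighted_bias(obs, sim, w):
    """Model bias as defined in Eq. (9)."""
    return sum(wi * o - wi * s for o, s, wi in zip(obs, sim, w))


# Hypothetical observed and simulated values with data weights.
obs = [-0.2, -0.4, -0.6]
sim = [-0.25, -0.35, -0.55]
w = [1.0, 0.5, 0.5]
scores = (weighted_rmse(obs, sim, w), weighted_nse(obs, sim, w),
          weighted_bias(obs, sim, w))
```

A perfect simulation yields RMSE = 0, NSE = 1 and BIAS = 0, and a simulation that always returns the observed mean yields NSE = 0, which is the reference model mentioned in the text.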

2.4 Model uncertainty and stability evaluation

Uncertainty of the model predictions was assessed by bootstrapping, cross-validation and residual analysis.

For the bootstrapping analysis, we followed the procedure of Leathwick et al. (2006). We estimated the confidence intervals around the predictions and the fitted functions by taking 1000 bootstrap samples of the 53 peatlands. The number of peatlands in each sample was equivalent to the data set, but peatlands were selected randomly with replacement. Using the predictor variables of the final model, a BRT model was fitted to each sample. Cross validation was again performed on peatlands; thus, a peatland in the calibration data set was not part of the cross-validation data set, to avoid overly optimistic results. Variances of the predictions and of the fitted functions of the 1000 models were evaluated.
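The resampling step of this bootstrap, drawing whole peatlands with replacement, can be sketched in Python as follows. Only the resampling is shown; fitting a BRT model to each sample is left out. The function name and the input format (one peatland label per dip well) are assumptions made for illustration.

```python
import random


def bootstrap_peatlands(peatland_ids, n_samples=1000, seed=0):
    """Draw bootstrap samples of whole peatlands with replacement.

    Each sample contains as many peatland draws as there are distinct
    peatlands and is returned as a list of dip-well indices (a peatland
    drawn twice contributes its dip wells twice).
    """
    rng = random.Random(seed)
    unique = sorted(set(peatland_ids))
    wells = {p: [i for i, q in enumerate(peatland_ids) if q == p]
             for p in unique}
    samples = []
    for _ in range(n_samples):
        drawn = [rng.choice(unique) for _ in unique]
        samples.append([i for p in drawn for i in wells[p]])
    return samples


# Six hypothetical dip wells in three peatlands.
ids = ["A", "A", "B", "C", "C", "C"]
samples = bootstrap_peatlands(ids, n_samples=20, seed=1)
```

Resampling at the peatland level rather than the dip-well level keeps the spatially correlated wells of a peatland together, so the spread across the fitted models reflects the limited number of independent peatlands.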

If data sets are relatively small (e.g., n < 1000; De’ath, 2007), then the small size of the training and test data sets lowers model accuracy. Given the fairly small number of peatlands in the data set and the partly high spatial correlation of dip wells within these peatlands, we decided not to split the data set into a training and test data set. Estimates of model accuracy can then be based on cross-validation, thereby making effective use of all the data (De’ath, 2007). The prediction uncertainty of the final model is estimated by the root mean square error of prediction (RMSEcv, see above) for each land cover class. After testing for near-normal distribution of the residuals, RMSEcv can be used to derive the 68 and 95 % confidence intervals of the predictions with RMSEcv and 2 × RMSEcv, respectively.

Finally, additional residual analysis was performed to evaluate whether the predictions are biased for different land cover classes or geographical regions.

2.5 Regionalization

In the final regionalization step, the predictor variables contributing to the final model were determined at a 25 × 25 m raster for all organic soil in Germany. Predictor variables were determined with the same map input that was used for model building. Land cover information, including information on ditches, was based on the data from year 2012, and the climatic data was based on the average of the last 30 years. The fine spatial resolution of 25 × 25 m does not imply a correspondingly high spatial accuracy of the model. Rather, this fairly fine scale was necessary to map the relatively small-scale effects of the topography, land use and peatland geometry variables. The final model was then used to make a prediction for each of these raster cells.

3 Results and discussion

3.1 Spatial correlation structure of the data set

The variogram model fitted to the sample variogram provided a nugget (0.012 m², 0.11 m), a sill (0.09 m², 0.3 m) and a range of spatial correlation (2700 m) for our data set of WL (Fig. 4). The nugget represents the very small-scale soil hydraulic variability and micro-topography effects on WL (van der Ploeg et al., 2012) and measurement error, e.g., by differences in the determination of the ground surface and in the timing of the manual measurements. Furthermore, micro-topography (e.g., hummocks) and oscillating peat surfaces of wet peatlands pose a challenge for an accurate determination of both ground surface and water level. The water level time series in the data set were of different lengths and ranged from 1 to 20 years. Interannual variability of water levels can be large (e.g., Knotters and van Walsum, 1997). For simplicity, in our analysis data were not harmonized by extrapolating WL time series using weather data to a 30-year period. Thus, the nugget also includes errors that are introduced by dip wells with different measurement periods that are located in the range of spatial correlation. In consideration of these error sources, the fitted nugget of 0.11 m appears to be a realistic value. At 0.3 m, the fitted sill matched nearly perfectly with the standard deviation of the data (0.31 m), which indicates consistency between semivariogram model and data set. The fitted range of spatial correlation of 2700 m reflects both physical effects, i.e., the average range of lateral flow due to hydraulic gradients, and the effect of average land-use patterns in Germany on the spatial correlation of WL. Fitted values were used in the calculation of the dip-well-specific weights using Eq. (6).
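To make the fitted values concrete, the following Python sketch evaluates a spherical variogram model with the reported nugget, sill and range, and derives the autocorrelated fraction of Eq. (4) as a function of distance. The function names are assumptions; the convention that the model equals the nugget at distance 0 follows the definition given in Sect. 2.3.2.

```python
import math

NUGGET = 0.012   # m^2, fitted nugget
SILL = 0.09      # m^2, fitted sill
RANGE = 2700.0   # m, fitted range of spatial correlation


def spherical_gamma(h):
    """Spherical variogram model with the fitted parameters."""
    if h >= RANGE:
        return SILL
    r = h / RANGE
    return NUGGET + (SILL - NUGGET) * (1.5 * r - 0.5 * r ** 3)


def autocorrelated_fraction(h):
    """Autocorrelated fraction f of Eq. (4) for two dip wells at distance h (m)."""
    return (SILL - spherical_gamma(h)) / (SILL - NUGGET)


# Square roots convert the variances (m^2) to the values in meters quoted above.
nugget_m = round(math.sqrt(NUGGET), 2)
sill_m = round(math.sqrt(SILL), 2)
fractions = {h: autocorrelated_fraction(h) for h in (0.0, 1350.0, 2700.0, 3000.0)}
```

At half the range (1350 m) the autocorrelated fraction has already dropped to about 0.31, so a neighboring dip well at that distance adds roughly 0.31 to the statistical group size in Eq. (5), and a dip well beyond 2700 m adds nothing.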

3.2 Typical water levels for land-use types in German organic soils

The land cover classes are characterized by plausible mean and median water levels, which show consistent differences between each other (Table 2 and Fig. 5a). The mean values of arable land and grassland agree with what can be expected from their agronomic requirements, with slightly lower water levels for arable land. The high variability observed for both classes may be related to the variable efficiency of the installed drainage systems, for example the presence and condition of tile drains and the depth of ditches. Grasslands can be managed with very variable intensity, which is partly reflected in different water levels. Figure 5a further shows that deciduous forests seem to dominate on slightly drier organic soils compared to coniferous forests, which dominate under wetter conditions. A high variability of water levels is observed for the land cover class unused peatland. On the one hand, post peat-cutting topography increases the variability of WL over short distances, which probably contributes to the high variance observed for this class. On the other hand, this class comprises both rather dry unused peatlands and wetter peatlands in which re-wetting measures have already taken place but which do not yet show a wet soil attribute in the ATKIS Digital Landscape Model. This may also cause part of the variance observed in the grassland and forest land cover classes. All wet land cover classes (reed, wet grassland, wet forest and wet unused peatland) that were separated by wetness indication clearly show higher water levels, showing that the wetness attribute of the Digital Landscape Model is a useful attribute.

Figure 5b shows the transformed water level for all classes. It can be observed that the variances of the wetter land cover classes increase relative to the variances of the dry land cover classes. This is due to the highest sensitivity of GHG emissions in the wet range of water levels (> −0.5 m). Consequently, the rather high variance of WL for arable land corresponds to a rather low variance of WLt, i.e., to a rather low assumed effect of WL variability on the GHG budget.

3.3 BRT model calibration and validation: WL vs. WLt

In contrast to land cover class, the other predictor variables showed, if at all, only weak relations to WL and WLt when evaluating them with box plots, 2-D cross plots and simple correlation matrices. Here, we expected BRT to detect the strongest predictor interactions and to identify the most informative predictors.

After model calibration with all predictors, subsequent model simplification successively dropped those parameters with correlation > 0.7 and the lowest contribution. For both WL and WLt, model performance improved during this simplification. For WLt, the highest values of NSEcv of approximately 0.46 were achieved with 21 to 9 model parameters. The development of NSEcv for the last 50 parameters is shown in Fig. 6. Further elimination of parameters led to a pronounced decline of model performance. Similar behavior was observed for the calibration on WL. In favor of a more parsimonious model, we chose the model with the lowest number of parameters before the pronounced decline of model performance occurred. For the calibration on WLt, this corresponded to the model with the lowest number of parameters that still achieved NSEcv values of > 0.45 (Fig. 6). The final WLt model comprised nine predictor variables and the final WL model seven parameters. The percentages of parameter contributions to the final model and their individual influences are discussed for WLt in Sect. 3.4.

Figure 5. Water level relative to ground surface, WL (m), and transformed water level, WLt (−), by land cover class, illustrated as a weighted box plot. WLt = −1 corresponds to maximum CO2 emissions and WLt = 1 to maximum CH4 emissions. On the top horizontal axes, the number of dip wells in each class is indicated.

Figure 6. NSEcv as a function of the number of predictor variables used in the model of WLt during model simplification, shown for the last 50 parameter drops.
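The selection rule described above (keep the smallest model whose cross-validated NSE still exceeds a threshold) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the NSE is computed unweighted, whereas the study works with dip-well-specific weights, and the simplification curve below consists of invented numbers that only mimic the shape of Fig. 6. All function names are hypothetical.

```python
def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / total sum of squares (unweighted)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def most_parsimonious(nse_by_n_predictors, threshold):
    """Smallest predictor count whose cross-validated NSE exceeds the threshold."""
    eligible = [n for n, score in nse_by_n_predictors.items() if score > threshold]
    return min(eligible) if eligible else None


# Hypothetical simplification curve (number of predictors -> NSEcv);
# the values are invented for illustration and are not taken from the study.
curve = {21: 0.458, 15: 0.460, 12: 0.457, 9: 0.453, 8: 0.431, 7: 0.402}
chosen = most_parsimonious(curve, threshold=0.45)
```

The rule is only meaningful when performance declines steadily once it drops below the threshold, as the text describes; for a noisy curve one would inspect the plot rather than rely on the minimum alone.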

Table 3 summarizes the statistical performances of the models calibrated on WL and WLt. For both models, NSEcal is considerably higher than NSEcv and shows the commonly observed overfitting behavior of BRT models. The different measures that we took to minimize overfitting (cross-validation on peatlands, restriction to monotonic responses, and model simplification including elimination of highly correlated variables) lowered the difference between NSEcal and NSEcv but could not totally avoid overfitting. The NSEcv of the WLt model (0.453) indicates higher predictive model performance compared to the WL model (0.381). However, as the data ranges differ due to the transformation, this comparison may be misleading. Therefore, we transformed the predictions of the WL model to obtain WLt values from this model and calculated the performance criteria in the same way (Table 3, second column). NSEcv then increases slightly (0.397) but does not reach the value of the model that was calibrated on WLt. A better predictive performance of the model calibrated on WLt is also visible in the RMSEcv values. The total RMSEcv as well as the RMSEcv values for the dry (WL < −0.3 m) and wet range (WL > −0.3 m) show slightly lower values for the WLt model compared to WLt values from the model calibrated on WL. Given our hypothetical transfer function (Fig. 3), in which the GHG budget is linearly related to WLt, the higher accuracy of WLt predictions directly corresponds to a higher accuracy of GHG budget predictions.

Table 2. Weighted mean and standard deviation (sd) of WL and WLt data and of the WLt map presented in Sect. 3.6 for the nine land cover classes.

Land cover class     WL (m)          WLt (−)         WLt (−)
                     Mean ± sd       Mean ± sd       Map mean ± sd
Arable land          −0.69 ± 0.30    −0.76 ± 0.17    −0.66 ± 0.22
Deciduous f.         −0.45 ± 0.34    −0.49 ± 0.37    −0.47 ± 0.35
Grassland            −0.44 ± 0.29    −0.52 ± 0.32    −0.49 ± 0.30
Unused peatl.        −0.39 ± 0.36    −0.39 ± 0.41    −0.37 ± 0.40
Coniferous f.        −0.36 ± 0.36    −0.37 ± 0.37    −0.46 ± 0.35
Wet unused peatl.    −0.22 ± 0.27    −0.18 ± 0.40    −0.17 ± 0.36
Wet forest           −0.22 ± 0.29    −0.17 ± 0.43    −0.21 ± 0.39
Wet grassland        −0.10 ± 0.14    −0.00 ± 0.31    −0.15 ± 0.39
Reed                 −0.01 ± 0.17     0.20 ± 0.29    −0.06 ± 0.32

Table 3. Performance criteria of the different models; dry range defined as WL < −0.3 m and wet range as WL > −0.3 m.

               WL (m)             WLt (−)            WLt (−)
               (calibrated        (calibrated        (calibrated
               on WL)             on WL)             on WLt)
NSEcal          0.627              0.559              0.642
NSEcv           0.381              0.397              0.453
RMSEcv          0.269              0.299              0.284
RMSEcv,dry      0.284              0.263              0.259
RMSEcv,wet      0.222              0.382              0.355
Bias           −0.003              0.083              0.002
Bias,dry       −0.012              0.070              0.003
Bias,wet        0.021              0.120              0.000

Superior model performance is also evident when evaluating model bias. Only when calibrating directly on WLt are the WLt predictions bias-free. Calibration on WL and subsequent transformation to WLt introduces a model bias towards systematically lower WLt values. In subsequent applications to GHG emission upscaling, lower WLt values would lead to an overestimation of CO2 emissions and to an underestimation of CH4 emissions.
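The range-specific criteria of Table 3 can be computed as sketched below. The sign convention for bias (observation minus prediction) is an assumption; it is chosen here because it matches the residual definition in Fig. 10 and makes a positive bias correspond to predictions that are systematically too low. The unweighted averaging, the helper names and the toy numbers are likewise assumptions for illustration only.

```python
import math


def rmse(obs, pred):
    """Root mean square error of paired observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def bias(obs, pred):
    """Mean residual; assumed sign convention: observation minus prediction."""
    return sum(o - p for o, p in zip(obs, pred)) / len(obs)


def criteria_by_range(wl, obs, pred, threshold=-0.3):
    """RMSE and bias for the dry (WL < threshold) and wet (WL > threshold) range."""
    result = {}
    for name, keep in (("dry", lambda w: w < threshold),
                       ("wet", lambda w: w > threshold)):
        pairs = [(o, p) for w, o, p in zip(wl, obs, pred) if keep(w)]
        o_sel = [o for o, _ in pairs]
        p_sel = [p for _, p in pairs]
        result[name] = {"rmse": rmse(o_sel, p_sel), "bias": bias(o_sel, p_sel)}
    return result


# Toy data, invented for illustration: observed WL (m) decides the range,
# obs/pred are observed and predicted WLt values.
wl = [-0.8, -0.5, -0.2, -0.1]
obs = [-0.9, -0.6, 0.1, 0.3]
pred = [-0.8, -0.7, 0.0, 0.1]
stats = criteria_by_range(wl, obs, pred)
```

In the toy data the wet range has a positive bias, i.e., the predictions are too low on average, which is the direction of the bias reported for the model calibrated on WL.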

3.4 Influence of predictor variables on WLt

Given the beneficial characteristics of the model calibrated on WLt for GHG upscaling, presentation and discussion of further model results is restricted to the WLt model.

The BRT method allows the analysis of the parameter contributions to and influences on the model (Elith et al., 2008) and thus may contribute to system understanding. The percentages of the contributions of the nine predictor variables to the final model ranged from 25.2 to 5.6 % (Fig. 7). Except for protection status, at least one parameter of each of the seven parameter groups contributed to the final model. All protection status information was dropped early during the simplification process due to low contribution, although WL showed slightly higher values for data from nature protection areas or special areas of conservation. However, other parameters seem to be able to fully compensate for the information that is lost by dropping this predictor.

Land cover class lc at the dip well was the parameter with the strongest contribution (25.2 %). It basically follows the trend illustrated in Fig. 5b. The bootstrap error, plotted as standard deviation (Fig. 7), shows the variation of this influence over the 1000 bootstrap models. A second land cover parameter, the fraction of dry land cover classes on organic soils in a buffer of 2500 m radius, fdry (2500), contributed 10.3 % to the model. The monotonic decrease of WLt with increasing fdry (2500) is plausible, as higher values reflect intensive land use in the surroundings of the dip well and thus indicate intensive artificial drainage. Together, both parameters contributed 35.5 %, and thus land cover represents the parameter group with the strongest model contribution.

Peatland characteristics are the second most important parameter group. The peatland type contributed 16 %. The model indicates that peatlands without any connection to surface water bodies (river or lake) and the class of other organic soils are characterized by lower WLt compared to the peatland types lowland bog, upland bog and fen neighboring surface water. As the class of other organic soils is generally expected to reflect lower water levels, and as surface water may have a stabilizing effect on water levels of organic soils, the influence of the peatland type can be considered plausible. Besides peatland type, the substrate of the peat base contributes 5.6 %. Here, organic soils overlying peat clay layers (e.g., limnic sediments such as calcareous gyttja) or basement rock are characterized by higher WLt compared to organic soils overlying unconsolidated rock. This can be explained by the lower drainage resistance of unconsolidated rocks, which may cause an increased efficiency of anthropogenic drainage and/or a generally higher vulnerability to seepage losses. Finally, slightly lower WLt values are indicated by a high fraction of organic soils for the 500 m buffer, fpeat (500). This may reflect the higher land-use pressure on large peatlands compared to rather small peatlands, which are presumably more easily preserved by nature protection efforts.

The remaining four parameter groups are represented in the model by only one parameter each. The third most influential parameter was the length of ditches on arable land and grassland for the 250 m buffer, dilendry (250). At first glance, it may be surprising that WLt values tend to be higher with increasing ditch density, as ditches are supposed to drain the water when land is used as arable land and grassland. The fact that the model identifies a rather strong effect in the opposite direction may be caused by incomplete information about the drainage network. There is no detailed information about the spatial distribution of tile drains. Based on expert knowledge, agricultural areas with a lower ditch density are more likely to have tile drains. As these drains, easily installed with a narrow drain spacing, are more effective at draining organic soils, low WLt values for arable land and grassland may be related to low ditch densities. Furthermore, ditches were originally dug at narrow spacing in especially wet areas of organic soils, but there is no information available on whether these ditches still function properly.

The parameters wbsummer, hrel and tiras25 all show expected trends. The model predicts higher WLt for increasing climatic water balance in the summer period (May to October), wbsummer, for dip wells located in depressions (low values of hrel), and for higher small-scale topographic wetness indices calculated on the 25 × 25 m digital elevation model (tiras25).


Figure 7. Partial dependence plots for the predictor variables. For an explanation of variables, see Table 1. The y axes are on the WLt scale and are centered around the mean WLt. Error bars and grey area indicate the standard deviation of the response over 1000 bootstrap models. The relative contribution of each predictor is indicated as a percentage. The lines along the x axes of each plot show the distribution of data across that variable in deciles.

The fact that all parameters show expected or explainable responses in the model corroborates the reliability of the calibrated WLt model. The standard deviation of the predictor responses based on the bootstrap samples shows the stability of the observed responses.

Further insight into model behavior can be obtained by analyzing parameter interactions. This is done by changing two parameters simultaneously while keeping all other parameters at their mean values (Elith et al., 2008). Figure 8 shows the two strongest parameter interactions. Parameter wbsummer strongly interacts with ptype. The generally lower values of WLt of fens without surface water connection and other organic soils show a stronger dependency on the summer climatic water balance. While a summer climatic water balance of > −80 mm shows only a rather weak effect on WLt for the wetter peatland types, there is still a strong effect of increasing wbsummer for the two drier peatland types. The trend for wbsummer > 130 mm for the dry peatland types is supported by seven different peatlands.
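The interaction analysis described above (vary two predictors over a grid, hold all others at their mean values, record the model response) can be sketched generically. The code below is a minimal illustration and not the BRT implementation used in the study: `toy_model` is an invented stand-in with arbitrary coefficients, and the predictor keys and the "wet"/"dry" peatland labels are hypothetical.

```python
def interaction_surface(predict, mean_row, name_a, values_a, name_b, values_b):
    """Model response on a grid of two predictors, all others held at mean_row."""
    surface = []
    for a in values_a:
        row = []
        for b in values_b:
            x = dict(mean_row)   # copy so the reference row stays unchanged
            x[name_a] = a
            x[name_b] = b
            row.append(predict(x))
        surface.append(row)
    return surface


def toy_model(x):
    """Invented stand-in for a fitted model; coefficients are arbitrary."""
    slope = 2.0 if x["ptype"] == "dry" else 0.5
    return 0.002 * slope * x["wb_summer"] - 0.3 * x["f_dry"]


mean_row = {"wb_summer": 0.0, "ptype": "wet", "f_dry": 0.5}
surface = interaction_surface(
    toy_model, mean_row,
    "wb_summer", [-100.0, 0.0, 100.0],
    "ptype", ["wet", "dry"],
)
# Response range across wb_summer for each peatland type (column of the grid).
spread_wet = surface[-1][0] - surface[0][0]
spread_dry = surface[-1][1] - surface[0][1]
```

A larger spread for one category than for the other, as in this toy example, is the kind of pattern that shows up as an interaction in plots such as Fig. 8a.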


Figure 8. Partial dependence plots representing the two strongest interactions in the model: (a) between ptype and wbsummer and (b) between pbase and fdry. Fitted WLt is plotted on the y axis, which is obtained after accounting for the average effect of all other predictor variables.

Another strong interaction is observed for pbase and fdry (2500). While a rather weak effect of the fraction of arable land and grassland is observed for organic soils overlying basement rock and peat clay layers, a strong effect is observed for organic soils overlying unconsolidated rock. This interaction reflects the higher lateral range of drainage effects for organic soils with little flow resistance at the peat base. In these organic soils, intensive land use lowers the water level over large areas.

3.5 Discussion of model uncertainty

Plotting observed vs. predicted WLt from cross-validation (Fig. 9) illustrates the rather large residual variance that cannot be explained by the model. As indicated by the higher RMSEcv for the wet range (Table 3), scatter increases with increasing WLt. Error bars in the y direction indicate data error derived from the nugget of the variogram. It is shown for a few data points as an example. Due to the transformation, data error increases for higher WLt. Figure 9 demonstrates that the fraction of unexplainable variance related to data error is much higher for the wet than for the dry range. Bootstrap error, indicating the variation of the model predictions for 1000 bootstrap samples, is shown in the x direction for the same data points. Bootstrap error is lower than the data error for the wet range and slightly higher for the dry range.

Bootstrap errors demonstrate the sensitivity of model predictions to changes of the data set used for calibration. When a model possesses structural deficits, such as missing predictor variables, bootstrap errors should not be used to define confidence intervals for the model predictions. Figure 10 shows residuals from cross-validation and the standard deviation of bootstrap predictions for all land cover classes. The residuals of each land cover class show near-normal distributions. For five of the nine land cover classes (wet forest, wet unused peatland, arable land, coniferous forest and reed), the Shapiro–Wilk test does not reject normality (p > 0.05). Figure 10a further indicates that residuals of each land cover class scatter fairly well around zero, indicating low bias for the various land cover classes. Land-cover-class-specific confidence intervals of model predictions can thus be derived from the RMSEcv of each land cover class, e.g., 2 × RMSEcv representing the 95 % confidence interval.

Figure 9. Observed vs. predicted transformed annual mean water level (WLt) from cross-validation results. Error bars show selected data and bootstrap model errors as standard deviation. Data points are scaled by their weights.
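The class-specific confidence intervals mentioned above follow directly from the cross-validation residuals. The sketch below assumes unweighted residuals and approximately normal, unbiased errors within each class; the class labels and residual values are invented for illustration, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def class_rmse(classes, residuals):
    """RMSE of cross-validation residuals, computed separately per land cover class."""
    squared_sum = defaultdict(float)
    count = defaultdict(int)
    for c, r in zip(classes, residuals):
        squared_sum[c] += r * r
        count[c] += 1
    return {c: math.sqrt(squared_sum[c] / count[c]) for c in squared_sum}


def interval_95(prediction, rmse_cv):
    """Approximate 95 % interval as prediction +/- 2 x RMSEcv."""
    half_width = 2.0 * rmse_cv
    return (prediction - half_width, prediction + half_width)


# Invented residuals (observation minus prediction) for two classes.
classes = ["reed", "reed", "arable land", "arable land"]
residuals = [0.3, -0.3, 0.1, -0.1]
rmse_by_class = class_rmse(classes, residuals)
low, high = interval_95(0.0, rmse_by_class["reed"])
```

Because the interval width comes from cross-validation rather than from the bootstrap spread, it also reflects the structural model deficits discussed in this section.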

The prediction uncertainty derived from cross-validation is much higher than the bootstrap prediction uncertainty obtained from the bootstrap standard deviation (sd), with 2 × sd corresponding to the 95 % confidence interval (Fig. 10). The large difference between these values indicates that the model has structural deficits that can be attributed to several error sources.

i. Key influences on WLt are missing in the set of predictor variables. None of the predictor variables indicates whether and to what extent a water level increase due to re-wetting measures took place in the last years. Wetness indicators (wet soil and/or vegetation attributes) that are obtained from the Digital Landscape Model probably react with a delay of several years. Thus, we expect the occurrence of several observed high WLt values that cannot be explained by any of the predictor variables.

Figure 10. (a) Residuals (observation − prediction) of WLt predictions and (b) standard deviation (sd) of bootstrap predictions, shown for the nine land cover classes. In the upper part, the number of dip wells in each class is indicated.

ii. Small-scale topography that is not represented with sufficient detail and accuracy in the DEM may cause several predictions to strongly differ from what would be expected from the other predictor variables. A common example may be a dip well that is located on a narrow peat ridge, which remained after peat-cutting and is absent in the DEM, and that is situated in an area classified as wet soil by the Digital Landscape Model. Then the model indicates a WLt that is much higher than the observed WLt, as for the observed value the reference surface was the surface of the peat ridge.

iii. Consistent information about tile drains is missing and only exists on the regional scale (Tetzlaff et al., 2009). At the national scale, however, there are no maps on tile drains. Tile drains are known to have a strong effect on WLt for arable land and grassland. As explained above, we expect parameter dilendry (250) to partially compensate for this missing information.

iv. Another source of prediction uncertainty may be inconsistent and erroneous land cover classification of the Digital Landscape Model due to the high degree of subjectivity for many of the attributes. Furthermore, the temporal accuracy of the Digital Landscape Model may be as inaccurate as 5 years, which can cause time series with land-use change to be split at the wrong date and vegetation and wetness attributes to be not yet updated to the current conditions.

Figure 11. Residuals (observation − prediction) of WLt predictions for the three major geographical peatland regions of Germany. In the upper part, the number of dip wells in each class is indicated.

v. The water balance of fens strongly depends on the size and the hydraulic head of the groundwater catchment, i.e., of the aquifer underlying the peat layer. Unfortunately, there is no consistent map of hydraulic heads or groundwater catchments for all of Germany.

We checked model predictions for geographical bias. Geographical location was not one of the model parameters. However, the history and policy of land use on organic soils, current ditch water management and climate do show large-scale geographical trends. We divided our data set into the three major German peatland regions (NE, NW and S) and evaluated the model residuals (Fig. 11) to see whether our model is biased due to important missing geographical effects. A serious bias for any of the three major German peatland regions cannot be identified.


Figure 12. Map of predictions of transformed annual mean water level (WLt) for all German organic soils (a) and an enlarged map section (b). The probability distribution in (c) indicates the uncertainty of a specific point prediction for wet grassland as an example. Here, the predicted value is approximately WLt = 0, but note that wet grassland predictions do vary in space depending on the values of the other model parameters. The histogram shows the residuals from cross-validation for wet grassland, to which the probability distribution was fitted.

When applying calibrated statistical models during regionalization, it is important to check model behavior for extrapolation outside the range of the parameter space that is covered by the data upon which the model was built. BRT always extrapolates at a constant value from the most extreme environmental value in the training data. In contrast to other types of statistical models, e.g., generalized linear models, BRT does not continue the fitted trend beyond the last observation. Regarding the categorical variables, the data set covers all classes occurring in Germany with several peatlands each. The data set also covers the major range of values occurring in Germany for the numerical predictor variables. Furthermore, Fig. 7 indicates that the constant values at which the model extrapolates the influence of the variables do not raise major concern for any extreme predictions outside the parameter range.
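The constant extrapolation of tree-based models can be demonstrated with a very small boosted ensemble. The code below is a pure-Python toy (one-dimensional regression stumps, squared-error boosting with a fixed learning rate); it is not the BRT software used in the study, and the training data are an invented linear trend. It only illustrates that every prediction beyond the outermost training value equals the prediction at that value.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump on 1-D data (least squares)."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        split = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, split, left_mean, right_mean)
    return best[1:]


def fit_boosted_stumps(xs, ys, n_trees=200, learning_rate=0.1):
    """Boost stumps on the residuals of the running prediction."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        split, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((split, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= split else right_mean)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(
        left_mean if x <= split else right_mean
        for split, left_mean, right_mean in stumps)


# Invented training data: a linear trend observed for x between 0 and 10.
xs = [float(i) for i in range(11)]
ys = [0.1 * x - 0.8 for x in xs]
model = fit_boosted_stumps(xs, ys)
```

Here `predict(model, 1000.0)` returns the same value as `predict(model, 10.0)`, whereas a linear model fitted to the same data would continue the trend far beyond the observed range.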

3.6 Regionalization

The map of WLt resulting from the application of the fitted WLt model to all grid cells shows gradients at the regional scale (Fig. 12a). In the south of Germany, for example, a gradient from wet to dry can be observed for the pre-alpine upland bogs and the peatlands of the moraine plain. In the north of Germany, the map indicates that organic soils in the very NE are wetter than the rest. For the rest of the north, a slight gradient can be observed from less dry to dry from NW to E, which is mainly driven by the higher summer climatic water balance in the NW. As both categorical and numerical predictor variables also vary at the sub-regional scale, the resulting map also shows gradients within peatland areas, e.g., due to small-scale land-use and ditch density gradients and topography effects (Fig. 12b).

We calculated WLt averages of the land cover classes using the regionalized WLt from the map (Table 2, column 3). The given standard deviation comprises both the variability within a land cover class that is explained by the model and the uncertainty of each prediction. Resulting means and standard deviations differ slightly from the corresponding values of the data set. The land-cover-specific WLt values obtained from the map can be considered more representative, as the regionalization procedure is supposed to partly account for potential bias in the data set.

When applying this map and its predicted WLt values in subsequent GHG upscaling, it is crucial that model uncertainty is propagated properly. An example demonstrates the necessity of uncertainty propagation. For a grid cell classified as wet grassland, the probability distribution of WLt is shown based on a normal distribution that was fitted to the residuals of this land cover class (Fig. 12c). Without propagating the uncertainty, i.e., when only translating the predicted WLt (possibly in combination with other parameters, e.g., soil properties) into a GHG budget, the GHG budget is strongly underestimated, as the WLt prediction is close to zero, indicating neither large CO2 nor large CH4 emissions. When translating the full distribution of WLt into a GHG budget, the resulting GHG budget would be much higher, as the GHG budget increases on both sides of the predicted WLt.
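The effect of uncertainty propagation in the wet grassland example can be made concrete with a short calculation. The sketch below rests on assumptions: `ghg_budget` is a hypothetical V-shaped response with its minimum at WLt = 0 and arbitrary units, not the transfer function of the study, and the standard deviation of 0.3 is an illustrative value. The expectation over the normal distribution is obtained by simple midpoint integration.

```python
import math


def ghg_budget(wlt):
    """Hypothetical V-shaped GHG response (arbitrary units), minimum at WLt = 0."""
    return abs(wlt)


def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def expected_budget(mean, sd, n=20001, width=6.0):
    """Expected GHG budget under N(mean, sd), midpoint rule over +/- width * sd."""
    lower = mean - width * sd
    step = 2.0 * width * sd / n
    total = 0.0
    for i in range(n):
        x = lower + (i + 0.5) * step
        total += ghg_budget(x) * normal_pdf(x, mean, sd) * step
    return total


# Point prediction close to zero, as in the wet grassland example.
point_estimate = ghg_budget(0.0)
propagated = expected_budget(0.0, 0.3)
```

With this toy response the point estimate is zero, while the propagated expectation is about 0.24 (equal to sd × sqrt(2/π) for a normal distribution centered on zero), which illustrates why translating only the predicted value underestimates the budget.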

3.7 Possible paths for model improvement

The model performance that is achieved by the statistical approach presented in our study raises the question of whether collecting more WL data can improve model performance, or whether the factor that is constraining the model performance is the limited strength of the nation-wide available predictor variables. To assess this question, additional “hold-out models” were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated and model performance was evaluated with NSEcv. As expected, results indicate an increase of model performance with increasing number of peatlands used in the model building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when including many more peatlands in the data set. More promising would be the specific collection of more data on the weakly represented and/or important land cover classes arable land and grassland.
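The hold-out experiment described above amounts to a learning curve over random subsets of peatlands. A minimal sketch of the resampling loop follows; the `evaluate` callback stands in for "fit the BRT model on these peatlands and return NSEcv" and is replaced here by an invented saturating function, so the numbers carry no meaning beyond illustrating the procedure. The small number of repeats in the example is chosen for brevity, whereas the study used 500 selections per size.

```python
import random
import statistics


def learning_curve(peatland_ids, sizes, n_repeats, evaluate, seed=0):
    """Mean and sd of a score over random peatland subsets of each size."""
    rng = random.Random(seed)
    unique_ids = sorted(set(peatland_ids))
    curve = {}
    for size in sizes:
        scores = [evaluate(rng.sample(unique_ids, size)) for _ in range(n_repeats)]
        curve[size] = (statistics.mean(scores), statistics.stdev(scores))
    return curve


def toy_evaluate(subset):
    """Invented placeholder for model fitting plus cross-validation."""
    return 0.5 * (1.0 - 1.0 / len(subset) ** 0.5)


# 53 peatlands as in the data set; subset sizes from 10 to 50.
curve = learning_curve(range(53), sizes=[10, 30, 50], n_repeats=5,
                       evaluate=toy_evaluate)
```

Sampling whole peatlands rather than individual dip wells keeps the subsets consistent with the cross-validation on peatlands used elsewhere in the study.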

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nation-wide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that has been successfully conducted in Finke et al. (2004) using the transfer function-noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, e.g., spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± sd.

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data cover only a small fraction of the peatlands of the final map and thus spatial interpolation is not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG-relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. With the analysis of their effect on the water level, it was possible to gain insight into natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions depend linearly. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions, and avoided model bias.

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show, however, that model improvement by increasing the number of WLt data seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively few data are available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, for example water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach is suited to application to any other country where similar data of target and predictor variables are available. It is important that the spatial resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if detailed information on peat properties becomes available for specific regions and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support; Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support; Annette Freibauer for reviewing the paper draft; and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project “Organic soils” funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43, 5, 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirby, M.: A physically based variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De'ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz – Moornutzungsstrategien 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.

Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package "dismo", version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada's forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8, 5, 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package "gbm", version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.

Tetzlaff, B., Kuhr, P., and Wendland, F.: A New Method for Creating Maps of Artificially Drained Areas in Large River Basins Based on Aerial Photographs and Geodata, Irrig. Drain., 58, 569–585, doi:10.1002/Ird.426, 2009.

Thompson, J. R., Gavin, H., Refsgaard, A., Sorenson, H. R., and Gowing, D. J.: Modelling the hydrological impacts of climate change on UK lowland wet grassland, Wetl. Ecol. Manage., 17, 503–523, doi:10.1007/s11273-008-9127-1, 2009.

UBA: National Inventory Report for the German Greenhouse Gas Inventory 1990–2008, Submission under the United Nations Framework Convention on Climate Change and the Kyoto Protocol 2012, Dessau, Germany, 2012.

van den Akker, J. J. H., Jansen, P. C., Hendriks, R. F. A., Hoving, I., and Pleijter, M.: Submerged infiltration to halve subsidence and GHG emissions of agricultural peat soils, Proceedings of the 14th International Peat Congress, Stockholm, Sweden, 2012.

van der Gaast, J. W. J., Massop, H. T. L., and Vroon, H. R. J.: Actuele grondwaterstandsituatie in natuurgebieden: Een Pilotstudie, Wettelijke Onderzoekstaken Natuur & Milieu, WOt-rapport 94, Wageningen, 134 pp., 2009.

van der Ploeg, M. J., Appels, W. M., Cirkel, D. G., Oosterwoud, M. R., Witte, J. P. M., and van der Zee, S. E. A. T. M.: Microtopography as a Driving Mechanism for Ecohydrological Processes in Shallow Groundwater Systems, Vadose Zone J., 11, 52–62, doi:10.2136/Vzj2011.0098, 2012.

Wackernagel, H.: Multivariate Geostatistics, Springer, Berlin, Germany, 387 pp., 2003.

Table 1. Overview on predictor variables.

Predictor variable | Variable name | Values | Point, buffers (m) | Data source
Land-use type | | Arable, grassland, forest, shrubs, peat-mining, unused peatland, swamp, open water | Point, 100, 500, 1000, 2500 | Digital Landscape Model¹
Vegetation attributes (optional) | | Deciduous forest, mixed forest, coniferous forest, reed, shrubs, grass | Point | Digital Landscape Model¹
"Wet soil observed" | | Yes, no | Point | Digital Landscape Model¹
Combined land cover information (land-use type, veg. and wet soil attr.) | lc | Arable, grassland, wet grassland, deciduous including mixed forest, wet forest, coniferous forest, reed, unused peatland, wet unused peatland | Point, 100, 500, 1000, 2500 | Digital Landscape Model¹
Dry land cover fraction | fdry(X) | Arable, 0.5 × grassland, on organic soil area; 0 to 1 | 100, 500, 1000, 2500 | Digital Landscape Model¹
Wet land cover fraction | | Reed, wet grassland, wet forest and wet unused peatland on organic soil area; 0 to 1 | 100, 500, 1000, 2500 | Digital Landscape Model¹
Total length of ditches for all lc and only for arable and grassland (subscr. "dry") | dilendry(X) | ≥ 0 m | Point, 50, 250, 1000, 2500 | Digital Landscape Model¹
Distance to next ditch | | ≥ 0 m | Point | Digital Landscape Model¹
Peatland type | ptype | Lowland bog, upland bog, fen neighboring surface water, fen without neighboring surface water, other "low-C" organic soil | Point | Map of organic soils²
Material at peat base | pbase | Unconsolidated rock, peat clay layer, rock, no information | Point | Map of organic soils²
Peatland fraction | fpeat(X) | 0 to 1 | Point, 500, 1000, 2500 | Geological map (BGR)³
Distance to edge of peatland | | > 0 m | | Geological map (BGR)³
Ratio of dpeat/fpeat | | > 0 | 2500 | Geological map (BGR)³
Precipitation | | ≥ 0 mm | Point | Raster map 1 × 1 km (DWD)⁴
Evapotranspiration | | ≥ 0 mm | Point | Raster map 1 × 1 km (DWD)⁴
Climatic water balance | wbsummer | < 0 and ≥ 0 mm | Point | Raster map 1 × 1 km (DWD)⁴
Relative height | hrel(X) | < 0 and ≥ 0 m | Point – median 25, 50, 100, 250, 500, 1000 | Digital Elevation Model⁵
Topographic index | tirasR(X) | > 0 | Point and 1000 buffer for 10, 25, 250, 1000 raster values | Digital Elevation Model⁵
Protection status | | Nature conservation area, special areas of conservation, special protection area for wild birds, UNESCO biosphere reserve, nature park, national park, landscape protection area | Point | Maps of protected areas⁶

¹ ATKIS Basis DLM, Federal Agency for Cartography and Geodesy (BKG); ² map of organic soils (Roßkopf et al., 2014, Humboldt University of Berlin); ³ geological map 1 : 200 000 (GUEK200, BGR – Federal Institute for Geosciences and Natural Resources); ⁴ raster map 1 × 1 km of weather data (German Weather Service); ⁵ BKG; ⁶ Federal Agency for Nature Conservation (BfN). Variable name indicated for the nine variables in the final model, with (X) indicating buffer size and R indicating raster resolution.

Figure 3. Illustration of the annual mean water level (WL) transformation. Hypothetical transfer function relating GHG budget to WL (m) (a). GHG budget vs. the transformed water level (WLt) (b). WLt vs. WL. The lines along the x axes indicate the data quantiles of the analyzed data set (c).

et al., 2013). Thus, we performed this simplification process by first dropping those parameters with a correlation > 0.7 (either Pearson or Spearman type) to another parameter with a higher contribution (Clapcott et al., 2011). This ensured that two highly correlated parameters would not remain in the parameter set longer than the last parameter of another group of variables, which may contribute less compared to the two highly correlated parameters, but provides extra information that is not covered by the other parameters. After all highly correlated parameters had been dropped, further parameters with low contribution were dropped progressively.
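
The correlation-based part of this simplification can be sketched as follows. This is a minimal Python illustration and not the workflow used in the study: the function name `drop_correlated`, the pandas-based input and the contribution values are assumptions, and the sketch makes a single greedy pass instead of refitting the BRT model after every drop.

```python
import numpy as np
import pandas as pd


def drop_correlated(X, contribution, threshold=0.7):
    """Drop the weaker predictor of every highly correlated pair.

    X            : DataFrame of numeric predictors (hypothetical input).
    contribution : dict mapping predictor name -> relative contribution.
    threshold    : absolute correlation above which a pair counts as redundant.
    """
    # A pair is flagged if either the Pearson or the Spearman correlation is high.
    pearson = X.corr(method="pearson").abs()
    spearman = X.corr(method="spearman").abs()
    corr = np.maximum(pearson.values, spearman.values)
    names = list(X.columns)

    # Visit predictors from highest to lowest contribution and keep one only
    # if it is not highly correlated with an already kept predictor.
    kept = []
    for name in sorted(names, key=lambda n: contribution[n], reverse=True):
        i = names.index(name)
        if all(corr[i, names.index(k)] <= threshold for k in kept):
            kept.append(name)
    return kept
```

In an actual application, the contributions would be recomputed from a refitted model after each removal, since dropping one predictor redistributes its share of the error reduction.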

Predictor contributions are calculated as proportional contributions to the total error reduction and can be considered as a measure for the influence of the individual predictors. Additionally, a BRT model allows the derivation of partial dependence plots, which indicate how the response is affected by a certain predictor after accounting for the average effects of all other predictors in the model (Elith et al., 2008). These plots do not show the full effect of each parameter on the model response, due to interactions with other parameters that are fixed to derive these plots as well as due to parameter co-correlation. However, they can be used for interpreting model behavior (Elith et al., 2008).

2.3.1 WLt transformation of WL

The map of water levels of this study was developed to improve the upscaling of greenhouse gas emissions from organic soils. Therefore, the final map should provide the highest accuracy for the water level range for which the highest differences of greenhouse gas emissions occur. This can be achieved by transforming WL into a transformed variable WLt, which shows a linear relationship with GHG emissions. The sensitivity of greenhouse gas emissions to water level has been analyzed in several laboratory and field experimental and monitoring studies (Berglund and Berglund, 2011; Drösler et al., 2011; Hahn-Schöfl et al., 2011; Leiber-Sauheitl et al., 2014; Moore and Roulet, 1993; Moore and Dalva, 1993; van den Akker et al., 2012). General trends are a strong increase of methane (CH4) emissions for annual mean water levels of approximately > −0.1 m and an increase of CO2 emissions for water levels < −0.1 m, with a trend similar to a saturation function that levels out approximately between −0.4 and −0.8 m (Fig. 3a). While studies agree over these general trends, the exact shape of the transfer function and the maximum levels of emissions, as well as their dependence on soil properties and other environmental parameters, are still controversial. Here, we assume a hypothetical transfer function relating the normalized GHG budget, ranging from 0 to 1, to the water level (see also Fig. 3):

$$\mathrm{GHG\ Balance} = \begin{cases} -e^{3(\mathrm{WL}+0.1)} + 1, & \mathrm{WL} \le -0.1 \\ 1 - e^{-3(\mathrm{WL}+0.1)}, & \mathrm{WL} > -0.1 \end{cases} \qquad (1)$$

As the GHG budget can be positive for both low and high WL, we introduced the transformed water level WLt as (Fig. 3)

$$\mathrm{WL_t} = \begin{cases} e^{3(\mathrm{WL}+0.1)} - 1, & \mathrm{WL} \le -0.1 \\ 1 - e^{-3(\mathrm{WL}+0.1)}, & \mathrm{WL} > -0.1 \end{cases} \qquad (2)$$

By calibrating the model to both WL and WLt, we test if the optimization of WLt provides the highest model accuracy for the water level range relevant for GHG emissions and if it optimizes the map for application to GHG upscaling.
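
As an illustration, the transformation of Eq. (2) and the hypothetical transfer function of Eq. (1) can be written as a short Python sketch. The function names are assumptions made for this example; the exponent factor 3 and the offset 0.1 m are taken from the equations as given in the text. The sketch also makes explicit that the normalized GHG budget equals the absolute value of WLt.

```python
import numpy as np


def wl_to_wlt(wl):
    """Transformed water level WLt following Eq. (2); wl in m, negative below ground."""
    wl = np.asarray(wl, dtype=float)
    # Clipping the exponent arguments only keeps the unused branch numerically tame.
    dry = np.exp(3.0 * np.minimum(wl + 0.1, 0.0)) - 1.0   # branch for WL <= -0.1 m
    wet = 1.0 - np.exp(-3.0 * np.maximum(wl + 0.1, 0.0))  # branch for WL > -0.1 m
    return np.where(wl <= -0.1, dry, wet)


def ghg_balance(wl):
    """Hypothetical normalized GHG budget of Eq. (1), which equals |WLt|."""
    return np.abs(wl_to_wlt(wl))
```

WLt is 0 at WL = −0.1 m, approaches −1 for deep water levels and approaches 1 for water levels far above −0.1 m, so equal steps in WLt correspond to equal steps in the assumed GHG budget within each branch.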

2.3.2 Weighting scheme

When considering possible data weighting schemes, it is worth emphasizing at this point that the goal of this study is the development of a statistical model that can explain both the water level variability within a peatland as well as among different peatlands. The data of target and predictor variables for building this model are highly heterogeneous. First, the target variable data set contains peatland areas that strongly differ in their spatial extent and in the number of installed dip wells. Second, the predictor variable data set contains categorical and numerical data, and part of the predictor variables predominantly vary from peatland to peatland (e.g., climatic boundary conditions, large-scale topographic wetness index, peatland characteristics), whereas others also show within-peatland variability (e.g., land use, small-scale topographic wetness index, drainage network). As the influence of the individual predictor variables on our target WLt is expected to be rather diffuse due to abundant interactions with other site characteristics, the robustness of derived dependencies will strongly depend on the number of different peatlands in the data set.

There are no universal data weighting rules for similarly heterogeneous data situations, and some degree of expert judgment and subjectivity is inevitably involved when developing an appropriate scheme (Francis, 2011). The need for introducing a data weighting scheme is obvious, as without data weighting during calibration too much influence would be given to small and well-studied peatlands, which will reduce predictive model performance for large, less-well-studied peatland areas. To avoid this in a simple manner, the weight of each dip well could be divided by the number of dip wells in its peatland, which results in each peatland being equally weighted. This scheme, however, does not sufficiently use the high information content provided by well-studied large peatlands, which should have a higher impact on model calibration than a small peatland with only few dip wells.

Here, we propose a new weighting scheme that takes into account both factors, peatland size and local density of dip wells, to derive dip-well-specific weighting factors. It is based on principles of data uncertainty reduction by repeated measurements and of geostatistics. First, we consider our data situation as an analogue of meta-analysis with grouped data. It has been shown for homogeneous problems (all data from the same population) that the optimal group weight for meta-analysis is 1/SE² (Hedges and Olkin, 1985), with SE being the standard error of each group:

$$\mathrm{SE} = \frac{\sigma_e}{\sqrt{N}} \qquad (3)$$

where $\sigma_e$ is the error standard deviation of a measurement and $N$ is the number of measurements in a group. For homogeneous problems and uniform $\sigma_e$, this results in weights that are linearly dependent on $N$, which we here call the first end member of weighting. Heterogeneity (within-group variance) reduces the variation of the group weights, which can be shown by random effect models (Cumming, 2012). At the second end member of weighting, when heterogeneity totally dominates within-group variance, optimal group weights are uniform for all groups, i.e., weights are independent of $N$. We are not aware of a method that allows the estimation of the degree of heterogeneity for the complex target and predictor data situation in this study, including data (spatial and temporal variability, measurement error) and model errors (missing parameters). As a trade-off between 1/SE² (homogeneous end member) and 1 (heterogeneous end member), we decided on a group weight that is the inverse of the standard error, 1/SE, which is, for example, often used in econometric studies (Dickens, 1990). We emphasize that this is a subjective decision.

Figure 4. Sample semivariogram and fitted semivariogram model of the annual mean water level data WL.
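
To make the trade-off explicit, the three candidate group weights can be written as functions of the group size $N$ by inserting Eq. (3), assuming a uniform $\sigma_e$:

$$\frac{1}{\mathrm{SE}^2} = \frac{N}{\sigma_e^2} \propto N, \qquad \frac{1}{\mathrm{SE}} = \frac{\sqrt{N}}{\sigma_e} \propto \sqrt{N}, \qquad 1 \propto N^0 .$$

As a worked example under this assumption, a peatland with 100 dip wells would receive 100 times the weight of a peatland with a single dip well at the homogeneous end member, 10 times the weight with the chosen 1/SE scheme, and the same weight at the heterogeneous end member.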

The group weight 1/SE is the basis for the geostatistical part of our weighting scheme. There are two reasons why we cannot directly treat our peatlands as groups. First, there is within-peatland variability, which is related to changing site characteristics. It is one objective of our study to describe this variability by statistical modeling. Thus, dip wells must be treated individually and data cannot be aggregated at the peatland level. Second, we expect the model to learn more when the same number of dip wells is installed in a larger peatland. In a small peatland, spatial autocorrelation between dip wells is higher, i.e., the information content is lower than for large peatlands. As a consequence of the first point, we do not aggregate and keep all dip wells in the target variable data set by attributing to each dip well the fraction 1/N of its group weight, so that the relative weights of the groups remain constant. As a consequence of the second point, we use principles of geostatistics in our weighting scheme. We replace the group size N (positive integer number) by the "statistical" group size n (positive continuous number being > 1), which we derive from the spatial autocorrelation among the dip wells.

Therefore, we analyze the spatial autocorrelation structure of the data set. A single spherical variogram model was fitted to the sample variogram of all data (Fig. 4 in Sect. 3.1). Variogram models allow the differentiation of the total data variance (called "sill") into a spatially uncorrelated variance (called "nugget") and a spatially correlated variance (called "structural variance" and defined as sill − nugget) (Wackernagel, 2003). The variogram model allows the derivation, for any distance between two locations, of the average squared difference of values, here defined as $\gamma$. By definition, at distance 0, the average squared difference equals the nugget, and at distances greater than the "range" of spatial autocorrelation, the average squared difference equals the sill. Accordingly, the autocorrelated fraction $f$ of the average squared difference between two dip wells $i$ and $j$ is

$$f_{ij} = \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}} \qquad (4)$$

We now define the "statistical" group size $n$ of each dip well $i$ to be the sum of one plus the autocorrelated fractions $f_{ij}$ of all dip wells that are within the range of spatial autocorrelation of $i$:

$$n_i = 1 + \sum_{j=1}^{m} \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}} \qquad (5)$$

According to the discussion above, dip-well-specific weights can then be calculated with

$$w_i = \frac{1}{n_i \, \mathrm{SE}_i} = \frac{1}{\sigma_{e,i} \sqrt{n_i}} \qquad (6)$$

where $n_i$ is derived from Eq. (5). The equation shows that with increasing "statistical" group size $n$, i.e., with increasing spatial data density, the weight of an individual dip well is "down-weighted" to some degree, a behavior that corresponds to our initial intention to lower the influence of small peatlands compared to large ones. The error standard deviation $\sigma_e$ is dependent on several factors, e.g., the length of the time series, the temporal measurement density and the microtopography around the dip well. For simplicity, we here assumed $\sigma_e$ to be uniform for all dip wells, which simplifies Eq. (6) to $w_i = 1/\sqrt{n_i}$.

Only dip wells with the same land-use type were summed up with Eq. (5), which avoids the down-weighting by dip wells that have different land-use types. The latter are mostly characterized by fairly different WLt and thus by rather low spatial autocorrelation to dip well $i$.

After spatial correlation had been accounted for, the sum of the weights of all dip wells of each land-use type was adjusted so that it corresponds to the fraction of this land-use type in Germany. This adjustment accounts for the overrepresentation in the data set of dip wells in unused peatlands and the underrepresentation of dip wells in arable land.
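
The weighting steps of Eqs. (4) to (6) and the subsequent land-use adjustment can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function names and the `target_fraction` values are hypothetical, the spherical model is the standard textbook form, $\sigma_e$ is taken as uniform, and the default nugget (0.012 m²), sill (0.09 m²) and range (2700 m) are the fitted values reported in Sect. 3.1.

```python
import numpy as np


def spherical_variogram(h, nugget, sill, range_m):
    """Spherical semivariogram: nugget for h -> 0+, sill for h >= range."""
    h = np.asarray(h, dtype=float)
    s = np.clip(h / range_m, 0.0, 1.0)
    return nugget + (sill - nugget) * (1.5 * s - 0.5 * s**3)


def dip_well_weights(xy, land_use, nugget=0.012, sill=0.09, range_m=2700.0):
    """Weights w_i = 1 / sqrt(n_i) with n_i from Eq. (5), uniform sigma_e assumed."""
    xy = np.asarray(xy, dtype=float)
    land_use = np.asarray(land_use)
    # Pairwise distances between all dip wells (m).
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    gamma = spherical_variogram(dist, nugget, sill, range_m)
    f = (sill - gamma) / (sill - nugget)  # Eq. (4), values in [0, 1]
    # Sum only over other wells within the range that share the land-use type.
    same_use = land_use[:, None] == land_use[None, :]
    mask = same_use & (dist < range_m) & ~np.eye(len(xy), dtype=bool)
    n = 1.0 + np.sum(f * mask, axis=1)    # Eq. (5)
    return 1.0 / np.sqrt(n)               # simplified Eq. (6)


def rescale_to_land_use_fractions(w, land_use, target_fraction):
    """Rescale weights so that each land-use type sums to its target fraction."""
    w = np.asarray(w, dtype=float).copy()
    land_use = np.asarray(land_use)
    for use, frac in target_fraction.items():
        sel = land_use == use
        w[sel] *= frac / w[sel].sum()
    return w
```

Two dip wells of the same land-use type at practically the same location each obtain n = 2 and a weight of 1/√2, whereas an isolated dip well keeps n = 1 and a weight of 1.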

2.3.3 Model performance criteria

Model fit and predictive performance after cross-validation were quantified by the weighted root mean square error:

$$\mathrm{RMSE} = \sqrt{\frac{1}{\sum_{i=1}^{m} w_i} \sum_{i=1}^{m} \left( w_i \left( x_{o,i} - x_{s,i} \right)^2 \right)} \qquad (7)$$

where $m$ is the number of dip wells, $x_{o,i}$ is the observed WL or WLt of dip well $i$, $x_{s,i}$ is the simulated WL or WLt of dip well $i$, and $w_i$ is the data weight of dip well $i$ (see Sect. 2.3.2). We refer to the root mean square error of the predicted data of cross validation as RMSEcv. Model performance was further quantified by Nash–Sutcliffe efficiency (NSE):

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{m} w_i \left( x_{o,i} - x_{s,i} \right)^2}{\sum_{i=1}^{m} w_i \left( x_{o,i} - \bar{x}_o \right)^2} \qquad (8)$$

where $\bar{x}_o$ is the mean of all observed WL or WLt. It indicates how well observed vs. predicted values match the 1 : 1 line. NSE is a good overall indicator of predictive performance because it combines scatter and bias (common offset and/or slope difference from the 1 : 1 line) (Nash and Sutcliffe, 1970). Values greater than 0 signify a model that is better than the reference model based on the data mean. We refer to the NSE of the training data as NSEcal and of the predicted data of cross validation as NSEcv.

Systematic errors were quantified by calculating the model bias, here defined as

$$\mathrm{BIAS} = \sum_{i=1}^{m} \left( w_i x_{o,i} - w_i x_{s,i} \right) \qquad (9)$$
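
The three criteria of Eqs. (7) to (9) translate directly into code. The following Python sketch uses hypothetical function names; the observed mean in the NSE is computed as the plain arithmetic mean, which is an assumption based on the wording "mean of all observed WL or WLt".

```python
import numpy as np


def weighted_rmse(obs, sim, w):
    """Weighted root mean square error, Eq. (7)."""
    obs, sim, w = (np.asarray(a, dtype=float) for a in (obs, sim, w))
    return np.sqrt(np.sum(w * (obs - sim) ** 2) / np.sum(w))


def weighted_nse(obs, sim, w):
    """Weighted Nash-Sutcliffe efficiency, Eq. (8)."""
    obs, sim, w = (np.asarray(a, dtype=float) for a in (obs, sim, w))
    obs_mean = np.mean(obs)  # assumption: unweighted mean of the observations
    return 1.0 - np.sum(w * (obs - sim) ** 2) / np.sum(w * (obs - obs_mean) ** 2)


def weighted_bias(obs, sim, w):
    """Model bias as the weighted sum of residuals, Eq. (9)."""
    obs, sim, w = (np.asarray(a, dtype=float) for a in (obs, sim, w))
    return np.sum(w * obs - w * sim)
```

A model that always predicts the observed mean yields NSE = 0 with uniform weights, which is the reference case mentioned above.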

2.4 Model uncertainty and stability evaluation

Uncertainty of the model predictions was assessed by bootstrapping, cross-validation and residual analysis.

For the bootstrapping analysis, we followed the procedure of Leathwick et al. (2006). We estimated the confidence intervals around the predictions and the fitted functions by taking 1000 bootstrap samples of the 53 peatlands. The number of peatlands in each sample was equivalent to the data set, but peatlands were selected randomly with replacement. Using the predictor variables of the final model, a BRT model was fitted to each sample. Cross validation was again performed on peatlands; thus, a peatland in the calibration data set was not part of the cross-validation data set, to avoid overly optimistic results. Variances of the predictions and of the fitted functions of the 1000 models were evaluated.
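
The resampling at the peatland level can be sketched as follows. This Python generator is an illustration only: the function name and arguments are assumptions, and the BRT fitting step that would follow each sample is deliberately left out.

```python
import numpy as np


def bootstrap_peatlands(peatland_id, n_boot=1000, seed=0):
    """Yield row indices of bootstrap samples drawn at the peatland level.

    Whole peatlands are resampled with replacement; every draw of a
    peatland contributes all of its dip wells to the sample.
    """
    peatland_id = np.asarray(peatland_id)
    groups = np.unique(peatland_id)
    rows_by_group = {g: np.flatnonzero(peatland_id == g) for g in groups}
    rng = np.random.default_rng(seed)
    for _ in range(n_boot):
        # Draw as many peatlands as the data set contains, with replacement.
        drawn = rng.choice(groups, size=len(groups), replace=True)
        yield np.concatenate([rows_by_group[g] for g in drawn])
```

Each yielded index array would be used to subset the dip-well table before fitting one model; the spread of the resulting predictions across all samples then gives the bootstrap variance.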

If data sets are relatively small (e.g., n < 1000; De'ath, 2007), then the small size of the training and test data sets lowers model accuracy. Given the fairly small number of peatlands in the data set and the partly high spatial correlation of dip wells within these peatlands, we decided not to split the data set into a training and test data set. Estimates of model accuracy can then be based on cross-validation, thereby making effective use of all the data (De'ath, 2007). The prediction uncertainty of the final model is estimated by the root mean square error of prediction (RMSEcv, see above) for each land cover class. After testing for near-normal distribution of the residuals, RMSEcv can be used to derive the 68 and 95 % confidence intervals of the predictions with RMSEcv and 2 × RMSEcv, respectively.
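
The interval rule stated above amounts to the following small Python sketch. The function name is hypothetical, and any RMSEcv value passed to it would have to come from the cross-validation of the respective land cover class.

```python
def prediction_intervals(prediction, rmse_cv):
    """Approximate 68 % and 95 % intervals, assuming near-normal residuals."""
    ci68 = (prediction - rmse_cv, prediction + rmse_cv)
    ci95 = (prediction - 2.0 * rmse_cv, prediction + 2.0 * rmse_cv)
    return ci68, ci95
```

The factor 2 for the 95 % interval is the rounded value used in the text; it is only meaningful if the residuals of the class are close to normally distributed.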

Finally, additional residual analysis was performed to evaluate whether the predictions are biased for different land cover classes or geographical regions.

2.5 Regionalization

In the final regionalization step, the predictor variables contributing to the final model were determined at a 25 × 25 m raster for all organic soils in Germany. Predictor variables were determined with the same map input that was used for model building. Land cover information, including information on ditches, was based on the data from the year 2012, and the climatic data were based on the average of the last 30 years. The fine spatial resolution of 25 × 25 m was not chosen to suggest a highly spatially accurate model. Rather, this fairly fine scale was necessary to map the relatively small-scale effects of the topography, land use and peatland geometry variables. The final model was then used to make a prediction for each of these raster cells.

3 Results and discussion

3.1 Spatial correlation structure of the data set

The variogram model fitted to the sample variogram provided a nugget (0.012 m², i.e., a standard deviation of 0.11 m), a sill (0.09 m², i.e., 0.3 m) and a range of spatial correlation (2700 m) for our data set of WL (Fig. 4). The nugget represents the very small-scale soil hydraulic variability and micro-topography effects on WL (van der Ploeg et al., 2012) and measurement error, e.g., by differences in the determination of the ground surface and in the timing of the manual measurements. Furthermore, micro-topography (e.g., hummocks) and oscillating peat surfaces of wet peatlands pose a challenge for an accurate determination of both ground surface and water level. The water level time series in the data set were of different lengths and ranged from 1 to 20 years. Interannual variability of water levels can be large (e.g., Knotters and van Walsum, 1997). For simplicity, in our analysis, data were not harmonized by extrapolating WL time series using weather data to a 30-year period. Thus, the nugget also includes errors that are introduced by dip wells with different measurement periods that are located in the range of spatial correlation. In consideration of these error sources, the fitted nugget of 0.11 m appears to be a realistic value. At 0.3 m, the fitted sill matched nearly perfectly with the standard deviation of the data (0.31 m), which indicates consistency between semivariogram model and data set. The fitted range of spatial correlation of 2700 m reflects both physical effects, i.e., the average range of lateral flow due to hydraulic gradients, as well as the effect of average land-use patterns in Germany on the spatial correlation of WL. Fitted values were used in the calculation of the dip-well-specific weights using Eq. (6).

3.2 Typical water levels for land-use types in German organic soils

The land cover classes are characterized by plausible mean and median water levels, which show consistent differences between each other (Table 2 and Fig. 5a). The mean values of arable land and grassland agree with what can be expected for their agronomic requirements, with slightly lower water levels for arable land. The high variability observed for both classes may be related to the variability of the efficiency of installed drainage systems, for example the presence and condition of tile drains and the depth of ditches. Grasslands can be managed with very variable intensity, which is partly reflected in different water levels. Figure 5a further shows that deciduous forests seem to dominate slightly drier organic soils compared to coniferous forests, which dominate under wetter conditions. A high variability of water levels is observed for the land cover class unused peatland. On the one hand, post peat-cutting topography increases the variability of WL over short distances. It probably contributes to the high variance observed for this class. On the other hand, this class comprises both rather dry unused peatlands and wetter peatlands in which re-wetting measures already took place, which, however, do not yet show a wet soil attribute in the ATKIS Digital Landscape Model. This may also cause part of the variance observed in the grassland and forest land cover classes. All wet land cover classes (reed, wet grassland, wet forest and wet unused peatland) that were separated by wetness indication clearly show higher water levels, which shows that the wetness attribute of the Digital Landscape Model is a useful attribute.

Figure 5b shows the transformed water level for all classes. It can be observed that the variances of the wetter land cover classes increase relative to the variances of the dry land cover classes. This is due to the highest sensitivity of GHG emissions in the wet range of water levels (> −0.5 m). Consequently, the rather high variance of WL for arable land corresponds to a rather low variance of WLt, i.e. to a rather low assumed effect of WL variability on the GHG budget.

3.3 BRT model calibration and validation: WL vs. WLt

In contrast to land cover class, the other predictor variables showed, if at all, only weak relations to WL and WLt when evaluating them with box plots, 2-D cross plots and simple correlation matrices. Here, we expected BRT to detect the strongest predictor interactions and to identify the most informative predictors.

After model calibration with all predictors, subsequent model simplification successively dropped those parameters with correlation > 0.7 and the lowest contribution. For both WL and WLt, model performance improved during this simplification. For WLt, the highest values of NSEcv of approximately 0.46 were achieved with 21 to 9 model parameters. The development of NSEcv for the last 50 parameters is


Figure 5. Water level relative to ground surface, WL (m), and transformed water level, WLt (−), by land cover class, illustrated as a weighted box plot. WLt = −1 corresponds to maximum CO2 emissions and WLt = 1 to maximum CH4 emissions. In the top horizontal axes, the number of dip wells in each class is indicated.

Figure 6. NSEcv as a function of the number of predictor variables used in the model of WLt during model simplification, shown for the last 50 parameter drops.

shown in Fig. 6. Further elimination of parameters led to a pronounced decline of model performance. Similar behavior was observed for the calibration on WL. In favor of a more parsimonious model, we chose the model with the lowest number of parameters before the pronounced decline of model performance occurred. For the calibration on WLt, this corresponded to the model with the lowest number of parameters that still achieved NSEcv values of > 0.45 (Fig. 6). The final WLt model comprised nine predictor variables and the final WL model seven parameters. The percentages of parameter contributions to the final model and their individual influences are discussed for WLt in Sect. 3.4.
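The selection rule described above can be sketched in a few lines. The NSEcv values in the dictionary are hypothetical placeholders chosen for illustration and are not read from Fig. 6; the function name is likewise an assumption. The sketch only encodes the threshold form of the rule (smallest predictor count with NSEcv above 0.45) and would need a monotonicity check if performance dipped and recovered during simplification.

```python
def most_parsimonious(nse_by_n_predictors, threshold):
    """Return the smallest predictor count whose NSEcv still exceeds the threshold."""
    eligible = [n for n, nse in nse_by_n_predictors.items() if nse > threshold]
    if not eligible:
        raise ValueError("no model exceeds the threshold")
    return min(eligible)

# Hypothetical NSEcv values per number of predictors (illustration only).
nse_cv = {21: 0.460, 15: 0.460, 12: 0.455, 9: 0.453, 8: 0.430, 7: 0.410, 5: 0.350}

chosen = most_parsimonious(nse_cv, threshold=0.45)
print("predictors kept:", chosen)
```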

Table 3 summarizes the statistical performances of the models calibrated on WL and WLt. For both models, NSEcal is considerably higher than NSEcv and shows the commonly

Table 2. Weighted mean and standard deviation of WL and WLt data and of the WLt map presented in Sect. 3.6 for the nine land cover classes.

Land cover class     WL (m)          WLt (−)         WLt (−)
                     Mean ± sd       Mean ± sd       Map mean ± sd
Arable land          −0.69 ± 0.30    −0.76 ± 0.17    −0.66 ± 0.22
Deciduous f.         −0.45 ± 0.34    −0.49 ± 0.37    −0.47 ± 0.35
Grassland            −0.44 ± 0.29    −0.52 ± 0.32    −0.49 ± 0.30
Unused peatl.        −0.39 ± 0.36    −0.39 ± 0.41    −0.37 ± 0.40
Coniferous f.        −0.36 ± 0.36    −0.37 ± 0.37    −0.46 ± 0.35
Wet unused peatl.    −0.22 ± 0.27    −0.18 ± 0.40    −0.17 ± 0.36
Wet forest           −0.22 ± 0.29    −0.17 ± 0.43    −0.21 ± 0.39
Wet grassland        −0.10 ± 0.14    −0.00 ± 0.31    −0.15 ± 0.39
Reed                 −0.01 ± 0.17     0.20 ± 0.29    −0.06 ± 0.32

observed overfitting behavior of BRT models. The different measures that we took to minimize overfitting (cross-validation on peatlands, restriction to monotonic responses, and model simplification including elimination of highly correlated variables) lowered the difference between NSEcal and NSEcv but could not totally avoid overfitting. NSEcv of the WLt model (0.453) indicates higher predictive model performance compared to the WL model (0.381). However, as the data ranges differ due to the transformation, this comparison may be misleading. Therefore, we transformed the predictions of the WL model to obtain WLt values from this model and calculated the performance criteria in the same way (Table 3, second column). NSEcv then increases slightly (0.397) but does not reach the value of the model that was calibrated on WLt. A better predictive performance of the model calibrated on WLt is also visible for the RMSEcv values. The total RMSEcv as well as the RMSEcv values for the


Table 3. Performance criteria of the different models; dry range defined as WL < −0.3 m and wet range as WL > −0.3 m.

                WL (m)              WLt (−)             WLt (−)
                (calibrated         (calibrated         (calibrated
                on WL)              on WL)              on WLt)
NSEcal           0.627               0.559               0.642
NSEcv            0.381               0.397               0.453
RMSEcv           0.269               0.299               0.284
RMSEcv, dry      0.284               0.263               0.259
RMSEcv, wet      0.222               0.382               0.355
Bias            −0.003               0.083               0.002
Bias, dry       −0.012               0.070               0.003
Bias, wet        0.021               0.120               0.000

dry (WL < −0.3 m) and wet range (WL > −0.3 m) show slightly lower values for the WLt model compared to WLt values from the model calibrated on WL. Given our hypothetical transfer function (Fig. 3), in which the GHG budget is linearly related to WLt, the higher accuracy of WLt predictions directly corresponds to a higher accuracy of GHG budget predictions.

Superior model performance is also evident when evaluating model bias. Only when calibrating directly on WLt are the WLt predictions bias-free. Calibration on WL and subsequent transformation to WLt introduces a model bias towards systematically lower WLt values. In subsequent applications to GHG emission upscaling, lower WLt values would lead to an overestimation of CO2 emissions and to an underestimation of CH4 emissions.
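For readers who want to reproduce criteria of the kind listed in Table 3, the following minimal sketch computes NSE, RMSE and bias, overall and separately for the dry and wet range. It rests on several assumptions: the statistics are unweighted (the study itself works with dip-well weights), bias is taken as the mean of observation minus prediction, a water level of exactly −0.3 m is assigned to the wet range, and the five data points are hypothetical.

```python
import math

def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squares about the observed mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst

def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def bias(obs, pred):
    """Mean residual, assuming residual = observation - prediction."""
    return sum(o - p for o, p in zip(obs, pred)) / len(obs)

def select(values, wl, keep):
    """Keep the entries of values whose water level wl satisfies keep()."""
    return [v for v, w in zip(values, wl) if keep(w)]

def is_dry(w):
    return w < -0.3

def is_wet(w):
    return w >= -0.3   # boundary value assigned to the wet range (assumption)

# Hypothetical values for five dip wells (not taken from the data set).
wl = [-0.90, -0.50, -0.35, -0.20, -0.05]   # WL in m, used only for the dry/wet split
obs = [-0.8, -0.6, -0.4, -0.2, 0.0]        # observed WLt
pred = [-0.7, -0.6, -0.5, -0.3, -0.1]      # cross-validated WLt predictions

print("NSE", round(nse(obs, pred), 3))
print("RMSE dry", round(rmse(select(obs, wl, is_dry), select(pred, wl, is_dry)), 3))
print("bias wet", round(bias(select(obs, wl, is_wet), select(pred, wl, is_wet)), 3))
```

With this sign convention, a positive bias means that predictions are on average lower than observations, which matches the direction of the bias described for the model calibrated on WL.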

3.4 Influence of predictor variables on WLt

Given the beneficial characteristics of the model calibrated on WLt for GHG upscaling, presentation and discussion of further model results is restricted to the WLt model.

The BRT method allows the analysis of the parameter contributions to and influences on the model (Elith et al., 2008) and thus may contribute to system understanding. The percentages of the contributions of the nine predictor variables to the final model ranged from 25.2 to 5.6 % (Fig. 7). Except for protection status, at least one parameter of each of the seven parameter groups contributed to the final model. All protection status information was dropped early during the simplification process due to low contribution, although WL showed slightly higher values for data from nature protection areas or special areas of conservation. However, other parameters seem to be able to fully compensate for the information that is lost by dropping this predictor.

Land cover class, lc, at the dip well was the parameter with the strongest contribution (25.2 %). It basically follows the trend illustrated in Fig. 5b. The bootstrap error plotted as standard deviation (Fig. 7) shows the variation of this influence over the 1000 bootstrap models. A second land cover parameter, the fraction of dry land cover classes on organic soils in a buffer of 2500 m radius, fdry(2500), contributed 10.3 % to the model. The monotonic decrease of WLt with increasing fdry(2500) is plausible, as higher values reflect intensive land use in the surroundings of the dip well and thus indicate intensive artificial drainage. Together, both parameters contributed 35.5 %, and thus land cover represents the parameter group with the strongest model contribution.

Peatland characteristics are the second most important parameter group. The peatland type contributed 16 %. The model indicates that peatlands without any connection to surface water bodies (river or lake) and the class of other organic soils are characterized by lower WLt compared to the peatland types lowland bog, upland bog, and fen neighboring surface water. As the class of other organic soils is generally expected to reflect lower water levels, and as surface water may have a stabilizing effect on water levels of organic soils, the influence of the peatland type can be considered plausible. Besides peatland type, the substrate of the peat base contributes 5.6 %. Here, organic soils overlying peat clay layers (e.g. limnic sediments such as calcareous gyttja) or basement rock are characterized by higher WLt compared to organic soils overlying unconsolidated rock. This can be explained by the lower drainage resistance of unconsolidated rocks, which may cause an increased efficiency of anthropogenic drainage and/or a generally higher vulnerability to seepage losses. Finally, slightly lower WLt values are indicated by a high fraction of organic soils for the 500 m buffer, fpeat(500). This may reflect the higher land-use pressure on large peatlands compared to rather small peatlands, which tend to be more easily preserved by nature protection efforts.

The remaining four parameter groups are represented in the model by only one parameter each. The third most influential parameter was the length of ditches on arable land and grassland for the 250 m buffer, dilendry(250). At first glance, it may be surprising that WLt values tend to be higher with increasing ditch density, as ditches are supposed to drain the water when land is used as arable land and grassland. The fact that the model identifies a rather strong effect in the opposite direction may be caused by incomplete information about the drainage network. There is no detailed information about the spatial distribution of tile drains. Based on expert knowledge, agricultural areas with a lower ditch density are more likely to have tile drains. As these drains, easily installed with a narrow drain spacing, are more effective at draining organic soils, low WLt values for arable land and grassland may be related to low ditch densities. Furthermore, ditches were originally dug at narrow spacing in especially wet areas of organic soils, but there is no information available on whether these ditches still function properly.

The parameters wbsummer, hrel and tiras25 all show expected trends. The model predicts higher WLt for increasing climatic water balance in the summer period (May to October), wbsummer, for dip wells located in depressions (low values of hrel), and for higher small-scale topographic wetness indices calculated on the 25 × 25 digital elevation model (tiras25).


Figure 7. Partial dependence plots for the predictor variables. For an explanation of variables, see Table 1. The y axes are on WLt scale and are centered around the mean WLt. Error bars and grey area indicate the standard deviation of the response over 1000 bootstrap models. The relative contribution of each predictor is indicated as a percentage. The lines along the x axes of each plot show the distribution of data across that variable in deciles.

The fact that all parameters show expected or explainable responses in the model corroborates the reliability of the calibrated WLt model. The standard deviation of the predictor responses based on the bootstrap samples shows the stability of the observed responses.

Further insight into model behavior can be obtained by analyzing parameter interactions. This is done by changing two parameters simultaneously while keeping all other parameters at their mean values (Elith et al., 2008). Figure 8 shows the two strongest parameter interactions. Parameter wbsummer strongly interacts with ptype. The generally lower values of WLt of fens without surface water connection and other organic soils show a stronger dependency on the summer climatic water balance. While a summer climatic water balance of > −80 mm shows a rather weak effect on WLt for the wetter peatland types, for the two drier peatland types there is still a strong effect with increasing wbsummer. The trend for wbsummer > 130 mm for the dry peatland types is supported by seven different peatlands.
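The way such a two-way interaction surface is produced can be sketched as follows: two predictors are varied over a grid while every other predictor is held at a fixed reference value. Everything in the sketch is hypothetical, including the stand-in prediction function, its coefficients, the predictor names and the class labels; it is not the fitted BRT model. For a categorical predictor, the "mean value" is replaced here by a reference class.

```python
def interaction_surface(predict, reference, name_a, grid_a, name_b, grid_b):
    """Evaluate predict() on a grid of two predictors, all others fixed at reference values."""
    surface = {}
    for a in grid_a:
        for b in grid_b:
            row = dict(reference)   # copy so the reference values stay untouched
            row[name_a] = a
            row[name_b] = b
            surface[(a, b)] = predict(row)
    return surface

def toy_predict(row):
    """Hypothetical stand-in for a fitted model with a built-in interaction."""
    slope = 0.003 if row["ptype"] == "dry type" else 0.001
    return -0.5 + slope * row["wb_summer"] - 0.2 * row["f_dry"]

reference = {"ptype": "wet type", "wb_summer": 0.0, "f_dry": 0.5}
surface = interaction_surface(
    toy_predict, reference,
    "ptype", ["dry type", "wet type"],
    "wb_summer", [-80.0, 130.0],
)

# Effect of the summer water balance within each peatland type.
effect_dry = surface[("dry type", 130.0)] - surface[("dry type", -80.0)]
effect_wet = surface[("wet type", 130.0)] - surface[("wet type", -80.0)]
print(round(effect_dry, 2), round(effect_wet, 2))
```

An interaction shows up as a difference between the two effects; without interaction, both would be equal.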


Figure 8. Partial dependence plots representing the two strongest interactions in the model: (a) between ptype and wbsummer and (b) between pbase and fdry. Fitted WLt is plotted on the y axis, which is obtained after accounting for the average effect of all other predictor variables.

Another strong interaction is observed for pbase and fdry(2500). While a rather weak effect of the fraction of arable land and grassland is observed for organic soils overlying basement rock and peat clay layers, a strong effect is observed for organic soils overlying unconsolidated rock. This interaction reflects the higher lateral range of drainage effects for organic soils with little flow resistance at the peat base. In these organic soils, intensive land use lowers the water level over large areas.

3.5 Discussion of model uncertainty

Plotting observed vs. predicted WLt from cross-validation (Fig. 9) illustrates the rather large residual variance that cannot be explained by the model. As indicated by the higher RMSEcv for the wet range (Table 3), scatter increases with increasing WLt. Error bars in the y direction indicate data error derived from the nugget of the variogram. It is shown for a few data points as an example. Due to the transformation, data error increases for higher WLt. Figure 9 demonstrates that the fraction of unexplainable variance related to data error is much higher for the wet than for the dry range. Bootstrap error, indicating the variation of the model predictions for 1000 bootstrap samples, is shown in the x direction for the same data points. Bootstrap error is lower than the data error for the wet range and slightly higher for the dry range.

Bootstrap errors demonstrate the sensitivity of model predictions to changes of the data set used for calibration. When a model possesses structural deficits, such as missing predictor variables, bootstrap errors should not be used to define confidence intervals for the model predictions. Figure 10 shows residuals from cross-validation and standard deviation of bootstrap predictions for all land cover classes. The residuals of each land cover class show near-normal distributions. For five of the nine land cover classes (wet forest, wet unused peatland, arable land, coniferous forest and reed), the Shapiro-Wilk test does not reject normality (p > 0.05). Figure 10a further indicates that residuals of each land cover class

Figure 9. Observed vs. predicted transformed annual mean water level (WLt) from cross-validation results. Error bars show selected data and bootstrap model errors as standard deviation. Data points are scaled by their weights.

scatter fairly well around zero, indicating low bias for the various land cover classes. Land-cover-class-specific confidence intervals of model predictions can thus be derived from the RMSEcv of each land cover class, e.g. 2 × RMSEcv representing the 95 % confidence interval.
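As a small worked example of this rule, the sketch below builds an interval of ± 2 × RMSEcv around a prediction. The class-specific RMSEcv values are not listed in this section, so the total RMSEcv of the WLt model from Table 3 (0.284) is used as a stand-in; the prediction value and the function name are hypothetical. The factor 2 approximates the 1.96 that applies to normally distributed, unbiased residuals.

```python
def confidence_interval(prediction, rmse_cv, factor=2.0):
    """Approximate 95 % interval as prediction +/- factor * RMSEcv."""
    half_width = factor * rmse_cv
    return prediction - half_width, prediction + half_width

# Stand-in RMSEcv: total value of the WLt model from Table 3, not a class-specific one.
low, high = confidence_interval(prediction=0.0, rmse_cv=0.284)
print(round(low, 3), round(high, 3))
```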

The prediction uncertainty derived from cross-validation is much higher than the bootstrap prediction uncertainty obtained from the bootstrap standard deviation (sd), with 2 × sd corresponding to the 95 % confidence interval (Fig. 10). The large difference between these values indicates that the model has structural deficits that can be attributed to several error sources.

i. Key influences on WLt are missing in the set of predictor variables. None of the predictor variables indicate whether and to what extent water level increases due to re-wetting measures took place in the last


Figure 10. (a) Residuals (observation − prediction) of WLt predictions and (b) standard deviation (sd) of bootstrap predictions, shown for the nine land cover classes. In the upper part, the number of dip wells in each class is indicated.

years. Wetness indicators (wet soil and/or vegetation attributes) that are obtained from the Digital Landscape Model probably react with a delay of several years. Thus, we expect the occurrence of several observed high WLt values that cannot be explained by any of the predictor variables.

ii. Small-scale topography that is not represented with sufficient detail and accuracy in the DEM may cause several predictions to strongly differ from what would be expected from the other predictor variables. A common example may be a dip well that is located on a narrow peat ridge, which remained after peat-cutting and is absent in the DEM, and that is situated in an area classified as wet soil by the Digital Landscape Model. Then, the model indicates a WLt that is much higher than the observed WLt, as for the observed value the reference surface was the surface of the peat ridge.

iii. Consistent information about tile drains is missing and only exists on the regional scale (Tetzlaff et al., 2009). At the national scale, however, there are no maps of tile drains. Tile drains are known to have a strong effect on WLt for arable land and grassland. As explained above, we expect parameter dilendry(250) to partially compensate for this missing information.

iv. Another source of prediction uncertainty may be inconsistent and erroneous land cover classification in the Digital Landscape Model due to the high degree of subjectivity for many of the attributes. Furthermore, the temporal accuracy of the Digital Landscape Model may be as poor as 5 years, which can cause time series with land-use change to be split at the wrong date and vegetation and wetness attributes to be not yet updated to the current conditions.

Figure 11. Residuals (observation − prediction) of WLt predictions for the three major geographical peatland regions of Germany. In the upper part, the number of dip wells in each class is indicated.

v. The water balance of fens strongly depends on the size and the hydraulic head of the groundwater catchment, i.e. of the aquifer underlying the peat layer. Unfortunately, there is no consistent map of hydraulic heads or groundwater catchments for all of Germany.

We checked model predictions for geographical bias. Geographical location was not one of the model parameters. However, the history and policy of land use on organic soils, current ditch water management, and climate do show large-scale geographical trends. We divided our data set into the three major German peatland regions (NE, NW and S) and evaluated the model residuals (Fig. 11) to see whether our model is biased due to important missing geographical effects. A serious bias for any of the three major German peatland regions cannot be identified.


Figure 12. Map of predictions of transformed annual mean water level (WLt) for all German organic soils (a) and an enlarged map section (b). The probability distribution in (c) indicates the uncertainty of a specific point prediction for wet grassland as an example. Here, the predicted value is approximately WLt = 0, but note that wet grassland predictions do vary in space depending on the values of the other model parameters. The histogram shows the residuals from cross-validation for wet grassland, to which the probability distribution was fitted.

When applying calibrated statistical models during regionalization, it is important to check model behavior for extrapolation outside the range of the parameter space that is covered by the data upon which the model was built. BRT always extrapolates at a constant value from the most extreme environmental value in the training data. In contrast to other types of statistical models, e.g. generalized linear models, BRT does not continue the fitted trend beyond the last observation. Regarding the categorical variables, the data set covers all classes occurring in Germany with several peatlands. The data set also covers the major range of values occurring in Germany for the numerical predictor variables. Furthermore, Fig. 7 indicates that the constant values at which the model extrapolates the influence of the variables do not raise major concern for any extreme predictions outside the parameter range.
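The constant extrapolation of tree-based models can be demonstrated with a minimal gradient-boosting sketch built from single-split regression stumps and squared-error loss. This is a simplified stand-in written for illustration, not the BRT implementation used in the study; the training data, the number of trees and the learning rate are hypothetical.

```python
def fit_stump(x, r):
    """Best single-split regression stump for residuals r (least squares)."""
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        split = (lo + hi) / 2.0
        left = [ri for xi, ri in zip(x, r) if xi <= split]
        right = [ri for xi, ri in zip(x, r) if xi > split]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((ri - ml) ** 2 for ri in left) + sum((ri - mr) ** 2 for ri in right)
        if best is None or sse < best[0]:
            best = (sse, split, ml, mr)
    return best[1:]

def fit_boosted_stumps(x, y, n_trees=200, learning_rate=0.1):
    """Gradient boosting with squared loss: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        split, ml, mr = fit_stump(x, resid)
        left_step, right_step = learning_rate * ml, learning_rate * mr
        stumps.append((split, left_step, right_step))
        pred = [pi + (left_step if xi <= split else right_step) for xi, pi in zip(x, pred)]
    return base, stumps

def predict(model, xq):
    """Sum the base value and the leaf value of every stump for query xq."""
    base, stumps = model
    return base + sum(l if xq <= s else r for s, l, r in stumps)

# Hypothetical training data: a linear trend observed only for x between 0 and 10.
x = [float(i) for i in range(11)]
y = [0.05 * xi - 0.6 for xi in x]
model = fit_boosted_stumps(x, y)

# Beyond the training range the prediction stays at the value of the outermost leaf.
print(predict(model, 10.0), predict(model, 50.0))
```

A linear model fitted to the same data would continue the trend to 0.05 × 50 − 0.6 = 1.9 at x = 50, whereas the boosted stumps return the same value as at x = 10, because every split lies inside the training range.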

3.6 Regionalization

The map of WLt resulting from the application of the fitted WLt model to all grid cells shows gradients at the regional scale (Fig. 12a). In the south of Germany, for example, a gradient from wet to dry can be observed for the pre-alpine upland bogs and the peatlands of the moraine plain. In the north of Germany, the map indicates that organic soils in the very NE are wetter than the rest. For the rest of the north, a slight gradient can be observed from less dry to dry from NW to E, which is mainly driven by the higher summer climatic water balance in the NW. As both categorical and numerical predictor variables also vary at the sub-regional scale, the resulting map also shows gradients within peatland areas, e.g. due to small-scale land-use and ditch density gradients and topography effects (Fig. 12b).

We calculated WLt averages of the land cover classes using the regionalized WLt from the map (Table 2, column 3). The given standard deviation comprises both the variability within a land cover class that is explained by the model and the uncertainty of each prediction. The resulting means and standard deviations differ slightly from the corresponding values of the data set. The land-cover-specific WLt values obtained from the map can be considered more representative, as the regionalization procedure is supposed to partly account for potential bias in the data set.

When applying this map and its predicted WLt values in subsequent GHG upscaling, it is crucial that model uncertainty is propagated properly. An example demonstrates the necessity of uncertainty propagation. For a grid cell classified as wet grassland, the probability distribution of WLt is shown based on a normal distribution that was fitted to the


residuals of this land cover class (Fig. 12c). Without propagating the uncertainty, and when only translating the predicted WLt (possibly in combination with other parameters, e.g. soil properties) into a GHG budget, the GHG budget is strongly underestimated, as the WLt prediction is close to zero, indicating neither large CO2 nor large CH4 emissions. When translating the full distribution of WLt into a GHG budget, the resulting GHG budget would be much higher, as the GHG budget increases on both sides of the predicted WLt.
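The argument can be illustrated numerically. The sketch below compares the budget at the point prediction with the expected budget over a normal distribution of WLt. The V-shaped budget function, its units and the standard deviation of 0.3 are hypothetical stand-ins chosen only because they increase on both sides of WLt = 0; they are not the transfer function or the fitted residual distribution of the study.

```python
import math

def ghg_budget(wlt):
    """Hypothetical V-shaped budget: zero at WLt = 0, rising towards both ends."""
    return abs(wlt)

def expected_budget(mean, sd, n=20001, width=6.0):
    """Average ghg_budget over a normal distribution using a weighted grid sum."""
    lo = mean - width * sd
    step = 2.0 * width * sd / (n - 1)
    total = 0.0
    norm = 0.0
    for i in range(n):
        x = lo + i * step
        w = math.exp(-0.5 * ((x - mean) / sd) ** 2)   # unnormalized normal density
        total += w * ghg_budget(x)
        norm += w
    return total / norm

point_estimate = ghg_budget(0.0)            # budget at the predicted WLt only
propagated = expected_budget(0.0, 0.3)      # budget averaged over the distribution
print(point_estimate, round(propagated, 3))
```

For this stand-in function, the propagated budget equals sd × sqrt(2/π), about 0.24, while the point estimate is zero, which is the kind of underestimation the text warns about.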

3.7 Possible paths for model improvement

The model performance that is achieved by the statistical approach presented in our study raises the question of whether collecting more WL data can improve model performance or whether the factor constraining model performance is the limited strength of the predictor variables available nation-wide. To assess this question, additional "hold-out models" were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated and model performance was evaluated with NSEcv. As expected, results indicate an increase of model performance with increasing number of peatlands used in the model building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when including many more peatlands in the data set. More promising would be the specific collection of more data on the weakly represented and/or important land cover classes arable land and grassland.
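The resampling scheme behind such a learning curve can be sketched as follows. The scoring function is a hypothetical placeholder with a saturating shape that stands in for "fit the BRT model to the selected peatlands and compute NSEcv"; in a real analysis it would be replaced by the actual model fit. Only the number of peatlands (53), the subset sizes (10 to 50) and the 500 draws per size are taken from the paper.

```python
import math
import random
import statistics

def learning_curve(peatland_ids, sizes, n_draws, score_fn, seed=0):
    """Mean and standard deviation of score_fn over random peatland subsets of each size."""
    rng = random.Random(seed)
    curve = {}
    for k in sizes:
        scores = [score_fn(rng.sample(peatland_ids, k)) for _ in range(n_draws)]
        curve[k] = (statistics.mean(scores), statistics.pstdev(scores))
    return curve

def toy_score(subset):
    """Placeholder for model fitting plus NSEcv: saturates as more peatlands are used."""
    return 0.5 * (1.0 - math.exp(-len(subset) / 15.0))

peatlands = list(range(53))   # 53 peatlands in the data set
curve = learning_curve(peatlands, sizes=[10, 20, 30, 40, 50], n_draws=500, score_fn=toy_score)

for k, (mean_score, sd_score) in sorted(curve.items()):
    print(k, round(mean_score, 3), round(sd_score, 3))
```

A flattening curve shows up as shrinking gains between successive subset sizes.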

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nation-wide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that has been successfully conducted in Finke et al. (2004) using the transfer noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, e.g. spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± sd.

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data covers only a small fraction of peatlands of the final map and thus spatial interpolation is not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG-relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. With the analysis of their effect on the water level, it was possible to gain insight into natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions depend linearly. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions, and avoided model bias.

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show, however, that model improvement by increasing the number of WLt data seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively few data are available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, for example water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach is suited to application to any other country where similar data of target and predictor variables are available. It is important that the spatial


resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if for specific regions detailed information on peat properties becomes available and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), Landkreis Gifhorn, NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support; Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support; Annette Freibauer for reviewing the paper draft; and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project "Organic soils" funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43, 5, 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirby, M.: A physically based, variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De'ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz – Moornutzungsstrategien 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.


Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package "dismo", version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada's forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8(5), 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models, part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package "gbm", version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.

Tetzlaff, B., Kuhr, P., and Wendland, F.: A New Method for Creating Maps of Artificially Drained Areas in Large River Basins Based on Aerial Photographs and Geodata, Irrig. Drain., 58, 569–585, doi:10.1002/Ird.426, 2009.

Thompson, J. R., Gavin, H., Refsgaard, A., Sorenson, H. R., and Gowing, D. J.: Modelling the hydrological impacts of climate change on UK lowland wet grassland, Wetl. Ecol. Manage., 17, 503–523, doi:10.1007/s11273-008-9127-1, 2009.

UBA: National Inventory Report for the German Greenhouse Gas Inventory 1990–2008, Submission under the United Nations Framework Convention on Climate Change and the Kyoto Protocol 2012, Dessau, Germany, 2012.

van den Akker, J. J. H., Jansen, P. C., Hendriks, R. F. A., Hoving, I., and Pleijter, M.: Submerged infiltration to halve subsidence and GHG emissions of agricultural peat soils, Proceedings of the 14th International Peat Congress, Stockholm, Sweden, 2012.

van der Gaast, J. W. J., Massop, H. T. L., and Vroon, H. R. J.: Actuele grondwaterstandsituatie in natuurgebieden: Een Pilotstudie, Wettelijke Onderzoekstaken Natuur & Milieu, WOt-rapport 94, Wageningen, 134 pp., 2009.


van der Ploeg, M. J., Appels, W. M., Cirkel, D. G., Oosterwoud, M. R., Witte, J. P. M., and van der Zee, S. E. A. T. M.: Microtopography as a Driving Mechanism for Ecohydrological Processes in Shallow Groundwater Systems, Vadose Zone J., 11, 52–62, doi:10.2136/Vzj2011.0098, 2012.

Wackernagel, H.: Multivariate Geostatistics, Springer, Berlin, Germany, 387 pp., 2003.


Figure 3. Illustration of the annual mean water level (WL) transformation. (a) Hypothetical transfer function relating GHG budget to WL (m). (b) GHG budget vs. the transformed water level (WLt). (c) WLt vs. WL. The lines along the x axes indicate the data quantiles of the analyzed data set.

et al., 2013). Thus, we performed this simplification process by first dropping those parameters with a correlation > 0.7 (either Pearson or Spearman type) to another parameter with a higher contribution (Clapcott et al., 2011). This ensured that two highly correlated parameters would not remain in the parameter set longer than the last parameter of another group of variables, which may contribute less than the two highly correlated parameters but provides extra information that is not covered by the other parameters. After all highly correlated parameters had been dropped, further parameters with low contribution were dropped progressively.
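The correlation filter described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that predictor contributions are held fixed (in the study they would change each time the BRT model is refitted), that the 0.7 threshold applies to the absolute correlation, and that no predictor is constant. All function and argument names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(x):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def drop_correlated(predictors, contribution, threshold=0.7):
    """Keep a predictor only if it is not correlated above `threshold`
    (Pearson or Spearman, absolute value) with an already kept predictor
    that has a higher contribution.

    predictors:   dict name -> list of values (hypothetical layout)
    contribution: dict name -> relative contribution to the model
    """
    kept = []
    for name in sorted(predictors, key=contribution.get, reverse=True):
        too_similar = any(
            max(abs(pearson(predictors[name], predictors[other])),
                abs(spearman(predictors[name], predictors[other]))) > threshold
            for other in kept
        )
        if not too_similar:
            kept.append(name)
    return kept
```

Processing predictors in order of decreasing contribution guarantees that, within a correlated pair, the one with the higher contribution survives.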

Predictor contributions are calculated as proportional contributions to the total error reduction and can be considered as a measure for the influence of the individual predictors. Additionally, a BRT model allows the derivation of partial dependence plots, which indicate how the response is affected by a certain predictor after accounting for the average effects of all other predictors in the model (Elith et al., 2008). These plots do not show the full effect of each parameter on the model response, due to interactions with other parameters that are fixed to derive these plots as well as due to parameter co-correlation. However, they can be used for interpreting model behavior (Elith et al., 2008).

2.3.1 WLt transformation of WL

The map of water levels of this study was developed to improve the upscaling of greenhouse gas emissions from organic soils. Therefore, the final map should provide the highest accuracy for the water level range in which the largest differences in greenhouse gas emissions occur. This can be achieved by transforming WL into a transformed variable, WLt, which shows a linear relationship with GHG emissions. The sensitivity of greenhouse gas emissions to water level has been analyzed in several laboratory and field experimental and monitoring studies (Berglund and Berglund, 2011; Drösler et al., 2011; Hahn-Schöfl et al., 2011; Leiber-Sauheitl et al., 2014; Moore and Roulet, 1993; Moore and Dalva, 1993; van den Akker et al., 2012). General trends are a strong increase of methane (CH4) emissions for annual mean water levels of approximately > −0.1 m and an increase of CO2 emissions for water levels < −0.1 m, with a trend similar to a saturation function that levels out approximately between −0.4 and −0.8 m (Fig. 3a). While studies agree on these general trends, the exact shape of the transfer function and the maximum levels of emissions, as well as their dependence on soil properties and other environmental parameters, are still controversial. Here, we assume a hypothetical transfer function relating the normalized GHG budget, ranging from 0 to 1, to the water level (see also Fig. 3):

$$
\text{GHG Balance} =
\begin{cases}
-e^{3(\mathrm{WL}+0.1)} + 1, & \mathrm{WL} \le -0.1 \\
1 - e^{-3(\mathrm{WL}+0.1)}, & \mathrm{WL} > -0.1
\end{cases}
\qquad (1)
$$

As the GHG budget can be positive for both low and high WL, we introduced the transformed water level WLt as (Fig. 3)

$$
\mathrm{WL_t} =
\begin{cases}
e^{3(\mathrm{WL}+0.1)} - 1, & \mathrm{WL} \le -0.1 \\
1 - e^{-3(\mathrm{WL}+0.1)}, & \mathrm{WL} > -0.1
\end{cases}
\qquad (2)
$$

By calibrating the model to both WL and WLt, we test whether the optimization of WLt provides the highest model accuracy for the water level range relevant for GHG emissions and whether it optimizes the map for application to GHG upscaling.
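As a small sketch of the two piecewise functions of Eqs. (1) and (2), the code below evaluates the hypothetical GHG transfer function and the transformed water level for a given WL (in m, negative below ground). The function names are chosen here for illustration; the constants 3 and 0.1 are taken from the equations.

```python
import math


def ghg_balance(wl):
    """Normalized GHG budget (0 to 1) of Eq. (1); wl in m."""
    if wl <= -0.1:
        return 1.0 - math.exp(3.0 * (wl + 0.1))
    return 1.0 - math.exp(-3.0 * (wl + 0.1))


def wl_transformed(wl):
    """Transformed water level WLt (-1 to 1) of Eq. (2); wl in m."""
    if wl <= -0.1:
        return math.exp(3.0 * (wl + 0.1)) - 1.0
    return 1.0 - math.exp(-3.0 * (wl + 0.1))
```

With these definitions the GHG budget equals the absolute value of WLt, which is the linear relation that motivates calibrating on WLt: negative WLt corresponds to the CO2-dominated dry branch and positive WLt to the CH4-dominated wet branch.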

2.3.2 Weighting scheme

When considering possible data weighting schemes, it is worth emphasizing at this point that the goal of this study is the development of a statistical model that can explain the water level variability both within a peatland and among different peatlands. The data of target and predictor variables for building this model are highly heterogeneous. First, the target variable data set contains peatland areas that strongly differ in their spatial extent and in the number of installed dip wells. Second, the predictor variable data set contains categorical and numerical data, and part of the predictor variables predominantly vary from peatland to peatland (e.g., climatic boundary conditions, large-scale topographic wetness index, peatland characteristics), whereas others also show within-peatland variability (e.g., land use, small-scale topographic wetness index, drainage network). As the influence of the individual predictor variables on our target WLt is expected to be rather diffuse due to abundant interactions with other site characteristics, the robustness of derived dependencies will strongly depend on the number of different peatlands in the data set.

There are no universal data weighting rules for similarly heterogeneous data situations, and some degree of expert judgment and subjectivity is inevitably involved when developing an appropriate scheme (Francis, 2011). The need for a data weighting scheme is obvious: without data weighting during calibration, too much influence would be given to small and well-studied peatlands, which would reduce predictive model performance for large, less-well-studied peatland areas. To avoid this in a simple manner, the weight of each dip well could be divided by the number of dip wells in its peatland, which results in each peatland being equally weighted. This scheme, however, does not sufficiently use the high information content provided by well-studied large peatlands, which should have a higher impact on model calibration than a small peatland with only a few dip wells.

Here, we propose a new weighting scheme that takes into account both factors, peatland size and local density of dip wells, to derive dip-well-specific weighting factors. It is based on principles of data uncertainty reduction by repeated measurements and of geostatistics. First, we consider our data situation as an analogue of meta-analysis with grouped data. It has been shown for homogeneous problems (all data from the same population) that the optimal group weight for meta-analysis is 1/SE² (Hedges and Olkin, 1985), with SE being the standard error of each group:

$$
\mathrm{SE} = \frac{\sigma_e}{\sqrt{N}}, \qquad (3)
$$

where $\sigma_e$ is the error standard deviation of a measurement and $N$ is the number of measurements in a group. For homogeneous problems and uniform $\sigma_e$, this results in weights that are linearly dependent on $N$, which we here call the first end member of weighting. Heterogeneity (within-group variance) reduces the variation of the group weights, which can be shown by random effect models (Cumming, 2012). As the second end member of weighting, when heterogeneity totally dominates within-group variance, optimal group weights are uniform for all groups, i.e., weights are independent of $N$. We are not aware of a method that allows the estimation of the degree of heterogeneity for the complex target and predictor data situation in this study, including data (spatial and temporal variability, measurement error) and model errors (missing parameters). As a trade-off between 1/SE² (homogeneous end member) and 1 (heterogeneous end member), we decided on a group weight that is the inverse of the standard error, 1/SE, which is, for example, often used in econometric studies (Dickens, 1990). We emphasize that this is a subjective decision.

Figure 4. Sample semivariogram and fitted semivariogram model of the annual mean water level data WL.
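To make the two end members and the chosen compromise concrete, the following worked relation (an illustration added here, assuming uniform $\sigma_e$ and using Eq. 3) shows how each candidate group weight scales with the group size $N$:

$$
\frac{1}{\mathrm{SE}^2} = \frac{N}{\sigma_e^2} \propto N, \qquad
\frac{1}{\mathrm{SE}} = \frac{\sqrt{N}}{\sigma_e} \propto \sqrt{N}, \qquad
1 \propto N^{0}.
$$

For two hypothetical peatlands with $N = 4$ and $N = 16$ dip wells, the larger group would receive 4 times the weight of the smaller one under 1/SE², 2 times under 1/SE, and the same weight under uniform group weights.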

The group weight 1/SE is the basis for the geostatistical part of our weighting scheme. There are two reasons why we cannot directly treat our peatlands as groups. First, there is within-peatland variability, which is related to changing site characteristics. It is one objective of our study to describe this variability by statistical modeling. Thus, dip wells must be treated individually, and data cannot be aggregated at the peatland level. Second, we expect the model to learn more when the same number of dip wells is installed in a larger peatland. In a small peatland, spatial autocorrelation between dip wells is higher, i.e., the information content is lower than for large peatlands. As a consequence of the first point, we do not aggregate and keep all dip wells in the target variable data set by attributing to each dip well the fraction $1/N$ of its group weight, so that the relative weights of the groups remain constant. As a consequence of the second point, we use principles of geostatistics in our weighting scheme. We replace the group size $N$ (positive integer number) by the "statistical" group size $n$ (positive continuous number, being > 1), which we derive from the spatial autocorrelation among the dip wells.

Therefore, we analyze the spatial autocorrelation structure of the data set. A single spherical variogram model was fitted to the sample variogram of all data (Fig. 4 in Sect. 3.1). Variogram models allow the differentiation of the total data variance (called "sill") into a spatially uncorrelated variance (called "nugget") and a spatially correlated variance (called "structural variance" and defined as sill − nugget) (Wackernagel, 2003). The variogram model allows the derivation, for any distance between two locations, of the average squared difference of values, here defined as $\gamma$. By definition, at distance 0 the average squared difference equals the nugget, and at distances greater than the so-called "range" of spatial autocorrelation, the average squared difference equals the sill. Accordingly, the autocorrelated fraction $f$ of the average squared difference between two dip wells $i$ and $j$ is

$$
f_{ij} = \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}}. \qquad (4)
$$

We now define the "statistical" group size $n$ of each dip well $i$ to be the sum of one plus the autocorrelated fractions $f_{ij}$ of all dip wells that are within the range of spatial autocorrelation of $i$:

$$
n_i = 1 + \sum_{j=1}^{m} \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}}. \qquad (5)
$$

According to the discussion above, dip-well-specific weights can then be calculated with

$$
w_i = \frac{1}{n_i \, \mathrm{SE}_i} = \frac{1}{\sigma_{e,i} \sqrt{n_i}}, \qquad (6)
$$

where $n_i$ is derived from Eq. (5). The equation shows that with increasing "statistical" group size $n$, i.e., with increasing spatial data density, the weight of an individual dip well is "down-weighted" to some degree, a behavior that corresponds to our initial intention to lower the influence of small peatlands compared to large ones. The error standard deviation $\sigma_e$ depends on several factors, e.g., the length of the time series, the temporal measurement density and the microtopography around the dip well. For simplicity, we here assumed $\sigma_e$ to be uniform for all dip wells, which simplifies Eq. (6) to

$$
w_i = \frac{1}{\sqrt{n_i}}.
$$

Only dip wells with the same land-use type were summed up with Eq. (5), which avoids down-weighting by dip wells that have different land-use types. The latter are mostly characterized by fairly different WLt and thus by rather low spatial autocorrelation to dip well $i$.

After spatial correlation had been accounted for, the sum of the weights of all dip wells of each land-use type was adjusted so that it corresponds to the fraction of this land-use type in Germany. This adjustment accounts for the overrepresentation in the data set of dip wells in unused peatlands and the underrepresentation of dip wells in arable land.
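The weighting steps of Eqs. (4) to (6) and the land-use adjustment can be sketched as below. This is a minimal illustration under stated assumptions rather than the authors' code: the standard spherical variogram formula is used, the sum in Eq. (5) is taken over all other wells of the same land-use type within the range, $\sigma_e$ is uniform, and the default nugget, sill and range are the fitted values reported in Sect. 3.1. Function names and the data layout (projected coordinates in m, one land-use label per well) are hypothetical.

```python
import math


def spherical_gamma(h, nugget, sill, rng):
    """Spherical semivariogram value at lag distance h (m)."""
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def dip_well_weights(coords, land_use, nugget=0.012, sill=0.09, rng=2700.0):
    """Weights w_i = 1 / sqrt(n_i) with n_i from Eq. (5).

    coords:   list of (x, y) positions in m
    land_use: list of land-use labels, one per dip well
    """
    weights = []
    for i, (xi, yi) in enumerate(coords):
        n_i = 1.0
        for j, (xj, yj) in enumerate(coords):
            if i == j or land_use[i] != land_use[j]:
                continue  # only other wells of the same land-use type count
            h = math.hypot(xi - xj, yi - yj)
            if h < rng:
                gamma = spherical_gamma(h, nugget, sill, rng)
                n_i += (sill - gamma) / (sill - nugget)  # f_ij of Eq. (4)
        weights.append(1.0 / math.sqrt(n_i))
    return weights


def rescale_to_land_use_fractions(weights, land_use, fractions):
    """Rescale so that the weights of each land-use type sum to its
    areal fraction (fractions: dict label -> fraction)."""
    totals = {}
    for w, lu in zip(weights, land_use):
        totals[lu] = totals.get(lu, 0.0) + w
    return [w * fractions[lu] / totals[lu] for w, lu in zip(weights, land_use)]
```

An isolated dip well keeps $n_i = 1$ and therefore the full weight of 1, whereas a well surrounded by close neighbors of the same land-use type is down-weighted before the land-use rescaling is applied.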

2.3.3 Model performance criteria

Model fit and predictive performance after cross-validation were quantified by the weighted root mean square error:

$$
\mathrm{RMSE} = \sqrt{\frac{1}{\sum_{i=1}^{m} w_i} \sum_{i=1}^{m} \left( w_i \left( x_{o,i} - x_{s,i} \right)^2 \right)}, \qquad (7)
$$

where $m$ is the number of dip wells, $x_{o,i}$ is the observed WL or WLt of dip well $i$, $x_{s,i}$ is the simulated WL or WLt of dip well $i$, and $w_i$ is the data weight of dip well $i$ (Eq. 6). We refer to the root mean square error of the predicted data of cross validation as RMSEcv. Model performance was further quantified by the Nash–Sutcliffe efficiency (NSE):

$$
\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{m} w_i \left( x_{o,i} - x_{s,i} \right)^2}{\sum_{i=1}^{m} w_i \left( x_{o,i} - \bar{x}_o \right)^2}, \qquad (8)
$$

where $\bar{x}_o$ is the mean of all observed WL or WLt. NSE indicates how well observed vs. predicted values match the 1:1 line. NSE is a good overall indicator of predictive performance because it combines scatter and bias (common offset and/or slope difference from the 1:1 line) (Nash and Sutcliffe, 1970). Values greater than 0 signify a model that is better than the reference model based on the data mean. We refer to the NSE of the training data as NSEcal and of the predicted data of cross validation as NSEcv.

Systematic errors were quantified by calculating the model bias, here defined as

$$
\mathrm{BIAS} = \sum_{i=1}^{m} \left( w_i x_{o,i} - w_i x_{s,i} \right). \qquad (9)
$$
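The three criteria of Eqs. (7) to (9) translate directly into code. The sketch below is an illustration with hypothetical function names. It assumes that the mean in Eq. (8) is the plain (unweighted) mean of the observations, as the text states, and it evaluates Eq. (9) without normalization, so the bias is only in the units of WL or WLt if the weights sum to 1.

```python
import math


def weighted_rmse(obs, sim, w):
    """Weighted root mean square error, Eq. (7)."""
    sq = sum(wi * (o - s) ** 2 for o, s, wi in zip(obs, sim, w))
    return math.sqrt(sq / sum(w))


def weighted_nse(obs, sim, w):
    """Weighted Nash-Sutcliffe efficiency, Eq. (8)."""
    mean_obs = sum(obs) / len(obs)
    num = sum(wi * (o - s) ** 2 for o, s, wi in zip(obs, sim, w))
    den = sum(wi * (o - mean_obs) ** 2 for o, wi in zip(obs, w))
    return 1.0 - num / den


def weighted_bias(obs, sim, w):
    """Weighted bias (observed minus simulated), Eq. (9)."""
    return sum(wi * o - wi * s for o, s, wi in zip(obs, sim, w))
```

A positive bias in this convention means that the simulated values are, on weighted average, lower than the observed ones.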

2.4 Model uncertainty and stability evaluation

Uncertainty of the model predictions was assessed by bootstrapping, cross-validation and residual analysis.

For the bootstrapping analysis, we followed the procedure of Leathwick et al. (2006). We estimated the confidence intervals around the predictions and the fitted functions by taking 1000 bootstrap samples of the 53 peatlands. The number of peatlands in each sample was equal to that of the data set, but peatlands were selected randomly with replacement. Using the predictor variables of the final model, a BRT model was fitted to each sample. Cross validation was again performed on peatlands; thus, a peatland in the calibration data set was not part of the cross-validation data set, to avoid overly optimistic results. Variances of the predictions and of the fitted functions of the 1000 models were evaluated.
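The resampling unit in this procedure is the peatland, not the individual dip well. A minimal sketch of such a grouped bootstrap is given below; the function name, the seed and the data layout (one peatland identifier per dip well) are assumptions made for illustration, and the model fitting itself is left out.

```python
import random


def bootstrap_peatlands(peatland_ids, n_samples=1000, seed=0):
    """Yield lists of dip-well row indices, one list per bootstrap sample.

    Peatlands are drawn with replacement, as many as there are distinct
    peatlands; all dip wells of a drawn peatland enter the sample together.
    """
    rng = random.Random(seed)
    unique = sorted(set(peatland_ids))
    rows_by_peatland = {
        p: [i for i, q in enumerate(peatland_ids) if q == p] for p in unique
    }
    for _ in range(n_samples):
        drawn = rng.choices(unique, k=len(unique))
        yield [row for p in drawn for row in rows_by_peatland[p]]
```

Each yielded index list would then be used to fit one model, and the spread of the resulting predictions and fitted functions across samples gives the bootstrap variance.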

If data sets are relatively small (e.g., n < 1000; De'ath, 2007), then the small size of the training and test data sets lowers model accuracy. Given the fairly small number of peatlands in the data set and the partly high spatial correlation of dip wells within these peatlands, we decided not to split the data set into a training and a test data set. Estimates of model accuracy can then be based on cross-validation, thereby making effective use of all the data (De'ath, 2007). The prediction uncertainty of the final model is estimated by the root mean square error of prediction (RMSEcv, see above) for each land cover class. After testing for near-normal distribution of the residuals, RMSEcv can be used to derive the 68 % and 95 % confidence intervals of the predictions with RMSEcv and 2 × RMSEcv, respectively.
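Cross-validation on peatlands and the interval rule above can be sketched as follows. This is an illustration only: the number of folds is an assumption (the text does not state it here), peatlands are assigned to folds in a simple round-robin manner, and the function names are hypothetical.

```python
def peatland_folds(peatland_ids, n_folds=10):
    """Yield (train_rows, test_rows) so that all dip wells of a peatland
    fall into the same fold and never appear on both sides of a split."""
    unique = sorted(set(peatland_ids))
    fold_of = {p: k % n_folds for k, p in enumerate(unique)}
    for fold in range(n_folds):
        test = [i for i, p in enumerate(peatland_ids) if fold_of[p] == fold]
        train = [i for i, p in enumerate(peatland_ids) if fold_of[p] != fold]
        if test:
            yield train, test


def prediction_interval(prediction, rmse_cv, level=95):
    """Interval of +/- RMSEcv (68 %) or +/- 2 x RMSEcv (95 %) around a
    prediction, valid for near-normally distributed residuals."""
    half_width = {68: 1.0, 95: 2.0}[level] * rmse_cv
    return prediction - half_width, prediction + half_width
```

Keeping whole peatlands out of the training data is what prevents spatially correlated neighbors of a test well from leaking information into the fit.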


Finally, additional residual analysis was performed to evaluate whether the predictions are biased for different land cover classes or geographical regions.

2.5 Regionalization

In the final regionalization step, the predictor variables contributing to the final model were determined on a 25 m × 25 m raster for all organic soils in Germany. Predictor variables were determined with the same map input that was used for model building. Land cover information, including information on ditches, was based on data from the year 2012, and the climatic data were based on the average of the last 30 years. The fine spatial resolution of 25 m × 25 m was not chosen to imply a highly spatially accurate model. Rather, this fairly fine scale was necessary to map the relatively small-scale effects of the topography, land use and peatland geometry variables. The final model was then used to make a prediction for each of these raster cells.

3 Results and discussion

3.1 Spatial correlation structure of the data set

The variogram model fitted to the sample variogram provided a nugget (0.012 m², 0.11 m), a sill (0.09 m², 0.3 m) and a range of spatial correlation (2700 m) for our data set of WL (Fig. 4). The nugget represents the very small-scale soil hydraulic variability and micro-topography effects on WL (van der Ploeg et al., 2012) and measurement error, e.g., by differences in the determination of the ground surface and in the timing of the manual measurements. Furthermore, micro-topography (e.g., hummocks) and oscillating peat surfaces of wet peatlands pose a challenge for an accurate determination of both ground surface and water level. The water level time series in the data set were of different lengths and ranged from 1 to 20 years. Interannual variability of water levels can be large (e.g., Knotters and van Walsum, 1997). For simplicity, data were not harmonized in our analysis by extrapolating WL time series to a 30-year period using weather data. Thus, the nugget also includes errors that are introduced by dip wells with different measurement periods that are located within the range of spatial correlation. In consideration of these error sources, the fitted nugget of 0.11 m appears to be a realistic value. At 0.3 m, the fitted sill matched nearly perfectly with the standard deviation of the data (0.31 m), which indicates consistency between semivariogram model and data set. The fitted range of spatial correlation of 2700 m reflects both physical effects, i.e., the average range of lateral flow due to hydraulic gradients, and the effect of average land-use patterns in Germany on the spatial correlation of WL. Fitted values were used in the calculation of the dip-well-specific weights using Eq. (6).

3.2 Typical water levels for land-use types in German organic soils

The land cover classes are characterized by plausible mean and median water levels, which show consistent differences between each other (Table 2 and Fig. 5a). The mean values of arable land and grassland agree with what can be expected for their agronomic requirements, with slightly lower water levels for arable land. The high variability observed for both classes may be related to the variability of the efficiency of installed drainage systems, for example the presence and condition of tile drains and the depth of ditches. Grasslands can be managed with very variable intensity, which is partly reflected in different water levels. Figure 5a further shows that deciduous forests seem to dominate slightly drier organic soils compared to coniferous forests, which dominate under wetter conditions. A high variability of water levels is observed for the land cover class unused peatland. On the one hand, post peat-cutting topography increases the variability of WL over short distances. It probably contributes to the high variance observed for this class. On the other hand, this class comprises both rather dry unused peatlands and wetter peatlands in which re-wetting measures have already taken place but which do not yet show a wet soil attribute in the ATKIS Digital Landscape Model. This may also cause part of the variance observed in the grassland and forest land cover classes. All wet land cover classes (reed, wet grassland, wet forest and wet unused peatland) that were separated by wetness indication clearly show higher water levels, which shows that the wetness attribute of the Digital Landscape Model is a useful attribute.

Figure 5b shows the transformed water level for all classes. It can be observed that the variances of the wetter land cover classes increase relative to the variances of the dry land cover classes. This is due to the highest sensitivity of GHG emissions in the wet range of water levels (> −0.5 m). Consequently, the rather high variance of WL for arable land corresponds to a rather low variance of WLt, i.e., to a rather low assumed effect of WL variability on the GHG budget.

3.3 BRT model calibration and validation: WL vs. WLt

In contrast to land cover class, the other predictor variables showed, if at all, only weak relations to WL and WLt when evaluating them with box plots, 2-D cross plots and simple correlation matrices. Here, we expected BRT to detect the strongest predictor interactions and to identify the most informative predictors.

After model calibration with all predictors, subsequent model simplification successively dropped those parameters with correlation > 0.7 and the lowest contribution. For both WL and WLt, model performance improved during this simplification. For WLt, the highest values of NSEcv of approximately 0.46 were achieved with 21 to 9 model parameters. The development of NSEcv for the last 50 parameters is shown in Fig. 6. Further elimination of parameters led to a pronounced decline of model performance. Similar behavior was observed for the calibration on WL. In favor of a more parsimonious model, we chose the model with the lowest number of parameters before the pronounced decline of model performance occurred. For the calibration on WLt, this corresponded to the model with the lowest number of parameters that still achieved NSEcv values of > 0.45 (Fig. 6). The final WLt model comprised nine predictor variables and the final WL model seven parameters. The percentages of parameter contributions to the final model and their individual influences are discussed for WLt in Sect. 3.4.

Figure 5. Water level relative to ground surface WL (m) and transformed water level WLt (−) by land cover class, illustrated as weighted box plots. WLt = −1 corresponds to maximum CO2 emissions and WLt = 1 to maximum CH4 emissions. The number of dip wells in each class is indicated on the top horizontal axes.

Figure 6. NSEcv as a function of the number of predictor variables used in the model of WLt during model simplification, shown for the last 50 parameter drops.

Table 3 summarizes the statistical performances of the models calibrated on WL and WLt. For both models, NSEcal is considerably higher than NSEcv, which reflects the commonly observed overfitting behavior of BRT models. The different measures that we took to minimize overfitting (cross-validation on peatlands, restriction to monotonic responses, and model simplification including elimination of highly correlated variables) lowered the difference between NSEcal and NSEcv but could not totally avoid overfitting. NSEcv of the WLt model (0.453) indicates higher predictive model performance compared to the WL model (0.381). However, as the data ranges differ due to the transformation, this comparison may be misleading. Therefore, we transformed the predictions of the WL model to obtain WLt values from this model and calculated the performance criteria in the same way (Table 3, second column). NSEcv is then slightly increased (0.397) but does not reach the value of the model that was calibrated on WLt. A better predictive performance of the model calibrated on WLt is also visible in the RMSEcv values. The total RMSEcv as well as the RMSEcv values for the dry (WL < −0.3 m) and wet range (WL > −0.3 m) show slightly lower values for the WLt model compared to WLt values from the model calibrated on WL. Given our hypothetical transfer function (Fig. 3), in which the GHG budget is linearly related to WLt, the higher accuracy of WLt predictions directly corresponds to a higher accuracy of GHG budget predictions.

Table 2. Weighted mean and standard deviation of WL and WLt data and of the WLt map presented in Sect. 3.6 for the nine land cover classes.

| Land cover class | WL (m), mean ± sd | WLt (−), mean ± sd | WLt (−), map mean ± sd |
|---|---|---|---|
| Arable land | −0.69 ± 0.30 | −0.76 ± 0.17 | −0.66 ± 0.22 |
| Deciduous forest | −0.45 ± 0.34 | −0.49 ± 0.37 | −0.47 ± 0.35 |
| Grassland | −0.44 ± 0.29 | −0.52 ± 0.32 | −0.49 ± 0.30 |
| Unused peatland | −0.39 ± 0.36 | −0.39 ± 0.41 | −0.37 ± 0.40 |
| Coniferous forest | −0.36 ± 0.36 | −0.37 ± 0.37 | −0.46 ± 0.35 |
| Wet unused peatland | −0.22 ± 0.27 | −0.18 ± 0.40 | −0.17 ± 0.36 |
| Wet forest | −0.22 ± 0.29 | −0.17 ± 0.43 | −0.21 ± 0.39 |
| Wet grassland | −0.10 ± 0.14 | −0.00 ± 0.31 | −0.15 ± 0.39 |
| Reed | −0.01 ± 0.17 | 0.20 ± 0.29 | −0.06 ± 0.32 |

Table 3. Performance criteria of the different models; dry range defined as WL < −0.3 m and wet range as WL > −0.3 m.

| Criterion | WL (m) (calibrated on WL) | WLt (−) (calibrated on WL) | WLt (−) (calibrated on WLt) |
|---|---|---|---|
| NSEcal | 0.627 | 0.559 | 0.642 |
| NSEcv | 0.381 | 0.397 | 0.453 |
| RMSEcv | 0.269 | 0.299 | 0.284 |
| RMSEcv, dry | 0.284 | 0.263 | 0.259 |
| RMSEcv, wet | 0.222 | 0.382 | 0.355 |
| Bias | −0.003 | 0.083 | 0.002 |
| Bias, dry | −0.012 | 0.070 | 0.003 |
| Bias, wet | 0.021 | 0.120 | 0.000 |

Superior model performance is also evident when evaluating model bias. Only when calibrating directly on WLt are the WLt predictions bias-free. Calibration on WL and subsequent transformation to WLt introduces a model bias towards systematically lower WLt values. In subsequent applications to GHG emission upscaling, lower WLt values would lead to an overestimation of CO2 emissions and to an underestimation of CH4 emissions.
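One mechanism that can produce such a bias in the dry range is the nonlinearity of the transformation: transforming an averaged WL is not the same as averaging the transformed values. The sketch below illustrates this with two hypothetical dry dip wells. It is an illustration of that general effect only, not an analysis of the study's model, and the function names are chosen here.

```python
import math


def wl_transformed(wl):
    """Transformed water level WLt of Eq. (2); wl in m."""
    if wl <= -0.1:
        return math.exp(3.0 * (wl + 0.1)) - 1.0
    return 1.0 - math.exp(-3.0 * (wl + 0.1))


def transform_gap(water_levels):
    """Return (WLt of the mean WL, mean of the individual WLt values)."""
    mean_wl = sum(water_levels) / len(water_levels)
    wlt_of_mean = wl_transformed(mean_wl)
    mean_of_wlt = sum(wl_transformed(w) for w in water_levels) / len(water_levels)
    return wlt_of_mean, mean_of_wlt


# Two hypothetical dry wells; a model that predicts their mean WL and is
# transformed afterwards ends up below the mean of the observed WLt values.
wlt_of_mean, mean_of_wlt = transform_gap([-0.3, -0.9])
```

Because the dry branch of Eq. (2) is convex, the transformed mean lies below the mean of the transformed values in this example, which is the same direction as the bias reported for the dry range.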

3.4 Influence of predictor variables on WLt

Given the beneficial characteristics of the model calibrated on WLt for GHG upscaling, presentation and discussion of further model results is restricted to the WLt model.

The BRT method allows the analysis of the parameter contributions to and influences on the model (Elith et al., 2008) and thus may contribute to system understanding. The percentages of the contributions of the nine predictor variables to the final model ranged from 25.2 to 5.6 % (Fig. 7). Except protection status, at least one parameter of each of the seven parameter groups contributed to the final model. All protection status information was dropped early during the simplification process due to low contribution, although WL showed slightly higher values for data from nature protection or special areas of conservation. However, other parameters seem to be able to fully compensate the information that is lost by dropping this predictor.

Land cover class lc at the dip well was the parameter with strongest contribution (25.2 %). It basically follows the trend illustrated in Fig. 5b. The bootstrap error plotted as standard deviation (Fig. 7) shows the variation of this influence over the 1000 bootstrap models. A second land cover parameter, the fraction of dry land cover classes on organic soils in a buffer of 2500 m radius, fdry (2500), contributed to the model with 10.3 %. The monotonic decrease of WLt with increasing fdry (2500) is plausible, as higher values reflect intensive land use in the surroundings of the dip well and thus indicate intensive artificial drainage. Together, both parameters contributed 35.5 %, and thus land cover represents the parameter group with the strongest model contribution.

Peatland characteristics are the second most important parameter group. The peatland type contributed 16 %. The model indicates that peatlands without any connection to surface water bodies (river or lake) and the class of other organic soils are characterized by lower WLt compared to the peatland types lowland bog, upland bog and fen neighboring surface water. As the class of other organic soils is generally expected to reflect lower water levels, and as surface water may have a stabilization effect on water levels of organic soils, the influence of the peatland type can be considered plausible. Besides peatland type, the substrate of the peat base contributes 5.6 %. Here, organic soils overlying peat clay layers (e.g., limnic sediments such as calcareous gyttja) or basement rock are characterized by higher WLt compared to organic soils overlying unconsolidated rock. This can be explained by the lower drainage resistance of unconsolidated rocks. This may cause an increased efficiency of anthropogenic drainage and/or a general higher vulnerability to seepage losses. Finally, slightly lower WLt values are indicated by a high fraction of organic soils for the 500 m buffer, fpeat (500). This may reflect the higher land-use pressure on large peatlands compared to rather small peatlands, which tentatively are more easily preserved by nature protection efforts.

The remaining four parameter groups are represented in the model by only one parameter each. The third most influential parameter was the length of ditches on arable land and grassland for the 250 m buffer, dilendry (250). At first glance, it may be surprising that WLt values tend to be higher with increasing ditch density, as ditches are supposed to drain the water when land is used as arable land and grassland. The fact that the model identifies a rather strong effect in the opposite direction may be caused by incomplete information about the drainage network. There is no detailed information about the spatial distribution of tile drains. Based on expert knowledge, agricultural areas with a lower ditch density are more likely to have tile drains. As these drains, easily installed with a narrow drain spacing, are more effective at draining organic soils, low WLt values for arable land and grassland may be related to low ditch densities. Furthermore, ditches were originally dug at narrow spacing in especially wet areas of organic soils, but there is no information available on whether these ditches still function properly.

The parameters wbsummer, hrel and tiras25 all show expected trends. The model predicts higher WLt for increasing climatic water balance in the summer period (May to October), wbsummer, for dip wells located in depressions (low values of hrel), and for higher small-scale topographic wetness indices calculated on the 25 × 25 digital elevation model (tiras25).


Figure 7. Partial dependence plots for the predictor variables. For an explanation of variables, see Table 1. The y axes are on WLt scale and are centered around the mean WLt. Error bars and grey area indicate standard deviation of the response over 1000 bootstrap models. The relative contribution of each predictor is indicated as percentage. The lines along the x axes of each plot show distribution of data across that variable in deciles.

The fact that all parameters show expected or explainable responses in the model corroborates the reliability of the calibrated WLt model. The standard deviation of the predictor responses based on the bootstrap samples shows the stability of the observed responses.

Further insight into model behavior can be obtained by analyzing parameter interactions. This is done by changing two parameters simultaneously while keeping mean values for all other parameters (Elith et al., 2008). Figure 8 shows the two strongest parameter interactions. Parameter wbsummer strongly interacts with ptype. The generally lower values of WLt of fens without surface water connection and other organic soils show a stronger dependency on the summer climatic water balance. While a summer climatic water balance of > −80 mm shows a rather weak effect on WLt for the wetter peatland types, the two drier peatland types, in contrast, still show a strong effect with increasing wbsummer. The trend for wbsummer > 130 mm for the dry peatland types is supported by seven different peatlands.


Figure 8. Partial dependence plots representing the two strongest interactions in the model (a) between ptype and wbsummer and (b) between pbase and fdry. Fitted WLt is plotted on the y axis, which is obtained after accounting for the average effect of all other predictor variables.

Another strong interaction is observed for pbase and fdry (2500). While a rather weak effect of the fraction of arable land and grassland is observed for organic soils overlying basement rock and peat clay layer, a strong effect is observed for organic soils overlying unconsolidated rock. This interaction reflects the higher lateral range of drainage effects for organic soils with little flow resistance at the peat base. In these organic soils, intensive land use lowers the water level over large areas.
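The way such interaction surfaces are obtained (varying two parameters simultaneously while all others are held at their mean values) can be sketched as follows. This is an illustrative sketch only: `interaction_surface`, `toy_model` and the variable names x1 to x3 are hypothetical, and the toy prediction function merely stands in for a fitted BRT model.

```python
def interaction_surface(predict, means, var_a, grid_a, var_b, grid_b):
    """Evaluate a model on a grid of two predictors, all others fixed at their means.

    predict: function taking a dict {predictor name: value} and returning a prediction
    means:   dict with the mean value of every predictor
    Returns a nested list; rows follow grid_a, columns follow grid_b.
    """
    surface = []
    for a in grid_a:
        row = []
        for b in grid_b:
            point = dict(means)   # copy, so the means stay untouched
            point[var_a] = a
            point[var_b] = b
            row.append(predict(point))
        surface.append(row)
    return surface

def toy_model(p):
    """Hypothetical stand-in for a fitted model, with an x1-x2 interaction."""
    return 0.1 * p["x1"] * p["x2"] + p["x3"]

means = {"x1": 0.5, "x2": 1.0, "x3": 1.0}
surface = interaction_surface(toy_model, means, "x1", [0.0, 1.0], "x2", [0.0, 2.0])
for row in surface:
    print(row)
```

In the toy model the effect of x2 depends on the value of x1, so the rows of the surface are not parallel; this is the pattern that Fig. 8 displays for the fitted WLt model.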

3.5 Discussion of model uncertainty

Plotting observed vs. predicted WLt from cross-validation (Fig. 9) illustrates the rather large residual variance that cannot be explained by the model. As indicated by the higher RMSEcv for the wet range (Table 3), scatter increases with increasing WLt. Error bars in the y direction indicate data error derived from the nugget of the variogram. It is shown for a few data points as an example. Due to transformation, data error increases for higher WLt. Figure 9 demonstrates that the fraction of unexplainable variance related to data error is much higher for the wet than for the dry range. Bootstrap error, indicating the variation of the model predictions for 1000 bootstrap samples, is shown in the x direction for the same data points. Bootstrap error is lower than the data error for the wet range and slightly higher for the dry range.

Figure 9. Observed vs. predicted transformed annual mean water level (WLt) from cross-validation results. Error bars show selected data and bootstrap model errors as standard deviation. Data points are scaled by their weights.

Bootstrap errors demonstrate the sensitivity of model predictions to changes of the data set used for calibration. When a model possesses structural deficits, such as missing predictor variables, bootstrap errors should not be used to define confidence intervals for the model predictions. Figure 10 shows residuals from cross-validation and standard deviation of bootstrap predictions for all land cover classes. The residuals of each land cover class show near-normal distributions. For five of the nine land cover classes (wet forest, wet unused peatland, arable land, coniferous forest and reed), the Shapiro–Wilk test of normality is positive (p > 0.05). Figure 10a further indicates that the residuals of each land cover class scatter fairly well around zero, indicating low bias for the various land cover classes. Land-cover-class-specific confidence intervals of model predictions can thus be derived from the RMSEcv of each land cover class, e.g., 2 × RMSEcv representing the 95 % confidence interval.
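A minimal sketch of how such class-specific intervals could be derived from cross-validation results is given below. The function name, the record layout and the example values are hypothetical; the factor of 2 is the approximation of the 95 % interval mentioned above and presumes near-normal, nearly unbiased residuals within each class.

```python
import math
from collections import defaultdict

def class_interval_halfwidths(records, k=2.0):
    """Half-width k * RMSEcv of the prediction interval per land cover class.

    records: iterable of (land_cover_class, observed WLt, cross-validated prediction)
    """
    squared = defaultdict(list)
    for cls, obs, pred in records:
        squared[cls].append((obs - pred) ** 2)
    return {cls: k * math.sqrt(sum(v) / len(v)) for cls, v in squared.items()}

# Made-up cross-validation results for illustration only
records = [
    ("wet grassland", 0.10, -0.05),
    ("wet grassland", -0.20, 0.05),
    ("arable land", -0.60, -0.50),
    ("arable land", -0.45, -0.55),
]
for cls, half in class_interval_halfwidths(records).items():
    print(f"{cls}: prediction +/- {half:.3f}")
```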

The prediction uncertainty derived from cross-validation is much higher than the bootstrap prediction uncertainty obtained from the bootstrap standard deviation (sd), with 2 × sd corresponding to the 95 % confidence interval (Fig. 10). The large difference between these values indicates that the model has structural deficits that can be attributed to several error sources:

i. Key influences on WLt are missing in the set of predictor variables. None of the predictor variables indicate whether and to which extent water level increase due to re-wetting measures took place in the last years. Wetness indicators (wet soil and/or vegetation attributes) that are obtained from the Digital Landscape Model probably react with a delay of several years. Thus, we expect the occurrence of several observed high WLt values that cannot be explained by any of the predictor variables.

Figure 10. (a) Residuals (observation–prediction) of WLt predictions and (b) standard deviation (sd) of bootstrap predictions shown for the nine land cover classes. In the upper part, the number of dip wells in each class is indicated.

ii. Small-scale topography that is not represented with sufficient detail and accuracy in the DEM may cause several predictions to strongly differ from what would be expected from the other predictor variables. A common example may be a dip well that is located on a narrow peat ridge, which remained after peat-cutting and is absent in the DEM, and that is situated in an area classified as wet soil by the Digital Landscape Model. Then the model indicates a WLt that is much higher than the observed WLt, as for the observed value the reference surface was the surface of the peat ridge.

iii. Consistent information about tile drains is missing and only exists on the regional scale (Tetzlaff et al., 2009). At the national scale, however, there are no maps on tile drains. Tile drains are known to have a strong effect on WLt for arable land and grassland. As explained above, we expect parameter dilendry (250) to partially compensate for this missing information.

iv. Another source of prediction uncertainty may comprise inconsistent and erroneous land cover classification of the Digital Landscape Model due to the high degree of subjectivity for many of the attributes. Furthermore, the dates in the Digital Landscape Model may be off by as much as 5 years, which can cause time series with land-use change to be split at the wrong date and vegetation and wetness attributes to be not yet updated to the current conditions.

Figure 11. Residuals (observation–prediction) of WLt predictions for the three major geographical peatland regions of Germany. In the upper part, the number of dip wells in each class is indicated.

v. The water balance of fens strongly depends on the size and the hydraulic head of the groundwater catchment, i.e., of the aquifer underlying the peat layer. Unfortunately, there is no consistent map of hydraulic heads or groundwater catchments for all Germany.

We checked model predictions for geographical bias. Geographical location was not one of the model parameters. However, the history and policy of land use on organic soils, current ditch water management and climate do show large-scale geographical trends. We divided our data set into the three major German peatland regions (NE, NW and S) and evaluated the model residuals (Fig. 11) to see whether our model is biased due to important missing geographical effects. A serious bias for any of the three major German peatland regions cannot be identified.


Figure 12. Map of predictions of transformed annual mean water level (WLt) for all German organic soils (a) and an enlarged map section (b). Probability distribution in (c) indicates the uncertainty of a specific point prediction for wet grassland as an example. Here, predicted value is approximately WLt = 0, but note that wet grassland predictions do vary in space depending on the values of the other model parameters. The histogram shows the residuals from cross-validation for wet grassland to which the probability distribution was fitted.

When applying calibrated statistical models during regionalization, it is important to check model behavior for extrapolation outside the range of the parameter space that is covered by the data upon which the model was built. BRT always extrapolates at a constant value from the most extreme environmental value in the training data. In contrast to other types of statistical models, e.g., generalized linear models, BRT does not continue the fitted trend beyond the last observation. Regarding the categorical variables, the data set covers all classes occurring in Germany, each with several peatlands. The data set also covers the major range of values occurring in Germany for the numerical predictor variables. Furthermore, Fig. 7 indicates that the constant values at which the model extrapolates the influence of the variables do not raise major concern for any extreme predictions outside the parameter range.
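The constant extrapolation of tree-based models can be made visible with a small, self-contained sketch. It is not the BRT implementation used in the study: it boosts one-split trees (stumps) on a single made-up predictor with squared-error loss, and all function names, parameter values and data are hypothetical. Because every split threshold lies inside the training range, any query beyond that range falls into the same leaves as the most extreme training value.

```python
def fit_stump(x, resid):
    """Find the single split on x that minimizes the squared error of the residuals."""
    values = sorted(set(x))
    best = None
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, resid) if xi <= thr]
        right = [r for xi, r in zip(x, resid) if xi > thr]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = sum((r - mean_l) ** 2 for r in left) + sum((r - mean_r) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, mean_l, mean_r)
    return best[1], best[2], best[3]

def fit_boosted_stumps(x, y, n_trees=200, learning_rate=0.1):
    """Gradient boosting with stumps for squared-error loss."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        thr, mean_l, mean_r = fit_stump(x, resid)
        step_l, step_r = learning_rate * mean_l, learning_rate * mean_r
        stumps.append((thr, step_l, step_r))
        pred = [pi + (step_l if xi <= thr else step_r) for xi, pi in zip(x, pred)]
    return base, stumps

def predict(model, xq):
    """Sum the base value and the leaf contributions of all stumps."""
    base, stumps = model
    return base + sum(step_l if xq <= thr else step_r for thr, step_l, step_r in stumps)

# Made-up training data with a purely linear trend on the range 0..10
x = [float(i) for i in range(11)]
y = [0.05 * xi for xi in x]
model = fit_boosted_stumps(x, y)

print("prediction at x = 10 (edge of data):", predict(model, 10.0))
print("prediction at x = 50 (extrapolated):", predict(model, 50.0))
print("linear trend continued to x = 50:   ", 0.05 * 50.0)
```

The two model predictions are identical, whereas a model that continues the fitted linear trend would return 2.5 at x = 50.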

3.6 Regionalization

The map of WLt resulting from the application of the fitted WLt model to all grid cells shows gradients at the regional scale (Fig. 12a). In the south of Germany, for example, a gradient from wet to dry can be observed for the pre-alpine upland bogs and the peatlands of the moraine plain. In the north of Germany, the map indicates that organic soils in the very NE are wetter than the rest. For the rest of the north, a slight gradient can be observed from less dry to dry from NW to E, which is mainly driven by the higher summer climatic water balance in the NW. As both categorical and numerical predictor variables do also vary at the sub-regional scale, the resulting map also shows gradients within peatland areas, e.g., due to small-scale land-use, ditch density gradients and topography effects (Fig. 12b).

We calculated WLt averages of the land cover classes using the regionalized WLt from the map (Table 2, column 3). The given standard deviation comprises both the variability within a land cover class that is explained by the model as well as the uncertainty of each prediction. Resulting means and standard deviations slightly differ from the corresponding values of the data set. The land-cover-specific WLt values obtained from the map can be considered as being more representative, as the regionalization procedure is supposed to partly account for potential bias in the data set.

When applying this map and its predicted WLt values in subsequent GHG upscaling, it is crucial that model uncertainty is propagated properly. An example demonstrates the necessity of uncertainty propagation. For a grid cell classified as wet grassland, the probability distribution of WLt is shown based on a normal distribution that was fitted to the residuals of this land cover class (Fig. 12c). Without propagating the uncertainty, and when only translating the predicted WLt (possibly in combination with other parameters, e.g., soil properties) into a GHG budget, the GHG budget is strongly underestimated, as the WLt prediction is close to zero, indicating neither large CO2 nor CH4 emissions. When translating the full distribution of WLt into a GHG budget, the resulting GHG budget would be much higher, as the GHG budget increases on both sides of the predicted WLt.
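The effect can be reproduced numerically with a hedged sketch. The transfer function below is purely hypothetical: it assumes a GHG budget that is zero at WLt = 0 and rises linearly on both sides, with unit slopes chosen for illustration rather than taken from the study. The standard deviation of 0.3 is likewise an assumed value. The expected budget is obtained by integrating the function over a normal distribution of WLt with the trapezoidal rule.

```python
import math

def ghg_budget(wlt, slope_dry=1.0, slope_wet=1.0):
    """Hypothetical V-shaped transfer function (arbitrary units).

    Negative WLt contributes via slope_dry (CO2-like branch),
    positive WLt via slope_wet (CH4-like branch).
    """
    return slope_dry * max(-wlt, 0.0) + slope_wet * max(wlt, 0.0)

def expected_budget(mu, sd, n=4001, width=8.0):
    """Expected GHG budget for WLt ~ Normal(mu, sd), trapezoidal integration."""
    lo = mu - width * sd
    step = 2.0 * width * sd / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * step
        pdf = math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n - 1) else 1.0   # trapezoid end-point weights
        total += weight * pdf * ghg_budget(x) * step
    return total

mu, sd = 0.0, 0.3   # assumed prediction and prediction uncertainty
print("budget at the predicted WLt only: ", ghg_budget(mu))
print("budget with uncertainty propagated:", expected_budget(mu, sd))
```

For this symmetric toy case the propagated budget equals sd × sqrt(2/π), i.e., it is clearly above zero although the point prediction alone yields a budget of zero.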

3.7 Possible paths for model improvement

The model performance that is achieved by the statistical approach presented in our study raises the question whether collecting more WL data can improve model performance, or whether the factor that is constraining the model performance is the limited strength of the nation-wide available predictor variables. To assess this question, additional "hold-out models" were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated and model performance was evaluated with NSEcv. As expected, results indicate an increase of model performance with increasing number of peatlands used in the model building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when including many more peatlands into the data set. More promising would be the specific collection of more data on the weakly represented and/or important land cover classes arable land and grassland.
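The sampling design of such a hold-out experiment can be sketched as below. Only the random selection of peatlands and the summary as mean ± sd follow the description above; the actual model fitting is left to a user-supplied function. The names `learning_curve`, `fit_and_score` and `dummy_score` are hypothetical, and the dummy score is an arbitrary saturating curve, not a result of the study.

```python
import random
import statistics

def learning_curve(peatland_ids, fit_and_score, sizes, n_repeats=500, seed=0):
    """Mean and sd of a performance score for random subsets of peatlands.

    peatland_ids:  peatland identifier of every dip well (repeats allowed)
    fit_and_score: function taking a set of selected peatland ids and
                   returning a score such as NSEcv (to be supplied by the user)
    sizes:         numbers of peatlands to draw
    """
    rng = random.Random(seed)
    unique_ids = sorted(set(peatland_ids))
    curve = {}
    for k in sizes:
        scores = [fit_and_score(set(rng.sample(unique_ids, k))) for _ in range(n_repeats)]
        curve[k] = (statistics.mean(scores), statistics.stdev(scores))
    return curve

def dummy_score(selected):
    """Placeholder for model calibration and cross-validation (arbitrary curve)."""
    return 0.5 * (1.0 - 1.0 / len(selected))

ids = list(range(53))   # 53 peatlands, as in the data set
for k, (mean, sd) in learning_curve(ids, dummy_score, [10, 20, 30, 40, 50], n_repeats=20).items():
    print(k, round(mean, 3), round(sd, 3))
```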

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nation-wide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that has been successfully conducted in Finke et al. (2004) using the transfer noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, such as spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± sd.

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data covers only a small fraction of peatlands of the final map and thus spatial interpolation is not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. With the analysis of their effect on the water level, it was possible to gain insight into natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions depend linearly. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions, and avoided model bias.

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show, however, that model improvement by increasing the number of WLt data seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively few data are available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, for example water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach is suited to application to any other country where similar data of target and predictor variables is available. It is important that the spatial resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if for specific regions detailed information on peat properties becomes available and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), Landkreis Gifhorn, NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support, Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support, Annette Freibauer for reviewing the paper draft, and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project "Organic soils" funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43, 5, 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirkby, M. J.: A physically based, variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De'ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz – Moornutzungsstrategien 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.


Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package "dismo", version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada's forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8, 5, 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models, part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package "gbm", version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.

Tetzlaff, B., Kuhr, P., and Wendland, F.: A New Method for Creating Maps of Artificially Drained Areas in Large River Basins Based on Aerial Photographs and Geodata, Irrig. Drain., 58, 569–585, doi:10.1002/Ird.426, 2009.

Thompson, J. R., Gavin, H., Refsgaard, A., Sorenson, H. R., and Gowing, D. J.: Modelling the hydrological impacts of climate change on UK lowland wet grassland, Wetl. Ecol. Manage., 17, 503–523, doi:10.1007/s11273-008-9127-1, 2009.

UBA: National Inventory Report for the German Greenhouse Gas Inventory 1990–2008, Submission under the United Nations Framework Convention on Climate Change and the Kyoto Protocol 2012, Dessau, Germany, 2012.

van den Akker, J. J. H., Jansen, P. C., Hendriks, R. F. A., Hoving, I., and Pleijter, M.: Submerged infiltration to halve subsidence and GHG emissions of agricultural peat soils, Proceedings of the 14th International Peat Congress, Stockholm, Sweden, 2012.

van der Gaast, J. W. J., Massop, H. T. L., and Vroon, H. R. J.: Actuele grondwaterstandsituatie in natuurgebieden: Een Pilotstudie, Wettelijke Onderzoekstaken Natuur & Milieu, WOt-rapport 94, Wageningen, 134 pp., 2009.

van der Ploeg, M. J., Appels, W. M., Cirkel, D. G., Oosterwoud, M. R., Witte, J. P. M., and van der Zee, S. E. A. T. M.: Microtopography as a Driving Mechanism for Ecohydrological Processes in Shallow Groundwater Systems, Vadose Zone J., 11, 52–62, doi:10.2136/Vzj2011.0098, 2012.

Wackernagel, H.: Multivariate Geostatistics, Springer, Berlin, Germany, 387 pp., 2003.

wells. Second, the predictor variable data set contains categorical and numerical data, and part of the predictor variables predominantly vary from peatland to peatland (e.g., climatic boundary conditions, large-scale topographic wetness index, peatland characteristics), whereas others also show within-peatland variability (e.g., land use, small-scale topographic wetness index, drainage network). As the influence of the individual predictor variables on our target WLt is expected to be rather diffuse due to abundant interactions with other site characteristics, the robustness of derived dependencies will strongly depend on the number of different peatlands in the data set.

There are no universal data weighting rules for similarly heterogeneous data situations, and some degree of expert judgment and subjectivity is inevitably involved when developing an appropriate scheme (Francis, 2011). The need for introducing a data weighting scheme is obvious: without data weighting during calibration, too much influence would be given to small and well-studied peatlands, which would reduce predictive model performance for large, less-well-studied peatland areas. To avoid this in a simple manner, the weight of each dip well could be divided by the number of dip wells in its peatland, which results in each peatland being equally weighted. This scheme, however, does not sufficiently use the high information content provided by well-studied large peatlands, which should have a higher impact on model calibration than a small peatland with only a few dip wells.

Here, we propose a new weighting scheme that takes into account both factors, peatland size and local density of dip wells, to derive dip-well-specific weighting factors. It is based on principles of data uncertainty reduction by repeated measurements and of geostatistics. First, we consider our data situation as an analogue of meta-analysis with grouped data. It has been shown for homogeneous problems (all data from the same population) that the optimal group weight for meta-analysis is 1/SE² (Hedges and Olkin, 1985), with SE being the standard error of each group:

$$\mathrm{SE} = \frac{\sigma_e}{\sqrt{N}} \tag{3}$$

where $\sigma_e$ is the error standard deviation of a measurement and $N$ is the number of measurements in a group. For homogeneous problems and uniform $\sigma_e$, this results in weights that are linearly dependent on $N$, which we here call the first end member of weighting. Heterogeneity (within-group variance) reduces the variation of the group weights, which can be shown by random effect models (Cumming, 2012). As the second end member of weighting, when heterogeneity totally dominates within-group variance, optimal group weights are uniform for all groups, i.e., weights are independent of $N$. We are not aware of a method that allows the estimation of the degree of heterogeneity for the complex target and predictor data situation in this study, including data (spatial and temporal variability, measurement error) and model errors (missing parameters). As a trade-off between 1/SE² (homogeneous end member) and 1 (heterogeneous end member), we decided on a group weight that is the inverse of the standard error, 1/SE, which is, for example, often used in econometric studies (Dickens, 1990). We emphasize that this is a subjective decision.

Figure 4. Sample semivariogram and fitted semivariogram model of the annual mean water level data WL.
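To make the trade-off between the two end members concrete, the three candidate group weights can be written out for uniform $\sigma_e$ using Eq. (3). This is only a worked restatement of the argument above, not an additional result:

```latex
\[
\frac{1}{\mathrm{SE}^2} = \frac{N}{\sigma_e^2} \propto N,
\qquad
\frac{1}{\mathrm{SE}} = \frac{\sqrt{N}}{\sigma_e} \propto \sqrt{N},
\qquad
1 \propto N^{0}.
\]
```

For example, relative to a group with a single measurement, a group with $N = 16$ measurements receives 16 times the weight under the homogeneous end member, 4 times the weight under the chosen 1/SE scheme, and the same weight under the heterogeneous end member.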

The group weight 1/SE is the basis for the geostatistical part of our weighting scheme. There are two reasons why we cannot directly treat our peatlands as groups. First, there is within-peatland variability, which is related to changing site characteristics. It is one objective of our study to describe this variability by statistical modeling. Thus, dip wells must be treated individually, and data cannot be aggregated at the peatland level. Second, we expect the model to learn more when the same number of dip wells is installed in a larger peatland. In a small peatland, spatial autocorrelation between dip wells is higher, i.e., the information content is lower than for large peatlands. As a consequence of the first point, we do not aggregate and keep all dip wells in the target variable data set by attributing to each dip well the fraction $1/N$ of its group weight, so that the relative weights of the groups remain constant. As a consequence of the second point, we use principles of geostatistics in our weighting scheme. We replace the group size $N$ (positive integer number) by the “statistical” group size $n$ (positive continuous number, being ≥ 1), which we derive from the spatial autocorrelation among the dip wells.

Therefore, we analyze the spatial autocorrelation structure of the data set. A single spherical variogram model was fitted to the sample variogram of all data (Fig. 4 in Sect. 3.1). Variogram models allow the differentiation of the total data variance (called “sill”) into a spatially uncorrelated variance (called “nugget”) and a spatially correlated variance (called “structural variance” and defined as sill − nugget) (Wackernagel, 2003). For any distance between two locations, the variogram model yields the average squared difference of values, here defined as $\gamma$. By definition, at distance 0, the average squared difference equals the nugget, and at distances greater than the “range” of spatial autocorrelation, the average squared difference equals the sill. Accordingly, the autocorrelated fraction $f$ of the average squared difference between two dip wells $i$ and $j$ is

$$f_{ij} = \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}} \tag{4}$$
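As a minimal sketch of Eq. (4), the snippet below evaluates a spherical semivariogram and the resulting autocorrelated fraction. The nugget, sill and range are the fitted values reported in Sect. 3.1; the standard spherical model form and the function names `spherical_gamma` and `autocorrelated_fraction` are assumptions made for illustration, since the fitting software is not specified here.

```python
import numpy as np

# Fitted values reported in Sect. 3.1 for the WL data set.
NUGGET = 0.012   # m^2
SILL = 0.09      # m^2
RANGE = 2700.0   # m


def spherical_gamma(h, nugget=NUGGET, sill=SILL, a=RANGE):
    """Spherical semivariogram (assumed standard form) at distance h in m.

    Returns the nugget as h approaches 0 and the sill for h >= a.
    """
    s = np.clip(np.asarray(h, dtype=float) / a, 0.0, 1.0)
    return nugget + (sill - nugget) * (1.5 * s - 0.5 * s**3)


def autocorrelated_fraction(h, nugget=NUGGET, sill=SILL, a=RANGE):
    """Eq. (4): f = (sill - gamma) / (sill - nugget)."""
    return (sill - spherical_gamma(h, nugget, sill, a)) / (sill - nugget)


# Fractions for wells 1 m apart, at half the range, at the range, and beyond it.
f = autocorrelated_fraction([1.0, 1350.0, 2700.0, 5000.0])
```

With this form, the fraction is close to one for neighbouring wells and drops to zero at and beyond the range, so wells farther apart than 2700 m do not contribute to each other's statistical group size.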

We now define the “statistical” group size $n$ of each dip well $i$ to be the sum of one plus the autocorrelated fractions $f_{ij}$ of all dip wells that are within the range of spatial autocorrelation of $i$:

$$n_i = 1 + \sum_{j=1}^{m} \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}} \tag{5}$$

According to the discussion above, dip-well-specific weights can then be calculated with

$$w_i = \frac{1}{n_i\,\mathrm{SE}_i} = \frac{1}{\sigma_{e,i}\sqrt{n_i}} \tag{6}$$

where $n_i$ is derived from Eq. (5). The equation shows that, with increasing “statistical” group size $n$, i.e., with increasing spatial data density, the weight of an individual dip well is “down-weighted” to some degree, a behavior that corresponds to our initial intention to lower the influence of small peatlands compared to large ones. The error standard deviation $\sigma_e$ is dependent on several factors, e.g., the length of the time series, the temporal measurement density and the microtopography around the dip well. For simplicity, we here assumed $\sigma_e$ to be uniform for all dip wells, which simplifies Eq. (6) to $w_i = 1/\sqrt{n_i}$.

Only dip wells with the same land-use type were summed up with Eq. (5), which avoids the down-weighting by dip wells that have different land-use types. The latter are mostly characterized by fairly different WLt and thus by rather low spatial autocorrelation to dip well $i$.
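The weighting steps described above (Eq. 5, the simplified form of Eq. 6 with uniform $\sigma_e$, and the restriction to wells of the same land-use type) can be sketched as follows. The function name `dip_well_weights` and the inputs are hypothetical; the matrix `f` of autocorrelated fractions is assumed to be precomputed from Eq. (4) and to be zero for pairs beyond the range.

```python
import numpy as np


def dip_well_weights(f, land_use):
    """Return (n, w): statistical group sizes (Eq. 5) and weights 1/sqrt(n).

    f        : (m, m) array of autocorrelated fractions f_ij (0 beyond the range)
    land_use : length-m sequence of land-use labels, one per dip well
    """
    f = np.asarray(f, dtype=float)
    land_use = np.asarray(land_use)
    # Only pairs of wells sharing a land-use type enter the sum.
    same_class = land_use[:, None] == land_use[None, :]
    np.fill_diagonal(same_class, False)  # the well itself is the leading "1"
    n = 1.0 + np.where(same_class, f, 0.0).sum(axis=1)
    w = 1.0 / np.sqrt(n)
    return n, w


# Illustrative input: two nearby grassland wells and one arable well.
f_example = [[1.0, 0.5, 0.8],
             [0.5, 1.0, 0.8],
             [0.8, 0.8, 1.0]]
n, w = dip_well_weights(f_example, ["grassland", "grassland", "arable"])
```

In this example the two grassland wells down-weight each other, while the arable well keeps a statistical group size of one although it is spatially close to both.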

After spatial correlation had been accounted for, the sum of the weights of all dip wells of each land-use type was adjusted so that it corresponds to the fraction of this land-use type in Germany. This adjustment accounts for the overrepresentation in the data set of dip wells in unused peatlands and the underrepresentation of dip wells in arable land.
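A minimal sketch of this final adjustment is given below. The function name `rescale_to_land_use_fractions` is hypothetical, and the target fractions in the example are placeholders chosen for illustration, not the national land-use fractions used in the study.

```python
import numpy as np


def rescale_to_land_use_fractions(w, land_use, target_fraction):
    """Rescale weights so that each land-use class sums to its target fraction."""
    w = np.asarray(w, dtype=float)
    land_use = np.asarray(land_use)
    missing = set(land_use.tolist()) - set(target_fraction)
    if missing:
        raise ValueError(f"no target fraction for: {sorted(missing)}")
    out = np.empty_like(w)
    for cls, frac in target_fraction.items():
        mask = land_use == cls
        if not mask.any():
            continue  # class has no dip wells in the data set
        out[mask] = w[mask] * frac / w[mask].sum()
    return out


# Placeholder fractions, for illustration only.
w_adj = rescale_to_land_use_fractions(
    [1.0, 1.0, 2.0],
    ["grassland", "grassland", "arable"],
    {"grassland": 0.6, "arable": 0.4},
)
```

The relative weights within a class are preserved, so the spatial-density correction from Eq. (6) remains intact after the rescaling.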

2.3.3 Model performance criteria

Model fit and predictive performance after cross-validation were quantified by the weighted root mean square error (RMSE):

$$\mathrm{RMSE} = \sqrt{\frac{1}{\sum_{i=1}^{m} w_i} \sum_{i=1}^{m} \left( w_i \left( x_{o,i} - x_{s,i} \right)^2 \right)} \tag{7}$$

where $m$ is the number of dip wells, $x_{o,i}$ is the observed WL or WLt of dip well $i$, $x_{s,i}$ is the simulated WL or WLt of dip well $i$, and $w_i$ is the data weight of dip well $i$ (Eq. 6). We refer to the root mean square error of the predicted data of cross validation as RMSEcv. Model performance was further quantified by the Nash–Sutcliffe efficiency (NSE):

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{m} w_i \left( x_{o,i} - x_{s,i} \right)^2}{\sum_{i=1}^{m} w_i \left( x_{o,i} - \bar{x}_o \right)^2} \tag{8}$$

where $\bar{x}_o$ is the mean of all observed WL or WLt. NSE indicates how well observed vs. predicted values match the 1:1 line. It is a good overall indicator of predictive performance because it combines scatter and bias (common offset and/or slope difference from the 1:1 line) (Nash and Sutcliffe, 1970). Values greater than 0 signify a model that is better than the reference model based on the data mean. We refer to the NSE of the training data as NSEcal and of the predicted data of cross validation as NSEcv.

Systematic errors were quantified by calculating the model bias, here defined as

$$\mathrm{BIAS} = \sum_{i=1}^{m} \left( w_i x_{o,i} - w_i x_{s,i} \right) \tag{9}$$
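The three criteria in Eqs. (7) to (9) can be computed together, as in the sketch below. The function name `weighted_scores` is hypothetical. Two points are assumptions rather than statements from the text: the mean in Eq. (8) is taken as the unweighted mean of the observations, and the bias follows Eq. (9) literally, so it equals a weighted mean offset only if the weights sum to one.

```python
import numpy as np


def weighted_scores(obs, sim, w):
    """Weighted RMSE (Eq. 7), NSE (Eq. 8) and bias (Eq. 9)."""
    obs, sim, w = (np.asarray(a, dtype=float) for a in (obs, sim, w))
    sq_err = w * (obs - sim) ** 2
    rmse = np.sqrt(sq_err.sum() / w.sum())
    # Assumption: unweighted mean of the observations as the reference model.
    nse = 1.0 - sq_err.sum() / (w * (obs - obs.mean()) ** 2).sum()
    bias = (w * obs - w * sim).sum()
    return rmse, nse, bias


# Illustrative values only; the weights sum to one.
rmse, nse, bias = weighted_scores(
    obs=[-0.6, -0.4, -0.2, 0.0],
    sim=[-0.5, -0.4, -0.3, -0.1],
    w=[0.25, 0.25, 0.25, 0.25],
)
```

RMSE and NSE are invariant to a common rescaling of the weights, whereas the bias of Eq. (9) scales with their sum.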

2.4 Model uncertainty and stability evaluation

Uncertainty of the model predictions was assessed by bootstrapping, cross-validation and residual analysis.

For the bootstrapping analysis, we followed the procedure of Leathwick et al. (2006). We estimated the confidence intervals around the predictions and the fitted functions by taking 1000 bootstrap samples of the 53 peatlands. The number of peatlands in each sample was equivalent to the data set, but peatlands were selected randomly with replacement. Using the predictor variables of the final model, a BRT model was fitted to each sample. Cross validation was again performed on peatlands; thus, a peatland in the calibration data set was not part of the cross-validation data set, to avoid overly optimistic results. Variances of the predictions and of the fitted functions of the 1000 models were evaluated.
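The resampling step of this procedure, drawing whole peatlands with replacement rather than individual dip wells, might look like the sketch below. The generator name `bootstrap_peatlands` and the peatland labels are hypothetical, and fitting a BRT model to each sample is deliberately left out.

```python
import numpy as np


def bootstrap_peatlands(peatland_id, n_samples=1000, seed=0):
    """Yield row indices of bootstrap samples drawn at the peatland level.

    Each sample contains as many peatlands as the data set, drawn with
    replacement; all dip wells of a drawn peatland are included together.
    """
    peatland_id = np.asarray(peatland_id)
    ids = np.unique(peatland_id)
    rows = {p: np.flatnonzero(peatland_id == p) for p in ids}
    rng = np.random.default_rng(seed)
    for _ in range(n_samples):
        drawn = rng.choice(ids, size=ids.size, replace=True)
        yield np.concatenate([rows[p] for p in drawn])


# Three hypothetical peatlands with 3, 1 and 2 dip wells.
well_peatland = np.array(["A", "A", "A", "B", "C", "C"])
samples = list(bootstrap_peatlands(well_peatland, n_samples=5, seed=1))
```

Resampling at the peatland level keeps spatially correlated dip wells together, which is consistent with performing cross-validation on peatlands.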

If data sets are relatively small (e.g., n < 1000; De’ath, 2007), then the small size of the training and test data sets lowers model accuracy. Given the fairly small number of peatlands in the data set and the partly high spatial correlation of dip wells within these peatlands, we decided not to split the data set into a training and test data set. Estimates of model accuracy can then be based on cross-validation, thereby making effective use of all the data (De’ath, 2007). The prediction uncertainty of the final model is estimated by the root mean square error of prediction (RMSEcv, see above) for each land cover class. After testing for near-normal distribution of the residuals, RMSEcv can be used to derive the 68 and 95 % confidence intervals of the predictions with RMSEcv and 2 × RMSEcv, respectively.
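As a small illustration of the last step, the sketch below turns class-specific RMSEcv values into 68 and 95 % intervals around predictions. The function name `class_confidence_intervals` is hypothetical, and the RMSEcv values in the example are placeholders, not results from this study.

```python
import numpy as np


def class_confidence_intervals(pred, land_cover, rmse_cv):
    """Return 68 % and 95 % intervals as (lower, upper) columns per prediction.

    rmse_cv maps each land cover class to its cross-validation RMSE; the
    intervals assume near-normally distributed residuals within each class.
    """
    pred = np.asarray(pred, dtype=float)
    half = np.array([rmse_cv[c] for c in land_cover], dtype=float)
    return {
        "ci68": np.column_stack([pred - half, pred + half]),
        "ci95": np.column_stack([pred - 2 * half, pred + 2 * half]),
    }


# Placeholder RMSEcv values, for illustration only.
ci = class_confidence_intervals(
    pred=[-0.5, 0.1],
    land_cover=["grassland", "reed"],
    rmse_cv={"grassland": 0.25, "reed": 0.35},
)
```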

Finally, additional residual analysis was performed to evaluate whether the predictions are biased for different land cover classes or geographical regions.

2.5 Regionalization

In the final regionalization step, the predictor variables contributing to the final model were determined at a 25 × 25 m raster for all organic soils in Germany. Predictor variables were determined with the same map input that was used for model building. Land cover information, including information on ditches, was based on the data from the year 2012, and the climatic data were based on the average of the last 30 years. The fine spatial resolution of 25 × 25 m was not chosen to suggest a highly spatially accurate model. Rather, this fairly fine scale was necessary to map the relatively small-scale effects of the topography, land use and peatland geometry variables. The final model was then used to make a prediction for each of these raster cells.

3 Results and discussion

3.1 Spatial correlation structure of the data set

The variogram model fitted to the sample variogram provided a nugget (0.012 m²; 0.11 m), a sill (0.09 m²; 0.3 m) and a range of spatial correlation (2700 m) for our data set of WL (Fig. 4). The nugget represents the very small-scale soil hydraulic variability and micro-topography effects on WL (van der Ploeg et al., 2012) and measurement error, e.g., by differences in the determination of the ground surface and in the timing of the manual measurements. Furthermore, micro-topography (e.g., hummocks) and oscillating peat surfaces of wet peatlands pose a challenge for an accurate determination of both ground surface and water level. The water level time series in the data set were of different lengths and ranged from 1 to 20 years. Interannual variability of water levels can be large (e.g., Knotters and van Walsum, 1997). For simplicity, in our analysis, data were not harmonized by extrapolating WL time series using weather data to a 30-year period. Thus, the nugget also includes errors that are introduced by dip wells with different measurement periods that are located in the range of spatial correlation. In consideration of these error sources, the fitted nugget of 0.11 m appears to be a realistic value. At 0.3 m, the fitted sill matched nearly perfectly with the standard deviation of the data (0.31 m), which indicates consistency between semivariogram model and data set. The fitted range of spatial correlation of 2700 m reflects both physical effects, i.e., the average range of lateral flow due to hydraulic gradients, as well as the effect of average land-use patterns in Germany on the spatial correlation of WL. Fitted values were used in the calculation of the dip-well-specific weights using Eq. (6).

3.2 Typical water levels for land-use types in German organic soils

The land cover classes are characterized by plausible mean and median water levels, which show consistent differences between each other (Table 2 and Fig. 5a). The mean values of arable land and grassland agree with what can be expected for their agronomic requirements, with slightly lower water levels for arable land. The high variability observed for both classes may be related to the variability of the efficiency of installed drainage systems, for example the presence and condition of tile drains and the depth of ditches. Grasslands can be managed with very variable intensity, which is partly reflected in different water levels. Figure 5a further shows that deciduous forests seem to dominate slightly drier organic soils compared to coniferous forests, which dominate under wetter conditions. A high variability of water levels is observed for the land cover class unused peatland. On the one hand, post peat-cutting topography increases the variability of WL over short distances. It probably contributes to the high variance observed for this class. On the other hand, this class comprises both rather dry unused peatlands and wetter peatlands in which re-wetting measures already took place, but which do not yet show a wet soil attribute in the ATKIS Digital Landscape Model. This may also cause part of the variance observed in the grassland and forest land cover classes. All wet land cover classes (reed, wet grassland, wet forest and wet unused peatland) that were separated by wetness indication clearly show higher water levels, showing that the wetness attribute of the Digital Landscape Model is a useful attribute.

Figure 5b shows the transformed water level for all classes. It can be observed that the variances of the wetter land cover classes increase relative to the variances of the dry land cover classes. This is due to the highest sensitivity of GHG emissions in the wet range of water levels (> −0.5 m). Consequently, the rather high variance of WL for arable land corresponds to a rather low variance of WLt, i.e., to a rather low assumed effect of WL variability on the GHG budget.

3.3 BRT model calibration and validation: WL vs. WLt

In contrast to land cover class, the other predictor variables showed, if at all, only weak relations to WL and WLt when evaluating them with box plots, 2-D cross plots and simple correlation matrices. Here, we expected BRT to detect the strongest predictor interactions and to identify the most informative predictors.

Figure 5. Water level relative to ground surface WL (m) and transformed water level WLt (−) by land cover class, illustrated as a weighted box plot. WLt = −1 corresponds to maximum CO2 emissions and WLt = 1 to maximum CH4 emissions. In the top horizontal axes, the number of dip wells in each class is indicated.

Figure 6. NSEcv as a function of the number of predictor variables used in the model of WLt during model simplification, shown for the last 50 parameter drops.

After model calibration with all predictors, subsequent model simplification successively dropped those parameters with correlation > 0.7 and the lowest contribution. For both WL and WLt, model performance improved during this simplification. For WLt, the highest values of NSEcv of approximately 0.46 were achieved with 21 to 9 model parameters. The development of NSEcv for the last 50 parameters is shown in Fig. 6. Further elimination of parameters led to a pronounced decline of model performance. Similar behavior was observed for the calibration on WL. In favor of a more parsimonious model, we chose the model with the lowest number of parameters before the pronounced decline of model performance occurred. For the calibration on WLt, this corresponded to the model with the lowest number of parameters that still achieved NSEcv values of > 0.45 (Fig. 6). The final WLt model comprised nine predictor variables and the final WL model seven parameters. The percentages of parameter contributions to the final model and their individual influences are discussed for WLt in Sect. 3.4.
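One possible reading of a single simplification step is sketched below. The selection rule is an assumption made for illustration: among predictors that are correlated above the threshold with at least one other predictor, the one with the lowest contribution is dropped; if no such pair remains, the predictor with the lowest contribution overall is dropped. The function name `next_predictor_to_drop` and the example values are hypothetical, and in practice the model would be refitted and the contributions recomputed after every drop.

```python
import numpy as np


def next_predictor_to_drop(names, contribution, corr, threshold=0.7):
    """Pick the next predictor to remove in one simplification step."""
    contribution = np.asarray(contribution, dtype=float)
    high = np.abs(np.asarray(corr, dtype=float)) > threshold
    np.fill_diagonal(high, False)  # ignore self-correlation
    candidates = np.flatnonzero(high.any(axis=1))
    if candidates.size == 0:
        candidates = np.arange(len(names))
    return names[candidates[np.argmin(contribution[candidates])]]


# Hypothetical predictors: p1 and p2 are strongly correlated.
names = ["p1", "p2", "p3", "p4"]
contribution = [25.0, 10.0, 16.0, 8.0]
corr = [[1.0, 0.8, 0.1, 0.2],
        [0.8, 1.0, 0.0, 0.3],
        [0.1, 0.0, 1.0, 0.1],
        [0.2, 0.3, 0.1, 1.0]]
dropped = next_predictor_to_drop(names, contribution, corr)
```

Here `p2` is selected because it is the weaker member of the only highly correlated pair, even though `p4` has the lowest contribution overall.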

Table 2. Weighted mean and standard deviation of WL and WLt data and of the WLt map presented in Sect. 3.6 for the nine land cover classes.

| Land cover class | WL (m), mean ± sd | WLt (−), mean ± sd | WLt (−), map mean ± sd |
|---|---|---|---|
| Arable land | −0.69 ± 0.30 | −0.76 ± 0.17 | −0.66 ± 0.22 |
| Deciduous f. | −0.45 ± 0.34 | −0.49 ± 0.37 | −0.47 ± 0.35 |
| Grassland | −0.44 ± 0.29 | −0.52 ± 0.32 | −0.49 ± 0.30 |
| Unused peatl. | −0.39 ± 0.36 | −0.39 ± 0.41 | −0.37 ± 0.40 |
| Coniferous f. | −0.36 ± 0.36 | −0.37 ± 0.37 | −0.46 ± 0.35 |
| Wet unused peatl. | −0.22 ± 0.27 | −0.18 ± 0.40 | −0.17 ± 0.36 |
| Wet forest | −0.22 ± 0.29 | −0.17 ± 0.43 | −0.21 ± 0.39 |
| Wet grassland | −0.10 ± 0.14 | −0.00 ± 0.31 | −0.15 ± 0.39 |
| Reed | −0.01 ± 0.17 | 0.20 ± 0.29 | −0.06 ± 0.32 |

Table 3. Performance criteria of the different models; dry range defined as WL < −0.3 m and wet range as WL > −0.3 m.

| Criterion | WL (m) (calibrated on WL) | WLt (−) (calibrated on WL) | WLt (−) (calibrated on WLt) |
|---|---|---|---|
| NSEcal | 0.627 | 0.559 | 0.642 |
| NSEcv | 0.381 | 0.397 | 0.453 |
| RMSEcv | 0.269 | 0.299 | 0.284 |
| RMSEcv,dry | 0.284 | 0.263 | 0.259 |
| RMSEcv,wet | 0.222 | 0.382 | 0.355 |
| Bias | −0.003 | 0.083 | 0.002 |
| Bias,dry | −0.012 | 0.070 | 0.003 |
| Bias,wet | 0.021 | 0.120 | 0.000 |

Table 3 summarizes the statistical performances of the models calibrated on WL and WLt. For both models, NSEcal is considerably higher than NSEcv and shows the commonly observed overfitting behavior of BRT models. The different measures that we conducted to minimize overfitting (cross-validation on peatlands, restriction to monotonic responses, and model simplification including elimination of highly correlated variables) lowered the difference between NSEcal and NSEcv, but could not totally avoid overfitting. NSEcv of the WLt model (0.453) indicates higher predictive model performance compared to the WL model (0.381). However, as the data ranges differ due to the transformation, this comparison may be misleading. Therefore, we transformed the predictions of the WL model to obtain WLt values from this model and equally calculated the performance criteria (Table 3, second column). Then, NSEcv is slightly increased (0.397), but does not achieve the values of the model that was calibrated on WLt. A better predictive model performance of the model calibrated on WLt is also visible for the RMSEcv values. The total RMSEcv as well as the RMSEcv values for the dry (WL < −0.3 m) and wet range (WL > −0.3 m) show slightly lower values for the WLt model compared to WLt values from the model calibrated on WL. Given our hypothetical transfer function (Fig. 3), in which the GHG budget is linearly related to WLt, the higher accuracy of WLt predictions directly corresponds to a higher accuracy of GHG budget predictions.

Superior model performance is also evident when evaluating model bias. Only when calibrating directly on WLt are the WLt predictions bias-free. Calibration on WL and subsequent transformation to WLt introduces a model bias towards systematically lower WLt values. In subsequent applications to GHG emission upscaling, lower WLt values would lead to an overestimation of CO2 emissions and to an underestimation of CH4 emissions.

3.4 Influence of predictor variables on WLt

Given the beneficial characteristics of the model calibrated on WLt for GHG upscaling, presentation and discussion of further model results is restricted to the WLt model.

The BRT method allows the analysis of the parameter contributions to and influences on the model (Elith et al., 2008), and thus may contribute to system understanding. The percentages of the contributions of the nine predictor variables to the final model ranged from 25.2 to 5.6 % (Fig. 7). Except for protection status, at least one parameter of each of the seven parameter groups contributed to the final model. All protection status information was dropped early during the simplification process due to low contribution, although WL showed slightly higher values for data from nature protection areas or special areas of conservation. However, other parameters seem to be able to fully compensate for the information that is lost by dropping this predictor.

Land cover class lc at the dip well was the parameter with the strongest contribution (25.2 %). It basically follows the trend illustrated in Fig. 5b. The bootstrap error, plotted as standard deviation (Fig. 7), shows the variation of this influence over the 1000 bootstrap models. A second land cover parameter, the fraction of dry land cover classes on organic soils in a buffer of 2500 m radius, fdry (2500), contributed 10.3 % to the model. The monotonic decrease of WLt with increasing fdry (2500) is plausible, as higher values reflect intensive land use in the surroundings of the dip well and thus indicate intensive artificial drainage. Together, both parameters contributed 35.5 %, and thus land cover represents the parameter group with the strongest model contribution.

Peatland characteristics are the second most important parameter group. The peatland type contributed 16 %. The model indicates that peatlands without any connection to surface water bodies (river or lake) and the class of other organic soils are characterized by lower WLt compared to the peatland types lowland bog, upland bog and fen neighboring surface water. As the class of other organic soils is generally expected to reflect lower water levels, and as surface water may have a stabilization effect on water levels of organic soils, the influence of the peatland type can be considered plausible. Besides peatland type, the substrate of the peat base contributes 5.6 %. Here, organic soils overlying peat clay layers (e.g., limnic sediments such as calcareous gyttja) or basement rock are characterized by higher WLt compared to organic soils overlying unconsolidated rock. This can be explained by the lower drainage resistance of unconsolidated rocks, which may cause an increased efficiency of anthropogenic drainage and/or a generally higher vulnerability to seepage losses. Finally, slightly lower WLt values are indicated by a high fraction of organic soils for the 500 m buffer, fpeat(500). This may reflect the higher land-use pressure on large peatlands compared to rather small peatlands, which tend to be more easily preserved by nature protection efforts.

The remaining four parameter groups are represented in the model by only one parameter each. The third most influential parameter was the length of ditches on arable land and grassland for the 250 m buffer, dilendry (250). At first glance, it may be surprising that, with increasing ditch density, WLt values tend to be higher, as ditches are supposed to drain the water when land is used as arable land and grassland. The fact that the model identifies a rather strong effect in the opposite direction may be caused by incomplete information about the drainage network. There is no detailed information about the spatial distribution of tile drains. Based on expert knowledge, agricultural areas with a lower ditch density are more likely to have tile drains. As these drains, easily installed with a narrow drain spacing, are more effective at draining organic soils, low WLt values for arable land and grassland may be related to low ditch densities. Furthermore, ditches were originally dug at narrow spacing in especially wet areas of organic soils, but there is no information available on whether these ditches still function properly.

The parameters wbsummer, hrel and tiras25all show expected trends. The model predicts higher WLt for increasing climatic water balance in the summer period (May to October), wbsummer, for dip wells located in depressions (low values of hrel), and for higher small-scale topographic wetness indices calculated on the 25 × 25 m digital elevation model (tiras25).

Figure 7. Partial dependence plots for the predictor variables. For an explanation of variables, see Table 1. The y axes are on the WLt scale and are centered around the mean WLt. Error bars and grey area indicate the standard deviation of the response over 1000 bootstrap models. The relative contribution of each predictor is indicated as a percentage. The lines along the x axes of each plot show the distribution of data across that variable in deciles.

The fact that all parameters show expected or explainable responses in the model corroborates the reliability of the calibrated WLt model. The standard deviation of the predictor responses, based on the bootstrap samples, shows the stability of the observed responses.

Further insight into model behavior can be obtained by analyzing parameter interactions. This is done by changing two parameters simultaneously while keeping mean values for all other parameters (Elith et al., 2008). Figure 8 shows the two strongest parameter interactions. Parameter wbsummer strongly interacts with ptype. The generally lower values of WLt of fens without surface water connection and other organic soils show a stronger dependency on the summer climatic water balance. While a summer climatic water balance of > −80 mm shows a rather weak effect on WLt for the wetter peatland types, the two drier peatland types still show a strong effect with increasing wbsummer. The trend for wbsummer > 130 mm for the dry peatland types is supported by seven different peatlands.

Figure 8. Partial dependence plots representing the two strongest interactions in the model: (a) between ptype and wbsummer and (b) between pbase and fdry. Fitted WLt is plotted on the y axis, which is obtained after accounting for the average effect of all other predictor variables.

Another strong interaction is observed for pbase and fdry (2500). While a rather weak effect of the fraction of arable land and grassland is observed for organic soils overlying basement rock and peat clay layer, a strong effect is observed for organic soils overlying unconsolidated rock. This interaction reflects the higher lateral range of drainage effects for organic soils with little flow resistance at the peat base. In these organic soils, intensive land use lowers the water level over large areas.

3.5 Discussion of model uncertainty

Plotting observed vs. predicted WLt from cross-validation (Fig. 9) illustrates the rather large residual variance that cannot be explained by the model. As indicated by the higher RMSEcv for the wet range (Table 3), scatter increases with increasing WLt. Error bars in the y direction indicate data error derived from the nugget of the variogram. It is shown for a few data points as an example. Due to transformation, data error increases for higher WLt. Figure 9 demonstrates that the fraction of unexplainable variance related to data error is much higher for the wet than for the dry range. Bootstrap error, indicating the variation of the model predictions for 1000 bootstrap samples, is shown in the x direction for the same data points. Bootstrap error is lower than the data error for the wet range and slightly higher for the dry range.

Figure 9. Observed vs. predicted transformed annual mean water level (WLt) from cross-validation results. Error bars show selected data and bootstrap model errors as standard deviation. Data points are scaled by their weights.

Bootstrap errors demonstrate the sensitivity of model predictions to changes of the data set used for calibration. When a model possesses structural deficits, such as missing predictor variables, bootstrap errors should not be used to define confidence intervals for the model predictions. Figure 10 shows residuals from cross-validation and the standard deviation of bootstrap predictions for all land cover classes. The residuals of each land cover class show near-normal distributions. For five of the nine land cover classes (wet forest, wet unused peatland, arable land, coniferous forest and reed), the Shapiro–Wilk test does not reject normality (p > 0.05). Figure 10a further indicates that the residuals of each land cover class scatter fairly well around zero, indicating low bias for the various land cover classes. Land-cover-class-specific confidence intervals of model predictions can thus be derived from the RMSEcv of each land cover class, e.g., 2 × RMSEcv representing the 95 % confidence interval.

The prediction uncertainty derived from cross-validation is much higher than the bootstrap prediction uncertainty obtained from the bootstrap standard deviation (sd), with 2 × sd corresponding to the 95 % confidence interval (Fig. 10). The large difference between these values indicates that the model has structural deficits that can be attributed to several error sources:

Figure 10. (a) Residuals (observation − prediction) of WLt predictions and (b) standard deviation (sd) of bootstrap predictions, shown for the nine land cover classes. In the upper part, the number of dip wells in each class is indicated.

i. Key influences on WLt are missing in the set of predictor variables. None of the predictor variables indicate whether and to what extent water levels have increased due to re-wetting measures in recent years. Wetness indicators (wet soil and/or vegetation attributes) that are obtained from the Digital Landscape Model probably react with a delay of several years. Thus, we expect the occurrence of several observed high WLt values that cannot be explained by any of the predictor variables.

ii Small-scale topography that is not represented with suf-ficient detail and accuracy in the DEM may cause sev-eral predictions to strongly differ from what would beexpected from the other predictor variables A commonexample may be a dip well that is located on a narrowpeat ridge which remained after peat-cutting and is ab-sent in the DEM and that is situated in an area classi-fied as wet soil by the Digital Landscape Model Thenthe model indicates a WLt that is much higher than theobserved WLt as for the observed value the referencesurface was the surface of the peat ridge

iii Consistent information about tile drains is missing andonly exists on the regional scale (Tetzlaff et al 2009)At the national scale however there are no maps on tiledrains Tile drains are known to have a strong effect onWLt for arable land and grassland As explained abovewe expect parameter dilendry (250) to partially compen-sate for this missing information

iv Another source of prediction uncertainty may compriseinconsistent and erroneous land cover classification ofthe Digital Landscape Model due to the high degree ofsubjectivity for many of the attributes Furthermore thetemporal accuracy of the Digital Landscape Model maybe as inaccurate as 5 years which can cause time serieswith land-use change to be split at the wrong date andvegetation and wetness attributes to be not yet updatedto the current conditions

Figure 11 Residuals (observationndashprediction) of WLt predictionsfor the three major geographical peatland regions of Germany Inthe upper part the number of dip wells in each class is indicated

v The water balance of fens strongly depends on the sizeand the hydraulic head of the groundwater catchmentie of the aquifer underlying the peat layer Unfortu-nately there is no consistent map of hydraulic heads orgroundwater catchments for all Germany

We checked model predictions for geographical bias. Geographical location was not one of the model parameters. However, the history and policy of land use on organic soils, current ditch water management and climate do show large-scale geographical trends. We divided our data set into the three major German peatland regions (NE, NW and S) and evaluated the model residuals (Fig. 11) to see whether our model is biased due to important missing geographical effects. A serious bias for any of the three major German peatland regions cannot be identified.


Figure 12. Map of predictions of transformed annual mean water level (WLt) for all German organic soils (a) and an enlarged map section (b). Probability distribution in (c) indicates the uncertainty of a specific point prediction for wet grassland as an example. Here, the predicted value is approximately WLt = 0, but note that wet grassland predictions do vary in space depending on the values of the other model parameters. The histogram shows the residuals from cross-validation for wet grassland, to which the probability distribution was fitted.

When applying calibrated statistical models during regionalization, it is important to check model behavior for extrapolation outside the range of the parameter space that is covered by the data upon which the model was built. BRT always extrapolates at a constant value from the most extreme environmental value in the training data. In contrast to other types of statistical models, e.g., generalized linear models, BRT does not continue the fitted trend beyond the last observation. Regarding the categorical variables, the data set covers all classes occurring in Germany with several peatlands. The data set also covers the major range of values occurring in Germany for the numerical predictor variables. Furthermore, Fig. 7 indicates that the constant values at which the model extrapolates the influence of the variables do not raise major concern for any extreme predictions outside the parameter range.
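The constant extrapolation of tree-based models can be illustrated with a minimal Python sketch. It is not the BRT implementation used in this study: a single one-dimensional regression tree stands in for the boosted ensemble, a least-squares line stands in for a trend-continuing model, and the helper names (`fit_tree`, `fit_line`) and the toy data are assumptions made for illustration only.

```python
def _sse(points):
    """Sum of squared deviations of y from its mean."""
    mean = sum(y for _, y in points) / len(points)
    return sum((y - mean) ** 2 for _, y in points)


def fit_tree(xs, ys, depth=3):
    """Greedy least-squares regression tree on one predictor (illustrative helper)."""
    def build(points, d):
        mean = sum(y for _, y in points) / len(points)
        if d == 0 or len(points) < 2:
            return mean  # leaf: constant prediction
        best = None
        for k in range(1, len(points)):
            if points[k - 1][0] == points[k][0]:
                continue  # cannot split between identical x values
            cost = _sse(points[:k]) + _sse(points[k:])
            if best is None or cost < best[0]:
                best = (cost, k)
        if best is None:
            return mean
        k = best[1]
        threshold = 0.5 * (points[k - 1][0] + points[k][0])
        return (threshold, build(points[:k], d - 1), build(points[k:], d - 1))

    tree = build(sorted(zip(xs, ys)), depth)

    def predict(x):
        node = tree
        while isinstance(node, tuple):
            threshold, left, right = node
            node = left if x < threshold else right
        return node

    return predict


def fit_line(xs, ys):
    """Ordinary least-squares line, standing in for a trend-continuing model."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return lambda x: my + slope * (x - mx)


# Toy training data covering the predictor range 0..10 (hypothetical values).
xs = [float(x) for x in range(11)]
ys = [2.0 * x for x in xs]
tree, line = fit_tree(xs, ys), fit_line(xs, ys)

# Beyond the training range the tree repeats its outermost leaf value,
# whereas the line continues the fitted trend.
tree_inside, tree_outside = tree(10.0), tree(20.0)
line_outside = line(20.0)
```

In this sketch the tree returns the same value at x = 20 as at the training maximum x = 10, while the line keeps increasing. The same mechanism is why checking the plateau values of the fitted functions is sufficient to judge extrapolation behavior.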

3.6 Regionalization

The map of WLt resulting from the application of the fitted WLt model to all grid cells shows gradients at the regional scale (Fig. 12a). In the south of Germany, for example, a gradient from wet to dry can be observed for the pre-alpine upland bogs and the peatlands of the moraine plain. In the north of Germany, the map indicates that organic soils in the very NE are wetter than the rest. For the rest of the north, a slight gradient can be observed from less dry to dry from NW to E, which is mainly driven by the higher summer climatic water balance in the NW. As both categorical and numerical predictor variables do also vary at the sub-regional scale, the resulting map also shows gradients within peatland areas, e.g., due to small-scale land-use, ditch density gradients and topography effects (Fig. 12b).

We calculated WLt averages of the land cover classes using the regionalized WLt from the map (Table 2, column 3). The given standard deviation comprises both the variability within a land cover class that is explained by the model as well as the uncertainty of each prediction. Resulting means and standard deviations slightly differ from the corresponding values of the data set. The land-cover-specific WLt values obtained from the map can be considered as being more representative, as the regionalization procedure is supposed to partly account for potential bias in the data set.

When applying this map and its predicted WLt values in subsequent GHG upscaling, it is crucial that model uncertainty is propagated properly. An example demonstrates the necessity of uncertainty propagation. For a grid cell classified as wet grassland, the probability distribution of WLt is shown based on a normal distribution that was fitted to the residuals of this land cover class (Fig. 12c). Without propagating the uncertainty, and when only translating the predicted WLt (possibly in combination with other parameters, e.g., soil properties) into a GHG budget, the GHG budget is strongly underestimated, as the WLt prediction is close to zero, indicating neither large CO2 nor CH4 emissions. When translating the full distribution of WLt into a GHG budget, the resulting GHG budget would be much higher, as the GHG budget increases on both sides of the predicted WLt.
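The effect can be reproduced with a small numerical sketch in Python. The response function `ghg_response` (the absolute value of WLt, in arbitrary units) and the standard deviation of 0.3 are purely hypothetical stand-ins for a GHG transfer function and a fitted residual distribution; only the qualitative behavior, a budget that increases on both sides of WLt = 0, is taken from the text.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def ghg_response(wl_t):
    """Hypothetical GHG response: zero at WLt = 0, increasing on both sides."""
    return abs(wl_t)


def expected_ghg(mu, sigma, response, n=20000):
    """Midpoint-rule integral of response(x) * pdf(x) over mu +/- 6 sigma."""
    lo, hi = mu - 6.0 * sigma, mu + 6.0 * sigma
    step = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * step
        total += response(x) * normal_pdf(x, mu, sigma) * step
    return total


predicted_wl_t = 0.0   # point prediction close to zero, as in the wet grassland example
sigma = 0.3            # assumed prediction uncertainty (hypothetical value)

point_budget = ghg_response(predicted_wl_t)                          # ignores uncertainty
propagated_budget = expected_ghg(predicted_wl_t, sigma, ghg_response)  # uses full distribution
```

With these assumptions the point estimate is zero, whereas the propagated budget equals the mean absolute deviation of the normal distribution, sigma × sqrt(2/π), which is clearly positive.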

3.7 Possible paths for model improvement

The model performance that is achieved by the statistical approach presented in our study raises the question whether collecting more WL data can improve model performance or whether the factor that is constraining the model performance is the limited strength of the nation-wide available predictor variables. To assess this question, additional “hold-out models” were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated and model performance was evaluated with NSEcv. As expected, results indicate an increase of model performance with increasing number of peatlands used in the model building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when including many more peatlands into the data set. More promising would be the specific collection of more data on the weakly represented and/or important land cover classes arable land and grassland.

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nation-wide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that has been successfully conducted in Finke et al. (2004) using the transfer function-noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, e.g., spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± sd.

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data covers only a small fraction of peatlands of the final map and thus spatial interpolation is not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG-relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. With the analysis of their effect on the water level, it was possible to gain insight into natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions depend linearly. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions, and avoided model bias.

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show, however, that model improvement by increasing the number of WLt data seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively few data are available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, for example water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach is suited to application to any other country where similar data of target and predictor variables are available. It is important that the spatial resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if for specific regions detailed information on peat properties becomes available and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), Landkreis Gifhorn, NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support; Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support; Annette Freibauer for reviewing the paper draft; and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project “Organic soils” funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43, 5, 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirby, M.: A physically based variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De’ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz – Moornutzungsstrategien 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.


Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package “dismo”, version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada’s forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8, 5, 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models, part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package “gbm”, version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.

Tetzlaff, B., Kuhr, P., and Wendland, F.: A New Method for Creating Maps of Artificially Drained Areas in Large River Basins Based on Aerial Photographs and Geodata, Irrig. Drain., 58, 569–585, doi:10.1002/Ird.426, 2009.

Thompson, J. R., Gavin, H., Refsgaard, A., Sorenson, H. R., and Gowing, D. J.: Modelling the hydrological impacts of climate change on UK lowland wet grassland, Wetl. Ecol. Manage., 17, 503–523, doi:10.1007/s11273-008-9127-1, 2009.

UBA: National Inventory Report for the German Greenhouse Gas Inventory 1990–2008, Submission under the United Nations Framework Convention on Climate Change and the Kyoto Protocol 2012, Dessau, Germany, 2012.

van den Akker, J. J. H., Jansen, P. C., Hendriks, R. F. A., Hoving, I., and Pleijter, M.: Submerged infiltration to halve subsidence and GHG emissions of agricultural peat soils, Proceedings of the 14th International Peat Congress, Stockholm, Sweden, 2012.

van der Gaast, J. W. J., Massop, H. T. L., and Vroon, H. R. J.: Actuele grondwaterstandsituatie in natuurgebieden: Een Pilotstudie, Wettelijke Onderzoekstaken Natuur & Milieu, WOt-rapport 94, Wageningen, 134 pp., 2009.


van der Ploeg, M. J., Appels, W. M., Cirkel, D. G., Oosterwoud, M. R., Witte, J. P. M., and van der Zee, S. E. A. T. M.: Microtopography as a Driving Mechanism for Ecohydrological Processes in Shallow Groundwater Systems, Vadose Zone J., 11, 52–62, doi:10.2136/Vzj2011.0098, 2012.

Wackernagel, H.: Multivariate Geostatistics, Springer, Berlin, Germany, 387 pp., 2003.


autocorrelation, the average squared difference equals the sill. Accordingly, the autocorrelated fraction $f$ of the average squared difference between two dip wells $i$ and $j$ is

$$ f_{ij} = \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}}. \tag{4} $$

We now define the “statistical” group size $n$ of each dip well $i$ to be the sum of one plus the autocorrelated fractions $f_{ij}$ of all dip wells that are within the range of spatial autocorrelation of $i$:

$$ n_i = 1 + \sum_{j=1}^{m} \frac{\mathrm{sill} - \gamma_{ij}}{\mathrm{sill} - \mathrm{nugget}}. \tag{5} $$

According to the discussion above, dip-well-specific weights can then be calculated with

$$ w_i = \frac{1}{n_i\,\mathrm{SE}_i} = \frac{1}{\sigma_{e,i}\sqrt{n_i}}, \tag{6} $$

where $n_i$ is derived from Eq. (5). The equation shows that with increasing “statistical” group size $n$, i.e., with increasing spatial data density, an individual dip well is “down-weighted” to some degree, a behavior that corresponds to our initial intention to lower the influence of small peatlands compared to large ones. The error standard deviation $\sigma_e$ is dependent on several factors, e.g., the length of the time series, the temporal measurement density and the microtopography around the dip well. For simplicity, we here assumed $\sigma_e$ to be uniform for all dip wells, which simplifies Eq. (6) to

$$ w_i = \frac{1}{\sqrt{n_i}}. $$

Only dip wells with the same land-use type were summed up with Eq. (5), which avoids the down-weighting by dip wells that have different land-use types. The latter are mostly characterized by fairly different WLt and thus by rather low spatial autocorrelation to dip well $i$.

After spatial correlation had been accounted for, the sum of the weights of all dip wells of each land-use type was adjusted so that it corresponds to the fraction of this land-use type in Germany. This adjustment accounts for the overrepresentation in the data set of dip wells in unused peatlands and the underrepresentation of dip wells in arable land.
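The weighting scheme of Eqs. (4)–(6) and the subsequent land-use adjustment can be sketched in Python as follows. This is a minimal illustration, not the code used in the study. The spherical variogram model is an assumption (the model type is not stated in this section), the default nugget, sill and range are the fitted values reported in Sect. 3.1, and the coordinates, land-use labels and area fractions are hypothetical.

```python
import math


def spherical_variogram(h, nugget, sill, rng):
    """Semivariance between two distinct wells at lag h (assumed spherical model)."""
    if h >= rng:
        return sill
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


def dip_well_weights(coords, land_use, nugget=0.012, sill=0.09, rng=2700.0):
    """Weights w_i = 1 / sqrt(n_i), i.e. Eq. (6) with uniform sigma_e."""
    weights = []
    for i, (xi, yi) in enumerate(coords):
        n_i = 1.0
        for j, (xj, yj) in enumerate(coords):
            # Only other wells of the same land-use type contribute to n_i.
            if i == j or land_use[i] != land_use[j]:
                continue
            h = math.hypot(xi - xj, yi - yj)
            if h < rng:
                gamma_ij = spherical_variogram(h, nugget, sill, rng)
                n_i += (sill - gamma_ij) / (sill - nugget)  # Eq. (4), summed as in Eq. (5)
        weights.append(1.0 / math.sqrt(n_i))
    return weights


def rescale_to_land_use_fractions(weights, land_use, fractions):
    """Scale weights so that each land-use type sums to its area fraction."""
    totals = {}
    for w, lu in zip(weights, land_use):
        totals[lu] = totals.get(lu, 0.0) + w
    return [w * fractions[lu] / totals[lu] for w, lu in zip(weights, land_use)]


# Hypothetical example: coordinates in metres, made-up land-use fractions.
coords = [(0.0, 0.0), (100.0, 0.0), (5000.0, 0.0), (50.0, 0.0)]
land_use = ["grassland", "grassland", "grassland", "arable"]
weights = dip_well_weights(coords, land_use)
scaled = rescale_to_land_use_fractions(weights, land_use,
                                       {"grassland": 0.8, "arable": 0.2})
```

In this example the two neighbouring grassland wells share one "statistical" group and receive weights below one, whereas the isolated grassland well and the arable well, which has no neighbour of the same land-use type, keep a weight of one before rescaling.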

2.3.3 Model performance criteria

Model fit and predictive performance after cross-validation were quantified by the weighted root mean square error:

$$ \mathrm{RMSE} = \sqrt{\frac{1}{\sum_{i=1}^{m} w_i} \sum_{i=1}^{m} \left( w_i \left( x_{o,i} - x_{s,i} \right)^2 \right)}, \tag{7} $$

where $m$ is the number of dip wells, $x_{o,i}$ is observed WL or WLt of dip well $i$, $x_{s,i}$ is simulated WL or WLt of dip well $i$, and $w_i$ is the data weight of dip well $i$ (see Eq. 6). We refer to the root mean square error of the predicted data of cross validation as RMSEcv. Model performance was further quantified by Nash–Sutcliffe efficiency (NSE):

$$ \mathrm{NSE} = 1 - \frac{\sum_{i=1}^{m} w_i \left( x_{o,i} - x_{s,i} \right)^2}{\sum_{i=1}^{m} w_i \left( x_{o,i} - \bar{x}_o \right)^2}, \tag{8} $$

where $\bar{x}_o$ is the mean of all observed WL or WLt. It indicates how well observed vs. predicted values match the 1 : 1 line. NSE is a good overall indicator of predictive performance because it combines scatter and bias (common offset and/or slope difference from 1 : 1 line) (Nash and Sutcliffe, 1970). Values greater than 0 signify a model that is better than the reference model based on the data mean. We refer to the NSE of the training data as NSEcal and of the predicted data of cross validation as NSEcv.

Systematic errors were quantified by calculating the model bias, here defined as

$$ \mathrm{BIAS} = \sum_{i=1}^{m} \left( w_i\, x_{o,i} - w_i\, x_{s,i} \right). \tag{9} $$
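The three criteria of Eqs. (7)–(9) translate directly into code. The following Python sketch is illustrative only. The function names and example values are hypothetical, and the use of the weighted mean of the observations as reference in the NSE is an assumption, since the text only refers to "the mean of all observed" values.

```python
import math


def weighted_rmse(obs, sim, w):
    """Weighted root mean square error, Eq. (7)."""
    return math.sqrt(sum(wi * (o - s) ** 2 for o, s, wi in zip(obs, sim, w)) / sum(w))


def weighted_nse(obs, sim, w):
    """Weighted Nash-Sutcliffe efficiency, Eq. (8)."""
    # Assumption: the reference mean is the weighted mean of the observations.
    mean_obs = sum(wi * o for o, wi in zip(obs, w)) / sum(w)
    num = sum(wi * (o - s) ** 2 for o, s, wi in zip(obs, sim, w))
    den = sum(wi * (o - mean_obs) ** 2 for o, wi in zip(obs, w))
    return 1.0 - num / den


def weighted_bias(obs, sim, w):
    """Weighted model bias, Eq. (9)."""
    return sum(wi * o - wi * s for o, s, wi in zip(obs, sim, w))


# Hypothetical observed WLt values and dip-well weights.
obs = [-0.6, -0.2, 0.1, 0.4]
w = [1.0, 0.5, 0.5, 1.0]

# A perfect model and a "data mean" reference model as the two limiting cases.
perfect = list(obs)
mean_obs = sum(wi * o for o, wi in zip(obs, w)) / sum(w)
reference = [mean_obs] * len(obs)

nse_perfect = weighted_nse(obs, perfect, w)
nse_reference = weighted_nse(obs, reference, w)
```

A perfect model yields NSE = 1 and RMSE = 0, whereas the reference model that always predicts the data mean yields NSE = 0, which is the threshold mentioned in the text.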

2.4 Model uncertainty and stability evaluation

Uncertainty of the model predictions was assessed by bootstrapping, cross-validation and residual analysis.

For the bootstrapping analysis, we followed the procedure of Leathwick et al. (2006). We estimated the confidence intervals around the predictions and the fitted functions by taking 1000 bootstrap samples of the 53 peatlands. The number of peatlands in each sample was equivalent to the data set, but peatlands were selected randomly with replacement. Using the predictor variables of the final model, a BRT model was fitted to each sample. Cross validation was again performed on peatlands; thus, a peatland in the calibration data set was not part of the cross-validation data set, to avoid overly optimistic results. Variances of the predictions and of the fitted functions of the 1000 models were evaluated.
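The resampling step, in which whole peatlands rather than individual dip wells are drawn with replacement, can be sketched in Python as follows. This is a simplified illustration: the mean WLt of the resampled wells stands in for fitting and predicting with a BRT model, and the function name, peatland labels and values are hypothetical.

```python
import random
import statistics


def bootstrap_peatlands(wells, n_boot=1000, seed=0):
    """Standard deviation of a statistic over bootstrap samples of peatlands.

    wells: list of (peatland_id, wl_t) tuples.
    """
    rnd = random.Random(seed)
    by_peatland = {}
    for pid, value in wells:
        by_peatland.setdefault(pid, []).append(value)
    ids = sorted(by_peatland)

    estimates = []
    for _ in range(n_boot):
        # Same number of peatlands as in the data set, drawn with replacement.
        sample_ids = [rnd.choice(ids) for _ in ids]
        values = [v for pid in sample_ids for v in by_peatland[pid]]
        # Stand-in for "fit a BRT model to the sample and predict".
        estimates.append(statistics.fmean(values))
    return statistics.stdev(estimates)


# Hypothetical data: three peatlands with one to two dip wells each.
wells = [("A", -0.8), ("A", -0.6), ("B", 0.1), ("B", 0.3), ("C", -0.2)]
boot_sd = bootstrap_peatlands(wells, n_boot=200, seed=1)
```

Resampling at the peatland level keeps all dip wells of a peatland together, so that the spatial correlation within peatlands does not lead to overly narrow bootstrap distributions.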

If data sets are relatively small (e.g., n < 1000; De’ath, 2007), then the small size of the training and test data sets lowers model accuracy. Given the fairly small number of peatlands in the data set and the partly high spatial correlation of dip wells within these peatlands, we decided not to split the data set into a training and test data set. Estimates of model accuracy can then be based on cross-validation, thereby making effective use of all the data (De’ath, 2007). The prediction uncertainty of the final model is estimated by the root mean square error of prediction (RMSEcv, see above) for each land cover class. After testing for near-normal distribution of the residuals, RMSEcv can be used to derive the 68 and 95 % confidence intervals of the predictions with RMSEcv and 2 × RMSEcv, respectively.
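As a brief illustration of how class-specific confidence intervals follow from the cross-validation residuals, consider the Python sketch below. The residual values and the class name are hypothetical, and the dip-well weights are omitted for brevity, so the RMSE here is unweighted, unlike Eq. (7).

```python
import math


def rmse_cv_by_class(residuals_by_class):
    """Unweighted RMSE of cross-validation residuals for each land cover class."""
    return {
        name: math.sqrt(sum(r * r for r in res) / len(res))
        for name, res in residuals_by_class.items()
    }


def confidence_interval(prediction, rmse_cv, level=95):
    """68 % interval = +/- RMSEcv, 95 % interval = +/- 2 x RMSEcv."""
    factor = {68: 1.0, 95: 2.0}[level]
    return (prediction - factor * rmse_cv, prediction + factor * rmse_cv)


# Hypothetical cross-validation residuals (observation minus prediction).
residuals = {"wet grassland": [0.3, -0.3, 0.3, -0.3]}
rmse = rmse_cv_by_class(residuals)
ci_95 = confidence_interval(0.0, rmse["wet grassland"], level=95)
```

The factors 1 and 2 are only appropriate if the residuals of the respective class are close to normally distributed, which is why the normality test precedes this step.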


Finally, additional residual analysis was performed to evaluate whether the predictions are biased for different land cover classes or geographical regions.

2.5 Regionalization

In the final regionalization step, the predictor variables contributing to the final model were determined at a 25 × 25 m raster for all organic soils in Germany. Predictor variables were determined with the same map input that was used for model building. Land cover information, including information on ditches, was based on the data from the year 2012, and the climatic data was based on the average of the last 30 years. The fine spatial resolution of 25 × 25 m was not chosen to suggest a highly spatially accurate model. Rather, this fairly fine scale was necessary to map the relatively small-scale effects of the topography, land use and peatland geometry variables. The final model was then used to make a prediction for each of these raster cells.

3 Results and discussion

3.1 Spatial correlation structure of the data set

The variogram model fitted to the sample variogram provided a nugget (0.012 m², 0.11 m), a sill (0.09 m², 0.3 m) and a range of spatial correlation (2700 m) for our data set of WL (Fig. 4). The nugget represents the very small-scale soil hydraulic variability and micro-topography effects on WL (van der Ploeg et al., 2012) and measurement error, e.g., by differences in the determination of the ground surface and in the timing of the manual measurements. Furthermore, micro-topography (e.g., hummocks) and oscillating peat surfaces of wet peatlands pose a challenge for an accurate determination of both ground surface and water level. The water level time series in the data set were of different lengths and ranged from 1 to 20 years. Interannual variability of water levels can be large (e.g., Knotters and van Walsum, 1997). For simplicity, in our analysis data were not harmonized by extrapolating WL time series using weather data to a 30-year period. Thus, the nugget also includes errors that are introduced by dip wells with different measurement periods that are located in the range of spatial correlation. In consideration of these error sources, the fitted nugget of 0.11 m appears to be a realistic value. At 0.3 m, the fitted sill matched nearly perfectly with the standard deviation of the data (0.31 m), which indicates consistency between semivariogram model and data set. The fitted range of spatial correlation of 2700 m reflects both physical effects, i.e., the average range of lateral flow due to hydraulic gradients, as well as the effect of average land-use patterns in Germany on the spatial correlation of WL. Fitted values were used in the calculation of the dip-well-specific weights using Eq. (6).

3.2 Typical water levels for land-use types in German organic soils

The land cover classes are characterized by plausible mean and median water levels, which show consistent differences between each other (Table 2 and Fig. 5a). The mean values of arable land and grassland agree with what can be expected for their agronomic requirements, with slightly lower water levels for arable land. The high variability observed for both classes may be related to the variability of the efficiency of installed drainage systems, for example the presence and condition of tile drains and the depth of ditches. Grasslands can be managed with very variable intensity, which is partly reflected in different water levels. Figure 5a further shows that deciduous forests seem to dominate slightly drier organic soils compared to coniferous forests, which dominate under wetter conditions. A high variability of water levels is observed for the land cover class unused peatland. On the one hand, post peat-cutting topography increases the variability of WL over short distances. It probably contributes to the high variance observed for this class. On the other hand, this class comprises both rather dry unused peatlands and wetter peatlands in which re-wetting measures already took place, which, however, do not yet show a wet soil attribute in the ATKIS Digital Landscape Model. This may also cause part of the variance observed in the grassland and forest land cover classes. All wet land cover classes (reed, wet grassland, wet forest and wet unused peatland) that were separated by wetness indication clearly show higher water levels, showing that the wetness attribute of the Digital Landscape Model is a useful attribute.

Figure 5b shows the transformed water level for all classesIt can be observed that the variances of the wetter landcover increase relative to the variances of the dry land coverclasses This is due to the highest sensitivity of GHG emis-sions in the wet range of water levels (gt minus05 m) Conse-quently the rather high variance of WL for arable land cor-responds to a rather low variance of WLt ie to a rather lowassumed effect of WL variability on the GHG budget

33 BRT model calibration and validation WL vs WL t

In contrast to land cover class the other predictor variablesshowed if at all only weak relations to WL and WLt whenevaluating them with box plots 2-D cross plots and simplecorrelation matrices Here we expected BRT to detect thestrongest predictor interactions and to identify the most in-formative predictors

After model calibration with all predictors subsequentmodel simplification successively dropped those parameterswith correlationgt 07 and the lowest contribution For bothWL and WLt model performance improved during this sim-plification For WLt the highest values of NSEcv of approxi-mately 046 were achieved with 21 to 9 model parametersThe development of NSEcv for the last 50 parameters is


Figure 5. Water level relative to ground surface, WL (m), and transformed water level, WLt (−), by land cover class, illustrated as a weighted box plot. WLt = −1 corresponds to maximum CO2 emissions and WLt = 1 to maximum CH4 emissions. In the top horizontal axes, the number of dip wells in each class is indicated.

Figure 6. NSEcv as a function of the number of predictor variables used in the model of WLt during model simplification, shown for the last 50 parameter drops.

shown in Fig. 6. Further elimination of parameters led to a pronounced decline of model performance. Similar behavior was observed for the calibration on WL. In favor of a more parsimonious model, we chose the model with the lowest number of parameters before the pronounced decline of model performance occurred. For the calibration on WLt, this corresponded to the model with the lowest number of parameters that still achieved NSEcv values of > 0.45 (Fig. 6). The final WLt model comprised nine predictor variables and the final WL model seven parameters. The percentages of parameter contributions to the final model and their individual influences are discussed for WLt in Sect. 3.4.
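The selection rule described above can be written down in a few lines. The following is only a sketch of the rule "smallest number of predictors that still exceeds an NSEcv threshold". The function name `most_parsimonious` is hypothetical, and the NSEcv values in the dictionary are invented for illustration (apart from 0.453 for nine predictors, which is the value reported in Table 3); they are not read from Fig. 6.

```python
def most_parsimonious(nse_by_n_predictors, threshold):
    """Return the smallest predictor count whose NSEcv exceeds the threshold."""
    candidates = [n for n, nse in nse_by_n_predictors.items() if nse > threshold]
    if not candidates:
        raise ValueError("no model exceeds the threshold")
    return min(candidates)


# Hypothetical NSEcv values per number of predictors (illustration only)
nse_cv = {21: 0.461, 15: 0.459, 12: 0.457, 9: 0.453, 8: 0.431, 7: 0.402}

print(most_parsimonious(nse_cv, threshold=0.45))
```

This simple rule assumes that NSEcv declines steadily once it drops below the threshold. If the curve fluctuated around the threshold, a visual check of the curve would still be needed.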

Table 3 summarizes the statistical performances of the models calibrated on WL and WLt. For both models, NSEcal is considerably higher than NSEcv and shows the commonly

Table 2. Weighted mean and standard deviation of WL and WLt data, and of the WLt map presented in Sect. 3.6, for the nine land cover classes.

Land cover class       WL (m)          WLt (−)         WLt (−)
                       Mean ± sd       Mean ± sd       Map mean ± sd
Arable land            −0.69 ± 0.30    −0.76 ± 0.17    −0.66 ± 0.22
Deciduous forest       −0.45 ± 0.34    −0.49 ± 0.37    −0.47 ± 0.35
Grassland              −0.44 ± 0.29    −0.52 ± 0.32    −0.49 ± 0.30
Unused peatland        −0.39 ± 0.36    −0.39 ± 0.41    −0.37 ± 0.40
Coniferous forest      −0.36 ± 0.36    −0.37 ± 0.37    −0.46 ± 0.35
Wet unused peatland    −0.22 ± 0.27    −0.18 ± 0.40    −0.17 ± 0.36
Wet forest             −0.22 ± 0.29    −0.17 ± 0.43    −0.21 ± 0.39
Wet grassland          −0.10 ± 0.14    −0.00 ± 0.31    −0.15 ± 0.39
Reed                   −0.01 ± 0.17     0.20 ± 0.29    −0.06 ± 0.32

observed overfitting behavior of BRT models. The different measures that we conducted to minimize overfitting (cross-validation on peatlands, restriction to monotonic responses, and model simplification including elimination of highly correlated variables) lowered the difference between NSEcal and NSEcv but could not totally avoid overfitting. NSEcv of the WLt model (0.453) indicates higher predictive model performance compared to the WL model (0.381). However, as the data ranges differ due to the transformation, this comparison may be misleading. Therefore, we transformed the predictions of the WL model to obtain WLt values from this model and equally calculated the performance criteria (Table 3, second column). Then, NSEcv is slightly increased (0.397) but does not achieve the values of the model that was calibrated on WLt. A better predictive model performance of the model calibrated on WLt is also visible for the RMSEcv values. The total RMSEcv as well as the RMSEcv values for the


Table 3. Performance criteria of the different models; dry range defined as WL < −0.3 m and wet range as WL > −0.3 m.

               WL (m)         WLt (−)        WLt (−)
               (calibrated    (calibrated    (calibrated
               on WL)         on WL)         on WLt)
NSEcal          0.627          0.559          0.642
NSEcv           0.381          0.397          0.453
RMSEcv          0.269          0.299          0.284
RMSEcv,dry      0.284          0.263          0.259
RMSEcv,wet      0.222          0.382          0.355
Bias           −0.003          0.083          0.002
Biasdry        −0.012          0.070          0.003
Biaswet         0.021          0.120          0.000

dry (WL < −0.3 m) and wet range (WL > −0.3 m) show slightly lower values for the WLt model compared to WLt values from the model calibrated on WL. Given our hypothetical transfer function (Fig. 3), in which the GHG budget is linearly related to WLt, the higher accuracy of WLt predictions directly corresponds to a higher accuracy of GHG budget predictions.

Superior model performance is also evident when evaluating model bias. Only when calibrating directly on WLt are the WLt predictions bias-free. Calibration on WL and subsequent transformation to WLt introduces a model bias towards systematically lower WLt values. In subsequent applications to GHG emission upscaling, lower WLt values would lead to an overestimation of CO2 emissions and to an underestimation of CH4 emissions.
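For readers who want to reproduce criteria of the kind listed in Table 3, the following sketch computes a weighted Nash–Sutcliffe efficiency (NSE), RMSE and bias. It is a minimal illustration, not the authors' code: the function name `weighted_scores` is hypothetical, the example numbers are invented, and the sign convention bias = observation − prediction is assumed from the residual definition used in the figure captions.

```python
import math


def weighted_scores(obs, pred, weights):
    """Return (NSE, RMSE, bias) for weighted observations and predictions.

    Bias is the weighted mean of (observation - prediction); this sign
    convention is an assumption taken from the residual definition.
    """
    if not obs or not (len(obs) == len(pred) == len(weights)):
        raise ValueError("obs, pred and weights must be non-empty and of equal length")
    w_sum = sum(weights)
    mean_obs = sum(w * o for w, o in zip(weights, obs)) / w_sum
    sse = sum(w * (o - p) ** 2 for w, o, p in zip(weights, obs, pred))
    sst = sum(w * (o - mean_obs) ** 2 for w, o in zip(weights, obs))
    nse = 1.0 - sse / sst
    rmse = math.sqrt(sse / w_sum)
    bias = sum(w * (o - p) for w, o, p in zip(weights, obs, pred)) / w_sum
    return nse, rmse, bias


# Invented example: predictions that are systematically lower than observations
observed = [-0.8, -0.5, -0.2, 0.1]
predicted = [-0.9, -0.6, -0.3, 0.0]
well_weights = [1.0, 0.5, 0.5, 1.0]
nse, rmse, bias = weighted_scores(observed, predicted, well_weights)
print(f"NSE = {nse:.3f}, RMSE = {rmse:.3f}, bias = {bias:.3f}")
```

With this convention, a positive bias means that predictions are on average lower than observations, which matches the direction described for the WLt values derived from the model calibrated on WL.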

3.4 Influence of predictor variables on WLt

Given the beneficial characteristics of the model calibrated on WLt for GHG upscaling, presentation and discussion of further model results is restricted to the WLt model.

The BRT method allows the analysis of the parameter contributions to and influences on the model (Elith et al., 2008) and thus may contribute to system understanding. The percentages of the contributions of the nine predictor variables to the final model ranged from 25.2 to 5.6 % (Fig. 7). Except for protection status, at least one parameter of each of the seven parameter groups contributed to the final model. All protection status information was dropped early during the simplification process due to low contribution, although WL showed slightly higher values for data from nature protection or special areas of conservation. However, other parameters seem to be able to fully compensate for the information that is lost by dropping this predictor.

Land cover class, lc, at the dip well was the parameter with the strongest contribution (25.2 %). It basically follows the trend illustrated in Fig. 5b. The bootstrap error plotted as standard deviation (Fig. 7) shows the variation of this influence over the 1000 bootstrap models. A second land cover parameter, the fraction of dry land cover classes on organic soils in a buffer of 2500 m radius, fdry (2500), contributed to the model

with 10.3 %. The monotonic decrease of WLt with increasing fdry (2500) is plausible, as higher values reflect intensive land use in the surroundings of the dip well and thus indicate intensive artificial drainage. Together, both parameters contributed 35.5 %, and thus land cover represents the parameter group with the strongest model contribution.

Peatland characteristics are the second most important parameter group. The peatland type contributed 16 %. The model indicates that peatlands without any connection to surface water bodies (river or lake) and the class of other organic soils are characterized by lower WLt compared to the peatland types lowland bog, upland bog and fen neighboring surface water. As the class of other organic soils is generally expected to reflect lower water levels, and as surface water may have a stabilization effect on water levels of organic soils, the influence of the peatland type can be considered plausible. Besides peatland type, the substrate of the peat base contributes 5.6 %. Here, organic soils overlying peat clay layers (e.g., limnic sediments such as calcareous gyttja) or basement rock are characterized by higher WLt compared to organic soils overlying unconsolidated rock. This can be explained by the lower drainage resistance of unconsolidated rocks, which may cause an increased efficiency of anthropogenic drainage and/or a generally higher vulnerability to seepage losses. Finally, slightly lower WLt values are indicated by a high fraction of organic soils for the 500 m buffer, fpeat (500). This may reflect the higher land-use pressure on large peatlands compared to rather small peatlands, which are presumably more easily preserved by nature protection efforts.

The remaining four parameter groups are represented in the model by only one parameter each. The third most influential parameter was the length of ditches on arable land and grassland for the 250 m buffer, dilendry (250). At first glance, it may be surprising that with increasing ditch density WLt values tend to be higher, as ditches are supposed to drain the water when land is used as arable land and grassland. The fact that the model identifies a rather strong effect in the opposite direction may be caused by incomplete information about the drainage network. There is no detailed information about the spatial distribution of tile drains. Based on expert knowledge, agricultural areas with a lower ditch density are more likely to have tile drains. As these drains, easily installed with a narrow drain spacing, are more effective at draining organic soils, low WLt values for arable land and grassland may be related to low ditch densities. Furthermore, ditches were originally dug at narrow spacing in especially wet areas of organic soils, but there is no information available on whether these ditches still function properly.

The parameters wbsummer, hrel and tiras25 all show expected trends. The model predicts higher WLt for increasing climatic water balance in the summer period (May to October), wbsummer, for dip wells located in depressions (low values of hrel), and for higher small-scale topographic wetness indices calculated on the 25 × 25 digital elevation model (tiras25).


Figure 7. Partial dependence plots for the predictor variables. For an explanation of variables, see Table 1. The y axes are on WLt scale and are centered around the mean WLt. Error bars and grey area indicate the standard deviation of the response over 1000 bootstrap models. The relative contribution of each predictor is indicated as a percentage. The lines along the x axes of each plot show the distribution of data across that variable in deciles.

The fact that all parameters show expected or explainable responses in the model corroborates the reliability of the calibrated WLt model. The standard deviation of the predictor responses based on the bootstrap samples shows the stability of the observed responses.

Further insight into model behavior can be obtained by analyzing parameter interactions. This is done by changing two parameters simultaneously while keeping mean values for all other parameters (Elith et al., 2008). Figure 8 shows the two strongest parameter interactions. Parameter wbsummer

strongly interacts with ptype. The generally lower values of WLt of fens without surface water connection and other organic soils show a stronger dependency on the summer climatic water balance. While a summer climatic water balance of > −80 mm shows a rather weak effect on WLt for the wetter peatland types, for the two drier peatland types there is, in contrast, still a strong effect with increasing wbsummer. The trend for wbsummer > 130 mm for the dry peatland types is supported by seven different peatlands.


Figure 8. Partial dependence plots representing the two strongest interactions in the model: (a) between ptype and wbsummer and (b) between pbase and fdry. Fitted WLt is plotted on the y axis, which is obtained after accounting for the average effect of all other predictor variables.

Another strong interaction is observed for pbase and fdry (2500). While a rather weak effect of the fraction of arable land and grassland is observed for organic soils overlying basement rock and peat clay layer, a strong effect is observed for organic soils overlying unconsolidated rock. This interaction reflects the higher lateral range of drainage effects for organic soils with little flow resistance at the peat base. In these organic soils, intensive land use lowers the water level over large areas.

3.5 Discussion of model uncertainty

Plotting observed vs. predicted WLt from cross-validation (Fig. 9) illustrates the rather large residual variance that cannot be explained by the model. As indicated by the higher RMSEcv for the wet range (Table 3), scatter increases with increasing WLt. Error bars in the y direction indicate data error derived from the nugget of the variogram. It is shown for a few data points as an example. Due to the transformation, data error increases for higher WLt. Figure 9 demonstrates that the fraction of unexplainable variance related to data error is much higher for the wet than for the dry range. Bootstrap error, indicating the variation of the model predictions for 1000 bootstrap samples, is shown in the x direction for the same data points. Bootstrap error is lower than the data error for the wet range and slightly higher for the dry range.

Bootstrap errors demonstrate the sensitivity of model predictions to changes of the data set used for calibration. When a model possesses structural deficits, such as missing predictor variables, bootstrap errors should not be used to define confidence intervals for the model predictions. Figure 10 shows residuals from cross-validation and standard deviation of bootstrap predictions for all land cover classes. The residuals of each land cover class show near-normal distributions. For five of the nine land cover classes (wet forest, wet unused peatland, arable land, coniferous forest and reed), the Shapiro–Wilk test of normality is positive (p > 0.05). Figure 10a further indicates that residuals of each land cover

Figure 9. Observed vs. predicted transformed annual mean water level (WLt) from cross-validation results. Error bars show selected data and bootstrap model errors as standard deviation. Data points are scaled by their weights.

scatter fairly well around zero, indicating low bias for the various land cover classes. Land-cover-class-specific confidence intervals of model predictions can thus be derived from the RMSEcv of each land cover class, e.g., 2 × RMSEcv representing the 95 % confidence interval.
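The derivation of class-specific confidence intervals from cross-validation residuals can be sketched as follows. This is an unweighted simplification with invented residuals. The function names `class_halfwidths` and `interval` are hypothetical, and the factor of 2 follows the 2 × RMSEcv rule stated above, which presumes near-normal, nearly unbiased residuals.

```python
import math
from collections import defaultdict


def class_halfwidths(samples, factor=2.0):
    """Map each land cover class to factor * RMSE of its cross-validation residuals.

    samples: iterable of (land_cover_class, residual) pairs, with
    residual = observation - prediction.
    """
    squares = defaultdict(list)
    for land_cover, residual in samples:
        squares[land_cover].append(residual ** 2)
    return {lc: factor * math.sqrt(sum(sq) / len(sq)) for lc, sq in squares.items()}


def interval(prediction, land_cover, halfwidths):
    """Return the (lower, upper) approximate 95 % interval for one prediction."""
    hw = halfwidths[land_cover]
    return prediction - hw, prediction + hw


# Invented residuals for two classes (illustration only)
cv_residuals = [
    ("arable land", 0.10), ("arable land", -0.20), ("arable land", 0.15),
    ("wet grassland", 0.40), ("wet grassland", -0.35), ("wet grassland", 0.05),
]
halfwidths = class_halfwidths(cv_residuals)
print(interval(0.0, "wet grassland", halfwidths))
```

The wider interval for the wet class in this toy example mirrors the larger scatter reported for the wet range, but the numbers themselves carry no meaning.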

The prediction uncertainty derived from cross-validation is much higher than the bootstrap prediction uncertainty obtained from the bootstrap standard deviation (sd), with 2 × sd corresponding to the 95 % confidence interval (Fig. 10). The large difference between these values indicates that the model has structural deficits that can be attributed to several error sources.

i. Key influences on WLt are missing in the set of predictor variables. None of the predictor variables indicate whether and to which extent water level increase due to re-wetting measures took place in the last


Figure 10. (a) Residuals (observation–prediction) of WLt predictions and (b) standard deviation (sd) of bootstrap predictions, shown for the nine land cover classes. In the upper part, the number of dip wells in each class is indicated.

years. Wetness indicators (wet soil and/or vegetation attributes) that are obtained from the Digital Landscape Model probably react with a delay of several years. Thus, we expect the occurrence of several observed high WLt values that cannot be explained by any of the predictor variables.

ii. Small-scale topography that is not represented with sufficient detail and accuracy in the DEM may cause several predictions to strongly differ from what would be expected from the other predictor variables. A common example may be a dip well that is located on a narrow peat ridge which remained after peat-cutting and is absent in the DEM, and that is situated in an area classified as wet soil by the Digital Landscape Model. Then the model indicates a WLt that is much higher than the observed WLt, as for the observed value the reference surface was the surface of the peat ridge.

iii. Consistent information about tile drains is missing and only exists on the regional scale (Tetzlaff et al., 2009). At the national scale, however, there are no maps of tile drains. Tile drains are known to have a strong effect on WLt for arable land and grassland. As explained above, we expect parameter dilendry (250) to partially compensate for this missing information.

iv. Another source of prediction uncertainty may comprise inconsistent and erroneous land cover classification of the Digital Landscape Model due to the high degree of subjectivity for many of the attributes. Furthermore, the Digital Landscape Model may be out of date by as much as 5 years, which can cause time series with land-use change to be split at the wrong date and vegetation and wetness attributes to be not yet updated to the current conditions.

Figure 11. Residuals (observation–prediction) of WLt predictions for the three major geographical peatland regions of Germany. In the upper part, the number of dip wells in each class is indicated.

v. The water balance of fens strongly depends on the size and the hydraulic head of the groundwater catchment, i.e., of the aquifer underlying the peat layer. Unfortunately, there is no consistent map of hydraulic heads or groundwater catchments for all of Germany.

We checked model predictions for geographical bias. Geographical location was not one of the model parameters. However, the history and policy of land use on organic soils, current ditch water management and climate do show large-scale geographical trends. We divided our data set into the three major German peatland regions (NE, NW and S) and evaluated the model residuals (Fig. 11) to see whether our model is biased due to important missing geographical effects. A serious bias for any of the three major German peatland regions cannot be identified.


Figure 12. Map of predictions of transformed annual mean water level (WLt) for all German organic soils (a) and an enlarged map section (b). Probability distribution in (c) indicates the uncertainty of a specific point prediction for wet grassland as an example. Here, the predicted value is approximately WLt = 0, but note that wet grassland predictions do vary in space depending on the values of the other model parameters. The histogram shows the residuals from cross-validation for wet grassland, to which the probability distribution was fitted.

When applying calibrated statistical models during regionalization, it is important to check model behavior for extrapolation outside the range of the parameter space that is covered by the data upon which the model was built. BRT always extrapolates at a constant value from the most extreme environmental value in the training data. In contrast to other types of statistical models, e.g., generalized linear models, BRT does not continue the fitted trend beyond the last observation. Regarding the categorical variables, the data set covers all classes occurring in Germany with several peatlands. The data set also covers the major range of values occurring in Germany for the numerical predictor variables. Furthermore, Fig. 7 indicates that the constant values at which the model extrapolates the influence of the variables do not raise major concern for any extreme predictions outside the parameter range.
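The extrapolation behavior described above follows from the piecewise-constant nature of regression trees. The following sketch contrasts a single, hand-written 1-D regression tree with a straight-line fit; a boosted ensemble is a sum of such trees and therefore shares the property. The functions `fit_tree`, `predict_tree` and `fit_line` are hypothetical helpers written for this example, and the training data are invented.

```python
def _sse(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def fit_tree(xs, ys, depth=3, min_leaf=2):
    """Fit a tiny 1-D regression tree; returns a leaf mean or (threshold, left, right)."""
    mean = sum(ys) / len(ys)
    if depth == 0 or len(xs) < 2 * min_leaf:
        return mean
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical predictor values
        cost = _sse([y for _, y in pairs[:i]]) + _sse([y for _, y in pairs[i:]])
        if best is None or cost < best[0]:
            best = (cost, i)
    if best is None:
        return mean
    i = best[1]
    threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
    left_x, left_y = zip(*pairs[:i])
    right_x, right_y = zip(*pairs[i:])
    return (threshold,
            fit_tree(list(left_x), list(left_y), depth - 1, min_leaf),
            fit_tree(list(right_x), list(right_y), depth - 1, min_leaf))


def predict_tree(tree, x):
    """Walk down the tree until a leaf (a plain number) is reached."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


# Invented training data: a linear trend observed only for predictor values 0..10
train_x = [float(i) for i in range(11)]
train_y = [0.1 * x - 0.8 for x in train_x]
tree = fit_tree(train_x, train_y)
slope, intercept = fit_line(train_x, train_y)

for x in (10.0, 20.0, 100.0):
    print(x, round(predict_tree(tree, x), 3), round(slope * x + intercept, 3))
```

Beyond the largest training value, the tree keeps returning the mean of its outermost leaf, while the line continues the fitted trend. This is why the plateau values at the ends of the partial dependence plots matter when the model is applied to grid cells outside the sampled predictor range.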

3.6 Regionalization

The map of WLt resulting from the application of the fitted WLt model to all grid cells shows gradients at the regional scale (Fig. 12a). In the south of Germany, for example, a gradient from wet to dry can be observed for the pre-alpine upland bogs and the peatlands of the moraine plain. In the north of Germany, the map indicates that organic soils in the

very NE are wetter than the rest. For the rest of the north, a slight gradient can be observed from less dry to dry from NW to E, which is mainly driven by the higher summer climatic water balance in the NW. As both categorical and numerical predictor variables do also vary at the sub-regional scale, the resulting map also shows gradients within peatland areas, e.g., due to small-scale land-use, ditch density gradients and topography effects (Fig. 12b).

We calculated WLt averages of the land cover classes using the regionalized WLt from the map (Table 2, column 3). The given standard deviation comprises both the variability within a land cover class that is explained by the model as well as the uncertainty of each prediction. Resulting means and standard deviations slightly differ from the corresponding values of the data set. The land-cover-specific WLt values obtained from the map can be considered as being more representative, as the regionalization procedure is supposed to partly account for potential bias in the data set.

When applying this map and its predicted WLt values in subsequent GHG upscaling, it is crucial that model uncertainty is propagated properly. An example demonstrates the necessity of uncertainty propagation. For a grid cell classified as wet grassland, the probability distribution of WLt is shown based on a normal distribution that was fitted to the


residuals of this land cover class (Fig. 12c). Without propagating the uncertainty, and when only translating the predicted WLt (possibly in combination with other parameters, e.g., soil properties) into a GHG budget, the GHG budget is strongly underestimated, as the WLt prediction is close to zero, indicating neither large CO2 nor CH4 emissions. When translating the full distribution of WLt into a GHG budget, the resulting GHG budget would be much higher, as the GHG budget increases on both sides of the predicted WLt.
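The effect of ignoring prediction uncertainty can be made concrete with a small numerical sketch. It assumes, purely for illustration, a V-shaped GHG budget in arbitrary units that is zero at WLt = 0 and grows linearly on both sides (`ghg_budget` below); this stands in for the hypothetical transfer function and is not the function used in the study. The standard deviation of 0.35 is likewise an invented value.

```python
import math
from statistics import NormalDist


def ghg_budget(wlt):
    """Assumed V-shaped GHG budget (arbitrary units): minimum at WLt = 0."""
    return abs(wlt)


def expected_budget(mean, sd, n=10000):
    """Average the budget over a normal distribution of WLt using n equal-probability quantiles."""
    dist = NormalDist(mean, sd)
    return sum(ghg_budget(dist.inv_cdf((i + 0.5) / n)) for i in range(n)) / n


predicted_wlt = 0.0   # wet grassland prediction close to zero, as in the text
residual_sd = 0.35    # hypothetical standard deviation of the residuals

point_estimate = ghg_budget(predicted_wlt)
propagated = expected_budget(predicted_wlt, residual_sd)
print(f"budget at predicted WLt: {point_estimate:.3f}")
print(f"budget averaged over the WLt distribution: {propagated:.3f}")
```

Under these assumptions the budget at the predicted value is zero, whereas averaging over the distribution gives approximately sd × sqrt(2/π), i.e., a clearly positive budget. Any transfer function that is convex around the prediction shows the same direction of bias.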

3.7 Possible paths for model improvement

The model performance that is achieved by the statistical approach presented in our study raises the question whether collecting more WL data can improve model performance or whether the factor that is constraining the model performance is the limited strength of the nation-wide available predictor variables. To assess this question, additional "hold-out models" were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated and model performance was evaluated with NSEcv. As expected, results indicate an increase of model performance with increasing number of peatlands used in the model building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when including many more peatlands into the data set. More promising would be the specific collection of more data on the weakly represented and/or important land cover classes arable land and grassland.
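The sampling step of such a hold-out experiment can be sketched as follows. Only the random selection of whole peatlands is shown; model fitting and the NSEcv calculation are left as a comment. The function name `sample_wells_by_peatland` and the toy data set (53 peatlands with three dip wells each) are assumptions for illustration, and the step size of 10 peatlands is also assumed.

```python
import random


def sample_wells_by_peatland(well_to_peatland, n_peatlands, rng):
    """Return the dip wells that belong to a random selection of n_peatlands peatlands."""
    peatlands = sorted(set(well_to_peatland.values()))
    if not 0 < n_peatlands <= len(peatlands):
        raise ValueError("n_peatlands out of range")
    chosen = set(rng.sample(peatlands, n_peatlands))
    return sorted(w for w, p in well_to_peatland.items() if p in chosen)


# Toy data set (hypothetical): 53 peatlands with 3 dip wells each
wells = {f"well_{p}_{k}": f"peatland_{p}" for p in range(53) for k in range(3)}
rng = random.Random(42)  # fixed seed so that the selections are reproducible

for n in (10, 20, 30, 40, 50):
    for _ in range(500):
        subset = sample_wells_by_peatland(wells, n, rng)
        # A real analysis would fit the BRT model to `subset` here and store its NSEcv.
    print(n, "peatlands:", len(subset), "dip wells in the last random selection")
```

Selecting whole peatlands rather than individual dip wells keeps spatially correlated wells together, which is in line with the cross-validation on peatlands mentioned earlier.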

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nation-wide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that has been successfully conducted in Finke et al. (2004) using the transfer noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, such as spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± sd.

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data covers only a small fraction of peatlands of the final map and thus spatial interpolation is not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG-relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. With the analysis of their effect on the water level, it was possible to gain insight into natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions depend linearly. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions, and avoided model bias.

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show, however, that the potential for model improvement by increasing the number of WLt data seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively few data are available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, for example water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach is suited to application to any other country where similar data of target and predictor variables are available. It is important that the spatial


resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if for specific regions detailed information on peat properties becomes available and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support; Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support; Annette Freibauer for reviewing the paper draft; and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project "Organic soils" funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43, 5, 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirkby, M.: A physically based variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De'ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz – Moornutzungsstrategien 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.


Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package "dismo", version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada's forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8, 5, 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models, part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package "gbm", version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.

Tetzlaff, B., Kuhr, P., and Wendland, F.: A New Method for Creating Maps of Artificially Drained Areas in Large River Basins Based on Aerial Photographs and Geodata, Irrig. Drain., 58, 569–585, doi:10.1002/Ird.426, 2009.

Thompson, J. R., Gavin, H., Refsgaard, A., Sorenson, H. R., and Gowing, D. J.: Modelling the hydrological impacts of climate change on UK lowland wet grassland, Wetl. Ecol. Manage., 17, 503–523, doi:10.1007/s11273-008-9127-1, 2009.

UBA: National Inventory Report for the German Greenhouse Gas Inventory 1990–2008, Submission under the United Nations Framework Convention on Climate Change and the Kyoto Protocol 2012, Dessau, Germany, 2012.

van den Akker, J. J. H., Jansen, P. C., Hendriks, R. F. A., Hoving, I., and Pleijter, M.: Submerged infiltration to halve subsidence and GHG emissions of agricultural peat soils, Proceedings of the 14th International Peat Congress, Stockholm, Sweden, 2012.

van der Gaast, J. W. J., Massop, H. T. L., and Vroon, H. R. J.: Actuele grondwaterstandsituatie in natuurgebieden: Een Pilotstudie, Wettelijke Onderzoekstaken Natuur & Milieu, WOt-rapport 94, Wageningen, 134 pp., 2009.


van der Ploeg, M. J., Appels, W. M., Cirkel, D. G., Oosterwoud, M. R., Witte, J. P. M., and van der Zee, S. E. A. T. M.: Microtopography as a Driving Mechanism for Ecohydrological Processes in Shallow Groundwater Systems, Vadose Zone J., 11, 52–62, doi:10.2136/Vzj2011.0098, 2012.

Wackernagel, H.: Multivariate Geostatistics, Springer, Berlin, Germany, 387 pp., 2003.


Finally, additional residual analysis was performed to evaluate whether the predictions are biased for different land cover classes or geographical regions.

2.5 Regionalization

In the final regionalization step, the predictor variables contributing to the final model were determined at a 25 × 25 m raster for all organic soils in Germany. Predictor variables were determined with the same map input that was used for model building. Land cover information, including information on ditches, was based on the data from the year 2012, and the climatic data were based on the average of the last 30 years. The fine spatial resolution of 25 × 25 m was not chosen to suggest a highly spatially accurate model. Rather, this fairly fine scale was necessary to map the relatively small-scale effects of the topography, land use and peatland geometry variables. The final model was then used to make a prediction for each of these raster cells.

3 Results and discussion

3.1 Spatial correlation structure of the data set

The variogram model fitted to the sample variogram provided a nugget (0.012 m², 0.11 m), a sill (0.09 m², 0.3 m) and a range of spatial correlation (2700 m) for our data set of WL (Fig. 4). The nugget represents the very small-scale soil hydraulic variability and micro-topography effects on WL (van der Ploeg et al., 2012) and measurement error, e.g., by differences in the determination of the ground surface and in the timing of the manual measurements. Furthermore, micro-topography (e.g., hummocks) and oscillating peat surfaces of wet peatlands pose a challenge for an accurate determination of both ground surface and water level. The water level time series in the data set were of different lengths and ranged from 1 to 20 years. Interannual variability of water levels can be large (e.g., Knotters and van Walsum, 1997). For simplicity, in our analysis, data were not harmonized by extrapolating WL time series using weather data to a 30-year period. Thus, the nugget also includes errors that are introduced by dip wells with different measurement periods that are located in the range of spatial correlation. In consideration of these error sources, the fitted nugget of 0.11 m appears to be a realistic value. At 0.3 m, the fitted sill matched nearly perfectly with the standard deviation of the data (0.31 m), which indicates consistency between semivariogram model and data set. The fitted range of spatial correlation of 2700 m reflects both physical effects, i.e., the average range of lateral flow due to hydraulic gradients, as well as the effect of average land-use patterns in Germany on the spatial correlation of WL. Fitted values were used in the calculation of the dip-well-specific weights using Eq. (6).
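The reported variogram parameters can be checked with a short sketch. The nugget, sill and range are the fitted values given above; the spherical model shape and the function name `spherical_variogram` are assumptions made for illustration only, since the functional form of the fitted model is not stated in this section.

```python
import math

# Fitted values reported in the text (variances in m^2, range in m).
NUGGET = 0.012
SILL = 0.09
RANGE_M = 2700.0


def spherical_variogram(h, nugget=NUGGET, sill=SILL, range_m=RANGE_M):
    """Semivariance at lag distance h (m).

    The spherical shape is an assumption; only nugget, sill and range
    are taken from the text.
    """
    if h <= 0:
        return 0.0
    if h >= range_m:
        return sill
    r = h / range_m
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)


# Standard deviations corresponding to the reported variances.
nugget_sd = math.sqrt(NUGGET)  # about 0.11 m
sill_sd = math.sqrt(SILL)      # 0.3 m
```

Under this assumed shape, dip wells more than 2700 m apart are treated as spatially uncorrelated, because the semivariance has reached the sill.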

3.2 Typical water levels for land-use types in German organic soils

The land cover classes are characterized by plausible mean and median water levels, which show consistent differences between each other (Table 2 and Fig. 5a). The mean values of arable land and grassland agree with what can be expected for their agronomic requirements, with slightly lower water levels for arable land. The high variability observed for both classes may be related to the variability of the efficiency of installed drainage systems, for example the presence and condition of tile drains and the depth of ditches. Grasslands can be managed with very variable intensity, which is partly reflected in different water levels. Figure 5a further shows that deciduous forests seem to dominate slightly drier organic soils compared to coniferous forests, which dominate under wetter conditions. A high variability of water levels is observed for the land cover class unused peatland. On the one hand, post peat-cutting topography increases the variability of WL over short distances. It probably contributes to the high variance observed for this class. On the other hand, this class comprises both rather dry unused peatlands and wetter peatlands in which re-wetting measures have already taken place, which, however, do not yet show a wet soil attribute in the ATKIS Digital Landscape Model. This may also cause part of the variance observed in the grassland and forest land cover classes. All wet land cover classes (reed, wet grassland, wet forest and wet unused peatland) that were separated by wetness indication clearly show higher water levels, showing that the wetness attribute of the Digital Landscape Model is a useful attribute.

Figure 5b shows the transformed water level for all classes. It can be observed that the variances of the wetter land cover classes increase relative to the variances of the dry land cover classes. This is due to the highest sensitivity of GHG emissions in the wet range of water levels (> −0.5 m). Consequently, the rather high variance of WL for arable land corresponds to a rather low variance of WLt, i.e., to a rather low assumed effect of WL variability on the GHG budget.

3.3 BRT model calibration and validation: WL vs. WLt

In contrast to land cover class, the other predictor variables showed, if at all, only weak relations to WL and WLt when evaluating them with box plots, 2-D cross plots and simple correlation matrices. Here, we expected BRT to detect the strongest predictor interactions and to identify the most informative predictors.

After model calibration with all predictors, subsequent model simplification successively dropped those parameters with correlation > 0.7 and the lowest contribution. For both WL and WLt, model performance improved during this simplification. For WLt, the highest values of NSEcv of approximately 0.46 were achieved with 21 to 9 model parameters. The development of NSEcv for the last 50 parameters is


Figure 5. Water level relative to ground surface, WL (m), and transformed water level, WLt (−), by land cover class, illustrated as a weighted box plot. WLt = −1 corresponds to maximum CO2 emissions and WLt = 1 to maximum CH4 emissions. In the top horizontal axes, the number of dip wells in each class is indicated.

Figure 6. NSEcv as a function of the number of predictor variables used in the model of WLt during model simplification, shown for the last 50 parameter drops.

shown in Fig. 6. Further elimination of parameters led to a pronounced decline of model performance. Similar behavior was observed for the calibration on WL. In favor of a more parsimonious model, we chose the model with the lowest number of parameters before the pronounced decline of model performance occurred. For the calibration on WLt, this corresponded to the model with the lowest number of parameters that still achieved NSEcv values of > 0.45 (Fig. 6). The final WLt model comprised nine predictor variables and the final WL model seven parameters. The percentages of parameter contributions to the final model and their individual influences are discussed for WLt in Sect. 3.4.

Table 3 summarizes the statistical performances of the models calibrated on WL and WLt. For both models, NSEcal is considerably higher than NSEcv and shows the commonly
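The selection rule described above can be sketched as follows. The function `nse` implements the Nash–Sutcliffe efficiency (Nash and Sutcliffe, 1970) with optional weights; the helper `most_parsimonious` and the values in `example_curve` are hypothetical and only illustrate the rule "lowest number of parameters with NSEcv > 0.45". They are not the values behind Fig. 6, and the sketch assumes that NSEcv does not recover above the threshold once the decline has set in.

```python
def nse(observed, predicted, weights=None):
    """Nash-Sutcliffe efficiency, optionally weighted (weights are an assumption)."""
    if weights is None:
        weights = [1.0] * len(observed)
    w_sum = sum(weights)
    mean_obs = sum(w * o for w, o in zip(weights, observed)) / w_sum
    ss_res = sum(w * (o - p) ** 2 for w, o, p in zip(weights, observed, predicted))
    ss_tot = sum(w * (o - mean_obs) ** 2 for w, o in zip(weights, observed))
    return 1.0 - ss_res / ss_tot


def most_parsimonious(nse_by_n_predictors, threshold=0.45):
    """Smallest number of predictors whose cross-validated NSE exceeds the threshold."""
    eligible = [n for n, score in nse_by_n_predictors.items() if score > threshold]
    return min(eligible) if eligible else None


# Hypothetical NSEcv values per number of predictors, for illustration only.
example_curve = {21: 0.46, 15: 0.46, 9: 0.453, 8: 0.43, 7: 0.40}
chosen = most_parsimonious(example_curve)
```

With these illustrative numbers the rule selects the nine-predictor model, mirroring the choice reported for the WLt calibration.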

Table 2. Weighted mean and standard deviation of WL and WLt data and of the WLt map presented in Sect. 3.6 for the nine land cover classes.

                     WL (m)           WLt (−)          WLt (−)
                     Mean ± sd        Mean ± sd        Map mean ± sd
Arable land          −0.69 ± 0.30     −0.76 ± 0.17     −0.66 ± 0.22
Deciduous f.         −0.45 ± 0.34     −0.49 ± 0.37     −0.47 ± 0.35
Grassland            −0.44 ± 0.29     −0.52 ± 0.32     −0.49 ± 0.30
Unused peatl.        −0.39 ± 0.36     −0.39 ± 0.41     −0.37 ± 0.40
Coniferous f.        −0.36 ± 0.36     −0.37 ± 0.37     −0.46 ± 0.35
Wet unused peatl.    −0.22 ± 0.27     −0.18 ± 0.40     −0.17 ± 0.36
Wet forest           −0.22 ± 0.29     −0.17 ± 0.43     −0.21 ± 0.39
Wet grassland        −0.10 ± 0.14     −0.00 ± 0.31     −0.15 ± 0.39
Reed                 −0.01 ± 0.17      0.20 ± 0.29     −0.06 ± 0.32

observed overfitting behavior of BRT models. The different measures that we conducted to minimize overfitting (cross-validation on peatlands, restriction to monotonic responses and model simplification including elimination of highly correlated variables) lowered the difference between NSEcal and NSEcv but could not totally avoid overfitting. NSEcv of the WLt model (0.453) indicates higher predictive model performance compared to the WL model (0.381). However, as the data ranges differ due to the transformation, this comparison may be misleading. Therefore, we transformed the predictions of the WL model to obtain WLt values from this model and equally calculated the performance criteria (Table 3, second column). Then, NSEcv is slightly increased (0.397), but does not achieve the values of the model that was calibrated on WLt. A better predictive model performance of the model calibrated on WLt is also visible for the RMSEcv values. The total RMSEcv as well as the RMSEcv values for the


Table 3. Performance criteria of the different models; dry range defined as WL < −0.3 m and wet range as WL > −0.3 m.

               WL (m)               WLt (−)              WLt (−)
               (calibrated on WL)   (calibrated on WL)   (calibrated on WLt)
NSEcal          0.627                0.559                0.642
NSEcv           0.381                0.397                0.453
RMSEcv          0.269                0.299                0.284
RMSEcv,dry      0.284                0.263                0.259
RMSEcv,wet      0.222                0.382                0.355
Bias           −0.003                0.083                0.002
Bias,dry       −0.012                0.070                0.003
Bias,wet        0.021                0.120                0.000

dry (WL < −0.3 m) and wet range (WL > −0.3 m) show slightly lower values for the WLt model compared to WLt values from the model calibrated on WL. Given our hypothetical transfer function (Fig. 3), in which the GHG budget is linearly related to WLt, the higher accuracy of WLt predictions directly corresponds to a higher accuracy of GHG budget predictions.

Superior model performance is also evident when evaluating model bias. Only when calibrating directly on WLt are the WLt predictions bias-free. Calibration on WL and subsequent transformation to WLt introduces a model bias towards systematically lower WLt values. In subsequent applications to GHG emission upscaling, lower WLt values would lead to an overestimation of CO2 emissions and to an underestimation of CH4 emissions.

3.4 Influence of predictor variables on WLt

Given the beneficial characteristics of the model calibrated on WLt for GHG upscaling, presentation and discussion of further model results is restricted to the WLt model.

The BRT method allows the analysis of the parameter contributions to and influences on the model (Elith et al., 2008) and thus may contribute to system understanding. The percentages of the contributions of the nine predictor variables to the final model ranged from 25.2 to 5.6 % (Fig. 7). Except for protection status, at least one parameter of each of the seven parameter groups contributed to the final model. All protection status information was dropped early during the simplification process due to low contribution, although WL showed slightly higher values for data from nature protection or special areas of conservation. However, other parameters seem to be able to fully compensate for the information that is lost by dropping this predictor.
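The range-specific criteria of Table 3 can be computed along the following lines. This is a minimal, unweighted sketch: the function name `range_metrics` is hypothetical, bias is taken as the mean of observation minus prediction (an assumption, chosen to match the residual definition in Fig. 10), and the dip-well weights used in the study are omitted for brevity.

```python
import math


def range_metrics(wl_obs, obs, pred, threshold=-0.3):
    """RMSE and bias (observation minus prediction) for all, dry and wet wells.

    wl_obs: observed water level WL (m), used only to assign the range.
    obs, pred: observed and predicted values of the evaluated variable.
    Wells exactly at the threshold fall into neither range.
    """
    def summarize(pairs):
        if not pairs:
            return None
        residuals = [o - p for o, p in pairs]
        rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
        bias = sum(residuals) / len(residuals)
        return {"rmse": rmse, "bias": bias}

    triples = list(zip(wl_obs, obs, pred))
    return {
        "all": summarize([(o, p) for _, o, p in triples]),
        "dry": summarize([(o, p) for wl, o, p in triples if wl < threshold]),
        "wet": summarize([(o, p) for wl, o, p in triples if wl > threshold]),
    }
```

A positive bias in this convention means that predictions are systematically lower than observations, which is the direction reported for WLt values derived from the model calibrated on WL.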

Land cover class, lc, at the dip well was the parameter with the strongest contribution (25.2 %). It basically follows the trend illustrated in Fig. 5b. The bootstrap error, plotted as standard deviation (Fig. 7), shows the variation of this influence over the 1000 bootstrap models. A second land cover parameter, the fraction of dry land cover classes on organic soils in a buffer of 2500 m radius, fdry(2500), contributed to the model with 10.3 %. The monotonic decrease of WLt with increasing fdry(2500) is plausible, as higher values reflect intensive land use in the surroundings of the dip well and thus indicate intensive artificial drainage. Together, both parameters contributed 35.5 %, and thus land cover represents the parameter group with the strongest model contribution.

Peatland characteristics are the second most important parameter group. The peatland type contributed 16 %. The model indicates that peatlands without any connection to surface water bodies (river or lake) and the class of other organic soils are characterized by lower WLt compared to the peatland types lowland bog, upland bog and fen neighboring surface water. As the class of other organic soils is generally expected to reflect lower water levels, and as surface water may have a stabilization effect on water levels of organic soils, the influence of the peatland type can be considered plausible. Besides peatland type, the substrate of the peat base contributes 5.6 %. Here, organic soils overlying peat clay layers (e.g., limnic sediments such as calcareous gyttja) or basement rock are characterized by higher WLt compared to organic soils overlying unconsolidated rock. This can be explained by the lower drainage resistance of unconsolidated rocks. This may cause an increased efficiency of anthropogenic drainage and/or a generally higher vulnerability to seepage losses. Finally, slightly lower WLt values are indicated by a high fraction of organic soils for the 500 m buffer, fpeat(500). This may reflect the higher land-use pressure on large peatlands compared to rather small peatlands, which tend to be more easily preserved by nature protection efforts.

The remaining four parameter groups are represented in the model by only one parameter each. The third most influential parameter was the length of ditches on arable land and grassland for the 250 m buffer, dilendry(250). At first glance, it may be surprising that with increasing ditch density WLt values tend to be higher, as ditches are supposed to drain the water when land is used as arable land and grassland. The fact that the model identifies a rather strong effect in the opposite direction may be caused by incomplete information about the drainage network. There is no detailed information about the spatial distribution of tile drains. Based on expert knowledge, agricultural areas with a lower ditch density are more likely to have tile drains. As these drains, easily installed with a narrow drain spacing, are more effective at draining organic soils, low WLt values for arable land and grassland may be related to low ditch densities. Furthermore, ditches were originally dug at narrow spacing in especially wet areas of organic soils, but there is no information available on whether these ditches still function properly.

The parameters wbsummer, hrel and tiras25 all show expected trends. The model predicts higher WLt for increasing climatic water balance in the summer period (May to October), wbsummer, for dip wells located in depressions (low values of hrel), and for higher small-scale topographic wetness indices calculated on the 25 × 25 m digital elevation model (tiras25).


Figure 7. Partial dependence plots for the predictor variables. For an explanation of variables, see Table 1. The y axes are on WLt scale and are centered around the mean WLt. Error bars and grey area indicate standard deviation of the response over 1000 bootstrap models. The relative contribution of each predictor is indicated as percentage. The lines along the x axes of each plot show the distribution of data across that variable in deciles.

The fact that all parameters show expected or explainable responses in the model corroborates the reliability of the calibrated WLt model. The standard deviation of the predictor responses based on the bootstrap samples shows the stability of the observed responses.

Further insight into model behavior can be obtained by analyzing parameter interactions. This is obtained by changing two parameters simultaneously while keeping mean values for all other parameters (Elith et al., 2008). Figure 8 shows the two strongest parameter interactions. Parameter wbsummer strongly interacts with ptype. The generally lower values of WLt of fens without surface water connection and other organic soils show a stronger dependency on the summer climatic water balance. While a summer climatic water balance of > −80 mm shows a rather weak effect on WLt for the wetter peatland types, the two drier peatland types still show a strong effect with increasing wbsummer. The trend for wbsummer > 130 mm for the dry peatland types is supported by seven different peatlands.
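The interaction analysis described above can be illustrated with a generic sketch that varies two predictors on a grid while all others are held at fixed reference values. The function `interaction_surface`, the predictor names and the toy model are hypothetical; the toy model only mimics the qualitative pattern discussed in the text (a steeper response to the summer water balance for a drier peatland type) and is not the fitted BRT model. The categorical peatland type is represented here by a simple 0/1 indicator.

```python
def interaction_surface(predict, reference, name_a, values_a, name_b, values_b):
    """Predictions on a grid of two predictors, others fixed at reference values."""
    surface = []
    for a in values_a:
        row = []
        for b in values_b:
            x = dict(reference)   # copy so the reference values stay untouched
            x[name_a] = a
            x[name_b] = b
            row.append(predict(x))
        surface.append(row)
    return surface


def toy_model(x):
    """Hypothetical stand-in for a fitted model (illustration only)."""
    slope = 0.004 if x["ptype_dry"] == 1 else 0.001
    return -0.5 + slope * x["wb_summer"] + 0.1 * x["other"]


reference = {"wb_summer": 0.0, "ptype_dry": 0, "other": 0.2}
surface = interaction_surface(
    toy_model, reference, "ptype_dry", [0, 1], "wb_summer", [-100.0, 0.0, 100.0]
)
```

Each row of `surface` is the response to the water balance for one peatland type; an interaction shows up as rows that are not parallel.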


Figure 8. Partial dependence plots representing the two strongest interactions in the model: (a) between ptype and wbsummer and (b) between pbase and fdry. Fitted WLt is plotted on the y axis, which is obtained after accounting for the average effect of all other predictor variables.

Another strong interaction is observed for pbase and fdry(2500). While a rather weak effect of the fraction of arable land and grassland is observed for organic soils overlying basement rock and peat clay layer, a strong effect is observed for organic soils overlying unconsolidated rock. This interaction reflects the higher lateral range of drainage effects for organic soils with little flow resistance at the peat base. In these organic soils, intensive land use lowers the water level over large areas.

3.5 Discussion of model uncertainty

Plotting observed vs. predicted WLt from cross-validation (Fig. 9) illustrates the rather large residual variance that cannot be explained by the model. As indicated by the higher RMSEcv for the wet range (Table 3), scatter increases with increasing WLt. Error bars in the y direction indicate data error derived from the nugget of the variogram. It is shown for a few data points as an example. Due to transformation, data error increases for higher WLt. Figure 9 demonstrates that the fraction of unexplainable variance related to data error is much higher for the wet than for the dry range. Bootstrap error, indicating the variation of the model predictions for 1000 bootstrap samples, is shown in the x direction for the same data points. Bootstrap error is lower than the data error for the wet range and slightly higher for the dry range.

Bootstrap errors demonstrate the sensitivity of model predictions to changes of the data set used for calibration. When a model possesses structural deficits, such as missing predictor variables, bootstrap errors should not be used to define confidence intervals for the model predictions. Figure 10 shows residuals from cross-validation and standard deviation of bootstrap predictions for all land cover classes. The residuals of each land cover class show near-normal distributions. For five of the nine land cover classes (wet forest, wet unused peatland, arable land, coniferous forest and reed), the Shapiro–Wilk test of normality is positive (p > 0.05). Figure 10a further indicates that residuals of each land cover

Figure 9. Observed vs. predicted transformed annual mean water level (WLt) from cross-validation results. Error bars show selected data and bootstrap model errors as standard deviation. Data points are scaled by their weights.

scatter fairly well around zero, indicating low bias for the various land cover classes. Land-cover-class-specific confidence intervals of model predictions can thus be derived from the RMSEcv of each land cover class, e.g., 2 × RMSEcv representing the 95 % confidence interval.

The prediction uncertainty derived from cross-validation is much higher than the bootstrap prediction uncertainty obtained from the bootstrap standard deviation (sd), with 2 × sd corresponding to the 95 % confidence interval (Fig. 10). The large difference between these values indicates that the model has structural deficits that can be attributed to several error sources.

i. Key influences on WLt are missing in the set of predictor variables. None of the predictor variables indicate whether and to what extent a water level increase due to re-wetting measures took place in the last
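The class-specific interval described above amounts to the prediction plus or minus twice the RMSEcv of the respective land cover class. A minimal sketch, assuming approximately normal and unbiased residuals; the function names and the residual values used for illustration are hypothetical.

```python
import math


def rmse(residuals):
    """Root mean square error of cross-validation residuals."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def interval_from_residuals(prediction, class_residuals, factor=2.0):
    """Approximate 95 % interval: prediction +/- factor * RMSEcv of the class."""
    half_width = factor * rmse(class_residuals)
    return prediction - half_width, prediction + half_width


# Hypothetical residuals of one land cover class (observation minus prediction).
example_residuals = [0.3, -0.3, 0.3, -0.3]
lower, upper = interval_from_residuals(0.0, example_residuals)
```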


Figure 10. (a) Residuals (observation − prediction) of WLt predictions and (b) standard deviation (sd) of bootstrap predictions, shown for the nine land cover classes. In the upper part, the number of dip wells in each class is indicated.

years. Wetness indicators (wet soil and/or vegetation attributes) that are obtained from the Digital Landscape Model probably react with a delay of several years. Thus, we expect the occurrence of several observed high WLt values that cannot be explained by any of the predictor variables.

ii. Small-scale topography that is not represented with sufficient detail and accuracy in the DEM may cause several predictions to strongly differ from what would be expected from the other predictor variables. A common example may be a dip well that is located on a narrow peat ridge, which remained after peat-cutting and is absent in the DEM, and that is situated in an area classified as wet soil by the Digital Landscape Model. Then, the model indicates a WLt that is much higher than the observed WLt, as for the observed value the reference surface was the surface of the peat ridge.

iii. Consistent information about tile drains is missing and only exists on the regional scale (Tetzlaff et al., 2009). At the national scale, however, there are no maps on tile drains. Tile drains are known to have a strong effect on WLt for arable land and grassland. As explained above, we expect parameter dilendry(250) to partially compensate for this missing information.

iv. Another source of prediction uncertainty may comprise inconsistent and erroneous land cover classification of the Digital Landscape Model due to the high degree of subjectivity for many of the attributes. Furthermore, the temporal accuracy of the Digital Landscape Model may be as inaccurate as 5 years, which can cause time series with land-use change to be split at the wrong date and vegetation and wetness attributes to be not yet updated to the current conditions.

Figure 11. Residuals (observation − prediction) of WLt predictions for the three major geographical peatland regions of Germany. In the upper part, the number of dip wells in each class is indicated.

v. The water balance of fens strongly depends on the size and the hydraulic head of the groundwater catchment, i.e., of the aquifer underlying the peat layer. Unfortunately, there is no consistent map of hydraulic heads or groundwater catchments for all of Germany.

We checked model predictions for geographical bias. Geographical location was not one of the model parameters. However, the history and policy of land use on organic soils, current ditch water management and climate do show large-scale geographical trends. We divided our data set into the three major German peatland regions (NE, NW and S) and evaluated the model residuals (Fig. 11) to see whether our model is biased due to important missing geographical effects. A serious bias for any of the three major German peatland regions cannot be identified.


Figure 12. Map of predictions of transformed annual mean water level (WLt) for all German organic soils (a) and an enlarged map section (b). Probability distribution in (c) indicates the uncertainty of a specific point prediction for wet grassland as an example. Here, the predicted value is approximately WLt = 0, but note that wet grassland predictions do vary in space depending on the values of the other model parameters. The histogram shows the residuals from cross-validation for wet grassland to which the probability distribution was fitted.

When applying calibrated statistical models during regionalization, it is important to check model behavior for extrapolation outside the range of the parameter space that is covered by the data upon which the model was built. BRT always extrapolates at a constant value from the most extreme environmental value in the training data. In contrast to other types of statistical models, e.g., generalized linear models, BRT does not continue the fitted trend beyond the last observation. Regarding the categorical variables, the data set covers all classes occurring in Germany with several peatlands. The data set also covers the major range of values occurring in Germany for the numerical predictor variables. Furthermore, Fig. 7 indicates that the constant values at which the model extrapolates the influence of the variables do not raise major concern for any extreme predictions outside the parameter range.
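The constant extrapolation of tree-based models can be demonstrated with a deliberately small sketch of boosted regression stumps on a single numeric predictor. This is not the R "gbm" implementation used in the study; all function names and the linear toy data are hypothetical and serve only to show that predictions outside the training range equal the prediction at the nearest extreme training value.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump (least squares) on one predictor."""
    best = None
    levels = sorted(set(xs))
    for lo, hi in zip(levels, levels[1:]):
        split = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = (sum((r - mean_l) ** 2 for r in left)
               + sum((r - mean_r) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, split, mean_l, mean_r)
    return best[1:]


def fit_boosted_stumps(xs, ys, n_trees=200, learning_rate=0.1):
    """Gradient boosting with squared-error loss: each stump fits the residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        split, mean_l, mean_r = fit_stump(xs, residuals)
        stumps.append((split, mean_l, mean_r))
        preds = [p + learning_rate * (mean_l if x <= split else mean_r)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(ml if x <= s else mr for s, ml, mr in stumps)


xs = [float(i) for i in range(11)]   # training range 0 to 10
ys = [0.1 * x for x in xs]           # hypothetical linear trend
model = fit_boosted_stumps(xs, ys)
```

Because every split lies between two training values, any input beyond the largest training value falls on the same side of all splits as that value, so the fitted linear trend is not continued.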

3.6 Regionalization

The map of WLt resulting from the application of the fitted WLt model to all grid cells shows gradients at the regional scale (Fig. 12a). In the south of Germany, for example, a gradient from wet to dry can be observed for the pre-alpine upland bogs and the peatlands of the moraine plain. In the north of Germany, the map indicates that organic soils in the very NE are wetter than the rest. For the rest of the north, a slight gradient can be observed from less dry to dry from NW to E, which is mainly driven by the higher summer climatic water balance in the NW. As both categorical and numerical predictor variables do also vary at the sub-regional scale, the resulting map also shows gradients within peatland areas, e.g., due to small-scale land-use, ditch density gradients and topography effects (Fig. 12b).

We calculated WLt averages of the land cover classes using the regionalized WLt from the map (Table 2, column 3). The given standard deviation comprises both the variability within a land cover class that is explained by the model as well as the uncertainty of each prediction. Resulting means and standard deviations slightly differ from the corresponding values of the data set. The land-cover-specific WLt values obtained from the map can be considered as being more representative, as the regionalization procedure is supposed to partly account for potential bias in the data set.

When applying this map and its predicted WLt values in subsequent GHG upscaling, it is crucial that model uncertainty is propagated properly. An example demonstrates the necessity of uncertainty propagation. For a grid cell classified as wet grassland, the probability distribution of WLt is shown based on a normal distribution that was fitted to the residuals of this land cover class (Fig. 12c). Without propagating the uncertainty, and when only translating the predicted WLt (possibly in combination with other parameters, e.g., soil properties) into a GHG budget, the GHG budget is strongly underestimated, as the WLt prediction is close to zero, indicating neither large CO2 nor CH4 emissions. When translating the full distribution of WLt into a GHG budget, the resulting GHG budget would be much higher, as at both sides of the predicted WLt the GHG budget increases.
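The effect of uncertainty propagation can be sketched numerically. The example below is a minimal illustration under stated assumptions, not the transfer function of this study: `ghg_budget` is a hypothetical V-shaped function (budget = |WLt|) that merely mimics a budget rising on both sides of WLt = 0, and the standard deviation of 0.3 is an arbitrary illustrative value.

```python
import math


def ghg_budget(wlt):
    """Hypothetical V-shaped transfer function: budget rises on both sides of WLt = 0."""
    return abs(wlt)


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def expected_budget(mean, sd, n=20001, width=8.0):
    """Trapezoidal integration of ghg_budget(x) * pdf(x) over mean +/- width * sd."""
    lo = mean - width * sd
    step = 2.0 * width * sd / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0  # end points count half
        total += weight * ghg_budget(x) * normal_pdf(x, mean, sd)
    return total * step


predicted_wlt = 0.0   # point prediction close to zero, as in the wet grassland example
assumed_sd = 0.3      # illustrative value, not taken from the paper

point_estimate = ghg_budget(predicted_wlt)               # ignores prediction uncertainty
propagated = expected_budget(predicted_wlt, assumed_sd)  # averages over the distribution

print(point_estimate, round(propagated, 3))
```

For this particular V-shape the propagated value equals sd × sqrt(2/π), so any non-zero prediction uncertainty yields a budget above the point estimate of zero. The size of the effect in practice depends on the actual transfer function and residual distribution.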

3.7 Possible paths for model improvement

The model performance that is achieved by the statistical approach presented in our study raises the question whether collecting more WL data can improve model performance, or whether the factor that is constraining the model performance is the limited strength of the nation-wide available predictor variables. To assess this question, additional “hold-out models” were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated and model performance was evaluated with NSEcv. As expected, results indicate an increase of model performance with increasing number of peatlands used in the model building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when including many more peatlands into the data set. More promising would be the specific collection of more data on the weakly represented and/or important land cover classes arable land and grassland.
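The hold-out procedure can be sketched as follows. This is a schematic illustration only: the data are synthetic, a land-cover class mean stands in for the BRT model, and evaluating each random subset by leave-one-peatland-out cross-validation is our assumption about how NSEcv could be computed per subset. All names (`learning_curve`, `fit_class_means`, etc.) are hypothetical.

```python
import random
from statistics import mean, pstdev


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 - SS_res / SS_tot."""
    m = sum(obs) / len(obs)
    return 1.0 - sum((o - p) ** 2 for o, p in zip(obs, pred)) / sum((o - m) ** 2 for o in obs)


def fit_class_means(train):
    """Stand-in model: mean WLt per land cover class plus an overall fallback mean."""
    by_class = {}
    for _, lc, y in train:
        by_class.setdefault(lc, []).append(y)
    return {lc: mean(v) for lc, v in by_class.items()}, mean(y for _, _, y in train)


def predict_class_mean(model, lc):
    class_means, overall = model
    return class_means.get(lc, overall)


def nse_cv(wells):
    """Leave-one-peatland-out cross-validation; wells are (peatland_id, land_cover, wlt)."""
    obs, pred = [], []
    for held in sorted({w[0] for w in wells}):
        model = fit_class_means([w for w in wells if w[0] != held])
        for pid, lc, y in wells:
            if pid == held:
                obs.append(y)
                pred.append(predict_class_mean(model, lc))
    return nse(obs, pred)


def learning_curve(wells, sizes, n_repeats, seed=0):
    """Mean and sd of NSEcv over random selections of k peatlands, for each k in sizes."""
    rng = random.Random(seed)
    ids = sorted({w[0] for w in wells})
    curve = {}
    for k in sizes:
        scores = []
        for _ in range(n_repeats):
            chosen = set(rng.sample(ids, k))
            scores.append(nse_cv([w for w in wells if w[0] in chosen]))
        curve[k] = (mean(scores), pstdev(scores))
    return curve


# Synthetic data: 12 peatlands with 5 dip wells each (values are made up).
data_rng = random.Random(1)
class_level = {"grassland": -0.5, "reed": 0.2, "arable land": -0.8}
wells = []
for pid in range(12):
    for _ in range(5):
        lc = data_rng.choice(sorted(class_level))
        wells.append((pid, lc, class_level[lc] + data_rng.gauss(0.0, 0.2)))

curve = learning_curve(wells, sizes=[4, 8, 12], n_repeats=20)
for k, (m, s) in sorted(curve.items()):
    print(k, round(m, 3), round(s, 3))
```

Plotting the mean NSEcv against the number of peatlands, with the standard deviation as a band, would give a curve of the kind shown in Fig. 13.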

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nation-wide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that has been successfully conducted in Finke et al. (2004) using the transfer noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, as, e.g., spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± sd.

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data covers only a small fraction of peatlands of the final map, and thus spatial interpolation is not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. With the analysis of their effect on the water level, it was possible to gain insight into natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions depend linearly. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions and avoided model bias.

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show, however, that model improvement by increasing the number of WLt data seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively few data are available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, as, for example, water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach is suited to application to any other country where similar data of target and predictor variables are available. It is important that the spatial resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if for specific regions detailed information on peat properties becomes available and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support; Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support; Annette Freibauer for reviewing the paper draft; and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project “Organic soils” funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43, 5, 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirkby, M.: A physically based variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De'ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz – Moornutzungsstrategien 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.


Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package “dismo”, version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada's forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8, 5, 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package “gbm”, version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.

Tetzlaff, B., Kuhr, P., and Wendland, F.: A New Method for Creating Maps of Artificially Drained Areas in Large River Basins Based on Aerial Photographs and Geodata, Irrig. Drain., 58, 569–585, doi:10.1002/Ird.426, 2009.

Thompson, J. R., Gavin, H., Refsgaard, A., Sorenson, H. R., and Gowing, D. J.: Modelling the hydrological impacts of climate change on UK lowland wet grassland, Wetl. Ecol. Manage., 17, 503–523, doi:10.1007/s11273-008-9127-1, 2009.

UBA: National Inventory Report for the German Greenhouse Gas Inventory 1990–2008, Submission under the United Nations Framework Convention on Climate Change and the Kyoto Protocol 2012, Dessau, Germany, 2012.

van den Akker, J. J. H., Jansen, P. C., Hendriks, R. F. A., Hoving, I., and Pleijter, M.: Submerged infiltration to halve subsidence and GHG emissions of agricultural peat soils, Proceedings of the 14th International Peat Congress, Stockholm, Sweden, 2012.

van der Gaast, J. W. J., Massop, H. T. L., and Vroon, H. R. J.: Actuele grondwaterstandsituatie in natuurgebieden: Een Pilotstudie, Wettelijke Onderzoekstaken Natuur & Milieu, WOt-rapport 94, Wageningen, 134 pp., 2009.


van der Ploeg, M. J., Appels, W. M., Cirkel, D. G., Oosterwoud, M. R., Witte, J. P. M., and van der Zee, S. E. A. T. M.: Microtopography as a Driving Mechanism for Ecohydrological Processes in Shallow Groundwater Systems, Vadose Zone J., 11, 52–62, doi:10.2136/Vzj2011.0098, 2012.

Wackernagel, H.: Multivariate Geostatistics, Springer, Berlin, Germany, 387 pp., 2003.


Figure 5. Water level relative to ground surface, WL (m), and transformed water level, WLt (−), by land cover class illustrated as a weighted box plot. WLt = −1 corresponds to maximum CO2 emissions and WLt = 1 to maximum CH4 emissions. In the top horizontal axes, the number of dip wells in each class is indicated.

Figure 6. NSEcv as a function of number of predictor variables used in the model of WLt during model simplification and shown for the last 50 parameter drops.

shown in Fig. 6. Further elimination of parameters led to a pronounced decline of model performance. Similar behavior was observed for the calibration on WL. In favor of a more parsimonious model, we chose the model with the lowest number of parameters before the pronounced decline of model performance occurred. For the calibration on WLt, this corresponded to the model with lowest number of parameters that still achieved NSEcv values of > 0.45 (Fig. 6). The final WLt model comprised nine predictor variables and the final WL model seven parameters. The percentages of parameter contributions to the final model and their individual influences are discussed for WLt in Sect. 3.4.

Table 2. Weighted mean and standard deviation of WL and WLt data and of the WLt map presented in Sect. 3.6 for the nine land cover classes.

Land cover class      WL (m)           WLt (−)          WLt (−)
                      Mean ± sd        Mean ± sd        Map mean ± sd
Arable land           −0.69 ± 0.30     −0.76 ± 0.17     −0.66 ± 0.22
Deciduous forest      −0.45 ± 0.34     −0.49 ± 0.37     −0.47 ± 0.35
Grassland             −0.44 ± 0.29     −0.52 ± 0.32     −0.49 ± 0.30
Unused peatland       −0.39 ± 0.36     −0.39 ± 0.41     −0.37 ± 0.40
Coniferous forest     −0.36 ± 0.36     −0.37 ± 0.37     −0.46 ± 0.35
Wet unused peatland   −0.22 ± 0.27     −0.18 ± 0.40     −0.17 ± 0.36
Wet forest            −0.22 ± 0.29     −0.17 ± 0.43     −0.21 ± 0.39
Wet grassland         −0.10 ± 0.14     −0.00 ± 0.31     −0.15 ± 0.39
Reed                  −0.01 ± 0.17      0.20 ± 0.29     −0.06 ± 0.32

Table 3 summarizes the statistical performances of the models calibrated on WL and WLt. For both models, NSEcal is considerably higher than NSEcv and shows the commonly observed overfitting behavior of BRT models. The different measures that we conducted to minimize overfitting (cross-validation on peatlands, restriction to monotonic responses, and model simplification including elimination of highly correlated variables) lowered the difference between NSEcal and NSEcv but could not totally avoid overfitting. NSEcv of the WLt model (0.453) indicates higher predictive model performance compared to the WL model (0.381). However, as the data ranges differ due to the transformation, this comparison may be misleading. Therefore, we transformed the predictions of the WL model to obtain WLt values from this model and equally calculated the performance criteria (Table 3, second column). Then NSEcv is slightly increased (0.397) but does not achieve the values of the model that was calibrated on WLt. A better predictive model performance of the model calibrated on WLt is also visible for the RMSEcv values. The total RMSEcv as well as the RMSEcv values for the dry (WL < −0.3 m) and wet range (WL > −0.3 m) show slightly lower values for the WLt model compared to WLt values from the model calibrated on WL. Given our hypothetical transfer function (Fig. 3), in which the GHG budget is linearly related to WLt, the higher accuracy of WLt predictions directly corresponds to a higher accuracy of GHG budget predictions.

Table 3. Performance criteria of the different models; dry range defined as WL < −0.3 m and wet range as WL > −0.3 m.

Criterion      WL (m)               WLt (−)              WLt (−)
               (calibrated on WL)   (calibrated on WL)   (calibrated on WLt)
NSEcal          0.627                0.559                0.642
NSEcv           0.381                0.397                0.453
RMSEcv          0.269                0.299                0.284
RMSEcv,dry      0.284                0.263                0.259
RMSEcv,wet      0.222                0.382                0.355
Bias           −0.003                0.083                0.002
Bias,dry       −0.012                0.070                0.003
Bias,wet        0.021                0.120                0.000
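For readers who want to reproduce criteria of the kind listed in Table 3, the following sketch shows one way to compute them. It is unweighted for brevity (the study works with weighted data), the sign convention of the bias (observation minus prediction) is our assumption based on the residual definition in Fig. 10, and the sample values are made up.

```python
import math


def nse(obs, pred):
    """Nash-Sutcliffe efficiency (Nash and Sutcliffe, 1970): 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def rmse(obs, pred):
    """Root mean square error of the predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def bias(obs, pred):
    """Mean residual, assuming residual = observation - prediction."""
    return sum(o - p for o, p in zip(obs, pred)) / len(obs)


def split_by_range(obs_wl, obs, pred, threshold=-0.3):
    """Split (obs, pred) pairs into dry (WL < threshold) and wet (WL > threshold) subsets."""
    dry = [(o, p) for wl, o, p in zip(obs_wl, obs, pred) if wl < threshold]
    wet = [(o, p) for wl, o, p in zip(obs_wl, obs, pred) if wl > threshold]
    return dry, wet


# Made-up example: observed WL (m), observed WLt and cross-validated WLt predictions.
obs_wl = [-0.9, -0.6, -0.4, -0.1, 0.0]
obs_wlt = [-0.7, -0.5, -0.4, -0.1, 0.0]
pred_wlt = [-0.6, -0.5, -0.3, -0.2, -0.1]

dry, wet = split_by_range(obs_wl, obs_wlt, pred_wlt)
print("NSE", round(nse(obs_wlt, pred_wlt), 3))
print("RMSE", round(rmse(obs_wlt, pred_wlt), 3), "Bias", round(bias(obs_wlt, pred_wlt), 3))
print("RMSE dry", round(rmse(*zip(*dry)), 3), "RMSE wet", round(rmse(*zip(*wet)), 3))
```

Applying the same functions to cross-validated and to calibration predictions would give the "cv" and "cal" variants of each criterion.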

Superior model performance is also evident when evaluating model bias. Only when calibrating directly on WLt are the WLt predictions bias-free. Calibration on WL and subsequent transformation to WLt introduces a model bias towards systematically lower WLt values. In subsequent applications to GHG emission upscaling, lower WLt values would lead to an overestimation of CO2 emissions and to an underestimation of CH4 emissions.

3.4 Influence of predictor variables on WLt

Given the beneficial characteristics of the model calibrated on WLt for GHG upscaling, presentation and discussion of further model results is restricted to the WLt model.

The BRT method allows the analysis of the parameter contributions to and influences on the model (Elith et al., 2008), and thus may contribute to system understanding. The percentages of the contributions of the nine predictor variables to the final model ranged from 25.2 to 5.6 % (Fig. 7). Except protection status, at least one parameter of each of the seven parameter groups contributed to the final model. All protection status information was dropped early during the simplification process due to low contribution, although WL showed slightly higher values for data from nature protection or special areas of conservation. However, other parameters seem to be able to fully compensate the information that is lost by dropping this predictor.

Land cover class, lc, at the dip well was the parameter with strongest contribution (25.2 %). It basically follows the trend illustrated in Fig. 5b. The bootstrap error plotted as standard deviation (Fig. 7) shows the variation of this influence over the 1000 bootstrap models. A second land cover parameter, the fraction of dry land cover classes on organic soils in a buffer of 2500 m radius, fdry (2500), contributed to the model with 10.3 %. The monotonic decrease of WLt with increasing fdry (2500) is plausible, as higher values reflect intensive land use in the surroundings of the dip well and thus indicate intensive artificial drainage. Together, both parameters contributed 35.5 %, and thus land cover represents the parameter group with the strongest model contribution.

Peatland characteristics are the second most important parameter group. The peatland type contributed 16 %. The model indicates that peatlands without any connection to surface water bodies (river or lake) and the class of other organic soils are characterized by lower WLt compared to the peatland types lowland bog, upland bog and fen neighboring surface water. As the class of other organic soils is generally expected to reflect lower water levels, and as surface water may have a stabilization effect on water levels of organic soils, the influence of the peatland type can be considered plausible. Besides peatland type, the substrate of the peat base contributes 5.6 %. Here, organic soils overlying peat clay layers (e.g., limnic sediments such as calcareous gyttja) or basement rock are characterized by higher WLt compared to organic soils overlying unconsolidated rock. This can be explained by the lower drainage resistance of unconsolidated rocks. This may cause an increased efficiency of anthropogenic drainage and/or a general higher vulnerability to seepage losses. Finally, slightly lower WLt values are indicated by a high fraction of organic soils for the 500 m buffer, fpeat (500). This may reflect the higher land-use pressure on large peatlands compared to rather small peatlands, which tend to be more easily preserved by nature protection efforts.

The remaining four parameter groups are represented in the model by only one parameter each. The third most influential parameter was the length of ditches on arable land and grassland for the 250 m buffer, dilendry (250). At first glance, it may be surprising that with increasing ditch density WLt values tend to be higher, as ditches are supposed to drain the water when land is used as arable land and grassland. The fact that the model identifies a rather strong effect in the opposite direction may be caused by incomplete information about the drainage network. There is no detailed information about the spatial distribution of tile drains. Based on expert knowledge, agricultural areas with a lower ditch density are more likely to have tile drains. As these drains, easily installed with a narrow drain spacing, are more effective at draining organic soils, low WLt values for arable land and grassland may be related to low ditch densities. Furthermore, ditches were originally dug at narrow spacing in especially wet areas of organic soils, but there is no information available whether these ditches still function properly.

The parameters wbsummer, hrel and tiras25all show expected trends. The model predicts higher WLt for increasing climatic water balance in the summer period (May to October), wbsummer, for dip wells located in depressions (low values of hrel) and for higher small-scale topographic wetness indices calculated on the 25 × 25 digital elevation model (tiras25).


Figure 7. Partial dependence plots for the predictor variables. For an explanation of variables, see Table 1. The y axes are on WLt scale and are centered around the mean WLt. Error bars and grey area indicate standard deviation of the response of over 1000 bootstrap models. The relative contribution of each predictor is indicated as percentage. The lines along the x axes of each plot show distribution of data across that variable in deciles.

The fact that all parameters show expected or explainable responses in the model corroborates the reliability of the calibrated WLt model. The standard deviation of the predictor responses based on the bootstrap samples shows the stability of the observed responses.

Further insight into model behavior can be obtained by analyzing parameter interactions. This is obtained by changing two parameters simultaneously while keeping mean values for all other parameters (Elith et al., 2008). Figure 8 shows the two strongest parameter interactions. Parameter wbsummer strongly interacts with ptype. The generally lower values of WLt of fens without surface water connection and other organic soils show a stronger dependency on the summer climatic water balance. While a summer climatic water balance of > −80 mm shows a rather weak effect on WLt for the wetter peatland types, for the two drier peatland types, in contrast, there is still a strong effect with increasing wbsummer. The trend for wbsummer > 130 mm for the dry peatland types is supported by seven different peatlands.
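The interaction analysis described above (varying two predictors jointly while all others are held at fixed reference values) can be sketched as follows. The model here is a hypothetical stand-in with a built-in interaction, not the fitted BRT model; the variable names `ptype`, `wb_summer` and `h_rel` only echo the predictors discussed in the text, and the reference value chosen for the categorical predictor is an arbitrary assumption, since a category has no mean.

```python
def toy_model(row):
    """Hypothetical stand-in for a fitted model with a ptype x wb_summer interaction."""
    slope = 0.004 if row["ptype"] == "dry type" else 0.001
    return -0.5 + slope * row["wb_summer"] - 0.2 * row["h_rel"]


def interaction_grid(model, reference, name_a, values_a, name_b, values_b):
    """Evaluate the model on all (a, b) combinations, other predictors at reference values."""
    grid = {}
    for a in values_a:
        for b in values_b:
            row = dict(reference)  # copy so the reference values stay untouched
            row[name_a] = a
            row[name_b] = b
            grid[(a, b)] = model(row)
    return grid


# Reference values for the predictors that are not varied (illustrative numbers).
reference = {"ptype": "wet type", "wb_summer": 0.0, "h_rel": 0.5}

grid = interaction_grid(
    toy_model, reference,
    "ptype", ["wet type", "dry type"],
    "wb_summer", [-100.0, 0.0, 100.0],
)

# The response to wb_summer differs between the two peatland types: an interaction.
wet_effect = grid[("wet type", 100.0)] - grid[("wet type", -100.0)]
dry_effect = grid[("dry type", 100.0)] - grid[("dry type", -100.0)]
print(round(wet_effect, 3), round(dry_effect, 3))
```

Plotting the grid values against one predictor, with one line per level of the other, gives a display comparable in spirit to Fig. 8.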


Figure 8. Partial dependence plots representing the two strongest interactions in the model (a) between ptype and wbsummer and (b) between pbase and fdry. Fitted WLt is plotted on the y axis, which is obtained after accounting for the average effect of all other predictor variables.

Another strong interaction is observed for pbase and fdry (2500). While a rather weak effect of the fraction of arable land and grassland is observed for organic soils overlying basement rock and peat clay layer, a strong effect is observed for organic soils overlying unconsolidated rock. This interaction reflects the higher lateral range of drainage effects for organic soils with little flow resistance at the peat base. In these organic soils, intensive land use lowers the water level over large areas.

3.5 Discussion of model uncertainty

Plotting observed vs. predicted WLt from cross-validation (Fig. 9) illustrates the rather large residual variance that cannot be explained by the model. As indicated by the higher RMSEcv for the wet range (Table 3), scatter increases with increasing WLt. Error bars in the y direction indicate data error derived from the nugget of the variogram. It is shown for a few data points as an example. Due to transformation, data error increases for higher WLt. Figure 9 demonstrates that the fraction of unexplainable variance related to data error is much higher for the wet than for the dry range. Bootstrap error, indicating the variation of the model predictions for 1000 bootstrap samples, is shown in the x direction for the same data points. Bootstrap error is lower than the data error for the wet range and slightly higher for the dry range.

Figure 9. Observed vs. predicted transformed annual mean water level (WLt) from cross-validation results. Error bars show selected data and bootstrap model errors as standard deviation. Data points are scaled by their weights.

Bootstrap errors demonstrate the sensitivity of model predictions to changes of the data set used for calibration. When a model possesses structural deficits, such as missing predictor variables, bootstrap errors should not be used to define confidence intervals for the model predictions. Figure 10 shows residuals from cross-validation and standard deviation of bootstrap predictions for all land cover classes. The residuals of each land cover class show near-normal distributions. For five of the nine land cover classes (wet forest, wet unused peatland, arable land, coniferous forest and reed), the Shapiro–Wilk test of normality is positive (p > 0.05). Figure 10a further indicates that the residuals of each land cover class scatter fairly well around zero, indicating low bias for the various land cover classes. Land-cover-class-specific confidence intervals of model predictions can thus be derived from the RMSEcv of each land cover class, e.g., 2 × RMSEcv representing the 95 % confidence interval.
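As a minimal sketch of how such class-specific intervals could be computed, the following Python snippet groups cross-validation residuals by land cover class and reports 2 × RMSEcv as an approximate 95 % interval half-width. The function name, the record layout and all numbers are illustrative assumptions and are not taken from the study; the weighting of dip wells is ignored here.

```python
from collections import defaultdict
from math import sqrt


def class_intervals(records):
    """Per-class RMSE of cross-validation residuals and 2 x RMSE half-width.

    records: iterable of (land_cover_class, observed, predicted) tuples.
    """
    squared = defaultdict(list)
    for land_cover, observed, predicted in records:
        squared[land_cover].append((observed - predicted) ** 2)

    result = {}
    for land_cover, values in squared.items():
        rmse_cv = sqrt(sum(values) / len(values))
        # 2 x RMSEcv approximates the 95 % interval if residuals are
        # near-normal and centered on zero (low bias)
        result[land_cover] = {"rmse_cv": rmse_cv, "half_width_95": 2.0 * rmse_cv}
    return result


# Hypothetical cross-validation results on the WLt scale (not study data)
cv_records = [
    ("wet grassland", 0.10, -0.10),
    ("wet grassland", -0.20, 0.00),
    ("arable land", -0.90, -0.60),
    ("arable land", -0.50, -0.80),
]
intervals = class_intervals(cv_records)
```

A prediction p for a grid cell of a given class would then be reported as p ± half_width_95 of that class.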

The prediction uncertainty derived from cross-validation is much higher than the bootstrap prediction uncertainty obtained from the bootstrap standard deviation (sd), with 2 × sd corresponding to the 95 % confidence interval (Fig. 10). The large difference between these values indicates that the model has structural deficits that can be attributed to several error sources.

Figure 10. (a) Residuals (observation − prediction) of WLt predictions and (b) standard deviation (sd) of bootstrap predictions shown for the nine land cover classes. In the upper part, the number of dip wells in each class is indicated.

i. Key influences on WLt are missing in the set of predictor variables. None of the predictor variables indicate whether and to which extent water level increase due to re-wetting measures took place in the last years. Wetness indicators (wet soil and/or vegetation attributes) that are obtained from the Digital Landscape Model probably react with a delay of several years. Thus, we expect the occurrence of several observed high WLt values that cannot be explained by any of the predictor variables.

ii. Small-scale topography that is not represented with sufficient detail and accuracy in the DEM may cause several predictions to strongly differ from what would be expected from the other predictor variables. A common example may be a dip well that is located on a narrow peat ridge, which remained after peat-cutting and is absent in the DEM, and that is situated in an area classified as wet soil by the Digital Landscape Model. Then the model indicates a WLt that is much higher than the observed WLt, as for the observed value the reference surface was the surface of the peat ridge.

iii. Consistent information about tile drains is missing and only exists on the regional scale (Tetzlaff et al., 2009). At the national scale, however, there are no maps on tile drains. Tile drains are known to have a strong effect on WLt for arable land and grassland. As explained above, we expect parameter dilendry (250) to partially compensate for this missing information.

iv. Another source of prediction uncertainty may comprise inconsistent and erroneous land cover classification of the Digital Landscape Model due to the high degree of subjectivity for many of the attributes. Furthermore, the temporal accuracy of the Digital Landscape Model may be as inaccurate as 5 years, which can cause time series with land-use change to be split at the wrong date and vegetation and wetness attributes to be not yet updated to the current conditions.

v. The water balance of fens strongly depends on the size and the hydraulic head of the groundwater catchment, i.e., of the aquifer underlying the peat layer. Unfortunately, there is no consistent map of hydraulic heads or groundwater catchments for all Germany.

We checked model predictions for geographical bias. Geographical location was not one of the model parameters. However, the history and policy of land use on organic soils, current ditch water management and climate do show large-scale geographical trends. We divided our data set into the three major German peatland regions (NE, NW and S) and evaluated the model residuals (Fig. 11) to see whether our model is biased due to important missing geographical effects. A serious bias for any of the three major German peatland regions cannot be identified.

Figure 11. Residuals (observation − prediction) of WLt predictions for the three major geographical peatland regions of Germany. In the upper part, the number of dip wells in each class is indicated.


Figure 12. Map of predictions of transformed annual mean water level (WLt) for all German organic soils (a) and an enlarged map section (b). Probability distribution in (c) indicates the uncertainty of a specific point prediction for wet grassland as an example. Here, the predicted value is approximately WLt = 0, but note that wet grassland predictions do vary in space depending on the values of the other model parameters. The histogram shows the residuals from cross-validation for wet grassland, to which the probability distribution was fitted.

When applying calibrated statistical models during regionalization, it is important to check model behavior for extrapolation outside the range of the parameter space that is covered by the data upon which the model was built. BRT always extrapolates at a constant value from the most extreme environmental value in the training data. In contrast to other types of statistical models, e.g., generalized linear models, BRT does not continue the fitted trend beyond the last observation. Regarding the categorical variables, the data set covers all classes occurring in Germany with several peatlands. The data set also covers the major range of values occurring in Germany for the numerical predictor variables. Furthermore, Fig. 7 indicates that the constant values at which the model extrapolates the influence of the variables do not raise major concern for any extreme predictions outside the parameter range.
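The constant extrapolation of tree-based models can be illustrated with a small, self-contained sketch. The code below is not the BRT implementation used in this study; it is a deliberately simple boosting of one-split regression trees ("stumps") in plain Python, fitted to a hypothetical linear trend. All names, the training data and the tuning values are assumptions made for illustration only.

```python
def fit_stump(x, residuals):
    """Find the single split on x that minimizes the squared error."""
    best = None
    xs = sorted(set(x))
    for lower, upper in zip(xs, xs[1:]):
        threshold = (lower + upper) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        mean_left = sum(left) / len(left)
        mean_right = sum(right) / len(right)
        sse = (sum((r - mean_left) ** 2 for r in left)
               + sum((r - mean_right) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, mean_left, mean_right)
    return best[1:]


def fit_boosted(x, y, n_trees=200, learning_rate=0.1):
    """Boost stumps on the residuals of the current prediction."""
    base = sum(y) / len(y)
    current = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - ci for yi, ci in zip(y, current)]
        threshold, mean_left, mean_right = fit_stump(x, residuals)
        stumps.append((threshold, mean_left, mean_right))
        current = [ci + learning_rate * (mean_left if xi <= threshold else mean_right)
                   for xi, ci in zip(x, current)]
    return base, stumps, learning_rate


def predict(model, x_query):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(
        mean_left if x_query <= threshold else mean_right
        for threshold, mean_left, mean_right in stumps)


# Hypothetical training data: a linear trend observed for x between 0 and 10
x_train = [float(i) for i in range(11)]
y_train = [0.05 * xi - 0.5 for xi in x_train]
model = fit_boosted(x_train, y_train)

inside = predict(model, 10.0)     # most extreme value in the training data
outside = predict(model, 50.0)    # far outside the training range
linear_trend = 0.05 * 50.0 - 0.5  # what a continued linear trend would give
```

Because every split threshold lies within the training range, all queries beyond the largest training value fall into the same leaves, so `outside` equals `inside`, whereas the continued linear trend would reach 2.0.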

3.6 Regionalization

The map of WLt resulting from the application of the fitted WLt model to all grid cells shows gradients at the regional scale (Fig. 12a). In the south of Germany, for example, a gradient from wet to dry can be observed for the pre-alpine upland bogs and the peatlands of the moraine plain. In the north of Germany, the map indicates that organic soils in the very NE are wetter than the rest. For the rest of the north, a slight gradient can be observed from less dry to dry from NW to E, which is mainly driven by the higher summer climatic water balance in the NW. As both categorical and numerical predictor variables do also vary at the sub-regional scale, the resulting map also shows gradients within peatland areas, e.g., due to small-scale land-use, ditch density gradients and topography effects (Fig. 12b).

We calculated WLt averages of the land cover classes using the regionalized WLt from the map (Table 2, column 3). The given standard deviation comprises both the variability within a land cover class that is explained by the model as well as the uncertainty of each prediction. Resulting means and standard deviations slightly differ from the corresponding values of the data set. The land-cover-specific WLt values obtained from the map can be considered as being more representative, as the regionalization procedure is supposed to partly account for potential bias in the data set.

When applying this map and its predicted WLt values in subsequent GHG upscaling, it is crucial that model uncertainty is propagated properly. An example demonstrates the necessity of uncertainty propagation. For a grid cell classified as wet grassland, the probability distribution of WLt is shown based on a normal distribution that was fitted to the residuals of this land cover class (Fig. 12c). Without propagating the uncertainty, and when only translating the predicted WLt (possibly in combination with other parameters, e.g., soil properties) into a GHG budget, the GHG budget is strongly underestimated, as the WLt prediction is close to zero, indicating neither large CO2 nor CH4 emissions. When translating the full distribution of WLt into a GHG budget, the resulting GHG budget would be much higher, as at both sides of the predicted WLt the GHG budget increases.
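The effect described above can be sketched numerically. The snippet below assumes a purely hypothetical V-shaped transfer function, in which the GHG budget is the absolute value of WLt, and a hypothetical prediction standard deviation of 0.3; neither is the transfer function nor an uncertainty value of this study. It compares the budget of the point prediction with the mean budget obtained by Monte Carlo sampling of a normal distribution around that prediction.

```python
import random
from statistics import mean


def ghg_budget(wlt):
    # Hypothetical transfer function: the budget rises on both sides of WLt = 0
    # (assumption for illustration; arbitrary units)
    return abs(wlt)


def propagated_budget(predicted_wlt, sd, n_samples=20000, seed=1):
    """Mean GHG budget over a normal distribution around the prediction."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return mean(ghg_budget(rng.gauss(predicted_wlt, sd)) for _ in range(n_samples))


point_budget = ghg_budget(0.0)                # uncertainty ignored
expected_budget = propagated_budget(0.0, 0.3) # uncertainty propagated
```

With these assumed values, the point estimate is zero, while the propagated budget approaches 0.3 × sqrt(2/π) ≈ 0.24, the mean absolute value of a normal distribution with zero mean and a standard deviation of 0.3.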

3.7 Possible paths for model improvement

The model performance that is achieved by the statistical approach presented in our study raises the question whether collecting more WL data can improve model performance, or whether the factor that is constraining the model performance is the limited strength of the nation-wide available predictor variables. To assess this question, additional “hold-out models” were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated and model performance was evaluated with NSEcv. As expected, results indicate an increase of model performance with increasing number of peatlands used in the model building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when including many more peatlands into the data set. More promising would be the specific collection of more data on the weakly represented and/or important land cover classes arable land and grassland.

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nation-wide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that has been successfully conducted in Finke et al. (2004) using the transfer noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, such as spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± sd.

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data covers only a small fraction of peatlands of the final map and thus spatial interpolation is not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. With the analysis of their effect on the water level, it was possible to gain insight into natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions depend linearly. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions and avoided model bias.

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show, however, that model improvement by increasing the number of WLt data seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively few data are available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, as for example water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach is suited to application to any other country where similar data of target and predictor variables are available. It is important that the spatial resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if detailed information on peat properties becomes available for specific regions and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), Landkreis Gifhorn, NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support; Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support; Annette Freibauer for reviewing the paper draft; and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project “Organic soils” funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43, 5, 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirby, M.: A physically based variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De'ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz – Moornutzungsstrategien 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.


Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package “dismo”, version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada's forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8, 5, 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models, part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package “gbm”, version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.

Tetzlaff, B., Kuhr, P., and Wendland, F.: A New Method for Creating Maps of Artificially Drained Areas in Large River Basins Based on Aerial Photographs and Geodata, Irrig. Drain., 58, 569–585, doi:10.1002/Ird.426, 2009.

Thompson, J. R., Gavin, H., Refsgaard, A., Sorenson, H. R., and Gowing, D. J.: Modelling the hydrological impacts of climate change on UK lowland wet grassland, Wetl. Ecol. Manage., 17, 503–523, doi:10.1007/s11273-008-9127-1, 2009.

UBA: National Inventory Report for the German Greenhouse Gas Inventory 1990–2008, Submission under the United Nations Framework Convention on Climate Change and the Kyoto Protocol 2012, Dessau, Germany, 2012.

van den Akker, J. J. H., Jansen, P. C., Hendriks, R. F. A., Hoving, I., and Pleijter, M.: Submerged infiltration to halve subsidence and GHG emissions of agricultural peat soils, Proceedings of the 14th International Peat Congress, Stockholm, Sweden, 2012.

van der Gaast, J. W. J., Massop, H. T. L., and Vroon, H. R. J.: Actuele grondwaterstandsituatie in natuurgebieden: Een Pilotstudie, Wettelijke Onderzoekstaken Natuur & Milieu, WOt-rapport 94, Wageningen, 134 pp., 2009.


van der Ploeg, M. J., Appels, W. M., Cirkel, D. G., Oosterwoud, M. R., Witte, J. P. M., and van der Zee, S. E. A. T. M.: Microtopography as a Driving Mechanism for Ecohydrological Processes in Shallow Groundwater Systems, Vadose Zone J., 11, 52–62, doi:10.2136/Vzj2011.0098, 2012.

Wackernagel, H.: Multivariate Geostatistics, Springer, Berlin, Germany, 387 pp., 2003.


Table 3. Performance criteria of the different models; dry range defined as WL < −0.3 m and wet range as WL > −0.3 m.

               WL (m)          WLt (−)         WLt (−)
               (calibrated     (calibrated     (calibrated
               on WL)          on WL)          on WLt)

NSEcal          0.627           0.559           0.642
NSEcv           0.381           0.397           0.453
RMSEcv          0.269           0.299           0.284
RMSEcv,dry      0.284           0.263           0.259
RMSEcv,wet      0.222           0.382           0.355
Bias           −0.003           0.083           0.002
Biasdry        −0.012           0.070           0.003
Biaswet         0.021           0.120           0.000

dry (WL < −0.3 m) and wet range (WL > −0.3 m) show slightly lower values for the WLt model compared to WLt values from the model calibrated on WL. Given our hypothetical transfer function (Fig. 3), in which the GHG budget is linearly related to WLt, the higher accuracy of WLt predictions directly corresponds to a higher accuracy of GHG budget predictions.

Superior model performance is also evident when evaluating model bias. Only when calibrating directly on WLt are the WLt predictions bias-free. Calibration on WL and subsequent transformation to WLt introduces a model bias towards systematically lower WLt values. In subsequent applications to GHG emission upscaling, lower WLt values would lead to an overestimation of CO2 emissions and to an underestimation of CH4 emissions.
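For readers who want to reproduce criteria of the kind listed in Table 3 on their own data, a minimal Python sketch follows. It uses the Nash–Sutcliffe efficiency (NSE), the root mean square error (RMSE) and the bias, split into a dry and a wet range at WL = −0.3 m. Two points are assumptions of this sketch and not confirmed by the text: bias is taken as the mean of observation minus prediction (the residual definition given for Fig. 10), and all dip wells are weighted equally. The data values are hypothetical.

```python
from math import sqrt


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations (unscaled)."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / total


def rmse(observed, predicted):
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


def bias(observed, predicted):
    # Assumed sign convention: observation minus prediction
    return sum(o - p for o, p in zip(observed, predicted)) / len(observed)


def split_by_range(water_level, observed, predicted, threshold=-0.3):
    """Return (observed, predicted) lists for the dry and the wet range."""
    dry = [(o, p) for w, o, p in zip(water_level, observed, predicted) if w < threshold]
    wet = [(o, p) for w, o, p in zip(water_level, observed, predicted) if w > threshold]
    return tuple(zip(*dry)), tuple(zip(*wet))


# Hypothetical cross-validation results (not study data); WL in m
obs = [-0.8, -0.5, -0.2, 0.0]
pred = [-0.7, -0.6, -0.2, -0.1]
(dry_obs, dry_pred), (wet_obs, wet_pred) = split_by_range(obs, obs, pred)

nse_cv = nse(obs, pred)
rmse_dry = rmse(dry_obs, dry_pred)
bias_wet = bias(wet_obs, wet_pred)
```

In this toy example the water level itself serves both as the range criterion and as the evaluated variable; for a WLt evaluation, the transformed observations and predictions would be passed instead while the split would still use WL.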

3.4 Influence of predictor variables on WLt

Given the beneficial characteristics of the model calibrated on WLt for GHG upscaling, presentation and discussion of further model results is restricted to the WLt model.

The BRT method allows the analysis of the parameter contributions to and influences on the model (Elith et al., 2008), and thus may contribute to system understanding. The percentages of the contributions of the nine predictor variables to the final model ranged from 25.2 to 5.6 % (Fig. 7). Except protection status, at least one parameter of each of the seven parameter groups contributed to the final model. All protection status information was dropped early during the simplification process due to low contribution, although WL showed slightly higher values for data from nature protection or special areas of conservation. However, other parameters seem to be able to fully compensate the information that is lost by dropping this predictor.

Land cover class lc at the dip well was the parameter with the strongest contribution (25.2 %). It basically follows the trend illustrated in Fig. 5b. The bootstrap error, plotted as standard deviation (Fig. 7), shows the variation of this influence over the 1000 bootstrap models. A second land cover parameter, the fraction of dry land cover classes on organic soils in a buffer of 2500 m radius, fdry (2500), contributed to the model with 10.3 %. The monotonic decrease of WLt with increasing fdry (2500) is plausible, as higher values reflect intensive land use in the surroundings of the dip well and thus indicate intensive artificial drainage. Together, both parameters contributed 35.5 %, and thus land cover represents the parameter group with the strongest model contribution.

Peatland characteristics are the second most important parameter group. The peatland type contributed 16 %. The model indicates that peatlands without any connection to surface water bodies (river or lake) and the class of other organic soils are characterized by lower WLt compared to the peatland types lowland bog, upland bog and fen neighboring surface water. As the class of other organic soils is generally expected to reflect lower water levels, and as surface water may have a stabilization effect on water levels of organic soils, the influence of the peatland type can be considered plausible. Besides peatland type, the substrate of the peat base contributes 5.6 %. Here, organic soils overlying peat clay layers (e.g., limnic sediments such as calcareous gyttja) or basement rock are characterized by higher WLt compared to organic soils overlying unconsolidated rock. This can be explained by the lower drainage resistance of unconsolidated rocks. This may cause an increased efficiency of anthropogenic drainage and/or a general higher vulnerability to seepage losses. Finally, slightly lower WLt values are indicated by a high fraction of organic soils for the 500 m buffer, fpeat (500). This may reflect the higher land-use pressure on large peatlands compared to rather small peatlands, which presumably are more easily preserved by nature protection efforts.

The remaining four parameter groups are represented in the model by only one parameter each. The third most influential parameter was the length of ditches on arable land and grassland for the 250 m buffer, dilendry (250). At first glance, it may be surprising that with increasing ditch density WLt values tend to be higher, as ditches are supposed to drain the water when land is used as arable land and grassland. The fact that the model identifies a rather strong effect in the opposite direction may be caused by incomplete information about the drainage network. There is no detailed information about the spatial distribution of tile drains. Based on expert knowledge, agricultural areas with a lower ditch density are more likely to have tile drains. As these drains, easily installed with a narrow drain spacing, are more effective at draining organic soils, low WLt values for arable land and grassland may be related to low ditch densities. Furthermore, ditches were originally dug at narrow spacing in especially wet areas of organic soils, but there is no information available whether these ditches still function properly.

The parameters wbsummer, hrel and tiras25all show expected trends. The model predicts higher WLt for increasing climatic water balance in the summer period (May to October), wbsummer, for dip wells located in depressions (low values of hrel) and for higher small-scale topographic wetness indices calculated on the 25 × 25 digital elevation model (tiras25).


Figure 7. Partial dependence plots for the predictor variables. For an explanation of variables see Table 1. The y axes are on WLt scale and are centered around the mean WLt. Error bars and grey area indicate standard deviation of the response of over 1000 bootstrap models. The relative contribution of each predictor is indicated as percentage. The lines along the x axes of each plot show distribution of data across that variable in deciles.

The fact that all parameters show expected or explainable responses in the model corroborates the reliability of the calibrated WLt model. The standard deviation of the predictor responses based on the bootstrap samples shows the stability of the observed responses.

Further insight into model behavior can be obtained by analyzing parameter interactions. This is done by changing two parameters simultaneously while keeping all other parameters at their mean values (Elith et al., 2008). Figure 8 shows the two strongest parameter interactions. Parameter wbsummer strongly interacts with ptype. The generally lower WLt values of fens without surface water connection and of other organic soils show a stronger dependency on the summer climatic water balance. While a summer climatic water balance of > −80 mm has a rather weak effect on WLt for the wetter peatland types, for the two drier peatland types there is, in contrast, still a strong effect with increasing wbsummer. The trend for wbsummer > 130 mm for the dry peatland types is supported by seven different peatlands.
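
The interaction analysis described above (two predictors varied together, all others held at their mean) can be sketched in a few lines. This is only an illustration under stated assumptions: `interaction_surface`, `toy_model` and the predictor names `u`, `v`, `w` are hypothetical, the toy model is not the fitted BRT model, and only numerical predictors are handled (a categorical predictor such as ptype cannot be averaged and would need a fixed reference class instead).

```python
from statistics import mean


def interaction_surface(predict, rows, var_a, var_b, grid_a, grid_b):
    """Predict over a grid of two predictors with all others held at their mean.

    predict: callable taking a dict {name: value} and returning a number
    rows:    list of dicts with numeric predictor values (calibration data)
    """
    # Baseline: every predictor fixed at its mean over the data set.
    baseline = {name: mean(r[name] for r in rows) for name in rows[0]}
    surface = []
    for a in grid_a:
        line = []
        for b in grid_b:
            point = dict(baseline)
            point[var_a] = a  # vary the first predictor
            point[var_b] = b  # vary the second predictor
            line.append(predict(point))
        surface.append(line)
    return surface


# Hypothetical toy model with an interaction term (NOT the fitted BRT model).
def toy_model(x):
    return 0.5 * x["u"] + x["u"] * x["v"] + 0.1 * x["w"]


data = [{"u": 0.0, "v": 0.0, "w": 1.0}, {"u": 1.0, "v": 2.0, "w": 3.0}]
surf = interaction_surface(toy_model, data, "u", "v", [0.0, 1.0], [0.0, 2.0])

# Effect of u at low v and at high v; unequal values indicate an interaction.
effect_low_v = surf[1][0] - surf[0][0]
effect_high_v = surf[1][1] - surf[0][1]
```

In this toy case the effect of `u` differs between the two levels of `v`, which is the kind of pattern that shows up as non-parallel curves in an interaction plot.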

Figure 8. Partial dependence plots representing the two strongest interactions in the model, (a) between ptype and wbsummer and (b) between pbase and fdry. Fitted WLt is plotted on the y axis, which is obtained after accounting for the average effect of all other predictor variables.

Another strong interaction is observed for pbase and fdry (2500). While a rather weak effect of the fraction of arable land and grassland is observed for organic soils overlying basement rock and peat clay layer, a strong effect is observed for organic soils overlying unconsolidated rock. This interaction reflects the higher lateral range of drainage effects for organic soils with little flow resistance at the peat base. In these organic soils, intensive land use lowers the water level over large areas.

3.5 Discussion of model uncertainty

Plotting observed vs. predicted WLt from cross-validation (Fig. 9) illustrates the rather large residual variance that cannot be explained by the model. As indicated by the higher RMSEcv for the wet range (Table 3), scatter increases with increasing WLt. Error bars in the y direction indicate data error derived from the nugget of the variogram. It is shown for a few data points as an example. Due to transformation, data error increases for higher WLt. Figure 9 demonstrates that the fraction of unexplainable variance related to data error is much higher for the wet than for the dry range. Bootstrap error, indicating the variation of the model predictions for 1000 bootstrap samples, is shown in the x direction for the same data points. Bootstrap error is lower than the data error for the wet range and slightly higher for the dry range.

Figure 9. Observed vs. predicted transformed annual mean water level (WLt) from cross-validation results. Error bars show selected data and bootstrap model errors as standard deviation. Data points are scaled by their weights.

Bootstrap errors demonstrate the sensitivity of model predictions to changes of the data set used for calibration. When a model possesses structural deficits, such as missing predictor variables, bootstrap errors should not be used to define confidence intervals for the model predictions. Figure 10 shows residuals from cross-validation and standard deviation of bootstrap predictions for all land cover classes. The residuals of each land cover class show near-normal distributions. For five of the nine land cover classes (wet forest, wet unused peatland, arable land, coniferous forest and reed), the Shapiro–Wilk test does not reject normality (p > 0.05). Figure 10a further indicates that the residuals of each land cover class scatter fairly well around zero, indicating low bias for the various land cover classes. Land-cover-class-specific confidence intervals of model predictions can thus be derived from the RMSEcv of each land cover class, e.g., 2 × RMSEcv representing the 95 % confidence interval.
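
As a minimal sketch of how such class-specific intervals could be computed from cross-validation output, the following groups squared residuals by land cover class and applies the 2 × RMSEcv rule. The function names and all numbers are invented for illustration, and the rule assumes unbiased, near-normal residuals within each class.

```python
from collections import defaultdict
from math import sqrt


def rmse_by_class(records):
    """records: iterable of (land_cover_class, observed, predicted) from cross-validation."""
    squared = defaultdict(list)
    for cls, obs, pred in records:
        squared[cls].append((obs - pred) ** 2)
    return {cls: sqrt(sum(v) / len(v)) for cls, v in squared.items()}


def interval_95(prediction, rmse_cv):
    """Approximate 95 % interval as prediction +/- 2 x RMSEcv."""
    return prediction - 2.0 * rmse_cv, prediction + 2.0 * rmse_cv


# Invented numbers for illustration only (not values from the data set).
cv = [
    ("wet grassland", 0.1, -0.2),
    ("wet grassland", -0.3, 0.1),
    ("arable land", -0.8, -0.5),
    ("arable land", -0.6, -0.9),
]
rmse = rmse_by_class(cv)
low, high = interval_95(0.0, rmse["wet grassland"])
```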

The prediction uncertainty derived from cross-validation is much higher than the bootstrap prediction uncertainty obtained from the bootstrap standard deviation (sd), with 2 × sd corresponding to the 95 % confidence interval (Fig. 10). The large difference between these values indicates that the model has structural deficits that can be attributed to several error sources:

i. Key influences on WLt are missing in the set of predictor variables. None of the predictor variables indicates whether and to what extent water levels increased due to re-wetting measures in the last years. Wetness indicators (wet soil and/or vegetation attributes) that are obtained from the Digital Landscape Model probably react with a delay of several years. Thus, we expect the occurrence of several observed high WLt values that cannot be explained by any of the predictor variables.

Figure 10. (a) Residuals (observation–prediction) of WLt predictions and (b) standard deviation (sd) of bootstrap predictions, shown for the nine land cover classes. In the upper part, the number of dip wells in each class is indicated.

ii. Small-scale topography that is not represented with sufficient detail and accuracy in the DEM may cause several predictions to differ strongly from what would be expected from the other predictor variables. A common example may be a dip well that is located on a narrow peat ridge, which remained after peat-cutting and is absent in the DEM, and that is situated in an area classified as wet soil by the Digital Landscape Model. The model then indicates a WLt that is much higher than the observed WLt because, for the observed value, the reference surface was the surface of the peat ridge.

iii. Consistent information about tile drains is missing and only exists on the regional scale (Tetzlaff et al., 2009). At the national scale, however, there are no maps of tile drains. Tile drains are known to have a strong effect on WLt for arable land and grassland. As explained above, we expect parameter dilendry (250) to partially compensate for this missing information.

iv. Another source of prediction uncertainty may be inconsistent and erroneous land cover classification in the Digital Landscape Model due to the high degree of subjectivity for many of the attributes. Furthermore, the temporal accuracy of the Digital Landscape Model may be as poor as 5 years, which can cause time series with land-use change to be split at the wrong date and vegetation and wetness attributes not yet to be updated to the current conditions.

Figure 11. Residuals (observation–prediction) of WLt predictions for the three major geographical peatland regions of Germany. In the upper part, the number of dip wells in each class is indicated.

v. The water balance of fens strongly depends on the size and the hydraulic head of the groundwater catchment, i.e., of the aquifer underlying the peat layer. Unfortunately, there is no consistent map of hydraulic heads or groundwater catchments for all of Germany.

We checked model predictions for geographical bias. Geographical location was not one of the model parameters. However, the history and policy of land use on organic soils, current ditch water management and climate do show large-scale geographical trends. We divided our data set into the three major German peatland regions (NE, NW and S) and evaluated the model residuals (Fig. 11) to see whether our model is biased due to important missing geographical effects. A serious bias for any of the three major German peatland regions cannot be identified.
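
A regional bias check of this kind can be sketched as a comparison of the mean residual per region with its standard error. The function name and the residual values below are invented for illustration; note also that the simple standard error treats dip wells as independent, which is optimistic when several wells share one peatland.

```python
from math import sqrt
from statistics import mean, stdev


def regional_bias(residuals_by_region):
    """Return {region: (mean residual, standard error of the mean)}."""
    summary = {}
    for region, res in residuals_by_region.items():
        se = stdev(res) / sqrt(len(res))  # assumes independent residuals
        summary[region] = (mean(res), se)
    return summary


# Invented residuals (observation - prediction) for illustration only.
residuals = {
    "NE": [0.3, -0.1, -0.2],
    "NW": [0.1, 0.0, -0.1],
    "S": [0.5, 0.4, 0.6],
}
bias = regional_bias(residuals)
```

A mean residual that is large relative to its standard error (as for the invented region "S" here) would hint at a geographical effect that the predictors do not capture.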

Figure 12. Map of predictions of transformed annual mean water level (WLt) for all German organic soils (a) and an enlarged map section (b). The probability distribution in (c) indicates the uncertainty of a specific point prediction for wet grassland as an example. Here, the predicted value is approximately WLt = 0, but note that wet grassland predictions do vary in space depending on the values of the other model parameters. The histogram shows the residuals from cross-validation for wet grassland, to which the probability distribution was fitted.

When applying calibrated statistical models during regionalization, it is important to check model behavior for extrapolation outside the range of the parameter space that is covered by the data upon which the model was built. BRT always extrapolates at a constant value from the most extreme environmental value in the training data. In contrast to other types of statistical models, e.g., generalized linear models, BRT does not continue the fitted trend beyond the last observation. Regarding the categorical variables, the data set covers all classes occurring in Germany with several peatlands. The data set also covers the major range of values occurring in Germany for the numerical predictor variables. Furthermore, Fig. 7 indicates that the constant values at which the model extrapolates the influence of the variables do not raise major concern for any extreme predictions outside the parameter range.
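
The difference in extrapolation behavior can be reproduced with a deliberately simplified stand-in: one-dimensional gradient boosting with single-split trees (stumps) under squared-error loss, compared with an ordinary least-squares line. This is not the gbm/dismo implementation used in the study; the function names, the toy data and the settings (`n_trees`, `learning_rate`) are assumptions chosen for illustration.

```python
def fit_stump(xs, residuals):
    """Best single split on x for squared error: (threshold, left_mean, right_mean)."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        thr = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    return best[1:]


def fit_boosted_stumps(xs, ys, n_trees=200, learning_rate=0.1):
    """Simplified boosting: each stump is fitted to the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    pred = [base] * len(xs)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, pred)]
        thr, lm, rm = fit_stump(xs, residuals)
        stumps.append((thr, lm, rm))
        pred = [p + learning_rate * (lm if x <= thr else rm) for x, p in zip(xs, pred)]

    def predict(x):
        return base + learning_rate * sum(lm if x <= thr else rm for thr, lm, rm in stumps)

    return predict


def fit_line(xs, ys):
    """Ordinary least-squares line, which continues its trend when extrapolating."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return lambda x: my + slope * (x - mx)


# Toy training data on x = 0..10 with a linear trend.
xs = [float(i) for i in range(11)]
ys = [2.0 * x + 1.0 for x in xs]
tree_predict = fit_boosted_stumps(xs, ys)
line_predict = fit_line(xs, ys)
```

Because every split threshold lies inside the training range, `tree_predict(50.0)` returns the same value as `tree_predict(10.0)`, whereas `line_predict(50.0)` continues the fitted trend to 101.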

3.6 Regionalization

The map of WLt resulting from the application of the fitted WLt model to all grid cells shows gradients at the regional scale (Fig. 12a). In the south of Germany, for example, a gradient from wet to dry can be observed for the pre-alpine upland bogs and the peatlands of the moraine plain. In the north of Germany, the map indicates that organic soils in the very NE are wetter than the rest. For the rest of the north, a slight gradient can be observed from less dry to dry, from NW to E, which is mainly driven by the higher summer climatic water balance in the NW. As both categorical and numerical predictor variables do also vary at the sub-regional scale, the resulting map also shows gradients within peatland areas, e.g., due to small-scale land-use, ditch density gradients and topography effects (Fig. 12b).

We calculated WLt averages of the land cover classes using the regionalized WLt from the map (Table 2, column 3). The given standard deviation comprises both the variability within a land cover class that is explained by the model and the uncertainty of each prediction. Resulting means and standard deviations differ slightly from the corresponding values of the data set. The land-cover-specific WLt values obtained from the map can be considered more representative, as the regionalization procedure is supposed to partly account for potential bias in the data set.

When applying this map and its predicted WLt values in subsequent GHG upscaling, it is crucial that model uncertainty is propagated properly. An example demonstrates the necessity of uncertainty propagation. For a grid cell classified as wet grassland, the probability distribution of WLt is shown based on a normal distribution that was fitted to the residuals of this land cover class (Fig. 12c). Without propagating the uncertainty, and when only translating the predicted WLt (possibly in combination with other parameters, e.g., soil properties) into a GHG budget, the GHG budget is strongly underestimated, as the WLt prediction is close to zero, indicating neither large CO2 nor CH4 emissions. When translating the full distribution of WLt into a GHG budget, the resulting GHG budget would be much higher, as the GHG budget increases on both sides of the predicted WLt.
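
The effect described here can be illustrated numerically by comparing a transfer function evaluated at the point prediction with its expectation over the prediction distribution. The V-shaped `toy_transfer` is a purely hypothetical stand-in for a GHG transfer function that rises on both sides of WLt = 0, and the standard deviation of 0.5 is an invented value; neither is taken from the study.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, sd):
    return exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * sqrt(2.0 * pi))


def expected_budget(transfer, mu, sd, n=4001, width=6.0):
    """E[transfer(W)] for W ~ Normal(mu, sd), midpoint rule over mu +/- width * sd."""
    lo = mu - width * sd
    step = 2.0 * width * sd / n
    total = 0.0
    for i in range(n):
        w = lo + (i + 0.5) * step  # midpoint of the i-th integration cell
        total += transfer(w) * normal_pdf(w, mu, sd) * step
    return total


# Hypothetical transfer function: the budget rises on both sides of WLt = 0.
def toy_transfer(wlt):
    return abs(wlt)


point_estimate = toy_transfer(0.0)  # uncertainty ignored
propagated = expected_budget(toy_transfer, 0.0, 0.5)  # uncertainty propagated
```

For this toy function the point estimate is zero, while the propagated budget is close to sd · sqrt(2/π), i.e. about 0.40 for sd = 0.5, which shows how ignoring the prediction distribution can underestimate a budget near the minimum of the transfer function.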

3.7 Possible paths for model improvement

The model performance that is achieved by the statistical approach presented in our study raises the question of whether collecting more WL data can improve model performance, or whether the factor that is constraining the model performance is the limited strength of the nation-wide available predictor variables. To assess this question, additional "hold-out models" were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated and model performance was evaluated with NSEcv. As expected, results indicate an increase of model performance with increasing number of peatlands used in the model building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when including many more peatlands in the data set. More promising would be the specific collection of more data on the weakly represented and/or important land cover classes arable land and grassland.
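
The hold-out experiment can be outlined as follows: the Nash–Sutcliffe efficiency (Nash and Sutcliffe, 1970) is computed from observed and predicted values, and a learning curve repeats the calibration for random subsets of peatlands. The `learning_curve` helper and its `score_subset` argument are hypothetical; `score_subset` stands for whatever routine calibrates a model on the selected peatlands and returns its cross-validated NSE.

```python
import random
from statistics import mean, stdev


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squares around the observed mean."""
    obs_mean = mean(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - sse / sst


def learning_curve(peatland_ids, sizes, score_subset, n_repeats=500, seed=0):
    """For each subset size, score n_repeats random selections of peatlands.

    score_subset: hypothetical callable that calibrates a model on the chosen
    peatlands and returns its cross-validated NSE.
    Returns {size: (mean NSE, standard deviation of NSE)}.
    """
    rng = random.Random(seed)
    curve = {}
    for size in sizes:
        scores = [score_subset(rng.sample(peatland_ids, size)) for _ in range(n_repeats)]
        curve[size] = (mean(scores), stdev(scores))
    return curve


# Dummy scorer for demonstration only: performance grows with subset size.
demo_curve = learning_curve(list(range(53)), [10, 50], lambda ids: len(ids) / 53.0, n_repeats=5)
```

A perfect prediction gives NSE = 1, and predicting the observed mean everywhere gives NSE = 0, so a flattening mean NSE across subset sizes indicates diminishing returns from adding more peatlands.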

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nation-wide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that has been successfully conducted in Finke et al. (2004) using the transfer function-noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, e.g., spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± sd.

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data covers only a small fraction of the peatlands of the final map and thus spatial interpolation is not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG-relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. With the analysis of their effect on the water level, it was possible to gain insight into natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions depend linearly. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions, and avoided model bias.

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show, however, that model improvement by increasing the number of WLt data seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively few data are available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, for example water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach is suited to application to any other country where similar data of target and predictor variables is available. It is important that the spatial resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if for specific regions detailed information on peat properties becomes available and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), Landkreis Gifhorn, NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support; Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support; Annette Freibauer for reviewing the paper draft; and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project "Organic soils" funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43, 5, 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirby, M.: A physically based variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De'ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz – Moornutzungsstrategien 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.

Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package "dismo", version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada's forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8, 5, 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package "gbm", version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.

Tetzlaff, B., Kuhr, P., and Wendland, F.: A New Method for Creating Maps of Artificially Drained Areas in Large River Basins Based on Aerial Photographs and Geodata, Irrig. Drain., 58, 569–585, doi:10.1002/Ird.426, 2009.

Thompson, J. R., Gavin, H., Refsgaard, A., Sorenson, H. R., and Gowing, D. J.: Modelling the hydrological impacts of climate change on UK lowland wet grassland, Wetl. Ecol. Manage., 17, 503–523, doi:10.1007/s11273-008-9127-1, 2009.

UBA: National Inventory Report for the German Greenhouse Gas Inventory 1990–2008, Submission under the United Nations Framework Convention on Climate Change and the Kyoto Protocol 2012, Dessau, Germany, 2012.

van den Akker, J. J. H., Jansen, P. C., Hendriks, R. F. A., Hoving, I., and Pleijter, M.: Submerged infiltration to halve subsidence and GHG emissions of agricultural peat soils, Proceedings of the 14th International Peat Congress, Stockholm, Sweden, 2012.

van der Gaast, J. W. J., Massop, H. T. L., and Vroon, H. R. J.: Actuele grondwaterstandsituatie in natuurgebieden: Een Pilotstudie, Wettelijke Onderzoekstaken Natuur & Milieu, WOt-rapport 94, Wageningen, 134 pp., 2009.

van der Ploeg, M. J., Appels, W. M., Cirkel, D. G., Oosterwoud, M. R., Witte, J. P. M., and van der Zee, S. E. A. T. M.: Microtopography as a Driving Mechanism for Ecohydrological Processes in Shallow Groundwater Systems, Vadose Zone J., 11, 52–62, doi:10.2136/Vzj2011.0098, 2012.

Wackernagel, H.: Multivariate Geostatistics, Springer, Berlin, Germany, 387 pp., 2003.

wwwhydrol-earth-syst-scinet1833192014 Hydrol Earth Syst Sci 18 3319ndash3339 2014

Page 14: Large-scale regionalization of water table depth in peatlands … · 2014-09-04 · M. Bechtold et al.: Large-scale regionalization of water table depth in peatlands 3321 was to regionalize

3332 M Bechtold et al Large-scale regionalization of water table depth in peatlands

Figure 7 Partial dependence plots for the predictor variables For an explanation of variables see Table 1 They axes are on WLt scale andare centered around the mean WLt Error bars and grey area indicate standard deviation of the response of over 1000 bootstrap models Therelative contribution of each predictor is indicated as percentage The lines along thex axes of each plot show distribution of data across thatvariable in deciles

The fact that all parameters show expected or explainableresponses in the model corroborates the reliability of the cal-ibrated WLt model The standard deviation of the predictorresponses based on the bootstrap samples shows the stabilityof the observed responses

Further insight into model behavior can be obtained by an-alyzing parameter interactions This is obtained by changingtwo parameters simultaneously while keeping mean valuesfor all other parameters (Elith et al 2008) Figure 8 showsthe two strongest parameter interactions Parameter wbsummer

strongly interacts withptype The generally lower values ofWLt of fens without surface water connection and other or-ganic soils show a stronger dependency on the summer cli-matic water balance While a summer climatic water balanceof gt minus80 mm shows a rather weak effect on WLt for the wet-ter peatland types in contrast to the two drier peatland typesthere is still a strong effect with increasing wbsummer Thetrend for wbsummergt 130 mm for the dry peatland types issupported by seven different peatlands

Hydrol Earth Syst Sci 18 3319ndash3339 2014 wwwhydrol-earth-syst-scinet1833192014

M Bechtold et al Large-scale regionalization of water table depth in peatlands 3333

Figure 8Partial dependence plots representing the two strongest interactions in the model(a) betweenptype and wbsummerand(b) betweenpbaseandfdry Fitted WLt is plotted on they axis which is obtained after accounting for the average effect of all other predictor variables

Another strong interaction is observed forpbase andfdry (2500) While a rather weak effect of the fraction ofarable land and grassland is observed for organic soils over-lying basement rock and peat clay layer a strong effect is ob-served for organic soils overlying unconsolidated rock Thisinteraction reflects the higher lateral range of drainage effectsfor organic soils with little flow resistance at the peat base Inthese organic soils intensive land use lowers the water levelover large areas

35 Discussion of model uncertainty

Plotting observed vs predicted WLt from cross-validation(Fig 9) illustrates the rather large residual variance that can-not be explained by the model As indicated by the higherRMSEcv for the wet range (Table 3) scatter increases withincreasing WLt Error bars in they direction indicate dataerror derived from the nugget of the variogram It is shownfor a few data points as an example Due to transformationdata error increases for higher WLt Figure 9 demonstratesthat the fraction of unexplainable variance related to data er-ror is much higher for the wet than for the dry range Boot-strap error indicating the variation of the model predictionsfor 1000 bootstrap samples is shown in thex direction for thesame data points Bootstrap error is lower than the data errorfor the wet range and slightly higher for the dry range

Figure 9. Observed vs. predicted transformed annual mean water level (WLt) from cross-validation results. Error bars show selected data and bootstrap model errors as standard deviation. Data points are scaled by their weights.

Bootstrap errors demonstrate the sensitivity of model predictions to changes in the data set used for calibration. When a model possesses structural deficits, such as missing predictor variables, bootstrap errors should not be used to define confidence intervals for the model predictions. Figure 10 shows residuals from cross-validation and the standard deviation of bootstrap predictions for all land cover classes. The residuals of each land cover class show near-normal distributions. For five of the nine land cover classes (wet forest, wet unused peatland, arable land, coniferous forest and reed), the Shapiro–Wilk test indicates normality (p > 0.05). Figure 10a further indicates that the residuals of each land cover class scatter fairly well around zero, indicating low bias for the various land cover classes. Land-cover-class-specific confidence intervals of model predictions can thus be derived from the RMSEcv of each land cover class, e.g., 2 × RMSEcv representing the 95 % confidence interval.
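The class-specific interval described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `rmse_by_class` and `interval_95` are hypothetical, the numbers are made up and are not data from the study, and the 2 × RMSEcv rule is only meaningful if the residuals of a class are roughly unbiased and near-normal.

```python
import math
from collections import defaultdict


def rmse_by_class(observed, predicted, classes):
    """Return {land cover class: RMSE} from cross-validation pairs."""
    squared = defaultdict(list)
    for obs, pred, cls in zip(observed, predicted, classes):
        squared[cls].append((obs - pred) ** 2)
    return {cls: math.sqrt(sum(v) / len(v)) for cls, v in squared.items()}


def interval_95(prediction, rmse_cv):
    """Approximate 95 % interval as prediction +/- 2 x RMSEcv."""
    return prediction - 2.0 * rmse_cv, prediction + 2.0 * rmse_cv


# Hypothetical cross-validation results (illustrative values only).
observed = [-0.10, 0.05, 0.20, -0.60, -0.40, -0.55]
predicted = [0.00, 0.00, 0.10, -0.50, -0.50, -0.45]
classes = ["wet grassland"] * 3 + ["arable land"] * 3

rmse = rmse_by_class(observed, predicted, classes)
low, high = interval_95(0.0, rmse["wet grassland"])
print(rmse)
print(low, high)
```

Grouping by class before averaging the squared residuals is what makes the interval class-specific; a single pooled RMSEcv would hide the wider scatter of the wet classes.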

The prediction uncertainty derived from cross-validation is much higher than the bootstrap prediction uncertainty obtained from the bootstrap standard deviation (sd), with 2 × sd corresponding to the 95 % confidence interval (Fig. 10). The large difference between these values indicates that the model has structural deficits that can be attributed to several error sources.

Figure 10. (a) Residuals (observation–prediction) of WLt predictions and (b) standard deviation (sd) of bootstrap predictions, shown for the nine land cover classes. In the upper part, the number of dip wells in each class is indicated.

i. Key influences on WLt are missing in the set of predictor variables. None of the predictor variables indicates whether and to what extent the water level increased due to re-wetting measures in recent years. Wetness indicators (wet soil and/or vegetation attributes) that are obtained from the Digital Landscape Model probably react with a delay of several years. Thus, we expect the occurrence of several observed high WLt values that cannot be explained by any of the predictor variables.

ii. Small-scale topography that is not represented with sufficient detail and accuracy in the DEM may cause several predictions to differ strongly from what would be expected from the other predictor variables. A common example may be a dip well located on a narrow peat ridge that remained after peat cutting, is absent from the DEM and is situated in an area classified as wet soil by the Digital Landscape Model. In this case, the model indicates a WLt that is much higher than the observed WLt, because the reference surface for the observed value was the surface of the peat ridge.

iii. Consistent information about tile drains is missing; it exists only at the regional scale (Tetzlaff et al., 2009). At the national scale, however, there are no maps of tile drains. Tile drains are known to have a strong effect on WLt for arable land and grassland. As explained above, we expect the parameter dilendry (250) to partially compensate for this missing information.

iv. Another source of prediction uncertainty may be inconsistent and erroneous land cover classification in the Digital Landscape Model, due to the high degree of subjectivity of many of the attributes. Furthermore, the Digital Landscape Model may be out of date by as much as 5 years, which can cause time series with land-use change to be split at the wrong date and vegetation and wetness attributes to not yet reflect current conditions.

Figure 11. Residuals (observation–prediction) of WLt predictions for the three major geographical peatland regions of Germany. In the upper part, the number of dip wells in each class is indicated.

v. The water balance of fens strongly depends on the size and the hydraulic head of the groundwater catchment, i.e., of the aquifer underlying the peat layer. Unfortunately, there is no consistent map of hydraulic heads or groundwater catchments for all of Germany.
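The gap between cross-validation and bootstrap uncertainty discussed above can be reproduced with a deliberately simple toy model. The sketch below is not the BRT model of the study: it fits an ordinary least-squares line to one predictor while the synthetic target also depends on a second influence that is withheld from the model. The data, the leave-one-out scheme and all names (`fit_line`, `x2`, `boot_sd`) are assumptions made for illustration.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


rng = random.Random(42)
n = 200
x1 = [rng.uniform(0.0, 1.0) for _ in range(n)]   # available predictor
x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]     # influence missing from the model
y = [a + b + rng.gauss(0.0, 0.1) for a, b in zip(x1, x2)]

# Leave-one-out cross-validation: error of predicting unseen observations.
squared_errors = []
for i in range(n):
    a, b = fit_line(x1[:i] + x1[i + 1:], y[:i] + y[i + 1:])
    squared_errors.append((y[i] - (a + b * x1[i])) ** 2)
rmse_cv = math.sqrt(sum(squared_errors) / n)

# Bootstrap: spread of the fitted prediction at x1 = 0.5 over resampled data sets.
preds = []
for _ in range(1000):
    idx = [rng.randrange(n) for _ in range(n)]
    a, b = fit_line([x1[i] for i in idx], [y[i] for i in idx])
    preds.append(a + b * 0.5)
mean_pred = sum(preds) / len(preds)
boot_sd = math.sqrt(sum((p - mean_pred) ** 2 for p in preds) / (len(preds) - 1))

print(f"RMSEcv = {rmse_cv:.3f}, bootstrap sd = {boot_sd:.3f}")
```

The bootstrap spread only reflects how stable the fitted line is under resampling, whereas the cross-validation error also contains the variance caused by the missing influence, so the former comes out much smaller.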

We checked model predictions for geographical bias. Geographical location was not one of the model parameters. However, the history and policy of land use on organic soils, current ditch water management and climate do show large-scale geographical trends. We divided our data set into the three major German peatland regions (NE, NW and S) and evaluated the model residuals (Fig. 11) to see whether our model is biased due to important missing geographical effects. A serious bias for any of the three major German peatland regions cannot be identified.


Figure 12. Map of predictions of transformed annual mean water level (WLt) for all German organic soils (a) and an enlarged map section (b). The probability distribution in (c) indicates the uncertainty of a specific point prediction for wet grassland as an example. Here, the predicted value is approximately WLt = 0, but note that wet grassland predictions do vary in space depending on the values of the other model parameters. The histogram shows the residuals from cross-validation for wet grassland, to which the probability distribution was fitted.

When applying calibrated statistical models during regionalization, it is important to check model behavior for extrapolation outside the range of the parameter space that is covered by the data upon which the model was built. BRT always extrapolates at a constant value from the most extreme environmental value in the training data. In contrast to other types of statistical models, e.g., generalized linear models, BRT does not continue the fitted trend beyond the last observation. Regarding the categorical variables, the data set covers every class occurring in Germany with several peatlands each. The data set also covers the major range of values occurring in Germany for the numerical predictor variables. Furthermore, Fig. 7 indicates that the constant values at which the model extrapolates the influence of the variables do not raise major concern for any extreme predictions outside the parameter range.
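The constant extrapolation of tree-based models described above follows directly from how trees split the predictor range. The following sketch illustrates this with a minimal boosting of single-split regression stumps written in plain Python. It is a simplified stand-in, not the BRT implementation used in the study; the function names, the number of trees and the learning rate are assumptions chosen for the example.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump (least squares) on one predictor."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        split = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, split, lmean, rmean)
    return best[1:]


def fit_boosted_stumps(xs, ys, n_trees=200, learning_rate=0.1):
    """Boost stumps on the residuals; returns a prediction function."""
    base = sum(ys) / len(ys)
    fitted = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - f for y, f in zip(ys, fitted)]
        split, lmean, rmean = fit_stump(xs, residuals)
        stumps.append((split, lmean, rmean))
        fitted = [f + learning_rate * (lmean if x <= split else rmean)
                  for x, f in zip(xs, fitted)]

    def predict(x):
        return base + learning_rate * sum(
            lmean if x <= split else rmean for split, lmean, rmean in stumps)

    return predict


# Training data with a clear linear trend on the range 0..1.
xs = [i / 10 for i in range(11)]
ys = [2.0 * x for x in xs]
brt = fit_boosted_stumps(xs, ys)

inside_edge = brt(1.0)    # most extreme value in the training data
far_outside = brt(5.0)    # well beyond the training range
linear_trend = 2.0 * 5.0  # what a fitted linear trend would predict at x = 5
print(inside_edge, far_outside, linear_trend)
```

Because every split lies between two training values, any input beyond the largest training value falls on the same side of all splits as that largest value and therefore receives the same prediction.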

3.6 Regionalization

The map of WLt resulting from the application of the fitted WLt model to all grid cells shows gradients at the regional scale (Fig. 12a). In the south of Germany, for example, a gradient from wet to dry can be observed for the pre-alpine upland bogs and the peatlands of the moraine plain. In the north of Germany, the map indicates that organic soils in the very NE are wetter than the rest. For the rest of the north, a slight gradient from less dry to dry can be observed from NW to E, which is mainly driven by the higher summer climatic water balance in the NW. As both categorical and numerical predictor variables also vary at the sub-regional scale, the resulting map also shows gradients within peatland areas, e.g., due to small-scale land-use and ditch density gradients and topography effects (Fig. 12b).

We calculated WLt averages of the land cover classes using the regionalized WLt from the map (Table 2, column 3). The given standard deviation comprises both the variability within a land cover class that is explained by the model and the uncertainty of each prediction. The resulting means and standard deviations differ slightly from the corresponding values of the data set. The land-cover-specific WLt values obtained from the map can be considered more representative, as the regionalization procedure is supposed to partly account for potential bias in the data set.

When applying this map and its predicted WLt values in subsequent GHG upscaling, it is crucial that model uncertainty is propagated properly. An example demonstrates the necessity of uncertainty propagation. For a grid cell classified as wet grassland, the probability distribution of WLt is shown based on a normal distribution that was fitted to the residuals of this land cover class (Fig. 12c). Without propagating the uncertainty, i.e., when only the predicted WLt is translated (possibly in combination with other parameters, e.g., soil properties) into a GHG budget, the GHG budget is strongly underestimated, as the WLt prediction is close to zero, indicating neither large CO2 nor CH4 emissions. When the full distribution of WLt is translated into a GHG budget, the resulting GHG budget is much higher, as the GHG budget increases on both sides of the predicted WLt.
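The effect described above can be made concrete with a small numerical sketch. It assumes a purely hypothetical V-shaped transfer function with its minimum at WLt = 0 and arbitrary slopes and units, and a normal distribution with an assumed standard deviation of 0.3; none of these values are taken from the study. The expected budget is obtained by integrating the transfer function over the distribution with a simple midpoint rule.

```python
import math


def ghg_budget(wlt, slope_dry=1.0, slope_wet=2.0):
    """Hypothetical V-shaped transfer function (arbitrary units), minimum at 0."""
    return slope_wet * wlt if wlt > 0 else slope_dry * abs(wlt)


def normal_pdf(x, mean, sd):
    """Density of the normal distribution."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def expected_budget(mean, sd, n_cells=20000, width=6.0):
    """Integrate budget x density over mean +/- width * sd (midpoint rule)."""
    lower = mean - width * sd
    step = 2.0 * width * sd / n_cells
    total = 0.0
    for i in range(n_cells):
        w = lower + (i + 0.5) * step
        total += ghg_budget(w) * normal_pdf(w, mean, sd) * step
    return total


predicted_wlt = 0.0   # point prediction close to zero, as in the example
assumed_sd = 0.3      # assumed spread of the fitted residual distribution

point = ghg_budget(predicted_wlt)
propagated = expected_budget(predicted_wlt, assumed_sd)
print(f"point estimate: {point:.3f}, propagated expectation: {propagated:.3f}")
```

Because the budget rises on both sides of the minimum, averaging over the distribution can only increase the result relative to evaluating the function at the predicted value alone.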

3.7 Possible paths for model improvement

The model performance achieved by the statistical approach presented in our study raises the question of whether collecting more WL data can improve model performance or whether the constraining factor is the limited strength of the nationwide available predictor variables. To assess this question, additional “hold-out models” were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated and model performance was evaluated with NSEcv. As expected, results indicate an increase in model performance with an increasing number of peatlands used in the model-building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when many more peatlands are included in the data set. More promising would be the targeted collection of more data on the weakly represented and/or important land cover classes arable land and grassland.
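The hold-out procedure can be outlined as follows. This is only a skeleton under stated assumptions: `nse` implements the Nash–Sutcliffe efficiency (Nash and Sutcliffe, 1970), `holdout_curve` and its arguments are hypothetical names, and `toy_score` is a placeholder that merely exercises the loop. In a real application it would be replaced by a routine that fits the model to the selected peatlands and returns NSEcv.

```python
import random
import statistics


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations (unnormalized)."""
    mean_obs = statistics.fmean(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def holdout_curve(peatland_ids, score_subset, sizes, n_repeats=500, seed=0):
    """For each subset size, score random peatland subsets; return mean and sd."""
    rng = random.Random(seed)
    curve = {}
    for size in sizes:
        scores = [score_subset(rng.sample(peatland_ids, size))
                  for _ in range(n_repeats)]
        curve[size] = (statistics.fmean(scores), statistics.stdev(scores))
    return curve


def toy_score(subset):
    """Placeholder scorer (NOT a fitted model): depends on subset size only."""
    return 1.0 - 1.0 / len(subset)


peatlands = list(range(53))            # 53 peatlands, as in the data set
curve = holdout_curve(peatlands, toy_score, sizes=[10, 20, 30, 40, 50])
for size, (mean_score, sd_score) in curve.items():
    print(size, round(mean_score, 3), round(sd_score, 3))
```

Sampling whole peatlands rather than individual dip wells keeps wells from the same site together, which matches the question of how many peatland areas are needed.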

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nationwide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that was successfully conducted by Finke et al. (2004) using the transfer function-noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, e.g., spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± sd.

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data covers only a small fraction of the peatlands of the final map and spatial interpolation is thus not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG-relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. Analyzing their effects on the water level provided insight into the natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions depend linearly. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions and avoided model bias.

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show, however, that model improvement by increasing the number of WLt data seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively little data is available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, for example water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach is suited to application in any other country where similar data on target and predictor variables is available. It is important that the spatial resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if detailed information on peat properties becomes available for specific regions and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support; Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support; Annette Freibauer for reviewing the paper draft; and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project “Organic soils” funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43(5), 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirby, M.: A physically based variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De’ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz – Moornutzungsstrategien 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.


Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package “dismo”, version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada’s forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8(5), 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models, part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package “gbm”, version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.

Tetzlaff, B., Kuhr, P., and Wendland, F.: A New Method for Creating Maps of Artificially Drained Areas in Large River Basins Based on Aerial Photographs and Geodata, Irrig. Drain., 58, 569–585, doi:10.1002/Ird.426, 2009.

Thompson, J. R., Gavin, H., Refsgaard, A., Sorenson, H. R., and Gowing, D. J.: Modelling the hydrological impacts of climate change on UK lowland wet grassland, Wetl. Ecol. Manage., 17, 503–523, doi:10.1007/s11273-008-9127-1, 2009.

UBA: National Inventory Report for the German Greenhouse Gas Inventory 1990–2008, Submission under the United Nations Framework Convention on Climate Change and the Kyoto Protocol 2012, Dessau, Germany, 2012.

van den Akker, J. J. H., Jansen, P. C., Hendriks, R. F. A., Hoving, I., and Pleijter, M.: Submerged infiltration to halve subsidence and GHG emissions of agricultural peat soils, Proceedings of the 14th International Peat Congress, Stockholm, Sweden, 2012.

van der Gaast, J. W. J., Massop, H. T. L., and Vroon, H. R. J.: Actuele grondwaterstandsituatie in natuurgebieden: Een Pilotstudie, Wettelijke Onderzoekstaken Natuur & Milieu, WOt-rapport 94, Wageningen, 134 pp., 2009.


van der Ploeg, M. J., Appels, W. M., Cirkel, D. G., Oosterwoud, M. R., Witte, J. P. M., and van der Zee, S. E. A. T. M.: Microtopography as a Driving Mechanism for Ecohydrological Processes in Shallow Groundwater Systems, Vadose Zone J., 11, 52–62, doi:10.2136/Vzj2011.0098, 2012.

Wackernagel, H.: Multivariate Geostatistics, Springer, Berlin, Germany, 387 pp., 2003.


Figure 8. Partial dependence plots representing the two strongest interactions in the model (a) between ptype and wbsummer and (b) between pbase and fdry. Fitted WLt is plotted on the y axis, which is obtained after accounting for the average effect of all other predictor variables.

Another strong interaction is observed forpbase andfdry (2500) While a rather weak effect of the fraction ofarable land and grassland is observed for organic soils over-lying basement rock and peat clay layer a strong effect is ob-served for organic soils overlying unconsolidated rock Thisinteraction reflects the higher lateral range of drainage effectsfor organic soils with little flow resistance at the peat base Inthese organic soils intensive land use lowers the water levelover large areas

35 Discussion of model uncertainty

Plotting observed vs predicted WLt from cross-validation(Fig 9) illustrates the rather large residual variance that can-not be explained by the model As indicated by the higherRMSEcv for the wet range (Table 3) scatter increases withincreasing WLt Error bars in they direction indicate dataerror derived from the nugget of the variogram It is shownfor a few data points as an example Due to transformationdata error increases for higher WLt Figure 9 demonstratesthat the fraction of unexplainable variance related to data er-ror is much higher for the wet than for the dry range Boot-strap error indicating the variation of the model predictionsfor 1000 bootstrap samples is shown in thex direction for thesame data points Bootstrap error is lower than the data errorfor the wet range and slightly higher for the dry range

Bootstrap errors demonstrate the sensitivity of model pre-dictions to changes of the data set used for calibration Whena model possesses structural deficits such as missing pre-dictor variables bootstrap errors should not be used to de-fine confidence intervals for the model predictions Figure 10shows residuals from cross-validation and standard deviationof bootstrap predictions for all land cover classes The resid-uals of each land cover class show near-normal distributionsFor five of the nine land cover classes (wet forest wet un-used peatland arable land coniferous forest and reed) theShapirondashWilk test of normality is positive (p gt 005) Fig-ure 10a further indicates that residuals of each land cover

Figure 9 Observed vs predicted transformed annual mean waterlevel (WLt) from cross-validation results Error bars show selecteddata and bootstrap model errors as standard deviation Data pointsare scaled by their weights

scatter fairly well around zero indicating low bias for the var-ious land cover classes Land-cover-class-specific confidenceintervals of model predictions can thus be derived from theRMSEcv of each land cover class eg 2times RMSEcv repre-senting the 95 confidence interval

The prediction uncertainty derived from cross-validationis much higher than the bootstrap prediction uncertainty ob-tained from the bootstrap standard deviation (sd) with 2times sdcorresponding to the 95 confidence interval (Fig 10)The large difference between these values indicates that themodel has structural deficits that can be attributed to severalerror sources

i Key influences on WLt are missing in the set of pre-dictor variables None of the predictor variables in-dicate whether and to which extent water level in-crease due to re-wetting measures took place in the last

wwwhydrol-earth-syst-scinet1833192014 Hydrol Earth Syst Sci 18 3319ndash3339 2014

3334 M Bechtold et al Large-scale regionalization of water table depth in peatlands

Figure 10 (a)Residuals (observationndashprediction) of WLt predictions and(b) standard deviation (sd) of bootstrap predictions shown for thenine land cover classes In the upper part the number of dip wells in each class is indicated

years Wetness indicators (wet soil andor vegetation at-tributes) that are obtained from the Digital LandscapeModel probably react with a delay of several yearsThus we expect the occurrence of several observed highWLt values that cannot be explained by any of the pre-dictor variables

ii Small-scale topography that is not represented with suf-ficient detail and accuracy in the DEM may cause sev-eral predictions to strongly differ from what would beexpected from the other predictor variables A commonexample may be a dip well that is located on a narrowpeat ridge which remained after peat-cutting and is ab-sent in the DEM and that is situated in an area classi-fied as wet soil by the Digital Landscape Model Thenthe model indicates a WLt that is much higher than theobserved WLt as for the observed value the referencesurface was the surface of the peat ridge

iii Consistent information about tile drains is missing andonly exists on the regional scale (Tetzlaff et al 2009)At the national scale however there are no maps on tiledrains Tile drains are known to have a strong effect onWLt for arable land and grassland As explained abovewe expect parameter dilendry (250) to partially compen-sate for this missing information

iv Another source of prediction uncertainty may compriseinconsistent and erroneous land cover classification ofthe Digital Landscape Model due to the high degree ofsubjectivity for many of the attributes Furthermore thetemporal accuracy of the Digital Landscape Model maybe as inaccurate as 5 years which can cause time serieswith land-use change to be split at the wrong date andvegetation and wetness attributes to be not yet updatedto the current conditions

Figure 11 Residuals (observationndashprediction) of WLt predictionsfor the three major geographical peatland regions of Germany Inthe upper part the number of dip wells in each class is indicated

v The water balance of fens strongly depends on the sizeand the hydraulic head of the groundwater catchmentie of the aquifer underlying the peat layer Unfortu-nately there is no consistent map of hydraulic heads orgroundwater catchments for all Germany

We checked model predictions for geographical bias Ge-ographical location was not one of the model parametersHowever the history and policy of land use on organic soilscurrent ditch water management and climate do show large-scale geographical trends We divided our data set into thethree major German peatland regions (NE NW and S) andevaluated the model residuals (Fig 11) to see whether ourmodel is biased due to important missing geographical ef-fects A serious bias for any of the three major German peat-land regions cannot be identified

Figure 12. Map of predictions of transformed annual mean water level (WLt) for all German organic soils (a) and an enlarged map section (b). Probability distribution in (c) indicates the uncertainty of a specific point prediction for wet grassland as an example. Here, the predicted value is approximately WLt = 0, but note that wet grassland predictions do vary in space depending on the values of the other model parameters. The histogram shows the residuals from cross-validation for wet grassland, to which the probability distribution was fitted.

When applying calibrated statistical models during regionalization, it is important to check model behavior for extrapolation outside the range of the parameter space that is covered by the data upon which the model was built. BRT always extrapolates at a constant value from the most extreme environmental value in the training data. In contrast to other types of statistical models, e.g. generalized linear models, BRT does not continue the fitted trend beyond the last observation. Regarding the categorical variables, the data set covers all classes occurring in Germany with several peatlands. The data set also covers the major range of values occurring in Germany for the numerical predictor variables. Furthermore, Fig. 7 indicates that the constant values at which the model extrapolates the influence of the variables do not raise major concern for any extreme predictions outside the parameter range.
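The constant extrapolation of tree-based models can be illustrated with a toy example. The sketch below is a hand-written least-squares boosting of one-split trees (stumps) in plain Python; it is a stand-in for BRT chosen to keep the example self-contained, not the R implementation used in the study, and the training data are invented.

```python
def fit_stump(xs, residuals):
    """Find the single split on x that minimises the squared error of the residuals."""
    values = sorted(set(xs))
    best = None
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def fit_boosted_stumps(xs, ys, n_trees=200, learning_rate=0.1):
    """Least-squares boosting: each stump is fitted to the current residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((threshold, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= threshold else right_mean)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps

def predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(l if x <= t else r for t, l, r in stumps)

# Invented training data with a perfectly linear trend y = 2x on x = 0..10.
xs = [float(i) for i in range(11)]
ys = [2.0 * x for x in xs]
model = fit_boosted_stumps(xs, ys)

at_edge = predict(model, 10.0)     # largest predictor value seen in training
far_outside = predict(model, 100.0)  # well outside the training range
print(at_edge, far_outside)
```

Because every split threshold lies inside the training range, any input beyond the largest training value falls on the same side of all splits and receives the prediction of the edge value, whereas a linear model would continue the trend.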

3.6 Regionalization

The map of WLt resulting from the application of the fitted WLt model to all grid cells shows gradients at the regional scale (Fig. 12a). In the south of Germany, for example, a gradient from wet to dry can be observed for the pre-alpine upland bogs and the peatlands of the moraine plain. In the north of Germany, the map indicates that organic soils in the very NE are wetter than the rest. For the rest of the north, a slight gradient can be observed from less dry to dry from NW to E, which is mainly driven by the higher summer climatic water balance in the NW. As both categorical and numerical predictor variables do also vary at the sub-regional scale, the resulting map also shows gradients within peatland areas, e.g. due to small-scale land-use, ditch density gradients and topography effects (Fig. 12b).

We calculated WLt averages of the land cover classes using the regionalized WLt from the map (Table 2, column 3). The given standard deviation comprises both the variability within a land cover class that is explained by the model as well as the uncertainty of each prediction. Resulting means and standard deviations slightly differ from the corresponding values of the data set. The land-cover-specific WLt values obtained from the map can be considered as being more representative, as the regionalization procedure is supposed to partly account for potential bias in the data set.
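One way to read the statement about the standard deviation is through the law of total variance. The decomposition below is an interpretation, not a formula given in the paper; the symbols are introduced here for illustration: $\hat{w}_i$ is the predicted WLt of grid cell $i$ in land cover class $c$, $\bar{w}_c$ the class mean of the predictions, $\sigma_i$ the standard deviation of the prediction uncertainty of cell $i$, and $n_c$ the number of cells in the class.

```latex
s_c^2 \;\approx\;
\underbrace{\frac{1}{n_c}\sum_{i=1}^{n_c}\left(\hat{w}_i-\bar{w}_c\right)^2}_{\text{variability explained by the model}}
\;+\;
\underbrace{\frac{1}{n_c}\sum_{i=1}^{n_c}\sigma_i^2}_{\text{uncertainty of each prediction}}
```

Under this reading, the reported class standard deviation $s_c$ is larger than the spread of the mapped predictions alone.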

When applying this map and its predicted WLt values in subsequent GHG upscaling, it is crucial that model uncertainty is propagated properly. An example demonstrates the necessity of uncertainty propagation. For a grid cell classified as wet grassland, the probability distribution of WLt is shown based on a normal distribution that was fitted to the residuals of this land cover class (Fig. 12c). Without propagating the uncertainty, and when only translating the predicted WLt (possibly in combination with other parameters, e.g. soil properties) into a GHG budget, the GHG budget is strongly underestimated, as the WLt prediction is close to zero, indicating neither large CO2 nor CH4 emissions. When translating the full distribution of WLt into a GHG budget, the resulting GHG budget would be much higher, as the GHG budget increases on both sides of the predicted WLt.
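The effect can be reproduced with a small Monte Carlo sketch. The transfer function `ghg_budget` below is purely hypothetical (emissions rising linearly on both sides of WLt = 0, in arbitrary units) and the residual standard deviation of 0.5 is an invented value; only the idea of sampling from a normal distribution centred on the prediction follows the text.

```python
import random
from statistics import mean

def ghg_budget(wlt, slope=1.0):
    """Hypothetical transfer function: emissions rise linearly on both sides of WLt = 0."""
    return slope * abs(wlt)

def propagate(predicted_wlt, residual_sd, n_samples=10_000, seed=42):
    """Average the GHG budget over a normal distribution around the prediction."""
    rng = random.Random(seed)
    samples = [rng.gauss(predicted_wlt, residual_sd) for _ in range(n_samples)]
    return mean(ghg_budget(s) for s in samples)

# Budget from the point prediction alone (no uncertainty propagation).
point_estimate = ghg_budget(0.0)

# Budget averaged over the full distribution of WLt (invented sd of 0.5).
propagated = propagate(predicted_wlt=0.0, residual_sd=0.5)

print(point_estimate, propagated)
```

With a transfer function that has its minimum near the predicted value, the budget averaged over the distribution is necessarily at least as large as the budget of the point prediction, which is the underestimation the paragraph warns about.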

3.7 Possible paths for model improvement

The model performance that is achieved by the statistical approach presented in our study raises the question whether collecting more WL data can improve model performance, or whether the factor that is constraining the model performance is the limited strength of the nation-wide available predictor variables. To assess this question, additional “hold-out models” were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated and model performance was evaluated with NSEcv. As expected, results indicate an increase of model performance with increasing number of peatlands used in the model building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when including many more peatlands into the data set. More promising would be the specific collection of more data on the weakly represented and/or important land cover classes arable land and grassland.
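The two building blocks of this experiment, the Nash-Sutcliffe efficiency (NSE; Nash and Sutcliffe, 1970) and the repeated random selection of peatland areas, can be sketched as below. The function names are illustrative, the peatland identifiers are placeholders for the 53 peatlands of the data set, and the model fitting and cross-validation steps are deliberately left out.

```python
import random
from statistics import mean

def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squares around the observed mean."""
    obs_mean = mean(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - sse / sst

def random_peatland_subsets(peatland_ids, n_peatlands, n_repeats, seed=0):
    """Draw n_repeats random selections of n_peatlands distinct peatland areas."""
    rng = random.Random(seed)
    return [rng.sample(peatland_ids, n_peatlands) for _ in range(n_repeats)]

# Placeholder identifiers for the 53 peatlands in the data set.
peatland_ids = list(range(53))

# 500 random selections of 10 peatlands; in the study, a model would be
# calibrated on the dip wells of each selection and scored with NSEcv.
subsets = random_peatland_subsets(peatland_ids, n_peatlands=10, n_repeats=500)

print(nse([1, 2, 3], [1, 2, 3]))  # perfect prediction
print(nse([1, 2, 3], [2, 2, 2]))  # predicting the observed mean
```

An NSE of 1 corresponds to a perfect prediction and an NSE of 0 to a model that is no better than the observed mean, which is the scale on which the learning curve in Fig. 13 is read.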

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nation-wide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that has been successfully conducted in Finke et al. (2004) using the transfer noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, e.g. spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± sd.

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data covers only a small fraction of peatlands of the final map and thus spatial interpolation is not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG-relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. With the analysis of their effect on the water level, it was possible to gain insight into natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions depend linearly. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions, and avoided model bias.

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show, however, that model improvement by increasing the number of WLt data seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively few data are available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, for example water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach is suited to application to any other country where similar data of target and predictor variables are available. It is important that the spatial resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if for specific regions detailed information on peat properties becomes available and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), Landkreis Gifhorn, NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support; Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support; Annette Freibauer for reviewing the paper draft; and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project “Organic soils” funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43(5), 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirkby, M. J.: A physically based variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De'ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz – Moornutzungsstrategien 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.

Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package “dismo”, version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada's forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8(5), 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models, part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package “gbm”, version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.

Tetzlaff, B., Kuhr, P., and Wendland, F.: A New Method for Creating Maps of Artificially Drained Areas in Large River Basins Based on Aerial Photographs and Geodata, Irrig. Drain., 58, 569–585, doi:10.1002/Ird.426, 2009.

Thompson, J. R., Gavin, H., Refsgaard, A., Sorenson, H. R., and Gowing, D. J.: Modelling the hydrological impacts of climate change on UK lowland wet grassland, Wetl. Ecol. Manage., 17, 503–523, doi:10.1007/s11273-008-9127-1, 2009.

UBA: National Inventory Report for the German Greenhouse Gas Inventory 1990–2008, Submission under the United Nations Framework Convention on Climate Change and the Kyoto Protocol 2012, Dessau, Germany, 2012.

van den Akker, J. J. H., Jansen, P. C., Hendriks, R. F. A., Hoving, I., and Pleijter, M.: Submerged infiltration to halve subsidence and GHG emissions of agricultural peat soils, Proceedings of the 14th International Peat Congress, Stockholm, Sweden, 2012.

van der Gaast, J. W. J., Massop, H. T. L., and Vroon, H. R. J.: Actuele grondwaterstandsituatie in natuurgebieden: Een Pilotstudie, Wettelijke Onderzoekstaken Natuur & Milieu, WOt-rapport 94, Wageningen, 134 pp., 2009.

van der Ploeg, M. J., Appels, W. M., Cirkel, D. G., Oosterwoud, M. R., Witte, J. P. M., and van der Zee, S. E. A. T. M.: Microtopography as a Driving Mechanism for Ecohydrological Processes in Shallow Groundwater Systems, Vadose Zone J., 11, 52–62, doi:10.2136/Vzj2011.0098, 2012.

Wackernagel, H.: Multivariate Geostatistics, Springer, Berlin, Germany, 387 pp., 2003.




Edited by J Liu

References

Bartholomeus R Witte J P M van Bodegom P M and AertsR The need of data harmonization to derive robust empiricalrelationships between soil conditions and vegetation J Veg Sci19 799ndash808 doi1031702008-8-18450 2008

Berglund O and Berglund K Influence of water table leveland soil properties on emissions of greenhouse gases fromcultivated peat soil Soil Biol Biochem 43 5 923ndash931doi101016jsoilbio201101002 2011

Beven K J and Kirby M A physically based variable contributingarea model of catchment hydrology Hydrol Sci Bull 24 43ndash69 1979

Bierkens M F P and Stroet C B M T Modellingnon-linear water table dynamics and specific dischargethrough landscape analysis J Hydrol 332 412ndash426doi101016jjhydrol200607011 2007

Bierkens M F P Knotters M and van Geer F C Calibra-tion of transfer function-noise models to sparsely or irregu-larly observed time series Water Resour Res 35 1741ndash1750doi1010291999wr900083 1999

Buchanan S and Triantafilis J Mapping Water Table Depth UsingGeophysical and Environmental Variables Groundwater 47 80ndash96 doi101111j1745-6584200800490x 2009

Clapcott J Young R Goodwin E Leathwick J and Kelly DRelationships between multiple land-use pressures and individ-ual and combined indicators of stream ecological integrity De-partment of Conservation DOC Research and Development se-ries 326 Wellington New Zealand 2011

Cumming G Understanding The New Statistics Routledge NewYork USA 535 pp 2012

Dersquoath G Boosted trees for ecological modeling andprediction Ecology 88 243ndash251 doi1018900012-9658(2007)88[243Btfema]20Co2 2007

Dickens W T Error components in grouped data Is it ever worthweighting Rev Econ Stat 72 328ndash333 1990

Dormann C F Elith J Bacher S Buchmann C Carl G CarreG Marquez J R G Gruber B Lafourcade B Leitao P JMunkemuller T McClean C Osborne P E Reineking BSchroder B Skidmore A K Zurell D and Lautenbach SCollinearity a review of methods to deal with it and a simula-tion study evaluating their performance Ecography 36 27ndash46doi101111j1600-0587201207348x 2013

Droumlsler M Freibauer A Adelmann W Augustin J BergmannL Beyer C Chojnicki B Foumlrster C Giebels M GoumlrlitzS Houmlper H Kantelhardt J Liebersbach H Hahn-Schoumlfl MMinke M Petschow U Pfadenhauer J Schaller L SchaumlgnerP Sommer M Thuille A and Wehrhan M Klimaschutzdurch Moorschutz in der Praxis Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz ndash Moornutzungsstrategien 2006mdash2010 vTI-Arbeitsberichte 42011 Johann Heinrich von Thuumlnen-Institut Braunschweig Germany 2011

Elith J Leathwick J R and Hastie T A working guideto boosted regression trees J Anim Ecol 77 802ndash813doi101111j1365-2656200801390x 2008

Fan Y and Miguez-Macho G A simple hydrologic framework forsimulating wetlands in climate and earth system models ClimDynam 37 253ndash278 doi101007s00382-010-0829-8 2011

wwwhydrol-earth-syst-scinet1833192014 Hydrol Earth Syst Sci 18 3319ndash3339 2014

3338 M Bechtold et al Large-scale regionalization of water table depth in peatlands

Finke P A Brus D J Bierkens M F P Hoogland T KnottersM and de Vries F Mapping groundwater dynamics using mul-tiple sources of exhaustive high resolution data Geoderma 12323ndash39 doi101016jgeoderma200401025 2004

Francis R I C C Data weighting in statistical fisheries stockassessment models Can J Fish Aquat Sci 68 1124ndash1138doi101139F2011-025 2011

Gong J N Wang K Y Kellomaki S Zhang C MartikainenP J and Shurpali N Modeling water table changes in bo-real peatlands of Finland under changing climate conditionsEcol Model 244 65ndash78 doi101016jecolmodel2012060312012

Hahn-Schoumlfl M Zak D Minke M Gelbrecht J AugustinJ and Freibauer A Organic sediment formed during inunda-tion of a degraded fen grassland emits large fluxes of CH4 andCO2 Biogeosciences 8 1539ndash1550 doi105194bg-8-1539-2011 2011

Hedges L V and Olkin I Statistical Methods for Meta-AnalysisAcademic Press Orlando USA 369 pp 1985

Hijmans R J Species distribution modeling Documentation onthe R Package ldquodismordquo version 09-3httpcranr-projectorgwebpackagesdismodismopdf(last accesse February 2014)2013

Hoogland T Heuvelink G B M and Knotters M Map-ping Water-Table Depths Over Time to Assess Desiccation ofGroundwater-Dependent Ecosystems in the Netherlands Wet-lands 30 137ndash147 doi101007s13157-009-0011-4 2010

IPCC IPCC guidelines for national greenhouse gas inventoriesedited by Eggleston H S Buendia L Miwa K and NgaraT IGES Japan 2006

Ju W M Chen J M Black T A Barr A G MccaugheyH and Roulet N T Hydrological effects on carbon cy-cles of Canadarsquos forests and wetlands Tellus B 58 16ndash30doi101111j1600-0889200500168x 2006

Knotters M and van Walsum P E V Estimating fluctua-tion quantities from time series of water-table depths usingmodels with a stochastic component J Hydrol 197 25ndash46doi101016S0022-1694(96)03278-7 1997

Leathwick J R Elith J Francis M P Hastie T andTaylor P Variation in demersal fish species richness inthe oceans surrounding New Zealand an analysis usingboosted regression trees Mar Ecol-Prog Ser 321 267ndash281doi103354Meps321267 2006

Leiber-Sauheitl K Fuszlig R Voigt C and Freibauer A HighCO2 fluxes from grassland on histic Gleysol along soil car-bon and drainage gradients Biogeosciences 11 749ndash761doi105194bg-11-749-2014 2014

Levy P E Burden A Cooper M D A Dinsmore K J DrewerJ Evans C Fowler D Gaiawyn J Gray A Jones S KJones T Mcnamara N P Mills R Ostle N Sheppard L JSkiba U Sowerby A Ward S E and Zielinski P Methaneemissions from soils synthesis and analysis of a large UK dataset Global Change Biol 18 1657ndash1669 doi101111j1365-2486201102616x 2012

Limpens J Berendse F Blodau C Canadell J G FreemanC Holden J Roulet N Rydin H and Schaepman-StrubG Peatlands and the carbon cycle from local processes toglobal implications ndash a synthesis Biogeosciences 5 1475ndash1491doi105194bg-5-1475-2008 2008

Martin M P Wattenbach M Smith P Meersmans J Jolivet CBoulonne L and Arrouays D Spatial distribution of soil or-ganic carbon stocks in France Biogeosciences 8 5 1053-1065doi105194bg-8-1053-2011 2011

Melton J R Wania R Hodson E L Poulter B Ringeval BSpahni R Bohn T Avis C A Beerling D J Chen GEliseev A V Denisov S N Hopcroft P O Lettenmaier DP Riley W J Singarayer J S Subin Z M Tian H ZuumlrcherS Brovkin V van Bodegom P M Kleinen T Yu Z Cand Kaplan J O Present state of global wetland extent andwetland methane modelling conclusions from a model inter-comparison project (WETCHIMP) Biogeosciences 10 753ndash788 doi105194bg-10-753-2013 2013

Moore T R and Dalva M The Influence of Temperature andWater-Table Position on Carbon-Dioxide and Methane Emis-sions from Laboratory Columns of Peatland Soils J Soil Sci44 651ndash664 doi101111j1365-23891993tb02330x 1993

Moore T R and Roulet N T Methane Flux ndash Water-Table Re-lations in Northern Wetlands Geophys Res Lett 20 587ndash590doi10102993gl00208 1993

Nash J E and Sutcliffe J V River flow forecasting through con-ceptual models part I ndash A discussion of principles J Hydrol 10282ndash290 doi1010160022-1694(70)90255-6 1970

Regina K Nykaumlnen H Silvola J and Martikainen P J Fluxesof nitrous oxide from boreal peatlands as affected by peatlandtype water table level and nitrification capacity Biogeochem-istry 35 401ndash418 doi101007BF02183033 1996

Ridgeway G Generalized boosted regression models Documen-tation on the R Package ldquogbmrdquo version 21httpcranr-projectorgwebpackagesgbmgbmpdf(last access February 2014)2013

Roszligkopf N Fell H and Zeitz J Organic soils in Germany theirdistribution and carbon stocks Catena in review 2014

Sutanudjaja E H van Beek L P H de Jong S M vanGeer F C and Bierkens M F P Using ERS spacebornemicrowave soil moisture observations to predict groundwaterhead in space and time Remote Sens Environ 138 172ndash188doi101016jrse201307022 2013

Tetzlaff B Kuhr P and Wendland F A New Method for CreatingMaps of Artificially Drained Areas in Large River Basins Basedon Aerial Photographs and Geodata Irrig Drain 58 569ndash585doi101002Ird426 2009

Thompson J R Gavin H Refsgaard A Sorenson H R andGowing D J Modelling the hydrological impacts of climatechange on UK lowland wet grassland Wetl Ecol Manage 17503ndash523 doi101007s11273-008-9127-1 2009

UBA National Inventory Report for the German Greenhouse GasInventory 1990ndash2008 Submission under the United NationsFramework Convention on Climate Change and the Kyoto Pro-tocol 2012 Dessau Germany 2012

van den Akker J J H Jansen P C Hendriks R F A HovingI and Pleijter M Submerged infiltration to halve subsidenceand GHG emissions of agricultural peat soils Proceedings of the14th International Peat Congress Stockholm Sweden 2012

van der Gaast J W J Massop H T L and Vroon H RJ Actuele grondwaterstandsituatie in natuurgebieden Een Pi-lotstudie Wettelijke Onderzoekstaken Natuur amp Milieu WOt-rapport 94 Wageningen 134 pp 2009

Hydrol Earth Syst Sci 18 3319ndash3339 2014 wwwhydrol-earth-syst-scinet1833192014

M Bechtold et al Large-scale regionalization of water table depth in peatlands 3339

van der Ploeg M J Appels W M Cirkel D G Oosterwoud MR Witte J P M and van der Zee S E A T M Microtopog-raphy as a Driving Mechanism for Ecohydrological Processesin Shallow Groundwater Systems Vadose Zone J 11 52ndash62doi102136Vzj20110098 2012

Wackernagel H Multivariate Geostatistics Springer Berlin Ger-many 387 pp 2003

wwwhydrol-earth-syst-scinet1833192014 Hydrol Earth Syst Sci 18 3319ndash3339 2014

Figure 12. Map of predictions of transformed annual mean water level (WLt) for all German organic soils (a) and an enlarged map section (b). The probability distribution in (c) indicates the uncertainty of a specific point prediction for wet grassland as an example. Here, the predicted value is approximately WLt = 0, but note that wet grassland predictions do vary in space depending on the values of the other model parameters. The histogram shows the residuals from cross-validation for wet grassland, to which the probability distribution was fitted.

When applying calibrated statistical models during regionalization, it is important to check model behavior for extrapolation outside the range of the parameter space that is covered by the data upon which the model was built. BRT always extrapolates at a constant value from the most extreme environmental value in the training data. In contrast to other types of statistical models, e.g., generalized linear models, BRT does not continue the fitted trend beyond the last observation. Regarding the categorical variables, the data set covers all classes occurring in Germany with several peatlands. The data set also covers the major range of values occurring in Germany for the numerical predictor variables. Furthermore, Fig. 7 indicates that the constant values at which the model extrapolates the influence of the variables do not raise major concern for any extreme predictions outside the parameter range.
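The constant extrapolation described above can be illustrated with a toy example. The following is a minimal sketch, not the BRT implementation used in the study: it boosts one-split regression trees (stumps) on synthetic, strictly linear data, and all function names, the data, and the settings (`n_trees`, `learning_rate`) are assumptions chosen for illustration only.

```python
def fit_stump(x, r):
    """Find the single split on x that minimizes the squared error of r."""
    xs = sorted(set(x))
    best = None
    for lo, hi in zip(xs, xs[1:]):
        t = 0.5 * (lo + hi)  # candidate threshold between two data values
        left = [ri for xi, ri in zip(x, r) if xi <= t]
        right = [ri for xi, ri in zip(x, r) if xi > t]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((ri - ml) ** 2 for ri in left) + sum((ri - mr) ** 2 for ri in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    return best[1:]


def fit_boosted_stumps(x, y, n_trees=200, learning_rate=0.1):
    """Toy gradient boosting for squared loss with stumps as base learners."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        t, ml, mr = fit_stump(x, resid)
        stump = (t, learning_rate * ml, learning_rate * mr)
        stumps.append(stump)
        pred = [pi + (stump[1] if xi <= t else stump[2]) for xi, pi in zip(x, pred)]
    return base, stumps


def predict(model, xq):
    base, stumps = model
    return base + sum(left if xq <= t else right for t, left, right in stumps)


# Synthetic training data: a purely linear trend on the interval [0, 5].
x_train = [i / 10 for i in range(51)]
y_train = [2.0 * xi for xi in x_train]
model = fit_boosted_stumps(x_train, y_train)

at_upper_edge = predict(model, 5.0)    # largest predictor value in the training data
far_outside = predict(model, 50.0)     # ten times beyond the training range
at_lower_edge = predict(model, 0.0)
far_below = predict(model, -10.0)
```

Because every split threshold lies inside the training range, any query beyond the range falls into the same outermost leaf of every tree, so the prediction stays at the edge value instead of following the linear trend.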

3.6 Regionalization

The map of WLt resulting from the application of the fitted WLt model to all grid cells shows gradients at the regional scale (Fig. 12a). In the south of Germany, for example, a gradient from wet to dry can be observed for the pre-alpine upland bogs and the peatlands of the moraine plain. In the north of Germany, the map indicates that organic soils in the very NE are wetter than the rest. For the rest of the north, a slight gradient from less dry to dry can be observed from NW to E, which is mainly driven by the higher summer climatic water balance in the NW. As both categorical and numerical predictor variables also vary at the sub-regional scale, the resulting map also shows gradients within peatland areas, e.g., due to small-scale land-use and ditch density gradients and topography effects (Fig. 12b).

We calculated WLt averages of the land cover classes using the regionalized WLt from the map (Table 2, column 3). The given standard deviation comprises both the variability within a land cover class that is explained by the model and the uncertainty of each prediction. The resulting means and standard deviations differ slightly from the corresponding values of the data set. The land-cover-specific WLt values obtained from the map can be considered more representative, as the regionalization procedure is supposed to partly account for potential bias in the data set.
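One plausible way to combine the two components of such a standard deviation is to add their variances. The sketch below assumes that the explained within-class variability and the prediction uncertainty are independent and that a single residual standard deviation applies to the whole class; the text does not state the exact formula that was used, and the function name and numbers are purely illustrative.

```python
import math
import statistics


def class_mean_and_sd(predictions, residual_sd):
    """Mean and total SD of WLt for one land cover class (illustrative only).

    predictions: regionalized WLt values of all grid cells in the class
    residual_sd: assumed SD of the prediction error for this class
    """
    mean = statistics.fmean(predictions)
    explained_var = statistics.pvariance(predictions)  # spread of the map values
    total_sd = math.sqrt(explained_var + residual_sd ** 2)
    return mean, total_sd


# Made-up example values, not taken from Table 2.
mean_wlt, sd_wlt = class_mean_and_sd([0.0, 0.2, 0.4], residual_sd=0.3)
```

Under these assumptions the total standard deviation is always at least as large as the residual standard deviation alone.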

When applying this map and its predicted WLt values in subsequent GHG upscaling, it is crucial that model uncertainty is propagated properly. An example demonstrates the necessity of uncertainty propagation. For a grid cell classified as wet grassland, the probability distribution of WLt is shown based on a normal distribution that was fitted to the residuals of this land cover class (Fig. 12c). Without propagating the uncertainty, i.e., when only the predicted WLt is translated into a GHG budget (possibly in combination with other parameters, e.g., soil properties), the GHG budget is strongly underestimated, because the WLt prediction is close to zero, indicating neither large CO2 nor large CH4 emissions. When the full distribution of WLt is translated into a GHG budget, the resulting GHG budget is much higher, because the GHG budget increases on both sides of the predicted WLt.
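The effect can be reproduced numerically with a small Monte Carlo experiment. The sketch below is an illustration under stated assumptions: the V-shaped function `ghg_budget` is a hypothetical stand-in for a transfer function whose budget increases on both sides of WLt = 0 (arbitrary units), and the standard deviation of 0.3 is an invented value, not the one fitted in Fig. 12c.

```python
import random
import statistics


def ghg_budget(wlt):
    """Hypothetical transfer function: budget rises on both sides of WLt = 0."""
    return abs(wlt)


def propagated_budget(mean, sd, n=100_000, seed=42):
    """Expected budget when WLt is normally distributed around the prediction."""
    rng = random.Random(seed)
    return statistics.fmean(ghg_budget(rng.gauss(mean, sd)) for _ in range(n))


point_estimate = ghg_budget(0.0)               # budget of the predicted WLt only
with_uncertainty = propagated_budget(0.0, 0.3)  # budget averaged over the distribution
```

With these assumed numbers, the point estimate is zero, whereas the budget averaged over the distribution is clearly positive. For a normal distribution centered at zero, the expected absolute value is sd · sqrt(2/π), i.e., about 0.24 here.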

3.7 Possible paths for model improvement

The model performance that is achieved by the statistical approach presented in our study raises the question of whether collecting more WL data can improve model performance, or whether the factor that is constraining the model performance is the limited strength of the nation-wide available predictor variables. To assess this question, additional “hold-out models” were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated, and model performance was evaluated with NSEcv. As expected, results indicate an increase of model performance with an increasing number of peatlands used in the model building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when many more peatlands are included in the data set. More promising would be the specific collection of more data on the weakly represented and/or important land cover classes arable land and grassland.
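The hold-out experiment can be outlined in a few lines. This is a sketch of the resampling logic only: `nse` implements the Nash–Sutcliffe efficiency (Nash and Sutcliffe, 1970), while `evaluate` is a hypothetical callback that would fit the BRT model to the dip wells of the selected peatlands and return its cross-validated NSE. The names and the signature are assumptions, not the authors' code.

```python
import random
import statistics


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 minus error variance over observed variance."""
    mean_obs = statistics.fmean(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def learning_curve(peatland_ids, evaluate, sizes, n_repeats=500, seed=0):
    """Mean and SD of the score for random subsets of k peatland areas."""
    rng = random.Random(seed)
    curve = {}
    for k in sizes:
        # Draw k peatlands without replacement, n_repeats times per subset size.
        scores = [evaluate(rng.sample(peatland_ids, k)) for _ in range(n_repeats)]
        curve[k] = (statistics.fmean(scores), statistics.stdev(scores))
    return curve
```

A call such as `learning_curve(list(range(53)), evaluate, sizes=range(10, 51, 10))` would return, for each subset size, the mean and standard deviation that correspond to the solid and dashed lines of a learning curve like the one in Fig. 13.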

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nation-wide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that was successfully conducted by Finke et al. (2004) using the transfer function-noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, e.g., spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± SD.

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data covers only a small fraction of the peatlands of the final map and thus spatial interpolation is not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG-relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. By analyzing their effects on the water level, it was possible to gain insight into natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions depend linearly. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions, and avoided model bias.

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show, however, that model improvement by increasing the number of WLt data seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively few data are available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, for example water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach can be applied to any other country where similar data of target and predictor variables are available. It is important that the spatial resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if detailed information on peat properties becomes available for specific regions and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), Landkreis Gifhorn, NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support; Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support; Annette Freibauer for reviewing the paper draft; and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project “Organic soils” funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43, 5, 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirkby, M.: A physically based variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De'ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz – Moornutzungsstrategien 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.

Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package “dismo”, version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada's forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8, 5, 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models, part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package “gbm”, version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.

Tetzlaff, B., Kuhr, P., and Wendland, F.: A New Method for Creating Maps of Artificially Drained Areas in Large River Basins Based on Aerial Photographs and Geodata, Irrig. Drain., 58, 569–585, doi:10.1002/Ird.426, 2009.

Thompson, J. R., Gavin, H., Refsgaard, A., Sorenson, H. R., and Gowing, D. J.: Modelling the hydrological impacts of climate change on UK lowland wet grassland, Wetl. Ecol. Manage., 17, 503–523, doi:10.1007/s11273-008-9127-1, 2009.

UBA: National Inventory Report for the German Greenhouse Gas Inventory 1990–2008, Submission under the United Nations Framework Convention on Climate Change and the Kyoto Protocol 2012, Dessau, Germany, 2012.

van den Akker, J. J. H., Jansen, P. C., Hendriks, R. F. A., Hoving, I., and Pleijter, M.: Submerged infiltration to halve subsidence and GHG emissions of agricultural peat soils, Proceedings of the 14th International Peat Congress, Stockholm, Sweden, 2012.

van der Gaast, J. W. J., Massop, H. T. L., and Vroon, H. R. J.: Actuele grondwaterstandsituatie in natuurgebieden: Een Pilotstudie, Wettelijke Onderzoekstaken Natuur & Milieu, WOt-rapport 94, Wageningen, 134 pp., 2009.

Hydrol Earth Syst Sci 18 3319ndash3339 2014 wwwhydrol-earth-syst-scinet1833192014

M Bechtold et al Large-scale regionalization of water table depth in peatlands 3339

van der Ploeg M J Appels W M Cirkel D G Oosterwoud MR Witte J P M and van der Zee S E A T M Microtopog-raphy as a Driving Mechanism for Ecohydrological Processesin Shallow Groundwater Systems Vadose Zone J 11 52ndash62doi102136Vzj20110098 2012

Wackernagel H Multivariate Geostatistics Springer Berlin Ger-many 387 pp 2003

wwwhydrol-earth-syst-scinet1833192014 Hydrol Earth Syst Sci 18 3319ndash3339 2014

Page 18: Large-scale regionalization of water table depth in peatlands … · 2014-09-04 · M. Bechtold et al.: Large-scale regionalization of water table depth in peatlands 3321 was to regionalize

3336 M Bechtold et al Large-scale regionalization of water table depth in peatlands

residuals of this land cover class (Fig. 12c). Without propagating the uncertainty, and when only translating the predicted WLt (possibly in combination with other parameters, e.g., soil properties) into a GHG budget, the GHG budget is strongly underestimated, as the WLt prediction is close to zero, indicating neither large CO2 nor CH4 emissions. When translating the full distribution of WLt into a GHG budget, the resulting GHG budget would be much higher, as the GHG budget increases on both sides of the predicted WLt.
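The effect of ignoring the WLt distribution can be illustrated with a minimal Python sketch. It assumes a purely hypothetical budget function that rises linearly on both sides of WLt = 0; the function `ghg_budget`, its slope parameters and the sample values are illustrative assumptions and are not taken from the study.

```python
from statistics import mean


def ghg_budget(wl_t, slope_below=1.0, slope_above=1.0):
    """Hypothetical budget (arbitrary units) that increases on both sides of WLt = 0."""
    return slope_below * max(-wl_t, 0.0) + slope_above * max(wl_t, 0.0)


def budget_from_point(samples):
    """Translate only the predicted (mean) WLt into a budget."""
    return ghg_budget(mean(samples))


def budget_from_distribution(samples):
    """Translate every WLt sample into a budget, then average."""
    return mean(ghg_budget(s) for s in samples)


# Assumed residual spread around a prediction close to zero
wl_t_samples = [-0.4, -0.2, 0.0, 0.2, 0.4]
point = budget_from_point(wl_t_samples)
full = budget_from_distribution(wl_t_samples)
```

Under these assumptions the point-based budget is zero, whereas the distribution-based budget is positive, which mirrors the underestimation described above.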

3.7 Possible paths for model improvement

The model performance achieved by the statistical approach presented in our study raises the question of whether collecting more WL data can improve model performance, or whether the factor constraining model performance is the limited strength of the nationwide available predictor variables. To assess this question, additional “hold-out models” were developed by fitting the BRT model to various random sets of data with a limited number of peatland areas (from 10 to 50 peatlands). For each number of peatland areas, 500 random selections were calibrated and model performance was evaluated with NSEcv. As expected, results indicate an increase in model performance with an increasing number of peatlands used in the model-building process (Fig. 13). Results also indicate a substantial flattening of the learning curve. Thus, further collection of WL data may only lead to a substantial model improvement when many more peatlands are included in the data set. More promising would be the specific collection of more data on the weakly represented and/or important land cover classes arable land and grassland.
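The hold-out experiment can be sketched in a few lines of Python. The sketch is illustrative only and rests on several assumptions that are not part of the study: the BRT model is replaced by a simple class-mean predictor (`fit_class_means`, a hypothetical stand-in), NSEcv is computed by leave-one-peatland-out cross-validation, and each dip well is a tuple `(peatland_id, land_cover, wl_t)`.

```python
import random
from statistics import mean, stdev


def nse(observed, predicted):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squares about the observed mean."""
    obs_mean = mean(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - sse / sst


def fit_class_means(train):
    """Stand-in model (NOT a BRT): predict the training mean of each land cover class."""
    overall = mean(y for _, _, y in train)
    by_class = {}
    for _, cover, y in train:
        by_class.setdefault(cover, []).append(y)
    class_means = {c: mean(v) for c, v in by_class.items()}
    return lambda cover: class_means.get(cover, overall)


def nse_cv(wells, peatlands):
    """Assumed scheme: leave-one-peatland-out cross-validation over the selected peatlands."""
    obs, pred = [], []
    for held_out in peatlands:
        train = [w for w in wells if w[0] in peatlands and w[0] != held_out]
        model = fit_class_means(train)
        for pid, cover, y in wells:
            if pid == held_out:
                obs.append(y)
                pred.append(model(cover))
    return nse(obs, pred)


def learning_curve(wells, sizes, n_draws=500, seed=0):
    """For each subset size, draw random peatland subsets and summarize NSEcv as (mean, sd)."""
    rng = random.Random(seed)
    all_ids = sorted({w[0] for w in wells})
    curve = {}
    for n in sizes:
        scores = [nse_cv(wells, set(rng.sample(all_ids, n))) for _ in range(n_draws)]
        curve[n] = (mean(scores), stdev(scores))
    return curve
```

Calling `learning_curve(wells, sizes=range(10, 51, 10))` would return, for each number of peatlands, the mean and standard deviation of NSEcv over the random draws, which is the kind of summary a plot of NSEcv ± sd against the number of peatlands is built from.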

Another path to achieve a stronger model is the development of new predictor variables. In the future, the availability of a more accurate DEM based on laser-scanning data, which is already available at full coverage for some federal states of Germany, may strongly increase the predictability of the observed WL data. Additionally, a nationwide map of water management and of the distribution of tile drains would have great potential to explain large parts of the residual variance and/or even allow setting up a large-scale physically based model that includes water management. Furthermore, data harmonization by extrapolating the water level time series of our data set with the climatic boundary conditions of the last 30 years may lower the unexplainable variance of the data set due to short measurement periods (Bartholomeus et al., 2008), an effort that has been successfully conducted in Finke et al. (2004) using the transfer function-noise model of Bierkens et al. (1999). Finally, we believe that the inclusion of remote sensing products in our statistical model approach, such as spaceborne microwave soil moisture observations (Sutanudjaja et al., 2013), may hold large potential to improve model performance, as moisture differences due to varying water levels are high for organic soils.

Figure 13. NSE of cross-validation vs. number of randomly selected peatland areas. Dashed lines indicate NSEcv ± sd.

4 Conclusions

Our study demonstrates the potential of statistical modeling for the regionalization of water levels in organic soils when data covers only a small fraction of the peatlands of the final map and thus spatial interpolation is not possible. With the available data set of target and predictor variables, it was possible to predict 45 % of the GHG-relevant water level variance in the data set in a cross-validation scheme. The variance is explained by nine predictor variables. By analyzing their effects on the water level, it was possible to gain insight into natural and anthropogenic boundary conditions that control water levels of organic soils in Germany.

Based on a hypothetical GHG transfer function relating GHG emissions to annual mean water levels (WL), we showed the advantages of transforming the annual mean water level into a new variable (WLt) on which GHG emissions depend linearly. The transformation improved model accuracy, increased the explained variance of the water level range that is relevant for GHG emissions, and avoided model bias.
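One way to see why a linear relation avoids bias in upscaling is the short derivation below. It is a sketch under assumptions that are not part of the study: emissions are written as a generic function of the calibrated variable $X$, the model prediction $\mu = E[X]$ is unbiased, and the residuals have variance $\sigma^2$. The coefficients $a$, $b$ and the function $f$ are placeholders, not the transfer function used here.

```latex
% Linear case (X = WL_t): the expected emission equals the emission at the prediction.
\[
  E[a + bX] = a + b\,\mu
\]
% Nonlinear case (X = WL): second-order Taylor expansion of f around \mu.
\[
  E[f(X)] \approx f(\mu) + \tfrac{1}{2}\, f''(\mu)\, \sigma^{2}
\]
```

In the nonlinear case the second term does not vanish unless $f''(\mu) = 0$, so applying $f$ to an unbiased WL prediction would generally give a biased emission estimate, while the linear case carries no such term.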

The presented approach is transparent and allows successive improvement when new input data and predictor variables become available. Our results show, however, that model improvement by increasing the number of WLt data seems to be limited. If efforts are made, data collection should be concentrated on agriculturally used organic soils, for which relatively few data are available. We believe that the constraining factor of model performance is rather the weakness of the predictor variables that are currently available at large scales. The development of new, more informative predictor variables, such as water management maps and remote sensing products, may be the more promising path for model improvement.

The proposed regionalization approach can be applied to any other country where similar data on target and predictor variables are available. It is important that the spatial resolution of the predictor variables is high enough (Finke et al., 2004). If predictor variables like land use and peatland type are only available at a much coarser scale and provided as percentages for grid cells, the dependency between predictor variables and the rather local WL will probably be lost for most of the predictor variables.

Our work must be considered as one piece of a broader framework for the regionalization of GHG emissions that includes other site characteristics and must be further developed in future research. For example, if detailed information on peat properties becomes available for specific regions and its effect on GHG emissions can be estimated by the use of multivariate transfer functions, the map of transformed water levels (WLt) can be used as an input for this follow-up regionalization.

Acknowledgements. Several institutions made their data available for this synthesis study. We gratefully thank ARGE Schwäbisches Donaumoos, Biologische Station Steinfurt, Biosphärenreservat Vessertal, BUND Diepholzer Moorniederung, Deutscher Wetterdienst (DWD), Bezirksregierung Detmold, Förderverein Feldberg-Uckermärkische Seenlandschaft, Hochschule für Wirtschaft und Umwelt Nürtingen (Institut für Angewandte Forschung), Hochschule Weihenstephan-Triesdorf (Professur für Vegetationsökologie), Humboldt Universität zu Berlin (Fachgebiet Bodenkunde und Standortlehre), Landkreis Gifhorn, LBEG Hannover (Referat Boden- und Grundwassermonitoring), LUNG Mecklenburg-Vorpommern, Eberhard Gärtner, IGB Berlin (Zentrales Chemielabor), NABU Minden-Lübbecke, Naturpark Drömling, Naturpark Erzgebirge/Vogtland, Region Hannover, Naturpark Nossentiner/Schwinzer Heide, Naturschutzfond Brandenburg, Ökologische Station Steinhuder Meer, Technische Universität München (Lehrstuhl für Vegetationsökologie), Universität Hohenheim (Institut für Bodenkunde und Standortslehre), Johannes Gutenberg-Universität Mainz (Geographisches Institut, Bodenkunde), Universität Rostock (Professur für Bodenphysik und Ressourcenschutz, Professur für Landschaftsökologie und Standortkunde, Professur für Hydrologie), Kees Vegelin, ZALF (Institut für Bodenlandschaftsforschung, Institut für Landschaftsbiogeochemie), Werner Kutsch, Christian Brümmer and Miriam Hurkuck. Furthermore, we acknowledge Maik Hunziger and Sören Gebbert for field and GRASS support; Katharina Leiber-Sauheitl, Stefan Frank, Ullrich Dettmann and René Dechow for data processing support; Annette Freibauer for reviewing the paper draft; and three anonymous reviewers for providing helpful comments during the revision process. The study was financially supported by the joint research project “Organic soils” funded by the Thünen Institute.

Edited by: J. Liu

References

Bartholomeus, R., Witte, J. P. M., van Bodegom, P. M., and Aerts, R.: The need of data harmonization to derive robust empirical relationships between soil conditions and vegetation, J. Veg. Sci., 19, 799–808, doi:10.3170/2008-8-18450, 2008.

Berglund, O. and Berglund, K.: Influence of water table level and soil properties on emissions of greenhouse gases from cultivated peat soil, Soil Biol. Biochem., 43, 5, 923–931, doi:10.1016/j.soilbio.2011.01.002, 2011.

Beven, K. J. and Kirkby, M.: A physically based variable contributing area model of catchment hydrology, Hydrol. Sci. Bull., 24, 43–69, 1979.

Bierkens, M. F. P. and Stroet, C. B. M. T.: Modelling non-linear water table dynamics and specific discharge through landscape analysis, J. Hydrol., 332, 412–426, doi:10.1016/j.jhydrol.2006.07.011, 2007.

Bierkens, M. F. P., Knotters, M., and van Geer, F. C.: Calibration of transfer function-noise models to sparsely or irregularly observed time series, Water Resour. Res., 35, 1741–1750, doi:10.1029/1999wr900083, 1999.

Buchanan, S. and Triantafilis, J.: Mapping Water Table Depth Using Geophysical and Environmental Variables, Groundwater, 47, 80–96, doi:10.1111/j.1745-6584.2008.00490.x, 2009.

Clapcott, J., Young, R., Goodwin, E., Leathwick, J., and Kelly, D.: Relationships between multiple land-use pressures and individual and combined indicators of stream ecological integrity, Department of Conservation, DOC Research and Development series 326, Wellington, New Zealand, 2011.

Cumming, G.: Understanding The New Statistics, Routledge, New York, USA, 535 pp., 2012.

De’ath, G.: Boosted trees for ecological modeling and prediction, Ecology, 88, 243–251, doi:10.1890/0012-9658(2007)88[243:Btfema]2.0.Co;2, 2007.

Dickens, W. T.: Error components in grouped data: Is it ever worth weighting?, Rev. Econ. Stat., 72, 328–333, 1990.

Dormann, C. F., Elith, J., Bacher, S., Buchmann, C., Carl, G., Carre, G., Marquez, J. R. G., Gruber, B., Lafourcade, B., Leitao, P. J., Munkemuller, T., McClean, C., Osborne, P. E., Reineking, B., Schroder, B., Skidmore, A. K., Zurell, D., and Lautenbach, S.: Collinearity: a review of methods to deal with it and a simulation study evaluating their performance, Ecography, 36, 27–46, doi:10.1111/j.1600-0587.2012.07348.x, 2013.

Drösler, M., Freibauer, A., Adelmann, W., Augustin, J., Bergmann, L., Beyer, C., Chojnicki, B., Förster, C., Giebels, M., Görlitz, S., Höper, H., Kantelhardt, J., Liebersbach, H., Hahn-Schöfl, M., Minke, M., Petschow, U., Pfadenhauer, J., Schaller, L., Schägner, P., Sommer, M., Thuille, A., and Wehrhan, M.: Klimaschutz durch Moorschutz in der Praxis, Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz – Moornutzungsstrategien 2006–2010, vTI-Arbeitsberichte 4/2011, Johann Heinrich von Thünen-Institut, Braunschweig, Germany, 2011.

Elith, J., Leathwick, J. R., and Hastie, T.: A working guide to boosted regression trees, J. Anim. Ecol., 77, 802–813, doi:10.1111/j.1365-2656.2008.01390.x, 2008.

Fan, Y. and Miguez-Macho, G.: A simple hydrologic framework for simulating wetlands in climate and earth system models, Clim. Dynam., 37, 253–278, doi:10.1007/s00382-010-0829-8, 2011.

Finke, P. A., Brus, D. J., Bierkens, M. F. P., Hoogland, T., Knotters, M., and de Vries, F.: Mapping groundwater dynamics using multiple sources of exhaustive high resolution data, Geoderma, 123, 23–39, doi:10.1016/j.geoderma.2004.01.025, 2004.

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package “dismo”, version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada’s forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8, 5, 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package “gbm”, version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.

Tetzlaff, B., Kuhr, P., and Wendland, F.: A New Method for Creating Maps of Artificially Drained Areas in Large River Basins Based on Aerial Photographs and Geodata, Irrig. Drain., 58, 569–585, doi:10.1002/Ird.426, 2009.

Thompson, J. R., Gavin, H., Refsgaard, A., Sorenson, H. R., and Gowing, D. J.: Modelling the hydrological impacts of climate change on UK lowland wet grassland, Wetl. Ecol. Manage., 17, 503–523, doi:10.1007/s11273-008-9127-1, 2009.

UBA: National Inventory Report for the German Greenhouse Gas Inventory 1990–2008, Submission under the United Nations Framework Convention on Climate Change and the Kyoto Protocol 2012, Dessau, Germany, 2012.

van den Akker, J. J. H., Jansen, P. C., Hendriks, R. F. A., Hoving, I., and Pleijter, M.: Submerged infiltration to halve subsidence and GHG emissions of agricultural peat soils, Proceedings of the 14th International Peat Congress, Stockholm, Sweden, 2012.

van der Gaast, J. W. J., Massop, H. T. L., and Vroon, H. R. J.: Actuele grondwaterstandsituatie in natuurgebieden: Een Pilotstudie, Wettelijke Onderzoekstaken Natuur & Milieu, WOt-rapport 94, Wageningen, 134 pp., 2009.

van der Ploeg, M. J., Appels, W. M., Cirkel, D. G., Oosterwoud, M. R., Witte, J. P. M., and van der Zee, S. E. A. T. M.: Microtopography as a Driving Mechanism for Ecohydrological Processes in Shallow Groundwater Systems, Vadose Zone J., 11, 52–62, doi:10.2136/Vzj2011.0098, 2012.

Wackernagel, H.: Multivariate Geostatistics, Springer, Berlin, Germany, 387 pp., 2003.

wwwhydrol-earth-syst-scinet1833192014 Hydrol Earth Syst Sci 18 3319ndash3339 2014

Page 19: Large-scale regionalization of water table depth in peatlands … · 2014-09-04 · M. Bechtold et al.: Large-scale regionalization of water table depth in peatlands 3321 was to regionalize

M Bechtold et al Large-scale regionalization of water table depth in peatlands 3337

resolution of the predictor variables is high enough (Finke etal 2004) If predictor variables like land use and peatlandtype are only available at a much coarser scale and providedas percentages for grid cells the dependency between pre-dictor variables and the rather local WL will probably be lostfor most of the predictor variables

Our work must be considered as one piece of a broaderframework for the regionalization of GHG emissions that in-cludes other site characteristics and must be further devel-oped in future research For example if for specific regionsdetailed information on peat properties becomes availableand its effect on GHG emissions can be estimated by theuse of multivariate transfer functions the map of transformedwater levels (WLt) can be used as an input for this follow-upregionalization

AcknowledgementsSeveral institutions made their data availablefor this synthesis study We gratefully thank ARGE SchwaumlbischesDonaumoos Biologische Station Steinfurt Biosphaumlrenreser-vat Vessertal BUND Diepholzer Moorniederung DeutscherWetterdienst (DWD) Bezirksregierung Detmold FoumlrdervereinFeldberg-Uckermaumlrkische Seenlandschaft Hochschule fuumlrWirtschaft und Umwelt Nuumlrtingen (Institut fuumlr AngewandteForschung) Hochschule Weihenstephan-Triesdorf (Professur fuumlrVegetationsoumlkologie) Humboldt Universitaumlt zu Berlin (FachgebietBodenkunde und Standortlehre) Landkreis Gifhorn LBEGHannover (Referat Boden- und Grundwassermonitoring) LUNGMecklenburg-Vorpommern Eberhard Gaumlrtner IGB Berlin (Zen-trales Chemielabor) Landkreis Gifhorn NABU Minden-LuumlbbeckeNaturpark Droumlmling Naturpark ErzgebirgeVogtland Region Han-nover Naturpark NossentinerSchwinzer Heide NaturschutzfondBrandenburg Oumlkologische Station Steinhuder Meer TechnischeUniversitaumlt Muumlnchen (Lehrstuhl fuumlr Vegetationsoumlkologie) Uni-versitaumlt Hohenheim (Institut fuumlr Bodenkunde und Standortslehre)Johannes Gutenberg-Universitaumlt Mainz (Geographisches InstitutBodenkunde) Universitaumlt Rostock (Professur fuumlr Bodenphysikund Ressourcenschutz Professur fuumlr Landschaftsoumlkologie undStandortkunde Professur fuumlr Hydrologie) Kees Vegelin ZALF(Institut fuumlr Bodenlandschaftsforschung Institut fuumlr Land-schaftsbiogeochemie) Werner Kutsch Christian Bruumlmmer andMiriam Hurkuck Furthermore we acknowledge Maik Hunzigerand Soumlren Gebbert for field and GRASS support Katharina Leiber-Sauheitl Stefan Frank Ullrich Dettmann and Reneacute Dechow fordata processing support Annette Freibauer for reviewing thepaper draft and three anonymous reviewers for providing helpfulcomments during the revision process The study was financiallysupported by the joint research project ldquoOrganic soilsrdquo funded bythe Thuumlnen Institute

Edited by J Liu

References

Bartholomeus R Witte J P M van Bodegom P M and AertsR The need of data harmonization to derive robust empiricalrelationships between soil conditions and vegetation J Veg Sci19 799ndash808 doi1031702008-8-18450 2008

Berglund O and Berglund K Influence of water table leveland soil properties on emissions of greenhouse gases fromcultivated peat soil Soil Biol Biochem 43 5 923ndash931doi101016jsoilbio201101002 2011

Beven K J and Kirby M A physically based variable contributingarea model of catchment hydrology Hydrol Sci Bull 24 43ndash69 1979

Bierkens M F P and Stroet C B M T Modellingnon-linear water table dynamics and specific dischargethrough landscape analysis J Hydrol 332 412ndash426doi101016jjhydrol200607011 2007

Bierkens M F P Knotters M and van Geer F C Calibra-tion of transfer function-noise models to sparsely or irregu-larly observed time series Water Resour Res 35 1741ndash1750doi1010291999wr900083 1999

Buchanan S and Triantafilis J Mapping Water Table Depth UsingGeophysical and Environmental Variables Groundwater 47 80ndash96 doi101111j1745-6584200800490x 2009

Clapcott J Young R Goodwin E Leathwick J and Kelly DRelationships between multiple land-use pressures and individ-ual and combined indicators of stream ecological integrity De-partment of Conservation DOC Research and Development se-ries 326 Wellington New Zealand 2011

Cumming G Understanding The New Statistics Routledge NewYork USA 535 pp 2012

Dersquoath G Boosted trees for ecological modeling andprediction Ecology 88 243ndash251 doi1018900012-9658(2007)88[243Btfema]20Co2 2007

Dickens W T Error components in grouped data Is it ever worthweighting Rev Econ Stat 72 328ndash333 1990

Dormann C F Elith J Bacher S Buchmann C Carl G CarreG Marquez J R G Gruber B Lafourcade B Leitao P JMunkemuller T McClean C Osborne P E Reineking BSchroder B Skidmore A K Zurell D and Lautenbach SCollinearity a review of methods to deal with it and a simula-tion study evaluating their performance Ecography 36 27ndash46doi101111j1600-0587201207348x 2013

Droumlsler M Freibauer A Adelmann W Augustin J BergmannL Beyer C Chojnicki B Foumlrster C Giebels M GoumlrlitzS Houmlper H Kantelhardt J Liebersbach H Hahn-Schoumlfl MMinke M Petschow U Pfadenhauer J Schaller L SchaumlgnerP Sommer M Thuille A and Wehrhan M Klimaschutzdurch Moorschutz in der Praxis Ergebnisse aus dem BMBF-Verbundprojekt Klimaschutz ndash Moornutzungsstrategien 2006mdash2010 vTI-Arbeitsberichte 42011 Johann Heinrich von Thuumlnen-Institut Braunschweig Germany 2011

Elith J Leathwick J R and Hastie T A working guideto boosted regression trees J Anim Ecol 77 802ndash813doi101111j1365-2656200801390x 2008

Fan Y and Miguez-Macho G A simple hydrologic framework forsimulating wetlands in climate and earth system models ClimDynam 37 253ndash278 doi101007s00382-010-0829-8 2011

wwwhydrol-earth-syst-scinet1833192014 Hydrol Earth Syst Sci 18 3319ndash3339 2014

3338 M Bechtold et al Large-scale regionalization of water table depth in peatlands

Finke P A Brus D J Bierkens M F P Hoogland T KnottersM and de Vries F Mapping groundwater dynamics using mul-tiple sources of exhaustive high resolution data Geoderma 12323ndash39 doi101016jgeoderma200401025 2004

Francis R I C C Data weighting in statistical fisheries stockassessment models Can J Fish Aquat Sci 68 1124ndash1138doi101139F2011-025 2011

Gong J N Wang K Y Kellomaki S Zhang C MartikainenP J and Shurpali N Modeling water table changes in bo-real peatlands of Finland under changing climate conditionsEcol Model 244 65ndash78 doi101016jecolmodel2012060312012

Hahn-Schoumlfl M Zak D Minke M Gelbrecht J AugustinJ and Freibauer A Organic sediment formed during inunda-tion of a degraded fen grassland emits large fluxes of CH4 andCO2 Biogeosciences 8 1539ndash1550 doi105194bg-8-1539-2011 2011

Hedges L V and Olkin I Statistical Methods for Meta-AnalysisAcademic Press Orlando USA 369 pp 1985

Hijmans R J Species distribution modeling Documentation onthe R Package ldquodismordquo version 09-3httpcranr-projectorgwebpackagesdismodismopdf(last accesse February 2014)2013

Hoogland T Heuvelink G B M and Knotters M Map-ping Water-Table Depths Over Time to Assess Desiccation ofGroundwater-Dependent Ecosystems in the Netherlands Wet-lands 30 137ndash147 doi101007s13157-009-0011-4 2010

IPCC IPCC guidelines for national greenhouse gas inventoriesedited by Eggleston H S Buendia L Miwa K and NgaraT IGES Japan 2006

Ju W M Chen J M Black T A Barr A G MccaugheyH and Roulet N T Hydrological effects on carbon cy-cles of Canadarsquos forests and wetlands Tellus B 58 16ndash30doi101111j1600-0889200500168x 2006

Knotters M and van Walsum P E V Estimating fluctua-tion quantities from time series of water-table depths usingmodels with a stochastic component J Hydrol 197 25ndash46doi101016S0022-1694(96)03278-7 1997

Leathwick J R Elith J Francis M P Hastie T andTaylor P Variation in demersal fish species richness inthe oceans surrounding New Zealand an analysis usingboosted regression trees Mar Ecol-Prog Ser 321 267ndash281doi103354Meps321267 2006

Leiber-Sauheitl K Fuszlig R Voigt C and Freibauer A HighCO2 fluxes from grassland on histic Gleysol along soil car-bon and drainage gradients Biogeosciences 11 749ndash761doi105194bg-11-749-2014 2014

Levy P E Burden A Cooper M D A Dinsmore K J DrewerJ Evans C Fowler D Gaiawyn J Gray A Jones S KJones T Mcnamara N P Mills R Ostle N Sheppard L JSkiba U Sowerby A Ward S E and Zielinski P Methaneemissions from soils synthesis and analysis of a large UK dataset Global Change Biol 18 1657ndash1669 doi101111j1365-2486201102616x 2012

Limpens J Berendse F Blodau C Canadell J G FreemanC Holden J Roulet N Rydin H and Schaepman-StrubG Peatlands and the carbon cycle from local processes toglobal implications ndash a synthesis Biogeosciences 5 1475ndash1491doi105194bg-5-1475-2008 2008

Martin M P Wattenbach M Smith P Meersmans J Jolivet CBoulonne L and Arrouays D Spatial distribution of soil or-ganic carbon stocks in France Biogeosciences 8 5 1053-1065doi105194bg-8-1053-2011 2011

Melton J R Wania R Hodson E L Poulter B Ringeval BSpahni R Bohn T Avis C A Beerling D J Chen GEliseev A V Denisov S N Hopcroft P O Lettenmaier DP Riley W J Singarayer J S Subin Z M Tian H ZuumlrcherS Brovkin V van Bodegom P M Kleinen T Yu Z Cand Kaplan J O Present state of global wetland extent andwetland methane modelling conclusions from a model inter-comparison project (WETCHIMP) Biogeosciences 10 753ndash788 doi105194bg-10-753-2013 2013

Moore T R and Dalva M The Influence of Temperature andWater-Table Position on Carbon-Dioxide and Methane Emis-sions from Laboratory Columns of Peatland Soils J Soil Sci44 651ndash664 doi101111j1365-23891993tb02330x 1993

Moore T R and Roulet N T Methane Flux ndash Water-Table Re-lations in Northern Wetlands Geophys Res Lett 20 587ndash590doi10102993gl00208 1993

Nash J E and Sutcliffe J V River flow forecasting through con-ceptual models part I ndash A discussion of principles J Hydrol 10282ndash290 doi1010160022-1694(70)90255-6 1970

Regina K Nykaumlnen H Silvola J and Martikainen P J Fluxesof nitrous oxide from boreal peatlands as affected by peatlandtype water table level and nitrification capacity Biogeochem-istry 35 401ndash418 doi101007BF02183033 1996

Ridgeway G Generalized boosted regression models Documen-tation on the R Package ldquogbmrdquo version 21httpcranr-projectorgwebpackagesgbmgbmpdf(last access February 2014)2013

Roszligkopf N Fell H and Zeitz J Organic soils in Germany theirdistribution and carbon stocks Catena in review 2014

Sutanudjaja E H van Beek L P H de Jong S M vanGeer F C and Bierkens M F P Using ERS spacebornemicrowave soil moisture observations to predict groundwaterhead in space and time Remote Sens Environ 138 172ndash188doi101016jrse201307022 2013

Tetzlaff B Kuhr P and Wendland F A New Method for CreatingMaps of Artificially Drained Areas in Large River Basins Basedon Aerial Photographs and Geodata Irrig Drain 58 569ndash585doi101002Ird426 2009

Thompson J R Gavin H Refsgaard A Sorenson H R andGowing D J Modelling the hydrological impacts of climatechange on UK lowland wet grassland Wetl Ecol Manage 17503ndash523 doi101007s11273-008-9127-1 2009

UBA National Inventory Report for the German Greenhouse GasInventory 1990ndash2008 Submission under the United NationsFramework Convention on Climate Change and the Kyoto Pro-tocol 2012 Dessau Germany 2012

van den Akker J J H Jansen P C Hendriks R F A HovingI and Pleijter M Submerged infiltration to halve subsidenceand GHG emissions of agricultural peat soils Proceedings of the14th International Peat Congress Stockholm Sweden 2012

van der Gaast J W J Massop H T L and Vroon H RJ Actuele grondwaterstandsituatie in natuurgebieden Een Pi-lotstudie Wettelijke Onderzoekstaken Natuur amp Milieu WOt-rapport 94 Wageningen 134 pp 2009

Hydrol Earth Syst Sci 18 3319ndash3339 2014 wwwhydrol-earth-syst-scinet1833192014

M Bechtold et al Large-scale regionalization of water table depth in peatlands 3339

van der Ploeg M J Appels W M Cirkel D G Oosterwoud MR Witte J P M and van der Zee S E A T M Microtopog-raphy as a Driving Mechanism for Ecohydrological Processesin Shallow Groundwater Systems Vadose Zone J 11 52ndash62doi102136Vzj20110098 2012

Wackernagel H Multivariate Geostatistics Springer Berlin Ger-many 387 pp 2003

wwwhydrol-earth-syst-scinet1833192014 Hydrol Earth Syst Sci 18 3319ndash3339 2014

Page 20: Large-scale regionalization of water table depth in peatlands … · 2014-09-04 · M. Bechtold et al.: Large-scale regionalization of water table depth in peatlands 3321 was to regionalize

3338 M Bechtold et al Large-scale regionalization of water table depth in peatlands

Finke P A Brus D J Bierkens M F P Hoogland T KnottersM and de Vries F Mapping groundwater dynamics using mul-tiple sources of exhaustive high resolution data Geoderma 12323ndash39 doi101016jgeoderma200401025 2004

Francis, R. I. C. C.: Data weighting in statistical fisheries stock assessment models, Can. J. Fish. Aquat. Sci., 68, 1124–1138, doi:10.1139/F2011-025, 2011.

Gong, J. N., Wang, K. Y., Kellomaki, S., Zhang, C., Martikainen, P. J., and Shurpali, N.: Modeling water table changes in boreal peatlands of Finland under changing climate conditions, Ecol. Model., 244, 65–78, doi:10.1016/j.ecolmodel.2012.06.031, 2012.

Hahn-Schöfl, M., Zak, D., Minke, M., Gelbrecht, J., Augustin, J., and Freibauer, A.: Organic sediment formed during inundation of a degraded fen grassland emits large fluxes of CH4 and CO2, Biogeosciences, 8, 1539–1550, doi:10.5194/bg-8-1539-2011, 2011.

Hedges, L. V. and Olkin, I.: Statistical Methods for Meta-Analysis, Academic Press, Orlando, USA, 369 pp., 1985.

Hijmans, R. J.: Species distribution modeling, Documentation on the R Package “dismo”, version 0.9-3, http://cran.r-project.org/web/packages/dismo/dismo.pdf (last access: February 2014), 2013.

Hoogland, T., Heuvelink, G. B. M., and Knotters, M.: Mapping Water-Table Depths Over Time to Assess Desiccation of Groundwater-Dependent Ecosystems in the Netherlands, Wetlands, 30, 137–147, doi:10.1007/s13157-009-0011-4, 2010.

IPCC: IPCC guidelines for national greenhouse gas inventories, edited by: Eggleston, H. S., Buendia, L., Miwa, K., and Ngara, T., IGES, Japan, 2006.

Ju, W. M., Chen, J. M., Black, T. A., Barr, A. G., Mccaughey, H., and Roulet, N. T.: Hydrological effects on carbon cycles of Canada’s forests and wetlands, Tellus B, 58, 16–30, doi:10.1111/j.1600-0889.2005.00168.x, 2006.

Knotters, M. and van Walsum, P. E. V.: Estimating fluctuation quantities from time series of water-table depths using models with a stochastic component, J. Hydrol., 197, 25–46, doi:10.1016/S0022-1694(96)03278-7, 1997.

Leathwick, J. R., Elith, J., Francis, M. P., Hastie, T., and Taylor, P.: Variation in demersal fish species richness in the oceans surrounding New Zealand: an analysis using boosted regression trees, Mar. Ecol.-Prog. Ser., 321, 267–281, doi:10.3354/Meps321267, 2006.

Leiber-Sauheitl, K., Fuß, R., Voigt, C., and Freibauer, A.: High CO2 fluxes from grassland on histic Gleysol along soil carbon and drainage gradients, Biogeosciences, 11, 749–761, doi:10.5194/bg-11-749-2014, 2014.

Levy, P. E., Burden, A., Cooper, M. D. A., Dinsmore, K. J., Drewer, J., Evans, C., Fowler, D., Gaiawyn, J., Gray, A., Jones, S. K., Jones, T., Mcnamara, N. P., Mills, R., Ostle, N., Sheppard, L. J., Skiba, U., Sowerby, A., Ward, S. E., and Zielinski, P.: Methane emissions from soils: synthesis and analysis of a large UK data set, Global Change Biol., 18, 1657–1669, doi:10.1111/j.1365-2486.2011.02616.x, 2012.

Limpens, J., Berendse, F., Blodau, C., Canadell, J. G., Freeman, C., Holden, J., Roulet, N., Rydin, H., and Schaepman-Strub, G.: Peatlands and the carbon cycle: from local processes to global implications – a synthesis, Biogeosciences, 5, 1475–1491, doi:10.5194/bg-5-1475-2008, 2008.

Martin, M. P., Wattenbach, M., Smith, P., Meersmans, J., Jolivet, C., Boulonne, L., and Arrouays, D.: Spatial distribution of soil organic carbon stocks in France, Biogeosciences, 8, 1053–1065, doi:10.5194/bg-8-1053-2011, 2011.

Melton, J. R., Wania, R., Hodson, E. L., Poulter, B., Ringeval, B., Spahni, R., Bohn, T., Avis, C. A., Beerling, D. J., Chen, G., Eliseev, A. V., Denisov, S. N., Hopcroft, P. O., Lettenmaier, D. P., Riley, W. J., Singarayer, J. S., Subin, Z. M., Tian, H., Zürcher, S., Brovkin, V., van Bodegom, P. M., Kleinen, T., Yu, Z. C., and Kaplan, J. O.: Present state of global wetland extent and wetland methane modelling: conclusions from a model inter-comparison project (WETCHIMP), Biogeosciences, 10, 753–788, doi:10.5194/bg-10-753-2013, 2013.

Moore, T. R. and Dalva, M.: The Influence of Temperature and Water-Table Position on Carbon-Dioxide and Methane Emissions from Laboratory Columns of Peatland Soils, J. Soil Sci., 44, 651–664, doi:10.1111/j.1365-2389.1993.tb02330.x, 1993.

Moore, T. R. and Roulet, N. T.: Methane Flux – Water-Table Relations in Northern Wetlands, Geophys. Res. Lett., 20, 587–590, doi:10.1029/93gl00208, 1993.

Nash, J. E. and Sutcliffe, J. V.: River flow forecasting through conceptual models, part I – A discussion of principles, J. Hydrol., 10, 282–290, doi:10.1016/0022-1694(70)90255-6, 1970.

Regina, K., Nykänen, H., Silvola, J., and Martikainen, P. J.: Fluxes of nitrous oxide from boreal peatlands as affected by peatland type, water table level and nitrification capacity, Biogeochemistry, 35, 401–418, doi:10.1007/BF02183033, 1996.

Ridgeway, G.: Generalized boosted regression models, Documentation on the R Package “gbm”, version 2.1, http://cran.r-project.org/web/packages/gbm/gbm.pdf (last access: February 2014), 2013.

Roßkopf, N., Fell, H., and Zeitz, J.: Organic soils in Germany, their distribution and carbon stocks, Catena, in review, 2014.

Sutanudjaja, E. H., van Beek, L. P. H., de Jong, S. M., van Geer, F. C., and Bierkens, M. F. P.: Using ERS spaceborne microwave soil moisture observations to predict groundwater head in space and time, Remote Sens. Environ., 138, 172–188, doi:10.1016/j.rse.2013.07.022, 2013.
