Stat Comput (2015) 25:389–405
DOI 10.1007/s11222-013-9439-8
An adaptive spatial model for precipitation data from multiple satellites over large regions
Avishek Chakraborty · Swarup De · Kenneth P. Bowman · Huiyan Sang · Marc G. Genton · Bani K. Mallick
Received: 2 December 2012 / Accepted: 20 November 2013 / Published online: 4 December 2013
© Springer Science+Business Media New York 2013
Abstract Satellite measurements have of late become an important source of information for climate features such as precipitation due to their near-global coverage. In this article, we look at a precipitation dataset during a 3-hour window over tropical South America that has information from two satellites. We develop a flexible hierarchical model to combine instantaneous rainrate measurements from those satellites while accounting for their potential heterogeneity.
The research of Bani K. Mallick and Avishek Chakraborty was supported by National Science Foundation grant DMS 0914951. Research of Marc G. Genton was partially supported by NSF grants DMS-1007504 and DMS-1100492. The research in this article was also partially supported by Award No. KUSC1-016-04 made by King Abdullah University of Science and Technology (KAUST).
A. Chakraborty (B) · H. Sang · B.K. Mallick
Department of Statistics, Texas A&M University, College Station, TX 77843-3143, USA
e-mail: [email protected]
H. Sang
e-mail: [email protected]
B.K. Mallick
e-mail: [email protected]
S. De
SAS Research & Development (India) Pvt. Ltd, Pune 411013, India
e-mail: [email protected]
K.P. Bowman
Department of Atmospheric Sciences, Texas A&M University, College Station, TX 77843-3150, USA
e-mail: [email protected]
M.G. Genton
CEMSE Division, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia
e-mail: [email protected]
Conceptually, we envision an underlying precipitation surface that influences the observed rain as well as absence of it. The surface is specified using a mean function centered at a set of knot locations, to capture the local patterns in the rainrate, combined with a residual Gaussian process to account for global correlation across sites. To improve over the commonly used pre-fixed knot choices, an efficient reversible jump scheme is used to allow the number of such knots as well as the order and support of associated polynomial terms to be chosen adaptively. To facilitate computation over a large region, a reduced rank approximation for the parent Gaussian process is employed.
Keywords Large data computation · Nonstationary spatial model · Precipitation modeling · Predictive process · Random knots · Reversible jump Markov chain Monte Carlo · Satellite measurements
1 Introduction
Algorithms to estimate atmospheric parameters from satellite measurements of upwelling radiation (Lethbridge 1967) have become invaluable for investigating global weather and climate. Satellites are routinely used to observe temperature, humidity, clouds, precipitation, aerosols and atmospheric trace constituents. Applications of satellite data include weather forecasting, climate studies, ozone depletion, drought monitoring, crop forecasting and flood warning. Data obtained from satellites are attractive because of their potential for near-global coverage and their ability to generate measurements at high spatial and temporal resolution compared to ground-based or airborne sources. Over the ocean and in sparsely populated regions of the land surface where few ground-based measurements exist, satellite observations are essential (Simpson et al. 1988; Kidd 2001).
Precipitation is critically important in terms of economic and social impacts, but it is also one of the most difficult atmospheric phenomena to observe and model. The complex physical processes and large variability of precipitation pose significant scientific challenges in modeling the atmosphere. Rain gauges are the most common technology for measuring surface rainrates. Compared to parameters such as temperature, rainfall has very short space and time correlation length scales, so high resolution data are required to resolve spatial and temporal variations (Austin and Houze 1972; Rodríguez-Iturbe and Mejía 1974). With the exception of a few small, dense research networks, the spacing between rain gauges tends to be much larger than the typical spatial scales of precipitation systems (Felgate and Read 1975). In practical terms, gauge networks do not resolve the variability of precipitation systems, particularly on smaller scales. In addition, there is little rain gauge data over the oceans and over large expanses of many continents. In contrast, due to their near-global coverage, satellites have the potential to provide precipitation estimates with high spatial resolution where gauge observations or conventional earth-bound monitoring systems are unavailable. The principal limitation of observations from low-Earth-orbiting satellites is temporal sampling. A single satellite will typically observe a given location on the Earth’s surface only a few times per day. Although satellites can provide global coverage, satellite sampling and measurement errors can be substantial, so effective methods are needed to validate and combine observations into consistent and useful estimates. A detailed discussion of sampling errors for satellite rainfall averages can be found in Bell et al. (1990, 2001), Bell and Kundu (1996) and McConnell and North (1987).
Over the last few decades many satellite-based precipitation algorithms have been developed (Wilheit 1977; Xie and Arkin 1998; Ba and Gruber 2001; Huffman et al. 2002; Joyce et al. 2004; Negri et al. 2002; Sorooshian et al. 2000; Vicente et al. 1998; Weng et al. 2003). Satellite methods can be used to generate precipitation products at various spatial and temporal resolutions. Precipitation estimation from space is most often based on observations of infrared or microwave radiation. Infrared-based techniques can have relatively high spatial and temporal resolution. Infrared radiation cannot penetrate typical dense clouds, but it is possible to measure the altitude of cloud tops. Higher cloud tops indicate deeper storms, which tend to produce larger rainfall amounts. A statistical relationship between cloud-top height and surface rain gauge data can be used to estimate surface rainrates from satellite observations. Hence, the rainrate is related only indirectly to the observed quantity (cloud top height), so uncertainties are larger. Microwave radiation, on the other hand, can penetrate through clouds. Consequently, microwave methods are more closely tied to the relevant physical quantity (falling raindrops), but spatial and temporal coverage are typically lower. Microwave methods address these problems by merging precipitation observations from multiple satellites to yield global precipitation estimates at reasonably high spatial and temporal resolution (e.g. 0.25° and a few hours) as in the Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA) described by Huffman et al. (2007). The resolution of the TMPA grid (0.25°) is comparable to the resolution of current microwave instruments. Currently, merged satellite estimates do not take into account the different statistical properties of the input data streams. These differences include variations in spatial and temporal resolution and in error characteristics. Multiple observations are usually combined in the simplest possible way by averaging the various instantaneous estimates to produce a “best estimate” of the mean rainfall rate over the selected interval (Huffman et al. 2007). Because statistical properties of rainfall are highly non-Gaussian and depend strongly on space and time scale, this averaging can significantly impact higher moments of the estimates (e.g. variances and covariances).
In this paper, our approach is to use a Bayesian hierarchical framework for combining the available observations from multiple satellites within a space-time volume. This enables us to develop the model in a way that can account for some specific characteristics of a precipitation dataset. Generally, precipitation patterns are highly localized and have fast time scales. As a result, at any given location, rainfall is absent most of the time. Hence any rainfall data, aggregated over a short time interval, is expected to have a high number of zeros. The use of mixture models with a degenerate mass at zero is a common approach to model zero-inflated data, e.g., a zero-inflated Poisson (ZIP) model (Cohen 1991). Agarwal et al. (2002) discussed the use of Bayesian methods to analyze spatially correlated zero-inflated count data in the presence of covariate information. In the context of an ecological dataset on presence of plant species, Chakraborty et al. (2010) used a spatial probit model to address a large number of absences. In the current work, we introduce a locally varying bias term for that purpose. Atmospheric convection, which drives the event of precipitation, involves turbulent interactions across a wide range of scales and is governed by fundamentally nonlinear equations of fluid dynamics. As a consequence, the information on whether it is raining at a given location or not is highly localized. The bias term we introduce here addresses this nonlinearity and local variation, unlike the linearity assumption in Fuentes et al. (2008). For geospatial datasets, an additive model with nonlinear covariate-dependence was also discussed in Kammann and Wand (2003).
For precipitation, when it rains, the probability distribution of instantaneous rainrate is non-Gaussian with a long tail. Hence, a lognormal model is often adopted for nonzero precipitation measurements, see Lee and Zawadzki (2005) and
Fuentes et al. (2008). Alternative approaches also exist in the literature such as the truncated power transformation of Bardossy and Plate (1992) and the use of skew-elliptical distributions in Marchenko and Genton (2010). Our approach is motivated by Fuentes et al. (2008), where a zero-inflated log-Gaussian model has been used for rainfall, and, in a hierarchical framework, the distribution of no rainfall events was modeled to depend on the true rainfall intensity. However, the modeling of the rainfall process in this article significantly differs from the approach therein. To jointly model true rainfall intensities at adjacent locations, Fuentes et al. (2008) used a Gaussian Markov random field with correlation parameters that do not change across locations. However, in general, the amount of rainfall over a few hours is highly localized and nonstationary on a large domain. A number of works have focussed on developing nonstationary spatial covariance functions. Spatial deformations to model nonstationary spatial processes have been used by Sampson and Guttorp (1992), Schmidt and O’Hagan (2003) and Anderes and Stein (2008). On the other hand, kernel convolution and its variants have been applied in several papers to create nonstationary covariance functions as in Higdon (1998), Fuentes (2002) and Paciorek and Schervish (2006). Jun and Stein (2008) and Jun (2011) proposed a method of modeling nonstationary covariance functions on a sphere. Nonstationarity can also be introduced through covariates. Using spatially-varying regression coefficients for point-referenced data provides scope for detecting sub-regional variation in the response-predictor relationship, see Gelfand et al. (2003). Specifically, in the context of analyzing a zero-inflated dataset like ours, Finley et al. (2011) proposed a hierarchical model where multivariate spatial process priors were used for two different sets of regression coefficients, one for controlling the abundance of zeros and the other for the nonzero observations. Here, we also work with a covariate-dependent nonstationary model but opt for an adaptive specification, as in Friedman (1991) and Denison et al. (1998). It depends on finding a set of knot locations in the predictor space and developing a local polynomial for each one of them. This approach is relatively simple to interpret and estimate and, importantly, can model non-Gaussian patterns and input interactions in the response surface. The flexibility of this specification lies in the fact that the functions can be constructed adaptively, i.e., the order and support of the local polynomials and even the number of knots are decided by the pattern of the data during model-fitting. When spatial covariates are available, the model can select the important predictors and/or interactions, eliminating the need to pre-specify the form of dependence. To modify the number of polynomial terms in the function, an efficient reversible jump Markov chain Monte Carlo (RJMCMC, Richardson and Green 1997) sampler has been developed in Sect. 4. The residual was modeled with a Gaussian process (GP) prior so as to reflect correlation across sites on a global scale.
Although not directly relevant to the modeling and data analysis in this article, we would like to mention that there is a significant collection of literature on modeling extreme precipitation events. Here the quantity of interest is the right-hand tail of the rainfall distribution. The common approach is to use extreme value theory to propose a probability distribution for the exceedance (the event that rainfall crosses a threshold value) rate at each site and relate the parameters of those distributions using spatial process models. See Cooley et al. (2007) and Sang and Gelfand (2009) for a hierarchical approach to this problem for rainfall data at point and grid level, respectively.
Another important feature of our problem is prediction at thousands of unsampled sites. As Markov random field-based models do not have a predictive property, one needs to include all such sites directly into the estimation, potentially leading to a large number of spatial random effects and slow convergence. With the spline-GP combination used in this article, prediction can be done as a post-MCMC analysis. One potential issue here is the sensitivity of GP computation to the number of locations in the dataset. There are a number of approximation techniques in the literature, such as process convolution (Higdon 2002), approximate likelihood (Stein et al. 2004), fixed rank kriging (Cressie and Johannesson 2008), covariance tapering (Furrer et al. 2006; Kaufman et al. 2008), predictive process (Banerjee et al. 2008) and very recently a combined approach involving reduced rank approximation and covariance tapering by Sang and Huang (2012); see Sun et al. (2012) for a review of available methods. We employ the fixed knot-based predictive process approximation, discussed in brief in Sect. 4.1. Finally, as with any Bayesian approach, information on uncertainty about process parameters, in addition to their pointwise estimates, can be readily obtained, which may be of significant interest with rainfall being a highly variable event.
The article is organized as follows. Measurements of precipitation around the world with multiple satellites and subsequent processing of the raw data are discussed in Sect. 2. In Sect. 3, the formulation of a Bayesian hierarchical spatial model for rainfall is introduced. Implementation details including the choice of priors, large data computation and sampling scheme are outlined in Sect. 4. The details of the simulation study and the real data analysis are presented in Sect. 5. Finally, Sect. 6 summarizes the present work and points out further related research. In this article, the notations $N(\mu, \sigma^2)$ and $LN(\mu, \sigma^2)$ have been used for denoting normal and lognormal probability densities with location $\mu$ and scale $\sigma$, respectively. $\Phi$ is used for the standard normal cumulative distribution function and $N_d(\mu, \Sigma)$ stands for the $d$-dimensional multivariate normal distribution with mean vector $\mu$ and dispersion matrix $\Sigma$.
Fig. 1 Observation swaths of different satellites in a typical 3-hour observing period. The figure is from Huffman et al. (2007). Black indicates no data. Colors indicate rainrate in mm/hr. Shades of gray indicate observations of zero rain from the different satellites. The white trajectory corresponds to the path of TMI. Observations are averaged where overlaps occur. Reproduced by permission of the American Meteorological Society (Color figure online)
2 Collection of satellite measurements on precipitation
Precipitation observations from multiple satellites can be merged to yield global precipitation estimates at reasonably high spatial and temporal resolution as is done with the Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA) described by Huffman et al. (2007). Data are collected by a variety of low earth orbit satellites, including the TRMM Microwave Imager (TMI) on the TRMM satellite, Special Sensor Microwave Imagers on the Defense Meteorological Satellite Program satellites, the Advanced Microwave Scanning Radiometer on the NASA Aqua satellite, and Advanced Microwave Sounding Units (AMSU) on National Oceanic and Atmospheric Administration (NOAA) operational satellites. These satellites vary with respect to altitude, orbital inclination, and equator crossing time. Data from multiple satellites are aggregated in 3-hour windows, which are approximately two complete orbits for low-Earth-orbiting satellites. The 3-hour time window is usual in precipitation studies (Huffman et al. 2007) as a compromise between two competing goals: higher temporal resolution of the pattern of precipitation and greater spatial coverage during each averaging window. Because of the constraints of orbital motion and instrument design, a single satellite provides limited coverage of the globe within a 3-hour window. The current suite of operational and research satellites can, together, provide nearly global coverage in a 3-hour period. Longer sampling windows would offer greater coverage but would have a lesser ability to resolve the diurnal cycle, which is an important component of precipitation variation. Figure 1 shows the available data for a typical 3-hour period. For example, the regional 3-hour precipitation dataset we used for analysis in Sect. 5 has information from two satellites, TMI and AMSU-NOAA17. Following the TMPA practice, we choose the TMI as the reference standard due to its higher spatial resolution (resulting from relatively lower altitude) and ongoing calibration with the TRMM Precipitation Radar.
A typical satellite microwave radiometer measures the upwelling microwave radiation by using a rotating antenna to scan in a conical pattern as the satellite moves above the Earth’s surface. The spatial resolution of the measurements depends on the altitude of the satellite, the size of the microwave antenna (typically about 1 m), and the wavelengths used. Rainrates are estimated using physical algorithms (Wilheit et al. 1977) based on reverse lookup tables that are precomputed using radiative transfer models. For efficiency and convenience in processing the data, within each 3-hour window the rain estimates from the instantaneous fields of view (pixels) of the satellite are first averaged onto a 0.25° × 0.25° latitude-longitude grid. In the tropics these boxes are nearly square and are approximately 28 × 28 km².
This resolution represents a reasonable compromise among the differing spatial resolutions of the various instruments and is suitable for prediction related to climatological or agricultural studies. Rain measurements are reported as instantaneous rainrates in mm/hr.
3 Hierarchical spatial model for rainrate
In this section we describe the proposed hierarchical model for the multi-satellite precipitation data. Let $D$ be the domain of observation, $T$ be the time window of study, $X(s) = (x_1(s), x_2(s), \ldots, x_p(s))^T$ be the $p$-dimensional vector of covariates measured at location $s$, and $\tilde{Y}_l(s)$ be the observed rainrate during $T$ from the $l$-th satellite at location $s \in D$ for $l = 1, 2, \ldots, L$. However, at any given $s$, the rainrate may not be available for some or all of the $L$ satellites. The observed rainrate data are modeled as a noisy version of an underlying unobserved potential rainrate process as follows:

$$
\tilde{Y}_l(s) =
\begin{cases}
0 & \text{with probability } \pi(s), \\
\exp\{c_{1l} + c_{2l} Y(s) + \varepsilon_l(s)\} & \text{with probability } 1 - \pi(s),
\end{cases}
\tag{1}
$$
where $\pi(s)$ is the probability of zero rainfall at grid cell $s$ and $\exp(Y(s))$ represents the latent potential rainrate process at location $s$. If it rains, then the rainrate observed by satellite $l$, $\tilde{Y}_l(s)$, is a noisy measurement of that latent process. The parameters $c_{1l}, c_{2l}$ are the model’s additive and multiplicative bias adjustments specific to satellite $l$. For identifiability, we need to set $c_{1l_0} = 0$, $c_{2l_0} = 1$ for some $l_0 \in \{1, 2, \ldots, L\}$, so that any inference from the model can be interpreted with satellite $l_0$ as the reference. Generally, one chooses $l_0$ to be the satellite which is known to have maximum precision in measurements. In applications where rainfall data are available from multiple sources, the one expected to have the highest accuracy can be used as the reference, e.g., rain-gauge data in Fuentes et al. (2008). The zero-mean noise $\varepsilon_l(s)$ characterizes the variations due to measurement error and/or micro-scale spatial variations for the $l$-th satellite.
We introduce the data augmentation approach (Tanner and Wong 1987; Albert and Chib 1993) to relate the zero-rainfall probability $\pi(s)$ with the latent rainrate process $Y(s)$ in a flexible way. The spatial probit model for $\pi(s)$ is as follows:

$$
\pi(s) = 1 - \Phi\{\mu_\pi(s) + \beta_\pi Y(s)\}. \tag{2}
$$
Conceptually, this amounts to modeling the zeros of rainrate measurements to correspond to the low values of the latent process $Y$. The variable intercept $\mu_\pi(\cdot)$ can be referred to as the “bias function” that accounts for the potential event of zero rainfall due to nonlinear interactions that cannot be captured linearly through $Y(s)$. If $\mu_\pi(s)$ is a constant over $D$, we have that $E\{\log \tilde{Y}_k(s) \mid \tilde{Y}_k(s) > 0\}$ and $\Phi^{-1}\{\pi(s)\}$ are linear functions of each other for any $k$, similar to the assumption made in Fuentes et al. (2008). We introduce $L$ latent surfaces $Z_1(s), Z_2(s), \ldots, Z_L(s)$ such that $Z_l(s) \overset{\text{i.i.d.}}{\sim} N(\mu_\pi(s) + \beta_\pi Y(s), 1)$, $1 \le l \le L$. Then we can rewrite (1) as

$$
\tilde{Y}_l(s) =
\begin{cases}
0 & \text{if } Z_l(s) \le 0, \\
\exp\{c_{1l} + c_{2l} Y(s) + \varepsilon_l(s)\} & \text{if } Z_l(s) > 0.
\end{cases}
\tag{3}
$$
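To make the augmented representation concrete, the following minimal sketch simulates one satellite's observations from a given latent surface according to (2) and (3). It is an illustration under stated assumptions rather than the sampler used in the article: the noise $\varepsilon_l(s)$ is taken to be independent Gaussian with a common standard deviation `sigma_eps`, and the function names and parameter values are hypothetical.

```python
import math

import numpy as np


def zero_rain_prob(y, mu_pi, beta_pi):
    """pi(s) = 1 - Phi(mu_pi + beta_pi * Y(s)), as in Eq. (2)."""
    eta = np.asarray(mu_pi + beta_pi * np.asarray(y, dtype=float))
    std_normal_cdf = 0.5 * (
        1.0 + np.vectorize(math.erf, otypes=[float])(eta / math.sqrt(2.0))
    )
    return 1.0 - std_normal_cdf


def simulate_observed_rainrate(y, mu_pi, beta_pi, c1, c2, sigma_eps, rng):
    """Draw observed rainrates for one satellite given latent Y, as in Eq. (3).

    Assumption: eps_l(s) ~ N(0, sigma_eps^2), independent across sites.
    """
    y = np.asarray(y, dtype=float)
    # Latent probit surface Z_l(s) ~ N(mu_pi + beta_pi * Y(s), 1).
    z = rng.normal(mu_pi + beta_pi * y, 1.0)
    eps = rng.normal(0.0, sigma_eps, size=y.shape)
    # Rain is recorded only where Z_l(s) > 0; otherwise the observation is zero.
    return np.where(z > 0, np.exp(c1 + c2 * y + eps), 0.0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.normal(size=10)  # stand-in latent surface at 10 sites
    obs = simulate_observed_rainrate(y, -0.5, 1.0, 0.1, 0.9, 0.3, rng)
    print(np.round(obs, 3))
```

Sites with low values of the latent process are more likely to produce zeros, which is the behaviour the probit link in (2) is meant to encode.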
Estimation of $\{Y(s) : s \in D\}$ is of prime interest in this problem. $Y(s)$ captures rainfall patterns over $D$. Since rainfall over a small time window is a highly localized event, the usual isotropic GP-based spatial models will not suffice for $Y$. There are different approaches to introduce nonstationarity as outlined in Sect. 1. The approach we take here is to specify the mean surface $\mu_y(\cdot)$ using multivariate adaptive regression splines (MARS; Friedman 1991; Denison et al. 1998). The idea is to model the function as a sum of interactions of varying order from a basis set of local polynomials as follows:
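$$
\begin{aligned}
Y(s) &= \mu_y(s) + w(s), \\
\mu_y(s) &= m_y(s) + \sum_{h=1}^{k_y} \nu_{y,h}\, \phi_{y,h}\bigl(X(s)\bigr), \\
\phi_{y,h}(x) &= \prod_{r=1}^{n_h} \bigl[u_{hr}(x_{v_{hr}} - t_{hr})\bigr]_+,
\end{aligned}
\tag{4}
$$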
+,
where my(·) represents a fixed trend function (e.g. just
anintercept or a linear or quadratic trend in components of s)and
(·)+ = max(·,0), nh is the degree of the interaction ofthe basis
function φy,h. The {uhr}, sign indicators, are ±1,the vhr gives the
index of the predictor variable which is be-ing split at the value
thr within its range. Thus, the func-tion φy,h represents a pattern
around the knot th = {thr :r = 1,2, . . . , nh} (we refer to it as
a sp-knot) and the set{φy,h(·) : h = 1,2, . . . , ky} defines an
adaptive partitioningof the multidimensional space.
Specifying $\mu_y(\cdot)$ with local interactions provides greater flexibility to model surface patterns and variable relationships. Most importantly, localized structures allow for nonstationarity. The flexibility of MARS lies in the fact that the interaction functions can be constructed adaptively, i.e., the order of interaction, knot locations, signs and even the number of such terms $k_y$ are decided by the pattern of the data during model-fitting, eliminating the need for any prior ad-hoc or empirical judgement. In spite of having a flexible
mean structure, MARS is relatively simple to fit, which is an obvious advantage over other choices of nonstationary processes in the present application. For interpretability and to avoid overfitting, interactions up to a certain order are used and only up to a prefixed number of terms may be allowed in the above sum. Any additional or higher order global pattern is accounted for by assigning a zero-mean GP prior to the residual process $w(s)$. The covariance function for $w$ is assumed to be isotropic, i.e. for two locations $s$ and $s'$, $\mathrm{cov}\{w(s), w(s')\} = \sigma_y^2 \rho\{d(s, s'), \kappa\}$, where $d(\cdot, \cdot)$ is the great circle distance for latitude-longitude data, $\rho$ is the correlation function (validity of usual correlation functions such as exponential and Matérn with respect to geodetic distance is discussed in Banerjee (2005)), and $\kappa$ is the parameter (vector) that controls the smoothness and rate of decay. The above specification implies that, for a set of $n$ locations $s = (s_1, s_2, \ldots, s_n)$, the vector $Y(s) = (Y(s_1), Y(s_2), \ldots, Y(s_n))^T$ is distributed as:

$$
Y(s) \sim N_n\bigl(\mu_y(s), \Sigma(s; \kappa)\bigr),
$$

where $\Sigma(i, i') = \sigma_y^2 \rho\{d(s_i, s_{i'}), \kappa\}$. If a priori $\nu_y \sim N_{k_y}(0, c I_{k_y})$ then marginalizing out $\nu_y$, we have:

$$
E[Y(s)] = m_y(s), \qquad V[Y(s)] = c P(s) P(s)^T + \Sigma(s, \kappa),
$$

where $P(s)$ is an $n \times k_y$ matrix with $i$-th column being $(\phi_{y,i}(s_1), \phi_{y,i}(s_2), \ldots, \phi_{y,i}(s_n))^T$. Let $\theta_h = \{(v_{hr}, t_{hr}, u_{hr}) : r = 1, 2, \ldots, n_h\}$ be the parameters in $\phi_{y,h}$ and $f(s; \theta_h) = \phi_{y,h}(s)$. Then at individual location level, we have:

$$
\begin{aligned}
\mathrm{var}[Y(s_i)] &= c \sum_{h=1}^{k_y} f^2(s_i; \theta_h) + \sigma_y^2, \\
\mathrm{cov}[Y(s_i), Y(s_{i'})] &= c \sum_{h=1}^{k_y} f(s_i; \theta_h) f(s_{i'}; \theta_h) + \sigma_y^2 \rho\{d(s_i, s_{i'}), \kappa\}.
\end{aligned}
\tag{5}
$$
This provides a very flexible model for the covariance specification of the $Y$ process that is also easy to interpret. The total covariance is decomposed into a locally varying (nonstationary) component combined with a global pattern coming from the GP. The vector $\theta_h$ controls the characteristics of the pattern associated with the $h$-th sp-knot and $f(s, \theta_h)$ represents its effect at location $s$. The resulting covariance is the sum of such local effects. In practice, one starts with a global covariance term only and then the model itself selects the local effects as necessary. The parameter $c$ represents prior confidence (uncertainty) in these effects. The bias function $\mu_\pi(\cdot)$ is also specified using MARS as above, with a separate set of parameters.
It follows from (5) that the MARS specification amounts to building the covariance model of the output process with locally-supported components. In the spatial literature, common approaches that incorporate nonstationarity by kernel mixing of process variables offer an essentially similar decomposition of covariance; see Higdon et al. (1999), Fuentes (2002) and Banerjee et al. (2004). However, MARS offers significantly greater flexibility over those methods as the shape of such effects around the knots can be decided independently of each other without being controlled by that of any chosen kernel function. Moreover, allowing each of these patterns to have its own local support encourages sparsity and avoids the complexity often associated with determining appropriate kernel bandwidth parameter(s).
Another important advantage that this specification offers is the ability to let the required number of such local effects and the associated sp-knots be entirely decided by the data, without compromising on computational complexity or interpretability. For any spatial model dependent on a set of knots, selection of an appropriate number of knots and placing them optimally over space has always been a critical issue. With too few knots, there is always a possibility of overestimating the actual spatial range and neglecting important local patterns. On the other hand, using too many knots increases the computational demand and may lead to poor predictive performance by fitting even the noise or negligible variations in the observed data. Conditional on a fixed number of knots, Gelfand et al. (2012) discussed an approach to place them optimally using a minimum predictive variance criterion. A model-based approach for random knots was introduced in Guhaniyogi et al. (2011). There, in a multi-stage structure, a point process prior was assumed for the set of knots. The intensity of that point process can either be a parametric multimodal surface or a log-Gaussian process itself. However, when the number of such knots is significant, efficiently updating them may turn out to be challenging owing to a nonstandard posterior distribution. Our specification allows for random knot selection by placing a prior on the set of observed points in the space and then, during the MCMC scheme, varies the size as well as the members of the collection of the sp-knots via addition, deletion or modification, only one at a time. Another very important feature of our knot selection procedure is that even though we are working in a $p$-dimensional predictor space, a typical sp-knot can be a location in a lower dimensional space. This can be particularly useful when the spatial process under consideration has different degrees of smoothness across coordinates. For example, an atmospheric process may exhibit significant variations with change in longitude, but may remain relatively uniform with variations in latitude (at a fixed longitude). In those situations, a more parsimonious representation can be ensured with MARS by allowing the data to position its sp-knots only over the range of longitudes. Since most knot-based spatial methods have not addressed this dimension-wise variation so far, it can provide a potentially interesting direction for future research in
this field. For our multi-stage spatial model, the sampling procedure is described in Sect. 4.
4 Details of estimation and inference
In this section, we focus on how to implement the model in Sect. 3 on a (potentially massive) precipitation dataset. Since GP computation is sensitive to the data size, we begin with a suitable approximation method to make the model capable of handling the computation. Subsequently we describe the full hierarchical model used to fit the dataset and outline the estimation procedure via MCMC. Finally, we mention some of the quantities of interest which can be estimated by post-processing the posterior samples.
4.1 Knot-based approximation for large dataset
When the number of locations inside $D$ gets large (in thousands), updating the latent spatial process parameters inside an MCMC becomes complicated due to its high-dimensional covariance matrix. We use a reduced rank representation of the original process, the predictive process, as developed in Banerjee et al. (2008). Below, we present the idea in brief.
Consider realizations from a zero-mean, unit-scale GP $w(\cdot)$ at a set of $n$ locations $s = (s_1, s_2, \ldots, s_n) \in D$ where $n$ is large. The method proceeds by first choosing $m \ll n$ locations $s^0 = (s^0_1, s^0_2, \ldots, s^0_m)$ in $D$, to be referred to as pp-knots, and then replaces $w(s)$ by an approximate process $\tilde{w}(s) = E[w(s) \mid w(s^0)] = L w(s^0)$, where the matrix $L$ is calculated from the correlation function $\rho$ of the original process $w$. $L$ depends on the correlation parameter(s) of $\rho$. If $m \ll n$, we gain in terms of computation time using the Sherman-Woodbury-Morrison (S-W-M) type formulae (Banerjee et al. 2008). However, the accuracy of the approximation goes up with increasing $m$, so there has to be a trade-off. We prefer to use this method, as it is derived directly from the parent process without any need for ad-hoc choice of basis functions, is easy to interpret and has closed form analytic expressions. This approximation is coherent with the MARS model introduced in Sect. 3, where the resulting covariance was shown to depend on a set of random sp-knots. We would like to mention that, conditional on a fixed number of pp-knots $m$, we can allow their locations to vary by using a point process prior on them as in Guhaniyogi et al. (2011). However, the hierarchical model described in Sect. 3 already includes an adaptive spatial function based on sp-knots and the GP is used only as a model for the residual process. So, in this article, we choose to work with a fixed set of pp-knots only.
We introduce bias correction, a modification discussed in Finley
et al. (2009). Since $\mathrm{var}\{w(s_j)\} > \mathrm{var}\{\tilde{w}(s_j)\}$
for each $j$, the predictive process is expected to underestimate the
spatial variance and increase the variance of the pure error. The
correction introduces a heteroscedastic independent error
$\varepsilon^*$ so that $\tilde{w}(s) = L w(s^0) + \varepsilon^*$ and
$\mathrm{var}\{\tilde{w}(s_j)\} = \mathrm{var}\{w(s_j)\}$ for any
$j = 1, 2, \ldots, n$. Introduction of a bias correction term also
facilitates computation when the GP under consideration is applied
to a latent-stage response (as ours) that needs to be updated every
iteration. The advantage of this method is illustrated in Sect. 4.3.
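As a minimal sketch of the construction in this subsection (not the implementation used for the analysis), the following pure-Python code builds the rows of $L = C_{n,m} C_m^{-1}$ for an exponential correlation function and computes the variance of the bias-correction term at each site. The one-dimensional sites, the knot positions, the value of the decay parameter and the helper names `exp_corr`, `solve` and `predictive_process` are all hypothetical choices made for illustration.

```python
import math

def exp_corr(a, b, kappa):
    """Exponential correlation exp(-kappa * |a - b|) between two 1-D sites."""
    return math.exp(-kappa * abs(a - b))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def predictive_process(sites, knots, kappa):
    """Return the rows of L = C_{n,m} C_m^{-1} and the variances of eps*."""
    C_m = [[exp_corr(a, b, kappa) for b in knots] for a in knots]
    L_rows, corr_var = [], []
    for s in sites:
        c_row = [exp_corr(s, k, kappa) for k in knots]  # row of C_{n,m}
        l_row = solve(C_m, c_row)  # C_m is symmetric, so this is c_row C_m^{-1}
        pp_var = sum(l * c for l, c in zip(l_row, c_row))  # var of L w(s0) at s
        L_rows.append(l_row)
        corr_var.append(1.0 - pp_var)  # variance restored by the correction
    return L_rows, corr_var

sites = [0.05, 0.3, 0.5, 0.77, 0.9]  # hypothetical observation sites
knots = [0.0, 0.5, 1.0]              # hypothetical pp-knots (m = 3)
L_rows, corr_var = predictive_process(sites, knots, kappa=3.0)
```

In this toy setting the site at 0.5 coincides with a pp-knot, so its correction variance is numerically zero, while sites far from every knot receive a larger correction.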
4.2 MCMC from the complete hierarchical model
We discuss parameter estimation using an MCMC scheme for the
model in Sect. 3. Let $\{s_{l1}, s_{l2}, \ldots, s_{ln_l}\}$ be the
locations in $D$ at which rainrate measurements are available from
satellite $l$, for $l = 1, 2, \ldots, L$. Let $\tilde{y}_{lj}$ and
$x_{lj}$ denote the rainrate measurement and available covariate
information, respectively, at location $s_{lj}$ by satellite $l$ for
$j = 1, 2, \ldots, n_l$. Let $s$ denote the pooled set of $n$ distinct
locations $\bigcup_{l=1}^{L} \{s_{l1}, s_{l2}, \ldots, s_{ln_l}\}$ and
$s^0$ the set of $m$ pp-knots as above. For the joint set of locations
$(s, s^0)$, partition the spatial correlation matrix as

\[
C_{n+m}(\kappa) =
\begin{pmatrix}
C_n(\kappa) & C_{n,m}(\kappa) \\
C_{m,n}(\kappa) & C_m(\kappa)
\end{pmatrix},
\]

where the entries of $C_{n+m}$ are unit-scale correlation terms under
the correlation function $\rho(\cdot, \kappa)$. From Sect. 4.1,
$L(\kappa) = C_{n,m}(\kappa) C_m^{-1}(\kappa)$. Then we can write the
full hierarchical model as follows:
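\[
\begin{aligned}
\tilde{y}_{lj} &\sim 1(z_{lj} < 0)\,\delta_0 + 1(z_{lj} > 0)\,\mathrm{LN}\bigl(c_{1l} + c_{2l}\, y(s_{lj}),\ \sigma_0^2\bigr),\\
z_{lj} &\sim N\bigl(\mu_\pi(s_{lj}) + \beta_\pi\, y(s_{lj}),\ 1\bigr),\\
y(s) &= \mu_y(s) + \sigma_y \tilde{w}(s),\\
\tilde{w}(s) &= L(\kappa)\, w(s^0) + \varepsilon^*(s),\\
w(s^0) &\sim \mathrm{GP}\bigl(0, \rho(\cdot, \kappa)\bigr),\\
\varepsilon^*(s) &\sim N_n\bigl(0,\ \mathrm{Diag}\{I_n - C_{n,m}(\kappa) C_m^{-1}(\kappa) C_{m,n}(\kappa)\}\bigr),\\
\mu_d(s) &= m_d(s) + \sum_{h=1}^{k_d} \nu_{d,h}\, \phi_{d,h}\bigl(x(s)\bigr), \qquad d = \pi, y.
\end{aligned}
\tag{6}
\]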
Regarding prior specification, we set $c_{1l_0} = 0$, $c_{2l_0} = 1$
for some $l_0$ as discussed in Sect. 3. For $l \neq l_0$ we use
Gaussian priors centered at 0 and 1, respectively. For the regression
coefficient $\beta_\pi$ and variance parameter $\sigma_0^2$, we use
normal and inverse-gamma priors, respectively, for conjugacy of
posterior distributions. We choose $\rho(\cdot, \kappa)$ to be the
exponential correlation function with decay parameter $\kappa$. It can
be easily shown that $\kappa \approx 3.0/R$ ($R$ is the spatial range,
i.e., the distance at which the correlation falls below 0.05), so
$\kappa$ can be specified using a prior idea about possible values of
$R$. An Inverse Gamma$(a_\sigma, b_\sigma)$ prior was used for the
spatial variance $\sigma_y^2$. We chose the trend function
$m_d(\cdot)$ to be a constant intercept for $d = \pi, y$ and, without
loss of generality, merge it with the MARS predictors as a constant
basis function $\phi_{d,0}(\cdot) \equiv 1$. Conditional on $k_\pi$
and $k_y$, the number of basis functions in the expansion, we assign
Gaussian priors to the coefficient vectors so that
$\nu_y \sim N_{k_y}(0, \sigma_y^2 \tau_y^2 I_{k_y})$ and
$\nu_\pi \sim N_{k_\pi}(0, \tau_\pi^2 I_{k_\pi})$. We assign inverse
gamma priors to both scale parameters $\tau_y^2$ and $\tau_\pi^2$ to
maintain conjugacy, so that we can draw them from their respective
inverse gamma posterior conditional distributions.
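The relation between the decay parameter and the spatial range can be checked directly. Assuming the exponential correlation is parametrized as $\rho(h; \kappa) = \exp(-\kappa h)$ in the distance $h$ (the text names the family but does not write the formula out here), setting the correlation at distance $R$ equal to 0.05 gives

\[
\exp(-\kappa R) = 0.05
\quad\Longleftrightarrow\quad
\kappa = \frac{-\log 0.05}{R} = \frac{\log 20}{R} \approx \frac{2.996}{R}.
\]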
For $d = \pi, y$, we can control the parsimony of the nonstationary
function $\mu_d$ in three different ways: (1) changing the prior mean
of $k_d$, (2) putting an order constraint on each $\phi_{d,h}$ and (3)
setting a fixed threshold $k_0$ for the maximum value of $k_d$.
Accounting for the column of ones corresponding to the constant basis
function, $(k_d - 1)$ is chosen to have a Poisson($\lambda_d$) prior
truncated to the right at $k_0$ to control the number of terms in the
sum. For each $\phi_{d,h}$ we can either use a strict upper bound
(e.g., allowing only functions up to second order) or choose a prior
that puts small probability on a higher-order basis function. The
value of $k_0$ should be chosen based on our idea of the variability
in the $y$-surface. $k_0 = 0$ corresponds to the most parsimonious
model, a perfectly stationary surface. Increasing $k_0$ will allow us
to capture more and more local patterns but risks overfitting. In
practical examples, the choice of $k_0$ may come from a prior idea
about the nature of variation in the surface. In the absence of any
such information, one can use a validation method, i.e., using a
subset of the data as the test set, fit the model with different
values of $k_0$ and investigate its influence on the predictive
performance.
During the MCMC, the vector of parameters has been updated in the
following blocks: (i) spline coefficients $\{\nu_{d,h}\}$, number of
basis functions $k_d$ and parameters within each $\phi_{d,h}$ for
$d = \pi, y$; (ii) latent rainrate variables $y(s)$; (iii) auxiliary
surface $z(\cdot)$; (iv) regression parameters such as $\beta_\pi$ and
$\{c_{il} : i = 1, 2,\ l = 1, 2, \ldots, L,\ l \neq l_0\}$. We rewrite
the probability distribution of $y(s)$ as
$y(s) \sim N_n(\mu_y(s) + L(\kappa) w(s^0), \Sigma_y)$, where
$\Sigma_y = \sigma_y^2\, \mathrm{Diag}\{I_n - C_{n,m}(\kappa) C_m^{-1}(\kappa) C_{m,n}(\kappa)\}$.
The resulting posterior distribution for $y$ is Gaussian and,
importantly, has independence across locations. So, it is easy to
draw $y$ even when $n$ is large. As in Albert and Chib (1993),
conditional on $y(s_{lj})$ and $\tilde{y}_{lj}$, $z_{lj}$ can also be
sampled independently across different satellites and locations:

\[
\begin{aligned}
z_{lj} \mid \tilde{y}_{lj}, y(s_{lj}), \mu_\pi, \beta_\pi
&\overset{\text{ind}}{\sim} 1(\tilde{y}_{lj} = 0)\, N_{(-\infty,0)}(\mu_{lj}, 1)
 + 1(\tilde{y}_{lj} > 0)\, N_{(0,\infty)}(\mu_{lj}, 1),\\
\mu_{lj} &= \mu_\pi(s_{lj}) + \beta_\pi\, y(s_{lj}),
\end{aligned}
\]

where $N_A(\mu, 1)$ stands for the $N(\mu, 1)$ distribution truncated
to $A \subset \mathbb{R}$.
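The truncated normal update for the auxiliary variable can be sketched with the standard library alone. The code below uses inverse-cdf sampling, which is one common way to draw from a truncated normal and is not necessarily the sampler used for the analysis; the function name `draw_z` and its arguments are hypothetical.

```python
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for its cdf and inverse cdf

def draw_z(y_tilde, mu, rng=random):
    """Draw z ~ N(mu, 1) truncated to (0, inf) if y_tilde > 0, else to (-inf, 0)."""
    # Only the negative-side truncation is sampled directly. The positive side
    # follows by symmetry: z > 0 with mean mu  <=>  -z < 0 with mean -mu.
    sign = -1.0 if y_tilde > 0 else 1.0
    m = sign * mu
    upper = _STD.cdf(-m)                       # P(N(m, 1) < 0)
    u = max(upper * rng.random(), 1e-300)      # uniform on (0, upper)
    z_neg = min(m + _STD.inv_cdf(u), 0.0)      # guard against rounding above 0
    return sign * z_neg

rng = random.Random(2008)                      # arbitrary seed for the example
z_dry = draw_z(0.0, 0.8, rng)                  # site with zero recorded rainrate
z_wet = draw_z(2.5, -0.8, rng)                 # site with positive rainrate
```

For means far in the wrong tail the uniform draw underflows and the guard returns a value at the boundary, so a production sampler would need a tail-specific method; the sketch is meant only to show the structure of the update.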
Next, we discuss updating parameters related to the posterior
distribution of $y$, i.e., $\kappa$, $\sigma_y^2$ and $\mu_y$. For
this, we first marginalize out $w(s^0)$ from the distribution of $y$.
As observed in Chib and Carlin (1999), marginalizing out the random
effects improves the mixing behavior of the MCMC. However, this leads
to a full covariance structure for $y$:

\[
\Sigma\bigl(y(s)\bigr) = \sigma_y^2 \Bigl[ C_{n,m}(\kappa) C_m^{-1}(\kappa) C_{m,n}(\kappa)
+ \mathrm{Diag}\{I_n - C_{n,m}(\kappa) C_m^{-1}(\kappa) C_{m,n}(\kappa)\} \Bigr].
\]

We can use S-W-M type matrix computations for calculating the
determinant and inverse of this covariance matrix and use them in
drawing from the posterior distributions of $\kappa$ and $\sigma_y^2$.
Posterior samples of $w(s^0)$ can be drawn afterwards from a
multivariate normal distribution, as evident from (6). Regression
parameters $c_{1l}$, $c_{2l}$ and $\beta_\pi$ can also be updated
using standard sampling steps. The most important component of this
MCMC is updating the spline parameters appearing in $\mu_y$ and
$\mu_\pi$, which has been performed using a reversible jump
(Richardson and Green 1997) scheme described below.
We start with $\mu_y$, the mean function for $y$. We drop the suffix
$y$ for notational simplicity. With $k$ basis functions, let
$\alpha_k = \{n_h, u_h, v_h, \nu_h, t_h\}_{h=1}^{k}$ be the
corresponding set of spline parameters. Marginalizing out $\nu$ and
$\sigma^2$, the distribution for $y(s)$,
$p(y(s) \mid k, \alpha_k, \ldots)$, can be written in closed form (see
Appendix). Now, using a suitable proposal distribution $q$, propose a
dimension-changing move $(k, \alpha_k) \to (k', \alpha_{k'})$. We
consider three types of possible moves: (i) birth: addition of a basis
function; (ii) death: deletion of an existing basis function; and
(iii) change: modification of an existing basis function. Thus
$k' \in \{k - 1, k, k + 1\}$. The acceptance ratio for such a move is
given by

\[
p_{k \to k'} = \min\left( 1,\
\frac{p(y(s) \mid k', \alpha_{k'}, \ldots)}{p(y(s) \mid k, \alpha_k, \ldots)}\,
\frac{p(\alpha_{k'} \mid k')\, p(k')}{p(\alpha_k \mid k)\, p(k)}
\times \frac{q\{(k', \alpha_{k'}) \to (k, \alpha_k)\}}{q\{(k, \alpha_k) \to (k', \alpha_{k'})\}} \right).
\]
First, we mention the prior for $(k, \alpha_k)$ in the form of
$p(\alpha_k \mid k)\, p(k)$. As specified above, $(k - 1)$ has a
Poisson($\lambda$) prior truncated at some upper bound $k_0$. Within
each constituent local polynomial, $n_h$ controls its order, $v_h$
controls the set of variables involved, whereas $u_h$ and $t_h$
determine the signs and the position of the sp-knot. If $p$ is the
total number of covariates and we allow interactions up to 2nd order,
then the number of possible choices for a basis function (excluding
the constant function) is
$N = p + p + \binom{p}{2} = \frac{p^2 + 3p}{2}$. Accounting for
rearrangement of the same set of basis functions, the number of
distinct $k$-set basis functions is $N^k / k!$. Once we determine a
basis function, we choose the individual coordinates of that sp-knot
uniformly from the available data points (since a change in pattern
can only be detected at data points) and determine its sign to be
positive or negative with probability 1/2 each. Thus we obtain:

\[
p(\alpha_k \mid k) \propto \frac{N^k}{k!} \left( \frac{1}{2n} \right)^{\sum_{h}^{k} n_h}.
\]

Although above we assumed that all covariates have $n$ distinct
values to locate an sp-knot, modifications can be made easily when
this is not the case.
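The count $N$ can be verified by enumeration. The sketch below reads $N = p + p + \binom{p}{2}$ as first-order terms, second-order terms in a single variable and second-order interactions between two distinct variables; that reading, the function name `candidate_basis_functions` and the value of `p` are assumptions made for the example.

```python
from itertools import combinations_with_replacement
from math import comb

def candidate_basis_functions(p, max_order=2):
    """List the variable index sets of nonconstant basis functions up to max_order."""
    out = []
    for order in range(1, max_order + 1):
        # multisets of size `order`, so a variable may be paired with itself
        out.extend(combinations_with_replacement(range(p), order))
    return out

p = 4                                   # hypothetical number of covariates
cands = candidate_basis_functions(p)
N = len(cands)                          # agrees with (p**2 + 3*p) // 2
```

With four covariates, as in the simulation design of Sect. 5.1, the enumeration returns 14 candidates, matching $(p^2 + 3p)/2$.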
Next we specify the proposal distribution $q(\cdot, \cdot)$ for each
of the three moves as follows:

(i) First decide on the type of move to be proposed with
probabilities $b_k$ (birth), $d_k$ (death) and $c_k$ (change),
$b_k + d_k + c_k = 1$. We put $d_k = 0$, $c_k = 0$ if $k = 1$, and
$b_k = 0$ if $k = k_0$.

(ii) For a birth move, choose a new basis function randomly from the
$N$-set, calculate its order $n_h$ and choose the location of its
sp-knot and signs as before with probability $(1/(2n))^{n_h}$.

(iii) The death move is performed by randomly removing one of the
$k - 1$ existing basis functions (excluding the constant basis
function).

(iv) A change move consists of choosing an existing nonconstant
basis function randomly and altering its sign and corresponding
sp-knot.
From above, we have

\[
q\{(k, \alpha_k) \to (k', \alpha_{k'})\} =
\begin{cases}
b_k\, \dfrac{1}{N} \left( \dfrac{1}{2n} \right)^{n_{k+1}}, & k' = k + 1,\\[2ex]
d_k\, \dfrac{1}{k - 1}, & k' = k - 1,\\[2ex]
c_k\, \dfrac{1}{k - 1} \left( \dfrac{1}{2n} \right)^{n_h}, & k' = k.
\end{cases}
\]

Above, for the 'change' step, $h$ denotes the index of the basis
function that has been randomly chosen for change. The acceptance
ratios for different types of move can be worked out from this. Set
$k = k'$, $\alpha_k = \alpha_{k'}$ if the move is accepted; leave them
unchanged otherwise. Subsequently, $\nu_y$ can be updated using the
$k$-variate $t$ distribution with degrees of freedom
$d = n + 2a_\sigma$, mean $\mu_k$ and dispersion $c_{0k}\Sigma_k / d$,
whose expressions are given (with derivation) in the Appendix. The
updating scheme for the spline parameters in $\mu_\pi$ is similar,
except that we need to marginalize over $\nu_\pi$ only, as $z$ has a
known variance (set to 1) as in (6).
4.3 Posterior inference
The principal objective of this data analysis is to understand the
precipitation pattern over the region $D$. For that, we create
spatial maps of (i) expected rainrate
$\{\pi(s) \exp\{Y(s)\} : s \in D\}$; and (ii) probability of rainfall
$\{\pi(s) : s \in D\}$ using the posterior samples. There can be
locations inside $D$ with no available measurements from any of the
$L$ satellites, which therefore have no likelihood contribution. The
realizations of the rainrate process $Y$ at those locations are
constructed using the predictive distributions of a GP. Since Gaussian
processes can capture a wide range of dependencies, using them in a
hierarchical setting enhances predictive performance for the model.
Mathematically, if only $n$ out of $N$ sites have at least one
satellite measurement, then inference for the remaining $(N - n)$
sites is done from their posterior predictive distributions. If
$s_p = \{s_{n+1}, s_{n+2}, \ldots, s_N\}$ denotes the set of locations
with no precipitation data, the foregoing predictive process
approximation yields

\[
\begin{aligned}
y(s_p) &= \mu_y(s_p) + \sigma_y \bigl\{ C_{N-n,m}(\kappa) C_m^{-1}(\kappa)\, w(s^0) + \varepsilon^*(s_p) \bigr\},\\
\varepsilon^*(s_p) &\sim N_{N-n}\bigl(0,\ \mathrm{Diag}\{I_{N-n} - C_{N-n,m}(\kappa) C_m^{-1}(\kappa) C_{m,N-n}(\kappa)\}\bigr).
\end{aligned}
\tag{7}
\]
As we mentioned earlier, the advantage of using bias correction is
evident here: conditional on $w(s^0)$ and $\kappa$, we can draw
samples from the posterior predictive distribution of $Y(s_p)$
independently of each other and also of $Y(s)$, owing to the
independence among the $\varepsilon^*$ terms across locations. This is
computationally very efficient, since we do not need to draw from a
high-dimensional multivariate Gaussian distribution if we want to
study a larger region and need to predict the rainrate surface at
thousands of sites with no satellite readings. Also of interest is the
posterior estimate of $\{\mu_y(s) : s \in D\}$, which provides an idea
of localized patterns in the rainrate over $D$. All these diagnostics
are provided with the real data analysis in Sect. 5.
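The site-by-site character of the predictive draw in (7) can be made concrete with a short sketch. It assumes the rows of $C_{N-n,m} C_m^{-1}$ and of $C_{N-n,m}$ have already been computed for the current value of the correlation parameter; the function name `draw_unobserved` and all inputs are hypothetical.

```python
import math
import random

def draw_unobserved(mu_p, sigma_y, L_p, C_pm, w0, rng=random):
    """One draw of y at the unobserved sites, given w(s0), following (7).

    mu_p: mean mu_y at each unobserved site
    L_p:  rows of C_{N-n,m} C_m^{-1};  C_pm: rows of C_{N-n,m}
    w0:   current value of the process at the pp-knots
    """
    draws = []
    for mu, l_row, c_row in zip(mu_p, L_p, C_pm):
        mean_w = sum(l * w for l, w in zip(l_row, w0))
        # variance of the bias-correction term, floored at 0 against rounding
        var_eps = max(0.0, 1.0 - sum(l * c for l, c in zip(l_row, c_row)))
        eps = rng.gauss(0.0, math.sqrt(var_eps))   # independent across sites
        draws.append(mu + sigma_y * (mean_w + eps))
    return draws

# A single hypothetical site that coincides with the second of three knots.
y_new = draw_unobserved(mu_p=[2.0], sigma_y=1.5,
                        L_p=[[0.0, 1.0, 0.0]], C_pm=[[0.2231, 1.0, 0.2231]],
                        w0=[0.3, -1.2, 0.5], rng=random.Random(1))
```

Each site needs only an inner product of length $m$ and one univariate normal draw, so the cost grows linearly in the number of unobserved sites.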
5 Data analysis
We proceed to the application of the variable-knot approach
described in Sects. 3 and 4. First, in Sect. 5.1 we carry out
simulation studies to highlight the improvement in predictive
performance under the proposed method relative to fixed-knot
predictive process models. Then, in Sect. 5.2, we analyze an actual
precipitation dataset from northern South America.
5.1 Simulation study
In the discussion following (4) and (5), we have argued that one of
the key advantages of a MARS-based covariance model is its ability to
capture a wide range of spatial structures in the response surface.
To justify that numerically, we use synthetic datasets from two
different models.

For simulation, we fix the input space $X$ to be the unit hypercube
in $\mathbb{R}^4$. The input points $x = (x_1, x_2, x_3, x_4)^T$ are
drawn uniformly over $X$ and the response, $Y(x)$, is simulated from
a Gaussian process with a nonstationary covariance function as
follows:

\[
E[Y(x)] = \beta_0; \qquad
\mathrm{Cov}[Y(x), Y(x')] = \sigma_0^2\, I_{x = x'} + x^T \Omega_0\, x',
\tag{A}
\]
Table 1 Comparison of predictive performance for different spatial models

                                 Model for estimation
Simulation  Prediction          MARS w/    Predictive process with
model       property            pp-knots   same # pp-knots   2× # pp-knots   3× # pp-knots
(A)         Abs. Bias           0.099      0.104             0.107           0.112
            Pred. Uncertainty   0.570      0.626             0.603           0.603
            Coverage Propn.     92.100     91.500            91.000          90.600
(B)         Abs. Bias           0.048      0.271             0.195           0.137
            Pred. Uncertainty   0.539      1.937             1.536           1.374
            Coverage Propn.     96.200     93.500            93.800          94.500
where $I_A = 1$ if $A$ is true, 0 otherwise. For any $\sigma_0^2 > 0$
and any positive definite matrix $\Omega_0$, it is easy to verify that
the above is a valid covariance function.
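The verification can be sketched in one line. For any distinct inputs $x_1, \ldots, x_r$ in the unit hypercube and any real weights $a_1, \ldots, a_r$ (symbols introduced here only for the argument),

\[
\sum_{i=1}^{r} \sum_{j=1}^{r} a_i a_j\, \mathrm{Cov}[Y(x_i), Y(x_j)]
= \sigma_0^2 \sum_{i=1}^{r} a_i^2
+ \Bigl( \sum_{i=1}^{r} a_i x_i \Bigr)^T \Omega_0 \Bigl( \sum_{j=1}^{r} a_j x_j \Bigr) \ \ge\ 0,
\]

since the first term is a sum of squares and the second is a quadratic form in a positive definite matrix. Symmetry in the two arguments follows from the symmetry of $\Omega_0$.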
In the second example, keeping $X$ unchanged, we move to a more
general functional form for $Y(x)$ as follows:

\[
Y(x) = \beta_0 + \beta_1 x_1^5 + \log(1 + x_2^2) + \beta_2 x_3 \sin(\pi x_1^2) + I_{x_4 \ldots}
\]
Fig. 2 (Left) Map of the study region showing the number of
satellite measurements available across different sites during 1.30
a.m.–4.30 a.m. on January 1st, 2008. (Right) Location of the study
region in South America
America (Fig. 2) lying south of the equator, inside the rectangle
[70°W, 40°W] × [20°S, 0°]. This region is chosen because of its large
average rainfall rate, which helps to reduce sampling errors, and its
regular diurnal variability. A long-term mean rainrate pattern over 13
years for this region is presented in Fig. 3. In the current work, we
select a 3-hour time window, 1.30 a.m.–4.30 a.m. on January 1st, 2008.
Data are available from two satellites, TMI and AMSU-NOAA17. Other
satellites did not make any observation in this region during the
specified time interval. Ground-based observations by gauges or radar
are very sparse in the Amazon basin. From Fig. 2, it is evident that
some of the sites have multiple rainrate measurements, as they fell
inside the intersection of the trajectories of both satellites during
that time window, whereas some parts of the region were not covered by
either satellite. Thus, estimation of rainrate at observed locations,
from either one or both of the satellites, as well as prediction at
unmapped sites are necessary to create a complete precipitation map
for this region.
We start with some empirical summaries of the raw data. Rainrate
measurements from both of the satellites are available at 2026 of a
total of 9600 sites in the region, whereas 2305 of them have no
available data. During the time window, TMI was able to cover 5653
sites whereas AMSU-NOAA17 covered 3668 sites. About 17 % of all
satellite measurements recorded positive precipitation, the rest being
all zero. Figure 4 provides a comprehensive representation of the
precipitation measurements collected during that time interval from
the above region.
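The reported coverage counts are mutually consistent, which can be confirmed by inclusion-exclusion. The snippet below is only an arithmetic check on the figures quoted in the text; the variable names are ours.

```python
total_sites = 9600
tmi, amsu = 5653, 3668          # sites covered by each satellite
both, none = 2026, 2305         # sites seen by both, and by neither

covered = tmi + amsu - both     # sites with at least one measurement: 7295
n_measurements = tmi + amsu     # total satellite readings in the window
```

The number of covered sites plus the number of unobserved sites recovers the 9600 sites of the study region.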
We analyze the data using the model from (6). We use diffuse prior
specifications for model parameters. Regression coefficients such as
$\beta_\pi$, $c_{11}$ and $c_{21}$ are assigned a $N(0, 100)$ prior.
For variance parameters, we use an Inverse Gamma(2, 4) prior. For the
sp-knots in $\mu_y$ and $\mu_\pi$, we allow each of them to have at
most $k_0 = 30$ nonconstant local functions, i.e., each of $k_y$ and
$k_\pi$ is assigned a 1 + Poisson(4) prior truncated to [1, 31]. As we
show later, this range turns out to be adequate for our dataset. We
consider local polynomials of only up to second order, i.e., $n_h$ has
a uniform prior on {1, 2}. We select $m = 100$ locations within the
region as the pp-knots. For the correlation parameter $\kappa$, we
first select a possible set of values for the spatial range $R$ and
use a uniform distribution over equidistant points in that set. As
mentioned in Sect. 4.2, this leads to a discrete uniform prior for
$\kappa$. The MCMC is run for 15000 iterations, discarding the first
5000 draws and thinning the rest at every 5th draw.
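The MCMC settings and the induced prior on the decay parameter can be written out in a few lines. The grid of range values below is purely hypothetical, since the text does not list the values used; only the iteration counts come from the text.

```python
n_iter, burn_in, thin = 15000, 5000, 5
kept = range(burn_in, n_iter, thin)      # indices of the retained draws
n_kept = len(kept)                       # 2000 retained draws

# Discrete uniform prior on kappa induced by equidistant range values,
# using kappa = 3 / R. The grid itself is a made-up example.
R_grid = [1.0 + 0.5 * i for i in range(9)]
kappa_grid = [3.0 / R for R in R_grid]
prior_kappa = {kappa: 1.0 / len(kappa_grid) for kappa in kappa_grid}
```

The 2000 retained draws agree with the number of thinned samples used for the validation exercise.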
Before providing the posterior summaries for precipitation
patterns, we perform a validation step by comparing model-based
predictions with the corresponding true observations. For that, we
randomly remove a “test” set that consists of 130 and 122 sites from
the region with positive precipitation records from TMI and
AMSU-NOAA17, respectively. We treat those sites as having no available
measurement, predict the latent process $Y$ as in Sect. 4.3 and,
tracing back the hierarchy in (6), regenerate the samples for the
“observed” rainrates $\tilde{Y}$ at each of those locations for each
of the satellites using 2000 thinned MC samples of model parameters.
The samples are summarized in the form of a predictive mean and
credible set. As with the simulation studies, we compute three
measures of predictive performance: (i) absolute bias, (ii) predictive
uncertainty and (iii) coverage status for every $\tilde{Y}$ in the
test set. In Table 2 we present, for each
Fig. 3 Climatological mean surface rainrate for January for the
period 1998–2010 from the TRMM Multisatellite Precipitation Analysis
(Huffman et al. 2007). The domain for this study is indicated by the
black rectangle box. The mean rainrate field is relatively smooth
across the study domain, except in the southwest corner, where
orographic effects from the Andes Mountains are apparent (Color
figure online)

Fig. 4 Grid-level rainrate measurements from (top) TMI and
(bottom) AMSU-NOAA17 satellites during [1:30, 4:30] a.m. on January
1, 2008 (Color figure online)
satellite, the summary of these measures: the mean absolute bias,
the mean predictive uncertainty and the coverage proportion.
The results turn out to be satisfactory, as the empirical coverage
rates for both satellites exceed 89 %. As expected, the TMI
measurements have a markedly smaller mean absolute bias and reduced
uncertainty of prediction compared with those of AMSU-NOAA17, because
the former has been used as the reference standard for this data
analysis. Now, we provide posterior summary statistics for some
important model parameters in Table 3.
Table 2 Predictive performance of both satellites on test dataset

Satellite     Mean absolute bias   Mean predictive uncertainty   Coverage proportion
              (in mm/hr)           (in mm/hr)                    for 90 % credible sets
TMI           1.039                3.765                         0.892
AMSU-NOAA17   1.697                5.738                         0.893
Combined      1.357                4.720                         0.893
The numbers of sp-knots for the latent log-rainrate process as well
as for the probability of no precipitation are well within their
assigned range of [1, 31]. The multiplicative factor for AMSU-NOAA17,
$c_2$, is marginally above 1. The effect of the latent log-rainrate
process $Y$ on the event of rainfall is parametrized by $\beta_\pi$
and turns out to be significant. Next, we look at the spatial maps for
rainrate summaries. First, we present the pointwise estimates for the
$\exp[Y]$ and $\pi$ surfaces in Fig. 5. Then, the posterior mean and
uncertainty estimates (90 % credible set width) of the expected
rainrate process, as defined in Sect. 4.3, are included in
Table 3 Posterior summaries of important model parameters

Parameter type            Parameter name   Point estimate   90 % posterior credible interval
Latent process-specific   k_y              21               [17, 26]
                          k_π              8                [5, 12]
                          β_π              1.392            [1.271, 1.582]
AMSU-NOAA17-specific      c_1              0.332            [0.245, 0.429]
                          c_2              1.090            [0.999, 1.187]
Fig. 5 Posterior surface estimates of (top) probability of rainfall
π and (bottom) potential rainrate exp[Y] (Color figure online)
Fig. 6 Posterior estimated surfaces for (top) expected rainrate and
(bottom) its uncertainty (width of pointwise 90 % credible sets)
(Color figure online)
Fig. 6. To highlight the contribution of sp-knot based functions, we
present the posterior maps of $\exp\{\mu_y(\cdot)\}$ and $\mu_\pi$ in
Fig. 7.
Precipitation is mostly concentrated in the west-central part of the
region and decreases as one moves towards the ocean in the east. In
Fig. 5, the probability map shows a higher chance of observing
precipitation in the central part of the region as a result of the
smoothing effect of high rainfall observations in the surrounding
regions from both satellites. Figure 6 shows patches of the region
with relatively higher expected rainrate. The highs and lows of the
uncertainty estimates are often related to those of the corresponding
point estimates. This is a natural property of the log-Gaussian model
(and of other models such as the Gamma) due to the interdependence
between mean and variance. The uncertainty estimates are reflective of
the typically high variability (as well as lack of spatial smoothness)
of precipitation over a short time window. Finally, Fig. 7 shows bands
of piecewise homogeneous regions created by the collection of sp-knots
and associated local polynomials. The estimated surface for
$\exp[\mu_y]$ contains a relatively higher number of localized
patterns than the surface for $\mu_\pi$. This is consistent with
Table 3, which shows that the posterior probability mass function for
$k_y$ puts greater weight on larger values within the [1, 31] range
than the one for $k_\pi$.
6 Summary and future work
In this article, we presented a novel hierarchical model to combine
precipitation measurements from multiple satellites. To capture a wide
range of localized as well as large-scale spatial patterns in the
underlying potential rainrate process, a flexible random knot-based
mean function has been used in combination with a stationary residual
in the log scale. The method was adjusted to handle a large number of
observation locations using a predictive process approximation, making
it applicable to studies involving larger regions. However, a larger
domain is likely to contain heterogeneous subregions, e.g., land-sea
boundaries or regions separated by mountain stretches, that may
experience different rainfall patterns. Whereas the MARS
specification employed in this work can be really useful for these
situations (since it does not make assumptions regarding any global
pattern of correlation in the response surface), it is worth
exploring alternative modeling ideas that can account
Fig. 7 Maps of sp-knot based surfaces: (top) exp[μ_y] and (bottom)
μ_π constructed from their respective posterior samples (Color figure
online)
for boundary effects between nonhomogeneous regions. Another
extension lies in extending the 3-hr window to a larger time interval,
such as a day or a week, which brings a temporal pattern to the data.
The associated spatio-temporal process can be developed as an
extension to the current spatial model for $Y$ in a number of
different ways. Limitations of such specifications with respect to
restrictive model assumptions (e.g., separability across space and
time), computational load or intuitive interpretability need to be
compared before preferring one of them over the others. Another
important source of information that may significantly improve
rainfall prediction is associated covariate data. Covariates can be of
different types: (i) climate features such as temperature, wind speed,
wind direction etc., (ii) geographic information such as elevation and
(iii) measures of human intervention, e.g., forest cover, emission
rates of pollutants etc., which are believed to influence rainfall in
the long run. Inclusion of appropriate local covariate information is
often useful to explain the nonstationarity, thus eliminating the need
for complex models.
Acknowledgements The authors acknowledge the Texas A&M University
Brazos HPC cluster that contributed to the research reported here
(http://brazos.tamu.edu).
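Appendix: Marginalizing out $\nu_y$ and $\sigma_y^2$ for estimation
of spline parameters in $\mu_y(s)$

Denote by $\ldots$ all parameters except $\nu_y, \sigma_y^2$. Let
$P = [\phi_1[x(s)], \phi_2[x(s)], \ldots, \phi_k[x(s)]]$ and
$S = y(s) - P\nu_y$. We have

\[
\begin{aligned}
p(y(s) \mid \ldots)
&\propto \int_{\nu_y} \int_{\sigma_y^2}
 p\bigl(y(s) \mid \nu_y, \sigma_y^2, \ldots\bigr)\,
 p\bigl(\nu_y \mid \sigma_y^2\bigr)\, p\bigl(\sigma_y^2\bigr)\, d\sigma_y^2\, d\nu_y\\
&\propto (2\pi\tau_y^2)^{-k/2} \int_{\nu_y} \int_{\sigma_y^2}
 \bigl(\sigma_y^2\bigr)^{-\frac{n+k}{2} - a_\sigma - 1}
 \exp\Bigl[ -\frac{1}{2\sigma_y^2} \bigl( S^T D^{-1} S + \nu_y^T \nu_y / \tau_y^2 + 2b_\sigma \bigr) \Bigr]
 d\sigma_y^2\, d\nu_y\\
&\propto (2\pi\tau_y^2)^{-k/2}\, \Gamma\Bigl( \frac{n+k}{2} + a_\sigma \Bigr)
 \int_{\nu_y} \Bigl( \frac{S^T D^{-1} S + \nu_y^T \nu_y / \tau_y^2}{2} + b_\sigma \Bigr)^{-\frac{n+k}{2} - a_\sigma} d\nu_y.
\end{aligned}
\]

Now write
$S^T D^{-1} S + \nu_y^T \nu_y / \tau_y^2 = \nu_y^T A \nu_y - 2\nu_y^T B + C$,
where $A = P^T D^{-1} P + I_k / \tau_y^2$, $B = P^T D^{-1} S_y$ and
$C = S_y^T D^{-1} S_y$. Then we have
$S^T D^{-1} S + \nu_y^T \nu_y / \tau_y^2 + 2b_\sigma = (\nu_y - \mu_k)^T \Sigma_k^{-1} (\nu_y - \mu_k) + c_{0k}$,
where $\mu_k = A^{-1} B$, $\Sigma_k = A^{-1}$ and
$c_{0k} = C - B^T A^{-1} B + 2b_\sigma$. Denote $d = n + 2a_\sigma$.
Then

\[
p(y(s) \mid \ldots) \propto (\pi\tau_y^2)^{-k/2}\, c_{0k}^{-\frac{d+k}{2}}\,
\Gamma\Bigl( \frac{d+k}{2} \Bigr)
\int_{\nu_y} \Bigl[ \frac{1}{d} (\nu_y - \mu_k)^T \Bigl( \frac{c_{0k} \Sigma_k}{d} \Bigr)^{-1} (\nu_y - \mu_k) + 1 \Bigr]^{-\frac{d+k}{2}} d\nu_y.
\]

The integrand is the pdf (up to a constant) of the $k$-variate $t$
distribution with mean $\mu_k$, dispersion $c_{0k}\Sigma_k / d$ and
degrees of freedom $d$. Hence, we obtain the closed-form expression

\[
p(y(s) \mid \ldots) \propto (\tau_y^2)^{-k/2}\, c_{0k}^{-d/2}\, |\Sigma_k|^{1/2}.
\]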
References
Agarwal, D.K., Gelfand, A.E., Citron-Pousty, S.: Zero-inflated models with application to spatial count data. Environ. Ecol. Stat. 9, 341–355 (2002)
Albert, J.H., Chib, S.: Bayesian analysis of binary and polychotomous response data. J. Am. Stat. Assoc. 88(422), 669–679 (1993)
Anderes, E.B., Stein, M.L.: Estimating deformations of isotropic Gaussian random fields on the plane. Ann. Stat. 36, 719–741 (2008)
Austin, P.M., Houze, R.A.: Analysis of the structure of precipitation patterns in New England. J. Appl. Meteorol. 11, 926–935 (1972)
Ba, M.B., Gruber, A.: GOES multispectral rainfall algorithm (GMSRA). J. Appl. Meteorol. 40, 1500–1514 (2001)
Banerjee, S.: On geodetic distance computations in spatial modeling. Biometrics 61(2), 617–625 (2005)
Banerjee, S., Gelfand, A.E., Knight, J.R., Sirmans, C.F.: Spatial modeling of house prices using normalized distance-weighted sums of stationary processes. J. Bus. Econ. Stat. 22(2), 206–213 (2004)
Banerjee, S., Gelfand, A., Finley, A., Sang, H.: Gaussian predictive process models for large spatial data sets. J. R. Stat. Soc. B 70(4), 825–848 (2008)
Bardossy, A., Plate, E.J.: Space-time model for daily rainfall using atmospheric circulation patterns. Water Resour. Res. 28(5), 1247–1259 (1992)
Bell, T.L., Kundu, P.K.: A study of the sampling error in satellite rainfall estimates using optimal averaging of data and a stochastic model. J. Climate 9, 1251–1268 (1996)
Bell, T.L., Abdullah, A., Martin, R.L., North, G.R.: Sampling errors for satellite-derived tropical rainfall: Monte Carlo study using a space-time stochastic model. J. Geophys. Res. 95(D3), 2195–2205 (1990)
Bell, T.L., Kundu, P.K., Kummerow, C.D.: Sampling errors of SSM/I and TRMM rainfall averages: comparison with error estimates from surface data and a simple model. J. Appl. Meteorol. 40, 938–954 (2001)
Chakraborty, A., Gelfand, A.E., Wilson, A.M., Latimer, A.M., Silander, J.A.: Modeling large scale species abundance with latent spatial processes. Ann. Appl. Stat. 4(3), 1403–1429 (2010)
Chakraborty, A., Mallick, B.K., McClarren, R.G., Kuranz, C.C., Bingham, D.R., Grosskopf, M.J., Rutter, E., Stripling, H.F., Drake, R.P.: Spline-based emulators for radiative shock experiments with measurement error. J. Am. Stat. Assoc. 108, 411–428 (2013)
Chib, S., Carlin, B.P.: On MCMC sampling in hierarchical longitudinal models. Stat. Comput. 9(1), 17–26 (1999)
Cohen, A.C.: Truncated and Censored Samples, 1st edn. Marcel Dekker, New York (1991)
Cooley, D., Nychka, D., Naveau, P.: Bayesian spatial modeling of extreme precipitation return levels. J. Am. Stat. Assoc. 102(479), 824–840 (2007)
Cressie, N., Johannesson, G.: Fixed rank kriging for very large spatial data sets. J. R. Stat. Soc. B 70(1), 209–226 (2008)
Denison, D.G.T., Mallick, B.K., Smith, A.F.M.: Bayesian MARS. Stat. Comput. 8(4), 337–346 (1998)
Felgate, D.G., Read, D.G.: Correlation analysis of the cellular structure of storms observed by raingauges. J. Hydrol. 24, 191–200 (1975)
Finley, A., Sang, H., Banerjee, S., Gelfand, A.: Improving the performance of predictive process modeling for large datasets. Comput. Stat. Data Anal. 53(8), 2873–2884 (2009)
Finley, A.O., Banerjee, S., MacFarlane, D.W.: A hierarchical model for quantifying forest variables over large heterogeneous landscapes with uncertain forest areas. J. Am. Stat. Assoc. 106(493), 31–48 (2011)
Friedman, J.: Multivariate adaptive regression splines. Ann. Stat. 19(1), 1–67 (1991)
Fuentes, M.: Spectral methods for nonstationary spatial processes. Biometrika 89, 197–210 (2002)
Fuentes, M., Reich, B., Lee, G.: Spatial-temporal mesoscale modelling of rainfall intensity using gauge and radar data. Ann. Appl. Stat. 2, 1148–1169 (2008)
Furrer, R., Genton, M.G., Nychka, D.: Covariance tapering for interpolation of large spatial datasets. J. Comput. Graph. Stat. 15(3), 502–523 (2006)
Gelfand, A.E., Kim, H.J., Sirmans, C.F., Banerjee, S.: Spatial modeling with spatially varying coefficient processes. J. Am. Stat. Assoc. 98(462), 387–396 (2003)
Gelfand, A.E., Banerjee, S., Finley, A.O.: Spatial design for knot selection in knot-based dimension reduction models. In: Mateu, J., Müller, W.G. (eds.) Spatio-Temporal Design: Advances in Efficient Data Acquisition, pp. 142–169. Wiley, Chichester (2012)
Guhaniyogi, R., Finley, A.O., Banerjee, S., Gelfand, A.E.: Adaptive Gaussian predictive process models for large spatial datasets. Environmetrics 22(8), 997–1007 (2011)
Higdon, D.: A process-convolution approach to modelling temperatures in the North Atlantic Ocean. Environ. Ecol. Stat. 5(2), 173–190 (1998)
Higdon, D.: Space and space-time modeling using process convolutions. In: Anderson, C., Barnett, V., Chatwin, P.C., El-Shaarawi, A.H. (eds.) Quantitative Methods for Current Environmental Issues, pp. 37–56. Springer, London (2002)
Higdon, D., Swall, J., Kern, J.: Non-stationary spatial modeling. In: Bernardo, J.M., Berger, J.O., Dawid, A.P., Smith, A.F.M. (eds.) Bayesian Statistics, vol. 7, pp. 181–197. Oxford University Press, Oxford (1999)
Huffman, G.J., Adler, R.F., Stocker, E.F., Bolvin, D.T., Nelkin, E.J.: A TRMM-based system for real-time quasi-global merged precipitation estimates. In: TRMM International Science Conference, Honolulu, pp. 22–26 (2002)
Huffman, G.J., Adler, R.F., Bolvin, D.T., Gu, G., Nelkin, E.J., Bowman, K.P., Hong, Y., Stocker, E.F., Wolff, D.B.: The TRMM multisatellite precipitation analysis (TMPA): quasi-global, multiyear, combined-sensor precipitation estimates at fine scales. J. Hydrometeorol. 8, 38–55 (2007)
Joyce, R.J., Janowiak, J.E., Arkin, P.A., Xie, P.: CMORPH: a method that produces global precipitation estimates from passive microwave and infrared data at high spatial and temporal resolution. J. Hydrometeorol. 5, 487–503 (2004)
Jun, M.: Non-stationary cross-covariance models for multivariate processes on a globe. Scand. J. Stat. 38, 726–747 (2011)
Jun, M., Stein, M.L.: Nonstationary covariance models for global
data.Ann. Appl. Stat. 2(4), 1271–1289 (2008)
Kammann, E.E., Wand, M.P.: Geoadditive models. J. R. Stat. Soc.,
Ser.C, Appl. Stat. 52(1), 1–18 (2003)
Kaufman, C.G., Schervish, M.J., Nychka, D.W.: Covariance
taperingfor likelihood-based estimation in large spatial data sets.
J. Am.Stat. Assoc. 103(484), 1545–1555 (2008)
Kidd, C.: Satellite rainfall climatology: a review. Int. J.
Climatol. 21,1041–1066 (2001)
Lee, G.W., Zawadzki, I.: Variability of drop size distributions:
time-scale dependence of the variability and its effects on rain
estima-tion. J. Appl. Meteorol. 44, 241–255 (2005)
Lethbridge, M.: Precipitation probability and satellite
radiation data. Mon. Weather Rev. 95(7), 487–490 (1967)
Marchenko, Y.V., Genton, M.G.: Multivariate log-skew-elliptical
distributions with applications to precipitation data.
Environmetrics 21(3–4), 318–340 (2010)
McConnell, A., North, G.R.: Sampling errors in satellite
estimates of tropical rain. J. Geophys. Res. 92(D8), 9567–9570
(1987)
Negri, A.J., Xu, L., Adler, R.F.: A TRMM-calibrated infrared
rainfall algorithm applied over Brazil. J. Geophys. Res. 107(D20),
8048–8062 (2002)
Paciorek, C., Schervish, M.: Spatial modelling using a new class
of nonstationary covariance functions. Environmetrics 17,
483–506 (2006)
Richardson, S., Green, P.J.: On Bayesian analysis of mixtures
with an unknown number of components. J. R. Stat. Soc. B 59(4),
731–792 (1997)
Rodríguez-Iturbe, I., Mejía, J.M.: The design of rainfall
networks in time and space. Water Resour. Res. 10, 713–728
(1974)
Sampson, P.D., Guttorp, P.: Nonparametric estimation of
nonstationary spatial covariance structure. J. Am. Stat. Assoc.
87, 108–119 (1992)
Sang, H., Gelfand, A.E.: Hierarchical modeling for extreme
values observed over space and time. Environ. Ecol. Stat. 16(3),
407–426 (2009)
Sang, H., Huang, J.Z.: A full scale approximation of covariance
functions for large spatial data sets. J. R. Stat. Soc. B 74(1),
111–132 (2012)
Schmidt, A.M., O’Hagan, A.: Bayesian inference for
non-stationary spatial covariance structure via spatial
deformations. J. R. Stat. Soc. B 65, 743–758 (2003)
Simpson, J., Adler, R.F., North, G.R.: A proposed Tropical
Rainfall Measuring Mission (TRMM) satellite. Bull. Am. Meteorol.
Soc. 69(3), 278–295 (1988)
Sorooshian, S., Hsu, K.L., Gao, X., Gupta, H., Imam, B.,
Braithwaite, D.: Evaluation of PERSIANN system satellite-based
estimates of tropical rainfall. Bull. Am. Meteorol. Soc. 81(9),
2035–2046 (2000)
Stein, M., Chi, Z., Welty, L.: Approximating likelihoods for
large spatial data sets. J. R. Stat. Soc. B 66, 275–296 (2004)
Sun, Y., Li, B., Genton, M.G.: Geostatistics for large datasets.
In: Porcu, E., Montero, J.M., Schlather, M. (eds.) Advances and
Challenges in Space-Time Modelling of Natural Events, vol. 207,
pp. 55–77. Springer, Berlin (2012)
Tanner, M.A., Wong, W.H.: The calculation of posterior
distributions by data augmentation. J. Am. Stat. Assoc. 82, 528–549
(1987)
Vicente, G.A., Scofield, R.A., Menzel, W.P.: The operational
GOES infrared rainfall estimation technique. Bull. Am. Meteorol.
Soc. 79(9), 1883–1898 (1998)
Weng, F.W., Zhao, L., Ferraro, R., Pre, G., Li, X., Grody, N.C.:
Advanced microwave sounding unit (AMSU) cloud and
precipitation algorithms. Radio Sci. 38(4), 8068–8079 (2003)
Wilheit, T.T.: A satellite technique for quantitatively mapping
rainfall rates over the ocean. J. Appl. Meteorol. 16, 551–560
(1977)
Wilheit, T.T., Chang, A.T.C., Rao, M.S.V., Rodgers, E.B., Theon,
J.S.: A satellite technique for quantitatively mapping rainfall
rates over the oceans. J. Appl. Meteorol. 16(5), 551–560 (1977)
Xie, P., Arkin, P.A.: Global monthly precipitation estimates
from satellite-observed outgoing longwave radiation. J. Climate
11, 137–164 (1998)