Hierarchical Modelling for Large Spatial Datasets
Sudipto Banerjee1 and Andrew O. Finley2
1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A.
2 Department of Forestry & Department of Geography, Michigan State University, Lansing, Michigan, U.S.A.
October 15, 2012
UNL Department of Statistics Spatio-temporal Workshop

Big N problem: Introduction
The Big n issue
Univariate spatial regression
$Y = X\beta + w + \varepsilon$,
Estimation involves $(\sigma^2 R(\phi) + \tau^2 I)^{-1}$, which is $n \times n$.
Matrix computations occur in each MCMC iteration.
Known as the “Big-N problem” in geostatistics.
Approach: Use a model $Y = X\beta + Zw^* + \varepsilon$. But what $Z$?

Big N problem: Predictive Process Models
Consider “knots” $S^* = \{s_1^*, \ldots, s_{n^*}^*\}$ with $n^* \ll n$.
Let $w^* = \{w(s_i^*)\}_{i=1}^{n^*}$.
$Z(\theta) = \{\mathrm{cov}(w(s_i), w(s_j^*))\}\{\mathrm{var}(w^*)\}^{-1}$ is $n \times n^*$.
Predictive process regression model
$Y = X\beta + Z(\theta)w^* + \varepsilon$,
Fitting requires only $n^* \times n^*$ matrix computations ($n^* \ll n$).
Key attraction: The above arises as a process model: $\tilde{w}(s) \sim GP(0, \sigma_w^2 \tilde{\rho}(\cdot; \phi))$ instead of $w(s)$.
$\tilde{\rho}(s_1, s_2; \phi) = \mathrm{cov}(w(s_1), w^*)\,\mathrm{var}(w^*)^{-1}\,\mathrm{cov}(w^*, w(s_2))$

Big N problem: Selection of knots
Knots: A “Knotty” problem??
Knot selection: Regular grid? More knots near locations we have sampled more?
Formal spatial design paradigm: maximize information metrics (Zhu and Stein, 2006; Diggle & Lophaven, 2006)
Geometric considerations: space-filling designs (Royle & Nychka, 1998); various clustering algorithms
Compare performance of estimation of range and smoothness by varying knot size.
Stein (2007, 2008): method may not work for fine-scale spatial data
Still a popular choice: seamlessly adapts to multivariate and spatiotemporal settings.
[Figure: $\tau^2$ plotted against the number of knots]

Big N problem: Comparisons: Unrectified vs. Rectified
A rectified predictive process is defined as
$\tilde{w}_{\tilde{\varepsilon}}(s) = \tilde{w}(s) + \tilde{\varepsilon}(s)$,
where $\tilde{\varepsilon}(s) \overset{indep}{\sim} N(0, \sigma_w^2(1 - r(s, \phi)^{\top} R^{*-1}(\phi) r(s, \phi)))$.
Maximum likelihood estimates of $\tau^2$:

# of Knots    Predictive Process    Rectified Predictive Process
25            1.56941               1.00786
36            1.65688               1.15386
64            1.45169               1.08358
100           1.37916               1.09657
225           1.27391               1.08985
400           1.22429               1.09489
625           1.21127               1.09998
exact         1.14414               1.14414
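The predictive process and its rectification can be sketched numerically. The snippet below is only an illustration under stated assumptions: an exponential covariance, a 5 x 5 regular grid of knots, simulated locations, and made-up parameter values; the helper name `exp_cov` is hypothetical and none of these choices come from the slides.

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, phi=3.0):
    """Exponential covariance sigma2 * exp(-phi * distance) (an assumed choice)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * d)

rng = np.random.default_rng(0)
n, sigma2, phi = 200, 1.0, 3.0                    # illustrative values only
s = rng.uniform(0.0, 1.0, size=(n, 2))            # simulated observed locations
g = np.linspace(0.1, 0.9, 5)
knots = np.array([(x, y) for x in g for y in g])  # 5 x 5 regular grid, n* = 25
n_knots = knots.shape[0]

C_star = exp_cov(knots, knots, sigma2, phi)       # var(w*), n* x n*
c = exp_cov(s, knots, sigma2, phi)                # cov(w(s_i), w(s*_j)), n x n*
Z = np.linalg.solve(C_star, c.T).T                # Z = c C*^{-1}, n x n*

# Covariance of the predictive process Z w*: Z C* Z^T = c C*^{-1} c^T.
C_pp = Z @ c.T

# The projection loses variance at each location; rectification adds it back
# as independent noise with variance sigma2 - c_i^T C*^{-1} c_i.
rect_var = sigma2 - np.diag(C_pp)
C_rect = C_pp + np.diag(rect_var)
```

In this sketch the diagonal of `C_rect` equals `sigma2` at every location, while `C_pp` alone understates the marginal variance. That lost variance is otherwise absorbed by the nugget, which is consistent with the larger unrectified estimates of $\tau^2$ in the table above.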
Illustration
Illustration from:
Finley, A.O., S. Banerjee, P. Waldmann, and T. Ericsson. (2008) Hierarchical spatial modeling of additive and dominance genetic variance for large spatial trial datasets. Biometrics. DOI:10.1111/j.1541-0420.2008.01115.x
Illustration Univariate random effects models
Univariate random effects models
Modeling genetic variation in Scots pine (Pinus sylvestris L.), long-term progeny study in northern Sweden.
Quantitative genetics: studies the inheritance of polygenic traits, focusing upon estimation of additive genetic variance, $\sigma_a^2$, and the heritability $h^2 = \sigma_a^2/\sigma_{Tot}^2$, where $\sigma_{Tot}^2$ represents the total genetic and unexplained variation.
A high heritability, $h^2$, should result in a larger selection response (i.e., a higher probability for genetic gain in future generations).
[Figure panels: Observed trees; Observed height]
Data overview:
established in 1971 (by Skogforsk)
partial diallel design of 52 parent trees
8,160 planted randomly on 2.2 m squares
1997 reinventory of 4,970 surviving trees: height, DBH, branch angle, etc.
Genetic effects model:
$Y_i = x_i^{\top}\beta + a_i + d_i + \varepsilon_i$,
$a = [a_i]_{i=1}^{n} \sim MVN(0, \sigma_a^2 A)$
$d = [d_i]_{i=1}^{n} \sim MVN(0, \sigma_d^2 D)$
$\varepsilon = [\varepsilon_i]_{i=1}^{n} \sim N(0, \tau^2 I_n)$
A and D are fixed relationship matrices (see, e.g., Henderson, 1985; Lynch and Walsh, 1998).
Note: genetic variance is further partitioned into the additive component and the non-additive dominance component, $\sigma_d^2$.
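As a small worked example of the heritability ratio, the sketch below assumes that, under this genetic effects model, the total variance is the sum of the additive, dominance, and residual components. That reading of "total genetic and unexplained variation" is an assumption, the function name is hypothetical, and the numbers are invented purely for illustration.

```python
def heritability(sigma2_a, sigma2_d, tau2):
    """h^2 = sigma2_a / sigma2_Tot, assuming sigma2_Tot = sigma2_a + sigma2_d + tau2."""
    total = sigma2_a + sigma2_d + tau2
    return sigma2_a / total

# Invented variance components, not estimates from the Scots pine trial.
h2_large_residual = heritability(sigma2_a=1.0, sigma2_d=1.0, tau2=2.0)
h2_small_residual = heritability(sigma2_a=1.0, sigma2_d=1.0, tau2=1.0)
```

With the additive and dominance components held fixed, a smaller residual variance gives a larger $h^2$.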
Genetic effects model:
$Y_i = x_i^{\top}\beta + a_i + d_i + \varepsilon_i$,
Common feature is systematic heterogeneity among observational units (i.e., violation of $\varepsilon \sim N(0, \tau^2 I_n)$).
Genetic model results, cont’d.
[Figure panels: Residual surface; Residual semivariogram]
So, $\varepsilon \nsim N(0, \tau^2 I_n)$. Consider a spatial model.
Previous approaches to accommodating residual spatial dependence:
Manipulate the mean function
  constructing covariates using residuals from neighboring units (see, e.g., Wilkinson et al., 1983; Besag and Kempton, 1986; Williams, 1986)
Geostatistical
  spatial process formed as $AR(1)_{col} \otimes AR(1)_{row}$ (Martin, 1990; Cullis et al., 1998)
  classical geostatistical method (Zimmerman and Harville, 1991)
All are computationally feasible, but ad hoc and/or restrictive from a modeling perspective.
Spatial model for genetic trials:
$Y(s_i) = x^{\top}(s_i)\beta + a_i + d_i + w(s_i) + \varepsilon_i$,
$a = [a_i]_{i=1}^{n} \sim MVN(0, \sigma_a^2 A)$
$d = [d_i]_{i=1}^{n} \sim MVN(0, \sigma_d^2 D)$
$w = [w(s_i)]_{i=1}^{n} \sim MVN(0, \sigma_w^2 C(\theta))$
$\varepsilon = [\varepsilon_i]_{i=1}^{n} \sim N(0, \tau^2 I_n)$
Tools used to estimate parameters:
Markov chain Monte Carlo (MCMC) - iterative
  Gibbs sampler ($\beta$, $a$, $d$, $w$)
  Metropolis-Hastings and Slice samplers ($\theta$)
Here MCMC is computationally infeasible because of Big-N!
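Trick to sample genetic effects:
Gibbs draw for random effects, e.g., $a \mid \cdot \sim MVN(\mu_{a|\cdot}, \Sigma_{a|\cdot})$,
where calculating $\Sigma_{a|\cdot} = \left[\frac{1}{\sigma_a^2}A^{-1} + \frac{I_n}{\tau^2}\right]^{-1}$ is computationally expensive!
However, A and D are known, so use an initial spectral decomposition, i.e., $A^{-1} = P^{\top}\Lambda^{-1}P$.
Thus, $\Sigma_{a|\cdot} = P^{\top}\left(\frac{1}{\sigma_a^2}\Lambda^{-1} + \frac{1}{\tau^2}I\right)^{-1}P$ to achieve computational benefits.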
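The spectral trick can be sketched as follows. The matrix `A` below is a random symmetric positive definite stand-in, not a real relationship matrix, and the function names and parameter values are hypothetical. The point is that the eigendecomposition is computed once, after which each update of $\Sigma_{a|\cdot}$ only requires inverting a diagonal matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
B = rng.normal(size=(n, n))
A = B @ B.T / n + np.eye(n)     # stand-in for a known relationship matrix

# One-off decomposition before MCMC: A = V diag(lam) V^T, so with P = V^T
# we have A^{-1} = P^T diag(1 / lam) P.
lam, V = np.linalg.eigh(A)
P = V.T

def conditional_diag(sigma2_a, tau2):
    """Diagonal of (Lambda^{-1} / sigma2_a + I / tau2)^{-1}."""
    return 1.0 / (1.0 / (sigma2_a * lam) + 1.0 / tau2)

def conditional_cov(sigma2_a, tau2):
    """Sigma_{a|.} = P^T (Lambda^{-1} / sigma2_a + I / tau2)^{-1} P."""
    return (P.T * conditional_diag(sigma2_a, tau2)) @ P

def draw_a(mean, sigma2_a, tau2, rng):
    """One Gibbs draw from MVN(mean, Sigma_{a|.}) without any dense factorization."""
    z = rng.normal(size=lam.shape[0])
    return mean + P.T @ (np.sqrt(conditional_diag(sigma2_a, tau2)) * z)

sigma2_a, tau2 = 0.8, 1.5       # illustrative values only
fast = conditional_cov(sigma2_a, tau2)
direct = np.linalg.inv(np.linalg.inv(A) / sigma2_a + np.eye(n) / tau2)
sample = draw_a(np.zeros(n), sigma2_a, tau2, rng)
```

Here `fast` and `direct` agree up to floating-point error, but only `direct` needs a dense inverse each time $\sigma_a^2$ or $\tau^2$ changes.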
Unfortunately, this trick does not work for w. Rather, we proposed the knot-based predictive process.
Corresponding predictive process model:
$Y(s_i) = x^{\top}(s_i)\beta + a_i + d_i + \tilde{w}(s_i) + \varepsilon_i$,
$\tilde{w}(s_i) = c(s_i; \theta)^{\top} C^{*-1}(\theta) w^*$
where $w^* = [w(s_i^*)]_{i=1}^{m} \sim MVN(0, C^*(\theta))$ and $C^*(\theta) = [C(s_i^*, s_j^*; \theta)]_{i,j=1}^{m}$
Projection $\Longrightarrow$
A decrease in $\tau^2$ due to removal of spatial variation results in an increase in $h^2$ (i.e., ∼0.25 vs. ∼0.15 with confounding).
Genetic + spatial effects models results, cont’d.
[Figure panels: Genetic model residuals; $\tilde{w}(s)$, 64 knots; $\tilde{w}(s)$, 256 knots]
Predictive process: balance model richness with computational feasibility (e.g., 4,970×4,970 vs. 64×64).
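One standard way to realize the smaller matrix computation is the Woodbury identity, sketched below for the unrectified model, where the marginal covariance has the form $\tau^2 I + Z C^* Z^{\top}$. The slides do not state which identity or implementation was used, so this is an assumption; the function name `pp_solve` and the random test matrices are hypothetical.

```python
import numpy as np

def pp_solve(Z, C_star, tau2, y):
    """Return (tau2 * I + Z C* Z^T)^{-1} y using only an n* x n* solve.

    Woodbury identity:
    (tau2 I + Z C* Z^T)^{-1} = (I - Z (tau2 C*^{-1} + Z^T Z)^{-1} Z^T) / tau2
    """
    small = tau2 * np.linalg.inv(C_star) + Z.T @ Z   # n* x n*
    return (y - Z @ np.linalg.solve(small, Z.T @ y)) / tau2

rng = np.random.default_rng(2)
n, m, tau2 = 300, 16, 0.5                 # illustrative sizes, m plays the role of n*
Z = rng.normal(size=(n, m))               # random stand-in for Z(theta)
L = rng.normal(size=(m, m))
C_star = L @ L.T / m + np.eye(m)          # random symmetric positive definite stand-in
y = rng.normal(size=n)

fast = pp_solve(Z, C_star, tau2, y)
direct = np.linalg.solve(tau2 * np.eye(n) + Z @ C_star @ Z.T, y)
```

Both routes give the same vector, but `pp_solve` never forms or factorizes an n × n matrix. With a rectified predictive process the $\tau^2 I$ term becomes a general diagonal matrix, and the same identity applies with that diagonal in place of $\tau^2 I$.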
Summary
Challenge - to meet modeling needs:
ensure computational feasibility
  reduce algorithmic complexity = cheap tricks (e.g., spectral decomp. of A prior to MCMC)
  reduce dimensionality = predictive process
maintain richness and flexibility
  focus on the model, not on how to estimate the parameters = embrace new tools (MCMC) for estimating highly flexible hierarchical models
truly acknowledge sources of uncertainty
  propagate uncertainty through hierarchical structures (e.g., recognize uncertainty in $C(\theta)$)