UNCERTAINTY IN PREDICTING SPECIES DISTRIBUTION A case study using four methods to map tiger occurrence in Central Sumatra Sunarto & Marcella J. Kelly, Virginia Tech
Pre-assessment
How many of you have modeled species distribution?
For those who have done it, how many have:
accounted for uncertainty or evaluated model accuracy?
Outline
Background
Why species distribution?
How species distribution is mapped
Uncertainty and other issues
Case study on predicting tiger distribution
Why map/model species distribution?
Understanding the ecology
Fulfilling the need for (species) management
Why: Examples of Application
Understanding ecology:
Dispersal & barrier
Resource requirements
Interactions: wildlife & human
www.tamrin.proboards.com/
Why: Examples of Application
Biodiversity conservation priorities
Brooks et al. 2006
Conservation vs. Development
High Conservation Value Areas (HCVA)
[Figure panels: MCP, Kernel Density, Home-range buffer 1, Home-range buffer 2, Grid]
Predicting species distribution: typical steps
1. Determining scale
2. Selecting variables
3. Developing experimental design
4. Collecting data
5. Developing/choosing statistical procedures: range/envelope, distance/similarity, regressions,…
6. Building & selecting models
7. Translating mathematical model into distribution map
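Step 7 can be sketched for a logit-link model (logistic regression or occupancy): the fitted coefficients are applied to each grid cell's covariates, and the inverse logit turns the result into a probability. This is a minimal sketch only. The coefficients, covariate values, cell names, and function names below are hypothetical and are not taken from the case study.

```python
import math

def inverse_logit(x):
    """Map a value on the logit scale to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def predict_cell(intercept, betas, covariates):
    """Predicted probability for one grid cell under a logit-link model."""
    linear = intercept + sum(betas[name] * covariates[name] for name in betas)
    return inverse_logit(linear)

# Hypothetical coefficients and standardized covariates (illustration only).
betas = {"forest": 1.2, "road_density": -0.8}
cells = {
    "A1": {"forest": 1.0, "road_density": 0.0},
    "A2": {"forest": -1.0, "road_density": 0.5},
    "B1": {"forest": 2.0, "road_density": -1.0},
}

# One probability per grid cell; these values would be drawn as the map.
prob_map = {cell: predict_cell(-0.5, betas, cov) for cell, cov in cells.items()}
```

In practice the same calculation would run over every cell of the covariate rasters, with covariates scaled the same way as when the model was fitted.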
Uncertainty & other issues
Imprecise data/measurement error: positional and attribute errors, applicable to both species presence data (response variable) and environmental/habitat data (predictor variables)
Error/uncertainty in data transfer/treatments/manipulation
Uncertainty in selection of predictor variables
Uncertainty in species detection
Error/uncertainty in model parameter estimation
Uncertainty in modeling approach
Uncertain inferences
Fallacious/unrealistic assumptions
Natural variability/stochasticity of the system
Ambiguous or incorrect scientific questions
After Morrison et al. 2006
Other issues: scale
Adapted from: Karanth & Nichols 2002
A case study: predicting tiger distribution
• Critically endangered
• Elusive: occurrence data carry high uncertainty
• Urgent need for distribution maps
• Year of the tiger
Study Area
[Figure: study area maps, with two 17 km labels]
Types of data
Species occurrence (response variables):
• Presence only
• Presence-‘absence’
• Detection-nondetection
• Count
Predictors: Macro-habitat (global/landscape level features)
Modeling approaches used:
Presence only data: Maximum Entropy (MaxEnt)
Presence-‘absence’ data: Logistic regression (R)
Count data: Zero-inflated Negative Binomial regression (R)
Detection-nondetection data: Occupancy (Program PRESENCE)
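What sets the occupancy approach apart is that a cell with no detections is treated as either unoccupied or occupied but missed. The sketch below writes out the standard single-season occupancy likelihood, assuming constant occupancy (psi) and constant detection probability (p) with no covariates. It is not the implementation used in Program PRESENCE. The function names, the crude grid-search fit, and the detection histories are hypothetical.

```python
import math

def occupancy_log_likelihood(psi, p, histories):
    """Log-likelihood of a constant-psi, constant-p single-season occupancy model.

    histories: list of detection histories, each a list of 0/1 per survey occasion.
    """
    total = 0.0
    for history in histories:
        detections = sum(history)
        misses = len(history) - detections
        # Probability of this history if the cell is occupied.
        occupied = psi * (p ** detections) * ((1 - p) ** misses)
        if detections > 0:
            total += math.log(occupied)
        else:
            # All-zero history: occupied but never detected, or truly unoccupied.
            total += math.log(occupied + (1 - psi))
    return total

def fit_by_grid_search(histories, steps=99):
    """Crude maximum-likelihood fit over a grid of (psi, p) values inside (0, 1)."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(
        ((psi, p) for psi in grid for p in grid),
        key=lambda pair: occupancy_log_likelihood(pair[0], pair[1], histories),
    )

# Hypothetical detection histories: five grid cells, three occasions each.
histories = [[1, 0, 1], [0, 0, 0], [0, 1, 0], [0, 0, 0], [1, 1, 0]]
psi_hat, p_hat = fit_by_grid_search(histories)
```

A real analysis would use a numerical optimizer and model psi and p as functions of covariates through a logit link. The grid search is used here only to keep the example dependency-free.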
Relative importance of variables
Variable                               MaxEnt       Logistic regr.   Count            Occupancy
                                       (% contr.)   (logit link β)   (log link β)     (logit link β)
Intercept                              NA           -11.74           -2.38 (1.11)     -8.48 (4.10)
Forest area in 2007                    59.8         1.30             0.73 (0.22)      0.61 (0.50)
Altitude                               0.7          113.09           20.94 (11.79)    94.08 (45.55)
Distance to core forest area           0.1          -0.77            NA               -0.85 (0.66)
Road density                           14.8         0.92             NA               0.05 (0.69)
Distance to road                       12.5         NA               NA               NA
Deforestation from 2006 to 2007        1.7          NA               NA               NA
Distance to core of protected areas    8.8          -1.07            NA               NA
Precipitation                          1.7          NA               NA               NA
Summary of predictions
Maxent Logistic Count Occupancy
Minimum 0* 0 0 0
Maximum 0.9 1.0 128.3 1.0
Mean 0.237 0.329 2.036 0.423
CV 0.871 1.185 4.499 0.830
*) For Maxent outputs, no cell can be interpreted as complete absence even when the prediction is (near) 0 (Phillips et al. 2006)
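The summary rows above can be computed from the per-cell predictions of any one model. The sketch below assumes CV is the usual ratio of standard deviation to mean. The choice of the sample standard deviation is an assumption, and the input values are hypothetical.

```python
import statistics

def summarize(predictions):
    """Minimum, maximum, mean, and coefficient of variation of per-cell predictions."""
    mean = statistics.fmean(predictions)
    return {
        "min": min(predictions),
        "max": max(predictions),
        "mean": mean,
        # CV as a ratio (SD / mean); using the sample SD is an assumption here.
        "cv": statistics.stdev(predictions) / mean,
    }

# Hypothetical per-cell predictions from one model.
summary = summarize([0.1, 0.2, 0.3, 0.4])
```

Because CV is unitless, it allows a rough comparison of spread between the count model and the three probability-scale models, even though their means are on different scales.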
[Figure: predicted distribution maps from the four models: MaxEnt, Logistic regression, Count, Occupancy]
“All models are wrong, but some are useful” (George Box, quoted in Kennedy 1992: 73).
Concordance (%) between models
Columns are the reference models; each cell gives overall accuracy / K-hat
             Maxent     Logistic   Count      Occupancy
Maxent       -          28 / 13    38 / 22    38 / 23
Logistic     28 / 13    -          47 / 34    52 / 40
Count        38 / 22    47 / 34    -          56 / 45
Occupancy    38 / 23    52 / 40    56 / 45    -
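Overall accuracy and K-hat (Cohen's kappa) for a pair of classified maps can be computed as sketched below. Overall accuracy is the share of cells given the same class by both models, and K-hat discounts the agreement expected by chance. The class labels and the function name are hypothetical, and the classification scheme behind the table above is not assumed.

```python
from collections import Counter

def concordance(map_a, map_b):
    """Overall agreement and kappa (K-hat) between two classified maps.

    map_a, map_b: equal-length sequences of class labels, one per grid cell.
    """
    n = len(map_a)
    observed = sum(a == b for a, b in zip(map_a, map_b)) / n
    count_a, count_b = Counter(map_a), Counter(map_b)
    # Agreement expected by chance from the marginal class frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    # Undefined (division by zero) if both maps contain a single shared class.
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical class labels for eight grid cells from two models.
model_1 = [1, 1, 2, 2, 3, 3, 4, 4]
model_2 = [1, 2, 2, 2, 3, 4, 4, 4]
accuracy, k_hat = concordance(model_1, model_2)
```

Multiplying both values by 100 gives percentages comparable in form to those in the table.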
‘Correct predictions’ by different models (based on presence-only tiger records in 20 grids collected from independent surveys)
[Figure: percentage of correct predictions (y-axis, 0% to 100%) against thresholds on probability of occurrence (x-axis: 1 = very low, 2 = low, 3 = medium, 4 = high); one series each for MaxEnt, Occupancy, Count, and Logistic regression]
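One way to evaluate a model against independent presence records is to count how many known-presence cells it predicts at or above a given threshold. The sketch below assumes that reading of "correct prediction". The threshold values and predicted probabilities are hypothetical, since the cut-offs behind the four classes are not stated here.

```python
def percent_correct(predicted, thresholds):
    """Share (%) of known-presence cells predicted at or above each threshold."""
    return {
        t: 100.0 * sum(p >= t for p in predicted) / len(predicted)
        for t in thresholds
    }

# Hypothetical predictions in four cells that have independent tiger records.
predicted = [0.05, 0.30, 0.55, 0.80]
result = percent_correct(predicted, [0.0, 0.25, 0.5, 0.75])
```

With presence-only test data this measures omission only. It says nothing about cells where the model predicts tigers and none occur.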
Conclusions
Modeling is better than simply mapping occurrence: it accounts for sampling effort and environmental variables
Ignoring detection probability leads to underestimating population parameters
Results vary among models
Models are robust for some variables and areas
Account for missed detections whenever possible: use occupancy
In some cases, a presence-only model is still an acceptable choice.
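The point about ignoring detection probability can be shown with simple arithmetic. If a share psi of cells is occupied and each survey occasion detects the species with probability p, the expected share of cells with at least one detection in K occasions is psi × (1 − (1 − p)^K), which is always below psi when p < 1. The values below are hypothetical.

```python
def expected_naive_occupancy(psi, p, occasions):
    """Expected share of cells with at least one detection."""
    return psi * (1.0 - (1.0 - p) ** occasions)

# Hypothetical values: 60% of cells occupied, 20% detection chance per occasion.
naive = expected_naive_occupancy(psi=0.6, p=0.2, occasions=3)
```

Under these assumed values, fewer than half of the occupied cells would be recorded as occupied after three occasions.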
Acknowledgement
Financial & programmatic support
provided by: WWF Indonesia & Networks,
Virginia Tech, Hurvis Family, STF, PHKA, OFWIM
Field team: Zulfahmi, Harry Kurniawan,
Karmila Parakkasi, Eka Septayuda, Kusdianto,
Fendy Panjaitan, Agung Suprianto, E. Tugiyo, L.
Subali, H. Gebog, Herri Irawan, Roni Faslah,
Kokok Yulianto, Sunandar, Riza Sukriana,
Tarmison
Special thanks to: Drs. Sybille Klenzendorf,
Steve Prisley, Jim Nichols, Jim Hines, Dean F.
Stauffer, Mike R. Vaughan, WHAPA colleagues