METHODOLOGICAL GUIDELINE TO PRODUCE A
FUTURE DEFORESTATION MODEL FOR PALM
OIL EXPANSION IN PAPUA NEW GUINEA USING
By:
Giancarlo Raschio
Freddie Alei
November, 2016
Table of Contents

1 Data and Methods Used
2 Preparation of Hansen data for the deforestation model
2.1 First stage
2.2 Second stage
2.3 Third stage
2.4 Fourth stage
3 Import raster files to Idrisi
4 Import of vector files to Idrisi
5 Generation of Factor Maps
6 Calibration
6.1 Change Analysis
6.2 Transition Potentials
6.3 Change Prediction
7 References
8 Annexes
8.1 Annex 1: Complete process flow for preparing Hansen Dataset to generate Forest/Non-forest maps for the years 2000 and 2014
8.2 Annex 2: Validation
PROJECTION OF THE QUANTITY AND LOCATION OF FUTURE
DEFORESTATION
This report presents the methodology and results to locate in space and time the baseline
deforestation expected to occur within the STUDY AREA during the project crediting period.
1 Data and Methods Used
To project deforestation into the future it is necessary to have remote sensing data on land-cover
for at least two points in time. In our case we used the publicly and freely available Hansen
dataset (Hansen et al. 2013). Of course, the objective for PNGFA is to use its own classified
remote sensing imagery for the analysis of future deforestation.
The Hansen dataset provides the following data:
i) Forest cover in the year 2000 (tcover): this data refers to a land-cover classification of two classes: tree cover and tree-cover loss1
ii) Annual tree-cover loss for each year between 2000 and 2014 (lossyear): this data
refers to the tree-cover that has been lost in each of the 14 years between 2000
and 2014
iii) A data mask layer (dmask): this data presents information about features that are neither tree cover nor tree-cover loss, such as rivers and other water bodies.
iv) A data layer of tree-cover gain between 2000 and 2014 (gain): this data refers to
tree-cover that has been regenerated between 2000 and 2014. However, it should
be taken into account that in the case of PNG such gain or “regeneration” is the
result of farming cycles; areas that appeared as regenerated tree-cover in 2014
were plantations already present in 2000 that grew their crops. Therefore, it is
important to consider this layer to avoid overestimating baseline deforestation.
The objective was to first generate forest and non-forest classified images for the years 2000
and 2014. Of course, the deforestation modeling process can be replicated and updated using
a different set of images.
Besides the aforementioned data, we also used a layer with official data on agricultural and
forestry plantations in PNG (Qa,Qf), which was provided by PNGFA. This data layer is key
because it allowed us to identify which areas were already plantations in the year 2000.
The software we used was the module “Land Change Modeler” (LCM) from Idrisi Selva2.
For modeling, the method of Similarity-Weighted Instance-based Machine Learning
(SimWeight) was used. The transition that will be evaluated is Forest to Non-Forest.
The software calculated the deforestation rate based on the classified images using a Markov
Matrix. Once the deforestation rate was calculated, the model estimates the quantity and
location of future deforestation.
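The rate calculation can be sketched with the standardized annual rate-of-change formula of Puyravaud (2003), listed in the references. The forest-area figures below are purely illustrative, not PNG data:

```python
import math

def annual_rate_of_change(area_t1, area_t2, t1, t2):
    """Puyravaud (2003): r = (1 / (t2 - t1)) * ln(A2 / A1).
    Negative values indicate forest loss."""
    return (1.0 / (t2 - t1)) * math.log(area_t2 / area_t1)

# Illustrative figures only: 10.0 Mha of forest in 2000, 9.3 Mha in 2014.
r = annual_rate_of_change(10.0, 9.3, 2000, 2014)
print(f"annual rate of change: {r:.4%}")  # about -0.52% per year
```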
1 Tree-cover loss is not the same as deforestation: what counts as a forest depends on the definition adopted by each country, whereas the Hansen dataset identifies tree-cover loss regardless of any national forest definition.
2 http://clarklabs.org/applications/upload/IDRISI_Focus_Paper_REDD.pdf
First, we need to create a validation model that will test a set of driver variables that are
assumed to describe the change from Forest to Non-Forest in the study area or selected
provinces. Land cover maps from two points in time were created for this purpose: 2000 and
2014. The process will assess forest loss between 2000 and 2014 and then use the
calculated deforestation rate to predict future deforestation. Through this step, we get to test
and identify the various driver variables to better match the predicted map to the map of
reality.
2 Preparation of Hansen data for the deforestation model
To prepare the data from Hansen dataset we used the Model Builder tool in ArcGIS 10.1
software. The objective is to obtain two results: i) a forest/non-forest cover map for 2000, and ii) a forest/non-forest cover map for 2014. The overall model to process the Hansen
datasets (Fig. 1) has been divided into four (4) stages for didactic purposes (see Annex 1 for a
larger figure).
Figure 1: process flow in Model Builder to prepare Hansen datasets to produce one Forest/Non-forest map for the year 2000 and one for the year 2014.
IMPORTANT: Before initiating any data analysis make sure that all raster and vector layers are in
the same desired projection and coordinate system. For raster files, make sure all have the same
desired number of columns and rows and the same cell size; otherwise the software won't allow
you to start the modeling process. The number of columns and rows, as well as the cell size, can
be standardized by resampling the raster files in either ArcGIS or Idrisi beforehand.
2.1 First stage
First, we need to use three raster files: gain, tcover, and the layer with data on existing
agricultural and forestry plantations from PNGFA (Qa,Qf). These three layers need to be
reclassified to adequate raster values and then combined through a weighted sum (Fig. 2).
Figure 2: outlook of stage one “Forest and Non-forest in 2000 without gain cover”
The raster "gain" should be reclassified by setting all values larger than zero to ten
(Fig 3).
Figure 3: reclassification of the “gain” raster.
The raster "tcover" should be reclassified by selecting a threshold to define what percentage of tree cover should be considered forest and what should be considered non-forest. In our case, we tried different thresholds, ran the whole model, and then compared the area of forest in our resulting "Map 3" raster (map of forest/non-forest in 2014) to the actual area of forest in the study area or province according to the official data from Papua New Guinea's Forest Authority (PNGFA 2013) (Fig 4). The idea is to input an initial threshold percentage (see red square in Fig. 4) so that anything above it is classed as forest and anything below it as non-forest. Then, we run the model, calculate the areas of forest, and compare them with the official forest area for the study province according to PNGFA. We must be as close as possible to the official data and, to be conservative, it is better to have slightly less forest than the official source. This is because the less initial forest we have (more non-forest), the less forest can be converted to non-forest in the future (thus lower projected deforestation, which is a conservative assumption).
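The iterative threshold selection can be sketched as a simple search. The tcover values, cell size, and official forest area below are hypothetical stand-ins, not PNGFA figures:

```python
import numpy as np

rng = np.random.default_rng(0)
tcover = rng.integers(0, 101, size=(100, 100))  # percent tree cover, toy data
cell_area_ha = 0.09                             # ~30 m x 30 m pixels
official_forest_ha = 500.0                      # hypothetical official figure

best = None
for threshold in range(10, 95, 5):
    forest_ha = np.count_nonzero(tcover >= threshold) * cell_area_ha
    # Conservative rule: forest area closest to, but not above,
    # the official figure.
    if forest_ha <= official_forest_ha and (best is None or forest_ha > best[1]):
        best = (threshold, forest_ha)

print("chosen threshold and forest area:", best)
```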
This step was necessary because of three reasons: i) the “tcover” raster from the Hansen
dataset presents values of tree cover in percentage (from 0 to 100); ii) using the official
forest definition of PNG would generate a forest area much larger than the official records,
and; iii) there was no official classification of forest in 2000.
Figure 4: reclassification of the “tcover” raster
The raster “QaQf” only contained values for the plantation areas. For the model we cannot
have No Data values within our study area or province. For this reason, it was necessary to
allocate a value to the background of the study area or province. So, as a preliminary step,
the QaQf raster had to be converted to a binary raster of 0 and 1 values. This was done via a
reclassification (tool Reclassify of ArcGIS) with the following setting:
Original Value New Value
NoData 0
1 - 151 1
Also, a working environment was set up using the "Environments" option of the
"Reclassify" tool. We selected "Processing Extent" and set the extent to be the same as
the "fcover" raster. Then, we selected "Raster Analysis" and set the extent to be the
same as the "fcover" raster. Then click OK, and OK again, to run the reclassification.
So, once we had the binary QaQf, we used it in Model Builder and indicated that all
values equal to zero should remain zero and that all values higher than zero should be
converted to three (Fig 5).
Figure 5: reclassification of the plantation binary raster (QaQf_bi)
What the Weighted Sum tool will do is sum the values of the three overlaying rasters.
We assigned an equal weight of "1" to all rasters. So, we are adding the following
rasters:
GAIN Description
0 Background
10 Gain
TCOVER Description
1 Forest in 2000
2 Non-forest in 2000
QaQf Description
0 Background
3 Existing Plantations in 2000
As a result, the “Weighted_t1” raster will have the following values which should be
reclassified (Fig. 6) to new values (Table 1):
Table 1: Meaning of the values generated via the Weighted Sum tool
Values Meaning New Class
1 Forest on background Forest
2 Non-forest on background Non-forest
4 Forest that falls over existing plantation Non-forest
5 Non-forest on existing plantation Non-forest
11 Forest on Gain Non-forest
12 Non-forest on Gain Non-forest
14 Forest on Gain on existing plantation Non-forest
15 Non-forest on Gain on existing plantation Non-forest
Figure 6: reclassification of the weighted sum results
Resulting from this first stage we have the “Forest2000” raster, which has two classes, (1)
forest and (2) non-forest, and accounts for the presence of “gain” and “plantations”.
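The reclassify-and-weighted-sum logic of this stage can be illustrated with toy arrays standing in for the real rasters, using the class values from the tables above:

```python
import numpy as np

gain   = np.array([0, 10, 0, 0, 10, 0])  # 0 = background, 10 = gain
tcover = np.array([1, 1, 2, 1, 2, 1])    # 1 = forest, 2 = non-forest
qaqf   = np.array([0, 0, 0, 3, 3, 0])    # 0 = background, 3 = plantation

# Weighted Sum with equal weights of 1 is a plain cell-by-cell addition.
weighted = gain + tcover + qaqf

# Per Table 1, only value 1 (forest on background) remains forest;
# every other combination becomes non-forest (2).
forest2000 = np.where(weighted == 1, 1, 2)
print(forest2000)  # [1 2 2 2 2 1]
```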
2.2 Second stage
In this stage we’ll generate a raster with three classes: (1) forest, (2) non-forest, and (3)
rivers/water that we’ll call “Map1”. This Map 1 represents the Forest in 2000, which is our
initial year of the analysis period (Fig. 7).
Figure 7: outlook of stage two
For this stage we used two inputs: i) the "dmask" raster (refer to Section 1), and ii) the
“Forest2000” raster that we generated as a result of Stage 1. The dmask raster was used to
identify rivers/water and to overlay these on the Forest2000 raster, which only contains data
on forest/non-forest. We used the dmask raster and reclassified its values so initial class “2”
would be final class “3” (Fig. 8).
Figure 8: reclassification of dmask
What the Weighted Sum tool will do is sum the values of the two overlaying rasters. We
assigned an equal weight of “1” to all rasters. So, we are adding the following rasters:
RIVERS Description
0 Background
3 Rivers/Water
FOREST2000 Description
1 Forest in 2000
2 Non-forest in 2000
As a result, we had an intermediate raster called “t3” with the following values, which
should then be reclassified (Fig. 9) to new values (Table 2):
Figure 9: reclassification of “t3” intermediate raster
Table 2: Meaning of the values generated via the Weighted Sum tool
Values Meaning New Class
1 Forest on background Forest
2 Non-forest on background Non-forest
4 Forest on rivers/water Rivers/water
5 Non-forest on rivers/water Non-forest
Finally, resulting from this stage we will have Map1 that is the Forest/Non-forest Map in
2000 with 3 classes. Then we can decide to convert the raster to polygon to calculate the
areas.
2.3 Third stage
In this stage we generated a map of forest cover loss between the years 2000 and 2014 (Fig.
10).
For this stage we used two inputs: i) Map1 (Forest/Non-forest Map in 2000 with 3 classes)
from Stage 2, and; ii) the lossyear raster (see Section 1).
First we reclassified the lossyear raster so tree-cover loss for the year 2000 (class 1) was
converted to background data (class 0). All the other classes, representing tree-cover loss
between 2001 and 2013 (classes 2 to 14), were reclassified to class 3 (Fig. 11).
Figure 10: sub-process flow of Stage 3
Figure 11: reclassification of lossyear raster
What the Weighted Sum tool will do is sum the values of the two overlaying rasters. We
assigned an equal weight of "1" to all rasters. So, we are adding the following rasters:
LOSSYEAR Description
0 Background
3 New Non-forest (2001-2013)

FOREST2000 (MAP1) Description
1 Forest in 2000
2 Non-forest in 2000
3 Rivers/Water
As a result, we had an intermediate raster called “t5” with the following values, which
should then be reclassified (Fig. 12) to new values (Table 3):
Table 3: Meaning of the values generated via the Weighted Sum tool
Values Meaning New Class
1 Forest on background Forest
2 Non-forest on background Non-forest
3 Rivers/water on background Rivers/water
4 Forest2000 on new non-forest New Non-forest
5 Non-forest2000 on new non-forest New Non-forest
Figure 12: reclassification of “t5” intermediate raster
Finally, resulting from this stage we will have Map2, the map of forest cover loss
between 2000 and 2014, with 4 classes. Then we can decide to convert the raster to
polygon to calculate the areas.
2.4 Fourth stage
In this final stage we generated the Forest 2014 map, which we called Map3 (Fig. 13).
For this stage we used two inputs: i) Intermediate “t5” raster from Stage 3 (see Section 2.3),
and; ii) the dmask raster (see Section 1).
Figure 13: Outlook of Stage 4
In this case we used the “dmask” raster to generate a mask within which the reclassification
of the intermediate raster t5 will take place. We reclassified classes 1 and 2 to a new class
1, and values "0" to NoData (Fig. 14).
Figure 14: reclassification of dmask raster to generate a mask of the study area
Then, we reclassified the intermediate raster "t5" so all non-forest classes (2 and 4) should
become a single non-forest class (2). Forest (1) and Rivers/Water (3) remain with the same
class values (Fig. 15).
Figure 15: reclassification of the intermediate raster “t5” to have only one non-forest class
Finally, resulting from this stage we will have Map3 that is the Forest/Non-forest Map in
2014 with 3 classes. Then we can decide to convert the raster to polygon to calculate the
areas.
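The masking and merging of non-forest classes in this final stage can be sketched with toy arrays; the value -9999 stands in for NoData:

```python
import numpy as np

# "t5": 1 = forest, 2 = non-forest, 3 = rivers/water, 4 = new non-forest.
t5   = np.array([1, 2, 4, 3, 1, 4])
# Mask derived from dmask: 1 = inside the study area, 0 = outside.
mask = np.array([1, 1, 1, 1, 0, 1])

# Merge the two non-forest classes (2 and 4) into a single class 2.
map3 = np.where(np.isin(t5, [2, 4]), 2, t5)
# Cells outside the study area become NoData (-9999 here).
map3 = np.where(mask == 1, map3, -9999)
print(map3)  # forest, non-forest, non-forest, rivers, NoData, non-forest
```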
3 Import raster files to Idrisi
The raster files generated in Section 2 must be converted to a format that can be imported
to Idrisi to become inputs in the Land Change Modeler (LCM).
Rasters in ArcGIS must be exported from GRID format to TIFF format. Then, in Idrisi,
TIFF files can be imported to Idrisi format (Fig. 16).
Figure 16: Importing TIFF files into Idrisi
IMPORTANT: Once imported, you need to set the categories for the Hansen classified
images that you’ll be working with. In our case, Forest Cover 2000 and Forest Cover 2014.
To do this, select the raster file, go to metadata, and open the "Categories" menu. Here
you'll set a name for each of the three categories in the forest cover raster file: 1) Forest; 2)
Non-forest; 3) Rivers (Fig. 17). Do the same for both forest cover raster images.
This way, the software will be able to identify by name the changes in forest-cover later in
the process.
Figure 17: Setting category names for the forest cover raster images
4 Import of vector files to Idrisi
Vector files must be imported from shapefile format to Idrisi format before they can be
used in the LCM. In the main menu go to File >Import > Software Specific Formats >
ESRI formats > SHAPEIDR (Fig. 18).
Figure 18: Finding the tool to import shapefiles to Idrisi
A new menu will open; here, select the vector file you want to import, the name for the
imported Idrisi vector file, and the Reference System for the Idrisi vector file, and click OK
(Fig. 19).
Figure 19: Importing shapefiles to Idrisi
Keep in mind that you will need at least a vector file of roads (primary and/or secondary). If
you have separate vector files for primary and secondary roads you’ll need to also generate
a separate vector file that combines both primary and secondary roads (you’ll need this
combined file as noted in section 6).
Finally, you’ll have to convert all your vector files into raster files using the tool
“RasterVector”. In the tool select a conversion option (depending on the type of vector file
you’re converting), select the vector file to be converted and the name for the raster file to
be generated. In "Operation type" just keep the default option and click OK (Fig. 20).
Figure 20: RasterVector menu
Once you click OK, a warning message will appear; click Yes. Then, the "Image
Initialization” menu will open. This is used to define the spatial parameters of the raster file
you are creating from a vector file. If you already have a raster file with the cell size you
want (in our case the forest cover 2000 and 2014 generated from Hansen dataset) keep the
default option “Copy spatial parameters from another image”. Keep the “Output image”
option as default, and in the “Image to copy parameters from” select the raster file you’ll
use to extract the cell size, and leave the rest as default. Once you click OK the process of
converting a vector to raster is completed (Fig. 21).
Figure 21: Image initialization menu
5 Generation of Factor Maps
IMPORTANT: Before initiating any data analysis make sure that all raster and vector layers are in
the same desired projection and coordinate system. For raster files, make sure all have the same
desired number of columns and rows and the same cell size; otherwise the software won't allow
you to start the modeling process. The number of columns and rows, as well as the cell size, can
be standardized by resampling the raster files in either ArcGIS or Idrisi beforehand.
Once all raster and vector files were imported to Idrisi, we proceeded to generate Factor
Maps.
Factor Maps (FM) represent the variables used to explain deforestation in the analysis
period and that will be used to project future deforestation.
In our case we selected eight (8) variables thus we had eight FM:
1. Distance to non-forest in 2000 (non-forest in the initial year)
2. Distance to rivers
3. Digital Elevation Model of 30 meters (DEM30)
4. Distance to Census 2011 points
5. Distance to major towns
6. Distance to primary roads
7. Distance to secondary roads
8. Distance to Special Agricultural Businesses Leases (SABL) areas
To generate the distance maps for seven of the variables (we do not generate a distance
map for the DEM; we use it as it is) we used the "Distance" tool in Idrisi. The feature
image is the variable we are working with, and the output image is the name under which
the new "distance to variable" file will be saved (Fig. 22).
Figure 22: Distance tool in Idrisi
Finally, we must multiply each distance raster by a mask of the study area. This mask is a
raster file with values 0 and 1. Value 1 is assigned to the study area and the value 0 to the
NoData areas (this raster mask can be created using the “Reclass” tool in Idrisi or ArcGIS).
To multiply each distance raster by the raster mask we used the “Overlay” tool in Idrisi
making sure to select “First*Second” in the Overlay options (Fig. 23):
Figure 23: Overlay tool to create mask raster from distance raster files
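Conceptually, each factor map is a distance surface multiplied by the study-area mask. A brute-force sketch with a toy rivers raster (Idrisi's Distance tool computes this efficiently; distances here are in pixel units):

```python
import numpy as np

rivers = np.zeros((5, 5), dtype=int)  # 1 = river cell, 0 = elsewhere
rivers[2, 2] = 1
mask = np.ones((5, 5), dtype=int)     # 1 = study area, 0 = NoData
mask[0, 0] = 0

# Euclidean distance from every cell to the nearest river cell.
rows, cols = np.indices(rivers.shape)
river_cells = np.argwhere(rivers == 1)
dist = np.min(
    np.sqrt((rows[..., None] - river_cells[:, 0]) ** 2
            + (cols[..., None] - river_cells[:, 1]) ** 2),
    axis=-1,
)

# Equivalent of the Overlay "First*Second" step: cells outside the
# study area are zeroed out.
dist_masked = dist * mask
print(dist_masked[2])  # middle row: [2. 1. 0. 1. 2.]
```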
Data for FM 1 and 2 came from our preparation of Hansen data explained in Section 1. Data
for FM 3 through 8 came from PNGFA.
Spatial variables and Distance Maps (Factor Maps) for the study area are presented below:
Figure 24: Non-forest in 2000 and distance to non-forest
Figure 25: Rivers and distance to rivers
Figure 26: Digital Elevation Model (DEM)
Figure 27: 2011 census points and distance to census points
Figure 28: Major towns and distance to major towns
Figure 29: Primary roads and distance to primary roads
Figure 30: Secondary roads and distance to secondary roads
Figure 31: SABL areas and distance to SABL areas
6 Calibration
With the raster distance files created we are ready to start with the modeling process. Go to
the main menu and select Modeling > Environmental/Simulation Models > Land Change
Modeler:ES (Fig. 32).
Figure 32: Route to open the Land Change Modeler tool
The Land Change Modeler tool consists of several tabs, each with sub-sections. We'll go
through each of the applicable tabs and sub-sections.
6.1 Change Analysis
The first tab is “Change Analysis” and it is here that you need to select a name for your
modeling project (Fig. 33).
Then, you will select the initial and final land cover images for your analysis. In our case,
these were forest cover 2000 and forest cover 2014. The tool identifies automatically the
years of each of the raster files (Fig. 33).
Check the box “REDD project”, then select the end date for the modeling period and the
interval for the model projections. In our case we selected 40 years into the future (2014-
2044) on 5-year intervals (Fig. 33).
Then you need to select the vector file with all the roads (primary and secondary), and the
DEM raster file. The selection of a palette is optional for the colors of the future projected
deforestation (Fig. 33).
Finally, click "Continue" (Fig. 34) and the software will calculate the gains and losses in
the analysis period (2000 - 2014), at which point it will show a gains and losses bar chart
(Fig. 35).
Figure 33: First tab of LCM Sub-section “Transition Sub-Model Structure”
Figure 34: Once all files are input into the LCM project parameters click Continue
Figure 35: Gain and losses between 2000 and 2014
Note: if the raster files do not have the same cell size or if category names have not been
assigned to the forest cover raster files, a warning message will appear and you won’t be able
to continue with the process until such issues are solved.
6.2 Transition Potentials
Next you’ll go to the “Transition Potential” tab (Fig. 36). The first sub-section indicates
which transitions will be evaluated. In our case, we only have one transition which is forest
to non-forest, thus that is the only one that appears in this sub-section. In this case we don’t
make any changes in this sub-section and proceed to the next one.
Figure 36: Tab “Transition Potentials”
We go directly to the sub-section “Test and Selection of site and Driver Variables” (Fig.
37). In this sub-section we can test the Cramer’s V index. This index represents the
correlation between variables and non-forest expansion in the assessment period and
presents values from 0 to 1 (the closer to 1, the stronger the correlation).
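For reference, Cramér's V can be computed from a cross-tabulation of a categorized driver variable against the change map. The sketch below applies the textbook formula V = sqrt(chi2 / (n * (min(rows, cols) - 1))) to toy arrays; Idrisi's implementation may differ in detail:

```python
import numpy as np

def cramers_v(x, y):
    """Cramér's V between two categorical arrays, in [0, 1]."""
    cats_x, inv_x = np.unique(x, return_inverse=True)
    cats_y, inv_y = np.unique(y, return_inverse=True)
    table = np.zeros((cats_x.size, cats_y.size))
    np.add.at(table, (inv_x, inv_y), 1)            # contingency table
    n = table.sum()
    expected = np.outer(table.sum(1), table.sum(0)) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    return float(np.sqrt(chi2 / (n * (min(table.shape) - 1))))

# A variable perfectly associated with the change map scores 1.0.
change = np.array([0, 0, 1, 1] * 50)
v = cramers_v(change, change)
print(round(v, 3))  # 1.0
```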
This step is optional, but it is helpful to discriminate among several variables when we are
deciding which ones to include in our model. If we decide to use it, we select the
variable to evaluate and then click on "Test Explanatory Power". If we decide that a
variable should be used in our model, we can add it directly by clicking on “Add to
Model”. If you decide not to test Cramer’s V coefficient, you can add variables manually
on the next sub-section.
Results will show in a table and, in this case, we should look at the Cramer’s V coefficient
for non-forest (Fig. 38), which is the transition we are interested in.
Figure 37: Sub-section “Test and Selection of Site and Driver Variables”
Figure 38: Results of evaluating the variable “distance to non-forest in 2000” for the Cramer’s V coefficient
In the next sub-section, "Transition Sub-Model Structure", you'll select the variables for the
deforestation model (the variables might already appear in this sub-section if you added
them during the assessment of Cramer’s V coefficient) (Fig. 39).
Figure 39: Sub-section “Transition Sub-Model Structure”
For each variable you must indicate its role, if it is “Static” or “Dynamic”. In our case we
selected as “Dynamic” variables: non-forest in 2000, distance to primary roads, and
distance to secondary roads. All the other variables remained as “Static”.
Then, we select the "Basis layer type" for the non-forest 2000, distance to primary roads,
and distance to secondary roads variables. We click in this field for the non-forest 2000
variable; a pop-up menu appears in which we select "non-forest", click "insert", and then
OK (Fig. 40). In the case of distance to roads (primary and secondary), we indicate
whether the variable represents primary or secondary roads, then click "insert", and then
OK (Fig. 41).
Figure 40: Pop-up menu for the basis layer type of the non-forest 2000 variable
Figure 41: Pop-up menu for the basis layer type of the distance to roads (primary and then secondary) variables
Once all variables have been input in the "Transition Sub-Model Structure" sub-section, we
are ready to move on to the next sub-section, called "Run Transition Sub-Model" (Fig. 42).
It is in this sub-section that we will first calculate the relevance weights of each variable
and then generate a sub-model for the transition under assessment, in this case, forest to
non-forest.
There are three options for a modeling approach. We selected Similarity-Weighted
Instance-based Machine Learning (SimWeight), which is the most appropriate when
assessing only one transition (forest to non-forest). SimWeight uses a slightly modified
variant of the algorithm described by Sangermano et al. (2010), a similarity-weighted
K-nearest neighbor procedure (Idrisi help).
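The idea of a similarity-weighted K-nearest-neighbor procedure can be sketched as follows. This is only a conceptual illustration with toy data, not Idrisi's actual SimWeight implementation:

```python
import numpy as np

def transition_potential(train_X, train_changed, query, k=3):
    """Toy similarity-weighted k-NN: the potential of a query pixel is
    the weight-normalized share of its k nearest training pixels that
    changed, with weights inversely related to distance."""
    d = np.linalg.norm(train_X - query, axis=1)
    nearest = np.argsort(d)[:k]
    w = 1.0 / (d[nearest] + 1e-9)
    return float(np.sum(w * train_changed[nearest]) / np.sum(w))

# Training pixels described by two (normalized) driver variables, with
# 1 = changed to non-forest and 0 = stayed forest.
X = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
changed = np.array([1, 1, 0, 0])

# A query pixel near the changed examples gets a high potential.
score = transition_potential(X, changed, np.array([0.15, 0.15]))
print(score)
```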
Figure 42: Sub-section “Run Transition Sub-Model”
We leave all values at their defaults and click on "Calculate Relevance Weights". This
process takes a few minutes and, as a result, we will have a chart with the relevance weight
of each selected variable for predicting the change from forest to non-forest in the
assessment period (Fig. 43). The relevance weight chart is an indication of each variable's
importance in discriminating change. For each variable, it compares the standard deviation
of the variable inside areas that have changed (Forest to Non-Forest) to the standard
deviation across the entire study area. For a variable to be important, it should have a
smaller standard deviation in the change area than across the entire study area. The graph
can be used as a guide to the utility of the variables, as well as an indication that more
variables may need to be identified for inclusion in the model.
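The intuition can be reproduced with a toy example: a driver variable is informative when its standard deviation inside the changed area is small relative to its standard deviation over the whole study area. The contrast measure below (1 minus the std ratio) is our own illustration; Idrisi's exact relevance-weight formula may differ:

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy driver variable over the whole study area (e.g. distance to roads).
dist_to_road = rng.uniform(0, 10, size=10_000)
# Deforestation concentrated near roads: only cells within 2 km changed.
changed = dist_to_road < 2.0

relevance = 1.0 - dist_to_road[changed].std() / dist_to_road.std()
print(round(relevance, 2))  # close to 0.8: a useful discriminator
```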
Figure 43: Relevance weight of each of the selected variables for the sub-model
Once the relevance weight of the variables has been calculated, we proceed to run the sub-
model. You'll notice that the button we previously used to "Calculate Relevance Weights"
has now changed into "Run Sub-Model". Click on "Run Sub-Model" and leave the software
to calculate the sub-model for the assessment period (Fig. 44). This operation will take
several minutes depending on the size of the study area (in our case it took about 5 hours).
The result from the sub-model calculation is a soft-prediction or deforestation risk map for
the assessment period (Fig. 45).
Figure 44: Option to run the sub-model or soft-prediction
Figure 45: Soft-prediction map or deforestation risk map for the assessment period
Once the sub-model has been calculated we can then proceed to run the actual future
deforestation model as explained in the next section.
6.3 Change Prediction
We can now go to the next tab “Change Prediction” (Fig. 46). In our case we didn’t account
for the expansion of roads so we didn’t use the sub-section “Dynamic Road Development”.
We go straight to the sub-section “Change Allocation”.
In our case we didn’t include any dynamic road development, changes in infrastructure or
zones of constraint/incentives, thus we left unchecked all the options in the “Optional
Components” box.
We leave all other options by default and proceed to click on “Run Model” (Fig. 46). At
this point the software will start developing the future deforestation model based on the
variables we have chosen and for the period and intervals selected. This process will take
significant time because the software will calculate a soft-prediction for each transition
period and, based on this, will generate a hard-prediction or spatial distribution of
deforestation for the selected year.
Figure 46: Sub-section “Change Allocation”
6.4 Results
Two basic models of change are provided: a hard prediction model and a soft prediction
model (Fig. 47). The hard prediction model is based on a competitive land allocation model
similar to a multi-objective decision process. The soft prediction yields a map of vulnerability
to change for the selected set of transitions. Hard and Soft prediction maps are generated for
each year. The output is a series of predicted land cover maps for the study area.
Figure 47: Soft-prediction (above) and hard-prediction (below) models
The resulting hard-prediction maps are raster files from which we can calculate areas of
non-forest increase (deforestation). These raster files can be processed in Idrisi or
exported to ArcGIS.
7 References
Eastman, J.R., 2012. "Land Change Modeler." Idrisi Selva Tutorial, Manual Version 17, 264-280.
Hansen, M.C., et al., 2013. "High-resolution global maps of 21st-century forest cover change." Science, 342(6160), 850-853.
PNGFA, 2013. "Forest and Land Use in Papua New Guinea 2013." Port Moresby.
Puyravaud, J.P., 2003. "Standardizing the calculation of the annual rate of deforestation." Forest Ecology and Management, 177, 593-596.
Sangermano, F., Eastman, J.R., Zhu, H., 2010. "Similarity weighted instance based learning for the generation of transition potentials in land change modeling." Transactions in GIS, 14(5), 569-580.
Takada, T., Miyamoto, A., Hasegawa, S.F., 2010. "Derivation of a yearly transition probability matrix for land-use dynamics and its applications." Landscape Ecology, 25, 561-572.
8 Annexes
8.1 Annex 1: Complete process flow for preparing Hansen Dataset to generate Forest/Non-forest maps for the years
2000 and 2014
8.2 Annex 2: Validation
A commonly used step in deforestation modeling is model validation. This allows the analyst
to get an idea of the accuracy of the model. In our case, the scope and time of the
assignment, as well as the availability of data, did not allow for a validation of the deforestation
models. However, it is suggested that PNGFA validate all the deforestation models it
generates as new data become available.
Model validation is needed to determine which of the deforestation risk maps is the most
accurate, in order to confirm the quality of the model output. Confirming a model output
requires both a "Calibration" and a "Validation" stage. For example, imagine we have data
for three points in time: 1996, 2004, and 2008. In this case, there are two historical
periods (1996-2004 and 2004-2008) that show similar deforestation trends. Data
from the most recent period (2004-2008) can be used as the "validation" data set and data
from the previous period (1996-2004) as the "calibration" data set.
With data from the calibration period (1996-2004), we prepare a Risk Map and a Prediction
Map of the deforestation for the validation period (2004-2008). Then, predicted deforestation
for 2008 will be overlaid with locations that were actually deforested in 2008 (land cover
map for 2008).
It is necessary to select the Prediction Map that best fits the real map and best reproduces
actual deforestation in the validation period.
In this step, the hard prediction 2008 will be used to validate the model, given that the actual
land cover map for 2008 is already known. The output map is a 3-way cross-tabulation
between the projected or “predicted” 2008 map and the actual 2008 map (the “reality” map).
Area calculations for both the predicted and the actual 2008 map should be similar because we
used the actual rate of change in the period 2004-2008. What we want to do in this case is to
validate whether the projected locations of change are similar to those of actual changes in
this period.
One of the assessment techniques that can be used is the "Figure of Merit" (FOM), which
confirms the model prediction in a statistical manner. The FOM is the ratio of the intersection
of the observed change and the predicted change to the union of the observed change and the
predicted change; it ranges from 0 (no overlap between observed and predicted change) to
1.0 (perfect overlap between observed and predicted change).
Results show three quantities for analysis:
- False Alarms (C), which are areas where we predicted a change from forest to non-
forest but no change occurred;
- Misses (A), which are areas where a change was not predicted but pixels actually
changed from forest to non-forest; and
- Hits (B), which are pixels that were predicted to change to non-forest and did change.
The Figure of Merit (FOM) is calculated from these values as presented below:
FOM = B / (A+B+C)
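Given binary maps of predicted and observed change, the FOM can be computed directly; the arrays below are toy examples:

```python
import numpy as np

predicted = np.array([1, 1, 0, 0, 1, 0], dtype=bool)  # model says change
observed  = np.array([1, 0, 0, 1, 1, 0], dtype=bool)  # reality

misses       = np.count_nonzero(~predicted & observed)  # A
hits         = np.count_nonzero(predicted & observed)   # B
false_alarms = np.count_nonzero(predicted & ~observed)  # C

fom = hits / (misses + hits + false_alarms)
print(fom)  # 2 / (1 + 2 + 1) = 0.5
```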