METHODOLOGICAL GUIDELINE TO PRODUCE A
FUTURE DEFORESTATION MODEL FOR PALM
OIL EXPANSION IN PAPUA NEW GUINEA USING
By:
Giancarlo Raschio
Freddie Alei
November, 2016
Table of Contents

1 Data and Methods Used
2 Preparation of Hansen data for the deforestation model
2.1 First stage
2.2 Second stage
2.3 Third stage
2.4 Fourth stage
3 Import raster files to Idrisi
4 Import of vector files to Idrisi
5 Generation of Factor Maps
6 Calibration
6.1 Change Analysis
6.2 Transition Potentials
6.3 Change Prediction
7 References
8 Annexes
8.1 Annex 1: Complete process flow for preparing Hansen Dataset to generate Forest/Non-forest maps for the years 2000 and 2014
8.2 Annex 2: Validation
PROJECTION OF THE QUANTITY AND LOCATION OF FUTURE
DEFORESTATION
This report presents the methodology and results to locate in space and time the baseline
deforestation expected to occur within the STUDY AREA during the project crediting period.
1 Data and Methods Used
To project deforestation into the future it is necessary to have remote sensing data on land-cover
for at least two points in time. In our case we used the publicly and freely available Hansen
dataset (Hansen et al. 2013). Of course, the objective for PNGFA is to use its own classified
remote sensing imagery for the analysis of future deforestation.
The Hansen dataset provides the following data:
i) Forest cover in the year 2000 (tcover): this data refers to a land-cover classification of two classes: tree cover and tree-cover loss1
ii) Annual tree-cover loss for each year between 2000 and 2014 (lossyear): this data
refers to the tree-cover that has been lost in each of the 14 years between 2000
and 2014
iii) A data mask layer (dmask): this data presents information about features that are neither tree cover nor tree-cover loss, such as rivers and other water bodies.
iv) A data layer of tree-cover gain between 2000 and 2014 (gain): this data refers to
tree-cover that has been regenerated between 2000 and 2014. However, it should
be taken into account that in the case of PNG such gain or “regeneration” is the
result of farming cycles; areas that appeared as regenerated tree-cover in 2014
were plantations already present in 2000 that grew their crops. Therefore, it is
important to consider this layer to avoid overestimating baseline deforestation.
The objective was to first generate forest and non-forest classified images for the years 2000
and 2014. Of course, the deforestation modeling process can be replicated and updated using
a different set of images.
Besides the aforementioned data, we also used a layer with official data on agricultural and
forestry plantations in PNG (Qa,Qf), which was provided by PNGFA. This data layer is key
because it allowed us to identify which areas were already plantations in the year 2000.
The software we used was the module “Land Change Modeler” (LCM) from Idrisi Selva2.
For modeling, the method of Similarity-Weighted Instance-based Machine Learning
(SimWeight) was used. The transition that will be evaluated is Forest to Non-Forest.
The software calculated the deforestation rate based on the classified images using a Markov
Matrix. Once the deforestation rate was calculated, the model estimates the quantity and
location of future deforestation.
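The rate calculation can be sketched with the standardized annual rate-of-change formula of Puyravaud (2003), listed in the references. The forest-area figures below are purely illustrative, not PNG data:

```python
import math

def annual_rate_of_change(area_t1, area_t2, t1, t2):
    """Puyravaud (2003): r = (1 / (t2 - t1)) * ln(A2 / A1).
    Negative values indicate forest loss."""
    return (1.0 / (t2 - t1)) * math.log(area_t2 / area_t1)

# Illustrative figures only: 10.0 Mha of forest in 2000, 9.3 Mha in 2014.
r = annual_rate_of_change(10.0, 9.3, 2000, 2014)
print(f"annual rate of change: {r:.4%}")  # about -0.52% per year
```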
1 Tree-cover loss is not the same as deforestation: what counts as a forest depends on the definition adopted by each country, whereas the Hansen dataset identifies tree-cover loss regardless of any national forest definition.
2 http://clarklabs.org/applications/upload/IDRISI_Focus_Paper_REDD.pdf
First, we need to create a validation model that will test a set of driver variables that are
assumed to describe the change from Forest to Non-Forest in the study area or selected
provinces. Land cover maps from two points in time were created for this purpose: 2000 and
2014. The process will assess forest loss between 2000 and 2014 and then use the
calculated deforestation rate to predict future deforestation. Through this step, we get to test
and identify the various driver variables to better match the predicted map to the map of
reality.
2 Preparation of Hansen data for the deforestation model
To prepare the data from Hansen dataset we used the Model Builder tool in ArcGIS 10.1
software. The objective is to obtain two results: i) a forest/non-forest cover map for 2000, and ii) a forest/non-forest cover map for 2014. The overall model to process the Hansen
datasets (Fig. 1) has been divided into four (4) stages for didactic purposes (see Annex 1 for a
larger figure).
Figure 1: process flow in Model Builder to prepare Hansen datasets to produce one Forest/Non-forest map for the year 2000 and one for the year 2014.
IMPORTANT: Before initiating any data analysis make sure that all raster and vector layers are in
the same desired projection and coordinate system. For raster files, make sure all have the same
desired number of columns and rows and the same cell size; otherwise the software won't allow
you to start the modeling process. The number of columns and rows, as well as the cell size, can
be standardized by resampling the raster files in either ArcGIS or Idrisi beforehand.
2.1 First stage
First, we need to use three raster files: gain, tcover, and the layer with data on existing
agricultural and forestry plantations from PNGFA (Qa,Qf). These three layers need to be
reclassified to adequate raster values and then combined through a weighted sum (Fig. 2).
Figure 2: outlook of stage one “Forest and Non-forest in 2000 without gain cover”
The raster "gain" should be reclassified by setting all values larger than zero to ten
(Fig 3).
Figure 3: reclassification of the “gain” raster.
The raster "tcover" should be reclassified by selecting a threshold to define what percentage of tree cover should be considered forest and what should be considered non-forest. In our case, we tried different thresholds, ran the whole model, and then compared the area of forest in our resulting "Map 3" raster (map of forest/non-forest in 2014) to the actual area of forest in the study area or province according to the official data from Papua New Guinea's Forest Authority (PNGFA 2013) (Fig 4). The idea is to input an initial threshold percentage (see red square in Fig. 4) so that anything above it is classed as forest and anything below it as non-forest. Then, we run the model, calculate the areas of forest, and compare them with the official forest area for the study province according to PNGFA. We must be as close as possible to the official data and, to be conservative, it is better to have slightly less forest than the official source. This is because the less initial forest we have (more non-forest), the less forest can be converted to non-forest in the future (thus lower projected deforestation, which is a conservative assumption).
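The iterative threshold selection can be sketched as a simple search. The tcover values, cell size, and official forest area below are hypothetical stand-ins, not PNGFA figures:

```python
import numpy as np

rng = np.random.default_rng(0)
tcover = rng.integers(0, 101, size=(100, 100))  # percent tree cover, toy data
cell_area_ha = 0.09                             # ~30 m x 30 m pixels
official_forest_ha = 500.0                      # hypothetical official figure

best = None
for threshold in range(10, 95, 5):
    forest_ha = np.count_nonzero(tcover >= threshold) * cell_area_ha
    # Conservative rule: forest area closest to, but not above,
    # the official figure.
    if forest_ha <= official_forest_ha and (best is None or forest_ha > best[1]):
        best = (threshold, forest_ha)

print("chosen threshold and forest area:", best)
```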
This step was necessary because of three reasons: i) the “tcover” raster from the Hansen
dataset presents values of tree cover in percentage (from 0 to 100); ii) using the official
forest definition of PNG would generate a forest area much larger than the official records,
and; iii) there was no official classification of forest in 2000.
Figure 4: reclassification of the “tcover” raster
The raster “QaQf” only contained values for the plantation areas. For the model we cannot
have No Data values within our study area or province. For this reason, it was necessary to
allocate a value to the background of the study area or province. So, as a preliminary step,
the QaQf raster had to be converted to a binary raster of 0 and 1 values. This was done via a
reclassification (tool Reclassify of ArcGIS) with the following setting:
Original Value New Value
NoData 0
1 - 151 1
Also, a working environment was set up using the "Environments" option of the
"Reclassify" tool. We selected "Processing Extent" and set the extent to be the same as
the "fcover" raster. Then, we selected "Raster Analysis" and set the extent to be the
same as the "fcover" raster. Then click OK, and OK again, to run the reclassification.
So, once we had the binary QaQf, we used it in Model Builder and indicated that all
values equal to zero should remain zero and that all values higher than zero should be
converted to three (Fig 5).
Figure 5: reclassification of the plantation binary raster (QaQf_bi)
What the Weighted Sum tool will do is sum the values of the three overlaying rasters.
We assigned an equal weight of "1" to all rasters. So, we are adding the following
rasters:
GAIN Description
0 Background
10 Gain
TCOVER Description
1 Forest in 2000
2 Non-forest in 2000
QaQf Description
0 Background
3 Existing Plantations in 2000
As a result, the “Weighted_t1” raster will have the following values which should be
reclassified (Fig. 6) to new values (Table 1):
Table 1: Meaning of the values generated via the Weighted Sum tool
Values Meaning New Class
1 Forest on background Forest
2 Non-forest on background Non-forest
4 Forest that falls over existing plantation Non-forest
5 Non-forest on existing plantation Non-forest
11 Forest on Gain Non-forest
12 Non-forest on Gain Non-forest
14 Forest on Gain on existing plantation Non-forest
15 Non-forest on Gain on existing plantation Non-forest
Figure 6: reclassification of the weighted sum results
Resulting from this first stage we have the “Forest2000” raster, which has two classes, (1)
forest and (2) non-forest, and accounts for the presence of “gain” and “plantations”.
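The reclassify-and-weighted-sum logic of this stage can be illustrated with toy arrays standing in for the real rasters, using the class values from the tables above:

```python
import numpy as np

gain   = np.array([0, 10, 0, 0, 10, 0])  # 0 = background, 10 = gain
tcover = np.array([1, 1, 2, 1, 2, 1])    # 1 = forest, 2 = non-forest
qaqf   = np.array([0, 0, 0, 3, 3, 0])    # 0 = background, 3 = plantation

# Weighted Sum with equal weights of 1 is a plain cell-by-cell addition.
weighted = gain + tcover + qaqf

# Per Table 1, only value 1 (forest on background) remains forest;
# every other combination becomes non-forest (2).
forest2000 = np.where(weighted == 1, 1, 2)
print(forest2000)  # [1 2 2 2 2 1]
```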
2.2 Second stage
In this stage we’ll generate a raster with three classes: (1) forest, (2) non-forest, and (3)
rivers/water that we’ll call “Map1”. This Map 1 represents the Forest in 2000, which is our
initial year of the analysis period (Fig. 7).
Figure 7: outlook of stage two
For this stage we used two inputs: i) the "dmask" raster (refer to Section 1), and ii) the
“Forest2000” raster that we generated as a result of Stage 1. The dmask raster was used to
identify rivers/water and to overlay these on the Forest2000 raster, which only contains data
on forest/non-forest. We used the dmask raster and reclassified its values so initial class “2”
would be final class “3” (Fig. 8).
Figure 8: reclassification of dmask
What the Weighted Sum tool will do is sum the values of the two overlaying rasters. We
assigned an equal weight of “1” to all rasters. So, we are adding the following rasters:
RIVERS Description
0 Background
3 Rivers/Water
FOREST2000 Description
1 Forest in 2000
2 Non-forest in 2000
As a result, we had an intermediate raster called “t3” with the following values, which
should then be reclassified (Fig. 9) to new values (Table 2):
Figure 9: reclassification of “t3” intermediate raster
Table 2: Meaning of the values generated via the Weighted Sum tool
Values Meaning New Class
1 Forest on background Forest
2 Non-forest on background Non-forest
4 Forest on rivers/water Rivers/water
5 Non-forest on rivers/water Non-forest
Finally, resulting from this stage we will have Map1 that is the Forest/Non-forest Map in
2000 with 3 classes. Then we can decide to convert the raster to polygon to calculate the
areas.
2.3 Third stage
In this stage we generated a map of forest cover loss between the years 2000 and 2014 (Fig.
10).
For this stage we used two inputs: i) Map1 (Forest/Non-forest Map in 2000 with 3 classes)
from Stage 2, and; ii) the lossyear raster (see Section 1).
First we reclassified the lossyear raster so tree-cover loss for the year 2000 (class 1) was
converted to background data (class 0). All the other classes, representing tree-cover loss
between 2001 and 2013 (classes 2 to 14), were reclassified to class 3 (Fig. 11).
Figure 10: sub-process flow of Stage 3
Figure 11: reclassification of lossyear raster
What the Weighted Sum tool will do is sum the values of the two overlaying rasters. We
assigned an equal weight of "1" to all rasters. So, we are adding the following rasters:
LOSSYEAR Description
0 Background
3 New Non-forest (2001-2013)

FOREST2000 (MAP1) Description
1 Forest in 2000
2 Non-forest in 2000
3 Rivers/Water
As a result, we had an intermediate raster called “t5” with the following values, which
should then be reclassified (Fig. 12) to new values (Table 3):
Table 3: Meaning of the values generated via the Weighted Sum tool
Values Meaning New Class
1 Forest on background Forest
2 Non-forest on background Non-forest
3 Rivers/water on background Rivers/water
4 Forest2000 on new non-forest New Non-forest
5 Non-forest2000 on new non-forest New Non-forest
Figure 12: reclassification of “t5” intermediate raster
Finally, resulting from this stage we will have Map2, the map of forest cover loss
between 2000 and 2014, with 4 classes. Then we can decide to convert the raster to
polygon to calculate the areas.
2.4 Fourth stage
In this final stage we generated the Forest 2014 map, which we called Map3 (Fig. 13).
For this stage we used two inputs: i) Intermediate “t5” raster from Stage 3 (see Section 2.3),
and; ii) the dmask raster (see Section 1).
Figure 13: Outlook of Stage 4
In this case we used the “dmask” raster to generate a mask within which the reclassification
of the intermediate raster t5 will take place. We reclassified classes 1 and 2 to a new class
1, and values "0" to NoData (Fig. 14).
Figure 14: reclassification of dmask raster to generate a mask of the study area
Then, we reclassified the intermediate raster "t5" so all non-forest classes (2 and 4) should
become a single non-forest class (2). Forest (1) and Rivers/Water (3) remain with the same
class values (Fig. 15).
Figure 15: reclassification of the intermediate raster “t5” to have only one non-forest class
Finally, resulting from this stage we will have Map3 that is the Forest/Non-forest Map in
2014 with 3 classes. Then we can decide to convert the raster to polygon to calculate the
areas.
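The masking and merging of non-forest classes in this final stage can be sketched with toy arrays; the value -9999 stands in for NoData:

```python
import numpy as np

# "t5": 1 = forest, 2 = non-forest, 3 = rivers/water, 4 = new non-forest.
t5   = np.array([1, 2, 4, 3, 1, 4])
# Mask derived from dmask: 1 = inside the study area, 0 = outside.
mask = np.array([1, 1, 1, 1, 0, 1])

# Merge the two non-forest classes (2 and 4) into a single class 2.
map3 = np.where(np.isin(t5, [2, 4]), 2, t5)
# Cells outside the study area become NoData (-9999 here).
map3 = np.where(mask == 1, map3, -9999)
print(map3)  # forest, non-forest, non-forest, rivers, NoData, non-forest
```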
3 Import raster files to Idrisi
The raster files generated in Section 2 must be converted to a format that can be imported
to Idrisi to become inputs in the Land Change Modeler (LCM).
Rasters in ArcGIS must be exported from GRID format to TIFF format. Then, in Idrisi,
TIFF files can be imported to Idrisi format (Fig. 16).
Figure 16: Importing TIFF files into Idrisi
IMPORTANT: Once imported, you need to set the categories for the Hansen classified
images that you’ll be working with. In our case, Forest Cover 2000 and Forest Cover 2014.
To do this, select the raster file, go to metadata, and open the "Categories" menu. Here
you'll set a name for each of the three categories in the forest cover raster file: 1) Forest; 2)
Non-forest; 3) Rivers (Fig. 17). Do the same for both forest cover raster images.
This way, the software will be able to identify by name the changes in forest-cover later in
the process.
Figure 17: Setting category names for the forest cover raster images
4 Import of vector files to Idrisi
Vector files must be imported from shapefile format to Idrisi format before they can be
used in the LCM. In the main menu go to File >Import > Software Specific Formats >
ESRI formats > SHAPEIDR (Fig. 18).
Figure 18: Finding the tool to import shapefiles to Idrisi
A new menu will open; here, select the vector file you want to import, the name for the
imported Idrisi vector file, and the Reference System for the Idrisi vector file, and click OK
(Fig. 19).
Figure 19: Importing shapefiles to Idrisi
Keep in mind that you will need at least a vector file of roads (primary and/or secondary). If
you have separate vector files for primary and secondary roads you’ll need to also generate
a separate vector file that combines both primary and secondary roads (you’ll need this
combined file as noted in section 6).
Finally, you’ll have to convert all your vector files into raster files using the tool
“RasterVector”. In the tool select a conversion option (depending on the type of vector file
you’re converting), select the vector file to be converted and the name for the raster file to
be generated. In "Operation type" just keep the default option and click OK (Fig. 20).
Figure 20: RasterVector menu
Once you click OK, a warning message will appear; click Yes. Then, the "Image
Initialization” menu will open. This is used to define the spatial parameters of the raster file
you are creating from a vector file. If you already have a raster file with the cell size you
want (in our case the forest cover 2000 and 2014 generated from Hansen dataset) keep the
default option “Copy spatial parameters from another image”. Keep the “Output image”
option as default, and in the “Image to copy parameters from” select the raster file you’ll
use to extract the cell size, and leave the rest as default. Once you click OK the process of
converting a vector to raster is completed (Fig. 21).
Figure 21: Image initialization menu
5 Generation of Factor Maps
IMPORTANT: Before initiating any data analysis make sure that all raster and vector layers are in
the same desired projection and coordinate system. For raster files, make sure all have the same
desired number of columns and rows and the same cell size; otherwise the software won't allow
you to start the modeling process. The number of columns and rows, as well as the cell size, can
be standardized by resampling the raster files in either ArcGIS or Idrisi beforehand.
Once all raster and vector files were imported to Idrisi, we proceeded to generate Factor
Maps.
Factor Maps (FM) represent the variables used to explain deforestation in the analysis
period and that will be used to project future deforestation.
In our case we selected eight (8) variables thus we had eight FM:
1. Distance to non-forest in 2000 (non-forest in the initial year)
2. Distance to rivers
3. Digital Elevation Model of 30 meters (DEM30)
4. Distance to Census 2011 points
5. Distance to major towns
6. Distance to primary roads
7. Distance to secondary roads
8. Distance to Special Agricultural Businesses Leases (SABL) areas
To generate the distance maps for seven of the variables (we do not generate a distance
map for the DEM; we use it as it is) we used the "Distance" tool in Idrisi. The feature
image is the variable we are working with, and the output image is the name under which
the new "distance to variable" file will be saved (Fig. 22).
Figure 22: Distance tool in Idrisi
Finally, we must multiply each distance raster by a mask of the study area. This mask is a
raster file with values 0 and 1. Value 1 is assigned to the study area and the value 0 to the
NoData areas (this raster mask can be created using the “Reclass” tool in Idrisi or ArcGIS).
To multiply each distance raster by the raster mask we used the “Overlay” tool in Idrisi
making sure to select “First*Second” in the Overlay options (Fig. 23):
Figure 23: Overlay tool to create mask raster from distance raster files
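Conceptually, each factor map is a distance surface multiplied by the study-area mask. A brute-force sketch with a toy rivers raster (Idrisi's Distance tool computes this efficiently; distances here are in pixel units):

```python
import numpy as np

rivers = np.zeros((5, 5), dtype=int)  # 1 = river cell, 0 = elsewhere
rivers[2, 2] = 1
mask = np.ones((5, 5), dtype=int)     # 1 = study area, 0 = NoData
mask[0, 0] = 0

# Euclidean distance from every cell to the nearest river cell.
rows, cols = np.indices(rivers.shape)
river_cells = np.argwhere(rivers == 1)
dist = np.min(
    np.sqrt((rows[..., None] - river_cells[:, 0]) ** 2
            + (cols[..., None] - river_cells[:, 1]) ** 2),
    axis=-1,
)

# Equivalent of the Overlay "First*Second" step: cells outside the
# study area are zeroed out.
dist_masked = dist * mask
print(dist_masked[2])  # middle row: [2. 1. 0. 1. 2.]
```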
Data for FM 1 and 2 came from our preparation of Hansen data explained in Section 1. Data
for FM 3 through 8 came from PNGFA.
Spatial variables and Distance Maps (Factor Maps) for the study area are presented below:
Figure 24: Non-forest in 2000 and distance to non-forest
Figure 25: Rivers and distance to rivers
Figure 26: Digital Elevation Model (DEM)
Figure 27: 2011 census points and distance to census points
Figure 28: Major towns and distance to major towns
Figure 29: Primary roads and distance to primary roads
Figure 30: Secondary roads and distance to secondary roads
Figure 31: SABL areas and distance to SABL areas
6 Calibration
With the raster distance files created we are ready to start with the modeling process. Go to
the main menu and select Modeling > Environmental/Simulation Models > Land Change
Modeler:ES (Fig. 32).
Figure 32: Route to open the Land Change Modeler tool
The Land Change Modeler tool consists of several tabs, each with sub-sections. We'll go
through each of the applicable tabs and sub-sections.
6.1 Change Analysis
The first tab is “Change Analysis” and it is here that you need to select a name for your
modeling project (Fig. 33).
Then, you will select the initial and final land cover images for your analysis. In our case,
these were forest cover 2000 and forest cover 2014. The tool identifies automatically the
years of each of the raster files (Fig. 33).
Check the box “REDD project”, then select the end date for the modeling period and the
interval for the model projections. In our case we selected 40 years into the future (2014-
2044) on 5-year intervals (Fig. 33).
Then you need to select the vector file with all the roads (primary and secondary), and the
DEM raster file. The selection of a palette is optional for the colors of the future projected
deforestation (Fig. 33).
Finally, click "Continue" (Fig. 34) and the software will calculate the gains and losses in
the analysis period (2000 - 2014), at which point it will show a gains and losses bar chart
(Fig. 35).
Figure 33: First tab of LCM Sub-section “Transition Sub-Model Structure”
Figure 34: Once all files are input into the LCM project parameters click Continue
Figure 35: Gain and losses between 2000 and 2014
Note: if the raster files do not have the same cell size or if category names have not been
assigned to the forest cover raster files, a warning message will appear and you won’t be able
to continue with the process until such issues are solved.
6.2 Transition Potentials
Next you’ll go to the “Transition Potential” tab (Fig. 36). The first sub-section indicates
which transitions will be evaluated. In our case, we only have one transition which is forest
to non-forest, thus that is the only one that appears in this sub-section. In this case we don’t
make any changes in this sub-section and proceed to the next one.
Figure 36: Tab “Transition Potentials”
We go directly to the sub-section “Test and Selection of site and Driver Variables” (Fig.
37). In this sub-section we can test the Cramer’s V index. This index represents the
correlation between variables and non-forest expansion in the assessment period and
presents values from 0 to 1 (the closer to 1, the stronger the correlation).
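For reference, Cramér's V can be computed from a cross-tabulation of a categorized driver variable against the change map. The sketch below applies the textbook formula V = sqrt(chi2 / (n * (min(rows, cols) - 1))) to toy arrays; Idrisi's implementation may differ in detail:

```python
import numpy as np

def cramers_v(x, y):
    """Cramér's V between two categorical arrays, in [0, 1]."""
    cats_x, inv_x = np.unique(x, return_inverse=True)
    cats_y, inv_y = np.unique(y, return_inverse=True)
    table = np.zeros((cats_x.size, cats_y.size))
    np.add.at(table, (inv_x, inv_y), 1)            # contingency table
    n = table.sum()
    expected = np.outer(table.sum(1), table.sum(0)) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    return float(np.sqrt(chi2 / (n * (min(table.shape) - 1))))

# A variable perfectly associated with the change map scores 1.0.
change = np.array([0, 0, 1, 1] * 50)
v = cramers_v(change, change)
print(round(v, 3))  # 1.0
```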
This step is optional, but it is helpful to discriminate among several variables when we are
deciding which ones to include in our model. If we decide to use it, we select the
variable to evaluate and then click on "Test Explanatory Power". If we decide that a
variable should be used in our model, we can add it directly by clicking on “Add to
Model”. If you decide not to test Cramer’s V coefficient, you can add variables manually
on the next sub-section.
Results will show in a table and, in this case, we should look at the Cramer’s V coefficient
for non-forest (Fig. 38), which is the transition we are interested in.
Figure 37: Sub-section “Test and Selection of Site and Driver Variables”
Figure 38: Results of evaluating the variable “distance to non-forest in 2000” for the Cramer’s V coefficient
In the next sub-section, "Transition Sub-Model Structure", you'll select the variables for the
deforestation model (the variables might already appear in this sub-section if you added
them during the assessment of Cramer’s V coefficient) (Fig. 39).
Figure 39: Sub-section “Transition Sub-Model Structure”
For each variable you must indicate its role, if it is “Static” or “Dynamic”. In our case we
selected as “Dynamic” variables: non-forest in 2000, distance to primary roads, and
distance to secondary roads. All the other variables remained as “Static”.
Then, we select the "Basis layer type" for the non-forest 2000, distance to primary roads,
and distance to secondary roads variables. We click in this field for the non-forest 2000
variable; a pop-up menu appears in which we select "non-forest", click "insert", and then
OK (Fig. 40). In the case of distance to roads (primary and secondary), we indicate
whether the variable represents primary or secondary roads, then click "insert", and then
OK (Fig. 41).
Figure 40: Pop-up menu for the basis layer type of the non-forest 2000 variable
Figure 41: Pop-up menu for the basis layer type of the distance to roads (primary and then secondary) variables
Once all variables have been input in the "Transition Sub-Model Structure" sub-section, we
are ready to move on to the next sub-section, called "Run Transition Sub-Model" (Fig. 42).
It is in this sub-section that we will first calculate the relevance weights of each variable
and then generate a sub-model for the transition under assessment, in this case, forest to
non-forest.
There are three options for a modeling approach. We selected Similarity-Weighted
Instance-based Machine Learning (SimWeight), which is the most appropriate when
assessing only one transition (forest to non-forest). SimWeight uses a slightly modified
variant of the algorithm described by Sangermano et al. (2010), a similarity-weighted
K-nearest neighbor procedure (Idrisi help).
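The idea of a similarity-weighted K-nearest-neighbor procedure can be sketched as follows. This is only a conceptual illustration with toy data, not Idrisi's actual SimWeight implementation:

```python
import numpy as np

def transition_potential(train_X, train_changed, query, k=3):
    """Toy similarity-weighted k-NN: the potential of a query pixel is
    the weight-normalized share of its k nearest training pixels that
    changed, with weights inversely related to distance."""
    d = np.linalg.norm(train_X - query, axis=1)
    nearest = np.argsort(d)[:k]
    w = 1.0 / (d[nearest] + 1e-9)
    return float(np.sum(w * train_changed[nearest]) / np.sum(w))

# Training pixels described by two (normalized) driver variables, with
# 1 = changed to non-forest and 0 = stayed forest.
X = np.array([[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9]])
changed = np.array([1, 1, 0, 0])

# A query pixel near the changed examples gets a high potential.
score = transition_potential(X, changed, np.array([0.15, 0.15]))
print(score)
```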
Figure 42: Sub-section “Run Transition Sub-Model”
We leave all values at their defaults and click on "Calculate Relevance Weights". This
process takes a few minutes and, as a result, we will have a chart with the relevance weight
of each selected variable for predicting the change from forest to non-forest in the
assessment period (Fig. 43). The relevance weight chart is an indication of each variable's
importance in discriminating change. For each variable, it compares the standard deviation
of the variable inside areas that have changed (Forest to Non-Forest) to the standard
deviation across the entire study area. For a variable to be important, it should have a
smaller standard deviation in the change area than across the entire study area. The graph
can be used as a guide to the utility of the variables, as well as an indication that more
variables may need to be identified for inclusion in the model.
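The intuition can be reproduced with a toy example: a driver variable is informative when its standard deviation inside the changed area is small relative to its standard deviation over the whole study area. The contrast measure below (1 minus the std ratio) is our own illustration; Idrisi's exact relevance-weight formula may differ:

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy driver variable over the whole study area (e.g. distance to roads).
dist_to_road = rng.uniform(0, 10, size=10_000)
# Deforestation concentrated near roads: only cells within 2 km changed.
changed = dist_to_road < 2.0

relevance = 1.0 - dist_to_road[changed].std() / dist_to_road.std()
print(round(relevance, 2))  # close to 0.8: a useful discriminator
```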
Figure 43: Relevance weight of each of the selected variables for the sub-model
Once the relevance weight of the variables has been calculated, we proceed to run the sub-
model. You'll notice that the button we previously used to "Calculate Relevance Weights"
has now changed into "Run Sub-Model". Click on "Run Sub-Model" and leave the software
to calculate the sub-model for the assessment period (Fig. 44). This operation will take
several minutes depending on the size of the study area (in our case it took about 5 hours).
The result from the sub-model calculation is a soft-prediction or deforestation risk map for
the assessment period (Fig. 45).
Figure 44: Option to run the sub-model or soft-prediction
Figure 45: Soft-prediction map or deforestation risk map for the assessment period
Once the sub-model has been calculated we can then proceed to run the actual future
deforestation model as explained in the next section.
6.3 Change Prediction
We can now go to the next tab “Change Prediction” (Fig. 46). In our case we didn’t account
for the expansion of roads so we didn’t use the sub-section “Dynamic Road Development”.
We go straight to the sub-section “Change Allocation”.
In our case we didn’t include any dynamic road development, changes in infrastructure or
zones of constraint/incentives, thus we left unchecked all the options in the “Optional
Components” box.
We leave all other options by default and proceed to click on “Run Model” (Fig. 46). At
this point the software will start developing the future deforestation model based on the
variables we have chosen and for the period and intervals selected. This process will take
significant time because the software will calculate a soft-prediction for each transition
period and, based on this, will generate a hard-prediction or spatial distribution of
deforestation for the selected year.
Figure 46: Sub-section “Change Allocation”
6.4 Results
Two basic models of change are provided: a hard prediction model and a soft prediction
model (Fig. 47). The hard prediction model is based on a competitive land allocation model
similar to a multi-objective decision process. The soft prediction yields a map of vulnerability
to change for the selected set of transitions. Hard and Soft prediction maps are generated for
each year. The output is a series of predicted land cover maps for the study area.
Figure 47: Soft-prediction (above) and hard-prediction (below) models
The resulting hard-prediction maps are raster files from which we can calculate areas of
non-forest increase (deforestation). These raster files can be processed in Idrisi or
exported to ArcGIS.
7 References
Eastman, J.R., 2012. "Land Change Modeler." Idrisi Selva Tutorial, Manual Version 17, 264-280.
Hansen, M.C., et al., 2013. "High-resolution global maps of 21st-century forest cover change." Science, 342(6160), 850-853.
PNGFA, 2013. "Forest and Land Use in Papua New Guinea 2013." Port Moresby.
Puyravaud, J.P., 2003. "Standardizing the calculation of the annual rate of deforestation." Forest Ecology and Management, 177, 593-596.
Sangermano, F., Eastman, J.R., Zhu, H., 2010. "Similarity weighted instance based learning for the generation of transition potentials in land change modeling." Transactions in GIS, 14(5), 569-580.
Takada, T., Miyamoto, A., Hasegawa, S.F., 2010. "Derivation of a yearly transition probability matrix for land-use dynamics and its applications." Landscape Ecology, 25, 561-572.
8 Annexes
8.1 Annex 1: Complete process flow for preparing Hansen Dataset to generate Forest/Non-forest maps for the years
2000 and 2014
8.2 Annex 2: Validation
A commonly used step in deforestation modeling is model validation. This allows the analyst
to get an idea of the accuracy of the model. In our case, the scope and time of the
assignment, as well as the availability of data, did not allow for a validation of the deforestation
models. However, it is suggested that PNGFA validate all the deforestation models it
generates as new data become available.
Model validation is needed to determine which of the deforestation risk maps is the most
accurate, in order to confirm the quality of the model output. Confirming a model output
requires both a "Calibration" and a "Validation" stage. For example, imagine we have data
for three points in time: 1996, 2004, and 2008. In this case, there are two historical
periods (1996-2004 and 2004-2008) that show similar deforestation trends. Data
from the most recent period (2004-2008) can be used as the "validation" data set and data
from the previous period (1996-2004) as the "calibration" data set.
With data from the calibration period (1996-2004), we prepare a Risk Map and a Prediction
Map of the deforestation for the validation period (2004-2008). Then, predicted deforestation
for 2008 will be overlaid with locations that were actually deforested in 2008 (land cover
map for 2008).
It is necessary to select the Prediction Map that best fits the real map and best reproduces
actual deforestation in the validation period.
In this step, the hard prediction 2008 will be used to validate the model, given that the actual
land cover map for 2008 is already known. The output map is a 3-way cross-tabulation
between the projected or “predicted” 2008 map and the actual 2008 map (the “reality” map).
Area calculations for both the predicted and the actual 2008 map should be similar because we
used the actual rate of change in the period 2004-2008. What we want to do in this case is to
validate whether the projected locations of change are similar to those of actual changes in
this period.
One of the assessment techniques that can be used is the "Figure of Merit" (FOM), which
confirms the model prediction in a statistical manner. The FOM is the ratio of the intersection
of the observed change and the predicted change to the union of the observed change and the
predicted change; it ranges from 0 (no overlap between observed and predicted change) to
1.0 (perfect overlap between observed and predicted change).
Results show three quantities for analysis:
- False Alarms (C), which are areas where we predicted a change from forest to non-
forest but no change occurred;
- Misses (A), which are areas where a change was not predicted but pixels actually
changed from forest to non-forest; and
- Hits (B), which are pixels that were predicted to change to non-forest and did change.
The Figure of Merit (FOM) is calculated from these values as presented below:
FOM = B / (A+B+C)
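Given binary maps of predicted and observed change, the FOM can be computed directly; the arrays below are toy examples:

```python
import numpy as np

predicted = np.array([1, 1, 0, 0, 1, 0], dtype=bool)  # model says change
observed  = np.array([1, 0, 0, 1, 1, 0], dtype=bool)  # reality

misses       = np.count_nonzero(~predicted & observed)  # A
hits         = np.count_nonzero(predicted & observed)   # B
false_alarms = np.count_nonzero(predicted & ~observed)  # C

fom = hits / (misses + hits + false_alarms)
print(fom)  # 2 / (1 + 2 + 1) = 0.5
```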