MDA NLCD User Guide 2.0.8.7

NLCD Mapping Tool User’s Guide

MDA Information Systems LLC
820 West Diamond Avenue, Suite 300
Gaithersburg, MD 20878-1419 USA

Under Contract G11PX02259
To the U.S. Geological Survey

December 2012

TABLE OF CONTENTS

1.0 INTRODUCTION
2.0 INSTALLATION
3.0 NLCD MAPPING TOOL ICON AND MENU
4.0 PERCENT CALCULATION TOOL
5.0 NLCD SAMPLING TOOL
    a) Independent Variable Files Input
    b) Dependent Variables File Input
    c) Ignore Values
    d) Sampling Number Designation
    e) Sampling Method Designation
        1) Random sampling
        2) Stratified random sampling
        3) Systematic sampling
    f) Output File Names Designation
    g) Cubist and See5 Options
6.0 CUBIST AND SEE5 CLASSIFIER TOOLS
    a) Input Name File
    b) Rules or Tree Option and Input File
    c) Input Model File
    d) Use Mask File
    e) Output File
    f) Create Error or Confidence Layer
7.0 ACCURACY ASSESSMENT TOOL
8.0 SMART ELIMINATE TOOL
    a) Input Name File
    b) Minimum Mapping Unit
    c) Weight File
        1) No Weights
        2) No Weights w/ 0
        3) Old Weight File
        4) New Weight File
    d) Single Step Option
    e) Output File
9.0 CREATING BATCHES FOR ERDAS
10.0 LOGGING AND ERROR MESSAGES
    a) Logging Messages
        1) cubistinput.c
        2) cartclass.c
        3) SmartEliminate.c
    b) Error Messages
        1) cubistinput.c
        2) cartclass.c
        3) SmartEliminate.c
    c) System Error Messages
        1) Side-by-side Error
        2) Erdas Imagine Error
        3) Command Line Too Long

1.0 INTRODUCTION

This document describes how to install and use the National Land Cover Dataset (NLCD) Mapping Tool, designed by MDA Information Systems LLC (formerly MDA Federal) for the United States Geological Survey (USGS). All rights to this software are held by the USGS.

The tools described in this document were initially developed for use within the ERDAS Imagine 8.7 software environment. They have been updated to work with Imagine versions 9.1 through 2011. The tools have been designed and tested to produce data files compatible with Rulequest Research’s Cubist versions 2.02 through 2.07 and See5/C5.0 versions 2.02 through 2.08, and to read and apply models from those versions of Rulequest’s software. The executables were compiled using the Imagine Toolkits and Microsoft Visual C++ compilers. The tools have been tested on Windows XP and Windows 7 operating systems.

2.0 INSTALLATION

To install the software, double-click the executable file named “NLCD_Mapping_Tools_v2.0.8.7.exe”. It is recommended that you close all other applications before starting Setup. In addition, depending on the security policy on your system, you may need administrator privileges to install software into the “Program Files” directory.

After double clicking the installer, the following window will appear:


Click the “Next” button. The window changes to the following:

The installer will find which versions of Imagine are installed on the system. Select the versions for which you would like the NLCD Sampling Tool installed. In addition, the last checkbox is for installing the version of the Microsoft Visual C Runtime Redistributable Library needed by the software.

3.0 NLCD MAPPING TOOL ICON AND MENU

After completing all of the above installation steps and restarting the appropriate version of ERDAS Imagine, a new icon should be added to the main ERDAS Imagine toolbar. This icon will launch the NLCD Mapping Tool GUI.

In the Ribbon interface, a new tab called “NLCD Tools” will be added to the left of the “Help” tab. From this GUI menu or tab, the user can launch any application included as part of the NLCD Mapping Tool. These applications include: Percent Calculation, NLCD Sampling Tool, Cubist Classifier for Cubist v2.06, Cubist Classifier for Cubist v2.07, See5 Classifier, Accuracy Assessment, and Smart Eliminate. Each of these applications is described in detail below.

4.0 PERCENT CALCULATION TOOL

The Percent Calculation Tool is designed to take a user-defined, high-resolution input source image and calculate the percent of a given target surface for a 30-meter neighborhood. This 30-meter neighborhood is based upon the area equivalent to a 30-meter Landsat Enhanced Thematic Mapper Plus pixel. The user can define whether the percent calculation will be performed on 1-meter or 4-meter source imagery. Output images will be unchanged in spatial resolution, but will have values based upon the pixel neighborhood percentage of the target feature.

Input source images must be classified into three classes: a value of zero designates non-target features, a value of one represents the user’s target class, and a value of two marks pixels to be ignored. Pixels with a value of two are excluded from both the numerator and the denominator of the percent calculation.
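The handling of ignored pixels can be illustrated with a short sketch. The function name `percent_target` is hypothetical, and returning 0 when every pixel in the neighborhood is ignored is an assumption; this guide does not state what the tool writes in that case.

```python
def percent_target(neighborhood):
    """Return the percent of target pixels (value 1) in one neighborhood.

    Pixels with value 2 are left out of both numerator and denominator.
    """
    target = sum(1 for v in neighborhood if v == 1)
    non_target = sum(1 for v in neighborhood if v == 0)
    valid = target + non_target
    if valid == 0:
        # Assumption: the guide does not say what the tool outputs here.
        return 0.0
    return 100.0 * target / valid
```

For example, a neighborhood of two target pixels, two non-target pixels, and two ignored pixels yields 50 percent, because only four pixels count toward the denominator.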


This tool also features a batch command function that allows the user to run multiple percent calculations on multiple user-defined input files. These files will be run in succession and are input at once, prior to the start of the calculation process.

5.0 NLCD SAMPLING TOOL

The NLCD Sampling Tool is designed to be an interface or translator between image files within the ERDAS Imagine environment and appropriately formatted *.names, *.data and *.test text files to be used by Cubist and See5/C5.0. This tool allows the user to define independent and dependent inputs, background values or values to be ignored, several sampling schemes and methodologies, as well as parameters and naming options for the output files.

This tool should be used before building a classification rule set within Cubist or See5/C5.0. It is meant to be used in conjunction with Cubist or See5/C5.0 and the NLCD Classifier tools (described below) to produce an output classified image. The graphical user interface (GUI) for this tool is described in greater detail in the following text. For each sub-item listed within this section, please see the corresponding figures showing where the items described occur within the GUI. The format of the output *.names file is compatible with See5 versions 1.17 through 2.08 and with Cubist versions 1.12 through 2.07. Rulequest increased the precision of certain values in the *.model file between versions 1.13 and 2.01, so results may differ slightly between those two versions.


a) Independent Variable Files Input

This set of input options allows the user to navigate to the appropriate path and file(s) to be designated as the input independent variable file(s). This can be done through:

1) An internal navigation window.
2) A drop down navigation window.
3) An add Independent Variable Files dialog.


Independent inputs can be of any data type. Dependent input files may be unsigned 1-, 2-, 4-, 8-, or 16-bit, or signed 8- or 16-bit. Input files must be of the same spatial resolution (pixel size) and map projection. Files may be of varying extents, but will only be sampled from within the geographic area of intersection for all independent input images. All independent files input by the user will be added to the independent file input list, located to the right of the drop down navigation window.

A second option for specifying the list of independent files is to use a *.txt file. The txt file must have one file name per line. The txt file may be necessary when using a large number of independent files because the GUI software communicates with the sampling executable by passing the files on the command line. The command line string is limited in length. The user can also create and save a txt file after selecting the files through the interface.
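A list file of this kind can also be produced with a few lines of script. The sketch below is illustrative only: the function names and example file names are hypothetical, and skipping blank lines on read is an assumption rather than documented tool behavior.

```python
from pathlib import Path

def write_file_list(image_paths, list_path):
    """Write one image file name per line to a plain-text list file."""
    text = "\n".join(str(p) for p in image_paths) + "\n"
    Path(list_path).write_text(text)

def read_file_list(list_path):
    """Read the list back, skipping blank lines (an assumption)."""
    lines = Path(list_path).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]
```

Calling `write_file_list(["band1.img", "band2.img"], "independent_files.txt")` would create a two-line text file in the one-name-per-line layout described above.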

[Figure: Independent Variable Files input, with callouts 1 to 3 for the options listed above and the Independent File Input List]


b) Dependent Variables File Input

This input option allows the user to navigate to the appropriate path and file to be designated as the dependent variable file. This file is the reference image for the sampling process. All sampling options will be based upon the values and distribution of information present within this image. Designation of this file is accomplished through the Dependent Variable File dialog.

The dependent file can be either an Imagine (*.img) file or a text (*.txt) file. The txt option is useful when only sparse training data are available: rather than constructing a raster that is mostly fill, a simple txt file can be used. The text file has one line for each sample point, with each line listing the x-coordinate, y-coordinate, and dependent value. The values may be separated by a space, tab, or comma. Comments may be included in this file by putting a pound sign (#) in the first column. The sampling tool will find the pixel in each independent layer that is directly over the input x and y coordinates and use that in the data and test files.
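A minimal sketch of reading such a sample-point file is shown below. It is not the tool's own parser: the function name is hypothetical, and treating all three fields as floating-point numbers and skipping blank lines are assumptions made for illustration.

```python
import re

def parse_sample_points(text):
    """Parse lines of 'x y value' separated by spaces, tabs, or commas."""
    points = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines (assumption) and comment lines
        x, y, value = re.split(r"[,\s]+", line.strip())[:3]
        points.append((float(x), float(y), float(value)))
    return points
```

Each returned tuple holds the x-coordinate, y-coordinate, and dependent value for one sample point.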


c) Ignore Values

This option allows the user to specify a value or multiple values that should be ignored within the dependent input file during the sampling process. Values can be listed as a comma delimited list (e.g., 0,100,255) or defined as a range of values with the use of a hyphen (e.g., 0-255). No samples will be collected from the pixels designated by the user-defined ignore value(s). These areas will be treated as background values and will not affect the distribution of either training or validation samples.
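The two notations can be expanded into a set of values as sketched here. The function name is hypothetical, and two points are assumptions: that values are non-negative integers, and that a list and a range may be mixed in one entry (the guide shows each form only on its own).

```python
def parse_ignore_values(spec):
    """Expand a spec such as '0,100,255' or '0-255' into a set of integers."""
    values = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A hyphen marks an inclusive range (non-negative values assumed).
            low, high = part.split("-", 1)
            values.update(range(int(low), int(high) + 1))
        else:
            values.add(int(part))
    return values
```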


d) Sampling Number Designation

This option allows the user to specify whether sampling should occur based upon a user-defined number of samples or upon the percentage of total available samples (pixels) within the input dependent variable image. The number of total available samples is the total number of image pixels minus the number of pixels that have an ignored value. A third option is to select the “All Points” button, which will sample all pixels except those with an ignored value. Note that this is equivalent to selecting the Percent button and entering 100%, and the call to the sampling executable uses this form.


After the user has defined whether sampling will be number or percentage based, an appropriate number or proportion of training and validation samples must be defined. These training and validation samples will be written to separate files for use within the Rulequest Cubist or See5/C5.0 software, in order to build and evaluate, respectively, the rule set(s) produced by Cubist or See5/C5.0.

e) Sampling Method Designation

This option allows the user to specify the type of sampling method that will be applied in collecting the appropriate number of training samples. Validation samples are always sampled in a purely random fashion from the set of pixels not selected for training, in order to ensure there is no bias in the subsequent evaluation of rule sets based upon the comparison of these values to those later predicted by the Rulequest software. The choice of method therefore has no influence over how validation samples are collected.


Three different sampling methods are provided. Each of these methods should be selected by the user based upon the appropriateness of the input data layers and desired output classification. All three methods sample on a single pixel basis. Each pixel is therefore treated as a separate possible sample, regardless of the spatial proximity of these samples. The three methods include:

1) Random sampling

This method selects a random subset of the input pixels without regard for the value of the dependent variable. The image is scanned once to count the eligible pixels (i.e., those not having an excluded value), and this count is compared with the number of sample pixels the user desires. Dividing the number of desired pixels by the number of eligible pixels gives the probability that the first pixel is selected. The tool then reads through the image again and selects the first pixel with that probability, using a random number generator. The number of remaining pixels is decremented by one, and the number of desired pixels is either decremented by one or left the same, depending on whether the first pixel was selected. The probability of selecting the second pixel is then calculated, and the second pixel is selected or not with this probability. This is repeated until the whole image has been scanned. All eligible pixels are treated equally. The spatial distribution of the sample should mimic the distribution of the eligible pixels; there is no attempt to force a spatially diverse sample. The random number generator used for sampling is derived by calling the standard “C” library function rand() 4 times, so it produces 60-bit random numbers. This is needed for the very small probabilities that result from small samples taken from large input images.
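The sequential procedure described above can be sketched as follows. This is an illustration, not the tool's source code: the function names are hypothetical, Python's `random.Random` stands in for the C rand() function, and the assumption that each rand() call contributes 15 bits (RAND_MAX = 32767) is inferred from the statement that four calls yield 60 bits.

```python
import random

def rand60(rng):
    """Combine four 15-bit draws into one 60-bit integer.

    Assumption: each draw mimics a C rand() that returns 15 bits.
    """
    value = 0
    for _ in range(4):
        value = (value << 15) | rng.getrandbits(15)
    return value

def sequential_sample(eligible, wanted, rng):
    """Select exactly `wanted` items from `eligible` in a single pass."""
    remaining = len(eligible)
    selected = []
    for item in eligible:
        # Select with probability wanted / remaining, compared in integers.
        if rand60(rng) * remaining < wanted * (1 << 60):
            selected.append(item)
            wanted -= 1
        remaining -= 1
    return selected
```

Because the probability reaches 1 when the number still wanted equals the number of pixels remaining, and 0 once the quota is met, a pass over the eligible pixels returns exactly the requested number of samples, provided the request does not exceed the number of eligible pixels.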

2) Stratified random sampling

This sampling option is similar to that of the random sampling method described above, with the exception that each set of pixels with the same value of the dependent variable is treated as a separate population. The first step of the algorithm is to determine how many eligible pixels there are in each class. This is done by scanning the image once to collect the histogram.

The second step is to compute how many pixels of each class will be selected. The total number of samples desired is proportionally allocated to each class. The proportions may include non-integer numbers, so those values are rounded down to the nearest integer. A small number of samples will possibly remain unallocated. Those are allocated to the classes that had the largest fractional part in the proportional allocation.

A numerical example may make this discussion more concrete. Suppose there are 5 classes with 2000, 5000, 3500, 8000, and 10000 eligible pixels, and the user requests 100 samples. To compute the proportional allocation, first note that there are 2000 + 5000 + 3500 + 8000 + 10000 = 28500 eligible pixels in total, so the allocation ratio is 100 / 28500 = 0.003508 (truncated). The non-integer allocations are 7.017, 17.543, 12.280, 28.070, and 35.087 samples. These are rounded down to integers: 7, 17, 12, 28, 35. These values sum only to 99, not 100 (because each number was rounded down), so 1 additional sample needs to be allocated. The second class had the largest fractional part (0.543), so the last sample goes there. The final allocation is 7, 18, 12, 28, and 35.
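The allocation in this example can be reproduced with a short sketch. The function name is hypothetical, integer arithmetic is used here to sidestep floating-point rounding, and the handling of ties between equal fractional parts (earlier class first) is an assumption, since the guide does not describe it.

```python
def allocate_samples(class_counts, total_samples):
    """Proportionally allocate samples to classes, largest remainders first."""
    total_pixels = sum(class_counts)
    products = [count * total_samples for count in class_counts]
    # Rounded-down share and leftover fractional part for every class.
    allocation = [p // total_pixels for p in products]
    remainders = [p % total_pixels for p in products]
    leftover = total_samples - sum(allocation)
    # Hand the unallocated samples to the largest fractional parts.
    order = sorted(range(len(class_counts)),
                   key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        allocation[i] += 1
    return allocation

print(allocate_samples([2000, 5000, 3500, 8000, 10000], 100))
# prints [7, 18, 12, 28, 35]
```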

The additional Minimum Samples option allows the user to specify a minimum number of samples that must be collected from each class or stratum. In rare cases where a class has fewer available pixels than the specified minimum, all of its available pixels will be used as samples in order to acquire the maximum number of training samples for that class. The default for this option is zero, which places no constraints on the minimum number of samples collected within each stratum.

If the user has selected a minimum number of samples per class, then one additional step is necessary. For each class, the number of samples is increased to the smaller of the number of eligible pixels in that class and the minimum desired. If this option is used the total number of samples may be greater than the total number of samples requested.
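A minimal sketch of this adjustment follows, assuming the per-class allocation has already been computed. The function and argument names are hypothetical.

```python
def apply_minimum(allocation, class_counts, minimum):
    """Raise each class to min(eligible pixels, minimum) when it falls short."""
    return [
        max(allocated, min(eligible, minimum))
        for allocated, eligible in zip(allocation, class_counts)
    ]
```

With an allocation of 7, 18, 12, 28, and 35 samples and a minimum of 10, only the first class changes (7 becomes 10), so the total rises from 100 to 103 samples.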

Once the allocation of samples to classes is complete, the sampling begins. The algorithm is identical to the sequential sampling described above, but each class is treated independently. For each class two values are maintained: the number of eligible pixels remaining to be seen and the number of additional samples required. The probability of selecting the first pixel of that class is computed by dividing the number of samples required by the number of eligible pixels remaining. After each pixel is examined, the two numbers and the probability are updated. The same 60-bit random number generator described above is used.

3) Systematic sampling

This sampling option spreads the appropriate (user-defined) number of training samples evenly across the extent of the image area based upon areas of available (non-ignored) pixels. The total number of these available pixels divided by the number of training samples required defines the interval of sampling. The sampling start point is assigned through random number generation. The image pixels are examined in blocks (size depending on the *.img settings) sequentially from upper left to lower right.
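The interval calculation can be sketched as below. This is a simplified illustration: the function name is hypothetical, positions are counted along the sequence of available pixels in scan order, and the way the tool rounds the interval and draws the start point is an assumption.

```python
import random

def systematic_positions(available, wanted, rng):
    """Return evenly spaced positions among `available` eligible pixels."""
    interval = available / wanted
    # Assumption: the random start falls within the first interval.
    start = rng.random() * interval
    return [int(start + k * interval) for k in range(wanted)]
```

For 1000 available pixels and 10 requested samples, the interval is 100, so the selected positions are roughly 100 pixels apart, beginning at a random offset between 0 and 99.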

f) Output File Names Designation

This option allows the user to specify the output path and filenames for the output text files created within the NLCD Sampling Tool. These files will be used by the Rulequest and NLCD Classifier software in order to build, evaluate, and spatially extrapolate the rule sets. These files include:

i) *.names file: This file defines the appropriate paths and filenames for all input independent and dependent files sampled within the NLCD Sampling Tool. Data type and order of sampled bands can also be examined. The top of this file records the user inputs that affected the sampling process, such as the number of samples (or percentage) requested for training and validation, the output format (Cubist or See5), the sampling method, and the actual number of training and validation samples.


ii) *.data file: This file contains a list of all independent and dependent sample values which were sampled to be used as part of the rule set training process. These samples were collected using the user-defined method (described above).

iii) *.test file: This file contains a list of all independent and dependent sample values which were sampled, separately from the training samples, to be used as part of the rule set validation process. These samples were collected using the random method (described above).

iv) *.names.hst is also created through the NLCD Sampling Tool. This file details the distribution of samples available within the dependent input, and those output into the *.data and *.test files.

The user need only supply an appropriate name to the output *.names file, and this root name will be applied to the corresponding *.data and *.test files. These files must all have matching roots in order for the Rulequest and NLCD Classifier software to perform correctly. Files with root names that do not match exactly will cause errors for these two programs.
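The shared-root convention can be expressed in a few lines. The function name and the example path are hypothetical.

```python
from pathlib import Path

def companion_files(names_path):
    """Return the .data and .test paths that share the root of a .names file."""
    names = Path(names_path)
    return names.with_suffix(".data"), names.with_suffix(".test")
```

Given `run1/impervious.names`, this returns `run1/impervious.data` and `run1/impervious.test`, which is the naming the Rulequest and NLCD Classifier software expect.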


g) Cubist and See5 Options

The Cubist and See5 check box options allow the user to specify whether the output *.names, *.data and *.test files will be for use by Cubist or See5. This is an important option, which is determined by the features being studied and the objectives of the study. The major difference that occurs between these two options is the definition of the dependent input file type. Selection of the Cubist option will cause the dependent input file to be defined as a continuous file, while the See5 option will define this dependent input as categorical, which lists the available categories, or classes, within the output *.names file.


6.0 CUBIST AND SEE5 CLASSIFIER TOOLS

The CUBIST and SEE5 Classifier Tools are designed to apply the rule set obtained through the Classification and Regression Tree (CART) process, using either Cubist or See5/C5.0, to the input images defined by the user within the initial NLCD Sampling step. This is the spatial extrapolation of the CART classification. Both tools are described together in this section since their layouts and functions are similar. There are two versions for running the Cubist classifier – one for models created in Cubist versions 2.06 and earlier, and one for models created in Cubist 2.07. The user must select the correct option depending on the software used to create the models.

a) Input Name File

This option allows the user to navigate to and define the appropriate path and file to be designated as the input *.names file. This file serves as a reference file for the classification process. The appropriate rule set files, created within Cubist or See5, are associated with this file through a common root name. It is through this association that the corresponding classification can be defined and applied. The rules defined in these files will be applied to the pixel values within the corresponding independent input imagery, which is also defined within the *.names file by path, filename, and band number.


b) Rules or Tree Option and Input File

For classifications that are associated with the See5 classifier option, the Rules or Tree check box must be defined. These check box options allow the user to specify whether the input rule set files corresponding to the designated *.names were created with either the rule set (non-mutually exclusive) or tree (mutually exclusive) options within See5. The appropriate *.rules or *.tree files will then be automatically defined and used for classification. An error will occur if files created with one program are defined as being associated with the other.

c) Input Model File

For Cubist classifications, select the input model file using the file selector.

d) Use Mask File

This option allows the user to navigate to and define an optional file that can be used to limit the area to be classified. This mask file will designate areas to be classified as a value of one and areas that are to be ignored as a value of zero. These ignored, zero values will not be classified in the NLCD Classifier output image, and will possess output values of zero. Limiting the classification area in this way greatly increases the speed of classification.

e) Output File

This option allows the user to define the path and filename for the output classified image. This image is a result of the rule sets derived from Cubist or See5, which were based upon a set of user defined independent input images. Where independent input files may have been of differing geographic extents, only the area of intersection for these files will be classified.

f) Create Error or Confidence Layer

This option allows the user to specify an output image file, in addition to the actual output classification. This additional file spatially represents the estimated error (Cubist) or predicted confidence (See5/C5.0) that is associated with that output pixel, based upon the rule(s) that were used to classify it. This is useful in that the user can see the spatial representation of distribution and magnitude of error or confidence for a given classification. This image has the same root file name as the user defined output classification file, with an added *_error.img for the Cubist error layer and *_conf.img for the See5/C5.0 confidence layer.

For Cubist, this error layer is 10 times the error estimated in the prediction of a continuous variable. This number represents the estimated amount of difference between the value that is predicted, through the CART classification, and the value that could actually exist based upon the estimated error in the rule used to predict this value. This provides the user with an estimated range, or confidence, in these values. The scaling factor of 10 was chosen to get more precision in the value. The file format is unsigned 16-bit.
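The effect of the scaling factor can be shown with a small sketch. The function names are hypothetical, and both the rounding to the nearest integer and the clamping to the unsigned 16-bit range are assumptions; the guide states only the factor of 10 and the file format.

```python
def encode_error(estimated_error):
    """Scale an estimated error by 10 and clamp it to the unsigned 16-bit range."""
    scaled = int(round(estimated_error * 10))
    return max(0, min(65535, scaled))

def decode_error(stored_value):
    """Recover the error estimate, to one decimal place, from the stored value."""
    return stored_value / 10.0
```

Under these assumptions, an estimated error of 3.47 is stored as 35, and dividing the stored value by 10 recovers 3.5, one decimal place more than an unscaled integer layer could hold.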

For See5/C5.0, this error layer represents a percent confidence associated with each rule and output categorical, classified value. It is expressed as a percentage of confidence. A value of zero would therefore have a low confidence (always wrong), while a value of 100 would have a very high confidence (always right). The file format is 8-bit unsigned.

7.0 ACCURACY ASSESSMENT TOOL

The Accuracy Assessment/Error Validation Tool is designed to aid the user in evaluating the accuracy of the output classified image. This is done by subtracting the classified layer from a user-defined quality control image. The result of this operation will provide the user with an output image that identifies areas of both positive and negative misclassification. Output values will range from -100 to 100 and will be defined within the value column of the output attribute table.

This evaluation tool is for use with continuous data which ranges from 0 to 100 only. Accuracy assessment of categorical classifications will provide meaningless numbers, which denote magnitude of difference between the predicted and actual classes. Such a magnitude of difference is not appropriate for categorical accuracy assessment.
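The subtraction can be sketched on small nested lists. The function name is hypothetical, and plain Python lists stand in for image layers.

```python
def difference_image(quality_control, classified):
    """Subtract the classified layer from the quality control layer, pixel by pixel."""
    return [
        [qc - cl for qc, cl in zip(qc_row, cl_row)]
        for qc_row, cl_row in zip(quality_control, classified)
    ]
```

With both inputs in the range 0 to 100, every output value falls between -100 and 100; positive values mark pixels where the classification is lower than the quality control image, and negative values mark pixels where it is higher.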


8.0 SMART ELIMINATE TOOL

The purpose of the Smart Eliminate tool is to eliminate from a thematic file small clusters that are below a user-specified minimum mapping unit (MMU). Any contiguous (defined by 8-way connectivity) clump of pixels smaller than the MMU will be replaced by one of its neighboring classes. The priority of elimination and the MMU are controlled by the user; using different priority orders on the same file will produce different outputs. The following text describes how to use the tool and how to configure the ancillary inputs.

a) Input Name File

The first step is to select the input image (Imagine format) using the file selector at the top of the window.

b) Minimum Mapping Unit

The next step is to select the Minimum Mapping Unit. There are two choices for this: Single or Multiple. Selecting “Single” will apply the same MMU to all classes. The MMU is set by either entering the number of pixels or clicking on the up/down arrows.


Selecting the “Multiple” option allows the algorithm to use different MMUs for different classes. To input this information, the user can specify an existing file (in the right format), or edit and save one through the “Interactive MMU Input” feature. The format for an MMU file is simple: the file is a text file, and each line lists a class and its MMU, separated by a space. Any class not listed in the MMU file will default to the value specified in the main frame. A file can be edited in the text editing field of the Interactive MMU Input window. Clicking on the File selector will allow the user to save the MMU file. The extension of the file should be .txt.
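Reading an MMU file and applying the default can be sketched as follows. The function name and the class codes in the usage note are hypothetical, and silently skipping lines with fewer than two fields is an assumption.

```python
def parse_mmu_file(text, default_mmu):
    """Return a lookup function built from 'class MMU' lines."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            table[int(fields[0])] = int(fields[1])
    # Classes missing from the file fall back to the main-window value.
    return lambda cls: table.get(cls, default_mmu)
```

For a file containing the lines `11 5` and `21 12` and a main-window MMU of 8, the lookup returns 5 for class 11, 12 for class 21, and 8 for any other class.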


c) Weight File

In the middle panel, there are four choices for a “weights” file. The weights file specifies the priority of elimination. When a clump smaller than the MMU is found, the pixels in that clump are replaced by one of the neighboring classes (defined by 8-way connectivity).



The algorithm for deciding which class will be the replacement class is as follows: the class with the highest weight is chosen; if there are multiple classes with the same weight, the one with more pixels is chosen; if there is still a tie, the class that comes first in the weight file order is selected as the replacement.
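The three-level tie-breaking rule can be written as a single sort key. This is a sketch under stated assumptions: the function and argument names are hypothetical, a class missing from `weights` is given weight 0, and every neighboring class is assumed to appear in `file_order`.

```python
def choose_replacement(neighbor_counts, weights, file_order):
    """Pick a replacement class for a small clump.

    neighbor_counts: class -> number of neighboring pixels of that class
    weights: class -> weight as a replacement (higher wins)
    file_order: classes in the order they appear in the weight file
    """
    return min(
        neighbor_counts,
        key=lambda c: (-weights.get(c, 0), -neighbor_counts[c], file_order.index(c)),
    )
```

Negating the weight and the pixel count lets `min` prefer the highest weight first, then the larger neighbor count, and finally the class listed earliest in the weight file.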

The four choices are “No Weights”, “No Weights w/ 0”, “Old Weight File”, or “New Weight File”. The meaning of each of these choices is described below.

1) No Weights

With this option, all potential replacement classes have equal weight. Because of this, the class with the most neighbors will be used as the replacement class. If there is a tie, the class with the lower code will be chosen.

2) No Weights w/ 0

This option is the same as option 1) “No Weights” except that class “0” is also eliminated. In option 1) the “0” class is considered “fill” and not eligible to be eliminated.


3) Old Weight File

This option can only be used on images that have been recoded to classes 1 to N. Zero is reserved as a “no data” or “fill” value. Using this option requires creating a weight file outside of the Smart Eliminate Tool using a text editor or, preferably, a spreadsheet application (save the file in text format). The weight file contains first the value of “N” and then an N x N matrix of non-negative integer weights. Line M lists the weights assigned to classes 1 through N as replacement classes for small clumps of class M. The first column is the weight for replacement class 1, the second column the weight for replacement class 2, and so on. Since the image data run from 1 to N, there is no need to list the class numbers in the file. There are two downsides to this format: the image must be recoded to 1 to N, and there are usually many equal weights (typically 0) for replacement classes that have the lowest priority. To alleviate these problems, a new format was designed.
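The layout of the old format can be illustrated with a small reader. The function name is hypothetical, and whitespace-separated values are an assumption, since the guide does not state which delimiter the old format uses.

```python
def parse_old_weight_file(text):
    """Read N followed by an N x N matrix of non-negative integer weights."""
    numbers = [int(token) for token in text.split()]
    n = numbers[0]
    cells = numbers[1:]
    if len(cells) != n * n:
        raise ValueError("expected %d weights, found %d" % (n * n, len(cells)))
    # Row m (0-based) holds the weights for replacing clumps of class m + 1.
    return [cells[row * n:(row + 1) * n] for row in range(n)]
```

In the returned matrix, `matrix[1][0]` is the weight of class 1 as a replacement for small clumps of class 2.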

4) New Weight File

The new weight format is more intuitive and can be used on any thematic images. In addition, an interactive weight editor is supplied within the Smart Eliminate Tool for this weight format. After selecting “New Weight File”, an input file is required. If one has already been created, it can be entered through the file selector to the right of the 3 radio buttons.

The new weight file has its own format. There is one line for each class in the image. The class is listed first on its line. Following the class value, separated by spaces, tabs, or commas, are the replacement classes in decreasing priority. A comment line may be included by putting a pound sign (#) in the first column of a line. Any classes not listed as replacement classes are assumed to have equal and lowest priority. The order of lines in the file determines the default priority among the lowest-priority classes. All classes in the input image must have their own line and list of replacement classes.

The interactive editor works like the MMU editor described above. The large box on the right is a text editor. After editing is complete, you can save the file by using the file selector at the bottom of the window. The replacement algorithm chooses the highest priority class among the neighboring pixels. When saving a weight file, be sure to enter the full file name, including the extension .wt or .txt. This is necessary because Erdas Imagine does not recognize the *.wt file type; if you do not enter the extension, the file name will not contain a period and will be unusable later.

With a new-style weight file, the Smart Eliminate code recodes the input data on the fly as it is read and processed, and decodes it again when the output data is written. To do this, the weight file is scanned once to collect the classes (the first element on each line) and to assign each one a code. When the file is scanned a second time, a full weight matrix is populated. It is therefore an error to have replacement classes that are not primary classes. Similarly, it is an error to have classes in the image that are not listed in the weight file. However, it is acceptable to have classes listed in the weight file that are not represented in the image.
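
The two-pass scan and the two error conditions can be sketched in Python as follows. This is a minimal illustration, not the tool's C code. The function names are hypothetical, and the conversion from priority order to numeric weights (a higher number for an earlier position, 0 for unlisted classes) is an assumption made for the example; the guide does not state the values the tool uses internally. The error texts are borrowed from the SmartEliminate.c messages listed in the error message section.

```python
import re


def parse_new_weight_file(text):
    """Return (codes, matrix) for a new-style weight file."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        rows.append([int(tok) for tok in re.split(r"[ \t,]+", line.strip()) if tok])

    # Pass 1: the first value on each line is a primary class; give it a code.
    codes = {row[0]: index for index, row in enumerate(rows)}

    # Pass 2: populate the full matrix. Unlisted classes keep weight 0.
    n = len(rows)
    matrix = [[0] * n for _ in range(n)]
    for row in rows:
        replacements = row[1:]
        for rank, cls in enumerate(replacements):
            if cls not in codes:
                raise ValueError(f"Found new class in weight file: {cls}")
            # Assumed scheme: earlier position in the list means higher weight.
            matrix[codes[row[0]]][codes[cls]] = len(replacements) - rank
    return codes, matrix


def check_image_classes(image_classes, codes):
    """Raise if the image contains a class with no line in the weight file."""
    missing = sorted(set(image_classes) - set(codes))
    if missing:
        raise ValueError(f"Found class in input file not in weight file: {missing}")


sample = "# class, then replacements in decreasing priority\n11 21,22\n21\t11\n22 21\n"
codes, matrix = parse_new_weight_file(sample)
print(codes)      # {11: 0, 21: 1, 22: 2}
print(matrix[0])  # [0, 2, 1]
check_image_classes([11, 22], codes)  # passes: weight-file class 21 may be absent from the image
```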

d) Single Step Option

The last option to discuss in the Smart Eliminate Tool is the check box marked “Single Step.” This option relates to the algorithm used for eliminating the small clumps. In a single pass of the algorithm, not all clumps up to the MMU are eliminated. To guarantee that clumps up to the MMU are eliminated, the algorithm must be run with MMUs increasing by factors of 2 (e.g., 2, 4, 8, 16, etc.). By default this happens automatically: intermediate files are created and deleted as the algorithm proceeds through the required steps, and the user does NOT need to step through the MMUs manually. However, some users may want to run only a single instance of the elimination algorithm, or manage the progression of MMUs themselves by running successive single eliminations. To force a single run, check the “Single Step” box. Most users will not need this option. If outputs appear to have clumps smaller than the requested MMU, make sure this box is not checked.
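
For users who manage the progression themselves with “Single Step” checked, the sequence of MMUs can be listed with a short Python sketch. The function name is hypothetical, and the handling of a requested MMU that is not a power of 2 (finishing with the requested value itself) is an assumption; the guide only gives the 2, 4, 8, 16 pattern, and the tool works with per-class MMU arrays rather than a single value.

```python
def mmu_steps(target_mmu):
    """List the MMU to use for each successive single-step run."""
    if target_mmu < 1:
        raise ValueError("MMU must be a positive integer")
    steps = []
    step = 2
    while step < target_mmu:
        steps.append(step)
        step *= 2
    steps.append(target_mmu)  # assumed: the final run uses the requested MMU
    return steps


print(mmu_steps(16))  # [2, 4, 8, 16]
print(mmu_steps(12))  # [2, 4, 8, 12]
```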

e) Output File

The name of the output file is entered through the bottom-most file selector.

9.0 CREATING BATCHES FOR ERDAS

For advanced users, all of the NLCD Tools can be run from an ERDAS batch command file (.bcf). The Sampling Tool’s batch capabilities are limited because the number of independent variable files can vary, which is very hard to handle in ERDAS .bcf files. The following is a quick overview of how to create a batch command file.

First, you will need to copy the command call from the Session Log. This command names the process being called (either the modeler or an executable) and all of the variables needed to run the process. In this graphic you can see the command for the See5 Classifier that was run.

Here I copied out the command and broke it apart so that each individual variable is on a separate line. Text shown in bold must remain unchanged for the program to run correctly; text shown in italics marks the variables that will change.

c:/program files/imagine 8.7/bin/ntx86/nlcd_see5class_208.exe
d:/test/cart_sample/scrub_cb8.names
d:/test/cart_sample/scrub_cb8.img
-rules d:/test/cart_sample/scrub_cb8.rules
-tree d:/test/cart_sample/scrub_cb8.tree
-format Tree
-maskfile d:/test/cart_sample/scrub_bin.img
-error 0
-meter

nlcd_see5class_208 '$(Input_names)' '$(Output_img)' -rules '$(Input_rules)' -tree '$(Input_tree)' -format Tree -maskfile '$(Input_mask)' -error 0 -meter

Above you can see the edited command that would be used in a .bcf file. Notice that all of the file names were replaced with a ‘$(…)’ expression. This denotes that these files are now variables, which must be defined in the .bcf file using the Variable function, as shown below.

Variable Input_names User;
Variable Output_img Auto "$(Input_names.path)$(Input.root).img";

The final .bcf file for See5 Classifier would look like this.

/* Variable definitions */
Variable Input_names User;
Variable Input_mask User;

Variable Input_rules Auto "$(Input_names.path)$(Input.root).rules";
Variable Input_tree Auto "$(Input_names.path)$(Input.root).tree";

Variable Output_img Auto "$(Input_names.path)$(Input.root).img";

/* Function */
nlcd_see5class_208 '$(Input_names)' '$(Output_img)' -rules '$(Input_rules)' -tree '$(Input_tree)' -format Tree -maskfile '$(Input_mask)' -error 0 -meter
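
The way the Auto variables derive the rules, tree, and output file names from the names file can be mirrored outside ERDAS. The Python sketch below is only an illustration of that naming pattern: the function name is hypothetical, it uses only the arguments that appear in the command above, and it builds the argument list without running anything.

```python
from pathlib import PurePosixPath  # the guide writes paths with forward slashes


def see5_arguments(names_file, mask_file):
    """Build the See5 Classifier argument list from a .names file path."""
    # Same directory and root name as the names file, without the extension.
    stem = PurePosixPath(names_file).with_suffix("")
    return [
        "nlcd_see5class_208",
        names_file,
        f"{stem}.img",
        "-rules", f"{stem}.rules",
        "-tree", f"{stem}.tree",
        "-format", "Tree",
        "-maskfile", mask_file,
        "-error", "0",
        "-meter",
    ]


args = see5_arguments("d:/test/cart_sample/scrub_cb8.names",
                      "d:/test/cart_sample/scrub_bin.img")
print(" ".join(args))
```

With the sample paths from the Session Log command, the derived output, rules, and tree names match the ones shown there.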

10.0 LOGGING AND ERROR MESSAGES

The executables developed as part of the NLCD Mapping Tool are written in “C” using the Erdas Imagine Developers Toolkit and compiled with Microsoft Visual Studio. The executables are all “Jobs” running under the Session Manager. All outputs are written to the Session Log. As each executable runs through its processing steps, logging messages are written to the Session Log. These messages will help the user debug any errors that arise during processing.

The user must set the “Log Message level” to “verbose” for the messages to appear. This is set through the Preferences Editor’s “User Interface & Session” category. If the user does not want the messages to appear he/she can set them to “terse” and only use verbose mode when debugging a problem.

The Job has a "main" function as its entry point. This routine initializes the Toolkit, starts the job, prints program information, parses the input arguments, and calls the "jobMain" function, which handles the processing. The "main" function has two logging outputs: one that prints "Print program information." followed by the program information, and a second one that logs the number of command line input arguments and their values.

As is customary for programs written with the Erdas Imagine Toolkit, each input argument and option switch is processed by its own "set" function, which copies the input value to a global variable in the code and does some rudimentary checking on it.

These logging messages give the user visibility into the processing and how it is progressing. If inputs or variables are not what the user is expecting they should be corrected and the process rerun.

a) Logging Messages

1) cubistinput.c

For the cubistinput.c program there are 14 such functions ("SetDepFilename", "SetDepFileType", etc.), and as each one is called, it logs the values being set. As the independent variable files are parsed, each name is also logged to the Session Log. This happens whether the files are passed in on the command line or through a text "list" file.

When control passes to the "CubistInput_Main" function, 13 logging messages are printed, namely,

"Main -- 1 -- Get names."
"Main -- 2 -- Create meter."
"Main -- 3 -- Allocate arrays."
"Main -- 4 -- Open indFile: %d", once for each independent file
"Main -- 5 -- Read map info."
"Main -- 6 -- Create windows."
"Main -- 7 -- Reopen with windows."
"Main -- 8 -- Allocate counters."
"Main -- 9 -- Allocate Pixel Rects."
"Main -- 10 -- Counting sampling pixels."
"Main -- 11 -- Initializing training data."
"Main -- 12 -- Reading training data."
"Main -- 13 -- Write names file."

In addition to these messages that report progress through the steps needed for sampling, there are feedback messages written to the Session Log with values calculated and used in the particular run -- things such as the pixel size, output map coordinates, processing start time, and number of samples counted.

2) cartclass.c

The classification step is handled by the source code in cartclass.c. There are 7 “Set” functions in that code. The logging messages for cartclass.c are

"Main -- 1 -- Copy file names."
"Main -- 2 -- Create meter."
"Main -- 3 -- Checking file names."
"Main -- 4 -- Reading layer names."
"Main -- 5 -- RuleQuest reading layer names."
"Main -- 6 -- Initialize Model."
"Main -- 7 -- Allocate Arrays."
"Main -- 8 -- Check Projections."
"Main -- 9 -- Check Mask Projection."
"Main -- 10 -- Compute output coordinates."
"Main -- 11 -- Compute input windows."
"Main -- 12 -- Compute mask window."
"Main -- 13 -- Create output layer."
"Main -- 14 -- Create output error layer."
"Main -- 15 -- Create pixel blocks."
"Main -- 16 -- Start processing."
"Main -- 17 -- Write Map Info."
"Main -- 18 -- Clean up."

3) SmartEliminate.c

The SmartEliminate.c code has 7 “Set” functions. When control passes to the "jobMain" function, three logging messages are printed, namely,

"Main -- 1 -- Initializing mmu."
at the beginning of the routine;

"Main -- 2a -- Reading mmu file." or
"Main -- 2b -- No mmu file specified."
depending on whether or not an MMU file has been specified;

and then
"Main -- 3a -- Multiple Steps of Smart Eliminate." or
"Main -- 3b -- Single step of Smart Eliminate."
depending on whether the reduction is in multiple steps or just one.

Each reduction step calls "SmartEliminate" with the input and output file names and the final and initial MMU arrays. The SmartEliminate function prints 10 log messages as it progresses through the code. The messages are:

"Smart Eliminate -- 1 -- Create meter."
"Smart Eliminate -- 2 -- Check input/output names."
"Smart Eliminate -- 3 -- Check input file."
"Smart Eliminate -- 4 -- Get projection and color table."
"Smart Eliminate -- 5 -- Set projection and color table in output file."
"Smart Eliminate -- 6 -- Set recode arrays."
"Smart Eliminate -- 7 -- Allocate arrays."
"Smart Eliminate -- 8 -- Read and set weights."
"Smart Eliminate -- 9 -- Starting processing, mmu = %d", where %d is the MMU for the current step
"Smart Eliminate -- 10 -- Done processing. Closing layers."

Finally, if the user presses the "Cancel" button, this log message is printed to the Session Log:

"Smart Eliminate -- 11 -- USER CANCELED PROCESSING"
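
Because every progress message follows the same "Main -- N --" or "Smart Eliminate -- N --" pattern, the last step reached before a failure can be picked out of Session Log text that has been copied into a file or string. The Python sketch below is a hypothetical helper, not part of the NLCD Mapping Tool; it assumes only the message pattern shown above and makes no assumption about any other text the Session Log places on each line.

```python
import re

# Matches e.g.  Main -- 4 -- Open indFile: 2   or   Smart Eliminate -- 9 -- ...
STEP = re.compile(r"(Main|Smart Eliminate) -- (\d+[ab]?) -- (.*)")


def last_step(log_text):
    """Return (routine, step, description) of the last progress message, or None."""
    last = None
    for line in log_text.splitlines():
        match = STEP.search(line)
        if match:
            last = (match.group(1), match.group(2), match.group(3).strip().strip('"'))
    return last


log = (
    "Main -- 1 -- Get names.\n"
    "Main -- 2 -- Create meter.\n"
    "Main -- 3 -- Allocate arrays.\n"
)
print(last_step(log))  # ('Main', '3', 'Allocate arrays.')
```

If a run stops after step 3, for example, the problem most likely lies in the step that would have been logged next.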

b) Error Messages

Every call into the Developers Toolkit library is monitored for error conditions. The critical errors will stop the program. The errors in cubistinput.c and cartclass.c are each given a unique number so the user will be able to zero in on the condition causing the error. SmartEliminate.c prints messages along with each error. Under normal operating conditions none of these error messages should appear. The error numbers are:

1) cubistinput.c

1, "Could not initialize toolkit"
2, "Error allocating memory for Indlayernames"
3, "Error allocating memory for nlayers"
4, "Independent file must have more than one layer"
5, "Can not open txt file"
6, "Error allocating memory for Indlayernames"
7, "Error allocating memory for nlayers"
8, "Independent file must have more than one layer"
9, "Error in format option"
10, "Error in format option"
11, "Error reading sampling method option"
12, "Error in training samples value"
13, "Error in validation samples value"
14, "Error in number of ignore values"
15, "Error allocating memory for outfilenamehst array"
16, "Dependent file must have one layer"
17, "Error allocating memory for Indlayerstack"
18, "Error allocating memory for windowind"
19, "Error allocating memory for mapinfoind"
20, "Error allocating memory for xOffsetind"
21, "Error allocating memory for yOffsetind"
22, "Different Projections"
23, "Different Projections"
24, "Error in pixel size (all image files must be same)"
25, "Error in dependent pixel type (must be unsigned 1,2,4,8, or 16 bit)"
26, "Can't open input training data file"
27, "Error in training data file"
28, "Error allocating memory for counting array"
29, "Error allocating memory for counting array"
30, "Error allocating memory for counting array"
31, "Error allocating memory for counting array"
32, "Error allocating memory for counting array"
33, "Error from eimg_PixelRectStackCreate"
34, "Error allocating memory for indpixelblock array"
35, "Error from eimg_PixelRectStackCreate"
36, "Error allocating memory for indpixelblock array"
37, "Error from eimg_PixelRectStackCreate"

38, "Can't open input training data file"
39, "Error in training data file"
40, "Error allocating memory for count array"
41, "Error opening output data file"
42, "Error opening output test file"
43, "Error allocating memory for xBASE array"
44, "Error allocating memory for fBASE array"
45, "Error allocating memory for sampsizeBASE array"
46, "Error: can't distribute excess pixels"
47, "Error: didn't distribute all the trainingsamples!"
48, "Can't open input training data file"
49, "Error in training data file"
50, "Error allocating memory for indpixelblock array"
51, "Error from eimg_PixelRectStackCreate"
52, "Error opening output names file"
53, "Error opening output histogram file"

2) cartclass.c

1, "Could not initialize toolkit"
2, "Error in create error layer option"
3, "No model file specified"
4, "Error in classification type (tree or rules)"
5, "Rules file not specified or found"
6, "Tree file not specified or found"
7, "Ill-defined classification type"
8, "Error opening names file"
9, "Error allocating memory for layersin"
10, "Error allocating memory for windowin"
11, "Error allocating memory for mapinfoin"
12, "Error allocating memory for xOffsetin"
13, "Error allocating memory for yOffsetin"
14, "Error opening Imagine file"
15, "Different Projections"
16, "Error in pixel size (all image files must be same)"
17, "Error reported by function eimg_LayerGetNames"
18, "Different Projections"
19, "Error in pixel size (all image files must be same)"
20, "Error allocating memory for datalayerd"
21, "Error creating output file"
22, "Error creating error output file"
23, "Error allocating memory for datalayerd"
24, "Error allocating memory for pixelblock"
25, "Error allocating memory for pixel block"
26, "Error allocating memory for pixel block"
27, "Error allocating memory for pixel block"
28, "Error reading input file"
29, "Error reading input file"
30, "Error reading names file"
31, "Error allocating memory for layernames"
32, "Error reading names file"

3) SmartEliminate.c

"Error initializing the Toolkit"
"Error connecting to the session manager"
"You did not specify a MMU!"
"Can't open specified MMU file!"
" Error meter info function"
"You didn't specify a valid input file!"
"You didn't specify a valid output file!"
"Error getting the layernames to work on!"
"Input image has invalid number of layers!"
"Error opening Input layer!"
"Image has no width!"
"Image has no height!"
" Error map info reading"
" Error reading red color table"
" Error reading green color table"
" Error reading blue color table"
" Error reading opacity color table"
" Error open column class names"
" Error class names column is not a string type"
" Error create table"
" Error column read"
" Error deleting output file"
"Error setting the output layer name!"
"Error creating the output layer!"
" Error map info writing"
" Error projection parameters writing"
" Error writing red color table"
" Error writing green color table"
" Error writing blue color table"
" Error writing opacity color table"
" Error creating output names column"
" Error writing output names column"
" Error deleting input names column data"
" Error closing input names column"
" Error closing output names column"
"Error creating pixel buffer!"
"Can't allocate for grid array"
"Can't allocate for check array"
"Can't allocate for flag array"
"Can't allocate for matrix array"
"Can't allocate for grid[i] array"
"Can't allocate for check[i] array"
"Can't allocate for flag[i] array"
"Can't allocate for matrix[i] array"
"Can't allocate for newcov array"
" Found new class in weight file"
" Found new class in weight file"
" Error with changing meter message"
" Error with LayerRead"
" Found class in input file not in weight file"
" Error with LayerRead"
" Error with LayerRead"
" Found class in input file not in weight file"
" Error with LayerRead"

" Found class in input file not in weight file"
" Error with meter info print"
" Error deleting output file"
" Error with LayerRead"
"Algorithmic error: newcov[0] > 0"
"Algorithmic error: cover < 0"

c) System Error Messages

1) Side-by-side Error

When starting the executables, a message box with this error message may appear:

Error: The application has failed to start because its side-by-side configuration is incorrect. Please see the application event log for more detail.

This is caused by a missing Visual C++ Runtime library on the user’s machine. The libraries are available for free from Microsoft. They can be installed by following the instructions on this web page:

http://www.microsoft.com/download/en/details.aspx?id=26347

2) Erdas Imagine Error

Some users have reported the following error seen in the Session Log:

nlcd_see5_class_208.exe exited with status -1073741819

This error is generated from within Erdas’s Toolkit code. The exact source of the error is uncertain, but in many cases we have traced it to excessive RAM usage that crashes the virtual memory system.
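
The status number is easier to recognize in hexadecimal. The Session Log reports it as a signed 32-bit integer; viewed as an unsigned value it is 0xC0000005, the Windows status code for an access violation. That is consistent with a memory-related crash, although it does not by itself identify the cause. A small Python sketch of the conversion (the function name is hypothetical):

```python
def to_unsigned_status(exit_status):
    """Reinterpret a signed 32-bit exit status as an unsigned value."""
    return exit_status & 0xFFFFFFFF


code = to_unsigned_status(-1073741819)
print(hex(code))  # 0xc0000005
```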

To debug the error the user can monitor the RAM used by the program through the Windows Task Manager. In the Processes pane, display the “Mem Usage” column and apply a descending sort. As the program runs and reads more input files, monitor the usage. If it approaches 2 GB then excess memory usage is the likely problem.

To decrease the memory usage, the user can:

1) Decrease the number of layers used. This may not be feasible if the problem domain requires all the layers, but if some layers can be omitted they should be.

2) Convert the input data to a smaller block size. Since the programs operate block-by-block, more memory is required for larger block sizes. We have seen a decrease in RAM used by a factor of 3-4 when converting data from block size 512 to size 64.

3) Set the default output block size to a smaller number. This number is found in the Preferences Editor, Image Files (General) area.

4) Cut input images to the area that needs to be processed and align the blocks to the same boundary.

In addition, Erdas recommends that in general the user:

5) Set the temp directory to a user-controlled directory, not a Windows directory.

6) Apply the most recent patches to the Imagine version being used.

3) Command Line Too Long

The Sampling Tool interface communicates with the executable through parameters supplied on the command line. For the Sampling Tool these parameters are mostly the full paths of the input files. This command line can be seen in the Session Log. The Erdas software limits the number of characters allowed on the command line -- if the command line exceeds this limit it will be truncated. If it appears that the command line is being truncated, the user should use the txt file option instead of passing the file names individually.
