Final Report for the WRAP Regional Modeling Center for the Project Period March 1, 2004 – February 28, 2005

Appendix F: Appendix to Section 11, “Task 10: Continued Improvement to Model Evaluation Software”

The attached appendix is referred to in Section 11, which covers Task 10. This appendix is a draft of the model performance evaluation (MPE) documentation prepared by RMC staff at CE-CERT, University of California, Riverside and titled “User’s Guide: Air Quality Model Evaluation Software, Version 2.0.1.”

Note that the main body of this report is contained in a separate file located at http://pah.cert.ucr.edu/aqm/308/reports/final/2004_RMC_final_report_main_body.pdf
Air Quality Modeling Group CE-CERT, University of California Riverside
1084 Columbia Ave. Riverside, CA 92507
Voice: 909-781-5791 | Fax: 909-781-5790
May 24, 2005
UCR Model Evaluation Software 5/24/2005
Copyright (c) 2003 - 2004
Air Quality Modeling Group College of Engineering Center for Environmental Research and Technology
University of California Riverside
This program is free software; permission to use, copy, modify, and distribute this software and its documentation for any purpose without fee is hereby granted, provided that both the above copyright notice and this permission notice remain untouched in all copies and in supporting documentation. This software is provided "as is" without express or implied warranty of any kind.
TABLE OF CONTENTS

1. Introduction
2. Ambient Monitoring Data
3. Model Performance Metrics
4. Software Design Approach
5. Preparation of Model Performance Evaluation
   5.1. Steps to Install
   5.2. Steps to Run
      5.2.1. Preparation of Data Input Files
      5.2.2. Preparation of Monitoring Station Location Information File
      5.2.3. Preparation of Species Mapping File
      5.2.4. Preparation of Model Output File or Model Input File
      5.2.5. Define the Model Evaluation Configuration in the Driven Script File
Appendix A: Sample Driven Script
Appendix B: Sample Ambient Data Input Files
Appendix C: Sample Output Files
Acknowledgement
1. Introduction

Air quality models (AQMs) are widely used for developing emissions reduction strategies to attain National Ambient Air Quality Standards (NAAQS) for ambient air pollutants. AQMs are typically applied to an historical air pollution episode using event-specific emissions and meteorology input data, and the model predictions are then compared with ambient monitoring data to validate the usefulness of the model for other applications. For regulatory applications, the validated model is typically used to predict the benefits of emissions controls for reducing pollutants for a hypothetical future air pollution episode, with the assumption that future meteorological conditions will be similar to those of the historical episode.

The comparison of model predictions to ambient concentrations for the historical episode has been variously described as model “verification”, “validation” or “evaluation”. Oreskes et al. (1994) discuss the significance of each of these terms and argue that, for complex environmental models, the most accurate descriptor is “model evaluation”, where the evaluation is designed to test the usefulness of the model for a particular application. Consistent with Oreskes et al., we use the term model evaluation to describe the comparison of model predictions to ambient data.

Comparison of AQM predictions to ambient data is not the only criterion that should be used in evaluating air quality models. For example, model formulation, completeness, code design, and the accuracy and robustness of numerical algorithms are all important factors in selecting a particular AQM. However, assuming these choices are sound, comparison of model results to ambient data is perhaps the most important step in evaluating an AQM.
While considerable effort has been made to develop standardized AQMs for general use (e.g., CAMx and CMAQ are two widely used community models), there has not been a similar effort to standardize and make accessible ambient databases and software for performing model evaluations. Nonetheless, there is a critical need for these because AQM evaluation is a complex, resource-intensive task. Moreover, many of these tasks, such as preprocessing and QA of ambient data, need only be performed once and can then be shared among many different users. Issues that must be addressed in performing a model evaluation include the following:
• Access to raw ambient monitoring data is sometimes highly limited, e.g., the AIRS gas phase data is not readily accessible to most modelers.
• Extensive data processing, formatting and QA is required before the ambient data can be used.
• In many cases the ambient trace species data are inconsistent with the representation of the trace species used in AQMs, and specialized knowledge is required to determine how best to compare ambient data to model species.
• Software or other data analysis tools must be used to map monitoring site locations to model grid cells and to handle cases where monitoring sites are located near boundaries of grid cells.
• Software must be developed to match the monitoring data temporal averaging period to the model results and to correctly treat time zone information, where both the time zone and sampling period can vary among monitoring sites for some ambient networks.
• Policy guidance from EPA requires that a variety of model performance statistics be computed for regulatory applications of AQMs, and this requires specialized knowledge.
• A variety of error and bias metrics are used in the modeling community, so modelers must compute a large set of metrics to facilitate comparisons of model performance results for different applications and among different research groups.
Despite the importance and the complexity of model evaluation, there are no standardized data sets or software packages that are widely accessible. In early 2001, with funding from the Western Governors’ Association through its Western Regional Air Partnership (WRAP), we began an effort to perform long-term modeling of gas phase and particulate chemistry and transport in which we routinely operate the CMAQ model for annual simulations for a variety of model sensitivity cases. Because the difficulty and cost of performing a model performance evaluation are compounded by long-term modeling scenarios, we decided to invest considerable effort in developing software packages to automate the model performance evaluation. While this initially required greater effort, the MPE software makes it possible to routinely perform sophisticated model evaluation studies with minimal effort. As we continue to explore methods of presenting model evaluation results, we also continue to modify the MPE software. By releasing an open source version of the software, we hope that the air quality modeling community can adapt and add new features to the MPE software so that all modelers can benefit from others’ experience and resources.

Finally, we note that comparisons of AQMs to ambient data can be classified as objective evaluations, in which modeled species concentrations are compared to ambient observations, and diagnostic evaluations, in which combinations of trace species and reaction kinetics data are used to probe or diagnose the photochemical regimes and the chemical transformations of precursors to secondary air pollutants. In this document we focus specifically on tools for comparing model results to ambient data.

2. Ambient Monitoring Data

We have compiled the ground-level model evaluation database for 2002 using several routine and research-grade ambient monitoring databases, including the following for fine particulate matter:
• Interagency Monitoring of Protected Visual Environments (IMPROVE)
• Clean Air Status and Trends Network (CASTNET)
• EPA Speciation Trends Network (STN)
• National Acid Deposition Network (NADP)
• Southeastern Aerosol Research and Characterization (SEARCH)
In addition, we use EPA’s Aerometric Information Retrieval System (AIRS/AQS) database for archived routine gas-phase concentration measurements of ozone, NO, NO2 and CO. Data from these ambient networks are briefly described in the following sections. Figures 2.1~2.2 display the locations of the monitors for the various monitoring networks operating during 2002. Typically, these networks provide ozone, PM and visibility measurements, and the types of data available from these specialized PM monitoring programs are summarized in Table 2-1. Because of the different lumping schemes in the model chemistry and the concentration units in which models report output, some measured species cannot be compared directly to model species; certain mapping schemes are thus applied for model-to-observation species comparisons (as summarized in Table 2-2).

Table 2-1. Summary of ambient databases used in the evaluation
Monitoring Network | Chemical Species Measured | Sampling Frequency; Duration | Data Availability (sites)
Interagency Monitoring of Protected Visual Environments (IMPROVE) | Speciated PM2.5 and PM10 | 1 in 3 days; 24 hr | ~62
Clean Air Status and Trends Network (CASTNET) | Speciated PM2.5, Ozone | Weekly; Week | ~72
EPA Air Quality System (AQS) | O3, CO, SO2, NO, NOy | Hourly; 1-hr average | ~1536
Speciation Trends Network (STN) | Speciated PM2.5 | Varies; Varies | ~215
National Acid Deposition Network (NADP) | WSO4, WNO3, WNH4 | Weekly | ~100
Southeastern Aerosol Research and Characterization (SEARCH) | | |
IMPROVE

The Interagency Monitoring of Protected Visual Environments monitoring network (http://vista.cira.colostate.edu/improve) reports detailed chemical speciation in its measurements of the major visibility-reducing aerosol species on a one-in-three-day schedule. The PM fine mass species used in the evaluation are as follows:
• Sulfates (SO4), as sulfate ion;
• Nitrates (NO3), as nitrate ion;
• Organic carbon (OC), as organic carbon mass;
• Elemental carbon (EC), as light absorbing carbon or carbon soot;
• Soil (SOIL), as fine soil, the sum of several inorganic elements such as Al, Si, Ca, Fe and Ti.
These species are all measured using a 2.5-micron cut point inlet. The IMPROVE monitors also measure total PM10 and PM2.5 mass. These values are reported as the PM2.5 fine matter (FM) portion of the mass and the coarse matter (CM) portion, computed as PM10 - PM2.5. The mapping of the CMAQ species to their IMPROVE counterparts is summarized in Table 3. Note that CMAQ’s fine particle water species is not included in the mapping of IMPROVE species, because IMPROVE measures only dry particles. In addition, IMPROVE defines SOIL as the fine soil concentration, which is the sum of the concentrations of several inorganic species. Although fine soil is not specifically defined in CMAQ, it is taken as the unspeciated portion of the emitted PM2.5 species. Therefore, the model species A25J+A25I are used as surrogates for IMPROVE’s fine soil concentration.
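The CM and SOIL comparisons above reduce to simple arithmetic; a minimal sketch follows, with illustrative helper names that are not part of the released tool.

```python
# Illustrative helpers, not part of the released MPE tool.
def coarse_matter(pm10, pm25):
    """IMPROVE coarse matter (CM) is reported as PM10 minus PM2.5."""
    return pm10 - pm25

def model_soil_surrogate(a25j, a25i):
    """CMAQ has no explicit fine-soil species; the unspeciated PM2.5
    species (A25J + A25I) serve as the surrogate for IMPROVE SOIL."""
    return a25j + a25i

cm = coarse_matter(21.4, 12.9)          # ug/m3
soil = model_soil_surrogate(0.8, 0.1)   # ug/m3
```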
CASTNET

The Clean Air Status and Trends Network (CASTNET) was developed to monitor dry deposition. It includes measurements of ambient concentration, meteorology and land use, which are then used to calculate dry deposition rates. A majority of the CASTNET sites measure sulfate, nitrate (both gaseous, as HNO3, and aerosol phase) and ammonium in 7-day filter samples. Detailed data collection procedures are described at the EPA CASTNET website: http://www.epa.gov/castnet. In short, atmospheric concentration data are collected at each site with open-faced, 3-stage filter packs. The filter pack contains a Teflon filter for collection of particulate species, a nylon filter for nitric acid and a base-impregnated cellulose (Whatman) filter for sulfur dioxide. Filter packs are exposed for 1-week intervals, and are later extracted and analyzed for certain species.
Because CASTNET reports gaseous species, e.g., SO2 and HNO3, in ug/m3 units, while all models other than REMSAD report gaseous species in ppmV units, conversion factors (assumed under STP conditions) of 2617.6 and 2576.7 are applied to the modeled SO2 and HNO3 species. It has been suggested that total nitrate (NO3 + HNO3), instead of individual aerosol nitrate and gaseous nitric acid, should be used for the observation-model comparison because of possible volatilization loss on the Teflon filter pack used in CASTNET (Ames R. B. and W.C. Malm, Atmospheric Environment, 2001, 905-916). Note that a ratio of 0.9841, the molecular weight ratio of NO3 to HNO3, is applied in CASTNET’s total nitrate calculation.
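As a sketch of the unit handling described above: a ppmV-to-ug/m3 factor is the molecular weight times 1000 divided by the molar volume. Assuming a molar volume near 24.45 L/mol (the exact value the tool uses is only implied by its quoted factors of 2617.6 and 2576.7), the computation looks like this; function names are illustrative.

```python
# Sketch of ppmV <-> ug/m3 handling; the molar volume below is an assumption.
MOLAR_VOLUME = 24.45  # L/mol, assumed STP convention

def ppmv_to_ugm3_factor(mol_weight):
    """Factor converting ppmV to ug/m3: MW * 1000 / molar volume.
    For SO2 (MW ~64.06) this gives ~2620, close to the guide's 2617.6,
    which implies a slightly different molar volume."""
    return mol_weight * 1000.0 / MOLAR_VOLUME

def total_nitrate(no3_ugm3, hno3_ugm3):
    """CASTNET total nitrate: NO3 plus HNO3 scaled by the NO3/HNO3
    molecular weight ratio (~0.9841)."""
    return no3_ugm3 + 0.9841 * hno3_ugm3
```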
AQS
The Air Quality System database is EPA’s repository of “criteria air pollutant” monitoring data: carbon monoxide (CO), nitrogen dioxide (NO2), sulfur dioxide (SO2), ozone (O3), particulate matter (PM10 and PM2.5), and lead (Pb), collected since the 1970s. It replaced the Aerometric Information Retrieval System (AIRS) as EPA’s main repository for ambient air monitoring data, including data from the State and Local Air Monitoring Stations (SLAMS), the National Air Monitoring Stations (NAMS), Photochemical Assessment Monitoring Stations (PAMS), and other sources. Hourly data for ozone and several other gaseous components, including SO2, CO and NO2, can be retrieved from EPA’s web site, http://www.epa.gov/ttn/airs/airsaqs/archived%20data/downloadaqsdata.htm, through data query requests. In this study, only the hourly gaseous species in AQS are used for model evaluations. Unlike CASTNET data, gaseous species in AQS are reported in ppmV units. Therefore, modeled species for REMSAD, which outputs in ug/m3, require the unit conversion factors shown in Table 3.

STN
EPA’s Speciation Trends Network includes about 215 monitoring stations nationwide. These 215 sites may include IMPROVE sites or data from other networks; this, however, needs to be verified. Daily PM2.5 data are measured for 64 species in the STN network. Some archived STN data files were obtained from the website: http://www.epa.gov/ttn/airs/airsaqs/archived%20data/archivedaqsdata.htm.
NADP
The National Atmospheric Deposition Program/National Trends Network (NADP/NTN) is designed to measure wet deposition. The network is a cooperative effort among State Agricultural Experiment Stations, the U.S. Geological Survey, the U.S. Department of Agriculture, and other governmental and private entities. It includes over 200 sites in the continental United States, Alaska, Puerto Rico, and the Virgin Islands. The purpose of the network is to collect data on the chemistry of precipitation for monitoring geographical and temporal long-term trends. The precipitation at each station is collected weekly and analyzed for hydrogen (acidity as pH), sulfate, nitrate, ammonium, chloride, and base cations (such as calcium, magnesium, potassium and sodium). The NADP network includes a quality assurance program, so we expect to use these data without any additional QA. The NADP includes the Mercury Deposition Network (MDN) and the Atmospheric Integrated Research Monitoring Network (AIRMoN), designed to study precipitation chemistry trends with greater temporal resolution. Precipitation samples are collected daily from a network of nine sites and analyzed for the same constituents as the NADP/NTN samples. We are currently investigating the availability of the NADP and AIRMoN data; at present, we have not been able to access these data.
SEARCH
The Southeastern Aerosol Research and Characterization study is a research experiment intended to provide a detailed aerosol climatology for the Southeast. There are currently eight monitoring sites located in four states in the SEARCH network. Most sites provide hourly and daily measurements for both gaseous and aerosol species. Archived data files and collection information for the SEARCH monitoring network can be obtained from the web site: http://www.atmospheric-research.com/studies/SEARCH/index.htm

Ambient samples for SEARCH are collected on a sequential, multi-channel sampler known as the Particle Composition Monitor (PCM), and subsequently analyzed for particle speciation data. SEARCH uses two approaches, FRM Equivalent and Best Estimate, to represent the measurements of particulate matter smaller than 2.5 microns (PM2.5) and its constituents. While FRM Equivalent attempts to replicate the FRM measurement, Best Estimate attempts to represent what is actually in the atmosphere and is therefore used for the model evaluation in this study.

3. Model Performance Metrics

Statistical measures that are frequently used in current PM and visibility model performance evaluation include accuracy, error and bias, as summarized in Table 3-1. The calculations of the error and bias measures are based on the residuals of all pairs of model estimates and observations. Both error and bias measures provide a useful basis for comparison among model simulations across different model episodes. While most model performance evaluations have used the observations to normalize the error and the bias, this approach can lead to misleading conclusions: when normalizing to very low observed concentration values (e.g., clean conditions), model overpredictions are weighted much more strongly than equivalent underpredictions, as suggested by Seigneur et al. (JAWMA, 50, 588-599, 2000). Seigneur et al. (2000) have recommended that peak bias, average fractional bias, average fractional gross error, and regression be included as the key statistics in a model’s operational evaluation to alleviate this problem. As the criteria for model performance evaluation have not been established, we recommend using all statistical measures listed in Table 3-1 in this study.
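A small numerical example of the normalization problem described above: with one very low observed value, a roughly symmetric over/under-prediction pair drives the mean normalized bias strongly positive, while the fractional bias remains bounded between -2 and 2. Function names here are illustrative, not the tool's routines.

```python
# Illustration of the normalization asymmetry discussed above.
def mnb(pred, obs):
    """Mean normalized bias: average of (P - O) / O."""
    return sum((p - o) / o for p, o in zip(pred, obs)) / len(obs)

def mfb(pred, obs):
    """Mean fractional bias: average of 2 (P - O) / (P + O)."""
    return sum(2 * (p - o) / (p + o) for p, o in zip(pred, obs)) / len(obs)

obs  = [0.5, 10.0]
pred = [5.0, 5.5]   # overpredict the clean site, underpredict the other
b_norm = mnb(pred, obs)   # dominated by the low-observation pair
b_frac = mfb(pred, obs)   # stays within [-2, 2] by construction
```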
Table 3-1 Recommended statistical measures for model performance evaluation
Accuracy of unpaired peak (Au):
    (P_upeak - O_peak) / O_peak
    where O_peak = peak observation; P_upeak = unpaired peak prediction within certain surrounding grid cells of the peak observation

Accuracy of paired peak (Ap):
    (P_peak - O_peak) / O_peak
    where P_peak = paired (in both time and space) peak prediction

Coefficient of determination (r^2):
    [ sum_{i=1..N} (P_i - Pbar)(O_i - Obar) ]^2 / [ sum_{i=1..N} (P_i - Pbar)^2 * sum_{i=1..N} (O_i - Obar)^2 ]
    where P_i = prediction at time and location i; O_i = observation at time and location i; Pbar = arithmetic average of P_i, i = 1, 2, ..., N; Obar = arithmetic average of O_i, i = 1, 2, ..., N

Normalized Mean Error (NME):
    sum_{i=1..N} |P_i - O_i| / sum_{i=1..N} O_i
    Reported as %

Root Mean Square Error (RMSE):
    [ (1/N) sum_{i=1..N} (P_i - O_i)^2 ]^(1/2)

Fractional Gross Error (FE):
    (2/N) sum_{i=1..N} |P_i - O_i| / (P_i + O_i)
    Reported as %

Mean Absolute Gross Error (MAGE):
    (1/N) sum_{i=1..N} |P_i - O_i|

Mean Normalized Gross Error (MNGE):
    (1/N) sum_{i=1..N} |P_i - O_i| / O_i
    Reported as %

Mean Bias (MB):
    (1/N) sum_{i=1..N} (P_i - O_i)

Mean Normalized Bias (MNB):
    (1/N) sum_{i=1..N} (P_i - O_i) / O_i
    Reported as %

Mean Fractionalized Bias (Fractional Bias, MFB):
    (2/N) sum_{i=1..N} (P_i - O_i) / (P_i + O_i)
    Reported as %

Normalized Mean Bias (NMB):
    sum_{i=1..N} (P_i - O_i) / sum_{i=1..N} O_i
    Reported as %

Bias Factor (BF):
    (1/N) sum_{i=1..N} (P_i / O_i)
    Bias Factor = 1 + MNB; reported as ratio notation (prediction:observation)
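Most of the Table 3-1 measures can be sketched in a few lines for paired prediction/observation arrays. The function below is an illustration of the definitions, not the tool's own implementation.

```python
# Minimal sketch of several Table 3-1 statistics for paired arrays P, O.
from math import sqrt

def perf_stats(P, O):
    N = len(O)
    resid = [p - o for p, o in zip(P, O)]
    Pbar, Obar = sum(P) / N, sum(O) / N
    cov = sum((p - Pbar) * (o - Obar) for p, o in zip(P, O))
    r2 = cov**2 / (sum((p - Pbar)**2 for p in P)
                   * sum((o - Obar)**2 for o in O))
    return {
        "MB":   sum(resid) / N,
        "MAGE": sum(abs(r) for r in resid) / N,
        "RMSE": sqrt(sum(r * r for r in resid) / N),
        "NMB":  100.0 * sum(resid) / sum(O),                    # %
        "NME":  100.0 * sum(abs(r) for r in resid) / sum(O),    # %
        "MFB":  100.0 * sum(2 * (p - o) / (p + o)
                            for p, o in zip(P, O)) / N,         # %
        "BF":   sum(p / o for p, o in zip(P, O)) / N,
        "r2":   r2,
    }

stats = perf_stats([4.0, 6.0, 9.0], [5.0, 5.0, 10.0])
```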
4. Software Design Approach

Five types of input file:
• Observed Ambient Data File in spreadsheet ASCII format
• Monitor Station Information File in spreadsheet ASCII format
• Air Quality Model Output File(s) in netcdf format
• Air Quality Model Meteorological Input File(s) in netcdf format (optional)
• Ambient/Model Data Species Mapping File(s) in ASCII format

One simple Driven Script, which specifies the following:
• What are the models’ names?
• Where are the model files?
• What is the evaluation period?
• What are the monitoring data network names?
• What types of output plots do you want?

Three types of output files:
• Observed and Model Data Query files in ASCII format, allowing further examination or processing with other database software, e.g., Excel.
• Results of 17 types of Statistical Analysis Matrix based on "One Site All Day", "All Site All Day" and "One Day All Site":
  1. Accuracy of Paired Peak
  2. Coefficient of Determination
  3. Normalized Mean Error
  4. Root Mean Square Error
  5. Fractional Gross Error
  6. Mean Absolute Gross Error
  7. Mean Normalized Gross Error
  8. Mean Bias
  9. Mean Normalized Bias
  10. Fractional Bias
  11. Normalized Mean Bias
  12. Observed Mean
  13. Predicted Mean
  14. Standard Deviation for Observation
  15. Standard Deviation for Prediction
  16. Correlation Variance
  17. Bias Factor
• 4 types of plots in PNG format:
  1. Time Series Plot based on "One Site All Day"
  2. Scatter Plot based on "One Site All Day"
  3. Scatter Plot based on "All Site All Day"
  4. Scatter Plot based on "One Day All Site"

Notes: All plots also show two statistical results chosen from the above 17 types, and all scatter plots show regression analysis results.
5. Preparation of Model Performance Evaluation
5.1. Steps to Install
o Unpack the package with `tar -xzvf Model_Eval_Tool.v2.tar.gz` in the directory where you want to install.
o Use the default pgf90 compiler or define another compiler in “Model_Eval_Tool.v2/Makefile”.
o Go into Model_Eval_Tool.v2 and type `make`; the executable will be in bin/.
o Done, provided gnuplot is already installed.
5.2. Steps to Run
5.2.1. Preparation of Observed Data Input File:
This program supports four types of ambient datasets differentiated by data recording method: “Daily Average”, “Hourly”, “Weekly Total”, and “Weekly Average”.
Table 5-1. Format of the “Daily Average” observation data input files.
Column | Description | Format | Notes
1 | Site Identification | One Text String | String consists of characters and/or numbers
2 | Year | Integer | 4 digits (e.g., 1996)
3 | Julian Date | Integer | <= 3 digits (e.g., 8, 45, 183)
4 | Data Value | Float | Data sample value for 1 species

Table 5-2. Format of the “Hourly” observation data input files.
Column | Description | Format | Notes
1 | Site Identification | One Text String | String consists of characters and/or numbers
2 | Year | Integer | 4 digits (e.g., 1996)
3 | Julian Date | Integer | <= 3 digits (e.g., 8, 45, 183)
4 | Hour | Integer | <= 2 digits, 0 <= Hour <= 23
5 | Data Value | Float | Data sample value for 1 species
Table 5-3. Format of the “Weekly Total” or “Weekly Average” observation data input files.
Column | Description | Format | Notes
1 | Site Identification | One Text String | String consists of characters and/or numbers
2 | Year | Integer | 4 digits (e.g., 1996)
3 | Start Julian Date | Integer | <= 3 digits (e.g., 1, 39, 183)
4 | End Julian Date | Integer | <= 3 digits (e.g., 8, 45, 190)
5 | Start Hour | Integer | <= 2 digits, 0 <= Hour <= 23
6 | End Hour | Integer | <= 2 digits, 0 <= Hour <= 23
7 | Data Value | Float | Data sample value for 1 species
Notes:
a) Each observation data input file always has a header line in row 1 before the data. The header consists of one string for each column:
   i) “Daily Average” data:
      Site Year Jdate $SpeciesName
   ii) “Hourly” data:
      Site Year Jdate Hour $SpeciesName
   iii) “Weekly Total” or “Weekly Average” data:
      Site Year StartJdate EndJdate StartHour EndHour $SpeciesName
b) If there is more than one species in the observation data files, add more data value columns correspondingly.
c) Units should already be converted to ppmV for gas phase species, microgram/m3 for aerosol species, 1/Mm for extinction coefficient species, and kg/ha for wet and dry deposition species.
d) Columns are separated with spaces or tabs.
e) For missing or illegal ambient data, fill in the data field as -999.0.
f) During model evaluation, model data will be processed only for the grid cell(s) in which the monitoring station(s) are located, and only for the corresponding observed sampling periods.
g) In the plotting results, a model data point will be shown only when it has corresponding “available” observed data at the same location with the same sampling period; “available” means the observed data value is greater than 0 (i.e., not the -999.0 missing flag).
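A hypothetical reader for the "Daily Average" layout of Table 5-1, following notes a) through e): one header row, whitespace-separated columns, and -999.0 flagging missing values. This is a sketch, not the tool's own parser.

```python
# Hypothetical reader for "Daily Average" observation files (Table 5-1 layout).
def read_daily_avg(path):
    records = []
    with open(path) as fh:
        header = fh.readline().split()
        species = header[3:]                     # species columns start at 4
        for line in fh:
            cols = line.split()
            if not cols:
                continue
            site, year, jdate = cols[0], int(cols[1]), int(cols[2])
            values = {sp: float(v) for sp, v in zip(species, cols[3:])}
            # drop missing/illegal samples flagged as -999.0
            values = {sp: v for sp, v in values.items() if v != -999.0}
            records.append((site, year, jdate, values))
    return records
```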
5.2.2 Preparation of Monitoring Station Location Information File:
Table 5-4. Format of the Monitor Station Information File.
Column | Description | Format | Notes
1 | Site Identification | One Text String | String consists of characters and/or numbers; should match those used in the Observed Data file(s)
2 | Longitude | Float | Should be negative
3 | Latitude | Float | Should be positive
4 | Time Zone | Integer | 5 – Eastern Time; 6 – Central Time; 7 – Mountain Time; 8 – Pacific Time
5 | Daylight Saving Flag | Integer | 1 – Daylight Saving Enabled; 0 – Daylight Saving is Special (e.g., in parts of Arizona and Indiana)

Notes:
a) Each Monitoring Station Location Information File always has a header line in row 1 before the data. The header consists of one string for each column:
   Site Longitude Latitude TimeZone DayLightSaving
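The time-zone codes above can be read as hours behind GMT in standard time. The sketch below shows the implied local-to-GMT conversion; the tool's exact daylight-saving logic is not documented here, so the DST handling (subtracting one hour from the offset when DST is in effect) is an assumption.

```python
# Sketch of local-to-GMT conversion using the Table 5-4 time-zone codes
# (5 = Eastern ... 8 = Pacific, i.e. hours behind GMT in standard time).
# DST handling is an assumption, not documented tool behavior.
def local_to_gmt_hour(local_hour, zone, dst_in_effect=False):
    offset = zone - (1 if dst_in_effect else 0)
    return (local_hour + offset) % 24

h1 = local_to_gmt_hour(14, 5)         # 2 pm EST -> 19:00 GMT
h2 = local_to_gmt_hour(14, 8, True)   # 2 pm PDT -> 21:00 GMT
```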
5.2.3 Preparation of Species Mapping File
Figure 5.1: A Complete Sample Species Mapping File

Illustration:
a) Reserved keywords: "ambient:", "model:" (case sensitive)
b) Operators allowed: "+", "-", "*", "/", "=" only
c) Species naming convention: names start with a letter [a-zA-Z]
d) Factor Species: species names shown on the right-hand side of an equation
e) Target Species: species names shown on the left-hand side of an equation
f) Ambient Scope: defined between the keywords "ambient:" and "model:". In Ambient Scope, every Factor Species must either appear on the first line of the Observed Ambient Data File, with a corresponding data column provided, or have been defined as a Target Species by one of the preceding equations in Ambient Scope.
g) Model Scope: defined below the keyword "model:". In Model Scope, every Factor Species must either be a model species in one of the model files, or have been defined as a Target Species by one of the preceding equations in Model Scope. If a species is defined by more than one type of model file at the same time, the first definition occurrence is used.
h) fRH is a lifetime exception: its lifetime spans both Ambient Scope and Model Scope.
i) For every Target Species in Ambient Scope, there must be a Target Species in Model Scope in the same sequence order. Each Target Species in Ambient Scope is paired with the corresponding Target Species in Model Scope in the subsequent plotting and statistical analysis.
j) Target Species in Ambient Scope must be in the same order as those in Model Scope. If you run a two-model comparison, the first model’s Ambient Target Species must also be in the same order as the second model’s Ambient Target Species. (The plotting module pairs up the ambient and model Target Species data, or the ambient, model1 and model2 Target Species data when running a two-model comparison, based on this consistent order.)
k) Defining the unit for a Target Species on plots: the program chooses the unit to put on the plots based on how the Target Species is named in the Species Mapping File(s):
Table 5-5. Naming convention to control the unit used on plots.
Target Species | Units | Notes
Extinction Coefficient | 1/Mm | Attach "BEXT_" to the beginning of the Target Species name, e.g., BEXT_becon = 1000*EXT_Recon
Gas Phase | ppmV | Attach "G_" to the beginning of the Target Species name, e.g., G_O3 = O3
Wet Deposition | kg/ha | Attach "_wdep" to the end of the Target Species name, e.g., SO4_wdep = ASO4J + ASO4I
Dry Deposition | kg/ha | Attach "_ddep" to the end of the Target Species name, e.g., SO4_ddep = ASO4J + ASO4I
Aerosol | microgram/m3 | Do nothing special, e.g., CM = ACORS + ASEAS + ASOIL
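Mapping equations of this form can be evaluated with a small interpreter. The sketch below is hypothetical (it uses Python's restricted `eval`, not the tool's Fortran parser): it resolves each Target Species from Factor Species that are already defined, honoring the rule that a Target Species may be reused by later equations.

```python
# Hypothetical mini-evaluator for species-mapping equations.
def eval_mapping(lines, species):
    """lines: equations like 'CM = ACORS + ASEAS + ASOIL';
    species: dict of Factor Species name -> value."""
    env = dict(species)
    for line in lines:
        target, expr = (s.strip() for s in line.split("=", 1))
        # restrict eval to known species names and arithmetic operators only
        env[target] = eval(expr, {"__builtins__": {}}, env)
    return env

env = eval_mapping(
    ["CM = ACORS + ASEAS + ASOIL", "G_O3 = O3"],
    {"ACORS": 3.0, "ASEAS": 1.5, "ASOIL": 0.5, "O3": 0.065},
)
```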
5.2.4 Preparation of Model Output File or Model Input File: CMAQ output/input in netcdf format can be fed in directly. CAMx model binary output needs to be pre-processed by a CAMx_to_netCDF_converter first. Use a similar method to handle other air quality models.
5.2.5 Define the Model Evaluation Configuration in the Driven Script File
Table 5-6. A Complete Configuration Flag Illustration:

MODEL_NUM
  Options: 1, 2
  Total number of models being evaluated; up to 2 models are supported at the same time. e.g., setenv MODEL_NUM 2

FIRST_MODEL
  Options: a string with length <= 10 characters
  First model name that will be shown on plots. e.g., setenv FIRST_MODEL basecase

FIRST_MODEL_GDTYPE
  Options: LAMBERT, UTM, LATLON
  Coordinate system that the first model uses. e.g., setenv FIRST_MODEL_GDTYPE LAMBERT

FIRST_MODEL_FILENUM
  Options: any integer between 1 and 10
  Number of model file types in the evaluation input. e.g., if you want to evaluate the .CONC and .AEROVIS files for CMAQ, then setenv FIRST_MODEL_FILENUM 2; if you only want to evaluate the .CONC file for CMAQ, then setenv FIRST_MODEL_FILENUM 1

FI_FILE${i}
  Options: a string with length <= 256
  First model’s input file(s), with ${i} <= ${FIRST_MODEL_FILENUM}. e.g., if $FIRST_MODEL_FILENUM = 3, then FI_FILE1, FI_FILE2 and FI_FILE3 need to be defined. For a model file name in the format /home/aqm/CCTM_cb4.CONC.YYYYDDD (YYYYDDD is a Julian date such as 1996215), setenv FI_FILE1 /home/aqm/CCTM_cb4.CONC.JDATE. For a model file name in the format /home/aqm/CCTM_cb4.CONC.YYYYMMDD (YYYYMMDD is a general date such as 19960326), setenv FI_FILE1 /home/aqm/CCTM_cb4.CONC.GDATE

SECOND_MODEL
  Needs to be set only if $MODEL_NUM = 2; refer to FIRST_MODEL

SECOND_MODEL_GDTYPE
  Needs to be set only if $MODEL_NUM = 2; refer to FIRST_MODEL_GDTYPE

SECOND_MODEL_FILENUM
  Needs to be set only if $MODEL_NUM = 2; refer to FIRST_MODEL_FILENUM

SE_FILE${i}
  Needs to be set only if $MODEL_NUM = 2; refer to FI_FILE${i}

MODEL_YEAR
  Options: any integer between 1990 and 2004
  Modeling year. e.g., setenv MODEL_YEAR 1996. The program can be scaled up to support other years very easily.

SDATE
  Options: any integer between 1 and 366
  Model evaluation start Julian date. e.g., setenv SDATE 183

EDATE
  Options: any integer between 1 and 366
  Model evaluation end Julian date. e.g., setenv EDATE 213
OBSERVED_NETWORK
  Options: a string with length <= 256
  Any monitoring network name; the value does not affect the model evaluation result. e.g., setenv OBSERVED_NETWORK IMPROVE

EVAL_FREQUENCY
  Options correspond to “daily average”, “hourly”, “weekly average” and “weekly total”, representing how the monitoring network records its data. e.g., setenv EVAL_FREQUENCY DAILY_AVG

TIMEZONE
  Options: GMT, LOCAL
  Represents whether or not the monitoring network records its data in GMT. e.g., setenv TIMEZONE LOCAL

CON_FILE
  Options: a string with length <= 256
  Observed data text file location. e.g., setenv CON_FILE /home/bwang/improve.dat

STN_FILE
  Options: a string with length <= 256
  Monitor station information text file location. e.g., setenv STN_FILE /home/bwang/improve.stn

FIRST_MODEL_MAPPING
  Options: a string with length <= 256
  First model’s Species Mapping File location. e.g., setenv FIRST_MODEL_MAPPING /home/bwang/improve_species_mapping.txt

SECOND_MODEL_MAPPING
  Needs to be set only if $MODEL_NUM = 2; refer to FIRST_MODEL_MAPPING

PLOTTING_SCALE
  Options: LOG, NOLOG
  Show scatter plots in log or non-log scale. e.g., setenv PLOTTING_SCALE NOLOG

PLOT_TIMESERIES
  Options: YES, NO
  Create Time Series plots or not. e.g., setenv PLOT_TIMESERIES YES

PLOT_ALLDAY_ONESITE
  Options: YES, NO
  Create the “All Day One Site” scatter plot and statistical result or not. e.g., setenv PLOT_ALLDAY_ONESITE NO

PLOT_ALLSITE_ONEDAY
  Options: YES, NO
  Create the “All Site One Day” scatter plot and statistical result or not. e.g., setenv PLOT_ALLSITE_ONEDAY YES

PLOT_ALLSITE_ALLDAY
  Options: YES, NO
  Create the “All Site All Day” scatter plot and statistical result or not. e.g., setenv PLOT_ALLSITE_ALLDAY YES

PLINEAR
  Options: YES, NO
  Show the linear regression line and equation on scatter plots or not. e.g., setenv PLINEAR NO

ACCURACY_PAIRED_PEAK, COEF_DETERMINATION, NORM_MEAN_ERROR, ROOT_MEAN_SQR_ERROR, FRAC_GROSS_ERROR, MEAN_ABS_GROSS_ERROR, MEAN_NORM_GROSS_ERROR, MEAN_BIAS, MEAN_NORM_BIAS, MEAN_FRAC_BIAS, NORM_MEAN_BIAS, OBS_MEAN, MOD_MEAN, SD_OBS, SD_MOD, CORRELATION_VARIANCE, BIAS_FACTOR
  Options: YES, NO (each flag is set individually)
Notes:
• Among the statistical options from “ACCURACY_PAIRED_PEAK” to “BIAS_FACTOR”, only two statistical analysis results with the “YES” option will be shown on plots, while all of the above statistical analysis results will be shown in the Statistical Analysis Matrix ASCII output file.
• If more than two statistical options are set to “YES”, the first two “YES” options will be chosen for the plots.
• If fewer than two statistical options are set to “YES”, the program will choose items set to “NO” to fill the deficiency.