-
HIERARCHAL INDUCTIVE PROCESS MODELING AND ANALYSIS
Youri Noël Nelson
A Thesis Submitted to the University of North Carolina Wilmington
in Partial Fulfillment
of the Requirements for the Degree of Master of Science
Department of Mathematics and Statistics
University of North Carolina Wilmington
2011
Approved by
Advisory Committee
Michael Freeze Xin Lu
Wei Feng Stuart Borrett
Chair Co-Chair
Accepted by
Dean, Graduate School
-
TABLE OF CONTENTS
ABSTRACT
DEDICATION
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS
1 INTRODUCTION
2 METHOD
  2.1 HIPM Description
    2.1.1 Measure of Fit
    2.1.2 Entities specification and model library
  2.2 Experiment Design
3 COMPUTATIONAL RESULTS
  3.1 Increase in number of time-series input
  3.2 Value of Information
  3.3 Summary
4 ANALYTICAL ANALYSIS
  4.1 Most recurrent models
  4.2 Preliminaries
  4.3 Model A
  4.4 Model B
  4.5 Model C
  4.6 Effects of increasing the number of constraints
5 CONCLUSION
REFERENCES
APPENDIX
  A. Sample CIAO data - 1997
  B. Full entity specification file
  C. Full Ross Sea generic model library
  D. Models selected in both experiment 8 and 19
  E. Models selected in both experiment 8 and 21
-
ABSTRACT
Understanding phytoplankton dynamics in the Ross Sea polynya may yield knowledge
useful in the search for a solution to the world's rising carbon dioxide levels.
Modeling such dynamics is a lengthy and tedious process that can be helped by
computational tools such as HIPM. This system relies on knowledge that is already
available, in the form of time-series data and a process library, to construct and
then evaluate models. In this research, models were ranked by sum of squared error
from lowest to highest, the lowest being the best-fit model. Some of the questions
that arise from the use of HIPM concern the amount and value of the time series
provided to the software, from which we formulated two hypotheses. Will having more
time series improve the output of the system? Will time series for different
variables provide output of different quality? Through 31 experiments and
mathematical analysis, we began to answer these questions. The computational
results showed that our first hypothesis does not always hold true, which is thought
to be a consequence of the way fit is measured. The mathematical analysis, on the
other hand, showed many variations across the experiments in the structure of the
zooplankton equation, which may indicate that the process library needs to be better
defined and that the system needs to take into consideration not only the
phytoplankton species Phaeocystis antarctica but also diatoms. This thesis provides
the start of an answer to these hypotheses, but further research is still needed.
-
DEDICATION
This thesis is dedicated to all my friends and family who have supported me in this
incredible journey I started 5 years ago. More importantly, I want to dedicate it to
our Lord and Savior, as I certainly would not be here today without his help, support
and comfort.
“I can do anything through God who strengthens me.” (Philippians 4:13)
I also want to dedicate this to my nephew Noah Nelson and my niece Sarah Nelson for
always putting a smile on my face during the tough times, for their unconditional
love and for making me want to persevere always. I love you beyond words.
Thank you, Christel & Douglas Nelson, Lara Nelson, Celio & Elise Nelson, Sven
Diebold, Andrew & Robin Nelson, Ed & Pat Nelson, Joann Nelson, Philip Varvaris,
Luke Brown, Taylor Jackson and Bud Edwards (for always being there at the right
place at the right time) and all my other friends and family members who are not
named here but are present in my heart and to whom I am so grateful for all the
words of encouragement and support throughout the years.
-
ACKNOWLEDGMENTS
I would like to thank Dr. Feng, Dr. Borrett, Dr. Simmons, Dr. Freeze and Dr. Lu
for all their help and support in this endeavor and process, as well as my friend
Brevin Rock for his advice on completing a Master's thesis.
-
LIST OF TABLES
1 Example of entity definition and instantiation (P)
2 Example of process definition (Growth)
3 Data contained in CIAO set
4 Cutoff Value Chart
5 Model A Parameter Values
6 Model B Parameter Values
7 Model C Parameter Values
-
LIST OF FIGURES
1 Initial Conceptual Model
2 Tree diagram representing the process library
3 Map of the Ross Sea
4 reMSE summary - Part 1
5 reMSE summary - Part 2
6 reMSE summary - Part 3
7 Good fit Models VS. Number of inputted time-series
8 Mean Activation Values Graph
-
LIST OF SYMBOLS
P = Amount of phytoplankton present in the system (mg Chl a/m^3)
D = Detritus concentration (mg C/m^3)
F = Iron concentration (µM)
Z = Zooplankton concentration (mg C/m^3)
N = Nitrate concentration (µM)
E_ice(t) = Sea ice concentration
E_TH2O(t) = Temperature of the water (°C)
E_PUR(t) = Photosynthetically usable radiation (µmol photons m^-2 s^-1)
E_TH2Omax = Maximum water temperature
E_TH2Omin = Minimum water temperature
a_i = Optimal parameters of the system selected by the HIPM software
-
1 INTRODUCTION
Biology, mathematics, physics, ecology and every other type of science share a
common objective: to explain and describe the world that surrounds us. All of these
fields build upon the collection of observations to explain recurring phenomena. To
explain and depict some of these phenomena, scientists make use of models, which can
take a variety of forms including conceptual, formal, physical and diagrammatic
(Haefner, 2005).
Models are widely used in science, and researchers continue to look for tools and
techniques that will enhance their ability to construct new models or improve
existing ones. The modeling technique differs with the task. For instance, in his
book Haefner (2005) uses a Forrester diagram, a qualitative model formulation, to
model a hypothetical agro-ecosystem. In biology, predator-prey interactions can be
described with differential equation models such as those formulated by Lotka and
Volterra (Berryman 1992). Models are useful for studying systems because they let
researchers conduct experiments and test theories that would otherwise be unethical
or impossible to perform on the system itself, and because they enable researchers
to predict the behavior of the varying components of an ecosystem.
Model construction is a difficult and lengthy endeavor. For a given system there may
be many different combinations of processes (e.g. grazing, decay, growth) that could
provide a plausible explanation for the behavior being studied, and exploring and
evaluating all these possibilities is a tedious task. In the past, limited
computational power restricted scientists' ability to investigate complex models, so
certain known or suspected processes were left out to simplify calculations. As
computational power increased, so did our capacity to evaluate more intricate models
(Oreskes 2000). In addition, numerical models of natural systems are non-unique:
there are multiple ways to represent the same dynamic. Creating computational tools
that quickly and automatically evaluate multiple models therefore seemed a promising
way to search through the extensive model space. The success of machine learning and
data mining in commercial domains led scientists to investigate automated modeling
for that purpose (Fayyad et al., 1996).
The act of gathering small pieces of information and combining them with prior
knowledge to formulate a complex overview of the object or process studied is called
induction. Induction avoids searching the entire space of possible equations by
piecing together only the meaningful terms; for instance, a predator-prey model will
need terms specifying growth and death (Todorovski et al. 2005). Inductive modeling
methods (e.g. LAGRAMGE, HIPM, ARIMA, FUSE) use the principles of induction to
construct models of the studied system. Methods used for commercial applications,
such as the Knowledge Discovery in Databases (KDD) process, were insufficient for
scientific purposes because they only described, and did not explain, the observed
system behavior (Langley et al. 2006). A simple example is the modeling of water
consumption in a city: a water company could easily create a numerical model based
on previous years that gives a good estimate of projected water consumption over
time, but the model may not explain why consumption fluctuates the way it does. In
other words, the commercial methods produce models that are useful for making
accurate predictions about a system but are very limited in explaining which
processes drive the system's behavior; these methods did not explore the space of
all possible models. Thus, induction methods had to be enhanced to automate the task
of building and evaluating multiple models (Dzeroski et al. 1995).
In this thesis, I used the hierarchal inductive process modeling technique, which is
implemented as a computer algorithm called HIPM (Langley et al. 2006; Bridewell et
al. 2005; Dzeroski et al. 1995; Borrett et al. 2007). Inductive process modeling
methods such as HIPM (Bridewell et al. 2008; Borrett et al. 2007; Langley et al.
2006; Todorovski et al. 2005) search through two spaces. The first space is made up
of mathematical formulations and alternative model structures, which consist of
entities, processes and the connections binding the two; the second space is made up
of parameter values (Borrett et al. 2007). The system takes three inputs
(Todorovski et al. 2005): a hierarchy of generic processes, where a process is an
action on the system defined by means of fragments of mathematical equations
together with the rules for combining these fragments with the rest of the
equations; a set of entities, where an entity is an object that groups the
properties of an organism or nutrient by means of variables and parameters; and a
set of observed time series of the entities' variables. HIPM performs one of two
searches for the model structure, a heuristic search or an exhaustive search. With
the exhaustive search option selected, HIPM creates all the model structures that
are possible given the background knowledge and selects the best set of parameters
for each model structure. Finally, the system ranks the models based on their sum of
squared error (Todorovski et al. 2005).
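As a rough illustration of the structure search described above, the following Python sketch enumerates candidate model structures as the Cartesian product of alternative process formulations and discards the combinations that violate a constraint. It is not HIPM code: the process slots, the alternatives listed for each slot, and the constraint in `is_allowed` are hypothetical placeholders chosen only to show the idea.

```python
from itertools import product

# Hypothetical alternatives for each generic process slot (not the Ross Sea library).
ALTERNATIVES = {
    "growth": ["exponential", "logistic"],
    "grazing": ["holling_1", "holling_2", "holling_3", "ivlev"],
    "remineralization": ["linear", "none"],
}


def is_allowed(structure):
    """Hypothetical constraint: logistic growth may not be paired with Ivlev grazing."""
    return not (structure["growth"] == "logistic" and structure["grazing"] == "ivlev")


def enumerate_structures(alternatives, constraint=is_allowed):
    """Exhaustively list every combination of alternatives that passes the constraint."""
    slots = sorted(alternatives)
    structures = []
    for choice in product(*(alternatives[slot] for slot in slots)):
        structure = dict(zip(slots, choice))
        if constraint(structure):
            structures.append(structure)
    return structures


structures = enumerate_structures(ALTERNATIVES)
print(len(structures))  # 2 * 4 * 2 = 16 combinations, 2 of which are rejected
```

Even in this toy setting the number of structures grows multiplicatively with the number of alternatives per process, which is why a library with many alternative formulations can yield a large model space.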
This system allows for the representation of complex system dynamics. For example,
in a study of photosynthesis regulation it generated a model that reproduced both
the qualitative shape and the quantitative details of the time-series data while
incorporating processes that made biological sense (Langley et al. 2006).
In this thesis I used the HIPM tool combined with the appropriate process library to
study the phytoplankton dynamics in the aquatic ecosystem of the Ross Sea. Here the
term process library is defined as the collection of processes (e.g. grazing, decay,
growth) and entities (e.g. phytoplankton, zooplankton, nitrate), together with their
relations to one another. It is best represented by Figure 2.
Figure 1: This schematic represents the interactions between the entities and the
exogenous variables driving the model. Here, P, Z, D, NO3 and Fe are the state
variables. PUR, T and Ice are the exogenous variables acting on the system and
influencing the state variables. The arrows represent the action of one variable on
another (Borrett, unpublished research).
Arrigo, Borrett, Bridewell and Langley used HIPM and the Ross Sea process library to
create and search a space of over 1120 possible model structures to explain the
temporal dynamics of phytoplankton and nitrogen in the Ross Sea ecosystem. All
models contained five state variables: phytoplankton, zooplankton, detritus,
nitrogen and iron. Time series for both phytoplankton and nitrogen were available
and were given to HIPM along with the process library. Their initial research found
that 200 model structures were of good fit, where good fit was defined as a relative
mean squared error (reMSE, defined in Section 2.1.1) less than or equal to 0.2. From
a computer scientist's standpoint, reducing the search space from 1120 model
structures to 200 is a great accomplishment; for a biologist, however, the solution
is not specific enough and offers few insights into the ecosystem dynamics. There is
a need for ways to constrain the search further, bringing down the number of good-fit
models and making the output useful to biologists.
Figure 2: A tree diagram representing the process library constructed for the Ross
Sea ecosystem problem. The interaction between processes and entities is defined in
the library as explained in Section 2.1.2 (Borrett et al. 2007).
Superficially, HIPM appears related to equation discovery methods. Equation
discovery is a subfield of machine learning (Langley, 1995; Mitchell, 1997) that
investigates collections of measurements and observations, using different
computational methods, in search of quantitative laws (Todorovski, 2003). For
example, the LAGRAMGE system takes as input a dependent variable and background
knowledge encoded as a grammar specifying the space of possible equations, and
outputs the best equation for that variable; it can perform the search for only one
variable at a time (Dzeroski et al. 1993; Todorovski 2003). Equation discovery is in
turn related to the methods used in Ljung's work (1993) on system identification,
which is further removed from inductive process modeling.
The main assumption behind system identification is that the model structure is
known and that the primary concern is finding adequate parameter values; equation
discovery focuses on both the structure and the parameter values (Todorovski et al.
1998). Both of these approaches produce descriptive models that summarize and
predict the data, but they fail to search through the space of alternative
explanations: they do not take into account models with theoretical variables or
consider alternative processes to explain certain dynamics (Bridewell et al. 2005).
The Southern Ocean covers an area equivalent to about 10% of the global ocean and is
a key element of the global ocean system, as it links all major ocean basins and
facilitates the global distribution of its deep water; it is considered to play an
important part in the global carbon (C) cycle (Arrigo et al. 2003). The Ross Sea
polynya (an area of open water surrounded by sea ice) is one of the most productive
ecosystems in the Southern Ocean, as it experiences some of the largest
phytoplankton blooms in the region (Arrigo et al. 1994, 1998, 2000, 2003).
Phytoplankton productivity is important to the carbon cycle because photosynthesis
removes carbon dioxide (CO2) from surface water, and part of that carbon is then
exported to deep ocean water. What makes the Ross Sea polynya so interesting for
ecologists, compared to other locations such as Terra Nova Bay, is the type of
phytoplankton dominating the ecosystem. In the Ross Sea polynya Phaeocystis
antarctica dominates, as opposed to diatoms (species such as Fragilariopsis spp.) in
Terra Nova Bay. Phaeocystis antarctica is thought to resist grazing more than other
phytoplankton species, which could imply that more carbon is taken from shallow
water to depth as the uneaten phytoplankton, which has taken up CO2, sinks to the
bottom (Tagliabue and Arrigo 2003). Deep ocean water has a longer residence time
than shallow water, meaning that carbon trapped in deep ocean water is effectively
removed from atmospheric circulation for a much longer time than the carbon
contained in surface water.
Figure 3: Map of the southwestern Ross Sea showing the Ross Sea polynya, located
north of the Ross Sea Ice Shelf, and the Terra Nova Bay polynya, located on the
western continental shelf (Arrigo et al. 2003)
Thus, there is an incentive to understand the ecological processes that control
phytoplankton productivity and community composition (which species dominates) in
the Ross Sea. Fluctuations in the phytoplankton population could potentially affect
CO2 levels in the atmosphere (Carlson et al. 1998), and if we can figure out why
Phaeocystis antarctica is predominant, that would be useful information for
scientists as they entertain the idea of altering phytoplankton populations around
the world to create carbon sinks, providing a temporary solution to our CO2 problem.
These elements initiated the search for the best process explanation of the
phytoplankton dynamics in the Ross Sea. By determining which processes act upon the
system and which entities are most important, scientists will accumulate knowledge
that may prove valuable in the fight against rising CO2 levels.
As mentioned, the tool that I have chosen for model search relies on measurements
and observations of one or more variables of a system to make inferences about the
remaining variables, for which no data are available, and about the processes at
work in the system. In Borrett's study, the only state variables for which
measurements and observations were available were phytoplankton and nitrate.
Ultimately the goal is to select model structures that are good approximations of
the natural system and give good insight into the processes at work in it. However,
I was faced with an underconstrained optimization problem: no data were available
for three of the five state variables. Indeed, one of the big challenges of using
HIPM for this particular ecosystem is that the data used to conduct the search are
very expensive to collect, and collection is especially complicated for iron (Fe),
which is difficult to measure. Two questions arise from this. Does knowing data for
more than one state variable significantly narrow down the number of possible
good-fit models? Will knowledge about certain variables have more optimization power
than knowledge about others? For example, if we could only afford to collect data
for one of the five variables in the system, would phytoplankton give us better
model output (fewer good-fit models) in HIPM than zooplankton or detritus?
These are important questions because, as scientists try to advance their knowledge
of the Ross Sea, they need to make educated decisions about what information to
collect in order to optimize the use of resources.
This thesis is structured in five parts. First, I describe the method used to gather
the data for my analysis, including the HIPM software and an overview of the data
sets. I then turn to the quantitative analysis, looking strictly at the results
generated by the HIPM software and discussing what they tell us from an ecological
standpoint. In Section 4, I enter the analytical part of the analysis, selecting and
studying some of the best-fit models identified during the quantitative analysis. I
then discuss these analytical results and, in the next section, tie them back to the
biology in an effort to link the qualitative and quantitative research. Through this
analysis we see how we can help HIPM's model selection method as well as assist
scientists in finding the model that most accurately explains the processes at work
in the ecosystem observed.
-
2 METHOD
The method employed in this thesis involves constructing process models from
continuous data. To assist in this task we used a piece of software named HIPM. It
is the output and the model selection efficiency of this software that we are
investigating. To better understand the task at hand, it is important to define what
HIPM does, as well as the steps we took to test its efficiency.
2.1 HIPM Description
Ecologists rely heavily on system modeling to build ecological theory and to guide
environmental assessment and management (Borrett et al. 2007). Typically scientists
build and study a couple of models, basing the model structure on previous research
or making a judgment call on which entities and processes should or should not be
included. One of the aspirations, and problems, of modeling natural systems is to
capture the essence of the system necessary for the model's purpose by figuring out
what can be left out. In that regard, deciding which entities and processes should
be included, and what the best mathematical formulation and parameter values are for
a given structure, becomes an essential part of this search. Choosing among the
possible model structures presents an intricate and time-consuming challenge for
ecologists who want to navigate this space (Borrett et al. 2007). In searching
through this space of possible models, we are guided by the claim made by Langley et
al. (1987), which we support, that we must look for models that fit real-life
observations. In summary, we are faced with the problem of constructing models
anchored in domain theory, conducting a time-consuming search and linking the models
to empirical data (Borrett et al. 2007). The HIPM software, whose name stands for
Hierarchal Inductive Process Modeling, is designed to remedy these issues. This
scientific approach (Langley et al. 2005) assumes the following:
• Given: Time-series data for continuous variables.
• Given: Background knowledge about the entities of the system;
in other words
constraints on variables and other parameters driving these
entities.
• Given: Background knowledge on the type of processes that may
be involved
in driving the ecosystem as well as the constraints that may
exist for the said
processes.
Then the task for the software is to perform a search through
the structure and
parameter space defined by the process-entity library to find
the models that best
fit the data. HIPM operates in four phases.
1. In an exhaustive search, it first finds all the possible instantiations of the
generic processes for all variables. This means that the system finds all the
possible combinations of processes that can affect a given variable (an example is
given in Section 2.1.2). For our purposes we used the exhaustive search option
programmed into the software, but a heuristic search option is also available.
2. The system then assembles the models. In other words, it puts together, into a
generic model, one instantiation of generic processes for each variable present in
the system. It uses the constraints given by the users to determine which
instantiations can be linked together into a generic model; the program goes through
an exhaustive search to find all the possible models. In our study it produces 1120
model structures, due mainly to the large number of different grazing processes that
are potentially present in the ecosystem.
3. It searches for the parameter values of each model using the constraints defined
by the users. To infer these parameters, the system picks a random set of values
that respect the constraints and, using the Levenberg-Marquardt gradient descent
method, finds a local optimum. To avoid entrapment in local minima, the system
restarts the parameter estimation from multiple random points, retaining only the
parameters that produce the lowest error. In our experiments we set the number of
restarts to 128. This technique has been found to produce reasonable matches to time
series in multiple systems (Langley et al. 2007).
4. It evaluates the performance of the produced model structures (predicted values)
against the data series (observed values) by calculating the relative mean squared
error (reMSE); the models with the lowest reMSE are considered the best-fit models.
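The random-restart idea in phase 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the model is a toy exponential growth curve rather than a Ross Sea model, and a simple bounded pattern search is swapped in for the Levenberg-Marquardt method so that the example needs nothing beyond the standard library. All function names are hypothetical and do not come from HIPM.

```python
import math
import random


def simulate(params, times):
    """Toy model (an assumption): exponential growth x(t) = x0 * exp(r * t)."""
    x0, r = params
    return [x0 * math.exp(r * t) for t in times]


def sse(params, times, observed):
    """Sum of squared errors between simulated and observed values."""
    return sum((p - o) ** 2 for p, o in zip(simulate(params, times), observed))


def local_search(start, bounds, loss, step=0.25, tol=1e-9):
    """Bounded pattern search; a simple stand-in for Levenberg-Marquardt."""
    best = list(start)
    best_loss = loss(best)
    steps = [step * (hi - lo) for lo, hi in bounds]
    while max(steps) > tol:
        improved = False
        for i, (lo, hi) in enumerate(bounds):
            for direction in (1.0, -1.0):
                trial = list(best)
                # Clamp the trial value so it always respects the parameter range.
                trial[i] = min(hi, max(lo, trial[i] + direction * steps[i]))
                trial_loss = loss(trial)
                if trial_loss < best_loss:
                    best, best_loss, improved = trial, trial_loss, True
        if not improved:
            steps = [s / 2 for s in steps]
    return best, best_loss


def fit_with_restarts(bounds, loss, restarts=128, seed=0):
    """Run the local search from random feasible starts and keep the lowest error."""
    rng = random.Random(seed)
    best, best_loss = None, math.inf
    for _ in range(restarts):
        start = [rng.uniform(lo, hi) for lo, hi in bounds]
        params, value = local_search(start, bounds, loss)
        if value < best_loss:
            best, best_loss = params, value
    return best, best_loss


times = [0.0, 1.0, 2.0, 3.0, 4.0]
observed = simulate((2.0, 0.3), times)   # synthetic data with known parameters
bounds = [(0.1, 10.0), (0.0, 1.0)]       # parameter ranges act as the constraints
params, error = fit_with_restarts(bounds, lambda p: sse(p, times, observed), restarts=8)
```

The demonstration uses 8 restarts to keep it quick; the experiments described in the text use 128. Keeping only the best of several independent local fits is what reduces the risk of reporting a poor local minimum for an otherwise good structure.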
2.1.1 Measure of Fit
As mentioned above, HIPM evaluates and selects the best model structure and set of
parameters according to a fitness measure. The system currently uses the sum of
squared errors (SSE) to evaluate fitness (Bridewell et al. 2007), which is defined
as follows:
\[ \sum_{i=1}^{n} \mathrm{SSE}(x_i, x_i^{obs}) = \sum_{i=1}^{n} \sum_{k=1}^{m} \left( x_{i,k} - x_{i,k}^{obs} \right)^2 \]
where $x_1, \ldots, x_n$ are the variables being fitted, with $m$ observed values for
each. To take into account the modeling of variables of varying scale, the system
uses a relative mean squared error that we define in the following way:
\[ \mathrm{reMSE} = \frac{1}{nm} \sum_{i=1}^{n} \frac{\mathrm{SSE}(x_i, x_i^{obs})}{s^2(x_i^{obs})} \]
Here $s^2(x_i^{obs})$ is the sample variance of the observations for $x_i$.
Throughout this thesis we refer to the relative mean squared error as reMSE. The
biggest asset of this rescaling is the ability to compare values across data sets.
Typically, a reMSE of 1.0 or above signifies that the model performs poorly;
conversely, the lower the reMSE, the better the fit.
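The two measures above translate directly into code. The following is a minimal Python sketch, assuming predictions and observations are stored as dictionaries mapping a variable name to a list of m values; the function names and the small data set are illustrative only and are not taken from HIPM.

```python
from statistics import variance


def sse(predicted, observed):
    """Sum of squared errors for one variable."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def remse(predicted_by_var, observed_by_var):
    """Relative mean squared error over n variables with m observations each."""
    n = len(observed_by_var)
    m = len(next(iter(observed_by_var.values())))
    total = 0.0
    for name, observed in observed_by_var.items():
        if len(observed) != m:
            raise ValueError("every variable needs the same number of observations")
        # Dividing by the sample variance makes variables of different scale comparable.
        total += sse(predicted_by_var[name], observed) / variance(observed)
    return total / (n * m)


# Illustrative data: two variables on very different scales, four observations each.
observed = {"P": [1.0, 2.0, 3.0, 4.0], "N": [30.0, 20.0, 10.0, 0.0]}
perfect = remse(observed, observed)
mean_only = remse({"P": [2.5] * 4, "N": [15.0] * 4}, observed)
```

A model that simply predicts the mean of each observed series scores (m - 1)/m, here 0.75, whatever the scale of the variables. This gives some intuition for why values near or above 1.0 indicate a poor fit.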
2.1.2 Entities specification and model library
Each entity of a system is defined by a combination of variables and parameters,
which makes it both an actor and a receiver of action in the model. A distinction is
to be made between a generic entity and an instantiated entity. A formal generic
entity has a name and a set of properties, which can include both variables and
parameters. In a given model the parameters of an instantiated entity do not change,
whereas the variables do. Every variable in the entity has a name and a rule that
determines how the contributions of multiple processes and their subprocesses are
combined (e.g. sum, minimum, product). Every parameter has a name and a range that
constrains its possible values. Instantiated entities, on the other hand, have their
variables associated with either a time series or an initial value, and their
parameters have been assigned real values. A field is also included to indicate the
parent generic entity (Borrett et al. 2007). One generic entity can be instantiated
multiple times; the generic entity can be thought of as a blueprint for the
instantiated entities. For example, in our system we defined the entity
phytoplankton as presented in Table 1. Here our entity's name is “P”; it contains
the variables “conc”, “growth_rate” and “growth_lim”, each with the rule determining
how the processes acting on it are aggregated. The next part of the entity
definition is the list of parameters that are of concern for this entity, such as
“max_growth” with possible values in the (0.4, 0.8) range. Following the definition
of the generic entity in Table 1 is an instantiated entity, which takes “pe” as a
reference to its parent generic entity. Each variable is then given either the name
of a time series to which the model will be fitted, such as “PHA_c” for “conc”,
which refers to the phytoplankton column of the CIAO data set, or an initial value,
such as 0 for “growth_rate”, indicating that this particular variable is not fitted
to a time series. The mention “system”, as opposed to “exogenous”, simply states
that the variable depends on the system, as opposed to being independent of it like
solar radiation or water temperature. The full instantiated entity library can be
found in Appendix B and the generic entity library in Appendix C.
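The blueprint relationship between generic and instantiated entities can be sketched with two small Python classes. This is an illustrative data structure only, not HIPM's own API: the class names `GenericEntity` and `EntityInstance` and the range check are assumptions, while the variable names, rules and values are a subset of those shown in Table 1.

```python
from dataclasses import dataclass


@dataclass
class GenericEntity:
    name: str
    variables: dict   # variable name -> aggregation rule ("sum", "prod", "min")
    parameters: dict  # parameter name -> (lower, upper) allowed range


@dataclass
class EntityInstance:
    generic: GenericEntity  # parent generic entity (the blueprint)
    name: str
    variables: dict         # variable name -> time-series column name or initial value
    parameters: dict        # parameter name -> concrete value

    def __post_init__(self):
        # Every concrete parameter value must lie inside the range set by the blueprint.
        for pname, value in self.parameters.items():
            low, high = self.generic.parameters[pname]
            if not low <= value <= high:
                raise ValueError(f"{pname}={value} is outside [{low}, {high}]")


generic_p = GenericEntity(
    name="P",
    variables={"conc": "sum", "growth_rate": "prod", "growth_lim": "min"},
    parameters={"max_growth": (0.4, 0.8), "death_rate": (0.02, 0.04)},
)
phyto = EntityInstance(
    generic=generic_p,
    name="phyto",
    variables={"conc": "PHA_c", "growth_rate": 0, "growth_lim": 1},
    parameters={"max_growth": 0.59, "death_rate": 0.025},
)
```

Several instances can share one `GenericEntity`, which mirrors the statement that a generic entity can be instantiated multiple times.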
For HIPM to be fully functional there needs to be a library of processes. Processes
are the physical, chemical or biological actions that drive change in dynamic
models. Just as we made a distinction between generic and instantiated entities, we
make a distinction between generic and instantiated processes. Every generic process
is defined by a name, the entities that can tie into the process, the subprocesses
that are tied to it, and one or more equations. A generic process can also include a
set of Boolean conditions that determine whether the process is active, making the
process dynamic by turning it on and off depending on whether the conditions are
satisfied (Borrett et al. 2007). For instance, we could set the photosynthetic
process to occur only if an environmental light variable is greater than zero. Table
2 gives an example of a generic process. It is named “growth”, and any of the
entities “P”, “N”, “D” and “E” can take a role in it. Next comes the list of
subprocesses linked to this process, each with the entities that can take a role in
the subprocess, and finally the equation that defines the process; this equation
calls on the “conc” and “growth_rate” variables that all entities must have. An
instantiated process takes on a specific name and is bound to a specific
instantiated entity, one of P, N, D or E. The instantiated entity takes its role in
the equation of the instantiated process. All the instantiated processes are
aggregated according to the rule defined in the generic entity. It is this
organization in terms of entities and processes that drives inductive process
modeling. It makes the construction of systems of equations easier by building them
from fragments.
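To make the idea of building equations from fragments concrete, here is a small Python sketch of how the contributions of several processes to one variable might be combined under an aggregation rule, with a Boolean condition switching a process on and off. It is an assumption-laden illustration rather than HIPM's implementation: the growth fragment follows the equation in Table 2, while the loss process, its rate of 0.1 and the variable named `light` are hypothetical.

```python
import math

# Aggregation rules named in the entity definition.
AGGREGATORS = {"sum": sum, "prod": math.prod, "min": min}

# Each process: (activation condition, target variable, equation fragment).
PROCESSES = [
    # Growth fragment as in Table 2, active only when there is light (illustrative condition).
    (lambda s: s["light"] > 0, "P.conc", lambda s: s["P.growth_rate"] * s["P.conc"]),
    # Hypothetical loss process that is always active.
    (lambda s: True, "P.conc", lambda s: -0.1 * s["P.conc"]),
]


def derivative(target, rule, processes, state):
    """Combine the fragments of all active processes acting on one variable."""
    active = [
        fragment(state)
        for condition, variable, fragment in processes
        if variable == target and condition(state)
    ]
    return AGGREGATORS[rule](active) if active else 0.0


STATE = {"P.conc": 2.0, "P.growth_rate": 0.5}
dark = derivative("P.conc", "sum", PROCESSES, {**STATE, "light": 0.0})
lit = derivative("P.conc", "sum", PROCESSES, {**STATE, "light": 10.0})
```

With no light only the loss fragment contributes, while with light the growth and loss fragments are summed, so adding or removing a process changes the right-hand side of the equation without rewriting it by hand.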
Table 1: An example of a generic entity definition, with its variables and
parameters, followed by an example of an instantiated entity, here Phytoplankton
(P), in which the variable “conc” is given a time series and the other variables
are given initial values.
pe = lib.add_generic_entity("P",
    { "conc": "sum",
      "growth_rate": "prod",
      "growth_lim": "min" },
    { "max_growth": (0.4, 0.8),
      "exude_rate": (0.001, 0.2),
      "death_rate": (0.02, 0.04),
      "Ek_max": (1, 100),
      "sinking_rate": (0.0001, 0.25),
      "biomin": (0.02, 0.04),
      "PhotoInhib": (200, 1500) })
p1 = entity_instance(pe, "phyto",
    { "conc": ("system", "PHA_c", (0, 600)),
      "growth_rate": ("system", 0, (0, 1)),
      "growth_lim": ("system", 1, (0, 1)) },
    { "max_growth": 0.59,
      "exude_rate": 0.19,
      "death_rate": 0.025,
      "Ek_max": 30,
      "biomin": 0.025,
      "PhotoInhib": 200 })
Table 2: Defining a process - Growth
lib.add_generic_process(
    "growth", "",
    [("P", [pe], 1, 1), ("N", [no3, fe], 1, 100),
     ("D", [de], 1, 1), ("E", [ee], 1, 1)],
    [("limited_growth", ["P", "N", "E"], 0),
     ("exudation", ["P"], 1),
     ("nutrient_uptake", ["P", "N"], 0)],
    {},
    {},
    {"P.conc": "P.growth_rate * P.conc"})
To sum it up, HIPM’s power resides in its knowledge of the
modeled domain as
well as its ability to estimate parameters (Bridewell et al.
2007).
2.2 Experiment Design
Having now established how HIPM works, let us consider the problem at hand.
Although in theory HIPM is an extremely powerful tool that permits a search
through a wide structure and parameter space, previous research has
demonstrated that a more thorough investigation of HIPM’s output is necessary
to evaluate its potential and usefulness to biologists. In our example of the
Ross Sea ecosystem, with the process-entity library set up as described, the
search space comprises 1120 possible models; each model can take on a wide
variety of parameter sets depending on the constraints given to the software.
The phytoplankton dynamics models of the Ross Sea have five variables:
Phytoplankton (P), Zooplankton (Z), Detritus (D), Nitrate (N) and Iron (F).
In previous research, real-life time series for Phytoplankton and Nitrate
were available for this particular ecosystem, and these data were fed to
HIPM. HIPM then returned about 200 candidate models with an reMSE less than
or equal to 0.2, which from a computer science standpoint is a good
improvement: the search space is reduced from 1120 possible models to about
200. For a biologist, however, that is still a large number of models
approximating the ecosystem studied; going through and testing every one of
these 200 models would be extremely time-consuming. It is therefore clear
that we need to lower the number of candidate models to a point deemed
reasonable and useful to biologists. Logically, we assume that increasing the
number of constraints (i.e. adding real-life time series for a variable for
which we had no previous empirical data) would help model discrimination in
HIPM. This would, however, require the scientist to go into the field and
collect time series for one of the variables in the system, a very expensive
process. Can HIPM be used to make an informed decision about which variable
would yield the most discriminatory power, if there is any difference between
variables at all? This is what we investigate, and in light of these elements
we have formulated two hypotheses:
• Hypothesis 1: Increasing the number of constraints. Increasing the number
of time series for which we have data in HIPM for model selection will
induce better fits. In other words, an increase in the number of known time
series of system variables leads to better model discrimination and
therefore better model selection.
• Hypothesis 2: Variables yield different values of information. Some
variables will have more discriminatory power and restrict the best fit
models more than others.
To test our two hypotheses it was imperative to employ a full data set
including time series for all variables of the system, in order to compare
the results depending upon whether certain time series are included as
constraints for HIPM. Since no full data set with real-life data was
available, we turned to a simulated data set called the “Coupled Ice and
Ocean model” dataset, otherwise referred to as the CIAO dataset. This dataset
is generated from a three-dimensional ecosystem model that spans the entire
water column and multiple stations across the Ross Sea. For our purposes,
however, only a portion of these data, the top 5 meters at the Ross Sea
Polynya station 01, is used. The type of information contained in the CIAO
dataset is listed in Table 3.
Table 3: Information included in the CIAO data set. NOTE: A sample of the
CIAO 1997 data can be found as Appendix A.

Symbol  Units                   Description
JDAY    Day                     Day of the measurements
TEMP    °C                      Temperature of the water
DPML    m                       Mixed layer depth
AI                              Sea ice concentration
NITR    µM                      Nitrate concentration
PHOS    mg Chla/m^3             Phosphate concentration
SILC    µM                      Silicate concentration
IRON    nM or µM                Iron concentration
PARL    µmol photons m^-2 s^-1  Solar radiation used by organism in photosynthesis
PHA     mg Chla/m^3             Phaeo chlorophyll concentration
DIAT    mg Chla/m^3             Diatom chlorophyll concentration
ZOO     mg C/m^3                Zooplankton concentration
DET     mg C/m^3                Detritus concentration
PURL    µmol photons m^-2 s^-1  Photosynthetically usable radiation
In addition to a full data set, it is necessary to have a working library
that, as stated in Section 2.1.2, defines both entities and processes for
HIPM. The process-entity library that we used is available in Appendices B
and C; it was previously put together by Bridewell, Borrett, Langley and
Arrigo. All the processes and subprocesses in which the instantiated entities
can take a role in our study are represented in Figure 2.
Having the background knowledge necessary for HIPM to conduct successful
runs, we designed thirty-one experiments; each experiment represents a
possible combination of time-series constraints that could be entered into
the software, that is, one of the 2^5 - 1 = 31 non-empty subsets of the five
system variables. For example, if we had time series for Iron and Nitrate
and fed the information into HIPM, they would act as additional constraints
in the model selection process. To be selected, models have to exhibit
behavior close to the given time series. All the experiments are summarized
in Table 4.
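As a minimal sketch of how such a design can be enumerated, the Python snippet below lists every non-empty combination of the five system variables. The function name `enumerate_experiments` and the variable ordering P, Z, D, N, F are assumptions. This ordering reproduces the experiment labels visible in the result figures (for example 6[P,Z] and 12[Z,F]), but whether Table 4 follows it throughout is not confirmed here.

```python
from itertools import combinations

# Symbols of the five system variables, in an assumed order.
VARIABLES = ["P", "Z", "D", "N", "F"]


def enumerate_experiments(variables):
    """Return every non-empty subset of variables, smallest subsets first."""
    experiments = []
    for size in range(1, len(variables) + 1):
        for subset in combinations(variables, size):
            experiments.append(subset)
    return experiments


experiments = enumerate_experiments(VARIABLES)
for exp_id, subset in enumerate(experiments, start=1):
    # Label in the style "6[P,Z]": experiment ID, then the constrained variables.
    print(f"{exp_id}[{','.join(subset)}]")
```

Each subset names the variables whose time series are supplied to HIPM as constraints in that experiment.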
3 COMPUTATIONAL RESULTS
The main goal of this work is to determine how to optimize the use we make of
HIPM to assist scientists in their decision-making process when it comes to
selecting the model that most accurately represents an ecosystem. The first
need is to narrow down the number of good fit models capable of describing
the system. We attempted this by feeding HIPM additional time series for one
or more of the state variables, thus providing more constraints; did this
assumption hold true? Secondly, if adding more constraints to HIPM does
reduce that number, do observations for a specific state variable hold more
reducing power than those for the other state variables? The data collected
helped us answer these questions as well as discuss the efficiency of HIPM in
its current state.
Thirty-one different experiments were performed, each returning a measure of
fit value (reMSE) for every one of the 1120 models tested. This makes for a
large amount of data to analyze. To get a better idea of what these data look
like, the measure of fit values of models with an reMSE between 0 and 2 were
ranked from lowest to highest and graphed (see Figures 4, 5 and 6). We did
not look at reMSE values higher than 2.0 since, as stated previously, models
with an reMSE higher than 1.0 are typically classified as poorly performing,
as such a value indicates a very large difference between observed and
expected values. We estimated that the (0,2) range would be sufficient for
our purpose, as it would encompass most models. Based on these initial
results we decided to use an reMSE of 0.5 as our good fit cutoff; any model
under that cutoff is considered a good fit. This cutoff was chosen because
several of the graphs seemed to exhibit a turning point or slight step
pattern around this reMSE value, as in the graphs for experiments 1, 5 and
20.
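The ranking and cutoff step described above can be sketched in a few lines of Python. This is an illustration only, not the code used to produce the figures: the function name `rank_and_count` and the sample reMSE values are hypothetical, while the 0.5 cutoff and the 0 to 2 plotting range come from the text.

```python
def rank_and_count(remse_values, cutoff=0.5, plot_max=2.0):
    """Rank reMSE values within [0, plot_max] and count those under the cutoff."""
    ranked = sorted(v for v in remse_values if 0.0 <= v <= plot_max)
    good_fit = sum(1 for v in ranked if v < cutoff)
    return ranked, good_fit


# Hypothetical reMSE values for six models in one experiment.
sample = [1.7, 0.12, 0.48, 2.6, 0.5, 0.93]
ranked, good_fit = rank_and_count(sample)
print(ranked)
print(good_fit)
```

The strict comparison reflects the wording "any model under that cutoff"; whether a model with an reMSE of exactly 0.5 should count as a good fit is a choice the text leaves open.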
[Figure 4: twelve panels, one per experiment, each plotting ranked reMSE
values (vertical axis, 0.0 to 2.0) against model rank (horizontal axis).
Panel titles and good fit model counts: 1[P], 197; 2[Z], 101; 3[D], 366;
4[N], 439; 5[F], 509; 6[P,Z], 5; 7[P,D], 61; 8[P,N], 25; 9[P,F], 79;
10[Z,D], 8; 11[Z,N], 1; 12[Z,F], 0.]
Figure 4: reMSE values are ranked from lowest to highest. The value reMSE =
0.5 marks the good fit model cutoff; any model under that value is considered
a good fit model. The experimental setup for each run, as well as its ID
number, is indicated in the top right corner.
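One simple way to read the single-variable panels of Figure 4 is to order the variables by the number of good fit models that remain when only that variable's time series is supplied. The Python sketch below does this with the counts printed in panels 1 to 5. The function name `rank_by_discrimination` is hypothetical, and treating "fewer remaining models" as "more discriminatory" is an assumed reading, not the thesis's own value of information analysis.

```python
TOTAL_MODELS = 1120  # size of the search space stated in Section 2.2

# Good fit model counts from Figure 4, panels 1[P] through 5[F].
good_fit_counts = {"P": 197, "Z": 101, "D": 366, "N": 439, "F": 509}


def rank_by_discrimination(counts):
    """Order variables from fewest to most remaining good fit models."""
    return sorted(counts.items(), key=lambda item: item[1])


ranking = rank_by_discrimination(good_fit_counts)
for variable, count in ranking:
    share = count / TOTAL_MODELS
    print(f"{variable}: {count} good fit models ({share:.1%} of {TOTAL_MODELS})")
```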
[Figure panel residue: ranked reMSE plot (vertical axis, 0.0 to 2.0) for
experiment 13[D,N], 67 good fit models.]