Decision Support Tools for River Quality Management Martin Paisley, David Trigg and William Walley Centre for Intelligent Environmental Systems, Faculty of Computing, Engineering & Technology, Staffordshire

Mar 26, 2015


Decision Support Tools for River Quality Management

Martin Paisley, David Trigg and William Walley

Centre for Intelligent Environmental Systems, Faculty of Computing, Engineering & Technology, Staffordshire University


© 2009 David Trigg

Contents

Background

Our Aims

Our Approach

The River Pollution Diagnostic System (RPDS): pattern recognition for data exploration, diagnosis and classification

The River Pollution Bayesian Belief Network (RPBBN): plausible reasoning for diagnosis, prognosis and scenario testing

Summary


Our Aims

Maximise the benefit gained from existing databases/information, with increased objectivity.

Exploit the available technology to create sophisticated, flexible, multi-purpose tools.

Make the technology easy to use.

Provide expert support to those who need it, to help them do their job.


Our Approach

Our initial studies with expert ecologist H.A. Hawkes led to the goal of trying to capture expertise.

Expert systems is the branch of Artificial Intelligence (AI) that attempts to capture expertise in a computer-based system.

Study of an expert is required to reveal what they do, how they do it, and what information and mental processes they use.


The Expert Ecologist

Our early research discovered that the expert ecologist tends to use two complementary techniques.

Memory (pattern matching): "I've seen this before, it was due to ..."

Scientific knowledge (plausible reasoning): based on their knowledge of the system and the available evidence, they are able to reason about the likely state of other elements of the system.

We set out to replicate these processes and produce software that would give people easy access to 'expert' interpretation.


The Modelling Tools

After over a decade of research in this field, the modelling techniques we currently use are:

our own clustering and visualisation system, known as MIR-Max (Mutual Information & Regression Maximisation), for pattern matching; and

Bayesian Belief Networks (BBN) for plausible reasoning.

These techniques were used to produce the models on which our decision support software is based.


What the tools provide

Visualisation and exploration of large, complex datasets. (RPDS)

Classification of samples. (RPDS)

Diagnosis of potential pressures. (RPDS & RPBBN)

Prediction of biology from environmental and chemical parameters. (RPBBN)

Scenario testing: the impact of changing sample parameters. (RPBBN)


Pattern Recognition


Pattern Recognition - What is it?

Recognition of patterns: a pattern implies multiple attributes, so this is a multivariate technique.

Classification of a new pattern (thing) as being of a particular type, based on its similarity to a set of attributes indicative of that type.

The success of pattern recognition relies on having appropriate distinguishing features: enough features to discriminate clearly, and an appropriate set of features (orthogonal/uncorrelated).


Pattern Recognition - Why do it?

It is a method of managing information: multiple instances are reduced to a single type or kind.

Classification of situations allows us to cope with novel but similar situations.

It exploits existing 'information': once a sample is identified as being of a type, its 'unknown' attributes can be inferred.


Pattern Recognition - Clustering

To create a model, we first need to cluster the training samples.

The training samples contain data on both the training/clustering variables and additional 'information' variables (those that are to be predicted).

In the case of RPDS, the training variables are the biology, and the information variables are the chemical and other stress parameters.


Pattern Recognition - Clustering

Set of samples ... grouped into 'clusters' ... to provide templates/types in the model.


Pattern Recognition - Classification

Classification involves matching a new sample with an existing cluster, based on the training variables.

In this example the closest match for the new sample is cluster 'A'.

This is the 'classification' of the new sample. The quality class of the cluster is assigned to the new sample.
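The matching step can be sketched in a few lines of Python. This is only an illustration: the cluster templates, their values and the use of Euclidean distance are all assumptions made for the example, and the similarity measure actually used in RPDS is not stated on the slide.

```python
import math

# Hypothetical cluster templates: one value per training (biological)
# variable. Labels and numbers are invented for illustration only.
templates = {
    "A": [4.0, 3.0, 0.5],
    "B": [1.0, 0.5, 3.5],
    "C": [2.5, 2.0, 2.0],
}


def classify(sample, templates):
    """Return the label of the template closest to the sample.

    Euclidean distance is an assumed stand-in for the real similarity
    measure.
    """
    return min(templates, key=lambda label: math.dist(sample, templates[label]))


new_sample = [3.8, 2.7, 0.9]
print(classify(new_sample, templates))
```

With these made-up numbers the new sample lies closest to template 'A', so it inherits whatever quality class is attached to that cluster.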


Pattern Recognition - Diagnosis

The diagnosis is derived from the values of the information variables (the blue bars) in the training samples grouped in the cluster.

The predicted values are derived from the training samples in the cluster.

These values are usually a statistic such as the mean, median or a percentile.
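As a rough sketch of how such a diagnosis could be computed, the Python below summarises the information variables of the training samples in one cluster. The variable names, the values and the choice of median with 10th/90th percentiles are assumptions for illustration, not the statistics RPDS necessarily reports.

```python
import statistics

# Hypothetical information-variable values for the training samples
# grouped in one cluster. Names and numbers are invented.
cluster_samples = {
    "ammonia_mg_l": [0.10, 0.40, 0.25, 0.30, 0.15],
    "bod_mg_l": [2.0, 3.5, 2.5, 4.0, 3.0],
}


def summarise(values):
    """Median plus 10th and 90th percentiles of one information variable."""
    deciles = statistics.quantiles(values, n=10, method="inclusive")
    return {
        "median": statistics.median(values),
        "p10": deciles[0],
        "p90": deciles[-1],
    }


# The 'diagnosis' for a new sample is the summary of the cluster it matched.
diagnosis = {name: summarise(vals) for name, vals in cluster_samples.items()}
for name, stats in diagnosis.items():
    print(name, stats)
```

Reporting a percentile range alongside the median gives the user some sense of how tightly the cluster constrains each stress parameter.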


Visualisation

Classification can appear to be a black-box system.

Visualisation is a useful tool: it opens the model up for inspection, helps to understand and validate the model, and helps to explore the data and discover new relationships.

To help visualisation, clusters can be 'ordered' in a map.


Ordering

The sole purpose of ordering is to help visualise the data and the cluster model, no more, no less.

The process involves arranging the clusters in a space/map, usually based on similarity. Similar clusters are placed close together, dissimilar clusters far apart.

Our algorithm, R-Max, uses the correlation coefficient r between distances in data space and the corresponding distances in output space.


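Data Visualisation - Ordering Clusters

[Figure: clusters i and j separated by distance d in data space (axes x, y, z) and by distance D in the map (axes X, Y).]

d = distance in data space

D = distance between clusters in map

R-Max aims to maximise the correlation r between d and D.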
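The quantity being maximised can be illustrated with a short Python sketch that computes r between the pairwise distances d and D for a given layout. The cluster coordinates and the two candidate maps are invented, Euclidean distance is assumed in both spaces, and the search procedure R-Max uses to find a good layout is not shown because the slides do not describe it.

```python
import math
from itertools import combinations


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ordering_score(data_points, map_points):
    """Correlation r between data-space distances d and map distances D."""
    pairs = list(combinations(range(len(data_points)), 2))
    d = [math.dist(data_points[i], data_points[j]) for i, j in pairs]
    D = [math.dist(map_points[i], map_points[j]) for i, j in pairs]
    return pearson_r(d, D)


# Hypothetical cluster centres in a 3-D data space (x, y, z).
clusters = [(0, 0, 0), (1, 0, 0), (5, 5, 5), (6, 5, 5)]
# Two candidate 2-D maps (X, Y): one keeps similar clusters together,
# the other swaps the positions of the second and third clusters.
good_map = [(0, 0), (1, 0), (4, 4), (5, 4)]
poor_map = [(0, 0), (4, 4), (1, 0), (5, 4)]

print(ordering_score(clusters, good_map))
print(ordering_score(clusters, poor_map))
```

A layout that keeps similar clusters close together scores near 1, while one that scatters them scores much lower, so r can be used to compare candidate orderings.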


Pattern Recognition - Ordering

Cluster templates/types ... destination map ... clusters ordered by similarity.


Pattern Recognition - Visualisation

Maps can be colour-coded to show the value of any chosen feature across all of the clusters.

'Feature maps' and 'templates' form the basis of RPDS visualisation.


RPDS 3.0

Primary uses are:

Data exploration: the visual element of the clustered/organised data allows existing relationships in the data to be verified (model validation) and new ones to be identified (data mining).

Classification: assigning a sample to a cluster allows an estimated quality class to be defined.

Diagnosis: the 'known' stress information associated with the other samples in the cluster can help diagnose potential problems.

RPDS 3.0 - Data Exploration (five figure-only slides)

RPDS 3.0 - Classification (two figure-only slides)


RPDS 3.0 - Diagnosis


RPDS 3.0 - Comparison


Plausible Reasoning


Reasoning

Reasoning is:

Thinking that is coherent and logical.

A set of cognitive processes by which an individual may infer a conclusion from an assortment of evidence or from statements of principles.

Goal-directed thought that involves manipulating information to draw conclusions of various kinds.

In short: using available information, combined with existing knowledge, to derive conclusions for a particular purpose.


Reasoning with Uncertainty

If reasoning is 'coherent and logical', how can it deal with unknowns, conflicting information and uncertainty?

The ability to quantify uncertainty helps to resolve conflicts and provides 'lubrication' for the reasoning process.

In humans this takes the form of beliefs.

Probability theory provides a mathematical method of handling uncertainty.


Probability Theory

Probability theory is robust and proven to be mathematically sound.

It provides a method for representing and manipulating uncertainty.

It is one of the principal methods used for handling uncertainty in computer-based systems.

Bayesian Belief Networks (BBN) are currently the most popular method for creating probabilistic systems.


Bayesian Belief Networks

A BBN consists of two elements: a causal network and a set of probability matrices.

A causal network is a graph of nodes (variables) and directed edges (relationships).

The network defines the relationships between all the variables in a domain.

The causal variables are often referred to as 'parents' and the effect variables as 'children'.

The network structure can be defined through data analysis, but this is probably best done by an expert.


Causal Network


Probability Matrix

The probability matrices encode the relationships between variables.

A probability is required for every combination of parent and child states.

The number of state combinations grows geometrically with the number of parents, meaning that the derivation of the probabilities is often better achieved via data analysis.
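A small Python sketch makes the growth concrete. It counts the entries in one probability matrix as the number of child states multiplied by the number of states of each parent; the function name and the choice of five states per variable are illustrative assumptions.

```python
from math import prod


def cpt_size(child_states, parent_states):
    """Number of probabilities needed for one child variable.

    child_states: number of states of the child variable.
    parent_states: list with the number of states of each parent.
    """
    return child_states * prod(parent_states)


# Assume every variable has five states and add parents one at a time.
for n_parents in range(1, 6):
    print(n_parents, cpt_size(5, [5] * n_parents))
```

With five states per variable, one parent needs 25 probabilities and three parents already need 625, which is why eliciting every value from an expert quickly becomes impractical.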


Outputs - Predictions

The outputs of the system are the likelihoods of each of the states of the variables occurring.

The whole system is updated every time evidence is entered, regardless of where in the network it is entered.

The most common way to represent the values is through a bar chart, where the bars depict the likelihood of each state.

[Figure: bar chart for one variable, annotated with the variable name, state labels, probability bars and probability values (0 - 100).]
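The updating behaviour can be illustrated with the smallest possible network, one parent and one child, worked through in Python. Everything here is a made-up example: the variable names, states and probabilities are assumptions, and a real BBN would use a general inference engine rather than this hand-written Bayes' rule.

```python
# Hypothetical two-node network: OrganicPollution -> MayflyAbundance.
prior = {"low": 0.7, "high": 0.3}  # P(parent)
cpt = {                            # P(child | parent)
    "low": {"absent": 0.1, "present": 0.9},
    "high": {"absent": 0.8, "present": 0.2},
}


def posterior(prior, cpt, observed_child_state):
    """Diagnosis: update belief in the parent after observing the child."""
    joint = {p: prior[p] * cpt[p][observed_child_state] for p in prior}
    total = sum(joint.values())
    return {p: value / total for p, value in joint.items()}


def predict_child(parent_belief, cpt):
    """Prediction: distribution over child states given belief in the parent."""
    child_states = list(next(iter(cpt.values())))
    return {
        c: sum(parent_belief[p] * cpt[p][c] for p in parent_belief)
        for c in child_states
    }


def as_percent(distribution):
    """Format a distribution on the 0 - 100 scale used in the bar charts."""
    return {state: round(100 * p, 1) for state, p in distribution.items()}


print(as_percent(posterior(prior, cpt, "absent")))                # diagnosis
print(as_percent(predict_child(prior, cpt)))                      # no evidence
print(as_percent(predict_child({"low": 1.0, "high": 0.0}, cpt)))  # scenario
```

Entering evidence on the child changes the belief in the parent (diagnosis), while fixing the parent to a chosen state changes the predicted child distribution, which is the essence of scenario testing.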


RPBBN 2.0

Primary uses are:

Prediction of the concentrations of common 'chemical' pollutants from biological sample data.

Scenario testing: prediction of the new biological community and biological assessment 'scores' after modifying the changeable environmental and chemical parameters for a site.

RPBBN 2.0 - Prediction (two figure-only slides)

RPBBN 2.0 - Scenario Testing (two figure-only slides)


Summary

RPDS organises the EA dataset, allowing exploration and analysis, and provides the ability to classify new samples and diagnose potential problems.

RPBBN allows prediction of the states of variables in a system based on any available evidence, making it useful for diagnosis, prognosis and scenario testing.

Together these tools can help decision makers identify potential problems, suggest areas for further investigation, help develop programmes of remedial action and define targets.


Summary

The models are based primarily on data analysis, making them more objective than expert opinion.

The systems are robust and consistent in their operation.

The software is easily reproduced and distributed, meaning that the valuable expertise it holds can easily be spread throughout an organisation.


The Future

River quality: include more geographic information and move from site to river basin management.

Improvement in algorithms, incorporation of sample bias and improved confidence measures.

Major revision of the software, potentially rewritten as a web-based application.