Monitoring and Control of Biological Wastewater Treatment Process

by

Chang Kyoo Yoo

Department of Chemical Engineering (Process Control and Environmental Engineering Program)
Pohang University of Science and Technology

A thesis submitted to the faculty of Pohang University of Science and Technology in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Chemical Engineering (Process Control and Environmental Engineering Program)

Pohang, Korea
11. 28. 2001

Approved by
Major Advisor

DCE 19993133
유 창규 Chang Kyoo Yoo. Monitoring and Control of Biological Wastewater Treatment Process (생물학적 폐수처리공정의 모니터링 및 제어). Department of Chemical Engineering (Process Control and Environmental Engineering Program), 2002, 195 pages. Advisor: In-Beum Lee. Text in English.

ABSTRACT

Increasingly stringent demands are being placed on nutrient removal from wastewater. These stricter demands increase the complexity of the treatment process and necessitate the upgrading of treatment plants. As a consequence, there is a need to optimize plant operation as well as to maximize plant efficiency and reliability. Process monitoring and advanced control systems are generally considered to be an important means of achieving stable operation under large load variations. Recent advances in modeling and sensor technology have motivated significant research aimed at constructing better process monitoring and advanced control systems. In the present study we aim to design a process control and monitoring algorithm appropriate for the wastewater treatment process (WWTP).

In Section III, entitled “Autotuning and Supervisory DO Control in Fullscale WWTP”, we introduce autotuning of the PID controller in the dissolved oxygen (DO) control system of the WWTP and propose a simple supervisory control law that suggests its set point. Process identification for autotuning approximates the DO dynamics with a high-order model using the integral transform method and reduces it to a first-order plus time delay (FOPTD) or second-order plus time delay (SOPTD) model for PID controller tuning. In addition, a simple supervisory control algorithm is proposed to select a proper DO set point for the current operating condition of the aeration basin. The key idea in this method is that the DO set point is proportional to the respiration rate, which is an indicator of the biologically degradable load. The full-scale experimental results showed good identification performance and good tracking ability. As a result of the improved control performance, fluctuations in the dissolved oxygen concentration decreased and 15% of the electrical power was saved.
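
As an illustration of the proportional set-point rule described above, the following minimal Python sketch computes a DO set point from an estimated respiration rate. The function name, the gain `gain`, and the clamping limits `do_min` and `do_max` are hypothetical choices made for this example; the abstract states only that the set point is proportional to the respiration rate.

```python
def supervisory_do_setpoint(respiration_rate, gain, do_min=0.5, do_max=3.0):
    """Return a DO set point (mg/L) proportional to the respiration rate.

    respiration_rate : estimated respiration rate, e.g. in mg O2/(L h)
    gain             : hypothetical proportionality constant
    do_min, do_max   : hypothetical safety limits on the set point (mg/L)
    """
    if respiration_rate < 0:
        raise ValueError("respiration rate must be non-negative")
    raw_setpoint = gain * respiration_rate
    # Clamp so that the aeration system is never asked for an
    # unreasonably low or high DO concentration.
    return min(max(raw_setpoint, do_min), do_max)
```

A higher load (higher respiration rate) therefore raises the set point until the upper limit is reached, and a low load lets the set point fall back toward the lower limit.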

In Section IV, entitled “Generalized Damped Least Squares Method”, we propose a generalized damped least squares (GDLS) method to systematically remove the estimation windup problem in adaptive and self-tuning control systems. The key element of the proposed method is the addition of a penalty on parameter variations to the objective function of the normal least squares algorithm to prevent the singularity problem. Mathematical analysis shows that the proposed method has almost equivalent properties to the normal least squares method and guarantees that no estimation problem will be encountered under poorly excited conditions. The proposed method was applied to estimate the parameters of a first-order system under closed-loop control and to estimate the respiration rate (R) and oxygen transfer rate (KLa) of the DO control system in the WWTP; these estimates were used to derive an adaptive model-based DO control law. Simulation results show that the GDLS algorithm gives excellent estimation performance under closed-loop control and can be used in adaptive model-based DO control in the WWTP.
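
The effect of penalizing parameter variations can be sketched with a generic damped least squares step. This is not the GDLS algorithm of Section IV itself; it is a minimal batch example, assuming a scalar damping weight `damping` and the hypothetical function name `damped_least_squares_step`, that shows why the penalty keeps the normal equations solvable when the data are poorly excited.

```python
import numpy as np

def damped_least_squares_step(Phi, y, theta_prev, damping):
    """Minimize ||y - Phi @ theta||^2 + damping * ||theta - theta_prev||^2.

    Phi        : (N, n) regressor matrix
    y          : (N,) measured outputs
    theta_prev : (n,) previous parameter estimate
    damping    : positive weight on parameter variation (assumed scalar here)
    """
    Phi = np.asarray(Phi, dtype=float)
    y = np.asarray(y, dtype=float)
    theta_prev = np.asarray(theta_prev, dtype=float)
    n = Phi.shape[1]
    # The damping term adds damping * I to Phi^T Phi, so the matrix stays
    # invertible even when Phi^T Phi alone is singular (poor excitation).
    A = Phi.T @ Phi + damping * np.eye(n)
    b = Phi.T @ y + damping * theta_prev
    return np.linalg.solve(A, b)
```

With constant regressors, for example `Phi = np.ones((5, 2))`, ordinary least squares has no unique solution, whereas the damped step returns a finite estimate that stays close to `theta_prev` in the unexcited direction.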

In Section V, entitled “Disturbance Detection and Isolation in WWTP”, we propose a new fault detection and isolation (FDI) method. The proposed method monitors the distribution of process data and detects changes in this distribution, which reflect changes in the corresponding operating condition. A modified dissimilarity index and an FDI technique are defined to quantitatively evaluate the difference between successive data sets. This technique considers the importance of each transformed variable in the multivariate system. The proposed FDI technique is applied to a benchmark simulation and to data from a real WWTP. In addition, we investigate the kinds of disturbances and the various scenarios that frequently occur in the WWTP. Simulation results show that the proposed method could immediately detect disturbances and, by facilitating interpretation of the disturbance scale, automatically distinguish between serious and minor anomalies for faults of various scales. In particular, the simulations confirmed that the proposed method is efficient for adaptive and nonstationary processes, such as the WWTP.
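
To make the idea of comparing the distributions of two successive data windows concrete, the sketch below computes one commonly used form of a dissimilarity index with NumPy. It is an assumption-laden illustration, not the modified index defined in Section V: the pooled scaling, the function name `dissimilarity_index`, and the requirement that the pooled covariance matrix has full rank are all choices made for this example.

```python
import numpy as np

def dissimilarity_index(X1, X2):
    """Dissimilarity between two data windows (rows = samples).

    Returns a value in [0, 1]: 0 when both windows have the same
    second-moment structure, values near 1 when they are very different.
    Assumes no constant columns and a full-rank pooled covariance matrix.
    """
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    n1, n2 = len(X1), len(X2)
    # Scale both windows with the pooled mean and standard deviation.
    pooled = np.vstack([X1, X2])
    mean, std = pooled.mean(axis=0), pooled.std(axis=0)
    Z1, Z2 = (X1 - mean) / std, (X2 - mean) / std
    R1 = Z1.T @ Z1 / n1
    R2 = Z2.T @ Z2 / n2
    R = (n1 * R1 + n2 * R2) / (n1 + n2)
    # Whitening transform built from the eigendecomposition of R.
    eigvals, P0 = np.linalg.eigh(R)
    T = P0 / np.sqrt(eigvals)  # scales column j by eigvals[j] ** -0.5
    # Transformed covariance of window 1; that of window 2 is I - S1,
    # so the eigenvalues lie in [0, 1] and equal 0.5 for identical windows.
    S1 = (n1 / (n1 + n2)) * T.T @ R1 @ T
    lam = np.linalg.eigvalsh(S1)
    return float(4.0 * np.mean((lam - 0.5) ** 2))
```

In a monitoring setting the two arguments would be successive moving windows of process data, and the individual eigenvalues `lam` could be inspected alongside the index itself.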

In Section VI, entitled “Modeling and Multiresolution Analysis in WWTP”, modeling and multiresolution analysis (MRA) are described for the full-scale WWTP. The proposed method is based on modeling by the partial least squares (PLS) regression method and multiscale monitoring by applying a generic dissimilarity measure (GDM) to the PLS score values. PLS score values are approximately normally distributed as a consequence of the central limit theorem, regardless of the distribution of the original variables; hence, the proposed monitoring method is suitable for non-stationary and non-normal data sets. Experimental results show that the PLS method gives good modeling performance and is a powerful tool for analyzing the full-scale WWTP. MRA also confirmed the detection and isolation capability of the proposed method. In particular, the MRA indicated that the proposed strategy was appropriate for the detection and isolation of various faults and events in biological treatment; that is, the proposed method could cope with multiscale process changes in non-stationary signals with non-normal characteristics.

In Section VII, entitled “Process Monitoring for a Continuous Process with Cyclic Operation”, we propose a method for monitoring a continuous process with diurnal cyclic characteristics in a domestic WWTP. A subspace identification method is used to extract “within-cycle” and “between-cycle” correlation information from historical data in the form of a state-space model. This method is designed to describe the variations from the mean behavior of a periodically time-varying state-space model. The method can also incorporate the concept of inferential sensing to predict the quality variables and to enhance process monitoring. In simulations, the proposed method could detect small mean shifts and abnormalities in the slowly decreasing nitrification rate that were difficult to detect using the conventional PCA method.

In Section VIII, entitled “Simultaneous Prediction and Classification in the Secondary Settling Tank”, we propose a method for predicting the sludge volume index (SVI) and simultaneously classifying the current state of a secondary settler. An adaptive modeling scheme is implemented with the recursive least squares (RLS) method to update the model parameters adaptively, and a neural network (NN) is used as the process classifier. The basic idea is that the RLS model parameters are good features for classifying the current state of a secondary settler; thus the state of the secondary clarifier can be detected by monitoring the variations of the RLS parameters during SVI prediction. Experiments and theoretical analysis show that the RLS method predicts the SVI of a secondary settler well and that the parameters of the RLS model are good features for monitoring the state of the secondary settler, which was verified through power spectrum analysis.
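
The adaptive modeling step can be illustrated with a standard RLS estimator with exponential forgetting. The sketch below is a generic textbook form, not the exact scheme of Section VIII; the class name `RecursiveLeastSquares`, the forgetting factor 0.98, and the initial covariance scale `p0` are assumptions made for this example.

```python
import numpy as np

class RecursiveLeastSquares:
    """Standard RLS with exponential forgetting for y = phi^T theta."""

    def __init__(self, n_params, forgetting=0.98, p0=1000.0):
        self.theta = np.zeros(n_params)   # parameter estimates
        self.P = p0 * np.eye(n_params)    # scaled covariance matrix
        self.lam = forgetting             # forgetting factor, 0 < lam <= 1

    def update(self, phi, y):
        """Update the estimates with one sample; return the prediction
        of y made before the update (one-step-ahead prediction)."""
        phi = np.asarray(phi, dtype=float)
        y_pred = float(phi @ self.theta)
        gain = self.P @ phi / (self.lam + phi @ self.P @ phi)
        self.theta = self.theta + gain * (y - y_pred)
        # P <- (P - K phi^T P) / lam
        self.P = (self.P - np.outer(gain, phi @ self.P)) / self.lam
        return y_pred
```

After each update, `theta` holds the current model parameters; in the scheme described above, the time course of such parameters would be passed to a classifier as features.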

In Section IX, entitled “Nonlinear Fuzzy PLS Modeling”, we propose a new nonlinear partial least squares (NLPLS) algorithm that embeds the Takagi-Sugeno-Kang (TSK) fuzzy model into the regression framework of the partial least squares (PLS) method. The proposed method applies the TSK fuzzy model to the PLS inner regression. Using this approach, the interpretability of the TSK fuzzy model overcomes some of the handicaps of previous NLPLS algorithms. The proposed method uses the PLS method to solve the problems of high dimensionality and collinearity, and uses the TSK fuzzy model to capture the nonlinearity and to increase the use of experts’ knowledge. As a result, the fuzzy PLS (FPLS) model gives a more favorable modeling environment in which the knowledge of experts can be easily applied. Simulation results showed good modeling performance of the FPLS model in a simulation benchmark and a full-scale WWTP.

Contents

I. Introduction
    1.1 Research Motivation
    1.2 Research Objective
II. Biological Wastewater Treatment Process
    2.1 Activated Sludge Process
    2.2 Simulation Benchmark
    2.3 Fullscale WWTP
III. Autotuning and Supervisory DO Control in Fullscale WWTP
    3.1 Introduction
    3.2 Method
        3.2.1 Autotuning Method
        3.2.2 Supervisory Control
    3.3 Experimental Results
    3.4 Conclusions
IV. Generalized Damped Least Squares Method
    4.1 Introduction
    4.2 Theory
        4.2.1 Generalized Damped Least Squares Algorithm
        4.2.2 Theoretical Analysis
        4.2.3 Soft Sensor of Oxygen Transfer Rate and Respiration Rate
        4.2.4 Adaptive Model-based DO Control
    4.3 Simulation Study
    4.4 Conclusions
V. Disturbance Detection and Isolation in WWTP
    5.1 Introduction
    5.2 Theory
        5.2.1 Modified Dissimilarity Measure
        5.2.2 Fault Detection and Isolation (FDI)
    5.3 Simulation Studies
        5.3.1 Simulation of Benchmark Plant
        5.3.2 Fullscale WWTP
    5.4 Conclusions
VI. Modeling and Multiresolution Analysis in WWTP
    6.1 Introduction
    6.2 Theory
        6.2.1 Partial Least Squares (PLS)
        6.2.2 Generic Dissimilarity Measure (GDM)
        6.2.3 Multiresolution Analysis
    6.3 Result and Discussion
    6.4 Conclusions
VII. Process Monitoring for Continuous Process with Cyclic Operation
    7.1 Introduction
    7.2 Theory
    7.3 Simulation Study
    7.4 Conclusions
VIII. Simultaneous Prediction and Classification in the Secondary Settling Tank
    8.1 Introduction
    8.2 Theory
    8.3 Simulation Study
    8.4 Conclusions
IX. Nonlinear Fuzzy PLS Modeling
    9.1 Introduction
    9.2 Theory
        9.2.1 PLS Modeling Method
        9.2.2 TSK Fuzzy Modeling
        9.2.3 Nonlinear FPLS Modeling
    9.3 Result and Discussion
    9.4 Conclusions
    9.5 Appendix
Summary in Korean
X. References

List of Figures

Figure 2.1 A basic activated sludge process with an aerated tank and a settler
Figure 2.2 A simplified process scheme of an activated sludge process using pre-denitrification (layout of simulation benchmark)
Figure 2.3 Plant layout of coke WWTP, Korea
Figure 3.1 Identification and autotuning procedure for PID controller
Figure 3.2 Supervisory DO control scheme
Figure 3.3 Schematic diagram of full-scale WWTP
Figure 3.4 Experimental result during identification phase
Figure 3.5 Bode plots of the identified FOPTD and SOPTD models
Figure 3.6 The validation test: real data and model prediction value
Figure 3.7 Estimated respiration rate during identification phase
Figure 3.8 DO control result using autotuning and supervisory control algorithm
Figure 4.1 Robust estimator of KLa(u(t)) and R(t) with a GDLS method
Figure 4.2 Adaptive model-based DO control strategy with GDLS algorithm
Figure 4.3 Flow chart of soft sensor and the model-based DO control algorithm
Figure 4.4 Process output and input during the simulation (a) process output (b) control input
Figure 4.5 Parameter estimates â(t) and b̂(t) of RLS with exponential data weighting
Figure 4.6 Trace of P with RLS estimation method
Figure 4.7 Parameter estimates â(t) and b̂(t) of RLS with constant trace
Figure 4.8 Parameter estimates â(t) and b̂(t) with GDLS
Figure 4.9 Estimation comparisons of RLS and GDLS method under PID control
Figure 4.10 Model-based DO control result with a GDLS method (a) process output (b) control input
Figure 5.1 Moving windows between two successive datasets
Figure 5.2 Measured variables during the storm weeks
Figure 5.3 Monitoring performances under an external disturbance (a) dissimilarity index (b-d) individual eigenvalue plots
Figure 5.4 Monitoring performances under internal disturbances caused by decreasing nitrification (left plot) and settler bulking (right plot) (a) dissimilarity index (b-d) individual eigenvalue plots
Figure 5.5 Monitoring performances of sensor faults (left plot) and setpoint change (right plot) (a) dissimilarity index (b-d) individual eigenvalue plots
Figure 5.6 PCA monitoring performances (a) Hotelling’s T² chart (b) SPE plot
Figure 5.7 FDI monitoring performances (a) dissimilarity index (b-f) the 1st to 5th eigenvalues
Figure 6.1 Multiresolution analysis for PLS monitoring
Figure 6.2 Normal probability plot and histogram of original data and PCA score values (a) normal probability plot of original data (b) probability plot of score values (c) histogram of original data (d) histogram of score values
Figure 6.3 Prediction results of PLS model with real Y value (solid line with squares) and predicted value (dotted line) (a) SVI (b) reduction of CN (c) reduction of COD (d) residual error of Y variables (SPE_Y)
Figure 6.4 The second PLS weight vector plotted against the first for PLS model
Figure 6.5 Variable influence on projection (VIP) for the predictor variables
Figure 6.6 Monitoring performances based on T² and SPE_X statistics with 95% confidence limits
Figure 6.7 Monitoring performances of MRA for the PLS score values with 95% confidence limits (a) GDM (b) EV1 (c) EV2 (d) EV3
Figure 6.8 Contribution plot of the PLS score value for the first event
Figure 7.1 Measured variables of the first 10 days of normal data set
Figure 7.2 Conventional PCA monitoring result during the first 10 days of normal data set
Figure 7.3 Conventional PCA monitoring result for nitrification linear decrease: T² and SPE plot
Figure 7.4 PCA monitoring result with periodic removal for nitrification linear decrease: T² and SPE plot
Figure 7.5 Monitoring result of the proposed method for nitrification linear decrease
Figure 7.6 Conventional PCA monitoring result for nitrification step decrease: T² and SPE plot
Figure 7.7 PCA monitoring result with periodic removal for nitrification step decrease: T² and SPE plot
Figure 7.8 Monitoring result of the proposed method for nitrification step decrease
Figure 7.9 Prediction results of S_NH,e and S_NO,e for validation data set with static PLS method
Figure 7.10 Prediction results of S_NH,e and S_NO,e for validation data set with static PLS method (after periodic removal)
Figure 7.11 Prediction results of S_NH,e and S_NO,e for validation data set with the proposed method
Figure 8.1 Schematic diagram of the proposed hierarchy structure
Figure 8.2 One-step ahead prediction value of SVI using RLS method
Figure 8.3 Sensitivity of the ARX model parameters of each state
Figure 8.4 Power spectrum in each state (a) normal (b) bad (c) bulking state
Figure 9.1 Block diagram of the TSK fuzzy model
Figure 9.2 Block diagram of the FPLS method
Figure 9.3 Scatter plots and firing strength plots of FPLS model in benchmark (a) first LV (b) second LV (c) third LV (d) fourth LV
Figure 9.4 Comparisons of LPLS and FPLS for the predicted and actual S_NH,e in benchmark (a) Time series plot (b) Scatter plot
Figure 9.5 Comparisons of LPLS and FPLS for the predicted and actual S_NO,e in benchmark (a) Time series plot (b) Scatter plot
Figure 9.6 Scatter plots and firing strength plots of FPLS model in BET (a) first LV (b) second LV (c) third LV (d) fourth LV
Figure 9.7 Time series plots of predicted and actual output in BET (a) SVI with LPLS and FPLS (b) ΔCN with LPLS and FPLS (c) ΔCOD with LPLS and FPLS
Figure 9.8 Scatter plots of predicted and actual output in BET (a) SVI with LPLS and FPLS (b) ΔCN with LPLS and FPLS (c) ΔCOD with LPLS and FPLS

List of Tables

Table 2.1 Process Input/Output Variables in full-scale WWTP
Table 5.1 Fault (disturbance) sources in the simulation benchmark
Table 5.2 Process variables in full-scale WWTP
Table 6.1 Process Input/Output Variables in WWTP
Table 6.2 Variations explained by the PLS model of four latent variables
Table 7.1 Two disturbances in the benchmark
Table 7.2 Percent variance captured by PCA model
Table 7.3 MSE of two PLS methods and the proposed method
Table 8.1 Confusion matrix of the test data
Table 9.1 Percent variance captured (%) and MSE of several PLS models in benchmark
Table 9.2 Percent variance captured (%) and MSE of several PLS models in BET

I. Introduction

1.1 Research Motivation

The requirements imposed on the wastewater treatment process (WWTP) in regard to effluent quality have become increasingly stringent, and existing plants are subject to increasing loads. Meeting these stricter guidelines requires the development of an efficient wastewater treatment methodology. One way to improve process efficiency is to build new and larger treatment plants; however, this option is expensive and in many cases impossible due to lack of a suitable site. Another way to improve efficiency is to introduce advanced control techniques and to optimize operating conditions. This approach may improve the effluent water quality, decrease the use of chemicals, save energy, and reduce operating costs. Any sustainable solution to the current problems confronting wastewater treatment will require the development of an adequate information system for the control and supervision of WWTPs.

Close inspection of the current operation of WWTPs reveals that the use of instrumentation, control and automation (ICA) technology is minimal (Olsson and Newell, 1999). Few plants are equipped with more than a few elementary sensing elements and control loops, which are mostly used for flow metering and control, and for monitoring the basic plant performance over relatively long periods of time. Little progress has been made since the early 1970s, when a major leap forward was made by the widespread introduction of dissolved oxygen (DO) control. The introduction of ICA technology has been slow due to the lack of reliable instrumentation and the harsh environment in which the computer and automation devices are housed and operated. However, this situation is rapidly changing due to advances in sensor technology and the introduction of smart sensors capable of self-cleaning, self-calibration and self-reconfiguration. The current trend is towards integrated systems that control and monitor the process from the wastewater sources right through to the receiving waters and sludge disposal.

The primary purpose of ICA is to facilitate efficient operation of the WWTP, allowing effluent standards to be met for the lowest possible operational and capital costs. The main bottlenecks for the implementation of ICA technology within the WWTP are related to the following (Olsson and Newell, 1999; Jeppsson et al., 2001):

Poor legislation
Inadequate education, training and understanding
Lack of confidence and acceptance within WWTP industries
Lack of collaboration between stakeholders/organizations
Economy and time needed to develop solutions in practice and to make sure that they work
Unreliable measuring devices
Plant constraints and inadequate sewer systems
Lack of transparency
Lack of software and instrument standardization

The increase in public awareness about wastewater disposal over the past decade, as reflected in more stringent effluent regulation, has considerably increased the requirements imposed on treatment plants. The treatment process must now eliminate not only organic carbon pollution from wastewater, but also nutrients (e.g., nitrogen and phosphorus). The introduction of biological nutrient removal, the most economical method for removing nutrients, has significantly increased the complexity of process configurations. The main driving forces for ICA are related to:

Stricter effluent quality standards
Demand for lower sludge production
Economic incentive
Reduced energy consumption and increased energy production
Increased plant complexity (co-ordination of processes and loops, monitoring etc.)
New treatment concepts – e.g. more compact plants and water reuse
New and cheaper technical solutions – computers and communication

At present, many WWTPs are operated according to predetermined schemes with very little consideration given to variations in influent load. The use of on-line sensors for on-line control of plant operation could enhance the ability of WWTPs to comply with assigned effluent standards. Application of modern control theory in combination with new on-line sensors and appropriate models has great potential to improve effluent water quality, decrease the use of chemicals, and save energy and money. In particular, the input/output behavior of these processes can be such that they appear highly stable right up to the time at which a gross process failure occurs; apparently significant input disturbances may not excite any significant output response, while very significant responses may occur in the absence of any corresponding input disturbance. This distinctive feature of the WWTP has long challenged control engineers (Lindberg, 1997; Jeppsson, 1996; Lukasse, 1999; Olsson and Newell, 1999; Singman, 1999; Steffens and Lant, 1999; Lee, 2000; Sotomayor et al., 2000).

In contrast to the situation for the WWTP, multivariate statistical monitoring and diagnosis of process operating performance are extremely important aspects of plant safety and economic viability in most other process industries, for example the petrochemical and pharmaceutical industries. However, wastewater treatment industries are not among the most diligent and systematic users of statistical monitoring methods. To date, monitoring in wastewater treatment has mostly focused on a few key effluent quantities that are subject to regulations enforced by government or other authorities. However, the ongoing tightening of environmental restrictions requires increased efforts to improve effluent quality from the WWTP using advanced monitoring technology. To effectively monitor process behavior statistically, important information must be extracted from the large number of measured variables, and this information must be presented in a form that is readily interpreted. The concept of sustainability, which entails minimizing the use of resources such as energy, chemicals and manpower, has also become an important issue in the design and modification of WWTPs (Rosen, 1998; Olsson and Newell, 1999; Teppola, 1999).

However, there are further difficulties to overcome before a monitoring system

can successfully be applied to WWTP (Rosen, 2001).

Non-stationary data – The conditions in which WWTP are operated are

normally of a varying nature. Diurnal, weekly and seasonal patterns are

normally found in the influent wastewater characteristics. These disturbances

must be considered normal and are in practice seen as the state of things rather

than disturbances. It is often difficult to discern other process disturbances from

those caused by the varying influent conditions, which tend to have a dominant

effect on the process behavior.

Multiscale data – A difficulty related to the dynamic properties of the

disturbances as well as of the process is that disturbances occur in many

different time scales. It means that some disturbances affect the process in a

short time frame, whereas others have a much slower response. Besides

complicating the discernment of disturbances in a way similar to

non-stationarity, this also degrades the performance of many monitoring

techniques. Moreover, information on the time scale of a disturbance may

prove crucial for a decision on counteractive actions. The multiscale nature of

data is, however, not only a problem; it can also be used to decouple the


process time.

Nonlinearities – Wastewater processes display a nonlinear behavior and

relationships between variables cannot always be approximated by a linear

function. Consequently, if this is the case, nonlinearities must be taken into

account when developing a monitoring system.

Dynamic data – Almost all data from dynamic processes are autocorrelated,

which means that each observation is not independent of the previous

observation. This may have a great impact on statistical properties of the

monitoring output and consequently caution must be taken when interpreting

the result.
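The effect of autocorrelation can be illustrated with a minimal sketch. The Python snippet below is illustrative only: the first-order autoregressive signal, its coefficient `phi` and the helper `lag_autocorr` are assumptions chosen for the example, not plant data. It compares the lag-1 autocorrelation of an independent sequence with that of a strongly dynamic one.

```python
import numpy as np

def lag_autocorr(x, lag=1):
    """Sample autocorrelation of x at the given lag."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    return float(np.dot(xc[:-lag], xc[lag:]) / np.dot(xc, xc))

rng = np.random.default_rng(0)
n = 5000
white = rng.standard_normal(n)      # independent observations

# Hypothetical dynamic process: each value depends on the previous one
phi = 0.9
ar1 = np.zeros(n)
for k in range(1, n):
    ar1[k] = phi * ar1[k - 1] + white[k]

r_white = lag_autocorr(white)
r_ar1 = lag_autocorr(ar1)
print(f"lag-1 autocorrelation: independent = {r_white:.3f}, dynamic = {r_ar1:.3f}")
```

Control limits derived under an independence assumption would be too narrow for the second signal, which is why caution is needed when interpreting monitoring statistics computed from such data.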

1.2 Research Objectives

The principal goal of this research is to develop advanced control and

monitoring systems that improve the operation of the WWTP. This work was

undertaken with the following detailed objectives.

Objective 1: To apply autotuning and supervisory control in the full-

scale WWTP

In the WWTP, PID controllers are familiar to process operators and very

popular because of their simplicity, ease of operation and robustness to modeling

error. However, it is well known that DO dynamics cannot be effectively controlled

by PID controllers with fixed gain parameters. Moreover, manual tuning of PID

controllers is tedious and laborious. In the present study, closed-loop identification

and an auto-tuned PID controller are applied to the DO control system in the full-scale

WWTP. In addition, we propose a method for deciding on a proper DO set point for

the current operation condition of the aeration basin. The full-scale experimental

results showed good identification performance and good control performance.


Objective 2: To develop a robust estimation and control algorithm

We present a generalized damped least squares (GDLS) algorithm to

systematically remove an estimation windup problem in the adaptive control and

self-tuning control system. The key element of the proposed method is the addition

of a penalty of parameter variations to the objective function of the normal least

squares algorithm to prevent the singularity problem. Mathematical analysis shows

that the proposed method has almost equivalent properties to the normal least

squares method and guarantees that no estimation problem will be encountered for

poorly excited situations. Simulation results show that the proposed method gives

better estimation performance than previous methods. We applied this method in the

simultaneous control and estimation of important variables in the WWTP.
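The idea of penalizing parameter variations can be sketched in a few lines. The code below is a generic damped (regularized) least-squares step written for illustration; it is not the GDLS algorithm itself, whose details are not given in this section. The function name `damped_ls`, the penalty weight `mu` and the test data are all assumptions. It minimizes ||y - Phi*theta||^2 + mu*||theta - theta_prev||^2, which stays solvable even when the regressors are poorly excited.

```python
import numpy as np

def damped_ls(Phi, y, theta_prev, mu):
    """Least squares with a penalty mu on deviation from theta_prev."""
    n = Phi.shape[1]
    A = Phi.T @ Phi + mu * np.eye(n)      # always invertible for mu > 0
    b = Phi.T @ y + mu * theta_prev
    return np.linalg.solve(A, b)

# Poorly excited data (hypothetical): both regressors are identical,
# so Phi.T @ Phi is singular and ordinary least squares has no unique solution.
u = np.ones(20)
Phi = np.column_stack([u, u])
y = 4.0 * u
theta_prev = np.array([1.0, 2.0])

theta = damped_ls(Phi, y, theta_prev, mu=0.1)
print(theta, theta.sum())
```

In this example only the sum of the two parameters is identifiable from the data. The penalty term moves the estimate just enough to fit the data and leaves the unidentifiable direction (the difference between the parameters) at its previous value, which is the behavior one wants when excitation is poor.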

Objectives 3 and 4: To develop a disturbance detection and isolation

algorithm

The biological nutrient removal process alters gradually over time, indicating

that the process is nonstationary. This represents a problem for developing a

conventional multivariate statistical analysis because such an analysis must be

developed from a set of "normal" operating data. Moreover, monitoring of the

biological treatment process is very important because recovery from failures is

time-consuming and expensive. Hence, a reliable detection procedure is needed. We

propose an on-line fault detection and isolation algorithm, which uses a dissimilarity

measure to evaluate the difference between successive data sets and to discriminate

between serious and minor abnormalities. In addition, to cope with the nonstationary

and multiscale process changes in WWTP, we propose a modeling and

multiresolution analysis (MRA) method.

Objective 5: To develop a process monitoring method for continuous processes

with cyclic operation

Most WWTPs are subject to large diurnal fluctuations in the flow rate and

composition of the feed stream. Consequently, WWTPs exhibit daily periodic

characteristics, with strong diurnal fluctuations in the process input and output

variables. Although these processes are non-stationary, their behavior tends to repeat

from cycle to cycle and hence their cycle-to-cycle behavior may be assumed

stationary. We propose a method for monitoring a continuous process with diurnal

cyclic characteristics in the domestic WWTP. The proposed method uses a state-

space model to capture and utilize the cycle-to-cycle correlation structure. The

method can also incorporate the concept of inferential sensing to predict the quality

variables and enhance process monitoring.

Objective 6: To develop simultaneous prediction and classification in

the secondary settling tank

An efficient operation of the secondary settler is very important since it

separates the biomass from the treated wastewater and is a key mechanism of

determining the effluent quality in a biological WWTP. Simultaneous prediction and

classification in the secondary settling tank is proposed. An adaptive modeling scheme

is implemented with the recursive least squares (RLS) method to update the model

parameters, and a neural network (NN) classifier is used as the process

classifier. Experimental results show that the prediction model describes the

dynamics of the secondary settler well and that the neural network classifier combined with

an adaptive scheme is adequate for monitoring the secondary settler in

the WWTP.

Objective 7: To develop a nonlinear fuzzy partial least squares method

Fuzzy modeling has proved an efficient alternative for describing nonlinear


biological processes. Recently, the Takagi-Sugeno-Kang (TSK) fuzzy model has

received considerable attention because of its prediction ability and suitability for

continuous process modeling. The TSK model allows us to combine a set of

linearized models into a global model to approximate the complex nonlinear system

with less complexity. We propose a new nonlinear fuzzy partial least squares (FPLS)

algorithm that embeds the TSK fuzzy model into the regression framework of the

partial least squares (PLS) method. The proposed method applies the TSK fuzzy

model to the PLS inner regression. It uses the PLS method to solve the problems of

high dimensionality and collinearity and the TSK fuzzy model to capture the

nonlinearity and to increase the use of experts’ knowledge. We applied this method

in the simulation benchmark and full-scale WWTP.


II. Biological Wastewater Treatment Process

2.1 Activated sludge process

The activated sludge process with its many variations is the basis for the

treatment of wastewater almost everywhere in the United States. Especially, nearly

99% of municipal WWTPs in Korea use the traditional activated sludge process

(Lee, 2000). Figure 2.1 shows the basic layout of an activated sludge process

(Lindberg, 1997). The activated sludge process is a biological process in which

microorganisms oxidize and mineralize organic matter. All microorganisms enter the

system with the influent wastewater. The composition of the species depends not

only on the influent wastewater but also on the design and operation of WWTP. The

activated sludge is kept suspended in water by stirring and aeration. The

microorganisms use oxygen to oxidize organic matter. To maintain the

microorganism concentration, the sludge from the settler is recycled to the aeration

tank. The growth of the microorganisms and influent particulate inert matter are

removed from the process as excess sludge. Microorganism concentration is

controlled by the excess sludge flow rate.

Biological nitrogen removal

Nitrogen materials can enter the aquatic environment from either natural or

human caused sources. Excessive accumulation of various forms of nitrogen in

surface and ground waters can lead to adverse ecological and human health effects.

One of the major effects has been the direct and indirect depletion of oxygen in

receiving water. Other impacts can be of major importance in particular situations.

These include ammonia toxicity to aquatic animal life, adverse public health effects,

and a reduction in the suitability of water for reuse. Since the early 1970s significant

developments have taken place in the activated sludge method of treating


wastewater, as nutrient removal has become a very important factor in the WWTP.

Nitrogen materials are present in several forms in wastewater, e.g. as

ammonium (NH4+), nitrate (NO3-), nitrite (NO2-) and organic compounds. Nitrogen

is an essential nutrient for biological growth and is one of the main constituents in

all living organisms. When untreated wastewater arrives at the wastewater treatment

plant, most nitrogen is present in the form of ammonium. Nitrogen can be removed by a

two-step procedure. In the first step, ammonium is oxidized to nitrate in aerated

zones (nitrification). The microorganisms carrying out this process are generally

considered to be nitrosomonas and nitrobacter. The aerobic growth of autotrophs

consumes soluble carbon, ammonia and dissolved oxygen to produce extra biomass

and nitrate in solution. This step can be further divided into two, one producing

nitrites and the second further oxidizing nitrites to nitrates.

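$$\mathrm{NH_4^+} + 1.5\,\mathrm{O_2} \rightarrow \mathrm{NO_2^-} + \mathrm{H_2O} + 2\,\mathrm{H^+}$$

$$\mathrm{NO_2^-} + 0.5\,\mathrm{O_2} \rightarrow \mathrm{NO_3^-} \qquad (2.1)$$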
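As a rough worked check on the oxygen demand implied by the two nitrification steps, assuming complete oxidation of ammonium to nitrate and neglecting cell synthesis, adding the two reactions and converting to mass units gives:

$$
\mathrm{NH_4^+} + 2\,\mathrm{O_2} \rightarrow \mathrm{NO_3^-} + \mathrm{H_2O} + 2\,\mathrm{H^+},
\qquad
\frac{2 \times 32\ \mathrm{g\,O_2}}{14\ \mathrm{g\,N}} \approx 4.57\ \mathrm{g\,O_2}\ \text{per g of ammonium nitrogen}
$$

Because part of the nitrogen is assimilated into new biomass, the oxygen actually consumed per unit of nitrogen removed is somewhat lower than this stoichiometric figure.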

The second major step is the anoxic growth of heterotrophs, which use nitrate

as the oxidizer and produce extra biomass and nitrogen gas (denitrification). This

process takes place in an oxygen-free (anoxic) environment where the bacteria responsible for

denitrification respire with nitrate instead of oxygen.

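$$4\,\mathrm{NO_3^-} + 5\,\text{“}\mathrm{CH_2O}\text{”} + 4\,\mathrm{H^+} \rightarrow 2\,\mathrm{N_2} + 5\,\mathrm{CO_2} + 7\,\mathrm{H_2O} \qquad (2.2)$$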

where “CH2O” stands for diverse organic matter (Lindberg, 1997).

By using these two bacterial processes, nitrogen is removed from wastewater

biologically. Anoxic zones in the activated sludge process are necessary for

denitrification. Anoxic zones can be placed either in the beginning of the tank (pre-

denitrification) or in the end of the tank (post-denitrification). In a pre-denitrifying

system, an extra recirculation flow is usually introduced to transport the nitrate rich

water back to the anoxic zone. For successful denitrification, a sufficiently high

influent carbon:nitrogen ratio is required. When this requirement is not met, an


external carbon source has to be added. The dosing rate of that carbon is important.

Dosing an insufficient amount will result in a high effluent nitrate concentration.

Dosing too much will increase the costs considerably due to a high external carbon

use, a high sludge production, and an increased oxygen demand. The strong

variation in influent flow and composition, which is typical for WWTPs, generates a

demand for on-line control of the denitrification process in order to guarantee a

sufficiently low effluent nitrate concentration. Two variables can be manipulated to

achieve this objective: (1) the external carbon dosage, to guarantee that almost all

the recirculated nitrate is removed in the anoxic zone; and (2) the nitrate

recirculation flow rate, to control the amount of nitrate that is recirculated. For

optimal control of the process, the two variables should be controlled simultaneously

to form a multivariable control system (Jeppsson, 1996; Lindberg, 1997; Steffens

and Lant, 1999; Sotomayor et al., 2000).

2.2 Simulation Benchmark

WWTPs are large non-linear systems subject to perturbations in flow and load,

together with uncertainties concerning the composition of the incoming wastewater.

Nevertheless these plants have to be operated continuously, meeting stricter and

stricter regulations. Many control strategies have been proposed in the literature but

their evaluation and comparison, either in real-life applications or based on

simulations, is difficult. This is partly due to the variability of the influent, the

complexity of the biological and hydrodynamic phenomena and the large range of

time constants (from a few minutes to several days, even weeks), but also to the lack

of standard evaluation criteria. It is difficult to judge the particular influence of the

applied control strategy on reported performance increase, because the reference

situation is often not optimal. Due to the complexity of the systems the effort to

develop alternative control approaches is so high that a fair comparison between


different options is very rarely made. Then it remains difficult to conclude to what

extent the proposed solution is process or location specific. To enhance the

acceptance of innovative control strategies the evaluation should be based on a

rigorous methodology including a simulation model, plant layout, controllers,

performance criteria and test procedures. To this end, there has been a recent effort

to develop a standardized simulation protocol – ‘simulation benchmark’.

The COST 682 Working Group No.2 has developed a benchmark for

evaluation of control strategies by simulation (COST-624). The benchmark is a

simulation environment defining a plant layout, a simulation model, influent loads,

test procedures and evaluation criteria. For each of these items, compromises were

pursued to combine simplicity with realism and accepted standards. Once the user

has validated the simulation code, any control strategy can be applied and the

performance can be evaluated according to certain criteria (Alex et al., 1999; Pons et

al., 1999; Copp et al., 2000).

A relatively simple layout was selected in simulation benchmark. It combines

nitrification with pre-denitrification, which is most commonly used for nitrogen

removal. Figure 2.2 shows a schematic representation of the layout. The plant

consists of a 5-compartment bioreactor (6000 m3) and a secondary settler (6000 m3).

The first two compartments of the bioreactor are not aerated

whereas the others are aerated. The IAWQ model No. 1 (Henze et al., 1987) and a

ten-layer one-dimensional settler model (Takács et al., 1991) are used to simulate

the biological reactions and the settling process, respectively. Influent data

developed by a working group on benchmarking of WWTP, COST 624, are used in

the simulation. The return sludge flow rate (Qr) is set to 100% of the influent flow

rate and internal recirculation (Qa) is controlled using a setpoint (SNO, ref) of 1.0 mg

N/l for the nitrate concentration in the second aerator. The aeration (KLa) in the


aerator 3 and 4 is set to a constant value, 240 day-1. The DO concentration in

aerator 5 is controlled to a set point of 2.0 mg/l. Simulated influent data are available

in three two-week files derived from real operating data. The files were generated to

simulate three weather situations representing dry weather, storm weather (dry

weather + 2 storm events), and rain weather (dry weather + long rain period). The

files exhibit characteristic diurnal variations in flow and component concentrations.

Each file contains 14 days of influent data at 15-minute sampling intervals. Any

control strategy should be tested using each of these weather files.

2.3 Full-scale WWTP

The process data were collected from a WWTP that treated the coke plant

wastewater of the iron and steel making plant in Korea. It is a general activated

sludge process that has five aeration basins (each 900 m3) and a secondary clarifier

(1200 m3). The plant layout of the studied activated sludge plant is presented in

Figure 2.3. It has two wastewater sources, where one directly comes from a coke

making plant (called BET3) and the other comes from a pretreated wastewater of

upstream WWTP at other coke making plant (called BET2). The coke-oven plant

wastewater is produced during the conversion process of coal to coke in the steel

making industries. It is extremely difficult to treat the coke wastewater because it is

highly polluted and most of the chemical oxygen demand (COD) originated from

large quantities of toxic, inhibitory compounds and coal-derived liquors (e.g.

phenolics, thiocyanate, cyanides, poly-hydrocarbons and ammonium). In particular,

the cyanide (CN) concentration is the most important component of the influent

load of the coke wastewater. The influent flow rate is 250 – 350 m3/hr, influent COD

is 500 – 1200 mg/l, influent cyanide is 5 – 30 mg/l, influent temperature is 30 – 45 °C, temperature in the aeration basin is 27 - 33 °C and operation cost was 0.08 $/ton

in 1999.


Twelve process and manipulated variables, X blocks, were used to model three

process output variables, Y blocks. Y blocks consist of the solid volume index (SVI),

the reduction of cyanide, and the reduction of COD. Table 2.1 describes the process

variables and presents the mean and standard deviation (SD) values of X and Y

blocks. The process data consisted of daily mean values from 1 January, 1998 to 9

November, 2000 with a total number of 1034 observations. The first 720

observations were used for the training data. And the remaining 314 observations

were used as a test data set in order to verify the proposed methods.


Figure 2.1 A basic activated sludge process with an aerated tank and a settler


Figure 2.2 A simplified process scheme of an activated sludge process using pre-denitrification (layout of simulation benchmark)

[Figure labels: wastewater (Q0, Z0) enters a biological reactor of Units 1 to 5, with an anoxic section followed by an aerated section, and then a clarifier with layers m = 1, m = 6 and m = 10 marked; streams Qa, Za (internal recycle), Qr, Zr (external recycle), Qw, Zw (wastage), Qe, Ze (to river), Qf, Zf and Qu, Zu; PI controllers on nitrate and on dissolved oxygen; kLa = oxygen transfer coefficient]


Figure 2.3 Plant layout of coke WWTP, Korea

[Figure labels: cokes plant, pretreated WWTP, BET2, BET3, Eq.T/K, aeration basins A to E, settler, recycle sludge, waste sludge, final WWTP]


Table 2.1 Process Input/Output Variables in full-scale WWTP

No    Variable     Description                          Unit    Mean     SD
X1    Q2           Flow rate from BET2                  m3/h    179.4    15.98
X3    CN2          Cyanide from BET2                    mg/L    2.455    0.3764
X5    COD2         COD from BET2                        mg/L    156.4    20.28
X6    COD3         COD from BET3                        mg/L    2083     295.5
X7    MLSS_%E      MLVSS at final aeration basin        mg/L    1605     409.3
X8    MLSS_R       MLSS in recycle                      mg/L    7194     3444
X9    DOaerator    DO at final aeration basin           mg/L    2.064    0.9979
X10   Tinfluent    Influent temperature                 °C      37.6     2.513
X11   Taerator     Temperature at final aerator         °C      30.74    2.379
X12   pHAT         pH at final aeration basin           -       7.24     0.22
Y1    SVIsettler   Solid volume index at settler        mL/g    63.31    21.73
Y2    CNred        Cyanide reduction                    mg/L    19.31    4.2
Y3    CODred       COD reduction                        mg/L    605.4    97


III. Autotuning and Supervisory DO Control in Fullscale

WWTP

3.1 Introduction

The dissolved oxygen (DO) concentration in WWTP has been recognized as an

important variable to be controlled both for economical and process efficiency

purpose. The proper control of DO could achieve improved process performances

and there is an economic incentive to minimize excess oxygenation by supplying

only necessary air to meet the time-varying oxygen demand of the mixed liquors.

Despite the relatively simple dynamics of the DO mass balance, DO control is

known to be difficult because of time-varying influent wastewater conditions,

nonlinearity, time delay, sensor noise and slow sensor dynamics. To overcome these

problems, several adaptive control strategies have been suggested recently to the

control of DO concentration in the aeration basin (Holmberg et al., 1989; Carlsson,

1993; Carlsson et al., 1994, 1996; Lindberg, 1997). These previous works with

advanced control algorithms require detailed process information such as the oxygen

transfer rate, respiration rate, reactor volume and wastewater flow rate, and use

mathematically complex algorithms that are difficult to implement on-line.

Moreover, they cannot be implemented with the PID controller, which is the

most common controller in real WWTPs. Therefore, a method is required that improves

PID controller performance using only the process input-output data,

without any complicated algorithm. Automatic tuning of the PID

controller (autotuning) is such a method.

In WWTP, PID controllers are familiar to process operators and very popular

because of their simplicity, ease of operation and robustness to modeling error. But

it is well known that the DO concentration cannot be controlled effectively by using

the PID controller with fixed gain parameters. And the manual tuning of PID

controller is tedious and laborious. To overcome the time-consuming manual tuning

procedures of PID controller, many on-line identification methods have been

proposed to obtain the process information (Åström and Hägglund, 1985; Sung et al.,

1998a, 1999). In WWTP, Carlsson et al. (1994) used an autotuning controller

suggested by Åström-Hägglund to control DO concentration in WWTP. And Diue et

al. (1995) used relay feedback method in order to tune PID loop controller

parameters of PLC in the chlorination and dechlorination process.

The first objective of this research is to apply an autotuning to actual DO

control system in the full-scale WWTP. It approximates the dissolved oxygen

dynamics to a high order model using the integral transform method and reduces it

to the first-order plus time delay (FOPTD) or second-order plus time delay (SOPTD)

for the PID controller tuning. The PID controller is then tuned based on the reduced

model. The second objective is to suggest a simple supervisory control algorithm

which decides a proper DO set point for the aeration basin’s current operating condition. The

key idea is that DO set point is determined in proportion to the respiration rate that is

the indicator of biologically degradable load. Because we cannot have the real-time

respirometry in the full-scale plant, we have used the well-known respiration rate

estimation algorithm suitable to the surface aerator type of WWTP by using the

recursive parameter estimation approach. The proposed methods have been

evaluated in the full-scale WWTP.

3.2 Method

3.2.1 Autotuning Method

System identification under closed-loop conditions has become an

important issue in industrial and environmental applications, since the process output

can drift away from the normal steady state during open-loop identification.


This section explains a closed-loop identification method using the integral

transform as the identification method (Whitfield and Messali, 1987; Sung et al.,

1998a, b, 1999). It can utilize the process output and input activated by any test

signal generator (e.g. controller itself, relay, P controller, simple set point change of

PID control, pulse or step response) provided that the process is sufficiently excited.

The operator can therefore excite the process in whichever way is most convenient.

The identification has the following steps. First, the process is activated

sufficiently to guarantee that the process output and the process input include

required information. Second, the differential equation of the parametric model is

converted to the corresponding linear algebraic equation by using the integral

transform. Third, the model parameters are estimated by using a least squares

method based on the measured process data.

Consider a general high order process model in the time domain.

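$$a_n \frac{d^n y}{dt^n} + a_{n-1}\frac{d^{n-1} y}{dt^{n-1}} + \cdots + a_1 \frac{dy}{dt} + y = b_m \frac{d^m u}{dt^m} + b_{m-1}\frac{d^{m-1} u}{dt^{m-1}} + \cdots + b_1 \frac{du}{dt} + b_0 u \qquad (3.1)$$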

The above model (3.1) can approximate usual processes as accurately as

desired, even though the processes include time delay or non-minimum phase zeroes.

To convert the differential equation to an algebraic equation, the following integral

transform (3.2) is applied to both sides of equation (3.1).

$$I_y(i, tf_j) = \underbrace{\int_0^{tf_j}\int_0^{\tau_i}\cdots\int_0^{\tau_2}}_{i}\, y(\tau_1)\, d\tau_1 \cdots d\tau_{i-1}\, d\tau_i \qquad (3.2)$$

and as a result, equation (3.3) is obtained.

$$a_n I_y(0, tf_j) + a_{n-1} I_y(1, tf_j) + \cdots + a_1 I_y(n-1, tf_j) + I_y(n, tf_j) = b_m I_u(n-m, tf_j) + b_{m-1} I_u(n-m+1, tf_j) + \cdots + b_0 I_u(n, tf_j) \qquad (3.3)$$

The objective of the identification is to estimate the coefficients a_k and b_k. In

equation (3.3), all integrated values can be calculated numerically for various tf_j


values. Then a_k and b_k are obtained by a least squares algorithm. It should be

pointed out that the identification method using the integral transform does not depend on

the type of signal generator, provided that the signal excites the process

sufficiently. It uses only the least squares method to estimate the parameters of

the process model. The identified high order transfer function model can be used as

the process model for other adaptive control, a Smith predictor or other model-based

controller. On the other hand, we should reduce the identified model to the FOPTD

or SOPTD model to tune the PID controller automatically because many developed

on-line PID tuning methods such as internal model control (IMC), the integral of the

time-weighted absolute error (ITAE) and Cohen-Coon (C-C) are based

on these models.
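The three identification steps (excite, integrate, solve by least squares) can be sketched for the simplest case. The Python code below is a minimal illustration, assuming a first-order model a1*dy/dt + y = b0*u with zero initial conditions and no time delay; integrating it once gives a1*y(tf) + I_y(1, tf) = b0*I_u(1, tf), which is linear in a1 and b0. The helper names `cumtrapz0` and `identify_first_order`, the step test signal and the numerical values are assumptions for the example, not the implementation used in the plant.

```python
import numpy as np

def cumtrapz0(x, dt):
    """Running trapezoidal integral of x, starting from zero."""
    out = np.zeros_like(x, dtype=float)
    out[1:] = np.cumsum(0.5 * (x[1:] + x[:-1])) * dt
    return out

def identify_first_order(t, u, y):
    """Estimate a1 and b0 in a1*dy/dt + y = b0*u (zero initial conditions)."""
    dt = t[1] - t[0]
    Iy = cumtrapz0(y, dt)
    Iu = cumtrapz0(u, dt)
    # Integrated model: a1*y(tf) + Iy(tf) = b0*Iu(tf)  ->  Iy = -a1*y + b0*Iu
    A = np.column_stack([-y, Iu])
    theta, *_ = np.linalg.lstsq(A, Iy, rcond=None)
    return float(theta[0]), float(theta[1])

# Synthetic test data (hypothetical values): unit step applied at t = 0
a1_true, b0_true = 0.35, 0.2
t = np.linspace(0.0, 3.0, 3001)
u = np.ones_like(t)
y = b0_true * (1.0 - np.exp(-t / a1_true))

a1_hat, b0_hat = identify_first_order(t, u, y)
print(f"a1 = {a1_hat:.4f}, b0 = {b0_hat:.4f}")
```

Working with integrals rather than derivatives is the main design point: numerical integration averages out measurement noise, whereas differentiating noisy DO data would amplify it. A higher-order model follows the same pattern with repeated integrals as additional regressors.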

The PID controller can be tuned at a number of operating conditions such as

high/low respiration rate or load. Because we can separately use the proposed

method for each operation condition, the proposed method can effectively

compensate for the operation condition change by different PID controller

parameters. But, how can we suggest its set point? The next idea is that DO set point

is determined in proportion to the respiration rate that is the indicator of biologically

degradable load.
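To make the step from a reduced FOPTD model to PID settings concrete, the sketch below applies one commonly tabulated IMC-based PID rule for a model K*exp(-theta*s)/(tau*s + 1). This section does not state which tuning rule was actually applied in the plant, so the rule, the function name `imc_pid_foptd` and the closed-loop time constant `lam` are assumptions; the model values are used only as an illustration.

```python
def imc_pid_foptd(K, tau, theta, lam):
    """IMC-based PID settings for a FOPTD model (ideal PID form).

    K: process gain, tau: time constant, theta: time delay,
    lam: desired closed-loop time constant (a tuning choice).
    """
    Kc = (tau + 0.5 * theta) / (K * (lam + 0.5 * theta))
    tau_i = tau + 0.5 * theta
    tau_d = tau * theta / (2.0 * tau + theta)
    return Kc, tau_i, tau_d

# Illustrative FOPTD values in hours; lam is set equal to the delay (assumed)
Kc, tau_i, tau_d = imc_pid_foptd(K=0.2, tau=0.35, theta=0.19, lam=0.19)
print(f"Kc = {Kc:.3f}, tau_i = {tau_i:.3f} h, tau_d = {tau_d:.4f} h")
```

Repeating the identification and this calculation at each operating condition (for example high and low load) would give one parameter set per condition, which is the gain-scheduling idea described above. A larger `lam` gives a slower but more robust loop.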

3.2.2 Supervisory Control

Supervisory control which recommends a proper DO set point in the aeration

basin’s current operation condition has been problematic in WWTP. The proper

aeration is crucial to treatment efficiency since an insufficient DO level impairs the

oxidation process and eventually leads to biomass death, whereas too high a DO level may

cause the sludge to settle poorly. Excessive aeration is also undesirable from an

economic point of view since the excess oxygen is lost to the atmosphere.

Therefore, a proper DO set point can offer advantages such

as better control of effluent quality and energy savings from a lower DO level. However,

there have been few guidelines for the proper supervisory control of DO set point

until now (Lindberg, 1997). Lindberg (1997) suggested a set point controller which

utilizes measurements of ammonia concentration in the aeration basin.

The key idea is that DO set point is determined in proportion to the respiration

rate and influent loading because the respiration rate is the important variable that

characterizes the DO process and the associated removal and degradation of

biodegradable matter and is the only indicator of biologically degradable load. That

is, if toxic matter enters the plant, for example, this can be detected as a decrease in

the respiration rate, since the microorganisms reduce their activity or some of them

die. Then, a rapid decrease in the respiration rate may hence be used as a warning

that toxic matter has entered the plant. In this case, we should increase the DO set

point. Therefore, we can suggest the following decision rule that “The higher

respiration rate, the lower DO set point. The lower respiration rate, the higher DO

set point”. Figure 3.2 shows the scheme of the supervisory control to decide the set

point of DO controller.
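One possible way to turn the quoted decision rule into a computable set point is a bounded linear map, sketched below. The text gives only the qualitative rule, so the linear shape, the function name `do_setpoint` and every numerical bound (`r_low`, `r_high`, `sp_min`, `sp_max`) are hypothetical choices made for illustration.

```python
def do_setpoint(resp_rate, r_low=30.0, r_high=80.0, sp_min=1.0, sp_max=3.0):
    """Map an estimated respiration rate (mg/l/h) to a DO set point (mg/l).

    Higher respiration rate -> lower set point, and vice versa.
    The result is clamped to [sp_min, sp_max].
    """
    frac = (resp_rate - r_low) / (r_high - r_low)
    frac = min(1.0, max(0.0, frac))          # keep the set point within bounds
    return sp_max - frac * (sp_max - sp_min)

for r in (10.0, 30.0, 55.0, 80.0, 120.0):
    print(r, do_setpoint(r))
```

The clamping reflects the practical need to keep the set point within a safe operating range; in a real application the bounds would have to come from process knowledge of the specific plant.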

Because we did not have a real-time respiration rate meter in the waste loading

state, we used a well-known respiration rate estimation algorithm with a Kalman

filter approach suitable for the surface aerator type of full-scale WWTP. The

estimated parameter is also used to judge the present operating state and

process load. There are several different approaches, such as recursive methods, to

estimate the oxygen transfer rate (KLa) and respiration rate (R(t)) from

measurements of DO and airflow rate (Holmberg , 1989; Carlsson et al., 1994;

Lindberg, 1997; Joanquin et al., 1998). Here we used the Lindberg’s method

(Lindberg, 1997). The KLa and the respiration rate are tracked by a Kalman filter by

using measurements of DO and air flow rate, u(t). During the autotuning phase, the

airflow rate or aerator speed variation is given a high excitation both in amplitude


and frequency. The estimation procedure is performed on a relatively short data set,

in our case the identification time of the autotuning. Then, the estimated models of the

respiration rate and oxygen transfer rate can be used for other controller designs.

The estimated value of the respiration rate would be used as a base rule in the set

point decision.
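The flavor of such an estimator can be shown with a deliberately simplified sketch. It is not Lindberg's method: here KLa is assumed known and constant, the DO mass balance is assumed to be dy/dt = KLa*(y_sat - y) - R, and only the respiration rate R is tracked as a random walk with a scalar Kalman filter. The function name `track_respiration`, the noise settings `q` and `r`, and all simulation values are hypothetical.

```python
import numpy as np

def track_respiration(y, kla, y_sat, h, q=0.01, r=2.0):
    """Scalar Kalman filter tracking R from sampled DO values y (step h)."""
    r_hat, p = 0.0, 100.0                    # initial estimate and variance
    history = []
    for k in range(len(y) - 1):
        # Pseudo-measurement of R from the Euler-discretized mass balance
        z = kla * (y_sat - y[k]) - (y[k + 1] - y[k]) / h
        p += q                               # random-walk prediction step
        gain = p / (p + r)
        r_hat += gain * (z - r_hat)          # measurement update
        p *= (1.0 - gain)
        history.append(r_hat)
    return np.array(history)

# Simulated DO data (hypothetical plant): true R = 54 mg/l/h
rng = np.random.default_rng(1)
h, kla, y_sat, r_true = 0.01, 20.0, 6.5, 54.0
y = np.empty(600)
y[0] = 2.0
for k in range(599):
    y[k + 1] = y[k] + h * (kla * (y_sat - y[k]) - r_true)
y_meas = y + 0.01 * rng.standard_normal(y.size)   # sensor noise

est = track_respiration(y_meas, kla, y_sat, h)
print(f"estimated respiration rate: {est[-1]:.1f} mg/l/h")
```

Estimating KLa and R together, as described in the text, additionally requires excitation of the aeration input so that the two effects can be separated; that is why the estimation is run during the autotuning phase.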

3.3 Experimental Results

In this work, the experiment was performed in the industrial coke wastewater

treatment facility of the iron and steel making plant, Korea. Figure 3.3 shows a

schematic diagram of the WWTP considered in this research. The plant consists of

two parts: one is the biological process made up of the activated sludge process. The

other is the chemical treatment process. As shown in Figure 3.3, WWTP has five

aeration basins and one settling tank in the biological process. Each aeration basin is

equipped with sensors (pH, DO, ORP, MLSS) and a speed controllable surface

aerator in order to supply the oxygen. The automatic control system has a PC/PLC

structure, which is based on a number of tag points for supervision, data acquisition,

data storage and analysis. It was designed as the user-friendly control system using

the commercial man machine interface (MMI) software known as FIX DMACS 7.0.

The PID control algorithm was installed in the MMI and the autotuning program

was implemented in Visual Basic 6.0.

The closed-loop identification methods were tested using various input

signals in the real plant from Feb. 2, 2000 to Feb. 28, 2000. In this research, a simple

set point change of PID controller itself was chosen as the activation signal without

any control mode change. It is simple, stable and easy to implement the proposed

on-line identification method. The tested aeration basin was the last basin which was

the most important in the total wastewater treatment process. The DO set point was

increased from 1.6 to 2.0 mg/l at 0.05 hour for identification. Figure 3.4 shows the


variations of DO concentration and aerator speed during the closed-loop

identification.

Using the acquired data, a high order model is identified with the identification method explained above. The system order is chosen as n = 4 and m = 3. Equation (3.4) gives the identification result.

$$G_p(s) = \frac{y(s)}{u(s)} = \frac{-0.000012s^3 + 0.00098s^2 - 0.001s + 0.196}{-0.000003s^4 + 0.001s^3 - 0.0016s^2 + 0.325s + 1} \tag{3.4}$$

For the PID controller tuning, the high order model is reduced to FOPTD and SOPTD models using the model reduction technique. The reduced models are as follows, with time in hours.

$$G_{FOPTD}(s) \cong \frac{0.2\,e^{-0.19s}}{0.35s + 1}, \qquad G_{SOPTD}(s) \cong \frac{0.2\,e^{-0.17s}}{0.0044s^2 + 0.0036s + 1} \tag{3.5}$$

In Figure 3.5, Bode plots of the identified models are presented for model selection. If high control performance is required, the SOPTD model is recommended. If a stabilizing controller is the main objective, the FOPTD model is sufficient. In this research, the FOPTD model is selected for simplicity because the reduced FOPTD and SOPTD models show similar results in the Bode plots. From the identification result and theoretical analysis, the time constant is 21 min, the time delay is 11.4 min, and the steady state gain is 0.2. For model validation, the aerator speed is increased from 50 to 60 RPM as a step input. Figure 3.6 shows the step response of the real data and the prediction of the FOPTD model. The obtained model approximates the behavior of the real plant successfully in spite of the experimental error, and the identified model is robust to measurement noise.
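The validation step can be illustrated with the analytic step response of a FOPTD model. The sketch below is only an illustration under stated assumptions: it uses the gain, time constant and delay quoted above (0.2, 21 min = 0.35 h, 11.4 min = 0.19 h), assumes the gain is expressed in mg/l per RPM, and the function name is hypothetical.

```python
import math


def foptd_step_response(t_hours, gain=0.2, tau=0.35, delay=0.19, step=10.0):
    """Change in DO [mg/l] at time t_hours after an aerator-speed step [RPM].

    Defaults are the FOPTD values quoted in the text (time unit: hour).
    The formula is the standard first-order-plus-time-delay step response.
    """
    if t_hours <= delay:
        return 0.0  # nothing happens before the dead time has elapsed
    return gain * step * (1.0 - math.exp(-(t_hours - delay) / tau))


# DO deviation every 5 minutes during the first half hour after a 10 RPM step
for minutes in range(0, 35, 5):
    print(minutes, round(foptd_step_response(minutes / 60.0), 3))
```

The response stays at zero during the dead time and then rises towards gain × step; comparing such a curve with measured data is what the validation test in Figure 3.6 does.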

During the identification phase of autotuning, we estimated the respiration rate using the estimation technique described previously. The experiment was run under the following process conditions: the influent flow rate is 280 m3/hr, the temperature of the aeration basin is 38 °C and DOsat is 6.5 mg/l. The estimation result is shown in Figure 3.7; the estimated respiration rate converges to 54 mg/l/h. For the supervisory control, we adopted the simple set point decision rule introduced previously: “The higher the respiration rate, the lower the DO set point. The lower the respiration rate, the higher the DO set point”. To prevent the DO set point from becoming too high or too low, it is only allowed to vary within an interval, 0.5-3.0 mg/l in our case. The respiration rate range is 10-110 mg/l/h. For our coke WWTP, we suggest the following simple set point decision rule.

$$DO_s = -0.025\,\hat{R}(t) + 2.75 \tag{3.6}$$
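A minimal sketch of this supervisory rule is given below. The slope, intercept and the 0.5-3.0 mg/l interval are taken from the text; the function name and the way the limits are enforced (simple clamping) are assumptions made for illustration.

```python
def do_set_point(resp_rate, slope=-0.025, intercept=2.75, lo=0.5, hi=3.0):
    """DO set point [mg/l] from the estimated respiration rate [mg/l/h]."""
    raw = slope * resp_rate + intercept   # linear rule (3.6)
    return min(hi, max(lo, raw))          # keep the set point inside [lo, hi]


for r in (10.0, 25.0, 54.0, 110.0):
    print(r, round(do_set_point(r), 3))
```

With the estimated respiration rates reported in this section, the rule gives about 1.4 mg/l at 54 mg/l/h and about 2.1 mg/l at 25 mg/l/h; at the upper end of the respiration rate range the lower limit of 0.5 mg/l becomes active.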

Figure 3.8 shows the control results over 26 days with the PID parameters tuned using the acquired FOPTD model. As a tuning rule, the ITAE disturbance rejection rule was selected because it is appropriate for the step input disturbances that occur frequently in WWTP. The first 13 days show the PID control result with a set point of 1.4 mg/l. Then, abnormal process changes in the influent load occurred at about the 14th day after the first identification step. Here, the identification method was used again to acquire a new process model by a set point change of about 1.5 mg/l. Because the experimental result showed that the estimated respiration rate was low, at about 25.0 mg/l/h, we recalculated the DO set point as 2.1 mg/l based on the proposed supervisory control law. We then changed the PID control parameters based on the refreshed process model. The experimental result showed good control performance in spite of frequent load variations, abrupt upstream transitions and influent toxicity. Because the identification method uses all measured data to estimate several adjustable parameters, it is particularly robust to measurement noise. As a result, autotuning and supervisory control achieved an overall improvement of effluent quality and reduced the electric power cost by 15 % compared with the fixed gain PID controller.
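The text does not list the tuning formulas or the resulting PID settings. As a hedged illustration only, the sketch below applies the commonly tabulated ITAE correlations for disturbance rejection with a FOPTD model (coefficient pairs 1.357/-0.947, 0.842/-0.738 and 0.381/0.995 for the proportional, integral and derivative modes) to the FOPTD parameters identified above. Whether these exact correlations were used in the plant is an assumption, and the function name is hypothetical.

```python
def itae_disturbance_pid(gain, tau, delay):
    """PID settings from tabulated ITAE disturbance-rejection correlations.

    gain  : steady state gain of the FOPTD model
    tau   : time constant (same time unit as delay)
    delay : time delay
    Returns (kc, tau_i, tau_d) with tau_i and tau_d in the unit of tau.
    """
    ratio = delay / tau
    kc = (1.357 / gain) * ratio ** -0.947        # K*Kc  = 1.357 (theta/tau)^-0.947
    tau_i = tau / (0.842 * ratio ** -0.738)      # tau/tau_i = 0.842 (theta/tau)^-0.738
    tau_d = tau * 0.381 * ratio ** 0.995         # tau_d/tau = 0.381 (theta/tau)^0.995
    return kc, tau_i, tau_d


# FOPTD model from the identification step (time unit: hour)
kc, tau_i, tau_d = itae_disturbance_pid(gain=0.2, tau=0.35, delay=0.19)
print(round(kc, 2), round(tau_i, 3), round(tau_d, 4))
```

Because the settings depend only on the three FOPTD parameters, re-identifying the model after a process change, as done on day 14, directly yields refreshed controller parameters.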


3.4 Conclusions

An autotuning and supervisory control algorithm for DO control in the full-scale WWTP was proposed and evaluated. Although the proposed method is concise and does not require any complicated numerical techniques, experimental results confirmed that autotuning and supervisory control brought an overall improvement of effluent quality and a 15% reduction of the electricity cost. The autotuning method has also been applied to other control variables in the full-scale WWTP, such as pH, polymer addition and sludge recycle rate.


Figure 3.1 Identification and autotuning procedure for PID controller

[Block diagram with blocks: Activated Process Data, Identification, Model reduction, Tuning Rule; outputs: kc, τi, τd]

Figure 3.2 Supervisory DO control scheme

[Block diagram with blocks: Setpoint Controller, PID (DO) Controller, ASP; signals: Input, AFR, R(t), DOs, DO]

Figure 3.3 Schematic diagram of full-scale WWTP

[Schematic: Influent Wastewater (pH, Temp, flowrate), Equalization Basin, 1st Settling Tank, aeration basins #A to #E with Aerators and sensors (pH, DO, ORP, MLSS), 2nd Settling Tank, Flocculator, Thickening Tank, Filter Press, Effluent; sections: Biological Treatment (Activated Sludge) and Chemical Treatment; control: PLC and MMI (VB)]

Figure 3.4 Experimental result during identification phase

[Plot: aerator speed [RPM] and DO [mg/l] versus time [hour]]

Figure 3.5 Bode plots of the identified FOPTD and SOPTD models

[Two panels versus ω (rad/hour): amplitude ratio AR and phase φ (deg); curves: High Order, SOPTD, FOPTD]

Figure 3.6 The validation test: real data and model prediction value

[Plot: DO [mg/l] versus Time [min]; curves: Real, FOPTD]

Figure 3.7 Estimated respiration rate during identification phase

[Plot: Respiration rate [mg/l/h] versus time [hour]]

Figure 3.8 DO control result using autotuning and supervisory control algorithm

[Plot: DO [mg/l] versus time [day]]

IV. Generalized Damped Least Squares Method

4.1 Introduction

Since the main difficulty in controlling biological processes arises from the variability of kinetic parameters and the limited amount of on-line information, an adaptive controller is well suited for this purpose. Adaptive control or self-tuning control is usually based upon simultaneous model identification and control and thus requires on-line updating of the model parameters rather than off-line processing of the process data. The Recursive Least Squares (RLS) method is the most popular on-line recursive parameter estimation algorithm (Lambert, 1987; Ljung, 1987; Söderström & Stoica, 1989; Åström, 1995; Gau & Stadther, 2000). Recursive parameter estimation is a useful tool in WWTP, since the system often has time varying characteristics and parameters (Marsili-Libelli, 1990).

In general, the RLS method works well only if the process is properly excited all the time. With exponential forgetting, problems arise when the excitation is poor, which is exactly the situation when the control performance is good. Closed-loop control introduces linear dependencies among the process signals, and this produces a singularity problem when the signals are not sufficiently excited under closed-loop operation. The result is known as “estimation windup” or “blowup”, in which the gain of the estimator grows exponentially. It is easiest to see when the process has good control performance and is operated at steady state. It is a main bottleneck in applying adaptive controllers (Åström, 1995).
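The windup mechanism can be reproduced with the covariance update of a scalar RLS estimator with exponential forgetting. The sketch below is illustrative only: it uses the standard RLS covariance recursion, a single parameter, and hypothetical values (λ = 0.95, 200 samples) chosen for the demonstration.

```python
def rls_covariance_step(p, phi, lam):
    """One covariance update of scalar RLS with forgetting factor lam."""
    return (p - p * phi * phi * p / (lam + phi * p * phi)) / lam


p_excited, p_idle = 1.0, 1.0
for _ in range(200):
    # persistently excited regressor: covariance settles near 1 - lam
    p_excited = rls_covariance_step(p_excited, phi=1.0, lam=0.95)
    # no excitation (steady state in deviation variables): P is divided by lam
    p_idle = rls_covariance_step(p_idle, phi=0.0, lam=0.95)

print(p_excited, p_idle)
```

With excitation the covariance stays bounded, whereas with a zero regressor it grows as 1/λ at every sample, which is the exponential growth of the estimator gain described above.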

On the other hand, estimates of the oxygen transfer rate KLa(u) and the respiration rate R(t) in WWTP are needed to monitor the biological activity and the performance of the process control system, or to construct a nonlinear controller that controls the DO concentration more effectively. Knowledge of the two variables is of interest in both process diagnosis and process control. In particular, the respiration rate is the key variable that characterizes the DO process and the associated removal and degradation of biodegradable matter. It is the only true indicator of biologically degradable load. If toxic matter enters the plant, this can be detected as a decrease in the respiration rate, since the microorganisms slow down their activity or some of them die. A rapid decrease in the respiration rate may hence be used as a warning that toxic matter has entered the plant, and by-pass action may be taken to save the microorganisms (Lindberg, 1997).

For on-line purposes, it is crucial to be able to estimate both R(t) and KLa simultaneously. In this case, the control strategy is dual in the sense that the control signal is used both to control the dissolved oxygen (DO) concentration and to excite the DO dynamics sufficiently to allow the parameters to be estimated. That is, when the two algorithms (DO control and R(t)/KLa estimation) are combined, a conflict arises: the controller keeps the DO constant, whereas consistent R(t) and KLa estimation requires this quantity to vary. These contrasting requirements can be reconciled by adding a specific input such as a relay or pseudo random binary signal (PRBS) to the control signal (Holmberg et al., 1989; Marsili-Libelli and Vaggi, 1997). It is then suggested to stop the estimation as soon as convergence is obtained, which can be detected by low diagonal values in the covariance matrix of the RLS algorithm. The limitation of this procedure is that time varying parameters cannot be estimated unless the algorithm is periodically reinitialized. Moreover, under feedback control, conventional recursive estimation methods such as RLS always show estimation windup. Several remedies have been suggested and can indeed reduce the parameter drift, but they are not a systematic solution of the estimation windup problem in the adaptive controller.

To overcome the shortcomings of the previous methods, we propose a generalized damped least squares method as a fundamental solution of the estimation windup problem in the adaptive controller. An adaptive model-based DO control algorithm is then proposed, which can simultaneously estimate the two key parameters, oxygen transfer rate and respiration rate.

4.2 Theory

4.2.1 Generalized Damped Least Squares Algorithm

Numerous papers have proposed ways to avoid the estimation windup that often occurs in practical applications, including normalization of the regression vector, matrix regularization, constant trace with a variable forgetting factor, information measures for turning adaptation on and off, and limits on parameter variations and maximum values. Few of the previous methods solve the estimation windup problem fundamentally.

Two interesting papers have been reported. Lambert (1987) introduced a modification to classical RLS with an exponential forgetting factor. His basic idea is to weight the estimated parameter vector over only one sampling interval; it lacks a multi-step treatment of the estimated parameter vector. Tu (1990) suggested a regularized least squares algorithm with a smoothing constraint and self-adaptation of the regularization parameter. This method is suitable for ill-conditioned least squares problems, in which the regressor matrix has a high condition number. It was originally devised as a remedy for numerical stability, so it is not adequate for system identification and control. To overcome the shortcomings of the previous methods, we extend Lambert’s idea and propose a generalized damped least squares (GDLS) method.

Consider a dynamic system with input signal {u(t)} and output signal {y(t)}. Suppose that these signals are sampled in discrete time t=1, 2, 3, … and that the sampled values can be related through the Auto-Regressive with eXogenous input (ARX) model.

$$y(t) + a_1 y(t-1) + \cdots + a_n y(t-n) = b_1 u(t-1) + \cdots + b_m u(t-m) + e(t) \tag{4.1}$$

where y(t) and u(t) are deviation variables (incremental form) and e(t) is white noise. Model (4.1) describes the dynamic relationship between the input and output signals. Equation (4.1) can be rewritten as

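$$y(t) = \theta_1 y(t-1) + \cdots + \theta_n y(t-n) + \theta_{n+1} u(t-1) + \cdots + \theta_{n+m} u(t-m) + e(t) \tag{4.2}$$

$$\theta^T = [\,-a_1 \;\cdots\; -a_n \;\; b_1 \;\cdots\; b_m\,] \tag{4.3}$$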

To solve the estimation windup and drift problem in the closed-loop control system, we modify the normal least squares algorithm by adding an exponentially weighted penalty on parameter variation to the objective function of the least squares method. The objective function is as follows:

$$V(\theta) = \min_{\theta}\left[\, w\sum_{t=0}^{N}\lambda^{N-t}\varepsilon(t)^2 + \sum_{t=0}^{N-1}\delta^{N-t-1}\left\|\theta(N)-\theta(t)\right\|_2^2 \right] \tag{4.4}$$

where λ and δ are the exponential forgetting factors for the model error and the parameter variation, respectively, w is the weighting factor between the modeling error and the parameter variation, and ε(t) is the one step ahead prediction error. The problem is to obtain the parameter estimate θ that minimizes the quadratic objective function (4.4). Setting the gradient with respect to θ(N) to zero, this can be rewritten as follows.

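$$\left(w\Phi_{ls}(N) + \Phi_{add}(N)\right)\theta(N) = wY_{ls}(N) + Y_{add}(N) \tag{4.5}$$

$$\Phi_{ls}(N) = \begin{bmatrix}
\sum_{t=0}^{N}\lambda^{N-t}y(t-1)y(t-1) & \sum_{t=0}^{N}\lambda^{N-t}y(t-1)y(t-2) & \cdots & \sum_{t=0}^{N}\lambda^{N-t}y(t-1)u(t-m)\\
\sum_{t=0}^{N}\lambda^{N-t}y(t-2)y(t-1) & \sum_{t=0}^{N}\lambda^{N-t}y(t-2)y(t-2) & \cdots & \sum_{t=0}^{N}\lambda^{N-t}y(t-2)u(t-m)\\
\vdots & \vdots & & \vdots\\
\sum_{t=0}^{N}\lambda^{N-t}u(t-m)y(t-1) & \sum_{t=0}^{N}\lambda^{N-t}u(t-m)y(t-2) & \cdots & \sum_{t=0}^{N}\lambda^{N-t}u(t-m)u(t-m)
\end{bmatrix} \tag{4.6}$$

$$\Phi_{add}(N) = \sum_{t=0}^{N-1}\delta^{N-t-1}\begin{bmatrix}1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & 1\end{bmatrix} \tag{4.7}$$

$$\theta(N) = [\,\theta_1(N) \;\; \theta_2(N) \;\cdots\; \theta_{n+m}(N)\,]^T \tag{4.8}$$

$$Y_{ls}(N) = \left[\,\sum_{t=0}^{N}\lambda^{N-t}y(t)y(t-1) \;\;\; \sum_{t=0}^{N}\lambda^{N-t}y(t)y(t-2) \;\cdots\; \sum_{t=0}^{N}\lambda^{N-t}y(t)u(t-m)\,\right]^T \tag{4.9}$$

$$Y_{add}(N) = \left[\,\sum_{t=0}^{N-1}\delta^{N-t-1}\theta_1(t) \;\;\; \sum_{t=0}^{N-1}\delta^{N-t-1}\theta_2(t) \;\cdots\; \sum_{t=0}^{N-1}\delta^{N-t-1}\theta_{n+m}(t)\,\right]^T \tag{4.10}$$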

where Φls and Yls have the same meaning as in the least squares method, and the additional matrix Φadd and vector Yadd come from the penalty on parameter variation. The solution vector is then:

$$\theta(N) = \left(w\Phi_{ls}(N)+\Phi_{add}(N)\right)^{-1}\left(wY_{ls}(N)+Y_{add}(N)\right) \tag{4.11}$$

This equation is the solution of the batch GDLS algorithm. Unlike in the least squares method, the matrix wΦls(N) + Φadd(N) of the batch GDLS solution is invertible even when the signal is not sufficiently excited.
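The batch solution can be sketched in a few lines of plain Python. This is an illustrative implementation of the sums defining Φls, Φadd, Yls and Yadd, not the code used in this work; the function names, the list-based data layout and the small Gaussian elimination helper are assumptions, and the default values of w, λ and δ are those used later in the simulation study.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def gdls_batch(phis, ys, past_thetas, w=1000.0, lam=0.95, delta=0.95):
    """Batch GDLS estimate theta(N).

    phis[t], ys[t] : regressor vector and output for t = 0..N
    past_thetas[t] : earlier estimates theta(t) for t = 0..N-1
    """
    big_n = len(ys) - 1
    dim = len(phis[0])
    a = [[0.0] * dim for _ in range(dim)]    # w*Phi_ls + Phi_add
    b = [0.0] * dim                           # w*Y_ls + Y_add
    for t, (phi, y) in enumerate(zip(phis, ys)):
        wt = w * lam ** (big_n - t)           # weighted model error term
        for i in range(dim):
            b[i] += wt * phi[i] * y
            for j in range(dim):
                a[i][j] += wt * phi[i] * phi[j]
    for t, theta in enumerate(past_thetas):
        wt = delta ** (big_n - t - 1)         # parameter variation penalty
        for i in range(dim):
            a[i][i] += wt
            b[i] += wt * theta[i]
    return solve(a, b)


# No excitation at all: plain least squares would be singular here
print(gdls_batch([[0.0, 0.0]] * 4, [0.0] * 4, [[0.5, 1.0]] * 3))
```

In the unexcited example the data contribute nothing, and the estimate is simply the weighted average of the earlier estimates, so the system remains solvable.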

In adaptive control it is desirable to carry out the computations recursively, because the process data are obtained successively in real time. The following equations are required to derive the recursive robust estimation algorithm.

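$$\left[w\Phi_{ls}(N) + \Phi_{add}(N)\right]\theta(N) = wY_{ls}(N) + Y_{add}(N) \tag{4.12}$$

$$\Phi_{ls}(N) = \lambda\,\Phi_{ls}(N-1) + \begin{bmatrix}
y(N-1)y(N-1) & \cdots & y(N-1)u(N-m)\\
y(N-2)y(N-1) & \cdots & y(N-2)u(N-m)\\
\vdots & & \vdots\\
u(N-m)y(N-1) & \cdots & u(N-m)u(N-m)
\end{bmatrix} \tag{4.13}$$

$$\Phi_{add}(N) = \delta\,\Phi_{add}(N-1) + \begin{bmatrix}1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & 1\end{bmatrix} \tag{4.14}$$

$$Y_{ls}(N) = \lambda\,Y_{ls}(N-1) + [\,y(N)y(N-1) \;\;\; y(N)y(N-2) \;\cdots\; y(N)u(N-m)\,]^T \tag{4.15}$$

$$Y_{add}(N) = \delta\,Y_{add}(N-1) + \theta(N-1) \tag{4.16}$$

$$\theta(N) = \left(w\Phi_{ls}(N)+\Phi_{add}(N)\right)^{-1}\left(wY_{ls}(N)+Y_{add}(N)\right) \tag{4.17}$$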

We call Equation (4.12) the normal equation of the recursive GDLS algorithm. Equation (4.17) has a strong intuitive appeal: because the Φadd(N) term makes the matrix wΦls(N) + Φadd(N) invertible, the algorithm does not suffer from numerical ill conditioning even when y(t) and u(t) are zero, as in closed loop control or steady state data sets.
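A compact sketch of the recursive algorithm is given below. It is an illustration under stated assumptions rather than the implementation used in this work: the class and helper names are hypothetical, Φadd is stored as a scalar multiple of the identity and updated with the factor δ that follows from its definition as a δ-weighted sum, and the first-order test process (a = 0.5, b = 1.0, noise free, random input) is invented for the demonstration.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


class RecursiveGDLS:
    def __init__(self, theta0, w=1000.0, lam=0.95, delta=0.95):
        n = len(theta0)
        self.theta = list(theta0)
        self.w, self.lam, self.delta = w, lam, delta
        self.phi_ls = [[0.0] * n for _ in range(n)]
        self.y_ls = [0.0] * n
        self.c_add = 0.0              # Phi_add = c_add * I
        self.y_add = [0.0] * n

    def update(self, phi, y):
        n = len(self.theta)
        for i in range(n):
            self.y_ls[i] = self.lam * self.y_ls[i] + phi[i] * y
            self.y_add[i] = self.delta * self.y_add[i] + self.theta[i]
            for j in range(n):
                self.phi_ls[i][j] = self.lam * self.phi_ls[i][j] + phi[i] * phi[j]
        self.c_add = self.delta * self.c_add + 1.0
        a = [[self.w * self.phi_ls[i][j] + (self.c_add if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        b = [self.w * self.y_ls[i] + self.y_add[i] for i in range(n)]
        self.theta = solve(a, b)      # normal equation of recursive GDLS
        return self.theta


rng = random.Random(0)
est = RecursiveGDLS([0.0, 0.0])
a_true, b_true = 0.5, 1.0
y_prev = u_prev = 0.0
for _ in range(200):                  # exciting data
    y = a_true * y_prev + b_true * u_prev
    est.update([y_prev, u_prev], y)
    y_prev, u_prev = y, rng.uniform(-1.0, 1.0)
theta_excited = list(est.theta)
for _ in range(300):                  # steady state: y(t) = u(t) = 0
    est.update([0.0, 0.0], 0.0)
print(theta_excited, est.theta)
```

In this example the estimate obtained during the excited phase is retained through the long unexcited phase, instead of drifting as an RLS estimate with forgetting may do.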

4.2.2 Theoretical Analysis

In this section, we show that the proposed GDLS algorithm retains the theoretical properties of the least squares algorithm (unbiasedness, consistency, minimum variance and an exponential rate of convergence) and has more robust numerical properties.

Property 1. When the weight goes to infinity (w → ∞), the GDLS parameter estimate converges to that of the least squares method ($\lim_{w\to\infty}\hat{\theta}_{GDLS} = \hat{\theta}_{ls}$).

This may easily be verified. As w goes to infinity, the terms of the GDLS equation become

$$\lim_{w\to\infty}\left(w\Phi_{ls}+\Phi_{add}\right) = w\Phi_{ls}, \qquad \lim_{w\to\infty}\left(wY_{ls}+Y_{add}\right) = wY_{ls}$$

which gives the parameter estimate

$$\lim_{w\to\infty}\hat{\theta}_{GDLS} = \lim_{w\to\infty}\left(w\Phi_{ls}(N)+\Phi_{add}(N)\right)^{-1}\left(wY_{ls}(N)+Y_{add}(N)\right) = \left(w\Phi_{ls}(N)\right)^{-1}\left(wY_{ls}(N)\right) = \hat{\theta}_{ls} \tag{4.18}$$

Property 2. For a large number of observations (N → ∞), the GDLS objective function becomes the same as that of the least squares method with exponential forgetting ($\lim_{N\to\infty}V_{GDLS} = V_{ls}$).

For a large N, there exists an M (< N) such that δ^(N-M) ≈ 0. Assume stable convergence; then θ(N) = θ(N-1) = … = θ(N-m), the parameter variation penalty vanishes, and the objective function becomes

$$\lim_{N\to\infty}V_{GDLS}(\theta) = V_{ls}(\theta) = \min_{\theta}\, w\sum_{t=0}^{N}\lambda^{N-t}\varepsilon(t)^2 \tag{4.19}$$

The objective function is therefore equal to that of the least squares method, so the statistical properties of normal least squares estimation can be expected to hold asymptotically for a large number of observations in the proposed method.

Property 3. In the absence of exciting data, the present parameter estimate is equal to the previous estimate and is thus free from the covariance windup problem, even for steady state data sets or under closed loop control.

This can be verified by noting that the additional diagonal matrix from the parameter variation penalty acts as a bound and makes the singular matrix nonsingular. Under closed loop control or for steady state data sets, y(t) and u(t) become zero, and the matrices become

$$\Phi_{ls}(N) = \begin{bmatrix}0 & \cdots & 0\\ \vdots & & \vdots\\ 0 & \cdots & 0\end{bmatrix}, \qquad \Phi_{add}(N) = \frac{1}{1-\delta}\begin{bmatrix}1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \cdots & 1\end{bmatrix} = \frac{1}{1-\delta}I \tag{4.20}$$

$$Y_{ls}(N) = [\,0 \;\cdots\; 0\,]^T, \qquad Y_{add}(N) = \frac{1}{1-\delta}\,\theta(N-1) \tag{4.21}$$

$$\Phi_{aug}(N) = w\Phi_{ls}(N) + \Phi_{add}(N) = \Phi_{add}(N) \tag{4.22}$$

$$\lim_{y(t),\,u(t)\to 0}\hat{\theta}_{GDLS} = \left(\frac{1}{1-\delta}I\right)^{-1}\frac{1}{1-\delta}\,\hat{\theta}(N-1) = \hat{\theta}(N-1) \tag{4.23}$$

Thus the current solution is equal to the previous estimate $\hat{\theta}(N-1)$. The additional term keeps the augmented matrix nonsingular and avoids the blow-up associated with exponentially weighted least squares. Moreover, it is expected to reduce undesirably large variations in the estimated parameters caused by abrupt measurement noise.

Property 4. The proposed method has better numerical properties than the least squares method.

Estimation algorithms are normally implemented on digital computers, and hence there is the possibility of numerical ill conditioning. In an ill-posed problem, where $(\Phi_{ls}^T\Phi_{ls})^{-1}$ has a large condition number in normal least squares, the least squares solution may become very sensitive to perturbations in the data y(t). This implies that the norm of such a solution is significantly greater than the norm of the exact solution. Our algorithm overcomes this problem fundamentally by adding the constraint $\sum_{t=0}^{N-1}\delta^{N-t-1}\|\theta(N)-\theta(t)\|_2^2$ to the objective function. We can therefore solve the ill-posed inverse problem efficiently and do not suffer from numerical ill-conditioning for steady state data sets or under closed loop control. This property is the same as that of the regularized least squares method.

From this analysis, we conclude that the proposed GDLS algorithm keeps the theoretical properties of least squares, unlike other modified least squares methods. Other modifications, such as covariance resetting, alter the geometry and convergence properties of true least squares. GDLS thus gains supplementary robustness while retaining the desirable theoretical properties of least squares. In addition, the GDLS method can also be combined with the merits of other variants, e.g. low pass filtering, conditional updating and a variable forgetting factor.

4.2.3 Soft Sensor of Oxygen Transfer Rate and Respiration Rate

Two general approaches to the estimation of the respiration rate (“soft sensors”) have been developed in recent years. The first approach is based on a respirometer, which estimates the respiration rate from the DO mass balance in a respiration chamber (Spanjers et al., 1998; Olsson and Newell, 1999). The second approach is to estimate the respiration rate directly from the DO sensor and airflow rate measurements in the aeration basin. In this research, the latter approach is used.

There are several different approaches to the estimation of R(t) and KLa based simply on measurements of the DO sensor and airflow rate in the real aeration basin. Holmberg et al. (1989) estimated a linear oxygen transfer rate model recursively; the excitation of the process was increased by invoking a small relay. Carlsson (1993) developed a novel approach to estimate the respiration rate with a constrained piecewise linear model. Lindberg (1997) developed a nonlinear controller using the oxygen transfer rate estimated during the identification step and presented a systematic estimation method for the oxygen transfer rate and respiration rate. Marsili-Libelli and Vaggi (1997) summarized various estimation methods for respirometric activities in the bioprocess. Holmberg et al. (1989) and Marsili-Libelli and Vaggi (1997) described a simultaneous estimation scheme for KLa and R(t) based on the conventional RLS method, taking advantage of the differing time scales of the two variables. We select this approach for estimating the respiration rate and oxygen transfer rate.

In this approach, the dissolved oxygen deficit (D) is introduced into the mass balance equation.

$$\frac{dy(t)}{dt} = \frac{Q(t)}{V}\left(y_{in}(t)-y(t)\right) + K_La(u(t))\left(y_{sat}-y(t)\right) - R(t) \tag{4.24}$$

$$D(t) = y_{sat} - y(t) \tag{4.25}$$

where y(t) is the DO concentration in the aeration basin, yin(t) is the DO concentration of the input flow, ysat is the saturated value of the DO concentration, Q(t) is the influent wastewater flow rate, V is the volume of the aerator, KLa(u(t)) is the oxygen transfer rate, u(t) is the airflow rate into the aeration tank from the air production system, and R(t) is the respiration rate. Because KLa and R(t) vary, the DO dynamics show time varying behavior.

The discrete-time equation with sampling time h is

$$D(t+h) = e^{-K_La\,h}D(t) + \frac{1}{K_La}\left(1-e^{-K_La\,h}\right)R(t) + \frac{1}{K_La}\left(1-e^{-K_La\,h}\right)\frac{Q}{V}(1+r)\,y(t) \tag{4.26}$$

After some manipulations, equation (4.26) can be put in the standard estimation form.

$$z(t+h) = y(t+h) - \left[1 - \frac{hQ}{V}(1+r)\right]y(t) = K_La(u(t))\,D(t)\,h - R(t)\,h \tag{4.27}$$

where z(t+h) is the updating information at each sampling instant. The oxygen transfer rate KLa(u(t)) can be structured as a function of the airflow rate.

$$K_La(u(t)) = K_0 + K_1U_{air} \cong K_aU_{air} \tag{4.28}$$

It is easily seen that this model can be written as

$$\hat{z}(t+h) = \varphi^T(t)\,\theta(t) \tag{4.29}$$

where φT(t) = [D(t)hUair  −h] is the regressor vector, θ(t) = [Ka  R(t)]T is the parameter vector and $e(t) = z(t+h) - \hat{z}(t+h)$ is the error.
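The construction of the updating information and the regressor can be sketched as follows. The function names are hypothetical, `r` denotes the factor that appears as (1 + r) in the estimation form and is treated here as a known constant, and the numerical values in the usage example (Ka = 0.0018, R = 40 mg/l/h, two airflow levels) are invented to show that two noise-free samples determine both parameters.

```python
def soft_sensor_sample(y_next, y_now, u_air, q, v, r, y_sat, h):
    """Return z(t+h) and the regressor phi(t) of the linear estimation form."""
    z = y_next - (1.0 - h * q / v * (1.0 + r)) * y_now
    d = y_sat - y_now                  # oxygen deficit D(t)
    phi = [d * h * u_air, -h]          # multiplies theta = [Ka, R(t)]
    return z, phi


def solve_two_samples(s1, s2):
    """Recover [Ka, R] from two (z, phi) samples with Cramer's rule."""
    (z1, p1), (z2, p2) = s1, s2
    det = p1[0] * p2[1] - p1[1] * p2[0]
    ka = (z1 * p2[1] - p1[1] * z2) / det
    resp = (p1[0] * z2 - z1 * p2[0]) / det
    return ka, resp


# Noise-free data generated from assumed "true" values
ka_true, r_true = 0.0018, 40.0
h, q, v, r, y_sat = 30.0 / 3600.0, 300.0, 1000.0, 0.0, 8.0
samples = []
for y_now, u_air in ((2.0, 3000.0), (2.5, 5000.0)):
    z_true = ka_true * u_air * (y_sat - y_now) * h - r_true * h
    y_next = z_true + (1.0 - h * q / v * (1.0 + r)) * y_now
    samples.append(soft_sensor_sample(y_next, y_now, u_air, q, v, r, y_sat, h))
print(solve_two_samples(samples[0], samples[1]))
```

The two samples must differ in the product D(t)Uair, otherwise the 2×2 system is singular; this is the excitation requirement discussed in this chapter. In practice the (z, φ) pairs would be fed to a recursive estimator rather than solved pairwise.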

From the on-line measurements, a soft sensor of KLa(u) and R(t) is designed as a recursive state estimator that uses the influent flow rate, airflow rate and DO measurements. The estimated parameters θ(t) can be updated with the RLS method, but they then always suffer from the estimation windup problem under feedback control. In this research, we use the GDLS method as the estimation algorithm because the process is operated under closed-loop control. A schematic of the robust estimator is shown in Figure 4.1.

4.2.4 Adaptive Model-based DO Control

Despite the relatively simple dynamics of the DO process, DO control may not be satisfactory to the operator of a biological treatment process, since the DO process dynamics are time varying. This means that high control and estimation performance over all operating conditions may be hard to achieve with a conventional method.

After estimating the two key variables, we can design an adaptive model-based DO controller in order to correct the process/model mismatch and to estimate the unmeasured state variables. Using the available process input and output measurements, the model-based controller adaptively updates the estimated parameters θ. In this research, the GDLS method is used to avoid the estimation windup problem under closed-loop control. The updated model is then used by an adaptive generic model control (AGMC) law to compute the control input. In AGMC, nonlinear process models can be embedded into the controller directly without any linearization. AGMC is a very simple and robust nonlinear control algorithm for single-input single-output (SISO) processes. Although the control parameters are constant, the updated model can compensate for the process/model mismatch because of its adaptive and model-based characteristics (Lee and Sullivan, 1988; Signal and Lee, 1992).

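The proposed method in the DO control is as follows.

DO process:

$$\frac{dy(t)}{dt} = \frac{Q(t)}{V}\left(y_{in}(t)-y(t)\right) + K_La(u(t))\left(y_{sat}-y(t)\right) - R(t) \tag{4.30}$$

DO model:

$$\frac{dy(t)}{dt} = \frac{Q(t)}{V}\left(y_{in}(t)-y(t)\right) + \hat{K}_La(u(t))\left(y_{sat}-y(t)\right) - \hat{R}(t) \tag{4.31}$$

Desired trajectory:

$$\left(\frac{dy(t)}{dt}\right)_{desired} = K_1\left(y_s-y(t)\right) + K_2\int_0^t\left(y_s-y(t)\right)dt = PI \tag{4.32}$$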

Here, the desired trajectory is specified as a Proportional-Integral (PI) law without a state observer. $\hat{K}_La$ and $\hat{R}(t)$ are estimated by the GDLS estimation algorithm described previously. Using equations (4.31) and (4.32), we can derive the following equation in order to obtain the control input u(t).

$$\hat{K}_La(u(t)) = \frac{PI + \hat{R}(t) + \dfrac{Q(t)}{V}\,y(t) - \dfrac{Q(t)}{V}\,y_{in}(t)}{y_{sat}-y(t)} \tag{4.33}$$

Using the linear model ($\hat{K}_La(u(t)) = \hat{k}_1u(t)$), u(t) can be easily obtained by

$$u(t) = \frac{PI + \hat{R}(t) - \dfrac{Q}{V}\left(y_{in}(t)-y(t)\right)}{\hat{k}_1\left(y_{sat}-y(t)\right)} \tag{4.34}$$

with the constraint $u_{min} \le u(t) \le u_{max}$. In equation (4.34), u(t) is given explicitly as a function of the estimated values and the measured process values. The control input u(t) has a nonlinear gain, and the numerator terms other than PI are the sum of a steady state bias term and a feed-forward compensation of the respiration rate. Because the estimated parameters of the oxygen transfer rate KLa and respiration rate R(t) are updated under the proposed controller, the control action is easily computed from equation (4.34). Figure 4.2 shows the structure of the model-based DO control scheme.
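The control law can be sketched as a small class. This is an illustrative sketch, not the controller code used in this work: the class and argument names are hypothetical, the integral is approximated by a rectangular sum, the guard against a vanishing oxygen deficit is an added assumption, and the numerical values in the usage example are taken from the simulation settings of this chapter except for the respiration rate estimate, which is invented.

```python
class AdaptiveGmcDoController:
    """Model-based DO control law with a PI reference trajectory."""

    def __init__(self, k1, k2, y_sat, u_min, u_max, dt):
        self.k1, self.k2 = k1, k2
        self.y_sat, self.u_min, self.u_max, self.dt = y_sat, u_min, u_max, dt
        self.integral = 0.0

    def control(self, y, y_set, y_in, q, v, r_hat, ka_hat):
        err = y_set - y
        self.integral += err * self.dt                 # rectangular integration
        pi = self.k1 * err + self.k2 * self.integral   # desired dy/dt
        deficit = max(self.y_sat - y, 1e-6)            # avoid division by zero
        u = (pi + r_hat - q / v * (y_in - y)) / (ka_hat * deficit)
        return min(self.u_max, max(self.u_min, u))     # actuator constraint


ctrl = AdaptiveGmcDoController(k1=9.5, k2=47.5, y_sat=8.0,
                               u_min=10.0, u_max=10000.0, dt=30.0 / 3600.0)
u = ctrl.control(y=2.0, y_set=2.0, y_in=1.0, q=300.0, v=1000.0,
                 r_hat=40.0, ka_hat=0.0018)
print(u)
```

When the DO is at its set point and the estimates are exact, the computed airflow exactly balances the oxygen uptake and the flow term, so the modeled DO derivative is zero; the estimated respiration rate thus acts as a feed-forward term.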

This model-based DO controller shows no offset due to modeling error, since the external PI term provides integral action in the structure itself. In the ideal case, if the estimated values are equal to the true ones, $\hat{K}_La(t) = K_La(t)$ and $\hat{R}(t) = R(t)$, the model-based control algorithm is offset free. Combining the DO dynamics (4.30) with the control input (4.34) gives the following error equation.

$$\frac{dy(t)}{dt} = K_1e(t) + K_2\int_0^t e(t)\,dt = PI \tag{4.35}$$

The error signal e(t) therefore approaches zero exponentially. Moreover, the controller is particularly robust to the main disturbance of the DO process, the respiration rate, because it contains the estimated respiration rate and oxygen transfer rate in its structure.

Adaptive model-based DO control with the proposed GDLS algorithm does not require any specific estimation phase and achieves both estimation of the respiration rate and DO control. The estimated respiration rate gives information about the biological activity and can be used as a monitoring index. Figure 4.3 shows the proposed procedure for the soft sensor and the model-based DO control structure.

4.3 Simulation Study

In this section, we examine the performance of the GDLS algorithm. First, we apply the GDLS method to the estimation of a first-order system under closed loop control and discuss the simultaneous estimation and control problem. Second, the GDLS method is applied to the DO process dynamics. The examples illustrate what happens when RLS is used and the improvements obtained with the proposed algorithm.

First-Order System under Closed Loop Control

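The following first-order process is simulated and controlled by a proportional control law. Throughout this section, data are generated by

$$\Delta y(t) = a\,\Delta y(t-1) + b\,\Delta u(t-1) \tag{4.36}$$

where

$$a = \begin{cases} -0.1 & \forall t \in (1, 750)\\ -0.9 & \forall t \in (750, 2000)\\ 0.5 & \forall t \in (2000, 3000)\end{cases} \qquad b = \begin{cases} 1.0 & \forall t \in (1, 750)\\ 0.5 & \forall t \in (750, 2000)\\ -0.1 & \forall t \in (2000, 3000)\end{cases}$$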

where y(t) is the process output and u(t) is the control input. During the simulation, the two parameters (a, b) were changed twice. The signal to noise ratio changes in the same way. Notice that for t > 2000 the sign of the steady state gain changes, which is a very large change in the system. The process is excited by a proportional controller of the form u(t)=0.15(ys(t)-y(t)). We generated a random set point every 10 sampling intervals during t < 1000 and a fixed set point of 1.0 during 1000 < t < 3000 for the closed loop control. The corresponding input and output responses are shown in Figure 4.4. The input and output sequences generated for 3000 sample intervals are used to compare the RLS and GDLS algorithms on the two-parameter estimation problem (a, b) under closed loop control. The RLS initial conditions (λ = 0.95, θ(0) = 0, P(0) = 10^9) were used in the simulation runs. The values used in the simulation are in deviation form.

The RLS algorithm with deadband update was used to investigate the windup of the estimated parameters under closed loop control. Figure 4.5 shows that continued identification during periods of low excitation leads to parameter drift and bad estimates. During the random set point changes (0, 1000), the estimated parameters are reasonably accurate and bounded. In the absence of any input excitation in the interval (1000, 3000), however, the estimated parameters depart from the real values. The process noise and the insufficient exciting signal then cause the parameter estimates (e.g. $\hat{a}(t)$ and $\hat{b}(t)$) to drift. This increases the probability of bursting and degrades the response to subsequent set point changes. In closed loop, the P matrix grows without bound whenever the system excitation is insufficient. Figure 4.6 shows the trace of P(t) under closed loop control. The covariance matrix blows up and the trace of P(t) increases rapidly in the absence of any input excitation in the interval (1000, 3000). The blowup usually occurs between set point changes, unmeasured disturbances and measurement noise.

In order to compare the performance of variants of the RLS algorithm, the previous experiment was repeated for the constant-trace algorithm. This scheme scales the covariance matrix in such a way that its trace is constant. An additional refinement is to also add a small unit matrix. This gives the so-called regularized constant-trace algorithm (Åström, 1995).

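$$\begin{aligned}
\hat{\theta}(t) &= \hat{\theta}(t-1) + K(t)\left(y(t) - \varphi^T(t)\hat{\theta}(t-1)\right)\\
K(t) &= P(t-1)\varphi(t)\left(\lambda I + \varphi^T(t)P(t-1)\varphi(t)\right)^{-1}\\
\bar{P}(t) &= \frac{1}{\lambda}\left(P(t-1) - \frac{P(t-1)\varphi(t)\varphi^T(t)P(t-1)}{1 + \varphi^T(t)P(t-1)\varphi(t)}\right)\\
P(t) &= c_1\frac{\bar{P}(t)}{\mathrm{tr}\left(\bar{P}(t)\right)} + c_2 I
\end{aligned} \tag{4.37}$$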

where c1 > 0 and c2 ≥ 0. The result for constant-trace RLS combined with conditional updating is shown in Figure 4.7. We used the following parameters: c1 is 100, c2 is 1 and the estimation deadband is 0.1. The estimate of the input coefficient, $\hat{b}(t)$, is comparatively accurate, while the estimate of the output coefficient, $\hat{a}(t)$, cannot track its correct value and shows slow and poor estimation once excitation is removed for t > 1000.
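One update of the regularized constant-trace algorithm can be sketched as follows. The function name and list-based matrices are assumptions made for illustration, the output is taken to be scalar so that the gain denominator is a scalar, and the conditional (deadband) updating used in the simulation is not included.

```python
def constant_trace_rls_step(theta, p, phi, y, lam=0.95, c1=100.0, c2=1.0):
    """One step of regularized constant-trace RLS for a scalar output."""
    n = len(theta)
    p_phi = [sum(p[i][j] * phi[j] for j in range(n)) for i in range(n)]
    quad = sum(phi[i] * p_phi[i] for i in range(n))        # phi^T P phi
    gain = [v / (lam + quad) for v in p_phi]                # K(t)
    err = y - sum(phi[i] * theta[i] for i in range(n))      # prediction error
    theta_new = [theta[i] + gain[i] * err for i in range(n)]
    # intermediate covariance (P is symmetric, so P phi phi^T P = (P phi)(P phi)^T)
    p_bar = [[(p[i][j] - p_phi[i] * p_phi[j] / (1.0 + quad)) / lam
              for j in range(n)] for i in range(n)]
    tr = sum(p_bar[i][i] for i in range(n))
    # rescale to trace c1 and add the small unit matrix c2*I
    p_new = [[c1 * p_bar[i][j] / tr + (c2 if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    return theta_new, p_new


theta, p = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
for _ in range(50):                       # no excitation: trace stays bounded
    theta, p = constant_trace_rls_step(theta, p, [0.0, 0.0], 0.0)
print(p[0][0] + p[1][1])
theta, p = constant_trace_rls_step(theta, p, [1.0, 2.0], 3.0)
print(theta, p[0][0] + p[1][1])
```

After every step the trace of P equals c1 plus c2 times the number of parameters, so the covariance cannot blow up. The estimator gain therefore stays high during unexcited periods, which is one possible reason for the poor tracking noted above.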

We applied the proposed GDLS to the same process data for comparison with RLS. The following conditions are used in the GDLS simulation (λ = 0.95, δ = 0.95, w = 1000, θ(0) = 0). Through many simulations, we found that a weighting factor w of around 1000 was adequate under closed loop control, that the forgetting factor of least squares, λ, could take values between 0.9 and 1.0, and that the forgetting factor of parameter variation, δ, could take values between 0.0 and 1.0. Other values can be selected for other processes. Figure 4.8 shows the identification result of GDLS. Despite alternating periods of exciting and non-exciting data, the result shows good estimation performance and indicates that the estimated parameters closely track the real values under closed loop control. Even when the process gain changes, GDLS correctly tracks the parameter variations.

DO Process

In the simulations, the following DO process is setup.

)())())((())()(()()( tRtyytuaKtytyV

tQdt


where Q(t) = 200-500 l/h, V = 1000 l, ysat = 8 mg/l and yin(t) = 1 mg/l. R(t) follows
the diurnal variation of a WWTP, RSS is the steady-state respiration rate, and
KLa(u(t)) = 0.0018(1+(R(t)-RSS)/500)Uair h^-1. This configuration matches our
experimental conditions. The sampling time is 30 seconds. To resemble the real
process, we added zero-mean white measurement noise with a magnitude of 10% of
the process output. We also consider the time delay that always exists in a real
biological treatment process; the time delay of the DO dynamics is twice the
sampling time. The estimation algorithms were set up as follows. The RLS initial
conditions (λ = 0.99, θ(0) = θ0, P(0) = 10^6) were used in the simulation runs. The
parameters of GDLS are w = 1000, λ = 0.95 and δ = 0.95.
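To make the simulated DO balance concrete, the sketch below integrates it with a forward-Euler step equal to the 30-second sampling time. It is a simplified illustration under stated assumptions: Q, R and Uair are held constant (Q = 300 l/h lies within the quoted 200-500 l/h range, while R = RSS = 30 mg/l/h and Uair = 3000 l/h are values chosen only for this example), and measurement noise and time delay are omitted. All function names are hypothetical.

```python
def kla(u_air, r, r_ss):
    """Oxygen transfer coefficient [1/h], using the expression given in the text."""
    return 0.0018 * (1.0 + (r - r_ss) / 500.0) * u_air

def simulate_do(y0, u_air, r, r_ss, q=300.0, v=1000.0, y_sat=8.0, y_in=1.0,
                dt_h=30.0 / 3600.0, n_steps=2880):
    """Forward-Euler integration of the DO balance with constant inputs.

    dt_h is the 30 s sampling time in hours; 2880 steps cover 24 h.
    """
    y = y0
    for _ in range(n_steps):
        dydt = q / v * (y_in - y) + kla(u_air, r, r_ss) * (y_sat - y) - r
        y += dt_h * dydt
    return y

def steady_state_do(u_air, r, r_ss, q=300.0, v=1000.0, y_sat=8.0, y_in=1.0):
    """DO level at which dy/dt = 0 for constant inputs."""
    k = kla(u_air, r, r_ss)
    return (q / v * y_in + k * y_sat - r) / (q / v + k)

print(simulate_do(0.0, 3000.0, 30.0, 30.0))   # settles near 13.5 / 5.7 mg/l
print(steady_state_do(3000.0, 30.0, 30.0))
```

With these example values KLa is 5.4 h^-1, so the dominant time constant is roughly 1/5.7 h (about 10 minutes), which is long compared with the 30-second sampling time.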

Figure 4.9 compares the estimation results of the RLS and GDLS methods under
PID control. The basic RLS algorithm is used with a deadband update of 0.05. In the
initial phase, the RLS parameter estimates are reasonably accurate and bounded.
However, when good feedback control removes the input excitation, the estimated
parameters drift away from the real values and diverge after the set point change of
the DO controller. This occurs because the covariance matrix of RLS grows without
bound (estimation windup) whenever the system excitation is insufficient during
closed-loop control. Figure 4.9 shows that continued identification during periods of
low excitation leads to parameter drift and poor estimates. In contrast, the GDLS
method shows good estimation performance in spite of feedback control. Note that
these estimation results were obtained under feedback control with a weakly
exciting signal.
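The windup mechanism described above can be reproduced with a scalar example. The sketch below propagates only the covariance of standard RLS with exponential forgetting (not GDLS); the function name and the excitation pattern are assumptions made for illustration. When the regressor is zero, the update reduces to P(t) = P(t-1)/λ, so P grows geometrically.

```python
def scalar_rls_covariance(p0, phis, lam=0.99):
    """Propagate the scalar RLS covariance P(t) for a sequence of regressors."""
    p = p0
    history = []
    for phi in phis:
        # Standard RLS covariance update with forgetting factor lam
        p = (p - p * phi * phi * p / (lam + phi * phi * p)) / lam
        history.append(p)
    return history

excited = [1.0] * 500      # persistently exciting regressor
idle = [0.0] * 1000        # no excitation, e.g. tight feedback control
hist = scalar_rls_covariance(1.0, excited + idle)
print(hist[499], hist[-1])  # P is small after excitation, then grows by 1/lam per step
```

With λ = 0.99, one thousand unexcited samples multiply P by 0.99^-1000, a factor of more than 10^4, which is why a deadband or a constant-trace modification is normally added.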

Based on the GDLS estimation, the adaptive model-based DO controller of equation
(4.34) is used with the constraint 10 ≤ u(t) ≤ 10,000. The GMC tuning parameters
are set from Lee’s reference trajectory shape (Lee and Sullivan, 1988), giving
K1 = 9.50 and K2 = 47.5. In spite of the time-varying R(t) and KLa, the proposed
controller shows good control performance in Figure 4.10. The PID controller,
however, shows some offset because the DO dynamics are subject to a continuously
time-varying influent load and respiration rate. In the adaptive model-based control,
by contrast, the influent load and respiration rate are compensated by adaptive and
feed-forward action, and the oxygen transfer rate is compensated by the nonlinear
gain. This means that the adaptive model-based DO control can cope with changes in
operating conditions such as varying load, respiration rate and other process
changes. Such dynamic variations occur frequently in WWTP. In addition, the
respiration rate estimated under closed-loop control (soft sensor) gives information
about the biological activity and can be used as a monitoring index.

4.4 Conclusions

In this research, we propose a simple and systematic estimation method for the
estimation windup problem. On the basis of analysis, we concluded that GDLS has
the same properties as the least squares algorithm together with more robust
numerical properties. Simulation examples show that the GDLS method retains its
ability to track process parameters under closed-loop control. Based on this robust
estimation performance, the model-based DO controller can efficiently cope with the
time-varying characteristics and operating condition changes that occur frequently
in WWTP.

Figure 4.1 Robust estimator of KLa(u(t)) and R(t) with a GDLS method
[Block diagram: GDLS estimator block with signals D(t), Uair, R(t) and KLa]

Figure 4.2 Adaptive model-based DO control strategy with a GDLS algorithm
[Block diagram: set point ys, AGMC controller, control input u(t), DO dynamics, output y(t), and GDLS soft sensor providing the estimates of R(t) and KLa(t)]


Soft sensor and model-based DO control in WWTP

Choose AGMC trajectory parameter (K1, K2)

Measure the process input/output values (u(t), y(t))

Soft sensing of KL a(u) and R(t) by GDLS algorithm

Calculate the adaptive model-based DO control input

Figure 4.3 Flow chart of soft sensor and the model-based DO control algorithm

Figure 4.4 Process output and input during the simulation (a) process output (b) control input
[Both panels are plotted against sampling intervals, 0 to 3000]

Figure 4.5 Parameter estimates $\hat{a}(t)$ and $\hat{b}(t)$ of RLS with exponential data weighting
[Two panels: real and RLS-estimated a and b versus sampling intervals, 0 to 3000]

Figure 4.6 Trace of P with RLS estimation method
[Trace of P versus sampling intervals, 0 to 3000]

Figure 4.7 Parameter estimates $\hat{a}(t)$ and $\hat{b}(t)$ of RLS with constant trace
[Two panels: real and RLS-estimated a and b versus sampling intervals, 0 to 3000]

Figure 4.8 Parameter estimates $\hat{a}(t)$ and $\hat{b}(t)$ with GDLS
[Two panels: real and GDLS-estimated a and b versus sampling intervals, 0 to 3000]

Figure 4.9 Estimation comparisons of RLS and GDLS method under PID control
[Two panels of respiration rate [mg/l/h] versus time [h]: (a) R(t) and Re(t) of GDLS, (b) R(t) and Re(t) of RLS]

Figure 4.10 Model-based DO control result with a GDLS method (a) process output, (b) control input
[Panel (a): DO [mg/l] versus time [h] for the set point, PID and proposed controller; panel (b): Uair [l/h] versus time [h] for PID and proposed controller]

V. Disturbance Detection and Isolation in WWTP

5.1 Introduction

The increase in environmental restrictions in recent times has led to an increase

in efforts aimed at attaining higher effluent quality from WWTP. Achieving this

goal requires the advanced monitoring of plant performance. Most of the changes in

biological WWTP are slow when the process is recovering from a ‘bad’ state to

a ‘normal’ state. The early detection and isolation of faults in the biological process

are very effective because they allow corrective action to be taken well before the

situation becomes dangerous. Some changes are not very obvious and may gradually

grow until they become a serious operational problem. The discrimination between

serious and minor anomalies is of primary concern in the monitoring of these

processes. To make this distinction, a reliable procedure for the detection and

isolation of disturbances is needed.

In the case of the activated sludge process (ASP), multivariate statistical

process control (MSPC) has been developed to extract useful information from

process data and utilize it for monitoring and detection (Krofta et al., 1995; Rosen

and Olsson, 1998; Olsson and Newell, 1999; Teppola, 1999). Krofta et al. (1995)

applied MSPC techniques to fault detection in dissolved air flotation. Rosen and

Olsson (1998) adapted multivariate statistics based methods to the wastewater

treatment monitoring system using simulated and real process data. Teppola (1999)

used an approach that combined multivariate techniques, fuzzy clustering and multi-

resolution analysis for wastewater data monitoring.

However, MSPC has fundamental weaknesses as a method for monitoring the

ASP. These problems arise because the biological nutrient removal process changes

gradually over time, making the process non-stationary. Thus, the ASP hardly ever

operates normally for long periods, and the non-stationary nature of the process

causes the definition of normality to change over time. One shortcoming of MSPC is

that it cannot detect the change in correlation among process variables when

Hotelling’s T2 and sum of squared prediction error (SPE) are inside the control limit.

Hence, MSPC is not suited for monitoring non-stationary processes because it

assumes stationary data. This is a problem for developing statistical control charts,

as they should be developed from a set of normal operating data. Several methods

have been suggested to model the non-stationary nature of the WWTP (Bakshi,

1998; Rosen and Lennox, 2000). One method that has shown potential for treating

non-stationary processes is the use of adaptive algorithms for MSPC (Rosen and

Lennox, 2000). Another proposed method employs a multiscale model through the

use of wavelet transforms (Bakshi, 1998; Teppola, 1999; Rosen and Lennox, 2000).

Another important issue in process analysis is the ability to diagnose the source

of abnormal behavior. Chemometric methods such as principal component analysis

(PCA) have been utilized for merging detection with the diagnosis of the causes of

abnormal situations (Ku et al., 1995; Kano et al., 2000a,b). Ku et al. (1995)

proposed a diagnostic method in which the out-of-control observation was compared

to PCA models for known disturbances. Using refinements of statistical disturbances,

discriminant analysis was then used to select the most likely causes of the current

out-of-control condition. Kano et al. (2000a, b) proposed a new statistical process-

monitoring algorithm. This method is based on the idea that a change of operating

condition can be detected by monitoring the distribution of time-series process data,

because this distribution reflects the corresponding operating condition. They did

not, however, consider the individual contributions of each transformed constituent

in the normalization of the dissimilarity index.

In this research, we propose a modified dissimilarity measure and a disturbance
detection method for successive data sets. Using eigenvalue monitoring, the
proposed method can also detect a disturbance and isolate the scale on which it
occurs.

5.2 Theory

5.2.1 Modified Dissimilarity Measure

The dissimilarity measure that has been traditionally used is based on the

Karhunen-Loeve (KL) expansion and is identical to the PCA. This measure

compares the covariance structures of two data sets and represents the degree of

dissimilarity between them. In the computational procedure, the variance of a

transformed data vector is normalized by its corresponding eigenvalue. The

dissimilarity measure therefore considers not the absolute magnitude but the relative

magnitude of the variance change, and neglects the importance of each transformed

variable. We suggest a modified dissimilarity measure that considers the importance

of each transformed variable. Using this modified measure we propose the fault

detection and isolation (FDI) technique. This technique is divided into two major

steps: a training phase using an historical data set representing the process in normal

operation, and the on-line monitoring and isolation using the test data set.

The modified dissimilarity measure algorithm is as follows. First, the data

window size and step size are determined, where the window size is the number of

samples in each data set and the step size is the monitoring interval. Second, two

successive data sets are selected and normalized with the sample mean and sample

variance (Xi, i = 1, 2). Figure 5.1 represents the concept of window and step size

using a moving window. Third, the sample covariance matrix is found and singular

value decomposition (SVD) is applied to it. The algebraic representation of these

steps is

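$$S = \frac{1}{N-1}\begin{bmatrix} X_1 \\ X_2 \end{bmatrix}^T \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} = \frac{N_1-1}{N-1}S_1 + \frac{N_2-1}{N-1}S_2 \tag{5.1}$$

$$S_i = \frac{1}{N_i-1}X_i^T X_i, \quad i = 1, 2 \tag{5.2}$$

$$SP = P\Lambda \tag{5.3}$$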

where P is the loading matrix and Λ is the diagonal matrix. In this procedure, input
variables are transformed into orthogonal variables (transformation Xi to Yi).

$$Y_i = \sqrt{\frac{N_i-1}{N-1}}\,T_i \quad (T_i = X_i P) \tag{5.4}$$

Fourth, the sample covariance matrices Ri of the two transformed data sets (Y1 and
Y2) are found and SVD is applied to each Ri.

$$R_1 + R_2 = \Lambda \tag{5.5}$$

$$R_i q_j^i = \lambda_j^i q_j^i, \quad i = 1, 2 \ \text{and} \ j = 1, \ldots, r \tag{5.6}$$

($q_j^i$: loading vector, $\lambda_j^i$: eigenvalue, r: dimension of data in PCA). Combining
equations (5.5) and (5.6) and using some algebra, we obtain

$$R_1 q_j^1 = \lambda_j^1 q_j^1 \quad \text{and} \quad R_2 q_j^1 = \left(\Lambda_{jj} - \lambda_j^1\right) q_j^1 \tag{5.7}$$

This result shows that the two sample covariance structures share eigenvectors,
whose eigenvalues satisfy

$$\lambda_j^1 + \lambda_j^2 = \Lambda_{jj} \tag{5.8}$$

where $\lambda_j^i$ is the jth eigenvalue in the ith data set and $\Lambda_{jj}$ is the jth eigenvalue in
the total data set. The more similar the two data sets are, the closer their eigenvalues
are to $0.5\Lambda_{jj}$. In general, the first few principal components, r, explain most of
the variation of the data. Next, the modified dissimilarity index D is found,

$$D = \frac{4\sum_{j=1}^{r}\left(\lambda_j - \Lambda_{jj}/2\right)^2}{\sum_{j=1}^{r}\Lambda_{jj}^2} \tag{5.9}$$

D has a value between 0 and 1. The more similar the two data sets are, the closer D
is to 0; the more dissimilar they are, the closer D is to 1. Finally, the (1-α)100%
confidence interval of each eigenvalue is determined. For many samples, it is
reasonable to assume that each eigenvalue is a normal random variable by the
central limit theorem. For samples obtained from normal operation, the interval
containing 99% of the eigenvalues calculated above is obtained by

$$\bar{\lambda}_j^i - t(1-\alpha/2;\, N-1)\, s\{\lambda_j^i\} \le \lambda_j^i \le \bar{\lambda}_j^i + t(1-\alpha/2;\, N-1)\, s\{\lambda_j^i\} \tag{5.10}$$

where $\bar{\lambda}_j^i$ is the sample mean, $s\{\lambda_j^i\}$ is the sample standard deviation, and α is
0.01 (99% confidence). That is, (1-α)100% of the $\lambda_j^i$ values lie within these limits
and the remainder lie outside them (Johnson and Wichern, 1992).
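The computation in equations (5.1) to (5.9) can be sketched in Python with NumPy as follows. This is a minimal sketch, not the implementation used in this work: the function name is hypothetical, the two inputs are assumed to be already scaled, and the eigenvalues of R1 are paired with the Λjj by rank (largest with largest), which is an assumption made here because the text does not state the pairing explicitly.

```python
import numpy as np

def modified_dissimilarity(X1, X2, r=None):
    """Modified dissimilarity index D of equation (5.9) for two scaled data sets.

    X1, X2: arrays of shape (N1, m) and (N2, m); r: number of components kept.
    """
    N1, N2 = X1.shape[0], X2.shape[0]
    N = N1 + N2
    # Covariance S of the stacked data, equation (5.1)
    X = np.vstack([X1, X2])
    S = X.T @ X / (N - 1)
    # Eigen-decomposition S P = P Lambda, sorted by decreasing eigenvalue (5.3)
    Lam, P = np.linalg.eigh(S)
    order = np.argsort(Lam)[::-1]
    Lam, P = Lam[order], P[:, order]
    # Transformed data Y1 and its covariance R1, equations (5.4) and (5.5)
    Y1 = np.sqrt((N1 - 1) / (N - 1)) * (X1 @ P)
    R1 = Y1.T @ Y1 / (N1 - 1)
    # Eigenvalues of R1, paired with Lambda_jj by rank (assumption of this sketch)
    lam1 = np.sort(np.linalg.eigvalsh(R1))[::-1]
    r = len(Lam) if r is None else r
    num = 4.0 * np.sum((lam1[:r] - Lam[:r] / 2.0) ** 2)
    return num / np.sum(Lam[:r] ** 2)

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 3))
print(modified_dissimilarity(A, A))                  # identical sets: D close to 0
print(modified_dissimilarity(A, np.zeros((20, 3))))  # all variance in one set: D close to 1
```

The two calls show the limiting cases stated in the text: identical data sets give eigenvalues equal to 0.5Λjj and hence D near 0, while data sets that share no variance give D near 1.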

5.2.2 Fault Detection and Isolation (FDI)

For on-line monitoring, the normal operation data are used as a training data set,
and the confidence limits are calculated as described in the previous section. The
sample representing the current operating condition is scaled by the sample mean
and sample variance obtained from the training data. The corresponding modified
dissimilarity index and eigenvalues are then calculated by the same procedure. The
modified dissimilarity index, which evaluates the difference between two data sets,
can quantitatively detect a change of operating condition by monitoring the
distribution of time-series data. If the index is outside the control limit or deviates
from zero, the operating condition has changed and a disturbance is said to have
occurred.

In particular, we can focus on the individual variation of several eigenvalues.
Only a few eigenvalues are considered as monitoring indexes because most of the
variation is captured by the first few eigenvectors. The remaining variation that is
not captured by the principal eigenvectors is negligible, and it is not critical to
identify whether it is caused by changes in the process or by noise. If any of the
principal eigenvalues exceeds its corresponding confidence limit, an operating
change has occurred along that eigenvalue, and a disturbance is detected there.
Monitoring each eigenvalue allows us to distinguish a process change from an
instantaneous fault or sensor noise. Because each eigenvalue represents its own
characteristic scale of variation, this technique gives information about the
eigenvalue on which a disturbance occurs and makes it possible to analyze the
physical or biological reasons for the disturbance. The method therefore gives us
the capability to isolate and interpret the disturbance source.
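A sketch of the per-eigenvalue check is given below. It is illustrative only: the function names are hypothetical, the training values are made-up numbers, and the normal quantile from the Python standard library is used as a large-sample stand-in for the Student t quantile t(1-α/2; N-1) of equation (5.10), so the limits are slightly narrower than the t-based ones for small N.

```python
from statistics import NormalDist, mean, stdev

def eigenvalue_limits(training_values, alpha=0.01):
    """Two-sided (1 - alpha) limits: sample mean -/+ quantile * sample std."""
    # Normal quantile used as an approximation of t(1 - alpha/2; N - 1)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    centre = mean(training_values)
    spread = stdev(training_values)
    return centre - z * spread, centre + z * spread

def is_disturbed(value, limits):
    """True when a monitored eigenvalue falls outside its confidence limits."""
    lower, upper = limits
    return value < lower or value > upper

# Hypothetical eigenvalues of one principal direction over normal operation
limits = eigenvalue_limits([1.0, 1.1, 0.9, 1.0, 1.0])
print(limits)
print(is_disturbed(1.05, limits), is_disturbed(1.5, limits))
```

Applying the same check to each of the first few eigenvalues indicates not only that a disturbance occurred but also along which eigenvalue, which is the isolation step described above.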

If adaptive scaling is to be used to tackle non-stationary or dynamical problems,

the sample mean and variance should be successively updated to detect changes in

continuous processes. A forgetting factor can be introduced to reduce the effect

of previous measurements (Li et al., 2000). Another important consideration in

monitoring changes in the process or operating condition is the determination of

appropriate window and step sizes. These quantities should be carefully selected

taking into consideration the process characteristics. We suggest that the window

size should be large in comparison to the time constant of the process, and the step

size should be small in comparison to the sampling time.
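The role of the window and step sizes can be illustrated with a small generator of index ranges. This sketch assumes that two "successive" data sets are the same moving window taken one step apart (so that they overlap when the step is smaller than the window); the function name is hypothetical.

```python
def successive_windows(n_samples, window, step):
    """Yield (previous, current) index ranges of two successive moving windows."""
    start = 0
    while start + step + window <= n_samples:
        prev = range(start, start + window)
        curr = range(start + step, start + step + window)
        yield prev, curr
        start += step

# Example: 40 samples, window of 20 samples, step of 5 samples
for prev, curr in successive_windows(40, 20, 5):
    print(prev.start, prev.stop, curr.start, curr.stop)
```

Each yielded pair would be passed to the dissimilarity calculation, so a smaller step gives more frequent monitoring updates while a larger window gives smoother covariance estimates.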

5.3 Simulation Studies

5.3.1 Simulation of Benchmark Plant

For the monitoring purpose, the proposed method was applied to the detection

of various disturbances in the simulated data obtained from a benchmark simulation.

Four types of disturbance were tested using the FDI method: External disturbance,

internal disturbance, setpoint change, and sensor fault (see Table 5.1). External

disturbances are defined as measurable disturbances, which are outside of the

process and are detectable from the sensor signal. Examples of such disturbances are

changes in the influent flow rate or nitrogen concentration. Internal disturbances are


caused by changes within the process affecting the process behavior. These

disturbances include factors such as decreased nitrification, non-measurable

inhibition of influent or gradual reduction of the settling velocity in the secondary

clarifier (denoted as bulking phenomena). The two other simulated disturbances

were a set point transfer signal with low frequency information and a sensor failure

event in the high frequency band.

Three events in the influent data developed by the benchmark are associated

with the influent flow rate (dry, storm and rain weather). The training model was

based on a normal operation period of one week of dry weather. The data used were

the influent file and outputs with noises suggested by the benchmark. The variables

used to build the X-block in the disturbance detection were the influent ammonia

concentration (SNH,in), influent flow rate (Qin), total suspended solid in aerator 4

(TSS4), DO concentration in aerators 3 and 4 (SO,3, SO,4), and oxygen transfer

coefficient in aerator 5 (KLa5). The conditions used for on-line monitoring were a

window size of 20 samples (5 hours) and a step size of 5 samples (1.25 hours). The

mean and variance were the values calculated from the normal data.

External process disturbances

We tested a storm event that occurs suddenly, twice, after a long period of dry
weather. This example shows how external disturbances appear within the proposed
method. The pattern of the measurement variables during the storm weeks, which
was the same as the storm condition in the benchmark, is presented in Figure 5.2.
Figure 5.3 shows the monitoring results obtained using the FDI technique during the
storm weeks. The dissimilarity index increases sharply at around samples 850 and
1050, which correspond to the first and second storms. The two storm events are
largely detected as changes in the first and second eigenvalues, as shown in Figure
5.3(b-d). The magnitude of each eigenvalue represents the proportion of the
variation captured by its corresponding eigenvector.

Internal process disturbances

The first internal disturbance was imposed by decreasing the nitrification rate in the
biological reactor through a decrease in the specific growth rate of the autotrophs
(µA). Starting at sample 300, the autotrophic growth rate was decreased linearly
from 0.5 to 0.3 day-1. As shown on the left of Figure 5.4(a), the decrease in
nitrification is first detected at around sample 330, which is 30 samples after the
event began. This event is quickly detected by the second eigenvalue, while the first
eigenvalue increases continuously after the event. After the deterioration of
nitrification ends, the dissimilarity index shows peaks at around samples 500 and
600. These sudden increases in the dissimilarity index are caused by the increase of
the first eigenvalue. A gradual increase of the first eigenvalue indicates that the
process has undergone an internal disturbance such as a decrease in the nitrification
or denitrification rate.

The second internal disturbance imposed on the system was a linear decrease in
the settling velocity in the secondary settler between samples 300 and 500. For
early detection, it was necessary to add another measurement, the effluent total
suspended solids (TSSe), to the general X-block. The right side of Figure 5.4(a)
shows an increase in the dissimilarity index after sample 330. As in the case of the
decrease in nitrification, the dissimilarity index is constant for 30 samples after the
onset of the decrease in settling velocity. The increase in the dissimilarity index at
around sample 330 is caused by increases in the second and third eigenvalues. The
first eigenvalue jumps at about sample 410, contributing greatly to the increase in
the dissimilarity index observed shortly afterwards. A jump in the first eigenvalue
indicates that the process has undergone an internal disturbance such as a bulking or
biomass decay event.

Sensor faults and setpoint change

To assess the usefulness of the FDI method for detecting sensor faults with

high frequency information, we corrupted the nitrate sensor in the second anoxic

tank. In the sensor fault case, it was also necessary to add the nitrate concentration

(SNO,2) to the general X-blocks. The fault was introduced during the sample period

200-400. Prior to that period, the sensor was operating properly except for the

imposition of sensor noise. Monitoring results are presented in the left side of Figure

5.5. The sensor fault is detected by a change in the second eigenvalue, indicating

that the sensor fault is caused by the variation change along the second contributing

axis. On the other hand, the disturbance caused by the setpoint change with low

frequency information is demonstrated in the right side of Figure 5.5, which shows

when the DO controller setpoint in the 5th biological reactor was suddenly changed

from 2 to 1 mg/l at sample 300. In contrast to the sensor fault, the setpoint change

causes a variation along the third contribution axis.

5.3.2 Full-scale WWTP

We now consider a second example, using real data from the coke WWTP of an

iron and steel making plant in Korea. The treatment system is a general activated

sludge process that has five aeration basins and a secondary clarifier. We selected 16

general variables to describe the process state of the WWTP; these variables are

described in Table 5.2. The data set consisted of daily mean values from 1 January,

1998 to 9 November, 2000, comprising a total of 1034 observations. The first 720

observations were used for the training model of the mean-centered and auto-scaled

data. The remaining 314 observations were used as a test data set to test the

proposed method.

In addition to the proposed algorithm, the PCA method was used to monitor the
WWTP characteristics, and the PCA results were compared with those of the
proposed FDI algorithm. Projecting the variables onto four latent variables captured
only slightly more than 55% of the variance. Figure 5.6 shows the Hotelling’s T2 and
squared prediction error (SPE) charts. The two horizontal lines correspond to the
95% confidence limits of the original training data. The data deviated slightly in
samples 120 to 125. From Figure 5.6, we can see certain deviations in some of the
variables within these intervals. To make the cause of the deviation more obvious,
the contributions from every measurement variable were calculated. However, it is
difficult to detect this disturbance from the plots. Moreover, it is not possible to
diagnose and isolate the disturbance frequencies.

The monitoring performance of the proposed FDI method in the WWTP is given
in Figure 5.7. To monitor changes in the process and operating condition, window
and step sizes of 20 and 5 samples were used, respectively. The dissimilarity index
of the test data set is shown in Figure 5.7(a). The dissimilarity index has high values
around samples 110 to 120 (19 April, 2000 to 4 May, 2000), leading us to infer that a
large process change happened at this time. Five eigenvalues, which correspond to a
range of disturbance scales, are depicted in Figure 5.7(b-f). The remaining
eigenvalues give little information because they provide only high frequency
information such as measurement noise. It is evident from Figure 5.7 that the third
and fourth eigenvalues, which are representative of middle-scale disturbances,
contribute greatly to the increase in the dissimilarity index. At this time, the WWTP
received a high input of cyanide and chemical oxygen demand (COD) load. This
load reduced the activity of the micro-organisms and diminished the settling
performance, causing an increase in the sludge volume index in the secondary
settler. We found that the increase in influent load started out as an external
disturbance but subsequently transformed into an internal disturbance that changed
the process operation region in the WWTP. These process changes were detected by
the dissimilarity index, and the disturbance sources were isolated by the proposed
FDI method.

We can draw the following conclusions from the results of several simulations.

First, the modified dissimilarity index unifies all the scales into one monitoring

value and provides a compact index. Second, monitoring each eigenvalue can

discern the dominant dynamics and detect the scale on which a disturbance occurs.

In this analysis it is presumed that the eigenvalue with a large magnitude represents

the effect of low frequency information such as a large change in the process or the

occurrence of a large and long disturbance. The eigenvalue of intermediate

magnitude represents a small change in operation condition such as a short external

disturbance, while the eigenvalue with a small magnitude represents high frequency

information such as sensor faults and measurement noises. The fault isolation

approach therefore provides intelligence on the scale at which a disturbance occurs,

and can be used to analyze and interpret the physical cause and effect of

disturbances. Third, because the proposed method is based on evaluating the

difference between successive time series data sets with a moving window, as is

done in adaptive PCA, the proposed method can tackle the non-stationary problem

of WWTP.

5.4 Conclusions

The strategy proposed in the present work is able to detect and isolate the effect

of various disturbances occurring in the activated sludge process. This strategy uses

a modified dissimilarity index and monitoring of individual eigenvalues. One merit

of this technique is that it can simultaneously detect the disturbance and isolate its

source, in contrast to conventional MSPC. The strength of the isolation technique is

that it gives information about the scale on which a disturbance occurs, assisting in

the interpretation of the disturbance. Experimental results show that it is an


appropriate monitoring technique for the activated sludge process, which is

characterized by a variety of fault and disturbance sources and non-stationary

characteristics. This fault detection and isolation method provides us with a new

analysis tool for acquiring a deeper understanding of process monitoring

methodology through the detection and isolation of disturbances at different scales.

Figure 5.1 Moving windows between two successive data sets.
[Sketch of value versus time, marking the window size and the step size]

Figure 5.2 Measured variables during the storm weeks
[Six panels versus sample number, 0 to 1400: SNH,in, Qin, TSS4, SO,3, SO,4 and KLa5]

Figure 5.3 Monitoring performances of the external disturbance (a) modified dissimilarity index (b-d) individual eigenvalue plot.
[Four panels versus sample number, 0 to 1200: (a) D, (b) 1st eigenvalue, (c) 2nd eigenvalue, (d) 3rd eigenvalue]

Figure 5.4 Monitoring performances under internal disturbances caused by decreasing nitrification (left plot) and settler bulking (right plot) (a) dissimilarity index, (b-d) individual eigenvalue plots
[Each plot has four panels versus sample number, 0 to 600: (a) D, (b) 1st eigenvalue, (c) 2nd eigenvalue, (d) 3rd eigenvalue]

Figure 5.5 Monitoring performances of sensor faults (left plot) and setpoint change (right plot) (a) dissimilarity index, (b-d) individual eigenvalue plots
[Each plot has four panels versus sample number, 0 to 600: (a) D, (b) 1st eigenvalue, (c) 2nd eigenvalue, (d) 3rd eigenvalue]

Figure 5.6 PCA monitoring performances (a) Hotelling’s T2 chart, (b) SPE plot
[Two panels versus sample number, 0 to 300, each with its 95% upper limit (UL)]

Figure 5.7 FDI monitoring performances (a) dissimilarity index, (b-f) the 1st, 2nd, 3rd, 4th and 5th eigenvalues
[Six panels versus time (days), 0 to 300: (a) D, (b) EV1, (c) EV2, (d) EV3, (e) EV4, (f) EV5]

Table 5.1 Fault (disturbance) sources in the simulation benchmark

Disturbance type | Disturbance | Simulation conditions
External | Storm events | Abrupt change of influent flow rate at around samples 850 and 1050
Internal | Decreasing nitrification | Specific growth rate for autotrophs: from 0.5 to 0.3 day-1 in a linear fashion during samples 300 to 500
Internal | Decreasing settling velocity | Settling velocity in the secondary settler: from 250 to 150 m day-1 in a linear fashion during samples 300 to 500
Sensor faults | Nitrate sensor failure | Nitrate sensor noise in the second anoxic tank: noise mean changed from 0 to 1 mg N/l during samples 200 to 400
Operation change | Set point change of DO controller | DO controller set point: from 2 to 1 mg/l at sample 300


Table 5.2 Process variables in full-scale WWTP

Variable | Unit | Mean | STD | Variable | Unit | Mean | STD
Q2 | m³/h | 179.4 | 15.98 | MLSS_AT | mg/L | 1605 | 409.3
Q3 | m³/h | 85.53 | 8.73 | MLSS_R | mg/L | 7194 | 3444
CN2 | mg/L | 2.455 | 0.38 | DO_AT | mg/L | 2.064 | 0.99
CN3 | mg/L | 14.35 | 2.49 | T_influent | °C | 37.6 | 2.51
CN_PS | mg/L | 1.49 | 0.25 | T_AT | °C | 30.74 | 2.34
COD2 | mg/L | 156.4 | 20.28 | SVI_AT | ml/g | 11.42 | 3.15
COD3 | mg/L | 2083 | 295.5 | SVI_R | ml/g | 63.31 | 21.73
COD_PS | mg/L | 143.28 | 17.66 | pH_AT | - | 7.24 | 0.22

VI. Modeling and Multiresolution Analysis in WWTP

6.1 Introduction

To date, monitoring in WWTPs has mostly focused on a few key effluent quantities upon which regulations are enforced. However, as environmental restrictions become more rigid, greater effort toward higher effluent quality is required in the monitoring of WWTP performance. In particular, monitoring of the biological treatment process is very important because recovery from failures is time-consuming and expensive, and because some changes are not obvious and may grow gradually until they produce a serious operational problem. Early fault detection and isolation in the biological process are therefore valuable, since they allow corrective action to be taken well before a dangerous situation arises. At the same time, discrimination between serious and minor abnormalities is of primary concern. To accomplish this task, a reliable detection procedure is needed. However, few monitoring techniques are available that make use of large on-line data sets, despite the increasing popularity and decreasing price of on-line measurement systems in the field of WWTP.

Multivariate statistical process monitoring (MSPM), or multivariate statistical process control (MSPC), is a possible solution for multivariate, collinear, auto-correlated or cross-correlated processes. It comprises chemometric methods such as principal component analysis (PCA) and partial least squares (PLS), combined with standard types of control charts. Several applications of MSPM or MSPC have been developed to extract useful information from process data and use it for the monitoring of WWTPs (Krofta et al., 1995; Rosen, 1998; Van Dongen and Geuens, 1998; Teppola, 1999).

However, the biological treatment process has several peculiar features that distinguish it from chemical or other industrial processes. Above all, it is 'nonstationary', which means that the process itself changes gradually over time. For example, the normal operating condition of the process evolves with systematic seasonal variations. In addition, many underlying phenomena in a WWTP take place simultaneously, and it may be difficult to separate a specific phenomenon from the others. That is, the process has 'multiscale' characteristics: multiple simultaneous phenomena affect the data at different time or frequency scales. If these simultaneous phenomena interfere with or mask variations at other time or frequency scales (the so-called disturbing effects), the situation becomes troublesome because the multiscale variations widen the confidence limits. This is unfavorable because actual events can remain undetected by the monitoring algorithm while they are under way in the plant.

To solve these problems, several methods have recently been suggested that use adaptive PCA, multiscale analysis with dynamic PCA, and multiresolution analysis with wavelets (Bakshi, 1998; Kano et al., 2000a,b; Rosen and Lennox, 2000; Teppola and Minkkinen, 2000, 2001; Ying and Joseph, 2000; Choi et al., 2001; Yoo et al., 2001). Bakshi (1998) built a multiscale model through the use of wavelet transforms. Kano et al. (2000) proposed a dissimilarity index based on the distribution of time-series process data. Rosen and Lennox (2000) developed and applied adaptive PCA and wavelet-based multiresolution analysis. Ying and Joseph (2000) evaluated the feasibility of sensor fault detection using multi-frequency signal analysis of noise. Teppola and Minkkinen (2000, 2001) suggested several multiresolution analyses using a wavelet-PLS regression model for interpreting and scrutinizing a multivariate model. Choi et al. (2001) suggested a generic monitoring algorithm utilizing a modified dissimilarity index in the benchmark simulation, and Yoo et al. (2001) confirmed these results using a PCA-type monitoring algorithm in a real WWTP.

In brief, this research applies two methods: one for prediction and the other for multiresolution monitoring. In this way, it is possible to take into account the multivariate, nonstationary and multiscale nature of a WWTP. The approach combines a PLS model with multiresolution analysis. First, the PLS model is used for prediction and data analysis. Second, a multiresolution analysis is proposed that applies a generic dissimilarity measure and singular value decomposition to the PLS score matrix. A statistical confidence limit for detection and isolation is suggested, and its ability is verified using real plant data.

6.2 Theory

6.2.1 Partial Least Squares (PLS)

Very often in industrial applications, the data are severely corrupted by noise and by collinearities among a large number of variables. To treat these problems, it is convenient to apply latent variable models, particularly PLS modeling. PLS maximizes the covariance between process variables and responses. In PLS, the matrix X (process variables) is decomposed and modeled in such a way that the information in Y (responses) can be predicted as well as possible. In addition, PLS uses only the variation in the X matrix that is significant for predicting the variation in the Y matrix. Moreover, unlike multiple linear regression (MLR), PLS does not assume that the X variables are free of noise. Noise and insignificant variations are not used in the modeling.

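In PLS, the standardized sample matrices $Z_X$ of X and $Z_Y$ of Y are decomposed as follows.

$$Z_X = T P^T = \sum_{i=1}^{n_x} t_i p_i^T = \sum_{i=1}^{m} t_i p_i^T + \sum_{i=m+1}^{n_x} t_i p_i^T = \sum_{i=1}^{m} t_i p_i^T + E_X = \hat{Z}_X + E_X \qquad (6.1)$$

$$Z_Y = U Q^T = \sum_{i=1}^{n_y} u_i q_i^T = \sum_{i=1}^{m} u_i q_i^T + \sum_{i=m+1}^{n_y} u_i q_i^T = \sum_{i=1}^{m} u_i q_i^T + E_Y = \hat{Z}_Y + E_Y \qquad (6.2)$$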

In the above representations, $t_i$ and $u_i$ are score vectors, $p_i$ and $q_i$ are loading vectors, $\hat{Z}_X$ and $\hat{Z}_Y$ are unbiased estimates of $Z_X$ and $Z_Y$, respectively, $E_X$ and $E_Y$ are residual matrices, and $m$ is the number of retained latent variables. PLS is composed of an outer and an inner relationship. In the construction of the outer model, the score vectors $t_i$ and $u_i$ are obtained by projecting $Z_X$ and $Z_Y$ onto the loading vectors $p_i$ and $q_i$, respectively. In the construction of the inner model, $u_i$ is linearly regressed on $t_i$, yielding $\hat{u}_i = b_i t_i$, where $b_i$ is a regression coefficient. Then $Z_Y$ can be expressed as

$$Z_Y = T B Q^T + E_Y = \sum_{i=1}^{m} b_i t_i q_i^T + E_Y \qquad (6.3)$$

where B is a diagonal matrix of the regression coefficients $b_i$. In this respect, PLS can be considered a useful tool that divides a multivariate linear regression into a set of simple linear regressions. The first several latent variables (LVs) are extracted from the matrices X and Y, and they contain most of the variance of X and Y, respectively. On the other hand, the last LVs mostly consist of noise and variations that are not related to X and Y. Importantly, the LVs are orthogonal to each other. Together, these features make it possible to compress information in the presence of collinearity and redundancy. Although PLS is a regression technique, an even more important feature is its visualization ability, which enables us to examine data sets more closely (Geladi and Kowalski, 1986; Höskuldsson, 1996; Rosen, 1998).
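
The outer and inner relations of equations (6.1) to (6.3) can be illustrated with a minimal NIPALS-style sketch. This is not the implementation used in this work; the function name `pls_nipals`, the convergence tolerance, and the choice of unit-length weight and Y-loading vectors are assumptions made for illustration.

```python
import numpy as np

def pls_nipals(Zx, Zy, n_lv, tol=1e-10, max_iter=500):
    """NIPALS-style PLS for standardized blocks Zx (n x nx) and Zy (n x ny)."""
    Ex = np.array(Zx, dtype=float)   # working copy, becomes the X residual E_X
    Ey = np.array(Zy, dtype=float)   # working copy, becomes the Y residual E_Y
    T, W, P, Q, b = [], [], [], [], []
    for _ in range(n_lv):
        # Start the Y score from the Y column with the largest remaining variance.
        u = Ey[:, [int(np.argmax(Ey.var(axis=0)))]]
        for _ in range(max_iter):
            w = Ex.T @ u
            w /= np.linalg.norm(w)            # loading weight, unit length
            t = Ex @ w                        # X score t_i
            q = Ey.T @ t
            q /= np.linalg.norm(q)            # Y loading q_i, unit length
            u_new = Ey @ q                    # Y score u_i
            converged = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
            u = u_new
            if converged:
                break
        tt = (t.T @ t).item()
        p = Ex.T @ t / tt                     # X loading p_i
        bi = (u.T @ t).item() / tt            # inner relation: u_hat_i = b_i t_i
        Ex = Ex - t @ p.T                     # deflate the X block
        Ey = Ey - bi * (t @ q.T)              # deflate the Y block
        T.append(t); W.append(w); P.append(p); Q.append(q); b.append(bi)
    return (np.hstack(T), np.hstack(W), np.hstack(P), np.hstack(Q),
            np.array(b), Ex, Ey)
```

With the returned matrices, the fitted response block is `T @ np.diag(b) @ Q.T`, which mirrors equation (6.3), and the returned `Ey` plays the role of the residual matrix $E_Y$.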

PLS projects the X and Y variables simultaneously onto the same subspace, T, in such a manner that there is a good relation between the position of an observation on the X-plane and its corresponding position on the Y-plane. Once a PLS model has been derived, it is important to grasp its meaning. For this, the scores t and u are considered. They contain information about the observations and their similarities or dissimilarities in X and Y space with respect to the given problem and model. The X and Y weights describe how the variables combine to form t and u, which in turn express the quantitative relation between X and Y. Hence, these weights are essential for understanding which X variables are important for modeling Y and which X variables provide common information, and also for interpreting the scores t.

In order to detect the occurrence of process faults and disturbances, PCA-type monitoring is based on a statistical analysis of the score values and the residuals. The scores are monitored by using Hotelling's T² statistic or by viewing the corresponding score plots directly. The residuals are monitored by the Q statistic, that is, the sum of squared prediction errors of the X variables (SPE_X). The T² statistic is a measure of the distance from the multivariate mean to the projection of the operating point on the principal component (PC) plane. The Q statistic is the squared Euclidean distance of the operating point from the plane formed by the retained PCs. T² monitors systematic variations in the latent variable space, while SPE_X represents variations not explained by the retained PCs (Kourti and MacGregor, 1995; Wise et al., 1990; Wise and Gallagher, 1996; Teppola, 1999). However, conventional MSPM based on the T² and Q statistics does not always function well, because it cannot detect changes in the correlation among process variables as long as the T² and Q statistics stay inside their confidence limits.
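
As a rough illustration of the two statistics described above, the following sketch computes T² and SPE for new samples from a PCA model fitted on training data. The function name `hotelling_t2_and_spe` and the use of an SVD-based PCA on auto-scaled data are assumptions for illustration; confidence limits are not computed here.

```python
import numpy as np

def hotelling_t2_and_spe(X_train, X_new, n_pc):
    """T2 and SPE of the rows of X_new under a PCA model of X_train."""
    mu = X_train.mean(axis=0)
    sd = X_train.std(axis=0, ddof=1)
    Z_train = (X_train - mu) / sd             # auto-scale with training statistics
    Z_new = (X_new - mu) / sd
    _, s, Vt = np.linalg.svd(Z_train, full_matrices=False)
    P = Vt[:n_pc].T                           # loadings of the retained PCs
    lam = s[:n_pc] ** 2 / (len(Z_train) - 1)  # variances of the retained scores
    T = Z_new @ P                             # scores of the new samples
    t2 = np.sum(T ** 2 / lam, axis=1)         # scaled squared distance in the PC plane
    resid = Z_new - T @ P.T                   # part not explained by the retained PCs
    spe = np.sum(resid ** 2, axis=1)          # squared distance from the PC plane
    return t2, spe
```

A sample that moves along the directions of normal variation raises `t2`, while a sample that breaks the training correlation structure raises `spe`.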

6.2.2 Generic Dissimilarity Measure (GDM)

Recently, several dissimilarity indices based on the distributions of two datasets have emerged (Kano et al., 2000; Choi et al., 2001; Yoo et al., 2001). They are based on the idea that a change of process operation can be detected by comparing the distributions of successive datasets, because the data distribution reflects the corresponding process operating condition. In Section 5.2.1, we introduced and developed a generic dissimilarity measure (GDM) algorithm. It compares the covariance structures of two datasets and represents the degree of dissimilarity between them by considering the importance of each transformed variable (see Section 5.2.1).
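
The GDM itself is defined in Section 5.2.1 and is not reproduced here. To give a feel for how two covariance structures can be compared, the sketch below implements a dissimilarity index in the style of Kano et al.; it is a stand-in, not the GDM. The function name `dissimilarity_index` is hypothetical, and both windows are assumed to be already mean-centered and scaled.

```python
import numpy as np

def dissimilarity_index(X1, X2):
    """Kano-type dissimilarity between two mean-centered data windows."""
    n1, n2 = len(X1), len(X2)
    R = (X1.T @ X1 + X2.T @ X2) / (n1 + n2)    # covariance of the pooled data
    lam0, P0 = np.linalg.eigh(R)
    keep = lam0 > 1e-12 * lam0.max()           # drop numerically null directions
    Pt = P0[:, keep] / np.sqrt(lam0[keep])     # whitening transform of the pooled data
    S1 = Pt.T @ (X1.T @ X1) @ Pt / (n1 + n2)   # transformed covariance of window 1
    lam = np.linalg.eigvalsh(S1)               # in [0, 1]; window 2 has 1 - lam
    D = 4.0 * np.mean((lam - 0.5) ** 2)        # 0 = same structure, 1 = fully different
    return D, lam
```

When the two windows share the same covariance structure and size, every eigenvalue is close to 0.5 and the index is close to zero; the individual eigenvalues `lam` are the quantities that can be tracked one by one.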

6.2.3 Multiresolution Analysis (MRA)

PLS monitoring differs from the PCA-type monitoring algorithm. In PLS, the principal component decomposition of the X block is rotated (by introducing the loading weights) in order to maximize the covariance between the X and Y blocks. Therefore, the multivariate control charts described above are only approximations when applied to a PLS model. A comparison of the X block loadings and the loading weights is one way to check at least partial validity. In this case, there were no significant differences between the loadings and the loading weights (Teppola, 1999). A new monitoring method is therefore required for PLS monitoring, one that can effectively treat the peculiar characteristics of the biological treatment process and can isolate and diagnose fault sources with a multiscale approach.

Figure 6.1 shows the scheme of multiresolution analysis (MRA) for PLS monitoring. First, a PLS model is constructed from normal historical data in order to handle the multivariate and collinear problems in a biological WWTP. The model represents the process behavior and the common-cause variations of the WWTP, and excludes noise, measurement errors, and those variations that are uncorrelated with the Y variables. Then, MRA of the score values is carried out using the GDM and the contributions of the principal eigenvalues, in order to detect process changes and to diagnose or isolate different kinds of faults and disturbances. The motivation of this work is to identify the type of event that has occurred. Different events are expected to produce different process measurement values, which appear as changes in the data distribution and manifest themselves in different areas of the PC space.

Here, each successive dataset in the GDM consists of PLS score values within a moving window, because PLS score values are more nearly normally distributed than the original variables themselves. This is a consequence of the central limit theorem, which can be stated as follows: if the sample size is large, the theoretical sampling distribution of the mean can be approximated closely by a normal distribution. Thus, we would expect the scores, which are weighted sums like a mean, to be approximately normally distributed (Neter et al., 1996; Wise and Gallagher, 1996). Figure 6.2 compares the normality of the original values and the PLS score values. An abnormality will therefore manifest itself more clearly as a shift or a change in the time-series distribution of the score values than in the original variables. Just as an abnormality appears as a shift in the score plane in the T² statistic of PCA and PLS monitoring, here it appears as a dissimilarity value between two successive datasets, that is, the GDM. In particular, applying a moving window to the PLS score values for the GDM is a remedy for the nonstationarity problem of the PLS monitoring algorithm.

On the other hand, if the relationships between the process variables change rapidly and the correlation structure breaks down, the SPE_X of the PLS residuals should be included in the two datasets of the proposed MRA algorithm. In this case, moving-window matrices that combine the score values and the residual error values of the PLS model are processed by the GDM and MRA method. However, since the inner relationships of the process in a WWTP change slowly, the score matrices alone are sufficient for monitoring.

Additionally, a confidence limit on each individual eigenvalue is proposed for multiscale fault detection and isolation. If an eigenvalue exceeds its corresponding confidence limit, the process is changing at that scale and a certain event is occurring. By monitoring at each scale, we can diagnose diverse process variations and events, i.e., slow variations (seasonal fluctuations or other long-term dynamics), middle-scale variations (internal disturbances, process operation changes), and instantaneous variations (input disturbances, faults or sensor noise). Because it represents the corresponding characteristics at each scale, the multiscale technique can reveal the scale at which process changes, faults and events occur, and can help to analyze their physical or biological causes. The proposed MRA thus provides the capability to diagnose and interpret events and fault sources. Note that it systematically removes the nonstationarity problem by comparing successive datasets with a moving window. Moreover, unlike other MRA methods such as wavelets, it does not give rise to zero-padding problems.

6.3 Results and Discussion

PLS modeling

The process data were collected from a biological WWTP that treats the coke wastewater of an iron and steel making plant in Korea (Figure 2.3). Eleven process and manipulated variables (the X block) were used to model three process output variables (the Y block). The Y block consists of the sludge volume index (SVI), the reduction of cyanide (∆CN), and the reduction of COD (∆COD). Table 6.1 describes the process variables and presents the mean and standard deviation (SD) values of the X and Y blocks. The process data consisted of daily mean values from 1 January 1998 to 9 November 2000, with a total of 1034 observations. The first 720 observations were mean-centered and auto-scaled and used for training the PLS model. The remaining 314 observations were used as a test data set in order to verify the proposed method. The number of latent variables of the PLS model was determined by cross-validation, and four LVs were selected. By projecting the variables from dimension 14 to dimension 4, the model captured about 54% of the X block variance and 61% of the Y block variance; these moderate values originate from the troublesome and difficult treatment of coke wastewater. The results of the PLS model are presented in Table 6.2.
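
The data handling described above (720 training observations, 314 test observations, auto-scaling) can be sketched as follows. The function name `split_and_autoscale` is hypothetical, and scaling the test set with the training mean and standard deviation is an assumption about a detail the text does not spell out.

```python
import numpy as np

def split_and_autoscale(X, n_train=720):
    """Split rows chronologically and auto-scale both parts with training statistics."""
    X_train, X_test = X[:n_train], X[n_train:]
    mu = X_train.mean(axis=0)                 # training mean of each variable
    sd = X_train.std(axis=0, ddof=1)          # training standard deviation
    return (X_train - mu) / sd, (X_test - mu) / sd
```

Using only training statistics keeps the test data from influencing the model, so a drift in the test period shows up as a nonzero mean in the scaled test block.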

An appealing feature of the PLS method is its modeling ability, that is, its predictive capability. Figure 6.3 shows the real and predicted values from the PLS model and displays the residuals of the Y block. The reductions of COD and CN are predicted very well over the test period, which demonstrates the predictive power of the PLS model for these response variables. However, the prediction of the SVI of the secondary settler is not satisfactory, unlike that of the process quality variables. This may result from measurement inaccuracy and operator carelessness; more precise measurement by the operator is needed. The residual value of the Y block is the sum of the squared differences between the real and predicted values of the three response variables, and it is mostly caused by the residual error of the SVI prediction.

Interpretation of PLS model

For the interpretation of the WWTP, we consider the PLS loading weights to see how the X and Y variables are interrelated. Figure 6.4 shows which X and Y variables load strongly in the first two latent variable dimensions. COD3, COD2, and T_aerator are closely correlated with COD reduction, as seen in the middle left of Figure 6.4. The first Y variable, COD reduction, is influenced by the COD load from BET2 and BET3 and by the temperature in the aerators, which is consistent with the effect of temperature on heterotrophic biomass activity in the biological treatment of carbonaceous nutrients. The second group, in the lower part of Figure 6.4, is formed by CN2, CN3, T_influent, Q2, Q3, and the DO of the aerator, together with CN reduction. It shows that the reduction of cyanide is affected by the cyanide loads, the influent flow rate, the influent temperature, and the dissolved oxygen level. In particular, the microorganisms related to cyanide act in opposition to the heterotrophic organisms, and cyanide compounds are toxic and inhibitory to the growth of heterotrophs; this appears as opposite directions in the loading plot. Consequently, shock loading of cyanide in the wastewater influent causes a deterioration of the biological WWTP. These facts are well reported, with experimental results, in a technical paper (Lee, 2000). The third group, in the upper right region, is made up of MLSS_R and MLSS_%E together with the SVI of the secondary settler, which shows that the settleability of the biomass is related to the amount (MLSS_R) and activity (MLSS_%E) of the microorganisms in the aerator and settler. These results agree well with the biological relationships among the variables and with the process layout.

It is sometimes useful to obtain an overview of the PLS weights when the number of latent variables is large, especially when there are more than three LVs. A real wastewater plant generally has more than three LVs; in our case there are four. The variable influence on projection (VIP) indicates the relevance of each X variable, pooled over all dimensions and Y variables (Eriksson et al., 1995). The squared VIP is a weighted sum of squares of the PLS weights, w, that also takes into account the amount of Y variance explained by each latent variable. The VIP plot is shown in Figure 6.5 and reveals that COD3 is the most important variable, followed by T_aerator, MLSS_R, MLSS_%E, and so on. This can be interpreted to mean that the COD influent from BET3 is most important for the treatment efficiency of the aeration basins and the settling ability of the secondary clarifier.
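
The verbal definition of VIP above can be made concrete with a short sketch. It follows the commonly used form of the VIP formula and assumes PLS outputs laid out as weights `W` (nx × m), X scores `T` (n × m), Y loadings `Q` (ny × m) and inner coefficients `b` (m); these names and shapes are assumptions, not the notation of the software used in this work.

```python
import numpy as np

def vip_scores(W, T, Q, b):
    """VIP of each X variable from PLS weights, scores, Y loadings and inner coefficients."""
    nx = W.shape[0]
    # Y sum of squares explained by each LV, i.e. the squared norm of b_a t_a q_a^T.
    ssy = (b ** 2) * np.sum(T ** 2, axis=0) * np.sum(Q ** 2, axis=0)
    Wn = W / np.linalg.norm(W, axis=0)        # unit-length weight vectors
    # Squared VIP: weighted sum of squared weights, weighted by explained Y variance.
    return np.sqrt(nx * ((Wn ** 2) @ ssy) / ssy.sum())
```

Because the squared VIP values sum to the number of X variables, a value above 1 marks a variable of above-average influence, which is the usual reading of a plot like Figure 6.5.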

Figure 6.6 shows the monitoring results of the T² and SPE_X statistics. The horizontal line corresponds to the 95% confidence limit obtained from the training data. From this figure, we can see two deviations in the monitoring charts. During samples 75 to 80, the T² statistic deviates slightly, which indicates that the deviations are large within the internal model. However, SPE_X does not increase, which indicates that the internal mutual relations described by the PLS model are not altered. About 15 days after the T² deviation at sample 75, that is, after one solid retention time (SRT), the SPE_X chart begins to change, from samples 90 to 110. In this event, the SPE_X value shows a shape similar to that of the T² change, except that it occurs one SRT later. Around sample 250, the T² statistic has a peak value, while SPE_X remains in the vicinity of the 95% confidence limit for a long time. We infer that the process experienced a large transition in operation at this time, but the charts do not reveal its cause. To identify the cause of a deviation more clearly, the contributions of every measured variable can be calculated. Even so, this approach cannot diagnose and isolate the scale of the fault (or disturbance) from the viewpoint of the process dynamics. This is a weak point of the T² and Q statistics: changes or upsets in the operating condition sometimes occurred in practice even though the statistics stayed within the confidence limits. This result illustrates that the principal components might undergo a change though the correlation structure is unchanged, when the variances of the scores and the residual error are similar to each other.

Multiresolution analysis

After the construction of the PLS model, MRA was applied to the score matrix (T) of the X block of the PLS model. To monitor process changes, faults and events, a window size of 15 samples was chosen in view of the SRT, and a step size of 3 samples in view of the hydraulic retention time (HRT).
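
The window and step sizes quoted above can be turned into index ranges as in the sketch below. How successive windows are paired for comparison is not stated in this section, so pairing each window with the one that starts `step` samples later is an assumption, as are the function names.

```python
def moving_windows(n_samples, window=15, step=3):
    """Half-open (start, stop) index ranges of windows advanced by `step` samples."""
    return [(s, s + window) for s in range(0, n_samples - window + 1, step)]

def successive_pairs(n_samples, window=15, step=3):
    """Pairs of consecutive windows whose score blocks would be compared."""
    wins = moving_windows(n_samples, window, step)
    return list(zip(wins[:-1], wins[1:]))
```

For the 314 test observations this yields 100 windows and 99 successive comparisons, each covering 15 daily samples of the score matrix.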

The results of MRA on the PLS score values of the test dataset are shown in Figure 6.7. As shown in Figure 6.7(a), the GDM started to change at sample 65 and deviated during samples 65 to 120 (March 3, 2000 to April 27, 2000), when a large process change took place. The GDM detects the change more rapidly and more decisively than the conventional MSPM method. Three eigenvalues, each of which indicates a disturbance at its own specific scale, are depicted in Figure 6.7(b-d). The remaining eigenvalues carry little information and reflect only high-frequency content such as measurement noise. Figure 6.7 shows that the first and second eigenvalues contribute most to the increase of the GDM and are representative of middle-scale disturbances. In detail, the process change first detected in the GDM is caused by the peaks of the second eigenvalue, and is followed by systematic variations of the first eigenvalue. It is easily identified and visualized by monitoring the pattern of each eigenvalue at the two scales. At this time, the WWTP received high input cyanide and COD loads at a small influent flow rate, that is, a highly concentrated load. This reduced the activity of the microorganisms and diminished the settling performance, which then increased the SVI in the secondary settler. This result shows that sludge and floc formation change with high load and influent quality. Figure 6.8 shows the contribution plot at this time. From this result, we found that a large influent load acted as an external disturbance, was transformed into an internal disturbance, and then changed the operating region of the activated sludge process.

Meanwhile, the GDM deviated again from sample 230 to the end of the test dataset (August 16, 2000 to November 9, 2000). During the summer, the WWTP was modified and a number of pieces of treatment equipment and facilities were added. This made it feasible for the operators to change the operation strategy, increasing the MLSS concentration and maintaining a high DO concentration. This caused large process changes, which appear as a gradual increase of the first eigenvalue in Figure 6.7(b). This result confirms that the proposed method is distinctly better than conventional methods at detecting a multiscale process change in a nonstationary signal of unknown characteristics. It indicates that the proposed MRA can be used effectively to extract information resulting from changes in process operation and, as a result, can contribute to the localization of different process faults and events.

6.4 Conclusions

In this research, a new multiresolution monitoring algorithm for the PLS model is presented in order to handle the distinctive characteristics of WWTP data, which are collinear, multivariate, noisy, nonstationary, and multiscale. This is achieved by combining the PLS technique for modeling with multiresolution analysis for monitoring. The PLS model is used for prediction and data analysis, taking full advantage of the multivariate nature of the data, and MRA of the PLS score and residual error values is used to detect and diagnose faults and disturbances with a multiscale concept. The approach provides prediction, detection, and diagnosis at the same time and makes the investigation of nonstationary and multiscale phenomena practicable. Experimental results from the industrial coke WWTP demonstrated that it can predict and analyze a complex plant and, simultaneously, detect and isolate various faults and events occurring in the biological treatment. Moreover, it can distinguish small failures from process upsets.

[Block diagram with blocks labeled: X, Y, PLS regression, T, U, Prediction, GDM of moving score matrix, EV1, EV2, ..., EVn, Confidence limits, Fault detection, Diagnosis, Biplots & contribution plot]

Figure 6.1 Multiresolution analysis for PLS monitoring

[Panels (a) expected value versus original data, (b) expected value versus score value (t1), (c) and (d) histograms]

Figure 6.2 Normal probability plot and histogram of original data and PCA score values (a) normal probability plot of original data (b) probability plot of score values (c) histogram of original data (d) histogram of score values

[Panels (a) SVI, (b) CN reduction, (c) COD reduction, (d) SPE_Y, each plotted against time (days)]

Figure 6.3 Prediction results of the PLS model with real Y value (solid line with squares) and predicted value (dotted line) (a) SVI (b) reduction of CN (c) reduction of COD (d) residual error of Y variables (SPE_Y)

[Scatter plot of weights, LV 1 (horizontal axis) versus LV 2 (vertical axis); labeled points: Q2, Q3, MLSS_#E, DO_Aerator, CN2, CN3, COD2, COD3, MLSS_R, T_Influent, T_Aerator, SVI_R, CN_red, COD_red]

Figure 6.4 The second PLS weight vector plotted against the first for the PLS model

[Bar chart of VIP values (vertical axis, 0.0 to 1.6); bars in order: COD3, T_aerator, MLSS_R, MLSS_#E, COD2, DO_aerator, CN3, T_Influent, CN2, Q3, Q2]

Figure 6.5 Variable influence on projection (VIP) for the predictor variables

[Panels (a) T² and (b) SPE_X, each plotted against time (days)]

Figure 6.6 Monitoring performances based on T² and SPE_X statistics with 95% confidence limits

[Panels (a) GDM, (b) EV 1, (c) EV 2, (d) EV 3, each plotted against time (days)]

Figure 6.7 Monitoring performance of MRA for the PLS score values with 95% confidence limits


Figure 6.8 Contribution plot of PLS score value for the first event values


Table 6.1 Process Input/Output Variables in WWTP

No | Variable | Description | Unit | Mean | SD
X5 | COD2 | COD from BET2 | mg/L | 156.4 | 20.28
X6 | COD3 | COD from BET3 | mg/L | 2083 | 295.5
X7 | MLSS_%E | MLVSS at final aeration basin | mg/L | 1605 | 409.3
X8 | MLSS_R | MLSS in recycle | mg/L | 7194 | 3444
X9 | DO_aerator | DO at final aeration basin | mg/L | 2.064 | 0.9979
X10 | T_influent | Influent temperature | °C | 37.6 | 2.513
X11 | T_aerator | Temperature at final aerator | °C | 30.74 | 2.379
Y1 | SVI_settler | Sludge volume index at settler | ml/g | 63.31 | 21.73
Y2 | CN_red | Cyanide reduction | mg/L | 19.31 | 4.2
Y3 | COD_red | COD reduction | mg/L | 605.4 | 97

Table 6.2 Variations explained by the PLS model of four latent variables

LV | X blocks (cumulative) | Y blocks (cumulative)
LV 1 | 0.192 | 0.319
LV 2 | 0.338 | 0.481
LV 3 | 0.446 | 0.581
LV 4 | 0.540 | 0.607

VII. Process Monitoring for Continuous Process with Cyclic Operation

7.1 Introduction

Recently, owing to increasingly stringent environmental regulations, advanced monitoring and control strategies for WWTPs have attracted a great deal of interest. However, some specific features of this process have yet to be addressed fully. First, most changes in this biological process are slow, and recovery from failures can be very time-consuming and expensive. Sometimes it takes several months for the process to recover from an abnormal operation. Therefore, early detection of developing abnormalities is especially important for this process. Second, most WWTPs are subject to large diurnal fluctuations in the flow rate and composition of the feed stream, so these biological processes exhibit periodic characteristics. Since the variables of such processes tend to fluctuate widely over a cycle, their mean and variance do not remain constant with time. Because of this, conventional statistical process monitoring (SPM) methods such as principal component analysis (PCA), which implicitly assume a stationary underlying process, may lead to many false alarms and missed faults. Better treatment performance can be expected by accounting for this periodic pattern when applying advanced monitoring and control strategies to this process.

Recently, several attempts have been made to treat the characteristic features of a particular WWTP. The first approach addresses the nonstationarity problem and includes adaptive PCA, dynamic PCA, PLS, monitoring based on state-space methods, and so on. The second approach addresses the multiscale problem and includes wavelets, multiscale PCA, multiresolution analysis, and so on.

Rosen (1998) used the conventional PCA and PLS methods on the simulation benchmark. For monitoring purposes, a PCA model was built from a 14-day period of dry weather conditions with diurnal and weekly variations. This PCA model was then tested on storm and rain weather and on a decreasing nitrification rate. For the prediction model, linear PLS, dynamic PLS and nonlinear PLS were compared in predicting the effluent nitrogen concentration. Because the nonstationary and periodic features of the benchmark influent were not taken into account, the T² and SPE charts also showed periodicity, and the results frequently exhibited false alarms and poor prediction performance.

Teppola (1997) applied the dynamic PLS method for process modeling and analysis in a real wastewater plant. Twenty process and control variables were used to model four variables related to purification efficiency, i.e. the diluted sludge volume index (DSVI) and the reductions of chemical oxygen demand (COD), nitrogen and phosphorus. The data set consisted of daily values of each variable over two years. The dynamic PLS model was used to extract relevant information from X to predict Y. The prediction performance typically varied from 60 to 90%. DSVI was explained very well in cross-validation compared with the other variables related to effluent quality, but some results of the prediction model were poor. In particular, some of the high nutrient peaks were difficult to predict because the X variables did not contain enough information. Moreover, the model did not include the seasonal variations of the process.

Rosen and Lennox (2000) pointed out the limitations of the conventional PCA method, namely that it assumes stationary data and a single time scale. The data used in their examples came from an industrial WWTP, with fourteen variables available from the on-line measurement system and a sampling time of five minutes. To address these limitations, they applied and compared adaptive PCA and multiscale PCA. The first approach, adaptive PCA, overcomes the problem of non-stationary process data by updating the mean, variance, and covariance structure. The monitoring model is continuously updated using an exponential memory function. These adaptations moved most of the variation from the residual plane (SPE) to the score plane (T2). The second approach, multiscale PCA, uses a wavelet transform to decompose the measurement data into different time scales, and a separate PCA model is used to monitor each scale. To interpret a disturbance, the scales were recombined into more physically interpretable ones: a fast scale (hydraulic dynamics), a medium scale (concentration dynamics), and a slow scale (population dynamics). Multiscale PCA increases the sensitivity of the monitoring and makes disturbances easier to interpret. However, adaptive PCA has limitations, such as adaptation to abnormal changes, the need to test the information content (information windup), and the difficulty of interpreting the updated covariance matrix. Multiscale PCA, in turn, makes the interpretation cumbersome because it uses a PCA model on each scale, and its covariance structure is static, which may introduce errors. Both approaches are rather complex, whereas the simplest adequate model should be used for monitoring. A trade-off between complexity and information should be considered.

Teppola (2000) suggested a combined approach of PLS and multiresolution analysis (MRA). In this work, a PLS model was built to remove the collinearity problem and to obtain a parsimonious model. The score values of the PLS model were then processed with wavelets and MRA to extract the process trends and to detect different kinds of faults and disturbances. It was shown how seasonal trends, faults, and disturbances can be separated and discriminated by observing them at different scales, and diagnosed by studying multiscale biplots and computing variable contributions. In a subsequent paper, Teppola (2001) presented a monitoring algorithm that removes periodic seasonal fluctuations and long-term drift, because these low-frequency variations mask and interfere with the detection of small and moderate-level transient phenomena. By detrending, this relatively common problem of autocorrelated measurements can be avoided. The algorithm first applies a wavelet filter to the original data, removes the low-frequency components, and then constructs the PLS model. For these particular data, the PLS monitoring results were shown to be superior to those of the conventional PLS model, because removing the low-frequency fluctuations yields a more stationary filtered data set that is better suited for monitoring.

Although these processes are non-stationary, their dynamic behavior tends to

repeat from cycle to cycle and hence their cycle-to-cycle behavior may be assumed

stationary. Hence, it is plausible to calculate and use different means and

covariances for different time points within each cycle. One can also establish

correlations among the samples at different time points of a cycle, much like in

Multiway-PCA (M-PCA) used for batch process monitoring (Nomikos and

MacGregor, 1994, 1995). Beyond that lies the possibility to capture correlation in

the variations from cycle to cycle for quicker detection of small mean shifts and

slow drifts. An efficient way of doing this is to describe the variations (from their

mean behavior) by a periodically time-varying (PTV) state-space model. However,

PTV system models are difficult to build, and a systematic framework is needed.

To provide this monitoring ability, we propose a monitoring method based

on a state-space model that captures and uses the period-to-period correlation structure.

7.2 Theory

Model development

To make the modeling task manageable, we adopt the technique of lifting. In the “lifted” form of a PTV model, all samples within one cycle are collected into a single vector (Dorsey and Lee, 2001). Let yk(t) represent the vector of mean-centered and scaled process measurements available at sample time t during cycle k, with yk(t) ∈ R^ny, where ny is the number of variables selected for monitoring (including the controller outputs and process outputs). Assume that there are N time samples in each cycle. Then the lifted vector is

$$ Y_k = \left[\, y_k^T(1),\; y_k^T(2),\; \cdots,\; y_k^T(N) \,\right]^T \qquad (7.1) $$
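The lifting step in (7.1) can be sketched in a few lines of Python. This is only an illustration: the function name `lift_cycles`, the list-of-lists data layout, and the choice to drop an incomplete trailing cycle are assumptions made for the example and are not taken from the thesis.

```python
def lift_cycles(samples, n_per_cycle):
    """Stack the N samples of each cycle into one lifted vector Y_k.

    samples: list of measurement vectors y(t), each a list of n_y values,
             ordered in time across consecutive cycles.
    Returns one lifted vector of length N * n_y per complete cycle.
    Samples belonging to an incomplete trailing cycle are dropped
    (an assumption of this sketch).
    """
    n_cycles = len(samples) // n_per_cycle
    lifted = []
    for k in range(n_cycles):
        cycle = samples[k * n_per_cycle:(k + 1) * n_per_cycle]
        # Concatenate y_k(1), y_k(2), ..., y_k(N) into a single flat vector.
        lifted.append([value for y_t in cycle for value in y_t])
    return lifted


# Two cycles, N = 3 samples per cycle, n_y = 2 variables per sample.
data = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50], [6, 60]]
Y = lift_cycles(data, n_per_cycle=3)
```

Each row of `Y` then plays the role of one lifted vector, so that cycle-to-cycle methods can treat a whole day as a single observation.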

First, a cycle-to-cycle invariant model is constructed using the subspace

identification technique (Van Overschee and De Moor, 1993, 1994, 1996; Dorsey

and Lee, 2001).

$$ x_{k+1} = A\,x_k + K\,e_k, \qquad Y_k = C\,x_k + e_k \qquad (7.2) $$

where A, K, and C are the system matrices; Yk is the lifted vector of all data for the kth cycle; xk is the state, which is extracted from the process data according to the relevance of previous-cycle measurements for predicting future-cycle measurements (that is, the state holds the information from previous cycles that is relevant for predicting future cycles); ek is the innovation vector, i.e., the residual between the process data and its estimate; and Kek is a stochastic input vector acting as a state disturbance.

The dimension of Yk (N*ny) is usually very high, and there are strong correlations among its elements. To reduce the dimension and facilitate the identification step, PCA can be applied to Yk to obtain the score vector $Y_k^r$. The stochastic state-space model can then be constructed with the reduced output vector.

$$ x_{k+1} = A\,x_k + K\,e_k, \qquad Y_k^r = C^r x_k + e_k \qquad (7.3) $$

$$ Y_k = \Theta\,Y_k^r + E_k \qquad (7.4) $$

Based on the above cycle-to-cycle system model (7.3), a time evolution PTV model

can be constructed for the on-line monitoring purpose.

$$ x_k(t+1) = x_k(t), \qquad y_k(t) = H(t)\,x_k(t) + \varepsilon_k(t) \qquad (7.5) $$

$$ H(t) = \left[\,0, \cdots, I(t), 0, \cdots\,\right]\,\Theta\,C^r \qquad (7.6) $$

where εk(t) includes both the appropriate elements of ek and the residual Ek from

PCA. The cycle-to-cycle transition is then described as

$$ x_{k+1}(0) = A\,x_k(N-1) + K\,e_k \qquad (7.7) $$

Hence, the terminal state of one period becomes the initial state of the next period.

This means the states of successive periods are naturally connected through the

dynamics of the process. This is the main difference from the case of batch systems

where the state is reset at the start of each run.

However, this model formulation may not be valid for a Kalman filter implementation, since the residual could become a correlated series instead of white noise; that is, the state noise Kek and the output noise εk are correlated. To deal with this, one can use the following augmented form of the model:

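$$ \begin{bmatrix} x_{k+1} \\ Y_k^r \end{bmatrix} = \begin{bmatrix} A & 0 \\ C^r & 0 \end{bmatrix} \begin{bmatrix} x_k \\ Y_{k-1}^r \end{bmatrix} + \begin{bmatrix} K \\ I \end{bmatrix} e_k, \qquad Y_k^r = \begin{bmatrix} C^r & 0 \end{bmatrix} \begin{bmatrix} x_k \\ Y_{k-1}^r \end{bmatrix} \qquad (7.8) $$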

where

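$$ z_{k+1} = \begin{bmatrix} x_{k+1} \\ Y_k^r \end{bmatrix}, \qquad \Phi = \begin{bmatrix} A & 0 \\ C^r & 0 \end{bmatrix}, \qquad \Gamma = \begin{bmatrix} K \\ I \end{bmatrix}, \qquad H = \begin{bmatrix} C^r & 0 \end{bmatrix} $$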

Then, within a cycle, the time model becomes

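$$ \begin{bmatrix} x_k(t+1) \\ Y_{k-1}^r \end{bmatrix} = \begin{bmatrix} x_k(t) \\ Y_{k-1}^r \end{bmatrix}, \qquad y_k(t) = \left[\,0, \cdots, I(t), 0, \cdots\,\right]\,\Theta\,\begin{bmatrix} 0 & I \end{bmatrix} \begin{bmatrix} x_k(t) \\ Y_{k-1}^r \end{bmatrix} + \nu_{PCA}(t) \qquad (7.9) $$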

where

$$ H(t) = \left[\,0, \cdots, I(t), 0, \cdots\,\right]\,\Theta\,\begin{bmatrix} 0 & I \end{bmatrix} $$

Finally, the cycle transition model is

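$$ \begin{bmatrix} x_{k+1}(0) \\ Y_k^r(0) \end{bmatrix} = \begin{bmatrix} A & 0 \\ C^r & 0 \end{bmatrix} \begin{bmatrix} x_k(N-1) \\ Y_{k-1}^r(N-1) \end{bmatrix} + \begin{bmatrix} K \\ I \end{bmatrix} e_k, \qquad Y_k^r = \begin{bmatrix} C^r & 0 \end{bmatrix} \begin{bmatrix} x_k(N-1) \\ Y_{k-1}^r(N-1) \end{bmatrix} \qquad (7.10) $$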

Now, a periodic Kalman filter can be applied to the model (7.9) and (7.10). The PTV Kalman filter can be designed using the standard equations to update the state and the output score vectors recursively on the basis of incoming measurements. Let Φ, H, Γ, Q, and R denote the state transition, output, state noise coefficient, state noise covariance, and measurement noise covariance matrices, respectively. The Kalman filter equations are as follows:

$$
\begin{aligned}
x(t+1|t) &= \Phi\,x(t|t) \\
x(t+1|t+1) &= x(t+1|t) + K(t+1)\left[\,y(t+1) - H\,x(t+1|t)\,\right] \\
K(t+1) &= P(t+1|t)\,H^T \left[\,H\,P(t+1|t)\,H^T + R(t+1)\,\right]^{-1} \\
P(t+1|t) &= \Phi\,P(t|t)\,\Phi^T + \Gamma\,Q\,\Gamma^T \\
P(t+1|t+1) &= \left[\,I - K(t+1)\,H\,\right] P(t+1|t)
\end{aligned}
\qquad (7.11)
$$

For t = 0, …, N-2 (within a cycle), Φ = I, Q = 0, H = H(t), and R can be estimated from the PCA residual. For t = N-1 (cycle transition), Φ and H are the matrices defined with (7.8), Q is the error covariance matrix obtained from subspace identification, and R = 0.
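The predict/update recursion of (7.11) and its periodic parameter schedule can be illustrated with a scalar sketch. This is a deliberately simplified illustration and not the filter used in the thesis: the state and measurement are scalars, the output coefficient and measurement noise variance are held constant over the whole run (whereas the text uses H(t) and sets R = 0 at the transition), and the names `kalman_step`, `run_periodic_filter`, `phi_cycle`, and `q_cycle` are hypothetical.

```python
def kalman_step(x, P, y, phi, h, gamma, q, r):
    """One scalar Kalman filter step in the form of (7.11)."""
    # Time update (prediction).
    x_pred = phi * x
    P_pred = phi * P * phi + gamma * q * gamma
    # Measurement update (correction).
    K = P_pred * h / (h * P_pred * h + r)
    x_new = x_pred + K * (y - h * x_pred)
    P_new = (1.0 - K * h) * P_pred
    return x_new, P_new, K


def run_periodic_filter(ys, n_per_cycle, phi_cycle, q_cycle, r, x0=0.0, P0=1.0):
    """Hold the state within a cycle; apply the dynamics at cycle boundaries."""
    x, P = x0, P0
    estimates = []
    for t, y in enumerate(ys):
        # The step into sample t crosses a cycle boundary when t is a positive
        # multiple of n_per_cycle. Within a cycle, phi = 1 and q = 0.
        boundary = t > 0 and t % n_per_cycle == 0
        phi, q = (phi_cycle, q_cycle) if boundary else (1.0, 0.0)
        x, P, _ = kalman_step(x, P, y, phi, 1.0, 1.0, q, r)
        estimates.append(x)
    return estimates


estimates = run_periodic_filter([2.0, 2.0, 2.0], n_per_cycle=2,
                                phi_cycle=0.5, q_cycle=0.1, r=1.0)
```

Within a cycle the error variance P can only shrink, because no process noise is added; uncertainty is reintroduced only at the cycle transition, which mirrors the schedule described above.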

Process Monitoring Measure

The Hotelling’s T2 monitoring statistic based on the state space was proposed by Negiz and Cinar (1997). The Hotelling’s T2 statistic is a metric that includes information on both the mean and the covariance structure of the state variables.

$$ T_k^2 = x_k^T\,\Sigma^{-1}\,x_k \;\sim\; \frac{n\,(N-1)}{N-n}\,F_\alpha(n,\,N-n) \qquad (7.12) $$

where the subscript k indicates time, N is the number of samples, and n is the dimension of the state vector x. The T2 limit is obtained by assuming that the state variables follow a zero-mean multivariate normal distribution with estimated covariance matrix Σ and are orthogonal at zero lag. Fα(n, N-n) is the upper 100α% critical point of the F-distribution with n and N-n degrees of freedom, which can be used to establish a control limit with significance level α.
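As a worked illustration of the T2 statistic and its control limit, the following sketch evaluates the quadratic form for a given state vector and covariance matrix. It is an assumption-laden example: the helper names are hypothetical, the linear system is solved by plain Gaussian elimination to stay within the standard library, the F critical value must be supplied by the caller (for example from a statistical table), and the scaling n(N-1)/(N-n) is the reading of (7.12) assumed here.

```python
def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in row] + [float(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def t2_statistic(x, cov):
    """Hotelling's T2 = x^T cov^{-1} x for a zero-mean state vector x."""
    z = solve_linear(cov, x)  # z = cov^{-1} x without forming the inverse
    return sum(xi * zi for xi, zi in zip(x, z))


def t2_limit(n_states, n_samples, f_critical):
    """Control limit n(N-1)/(N-n) * F_alpha(n, N-n), with F_alpha given."""
    return n_states * (n_samples - 1) / (n_samples - n_states) * f_critical


t2 = t2_statistic([1.0, 1.0], [[2.0, 1.0], [1.0, 2.0]])
```

A sample would be flagged when `t2` exceeds `t2_limit(...)` evaluated with the critical value for the chosen significance level.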

The updated state vector x and the output score vector Yr can be monitored separately using T2 statistics. Note that T2 monitoring of Yr amounts to period-by-period M-PCA monitoring implemented in real time. Monitoring the state vector x(t) can give additional information about small mean shifts or slowly developing abnormalities that are hard to detect with PCA, because the state holds the information from previous cycles that is relevant for predicting future cycles. Therefore, a monitoring algorithm built around the developed state-space model can efficiently detect abnormal deviations in the cycle-to-cycle process dynamics as well as deviations in the process measurements (Dorsey and Lee, 2001).

On the other hand, in many chemical and biological processes, some quality variables cannot be measured on-line, and their laboratory measurements become available only after long delays. In addition, advanced nutrient sensors are expensive to purchase and maintain. In such cases, inferential sensing can be very useful for on-line monitoring and control. An added advantage of the proposed framework is that inferential sensing is very easy to implement. For this purpose, the quality variables $y_k^q$ can be augmented with all other measured outputs before PCA is applied, such that

$$ Y_k = \left[\, y_k^T(1),\; y_k^{qT}(1),\; y_k^T(2),\; y_k^{qT}(2),\; \cdots,\; y_k^{qT}(N) \,\right]^T \qquad (7.13) $$

Note that the quality measurements do not have to be available at the same rate as the process measurements. All the model formulations presented above remain the same. Once the model is built, the Kalman filter can be designed to use whatever measurements are available at each time. Inferential predictions of the quality variables are obtained simply by picking out the proper elements of the Kalman filter estimate and transforming them back from the score space to the original full space.

$$ y_k^q(t) = \left[\,0, \cdots, I^q(t), 0, \cdots\,\right]\,\Theta\,\begin{bmatrix} C^r & I \end{bmatrix} \begin{bmatrix} x_k(t) \\ e_k(t) \end{bmatrix} \qquad (7.14) $$


The proposed monitoring method for cyclic processes is tested on data generated from a simulation of the benchmark plant. The existing dry-weather data set of the benchmark is not well suited to constructing a cycle-to-cycle model. The influent data vary little from day to day, whereas a dynamic day-to-day model requires data that contain sufficient day-to-day variation and correlation. In addition, the benchmark data set is not long enough: for a daily model, only 14 days of data are available, which is far from sufficient, especially for the subsequent subspace identification.

To generate data that show day-to-day correlation as well as some in-cycle correlation, a variation (or disturbance) model is applied to both SNH,in and Qin. The functions used are as follows.

For the within-cycle (time) variation,

$$ E_k(t+1) = a\,E_k(t) + e(t) \qquad (7.15) $$

For the cycle-to-cycle variation,

$$ \theta_{k+1} = b\,\theta_k + E_k \qquad (7.16) $$

where θk is a vector that stores the variations of all samples in cycle k. In the simulated data set, the in-cycle variation is assumed to occur only four times a day, i.e., every 6 hours. The reason is that too much in-cycle correlation would require more PCA scores to capture the whole-cycle dynamics, which would complicate the subsequent state-space modeling and make it hard for the PTV Kalman filter to follow the trend.
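The two recursions (7.15) and (7.16) are first-order autoregressive updates and can be sketched as below. The function names, the explicit noise sequence passed in by the caller, and the numerical values are assumptions for illustration only; the coefficients a and b used in the thesis are not stated in this section.

```python
def within_cycle_variation(a, noise, e0=0.0):
    """Iterate E(t+1) = a * E(t) + e(t) over a given noise sequence e(t)."""
    values = [e0]
    for e in noise:
        values.append(a * values[-1] + e)
    return values


def next_cycle_variation(b, theta_k, E_k):
    """Apply theta_{k+1} = b * theta_k + E_k element by element."""
    return [b * th + e for th, e in zip(theta_k, E_k)]


# Hypothetical values: a = 0.5 with a constant unit noise input.
E = within_cycle_variation(0.5, [1.0, 1.0, 1.0])
theta_next = next_cycle_variation(0.5, [2.0, 4.0], [1.0, -1.0])
```

With |a| < 1 and |b| < 1 both recursions are stable, so a disturbance decays over successive samples and cycles instead of accumulating.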

We generated a 300-day normal data set from the influent data file. The last 200 days were used as the modeling data set for PCA and N4SID, and any part of the first 100 days was used as the prediction test set. Figure 7.1 shows the measured variables over the first 10 days of the normal data set, which exhibit a cyclic, diurnal variation. Two disturbance scenarios were simulated. The first is a slow linear decrease of the nitrification rate. The second is a small mean (step) decrease of the nitrification rate. Table 7.1 lists the simulation conditions of the two disturbance cases.

For comparison, we applied the conventional PCA and PLS methods both to the original measured data set and to the data set with the mean variation subtracted. For the latter, the average trajectory within a day (96 sample times) is subtracted from the data, which removes the within-day periodicity, and PCA is applied to the remaining data. The PCA model is then built on the auto-scaled data set with zero mean and unit variance.
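Subtracting the average daily trajectory can be sketched as follows. The function name and the handling of an incomplete trailing cycle (it is dropped) are assumptions of this example; the period would be 96 samples for the data described above, while the usage example uses a period of 2 to keep the numbers readable.

```python
def subtract_mean_trajectory(series, period):
    """Remove the average within-cycle trajectory from a univariate series.

    Returns (mean_trajectory, residual). Samples of an incomplete trailing
    cycle are dropped (an assumption of this sketch).
    """
    n_cycles = len(series) // period
    usable = series[:n_cycles * period]
    # Average of the values observed at the same time point of every cycle.
    mean_trajectory = [
        sum(usable[k * period + t] for k in range(n_cycles)) / n_cycles
        for t in range(period)
    ]
    residual = [v - mean_trajectory[i % period] for i, v in enumerate(usable)]
    return mean_trajectory, residual


mean_traj, resid = subtract_mean_trajectory([1.0, 2.0, 3.0, 4.0], period=2)
```

Note that this removes only the time-varying mean; any time dependence of the variance within the cycle remains in the residual.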

Monitoring and quality prediction performance is compared with that of the static PCA and PLS methods. The variables used to build the X-block for disturbance detection were the influent ammonia concentration (SNH,in), influent flow rate (Qin), nitrate concentration in the second aerator (SNO,2), total suspended solids in aerator 4 (TSS4), DO concentration in aerators 3 and 4 (SO,3, SO,4), oxygen transfer coefficient in aerator 5 (KLa5), and internal recirculation rate (Qint). A PCA model with 95% and 99% confidence limits is built from a training period of 200 days of dry-weather data (normal operation). Three PCs are selected for the PCA model, and the captured variability is shown in Table 7.2. Figure 7.2 shows the T2 and SPE plots of the conventional PCA method during the first 10 days of the normal data set. During normal operation with diurnal influent, the PCA monitoring result shows apparent cyclic and non-stationary characteristics. This non-stationary (periodic) behavior of the T2 values causes false alarms, and the resulting widened confidence limits cause missed faults.

Monitoring of nitrification linear decrease case

Figure 7.3 shows the conventional PCA monitoring results, which assume stationary statistics, for the linear nitrification decrease case. The T2 and SPE plots for samples 0-1000 show poor monitoring performance for this linear-shift disturbance: the T2 plot cannot detect the event, and the SPE plot detects it with a delay of about 200 samples. This stems from the confidence limits being widened by the periodic mean behavior. With such non-stationary (periodic) behavior, the method tends to give false alarms at peak values but is insensitive during low-value periods.

Figure 7.4 shows the PCA monitoring results with the mean variation subtracted for the linear nitrification decrease case during samples 0-500. The control limits of the PCA method for the data set with the cycle removed are higher than those for the conventional data set. The T2 and SPE plots show better monitoring performance than for the original data set; the T2 plot no longer shows the periodic variation and reveals other variations instead. Figure 7.4 shows that the T2 measure detects the disturbance with a delay and that the SPE plot detects it with a delay of about 100 samples. However, the T2 and SPE plots still produce some false alarms.

The monitoring result of the proposed method is shown in Figure 7.5. The 95% confidence limit is shown as the horizontal dotted line on the plots. The result demonstrates the high sensitivity of the proposed state-space monitoring method: the T2 statistic of the state vector immediately detects the linear shift in the nitrification rate. This earlier detection could allow corrective action and operational changes before serious situations such as biomass decay and sludge bulking occur.

Monitoring of nitrification step decrease

Figure 7.6 shows the conventional PCA monitoring results for the small step decrease of the nitrification rate. The PCA model performs poorly for this step-type disturbance: neither the T2 nor the SPE plot detects the small step decrease. This again stems from the confidence limits being widened by the periodic mean behavior. With non-stationary behavior, the method tends to give false alarms at peak values but is insensitive during low-value periods. Because the small decrease of the nitrification rate is introduced during a low-value period, the PCA monitoring method, which is based on the average of the whole trajectory of each variable, cannot detect this type of small mean shift.

Figure 7.7 shows the PCA monitoring results with the mean variation subtracted for the nitrification step decrease case during samples 0-1000. Here, the normal trajectory within a day is subtracted. As in the previous case, the T2 and SPE plots show slightly better monitoring performance than for the original data set. However, both plots detect the disturbance with a delay of about 200-300 samples and also produce some false alarms. This is because PCA with the mean trajectory subtracted accounts only for the time-varying mean and assumes a constant variance, whereas the variance can also vary with time.

The monitoring result of the proposed method is shown in Figure 7.8. The result demonstrates the high sensitivity of the proposed state-space monitoring method: the T2 statistic of the state vector immediately detects the small step decrease of the nitrification rate. This earlier detection could allow corrective action and operational changes before undesirable situations such as reduced plant efficiency and decreased settling ability occur.

Inferential sensing

The quality prediction performance of inferential sensing is compared with that of the PLS method for both the original data set and the data set with the mean variation subtracted. We generated a 300-day normal data set, of which the last 200 days were used as the modeling data set for PLS and any part of the first 100 days was used as the validation set. Because this research is concerned with the prediction ability of the integrated monitoring model, the inferential model was based on 200 days of normal operation and did not consider any time lags between the input and output. The quality variables are the effluent ammonia and nitrate, SNH,e and SNO,e.

A linear PLS model of the original data set is built for the prediction of the quality variables, with four latent variables (LVs) selected. The prediction results for the validation data set are shown in Figure 7.9. The PLS model predicts both SNH,e and SNO,e poorly, because they have nonlinear and periodic dynamics that a linear PLS model cannot capture. A PLS model is then built for the data set with the periodicity removed, again with four LVs. Compared with the previous PLS model, Figure 7.10 shows much better prediction performance. This means that subtracting the average trajectory from the periodic process removes the major nonlinear behavior of SNH,e. Finally, Figure 7.11 shows the prediction performance of the proposed inferential sensing. Table 7.3 compares the mean squared prediction error (MSE) of the three prediction methods. Like the PLS prediction for the data set with the periodicity removed, the proposed method removes the periodic dynamics and shows good prediction performance. This is not a surprising result, as the proposed method captures most of the variability of the normal data set and the main dynamics of the system. The proposed method therefore offers an advantage for process monitoring as well as the possibility of predicting the quality variables within one integrated model.


7.4 Conclusions

In this research, we propose a monitoring method based on state-space models for the diurnal cyclic characteristics of a domestic WWTP. A state-space model is identified to extract “within-cycle” and “between-cycle” correlation information from historical data using the subspace identification method. First, a cycle-to-cycle invariant model is constructed using the subspace identification method. Second, a time-varying Kalman filter model is constructed for on-line monitoring. For the purpose of inferential sensing, an integrated framework that augments the quality variables is also suggested. Simulation results show that the proposed method is an appropriate monitoring technique for a WWTP with cyclic operation. In particular, compared with conventional monitoring methods, it detects changes in cycle-to-cycle behavior (the linear decrease of nitrification) more rapidly and detects mean shifts (the step decrease of nitrification) more accurately.


Figure 7.1 Measured variables of the first 10 days of normal data set


Figure 7.2 Conventional PCA monitoring result during the first 10 days of normal

data set

[Plot residue: top panel "Value of T2 with 95 and 99" (y-axis: Value of T2); bottom panel "Process Residual Q with 95" (y-axis: Residual); x-axis of both panels: Sample Number, 0 to 1000]


Figure 7.3 Conventional PCA monitoring result for nitrification linear decrease: T2

and SPE plot


Figure 7.4 PCA monitoring result with periodic removal for nitrification linear

decrease: T2 and SPE plot


Figure 7.5 Monitoring result of the proposed method for nitrification linear decrease


Figure 7.6 Conventional PCA monitoring result for nitrification step decrease: T2

and SPE plot


Figure 7.7 PCA monitoring result with periodic removal for nitrification step

decrease: T2 and SPE plot


Figure 7.8 Monitoring result of the proposed method for nitrification step decrease


Figure 7.9 Prediction results of SNH,e and SNO,e for validation data set with static PLS

method


Figure 7.10 Prediction results of SNH,e and SNO,e for validation data set with static

PLS method (after periodic removal)


Figure 7.11 Prediction results of SNH,e and SNO,e for validation data set with the

proposed method

Table 7.1 Two disturbances in the benchmark

Disturbance                     Linear/step   Simulation conditions

Decreasing nitrification rate   Linear        Specific growth rate for autotrophs: from 0.5 to 0.4 day-1 in a linear fashion during samples 288 (day 3) to 480 (day 5)

Decreasing nitrification rate   Step          Specific growth rate for autotrophs: from 0.5 to 0.47 day-1 in a step fashion at sample 288 (day 3)

Table 7.2 Percent variance captured by PCA model

        Original                                Periodic removal
PCs     % Variance      % Variance              % Variance      % Variance
        captured        captured                captured        captured
        this PC         total                   this PC         total

1       69.8            69.80                   39.91           39.91

2       19.18           88.98                   21.05           60.96

3       6.37            95.35                   15.48           76.43

Table 7.3 MSE of two PLS methods and the proposed method

Model type               MSE

Conventional PLS         2.2460

Periodic removal PLS     0.1023

The proposed method      0.0044


VIII. Simultaneous Prediction and Classification in the

Secondary Settling Tank

8.1 Introduction

As environmental regulations become increasingly strict, greater effort toward higher effluent quality from wastewater treatment plants is required, and with it more advanced monitoring of plant performance. To reduce the effluent pollutants from a wastewater treatment plant (WWTP), the current state of each unit of the plant must first be analyzed. Most changes in a WWTP are slow when the process is recovering from a ‘bad’ state to a ‘normal’ state. Early fault detection and classification of the current status of the biological process make it possible to take corrective action well before a dangerous situation develops. For this reason, the monitoring and multivariate analysis of the activated sludge process and the secondary settler have long received attention (Olsson et al., 1988; Hasselblad et al., 1996; Stefano, 1998; Teppola et al., 1997, 1999; Van Dongen et al., 1998; Rosen et al., 1998).

Secondary sedimentation in the WWTP separates the biomass from the treated wastewater, and its performance is crucial to the operation of an activated sludge system. Its operation depends on the status of the sludge, which in turn depends on many other parameters such as temperature, organic loading, influent flow rate, and floc properties. The sludge volume index (SVI) primarily describes the settling properties of the sludge. High SVI values indicate a bulking state and an overgrowth of filamentous microbes, which is one of the major upsets of the activated sludge process and leads to deterioration of the purification efficiency. Therefore, predicting the SVI value is very important for the operating strategy of the settler.

In this research, we forecast the sludge volume index (SVI) of the secondary settler using an adaptive RLS method. The ARX model parameters are proposed as features for monitoring the secondary clarifier, and their suitability is verified by observing the evolution of the ARX model parameters through the power spectrum. The capability of monitoring the secondary clarifier is illustrated with a neural network classifier which, combined with the adaptive processing scheme, proved suitable for monitoring and classification in a real wastewater treatment plant.

8.2 Theory

The monitoring system for the secondary settler is based on a neural network, which offers parallel distributed processing and a powerful capability for learning and generalizing pattern information. The system is composed of three fundamental parts. First, auto-regressive with exogenous input (ARX) models of the SVI time series of the secondary settler are used to predict the SVI value. The model parameters are estimated adaptively by the RLS method and provide the feature vectors. Their discriminating ability is shown by power spectrum analysis of the ARX parameters. Second, in the classification step, a neural network classifier is designed to identify the current state of the secondary clarifier. After the neural network has been trained, the state class of the settler is recognized from the values of the output nodes, with the class chosen according to the maximum selection rule. Third, for the structure of the classifier, the optimal number of hidden nodes is determined using a GA.

SVI prediction with RLS method

To model the SVI of the secondary settler, we introduce an identification method for the ARX model based on the recursive least squares (RLS) method (Ljung, 1987; Ko and Cho, 1996). A general form of the discrete ARX model is as follows.

$$ y(t) + a_1 y(t-1) + \cdots + a_{n_a} y(t-n_a) = b_1 u(t-1) + b_2 u(t-2) + \cdots + b_{n_b} u(t-n_b) + e(t) \qquad (8.1) $$

The objective of ARX modeling is to estimate the adjustable parameters ai and bi so as to minimize the difference between the predicted and the measured process output. Because the secondary settler is a time-varying process with inherently dynamic characteristics, the model must be adaptive. For this purpose, we use the RLS algorithm, which makes the modeling technique well suited to a time-varying environment. The RLS algorithm is as follows.

$$
\begin{aligned}
\hat{\theta}(t) &= \hat{\theta}(t-1) + K(t)\left(y(t) - \hat{y}(t)\right) \\
\hat{y}(t) &= \varphi^T(t)\,\hat{\theta}(t-1) \\
K(t) &= \frac{P(t-1)\,\varphi(t)}{\lambda + \varphi^T(t)\,P(t-1)\,\varphi(t)} \\
P(t) &= \frac{1}{\lambda}\left[P(t-1) - \frac{P(t-1)\,\varphi(t)\,\varphi^T(t)\,P(t-1)}{\lambda + \varphi^T(t)\,P(t-1)\,\varphi(t)}\right]
\end{aligned}
\qquad (8.2)
$$

where K(t) is the adaptation gain, θ̂(t) is the parameter estimate at time t, ŷ(t) is the prediction based on observations up to time t-1, ϕ(t) is the regression vector, λ is the forgetting factor, and P(t) is the covariance matrix of the estimates. This recursive form is very convenient for updating the model at each time step, so that the model follows gradual changes in the characteristics of the settler process. If the parameters of the ARX model are well tuned, a change in the dynamic characteristics of the settler process causes a gradual change in the parameter vector and in the prediction error. Therefore, the status of the secondary settler can be observed through gradual changes in the ARX model parameters.
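A minimal sketch of the RLS update (8.2) for a first-order ARX model is given below. It is only an illustration under stated assumptions: the model order (na = nb = 1), the true parameters, the forgetting factor 0.98, the initial covariance, and the noise-free random input are all hypothetical choices and are not the settings used for the settler data.

```python
import random


def rls_update(theta, P, phi, y, lam=0.98):
    """One recursive least squares step with forgetting factor lam."""
    n = len(theta)
    # P(t-1) * phi(t); P is symmetric, so this also equals (phi^T P)^T.
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = lam + sum(phi[i] * P_phi[i] for i in range(n))
    K = [v / denom for v in P_phi]                       # adaptation gain
    y_hat = sum(phi[i] * theta[i] for i in range(n))     # one-step prediction
    theta_new = [theta[i] + K[i] * (y - y_hat) for i in range(n)]
    P_new = [[(P[i][j] - K[i] * P_phi[j]) / lam for j in range(n)]
             for i in range(n)]
    return theta_new, P_new, y_hat


# Hypothetical system in the form of (8.1): y(t) + a1*y(t-1) = b1*u(t-1).
a1_true, b1_true = -0.8, 0.5
rng = random.Random(0)
theta = [0.0, 0.0]                       # estimates of [a1, b1]
P = [[1000.0, 0.0], [0.0, 1000.0]]       # large initial covariance
y_prev, u_prev = 0.0, rng.uniform(-1.0, 1.0)
for _ in range(200):
    y = -a1_true * y_prev + b1_true * u_prev
    phi = [-y_prev, u_prev]              # regression vector for (8.1)
    theta, P, _ = rls_update(theta, P, phi, y)
    y_prev, u_prev = y, rng.uniform(-1.0, 1.0)
```

The sequence of `theta` values produced over time is the kind of feature vector that the text proposes to feed to the classifier; a forgetting factor below one lets the estimates track slow changes in the process.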

Power Spectrum

To examine the sensitivity of the ARX coefficients of the SVI in each state and to verify their discriminating ability, the power spectra of the different states must be compared. The power spectrum of a stationary process is defined as the Fourier transform of its covariance function (Ljung, 1987). While a deterministic signal can be expressed as a mixture of sine and cosine functions at different frequencies, a time series or stochastic system response does not belong to the class of functions dealt with in the usual Fourier transform theory. The frequency decomposition of such random functions can instead be obtained by taking the Fourier transform of the auto-covariance function, to which the usual Fourier transform can be applied.

For a stochastic process, y(t) can be given by

$$ y(t) = G(q)\,u(t) + H(q)\,e(t) \qquad (8.3) $$

where u(t) is a quasi-stationary, deterministic signal with spectrum $\Phi_u(\omega)$, and e(t) is white noise with variance $\sigma_a^2$. Let G(q) and H(q) be stable filters. Then y(t) is quasi-stationary and

$$ \Phi_y(\omega) = \left|G(e^{i\omega})\right|^2 \Phi_u(\omega) + \sigma_a^2 \left|H(e^{i\omega})\right|^2 \qquad (8.4) $$

$$ \Phi_{yu}(\omega) = G(e^{i\omega})\,\Phi_u(\omega) \qquad (8.5) $$

where Φy(ω) is the power spectrum of y(t) and Φyu(ω) is the cross spectrum of y(t) and u(t). It should be noted that spectrum estimates of this type are inherently smooth because they are obtained from a parametric representation of the system. The result has a physical interpretation: |G(e^{iω})| is the steady-state amplitude gain of the response of the system to a sine wave of frequency ω. The contribution of the input to the spectral density of the output is then the product of the power gain |G(e^{iω})|^2 and the spectral density of the input Φu(ω). If the power spectra of the different states are separated and show clearly dissimilar values, analyzing the power spectrum of the ARX parameters can be used to decide the state of the secondary settler.
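The parametric spectrum of (8.4) can be evaluated directly from ARX coefficients. The sketch below assumes the ARX structure of (8.1), for which G(q) = B(q)/A(q) and H(q) = 1/A(q); the function name and the numerical values are hypothetical and serve only to illustrate the calculation.

```python
import cmath


def arx_output_spectrum(a, b, omega, phi_u, sigma2):
    """Output spectrum |G|^2 * phi_u + sigma2 * |H|^2 of an ARX model.

    a = [a1, ..., a_na] and b = [b1, ..., b_nb] follow the sign convention
    of (8.1); omega is the frequency in rad/sample, phi_u the input
    spectrum at omega, and sigma2 the white noise variance.
    """
    z_inv = cmath.exp(-1j * omega)  # q^{-1} evaluated on the unit circle
    A = 1.0 + sum(ai * z_inv ** (i + 1) for i, ai in enumerate(a))
    B = sum(bi * z_inv ** (i + 1) for i, bi in enumerate(b))
    G = B / A
    H = 1.0 / A
    return abs(G) ** 2 * phi_u + sigma2 * abs(H) ** 2


# Hypothetical first-order model: y(t) - 0.5*y(t-1) = u(t-1) + e(t).
low = arx_output_spectrum([-0.5], [1.0], omega=0.0, phi_u=1.0, sigma2=1.0)
high = arx_output_spectrum([-0.5], [1.0], omega=cmath.pi, phi_u=1.0, sigma2=1.0)
```

For this example the spectrum is larger at low frequency than at the highest frequency, which is the low-pass behavior expected from a positive autoregressive pole.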

Pattern classification (neural network)

Although the different states are not completely separable in the original input and output space under a wide range of conditions, the classes become separable in the feature space of the ARX parameters when a neural network classifier, which is capable of nonlinear mapping, is used. The ARX parameters were therefore used as input features for the neural network classifier, which is explained below.

Pattern recognition methods such as neural networks are important for classification problems because they do not require accurate process models, which are often difficult to obtain for many biological and chemical processes. In addition, neural network computing outperforms the conventional statistical approach in many engineering applications because of its nonlinear transformation (Bishop, 1995; Lin and Lee, 1996; Haykin, 1999). A neural network maps a set of input patterns (e.g., process operating conditions) to the respective output classes (e.g., categorical groups). We use an input vector and an output vector to represent the input pattern and the output class, respectively. The output vector, y, of the neural network is bipolar, with -1 indicating that the input pattern is not within a specific class and 1 indicating that it is (e.g., “-1” = not in class I; “1” = in class I). The actual output of the neural network is a numerical value between -1 and 1 and can be viewed as an indication of how likely it is that the input pattern corresponds to a specific class. The output vector (y) covers three possible classes, that is, y = {class I, class II, class III}. Note that every point within the input space must be assigned to exactly one class. In this paper, there are only three possible output vectors for training the network, namely y = {[1,-1,-1], [-1,1,-1], [-1,-1,1]}.

After the neural network classifier output has been calculated, the values of the output nodes are passed to the maximum selector. The output node chosen by the maximum selector identifies the class to which the current input belongs. In theory, for an M-class classification problem in which the union of the M distinct classes forms the entire input space, a total of M outputs is needed to represent all possible classification decisions. The decision rule can be expressed as follows.

$$\text{If } y_i(x_j) > y_k(x_j) \text{ for all } k \neq i \ (k = 1, 2, \ldots, M), \text{ then } x_j \in s_i \tag{8.6}$$

where $x_j$ is the jth input vector, $y_i$ is the value of the ith output node of the neural network classifier for input $x_j$, $s_i$ is the ith state of the secondary clarifier, and M is the number of output nodes. A unique largest output value exists with probability 1 when the underlying posterior class distributions are distinct.
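The maximum selector of equation (8.6) can be illustrated with a few lines of Python. This is only a minimal sketch: the function name `maximum_selector` and the example output values are hypothetical, and the state names are taken from the three classes used later in the chapter.

```python
import numpy as np

# State names follow the three classes described in the text.
STATES = ("normal", "bad", "bulking")

def maximum_selector(outputs):
    """Return the state whose output node is largest for each input, as in (8.6)."""
    outputs = np.atleast_2d(np.asarray(outputs, dtype=float))
    winners = np.argmax(outputs, axis=1)   # index i with y_i(x_j) > y_k(x_j), k != i
    return [STATES[i] for i in winners]

# Hypothetical classifier outputs (each value lies between -1 and 1).
example_outputs = [[0.8, -0.7, -0.9],
                   [-0.2, 0.4, -0.6],
                   [-0.9, -0.3, 0.1]]
labels = maximum_selector(example_outputs)
print(labels)
```

Note that `np.argmax` returns the first index when two outputs tie exactly, a case the rule (8.6) leaves undefined.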

Genetic algorithm (GA)

The GA is a derivative-free stochastic optimization technique whose search is based on principles of nature such as natural selection, crossover, and mutation (Marsili-Libelli, 1996; Wang, et al., 1998). One characteristic of the GA is that it searches from multiple points at once, which distinguishes it from other random search methods. In this paper, the string, which is a model of a chromosome, represents the number of hidden nodes of the neural network. The GA typically starts by randomly generating an initial population of strings. Each string is mapped to a fitness value to obtain a quantitative measure of its quality. On the basis of the fitness values, the strings undergo genetic operations. The genetic operations are repeated until a set of parameters that gives the optimal solution to the problem is found or the generation limit is reached.

Since the ultimate objective of a pattern classifier is to achieve an acceptable rate of correct classification, this criterion is used to judge when the variable parameters of the neural network are optimal. In addition, the GA is a useful tool for selecting features for neural network classifiers. For example, a GA can be used to learn the neural network structure or to initialize the weights, which are generally assigned randomly, at reasonable values. This paper uses a hybrid algorithm in which the GA optimizes the neural network structure in order to improve the behavior and design of the network. Specifically, the GA was used to find the optimal number of hidden nodes.

Hierarchy structure

A monitoring system for the secondary settler has been developed based on a neural network, which offers parallel distributed processing together with powerful learning and generalization of pattern information. The system is composed of three fundamental parts: adaptive ARX estimation, neural network classification, and maximum decision rule making. Figure 8.1 shows the schematic diagram of the proposed hierarchy structure.

First, adaptive estimation is performed to predict the SVI value and to provide feature vectors. That is, an ARX model is used to predict the SVI value in the secondary settler. Its parameters are adaptively estimated by the RLS method and form the input vector of the neural network classifier. Second, in the neural network classification, the feature vectors are associated with the desired output decision. In the structure of the neural network classifier, the optimal number of hidden nodes is decided using the GA. After the neural network has been trained, the classifier output is calculated with the trained weights. Third, in the maximum decision rule, only one of the classifier outputs is chosen according to the rule "the minority is subordinated to the majority".


In this research, we used data from the industrial wastewater treatment facility of an iron and steel making plant in Korea. It is a general activated sludge process that has five aeration basins and a secondary clarifier. Figure 2.3 shows the layout of the WWTP. The data set consisted of daily mean values from January 1, 1997 to December 22, 1999. The data were divided into two parts: a training set consisting of the values from the first two years, and a test set consisting of the values from the remaining year, which was used to see how the monitoring proceeds with the proposed algorithm.

First, the ARX model structure is as follows. It has four inputs: the influent flow rate, the influent COD, the dissolved oxygen (DO) in the final aeration basin, and the mixed liquor suspended solids (MLSS) in the final aeration basin. The output variable is the SVI of the settler. The state of the secondary settler is divided into three classes, which were judged by an experienced operator. The choice of the model order is a non-trivial problem that requires a trade-off between a precise description of the data and model complexity. We determined the model order using cross-validation and numerous simulations. The prediction model uses an ARX structure whose parameters are adapted by RLS with a forgetting factor, where the order of the AR part is 3 and the order of each exogenous input is 2. The applied ARX model has the following form.

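$$
\begin{aligned}
y(t) + a_1 y(t-1) + a_2 y(t-2) + a_3 y(t-3) ={}& b_{1,1} u_1(t-1) + b_{1,2} u_1(t-2) + b_{2,1} u_2(t-1) + b_{2,2} u_2(t-2) \\
&+ b_{3,1} u_3(t-1) + b_{3,2} u_3(t-2) + b_{4,1} u_4(t-1) + b_{4,2} u_4(t-2)
\end{aligned}
\tag{8.7}
$$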

where y(t) is the SVI, $u_1(t)$ is the influent flow rate, $u_2(t)$ is the influent COD, $u_3(t)$ is the DO and $u_4(t)$ is the MLSS. To remove data redundancy, we normalize the raw training data. The RLS method uses a dead zone to remedy estimator windup. Figure 8.2 shows the one-step-ahead prediction of the SVI, which follows the measured values reasonably well. The dots are the measured values and the solid line is the prediction.
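The adaptive estimation of the eleven ARX parameters in (8.7) can be sketched as follows. This is an illustrative implementation of standard RLS with a forgetting factor and a simple dead zone, not the exact code used in the study: the function names, the forgetting factor 0.98, the dead-zone width, the initial covariance and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def regressor(y, u, t, na=3, nb=2):
    """Regressor of (8.7): [-y(t-1..t-na), u_j(t-1..t-nb) for each input j]."""
    past_y = [-y[t - k] for k in range(1, na + 1)]
    past_u = [u[t - k, j] for j in range(u.shape[1]) for k in range(1, nb + 1)]
    return np.array(past_y + past_u)

def rls_arx(y, u, lam=0.98, dead_zone=0.0, p0=1e3, na=3, nb=2):
    """RLS with forgetting factor lam; updates are skipped inside the dead zone."""
    n = na + nb * u.shape[1]
    theta = np.zeros(n)                    # [a1, a2, a3, b11, b12, ..., b42]
    P = p0 * np.eye(n)
    y_hat = np.full(len(y), np.nan)        # one-step-ahead predictions
    for t in range(max(na, nb), len(y)):
        phi = regressor(y, u, t, na, nb)
        y_hat[t] = phi @ theta
        err = y[t] - y_hat[t]
        if abs(err) > dead_zone:           # freeze theta and P for small errors
            gain = P @ phi / (lam + phi @ P @ phi)
            theta = theta + gain * err
            P = (P - np.outer(gain, phi @ P)) / lam
    return theta, y_hat

# Synthetic noise-free data from a known stable ARX system (assumed values).
rng = np.random.default_rng(0)
true_theta = np.array([-0.5, 0.2, -0.1, 0.4, 0.1, -0.3, 0.2, 0.5, -0.2, 0.1, 0.3])
u = rng.standard_normal((600, 4))
y = np.zeros(600)
for t in range(3, 600):
    y[t] = regressor(y, u, t) @ true_theta
theta, y_hat = rls_arx(y, u, dead_zone=1e-8)
print(np.round(theta, 3))
```

In the monitoring scheme described here, the vector `theta` at each time step would serve as the feature vector passed to the classifier, while `y_hat` gives the one-step-ahead SVI prediction.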

In order to examine the sensitivity of the ARX coefficients to each state, the parameter values for each state are shown in Figure 8.3. In this figure, the ARX parameters take different values in each state, which means that the state of the secondary clarifier can be decided by quantitatively analyzing the ARX parameters. To confirm the difference between the parameters of each class theoretically, we display the power spectrum of the parameters in Figure 8.4.

Second, the neural network classifier has an MLP structure with two hidden layers, whose numbers of nodes are decided by the GA. To speed up training and stabilize the learning algorithm, we use a momentum term, an adaptive learning rate, normalized weight updating and batch learning. The neural network is trained using three patterns according to the state of the secondary settler. The number of ARX parameters used as input variables of the neural network is eleven. Other operating conditions, such as the occurrence of toxic events and the status of the aeration basin, can be taken as additional features to compensate for the sensitivity of the ARX parameters to variations in operating conditions. However, the simulation results showed little improvement in classification ability, so in this paper we did not use this additional information, for clarity. The input features were normalized to the range [-1, 1] in order to prevent saturation of the activation function. The corresponding target values of the output nodes were set to (0.9, -0.9, -0.9) for the normal state, (-0.9, 0.9, -0.9) for the bad state and (-0.9, -0.9, 0.9) for the bulking state. In the application of the GA to the structure of the neural network, the initial population size was 30 and the number of generations was 100. Rank-based selection was used as the selection operator, and mutation and uniform crossover were used as the search operators. The mutation rate was set to 0.01 and the crossover rate to 0.6. The GA finds the optimal number of nodes in each hidden layer quickly because the search space is small. The numbers of nodes in the first and second hidden layers are 7 and 4, respectively. In this experiment, the neural network with two hidden layers gave a better result than that with only one hidden layer, and three or more hidden layers gave no further improvement in performance.
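The GA settings quoted above (population 30, 100 generations, rank-based selection, uniform crossover at rate 0.6, mutation rate 0.01) can be put together in a short sketch. Several parts are assumptions made only for illustration: the integer encoding of the two node counts, the search range of 1 to 15 nodes, the use of elitism, and above all `surrogate_fitness`, a toy function standing in for the classification rate of a trained network (its peak is placed at (7, 4) only so that the demo has a known optimum).

```python
import random

LOW, HIGH = 1, 15            # assumed search range for nodes per hidden layer
POP, GENS = 30, 100          # population size and generations from the text
P_CROSS, P_MUT = 0.6, 0.01   # crossover and mutation rates from the text

def surrogate_fitness(chrom):
    """Toy stand-in for the classification rate of a trained network."""
    n1, n2 = chrom
    return 1.0 / (1.0 + (n1 - 7) ** 2 + (n2 - 4) ** 2)

def rank_select(ranked, rng):
    """Rank-based selection: the selection weight grows linearly with rank."""
    weights = list(range(1, len(ranked) + 1))   # worst gets 1, best gets len
    return rng.choices(ranked, weights=weights, k=1)[0]

def uniform_crossover(a, b, rng):
    """With probability P_CROSS, take each gene from either parent at random."""
    if rng.random() >= P_CROSS:
        return a
    return tuple(x if rng.random() < 0.5 else y for x, y in zip(a, b))

def mutate(chrom, rng):
    """Replace each gene by a random value with probability P_MUT."""
    return tuple(rng.randint(LOW, HIGH) if rng.random() < P_MUT else g
                 for g in chrom)

def run_ga(fitness, seed=0):
    rng = random.Random(seed)
    pop = [(rng.randint(LOW, HIGH), rng.randint(LOW, HIGH)) for _ in range(POP)]
    initial_best = max(fitness(c) for c in pop)
    for _ in range(GENS):
        ranked = sorted(pop, key=fitness)       # ascending fitness
        elite = ranked[-1]                      # elitism (an added assumption)
        children = [mutate(uniform_crossover(rank_select(ranked, rng),
                                             rank_select(ranked, rng), rng), rng)
                    for _ in range(POP - 1)]
        pop = [elite] + children
    best = max(pop, key=fitness)
    return best, fitness(best), initial_best

best, best_fit, initial_best = run_ga(surrogate_fitness)
print(best, best_fit)
```

In a real application each fitness evaluation would train the MLP with the candidate node counts and return its validation classification rate, which is why a small search space matters for run time.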

In testing mode, the maximum value of the neural network classifier outputs was used to determine the present state. The test data contain no bulking state, only the normal and bad states. Table 8.1 shows the confusion matrix obtained for the test set using the neural network classifier. This is a matrix A whose (i, j) element is the number of vectors that originate from the ith class and are assigned to the jth class. Although the output values do not completely agree with the corresponding desired outputs, they are adequate for recognizing the present state. With the trained neural network, the average classification rate was about 80.9%, even though the system was tested under a wide range of operating conditions. Because the process underwent an abrupt load variation during the latter part of the test set, the misclassification rate was higher in this period.
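The quoted classification rate can be checked directly against the counts in Table 8.1. The short calculation below uses only the Normal and Bad rows, since the test set contains no bulking samples; the variable names are illustrative.

```python
import numpy as np

# Rows: true state (Normal, Bad); columns: predicted (Normal, Bad, Bulking).
confusion = np.array([[228, 9, 6],
                      [36, 59, 17]])

correct = int(confusion[0, 0] + confusion[1, 1])    # correctly classified days
total = int(confusion.sum())                        # all test days
overall_rate = correct / total
per_class_rate = np.array([confusion[0, 0], confusion[1, 1]]) / confusion.sum(axis=1)

print(correct, total, round(100 * overall_rate, 1))
print(np.round(100 * per_class_rate, 1))
```

The overall figure is 287 correct out of 355, which agrees with the rate of about 80.9% stated in the text; the per-class rates show that most errors come from bad-state days classified as normal.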

8.4 Conclusions

Recognizing the process state of a secondary settler is very important for operating decisions. The current state can be monitored through the combined structure of an ARX model and a neural network classifier. The optimal structure was found to be a second-order ARX model combined with a neural network with three layers. The experiment showed a strong correlation between the settler states and the values of the ARX parameters, so that the parameters could be used as effective features for secondary clarifier monitoring. The training and decision making for pattern recognition were successfully performed by the neural network classifier. The proposed method is useful for predicting the SVI value of the secondary settler and classifying its current state simultaneously. The suggested method can also be used as a classifier for other processes in the wastewater treatment plant.

[Figure: block diagram. Labels: Input; ARX model; SVI prediction; Feature; GA (number of hidden nodes, fitness, evolution, reproduction); classifier outputs y1, y2, y3; maximum decision making; class si.]

Figure 8.1 Schematic diagram of the proposed hierarchy structure

[Figure: x-axis: data (0 to 350); y-axis: SVI, ml/g (0 to 140).]

Figure 8.2 One-step ahead prediction value of SVI using RLS method

[Figure: x-axis: Model Parameter (0 to 12); y-axis: Parameter Value (-3 to 2); legend: Normal, Bad, Bulking.]

Figure 8.3 Sensitivity of the ARX model parameters of each state

[Figure: three panels (a), (b), (c); x-axis: frequency(ω) (-5 to 5); y-axis: Power.]

Figure 8.4 Power spectrum in each state (a) normal (b) bad (c) bulking state

Table 8.1 Confusion matrix of the test data

                     Predicted
True         Normal     Bad     Bulking
Normal          228       9           6
Bad              36      59          17
Bulking           0

IX. Nonlinear Fuzzy PLS Modeling

9.1 Introduction

Statistical data analysis has been widely used in establishing models from

experimental or historical data. Typical problems in multivariate statistical analysis

are high dimensionality and collinearity in a sparse sample data set. The partial least

squares (PLS) modeling method is one of the most useful measures for overcoming

these problems. PLS is a multivariate statistical data analysis and regression method

which uses projection into latent variables to reduce high dimensional and strongly

correlated data to a much smaller data set that can then be interpreted.

The PLS method is used in a variety of areas where multivariate data emerge,

both in the laboratory and in the real world. Typical ‘lab-scale’ examples are

multivariate calibration, and quantitative structure-property and composition-

property relationships. ‘Real world’ examples include the monitoring of industrial

and environmental processes, geochemistry, and clinical, atmospheric, and marine

chemistry. As PLS uses a statistical data reduction and regression algorithm, it is

employed primarily in data analysis (Teppola et al., 1997, 1998; Rosen and Olsson,

1998; Wikström et al., 1998).

Although the original linear PLS (LPLS) regression method provides good

remedial measures to the problems of correlated inputs and limited observations, it

has the major limitation that only linear information can be extracted from data.

Since many practical data are inherently nonlinear, it is desirable to have a robust

method that can model any nonlinear relation. A successful step towards nonlinear

PLS modeling was the quadratic PLS (QPLS) method proposed by Wold et al.

(1989). In QPLS quadratic functions are used for the inner regression in PLS.

However, the nonlinearity of the QPLS method is very limited. To create a PLS

method of greater nonlinearity, several more generic approaches have been

developed such as spline PLS (SPLS), neural networks PLS (NNPLS), and locally

weighted regression PLS (LWR-PLS) (Wold, 1992; Qin and McAvoy, 1992;

Centner and Massart, 1998; Baffi et al., 1999). As their names suggest, SPLS uses

spline inner models and NNPLS uses neural networks inner models. LWR-PLS uses

LPLS as a regression method to build a locally weighted model for every sample. In

general, NLPLS algorithms use the criterion of minimum regression error to select

inner model parameters. However, the resulting models suffer from over-fitting or

local minima. In many cases modeling experts can easily detect these kinds of poor

modeling results by inspecting the PLS score plots. However, correcting NLPLS

models by changing model parameters is not an easy task because the relationship

between model parameters and model shape is not clear, and the models were not

developed taking into consideration the need for this kind of measure. The proposed

FPLS model remedies the shortcomings of NLPLS outlined above.

9.2 Theory

9.2.1 PLS modeling method

Basically, the PLS method is a multivariable linear regression algorithm that

can handle correlated inputs and limited data. The algorithm reduces the dimension

of the predictor variables (input matrix, X) and response variables (output matrix, Y)

by projecting them to the directions (input weight w and output weight c) that

maximize the covariance between input and output variables. This projection decomposes variables of high collinearity into one-dimensional variables (input score vector t and output score vector u). The decomposition of X and Y by score vectors is formulated as follows:

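$$\mathbf{X} = \sum_{h=1}^{m} \mathbf{t}_h \mathbf{p}_h^T + \mathbf{E} \tag{9.1}$$

$$\mathbf{Y} = \sum_{h=1}^{m} \mathbf{u}_h \mathbf{q}_h^T + \mathbf{F} \tag{9.2}$$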

where p and q are loading vectors, and E and F are residuals. This relation is known

as the PLS outer relation. The relation between score vectors th and uh is known as

the inner relation.

The original PLS algorithm was developed as a linear regression method that

uses a linear inner relation on the latent space. This LPLS algorithm has many

beneficial properties for use as a data analysis tool. For example, w and c can be

used to find the contributions of different variables to each score, and t and u can be

used to detect outliers. Moreover, the method has a well-developed statistical

foundation and results can be illustrated using biplots that enhance intuition into the

underlying system. However, LPLS is limited to modeling linear relationships, and

the real world is not limited to linear systems. Various nonlinear PLS algorithms

have been proposed to cope with the problems introduced by nonlinearity. However,

each of these approaches has shortcomings such as simplicity, lack of analytical

interpretability of regression coefficients, and so on. The FPLS algorithm proposed

here applies the TSK fuzzy model to the PLS inner regression. This method was

developed because the interpretability of the TSK fuzzy model overcomes some

handicaps of extant nonlinear PLS algorithms.

9.2.2 TSK Fuzzy Modeling

A fuzzy inference system is an effective means of creating models based on

human expertise in a specific application by a selection of fuzzy IF-THEN rules,

which form the key components of the system. Having selected the IF-THEN rules,

fuzzy set theory provides a systematic calculus to deal with information

linguistically, and it performs numerical computation by using linguistic labels

stipulated by membership functions. The fuzzy inference system therefore has the

properties of a structured knowledge representation in the form of fuzzy IF-THEN

rules. This system therefore provides a good framework for applying human

expertise in the construction of inference models.

The fuzzy inference system proposed by Takagi, Sugeno and Kang, known as

the TSK model, provides a powerful tool for modeling complex nonlinear systems

(Yen et al., 1998). Typically, a TSK model consists of IF-THEN rules of the form

$$R_i: \text{if } x_1 \text{ is } A_{i1} \text{ and } \cdots \text{ and } x_r \text{ is } A_{ir} \text{ then } y_i = b_{i0} + b_{i1} x_1 + \cdots + b_{ir} x_r, \quad i = 1, 2, \ldots, L \tag{9.3}$$

where L is the number of rules, $\mathbf{x} = [x_1\ x_2\ \cdots\ x_r]^T$ are the input variables, $y_i$ are the local output variables, $A_{ij}$ are fuzzy sets that are characterized by the membership functions $A_{ij}(x_j)$, and $\mathbf{b}_i = [b_{i0}\ b_{i1}\ \cdots\ b_{ir}]^T$ are real-valued parameters. The overall output of the model is computed by

$$y = \frac{\sum_{i=1}^{L} \tau_i y_i}{\sum_{i=1}^{L} \tau_i} = \frac{\sum_{i=1}^{L} \tau_i \left(b_{i0} + b_{i1} x_1 + \cdots + b_{ir} x_r\right)}{\sum_{i=1}^{L} \tau_i} \tag{9.4}$$

where τi is the firing strength of rule Ri, which is defined as

$$\tau_i = A_{i1}(x_1) \times A_{i2}(x_2) \times \cdots \times A_{ir}(x_r) \tag{9.5}$$

Figure 9.1 shows a schematic block diagram of the TSK fuzzy model.

In general, Gaussian-type membership functions are used to build the model.

They are defined by

$$A_{ir}(x_r) = \exp\left(-\frac{(x_r - c_{ir})^2}{2\sigma_i^2}\right), \quad i = 1, 2, \ldots, L \tag{9.6}$$

where cir is the center of the ith Gaussian membership function of the rth input

variable xr and σi is the width of the membership function.
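Equations (9.4) to (9.6) can be evaluated in a few lines. The sketch below is illustrative only: the function name `tsk_output`, the array layout and the two-rule example parameters are assumptions, not values from this work.

```python
import numpy as np

def tsk_output(x, centers, widths, coeffs):
    """First-order TSK output.

    x: (r,) input; centers: (L, r) values c_ij; widths: (L,) values sigma_i;
    coeffs: (L, r + 1) rows [b_i0, b_i1, ..., b_ir].
    """
    x = np.asarray(x, dtype=float)
    centers = np.asarray(centers, dtype=float)
    widths = np.asarray(widths, dtype=float)
    coeffs = np.asarray(coeffs, dtype=float)
    memberships = np.exp(-(x - centers) ** 2 / (2.0 * widths[:, None] ** 2))  # (9.6)
    tau = memberships.prod(axis=1)               # firing strengths, (9.5)
    local = coeffs[:, 0] + coeffs[:, 1:] @ x     # rule consequents y_i
    return float(tau @ local / tau.sum())        # weighted average, (9.4)

# Hypothetical two-rule model with a single input.
centers = [[-1.0], [1.0]]
widths = [0.5, 0.5]
coeffs = [[0.0, 1.0],     # rule 1: y = x
          [2.0, -1.0]]    # rule 2: y = 2 - x
y_mid = tsk_output([0.0], centers, widths, coeffs)
y_left = tsk_output([-1.0], centers, widths, coeffs)
print(y_mid, y_left)
```

At x = 0 both rules fire equally, so the output is the average of the two local models; near x = -1 the first rule dominates and the output follows its local line.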

The TSK model presented above is sometimes called a first-order TSK model,

because it formulates its rules using a first-order polynomial. In general, any

function can be used for the fuzzy rules as long as it can appropriately describe the

output of the model within the fuzzy region specified by the antecedent of the rule.

For example, when the function is a constant it is called a zero-order TSK model.

Moreover, the zero-order TSK model is functionally equivalent to a radial basis

function network. In the present research, we refer only to the first-order TSK model

as the TSK model to avoid complication.
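The equivalence with a radial basis function network can be seen from equations (9.4) to (9.6). The following short derivation is a sketch that assumes, as in (9.6), that each rule uses a single width $\sigma_i$ for all inputs; $\mathbf{c}_i = [c_{i1}\ \cdots\ c_{ir}]^T$ is notation introduced here for the vector of centers of rule i.

$$
\tau_i = \prod_{j=1}^{r} \exp\left(-\frac{(x_j - c_{ij})^2}{2\sigma_i^2}\right)
       = \exp\left(-\frac{\lVert \mathbf{x} - \mathbf{c}_i \rVert^2}{2\sigma_i^2}\right),
\qquad
y = \frac{\sum_{i=1}^{L} \tau_i\, b_{i0}}{\sum_{i=1}^{L} \tau_i}
$$

With constant consequents $y_i = b_{i0}$, the output is a normalized sum of Gaussian basis functions with centers $\mathbf{c}_i$, widths $\sigma_i$ and output weights $b_{i0}$.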

The great advantage of the TSK fuzzy model is its representative power, which

stems from its ability to describe complex nonlinear systems using a small number

of rules. Moreover, the output of the model has an explicit functional form (equation

9.4), and the individual rules give insights into the local behavior of the model. The

good interpretability of the fuzzy system may match the utility of the PLS method in

intuitive data analysis.

9.2.3 Nonlinear FPLS Modeling

Since many practical data are inherently nonlinear, there is a need for a

nonlinear PLS modeling approach which can not only represent any nonlinear

relationship but also attain the robust regression property of the LPLS method. We

propose the FPLS method as such a nonlinear modeling method. The FPLS method

is basically a combination of the PLS method and the TSK fuzzy model. The PLS

outer projection is used as a dimension reduction tool to remove collinearity, and the

TSK fuzzy inner model is used to capture the nonlinearity in the projected latent

space. An advantage of using the TSK fuzzy model as the inner regressor is its interpretability, which facilitates the design of the FPLS model structure by allowing human experts to participate in the design process.

The FPLS method differs from the direct TSK fuzzy modeling approach in that

the data are not used directly to train the TSK model, but are preprocessed by the

PLS outer transform. This transformation decomposes the multivariate regression

problem into a few univariate regression problems and simplifies the TSK model.

The TSK method is a type of kernel regression method, where the input variables are

transformed nonlinearly to feature space variables and the transformed data set is

regressed linearly. Well-designed nonlinear transformation procedures usually

reduce the collinearity problem. In the kernel regression method, the method of

nonlinear transform is related directly to the regression performance. However,

designing an optimal nonlinear transformation for high dimensional and collinear

data set is very difficult, and the resulting models often suffer from over-fitting or

local minima. However, the robust data reduction characteristic of the PLS method

can compensate for this problem in the TSK fuzzy modeling method.

In the following subsections we propose the FPLS and IFPLS algorithms. The

basic FPLS algorithm keeps the weight vectors the same as for LPLS, whereas

IFPLS is an extended version of the FPLS algorithm that iteratively updates its

weight vectors according to inner relation functions. The weight updating algorithm

used in the IFPLS algorithm was developed because a complete PLS algorithm

should have an algorithm that updates the weights according to the inner relation.

However, the automatic updating of the weights to meet a certain object function

diminishes the knowledge-based modeling aspect of FPLS because it automatically

changes the score plots. When such a feature is used, the model that experts judged

to have an appropriate structure for the previous score plot may not be good for the

present plot. Therefore, we do not include the weight update scheme in the modeling

procedure of FPLS. However, as IFPLS is an advanced PLS method that follows the

main stream of NLPLS convention, we present it as an extended FPLS algorithm.

FPLS algorithm

Figure 9.2 shows a schematic of the basic FPLS method, which uses the PLS

outer transform to generate score variables from the data. Score vectors (th and uh) of

the same factor h are used to train the inner TSK fuzzy model fh(⋅), which obeys the

following relation

$$\mathbf{u}_h = f_h(\mathbf{t}_h) + \mathbf{e}_h \tag{9.7}$$

where eh represents the regression error. The parameters of fh(⋅) should be selected to

minimize eh without over-fitting. To summarize, by not updating the outer relation

FPLS keeps the LPLS property that variables are projected into the directions

maximizing the covariance, and it captures nonlinearity through the large modeling

capacity of the TSK model.

The proposed FPLS algorithm can be formulated as follows.

1. Scale X and Y to have zero-mean and unit-variance.

Let E0 = X, F0 = Y and h = 1.

2. For each factor h, take uh from one of the columns of Fh-1.

3. PLS outer transform:

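$$\mathbf{w}_h^T = \mathbf{u}_h^T \mathbf{E}_{h-1} / (\mathbf{u}_h^T \mathbf{u}_h) \tag{9.8}$$

$$\mathbf{w}_h = \mathbf{w}_h / \lVert \mathbf{w}_h \rVert \tag{9.9}$$

$$\mathbf{t}_h = \mathbf{E}_{h-1} \mathbf{w}_h \tag{9.10}$$

$$\mathbf{c}_h^T = \mathbf{t}_h^T \mathbf{F}_{h-1} / (\mathbf{t}_h^T \mathbf{t}_h) \tag{9.11}$$

$$\mathbf{c}_h = \mathbf{c}_h / \lVert \mathbf{c}_h \rVert \tag{9.12}$$

$$\mathbf{u}_h = \mathbf{F}_{h-1} \mathbf{c}_h \tag{9.13}$$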

Iterate this step until it converges. This step is the nonlinear iterative partial least squares (NIPALS) algorithm. Although a faster and more stable algorithm using eigenvectors exists (Höskuldsson, 1988), we use NIPALS to give readers a clearer picture of the PLS outer projection.

4. Find the TSK fuzzy-type inner relation function, fh(⋅), which predicts the

output score uh with the input score th. fh(⋅) has the functional form

$$f_h(t) = \sum_{i=1}^{L} G_i \left(b_{i0} + b_{i1} t\right) \tag{9.14}$$

where

$$G_i = \frac{\tau_i}{\sum_{i=1}^{L} \tau_i} \tag{9.15}$$

$$\tau_i(t) = \exp\left(-\frac{(t - c_i)^2}{2\sigma_i^2}\right), \quad i = 1, 2, \ldots, L \tag{9.16}$$

Gi is the normalized firing strength and τi is a Gaussian-type firing strength

for the ith rule. First, the number of fuzzy rules, L, should be estimated by

the model designer at an integer value that minimizes the regression error of

fh(⋅) without creating an over-fitted model. The designer may use intuition

gained from the score plot or some numerical criteria such as the sum of

squared errors (SSE) for cross validation. The designer can then decide the

other parameters, such as ci, σi and bi, using a numerical curve fitting

function to minimize the SSE.

5. Calculate the X and Y loadings

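$$\mathbf{p}_h^T = \mathbf{t}_h^T \mathbf{E}_{h-1} / (\mathbf{t}_h^T \mathbf{t}_h) \tag{9.17}$$

$$\mathbf{q}_h^T = \hat{\mathbf{u}}_h^T \mathbf{F}_{h-1} / (\hat{\mathbf{u}}_h^T \hat{\mathbf{u}}_h) \tag{9.18}$$

where $\hat{\mathbf{u}}_h = f_h(\mathbf{t}_h) = [f_h(t_h(1)), f_h(t_h(2)), \ldots, f_h(t_h(N))]^T$ for N samples.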

6. Calculate the residuals for factor h.

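$$\mathbf{E}_h = \mathbf{E}_{h-1} - \mathbf{t}_h \mathbf{p}_h^T \tag{9.19}$$

$$\mathbf{F}_h = \mathbf{F}_{h-1} - \hat{\mathbf{u}}_h \mathbf{q}_h^T \tag{9.20}$$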

7. Let h = h + 1, then return to step 2 until all m principal factors are

calculated. The number of factors m is decided by the designer. The

designer may use intuition gained from the score plot or some numerical

criteria such as SSE for cross validation.
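Steps 1 to 7 can be sketched compactly in Python. The outer part follows equations (9.8) to (9.13) and (9.17) to (9.20) directly. The inner fit is a deliberate simplification and an assumption of this example: the centers are placed at quantiles of the scores, a common width is used, and only the rule coefficients are found by linear least squares, whereas the procedure described here initializes the parameters with FCM and related rules and then optimizes all of them. Function names and the synthetic data are hypothetical.

```python
import numpy as np

def nipals_outer(E, F, tol=1e-10, max_iter=500):
    """Steps 2-3: iterate (9.8)-(9.13) until the input score t converges."""
    u = F[:, 0].copy()
    t_old = np.zeros(E.shape[0])
    for _ in range(max_iter):
        w = E.T @ u / (u @ u)              # (9.8)
        w /= np.linalg.norm(w)             # (9.9)
        t = E @ w                          # (9.10)
        c = F.T @ t / (t @ t)              # (9.11)
        c /= np.linalg.norm(c)             # (9.12)
        u = F @ c                          # (9.13)
        if np.linalg.norm(t - t_old) < tol * np.linalg.norm(t):
            break
        t_old = t
    return w, t, c, u

def fit_tsk_inner(t, u, n_rules):
    """Step 4, simplified: fixed centers and width, least-squares coefficients."""
    centers = np.quantile(t, np.linspace(0, 1, n_rules + 2)[1:-1])  # assumed
    width = (t.max() - t.min()) / n_rules                           # assumed
    def design(tt):
        tau = np.exp(-(tt[:, None] - centers) ** 2 / (2 * width ** 2))  # (9.16)
        G = tau / tau.sum(axis=1, keepdims=True)                        # (9.15)
        return np.hstack([G, G * tt[:, None]])      # columns for b_i0 and b_i1
    b, *_ = np.linalg.lstsq(design(t), u, rcond=None)
    return lambda tt: design(np.asarray(tt, dtype=float)) @ b          # (9.14)

def fpls_fit(X, Y, n_factors, n_rules=3):
    E, F = X.copy(), Y.copy()              # step 1 assumes X, Y already scaled
    W, P, Q, inner = [], [], [], []
    for _ in range(n_factors):             # step 7 loop over factors
        w, t, c, u = nipals_outer(E, F)
        f = fit_tsk_inner(t, u, n_rules)
        u_hat = f(t)
        p = E.T @ t / (t @ t)              # (9.17)
        q = F.T @ u_hat / (u_hat @ u_hat)  # (9.18)
        E = E - np.outer(t, p)             # (9.19)
        F = F - np.outer(u_hat, q)         # (9.20)
        W.append(w); P.append(p); Q.append(q); inner.append(f)
    return np.array(W).T, np.array(P).T, np.array(Q).T, inner, F

# Synthetic nonlinear example (assumed data, one output variable).
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 4))
s = X @ np.array([0.6, -0.4, 0.5, 0.3])
Y = (s + 0.5 * s ** 2 + 0.05 * rng.standard_normal(200))[:, None]
Xs = (X - X.mean(0)) / X.std(0)
Ys = (Y - Y.mean(0)) / Y.std(0)
W, P, Q, inner, F_res = fpls_fit(Xs, Ys, n_factors=2)
unexplained = np.sum(F_res ** 2) / np.sum(Ys ** 2)
print(round(float(unexplained), 3))
```

The returned `inner` functions can be plotted against the score pairs (t, u), which is the visual check that the designer is expected to make when choosing L and m.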

The parameters of fh(⋅) can be decided by various heuristics. In this research,

the initial values of ci, σi and bi are decided using the fuzzy c-means (FCM)

algorithm (Jang et al., 1997), Moody and Darken’s (M&D) rule (Moody and Darken,

1989) and the global learning procedure (Yen et al., 1998) (see the Appendix for the

mathematical formulations of these methods). A numerical nonlinear least squares curve fitting function is then applied to optimize the parameters, with minimization of the SSE as the objective function. However, if the optimized model shows

signs of over-fitting such as very steep changes in its trend, the designer can change

and fix some parameters and then optimize the other parameters to make a smoother

and more reliable model within the criteria of his or her expertise.

As is shown in the algorithm, the designer’s decisions are emphasized in the

calibration of a FPLS model. This aspect of FPLS represents an improvement over

other PLS algorithms. Generally, structural parameters such as L and m are selected

using cross validation method to avoid the problem of over-fitting. Cross validation

is mandatory for high dimensional models, because the model shape cannot be well

presented in visible form. Although the fuzzy modeling process gives particular

weight to the application of the expert’s knowledge in the modeling process, it is

also hindered by the problem of high dimensionality. Regardless of the type of

modeling, designers should check the validity of their model. The FPLS method aids

designers in model validation by providing a simple modeling interface for visual

checking, in addition to the typical cross validation method. The visual check comprises checks of the error correlation, high leverage data treatment, local minima, over-fitting and under-fitting. Checking using visualization is possible

because of the robust data reduction and the two-dimensional presentation properties

of PLS. Other PLS methods such as LPLS and NNPLS also have these properties,

but they lack the interpretability and high nonlinear regression capacity of the TSK

inner relation function. The fuzzy rules of the TSK function provide insights into the

model that allow us to make a simple linear expectation of its behavior even in the

extrapolation range and to interactively change its parameters. These capabilities

make FPLS a promising modeling and monitoring method.

Iterative FPLS (IFPLS) algorithm

NLPLS algorithms include weight update methods. The weight update methods

can be classified as follows. The first approach is not to update any weight. In this

scheme, no iterative weight update procedure is used on either the input or the

output weight. So, inner relation functions are calibrated only one time for each

latent factor. FPLS uses this method. As a result the first score and weight vectors

remain the same as that of LPLS; however, the later vectors change because the

reduction of X and Y changes depending on the nonlinear regression performance.

The advantages of this method are that the direction of the weights remains in the

direction of maximizing covariance and that the calculation time is short. The

second method is to use fixed input weights, and update output weights and inner

relation functions iteratively, where a weight update method similar to that of the

NIPALS algorithm is used (Qin and McAvoy, 1992). As a result only the first input

score and the first input weight vectors remain the same as those of LPLS. This

method goes half way toward fitting the weights to the inner relation function. On

the other hand, it avoids the problem of updating the input weights, which can be

controversial. The third approach is to update both the input and output weights

iteratively (Wold et al., 1989; Baffi et al., 1999). While the output weights are

updated using the same method as that of the second method described above, the

input weights are updated iteratively to the vectors that minimize the regression SSE

of each inner relation function that is decided at the previous iteration, where

numerical techniques are usually used to find the optimum input weights. However,

this method shows an obvious problem when applied to rank deficient data sets.

When the input dimension is larger than the number of samples, attempts to find the

w which minimizes the SSE of u = f(Xw) can yield an uncountable number of w’s

which give the same minimum SSE. Numerical techniques give one of these

solutions. This kind of weight update method does not have the robust dimension

reduction properties of PLS. One feature of this kind of model is that they capture

very large y-variance, but very small x-variance. The fourth approach involves the

simultaneous reduction of x- and y-variances along with the updating of both the

input and output weights. Wold et al. (1992) proposed a method of this kind that

uses the same output weight update method as the second and third methods outlined

above, but which updates the input weights using a correlation related algorithm.

This method has the basic PLS principle in mind, which places equal emphasis on

the approximation of X and on the correlation between X and Y. However, the

algorithm of Wold et al. (1989) is not as balanced as NIPALS.

In this research we propose a new weight update method. The idea behind this

method is to apply the same update scheme to the input and output weights. Our

approach uses the weight update method that has been used only on output weights

in the past. This is achieved by defining a backward inner relation function, g(⋅),

where a TSK model is used for the functional form of g(⋅). The core of the algorithm

is as follows.

1. Initialize PLS parameters using LPLS.

2. Update the parameters.

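$$\hat{\mathbf{u}} = f(\mathbf{t}) \tag{9.21}$$

$$\mathbf{c}^T = \hat{\mathbf{u}}^T \mathbf{Y} / (\hat{\mathbf{u}}^T \hat{\mathbf{u}}) \tag{9.22}$$

$$\mathbf{c} = \mathbf{c} / \lVert \mathbf{c} \rVert \tag{9.23}$$

$$\mathbf{u} = \mathbf{Y} \mathbf{c} \tag{9.24}$$

$$\hat{\mathbf{t}} = g(\mathbf{u}) \tag{9.25}$$

$$\mathbf{w}^T = \hat{\mathbf{t}}^T \mathbf{X} / (\hat{\mathbf{t}}^T \hat{\mathbf{t}}) \tag{9.26}$$

$$\mathbf{w} = \mathbf{w} / \lVert \mathbf{w} \rVert \tag{9.27}$$

$$\mathbf{t} = \mathbf{X} \mathbf{w} \tag{9.28}$$

where f(⋅) minimizes the SSE between $\mathbf{u}$ and $\hat{\mathbf{u}}$, and g(⋅) minimizes the SSE between $\mathbf{t}$ and $\hat{\mathbf{t}}$.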

Iterate this step until convergence is reached.

This is a nonlinear extension of the NIPALS algorithm that reduces to LPLS

when the inner relation functions are first order polynomials with no constant term.

The function g(⋅) is used only in the training process, not in the prediction procedure.

The use of g(⋅) achieves balanced reductions of X and Y. This algorithm can be

applied to other PLS algorithms with minor changes.
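The update loop of equations (9.21) to (9.28) can be sketched as below. To keep the example checkable, the inner relations f and g are fitted as first-order polynomials with no constant term, which is the case in which the scheme reduces to linear PLS; a TSK fit would be passed in through `fit_f` and `fit_g` instead. The function names, the single linear pass used for initialization (a simplification of initializing from a full LPLS model) and the synthetic data are assumptions of this sketch.

```python
import numpy as np

def fit_through_origin(a, b):
    """Least-squares fit b ~ slope * a (first-order, no constant term)."""
    slope = (a @ b) / (a @ a)
    return lambda z: slope * z

def ifpls_weights(X, Y, fit_f=fit_through_origin, fit_g=fit_through_origin,
                  max_iter=500, tol=1e-12):
    # Step 1 (simplified): one linear pass, taking u from the first column of Y.
    u = Y[:, 0].copy()
    w = X.T @ u
    w /= np.linalg.norm(w)
    t = X @ w
    for _ in range(max_iter):                    # step 2, repeated
        u_hat = fit_f(t, u)(t)                   # (9.21)
        c = Y.T @ u_hat / (u_hat @ u_hat)        # (9.22)
        c /= np.linalg.norm(c)                   # (9.23)
        u = Y @ c                                # (9.24)
        t_hat = fit_g(u, t)(u)                   # (9.25)
        w_new = X.T @ t_hat / (t_hat @ t_hat)    # (9.26)
        w_new /= np.linalg.norm(w_new)           # (9.27)
        t = X @ w_new                            # (9.28)
        converged = np.linalg.norm(w_new - w) < tol
        w = w_new
        if converged:
            break
    return w, c, t, u

# Synthetic centered data with two outputs (assumed values).
rng = np.random.default_rng(3)
X = rng.standard_normal((60, 5))
B = np.array([[2.0, 0.0], [0.0, 0.5], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]])
Y = X @ B + 0.1 * rng.standard_normal((60, 2))
X -= X.mean(0)
Y -= Y.mean(0)
w, c, t, u = ifpls_weights(X, Y)

# Linear PLS weight: first left singular vector of X^T Y (defined up to sign).
v1 = np.linalg.svd(X.T @ Y)[0][:, 0]
alignment = abs(w @ v1)
print(alignment)
```

With linear inner relations the iteration is a power iteration on $\mathbf{X}^T\mathbf{Y}\mathbf{Y}^T\mathbf{X}$, so the converged weight coincides with the linear PLS weight, which is what the alignment value checks.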

IFPLS is an extended version of FPLS that uses the weight update algorithm

given in equations 9.21-9.28. In the IFPLS modeling, the number of fuzzy rules can

be decided using the method employed for FPLS. However, decisions based on

intuition lose meaning because the score plots change throughout the iteration

process. Therefore, the use of the cross validation method is recommended for

IFPLS.

Prediction method with FPLS model

FPLS and IFPLS models trained on a calibration data set are both identified by

scaling information, outer projection vectors and inner relation parameters, i.e., the

means and variances of the calibration data sets X0 and Y0, loading vectors p and q,

input weight vector w, the number of fuzzy rules L, the center of the membership

function c = {c1, c2, ⋅⋅⋅, cL}, the width of the membership function σ = {σ1, σ2, ⋅⋅⋅,

σL} and the linear regression coefficient b = {b1, b2, ⋅⋅⋅, bL} of fuzzy rules for all the

factors m under consideration. Let us denote the outer projection vectors of the m

factors in matrix form, i.e., P, Q and W. Then, for a new input data set X, the output

data set Y can be predicted using the following steps.

1. Scale X by the mean and variance of X0.

2. Calculate the input score matrix

$$\mathbf{T} = \mathbf{X}\mathbf{W}(\mathbf{P}^T\mathbf{W})^{-1} \tag{9.29}$$

where T = [t1, t2, ⋅⋅⋅, tm].

3. Predict the output score vectors using the TSK inner model defined in equation (9.14), with ch, σh and bh for each factor h.

$$\hat{\mathbf{u}}_h = f_h(\mathbf{t}_h) \tag{9.30}$$

4. Predict the scaled Y

$$\hat{\mathbf{Y}} = \hat{\mathbf{U}}\mathbf{Q}^T \tag{9.31}$$

where Û = [û1, û2, ⋅⋅⋅, ûm].

5. Rescale Y by the mean and variance of Y0.
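The five prediction steps can be sketched as follows. The sketch rests on several assumptions that are not stated in this section: the TSK inner model is taken to be a normalized sum of Gaussian firing strengths times first-order rules b_i0 + b_i1 t, scaling uses the standard deviation (autoscaling), and the dictionary layout of `model` and the function names are hypothetical.

```python
import numpy as np

def tsk_predict(t, centers, widths, coefs):
    """Assumed TSK inner model: u_hat = sum_i G_i(t) * (b_i0 + b_i1 * t)."""
    t = np.asarray(t, dtype=float)[:, None]              # shape (n, 1)
    tau = np.exp(-0.5 * ((t - centers) / widths) ** 2)   # firing strengths, (n, L)
    G = tau / tau.sum(axis=1, keepdims=True)             # normalized firing strengths
    local = coefs[:, 0] + coefs[:, 1] * t                # rule outputs, (n, L)
    return (G * local).sum(axis=1)

def fpls_predict(X_new, model):
    """Steps 1-5: scale, project (9.29), inner model (9.30), outer model (9.31), rescale."""
    Xs = (X_new - model["x_mean"]) / model["x_std"]      # step 1
    W, P, Q = model["W"], model["P"], model["Q"]
    T = Xs @ W @ np.linalg.inv(P.T @ W)                  # step 2, (9.29)
    U_hat = np.column_stack([
        tsk_predict(T[:, h], *model["inner"][h])         # step 3, (9.30)
        for h in range(T.shape[1])
    ])
    Ys = U_hat @ Q.T                                     # step 4, (9.31)
    return Ys * model["y_std"] + model["y_mean"]         # step 5
```

Here `model["inner"][h]` is a tuple `(centers, widths, coefs)` for factor h, with `coefs` of shape (L, 2). With one rule per factor and zero intercept, the prediction collapses to an ordinary linear PLS prediction.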

Using the PLS outer relation and the TSK fuzzy-type inner model, the FPLS method can robustly describe complex nonlinear systems and provides informative biplots. Because FPLS uses the outer relation of PLS, the analytical meaning of the outer projection vectors remains valid. Hence, various PLS monitoring methods are still applicable to FPLS. Moreover, the interpretation based on fuzzy rules gives a new way of monitoring nonlinear systems. For example, each sample of a system modeled by FPLS can be classified according to the fuzzy rule that has the largest firing strength for that sample.
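As a small illustration of this rule-based classification, the sketch below assigns each score value to the rule with the largest firing strength. The Gaussian form of the firing strength and the function name `classify_by_rule` are assumptions made for the example.

```python
import numpy as np

def classify_by_rule(t, centers, widths):
    """Return, for each score t_j, the index of the rule with the largest firing strength."""
    t = np.asarray(t, dtype=float)[:, None]
    centers = np.asarray(centers, dtype=float)
    widths = np.asarray(widths, dtype=float)
    tau = np.exp(-0.5 * ((t - centers) / widths) ** 2)   # assumed Gaussian firing strength
    return tau.argmax(axis=1)
```

Plotting the samples colored by this index on the score plot would show which operating region each fuzzy rule covers.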

9.3 Results and Discussion

The proposed FPLS algorithm was applied to two data sets: first, a simulated data set from the benchmark plant, and then real data from the BET plant. The TSK fuzzy models were built using a nonlinear least squares optimization function (Bang et al., 2001), whose initial point was determined by the FCM clustering algorithm for identifying the center locations, the P-nearest neighborhood method for deciding the widths, and the global learning procedure for determining the parameters of the fuzzy rules (see the appendix). The prediction performance of FPLS is compared with that of LPLS and QPLS.

Simulation benchmark

The eight variables used to build the X-block in the simulation benchmark were the influent ammonia concentration (SNH,in), influent flow rate (Qin), nitrate concentration in the second aerator (SNO,2), total suspended solids in aerator 4 (TSS4), DO concentrations in aerators 3 and 4 (SO,3, SO,4), oxygen transfer coefficient in aerator 5 (KLa5), and internal recirculation rate (Qint). The quality variables were the effluent ammonia and nitrate concentrations, SNH,e and SNO,e. We used the 14-day normal data set provided by the benchmark: the model was trained on one week of normal dry-weather operation and validated on the data from the last 7 days. Because this research is concerned with normal operating conditions, the training model was based on normal operation only and, to avoid complication, no time lags between the inputs and outputs were considered.

The results of the three PLS models are presented in Table 9.1, where four LVs were selected for each PLS model. Figure 9.3 shows the score scatter plots and firing strengths of the FPLS model. In each score plot, the small circle represents the center ci of a firing strength function shown in the lower plot, and the dashed line crossing the circle is its fuzzy rule. In the lower plot, the solid lines represent the firing strength τi and the dashed lines represent the normalized firing strength Gi. These plots clearly show the nonlinear nature of the benchmark plant. LPLS gives no direct way to cope with this nonlinearity, whereas FPLS gives a direct and interactive way of treating it. To decide the number of fuzzy rules, we tried various numbers of fuzzy rules and heuristic rules for each LV. We found that a ‘2-2-1-1’ configuration of fuzzy rules across the four LVs, with the center of the fuzzy rules of the first LV fixed by FCM, gave the best regression performance on the training and validation data sets. The score plots of the third and fourth LVs showed almost no nonlinearity; hence, we used only one fuzzy rule for each of these LVs. Compared with other NLPLS methods, the FPLS model offers a visual and interactive design capability that can treat such nonlinearities and avoid overfitting.

The percent variance captured (%) on the training data and the mean squared error (MSE) on the test data set of the benchmark are listed in Table 9.1 for the three PLS models. The explained variances of the X-block for the LPLS, QPLS and FPLS models show no particular difference, whereas the Y-variance captured by the FPLS model (80.00%) is larger than that of LPLS (72.25%) and QPLS (78.66%). The MSE on the validation data set shows that the best prediction performance is achieved by the FPLS method (0.44, against 0.60 for LPLS and 0.46 for QPLS). Figures 9.4 and 9.5 show the prediction results of SNH,e and SNO,e in the validation data set for the LPLS and FPLS methods. The time series plots and scatter plots illustrate the prediction improvements that are achievable through the fuzzy regression approach, and the scatter plots confirm the modeling capability of FPLS.

These results are not surprising, because the FPLS model is designed to capture the main variability of the training data set and the validation data set was generated with statistical properties similar to those of the training data. However, the above results are valid only for the normal data set. In other situations, such as other disturbance cases, other models may perform better than the FPLS model. The best model structure depends on the situation and the aim of the model.

Full-scale WWTP

The process data were collected from a biological WWTP (BET) that treats the coke wastewater of an iron and steel making plant in Korea (Figure 2.3). Twelve process and manipulated variables (the X block) were used to model three process output variables (the Y block). The Y block consists of the sludge volume index (SVI), the reduction of cyanide (∆CN), and the reduction of COD (∆COD). Table 2.1 describes the process variables and presents the mean and standard deviation (SD) values of the X and Y blocks. The process data consisted of daily mean values from 1 January 1998 to 9 November 2000, with a total of 1034 observations. Of these, 720 observations were used for the calibration of the PLS models, where the odd-numbered samples were used as a training set and the even-numbered samples as a validation set. The remaining 314 observations were used as a test data set.
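The data partition described above can be written down as an index split. The sketch assumes that the 720 calibration observations are the first 720 in chronological order and that samples are numbered from 1; the function name `split_bet` is hypothetical.

```python
import numpy as np

def split_bet(n_total=1034, n_calibration=720):
    """Return 1-based sample numbers for the training, validation and test sets."""
    idx = np.arange(1, n_total + 1)
    calib = idx[:n_calibration]        # assumed: first 720 observations
    train = calib[calib % 2 == 1]      # odd-numbered samples
    valid = calib[calib % 2 == 0]      # even-numbered samples
    test = idx[n_calibration:]         # remaining observations
    return train, valid, test
```

Interleaving the training and validation samples keeps both sets spread over the same operating period, while the test set covers a later period.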

The results of the three PLS models are presented in Table 9.2, where six LVs were selected for each PLS model. Figure 9.6 shows the score scatter plots and firing strengths of the FPLS model with six LVs (the fifth and sixth LVs are not shown). Contrary to expectation, the data from BET showed no obvious nonlinearity. However, we did find some nonlinear characteristics at the second LV, which led us to use three fuzzy rules for this factor. The other factors showed almost no nonlinearity; hence, one fuzzy rule was used for each of these LVs. To avoid complication, we did not consider the nonlinearity of these factors further.

The X- and Y-variances captured by the FPLS model are larger than those of the LPLS and QPLS methods, and the MSE on the validation data is also lowest for FPLS (Table 9.2). Contrary to our expectation, the MSE on the test data set indicates that LPLS and QPLS have better prediction performance than FPLS. During the test period, the WWTP received a large influent load and experienced a large change in operating conditions. These process transitions altered the types of microorganisms and the sludge, which changed the process dynamics in BET. Because the FPLS model is designed to capture the nonlinear behavior and statistical properties of the training data set, it showed inferior prediction results in these disturbance cases. Figures 9.7 and 9.8 show the time series and scatter plots of the actual and predicted values with the LPLS and FPLS models during the validation period. The prediction performance for the COD and CN reductions is satisfactory, but the prediction of the SVI of the secondary settler is not as good as that of the other process quality variables. LPLS and FPLS show similar prediction performance.

After performing several experiments comparing FPLS with the other NLPLS models, we concluded that FPLS shows regression performance similar to that of the other NLPLS models; however, it is difficult to make a fair comparison between models, because each algorithm has its own characteristics. For this reason we do not present a detailed comparison between models, but below we outline the differences between FPLS and the other NLPLS methods in two respects.

First, inner relation models of FPLS usually take on gentler curvature than

those of other NLPLS, as they are locally weighted averages of linear fuzzy rules

and model designers would not favor highly nonlinear shapes of inner relation

models whose variables are the results of linear computations. In contrast, other

NLPLS models can take on any nonlinear shape to minimize the SSE, providing this

shape is permitted by cross validation. If a FPLS model were built referring only to

the cross-validation result, with no input from the experts, it could have greater

curvature. Hence, it ultimately depends on the experts’ decision whether to use a

conservative model or an SSE-minimizing model.

Second, the number of regression parameters estimated for each NLPLS

inner model depends on a few structural parameters, such as the order of a

polynomial for QPLS, the order of polynomials and the number of knots for SPLS,

the number of neurons for NNPLS and the number of rules for FPLS. They also vary

depending on the nonlinearity of the modeled system. If the value of the structural

parameters is increased the regression SSE of the model will decrease and the model

will take on a more nonlinear shape. Because these structural parameters have

different physical meanings, their values cannot be compared with those of another

NLPLS. However, if the values are the same, FPLS generally uses more parameters

than other NLPLS methods. For example, if the values of the structural parameter

are L for both NNPLS and FPLS, an inner model of NNPLS needs 2L + 1 regression

parameters for the input and output weights of the neurons plus a bias term, whereas

that of FPLS needs 4L parameters for c, σ and b. However, this does not mean FPLS

is a more complex model to interpret. Because FPLS analyzes the system using

submodels represented by fuzzy rules, the 2L parameters used for b help in the

preparation of submodels and the 2L parameters used for c and σ help to interpret

the relationship between the input data and the submodels. Therefore, although

FPLS uses more regression parameters than other NLPLS methods for the same

structural parameter, its superiority as an informative model will rate it highly

among the elemental NLPLS methods.

9.4 Conclusions

In this research we proposed a fuzzy PLS model and presented experimental

results showing the application of this algorithm. The proposed model uses a PLS

framework, which gives the model robust regression performance when used on

high dimensional and collinear data. Moreover, as the model uses TSK fuzzy models,

it can represent highly nonlinear systems. Most importantly, the proposed model has

higher interpretability than any other NLPLS modeling method, creating a modeling

environment that is favorable to the use of experts’ knowledge. The interpretability

of the FPLS model is embodied in the following elements. First, the model can be

presented in intuitively simple biplots; second, the fuzzy rules of the TSK function

provide insight into the model; and finally, the effects of the fuzzy rules can be

estimated using plots of the firing strength. Another property that distinguishes the

FPLS model from other NLPLS models is that the TSK fuzzy model is a

combination of linear submodels. This feature causes the FPLS model to provide

more stable estimations of output on extrapolation. Using these properties, a model

designer can interactively revise the FPLS model and construct a more robust nonlinear

model with fewer instances of local minima and over-fitting.

9.5 Appendix

A1. Fuzzy c-means (FCM) algorithm

The center of a Gaussian-type membership function, ci, can be decided by using the FCM algorithm, that is

$$c_i = \frac{\sum_{j=1}^{N} \mu_{ij}^{2}\, t_j}{\sum_{j=1}^{N} \mu_{ij}^{2}}, \quad i = 1, 2, \cdots, L \tag{9.32}$$

where

$$\mu_{ij} = \frac{1}{\sum_{k=1}^{L} \left( \dfrac{t_j - c_i}{t_j - c_k} \right)^{2}} \tag{9.33}$$

is a membership grade.
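A minimal one-dimensional sketch of the FCM iteration in equations (9.32) and (9.33) is given below. The quantile-based initialization, the fixed number of iterations, the small constant `eps` that guards against division by zero, and the function name `fcm_1d` are assumptions made for the example.

```python
import numpy as np

def fcm_1d(t, L, n_iter=100, eps=1e-12):
    """Alternate the membership update (9.33) and the center update (9.32)."""
    t = np.asarray(t, dtype=float)
    # Assumed deterministic initialization: evenly spaced quantiles of the scores.
    centers = np.quantile(t, (np.arange(L) + 0.5) / L)
    for _ in range(n_iter):
        d2 = (t[None, :] - centers[:, None]) ** 2 + eps   # squared distances, (L, N)
        mu = 1.0 / (d2 * (1.0 / d2).sum(axis=0))          # (9.33)
        centers = (mu ** 2 @ t) / (mu ** 2).sum(axis=1)   # (9.32)
    return centers, mu  # mu is the membership used in the final center update
```

For each sample the membership grades over the L clusters sum to one, so a sample near a center receives a grade close to one for that cluster.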

A2. Moody & Darken’s rule

The width of a Gaussian-type membership function, σi, can be decided by using the P-nearest neighborhood heuristic suggested by Moody and Darken (1989), that is

$$\sigma_i = \left( \frac{1}{p} \sum_{l=1}^{p} (c_i - c_l)^{2} \right)^{1/2}, \quad i = 1, 2, \cdots, L \tag{9.34}$$

where cl (l = 1, 2, ⋅⋅⋅, p) are the p (typically p = 2) nearest neighbors of the center ci.
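Equation (9.34) can be sketched as below for scalar centers. The function name `moody_darken_widths` is hypothetical, and clamping p to L − 1 when fewer than p neighbors exist is an assumption that the text does not address.

```python
import numpy as np

def moody_darken_widths(centers, p=2):
    """Width of each membership function from its p nearest neighboring centers, (9.34)."""
    centers = np.asarray(centers, dtype=float)
    if centers.size < 2:
        raise ValueError("at least two centers are needed to define a neighbor")
    p_eff = min(p, centers.size - 1)            # assumed handling of small L
    d2 = (centers[:, None] - centers[None, :]) ** 2
    np.fill_diagonal(d2, np.inf)                # a center is not its own neighbor
    nearest = np.sort(d2, axis=1)[:, :p_eff]    # p smallest squared distances
    return np.sqrt(nearest.mean(axis=1))
```

For centers at 0, 1 and 3 with p = 2, the width of the first membership function is the root mean square of the distances 1 and 3.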

A3. Global learning algorithm

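The parameters, bi, of a fuzzy rule can be decided by using a global learning algorithm. Global learning chooses the parameters of the fuzzy rules that minimize the objective function JG.

$$J_G = \sum_{k=1}^{N} \left[ u(k) - \hat{u}(k) \right]^{2} \tag{9.35}$$

Equation (9.35) can be rearranged into a simple matrix form.

$$J_G = (\mathbf{u}_G - \mathbf{T}_G \mathbf{b}_G)^T (\mathbf{u}_G - \mathbf{T}_G \mathbf{b}_G) \tag{9.36}$$

where

$$\mathbf{u}_G = \mathbf{u} = [\, u(1) \;\; u(2) \;\; \cdots \;\; u(N) \,]^T \in \Re^{N \times 1}$$

$$\mathbf{T}_G = \begin{bmatrix} G_1(1) & G_1(1)t(1) & G_2(1) & G_2(1)t(1) & \cdots & G_L(1) & G_L(1)t(1) \\ G_1(2) & G_1(2)t(2) & G_2(2) & G_2(2)t(2) & \cdots & G_L(2) & G_L(2)t(2) \\ \vdots & \vdots & \vdots & \vdots & & \vdots & \vdots \\ G_1(N) & G_1(N)t(N) & G_2(N) & G_2(N)t(N) & \cdots & G_L(N) & G_L(N)t(N) \end{bmatrix} \in \Re^{N \times 2L} \tag{9.37}$$

$$\mathbf{b}_G = [\, b_{10} \;\; b_{11} \;\; b_{20} \;\; b_{21} \;\; \cdots \;\; b_{L0} \;\; b_{L1} \,]^T \in \Re^{2L \times 1} \tag{9.38}$$

Applying singular value decomposition (SVD) to TG yields

$$\mathbf{T}_G = \tilde{\mathbf{U}} \tilde{\boldsymbol{\Sigma}} \tilde{\mathbf{V}}^T \tag{9.39}$$

where

$$\tilde{\mathbf{U}} = [\, \tilde{\mathbf{u}}_1 \;\; \tilde{\mathbf{u}}_2 \;\; \cdots \;\; \tilde{\mathbf{u}}_N \,] \in \Re^{N \times N} \tag{9.40}$$

$$\tilde{\mathbf{V}} = [\, \tilde{\mathbf{v}}_1 \;\; \tilde{\mathbf{v}}_2 \;\; \cdots \;\; \tilde{\mathbf{v}}_{2L} \,] \in \Re^{2L \times 2L} \tag{9.41}$$

$$\tilde{\boldsymbol{\Sigma}} = \mathrm{diag}(\tilde{\sigma}_1, \tilde{\sigma}_2, \cdots, \tilde{\sigma}_{2L}) \in \Re^{N \times 2L} \tag{9.42}$$

with σ̃1 ≥ σ̃2 ≥ ⋅⋅⋅ ≥ σ̃2L ≥ 0. Then the minimum Euclidean norm solution of the fuzzy rule parameters, bG, is computed as

$$\mathbf{b}_G = \sum_{i=1}^{s} \frac{\tilde{\mathbf{u}}_i^T \mathbf{u}_G}{\tilde{\sigma}_i}\, \tilde{\mathbf{v}}_i \tag{9.43}$$

where s is the number of nonzero singular values in Σ̃.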
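The SVD-based solution of equation (9.43) can be sketched with NumPy as follows. The column ordering of TG (G_i followed by G_i t for each rule), the relative threshold `rtol` used to decide which singular values count as nonzero, and the function name `global_learning` are assumptions of this sketch.

```python
import numpy as np

def global_learning(t, u, G, rtol=1e-10):
    """Minimum-norm least-squares estimate of the rule parameters, (9.35)-(9.43).

    t: input scores, shape (N,); u: output scores, shape (N,);
    G: normalized firing strengths, shape (N, L).
    Returns an (L, 2) array whose row i holds (b_i0, b_i1).
    """
    N, L = G.shape
    TG = np.empty((N, 2 * L))
    TG[:, 0::2] = G                      # columns multiplying b_i0
    TG[:, 1::2] = G * t[:, None]         # columns multiplying b_i1
    U, s, Vt = np.linalg.svd(TG, full_matrices=False)
    keep = s > rtol * s[0]               # treat tiny singular values as zero
    b = Vt[keep].T @ ((U[:, keep].T @ u) / s[keep])   # (9.43)
    return b.reshape(L, 2)
```

When TG has full column rank this agrees with the ordinary least-squares solution; when it is rank deficient, dropping the zero singular values yields the solution of smallest Euclidean norm.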

[Figure 9.1: block diagram. Each rule i (i = 1, 2, ⋅⋅⋅, L) has the premise "x is Ai", the output yi = xT bi and the firing strength τi; the model output is y = Σ τi yi / Σ τi, with both sums taken over i = 1, ⋅⋅⋅, L.]

Figure 9.1 Block diagram of the TSK fuzzy model

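[Figure 9.2: block diagram. Labels recovered from the diagram: inputs X and Y; for each factor h (first factor, second factor, ⋅⋅⋅, last factor) the weights wh and ch, the scores th and uh, the predicted score ûh, the loadings phT and qhT, and the inner relation fh(·); residual blocks E0, F0, E1, F1, E2, F2, ⋅⋅⋅, Em, Fm.]

Figure 9.2 Block diagram of the FPLS method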

Figure 9.3 Scatter plots and firing strength plots of FPLS model in benchmark (a) first LV (b) second LV (c) third LV (d) fourth LV

Figure 9.4 Comparisons of LPLS and FPLS for the predicted and actual SNH,e in benchmark (a) Time series plot (b) Scatter plot

Figure 9.5 Comparisons of LPLS and FPLS for the predicted and actual SNO,e in benchmark (a) Time series plot (b) Scatter plot

Figure 9.6 Scatter plots and firing strength plots of FPLS model in BET (a) first LV (b) second LV (c) third LV (d) fourth LV

Figure 9.7 Time series plots of predicted and actual output in BET (a) SVI with LPLS and FPLS (b) ∆CN with LPLS and FPLS (c) ∆COD with LPLS and FPLS

Figure 9.8 Scatter plots of predicted and actual output in BET (a) SVI with LPLS and FPLS (b) ∆CN with LPLS and FPLS (c) ∆COD with LPLS and FPLS

Table 9.1 Percent variance captured (%) and MSE of several PLS models in benchmark

              LPLS             QPLS             FPLS
Factor      X       Y        X       Y        X       Y
1         64.49   40.60    64.49   43.03    64.49   43.68
2         88.96   60.06    88.96   67.85    88.97   71.61
3         91.40   71.04    91.28   77.12    91.45   78.51
4         96.85   72.25    97.04   78.66    97.10   80.00
MSE           0.60             0.46             0.44

Table 9.2 Percent variance captured (%) and MSE of several PLS models in BET

                            LPLS             QPLS             FPLS
Factor                    X       Y        X       Y        X       Y
1                       17.75   31.76    17.75   31.92    17.75   31.76
2                       33.32   47.85    33.32   48.29    33.32   48.79
3                       44.01   58.32    43.96   58.70    44.01   59.27
4                       52.96   60.61    52.89   60.83    53.00   61.55
5                       59.61   62.20    59.55   62.90    60.37   63.00
6                       64.64   63.43    65.10   63.90    65.57   64.21
MSE of validation data      1.13             1.12             1.11
MSE of test data            1.67             1.68             1.71

X. References

Alex, J., Beteau, J.F., Copp, J.B., Hellinga, C., Jeppsson, U., Marsili-Libelli, S., Pons,

M.N., Spanjers. H., Vanhooren, H., “Benchmark for evaluating control

strategies in wastewater treatment plants.” European Control Conference 99,

Karlsruhe, 1999.

Åström, K.J. and Hägglund, T., “Automatic Tuning of Simple Regulators with

Specifications on Phase and Amplitude Margins.” Automatica, Vol.20, 1984,

pp.645-651.

Åström, K.J., “Adaptive control.” USA, Addison Wesley, 1995.

Baffi, G., Martin, E.B. and Morris, A.J., “Non-linear projection to latent

structures revisited (the neural network PLS algorithm).” Comp. & Chem.

Eng. Vol.23, 1999, pp.1293-1307.

Bakshi, B. R., “Multiscale PCA with Application to Multivariate Statistical Process

Monitoring.” AIChE J., Vol.44, 1998, 1596-1610.

Bishop, C. M., “Neural Networks for Pattern Recognition.” Clarendon Press, 1995.

Carlsson, B., “On-line estimation of the respiration rate in an activated sludge

process.” Wat. Sci Tech., Vol.28, 1993, pp.427-434.

Carlsson, B., Lindberg, C.F., Hasselblad, S. and Xu, S., “On-line estimation of the

respiration rate and the oxygen transfer rate at Kungsängen wastewater plant in

Uppsala.” Wat. Sci. Tech., Vol. 30, No.4, 1994, pp.255-263.

Carlsson, B., Lindberg, C.F., Hasselblad, S. and Xu, S., “Estimation of the

respiration rate and the oxygen transfer function utilizing a slow DO sensor.”

Wat. Sci. Tech., Vol.33, 1996, pp.325-333.

Centner, V. and Massart, D.L., “Optimization in Locally Weighted Regression.”

Anal. Chem. Vol.70, 1998, pp.4206-4211.

Choi, S.W, Yoo, C.K and Lee, I.B., “Generic monitoring system in the biological

wastewater treatment process.” J. Chem. Eng. of Japan, 2001 (accepted).

Copp, J.B., “COST Simulation Benchmark Manual”, European Cooperation in the

field of Scientific and Technical Research, 2000.

COST-624, “The European Cooperation in the Field of Scientific and Technical

Research.” Website: http://www.ensic.u-nancy.fr/COSTWWTP.

Dieu, B., Garrett, M.T., Ahmad, Z. and Young, S., “Applications of automatic

control systems for chlorination and dechlorination processes in wastewater

treatment plants.” ISA Trans., Vol.34, 1995, pp.1-28.

Dorsey, A., and Lee, J., “Monitoring of batch processes through state-space

models.” 2001.

Eriksson, L., Hermens, J.L.M., Johansson, E., Verhaar, H.J.M. and Wold, S.,

“Multivariate analysis of aquatic toxicity data with PLS.” Aqu. Sci., Vol.57,

No.3, 1995, pp.1015-1621.

Gau, C. and Stadther M.A., “Reliable nonlinear parameter estimation using interval

analysis: error-in-variable approach.” Computer & Chemical Engineering,

Vol.24, 2000, pp.631-637.

Geladi, P. and Kowalski, B.R., “Partial Least Squares Regressions: a Tutorial.” Anal.

Chim. Acta., Vol.185, 1986, pp.1-17.

Hasselblad, S. and Xu, S., “On-line estimation of settling capacity in secondary

clarifier.” Wat. Sci. Tech, Vol.34(3-4), 1996, pp.323-330.

Haykin, S., “Neural Networks: A comprehensive foundation.” Prentice Hall

International, 1999.

Henze, M., Grady Jr, C. P. L., Gujer, W., Marais, G. and Matsuo, T., “Activated

sludge model no. 1.” Scientific and Technical Report No. 1, IAWQ, London, UK,

1987.

Holmberg, U., Olsson, G. and Andersson, B., “Simultaneous DO control and

respiration estimation.” Wat. Sci. Tech., Vol.21, 1989, pp.1185-1195.

Höskuldsson, A., “Prediction Methods in Science and Technology.” Thor Publishing,

Arnegaards Alle, Finland, 1996.

Jang, J.-S.R., Sun, C.-T. and Mizutani, E., “Neuro-Fuzzy and Soft Computing.”

Prentice Hall, 1997, pp. 425-427.

Jeppsson U., “Modelling Aspects of Wastewater Treatment Processes.” Ph. D. thesis,

Lund, Sweden, 1996.

Jeppsson U., Alex, J., Pons, M.N., Spanjers, H. and Vanrolleghem, P.A., “Status and

future trends of ICA in WWTP – A European perspective.” ICA2001, Sweden,

2001, pp.687-694.

Joanquin, S., Ion, I., Xabier, O., and Eduardo, A., “Dissolved oxygen control and

simultaneously estimation of oxygen uptake rate in activated sludge plant.”

Wat. Env. Res., Vol.70, No.3, 1998, pp.316-322.

Johnson, R.A. and Wichern, D.W., “Applied Multivariate Statistical Analysis.” 3rd

ed., Prentice Hall, Englewood Cliffs, USA, 1992.

Kano, M., Nagao, K., Ohno, H., Hasebe, S. and Hashimoto, I. “Dissimilarity of

Process Data for Statistical Process Monitoring.” International symposium on

advanced control of chemical process (ADCHEM), Pisa, Italy, 2000a, pp. 231-

236.

Kano, M., Nagao, K., Hasebe, S., Hashimoto, I., Ohno, H., Strauss, R. and Bakshi,

B., “Comparison of Statistical Process Monitoring Methods: Application to the

Eastman Challenge Problem.” Comp. & Chem. Eng., Vol.24, 2000b, pp.175-

181.

Ko, T. J. and Cho, D.W., “Adaptive Modelling of the Milling Process and

Application of a Neural Network for Tool Wear Monitoring.” Advanced

Manufacturing Technology, Vol.12, 1996, pp.5-16.

Kourti, T. and MacGregor, J.F., “Process Analysis, Monitoring and Diagnosis using

Multivariate Projection Methods.” Chem. Intelli. Lab., Vol.28, 1995, pp.3-21.

Krofta, M., Herath, B., Burgess, D. and Lampman, L., “An Attempt to understand

Dissolved Air Flotation using Multivariate Analysis.” Wat. Sci. Tech., Vol.31,

No.3-4, 1995, pp.191-201.

Ku, W., Storer, R.H. and Georgakis, C., “Disturbance Detection and Isolation by

Dynamic Principal Component Analysis.” Chem. Intelli. Lab., Vol.30, 1995,

pp.179-196.

Lambert, E., “Process control applications of long-range prediction.” Ph.D. thesis,

Dept. of Engineering Science, Oxford University, 1987.

Lee, D.S. and Park, J.M., “Neural network modeling for on-line estimation of

nutrient dynamics in a sequentially-operated batch reactor.” J. Biotechnol.,

Vol.75, 1999, pp.229-239.

Lee, D.S., “Neural network modeling of biological wastewater treatment processes.”

Ph. D. thesis, School of Environmental Engineering, POSTECH, Korea, 2000.

Lee, P.L. and Sullivan, G.R., “Generic Model Control.” Comp. & Chem. Eng.,

Vol.124, 1988, pp.573-580.

Li, W. and Qin, S. J., “Consistent dynamic PCA based on error-in-variables

subspace identification.” J. of Process Control, Vol.11, 2001, pp.661-678.

Li, W., Yue, H.H., Valle-Cervantes, S. and Qin, S.J., “Recursive PCA for Adaptive

Process Monitoring.” Journal of Process Control, Vol.10, 2000, pp.471-486.

Lin, C.-T. and Lee, C.S., “A Neuro-Fuzzy Systems.” Prentice-Hall, 1996.

Lindberg, C.F. and Carlsson, B., “Nonlinear and set-point control of the dissolved

oxygen dynamic in an activated sludge process.” Wat. Sci. Tech., Vol.34, No.3-

4, 1996, pp.135-142.

Lindberg, C.F., “Control and estimation strategies applied to the activated sludge

process.” Ph.D. thesis, Uppsala University, Dept. of Material Science, System

and Control Group, Sweden, 1997.

Ljung, L. and Söderström, T., “Theory and Practice of Recursive Identification.”

Cambridge: M.I.T. press, 1987.

Ljung, L., “System Identification.” New Jersey: PTR Prentice Hall, 1987.

Lukasse, L.J.S., “Control and identification in activated sludge process.” Ph. D.

thesis, Wageningen Agricultural University, Netherlands, 1999.

Marsili-Libelli, S., “Adaptive estimation of bioactivities in the activated sludge

process.” Proc. IEE, Part D, Vol.137, 1990, pp.349-356.

Marsili-Libelli, S. and Voggi, A., “Estimation of respirometric activities in

bioprocess.” Journal of Biotechnology, Vol.52, 1997, pp.181-192.

Marsili-Libelli, S., “Adaptive Fuzzy Monitoring and Fault Detection.” Int. J.

COMADEM, Vol.1(3), 1998, pp.31-40.

Moody, J. and Darken, C., “Fast learning in networks of locally-tuned processing

units.” J. Neural Comp., No.1, 1989, pp.281-294.

Negiz, A. and Cinar, A., “Statistical monitoring of multivariate dynamic processes

with state-space models.” AIChE, Vol.43, No.8, 1997, pp.2002-2020.

Neter, J., Kutner, M.H., Nachtsheim, C. and Wasserman, W., “Applied Linear

Statistical Models.” 4 Edition, McGraw-Hill, USA, 1996.

Nomikos, P. and MacGregor, J.F., “Monitoring of batch processes using multi-way

principal component analysis.” AIChE, Vol.40, 1994, pp.1361-1375.

Nomikos, P. and MacGregor, J.F., “Multivariate SPC charts for monitoring batch

processes.” Technometrics, Vol.37, No.1, 1995, pp.41-59.

Olsson, G. and Chapman, D., “Modeling the dynamics of clarifier behaviour in

activated sludge systems”, Wat. Sci. Tech, Vol.37(12), 1988, pp.405-412.

Olsson, G. and Newell, B., “Wastewater Treatment Systems: Modelling, diagnosis

and Control.” IWA, UK, 1999.

Pons, M.N., Spanjers, H. and Jeppsson, U., “Towards a Benchmark for evaluating

Control Strategies in Wastewater Treatment Plants by Simulation.” Escape 9,

Budapest, Hungary, 1999, pp. 403-406.

Qin, S.J. and McAvoy, T.J., “Nonlinear PLS modeling using neural networks.”

Comp. & Chem. Eng. Vol.16, 1992, pp.379-391.

Rosen, C. and Olsson, G., “Disturbance detection in wastewater treatment plants.”

Wat. Sci. Tech, Vol.37, No.12, 1998, pp.197-205.

Rosen, C., “Monitoring Wastewater Treatment System.” MS thesis, Lund, Sweden,

1998.

Rosen, C. and Lennox, J.A., “Multivariate and Multiscale Monitoring of Wastewater

Treatment Operation.” Water Research, Vol.35, No.14, 2001, pp.3402-3410.

Sagara, S., Yang, Z.J. and Wada, K., “Recursive identification algorithms for

continuous systems using an adaptive procedure.” Int. J. Control, Vol.53, 1991,

pp.391-409.

Signal, P.D. and Lee, P.L., “Generic Model Adaptive Control.” Chem. Eng. Comm.,

Vol.115, 1992, pp.35-52.

Singman, J., “Efficient Control of Wastewater Treatment Plant - a Benchmark

Study.” MSc thesis, Uppsala University, Sweden, 1999.

Söderström, T. and Stoica, P., “System Identification.” Cambridge: Prentice Hall

International, 1989.

Sotomayor, O.A.Z., Park, S. and Garcia, C., “Nitrate Concentration-based Control of

the Activated Sludge Systems.” IFAC-ADCHEM 2000, Vol.I, Pisa, Italy, 2000,

pp.213-218.

Spanjers, H., Vanrolleghem, P., Nguyen, K., Vanhooren, H. and Patry, G., “Toward

a simulation benchmark for evaluating respirometry-based control strategies.”

Wat. Sci Tech., Vol.37, No.12, 1998, pp.219-226.

Spanjers, H. and Klapwijk, A., “On-line meter for respiration rate and short-time

biochemical oxygen demand in the control of the activated sludge process.”

ICA of water and wastewater treatment and transport systems, Pergamon press,

New York, N.Y., 1990.

Steffens, M.A. and Lant, P.A., “Multivariable control of nutrients removing

activated sludge systems.” Wat. Res., Vol.33, No.12, 1999, pp.2864-2878.

Sung, S.W., O, J., Lee, J., Yi, S. and Lee, I., “Automatic Tuning of PID Controller

using Second Order Plus Time Delay Model.” J. Chem. Eng. Japan, Vol.29,

1996, pp.990-999.

Sung, S.W. and Lee, I., “Limitations and Countermeasures of PID controllers.” Ind.

Eng. Chem. Res., Vol.35, 1996, pp.2596-2610.

Sung, S.W., “Process Identification and a New PID Controller Design.” Ph. D.

thesis, School of Environmental Engineering, POSTECH, Korea, 1996.

Sung, S.W., Lee, I. and Lee, J.T., “New Process Identification Method for

Automatic Design for PID controllers.” Automatica., Vol.34, 1998a, pp.513-

520.

Sung, S.W., Lee, I. and Lee B.K., “On-line process identification and automatic

tuning method for PID controllers.” Chem. Eng. Sci., Vol.53, 1998b, pp.1847-

1859.

Sung, S.W. and Lee, I., “PID Controllers and Automatic Tuning.” A-Jin, Korea,

1999.

Takács, I., Patry, G. G. and Nolasco, D., “A dynamic model of the clarification-

thickening process.” Wat. Res., Vol.25(10), 1991, pp.1263-1271.

Teppola, P., Mujunen, S.-P. and Minkkinen, P., “Partial least square modeling of an

activated sludge plant: A case study.” Chem. Intelli. Lab., Vol.38, 1997,

pp.197-208.

Teppola, P., “Multivariate Process Monitoring of Sequential Process Data – A

Chemometric Approach.” Ph. D. Thesis, Lappeenranta University, Finland,

1999.

Teppola, P., Mujunen, S.-P. and Minkkinen, P., “Adaptive fuzzy c-means clustering

in process monitoring.” Chem. Intelli. Lab., Vol.41, 1999, pp.23-28.

Teppola, P. and Minkkinen, P., “Wavelet-PLS regression models for both

exploratory data analysis and process monitoring.” J. of Chemometrics, Vol.14,

2000, pp.383-399.

Teppola, P. and Minkkinen, P., “Wavelets for scrutinizing multivariate exploratory

models through multiresolution analysis.” J. of Chemometrics, Vol.15, 2001,

pp.1-18.

Tu, Y.X., Wernsdörfer, A., Honda, S., and Tuffs, P.S., “Estimation of conduction

velocity distribution by regularized-least-squares method.” IEEE Trans. on

Biomedical Engineering, Vol.44, 1987, pp.1102-1106.

Van Dongen, G. and Geuens, L., “Multivariate Time Series Analysis for Design and

Operation of a Biological Wastewater Treatment Plant.” Water Research,

Vol.32, 1998, pp.691-700.

Van Overschee, P. and De Moor, B., “N4SID: Subspace algorithms for the

identification of combined deterministic stochastic systems.” Automatica,

Vol.30, No.1, 1994, pp.75-93.

Van Overschee, P. and De Moor, B., “Subspace algorithms for linear system:

Theory – Implementation – Application.” 1996, Kluwer Academic Publishers,

USA.

Van Overschee, P. and De Moor, B., “Subspace algorithms for the stochastic

identification problem.” Automatica, Vol.29, No.3, 1993, pp.649-660.

Wang, C.-H., Hong, T.-P. and Tseng, S.S., “Integrating Fuzzy Knowledge by

Genetic Algorithms.” IEEE. Trans. Evolutionary Computation, Vol.2(4), 1998,

pp.138-149.

Whitfield, A. H. and Messali. N., “Integral-equation approach to system

identification.” Int. J. Control, Vol.45, 1987, pp.1431-1445.

Wikström, C., Albano, C., Eriksson, L. Fridén, H., Johansson, E., Å. Nordahl,

Rännar, S., Sandberg, M., Kettaneh-Wold, N. and Wold, S., “Multivariate

process and quality monitoring applied to an electrolysis process Part II.

Multivariate time-series analysis of lagged latent variables.” Chem. Int. Lab.

Sys., Vol.42, 1998, pp.233-240.

Wise, B.M., Ricker, N.L., Veltkamp, D.F. and Kowalski, B.R., “A theoretical

basis for the use of principal component models for monitoring multivariate

processes.” Process Control Qual., Vol.1, 1990, pp.41-51.

Wise, B.M. and Gallagher, N.B., “The Process Chemometrics Approach to Process

Monitoring and Fault Detection.” Journal of Process Control, Vol.6, 1996,

pp.329-348.

Wold, S., “Nonlinear Partial least squares modeling II. Spline inner relation.”

Chemom. Int. Lab. Syst., Vol.14, 1992, pp.71-84.

Wold, S., Wold, N. K. and Skagerberg, B., “Nonlinear PLS modeling.” Chem. Int.

Lab. Sys., Vol.7, 1989, 53-65.

Yen, J., Wang, L. and Gillespie, C.W., “Improving the interpretability of TSK Fuzzy

models by combining global learning and local learning.” IEE Trans. Sys. No.6,

1998, pp.530-537.

Ying, C.-M. and Joseph, B., “Sensor Fault Detection Using Noise Analysis.” Ind.

Eng. Chem. Res., Vol.39, 2000, pp.396-407.

Yoo, C.K., Choi, S.W. and Lee, I., “Disturbance Detection and Isolation in the

Activated Sludge Process.” Wat. Sci. Tech. (in press), 2001

Curriculum Vitae

Name : Chang Kyoo Yoo

Date of Birth : September 25, 1969

Birthplace : ChunBuk, Korea

Address : Graduate Apartment 2-1004, POSTECH, Hyoja-Dong, Pohang

City, KyungBuk, 790-784, Korea

Education :

3, 1989 ~ 2, 1993 B. S. in Chemical Engineering,

Yonsei University, Seoul, Korea

3, 1993 ~ 2, 1995 M. S. in Chemical Engineering,

Pohang University of Science and Technology,

Pohang, Korea

3, 1999 ~ 2, 2002 Ph. D. in Chemical Engineering,

Pohang University of Science and Technology,

Pohang, Korea

Career :

3, 1999 ~ 2, 2002 Pohang University of Science and Technology,

Research Assistant

5, 2001 ~ 8, 2001 Visiting scholar, School of Chemical Engineering,

Georgia Institute of Technology, 778 Atlantic Dr.,

Atlanta, GA, USA (“Supported by Brain Korea 21

project”)

1, 1995 ~ 8, 1998 Industrial Researcher, Factory Automation, Doosan

R&D center, Korea

9, 1998 ~ 2, 1999 Industrial Researcher, Automation Research Center,

Pohang University of Science and Technology, Korea

Certificate :

National Technical Qualification Certificate (1994) in the chemical engineering

division

Institute Activity :

5, 1993 ~ Now KIChE member

Scientific Publications

Research papers

1. C.K.Yoo, J.H. Cho, H.J. Kwak, S.K. Choi, H.D. Chun and I.B. Lee, "Closed-

loop identification and control for dissolved oxygen concentration in the full-

scale coke wastewater treatment plant", Wat. Sci. Tech., Vol.43, No.7, 2001,

pp.207-214

2. Chang Kyoo Yoo, Hee Jin Kwak and In-Beum Lee, “Direct identification

method of second order plus time delay model parameters”, Chemical

Engineering Research and Design, Vol.79, Part A, October 2001, pp.754-764

3. Chang Kyoo Yoo, Dong Soon Kim, Ji-Hoon Cho, Sang Wook Choi and In-Beum

Lee, “Process System Engineering in Wastewater Treatment Process”, The

Korean Journal of Chemical Engineering, Vol.18, No.4, 2001, pp.408-421

4. C.K.Yoo, S.W. Choi and I. Lee, "Disturbance Detection and Isolation in the

Activated Sludge Process", Wat. Sci. Tech., 2001 (in press)

5. Sang Wook Choi, Chang Kyoo Yoo, Kyu Hwang Lee and In-Beum Lee,

“Generic Monitoring system in the biological wastewater treatment process”,

Journal of Chemical Engineering of Japan, 2001 (in press)

6. Sang Wook Choi, Chang Kyoo Yoo, and In-Beum Lee, “SOx emission

monitoring in the power plant in a steel mill by a neural network”, Journal of

Environmental Engineering, 2001 (in revision)

7. YoonHo Bang, Chang Kyoo Yoo and In-Beum Lee, “Nonlinear PLS modeling

with fuzzy inference system”, Chemometrics and Intelligent Laboratory Systems,

2001 (in revision)

* Submitted and in revision papers

International conferences

1. Chang Kyoo Yoo, Jin Hyun Park and In-Beum Lee, "Adaptive Generic Model

Control for Automatic DO Control in the activated sludge process", The

Proceedings of The 8th APCChE Congress, pp.359-362, Seoul, Korea, 1999

2. Chang Kyoo Yoo, Jin Hyun Park and In-Beum Lee, "Auto-tuning and

simultaneous setpoint decision for the DO control in the coke wastewater

treatment plant", Proceedings of the 3rd Asian Control Conference, pp.744-749,

July 5-7, Shanghai, 2000

3. C.K. Yoo, J.H. Cho, H.J. Kwak, S.K. Choi, H.D. Chun and I.B. Lee, "Closed-loop

identification and control for dissolved oxygen concentration in the full-scale

coke wastewater treatment plant", Proceedings at 5th International IWA

Symposium Systems Analysis and Computing in Water Quality Management

WATERMATEX2000, pp.9.9-9.16, Gent, Belgium, September 18-20, 2000

4. Chang Kyoo Yoo, Sang Wook Choi, Jin Hyun Park and In Beum Lee, "Time

series analysis and neural network classification of the secondary settler in the

wastewater plant", Proceedings at 5th International IWA Symposium Systems

Analysis and Computing in Water Quality Management WATERMATEX2000,

pp.9.9-9.16, Gent, Belgium, September 18-20, 2000

5. Sang Wook Choi, Chang Kyoo Yoo and In-Beum Lee, "Real-Time Process

Monitoring Using Dissimilarity through PCA", Proceedings at the 2nd Cross

Straits Symposium on Materials, Energy and Environmental Sciences, Pusan

National University, Korea, November 2-3, 2000

6. Sang Wook Choi, Chang Kyoo Yoo, Byung-Hwan Cho and In-Beum Lee, "Air

Pollution Monitoring at a Power Plant using MLP via ARX model", Proceedings

at the first AEARU Environmental Workshop, pp.171-180, Hong Kong

University of Science and Technology, Hong Kong, January 9-10, 2001

7. C.K.Yoo, S.W. Choi and I. Lee, "Disturbance Detection and Isolation in the

Activated Sludge Process", ICA2001, pp.333-340, Lund University, Sweden,

June 3-7, 2001

8. ChangKyoo Yoo, Jin Hyun Park and In-Beum Lee, "Nonlinear model-based

control in the biological wastewater treatment system", DYCOPS-6 (6th IFAC

Symposium on Dynamics and Control of Process Systems), pp.724-728, Jeju,

Korea, June 3-6, 2001

9. Sang Wook Choi, ChangKyoo Yoo and In-Beum Lee, "Generic detection and

isolation of process events in WWTP", 4th IFAC workshop on on-line fault

detection and supervision in the chemical process industries, pp.135-139, Jeju,

Korea, June 7-8, 2001

10. ChangKyoo Yoo, SangWook Choi, YoonHo Bang and In-Beum Lee, "Soft

sensor and model-based DO control in the biological wastewater treatment

process", IFAC Workshop on Modeling and Control in Environmental Issues,

pp.307-312, Yokohama, Japan, August 22-23, 2001

11. YoonHo Bang, ChangKyoo Yoo and In-Beum Lee, "An enhanced prediction

model for the emission of air pollutants", IFAC Workshop on Modeling and

Control in Environmental Issues, pp.73-78, Yokohama, Japan, August 22-23,

2001

12. C.K. Yoo and I. Lee, "Wastewater quality modeling by hybrid GA-Fuzzy model",

ASIAN WATERQUAL 2001, pp.107-112, Fukuoka, Japan, September 12-15,

2001

13. C.K.Yoo, S.W. Choi and I. Lee, "Monitoring algorithm in the biological

wastewater treatment system", 6th world congress of chemical engineering,

p.2409, Melbourne, Australia, September 23-27, 2001

14. YoonHo Bang, ChangKyoo Yoo and In-Beum Lee, "Modeling and monitoring

with nonlinear FPLS method", Proceedings at the 3rd Cross Straits Symposium

on Materials, Energy and Environmental Sciences, POSTECH, Korea,

November 15-16, 2001

15. Yoon Ho Bang, ChangKyoo Yoo, Sang Wook Choi and In-Beum Lee,

“Nonlinear PLS Monitoring Applied to a Wastewater Treatment Process”,

Proc. of ICCAS2001 Conference, Oct. 17~20, Jeju, Korea, 2001

16. Yangdong Pan, ChangKyoo Yoo, Jay H. Lee and In-Beum Lee, “Process

Monitoring for Continuous Process with Cycling Operation”, submitted as an

extended abstract to the American Control Conference 2002 (Invited Session on

Control of Batch and Periodic Processes), 2002.

Domestic conferences

1. Chang Kyoo Yoo, Hee Jin Kwak, Su Whan Sung and In-Beum Lee, "An

Estimation Method of SOPTD Model Parameter", Proceeding of KIChE Spring

Meeting, 1999.

2. Ji-hoon Cho, Chang Kyoo Yoo, Dong-soon Kim and In-Beum Lee, "Wastewater

treatment automation system", Proceeding of KIChE Fall Meeting, 1999.

3. Sang Wook Choi, Chang Kyoo Yoo and In-Beum Lee, "SOx Emission monitoring

in the internal power plant of a steel mill by a neural network classifier",

Proceeding of KOSENV Spring Meeting, 2000.

4. Sang Wook Choi, Chang Kyoo Yoo and In-Beum Lee, "On-line process

monitoring using modified dissimilarity measure", Proceeding of KIChE Fall

Meeting, 2000.

5. Yoon Ho Bang, Chang Kyoo Yoo and In-Beum Lee, “Nonlinear PLS algorithm

applying a fuzzy model”, Proceeding of KIChE Spring Meeting, 2001.
