Monitoring and Control of Biological Wastewater Treatment Process
ChangKyoo Yoo
Department of Chemical Engineering (Process Control and Environmental Engineering Program)
Pohang University of Science and Technology
A thesis submitted to the faculty of Pohang University of Science and Technology in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Chemical Engineering (Process Control and Environmental Engineering Program)
Pohang, Korea
11. 28. 2001
Approved by
Major Advisor
DCE 19993133
Chang Kyoo Yoo (유 창규). Monitoring and Control of Biological Wastewater Treatment Process (Korean title: 생물학적 폐수처리공정의 모니터링 및 제어). Department of Chemical Engineering (Process Control and Environmental Engineering Program), 2002, 195 pages. Advisor: In-Beum Lee. Text in English.
ABSTRACT
Increasingly stringent demands are being placed on nutrient removal from
wastewater. These stricter demands increase the complexity of the treatment process
and necessitate the upgrading of treatment plants. As a consequence, there is a need
to optimize plant operation as well as to maximize plant efficiency and reliability.
Process monitoring and advanced control systems are generally considered to be an
important means of achieving stable operation under large load variations. Recent
advances in modeling and sensor technology have motivated significant research
aimed at constructing better process monitoring and advanced control systems. In
the present study we aim to design a process control and monitoring algorithm
appropriate for the wastewater treatment process (WWTP).
In Section III, entitled “Autotuning and Supervisory DO Control in Fullscale
WWTP”, we introduce the autotuning of the PID controller to the dissolved oxygen
(DO) control system in the WWTP and propose a simple supervisory control law to
determine its set point. Process identification for autotuning approximates the DO
dynamics with a high-order model using the integral transform method and reduces it
to a first-order plus time delay (FOPTD) or second-order plus time delay (SOPTD)
model for the PID controller tuning. Simultaneously, a simple supervisory control
algorithm is proposed to select a proper DO set point for the current operating
condition of the aeration basin. The key idea in this method
is that the DO set point is proportional to the respiration rate, which is an indicator
of the biologically degradable load. The full-scale experimental results showed good
identification performance and good tracking ability. As a result of the improved
control performance, fluctuations in the dissolved oxygen concentration
decreased and 15% of the electrical power was saved.
In Section IV, entitled “Generalized Damped Least Squares method”, we
propose a generalized damped least squares (GDLS) method to systematically
remove the estimation windup problem in adaptive and self-tuning control
systems. The key element of the proposed method is the addition of a penalty
on parameter variations to the objective function of the normal least squares
algorithm to prevent the singularity problem. Mathematical analysis shows that the
proposed method has almost equivalent properties to the normal least squares
method and guarantees that no estimation problem will be encountered in poorly
excited situations. The proposed method was applied to estimate the parameters of a
first-order system under closed-loop control and to estimate the respiration rate (R)
and oxygen transfer rate (KLa) of the DO control system in the WWTP, which were
used to derive an adaptive model-based DO control law. Simulation results show
that the GDLS algorithm gives excellent estimation performance under closed-loop
control and can be used in adaptive model-based DO control in the WWTP.
In Section V, entitled “Disturbance Detection and Isolation in WWTP”, we
propose a new fault detection and isolation (FDI) method. The proposed method
monitors the distribution of process data and detects changes in this distribution,
which reflect changes in the corresponding operating condition. A modified
dissimilarity index and an FDI technique are defined to quantitatively evaluate the
difference between successive data sets. This technique considers the importance of
each transformed variable in the multivariate system. The proposed FDI technique is
applied to a benchmark simulation and to data from a real WWTP. In addition, we
investigate the kinds of disturbances and the various scenarios that frequently occur in the
WWTP. Simulation results show that the proposed method could immediately detect
disturbances and automatically distinguish between serious and minor anomalies for
faults of various scales, which facilitates the interpretation of the disturbance scales. In
particular, the simulations confirmed that the proposed method is effective for
adaptive and nonstationary processes, such as the WWTP.
In Section VI, entitled “Modeling and Multiresolution Analysis in WWTP”,
modeling and multiresolution analysis (MRA) are described for the full-scale
WWTP. The proposed method is based on modeling by partial least squares
(PLS) regression and multiscale monitoring by application of a generic
dissimilarity measure (GDM) to PLS score values. PLS score values are approximately
normally distributed as a consequence of the central limit theorem, regardless of the
distribution of the original variables; hence, the proposed monitoring method is
suitable for non-stationary and non-normal data sets. Experimental results show that
the PLS method gives good modeling performance and is a powerful tool for
analyzing the full-scale WWTP. MRA also confirmed the detection and isolation
capability of the proposed method. In particular, the MRA indicated that the
proposed strategy was appropriate for the detection and isolation of various faults
and events in biological treatment; that is, the proposed method could cope with
multiscale process changes in non-stationary signals with non-normal characteristics.
In Section VII, entitled “Process Monitoring for a Continuous Process with
Cyclic Operation”, we propose a method for monitoring a continuous process with
diurnal cyclic characteristics in the domestic WWTP. A subspace identification
method is used to extract “within-cycle” and “between-cycle” correlation
information from historical data in the form of a state-space model. This method is
designed to describe the variations from the mean behavior of a periodically
time-varying state-space model. The method can also incorporate the concept of
inferential sensing to predict the quality variables and to enhance process monitoring.
In the simulation results, the proposed method could detect small mean shifts and
abnormalities in the slowly decreasing nitrification rate that were difficult to detect
using the conventional PCA method.
In Section VIII, entitled “Simultaneous Prediction and Classification in the
Secondary Settling Tank”, we propose a method for predicting the sludge volume index
(SVI) and simultaneously classifying the current state of a secondary settler.
An adaptive modeling scheme is implemented with the recursive least squares (RLS) method
to update the model parameters, and a neural network (NN) is
used as a process classifier. The basic idea is that the RLS model parameters are good
features for classifying the current state of a secondary settler; thus, the state of the
secondary clarifier can be detected by monitoring the variations of the RLS parameters during the SVI
prediction. Experiments and theoretical analysis show that the RLS method can
predict the SVI of a secondary settler well and that the parameters of the RLS method are good
features for monitoring the state of the secondary settler, which could be verified through
power spectrum analysis.
In Section IX, entitled “Nonlinear Fuzzy PLS Modeling”, we propose a new
nonlinear partial least squares (NLPLS) algorithm that embeds the Takagi-Sugeno-
Kang (TSK) fuzzy model into the regression framework of the partial least squares
(PLS) method. The proposed method applies the TSK fuzzy model to the PLS inner
regression. Using this approach, the interpretability of the TSK fuzzy model
overcomes some of the handicaps of previous NLPLS algorithms. The proposed
method uses the PLS method to solve the problems of high dimensionality and
collinearity, while the TSK fuzzy model is used to capture the nonlinearity and to
increase the use of experts’ knowledge. As a result, the fuzzy PLS (FPLS) model gives a more
favorable modeling environment in which the knowledge of experts can be easily
applied. Simulation results showed good modeling performance of the FPLS model
in a simulation benchmark and a full-scale WWTP.
Contents
I. Introduction
1.1 Research Motivation
1.2 Research Objectives
II. Biological Wastewater Treatment Process
2.1 Activated Sludge Process
2.2 Simulation Benchmark
2.3 Fullscale WWTP
III. Autotuning and Supervisory DO Control in Fullscale WWTP
3.1 Introduction
3.2 Method
3.2.1 Autotuning Method
3.2.2 Supervisory Control
3.3 Experimental Results
3.4 Conclusions
IV. Generalized Damped Least Squares Method
4.1 Introduction
4.2 Theory
4.2.1 Generalized Damped Least Squares Algorithm
4.2.2 Theoretical Analysis
4.2.3 Soft Sensor of Oxygen Transfer Rate and Respiration Rate
4.2.4 Adaptive Model-based DO Control
4.3 Simulation Study
4.4 Conclusions
V. Disturbance Detection and Isolation in WWTP
5.1 Introduction
5.2 Theory
5.2.1 Modified Dissimilarity Measure
5.2.2 Fault Detection and Isolation (FDI)
5.3 Simulation Studies
5.3.1 Simulation of Benchmark Plant
5.3.2 Fullscale WWTP
5.4 Conclusions
VI. Modeling and Multiresolution Analysis in WWTP
6.1 Introduction
6.2 Theory
6.2.1 Partial Least Squares (PLS)
6.2.2 Generic Dissimilarity Measure (GDM)
6.2.3 Multiresolution Analysis
6.3 Result and Discussion
6.4 Conclusions
VII. Process Monitoring for Continuous Process with Cyclic Operation
7.1 Introduction
7.2 Theory
7.3 Simulation Study
7.4 Conclusions
VIII. Simultaneous Prediction and Classification in the Secondary Settling Tank
8.1 Introduction
8.2 Theory
8.3 Simulation Study
8.4 Conclusions
IX. Nonlinear Fuzzy PLS Modeling
9.1 Introduction
9.2 Theory
9.2.1 PLS Modeling Method
9.2.2 TSK Fuzzy Modeling
9.2.3 Nonlinear FPLS Modeling
9.3 Result and Discussion
9.4 Conclusions
9.5 Appendix
Summary in Korean
X. References
List of Figures
Figure 2.1 A basic activated sludge process with an aerated tank and a settler
Figure 2.2 A simplified process scheme of an activated sludge process using pre-denitrification (layout of simulation benchmark)
Figure 2.3 Plant layout of coke WWTP, Korea
Figure 3.1 Identification and autotuning procedure for PID controller
Figure 3.2 Supervisory DO control scheme
Figure 3.3 Schematic diagram of full-scale WWTP
Figure 3.4 Experimental result during identification phase
Figure 3.5 Bode plots of the identified FOPTD and SOPTD models
Figure 3.6 The validation test: real data and model prediction value
Figure 3.7 Estimated respiration rate during identification phase
Figure 3.8 DO control result using autotuning and supervisory control algorithm
Figure 4.1 Robust estimator of $K_La(u(t))$ and $R(t)$ with a GDLS method
Figure 4.2 Adaptive model-based DO control strategy with GDLS algorithm
Figure 4.3 Flow chart of soft sensor and the model-based DO control algorithm
Figure 4.4 Process output and input during the simulation (a) process output (b) control input
Figure 4.5 Parameter estimates $\hat{a}(t)$ and $\hat{b}(t)$ of RLS with exponential data weighting
Figure 4.6 Trace of P with RLS estimation method
Figure 4.7 Parameter estimates $\hat{a}(t)$ and $\hat{b}(t)$ of RLS with constant trace
Figure 4.8 Parameter estimates $\hat{a}(t)$ and $\hat{b}(t)$ with GDLS
Figure 4.9 Estimation comparisons of RLS and GDLS method under PID control
Figure 4.10 Model-based DO control result with a GDLS method (a) process output, (b) control input
Figure 5.1 Moving windows between two successive datasets
Figure 5.2 Measured variables during the storm weeks
Figure 5.3 Monitoring performances under an external disturbance (a) dissimilarity index, (b-d) individual eigenvalue plots
Figure 5.4 Monitoring performances under internal disturbances caused by decreasing nitrification (left plot) and settler bulking (right plot) (a) dissimilarity index, (b-d) individual eigenvalue plots
Figure 5.5 Monitoring performances of sensor faults (left plot) and setpoint change (right plot) (a) dissimilarity index, (b-d) individual eigenvalue plots
Figure 5.6 PCA monitoring performances (a) Hotelling’s $T^2$ chart, (b) SPE plot
Figure 5.7 FDI monitoring performances (a) dissimilarity index, (b-f) the 1st to 5th eigenvalues
Figure 6.1 Multiresolution analysis for PLS monitoring
Figure 6.2 Normal probability plot and histogram of original data and PCA score values (a) normal probability plot of original data (b) probability plot of score values (c) histogram of original data (d) histogram of score values
Figure 6.3 Prediction results of PLS model with real Y value (solid line with squares) and predicted value (dotted line) (a) SVI (b) reduction of CN (c) reduction of COD (d) residual error of Y variables ($SPE_Y$)
Figure 6.4 The second PLS weight vector plotted against the first for PLS model
Figure 6.5 Variable influence on projection (VIP) for the predictor variables
Figure 6.6 Monitoring performances based on $T^2$ and $SPE_X$ statistics with 95% confidence limits
Figure 6.7 Monitoring performances of MRA for the PLS score values with 95% confidence limits (a) GDM (b) EV1 (c) EV2 (d) EV3
Figure 6.8 Contribution plot of the PLS score value for the first event
Figure 7.1 Measured variables of the first 10 days of normal data set
Figure 7.2 Conventional PCA monitoring result during the first 10 days of normal data set
Figure 7.3 Conventional PCA monitoring result for nitrification linear decrease: $T^2$ and SPE plot
Figure 7.4 PCA monitoring result with periodic removal for nitrification linear decrease: $T^2$ and SPE plot
Figure 7.5 Monitoring result of the proposed method for nitrification linear decrease
Figure 7.6 Conventional PCA monitoring result for nitrification step decrease: $T^2$ and SPE plot
Figure 7.7 PCA monitoring result with periodic removal for nitrification step decrease: $T^2$ and SPE plot
Figure 7.8 Monitoring result of the proposed method for nitrification step decrease
Figure 7.9 Prediction results of $S_{NH,e}$ and $S_{NO,e}$ for validation data set with static PLS method
Figure 7.10 Prediction results of $S_{NH,e}$ and $S_{NO,e}$ for validation data set with static PLS method (after periodic removal)
Figure 7.11 Prediction results of $S_{NH,e}$ and $S_{NO,e}$ for validation data set with the proposed method
Figure 8.1 Schematic diagram of the proposed hierarchy structure
Figure 8.2 One-step ahead prediction value of SVI using RLS method
Figure 8.3 Sensitivity of the ARX model parameters of each state
Figure 8.4 Power spectrum in each state (a) normal (b) bad (c) bulking state
Figure 9.1 Block diagram of the TSK fuzzy model
Figure 9.2 Block diagram of the FPLS method
Figure 9.3 Scatter plots and firing strength plots of FPLS model in benchmark (a) first LV (b) second LV (c) third LV (d) fourth LV
Figure 9.4 Comparisons of LPLS and FPLS for the predicted and actual $S_{NH,e}$ in benchmark (a) Time series plot (b) Scatter plot
Figure 9.5 Comparisons of LPLS and FPLS for the predicted and actual $S_{NO,e}$ in benchmark (a) Time series plot (b) Scatter plot
Figure 9.6 Scatter plots and firing strength plots of FPLS model in BET (a) first LV (b) second LV (c) third LV (d) fourth LV
Figure 9.7 Time series plots of predicted and actual output in BET (a) SVI with LPLS and FPLS (b) ΔCN with LPLS and FPLS (c) ΔCOD with LPLS and FPLS
Figure 9.8 Scatter plots of predicted and actual output in BET (a) SVI with LPLS and FPLS (b) ΔCN with LPLS and FPLS (c) ΔCOD with LPLS and FPLS
List of Tables
Table 2.1 Process Input/Output Variables in full-scale WWTP
Table 5.1 Fault (disturbance) sources in the simulation benchmark
Table 5.2 Process variables in full-scale WWTP
Table 6.1 Process Input/Output Variables in WWTP
Table 6.2 Variations explained by the PLS model of four latent variables
Table 7.1 Two disturbances in the benchmark
Table 7.2 Percent variance captured by PCA model
Table 7.3 MSE of two PLS methods and the proposed method
Table 8.1 Confusion matrix of the test data
Table 9.1 Percent variance captured (%) and MSE of several PLS models in benchmark
Table 9.2 Percent variance captured (%) and MSE of several PLS models in BET
I. Introduction
1.1 Research Motivation
The requirements imposed on the wastewater treatment process (WWTP) in
regard to effluent quality have become increasingly stringent, and existing plants are
subject to increasing loads. Meeting these stricter guidelines requires the
development of an efficient wastewater treatment methodology. One way to improve
process efficiency is to build new and larger treatment plants; however, this option is
expensive and in many cases impossible due to lack of a suitable site. Another way
to improve efficiency is to introduce advanced control techniques and to optimize
operating conditions. This approach may improve the effluent water quality,
decrease the use of chemicals, save energy, and reduce operating costs. Any
sustainable solution to the current problems confronting wastewater treatment will
require the development of an adequate information system for the control and
supervision of WWTP.
Close inspection of the current operation of WWTPs reveals that the use of
instrumentation, control and automation (ICA) technology is minimal (Olsson and
Newell, 1999). Few plants are equipped with more than a few elementary sensing
elements and control loops, which are mostly used for flow metering and control,
and for monitoring the basic plant performance over relatively long periods of time.
Little progress has been made since the early 1970s, when a major leap forward was
made by the widespread introduction of dissolved oxygen (DO) control. The
introduction of ICA technology has been slow due to the lack of reliable
instrumentation and the harsh environment in which the computer and automation
devices are housed and operated. However, this situation is rapidly changing due to
advances in sensor technology and the introduction of smart sensors capable of self-
cleaning, self-calibration and self-reconfiguration. The current trend is towards
integrated systems that control and monitor the process from the wastewater sources
right through to the receiving waters and sludge disposal.
The primary purpose of ICA is to facilitate efficient operation of the WWTP,
allowing effluent standards to be met for the lowest possible operational and capital
costs. The main bottlenecks for the implementation of ICA technology within the
WWTP are related to the following (Olsson and Newell, 1999; Jeppsson et al.,
2001):
Poor legislation
Inadequate education, training and understanding
Lack of confidence and acceptance within WWTP industries
Lack of collaboration between stakeholders/organizations
Cost and time required to develop practical solutions and to make sure that they work
Unreliable measuring devices
Plant constraints and inadequate sewer systems
Lack of transparency
Lack of software and instrument standardization
The increase in public awareness about wastewater disposal over the past
decade, as reflected in more stringent effluent regulation, has considerably increased
the requirements imposed on treatment plants. The treatment process must now
eliminate not only organic carbon pollution from wastewater, but also nutrients (e.g.,
nitrogen and phosphorus). The introduction of biological nutrient removal, the most
economical method for removing nutrients, has significantly increased the
complexity of process configurations. The main driving forces for ICA are related
to:
Stricter effluent quality standards
Demand for lower sludge production
Economic incentives
Reduced energy consumption and increased energy production
Increased plant complexity (co-ordination of processes and loops, monitoring, etc.)
New treatment concepts – e.g. more compact plants and water reuse
New and cheaper technical solutions – computers and communication
At present, many WWTPs are operated according to predetermined schemes
with very little consideration given to variations in influent load. The use of on-line
sensors for on-line control of plant operation could enhance the ability of WWTPs to
comply with assigned effluent standards. The application of modern control theory in
combination with new on-line sensors and appropriate models has great potential to
improve effluent water quality, decrease the use of chemicals, and save energy
and money. In particular, the input/output behavior of these processes can be such
that they appear highly stable right up to the time at which a gross process failure
occurs, that apparently significant input disturbances do not excite any significant output
response, and that very significant responses occur in the absence of any
corresponding input disturbance. This distinctive feature of the WWTP has long
challenged control engineers (Lindberg, 1997; Jeppsson, 1996; Lukasse, 1999;
Olsson and Newell, 1999; Singman, 1999; Steffens and Lant, 1999; Lee, 2000;
Sotomayor et al., 2000).
In contrast to the situation for the WWTP, multivariate statistical monitoring
and diagnosis of process operating performance are extremely important aspects
of plant safety and economic viability in most other process industries, for
example the petrochemical and pharmaceutical industries. However, wastewater
treatment industries are not among the most diligent and systematic users of
statistical monitoring methods. To date, monitoring in wastewater treatment has
mostly focused on a few key effluent quantities that are subject to regulations
enforced by government or other authorities. However, the ongoing tightening of
environmental restrictions requires increased efforts to improve effluent quality from
the WWTP using advanced monitoring technology. To effectively monitor process
behavior statistically, important information must be extracted from the large
number of measured variables, and this information must be presented in a form that
is readily interpreted. The concept of sustainability, which entails minimizing the
use of resources such as energy, chemicals and manpower, has also become an
important issue in the design and modification of WWTPs (Rosen, 1998; Olsson and
Newell, 1999; Teppola, 1999).
However, there are further difficulties to overcome before a monitoring system
can successfully be applied to WWTPs (Rosen, 2001):
Non-stationary data – The conditions under which WWTPs are operated
normally vary. Diurnal, weekly and seasonal patterns are
normally found in the influent wastewater characteristics. These variations
must be considered normal and are in practice seen as the state of things rather
than as disturbances. It is often difficult to discern other process disturbances from
those caused by the varying influent conditions, which tend to have a dominant
effect on the process behavior.
Multiscale data – A difficulty related to the dynamic properties of the
disturbances as well as of the process is that disturbances occur on many
different time scales. That is, some disturbances affect the process within a
short time frame, whereas others act much more slowly. Besides complicating
the discernment of disturbances in much the same way as non-stationarity, this
also degrades the performance of many monitoring
techniques. Moreover, information on the time scale of a disturbance may
prove crucial for a decision on counteractive actions. The multiscale nature of
data is, however, not only a problem; it can also be used to decouple the
process time.
Nonlinearities – Wastewater processes display nonlinear behavior, and
relationships between variables cannot always be approximated by a linear
function. When this is the case, nonlinearities must be taken into
account when developing a monitoring system.
Dynamic data – Almost all data from dynamic processes are autocorrelated,
which means that each observation is not independent of the previous
observation. This may have a great impact on the statistical properties of the
monitoring output, and consequently caution must be taken when interpreting
the result.
1.2 Research Objectives
The principal goal of this research is to develop advanced control and
monitoring systems that improve the operation of the WWTP. This work was
undertaken with the following detailed objectives.
Objective 1: To apply autotuning and supervisory control in the full-
scale WWTP
In the WWTP, PID controllers are familiar to process operators and very
popular because of their simplicity, ease of operation and robustness to modeling
error. However, it is well known that DO dynamics cannot be effectively controlled
by PID controllers with fixed gain parameters. Moreover, manual tuning of PID
controllers is tedious and laborious. In the present study, closed-loop identification
and an auto-tuned PID controller are applied to the DO control system in the full-scale
WWTP. In addition, we propose a method for deciding on a proper DO set point for
the current operating condition of the aeration basin. The full-scale experimental
results showed good identification performance and good control performance.
Objective 2: To develop a robust estimation and control algorithm
We present a generalized damped least squares (GDLS) algorithm to
systematically remove the estimation windup problem in adaptive control and
self-tuning control systems. The key element of the proposed method is the addition
of a penalty on parameter variations to the objective function of the normal least
squares algorithm to prevent the singularity problem. Mathematical analysis shows
that the proposed method has almost equivalent properties to the normal least
squares method and guarantees that no estimation problem will be encountered under
poor excitation. Simulation results show that the proposed method gives
better estimation performance than previous methods. We applied this method in the
simultaneous control and estimation of important variables in the WWTP.
Objectives 3 and 4: To develop a disturbance detection and isolation
algorithm
The biological nutrient removal process alters gradually over time, indicating
that the process is nonstationary. This represents a problem for developing a
conventional multivariate statistical analysis because such an analysis must be
developed from a set of "normal" operating data. Moreover, monitoring of the
biological treatment process is very important because recovery from failures is
time-consuming and expensive. Hence, a reliable detection procedure is needed. We
propose an on-line fault detection and isolation algorithm, which uses a dissimilarity
measure to evaluate the difference between successive data sets and to discriminate
between serious and minor abnormalities. In addition, to cope with the nonstationary
and multiscale process changes in WWTP, we propose a modeling and
multiresolution analysis (MRA) method.
Objective 5: To develop a process monitoring method for continuous processes
with cyclic operation
Most WWTPs are subject to large diurnal fluctuations in the flow rate and
composition of the feed stream. Consequently, WWTPs exhibit daily periodic
characteristics, with strong diurnal fluctuations in the process input and output
variables. Although these processes are non-stationary, their behavior tends to repeat
from cycle to cycle and hence their cycle-to-cycle behavior may be assumed
stationary. We propose a method for monitoring a continuous process with diurnal
cyclic characteristics in the domestic WWTP. The proposed method uses a state-
space model to capture and utilize the cycle-to-cycle correlation structure. The
method can also incorporate the concept of inferential sensing to predict the quality
variables and enhance process monitoring.
Objective 6: To develop a simultaneous prediction and classification method for
the secondary settling tank
Efficient operation of the secondary settler is very important, since it
separates the biomass from the treated wastewater and is a key mechanism
determining the effluent quality in a biological WWTP. Simultaneous prediction and
classification in the secondary settling tank is proposed. An adaptive modeling scheme
is implemented with the recursive least squares (RLS) method to update the model
parameters, and a neural network (NN) classifier is used as the process
classifier. Experimental results show that the prediction model describes the
dynamics of the secondary settler well and that the neural network classifier combined
with the adaptive scheme is adequate for monitoring the secondary settler in
the WWTP.
Objective 7: To develop a nonlinear fuzzy partial least squares method
Fuzzy modeling has proved an efficient alternative for describing nonlinear
biological processes. Recently, the Takagi-Sugeno-Kang (TSK) fuzzy model has
received considerable attention because of its prediction ability and suitability for
continuous process modeling. The TSK model allows us to combine a set of
linearized models into a global model to approximate the complex nonlinear system
with less complexity. We propose a new nonlinear fuzzy partial least squares (FPLS)
algorithm that embeds the TSK fuzzy model into the regression framework of the
partial least squares (PLS) method. The proposed method applies the TSK fuzzy
model to the PLS inner regression. It uses the PLS method to solve the problems of
high dimensionality and collinearity and the TSK fuzzy model to capture the
nonlinearity and to increase the use of experts’ knowledge. We applied this method
in the simulation benchmark and full-scale WWTP.
II. Biological Wastewater Treatment Process
2.1 Activated sludge process
The activated sludge process with its many variations is the basis for the
treatment of wastewater almost everywhere in the United States. Likewise, nearly
99% of municipal WWTPs in Korea use the traditional activated sludge process
(Lee, 2000). Figure 2.1 shows the basic layout of an activated sludge process
(Lindberg, 1997). The activated sludge process is a biological process in which
microorganisms oxidize and mineralize organic matter. All microorganisms enter the
system with the influent wastewater. The composition of the species depends not
only on the influent wastewater but also on the design and operation of the WWTP. The
activated sludge is kept suspended in water by stirring and aeration. The
microorganisms use oxygen to oxidize organic matter. To maintain the
microorganism concentration, the sludge from the settler is recycled to the aeration
tank. The biomass produced by microbial growth and the influent particulate inert
matter are removed from the process as excess sludge. The microorganism concentration is
controlled by the excess sludge flow rate.
Biological nitrogen removal
Nitrogen materials can enter the aquatic environment from either natural or
human-caused sources. Excessive accumulation of various forms of nitrogen in
surface and ground waters can lead to adverse ecological and human health effects.
One of the major effects has been the direct and indirect depletion of oxygen in
receiving water. Other impacts can be of major importance in particular situations.
These include ammonia toxicity to aquatic animal life, adverse public health effects,
and a reduction in the suitability of water for reuse. Since the early 1970s significant
developments have taken place in the activated sludge method of treating
wastewater, as nutrient removal has become a very important factor in the WWTP.
Nitrogen is present in several forms in wastewater, e.g. as
ammonium (NH4+), nitrate (NO3-), nitrite (NO2-) and organic compounds. Nitrogen
is an essential nutrient for biological growth and is one of the main constituents of
all living organisms. When untreated wastewater arrives at the treatment plant,
most nitrogen is present in the form of ammonium. Nitrogen can be removed by a
two-step procedure. In the first step, ammonium is oxidized to nitrate in aerated
zones (nitrification). The microorganisms carrying out this process are generally
considered to be Nitrosomonas and Nitrobacter. The aerobic growth of autotrophs
consumes soluble carbon, ammonia and dissolved oxygen to produce extra biomass
and nitrate in solution. This step can be further divided into two, the first producing
nitrite and the second further oxidizing nitrite to nitrate.
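$$
\begin{aligned}
\mathrm{NH_4^+} + 1.5\,\mathrm{O_2} &\rightarrow \mathrm{NO_2^-} + 2\,\mathrm{H^+} + \mathrm{H_2O} \\
\mathrm{NO_2^-} + 0.5\,\mathrm{O_2} &\rightarrow \mathrm{NO_3^-}
\end{aligned}
\tag{2.1}
$$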
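As a rough worked example of what reaction (2.1) implies for aeration, the sketch below computes the stoichiometric oxygen demand per gram of ammonium nitrogen. The molar masses and the function name are assumptions introduced for illustration, and biomass synthesis, which lowers the real demand somewhat, is ignored.

```python
# Stoichiometric oxygen demand of two-step nitrification (illustrative only).
# Step 1: NH4+ + 1.5 O2 -> NO2- + 2 H+ + H2O
# Step 2: NO2- + 0.5 O2 -> NO3-
M_N = 14.007    # g/mol, nitrogen (assumed standard atomic mass)
M_O2 = 31.998   # g/mol, molecular oxygen (assumed standard molar mass)


def oxygen_demand_per_g_n(mol_o2_per_mol_n):
    """Grams of O2 consumed per gram of nitrogen oxidized."""
    return mol_o2_per_mol_n * M_O2 / M_N


step1 = oxygen_demand_per_g_n(1.5)   # ammonium to nitrite
step2 = oxygen_demand_per_g_n(0.5)   # nitrite to nitrate
total = step1 + step2

print(f"ammonium -> nitrite: {step1:.2f} g O2 / g N")
print(f"nitrite  -> nitrate: {step2:.2f} g O2 / g N")
print(f"overall            : {total:.2f} g O2 / g N")
```

Most of the oxygen is consumed in the first step, which is one reason why the aerated zones dominate the energy use of a nitrifying plant.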
The second major step is the anoxic growth of heterotrophs, which use nitrate
as the oxidizer and produce extra biomass and nitrogen gas (denitrification). This
process takes place in the absence of dissolved oxygen, where the bacteria responsible
for denitrification respire with nitrate instead of oxygen (anoxic conditions).
$$
4\,\mathrm{NO_3^-} + 5\,\text{``}\mathrm{CH_2O}\text{''} + 4\,\mathrm{H^+} \rightarrow 2\,\mathrm{N_2} + 5\,\mathrm{CO_2} + 7\,\mathrm{H_2O}
\tag{2.2}
$$
where “CH2O” stands for the diverse organic matter (Lindberg, 1997).
By using these two bacterial processes, nitrogen is removed from wastewater
biologically. Anoxic zones in the activated sludge process are necessary for
denitrification. Anoxic zones can be placed either in the beginning of the tank (pre-
denitrification) or in the end of the tank (post-denitrification). In a pre-denitrifying
system, an extra recirculation flow is usually introduced to transport the nitrate rich
water back to the anoxic zone. For successful denitrification, a sufficiently high
influent carbon:nitrogen ratio is required. When this requirement is not met, an
external carbon source has to be added. The dosing rate of that carbon is important.
Dosing an insufficient amount will result in a high effluent nitrate concentration.
Dosing too much will increase the costs considerably due to high external carbon
use, high sludge production, and an increased oxygen demand. The strong
variation in influent flow and composition, which is typical for WWTPs, generates a
demand for on-line control of the denitrification process in order to guarantee a
sufficiently low effluent nitrate concentration. Two variables can be manipulated to
achieve this objective: (1) the external carbon dosage, to guarantee that almost all
the recirculated nitrate is removed in the anoxic zone; and (2) the nitrate
recirculation flow rate, to control the amount of nitrate that is recirculated. For
optimal control of the process, the two variables should be controlled simultaneously
to form a multivariable control system (Jeppsson, 1996; Lindberg, 1997; Steffens
and Lant, 1999; Sotomayor et al., 2000).
2.2 Simulation Benchmark
WWTPs are large non-linear systems subject to perturbations in flow and load,
together with uncertainties concerning the composition of the incoming wastewater.
Nevertheless these plants have to be operated continuously, meeting stricter and
stricter regulations. Many control strategies have been proposed in the literature but
their evaluation and comparison, either in real-life applications or based on
simulations, is difficult. This is partly due to the variability of the influent, the
complexity of the biological and hydrodynamic phenomena and the large range of
time constants (from a few minutes to several days, even weeks), but also to the lack
of standard evaluation criteria. It is difficult to judge the particular influence of the
applied control strategy on reported performance increase, because the reference
situation is often not optimal. Due to the complexity of the systems the effort to
develop alternative control approaches is so high that a fair comparison between
different options is very rarely made. Then it remains difficult to conclude to what
extent the proposed solution is process or location specific. To enhance the
acceptance of innovative control strategies the evaluation should be based on a
rigorous methodology including a simulation model, plant layout, controllers,
performance criteria and test procedures. To this end, there has been a recent effort
to develop a standardized simulation protocol – ‘simulation benchmark’.
The COST 682 Working Group No.2 has developed a benchmark for
evaluation of control strategies by simulation (COST-624). The benchmark is a
simulation environment defining a plant layout, a simulation model, influent loads,
test procedures and evaluation criteria. For each of these items, compromises were
pursued to combine simplicity with realism and accepted standards. Once the user
has validated the simulation code, any control strategy can be applied and the
performance can be evaluated according to certain criteria (Alex et al., 1999; Pons et
al., 1999; Copp et al., 2000).
A relatively simple layout was selected for the simulation benchmark. It combines
nitrification with pre-denitrification, which is most commonly used for nitrogen
removal. Figure 2.2 shows a schematic representation of the layout. The plant
consists of a 5-compartment bioreactor (6000 m3) and a secondary settler (6000 m3).
The first two compartments of the bioreactor are not aerated
whereas the others are aerated. The IAWQ model No. 1 (Henze et al., 1987) and a
ten-layer one-dimensional settler model (Takács et al., 1991) are used to simulate
the biological reactions and the settling process, respectively. Influent data
developed by a working group on benchmarking of WWTP, COST 624, are used in
the simulation. The return sludge flow rate (Qr) is set to 100% of the influent flow
rate and internal recirculation (Qa) is controlled using a set point (SNO,ref) of 1.0 mg
N/l for the nitrate concentration in the second compartment. The aeration (KLa) in the
aerators 3 and 4 is set to a constant value, 240 day-1. The DO concentration in
aerator 5 is controlled to a set point of 2.0 mg/l. Simulated influent data are available
in three two-week files derived from real operating data. The files were generated to
simulate three weather situations representing dry weather, storm weather (dry
weather + 2 storm events), and rain weather (dry weather + long rain period). Each
file exhibits characteristic diurnal variations in flow and component concentrations.
Each file contains 14 days of influent data at 15-minute sampling intervals. Any
control strategy should be tested using each of these weather files.
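The operating settings quoted above can be collected in one place, which makes it easier to check that a simulation run uses the stated benchmark conditions. The following is only a sketch: the class and field names are hypothetical and are not part of the benchmark definition; the numerical values are those given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkSettings:
    """Operating settings quoted in the text (field names are hypothetical)."""
    n_compartments: int = 5
    n_unaerated: int = 2                     # first two compartments are not aerated
    bioreactor_volume_m3: float = 6000.0
    settler_volume_m3: float = 6000.0
    return_sludge_ratio: float = 1.0         # Qr = 100 % of the influent flow rate
    nitrate_setpoint_mg_n_per_l: float = 1.0
    kla_aerators_3_4_per_day: float = 240.0
    do_setpoint_aerator_5_mg_per_l: float = 2.0
    days_per_file: int = 14
    sampling_interval_min: int = 15
    weather_files: tuple = ("dry", "storm", "rain")


def samples_per_file(s: BenchmarkSettings) -> int:
    """Number of 15-minute sampling intervals contained in one influent file."""
    return s.days_per_file * 24 * 60 // s.sampling_interval_min


settings = BenchmarkSettings()
print("aerated compartments:", settings.n_compartments - settings.n_unaerated)
print("sampling intervals per weather file:", samples_per_file(settings))
```

A control strategy would then be run once per entry in `weather_files`, as the text requires.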
2.3 Full-scale WWTP
The process data were collected from a WWTP that treats the coke plant
wastewater of an iron and steel making plant in Korea. It is a general activated
sludge process that has five aeration basins (each 900 m3) and a secondary clarifier
(1200 m3). The layout of the studied activated sludge plant is presented in
Figure 2.3. It has two wastewater sources: one comes directly from a coke
making plant (called BET3) and the other is pretreated wastewater from an
upstream WWTP at another coke making plant (called BET2). Coke-oven plant
wastewater is produced during the conversion of coal to coke in the steel
making industry. It is extremely difficult to treat coke wastewater because it is
highly polluted and most of its chemical oxygen demand (COD) originates from
large quantities of toxic, inhibitory compounds and coal-derived liquors (e.g.
phenolics, thiocyanate, cyanides, poly-hydrocarbons and ammonium). In particular,
cyanide (CN) is the most important component of the influent
load of the coke wastewater. The influent flow rate is 250 – 350 m3/hr, the influent
COD is 500 – 1200 mg/l, the influent cyanide is 5 – 30 mg/l, the influent temperature
is 30 – 45 °C, the temperature in the aeration basin is 27 – 33 °C, and the operating
cost was 0.08 $/ton in 1999.
Twelve process and manipulated variables (the X block) were used to model three
process output variables (the Y block). The Y block consists of the sludge volume
index (SVI), the reduction of cyanide, and the reduction of COD. Table 2.1 describes
the process variables and presents the mean and standard deviation (SD) values of the
X and Y blocks. The process data consisted of daily mean values from 1 January 1998
to 9 November 2000, with a total of 1034 observations. The first 720
observations were used as training data, and the remaining 314 observations
were used as a test data set in order to verify the proposed methods.
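The chronological split described above can be sketched as follows. Scaling each variable with the training-set mean and standard deviation is an assumption added here (it is common practice before multivariate statistical modeling, but the text does not state it), and the function name and the synthetic rows are purely illustrative stand-ins for the plant data.

```python
def split_and_autoscale(rows, n_train):
    """Split rows chronologically, then scale both parts with training statistics."""
    train, test = rows[:n_train], rows[n_train:]
    n_cols = len(train[0])
    means, sds = [], []
    for c in range(n_cols):
        col = [r[c] for r in train]
        mu = sum(col) / len(col)
        var = sum((v - mu) ** 2 for v in col) / (len(col) - 1)
        means.append(mu)
        sds.append(var ** 0.5)

    def scale(block):
        # The test block is scaled with the TRAINING mean and SD, never its own.
        return [[(r[c] - means[c]) / sds[c] for c in range(n_cols)] for r in block]

    return scale(train), scale(test), means, sds


# Synthetic stand-in for the 1034 daily observations (NOT the plant data).
rows = [[float(i % 30), float((i * 7) % 11)] for i in range(1034)]
train, test, means, sds = split_and_autoscale(rows, n_train=720)
print(len(train), "training rows,", len(test), "test rows")
```

Keeping the split chronological, rather than random, means the test set checks how well a model built on earlier operation describes later operation.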
Figure 2.1 A basic activated sludge process with an aerated tank and a settler
Figure 2.2 A simplified process scheme of an activated sludge process using pre-
denitrification (layout of simulation benchmark)
[Figure 2.2 diagram labels: wastewater (Q0, Z0); biological reactor with Units 1 to 5 (anoxic section, aerated section); clarifier (layers m = 1, m = 6, m = 10); to river (Qe, Ze); internal recycle (Qa, Za); external recycle (Qr, Zr); wastage (Qw, Zw); Qf, Zf; Qu, Zu; PI controllers for nitrate and for dissolved oxygen; kLa = oxygen transfer coefficient]
[Figure 2.3 diagram labels: cokes plant; BET2 (pretreated WWTP); BET3; Eq. T/K; aeration basins A, B, C, D, E; settler; recycle sludge; waste sludge; final WWTP]
Figure 2.3 Plant layout of coke WWTP, Korea
Table 2.1 Process Input/Output Variables in full-scale WWTP
No    Variable     Description                       Unit   Mean    SD
X1    Q2           Flow rate from BET2               m3/h   179.4   15.98
X2    Q3           Flow rate from BET3               m3/h   85.53   8.726
X3    CN2          Cyanide from BET2                 mg/L   2.455   0.3764
X4    CN3          Cyanide from BET3                 mg/L   14.35   2.491
X5    COD2         COD from BET2                     mg/L   156.4   20.28
X6    COD3         COD from BET3                     mg/L   2083    295.5
X7    MLSS_%E      MLVSS at final aeration basin     mg/L   1605    409.3
X8    MLSS_R       MLSS in recycle                   mg/L   7194    3444
X9    DOaerator    DO at final aeration basin        mg/L   2.064   0.9979
X10   Tinfluent    Influent temperature              °C     37.6    2.513
X11   Taerator     Temperature at final aerator      °C     30.74   2.379
X12   pHAT         pH at final aeration basin        -      7.24    0.22
Y1    SVIsettler   Sludge volume index at settler    mL/g   63.31   21.73
Y2    CNred        Cyanide reduction                 mg/L   19.31   4.2
Y3    CODred       COD reduction                     mg/L   605.4   97
III. Autotuning and Supervisory DO Control in Fullscale
WWTP
3.1 Introduction
The dissolved oxygen (DO) concentration in the WWTP has been recognized as an
important variable to be controlled for both economic and process efficiency
reasons. Proper control of DO can improve process performance,
and there is an economic incentive to minimize excess oxygenation by supplying
only the air necessary to meet the time-varying oxygen demand of the mixed liquor.
Despite the relatively simple dynamics of the DO mass balance, DO control is
known to be difficult because of time-varying influent wastewater conditions,
nonlinearity, time delay, sensor noise and slow sensor dynamics. To overcome these
problems, several adaptive control strategies have been suggested for the
control of the DO concentration in the aeration basin (Holmberg et al., 1989; Carlsson,
1993; Carlsson et al., 1994, 1996; Lindberg, 1997). These
advanced control algorithms require detailed process information such as the oxygen
transfer rate, respiration rate, reactor volume and wastewater flow rate, and use
mathematically complex algorithms that are difficult to implement on-line.
Moreover, they cannot be implemented with the PID controller, which is the
most common controller in real WWTPs. Therefore, a method is required that improves
PID controller performance using only the process input-output data,
without any complicated algorithm. Automatic tuning of the PID
controller (autotuning) is such a method.
In the WWTP, PID controllers are familiar to process operators and very popular
because of their simplicity, ease of operation and robustness to modeling error. But
it is well known that the DO concentration cannot be controlled effectively by using
a PID controller with fixed gain parameters. Moreover, the manual tuning of a PID
controller is tedious and laborious. To avoid time-consuming manual tuning
procedures, many on-line identification methods have been
proposed to obtain the process information (Åström and Hägglund, 1985; Sung et al.,
1998a, 1999). Carlsson et al. (1994) used the autotuning controller
suggested by Åström and Hägglund to control the DO concentration in a WWTP, and Diue
et al. (1995) used the relay feedback method to tune the PID loop controller
parameters of a PLC in the chlorination and dechlorination process.
The first objective of this research is to apply autotuning to the actual DO
control system in the full-scale WWTP. The method approximates the dissolved oxygen
dynamics by a high order model using the integral transform method and reduces it
to a first-order plus time delay (FOPTD) or second-order plus time delay (SOPTD)
model for PID controller tuning. The PID controller is then tuned based on the reduced
model. The second objective is to suggest a simple supervisory control algorithm
that decides a proper DO set point for the current operating condition of the aeration
basin. The key idea is that the DO set point is determined as a linear function of the
respiration rate, which is an indicator of the biologically degradable load. Because
real-time respirometry is not available in the full-scale plant, we used a well-known
respiration rate estimation algorithm, based on recursive parameter estimation, that
is suitable for WWTPs with surface aerators. The proposed methods have been
evaluated in the full-scale WWTP.
3.2 Method
3.2.1 Autotuning Method
System identification under closed-loop conditions has become an important
issue in industrial and environmental applications, since with open-loop
identification the process output can drift away from its normal steady state.
This section explains a closed-loop identification method based on the integral
transform (Whitfield and Messali, 1987; Sung et al.,
1998a, b, 1999). It can use the process input and output excited by any test
signal generator (e.g. the controller itself, a relay, a P controller, a simple set point
change of the PID controller, or a pulse or step input), provided the process is excited
sufficiently. The operator can therefore excite the process in whichever way is most
convenient.
The identification has the following steps. First, the process is activated
sufficiently to guarantee that the process output and the process input include
required information. Second, the differential equation of the parametric model is
converted to the corresponding linear algebraic equation by using the integral
transform. Third, the model parameters are estimated by using a least squares
method based on the measured process data.
Consider a general high order process model in the time domain.
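$$
a_n \frac{d^n y}{dt^n} + a_{n-1} \frac{d^{n-1} y}{dt^{n-1}} + \cdots + a_1 \frac{dy}{dt} + y
= b_m \frac{d^m u}{dt^m} + b_{m-1} \frac{d^{m-1} u}{dt^{m-1}} + \cdots + b_1 \frac{du}{dt} + b_0 u
\tag{3.1}
$$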
The above model (3.1) can approximate usual processes as accurately as
desired, even though the processes include time delay or non-minimum phase zeroes.
To convert the differential equation to an algebraic equation, the following integral
transform (3.2) is applied to both sides of equation (3.1).
$$
I\_y(i, tf_j) = \underbrace{\int_0^{tf_j} \int \cdots \int}_{i} y(\tau_1)\, d\tau_1 \cdots d\tau_{i-1}\, d\tau_i
\tag{3.2}
$$
and as a result, equation (3.3) is obtained.
$$
a_n\, I\_y(0, tf_j) + a_{n-1}\, I\_y(1, tf_j) + \cdots + a_1\, I\_y(n-1, tf_j) + I\_y(n, tf_j)
= b_m\, I\_u(n-m, tf_j) + b_{m-1}\, I\_u(n-m+1, tf_j) + \cdots + b_0\, I\_u(n, tf_j)
\tag{3.3}
$$
The objective of the identification is to estimate the coefficients ak and bk. In
equation (3.3), all integrated values can be calculated numerically for various tfj
values. Then ak and bk are obtained by the least squares algorithm. It should be
pointed out that the identification method using the integral transform does not
depend on the type of signal generator, provided the signal excites the process
sufficiently, and it uses only the least squares method to estimate the parameters of
the process model. The identified high order transfer function model can be used as
the process model for adaptive control, a Smith predictor or another model-based
controller. On the other hand, we should reduce the identified model to an FOPTD
or SOPTD model to tune the PID controller automatically, because many established
on-line PID tuning methods such as internal model control (IMC), the integral of the
time-weighted absolute value of the error (ITAE) and Cohen-Coon (C-C) are based
on these models.
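The three identification steps (excite the process, integrate the model equation, solve by least squares) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the plant: the signals are uniformly sampled deviations from an initial steady state (zero initial conditions), the repeated integrals are approximated with the trapezoidal rule, every sample time is used as one tfj, and all function names are hypothetical.

```python
import math


def cumtrapz(values, dt):
    """Running trapezoidal integral of a uniformly sampled signal, starting at 0."""
    out = [0.0]
    for k in range(1, len(values)):
        out.append(out[-1] + 0.5 * dt * (values[k - 1] + values[k]))
    return out


def repeated_integrals(signal, dt, order):
    """Return [I(0), ..., I(order)], where I(i) is the i-fold running integral."""
    levels = [list(signal)]
    for _ in range(order):
        levels.append(cumtrapz(levels[-1], dt))
    return levels


def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def identify(u, y, dt, n, m):
    """Least-squares fit of the integrated model equation.

    Returns ([a_n, ..., a_1], [b_m, ..., b_0]); assumes zero initial conditions.
    """
    Iy = repeated_integrals(y, dt, n)
    Iu = repeated_integrals(u, dt, n)
    rows, targets = [], []
    for j in range(1, len(y)):                            # each sample time is one tf_j
        row = [-Iy[i][j] for i in range(n)]               # multiplies a_n ... a_1
        row += [Iu[n - k][j] for k in range(m, -1, -1)]   # multiplies b_m ... b_0
        rows.append(row)
        targets.append(Iy[n][j])                          # term without a coefficient
    p = len(rows[0])
    AtA = [[sum(r[i] * r[k] for r in rows) for k in range(p)] for i in range(p)]
    Atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(p)]
    theta = solve_linear(AtA, Atb)                        # normal equations
    return theta[:n], theta[n:]


# Synthetic check: 0.35 dy/dt + y = 0.2 u with a unit step in u (not plant data).
dt = 0.01
t = [k * dt for k in range(301)]
u = [1.0] * len(t)
y = [0.2 * (1.0 - math.exp(-tk / 0.35)) for tk in t]
a, b = identify(u, y, dt, n=1, m=0)
print("a1 estimate:", a[0], " b0 estimate:", b[0])
```

On this noise-free test signal the estimates should come out close to the true values 0.35 and 0.2. Solving the normal equations directly is adequate for a handful of parameters; for higher orders or noisy data a QR-based least squares solver would be the more robust choice.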
The PID controller can be tuned at a number of operating conditions such as
high/low respiration rate or load. Because the proposed method can be applied
separately for each operating condition, it can effectively
compensate for changes in operating condition by using different PID controller
parameters. But how should the set point be chosen? The idea, developed next, is
that the DO set point is determined from the respiration rate, which is an indicator
of the biologically degradable load.
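To illustrate how a reduced FOPTD model translates into PID settings for one operating condition, the sketch below applies the commonly tabulated ITAE correlations for load-disturbance rejection. The coefficients are the textbook values for an ideal PID controller and are an assumption here, since the text does not list the coefficients it used; the function name is hypothetical. The example uses the FOPTD parameters reported in Section 3.3 (gain 0.2, time constant 21 min, time delay 11.4 min), expressed in hours.

```python
def itae_pid_disturbance(gain, tau, delay):
    """PID settings from ITAE load-disturbance correlations for a FOPTD model.

    Model: gain * exp(-delay * s) / (tau * s + 1).
    Returns (Kc, tau_i, tau_d); the times are in the unit of tau and delay.
    Correlation coefficients are assumed textbook values, not taken from the thesis.
    """
    ratio = delay / tau
    kc = 1.357 * ratio ** -0.947 / gain          # K * Kc     = 1.357 (delay/tau)^-0.947
    tau_i = tau / (0.842 * ratio ** -0.738)      # tau/tau_i  = 0.842 (delay/tau)^-0.738
    tau_d = tau * 0.381 * ratio ** 0.995         # tau_d/tau  = 0.381 (delay/tau)^0.995
    return kc, tau_i, tau_d


# FOPTD parameters quoted in Section 3.3, converted to hours.
kc, tau_i, tau_d = itae_pid_disturbance(gain=0.2, tau=21.0 / 60.0, delay=11.4 / 60.0)
print(f"Kc = {kc:.2f}, tau_i = {tau_i:.3f} h, tau_d = {tau_d:.3f} h")
```

Repeating the identification at another operating condition and calling the same function with the new model parameters gives a second parameter set, which is the gain-scheduling idea described above.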
3.2.2 Supervisory Control
Supervisory control, which recommends a proper DO set point for the current
operating condition of the aeration basin, has been problematic in the WWTP. Proper
aeration is crucial to treatment efficiency, since an insufficient DO level impairs the
oxidation process and eventually leads to biomass death, whereas a too high DO level
may cause the sludge to settle poorly. Excessive aeration is also undesirable from an
economic point of view, since the excess oxygen is lost to the atmosphere.
Therefore, a proper DO set point may give advantages such
as better control of effluent and energy saving from the lower DO level. However,
there have been few guidelines for the proper supervisory control of DO set point
until now (Lindberg, 1997). Lindberg (1997) suggested a set point controller which
utilizes measurements of ammonia concentration in the aeration basin.
The key idea is that the DO set point is determined from the respiration
rate and influent loading, because the respiration rate is the important variable that
characterizes the DO process and the associated removal and degradation of
biodegradable matter, and is the only indicator of the biologically degradable load.
For example, if toxic matter enters the plant, this can be detected as a decrease in
the respiration rate, since the microorganisms reduce their activity or some of them
die. A rapid decrease in the respiration rate may hence be used as a warning
that toxic matter has entered the plant. In this case, we should increase the DO set
point. Therefore, we suggest the following decision rule: “The higher the
respiration rate, the lower the DO set point. The lower the respiration rate, the higher
the DO set point”. Figure 3.2 shows the scheme of the supervisory control to decide
the set point of the DO controller.
Because a real-time respiration rate meter was not available under the waste loading
conditions, we used a well-known respiration rate estimation algorithm based on a
Kalman filter, which is suitable for full-scale WWTPs with surface aerators. The
estimated parameters are also used to judge the present operating state and
process load. There are several approaches, such as recursive methods, for
estimating the oxygen transfer rate (KLa) and respiration rate (R(t)) from
measurements of DO and airflow rate (Holmberg, 1989; Carlsson et al., 1994;
Lindberg, 1997; Joanquin et al., 1998). Here we used Lindberg’s method
(Lindberg, 1997). The KLa and the respiration rate are tracked by a Kalman filter
using measurements of DO and airflow rate, u(t). During the autotuning phase, the
airflow rate or aerator speed is given a high excitation both in amplitude
and frequency. The estimation procedure is performed on a relatively short data set,
in our case the autotuning identification time. The estimated models of the
respiration rate and oxygen transfer rate can then be used for other controller designs.
The estimated value of the respiration rate is used as the basis for the set
point decision.
3.3 Experimental Results
In this work, the experiment was performed in the industrial coke wastewater
treatment facility of an iron and steel making plant in Korea. Figure 3.3 shows a
schematic diagram of the WWTP considered in this research. The plant consists of
two parts: the biological process, made up of the activated sludge process, and
the chemical treatment process. As shown in Figure 3.3, the WWTP has five
aeration basins and one settling tank in the biological process. Each aeration basin is
equipped with sensors (pH, DO, ORP, MLSS) and a speed-controllable surface
aerator to supply oxygen. The automatic control system has a PC/PLC
structure, which is based on a number of tag points for supervision, data acquisition,
data storage and analysis. It was designed as a user-friendly control system using
the commercial man machine interface (MMI) software FIX DMACS 7.0.
The PID control algorithm was installed in the MMI and the autotuning program
was implemented in Visual Basic 6.0.
The closed-loop identification method was tested using various input
signals in the real plant from Feb. 2, 2000 to Feb. 28, 2000. In this research, a simple
set point change of the PID controller itself was chosen as the activation signal,
without any control mode change. This choice is simple and stable and makes the
proposed on-line identification method easy to implement. The tested aeration basin
was the last basin, which is the most important in the total wastewater treatment
process. The DO set point was increased from 1.6 to 2.0 mg/l at 0.05 hour for
identification. Figure 3.4 shows the
variations of DO concentration and aerator speed during the closed-loop
identification.
Using the acquired data, a high order model is identified with the identification
method explained above. The system order is chosen as n = 4 and m = 3.
Equation (3.4) gives the identification result.
$$
G_p(s) = \frac{y(s)}{u(s)} = \frac{-0.000012\,s^3 + 0.00098\,s^2 - 0.001\,s + 0.196}{-0.000003\,s^4 + 0.001\,s^3 - 0.0016\,s^2 + 0.325\,s + 1}
\tag{3.4}
$$
For the PID controller tuning, the high order model is reduced to FOPTD
and SOPTD models using the model reduction technique. The reduced models are as
follows, with time in hours.
$$
G_{FOPTD}(s) \cong \frac{0.2\, e^{-0.19 s}}{0.35\,s + 1}, \qquad
G_{SOPTD}(s) \cong \frac{0.2\, e^{-0.17 s}}{0.0044\,s^2 + 0.0036\,s + 1}
\tag{3.5}
$$
In Figure 3.5, Bode plots of the identified models are presented for the model
selection. If high control performance is required, the SOPTD model is recommended.
If a stabilizing controller is the main objective, the FOPTD model is sufficient. In
this research, the FOPTD model is selected for simplicity because the reduced FOPTD
and SOPTD models show similar results in the Bode plots. From the
identification result and theoretical analysis, the time constant is
21 min, the time delay is 11.4 min, and the steady state gain is 0.2. For the model
validation, the aerator speed is increased from 50 to 60 RPM as a step input. Figure
3.6 shows the step response of the real data and the predicted values of the FOPTD
model. The obtained model approximates the behavior of the real plant
successfully in spite of the experimental error, and the identified model is
robust to measurement noise.
During the identification phase of autotuning, we estimated the respiration rate
using the estimation technique described earlier. The process conditions of the
experiment were as follows: the influent flow rate is 280 m3/hr, the temperature of
the aeration basin is 38 °C and DOsat is 6.5 mg/l. The estimation result is shown in
Figure 3.7, and the estimated respiration rate converges to 54 mg/l/h. In the
supervisory control, we adopted the simple set point decision rule introduced
earlier: “The higher the respiration rate, the lower the DO set point. The lower the
respiration rate, the higher the DO set point”. To prevent the DO set point from
becoming too high or too low, it should only be allowed to vary within an interval,
0.5-3.0 mg/l in our case. The respiration rate range is 10-110 mg/l/h. For our coke
WWTP, we suggest the following simple set point decision rule.
$$DO_s = -0.025\,\hat{R}(t) + 2.75 \tag{3.6}$$
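The set point rule can be sketched in a few lines of Python. This is only an illustrative sketch, not the plant implementation: the function name `do_set_point` and its argument names are hypothetical, and the clamping to 0.5-3.0 mg/l simply applies the interval stated in the text to the linear rule with slope -0.025 and intercept 2.75.

```python
def do_set_point(r_hat, slope=-0.025, intercept=2.75, lo=0.5, hi=3.0):
    """Supervisory DO set point [mg/l] from the estimated respiration rate [mg/l/h].

    Linear rule DO_s = slope * r_hat + intercept, limited to [lo, hi].
    The function name and the explicit clamping are illustrative assumptions.
    """
    raw = slope * r_hat + intercept
    return min(max(raw, lo), hi)


# Respiration rates quoted in the text: 54 mg/l/h and about 25 mg/l/h
set_point_high_load = do_set_point(54.0)
set_point_low_load = do_set_point(25.0)
```

With the two respiration rates quoted in this section, the rule gives about 1.4 mg/l and about 2.1 mg/l, which agrees with the set points reported for the 26-day experiment.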
Figure 3.8 shows the control results over 26 days with the PID parameters tuned
using the acquired FOPTD model. As a tuning rule, the ITAE disturbance rejection
rule was selected because it is appropriate for the step input disturbances that
occur frequently in WWTP. The first 13 days show the PID control result with the set
point of 1.4 mg/l. Abnormal process changes in the influent load then occurred at
about the 14th day after the first identification step. Here, the identification
method was used again to acquire a new process model by a set point change to about
1.5 mg/l. Because the experimental result showed that the estimated respiration rate
was low, at about 25.0 mg/l/h, we recalculated the DO set point as 2.1 mg/l based on
the proposed supervisory control law. We then changed the PID control parameters
based on the refreshed process model. The experimental result showed good control
performance in spite of frequent load variations, abrupt upstream transitions and
influent toxicity. Since the identification method uses all measured data sets to
estimate several adjustable parameters, it is particularly robust to measurement
noise. As a result, autotuning and supervisory control achieved an overall
improvement of effluent quality and reduced the electric power cost by 15 % compared
with the fixed gain PID controller.
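For illustration, the tuning step from an FOPTD model to PID settings might look like the following sketch. The thesis does not list the ITAE constants, so the values below are the commonly tabulated ITAE disturbance-rejection correlation constants for a PID controller and should be read as an assumption; the function name `itae_disturbance_pid` is hypothetical.

```python
def itae_disturbance_pid(gain, tau, delay):
    """PID settings from an FOPTD model (gain, time constant, delay).

    Uses the commonly tabulated ITAE disturbance-rejection correlations
    (an assumption, not constants quoted in this thesis):
        K*Kc      = 1.357 * (delay/tau) ** -0.947
        tau/tau_i = 0.842 * (delay/tau) ** -0.738
        tau_d/tau = 0.381 * (delay/tau) ** 0.995
    """
    ratio = delay / tau
    kc = (1.357 / gain) * ratio ** -0.947
    tau_i = (tau / 0.842) * ratio ** 0.738
    tau_d = 0.381 * tau * ratio ** 0.995
    return kc, tau_i, tau_d


# FOPTD model identified above: gain 0.2, time constant 0.35 h, delay 0.19 h
kc, tau_i, tau_d = itae_disturbance_pid(gain=0.2, tau=0.35, delay=0.19)
```

The integral and derivative times come out in hours because the model parameters are given in hours.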
3.4 Conclusions
An autotuning and supervisory control algorithm for DO control in the full-scale
WWTP was proposed and evaluated. Though the proposed methods are concise and do not
require any complicated numerical techniques, the experimental results confirmed
that autotuning and supervisory control brought an overall improvement of effluent
quality and a 15% reduction of the electricity cost. The autotuning method has also
been applied to other control variables in the full-scale WWTP, such as pH, polymer
addition and sludge recycle rate.
Figure 3.1 Identification and autotuning procedure for PID controller
[Block diagram: activated process data, identification, model reduction, tuning rule, PID parameters (kc, τi, τd)]
Figure 3.2 Supervisory DO control scheme
[Block diagram with blocks: set point controller, PID (DO) controller, ASP; signals: R(t), DOs, AFR, input, DO]
Figure 3.3 Schematic diagram of full-scale WWTP
[Schematic: influent wastewater (pH, temperature, flow rate), equalization basin, 1st settling tank, aeration basins #A to #E with aerators (pH, DO, ORP, MLSS), 2nd settling tank, flocculator, thickening tank, filter press, effluent; biological treatment (activated sludge) and chemical treatment sections; PLC and MMI (VB)]
Figure 3.4 Experimental result during identification phase
[Plot: aerator speed [RPM] and DO [mg/l] versus time [hour]]
Figure 3.5 Bode plots of the identified FOPTD and SOPTD models
[Plots: amplitude ratio (AR) and phase angle φ (deg) versus ω (rad/hour) for the high-order, SOPTD and FOPTD models]
Figure 3.6 The validation test: real data and model prediction value
[Plot: DO [mg/l] versus time [min]; real data and FOPTD model]
Figure 3.7 Estimated respiration rate during identification phase
[Plot: respiration rate [mg/l/h] versus time [hour]]
Figure 3.8 DO control result using autotuning and supervisory control algorithm
[Plot: DO [mg/l] versus time [day]]
IV. Generalized Damped Least Squares Method
4.1 Introduction
Since the main difficulty in controlling biological processes arises from the
variability of kinetic parameters and the limited amount of on-line information, an
adaptive controller is well suited for this purpose. Adaptive control or self-tuning
control is usually based upon simultaneous model identification and control and thus
requires on-line updating of the model parameters rather than off-line processing of
the process data. The Recursive Least Squares (RLS) method is the most popular
on-line recursive parameter estimation algorithm (Lambert, 1987; Ljung,
1987; Söderström & Stoica, 1989; Åström, 1995; Gau & Stadther, 2000). Recursive
parameter estimation is a useful tool in WWTP, since the system often has
time varying characteristics and parameters (Marsili-Libelli, 1990).
In general, the RLS method works well only if the process is properly excited all
the time. With exponential forgetting, problems arise when the excitation is poor,
which is exactly the situation when the controller performs well. Closed-loop
control introduces linear dependencies among the process signals, and the estimation
problem becomes singular when the signals are not sufficiently excited under the
closed loop. This is known as “estimation windup” or “blowup”, in which the gain of
the estimator grows exponentially. It is most easily seen when the process is well
controlled and operated in the steady state. It is a main bottleneck in applying
adaptive controllers (Åström, 1995).
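The windup mechanism can be reproduced with a minimal sketch of standard RLS with exponential forgetting. The function name `rls_update` and the chosen initial values are illustrative assumptions. With a zero regressor (no excitation in deviation variables) the gain vanishes and the covariance matrix is simply divided by λ at every step, so its trace grows exponentially.

```python
import numpy as np


def rls_update(theta, P, phi, y, lam=0.95):
    """One step of standard RLS with exponential forgetting factor lam."""
    phi = np.asarray(phi, dtype=float)
    gain = P @ phi / (lam + phi @ P @ phi)
    theta = theta + gain * (y - phi @ theta)
    P = (P - np.outer(gain, phi @ P)) / lam
    return theta, P


theta = np.zeros(2)
P = np.eye(2)
traces = []
for _ in range(100):
    # Perfectly regulated steady state: deviation variables are all zero
    theta, P = rls_update(theta, P, phi=[0.0, 0.0], y=0.0)
    traces.append(np.trace(P))
```

After k such steps the trace equals tr(P(0))/λ^k, so any later disturbance or noise is amplified by a very large estimator gain.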
On the other hand, estimation of the oxygen transfer rate KLa(u) and the
respiration rate R(t) in WWTP is needed to monitor the biological activity and the
performance of the process control system, or to construct a nonlinear controller
that controls the DO concentration more effectively. Knowledge of the two variables
is of interest in both process diagnosis and process control. In particular, the
respiration rate is the key variable that characterizes the DO process and the
associated removal and degradation of biodegradable matter. It is the only true
indicator of biologically degradable load. If toxic matter enters the plant, this
can be detected as a decrease in the respiration rate, since the microorganisms slow
down their activity or some of them die. A rapid decrease in the respiration rate
may hence be used as a warning that toxic matter has entered the plant, and by-pass
action may be taken to save the microorganisms (Lindberg, 1997).
For on-line purposes, it is crucial to be able to estimate both R(t) and KLa
simultaneously. In this case, the control strategy is dual in the sense that the
control signal is used both to control the dissolved oxygen (DO) concentration and
to excite the DO dynamics sufficiently to allow the parameters to be estimated. That
is, in combining the two algorithms (DO control and R(t)/KLa estimation), a conflict
arises: the controller keeps the DO constant, whereas consistent R(t) and KLa
estimation requires this quantity to vary. These contrasting requirements can be
reconciled by adding a specific input such as a relay or pseudo random binary signal
(PRBS) to the control signal (Holmberg et al., 1989; Marsili-Libelli and Vaggi,
1997). It is then suggested to stop the estimation as soon as convergence is
obtained, which can be detected by low diagonal values in the covariance matrix of
the RLS algorithm. The limitation of this procedure is that time varying parameters
cannot be estimated unless the algorithm is periodically reinitialized. Moreover,
under feedback control, conventional recursive estimation methods such as RLS
always show estimation windup. Several remedies have been suggested that can indeed
reduce the parameter drift, but they are not a systematic solution of the estimation
windup problem in the adaptive controller.
To overcome the shortcomings of the previous methods, we propose a generalized
damped least squares method as a fundamental solution of the estimation windup
problem in the adaptive controller. We then propose an adaptive model-based DO
control algorithm that simultaneously estimates the two key parameters, the oxygen
transfer rate and the respiration rate.
4.2 Theory
4.2.1 Generalized Damped Least Squares Algorithm
Numerous papers have appeared to avoid the estimation windup which often occurs in
practical applications. The proposed remedies include normalization of the
regression vector, matrix regularization, constant trace with a variable forgetting
factor, information measures for turning adaptation on and off, and limits on
parameter variation and maximum values. However, few of the previous methods solve
the estimation windup problem fundamentally.
Two papers are of particular interest. Lambert (1987) introduced a modification to
classical RLS with an exponential forgetting factor. His basic idea is to weight the
change of the estimated parameter vector over only one sampling interval. However,
his idea lacks a multi-step approach to the estimated parameter vector. Tu (1990)
suggested a regularized least squares algorithm with a smoothing constraint and
self-adaptation of the regularization parameter. This method is suitable for
ill-conditioned least squares problems in which the regressor matrix has a high
condition number. It was originally devised as a remedy for numerical stability, so
the approach is not adequate for the system identification and control fields. To
overcome the shortcomings of the previous methods, we extend Lambert’s idea and
propose a generalized damped least squares (GDLS) method.
Consider a dynamic system with input signal {u(t)} and output signal {y(t)}.
Suppose that these signals are sampled in discrete time t = 1, 2, 3, … and that the
sampled values can be related through the Auto-Regressive with eXogenous input
(ARX) model
$$y(t) + a_1y(t-1) + \cdots + a_ny(t-n) = b_1u(t-1) + \cdots + b_mu(t-m) + e(t) \tag{4.1}$$
where y(t) and u(t) are deviation variables (incremental form) and e(t) is white
noise. Model (4.1) describes the dynamic relationship between the input and output
signals. Equation (4.1) can be rewritten as
$$y(t) = \theta_1y(t-1) + \cdots + \theta_ny(t-n) + \theta_{n+1}u(t-1) + \cdots + \theta_{n+m}u(t-m) + e(t) \tag{4.2}$$
$$\theta^T = \left[\,-a_1 \;\cdots\; -a_n \;\; b_1 \;\cdots\; b_m\,\right] \tag{4.3}$$
To solve the estimation windup and drift problem in the closed-loop control system,
we modify the normal least squares algorithm by adding a supplementary,
exponentially weighted restriction on parameter variation to the objective function
of the least squares method. The objective function is as follows:
$$V(\theta) = \min_{\theta}\left[\, w\sum_{t=0}^{N}\lambda^{N-t}\varepsilon(t)^2 + \sum_{t=0}^{N-1}\delta^{N-t-1}\left\|\theta(N)-\theta(t)\right\|_2^2 \right] \tag{4.4}$$
where λ and δ are the exponential forgetting factors for the model error and the
parameter variation, respectively, w is the weighting factor between the modeling
error and the parameter variation, and ε(t) is the one-step-ahead prediction error.
The problem is to obtain the parameter estimate θ that minimizes the quadratic
objective function (4.4). Setting the gradient of (4.4) with respect to θ(N) to zero
gives the following normal equation.
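$$\left(w\Phi_{ls}(N) + \Phi_{add}(N)\right)\theta(N) = wY_{ls}(N) + Y_{add}(N) \tag{4.5}$$
$$\Phi_{ls}(N) = \begin{bmatrix}
\sum_{t=0}^{N}\lambda^{N-t}y(t-1)y(t-1) & \sum_{t=0}^{N}\lambda^{N-t}y(t-1)y(t-2) & \cdots & \sum_{t=0}^{N}\lambda^{N-t}y(t-1)u(t-m) \\
\sum_{t=0}^{N}\lambda^{N-t}y(t-2)y(t-1) & \sum_{t=0}^{N}\lambda^{N-t}y(t-2)y(t-2) & \cdots & \sum_{t=0}^{N}\lambda^{N-t}y(t-2)u(t-m) \\
\vdots & \vdots & & \vdots \\
\sum_{t=0}^{N}\lambda^{N-t}u(t-m)y(t-1) & \sum_{t=0}^{N}\lambda^{N-t}u(t-m)y(t-2) & \cdots & \sum_{t=0}^{N}\lambda^{N-t}u(t-m)u(t-m)
\end{bmatrix} \tag{4.6}$$
$$\Phi_{add}(N) = \sum_{t=0}^{N-1}\delta^{N-t-1}\begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \tag{4.7}$$
$$\theta(N) = \left[\,\theta_1(N) \;\; \theta_2(N) \;\cdots\; \theta_{n+m}(N)\,\right]^T \tag{4.8}$$
$$Y_{ls}(N) = \left[\,\sum_{t=0}^{N}\lambda^{N-t}y(t)y(t-1) \;\;\; \sum_{t=0}^{N}\lambda^{N-t}y(t)y(t-2) \;\cdots\; \sum_{t=0}^{N}\lambda^{N-t}y(t)u(t-m)\,\right]^T \tag{4.9}$$
$$Y_{add}(N) = \left[\,\sum_{t=0}^{N-1}\delta^{N-t-1}\theta_1(t) \;\;\; \sum_{t=0}^{N-1}\delta^{N-t-1}\theta_2(t) \;\cdots\; \sum_{t=0}^{N-1}\delta^{N-t-1}\theta_{n+m}(t)\,\right]^T \tag{4.10}$$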
where Φls and Yls have the same meaning as in the least squares method, and the
additional information matrix Φadd and vector Yadd come from the penalty on
parameter variation. The solution vector is then as follows:
$$\theta(N) = \left(w\Phi_{ls}(N) + \Phi_{add}(N)\right)^{-1}\left(wY_{ls}(N) + Y_{add}(N)\right) \tag{4.11}$$
This equation is the solution of the batch GDLS algorithm. Unlike in the least
squares method, the matrix wΦls(N) + Φadd(N) of the batch GDLS solution is
invertible even when the signal is not sufficiently excited.
It is desirable to make the computations recursive to save computation in adaptive
control, because the process data are obtained successively in real time. The
following equations are required to derive the recursive robust estimation algorithm.
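$$\left[w\Phi_{ls}(N) + \Phi_{add}(N)\right]\theta(N) = wY_{ls}(N) + Y_{add}(N) \tag{4.12}$$
$$\Phi_{ls}(N) = \lambda\Phi_{ls}(N-1) + \begin{bmatrix}
y(N-1)y(N-1) & y(N-1)y(N-2) & \cdots & y(N-1)u(N-m) \\
y(N-2)y(N-1) & y(N-2)y(N-2) & \cdots & y(N-2)u(N-m) \\
\vdots & \vdots & & \vdots \\
u(N-m)y(N-1) & u(N-m)y(N-2) & \cdots & u(N-m)u(N-m)
\end{bmatrix} \tag{4.13}$$
$$\Phi_{add}(N) = \delta\Phi_{add}(N-1) + \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} \tag{4.14}$$
$$Y_{ls}(N) = \lambda Y_{ls}(N-1) + \left[\,y(N)y(N-1) \;\;\; y(N)y(N-2) \;\cdots\; y(N)u(N-m)\,\right]^T \tag{4.15}$$
$$Y_{add}(N) = \delta Y_{add}(N-1) + \theta(N-1) \tag{4.16}$$
$$\theta(N) = \left(w\Phi_{ls}(N) + \Phi_{add}(N)\right)^{-1}\left(wY_{ls}(N) + Y_{add}(N)\right) \tag{4.17}$$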
We call Equation (4.12) the normal equation of the recursive GDLS algorithm.
Equation (4.17) has a strong intuitive appeal. Because the Φadd(N) term makes the
matrix wΦls(N) + Φadd(N) invertible, the algorithm does not suffer from numerical
ill-conditioning even when y(t) and u(t) are zero under closed-loop control or in a
steady state data set.
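The recursion can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the thesis: the class name `RecursiveGDLS`, the helper `demo`, the noise-free first-order test system (a = 0.9, b = 0.5) and the random excitation are all hypothetical, and the identity-matrix recursion is weighted by δ so that it matches the batch sum over δ in the penalty term. A linear solve is used instead of an explicit matrix inverse.

```python
import numpy as np


class RecursiveGDLS:
    """Sketch of recursive GDLS with forgetting factors lam, delta and weight w."""

    def __init__(self, n_params, lam=0.95, delta=0.95, w=1000.0):
        self.lam, self.delta, self.w = lam, delta, w
        self.theta = np.zeros(n_params)               # theta(0) = 0
        self.phi_ls = np.zeros((n_params, n_params))  # least squares matrix
        self.phi_add = np.zeros((n_params, n_params)) # parameter variation matrix
        self.y_ls = np.zeros(n_params)
        self.y_add = np.zeros(n_params)

    def update(self, phi, y):
        """phi: regressor [y(N-1), ..., u(N-m)]; y: new output y(N)."""
        phi = np.asarray(phi, dtype=float)
        self.phi_ls = self.lam * self.phi_ls + np.outer(phi, phi)
        self.y_ls = self.lam * self.y_ls + phi * y
        self.phi_add = self.delta * self.phi_add + np.eye(phi.size)
        self.y_add = self.delta * self.y_add + self.theta  # uses theta(N-1)
        lhs = self.w * self.phi_ls + self.phi_add
        rhs = self.w * self.y_ls + self.y_add
        self.theta = np.linalg.solve(lhs, rhs)
        return self.theta


def demo(seed=0):
    """Excited phase followed by a phase with zero deviation signals."""
    rng = np.random.default_rng(seed)
    est = RecursiveGDLS(n_params=2)
    a, b = 0.9, 0.5                      # hypothetical true parameters
    y_prev, u_prev = 0.0, 0.0
    for _ in range(300):                 # random input gives good excitation
        y = a * y_prev + b * u_prev
        est.update([y_prev, u_prev], y)
        y_prev, u_prev = y, rng.standard_normal()
    excited = est.theta.copy()
    for _ in range(500):                 # no excitation at all
        est.update([0.0, 0.0], 0.0)
    return excited, est.theta.copy()
```

In this sketch the estimate after the unexcited phase stays at the value reached during the excited phase, because the identity term keeps the matrix being solved nonsingular.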
4.2.2 Theoretical Analysis
In this section, we will show that the proposed GDLS algorithm retains the
theoretical properties of the least squares algorithm, such as unbiasedness,
consistency, minimum variance and an exponential rate of convergence, and has more
robust numerical properties.
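Property 1. When the weight goes to infinity (w → ∞), the GDLS parameter estimate
converges to that of the least squares method ($\lim_{w\to\infty}\hat{\theta}_{GDLS} = \hat{\theta}_{ls}$).
This is easily verified. As w becomes infinite, the terms of the GDLS equation become
$$\lim_{w\to\infty}\left(w\Phi_{ls} + \Phi_{add}\right) = w\Phi_{ls}, \qquad \lim_{w\to\infty}\left(wY_{ls} + Y_{add}\right) = wY_{ls}$$
which gives the parameter estimate
$$\lim_{w\to\infty}\hat{\theta}_{GDLS} = \lim_{w\to\infty}\left(w\Phi_{ls}(N) + \Phi_{add}(N)\right)^{-1}\left(wY_{ls}(N) + Y_{add}(N)\right) = \left(w\Phi_{ls}(N)\right)^{-1}wY_{ls}(N) = \hat{\theta}_{ls} \tag{4.18}$$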
Property 2. For a large number of observations (N → ∞), the GDLS objective
function becomes the same as that of the least squares method with exponential
forgetting ($\lim_{N\to\infty}V_{GDLS} = V_{ls}$).
For a large N, there exists an M (< N) such that $\delta^{N-M} \approx 0$. Assume
stable convergence, so that θ(N) = θ(N-1) = … = θ(N-M). The penalty terms older than
N-M are then negligible because of their weights, and the remaining terms vanish
because the parameter differences are zero. The objective function becomes
$$\lim_{N\to\infty}V_{GDLS} = V_{ls} = \min_{\theta}\; w\sum_{t=0}^{N}\lambda^{N-t}\varepsilon(t)^2 \tag{4.19}$$
Hence the objective function is equal to that of the least squares method, and the
statistical properties of normal least squares estimation can be expected to hold
asymptotically for a large number of observations in the proposed method.
Property 3. In the absence of exciting data, the present parameter estimate is equal
to the previous estimate and is thus free from the covariance windup problem, even
for steady state data sets or under closed-loop control.
This follows because the additional diagonal matrix from the parameter variation
penalty acts as a bound and makes the singular matrix nonsingular. Under closed-loop
control or for steady state data sets, y(t) and u(t) become zero, and the matrices
become
$$\Phi_{ls}(N) = \begin{bmatrix} 0 & \cdots & 0 \\ \vdots & & \vdots \\ 0 & \cdots & 0 \end{bmatrix}, \qquad \Phi_{add}(N) = \frac{1}{1-\delta}\begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = \frac{1}{1-\delta}I \tag{4.20}$$
$$Y_{ls}(N) = \left[\,0 \;\cdots\; 0\,\right]^T, \qquad Y_{add}(N) = \frac{1}{1-\delta}\,\theta(N-1) \tag{4.21}$$
$$\Phi_{aug}(N) = w\Phi_{ls}(N) + \Phi_{add}(N) = \Phi_{add}(N) \tag{4.22}$$
$$\lim_{y(t),\,u(t)\to 0}\hat{\theta}_{GDLS} = \left(\frac{1}{1-\delta}I\right)^{-1}\frac{1}{1-\delta}\,\hat{\theta}(N-1) = \hat{\theta}(N-1) \tag{4.23}$$
Thus the current solution is equal to the previous estimate $\hat{\theta}(N-1)$. The
additional term makes the augmented matrix nonsingular and avoids the blow-up
associated with exponentially weighted least squares. Moreover, it is expected to
reduce undesirably large variations in the estimated parameters caused by abrupt
measurement noise.
Property 4. The proposed method has better numerical properties than the least
squares method.
Estimation algorithms are normally implemented on digital computers, and hence
there is the possibility of numerical ill-conditioning. In an ill-posed problem
where $(\Phi_{ls}^T\Phi_{ls})^{-1}$ has a large condition number in normal least
squares, the least squares solution may become very sensitive to perturbations in
the data y(t). This implies that the norm of such a solution is significantly
greater than the norm of the exact solution. In our algorithm, we overcome this
problem fundamentally by adding the constraint
$\sum_{t=0}^{N-1}\delta^{N-t-1}\left\|\theta(N)-\theta(t)\right\|_2^2$ to the
objective function. We can therefore solve the ill-posed inverse problem efficiently
and do not suffer from numerical ill-conditioning for steady state data sets or
under closed-loop control. This property is shared with the regularized least
squares method.
From this analysis, we conclude that the proposed GDLS algorithm keeps the
theoretical properties of least squares, unlike other modified least squares
methods. Other modifications such as covariance resetting alter the geometry and
convergence of the true least squares properties. GDLS thus gains supplementary
robustness while retaining the desirable theoretical properties of least squares. In
addition, the GDLS method can be combined with other variants, e.g. low pass
filtering, conditional updating and a variable forgetting factor.
4.2.3 Soft Sensor of Oxygen Transfer Rate and Respiration Rate
Two general approaches to estimating the respiration rate with a “soft sensor” have
been developed in recent years. The first approach is based on a respirometer, which
estimates the respiration rate from the DO mass balance in a respiration chamber
(Spanjers et al., 1998; Olsson and Newell, 1999). The second approach estimates the
respiration rate directly from the DO sensor and airflow rate measurements in the
aeration basin. In this research, the latter approach is used.
There are several different approaches to estimating R(t) and KLa from simple
measurements of the DO sensor and airflow rate in the real aeration basin. Holmberg
et al. (1989) estimated a linear oxygen transfer rate model recursively; the
excitation of the process was improved by invoking a small relay. Carlsson (1993)
developed a novel approach to estimate the respiration rate using a constrained
piecewise linear model. Lindberg (1997) developed a nonlinear controller using the
oxygen transfer rate estimated during the identification step and presented a
systematic estimation method for the oxygen transfer rate and respiration rate.
Marsili-Libelli and Vaggi (1997) summarized various estimation methods for
respirometric activity in bioprocesses. Holmberg et al. (1989) and Marsili-Libelli
and Vaggi (1997) described a simultaneous estimation scheme for KLa and R(t) based
on the conventional RLS method, taking advantage of the differing time scales of the
two variables. We select this approach for estimating the respiration rate and
oxygen transfer rate.
In this approach, the dissolved oxygen deficit (D) is introduced into the mass
balance equation.
$$\frac{dy(t)}{dt} = \frac{Q(t)}{V}\left(y_{in}(t) - y(t)\right) + K_La(u(t))\left(y_{sat} - y(t)\right) - R(t) \tag{4.24}$$
$$D(t) = y_{sat} - y(t) \tag{4.25}$$
where y(t) is the DO concentration in the aeration basin, yin(t) is the DO
concentration of the input flow, ysat is the saturated value of the DO concentration,
Q(t) is the influent wastewater flow rate, V is the volume of the aerator, KLa(u(t))
is the oxygen transfer rate, u(t) is the airflow rate into the aeration tank from the
air production system, and R(t) is the respiration rate. Because KLa and R(t) vary,
the DO dynamics show time varying behavior.
The discrete-time equation with sampling time h is
$$D(t+h) = e^{-hK_La}D(t) + \frac{1}{K_La}\left(1 - e^{-hK_La}\right)R(t) + \frac{1}{K_La}\left(1 - e^{-hK_La}\right)\frac{Q}{V}(1+r)\,y(t) \tag{4.26}$$
After some manipulations, equation (4.26) can be put in the standard estimation
form.
$$z(t+h) = y(t+h) - \left[1 - \frac{hQ}{V}(1+r)\right]y(t) = K_La(u(t))\,hD(t) - hR(t) \tag{4.27}$$
where z(t+h) is the updating information at each sampling instant. The oxygen
transfer rate KLa(u(t)) can be structured as a function of the airflow rate.
$$K_La(u(t)) = K_0 + K_1U_{air} \cong K_aU_{air} \tag{4.28}$$
It is easily seen that this model can be written as
$$\hat{z}(t+h) = \varphi^T(t)\,\theta(t) \tag{4.29}$$
where $\varphi^T(t) = [\,D(t)hU_{air} \;\; -h\,]$ is the regressor vector,
$\theta(t) = [\,K_a \;\; R(t)\,]^T$ is the parameter vector and
$e(t) = z(t+h) - \hat{z}(t+h)$ is the prediction error.
From the on-line measurements, a soft sensor for KLa(u) and R(t) is designed as a
recursive estimator that uses the influent flow rate, airflow rate and DO
measurements. The estimated parameters θ(t) can be updated according to the RLS
method, but RLS always experiences the estimation windup problem under feedback
control. In this research, we therefore use the GDLS method as the estimation
algorithm under closed-loop control. A schematic of the robust estimator is shown in
Figure 4.1.
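How one regression sample for the soft sensor could be assembled from plant measurements is sketched below. As a simplifying assumption, the sketch uses a forward-Euler step of the continuous DO balance with the inlet DO term, not the exact discretization of this section; the function name `do_regression_sample` and all argument names are hypothetical.

```python
import numpy as np


def do_regression_sample(y_next, y_now, u_air, q, v, y_in, y_sat, h):
    """Return (phi, z) such that z is approximately phi @ [K_a, R].

    Forward-Euler step of the DO mass balance (illustrative assumption):
        y_next = y_now + h * (q / v * (y_in - y_now)
                              + K_a * u_air * (y_sat - y_now) - R)
    """
    deficit = y_sat - y_now                            # D(t) = y_sat - y(t)
    z = y_next - y_now - h * (q / v) * (y_in - y_now)  # known terms moved left
    phi = np.array([deficit * h * u_air, -h])          # regressor [D h U_air, -h]
    return phi, z


# Consistency check with hypothetical values (h = 30 s expressed in hours)
h = 30.0 / 3600.0
k_a, r_true = 0.0018, 30.0
y_now, u_air, q, v, y_in, y_sat = 2.0, 500.0, 300.0, 1000.0, 1.0, 8.0
y_next = y_now + h * (q / v * (y_in - y_now) + k_a * u_air * (y_sat - y_now) - r_true)
phi, z = do_regression_sample(y_next, y_now, u_air, q, v, y_in, y_sat, h)
```

Each pair (phi, z) can then be passed to a recursive estimator whose parameter vector holds the transfer coefficient and the respiration rate.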
4.2.4 Adaptive Model-based DO Control
Despite the relatively simple dynamics of the DO process, DO control by the operator
may not be satisfactory in the biological treatment process, since the DO process
dynamics are time varying. This means that high control and estimation performance
for all operating conditions may be hard to achieve with a conventional method.
After estimating the two key variables, we can design an adaptive model-based DO
controller in order to correct the process/model mismatch and to estimate the
unmeasured state variables. Using the available process input and output
measurements, model-based control adaptively updates the estimated parameters θ. In
this research, the GDLS method is used to avoid the estimation windup problem under
closed-loop control. The updated model is then used by adaptive generic model
control (AGMC) to compute the control input. In AGMC, nonlinear process models can
be embedded into the controller directly without any linearization. AGMC is a very
simple and robust nonlinear control algorithm for single-input single-output (SISO)
processes. Although the control parameters are constant, the updated model can
compensate for the process/model mismatch because of its adaptive and model-based
characteristics (Lee and Sullivan, 1988; Signal and Lee, 1992).
The proposed method for DO control is as follows.
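DO Process:
$$\frac{dy(t)}{dt} = \frac{Q(t)}{V}\left(y_{in}(t) - y(t)\right) + K_La(u(t))\left(y_{sat} - y(t)\right) - R(t) \tag{4.30}$$
DO model:
$$\frac{dy(t)}{dt} = \frac{Q(t)}{V}\left(y_{in}(t) - y(t)\right) + \hat{K}_La(u(t))\left(y_{sat} - y(t)\right) - \hat{R}(t) \tag{4.31}$$
Desired trajectory:
$$\left(\frac{dy(t)}{dt}\right)_{desired} = K_1\left(y_s - y(t)\right) + K_2\int_0^t\left(y_s - y(t)\right)dt = PI \tag{4.32}$$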
Here, the desired trajectory takes the form of a Proportional-Integral (PI)
controller without a state observer. $\hat{K}_La$ and $\hat{R}(t)$ are estimated by
the GDLS estimation algorithm described above. Equating the model (4.31) with the
desired trajectory (4.32) gives the following equation, from which the control input
u(t) is obtained.
$$\hat{K}_La(u(t)) = \frac{PI + \hat{R}(t) + \dfrac{Q(t)}{V}\,y(t) - \dfrac{Q(t)}{V}\,y_{in}(t)}{y_{sat} - y(t)} \tag{4.33}$$
Using the linear model ($\hat{K}_La(u(t)) = \hat{k}_1u(t)$), u(t) can be easily obtained by
$$u(t) = \frac{PI + \hat{R}(t) - \dfrac{Q}{V}\left(y_{in}(t) - y(t)\right)}{\hat{k}_1\left(y_{sat} - y(t)\right)} \tag{4.34}$$
with the constraint $u_{min} \le u(t) \le u_{max}$. In equation (4.34), u(t) appears
explicitly as a function of the estimated values and the measured process values.
The control input has a nonlinear gain, and the numerator terms other than PI amount
to a steady state bias term plus feed-forward compensation of the respiration rate.
Because the estimates of the oxygen transfer rate KLa and the respiration rate R(t)
are updated under the proposed controller, the control action is easily computed
from equation (4.34). Figure 4.2 shows the structure of the model-based DO control
scheme.
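A minimal sketch of this control law is given below. The function name `agmc_do_control` and its argument names are hypothetical; the default trajectory gains and input limits are the values quoted later in the simulation study (K1 = 9.50, K2 = 47.5, 10 ≤ u(t) ≤ 10,000). The integral of the set point error is assumed to be accumulated by the caller, and the sketch assumes y(t) stays below ysat so that the denominator is positive.

```python
def agmc_do_control(y, y_s, integral_error, r_hat, k1_hat, q, v, y_in, y_sat,
                    k1=9.5, k2=47.5, u_min=10.0, u_max=10000.0):
    """Airflow command from the model-based control law with a PI trajectory.

    integral_error is the running integral of (y_s - y), kept by the caller.
    r_hat and k1_hat are the current soft sensor estimates.
    """
    pi_term = k1 * (y_s - y) + k2 * integral_error      # desired dy/dt
    numerator = pi_term + r_hat - (q / v) * (y_in - y)  # feed-forward terms
    u = numerator / (k1_hat * (y_sat - y))              # nonlinear gain
    return min(max(u, u_min), u_max)                    # input constraint


# Hypothetical operating point at the set point with zero integral error
u_steady = agmc_do_control(y=2.0, y_s=2.0, integral_error=0.0, r_hat=30.0,
                           k1_hat=0.0018, q=300.0, v=1000.0, y_in=1.0, y_sat=8.0)
```

Substituting the returned input into the DO model reproduces the desired PI trajectory whenever the input constraint is not active.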
This model-based DO controller shows no offset due to modeling error, since the
structure itself contains integral action on the external input. In the ideal case,
if the estimated values are equal to the true ones, $\hat{K}_La(t) = K_La(t)$ and
$\hat{R}(t) = R(t)$, the model-based control algorithm is offset free. Combining the
DO dynamics (4.30) with the control input (4.34) gives the following error equation,
where e(t) = ys - y(t).
$$\frac{dy(t)}{dt} = K_1e(t) + K_2\int_0^t e(t)\,dt = PI \tag{4.35}$$
So the error signal e(t) approaches zero exponentially. Moreover, the controller is
especially robust to the main disturbance of the DO process, the respiration rate,
because the estimated respiration rate and oxygen transfer rate are contained in the
controller structure.
Adaptive model-based DO control with the proposed GDLS algorithm does not require
any specific estimation phase and achieves both estimation of the respiration rate
and DO control. The estimated respiration rate gives information about the
biological activity and can be used as a monitoring index. Figure 4.3 presents the
proposed procedure for the soft sensor and model-based DO control structure.
4.3 Simulation Study
In this section, we examine the performance of the GDLS algorithm. First, we apply
the GDLS method to the estimation problem of a first-order system under closed-loop
control and discuss the simultaneous estimation and control problem. Second, the
GDLS method is applied to the DO process dynamics. The examples illustrate what
happens when RLS is used and the improvements obtained by using the proposed
algorithm.
First-Order System under Closed Loop Control
The following first-order process is simulated and controlled by a proportional
control law. Throughout the section, data is generated by
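$$\Delta y(t) = a\,\Delta y(t-1) + b\,\Delta u(t-1) \tag{4.36}$$
where
$$a = \begin{cases} -0.1 & \forall t \in (1, 750) \\ -0.9 & \forall t \in (750, 2000) \\ 0.5 & \forall t \in (2000, 3000) \end{cases} \qquad b = \begin{cases} 1.0 & \forall t \in (1, 750) \\ 0.5 & \forall t \in (750, 2000) \\ -0.1 & \forall t \in (2000, 3000) \end{cases}$$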
Here, y(t) is the process output and u(t) is the control input. During the
simulation, the two parameters (a, b) were changed twice. The signal to noise ratio
changes in the same way. Notice that for t > 2000, the sign of the steady state gain
of the process changes, which is a very large change in the system. The process is
excited by a proportional controller of the structure u(t) = 0.15(ys(t) - y(t)). We
generated a random set point every 10 sampling intervals for t < 1000 and a fixed
set point of 1.0 for 1000 < t < 3000 for the closed-loop control. The corresponding
input and output responses are shown in Figure 4.4. The input and output sequences
generated for 3000 sample intervals are used to compare the RLS and GDLS algorithms
on the two-parameter estimation problem (a, b) under closed-loop control. The RLS
initial conditions (λ = 0.95, θ(0) = 0, P(0) = 10^9) were used in the simulation
runs. All values used in the simulation are in deviation form.
RLS with deadband update was used to investigate the windup of the estimated
parameters under closed-loop control. As Figure 4.5 shows, continued identification
during periods of low excitation leads to parameter drift and bad estimates. During
the random set point changes (0, 1000), the estimated parameters are reasonably
accurate and bounded. But in the absence of any input excitation in the interval
(1000, 3000), the estimated parameters depart from the real values. The noise in the
process and the insufficient excitation cause drifting of the parameter estimates
(e.g. $\hat{a}(t)$ and $\hat{b}(t)$). This increases the probability of bursting and
results in deterioration of the response to subsequent set point changes. Under
closed-loop control, the P matrix grows without bound whenever the system excitation
is insufficient. Figure 4.6 shows the trace of P(t) under closed-loop control. The
covariance matrix blows up and the trace of P(t) increases rapidly in the absence of
any input excitation in the interval (1000, 3000). The blowup usually occurs between
set point changes, unmeasured disturbances and measurement noise.
In order to compare the performance of variants of the RLS algorithm, the previous
experiment was repeated with the constant-trace algorithm. This scheme scales the
matrix in such a way that its trace is constant. An additional refinement is to add
a small unit matrix. This gives the so-called regularized constant-trace algorithm
(Åström, 1995).
$$\hat{\theta}(t) = \hat{\theta}(t-1) + K(t)\left(y(t) - \varphi^T(t)\hat{\theta}(t-1)\right)$$
$$K(t) = P(t-1)\varphi(t)\left(\lambda I + \varphi^T(t)P(t-1)\varphi(t)\right)^{-1}$$
$$\bar{P}(t) = \frac{1}{\lambda}\left(P(t-1) - \frac{P(t-1)\varphi(t)\varphi^T(t)P(t-1)}{1 + \varphi^T(t)P(t-1)\varphi(t)}\right)$$
$$P(t) = c_1\frac{\bar{P}(t)}{\mathrm{tr}\left(\bar{P}(t)\right)} + c_2I \tag{4.37}$$
where $c_1 > 0$ and $c_2 \ge 0$. The result for constant-trace RLS combined with
conditional updating is shown in Figure 4.7. We used the following parameters: c1 is
100, c2 is 1 and the estimation deadband is 0.1. The estimate of the input
parameter, $\hat{b}(t)$, is comparatively accurate, while the estimate of the output
parameter, $\hat{a}(t)$, cannot track its correct value and shows slow and poor
estimation once excitation is removed for t > 1000.
We applied the proposed GDLS to the same process data for comparison with RLS. The
following conditions were used in the GDLS simulation (λ = 0.95, δ = 0.95, w = 1000,
θ(0) = 0). Through many simulations, we found that a weighting factor w of around
1000 was adequate under closed-loop control, that the forgetting factor of least
squares, λ, could take a value between 0.9 and 1.0, and that the forgetting factor
of parameter variation, δ, could take a value between 0.0 and 1.0. Other values may
be selected for other processes. Figure 4.8 shows the identification result of GDLS.
Despite alternating periods of exciting and non-exciting data, the result shows good
estimation performance and indicates that the estimated parameters keep close track
of the real values under closed-loop control. Even when the process gain changes,
GDLS correctly tracks the parameter variations.
DO Process
In the simulations, the following DO process is set up.
$$\frac{dy(t)}{dt} = \frac{Q(t)}{V}\left(y_{in}(t) - y(t)\right) + K_La(u(t))\left(y_{sat} - y(t)\right) - R(t) \tag{4.38}$$
where Q(t) = 200-500 l/h, V = 1000 l, ysat = 8 mg/l and yin(t) = 1 mg/l. R(t)
follows the diurnal variation of a WWTP, RSS is the steady state respiration rate
and KLa(u(t)) = 0.0018(1 + (R(t) - RSS)/500)Uair h^-1. This configuration matches
our experimental conditions. The sampling time is 30 seconds. For similarity with
the real process, we added zero mean white measurement noise with a magnitude of 10%
of the process output. We also consider the time delay that always exists in the
real biological treatment process; the time delay of the DO dynamics is twice the
sampling time. In the estimation algorithm, the following setup was used. The RLS
initial conditions (λ = 0.99, θ(0) = θ0, P(0) = 10^6) were used in the simulation
runs. The parameters of GDLS are w = 1000, λ = 0.95 and δ = 0.95.
Figure 4.9 compares the estimation of the RLS and GDLS methods under PID control.
The basic RLS algorithm with a deadband update of 0.05 is used. At the start of the
RLS estimation, the estimated parameters are reasonably accurate and bounded. But
when good feedback control removes the input excitation, the estimated parameters
depart from the real values and diverge after a set point change of the DO
controller. The cause is that the covariance matrix of RLS grows without bound
(estimation windup) whenever the system excitation is insufficient under closed-loop
control. Figure 4.9 thus shows that continued identification during periods of low
excitation leads to parameter drift and bad estimates. On the other hand, the GDLS
method shows good estimation performance in spite of feedback control. Note that
these estimation results were obtained under feedback control with a weakly exciting
signal.
Based on the GDLS estimates, the adaptive model-based DO controller of equation
(4.34) is used with the constraint 10 ≤ u(t) ≤ 10,000. The GMC tuning parameters
are chosen from Lee's reference trajectory shape (Lee and Sullivan, 1988), giving
K1 = 9.50 and K2 = 47.5. In spite of the time-varying R(t) and KLa, the proposed
controller shows good control performance in Figure 4.10. The PID controller,
however, shows some offset because the DO dynamics are subject to a continuously
time-varying influent load and respiration rate. In the adaptive model-based
control, the influent load and respiration rate are compensated by the adaptive and
feed-forward action, and the oxygen transfer rate is compensated by the nonlinear
gain. The adaptive model-based DO control can therefore cope with changes in
operating conditions such as varying load, respiration rate and other process
changes, which occur frequently in WWTP. In addition, the respiration rate
estimated under closed loop (soft sensor) gives information about the biological
activity and can be used as a monitoring index.
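A generic model control (GMC) law for the DO model of equation (4.38) can be sketched as below. The sketch assumes the standard GMC form, in which the desired output rate is K1 e + K2 ∫e dt, and assumes that KLa is proportional to the air flow, KLa = a·u; it is not claimed to reproduce equation (4.34) exactly. The function name, the argument names and the per-hour units of K1 and K2 are hypothetical.

```python
def gmc_air_flow(y, y_sp, integral_error, q, r_hat, a_hat,
                 k1=9.50, k2=47.5, v=1000.0, y_in=1.0, y_sat=8.0,
                 u_min=10.0, u_max=10000.0):
    """Air flow u that makes the modelled dy/dt follow the GMC reference rate.

    r_hat is the estimated respiration rate and a_hat the estimated KLa per
    unit air flow (KLa = a_hat * u), both assumed to come from the soft sensor.
    """
    error = y_sp - y
    desired_rate = k1 * error + k2 * integral_error
    driving_force = max(y_sat - y, 1e-6)  # avoid division by zero near saturation
    # Invert the model: desired_rate = q/v (y_in - y) + KLa (y_sat - y) - r_hat
    kla_needed = (desired_rate - q / v * (y_in - y) + r_hat) / driving_force
    u = kla_needed / a_hat
    return min(max(u, u_min), u_max)      # input constraint 10 <= u <= 10,000


u = gmc_air_flow(y=2.0, y_sp=2.5, integral_error=0.0,
                 q=350.0, r_hat=35.0, a_hat=0.0018)
```

Because the flow term and the estimated respiration rate enter the inversion directly, load changes are compensated in a feed-forward manner, while the division by a_hat(ysat - y) provides the nonlinear gain.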
4.4 Conclusions
In this research, we proposed a simple and systematic estimation method that
addresses the estimation windup problem. On the basis of the analysis, we concluded
that GDLS has the same properties as the least squares algorithm together with
more robust numerical properties. Simulation examples show that the GDLS method
keeps its ability to track the process parameters under closed-loop control. Based
on this robust estimation performance, the model-based DO controller can
efficiently cope with the time-varying characteristics and operating condition
changes that occur frequently in WWTP.
Figure 4.1 Robust estimator of KLa(u(t)) and R(t) with a GDLS method
[Block diagram; block and signal labels: Estimator (GDLS), D(t), Uair, KLa, R(t)]
Figure 4.2 Adaptive model-based DO control strategy with a GDLS algorithm
[Block diagram; labels: set point ys, AGMC, control input u(t), DO dynamics, output
y(t), soft sensor (GDLS) providing the estimates of R(t) and KLa(t)]
Soft sensor and model-based DO control in WWTP
Choose AGMC trajectory parameter (K1, K2)
Measure the process input/output values (u(t), y(t))
Soft sensing of KL a(u) and R(t) by GDLS algorithm
Calculate the adaptive model-based DO control input
Figure 4.3 Flow chart of soft sensor and the model-based DO control algorithm
Figure 4.4 Process output and input during the simulation (a) process output (b)
control input
[Plots; x-axis: sampling intervals (0 to 3000); y-axes: process output, process input]
Figure 4.5 Parameter estimates â(t) and b̂(t) of RLS with exponential data
weighting
[Plots of a and b against sampling intervals (0 to 3000); curves: real value and RLS
estimate]
Figure 4.6 Trace of P with RLS estimation method
[Plot; x-axis: sampling intervals (0 to 3000); y-axis: trace of P]
Figure 4.7 Parameter estimates â(t) and b̂(t) of RLS with constant trace
[Plots of a and b against sampling intervals (0 to 3000); curves: real value and RLS
estimate]
Figure 4.8 Parameter estimates â(t) and b̂(t) with GDLS
[Plots of a and b against sampling intervals (0 to 3000); curves: real value and GDLS
estimate]
Figure 4.9 Estimation comparisons of RLS and GDLS method under PID control
[Plots of respiration rate [mg/l/h] against time [h]; (a) R(t) and Re(t) of GDLS,
(b) R(t) and Re(t) of RLS]
Figure 4.10 Model-based DO control result with a GDLS method (a) process output,
(b) control input
[Plots against time [h]; (a) DO [mg/l]: set point, PID, proposed; (b) UAIR [l/h]: PID,
proposed]
V. Disturbance Detection and Isolation in WWTP
5.1 Introduction
The recent tightening of environmental restrictions has increased efforts to attain
higher effluent quality from WWTP. Achieving this goal requires advanced monitoring
of plant performance. Most changes in a biological WWTP are slow when the process
is recovering from a ‘bad’ state to a ‘normal’ state. Early detection and isolation
of faults in the biological process are therefore valuable because they allow
corrective action to be taken well before the situation becomes dangerous. Some
changes are not very obvious and may gradually grow until they become a serious
operational problem. The discrimination between serious and minor anomalies is of
primary concern in the monitoring of these processes. To make this distinction, a
reliable procedure for the detection and isolation of disturbances is needed.
In the case of the activated sludge process (ASP), multivariate statistical
process control (MSPC) has been developed to extract useful information from
process data and utilize it for monitoring and detection (Krofta et al., 1995; Rosen
and Olsson, 1998; Olsson and Newell, 1999; Teppola, 1999). Krofta et al. (1995)
applied MSPC techniques to fault detection in dissolved air flotation. Rosen and
Olsson (1998) adapted multivariate statistics based methods to the wastewater
treatment monitoring system using simulated and real process data. Teppola (1999)
used an approach that combined multivariate techniques, fuzzy clustering and
multiresolution analysis for wastewater data monitoring.
However, MSPC has a fundamental weakness as a method for monitoring the ASP. The
problem arises because the biological nutrient removal process changes gradually
over time, making the process non-stationary. Thus, the ASP hardly ever operates
normally for long periods, and the non-stationary nature of the process causes the
definition of normality to change over time. One shortcoming of MSPC is that it
cannot detect a change in the correlation among process variables when Hotelling’s
T2 and the sum of squared prediction errors (SPE) are inside the control limits. In
addition, MSPC is not suited to monitoring non-stationary processes because it
assumes stationary data. This is a problem when developing statistical control
charts, which should be built from a set of normal operating data. Several methods
have been suggested to model the non-stationary nature of the WWTP (Bakshi, 1998;
Rosen and Lennox, 2000). One method that has shown potential for treating
non-stationary processes is the use of adaptive algorithms for MSPC (Rosen and
Lennox, 2000). Another proposed method employs a multiscale model through the use
of wavelet transforms (Bakshi, 1998; Teppola, 1999; Rosen and Lennox, 2000).
Another important issue in process analysis is the ability to diagnose the source
of abnormal behavior. Chemometric methods such as principal component analysis
(PCA) have been utilized for merging detection with the diagnosis of the causes of
abnormal situations (Ku et al., 1995; Kano et al., 2000a,b). Ku et al. (1995)
proposed a diagnostic method in which the out-of-control observation was compared
to PCA models for known disturbances. Using refinements of statistical disturbances,
discriminant analysis was then used to select the most likely causes of the current
out-of-control condition. Kano et al. (2000a, b) proposed a new statistical
process-monitoring algorithm. This method is based on the idea that a change of
operating condition can be detected by monitoring the distribution of time-series
process data, because this distribution reflects the corresponding operating
condition. They did not, however, consider the individual contributions of each
transformed constituent in the normalization of the dissimilarity index.
In this research, we propose a modified dissimilarity measure and a disturbance
detection method for successive data sets. Using eigenvalue monitoring, the
proposed method can also detect the disturbance and isolate the scale of the
disturbance.
5.2 Theory
5.2.1 Modified Dissimilarity Measure
The dissimilarity measure that has traditionally been used is based on the
Karhunen-Loeve (KL) expansion, which is equivalent to PCA. This measure
compares the covariance structures of two data sets and represents the degree of
dissimilarity between them. In the computational procedure, the variance of a
transformed data vector is normalized by its corresponding eigenvalue. The
dissimilarity measure therefore considers not the absolute magnitude but the relative
magnitude of the variance change, and neglects the importance of each transformed
variable. We suggest a modified dissimilarity measure that considers the importance
of each transformed variable. Using this modified measure we propose the fault
detection and isolation (FDI) technique. This technique is divided into two major
steps: a training phase using an historical data set representing the process in normal
operation, and the on-line monitoring and isolation using the test data set.
The modified dissimilarity measure algorithm is as follows. First, the data
window size and step size are determined, where the window size is the number of
samples in each data set and the step size is the monitoring interval. Second, two
successive data sets are selected and normalized with the sample mean and sample
variance (Xi, i = 1, 2). Figure 5.1 represents the concept of window and step size
using a moving window. Third, the sample covariance matrix is found and singular
value decomposition (SVD) is applied to it. The algebraic representation of these
steps is
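$$\mathbf{S} = \frac{1}{N-1}\begin{bmatrix}\mathbf{X}_1\\ \mathbf{X}_2\end{bmatrix}^{T}\begin{bmatrix}\mathbf{X}_1\\ \mathbf{X}_2\end{bmatrix} = \frac{N_1-1}{N-1}\mathbf{S}_1 + \frac{N_2-1}{N-1}\mathbf{S}_2 \tag{5.1}$$
$$\mathbf{S}_i = \frac{1}{N_i-1}\mathbf{X}_i^{T}\mathbf{X}_i, \quad i = 1, 2 \tag{5.2}$$
$$\mathbf{S}\mathbf{P} = \mathbf{P}\boldsymbol{\Lambda} \tag{5.3}$$
where P is the loading matrix and Λ is the diagonal matrix. In this procedure, input
variables are transformed into orthogonal variables (transformation Xi to Yi).
$$\mathbf{Y}_i = \sqrt{\frac{N_i-1}{N-1}}\,\mathbf{T}_i \quad (\mathbf{T}_i = \mathbf{X}_i\mathbf{P}) \tag{5.4}$$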
Fourth, the sample covariance matrix (R) of two transformed data sets (Y1 and Y2) is
found and SVD is applied to R.
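$$\mathbf{R}_1 + \mathbf{R}_2 = \boldsymbol{\Lambda} \tag{5.5}$$
$$\mathbf{R}_i\mathbf{q}_j^{i} = \lambda_j^{i}\mathbf{q}_j^{i}, \quad i = 1, 2 \ \text{and} \ j = 1, \ldots, r \tag{5.6}$$
($\mathbf{q}_j^{i}$: loading vector, $\lambda_j^{i}$: eigenvalue, r: dimension of data in PCA). Combining
equations (5.5) and (5.6) and using some algebra, we obtain
$$\mathbf{R}_1\mathbf{q}_j^{1} = \lambda_j^{1}\mathbf{q}_j^{1} \quad \text{and} \quad \mathbf{R}_2\mathbf{q}_j^{1} = \left(\Lambda_{jj} - \lambda_j^{1}\right)\mathbf{q}_j^{1} \tag{5.7}$$
This result shows that the two sample covariance structures share eigenvectors,
whose eigenvalues satisfy
$$\lambda_j^{1} + \lambda_j^{2} = \Lambda_{jj} \tag{5.8}$$
where $\lambda_j^{i}$ is the jth eigenvalue in the ith data set and $\Lambda_{jj}$ is the jth eigenvalue in
the total data set. The more similar two data sets are, the closer their eigenvalues
are to $0.5\Lambda_{jj}$. In general, the first few principal components, r, explain most of
the variation of the data. Next, the modified dissimilarity index D is found,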
$$D = \frac{4\sum_{j=1}^{r}\left(\lambda_j - \Lambda_{jj}/2\right)^{2}}{\sum_{j=1}^{r}\Lambda_{jj}^{2}} \tag{5.9}$$
D has a value between 0 and 1. The more similar the two data sets are, the closer D
is to 0, and the more dissimilar they are, the closer D is to 1. Finally, the
(1-α)100% confidence interval of each eigenvalue is determined. For many samples,
it is reasonable to assume that each eigenvalue is a normal random variable by the
central limit theorem. For samples obtained from normal operation, the interval
containing 99% of the eigenvalues calculated above is obtained by
$$\bar{\lambda}_j^{i} - t(1-\alpha/2;\,N-1)\,s\{\lambda_j^{i}\} \le \lambda_j^{i} \le \bar{\lambda}_j^{i} + t(1-\alpha/2;\,N-1)\,s\{\lambda_j^{i}\} \tag{5.10}$$
where $\bar{\lambda}_j^{i}$ is the sample mean, $s\{\lambda_j^{i}\}$ is the sample standard deviation, and
(1-α)100% is 99%, that is, α = 0.01. Thus (1-α)100% of the $\lambda_j^{i}$ values lie inside
these limits and the remainder lie outside them (Johnson and Wichern, 1992).
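The computation of the modified dissimilarity index for two already scaled data sets can be sketched with NumPy as follows. This is an illustrative reading of equations (5.1) to (5.9), not the implementation used in this work: the function name is hypothetical, N is assumed to be N1 + N2, and the eigenvalues of R1 are paired with the diagonal entries of Λ by rank order, which is an assumption of this sketch.

```python
import numpy as np


def modified_dissimilarity(x1, x2, r=None):
    """Modified dissimilarity index D between two scaled data sets (rows = samples)."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, n2 = x1.shape[0], x2.shape[0]
    n = n1 + n2  # assumption: N is the total number of samples

    # Covariance of the stacked data sets, eq. (5.1).
    stacked = np.vstack([x1, x2])
    s = stacked.T @ stacked / (n - 1)

    # Eigen-decomposition S P = P Lambda, eq. (5.3), sorted in descending order.
    big_lambda, p = np.linalg.eigh(s)
    order = np.argsort(big_lambda)[::-1]
    big_lambda, p = big_lambda[order], p[:, order]

    # Transformed data, eq. (5.4), and its sample covariance R1.
    y1 = np.sqrt((n1 - 1) / (n - 1)) * (x1 @ p)
    r1 = y1.T @ y1 / (n1 - 1)

    # Eigenvalues of R1, eq. (5.6), paired with Lambda_jj by rank (assumption).
    lam = np.sort(np.linalg.eigvalsh(r1))[::-1]

    if r is None:
        r = lam.size
    numerator = 4.0 * np.sum((lam[:r] - big_lambda[:r] / 2.0) ** 2)
    denominator = np.sum(big_lambda[:r] ** 2)
    return float(numerator / denominator)


rng = np.random.default_rng(0)
window_a = rng.standard_normal((50, 3))
window_b = 3.0 * rng.standard_normal((50, 3))
d_same = modified_dissimilarity(window_a, window_a)
d_diff = modified_dissimilarity(window_a, window_b)
```

Two identical windows give λj = Λjj/2 for every j and hence D = 0, while a window compared with one that has no variation at all gives λj = Λjj and D = 1, which matches the stated range of the index.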
5.2.2 Fault Detection and Isolation (FDI)
For on-line monitoring, the normal operation data are used as the training data set,
and the confidence limits are calculated as in the previous step. The sample
representing the current operating condition is scaled by the sample mean and
sample variance obtained in the previous steps. The corresponding modified
dissimilarity index and eigenvalues are then calculated in the same way. The
modified dissimilarity index, which evaluates the difference between two data sets,
can quantitatively detect a change of operating condition and monitor the
distribution of time-series data. If the index is outside the control limit or
deviates from zero, the operating condition has changed and a disturbance is said
to have occurred.
In particular, we can focus on the individual variation of several eigenvalues.
Only a few eigenvalues are considered as monitoring indexes because most of the
variation is captured by the first few eigenvectors. The remaining variation that
is not captured by the principal eigenvectors is negligible, and it is not critical
to identify whether it is caused by changes in the process or by noise. If any of
the principal eigenvalues exceeds its corresponding confidence limit, a disturbance
is detected along that eigenvalue, indicating that an operating change has
occurred. Monitoring each eigenvalue allows us to distinguish a process change from
an instantaneous fault or sensor noise. Because each eigenvalue represents its own
characteristic scale of variation, this technique gives information about the
eigenvalue on which a disturbance occurs and makes it possible to analyze the
physical or biological reasons for the disturbance. The method therefore gives us
the capability to isolate and interpret the disturbance source.
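The eigenvalue check described above can be sketched with the Python standard library. One substitution is made and should be noted plainly: the Student t quantile of equation (5.10) is replaced by the normal quantile, which is a reasonable approximation only for many training samples. The function names and the example numbers are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def eigenvalue_limits(training_values, alpha=0.01):
    """Two-sided (1 - alpha) limits for one eigenvalue from normal-operation data.

    Assumption: the normal quantile stands in for t(1 - alpha/2; N - 1).
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    centre = mean(training_values)
    spread = stdev(training_values)  # sample standard deviation
    return centre - z * spread, centre + z * spread


def flag_excursions(values, limits):
    """Indices of monitored eigenvalues that fall outside the limits."""
    low, high = limits
    return [k for k, v in enumerate(values) if v < low or v > high]


# Hypothetical eigenvalue history from normal operation, then on-line values.
training = [1.0, 1.1, 0.9, 1.05, 0.95] * 20
limits = eigenvalue_limits(training)
alarms = flag_excursions([1.0, 1.5, 0.5, 1.1], limits)
```

Running one such check per retained eigenvalue indicates on which eigenvalue, and therefore on which scale, a disturbance appears.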
If adaptive scaling is to be used to tackle non-stationary or dynamic problems, the
sample mean and variance should be successively updated to detect changes in
continuous processes. A forgetting factor can also be introduced to reduce the
effect of previous measurements (Li et al., 2000). Another important consideration
in monitoring changes in the process or operating condition is the determination of
appropriate window and step sizes. These quantities should be selected carefully,
taking the process characteristics into consideration. We suggest that the window
size should be large in comparison to the time constant of the process, and the step
size should be small in comparison to the sampling time.
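The moving-window bookkeeping and an adaptive scaling update can be sketched as below. The window generator follows the window and step sizes defined above; the exponentially weighted update of the mean and variance is only one common way to apply a forgetting factor and is an assumption here, not necessarily the update of Li et al. (2000). All names are hypothetical.

```python
def moving_windows(n_samples, window=20, step=5):
    """Yield (start, stop) sample indices of successive moving windows."""
    start = 0
    while start + window <= n_samples:
        yield start, start + window
        start += step


def ew_update(mean, var, x, lam=0.99):
    """Exponentially weighted update of mean and variance (assumed form)."""
    new_mean = lam * mean + (1.0 - lam) * x
    new_var = lam * var + (1.0 - lam) * (x - new_mean) ** 2
    return new_mean, new_var


windows = list(moving_windows(30))  # [(0, 20), (5, 25), (10, 30)]

m, v = 0.0, 1.0
for _ in range(1000):
    m, v = ew_update(m, v, 5.0)     # the mean approaches the new level of 5.0
```

A forgetting factor closer to 1 gives slower adaptation, so a slow drift is absorbed into the scaling while abrupt changes still stand out.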
5.3 Simulation Studies
5.3.1 Simulation of Benchmark Plant
For monitoring purposes, the proposed method was applied to the detection of
various disturbances in simulated data obtained from a benchmark simulation. Four
types of disturbance were tested using the FDI method: external disturbance,
internal disturbance, setpoint change, and sensor fault (see Table 5.1). External
disturbances are defined as measurable disturbances that originate outside the
process and are detectable from the sensor signals. Examples of such disturbances
are changes in the influent flow rate or nitrogen concentration. Internal
disturbances are caused by changes within the process that affect the process
behavior. These disturbances include factors such as decreased nitrification,
non-measurable inhibition by the influent, or gradual reduction of the settling
velocity in the secondary clarifier (denoted as the bulking phenomenon). The two
other simulated disturbances were a set point change, which carries low frequency
information, and a sensor failure event in the high frequency band.
Three events in the influent data developed by the benchmark are associated with
the influent flow rate (dry, storm and rain weather). The training model was based
on a normal operation period of one week of dry weather. The data used were the
influent file and the outputs with the noise suggested by the benchmark. The
variables used to build the X-block in the disturbance detection were the influent
ammonia concentration (SNH,in), influent flow rate (Qin), total suspended solids in
aerator 4 (TSS4), DO concentration in aerators 3 and 4 (SO,3, SO,4), and oxygen
transfer coefficient in aerator 5 (KLa5). The conditions used for on-line
monitoring were a window size of 20 samples (5 hours) and a step size of 5 samples
(1.25 hours). The mean and variance were the values calculated from the normal data.
External process disturbances
We tested a storm event that suddenly occurs twice after a long period of dry
weather. This example shows how external disturbances appear within the proposed
method. The pattern of the measurement variables during the storm weeks, which is
the same as the storm condition in the benchmark, is presented in Figure 5.2.
Figure 5.3 shows the monitoring results obtained using the FDI technique during the
storm weeks. The dissimilarity index increases sharply at around samples 850 and
1050, which correspond to the first and the second storm. The two storm events are
largely detected as changes in the first and second eigenvalues, as shown in
Figure 5.3(b-d). The magnitude of each eigenvalue represents the proportion of the
variation captured by its corresponding eigenvector.
Internal process disturbances
The first internal disturbance was imposed by decreasing the nitrification rate in
the biological reactor through a decrease in the specific growth rate of the
autotrophs (µA). Starting at sample 300, the autotrophic growth rate was decreased
linearly from 0.5 to 0.3 day-1. As shown on the left of Figure 5.4(a), the decrease
in nitrification is detected for the first time at around sample 330, which is 30
samples after the event occurred. This event is quickly detected by the second
eigenvalue, while the first eigenvalue increases continuously after this event.
After the deterioration of nitrification ends, the dissimilarity index shows peaks
at around samples 500 and 600. These sudden increases in the dissimilarity index
are caused by the increase of the first eigenvalue. The gradual increase of the
first eigenvalue indicates that the process has undergone an internal disturbance
such as a decrease in the nitrification or denitrification rate.
The second internal disturbance imposed on the system was a linear decrease in the
settling velocity in the secondary settler between samples 300 and 500. For early
detection, it was necessary to add a measurement of the effluent total suspended
solids (TSSe) to the general X-block. The right side of Figure 5.4(a) shows an
increase in the dissimilarity index after sample 330. As in the case of the
decrease in nitrification, the dissimilarity index is constant for 30 samples after
the onset of the decrease in settling velocity. The increase in the dissimilarity
index at around sample 330 is caused by increases in the second and the third
eigenvalues. The first eigenvalue jumps at about sample 410, contributing greatly
to the increase in the dissimilarity index observed shortly afterwards. The jump in
the first eigenvalue indicates that the process has undergone an internal
disturbance such as bulking or biomass decay.
Sensor faults and setpoint change
To assess the usefulness of the FDI method for detecting sensor faults with high
frequency information, we corrupted the nitrate sensor in the second anoxic tank.
In the sensor fault case, it was also necessary to add the nitrate concentration
(SNO,2) to the general X-block. The fault was introduced during samples 200-400.
Prior to that period, the sensor was operating properly apart from the imposed
sensor noise. Monitoring results are presented on the left side of Figure 5.5. The
sensor fault is detected by a change in the second eigenvalue, indicating that the
sensor fault changes the variation along the second contributing axis. The
disturbance caused by the setpoint change, which carries low frequency information,
is shown on the right side of Figure 5.5; here the DO controller setpoint in the
5th biological reactor was suddenly changed from 2 to 1 mg/l at sample 300. In
contrast to the sensor fault, the setpoint change causes a variation along the
third contribution axis.
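The imposed disturbances can be written as simple functions of the sample index, using the values listed in Table 5.1. The sketch only generates the disturbance signals; it does not include the benchmark plant model, and the function names are hypothetical.

```python
def linear_ramp(k, start, end, v0, v1):
    """Value that changes linearly from v0 to v1 between samples start and end."""
    if k <= start:
        return v0
    if k >= end:
        return v1
    return v0 + (v1 - v0) * (k - start) / (end - start)


def mu_a(k):
    """Autotrophic specific growth rate [1/day]: 0.5 to 0.3 over samples 300-500."""
    return linear_ramp(k, 300, 500, 0.5, 0.3)


def settling_velocity(k):
    """Settling velocity [m/day]: 250 to 150 over samples 300-500."""
    return linear_ramp(k, 300, 500, 250.0, 150.0)


def nitrate_noise_mean(k):
    """Mean of the nitrate sensor noise [mg N/l]: 1 during samples 200-400."""
    return 1.0 if 200 <= k <= 400 else 0.0


def do_setpoint(k):
    """DO controller set point [mg/l]: stepped from 2 to 1 at sample 300."""
    return 2.0 if k < 300 else 1.0


mu_mid = mu_a(400)  # halfway through the ramp, 0.4 1/day
```

Writing the scenarios this way makes the onset samples explicit, so the detection delays quoted in the text (for example, detection at about sample 330 for an onset at sample 300) can be read off directly.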
5.3.2 Full-scale WWTP
We now consider a second example, using real data from the coke WWTP of an
iron and steel making plant in Korea. The treatment system is a general activated
sludge process that has five aeration basins and a secondary clarifier. We selected 16
general variables to describe the process state of the WWTP; these variables are
described in Table 5.2. The data set consisted of daily mean values from 1 January,
1998 to 9 November, 2000, comprising a total of 1034 observations. The first 720
observations were used to train the model on mean-centered and auto-scaled data.
The remaining 314 observations were used as a test data set for the proposed
method.
In addition to the proposed algorithm, the PCA method was used to monitor the WWTP
characteristics, and the PCA results were compared with those of the proposed FDI
algorithm. We managed to capture only just over 55% of the variance by projecting
the variables onto four latent variables. Figure 5.6 shows the Hotelling’s T2 and
squared prediction error (SPE) charts. The two horizontal lines correspond to the
95% confidence limits of the original training data. The data deviated slightly in
samples 120 to 125. From Figure 5.6, we can see certain deviations in some of the
variables within these intervals. To make the cause of the deviation more obvious,
the contributions from every measurement variable were calculated. However, it is
difficult to detect this disturbance from the plots. Moreover, it is not possible
to diagnose and isolate the disturbance frequencies.
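For comparison, the conventional PCA monitoring statistics can be sketched with NumPy as follows. This is a generic textbook formulation of Hotelling's T2 and SPE on auto-scaled data, given as an illustration and not as the exact computation behind Figure 5.6; the function names and the random example data are hypothetical.

```python
import numpy as np


def fit_pca(x_train, n_components):
    """Auto-scale the training data and keep the leading principal components."""
    x_train = np.asarray(x_train, dtype=float)
    mean = x_train.mean(axis=0)
    std = x_train.std(axis=0, ddof=1)
    z = (x_train - mean) / std
    cov = z.T @ z / (z.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    explained = eigvals[order].sum() / eigvals.sum()  # captured variance fraction
    return mean, std, eigvecs[:, order], eigvals[order], explained


def t2_and_spe(x, model):
    """Hotelling's T2 and squared prediction error for each row of x."""
    mean, std, loadings, eigvals, _ = model
    z = (np.atleast_2d(np.asarray(x, dtype=float)) - mean) / std
    scores = z @ loadings
    t2 = np.sum(scores ** 2 / eigvals, axis=1)  # variation inside the model
    residual = z - scores @ loadings.T
    spe = np.sum(residual ** 2, axis=1)         # variation outside the model
    return t2, spe


rng = np.random.default_rng(1)
data = rng.standard_normal((200, 5))            # hypothetical training data
model = fit_pca(data, 3)
t2, spe = t2_and_spe(data, model)
```

Both statistics summarize each sample against a fixed model, which is why they give little indication of the scale on which a disturbance acts.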
The monitoring performance of the proposed FDI method in the WWTP is given in
Figure 5.7. To monitor changes in the process and operating condition, window and
step sizes of 20 and 5 samples were used, respectively. The dissimilarity index of
the test data set is shown in Figure 5.7(a). The dissimilarity index has high
values around samples 110-120 (19 April, 2000 to 4 May, 2000), leading us to
conclude that a large process change happened at this time. Five eigenvalues, which
correspond to a range of disturbance scales, are depicted in Figure 5.7(b-f). The
remaining eigenvalues give little information because they provide only high
frequency information such as measurement noise. It is evident from Figure 5.7 that
the third and fourth eigenvalues, which are representative of middle scale
disturbances, contribute greatly to the increase in the dissimilarity index. At this
time, the WWTP received a high load of cyanide and chemical oxygen demand (COD).
This load reduced the activity of the micro-organisms and diminished the settling
performance, causing an increase in the sludge volume index in the secondary
settler. We found that the increase in influent load started out as an external
disturbance but subsequently transformed into an internal disturbance that changed
the process operation region of the WWTP. These process changes were detected by
the dissimilarity index, and the disturbance sources were isolated by the proposed
FDI method.
We can draw the following conclusions from the results of several simulations.
First, the modified dissimilarity index unifies all the scales into one monitoring
value and provides a compact index. Second, monitoring each eigenvalue can discern
the dominant dynamics and detect the scale on which a disturbance occurs. In this
analysis it is presumed that the eigenvalue with a large magnitude represents the
effect of low frequency information such as a large change in the process or the
occurrence of a large and long disturbance. The eigenvalue of intermediate
magnitude represents a small change in operating condition such as a short external
disturbance, while the eigenvalue with a small magnitude represents high frequency
information such as sensor faults and measurement noise. The fault isolation
approach therefore provides information on the scale at which a disturbance occurs,
and can be used to analyze and interpret the physical cause and effect of
disturbances. Third, because the proposed method is based on evaluating the
difference between successive time series data sets with a moving window, as is
done in adaptive PCA, it can tackle the non-stationary problem of WWTP.
5.4 Conclusions
The strategy proposed in the present work is able to detect and isolate the effect
of various disturbances occurring in the activated sludge process. This strategy uses
a modified dissimilarity index and monitoring of individual eigenvalues. One merit
of this technique is that it can simultaneously detect the disturbance and isolate its
source, in contrast to conventional MSPC. The strength of the isolation technique is
that it gives information about the scale on which a disturbance occurs, assisting in
the interpretation of the disturbance. Experimental results show that it is an
appropriate monitoring technique for the activated sludge process, which is
characterized by a variety of fault and disturbance sources and non-stationary
characteristics. This fault detection and isolation method provides us with a new
analysis tool for acquiring a deeper understanding of process monitoring
methodology through the detection and isolation of disturbances at different scales.
Figure 5.1 Moving windows between two successive data sets.
[Schematic; x-axis: time; y-axis: value; annotations: window size, step size]
Figure 5.2 Measured variables during the storm weeks
[Six stacked plots against sample number (0 to 1400): SNH,in, Qin, TSS4, SO,3, SO,4
and KLa5]
Figure 5.3 Monitoring performances of the external disturbance (a) modified
dissimilarity index (b-d) individual eigenvalue plot.
[Plots against sample number (0 to 1200): (a) D, (b) 1st eigenvalue, (c) 2nd
eigenvalue, (d) 3rd eigenvalue]
Figure 5.4 Monitoring performances under internal disturbances caused by
decreasing nitrification (left plot) and settler bulking (right plot) (a) dissimilarity
index, (b-d) individual eigenvalue plots
[Plots against sample number (0 to 600): (a) D, (b) 1st eigenvalue, (c) 2nd
eigenvalue, (d) 3rd eigenvalue]
Figure 5.5 Monitoring performances of sensor faults (left plot) and setpoint change
(right plot) (a) dissimilarity index, (b-d) individual eigenvalue plots
[Plots against sample number (0 to 600): (a) D, (b) 1st eigenvalue, (c) 2nd
eigenvalue, (d) 3rd eigenvalue]
Figure 5.6 PCA monitoring performances (a) Hotelling’s T2 chart, (b) SPE plot
[Plots against sample number (0 to 300), each with its 95% upper limit (UL)]
Figure 5.7 FDI monitoring performances (a) dissimilarity index, (b-f) the 1st, 2nd,
3rd, 4th and 5th eigenvalues
[Plots against time (days, 0 to 300): (a) D, (b) EV1, (c) EV2, (d) EV3, (e) EV4,
(f) EV5]
Table 5.1 Fault (disturbance) sources in the simulation benchmark
Disturbance type | Disturbance | Simulation conditions
External | Storm events | Abrupt change of influent flow rate at around samples 850 and 1050
Internal | Decreasing nitrification | Specific growth rate for autotrophs: from 0.5 to 0.3 day-1 in a linear fashion during samples 300-500
Internal | Decreasing settling velocity | Settling velocity in the secondary settler: from 250 to 150 m day-1 in a linear fashion during samples 300-500
Sensor fault | Nitrate sensor failure | Nitrate sensor noise in the second anoxic tank: noise mean changed from 0 to 1 mg N/l during samples 200-400
Operation change | Set point change of DO controller | DO controller set point: from 2 to 1 mg/l at sample 300
Table 5.2 Process variables in full-scale WWTP
Variable Unit Mean STD Variable Unit Mean STD
Q2 m3/h 179.4 15.98 MLSSAT mg/L 1605 409.3
Q3 m3/h 85.53 8.73 MLSSR mg/L 7194 3444
CN2 mg/L 2.455 0.38 DOAT mg/L 2.064 0.99
CN3 mg/L 14.35 2.49 Tinfluent °C 37.6 2.51
CNPS mg/L 1.49 0.25 TAT °C 30.74 2.34
COD2 mg/L 156.4 20.28 SVIAT ml/g 11.42 3.15
COD3 mg/L 2083 295.5 SVIR ml/g 63.31 21.73
CODPS mg/L 143.28 17.66 PHAT - 7.24 0.22
VI. Modeling and Multiresolution Analysis in WWTP
6.1 Introduction
To date, monitoring in WWTP has mostly focused on a few key effluent quantities
upon which regulations are enforced. However, as environmental restrictions become
more stringent, greater effort toward higher effluent quality is required in the
monitoring of WWTP performance. In particular, monitoring of the biological
treatment process is very important because recovery from failures is
time-consuming and expensive, and some changes are not very obvious and may grow
gradually until they produce a serious operational problem. Therefore, early fault
detection and isolation in the biological process are valuable for taking
corrective action well before a dangerous situation arises. At the same time, the
discrimination between serious and minor abnormalities is of primary concern. To
accomplish this task, a reliable detection procedure is needed. However, few
monitoring techniques are available that utilize large on-line data sets, despite
the increasing popularity and decreasing price of on-line measurement systems in
the field of WWTP.
Multivariate statistical process monitoring (MSPM) or multivariate statistical
process control (MSPC) is a possible solution for multivariate, collinear, auto- or
cross-correlated processes. It comprises chemometric methods such as principal
component analysis (PCA) or partial least squares (PLS) combined with standard
types of control charts. In order to extract useful information from process data
and utilize it for the monitoring of WWTP, several applications of MSPM or MSPC
have been developed (Krofta et al., 1995; Rosen, 1998; Van Dongen and Geuens,
1998; Teppola, 1999).
However, the biological treatment process has several peculiar features that
distinguish it from chemical or other industrial processes. Above all, it is
‘nonstationary’, which means that the process itself changes gradually over time.
For example, systematic seasonal variations show a dynamic pattern in which the
normal process condition evolves with the seasons. In addition, many underlying
phenomena of WWTP take place simultaneously, and it may be difficult to separate a
specific phenomenon from the others. That is, the process has ‘multiscale’
characteristics: multiple simultaneous phenomena affect the data at different time
or frequency scales. If these simultaneous characteristics interfere with or mask
other time or frequency variations (the so-called disturbing effects), the
situation becomes troublesome because the multiscale variations widen the
confidence limits. This is unfavorable because actual events can remain undetected
by the monitoring algorithm while they are under way in the plant.
To solve these problems, several methods have been suggested recently using
adaptive PCA, multiscale analysis with dynamic PCA, and multiresolution analysis
with wavelets (Bakshi, 1998; Kano et al., 2000a,b; Rosen and Lennox, 2000; Teppola
and Minkkinen, 2000, 2001; Ying and Joseph, 2000; Choi et al., 2001; Yoo et al.,
2001). Bakshi (1998) built a multiscale model using wavelet transforms.
Kano et al. (2000a,b) proposed a dissimilarity index based on the distribution of
time-series process data. Rosen and Lennox (2000) developed and applied adaptive PCA
and wavelet-based multiresolution analysis. Ying and Joseph (2000) evaluated the
feasibility of sensor fault detection using multi-frequency signal analysis of noise.
Teppola and Minkkinen (2000, 2001) suggested several multiresolution analyses
using a wavelet-PLS regression model for interpreting and scrutinizing a multivariate
model. Choi et al. (2001) suggested a generic monitoring algorithm utilizing a
modified dissimilarity index in the benchmark simulation, and Yoo et al. (2001)
confirmed these results using a PCA-type monitoring algorithm in a real WWTP.
In short, this research combines two methods: one for prediction and the other
for multiresolution monitoring. In this way, it is possible to take into
account the multivariate, nonstationary, and multiscale nature of WWTP. The
approach puts a PLS model and multiresolution analysis together. First, the PLS
model is used for prediction and data analysis. Second, a multiresolution analysis
is proposed that applies a generic dissimilarity measure and singular value
decomposition to the PLS score matrix. Statistical confidence limits for detection
and isolation are suggested, and the ability of the method is verified using real
plant data.
6.2 Theory
6.2.1 Partial Least Squares (PLS)
Very often in industrial applications, the data are severely corrupted by noise
and by collinearities among a large number of variables. To treat these problems, it is
convenient to apply latent variable models, particularly PLS modeling. PLS
maximizes the covariance between process variables and responses. In PLS, the
matrix X (process variables) is decomposed and modeled in such a way that the
information in Y (responses) can be predicted as well as possible. In addition, PLS
uses only the variation in the X matrix that is significant for predicting the
variation in the Y matrix. Moreover, unlike multiple linear regression (MLR), PLS
does not assume that the X variables are free of noise. Noise and insignificant
variations are not used in modeling.
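In PLS, the standardized sample matrices $Z_X$ of X and $Z_Y$ of Y are decomposed
as follows.

$$Z_X = TP^T = \sum_{i=1}^{n_x} t_i p_i^T = \sum_{i=1}^{m} t_i p_i^T + \sum_{i=m+1}^{n_x} t_i p_i^T = \sum_{i=1}^{m} t_i p_i^T + E_X = \hat{Z}_X + E_X \tag{6.1}$$

$$Z_Y = UQ^T = \sum_{i=1}^{n_y} u_i q_i^T = \sum_{i=1}^{m} u_i q_i^T + \sum_{i=m+1}^{n_y} u_i q_i^T = \sum_{i=1}^{m} u_i q_i^T + E_Y = \hat{Z}_Y + E_Y \tag{6.2}$$
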
In the above representations, $t_i$ and $u_i$ are score vectors, $p_i$ and $q_i$ are loading
vectors, $\hat{Z}_X$ and $\hat{Z}_Y$ are unbiased estimates of $Z_X$ and $Z_Y$, respectively, and
$E_X$ and $E_Y$ are residual matrices. PLS is composed of an outer and an inner relation. In
the outer model, the score vectors $t_i$ and $u_i$ are obtained by projecting
$Z_X$ and $Z_Y$ onto the loading vectors $p_i$ and $q_i$, respectively. In the inner
model, $u_i$ is linearly regressed on $t_i$, yielding $\hat{u}_i = b_i t_i$, where $b_i$ is a regression
coefficient. Then $Z_Y$ can be expressed as

$$Z_Y = TBQ^T + E_Y = \sum_{i=1}^{m} b_i t_i q_i^T + E_Y \tag{6.3}$$

where B is a diagonal matrix of the regression coefficients $b_i$. In this respect, PLS can
be considered a useful tool that divides a multivariate linear regression into simple
linear regressions. The first several latent variables (LVs) extracted from the
matrices X and Y contain most of the variance of X and Y, respectively.
The last LVs, on the other hand, consist mostly of noise and variations that are not
related to X and Y. Importantly, the LVs are orthogonal to each other. These features
together make it possible to compress information in the presence of collinearity and
redundancy. Although PLS is a regression technique, its visualization ability is
equally important because it enables us to examine data sets more
closely (Geladi and Kowalski, 1986; Höskuldsson, 1996; Rosen, 1998).
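
The outer and inner relations above can be illustrated with a short NIPALS-style routine. This is a minimal sketch and not the implementation used in this work: the function name `nipals_pls`, its arguments, and the convergence settings are assumptions introduced for illustration, and the inputs are assumed to be already mean-centered and scaled. Each pass extracts one score vector t, regresses u on t to obtain b, and deflates X and Y, so that the fitted response has the form $TBQ^T$.

```python
import numpy as np

def nipals_pls(X, Y, n_components, tol=1e-10, max_iter=500):
    """NIPALS PLS sketch for centered X (n x nx) and Y (n x ny).

    Returns scores T, X loadings P, unit-length Y loadings Q and the
    inner regression coefficients b, so that Y_hat = T @ diag(b) @ Q.T.
    """
    X = np.array(X, dtype=float)   # copies; both are deflated below
    Y = np.array(Y, dtype=float)
    n, nx = X.shape
    ny = Y.shape[1]
    T = np.zeros((n, n_components))
    P = np.zeros((nx, n_components))
    Q = np.zeros((ny, n_components))
    b = np.zeros(n_components)
    for a in range(n_components):
        u = Y[:, np.argmax(Y.var(axis=0))].copy()   # initial Y score
        for _ in range(max_iter):
            w = X.T @ u
            w /= np.linalg.norm(w)       # X weight vector
            t = X @ w                    # X score (outer relation)
            q = Y.T @ t
            q /= np.linalg.norm(q)       # Y loading vector
            u_new = Y @ q                # Y score (outer relation)
            converged = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
            u = u_new
            if converged:
                break
        p = X.T @ t / (t @ t)            # X loading vector
        b[a] = (u @ t) / (t @ t)         # inner relation: u_hat = b * t
        X -= np.outer(t, p)              # deflate X
        Y -= b[a] * np.outer(t, q)       # deflate Y
        T[:, a], P[:, a], Q[:, a] = t, p, q
    return T, P, Q, b
```

Because X is deflated with its own loading vector after each component, the columns of T returned by this sketch are mutually orthogonal, which is the property that lets the multivariate regression be split into simple regressions.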
PLS projects the X and Y variables simultaneously onto the same subspace, T, in
such a manner that there is a good relation between the position of an observation
on the X-plane and its corresponding position on the Y-plane. Once a PLS model has
been derived, it is important to grasp its meaning. For this, the scores t and u are
considered. They contain information about the observations and their
similarities and dissimilarities in X and Y space with respect to the given problem and
model. The X and Y weights show how the variables combine to form t and u,
which in turn express the quantitative relation between X and Y. Hence, these
weights are essential for understanding which X variables are important for
modeling Y and which X variables provide common information, and also for the
interpretation of the scores t.
In order to detect process faults and disturbances, PCA-type monitoring relies on
statistical analysis of the score values and the residuals. The scores are
monitored using Hotelling’s T² statistic or by viewing the
corresponding score plots directly. The residuals are monitored by the Q statistic, that
is, the sum of squared prediction errors of the X variables (SPE_X). The T² statistic is a
measure of the distance from the multivariate mean to the projection of the operating
point on the principal component (PC) plane. The Q statistic is the squared Euclidean
distance of the operating point from the plane formed by the retained PCs. T² monitors
systematic variations in the latent variable space, while SPE_X represents variations not
explained by the retained PCs (Kourti and MacGregor, 1995; Wise et al., 1990; Wise and
Gallagher, 1996; Teppola, 1999). However, the conventional MSPM statistics, T² and Q,
do not always function well, because they cannot detect changes in the correlation
among process variables as long as T² and Q remain inside the confidence limits.
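
As an illustration of the two statistics, the sketch below fits a PCA model by singular value decomposition and evaluates T² and SPE_X for each observation. The function names are hypothetical, and the use of an empirical 95th percentile of the training values as a control limit is an assumption made for this example; limits derived from the F and chi-squared distributions are common alternatives.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA by SVD on scaled data X (n x p)."""
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    P = Vt[:n_components].T                           # loadings (p x a)
    score_var = s[:n_components] ** 2 / (X.shape[0] - 1)
    return P, score_var

def t2_and_spe(X, P, score_var):
    """Hotelling's T2 and squared prediction error for each row of X."""
    T = X @ P                                         # scores
    t2 = np.sum(T ** 2 / score_var, axis=1)           # distance within the PC plane
    residual = X - T @ P.T                            # part not explained by the PCs
    spe = np.sum(residual ** 2, axis=1)               # squared distance to the plane
    return t2, spe

def empirical_limit(train_values, level=95.0):
    """Assumed control limit: a percentile of the training statistic."""
    return np.percentile(train_values, level)
```

A new observation would be flagged when its T² or SPE_X exceeds the corresponding limit computed from the training data.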
6.2.2 Generic Dissimilarity Measure (GDM)
Recently, several dissimilarity indices based on the distributions of two
datasets have emerged (Kano et al., 2000; Choi et al., 2001; Yoo et al., 2001). They
rest on the idea that a change of process operation can be detected by
comparing the distributions of successive datasets, because the data distribution
reflects the corresponding process operating condition. In a previous section (5.2.1),
we introduced and developed a generic dissimilarity measure (GDM) algorithm. It
compares the covariance structures of two datasets and represents the degree of
dissimilarity between them by considering the importance of each transformed
variable (see Section 5.2.1).
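
To make the idea of comparing covariance structures concrete, the sketch below implements the basic dissimilarity index in the form commonly attributed to Kano et al., on which GDM builds. It is not the GDM of Section 5.2.1, which additionally accounts for the importance of each transformed variable and is not reproduced here. The function name is hypothetical, and the sketch assumes two already scaled datasets of equal window length whose pooled covariance matrix is nonsingular.

```python
import numpy as np

def dissimilarity_index(X1, X2):
    """Dissimilarity of two scaled datasets (rows are samples).

    Returns a value between 0 (same covariance structure) and 1.
    """
    n1, n2 = X1.shape[0], X2.shape[0]
    R1 = X1.T @ X1 / n1
    R2 = X2.T @ X2 / n2
    R = (n1 * R1 + n2 * R2) / (n1 + n2)       # covariance of the pooled data
    lam, P0 = np.linalg.eigh(R)
    M = P0 / np.sqrt(lam)                      # P0 @ Lambda^(-1/2)
    # Covariance of the first dataset after the whitening transformation;
    # the second one is (identity - S1), so its eigenvalues are 1 - eig.
    S1 = (n1 / (n1 + n2)) * M.T @ R1 @ M
    eig = np.linalg.eigvalsh(S1)
    return 4.0 * np.mean((eig - 0.5) ** 2)
```

When the two datasets share the same covariance structure, all eigenvalues are close to 0.5 and the index is close to zero; the more the structures differ, the closer the eigenvalues move to 0 or 1 and the index to one.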
6.2.3 Multiresolution Analysis (MRA)
PLS monitoring differs from PCA-type monitoring. In PLS, the principal component
decomposition of the X block is rotated (by introducing the loading weights) in
order to maximize the covariance between the X and Y blocks.
Therefore, the multivariate control charts described above are only approximations.
A comparison of the X block loadings and loading weights is one way to check at
least partial validity. In this case, there were no significant differences between
the loadings and the loading weights (Teppola, 1999). A new monitoring method is
therefore required for PLS monitoring, one that can effectively treat the peculiar
characteristics of the biological treatment process and isolate and diagnose fault
sources with a multiscale approach.
Figure 6.1 shows the scheme of multiresolution analysis (MRA) for PLS
monitoring. First, a PLS model is constructed from normal historical data
in order to handle the multivariate and collinear nature of a biological WWTP. It
represents the process behavior and the common-cause variations of WWTP
and excludes noise, measurement errors, and variations that are uncorrelated with
the Y variables. Then, MRA of the score values is carried out using GDM and the
contributions of the principal eigenvalues, in order to detect process changes and
to diagnose or isolate different kinds of faults and disturbances. The motivation of
this work is to identify the type of event that has occurred. Different events are
expected to produce different process measurement values, which appear as changes
in the data distribution and show up in different areas of the PC space.
Here, each successive dataset in GDM consists of PLS score values within a
moving window, because the PLS score values are more nearly normally distributed
than the original variables themselves. This is a consequence of the central limit
theorem, which can be stated as follows: if the sample size is large, the theoretical
sampling distribution of the mean can be approximated closely by a normal
distribution. Thus, we would expect the scores, which are a weighted sum like a
mean, to be distributed approximately normally (Neter et al., 1996; Wise and
Gallagher, 1996). Figure 6.2 compares the normality of the original values and the
PLS score values. Therefore, an abnormality is expected to manifest itself more
clearly as a shift or a change in the time-series distribution of the score values
than of the original variables. Just as an abnormality appears as a shift in the
score plane in the T² statistic of PCA and PLS monitoring, here it appears as a
dissimilarity value between two successive datasets, that is, GDM. In effect, the
moving-window treatment of the PLS score values in GDM is a remedy for the
nonstationarity problem of the PLS monitoring algorithm.
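
The central limit argument above can be checked numerically. The sketch below uses synthetic, uniformly distributed variables (an assumption; the plant data are not reproduced) and compares the excess kurtosis, which is zero for a normal distribution, of one raw variable with that of a weighted sum of eleven such variables. All names are hypothetical.

```python
import numpy as np

def excess_kurtosis(x):
    """Sample excess kurtosis; close to 0 for normally distributed data."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 4) - 3.0)

def compare_normality(n_samples=5000, n_vars=11, seed=0):
    """Return the excess kurtosis of a raw variable and of a weighted sum."""
    rng = np.random.default_rng(seed)
    raw = rng.uniform(-1.0, 1.0, size=(n_samples, n_vars))   # non-normal inputs
    weights = rng.normal(size=n_vars)                          # stand-in for PLS weights
    score = raw @ weights                                      # score-like weighted sum
    return excess_kurtosis(raw[:, 0]), excess_kurtosis(score)
```

The weighted sum is expected to have an excess kurtosis much closer to zero than the raw variable, in line with the comparison in Figure 6.2.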
On the other hand, if the relationships between the process variables change rapidly
and the correlation structure breaks down, the SPE_X of the PLS residuals
should be included in the two datasets of the proposed MRA algorithm. In this
case, moving-window matrices combining the score values and the residual error values
of the PLS model are processed by the GDM and MRA method. Since the inner
relationship of the WWTP changes slowly, however, the score matrices alone are
sufficient for monitoring.
Additionally, a confidence limit on each individual eigenvalue is proposed for
multiscale fault detection and isolation. If an eigenvalue exceeds its
corresponding confidence limit, the process is changing at that scale and a
certain event is occurring. By monitoring each scale, we can diagnose diverse
process variations and events, namely slow variations (seasonal
fluctuations or other long-term dynamics), middle-scale variations (internal
disturbances, process operation changes), and instantaneous variations (input
disturbances, faults, or sensor noise). Because each eigenvalue represents the
characteristics at its own scale, the multiscale technique reveals the
scale at which process changes, faults, and events occur and supports the analysis
of their physical and biological reasons. The proposed MRA thus provides
diagnosis and interpretation of events and fault sources. Note that it
handles the nonstationarity problem systematically by comparing successive datasets
within a moving window. Moreover, it does not suffer from the zero-padding
problem of other MRA methods, such as wavelets.
6.3 Result and Discussion
PLS modeling
The process data were collected from a biological WWTP that treats the coke
wastewater of an iron and steel making plant in Korea (Figure 2.3). Eleven
process and manipulated variables, the X block, were used to model three process
output variables, the Y block. The Y block consists of the solid volume index (SVI), the
reduction of cyanide (∆CN), and the reduction of COD (∆COD). Table 6.1 describes
the process variables and presents the mean and standard deviation (SD) values of
the X and Y blocks. The process data consist of daily mean values from 1 January 1998
to 9 November 2000, with a total of 1034 observations. The first 720
observations, mean-centered and auto-scaled, were used to train the PLS model, and
the remaining 314 observations were used as a test data set
to verify the proposed method. The number of latent variables was determined by
cross-validation, and four LVs were selected. The model captured about 54% of the
X block variance and 61% of the Y block variance by projecting the variables from
dimension 14 to dimension 4; these moderate fractions reflect the troublesome and
difficult treatment of coke wastewater.
The results of the PLS model are presented in Table 6.2.
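
The data preparation described above can be sketched as follows. The function name is hypothetical and the plant data are not reproduced here; the only point illustrated is that the test observations are scaled with the mean and standard deviation of the training observations, so that no information from the test period enters the model.

```python
import numpy as np

def autoscale_split(data, n_train):
    """Split time-ordered data and auto-scale both parts with training statistics.

    data: (n_obs x n_vars) array; the first n_train rows form the training set.
    """
    train, test = data[:n_train], data[n_train:]
    mean = train.mean(axis=0)
    std = train.std(axis=0, ddof=1)
    return (train - mean) / std, (test - mean) / std, mean, std
```

With the sizes quoted in the text, `autoscale_split(data, 720)` applied to a 1034-row array would return 720 training rows and 314 test rows.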
An appealing feature of the PLS method is its modeling ability, that is, its predictive
capability. Figure 6.3 shows the real and predicted values from the PLS model and
displays the residual of the Y block. The reduction of COD and
the reduction of CN are predicted very well in the test period, which demonstrates the
predictive power of the PLS model for these response variables. However, the
prediction of the SVI of the secondary settler is not satisfactory, unlike that of the
process quality variables. This may result from measurement inaccuracy and the
operator’s carelessness, since the SVI measurement requires careful technique from
the operator. The residual of the Y block is the sum of the squared differences
between the real and predicted values of the three response variables (SPE_Y) and is
mostly caused by the residual error of the SVI prediction.
Interpretation of PLS model
For the interpretation of the WWTP, we consider the PLS loading weights to see
how the X and Y variables are interrelated. Figure 6.4 shows which X and Y
variables load strongly in the first two latent variable dimensions. COD3,
COD2, and T_aerator are closely correlated with COD reduction, as seen in the left
middle of Figure 6.4. The first Y variable, the COD reduction of the WWTP, is influenced by
the COD load from BET2 and BET3 and by the temperature in the aerators, which is
consistent with the effect of temperature on heterotrophic biomass activity in the
biological treatment of carbonaceous nutrients. The second group is formed by
CN2, CN3, T_influent, Q2, Q3, and the DO of the aerator, together with CN reduction,
in the lower part of Figure 6.4. It shows that the reduction of cyanide is affected by
the cyanide loads, influent flow rate, influent temperature, and dissolved oxygen
level. In particular, the microorganisms associated with cyanide are negatively
related to the heterotrophic organisms, because cyanide compounds are toxic and
inhibitory to the growth of heterotrophs; the two groups therefore point in opposite
directions in the loading plot. Consequently, shock loading of cyanides in the
influent causes a deterioration of the biological WWTP. These facts are well
supported by experimental results in a technical paper (Lee, 2000). The third group
is made up of MLSS_R and MLSS_%E, together with the SVI of the secondary settler, in
the upper right region, which shows that the settleability of the biomass is related to
the amount of microorganisms (MLSS_R) and their activity (MLSS_%E) in the aerator
and settler. These groupings agree well with the biological similarity of the
variables and with the process layout.
It is sometimes useful to summarize the PLS weights when the model has a large
number of latent variables, especially more than three. A real wastewater plant
generally requires more than three LVs; in our case, four. The variable
influence on projection (VIP) expresses the relevance of each X variable pooled
over all dimensions and Y variables (Eriksson et al., 1995). The squared VIP is a
weighted sum of squares of the PLS weights, w, that also takes into account the
amount of Y variance explained by each latent variable. The VIP plot in Figure 6.5
reveals that COD3 is the most important variable, followed by T_aerator, MLSS_R,
MLSS_%E, and so on. This can be interpreted to mean that the influent COD from
BET3 is the most important factor for the treatment efficiency of the aeration
basins and the settling ability of the secondary clarifier.
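
The verbal definition of VIP given above can be written out in a few lines. The sketch below follows the commonly used definition, which is assumed here because the text does not state the formula explicitly: the squared VIP of variable j is the number of X variables times the sum over latent variables of the normalized squared weight, each weighted by the share of the explained Y sum of squares. The function name and argument layout are hypothetical.

```python
import numpy as np

def vip_scores(W, T, Q):
    """Variable influence on projection.

    W: (p x A) PLS weights, T: (n x A) orthogonal scores,
    Q: (m x A) Y loadings such that Y_hat = T @ Q.T.
    """
    # Y sum of squares explained by each latent variable
    ssy = np.sum(T ** 2, axis=0) * np.sum(Q ** 2, axis=0)
    Wn = W / np.linalg.norm(W, axis=0)        # unit-length weight vectors
    p = W.shape[0]
    return np.sqrt(p * (Wn ** 2 @ ssy) / ssy.sum())
```

With this definition the squared VIP values average to one over the X variables, so values above one are usually read as indicating above-average importance.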
Figure 6.6 shows the monitoring results of the T² and SPE_X statistics. The
horizontal line corresponds to the 95% confidence limit of the training data. From
this figure, we can see two deviations in the monitoring charts. During samples 75
to 80, the T² statistic deviates slightly, which indicates a deviation within the
model plane. However, SPE_X does not increase, which indicates that the internal
relations among the variables are not altered according to the PLS model. About 15
days later, that is, one solid retention time (SRT) after the T² deviation at sample
75, the SPE_X chart begins to change, from samples 90 to 110. In this event, the
SPE_X value has a shape similar to that of the T² change, except that it occurs one
SRT later. Around sample 250, the T² statistic peaks, while SPE_X stays in the
vicinity of the 95% confidence limit for a long time. We infer that the process
experienced a large transition in operation at this time, but the charts do not
identify its cause. To identify the cause of the deviation more clearly, the
contributions of each measured variable must be calculated. Also, the charts cannot
diagnose and isolate the scale of the fault (or disturbance) from the viewpoint of
the process dynamics. This is a weak point of the T² and Q statistics. That is,
changes or upsets in the operating condition sometimes occurred in practice even
though the statistics remained within the confidence limits. This result illustrates
that the principal components may change even though the correlation structure is
unchanged, when the variances of the scores and the residual errors are similar to
each other.
Multiresolution analysis
After the PLS model was constructed, MRA was applied to the score matrix
(T) of the X block. To monitor process changes, faults, and events, a window size of
15 samples was chosen in view of the SRT and a step size of 3 samples in view of the
hydraulic retention time (HRT).
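
The moving-window processing with these settings can be sketched as below. The eigenvalue computation is only a stand-in: this section does not spell out how the monitored eigenvalues are obtained from the windowed scores, so the sketch assumes that they are the squared singular values of each window of the score matrix, without re-centering, since the scores are already centered by the training mean. The function names are hypothetical.

```python
import numpy as np

def moving_windows(scores, window=15, step=3):
    """Yield (start_index, block) for successive windows of the score matrix."""
    for start in range(0, scores.shape[0] - window + 1, step):
        yield start, scores[start:start + window]

def window_eigenvalues(scores, window=15, step=3):
    """Eigenvalues of the score covariance in each window (assumed definition)."""
    rows = []
    for _, block in moving_windows(scores, window, step):
        s = np.linalg.svd(block, compute_uv=False)    # singular values, descending
        rows.append(s ** 2 / (window - 1))
    return np.array(rows)
```

For a test set of 314 observations and four latent variables, a window of 15 and a step of 3 give 100 windows, so the result is a 100 by 4 array with one eigenvalue trajectory per column.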
The MRA results for the PLS score values of the test dataset are shown in Figure 6.7.
As shown in Figure 6.7(a), GDM started to change at sample 65 and deviated during
samples 65 to 120 (March 3, 2000 to April 27, 2000), when a large process
change happened. GDM thus detects the change earlier and more clearly than
the conventional MSPM method. Three eigenvalues, each of which indicates a
disturbance at its own specific scale, are depicted in Figure 6.7(b-d). The remaining
eigenvalues carry little information and contain only high-frequency content such
as measurement noise. From Figure 6.7, we can see that the first and second
eigenvalues contribute most to the increase of GDM and are representative of
middle-scale disturbances. In detail, the process change is first detected in GDM
through the peaks of the second eigenvalue and is followed by the
systematic variation of the first eigenvalue. This is easily identified and visualized
by monitoring the pattern of each eigenvalue at the two scales. At this time, the
WWTP received a high input cyanide and COD load together with a small influent flow
rate, that is, a highly concentrated load. It reduced the activity of the
microorganisms and diminished the settling performance, which in turn increased the
SVI in the secondary settler. This result shows that sludge and floc formation
change under high load and with influent quality. Figure 6.8 shows the contribution
plot at this time. From this result, we found that a large influent load acted as an
external disturbance, was transformed into an internal disturbance, and then changed
the operating region of the activated sludge process. Meanwhile, GDM deviated again
from sample 230 to the end of the test dataset (August 16, 2000 to November 9, 2000).
During the summer, the WWTP was modified and a number of treatment units
and facilities were added. This made it feasible for the operators to change the
operation strategy, increasing the MLSS concentration and maintaining a high
DO concentration. This caused a large process change, which appears as a gradual
increase of the first eigenvalue in Figure 6.7(b). This result confirms that the
proposed method is distinctly better than conventional methods for a multiscale
process change in a nonstationary signal of unknown characteristics. It indicates
that the proposed MRA can be used effectively to extract information resulting from
changes in process operation and, as a result, can contribute to the localization of
different process faults and events.
6.4 Conclusions
In this research, a new multiresolution monitoring algorithm for
the PLS model is presented in order to address the distinctive features of WWTP
data, which are collinear, multivariate, noisy, nonstationary, and multiscale. It is
achieved by combining the PLS technique for modeling with multiresolution analysis
for monitoring. The PLS model is used for prediction and data analysis that take full
advantage of the multivariate nature of the data, and MRA of the PLS score and
residual error values is used to detect and diagnose faults and disturbances with a
multiscale concept. The method provides prediction, detection, and diagnosis
at the same time and makes the investigation of nonstationary and multiscale
phenomena practicable. Experimental results from the industrial coke WWTP
demonstrated that it can predict and analyze a complex plant and, simultaneously,
detect and isolate various faults and events occurring in
the biological treatment. Moreover, it can distinguish small failures from process
upsets.
[Diagram omitted. Blocks: X and Y; PLS regression; scores T and U; prediction; GDM of moving score matrix; eigenvalues EV1, EV2, ..., EVn; confidence limits; fault detection; diagnosis; biplots and contribution plot.]
Figure 6.1 Multiresolution analysis for PLS monitoring
[Plots omitted. Panels (a) and (b): expected value versus original data and versus score value (t1), respectively; panels (c) and (d): histograms.]
Figure 6.2 Normal probability plot and histogram of original data and PCA score
values (a) normal probability plot of original data (b) probability plot of score values
(c) histogram of original data (d) histogram of score values
[Plots omitted. Vertical axes: (a) SVI, (b) CN reduction, (c) COD reduction, (d) SPE_Y; horizontal axis in all panels: time (days).]
Figure 6.3 Prediction results of the PLS model with real Y value (solid line with
squares) and predicted value (dotted line) (a) SVI (b) reduction of CN (c) reduction
of COD (d) residual error of Y variables (SPE_Y)
[Plot omitted. Horizontal axis: LV 1; vertical axis: LV 2. Plotted variables: Q2, Q3, MLSS_%E, DO_Aerator, CN2, CN3, COD2, COD3, MLSS_R, T_Influent, T_Aerator, SVI_R, CN_red, COD_red.]
Figure 6.4 The second PLS weight vector plotted against the first for the PLS model
[Bar chart omitted. Vertical axis: VIP. Variables in the plotted order: COD3, T_aerator, MLSS_R, MLSS_%E, COD2, DO_aerator, CN3, T_Influent, CN2, Q3, Q2.]
Figure 6.5 Variable influence on projection (VIP) for the predictor variables
[Plots omitted. Vertical axes: (a) T², (b) SPE_X; horizontal axis: time (days).]
Figure 6.6 Monitoring performances based on T² and SPE_X statistics with 95%
confidence limits
[Plots omitted. Panels: (a) GDM; (b)-(d) eigenvalues EV 1, EV 2, and EV 3; horizontal axis in all panels: time (days).]
Figure 6.7 Monitoring performance of MRA for the PLS score values with 95%
confidence limits
Figure 6.8 Contribution plot of PLS score value for the first event values
Table 6.1 Process Input/Output Variables in WWTP
No    Variable      Description                        Unit    Mean     SD
X1    Q2            Flow rate from BET2                m³/h    179.4    15.98
X2    Q3            Flow rate from BET3                m³/h    85.53    8.726
X3    CN2           Cyanide from BET2                  mg/L    2.455    0.3764
X4    CN3           Cyanide from BET3                  mg/L    14.35    2.491
X5    COD2          COD from BET2                      mg/L    156.4    20.28
X6    COD3          COD from BET3                      mg/L    2083     295.5
X7    MLSS_%E       MLVSS at final aeration basin      mg/L    1605     409.3
X8    MLSS_R        MLSS in recycle                    mg/L    7194     3444
X9    DO_aerator    DO at final aeration basin         mg/L    2.064    0.9979
X10   T_influent    Influent temperature               °C      37.6     2.513
X11   T_aerator     Temperature at final aerator       °C      30.74    2.379
Y1    SVI_settler   Solid volume index at settler      mL/g    63.31    21.73
Y2    CN_red        Cyanide reduction                  mg/L    19.31    4.2
Y3    COD_red       COD reduction                      mg/L    605.4    97
Table 6.2 Variations explained by the PLS model of four latent variables
        X blocks (cumulative)    Y blocks (cumulative)
LV 1    0.192                    0.319
LV 2    0.338                    0.481
LV 3    0.446                    0.581
LV 4    0.540                    0.607
VII. Process Monitoring for Continuous Process with Cyclic
Operation
7.1 Introduction
Recently, due to increasingly stringent environmental regulations, advanced
monitoring and control strategies for WWTP have attracted considerable interest.
However, some specific features of this process are yet to be addressed fully.
First, most changes in this biological process are slow, and recovery from failures
can be very time-consuming and expensive. Sometimes it takes several months for
the process to recover from an abnormal operation. Therefore, early detection of
developing abnormalities is especially important for this process. Second, most
WWTPs are subject to large diurnal fluctuations in the flow rate and composition
of the feed stream, so these biological processes exhibit periodic characteristics.
Since the variables of such processes tend to fluctuate widely
over a cycle, their mean and variance do not remain constant with time. Because of
this, conventional statistical process monitoring (SPM) methods such as principal
component analysis (PCA), which implicitly assume a stationary underlying process,
may lead to many false alarms and missed faults. Better treatment performance can
be expected by accounting for this periodic pattern when applying advanced
monitoring and control strategies to this process.
Recently, several attempts have been made to treat the characteristic features of
WWTP. The first approach addresses the nonstationarity problem and includes
adaptive PCA, dynamic PCA, PLS, and monitoring based on state-space methods.
The second approach addresses the multiscale problem and includes wavelets,
multiscale PCA, and multiresolution analysis.
Rosen (1998) used the conventional PCA and PLS methods on the simulation
benchmark. For monitoring, a PCA model was built from a period of 14
days of dry weather conditions with diurnal and weekly variations. Based on this PCA
model, storm events, rainy weather, and a decreasing nitrification rate were tested.
For the prediction model, linear PLS, dynamic PLS, and nonlinear PLS
were compared in predicting the effluent nitrogen concentration. Because the
approach did not take into account the nonstationary and periodic features of the
benchmark influent, the T² and SPE charts also showed periodicity and frequently
produced false alarms, and the prediction performance was poor.
Teppola (1997) applied the dynamic PLS method for process
modeling and analysis in a real wastewater plant. Twenty process and control
variables were used to model four purification efficiency-related variables, i.e., the
diluted sludge volume index (DSVI) and the reductions of chemical oxygen demand
(COD), nitrogen, and phosphorus. The data set consisted of daily values for each
variable over two years. The dynamic PLS model was used to extract relevant
information from X to predict Y. The prediction performance typically varied
from 60 to 90%. DSVI was explained very well in cross-validation compared with the
other effluent quality-related variables, but some results of the prediction model
were poor. In particular, some of the high nutrient peaks were difficult to predict
because there was not enough information in the X variables. Moreover, the model
did not include the seasonal variations of the process.
Rosen and Lennox (2000) pointed out the limitations of the conventional PCA
method, namely that it assumes a stationary process and a single time scale. The
data used in their examples came from an industrial WWTP: fourteen variables from
the on-line measurement system, sampled every five minutes. To address these
limitations, they applied and compared adaptive PCA and multiscale PCA. The first
approach, adaptive PCA, overcomes the problem of nonstationary process data by
updating the mean, variance, and covariance structure. The monitoring model is
continuously updated using an exponential memory function. These adaptations
moved most of the variation from the residual plane (SPE) to the score plane (T²).
The second approach, multiscale PCA, used a wavelet transform to decompose the
measurement data into different time scales, and separate PCA models were used to
monitor each scale. For the interpretation of a disturbance, they recombined the
scales into more physically interpretable ones: a fast scale (hydraulic dynamics), a
medium scale (concentration dynamics), and a slow scale (population dynamics).
Multiscale PCA increases the sensitivity of the monitoring and makes it easier to
interpret the disturbance. However, adaptive PCA has limitations, such as adaptation
to abnormal changes, the need to test the information content (information windup),
and the difficulty of interpreting the updated covariance matrix. Multiscale PCA
makes the interpretation cumbersome by using a PCA model on each scale, and its
covariance structure is static, which may introduce errors. Both approaches are
rather complex, whereas the simplest adequate model should be used for monitoring.
A trade-off between complexity and information should be considered.
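
The idea of updating the mean and covariance with an exponential memory can be sketched with a generic exponentially weighted recursion. This is an illustration only and is not claimed to be the exact recursion of Rosen and Lennox (2000); the function name and the forgetting factor of 0.99 are assumptions.

```python
import numpy as np

def ew_update(mean, cov, x, forgetting=0.99):
    """One exponentially weighted update of the mean vector and covariance matrix.

    A forgetting factor close to 1 gives a long memory (slow adaptation).
    """
    new_mean = forgetting * mean + (1.0 - forgetting) * x
    d = x - new_mean
    new_cov = forgetting * cov + (1.0 - forgetting) * np.outer(d, d)
    return new_mean, new_cov
```

The sketch also makes the first limitation listed above visible: if the process settles at an abnormal level, the recursion gradually accepts that level as the new mean, so a persistent fault is eventually adapted into the model.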
Teppola (2000) suggested a combined approach of PLS and multiresolution
analysis (MRA). In this work, a PLS model was built to remove the collinearity
problem and to obtain a parsimonious model. Then the score values of the PLS model
were processed using wavelets and MRA to extract the process trends and to detect
different kinds of faults and disturbances. It was shown how seasonal trends, faults,
and disturbances can be separated and discriminated by observing different scales,
and also diagnosed by studying multiscale biplots and computing variable
contributions. In a subsequent paper, Teppola (2001) presented a monitoring algorithm
that removes periodic seasonal fluctuations and long-term drift, low-frequency
variations that mask and interfere with the detection of small and
moderate-level transient phenomena. By detrending, the relatively common problem
of autocorrelated measurements can also be avoided. The algorithm first applies a
wavelet filter to the original data, removes the low-frequency components, and then
constructs the PLS model. For these particular data, the PLS monitoring results were
shown to be superior to those of the conventional PLS model, because removing the
low-frequency fluctuations yields a more stationary filtered data set that is more
suitable for monitoring.
Although these processes are nonstationary, their dynamic behavior tends to
repeat from cycle to cycle, and hence their cycle-to-cycle behavior may be assumed
stationary. It is therefore plausible to calculate and use different means and
covariances for different time points within each cycle. One can also establish
correlations among the samples at different time points of a cycle, much as in
multiway PCA (M-PCA) used for batch process monitoring (Nomikos and
MacGregor, 1994, 1995). Beyond that lies the possibility of capturing the correlation
in the variations from cycle to cycle for quicker detection of small mean shifts and
slow drifts. An efficient way of doing this is to describe the variations (from their
mean behavior) by a periodically time-varying (PTV) state-space model. However,
PTV system models are difficult to build, and a systematic framework is needed.
To provide this monitoring ability, we propose a monitoring method based
on a state-space model that captures and uses the period-to-period correlation
structure.
7.2 Theory
Model development
To make the modeling task manageable, we adopt the technique of lifting. In
the “lifted” form of a PTV model, all samples within one cycle are collected as a
single vector (Dorsey and Lee, 2001). Let yk(t) represent the vector of the mean
centered and scaled process measurements available at sample time t during cycle k.
yk(t) ∈ Rny where ny is the number of variables selected for monitoring purposes
(including the controller outputs and process outputs). Assume that there are N
time samples in each cycle. Then, the lifted vector looks like
$$
Y_k = \left[\, y_k^T(1),\; y_k^T(2),\; \ldots,\; y_k^T(N) \,\right]^T \tag{7.1}
$$
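As a minimal sketch of the lifting step in (7.1), assume the mean-centered and scaled measurements are stored as one list of ny values per sample time. The function name `lift_cycles` and the toy data are illustrative assumptions, not part of the original work.

```python
def lift_cycles(samples, n_per_cycle):
    """Stack every n_per_cycle consecutive samples into one lifted vector Y_k.

    samples: list of per-sample measurement lists, each of length n_y.
    Returns a list of lifted vectors, each of length n_per_cycle * n_y.
    Trailing samples that do not fill a whole cycle are dropped.
    """
    n_cycles = len(samples) // n_per_cycle
    lifted = []
    for k in range(n_cycles):
        cycle = samples[k * n_per_cycle:(k + 1) * n_per_cycle]
        # Concatenate y_k(1), y_k(2), ..., y_k(N) in time order.
        lifted.append([value for y_t in cycle for value in y_t])
    return lifted


# Toy data (hypothetical): 2 variables, 3 samples per cycle,
# 2 full cycles plus one stray sample that is dropped.
samples = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50], [6, 60], [7, 70]]
Y = lift_cycles(samples, n_per_cycle=3)
print(Y)
```

Each row of `Y` then plays the role of one lifted vector Yk of length N*ny.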
First, a cycle-to-cycle invariant model is constructed using the subspace
identification technique (Van Overschee and De Moor, 1993, 1994, 1996; Dorsey
and Lee, 2001).
$$
\begin{aligned}
x_{k+1} &= A x_k + K e_k \\
Y_k &= C x_k + e_k
\end{aligned} \tag{7.2}
$$
where A, K, and C are the system matrices; Yk is the lifted vector of all data from the
kth cycle; xk is the state, which is extracted from the process data based on the
relevance of previous-cycle measurements for predicting future-cycle
measurements (that is, the state holds the information from previous
cycles that is relevant for predicting future cycles); ek is the innovation vector,
the residual between the process data and its estimate; and Kek is a stochastic input
vector acting as a state disturbance.
The dimension of Yk (N*ny) is usually very high, and there exist strong
correlations among its elements. To reduce the dimension and facilitate the
identification step, PCA can be applied to Yk to obtain the score vector Y_k^r. The
stochastic state-space model can then be constructed with the reduced output vector.
$$
\begin{aligned}
x_{k+1} &= A x_k + K e_k \\
Y_k^r &= C^r x_k + e_k
\end{aligned} \tag{7.3}
$$
$$
Y_k = \Theta Y_k^r + E_k \tag{7.4}
$$
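The reduction in (7.4) can be illustrated with a short sketch, assuming the lifted cycle vectors are stored as rows of a column-centered matrix and that PCA is computed through a singular value decomposition. The function name `pca_scores` and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def pca_scores(Y, n_components):
    """Project lifted cycle vectors (rows of Y) onto the leading principal directions.

    Returns (scores, loadings, residual) with Y = scores @ loadings.T + residual,
    which mirrors Y_k = Theta Y_k^r + E_k in (7.4) with Theta = loadings.
    Y is assumed to be mean-centered column-wise already.
    """
    # The right singular vectors of Y are the principal directions.
    _, _, vt = np.linalg.svd(Y, full_matrices=False)
    loadings = vt[:n_components].T        # Theta, shape (N*ny, r)
    scores = Y @ loadings                 # rows are the score vectors Y_k^r
    residual = Y - scores @ loadings.T    # rows are the PCA residuals E_k
    return scores, loadings, residual


rng = np.random.default_rng(0)
# Synthetic lifted data (hypothetical): 50 cycles, 12 lifted elements, 2 latent factors.
latent = rng.normal(size=(50, 2))
mixing = rng.normal(size=(2, 12))
Y = latent @ mixing + 0.01 * rng.normal(size=(50, 12))
Y = Y - Y.mean(axis=0)

scores, theta, E = pca_scores(Y, n_components=2)
print(scores.shape, theta.shape)
```

The score rows are what would be passed to the subspace identification step in place of the full lifted vectors.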
Based on the above cycle-to-cycle system model (7.3), a time evolution PTV model
can be constructed for the on-line monitoring purpose.
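$$
\begin{aligned}
x_k(t+1) &= x_k(t) \\
y_k(t) &= H(t)\, x_k(t) + \varepsilon_k(t)
\end{aligned} \tag{7.5}
$$
$$
H(t) = \left[\, 0,\; \cdots,\; I(t),\; 0,\; \cdots \,\right] \Theta\, C^r \tag{7.6}
$$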
where εk(t) includes both the appropriate elements of ek and the residual Ek from
PCA. The cycle-to-cycle transition is then described as
$$
x_{k+1}(0) = A\, x_k(N-1) + K e_k \tag{7.7}
$$
Hence, the terminal state of one period becomes the initial state of the next period.
This means that the states of successive periods are naturally connected through the
dynamics of the process. This is the main difference from the case of batch systems,
where the state is reset at the start of each run.
However, this model formulation may not be valid for Kalman filter
implementation, since the residual may be a correlated series instead of white
noise; that is, the state noise Kek and the output noise εk are correlated. To deal with
this, one can use the following augmented form of the model:
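$$
\begin{aligned}
\begin{bmatrix} x_{k+1} \\ Y_k^r \end{bmatrix}
&= \begin{bmatrix} A & 0 \\ C^r & 0 \end{bmatrix}
\begin{bmatrix} x_k \\ Y_{k-1}^r \end{bmatrix}
+ \begin{bmatrix} K \\ I \end{bmatrix} e_k \\
Y_k^r &= \begin{bmatrix} C^r & 0 \end{bmatrix}
\begin{bmatrix} x_k \\ Y_{k-1}^r \end{bmatrix}
\end{aligned} \tag{7.8}
$$
where
$$
z_{k+1} = \begin{bmatrix} x_{k+1} \\ Y_k^r \end{bmatrix}, \quad
\Phi = \begin{bmatrix} A & 0 \\ C^r & 0 \end{bmatrix}, \quad
\Gamma = \begin{bmatrix} K \\ I \end{bmatrix}, \quad
H = \begin{bmatrix} C^r & 0 \end{bmatrix}
$$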
Then, within a cycle, the time model becomes
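$$
\begin{aligned}
\begin{bmatrix} x_k(t+1) \\ Y_k^r \end{bmatrix}
&= \begin{bmatrix} x_k(t) \\ Y_{k-1}^r \end{bmatrix} \\
y_k(t) &= \left[\, 0,\; \cdots,\; I(t),\; 0,\; \cdots \,\right] \Theta
\begin{bmatrix} 0 & I \end{bmatrix}
\begin{bmatrix} x_k(t) \\ Y_k^r \end{bmatrix} + \nu_{PCA}(t)
\end{aligned} \tag{7.9}
$$
where
$$
H(t) = \left[\, 0,\; \cdots,\; I(t),\; 0,\; \cdots \,\right] \Theta \begin{bmatrix} 0 & I \end{bmatrix}
$$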
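Finally, the cycle transition model is
$$
\begin{aligned}
\begin{bmatrix} x_{k+1}(0) \\ Y_k^r(0) \end{bmatrix}
&= \begin{bmatrix} A & 0 \\ C^r & 0 \end{bmatrix}
\begin{bmatrix} x_k(N-1) \\ Y_{k-1}^r(N-1) \end{bmatrix}
+ \begin{bmatrix} K \\ I \end{bmatrix} e_k \\
Y_k^r &= \begin{bmatrix} C^r & 0 \end{bmatrix}
\begin{bmatrix} x_k(N-1) \\ e_k(N-1) \end{bmatrix}
\end{aligned} \tag{7.10}
$$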
Now, a periodic Kalman filter can be applied to the model (7.9) and (7.10). The PTV
Kalman filter can be designed using the standard equations to update the state and
the output score vectors recursively on the basis of incoming measurements. Let
Φ, H, Γ, Q, and R denote the state transition, output, state noise coefficient, state noise
covariance, and measurement noise covariance matrices, respectively. The Kalman
filter equations are as follows:
$$
\begin{aligned}
x(t+1|t) &= \Phi\, x(t|t) \\
x(t+1|t+1) &= x(t+1|t) + K(t+1)\left[\, y(t+1) - H x(t+1|t) \,\right] \\
K(t+1) &= P(t+1|t) H^T \left[\, H P(t+1|t) H^T + R(t+1) \,\right]^{-1} \\
P(t+1|t) &= \Phi P(t|t) \Phi^T + \Gamma Q \Gamma^T \\
P(t+1|t+1) &= \left[\, I - K(t+1) H \,\right] P(t+1|t)
\end{aligned} \tag{7.11}
$$
When t = 0, …, N-2: Φ = I, Q = 0, H = H(t), and R can be estimated from the PCA residual.
When t = N-1: Φ and H are the matrices of the augmented cycle-to-cycle model, Q is the
error covariance matrix obtained from subspace identification, and R = 0.
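The recursion can be sketched as a single predict/update function to which the caller passes the matrices valid at the current time point. The sketch below is only an illustration under stated assumptions: the function name `kalman_step`, the two-dimensional state, the alternating measurement rows, and the noise levels are all hypothetical. It exercises only the within-cycle setting (Φ = I, Q = 0); at the cycle boundary the same function would be called with the cycle-to-cycle matrices instead.

```python
import numpy as np


def kalman_step(x, P, y, Phi, Gamma, Q, H, R):
    """One predict/update step of a standard Kalman filter."""
    x_pred = Phi @ x                                   # state prediction
    P_pred = Phi @ P @ Phi.T + Gamma @ Q @ Gamma.T     # covariance prediction
    S = H @ P_pred @ H.T + R                           # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)                # Kalman gain
    x_new = x_pred + K @ (y - H @ x_pred)              # measurement update
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new


# Within a cycle the state is held constant: Phi = I and Q = 0.
n = 2
I = np.eye(n)
x_true = np.array([1.0, -0.5])          # hypothetical "true" state
x_est, P = np.zeros(n), 10.0 * np.eye(n)
rng = np.random.default_rng(1)
R = 0.04 * np.eye(1)                    # assumed measurement noise variance
trace_history = [float(np.trace(P))]
for t in range(20):
    # Time-varying output row H(t): here it alternates between the two state elements.
    H_t = np.array([[1.0, 0.0]]) if t % 2 == 0 else np.array([[0.0, 1.0]])
    y_t = H_t @ x_true + 0.2 * rng.normal(size=1)
    x_est, P = kalman_step(x_est, P, y_t, I, I, np.zeros((n, n)), H_t, R)
    trace_history.append(float(np.trace(P)))
print(x_est, trace_history[-1])
```

As measurements arrive within the cycle, the estimate settles near the constant state and the trace of P shrinks, which is the behavior expected when Φ = I and Q = 0.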
Process Monitoring Measure
Hotelling’s T2 monitoring statistic based on the state space was proposed
by Negiz and Cinar (1997). Hotelling’s T2 statistic is a metric that includes
information on both the mean and the covariance structure of the state variables.
$$
T_k^2 = x_k^T \Sigma^{-1} x_k \;\sim\; \frac{n(N-1)}{N-n}\, F_\alpha(n, N-n) \tag{7.12}
$$
where the subscript k indicates time, N is the number of samples, and n is the dimension
of the state vector x. The T2 statistic is obtained by assuming that the state variables
follow a zero-mean multivariate Gaussian distribution with estimated
covariance matrix Σ and that they are orthogonal at zero lag. Fα(n, N-n) is the upper
100α% critical point of the F-distribution with n and N-n degrees of freedom, which
can be used to establish a control limit with significance level α.
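A small sketch of (7.12) follows. It assumes that the covariance matrix Σ has already been estimated from normal data and that the F critical value is supplied by the caller (for example, from a table). The function names and all numbers are hypothetical; the value 3.04 for F0.05(2, 198) is an approximate table value used only for illustration.

```python
import numpy as np


def hotelling_t2(x, sigma):
    """T^2 = x^T Sigma^{-1} x for one zero-mean state vector x."""
    return float(x @ np.linalg.solve(sigma, x))


def t2_limit(n, N, f_crit):
    """Control limit n(N-1)/(N-n) * F_alpha(n, N-n); f_crit is provided by the caller."""
    return n * (N - 1) / (N - n) * f_crit


sigma = np.array([[2.0, 0.0], [0.0, 0.5]])   # assumed state covariance
x_normal = np.array([1.0, 0.5])              # hypothetical in-control state
x_fault = np.array([4.0, 2.0])               # hypothetical abnormal state
limit = t2_limit(n=2, N=200, f_crit=3.04)    # f_crit: approximate table value (assumption)

for name, x in (("normal", x_normal), ("fault", x_fault)):
    t2 = hotelling_t2(x, sigma)
    print(name, t2, "alarm" if t2 > limit else "ok")
```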
The updated state vector x and the output score vector Yr can be monitored
separately using T2 statistics. Note that T2 monitoring of Yr amounts to
period-by-period M-PCA monitoring implemented in real time. Monitoring
the state vector x(t) can give extra information about small mean shifts or slowly
developing abnormalities that are hard to detect with PCA, because the state holds the
information from previous cycles that is relevant for predicting future cycles.
Therefore, a monitoring algorithm built around the developed state-space model
can efficiently detect abnormal deviations in the cycle-to-cycle process dynamics
as well as deviations in the process measurements
(Dorsey and Lee, 2001).
On the other hand, in many chemical and biological processes, some quality
variables cannot be measured on-line, and their laboratory measurements are available
only after long delays. Moreover, advanced nutrient sensors are expensive to purchase
and maintain. In such cases, inferential sensing can be very useful for on-line
monitoring and control. An added advantage of the proposed framework is that
inferential sensing is very easy to implement. For this purpose,
the quality variables y_k^q can be augmented with all other measured outputs before
doing PCA:
$$
Y_k = \left[\, y_k^T(1),\; y_k^{qT}(1),\; y_k^T(2),\; y_k^{qT}(2),\; \ldots,\; y_k^{qT}(N) \,\right]^T \tag{7.13}
$$
Note that the quality measurements do not have to be available at the same rate as
the process measurements. All the model formulations presented above remain the same.
Once the model is built, the Kalman filter can be designed to use whatever measurements
are available at each time. Inferential predictions of the quality variables can be
obtained simply by picking out the proper elements of the Kalman filter estimate and
transforming them back from the score space to the original full space.
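$$
y_k^q(t) = \left[\, 0,\; \cdots,\; I^q(t),\; 0,\; \cdots \,\right] \Theta
\begin{bmatrix} C^r & I \end{bmatrix}
\begin{bmatrix} x_k(t) \\ e_k(t) \end{bmatrix} \tag{7.14}
$$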
7.3 Simulation Study
The proposed monitoring method for cyclic processes is tested on data
generated from a simulation of the benchmark plant. The existing dry weather data set
of the benchmark may not be suitable for constructing a cycle-to-cycle model, because
the influent data vary little from day to day, whereas constructing a dynamic
day-to-day model requires data with sufficient variation and correlation from day to
day. In addition, the benchmark data set is not long enough: for a daily model,
only 14 days of data are available, which is far from enough, especially for the
subsequent subspace identification.
To generate data that show day-to-day correlation as well as some in-cycle
correlation, a variation (or disturbance) model is applied to both SNH,in and Qin. The
functions used are:
For the time model variation,
$$
E_k(t+1) = a\, E_k(t) + e(t) \tag{7.15}
$$
For the cycle-to-cycle model,
$$
\theta_{k+1} = b\, \theta_k + E_k \tag{7.16}
$$
where θk is a vector that stores the variations of all samples in cycle k. In the
simulation, the in-cycle variation is assumed to occur only four times a day, that is,
every 6 hours. The reason is that too much in-cycle
correlation would require more PCA scores to capture the whole-cycle dynamics,
which would complicate the subsequent state-space modeling and make it hard for the
PTV Kalman filter to follow the trend.
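The two variation models can be combined in a few lines. The following sketch rests on several assumptions not stated in the text: the coefficients a and b, the Gaussian noise e(t), the zero initial conditions, and the choice to carry E over from one cycle to the next are all hypothetical, and `simulate_variation` is an illustrative name.

```python
import random


def simulate_variation(n_cycles, n_per_cycle=4, a=0.5, b=0.8, noise_std=1.0, seed=0):
    """Generate theta_k for n_cycles cycles using (7.15) within a cycle and (7.16) between cycles.

    n_per_cycle = 4 corresponds to one variation every 6 hours of a daily cycle.
    The values of a, b, and noise_std are placeholders, not values from the study.
    """
    rng = random.Random(seed)
    theta = [0.0] * n_per_cycle
    history = []
    E_prev = 0.0   # assumption: E is carried over across cycle boundaries
    for _ in range(n_cycles):
        E = []
        for _ in range(n_per_cycle):
            E_prev = a * E_prev + rng.gauss(0.0, noise_std)    # time model (7.15)
            E.append(E_prev)
        theta = [b * th + e for th, e in zip(theta, E)]         # cycle-to-cycle model (7.16)
        history.append(theta)
    return history


history = simulate_variation(n_cycles=5)
for k, theta_k in enumerate(history):
    print(k, [round(v, 3) for v in theta_k])
```

The resulting θk values would then be superimposed on the nominal influent profiles to create day-to-day correlated data.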
We generated a 300-day normal data set from the influent data file. The last
200 days were used as the modeling data set for PCA and N4SID, and any part of the
first 100 days was used for the prediction test. Figure 7.1 shows the measured
variables for the first 10 days of the normal data set, which exhibit cyclic, diurnal
variation. Two disturbance scenarios were simulated. The first is a slow linear
decrease of the nitrification rate. The second is a small mean (step) decrease of the
nitrification rate. Table 7.1 lists the simulation conditions of the two disturbance cases.
For comparison, we applied the conventional PCA and PLS methods to both the
original measured data set and the data set with the mean variation subtracted. For
the latter, the average trajectory within a day (96 sample times) is subtracted from
the data, so that the within-day periodicity is removed, and PCA is applied to the
remaining data. The PCA model is then built on the auto-scaled
data set with zero mean and unit variance.
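Subtracting the average daily trajectory can be sketched as follows, assuming a single variable sampled at a fixed number of sample times per day. The function name `remove_daily_mean` is illustrative, and the toy series uses 4 samples per day instead of 96 to keep the example short.

```python
def remove_daily_mean(series, samples_per_day=96):
    """Subtract the average within-day trajectory from a regularly sampled series.

    Returns (residual, mean_trajectory). Samples beyond the last full day are dropped.
    """
    n_days = len(series) // samples_per_day
    # Average over days at each time-of-day index t.
    mean_traj = [
        sum(series[d * samples_per_day + t] for d in range(n_days)) / n_days
        for t in range(samples_per_day)
    ]
    residual = [
        series[i] - mean_traj[i % samples_per_day]
        for i in range(n_days * samples_per_day)
    ]
    return residual, mean_traj


# Toy series (hypothetical): two "days" of four samples each.
series = [1, 2, 3, 4, 3, 4, 5, 6]
residual, mean_traj = remove_daily_mean(series, samples_per_day=4)
print(mean_traj)
print(residual)
```

The residual series would then be auto-scaled before the PCA model is built.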
Monitoring and quality prediction performances are compared with those of the
static PCA and PLS methods. The variables used to build the X-block for
disturbance detection were the influent ammonia concentration (SNH,in), influent flow
rate (Qin), nitrate concentration in the second aerator (SNO,2), total suspended
solids in aerator 4 (TSS4), DO concentration in aerators 3 and 4 (SO,3, SO,4), oxygen
transfer coefficient in aerator 5 (KLa5), and internal recirculation rate (Qint). A PCA
model with 95% and 99% confidence limits is built from a training period of 200 days of
the dry weather data set (normal operation). Three PCs are selected for the PCA model,
and the captured variability is shown in Table 7.2. Figure 7.2 shows the T2 and SPE
plots of the conventional PCA method during the first 10 days of the normal data set.
During normal operation with diurnal influent, the PCA monitoring result shows apparent
cyclic and non-stationary characteristics. This non-stationary (periodic) behavior of
the T2 values causes false alarms and missed faults because it widens the confidence
limits.
Monitoring of nitrification linear decrease case
Figure 7.3 shows the conventional PCA monitoring results, based on the assumption of
stationary statistics, for the nitrification linear decrease case. The T2 and SPE
plots, which focus on samples 0-1000, show poor monitoring results for this linear
drift disturbance: the T2 plot cannot detect the event, and the SPE plot has a delay
of about 200 samples. This originates from the widened confidence limits around the mean
behavior. For such non-stationary (periodic) behavior, the method tends to give false
alarms at peak values but is not sensitive during low-value periods.
Figure 7.4 shows the PCA monitoring results with the mean variation subtracted
for the nitrification linear decrease case during samples 0-500. The control
limits of the PCA method for the periodic-removal data set are higher than
those for the conventional data set. The T2 and SPE plots for the nitrification linear
decrease case show better monitoring performance than those for the previous data set;
the T2 plot shows that the periodic variations have been removed, leaving other
variations. From Figure 7.4, the T2 measure shows delayed detection, and the
SPE plot detects the disturbance with a delay of about 100 samples. However, the T2 and
SPE plots still give some false alarms.
The monitoring result of the proposed method is shown in Figure 7.5. The 95%
confidence limit is shown by the horizontal dotted line on the
plots. The result demonstrates the high sensitivity of the proposed state-space
monitoring method: the T2 statistic of the state vector immediately detects the linear
drift of the nitrification rate. This early detection capability
could allow corrective action and operational changes before serious
situations such as biomass decay and sludge bulking occur.
Monitoring of nitrification step decrease
Figure 7.6 shows the conventional PCA monitoring results for the small step
decrease of the nitrification rate. The PCA model gives poor monitoring results for this
step-type disturbance: neither the T2 nor the SPE plot detects the
small step decrease. This originates from the widened confidence limits around the mean
behavior. For non-stationary behavior, the method tends to give false alarms at peak
values but is not sensitive during low-value periods. Because the small decrease of
the nitrification rate is introduced during a low-value period, the PCA monitoring
method, which is based on the average of the whole trajectory of each variable, cannot
detect this type of small mean shift.
Figure 7.7 shows the PCA monitoring results with the mean variation subtracted
for the nitrification step decrease case during samples 0-1000. Here, the
normal trajectory within a day is subtracted. As in the previous case, the T2 and SPE
plots for the small step decrease show slightly better monitoring performance than
those for the original data set. However, both the T2 and SPE plots have a delay of
about 200-300 samples and also give some false alarms. This is because the PCA method
with mean subtraction accounts only for the time-varying mean and assumes a constant
variance, although the variance may also be time-varying.
The monitoring result of the proposed method is shown in Figure 7.8. The
result demonstrates the high sensitivity of the proposed state-space monitoring method:
the T2 statistic of the state vector immediately detects the small step decrease of
the nitrification rate. This early detection capability could
allow corrective action and operational changes before undesirable
situations such as reduced plant efficiency and decreased settling ability
occur.
Inferential sensing
The quality prediction performance of inferential sensing is compared with that
of the PLS method for both the original data set and the data set with the mean
variation subtracted. We generated a 300-day normal data set, of which the last 200 days
were used as the modeling data set for PLS and any part of the first 100 days as the
validation set. Because we are interested in the prediction ability of the integrated
monitoring model in this research, we constructed the inferential model from 200
days of normal operation and did not consider any time lags between the input and
output. The quality variables are the effluent ammonia and nitrate concentrations,
SNH,e and SNO,e.
A linear PLS model of the original data set is built for the prediction of the quality
variables, with four latent variables (LVs) selected. The prediction results for the
validation data set are shown in Figure 7.9. The PLS model gives poor predictions
of both SNH,e and SNO,e, which have nonlinear and periodic dynamics that a linear PLS
model cannot capture. A PLS model for the
periodic-removal data set is then built, again with four LVs. Compared with
the previous PLS model, Figure 7.10 shows much better prediction performance. This
means that subtracting the average trajectory from the periodic process removes the
major nonlinear behavior of SNH,e. Finally, Figure 7.11 shows the prediction
performance of the proposed inferential sensing. Table 7.3 compares the mean
squared prediction error (MSE) of the three prediction methods. Like the PLS prediction
for the periodic-removal data set, the proposed method removes the periodic dynamics
and shows good prediction performance. This is not a surprising result, as the proposed
method captures most of the variability of the normal data set and the main dynamics of
the system. The proposed method therefore offers an advantage for process monitoring as
well as the possibility of predicting the quality variables within one integrated model.
7.4 Conclusions
In this research, we propose a monitoring method based on state-space models
for the diurnal cyclic characteristics of domestic WWTPs. A state-space model is
identified to extract “within-cycle” and “between-cycle” correlation information
from historical data using the subspace identification method. First, a cycle-to-cycle
invariant model is constructed using subspace identification. Second, a
time-varying Kalman filter model is constructed for on-line monitoring. For the
purpose of inferential sensing, an integrated framework that augments the quality
variables is also suggested. Simulation results show that the proposed method is an
appropriate monitoring technique for WWTPs with cyclic operation. In particular,
compared with conventional monitoring methods, it detects changes in cycle-to-cycle
behavior (the linear decrease of nitrification) more rapidly and detects mean shifts
(the step decrease of nitrification) more accurately.
Figure 7.1 Measured variables of the first 10 days of normal data set
Figure 7.2 Conventional PCA monitoring result during the first 10 days of normal data set
[Plot panels: "Value of T2 with 95 and 99" (x-axis: Sample Number; y-axis: Value of T2); "Process Residual Q with 95" (x-axis: Sample Number; y-axis: Residual)]
Figure 7.3 Conventional PCA monitoring result for nitrification linear decrease: T2 and SPE plot
Figure 7.4 PCA monitoring result with periodic removal for nitrification linear decrease: T2 and SPE plot
Figure 7.5 Monitoring result of the proposed method for nitrification linear decrease
Figure 7.6 Conventional PCA monitoring result for nitrification step decrease: T2 and SPE plot
Figure 7.7 PCA monitoring result with periodic removal for nitrification step decrease: T2 and SPE plot
Figure 7.8 Monitoring result of the proposed method for nitrification step decrease
Figure 7.9 Prediction results of SNH,e and SNO,e for validation data set with static PLS method
Figure 7.10 Prediction results of SNH,e and SNO,e for validation data set with static PLS method (after periodic removal)
Figure 7.11 Prediction results of SNH,e and SNO,e for validation data set with the proposed method
Table 7.1 Two disturbances in the benchmark
Disturbance                     Linear/step   Simulation conditions
Decreasing nitrification rate   Linear        Specific growth rate for autotrophs: from 0.5 to 0.4 day-1 in a linear fashion during samples 288 (day 3) to 480 (day 5)
Decreasing nitrification rate   Step          Specific growth rate for autotrophs: from 0.5 to 0.47 day-1 in a step fashion at sample 288 (day 3)
Table 7.2 Percent variance captured by PCA model
        Original                            Periodic removal
PCs     % Variance       % Variance         % Variance       % Variance
        captured         captured           captured         captured
        this PC          total              this PC          total
1       69.8             69.80              39.91            39.91
2       19.18            88.98              21.05            60.96
3       6.37             95.35              15.48            76.43
Table 7.3 MSE of two PLS methods and the proposed method
Model type               MSE
Conventional PLS         2.2460
Periodic removal PLS     0.1023
The proposed method      0.0044
VIII. Simultaneous Prediction and Classification in the
Secondary Settling Tank
8.1 Introduction
Since environmental regulations are becoming increasingly strict, greater
effort toward higher effluent quality from wastewater treatment plants is
required, including advanced monitoring of plant performance. To reduce the effluent
pollutants from a wastewater treatment plant (WWTP), the current state of each unit
process of the WWTP should be analyzed before any other procedure. Most
changes in a WWTP are slow when the process is recovering from a ‘bad’ state
to a ‘normal’ state. Early fault detection and classification of the current status in
the biological process make it possible to execute corrective action well before a
dangerous situation arises. For this reason, the monitoring and multivariate analysis
of the activated sludge process and the secondary settler have long been studied
(Olsson et al., 1988; Hasselblad et al., 1996; Stefano, 1998; Teppola et al., 1997,
1999; Van Dongen et al., 1998; Rosen et al., 1998).
Secondary sedimentation in the WWTP separates the biomass from the treated
wastewater, and its performance is crucial to the operation of an activated sludge
system. Its operation depends on the status of the sludge, which relies on many other
parameters such as temperature, organic loading, influent flow rate, and floc
properties. The sludge volume index (SVI) primarily describes the settling properties
of the sludge. High SVI values indicate a bulking state and overgrowth of
filamentous microbes, which is one of the major upsets of the activated sludge
process and leads to deterioration of the purification efficiency. Therefore,
predicting the SVI value is very important for the strategy of
settler operation.
In this research, we forecast the sludge volume index (SVI) of the secondary
settler using an adaptive RLS method. The ARX model parameters are proposed as
features for secondary clarifier monitoring and are verified by observing the
evolution of the ARX model parameters through the power spectrum. The capability of
monitoring the secondary clarifier is illustrated with a neural
network classifier which, combined with the adaptive processing scheme, proved to be
suitable for monitoring and classification in a real wastewater
treatment plant.
8.2 Theory
The monitoring system for the secondary settler has been developed based on a
neural network, which offers parallel distributed processing and powerful learning and
generalization of pattern information. The system is composed of
three fundamental parts. First, auto-regressive with exogenous input (ARX) models
of the SVI time series of the secondary settler are used to predict
the SVI value. The model parameters are adaptively estimated by the RLS method and
provide the feature vectors. Their discriminating ability is shown by power spectrum
analysis of the ARX parameters. Second, in the classification process, we design a
neural network classifier to identify the current state of the secondary settler.
After training the neural network, the state class of the settler is recognized from
the values of the output nodes, with the class chosen according to the maximum
selection rule. Third, for the structure of
the classifier, we determine the optimal number of hidden nodes using a GA.
SVI prediction with RLS method
To model the SVI of the secondary settler, we use an identification method
for the ARX model based on the recursive least squares (RLS) method (Ljung, 1987; Ko
and Cho, 1996). A general form of the discrete ARX model is as follows.
$$
y(t) + a_1 y(t-1) + \cdots + a_{n_a} y(t-n_a) = b_1 u(t-1) + b_2 u(t-2) + \cdots + b_{n_b} u(t-n_b) + e(t) \tag{8.1}
$$
The objective of ARX identification is to estimate the adjustable parameters ai and bi
so as to minimize the difference between the predicted and measured
process outputs. Because the secondary settler is a time-varying process with
inherently dynamic characteristics, an adaptive method is required. For this
purpose, we use the RLS algorithm, which makes the modeling technique well suited
to a time-varying environment. The RLS algorithm is as follows.
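$$
\begin{aligned}
\hat{\theta}(t) &= \hat{\theta}(t-1) + K(t)\left( y(t) - \hat{y}(t) \right) \\
\hat{y}(t) &= \varphi^T(t)\, \hat{\theta}(t-1) \\
K(t) &= P(t)\varphi(t) = \frac{P(t-1)\varphi(t)}{\lambda + \varphi^T(t) P(t-1) \varphi(t)} \\
P(t) &= \frac{1}{\lambda}\left[ P(t-1) - \frac{P(t-1)\varphi(t)\varphi^T(t)P(t-1)}{\lambda + \varphi^T(t) P(t-1) \varphi(t)} \right]
\end{aligned} \tag{8.2}
$$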
where K(t) is the adaptation gain, θ̂(t) is the parameter estimate at time t, ŷ(t) is the
prediction based on observations up to time t-1, ϕ(t) is the regression vector, λ is
the forgetting factor, and P(t) is the covariance matrix of the estimates. This
recursive form is very convenient for updating the model at each time step, so that the
model follows gradual changes in the characteristics of the settler process. If the
parameters of the ARX model are well tuned, a change in the dynamic characteristics of
the settler process will cause a gradual change in the parameter vector and the
prediction error. Therefore, the status of the secondary settler can be observed
through gradual changes in the ARX model
parameters.
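The recursion (8.2) can be sketched for a first-order, single-input ARX model, y(t) + a y(t-1) = b u(t-1) + e(t), so that ϕ(t) = [-y(t-1), u(t-1)] and θ = [a, b]. This is a minimal illustration under assumptions: the function name `rls_update`, the true parameter values, the noise level, the initial covariance, and the forgetting factor 0.98 are all hypothetical and are not the settings used in the study.

```python
import numpy as np


def rls_update(theta, P, phi, y, lam=0.98):
    """One RLS step with forgetting factor lam, following the form of (8.2)."""
    y_hat = phi @ theta                        # one-step-ahead prediction
    denom = lam + phi @ P @ phi
    K = P @ phi / denom                        # adaptation gain
    theta_new = theta + K * (y - y_hat)        # parameter update
    P_new = (P - np.outer(P @ phi, phi @ P) / denom) / lam
    return theta_new, P_new, y_hat


rng = np.random.default_rng(0)
a_true, b_true = -0.7, 0.5                     # hypothetical true ARX parameters
theta, P = np.zeros(2), 1000.0 * np.eye(2)     # vague initial estimate
y_prev, u_prev = 0.0, 0.0
for t in range(300):
    y = -a_true * y_prev + b_true * u_prev + 0.01 * rng.normal()
    phi = np.array([-y_prev, u_prev])          # regression vector
    theta, P, _ = rls_update(theta, P, phi, y)
    y_prev, u_prev = y, rng.normal()
print(theta)
```

In a monitoring setting the sequence of `theta` values, rather than only the final estimate, would serve as the feature vector.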
Power Spectrum
To see the sensitivity of the ARX coefficients of SVI in each state and to
verify their discriminating ability, the power spectra in the different states must be
compared. The power spectrum of a stationary process is defined as the Fourier
transform of its covariance function (Ljung, 1987). While a deterministic signal can
be expressed as a mixture of sine and cosine functions at different frequencies, a
time series or stochastic system response as a function of time does not
belong to the class of functions dealt with in the usual Fourier transform theory. The
frequency decomposition of such random functions can be obtained by taking the
Fourier transform of the auto-covariance function, to which the usual Fourier
transform can be applied.
For a stochastic process, y(t) can be given by
$$
y(t) = G(q)u(t) + H(q)e(t) \tag{8.3}
$$
where u(t) is a quasi-stationary, deterministic signal with spectrum Φu(ω), and e(t) is
white noise with variance σ_a^2. Let G(q) and H(q) be stable filters. Then y(t) is
quasi-stationary and
$$
\Phi_y(\omega) = \left| G(e^{i\omega}) \right|^2 \Phi_u(\omega) + \sigma_a^2 \left| H(e^{i\omega}) \right|^2 \tag{8.4}
$$
$$
\Phi_{yu}(\omega) = G(e^{i\omega})\, \Phi_u(\omega) \tag{8.5}
$$
where Φy(ω) is the power spectrum of y(t) and Φyu(ω) is the cross spectrum of y(t) and
u(t). Note that spectrum estimates of this type are inherently smooth
because they are obtained from a parametric representation of the system. The
result has a physical interpretation: |G(e^{iω})|^2 is the squared steady-state
amplitude gain of the system response to a sine wave of frequency ω. The contribution
of the input to the spectral density of the output is then the product of
|G(e^{iω})|^2 and the spectral density of the input, Φu(ω). If the power spectra are
separated and take dissimilar values in different states, the state of the secondary
settler can be decided by analyzing the power spectrum of the ARX parameters.
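For an ARX model, G(q) = B(q)/A(q) and H(q) = 1/A(q), so the output spectrum in (8.4) can be evaluated directly from the estimated coefficients. The sketch below assumes a single input, a flat input spectrum, and unit noise variance; the function name `arx_output_spectrum` and the coefficient values are hypothetical.

```python
import cmath
import math


def arx_output_spectrum(a, b, omega, phi_u=1.0, sigma2=1.0):
    """Evaluate |G|^2 * phi_u + sigma2 * |H|^2 at frequency omega for an ARX model.

    a = [a_1, ..., a_na] and b = [b_1, ..., b_nb] are the ARX coefficients,
    with A(q) = 1 + a_1 q^-1 + ... and B(q) = b_1 q^-1 + ...
    """
    z_inv = cmath.exp(-1j * omega)                           # q^-1 on the unit circle
    A = 1 + sum(a_i * z_inv ** (i + 1) for i, a_i in enumerate(a))
    B = sum(b_i * z_inv ** (i + 1) for i, b_i in enumerate(b))
    G = B / A
    H = 1 / A
    return abs(G) ** 2 * phi_u + sigma2 * abs(H) ** 2


# Hypothetical first-order model: y(t) - 0.5 y(t-1) = u(t-1) + e(t).
a, b = [-0.5], [1.0]
for omega in (0.0, math.pi / 2, math.pi):
    print(round(omega, 3), arx_output_spectrum(a, b, omega))
```

Comparing such curves for parameter sets estimated in different operating states is one way to check whether the ARX parameters separate the states.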
Pattern classification (neural network)
Although the different states are not completely separable in the original
input-output space under a wide range of conditions, the classes become
separable in the feature space of ARX parameters when a neural network
classifier with nonlinear mapping ability is used. The ARX parameters were therefore
used as input features for the neural network classifier, which is explained
below.
Pattern recognition methods such as neural networks are important for
classification problems because they do not require accurate process models, which
are often difficult to obtain for biological and chemical processes. Moreover, because
of its nonlinear transformation, a neural network can outperform the conventional
statistical approach in many engineering applications (Bishop,
1995; Lin and Lee, 1996; Haykin, 1999). A neural network maps a set of input
patterns (e.g., process operating conditions) to respective output classes (e.g.,
categorical groups). We use an input vector and an output vector to represent the
input pattern and output class, respectively. The target output vector, y, of the neural
network is bipolar, with -1 indicating that the input pattern is not within a specific
class and 1 indicating that it is within that class (e.g., “-1” = not in class I; “1” = in
class I). The actual output of the neural network is a numerical value between -1
and 1, which can be interpreted as an indication of how likely the input pattern is to
belong to a specific class. The output vector (y) covers three possible classes, that
is, y = {class I, class II, class III}. Note that for every point within the input
space, exactly one class must be specified. In this paper, we have only three possible
output vectors for training the network: y = {[1,-1,-1], [-1,1,-1], [-1,-1,1]}.
After the neural network classifier output is calculated, the values of the output
nodes are passed to the maximum selector. The output node selected by the
maximum selector indicates the class to which the current input belongs. In
theory, for an M-class classification problem in which the union of the M distinct
classes forms the entire input space, a total of M outputs is needed to represent all
possible classification decisions. This can be expressed as follows.
If yi(xj) > yk(xj) for all k (k = 1, 2, …, M; k ≠ i), then xj ∈ si (8.6)
where xj is the jth input vector, yi is the ith output node value of the neural network
classifier for input xj, si is the ith state of the secondary clarifier, and M is the
number of output nodes. A unique largest output value exists with probability 1 when the
underlying posterior class distributions are distinct.
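The maximum selection rule (8.6) amounts to picking the index of the largest output node. A minimal sketch, with hypothetical output values and class labels:

```python
def select_class(outputs, labels):
    """Return the label whose output node has the largest value (maximum selector)."""
    best_index = max(range(len(outputs)), key=lambda i: outputs[i])
    return labels[best_index]


labels = ["class I", "class II", "class III"]
# Hypothetical classifier outputs in [-1, 1] for one input pattern.
outputs = [-0.8, 0.6, -0.2]
print(select_class(outputs, labels))
```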
Genetic algorithm (GA)
The GA is a derivative-free stochastic optimization technique whose
search algorithm is based on principles from nature such as
natural selection, crossover, and mutation (Marsili-Libelli, 1996; Wang et al., 1998).
One characteristic of the GA is its multiple-point search, which distinguishes
it from other random search methods. In this paper, the string, which models a
chromosome, represents the number of hidden nodes of the neural network. The GA
typically starts by randomly generating an initial population of strings. Each string is
mapped to a fitness value to obtain a quantitative measure. On the basis of
the fitness values, the strings undergo genetic operations. The genetic
operations are repeated until a set of parameters giving the optimal solution to the
problem is found or the maximum number of generations is reached.
Since the ultimate objective of a pattern classifier is to achieve an acceptable
rate of correct classification, this criterion is used to judge when the variable
parameters of the neural network are optimal. In addition, the GA is a useful tool for
selecting features for neural network classifiers. For example, a GA can be used to
learn the neural network structure or to initialize the weights, which are
generally assigned randomly, to reasonable values. This paper uses a hybrid algorithm
in which the GA optimizes the neural network structure in order to improve the
behavior and design of the network. Specifically, the GA was used to find the optimal
number of hidden nodes.
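A toy GA for choosing an integer such as the number of hidden nodes is sketched below. It is not the configuration used in this work: the binary encoding, population size, truncation selection with elitism, single-point crossover, mutation rate, and in particular the stand-in fitness function (which would in practice be the classification rate of a trained network) are all assumptions made for illustration.

```python
import random


def ga_search(fitness, low, high, pop_size=10, generations=20, mutation_rate=0.2, seed=0):
    """Search for the integer in [low, high] that maximizes fitness, using a simple GA.

    Assumes high - low >= 2 so that each string has at least two bits.
    """
    rng = random.Random(seed)
    n_bits = (high - low).bit_length()

    def decode(bits):
        # Binary string -> integer candidate, clipped to the allowed range.
        return min(high, low + int("".join(map(str, bits)), 2))

    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        ranked = sorted(population, key=lambda s: fitness(decode(s)), reverse=True)
        best_history.append(fitness(decode(ranked[0])))
        parents = ranked[:pop_size // 2]           # selection: keep the better half
        children = [ranked[0][:]]                  # elitism: the best string survives unchanged
        for _ in range(pop_size - 1):
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randint(1, n_bits - 1)       # single-point crossover
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit with probability mutation_rate.
            child = [1 - g if mutation_rate > rng.random() else g for g in child]
            children.append(child)
        population = children
    best = max(population, key=lambda s: fitness(decode(s)))
    return decode(best), best_history


def toy_fitness(n_hidden):
    """Hypothetical stand-in for the classification rate; peaks at 7 hidden nodes."""
    return -(n_hidden - 7) ** 2


best_n, history = ga_search(toy_fitness, low=1, high=16)
print(best_n, history[-1])
```

Because the best string is copied unchanged into the next generation, the best fitness recorded per generation never decreases.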
Hierarchy structure
The monitoring system for the secondary settler has been developed based on a neural
network, which offers parallel distributed processing and powerful learning and
generalization of pattern information. The system is composed of three fundamental
parts: adaptive ARX estimation, neural network
classification, and the maximum decision rule. Figure 8.1 shows the
schematic diagram of the proposed hierarchical structure.
First, adaptive estimation is performed to predict the SVI value and to provide
feature vectors. That is, an ARX model is used to predict the SVI value in the
secondary settler; its parameters are adaptively estimated by the RLS method and
serve as the input vector of the neural network classifier. Second, in the neural
network classification, the feature vectors are associated with the desired output
decision. The optimal number of hidden nodes of the classifier is decided using
GA. After the neural network has been trained, the classifier output is calculated
with the trained weights. Third, in the maximum decision rule, only one of the
classifier output values is chosen according to the rule "the minority is
subordinated to the majority".
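The maximum decision rule can be illustrated with a few lines of Python. This is a minimal sketch, not the implementation used in the study: it assumes that the three output nodes are ordered as (normal, bad, bulking), as in the target coding given in Section 8.3, and the function name decide_state and the sample output values are hypothetical.

```python
import numpy as np

# Output-node order follows the target coding used for training:
# (normal, bad, bulking).
STATES = ("normal", "bad", "bulking")

def decide_state(outputs):
    """Return the state whose classifier output node has the largest value."""
    outputs = np.asarray(outputs, dtype=float)
    return STATES[int(np.argmax(outputs))]

# Hypothetical classifier outputs for two samples (not taken from the plant data).
print(decide_state([0.71, -0.52, -0.88]))
print(decide_state([-0.35, 0.12, -0.60]))
```

Only the largest output decides the state, so the outputs do not have to reach the training targets exactly for the decision to be correct.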
8.3 Simulation Study
In this research, we used data from the industrial wastewater treatment facility of
an iron and steel making plant in Korea. It is a general activated sludge process
that has five aeration basins and a secondary clarifier. Figure 2.3 shows the
layout of the WWTP. The data set consisted of daily mean values from January 1,
1997 to December 22, 1999. The data were divided into two parts: a training set
consisting of the values from the first two years, and a test set consisting of
the remaining year, which was used to see how the monitoring proceeds with the
proposed algorithm.
First, the ARX model structure is as follows. It has four inputs: the influent flow
rate, influent COD, dissolved oxygen (DO) of the final aeration basin and mixed
liquor suspended solids (MLSS) in the final aeration basin. The output variable is
the SVI of the settler. The state of the secondary settler is divided into three
classes, which were judged by an experienced operator. The choice of the model
order is a non-trivial problem that requires a trade-off between a precise
description of the data and model complexity. We determined the model order using
cross-validation and numerous simulations. The prediction model uses an ARX
structure whose parameters are adapted by RLS with a forgetting factor, where the
order of the AR part is 3 and the order of each exogenous input is 2. The applied
ARX model has the following form.
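$$
\begin{aligned}
y(t) ={}& a_1 y(t-1) + a_2 y(t-2) + a_3 y(t-3) \\
&+ b_{1,1} u_1(t-1) + b_{1,2} u_1(t-2) + b_{2,1} u_2(t-1) + b_{2,2} u_2(t-2) \\
&+ b_{3,1} u_3(t-1) + b_{3,2} u_3(t-2) + b_{4,1} u_4(t-1) + b_{4,2} u_4(t-2)
\end{aligned}
\tag{8.7}
$$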
where y(t) is the SVI, u1(t) is the influent flow rate, u2(t) is the influent COD,
u3(t) is the DO and u4(t) is the MLSS. To remove data redundancy, we normalized the
raw training data. The RLS method uses a dead zone to remedy estimation windup.
Figure 8.2 shows the one-step-ahead prediction of the SVI, which follows the
measured values reasonably well. The dots are the measured values and the solid
line is the prediction.
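The adaptive estimation of equation (8.7) can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: the forgetting factor (0.98), the dead-zone threshold, the initial covariance p0 and the synthetic data are all hypothetical, and the dead zone is implemented in its simplest form (the update is skipped when the prediction error is small).

```python
import numpy as np

def regressor(y, u, t, na=3, nb=2):
    """phi(t) = [y(t-1..t-na), then u_k(t-1..t-nb) for each input k]."""
    past_y = [y[t - i] for i in range(1, na + 1)]
    past_u = [u[t - j, k] for k in range(u.shape[1]) for j in range(1, nb + 1)]
    return np.array(past_y + past_u)

def rls_arx(y, u, na=3, nb=2, lam=0.98, dead_zone=0.0, p0=1e3):
    """Recursive least squares with forgetting factor lam and a simple dead zone."""
    n_par = na + nb * u.shape[1]
    theta = np.zeros(n_par)
    P = p0 * np.eye(n_par)
    y_hat = np.full(len(y), np.nan)
    for t in range(max(na, nb), len(y)):
        phi = regressor(y, u, t, na, nb)
        y_hat[t] = phi @ theta            # one-step-ahead prediction
        err = y[t] - y_hat[t]
        if abs(err) > dead_zone:          # skip the update for small errors
            gain = P @ phi / (lam + phi @ P @ phi)
            theta = theta + gain * err
            P = (P - np.outer(gain, phi @ P)) / lam
    return theta, y_hat

# Synthetic, noise-free check with hypothetical parameters (3 AR + 4 x 2 input terms).
rng = np.random.default_rng(0)
theta_true = np.array([0.5, -0.2, 0.1, 0.4, 0.1, -0.3, 0.2, 0.6, -0.1, 0.3, 0.05])
n = 400
u = rng.standard_normal((n, 4))
y = np.zeros(n)
for t in range(3, n):
    y[t] = regressor(y, u, t) @ theta_true
theta_hat, y_hat = rls_arx(y, u)
print(np.round(theta_hat, 3))
```

The eleven entries of theta are the quantities that are later passed to the neural network classifier as features.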
In order to see the sensitivity of the ARX coefficients to each state, the
parameter values for each state are shown in Figure 8.3. In this figure, the ARX
parameters take different values in each state, which means that the state of the
secondary clarifier can be decided by quantitatively analyzing the ARX parameters.
To confirm the difference between the parameters of each class, we display the
power spectra of the parameters in Figure 8.4.
Second, the neural network classifier has an MLP structure with two hidden layers,
whose numbers of nodes are decided by GA. To speed up training and stabilize the
learning algorithm, we use a momentum term, an adaptive learning rate, normalized
weight updating and batch learning. The neural network is trained using three
patterns according to the state of the secondary settler. The number of ARX
parameters, which are used as the input variables of the neural network, is eleven.
Other operating conditions, such as toxic occurrence and aeration basin status, can
be taken as additional features to compensate for the sensitivity of the ARX
parameters to variations in operating conditions. However, the simulation results
showed little improvement in classification ability, so we did not use this
additional information, for the sake of clarity. The input features were normalized
to the range [-1, 1] in order to prevent saturation of the activation function. The
corresponding target values of the output nodes were set to (0.9, -0.9, -0.9) for
the normal state, (-0.9, 0.9, -0.9) for the bad state and (-0.9, -0.9, 0.9) for the
bulking state. In the application of GA to the structure of the neural network, the
initial population size was 30 and the number of generations was 100. Rank-based
selection was used as the selection operator, and mutation and uniform crossover as
the search operators. We set the mutation rate to 0.01 and the crossover rate to
0.6. GA can find the optimal number of nodes in each hidden layer quickly, because
the search space is small. The numbers of nodes in the first and second hidden
layers are 7 and 4, respectively. In this experiment, the neural network with two
hidden layers gave a better result than that with only one hidden layer, and three
or more hidden layers gave no further improvement in performance.
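The GA settings above (population of 30, 100 generations, rank-based selection, uniform crossover with rate 0.6 and mutation rate 0.01) can be sketched as follows. Several parts are assumptions made only for illustration: the 4-bit encoding of each hidden layer size, the elitism step, and in particular toy_fitness, which is a hypothetical stand-in for the classification rate of a trained network and is peaked at an arbitrary point.

```python
import random

BITS = 4  # assumed encoding: 4 bits per hidden layer, i.e. 1 to 16 nodes

def decode(chrom):
    """Map an 8-bit chromosome to (nodes in layer 1, nodes in layer 2)."""
    n1 = int("".join(map(str, chrom[:BITS])), 2) + 1
    n2 = int("".join(map(str, chrom[BITS:])), 2) + 1
    return n1, n2

def toy_fitness(chrom):
    """Hypothetical stand-in for the classification rate of a trained network."""
    n1, n2 = decode(chrom)
    return 1.0 / (1.0 + (n1 - 7) ** 2 + (n2 - 4) ** 2)

def rank_select(ranked, rng):
    """Rank-based selection: the best string gets the largest weight."""
    weights = list(range(len(ranked), 0, -1))
    return rng.choices(ranked, weights=weights, k=1)[0]

def run_ga(fitness, pop_size=30, generations=100, p_cross=0.6, p_mut=0.01, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(2 * BITS)] for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        best_history.append(fitness(ranked[0]))
        children = [ranked[0][:]]  # elitism (an assumption): keep the best string
        while len(children) < pop_size:
            a, b = rank_select(ranked, rng), rank_select(ranked, rng)
            child = a[:]
            if rng.random() < p_cross:  # uniform crossover, gene by gene
                child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
    best = max(pop, key=fitness)
    return decode(best), fitness(best), best_history

(n1, n2), best_fit, history = run_ga(toy_fitness)
print(n1, n2, round(best_fit, 3))
```

In a real application the fitness call would train and evaluate a network for each candidate structure, which is the expensive part of the search.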
In testing mode, the maximum value of the neural network classifier outputs was
chosen to determine the present state. The test data contain no bulking state, only
the normal and bad states. Table 8.1 shows the confusion matrix obtained by
applying the neural network classifier to the test set. This is a matrix A whose
(i, j) element is the number of vectors that originate from the ith distribution
and are assigned to the jth cluster. Though the output values do not completely
agree with the corresponding desired outputs, they are reasonable for recognizing
the present state. With the trained neural network, the classification rate was
about 80.9% on average, even though the system was tested under a wide range of
operating conditions. Because the process experienced an abrupt load variation
during the latter part of the test set, the misclassification rate was higher in
this period.
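The classification rate can be recomputed from the counts in Table 8.1. The short sketch below uses only the normal and bad rows of the confusion matrix, because the test set contains no bulking samples; the variable names are illustrative.

```python
import numpy as np

# Rows: true state (normal, bad); columns: predicted state (normal, bad, bulking).
# Counts are taken from Table 8.1.
confusion = np.array([[228, 9, 6],
                      [36, 59, 17]])

correct = confusion[0, 0] + confusion[1, 1]
total = confusion.sum()
overall_rate = correct / total
per_class_rate = np.array([confusion[0, 0], confusion[1, 1]]) / confusion.sum(axis=1)

print(f"overall: {overall_rate:.3f}")
print("per class (normal, bad):", np.round(per_class_rate, 3))
```

The overall rate comes out at roughly 81%, and the per-class rates indicate that most of the misclassifications come from the bad state.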
8.4 Conclusions
The recognition of the process state of a secondary settler is very important for
operational decisions. The current state can be monitored through the combined
structure of an ARX model and a neural network classifier. We found the optimal
structure with a second-order ARX model and a neural network with three layers. The
experiment showed a strong correlation between the settler states and the values of
the ARX parameters, so the parameters could be used as effective features for
secondary clarifier monitoring. The training and decision making for pattern
recognition were successfully performed by the neural network classifier. The
proposed method is useful for predicting the SVI value of the secondary settler and
classifying its current state simultaneously. The suggested method can also be used
as a classifier for other processes in the wastewater treatment plant.
Figure 8.1 Schematic diagram of the proposed hierarchy structure
[Block diagram: input, ARX model (SVI prediction and features), neural network
classifier with outputs y1, y2 and y3 (number of hidden nodes evolved by GA:
fitness, reproduction), maximum decision making, class]
Figure 8.2 One-step ahead prediction value of SVI using RLS method
[Plot: SVI (ml/g) versus data index]
Figure 8.3 Sensitivity of the ARX model parameters of each state
[Plot: parameter value versus model parameter for the normal, bad and bulking
states]
Figure 8.4 Power spectrum in each state (a) normal (b) bad (c) bulking state
[Plots: power versus frequency (ω), panels (a) to (c)]
Table 8.1 Confusion matrix of the test data

                      Predicted
True          Normal    Bad    Bulking
Normal           228      9          6
Bad               36     59         17
Bulking            0
IX. Nonlinear Fuzzy PLS Modeling
9.1 Introduction
Statistical data analysis has been widely used in establishing models from
experimental or historical data. Typical problems in multivariate statistical analysis
are high dimensionality and collinearity in a sparse sample data set. The partial least
squares (PLS) modeling method is one of the most useful measures for overcoming
these problems. PLS is a multivariate statistical data analysis and regression method
which uses projection into latent variables to reduce high dimensional and strongly
correlated data to a much smaller data set that can then be interpreted.
The PLS method is used in a variety of areas where multivariate data emerge,
both in the laboratory and in the real world. Typical ‘lab-scale’ examples are
multivariate calibration, and quantitative structure-property and composition-
property relationships. ‘Real world’ examples include the monitoring of industrial
and environmental processes, geochemistry, and clinical, atmospheric, and marine
chemistry. As PLS uses a statistical data reduction and regression algorithm, it is
employed primarily in data analysis (Teppola et al., 1997, 1998; Rosen and Olsson,
1998; Wikström et al., 1998).
Although the original linear PLS (LPLS) regression method provides good
remedial measures to the problems of correlated inputs and limited observations, it
has the major limitation that only linear information can be extracted from data.
Since many practical data are inherently nonlinear, it is desirable to have a robust
method that can model any nonlinear relation. A successful step towards nonlinear
PLS modeling was the quadratic PLS (QPLS) method proposed by Wold et al.
(1989). In QPLS quadratic functions are used for the inner regression in PLS.
However, the nonlinearity of the QPLS method is very limited. To create a PLS
method of greater nonlinearity, several more generic approaches have been
developed such as spline PLS (SPLS), neural networks PLS (NNPLS), and locally
weighted regression PLS (LWR-PLS) (Wold, 1992; Qin and McAvoy, 1992;
Centner and Massart, 1998; Baffi et al., 1999). As their names suggest, SPLS uses
spline inner models and NNPLS uses neural networks inner models. LWR-PLS uses
LPLS as a regression method to build a locally weighted model for every sample. In
general, nonlinear PLS (NLPLS) algorithms use the criterion of minimum regression
error to select inner model parameters. However, the resulting models can suffer
from over-fitting or local minima. In many cases modeling experts can easily detect
these kinds of poor
modeling results by inspecting the PLS score plots. However, correcting NLPLS
models by changing model parameters is not an easy task because the relationship
between model parameters and model shape is not clear, and the models were not
developed taking into consideration the need for this kind of measure. The proposed
FPLS model remedies the shortcomings of NLPLS outlined above.
9.2 Theory
9.2.1 PLS modeling method
Basically, the PLS method is a multivariable linear regression algorithm that
can handle correlated inputs and limited data. The algorithm reduces the dimension
of the predictor variables (input matrix, X) and response variables (output matrix,
Y) by projecting them onto the directions (input weight w and output weight c) that
maximize the covariance between the input and output variables. This projection
decomposes variables of high collinearity into one-dimensional variables (input
score vector t and output score vector u). The decomposition of X and Y by score
vectors is formulated as follows:
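$$\mathbf{X} = \sum_{h=1}^{m} \mathbf{t}_h \mathbf{p}_h^T + \mathbf{E} \tag{9.1}$$

$$\mathbf{Y} = \sum_{h=1}^{m} \mathbf{u}_h \mathbf{q}_h^T + \mathbf{F} \tag{9.2}$$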
where p and q are loading vectors, and E and F are residuals. This relation is known
as the PLS outer relation. The relation between score vectors th and uh is known as
the inner relation.
The original PLS algorithm was developed as a linear regression method that
uses a linear inner relation on the latent space. This LPLS algorithm has many
beneficial properties for use as a data analysis tool. For example, w and c can be
used to find the contributions of different variables to each score, and t and u can be
used to detect outliers. Moreover, the method has a well-developed statistical
foundation and results can be illustrated using biplots that enhance intuition into the
underlying system. However, LPLS is limited to modeling linear relationships, and
the real world is not limited to linear systems. Various nonlinear PLS algorithms
have been proposed to cope with the problems introduced by nonlinearity. However,
each of these approaches has shortcomings such as simplicity, lack of analytical
interpretability of regression coefficients, and so on. The FPLS algorithm proposed
here applies the TSK fuzzy model to the PLS inner regression. This method was
developed because the interpretability of the TSK fuzzy model overcomes some
handicaps of extant nonlinear PLS algorithms.
9.2.2 TSK Fuzzy Modeling
A fuzzy inference system is an effective means of creating models based on
human expertise in a specific application by a selection of fuzzy IF-THEN rules,
which form the key components of the system. Having selected the IF-THEN rules,
fuzzy set theory provides a systematic calculus to deal with information
linguistically, and it performs numerical computation by using linguistic labels
stipulated by membership functions. The fuzzy inference system therefore has the
properties of a structured knowledge representation in the form of fuzzy IF-THEN
rules. This system therefore provides a good framework for applying human
expertise in the construction of inference models.
The fuzzy inference system proposed by Takagi, Sugeno and Kang, known as
the TSK model, provides a powerful tool for modeling complex nonlinear systems
(Yen et al., 1998). Typically, a TSK model consists of IF-THEN rules of the form
$$R_i:\ \text{if } x_1 \text{ is } A_{i1} \text{ and } \cdots \text{ and } x_r \text{ is } A_{ir} \text{ then } y_i = b_{i0} + b_{i1} x_1 + \cdots + b_{ir} x_r, \quad i = 1, 2, \ldots, L \tag{9.3}$$
where L is the number of rules, x = [x1 x2 ⋅⋅⋅ xr]T is the vector of input
variables, yi are local output variables, Aij are fuzzy sets that are characterized
by the membership functions Aij(xj), and bi = [bi0 bi1 ⋅⋅⋅ bir]T are real-valued
parameters. The overall output of the model is computed by
$$y = \frac{\sum_{i=1}^{L} \tau_i y_i}{\sum_{i=1}^{L} \tau_i} = \frac{\sum_{i=1}^{L} \tau_i \left( b_{i0} + b_{i1} x_1 + \cdots + b_{ir} x_r \right)}{\sum_{i=1}^{L} \tau_i} \tag{9.4}$$
where τi is the firing strength of rule Ri, which is defined as
$$\tau_i = A_{i1}(x_1) \times A_{i2}(x_2) \times \cdots \times A_{ir}(x_r) \tag{9.5}$$
Figure 9.1 shows a schematic block diagram of the TSK fuzzy model.
In general, Gaussian-type membership functions are used to build the model.
They are defined by
$$A_{ir}(x_r) = \exp\left( -\frac{(x_r - c_{ir})^2}{2\sigma_i^2} \right), \quad i = 1, 2, \ldots, L \tag{9.6}$$
where cir is the center of the ith Gaussian membership function of the rth input
variable xr and σi is the width of the membership function.
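Equations (9.3) to (9.6) can be evaluated directly. The following sketch is only an illustration: the function name tsk_output, the array layout (one row per rule) and the numerical values of the two-rule example are assumptions and are not taken from the study.

```python
import numpy as np

def tsk_output(x, centers, sigma, b):
    """First-order TSK output for one input vector x.

    centers: (L, r) membership centers c_ij
    sigma:   (L,)   membership widths, one per rule
    b:       (L, r + 1) consequent parameters [b_i0, b_i1, ..., b_ir]
    """
    x = np.asarray(x, dtype=float)
    centers = np.asarray(centers, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    b = np.asarray(b, dtype=float)
    # Gaussian memberships A_ij(x_j), eq. (9.6); one row per rule.
    member = np.exp(-((x - centers) ** 2) / (2.0 * sigma[:, None] ** 2))
    tau = member.prod(axis=1)                 # firing strengths, eq. (9.5)
    y_local = b[:, 0] + b[:, 1:] @ x          # rule consequents, eq. (9.3)
    return float(tau @ y_local / tau.sum())   # weighted average, eq. (9.4)

# Hypothetical two-rule model with one input.
centers = [[-1.0], [1.0]]
sigma = [1.0, 1.0]
b = [[0.0, 1.0], [2.0, -1.0]]
print(tsk_output([0.0], centers, sigma, b))
```

At x = 0 both rules fire equally, so the output is the plain average of the two local linear models; near one center the corresponding rule dominates.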
The TSK model presented above is sometimes called a first-order TSK model,
because it formulates its rules using a first-order polynomial. In general, any
function can be used for the fuzzy rules as long as it can appropriately describe the
output of the model within the fuzzy region specified by the antecedent of the rule.
For example, when the function is a constant it is called a zero-order TSK model.
Moreover, the zero-order TSK model is functionally equivalent to a radial basis
function network. In the present research, we refer only to the first-order TSK model
as the TSK model to avoid complication.
The great advantage of the TSK fuzzy model is its representative power, which
stems from its ability to describe complex nonlinear systems using a small number
of rules. Moreover, the output of the model has an explicit functional form (equation
9.4), and the individual rules give insights into the local behavior of the model. The
good interpretability of the fuzzy system may match the utility of the PLS method in
intuitive data analysis.
9.2.3 Nonlinear FPLS Modeling
Since many practical data are inherently nonlinear, there is a need for a
nonlinear PLS modeling approach which can not only represent any nonlinear
relationship but also attain the robust regression property of the LPLS method. We
propose the FPLS method as such a nonlinear modeling method. The FPLS method
is basically a combination of the PLS method and the TSK fuzzy model. The PLS
outer projection is used as a dimension reduction tool to remove collinearity, and the
TSK fuzzy inner model is used to capture the nonlinearity in the projected latent
space. An advantage of using the TSK fuzzy model as the inner regressor is its
interpretability, which facilitates the design of the FPLS model structure by
allowing human experts to participate in the design process.
The FPLS method differs from the direct TSK fuzzy modeling approach in that
the data are not used directly to train the TSK model, but are preprocessed by the
PLS outer transform. This transformation decomposes the multivariate regression
problem into a few univariate regression problems and simplifies the TSK model.
The TSK method is a type of kernel regression method, where the input variables are
transformed nonlinearly to feature space variables and the transformed data set is
regressed linearly. Well-designed nonlinear transformation procedures usually
reduce the collinearity problem. In the kernel regression method, the choice of
nonlinear transform is related directly to the regression performance. However,
designing an optimal nonlinear transformation for a high dimensional and collinear
data set is very difficult, and the resulting models often suffer from over-fitting
or local minima. The robust data reduction characteristic of the PLS method can
compensate for this problem in the TSK fuzzy modeling method.
In the following subsections we propose the FPLS and IFPLS algorithms. The
basic FPLS algorithm keeps the weight vectors the same as for LPLS, whereas
IFPLS is an extended version of the FPLS algorithm that iteratively updates its
weight vectors according to inner relation functions. The weight updating algorithm
used in the IFPLS algorithm was developed because a complete PLS algorithm
should have an algorithm that updates the weights according to the inner relation.
However, the automatic updating of the weights to meet a certain objective function
diminishes the knowledge-based modeling aspect of FPLS because it automatically
changes the score plots. When such a feature is used, the model that experts judged
to have an appropriate structure for the previous score plot may not be good for the
present plot. Therefore, we do not include the weight update scheme in the modeling
procedure of FPLS. However, as IFPLS is an advanced PLS method that follows the
main stream of NLPLS convention, we present it as an extended FPLS algorithm.
FPLS algorithm
Figure 9.2 shows a schematic of the basic FPLS method, which uses the PLS
outer transform to generate score variables from the data. Score vectors (th and uh) of
the same factor h are used to train the inner TSK fuzzy model fh(⋅), which obeys the
following relation
$$\mathbf{u}_h = f_h(\mathbf{t}_h) + \mathbf{e}_h \tag{9.7}$$
where eh represents the regression error. The parameters of fh(⋅) should be selected to
minimize eh without over-fitting. To summarize, by not updating the outer relation
FPLS keeps the LPLS property that variables are projected into the directions
maximizing the covariance, and it captures nonlinearity through the large modeling
capacity of the TSK model.
The proposed FPLS algorithm can be formulated as follows.
1. Scale X and Y to have zero-mean and unit-variance.
Let E0 = X, F0 = Y and h = 1.
2. For each factor h, take uh from one of the columns of Fh-1.
3. PLS outer transform:
)(1 hT
hhT
hT
h / uuuw −= E (9.8)
hhh / www = (9.9)
hhh wt 1−= E (9.10)
)(1 hT
hhT
hT
h / tttc −= F (9.11)
hhh / ccc = (9.12)
hhh cu 1−= F (9.13)
Iterate this step until it converges. This step is called the nonlinear iterative
partial least squares (NIPALS) algorithm. Although there exists a faster and
more stable algorithm using eigenvectors (Höskuldsson, 1988), we use
NIPALS to give readers a clearer picture of PLS outer projection.
4. Find the TSK fuzzy-type inner relation function, fh(⋅), which predicts the
output score uh with the input score th. fh(⋅) has the functional form
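$$f_h(t) = \sum_{i=1}^{L} G_i \left( b_{i0} + b_{i1} t \right) \tag{9.14}$$
where
$$G_i = \frac{\tau_i}{\sum_{j=1}^{L} \tau_j} \tag{9.15}$$
$$\tau_i(t) = \exp\left( -\frac{(t - c_i)^2}{2\sigma_i^2} \right), \quad i = 1, 2, \ldots, L \tag{9.16}$$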
Gi is the normalized firing strength and τi is a Gaussian-type firing strength
for the ith rule. First, the number of fuzzy rules, L, should be estimated by
the model designer at an integer value that minimizes the regression error of
fh(⋅) without creating an over-fitted model. The designer may use intuition
gained from the score plot or some numerical criteria such as the sum of
squared errors (SSE) for cross validation. The designer can then decide the
other parameters, such as ci, σi and bi, using a numerical curve fitting
function to minimize the SSE.
5. Calculate the X and Y loadings
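$$\mathbf{p}_h^T = \mathbf{t}_h^T \mathbf{E}_{h-1} / (\mathbf{t}_h^T \mathbf{t}_h) \tag{9.17}$$
$$\mathbf{q}_h^T = \hat{\mathbf{u}}_h^T \mathbf{F}_{h-1} / (\hat{\mathbf{u}}_h^T \hat{\mathbf{u}}_h) \tag{9.18}$$
where $\hat{\mathbf{u}}_h = f_h(\mathbf{t}_h) = [f_h(t_h(1)), f_h(t_h(2)), \ldots, f_h(t_h(N))]^T$ for N samples.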
6. Calculate the residuals for factor h.
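$$\mathbf{E}_h = \mathbf{E}_{h-1} - \mathbf{t}_h \mathbf{p}_h^T \tag{9.19}$$
$$\mathbf{F}_h = \mathbf{F}_{h-1} - \mathbf{u}_h \mathbf{q}_h^T \tag{9.20}$$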
7. Let h = h + 1, then return to step 2 until all m principal factors are
calculated. The number of factors m is decided by the designer. The
designer may use intuition gained from the score plot or some numerical
criteria such as SSE for cross validation.
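Steps 2 and 3 of the algorithm, the NIPALS outer transform of equations (9.8) to (9.13), can be sketched for a single factor as follows. The function name, the convergence tolerance and the synthetic data are assumptions made for illustration; the inner TSK model and the deflation steps are not included.

```python
import numpy as np

def outer_transform(E, F, tol=1e-12, max_iter=1000):
    """NIPALS outer transform for one factor, eqs. (9.8) to (9.13)."""
    u = F[:, 0].copy()                      # step 2: start from a column of F
    for _ in range(max_iter):
        w = E.T @ u / (u @ u)               # (9.8)
        w /= np.linalg.norm(w)              # (9.9)
        t = E @ w                           # (9.10)
        c = F.T @ t / (t @ t)               # (9.11)
        c /= np.linalg.norm(c)              # (9.12)
        u_new = F @ c                       # (9.13)
        converged = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
        u = u_new
        if converged:
            break
    return w, t, c, u

# Synthetic data (hypothetical): 4 inputs, 2 correlated outputs, 60 samples.
rng = np.random.default_rng(1)
X = rng.standard_normal((60, 4))
B = np.array([[1.0, 1.0], [-1.0, -0.5], [0.5, 0.0], [0.0, 0.5]])
Y = X @ B + 0.1 * rng.standard_normal((60, 2))

# Step 1: scale to zero mean and unit variance.
E0 = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
F0 = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)

w, t, c, u = outer_transform(E0, F0)
print(np.round(w, 3), np.round(c, 3))
```

At convergence w is the direction in the input space whose score t = E w has maximal covariance with the output score u = F c, which is the property that FPLS keeps by not updating the outer relation.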
The parameters of fh(⋅) can be decided by various heuristics. In this research,
the initial values of ci, σi and bi are decided using the fuzzy c-means (FCM)
algorithm (Jang et al., 1997), Moody and Darken’s (M&D) rule (Moody and Darken,
1989) and the global learning procedure (Yen et al., 1998) (see the Appendix for the
mathematical formulations of these methods). Then a numerical nonlinear least
squares curve fitting function is applied for the optimization of the parameters with
the objective function of minimizing the SSE. However, if the optimized model shows
signs of over-fitting such as very steep changes in its trend, the designer can change
and fix some parameters and then optimize the other parameters to make a smoother
and more reliable model within the criteria of his or her expertise.
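One part of this procedure is easy to illustrate: once the centers ci and widths σi are fixed, the inner model of equation (9.14) is linear in bi0 and bi1, so these parameters can be obtained by ordinary least squares. The sketch below shows this idea only; it is an assumption that this matches the global learning step in spirit, the formulation in the Appendix may differ, and the function names and test data are hypothetical.

```python
import numpy as np

def normalized_firing(t, c, sigma):
    """G_i(t) of eqs. (9.15) and (9.16); returns an (N, L) array."""
    tau = np.exp(-((t[:, None] - c[None, :]) ** 2) / (2.0 * sigma[None, :] ** 2))
    return tau / tau.sum(axis=1, keepdims=True)

def fit_consequents(t, u, c, sigma):
    """Least-squares estimate of b_i0 and b_i1 for fixed centers and widths."""
    G = normalized_firing(t, c, sigma)
    A = np.hstack([G, G * t[:, None]])      # columns: G_i and G_i * t
    coef, *_ = np.linalg.lstsq(A, u, rcond=None)
    L = len(c)
    return coef[:L], coef[L:]

def inner_model(t, c, sigma, b0, b1):
    """Evaluate f_h(t) of eq. (9.14) for a vector of scores t."""
    G = normalized_firing(t, c, sigma)
    return (G * (b0 + b1 * t[:, None])).sum(axis=1)

# Hypothetical score data with a saturating shape and two fuzzy rules.
t = np.linspace(-2.0, 2.0, 81)
u = np.tanh(2.0 * t)
c = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])
b0, b1 = fit_consequents(t, u, c, sigma)
sse = float(np.sum((inner_model(t, c, sigma, b0, b1) - u) ** 2))
print(np.round(b0, 3), np.round(b1, 3), round(sse, 4))
```

A designer who fixes some centers or widths by hand, as described above, only needs to repeat this linear fit for the remaining parameters.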
As is shown in the algorithm, the designer’s decisions are emphasized in the
calibration of a FPLS model. This aspect of FPLS represents an improvement over
other PLS algorithms. Generally, structural parameters such as L and m are selected
using cross validation method to avoid the problem of over-fitting. Cross validation
is mandatory for high dimensional models, because the model shape cannot be well
presented in visible form. Although the fuzzy modeling process gives particular
weight to the application of the expert’s knowledge in the modeling process, it is
also hindered by the problem of high dimensionality. Regardless of the type of
modeling, designers should check the validity of their model. The FPLS method aids
designers in model validation by providing a simple modeling interface for visual
checking, in addition to the typical cross validation method. The visual check
comprises checks of the error correlation, high leverage data treatment, local
minima, over-fitting and under-fitting. Checking using visualization is possible
because of the robust data reduction and the two-dimensional presentation properties
of PLS. Other PLS methods such as LPLS and NNPLS also have these properties,
but they lack the interpretability and high nonlinear regression capacity of the TSK
inner relation function. The fuzzy rules of the TSK function provide insights into the
model that allow us to make a simple linear expectation of its behavior even in the
extrapolation range and to interactively change its parameters. These capabilities
make FPLS a promising modeling and monitoring method.
Iterative FPLS (IFPLS) algorithm
NLPLS algorithms include weight update methods. The weight update methods
can be classified as follows. The first approach is not to update any weight. In this
scheme, no iterative weight update procedure is used on either the input or the
output weight. So, inner relation functions are calibrated only one time for each
latent factor. FPLS uses this method. As a result the first score and weight vectors
remain the same as that of LPLS; however, the later vectors change because the
reduction of X and Y changes depending on the nonlinear regression performance.
The advantages of this method are that the direction of the weights remains in the
direction of maximizing covariance and that the calculation time is short. The
second method is to use fixed input weights, and update output weights and inner
relation functions iteratively, where a weight update method similar to that of the
NIPALS algorithm is used (Qin and McAvoy, 1992). As a result only the first input
score and the first input weight vectors remain the same as those of LPLS. This
method goes half way toward fitting the weights to the inner relation function. On
the other hand, it avoids the problem of updating the input weights, which can be
controversial. The third approach is to update both the input and output weights
iteratively (Wold et al., 1989; Baffi et al., 1999). While the output weights are
updated using the same method as that of the second method described above, the
input weights are updated iteratively to the vectors that minimize the regression
SSE of each inner relation function decided at the previous iteration, where
numerical techniques are usually used to find the optimum input weights. However,
this method shows an obvious problem when applied to rank deficient data sets.
When the input dimension is larger than the number of samples, attempts to find the
w which minimizes the SSE of u = f(Xw) can yield an infinite number of w's
which give the same minimum SSE. Numerical techniques give one of these
solutions. This kind of weight update method does not have the robust dimension
reduction properties of PLS. Models of this kind typically capture a very large
y-variance but a very small x-variance. The fourth approach involves the
simultaneous reduction of x- and y-variances along with the updating of both the
input and output weights. Wold et al. (1992) proposed a method of this kind that
uses the same output weight update method as the second and third methods outlined
above, but which updates the input weights using a correlation related algorithm.
This method has the basic PLS principle in mind, which places equal emphasis on
the approximation of X and on the correlation between X and Y. However, the
algorithm of Wold et al. (1989) is not as balanced as NIPALS.
In this research we propose a new weight update method. The idea behind this
method is to apply the same update scheme to the input and output weights. Our
approach uses the weight update method that has been used only on output weights
in the past. This is achieved by defining a backward inner relation function, g(⋅),
where a TSK model is used for the functional form of g(⋅). The core of the algorithm
is as follows.
1. Initialize PLS parameters using LPLS.
2. Update the parameters.
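$$\hat{\mathbf{u}} = f(\mathbf{t}) \tag{9.21}$$
$$\mathbf{c}^T = \hat{\mathbf{u}}^T \mathbf{Y} / (\hat{\mathbf{u}}^T \hat{\mathbf{u}}) \tag{9.22}$$
$$\mathbf{c} = \mathbf{c} / \lVert \mathbf{c} \rVert \tag{9.23}$$
$$\mathbf{u} = \mathbf{Y} \mathbf{c} \tag{9.24}$$
$$\hat{\mathbf{t}} = g(\mathbf{u}) \tag{9.25}$$
$$\mathbf{w}^T = \hat{\mathbf{t}}^T \mathbf{X} / (\hat{\mathbf{t}}^T \hat{\mathbf{t}}) \tag{9.26}$$
$$\mathbf{w} = \mathbf{w} / \lVert \mathbf{w} \rVert \tag{9.27}$$
$$\mathbf{t} = \mathbf{X} \mathbf{w} \tag{9.28}$$
where f(⋅) minimizes the SSE between $\mathbf{u}$ and $\hat{\mathbf{u}}$, and g(⋅) minimizes the SSE
between $\mathbf{t}$ and $\hat{\mathbf{t}}$.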
Iterate this step until convergence is reached.
This is a nonlinear extension of the NIPALS algorithm that reduces to LPLS
when the inner relation functions are first order polynomials with no constant term.
The function g(⋅) is used only in the training process not in the prediction procedure.
The use of g(⋅) achieves balanced reductions of X and Y. This algorithm can be
applied to other PLS algorithms with minor changes.
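The update loop of equations (9.21) to (9.28) can be sketched with the inner relations passed in as fitting functions. For brevity the sketch uses first-order polynomials with no constant term for both f(⋅) and g(⋅), which is the case in which the scheme is stated to reduce to LPLS; a TSK fit could be substituted. The function names, the initialization and the synthetic data are assumptions.

```python
import numpy as np

def linear_fit(a, b):
    """Inner relation with no constant term: fitted values of b given a."""
    beta = (a @ b) / (a @ a)
    return beta * a

def ifpls_update(X, Y, t, u, fit_f, fit_g, tol=1e-12, max_iter=1000):
    """Iterate eqs. (9.21) to (9.28) for one factor until t stops changing."""
    for _ in range(max_iter):
        u_hat = fit_f(t, u)                     # (9.21)
        c = Y.T @ u_hat / (u_hat @ u_hat)       # (9.22)
        c /= np.linalg.norm(c)                  # (9.23)
        u = Y @ c                               # (9.24)
        t_hat = fit_g(u, t)                     # (9.25)
        w = X.T @ t_hat / (t_hat @ t_hat)       # (9.26)
        w /= np.linalg.norm(w)                  # (9.27)
        t_new = X @ w                           # (9.28)
        converged = np.linalg.norm(t_new - t) <= tol * np.linalg.norm(t_new)
        t = t_new
        if converged:
            break
    return w, c, t, u

# Synthetic, scaled data (hypothetical): 4 inputs, 2 correlated outputs.
rng = np.random.default_rng(2)
X = rng.standard_normal((60, 4))
B = np.array([[1.0, 1.0], [-1.0, -0.5], [0.5, 0.0], [0.0, 0.5]])
Y = X @ B + 0.1 * rng.standard_normal((60, 2))
X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)

# Step 1 (simplified): start from the first Y column and its input weight.
u0 = Y[:, 0].copy()
w0 = X.T @ u0
t0 = X @ (w0 / np.linalg.norm(w0))

w_if, c_if, t_if, u_if = ifpls_update(X, Y, t0, u0, linear_fit, linear_fit)
print(np.round(w_if, 3))
```

With these linear inner relations the input weight converges to the same direction as the linear PLS weight, which gives a simple way to check an implementation before a nonlinear inner model is plugged in.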
IFPLS is an extended version of FPLS that uses the weight update algorithm
given in equations 9.21-9.28. In the IFPLS modeling, the number of fuzzy rules can
be decided using the method employed for FPLS. However, decisions based on
intuition lose meaning because the score plots change throughout the iteration
process. Therefore, the use of the cross validation method is recommended for
IFPLS.
Prediction method with FPLS model
FPLS and IFPLS models trained on a calibration data set are both identified by
scaling information, outer projection vectors and inner relation parameters, i.e., the
means and variances of the calibration data sets X0 and Y0, loading vectors p and q,
input weight vector w, the number of fuzzy rules L, the center of the membership
function c = {c1, c2, ⋅⋅⋅, cL}, the width of the membership function σ = {σ1, σ2, ⋅⋅⋅,
σL} and the linear regression coefficient b = {b1, b2, ⋅⋅⋅, bL} of fuzzy rules for all the
factors m under consideration. Let us denote the outer projection vectors of the m
factors by matrix form, i.e., P, Q and W. Then, for a new input data set X the output
data set Y can be predicted using the following steps.
1. Scale X by the mean and variance of X0.
2. Calculate the input score matrix
$$\mathbf{T} = \mathbf{X} \mathbf{W} (\mathbf{P}^T \mathbf{W})^{-1} \tag{9.29}$$
where T = [t1, t2, ⋅⋅⋅, tm]
3. Predict output score vectors using the TSK inner model defined in equation
(9.14), with ch, σh and bh for each factor h.
$$\hat{\mathbf{u}}_h = f_h(\mathbf{t}_h) \tag{9.30}$$
4. Predict the scaled Y
$$\hat{\mathbf{Y}} = \hat{\mathbf{U}} \mathbf{Q}^T \tag{9.31}$$
where $\hat{\mathbf{U}} = [\hat{\mathbf{u}}_1, \hat{\mathbf{u}}_2, \ldots, \hat{\mathbf{u}}_m]$.
5. Rescale Y by the mean and variance of Y0
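The five prediction steps can be written compactly as below. This is a sketch under assumptions: the function signature is hypothetical, scaling is done with the standard deviation, and the inner models are passed as a list of callables (one per factor) that stand in for trained TSK models.

```python
import numpy as np

def fpls_predict(X_new, x_mean, x_std, y_mean, y_std, W, P, Q, inner_models):
    """Predict Y for new inputs with a trained FPLS (or IFPLS) model."""
    Xs = (X_new - x_mean) / x_std                    # step 1: scale X
    T = Xs @ W @ np.linalg.inv(P.T @ W)              # step 2: eq. (9.29)
    U_hat = np.column_stack(                         # step 3: eq. (9.30)
        [f(T[:, h]) for h, f in enumerate(inner_models)]
    )
    Ys = U_hat @ Q.T                                 # step 4: eq. (9.31)
    return Ys * y_std + y_mean                       # step 5: rescale Y

# Hypothetical one-factor model with two inputs and one output.
W = np.array([[1.0], [0.0]])
P = np.array([[1.0], [0.0]])
Q = np.array([[1.0]])
inner_models = [lambda t: 2.0 * t]   # stand-in for a trained TSK inner model
Y_pred = fpls_predict(
    np.array([[1.0, 5.0], [2.0, 7.0]]),
    x_mean=np.zeros(2), x_std=np.ones(2),
    y_mean=np.array([10.0]), y_std=np.array([2.0]),
    W=W, P=P, Q=Q, inner_models=inner_models,
)
print(Y_pred)
```

Only the forward inner function of each factor is needed at prediction time, so an IFPLS model is applied in exactly the same way.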
Using the PLS outer relation and the TSK fuzzy-type inner model, the FPLS
method is capable of robustly describing any complex nonlinear system and
provides informative biplots. Because FPLS uses the outer relation of PLS, the
analytical meaning of the outer projection vectors remains valid. Hence, various PLS
monitoring methods are still applicable to FPLS. Moreover, the interpretation based
on fuzzy rules gives a new way of monitoring nonlinear systems. For example,
each sample of a system modeled by FPLS can be classified according to the fuzzy
rule that has the largest firing strength value on it.
9.3 Results and Discussion
The proposed FPLS algorithm is applied to two data sets. First, a simulation data
set from the benchmark plant is considered, followed by real data from the BET
plant. The TSK fuzzy models were built using a nonlinear least squares optimization
function (Bang et al., 2001), whose initial point is determined by the FCM
clustering algorithm for identifying the center locations, the P-nearest
neighborhood method for deciding the widths and the global learning procedure for
determining the parameters of the fuzzy rules (see the Appendix). The prediction
performance of FPLS is compared with that of LPLS and QPLS.
Simulation benchmark
The eight variables used to build the X-block in the simulation benchmark were the
influent ammonia concentration (SNH,in), influent flow rate (Qin), nitrate
concentration in the second aerator (SNO,2), total suspended solids in aerator 4
(TSS4), DO concentrations in aerators 3 and 4 (SO,3, SO,4), oxygen transfer
coefficient in aerator 5 (KLa5), and internal recirculation rate (Qint). The quality
variables are the effluent ammonia and nitrate, SNH,e and SNO,e. We used 14 days of
normal operation data generated by the benchmark: the model was trained on one week
of normal dry-weather operation and validated on the data of the last 7 days.
Because we are interested in the normal operating condition in this research, the
training model was based on normal operation, and time lags between the input and
output were not considered, to avoid complication.
The results of the three PLS models are presented in Table 9.1, where four LVs
are selected for each PLS model. Figure 9.3 shows the scatter plots and firing
strengths of the FPLS model. In each score plot, the small circle marks the center ci
of a firing strength function shown in the lower plot, and the dashed line crossing the
circle is the corresponding fuzzy rule. In the lower plot, the solid lines represent the
firing strength τi and the dashed lines represent the normalized firing strength Gi.
These plots clearly show the nonlinear nature of the benchmark plant. LPLS offers no
direct way to cope with this nonlinearity, whereas FPLS provides a direct and
interactive way of treating it. To decide the number of fuzzy rules, we tried various
numbers of fuzzy rules and heuristic rules for each LV. We found that a ‘2-2-1-1’
allocation of fuzzy rules to the four LVs, with the centers of the fuzzy rules of the first
LV fixed by FCM, gave the best regression performance on the training and validation
data sets. The score plots of the third and fourth LVs showed almost no nonlinearity;
hence, we used only one fuzzy rule for each of these LVs. Compared with other NLPLS
methods, the FPLS model offers a visual and interactive design capability that can
treat such nonlinearities and avoid overfitting.
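The relation between the firing strength τi and the normalized firing strength Gi plotted in Figure 9.3 can be sketched as follows. This is a minimal illustration that assumes Gaussian membership functions with center ci and width σi (the membership type named in the Appendix) in the common form exp(−(t − ci)²/(2σi²)); the function names and the numerical centers and widths are hypothetical and are not the values fitted in this study.

```python
import math

def firing_strengths(t, centers, widths):
    """Gaussian firing strength tau_i of each fuzzy rule at score value t."""
    return [math.exp(-0.5 * ((t - c) / s) ** 2) for c, s in zip(centers, widths)]

def normalized_firing_strengths(t, centers, widths):
    """Normalized firing strength G_i = tau_i / sum_k tau_k (the G_i sum to 1)."""
    tau = firing_strengths(t, centers, widths)
    total = sum(tau)
    return [x / total for x in tau]

# Hypothetical two-rule setting for one latent variable (illustrative values only).
centers = [-1.0, 1.5]
widths = [1.0, 1.0]
for t in (-2.0, 0.25, 2.0):
    G = normalized_firing_strengths(t, centers, widths)
    print(t, [round(g, 3) for g in G])
```

With a single rule, as used here for the third and fourth LVs, the normalized firing strength is identically 1 and the inner relation reduces to one linear model.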
Table 9.1 lists the percent variance captured (%) on the training data and the mean
squared error (MSE) on the validation data set for the three PLS models in the
benchmark, and thus summarizes the regression performance of all PLS models. The
X-block variances explained by the LPLS, QPLS and FPLS models do not differ
appreciably, whereas the Y-variance captured by the FPLS model (80.00%) is larger
than that of the LPLS (72.25%) and QPLS (78.66%) models. The MSE on the validation
data set (0.60, 0.46 and 0.44 for LPLS, QPLS and FPLS, respectively) shows that the
best prediction performance is achieved by the FPLS method. Figures 9.4 and 9.5 show
the prediction results for SNH,e and SNO,e in the validation data set for the LPLS and
FPLS methods. The time series plots and scatter plots illustrate the prediction
improvements that are achievable through the fuzzy regression approach, and the
scatter plots confirm the modeling capability of FPLS.
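The two performance measures reported in Table 9.1 can be computed as in the sketch below. It assumes the usual definitions (MSE as the mean of squared prediction errors, and percent variance captured as 100·(1 − residual sum of squares / total sum of squares about the mean)); whether the tabulated values were computed on scaled or raw variables is not stated in the text, and the function names and sample numbers are illustrative only.

```python
def mse(actual, predicted):
    """Mean squared error between measured and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def percent_variance_captured(actual, predicted):
    """100 * (1 - SS_residual / SS_total), with SS_total taken about the mean."""
    mean = sum(actual) / len(actual)
    ss_total = sum((a - mean) ** 2 for a in actual)
    ss_resid = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 100.0 * (1.0 - ss_resid / ss_total)

# Made-up values for illustration (not benchmark data).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
print(mse(actual, predicted))
print(percent_variance_captured(actual, predicted))
```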
These results are not surprising, because the FPLS model is designed to capture the
main variability of the training data set and the validation data set was generated with
statistical properties similar to those of the training data. However, the above results
are valid only for the normal data set. In other situations, such as other disturbance
cases, other models may perform better than the FPLS model. The best model
structure depends on the situation and on the purpose of the model.
Full-scale WWTP
The process data were collected from a biological WWTP (BET) that treats
the coke wastewater of an iron and steel making plant in Korea (Figure 2.3).
Twelve process and manipulated variables (the X block) were used to model three
process output variables (the Y block). The Y block consists of the sludge volume
index (SVI), the reduction of cyanide (∆CN), and the reduction of COD (∆COD).
Table 2.1 describes the process variables and presents the mean and standard
deviation (SD) values of the X and Y blocks. The process data consisted of daily mean
values from 1 January, 1998 to 9 November, 2000, with a total of 1034 observations.
Of these, 720 observations were used for the calibration of the PLS models, with the
odd-numbered samples forming the training set and the even-numbered samples the
validation set. The remaining 314 observations were used as a test data set.
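The partition of the 1034 daily observations can be sketched as follows. The sketch assumes that the 720 calibration samples are the chronologically first ones and that samples are numbered from 1; the text only states the set sizes and the odd/even rule, so the ordering and the function name `split_bet_data` are assumptions made for illustration.

```python
def split_bet_data(observations, n_calibration=720):
    """Split samples into training (odd-numbered), validation (even-numbered) and test."""
    calibration = observations[:n_calibration]
    test = observations[n_calibration:]
    # Samples are numbered from 1, so odd-numbered samples sit at even 0-based indices.
    training = calibration[0::2]
    validation = calibration[1::2]
    return training, validation, test

# Stand-in sample numbers 1..1034 in place of the real daily records.
samples = list(range(1, 1035))
training, validation, test = split_bet_data(samples)
print(len(training), len(validation), len(test))
```

Interleaving the training and validation samples gives both sets the same seasonal coverage, while the held-out final block tests the model on a later operating period.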
The results of the three PLS models are presented in Table 9.2, where six LVs
are selected for each PLS model. Figure 9.6 shows the scatter plots and firing
strengths of the FPLS model with six LVs (the fifth and sixth LVs are not shown).
Contrary to expectation, the data from BET showed no obvious nonlinearity. However,
we did find some nonlinear characteristics in the second LV, which led us to use three
fuzzy rules for this factor. The first LV and the third and later LVs showed almost no
nonlinearity; hence, one fuzzy rule was used for each of these LVs. To avoid
complication, we did not consider the nonlinearity of these factors further.
The X- and Y-variances captured by the FPLS model are larger than those
of the LPLS and QPLS models, and the MSE on the validation data is also
lowest for FPLS. Contrary to our expectation, however, the MSE on the test data set
indicates that LPLS and QPLS have better prediction performance than FPLS.
During the test period, the WWTP received a large influent load and experienced
a large change in operating conditions. These process transitions altered the types of
microorganisms and the sludge, which changed the process dynamics in BET. Because
the FPLS model is designed to capture the nonlinear behavior and statistical properties
of the training data set, it showed inferior prediction results in these
disturbance cases. Figures 9.7 and 9.8 show the time series and scatter plots of the
actual and predicted values with the LPLS and FPLS models during the validation
period. The prediction performance for the COD and CN reductions is satisfactory, but
the prediction of the SVI of the secondary settler is not as good as those of the other
process quality variables. LPLS and FPLS show similar prediction performance.
We also performed several experiments comparing FPLS with the other
NLPLS models and concluded that FPLS shows regression performance similar to
that of the other NLPLS models. However, it is difficult to make a fair comparison
between models, because each algorithm has its own characteristics. For this reason
we do not present a detailed comparison between models; instead, we outline below
the differences between FPLS and the other NLPLS methods in two respects.
First, inner relation models of FPLS usually take on gentler curvature than
those of other NLPLS, as they are locally weighted averages of linear fuzzy rules
and model designers would not favor highly nonlinear shapes of inner relation
models whose variables are the results of linear computations. In contrast, other
NLPLS models can take on any nonlinear shape to minimize the SSE, providing this
shape is permitted by cross validation. If an FPLS model were built by referring only to
the cross-validation result, with no input from the experts, it could have greater
curvature. Hence, whether to use a conservative model or an SSE-minimizing model
ultimately depends on the experts’ decision.
Second, the number of regression parameters estimated for each NLPLS
inner model depends on a few structural parameters, such as the order of the
polynomial for QPLS, the order of the polynomials and the number of knots for SPLS,
the number of neurons for NNPLS and the number of rules for FPLS. Suitable values
also vary with the nonlinearity of the modeled system. If the value of a structural
parameter is increased, the regression SSE of the model decreases and the model
takes on a more nonlinear shape. Because these structural parameters have
different physical meanings, their values cannot be compared directly between
NLPLS methods. However, if the values are the same, FPLS generally uses more
parameters than other NLPLS methods. For example, if the structural parameter
is L for both NNPLS and FPLS, an inner model of NNPLS needs 2L + 1 regression
parameters for the input and output weights of the neurons plus a bias term, whereas
that of FPLS needs 4L parameters for c, σ and b. However, this does not mean that
FPLS is a more complex model to interpret. Because FPLS analyzes the system using
submodels represented by fuzzy rules, the 2L parameters used for b help to build
the submodels and the 2L parameters used for c and σ help to interpret
the relationship between the input data and the submodels. Therefore, although
FPLS uses more regression parameters than other NLPLS methods for the same
structural parameter, its strength as an informative model makes it rate highly
among the elemental NLPLS methods.
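As a worked instance of the parameter count above, take L = 3 as an arbitrary illustrative value (not a setting used in this study). Each FPLS rule contributes one center c, one width σ and two consequent parameters b:

```latex
\begin{aligned}
\text{NNPLS:}\quad & 2L + 1 = 2(3) + 1 = 7 \\
\text{FPLS:}\quad  & 4L = \underbrace{3}_{c} + \underbrace{3}_{\sigma} + \underbrace{6}_{b} = 12
\end{aligned}
```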
9.4 Conclusions
In this research we proposed a fuzzy PLS model and presented experimental
results showing the application of this algorithm. The proposed model uses a PLS
framework, which gives the model robust regression performance when used on
high dimensional and collinear data. Moreover, as the model uses TSK fuzzy models,
it can represent highly nonlinear systems. Most importantly, the proposed model has
higher interpretability than any other NLPLS modeling method, creating a modeling
environment that is favorable to the use of experts’ knowledge. The interpretability
of the FPLS model is embodied in the following elements. First, the model can be
presented in intuitively simple biplots; second, the fuzzy rules of the TSK function
provide insight into the model; and finally, the effects of the fuzzy rules can be
estimated using plots of the firing strength. Another property that distinguishes the
FPLS model from other NLPLS models is that the TSK fuzzy model is a
combination of linear submodels. This feature allows the FPLS model to provide
more stable estimates of the output on extrapolation. Using these properties, a model
designer can interactively revise the FPLS model and construct a more robust nonlinear
model with fewer instances of local minima and overfitting.
9.5 Appendix
A1. Fuzzy c-means (FCM) algorithm
The center of a Gaussian-type membership function, ci, can be decided by using
the FCM algorithm, that is
$$ c_i = \frac{\sum_{j=1}^{N} \mu_{ij}^{2}\, t_j}{\sum_{j=1}^{N} \mu_{ij}^{2}}, \quad i = 1, 2, \ldots, L \tag{9.32} $$
where
$$ \mu_{ij} = \frac{1}{\sum_{k=1}^{L} \left( \dfrac{t_j - c_i}{t_j - c_k} \right)^{2}} \tag{9.33} $$
is a membership grade.
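Equations (9.32) and (9.33) can be iterated alternately until the centers stop moving. The sketch below does this for scalar scores tj. It is a minimal illustration, not the implementation used in this study: the small `eps` guard against a score coinciding with a center, the stopping tolerance, the function names and the sample data are all assumptions added for the example.

```python
def memberships(scores, centers, eps=1e-12):
    """mu[i][j]: membership grade of score t_j in cluster i, Eq. (9.33)."""
    mu = [[0.0] * len(scores) for _ in centers]
    for j, t in enumerate(scores):
        # 1/d_i^2 divided by sum_k 1/d_k^2 equals 1 / sum_k (d_i/d_k)^2.
        inv = [1.0 / ((t - c) ** 2 + eps) for c in centers]
        total = sum(inv)
        for i, v in enumerate(inv):
            mu[i][j] = v / total
    return mu

def update_centers(scores, mu):
    """c_i = sum_j mu_ij^2 t_j / sum_j mu_ij^2, Eq. (9.32)."""
    return [sum(m * m * t for m, t in zip(row, scores)) / sum(m * m for m in row)
            for row in mu]

def fcm_1d(scores, init_centers, n_iter=200, tol=1e-10):
    """Alternate Eqs. (9.33) and (9.32) until the largest center shift is below tol."""
    centers = list(init_centers)
    for _ in range(n_iter):
        new_centers = update_centers(scores, memberships(scores, centers))
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    return centers

# Made-up scores forming two groups around -2 and +2.
scores = [-2.1, -1.9, -2.0, 1.9, 2.0, 2.1]
centers = fcm_1d(scores, [-1.0, 1.0])
print([round(c, 3) for c in centers])
```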
A2. Moody & Darken’s rule
The width of a Gaussian-type membership function, σi, can be decided by using
the P-nearest neighbor heuristic suggested by Moody and Darken (1989), that
is
$$ \sigma_i = \left[ \frac{1}{p} \sum_{l=1}^{p} \left( c_i - c_l \right)^{2} \right]^{1/2}, \quad i = 1, 2, \ldots, L \tag{9.34} $$
where cl (l = 1, 2, ⋅⋅⋅, p) are the p (typically p = 2) nearest neighbors of the
center ci.
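Equation (9.34) amounts to a root-mean-square distance from each center to its p nearest neighboring centers. The following sketch shows this for scalar centers. The function name is hypothetical, and two behaviors are assumptions of the example rather than part of the heuristic as stated: when fewer than p other centers exist, all of them are used, and when only one center exists an error is raised because the heuristic gives no width.

```python
import math

def moody_darken_widths(centers, p=2):
    """Width sigma_i as the RMS distance from c_i to its p nearest centers, Eq. (9.34)."""
    if len(centers) < 2:
        raise ValueError("the heuristic needs at least two centers")
    widths = []
    for i, ci in enumerate(centers):
        # Squared distances to every other center, smallest first.
        d2 = sorted((ci - cl) ** 2 for l, cl in enumerate(centers) if l != i)
        nearest = d2[:p]
        widths.append(math.sqrt(sum(nearest) / len(nearest)))
    return widths

# Made-up centers for illustration.
widths = moody_darken_widths([0.0, 1.0, 3.0], p=2)
print([round(w, 3) for w in widths])
```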
A3. Global learning algorithm
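The parameters, bi, of a fuzzy rule can be decided by using a global learning
algorithm. Global learning chooses the parameters of fuzzy rules that minimize the
objective function JG:
$$ J_G = \sum_{k=1}^{N} \left[ u(k) - \hat{u}(k) \right]^{2} \tag{9.35} $$
Equation (9.35) can be rearranged into a simple matrix form:
$$ J_G = \left( \mathbf{u}_G - \mathbf{T}_G \mathbf{b}_G \right)^{T} \left( \mathbf{u}_G - \mathbf{T}_G \mathbf{b}_G \right) \tag{9.36} $$
where
$$ \mathbf{u}_G = \mathbf{u} = \begin{bmatrix} u(1) & u(2) & \cdots & u(N) \end{bmatrix}^{T} \in \Re^{N \times 1} $$
$$ \mathbf{T}_G = \begin{bmatrix}
G_1(1) & G_1(1)t(1) & G_2(1) & G_2(1)t(1) & \cdots & G_L(1) & G_L(1)t(1) \\
G_1(2) & G_1(2)t(2) & G_2(2) & G_2(2)t(2) & \cdots & G_L(2) & G_L(2)t(2) \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots \\
G_1(N) & G_1(N)t(N) & G_2(N) & G_2(N)t(N) & \cdots & G_L(N) & G_L(N)t(N)
\end{bmatrix} \in \Re^{N \times 2L} \tag{9.37} $$
$$ \mathbf{b}_G = \begin{bmatrix} b_{10} & b_{11} & b_{20} & b_{21} & \cdots & b_{L0} & b_{L1} \end{bmatrix}^{T} \in \Re^{2L \times 1} \tag{9.38} $$
Applying singular value decomposition (SVD) to TG yields
$$ \mathbf{T}_G = \tilde{\mathbf{U}} \tilde{\boldsymbol{\Sigma}} \tilde{\mathbf{V}}^{T} \tag{9.39} $$
where
$$ \tilde{\mathbf{U}} = \begin{bmatrix} \tilde{\mathbf{u}}_1 & \tilde{\mathbf{u}}_2 & \cdots & \tilde{\mathbf{u}}_N \end{bmatrix} \in \Re^{N \times N} \tag{9.40} $$
$$ \tilde{\mathbf{V}} = \begin{bmatrix} \tilde{\mathbf{v}}_1 & \tilde{\mathbf{v}}_2 & \cdots & \tilde{\mathbf{v}}_{2L} \end{bmatrix} \in \Re^{2L \times 2L} \tag{9.41} $$
$$ \tilde{\boldsymbol{\Sigma}} = \mathrm{diag}\left( \tilde{\sigma}_1, \tilde{\sigma}_2, \ldots, \tilde{\sigma}_{2L} \right) \in \Re^{N \times 2L} \tag{9.42} $$
with $\tilde{\sigma}_1 \geq \tilde{\sigma}_2 \geq \cdots \geq \tilde{\sigma}_{2L} \geq 0$. Then the minimum Euclidean norm solution of the
fuzzy rule parameters, bG, is computed as
$$ \mathbf{b}_G = \sum_{i=1}^{s} \frac{\tilde{\mathbf{u}}_i^{T} \mathbf{u}_G}{\tilde{\sigma}_i}\, \tilde{\mathbf{v}}_i \tag{9.43} $$
where s is the number of nonzero singular values in Σ~.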
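The global learning step of Equations (9.37) to (9.43) can be sketched with NumPy as below. This is an illustrative sketch under stated assumptions, not the code used in this study: it uses the thin SVD returned by `np.linalg.svd(..., full_matrices=False)` rather than the full N × N matrix of Equation (9.40), treats singular values below a relative tolerance `rtol` as zero, and generates made-up data from two Gaussian rules with hypothetical centers and parameters.

```python
import numpy as np

def build_T(t, G):
    """Regressor matrix T_G of Eq. (9.37): columns G_1, G_1*t, ..., G_L, G_L*t."""
    t = np.asarray(t, dtype=float)
    G = np.asarray(G, dtype=float)
    N, L = G.shape
    T = np.empty((N, 2 * L))
    T[:, 0::2] = G               # G_i(k)
    T[:, 1::2] = G * t[:, None]  # G_i(k) * t(k)
    return T

def global_learning(T, u, rtol=1e-10):
    """Minimum-norm least-squares solution b_G of Eq. (9.43) via the SVD."""
    U, s, Vt = np.linalg.svd(T, full_matrices=False)  # thin SVD of T_G
    keep = s > rtol * s[0]        # singular values regarded as nonzero
    coeff = (U[:, keep].T @ np.asarray(u, dtype=float)) / s[keep]
    return Vt[keep].T @ coeff     # sum_i (u_i^T u_G / sigma_i) v_i

# Illustrative data: two rules with Gaussian firing strengths (made-up values).
t = np.linspace(-2.0, 2.0, 50)
centers = np.array([-1.0, 1.0])
tau = np.exp(-0.5 * (t[:, None] - centers) ** 2)
G = tau / tau.sum(axis=1, keepdims=True)   # normalized firing strengths

b_true = np.array([0.5, 1.0, -0.5, 2.0])   # [b_10, b_11, b_20, b_21]
T = build_T(t, G)
u = T @ b_true                              # noise-free output scores
b_hat = global_learning(T, u)
print(np.round(b_hat, 6))
```

Discarding the near-zero singular values is what makes the solution well defined when TG is rank deficient, for example when two rules have almost identical firing strengths.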
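[Diagram: the input x is passed to L rules. Rule i (i = 1, 2, ..., L) has the premise "x is Ai", the consequent yi = xT bi and the firing strength τi; the model output is the weighted average y = (Σ τi yi) / (Σ τi), with both sums taken over i = 1, ..., L.]
Figure 9.1 Block diagram of the TSK fuzzy model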
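[Diagram: cascade of m factor blocks (first factor, second factor, ..., last factor) starting from X (E0) and Y (F0). Block j contains the weights wj and cj, the scores tj and uj, the inner relation fj(·) giving ûj, and the loadings pjT and qjT; the X and Y residuals are deflated through E1, F1, E2, F2, ... to the final residuals Em and Fm.]
Figure 9.2 Block diagram of the FPLS method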
Figure 9.3 Scatter plots and firing strength plots of FPLS model in benchmark (a)
first LV (b) second LV (c) third LV (d) fourth LV
Figure 9.4 Comparisons of LPLS and FPLS for the predicted and actual SNH,e in
benchmark (a) Time series plot (b) Scatter plot
Figure 9.5 Comparisons of LPLS and FPLS for the predicted and actual SNO,e in
benchmark (a) Time series plot (b) Scatter plot
Figure 9.6 Scatter plots and firing strength plots of FPLS model in BET (a) first LV
(b) second LV (c) third LV (d) fourth LV
Figure 9.7 Time series plots of predicted and actual output in BET (a) SVI with
LPLS and FPLS (b) ∆CN with LPLS and FPLS (c) ∆COD with LPLS and FPLS
Figure 9.8 Scatter plots of predicted and actual output in BET (a) SVI with LPLS
and FPLS (b) ∆CN with LPLS and FPLS (c) ∆COD with LPLS and FPLS
Table 9.1 Percent variance captured (%) and MSE of several PLS models in
benchmark
Factor   LPLS X   LPLS Y   QPLS X   QPLS Y   FPLS X   FPLS Y
1        64.49    40.60    64.49    43.03    64.49    43.68
2        88.96    60.06    88.96    67.85    88.97    71.61
3        91.40    71.04    91.28    77.12    91.45    78.51
4        96.85    72.25    97.04    78.66    97.10    80.00
MSE      0.60              0.46              0.44
Table 9.2 Percent variance captured (%) and MSE of several PLS models in BET
Factor                   LPLS X   LPLS Y   QPLS X   QPLS Y   FPLS X   FPLS Y
1                        17.75    31.76    17.75    31.92    17.75    31.76
2                        33.32    47.85    33.32    48.29    33.32    48.79
3                        44.01    58.32    43.96    58.70    44.01    59.27
4                        52.96    60.61    52.89    60.83    53.00    61.55
5                        59.61    62.20    59.55    62.90    60.37    63.00
6                        64.64    63.43    65.10    63.90    65.57    64.21
MSE of validation data   1.13              1.12              1.11
MSE of test data         1.67              1.68              1.71
X. References
Alex, J., Beteau, J.F., Copp, J.B., Hellinga, C., Jeppsson, U., Marsili-Libelli, S., Pons,
M.N., Spanjers, H., Vanhooren, H., “Benchmark for evaluating control
strategies in wastewater treatment plants.” European Control Conference 99,
Karlsruhe, 1999.
Åström, K.J. and Hägglund, T., “Automatic Tuning of Simple Regulators with
Specifications on Phase and Amplitude Margins.” Automatica, Vol.20, 1984,
pp.645-651.
Åström, K.J., “Adaptive control.” USA, Addison Wesley, 1995.
Baffi, G., Martin, E.B. and Morris, A.J., “Non-linear projection to latent
structures revisited (the neural network PLS algorithm).” Comp. & Chem.
Eng. Vol.23, 1999, pp.1293-1307.
Bakshi, B. R., “Multiscale PCA with Application to Multivariate Statistical Process
Monitoring.” AIChE J., Vol.44, 1998, pp.1596-1610.
Bishop, C.M., “Neural Networks for Pattern Recognition.” Clarendon Press, 1995.
Carlsson, B., “On-line estimation of the respiration rate in an activated sludge
process.” Wat. Sci Tech., Vol.28, 1993, pp.427-434.
Carlsson, B., Lindberg, C.F., Hasselblad, S. and Xu, S., “On-line estimation of the
respiration rate and the oxygen transfer rate at Kungsängen wastewater plant in
Uppsala.” Wat. Sci. Tech., Vol. 30, No.4, 1994, pp.255-263.
Carlsson, B., Lindberg, C.F., Hasselblad, S. and Xu, S., “Estimation of the
respiration rate and the oxygen transfer function utilizing a slow DO sensor.”
Wat. Sci. Tech., Vol.33, 1996, pp.325-333.
Centner, V. and Massart, D.L., “Optimization in Locally Weighted Regression.”
Anal. Chem. Vol.70, 1998, pp.4206-4211.
Choi, S.W., Yoo, C.K. and Lee, I.B., “Generic monitoring system in the biological
wastewater treatment process.” J. Chem. Eng. of Japan, 2001 (accepted).
Copp, J.B., “COST Simulation Benchmark Manual”, European Cooperation in the
field of Scientific and Technical Research, 2000.
COST-624, “The European Cooperation in the Field of Scientific and Technical
Research.” Website: http://www.ensic.u-nancy.fr/COSTWWTP.
Dieu, B., Garrett, M.T., Ahmad, Z. and Young, S., “Applications of automatic
control systems for chlorination and dechlorination processes in wastewater
treatment plants.” ISA Trans., Vol.34, 1995, pp.1-28.
Dorsey, A., and Lee, J., “Monitoring of batch processes through state-space
models.” 2001.
Eriksson, L., Hermens, J.L.M., Johansson, E., Verhaar, H.J.M. and Wold, S.,
“Multivariate analysis of aquatic toxicity data with PLS.” Aqu. Sci., Vol.57,
No.3, 1995, pp.1015-1621.
Gau, C. and Stadther M.A., “Reliable nonlinear parameter estimation using interval
analysis: error-in-variable approach.” Computer & Chemical Engineering,
Vol.24, 2000, pp.631-637.
Geladi, P. and Kowalski, B.R., “Partial Least Squares Regressions: a Tutorial.” Anal.
Chim. Acta., Vol.185, 1986, pp.1-17.
Hasselblad, S. and Xu, S., “On-line estimation of settling capacity in secondary
clarifier.” Wat. Sci. Tech, Vol.34(3-4), 1996, pp.323-330.
Haykin, S., “Neural Networks: A comprehensive foundation.” Prentice Hall
International, 1999.
Henze, M., Grady Jr, C. P. L., Gujer, W., Marais, G. and Matsuo, T., “Activated
sludge model no. 1.” Scientific and Technical Report No. 1, IAWQ, London, UK,
1987.
Holmberg, U., Olsson, G. and Andersson, B., “Simultaneous DO control and
respiration estimation.” Wat. Sci. Tech., Vol.21, 1989, pp.1185-1195.
Höskuldsson, A., “Prediction Methods in Science and Technology.” Thor Publishing,
Arnegaards Alle, Finland, 1996.
Jang, J.-S.R., Sun, C.-T. and Mizutani, E., “Neuro-Fuzzy and Soft Computing.”
Prentice Hall, 1997, pp. 425-427.
Jeppsson, U., “Modelling Aspects of Wastewater Treatment Processes.” Ph. D. thesis,
Lund, Sweden, 1996.
Jeppsson, U., Alex, J., Pons, M.N., Spanjers, H. and Vanrolleghem, P.A., “Status and
future trends of ICA in WWTP – A European perspective.” ICA2001, Sweden,
2001, pp.687-694.
Joanquin, S., Ion, I., Xabier, O., and Eduardo, A., “Dissolved oxygen control and
simultaneous estimation of oxygen uptake rate in activated sludge plant.”
Wat. Env. Res., Vol.70, No.3, 1998, pp.316-322.
Johnson, R.A. and Wichern, D.W., “Applied Multivariate Statistical Analysis.” 3rd
ed., Prentice Hall, Englewood Cliffs, USA, 1992.
Kano, M., Nagao, K., Ohno, H., Hasebe, S. and Hashimoto, I. “Dissimilarity of
Process Data for Statistical Process Monitoring.” International symposium on
advanced control of chemical process (ADCHEM), Pisa, Italy, 2000a, pp. 231-
236.
Kano, M., Nagao, K., Hasebe, S., Hashimoto, I., Ohno, H., Strauss, R. and Bakshi,
B., “Comparison of Statistical Process Monitoring Methods: Application to the
Eastman Challenge Problem.” Comp. & Chem. Eng., Vol.24, 2000b, pp.175-
181.
Ko, T. J. and Cho, D.W., “Adaptive Modelling of the Milling Process and
Application of a Neural Network for Tool Wear Monitoring.” Advanced
Manufacturing Technology, Vol.12, 1996, pp.5-16.
Kourti, T. and MacGregor, J.F., “Process Analysis, Monitoring and Diagnosis using
Multivariate Projection Methods.” Chem. Intelli. Lab., Vol.28, 1995, pp.3-21.
Krofta, M., Herath, B., Burgess, D. and Lampman, L., “An Attempt to understand
Dissolved Air Flotation using Multivariate Analysis.” Wat. Sci. Tech., Vol.31,
No.3-4, 1995, pp.191-201.
Ku, W., Storer, R.H. and Georgakis, C., “Disturbance Detection and Isolation by
Dynamic Principal Component Analysis.” Chem. Intelli. Lab., Vol.30, 1995,
pp.179-196.
Lambert, E., “Process control applications of long-range prediction.” Ph.D. thesis,
Dept. of Engineering Science, Oxford University, 1987.
Lee, D.S. and Park, J.M., “Neural network modeling for on-line estimation of
nutrient dynamics in a sequentially-operated batch reactor.” J. Biotechnol.,
Vol.75, 1999, pp.229-239.
Lee, D.S., “Neural network modeling of biological wastewater treatment processes.”
Ph. D. thesis, School of Environmental Engineering, POSTECH, Korea, 2000.
Lee, P.L. and Sullivan, G.R., “Generic Model Control.” Comp. & Chem. Eng.,
Vol.124, 1988, pp.573-580.
Li, W. and Qin, S. J., “Consistent dynamic PCA based on error-in-variables
subspace identification.” J. of Process Control, Vol.11, 2001, pp.661-678.
Li, W., Yue, H.H., Valle-Cervantes, S. and Qin, S.J., “Recursive PCA for Adaptive
Process Monitoring.” Journal of Process Control, Vol.10, 2000, pp.471-486.
Lin, C.-T. and Lee, C.S., “A Neuro-Fuzzy Systems.” Prentice-Hall, 1996.
Lindberg, C.F. and Carlsson, B., “Nonlinear and set-point control of the dissolved
oxygen dynamic in an activated sludge process.” Wat. Sci. Tech., Vol.34, No.3-
4, 1996, pp.135-142.
Lindberg, C.F., “Control and estimation strategies applied to the activated sludge
process.” Ph.D. thesis, Uppsala University, Dept. of Material Science, System
and Control Group, Sweden, 1997.
Ljung, L. and Söderström, T., “Theory and Practice of Recursive Identification.”
Cambridge: M.I.T. press, 1987.
Ljung, L., “System Identification.” New Jersey: PTR Prentice Hall, 1987.
Lukasse, L.J.S., “Control and identification in activated sludge process.” Ph. D.
thesis, Wageningen Agricultural University, Netherlands, 1999.
Marsili-Libelli, S., “Adaptive estimation of bioactivities in the activated sludge
process.” Proc. IEE, Part D, Vol.137, 1990, pp.349-356.
Marsili-Libelli, S. and Voggi, A., “Estimation of respirometric activities in
bioprocess.” Journal of Biotechnology, Vol.52, 1997, pp.181-192.
Marsili-Libelli, S., “Adaptive Fuzzy Monitoring and Fault Detection.” Int. J.
COMADEM, Vol.1(3), 1998, pp.31-40.
Moody, J. and Darken, C., “Fast learning in networks of locally-tuned processing
units.” J. Neural Comp., No.1, 1989, pp.281-294.
Negiz, A. and Cinar, A., “Statistical monitoring of multivariate dynamic processes
with state-space models.” AIChE J., Vol.43, No.8, 1997, pp.2002-2020.
Neter, J., Kutner, M.H., Nachtsheim, C. and Wasserman, W., “Applied Linear
Statistical Models.” 4th ed., McGraw-Hill, USA, 1996.
Nomikos, P. and MacGregor, J.F., “Monitoring of batch processes using multi-way
principal component analysis.” AIChE J., Vol.40, 1994, pp.1361-1375.
Nomikos, P. and MacGregor, J.F., “Multivariate SPC charts for monitoring batch
processes.” Technometrics, Vol.37, No.1, 1995, pp.41-59.
Olsson, G. and Chapman, D., “Modeling the dynamics of clarifier behaviour in
activated sludge systems”, Wat. Sci. Tech, Vol.37(12), 1988, pp.405-412.
Olsson, G. and Newell, B., “Wastewater Treatment Systems: Modelling, diagnosis
and Control.” IWA, UK, 1999.
Pons, M.N., Spanjers, H. and Jeppsson, U., “Towards a Benchmark for evaluating
Control Strategies in Wastewater Treatment Plants by Simulation.” Escape 9,
Budapest, Hungary, 1999, pp. 403-406.
Qin, S.J. and McAvoy, T.J., “Nonlinear PLS modeling using neural networks.”
Comp. & Chem. Eng. Vol.16, 1992, pp.379-391.
Rosen, C. and Olsson, G., “Disturbance detection in wastewater treatment plants.”
Wat. Sci. Tech, Vol.37, No.12, 1998, pp.197-205.
Rosen, C., “Monitoring Wastewater Treatment System.” MS thesis, Lund, Sweden,
1998.
Rosen, C. and Lennox, J.A., “Multivariate and Multiscale Monitoring of Wastewater
Treatment Operation.” Water Research, Vol.35, No.14, 2001, pp.3402-3410.
Sagara, S., Yang, Z.J. and Wada, K., “Recursive identification algorithms for
continuous systems using an adaptive procedure.” Int. J. Control, Vol.53, 1991,
pp.391-409.
Signal, P.D. and Lee, P.L., “Generic Model Adaptive Control.” Chem. Eng. Comm.,
Vol.115, 1992, pp.35-52.
Singman, J., “Efficient Control of Wastewater Treatment Plant - a Benchmark
Study.” MSc thesis, Uppsala University, Sweden, 1999.
Söderström, T. and Stoica, P., “System Identification.” Cambridge: Prentice Hall
International, 1989.
Sotomayor, O.A.Z., Park, S. and Garcia, C., “Nitrate Concentration-based Control of
the Activated Sludge Systems.” IFAC-ADCHEM 2000, Vol.I, Pisa, Italy, 2000,
pp.213-218.
Spanjers, H., Vanrolleghem, P., Nguyen, K., Vanhooren, H. and Patry, G., “Toward
a simulation benchmark for evaluating respirometry-based control strategies.”
Wat. Sci Tech., Vol.37, No.12, 1998, pp.219-226.
Spanjers, H. and Klapwijk, A., “On-line meter for respiration rate and short-time
biochemical oxygen demand in the control of the activated sludge process.”
ICA of water and wastewater treatment and transport systems, Pergamon press,
New York, N.Y., 1990.
Steffens, M.A. and Lant, P.A., “Multivariable control of nutrients removing
activated sludge systems.” Wat. Res., Vol.33, No.12, 1999, pp.2864-2878.
Sung, S.W., O, J., Lee, J., Yi, S. and Lee, I., “Automatic Tuning of PID Controller
using Second Order Plus Time Delay Model.” J. Chem. Eng. Japan, Vol.29,
1996, pp.990-999.
Sung, S.W. and Lee, I., “Limitations and Countermeasures of PID controllers.” Ind.
Eng. Chem. Res., Vol.35, 1996, pp.2596-2610.
Sung, S.W., “Process Identification and a New PID Controller Design.” Ph. D.
thesis, School of Environmental Engineering, POSTECH, Korea, 1996.
Sung, S.W., Lee, I. and Lee, J.T., “New Process Identification Method for
Automatic Design for PID controllers.” Automatica., Vol.34, 1998a, pp.513-
520.
Sung, S.W., Lee, I. and Lee B.K., “On-line process identification and automatic
tuning method for PID controllers.” Chem. Eng. Sci., Vol.53, 1998b, pp.1847-
1859.
Sung, S.W. and Lee, I., “PID Controllers and Automatic Tuning.” A-Jin, Korea,
1999.
Takács, I., Patry, G. G. and Nolasco, D., “A dynamic model of the clarification-
thickening process.” Wat. Res., Vol.25(10), 1991, pp.1263-1271.
Teppola, P., Mujunen, S.-P. and Minkkinen, P., “Partial least square modeling of an
activated sludge plant: A case study.” Chem. Intelli. Lab., Vol.38, 1997,
pp.197-208.
Teppola, P., “Multivariate Process Monitoring of Sequential Process Data – A
Chemometric Approach.” Ph. D. Thesis, Lappeenranta University, Finland,
1999.
Teppola, P., Mujunen, S.-P. and Minkkinen, P., “Adaptive fuzzy c-means clustering
in process monitoring.” Chem. Intelli. Lab., Vol.41, 1999, pp.23-28.
Teppola, P. and Minkkinen, P., “Wavelet-PLS regression models for both
exploratory data analysis and process monitoring.” J. of Chemometrics, Vol.14,
2000, pp.383-399.
Teppola, P. and Minkkinen, P., “Wavelets for scrutinizing multivariate exploratory
models through multiresolution analysis.” J. of Chemometrics, Vol.15, 2001,
pp.1-18.
Tu, Y.X., Wernsdörfer, A., Honda, S., and Tuffs, P.S., “Estimation of conduction
velocity distribution by regularized-least-squares method.” IEEE Trans. on
Biomedical Engineering, Vol.44, 1987, pp.1102-1106.
Van Dongen, G. and Geuens, L., “Multivariate Time Series Analysis for Design and
Operation of a Biological Wastewater Treatment Plant.” Water Research,
Vol.32, 1998, pp.691-700.
Van Overschee, P. and De Moor, B., “N4SID: Subspace algorithms for the
identification of combined deterministic stochastic systems.” Automatica,
Vol.30, No.1, 1994, pp.75-93.
Van Overschee, P. and De Moor, B., “Subspace algorithms for linear system:
Theory – Implementation – Application.” Kluwer Academic Publishers,
USA, 1996.
Van Overschee, P. and De Moor, B., “Subspace algorithms for the stochastic
identification problem.” Automatica, Vol.29, No.3, 1993, pp.649-660.
Wang, C.-H., Hong, T.-P. and Tseng, S.S., “Integrating Fuzzy Knowledge by
Genetic Algorithms.” IEEE. Trans. Evolutionary Computation, Vol.2(4), 1998,
pp.138-149.
Whitfield, A. H. and Messali. N., “Integral-equation approach to system
identification.” Int. J. Control, Vol.45, 1987, pp.1431-1445.
Wikström, C., Albano, C., Eriksson, L., Fridén, H., Johansson, E., Nordahl, Å.,
Rännar, S., Sandberg, M., Kettaneh-Wold, N. and Wold, S., “Multivariate
process and quality monitoring applied to an electrolysis process Part II.
Multivariate time-series analysis of lagged latent variables.” Chem. Int. Lab.
Sys., Vol.42, 1998, pp.233-240.
Wise, B.M., Ricker, N.L., Veltkamp, D.F. and Kowalski, B.R., “A theoretical
basis for the use of principal component models for monitoring multivariate
processes.” Process Control Qual., Vol.1, 1990, pp.41-51.
Wise, B.M. and Gallagher, N.B., “The Process Chemometrics Approach to Process
Monitoring and Fault Detection.” Journal of Process Control, Vol.6, 1996,
pp.329-348.
Wold, S., “Nonlinear Partial least squares modeling II. Spline inner relation.”
Chemom. Int. Lab. Syst., Vol.14, 1992, pp.71-84.
Wold, S., Wold, N. K. and Skagerberg, B., “Nonlinear PLS modeling.” Chem. Int.
Lab. Sys., Vol.7, 1989, pp.53-65.
Yen, J., Wang, L. and Gillespie, C.W., “Improving the interpretability of TSK Fuzzy
models by combining global learning and local learning.” IEE Trans. Sys. No.6,
1998, pp.530-537.
Ying, C.-M. and Joseph, B., “Sensor Fault Detection Using Noise Analysis.” Ind.
Eng. Chem. Res., Vol.39, 2000, pp.396-407.
Yoo, C.K., Choi, S.W. and Lee, I., “Disturbance Detection and Isolation in the
Activated Sludge Process.” Wat. Sci. Tech. (in press), 2001.
Curriculum Vitae
Name : Chang Kyoo Yoo
Date of Birth : September 25, 1969
Birthplace : ChunBuk, Korea
Address : Graduate Apartment 2-1004, POSTECH, Hyoja-Dong, Pohang
City, KyungBuk, 790-784, Korea
Education :
3, 1989 ~ 2, 1993 B. S. in Chemical Engineering,
Yonsei University, Seoul, Korea
3, 1993 ~ 2, 1995 M. S. in Chemical Engineering,
Pohang University of Science and Technology,
Pohang, Korea
3, 1999 ~ 2, 2002 Ph. D. in Chemical Engineering,
Pohang University of Science and Technology,
Pohang, Korea
Career :
3, 1999 ~ 2, 2002 Pohang University of Science and Technology,
Research Assistant
5, 2001 ~ 8, 2001 Visiting scholar, School of Chemical Engineering,
Georgia Institute of Technology, 778 Atlantic Dr.,
Atlanta, GA, USA (“Supported by Brain Korea 21
project”)
1, 1995 ~ 8, 1998 Industrial Researcher, Factory Automation, Doosan
R&D center, Korea
9, 1998 ~ 2, 1999 Industrial Researcher, Automation Research Center,
Pohang University of Science and Technology, Korea
Certificate :
National Technical Qualification Certificate (1994) in a division of chemical
engineering
Institute Activity :
5, 1993 ~ Now KIChE member
Scientific Publications
Research papers
1. C.K.Yoo, J.H. Cho, H.J. Kwak, S.K. Choi, H.D. Chun and I.B. Lee, "Closed-
loop identification and control for dissolved oxygen concentration in the full-
scale coke wastewater treatment plant", Wat. Sci. Tech., Vol.43, No.7, 2001,
pp.207-214
2. Chang Kyoo Yoo, Hee Jin Kwak and In-Beum Lee, “Direct identification
method of second order plus time delay model parameters”, Chemical
Engineering Research and Design, Vol. 79, Part A, October, 2001, 754-764
3. Chang Kyoo Yoo, Dong Soon Kim, Ji-Hoon Cho, Sang Wook Choi and In-Beum
Lee, “Process System Engineering in Wastewater Treatment Process”, The
Korean Journal of Chemical Engineering, Vol.18, No.4, 2001, pp.408-421
4. C.K.Yoo, S.W. Choi and I. Lee, "Disturbance Detection and Isolation in the
Activated Sludge Process", Wat. Sci. Tech., 2001 (in press)
5. Sang Wook Choi, Chang Kyoo Yoo, Kyu Hwang Lee and In-Beum Lee,
“Generic Monitoring system in the biological wastewater treatment process”,
Journal of Chemical Engineering of Japan, 2001 (in press)
6. Sang Wook Choi, Chang Kyoo Yoo, and In-Beum Lee, “SOx emission
monitoring in the power plant in a steel mill by a neural network”, Journal of
Environmental Engineering, 2001 (in revision)
7. YoonHo Bang, Chang Kyoo Yoo and In-Beum Lee, “Nonlinear PLS modeling
with fuzzy inference system”, Chemometrics and intelligent laboratory systems,
2001 (in revision)
* Submitted and in revision papers
International conferences
1. Chang Kyoo Yoo, Jin Hyun Park and In-Beum Lee, "Adaptive Generic Model
Control for Automatic DO Control in the activated sludge process", The
Proceedings of The 8th APCChE Congress, pp.359-362, Seoul, Korea, 1999
2. Chang Kyoo Yoo, Jin Hyun Park and In-Beum Lee, "Auto-tuning and
simultaneous setpoint decision for the DO control in the coke wastewater
treatment plant", Proceedings of the 3rd Asian Control Conference, pp.744-749,
July 5-7, Shanghai, 2000
3. C.K. Yoo, J.H. Cho, H.J. Kwak, S.K. Choi, H.D. Chun and I.B. Lee, "Closed-loop
identification and control for dissolved oxygen concentration in the full-scale
coke wastewater treatment plant", Proceedings at 5th International IWA
Symposium Systems Analysis and Computing in Water Quality Management
WATERMATEX2000, pp.9.9-9.16, Gent, Belgium, September 18-20, 2000
4. Chang Kyoo Yoo, Sang Wook Choi, Jin Hyun Park and In Beum Lee, "Time
series analysis and neural network classification of the secondary settler in the
wastewater plant", Proceedings at 5th International IWA Symposium Systems
Analysis and Computing in Water Quality Management WATERMATEX2000,
pp.9.9-9.16, Gent, Belgium, September 18-20, 2000
5. Sang Wook Choi, Chang Kyoo Yoo and In-Beum Lee, "Real-Time Process
Monitoring Using Dissimilarity through PCA", Proceedings at the 2nd Cross
Straits Symposium on Materials, Energy and Environmental Sciences, Pusan
National University, Korea, November 2-3, 2000
6. Sang Wook Choi, Chang Kyoo Yoo, Byung-Hwan Cho and In-Beum Lee, "Air
Pollution Monitoring at a Power Plant using MLP via ARX model", Proceedings
at the first AEARU Environmental Workshop, pp.171-180, Hong Kong
University of Science and Technology, Hong Kong, January 9-10, 2001
7. C.K.Yoo, S.W. Choi and I. Lee, "Disturbance Detection and Isolation in the
Activated Sludge Process", ICA2001, pp.333-340, Lund University, Sweden,
June 3-7, 2001
8. ChangKyoo Yoo, Jin Hyun Park and In-Beum Lee, "Nonlinear model-based
control in the biological wastewater treatment system", DYCOPTS-6 (6TH IFAC
Symposium on Dynamics and Control of Process System), pp.724-728, Jeju,
Korea, June 3-6, 2001
9. Sang Wook Choi, ChangKyoo Yoo and In-Beum Lee, "Generic detection and
isolation of process events in WWTP", 4th IFAC workshop on on-line fault
detection and supervision in the chemical process industries, 135-139, Jeju,
Korea, June 7-8, 2001
10. ChangKyoo Yoo, SangWook Choi, YoonHo Bang and In-Beum Lee, "Soft
sensor and model-based DO control in the biological wastewater treatment
process", IFAC Workshop on Modeling and Control in Environmental Issues,
pp.307-312, Yokohama, Japan, August 22-23, 2001
11. YoonHo Bang, ChangKyoo Yoo and In-Beum Lee, "An enhanced prediction
model for the emission of air pollutants", IFAC Workshop on Modeling and
Control in Environmental Issues, pp.73-78, Yokohama, Japan, August 22-23,
2001
12. C.K.Yoo and I. Lee, "Wastewater quality modeling by hybrid GA-Fuzzy model",
ASIAN WATERQUAL 2001, pp.107-112, Fukuoka, Japan, September 12-15,
2001
13. C.K.Yoo, S.W. Choi and I. Lee, "Monitoring algorithm in the biological
wastewater treatment system", 6th world congress of chemical engineering,
pp.2409, Melbourne, Australia, September 23-27, 2001
14. YoonHo Bang, ChangKyoo Yoo and In-Beum Lee, "Modeling and monitoring
with nonlinear FPLS method", Proceedings at the 3rd Cross Straits Symposium
on Materials, Energy and Environmental Sciences, POSTECH, Korea,
November 15-16, 2001
15. Yoon Ho Bang, ChangKyoo Yoo, Sang Wook Choi and In-Beum Lee,
“Nonlinear PLS Monitoring Applied to An Wastewater Treatment Process”,
proc. of ICCAS2001 conference, Oct. 17~20, Jeju, Korea, 2001
16. Yangdong Pan, ChangKyoo Yoo, Jay H. Lee and In-Beum Lee, “Process
Monitoring for Continuous Process with Cycling Operation”, Submitted to
extended abstract of American Control Conference 2002 (Invited Session in
Control of Batch and Periodic Process), 2002.
Domestic conferences
1. Chang Kyoo Yoo, Hee Jin Kwak, Su Whan Sung and In-Beum Lee, "An
Estimation Method of SOPTD Model Parameter", Proceeding of KIChE Spring
Meeting, 1999.
2. Ji-hoon Cho, Chang Kyoo Yoo, Dong-soon Kim and In-Beum Lee, "Wastewater
treatment automation system", Proceeding of KIChE Fall Meeting, 1999.
3. Sang Wook Choi, Chang Kyoo Yoo and In-Beum Lee, "SOx Emission monitoring
in the internal power plant of a steel mill by a neural network classifier",
Proceeding of KOSENV Spring Meeting, 2000.
4. Sang Wook Choi, Chang Kyoo Yoo and In-Beum Lee, "On-line process monitoring
using modified dissimilarity measure", Proceeding of KIChE Fall Meeting, 2000.
5. Yoon Ho Bang, Chang Kyoo Yoo and In-Beum Lee, “Nonlinear PLS algorithm
using a fuzzy model”, Proceeding of KIChE Spring Meeting, 2001.