
Abstract - IJSkt.ijs.si/markodebeljak/Lectures/Seminar_MPS/2012_on/Seminars2013... · prediction itself is a normally distributed random variable. ... 7 Conclusion 18 iii. ... were

Jul 10, 2018


phamthu

Abstract

The paper presents a comparative study of potential cost functions for adaptive non-linear model predictive control based on a probabilistic black-box model. We investigate the time-domain properties of closed-loop control with different cost functions. Model predictive control requires a model of the controlled system. We identify a NARX Gaussian process (GP) model using measurements of inputs and outputs. The model prediction itself is a normally distributed random variable. The information from the normally distributed prediction is used to implement probabilistic model predictive control. Our goal is to illustrate the effects on the performance of the controlled system. By examining the empirical results under the specified requirements, we can infer that the prediction variance of an incomplete model does not have a noticeable impact on the stability of the controlled system.

Key words: Adaptive model predictive control, Gaussian process model.


Contents

1 Introduction
2 Model predictive control
3 Adaptive MPC
4 Model identification
4.1 GP model
4.2 Evolving GP model
5 Optimization of control input
5.1 Cost functions
5.1.1 Quadratic cost
5.1.2 Saturating cost
5.2 Optimization method
6 Case studies
6.1 Bioreactor
6.2 Unstable system
7 Conclusion


1 Introduction

Control systems are what make a system, in the broadest sense of the term, function as desired. They are most often based on the principle of feedback, whereby the signal to be controlled is compared to a desired reference signal and the discrepancy is used to compute a corrective control action [1]. The term closed-loop control is very common in control theory. Its name comes from the information path in the system: process inputs (e.g., voltage applied to an electric motor) have an effect on the process outputs (e.g., speed or torque of the motor), which are measured with sensors and processed by the controller to form a control signal [1]. This signal is “fed back” as input to the process, closing the loop. A broad range of controllers was developed in the past century. One branch of control theory is optimal control theory, an extension of the calculus of variations. Optimal control is a mathematical optimization method for deriving control policies, and it requires a mathematical model of the controlled process. The drawback is that in some cases the exact solution cannot be found if the mathematical model is not simple enough. Methods such as model predictive control (MPC) were developed to address this issue and to perform as close as possible to optimal control. The idea of MPC is that the control performance is evaluated on a model while searching for the optimal future input signal. The control performance is judged by a criterion to be minimized, which is called a cost function. When the optimal control performance is found according to the cost function, a part of the future input signal is applied to the real process.

The main question of this study is which variable of the cost function inside the MPC algorithm is important for the performance of the controlled system. We will focus especially on the variance obtained from the probabilistic model in Section 5.1.

The structure of this text is as follows, starting with the definition of the problem in Chapter 1. The overview of research spans Chapters 2 to 5, starting with a brief explanation of the basic concept of MPC in Chapter 2. We explain how MPC uses a model of the controlled process and then advance to the concept of adaptive control in Chapter 3 by extending the MPC method. The model for MPC is chosen to be a probabilistic black-box model (Chapter 4), which is applied to the evolving concept (Section 4.2). The criterion of MPC optimization is presented as a cost function, and its impact on control performance is examined by modifying it in Chapter 5. The critical assessment is based on two case studies in Chapter 6: a non-linear bioreactor and a toy linear unstable process. Finally, the results are discussed and concluded with suggestions for further improvement in Chapter 7.

2 Model predictive control

MPC was originally developed to meet the specialized control needs of power plants and petroleum refineries, and it has been in use since the 1980s because it handles non-linear or multi-variable control problems naturally [2]. A drawback of MPC is its computational cost, but this was not a problem since some processes are slow enough for the computation to be completed in time. MPC technology can now be found in a wide variety of application areas including chemicals, food processing, automotive, and aerospace applications [2].

Model predictive control (MPC) is an intuitive and advanced approach to control system design. It requires a model of the controlled process, and this model can be as simple as a step response in the time domain or as detailed as a first-principles model described with partial differential equations. More specifically, the model is used by an optimization algorithm which simulates the future process output to find a suitable future control input. The optimality of the output response is expressed in terms of cost function minimization, but it always depends on the model accuracy. A cost function can take the following arguments: the reference point to which the process should be driven and the simulated output of the model with the corresponding future input. The design of the cost function is explained in Section 5.1.

The usual way of computer-aided control design restricts the sampling of the process output and the control action to discrete-time intervals.¹ In a similar way, we are dealing with a discrete model. We can present the values of a simulated output signal for a given input as discrete-time values for a finite number of discrete-time steps, as shown in Figure 2.1. MPC is also called receding horizon control [3, 4] because the optimal control input is recalculated at each new discrete-time instant. The prediction horizon $H_p$ is the total number of time steps taken into account for predicting the future signal. A future input signal must be defined in order to simulate the corresponding output signal. We set a parameter for each step until the end of the control horizon $H_c$ is reached, and the input signal after that is held at a constant value until the end of the prediction horizon.

¹ We omit the discretization problem of continuous systems.
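The input parametrization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the constant value held after the control horizon is taken to be the last free parameter, only the first input of the sequence is applied to the process, and `optimize` is a hypothetical stand-in for the cost minimization.

```python
def expand_input(params, hp):
    """Build a length-hp input sequence from Hc free parameters.

    The first len(params) steps use the free parameters; the remaining
    steps hold the last free value until the prediction horizon ends
    (assumption: the held constant equals the last free parameter).
    """
    hc = len(params)
    if hc == 0 or hc > hp:
        raise ValueError("need 1 <= Hc <= Hp")
    return list(params) + [params[-1]] * (hp - hc)


def receding_horizon_step(optimize, plant_state, hp, hc):
    """One receding-horizon iteration: optimize, then apply the first move.

    `optimize` is a hypothetical callable returning hc input parameters
    for the given state; it stands in for the cost minimization.
    """
    params = optimize(plant_state, hp, hc)
    future_u = expand_input(params, hp)
    return future_u[0], future_u
```

At the next sampling instant the same step is repeated with the newly measured state, which is what makes the horizon "recede".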


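Figure 2.1: Illustrative example of input optimization within the receding horizon context. [Plot of output y and input u over time, split into past and future at instant k; it shows the reference trajectory, the model output and the model input, with the control horizon Hc ending at k + Hc and the predictive horizon Hp ending at k + Hp.]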

The receding horizon concept uses the prediction horizon as a moving frame of future signals in which the input values are optimized to shape the simulated (model) response. We apply feedback from the process state to the model and form closed-loop control by matching the model to the current state of the process. The loop is closed because the continuously observed system output is fed back to the regulator. MPC has become increasingly popular in industry, mostly using linear models [5]. Qin and Badgwell summarized more than 4600 applications spanning a wide range from chemicals to aerospace industries through 1999 [2]. On the other hand, every system is inherently non-linear in nature [4]. Some surveys of non-linear MPC are included in [4, 6].

3 Adaptive MPC

The following chapter is partially adapted from [7]. An adaptive controller is a controller that continuously adapts to a changing process. Adaptive controllers emerged in the early 1960s. At the beginning, these controllers mainly adapted themselves based on linear models with changing parameters. Since then, several authors have proposed the use of non-linear models as a basis for non-linear adaptive controllers. These are meant for the control of time-varying non-linear systems or of time-invariant non-linear systems that are modeled as parameter-varying simplified non-linear models.


A subset of adaptive systems are dual-adaptive systems [8, 9], where the optimisation of the information collection and of the control action are pursued at the same time. The control signal should ensure that the system output cautiously tracks the desired reference value and at the same time excites the plant sufficiently to accelerate the build-up of the model [10, 11], known as the identification process. The solution to the dual control problem is based on dynamic programming, and the resulting functional equation is often the Bellman equation. More information about dynamic programming and the Bellman equation is available in [12] and [13].

Many adaptive controllers are based on the separation principle [9], which implies separate estimation of the system model, i.e., the system parameters, and the application of this model for control design. When the identified model used for control design and adaptation is presumed to be the same as the true system, the adaptive controller is said to be based on the certainty equivalence principle, and such adaptive control is named non-dual adaptive control. Non-dual adaptive control does not take any active actions that would influence the uncertainty. Our adaptive control algorithm is based on non-dual adaptive control.

The scheme designed for adaptive MPC is shown in Figure 3.1. The optimizer uses a model to simulate the process and searches for the desired response by finding a suitable input, which is then partially applied to the plant. The control algorithm is extended to an adaptive one which repeatedly updates the model on-line. This additional structure is presented as the model identification block. The data for identification is collected from the process input u and output y. A problem arises when such a control system starts without any identification data to build a model. We overcome this by providing an initial model.

Figure 3.1: A scheme of adaptive MPC control. [Block diagram with the blocks Optimizer, Model, Plant and Model identification; signals: reference r, input u, output y.]


4 Model identification

The MPC algorithm requires a model of the controlled system. We consider a stochastic black-box dynamic model in the NARX representation [7][14], where the output at time step $k+1$ depends on the delayed outputs $y$ and the exogenous control inputs $u$:

$$y_{k+1} = f(y_k, \ldots, y_{k-L}, u_k, \ldots, u_{k-L}) + \varepsilon(k), \tag{4.1}$$

where $f$ denotes a function, $\varepsilon$ is white noise, and the output depends on the state vector $\mathbf{x}(k) = [y_k, \ldots, y_{k-L}, u_k, \ldots, u_{k-L}]$ [7]. Assuming the signal is known up to $k$, we wish to predict the output of the system $l$ steps ahead, i.e., we need to find the predictive distribution of $y(k+l)$ corresponding to $\mathbf{x}(k+l)$, if a probabilistic model is taken into account. Multi-step-ahead prediction of a system modelled by (4.1) is achieved by iteratively repeating one-step-ahead prediction up to the desired horizon [7]. One possible implementation of a NARX model is the Gaussian process model, which is presented in Section 4.1.
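The iterated one-step-ahead prediction can be sketched as follows. The sketch is a simplification: it feeds back only the predicted mean and ignores how the prediction uncertainty propagates through the iterations. The callable `f` and the argument layout (newest value first) are assumptions made for illustration.

```python
def simulate_narx(f, y_lags, u_past, u_future):
    """Iterate a one-step NARX predictor over a future input sequence.

    f        : hypothetical callable f(y_lags, u_lags) -> next output mean
    y_lags   : [y_k, ..., y_{k-L}], newest first
    u_past   : [u_{k-1}, ..., u_{k-L}], newest first
    u_future : [u_k, u_{k+1}, ...], the inputs to be simulated
    Returns the predicted outputs [y_{k+1}, y_{k+2}, ...].
    """
    y_lags = list(y_lags)
    u_past = list(u_past)
    predictions = []
    for u_now in u_future:
        u_lags = [u_now] + u_past           # [u_k, ..., u_{k-L}]
        y_next = f(y_lags, u_lags)          # one-step-ahead prediction
        predictions.append(y_next)
        y_lags = [y_next] + y_lags[:-1]     # feed the prediction back
        u_past = u_lags[:-1]                # shift the input history
    return predictions
```

For example, with the toy predictor `f = lambda y, u: 0.5 * y[0] + 0.25 * y[1] + u[0] + 2 * u[1]`, calling `simulate_narx(f, [1.0, 0.0], [0.0], [1.0, 0.0])` gives `[1.5, 3.0]`.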

4.1 GP model

One should form the model of a dynamical system in a probabilistic way when dealing with control under unexpected disturbances [15]. If the variance of the stochastic output is reduced, the control can be more accurate [15]. With such motivation a probabilistic model is favourable, since it gives some information about the uncertainty of the modeled process in various operating regions [16–18]. The Gaussian process (GP) model is a probabilistic, non-parametric model and can be used for modeling dynamical systems in much the same way as other black-box models, for example, neural network models. More literature about GP models is available in [16, 17, 19–27]. The model is probabilistic because its prediction is normally distributed, and it is non-parametric because it carries no explicit structural information about the modeled system [17, 20, 28]. This kind of modeling method is classified as supervised learning and depends on a learning set. In our case, the learning set can be perceived as a part of the model itself. The learning set $\mathcal{D}$ is composed of delayed input and output signal measurements of the process. This kind of data follows from the NARX model form. Each element $\{\mathbf{x}_i, y_i\} \in \mathcal{D}$ can be split into a state vector $\mathbf{x}_i$ and its predictive target $y_i$:

$$\{\mathbf{x}_i, y_i\} \in \mathcal{D}, \tag{4.2}$$

for $i = 1, \ldots, N$, where $N$ is the size of the learning set $\mathcal{D}$. The output values $y_i$ are assumed to be noisy measurements of an underlying function $f(\mathbf{x}_i)$ with a conditional probability distribution $p(y_i|f_i) = \mathcal{N}(f_i, \sigma^2)$. Let $\mathbf{f} = [f(\mathbf{x}_1), \ldots, f(\mathbf{x}_N)]^T$ and $\mathbf{y} = [y_1, \ldots, y_N]^T$; then the learning set $\mathcal{D}$ is used to form a joint Gaussian distribution of the function values $\mathbf{f}$ [29]. This is a Gaussian process, and it is defined as a collection of random variables with a joint Gaussian distribution:

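$$p(\mathbf{y}|\mathcal{D}) = \mathcal{N}(\mathbf{0}, \mathbf{K}), \tag{4.3}$$

where $\mathbf{K}$ is a (positive semi-definite) covariance matrix which inherits the input part of the learning set $\mathcal{D}$ by mapping its paired inputs $\mathbf{x}_i, \mathbf{x}_j$ with a covariance function $k(\mathbf{x}_i, \mathbf{x}_j)$:

$$\mathbf{K} = \begin{bmatrix} k(\mathbf{x}_1,\mathbf{x}_1) & k(\mathbf{x}_1,\mathbf{x}_2) & \cdots & k(\mathbf{x}_1,\mathbf{x}_N) \\ k(\mathbf{x}_2,\mathbf{x}_1) & k(\mathbf{x}_2,\mathbf{x}_2) & \cdots & k(\mathbf{x}_2,\mathbf{x}_N) \\ \vdots & \vdots & \ddots & \vdots \\ k(\mathbf{x}_N,\mathbf{x}_1) & k(\mathbf{x}_N,\mathbf{x}_2) & \cdots & k(\mathbf{x}_N,\mathbf{x}_N) \end{bmatrix}. \tag{4.4}$$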

The covariance function $k$ returns a scalar value representing how two state vectors from $\mathcal{D}$ are related to each other. For now, it suffices to know what the covariance function does, not how it is constructed. A common aim in Gaussian process regression is to predict the output $y^*$ for a new state vector $\mathbf{x}^*$ given the learning set $\mathcal{D}$ and a known covariance function $k(\mathbf{x}_i, \mathbf{x}_j)$. The posterior predictive distribution is obtained by extending the joint Gaussian distribution (4.3) into:

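$$p\left(\begin{bmatrix} \mathbf{y} \\ y^* \end{bmatrix} \middle|\, \mathbf{x}^*, \mathcal{D}\right) = \mathcal{N}\left(\mathbf{0}, \begin{bmatrix} \mathbf{K} & \mathbf{k}^* \\ \mathbf{k}^{*T} & k(\mathbf{x}^*,\mathbf{x}^*) \end{bmatrix}\right). \tag{4.5}$$

It can be shown that the single posterior distribution $p(y^*|\mathcal{D},\mathbf{x}^*)$ can be analytically solved [29], hence we get the form of GP model prediction:

$$p(y^*|\mathbf{x}^*,\mathcal{D}) = \mathcal{N}\left(y^* \,\middle|\, \mathbf{k}^{*T}\mathbf{K}^{-1}\mathbf{y},\; k(\mathbf{x}^*,\mathbf{x}^*) - \mathbf{k}^{*T}\mathbf{K}^{-1}\mathbf{k}^*\right), \tag{4.6}$$

where $\mathbf{k}^*$ is the vector of covariance function values between the inputs $\mathbf{x}_i \in \mathcal{D}$, $i = 1, \ldots, N$ and the prediction input $\mathbf{x}^*$:

$$\mathbf{k}^* = [k(\mathbf{x}_1,\mathbf{x}^*), k(\mathbf{x}_2,\mathbf{x}^*), \ldots, k(\mathbf{x}_N,\mathbf{x}^*)]^T. \tag{4.7}$$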

The covariance function design was omitted so far, but it is essentially the main part of the GP model structure alongside the learning set $\mathcal{D}$. Inference in GP first involves finding the form of the covariance function $k(\mathbf{x}_i, \mathbf{x}_j)$, which provides a Bayesian interpretation of kernel methods² [28]. Its value expresses the correlation between the individual outputs $y_i$ and $y_j$ with respect to the inputs $\mathbf{x}_i$ and $\mathbf{x}_j$ [28]. Usually, the covariance function is used along with some parameters named hyperparameters. We tend to optimize the covariance function hyperparameters instead of finding a more general covariance function with respect to the learning set $\mathcal{D}$. The hyperparameters can emphasize or suppress individual regressors of an input vector $\mathbf{x}_i$. Assuming stationary data contaminated with white noise, the most commonly used covariance function is a composition of the squared exponential (SE) covariance function with “automatic relevance determination” (ARD) hyperparameters [21] and an additional term $\delta_{ij}$ for the white noise assumption [28]:

$$k(\mathbf{x}_i,\mathbf{x}_j) = v_0 \exp\left(-\frac{1}{2}\sum_{d=1}^{D} \theta_d (x_{id} - x_{jd})^2\right) + v_1\delta_{ij}, \tag{4.8}$$

where $\theta_d$ are the automatic relevance determination hyperparameters, $v_1$ and $v_0$ are hyperparameters of the covariance function, $D$ is the number of regressors, and $\delta_{ij}$ is the Kronecker delta. The method of setting the hyperparameters $\Theta = [v_1, v_0, \theta_1, \ldots, \theta_D]$ will not be discussed here; details can be found in [27, 28].

² The theory of kernel methods will not be discussed here. For more information, see the surveys of kernel methods by Pilonetto et al. [30] and Campbell [31].

An example illustrating how the GP model is used for prediction is shown by taking a learning set of inputs with a single regressor and the corresponding outputs:

$$\{x_i, y_i\} \in \mathcal{D}, \quad i = 1, \ldots, 7 \tag{4.9}$$

$$\mathcal{D} = \{(-0.74,-0.17), (0.21,-0.00), (0.01,0.09), (0.90,-0.77), (-0.33,-0.06), (-0.82,-0.46), (-0.68,-0.01)\}. \tag{4.10}$$

The values of $x_i$ and $y_i$ are shown as blue stars in Figure 4.1. By defining a suitable covariance function $k$, the model can be used to predict the output at different inputs on the x-axis. The expected value of the predicted output is marked with an “×” symbol, together with an interval of two standard deviations of the prediction.

Figure 4.1: An example of multiple predictions at chosen points on the x-axis, marked with the “×” symbol. [Plot with horizontal axis x and vertical axis y ± 2σ, both ranging from −1 to 1.]
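A minimal sketch of the prediction (4.6) on the learning set (4.10) is given below in plain Python. The hyperparameter values `V0`, `V1` and `THETA` are illustrative assumptions, not values from the source, and the predictive variance here includes the noise term $v_1$ (one possible convention). The small linear solver is included only to keep the example free of external libraries.

```python
import math

# Learning set (4.10): pairs (x_i, y_i) with a single regressor.
DATA = [(-0.74, -0.17), (0.21, -0.00), (0.01, 0.09), (0.90, -0.77),
        (-0.33, -0.06), (-0.82, -0.46), (-0.68, -0.01)]

# Hyperparameter values are illustrative assumptions, not from the source.
V0, V1, THETA = 1.0, 0.01, 10.0


def cov(xi, xj, same_point=False):
    """SE covariance (4.8) for one regressor; v1 is added when i == j."""
    k = V0 * math.exp(-0.5 * THETA * (xi - xj) ** 2)
    return k + (V1 if same_point else 0.0)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= factor * m[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def gp_predict(x_star):
    """Return mean and variance of p(y*|x*, D) following (4.6)."""
    xs = [x for x, _ in DATA]
    ys = [y for _, y in DATA]
    n = len(xs)
    K = [[cov(xs[i], xs[j], i == j) for j in range(n)] for i in range(n)]
    k_star = [cov(x, x_star) for x in xs]           # vector (4.7)
    alpha = solve(K, ys)                            # K^{-1} y
    beta = solve(K, k_star)                         # K^{-1} k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = cov(x_star, x_star, True) - sum(k * b for k, b in zip(k_star, beta))
    return mean, var
```

Far from the data (for instance at `x_star = 5.0`) the mean returns to the zero prior and the variance approaches `V0 + V1`, while near a learning point the variance shrinks. This is the behaviour the cost functions with a variance term rely on.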


In our case we are interested in applications of the non-linear MPC (NMPC) principle with a GP model [7]. Stochastic NMPC problems are formulated in applications where the system to be controlled is described by a stochastic model such as the GP model [32]. Stochastic problems like state estimation have been studied for a long time, but in our case we explore only the stochastic NMPC problem. Most known stochastic MPC approaches are based on parametric probabilistic models. Alternatively, stochastic systems can be modeled with non-parametric models, which can offer a significant advantage over parametric models [7]. This is related to the fact that non-parametric probabilistic models, like GP models, provide information about prediction uncertainties which are difficult to evaluate appropriately with parametric models. Other relevant literature on MPC with GP models can be found in [7, 32–36].

The GP model used for adaptive control is identified on-line [32]. It is sensible to consider the advantages of GP models in the control design, which relates GP model-based adaptive control at least to suboptimal dual adaptive control principles. The uncertainty of the GP model predictions depends, among other things, on the local learning-data density, and the model complexity is automatically related to the amount and the distribution of the available data: more complex models need more evidence to make them likely [37]. Both aspects are very

useful in sparsely-populated transient regimes. Moreover, since weaker prior assump-

tions are typically applied in a nonparametric model, the bias is typically lower than

in parametric models. The related

4.2 Evolving GP model

This section is summarized from [28]. The Evolving GP (EGP) model is inspired by evolving systems [38], which are self-developing systems that adapt both the structure and the parameter values of the model on-line from incoming data [38]. We use the term Evolving GP model in the sense of sequential adaptation of both the “structure” of the GP model and the hyperparameter values.

This enables fast and efficient adaptation of the GP model to a time-varying system. The concept of EGP proposed in [32] and further developed in [39] considers the adaptation of four main parts of the GP model: the learning set, the hyperparameter values, the covariance function and the regressors. In comparison with the learning set $\mathcal{D}$ of a GP model, the learning set $\mathcal{D}_A$ of an EGP model is said to be an active set with the property:

$$\mathcal{D}_A \subseteq \mathcal{D}, \tag{4.11}$$

where only a subset of the entire learning dataset is used for modeling with EGP. We decided to use a fixed squared exponential (SE) covariance function with ARD (4.8) because it is able to identify influential regressors [39]. With the optimization of the hyperparameter values, uninfluential regressors have a smaller influence in the covariance function and, as a consequence, a smaller influence on the result. Therefore, it is not necessary to remove uninfluential regressors manually. In general, the proposed method consists of three main steps to adapt the GP model sequentially. In the first step, new data is processed by including the incoming data in the active set $\mathcal{D}_A$. Second, the hyperparameter values $\Theta$ are optimized, and in the third step the covariance matrix $\mathbf{K}$ and its inverse are updated according to the changes from the first two steps.

Figure 4.2: An example showing the difference between an adaptive and a non-adaptive GP model. [Two panels, “non-adaptive GP model” and “adaptive GP model”, each plotting y ± 2σ against t [s]; annotation: “Predicted variance increases in unknown operating region”.]

In our specific case we have an EGP model of NARX form whose incoming data consists of an input vector $\mathbf{x}_i$ of delayed inputs and outputs and its target value $y_i$, the current output. For every new incoming data point, its novelty with respect to the current GP model is verified. This is done simply by predicting the output mean value $E[y_i^*]$ for the incoming input vector $\mathbf{x}_i$ and calculating the error:

$$e = E[y_i^*] - y_i. \tag{4.12}$$

If the magnitude of the error $|e|$ is greater than a pre-set threshold $\zeta_{EGP}$, the element $\{\mathbf{x}_i, y_i\} \in \mathcal{D}$ is added to the active set $\mathcal{D}_A$. A method for excluding elements must be used if the active learning set has to be limited to a maximum size. This methodology will not be discussed here, but more information is available in [28, 32, 39].
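The novelty check based on (4.12) can be sketched as below. The function name, the `predict_mean` callable and the rule for dropping elements (oldest first) are assumptions for illustration only; the source defers the actual exclusion method to [28, 32, 39].

```python
def update_active_set(active_set, x_new, y_new, predict_mean, threshold,
                      max_size=None):
    """Add (x_new, y_new) to the active set if the prediction error is large.

    predict_mean : hypothetical callable returning E[y*] for an input vector
    threshold    : novelty threshold, zeta_EGP in the text
    max_size     : optional cap; the oldest element is dropped when exceeded
                   (a placeholder rule, not the method used in the source)
    Returns True if the element was added.
    """
    error = predict_mean(x_new) - y_new          # equation (4.12)
    if abs(error) <= threshold:
        return False                             # data point is not novel
    active_set.append((x_new, y_new))
    if max_size is not None and len(active_set) > max_size:
        active_set.pop(0)
    return True
```

In a full EGP step this check would be followed by hyperparameter optimization and an update of the covariance matrix and its inverse.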

5 Optimization of control input

The optimization part of the MPC algorithm is shown in Figure 3.1. It mainly depends on the:

• structure of cost function,

• optimization method used,

• construction of the future input signal.

5.1 Cost functions

The cost function is used to evaluate short-term closed-loop simulations performed on a model while varying the input signal to the model. The task of designing a cost function strongly depends on how the notion of “control system performance” is interpreted. A cost function should map the control system performance to a scalar value accurately.

5.1.1 Quadratic cost

The quadratic cost for stochastic control can be expressed as the expected value of the squared Euclidean distance between the reference and the predicted process output with an additional weighting term for the control input signal:

$$J = E[(r(k+1) - y^*(k+1))^2] + \lambda_u u(k)^2. \tag{5.1}$$

Using the fact that $\mathrm{var}[y^*(k+1)] = E[y^*(k+1)^2] - E[y^*(k+1)]^2$, we modify (5.1) into [40]:

$$J = (r(k+1) - E[y^*(k+1)])^2 + \mathrm{var}[y^*(k+1)] + \lambda_u u(k)^2. \tag{5.2}$$

We can infer from (5.2) that the quadratic cost for stochastic control leads to variance minimization. The predicted variance of a GP model can have several sources. If the GP model is used to capture the properties of a stochastic process, the noise source is modeled within the GP model. Another source of GP model variance is the model uncertainty. The control strategy with cost function (5.2) is “to avoid” going into regions with greater variance [7]. In case the controller is not “cautious” enough, a “quick-and-dirty” option is to weight the variance term with a constant $\lambda_{var}$ to enable shaping of the closed-loop response according to the variance information [7]:

$$J = (r(k+1) - E[y^*(k+1)])^2 + \lambda_{var}\mathrm{var}[y^*(k+1)] + \lambda_u u(k)^2. \tag{5.3}$$
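The step from (5.1) to (5.2) can be checked by expanding the square. A short derivation, assuming the reference $r$ is deterministic and writing $y^*$ for $y^*(k+1)$ and $r$ for $r(k+1)$:

```latex
\begin{align*}
E[(r - y^*)^2] &= r^2 - 2r\,E[y^*] + E[(y^*)^2] \\
               &= r^2 - 2r\,E[y^*] + E[y^*]^2 + \left(E[(y^*)^2] - E[y^*]^2\right) \\
               &= (r - E[y^*])^2 + \mathrm{var}[y^*].
\end{align*}
```

The variance therefore enters the expected quadratic cost with unit weight, and introducing $\lambda_{var}$ amounts to rescaling this term.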

The cost function from (5.3) can be further modified to fit the receding horizon concept. Instead of using just a one-step prediction, we can extend the cost to a sum over multi-step predictions, including a weighted control input [7]:

$$J = \sum_{i=1}^{H_p} \left[\lambda_{e,i}(r(k+i) - E[y^*(k+i)])^2 + \lambda_{var,i}\mathrm{var}[y^*(k+i)] + \lambda_{u,i}u(k+i)^2\right], \tag{5.4}$$

where each term has its own weight $\lambda_{e,i}$, $\lambda_{var,i}$ and $\lambda_{u,i}$.

For practical reasons, the control input term of the cost can be modified to be proportional to the squared distance $(u(k+i) - u_s)^2$, where $u_s$ is the value of the control input at the steady state of the modeled process [4, 41, 42]. The steady-state control input is calculated using the NARX model by finding the solution of an implicit algebraic equation:

$$y_k = f(y_{k-1}, \ldots, y_{k-L}, u_{k-1}, \ldots, u_{k-L}), \tag{5.5}$$

under the conditions:

$$r_s = y_k = y_{k-1} = \ldots = y_{k-L}, \tag{5.6}$$

$$u_s = u_{k-1} = \ldots = u_{k-L}, \tag{5.7}$$

where $r_s$ is the desired setpoint of the controlled system. In practice, the steady-state target $u_s$ is calculated numerically with a gradient optimization method. The quadratic cost function with the steady-state target has the following form:

$$J = \sum_{i=1}^{H_p} \left[\lambda_{e,i}(r(k+i) - E[y^*(k+i)])^2 + \lambda_{var,i}\mathrm{var}[y^*(k+i)] + \lambda_{u,i}(u(k+i) - u_s)^2\right]. \tag{5.8}$$
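The steady-state condition (5.5) to (5.7) is a scalar root-finding problem in $u_s$. The sketch below uses bisection instead of the gradient method mentioned in the text, simply because it is short and dependency-free; `f` is a hypothetical NARX mean predictor, and a sign change of the residual on the input interval is assumed.

```python
def steady_state_input(f, r_s, lags, u_min, u_max, tol=1e-9, max_iter=200):
    """Find u_s such that f([r_s]*lags, [u_s]*lags) == r_s by bisection.

    f    : hypothetical NARX mean predictor f(y_lags, u_lags)
    lags : number of delayed outputs and inputs (L in the text)
    Assumes the residual changes sign on [u_min, u_max].
    """
    def residual(u):
        return f([r_s] * lags, [u] * lags) - r_s

    lo, hi = u_min, u_max
    r_lo, r_hi = residual(lo), residual(hi)
    if r_lo * r_hi > 0:
        raise ValueError("no sign change on the given input interval")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        r_mid = residual(mid)
        if abs(r_mid) < tol or (hi - lo) < tol:
            return mid
        if r_lo * r_mid <= 0:
            hi, r_hi = mid, r_mid      # root lies in the lower half
        else:
            lo, r_lo = mid, r_mid      # root lies in the upper half
    return 0.5 * (lo + hi)
```

For the toy model `f = lambda y, u: 0.8 * y[0] + 0.5 * u[0]` and setpoint `r_s = 1.0`, the residual is `0.5 * u - 0.2`, so the function returns a value close to 0.4.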

In our case, the weights from (5.8) are reduced to two parameters $\lambda_{var}$ and $\lambda_u$ in order to simplify the cost function design:

$$\lambda_{e,i} = 1 \cdot \frac{H_p - i + 1}{H_p}, \tag{5.9}$$

$$\lambda_{var,i} = \lambda_{var}\frac{H_p - i + 1}{H_p}, \tag{5.10}$$

$$\lambda_{u,i} = \lambda_u\frac{H_p - i + 1}{H_p}. \tag{5.11}$$

The weight values decrease linearly along the prediction horizon $H_p$. Each further step of a multi-step prediction is less accurate than the previous one, so it is reasonable to shift the sensitivity of the cost function towards the beginning of the multi-step model prediction. This kind of cost function will also be used in the two case studies.
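A minimal sketch of evaluating the cost (5.8) with the weights (5.9) to (5.11) is given below. The function names and the argument layout (one sequence entry per prediction step) are assumptions made for illustration.

```python
def linear_weights(hp, scale=1.0):
    """Weights scale * (Hp - i + 1) / Hp for i = 1..Hp, as in (5.9)-(5.11)."""
    return [scale * (hp - i + 1) / hp for i in range(1, hp + 1)]


def quadratic_cost(ref, mean, var, u, u_s, lam_var, lam_u):
    """Evaluate cost (5.8) over the prediction horizon.

    ref, mean, var, u are sequences of length Hp holding r(k+i),
    E[y*(k+i)], var[y*(k+i)] and u(k+i) for i = 1..Hp.
    """
    hp = len(ref)
    w_e = linear_weights(hp)
    w_var = linear_weights(hp, lam_var)
    w_u = linear_weights(hp, lam_u)
    return sum(
        w_e[i] * (ref[i] - mean[i]) ** 2
        + w_var[i] * var[i]
        + w_u[i] * (u[i] - u_s) ** 2
        for i in range(hp)
    )
```

With `hp = 4` the error weights are `[1.0, 0.75, 0.5, 0.25]`, so a tracking error at the first predicted step counts four times as much as the same error at the last step.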

5.1.2 Saturating cost

The saturating cost function proposed in [43, 44],

$$J_i = E\left\{1 - \exp\left(-\frac{1}{2a^2}(y^*(i) - r(i))^2\right)\right\}, \tag{5.12}$$

is locally quadratic but saturates at unity for large deviations between the desired process output $r$ and the model prediction $y^*$. The saturating cost from (5.12) is built from an unnormalized Gaussian function³ with mean $r$ and variance parameter $a^2$. The expected value in (5.12) can be solved analytically, and we get:

$$J_i = 1 - \left(\frac{a^2}{a^2 + \mathrm{var}[y^*(i)]}\right)^{1/2} \exp\left(-\frac{(r(i) - E[y^*(i)])^2}{2(a^2 + \mathrm{var}[y^*(i)])}\right). \tag{5.13}$$

In comparison with the cost function from (5.8), it involves neither weights nor a control input term, only a single parameter $a$. To form a saturating cost over the receding horizon, we sum over each step of the prediction horizon:

$$J = \sum_{i=k+1}^{k+H_p} J_i. \tag{5.14}$$

³ The term “Gaussian function” should not be confused with “Gaussian probability distribution”, which has the same form as a normalized Gaussian function.


5.2 Optimization method

One of the main reasons why MPC was developed is that it can be used for constrained control problems. In our case, the control input limits are the only constraints we deal with, and these can be handled by saturating the control input if a limit is exceeded. A quasi-Newton method is chosen for unconstrained optimization of the cost function. It is designed to find a local optimum, and it works well with a convex cost function, which is usual in unconstrained linear MPC. One should be careful with non-linear (or constrained) control because the convexity of the cost function is not guaranteed.

6 Case studies

The adaptive MPC-GP method and the underlying theory will be assessed experimentally on two processes defined with recurrence equations: a process named bioreactor and a toy system, further referred to as the “unstable system”. The main properties of both systems are given in Table 6.1.

System               Bioreactor     Unstable
Open-loop stability  yes            no
Linearity            non-linear     linear
Order                2              3
Number of inputs     1              1
Number of outputs    1              1
Input constraint     0 ≤ u ≤ 0.7    −0.5 ≤ u ≤ 0.5

Table 6.1: Main properties of the systems

Eight experiments of closed-loop control were executed with four different cost functions for each of two prediction horizons $H_p$ under the same circumstances, including the same noise signal $\varepsilon$ on the process output. Three cost functions are of quadratic form; the fourth is a saturating one. Each cost function is additionally named by its properties:

• the controlled system it belongs to (B: bioreactor, U: unstable),

• quadratic or saturating form,

• weighting parameters λU, λVAR.


| Cost name | JB,Quad,E | JB,Quad,ES | JB,Quad,EV | JB,Sat |
|---|---|---|---|---|
| Form | quadratic | quadratic | quadratic | saturating |
| Parameter λU | 0 | 0.07 | 0 | / |
| Parameter λV | 0 | 0 | 0.14 | / |
| Parameter a | / | / | / | 1 |

Table 6.2: Four cost functions with different parameter settings. λU forces the control input to be closer to the steady-state input, λV increases the predicted variance term of the quadratic cost, a sets the margin between a local quadratic and the saturated shape of the saturating cost function, relative to the predicted variance.
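To make the roles of λU, λV and a more concrete, the following sketch evaluates one-step costs for a Gaussian prediction with mean `mu` and variance `var`. The exact cost definitions are given earlier in the paper and are not reproduced in this section, so both formulas below are assumptions: the quadratic form simply adds weighted input and variance penalties to the squared tracking error, and the saturating form is the commonly used expected value of 1 − exp(−(y − r)²/(2a²)) under a normal distribution. The numeric inputs (`mu`, `var`, `r`, `u`, `u_ss`) are hypothetical.

```python
import math


def quadratic_cost(mu, var, r, u, u_ss, lam_u=0.0, lam_v=0.0):
    """Squared tracking error of the mean plus optional input and variance penalties."""
    return (r - mu) ** 2 + lam_u * (u - u_ss) ** 2 + lam_v * var


def saturating_cost(mu, var, r, a=1.0):
    """Expected value of 1 - exp(-(y - r)^2 / (2 a^2)) for y ~ N(mu, var)."""
    s2 = a ** 2 + var
    return 1.0 - math.sqrt(a ** 2 / s2) * math.exp(-((mu - r) ** 2) / (2.0 * s2))


# Hypothetical prediction and operating point; weights taken from the bioreactor settings.
mu, var, r, u, u_ss = 0.08, 0.01, 0.10, 0.40, 0.30
j_e = quadratic_cost(mu, var, r, u, u_ss)                # lam_u = 0, lam_v = 0
j_es = quadratic_cost(mu, var, r, u, u_ss, lam_u=0.07)   # input weighting
j_ev = quadratic_cost(mu, var, r, u, u_ss, lam_v=0.14)   # variance weighting
j_sat = saturating_cost(mu, var, r, a=1.0)               # bounded between 0 and 1
```

Under these assumed forms, the variance weight raises the cost wherever the model is uncertain, while the saturating cost stays bounded for large tracking errors.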

6.1 Bioreactor

This dynamical system is a highly simplified version of a real bioreactor process [45]. It is an open-loop stable, non-linear, second-order system, described by the recurrence equations (6.1) to (6.3):

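$$x_1(k+1) = x_1(k) + \frac{1}{2}\,\frac{x_1(k)\,x_2(k)}{x_1(k) + x_2(k)} - \frac{1}{2}\,u(k)\,x_1(k), \tag{6.1}$$

$$x_2(k+1) = x_2(k) - \frac{1}{2}\,\frac{x_1(k)\,x_2(k)}{x_1(k) + x_2(k)} - \frac{1}{2}\,u(k)\,x_2(k) + \frac{1}{20}\,u(k), \tag{6.2}$$

$$y(k) = x_1(k) + \varepsilon(k), \tag{6.3}$$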

where u is the system input, limited to [0, 0.7], x1 and x2 are the system states, and the output y is contaminated with normally distributed noise ε with p(ε) = N(0, 0.001). The adaptive MPC-EGP algorithm requires an initial GP model to perform effectively. A simple proportional (P) regulator was used to train a GP model in closed loop during the first time steps, 0 ≤ k ≤ 30. At k > 30 the adaptive MPC-EGP regulator was activated and replaced the proportional one. The error threshold for the EGP model update is set to ζEGP = 0.0021, and the EGP active learning set is restricted to a maximum of 15 learning points. Such a small active learning set would probably be too risky for control of real systems. In our case, a smaller dataset can lead to higher prediction uncertainty (and also inaccuracy). This is of interest here because the cost function is meant to involve the prediction variance, which partially derives from the model uncertainty.
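The bioreactor recurrence (6.1) to (6.3) is straightforward to simulate, which may help when reproducing the experiment. The sketch below is a minimal open-loop simulation; the initial states, the constant input sequence and the random seed are hypothetical, and the second argument of N(0, 0.001) is interpreted here as a variance, which is an assumption.

```python
import math
import random


def bioreactor_step(x1, x2, u):
    """Advance the bioreactor states by one sample, following (6.1) and (6.2)."""
    u = min(max(u, 0.0), 0.7)  # input constraint 0 <= u <= 0.7
    growth = 0.5 * x1 * x2 / (x1 + x2)
    x1_next = x1 + growth - 0.5 * u * x1
    x2_next = x2 - growth - 0.5 * u * x2 + 0.05 * u
    return x1_next, x2_next


def simulate(u_sequence, x1=0.1, x2=0.1, noise_var=0.001, seed=0):
    """Return the noisy outputs y(k) = x1(k) + eps(k) for a given input sequence."""
    rng = random.Random(seed)
    outputs = []
    for u in u_sequence:
        # Assumption: 0.001 in N(0, 0.001) is the noise variance.
        outputs.append(x1 + rng.gauss(0.0, math.sqrt(noise_var)))
        x1, x2 = bioreactor_step(x1, x2, u)
    return outputs


outputs = simulate([0.3] * 50)  # hypothetical constant input inside the limits
```

The same `bioreactor_step` function could serve as the "true" process when collecting input-output data for identifying the initial GP model.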

Only a representative segment of the closed-loop performance is shown in Figure 6.1, for two different prediction horizons Hp and the four cost functions listed in Table 6.2.

The algorithm performed relatively well for all configurations (Figure 6.1). The control using cost JB,Quad,ES produced a smoother control signal and a slower closed-loop response (for both horizons, Hp = 1 and Hp = 8), as expected. Using the cost JB,Quad,EV or JB,Sat does not improve the control performance significantly compared to the simpler cost JB,Quad,E. One would expect the prediction variance to slow down the closed-loop response, because the cost should increase when predicting in a less known operating region.

[Figure 6.1, panels (a) to (h): (a) JB,Quad,E, Hp = 1; (b) JB,Quad,E, Hp = 8; (c) JB,Quad,ES, Hp = 1; (d) JB,Quad,ES, Hp = 8; (e) JB,Quad,EV, Hp = 1; (f) JB,Quad,EV, Hp = 8; (g) JB,Sat, Hp = 1; (h) JB,Sat, Hp = 8. Upper window: y(t) ± 2σ, range 0.04 to 0.1; lower window: u(t), range 0 to 0.6; horizontal axis: k from 290 to 360.]

Figure 6.1: Closed-loop control of the bioreactor. The upper window contains the reference signal (blue), the process output (red) and the one-step prediction mean with double standard deviation (black with gray band). The lower window shows the control input.

| Cost name | JU,Quad,E | JU,Quad,ES | JU,Quad,EV | JU,Sat |
|---|---|---|---|---|
| Form | quadratic | quadratic | quadratic | saturating |
| Parameter λU | 0 | 8 · 10⁻⁴ | 0 | / |
| Parameter λV | 0 | 0 | 2 | / |
| Parameter a | / | / | / | 1 |

Table 6.3: Four cost functions with different parameter settings. λU forces the control input to be closer to the steady-state input, λV increases the predicted variance term of the quadratic cost, a sets the margin between a local quadratic and the saturated shape of the saturating cost function, relative to the predicted variance.

6.2 Unstable system

Another case study of control performance is based on an artificially created linear unstable system, described by the difference equation:

$$\begin{aligned} y(k) ={} & 2.12\,y(k-1) - 1.25\,y(k-2) + 0.09\,y(k-3) \\ & + 0.006\,u(k-1) + 0.016\,u(k-2) + 0.002\,u(k-3) \\ & + 0.001\,\varepsilon(k), \end{aligned} \tag{6.4}$$

where the system input u is limited to [−0.5, 0.5] and y is the system output, contaminated with normally distributed noise ε with p(ε) = N(0, 0.001)⁴. The adaptive MPC-EGP algorithm requires an initial GP model to perform effectively. An MPC regulator with the fixed parametric model (6.4) was used for closed-loop identification in the first time steps, 0 ≤ k ≤ 263. After this initial phase, the adaptive MPC-EGP regulator was activated and replaced the fixed-model MPC regulator. The error threshold for the EGP model update is ζEGP = 0.004, and the upper size limit of the EGP active set is 40 elements.
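The open-loop behaviour of the difference equation (6.4) can be checked with a short simulation. The sketch below applies a constant input from zero initial conditions and omits the noise term by default; the function name `simulate_unstable`, the step amplitude and the simulation length are choices made for this illustration only.

```python
def simulate_unstable(u_sequence, noise=None):
    """Simulate difference equation (6.4) from zero initial conditions."""
    y = [0.0, 0.0, 0.0]  # y(k-3), y(k-2), y(k-1)
    u = [0.0, 0.0, 0.0]  # u(k-3), u(k-2), u(k-1)
    out = []
    for k, u_k in enumerate(u_sequence):
        eps = noise[k] if noise is not None else 0.0
        y_k = (2.12 * y[-1] - 1.25 * y[-2] + 0.09 * y[-3]
               + 0.006 * u[-1] + 0.016 * u[-2] + 0.002 * u[-3]
               + 0.001 * eps)
        out.append(y_k)
        y.append(y_k)
        u.append(min(max(u_k, -0.5), 0.5))  # input constraint -0.5 <= u <= 0.5
    return out


response = simulate_unstable([0.5] * 300)   # open-loop step response, no noise
early = max(abs(v) for v in response[:50])
late = max(abs(v) for v in response[-50:])  # much larger than `early`
```

The amplitude of the simulated response keeps growing with k, which is consistent with the system being listed as open-loop unstable and explains why a stabilising regulator is needed already during the identification phase.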

Only a representative segment of the closed-loop performance is shown in Figure 6.2, for two different prediction horizons Hp and the four cost functions listed in Table 6.3.

The algorithm responded relatively slowly but more smoothly with the larger horizon Hp = 8 (Figure 6.2). A larger prediction horizon makes the control performance less sensitive to the choice of cost function. With the prediction horizon Hp = 1, in contrast, the control was feasible and smoother using the costs JU,Quad,ES and JU,Quad,EV compared to JU,Quad,E and JU,Sat. On the other hand, the performance using Hp = 1 is very sensitive to the chosen cost function parameters, and the closed loop would become unstable if they were increased sufficiently.

[Figure 6.2, panels (a) to (h): (a) JU,Quad,E, Hp = 1; (b) JU,Quad,E, Hp = 8; (c) JU,Quad,ES, Hp = 1; (d) JU,Quad,ES, Hp = 8; (e) JU,Quad,EV, Hp = 1; (f) JU,Quad,EV, Hp = 8; (g) JU,Sat, Hp = 1; (h) JU,Sat, Hp = 8. Upper window: y(t) ± 2σ, range −0.1 to 0.1; lower window: u(t), range −0.5 to 0.5; horizontal axis: k from 520 to 720.]

Figure 6.2: Closed-loop control of the “unstable” system. The upper window contains the reference signal (blue), the process output (red) and the one-step prediction mean with double standard deviation (black with gray band). The lower window shows the control input.

⁴ One should note that the noise signal ε is the same for the bioreactor and the unstable system, but the operating region of the unstable system is two times larger than that of the bioreactor.

7 Conclusion

Our goal was to experimentally evaluate the effects of various cost functions on the controlled system performance. The results from the bioreactor control and the unstable system control (Figure 6.2) show that the performance with a larger horizon is acceptable and less sensitive to parameters than with Hp = 1. Weighting the control input signal inside the quadratic cost from (5.8) does provide a smoother control input signal and might improve the control performance (Figure 6.2c). Weighting the prediction variance inside the quadratic cost might also yield a smoother control input signal (Figure 6.2e). No notable difference was observed between the saturating cost JSat and the quadratic cost JQuad,E for the given parameter a = 1.

Some practical issues were noticed. Firstly, the developed identification and control algorithms are both computationally demanding, but this is not a problem when the real controlled systems have longer time constants. Secondly, regarding the cost function optimization, an uncertain model might cause a non-convex cost function, which is non-trivial to minimize.

Many issues remain open in the EGP methodology. One should note that we implemented an adaptive control algorithm which adapts the GP model on-line, so its predictions carry a much smaller uncertainty than those of an offline GP model. A small prediction variance cannot have a large impact on control. A large increase of the criterion weights λi,VAR could make the prediction uncertainty more influential, but might lead to unwanted results. Another problem is the validation of an EGP model at a specific time instant: we can validate the overall performance of the EGP model during system control, but it remains an open question how to measure the accuracy of the EGP model at a specific moment during control if the model is time-varying.

The control design based on EGP is known for stable systems. More focus is needed on the control of unstable systems. GP models are good for interpolation between two known regions, but it is an open question whether GP models can be used for extrapolation into unknown regions of a dynamical system during closed-loop control. Such an issue could lead to the study of local and global stability of a closed-loop system. More properties of dynamical GP models should be investigated in order to understand and improve closed-loop control of unstable systems. Yet another question is how to reach convergence of closed-loop performance starting with an initial model that is as simple as possible. This question is closely related to dual adaptive control.

Robust control is a branch of control theory that explicitly deals with uncertainty in its approach to controller design, but it is non-adaptive. The GP modeling framework gives an opportunity to work with models in which uncertainty is handled by nature. Robust control based on GP models is another aspect that can be further studied and developed for industry needs.

Bibliography

[1] J. Doyle, B. Francis, and A. Tannenbaum, Feedback Control Theory. Macmillan

Publishing Co., 1990.

[2] S. J. Qin and T. A. Badgwell, “A survey of industrial model predictive control

technology,” Control Engineering Practice, vol. 11, p. 733–764, 2003.

[3] S. Qin and T. Badgwell, “An overview of nonlinear model predictive control appli-

cations,” in Nonlinear Model Predictive Control (F. Allgower and A. Zheng, eds.),

vol. 26 of Progress in Systems and Control Theory, pp. 369–392, Birkhauser Basel,

2000.

[4] J. B. Rawlings, “Tutorial overview of model predictive control,” IEEE Control

Systems Magazine, vol. 20, pp. 38–52, 2000.

[5] F. Allgower, R. Findeisen, and Z. K. Nagy, “Nonlinear model predictive control:

From theory to application,” Journal of the Chinese Institute of Chemical Engi-

neers, vol. 35, pp. 299–315, 2004.

[6] C. E. Garcıa, D. M. Prett, and M. Morari, “Model Predictive Control: Theory

and Practice - a Survey,” Automatica, vol. 25, pp. 335–348, 1989.

[7] J. Kocijan, “Control Algorithms Based on Gaussian Process Models: A State-of-

the-Art Survey,” in Special International Conference on Complex systems: synergy

of control communications and computing, September 16-20, 2011, Ohrid, Republic

of Macedonia. Proceedings of COSY 2011 papers, 2011.


[8] N. Filatov and H. Unbehauen, “Survey of adaptive dual control methods,” IEE

Proceedings - Control Theory and Applications, vol. 147, pp. 118–128, 2000.

[9] B. Wittenmark, “Adaptive dual control,” in Control Systems, Robotics and Au-

tomation, Encyclopedia of Life Support Systems (EOLSS), Developed under the

auspices of the UNESCO, Oxford, UK: Eolss Publishers, Jan. 2002.

[10] J. Alster and P. Belanger, “A Technique for Dual Adaptive Control,” Automatica,

vol. 10, pp. 627–634, 1974.

[11] S. Fabri and V. Kadirkamanathan, “Dual Adaptive Control of Nonlinear Stochastic Systems using Neural Networks,” Automatica, vol. 34, pp. 245–253, 1998.

[12] D. P. Bertsekas, “Dynamic programming and suboptimal control: A survey from

adp to mpc,” European Journal of Control, vol. 11, pp. 310–334, 2005.

[13] K. Astrom, “Theory and applications of adaptive control—a survey,” Automatica,

vol. 19, no. 5, pp. 471 – 486, 1983.

[14] R. Sałat, M. Awtoniuk, and K. Korpysz, “Black-Box system identification by means of Support Vector Regression and Imperialist Competitive Algorithm,” in Przegląd Elektrotechniczny, 2013.

[15] H. Arellano-Garcia, M. Wendt, T. Barz, and G. Wozny, “Close-loop stochastic dy-

namic optimization under probabilistic output-constraints,” in Assessment and Fu-

ture Directions of Nonlinear Model Predictive Control (R. Findeisen, F. Allgower,

and L. Biegler, eds.), vol. 358 of Lecture Notes in Control and Information Sci-

ences, pp. 305–315, Springer Berlin Heidelberg, 2007.

[16] J. Kocijan, “Dynamic GP Models: An Overview and Recent Developments,” in

Proceedings of the 6th International Conference on Applied Mathematics, Simu-

lation, Modelling, ASM’12, (Stevens Point, Wisconsin, USA), pp. 38–43, World

Scientific and Engineering Academy and Society (WSEAS), 2012.

[17] K. Azman and J. Kocijan, “Dynamical Systems Identification Using Gaussian Pro-

cess Models with Incorporated Local Models,” Journal of Engineering Applications

of Artificial Intelligence, vol. 24, pp. 398–408, Mar. 2011.

[18] D. Petelin and J. Kocijan, “Application of on-line Gaussian process models for

pressure signal,” in 11th International PhD Workshop on Systems and Control,

2010.

[19] F. Perez-Cruz, S. Van Vaerenbergh, J. Murillo-Fuentes, M. Lazaro-Gredilla, and I. Santamaria, “Gaussian processes for nonlinear signal processing: An overview of recent advances,” Signal Processing Magazine, IEEE, vol. 30, pp. 40–50, July 2013.

[20] J. Kocijan, A. Girard, B. Banko, and R. Murray-Smith, “Dynamic systems identi-

fication with Gaussian processes,” Mathematical and Computer Modelling of Dy-

namical Systems, vol. 11, pp. 411–424, 2005.

[21] C. K. I. Williams and C. E. Rasmussen, “Gaussian processes for regression,” Ad-

vances in Neural Information Processing Systems, vol. 8, pp. 514–520, 1996.

[22] G. Gregorcic and G. Lightbody, “Gaussian processes for modelling of dynamic

non-linear systems,” in Proceedings of the Irish Signals and Systems Conference,

(Cork, Ireland), pp. 141–147, 2002.

[23] J. Quinonero-Candela and C. Rasmussen, “A Unifying View of Sparse Approximate Gaussian Process Regression,” Journal of Machine Learning Research, vol. 6, pp. 1939–1959, 2005.

[24] M. Lazaro-Gredilla, J. Quinonero-Candela, C. E. Rasmussen, and A. R. Figueiras-

Vidal, “Sparse Spectrum Gaussian Process Regression,” Journal of Machine

Learning Research, vol. 11, pp. 1865–1881, 2010.

[25] R. Turner, M. P. Deisenroth, and C. E. Rasmussen, “State-space inference and

learning with Gaussian processes,” in Proceedings of 13th International Conference

on Artificial Intelligence and Statistics, vol. 9, (Sardinia, Italy), pp. 868–875, 2010.

[26] D. Nguyen-Tuong, M. Seeger, and J. Peters, “Computed torque control with non-

parametric regression models,” in Proceedings of the American Control Conference

(ACC), pp. 212–217, 2008.

[27] R. Neal, “Regression and classification using gaussian process priors,” Bayesian

Statistics, vol. 6, pp. 475–501, 1998.

[28] D. Petelin and J. Kocijan, “Evolving Gaussian process models for predicting

chaotic time-series,” in IEEE Conference on Evolving and Adaptive Intelligent

Systems (EAIS), 2014. Accepted for publication.

[29] H. Nickisch and C. Rasmussen, “Approximations for binary gaussian process clas-

sification,” Journal of Machine Learning Research, vol. 9, pp. 2035–2078, 2008.

[30] G. Pillonetto, F. Dinuzzo, T. Chen, G. D. Nicolao, and L. Ljung, “Kernel methods

in system identification, machine learning and function estimation: A survey,”

Automatica, vol. 50, no. 3, pp. 657–682, 2014.


[31] C. Campbell, “Kernel methods: a survey of current techniques,” Neurocomputing,

vol. 48, pp. 63–84, 2002.

[32] D. Petelin and J. Kocijan, “Control system with evolving Gaussian process mod-

els,” in Evolving and Adaptive Intelligent Systems (EAIS), 2011 IEEE Workshop

on, pp. 178–184, 2011.

[33] G. Shen and Y. Cao, “A Gaussian Process Based Model Predictive Controller for

Nonlinear Systems with Uncertain Input-output Delay,” Applied Mechanics and

Materials, vol. 433-435, pp. 1015–1020, 2013.

[34] J. M. Maciejowski and X. Yang, “Fault tolerant control using Gaussian processes and model predictive control,” in Conference on Control and Fault-Tolerant Systems (SysTol), Nice, France, October 2013.

[35] J. Kocijan, R. Murray-Smith, C. Rasmussen, and B. Likar, “Predictive control

with gaussian process models,” in EUROCON 2003. Computer as a Tool. The

IEEE Region 8, vol. 1, pp. 352–356 vol.1, Sept 2003.

[36] R. Murray-Smith, D. Sbarbaro, C. E. Rasmussen, and A. Girard, “Adaptive, cau-

tious, predictive control with Gaussian process priors,” in Proceedings of 13th IFAC

Symposium on System Identification, (Rotterdam, Netherlands), 2003.

[37] R. Murray-Smith and A. Girard, “Gaussian Process priors with ARMA noise mod-

els,” in Irish Signals and Systems Conference, Maynooth, Ireland, (Maynooth,

Ireland), pp. 147–152, 2001.

[38] P. Angelov, D. P. Filev, and N. Kasabov, “Evolving intelligent systems: Methodol-

ogy and applications,” in IEEE Press Series on Computational Intelligence, Wiley

IEEE Press, April 2010.

[39] D. Petelin, A. Grancharova, and J. Kocijan, “Evolving Gaussian process models

for prediction of ozone concentration in the air,” Simulation Modelling Practice

and Theory, vol. 33, no. 0, pp. 68 – 80, 2013.

[40] R. Murray-Smith and D. Sbarbaro, “Nonlinear adaptive control using nonparamet-

ric Gaussian process prior models,” in Proceedings of IFAC 15th World Congress,

(Barcelona, Spain), 2002.

[41] C. V. Rao and J. B. Rawlings, “Steady states and constraints in model predictive

control,” AIChE Journal, vol. 45, pp. 1266–1279, 1999.

[42] K. R. Muske, “Steady-state target optimization in linear model predictive control,”

in Proceedings of the American Control Conference, 1997.


[43] M. P. Deisenroth and C. E. Rasmussen, “PILCO: A model-based and data-efficient approach to policy search,” in Proceedings of the International Conference on Machine Learning, 2011.

[44] M. P. Deisenroth and C. E. Rasmussen, “Efficient reinforcement learning for motor

control,” in Proceedings of the 10th International PhD Workshop on Systems and

Control, a Young Generation Viewpoint, (Hluboka nad Vltavou, Czech Republic),

2009.

[45] K. Azman and J. Kocijan, “Application of Gaussian processes for black-box mod-

elling of biosystems,” ISA Transactions, vol. 46, no. 4, pp. 443 – 457, 2007.