Rao-Blackwellised Particle Filtering for Dynamic Bayesian Networks

Arnaud Doucet
Engineering Dept., Cambridge University
[email protected]

Nando de Freitas, Kevin Murphy, Stuart Russell
Computer Science Dept., UC Berkeley
{jfgf,murphyk,russell}@cs.berkeley.edu

Abstract

Particle filters (PFs) are powerful sampling-based inference/learning algorithms for dynamic Bayesian networks (DBNs). They allow us to treat, in a principled way, any type of probability distribution, nonlinearity and non-stationarity. They have appeared in several fields under such names as "condensation", "sequential Monte Carlo" and "survival of the fittest". In this paper, we show how we can exploit the structure of the DBN to increase the efficiency of particle filtering, using a technique known as Rao-Blackwellisation. Essentially, this samples some of the variables, and marginalizes out the rest exactly, using the Kalman filter, HMM filter, junction tree algorithm, or any other finite-dimensional optimal filter. We show that Rao-Blackwellised particle filters (RBPFs) lead to more accurate estimates than standard PFs. We demonstrate RBPFs on two problems, namely non-stationary online regression with radial basis function networks and robot localization and map building. We also discuss other potential application areas and provide references to some finite-dimensional optimal filters.

1 INTRODUCTION

State estimation (online inference) in state-space models is widely used in a variety of computer science and engineering applications. However, the two most famous algorithms for this problem, the Kalman filter and the HMM filter, are only applicable to linear-Gaussian models and models with finite state spaces, respectively. Even when the state space is finite, it can be so large that the HMM or junction tree algorithms become too computationally expensive.
This is typically the case for large discrete dynamic Bayesian networks (DBNs) (Dean and Kanazawa 1989): inference requires, at each time step, space and time that are exponential in the number of hidden nodes.

To handle these problems, sequential Monte Carlo methods, also known as particle filters (PFs), have been introduced (Handschin and Mayne 1969, Akashi and Kumamoto 1977). In the mid 1990s, several PF algorithms were proposed independently under the names of Monte Carlo filters (Kitagawa 1996), sequential importance sampling (SIS) with resampling (SIR) (Doucet 1998), bootstrap filters (Gordon, Salmond and Smith 1993), condensation trackers (Isard and Blake 1996), dynamic mixture models (West 1993), survival of the fittest (Kanazawa, Koller and Russell 1995), etc. One of the major innovations during the 1990s was the inclusion of a resampling step to avoid degeneracy problems inherent to the earlier algorithms (Gordon et al. 1993). In the late nineties, several statistical improvements for PFs were proposed, and some important theoretical properties were established. In addition, these algorithms were applied and tested in many domains: see (Doucet, de Freitas and Gordon 2000) for an up-to-date survey of the field.

One of the major drawbacks of PFs is that sampling in high-dimensional spaces can be inefficient. In some cases, however, the model has "tractable substructure", which can be analytically marginalized out, conditional on certain other nodes being imputed, cf. cutset conditioning in static Bayes nets (Pearl 1988). The analytical marginalization can be carried out using standard algorithms, such as the Kalman filter, the HMM filter, the junction tree algorithm for general DBNs (Cowell, Dawid, Lauritzen and Spiegelhalter 1999), or any other finite-dimensional optimal filter. The advantage of this strategy is that it can drastically reduce the size of the space over which we need to sample.
Marginalizing out some of the variables is an example of the technique called Rao-Blackwellisation, because it is related to the Rao-Blackwell formula: see (Casella and Robert 1996) for a general discussion. Rao-Blackwellised particle filters (RBPFs) have been applied in specific contexts such as mixtures of Gaussians (Akashi and Kumamoto 1977, Doucet 1998, Doucet, Godsill and Andrieu
2000), fixed parameter estimation (Kong, Liu and Wong 1994), HMMs (Doucet 1998, Doucet, Godsill and Andrieu 2000) and Dirichlet process models (MacEachern, Clyde and Liu 1999). In this paper, we develop the general theory of RBPFs, and apply it to several novel types of DBNs. We omit the proofs of the theorems for lack of space: please refer to the technical report (Doucet, Gordon and Krishnamurthy 1999).
2 PROBLEM FORMULATION
Let us consider the following general state-space model/DBN with hidden variables $x_t$ and observed variables $y_t$. We assume that $x_t$ is a Markov process with initial distribution $p(x_0)$ and transition equation $p(x_t \mid x_{t-1})$. The observations $y_{1:t} \triangleq \{y_1, \ldots, y_t\}$ are assumed to be conditionally independent given the process $x_{0:t}$, with marginal distribution $p(y_t \mid x_t)$. Given these observations, the inference of any subset or property of the states $x_{0:t} \triangleq \{x_0, \ldots, x_t\}$ relies on the joint posterior distribution $p(x_{0:t} \mid y_{1:t})$. Our objective is, therefore, to estimate this distribution, or some of its characteristics such as the filtering density $p(x_t \mid y_{1:t})$ or the minimum mean square error (MMSE) estimate $\mathbb{E}[x_t \mid y_{1:t}]$. The posterior satisfies the following recursion:
$$p(x_{0:t} \mid y_{1:t}) = p(x_{0:t-1} \mid y_{1:t-1}) \, \frac{p(y_t \mid x_t)\, p(x_t \mid x_{t-1})}{p(y_t \mid y_{1:t-1})}. \quad (1)$$
If one attempts to solve this problem analytically, one ob-tains integrals that are not tractable. One, therefore, has toresort to some form of numerical approximation scheme. Inthis paper, we focus on sampling-based methods. Advan-tages and disadvantages of other approaches are discussedat length in (de Freitas 1999).
The above description assumes that there is no structure within the hidden variables. But suppose we can divide the hidden variables $x_t$ into two groups, $r_t$ and $z_t$, such that $p(x_t \mid x_{t-1}) = p(z_t \mid r_t, z_{t-1}, r_{t-1}) \, p(r_t \mid r_{t-1})$ and, conditional on $r_{0:t}$, the conditional posterior distribution $p(z_{0:t} \mid y_{1:t}, r_{0:t})$ is analytically tractable. Then we can easily marginalize out $z_{0:t}$ from the posterior, and only need to focus on estimating $p(r_{0:t} \mid y_{1:t})$, which lies in a space of reduced dimension. Formally, we are making use of the following decomposition of the posterior, which follows from the chain rule:
$$p(r_{0:t}, z_{0:t} \mid y_{1:t}) = p(z_{0:t} \mid y_{1:t}, r_{0:t}) \, p(r_{0:t} \mid y_{1:t}),$$
where $p(r_{0:t} \mid y_{1:t})$ satisfies the recursion
$$p(r_{0:t} \mid y_{1:t}) = p(r_{0:t-1} \mid y_{1:t-1}) \, \frac{p(y_t \mid y_{1:t-1}, r_{0:t}) \, p(r_t \mid r_{t-1})}{p(y_t \mid y_{1:t-1})}. \quad (2)$$
The problem of how to automatically identify which variables should be sampled, and which can be handled analytically, is one we are currently working on. We anticipate that algorithms similar to cutset conditioning (Becker, Bar-Yehuda and Geiger 1999) might prove useful.
If eq. (1) does not admit a closed-form expression, then eq. (2) does not admit one either, and sampling-based methods are also required. Since the dimension of $p(r_{0:t} \mid y_{1:t})$ is smaller than that of $p(r_{0:t}, z_{0:t} \mid y_{1:t})$, we should expect to obtain better results.
In the following section, we review the importance sampling (IS) method, which is the core of PF, and quantify the improvements one can expect by marginalizing out $z_{0:t}$, i.e. using the so-called Rao-Blackwellised estimate. Subsequently, in Section 4, we describe a general RBPF algorithm and detail the implementation issues.
3 IMPORTANCE SAMPLING AND RAO-BLACKWELLISATION
If we were able to sample $N$ i.i.d. random samples (particles) $\{(r_{0:t}^{(i)}, z_{0:t}^{(i)}); i = 1, \ldots, N\}$ according to $p(r_{0:t}, z_{0:t} \mid y_{1:t})$, then an empirical estimate of this distribution would be given by
$$\widehat{p}_N(dr_{0:t}, dz_{0:t} \mid y_{1:t}) = \frac{1}{N} \sum_{i=1}^{N} \delta_{(r_{0:t}^{(i)}, z_{0:t}^{(i)})}(dr_{0:t}, dz_{0:t}),$$
where $\delta_{(r_{0:t}^{(i)}, z_{0:t}^{(i)})}(dr_{0:t}, dz_{0:t})$ denotes the Dirac delta function located at $(r_{0:t}^{(i)}, z_{0:t}^{(i)})$. As a corollary, an estimate of the filtering distribution $p(r_t, z_t \mid y_{1:t})$ is $\widehat{p}_N(dr_t, dz_t \mid y_{1:t}) = \frac{1}{N} \sum_{i=1}^{N} \delta_{(r_t^{(i)}, z_t^{(i)})}(dr_t, dz_t)$. Hence one can easily estimate the expected value of any function $f_t$ of the hidden variables w.r.t. this distribution, $E(f_t)$, using
$$\overline{E}_N(f_t) = \int f_t(r_{0:t}, z_{0:t}) \, \widehat{p}_N(dr_{0:t}, dz_{0:t} \mid y_{1:t}) = \frac{1}{N} \sum_{i=1}^{N} f_t\big(r_{0:t}^{(i)}, z_{0:t}^{(i)}\big).$$
This estimate is unbiased and, from the strong law of large numbers (SLLN), $\overline{E}_N(f_t)$ converges almost surely (a.s.) towards $E(f_t)$ as $N \to \infty$. If $\sigma_{f_t}^2 \triangleq \mathrm{var}_{p(r_{0:t}, z_{0:t} \mid y_{1:t})}\big[f_t(r_{0:t}, z_{0:t})\big] < \infty$, then a central limit theorem (CLT) holds:
$$\sqrt{N}\,\big(\overline{E}_N(f_t) - E(f_t)\big) \Longrightarrow \mathcal{N}(0, \sigma_{f_t}^2),$$
where $\Longrightarrow$ denotes convergence in distribution. Typically, it is impossible to sample efficiently from the "target" posterior distribution $p(r_{0:t}, z_{0:t} \mid y_{1:t})$ at any time $t$, so we focus on alternative methods.
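As a toy illustration of the perfect-sampling estimate above (the model and the numbers are our own, not from the paper), the expectation of any function of the hidden variables is approximated by an empirical mean over i.i.d. draws, which the SLLN drives to the true value:

```python
import random

def empirical_expectation(f, sampler, n):
    """Estimate E[f(X)] by the empirical mean over n i.i.d. draws,
    as in the perfect-sampling case described above."""
    return sum(f(sampler()) for _ in range(n)) / n

random.seed(0)
# Toy target (illustrative assumption): X ~ N(0, 1), f(x) = x^2, so E[f(X)] = 1.
est = empirical_expectation(lambda x: x * x,
                            lambda: random.gauss(0.0, 1.0),
                            100_000)
```

With 100,000 draws the standard error is about $\sqrt{2/10^5} \approx 0.004$, so the estimate sits very close to 1.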
One way to estimate $p(r_{0:t}, z_{0:t} \mid y_{1:t})$ and $E(f_t)$ consists of using the well-known importance sampling method (Bernardo and Smith 1994). This method is based on the following observation. Let us introduce an arbitrary importance distribution $q(r_{0:t}, z_{0:t} \mid y_{1:t})$, from which it is easy to get samples, and such that $p(r_{0:t}, z_{0:t} \mid y_{1:t}) > 0$ implies $q(r_{0:t}, z_{0:t} \mid y_{1:t}) > 0$. Then
$$E(f_t) = \frac{\mathbb{E}_{q(r_{0:t}, z_{0:t} \mid y_{1:t})}\big[f_t(r_{0:t}, z_{0:t}) \, w(r_{0:t}, z_{0:t})\big]}{\mathbb{E}_{q(r_{0:t}, z_{0:t} \mid y_{1:t})}\big[w(r_{0:t}, z_{0:t})\big]},$$
where the importance weight is equal to
$$w(r_{0:t}, z_{0:t}) = \frac{p(r_{0:t}, z_{0:t} \mid y_{1:t})}{q(r_{0:t}, z_{0:t} \mid y_{1:t})}.$$
Given $N$ i.i.d. samples $\{(r_{0:t}^{(i)}, z_{0:t}^{(i)})\}$ distributed according to $q(r_{0:t}, z_{0:t} \mid y_{1:t})$, a Monte Carlo estimate of $E(f_t)$ is given by
$$\overline{E}_N^{1}(f_t) = \frac{\frac{1}{N}\sum_{i=1}^{N} f_t\big(r_{0:t}^{(i)}, z_{0:t}^{(i)}\big) \, w\big(r_{0:t}^{(i)}, z_{0:t}^{(i)}\big)}{\frac{1}{N}\sum_{j=1}^{N} w\big(r_{0:t}^{(j)}, z_{0:t}^{(j)}\big)} = \sum_{i=1}^{N} \widetilde{w}_{0:t}^{(i)} \, f_t\big(r_{0:t}^{(i)}, z_{0:t}^{(i)}\big),$$
where the normalized importance weights $\widetilde{w}_{0:t}^{(i)}$ are equal to
$$\widetilde{w}_{0:t}^{(i)} = \frac{w\big(r_{0:t}^{(i)}, z_{0:t}^{(i)}\big)}{\sum_{j=1}^{N} w\big(r_{0:t}^{(j)}, z_{0:t}^{(j)}\big)}.$$
This method is equivalent to the following point-mass approximation of $p(r_{0:t}, z_{0:t} \mid y_{1:t})$:
$$\widehat{p}_N(dr_{0:t}, dz_{0:t} \mid y_{1:t}) = \sum_{i=1}^{N} \widetilde{w}_{0:t}^{(i)} \, \delta_{(r_{0:t}^{(i)}, z_{0:t}^{(i)})}(dr_{0:t}, dz_{0:t}).$$
For "perfect" simulation, that is $q(r_{0:t}, z_{0:t} \mid y_{1:t}) = p(r_{0:t}, z_{0:t} \mid y_{1:t})$, we would have $\widetilde{w}_{0:t}^{(i)} = N^{-1}$ for any $i$.
In practice, we will try to select the importance distribution as close as possible to the target distribution in a given sense. For $N$ finite, $\overline{E}_N^{1}(f_t)$ is biased (since it is a ratio of estimates), but according to the SLLN, $\overline{E}_N^{1}(f_t)$ converges asymptotically a.s. towards $E(f_t)$. Under additional assumptions, a CLT also holds.
Now consider the case where one can marginalize out $z_{0:t}$ analytically; then we can propose an alternative estimate of $E(f_t)$ with a reduced variance. As $p(r_{0:t}, z_{0:t} \mid y_{1:t}) = p(r_{0:t} \mid y_{1:t}) \, p(z_{0:t} \mid y_{1:t}, r_{0:t})$, where $p(z_{0:t} \mid y_{1:t}, r_{0:t})$ is a distribution that can be computed exactly, an approximation of $p(r_{0:t} \mid y_{1:t})$ straightforwardly yields an approximation of $p(r_{0:t}, z_{0:t} \mid y_{1:t})$. Moreover, if $\mathbb{E}_{p(z_{0:t} \mid y_{1:t}, r_{0:t})}\big[f_t(r_{0:t}, z_{0:t})\big]$ can be evaluated in a closed-form expression, then the following alternative importance sampling estimate of $E(f_t)$ can be used:
$$\overline{E}_N^{2}(f_t) = \frac{\sum_{i=1}^{N} \mathbb{E}_{p(z_{0:t} \mid y_{1:t}, r_{0:t}^{(i)})}\big[f_t\big(r_{0:t}^{(i)}, z_{0:t}\big)\big] \, w\big(r_{0:t}^{(i)}\big)}{\sum_{j=1}^{N} w\big(r_{0:t}^{(j)}\big)},$$
where the importance weight is now $w(r_{0:t}) = p(r_{0:t} \mid y_{1:t}) / q(r_{0:t} \mid y_{1:t})$. Sufficient conditions for $\overline{E}_N^{1}(f_t)$ to satisfy a CLT are $\mathrm{var}_{p(r_{0:t}, z_{0:t} \mid y_{1:t})}\big[f_t(r_{0:t}, z_{0:t})\big] < \infty$ and $w(r_{0:t}, z_{0:t}) < \infty$ for any $(r_{0:t}, z_{0:t})$ (Bernardo and Smith 1994). This trivially implies that $\overline{E}_N^{2}(f_t)$ also satisfies a CLT. More precisely, we get the following result.

Proposition 2 Under the assumptions given above, $\overline{E}_N^{1}(f_t)$ and $\overline{E}_N^{2}(f_t)$ satisfy a CLT:
$$\sqrt{N}\,\big(\overline{E}_N^{1}(f_t) - E(f_t)\big) \Longrightarrow \mathcal{N}(0, \sigma_1^2), \qquad \sqrt{N}\,\big(\overline{E}_N^{2}(f_t) - E(f_t)\big) \Longrightarrow \mathcal{N}(0, \sigma_2^2),$$
with $\sigma_2^2 \le \sigma_1^2$.

The Rao-Blackwellised estimate $\overline{E}_N^{2}(f_t)$ is usually computationally more expensive to compute than $\overline{E}_N^{1}(f_t)$, so it is of interest to know when, for a fixed computational complexity, one can expect to achieve variance reduction. One has
$$\sigma_1^2 - \sigma_2^2 = \mathbb{E}_{p(r_{0:t} \mid y_{1:t})}\left[ \frac{p(r_{0:t} \mid y_{1:t})}{q(r_{0:t} \mid y_{1:t})} \, \mathrm{var}_{p(z_{0:t} \mid y_{1:t}, r_{0:t})}\big(f_t(r_{0:t}, z_{0:t})\big) \right],$$
so that, in accordance with intuition, it will generally be worth performing Rao-Blackwellisation when the average conditional variance of the variables $z_{0:t}$ is high.
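A minimal numerical check of this variance-reduction argument, on a toy model of our own choosing ($R$ Bernoulli, $Z \mid R$ Gaussian, $f(r,z) = z$): the Rao-Blackwellised estimator replaces each sampled $z$ by $\mathbb{E}[Z \mid R = r]$, and by the law of total variance its variance can only be smaller:

```python
import random
import statistics

mu = {0: -1.0, 1: 3.0}   # toy model: Z | R=r ~ N(mu[r], 1), R ~ Bernoulli(1/2)

def plain_estimate(n):
    """Sample both R and Z, average f(r, z) = z."""
    return sum(random.gauss(mu[random.getrandbits(1)], 1.0) for _ in range(n)) / n

def rb_estimate(n):
    """Sample only R; substitute E[Z | R = r] = mu[r] (exact marginalization)."""
    return sum(mu[random.getrandbits(1)] for _ in range(n)) / n

random.seed(2)
runs = 500
v_plain = statistics.variance(plain_estimate(100) for _ in range(runs))
v_rb = statistics.variance(rb_estimate(100) for _ in range(runs))
# Law of total variance: var(Z) = var(E[Z|R]) + E[var(Z|R)] = 4 + 1,
# so the Rao-Blackwellised estimator drops the E[var(Z|R)] term.
```

Here the conditional variance of $Z$ given $R$ is 1, so the per-sample variance falls from about 5 to about 4, and `v_rb < v_plain` in the experiment.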
4 RAO-BLACKWELLISED PARTICLE FILTERS
Given $N$ particles (samples) $\{(r_{0:t-1}^{(i)}, z_{0:t-1}^{(i)})\}$ at time $t-1$, approximately distributed according to the distribution $p(r_{0:t-1}, z_{0:t-1} \mid y_{1:t-1})$, RBPFs allow us to compute $N$ particles $\{(r_{0:t}^{(i)}, z_{0:t}^{(i)})\}$ approximately distributed according to the posterior $p(r_{0:t}, z_{0:t} \mid y_{1:t})$ at time $t$. This is accomplished with the algorithm shown below, the details of which will now be explained.

If we adopt an importance distribution of the form
$$q(r_{0:t} \mid y_{1:t}) = q(r_t \mid y_{1:t}, r_{0:t-1}) \, q(r_{0:t-1} \mid y_{1:t-1}), \quad (3)$$
we can obtain recursive formulas to evaluate $w(r_{0:t}) = w(r_{0:t-1}) \, w_t$ and thus $\widetilde{w}_t$. The "incremental weight" $w_t$ is given by
$$w_t \propto \frac{p(y_t \mid y_{1:t-1}, r_{0:t}) \, p(r_t \mid r_{t-1})}{q(r_t \mid y_{1:t}, r_{0:t-1})},$$
where $\widetilde{w}_t^{(i)}$ denotes the normalized version of $w_t^{(i)}$, i.e. $\widetilde{w}_t^{(i)} = w_t^{(i)} / \sum_{j=1}^{N} w_t^{(j)}$. Hence we can perform importance sampling online.
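One sequential importance sampling step with the prior as proposal, so the incremental weight reduces to the likelihood term, can be sketched as follows (the random-walk model at the bottom is an illustrative assumption, not from the paper):

```python
import math
import random

def sis_step(particles, logweights, transition_sample, loglik, y):
    """One sequential importance sampling step with the prior as proposal:
    propagate each particle through the transition, then add the
    log-likelihood of the new observation (the incremental weight)."""
    new_particles = [transition_sample(r) for r in particles]
    new_logw = [lw + loglik(y, r) for lw, r in zip(logweights, new_particles)]
    return new_particles, new_logw

def normalize(logweights):
    """Exponentiate and normalize log-weights stably."""
    m = max(logweights)
    w = [math.exp(lw - m) for lw in logweights]
    s = sum(w)
    return [wi / s for wi in w]

# Toy model (our assumption): r_t = r_{t-1} + N(0, 1), y_t = r_t + N(0, 0.5^2).
random.seed(3)
N = 200
particles = [random.gauss(0.0, 1.0) for _ in range(N)]
logw = [0.0] * N
for y_obs in [0.1, 0.3, 0.2]:
    particles, logw = sis_step(
        particles, logw,
        lambda r: r + random.gauss(0.0, 1.0),
        lambda y, r: -0.5 * ((y - r) / 0.5) ** 2,
        y_obs)
w = normalize(logw)
```

Each pass through the loop multiplies (in log space) the old weight by the incremental weight, exactly as in the recursion $w(r_{0:t}) = w(r_{0:t-1})\,w_t$.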
Choice of the Importance Distribution
There are infinitely many possible choices for $q(r_t \mid y_{1:t}, r_{0:t-1})$, the only condition being that its support must include that of $p(r_t \mid y_{1:t}, r_{0:t-1})$. The simplest choice is to just sample from the prior, $p(r_t \mid r_{t-1})$, in which case the importance weight is equal to the likelihood, $p(y_t \mid y_{1:t-1}, r_{0:t})$. This is the most widely used distribution, since it is simple to compute, but it can be inefficient, since it ignores the most recent evidence, $y_t$. Intuitively, many of our samples may end up in a region of the space that has low likelihood, and hence receive low weight; these particles are effectively wasted.
We can show that the "optimal" proposal distribution, in the sense of minimizing the variance of the importance weights, takes the most recent evidence into account:

Proposition 3 The distribution that minimizes the variance of the importance weights conditional upon $r_{0:t-1}$ and $y_{1:t}$ is $p(r_t \mid y_{1:t}, r_{0:t-1})$, and the associated incremental weight is
$$w_t \propto p(y_t \mid y_{1:t-1}, r_{0:t-1}) = \int p(y_t \mid y_{1:t-1}, r_{0:t}) \, p(r_t \mid r_{t-1}) \, dr_t.$$
Unfortunately, computing the optimal importance sampling distribution is often too expensive. Several deterministic approximations to the optimal distribution have been proposed; see for example (de Freitas 1999, Doucet 1998).
Degeneracy of SIS
The following proposition shows that, for importance functions of the form (3), the variance of $w(r_{0:t})$ can only increase (stochastically) over time. The proof of this proposition is an extension of a Kong-Liu-Wong theorem (Kong et al. 1994, p. 285) to the case of an importance function of the form (3).

Proposition 4 The unconditional variance (i.e., with the observations $y_{1:t}$ being interpreted as random variables) of the importance weights $w(r_{0:t})$ increases over time.

In practice, the degeneracy caused by the variance increase can be observed by monitoring the importance weights. Typically, what we observe is that, after a few iterations, one of the normalized importance weights tends to 1, while the remaining weights tend to zero.
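A common way to quantify this monitoring (a standard diagnostic, not described in the paper) is the effective sample size $1 / \sum_i (\widetilde{w}^{(i)})^2$, which equals $N$ for uniform weights and approaches 1 as a single weight comes to dominate:

```python
def effective_sample_size(normalized_weights):
    """Degeneracy diagnostic ESS = 1 / sum_i w_i^2 for normalized weights:
    N when all weights are 1/N, near 1 when one weight dominates."""
    return 1.0 / sum(w * w for w in normalized_weights)

# Uniform weights: no degeneracy, ESS = N = 4.
assert abs(effective_sample_size([0.25] * 4) - 4.0) < 1e-12
# One dominant weight: severe degeneracy, ESS barely above 1.
assert effective_sample_size([0.97, 0.01, 0.01, 0.01]) < 1.1
```

When the ESS falls below a threshold (e.g. $N/2$), a resampling step such as those of the next subsection is typically triggered.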
4.1.2 Selection step
To avoid the degeneracy of the sequential importance sampling simulation method, a selection (resampling) stage may be used to eliminate samples with low importance ratios and multiply samples with high importance ratios. A selection scheme associates to each particle $r_{0:t}^{(i)}$ a number of offspring, say $N_i \in \mathbb{N}$, such that $\sum_{i=1}^{N} N_i = N$. Several selection schemes have been proposed in the literature. These schemes satisfy $\mathbb{E}(N_i) = N \widetilde{w}_t^{(i)}$, but their performance varies in terms of the variance of the particles, $\mathrm{var}(N_i)$. Recent theoretical results in (Crisan, Del Moral and Lyons 1999) indicate that the restriction $\mathbb{E}(N_i) = N \widetilde{w}_t^{(i)}$ is unnecessary to obtain convergence results (Doucet et al. 1999). Examples of these selection schemes include multinomial sampling (Doucet 1998, Gordon et al. 1993, Pitt and Shephard 1999), residual resampling (Kitagawa 1996, Liu and Chen 1998) and stratified sampling (Kitagawa 1996). Their computational complexity is $O(N)$.

4.1.3 MCMC step
After the selection scheme at time $t$, we obtain $N$ particles distributed marginally approximately according to $p(r_{0:t} \mid y_{1:t})$. As discussed earlier, the discrete nature of the approximation can lead to a skewed importance weights distribution. That is, many particles have no offspring ($N_i = 0$), whereas others have a large number of offspring, the extreme case being $N_i = N$ for a particular value $i$. In this case, there is a severe reduction in the diversity of the samples. A strategy for improving the results involves introducing MCMC steps of invariant distribution $p(r_{0:t} \mid y_{1:t})$ on each particle (Andrieu, de Freitas and Doucet 1999b, Gilks and Berzuini 1998, MacEachern et al. 1999). The basic idea is that, by applying a Markov transition kernel, the total variation of the current distribution with respect to the invariant distribution can only decrease. Note, however, that we do not require this kernel to be ergodic.
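The multinomial selection scheme mentioned above can be sketched as follows; each particle receives a random number of offspring $N_i$ with $\mathbb{E}(N_i) = N \widetilde{w}^{(i)}$:

```python
import random

def multinomial_resample(particles, weights):
    """Multinomial selection step: draw N offspring indices i.i.d. with
    probability proportional to the importance weights, so that the
    expected offspring count of particle i is N * w_i."""
    n = len(particles)
    idx = random.choices(range(n), weights=weights, k=n)
    return [particles[i] for i in idx]

random.seed(4)
parts = ['a', 'b', 'c', 'd']
w = [0.7, 0.1, 0.1, 0.1]
resampled = multinomial_resample(parts, w)
# After resampling, all particles carry equal weight 1/N.
```

Residual and stratified schemes achieve the same expectation with lower $\mathrm{var}(N_i)$; this multinomial version is the simplest $O(N)$ baseline.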
4.2 CONVERGENCE RESULTS
Let $B(\mathbb{R}^{n})$ be the space of bounded, Borel measurable functions on $\mathbb{R}^{n}$. We denote $\|f\| \triangleq \sup_{x} |f(x)|$. The following theorem is a straightforward consequence of Theorem 1 in (Crisan and Doucet 2000), which is an extension of previous results in (Crisan et al. 1999).

Theorem 5 If the importance weights $w_t$ are upper bounded and if one uses one of the selection schemes described previously, then, for all $t \ge 0$, there exists $c_t$ independent of $N$ such that for any $f_t \in B(\mathbb{R}^{n})$
$$\mathbb{E}\left[\left( \frac{1}{N}\sum_{i=1}^{N} f_t\big(r_{0:t}^{(i)}\big) - \int f_t(r_{0:t}) \, p(dr_{0:t} \mid y_{1:t}) \right)^{2}\right] \le c_t \, \frac{\|f_t\|^{2}}{N},$$
where the expectation is taken w.r.t. the randomness introduced by the PF algorithm. This result shows that, under very loose assumptions, convergence of this general particle filtering method is ensured, and that the convergence rate of the method is independent of the dimension of the state-space. However, $c_t$ usually increases exponentially with time. If additional assumptions on the dynamic system under study are made (e.g. discrete state spaces), it is possible to get uniform convergence results ($c_t = c$ for any $t$) for the filtering distribution. We do not pursue this here.
5 EXAMPLES
We now illustrate the theory by briefly describing two ap-plications we have worked on.
5.1 ON-LINE REGRESSION AND MODEL SELECTION WITH NEURAL NETWORKS
Consider a function approximation scheme consisting of a mixture of $k$ radial basis functions (RBFs) and a linear regression term. The number of basis functions, $k_t$, their centers, $\mu_t$, the coefficients (weights of the RBF centers plus regression terms), $\alpha_t$, and the variance of the Gaussian noise on the output, $\sigma_t^2$, can all vary with time, so we treat them as latent random variables: see Figure 1. For details, see (Andrieu, de Freitas and Doucet 1999a).
In (Andrieu et al. 1999a), we show that it is possible to simulate $\mu_t$, $k_t$ and the noise hyper-parameters with a particle filter and to compute the coefficients $\alpha_t$ analytically using Kalman filters. This is possible because the output of the neural network is linear in $\alpha_t$, and hence the system is a conditionally linear Gaussian state-space model (CLGSSM); that is, it is a linear Gaussian state-space model conditional upon the location of the bases and the hyper-parameters. This leads to an efficient RBPF that can be combined with a reversible jump MCMC algorithm (Green 1995) to select the number
Figure 1: DBN representation of the RBF model. The hyper-parameters have been omitted for clarity.
Figure 2: The top plot shows the one-step-ahead output predictions [—] and the true outputs [· · ·] for the RBF model. The middle and bottom plots show the true values and estimates of the model order and noise variance respectively.
of basis functions online. For example, we generated some data from a mixture of 2 RBFs for the first part of the sequence, and then from a single RBF thereafter; the method was able to track this change, as shown in Figure 2. Further experiments on real data sets are described in (Andrieu et al. 1999a).
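To make the conditionally linear Gaussian idea concrete, here is a minimal scalar Kalman filter step of the kind each particle would run for the marginalized linear coefficients; the model parameters are illustrative placeholders, not the actual matrices of the RBF model:

```python
import math

def kalman_step(mean, var, a, q, h, r_var, y):
    """One scalar Kalman predict/update step. In an RBPF for a conditionally
    linear Gaussian model, each particle carries such sufficient statistics
    (mean, var) for the marginalized linear state; the parameters
    (a, q, h, r_var) may depend on the particle's sampled variables.
    Returns the updated (mean, var) and the log predictive likelihood,
    which serves as the particle's incremental importance weight."""
    # Predict: z_t = a * z_{t-1} + N(0, q)
    pm, pv = a * mean, a * a * var + q
    # Update with observation y_t = h * z_t + N(0, r_var)
    s = h * h * pv + r_var              # innovation variance
    k = pv * h / s                      # Kalman gain
    new_mean = pm + k * (y - h * pm)
    new_var = (1.0 - k * h) * pv
    loglik = -0.5 * (math.log(2.0 * math.pi * s) + (y - h * pm) ** 2 / s)
    return new_mean, new_var, loglik

m, v, ll = kalman_step(0.0, 1.0, a=1.0, q=0.0, h=1.0, r_var=1.0, y=2.0)
# With unit prior and observation variances, the posterior mean is the
# average of prior mean and observation: (0 + 2) / 2 = 1, variance 0.5.
```

The returned log-likelihood is exactly the $p(y_t \mid y_{1:t-1}, r_{0:t})$ term in the incremental weight of Section 4.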
5.2 ROBOT LOCALIZATION AND MAP BUILDING
Consider a robot that can move on a discrete, two-dimensional grid. Suppose the goal is to learn a map of the environment, which, for simplicity, we can think of as a matrix which stores the color of each grid cell, which can be either black or white. The difficulty is that the color
Figure 3: A factorial HMM with 3 hidden chains. $M_t(i)$ represents the color of grid cell $i$ at time $t$, $L_t$ represents the robot's location, and $Y_t$ the current observation.
sensors are not perfect (they may accidentally flip bits), nor are the motors (the robot may fail to move in the desired direction with some probability, due, e.g., to wheel slippage). Consequently, it is easy for the robot to get lost. And when the robot is lost, it does not know what part of the matrix to update. So we are faced with a chicken-and-egg situation: the robot needs to know where it is to learn the map, but needs to know the map to figure out where it is.
The problem of concurrent localization and map learning for mobile robots has been widely studied. In (Murphy 2000), we adopt a Bayesian approach, in which we maintain a belief state over both the location of the robot, $L_t \in \{1, \ldots, N_L\}$, and the color of each grid cell, $M_t(i) \in \{1, \ldots, N_C\}$, $i = 1, \ldots, N_L$, where $N_L$ is the number of cells and $N_C$ is the number of colors. The DBN we are using is shown in Figure 3. The state space has size $O(N_L \, N_C^{N_L})$. Note that we can easily handle changing environments, since the map is represented as a random variable, unlike the more common approach, which treats the map as a fixed parameter.
The observation model is $Y_t = f(M_t(L_t))$, where $f(\cdot)$ is a function that flips its binary argument with some fixed probability. In other words, the robot gets to see the color of the cell it is currently at, corrupted by noise: $Y_t$ is a noisy multiplexer with $L_t$ acting as a "gate" node. Note that this conditional independence is not obvious from the graph structure in Figure 3, which suggests that all the nodes in each slice should be correlated by virtue of sharing a common observed child, as in a factorial HMM (Ghahramani and Jordan 1997). The extra independence information is encoded in $Y_t$'s distribution, cf. (Boutilier, Friedman, Goldszmidt and Koller 1996).
The basic idea of the algorithm is to sample $L_{1:t}$ with a PF, and marginalize out the $M_t(i)$ nodes exactly, which can be done efficiently since they are conditionally independent given $L_{1:t}$:
$$p(M_t(1), \ldots, M_t(N_L) \mid L_{1:t}, y_{1:t}) = \prod_{i=1}^{N_L} p(M_t(i) \mid L_{1:t}, y_{1:t}).$$
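The per-particle map update implied by this factorization can be sketched as follows, assuming binary colors and a flip-noise sensor (the parameter values are our own):

```python
def update_map_cell(cell_belief, y, flip_prob):
    """Per-particle Bayes update for the color of the grid cell currently
    occupied by the sampled trajectory. cell_belief[c] is the particle's
    posterior probability that the cell has color c in {0, 1}; the sensor
    reports the true color with probability 1 - flip_prob."""
    lik = [1.0 - flip_prob if y == c else flip_prob for c in (0, 1)]
    post = [l * b for l, b in zip(lik, cell_belief)]
    s = post[0] + post[1]
    return [p / s for p in post]

belief = [0.5, 0.5]   # uniform prior over {white, black} for one cell
belief = update_map_cell(belief, y=1, flip_prob=0.1)
# One noisy observation of color 1 raises p(color = 1) from 0.5 to 0.9.
```

Each particle stores one such belief vector per cell, but at each time step only the cell indexed by the particle's sampled location is updated, which is what makes the marginalization cheap.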
Some results on a simple one-dimensional grid world are
Figure 4: Estimated position as the robot moves from cell 1 to 8 and back. The robot "gets stuck" in cell 4 for two steps in a row on the outgoing leg of the journey (hence the double diagonal), but the robot does not realize this until it reaches the end of the "corridor" at step 9, where it is able to relocalise. (a) Exact inference. (b) RBPF with 50 particles. (c) Fully-factorised BK.
shown in Figure 4. We compared exact Bayesian inference with the RBPF method, and with the fully-factorised version of the Boyen-Koller (BK) algorithm (Boyen and Koller 1998), which represents the belief state as a product of marginals:
$$p(L_t, M_t(1), \ldots, M_t(N_L) \mid y_{1:t}) \approx p(L_t \mid y_{1:t}) \prod_{i=1}^{N_L} p(M_t(i) \mid y_{1:t}).$$
We see that the RBPF results are very similar to the exact results, even with only 50 particles, but that BK gets confused because it ignores correlations between the map cells. We have obtained good results learning a 10 × 10 map (so the state space has size $O(100 \cdot 2^{100})$) using only 100 particles (the observation model in the 2D case is that the robot observes the colors of all the cells in a 3 × 3 neighborhood centered on its current location). For a more detailed discussion of these results, please see (Murphy 2000).
6 CONCLUSIONS AND EXTENSIONS
RBPFs have been applied to many problems, mostly in the framework of conditionally linear Gaussian state-space models and conditionally finite state-space HMMs. That is, they have been applied to models that, conditionally upon a set of variables (imputed by the PF algorithm), admit a closed-form filtering distribution (Kalman filter in the continuous case and HMM filter in the discrete case). One can also make use of the special structure of the dynamic model under study to perform the calculations efficiently using the junction tree algorithm. For example, if one had evolving trees, one could sample the root nodes with the PF and compute the leaves using the junction tree algorithm. This would result in a substantial computational gain, as one only has to sample the root nodes and apply the junction tree to lower-dimensional sub-networks.
Although the previously mentioned models are the most famous ones, there exist numerous other dynamic systems admitting finite-dimensional filters; that is, the filtering distribution can be estimated in closed form at any time $t$ using a fixed number of sufficient statistics. These include:

- Dynamic models for counting observations (Smith and Miller 1986).
- Dynamic models with a time-varying unknown covariance matrix for the dynamic noise (West and Harrison 1996, Uhlig 1997).
- Classes of the exponential family state-space models (Vidoni 1999).

This list is by no means exhaustive. It shows, however, that RBPFs apply to a very wide class of dynamic models. Consequently, they have a big role to play in computer vision (where mixtures of Gaussians arise commonly), robotics, speech and dynamic factor analysis.
References
Akashi, H. and Kumamoto, H. (1977). Random sampling approach to state estimation in switching environments, Automatica 13: 429–434.

Andrieu, C., de Freitas, J. F. G. and Doucet, A. (1999a). Sequential Bayesian estimation and model selection applied to neural networks, Technical Report CUED/F-INFENG/TR 341, Cambridge University Engineering Department.

Andrieu, C., de Freitas, J. F. G. and Doucet, A. (1999b). Sequential MCMC for Bayesian model selection, IEEE Higher Order Statistics Workshop, Caesarea, Israel, pp. 130–134.

Becker, A., Bar-Yehuda, R. and Geiger, D. (1999). Random algorithms for the loop cutset problem.

Bernardo, J. M. and Smith, A. F. M. (1994). Bayesian Theory, Wiley Series in Applied Probability and Statistics.

Boutilier, C., Friedman, N., Goldszmidt, M. and Koller, D. (1996). Context-specific independence in Bayesian networks, Proc. Conf. Uncertainty in AI.

Boyen, X. and Koller, D. (1998). Tractable inference for complex stochastic processes, Proc. Conf. Uncertainty in AI.

Casella, G. and Robert, C. P. (1996). Rao-Blackwellisation of sampling schemes, Biometrika 83(1): 81–94.

Cowell, R. G., Dawid, A. P., Lauritzen, S. L. and Spiegelhalter, D. J. (1999). Probabilistic Networks and Expert Systems, Springer-Verlag, New York.

Crisan, D. and Doucet, A. (2000). Convergence of generalized particle filters, Technical Report CUED/F-INFENG/TR 381, Cambridge University Engineering Department.

Crisan, D., Del Moral, P. and Lyons, T. (1999). Discrete filtering using branching and interacting particle systems, Markov Processes and Related Fields 5(3): 293–318.

de Freitas, J. F. G. (1999). Bayesian Methods for Neural Networks, PhD thesis, Department of Engineering, Cambridge University, Cambridge, UK.

Dean, T. and Kanazawa, K. (1989). A model for reasoning about persistence and causation, Artificial Intelligence 93(1–2): 1–27.

Doucet, A. (1998). On sequential simulation-based methods for Bayesian filtering, Technical Report CUED/F-INFENG/TR 310, Department of Engineering, Cambridge University.

Doucet, A., de Freitas, J. F. G. and Gordon, N. J. (2000). Sequential Monte Carlo Methods in Practice, Springer-Verlag.

Doucet, A., Godsill, S. and Andrieu, C. (2000). On sequential Monte Carlo sampling methods for Bayesian filtering, Statistics and Computing 10(3): 197–208.

Doucet, A., Gordon, N. J. and Krishnamurthy, V. (1999). Particle filters for state estimation of jump Markov linear systems, Technical Report CUED/F-INFENG/TR 359, Cambridge University Engineering Department.

Ghahramani, Z. and Jordan, M. (1997). Factorial hidden Markov models, Machine Learning 29: 245–273.

Gilks, W. R. and Berzuini, C. (1998). Monte Carlo inference for dynamic Bayesian models, Unpublished. Medical Research Council, Cambridge, UK.

Gordon, N. J., Salmond, D. J. and Smith, A. F. M. (1993). Novel approach to nonlinear/non-Gaussian Bayesian state estimation, IEE Proceedings-F 140(2): 107–113.

Green, P. J. (1995). Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, Biometrika 82: 711–732.

Handschin, J. E. and Mayne, D. Q. (1969). Monte Carlo techniques to estimate the conditional expectation in multi-stage non-linear filtering, International Journal of Control 9(5): 547–559.

Isard, M. and Blake, A. (1996). Contour tracking by stochastic propagation of conditional density, European Conference on Computer Vision, Cambridge, UK, pp. 343–356.

Kanazawa, K., Koller, D. and Russell, S. (1995). Stochastic simulation algorithms for dynamic probabilistic networks, Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence, Morgan Kaufmann, pp. 346–351.

Kitagawa, G. (1996). Monte Carlo filter and smoother for non-Gaussian nonlinear state space models, Journal of Computational and Graphical Statistics 5: 1–25.

Kong, A., Liu, J. S. and Wong, W. H. (1994). Sequential imputations and Bayesian missing data problems, Journal of the American Statistical Association 89(425): 278–288.

Liu, J. S. and Chen, R. (1998). Sequential Monte Carlo methods for dynamic systems, Journal of the American Statistical Association 93: 1032–1044.

MacEachern, S. N., Clyde, M. and Liu, J. S. (1999). Sequential importance sampling for nonparametric Bayes models: the next generation, Canadian Journal of Statistics 27: 251–267.

Murphy, K. P. (2000). Bayesian map learning in dynamic environments, in S. Solla, T. Leen and K.-R. Müller (eds), Advances in Neural Information Processing Systems 12, MIT Press, pp. 1015–1021.

Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann.

Pitt, M. K. and Shephard, N. (1999). Filtering via simulation: Auxiliary particle filters, Journal of the American Statistical Association 94(446): 590–599.

Smith, R. L. and Miller, J. E. (1986). Predictive records, Journal of the Royal Statistical Society B 36: 79–88.

Uhlig, H. (1997). Bayesian vector autoregressions with stochastic volatility, Econometrica.

Vidoni, P. (1999). Exponential family state space models based on a conjugate latent process, Journal of the Royal Statistical Society B 61: 213–221.

West, M. (1993). Mixture models, Monte Carlo, Bayesian updating and dynamic models, Computing Science and Statistics 24: 325–333.

West, M. and Harrison, J. (1996). Bayesian Forecasting and Dynamic Linear Models, Springer-Verlag.
On Sequential Monte Carlo Sampling Methods for Bayesian
Filtering
Arnaud Doucet (corresponding author) - Simon Godsill - Christophe Andrieu
Signal Processing Group, Department of Engineering
In this paper, we improve on the slow time-varying partial correlation (STV-PARCOR) model recently suggested by us [2] to include any deterministic interpolator. We then suggest a modification to the on-line filtering algorithm to accommodate the changes. It is believed that the modification will improve on the simulation results, as it takes into account the underlying trend of the parameter evolution. The suggested algorithm is tested with real speech data, and preliminary results are shown and compared with those generated using existing approaches.
1. INTRODUCTION
Many real-world data analysis problems involve sequential estimation of the filtering distribution p(x_t | y_{1:t}), where x_t is the unobserved state of the system at time t and y_{1:t} = (y_1, ..., y_t) are the observations made over the time interval t = 1, ..., T. In most cases, the data structures can be very complex, typically involving elements of non-Gaussianity, high dimensionality and non-linearity, which may not be solvable analytically. Sequential Monte Carlo methods, also known as particle filters (PF), have been proposed to overcome these problems; refer to [1] for an up-to-date survey of the field. Within the particle filter framework, the filtering distribution is approximated with an empirical distribution formed from point masses, or particles,

    p_N(x_t | y_{1:t}) = Σ_{i=1}^{N} w_t^(i) δ(x_t - x_t^(i)),    Σ_{i=1}^{N} w_t^(i) = 1,

where δ(·) is the Dirac delta function and w_t^(i) is the weight attached to particle x_t^(i).
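As a concrete illustration (ours, not the paper's), any posterior expectation under this empirical distribution reduces to a weighted sum over the particles:

```python
import numpy as np

def posterior_mean(particles, weights):
    """Estimate E[x_t | y_{1:t}] from a weighted particle set: under the
    empirical approximation above, the integral against p_N reduces to
    sum_i w^(i) x^(i)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # enforce normalization of the weights
    return float(np.dot(w, particles))   # Σ_i w^(i) x^(i)
```

The same pattern estimates any E[h(x_t) | y_{1:t}] by replacing `particles` with `h(particles)`.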
In recent years, various approaches have been developed to apply sequential Monte Carlo filtering strategies to audio signal enhancement (see, for example, [3] and citations therein). These approaches assume a Gaussian random walk model for the system parameter evolution at every time step, which may not give a sufficiently slow or smooth variation with time. This makes the standard PF inefficient, as it is known that the filter becomes highly degenerate for random walks with very low variance. Recently, Fong and Godsill [2] proposed a slow time-varying partial correlation (STV-PARCOR) model to solve this problem. In their work, the system coefficients are considered to evolve stochastically on a block-to-block basis, and all coefficients in between are found by a linear interpolator. Based on the STV-PARCOR model and within the particle filtering framework, an algorithm for on-line joint estimation of the system parameters and signal is developed.
This work serves as an extension to the work of [2]. In this paper, we generalise the STV-PARCOR particle filter so that any deterministic interpolator can be used. We then describe the modified algorithm for the generation of "delayed" state realisations. Finally, preliminary simulation results are shown.
2. STATE-SPACE REPRESENTATION AND AUDIO MODEL
In this section, we describe the model adopted in this paper, the STV-PARCOR model. A lengthy time series is divided into non-overlapping blocks. If L is the block size, we define X_k = (x_{(k-1)L+1}, ..., x_{kL}) as a group of unobserved states of the system and Y_k = (y_{(k-1)L+1}, ..., y_{kL}) as the observations made over the blocks k = 1, ..., K.
Assuming a Markovian structure for the model, the problem can then be formulated in state-space form as follows:

    X_k ~ f(X_k | X_{k-1})    (state evolution density)
    Y_k ~ g(Y_k | X_k)        (observation density)        (1)

where f(·) and g(·) are pre-specified state evolution and observation densities. It should be noted that the state-space model adopted here is different from the standard one [5], which relates the unobserved states x_{1:T} and the observations y_{1:T} made over a time interval t = 1, ..., T; (1), however, defines the state evolution between different blocks.
For the choice of audio model, we suggest a time-varying partial correlation (TV-PARCOR) model. The advantage of adopting such a model is that approximate stability can easily be enforced, provided that the PARCOR coefficients ρ_t vary sufficiently slowly with time [3]. The audio signal process x_t is then modelled as the time-varying autoregression

    x_t = Σ_{i=1}^{p_t} a_{t,i} x_{t-i} + e_t,

where a_{t,i} is the TVAR coefficient at time t, which is found by transforming the PARCOR coefficients ρ_t via the Levinson-Durbin recursion, φ_t is the log-excitation variance of e_t, and p_t is the time-varying model order. Refer to [3] for a detailed description of the audio model adopted here.
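The PARCOR-to-TVAR transformation can be sketched as follows. This is our own minimal implementation of the step-up Levinson-Durbin recursion, written for the sign convention x_t = Σ_i a_i x_{t-i} + e_t used above; other texts use the opposite sign for the AR coefficients.

```python
import numpy as np

def parcor_to_ar(parcor):
    """Convert PARCOR (reflection) coefficients to AR coefficients via the
    step-up Levinson-Durbin recursion. With all |k_m| < 1 the resulting AR
    polynomial is guaranteed stable, which is why the model constrains the
    reflection coefficients rather than the AR coefficients directly."""
    a = np.array([], dtype=float)
    for m, k in enumerate(parcor, start=1):
        a_new = np.empty(m)
        a_new[:m - 1] = a - k * a[::-1]   # update lower-order coefficients
        a_new[m - 1] = k                  # order-m coefficient is the reflection coeff
        a = a_new
    return a
```

For example, reflection coefficients (k_1, k_2) map to (a_1, a_2) = (k_1(1 - k_2), k_2).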
The full specification of the state-space model is as follows: at any time t, the state vector is partitioned as (α_t, θ_t), with α_t = (x_t, x_{t-1}, ..., x_{t-p_t+1}) and θ_t = (ρ_t, φ_t, p_t) being the signal state and the parameter state respectively.

In the setup of the particle filter, a proposal distribution [1, 3] similar to that of [2] has been adopted, which takes the form:

    q(θ_k | θ_{k-1}) = p(ρ_k | ρ_{k-1}, p_k) p(φ_k | φ_{k-1}) p(p_k | p_{k-1}),    (2)

where θ_k denotes the parameter state at the end of block k.
As in [2], the block variations of the log-excitation variance and model order take the form

    p(φ_k | φ_{k-1}) = N(φ_k; φ_{k-1}, σ_φ²),    (3)

    p(p_k | p_{k-1}): a discrete random-walk transition kernel on the admissible model orders,    (4)

where φ_t = φ_k and p_t = p_k for t = (k-1)L+1, ..., kL, i.e. both φ_t and p_t are assumed to be fixed within a block.
For the PARCOR coefficients, a constrained random walk model [3] is assumed for the block variation,

    p(ρ_k | ρ_{k-1}) ∝ N(ρ_k; ρ_{k-1}, σ_ρ² I)   if max_i |ρ_{k,i}| < 1,
                       0                          otherwise.

A constrained random walk model of this form will ensure approximate stability provided that the PARCOR coefficients vary sufficiently slowly. Having sampled the last PARCOR coefficient of block k, ρ_{kL}, all the intermediate PARCOR coefficients are found by some deterministic method using the previously sampled block-boundary PARCOR coefficients,

    ρ_t = h_t(ρ_{(k-J)L}, ..., ρ_{(k-1)L}, ρ_{kL})    (5)

for t = (k-1)L+1, ..., kL-1. For the interpolator functions h_t, the Legendre polynomials, Fourier basis functions and B-splines are popular choices, all of which will ensure a slow and smooth evolution of the reflection coefficients [4]. In [2], we implemented a linear interpolator, which should be considered as a special case of (5). We believe that (5) will give a better approximation than the simplified linear interpolator, as it takes into account the underlying smooth trend of the coefficient evolution.
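The linear special case of (5) can be sketched as follows (function and variable names are ours, not the paper's); each column interpolates one PARCOR coefficient between the sampled block boundaries:

```python
import numpy as np

def interpolate_parcor(block_rho, block_size):
    """Deterministically fill in per-sample PARCOR coefficients from
    block-boundary values, using the linear interpolator (the special
    case of (5) implemented in [2]).

    block_rho : array of shape (K, p), coefficients at each block boundary
    returns   : array of shape ((K-1)*block_size + 1, p)
    """
    block_rho = np.atleast_2d(block_rho)
    K, p = block_rho.shape
    knots = np.arange(K) * block_size    # time indices of the block boundaries
    t = np.arange(knots[-1] + 1)         # every sample in between
    return np.column_stack([np.interp(t, knots, block_rho[:, j]) for j in range(p)])
```

Swapping `np.interp` for a smooth basis (e.g. a B-spline fit through the same knots) gives the generalised interpolators the paper advocates.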
3. SEQUENTIAL AUDIO SIGNAL AND PARAMETER ESTIMATION
Based on the STV-PARCOR model, [2] describes a way of jointly estimating the signal and parameter states within the sequential Monte Carlo filtering framework. The suggested algorithm proceeds as follows.

Random samples are to be drawn from the joint filtering distribution p(α_{0:k}, θ_{0:k} | Y_{1:k}), which can be factorised as follows,

    p(α_{0:k}, θ_{0:k} | Y_{1:k}) = p(α_{0:k} | θ_{0:k}, Y_{1:k}) p(θ_{0:k} | Y_{1:k}).    (6)
Assume there exists a particulate approximation for the marginal parameter filtering distribution,

    p_N(θ_{0:k} | Y_{1:k}) = Σ_{i=1}^{N} w_k^(i) δ(θ_{0:k} - θ_{0:k}^(i)).    (7)

As the proposal distribution for the parameter state is assumed to be the prior (2), the importance weight will simply take the form

    w_k^(i) ∝ w_{k-1}^(i) p(Y_k | θ_{0:k}^(i), Y_{1:k-1}),    (8)

where the parameter states θ^(i) within the most recent blocks have been updated using the deterministic interpolator (5). The joint filtering distribution (6) can then be approximated by

    p_N(α_{0:k}, θ_{0:k} | Y_{1:k}) = Σ_{i=1}^{N} w_k^(i) p(α_{0:k} | θ_{0:k}^(i), Y_{1:k}) δ(θ_{0:k} - θ_{0:k}^(i)).

Hence, given θ_{0:k}^(i), signal realisations can be drawn from p(α_{0:k} | θ_{0:k}^(i), Y_{1:k}).
For instance, if we assume a conditionally Gaussian state-space model, then all the computations can be carried out within the framework of the Kalman filter and smoother [3]. In particular, the predictive likelihood p(Y_k | θ_{0:k}^(i), Y_{1:k-1}) in (8) can be found by the prediction error decomposition [5], and the marginal signal filtering distribution can be rewritten as

    p(α_{0:k} | θ_{0:k}^(i), Y_{1:k}) = Π_t N(α_t; m_{t|T}^(i), P_{t|T}^(i)),    (9)

where the means m_{t|T}^(i) and covariances P_{t|T}^(i) are sufficient statistics found by the Kalman smoother.
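The prediction error decomposition used to evaluate the weight (8) can be sketched for a scalar linear-Gaussian model; this is a simplification of the conditionally Gaussian model above, with illustrative parameters and names of our choosing.

```python
import numpy as np

def kalman_loglik(y, F, H, Q, R, m0, P0):
    """Log-likelihood of observations y under the scalar linear-Gaussian
    state-space model x_t = F x_{t-1} + w_t, y_t = H x_t + v_t, accumulated
    via the prediction error decomposition: each term is the log-density of
    the innovation e_t under N(0, S_t)."""
    m, P, ll = m0, P0, 0.0
    for yt in y:
        # one-step prediction
        m_pred = F * m
        P_pred = F * P * F + Q
        # innovation (prediction error) and its variance
        e = yt - H * m_pred
        S = H * P_pred * H + R
        ll += -0.5 * (np.log(2 * np.pi * S) + e * e / S)
        # measurement update
        K = P_pred * H / S
        m = m_pred + K * e
        P = (1 - K * H) * P_pred
    return ll
```

In the particle filter, one such pass per particle over the newest block yields the factor p(Y_k | θ^(i), Y_{1:k-1}) in the weight update.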
4. IMPLEMENTATION
We modify the STV-PARCOR particle filter suggested in [2] to accommodate the generalised STV-PARCOR model. Assuming that the parameter realisations θ^(i) and the signal sufficient statistics (m^(i) and P^(i)) are available for i = 1, ..., N from the previous iteration of the filter, state random samples can be drawn from the filtering distribution as follows. For i = 1, ..., N:
- Generate random samples φ_k^(i) from p(φ_k | φ_{k-1}^(i)) and p_k^(i) from p(p_k | p_{k-1}^(i)). For each p_k^(i), sample ρ_k^(i) from p(ρ_k | ρ_{k-1}^(i)), where existing coefficients follow the constrained random walk about ρ_{k-1}^(i), and any coefficients introduced by an increase in model order are drawn from their prior.
- Make each of the block-boundary coefficient vectors ρ_{(k-J)L}^(i), ..., ρ_{kL}^(i) the same size by appending zeros if necessary. Using these fixed grids, the intermediate coefficients are found by the deterministic interpolator (5). In our simulations, we have implemented the cubic spline, i.e.

    ρ_t = Σ_j c_j B_j(t),

where the c_j are the spline coefficients to be determined and B_j(t) is the cubic (fourth-order) B-spline basis function.
- Given the sufficient statistics (m^(i) and P^(i)) and the interpolated coefficients, we run a forward sweep of the Kalman filter over blocks k-J to k. We then evaluate the importance weight w_k^(i) according to (8).

Having generated the parameter set {θ_k^(i), i = 1, ..., N}, we then resample it (see [1] for details) N times with replacement according to w_k^(i). For the resampled parameter set, we run a backward sweep of the Kalman smoother and generate the smoothed sufficient statistics m_{t|T}^(i) and P_{t|T}^(i).
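The selection step above can be sketched generically as follows (plain multinomial resampling for illustration; [1] discusses more refined schemes such as systematic resampling):

```python
import numpy as np

def multinomial_resample(particles, weights, rng=None):
    """Resample N times with replacement according to the normalized
    importance weights, as in the selection step of the algorithm;
    high-weight particles are duplicated, low-weight ones discarded."""
    if rng is None:
        rng = np.random.default_rng(0)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return [particles[i] for i in idx]
```

After resampling, all particles carry equal weight and the smoothing sweep proceeds on the surviving parameter trajectories.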
Theoretically, signal realisations α_t^(i), i = 1, ..., N, could be generated according to (9) over the whole interval. However, as the interpolated coefficients in the most recent blocks are going to change in the next iteration of the filter, only the realisations up to a lag of J blocks are drawn. Hence, the suggested algorithm will only give "delayed" state realisations. In addition, continuity can be ensured by taking the previously drawn realisations into account when sampling the new ones, as there is only one free variable owing to the overlap between consecutive blocks.
5. EXPERIMENTAL RESULTS
Experiments are conducted to investigate the effectiveness of the suggested algorithm (STV-PARCOR PF) for the purpose of audio noise reduction. In particular, we would like to verify our suggestion that the generalised STV-PARCOR model is a better model for slowly time-varying processes (e.g. speech) than the TVAR model. Preliminary simulation results are shown and compared with those generated using the standard extended Kalman smoother (eKS) [5]. The clean speech clips used in this experiment are:
S1: Good service should be rewarded by big tips
S2: Draw every outer line first, then fill in the interior
The experimental setup is as follows: the clean speech signal is assumed to be submerged in white Gaussian noise (WGN) with known variance σ_v², i.e. y_t = x_t + v_t with v_t ~ N(0, σ_v²). The output SNRs from the different algorithms are recorded and compared. Owing to the stochastic nature of the Monte Carlo algorithm, simulation results for the STV-PARCOR PF are found by averaging the SNR improvement over five independent applications of the algorithm.
In our simulations, the number of particles N is kept small, and the block size is fixed to 100 for the STV-PARCOR PF. The hyperparameters (σ_v², σ_φ² and σ_ρ²) adopted here are assumed to be known and fixed. In consideration of the computational cost, the model order p_t is limited to 20. We note that the chosen N is extremely small for a Monte Carlo simulation, but the preliminary simulation results suggest that the algorithm works well even in this case. As in other applications of the particle filter, simulation results improve as N increases.
For the eKS, a Gaussian random walk is assumed directly on the AR coefficients and the model order is fixed to 10. This model is employed because the TV-PARCOR model and the time-varying model order are not straightforward to implement with the extended Kalman smoother. The hyperparameters adopted are adjusted so that the system parameters cover the same range as those of the generalised STV-PARCOR model over the same number of time steps. We note that this may not be the optimal setup for the eKS; however, it gives a fair comparison between the two algorithms.
Figure 1 and Figure 2 show the 3D histogram plots of the first reflection coefficient ρ_{t,1} at different input SNRs (SNR_in) for the word "reward" in S1 using the STV-PARCOR PF. The plots are generated by grouping all PARCOR coefficient particles from five independent simulations. As shown in the plots, the suggested algorithm gives consistent results at different noise levels.
We then compare the performance of the suggested algorithm with the eKS. The SNR improvements for the different clips at different noise levels are summarised below:
Audio outputs can be found at http://www-sigproc.eng.cam.ac.uk/~wnwf2/Eusipco2002.html. Comparing the SNR improvements, the suggested STV-PARCOR PF consistently outperforms the eKS, which justifies using it in practice, even though it incurs a much heavier computational load compared with the eKS.
6. CONCLUSION
We propose a generalisation of the STV-PARCOR model recently suggested by us to include any deterministic interpolator functions h_t. We then describe an adaptation of the algorithm for the joint estimation of signal and parameters. The algorithm is tested on real speech signals and compared with other standard approaches. Encouraging results are obtained. Further simulations will be conducted to investigate the effects of different interpolator functions and different lags. The results will be published in due course.
7. REFERENCES
[1] A. Doucet, N. de Freitas, and N. J. Gordon, editors. Sequential Monte Carlo Methods in Practice. New York: Springer-Verlag, 2001.
[2] W. Fong and S. Godsill. Sequential Monte Carlo simulation of dynamical models with slowly varying parameters: Application to audio. In Proceedings of the IEEE ICASSP, 2002. To appear.
[3] W. Fong, S. J. Godsill, A. Doucet, and M. West. Monte Carlo smoothing with application to audio signal enhancement. IEEE Transactions on Signal Processing, Special Issue, 50(2):438–449, February 2002.
[4] Y. Grenier. Time-dependent ARMA modeling of nonstationary signals. IEEE Transactions on Acoustics, Speech and Signal Processing, 31(4):899–911, 1983.
[5] A. C. Harvey. Forecasting, Structural Time Series Models and the Kalman Filter. Cambridge University Press, 1989.
Figure 1: 3D histogram plot of the first reflection coefficient ρ_{t,1} for the word "reward" using the STV-PARCOR particle filter at SNR_in = 0 dB
Figure 2: 3D histogram plot of the first reflection coefficient ρ_{t,1} for the word "reward" using the STV-PARCOR particle filter at SNR_in = 20 dB
Hybrid Diagnosis with Unknown Behavioral Modes
Michael W. Hofbaur 1 and Brian C. Williams 2
in: Proceedings of the 13th International Workshop on Principles of Diagnosis (DX02), May 2002
Abstract. A novel capability of discrete model-based diagnosis methods is the ability to handle unknown modes, where no assumption is made about the behavior of one or several components of the system. This paper incorporates this novel capability of model-based diagnosis into a hybrid estimation scheme by calculating partial filters. The filters are based on causal and structural analysis of the specified components and their interconnection within the hybrid automaton model. Incorporating unknown modes provides a robust estimation scheme that can cope, unlike other hybrid estimation and multi-model estimation schemes, with unmodeled situations and partial information.
1 Introduction
Modern technology is increasingly leading to complex artifacts with high demands on performance and availability. As a consequence, fault-tolerant control and an underlying monitoring and diagnosis capability play an important role in achieving these requirements. Monitoring and diagnosis systems that build upon the discrete model-based reasoning paradigm [8] can cope well with complexity in modern artifacts. As an example, the Livingstone system [22] successfully monitored and diagnosed the DS-1 space probe in flight, a system with approximately 480 modes of operation. However, a widespread application of discrete model-based systems is hindered by their difficulty in reasoning about the continuous dynamics of an artifact in a comprehensive manner. Continuous behaviors are difficult to capture with the purely qualitative models that are used by the reasoning engines. Nevertheless, additional reasoning in terms of the continuous dynamics is vital for detecting functional failures, as well as low-level incipient (i.e., slowly developing) faults and subtle component degradation.
Hybrid systems theory provides a modeling paradigm that integrates both continuous state evolution and discrete mode changes in a comprehensive manner. Recent work in hybrid estimation [14, 16, 24, 9] attempts to overcome the shortcomings of discrete model-based diagnosis cited above and provides schemes that integrate model-based approaches with techniques from fault detection and isolation (FDI) [23, 4] and multi-model adaptive filtering [13, 11, 10]. The hybrid estimation schemes, as well as their FDI and multi-model filtering ancestors, work well whenever the underlying model(s) are 'close' mathematical descriptions of the physical artifact. They can fail severely whenever unforeseen situations occur. Therefore, it is essential to provide models that capture the entire spectrum of possible behaviors/modes whenever we use the hybrid estimate for closed-loop control, for instance. Model-based diagnosis, in contrast, does
1 Department of Automatic Control, Graz University of Technology, A-8010 Graz, Austria, email: hofbaur@irt.tu-graz.ac.at
2 MIT Space Systems and AI Laboratories, 77 Massachusetts Ave., Rm. 37-381, Cambridge, MA 02139 USA, email: williams@mit.edu
not impose such a strong modeling assumption. Its concept of the unknown mode allows diagnosis of systems where no assumption is made about the behavior of one or several components of the system. In this way, it captures unspecified and unforeseen behaviors of the system under investigation. This paper provides an approach to incorporating the concept of an unknown mode into our hybrid estimation scheme [9]. As a result, we obtain an estimation capability that can detect unforeseen situations. Furthermore, it allows us to continue estimation on a degraded basis. We achieve this by causal analysis [17, 20], structural analysis [7] and decomposition of the system.
This paper starts with a brief introduction to our hybrid systems modeling and estimation scheme. Upon this foundation, we extend hybrid estimation to incorporate the unknown mode and demonstrate the underlying structural analysis and decomposition task. Finally, an experimental evaluation with computer-simulated data for a Martian life support system demonstrates the advantages of this extended hybrid estimation scheme.
2 Hybrid Systems
The hybrid automaton model used throughout this paper is based on [9] and can be seen as a model that merges hidden Markov models (HMMs) with continuous discrete-time dynamical system models (we present the model at the level of detail sufficient for this work and refer the reader to the reference cited above for more detail).
2.1 Concurrent Hybrid Automata
Definition 1 A discrete-time probabilistic hybrid automaton (PHA) A is described as a tuple ⟨x, w, F, T, X_d, T_s⟩:

• x denotes the hybrid state variables of the automaton (3), composed of x = {x_d} ∪ x_c. The discrete variable x_d denotes the mode of the automaton and has finite domain X_d. The continuous state variables x_c capture the dynamic evolution of the automaton. x denotes the hybrid state of the automaton, while x_c denotes the continuous state.

• The set of I/O variables w = u_d ∪ u_c ∪ y_c of the automaton is composed of disjoint sets of discrete input variables u_d (called command variables), continuous input variables u_c, and continuous output variables y_c.

• F : X_d → F_DE ∪ F_AE specifies the continuous evolution of the automaton in terms of discrete-time difference equations F_DE and algebraic equations F_AE for each mode x_d ∈ X_d. T_s denotes the sampling period of the discrete-time difference equations.

• The finite set T of transitions specifies the probabilistic discrete evolution of the automaton.

3 When clear from context, we use lowercase bold symbols, such as v, to denote a set of variables {v_1, ..., v_l}, as well as a vector [v_1, ..., v_l]^T with components v_i.
Complex systems are modeled as a composition of concurrently operating PHA that represent the individual system components. A concurrent probabilistic hybrid automaton (cPHA) specifies this composition as well as its interconnection to the outside world:

Definition 2 A concurrent probabilistic hybrid automaton (cPHA) CA is described as a tuple ⟨A, u, y_c, v_s, v_o, N_x, N_y⟩:

• A = {A_1, A_2, ..., A_l} denotes the finite set of PHAs that represent the components A_i of the cPHA (we denote the components of a PHA A_i by x_di, x_ci, u_di, u_ci, y_ci, F_i, X_di).

• The input variables u = u_d ∪ u_c of the automaton consist of the sets of discrete input variables u_d = u_d1 ∪ ... ∪ u_dl (command variables) and continuous input variables u_c ⊆ u_c1 ∪ ... ∪ u_cl.

• The output variables y_c ⊆ y_c1 ∪ ... ∪ y_cl specify the observed output variables of the cPHA.

• The observation process is subject to additive, zero-mean Gaussian sensor noise. N_y : X_d → IR^{m×m} specifies the mode-dependent (4) disturbance v_o in terms of the covariance matrix R = diag(r_i).

• N_x specifies additive, zero-mean Gaussian disturbances that act upon the continuous state variables x_c = x_c1 ∪ ... ∪ x_cl. N_x : X_d → IR^{n×n} specifies the mode-dependent disturbance v_s in terms of the covariance matrix Q.

Definition 3 The hybrid state x_(k) of a cPHA at time-step k specifies the mode assignment x_d,(k) of the mode variables x_d = {x_d1, ..., x_dl} and the continuous state assignment x_c,(k) of the continuous state variables x_c = x_c1 ∪ ... ∪ x_cl.
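As a rough data-structure sketch of Definition 1 (the field names and types are ours, not the paper's), a PHA tuple might be represented as:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

@dataclass
class PHA:
    """Minimal sketch of a discrete-time probabilistic hybrid automaton
    (Definition 1): one discrete mode variable with finite domain, continuous
    state variables, I/O variables, per-mode equations and probabilistic
    transitions. Field names are illustrative only."""
    modes: List[str]                  # finite domain X_d of the mode variable x_d
    xc: List[str]                     # continuous state variable names
    inputs_d: List[str]               # discrete command variables u_d
    inputs_c: List[str]               # continuous input variables u_c
    outputs_c: List[str]              # continuous output variables y_c
    F: Dict[str, Callable]            # mode -> difference/algebraic equations
    T: List[Tuple[str, str, float]]   # probabilistic transitions (from, to, prob)
    Ts: float = 1.0                   # sampling period of the difference equations
```

A cPHA would then hold a list of such components plus the shared-variable wiring between them.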
Interconnection among the cPHA components A_i is achieved via shared continuous I/O variables w_c ∈ u_ci ∪ y_ci only. Fig. 1 illustrates a simple example composed of 3 PHAs.
Figure 1. Example cPHA composed of three PHAs
A cPHA specifies a mode-dependent discrete-time model for a plant with command inputs u_d, continuous inputs u_c, continuous outputs y_c, mode x_d, continuous state variables x_c and additive, zero-mean Gaussian disturbances v_s, v_o. The discrete-time evolution of x_c and y_c is described by the nonlinear system of difference equations (sampling period T_s)

    x_c,(k) = f_(k)(x_c,(k-1), u_c,(k-1)) + v_s,(k-1)
    y_c,(k) = g_(k)(x_c,(k), u_c,(k)) + v_o,(k).        (1)

The functions f_(k) and g_(k) are obtained by symbolically solving (5) the set of equations F_1(x_d1,(k)) ∪ ... ∪ F_l(x_dl,(k)) given the mode x_d,(k) = [x_d1,(k), ..., x_dl,(k)]^T.
4 E.g., sensors can experience different magnitudes of disturbances for different modes.
5 Our symbolic solver restricts the algebraic equations and nonlinear functions to ones that can be solved explicitly and utilizes a Gröbner basis approach [3] to derive a set of equations of form (1).
To detect the onset of subtle failures, it is essential that a monitoring and diagnosis system is able to accurately extract the hybrid state of a system from a signal that may be hidden among disturbances, such as measurement noise. This is the role of a hybrid observer. More precisely:

Hybrid Estimation Problem: Given a cPHA CA, a sequence of observations {y_c,(0), y_c,(1), ..., y_c,(k)} and control inputs {u_(0), u_(1), ..., u_(k)}, estimate the most likely hybrid state x_(k) at time-step k.

A hybrid state estimate consists of a continuous state estimate, together with the associated mode. We denote this by the tuple

    x̂_(k) := ⟨x_d,(k), x̂_c,(k), P_(k)⟩,

where x̂_c,(k) specifies the mean and P_(k) the covariance for the continuous state variables x_c. The likelihood of an estimate x̂_(k) is denoted by the hybrid belief-state h_(k)[x̂].
We perform hybrid estimation as an extended version of HMM-style belief-state update that accounts for the influence of the continuous dynamics upon the system's discrete modes. A major difference between hybrid estimation and an HMM-style belief-state update, as well as multi-model estimation, is, however, that hybrid estimation tracks a set of trajectories, whereas standard belief-state update and multi-model estimation aggregate trajectories which share the same mode. This difference is reflected in the first of the following two recursive functions which define our hybrid estimation scheme: h_(•k)[x̂_i] denotes an intermediate hybrid belief-state, based on transition probabilities only. Hybrid estimation determines for each x̂_j,(k-1) at the previous time-step k-1 the possible transitions, thus specifying candidate successor states to be tracked. Consecutive filtering provides the new hybrid state x̂_i,(k) and adjusts the hybrid belief-state h_(k)[x̂_i] based on the hybrid probabilistic observation function P_O(y_c,(k) | x̂_i,(k), u_c,(k)). The estimate x̂_j,(k) with the highest belief-state h_(k)[x̂_j] = max_i(h_(k)[x̂_i]) is taken as the hybrid estimate at time-step k.
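For contrast, the aggregated HMM-style belief-state update that this scheme extends can be sketched as follows; the observation likelihoods would, in the hybrid setting, come from per-mode dynamic filters (names and the two-step structure are ours):

```python
import numpy as np

def belief_update(b, T, obs_lik):
    """One step of HMM-style belief-state update: propagate the belief b
    through the mode transition matrix T (the intermediate belief based on
    transition probabilities only), then reweight each candidate mode by its
    observation likelihood, standing in for P_O(y_(k) | mode), and normalize."""
    b_pred = T.T @ b               # intermediate belief from transitions only
    b_new = obs_lik * b_pred       # adjust by the observation function
    return b_new / b_new.sum()     # renormalize to a probability vector
```

Unlike this aggregated update, the paper's hybrid estimator keeps distinct trajectories that happen to share a mode, since their continuous estimates differ.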
Tracking all possible trajectories of the system is almost always intractable because the number of trajectories becomes too large after only a few time-steps. In [9] we present an approximate anytime, anyspace algorithm that copes with the exponential growth, as well as with the large number of modes in a typical concurrent hybrid automaton model.
Hybrid estimation and other multi-model estimation schemes have in common that they require models that are 'close' mathematical descriptions of the system. They can fail severely whenever unforeseen, i.e. unmodeled, situations occur. As a consequence, we have to provide models for all operational modes as well as an exhaustive set of models for possible failure modes. Providing all possible failure models can be problematic even under the assumption of an exhaustive failure mode and effects analysis (FMEA). For instance, consider an incipient fault in a servo valve that causes the valve to drift off its nominal opening value. The drift (positive, negative, slow, fast, ...) depends on the fault. It is surely difficult to provide a mathematical model with the correct parameter values that captures all possible drift situations. Nor is it helpful to introduce a sufficiently large set of modes that captures possible instances of the drift fault, as this would introduce additional complexity for hybrid estimation by increasing the number of modes unnecessarily.
This requirement of hybrid mode estimation is in contrast to discrete model-based diagnosis schemes, such as GDE (e.g. [5, 6, 19]). Model-based diagnosis deduces the possible mode of the system based on nominal models and a few specified fault models only. The onset of possible fault scenarios is covered by the so-called unknown mode, which does not impose any constraints on the system's variables.
The next section provides an approach that systematically incorporates the concept of the unknown mode into our hybrid estimation scheme.
3 Estimation with Unknown Modes
The estimation scheme [9] requires a fully specified mode assignment x_di,(k) for each candidate trajectory that is tracked in the course of hybrid estimation. Only a fully specified mode allows us to deduce the mathematical model (1) for the overall system. This model is the basis for the dynamic filter (e.g. extended Kalman filter) that is used in the course of hybrid estimation.
Figure 2. MIMO filter (e.g. extended Kalman filter) for the cPHA example
For our illustrative 3 component example introduced above, this would mean that hybrid estimation calculates a multi-input multi-output (MIMO) filter (see Fig. 2) for mode xd_{i,(k)} = [m11, m21, m31]^T based on the mathematical model (3). This filter provides the hybrid state estimate x_{i,(k)} as well as the value of the hybrid probabilistic observation function PO(yc,(k) | x_{i,(k)}, uc,(k)) for the hybrid estimator (see Appendix A for the extended Kalman filter estimation details).
Let us assume the mode xd_{i,(k)} = [?, m21, m31]^T, which specifies that component 1 (A1) is in unknown mode. A component in unknown mode imposes no constraints (equations) among its variables (uc1 and the internal variable wc1, in our case). As a consequence, we cannot deduce an overall mathematical model of the form (1) and fail to provide the basis for the hybrid estimation scheme, the MIMO filter for mode xd_{i,(k)} = [?, m21, m31]^T.
Figure 3. Example cPHA with explicit noise inputs
However, a closer look at the cPHA interconnection (Fig. 3; the figure extends Fig. 1 by including the implicit noise inputs, as well as indicating the causality of the internal I/O variables) reveals that we can still estimate component 3 from its observed output yc2, using the observation yc1 as a substitute for the value of its input. This intuitive approach utilizes a decomposition of the cPHA as shown in Fig. 4.
Figure 4. Decomposed cPHA
The decomposition allows us to treat the concurrent parts of the system independently and calculate a filter cluster consisting of 2 independent filters. However, when calculating the individual filters for the cluster, we have to take into account that we use the measurement of the input to the third component (yc1) in place of its true value. This can be interpreted as additional additive noise at the component's input, as indicated in Fig. 4. The following modification of the covariance matrix Q3 for the state variables of A3 takes this into account:
Q̃3 = b3 r1 b3^T + Q3,    (6)
where r1 denotes the variance of the disturbance vo1 and b3 = [0, 1]^T denotes the input vector⁶ of A3 with respect to yc1.

Figure 5. Decomposed filter

A filter cluster consisting of extended Kalman filters and the MIMO extended Kalman filter are interchangeable, as they provide the same expected value for the continuous state (E(xc)) whenever the mode of the automaton is fully specified. However, the decomposed filter has the advantage that the probabilistic observation function PO of the overall system is given by
PO = ∏_j PO_j,    (7)

where PO_j denotes the probabilistic observation function of the j'th filter in the filter cluster.
This factorization of the probabilistic observation function allowsus to calculate an upper bound forPO whenever one or more com-ponents of the system are in unknown mode. We simply take theproduct over the remaining filters in the cluster. This is equivalentwith considering the upper bounds of the inequalitiesPOj ≤ 1 foreach unknown filterj. In our example with unknown componentA1
this would mean:PO ≤ PO2,
wherePO2 denotes the observation function for the filter that esti-mates the continuous state of componentA3.
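As a hypothetical sketch (function and variable names are ours, not the paper's LISP implementation), the bound can be computed by simply skipping unknown-mode filters in the product:

```python
# Illustrative sketch: the cluster-wide observation function PO is the
# product of the per-filter values PO_j (eq. 7); a filter whose component
# is in unknown mode contributes only its upper bound PO_j <= 1, i.e. it
# is skipped, so the result is an upper bound on PO.

def cluster_observation_function(filter_po_values):
    """filter_po_values: list of PO_j in [0, 1], or None for a filter
    whose component(s) are in unknown mode."""
    po = 1.0
    for po_j in filter_po_values:
        if po_j is None:          # unknown mode: use the bound PO_j <= 1
            continue
        po *= po_j
    return po

# Example mirroring the text: component A1 unknown, only the filter for
# A3 contributes, so PO <= PO_2.
bound = cluster_observation_function([None, 0.8])
```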
The following subsection provides a graph-based approach for filter cluster deduction that grounds the informally introduced decomposition on a more versatile basis.
3.1 System Decomposition and Filter Cluster Calculation
Starting point for the decomposition of the system for a cPHA mode xd is the set of equations

F1(xd_{1,(k)}) ∪ . . . ∪ Fl(xd_{l,(k)}) =: F(xd),    (8)

where Fj(xd_{j,(k)}) returns the appropriate set of equations for a component Aj whenever xd_{j,(k)} ∈ Xdj, or the empty set whenever the component is in unknown mode, i.e. xd_{j,(k)} = ?. Although we still have to solve the set of equations to arrive at the mathematical model of form (1), we can interpret the set of equations (8) as the
6 In the general case, we have to calculate bj for a cPHA component Aj and observed inputs uyc by linearization, more specifically: b_{j,(k)} = ∂fj/∂uyc |_{xc_{j,(k−1)}, uc_{j,(k−1)}}, where fj denotes the right-hand side of the difference equation for component Aj, uyc refers to the observed variables that are used as inputs to the component (i.e. uyc ⊂ yc), and xc_{j,(k−1)} as well as uc_{j,(k−1)} represent the state estimate and the continuous input for component Aj at the previous time-step, respectively.
raw model for the system given mode xd. The following decomposition performs a structural analysis of the raw model based on causal analysis [17, 20], structural observability analysis [7], and graph decomposition [1].
A cPHA model does not impose a fixed causal structure that specifies the directionality of automaton interconnections. Causality is implicitly specified by the set of equations. This increases the expressiveness of the modeling framework but requires us to perform a causal analysis of the raw model (8) as a first step. The deduction of the causal dependencies is done by applying the bipartite-matching based algorithm presented in [17]. The resulting directed graph records the causal dependencies among the variables of the system (Fig. 6 shows the graph for the illustrative 3 PHA example). Each vertex of the graph represents one equation ei ∈ F or an exogenous variable specification (e.g. uc1) and is labeled by its dependent variable, which also specifies the outgoing edge (in the following, we will use the variable name to refer to the corresponding vertex in the graph). Vertices without incoming edges specify the exogenous variables.

Figure 6. Causal graph for the cPHA example
Definition 4 A causal graph of a cPHA CA at a mode xd is a directed graph that records the causal dependencies among the variables v ∈ ⋃_i (xc_i ∪ uc_i ∪ yc_i) of CA. We denote the causal graph by CG(CA, xd) and sometimes omit arguments where no confusion seems likely.
The goal of our analysis is to obtain a set of independent subsystems that utilize observed variables as virtual inputs. Therefore, we slice the graph at observed-variable vertices with outgoing edges, insert a new vertex to represent a virtual input, and re-map the sliced outgoing edges to this vertex. Fig. 7 demonstrates this re-mapping for the causal graph of Fig. 6. The observed variables are yc1 and yc2. Only the vertex with dependent variable yc1 has an outgoing edge; thus we slice the graph at yc1 → xc2 and re-map the edge to the virtual input uyc1.
Figure 7. Remapped causal graph for the cPHA example
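The slicing and re-mapping step can be sketched as follows. This is an illustrative sketch: the adjacency-list representation is an assumption, and the edge set below is a stand-in chain rather than an exact reproduction of Fig. 6; only the slice at yc1 → xc2 follows the running example.

```python
# Slice the causal graph at observed vertices with outgoing edges and
# re-map those edges to fresh virtual-input vertices (e.g. yc1 -> uyc1).

def slice_at_observed(edges, observed):
    """edges: dict vertex -> list of successor vertices."""
    sliced = {}
    for src, dsts in edges.items():
        if src in observed and dsts:        # observed vertex with out-edges
            sliced['u' + src] = list(dsts)  # virtual input, e.g. 'uyc1'
            sliced[src] = []                # the observed vertex keeps none
        else:
            sliced[src] = list(dsts)
    return sliced

# Assumed illustrative edge set (not the exact graph of Fig. 6):
causal = {'uc1': ['xc1'], 'xc1': ['wc1'], 'wc1': ['yc1'],
          'yc1': ['xc2'], 'xc2': ['xc3'], 'xc3': ['yc2'], 'yc2': []}
sliced = slice_at_observed(causal, {'yc1', 'yc2'})
# the edge yc1 -> xc2 has been re-mapped to uyc1 -> xc2
```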
A dynamic filter (e.g. extended Kalman filter) can only estimate the observable part of the model. Therefore, it is essential to perform an observability analysis prior to calculating the filter, so that non-observable parts of the model are excluded. We perform this analysis on a structural basis⁷.
Definition 5 We call a variable v of a cPHA CA at mode xd structurally observable (SO) whenever it is directly observed, i.e. v ∈ yc, or there exists at least one path in the causal graph CG(CA, xd) that connects the variable v to an output variable y ∈ yc of CA.
A filter estimates the state variables xc of a dynamic system based on observations yc and the inputs uc that act upon the state variables xc. The required knowledge about the inputs uc indicates that the structural observability criterion is not yet sufficient to determine the submodel for estimation. We also have to make sure that no unknown exogenous input influences a variable. To illustrate this, consider again the 3 PHA example with mode xd = [?, m21, m31]^T. Component 1 in unknown mode omits the equation that relates the variables uc1 and wc1. This leads to a causal graph CG (Fig. 8) where wc1 is labeled as exogenous (no incoming edges). This unknown exogenous input influences the state variable xc1 and, as a consequence, prevents us from estimating it!
Figure 8. Remapped causal graph for the cPHA example with unknown component A1
We extend our structural analysis of the causal graph by the following criterion:
Definition 6 We call a variable v of a cPHA CA at mode xd structurally determined (SD) whenever it is an input variable of the automaton, i.e. v ∈ uc, or there does not exist a path in the causal graph CG(CA, xd) that connects an exogenous variable ue ∉ uc with v.
Furthermore, it is helpful to eliminate loops in the causal graph prior to checking variables against both structural criteria. For this purpose, we calculate the strongly connected components of the causal graph [1].
Definition 7 A strongly connected component (SCC) of the causal graph CG is a maximal set SCC of variables in which there is a path from any one variable in the set to another variable in the set.
Fig. 9 shows the remapped causal graph for the 3 PHA example after grouping variables into strongly connected components.
The strong interconnection among variables in an SCC implies that:

1. Structural observability of variables in an SCC follows directly from structural observability of at least one variable in the SCC.
7 Throughout the paper we assume that loss of observability is caused by a structural defect of the model. Otherwise, it is necessary to perform an additional numerical observability test [18], as structural observability only provides a necessary condition for observability.
Figure 9. Causal SCC graph for the cPHA example
2. A variable in an SCC is structurally determined if and only if all variables in the SCC are structurally determined.
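The grouping into strongly connected components can be computed with any standard SCC algorithm; the following is a compact Kosaraju-style sketch (illustrative, not the paper's implementation), applied to an assumed fragment of the remapped graph in which xc2 and xc3 depend on each other:

```python
# Kosaraju's algorithm: DFS finish order on the graph, then DFS on the
# reversed graph in decreasing finish order; each second-phase tree is
# one strongly connected component.

def sccs(graph):
    order, seen = [], set()

    def dfs(v, g, out):                 # iterative DFS, appends post-order
        stack = [(v, iter(g.get(v, [])))]
        seen.add(v)
        while stack:
            u, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(g.get(w, []))))
                    break
            else:
                stack.pop()
                out.append(u)

    for v in graph:                     # phase 1: finish order
        if v not in seen:
            dfs(v, graph, order)
    rev = {v: [] for v in graph}        # build the reversed graph
    for v, ws in graph.items():
        for w in ws:
            rev[w].append(v)
    seen, comps = set(), []
    for v in reversed(order):           # phase 2: collect components
        if v not in seen:
            comp = []
            dfs(v, rev, comp)
            comps.append(sorted(comp))
    return comps

# Assumed fragment: xc2 and xc3 form a loop and collapse into one SCC
# (cf. the grouping shown in Fig. 9).
g = {'uyc1': ['xc2'], 'xc2': ['xc3'], 'xc3': ['xc2', 'yc2'], 'yc2': []}
comps = sccs(g)
```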
As a consequence, we can apply our structural analysis to strongly connected components directly and operate on the SCC graph, i.e. a causal graph without loops. The analysis of a strongly connected component with respect to structural observability and structural determination (SOD) can be outlined as follows:
function determine-SOD-of-SCC(SCC, uc, k)
    when SOD-undetermined?(SCC)
        if exogenous?(SCC)
            then vi ← independent-var(SCC)
                 if vi ∈ uc then SD(SCC) ← True
                 else SD(SCC) ← False
            else V ← uplink-SCCs(SCC)
                 loop for SCCi in V
                     do determine-SOD-of-SCC(SCCi, uc, k)
                 SO(SCC) ← True
                 SD(SCC) ← all-uplink-SCCs-are-SD?(V)
                 cluster-index(SCC) ← k ∪ cluster-indices(V)
        SOD-determined(SCC) ← True
    return Nil
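A Python rendering of determine-SOD-of-SCC might look as follows, under assumed data structures (each SCC is a dict carrying its variables, its uplink SCCs in the SCC graph, and the SOD labels; these structures are ours, not the paper's):

```python
# SOD traversal over the SCC graph, walking backwards from an observed
# SCC towards the inputs (recursion mirrors the pseudocode above).

def determine_sod(scc, uc, k):
    """scc: dict with 'vars' and 'uplinks'; uc: set of known input vars."""
    if scc.get('sod_determined'):
        return
    if not scc['uplinks']:                 # exogenous SCC (a source vertex)
        vi = scc['vars'][0]                # its independent variable
        scc['SD'] = vi in uc               # determined iff it is a known input
    else:
        for up in scc['uplinks']:
            determine_sod(up, uc, k)
        scc['SO'] = True                   # reached backwards from an output
        scc['SD'] = all(up['SD'] for up in scc['uplinks'])
        scc['cluster'] = {k}.union(*(up.get('cluster', set())
                                     for up in scc['uplinks']))
    scc['sod_determined'] = True

# Tiny example: a state SCC fed only by the known input uc1.
s_in = {'vars': ['uc1'], 'uplinks': []}
s_x = {'vars': ['xc1'], 'uplinks': [s_in]}
determine_sod(s_x, uc={'uc1'}, k=1)
```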
Our structural analysis algorithm determines structural observability and determination (SOD) of a variable by traversing the SCC graph backwards from the observed variables towards the inputs. In the course of this analysis, we label non-exogenous strongly connected components with an index that refers to their cluster membership. This indexing scheme allows us to cluster the variables into non-overlapping clusters with respect to the observed variables. The direct relation between a variable, its determining equation, and the cPHA component that specified this equation leads to the component clusters sought. The structural analysis can be summarized as follows:
function component-clustering(CA, xd)
    returns a set of cPHA component clusters
    yc ← observed-vars(CA)
    CG ← remap-causal-graph(CG(CA, xd), yc)
    uc ← virtual-inputs(CG) ∪ input-vars(CA)
    CGSCC ← strongly-connected-component-graph(CG)
    k ← 0
    loop for SCCi in output-SCCs(CGSCC, yc)
Figure 10. Labeled and partitioned causal SCC graph for the 3 cPHA example
Each component cluster defines the observable and determined raw model for a subsystem of the cPHA. This raw model can be solved symbolically and provides the nonlinear system of difference equations (a model similar to (1), but with the additional virtual inputs) that is the basis for the corresponding filter in the filter cluster. In this way we exclude the unobservable and/or undetermined parts of the overall system from estimation.
Whenever a state variable xcj becomes unobservable and/or undetermined (e.g. due to a mode change) during hybrid estimation, we hold the value for the mean at its last known estimate xcj and increase its variance σj² = pjj by a constant factor at each hybrid estimation step. This reflects a continuously decreasing confidence in the estimate xcj and allows us to restart estimation whenever the variable becomes observable and determined again⁸.
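This hold-and-inflate rule can be sketched as follows; the inflation factor is an assumed tuning parameter, since the paper only states that the variance grows by a constant factor per step:

```python
# Hold the mean and geometrically inflate the variance while a state
# variable is unobservable/undetermined.

INFLATION = 1.5   # illustrative constant factor per estimation step

def hold_unobservable(mean, variance):
    """One hybrid-estimation step for an unobservable/undetermined state."""
    return mean, variance * INFLATION   # mean held, confidence decays

m, v = 3.2, 0.04
for _ in range(3):                      # three steps without observability
    m, v = hold_unobservable(m, v)
# mean unchanged, variance grown by INFLATION**3
```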
4 Example - BIO-Plex
Our application is the BIO-Plex Test Complex at NASA Johnson Space Center, a five chamber facility for evaluating biological and physiochemical Martian life support technologies. It is an artificial, biosphere-type, closed environment, which must robustly provide all the air, water, and most of the food for a crew of four without interruption. Plants are grown in plant growth chambers, where they provide food for the crew and convert the exhaled CO2 into O2. In order to maintain a closed-loop system, it is necessary to control the resource exchange between the chambers without endangering the crew. For the scope of this paper, we restrict our evaluation to the sub-system dealing with CO2 control in the plant growth chamber (PGC), shown in Fig. 11.
The system is composed of several components, such as redundant flow regulators (FR1, FR2) that provide continuous CO2 supply, redundant pulse injection valves (PIV1, PIV2) that provide a means for increasing the CO2 concentration rapidly, a lighting system (LS), and the plant growth chamber (PGC) itself. The control system maintains a CO2 concentration of 1200 ppm, optimal for plant growth, during the day phase of the system (20 hours/day).
Hybrid estimation schemes are key to tracking system operational modes, as well as detecting subtle failures and performing diagnoses. For example, we simulate a failure of the second flow regulator. The regulator goes off-line and drifts slowly towards its positive limit. This fault situation is difficult to capture by an explicit fault model as we do not know, in advance, whether the regulator
8 Whenever a state variable xcj is directly observed, we can also utilize an alternative approach suggested in [15] that restarts the estimator with the observed value, thus improving the observer convergence time.
Figure 11. BIO-Plex plant growth chamber
drifts towards its positive or negative limit, nor do we know the magnitude of the drift. A fault of this type, which develops slowly and whose symptom is hidden among the noise in the system, is a typical candidate for our unknown-mode detection capability. However, we also provide explicit failure models that describe typical situations. For example, the PGC has 4 plant trays with one illumination bank for each tray. A blackout of one illumination bank can be interpreted as a 25% loss in light intensity. This situation can be modeled explicitly by a dynamical model that takes this reduced light intensity into account.
In the following, we describe the outcome of a simulated experiment where the flow regulator fault with drifting symptom is injected at time point k = 700, and an additional light fault, which harms one of the four illumination banks, is injected at k = 900. The faults are 'repaired' at k = 1100 and k = 1300 for the flow regulator fault and the lighting fault, respectively. This experiment illustrates unknown mode detection and recovery from it, nominal failure mode detection, and the multiple fault detection capability of our approach.
Figure 12. BIO-Plex cPHA model
The simulated data is gathered from the execution of a refined subset of NASA JSC's CONFIG model for the BIO-Plex system [12]. Hybrid estimation utilizes a cPHA model that consists of 6 components, as shown in Fig. 12. To illustrate the complexity of the hybrid estimation problem, we should note that the concurrent automaton has approximately 5^6 ≈ 15000 modes. Each mode describes the dynamic evolution of the chamber system by a third order system of difference equations. For example, the nominal operational condition for plant growth is characterized by the mode xd = [mr2, mr2, mv1, mv1, ml2, mp2], where mr2 characterizes a partially open flow regulator, mv1 a closed pulse injection valve, ml2 100% light on, and mp2 plant growth mode at 1200 ppm, respectively. This mode specifies the raw model:
xc1,(k) and xc2,(k) denote the gas flow ([g/min]) of flow regulators 1 and 2, respectively, and xc3,(k) denotes the CO2 gas concentration ([ppm]) in the plant growth chamber. wc1,(k) and wc2,(k) denote the gas flow ([g/min]) of the pulse injection valves, and wc3,(k) denotes the photosynthetic photon flux ([µmol/(m²s)]) of the lights above the plant trays. The nonlinear expression

−1.516 · 10^−4 f1(wc3,(k−1)) f2(xc3,(k−1))

approximates the CO2 gas production [g/min] due to photosynthesis according to the CO2 gas concentration and chamber illumination [12]. This raw model defines a third order system of discrete-time difference equations with sampling period Ts = 1 [min]:
Figure 13. Causal graph of the BIO-Plex cPHA raw model (9)
The causal graph (Fig. 13) of the raw model (9) leads to the decomposition of the system as shown in Fig. 14 (our implementation of the causal analysis and decomposition algorithms treats constant values, such as the value 1204.0 for the photosynthetic photon flux, as known exogenous inputs with constant value). The decomposition of the model leads to a filter cluster with 3 extended Kalman filters - one for each flow regulator and one for the remaining system (pulse injection valves, lighting system and plant growth chamber). This enables us to estimate the mode and continuous state of the flow regulators independently of the remaining system. As a consequence, an unknown mode in a flow regulator does not have any implications for the estimation of the remaining system.
Figure 14. Partitioned causal SCC graph of the BIO-Plex cPHA model
Fig. 15 shows the continuous input (control signal) uc1, the observed flow rates for flow regulators 1 and 2, and the CO2 concentration for the experiment. Both flow regulators provide half of the requested gas injection rate up to k = 700. At this time point, the second flow regulator starts to slowly drift towards its positive limit, which it will reach at approximately k = 800. The chamber control system reacts immediately and lowers the control signal in order to keep the CO2 concentration at the requested 1200 ppm. This transient behavior causes a slight bump in the CO2 concentration, as shown in Fig. 15-b. Our hybrid mode estimation system detects this unmodeled fault at k = 727 and declares flow regulator 2 to be in an unknown mode (we indicate the unknown mode by the mode number 0 in Fig. 16). The flow regulator mode stuck-open (mr5) becomes more and more likely as the regulator drifts towards its open position. Hybrid mode estimation prefers this mode as symptom explanation from k = 769 onwards, although flow regulator 2 goes into saturation a little bit later, at k = 800.

Figure 16. Mode estimate detail for flow regulator 2

Figure 15. Observed data and continuous estimation of the CO2 concentration in the plant growth chamber: (a) control input uc and measured CO2 input flow rates; (b) CO2 level in PGC (measurement - gray/green, estimate - black)
The light fault at k = 900 is detected almost instantly, at k = 904 (ml4). This good discrimination among the pre-specified modes (failure and nominal) is further demonstrated at the termination points of the faults. Repairs of flow regulator 2 and the lighting system are detected immediately at k = 1101 and k = 1301, respectively. Fig. 17 shows the mode estimation result for the lighting system and flow regulator 2 over the entire experiment horizon.
Figure 17. Mode estimates for flow regulator 2 and lighting system
5 Implementation and Discussion
The implementation of our hybrid estimation scheme extends previous work on hybrid estimation [9] and is written in Common LISP. The hybrid estimator uses a cPHA description and performs decomposition and estimation, as outlined above. Decomposition is done on-line according to the mode hypotheses that are tested in the course of hybrid estimation. In general, it can be assumed that the mode of the system evolves at a lower rate than the hybrid estimation rate, which operates on the sampling period Ts. Therefore, we cache recent decompositions and their corresponding filters for re-use, as a compromise between a-priori calculation (space complexity) and pure on-line deduction (time complexity).
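The caching compromise can be sketched as follows; the names, the cache size, and the stand-in decomposition function are illustrative (the actual implementation is in Common LISP):

```python
# Cache recent decompositions keyed by the (hashable) mode assignment;
# a bounded LRU cache trades a little space for skipping repeated
# on-line decomposition of the same mode hypothesis.

from functools import lru_cache

decompositions = []                     # records actual decomposition calls

def build_filter_cluster(mode):
    """Stand-in for the expensive decomposition + filter derivation."""
    decompositions.append(mode)
    return ('filter-cluster-for', mode)

@lru_cache(maxsize=32)                  # keep only recent decompositions
def filter_cluster_for(mode):
    return build_filter_cluster(mode)

nominal = ('mr2', 'mr2', 'mv1', 'mv1', 'ml2', 'mp2')
filter_cluster_for(nominal)
filter_cluster_for(nominal)             # cache hit: no second decomposition
```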
Optimized model-based estimation schemes, such as Livingstone [22], utilize conflicts to focus the underlying search operation. A conflict is a (partial) mode assignment that makes a hypothesis very unlikely. This requires a more general treatment of unknown modes compared to the filter decomposition task introduced above. The decompositional model-based learning system Moriarty [21] introduced continuous variants of conflicts, so-called dissents. We are currently reformulating these dissents for hybrid systems and investigating their incorporation to improve the underlying search scheme. This will lead to an overall framework that unifies our previous work on Livingstone, Moriarty, and hybrid estimation.
REFERENCES
[1] A. Aho, J. Hopcroft, and J. Ullman, Data Structures and Algorithms, Addison-Wesley, 1983.
[2] B. Anderson and J. Moore, Optimal Filtering, Information and System Sciences Series, Prentice Hall, 1979.
[3] Gröbner Bases and Applications, eds., B. Buchberger and F. Winkler, Cambridge Univ. Press, 1998.
[4] J. Chen and R. Patton, Robust Model-Based Fault Diagnosis for Dynamic Systems, Kluwer, 1999.
[5] J. de Kleer and B. Williams, 'Diagnosing multiple faults', Artificial Intelligence, 32(1), 97–130, (1987).
[6] J. de Kleer and B. Williams, 'Diagnosis with behavioral modes', in Proceedings of IJCAI-89, pp. 1324–1330, (1989).
[7] A. Gehin, M. Assas, and M. Staroswiecki, 'Structural analysis of system reconfigurability', in Preprints of the 4th IFAC SAFEPROCESS Symposium, volume 1, pp. 292–297, (2000).
[8] Readings in Model-Based Diagnosis, eds., W. Hamscher, L. Console, and J. de Kleer, Morgan Kaufmann, San Mateo, CA, 1992.
[9] M. Hofbaur and B.C. Williams, 'Mode estimation of probabilistic hybrid systems', in Hybrid Systems: Computation and Control, HSCC 2002, eds., C.J. Tomlin and M.R. Greenstreet, volume 2289 of Lecture Notes in Computer Science, 253–266, Springer Verlag, (2002).
[10] P. Li and V. Kadirkamanathan, 'Particle filtering based likelihood ratio approach to fault diagnosis in nonlinear stochastic systems', IEEE Transactions on Systems, Man, and Cybernetics - Part C, 31(3), 337–343, (2001).
[11] X.R. Li and Y. Bar-Shalom, 'Multiple-model estimation with variable structure', IEEE Transactions on Automatic Control, 41, 478–493, (1996).
[12] J. T. Malin, L. Fleming, and T. R. Hatfield, 'Interactive simulation-based testing of product gas transfer integrated monitoring and control software for the lunar mars life support phase III test', in SAE 28th International Conference on Environmental Systems, Danvers MA, (July 1998).
[13] P. Maybeck and R.D. Stevens, 'Reconfigurable flight control via multiple model adaptive control methods', IEEE Transactions on Aerospace and Electronic Systems, 27(3), 470–480, (1991).
[14] S. McIlraith, 'Diagnosing hybrid systems: a Bayesian model selection approach', in Proceedings of the 11th International Workshop on Principles of Diagnosis (DX00), pp. 140–146, (June 2000).
[15] P.J. Mosterman and G. Biswas, 'Building hybrid observers for complex dynamic systems using model abstractions', in Hybrid Systems: Computation and Control (HSCC'99), eds., F. Vaandrager and J. Schuppen, volume 1569 of LNCS, 178–192, Springer Verlag, (1999).
[16] S. Narasimhan and G. Biswas, 'Efficient diagnosis of hybrid systems using models of the supervisory controller', in Proceedings of the 12th International Workshop on Principles of Diagnosis (DX01), pp. 127–134, (March 2001).
[17] P. Nayak, Automated Modelling of Physical Systems, Lecture Notes in Artificial Intelligence, Springer, 1995.
[18] E. Sontag, Mathematical Control Theory: Deterministic Finite Dimensional Systems, Springer, New York, Berlin, Heidelberg, 2nd edn., 1998.
[19] P. Struss and O. Dressler, 'Physical negation: Integrating fault models into the general diagnostic engine', in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI'89), pp. 1318–1323, (1989).
[20] L. Trave-Massuyes and R. Pons, 'Causal ordering for multiple mode systems', in Proceedings of the 11th International Workshop on Qualitative Reasoning (QR97), pp. 203–214, (1997).
[21] B. Williams and B. Millar, 'Decompositional, model-based learning and its analogy to diagnosis', in Proceedings of the 15th National Conference on Artificial Intelligence (AAAI-98), (1998).
[22] B. Williams and P. Nayak, 'A model-based approach to reactive self-configuring systems', in Proceedings of the 13th National Conference on Artificial Intelligence (AAAI-96), (1996).
[23] A. S. Willsky, 'A survey of design methods for failure detection in dynamic systems', Automatica, 12(6), 601–611, (1976).
[24] F. Zhao, X. Koutsoukos, H. Haussecker, J. Reich, and P. Cheung, 'Distributed monitoring of hybrid systems: A model-directed approach', in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI'01), pp. 557–564, (2001).
Acknowledgments
In part supported by NASA under contract NAG2-1388.
A Extended Kalman Filter
The disturbances and imprecise knowledge about the initial state xc,(0) make it necessary to estimate the state by its mean xc,(k) and covariance matrix P(k). We use an extended Kalman filter [2] for this purpose, which updates its current state, like an HMM observer, in two steps. The first step uses the model to predict the mean of the state xc,(•k) and its covariance P(•k), based on the previous estimate ⟨xc,(k−1), P(k−1)⟩ and the control input uc,(k−1):
xc,(•k) = f(xc,(k−1), uc,(k−1))    (12)

A(k−1) = ∂f/∂x |_{xc,(k−1), uc,(k−1)}    (13)

P(•k) = A(k−1) P(k−1) A(k−1)^T + Q.    (14)
This one-step-ahead prediction leads to a prediction residual r(k) with covariance matrix S(k):
r(k) = yc,(k) − g(xc,(•k), uc,(k))    (15)

C(k) = ∂g/∂x |_{xc,(•k), uc,(k)}    (16)

S(k) = C(k) P(•k) C(k)^T + R.    (17)
The second filter step calculates the Kalman filter gain K(k) and refines the prediction as follows:
K(k) = P(•k) C(k)^T S(k)^−1    (18)

xc,(k) = xc,(•k) + K(k) r(k)    (19)

P(k) = [I − K(k) C(k)] P(•k).    (20)
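A minimal scalar sketch of the prediction/update cycle (12)-(20) follows; the model f, g and the noise variances Q, R are illustrative stand-ins, not the paper's BIO-Plex model, and for a scalar state the Jacobians A, C reduce to ordinary derivatives.

```python
# Scalar extended Kalman filter step implementing eqs. (12)-(20).

import math

Q, R = 1e-4, 1e-2                       # assumed process / observation noise

def f(x, u):                            # toy difference equation
    return 0.9 * x + u

def g(x, u):                            # toy observation function
    return math.sin(x)

def ekf_step(x_prev, p_prev, u_prev, u, y):
    # prediction, eqs. (12)-(14)
    x_pred = f(x_prev, u_prev)
    a = 0.9                             # df/dx at (x_prev, u_prev)
    p_pred = a * p_prev * a + Q
    # residual and its covariance, eqs. (15)-(17)
    r = y - g(x_pred, u)
    c = math.cos(x_pred)                # dg/dx at (x_pred, u)
    s = c * p_pred * c + R
    # gain and refinement, eqs. (18)-(20)
    k = p_pred * c / s
    x_new = x_pred + k * r
    p_new = (1.0 - k * c) * p_pred
    return x_new, p_new

# One step with an observation that exactly matches the prediction,
# so the residual is zero and only the covariance is refined.
x_est, p_est = ekf_step(0.5, 0.1, 0.0, 0.0, math.sin(0.45))
```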
The output of the extended Kalman filter, as used in our hybrid estimation system, is a sequence of mean/covariance pairs ⟨xc,(k), P(k)⟩ for xc,(k), as well as the hybrid probabilistic observation function