Knowledge Repn. & Reasoning Lec #24: Approximate Inference in DBNs UIUC CS 498: Section EA Professor: Eyal Amir Fall Semester 2004 (Some slides by X. Boyen & D. Koller, and by S. H. Lim; Some slides by Doucet, de Freitas, Murphy, Russell, and H. Zhou)
Dynamic Systems
• Filtering in stochastic, dynamic systems:
– Monitoring freeway traffic (for an autonomous driver or for traffic analysis)
– Monitoring a patient's symptoms
• Models to deal with uncertainty and/or partial observability in dynamic systems:
– Hidden Markov Models (HMMs), Kalman filters, etc.
– All are special cases of Dynamic Bayesian Networks (DBNs)
• Bayesian Network: a decomposed structure to represent the full joint distribution
• Does it imply easy decomposition for the belief state?
• No!
Tractable, approximate representation
• Exact inference in DBNs is intractable
• Need approximation:
– Maintain an approximate belief state
– E.g., assume Gaussian processes
• Today:
– Factored belief-state approximation [Boyen & Koller '98]
– Particle filtering (if time permits)
Idea
• Use a decomposable representation for the belief state (assume some independence in advance)
Problem
• What about the approximation errors?
– They might accumulate and grow without bound…
Contraction property
• Main result:
– If the process is mixing, then every state transition contracts the distance between the two distributions by a constant factor
– Since approximation errors from previous steps decrease exponentially, the overall error remains bounded indefinitely
Basic framework
• Definition 1:
– Prior belief state: $\sigma_\bullet^{(t)}[s_i] = P[s_i^{(t)} \mid r^{(0)}, \ldots, r^{(t-1)}]$
– Posterior belief state: $\sigma^{(t)}[s_i] = P[s_i^{(t)} \mid r^{(0)}, \ldots, r^{(t-1)}, r^{(t)}]$
• Monitoring task: propagate through the transition model $T$, then condition on the new observation via the observation model $O$:
$\sigma_\bullet^{(t+1)}[s_j] = \sum_{i=1}^{n} T[s_i \to s_j]\, \sigma^{(t)}[s_i]$
$\sigma^{(t+1)}[s_i] = \dfrac{O[s_i, r^{(t+1)}]\, \sigma_\bullet^{(t+1)}[s_i]}{\sum_{l=1}^{n} O[s_l, r^{(t+1)}]\, \sigma_\bullet^{(t+1)}[s_l]}$
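The two-step recursion above (predict through T, then condition on the observation via O) can be sketched in plain Python for a small HMM. All numbers below are made up for illustration; they are not from the lecture.

```python
def monitor(T, O, prior, observations):
    """Exact monitoring: propagate, then condition, as in the two equations above.

    T[i][j] = P[s_j at t+1 | s_i at t];  O[i][r] = P[observation r | state s_i].
    """
    sigma = prior
    n = len(prior)
    for r in observations:
        # prediction: sigma_dot[j] = sum_i T[i][j] * sigma[i]
        sigma_dot = [sum(T[i][j] * sigma[i] for i in range(n)) for j in range(n)]
        # conditioning: weight by O[., r] and renormalize
        w = [O[i][r] * sigma_dot[i] for i in range(n)]
        z = sum(w)
        sigma = [wi / z for wi in w]
    return sigma

# illustrative 2-state model
T = [[0.7, 0.3], [0.4, 0.6]]        # transition model
O = [[0.9, 0.1], [0.2, 0.8]]        # O[state][observation]
belief = monitor(T, O, [0.5, 0.5], [0, 0, 1])
```

The last observation (r = 1) is much more likely under state 1, so the posterior shifts toward it.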
Simple contraction
• Distance measure:
– Relative entropy (KL-divergence) between the actual and the approximate belief state:
$D[\varphi \,\|\, \psi] = \sum_i \varphi[i] \ln \dfrac{\varphi[i]}{\psi[i]}$
• Contraction due to O (in expectation over observations):
$E_{r^{(t)}}\!\big[\, D[\, O_r[\varphi^{(t)}] \,\|\, O_r[\hat\psi^{(t)}] \,]\,\big] \le D[\varphi^{(t)} \,\|\, \hat\psi^{(t)}]$
• Contraction due to T (can we do better?):
$D[\, T[\varphi^{(t)}] \,\|\, T[\hat\psi^{(t)}] \,] \le D[\varphi^{(t)} \,\|\, \hat\psi^{(t)}]$
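The contraction due to T is easy to check numerically. A minimal sketch (the transition matrix and the two distributions are illustrative, not from the lecture):

```python
import math

def kl(p, q):
    # D[p || q] = sum_i p[i] * ln(p[i] / q[i])  (relative entropy)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def step(T, dist):
    # push a distribution through the transition model T
    n = len(dist)
    return [sum(T[i][j] * dist[i] for i in range(n)) for j in range(n)]

T = [[0.7, 0.3], [0.4, 0.6]]       # a mixing chain: all entries positive
phi, psi = [0.9, 0.1], [0.3, 0.7]

d_before = kl(phi, psi)
d_after = kl(step(T, phi), step(T, psi))
```

After one transition the two distributions are strictly closer in KL-divergence (`d_after < d_before`).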
Simple contraction (cont)
• Definition:
– Minimal mixing rate:
$\gamma_Q = \min_{i_1, i_2} \sum_{j=1}^{n} \min\big(Q[j \mid i_1],\, Q[j \mid i_2]\big)$
• Theorem 3 (the single-process contraction theorem):
– For process Q, anterior distributions φ and ψ, and ulterior distributions φ′ and ψ′:
$D[\varphi' \,\|\, \psi'] \le (1 - \gamma_Q)\, D[\varphi \,\|\, \psi]$
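Theorem 3 can be verified numerically for a toy chain. In this sketch the matrix and distributions are invented for illustration:

```python
import math

def kl(p, q):
    # relative entropy D[p || q]
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mixing_rate(Q):
    # gamma_Q = min over state pairs (i1, i2) of sum_j min(Q[j|i1], Q[j|i2]),
    # with Q[i][j] = Q[j | i] (row-stochastic matrix)
    n = len(Q)
    return min(sum(min(Q[i1][j], Q[i2][j]) for j in range(n))
               for i1 in range(n) for i2 in range(n))

def step(Q, dist):
    n = len(dist)
    return [sum(Q[i][j] * dist[i] for i in range(n)) for j in range(n)]

Q = [[0.7, 0.3], [0.4, 0.6]]
gamma = mixing_rate(Q)                 # min(0.7, 0.4) + min(0.3, 0.6) = 0.7
phi, psi = [0.9, 0.1], [0.3, 0.7]
lhs = kl(step(Q, phi), step(Q, psi))   # ulterior distance
rhs = (1 - gamma) * kl(phi, psi)       # contracted anterior distance
```

Here `lhs <= rhs`, i.e. one transition contracts the KL-divergence by at least the factor (1 - γ_Q).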
Simple contraction (cont)
• Proof Intuition:
Compound processes
• Mixing rate could be very small for large processes
• The trick is to assume some independence among subprocesses and factor the DBN along these subprocesses
• Fully independent subprocesses:
– Theorem 5:
• For L independent subprocesses T1, …, TL, let γl be the mixing rate of Tl and let γ = minl γl. Let φ and ψ be distributions over S1(t), …, SL(t), and assume that ψ renders the Sl(t) marginally independent. Then:
$D[\varphi' \,\|\, \psi'] \le (1 - \gamma)\, D[\varphi \,\|\, \psi]$
Compound processes (cont)
• Conditionally independent subprocesses
• Theorem 6 (the main theorem):
– For L subprocesses T1, …, TL, assume each process depends on at most r others and influences at most q others. Let γl be the mixing rate of Tl and let γ = minl γl. Let φ and ψ be distributions over S1(t), …, SL(t), and assume that ψ renders the Sl(t) marginally independent. Then:
$D[\varphi' \,\|\, \psi'] \le (1 - \gamma^*)\, D[\varphi \,\|\, \psi], \quad \text{where } \gamma^* = \frac{\gamma^r}{q}$
Efficient, approximate monitoring
• If each approximation incurs an error bounded by ε, then:
– Total error $\le \varepsilon + (1 - \gamma^*)\varepsilon + (1 - \gamma^*)^2 \varepsilon + \cdots \le \dfrac{\varepsilon}{\gamma^*}$
• => error remains bounded
• Conditioning on observations might introduce momentary errors, but the expected error will contract
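The geometric-series argument can be checked with a tiny simulation: at each step the old error contracts by (1 - γ*), then a fresh approximation error of at most ε is added. The values of ε and γ* below are made up.

```python
eps, gamma_star = 0.01, 0.25

err = 0.0
for _ in range(1000):
    # contract the accumulated error, then add this step's fresh error
    err = (1 - gamma_star) * err + eps
```

The error climbs toward, but never exceeds, the steady-state bound ε / γ* = 0.04.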
Approximate DBN monitoring
• Algorithm (based on standard clique-tree inference):
1. Construct a clique tree from the 2-TBN
2. Initialize the clique tree with conditional probabilities from the CPTs of the DBN
3. For each time step:
a. Create a working copy Y of the tree; create σ(t+1)
b. For each subprocess l, incorporate the marginal σ(t)[Xl(t)] into the appropriate factor in Y
c. Incorporate evidence r(t+1) in Y
d. Calibrate the potentials in Y
e. For each l, query Y for the marginal over Xl(t+1) and store it in σ(t+1)
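The clique tree is an implementation detail; the core projection step can be sketched directly for two binary subprocesses coupled only through a shared observation. Everything below (transition models, observation model, evidence) is invented for illustration, and the factored belief is kept as a product of per-subprocess marginals.

```python
T1 = [[0.8, 0.2], [0.3, 0.7]]          # transition of subprocess 1
T2 = [[0.6, 0.4], [0.1, 0.9]]          # transition of subprocess 2
# O[x1][x2][r] = P[observation r | x1, x2]: couples the two subprocesses
O = [[[0.9, 0.1], [0.5, 0.5]],
     [[0.4, 0.6], [0.2, 0.8]]]

def bk_step(m1, m2, r):
    # exact joint update starting from the factored belief m1 x m2
    joint = [[sum(T1[a][x1] * m1[a] for a in range(2)) *
              sum(T2[b][x2] * m2[b] for b in range(2)) *
              O[x1][x2][r]
              for x2 in range(2)] for x1 in range(2)]
    z = sum(sum(row) for row in joint)
    # projection: keep only the marginals -- this is the approximation step
    new_m1 = [sum(joint[x1]) / z for x1 in range(2)]
    new_m2 = [sum(joint[x1][x2] for x1 in range(2)) / z for x2 in range(2)]
    return new_m1, new_m2

m1, m2 = [0.5, 0.5], [0.5, 0.5]
for r in [0, 1, 1]:                    # made-up observation sequence
    m1, m2 = bk_step(m1, m2, r)
```

The joint posterior after conditioning is generally not a product, so projecting back onto marginals introduces exactly the bounded error analyzed above.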
Conclusion of Factored DBNs
• Accuracy-efficiency tradeoff:
– Small partition => more efficiency, but more independence assumptions and thus lower accuracy
– Large partition => higher accuracy, but less efficiency