Dynamic Bayesian Networks
CSE 473
Dec 22, 2015
© Daniel S. Weld
473 Topics
• Agency
• Problem Spaces
• Search
• Knowledge Representation (Logic, Probability)
• Inference
• Planning
• Learning (Reinforcement Learning)
Last Time
• Basic notions
• Bayesian networks
• Statistical learning
  – Parameter learning (MAP, ML, Bayesian learning)
  – Naïve Bayes
  – Structure Learning
  – Expectation Maximization (EM)
• Dynamic Bayesian networks (DBNs)
• Markov decision processes (MDPs)
Generative Planning

Input:
• Description of (initial state of) world (in some KR)
• Description of goal (in some KR)
• Description of available actions (in some KR)

Output: a controller, e.g.
• Sequence of actions
• Plan with loops and conditionals
• Policy f: states -> actions
Simplifying Assumptions

[Figure: agent/environment loop, percepts in, actions out. What action next?]

• Static vs. Dynamic
• Fully Observable vs. Partially Observable
• Deterministic vs. Stochastic
• Instantaneous vs. Durative
• Full vs. Partial satisfaction
• Perfect vs. Noisy
"Classical Planning" assumes a static, deterministic, fully observable, instantaneous, propositional world. Relaxing each assumption leads to a different class of techniques:

• Dynamic → Replanning / Situated Plans
• Durative → Temporal Reasoning
• Continuous → Numeric Constraint Reasoning (LP/ILP)
• Stochastic → Contingent/Conformant Plans, Interleaved Execution; MDP Policies; POMDP Policies
• Partially Observable → Contingent/Conformant Plans, Interleaved Execution; Semi-MDP Policies
Uncertainty
• Disjunctive: next state could be one of a set of states.
• Probabilistic: next state is a probability distribution over the set of states.

Which is a more general model?
STRIPS Action Schemata
(:operator pick-up :parameters ((block ?ob1)) :precondition (and (clear ?ob1)
(on-table ?ob1) (arm-empty))
:effect (and (not (clear ?ob1)) (not (on-table ?ob1))
(not (arm-empty)) (holding ?ob1)))
• Instead of defining ground actions: pickup-A and pickup-B and …
• Define a schema:Note: strips doesn’t
allow derived effects;
you must be complete!}
© Daniel S. Weld 9
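The schema above, grounded for block A, can be modeled with precondition / add / delete sets of literals. A minimal sketch; the set-based encoding and function names are illustrative assumptions, not a planner's actual data structures:

```python
# Ground instance of pick-up applied to block A, as sets of literals.
pick_up_A = {
    "pre": {"clear_A", "on_table_A", "arm_empty"},
    "add": {"holding_A"},
    "del": {"clear_A", "on_table_A", "arm_empty"},
}

def applicable(state, action):
    # An action is applicable iff all its preconditions hold in the state.
    return action["pre"] <= state

def apply(state, action):
    # STRIPS semantics: remove the delete list, then add the add list.
    assert applicable(state, action)
    return (state - action["del"]) | action["add"]

s0 = {"clear_A", "on_table_A", "arm_empty"}
print(apply(s0, pick_up_A))  # {'holding_A'}
```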
Nondeterministic “STRIPS”?

pick_up_A
• Preconditions: arm_empty, clear_A, on_table_A
• One possible outcome (likelihood unknown, "?"): - arm_empty, - on_table_A, + holding_A
Probabilistic STRIPS?

pick_up_A
• Preconditions: arm_empty, clear_A, on_table_A
• With probability 0.9: - arm_empty, - on_table_A, + holding_A

But what does this all mean anyway?
Defn: Markov Model
• Q: set of states
• init prob distribution
• A: transition probability distribution
Probability Distribution, A
• Forward causality: the probability of s_t does not depend directly on values of future states.
• The probability of the new state could depend on the entire history of states visited: Pr(s_t | s_{t-1}, s_{t-2}, …, s_0)
• Markovian assumption: Pr(s_t | s_{t-1}, s_{t-2}, …, s_0) = Pr(s_t | s_{t-1})
• Stationary model assumption: Pr(s_t | s_{t-1}) = Pr(s_k | s_{k-1}) for all k.
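Under the Markovian and stationarity assumptions, the probability of an entire state sequence factors into the initial probability times a product of one-step transition probabilities. A small sketch; the two-state chain and its numbers are made up for illustration:

```python
# Hypothetical 2-state chain: state 0 and state 1.
init = [0.6, 0.4]            # Pr(s0)
A = [[0.8, 0.2],             # Pr(s_t | s_{t-1} = 0)
     [0.3, 0.7]]             # Pr(s_t | s_{t-1} = 1)

def sequence_prob(states):
    """Pr(s0, s1, ..., sn) = Pr(s0) * prod_t Pr(s_t | s_{t-1})."""
    p = init[states[0]]
    for prev, cur in zip(states, states[1:]):
        p *= A[prev][cur]
    return p

print(sequence_prob([0, 0, 1]))  # 0.6 * 0.8 * 0.2 ~ 0.096
```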
Representing A
• Q: set of states
• init prob distribution
• A: transition probabilities. How can we represent these?

[Figure: states s0 … s6 and a 7 × 7 matrix whose entry p12 is the probability of transitioning from s1 to s2.]

∑ ? Each row is a probability distribution, so it must sum to 1: ∑_j p_ij = 1.
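The explicit representation is a row-stochastic matrix; multiplying a belief by it gives the one-step-ahead distribution. A quick sketch with an illustrative 3-state matrix (the numbers are made up):

```python
# Transition matrix: A[i][j] = Pr(next = j | current = i).
A = [
    [0.5, 0.5, 0.0],
    [0.1, 0.6, 0.3],
    [0.0, 0.2, 0.8],
]

# Sanity check: every row sums to 1.
for row in A:
    assert abs(sum(row) - 1.0) < 1e-9

def step(dist):
    """One-step update: new_dist[j] = sum_i dist[i] * A[i][j]."""
    n = len(dist)
    return [sum(dist[i] * A[i][j] for i in range(n)) for j in range(n)]

print(step([1.0, 0.0, 0.0]))  # [0.5, 0.5, 0.0]
```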
Factoring Q
• Represent Q simply as a set of states?
• Is there internal structure? Consider a robot domain: what is the state space?
A Factored Domain
• Variables: has_user_coffee (huc), has_robot_coffee (hrc), robot_is_wet (w), has_robot_umbrella (u), raining (r), robot_in_office (o)
• How many states? 2^6 = 64
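The factored state space is just the cross product of the variables' domains, which can be enumerated directly. A sketch using the six Boolean variables above:

```python
from itertools import product

# The six Boolean state variables of the robot-coffee domain.
variables = ["huc", "hrc", "w", "u", "r", "o"]

# Every True/False assignment to the six variables is one state.
states = list(product([False, True], repeat=len(variables)))
print(len(states))  # 64
```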
Representing Compactly
• Q: set of states
• init prob distribution. How can we represent this efficiently?

With a Bayes net (of course!)

[Figure: a Bayes net over the state variables, e.g. r, u, hrc, w.]
Representing A Compactly
• Q: set of states
• init prob distribution
• A: transition probabilities

[Figure: the explicit transition matrix, now over all 64 states s0 … s63.]

How big is the matrix version of A? 64 × 64 = 4096 entries.
Dynamic Bayesian Network

[Figure: two slices, T and T+1, each over huc, hrc, w, u, r, o; arcs run from slice T into slice T+1. The CPTs of the slice-(T+1) variables have 8, 4, 16, 4, 2, and 2 entries.]

Total values required to represent the transition probability table = 36, vs. 4096 required for a complete state probability table.
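The 36-vs-4096 comparison comes from counting CPT entries per variable instead of one cell per pair of joint states. A sketch; the per-variable parent counts here are assumptions chosen to reproduce the slide's totals (a Boolean variable with k parents needs 2^k entries):

```python
# Assumed number of parents for each slice-(T+1) variable.
parent_counts = {"huc": 3, "hrc": 2, "w": 4, "u": 2, "r": 1, "o": 1}

# DBN cost: one CPT per variable, 2**k rows for k Boolean parents.
dbn_entries = sum(2 ** k for k in parent_counts.values())

# Flat cost: a full 64 x 64 state-to-state transition matrix.
flat_entries = (2 ** 6) ** 2

print(dbn_entries, flat_entries)  # 36 4096
```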
Dynamic Bayesian Network

[Figure: the same two-slice (T, T+1) network over huc, hrc, w, u, r, o.]

Also known as a Factored Markov Model. Defined formally as:
• a set of random variables
• a BN for the initial state
• a 2-layer DBN for transitions
This is Really a “Schema”

[Figure: the two-slice (T, T+1) network.] The 2-layer DBN is a template: “unrolling” it repeats the T → T+1 slice once per time step.
Semantics: A DBN = A Normal Bayesian Network

[Figure: the 2-layer DBN unrolled into an ordinary Bayesian network over time slices 0, 1, 2, 3, each slice containing huc, hrc, w, u, r, o.]
Multiple Actions
• Variables: has_user_coffee (huc), has_robot_coffee (hrc), robot_is_wet (w), has_robot_umbrella (u), raining (r), robot_in_office (o)
• Actions: buy_coffee, deliver_coffee, get_umbrella, move
• We need an extra variable
  (Alternatively, a separate 2-layer DBN per action)
Actions in DBN

[Figure: the two-slice (T, T+1) network with an added action variable a whose value influences the slice-(T+1) variables.]
What can we do with DBNs?
• State estimation
• Planning, if you add rewards (next class)
State Estimation

[Figure: tracking the belief over Huc, Hrc, U, W through unknown states ("?") as the actions Move and BuyC are executed.]
Noisy Observations

[Figure: the two-slice DBN over huc, hrc, w, u, r, o, plus a sensor variable So attached to o in each slice. The state variables are true but unobserved; So is a noisy sensor reading; the action variable a is known with certainty.]

Sensor model: Pr(So = T | o = T) = 0.9 and Pr(So = T | o = F) = 0.2.
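Given such a sensor model, a single noisy reading updates the belief over the true variable by Bayes' rule. A sketch using the 0.9 / 0.2 numbers above; the uniform prior over o is an assumption for illustration:

```python
# Sensor model: Pr(So = True | o), as on the slide.
p_so_given_o = {True: 0.9, False: 0.2}

# Assumed prior belief over the true variable o.
prior = {True: 0.5, False: 0.5}

# Posterior Pr(o | So = True) via Bayes' rule: weight and renormalize.
unnorm = {o: prior[o] * p_so_given_o[o] for o in (True, False)}
z = sum(unnorm.values())
posterior = {o: p / z for o, p in unnorm.items()}

print(posterior[True])  # 0.45 / 0.55 ~ 0.818
```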
State Estimation

[Figure: the same tracking picture over Huc, Hrc, U, W, now also conditioning on sensor readings So = T, … at each step.]
Maybe we don’t know the action!

[Figure: the same two-slice DBN with noisy sensor So, but now the action variable a is another agent’s action and is itself unobserved.]
State Estimation

[Figure: the tracking picture with both the states and the actions unknown ("? ?"), given sensor readings So = T, ….]
Inference
• Expand the DBN (schema) into a BN; use inference as shown last week. Problem? The unrolled network grows with every time step, so exact inference quickly becomes impractical.
• Particle Filters
Particle Filters
• Create 1000s of “particles”
  – Each one has a “copy” of each random variable
  – These have concrete values
  – The distribution is approximated by the whole set
• Approximate technique, but fast!
• Simulate
  – Each time step, iterate per particle
    • Use the 2-layer DBN + a random-number generator to predict the next value of each random variable
  – Resample!
Resampling
• Simulate
  – Each time step, iterate per particle
    • Use the 2-layer DBN + a random-number generator to predict the next value of each random variable
• Resample:
  – Given observations (if any) at the new time point
  – Calculate the likelihood of each particle
  – Create a new population of particles by sampling from the old population, proportionally to likelihood
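The predict / weight / resample loop above can be sketched on a single Boolean state variable. The transition and sensor probabilities below are made-up assumptions, standing in for the 2-layer DBN:

```python
import random

random.seed(0)  # reproducible run

def transition(x):
    """Sample the next state: persist with prob 0.9, flip with prob 0.1."""
    return x if random.random() < 0.9 else not x

def likelihood(x, obs):
    """Pr(obs | x) for a noisy sensor that is right 0.9 of the time."""
    return 0.9 if obs == x else 0.1

def particle_filter_step(particles, obs):
    # Predict: push each particle through the transition model.
    predicted = [transition(x) for x in particles]
    # Weight: score each particle against the new observation.
    weights = [likelihood(x, obs) for x in predicted]
    # Resample: draw a new population proportionally to the weights.
    return random.choices(predicted, weights=weights, k=len(particles))

# 1000 particles, initially a uniform belief over {True, False}.
particles = [random.random() < 0.5 for _ in range(1000)]
for obs in [True, True, False]:
    particles = particle_filter_step(particles, obs)

# Approximate belief Pr(x = True) = fraction of particles that are True.
print(sum(particles) / len(particles))
```

Resampling proportionally to likelihood is what concentrates the population on states consistent with the observations.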