Fragment-Based Conformant Planning
James Kurien, Palo Alto Research Center
Pandu Nayak, Stratify, Inc.
David E. Smith, NASA Ames Research Center
Dec 31, 2015
Motivation: Planning for Spacecraft Recovery
System failures lead to uncertainty
• Internal actions are fairly reliable but do fail
• System interactions are complex
• Observability is limited
Diagnosis yields multiple states ranked by magnitude of probability
The system must choose actions to respond to the failure
Under certain conditions an action may be damaging or disallowed
[Figure: valve diagram with labels Closed, Open, and Stuck Valve]
Conformant Planning
Problem Instance
– Let Domain be a description of a planning domain
– Let Worlds be a set of initial states of the domain, {w1, w2, …, wn}
– Let G be a goal description
– There are no sensing actions
Task: Find a plan P that, applied to any wi, results in a state entailing G
P is a conformant plan
Challenge: Actions chosen in wi may have undesirable effects in wj
[Figure: plan P leading from the initial worlds w1, w2, …, wn to the goal G]
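The task definition above can be made concrete with a small checker. This is a minimal sketch, not the authors' implementation: the `Effect` and `Action` types, the `bomb_in_*` fact names, and the goal predicate are all hypothetical, and flushing is ignored to keep the example short.

```python
from typing import Callable, FrozenSet, List, NamedTuple, Tuple

State = FrozenSet[str]


class Effect(NamedTuple):
    when: FrozenSet[str]    # facts that must hold for the effect to fire
    add: FrozenSet[str]     # facts added when it fires
    delete: FrozenSet[str]  # facts removed when it fires


class Action(NamedTuple):
    name: str
    effects: Tuple[Effect, ...]


def apply(action: Action, state: State) -> State:
    """Apply every conditional effect whose condition holds in `state`."""
    fired = [e for e in action.effects if e.when <= state]
    for e in fired:
        state = (state - e.delete) | e.add
    return state


def is_conformant(plan: List[Action], worlds: List[State],
                  goal: Callable[[State], bool]) -> bool:
    """True if the same plan reaches a goal state from every initial world."""
    for world in worlds:
        state = world
        for action in plan:
            state = apply(action, state)
        if not goal(state):
            return False
    return True


def dunk(pkg: str) -> Action:
    """Hypothetical action: removes the bomb from `pkg` if one is there."""
    bomb = frozenset({f"bomb_in_{pkg}"})
    return Action(f"dunk {pkg}", (Effect(bomb, frozenset(), bomb),))


def no_bombs(state: State) -> bool:
    return not any(fact.startswith("bomb_in_") for fact in state)


worlds = [frozenset({"bomb_in_p1"}), frozenset({"bomb_in_p2"})]
```

With these two worlds, `[dunk("p1")]` solves only the first world, while `[dunk("p1"), dunk("p2")]` is conformant because it reaches the goal from either one.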
Existing Approaches to Conformant Planning
Generate a plan in wi and test if it achieves G in all Worlds
– C-Plan (Castellini, Giunchiglia & Tacchella 2001): SAT encoding determines possible plans which must be checked
Select actions for P by considering all Worlds simultaneously
– CGP (Smith & Weld 1998): Graphplan over multiple plan graphs
– CMBP (Cimatti & Roveri 1999): BDD representation of belief state
– GPT (Bonet & Geffner 2001): Heuristic search in space of belief states
– HSCP (Bertoli, Cimatti & Roveri 2001): BDD + heuristic search
An Observation on Conformant Plans

Example Domain: Bomb in the Toilet
– Set of N packages, p1 through pN
– Packages may have bombs (1, many, a subset)
– Bombs defused by dunking the package in the toilet
– The toilet must be flushed before dunking again

Example Problem
– 1 toilet
– 6 packages
– A bomb is in p1, p2, p3, p5 or (p4 & p6)

Conformant plan: Bomb in the Toilet, 6 packages, 1 toilet
Plan Step   Action
1           Dunk p3
2           Flush
3           Dunk p2
4           Flush
5           Dunk p1
6           Flush
7           Dunk p6
8           Flush
9           Dunk p4
10          Flush
11          Dunk p5

Fragments of this plan (Bomb in the Toilet, 6 packages, 1 toilet)
– Fragment if bomb in p1: step 5 (Dunk p1)
– Fragment if bombs in p6 and p4: steps 7 to 9 (Dunk p6, Flush, Dunk p4)
– Repair action to unify fragments

An Observation on Conformant Plans
For each world wi, every conformant plan P must contain a fragment that achieves the goal in wi
Each world has plans that are fragments of some conformant plan P
Approach:
Grow a set of fragments into a conformant plan

Fragment-based Conformant Planning
Intuition
For each wi in Worlds
1. Generate a plan for Domain to achieve G in wi
2. Add the planned actions to Domain
Step 2 ensures the plan for wi+1 includes the actions that achieved G in {w1, …, wi}

Fragment-based Conformant Planning
Planning Process
Plan   Plan for   Fragments for   Plan for   Extracted   Fragments for   Plan for
Step   p1         p2 plan         {p1,p2}    fragment    p3 plan         {p1,p2,p3}
1      Dunk p1    Dunk p1         Dunk p1                Dunk p1         Dunk p1
2                                                                        Flush
3                                 Flush                                  Dunk p3
4                                                                        Flush
5                                 Dunk p2    Dunk p2     Dunk p2         Dunk p2
Search will be required
– The fragment chosen for w1 may not allow a plan for w2
– The fragment chosen for w2 may disrupt the plan for w1

The FragPlan Algorithm
completed = ∅
While (Worlds ≠ ∅)
    select and remove world wi from Worlds
    Choose a plan Pi for Domain that achieves G in wi
    Fail if Pi doesn’t achieve G for all w ∈ completed
    Extract fragment Fi from Pi
    Domain = Domain + Fi
    add wi to completed
Return Pi
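The loop above can be sketched in Python on the single-toilet Bomb in the Toilet domain. This is an illustration under stated assumptions, not the FragPlan implementation: `toy_planner` is a hypothetical stand-in for the deterministic planner, a fragment is simplified to the set of packages whose dunking defused a bomb, "Domain + Fi" is modeled by forcing later plans to dunk those packages, and failure raises an error instead of triggering search.

```python
from typing import FrozenSet, List, Set, Tuple

Action = Tuple[str, str]   # ("dunk", "p1") or ("flush", "")
World = FrozenSet[str]     # packages that hold a bomb in this world


def achieves_goal(plan: List[Action], world: World) -> bool:
    """Simulate the plan in one world; True if every bomb ends up defused."""
    armed = set(world)
    needs_flush = False
    for kind, pkg in plan:
        if kind == "flush":
            needs_flush = False
        elif needs_flush:
            return False           # dunking again before flushing is not allowed
        else:
            armed.discard(pkg)
            needs_flush = True
    return not armed


def toy_planner(world: World, fragments: Set[str]) -> List[Action]:
    """Hypothetical planner: dunk every package required by a stored fragment
    or armed in `world`, with a flush between consecutive dunks."""
    plan: List[Action] = []
    for i, pkg in enumerate(sorted(fragments | world)):
        if i > 0:
            plan.append(("flush", ""))
        plan.append(("dunk", pkg))
    return plan


def frag_plan(worlds: List[World]) -> List[Action]:
    completed: List[World] = []
    fragments: Set[str] = set()     # assumption: fragment = packages to dunk
    plan: List[Action] = []
    for world in worlds:
        plan = toy_planner(world, fragments)
        # Fail if the plan does not achieve G in every completed world.
        if not all(achieves_goal(plan, w) for w in completed + [world]):
            raise RuntimeError("plan breaks a completed world; search needed")
        # Extract the fragment: the dunks that defused a bomb in this world.
        fragments |= {p for kind, p in plan if kind == "dunk" and p in world}
        completed.append(world)
    return plan


worlds = [frozenset({"p1"}), frozenset({"p2"}), frozenset({"p3"}),
          frozenset({"p5"}), frozenset({"p4", "p6"})]
plan = frag_plan(worlds)
```

In this toy setting the planner can always extend earlier fragments, so the failure branch never fires. In the general case the choice of Pi and Fi matters, which is why the search strategies are needed.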

Search Strategies
Chronological Backtracking
Probing
– Extend fragments to as many worlds as possible, then restart
– On failure, discard all fragments and empty completed
– Effective even when a small subset of worlds is very difficult
– Fits well with the deterministic planner we use to choose Pi for wi
Bubbling
– Find difficult worlds. Solve them first by moving them up the stack.
[Figure: stack of worlds w1, w3, w2, each with candidate fragments F1, F2, F3. Labels: "W1 fragments", "First world selected"]
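Probing can be illustrated with a toy model. The sketch below is an assumption-laden illustration rather than the authors' search code: each world is a hypothetical dictionary listing alternative fragments (sets of action names) and the actions that are harmful in that world, and a plan achieves the goal in a world if it contains one complete fragment for it and no harmful action.

```python
import random
from typing import Dict, List, Optional, Set, Tuple

World = Dict[str, object]  # {"fragments": [frozenset, ...], "harmful": set}


def achieves(plan: Set[str], world: World) -> bool:
    """Toy goal test: some fragment is fully included, nothing harmful is."""
    has_fragment = any(f <= plan for f in world["fragments"])
    return has_fragment and not (plan & world["harmful"])


def probe(worlds: List[World], rng: random.Random,
          max_restarts: int = 100) -> Tuple[Optional[Set[str]], int]:
    """Extend fragments world by world; on any failure discard everything
    (fragments and completed worlds) and restart with fresh random choices."""
    for attempt in range(1, max_restarts + 1):
        plan: Set[str] = set()
        completed: List[World] = []
        for world in worlds:
            candidate = plan | rng.choice(world["fragments"])
            if not all(achieves(candidate, w) for w in completed + [world]):
                break              # failure: restart from scratch
            plan = candidate
            completed.append(world)
        else:
            return plan, attempt   # every world was completed
    return None, max_restarts


# Fragment {"a"} solves w1 but is harmful in w2, so only {"b"} can be extended.
w1 = {"fragments": [frozenset({"a"}), frozenset({"b"})], "harmful": set()}
w2 = {"fragments": [frozenset({"c"})], "harmful": {"a"}}
plan, attempts = probe([w1, w2], random.Random(0))
```

Restarting from scratch keeps no search state besides the current fragments, which matches the slide's point that probing pairs well with a deterministic planner called once per world.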

Implementation
No actions with conditional outcomes in current implementation
– Planning graph cannot represent conditional outcome
– Conditional extension (Gazen & Knoblock 1997) not applicable
No non-deterministic actions
Essentially conformant BlackBox (Kautz & Selman 99)
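[Figure: FragPlan architecture. Inputs: Planning Domain (PDDL) and Worlds Specification. FragPlan components: Search Control and Fragment Extraction. FragPlan passes fragments and world wi to BlackBox (Graph Builder, Plan Graph, Graph to WFF, WFF, SAT (satz)), which returns plan Pi. Output: Conformant Plan]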

Experimental Setup
FragPlan tested on a number of domains
– Several variations of the bomb in the toilet problem
– Modified ringworld with no uncertain outcomes
– Logistics domain with uncertainty
Compared to performance quoted in the literature
– CMBP, C-Plan, GPT from (Castellini, Giunchiglia, & Tacchella 2001)
– HSCP from (Bertoli, Cimatti, & Roveri 2001)
FragPlan performance averaged over 30 probing runs

Performance on Bomb in the Toilet Problems
Problem instance   Time steps        GPT      CMBP     HSCP     C-Plan 850MHz    FragPlan 733MHz
Packages  Toilets  Serial  Parallel  850MHz   850MHz   300MHz   Time     Plans   Time   Calls to plan
6         1        11      11        0.08     0.04     0.01     221.55   52561   0.07   15.42
8         1        15      15        0.41     0.20     0.01     TIME     -       0.54   54.7
10        1        19      19        2.67     1.55     0.01     TIME     -       2.26   115.45
6         4        8       3                           0.01
8         4        12      3                           0.04
10        4        16      5                           0.04
6         5        7       3         3.29     16.80             419.53   98348   0.16   6
8         5        11      3         32.07    112.48            TIME     -       0.31   8
10        5        15      3         MEM      974.55            TIME     -       0.58   10
6         6        6       1                           0.05
8         6        10      3                           0.075
10        6        14      3                           0.1
6         10       6       1         74.15    MEM               0.01     1       0.1    6
8         10       8       1         MEM      MEM               0.01     1       0.16   8
10        10       10      1         MEM      MEM               0.04     1       0.26   10
HSCP dominates on serial instances
FragPlan is balanced
– HSCP, CMBP, GPT do not produce parallel plans
– C-Plan does poorly on serial instances of this problem
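The Serial and Parallel time-step columns in the table above follow a simple pattern. The two formulas below are read off the tabulated values rather than stated on the slides, so treat them as an observation. The function names are hypothetical.

```python
def serial_steps(packages: int, toilets: int) -> int:
    """One dunk per package, plus one flush for every dunk beyond the
    number of toilets (each toilet can take one dunk before any flush)."""
    return packages + max(0, packages - toilets)


def parallel_steps(packages: int, toilets: int) -> int:
    """Rounds of simultaneous dunks, with a flush step between rounds."""
    rounds = -(-packages // toilets)   # integer ceiling of packages / toilets
    return 2 * rounds - 1


# (packages, toilets, serial, parallel) rows copied from the table
rows = [(6, 1, 11, 11), (10, 1, 19, 19), (10, 4, 16, 5),
        (8, 5, 11, 3), (10, 6, 14, 3), (10, 10, 10, 1)]
matches = all(serial_steps(p, t) == s and parallel_steps(p, t) == par
              for p, t, s, par in rows)
```

The gap between the two columns widens as toilets are added, which is the effect the parallelism discussion relies on.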

10 Package Bomb in the Toilet with Parallelism
Problem instance   Time steps        GPT      CMBP     HSCP     C-Plan 850MHz    FragPlan 733MHz
Packages  Toilets  Serial  Parallel  850MHz   850MHz   300MHz   Time     Plans   Time   Calls to plan
10        1        19      19        2.67     1.55     0.01     TIME     -       2.26   115.45
10        4        16      5                           0.04
10        5        15      3         MEM      974.55            TIME     -       0.58   10
10        6        14      3                           0.1
10        10       10      1         MEM      MEM               0.04     1       0.26   10
Space of serialized plans explodes as parallelism increases
Parallelism renders fragments independent, yielding linear speedup

FragPlan Performance on Many Worlds
Independent sources of uncertainty yield many worlds
– N rooms with window open, closed or locked: 3^N worlds
– K bombs in N packages: (N choose K) worlds
Less planning, more checking
– Fragment for n independent events is often a plan for each
– If n is high, a few fragments yield a conformant plan
– In effect the plan is only checked on the remaining worlds
Constant space usage, except for fragments
[Figure: Iterations/Worlds ratio (axis ticks 1, 10, 100) versus Worlds (0 to 250) for BTC 10-1 (1 bomb, <= 2 bombs, <= 3 bombs) and MRING 5 (1 of 5 windows may not be locked, <= 2 windows, <= 3 windows, <= 4 windows)]
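The world counts above are easy to compute directly. The following is a small sketch with hypothetical function names. The "at most K bombs, at least one" variant is an assumption about how the "<= K bombs" cases might be counted, not something the slides specify.

```python
from math import comb


def window_worlds(rooms: int) -> int:
    """Each room's window is open, closed, or locked: 3^N initial worlds."""
    return 3 ** rooms


def bomb_worlds(packages: int, bombs: int) -> int:
    """Exactly K bombs hidden among N packages: N choose K initial worlds."""
    return comb(packages, bombs)


def at_most_bomb_worlds(packages: int, max_bombs: int) -> int:
    """Assumed variant: between 1 and max_bombs bombs among the packages."""
    return sum(comb(packages, k) for k in range(1, max_bombs + 1))


counts = {
    "5 rooms": window_worlds(5),
    "1 bomb in 10 packages": bomb_worlds(10, 1),
    "1 to 3 bombs in 10 packages": at_most_bomb_worlds(10, 3),
}
```

Even these small instances produce hundreds of worlds, which is why an approach whose memory use does not grow with the number of worlds is attractive.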

Handling Non-Deterministic Actions
Action A has n possible outcomes
– A yields Effect 1 or Effect 2
– Worlds {w1, w2}
Disjunction doesn’t ensure conformance
Replace A with A’, whose outcome is conditioned on a new proposition P
– If P, A’ yields Effect 1; if ¬P, A’ yields Effect 2
– Worlds {(w1, P), (w1, ¬P), (w2, P), (w2, ¬P)}
Algorithm changes
– Implement conditional effects
– Generate plan Pi for one execution in wi using A
– Substitute A’ for A in Pi
– Split completed worlds and wi
– Check Pi in all worlds, as before
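The world-splitting step can be sketched as follows. This is a minimal illustration under assumptions, not the planned implementation: states are sets of fact names, the hidden proposition is generalized to an indexed selector fact such as `A_outcome=0` (a hypothetical encoding), and effects only add facts.

```python
from typing import FrozenSet, List

State = FrozenSet[str]


def split_worlds(worlds: List[State], selector: str,
                 n_outcomes: int) -> List[State]:
    """Copy each world once per outcome, tagging the copy with a selector
    fact that fixes which outcome the action will have in that world."""
    return [w | {f"{selector}={k}"} for w in worlds for k in range(n_outcomes)]


def apply_determinized(state: State, selector: str,
                       outcomes: List[FrozenSet[str]]) -> State:
    """A': a deterministic action with one conditional effect per outcome.
    The selector fact already present in the state chooses the effect."""
    for k, effect in enumerate(outcomes):
        if f"{selector}={k}" in state:
            return state | effect
    return state   # no selector fact: the action has no effect here


worlds = [frozenset({"w1"}), frozenset({"w2"})]
split = split_worlds(worlds, "A_outcome", 2)
outcomes = [frozenset({"effect1"}), frozenset({"effect2"})]
after = [apply_determinized(s, "A_outcome", outcomes) for s in split]
```

After splitting, every world is deterministic again, so a plan can be checked in all worlds exactly as in the deterministic case, at the cost of multiplying the number of worlds by the number of outcomes.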

Message
Performs well on both serial and parallel problems
More scalable than other possible-worlds approaches
– Memory usage is constant as the number of worlds increases
– Computation is less susceptible to explosive growth
Probing is effective
Constructive approach
– Always have a plan
– Conformance increases in an anytime manner
– Can delete and add worlds and re-use partial results

Motivation: Planning for Spacecraft Recovery
Complex Plan Utility Function
• No safe, conformant plan may exist
• Safety always desired, often dominates
• Certain goals dominate at critical junctures
• A failure may force all actions to be unsafe
• Time for planning not known a priori
• We must have some plan
Given: Utility function on goals, safety, and worlds
Return: Best plan the available time allows
Initial State Uncertainty
• Internal actions are fairly reliable
• Systems are complex
• Observability is limited
• Failures yield multiple diagnoses
[Figure: valve diagram with labels Closed, Open, and Stuck Valve]

Safe, Conformant Planning with Optimization
Problem Instance
– Let Domain be a description of a planning domain
– Let Worlds be a set of initial states of the domain, {w1, w2, …, wn}
– Let G be a set of goals
– Let S be a set of safety constraints
– Let U be a utility function over (world x goal x safety)
Task:
– Find a plan P with highest U (in available time)
Challenge:
– Which subsets of G ∪ S ∪ Worlds admit a plan?
– Will we have a plan when time runs out?

SCOPE – Safe, Conformant, Optimizing Planning Engine
Approach: Manipulate the scope of the problem
While (Time > 0)
    select constraints from G ∪ S ∪ Worlds
    FragPlan(constraints) for some time
Balance solving current constraints vs. exploration
Strategy: Start small and grow
– On success, add constraints guided by U(world x goal x safety)
– Anytime
Strategy: Start big, shrink
– Failures reveal difficult constraint combinations
– On failure, remove constraints guided by U, difficulty
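The "start small and grow" strategy can be sketched as an anytime loop. Everything here is an illustrative assumption rather than the SCOPE design: the utility is simplified to one score per constraint instead of a function over (world x goal x safety), `solve` is a hypothetical stand-in for a time-limited FragPlan call, and the time budget is counted in solver calls so that the example is deterministic.

```python
from typing import Callable, Dict, List, Optional, Tuple

Plan = Tuple[str, ...]


def scope_grow(constraints: List[str], utility: Dict[str, float],
               solve: Callable[[List[str]], Optional[Plan]],
               budget: int) -> Tuple[Optional[Plan], List[str]]:
    """Add constraints in order of decreasing utility, keeping the best plan
    found so far so that one is available whenever the budget runs out."""
    chosen: List[str] = []
    best: Optional[Plan] = None
    for c in sorted(constraints, key=lambda name: utility[name], reverse=True):
        if budget <= 0:
            break
        budget -= 1
        plan = solve(chosen + [c])
        if plan is not None:       # success: keep the larger scope
            chosen.append(c)
            best = plan
    return best, chosen


def toy_solve(cs: List[str]) -> Optional[Plan]:
    """Hypothetical solver: goal g2 and safety constraint s1 cannot both hold."""
    if "g2" in cs and "s1" in cs:
        return None
    return tuple(cs)


utility = {"w1": 4.0, "g1": 3.0, "s1": 2.0, "g2": 1.0}
best, chosen = scope_grow(["g2", "s1", "w1", "g1"], utility, toy_solve, budget=10)
```

The "start big, shrink" strategy would run the same loop in the other direction, removing low-utility or difficult constraints after each failure.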