AI Planning
Jan 23, 2016

The planning problem
Inputs:
1. A description of the world state
2. The goal state description
3. A set of actions
Output:
A sequence of actions that, if applied to the initial state, transfers the world to the goal state

An example – Blocks world
Blocks on a table
Can be stacked, but only one block on top of another
A robot arm can pick up a block and move it to another position
– On the table
– On another block
Arm can pick up only one block at a time
– Cannot pick up a block that has another one on it

STRIPS Representation
State is a conjunction of positive ground literals
  On(B, Table) ∧ Clear(A)
Goal is a conjunction of positive ground literals
  Clear(A) ∧ On(A,B) ∧ On(B, Table)
STRIPS Operators
– Conjunction of positive literals as preconditions
– Conjunction of positive and negative literals as effects

More on action schema
Example: Move(b, x, y)
– Precondition: Block(b) ∧ Clear(b) ∧ Clear(y) ∧ On(b,x) ∧ (b ≠ x) ∧ (b ≠ y) ∧ (y ≠ x)
– Effect: ¬Clear(y) ∧ ¬On(b,x) ∧ Clear(x) ∧ On(b,y)
  Delete list: Clear(y), On(b,x) (the negated effects)
  Add list: Clear(x), On(b,y) (the positive effects)
An action is applicable in any state that satisfies its precondition
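
The Move schema above can be sketched in code. This is a minimal illustration, not a reference implementation: it assumes a state is a set of ground literals written as strings, and the names `Action`, `applicable`, `apply_action`, and `move` are hypothetical helpers introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    precond: frozenset  # positive literals that must hold in the state
    add: frozenset      # add list: positive effects
    delete: frozenset   # delete list: literals negated by the effects


def applicable(state, action):
    # An action is applicable in any state that satisfies its precondition.
    return action.precond <= state


def apply_action(state, action):
    if not applicable(state, action):
        raise ValueError(f"{action.name} is not applicable")
    # Literals not mentioned in the effects are left unchanged.
    return (state - action.delete) | action.add


def move(b, x, y):
    # Ground instance of Move(b, x, y); the inequality constraints
    # are checked when the schema is instantiated.
    assert b != x and b != y and x != y
    return Action(
        name=f"Move({b},{x},{y})",
        precond=frozenset({f"Block({b})", f"Clear({b})", f"Clear({y})", f"On({b},{x})"}),
        add=frozenset({f"Clear({x})", f"On({b},{y})"}),
        delete=frozenset({f"Clear({y})", f"On({b},{x})"}),
    )


state = frozenset({"Block(A)", "Block(B)", "On(A,Table)", "On(B,Table)",
                   "Clear(A)", "Clear(B)"})
new_state = apply_action(state, move("A", "Table", "B"))
```

After the move, `new_state` contains On(A,B) and no longer contains Clear(B) or On(A,Table), so the same ground action is no longer applicable.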

STRIPS assumptions
Closed World assumption
– Unmentioned literals are false (no need to list them explicitly)
STRIPS assumption
– Every literal not mentioned in the “effect” of an action remains unchanged
Atomic Time (actions are instantaneous)

STRIPS expressiveness
Literals are function free: Move(Block(x), y, z) is not allowed
Operators can be propositionalized (= actions)
– Move(b,x,y) with 3 blocks and a table can be expressed as 48 purely propositional actions
No disjunctive goals: On(B, Table) ∨ On(B, C) cannot be expressed
No conditional effects: On(B, Table) if ¬On(A, Table) cannot be expressed
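
The figure of 48 can be reproduced by counting substitutions. The sketch below assumes b ranges over the 3 blocks and x and y range over the 3 blocks plus the table, and that instances violating the inequality constraints are still counted (they are simply never applicable). The names are illustrative.

```python
from itertools import product

blocks = ["A", "B", "C"]
objects = blocks + ["Table"]

# One propositional action per substitution for (b, x, y).
ground_moves = [f"Move({b},{x},{y})" for b, x, y in product(blocks, objects, objects)]

print(len(ground_moves))  # 3 * 4 * 4 = 48
```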

Planning algorithms
Planning algorithms are search procedures
Which space to search?
– State-space search
  Each node is a state of the world
  Plan = path through the states
– Plan-space search
  Each node is a set of partially-instantiated operators and a set of constraints
  Plan = node

State search
Search the space of situations, which is connected by operator instances (= actions)
The sequence of actions = plan
We have both preconditions and effects available for each operator, so we can try different searches: Forward vs. Backward

Planning: Search Space
[Figure: state-space graph over the possible arrangements of blocks A, B, and C]

Forward state-space search (1)
Progression
Initial state: initial state of the problem
Actions:
– Applied to a state if all the preconditions are satisfied
– Successor state is built by updating the current state with the add and delete lists
Goal test: state satisfies the goal of the problem

Progression (forward search)
ProgWS(world-state, goal-list, PossibleActions, path)
If world-state satisfies all goals in goal-list,
1. Then return path.
2. Else Act = choose an action whose precondition is true in world-state
   a) If no such action exists
   b) Then fail
   c) Else return ProgWS(result(Act, world-state), goal-list, PossibleActions, concatenate(path, Act))
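
ProgWS is nondeterministic: "choose" stands for trying the alternatives. One deterministic way to realize it is breadth-first search over world states, sketched below. This is an assumption-laden illustration rather than the slides' algorithm: actions are ground tuples (name, precond, add, delete), and the two-block domain with literals such as `ArmEmpty` is made up for the example.

```python
from collections import deque


def forward_search(initial, goals, actions):
    """Breadth-first progression; returns a list of action names or None."""
    start = frozenset(initial)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, path = frontier.popleft()
        if goals <= state:                 # goal test
            return path
        for name, pre, add, delete in actions:
            if pre <= state:               # action is applicable
                successor = (state - delete) | add
                if successor not in visited:
                    visited.add(successor)
                    frontier.append((successor, path + [name]))
    return None                            # every choice failed


# Hypothetical ground actions for a two-block world.
actions = [
    ("Pickup(A)", {"OnTable(A)", "Clear(A)", "ArmEmpty"},
     {"Holding(A)"}, {"OnTable(A)", "ArmEmpty"}),
    ("Pickup(B)", {"OnTable(B)", "Clear(B)", "ArmEmpty"},
     {"Holding(B)"}, {"OnTable(B)", "ArmEmpty"}),
    ("Stack(A,B)", {"Holding(A)", "Clear(B)"},
     {"On(A,B)", "ArmEmpty"}, {"Holding(A)", "Clear(B)"}),
]
initial = {"OnTable(A)", "OnTable(B)", "Clear(A)", "Clear(B)", "ArmEmpty"}

plan = forward_search(initial, {"On(A,B)"}, actions)
print(plan)  # ['Pickup(A)', 'Stack(A,B)']
```

Pickup(B) is applicable in the initial state but irrelevant to the goal, so the search still expands it. That wasted expansion is the inefficiency that heuristics or pruning address.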

Forward search in the Blocks world
[Figure: forward search tree in the Blocks world]

Forward state-space search (2)
Advantages
– No functions in the declarations of goals → search space is finite
– Sound
– Complete (if the algorithm used to do the search is complete)
Limitations
– Irrelevant actions → not efficient
– Need heuristic or pruning procedure

Backward state-space search (1)
Regression
Initial state: goal state of the problem
Actions:
– Choose an action A that
  Is relevant; has one of the goal literals in its effect set
  Is consistent; does not negate another goal literal
– Construct new search state
  Remove all positive effects of A that appear in the goal
  Add all preconditions of A, unless they already appear
Goal test: initial world state contains the remaining goals

Regression (backward search)
RegWS(initial-state, current-goals, PossibleActions, path)
1. If initial-state satisfies all of current-goals
2. Then return path
3. Else Act = choose an action whose effect matches one of current-goals
   a. If no such action exists, or the effects of Act contradict some of current-goals, then fail
   b. G = (current-goals – goals-added-by(Act)) + preconds(Act)
   c. If G contains all of current-goals, then fail
   d. Return RegWS(initial-state, G, PossibleActions, concatenate(Act, path))
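
The regression step (3a to 3c) and a deterministic search around it can be sketched as follows. As with any concrete version of RegWS, the nondeterministic "choose" is replaced here by breadth-first search over goal sets. The tuple layout (name, precond, add, delete) and the two-block domain are assumptions made for the example.

```python
from collections import deque


def regress(goals, action):
    """Regress a goal set through a ground action; None means 'fail'."""
    _, pre, add, delete = action
    if not (add & goals):        # 3a: effect matches none of the goals
        return None
    if delete & goals:           # 3a: effect contradicts some goal
        return None
    new_goals = (goals - add) | pre   # 3b
    if new_goals >= goals:       # 3c: no progress
        return None
    return frozenset(new_goals)


def backward_search(initial, goals, actions):
    start = frozenset(goals)
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        current, path = frontier.popleft()
        if current <= initial:   # initial state satisfies remaining goals
            return path
        for action in actions:
            subgoals = regress(current, action)
            if subgoals is not None and subgoals not in visited:
                visited.add(subgoals)
                frontier.append((subgoals, [action[0]] + path))
    return None


# Hypothetical ground actions for a two-block world.
actions = [
    ("Pickup(A)", {"OnTable(A)", "Clear(A)", "ArmEmpty"},
     {"Holding(A)"}, {"OnTable(A)", "ArmEmpty"}),
    ("Pickup(B)", {"OnTable(B)", "Clear(B)", "ArmEmpty"},
     {"Holding(B)"}, {"OnTable(B)", "ArmEmpty"}),
    ("Stack(A,B)", {"Holding(A)", "Clear(B)"},
     {"On(A,B)", "ArmEmpty"}, {"Holding(A)", "Clear(B)"}),
]
initial = {"OnTable(A)", "OnTable(B)", "Clear(A)", "Clear(B)", "ArmEmpty"}

plan = backward_search(initial, {"On(A,B)"}, actions)
print(plan)  # ['Pickup(A)', 'Stack(A,B)']
```

Unlike the forward version, Pickup(B) is never expanded here, because it adds none of the current goals at any step.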

Backward state-space search (2)
Advantages
– Consider only relevant actions → much smaller branching factor
Limitations
– Still need heuristic to be more efficient

Comparing ProgWS and RegWS
Both algorithms are
– sound (they always return a valid plan)
– complete (if a valid plan exists they will find one)
Running time is O(b^n), where b = branching factor, n = number of “choose” operators

Efficiency of Backward Search
Backward search can also have a very large branching factor
– E.g., many relevant actions that don’t regress toward the initial state
As before, deterministic implementations can waste lots of time trying all of them
[Figure: search tree between the initial state and the goal, with relevant actions a1, a2, a3, …, a50]

Lifting
Can reduce the branching factor of backward search if we partially instantiate the operators
– this is called lifting
Operator foo(x,y)
  precond: p(x,y)
  effects: q(x)
[Figure: to achieve q(a), ground backward search branches over foo(a,a), foo(a,b), foo(a,c), … with subgoals p(a,a), p(a,b), p(a,c), …; lifted search uses the single partially instantiated action foo(a,y) with subgoal p(a,y)]
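
The foo example can be made concrete with a small matching routine. This is a sketch under simplifying assumptions: atoms are tuples, the variable set is given explicitly, and only one-way matching of an effect against a ground goal is implemented (full unification is more general). All names are hypothetical.

```python
def match_atom(pattern, ground, variables):
    """Match an operator effect against a ground goal; return bindings or None."""
    if pattern[0] != ground[0] or len(pattern) != len(ground):
        return None
    binding = {}
    for p, g in zip(pattern[1:], ground[1:]):
        if p in variables:
            # A variable must be bound to the same constant everywhere.
            if binding.setdefault(p, g) != g:
                return None
        elif p != g:
            return None
    return binding


def substitute(atom, binding):
    # Unbound variables stay as they are: this is the "lifted" part.
    return (atom[0],) + tuple(binding.get(t, t) for t in atom[1:])


VARIABLES = {"x", "y"}
foo = {"name": ("foo", "x", "y"),
       "precond": [("p", "x", "y")],
       "effects": [("q", "x")]}

goal = ("q", "a")
binding = match_atom(foo["effects"][0], goal, VARIABLES)
lifted_action = substitute(foo["name"], binding)
subgoals = [substitute(p, binding) for p in foo["precond"]]

print(binding, lifted_action, subgoals)
# {'x': 'a'} ('foo', 'a', 'y') [('p', 'a', 'y')]
```

Only x is bound; y stays a variable, so one search node stands in for foo(a,a), foo(a,b), foo(a,c), and so on.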

The Search Space is Still Too Large
Backward-search generates a smaller search space than Forward-search, but it still can be quite large
– Suppose a, b, and c are independent, d must precede all of them, and d cannot be executed
– We’ll try all possible orderings of a, b, and c before realizing there is no solution
[Figure: backward search tree from the goal through every ordering of a, b, and c, each branch ending at d]
A ground version of the STRIPS algorithm.

Blocks world: STRIPS operators
Pickup(x)
  Pre: on(x, Table), clear(x), ae
  Del: on(x, Table), ae
  Add: holding(x)
Putdown(x)
  Pre: holding(x)
  Del: holding(x)
  Add: on(x, Table), ae
UnStack(x,y)
  Pre: on(x, y), ae
  Del: on(x, y), ae
  Add: holding(x), clear(y)
Stack(x, y)
  Pre: holding(x), clear(y)
  Del: holding(x), clear(y)
  Add: on(x, y), ae

STRIPS Planning
Current state:
– on(A,table), on(C, B), on(B,table), on(D,table), clear(A), clear(C), clear(D), ae.
Goal
– on(A,C), on(D, A)
[Figure: current configuration with A on the table, C on B, and D on the table; goal configuration with D on A and A on C]

STRIPS Planning
Goal: on(A,C), on(D,A)
Current State: on(A,table), on(C, B), on(B,table), on(D,table), clear(A), clear(C), clear(D), ae.
Plan: (empty)
Goalstack (most recently pushed entry last):
  on(A,C)
  Stack(A, C)
  holding(A), clear(C)
  holding(A)
  Pickup(A)
  on(A,Table), clear(A), ae
State after Pickup(A): holding(A), on(C, B), on(B,table), on(D,table), clear(A), clear(C), clear(D).

STRIPS Planning
Goal: on(A,C), on(D,A)
Goalstack:
  on(A,C)
  Stack(A, C)
  holding(A), clear(C)
  holding(A)
  Pickup(A)
Applying Pickup(A)
  Pre: on(A,Table), clear(A), ae
  Del: on(A, Table), ae
  Add: holding(A)

STRIPS Planning
Goal: on(A,C), on(D,A)
Current State: holding(A), on(C, B), on(B,table), on(D,table), clear(A), clear(C), clear(D).
Plan: Pickup(A)
Goalstack:
  on(A,C)
  Stack(A, C)
Applying Stack(A, C)
  Pre: holding(A), clear(C)
  Del: holding(A), clear(C)
  Add: on(A, C), ae
State after Stack(A, C): on(A,C), on(C, B), on(B,table), on(D,table), clear(A), clear(D), ae.

STRIPS Planning
Goal: on(A,C), on(D,A)
Current State: on(A,C), on(C, B), on(B,table), on(D,table), clear(A), clear(D), ae.
Plan: Pickup(A), Stack(A, C)
Goalstack:
  on(D, A)
  Stack(D,A)
  holding(D), clear(A)
  holding(D)
  Pickup(D)
  on(D,Table), clear(D), ae
State after Pickup(D): on(A,C), on(C, B), on(B,table), holding(D), clear(A), clear(D)

STRIPS Planning
Goal: on(A,C), on(D,A)
Current State: on(A,C), on(C, B), on(B,table), holding(D), clear(A), clear(D)
Plan: Pickup(A), Stack(A, C)
Goalstack:
  on(D, A)
  Stack(D,A)
  holding(D), clear(A)
  holding(D)
  Pickup(D)
State after Stack(D,A): on(A,C), on(C, B), on(B,table), on(D,A), clear(D), ae

STRIPS Planning: Getting it Wrong!
Goal: on(A,C), on(D,A)
Current State: on(A,table), on(C, B), on(B,table), on(D,table), clear(A), clear(C), clear(D), ae.
Plan: (empty)
Goalstack:
  on(D,A)
  Stack(D, A)
  holding(D), clear(A)
  holding(D)
  Pickup(D)
  on(D,Table), clear(D), ae
State after Pickup(D): on(A,table), on(C, B), on(B,table), holding(D), clear(A), clear(C), clear(D)

STRIPS Planning: Getting it Wrong!
Goal: on(A,C), on(D,A)
Current State: on(A,table), on(C, B), on(B,table), holding(D), clear(A), clear(C), clear(D)
Plan: Pickup(D)
Goalstack:
  on(D,A)
  Stack(D, A)
State after Stack(D, A): on(A,table), on(C, B), on(B,table), on(D,A), clear(C), clear(D), ae.

STRIPS Planning: Getting it Wrong!
Goal: on(A,C), on(D,A)
Current State: on(A,table), on(C, B), on(B,table), on(D,A), clear(C), clear(D), ae.
Plan: Pickup(D), Stack(D, A)
Now What?
– We chose the wrong goal first
– A is no longer clear.
– Stacking D on A messes up the preconditions for actions to accomplish on(A, C)
– Either have to backtrack, or else we must undo the previous actions
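
Both goal orderings can be replayed mechanically. The sketch below encodes Pickup and Stack with the Pre/Del/Add lists given earlier and treats `ae` as an opaque literal (it presumably abbreviates "arm empty", which is an assumption). The helper `run` and the string encoding of literals are illustrative only.

```python
def pickup(x):
    return {"pre": {f"on({x},table)", f"clear({x})", "ae"},
            "del": {f"on({x},table)", "ae"},
            "add": {f"holding({x})"}}


def stack(x, y):
    return {"pre": {f"holding({x})", f"clear({y})"},
            "del": {f"holding({x})", f"clear({y})"},
            "add": {f"on({x},{y})", "ae"}}


def run(state, plan):
    """Apply actions in order; return (state, index of first inapplicable action or None)."""
    state = set(state)
    for i, action in enumerate(plan):
        if not action["pre"] <= state:
            return state, i
        state = (state - action["del"]) | action["add"]
    return state, None


initial = {"on(A,table)", "on(C,B)", "on(B,table)", "on(D,table)",
           "clear(A)", "clear(C)", "clear(D)", "ae"}
goal = {"on(A,C)", "on(D,A)"}

# Achieving on(A,C) first works.
good_state, good_failed_at = run(
    initial, [pickup("A"), stack("A", "C"), pickup("D"), stack("D", "A")])

# Achieving on(D,A) first blocks Pickup(A): clear(A) has been deleted.
bad_state, bad_failed_at = run(
    initial, [pickup("D"), stack("D", "A"), pickup("A")])

print(goal <= good_state, good_failed_at)  # True None
print(bad_failed_at)                       # 2
```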

Limitation of state-space search
Linear planning or Total order planning
Example
– Initial state: all the blocks are clear and on the table
– Goal: On(A,B) ∧ On(B,C)
– If search achieves On(A,B) first, then it needs to undo it in order to achieve On(B,C)
Have to go through all the possible permutations of the subgoals

Search through the space of plans
Nodes are partial plans, links are plan refinement operations, and a solution is a node (not a path).
POP creates partial-order plans following a “least commitment” principle.

[Figure: socks-and-shoes example. Partial Order Plan: Start precedes Left Sock and Right Sock; Left Sock establishes "Left Sock on" for Left Shoe and Right Sock establishes "Right Sock on" for Right Shoe; the two shoes establish "Left Shoe on" and "Right Shoe on" for Finish. Total Order Plans: the six linearizations of this partial order, each running from Start to Finish.]
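
The relationship between the partial-order plan and its total-order plans in the socks-and-shoes figure can be checked by brute force. The sketch assumes the only ordering constraints among the four actions are "sock before shoe" on each foot; the action names are spelled without spaces for convenience.

```python
from itertools import permutations

actions = ["LeftSock", "LeftShoe", "RightSock", "RightShoe"]
orderings = [("LeftSock", "LeftShoe"), ("RightSock", "RightShoe")]


def linearizations(actions, orderings):
    """All total orders of the actions that respect every ordering constraint."""
    result = []
    for perm in permutations(actions):
        position = {a: i for i, a in enumerate(perm)}
        if all(position[a] < position[b] for a, b in orderings):
            result.append(["Start", *perm, "Finish"])
    return result


plans = linearizations(actions, orderings)
print(len(plans))  # 6
```

A state-space planner would treat these six sequences as distinct candidates, while the partial-order plan represents all of them at once.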

P.O. plans in POP
Plan = (A, O, L), where
– A is the set of actions in the plan
– O is a set of temporal orderings between actions
– L is a set of causal links linking actions via a literal
Causal link Ap --Q--> Ac means that Ac has precondition Q that is established in the plan by Ap.
Example: move-a-from-b-to-table --(clear b)--> move-c-from-d-to-b

Threats to causal links
Step At threatens link Ap --Q--> Ac if:
1. At has (~Q) as an effect
2. At could come between Ap and Ac, i.e., O is consistent with Ap < At < Ac

Threat Removal
Threats must be removed to prevent a plan from failing
Demotion adds the constraint At < Ap to prevent clobbering, i.e. push the clobberer before the producer
Promotion adds the constraint Ac < At to prevent clobbering, i.e. push the clobberer after the consumer
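
Threat detection and the two repairs can be sketched over a set of ordering pairs. The representation is an assumption made for the example: O is a set of (before, after) pairs, a link is a (producer, Q, consumer) triple, and a negated effect is written with a leading "~". The function names are hypothetical.

```python
def precedes(orderings, a, b):
    """True if a < b follows from the ordering constraints by transitivity."""
    pending, seen = [a], set()
    while pending:
        node = pending.pop()
        for u, v in orderings:
            if u == node and v not in seen:
                if v == b:
                    return True
                seen.add(v)
                pending.append(v)
    return False


def consistent(orderings):
    # Inconsistent exactly when some constraint u < v also implies v < u (a cycle).
    return not any(precedes(orderings, v, u) for u, v in orderings)


def threatens(step, effects, link, orderings):
    producer, q, consumer = link
    if step in (producer, consumer) or "~" + q not in effects[step]:
        return False
    # The step can still fall between producer and consumer
    # unless O already forces it before Ap or after Ac.
    return (not precedes(orderings, step, producer)
            and not precedes(orderings, consumer, step))


def resolve(step, link, orderings):
    """Consistent ordering sets obtained by demotion and by promotion."""
    producer, _, consumer = link
    options = []
    for constraint in [(step, producer), (consumer, step)]:  # demotion, promotion
        candidate = orderings | {constraint}
        if consistent(candidate):
            options.append(candidate)
    return options


orderings = {("Start", "Ap"), ("Ap", "Ac"), ("Start", "At"),
             ("Ac", "Finish"), ("At", "Finish")}
effects = {"Start": set(), "Ap": {"Q"}, "Ac": set(), "At": {"~Q"}, "Finish": set()}
link = ("Ap", "Q", "Ac")

is_threat = threatens("At", effects, link, orderings)
options = resolve("At", link, orderings)
print(is_threat, len(options))  # True 2
```

If neither candidate is consistent, `resolve` returns an empty list, which corresponds to the planner failing on that branch.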

Initial (Null) Plan
Initial plan has
– A = {A0, A∞}
– O = {A0 < A∞}
– L = {}
A0 (Start) has no preconditions but all facts in the initial state as effects.
A∞ (Finish) has the goal conditions as preconditions and no effects.

POP algorithm
POP((A, O, L), agenda, PossibleActions):
1. If agenda is empty, return (A, O, L)
2. Pick (Q, An) from agenda
3. Ad = choose an action that adds Q.
   a. If no such action exists, fail.
   b. Add the link Ad --Q--> An to L and the ordering Ad < An to O
   c. If Ad is new, add it to A.
4. Remove (Q, An) from agenda. If Ad is new, for each of its preconditions P add (P, Ad) to agenda.
5. For every action At that threatens any link Ap --Q--> Ac
   1. Choose to add At < Ap or Ac < At to O.
   2. If neither choice is consistent, fail.
6. POP((A, O, L), agenda, PossibleActions)

Analysis
POP can be much faster than the state-space planners because it doesn’t need to backtrack over goal orderings (so less branching is required).
Although it is more expensive per node, and makes more choices than RegWS, the reduction in branching factor makes it faster, i.e., n is larger but b is smaller!

More analysis
Does POP make the least possible amount of commitment?
Lifted POP: using operators instead of ground actions
– Unification is required

POP in the Blocks world
PutOn(x,y)
  Preconditions: Cl(x), Cl(y), On(x,z)
  Effects: On(x,y), Cl(x), ~Cl(y), ~On(x,z)
PutOnTable(x)
  Preconditions: On(x, z), Cl(x)
  Effects: On(x,Table), Cl(x), ~On(x,z)

Example 2
Buy(y,x)
  Preconditions: At(x), Sells(x,y)
  Effects: Have(y)
GO(x,y)
  Preconditions: At(x)
  Effects: At(y), ~At(x)
A0: Start
– At(Home), Sells(SM,Banana), Sells(SM,Milk), Sells(HWS,Drill)
A∞: Finish
– Have(Drill), Have(Milk), Have(Banana), At(Home)
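
POP Example
[Figure: partial plan for the shopping example. start has effects Sells(SM, M), Sells(SM,B), At(H); finish has preconditions Have(M), Have(B); Buy(M,x1) has preconditions At(x1), Sells(x1,M) with binding x1 = SM; Buy(B,x2) has preconditions At(x2), Sells(x2,B) with binding x2 = SM; GO(x3,SM) has precondition At(x3) and effect At(SM), with binding x3 = H.]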