Notes for CS3310 Artificial Intelligence Part 9: Planning


Prof. Neil C. Rowe, Naval Postgraduate School

Version of January 2006

Means-ends analysis

A search method of top-down recursive decomposition of a (preferably decomposable) search problem into simpler subproblems.
• Inputs: (a) a starting state; (b) goal conditions (facts that must appear in the goal state).
• Requires a "difference table" showing the recommended operator for any search problem. The operator will not necessarily apply immediately, but should be the most important operator necessary to solve its search problem.
• Then decompose the search problem into two subproblems:
– (1) Search to achieve the preconditions of the recommended operator, if necessary. (Starting state is the same, but goal is different.)
– (2) Search to achieve the original goal after applying the recommended operator, if necessary. (Goals are the same, but starting state is different.)

Difference table for travel

This recommends an operator for the difference between the current goal conditions and the current state.

Preconditions of the operators:
Use plane: you are at airport, you have plane ticket.
Use train: you are at train station, you have a train ticket.
Use your car: you are at your car, your car is < 1 mile away.
Use taxi: you are on a street.
Walk: no preconditions.

Distance to go (mi.)   Use airplane?   Use train?   Use your car?   Use taxi?   Walk?
> 1000                 yes             yes          no              no          no
≤ 1000, > 100          no              yes          yes             no          no
≤ 100, > 1             no              no           yes             yes         no
≤ 1                    no              no           no              no          yes

Example decomposition tree for travel problem

• Goal: visit your aunt in Pasadena, Calif.
• Start state: you are in Spanagel Hall; your car is parked 0.8 miles away; you have a train ticket to Los Angeles; you have money for a taxi.

use train (Salinas to LA)
  pre: use your car (Monterey to Salinas)
    pre: walk (Spanagel to parking lot)
    post: walk (Salinas parking to Salinas train station)
  post: use taxi (LA to Pasadena)
    pre: walk (LA train to LA street)
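The travel difference table can be encoded as a simple lookup from distance to recommended operators. The following is a minimal Python sketch; the names `TRAVEL_TABLE` and `recommend` and the string labels are illustrative assumptions, not part of the original notes.

```python
# Rows of the travel difference table, from longest to shortest distance.
# Each row is (lower bound in miles, exclusive; operators marked "yes").
# The data layout and names are illustrative assumptions.
TRAVEL_TABLE = [
    (1000, ["use airplane", "use train"]),
    (100, ["use train", "use your car"]),
    (1, ["use your car", "use taxi"]),
    (0, ["walk"]),
]


def recommend(distance_miles):
    """Return the operators the table recommends for this distance to go."""
    for lower_bound, operators in TRAVEL_TABLE:
        if distance_miles > lower_bound:
            return operators
    return []  # zero distance: nothing left to do


for d in (2500, 300, 18, 0.8):
    print(d, recommend(d))
```

In the Pasadena example, the 0.8-mile distance to the car falls in the last row, which is why the leaves of the tree are all walks.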

Building a problem-decomposition tree

• Subtrees below and to the left represent work needed to achieve the preconditions of the operator above.
• The starting state for the root of a left subtree is the same as the starting state for the root of the tree.
• The goal conditions for the root of the left subtree are the preconditions of the operator at the root of the tree.
• Subtrees below and to the right represent further work needed after the operator at the root of the tree is applied.
• The goal conditions for the root of the right subtree are the same as the goal conditions for the root of the tree.
• The starting state for the root of the right subtree is the state after the application of the operator at the root of the tree.

The means-ends algorithm

Given a task to achieve a list of goals G from starting state S, the algorithm computes meansends(S,G) and returns OL (an operator list) and FS (the final state).
• Let the "difference" D be the list of all unnegated facts in G that are not in S and all negated facts in G that are in S.
• If D is the empty list, return OL=[] and FS=S.
• Otherwise, find the first recommended operator OM for this difference, the first operator in the list of recommendations that matches something in D.
• Find the preconditions of OM, call them PL.
• Do meansends(S,PL), get back OL1 and FS1.
• Delete the deletepostconditions of OM from FS1, and add all addpostconditions not already in FS1. Call the new state FS2.
• Do meansends(FS2,G), get back OL3 and FS3.
• Append OL1, the list consisting of OM, and OL3 to get OL.
• Return this OL and return FS=FS3.
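The steps of the algorithm can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the course's implementation: facts are plain strings, a negated fact is written with a "not " prefix, and the tiny two-operator domain (`walk_to_car`, `drive_to_station`) is hypothetical and exists only to exercise the recursion.

```python
def difference(state, goals):
    """Unnegated goal facts missing from the state, plus negated goal
    facts ("not f") whose fact f is present in the state."""
    diff = []
    for g in goals:
        if g.startswith("not "):
            if g[4:] in state:
                diff.append(g)
        elif g not in state:
            diff.append(g)
    return diff


def meansends(state, goals, recommended, operators):
    """Return (operator list, final state) for reaching goals from state."""
    d = difference(state, goals)
    if not d:
        return [], list(state)
    # First recommendation whose difference list is contained in D.
    for op, difflist in recommended:
        if all(f in d for f in difflist):
            break
    else:
        raise ValueError(f"no operator recommended for {d}")
    pre, delete, add = operators[op]
    # Subproblem 1: achieve the preconditions from the same start state.
    ol1, fs1 = meansends(state, pre, recommended, operators)
    # Apply the operator: remove deleted facts, then add new ones.
    kept = [f for f in fs1 if f not in delete]
    fs2 = kept + [f for f in add if f not in kept]
    # Subproblem 2: achieve the original goals from the new state.
    ol3, fs3 = meansends(fs2, goals, recommended, operators)
    return ol1 + [op] + ol3, fs3


# Hypothetical toy domain: (operator, difference list) pairs in priority
# order, and operator -> (preconditions, delete list, add list).
RECOMMENDED = [
    ("drive_to_station", ["at(station)"]),
    ("walk_to_car", ["at(car)"]),
]
OPERATORS = {
    "walk_to_car": ([], ["at(office)"], ["at(car)"]),
    "drive_to_station": (["at(car)"], ["at(car)"], ["at(station)"]),
}

plan, final = meansends(["at(office)"], ["at(station)"], RECOMMENDED, OPERATORS)
print(plan)
print(final)
```

The sketch has no guard against a difference table that keeps recommending an operator whose effects never shrink the difference; such a table would make the recursion run until Python's recursion limit is hit.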

Implementing a means-ends problem

You must define each operator with four kinds of facts:
1. recommended: Gives a list of operator-differencelist pairs, where the operator is recommended whenever those facts in the differencelist appear in a goal but not in the current state.
2. precondition: Gives the facts necessary before an operator can be applied.
3. deletepostcondition: Gives the facts that become false after the operator is applied.
4. addpostcondition: Gives the facts that become true after the operator is applied.

To start analysis, you supply two things: a starting state and a list of goal conditions you want to become true.

Analysis returns two results: a list of operators to achieve your goal conditions, and the final state reached after applying them.

Formal definition of the flashlight problem

Example starting state:
[closed(case), closed(top), inside(batteries), defective(batteries), ok(light), unbroken(case)]

Goal conditions:
[ok(batteries), ok(light), closed(case), closed(top)]

Difference table:
For [ok(batteries)] do replace_batteries.
For [ok(light)] do replace_light.
For [open(case)] do disassemble_case.
For [open(case)] do smash_case.
For [open(top)] do disassemble_top.
For [open(top)] do smash_case.
For [closed(case)] do assemble_case.
For [closed(top)] do assemble_top.
For [outside(batteries)] do turn_over_case.
For [outside(batteries)] do smash_case.

Preconditions for the flashlight problem

For replace_batteries: [open(case),outside(batteries),unbroken(case)]
For replace_light: [open(top)]
For disassemble_case: [closed(case)]
For assemble_case: [open(case),closed(top),unbroken(case)]
For disassemble_top: [open(case),closed(top)]
For assemble_top: [open(top)]
For turn_over_case: [open(case)]
For smash_case: []

Postconditions for the flashlight problem

For replace_batteries: Delete [outside(batteries),defective(batteries)] and add [inside(batteries),ok(batteries)]
For replace_light: Delete [defective(light)] and add [ok(light)]
For disassemble_case: Delete [closed(case)] and add [open(case)]
For assemble_case: Delete [open(case)] and add [closed(case)]
For disassemble_top: Delete [closed(top)] and add [open(top)]
For assemble_top: Delete [open(top)] and add [closed(top)]
For turn_over_case: Delete [inside(batteries)] and add [outside(batteries)]
For smash_case: Delete [unbroken(case),closed(case),closed(top),inside(batteries)] and add [broken(case),open(case),open(top),outside(batteries)]

Flashlight example problem

Start state: [closed(case), closed(top), inside(batteries), defective(batteries), ok(light), unbroken(case)]
Goal conditions: [ok(batteries), ok(light), closed(case), closed(top)]

State after disassemble_case: [open(case), closed(top), inside(batteries), defective(batteries), ok(light), unbroken(case)]

State after turn_over_case: [outside(batteries), open(case), closed(top), defective(batteries), ok(light), unbroken(case)]

State after replace_batteries: [ok(batteries), inside(batteries), open(case), closed(top), ok(light), unbroken(case)]

State after assemble_case: [closed(case), ok(batteries), inside(batteries), closed(top), ok(light), unbroken(case)]

Decomposition tree:

replace_batteries
  pre: disassemble_case
    post: turn_over_case
  post: assemble_case
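The state trace for the flashlight example can be checked mechanically by applying each operator's delete and add lists in order. The sketch below is illustrative only: it copies the four operators used in this plan from the flashlight precondition and postcondition tables, represents states as Python sets of fact strings, and uses the hypothetical helper name `apply_operator`.

```python
# operator -> (preconditions, delete list, add list), copied from the
# flashlight tables for the four operators that appear in the plan.
OPERATORS = {
    "disassemble_case": (
        ["closed(case)"], ["closed(case)"], ["open(case)"]),
    "turn_over_case": (
        ["open(case)"], ["inside(batteries)"], ["outside(batteries)"]),
    "replace_batteries": (
        ["open(case)", "outside(batteries)", "unbroken(case)"],
        ["outside(batteries)", "defective(batteries)"],
        ["inside(batteries)", "ok(batteries)"]),
    "assemble_case": (
        ["open(case)", "closed(top)", "unbroken(case)"],
        ["open(case)"], ["closed(case)"]),
}


def apply_operator(state, name):
    """Check the preconditions, then return the successor state."""
    pre, delete, add = OPERATORS[name]
    missing = [f for f in pre if f not in state]
    if missing:
        raise ValueError(f"{name} needs {missing}")
    return (state - set(delete)) | set(add)


state = {"closed(case)", "closed(top)", "inside(batteries)",
         "defective(batteries)", "ok(light)", "unbroken(case)"}
goals = ["ok(batteries)", "ok(light)", "closed(case)", "closed(top)"]
plan = ["disassemble_case", "turn_over_case",
        "replace_batteries", "assemble_case"]

for op in plan:
    state = apply_operator(state, op)
    print(op, sorted(state))

print("goals reached:", all(g in state for g in goals))
```

Applying the operators in any order that violates a precondition (for example, `replace_batteries` first) raises an error, which is the situation the left ("pre") subtrees exist to prevent.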

Flashlight exercise

• Start state: [closed(case), closed(top), inside(batteries), defective(batteries), defective(light), unbroken(case)]
• Goal conditions: [ok(batteries), ok(light), closed(case), closed(top)]
• Draw the problem-decomposition tree for means-ends analysis.

For contrast: The flashlight search graph

[Figure: state graph for the flashlight problem. Nodes are states written as fact lists, such as [c,t,i,l] and [c,b,t,i,l]; edges are labeled smash_case, turn_over_case, replace_batteries, (dis)assemble_case, and (dis)assemble_top.]

Key: c=closed(case), t=closed(top), i=inside(batteries), b=ok(batteries), l=ok(light), k=broken(case)

Agents for modeling shipboard firefighting

• Command Center (CC): Monitors the sensors and alarms and gives orders to the fire scene leader
• Scene leader (SL): Manages the fire team
• Electrician (E): Handles electrical tasks
• Nozzleman (N): Directs the fire hose when extinguishing, plus other tasks
• Hoseman (H): Helps with the hose, plus other tasks
• Fire (F): Grows and wanes based on conditions, countermeasures, and random variation

Firefighting plan tree for deterministic actions

[Figure: plan tree whose nodes are the firefighting actions listed under "Order of actions in firefighting tree", each colored by the agent that performs it.]

Note: precondition handling is below and to the left; postcondition handling is below and to the right.

Agents shown, by color: Command center, purple; Scene leader, black; Electrician, gray-green; Nozzleman, bright blue; Hoseman, gray-blue.

Order of actions in firefighting tree

1 = order debriefed (CC->SL)
2 = equip team (SL)
3 = go to fire scene (SL)
4 = order verified out (SL->H)
5 = order deenergized (SL->E)
6 = deenergize (E)
7 = report deenergized (E->SL)
8 = order fire report (CC->SL)
9 = report fire status (SL->CC)
10 = record fire status (CC)
11 = set boundaries (N)
12 = order hose tended (N->H)
13 = tend hose (H)
14 = report hose tended (H->SL)
15 = approach fire (N)
16 = extinguish fire (N)
17 = verify fire out (H)
18 = report verified out (H->SL)
19 = order no smoke (SL->E)
20 = desmoke (E)
21 = report no smoke (E->SL)
22 = estimate water (SL)
23 = order no water (SL->N)
24 = dewater (N)
25 = report no water (N->SL)
26 = order watched reflash (SL->N)
27 = test gases (H)
28 = test oxygen tester (H)
29 = test oxygen (H)
30 = watch reflash (N)
31 = report watched reflash (N->SL)
32 = order safe gases (SL->H)
33 = report safe gases (H->SL)
34 = order safe oxygen (SL->H)
35 = report safe oxygen (H->SL)
36 = go to repair locker (SL)
37 = store equipment (SL)
38 = debrief (SL)
39 = report debriefed (SL->CC)
40 = record debriefed (CC)

Where do goals come from?

A more fundamental problem than the one means-ends analysis addresses is how to come up with goals to act on.
• Usually, you can model people and organizations as having a set of "drives" or "motivations" that they want to satisfy.
• For instance, for people: (1) food (2) water (3) clothing (4) shelter (5) sex (6) sleep (7) health (8) security. These increase and decrease according to various formulas.
• One approach to motivations: Take the most urgent need at the moment and act to satisfy it. So if you're dying of thirst, forget about sex.
• Another approach ("potential field"): Weight all the motivations and take a weighted average of them to decide what to do. So for computer games, travel in the direction indicated by the weighted average of all desires.
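The two approaches to motivations can be contrasted in a few lines of Python. This is a sketch under stated assumptions: the urgency numbers, the 2-D direction vectors, and the function names `most_urgent` and `potential_field` are all hypothetical choices made for illustration.

```python
def most_urgent(urgencies):
    """Approach 1: act on the single most urgent drive."""
    return max(urgencies, key=urgencies.get)


def potential_field(urgencies, directions):
    """Approach 2: average the directions toward each drive's target,
    weighted by that drive's urgency."""
    total = sum(urgencies.values())
    if total == 0:
        return (0.0, 0.0)  # no drive is active, so stay put
    x = sum(u * directions[name][0] for name, u in urgencies.items()) / total
    y = sum(u * directions[name][1] for name, u in urgencies.items()) / total
    return (x, y)


# Hypothetical agent: current urgency of each drive, and the unit
# direction in which each drive's nearest target lies.
urgencies = {"water": 0.9, "food": 0.5, "sleep": 0.1}
directions = {"water": (1.0, 0.0), "food": (0.0, 1.0), "sleep": (-1.0, 0.0)}

print(most_urgent(urgencies))
print(potential_field(urgencies, directions))
```

With these numbers the first approach heads straight for water, while the second moves mostly toward water but is also pulled toward food.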

Analyzing the difficulty of a search problem

• Branching factor method: Suppose the average state has B new successors. Then there are B states at level 1, B*B at level 2, B*B*B at level 3, and so on. If you know the depth of the goal, you can estimate the number of states you must visit during search.
• In the housekeeping problem, B is around 2 and the solution is at depth 16, hence the upper bound is about $2^{16} = 65{,}536$ states (roughly 64K) at the deepest level.
• Problems: This ignores duplicates, so it is only an upper bound. And the number of new states can be unpredictable.
• State combinatorics method: Suppose there are K separate facts that can appear in states. Then there are $2^K$ possible subsets of those facts, each of which is a state. If a fact can take M different argument values, there are M+1 possibilities involving that fact (absent, or present with one of the M values).
• In the housekeeping problem, there are eight facts possible for dusty, vacuumable, trashy, and holdingbasketfrom, plus 3 places the robot can be. Hence 2*2*2*2*2*2*2*2*3 = 768 states total.
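Both estimates are simple arithmetic, and a short Python sketch makes the housekeeping numbers easy to reproduce. The function names are illustrative assumptions; the inputs (B = 2, depth 16, eight two-valued facts, three robot locations) come from the bullets above.

```python
from math import prod


def states_at_depth(b, depth):
    """Branching factor method: number of states at one level."""
    return b ** depth


def states_up_to_depth(b, depth):
    """Branching factor method: B + B^2 + ... + B^depth."""
    return sum(b ** level for level in range(1, depth + 1))


def state_space_size(possibilities_per_fact):
    """State combinatorics method: multiply the number of
    possibilities contributed by each independent fact."""
    return prod(possibilities_per_fact)


print(states_at_depth(2, 16))          # states at the solution depth
print(states_up_to_depth(2, 16))       # states summed over all levels
print(state_space_size([2] * 8 + [3])) # eight yes/no facts, 3 locations
```

The gap between the two estimates (tens of thousands versus 768) shows how much of the branching-factor bound comes from counting duplicate states.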

Modern search method 1: Bidirectional search

• Search forward (start to goal) and backward (goal to start) simultaneously. Stop when you meet in the middle.
• Do either concurrently or by time-sharing.
• Bidirectional search can be fast when the branching factors are equal in both directions because, for branching factor B and solution depth k:

$$B^{k/2} + B^{k/2} = 2B^{k/2} < B^{k} \quad \text{for } B^{k/2} > 2$$

• If the branching factors are different in the two directions, bidirectional search may still help, but maybe not so much.
• Main disadvantage of bidirectional search: You can't reason backward in many problems. For instance, in scheduling, if you knew the goal state there would be no reason to search. Also, the searches may not meet in the middle.
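A minimal sketch of bidirectional search in Python follows. It assumes an undirected graph given as an adjacency dictionary, so that searching backward from the goal uses the same edges; it time-shares by expanding one node from each side in turn. The graph and the function name are hypothetical, and the sketch returns a path where the two searches meet without claiming it is always the shortest one.

```python
from collections import deque


def bidirectional_search(graph, start, goal):
    """Breadth-first search from both ends; return a path or None."""
    if start == goal:
        return [start]
    # Each parent map also serves as that direction's visited set.
    parents_f, parents_b = {start: None}, {goal: None}
    frontier_f, frontier_b = deque([start]), deque([goal])

    def expand(frontier, parents, other_parents):
        """Expand one node; return a meeting node if the searches touch."""
        node = frontier.popleft()
        for nxt in graph[node]:
            if nxt not in parents:
                parents[nxt] = node
                if nxt in other_parents:
                    return nxt
                frontier.append(nxt)
        return None

    while frontier_f and frontier_b:
        meet = expand(frontier_f, parents_f, parents_b)
        if meet is None:
            meet = expand(frontier_b, parents_b, parents_f)
        if meet is not None:
            # Stitch start..meet and meet..goal together.
            path, n = [], meet
            while n is not None:
                path.append(n)
                n = parents_f[n]
            path.reverse()
            n = parents_b[meet]
            while n is not None:
                path.append(n)
                n = parents_b[n]
            return path
    return None


# Hypothetical undirected graph.
GRAPH = {
    "A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
    "D": ["B", "C", "E"], "E": ["D", "F"], "F": ["E"],
}
path = bidirectional_search(GRAPH, "A", "F")
print(path)
```

The requirement that every edge be usable in reverse is exactly the limitation noted above: many planning problems have operators that cannot be run backward.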

Getting "stuck in a rut" in route planning

[Figure: route-planning map showing routes from Start to Goal.]

The middle route looks best at first to A* search, but later gets into trouble. It would be better to explore several different directions from the starting state simultaneously.

Modern search method 2: Simulated annealing

• Like A* search, but pick a state semi-randomly.
• One way: if b is the cost of the best state on the agenda, choose a state with cost c with relative likelihood

$$\frac{1}{1 + k^{2}(c-b)^{2}}$$

(Divide by the sum of all likelihoods to get a probability.)
• This means that the larger a cost is than the cost of the best state, the less likely we are to pick it next. Factor k controls how sensitive the choice is to the difference in costs.
• "Annealing" means that k increases with time, so that search becomes more and more narrowly focused with time. The term originally applied to cooling of molten metal.
• Example: scheduling of classes in a department.
• Advantages: Easily parallelizable; and we are less likely to get "stuck in a rut", that is, in an unpromising line of exploration chosen just because its costs are a little better than those of some other path.
• Disadvantage: Lots of time needed.
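The semi-random choice can be sketched in Python as follows, assuming the relative likelihood 1/(1 + k^2 (c - b)^2) for a state of cost c when the best cost on the agenda is b. The agenda contents and the function names are hypothetical; `random.choices` from the standard library does the weighted draw.

```python
import random


def likelihood(c, b, k):
    """Relative likelihood of picking a state with cost c when the
    best cost on the agenda is b; k sets the sensitivity."""
    return 1.0 / (1.0 + (k * (c - b)) ** 2)


def probabilities(agenda, k):
    """Normalize the likelihoods of all agenda items to sum to 1."""
    b = min(cost for _, cost in agenda)
    weights = [likelihood(cost, b, k) for _, cost in agenda]
    total = sum(weights)
    return [w / total for w in weights]


def pick_state(agenda, k, rng=random):
    """Pick one (state, cost) pair semi-randomly."""
    return rng.choices(agenda, weights=probabilities(agenda, k), k=1)[0]


# Hypothetical agenda of (state, cost) pairs.
agenda = [("A", 10.0), ("B", 11.0), ("C", 14.0)]

# As k grows ("annealing"), the choice concentrates on the best state.
for k in (0.5, 5.0):
    print(k, [round(p, 3) for p in probabilities(agenda, k)])

print(pick_state(agenda, k=0.5, rng=random.Random(0)))
```

Early in the search (small k) state B is picked almost as often as A; late in the search (large k) the method behaves nearly like always taking the best state.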

Modern search method 3: Genetic algorithms

• Like best-first search, but don't always pick the best state on the agenda. Instead, sometimes "mutate" a state and sometimes "cross" two good states on the agenda to get a hypothetical new state. If it's reasonable, add it to the agenda.
• "Cross" is an analogy to genetics of organisms. You pick some facts from one state and some from the other to get the new state. Often you get something crazy, but keep trying.
• Example: In scheduling classes in a department, combine the two best partial schedules by taking Monday and Tuesday from the first schedule and Wednesday-Friday from the second schedule, resolving conflicts by eliminating conflicting classes. Put this on the agenda.
• Advantages: Easily parallelizable; and a search process can "evolve" like organisms, getting better as it proceeds.
• Disadvantage: No good theory behind it, so it is hard to predict how long it takes to solve a problem.
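The class-scheduling "cross" can be sketched in Python. Everything concrete here is an assumption made for illustration: a schedule is a dictionary from day to a list of class names, each class is assumed to meet once a week, and a "conflict" is taken to mean that a class would be scheduled twice in the child, in which case the later occurrence is eliminated.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]


def cross(first, second, split=2):
    """Take the first `split` days from `first` and the remaining days
    from `second`. Assumed conflict rule: a class that has already
    been placed earlier in the week is not placed again."""
    child, placed = {}, set()
    for i, day in enumerate(DAYS):
        source = first if i < split else second
        child[day] = [c for c in source.get(day, []) if c not in placed]
        placed.update(child[day])
    return child


# Two hypothetical partial schedules from the agenda.
first = {"Mon": ["AI", "Databases"], "Tue": ["Networks"],
         "Wed": ["Graphics"], "Thu": [], "Fri": []}
second = {"Mon": ["Graphics"], "Tue": [], "Wed": ["AI"],
          "Thu": ["Networks", "Compilers"], "Fri": ["Databases"]}

child = cross(first, second)
for day in DAYS:
    print(day, child[day])
```

In this run the child loses Graphics entirely, since neither half contributed it. That is the kind of "crazy" result the slide warns about, and it is why a crossed state should be checked for reasonableness before it goes on the agenda.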

Game (adversary) search

• Some search problems involve an opponent who alternates moves with you, as in chess or some kinds of military planning. This is "game search".
• You can't control the opponent, but you figure they will do what is best for them. So assume you both do a best-first search with a single evaluation function: you try to minimize it, while the opponent tries to maximize it ("minimax search").
• It helps if you plan several moves ahead. Assume the opponent always picks the maximum branch and you always pick the minimum. That way you can exclude many lines of play.
• Example evaluation function, for chess: total piece value for opponent + mobility of opponent - total piece value for you - mobility of you. (Piece values: queen=8, rook=4, bishop=3, knight=3, pawn=1.)

Example of game search

Assume this search graph, with evaluation function values shown for the leaves at level 3; the maximizing player moves first.

[Figure: game tree with levels max, min, max from the top; leaf values, left to right: 2 3 5 9 0 7 4 1 5 6.]
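Minimax backing-up of leaf values can be sketched in a few lines of Python. The sketch reuses the ten leaf values from the example, but the way they are grouped under the max and min nodes is an assumption made for illustration and may differ from the original figure.

```python
def minimax(node, maximizing):
    """Back up values from the leaves: a max node takes the largest
    child value, a min node takes the smallest."""
    if isinstance(node, int):  # a leaf holds its evaluation value
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Hypothetical grouping of the leaves 2 3 5 9 0 7 4 1 5 6:
# root (max) -> two min nodes -> max nodes -> leaves.
TREE = [
    [[2, 3], [5, 9]],
    [[0, 7], [4, 1], [5, 6]],
]

print(minimax(TREE, maximizing=True))
```

With this grouping the bottom max nodes back up 3, 9, 7, 4, and 6; the two min nodes back up 3 and 4; and the root takes 4, so the maximizing player should choose the second branch.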

Alpha-beta pruning for game search

• It resembles a heuristic, but it is a guaranteed way to avoid useless branches in game search (search with an opponent alternating moves).
• You do a depth-first search to level N, with N given in advance. Alpha-beta tells you which branches to skip after backing up.
• For this to work, you must store a number for every node on the tree. For max nodes, it's the largest value found so far among the node's children ("alpha"); for min nodes, it's the smallest value found so far ("beta").
• Then the pruning idea is: stop exploring branches from a max node once the current max (alpha) at this node is at least the current min (beta) of this node's parent. Vice versa for min nodes.

Example of alpha-beta pruning in game search

Apply alpha-beta pruning to the previous example game search graph.

[Figure: the same game tree, with levels max, min, max from the top; leaf values, left to right: 2 3 5 9 0 7 4 1 5 6.]
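Alpha-beta pruning can be sketched in Python using the common textbook convention in which alpha is the best value found so far for the maximizing player and beta is the best value found so far for the minimizing player. The tree reuses the example's leaf values, but their grouping under the nodes is an assumption for illustration; the `visited` list is a hypothetical addition that records which leaves were actually evaluated.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Depth-first minimax that skips branches which cannot change
    the backed-up value at the root."""
    if isinstance(node, int):  # leaf: record and return its value
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)
            if alpha >= beta:  # the min node above will never allow this
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)
        if beta <= alpha:  # the max node above already has better
            break
    return value


# Hypothetical grouping of the leaves 2 3 5 9 0 7 4 1 5 6:
# root (max) -> two min nodes -> max nodes -> leaves.
TREE = [
    [[2, 3], [5, 9]],
    [[0, 7], [4, 1], [5, 6]],
]

seen = []
print(alphabeta(TREE, True, visited=seen))
print(seen)
```

With this grouping the search evaluates eight of the ten leaves: the 9 is skipped because its max node already reached 5, above the 3 held by the min node over it, and the 6 is skipped for the same reason under the second min node. The backed-up value at the root is unchanged by the pruning.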