Computer Science CPSC 322 Lecture A* and Search Refinements (Ch 3.6.1, 3.7.1, 3.7.2) Slide 1
Jan 01, 2016
Course Announcements
Assignment 1: posted on Connect on Monday
Due: Monday Feb. 2, 2PM
• Remember, post questions on the assignment or course material on the Connect board; do not email them to me or the TAs.
• Also, remember that you should expect 24h as response time from the teaching team. If you post a question less than 24h before the assignment is due, you might not get an answer.
Lecture Overview
• Recap of Lecture 7
• A* Properties
• Cycle Checking and Multiple Path Pruning (time permitting)
Slide 3
How to Make Search More Informed?
Def.: A search heuristic h(n) is an estimate of the cost of the optimal (cheapest) path from node n to a goal node.
[Figure: nodes n1, n2 and n3, each labelled with its estimate h(n1), h(n2), h(n3) of the cost to the goal]
Slide 4
• h can be extended to paths: h(n0,…,nk)=h(nk)
• h(n) should leverage readily obtainable information (easy to compute) about a node.
Best First Search (BestFS)
• Always choose the path on the frontier with the smallest h value.
• BestFS treats the frontier as a priority queue ordered by h.
• Can get to the goal pretty fast if it has a good h, but it is neither complete nor optimal.
• It still has worst-case time and space complexity of O(b^m).
A* Search
• A* search takes into account both
  - the cost of the path to a node, c(p)
  - the heuristic value of that path, h(p)
• Let f(p) = c(p) + h(p).
  - f(p) is an estimate of the cost of a path from the start to a goal via p.
• A* always chooses the path on the frontier with the lowest estimated distance from the start to a goal node constrained to go via that path.
• Compare A* and LCFS on the Vancouver graph.
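The selection rule above can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not the course's reference implementation: the function name `a_star`, the dictionary-based `graph` and `h`, and the example values are all hypothetical. The frontier is a priority queue of paths ordered by f(p) = c(p) + h(p), with no cycle checking or multiple-path pruning.

```python
import heapq
from itertools import count

def a_star(graph, h, start, is_goal):
    """Return (cost, path) for the first goal path taken off the frontier, or None."""
    tie = count()  # tie-breaker so the heap never has to compare two paths
    # Each frontier entry is (f(p), tie, c(p), path p).
    frontier = [(h[start], next(tie), 0, [start])]
    while frontier:
        _, _, cost, path = heapq.heappop(frontier)  # path with the lowest f value
        node = path[-1]
        if is_goal(node):
            return cost, path
        for neighbor, arc_cost in graph.get(node, []):
            new_cost = cost + arc_cost
            new_f = new_cost + h[neighbor]
            heapq.heappush(frontier, (new_f, next(tie), new_cost, path + [neighbor]))
    return None

# Hypothetical example: arcs are (neighbor, cost) pairs; h is admissible for goal G.
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 2)]}
h = {"S": 4, "A": 3, "B": 2, "G": 0}
result = a_star(graph, h, "S", lambda n: n == "G")
print(result)
```

On this small graph the call returns cost 5 along S, A, B, G, which is cheaper than both S, B, G (cost 6) and S, A, G (cost 7).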
Optimality of A*
A* is complete (finds a solution, if one exists) and optimal (finds the optimal path to a goal) if
• the branching factor is finite
• arc costs are > 0
• h(n) is admissible
Slide 8
Admissibility of a heuristic
Def.: Let c(n) denote the cost of the optimal path from node n to any goal node. A search heuristic h(n) is called admissible if h(n) ≤ c(n) for all nodes n, i.e. if for every node it never overestimates the cost of reaching a goal.
• Example: is the straight-line distance admissible?
  - YES: the shortest distance between two points is a straight line.
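On a small explicit graph, the definition h(n) ≤ c(n) can be checked directly. The sketch below is illustrative only: the helper names `optimal_costs_to_goal` and `is_admissible` and the example graph are assumptions, and c(n) is computed with Dijkstra's algorithm run backwards from the goal nodes.

```python
import heapq

def optimal_costs_to_goal(graph, goals):
    """c(n) for every node that can reach a goal, via Dijkstra on the reversed graph."""
    reverse = {}
    for node, arcs in graph.items():
        for neighbor, cost in arcs:
            reverse.setdefault(neighbor, []).append((node, cost))
    best = {g: 0 for g in goals}
    heap = [(0, g) for g in goals]
    heapq.heapify(heap)
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for prev, arc_cost in reverse.get(node, []):
            new_cost = cost + arc_cost
            if new_cost < best.get(prev, float("inf")):
                best[prev] = new_cost
                heapq.heappush(heap, (new_cost, prev))
    return best

def is_admissible(h, graph, goals):
    """True if h(n) <= c(n) for every node n that can reach a goal."""
    c = optimal_costs_to_goal(graph, goals)
    return all(h[n] <= c[n] for n in c)

# Hypothetical graph: arcs are (neighbor, cost) pairs, G is the only goal.
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 2)]}
good_h = {"S": 4, "A": 3, "B": 2, "G": 0}
bad_h = {"S": 4, "A": 5, "B": 2, "G": 0}  # overestimates at A, where c(A) = 4
print(is_admissible(good_h, graph, {"G"}), is_admissible(bad_h, graph, {"G"}))
```

Such an exhaustive check only works when the whole graph fits in memory; for large search spaces admissibility is usually argued from how the heuristic is constructed.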
• Example: the goal is Urziceni (red box), but all we know is the straight-line distances to Bucharest (green box).
  - Possible h(n) = sld(n, Bucharest) + cost(Bucharest, Urziceni)
  - Admissible? NO: the cost of going from Vaslui to Urziceni is lower than this estimate.
Example 3: Eight Puzzle
• Another possible h(n): the sum of the number of moves between each tile's current position and its goal position (tiles may move over other tiles in the grid).
• For tiles 1 to 8 in the example: sum(2, 3, 3, 2, 4, 2, 0, 2) = 18
• Admissible?  A. Yes  B. No  C. It depends
• Answer: YES!
How to Construct an Admissible Heuristic
• Identify a relaxed version of the problem:
  - where one or more constraints have been dropped
  - a problem with fewer restrictions on the actions
• Grid world: the agent can move through walls
• Driver: the agent can move in a straight line
• 8 puzzle:
  - “number of misplaced tiles”: tiles can move anywhere and occupy the same spot as others
  - “sum of moves between current and goal position”: tiles can occupy the same spot as others
• Why does this lead to an admissible heuristic?
  - The problem only gets easier!
  - It is less costly to solve, so the optimal cost of the relaxed problem never exceeds that of the original.
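The two 8-puzzle relaxations can be written as short functions. This is a sketch under assumptions: the goal layout, the use of 0 for the blank, and the start state are hypothetical and are not the board shown on the slides.

```python
# Goal layout assumed for illustration: tiles 1-8 in row-major order, 0 = blank.
GOAL = (1, 2, 3,
        4, 5, 6,
        7, 8, 0)

def misplaced_tiles(state, goal=GOAL):
    """Number of tiles (blank excluded) that are not in their goal position."""
    return sum(1 for tile, target in zip(state, goal) if tile != 0 and tile != target)

def manhattan_distance(state, goal=GOAL):
    """Sum over tiles of the row distance plus column distance to the goal position."""
    total = 0
    for index, tile in enumerate(state):
        if tile == 0:
            continue  # the blank is not a tile
        goal_index = goal.index(tile)
        total += abs(index // 3 - goal_index // 3) + abs(index % 3 - goal_index % 3)
    return total

# Hypothetical start state.
state = (8, 1, 3,
         4, 0, 2,
         7, 6, 5)
print(misplaced_tiles(state), manhattan_distance(state))
```

For this state the first heuristic gives 5 and the second gives 10. Every misplaced tile needs at least one move, so the second value is never smaller than the first.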
Back to A*
A* is complete (finds a solution, if one exists) and optimal (finds the optimal path to a goal) if
• the branching factor is finite
• arc costs are > 0
• h(n) is admissible
Slide 15
Proof
[Figure: start node S, an optimal path p*, and a subpath p of p*]
Let p be a subpath of an optimal path p*. We can show that, if h is admissible, f(p) ≤ f(p*).
1. f(p*) = c(p*) + h(p*) = c(p*)
   Because h is admissible, it cannot be > 0 at the goal, so h(p*) = 0.
2. f(p) = c(p) + h(p) ≤ c(p*) = f(p*)
   Because h(p) does not overestimate the cost of getting from the end of p to the goal.
Thus f(p) ≤ f(p*) for every subpath p of an optimal path p*.
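Step 2 of the proof can be written out in full. The notation here is an assumption for illustration: write p* as p followed by a remaining segment r, with cost c(r). Since p* is optimal, r is a cheapest way from the end of p to a goal, so admissibility gives h(p) ≤ c(r).

```latex
\begin{align*}
f(p) &= c(p) + h(p) \\
     &\le c(p) + c(r) && \text{(admissibility: } h(p) \le c(r)\text{)} \\
     &= c(p^*) && \text{(} p^* \text{ is } p \text{ followed by } r\text{)} \\
     &= c(p^*) + h(p^*) = f(p^*) && \text{(} h(p^*) = 0 \text{ at a goal)}
\end{align*}
```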
Why is A* complete?
It halts (does not get caught in cycles):
• Let fmin be the cost of the (an) optimal solution path s (unknown but finite if there exists a solution).
• Let cmin > 0 be the minimal cost of any arc.
• Each sub-path p of s has f(p) ≤ fmin
  - due to admissibility (see previous slide).
• A* expands paths on the frontier with minimal f(p), and
  - there is always a sub-path of s on the frontier
  - it only expands paths p with f(p) ≤ fmin
  - it terminates when expanding s,
  because, with positive arc costs, the cost of any other path p on the frontier would eventually exceed fmin, at a depth no greater than fmin / cmin.
• See how it works on the “misleading heuristic” problem in AIspace.
Why is A* complete?
A* does not get caught in the cycle because f(n) of the sub-paths in the cycle eventually (at depth ≤ 55.4/6.9 ≈ 8) exceeds the cost of the optimal solution, 55.4 (N0 -> N6 -> N7 -> N8).
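The depth bound can be checked with the numbers from this example. The sketch assumes that 55.4 plays the role of fmin, that 6.9 is the minimal arc cost cmin, and that h is non-negative; the function name `max_depth_bound` is hypothetical. A path with k arcs costs at least k * cmin, so once k exceeds fmin / cmin its f value must exceed fmin.

```python
import math

def max_depth_bound(f_min, c_min):
    """Largest number of arcs a path can have while its cost is still <= f_min."""
    if c_min <= 0:
        raise ValueError("the bound requires strictly positive arc costs")
    return math.floor(f_min / c_min)

depth = max_depth_bound(55.4, 6.9)
print(depth)  # 8
```

Any path around the cycle with 9 or more arcs costs at least 9 * 6.9 = 62.1, which is already above 55.4, so A* never needs to expand it.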
Why A* is optimal
• Let p* be the optimal solution path, with cost c*.
• Let p’ be a suboptimal solution path. That is, c(p’) > c*.
• Let p” be a sub-path of p* on the frontier.
We are going to show that any such p” will be expanded before p’. Therefore, A* will find p* before p’.
[Figure: search tree showing the paths p’, p* and p”]
• Because h is admissible, we know that f(p”) ≤ f(p*).
• Because at a goal node f(goal) = c(goal), we know that f(p*) = c* < c(p’) = f(p’).
• Thus f(p”) < f(p’).
Any sub-path of the optimal solution path will be expanded before p’.
Run A* on this example to see how A* starts off going down the suboptimal path (through N5) but then recovers and never expands it, because there are always sub-paths of the optimal path through N2 on the frontier.
Analysis of A*
In fact, we can say something even stronger about A* (when h is admissible):
• A* is optimally efficient among the algorithms that extend the search path from the initial state.
• It finds the goal with the minimum number of expansions.
Why A* is Optimally Efficient
No other optimal algorithm is guaranteed to expand fewer
nodes than A* (given the same heuristic function)
This is because any algorithm that does not expand every
node with f(n) < f* risks missing the optimal solution
30
Analysis of A*
If the heuristic is completely uninformative and the edge costs are all the same, A* is equivalent to…
A. BFS
B. LCFS
C. DFS
D. None of the above
Answer: BFS, but I will accept LCFS as correct as well, because when edge costs are all equal it is equivalent to BFS.
Time and Space Complexity of A*
• Time complexity is O(b^m)
  - the heuristic could be completely uninformative and the edge costs could all be the same, meaning that A* does the same thing as BFS
• Space complexity is O(b^m)
  - like BFS, A* maintains a frontier which grows with the size of the tree
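The worst case described here can be seen on a toy tree. The sketch below is illustrative only: the function names, the example tree, and the first-in, first-out tie-breaking among equal f values are assumptions. With h = 0 and unit arc costs, f(p) is just the depth of p, so A* expands paths level by level.

```python
import heapq
from collections import deque
from itertools import count

def a_star_order(tree, start, h=lambda n: 0):
    """Expand the whole tree with A* (unit arc costs) and return the expansion order."""
    tie = count()  # assumed tie-breaking: earlier-added paths first
    frontier = [(h(start), next(tie), 0, start)]
    order = []
    while frontier:
        _, _, cost, node = heapq.heappop(frontier)
        order.append(node)
        for child in tree.get(node, []):
            new_cost = cost + 1  # every arc costs 1
            heapq.heappush(frontier, (new_cost + h(child), next(tie), new_cost, child))
    return order

def bfs_order(tree, start):
    """Expand the whole tree breadth-first and return the expansion order."""
    queue, order = deque([start]), []
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree.get(node, []))
    return order

# Hypothetical finite tree given as node -> list of children.
tree = {"a": ["b", "c"], "b": ["d", "e"], "c": ["f"]}
print(a_star_order(tree, "a"), bfs_order(tree, "a"))
```

Both calls visit the nodes in the same level-by-level order, which is why A* inherits the exponential time and space bounds of BFS when the heuristic gives no guidance.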
Effect of Search Heuristic
A search heuristic that is a better approximation of the actual cost reduces the number of nodes expanded by A*.
Example: 8 puzzle
(1) tiles can move anywhere (h1: number of tiles that are out of place)
(2) tiles can move to any adjacent square (h2: sum of the number of squares that separate each tile from its correct position)
Average number of paths expanded (d = depth of the solution):
          IDS               A*(h1)    A*(h2)
d = 12    3,644,035         227       73
d = 24    too many paths    39,135    1,641
Learning Goals
Apply basic properties of search algorithms:
- completeness, optimality, time and space complexity

Algorithm                                  Complete             Optimal          Time     Space
DFS                                        N (Y if no cycles)   N                O(b^m)   O(mb)
BFS                                        Y                    Y                O(b^m)   O(b^m)
IDS                                        Y                    Y                O(b^m)   O(mb)
LCFS (when arc costs available)            Y (costs > 0)        Y (costs >= 0)   O(b^m)   O(b^m)
Best First (when h available)              N                    N                O(b^m)   O(b^m)
A* (when arc costs > 0 and h admissible)   Y                    Y                O(b^m)   O(b^m)