A* Path Finding

A* PATH FINDING ALGORITHM

Presented by Daniel Natapov

PROBLEM DEFINITION

Find the shortest (weighted) path from a start node to a goal node in a graph (or grid).

What if we are “informed” with heuristics? Can we “look-ahead” and direct our search?

APPLICATION

Games – NPC movement. Needs to be smart and fast.

Games use grids to describe the environment. These slides do too.

Many other applications: network routing, image processing, A.I., path finding, ...

GRID = GRAPH

Grid allows movement between adjacent cells in 4 or 8 possible directions.

Each direction may have a different cost.
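
A minimal Python sketch of how a grid doubles as a graph: the neighbors of a cell are its outgoing edges. The grid encoding ("#" for a blocked cell), the `neighbors` helper, and the step costs (1 for straight moves, √2 for diagonals) are assumptions made for illustration, not taken from the slides.

```python
from typing import Iterator, List, Tuple

Cell = Tuple[int, int]  # (x, y)

# Assumed step costs: 1.0 for straight moves, sqrt(2) for diagonal moves.
STRAIGHT = [(1, 0), (-1, 0), (0, 1), (0, -1)]
DIAGONAL = [(1, 1), (1, -1), (-1, 1), (-1, -1)]


def neighbors(grid: List[str], cell: Cell,
              allow_diagonal: bool = False) -> Iterator[Tuple[Cell, float]]:
    """Yield (neighbor, cost) pairs: the outgoing edges of `cell`."""
    x, y = cell
    moves = [(d, 1.0) for d in STRAIGHT]
    if allow_diagonal:
        moves += [(d, 2 ** 0.5) for d in DIAGONAL]
    for (dx, dy), cost in moves:
        nx, ny = x + dx, y + dy
        inside = 0 <= ny < len(grid) and 0 <= nx < len(grid[0])
        if inside and grid[ny][nx] != "#":  # "#" marks a blocked cell
            yield (nx, ny), cost


grid = [
    "...",
    ".#.",
    "...",
]
print(sorted(n for n, _ in neighbors(grid, (0, 0))))
```

With `allow_diagonal=True` the same function gives the 8-direction variant; blocked cells simply never appear as edges.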

[Figure: a grid of cells shown next to its equivalent graph.]

EXAMPLE – GET FROM S TO T

[Figure: grid showing the start cell S and the target cell T.]

EXAMPLE – EDGE WEIGHTS

In a game, edge weights depend on various factors, e.g. travel on road vs. grass.

For simplicity: let’s say all horizontal and vertical costs are the same.

Also assume no diagonal paths.

WWDD? – WHAT WOULD DIJKSTRA’S DO?

[Figure: Dijkstra’s search from S to T.]

Found it! (finally)

WWDD?

Dijkstra’s algorithm guarantees the shortest path.

But it searches a lot of unneeded area.

We know where the destination node is (just not how to get there).

We can try to direct the search with greedy Best-First-Search.

BEST-FIRST-SEARCH

Similar to Dijkstra’s, but is informed.

Has some estimate of how far from the goal each vertex is: “look-ahead”. This estimate is a heuristic.

It prioritizes vertices which it believes to be closest to the goal, as opposed to vertices closest to the start.
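
A minimal Python sketch of greedy Best-First-Search on a 4-direction grid, under some assumptions: the priority is the Manhattan estimate to the goal only, "#" marks a blocked cell, and the function name `greedy_best_first` is made up for this example. Note that the distance already traveled never enters the priority.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (x, y)


def manhattan(a: Cell, b: Cell) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def greedy_best_first(grid: List[str], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Always expand the found cell with the smallest estimate h."""
    frontier = [(manhattan(start, goal), start)]  # entries are (h, cell)
    parent: Dict[Cell, Optional[Cell]] = {start: None}
    while frontier:
        _, cell = heapq.heappop(frontier)
        if cell == goal:
            path: List[Cell] = []
            node: Optional[Cell] = cell
            while node is not None:  # walk the parent links back to the start
                path.append(node)
                node = parent[node]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            inside = 0 <= ny < len(grid) and 0 <= nx < len(grid[0])
            if inside and grid[ny][nx] != "#" and nxt not in parent:
                parent[nxt] = cell
                heapq.heappush(frontier, (manhattan(nxt, goal), nxt))
    return None  # goal not reachable


open_grid = ["....."] * 3
wall_grid = [
    "..#..",
    "..#..",
    ".....",
]
print(greedy_best_first(open_grid, (0, 0), (4, 0)))
print(greedy_best_first(wall_grid, (0, 0), (4, 0)))
```

On the open grid the search walks straight to the goal. With a wall in the way it still returns a valid path, but nothing in the priority guarantees that path is the shortest one.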

BEST-FIRST-SEARCH EXAMPLE

[Figure: Best-First-Search from S to T.]

HOW TO BREAK IT – OBSTACLES!

[Figure: Best-First-Search from S to T with obstacles in the way.]

Found it! (could’ve taken a better route)

BEST-FIRST-SEARCH

Best-First-Search works faster than Dijkstra’s.

But it does not guarantee an optimal path.

We want some combination of Dijkstra’s and Best-First-Search. Enter A*!

A* ALGORITHM

Prioritizes its search based on:

The distance traveled (Dijkstra’s)

The distance remaining (Best-First-Search)

g(n) = Distance traveled from the start to a cell.

h(n) = Estimated distance from a cell to the target.

Value of a cell is f(n) = g(n) + h(n).

The algorithm prioritizes cells whose f(n) is lowest.
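
A minimal Python sketch of the idea, assuming a 4-direction grid where every step costs 1, "#" marks a blocked cell, and the Manhattan distance serves as h(n). The names `a_star` and `manhattan` and the grid encoding are choices made for this example only.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (x, y)


def manhattan(a: Cell, b: Cell) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def a_star(grid: List[str], start: Cell, goal: Cell) -> Optional[Tuple[int, List[Cell]]]:
    """Return (cost, path) from start to goal, or None if the goal is unreachable."""
    g: Dict[Cell, int] = {start: 0}                    # distance traveled so far
    parent: Dict[Cell, Optional[Cell]] = {start: None}
    frontier = [(manhattan(start, goal), start)]       # entries are (f, cell)
    handled = set()
    while frontier:
        _, cell = heapq.heappop(frontier)              # found cell with the lowest f
        if cell in handled:
            continue                                   # stale entry for a handled cell
        if cell == goal:
            path: List[Cell] = []
            node: Optional[Cell] = cell
            while node is not None:                    # trace the parents back to start
                path.append(node)
                node = parent[node]
            return g[goal], path[::-1]
        handled.add(cell)
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            inside = 0 <= ny < len(grid) and 0 <= nx < len(grid[0])
            if not inside or grid[ny][nx] == "#":
                continue
            new_g = g[cell] + 1                        # every step costs 1 in this sketch
            if new_g < g.get(nxt, float("inf")):
                g[nxt] = new_g
                parent[nxt] = cell
                heapq.heappush(frontier, (new_g + manhattan(nxt, goal), nxt))
    return None


grid = [
    "..#..",
    "..#..",
    ".....",
]
print(a_star(grid, (0, 0), (4, 0)))
```

The only difference from Dijkstra’s in this sketch is the priority pushed onto the queue: g + h instead of g alone.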

WHAT’S ALL THIS TALK ABOUT ESTIMATES?

An estimate of the distance between a cell and a target is a heuristic.

May be able to estimate distance between two cells.

Choosing a good heuristic is important, and can be difficult.

In our simplified case it is easy: the Manhattan Distance:

h(n) = |cell.x - goal.x| + |cell.y - goal.y|
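
As a small Python sketch, with cells represented as (x, y) tuples (an assumption of this example):

```python
from typing import Tuple

Cell = Tuple[int, int]  # (x, y)


def manhattan(cell: Cell, goal: Cell) -> int:
    """h(n): horizontal plus vertical distance, ignoring any obstacles."""
    return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])


print(manhattan((1, 2), (4, 6)))  # 3 + 4 = 7
```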

MANHATTAN DISTANCE

Good for our case. Actual distance can never be less (more on this later).

Let’s go through a complete example.

A* EXAMPLE

[Figure sequence: A* expanding step by step from S toward T. Every found cell is labeled with its g, h, and f values. The cells next to S start with g = 1 (one with h = 4, f = 5; the others with h = 6, f = 7). Later frames add cells with f = 9 and f = 11, and the last cell labeled before T has g = 8, h = 1, f = 9.]

MORE ABOUT HEURISTICS

Depending on the heuristic, A* can be admissible. This guarantees an optimal solution, despite using an estimate.

For A* to be admissible and guarantee an optimal solution we need: ∀n, h(n) ≤ h*(n)

h*(n) is the actual distance.

If the heuristic overestimates the actual distance, an optimal solution is not guaranteed.
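
One way to check the condition on a small grid, sketched in Python: compute the actual distance h*(n) of every cell with a breadth-first search from the goal, then compare it with the estimate. The unit step cost, the "#" wall encoding, and the helper names are assumptions of this example.

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

Cell = Tuple[int, int]  # (x, y)


def manhattan(a: Cell, b: Cell) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def true_distances(grid: List[str], goal: Cell) -> Dict[Cell, int]:
    """h*(n) for every cell that can reach the goal (all steps cost 1)."""
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        x, y = queue.popleft()
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            inside = 0 <= ny < len(grid) and 0 <= nx < len(grid[0])
            if inside and grid[ny][nx] != "#" and nxt not in dist:
                dist[nxt] = dist[(x, y)] + 1
                queue.append(nxt)
    return dist


def is_admissible(h: Callable[[Cell, Cell], float], grid: List[str], goal: Cell) -> bool:
    """True if h never overestimates: h(n) <= h*(n) for every reachable cell."""
    return all(h(cell, goal) <= d for cell, d in true_distances(grid, goal).items())


grid = [
    "..#..",
    "..#..",
    ".....",
]
goal = (4, 0)
print(is_admissible(manhattan, grid, goal))                          # True
print(is_admissible(lambda c, t: 2 * manhattan(c, t), grid, goal))   # False
```

Doubling the Manhattan distance already breaks admissibility here: the cell right next to the goal gets an estimate of 2 while its actual distance is 1.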

MORE ABOUT HEURISTICS CONT’D

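For optimality when each node is handled only once, we also need monotonicity (also called consistency).

The heuristic must satisfy the triangle inequality: h(n1) ≤ c(n1 → n2) + h(n2)

[Figure: triangle with corners n1, n2, and goal; its sides are labeled c(n1 → n2), h(n1), and h(n2).]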
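
The triangle inequality can be checked edge by edge. A minimal Python sketch on a made-up three-node graph (A → B → G, each edge costing 1); the graph, the heuristic values, and the `is_consistent` helper are all hypothetical and only illustrate the condition.

```python
from typing import Dict

# Hypothetical directed graph: edges[n1][n2] = c(n1 -> n2).
edges: Dict[str, Dict[str, int]] = {
    "A": {"B": 1},
    "B": {"G": 1},
    "G": {},
}


def is_consistent(edges: Dict[str, Dict[str, int]], h: Dict[str, int]) -> bool:
    """True if h(n1) <= c(n1 -> n2) + h(n2) holds for every edge."""
    return all(
        h[n1] <= cost + h[n2]
        for n1, out in edges.items()
        for n2, cost in out.items()
    )


h_monotone = {"A": 2, "B": 1, "G": 0}
# Never overestimates (actual distances are A: 2, B: 1, G: 0),
# yet it drops by 2 along the edge A -> B, which costs only 1.
h_not_monotone = {"A": 2, "B": 0, "G": 0}

print(is_consistent(edges, h_monotone))      # True
print(is_consistent(edges, h_not_monotone))  # False
```

The second heuristic shows that a heuristic can stay at or below the actual distance everywhere and still violate the triangle inequality.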

FIDDLING WITH THE HEURISTIC

Use the heuristic to balance speed vs. accuracy.

If h(n) = 0, then f(n) = g(n). In other words, A* becomes Dijkstra’s.

If h(n) >> g(n), g(n) can be ignored. f(n) ≈ h(n). A* becomes Best-First-Search.

In general:

The bigger g(n) is relative to h(n), the more A* expands, which makes it slower.

The bigger h(n) is relative to g(n), the more direct the search is, but better paths could be missed.
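
One common way to turn this dial is to scale the heuristic by a weight w, so that f(n) = g(n) + w·h(n). This weighting is an addition for illustration, not something the slides define. A minimal Python sketch on a unit-cost grid ("#" marks a wall), returning the path cost and the number of cells handled before the goal:

```python
import heapq
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (x, y)


def manhattan(a: Cell, b: Cell) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def weighted_search(grid: List[str], start: Cell, goal: Cell,
                    w: float) -> Optional[Tuple[int, int]]:
    """Search with f = g + w * h. Returns (path cost, cells handled before the goal)."""
    g: Dict[Cell, int] = {start: 0}
    frontier = [(w * manhattan(start, goal), start)]
    handled = set()
    while frontier:
        _, cell = heapq.heappop(frontier)
        if cell in handled:
            continue  # stale entry
        if cell == goal:
            return g[cell], len(handled)
        handled.add(cell)
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            inside = 0 <= ny < len(grid) and 0 <= nx < len(grid[0])
            if not inside or grid[ny][nx] == "#" or nxt in handled:
                continue
            new_g = g[cell] + 1
            if new_g < g.get(nxt, float("inf")):
                g[nxt] = new_g
                heapq.heappush(frontier, (new_g + w * manhattan(nxt, goal), nxt))
    return None


grid = [
    ".......",
    ".......",
    "...#...",
    "...#...",
    ".......",
]
for w in (0, 1, 5):  # 0: Dijkstra's, 1: A*, 5: close to Best-First-Search
    print(w, weighted_search(grid, (0, 2), (6, 2), w))
```

With w = 0 and w = 1 the returned cost is the optimal one; with a large w the search is more direct, but the cost it returns is only guaranteed to be that of some valid path.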

FIDDLING WITH THE HEURISTIC 2

If h(n) = h*(n), then A* will find the optimal solution, and not expand anything unnecessary. Straight to the target.

Only possible with good heuristic and no obstacles.

Can ‘fiddle’ with the heuristic and set it depending on the need.

Sometimes it is okay to accept an approximate solution in exchange for a speed-up.

SPEED-ACCURACY SEE-SAW

[Figure: see-saw with g(n) and h(n) on either side, labeled Speed and Accuracy.]

FORMAL DEFINITION

preCond: Input a grid/graph G with positive edge weights, a source node s, a target node t.

Also given an admissible heuristic for estimating distances.

postCond: Finds a shortest weighted path from s to t.

Loop Invariant: So far, the nodes have been handled in order of f(n), where f(n) = g(n) + h(n).
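
The loop invariant can be observed directly by recording f(n) each time a node is handled. A minimal Python sketch, assuming a unit-cost 4-direction grid ("#" marks a wall) and the Manhattan distance as heuristic; the function name `handled_f_values` is made up for this example.

```python
import heapq
from typing import Dict, List, Tuple

Cell = Tuple[int, int]  # (x, y)


def manhattan(a: Cell, b: Cell) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def handled_f_values(grid: List[str], start: Cell, goal: Cell) -> List[int]:
    """Run A* and return f(n) of each handled node, in handling order."""
    g: Dict[Cell, int] = {start: 0}
    frontier = [(manhattan(start, goal), start)]  # entries are (f, cell)
    handled = set()
    order: List[int] = []
    while frontier:
        f, cell = heapq.heappop(frontier)  # found, not handled, min f
        if cell in handled:
            continue  # stale entry
        handled.add(cell)
        order.append(f)
        if cell == goal:
            break  # exit: t has been found
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            inside = 0 <= ny < len(grid) and 0 <= nx < len(grid[0])
            if not inside or grid[ny][nx] == "#":
                continue
            new_g = g[cell] + 1
            if new_g < g.get(nxt, float("inf")):
                g[nxt] = new_g
                heapq.heappush(frontier, (new_g + manhattan(nxt, goal), nxt))
    return order


grid = [
    "..#..",
    "..#..",
    ".....",
]
fs = handled_f_values(grid, (0, 0), (4, 0))
print(fs == sorted(fs))  # the handled f values never decrease
```

In this sketch the first recorded value is h(s) and the last is the cost of the path found, since h(t) = 0.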

FORMAL DEFINITION CONT’D

Step: Handle the found (not handled) node with min f(n).

Store the parent for each cell – the cell through which the shortest path from s came.

Exit: Stop when t has been found.

Obtaining the post condition: LI + Exit + Code => PostCond.

Proving the path we traced back is shortest:

Prove there is a path of this length: we have one.

Prove there is no shorter path: ...

PROOF THAT THERE IS NO BETTER PATH

We know our heuristic is admissible, h(n) ≤ h*(n).

By LI, our path handles cells in order of the minimum of f(n) = g(n) + h(n).

All unhandled nodes have an f(n) at least as large as ours.

We found t, so h(t) = 0 and f(t) becomes our actual cost, g(t). In other words, our actual cost is no higher than the actual + estimated cost of any other found node.

The estimated cost is never more than the actual cost, meaning our actual cost is no higher than any other actual cost.

COMPLICATED SLIDE. PAY ATTENTION!

CONFUSED?

f(t) ≤ f(any) = g(any) + h(any) ≤ g(any) + h*(any)

f(t): our actual cost

f(any): any other found estimate

g(any) + h*(any): any other ‘actual’
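
The same chain written out as a sketch in LaTeX. Here t is the target, n stands for any other found but not yet handled node, and h* is the actual remaining distance; the step labels are added commentary, not part of the slides.

```latex
\begin{align*}
g(t) = f(t) &\le f(n) = g(n) + h(n)
  && \text{($h(t) = 0$, and $t$ was handled before $n$)} \\
&\le g(n) + h^{*}(n)
  && \text{(admissibility: $h(n) \le h^{*}(n)$)}
\end{align*}
```

The right-hand side is the actual cost of finishing through n, so no route through another found node can beat the path already traced back from t.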

WHAT IT ALL MEANS

If heuristic is admissible, A* returns the shortest path.

It will find it by (likely) expanding and searching fewer cells than Dijkstra’s.

But if the condition that h(n) ≤ h*(n) is violated, we can no longer ensure optimality.

Should it always be optimal?

RUNNING TIME

Well... Dijkstra’s is O(|E| + |V| log |V|).

V = number of vertices, E = number of edges.

Obviously it is possible for A* to search every edge as well, so we have no savings in the worst case.

Let’s focus on the nodes instead: Dijkstra’s is O(V²).

RUNNING TIME – DIJKSTRA’S

[Figure: Dijkstra’s search area, a circle around S that reaches T.]

Area of circle is O(L²)

RUNNING TIME – A*

[Figure: A* search area, a half ellipse between S and T.]

Area of half ellipse: O(L∙H)

RUNNING TIME – A*

[Figure: A* search area between S and T, split into n sections of length L/n.]

In total: O((L/n)∙H∙n) = O(L∙H)

ALL DONE! Thank you. Questions?

The algorithm was first described in 1968 by Peter Hart, Nils Nilsson, and Bertram Raphael

References & Resources:

http://theory.stanford.edu/~amitp/GameProgramming/ (Great source)

http://www.policyalmanac.org/games/aStarTutorial.htm

http://en.wikipedia.org/wiki/A*_search_algorithm

http://www.cse.yorku.ca/course_archive/2008-09/W/3402/slides/Week3.pdf
