Adversarial Search Chapter 6. Outline Optimal decisions α-β pruning Imperfect, real-time decisions.

Adversarial Search

Chapter 6

Outline

Optimal decisions α-β pruning Imperfect, real-time decisions

Games as search problems Engaged intellectual faculties of humans.

Board games such as Chess.

Game playing is appealing target of AI.

State of the game is easy to represent . And

Actions are restricted.

Games as search problems This makes the Game playing an idealizations of

world in which hostile agents act so as to diminish one’s well being.

Early researchers choose chess fro several reasons.

A chess playing computer is an existing proof of machine doing some thing thought to require intelligence.

Easy to represent the game as search through a space of possible gaming positions .

Games as search problems The presence of opponent makes the

decision problem complicated.

It introduces “Uncertainty”.

So, all game playing programs must deal with contingency problems.

Means prediction is impossible.

Games as search problems The complexity of the games introduces

completely a new kind of uncertainty that we have not seen so far.

It does not arise because it has missing information but one does not have a time to calculate the exact consequences of any move.

Instead, one has to make the best guess based on past experience and act before he is sure what action to take.

Games as search problems

Game usually has the time limits.

Game playing research has therefore spawned a number of interesting ideas on how to make the best use of time to reach good decisions.

Now ,the next discussion is how to find the theoretically best move. Then we look techniques for choosing a good move when time is limited.

Pruning allows us to to ignore portions of the search tree that makes no difference to the final choice.

Heuristic function allows us to approximate the true utility of state without complete search

Perfect Decisions in two person Games

The general case of a game with two players, whom we will call MAX and M1N,

MAX moves first, and then they take turns moving until the game is over.

At the end of the game, points are awarded to

the winning player.

Formal definition of game as search problem

The initial state, which includes the board position and an indication of whose move it is.

A set of operators, which define the legal moves that a player can make.

A terminal test, which determines when the game is over. States where the game has ended are called terminal states.

A utility function (also called a payoff function), which gives a numeric value for the outcome of a game. In chess, the outcome is a win, loss, or draw, which we can represent by the values +1, —1, or 0

Perfect Decisions in two person Games MAX would have to search for a sequence of moves

that leads to a terminal state that is a winner .then go ahead and make the first move in the sequence.

Unfortunately, MIN has something to say about it.

MAX therefore must find a strategy that will lead to a winning terminal state regardless of what MIN does, where the strategy includes the correct move for MAX for each possible move by MIN.

We will begin by showing how to find the optimal (or rational) strategy.

Game tree (2-player, deterministic, turns) Figure shows part of the search tree for the game of

Tic-Tac-Toe.

From the initial state, has a choice of nine possible moves. Play alternates between MAX placing x's and MIN placing o's until we reach leaf nodes corresponding to terminal states: states where one player has three in a row or all the squares are filled.

The number on each leaf node indicates the utility value of the terminal state from the point of view of MAX; high values are assumed to be good for MAX and bad for MIN .It is MAX'S job to use the search tree to determine the best move.

Game tree (2-player, deterministic, turns)

Game tree Even a simple game like Tic-Tac-Toe is too

complex to show the whole search tree, so we will switch to the absolutely trivial

game in Figure . The possible moves for MAX are labelled A

I , AT, and AS. The possible replies loA\ for MIN are A11, A12, A13, and so on. This particular

game ends after one move each by MAX and MIN.

Game tree

Minimax Algorithm The minimax algorithm is designed to determine

the optimal strategy for MAX, and thus to decide what the best first move is.

The algorithm consists of five steps:

Generate the whole game tree, all the way down to the terminal states.

Apply the utility function to each terminal state to get its value.

Use the utility of the terminal states to determine the utility of the nodes one level higher up in the search tree.

Minimax Algorithm Continue backing up the values from the leaf nodes

toward the root, one layer at a time.

Eventually, the backed-up values reach the top of the tree; at that point, MAX chooses the move that leads to the highest value. In the topmost A node of , MAX has a choice of three moves that will lead to states with utility 3, 2, and 2, respectively.

Thus, MAX's best opening move is A1. This is called the minimax decision, because it maximizes the utility under the assumption that the opponent will play perfectly to minimize it.

Minimax Perfect play for deterministic games Idea: choose move to position with highest

minimax value = best achievable payoff against best

play E.g., 2-ply game:

Minimax algorithm

Properties of minimax Complete? Yes (if tree is finite) Optimal? Yes (against an optimal opponent) Time complexity? O(bm) Space complexity? O(bm) (depth-first exploration)

For chess, b ≈ 35, m ≈100 for "reasonable" games exact solution completely infeasible

(m is depth and b is legal move on each point)

α-β pruning example





Properties of α-β Pruning does not affect final result

Good move ordering improves effectiveness of pruning

With "perfect ordering," time complexity = O(bm/2) doubles depth of search

A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)

Why is it called α-β? α is the value of the

best (i.e., highest-value) choice found so far at any choice point along the path for max

If v is worse than α, max will avoid it prune that branch

Define β similarly for min

The α-β algorithm

The α-β algorithm

Resource limits

Suppose we have 100 secs, explore 104 nodes/sec 106 nodes per move

Standard approach: cutoff test:

e.g., depth limit (perhaps add quiescence search) evaluation function

= estimated desirability of position

Evaluation functions For chess, typically linear weighted sum of

featuresEval(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s)

e.g., w1 = 9 with

f1(s) = (number of white queens) – (number of black queens), etc.

Cutting off searchMinimaxCutoff is identical to MinimaxValue

except1. Terminal? is replaced by Cutoff?2. Utility is replaced by Eval

Does it work in practice?bm = 106, b=35 m=4

4-ply lookahead is a hopeless chess player! 4-ply ≈ human novice 8-ply ≈ typical PC, human master 12-ply ≈ Deep Blue, Kasparov

1.

Deterministic games in practice Checkers: Chinook ended 40-year-reign of human world

champion Marion Tinsley in 1994. Used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.

Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply.

Othello: human champions refuse to compete against computers, who are too good.

Go: human champions refuse to compete against computers, who are too bad. In go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.

Summary

Games are fun to work on! They illustrate several important

points about AI perfection is unattainable must

approximate good idea to think about what to

think about

Adversarial Search Chapter 6. Outline Optimal decisions α-β pruning Imperfect, real-time decisions.

Documents