Adversarial Search Chapter 6
Games as search problems Engaged intellectual faculties of humans.
Board games such as Chess.
Game playing is appealing target of AI.
State of the game is easy to represent . And
Actions are restricted.
Games as search problems This makes the Game playing an idealizations of
world in which hostile agents act so as to diminish one’s well being.
Early researchers choose chess fro several reasons.
A chess playing computer is an existing proof of machine doing some thing thought to require intelligence.
Easy to represent the game as search through a space of possible gaming positions .
Games as search problems The presence of opponent makes the
decision problem complicated.
It introduces “Uncertainty”.
So, all game playing programs must deal with contingency problems.
Means prediction is impossible.
Games as search problems The complexity of the games introduces
completely a new kind of uncertainty that we have not seen so far.
It does not arise because it has missing information but one does not have a time to calculate the exact consequences of any move.
Instead, one has to make the best guess based on past experience and act before he is sure what action to take.
Games as search problems
Game usually has the time limits.
Game playing research has therefore spawned a number of interesting ideas on how to make the best use of time to reach good decisions.
Now ,the next discussion is how to find the theoretically best move. Then we look techniques for choosing a good move when time is limited.
Pruning allows us to to ignore portions of the search tree that makes no difference to the final choice.
Heuristic function allows us to approximate the true utility of state without complete search
Perfect Decisions in two person Games
The general case of a game with two players, whom we will call MAX and M1N,
MAX moves first, and then they take turns moving until the game is over.
At the end of the game, points are awarded to
the winning player.
Formal definition of game as search problem
The initial state, which includes the board position and an indication of whose move it is.
A set of operators, which define the legal moves that a player can make.
A terminal test, which determines when the game is over. States where the game has ended are called terminal states.
A utility function (also called a payoff function), which gives a numeric value for the outcome of a game. In chess, the outcome is a win, loss, or draw, which we can represent by the values +1, —1, or 0
Perfect Decisions in two person Games MAX would have to search for a sequence of moves
that leads to a terminal state that is a winner .then go ahead and make the first move in the sequence.
Unfortunately, MIN has something to say about it.
MAX therefore must find a strategy that will lead to a winning terminal state regardless of what MIN does, where the strategy includes the correct move for MAX for each possible move by MIN.
We will begin by showing how to find the optimal (or rational) strategy.
Game tree (2-player, deterministic, turns) Figure shows part of the search tree for the game of
Tic-Tac-Toe.
From the initial state, has a choice of nine possible moves. Play alternates between MAX placing x's and MIN placing o's until we reach leaf nodes corresponding to terminal states: states where one player has three in a row or all the squares are filled.
The number on each leaf node indicates the utility value of the terminal state from the point of view of MAX; high values are assumed to be good for MAX and bad for MIN .It is MAX'S job to use the search tree to determine the best move.
Game tree Even a simple game like Tic-Tac-Toe is too
complex to show the whole search tree, so we will switch to the absolutely trivial
game in Figure . The possible moves for MAX are labelled A
I , AT, and AS. The possible replies loA\ for MIN are A11, A12, A13, and so on. This particular
game ends after one move each by MAX and MIN.
Minimax Algorithm The minimax algorithm is designed to determine
the optimal strategy for MAX, and thus to decide what the best first move is.
The algorithm consists of five steps:
Generate the whole game tree, all the way down to the terminal states.
Apply the utility function to each terminal state to get its value.
Use the utility of the terminal states to determine the utility of the nodes one level higher up in the search tree.
Minimax Algorithm Continue backing up the values from the leaf nodes
toward the root, one layer at a time.
Eventually, the backed-up values reach the top of the tree; at that point, MAX chooses the move that leads to the highest value. In the topmost A node of , MAX has a choice of three moves that will lead to states with utility 3, 2, and 2, respectively.
Thus, MAX's best opening move is A1. This is called the minimax decision, because it maximizes the utility under the assumption that the opponent will play perfectly to minimize it.
Minimax Perfect play for deterministic games Idea: choose move to position with highest
minimax value = best achievable payoff against best
play E.g., 2-ply game:
Properties of minimax Complete? Yes (if tree is finite) Optimal? Yes (against an optimal opponent) Time complexity? O(bm) Space complexity? O(bm) (depth-first exploration)
For chess, b ≈ 35, m ≈100 for "reasonable" games exact solution completely infeasible
(m is depth and b is legal move on each point)
Properties of α-β Pruning does not affect final result
Good move ordering improves effectiveness of pruning
With "perfect ordering," time complexity = O(bm/2) doubles depth of search
A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)
Why is it called α-β? α is the value of the
best (i.e., highest-value) choice found so far at any choice point along the path for max
If v is worse than α, max will avoid it prune that branch
Define β similarly for min
Resource limits
Suppose we have 100 secs, explore 104 nodes/sec 106 nodes per move
Standard approach: cutoff test:
e.g., depth limit (perhaps add quiescence search) evaluation function
= estimated desirability of position
Evaluation functions For chess, typically linear weighted sum of
featuresEval(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s)
e.g., w1 = 9 with
f1(s) = (number of white queens) – (number of black queens), etc.
Cutting off searchMinimaxCutoff is identical to MinimaxValue
except1. Terminal? is replaced by Cutoff?2. Utility is replaced by Eval
Does it work in practice?bm = 106, b=35 m=4
4-ply lookahead is a hopeless chess player! 4-ply ≈ human novice 8-ply ≈ typical PC, human master 12-ply ≈ Deep Blue, Kasparov
1.
Deterministic games in practice Checkers: Chinook ended 40-year-reign of human world
champion Marion Tinsley in 1994. Used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.
Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply.
Othello: human champions refuse to compete against computers, who are too good.
Go: human champions refuse to compete against computers, who are too bad. In go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.