COMP-4640: Intelligent & Interactive Systems Game Playing


A game can be formally defined as a search problem with:

- an initial state
- a set of operators (actions or moves)
- a terminal test
- a utility (payoff) function
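The four components above can be sketched in code. The example below is only an illustration, assuming a hypothetical toy game that is not from the slides: a pile of stones from which players alternately remove 1 or 2, with the player who takes the last stone winning. All names (`State`, `actions`, `result`, and so on) are chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    stones: int    # stones left in the pile
    to_move: str   # "MAX" or "MIN"


def initial_state() -> State:
    """Initial state: a pile of 5 stones, MAX moves first (arbitrary choice)."""
    return State(stones=5, to_move="MAX")


def actions(state: State) -> list:
    """Operators: remove 1 or 2 stones, never more than remain."""
    return [n for n in (1, 2) if n <= state.stones]


def result(state: State, action: int) -> State:
    """Apply an operator and hand the turn to the other player."""
    other = "MIN" if state.to_move == "MAX" else "MAX"
    return State(state.stones - action, other)


def terminal_test(state: State) -> bool:
    """The game ends when the pile is empty."""
    return state.stones == 0


def utility(state: State) -> int:
    """Payoff from MAX's point of view (+1 win, -1 loss).

    In a terminal state the player to move is the one who did NOT
    take the last stone, so that player has lost.
    """
    return -1 if state.to_move == "MAX" else +1
```

A search procedure such as minimax needs only these five functions; it never has to know the rules of the particular game.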

1. Multi-agent environment
– Multi-player games involve planning and acting in environments populated by other active agents.
– Agents use a sense/plan/act architecture that does not plan too far into the unpredictable future.
– But with proper information, an agent can construct a plan that considers the effects of the actions of other agents.
– In AI we will consider the special case of games that are:
  • deterministic
  • turn-taking
  • two-player
  • zero-sum, with perfect information

2. Zero-sum games
– Either one player wins (and the other loses), or a draw results.
– Utilities: +1 for a win, -1 for a loss, 0 for a draw.

3. The agents' opposing utility functions make the game adversarial.


Multi-agent environment: Robot Soccer


http://www.robocup.org/

Game tree (2-player, deterministic, turns)



The Minimax Algorithm


• The evaluation function:
– Must agree with the utility function on terminal states (goal states)
– Must be of reasonable complexity so that it can be computed quickly (a trade-off between accuracy and time)
– Should be accurate
• The performance of the game-playing system depends on the accuracy ("goodness") of the evaluation function.


• One problem with minimax is that it may not be feasible to search the whole game tree for a minimax decision (move or action).

• Depth-limited search can speed up the minimax decision process, but the utility function can then no longer be applied at the cutoff depth, so an evaluation function must be constructed instead.

• This evaluation function provides an estimate of the expected utility of a game position.
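Depth-limited minimax with an evaluation function can be sketched as follows. This is a minimal illustration under stated assumptions, not the course's reference implementation: the `Node` class is hypothetical, and each node simply stores a number that is the true utility at a leaf and a static heuristic estimate elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    value: float                                   # utility (leaf) or estimate (internal)
    children: list = field(default_factory=list)   # empty list means terminal


def minimax_cutoff(node: Node, depth: int, maximizing: bool) -> float:
    """Minimax value of `node`, searching at most `depth` plies below it."""
    if not node.children:   # terminal test: use the true utility
        return node.value
    if depth == 0:          # cutoff test: fall back on the evaluation function
        return node.value
    values = [minimax_cutoff(c, depth - 1, not maximizing) for c in node.children]
    return max(values) if maximizing else min(values)


# MAX root over two MIN nodes; the estimates 4 and 5 are made-up heuristic values.
left = Node(4, [Node(3), Node(12), Node(8)])    # true minimax value: 3
right = Node(5, [Node(2), Node(4), Node(6)])    # true minimax value: 2
root = Node(0, [left, right])

print(minimax_cutoff(root, 2, True))   # full search: 3
print(minimax_cutoff(root, 1, True))   # cut off after one ply: 5
```

With the full two-ply search the root value is 3 (via the left branch). Cutting off after one ply returns 5 and prefers the right branch, because the made-up estimate there is too optimistic: the decision is only as good as the evaluation function.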

Game Playing: Properties of minimax
• Complete? Yes (if the tree is finite)
• Optimal? Yes (against an optimal opponent)
• Time complexity? O(b^m)
• Space complexity? O(bm) (depth-first exploration)
Here b is the branching factor and m is the maximum depth of the tree.

• For chess, b ≈ 35 and m ≈ 100 for "reasonable" games, so an exact solution is completely infeasible.
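To see the scale of the chess figures, the number of nodes in a full game tree with b = 35 and m = 100 can be estimated with a base-10 logarithm (a rough calculation using only the values quoted above):

```latex
b^m = 35^{100} = 10^{\,100 \log_{10} 35} \approx 10^{\,100 \times 1.544} \approx 10^{154}
```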


Once we have developed a good evaluation function, we must also consider:

• The depth limit

• The horizon problem
– Difficult to eliminate
– Arises when the program faces a move by the opponent that causes serious damage and is ultimately unavoidable
– Stalling pushes that move over the horizon, to a depth where it cannot be detected


• Once we have an evaluation function and a depth-limit we can then re-apply minimax search.

• However, for depth-limited search minimax may still be inefficient.

• Minimax will expand nodes that need not be searched.

• By making our search method more efficient, we will be able to search at deeper levels of our game tree.


Game Playing: Alpha-Beta Pruning

1. Search below a MIN node may be alpha-pruned if its beta value is ≤ the alpha value of some MAX ancestor.

2. Search below a MAX node may be beta-pruned if its alpha value is ≥ the beta value of some MIN ancestor.

Alpha-Beta Pruning (αβ prune)
• Rules of thumb
– α is the highest MAX value found so far
– β is the lowest MIN value found so far
– If MIN is on top, alpha prune
– If MAX is on top, beta prune
– Alpha prunes occur only at MIN levels
– Beta prunes occur only at MAX levels
– See the detailed algorithm on p. 167



[Figure: alpha-beta pruning worked examples on game trees, with α and β cutoffs marked]

Game of Chance: Expecti-minimax

• The initial values of the leaves indicate the board state
• Use the percentage chance of each roll to compute the first calculated (expected) value
• Min eval f(n) selects the Min value
• The second roll uses different assigned percentage chances
• Max eval f(n) selects the Max value


[Figure: expecti-minimax worked example; chance-node values are probability-weighted sums, e.g. (0*0.67 + 6*0.33) ≈ 2 and (3*1.0) = 3]
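The chance-node calculation can be sketched in code. The tree encoding below is an assumption made for this example: a node is either a number (leaf) or a pair `(kind, children)`, where chance nodes hold `(probability, child)` pairs. The probabilities 0.67, 0.33 and 1.0 are those of the worked example; the exact tree shape is made up, since the original diagram is not available.

```python
def expectiminimax(node):
    """Value of a game tree containing MAX, MIN and chance nodes."""
    if isinstance(node, (int, float)):     # leaf: board value
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":                   # probability-weighted average
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind!r}")


# Chance node: 0.67 * min(0, 6) + 0.33 * min(6, 12) = 0*0.67 + 6*0.33
roll_a = ("chance", [(0.67, ("min", [0, 6])), (0.33, ("min", [6, 12]))])
# Chance node with a single certain outcome: 1.0 * min(3, 9) = 3*1.0
roll_b = ("chance", [(1.0, ("min", [3, 9]))])
root = ("max", [roll_a, roll_b])

print(expectiminimax(roll_a))   # about 1.98, shown as 2 in the worked example
print(expectiminimax(root))     # 3.0
```

MAX prefers the second move, because its expected value (3) exceeds that of the first (about 2), even though the first move could reach a leaf worth 6.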


Cutting off search
MinimaxCutoff is identical to MinimaxValue except
1. Terminal? is replaced by Cutoff?
2. Utility is replaced by Eval

Does it work in practice?
b^m = 10^6, b = 35 → m = 4

4-ply lookahead is a hopeless chess player!
– 4-ply ≈ human novice
– 8-ply ≈ typical PC, human master
– 12-ply ≈ Deep Blue, Kasparov
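The arithmetic behind these figures can be checked directly. The snippet below is a rough sketch: it counts only the b^m positions at the deepest level of an unpruned search, and the helper name `positions_at_depth` is chosen for this example.

```python
def positions_at_depth(b: int, m: int) -> int:
    """Number of positions at depth m with a uniform branching factor b."""
    return b ** m


# b = 35 is the chess branching factor quoted above.
for ply in (4, 8, 12):
    print(f"{ply:2d}-ply: {positions_at_depth(35, ply):.2e} positions")

print(positions_at_depth(35, 4))   # 1500625, i.e. roughly 10^6
```

Each additional 4 plies multiplies the work by about 1.5 million, which is why deeper lookahead depends on pruning rather than on raw speed alone.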

COMP-4640: Deterministic games in practice

• Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.

• Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and uses undisclosed methods for extending some lines of search up to 40 ply.

• Othello: human champions refuse to compete against computers, who are too good.

• Go: human champions refuse to compete against computers, who are too bad. In Go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.


http://www.research.ibm.com/deepblue/

http://www.freegames.ws/games/boardgames/othello/othello.htm

http://en.wikipedia.org/wiki/Go_(board_game)


