Page 1:

CS 480: GAME AI
ADVERSARIAL SEARCH 2

5/29/2012, Santiago Ontañón, [email protected]
https://www.cs.drexel.edu/~santi/teaching/2012/CS480/intro.html

Page 2:

Reminders
• Check the BBVista site for the course regularly
• Also: https://www.cs.drexel.edu/~santi/teaching/2012/CS480/intro.html
• The Project 4 description is available
• Project 4 is due June 7th

Page 3:

Outline
• Student Presentations:
  • "Game AI as Storytelling"
  • "Computational Approaches to Story-telling and Creativity"
• Monte-Carlo Search Algorithms
• UCT
• Strategy Simulation


Page 5:

Board Games
• Main characteristic: turn-based
  • The AI has a lot of time to decide the next move

Page 6:

Board Games
• Not just chess…

Page 7:

Board Games
• From an AI point of view:
  • Turn-based
  • Discrete actions
  • Complete information (mostly)
• Those features make these games amenable to game tree search!

Page 8:

Game Tree Search in Complex Games
• Classic minimax assumes (Chess, Checkers, Go…):
  • 2 players
  • Perfect information
  • Turn-taking game
  • Given a state and an action, we can predict the next state
• It is easily generalizable to multiplayer turn-taking games (the max^n algorithm)
• Complex games (like RTS games):
  • Real-time, not turn-taking, simultaneous actions
  • Lots of possible actions: branching factor too large!
  • We cannot exactly predict the next state
  • Imperfect information

Page 9:

Game Tree Search in RTS Games
• Problem:
  • Lots of possible actions, branching factor too large!
• Solution:
  • ???
• Problem:
  • Real-time, no turn taking, simultaneous actions
• Solution:
  • ???

Page 10:

Game Tree Search in RTS Games
• Problem:
  • Lots of possible actions, branching factor too large!
• Solution:
  • Sampling (Monte-Carlo Search)
• Problem:
  • Real-time, no turn taking, simultaneous actions
• Solution:
  • ???

Page 11:

Monte-Carlo Methods
• Idea: use sampling instead of exact calculations
• Simplest Monte-Carlo method: integration
  • Imagine a very complex function f(x); we want to compute the definite integral of f(x) between a and b:

    ∫_a^b f(x) dx ≈ (b − a) · (1/N) Σ_{i=1}^{N} f(x_i)

  • Generate N random numbers x_1, …, x_N between a and b. For each x_i, compute f(x_i), take the average, and multiply by (b − a).
  • For large values of N, this converges to the actual integral!
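This estimator can be sketched in a few lines (the function, bounds, and sample count below are just illustrative):

```python
import random

def mc_integrate(f, a, b, n=100_000):
    """Estimate the definite integral of f over [a, b]:
    average f at n uniform random samples, then scale by (b - a)."""
    total = sum(f(random.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

# Illustration: integrate x^2 over [0, 1]; the exact value is 1/3.
estimate = mc_integrate(lambda x: x * x, 0.0, 1.0)
```

As N grows, the error of the estimate shrinks roughly as 1/sqrt(N), the usual Monte-Carlo convergence rate.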

Page 12:

Monte-Carlo Tree Search
• Monte-Carlo Search:
  • Instead of opening the whole minimax tree
  • Approximate it by sampling (same idea as for the integral)
  • For each possible action: play N games at random until the end, starting with that action
  • If N is large, the average win ratio converges to the expected utility of the action
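The per-move sampling described above can be sketched as follows; the game-state interface (legal_moves, apply, is_terminal, winner, current_player) is an assumed one for illustration, not an API from the course:

```python
import random

def monte_carlo_move(state, n_playouts=100):
    """Pick the move whose random playouts win most often.
    Assumes a hypothetical immutable game interface:
    state.legal_moves(), state.apply(move) -> new state,
    state.is_terminal(), state.winner(), state.current_player."""
    best_move, best_ratio = None, -1.0
    player = state.current_player
    for move in state.legal_moves():
        wins = 0
        for _ in range(n_playouts):
            s = state.apply(move)
            # Play the rest of the game completely at random.
            while not s.is_terminal():
                s = s.apply(random.choice(s.legal_moves()))
            if s.winner() == player:
                wins += 1
        ratio = wins / n_playouts
        if ratio > best_ratio:
            best_move, best_ratio = move, ratio
    return best_move
```

Note that no utility function appears anywhere: each playout is scored only by who eventually wins.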

Page 13:

Minimax vs Monte-Carlo

[Figure: two game trees side by side, Minimax and Monte-Carlo, with utility values U at the leaves]

Page 14:

Minimax vs Monte-Carlo

[Figure: two game trees side by side, Minimax and Monte-Carlo, with utility values U at the leaves]

Minimax opens the complete tree (all possible moves) up to a fixed depth. Then, the Utility function is applied to the leaves.

Page 15:

Minimax vs Monte-Carlo

[Figure: the Monte-Carlo tree, in which each branch below the root is a complete game]

Monte-Carlo search runs, for each possible move at the root node, a fixed number K of random complete games. No need for a Utility function (but one can be used).

Page 16:

Monte-Carlo Search
• Advantages:
  • Scales up better than minimax (less sensitive to branching factors)
  • No need for a utility function! Just play till the end and return the move with the highest probability of winning.
• Disadvantages:
  • Brittle: a good move by the opponent might never be sampled

Page 17:

Monte-Carlo Search Improvements
• Each branch of a Monte-Carlo search tree is a random game.
• Instead of generating games uniformly at random, bias the probability of each move:
  • Example: in chess, favor capturing moves. This is more likely to generate move sequences that make sense!
  • In general: use game-play data to learn which moves are more frequent, and use those probabilities when generating random games.
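A biased playout step amounts to weighted sampling; the weight function below is a hypothetical, game-specific heuristic (e.g. large weights for capturing moves in chess), not something defined in the slides:

```python
import random

def biased_playout_move(moves, weight):
    """Pick one move for a playout, biased by a weight function.
    `weight(move)` is an assumed, game-specific scoring function:
    moves with larger weights are proportionally more likely."""
    weights = [weight(m) for m in moves]
    return random.choices(moves, weights=weights, k=1)[0]
```

The weights can be hand-tuned or estimated as move frequencies from recorded games, as the slide suggests.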

Page 18:

Monte-Carlo Search Uses
• Extremely useful in complex games where minimax cannot be used
• When trying to decide between a set of actions:
  • Just play random games with each action, and select the best one
• Can be used, for example, in:
  • RTS games
  • RPG game battles
  • Board games
  • Etc.

Page 19:

Outline
• Student Presentations:
  • "Game AI as Storytelling"
  • "Computational Approaches to Story-telling and Creativity"
• Monte-Carlo Search Algorithms
• UCT
• Strategy Simulation

Page 20:

Monte-Carlo Tree Search: UCT
• Upper Confidence Tree (UCT) is a state-of-the-art, simple variant of Monte-Carlo Search, responsible for the recent success of Computer Go programs
• Ideas:
  • Sample optimally (UCB)
  • Instead of opening the whole minimax tree, or playing N random games, open only the upper part of the tree and play random games from there

Page 21:

UCT

[Figure: the root node, labeled 0/0, at the current state; the upper part of the diagram is labeled "Tree Search" and the lower part "Monte-Carlo Search"]

Each node stores a count w/t: how many of the games starting from this state have been found to be won (w) out of the total games explored (t) in the current search.

Page 22:

UCT

[Figure: after the first playout, a win, the root count becomes 1/1]

Page 23:

UCT

[Figure: a child node (0/1) has been added below the root, whose count becomes 1/2; the playout from the new node is a loss]

At each iteration, one node of the tree (upper part) is selected and expanded (one node added to the tree). From this new node, a complete game is played out at random (Monte-Carlo).

Page 24:

UCT

[Figure: another node (1/1) is added; the root count becomes 2/3 and the playout is a win]

Page 25:

UCT

[Figure: tree with root 3/4, children 0/1 and 2/2, and a new node 1/1 below; the playout is a win]

The counts w/t are used to determine which nodes to explore next.

Naïve exploration/exploitation policy: 50% expand the best node in the tree, 50% expand a node at random.

Page 26:

UCT

[Figure: same tree as before, root 3/4]

Instead of the naïve 50/50 policy above, UCT uses an optimal sampling policy called UCB (Upper Confidence Bounds), which comes from reinforcement learning.
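The slide does not give the formula; the standard UCB1 rule selects the child i maximizing w_i/n_i + c * sqrt(ln t / n_i), where w_i/n_i are the child's win/visit counts, t is the parent's visit count, and c is an exploration constant (often sqrt(2)). A minimal sketch:

```python
import math

def ucb1(wins, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score for a child node: average win rate plus an
    exploration bonus that shrinks as the node is visited more.
    Unvisited nodes score infinity so they are tried first."""
    if visits == 0:
        return float("inf")
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select_child(children, parent_visits):
    """Return the index of the (wins, visits) pair with the highest
    UCB1 score."""
    return max(range(len(children)),
               key=lambda i: ucb1(children[i][0], children[i][1],
                                  parent_visits))
```

The first term exploits moves that look good so far; the second guarantees that rarely visited moves are still explored, which is exactly the balance the naïve 50/50 policy approximates crudely.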

Page 27:

UCT

[Figure: root 3/5 with children 0/1 and 2/3, and nodes 1/1 and 0/1 below; the playout is a loss]

The tree ensures all relevant actions are explored (this greatly alleviates the randomness that affects Monte-Carlo methods).

Page 28:

UCT

[Figure: same tree, root 3/5]

The random games played from each node of the tree serve to estimate the Utility function. They can be random, or use an opponent model (if available).

Page 29:

UCT
• After a fixed number of iterations K (or after the assigned time is over), UCT analyzes the resulting tree, and the selected action is the one that has been explored most often.
• UCT can search games with much larger state spaces than minimax. It is the standard algorithm for modern (2008 to present) Go-playing programs.
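Putting the four steps together (selection via UCB1, expansion, random playout, backpropagation of the w/t counts), a minimal UCT sketch might look like this; the game-state interface (legal_moves, apply, is_terminal, winner, current_player) is an assumed one for illustration:

```python
import math
import random

class Node:
    """One node of the UCT tree, holding win/visit counts (w/t)."""
    def __init__(self, state, move=None, parent=None):
        self.state, self.move, self.parent = state, move, parent
        self.children = []
        self.untried = [] if state.is_terminal() else list(state.legal_moves())
        self.wins = 0
        self.visits = 0

def uct_search(root_state, iterations=200, c=math.sqrt(2)):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend with UCB1 while fully expanded.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch:
                       ch.wins / ch.visits +
                       c * math.sqrt(math.log(node.visits) / ch.visits))
        # 2. Expansion: add one new child to the tree.
        if node.untried:
            move = node.untried.pop(random.randrange(len(node.untried)))
            child = Node(node.state.apply(move), move, node)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout to the end of the game.
        s = node.state
        while not s.is_terminal():
            s = s.apply(random.choice(s.legal_moves()))
        winner = s.winner()
        # 4. Backpropagation: update w/t counts along the path.
        # A child's wins are counted for the player who moved into it,
        # i.e. the player to move at its parent.
        while node is not None:
            node.visits += 1
            if node.parent is not None and \
               winner == node.parent.state.current_player:
                node.wins += 1
            node = node.parent
    # Select the most-visited move at the root.
    return max(root.children, key=lambda ch: ch.visits).move
```

Returning the most-visited root move (rather than the highest win rate) matches the selection rule described on this slide.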

Page 30:

Outline
• Student Presentations:
  • "Game AI as Storytelling"
  • "Computational Approaches to Story-telling and Creativity"
• Monte-Carlo Search Algorithms
• UCT
• Strategy Simulation

Page 31:

Game Tree Search in RTS Games
• Problem:
  • Lots of possible actions, branching factor too large!
• Solution:
  • Sampling (Monte-Carlo Search)
• Problem:
  • Real-time, no turn taking, simultaneous actions
• Solution:
  • Strategy simulation, rather than turn-based action taking

Page 32:

Strategy Simulation: Example
• Assume we want to use UCT for the Strategy module of an RTS game AI
• Define a collection of "high-level actions" (or strategies) that make sense for the game. For example, in S3:
  • S1: Attack with the units we have
  • S2: Train 4 footmen
  • S3: Train 4 archers
  • S4: Train 4 catapults
  • S5: Train 4 knights
  • S6: Build 2 defense Towers
  • S7: Build 2 defense Towers around a Gold Mine
  • S8: Build 2 defense Towers around a group of Trees
  • S9: Bring units back to the base
  • S10: Train 2 more peasants to gather resources

Page 33:

Strategy Simulation: Example
• Instead of taking turns in executing actions, we assign a "strategy" to each player, and simulate it until completion:

[Figure: on one side, standard minimax alternates single actions (Player 1, Action 1; Player 2, Action 2; Player 1, Action 3). On the other, strategy simulation: each node assigns a strategy and an estimated completion time to each player, with labels such as Player 1: S2 (ETA 240) / Player 2: S3 (ETA 400), Player 1: S1 (ETA 400) / Player 2: S3 (ETA 160), Player 1: S1 (ETA 240) / Player 2: S1 (ETA 400); the tree branches when a strategy completes]
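The simulation step sketched in the figure can be expressed as: jump the abstract game state forward to the earliest strategy completion, which is the next decision point. All names below (including apply_strategy) are hypothetical, not from the slides:

```python
def advance(sim_state, strategies, etas):
    """Advance the abstract simulation to the next decision point:
    jump forward by the smallest remaining ETA, apply the effect of
    each strategy that finishes, and report which players must now
    pick a new strategy.  `apply_strategy` is an assumed name for
    the simplified-model update."""
    dt = min(etas.values())
    for player in etas:
        etas[player] -= dt
    finished = [p for p, remaining in etas.items() if remaining == 0]
    for player in finished:
        sim_state = sim_state.apply_strategy(player, strategies[player])
    return sim_state, finished
```

The search tree then branches only at these completion points, which is why strategy simulation sidesteps the turn-taking assumption of minimax.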

Page 34:

Strategy Simulation
• Requires:
  • A way to simulate strategies: typically a very simplified model
    • E.g., battles decided just by who has more units, or by the added damage of units (taking into account air/ground units)
    • No pathfinding, etc.
    • An abstracted version of the game, e.g.: divide the map into regions, and just count the number of unit types in each region
  • Utility function (optional):
    • If available, there is no need to simulate games till the end when using Monte-Carlo
    • If not available, simply simulate games to the end

Page 35:

UCT for RTS Games
• Applicable to:
  • Strategy (previous example)
  • Attack: where the high-level actions are things like "attack enemy X", "retreat", etc.
  • Economy
• In turn-based games, minimax is executed each turn
• For RTS games: execute every K cycles (e.g. once per second), or once the current action has finished, or when an important event happens (e.g. a new enemy is sighted)
• State of the art:
  • No current commercial games use it
  • Research in experimental games shows its potential

Page 36:

Projects 3 & 4
• Project 4 (and last): Rule-based Strategy for an RTS Game (S3)
• Idea:
  • Create a perception layer that builds a simple knowledge base (logical terms)
  • Create a simple unification algorithm with variable bindings
  • Define a set of actions the rule-based system can execute
  • Define a small set of rules (do not overdo it!)
  • RETE is optional (extra credit)
  • See how well it plays and how easy it is to make the AI play well!
• Anyone want to do a different Project 4? Any ideas?

Page 37:

Next Thursday
• Machine Learning in games (last lecture!)