Page 1: DeepStack - Expert-Level Artificial Intelligence in Heads-Up ......DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker Science, 2017 [2] N. Burch, M. Johanson, M. Bowling

DeepStack: Expert-Level Artificial Intelligence in Heads-Up No-Limit Poker

Lasse Becker-Czarnetzki

University of Heidelberg
Artificial Intelligence for Games

SS 2019

July 11, 2019

1 Perfect vs Imperfect Information Games
    Introduction
    No-Limit Heads-Up Texas Hold'em
    Perfect information strategies

2 DeepStack
    Re-solving (CFR)
    Depth-limited search
    Counterfactual Value Networks
    Sparse lookahead trees

3 Evaluation
    Performance against humans
    Exploitability (LBR)
    Nice features

4 Conclusion

Perfect information games

July 11. 2019 DeepStack Lasse Becker-Czarnetzki 1 / 38

Von Neumann on games

No-Limit Heads-Up Texas Hold'em

2-player zero-sum game
4 betting rounds on "who has the better cards"
2 hole cards (private); 3, 4, then 5 public cards

--> 10^160 decision points

Poker Terms

Big blind
Fold
Check
Call
Bet (raise)
Flop (pre-flop)
Turn
River
Range

Poker Game Tree

Perfect information game

Problems for imperfect information games

Questions

How can we forget supergames without losing necessary information?
How do we solve a subgame when there are no definite states to start from?
How do we evaluate a state when we can't use a single value to summarize a position?

Re-solving

Counterfactual Regret Minimization

Counterfactual: "If I had known..."
Regret: "How much better would I have done if I had done something else instead?"
Minimization: "What strategy minimizes my overall regret?"
Average strategy over all iterations = approximation to a Nash equilibrium
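
As a minimal sketch of the regret idea above, the following code applies regret matching, the per-decision update that CFR builds on, to rock-paper-scissors in self-play, in the spirit of the tutorial in [4]. The toy game, the payoff function, and all names are illustrative assumptions and not DeepStack's implementation. The point it shows is that the average strategy, not the last one, drifts toward the uniform equilibrium.

```python
import random

ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors (toy game chosen for illustration)

def payoff(a, b):
    """Utility of playing a against b: +1 for a win, 0 for a tie, -1 for a loss."""
    return [0, 1, -1][(a - b) % 3]

def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / ACTIONS] * ACTIONS  # no positive regret yet: play uniformly

def train(iterations, seed=0):
    """Self-play for two players; returns each player's average strategy."""
    rng = random.Random(seed)
    regrets = [[0.0] * ACTIONS for _ in range(2)]
    strategy_sum = [[0.0] * ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        moves = [rng.choices(range(ACTIONS), weights=s)[0] for s in strategies]
        for p in range(2):
            me, opp = moves[p], moves[1 - p]
            for a in range(ACTIONS):
                # Regret: how much better action a would have done than the move played.
                regrets[p][a] += payoff(a, opp) - payoff(me, opp)
                strategy_sum[p][a] += strategies[p][a]
    # The average strategy over all iterations approximates the equilibrium.
    return [[s / iterations for s in sums] for sums in strategy_sum]
```

Calling `train(20000)` returns two probability vectors; full CFR applies the same update at every information set of the game tree, weighting values by the probability of reaching it.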

Continual Re-solving

At every action we re-solve the subgame.
We need our own range and the opponent's counterfactual values (CFVs).
CFV = "what-if" value: the expected value if the opponent reaches the public state with hand x.
3 scenarios for updating the range and the CFVs:

    Own action: CFVs = CFVs(action); update the range via Bayes' rule.
    Chance action: CFVs = CFVs(chance action); eliminate impossible card combos.
    Opponent's action: do nothing.
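
The two range updates above can be sketched as follows. This is a minimal illustration under assumptions: a range is stored as a dict from hand (a tuple of two card strings) to probability, `strategy[hand][action]` is the probability that our re-solved strategy takes `action` with that hand, and the range is renormalized after each update. The function names and the toy numbers are hypothetical.

```python
def update_range_own_action(own_range, strategy, action):
    """Bayes' rule: P(hand | action) is proportional to P(hand) * P(action | hand)."""
    weighted = {hand: prob * strategy[hand][action] for hand, prob in own_range.items()}
    total = sum(weighted.values())
    if total == 0:
        raise ValueError("action has zero probability under the current range")
    return {hand: w / total for hand, w in weighted.items()}

def update_range_chance(own_range, new_public_cards):
    """Zero out hands that contain a newly revealed public card, then renormalize."""
    blocked = set(new_public_cards)
    kept = {hand: (0.0 if blocked & set(hand) else prob)
            for hand, prob in own_range.items()}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no hand in the range is compatible with the board")
    return {hand: p / total for hand, p in kept.items()}

# Toy range with three hands and a made-up strategy over two actions.
own_range = {("As", "Ks"): 0.5, ("7h", "2c"): 0.25, ("Qd", "Qc"): 0.25}
strategy = {
    ("As", "Ks"): {"bet": 0.8, "check": 0.2},
    ("7h", "2c"): {"bet": 0.4, "check": 0.6},
    ("Qd", "Qc"): {"bet": 0.8, "check": 0.2},
}
after_bet = update_range_own_action(own_range, strategy, "bet")
after_card = update_range_chance(after_bet, ["Qd"])
```

After an opponent action nothing is updated, so no function is needed for the third case.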

Depth limited search

Solutions

Search from a set of possible states, re-solving multiple times.
Remember the player's range and the opponent's counterfactual values.
Get the evaluation through deep counterfactual value networks.

DeepStack elements summary

Deep Counterfactual Value Networks

2 networks: flop network and turn network
Auxiliary network (pre-flop)
Simple feed-forward NN (7 layers, 500 nodes each, ReLU)
Outer network adjusts the output values to fit a zero-sum game
Input: pot size, public cards, players' ranges
Output: counterfactual values (per player, per hand)
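
The "outer network" step can be pictured as a small correction applied to the raw network output. The sketch below assumes the correction works as described in [1]: each player's estimated values are weighted by that player's range to estimate the game value, and half of the resulting zero-sum error is subtracted from every value. The function name and the list-based representation are illustrative assumptions.

```python
def zero_sum_correction(range_p1, range_p2, values_p1, values_p2):
    """Shift both players' value estimates so the range-weighted game values sum to zero.

    Ranges are probability vectors over hands (each summing to 1); values are the
    estimated counterfactual values for the same hands.
    """
    game_value_p1 = sum(r * v for r, v in zip(range_p1, values_p1))
    game_value_p2 = sum(r * v for r, v in zip(range_p2, values_p2))
    error = game_value_p1 + game_value_p2  # would be zero for exact zero-sum values
    corrected_p1 = [v - error / 2 for v in values_p1]
    corrected_p2 = [v - error / 2 for v in values_p2]
    return corrected_p1, corrected_p2

# Made-up numbers for two hands per player.
range_p1, range_p2 = [0.5, 0.5], [0.25, 0.75]
corrected_p1, corrected_p2 = zero_sum_correction(
    range_p1, range_p2, [1.0, -0.5], [0.4, 0.0]
)
```

Because each range sums to 1, subtracting half the error from every value removes exactly half the error from each player's game value.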

Training

Randomly generated poker situations.
Turn network: 10M situations; flop network: 1M situations.
The turn network is used for depth-limited lookahead in flop network training.

Sparse lookahead trees

Abstraction?

Traditionally, abstraction was used to simplify the game.
Action abstraction and card abstraction --> translation errors.
DeepStack only uses action abstraction in the lookahead.
Card clustering is used for the NN input.
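
To make "action abstraction in the lookahead" concrete, here is a minimal sketch of a sparse action set: fold, call or check, a few pot-fraction bets, and all-in, instead of every legal bet size. The function name, the argument conventions, and the pot fractions are illustrative assumptions and not DeepStack's exact configuration.

```python
def lookahead_actions(pot, to_call, stack, bet_fractions=(0.5, 1.0)):
    """Return a sparse action list as (name, chips put in) pairs.

    pot: chips in the pot, including the bet we are facing
    to_call: chips needed to call (0 if nobody has bet)
    stack: our remaining chips
    bet_fractions: bet sizes as fractions of the pot after calling (assumed values)
    """
    actions = []
    if to_call > 0:
        actions.append(("fold", 0))
    actions.append(("call" if to_call > 0 else "check", min(to_call, stack)))
    pot_after_call = pot + to_call
    for fraction in bet_fractions:
        amount = to_call + int(fraction * pot_after_call)
        # Keep only sizes that are a real raise and smaller than an all-in.
        if to_call < amount < stack:
            actions.append((f"bet {fraction:g} pot", amount))
    if stack > to_call:
        actions.append(("all-in", stack))
    return actions

facing_bet = lookahead_actions(pot=300, to_call=100, stack=1000)
short_stack = lookahead_actions(pot=100, to_call=0, stack=60)
```

A tree built from such lists has only a handful of children per decision node, which is what keeps the lookahead small enough to re-solve at every action.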

Evaluation

Exploitability and play against humans
Problem: variance (luck) --> 100,000 hands needed for statistical significance
--> With AIVAT, 3k hands are equivalent to 90k normal hands
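
A back-of-the-envelope reading of these numbers, as a sketch under two assumptions: the standard error of a mean win rate shrinks as 1/sqrt(n), and "3k hands = 90k normal hands" means both samples give the same standard error. The function names are hypothetical.

```python
import math

def equivalent_variance_reduction(aivat_hands, normal_hands):
    """Variance ratio implied by two sample sizes that give the same standard error."""
    return normal_hands / aivat_hands

def std_reduction(variance_ratio):
    """Fractional drop in per-hand standard deviation for a given variance ratio."""
    return 1.0 - 1.0 / math.sqrt(variance_ratio)

ratio = equivalent_variance_reduction(3_000, 90_000)
reduction = std_reduction(ratio)
```

Under these assumptions the figures correspond to a 30-fold reduction in variance, or roughly an 82% lower per-hand standard deviation.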

Pro players experimental results

Exploitability

Nice to know

Conclusion

DeepStack beats professional poker players in heads-up no-limit Hold'em for the first time.
Connects the heuristic search strategy of perfect-information AI with imperfect-information AI.
Plays an approximate Nash equilibrium strategy --> doesn't exploit weaker players.
No multiplayer.
Can't explain its moves, but strategy tips can be taken away from DeepStack's play.

References

[1] Matej Moravčík, Martin Schmid, Neil Burch, Viliam Lisý, Dustin Morrill, Nolan Bard, Trevor Davis, Kevin Waugh, Michael Johanson, and Michael H. Bowling. DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker. Science, 2017.

[2] N. Burch, M. Johanson, M. Bowling. Proceedings of the Twenty-Eighth Conference on Artificial Intelligence (2014), pp. 602–608.

[3] M. Bowling, N. Burch, M. Johanson, O. Tammelin. Science 347, 145 (2015).

[4] Todd W. Neller and Marc Lanctot. An Introduction to Counterfactual Regret Minimization. In Proceedings of Model AI Assignments, The Fourth Symposium on Educational Advances in Artificial Intelligence (EAAI-2013), 2013.

[5] www.deepstack.ai

[6] www.depthfirstlearning.com/2018/DeepStack

[7] https://www.youtube.com/watch?v=qndXrHcV1sM

Thank you for listening.
Any questions?
