BALANCING INTRANSITIVE RELATIONSHIPS IN MOBA
GAMES USING DEEP REINFORCEMENT LEARNING
Conor Stephens and Chris Exton
Lero – The Science Foundation Ireland Research Centre for Software,
Computer Science & Information Systems (CSIS), University of Limerick, Ireland
ABSTRACT
Balanced intransitive relationships are critical to the depth of strategy and player retention within esports games. Intransitive
relationships comprise the metagame, a collection of strategies and play styles that are viable, each providing counterplay
for other viable strategies. This work presents a framework for testing the balance of massive online battle arena (MOBA)
games using deep reinforcement learning to identify the synergies between characters by measuring their effectiveness
against the other compositions within the game's character roster. This research is designed for game designers and
developers to show how multi-agent reinforcement learning (MARL) can accelerate the balancing process and highlight
potential game-balance issues during development. Our findings show that accurate measurements of game balance can be
obtained with under 10 hours of simulation and reveal imbalances that traditional cost-curve analysis approaches failed
to capture. Furthermore, we found that this approach reduced the imbalance in each character's win rate by 20% in our
example project, a key measurement that would previously have been impossible without collecting data from hundreds of
human-controlled games. The project's source code is publicly available at https://github.com/Taikatou/top-down-shooter.
KEYWORDS
Deep Reinforcement Learning, Game Balance, Design
1. INTRODUCTION
The game balancing process aims to improve a game's aesthetic qualities and ensure the consistency and
fairness of the game's systems and mechanics. Traditionally this process was achieved through a combination
of data collected from playtesting and analytical tools, such as measuring the risk-reward ratio of an item. This
process is becoming progressively more time-consuming with rising levels of complexity in games' designs and
the constant updates that shift a game's balance. Game designers have looked for alternatives when testing the
balance of games, with reinforcement learning being a strong contender for a new solution. Reinforcement
learning could potentially evaluate the quality of the game every evening while the developers are asleep,
accelerating the project's timeline and giving designers more confidence when carrying out playtesting with
participants. The sample problem this paper focuses on is an example project based on the popular MOBA game
genre: an asymmetric multiplayer game played in a square arena with a top-down perspective. We evaluate the
effectiveness of reinforcement learning for assessing the balance of the game by measuring the effectiveness of
different team compositions in accelerated simulated play controlled by deep reinforcement learning agents.
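As a rough illustration of this evaluation loop, the sketch below enumerates pairwise matchups between team compositions and records per-composition win rates. The character names and the random match stub are placeholders for illustration only, not the paper's actual roster or trained agents:

```python
import itertools
import random
from collections import defaultdict

# Hypothetical character roster; names are illustrative placeholders.
ROSTER = ["Tank", "DPS", "DPS Sprint", "Healer"]

def simulate_match(team_a, team_b, rng):
    # Stub standing in for a match played out by trained RL agents.
    # A real implementation would run the game simulation and return
    # True if team_a wins; here the outcome is a coin flip.
    return rng.random() < 0.5

def composition_win_rates(n_matches=100, seed=0):
    """Play every pair of 2-character compositions against each other
    and return each composition's overall win rate."""
    rng = random.Random(seed)
    wins = defaultdict(int)
    games = defaultdict(int)
    teams = list(itertools.combinations(ROSTER, 2))  # all 2-character teams
    for team_a, team_b in itertools.combinations(teams, 2):
        for _ in range(n_matches):
            a_wins = simulate_match(team_a, team_b, rng)
            for comp, won in ((team_a, a_wins), (team_b, not a_wins)):
                wins[comp] += won
                games[comp] += 1
    return {comp: wins[comp] / games[comp] for comp in games}
```

With real agent-driven matches substituted for the stub, the resulting win-rate table directly exposes which compositions lack counterplay.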
2. GAME DESIGN
The goal of any game's design is to optimise the rules and content of a game to progress it further towards
its aesthetic goals (Hunicke, Leblanc and Zubek, 2004). A common goal of game balance is to ensure
entertaining and fair games for the players (Adams, 2009). This can take the form of understanding various
metrics about game mechanics and comparing them with other content in the game's systems. An example of
the game balance process would be to balance a revolver gun; the designer could record the mean, median
The character-specific win rates also changed drastically, with the DPS Sprint becoming the best character at
the cost of the traditional DPS role, as shown in Figure 5. Three characters currently have win rates of
approximately 50%. The standard deviation of win rates also changed between the two experiments, as shown
in Figure 6. This change in the standard deviation is our most significant result: it represents an approximately
20% reduction in the level of imbalance among the game's characters. This is a significant figure and would
allow developers to measure the change in the fairness of their games over time in a significantly more credible
way than the perceived balance previously relied upon in other competitive games.
Figure 5. Win rate of individual characters
Figure 6. The standard deviation of win-rate - Experiment 1 vs Experiment 2
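The imbalance metric itself is straightforward: the standard deviation of per-character win rates, where zero means every character wins equally often. The sketch below uses made-up win rates for illustration, not the paper's measured data:

```python
import statistics

def imbalance(win_rates):
    """Population standard deviation of per-character win rates.
    0.0 would indicate a perfectly balanced roster."""
    return statistics.pstdev(win_rates.values())

# Illustrative numbers only, not the experiment's actual results.
before = {"Tank": 0.62, "DPS": 0.55, "DPS Sprint": 0.43, "Healer": 0.40}
after  = {"Tank": 0.54, "DPS": 0.52, "DPS Sprint": 0.48, "Healer": 0.46}

# Relative reduction in imbalance between the two experiments.
reduction = 1 - imbalance(after) / imbalance(before)
```

Tracking this single number across balance patches gives designers a repeatable, quantitative fairness measure rather than a perceived one.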
10. DISCUSSION
This research has shown that reinforcement learning can measure the imbalance within a multiplayer game.
Furthermore, this solution can aid game designers in identifying the specific mechanics or systems causing
an imbalance. The utility and speed provided by the learning agents can empower designers to rebalance their
game to pursue their aesthetic ambitions further. We expect other developers and researchers to push the
boundaries of what is possible with simulated play, especially concerning the design of a game's systems.
The next area of research we hope to look at is multiplayer Player vs Event games, evaluating the difficulty
of level design in a cooperative multiplayer setting. This could either review the effects of learning signals on
agent cooperation or measure the deviation of procedurally generated content.
International Conferences Interfaces and Human Computer Interaction 2020; and Game and Entertainment Technologies 2020
This research also poses several critical questions about the structure and purpose of reinforcement learning
in game design. The questions that we believe to be of particular importance in this area of research include:
- Can we self-balance a game using adversarial learning, where one learning agent is responsible for the game balance and can change character stats?
- Can we identify character relationships in existing esports games that have open APIs (Application Programming Interfaces), such as Hearthstone or DOTA II?
- How accurate are AI agents' decisions when compared to human players, and by extension, is imitation learning a more accurate model of behaviour?
Thus it is clear that this research has both expanded reinforcement learning into a new area of applicability
and paved the way for an essential discussion on its future applications.
The authors would recommend several improvements that would accelerate the training time of this
research and make it more accessible within the games industry. The first is multiple parallel learning
environments; this would expedite the collection of data for the policy update and could allow the experience
buffer to contain more diverse experiences each iteration. We would also include Self-Play (Silver et al., 2018)
in the learning environment, a design paradigm where the current policy plays against previous policies; this
can show improvement over time and allows the reward estimates to trend upwards instead of remaining
zero-sum. Self-Play would facilitate more stable training and allow the agents to adapt to different playstyles
with the same characters.
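The snapshot mechanism behind self-play can be sketched as a small opponent pool. This is a minimal, framework-agnostic illustration of the paradigm, not the implementation used in this project (toolkits such as Unity ML-Agents provide their own self-play configuration):

```python
import copy
import random

class SelfPlayPool:
    """Keeps snapshots of past policies. Training the current policy against
    sampled snapshots lets its reward trend upwards as it improves, rather
    than hovering at zero-sum against an equally strong mirror opponent."""

    def __init__(self, max_size=10):
        self.snapshots = []
        self.max_size = max_size

    def save(self, policy):
        # Store a frozen copy of the current policy as a future opponent.
        self.snapshots.append(copy.deepcopy(policy))
        if len(self.snapshots) > self.max_size:
            self.snapshots.pop(0)  # discard the oldest snapshot

    def sample_opponent(self, current_policy, rng=random):
        # Before any snapshot exists, fall back to mirror-matching.
        return rng.choice(self.snapshots) if self.snapshots else current_policy
```

Sampling across snapshots of different ages also exposes the agents to a range of playstyles with the same characters, which supports the stability benefit described above.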
ACKNOWLEDGEMENT
This research was supported by the University of Limerick's Computer Science and Information Systems
department and Lero, the Science Foundation Ireland Research Centre for Software.
REFERENCES
Adams, E. (2009). Fundamentals of Game Design. New Riders Publishing, p.329.
Carter, M., Gibbs, M. and Harrop, M. (2012). Metagames, Paragames and Orthogames: A New Vocabulary. In: FDG '12. [online] Association for Computing Machinery, p.1117. Available at: https://doi.org/10.1145/2282338.2282346.
Debus, M. (2017). Metagames: On the Ontology of Games Outside of Games. In: FDG '17. [online] Association for Computing Machinery. Available at: https://doi.org/10.1145/3102071.3102097.
Hunicke, R., Leblanc, M. and Zubek, R. (2004). MDA: A Formal Approach to Game Design and Game Research. AAAI Workshop - Technical Report, 1.
Jaffe, A., Miller, A., Andersen, E., Liu, Y., Karlin, A. and Popović, Z. (2012). Evaluating Competitive Game Balance with Restricted Play. In: AIIDE'12. AAAI Press, p.2631.
Juliani, A., Berges, V., Vckay, E., Gao, Y., Henry, H., Mattar, M. and Lange, D. (2018). Unity: A General Platform for Intelligent Agents. CoRR, [online] abs/1809.02627. Available at: http://arxiv.org/abs/1809.02627.
Narvekar, S. (2017). Curriculum Learning in Reinforcement Learning. [online] pp.5195-5196. Available at: https://doi.org/10.24963/ijcai.2017/757.
Pathak, D. et al. (2017). Curiosity-driven Exploration by Self-supervised Prediction.
Salen, K. and Zimmerman, E. (2003). Rules of Play: Game Design Fundamentals. The MIT Press.
Schulman, J. et al. (2017). Proximal Policy Optimization Algorithms. CoRR, [online] abs/1707.06347. Available at: http://arxiv.org/abs/1707.06347.
Silva, F. (2019). Evolving the Hearthstone Meta. CoRR, [online] abs/1907.01623. Available at: http://arxiv.org/abs/1907.01623.
Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A. et al. (2018). A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, 362(6419), 1140-1144. doi: 10.1126/science.aar6404.
Gudmundsson, S. F. et al. (2018). Human-Like Playtesting with Deep Learning. pp.1-8.