A Review on Hanabi Game for Multiagent Learning using Artificial Intelligence
Keywords: Ad-Hoc Team, Communication, Cooperative, Imperfect Information

Abstract
The popular board game Hanabi combines cooperative gameplay with imperfect information. Partial observability makes the game a challenging domain for AI research, especially when an AI must cooperate with a human player. Imperfect-information games are nontrivial because of the complicated interplay of the players' policies. The combination of cooperation, imperfect information, and limited communication makes Hanabi an ideal challenge in both self-play and ad-hoc team settings, where partners and their strategies are not known in advance. In this paper, we review games of this type and the artificial intelligence and machine learning techniques used to play them. We expect this article will help unify and motivate future research, drawing on the abundant existing literature to promote fruitful work in the multiagent community.
References
- N. Bard, J. N. Foerster, S. Chandar, N. Burch, M. Lanctot, H. F. Song, E. Parisotto, V. Dumoulin, S. Moitra, E. Hughes, I. Dunning, S. Mourad, H. Larochelle, M. G. Bellemare, M. Bowling, “The Hanabi challenge: A new frontier for AI research,” CoRR, vol. abs/1902.00506, 2019. [Online]. Available: http://arxiv.org/abs/1902.00506
- M. Eger, C. Martens, P. Sauma Chacon, M. Alfaro Cordoba, J. Hidalgo-Cespedes, “Operationalizing Intentionality to Play Hanabi with Human Players,” IEEE Transactions on Games, doi: 10.1109/TG.2020.3009359.
- R. Canaan, J. Togelius, A. Nealen, S. Menzel, “Diverse Agents for Ad-Hoc Cooperation in Hanabi,” IEEE 2019.
- R. Canaan, J. Togelius, H. Shen, A. Nealen, R. Torrado, S. Menzel, “Evolving Agents for the Hanabi 2018 CIG Competition,” ArXiv e-prints, Sep. 2018.
- J. Walton-Rivers, P. R. Williams, R. Bartle, D. Perez-Liebana, S. M. Lucas, “Evaluating and modelling Hanabi-playing agents,” in Evolutionary Computation (CEC), 2017 IEEE Congress on. IEEE, 2017, pp. 1382-1389.
- R. Patil Rashmi, Y. Gandhi, V. Sarmalkar, P. Pund and V. Khetani, "RDPC: Secure Cloud Storage with Deduplication Technique," 2020 Fourth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), Palladam, India, 2020, pp. 1280-1283, doi: 10.1109/I-SMAC49090.2020.9243442.
- M. Eger, C. Martens, and M. Alfaro Cordoba, “An intentional AI for Hanabi,” in 2017 IEEE Conference on Computational Intelligence and Games (CIG), 2017, pp. 68-75.
- P. Sauma Chacon and M. Eger, “Pandemic as a challenge for human-AI cooperation,” in Proceedings of the AIIDE workshop on Experimental AI in Games, 2019.
- B. Bouzy “Playing Hanabi near-optimally,” in 15th International Conference on Advances in Computer Game (ACG15), ICGA. Cham: Springer International Publishing, 2017, pp. 51-62.
- T. T. Nguyen, N. D. Nguyen, and S. Nahavandi, “Deep Reinforcement Learning for Multiagent Systems: A Review of Challenges, Solutions, and Applications,” IEEE, April 28, 2020.
- Y. Tian, Q. Gong, and T. Jiang, “Joint Policy Search for Multi-agent Collaboration with Imperfect Information,” 34th Conference on Neural Information Processing Systems (NeurIPS), 2020.
- E. T. Gottwald, M. Eger and C. Martens, “I see what you see: Integrating eye tracking into Hanabi playing agents,” CEUR Workshop Proceedings, vol. 2282, paper EXAG_112.
- M. Eger and C. Martens, “A Browser-based interface for the exploration and evaluation of Hanabi AIs,” https://github.com/yawgmoth/pyhanabi, 2017.
- L. Cheng, T. Guo, Y. Liu and J.-M. Yang, “Survey of Multi-Agent Strategy Based on Reinforcement Learning,” IEEE, 2020.
- R. Canaan, X. Gao, Y. Chung, J. Togelius, A. Nealen, and S. Menzel, “Evaluating RL Agents in Hanabi with Unseen Partners,” AAAI Workshop on Reinforcement Learning in Games, 2020.
- J. (JP) Park, “Advancing AI: Hanabi Challenge,” Inference and Representation, 2019.
- J. Goodman, “Re-determinizing MCTS in Hanabi,” arXiv preprint arXiv:1902.06075, 2019.
- J. N. Foerster, H. F. Song, E. Hughes, N. Burch, I. Dunning, S. Whiteson, M. Botvinick, and M. Bowling, “Bayesian action decoder for deep multi-agent reinforcement learning,” arXiv preprint arXiv:1811.01458, 2018.
- H. Osawa, “Solving Hanabi: Estimating hands by opponent’s actions in cooperative Game with incomplete information,” in Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015. [Online]. Available: https://aaai.org/ocs/index.php/WS/AAAIW15/paper/view/10167
- P. Hernandez-Leal, B. Kartal, and M. E. Taylor, “A survey and critique of multiagent deep reinforcement learning,” Springer, 16 October 2019.
- C. Cox, J. De Silva, P. Deorsey, F. H. Kenter, T. Retter, and J. Tobin, “How to make the perfect firework display: Two strategies for Hanabi,” Mathematics Magazine, vol. 88, no. 5, pp. 323-336, 2015.
- R. Canaan, X. Gao, Y. Chung, J. Togelius, A. Nealen, and S. Menzel, “Behavioural Evaluation of Hanabi Rainbow DQN Agents and Rule-Based Agents,” 16th AAAI Conference on AIIDE, 2020.
- P. Stone, G. A. Kaminka, S. Kraus, and J. S. Rosenschein, “Ad-Hoc Autonomous Agent Teams: Collaboration without Pre-Coordination”, American Association for Artificial Intelligence, 2010.
- A. Bauza, “Hanabi,” https://boardgamegeek.com/boardgame/98778/hanabi, 2010.
- G. Brockman, V. Cheung, L. Pettersson, J. Schneider, J. Schulman, J. Tang, W. Zaremba, “OpenAI Gym,” arXiv:1606.01540 [cs.LG], 5 Jun 2016.
- N. Brown and T. Sandholm, “Superhuman AI for heads-up no-limit poker: Libratus beats top professionals,” Science, doi: 10.1126/science.aao1733, 2017.
- M. Moravcik, M. Schmid, N. Burch, V. Lisy, D. Morrill, N. Bard, T. Davis, K. Waugh, M. Johanson, M. Bowling, “DeepStack: Expert-Level Artificial Intelligence in heads-up no-limit Poker,” arXiv:1701.01724 [cs.AI], 3 Mar 2017.
- G. Tesauro, “Temporal difference learning and TD-Gammon,” Communications of the ACM, vol. 38, no. 3, March 1995.
- D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel and D. Hassabis, “Mastering the game of Go with deep neural networks and tree search,” Nature, vol. 529, Jan 2016, doi: 10.1038/nature16961.
- N. Ensmenger, “Is chess the drosophila of artificial intelligence? A social history of an algorithm,” Social Studies of Science, doi: 10.1177/0306312711424596, 2011.
- A. A. Sanchez-Ruiz, M. Miranda, “A machine learning approach to predict the winner in StarCraft based on influence maps,” Entertainment Computing, http://dx.doi.org/10.1016/j.entcom.2016.11.005.
- J. Schaeffer, R. Lake, P. Lu, and M. Bryant, “Chinook: The world man-machine checkers champion,” AI Magazine, vol. 17, no. 1, 1996, AAAI.
- M. Campbell, A. J. Hoane Jr., Feng-hsiung Hsu, “Deep Blue,” Artificial Intelligence, Elsevier, PII: S0004-3702(01)00129-1, 2001.
- A. Iraci, “Conventions for Hanabi,” http://hanabi.pythonanywhere.com/static/Hanabi.pdf, 2018.
License
Copyright (c) IJSRSET

This work is licensed under a Creative Commons Attribution 4.0 International License.