Neural networks and Monte-Carlo method usage in multi-agent systems for sudoku problem solving

Katerina Poloziuk; Vadym Yaremenko

doi:10.15587/2706-5448.2020.218427

Authors

Katerina Poloziuk, National Technical University of Ukraine «Igor Sikorsky Kyiv Polytechnic Institute», 37, Peremohy ave., Kyiv, Ukraine, 03056, https://orcid.org/0000-0002-9892-5196
Vadym Yaremenko, National Technical University of Ukraine «Igor Sikorsky Kyiv Polytechnic Institute», 37, Peremohy ave., Kyiv, Ukraine, 03056, https://orcid.org/0000-0001-8557-6938

DOI:

https://doi.org/10.15587/2706-5448.2020.218427

Keywords:

DQN, DDQN, TD, PPO, neural network, deep learning, reinforcement learning, multi-agent system, MCTS, Q-Learning.

Abstract

The object of research is multi-agent systems based on Deep Reinforcement Learning algorithms, together with an analysis of ways to establish interaction between the intelligent agents within such a system. Part of the paper also covers the management and administration of agents at the meta-level: external controllers and tools that optimize their work, and architectural solutions intended to accelerate agent training. A full-fledged multi-agent system of this kind should be easy to extend and should both speed up agent training and improve problem-solving quality.

In this paper, the following neural network models were considered: DQN, DDQN, PPO, and TD (treated here as methods based on Q-Learning), along with an approach that combines a neural network with Monte-Carlo tree search. The models were tested on the Sudoku problem using a dataset of 5039 combinations with dimensions 2x2, 4x4, and 9x9. Several sets of agent rewards were used, and the representation of data during learning and problem solving is described. A multi-agent system was also built on the model that uses Monte-Carlo tree search.
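As a brief illustration of how two of the listed models differ, the sketch below contrasts the standard bootstrap targets of DQN and DDQN (Double DQN). It is not taken from the paper: the function names, the discount factor, and the toy Q-values are assumptions chosen only to show the difference between the two update rules.

```python
# Illustrative sketch only: function names and toy values are assumptions,
# not the authors' implementation.

def dqn_target(reward, next_q_target, gamma=0.9, done=False):
    """DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def ddqn_target(reward, next_q_online, next_q_target, gamma=0.9, done=False):
    """Double DQN: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


# Hypothetical Q-values for the next state, one entry per action.
next_q_online = [1.0, 2.0, 0.5]
next_q_target = [3.0, 1.5, 0.2]

y_dqn = dqn_target(1.0, next_q_target)                   # uses max of target values
y_ddqn = ddqn_target(1.0, next_q_online, next_q_target)  # uses action picked by online net
print(y_dqn, y_ddqn)
```

Because DDQN decouples action selection from action evaluation, its target is never larger than the DQN target for the same inputs, which is the usual motivation for it as a way to reduce overestimation.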

The study shows that, for tasks in a complex environment, the models based on Q-Learning are practically ineffective, as the plots in the paper confirm. Training these models is also quite demanding on workstation hardware. The Monte-Carlo tree search method performs well: even with a small number of iterations, it achieves better results than the other Deep Learning methods (45–50 % accuracy for 9x9). However, its significant drawbacks are the complexity of training the model and hardware requirements that are too high for this kind of research.
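The abstract does not state which selection rule the tree search uses. As a hedged illustration of how a neural network can guide Monte-Carlo tree search, the sketch below implements a PUCT-style selection step of the kind used in AlphaGo-type systems. The node fields (`q`, `prior`, `visits`), the constant `c_puct`, and the sample numbers are all assumptions for illustration.

```python
import math

# Illustrative sketch only: node layout, c_puct, and the sample values are
# assumptions, not details taken from the paper.

def puct_score(q, prior, parent_visits, child_visits, c_puct=1.5):
    """Mean value estimate plus an exploration bonus weighted by the network prior."""
    exploration = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q + exploration


def select_child(children, c_puct=1.5):
    """Return the index of the child with the highest PUCT score.

    children: list of dicts with keys 'q' (mean value), 'prior' (policy
    network probability) and 'visits' (visit count).
    """
    parent_visits = sum(child["visits"] for child in children)
    scores = [
        puct_score(child["q"], child["prior"], parent_visits, child["visits"], c_puct)
        for child in children
    ]
    return max(range(len(children)), key=scores.__getitem__)


# Hypothetical candidate moves for one Sudoku cell.
children = [
    {"q": 0.5, "prior": 0.2, "visits": 10},
    {"q": 0.1, "prior": 0.7, "visits": 0},
    {"q": 0.4, "prior": 0.1, "visits": 2},
]
chosen = select_child(children)
print(chosen)
```

In this toy case the unvisited move with the high prior is selected, which shows how the network's policy output can steer the search toward promising moves even when few iterations are available.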

Author Biographies

Katerina Poloziuk, National Technical University of Ukraine «Igor Sikorsky Kyiv Polytechnic Institute», 37, Peremohy ave., Kyiv, Ukraine, 03056

Department of System Design

Vadym Yaremenko, National Technical University of Ukraine «Igor Sikorsky Kyiv Polytechnic Institute», 37, Peremohy ave., Kyiv, Ukraine, 03056

Postgraduate Student, Assistant

Department of System Design
