no code implementations • 1 Feb 2024 • Guangzheng Hu, Yuanheng Zhu, Haoran Li, Dongbin Zhao
Based on it, we present a novel multi-agent reinforcement learning framework, Factorized Multi-Agent MiniMax Q-Learning (FM3Q), which factorizes the joint minimax Q function into individual ones and iteratively solves for the IGMM-satisfied minimax Q functions for two-team zero-sum Markov games (2t0sMGs).
no code implementations • 5 Dec 2022 • Jiajun Chai, Wenzhang Chen, Yuanheng Zhu, Zong-xin Yao, Dongbin Zhao
Then the inner loop tracks the macro behavior with a flight controller by calculating the actual input signals for the aircraft.
no code implementations • 10 Oct 2020 • Guangzheng Hu, Yuanheng Zhu, Dongbin Zhao, Mengchen Zhao, Jianye Hao
Then the design of the event-triggered strategy is formulated as a constrained Markov decision problem, and reinforcement learning finds the best communication protocol that satisfies the limited bandwidth constraint.
Multiagent Systems
no code implementations • 31 Mar 2020 • Zhentao Tang, Yuanheng Zhu, Dongbin Zhao, Simon M. Lucas
In contrast to conventional RHEA, an opponent model is proposed and optimized by supervised learning with cross-entropy, and by reinforcement learning with policy gradient and Q-learning, based on historical observations of the opponent.
no code implementations • 23 Dec 2019 • Kun Shao, Zhentao Tang, Yuanheng Zhu, Nannan Li, Dongbin Zhao
In this paper, we survey the progress of DRL methods, including value-based, policy gradient, and model-based algorithms, and compare their main techniques and properties.
1 code implementation • 3 Apr 2018 • Kun Shao, Yuanheng Zhu, Dongbin Zhao
With reinforcement learning and curriculum transfer learning, our units are able to learn appropriate strategies in StarCraft micromanagement scenarios.