MDDP: Making Decisions From Different Perspectives in Multiagent Reinforcement Learning
Article 2023 en
Authors
Wei Li
Ziming Qiu
Shitong Shao
Abstract
Multi-Agent Reinforcement Learning (MARL) has made remarkable progress in recent years. However, in most MARL methods, agents share a policy or value network, which tends to produce similar behaviors across agents and thus limits the method's flexibility in handling complex tasks. To enhance the diversity of agent behaviors, we propose a novel method, Making Decisions from Different Perspectives (MDDP). This method enables agents to switch flexibly between different policy roles and make decisions from different perspectives, which can improve the adaptability of policy learning in complex scenarios. Specifically, in MDDP, we design a new Self-attention and Gated Recurrent Unit (GRU) based Dueling Architecture Network (SG-DAN) to estimate the individual Q-values. SG-DAN contains two components: the new Self-Attention based Role-switching network (SAR) and the GRU-based State value Estimation network (GSE). SAR is responsible for action advantage estimation, and GSE is responsible for state value estimation. Experimental results on the challenging StarCraft II micromanagement benchmark both support the soundness of MDDP's modeling and demonstrate its superior performance over related advanced approaches.
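As a rough illustration of how a dueling architecture turns a state value and per-action advantages into individual Q-values, the sketch below uses the mean-subtracted aggregation from the standard dueling network design. The abstract does not state which aggregation SG-DAN uses, so this form is an assumption, and the function name `dueling_q_values` and the numeric inputs standing in for the GSE and SAR outputs are hypothetical.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine a scalar state value with per-action advantages.

    Uses the mean-subtracted aggregation of the standard dueling
    architecture. Whether MDDP uses exactly this form is an assumption.
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + (a - mean_adv) for a in advantages]


# Hypothetical outputs for one agent with three available actions.
v_from_gse = 2.0               # stand-in for the GSE state-value estimate
a_from_sar = [1.0, 0.0, -1.0]  # stand-in for the SAR advantage estimates

q_values = dueling_q_values(v_from_gse, a_from_sar)
greedy_action = max(range(len(q_values)), key=q_values.__getitem__)
```

Subtracting the mean advantage leaves the ranking of actions unchanged, so the greedy action is determined by the advantage stream alone while the state value shifts all Q-values together.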