Abstract
In this paper, we consider a distributed reinforcement learning setting in which agents communicate with a central entity in a shared environment to maximize a global reward. A main challenge in this setting is that the randomness of the wireless channel perturbs each agent's model update, while multiple agents' updates may interfere with one another when communicating under limited bandwidth. To address this issue, we propose a novel distributed reinforcement learning algorithm based on the alternating direction method of multipliers (ADMM) and over-the-air aggregation using an analog transmission scheme, referred to as A-RLADMM. Our algorithm incorporates the wireless channel into the ADMM formulation, which enables agents to transmit each element of their updated models over the same channel using analog communication. Numerical experiments on a multi-agent collaborative navigation task show that our proposed algorithm significantly outperforms the digital communication baseline of A-RLADMM (D-RLADMM), the lazily aggregated policy gradient (RL-LAPG), and the analog and digital communication versions of vanilla FL (A-FRL and D-FRL, respectively).
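The abstract does not state the A-RLADMM update rules, so the following is only a generic illustration of the over-the-air aggregation idea it mentions, not the authors' algorithm. It is a minimal sketch assuming a per-element real-valued channel gain for each agent and additive Gaussian noise at the receiver. The function name `over_the_air_sum` and its parameters are hypothetical.

```python
import random


def over_the_air_sum(models, channels, noise_std=0.0, rng=None):
    """Simulate one analog uplink in which all agents transmit at once.

    models:   list of per-agent parameter vectors (lists of floats), equal length
    channels: list of per-agent channel-gain vectors, same shape as models
              (assumed real-valued here purely for illustration)
    Returns y with y[k] = sum_i channels[i][k] * models[i][k] + noise.
    """
    rng = rng or random.Random(0)
    dim = len(models[0])
    received = []
    for k in range(dim):
        # Simultaneous analog transmissions add up on the shared channel.
        superposed = sum(h[k] * theta[k] for h, theta in zip(channels, models))
        # Receiver noise; with noise_std=0.0 this term is zero.
        received.append(superposed + rng.gauss(0.0, noise_std))
    return received


# Two agents, two model elements each, noiseless channel.
models = [[1.0, 2.0], [3.0, 4.0]]
channels = [[1.0, 1.0], [0.5, 2.0]]
print(over_the_air_sum(models, channels))  # [2.5, 10.0]
```

In this sketch the central entity receives only the channel-weighted sum of the agents' model elements, never the individual updates. This is why a formulation that accounts for the channel gains, as the abstract describes, is needed to make use of the received signal.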
| Original language | English |
|---|---|
| Pages (from-to) | 311-320 |
| Number of pages | 10 |
| Journal | IEEE Transactions on Cognitive Communications and Networking |
| Volume | 8 |
| Issue number | 1 |
| State | Published - 1 Mar 2022 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2015 IEEE.
Keywords
- ADMM
- Analog communications
- Distributed optimization
- Policy gradient
- Reinforcement learning
ASJC Scopus subject areas
- Hardware and Architecture
- Computer Networks and Communications
- Artificial Intelligence
Fingerprint
Dive into the research topics of 'Communication-Efficient and Federated Multi-Agent Reinforcement Learning'. Together they form a unique fingerprint.