Abstract
Non-orthogonal multiple access (NOMA) is regarded as a promising solution for improving the energy efficiency and reducing the latency of unmanned aerial vehicle (UAV)-aided networks. In this letter, we consider an energy-efficient multi-UAV data collection system that incorporates hybrid NOMA. Explicitly, a joint trajectory design and power allocation problem is formulated to maximize the energy utilization of the system. The resulting problem is a mixed-integer non-convex problem that also involves continuous variables. To tackle this challenging problem, we utilize a multi-agent deep reinforcement learning (MADRL) approach, namely multi-agent Twin Delayed Deep Deterministic Policy Gradient (MATD3), which uses clipped double Q-learning and deep networks to reduce overestimation bias. Furthermore, a reward shaping method is applied to improve learning efficiency and speed up convergence. Extensive experiments show that the proposed hybrid-NOMA-enhanced multi-UAV system outperforms the pure NOMA and orthogonal multiple access (OMA) cases.
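As an illustration of the clipped double Q-learning idea mentioned in the abstract, the following is a minimal sketch of the standard TD3-style target computation: the bootstrap value is the element-wise minimum of two target critics, which tends to counteract overestimation. It is not the letter's implementation; the function name `clipped_double_q_target`, the discount `gamma`, and the batch layout are assumptions made for the example.

```python
import numpy as np


def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Return y = r + gamma * (1 - done) * min(Q1', Q2') for a batch.

    All names and the default gamma are illustrative assumptions.
    q1_next and q2_next are the two target critics' estimates for the
    next state-action pair; taking their minimum is the "clipping".
    """
    reward = np.asarray(reward, dtype=float)
    done = np.asarray(done, dtype=float)
    q_min = np.minimum(
        np.asarray(q1_next, dtype=float),
        np.asarray(q2_next, dtype=float),
    )
    # Terminal transitions (done == 1) do not bootstrap from the critics.
    return reward + gamma * (1.0 - done) * q_min


if __name__ == "__main__":
    # Two transitions: the second one is terminal.
    y = clipped_double_q_target(
        reward=[1.0, 0.0],
        done=[0, 1],
        q1_next=[2.0, 5.0],
        q2_next=[3.0, 4.0],
        gamma=0.5,
    )
    print(y)
```

In a multi-agent setting such as MATD3, each agent would typically compute such a target with its own pair of critics; how the critics are conditioned on other agents' observations and actions is not specified in the abstract and is left out of this sketch.
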
Original language | English |
---|---|
Pages (from-to) | 2722-2726 |
Number of pages | 5 |
Journal | IEEE Communications Letters |
Volume | 27 |
Issue number | 10 |
State | Published - 1 Oct 2023 |
Bibliographical note
Publisher Copyright: © 1997-2012 IEEE.
Keywords
- Multi-agent deep reinforcement learning
- data collection
- non-orthogonal multiple access
- unmanned aerial vehicle
ASJC Scopus subject areas
- Electrical and Electronic Engineering
- Computer Science Applications
- Modeling and Simulation