Abstract
Future sixth-generation (6G) networks require efficient resource management to support a variety of services. This paper addresses the problem of maximizing user rates in a beyond directional reconfigurable intelligent surface (BD-RIS)-assisted network with non-orthogonal multiple access (NOMA) and integrated sensing and communication (ISAC) users. Exploiting the gains offered by these frameworks, however, requires jointly tuning the BD-RIS phase shifts and the NOMA power allocation, which is an inherently non-convex problem. We model this coupling as a continuous-action Markov decision process and solve it using twin-delayed deep deterministic policy gradient (TD3) reinforcement learning. The learned policy adaptively selects power-allocation factors and BD-RIS phase shifts, thereby boosting both communication and sensing rates under quality-of-service constraints. Simulation results confirm that the proposed deep reinforcement learning (DRL) scheme significantly outperforms conventional heuristics, demonstrating its potential for real-time resource optimization in 6G networks.
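As an illustration of the continuous action described in the abstract, the sketch below shows one way a bounded actor output (for example, the tanh-squashed output common in TD3 implementations) could be decoded into power-allocation factors and phase shifts. It is a minimal sketch under stated assumptions, not the paper's implementation: the function name `decode_action`, the action layout (power entries first, then phase entries), and the normalization are all hypothetical, and any structural constraints on the BD-RIS scattering matrix are not modeled.

```python
import numpy as np


def decode_action(action, num_users, num_elements):
    """Map a raw action in [-1, 1] to power factors and phase shifts.

    Hypothetical layout (an assumption, not taken from the paper):
    the first `num_users` entries encode power allocation, the
    remaining `num_elements` entries encode phase shifts.
    """
    action = np.clip(np.asarray(action, dtype=float), -1.0, 1.0)
    if action.shape != (num_users + num_elements,):
        raise ValueError("unexpected action dimension")

    # Shift the power entries to be strictly positive, then normalize
    # so the power-allocation factors sum to 1.
    raw_power = action[:num_users] + 1.0 + 1e-9
    power_factors = raw_power / raw_power.sum()

    # Map the phase entries linearly from [-1, 1] to [0, 2*pi].
    phases = (action[num_users:] + 1.0) * np.pi
    return power_factors, phases
```

In a training loop, the decoded values would feed whatever rate and quality-of-service model defines the reward; that model is not given in the abstract and is left out here.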
| Original language | English |
|---|---|
| Journal | IEEE Wireless Communications Letters |
| DOIs | |
| State | Accepted/In press - 2025 |
Bibliographical note
Publisher Copyright: © 2012 IEEE.
Keywords
- Beyond-directional reconfigurable intelligent surfaces (BD-RIS)
- deep reinforcement learning (DRL)
- integrated sensing and communication (ISAC)
- non-orthogonal multiple access (NOMA)
ASJC Scopus subject areas
- Control and Systems Engineering
- Electrical and Electronic Engineering