No-Pain No-Gain: DRL Assisted Optimization in Energy-Constrained CR-NOMA Networks

Zhiguo Ding, Robert Schober, H. Vincent Poor

Research output: Contribution to journal › Article › peer-review

Abstract

This paper applies machine learning to optimize the transmission policies of cognitive radio inspired non-orthogonal multiple access (CR-NOMA) networks, where time-division multiple access (TDMA) is used to serve multiple primary users and an energy-constrained secondary user is admitted to the primary users' time slots via NOMA. During each time slot, the secondary user performs two tasks: data transmission and energy harvesting based on the signals received from the primary users. The goal of the paper is to maximize the secondary user's long-term throughput by optimizing its transmit power and the time-sharing coefficient for its two tasks. The long-term throughput maximization problem is challenging because it requires decisions that yield long-term gains but might result in short-term losses. For example, when a primary user with large channel gains transmits in a given time slot, intuition suggests that the secondary user should not carry out data transmission, due to the strong interference from the primary user, but should perform energy harvesting only, which results in zero data rate for this time slot but yields potential long-term benefits. In this paper, a deep reinforcement learning (DRL) approach is applied to emulate this intuition, where the deep deterministic policy gradient (DDPG) algorithm is employed together with convex optimization. Our simulation results demonstrate that the proposed DRL assisted NOMA transmission scheme can yield significant performance gains over two benchmark schemes.

Original language: English (US)
Pages (from-to): 5917-5932
Number of pages: 16
Journal: IEEE Transactions on Communications
Volume: 69
Issue number: 9
DOIs
State: Published - Sep 2021

All Science Journal Classification (ASJC) codes

  • Electrical and Electronic Engineering

Keywords

  • energy harvesting
  • cognitive radio communications
  • deep reinforcement learning
  • Non-orthogonal multiple access
