Abstract
This article considers a power allocation problem in energy harvesting downlink non-orthogonal multiple access (NOMA) systems, in which a transmitter sends messages to their respective receivers using harvested energy. To tackle this problem, we employ a reinforcement learning approach based on a shallow neural network structure. We prove that the optimal power allocation policy and the optimal action-value function depend monotonically on some of their input variables, and the shallow neural network structure is designed based on the properties revealed in the proof. Unlike deep learning methods, which tend to require substantial computational resources, this structure fully captures the characteristics of the desired function with a single hidden layer. The optimized structure also makes learning agents robust and reliable when learning from randomly occurring data. Furthermore, we provide comprehensive experimental results in harsh environments with various arbitrary factors to demonstrate the robustness of the proposed learning approach compared with deep neural networks that lack such theoretical grounding. It is also shown that the proposed learning process converges to a policy that outperforms existing power allocation algorithms.
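To illustrate the kind of structure the abstract describes, the following is a minimal sketch (not the paper's actual implementation) of a single-hidden-layer action-value approximator whose output is constrained to be monotone non-decreasing in selected input variables. The class name, layer sizes, and the weight-projection trick are assumptions for illustration: monotonicity is enforced by keeping the weights attached to the designated inputs non-negative and using a monotone activation.

```python
import numpy as np

class MonotoneShallowQ:
    """Illustrative shallow Q-network, monotone in selected inputs.

    Hypothetical sketch: a single hidden layer with tanh activation.
    Monotonicity in the inputs listed in `monotone_idx` (e.g., battery
    level or channel gain) is enforced by projecting the corresponding
    first-layer weight columns and all second-layer weights to be
    non-negative; tanh is monotone increasing, so the composition is
    non-decreasing in those inputs.
    """

    def __init__(self, n_inputs, n_hidden, monotone_idx, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.1, size=(n_hidden, n_inputs))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.uniform(0.0, 0.1, size=n_hidden)  # non-negative
        self.b2 = 0.0
        self.monotone_idx = list(monotone_idx)
        self._project()

    def _project(self):
        # Clip weights on the monotone inputs to be non-negative; this
        # projection would be reapplied after each gradient update.
        self.W1[:, self.monotone_idx] = np.maximum(
            self.W1[:, self.monotone_idx], 0.0
        )
        self.W2 = np.maximum(self.W2, 0.0)

    def forward(self, x):
        # Single hidden layer: the abstract's point is that one layer
        # suffices once the monotone structure is built in.
        h = np.tanh(self.W1 @ x + self.b1)
        return float(self.W2 @ h + self.b2)
```

Increasing a monotone input (here index 0) can then never decrease the predicted action value, matching the monotonicity property the paper proves for the optimal action-value function.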
Original language | English (US) |
---|---|
Article number | 9174757 |
Pages (from-to) | 982-997 |
Number of pages | 16 |
Journal | IEEE Journal on Selected Areas in Communications |
Volume | 39 |
Issue number | 4 |
DOIs | |
State | Published - Apr 2021 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Electrical and Electronic Engineering
Keywords
- Energy harvesting communications
- broadcast channel
- non-orthogonal multiple access
- power allocation
- reinforcement learning
- shallow neural network