Abstract
This study addresses the power allocation problem of maximizing the sum of the generalized mutual information, which refers to the achievable rate with imperfect channel state information, through a reinforcement learning (RL) approach in energy harvesting communications. In contrast to conventional deep RL applications, which impose a large computational load on devices because they rely on deep neural networks, we adopt shallow RL architectures that incorporate the structural properties of the optimal power allocation policy. To design shallow architectures that can fully capture the desired power allocation policy, we derive the partial monotonicity of, and bounds on, the policy and value functions. These structural properties form the mathematical basis on which the shallow architecture is constructed. We use a deterministic policy gradient method with monotonically shape-constrained approximators, which allows us to avoid overly complicated deep neural networks that are unsuitable for low-power devices. Through various experiments, we visualize the solutions derived from the proposed shallow architectures and demonstrate that the proposed method outperforms existing power allocation policies and exhibits greater robustness owing to the optimal structural properties.
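To make the idea of a monotonically shape-constrained shallow approximator concrete, the following is a minimal sketch and not the architecture used in the paper. It assumes, purely for illustration, that the policy should be non-decreasing in the battery level and bounded between zero and the available energy. The abstract does not state which variables the partial monotonicity applies to, so the class name `MonotoneShallowPolicy`, the two inputs, and the hidden width are all hypothetical. Monotonicity is enforced by passing the weights on the constrained input through a softplus so they stay positive.

```python
import math
import random


def softplus(x):
    # Numerically stable softplus; the result is always positive.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sigmoid(x):
    # Numerically stable logistic function with values in [0, 1].
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


class MonotoneShallowPolicy:
    """Hypothetical one-hidden-layer policy: (battery, channel) -> power.

    Assumption (not from the source): the output is non-decreasing in
    `battery` and lies in [0, battery]; `channel` is unconstrained.
    """

    def __init__(self, hidden=8, seed=0):
        rng = random.Random(seed)
        # Raw parameters; softplus is applied where positivity is required.
        self.raw_w_battery = [rng.gauss(0, 1) for _ in range(hidden)]
        self.w_channel = [rng.gauss(0, 1) for _ in range(hidden)]
        self.bias = [rng.gauss(0, 1) for _ in range(hidden)]
        self.raw_w_out = [rng.gauss(0, 1) for _ in range(hidden)]
        self.out_bias = rng.gauss(0, 1)

    def power(self, battery, channel):
        z = self.out_bias
        for rwb, wc, c, rwo in zip(self.raw_w_battery, self.w_channel,
                                   self.bias, self.raw_w_out):
            # Positive battery weight + increasing activation keeps each
            # hidden unit non-decreasing in the battery level.
            h = math.tanh(softplus(rwb) * battery + wc * channel + c)
            # Positive output weights preserve that monotonicity.
            z += softplus(rwo) * h
        # Scaling by the battery level enforces the bound 0 <= power <= battery.
        return battery * sigmoid(z)


policy = MonotoneShallowPolicy()
for b in (0.0, 1.0, 2.0, 4.0):
    print(b, policy.power(b, channel=1.0))
```

In a deterministic policy gradient setup, the raw parameters would be the quantities updated by gradient steps. The softplus reparameterization keeps the constraint satisfied after every update without a projection step.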
Original language | English (US) |
---|---|
Pages (from-to) | 1258-1271 |
Number of pages | 14 |
Journal | IEEE Journal of Selected Topics in Signal Processing |
Volume | 15 |
Issue number | 5 |
State | Published - Aug 1 2021 |
All Science Journal Classification (ASJC) codes
- Signal Processing
- Electrical and Electronic Engineering
Keywords
- Energy harvesting communications
- generalized mutual information
- power allocation
- rate maximization
- reinforcement learning
- shallow neural network