Shallow Reinforcement Learning for Energy Harvesting Communications with Imperfect Channel Knowledge

Heasung Kim, Jungwoo Lee, Wonjae Shin, H. Vincent Poor

Research output: Contribution to journal › Article › peer-review

1 Scopus citation


This study addresses the power allocation problem in energy harvesting communications: maximizing the sum of the generalized mutual information, that is, the achievable rate under imperfect channel state information, through a reinforcement learning (RL) approach. In contrast to conventional deep RL applications, which impose a large computational load on devices because of their deep neural networks, we adopt shallow RL architectures that incorporate structural properties of the optimal power allocation policy. To design shallow architectures that can fully capture the desired power allocation policy, we derive the partial monotonicity of, and bounds on, the policy and value functions. These structural properties provide the mathematical basis for constructing the shallow architectures. We use a deterministic policy gradient method with monotonically shape-constrained approximators, which lets us avoid overly complicated deep neural networks that are ill-suited to low-power devices. Through various experiments, we visualize the solutions obtained with the proposed shallow architectures and demonstrate that the proposed method outperforms existing power allocation policies and exhibits greater robustness owing to these structural properties.
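As a rough illustration of what a monotonically shape-constrained shallow approximator can look like, the sketch below builds a one-hidden-layer policy in NumPy. It is not the architecture from the paper. The class name `MonotoneShallowPolicy`, the choice of battery level and channel estimate as inputs, the assumption that the allocated power is non-decreasing in the battery level, and the bound of the output between zero and the available battery energy are all assumptions made for this example. Monotonicity is enforced by passing the battery-path and output weights through a softplus so that they stay non-negative, while the channel-path weights remain unconstrained (monotone in only one input, hence "partial").

```python
import numpy as np


def softplus(x):
    # Numerically stable log(1 + exp(x)); always positive.
    return np.logaddexp(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class MonotoneShallowPolicy:
    """Hypothetical one-hidden-layer policy, non-decreasing in battery level."""

    def __init__(self, hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        self.w_batt_raw = rng.normal(size=hidden)  # made non-negative via softplus
        self.w_chan = rng.normal(size=hidden)      # unconstrained sign
        self.bias = rng.normal(size=hidden)
        self.v_raw = rng.normal(size=hidden)       # made non-negative via softplus
        self.c = 0.0

    def __call__(self, battery, channel):
        battery = np.asarray(battery, dtype=float)
        channel = np.asarray(channel, dtype=float)
        # Non-negative battery weights plus an increasing activation keep
        # every hidden unit non-decreasing in the battery level.
        pre = (battery[..., None] * softplus(self.w_batt_raw)
               + channel[..., None] * self.w_chan
               + self.bias)
        hidden = np.tanh(pre)
        # Non-negative output weights preserve that monotonicity.
        score = hidden @ softplus(self.v_raw) + self.c
        # Assumed bound: allocate a fraction in (0, 1) of the stored energy.
        return battery * sigmoid(score)


policy = MonotoneShallowPolicy()
battery = np.linspace(0.0, 5.0, 50)  # assumed battery levels, arbitrary units
power = policy(battery, 0.7)         # fixed, assumed channel estimate
```

Because both `battery` and the sigmoid term are non-negative and non-decreasing in the battery level, their product is as well, so the constraint holds for any parameter values and would survive gradient updates from a deterministic policy gradient learner.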

Original language: English (US)
Pages (from-to): 1258-1271
Number of pages: 14
Journal: IEEE Journal on Selected Topics in Signal Processing
Issue number: 5
State: Published - Aug 1 2021
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Electrical and Electronic Engineering


Keywords

  • Energy harvesting communications
  • generalized mutual information
  • power allocation
  • rate maximization
  • reinforcement learning
  • shallow neural network

