Primal Dual PPO Learning Resource Allocation in Indoor IRS-Aided Networks

Haijun Zhang, Xiangnan Liu, Keping Long, H. Vincent Poor

Research output: Contribution to journal › Conference article › peer-review

3 Scopus citations


Terahertz communication is regarded as a promising technology because its higher bandwidth and narrower beamwidths can improve capacity and coverage for indoor wireless users. In this paper, the intelligent reflecting surface (IRS) technique and non-orthogonal multiple access (NOMA) are used to compensate for the drawbacks of indoor transmission mismatch in the terahertz band. Wireless resource allocation in indoor terahertz IRS-aided systems is then formulated as a universal optimization problem with ergodic constraints. Exploiting the parametrization capability of deep neural networks (DNNs), proximal policy optimization (PPO) is adopted to train the policy and the corresponding actions that allocate power and bandwidth. The actor part generates the continuous power allocation, and the critic part handles the discrete bandwidth allocation. Within the deep reinforcement learning (DRL) framework, primal dual ascent is proposed to realize model-free training. Simulation results demonstrate the effectiveness of the primal dual PPO learning algorithm in different settings.
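The primal dual idea mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it uses no PPO, no neural network, and no IRS or NOMA model. It assumes a single scalar power x, a hypothetical rate function log(1 + x), and a hypothetical power budget, and it alternates a gradient ascent step on the primal variable with a projected gradient step on the dual variable of the Lagrangian. All names and step sizes are illustrative assumptions.

```python
def primal_dual_ascent(budget=1.0, steps=5000, lr_primal=0.05, lr_dual=0.05):
    """Toy primal dual iteration (illustrative only, not the paper's method).

    Maximizes log(1 + x) subject to x <= budget using the Lagrangian
    L(x, lam) = log(1 + x) + lam * (budget - x), with lam >= 0.
    """
    x = 0.0    # primal variable: allocated power (hypothetical)
    lam = 0.0  # dual variable: price of the power constraint

    for _ in range(steps):
        # Primal step: gradient ascent on L with respect to x.
        grad_x = 1.0 / (1.0 + x) - lam
        x = max(0.0, x + lr_primal * grad_x)

        # Dual step: gradient descent on L with respect to lam,
        # projected onto lam >= 0. The multiplier grows while the
        # constraint is violated (x > budget) and shrinks otherwise.
        lam = max(0.0, lam + lr_dual * (x - budget))

    return x, lam


if __name__ == "__main__":
    x, lam = primal_dual_ascent()
    print(f"power = {x:.4f}, multiplier = {lam:.4f}")
```

For this toy problem the constraint is active at the optimum, so the iterates settle near x = budget with multiplier 1 / (1 + budget). In a learning setting such as the one the abstract describes, the primal step would instead update policy parameters from sampled rewards, which is where a method like PPO would enter.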

Original language: English (US)
Journal: Proceedings - IEEE Global Communications Conference, GLOBECOM
State: Published - 2021
Externally published: Yes
Event: 2021 IEEE Global Communications Conference, GLOBECOM 2021 - Madrid, Spain
Duration: Dec 7 2021 to Dec 11 2021

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Signal Processing

