Meta-Reinforcement Learning for Trajectory Design in Wireless UAV Networks

Ye Hu, Mingzhe Chen, Walid Saad, H. Vincent Poor, Shuguang Cui

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Scopus citations

Abstract

In this paper, the design of an optimal trajectory for an energy-constrained drone operating in dynamic network environments is studied. In the considered model, a drone base station (DBS) is dispatched to provide uplink connectivity to ground users whose demand is dynamic and unpredictable. In this case, the DBS's trajectory must be adaptively adjusted to satisfy the dynamic user access requests. To this end, a meta-learning algorithm is proposed to adapt the DBS's trajectory when it encounters novel environments, by tuning a reinforcement learning (RL) solution. The meta-learning algorithm provides a solution that quickly adapts the DBS to novel environments based on limited prior experience. The meta-tuned RL is shown to yield faster convergence to the optimal coverage in unseen environments, with considerably low computational complexity, compared to the baseline policy gradient algorithm. Simulation results show that the proposed meta-learning solution yields a 25% improvement in convergence speed and about a 10% improvement in the DBS's communication performance, compared to a baseline policy gradient algorithm. Meanwhile, the probability that the DBS serves over 50% of user requests increases by about 27%, compared to the baseline policy gradient algorithm.
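The idea of tuning an RL solution so that it adapts quickly to an unseen environment can be illustrated with a toy sketch. This is not the algorithm from the paper, whose details the abstract does not give. It assumes a one-step problem in which the DBS picks one of five cells and is rewarded with that cell's request rate, an exact softmax policy gradient, and a Reptile-style first-order meta-update that was chosen here purely for illustration. All names (`meta_train`, `TRAIN_ENVS`, `NEW_ENV`) and all hyperparameters are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_reward(logits, demand):
    """Expected served demand when the cell is drawn from the softmax policy."""
    return sum(p * d for p, d in zip(softmax(logits), demand))


def policy_gradient_steps(logits, demand, steps, lr):
    """Run a few exact policy-gradient ascent steps on one environment."""
    theta = list(logits)
    for _ in range(steps):
        probs = softmax(theta)
        value = sum(p * d for p, d in zip(probs, demand))
        # Gradient of the expected reward w.r.t. logit k is p_k * (d_k - value).
        theta = [t + lr * p * (d - value) for t, p, d in zip(theta, probs, demand)]
    return theta


def meta_train(train_envs, n_cells, meta_iters=200, inner_steps=3,
               inner_lr=0.5, meta_lr=0.5):
    """Reptile-style meta-training (an assumption, not the paper's method):
    move the shared initialization toward the mean of the adapted policies."""
    meta = [0.0] * n_cells
    for _ in range(meta_iters):
        adapted = [policy_gradient_steps(meta, env, inner_steps, inner_lr)
                   for env in train_envs]
        for k in range(n_cells):
            mean_k = sum(a[k] for a in adapted) / len(adapted)
            meta[k] += meta_lr * (mean_k - meta[k])
    return meta


# Hypothetical training environments: requests concentrate in cell 0 or cell 1.
TRAIN_ENVS = [[1.0, 0.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0, 0.0]]
# Hypothetical unseen environment: a different request rate, located in cell 1.
NEW_ENV = [0.0, 0.8, 0.0, 0.0, 0.0]

meta_init = meta_train(TRAIN_ENVS, n_cells=5)

# Same adaptation budget (3 gradient steps) for both starting points.
scratch = policy_gradient_steps([0.0] * 5, NEW_ENV, steps=3, lr=0.5)
tuned = policy_gradient_steps(meta_init, NEW_ENV, steps=3, lr=0.5)

j_scratch = expected_reward(scratch, NEW_ENV)
j_meta = expected_reward(tuned, NEW_ENV)
print("baseline policy gradient:", round(j_scratch, 3))
print("meta-initialized policy: ", round(j_meta, 3))
```

In this toy setting the meta-learned initialization already concentrates probability on the cells that tended to carry demand during training, so the same small number of gradient steps reaches a higher expected reward than the baseline started from a uniform policy. The sketch says nothing about the size of the gains reported in the abstract.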

Original language: English (US)
Title of host publication: 2020 IEEE Global Communications Conference, GLOBECOM 2020 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728182988
State: Published - Dec 2020
Event: 2020 IEEE Global Communications Conference, GLOBECOM 2020 - Virtual, Taipei, Taiwan, Province of China
Duration: Dec 7 2020 → Dec 11 2020

Publication series

Name: 2020 IEEE Global Communications Conference, GLOBECOM 2020 - Proceedings

Conference

Conference: 2020 IEEE Global Communications Conference, GLOBECOM 2020
Country/Territory: Taiwan, Province of China
City: Virtual, Taipei
Period: 12/7/20 → 12/11/20

All Science Journal Classification (ASJC) codes

  • Media Technology
  • Modeling and Simulation
  • Instrumentation
  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Software
  • Safety, Risk, Reliability and Quality
