TY - JOUR
T1 - Cooperative Internet of UAVs
T2 - Distributed Trajectory Design by Multi-Agent Deep Reinforcement Learning
AU - Hu, Jingzhi
AU - Zhang, Hongliang
AU - Song, Lingyang
AU - Schober, Robert
AU - Poor, H. Vincent
N1 - Funding Information:
Manuscript received May 18, 2020; revised July 25, 2020; accepted July 25, 2020. Date of publication August 3, 2020; date of current version November 18, 2020. This work was supported in part by the National Natural Science Foundation of China under Grant 61625101 and Grant 61941101, and in part by the U.S. National Science Foundation under Grant CCF-0939370 and Grant CCF-1908308. This article was presented in part at the 2019 IEEE Global Communications Conference. The associate editor coordinating the review of this article and approving it for publication was Y. Liu. (Corresponding author: Lingyang Song.) Jingzhi Hu and Lingyang Song are with the Department of Electronics, Peking University, Beijing 100871, China (e-mail: jingzhi.hu@pku.edu.cn; lingyang.song@pku.edu.cn).
Publisher Copyright:
© 2020 IEEE.
PY - 2020/11
Y1 - 2020/11
AB - Due to the advantages of flexible deployment and extensive coverage, unmanned aerial vehicles (UAVs) have significant potential for sensing applications in the next generation of cellular networks, which will give rise to a cellular Internet of UAVs. In this article, we consider a cellular Internet of UAVs, where the UAVs execute sensing tasks through cooperative sensing and transmission to minimize the age of information (AoI). However, the cooperative sensing and transmission is tightly coupled with the UAVs' trajectories, which makes the trajectory design challenging. To tackle this challenge, we propose a distributed sense-and-send protocol, where the UAVs determine the trajectories by selecting from a discrete set of tasks and a continuous set of locations for sensing and transmission. Based on this protocol, we formulate the trajectory design problem for AoI minimization and propose a compound-action actor-critic (CA2C) algorithm to solve it based on deep reinforcement learning. The CA2C algorithm can learn the optimal policies for actions involving both continuous and discrete variables and is suited for the trajectory design. Our simulation results show that the CA2C algorithm outperforms four baseline algorithms. Also, we show that by dividing the tasks, cooperative UAVs can achieve a lower AoI compared to non-cooperative UAVs.
KW - Cooperative Internet of UAVs
KW - deep reinforcement learning
KW - distributed trajectory design
UR - http://www.scopus.com/inward/record.url?scp=85096700270&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85096700270&partnerID=8YFLogxK
U2 - 10.1109/TCOMM.2020.3013599
DO - 10.1109/TCOMM.2020.3013599
M3 - Article
AN - SCOPUS:85096700270
SN - 0090-6778
VL - 68
SP - 6807
EP - 6821
JO - IEEE Transactions on Communications
JF - IEEE Transactions on Communications
IS - 11
M1 - 9154432
ER -