TY - JOUR
T1 - Dynamic UAV Deployment for Differentiated Services
T2 - A Multi-Agent Imitation Learning Based Approach
AU - Wang, Xiaojie
AU - Ning, Zhaolong
AU - Guo, Song
AU - Wen, Miaowen
AU - Guo, Lei
AU - Poor, H. Vincent
N1 - Funding Information:
This work was supported by the National Natural Science Foundation of China under Grants 62025105, 61971084, and 62001073, Chongqing Talent Program under Grant CQYC2020058659, Natural Science Foundation of Guangdong Province under Grant 2018B030306005, Hong Kong RGC Research Impact Fund (RIF) under Project R5060-19, General Research Fund (GRF) under Project 152221/19E, and the U.S. National Science Foundation under Grant CCF-1908308.
Publisher Copyright:
© 2002-2012 IEEE.
PY - 2023/4/1
Y1 - 2023/4/1
N2 - Unmanned Aerial Vehicles (UAVs) have been utilized to serve on-ground users with various services, e.g., computing, communication and caching, due to their mobility and flexibility. The main focus of many recent studies on UAVs is to deploy a set of homogeneous UAVs with identical capabilities controlled by one UAV owner/company to provide services. However, little attention has been paid to the issue of how to enable different UAV owners to provide services with differentiated service capabilities in a shared area. To address this issue, we propose a multi-agent imitation learning enabled UAV deployment approach to maximize both profits of UAV owners and utilities of on-ground users. Specifically, a Markov game is formulated among UAV owners and we prove that a Nash equilibrium exists based on the full knowledge of the system. For online scheduling with incomplete information, we design agent policies by imitating the behaviors of corresponding experts. A novel neural network model, integrating convolutional neural networks, generative adversarial networks and a gradient-based policy, can be trained and executed in a fully decentralized manner with a guaranteed ϵ-Nash equilibrium. Performance results show that our algorithm significantly outperforms other representative algorithms in terms of average profits, utilities and execution time.
AB - Unmanned Aerial Vehicles (UAVs) have been utilized to serve on-ground users with various services, e.g., computing, communication and caching, due to their mobility and flexibility. The main focus of many recent studies on UAVs is to deploy a set of homogeneous UAVs with identical capabilities controlled by one UAV owner/company to provide services. However, little attention has been paid to the issue of how to enable different UAV owners to provide services with differentiated service capabilities in a shared area. To address this issue, we propose a multi-agent imitation learning enabled UAV deployment approach to maximize both profits of UAV owners and utilities of on-ground users. Specifically, a Markov game is formulated among UAV owners and we prove that a Nash equilibrium exists based on the full knowledge of the system. For online scheduling with incomplete information, we design agent policies by imitating the behaviors of corresponding experts. A novel neural network model, integrating convolutional neural networks, generative adversarial networks and a gradient-based policy, can be trained and executed in a fully decentralized manner with a guaranteed ϵ-Nash equilibrium. Performance results show that our algorithm significantly outperforms other representative algorithms in terms of average profits, utilities and execution time.
KW - Nash equilibrium
KW - UAV deployment
KW - decentralized training
KW - differentiated services
KW - imitation learning
UR - http://www.scopus.com/inward/record.url?scp=85118673567&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85118673567&partnerID=8YFLogxK
U2 - 10.1109/TMC.2021.3116236
DO - 10.1109/TMC.2021.3116236
M3 - Article
AN - SCOPUS:85118673567
SN - 1536-1233
VL - 22
SP - 2131
EP - 2146
JO - IEEE Transactions on Mobile Computing
JF - IEEE Transactions on Mobile Computing
IS - 4
ER -