TY - JOUR
T1 - Mean Field Game Guided Deep Reinforcement Learning for Task Placement in Cooperative Multiaccess Edge Computing
AU - Shi, Dian
AU - Gao, Hao
AU - Wang, Li
AU - Pan, Miao
AU - Han, Zhu
AU - Poor, H. Vincent
N1 - Funding Information:
Manuscript received December 17, 2019; revised February 29, 2020; accepted March 13, 2020. Date of publication March 27, 2020; date of current version October 9, 2020. The work of Dian Shi and Miao Pan was supported by the U.S. National Science Foundation under Grant CNS-1350230 (CAREER), Grant CNS-1646607, Grant CNS-1702850, and Grant CNS-1801925. The work of Hao Gao and Zhu Han was supported in part by the U.S. Multidisciplinary Research Program of the University Research Initiative AFOSR under Grant MURI 18RT0073, and in part by NSF under Grant EARS-1839818, Grant CNS-1717454, Grant CNS-1731424, and Grant CNS-1702850. The work of Li Wang was supported in part by the National Natural Science Foundation of China under Grant 61871416, in part by the Fundamental Research Funds for the Central Universities under Grant 2018XKJC03, and in part by the Beijing Municipal Natural Science Foundation under Grant L192030. The work of H. Vincent Poor was supported in part by the U.S. Air Force Office of Scientific Research under Grant MURI FA9550-18-1-0502. (Corresponding author: Miao Pan.) Dian Shi, Hao Gao, and Miao Pan are with the Electrical and Computer Engineering Department, University of Houston, Houston, TX 77004 USA (e-mail: dshi3@uh.edu; hgao5@uh.edu; mpan2@uh.edu).
Publisher Copyright:
© 2014 IEEE.
PY - 2020/10
Y1 - 2020/10
N2 - Cooperative multiaccess edge computing (MEC) is a promising paradigm for next-generation mobile networks. However, when the number of users explodes, the computational complexity of existing optimization- or learning-based task placement approaches in cooperative MEC can increase significantly, leading to intolerable MEC decision-making delay. In this article, we propose a mean field game (MFG) guided deep reinforcement learning (DRL) approach for task placement in cooperative MEC, which helps servers make timely task placement decisions and significantly reduces the average service delay. Instead of applying MFG or DRL separately, we jointly leverage MFG and DRL for task placement and let the equilibrium of MFG guide the learning direction of DRL, while ensuring that the MFG and DRL approaches share the same goal. Specifically, we define a novel mean field guided Q-value (MFG-Q), which estimates the Q-value using the Nash equilibrium obtained by MFG. We evaluate the proposed method's performance using real-world user distribution data. Through extensive simulations, we show that the proposed scheme is effective in making timely decisions and reducing the average service delay. Moreover, our proposed method converges faster than pure DRL-based approaches.
AB - Cooperative multiaccess edge computing (MEC) is a promising paradigm for next-generation mobile networks. However, when the number of users explodes, the computational complexity of existing optimization- or learning-based task placement approaches in cooperative MEC can increase significantly, leading to intolerable MEC decision-making delay. In this article, we propose a mean field game (MFG) guided deep reinforcement learning (DRL) approach for task placement in cooperative MEC, which helps servers make timely task placement decisions and significantly reduces the average service delay. Instead of applying MFG or DRL separately, we jointly leverage MFG and DRL for task placement and let the equilibrium of MFG guide the learning direction of DRL, while ensuring that the MFG and DRL approaches share the same goal. Specifically, we define a novel mean field guided Q-value (MFG-Q), which estimates the Q-value using the Nash equilibrium obtained by MFG. We evaluate the proposed method's performance using real-world user distribution data. Through extensive simulations, we show that the proposed scheme is effective in making timely decisions and reducing the average service delay. Moreover, our proposed method converges faster than pure DRL-based approaches.
KW - Deep reinforcement learning (DRL)
KW - mean field game (MFG)
KW - multiaccess edge computing (MEC)
KW - task placement
UR - http://www.scopus.com/inward/record.url?scp=85092723841&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85092723841&partnerID=8YFLogxK
U2 - 10.1109/JIOT.2020.2983741
DO - 10.1109/JIOT.2020.2983741
M3 - Article
AN - SCOPUS:85092723841
SN - 2327-4662
VL - 7
SP - 9330
EP - 9340
JO - IEEE Internet of Things Journal
JF - IEEE Internet of Things Journal
IS - 10
M1 - 9049116
ER -