TY - JOUR
T1 - Experienced Deep Reinforcement Learning with Generative Adversarial Networks (GANs) for Model-Free Ultra Reliable Low Latency Communication
AU - Kasgari, Ali Taleb Zadeh
AU - Saad, Walid
AU - Mozaffari, Mohammad
AU - Poor, H. Vincent
N1 - Funding Information:
Manuscript received February 12, 2020; revised July 3, 2020, September 13, 2020, and September 29, 2020; accepted September 29, 2020. Date of publication October 19, 2020; date of current version February 17, 2021. This work was supported by the U.S. National Science Foundation under Grants IIS-1633363, CNS-1836802 and CCF-1908308. This article was presented in part at the IEEE ICC. The associate editor coordinating the review of this article and approving it for publication was C. Lee. (Corresponding author: Ali Taleb Zadeh Kasgari.) Ali Taleb Zadeh Kasgari and Walid Saad are with the Department of Electrical and Computer Engineering, Virginia Tech, Blacksburg, VA 24060 USA (e-mail: alitk@vt.edu; walids@vt.edu).
Publisher Copyright:
© 1972-2012 IEEE.
PY - 2021/2
Y1 - 2021/2
N2 - In this paper, a novel experienced deep reinforcement learning (deep-RL) framework is proposed to provide model-free resource allocation for ultra reliable low latency communication (URLLC) in the downlink of a wireless network. The goal is to guarantee high end-to-end reliability and low end-to-end latency, under explicit data rate constraints, for each wireless user without any models of or assumptions on the users' traffic. In particular, in order to enable the deep-RL framework to account for extreme network conditions and operate in highly reliable systems, a new approach based on generative adversarial networks (GANs) is proposed. This GAN approach is used to pre-train the deep-RL framework using a mix of real and synthetic data, thus creating an experienced deep-RL framework that has been exposed to a broad range of network conditions. The proposed deep-RL framework is then applied to a multi-user orthogonal frequency division multiple access (OFDMA) resource allocation system. Formally, this URLLC resource allocation problem in OFDMA systems is posed as a power minimization problem under reliability, latency, and rate constraints. To solve this problem using experienced deep-RL, first, the rate of each user is determined. Then, these rates are mapped to the resource block and power allocation vectors of the studied wireless system. Finally, the end-to-end reliability and latency of each user are used as feedback to the deep-RL framework. It is then shown that, at the fixed point of the deep-RL algorithm, the reliability and latency of the users are near-optimal. Moreover, for the proposed GAN approach, a theoretical limit for the generator output is analytically derived. Simulation results show how the proposed approach can achieve near-optimal performance within the rate-reliability-latency region, depending on the network and service requirements. The results also show that the proposed experienced deep-RL framework is able to remove the transient training time that makes conventional deep-RL methods unsuitable for URLLC. Moreover, during extreme conditions, it is shown that the proposed experienced deep-RL agent can recover instantly, while a conventional deep-RL agent takes several epochs to adapt to new extreme conditions.
AB - In this paper, a novel experienced deep reinforcement learning (deep-RL) framework is proposed to provide model-free resource allocation for ultra reliable low latency communication (URLLC) in the downlink of a wireless network. The goal is to guarantee high end-to-end reliability and low end-to-end latency, under explicit data rate constraints, for each wireless user without any models of or assumptions on the users' traffic. In particular, in order to enable the deep-RL framework to account for extreme network conditions and operate in highly reliable systems, a new approach based on generative adversarial networks (GANs) is proposed. This GAN approach is used to pre-train the deep-RL framework using a mix of real and synthetic data, thus creating an experienced deep-RL framework that has been exposed to a broad range of network conditions. The proposed deep-RL framework is then applied to a multi-user orthogonal frequency division multiple access (OFDMA) resource allocation system. Formally, this URLLC resource allocation problem in OFDMA systems is posed as a power minimization problem under reliability, latency, and rate constraints. To solve this problem using experienced deep-RL, first, the rate of each user is determined. Then, these rates are mapped to the resource block and power allocation vectors of the studied wireless system. Finally, the end-to-end reliability and latency of each user are used as feedback to the deep-RL framework. It is then shown that, at the fixed point of the deep-RL algorithm, the reliability and latency of the users are near-optimal. Moreover, for the proposed GAN approach, a theoretical limit for the generator output is analytically derived. Simulation results show how the proposed approach can achieve near-optimal performance within the rate-reliability-latency region, depending on the network and service requirements. The results also show that the proposed experienced deep-RL framework is able to remove the transient training time that makes conventional deep-RL methods unsuitable for URLLC. Moreover, during extreme conditions, it is shown that the proposed experienced deep-RL agent can recover instantly, while a conventional deep-RL agent takes several epochs to adapt to new extreme conditions.
KW - Resource allocation
KW - generative adversarial networks
KW - low latency communications
KW - model-free resource management
UR - http://www.scopus.com/inward/record.url?scp=85099268924&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85099268924&partnerID=8YFLogxK
U2 - 10.1109/TCOMM.2020.3031930
DO - 10.1109/TCOMM.2020.3031930
M3 - Article
AN - SCOPUS:85099268924
SN - 0090-6778
VL - 69
SP - 884
EP - 899
JO - IEEE Transactions on Communications
JF - IEEE Transactions on Communications
IS - 2
M1 - 9229155
ER -