TY - GEN
T1 - Convergence Time Minimization for Federated Reinforcement Learning over Wireless Networks
AU - Wang, Sihua
AU - Chen, Mingzhe
AU - Yin, Changchuan
AU - Poor, H. Vincent
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - In this paper, the convergence time of federated reinforcement learning (FRL) deployed over a realistic wireless network is studied. In the considered model, several devices and the base station (BS) jointly participate in the iterative training of an FRL algorithm. Due to limited wireless resources, the BS must select a subset of devices to exchange FRL training parameters at each iteration, which significantly affects the training loss and convergence time of the considered FRL algorithm. This joint learning, wireless resource allocation, and device selection problem is formulated as an optimization problem that aims to minimize the FRL convergence time while meeting the FRL temporal-difference (TD) error requirement. To solve this problem, a deep Q-network (DQN) based algorithm is designed. The proposed method enables the BS to dynamically select an appropriate subset of devices to join the FRL training. Given the selected devices, a resource block allocation scheme can be derived to further minimize the FRL convergence time. Simulation results with real data show that the proposed approach can reduce the FRL convergence time by up to 44.7% compared to a baseline that randomly determines the subset of participating devices and their occupied resource blocks.
UR - http://www.scopus.com/inward/record.url?scp=85128755296&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85128755296&partnerID=8YFLogxK
U2 - 10.1109/CISS53076.2022.9751199
DO - 10.1109/CISS53076.2022.9751199
M3 - Conference contribution
AN - SCOPUS:85128755296
T3 - 2022 56th Annual Conference on Information Sciences and Systems, CISS 2022
SP - 246
EP - 251
BT - 2022 56th Annual Conference on Information Sciences and Systems, CISS 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 56th Annual Conference on Information Sciences and Systems, CISS 2022
Y2 - 9 March 2022 through 11 March 2022
ER -