TY - GEN
T1 - Online Learning to Precode for FDD Massive MIMO Systems
AU - Kim, Daeun
AU - Poor, H. Vincent
AU - Lee, Namyoon
N1 - Publisher Copyright: © 2020 IEEE.
PY - 2020/12
Y1 - 2020/12
AB - This paper presents a novel multi-user precoding strategy for frequency-division duplexing (FDD) massive multiple-input multiple-output (MIMO) downlink systems with rate-limited feedback. Inspired by the multi-armed bandit framework, our approach adaptively learns the precoding action that provides the highest sum-throughput without explicit channel state information feedback. In particular, we present an online learning algorithm, called fast upper confidence bound (Fast-UCB) precoding, that finds the optimal precoding action in a timely manner. The key idea is to combine fast exploration and exploitation with pruning strategies to speed up learning of the optimal precoding action. Simulations show that the proposed algorithm significantly outperforms existing online learning algorithms, including the conventional UCB method, in terms of cumulative regret. In addition, we demonstrate that the Fast-UCB method achieves a higher net sum-throughput than greedy action selection with full exploration in environments with a short channel coherence time, even with much less feedback.
UR - http://www.scopus.com/inward/record.url?scp=85102926957&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85102926957&partnerID=8YFLogxK
U2 - 10.1109/GCWkshps50303.2020.9367488
DO - 10.1109/GCWkshps50303.2020.9367488
M3 - Conference contribution
AN - SCOPUS:85102926957
T3 - 2020 IEEE Globecom Workshops, GC Wkshps 2020 - Proceedings
BT - 2020 IEEE Globecom Workshops, GC Wkshps 2020 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 IEEE Globecom Workshops, GC Wkshps 2020
Y2 - 7 December 2020 through 11 December 2020
ER -