TY - GEN
T1 - Regret-minimizing exploration in HetNets with mmWave
AU - Wang, Michael
AU - Dutta, Aveek
AU - Buccapatnam, Swapna
AU - Chiang, Mung
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/11/2
Y1 - 2016/11/2
N2 - We model and analyze a User-Equipment (UE) based wireless network selection method where individuals act on their stochastic knowledge of the expected behavior of their available networks. In particular, we focus on networks with millimeter-wave (mmWave) radio. Modeling mmWave radio access technologies (RATs) as a stochastic 3-state process based on their physical layer characteristics in Line-of-Sight (LOS), Non-Line-of-Sight (NLOS), and Outage states, we make the realistic assumption that users have no knowledge of the statistics of the RATs and must learn these while maximizing the throughput obtained. We develop an online learning-based approach to access network selection: a user-centric Multi-Armed Bandit Problem that incorporates the cost of switching access networks. We develop an online learning policy that groups network access to minimize costs for RAT selection, and analyze the regret (loss due to uncertainty) of our algorithm. We also show that our algorithm obtains optimal regret and, in numerical examples, achieves a 24% increase in total throughput compared to existing techniques for high-throughput mmWave RATs that vary over a fast timescale.
AB - We model and analyze a User-Equipment (UE) based wireless network selection method where individuals act on their stochastic knowledge of the expected behavior of their available networks. In particular, we focus on networks with millimeter-wave (mmWave) radio. Modeling mmWave radio access technologies (RATs) as a stochastic 3-state process based on their physical layer characteristics in Line-of-Sight (LOS), Non-Line-of-Sight (NLOS), and Outage states, we make the realistic assumption that users have no knowledge of the statistics of the RATs and must learn these while maximizing the throughput obtained. We develop an online learning-based approach to access network selection: a user-centric Multi-Armed Bandit Problem that incorporates the cost of switching access networks. We develop an online learning policy that groups network access to minimize costs for RAT selection, and analyze the regret (loss due to uncertainty) of our algorithm. We also show that our algorithm obtains optimal regret and, in numerical examples, achieves a 24% increase in total throughput compared to existing techniques for high-throughput mmWave RATs that vary over a fast timescale.
UR - http://www.scopus.com/inward/record.url?scp=85001022043&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85001022043&partnerID=8YFLogxK
U2 - 10.1109/SAHCN.2016.7733013
DO - 10.1109/SAHCN.2016.7733013
M3 - Conference contribution
AN - SCOPUS:85001022043
T3 - 2016 13th Annual IEEE International Conference on Sensing, Communication, and Networking, SECON 2016
BT - 2016 13th Annual IEEE International Conference on Sensing, Communication, and Networking, SECON 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 13th Annual IEEE International Conference on Sensing, Communication, and Networking, SECON 2016
Y2 - 27 June 2016 through 30 June 2016
ER -