High-dimensional sparse linear bandits

Botao Hao, Tor Lattimore, Mengdi Wang

Research output: Contribution to journalConference articlepeer-review

27 Scopus citations

Abstract

Stochastic linear bandits with high-dimensional sparse features are a practical model for a variety of domains, including personalized medicine and online advertising [Bastani and Bayati, 2020]. We derive a novel ?(n2/3) dimension-free minimax regret lower bound for sparse linear bandits in the data-poor regime where the horizon is smaller than the ambient dimension and where the feature vectors admit a well-conditioned exploration distribution. This is complemented by a nearly matching upper bound for an explore-then-commit algorithm showing that that T(n2/3) is the optimal rate in the data-poor regime. The results complement existing bounds for the data-rich regime and provide another example where carefully balancing the trade-off between information and regret is necessary. Finally, we prove a dimension-free O(vn) regret upper bound under an additional assumption on the magnitude of the signal for relevant features.

Original languageEnglish (US)
JournalAdvances in Neural Information Processing Systems
Volume2020-December
StatePublished - 2020
Externally publishedYes
Event34th Conference on Neural Information Processing Systems, NeurIPS 2020 - Virtual, Online
Duration: Dec 6 2020Dec 12 2020

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Fingerprint

Dive into the research topics of 'High-dimensional sparse linear bandits'. Together they form a unique fingerprint.

Cite this