LEARNING KERNELIZED CONTEXTUAL BANDITS IN A DISTRIBUTED AND ASYNCHRONOUS ENVIRONMENT

Chuanhao Li, Huazheng Wang, Mengdi Wang, Hongning Wang

Research output: Contribution to conference › Paper › peer-review

5 Scopus citations

Abstract

Despite recent advances in communication-efficient distributed bandit learning, most existing solutions are restricted to parametric models, e.g., linear bandits and generalized linear bandits (GLB). In comparison, kernel bandits, which search for non-parametric functions in a reproducing kernel Hilbert space (RKHS), offer higher modeling capacity. However, the only existing work on distributed kernel bandits adopts a synchronous communication protocol, which greatly limits its practical use (e.g., every synchronization step requires all clients to participate and wait for data exchange). In this paper, to improve robustness against the delays and client unavailability that are common in practice, we propose the first asynchronous solution for distributed kernel bandit learning, based on approximated kernel regression. A set of effective treatments is developed to ensure approximation quality and communication efficiency. Rigorous theoretical analysis of the regret and communication cost is provided, and extensive empirical evaluations demonstrate the effectiveness of our solution.

Original language: English (US)
State: Published - 2023
Event: 11th International Conference on Learning Representations, ICLR 2023 - Kigali, Rwanda
Duration: May 1 2023 → May 5 2023

Conference

Conference: 11th International Conference on Learning Representations, ICLR 2023
Country/Territory: Rwanda
City: Kigali
Period: 5/1/23 → 5/5/23

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Computer Science Applications
  • Education
  • Linguistics and Language
