Abstract
Numerous machine learning problems require an exploration basis: a mechanism to explore the action space. We define a novel geometric notion of a low-variance exploration basis, called volumetric spanners, and give efficient algorithms to construct such bases. We show how efficient volumetric spanners give rise to an efficient algorithm with near-optimal regret for bandit linear optimization over general convex sets. Previously, such results were known only for specific convex sets, or under special conditions such as the existence of an efficient self-concordant barrier for the underlying set.
Original language | English (US) |
---|---|
Pages (from-to) | 408-422 |
Number of pages | 15 |
Journal | Journal of Machine Learning Research |
Volume | 35 |
State | Published - 2014 |
Externally published | Yes |
Event | 27th Conference on Learning Theory, COLT 2014 - Barcelona, Spain; Duration: Jun 13 2014 → Jun 15 2014 |
All Science Journal Classification (ASJC) codes
- Software
- Control and Systems Engineering
- Statistics and Probability
- Artificial Intelligence
Keywords
- Active learning
- Convex geometry
- Learning basis
- Multi-armed bandit
- Spanners