Abstract
Motivated by applications in large-scale and multi-agent reinforcement learning, we study the non-asymptotic performance of stochastic approximation (SA) schemes with delayed updates under Markovian sampling. While the effect of delays has been extensively studied for optimization, the manner in which they interact with the underlying Markov process to shape the finite-time performance of SA remains poorly understood. In this context, our first main contribution is to show that under time-varying bounded delays, the delayed SA update rule guarantees exponentially fast convergence of the last iterate to a ball around the SA operator’s fixed point. Notably, our bound is tight in its dependence on both the maximum delay τmax, and the mixing time τmix. To achieve this tight bound, we develop a novel inductive proof technique that, unlike various existing delayed-optimization analyses, relies on establishing uniform boundedness of the iterates. As such, our proof may be of independent interest. Next, to mitigate the impact of the maximum delay on the convergence rate, we provide the first finite-time analysis of a delay-adaptive SA scheme under Markovian sampling. In particular, we show that the exponent of convergence of this scheme gets scaled down by τavg, as opposed to τmax for the vanilla delayed SA rule; here, τavg denotes the average delay across all iterations. Moreover, the adaptive scheme requires no prior knowledge of the delay sequence for step-size tuning. Our theoretical findings shed light on the finite-time effects of delays for a broad class of algorithms, including TD learning, Q-learning, and stochastic gradient descent under Markovian sampling.
Original language | English (US) |
---|---|
Pages (from-to) | 2746-2754 |
Number of pages | 9 |
Journal | Proceedings of Machine Learning Research |
Volume | 238 |
State | Published - 2024 |
Event | 27th International Conference on Artificial Intelligence and Statistics, AISTATS 2024 - Valencia, Spain Duration: May 2 2024 → May 4 2024 |
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability