Abstract
This paper establishes Hoeffding's lemma and inequality for bounded functions of general-state-space, not necessarily reversible Markov chains. The sharpness of these results is characterized by the optimality of the ratio between the variance proxies in the Markov-dependent and independent settings, and the boundedness of the functions is shown to be necessary for such results to hold in general. To showcase the usefulness of the new results, we apply them to non-asymptotic analyses of MCMC estimation, respondent-driven sampling, and high-dimensional covariance matrix estimation for time series data with a Markovian nature. Beyond statistical problems, we also apply them to study time-discounted rewards in econometric models and the multi-armed bandit problem with Markovian rewards, which arises in machine learning.
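For context, the classical Hoeffding inequality in the independent setting — the baseline that the paper's Markov-chain version is compared against — states that for independent random variables $X_i \in [a_i, b_i]$,

$$\Pr\!\left(\left|\sum_{i=1}^n \bigl(X_i - \mathbb{E}X_i\bigr)\right| \ge t\right) \le 2\exp\!\left(-\frac{2t^2}{\sum_{i=1}^n (b_i - a_i)^2}\right).$$

The Markov-dependent version replaces the variance proxy $\frac{1}{4}\sum_{i=1}^n (b_i - a_i)^2$ with a larger one reflecting the chain's dependence; the "ratio between variance proxies" mentioned in the abstract quantifies the cost of this dependence. (This is the standard independent-case statement, not the paper's new bound.)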
Original language | English (US)
---|---
Pages (from-to) | 1-35
Number of pages | 35
Journal | Journal of Machine Learning Research
Volume | 22
State | Published - Aug 1 2021
Externally published | Yes
All Science Journal Classification (ASJC) codes
- Software
- Artificial Intelligence
- Control and Systems Engineering
- Statistics and Probability
Keywords
- General state space
- Hoeffding's inequality
- Markov chain
- Markov chain Monte Carlo