Keyphrases
Sample Complexity: 100%
Variance Reduction: 100%
Asynchronous Q-learning: 100%
Markovian: 66%
Q-function: 66%
Value Function: 33%
Steady State: 33%
Markov Decision Process: 33%
Mixing Time: 33%
Stationary Distribution: 33%
State Space: 33%
Sampling numbers: 33%
Order of Accuracy: 33%
Process-based: 33%
State Action: 33%
Optimal Actions: 33%
Action Value: 33%
Second Term: 33%
Empirical Distribution Function: 33%
Behavioral Policy: 33%
First Term: 33%
Action Space: 33%
Discounted Markov Decision Processes: 33%
Learning Objectives: 33%
σ-space: 33%
Occupancy Probability: 33%
Single Trajectories: 33%
Constant Learning Rate: 33%

Computer Science
Markov Decision Process: 100%
Variance Reduction: 100%
Function Value: 50%
State Space: 50%
Independent Sample: 50%
Learning Rate: 50%
Optimal Action: 50%

Mathematics
Markov Decision Process: 100%
Variance Reduction: 100%
Function Value: 50%
Action Space: 50%
Independent Sample: 50%
S-Spaces: 50%
Occupancy Probability: 50%