Learning Good State and Action Representations for Markov Decision Process via Tensor Decomposition

  • Chengzhuo Ni
  • Yaqi Duan
  • Munther Dahleh
  • Mengdi Wang
  • Anru R. Zhang

Research output: Contribution to journal › Article › peer-review

Abstract

The transition kernel of a continuous-state-action Markov decision process (MDP) admits a natural tensor structure. This paper proposes a tensor-inspired unsupervised learning method to identify meaningful low-dimensional state and action representations from empirical trajectories. The method exploits the MDP’s tensor structure via kernelization, importance sampling, and low-Tucker-rank approximation. It can further be used to cluster states and actions separately and to find the best discrete MDP abstraction. We provide sharp statistical error bounds for tensor concentration and for the preservation of diffusion distance after embedding. We further prove that the learned state/action abstractions accurately approximate latent block structures when they exist, enabling function approximation in downstream tasks such as policy evaluation.
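The low-Tucker-rank step can be illustrated with a minimal sketch. The snippet below builds a hypothetical empirical transition tensor over discretized states and actions and compresses it with a plain truncated HOSVD, used here as a stand-in for the paper's actual estimator (which handles continuous spaces via kernelization and importance sampling, both omitted). The tensor sizes, ranks, and synthetic counts are illustrative assumptions, not values from the paper.

```python
import numpy as np

def unfold(T, mode):
    """Mode-k unfolding of a 3-way tensor into a matrix."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    """Truncated higher-order SVD: a simple low-Tucker-rank approximation."""
    factors = []
    for k, r in enumerate(ranks):
        # Top-r left singular vectors of the mode-k unfolding.
        U, _, _ = np.linalg.svd(unfold(T, k), full_matrices=False)
        factors.append(U[:, :r])
    # Core tensor: contract T with each factor transpose along its mode.
    core = T
    for k, U in enumerate(factors):
        core = np.moveaxis(np.tensordot(U.T, np.moveaxis(core, k, 0), axes=1), 0, k)
    return core, factors

# Toy empirical tensor: smoothed counts of (s, a, s') triples, as might be
# tallied from trajectories over discretized states/actions (synthetic data).
rng = np.random.default_rng(0)
nS, nA = 30, 10
counts = rng.poisson(1.0, size=(nS, nA, nS)).astype(float) + 1.0
P_hat = counts / counts.sum(axis=2, keepdims=True)  # estimated P(s' | s, a)

core, (U_s, U_a, U_s2) = hosvd(P_hat, ranks=(3, 2, 3))
# Rows of U_s give low-dimensional state embeddings (and rows of U_a action
# embeddings); clustering them, e.g. with k-means, yields a discrete abstraction.

# Reconstruct the low-Tucker-rank approximation and check its relative error.
approx = core
for k, U in enumerate((U_s, U_a, U_s2)):
    approx = np.moveaxis(np.tensordot(U, np.moveaxis(approx, k, 0), axes=1), 0, k)
rel_err = np.linalg.norm(approx - P_hat) / np.linalg.norm(P_hat)
print(f"relative Tucker approximation error: {rel_err:.3f}")
```

On this synthetic tensor the embeddings carry no real structure; the point is only the pipeline shape: estimate the transition tensor, take a low-Tucker-rank approximation, and read state/action representations off the mode factors.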

Original language: English (US)
Article number: 115
Journal: Journal of Machine Learning Research
Volume: 24
State: Published - 2023
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Software
  • Statistics and Probability
  • Artificial Intelligence
