A Note on the Representational Incompatibility of Function Approximation and Factored Dynamics

Eric Allender, Sanjeev Arora, Michael Kearns, Cristopher Moore, Alexander Russell

Research output: Chapter in Book/Report/Conference proceedingConference contribution

2 Scopus citations

Abstract

We establish a new hardness result that shows that the difficulty of planning in factored Markov decision processes is representational rather than just computational. More precisely, we give a fixed family of factored MDPs with linear rewards whose optimal policies and value functions simply cannot be represented succinctly in any standard parametric form. Previous hardness results indicated that computing good policies from the MDP parameters was difficult, but left open the possibility of succinct function approximation for any fixed factored MDP. Our result applies even to policies which yield a polynomially poor approximation to the optimal value, and highlights interesting connections with the complexity class of Arthur-Merlin games.

Original languageEnglish (US)
Title of host publicationNIPS 2002
Subtitle of host publicationProceedings of the 15th International Conference on Neural Information Processing Systems
EditorsSuzanna Becker, Sebastian Thrun, Klaus Obermayer
PublisherMIT Press Journals
Pages431-437
Number of pages7
ISBN (Electronic)0262025507, 9780262025508
StatePublished - 2002
Event15th International Conference on Neural Information Processing Systems, NIPS 2002 - Vancouver, Canada
Duration: Dec 9 2002Dec 14 2002

Publication series

NameNIPS 2002: Proceedings of the 15th International Conference on Neural Information Processing Systems

Conference

Conference15th International Conference on Neural Information Processing Systems, NIPS 2002
Country/TerritoryCanada
CityVancouver
Period12/9/0212/14/02

All Science Journal Classification (ASJC) codes

  • Signal Processing
  • Computer Networks and Communications
  • Information Systems

Fingerprint

Dive into the research topics of 'A Note on the Representational Incompatibility of Function Approximation and Factored Dynamics'. Together they form a unique fingerprint.

Cite this