PG3: Policy-Guided Planning for Generalized Policy Generation

  • Ryan Yang
  • Tom Silver
  • Aidan Curtis
  • Tomas Lozano-Perez
  • Leslie Kaelbling

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Scopus citations

Abstract

A longstanding objective in classical planning is to synthesize policies that generalize across multiple problems from the same domain. In this work, we study generalized policy search-based methods with a focus on the score function used to guide the search over policies. We demonstrate limitations of two score functions, policy evaluation and plan comparison, and propose a new approach that overcomes these limitations. The main idea behind our approach, Policy-Guided Planning for Generalized Policy Generation (PG3), is that a candidate policy should be used to guide planning on training problems as a mechanism for evaluating that candidate. Theoretical results in a simplified setting give conditions under which PG3 is optimal or admissible. We then study a specific instantiation of policy search where planning problems are PDDL-based and policies are lifted decision lists. Empirical results in six domains confirm that PG3 learns generalized policies more efficiently and effectively than several baselines.

Original language: English (US)
Title of host publication: Proceedings of the 31st International Joint Conference on Artificial Intelligence, IJCAI 2022
Editors: Luc De Raedt
Publisher: International Joint Conferences on Artificial Intelligence
Pages: 4686-4692
Number of pages: 7
ISBN (Electronic): 9781956792003
State: Published - 2022
Externally published: Yes
Event: 31st International Joint Conference on Artificial Intelligence, IJCAI 2022 - Vienna, Austria
Duration: Jul 23 2022 to Jul 29 2022

Publication series

Name: IJCAI International Joint Conference on Artificial Intelligence
ISSN (Print): 1045-0823

Conference

Conference: 31st International Joint Conference on Artificial Intelligence, IJCAI 2022
Country/Territory: Austria
City: Vienna
Period: 7/23/22 to 7/29/22

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
