Goals as reward-producing programs

Guy Davidson, Graham Todd, Julian Togelius, Todd M. Gureckis, Brenden M. Lake

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

People are remarkably capable of generating their own goals, beginning with child’s play and continuing into adulthood. Despite considerable empirical and computational work on goals and goal-oriented behaviour, models are still far from capturing the richness of everyday human goals. Here we bridge this gap by collecting a dataset of human-generated playful goals (in the form of scorable, single-player games), modelling them as reward-producing programs and generating novel human-like goals through program synthesis. Reward-producing programs capture the rich semantics of goals through symbolic operations that compose, add temporal constraints and allow program execution on behavioural traces to evaluate progress. To build a generative model of goals, we learn a fitness function over the infinite set of possible goal programs and sample novel goals with a quality-diversity algorithm. Human evaluators found that model-generated goals, when sampled from partitions of program space occupied by human examples, were indistinguishable from human-created games. We also discovered that our model’s internal fitness scores predict games that are evaluated as more fun to play and more human-like.
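
To make the idea of a reward-producing program concrete, the following is a minimal sketch, not the authors' representation or code. It assumes a behavioural trace is a list of boolean feature dictionaries, and every name in it (`make_state`, `both`, `negate`, `Sequence`, `Goal`) is hypothetical. It illustrates the three properties named above: predicates compose, a temporal constraint orders them, and the program is executed on a trace to produce a score.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: one time step of a behavioural trace is a dict of boolean features.
State = Dict[str, bool]
Predicate = Callable[[State], bool]


def make_state(holding=False, in_motion=False, in_bin=False) -> State:
    """Hypothetical helper that builds one trace state with all features set."""
    return {"holding": holding, "in_motion": in_motion, "in_bin": in_bin}


def both(p: Predicate, q: Predicate) -> Predicate:
    """Compose two predicates with logical AND."""
    return lambda s: p(s) and q(s)


def negate(p: Predicate) -> Predicate:
    """Logical NOT of a predicate."""
    return lambda s: not p(s)


@dataclass
class Sequence:
    """Temporal constraint: the steps must be satisfied in order over the trace."""
    steps: List[Predicate]

    def count(self, trace: List[State]) -> int:
        """Count non-overlapping, in-order completions of all steps."""
        if not self.steps:
            return 0
        completed, i = 0, 0
        for state in trace:
            if self.steps[i](state):
                i += 1
                if i == len(self.steps):
                    completed += 1
                    i = 0  # start looking for the next completion
        return completed


@dataclass
class Goal:
    """A scorable goal: points awarded per completion of the constraint."""
    constraint: Sequence
    points: int

    def reward(self, trace: List[State]) -> int:
        return self.points * self.constraint.count(trace)


holding = lambda s: s["holding"]
in_motion = lambda s: s["in_motion"]
in_bin = lambda s: s["in_bin"]

# "Throw the ball into the bin": hold it, release it in motion, then it rests in the bin.
throw_into_bin = Goal(
    constraint=Sequence([holding, both(in_motion, negate(holding)), in_bin]),
    points=5,
)

trace = [
    make_state(holding=True),
    make_state(in_motion=True),
    make_state(in_bin=True),    # first throw lands in the bin
    make_state(holding=True),
    make_state(in_motion=True),
    make_state(),               # second throw misses
]

score = throw_into_bin.reward(trace)  # one completed throw, worth 5 points
```

In this toy trace only the first throw satisfies all three steps in order, so the goal awards points once. A richer sketch would add typed objects, quantifiers and scoring terms, none of which are specified in this record.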

Original language: English (US)
Article number: 5972
Pages (from-to): 205-220
Number of pages: 16
Journal: Nature Machine Intelligence
Volume: 7
Issue number: 2
State: Published - Feb 2025
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
  • Artificial Intelligence
