Shaping Model-Free Habits with Model-Based Goals

Paul M. Krueger, Thomas L. Griffiths

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

Model-free (MF) and model-based (MB) reinforcement learning (RL) have provided a successful framework for understanding both human behavior and neural data. These two systems are usually thought to compete for control of behavior. However, it has also been proposed that they can be integrated in a cooperative manner. For example, the Dyna algorithm uses MB replay of past experience to train the MF system, and has inspired research examining whether human learners do something similar. Here we introduce an approach that links MF and MB learning in a new way: via the reward function. Given a model of the learning environment, dynamic programming is used to iteratively approximate state values that monotonically converge to the state values under the optimal decision policy. Pseudorewards are calculated from these values and used to shape the reward function of a MF learner in a way that is guaranteed not to change the optimal policy. We show that this method offers computational advantages over Dyna in two classic problems. It also offers a new way to think about integrating MF and MB RL: that our knowledge of the world doesn't just provide a source of simulated experience for training our instincts, but that it shapes the rewards that those instincts latch onto. We discuss psychological phenomena that this theory could apply to, including moral emotions.
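The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the pseudorewards take the standard potential-based shaping form r + γΦ(s') − Φ(s), with the potential Φ set to state values from a few sweeps of value iteration. The six-state chain environment, the tabular Q-learner, and every name and parameter value below are hypothetical choices made for illustration.

```python
import random

# Hypothetical toy task: a chain of states 0..5 where state 5 is a terminal goal.
N_STATES = 6
GOAL = N_STATES - 1
ACTIONS = (-1, +1)   # step left or step right
GAMMA = 0.9          # discount factor (assumed value)


def step(state, action):
    """Deterministic model of the chain: reward 1 on entering the goal, else 0."""
    nxt = min(max(state + action, 0), GOAL)
    return nxt, (1.0 if nxt == GOAL else 0.0)


def value_iteration(n_sweeps):
    """Model-based part: n_sweeps of synchronous Bellman backups from V = 0."""
    v = [0.0] * N_STATES
    for _ in range(n_sweeps):
        new_v = v[:]
        for s in range(GOAL):  # the terminal goal keeps value 0
            outcomes = [step(s, a) for a in ACTIONS]
            new_v[s] = max(r + GAMMA * v[s2] for s2, r in outcomes)
        v = new_v
    return v


def shaped_reward(reward, state, next_state, potential):
    """Pseudoreward added to the raw reward: gamma * phi(s') - phi(s)."""
    return reward + GAMMA * potential[next_state] - potential[state]


def q_learning(potential, episodes=50, alpha=0.5, epsilon=0.1, seed=0):
    """Model-free part: tabular Q-learning on the shaped reward.

    Returns the total number of steps taken across all episodes.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    total_steps = 0
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:  # greedy choice with random tie-breaking
                best = max(q[s])
                a = rng.choice([i for i in range(2) if q[s][i] == best])
            s2, r = step(s, ACTIONS[a])
            r = shaped_reward(r, s, s2, potential)
            target = r if s2 == GOAL else r + GAMMA * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            total_steps += 1
            if s == GOAL:
                break
    return total_steps


if __name__ == "__main__":
    no_shaping = [0.0] * N_STATES     # zero potential leaves rewards unchanged
    shaping = value_iteration(3)      # partial model-based estimate
    print("steps without shaping:", q_learning(no_shaping))
    print("steps with shaping:   ", q_learning(shaping))
```

In this sketch the terminal state has potential zero, which the shaping guarantee needs in episodic tasks. Varying the number of sweeps passed to `value_iteration` trades model-based computation against how informative the shaped reward is for the model-free learner.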

Original language: English (US)
Title of host publication: Proceedings of the 40th Annual Meeting of the Cognitive Science Society, CogSci 2018
Publisher: The Cognitive Science Society
Pages: 1975-1980
Number of pages: 6
ISBN (Electronic): 9780991196784
State: Published - 2018
Externally published: Yes
Event: 40th Annual Meeting of the Cognitive Science Society: Changing Minds, CogSci 2018 - Madison, United States
Duration: Jul 25 2018 → Jul 28 2018

Publication series

Name: Proceedings of the 40th Annual Meeting of the Cognitive Science Society, CogSci 2018

Conference

Conference: 40th Annual Meeting of the Cognitive Science Society: Changing Minds, CogSci 2018
Country/Territory: United States
City: Madison
Period: 7/25/18 → 7/28/18

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Science Applications
  • Human-Computer Interaction
  • Cognitive Neuroscience
