TY - JOUR
T1 - Reinforcement Learning and Episodic Memory in Humans and Animals
T2 - An Integrative Framework
AU - Gershman, Samuel J.
AU - Daw, Nathaniel D.
N1 - Funding Information:
The authors are grateful to Daphna Shohamy for longstanding collaborative research underlying many of the ideas in this review and helpful comments on the manuscript. The authors also thank Rahul Bhui for comments on an earlier draft of the manuscript. The authors' research contributing to this review is funded by the National Institute on Drug Abuse, grant number R01DA038891 (N.D.D.); Google DeepMind (N.D.D.); and the Center for Brains, Minds and Machines (CBMM) via National Science Foundation Science and Technology Centers award CCF-1231216 (S.J.G.).
Publisher Copyright:
© Copyright 2017 by Annual Reviews. All rights reserved.
PY - 2017/1/3
Y1 - 2017/1/3
N2 - We review the psychology and neuroscience of reinforcement learning (RL), which has experienced significant progress in the past two decades, enabled by the comprehensive experimental study of simple learning and decision-making tasks. However, one challenge in the study of RL is computational: The simplicity of these tasks ignores important aspects of reinforcement learning in the real world: (a) State spaces are high-dimensional, continuous, and partially observable; this implies that (b) data are relatively sparse and, indeed, precisely the same situation may never be encountered twice; furthermore, (c) rewards depend on the long-term consequences of actions in ways that violate the classical assumptions that make RL tractable. A seemingly distinct challenge is that, cognitively, theories of RL have largely involved procedural and semantic memory, the way in which knowledge about action values or world models extracted gradually from many experiences can drive choice. This focus on semantic memory leaves out many aspects of memory, such as episodic memory, related to the traces of individual events. We suggest that these two challenges are related. The computational challenge can be dealt with, in part, by endowing RL systems with episodic memory, allowing them to (a) efficiently approximate value functions over complex state spaces, (b) learn with very little data, and (c) bridge long-term dependencies between actions and rewards. We review the computational theory underlying this proposal and the empirical evidence to support it. Our proposal suggests that the ubiquitous and diverse roles of memory in RL may function as part of an integrated learning system.
KW - Decision making
KW - Memory
KW - Reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85009518318&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85009518318&partnerID=8YFLogxK
U2 - 10.1146/annurev-psych-122414-033625
DO - 10.1146/annurev-psych-122414-033625
M3 - Article
C2 - 27618944
AN - SCOPUS:85009518318
SN - 0066-4308
VL - 68
SP - 101
EP - 128
JO - Annual Review of Psychology
JF - Annual Review of Psychology
ER -