TY - JOUR
T1 - Reinforcement learning in the brain
AU - Niv, Yael
N1 - Funding Information:
The author wishes to thank P. Read Montague and Paul Glimcher for comments and contribution to earlier versions of this paper, and Michael Todd for comments on a previous draft. Parts of this work were published as a chapter in “Neuroeconomics: Decision making and the brain” (P. W. Glimcher, C. Camerer, E. Fehr, and R. Poldrack, editors). This work was supported by the Human Frontiers Science Program.
PY - 2009/6
Y1 - 2009/6
N2 - A wealth of research focuses on the decision-making processes that animals and humans employ when selecting actions in the face of reward and punishment. Initially such work stemmed from psychological investigations of conditioned behavior, and explanations of these in terms of computational models. Increasingly, analysis at the computational level has drawn on ideas from reinforcement learning, which provide a normative framework within which decision-making can be analyzed. More recently, the fruits of these extensive lines of research have made contact with investigations into the neural basis of decision making. Converging evidence now links reinforcement learning to specific neural substrates, assigning them precise computational roles. Specifically, electrophysiological recordings in behaving animals and functional imaging of human decision-making have revealed in the brain the existence of a key reinforcement learning signal, the temporal difference reward prediction error. Here, we first introduce the formal reinforcement learning framework. We then review the multiple lines of evidence linking reinforcement learning to the function of dopaminergic neurons in the mammalian midbrain and to more recent data from human imaging experiments. We further extend the discussion to aspects of learning not associated with phasic dopamine signals, such as learning of goal-directed responding that may not be dopamine-dependent, and learning about the vigor (or rate) with which actions should be performed that has been linked to tonic aspects of dopaminergic signaling. We end with a brief discussion of some of the limitations of the reinforcement learning framework, highlighting questions for future research.
UR - http://www.scopus.com/inward/record.url?scp=67349283062&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=67349283062&partnerID=8YFLogxK
U2 - 10.1016/j.jmp.2008.12.005
DO - 10.1016/j.jmp.2008.12.005
M3 - Article
AN - SCOPUS:67349283062
SN - 0022-2496
VL - 53
SP - 139
EP - 154
JO - Journal of Mathematical Psychology
JF - Journal of Mathematical Psychology
IS - 3
ER -