TY - GEN
T1 - How fast to work
T2 - 2005 Annual Conference on Neural Information Processing Systems, NIPS 2005
AU - Niv, Yael
AU - Daw, Nathaniel D.
AU - Dayan, Peter
PY - 2005
Y1 - 2005
AB - Reinforcement learning models have long promised to unify computational, psychological and neural accounts of appetitively conditioned behavior. However, the bulk of data on animal conditioning comes from free-operant experiments measuring how fast animals will work for reinforcement. Existing reinforcement learning (RL) models are silent about these tasks, because they lack any notion of vigor. They thus fail to address the simple observation that hungrier animals will work harder for food, as well as stranger facts such as their sometimes greater productivity even when working for irrelevant outcomes such as water. Here, we develop an RL framework for free-operant behavior, suggesting that subjects choose how vigorously to perform selected actions by optimally balancing the costs and benefits of quick responding. Motivational states such as hunger shift these factors, skewing the tradeoff. This accounts normatively for the effects of motivation on response rates, as well as many other classic findings. Finally, we suggest that tonic levels of dopamine may be involved in the computation linking motivational state to optimal responding, thereby explaining the complex vigor-related effects of pharmacological manipulation of dopamine.
UR - http://www.scopus.com/inward/record.url?scp=33745774340&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33745774340&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:33745774340
SN - 9780262232531
T3 - Advances in Neural Information Processing Systems
SP - 1019
EP - 1026
BT - Advances in Neural Information Processing Systems 18 - Proceedings of the 2005 Conference
Y2 - 5 December 2005 through 8 December 2005
ER -