Learning to use working memory in partially observable environments through dopaminergic reinforcement

Michael T. Todd, Yael Niv, Jonathan D. Cohen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

39 Scopus citations

Abstract

Working memory is a central topic of cognitive neuroscience because it is critical for solving real-world problems in which information from multiple temporally distant sources must be combined to generate appropriate behavior. However, an often-neglected fact is that learning to use working memory effectively is itself a difficult problem. The Gating framework [1-4] is a collection of psychological models that show how dopamine can train the basal ganglia and prefrontal cortex to form useful working memory representations in certain types of problems. We unite Gating with machine learning theory concerning the general problem of memory-based optimal control [5-6]. We present a normative model that learns, by online temporal difference methods, to use working memory to maximize discounted future reward in partially observable settings. The model successfully solves a benchmark working memory problem and exhibits limitations similar to those observed in humans. Our purpose is to introduce a concise, normative definition of high-level cognitive concepts such as working memory and cognitive control in terms of maximizing discounted future rewards.
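
For readers unfamiliar with the terminology, the quantities named in the abstract can be written in standard reinforcement-learning notation. This is a generic sketch and not the paper's own formulation: the symbols $\gamma$, $r_t$, $o_t$, $m_t$, $V$, and $\delta_t$ are assumed here for illustration, and treating the agent's state as the pair of current observation and working-memory contents is only one common way to pose memory-based control.

```latex
% Discounted future reward from time t, with discount factor 0 <= \gamma < 1
G_t = \sum_{k=0}^{\infty} \gamma^{k} \, r_{t+k+1}

% Assumed agent state: current observation o_t paired with memory contents m_t
s_t = (o_t, m_t)

% One-step temporal difference error for a value estimate V
\delta_t = r_{t+1} + \gamma \, V(s_{t+1}) - V(s_t)
```

Under this reading, an online temporal difference method adjusts $V$ (and the policy, including any decision to update memory) in proportion to $\delta_t$ after each step. In dopamine-based accounts, $\delta_t$ is commonly interpreted as a reward prediction error.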

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 21 - Proceedings of the 2008 Conference
Pages: 1689-1696
Number of pages: 8
State: Published - Dec 1 2009
Event: 22nd Annual Conference on Neural Information Processing Systems, NIPS 2008 - Vancouver, BC, Canada
Duration: Dec 8 2008 to Dec 11 2008

Publication series

Name: Advances in Neural Information Processing Systems 21 - Proceedings of the 2008 Conference

Other

Other: 22nd Annual Conference on Neural Information Processing Systems, NIPS 2008
Country: Canada
City: Vancouver, BC
Period: 12/8/08 to 12/11/08

All Science Journal Classification (ASJC) codes

  • Information Systems

  • Cite this

    Todd, M. T., Niv, Y., & Cohen, J. D. (2009). Learning to use working memory in partially observable environments through dopaminergic reinforcement. In Advances in Neural Information Processing Systems 21 - Proceedings of the 2008 Conference (pp. 1689-1696). (Advances in Neural Information Processing Systems 21 - Proceedings of the 2008 Conference).