Abstract
Many decisions in everyday life involve a choice between exploring options that are currently unknown and exploiting options that are already known to be rewarding. Previous work has suggested that humans solve such “explore-exploit” dilemmas using a mixture of two strategies: directed exploration, in which information seeking drives exploration by choice, and random exploration, in which behavioral variability drives exploration by chance. One limitation of this previous work was that, like most studies on explore-exploit decision making, it focused exclusively on the domain of gains, where the goal was to maximize reward. In many real-world decisions, however, the goal is to minimize losses and it is well known from Prospect Theory that behavior can be quite different in this domain. In this study, we compared explore-exploit behavior of human subjects under conditions of gain and loss. We found that people use both directed and random exploration regardless of whether they are exploring to maximize gains or minimize losses and that there is quantitative agreement between the exploration parameters across domains. Our results also revealed an overall bias towards the more uncertain option in the domain of losses. While this bias towards uncertainty was qualitatively consistent with the predictions of Prospect Theory, quantitatively we found that the bias was better described by a Bayesian account, in which subjects had a prior that was optimistic for losses and pessimistic for gains. Taken together, our results suggest that explore-exploit decisions are driven by three independent processes: directed and random exploration, and a baseline uncertainty seeking that is driven by a prior.
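As an illustration only, the three processes named in the abstract can be sketched in a few lines of Python. The abstract does not state the model's equations, so everything below is an assumption: the function names, the shrinkage-style posterior mean, the logistic choice rule, and all numbers are hypothetical and chosen only to show the qualitative pattern. Directed exploration is represented by an information bonus, random exploration by a decision-noise parameter, and the baseline uncertainty seeking by a prior that pulls poorly sampled options toward the prior mean.

```python
import math


def posterior_mean(observations, prior_mean, prior_weight=1.0):
    """Shrinkage estimate (hypothetical form): the fewer observations an
    option has, the more its estimated value is pulled toward the prior."""
    n = len(observations)
    return (prior_weight * prior_mean + sum(observations)) / (prior_weight + n)


def p_choose_uncertain(mean_uncertain, mean_known, info_bonus, noise):
    """Logistic choice rule (assumed, not taken from the abstract).

    info_bonus models directed exploration: extra value for the less-known option.
    noise models random exploration: larger values push choices toward 50/50.
    """
    delta = mean_uncertain - mean_known + info_bonus
    return 1.0 / (1.0 + math.exp(-delta / noise))


# Loss domain: outcomes are negative; the assumed prior (-30) is optimistic
# relative to the observed losses of -50.
loss_uncertain = posterior_mean([-50], prior_mean=-30)
loss_known = posterior_mean([-50, -50, -50], prior_mean=-30)

# Gain domain: outcomes are positive; the assumed prior (+30) is pessimistic
# relative to the observed gains of +50.
gain_uncertain = posterior_mean([50], prior_mean=30)
gain_known = posterior_mean([50, 50, 50], prior_mean=30)

# With no information bonus, the prior alone biases choice toward the
# uncertain option for losses and away from it for gains.
p_loss = p_choose_uncertain(loss_uncertain, loss_known, info_bonus=0.0, noise=10.0)
p_gain = p_choose_uncertain(gain_uncertain, gain_known, info_bonus=0.0, noise=10.0)
print(p_loss > 0.5, p_gain < 0.5)
```

In this sketch the same observed magnitudes produce opposite biases in the two domains purely because of the prior, while `info_bonus` and `noise` can be varied independently of it, mirroring the abstract's claim that the three processes are separable.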
| Field | Value |
|---|---|
| Original language | English (US) |
| Pages (from-to) | 104-117 |
| Number of pages | 14 |
| Journal | Judgment and Decision Making |
| Volume | 12 |
| Issue number | 2 |
| State | Published - Mar 2017 |
All Science Journal Classification (ASJC) codes
- General Decision Sciences
- Applied Psychology
- Economics and Econometrics
Keywords
- Decision noise
- Explore-exploit
- Information
- Loss aversion
- Reinforcement learning
- Risk seeking
- Uncertainty