Risk-sensitive inverse reinforcement learning via coherent risk models

Anirudha Majumdar, Sumeet Singh, Ajay Mandlekar, Marco Pavone

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

37 Scopus citations

Abstract

The literature on Inverse Reinforcement Learning (IRL) typically assumes that humans take actions in order to minimize the expected value of a cost function, i.e., that humans are risk neutral. Yet, in practice, humans are often far from being risk neutral. To fill this gap, the objective of this paper is to devise a framework for risk-sensitive IRL in order to explicitly account for an expert's risk sensitivity. To this end, we propose a flexible class of models based on coherent risk metrics, which allow us to capture an entire spectrum of risk preferences from risk-neutral to worst-case. We propose efficient algorithms based on Linear Programming for inferring an expert's underlying risk metric and cost function for a rich class of static and dynamic decision-making settings. The resulting approach is demonstrated on a simulated driving game with ten human participants. Our method is able to infer and mimic a wide range of qualitatively different driving styles from highly risk-averse to risk-neutral in a data-efficient manner. Moreover, comparisons of the Risk-Sensitive (RS) IRL approach with a risk-neutral model show that the RS-IRL framework more accurately captures observed participant behavior both qualitatively and quantitatively.
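As an illustration of the "spectrum of risk preferences" mentioned in the abstract, the sketch below uses Conditional Value-at-Risk (CVaR), a standard example of a coherent risk metric. It is not taken from the paper; the function name `cvar`, the tail-level parameter `alpha`, and the toy cost distribution are assumptions made for illustration only. CVaR can be written as a small linear program (maximize the expected cost over reweighted distributions `q` with `0 <= q_i <= p_i / alpha` and `sum(q) = 1`), which for a discrete distribution is solved greedily by loading probability mass onto the costliest outcomes first.

```python
def cvar(costs, probs, alpha):
    """CVaR of a discrete cost distribution at tail level alpha in (0, 1].

    Illustrative sketch only: greedily solves the LP
        max_q  sum_i q_i * cost_i
        s.t.   0 <= q_i <= p_i / alpha,  sum_i q_i = 1,
    by assigning as much mass as allowed to the highest-cost outcomes.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    remaining = 1.0  # probability mass still to be assigned
    value = 0.0
    # Visit outcomes from highest cost to lowest cost.
    for cost, p in sorted(zip(costs, probs), reverse=True):
        q = min(p / alpha, remaining)
        value += q * cost
        remaining -= q
        if remaining <= 0.0:
            break
    return value


# Hypothetical toy distribution over three cost outcomes.
costs = [0.0, 1.0, 10.0]
probs = [0.5, 0.4, 0.1]

risk_neutral = cvar(costs, probs, 1.0)  # alpha = 1 recovers the expected cost
tail_half = cvar(costs, probs, 0.5)     # mean of the worst 50% of outcomes
worst_case = cvar(costs, probs, 0.1)    # small alpha approaches the maximum cost
```

Varying `alpha` from 1 toward 0 moves the evaluation from the risk-neutral expectation toward the worst-case cost, which is one concrete instance of the range of preferences a coherent risk model can express.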

Original language: English (US)
Title of host publication: Robotics
Subtitle of host publication: Science and Systems XIII, RSS 2017
Editors: Nancy Amato, Siddhartha Srinivasa, Nora Ayanian, Scott Kuindersma
Publisher: MIT Press Journals
ISBN (Electronic): 9780992374730
State: Published - 2017
Externally published: Yes
Event: 2017 Robotics: Science and Systems, RSS 2017 - Cambridge, United States
Duration: Jul 12 2017 → Jul 16 2017

Publication series

Name: Robotics: Science and Systems
Volume: 13
ISSN (Electronic): 2330-765X

Other

Other: 2017 Robotics: Science and Systems, RSS 2017
Country/Territory: United States
City: Cambridge
Period: 7/12/17 → 7/16/17

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Control and Systems Engineering
  • Electrical and Electronic Engineering
