TY - GEN
T1 - LESS is more
T2 - 15th Annual ACM/IEEE International Conference on Human Robot Interaction, HRI 2020
AU - Bobu, Andreea
AU - Scobee, Dexter R.R.
AU - Fisac, Jaime F.
AU - Sastry, S. Shankar
AU - Dragan, Anca D.
N1 - Publisher Copyright: © 2020 Association for Computing Machinery.
PY - 2020/3/9
Y1 - 2020/3/9
N2 - Robots need models of human behavior for both inferring human goals and preferences, and predicting what people will do. A common model is the Boltzmann noisily-rational decision model, which assumes people approximately optimize a reward function and choose trajectories in proportion to their exponentiated reward. While this model has been successful in a variety of robotics domains, its roots lie in econometrics, and in modeling decisions among different discrete options, each with its own utility or reward. In contrast, human trajectories lie in a continuous space, with continuous-valued features that influence the reward function. We propose that it is time to rethink the Boltzmann model, and design it from the ground up to operate over such trajectory spaces. We introduce a model that explicitly accounts for distances between trajectories, rather than only their rewards. Rather than each trajectory affecting the decision independently, similar trajectories now affect the decision together. We start by showing that our model better explains human behavior in a user study. We then analyze the implications this has for robot inference, first in toy environments where we have ground truth and find more accurate inference, and finally for a 7DOF robot arm learning from user demonstrations.
AB - Robots need models of human behavior for both inferring human goals and preferences, and predicting what people will do. A common model is the Boltzmann noisily-rational decision model, which assumes people approximately optimize a reward function and choose trajectories in proportion to their exponentiated reward. While this model has been successful in a variety of robotics domains, its roots lie in econometrics, and in modeling decisions among different discrete options, each with its own utility or reward. In contrast, human trajectories lie in a continuous space, with continuous-valued features that influence the reward function. We propose that it is time to rethink the Boltzmann model, and design it from the ground up to operate over such trajectory spaces. We introduce a model that explicitly accounts for distances between trajectories, rather than only their rewards. Rather than each trajectory affecting the decision independently, similar trajectories now affect the decision together. We start by showing that our model better explains human behavior in a user study. We then analyze the implications this has for robot inference, first in toy environments where we have ground truth and find more accurate inference, and finally for a 7DOF robot arm learning from user demonstrations.
KW - Human decision modeling
KW - Robot inference and prediction
UR - http://www.scopus.com/inward/record.url?scp=85081991966&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85081991966&partnerID=8YFLogxK
U2 - 10.1145/3319502.3374811
DO - 10.1145/3319502.3374811
M3 - Conference contribution
AN - SCOPUS:85081991966
T3 - ACM/IEEE International Conference on Human-Robot Interaction
SP - 429
EP - 437
BT - HRI 2020 - Proceedings of the 2020 ACM/IEEE International Conference on Human-Robot Interaction
PB - IEEE Computer Society
Y2 - 23 March 2020 through 26 March 2020
ER -