Abstract
Robots equipped with rich sensing modalities (e.g., RGB-D cameras) performing long-horizon tasks motivate the need for policies that are highly memory-efficient. State-of-the-art approaches for controlling robots often use memory representations that are excessively rich for the task or rely on hand-crafted tricks for memory efficiency. Instead, this work provides a general approach for jointly synthesizing memory representations and policies; the resulting policies actively seek to reduce memory requirements. Specifically, we present a reinforcement learning framework that leverages a group LASSO regularization to synthesize policies that employ low-dimensional and task-centric memory representations. We demonstrate the efficacy of our approach with simulated examples including navigation in discrete and continuous spaces as well as vision-based indoor navigation in a photo-realistic simulator. The results on these examples indicate that our method finds policies that rely only on low-dimensional memory representations, generalize better, and actively reduce memory requirements.
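To make the regularization idea concrete, below is a minimal sketch (not the paper's implementation) of how a group LASSO penalty can be attached to the memory of a recurrent policy. The class names, the GRU-based memory, and the choice of grouping all weights tied to each hidden unit are assumptions for illustration; penalizing each group's L2 norm encourages entire memory dimensions to shrink to zero, leaving a low-dimensional, task-centric memory.

```python
# Minimal sketch, assuming a GRU-based recurrent policy (hypothetical names).
# Each memory (hidden) dimension defines one weight group; the penalty is the
# sum of per-group L2 norms, which drives whole memory dimensions toward zero.
import torch
import torch.nn as nn


class RecurrentPolicy(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int, mem_dim: int = 64):
        super().__init__()
        self.rnn = nn.GRUCell(obs_dim, mem_dim)   # memory update
        self.head = nn.Linear(mem_dim, act_dim)   # action logits

    def forward(self, obs, mem):
        mem = self.rnn(obs, mem)
        return self.head(mem), mem


def group_lasso_penalty(policy: RecurrentPolicy) -> torch.Tensor:
    """Sum of L2 norms of the weight groups tied to each memory unit."""
    penalty = torch.zeros(())
    # Recurrent weights: one group per memory dimension (columns of weight_hh).
    penalty = penalty + policy.rnn.weight_hh.norm(dim=0).sum()
    # Readout weights consuming each memory dimension (columns of head.weight).
    penalty = penalty + policy.head.weight.norm(dim=0).sum()
    return penalty


# Hypothetical usage inside an RL training step:
#   loss = rl_loss + lambda_group * group_lasso_penalty(policy)
# where lambda_group trades task performance against memory dimensionality.
```

In this sketch, memory dimensions whose associated weight groups are driven to zero can be pruned after training, which is one way a policy can end up with a smaller memory than the network it was trained with.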
| Original language | English (US) |
|---|---|
| Pages (from-to) | 125-137 |
| Number of pages | 13 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 144 |
| State | Published - 2021 |
| Externally published | Yes |
| Event | 3rd Annual Conference on Learning for Dynamics and Control, L4DC 2021 - Virtual, Online, Switzerland |
| Event duration | Jun 7 2021 → Jun 8 2021 |
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability
Keywords
- Memory-Efficiency
- Navigation
- Reinforcement Learning