Deep Patch Visual SLAM

Lahav Lipson, Zachary Teed, Jia Deng

Research output: Chapter in Book/Report/Conference proceedingConference contribution

11 Scopus citations

Abstract

Recent work in Visual Odometry and SLAM has shown the effectiveness of using deep network backbones. Despite excellent accuracy, such approaches are often expensive to run or do not generalize well zero-shot. To address this problem, we introduce Deep Patch Visual-SLAM, a new system for monocular visual SLAM based on the DPVO visual odometry system. We introduce two loop closure mechanisms which significantly improve the accuracy with minimal runtime and memory overhead. On real-world datasets, DPV-SLAM runs at 1x-3x real-time framerates. We achieve comparable accuracy to DROID-SLAM on EuRoC and TartanAir while running twice as fast using a third of the VRAM. We also outperform DROID-SLAM by large margins on KITTI. As DPV-SLAM is an extension to DPVO, its code can be found in the same repository: https://github.com/princeton-vl/DPVO

Original languageEnglish (US)
Title of host publicationComputer Vision – ECCV 2024 - 18th European Conference, Proceedings
EditorsAleš Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol
PublisherSpringer Science and Business Media Deutschland GmbH
Pages424-440
Number of pages17
ISBN (Print)9783031726262
DOIs
StatePublished - 2025
Event18th European Conference on Computer Vision, ECCV 2024 - Milan, Italy
Duration: Sep 29 2024Oct 4 2024

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume15060 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference18th European Conference on Computer Vision, ECCV 2024
Country/TerritoryItaly
CityMilan
Period9/29/2410/4/24

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

Keywords

  • Monocular
  • SLAM
  • Visual Odometry

Fingerprint

Dive into the research topics of 'Deep Patch Visual SLAM'. Together they form a unique fingerprint.

Cite this