TY - GEN
T1 - Gated2Gated: Self-Supervised Depth Estimation from Gated Images
T2 - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
AU - Walia, Amanpreet
AU - Walz, Stefanie
AU - Bijelic, Mario
AU - Mannan, Fahim
AU - Julca-Aguilar, Frank
AU - Langer, Michael
AU - Ritter, Werner
AU - Heide, Felix
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Gated cameras hold promise as an alternative to scanning LiDAR sensors with high-resolution 3D depth that is robust to back-scatter in fog, snow, and rain. Instead of sequentially scanning a scene and directly recording depth via the photon time-of-flight, as in pulsed LiDAR sensors, gated imagers encode depth in the relative intensity of a handful of gated slices, captured at megapixel resolution. Although existing methods have shown that it is possible to decode high-resolution depth from such measurements, these methods require synchronized and calibrated LiDAR to supervise the gated depth decoder, prohibiting fast adoption across geographies, training on large unpaired datasets, and exploring alternative applications outside of automotive use cases. In this work, we propose an entirely self-supervised depth estimation method that uses gated intensity profiles and temporal consistency as a training signal. The proposed model is trained end-to-end from gated video sequences, does not require LiDAR or RGB data, and learns to estimate absolute depth values. We take gated slices as input and disentangle the estimation of the scene albedo, depth, and ambient light, which are then used to learn to reconstruct the input slices through a cyclic loss. We rely on temporal consistency between a given frame and neighboring gated slices to estimate depth in regions with shadows and reflections. We experimentally validate that the proposed approach outperforms existing supervised and self-supervised depth estimation methods based on monocular RGB and stereo images, as well as supervised methods based on gated images. Code is available at https://github.com/princeton-computational-imaging/Gated2Gated.
AB - Gated cameras hold promise as an alternative to scanning LiDAR sensors with high-resolution 3D depth that is robust to back-scatter in fog, snow, and rain. Instead of sequentially scanning a scene and directly recording depth via the photon time-of-flight, as in pulsed LiDAR sensors, gated imagers encode depth in the relative intensity of a handful of gated slices, captured at megapixel resolution. Although existing methods have shown that it is possible to decode high-resolution depth from such measurements, these methods require synchronized and calibrated LiDAR to supervise the gated depth decoder, prohibiting fast adoption across geographies, training on large unpaired datasets, and exploring alternative applications outside of automotive use cases. In this work, we propose an entirely self-supervised depth estimation method that uses gated intensity profiles and temporal consistency as a training signal. The proposed model is trained end-to-end from gated video sequences, does not require LiDAR or RGB data, and learns to estimate absolute depth values. We take gated slices as input and disentangle the estimation of the scene albedo, depth, and ambient light, which are then used to learn to reconstruct the input slices through a cyclic loss. We rely on temporal consistency between a given frame and neighboring gated slices to estimate depth in regions with shadows and reflections. We experimentally validate that the proposed approach outperforms existing supervised and self-supervised depth estimation methods based on monocular RGB and stereo images, as well as supervised methods based on gated images. Code is available at https://github.com/princeton-computational-imaging/Gated2Gated.
KW - 3D from multi-view and sensors
KW - 3D from single images
KW - Physics-based vision and shape-from-X
KW - RGBD sensors and analytics
KW - Self- & semi- & meta- & unsupervised learning
UR - http://www.scopus.com/inward/record.url?scp=85134008342&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85134008342&partnerID=8YFLogxK
U2 - 10.1109/CVPR52688.2022.00283
DO - 10.1109/CVPR52688.2022.00283
M3 - Conference contribution
AN - SCOPUS:85134008342
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 2801
EP - 2811
BT - Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
PB - IEEE Computer Society
Y2 - 19 June 2022 through 24 June 2022
ER -