TY - GEN
T1 - End-to-end High Dynamic Range Camera Pipeline Optimization
AU - Robidoux, Nicolas
AU - Seo, Dong Eun
AU - Ariza, Federico
AU - García Capel, Luis E.
AU - Sharma, Avinash
AU - Heide, Felix
N1 - Funding Information: We thank Emmanuel Onzon, Doug Taylor and Jean-François Taillon for fruitful discussions. Publisher Copyright: © 2021 IEEE
PY - 2021
Y1 - 2021
N2 - The real world is a 280 dB High Dynamic Range (HDR) world which imaging sensors cannot record in a single shot. HDR cameras acquire multiple measurements with different exposures, gains and photodiodes, from which an Image Signal Processor (ISP) reconstructs an HDR image. Dynamic scene HDR image recovery is an open challenge because of motion and because stitched captures have different noise characteristics, resulting in artifacts that ISPs must resolve in real time at double-digit megapixel resolutions. Traditionally, ISP settings used by downstream vision modules are chosen by domain experts; such frozen camera designs are then used for training data acquisition and supervised learning of downstream vision modules. We depart from this paradigm and formulate HDR ISP hyperparameter search as an end-to-end optimization problem, proposing a mixed 0th and 1st-order block coordinate descent optimizer that jointly learns sensor, ISP and detector network weights using RAW image data augmented with emulated SNR transition region artifacts. We assess the proposed method for human vision and image understanding. For automotive object detection, the method improves mAP and mAR by 33% over expert-tuning and 22% over state-of-the-art optimization methods, outperforming expert-tuned HDR imaging and vision pipelines in all HDR laboratory rig and field experiments.
UR - http://www.scopus.com/inward/record.url?scp=85117122372&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85117122372&partnerID=8YFLogxK
U2 - 10.1109/CVPR46437.2021.00623
DO - 10.1109/CVPR46437.2021.00623
M3 - Conference contribution
AN - SCOPUS:85117122372
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 6293
EP - 6303
BT - Proceedings - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
PB - IEEE Computer Society
T2 - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2021
Y2 - 19 June 2021 through 25 June 2021
ER -