TY - GEN
T1 - Equivariant multi-view networks
AU - Esteves, Carlos
AU - Xu, Yinshuang
AU - Allec-Blanchette, Christine
AU - Daniilidis, Kostas
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/10
Y1 - 2019/10
N2 - Several popular approaches to 3D vision tasks process multiple views of the input independently with deep neural networks pre-trained on natural images, where view permutation invariance is achieved through a single round of pooling over all views. We argue that this operation discards important information and leads to subpar global descriptors. In this paper, we propose a group convolutional approach to multiple view aggregation where convolutions are performed over a discrete subgroup of the rotation group, enabling, thus, joint reasoning over all views in an equivariant (instead of invariant) fashion, up to the very last layer. We further develop this idea to operate on smaller discrete homogeneous spaces of the rotation group, where a polar view representation is used to maintain equivariance with only a fraction of the number of input views. We set the new state of the art in several large scale 3D shape retrieval tasks, and show additional applications to panoramic scene classification.
AB - Several popular approaches to 3D vision tasks process multiple views of the input independently with deep neural networks pre-trained on natural images, where view permutation invariance is achieved through a single round of pooling over all views. We argue that this operation discards important information and leads to subpar global descriptors. In this paper, we propose a group convolutional approach to multiple view aggregation where convolutions are performed over a discrete subgroup of the rotation group, enabling, thus, joint reasoning over all views in an equivariant (instead of invariant) fashion, up to the very last layer. We further develop this idea to operate on smaller discrete homogeneous spaces of the rotation group, where a polar view representation is used to maintain equivariance with only a fraction of the number of input views. We set the new state of the art in several large scale 3D shape retrieval tasks, and show additional applications to panoramic scene classification.
UR - http://www.scopus.com/inward/record.url?scp=85081925035&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85081925035&partnerID=8YFLogxK
U2 - 10.1109/ICCV.2019.00165
DO - 10.1109/ICCV.2019.00165
M3 - Conference contribution
AN - SCOPUS:85081925035
T3 - Proceedings of the IEEE International Conference on Computer Vision
SP - 1568
EP - 1577
BT - Proceedings - 2019 International Conference on Computer Vision, ICCV 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th IEEE/CVF International Conference on Computer Vision, ICCV 2019
Y2 - 27 October 2019 through 2 November 2019
ER -