Abstract
Dynamic convolution has demonstrated substantial performance improvements for convolutional neural networks. Previous aggregation-based dynamic convolution methods suffer from parameter and memory inefficiency, and from learning difficulty caused by the scalar-type attention used for aggregation. To address these limitations, we propose a parameter-efficient dynamic convolution operator (dubbed PEDConv) that learns to discriminatively perturb the spatial, input, and output filters of a shared base convolution weight through a tensor-decomposition-based, input-dependent reparameterization. Our method considerably reduces the number of parameters compared to prior art and limits the computational cost to maintain inference efficiency. Meanwhile, PEDConv significantly boosts accuracy when substituted for standard convolutions on a range of prevalent deep learning tasks, including ImageNet classification, COCO object detection, ADE20K semantic segmentation, and adversarial robustness. For example, on ImageNet classification, PEDConv applied to ResNet-50 achieves 80.5% Top-1 accuracy at almost the same computational cost as the static convolutional baseline, outperforming the previous best dynamic convolution method by 1.9%. Moreover, the proposed method can be readily extended to a regime that is dynamic in both input and spatial location, with adaptive reparameterization at different spatial positions, in which case ResNet-50 achieves 79.3% Top-1 accuracy while reducing FLOPs by 44% compared to the baseline model.
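To make the idea of perturbing a shared base weight concrete, the following is a minimal illustrative sketch, not the PEDConv formulation itself. The abstract does not state the exact tensor decomposition, so the sketch assumes a rank-1 multiplicative perturbation along the output, input, and spatial (kernel-tap) axes, driven by a globally pooled input descriptor. It uses 1D convolution and plain Python for brevity; all function names and the `heads` dictionary are hypothetical.

```python
import math

def global_avg_pool(x):
    """x[C_in][L] -> one mean per input channel (the input descriptor)."""
    return [sum(ch) / len(ch) for ch in x]

def linear_tanh(weights, v):
    """Hypothetical attention head: tanh(W v), so each factor lies in (-1, 1)."""
    return [math.tanh(sum(w * vi for w, vi in zip(row, v))) for row in weights]

def dynamic_weight(base, ctx, heads):
    """Assumed rank-1 perturbation of the shared base weight base[o][i][k]."""
    a = linear_tanh(heads["out"], ctx)      # one factor per output filter
    b = linear_tanh(heads["in"], ctx)       # one factor per input channel
    c = linear_tanh(heads["spatial"], ctx)  # one factor per kernel tap
    return [[[base[o][i][k] * (1 + a[o]) * (1 + b[i]) * (1 + c[k])
              for k in range(len(c))]
             for i in range(len(b))]
            for o in range(len(a))]

def conv1d(x, w):
    """Valid 1D cross-correlation: x[C_in][L], w[C_out][C_in][K] -> y[C_out][L-K+1]."""
    k = len(w[0][0])
    length = len(x[0]) - k + 1
    return [[sum(w[o][i][t] * x[i][p + t]
                 for i in range(len(x)) for t in range(k))
             for p in range(length)]
            for o in range(len(w))]

def dynamic_conv1d(x, base, heads):
    """Build an input-dependent weight from the shared base, then convolve."""
    ctx = global_avg_pool(x)
    return conv1d(x, dynamic_weight(base, ctx, heads))

# Toy example: 2 input channels, 1 output channel, kernel size 2.
x = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
base = [[[1.0, 0.0], [0.0, 1.0]]]

# With all-zero heads every factor is tanh(0) = 0, so the weight is unperturbed.
zero_heads = {"out": [[0.0, 0.0]],
              "in": [[0.0, 0.0], [0.0, 0.0]],
              "spatial": [[0.0, 0.0], [0.0, 0.0]]}
# A non-zero output head rescales the output filter depending on the input.
out_heads = dict(zero_heads, out=[[0.5, 0.5]])

y_static = conv1d(x, base)
y_zero = dynamic_conv1d(x, base, zero_heads)
y_dyn = dynamic_conv1d(x, base, out_heads)
```

Under these assumptions, the dynamic part adds only (C_out + C_in + K) × C_in head weights on top of a single base kernel, whereas an aggregation-based scheme that mixes N static kernels must store all N of them. The actual operator in the paper may differ in rank, nonlinearity, and how the perturbation is applied.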
Original language | English (US) |
---|---|
State | Published - 2021 |
Event | 32nd British Machine Vision Conference, BMVC 2021 - Virtual, Online; Duration: Nov 22 2021 → Nov 25 2021 |
Conference
Conference | 32nd British Machine Vision Conference, BMVC 2021 |
---|---|
City | Virtual, Online |
Period | 11/22/21 → 11/25/21 |
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Computer Vision and Pattern Recognition