Status | Published |
ParameterNet: Parameters are All You Need for Large-Scale Visual Pretraining of Mobile Networks | |
Han, Kai1,2; Wang, Yunhe1; Guo, Jianyuan1,3; Wu, Enhua2,4 | |
2024-09 | |
Conference Name | 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) |
Source Publication | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
Pages | 15751-15761 |
Conference Date | 16-22 June 2024 |
Conference Place | Seattle, WA, USA |
Country | USA |
Publisher | IEEE Computer Society |
Abstract | Large-scale visual pretraining has significantly improved the performance of large vision models. However, we observe the low-FLOPs pitfall: existing low-FLOPs models cannot benefit from large-scale pretraining. In this paper, we introduce a novel design principle, termed ParameterNet, aimed at augmenting the number of parameters in large-scale visual pretraining models while minimizing the increase in FLOPs. We leverage dynamic convolutions to incorporate additional parameters into the networks with only a marginal rise in FLOPs. The ParameterNet approach allows low-FLOPs networks to take advantage of large-scale visual pretraining. Furthermore, we extend the ParameterNet concept to the language domain to enhance inference results while preserving inference speed. Experiments on the large-scale ImageNet-22K have shown the superiority of our ParameterNet scheme. For example, ParameterNet-600M achieves higher accuracy than the widely used Swin Transformer (81.6% vs. 80.9%) with much lower FLOPs (0.6G vs. 4.5G). The code will be released at https://parameternet.github.io/. |
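The mechanism the abstract leans on is dynamic convolution: keep K expert kernels (roughly K times the parameters), route each input to a softmax mixture over them, and run a single convolution with the aggregated kernel, so the per-pixel compute barely changes. A minimal NumPy sketch of this idea follows; all names, shapes, and the routing head here are illustrative, not the authors' implementation.

```python
import numpy as np

def dynamic_conv2d(x, experts, routing_w, routing_b):
    """Dynamic convolution sketch: mix K expert kernels per input, then do ONE conv.

    x                    : input feature map, shape (C_in, H, W)
    experts              : K expert kernels, shape (K, C_out, C_in, k, k)
    routing_w, routing_b : tiny routing head mapping pooled input -> K logits
    """
    K, C_out, C_in, k, _ = experts.shape

    # 1. Route: global-average-pool the input and produce softmax weights over experts.
    pooled = x.mean(axis=(1, 2))                   # (C_in,)
    logits = routing_w @ pooled + routing_b        # (K,)
    alpha = np.exp(logits - logits.max())
    alpha /= alpha.sum()

    # 2. Aggregate: weighted sum of expert kernels -> a single kernel.
    #    This is the cheap step: its cost is independent of H and W, which is
    #    why parameters grow ~K-fold while FLOPs barely move.
    kernel = np.tensordot(alpha, experts, axes=1)  # (C_out, C_in, k, k)

    # 3. Convolve once with the aggregated kernel (stride 1, no padding).
    H, W = x.shape[1:]
    out = np.zeros((C_out, H - k + 1, W - k + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = x[:, i:i + k, j:j + k]               # (C_in, k, k)
            out[:, i, j] = np.tensordot(kernel, patch, axes=3)
    return out
```

With K = 1 the softmax weight is exactly 1 and this reduces to a plain convolution, which makes the FLOPs comparison concrete: the extra cost over a static conv is only the routing head plus the kernel mixing.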
Keyword | Convolutional codes; Visualization; Computer vision; Accuracy; Transformers; Pattern recognition |
DOI | 10.1109/CVPR52733.2024.01491 |
Language | English |
Scopus ID | 2-s2.0-85202632508 |
Document Type | Conference paper |
Collection | Faculty of Science and Technology |
Affiliation | 1.Huawei Noah's Ark Lab, Canada 2.The University of Sydney 3.State Key Lab of Computer Science, ISCAS & UCAS 4.Faculty of Science and Technology, University of Macau |
Recommended Citation GB/T 7714 | Han, Kai, Wang, Yunhe, Guo, Jianyuan, et al. ParameterNet: Parameters are All You Need for Large-Scale Visual Pretraining of Mobile Networks[C]. IEEE Computer Society, 2024: 15751-15761.
APA | Han, K., Wang, Y., Guo, J., & Wu, E. (2024). ParameterNet: Parameters are All You Need for Large-Scale Visual Pretraining of Mobile Networks. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 15751-15761.