Status | Published |
ParameterNet: Parameters are All You Need for Large-Scale Visual Pretraining of Mobile Networks | |
Han, Kai1,2; Wang, Yunhe1; Guo, Jianyuan1,3; Wu, Enhua2,4 | |
2024-09 | |
Conference Name | 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) |
Source Publication | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
Pages | 15751-15761 |
Conference Date | 16-22 June 2024 |
Conference Place | Seattle, WA, USA |
Country | USA |
Publisher | IEEE Computer Society |
Abstract | Large-scale visual pretraining has significantly improved the performance of large vision models. However, we observe the low-FLOPs pitfall: existing low-FLOPs models cannot benefit from large-scale pretraining. In this paper, we introduce a novel design principle, termed ParameterNet, aimed at augmenting the number of parameters in large-scale visual pretraining models while minimizing the increase in FLOPs. We leverage dynamic convolutions to incorporate additional parameters into the networks with only a marginal rise in FLOPs. The ParameterNet approach allows low-FLOPs networks to take advantage of large-scale visual pretraining. Furthermore, we extend the ParameterNet concept to the language domain to enhance inference results while preserving inference speed. Experiments on the large-scale ImageNet-22K have shown the superiority of our ParameterNet scheme. For example, ParameterNet-600M achieves higher accuracy than the widely used Swin Transformer (81.6% vs. 80.9%) with much lower FLOPs (0.6G vs. 4.5G). The code will be released at https://parameternet.github.io/. |
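The mechanism the abstract leans on is dynamic convolution: keep K expert kernels (roughly K times the parameters), route each input to a softmax mixture over them, and run a single convolution with the aggregated kernel, so the per-pixel compute barely changes. A minimal NumPy sketch of this idea follows; all names, shapes, and the routing head here are illustrative, not the authors' implementation.

```python
import numpy as np

def dynamic_conv2d(x, experts, routing_w, routing_b):
    """Dynamic convolution sketch: mix K expert kernels per input, then do ONE conv.

    x                    : input feature map, shape (C_in, H, W)
    experts              : K expert kernels, shape (K, C_out, C_in, k, k)
    routing_w, routing_b : tiny routing head mapping pooled input -> K logits
    """
    K, C_out, C_in, k, _ = experts.shape

    # 1. Route: global-average-pool the input and produce softmax weights over experts.
    pooled = x.mean(axis=(1, 2))                   # (C_in,)
    logits = routing_w @ pooled + routing_b        # (K,)
    alpha = np.exp(logits - logits.max())
    alpha /= alpha.sum()

    # 2. Aggregate: weighted sum of expert kernels -> a single kernel.
    #    This is the cheap step: its cost is independent of H and W, which is
    #    why parameters grow ~K-fold while FLOPs barely move.
    kernel = np.tensordot(alpha, experts, axes=1)  # (C_out, C_in, k, k)

    # 3. Convolve once with the aggregated kernel (stride 1, no padding).
    H, W = x.shape[1:]
    out = np.zeros((C_out, H - k + 1, W - k + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = x[:, i:i + k, j:j + k]               # (C_in, k, k)
            out[:, i, j] = np.tensordot(kernel, patch, axes=3)
    return out
```

With K = 1 the softmax weight is exactly 1 and this reduces to a plain convolution, which makes the FLOPs comparison concrete: the extra cost over a static conv is only the routing head plus the kernel mixing.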
Keyword | Convolutional codes; Visualization; Computer vision; Accuracy; Transformers; Pattern recognition |
DOI | 10.1109/CVPR52733.2024.01491 |
Language | English |
Scopus ID | 2-s2.0-85202632508 |
Document Type | Conference paper |
Collection | Faculty of Science and Technology |
Affiliation | 1.Huawei Noah's Ark Lab, Canada 2.The University of Sydney 3.State Key Lab of Computer Science, ISCAS & UCAS 4.Faculty of Science and Technology, University of Macau |
Recommended Citation GB/T 7714 | Han, Kai, Wang, Yunhe, Guo, Jianyuan, et al. ParameterNet: Parameters are All You Need for Large-Scale Visual Pretraining of Mobile Networks[C]. IEEE Computer Society, 2024: 15751-15761.
APA | Han, K., Wang, Y., Guo, J., & Wu, E. (2024). ParameterNet: Parameters are All You Need for Large-Scale Visual Pretraining of Mobile Networks. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 15751-15761.