Status | Published |
Addi-Reg: A Better Generalization-Optimization Tradeoff Regularization Method for Convolutional Neural Networks | |
Yao Lu (1); Zheng Zhang (1); Guangming Lu (1); Yicong Zhou (2); Jinxing Li (1); David Zhang (3) | |
2021 | |
Source Publication | IEEE Transactions on Cybernetics |
ABS Journal Level | 3 |
ISSN | 2168-2267 |
Volume | 52, Issue: 10, Pages: 10827-10842 |
Abstract | In convolutional neural networks (CNNs), generating noise for the intermediate feature is a hot research topic in improving generalization. The existing methods usually regularize the CNNs by producing multiplicative noise (regularization weights), called multiplicative regularization (Multi-Reg). However, Multi-Reg methods usually focus on improving generalization but fail to jointly consider optimization, leading to unstable learning with slow convergence. Moreover, Multi-Reg methods are not flexible enough since the regularization weights are generated from a definite manual-design distribution. Besides, most popular methods are not universal enough, because these methods are only designed for the residual networks. In this article, we, for the first time, experimentally and theoretically explore the nature of generating noise in the intermediate features for popular CNNs. We demonstrate that injecting noise in the feature space can be transformed to generating noise in the input space, and these methods regularize the networks in a Mini-batch in Mini-batch (MiM) sampling manner. Based on these observations, this article further discovers that generating multiplicative noise can easily degenerate the optimization due to its high dependence on the intermediate feature. Based on these studies, we propose a novel additional regularization (Addi-Reg) method, which can adaptively produce additional noise with low dependence on intermediate feature in CNNs by employing a series of mechanisms. Particularly, these well-designed mechanisms can stabilize the learning process in training, and our Addi-Reg method can pertinently learn the noise distributions for every layer in CNNs. Extensive experiments demonstrate that the proposed Addi-Reg method is more flexible and universal, and meanwhile achieves better generalization performance with faster convergence against the state-of-the-art Multi-Reg methods. |
Keyword | Additional Regularization (Addi-Reg); Convergence; Convolutional Neural Networks (CNNs); Deep Learning; Learning Systems; Multiplicative Regularization (Multi-Reg); Neural Networks; Optimization; Perturbation Methods; Regularization; Residual Neural Networks; Training |
DOI | 10.1109/TCYB.2021.3062881 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Automation & Control Systems ; Computer Science |
WOS Subject | Automation & Control Systems ; Computer Science, Artificial Intelligence ; Computer Science, Cybernetics |
WOS ID | WOS:000732334200001 |
Scopus ID | 2-s2.0-85103274066 |
Document Type | Journal article |
Collection | DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE Faculty of Science and Technology |
Corresponding Author | Zheng Zhang; Guangming Lu |
Affiliation | 1. Department of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, China; 2. Department of Computer and Information Science, University of Macau, Macau; 3. School of Data Science, Chinese University of Hong Kong, Shenzhen 518172, China |
Recommended Citation GB/T 7714 | Yao Lu, Zheng Zhang, Guangming Lu, et al. Addi-Reg: A Better Generalization-Optimization Tradeoff Regularization Method for Convolutional Neural Networks[J]. IEEE Transactions on Cybernetics, 2021, 52(10): 10827-10842. |
APA | Yao Lu, Zheng Zhang, Guangming Lu, Yicong Zhou, Jinxing Li, & David Zhang (2021). Addi-Reg: A Better Generalization-Optimization Tradeoff Regularization Method for Convolutional Neural Networks. IEEE Transactions on Cybernetics, 52(10), 10827-10842. |
MLA | Yao Lu, et al. "Addi-Reg: A Better Generalization-Optimization Tradeoff Regularization Method for Convolutional Neural Networks." IEEE Transactions on Cybernetics 52.10 (2021): 10827-10842. |