Status: Published
Improving Bert Fine-Tuning via Stabilizing Cross-Layer Mutual Information
Li, Jicun (1,2); Li, Xingjian (3,6); Wang, Tianyang (4); Wang, Shi (1,2); Cao, Yanan (5); Xu, Chengzhong (6); Dou, Dejing (3)
2023-05
Conference Name: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Source Publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2023-June
Conference Date: 04-10 June 2023
Conference Place: Rhodes Island, Greece
Country: Greece
Publisher: Institute of Electrical and Electronics Engineers Inc.
Abstract

Fine-tuning pre-trained language models such as BERT has shown enormous success on various NLP tasks. Though simple and effective, fine-tuning has been found to be unstable, often leading to unexpectedly poor performance. To improve stability and generalizability, most existing works resort to preserving the parameters or representations of the pre-trained model during fine-tuning. Nevertheless, very little work explores mining the reliable part of the pre-learned information that can help stabilize fine-tuning. To address this challenge, we introduce a novel solution in which we fine-tune BERT with stabilized cross-layer mutual information. Rather than preserving the pre-trained model's information itself, our method aims to preserve its reliable cross-layer information-propagation behavior, thereby circumventing domain conflicts between the pre-trained and target tasks. Extensive experiments with popular pre-trained BERT variants on NLP datasets demonstrate the universal effectiveness and robustness of our method.
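The record includes no code, but the core idea described in the abstract, regularizing how information propagates across layers rather than pinning the representations themselves, can be sketched. The snippet below is a minimal illustration only: the cosine-similarity profile over consecutive layers' [CLS] vectors is an assumed stand-in for the paper's mutual-information estimator, and the helper names (cross_layer_profile, training_loss), the MSE penalty, and the weight lam are hypothetical choices for illustration, not the authors' actual method.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoModelForSequenceClassification

# Trainable classifier plus a frozen copy of the pre-trained encoder,
# both configured to return all intermediate hidden states.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2, output_hidden_states=True
)
frozen = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
frozen.eval()
for p in frozen.parameters():
    p.requires_grad_(False)

def cross_layer_profile(hidden_states):
    # Cosine similarity between [CLS] vectors of consecutive layers,
    # used here as a cheap, assumed proxy for cross-layer mutual information.
    cls = [h[:, 0] for h in hidden_states]
    return torch.stack(
        [F.cosine_similarity(cls[i], cls[i + 1], dim=-1) for i in range(len(cls) - 1)],
        dim=1,
    )  # shape: (batch_size, num_layer_transitions)

def training_loss(batch, labels, lam=0.1):
    out = model(**batch, labels=labels)
    with torch.no_grad():
        ref = cross_layer_profile(frozen(**batch).hidden_states)
    cur = cross_layer_profile(out.hidden_states)
    # Keep the fine-tuned model's cross-layer propagation pattern close to
    # the pre-trained one, instead of constraining parameters or features.
    return out.loss + lam * F.mse_loss(cur, ref)
```

In this sketch, `batch` is the usual tokenizer output (input_ids, attention_mask). Only the propagation profile is matched against the frozen encoder, so task-specific representations remain free to move during fine-tuning.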

Keywords: Fine-tuning; Stability; Mutual Information; Pre-trained Language Model
DOI: 10.1109/ICASSP49357.2023.10095747
Language: English
Scopus ID: 2-s2.0-85175809274
Document Type: Conference paper
Collection: THE STATE KEY LABORATORY OF INTERNET OF THINGS FOR SMART CITY (UNIVERSITY OF MACAU)
Faculty of Science and Technology
Affiliations:
1. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, China
2. University of Chinese Academy of Sciences, China
3. Big Data Lab, Baidu Research, China
4. University of Alabama at Birmingham, United States
5. Institute of Information Engineering, Chinese Academy of Sciences, China
6. State Key Lab of IOTSC, University of Macau, Macao
Recommended Citation
GB/T 7714: Li, Jicun, Li, Xingjian, Wang, Tianyang, et al. Improving Bert Fine-Tuning via Stabilizing Cross-Layer Mutual Information[C]. Institute of Electrical and Electronics Engineers Inc., 2023.
APA: Li, J., Li, X., Wang, T., Wang, S., Cao, Y., Xu, C., & Dou, D. (2023). Improving Bert Fine-Tuning via Stabilizing Cross-Layer Mutual Information. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2023-June.
Files in This Item:
There are no files associated with this item.
Related Services
Google Scholar
Similar articles in Google Scholar
[Li, Jicun]'s Articles
[Li, Xingjian]'s Articles
[Wang, Tianyang]'s Articles
Baidu academic
Similar articles in Baidu academic
[Li, Jicun]'s Articles
[Li, Xingjian]'s Articles
[Wang, Tianyang]'s Articles
Bing Scholar
Similar articles in Bing Scholar
[Li, Jicun]'s Articles
[Li, Xingjian]'s Articles
[Wang, Tianyang]'s Articles
 

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.