Status | Published
Title | Improving Bert Fine-Tuning via Stabilizing Cross-Layer Mutual Information
Authors | Li, Jicun (1,2); Li, Xingjian (3,6); Wang, Tianyang (4); Wang, Shi (1,2); Cao, Yanan (5); Xu, Chengzhong (6); Dou, Dejing (3)
Date Issued | 2023-05
Conference Name | ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) |
Source Publication | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
Volume | 2023-June |
Conference Date | 04-10 June 2023 |
Conference Place | Rhodes Island, Greece |
Country | Greece |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Abstract | Fine-tuning pre-trained language models, such as BERT, has shown enormous success across various NLP tasks. Though simple and effective, the fine-tuning process has been found to be unstable, often leading to unexpectedly poor performance. To increase stability and generalizability, most existing works resort to maintaining the parameters or representations of the pre-trained model during fine-tuning. Nevertheless, very little work has explored mining the reliable part of the pre-learned information that can help stabilize fine-tuning. To address this challenge, we introduce a novel solution in which we fine-tune BERT with stabilized cross-layer mutual information. Our method aims to preserve the reliable behaviors of cross-layer information propagation of the pre-trained model, instead of preserving the information itself. Our method therefore circumvents the domain conflicts between the pre-training and target tasks. We conduct extensive experiments with popular pre-trained BERT variants on NLP datasets, demonstrating the universal effectiveness and robustness of our method.
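The abstract describes preserving the behavior of cross-layer information propagation rather than the pre-learned representations themselves. The sketch below is a minimal illustration of that idea under stated assumptions, not the authors' exact method: it swaps the mutual-information estimator for a cheap cosine-similarity proxy between consecutive hidden layers, then penalizes the fine-tuned model when its cross-layer similarity profile drifts from that of a frozen pre-trained copy. The helper `cross_layer_similarity`, the placeholder task loss, and the weight `lam` are all hypothetical.

```python
# Illustrative sketch only: a cross-layer "propagation" regularizer for BERT
# fine-tuning, using cosine similarity as a crude stand-in for mutual information.
import torch
import torch.nn.functional as F
from transformers import BertModel, BertTokenizer

def cross_layer_similarity(hidden_states):
    """Cosine similarity between mean-pooled consecutive layers:
    a cheap proxy for how information propagates across layers."""
    pooled = [h.mean(dim=1) for h in hidden_states]  # each (batch, hidden)
    sims = [F.cosine_similarity(pooled[i], pooled[i + 1], dim=-1)
            for i in range(len(pooled) - 1)]
    return torch.stack(sims, dim=1)                  # (batch, num_layers)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")      # being fine-tuned
reference = BertModel.from_pretrained("bert-base-uncased")  # frozen copy
for p in reference.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
lam = 0.1  # hypothetical regularization weight

batch = tokenizer(["an example sentence"], return_tensors="pt")
out = model(**batch, output_hidden_states=True)
with torch.no_grad():
    ref = reference(**batch, output_hidden_states=True)

# The downstream task loss would come from a classification head;
# a zero placeholder keeps this sketch self-contained.
task_loss = out.last_hidden_state.mean() * 0.0

# Penalize drift of the cross-layer similarity profile from the frozen copy.
stab_loss = F.mse_loss(cross_layer_similarity(out.hidden_states),
                       cross_layer_similarity(ref.hidden_states))

optimizer.zero_grad()
(task_loss + lam * stab_loss).backward()
optimizer.step()
```

In this sketch the regularizer constrains only the *pattern* of layer-to-layer agreement, not the hidden states themselves, which mirrors the abstract's distinction between preserving propagation behavior and preserving the information itself.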
Keyword | Fine-tuning Stability; Mutual Information; Pre-trained Language Model
DOI | 10.1109/ICASSP49357.2023.10095747 |
Language | English
Scopus ID | 2-s2.0-85175809274 |
Document Type | Conference paper |
Collection | THE STATE KEY LABORATORY OF INTERNET OF THINGS FOR SMART CITY (UNIVERSITY OF MACAU); Faculty of Science and Technology
Affiliation | 1. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, China; 2. University of Chinese Academy of Sciences, China; 3. Big Data Lab, Baidu Research, China; 4. University of Alabama at Birmingham, United States; 5. Institute of Information Engineering, Chinese Academy of Sciences, China; 6. State Key Laboratory of Internet of Things for Smart City (IOTSC), University of Macau, Macao
Recommended Citation GB/T 7714 | Li, Jicun, Li, Xingjian, Wang, Tianyang, et al. Improving Bert Fine-Tuning via Stabilizing Cross-Layer Mutual Information[C]//ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Institute of Electrical and Electronics Engineers Inc., 2023.
APA | Li, Jicun., Li, Xingjian., Wang, Tianyang., Wang, Shi., Cao, Yanan., Xu, Chengzhong., & Dou, Dejing. (2023). Improving Bert Fine-Tuning via Stabilizing Cross-Layer Mutual Information. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2023-June.
Files in This Item: | There are no files associated with this item. |