Residential College | false
Status | Published
On the Copying Behaviors of Pre-Training for Neural Machine Translation
Liu, Xuebo1; Wang, Longyue2; Wong, Derek F.1; Ding, Liang3; Chao, Lidia S.1; Shi, Shuming2; Tu, Zhaopeng2
2021
Conference Name | The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021) |
Source Publication | Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 |
Pages | 4265-4275 |
Conference Date | 1 August 2021 through 6 August 2021
Conference Place | Virtual, Online |
Author of Source | Zong C., Xia F., Li W., Navigli R. |
Publisher | Association for Computational Linguistics (ACL) |
Abstract | Previous studies have shown that initializing neural machine translation (NMT) models with pre-trained language models (LMs) can speed up model training and boost model performance. In this work, we identify a critical side effect of pre-training for NMT, which arises from the discrepancy between the training objectives of LM-based pre-training and NMT. Since the LM objective learns to reconstruct a few source tokens and copy most of them, pre-training initialization affects the copying behaviors of NMT models. We provide a quantitative analysis of copying behaviors by introducing a metric called the copying ratio, which empirically shows that pre-training-based NMT models have a larger copying ratio than standard ones. In response to this problem, we propose a simple and effective method named copying penalty to control copying behaviors in decoding. Extensive experiments on both in-domain and out-of-domain benchmarks show that the copying penalty method consistently improves translation performance by controlling copying behaviors for pre-training-based NMT models. Source code is freely available at https://github.com/SunbowLiu/CopyingPenalty.
DOI | 10.48550/arXiv.2107.08212 |
Language | English
Scopus ID | 2-s2.0-85114656790 |
Document Type | Conference paper |
Collection | DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE |
Corresponding Author | Liu, Xuebo |
Affiliation | 1.NLP2CT Lab, Department of Computer and Information Science, University of Macau, Macao 2.Tencent AI Lab, China 3.The University of Sydney, Australia |
First Author Affiliation | University of Macau
Corresponding Author Affiliation | University of Macau
Recommended Citation GB/T 7714 | Liu, Xuebo, Wang, Longyue, Wong, Derek F., et al. On the Copying Behaviors of Pre-Training for Neural Machine Translation[C]. Zong C., Xia F., Li W., Navigli R.: Association for Computational Linguistics (ACL), 2021: 4265-4275.
APA | Liu, Xuebo, Wang, Longyue, Wong, Derek F., Ding, Liang, Chao, Lidia S., Shi, Shuming, & Tu, Zhaopeng (2021). On the Copying Behaviors of Pre-Training for Neural Machine Translation. Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, 4265-4275.