Residential College | false |
Status | Forthcoming |
Title | Understanding and Improving Low-Resource Neural Machine Translation with Shallow Features |
Authors | Sun, Yanming1; Liu, Xuebo2; Wong, Derek F.1; Lin, Yuchu3; Li, Bei4; Zhan, Runzhe1; Chao, Lidia S.1; Zhang, Min2 |
Issue Date | 2025 |
Conference Name | 13th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2024 |
Source Publication | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Volume | 15361 LNAI |
Pages | 227-239 |
Conference Date | 1-3 November 2024 |
Conference Place | Hangzhou, China |
Publisher | Springer Science and Business Media Deutschland GmbH |
Abstract | In neural machine translation (NMT), we observe that, despite the common assumption that increasing encoder depth improves performance, the gains are less pronounced in low-resource scenarios, where deeper encoders can even exacerbate overfitting. Our comparative analysis of NMT models equipped with shallow and deep encoders reveals that the majority of sentences are translated more effectively by a shallow encoder. Further analysis indicates that these sentences tend to be simpler, suggesting that the shallow encoder captures features unique to simple text. Building on these insights, we introduce NATASHA, a novel training strategy that enhances the capabilities of deep models in low-resource Neural mAchine TrAnslation with SHallow feAtures extracted through sequence-level knowledge distillation from the shallow model. Experimental results on five low-resource NMT tasks show that NATASHA consistently improves over strong baselines by at least 1 BLEU point. Furthermore, when combined with other regularization methods, NATASHA achieves leading-edge performance on the IWSLT14 De-En translation task. Further analysis of our method's effectiveness reveals that integrating shallow features reduces the complexity of the training data, helping deep models learn the patterns and features of simple text during the early stages of training. This unleashes the deep model's ability to learn representations of low-frequency words and long sentences, thereby enhancing overall performance. |
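The training recipe the abstract describes is a two-stage, sequence-level knowledge distillation pipeline: train a shallow-encoder model, re-label the training sources with its own translations, then train the deep-encoder model on those distilled targets. The sketch below mirrors the shape of that pipeline in PyTorch under toy assumptions; the TinyNMT model, random bitext, and greedy decoder are illustrative stand-ins, not the authors' code:

```python
# Minimal, runnable sketch of the sequence-level knowledge distillation
# pipeline described in the abstract. The toy vocabulary, TinyNMT model,
# and greedy decoder are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn

VOCAB, PAD, BOS, MAXLEN = 100, 0, 1, 12

class TinyNMT(nn.Module):
    """Toy Transformer NMT model; only the encoder depth varies."""
    def __init__(self, enc_layers: int):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, 64, padding_idx=PAD)
        self.transformer = nn.Transformer(
            d_model=64, nhead=4,
            num_encoder_layers=enc_layers, num_decoder_layers=2,
            dim_feedforward=128, batch_first=True)
        self.out = nn.Linear(64, VOCAB)

    def forward(self, src, tgt_in):
        mask = nn.Transformer.generate_square_subsequent_mask(tgt_in.size(1))
        h = self.transformer(self.emb(src), self.emb(tgt_in), tgt_mask=mask)
        return self.out(h)

def train(model, pairs, epochs=5):
    opt = torch.optim.Adam(model.parameters(), lr=3e-4)
    loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)
    for _ in range(epochs):
        for src, tgt in pairs:
            logits = model(src, tgt[:, :-1])  # teacher forcing
            loss = loss_fn(logits.reshape(-1, VOCAB), tgt[:, 1:].reshape(-1))
            opt.zero_grad(); loss.backward(); opt.step()

@torch.no_grad()
def greedy_decode(model, src):
    """Greedy stand-in for the beam search usually used in sequence-level KD."""
    ys = torch.full((src.size(0), 1), BOS)
    for _ in range(MAXLEN):
        next_tok = model(src, ys)[:, -1].argmax(-1, keepdim=True)
        ys = torch.cat([ys, next_tok], dim=1)
    return ys

# Toy "low-resource" bitext: batches of (source, BOS-prefixed target) ids.
def make_batch():
    src = torch.randint(3, VOCAB, (8, 10))
    tgt = torch.cat([torch.full((8, 1), BOS),
                     torch.randint(3, VOCAB, (8, 9))], dim=1)
    return src, tgt

data = [make_batch() for _ in range(20)]

shallow = TinyNMT(enc_layers=1)   # shallow encoder: the teacher
train(shallow, data)

# Distill: re-label every source with the shallow model's own translation,
# producing the simpler "shallow-feature" targets the abstract describes.
distilled = [(src, greedy_decode(shallow, src)) for src, _ in data]

deep = TinyNMT(enc_layers=6)      # deep encoder: the student
train(deep, distilled)            # deep model now trains on shallow features
```

In practice the distillation step would use real bitext and beam search; only the two-stage structure is meant to carry over.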
Keyword | Encoder Depth; Low-resource Neural Machine Translation; Shallow Features |
DOI | 10.1007/978-981-97-9437-9_18 |
URL | View the original |
Language | English |
Scopus ID | 2-s2.0-85210070429 |
Document Type | Conference paper |
Collection | University of Macau |
Affiliation | 1.Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, China 2.Harbin Institute of Technology, Shenzhen, China 3.DeepTranx, Zhuhai, China 4.Northeastern University, Shenyang, China |
First Author Affiliation | Faculty of Science and Technology |
Recommended Citation GB/T 7714 | Sun, Yanming, Liu, Xuebo, Wong, Derek F., et al. Understanding and Improving Low-Resource Neural Machine Translation with Shallow Features[C]. Springer Science and Business Media Deutschland GmbH, 2025: 227-239. |
APA | Sun, Y., Liu, X., Wong, D. F., Lin, Y., Li, B., Zhan, R., Chao, L. S., & Zhang, M. (2025). Understanding and Improving Low-Resource Neural Machine Translation with Shallow Features. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 15361 LNAI, 227-239. |
Files in This Item: | There are no files associated with this item. |
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.