Residential College | false |
Status | Forthcoming |
Title | Understanding and Improving Low-Resource Neural Machine Translation with Shallow Features |
Authors | Sun, Yanming1; Liu, Xuebo2; Wong, Derek F.1; Lin, Yuchu3; Li, Bei4; Zhan, Runzhe1; Chao, Lidia S.1; Zhang, Min2 |
Issue Date | 2025 |
Conference Name | 13th CCF International Conference on Natural Language Processing and Chinese Computing, NLPCC 2024 |
Source Publication | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Volume | 15361 LNAI |
Pages | 227-239 |
Conference Date | 1-3 November 2024 |
Conference Place | Hangzhou, China |
Publisher | Springer Science and Business Media Deutschland GmbH |
Abstract | In neural machine translation (NMT), we observe that, despite the common assumption that increasing encoder depth improves performance, the gains are less pronounced in low-resource scenarios, where deeper encoders can even exacerbate overfitting. Our comparative analysis of NMT models equipped with shallow and deep encoders reveals that the majority of sentences are translated more effectively by a shallow encoder. Further analysis indicates that these sentences tend to be simpler, suggesting that the shallow encoder captures features unique to simple text. Building on these insights, we introduce NATASHA, a novel training strategy that enhances the capabilities of deep models in low-resource Neural mAchine TrAnslation with SHallow feAtures extracted through sequence-level knowledge distillation from the shallow model. Experimental results on five low-resource NMT tasks show that NATASHA consistently improves over strong baselines by at least 1 BLEU point. Furthermore, when combined with other regularization methods, NATASHA achieves leading-edge performance on the IWSLT14 De-En translation task. Further analysis of our method's effectiveness reveals that integrating shallow features reduces the complexity of the training data, helping deep models learn the patterns and features of simple text during the early stages of training. This unleashes the deep model's ability to learn representations of low-frequency words and long sentences, thereby enhancing overall performance. |
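The training recipe the abstract describes is a two-stage, sequence-level knowledge distillation pipeline: train a shallow-encoder model, re-label the training sources with its own translations, then train the deep-encoder model on those distilled targets. The sketch below mirrors the shape of that pipeline in PyTorch under toy assumptions; the TinyNMT model, random bitext, and greedy decoder are illustrative stand-ins, not the authors' code:

```python
# Minimal, runnable sketch of the sequence-level knowledge distillation
# pipeline described in the abstract. The toy vocabulary, TinyNMT model,
# and greedy decoder are illustrative assumptions, not the authors' code.
import torch
import torch.nn as nn

VOCAB, PAD, BOS, MAXLEN = 100, 0, 1, 12

class TinyNMT(nn.Module):
    """Toy Transformer NMT model; only the encoder depth varies."""
    def __init__(self, enc_layers: int):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, 64, padding_idx=PAD)
        self.transformer = nn.Transformer(
            d_model=64, nhead=4,
            num_encoder_layers=enc_layers, num_decoder_layers=2,
            dim_feedforward=128, batch_first=True)
        self.out = nn.Linear(64, VOCAB)

    def forward(self, src, tgt_in):
        mask = nn.Transformer.generate_square_subsequent_mask(tgt_in.size(1))
        h = self.transformer(self.emb(src), self.emb(tgt_in), tgt_mask=mask)
        return self.out(h)

def train(model, pairs, epochs=5):
    opt = torch.optim.Adam(model.parameters(), lr=3e-4)
    loss_fn = nn.CrossEntropyLoss(ignore_index=PAD)
    for _ in range(epochs):
        for src, tgt in pairs:
            logits = model(src, tgt[:, :-1])  # teacher forcing
            loss = loss_fn(logits.reshape(-1, VOCAB), tgt[:, 1:].reshape(-1))
            opt.zero_grad(); loss.backward(); opt.step()

@torch.no_grad()
def greedy_decode(model, src):
    """Greedy stand-in for the beam search usually used in sequence-level KD."""
    ys = torch.full((src.size(0), 1), BOS)
    for _ in range(MAXLEN):
        next_tok = model(src, ys)[:, -1].argmax(-1, keepdim=True)
        ys = torch.cat([ys, next_tok], dim=1)
    return ys

# Toy "low-resource" bitext: batches of (source, BOS-prefixed target) ids.
def make_batch():
    src = torch.randint(3, VOCAB, (8, 10))
    tgt = torch.cat([torch.full((8, 1), BOS),
                     torch.randint(3, VOCAB, (8, 9))], dim=1)
    return src, tgt

data = [make_batch() for _ in range(20)]

shallow = TinyNMT(enc_layers=1)   # shallow encoder: the teacher
train(shallow, data)

# Distill: re-label every source with the shallow model's own translation,
# producing the simpler "shallow-feature" targets the abstract describes.
distilled = [(src, greedy_decode(shallow, src)) for src, _ in data]

deep = TinyNMT(enc_layers=6)      # deep encoder: the student
train(deep, distilled)            # deep model now trains on shallow features
```

In practice the distillation step would use real bitext and beam search; only the two-stage structure is meant to carry over.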
Keyword | Encoder Depth; Low-resource Neural Machine Translation; Shallow Features |
DOI | 10.1007/978-981-97-9437-9_18 |
URL | View the original |
Language | English |
Scopus ID | 2-s2.0-85210070429 |
Document Type | Conference paper |
Collection | University of Macau |
Affiliation | 1.Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, China 2.Harbin Institute of Technology, Shenzhen, China 3.DeepTranx, Zhuhai, China 4.Northeastern University, Shenyang, China |
First Author Affiliation | Faculty of Science and Technology |
Recommended Citation GB/T 7714 | Sun, Yanming, Liu, Xuebo, Wong, Derek F., et al. Understanding and Improving Low-Resource Neural Machine Translation with Shallow Features[C]. Springer Science and Business Media Deutschland GmbH, 2025: 227-239. |
APA | Sun, Y., Liu, X., Wong, D. F., Lin, Y., Li, B., Zhan, R., Chao, L. S., & Zhang, M. (2025). Understanding and Improving Low-Resource Neural Machine Translation with Shallow Features. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 15361 LNAI, 227-239. |
Files in This Item: | There are no files associated with this item. |
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.