Residential College | false
Status | Published
Title | How Does Pretraining Improve Discourse-Aware Translation?
Author | Zhihong Huang1; Longyue Wang2; Siyou Liu1; Derek F. Wong1
Date Issued | 2023-08
Conference Name | 24th Annual Conference of the International Speech Communication Association, Interspeech 2023
Source Publication | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
Volume | 2023-August |
Pages | 3899 - 3903 |
Conference Date | 2023/08/20-2023/08/23 |
Conference Place | Dublin |
Country | Ireland |
Publisher | International Speech Communication Association |
Abstract | Pretrained language models (PLMs) have produced substantial improvements in discourse-aware neural machine translation (NMT), for example, improved coherence in spoken language translation. However, the underlying reasons for their strong performance have not been well explained. To bridge this gap, we introduce a probing task to interpret the ability of PLMs to capture discourse relation knowledge. We validate three state-of-the-art PLMs across encoder-, decoder-, and encoder-decoder-based models. The analysis shows that (1) the ability of PLMs on discourse modelling varies with architecture and layer; (2) discourse elements in a text lead to different learning difficulties for PLMs. In addition, we investigate the effects of different PLMs on spoken language translation. Through experiments on the IWSLT2017 Chinese-English dataset, we empirically reveal that NMT models initialized from different layers of PLMs exhibit the same trends as in the probing task. Our findings are instructive for understanding how and when discourse knowledge in PLMs should work for downstream tasks. © 2023 International Speech Communication Association. All rights reserved.
Keyword | Discourse; Linguistic Probing; Machine Translation; Pretrained Language Models; Spoken Language
DOI | 10.21437/Interspeech.2023-1068 |
Language | English
Scopus ID | 2-s2.0-85171534338 |
Document Type | Conference paper |
Collection | DEPARTMENT OF PORTUGUESE; DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Corresponding Author | Derek F. Wong |
Affiliation | 1. University of Macau; 2. Tencent AI Lab
First Author Affiliation | University of Macau
Corresponding Author Affiliation | University of Macau
Recommended Citation GB/T 7714 | Zhihong Huang, Longyue Wang, Siyou Liu, et al. How Does Pretraining Improve Discourse-Aware Translation?[C]: International Speech Communication Association, 2023: 3899-3903.
APA | Zhihong Huang, Longyue Wang, Siyou Liu, & Derek F. Wong (2023). How Does Pretraining Improve Discourse-Aware Translation? Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2023-August, 3899-3903.
Files in This Item: | There are no files associated with this item. |