Status | Published |
RBA-GCN: Relational Bilevel Aggregation Graph Convolutional Network for Emotion Recognition | |
Yuan, Lin (1); Huang, Guoheng (1); Li, Fenghuan (1); Yuan, Xiaochen (2); Pun, Chi Man (3); Zhong, Guo (4) | |
2023 | |
Source Publication | IEEE/ACM Transactions on Audio Speech and Language Processing |
ISSN | 2329-9290 |
Volume | 31 |
Pages | 2325-2337 |
Abstract | Emotion recognition in conversation (ERC) has received increasing attention from researchers due to its wide range of applications. As conversation has a natural graph structure, numerous approaches that model ERC with graph convolutional networks (GCNs) have yielded significant results. However, the aggregation approach of traditional GCNs suffers from the node information redundancy problem, leading to node discriminant information loss. Additionally, single-layer GCNs lack the capacity to capture long-range contextual information from the graph. Furthermore, the majority of approaches are based on the textual modality or on stitching together different modalities, resulting in a weak ability to capture interactions between modalities. To address these problems, we present the relational bilevel aggregation graph convolutional network (RBA-GCN), which consists of three modules: the graph generation module (GGM), the similarity-based cluster building module (SCBM) and the bilevel aggregation module (BiAM). First, GGM constructs a novel graph to reduce the redundancy of target node information. Then, SCBM calculates the node similarity between the target node and its structural neighborhood, where noisy information with low similarity is filtered out to preserve the discriminant information of the node. Meanwhile, BiAM is a novel aggregation method that can preserve the information of nodes during the aggregation process. This module can construct the interaction between different modalities and capture long-range contextual information based on similarity clusters. On both the IEMOCAP and MELD datasets, the weighted average F1 score of RBA-GCN improves by 2.17% to 5.21% over that of the most advanced method. |
Keyword | Context Modeling; Emotion Recognition; Multimodal Fusion; Similarity Cluster |
DOI | 10.1109/TASLP.2023.3284509 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Acoustics ; Engineering |
WOS Subject | Acoustics ; Engineering, Electrical & Electronic |
WOS ID | WOS:001018614900001 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Scopus ID | 2-s2.0-85162708719 |
Document Type | Journal article |
Collection | DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE |
Corresponding Author | Huang,Guoheng |
Affiliation | 1. Guangdong University of Technology, School of Computer Science and Technology, Guangdong, 510006, China; 2. Macao Polytechnic University, Faculty of Applied Sciences, 999078, Macao; 3. University of Macau, Faculty of Science and Technology, 999078, Macao; 4. Guangdong University of Foreign Studies, School of Information Science and Technology, Guangzhou, 510006, China |
Recommended Citation GB/T 7714 | Yuan, Lin, Huang, Guoheng, Li, Fenghuan, et al. RBA-GCN: Relational Bilevel Aggregation Graph Convolutional Network for Emotion Recognition[J]. IEEE/ACM Transactions on Audio Speech and Language Processing, 2023, 31, 2325-2337. |
APA | Yuan, Lin., Huang, Guoheng., Li, Fenghuan., Yuan, Xiaochen., Pun, Chi Man., & Zhong, Guo (2023). RBA-GCN: Relational Bilevel Aggregation Graph Convolutional Network for Emotion Recognition. IEEE/ACM Transactions on Audio Speech and Language Processing, 31, 2325-2337. |
MLA | Yuan, Lin, et al. "RBA-GCN: Relational Bilevel Aggregation Graph Convolutional Network for Emotion Recognition". IEEE/ACM Transactions on Audio Speech and Language Processing 31 (2023): 2325-2337. |