Status | Published |
Difference-guided multi-scale spatial-temporal representation for sign language recognition | |
Gao, Liqing (1); Hu, Lianyu (1); Lyu, Fan (1); Zhu, Lei (2); Wan, Liang (1); Pun, Chi Man (3); Feng, Wei (1) | |
2023-07-30 | |
Source Publication | Visual Computer |
ISSN | 0178-2789 |
Volume | 39, Issue: 8, Pages: 3417-3428 |
Abstract | Sign language recognition (SLR) is a challenging task that requires a thorough understanding of spatial-temporal visual features in order to translate signing into comprehensible written or spoken language. However, existing SLR methods ignore the importance of key spatial-temporal representation because of its sparsity and inconsistency in space and time. To solve this problem, we present a difference-guided multi-scale spatial-temporal representation (DMST) learning model for SLR. In DMST, we devise two modules: (1) key spatial-temporal representation, which extracts and enhances key spatial-temporal information through a spatial-temporal difference strategy, and (2) multi-scale sequence alignment, which perceives and fuses multi-scale spatial-temporal features and performs sequence mapping. The DMST model outperforms state-of-the-art methods on four public sign language datasets, which demonstrates the superiority of the DMST model and the significance of key spatial-temporal representation for SLR. |
Keyword | Key Spatial-temporal Representation; Multi-scale Sequence Alignment; Sign Language Recognition (SLR) |
DOI | 10.1007/s00371-023-02979-8 |
Indexed By | SCIE |
Language | English |
WOS Research Area | Computer Science |
WOS Subject | Computer Science, Software Engineering |
WOS ID | WOS:001040330500003 |
Publisher | SPRINGER, ONE NEW YORK PLAZA, SUITE 4600, NEW YORK, NY 10004, UNITED STATES |
Scopus ID | 2-s2.0-85166243912 |
Document Type | Journal article |
Collection | Faculty of Science and Technology; DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE |
Corresponding Author | Feng, Wei |
Affiliation | 1. Tianjin University, Tianjin, China; 2. The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China; 3. University of Macau, Macao |
Recommended Citation GB/T 7714 | Gao, Liqing, Hu, Lianyu, Lyu, Fan, et al. Difference-guided multi-scale spatial-temporal representation for sign language recognition[J]. Visual Computer, 2023, 39(8): 3417-3428. |
APA | Gao, L., Hu, L., Lyu, F., Zhu, L., Wan, L., Pun, C. M., & Feng, W. (2023). Difference-guided multi-scale spatial-temporal representation for sign language recognition. Visual Computer, 39(8), 3417-3428. |
MLA | Gao, Liqing, et al. "Difference-guided multi-scale spatial-temporal representation for sign language recognition." Visual Computer 39.8 (2023): 3417-3428. |