Difference-guided multi-scale spatial-temporal representation for sign language recognition

doi:10.1007/s00371-023-02979-8

Status	Published
Title	Difference-guided multi-scale spatial-temporal representation for sign language recognition
Author	Gao, Liqing 1; Hu, Lianyu 1; Lyu, Fan 1; Zhu, Lei 2; Wan, Liang 1; Pun, Chi Man 3; Feng, Wei 1
Issue Date	2023-07-30
Source Publication	Visual Computer
ISSN	0178-2789
Volume	39, Issue: 8, Pages: 3417-3428
Abstract	Sign language recognition (SLR) is a challenging task that requires a thorough understanding of spatial-temporal visual features in order to translate sign language into comprehensible written or spoken language. However, existing SLR methods overlook key spatial-temporal representations because of their sparsity and inconsistency in space and time. To solve this problem, we present a difference-guided multi-scale spatial-temporal representation (DMST) learning model for SLR. DMST comprises two modules: (1) key spatial-temporal representation, which extracts and enhances key spatial-temporal information through a spatial-temporal difference strategy, and (2) multi-scale sequence alignment, which perceives and fuses multi-scale spatial-temporal features and performs sequence mapping. DMST outperforms state-of-the-art methods on four public sign language datasets, which demonstrates the superiority of the model and the significance of key spatial-temporal representation for SLR.
Keyword	Key Spatial-temporal Representation; Multi-scale Sequence Alignment; Sign Language Recognition (SLR)
DOI	10.1007/s00371-023-02979-8
Indexed By	SCIE
Language	English
WOS Research Area	Computer Science
WOS Subject	Computer Science, Software Engineering
WOS ID	WOS:001040330500003
Publisher	SPRINGER, ONE NEW YORK PLAZA, SUITE 4600, NEW YORK, NY 10004, UNITED STATES
Scopus ID	2-s2.0-85166243912
Document Type	Journal article
Collection	Faculty of Science and Technology; DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Corresponding Author	Feng, Wei
Affiliation	1. Tianjin University, Tianjin, China; 2. The Hong Kong University of Science and Technology (Guangzhou), Guangzhou, China; 3. University of Macau, Macao
Recommended Citation GB/T 7714	Gao, Liqing, Hu, Lianyu, Lyu, Fan, et al. Difference-guided multi-scale spatial-temporal representation for sign language recognition[J]. Visual Computer, 2023, 39(8), 3417-3428.
APA	Gao, Liqing, Hu, Lianyu, Lyu, Fan, Zhu, Lei, Wan, Liang, Pun, Chi Man, & Feng, Wei (2023). Difference-guided multi-scale spatial-temporal representation for sign language recognition. Visual Computer, 39(8), 3417-3428.
MLA	Gao, Liqing, et al. "Difference-guided multi-scale spatial-temporal representation for sign language recognition." Visual Computer 39.8 (2023): 3417-3428.
