Residential College: false
Status: Published
AdaBrowse: Adaptive Video Browser for Efficient Continuous Sign Language Recognition
Hu, Lianyu¹; Gao, Liqing¹; Liu, Zekang¹; Pun, Chi Man²; Feng, Wei¹
2023-10-26
Conference Name: 31st ACM International Conference on Multimedia, MM 2023
Source Publication: MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia
Pages: 709-718
Conference Date: 29 October 2023 through 3 November 2023
Conference Place: Ottawa
Country: Canada
Publisher: Association for Computing Machinery, Inc
Abstract

Raw videos are known to contain considerable feature redundancy: in many cases, only a portion of the frames is enough for accurate recognition. In this paper, we investigate whether such redundancy can be leveraged for efficient inference in continuous sign language recognition (CSLR). We propose a novel adaptive model, AdaBrowse, that dynamically selects the most informative subsequence from an input video by modelling the problem as a sequential decision task. Specifically, we first use a lightweight network to quickly scan the input video and extract coarse features. These features are then fed into a policy network that selects a subsequence to process. Finally, a normal CSLR model runs inference on the selected subsequence to predict the sentence. As only a portion of the frames is processed, the total computation is considerably reduced. Beyond temporal redundancy, we also investigate whether the inherent spatial redundancy can be exploited for further efficiency by dynamically selecting the lowest input resolution for each sample; the resulting model is referred to as AdaBrowse+. Extensive experiments on four large-scale CSLR datasets (PHOENIX14, PHOENIX14-T, CSL-Daily and CSL) demonstrate the effectiveness of AdaBrowse and AdaBrowse+, which achieve accuracy comparable to state-of-the-art methods with 1.44× throughput and 2.12× fewer FLOPs. Comparisons with other commonly used 2D CNNs and adaptive efficient methods verify the effectiveness of AdaBrowse. Code is available at https://github.com/hulianyuyy/AdaBrowse.
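
The scan, select, recognize flow described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the repository linked above for that). Every name here (`coarse_features`, `motion_score`, `choose_stride`, `browse`) is hypothetical, the "policy" is a hand-written threshold rule standing in for the learned policy network, and the assumption that the policy picks among a few fixed temporal strides is made purely for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Frame = List[float]  # toy "frame": a flat list of pixel intensities


def coarse_features(frames: Sequence[Frame]) -> List[float]:
    """Stand-in for the lightweight scan: one cheap scalar per frame."""
    return [sum(frame) / len(frame) for frame in frames]


def motion_score(features: Sequence[float]) -> float:
    """Mean absolute change between consecutive coarse features."""
    if len(features) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(features, features[1:])]
    return sum(diffs) / len(diffs)


def choose_stride(features: Sequence[float],
                  strides: Tuple[int, ...] = (1, 2, 3),
                  threshold: float = 0.1) -> int:
    """Hypothetical rule-based policy (the paper uses a learned network).

    High motion keeps every frame; low motion skips more frames.
    """
    score = motion_score(features)
    if score >= threshold:
        return strides[0]
    if score >= threshold / 2:
        return strides[1]
    return strides[-1]


def browse(frames: Sequence[Frame],
           recognizer: Callable[[Sequence[Frame]], object]) -> Tuple[object, int]:
    """Scan cheaply, pick a subsequence, then run the expensive recognizer."""
    features = coarse_features(frames)
    stride = choose_stride(features)
    selected = frames[::stride]  # the subsequence passed to the full model
    return recognizer(selected), len(selected)


# Toy inputs: a static clip and a clip that changes on every frame.
static_clip = [[0.5] * 4 for _ in range(12)]
dynamic_clip = [[float(i % 2)] * 4 for i in range(12)]

# `len` stands in for the full CSLR model, purely so the example runs.
static_result = browse(static_clip, len)
dynamic_result = browse(dynamic_clip, len)
```

In this toy setting the static clip is reduced from 12 frames to 4, while the rapidly changing clip keeps all 12. If recognizer cost is assumed to scale linearly with the number of frames, that is where the savings would come from; the figures reported in the abstract come from the authors' experiments, not from this sketch.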

Keywords: Continuous Sign Language Recognition; Efficient Inference; Feature Redundancy
DOI: 10.1145/3581783.3611745
Language: English
Scopus ID: 2-s2.0-85179549611
Document Type: Conference paper
Collection: DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Affiliation: 1. Tianjin University, Tianjin, China
2. University of Macau, Macao
Recommended Citation
GB/T 7714
Hu, Lianyu, Gao, Liqing, Liu, Zekang, et al. AdaBrowse: Adaptive Video Browser for Efficient Continuous Sign Language Recognition[C]: Association for Computing Machinery, Inc, 2023, 709-718.
APA
Hu, Lianyu, Gao, Liqing, Liu, Zekang, Pun, Chi Man, & Feng, Wei (2023). AdaBrowse: Adaptive Video Browser for Efficient Continuous Sign Language Recognition. MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia, 709-718.