Status: Published
SparseInteraction: Sparse Semantic Guidance for Radar and Camera 3D Object Detection
Xu, Shaoqing1; Jiang, Shengyin2; Li, Fang3; Liu, Li3; Song, Ziying4; Yang, Bo2; Yang, Zhi Xin1
2024-11
Conference Name: 32nd ACM International Conference on Multimedia, MM 2024
Source Publication: MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia
Pages: 9224-9233
Conference Date: 28 October 2024 - 1 November 2024
Conference Place: Melbourne
Country: Australia
Publisher: Association for Computing Machinery, Inc.
Abstract

Multi-modal fusion techniques, such as radar and images, enable a complementary and cost-effective perception of the surrounding environment regardless of lighting and weather conditions. However, existing fusion methods for surround-view images and radar are challenged by the inherent noise and positional ambiguity of radar, which leads to significant performance losses. To address this limitation effectively, our paper presents a robust, end-to-end fusion framework dubbed SparseInteraction. First, we introduce the Noisy Radar Filter (NRF) module to extract foreground features by creatively using queried semantic features from the image to filter out noisy radar features. Furthermore, we implement the Sparse Cross-Attention Encoder (SCAE) to effectively blend foreground radar features and image features to address positional ambiguity issues at a sparse level. Ultimately, to facilitate model convergence and performance, the foreground prior queries containing position information of the foreground radar are concatenated with predefined queries and fed into the subsequent transformer-based decoder. The experimental results demonstrate that the proposed fusion strategies markedly enhance detection performance and achieve new state-of-the-art results on the nuScenes benchmark. Source code is available at https://github.com/GG-Bonds/SparseInteraction.
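The pipeline the abstract describes — scoring radar tokens against image semantic queries to discard background noise, then fusing the surviving foreground tokens with image features via cross-attention — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that); all function names, shapes, and the keep-ratio heuristic here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def noisy_radar_filter(radar_feats, semantic_queries, keep_ratio=0.5):
    """Score each radar token by similarity to image semantic queries;
    keep only the top-scoring (foreground) tokens."""
    scores = radar_feats @ semantic_queries.T      # (N_radar, N_query)
    relevance = scores.max(axis=1)                 # best match per radar token
    k = max(1, int(keep_ratio * len(relevance)))
    keep = np.argsort(relevance)[-k:]              # indices of foreground tokens
    return radar_feats[keep], keep

def sparse_cross_attention(fg_radar, image_feats):
    """Single-head cross-attention: foreground radar tokens attend to
    image features, with a residual connection (sparse-level fusion)."""
    d = image_feats.shape[1]
    attn = softmax(fg_radar @ image_feats.T / np.sqrt(d))  # (N_fg, N_img)
    return fg_radar + attn @ image_feats

rng = np.random.default_rng(0)
radar = rng.normal(size=(100, 32))   # radar tokens (illustrative)
sem_q = rng.normal(size=(10, 32))    # queried image semantic features
img = rng.normal(size=(200, 32))     # flattened image features

fg, idx = noisy_radar_filter(radar, sem_q, keep_ratio=0.3)
fused = sparse_cross_attention(fg, img)
print(fg.shape, fused.shape)  # (30, 32) (30, 32)
```

In the paper's framework, the retained foreground tokens (which carry radar position information) would additionally be concatenated with predefined queries before the transformer-based decoder; that step is omitted here.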

Keywords: 3D Object Detection; Autonomous Driving; Multi-modal
DOI: 10.1145/3664647.3681565
Language: English
Scopus ID: 2-s2.0-85204507729
Document Type: Conference paper
Collection: Faculty of Science and Technology, Department of Electromechanical Engineering
Corresponding Author: Yang, Zhi Xin
Affiliations: 1. University of Macau, Macao
2. Beijing University of Posts and Telecommunications, Beijing, China
3. Beijing Institute of Technology, Beijing, China
4. Beijing Jiaotong University, Beijing, China
First Author Affiliation: University of Macau
Corresponding Author Affiliation: University of Macau
Recommended Citation
GB/T 7714
Xu, Shaoqing, Jiang, Shengyin, Li, Fang, et al. SparseInteraction: Sparse Semantic Guidance for Radar and Camera 3D Object Detection[C]. Association for Computing Machinery, Inc., 2024: 9224-9233.
APA: Xu, Shaoqing, Jiang, Shengyin, Li, Fang, Liu, Li, Song, Ziying, Yang, Bo, & Yang, Zhi Xin. (2024). SparseInteraction: Sparse Semantic Guidance for Radar and Camera 3D Object Detection. MM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia, 9224-9233.
Files in This Item:
There are no files associated with this item.
 

Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.