DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection

doi:10.1609/aaai.v38i4.28105


Status	Published
Title	DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection
Author	Li, Xiang 1; Yin, Junbo 1; Li, Wei 2; Xu, Chengzhong 3; Yang, Ruigang 2; Shen, Jianbing 3
Date	2024-03-24
Conference Name	38th AAAI Conference on Artificial Intelligence, AAAI 2024
Source Publication	Proceedings of the AAAI Conference on Artificial Intelligence
Volume	38
Issue	4
Pages	3208-3215
Conference Date	20-27 February 2024
Conference Place	Vancouver
Country	Canada
Abstract	Vehicle-to-Everything (V2X) collaborative perception has recently gained significant attention due to its capability to enhance scene understanding by integrating information from various agents, e.g., vehicles and infrastructure. However, current works often treat the information from each agent equally, ignoring the inherent domain gap caused by the different LiDAR sensors used by each agent, which leads to suboptimal performance. In this paper, we propose DI-V2X, which aims to learn Domain-Invariant representations through a new distillation framework to mitigate the domain discrepancy in the context of V2X 3D object detection. DI-V2X comprises three essential components: a domain-mixing instance augmentation (DMA) module, a progressive domain-invariant distillation (PDD) module, and a domain-adaptive fusion (DAF) module. Specifically, DMA builds a domain-mixing 3D instance bank for the teacher and student models during training, resulting in aligned data representation. Next, PDD encourages the student models from different domains to gradually learn a domain-invariant feature representation towards the teacher, where the overlapping regions between agents are employed as guidance to facilitate the distillation process. Furthermore, DAF closes the domain gap between the students by incorporating calibration-aware domain-adaptive attention. Extensive experiments on the challenging DAIR-V2X and V2XSet benchmark datasets demonstrate that DI-V2X achieves remarkable performance, outperforming all previous V2X models. Code is available at https://github.com/Serenos/DI-V2X.
Keyword	CV: Vision for Robotics & Autonomous Driving; CV: Object Detection & Categorization
DOI	10.1609/aaai.v38i4.28105
Indexed By	CPCI-S
Language	English
WOS Research Area	Computer Science
WOS Subject	Computer Science, Artificial Intelligence ; Computer Science, Theory & Methods
WOS ID	WOS:001239884400037
Scopus ID	2-s2.0-85189447169
Document Type	Conference paper
Collection	Faculty of Science and Technology; The State Key Laboratory of Internet of Things for Smart City (University of Macau); Department of Computer and Information Science
Corresponding Author	Shen, Jianbing
Affiliation	1. School of Computer Science, Beijing Institute of Technology, China; 2. Inceptio; 3. SKL-IOTSC, CIS, University of Macau, Macao
Corresponding Author Affiliation	University of Macau
Recommended Citation GB/T 7714	Li, Xiang, Yin, Junbo, Li, Wei, et al. DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection[C], 2024, 3208-3215.
APA	Li, Xiang, Yin, Junbo, Li, Wei, Xu, Chengzhong, Yang, Ruigang, & Shen, Jianbing. (2024). DI-V2X: Learning Domain-Invariant Representation for Vehicle-Infrastructure Collaborative 3D Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 38(4), 3208-3215.
