Status: Published
Weakly Supervised Monocular 3D Object Detection by Spatial-Temporal View Consistency
Han, Wencheng¹; Tao, Runzhou²; Ling, Haibin³; Shen, Jianbing¹
2024-09-24
Source Publication: IEEE Transactions on Pattern Analysis and Machine Intelligence
ISSN: 0162-8828
Abstract

Monocular 3D object detection plays a crucial role in the field of self-driving cars, estimating the size and location of objects based solely on input images. However, a notable disparity exists between the training and inference of 3D object detectors. During inference, monocular 3D detectors depend solely on images captured by cameras, whereas during training they require 3D ground truths labeled on point cloud data, which is obtained using specialized devices such as LiDAR. This discrepancy creates a break in the data loop, preventing feedback data from production cars from being used to improve the robustness of the detectors. To address this issue and close the data loop, we present a weakly supervised solution that trains monocular 3D object detectors using only 2D labels, eliminating the requirement for 3D ground truths. Our approach considers two types of view consistency, spatial and temporal, which play a crucial role in regularizing the prediction of 3D bounding boxes. Spatial view consistency is achieved by employing projection and multi-view consistency techniques to guide the optimization of the target's location and size. We leverage temporal viewpoint consistency to provide temporal multi-view image pairs, and we further introduce temporal movement consistency to tackle the challenge of dynamic scenes. With only 2D ground truths, our method achieves performance comparable to that of fully supervised methods. Additionally, our method can be employed as a pre-training method and achieves significant improvement when fine-tuned with a small proportion of fully supervised labels.
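
The abstract does not spell out how projection consistency is computed, so the following is only an illustrative sketch of the general idea: a predicted 3D box is projected into the image, and the enclosing 2D box is compared with the 2D label. The pinhole camera model, the (length, height, width) box parameterization, the L1-style loss, and all function and parameter names here are assumptions made for illustration and are not taken from the paper.

```python
import math

def box3d_corners(center, size, yaw):
    """Return the 8 corners of a 3D box in camera coordinates.

    center: (x, y, z) of the box centre; z points forward from the camera.
    size: (length, height, width) in metres (hypothetical parameterization).
    yaw: rotation about the camera's vertical (y) axis, in radians.
    """
    cx, cy, cz = center
    l, h, w = size
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (-l / 2, l / 2):
        for dy in (-h / 2, h / 2):
            for dz in (-w / 2, w / 2):
                # Rotate the offset about the y axis, then translate.
                x = cx + c * dx + s * dz
                z = cz - s * dx + c * dz
                corners.append((x, cy + dy, z))
    return corners

def project_to_2d_box(corners, fx, fy, u0, v0):
    """Pinhole-project the corners and return (u_min, v_min, u_max, v_max).

    Assumes every corner lies in front of the camera (z > 0).
    """
    us = [fx * x / z + u0 for x, _, z in corners]
    vs = [fy * y / z + v0 for _, y, z in corners]
    return (min(us), min(vs), max(us), max(vs))

def projection_consistency_loss(pred_box2d, label_box2d):
    """Mean absolute difference between projected and labelled 2D boxes."""
    return sum(abs(p - g) for p, g in zip(pred_box2d, label_box2d)) / 4.0

# Example: a car-sized box 10 m ahead, seen by a camera with assumed intrinsics.
corners = box3d_corners(center=(0.0, 0.0, 10.0), size=(4.0, 1.5, 2.0), yaw=0.0)
projected = project_to_2d_box(corners, fx=700.0, fy=700.0, u0=640.0, v0=360.0)
loss = projection_consistency_loss(projected, projected)
```

In a weakly supervised setting of this kind, a loss like this would be driven by the 2D label alone, which is why it constrains the box only up to the ambiguities of a single view; the multi-view and temporal terms described in the abstract would address what a single projection cannot.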

Keywords: Monocular 3D Object Detection; Production Cars Data; Spatial-Temporal Consistency; Weakly Supervised Learning
DOI: 10.1109/TPAMI.2024.3466915
Language: English
Scopus ID: 2-s2.0-85204945519
Document Type: Journal article
Collection: THE STATE KEY LABORATORY OF INTERNET OF THINGS FOR SMART CITY (UNIVERSITY OF MACAU)
Affiliation:
1. University of Macau, State Key Laboratory of Internet of Things for Smart City, Department of Computer and Information Science, Macao
2. QCraft Inc, Beijing, China
3. Stony Brook University, Department of Computer Science, Stony Brook, 11794, United States
First Author Affiliation: University of Macau
Recommended Citation
GB/T 7714: Han, Wencheng, Tao, Runzhou, Ling, Haibin, et al. Weakly Supervised Monocular 3D Object Detection by Spatial-Temporal View Consistency[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024.
APA: Han, W., Tao, R., Ling, H., & Shen, J. (2024). Weakly Supervised Monocular 3D Object Detection by Spatial-Temporal View Consistency. IEEE Transactions on Pattern Analysis and Machine Intelligence.
MLA: Han, Wencheng, et al. "Weakly Supervised Monocular 3D Object Detection by Spatial-Temporal View Consistency." IEEE Transactions on Pattern Analysis and Machine Intelligence (2024).
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.