Status: Published
Normal Assisted Pixel-Visibility Learning with Cost Aggregation for Multiview Stereo
Tong, Wei (1); Guan, Xiaorong (1); Kang, Jian (2); Sun, Poly Z.H. (3,4); Law, Rob (5,6); Ghamisi, Pedram (7,8); Wu, Edmond Q. (4,9)
2022-12-05
Source Publication: IEEE Transactions on Intelligent Transportation Systems
ISSN: 1524-9050
Volume: 23, Issue: 12, Pages: 24686-24697
Abstract

Multiple-view stereo (MVS) aims to reconstruct dense 3D representations of scenes. MVS has potential applications in autonomous driving (unstructured environment construction) and robotic navigation (visual-inertial navigation). To mitigate depth estimation errors in low-textured or occluded regions, this work proposes a two-stage multi-view stereo network for fast and accurate depth estimation. The improvements of this work over the state of the art are as follows: 1) Sparse costs are constructed to jointly predict the initial depth map and surface normal through cost regularization, which shows that surface normals can be estimated in this way with low memory consumption. 2) A new edge refinement block is developed to refine the coarse surface normal into a fine-grained surface normal map. 3) Instead of using the general variance-based metric, which aggregates cost equally, a new content-adaptive cost aggregation mechanism based on the similarity of neighboring surface normals is designed for reliable cost aggregation. To the best of our knowledge, the proposed work is the first trainable network that leverages surface normals as guidance to capture neighboring pixel-visibility, which is an effective supplement to existing depth/normal estimation frameworks. Experimental results indicate that our method achieves accurate depth estimation for scene perception without compromising real-time performance or exceeding a limited memory budget.

Multiple-view stereo (MVS) aims to reconstruct dense 3D representations of scenes. It is widely used in industrial measurement, autonomous driving, and robotic navigation. To mitigate depth estimation errors in challenging scenarios, this work proposes a two-stage multi-view stereo network for fast and accurate depth estimation. Our method is the first trainable network that leverages surface normals as pixel-visibility guidance to aggregate reliable cost, which enables accurate depth estimation and provides robots with perception capability. The proposed method has great potential in 3D reconstruction, industrial measurement, and robotic navigation for real-time, accurate depth estimation with limited memory consumption.
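
The contrast drawn in item 3 of the abstract, between a variance-based metric that treats every contribution equally and an aggregation weighted by surface-normal similarity, can be illustrated with a toy sketch. This is not the paper's network: the function names `variance_cost` and `normal_weighted_cost`, the clamped cosine-similarity weighting, and the sample values are all assumptions made for illustration only.

```python
import math


def variance_cost(features):
    """Variance-based metric: per-channel variance across views,
    averaged over channels. Every view contributes equally."""
    n_views = len(features)
    n_channels = len(features[0])
    total = 0.0
    for c in range(n_channels):
        mean = sum(f[c] for f in features) / n_views
        total += sum((f[c] - mean) ** 2 for f in features) / n_views
    return total / n_channels


def normalize(v):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def normal_weighted_cost(costs, normals, center):
    """Hypothetical content-adaptive aggregation: average the costs of a
    pixel neighborhood, weighting each entry by the cosine similarity
    between its surface normal and the normal at index `center`.
    Negative similarities are clamped to zero (an assumed choice)."""
    n_center = normalize(normals[center])
    weighted_sum = 0.0
    weight_total = 0.0
    for cost, normal in zip(costs, normals):
        similarity = sum(a * b for a, b in zip(n_center, normalize(normal)))
        weight = max(0.0, similarity)
        weighted_sum += weight * cost
        weight_total += weight
    # weight_total > 0 because the center pixel always has weight 1.
    return weighted_sum / weight_total


# Assumed toy neighborhood: two pixels on the same plane as the center
# and one across a surface edge with a perpendicular normal and high cost.
costs = [0.2, 0.2, 0.9]
normals = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]

print("uniform mean:   ", sum(costs) / len(costs))
print("normal-weighted:", normal_weighted_cost(costs, normals, center=0))
```

In this toy neighborhood the pixel across the edge receives zero weight, so the aggregated cost stays near 0.2, whereas a uniform mean is pulled up to roughly 0.43 by the outlier. A learned mechanism would presumably predict such weights rather than compute them from a fixed formula.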

Keywords: Cost Aggregation; Depth Estimation; Multi-view Stereo; Pixel Visibility; Surface Normal
DOI: 10.1109/TITS.2022.3193421
Indexed By: SCIE
Language: English
WOS Research Area: Engineering; Transportation
WOS Subject: Engineering, Civil; Engineering, Electrical & Electronic; Transportation Science & Technology
WOS ID: WOS:000836681700001
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC, 445 HOES LANE, PISCATAWAY, NJ 08855-4141
Scopus ID: 2-s2.0-85135745222
Document Type: Journal article
Collection: DEPARTMENT OF INTEGRATED RESORT AND TOURISM MANAGEMENT
Faculty of Business Administration
ASIA-PACIFIC ACADEMY OF ECONOMICS AND MANAGEMENT
Corresponding Author: Guan, Xiaorong; Kang, Jian; Wu, Edmond Q.
Affiliation:
1. Nanjing University of Science and Technology, School of Mechanical Engineering, Nanjing, Jiangsu, 210094, China
2. Soochow University, School of Electronic and Information Engineering, Suzhou, 215006, China
3. The Department of Industrial Engineering, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
4. The Department of Automation, Shanghai Jiao Tong University, Shanghai, 200240, China
5. The Asia-Pacific Academy of Economics and Management, University of Macau, 999078, Macao
6. The Department of Integrated Resort and Tourism Management, Faculty of Business Administration, University of Macau, 999078, Macao
7. The Helmholtz-Zentrum Dresden-Rossendorf (HZDR), Helmholtz Institute Freiberg for Resource Technology (HIF), Exploration, Freiberg, D09599, Germany
8. The Institute of Advanced Research in Artificial Intelligence (IARAI), Vienna, 1030, Austria
9. The Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China
Recommended Citation
GB/T 7714: Tong, Wei, Guan, Xiaorong, Kang, Jian, et al. Normal Assisted Pixel-Visibility Learning with Cost Aggregation for Multiview Stereo[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(12): 24686-24697.
APA: Tong, Wei, Guan, Xiaorong, Kang, Jian, Sun, Poly Z.H., Law, Rob, Ghamisi, Pedram, & Wu, Edmond Q. (2022). Normal Assisted Pixel-Visibility Learning with Cost Aggregation for Multiview Stereo. IEEE Transactions on Intelligent Transportation Systems, 23(12), 24686-24697.
MLA: Tong, Wei, et al. "Normal Assisted Pixel-Visibility Learning with Cost Aggregation for Multiview Stereo." IEEE Transactions on Intelligent Transportation Systems 23.12 (2022): 24686-24697.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.