Status | Published |
High-Fidelity and High-Efficiency Talking Portrait Synthesis With Detail-Aware Neural Radiance Fields | |
Wang, Muyu1; Zhao, Sanyuan2; Dong, Xingping1; Shen, Jianbing3 | |
2024-11 | |
Source Publication | IEEE Transactions on Visualization and Computer Graphics |
ISSN | 1077-2626 |
Abstract | In this paper, we propose HH-NeRF, a novel rendering framework based on neural radiance fields (NeRF) that generates high-resolution, audio-driven talking portrait videos with high fidelity and fast rendering. Specifically, our framework includes a detail-aware NeRF module and an efficient conditional super-resolution module. First, the detail-aware NeRF efficiently generates a high-fidelity low-resolution talking head by using encoded volume density estimation and audio-eye-aware color calculation. This module captures natural eye blinks and high-frequency details while maintaining a rendering time similar to that of previous fast methods. Second, we present an efficient conditional super-resolution module for dynamic scenes that directly generates the high-resolution portrait from our low-resolution head. By incorporating prior information, such as the depth map and audio features, this module can adopt a lightweight network to efficiently generate realistic, sharp high-resolution videos. Extensive experiments demonstrate that our method generates sharper and higher-fidelity talking portraits on high-resolution (900 × 900) videos than state-of-the-art methods. |
Keyword | Neural Radiance Fields; Talking Portrait; Audio; Super-resolution |
DOI | 10.1109/TVCG.2024.3488960 |
Language | English |
Scopus ID | 2-s2.0-85208531865 |
Document Type | Journal article |
Collection | Faculty of Science and Technology; The State Key Laboratory of Internet of Things for Smart City (University of Macau); Department of Computer and Information Science |
Affiliation | 1. Wuhan University, School of Computer Science, National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence, Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan, 430072, China; 2. Beijing Institute of Technology, School of Computer Science, Beijing, China; 3. University of Macau, State Key Laboratory of Internet of Things for Smart City, Department of Computer and Information Science, Macao |
Recommended Citation GB/T 7714 | Wang, Muyu, Zhao, Sanyuan, Dong, Xingping, et al. High-Fidelity and High-Efficiency Talking Portrait Synthesis With Detail-Aware Neural Radiance Fields[J]. IEEE Transactions on Visualization and Computer Graphics, 2024. |
APA | Wang, M., Zhao, S., Dong, X., & Shen, J. (2024). High-Fidelity and High-Efficiency Talking Portrait Synthesis With Detail-Aware Neural Radiance Fields. IEEE Transactions on Visualization and Computer Graphics. |
MLA | Wang, Muyu, et al. "High-Fidelity and High-Efficiency Talking Portrait Synthesis With Detail-Aware Neural Radiance Fields." IEEE Transactions on Visualization and Computer Graphics (2024). |