Residential College: false
Status: Published
High-Fidelity and High-Efficiency Talking Portrait Synthesis With Detail-Aware Neural Radiance Fields
Wang, Muyu (1); Zhao, Sanyuan (2); Dong, Xingping (1); Shen, Jianbing (3)
2024-11
Source Publication: IEEE Transactions on Visualization and Computer Graphics
ISSN: 1077-2626
Abstract

In this paper, we propose HH-NeRF, a novel rendering framework based on neural radiance fields (NeRF) that generates high-resolution, audio-driven talking portrait videos with high fidelity and fast rendering. Specifically, our framework comprises a detail-aware NeRF module and an efficient conditional super-resolution module. First, the detail-aware NeRF efficiently generates a high-fidelity, low-resolution talking head by using encoded volume density estimation and audio-eye-aware color calculation. This module captures natural eye blinks and high-frequency details while keeping its rendering time similar to that of previous fast methods. Second, we present an efficient conditional super-resolution module for dynamic scenes that generates the high-resolution portrait directly from our low-resolution head. By incorporating prior information such as depth maps and audio features, the newly proposed super-resolution module can adopt a lightweight network to efficiently generate realistic and clear high-resolution videos. Extensive experiments demonstrate that, on high-resolution (900 × 900) videos, our method generates clearer and higher-fidelity talking portraits than state-of-the-art methods.
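For readers unfamiliar with how a NeRF turns per-sample volume densities and colors into a pixel, the following is a minimal sketch of the generic NeRF volume-rendering quadrature along a single ray. It is not the authors' implementation: the abstract does not describe how HH-NeRF estimates its encoded densities or audio-eye-aware colors, so the function name `composite_ray` and its inputs are illustrative assumptions.

```python
import math


def composite_ray(densities, colors, deltas):
    """Generic NeRF quadrature for one ray (illustrative, not HH-NeRF code).

    densities: per-sample volume density sigma_i (non-negative floats)
    colors:    per-sample RGB tuples with components in [0, 1]
    deltas:    distance between consecutive samples along the ray
    Returns the composited RGB color and the per-sample weights.
    """
    transmittance = 1.0  # fraction of light that has not been absorbed yet
    rgb = [0.0, 0.0, 0.0]
    weights = []
    for sigma, color, delta in zip(densities, colors, deltas):
        # Opacity contributed by this sample's segment of the ray.
        alpha = 1.0 - math.exp(-sigma * delta)
        weight = transmittance * alpha
        weights.append(weight)
        for channel in range(3):
            rgb[channel] += weight * color[channel]
        # Light remaining for the samples behind this one.
        transmittance *= 1.0 - alpha
    return rgb, weights


# Hypothetical ray: an empty sample followed by an effectively opaque green one.
pixel, sample_weights = composite_ray(
    densities=[0.0, 1e9],
    colors=[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    deltas=[1.0, 1.0],
)
```

In this usage, the first sample has zero density and contributes nothing, so the pixel takes the color of the opaque sample behind it. The per-sample weights can also be used to compute an expected depth per ray, which is one common way a depth map of the kind mentioned in the abstract is obtained in NeRF pipelines; whether HH-NeRF does so is not stated here.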

Keywords: Neural Radiance Fields; Talking Portrait; Audio; Super-resolution
DOI: 10.1109/TVCG.2024.3488960
Language: English
Scopus ID: 2-s2.0-85208531865
Document Type: Journal article
Collection:
Faculty of Science and Technology
The State Key Laboratory of Internet of Things for Smart City (University of Macau)
Department of Computer and Information Science
Affiliation:
1. Wuhan University, School of Computer Science, National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence, Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan, 430072, China
2. Beijing Institute of Technology, School of Computer Science, Beijing, China
3. University of Macau, State Key Laboratory of Internet of Things for Smart City, Department of Computer and Information Science, Macao
Recommended Citation
GB/T 7714: Wang, Muyu, Zhao, Sanyuan, Dong, Xingping, et al. High-Fidelity and High-Efficiency Talking Portrait Synthesis With Detail-Aware Neural Radiance Fields[J]. IEEE Transactions on Visualization and Computer Graphics, 2024.
APA: Wang, M., Zhao, S., Dong, X., & Shen, J. (2024). High-Fidelity and High-Efficiency Talking Portrait Synthesis With Detail-Aware Neural Radiance Fields. IEEE Transactions on Visualization and Computer Graphics.
MLA: Wang, Muyu, et al. "High-Fidelity and High-Efficiency Talking Portrait Synthesis With Detail-Aware Neural Radiance Fields." IEEE Transactions on Visualization and Computer Graphics (2024).