Residential College | false |
Status | Published |
Single Cross-domain Semantic Guidance Network for Multimodal Unsupervised Image Translation | |
Lan, Jiaying (1); Cheng, Lianglun (1); Huang, Guoheng (1); Pun, Chi Man (2); Yuan, Xiaochen (3); Lai, Shangyu (4); Liu, Hong Rui (5); Ling, Wing Kuen (1) | |
2023-03-29 | |
Conference Name | Proceedings of the 29th International Conference on MultiMedia Modeling (MMM) |
Source Publication | Proceedings of the 29th International Conference on MultiMedia Modeling (MMM) |
Volume | 13833 LNCS |
Pages | 165-177 |
Conference Date | 2023-01 |
Conference Place | Norway |
Abstract | Multimodal image-to-image translation has received great attention due to its flexibility and practicality. Existing methods lack a general, effective style representation and cannot capture different levels of stylistic semantic information from cross-domain images. Moreover, they ignore parallelism in cross-domain image generation, so each generator can only handle specific domains. To address these issues, we propose a novel Single Cross-domain Semantic Guidance Network (SCSG-Net) for coarse-to-fine, semantically controllable multimodal image translation. Images from different domains are mapped to a unified visual semantic latent space by a dual sparse feature pyramid encoder, and the generative module then produces the result images by extracting semantic style representations from the input images in a self-supervised manner guided by adaptive discrimination. In particular, SCSG-Net meets users' needs across different styles and diverse scenarios. Extensive experiments on different benchmark datasets show that our method can outperform other state-of-the-art methods both quantitatively and qualitatively. |
Keyword | Multimodal Image Translation ; Semantic Guidance ; Unsupervised Learning |
DOI | 10.1007/978-3-031-27077-2_13 |
Indexed By | CPCI-S |
Language | English |
WOS Research Area | Computer Science |
WOS Subject | Computer Science, Artificial Intelligence ; Computer Science, Interdisciplinary Applications ; Computer Science, Theory & Methods |
WOS ID | WOS:000996563000013 |
Scopus ID | 2-s2.0-85152572411 |
Document Type | Conference paper |
Collection | DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE |
Corresponding Author | Huang,Guoheng |
Affiliation | 1. Guangdong University of Technology, Guangzhou, China; 2. University of Macau, Macau, China; 3. Macao Polytechnic University, Macau, China; 4. University of Maryland College Park, Maryland, MD, 20742, USA; 5. San José State University, San José, CA, 95192, USA |
Recommended Citation GB/T 7714 | Lan, Jiaying, Cheng, Lianglun, Huang, Guoheng, et al. Single Cross-domain Semantic Guidance Network for Multimodal Unsupervised Image Translation[C], 2023, 165-177. |
APA | Lan, J., Cheng, L., Huang, G., Pun, C. M., Yuan, X., Lai, S., Liu, H. R., & Ling, W. K. (2023). Single Cross-domain Semantic Guidance Network for Multimodal Unsupervised Image Translation. Proceedings of the 29th International Conference on MultiMedia Modeling (MMM), 13833 LNCS, 165-177. |