Residential College: false
Status: Published
SQ-ViT: A Multi-Scale Vision Transformer With Quaternion For Endoscopic Images Classification
Jin, Zhanjun (1); Huang, Guoheng (1); Zhang, Feng (2); Yuan, Xiaochen (3); Zhu, Dingzhou (1); Tan, Zhe (1); Pun, Chi Man (4); Zhong, Guo (5)
2024-12-16
Source Publication: IEEE Transactions on Consumer Electronics
ISSN: 0098-3063
Abstract

In medical consumer electronics, endoscopic imaging, especially electronic nasopharyngoscope imaging, often suffers from low resolution, and the resulting loss of image detail makes endoscopic image classification difficult. Recent Vision Transformer (ViT) based methods have shown promise in addressing this problem. However, ViT relies heavily on global context information to maintain performance, and the limited pixel count of low-resolution images makes it hard to capture adequate global context. To address these challenges, we propose the Sequential Quaternion Vision Transformer (SQ-ViT), which improves multi-scale feature utilization by feeding sampled features into the subsequent encoder layers. Specifically, we introduce the Multi-scale Visual Feature Fusion (MVFF) module, which segments the image into multiple superpixel blocks and refines the contour and color information of the processed image, enhancing the representation of visual features. In addition, our proposed Quaternion Interactive Encoder (QIE) captures visual information more effectively. Experiments demonstrate the effectiveness of SQ-ViT in improving multi-scale feature utilization and in addressing the challenges of low-resolution endoscopic imaging for endoscopic image classification. The source code will be released at https://github.com/jinzhanjun625/SQViT.
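
As general background for the quaternion component named in the abstract, the sketch below shows the Hamilton product, the multiplication rule that quaternion-valued layers are typically built on. It is not taken from the SQ-ViT code: the function names `hamilton_product` and `rgb_to_quaternion` are hypothetical, and encoding an RGB pixel as a pure quaternion is a common convention in quaternion image processing that is assumed here, not a confirmed detail of the QIE.

```python
from typing import Tuple

# A quaternion a + b*i + c*j + d*k stored as (a, b, c, d).
Quaternion = Tuple[float, float, float, float]


def hamilton_product(p: Quaternion, q: Quaternion) -> Quaternion:
    """Return the Hamilton product p * q (non-commutative)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )


def rgb_to_quaternion(r: float, g: float, b: float) -> Quaternion:
    """Encode an RGB pixel as a pure quaternion (assumed convention)."""
    return (0.0, float(r), float(g), float(b))


if __name__ == "__main__":
    i = (0, 1, 0, 0)
    j = (0, 0, 1, 0)
    print(hamilton_product(i, j))  # i * j
    print(hamilton_product(j, i))  # j * i, which differs in sign
```

Because the product mixes all four components of both operands, a quaternion weight applied to a pixel encoded this way couples the three color channels in a single operation, which is the usual motivation for using quaternions on color images.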

Keywords: Endoscopic Images Classification; Endoscopy; Interpretability; Quaternion Convolution; Superpixel; Vision Transformer
DOI: 10.1109/TCE.2024.3518755
Language: English
Publisher: Institute of Electrical and Electronics Engineers Inc.
Scopus ID: 2-s2.0-85213027035
Document Type: Journal article
Collection: Faculty of Science and Technology
DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Corresponding Author: Huang, Guoheng; Zhong, Guo
Affiliation:
1. Guangdong University of Technology, School of Computer Science and Technology, Guangzhou, Guangdong, 510000, China
2. Sun Yat-sen University First Affiliated Hospital Department of Nephrology, Department of Otorhinolaryngology, Guangzhou, Guangdong, 510000, China
3. Macao Polytechnic University, Faculty of Applied Sciences, 999078, Macao
4. University of Macao, Department of Computer and Information Science, 999078, Macao
5. Guangdong University of Foreign Studies, School of Information Science and Technology, Guangzhou, Guangdong, 510000, China
Recommended Citation
GB/T 7714: Jin, Zhanjun, Huang, Guoheng, Zhang, Feng, et al. SQ-ViT: A Multi-Scale Vision Transformer With Quaternion For Endoscopic Images Classification[J]. IEEE Transactions on Consumer Electronics, 2024.
APA: Jin, Z., Huang, G., Zhang, F., Yuan, X., Zhu, D., Tan, Z., Pun, C. M., & Zhong, G. (2024). SQ-ViT: A Multi-Scale Vision Transformer With Quaternion For Endoscopic Images Classification. IEEE Transactions on Consumer Electronics.
MLA: Jin, Zhanjun, et al. "SQ-ViT: A Multi-Scale Vision Transformer With Quaternion For Endoscopic Images Classification." IEEE Transactions on Consumer Electronics (2024).