Status: Published
More Efficient and Locally Enhanced Transformer
Zhu, Zhefeng¹; Qi, Ke¹; Zhou, Yicong²; Chen, Wenbin¹; Zhang, Jingdong³
2023
Conference Name: 29th International Conference on Neural Information Processing (ICONIP 2022)
Source Publication: Communications in Computer and Information Science
Volume: 1792 CCIS
Pages: 86-97
Conference Date: November 22-26, 2022
Conference Place: Virtual, Online
Publisher: Springer Science and Business Media Deutschland GmbH
Abstract

To address two problems in current ViT models, the high computational cost of self-attention and the weakening of local feature information by cascaded self-attention, this paper proposes an ESA (Efficient Self-attention) module that reduces computational complexity and an LE (Locally Enhanced) module that strengthens local information. The ESA module ranks the attention intensities between the class token and the patch tokens in each Transformer encoder of the ViT model, retains in the attention matrix only the weights of patch tokens strongly associated with the class token, and reuses the attention matrices of adjacent layers, thereby reducing computation and accelerating inference. The LE module places a depth-wise convolution in parallel within each Transformer encoder, enabling the Transformer to capture global feature information while strengthening local feature information, which effectively improves image recognition accuracy. Extensive experiments on common image recognition datasets such as Tiny ImageNet, CIFAR-10 and CIFAR-100 show that the proposed method achieves better recognition accuracy with less computation.
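The following is a minimal, self-contained PyTorch sketch of the two ideas described in the abstract, written from the abstract alone rather than from the authors' code: an attention layer that keeps only the patch tokens most strongly attended by the class token (a rough stand-in for the ESA pruning; the reuse of attention matrices across adjacent layers is not shown), and an encoder block with a parallel depth-wise convolution branch in the spirit of the LE module. All names (ESAAttention, LEBlock, keep_ratio) and design details are illustrative assumptions.

```python
# Hypothetical sketch of the ESA and LE ideas; not the authors' implementation.
import torch
import torch.nn as nn


class ESAAttention(nn.Module):
    """Self-attention that keeps only the patch tokens most attended by the
    class token (a rough stand-in for the ESA pruning idea)."""

    def __init__(self, dim, num_heads=4, keep_ratio=0.5):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.keep_ratio = keep_ratio
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        # x: (B, 1 + N, C) with the class token first
        B, T, C = x.shape
        qkv = self.qkv(x).reshape(B, T, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)            # each (B, H, T, head_dim)
        attn = (q @ k.transpose(-2, -1)) * self.scale    # (B, H, T, T)
        attn = attn.softmax(dim=-1)

        # Class-token attention over patch tokens, averaged across heads.
        cls_attn = attn[:, :, 0, 1:].mean(dim=1)         # (B, N)
        k_keep = max(1, int(self.keep_ratio * (T - 1)))
        keep_idx = cls_attn.topk(k_keep, dim=-1).indices + 1  # shift past cls

        # Zero out columns of weakly attended patch tokens, keep the cls column.
        mask = torch.zeros(B, T, dtype=torch.bool, device=x.device)
        mask[:, 0] = True
        mask.scatter_(1, keep_idx, torch.ones_like(keep_idx, dtype=torch.bool))
        attn = attn * mask[:, None, None, :]
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-6)

        out = (attn @ v).transpose(1, 2).reshape(B, T, C)
        return self.proj(out)


class LEBlock(nn.Module):
    """Encoder block with a parallel depth-wise convolution branch that
    re-injects local detail alongside global self-attention."""

    def __init__(self, dim, num_heads=4, keep_ratio=0.5):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = ESAAttention(dim, num_heads, keep_ratio)
        self.dwconv = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)  # depth-wise
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, x, hw):
        # x: (B, 1 + N, C); hw: (H, W) of the patch grid with H * W == N
        cls_tok, patches = x[:, :1], x[:, 1:]
        B, N, C = patches.shape
        H, W = hw
        local = self.dwconv(patches.transpose(1, 2).reshape(B, C, H, W))
        local = local.flatten(2).transpose(1, 2)         # back to (B, N, C)
        x = x + self.attn(self.norm1(x)) + torch.cat(
            [torch.zeros_like(cls_tok), local], dim=1)
        return x + self.mlp(self.norm2(x))


if __name__ == "__main__":
    blk = LEBlock(dim=64, num_heads=4, keep_ratio=0.5)
    tokens = torch.randn(2, 1 + 14 * 14, 64)             # cls token + 14x14 patches
    print(blk(tokens, (14, 14)).shape)                   # torch.Size([2, 197, 64])
```

In this sketch, keep_ratio controls how many patch tokens retain non-zero attention weight; how the paper actually selects the retained weights and reuses attention matrices between layers may differ.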

Keywords: Efficient Self-attention; Image Recognition; Locally Enhanced; ViT
DOI: 10.1007/978-981-99-1642-9_8
Language: English
Scopus ID: 2-s2.0-85161683342
Document Type: Conference paper
Collection: Faculty of Science and Technology, Department of Computer and Information Science
Corresponding Author: Qi, Ke
Affiliation:
1. Guangzhou University, Guangzhou, China
2. University of Macau, Taipa, Macao
3. South China Normal University, Guangzhou, China
Recommended Citation
GB/T 7714: Zhu Zhefeng, Qi Ke, Zhou Yicong, et al. More Efficient and Locally Enhanced Transformer[C]. Springer Science and Business Media Deutschland GmbH, 2023: 86-97.
APA: Zhu, Zhefeng, Qi, Ke, Zhou, Yicong, Chen, Wenbin, & Zhang, Jingdong (2023). More Efficient and Locally Enhanced Transformer. Communications in Computer and Information Science, 1792 CCIS, 86-97.