Status: Forthcoming
A Two-level Rectification Attention Network for Scene Text Recognition
Wu, Lintai (1); Xu, Yong (2); Hou, Junhui (3); Chen, C. L. Philip (4); Liu, Cheng Lin (5)
2022
Source Publication: IEEE Transactions on Multimedia
ISSN: 1520-9210
Abstract

Scene text recognition is a challenging task in the computer vision field due to the diversity of text styles and the complexity of the image backgrounds. In recent decades, numerous text rectification and recognition methods have been proposed to solve these problems. However, most of these methods rectify texts at the geometry level or pixel level. The former is limited by geometric constraints, and the latter is prone to blurring the text. In this paper, we propose a two-level rectification attention network (TRAN) to rectify and recognize texts. This network consists of two parts: a two-level rectification network (TORN) and an attention-based recognition network (ABRN). Specifically, the TORN first rectifies texts at the geometry level and then performs a pixel-level adjustment, which not only eliminates the geometric constraints but also renders clear texts. The ABRN's role is to recognize text in the rectified images. To improve the feature extraction ability of our model, we design a new channel-wise and kernel-wise attention unit, which enables the network to handle significant variations of character size and channel interdependencies. Furthermore, we propose a skip training strategy to make our model converge smoothly. We conduct experiments on various benchmarks, including regular and irregular datasets. The experimental results show that our method achieves a state-of-the-art performance.
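
The two-level idea in the abstract (a geometry-level rectification followed by a pixel-level adjustment) can be illustrated with a minimal sketch. This is not the TORN architecture from the paper, whose details are not given in this record. It assumes, purely for illustration, that the geometry level is a 2x3 affine map over the sampling grid and that the pixel level adds one (dx, dy) offset per output pixel before bilinear sampling. In a real model both would be predicted by a network; here they are supplied by hand. All function names (`bilinear_sample`, `geometry_grid`, `add_pixel_offsets`, `rectify`) are hypothetical.

```python
import math


def bilinear_sample(img, x, y):
    """Sample img (a list of rows) at real-valued (x, y), clamping to the border."""
    h, w = len(img), len(img[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def geometry_grid(h, w, affine):
    """Level 1 (assumed affine): map each output pixel to a source coordinate."""
    (a, b, c), (d, e, f) = affine
    return [[(a * x + b * y + c, d * x + e * y + f) for x in range(w)]
            for y in range(h)]


def add_pixel_offsets(grid, offsets):
    """Level 2: shift every sampling coordinate by its own (dx, dy) offset."""
    return [[(gx + dx, gy + dy) for (gx, gy), (dx, dy) in zip(grid_row, off_row)]
            for grid_row, off_row in zip(grid, offsets)]


def rectify(img, affine, offsets):
    """Apply the geometry-level grid, then the pixel-level offsets, then resample."""
    h, w = len(img), len(img[0])
    grid = add_pixel_offsets(geometry_grid(h, w, affine), offsets)
    return [[bilinear_sample(img, x, y) for (x, y) in row] for row in grid]
```

With an identity affine matrix `((1, 0, 0), (0, 1, 0))` and all-zero offsets, `rectify` returns the input image unchanged. Non-zero offsets move individual sampling points without being bound to a single global transformation, which is the sense in which a pixel-level stage can relax geometric constraints.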

Keyword: Scene Text Recognition; Text Rectification; Spatial Transformer Network; Optical Character Recognition
DOI: 10.1109/TMM.2022.3146779
Language: English
Scopus ID: 2-s2.0-85124089811
Document Type: Journal article
Collection: DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE
Corresponding Author: Xu, Yong
Affiliation:
1. Computer Science and Technology, Harbin Institute of Technology Shenzhen, 529484 Shenzhen, China, 518055
2. Computer Science & Engineering, Harbin Institute of Technology, 47822 Harbin, China, 150001
3. Computer Science, City University of Hong Kong, 53025 Kowloon, Hong Kong
4. Department of Computer and Information Science, University of Macau, Macau, Macau, China, 999078
5. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, 74522 Beijing, China, 100190
Recommended Citation
GB/T 7714: Wu, Lintai, Xu, Yong, Hou, Junhui, et al. A Two-level Rectification Attention Network for Scene Text Recognition[J]. IEEE Transactions on Multimedia, 2022.
APA: Wu, Lintai, Xu, Yong, Hou, Junhui, Chen, C. L. Philip, & Liu, Cheng Lin (2022). A Two-level Rectification Attention Network for Scene Text Recognition. IEEE Transactions on Multimedia.
MLA: Wu, Lintai, et al. "A Two-level Rectification Attention Network for Scene Text Recognition." IEEE Transactions on Multimedia (2022).