Status | Published |
TriView-ParNet: parallel network for hybrid recognition of touching printed and handwritten strings based on feature fusion and three-view co-training | |
Qiu, Junhao1; Lai, Shangyu2; Huang, Guoheng3; Zhang, Weiwen3; Mai, Junhui3; Pun, Chi Man4; Ling, Wing Kuen5 | |
2022-12-21 | |
Source Publication | APPLIED INTELLIGENCE |
ISSN | 0924-669X |
Volume | 53, Issue: 13, Pages: 17015–17034 |
Abstract | Deep learning has become a mainstream solution for recognizing printed and isolated handwritten characters in Optical Character Recognition (OCR). However, hybrid recognition of adjoining printed and handwritten strings remains a challenge, especially when characters touch and the data is imbalanced. In this paper, we propose a hybrid recognition scheme, termed TriView-ParNet, for adjoining printed-and-handwritten strings. First, we introduce a Parallel Network, which consists of a Two-stream feature Extraction and Fusion Module (TEFM) and a Context Extraction and Transcription Module (CETM). The TEFM addresses the case where printed and handwritten characters touch: it fuses the content and positional features extracted by two feature extraction networks to enrich the original feature representation. The CETM then extracts contextual information from the sequence, and using these contextual cues improves the recognition of long strings. Second, we propose a Three-view Co-training Module to address the poor performance of direct training on a small amount of labeled data. Following the idea of semi-supervised learning, a classifier is trained from three different views: print, handwriting, and hybrid. Finally, we compare our method with state-of-the-art methods on the public dataset NIST SD19 and the newly collected dataset CPHS2020. The experimental results demonstrate that our method achieves higher string recognition accuracy. In summary, TriView-ParNet extracts positional and contextual information to enhance recognition performance and also provides a semi-supervised learning solution. |
Keyword | Strings Recognition; Mdsr Recognition; Feature Fusion; Multi-view Training; Semi-supervised Learning |
DOI | 10.1007/s10489-022-04257-x |
Indexed By | SCIE |
Language | English |
WOS Research Area | Computer Science |
WOS Subject | Computer Science, Artificial Intelligence |
WOS ID | WOS:000901986100001 |
Publisher | SPRINGER, VAN GODEWIJCKSTRAAT 30, 3311 GZ DORDRECHT, NETHERLANDS |
Scopus ID | 2-s2.0-85144519974 |
Document Type | Journal article |
Collection | Faculty of Science and Technology DEPARTMENT OF COMPUTER AND INFORMATION SCIENCE |
Corresponding Author | Huang, Guoheng |
Affiliation | 1. School of Electromechanical Engineering, Guangdong University of Technology, Guangzhou, 510006, China; 2. College of Computer, Mathematical, and Natural Sciences, 2300 Symons Hall, University of Maryland College Park, Baltimore, MD 20742, Maryland, USA; 3. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, 510006, China; 4. Department of Computer and Information Science, University of Macau, Macau, 999078 SAR, China; 5. School of Information Engineering, Guangdong University of Technology, Guangzhou, 510006, China |
Recommended Citation GB/T 7714 | Qiu, Junhao, Lai, Shangyu, Huang, Guoheng, et al. TriView-ParNet: parallel network for hybrid recognition of touching printed and handwritten strings based on feature fusion and three-view co-training[J]. APPLIED INTELLIGENCE, 2022, 53(13), 17015–17034. |
APA | Qiu, Junhao., Lai, Shangyu., Huang, Guoheng., Zhang, Weiwen., Mai, Junhui., Pun, Chi Man., & Ling, Wing Kuen (2022). TriView-ParNet: parallel network for hybrid recognition of touching printed and handwritten strings based on feature fusion and three-view co-training. APPLIED INTELLIGENCE, 53(13), 17015–17034. |
MLA | Qiu, Junhao, et al. "TriView-ParNet: parallel network for hybrid recognition of touching printed and handwritten strings based on feature fusion and three-view co-training". APPLIED INTELLIGENCE 53.13 (2022): 17015–17034. |