Convolutional Attention Networks for Scene Text Recognition
Xie, HT (Xie, Hongtao)[ 1 ]; Fang, SC (Fang, Shancheng)[ 2,3 ]; Zha, ZJ (Zha, Zheng-Jun)[ 1 ]; Yang, YT (Yang, Yating)[ 4 ]; Li, Y (Li, Yan)[ 5 ]; Zhang, YD (Zhang, Yongdong)[ 1 ]
2019
Source Publication: ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS
ISSN: 1551-6857
Volume: 15 Issue: 1 (Supplement) Pages: 3-17
Abstract

In this article, we present Convolutional Attention Networks (CAN) for unconstrained scene text recognition. Recent dominant approaches for scene text recognition are mainly based on Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), where the CNN encodes images and the RNN generates character sequences. Unlike these methods, CAN is built entirely on CNNs and includes an attention mechanism. The distinctive characteristics of our method are as follows: (i) CAN follows an encoder-decoder architecture, in which the encoder is a deep two-dimensional CNN and the decoder is a one-dimensional CNN; (ii) the attention mechanism is applied in every convolutional layer of the decoder, and we propose a novel spatial attention method using average pooling; and (iii) position embeddings are used in both the spatial encoder and the sequence decoder to give our networks a sense of location. We conduct experiments on standard datasets for scene text recognition, including Street View Text, IIIT5K, and ICDAR datasets. The experimental results validate the effectiveness of the different components and show that our convolution-based method achieves state-of-the-art or competitive performance compared with prior works, even without the use of RNNs.
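
To make point (ii) more concrete, the following is a minimal sketch of one possible reading of spatial attention over a 2D encoder feature map. It is not the authors' implementation: the abstract does not say where the average pooling is applied, so smoothing the score map with a 3x3 mean filter before the softmax is an assumption, as are the scaled dot-product scoring and the hypothetical names `avg_pool_3x3` and `spatial_attention`.

```python
import math


def avg_pool_3x3(score_map):
    """Smooth a 2D score map with a 3x3 mean filter.

    Border cells average only over the neighbours that exist.
    """
    h, w = len(score_map), len(score_map[0])
    pooled = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            window = [score_map[y][x]
                      for y in range(max(0, i - 1), min(h, i + 2))
                      for x in range(max(0, j - 1), min(w, j + 2))]
            pooled[i][j] = sum(window) / len(window)
    return pooled


def spatial_attention(query, features):
    """Attend from one decoder state to every position of a feature map.

    query:    list of d floats (a decoder state at one time step)
    features: H x W x d nested lists (encoder output)
    Returns (weights as an H x W map, context vector of d floats).
    """
    h, w, d = len(features), len(features[0]), len(query)
    # Scaled dot-product score for each spatial position (assumed scoring rule).
    scores = [[sum(q * f for q, f in zip(query, features[i][j])) / math.sqrt(d)
               for j in range(w)] for i in range(h)]
    # Assumption: average pooling is applied to the score map before softmax.
    scores = avg_pool_3x3(scores)
    # Softmax over all H*W positions, shifted by the maximum for stability.
    peak = max(max(row) for row in scores)
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    weights = [[e / total for e in row] for row in exps]
    # Context vector: attention-weighted sum of the feature vectors.
    context = [sum(weights[i][j] * features[i][j][k]
                   for i in range(h) for j in range(w))
               for k in range(d)]
    return weights, context
```

In a decoder that follows the abstract's description, a step like this would run in every convolutional layer, with `query` taken from that layer's output at each character position.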

Keyword: Text recognition; text detection; convolutional neural networks; multi-level supervised information; attention model
DOI: 10.1145/3231737
Indexed By: SCI
WOS ID: WOS:000459798100003
Document Type: Journal article
Identifier: http://ir.xjipc.cas.cn/handle/365002/5690
Collection: Multilingual Information Technology Research Laboratory
Affiliation: 1. Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei, Anhui, Peoples R China
2. Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
3. Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
4. Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Urumqi, Peoples R China
5. Beijing Kuaishou Technol Co Ltd, Beijing, Peoples R China
Recommended Citation
GB/T 7714
Xie, HT, Fang, SC, Zha, ZJ, et al. Convolutional Attention Networks for Scene Text Recognition[J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2019, 15(1 Suppl.): 3-17.
APA Xie, HT, Fang, SC, Zha, ZJ, Yang, YT, Li, Y, & Zhang, YD. (2019). Convolutional Attention Networks for Scene Text Recognition. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 15(1 Suppl.), 3-17.
MLA Xie, HT, et al. "Convolutional Attention Networks for Scene Text Recognition." ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS 15.1 Suppl. (2019): 3-17.
Files in This Item:
File Name/Size | DocType | Version | Access | License
Convolutional Attent (1094KB) | Journal article | Author accepted manuscript | Open access | CC BY-NC-SA
File name: Convolutional Attention Networks for Scene Text Recognition.pdf
Format: Adobe PDF
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.