Convolutional Attention Networks for Scene Text Recognition
Xie, HT (Xie, Hongtao) [1]; Fang, SC (Fang, Shancheng) [2,3]; Zha, ZJ (Zha, Zheng-Jun) [1]; Yang, YT (Yang, Yating) [4]; Li, Y (Li, Yan) [5]; Zhang, YD (Zhang, Yongdong) [1]
Journal: ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS
Year: 2019
Volume: 15; Issue: 1 (Supplement); Pages: 3-17
Keywords: text recognition; text detection; convolutional neural networks; multi-level supervised information; attention model
ISSN: 1551-6857
DOI: 10.1145/3231737
Abstract (English)

In this article, we present Convolutional Attention Networks (CAN) for unconstrained scene text recognition. Recent dominant approaches to scene text recognition are mainly based on Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), where the CNN encodes images and the RNN generates character sequences. In contrast, CAN is built entirely on CNNs and includes an attention mechanism. The distinctive characteristics of our method are as follows: (i) CAN follows an encoder-decoder architecture, in which the encoder is a deep two-dimensional CNN and the decoder is a one-dimensional CNN; (ii) the attention mechanism is applied in every convolutional layer of the decoder, and we propose a novel spatial attention method using average pooling; and (iii) position embeddings are added to both the spatial encoder and the sequence decoder to give the networks a sense of location. We conduct experiments on standard datasets for scene text recognition, including the Street View Text, IIIT5K, and ICDAR datasets. The experimental results validate the effectiveness of the individual components and show that our convolution-based method achieves state-of-the-art or competitive performance compared with prior work, even without the use of an RNN.
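
To make the attention step in (ii) more concrete, the following is a minimal, illustrative sketch of spatial attention over a two-dimensional feature map with an average-pooling step. It is not the authors' implementation: the abstract does not say where the pooling is applied, so the choice to average-pool the keys, the scaled dot-product scoring, and the names `avg_pool_2d` and `spatial_attention` are all assumptions made for this example.

```python
import math


def avg_pool_2d(fmap, k=3):
    """Average-pool an H x W x D feature map with a k x k window, stride 1.

    Windows are clipped at the borders, so the output keeps the H x W shape.
    """
    H, W, D = len(fmap), len(fmap[0]), len(fmap[0][0])
    r = k // 2
    out = []
    for i in range(H):
        row = []
        for j in range(W):
            cells = [fmap[a][b]
                     for a in range(max(0, i - r), min(H, i + r + 1))
                     for b in range(max(0, j - r), min(W, j + r + 1))]
            row.append([sum(c[d] for c in cells) / len(cells) for d in range(D)])
        out.append(row)
    return out


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]


def spatial_attention(queries, fmap, k=3):
    """Attend from T decoder states (T x D) to an H x W x D encoder feature map.

    Assumption for this sketch: keys are the average-pooled features and
    values are the raw features. Returns (contexts T x D, weights T x H*W).
    """
    keys = [v for row in avg_pool_2d(fmap, k) for v in row]  # flattened, pooled
    values = [v for row in fmap for v in row]                # flattened, raw
    scale = math.sqrt(len(queries[0]))
    contexts, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, key)) / scale for key in keys]
        w = softmax(scores)
        ctx = [sum(wi * v[d] for wi, v in zip(w, values)) for d in range(len(q))]
        contexts.append(ctx)
        weights.append(w)
    return contexts, weights


# Toy usage: a 2 x 2 feature map with 2 channels and one decoder state.
fmap = [[[1.0, 0.0], [0.0, 1.0]],
        [[0.0, 1.0], [1.0, 0.0]]]
queries = [[1.0, 0.0]]
contexts, weights = spatial_attention(queries, fmap, k=1)
```

Each row of `weights` is a distribution over the H x W image positions for one output step. In a decoder of the kind the abstract describes, a step like this would be repeated in every decoder layer, and position embeddings would be added to the features and to the decoder inputs beforehand; both are omitted here for brevity.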

WOS accession number: WOS:000459798100003
Content type: Journal article
Source URL: [http://ir.xjipc.cas.cn/handle/365002/5690]
Collection: Xinjiang Technical Institute of Physics and Chemistry_Multilingual Information Technology Research Laboratory
Author affiliations:
1. Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei, Anhui, Peoples R China
2. Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
3. Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
4. Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Urumqi, Peoples R China
5. Beijing Kuaishou Technol Co Ltd, Beijing, Peoples R China
Recommended citation
GB/T 7714
Xie, HT, Fang, SC, Zha, ZJ, et al. Convolutional Attention Networks for Scene Text Recognition[J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2019, 15(1 Suppl.): 3-17.
APA Xie, HT, Fang, SC, Zha, ZJ, Yang, YT, Li, Y, & Zhang, YD. (2019). Convolutional Attention Networks for Scene Text Recognition. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 15(1 Suppl.), 3-17.
MLA Xie, HT, et al. "Convolutional Attention Networks for Scene Text Recognition". ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS 15.1 Suppl. (2019): 3-17.