Convolutional Attention Networks for Scene Text Recognition | |
Xie, HT (Xie, Hongtao)[ 1 ]; Fang, SC (Fang, Shancheng)[ 2,3 ]; Zha, ZJ (Zha, Zheng-Jun)[ 1 ]; Yang, YT (Yang, Yating)[ 4 ]; Li, Y (Li, Yan)[ 5 ]; Zhang, YD (Zhang, Yongdong)[ 1 ] | |
Journal | ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS |
2019 | |
Volume | 15  Issue: 1 (Suppl.)  Pages: 3-17 |
Keywords | Text recognition; text detection; convolutional neural networks; multi-level supervised information; attention model |
ISSN | 1551-6857 |
DOI | 10.1145/3231737 |
Abstract (English) | In this article, we present Convolutional Attention Networks (CAN) for unconstrained scene text recognition. Recent dominant approaches for scene text recognition are mainly based on Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN), where the CNN encodes images and the RNN generates character sequences. Our CAN is different from these methods; our CAN is completely built on CNN and includes an attention mechanism. The distinctive characteristics of our method include (i) CAN follows encoder-decoder architecture, in which the encoder is a deep two-dimensional CNN and the decoder is a one-dimensional CNN; (ii) the attention mechanism is applied in every convolutional layer of the decoder, and we propose a novel spatial attention method using average pooling; and (iii) position embeddings are equipped in both a spatial encoder and a sequence decoder to give our networks a sense of location. We conduct experiments on standard datasets for scene text recognition, including Street View Text, IIIT5K, and ICDAR datasets. The experimental results validate the effectiveness of different components and show that our convolutional-based method achieves state-of-the-art or competitive performance over prior works, even without the use of RNN. |
WOS accession number | WOS:000459798100003 |
Content type | Journal article |
Source URL | [http://ir.xjipc.cas.cn/handle/365002/5690] |
Collection | Xinjiang Technical Institute of Physics and Chemistry_Multilingual Information Technology Research Laboratory |
Author affiliations | 1. Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei, Anhui, Peoples R China; 2. Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China; 3. Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China; 4. Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Urumqi, Peoples R China; 5. Beijing Kuaishou Technol Co Ltd, Beijing, Peoples R China |
Recommended citation (GB/T 7714) | Xie, HT, Fang, SC, Zha, ZJ, et al. Convolutional Attention Networks for Scene Text Recognition[J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2019, 15(1 Suppl.): 3-17. |
APA | Xie, HT, Fang, SC, Zha, ZJ, Yang, YT, Li, Y, & Zhang, YD. (2019). Convolutional Attention Networks for Scene Text Recognition. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 15(1 Suppl.), 3-17. |
MLA | Xie, HT, et al. "Convolutional Attention Networks for Scene Text Recognition". ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS 15.1 Suppl. (2019): 3-17. |