A benchmark for unconstrained online handwritten Uyghur word recognition
Simayi, Wujiahemaiti (1); Ibrahim, Mayire (1); Zhang, Xu-Yao (2); Liu, Cheng-Lin (2); Hamdulla, Askar (1)
Journal: INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION
2020-07-28
Pages: 14
Keywords: Online handwriting recognition; Uyghur alphabet; Database; Out-of-vocabulary words; Recurrent neural network; 1D Convolution
ISSN: 1433-2833
DOI: 10.1007/s10032-020-00354-0
Corresponding author: Hamdulla, Askar (askar@xju.edu.cn)
Abstract: Despite some interesting results from different research groups, a public database for Uyghur online handwriting recognition and a baseline study are not yet available for comparison purposes. To fill this void, we present a database of Uyghur online handwritten words and carry out the first benchmark experiments using it. The database contains 125,020 samples of 2030 words collected from 393 writers. In view of the characteristics of the Uyghur lexicon, two out-of-vocabulary datasets are provided specifically for evaluation. We carry out unconstrained handwritten word recognition experiments on the database using recurrent neural networks as the base model. Recognition results are obtained using connectionist temporal classification, without lexicon search or an external language model. Concatenated and averaged bidirectional recurrent layers are compared for better generalization. Based on the Unicode representation of Uyghur, we compare models that use different alphabets, based on both character types and character forms. To improve generalization, we propose a 1D convolutional model, which uses 1D convolutional layers for sequence feature extraction. In our experiments, the proposed 1D convolutional model and its variations surpassed the base recurrent model on the out-of-vocabulary words by a clear margin. A CAR (character accurate rate) of 83.23% was obtained when out-of-vocabulary samples were used for testing. The highest recognition rate reaches 94.95% CAR when the test set shares the same lexicon as the training set. The experiments in this paper can serve as baseline references for future studies using this database.
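The lexicon-free decoding and the CAR metric mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that per-frame label indices have already been obtained (for example, by taking the argmax of the network outputs), that index 0 is the CTC blank, and that CAR is computed as (N - S - D - I) / N, where N is the number of reference characters and S, D, I are substitution, deletion, and insertion errors. The abstract does not spell out this formula, and the function names below are hypothetical.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label (not stated in the abstract)


def ctc_greedy_decode(frame_labels: Sequence[int], blank: int = BLANK) -> List[int]:
    """Best-path CTC decoding: collapse repeated labels, then drop blanks."""
    decoded: List[int] = []
    previous = None
    for label in frame_labels:
        # Keep a label only when it starts a new run and is not the blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


def edit_distance(ref: Sequence[int], hyp: Sequence[int]) -> int:
    """Levenshtein distance: minimum substitutions + deletions + insertions."""
    prev_row = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur_row = [i]
        for j, h in enumerate(hyp, start=1):
            cur_row.append(min(
                prev_row[j] + 1,             # deletion
                cur_row[j - 1] + 1,          # insertion
                prev_row[j - 1] + (r != h),  # substitution or match
            ))
        prev_row = cur_row
    return prev_row[-1]


def character_accurate_rate(refs: Sequence[Sequence[int]],
                            hyps: Sequence[Sequence[int]]) -> float:
    """Assumed CAR definition: (N - total edit errors) / N over all samples."""
    total_chars = sum(len(r) for r in refs)
    total_errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return (total_chars - total_errors) / total_chars


# Toy example: the blank between the two 1s keeps them as separate characters.
frames = [0, 1, 1, 0, 1, 2, 2, 0]
hypothesis = ctc_greedy_decode(frames)
car = character_accurate_rate([[1, 1, 2]], [hypothesis])
```

Because no lexicon or language model constrains the output, a decoder of this kind can emit any character sequence, which is what makes evaluation on out-of-vocabulary words possible.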
Funding projects: National Key Research and Development Plan of China [2017YFC0820603]; National Science Foundation of China (NSFC) [61462081]; National Science Foundation of China (NSFC) [61263038]
WOS keywords: CHINESE; NETWORKS; DATABASE
WOS research area: Computer Science
Language: English
Publisher: SPRINGER HEIDELBERG
WOS accession number: WOS:000553232000001
Funding organizations: National Key Research and Development Plan of China; National Science Foundation of China (NSFC)
Content type: Journal article
Source URL: [http://ir.ia.ac.cn/handle/173211/40274]
Collection: Institute of Automation_National Laboratory of Pattern Recognition_Pattern Analysis and Learning Team
Author affiliations: 1. Xinjiang Univ, Inst Informat Sci & Engn, Urumqi, Peoples R China
2. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit NLPR, Beijing 100190, Peoples R China
Recommended citation
GB/T 7714
Simayi, Wujiahemaiti, Ibrahim, Mayire, Zhang, Xu-Yao, et al. A benchmark for unconstrained online handwritten Uyghur word recognition[J]. INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2020: 14.
APA Simayi, Wujiahemaiti, Ibrahim, Mayire, Zhang, Xu-Yao, Liu, Cheng-Lin, & Hamdulla, Askar. (2020). A benchmark for unconstrained online handwritten Uyghur word recognition. INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 14.
MLA Simayi, Wujiahemaiti, et al. "A benchmark for unconstrained online handwritten Uyghur word recognition". INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION (2020): 14.