Printed Uyghur Text Segmentation (印刷维吾尔文本切割)
JIN Jian-ming (靳简明) ; DING Xiao-qing (丁晓青) ; PENG Liang-rui (彭良瑞) ; WANG Hua (王华)
2010-06-09
Keywords: computer application ; Chinese information processing ; text segmentation ; character segmentation ; character recognition ; Uyghur script ; TP391.1
Alternative title: Printed Uyghur Texts Segmentation
Abstract: Uyghur, used in the Xinjiang Uyghur Autonomous Region of China, is written with letters borrowed from the Arabic script. The characteristics of that script, notably its cursive writing, make Uyghur text extremely difficult to segment and recognize. Building on connected-component classification, this paper combines horizontal projection with connected-component analysis to segment Uyghur text into lines and words. The baseline of each word is then located, and the distance between the word contour and the baseline is computed to find all candidate cut points, which over-segments the word. Finally, the over-segmented pieces are merged according to rules. Experiments show that character segmentation accuracy exceeds 99%.
Funding: Supported by the National Natural Science Foundation of China (60241005)
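The first stages named in the abstract (horizontal projection for line splitting, baseline location, and contour-to-baseline distance for candidate cut points) can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no algorithmic detail, so the function names, the "row with the most ink" baseline rule, the use of the upper contour only, and the `max_dist` threshold are all assumptions made for illustration. Word splitting by connected components and the rule-based merging step are omitted.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def split_lines(img: Image) -> List[Tuple[int, int]]:
    """Split into text lines using the horizontal projection profile.

    Returns (top, bottom_exclusive) row ranges that contain ink.
    """
    profile = [sum(row) for row in img]  # ink count per row
    lines, start = [], None
    for y, count in enumerate(profile):
        if count > 0 and start is None:
            start = y                      # a text line begins
        elif count == 0 and start is not None:
            lines.append((start, y))       # a blank row ends the line
            start = None
    if start is not None:
        lines.append((start, len(profile)))
    return lines


def estimate_baseline(img: Image, top: int, bottom: int) -> int:
    """Assumed rule: the baseline is the row with the most ink in the line."""
    return max(range(top, bottom), key=lambda y: sum(img[y]))


def candidate_cuts(img: Image, top: int, bottom: int,
                   baseline: int, max_dist: int = 0) -> List[int]:
    """Find candidate cut columns where the upper contour hugs the baseline.

    A column qualifies when its topmost ink pixel lies within max_dist rows
    of the baseline (assumed criterion). Each run of qualifying columns
    yields one cut at its midpoint, so the result deliberately over-segments.
    """
    width = len(img[0])
    cuts, run = [], []
    for x in range(width + 1):             # extra step flushes the last run
        near = False
        if x < width:
            ink_rows = [y for y in range(top, bottom) if img[y][x]]
            near = bool(ink_rows) and abs(baseline - ink_rows[0]) <= max_dist
        if near:
            run.append(x)
        elif run:
            cuts.append((run[0] + run[-1]) // 2)
            run = []
    return cuts
```

Because every run of thin connecting stroke produces a cut, including runs at the edges of a word, the output contains more cuts than there are character boundaries. That matches the over-segment-then-merge strategy in the abstract, where a later rule-based pass removes the spurious cuts.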
Language: Chinese
Content type: Journal article
Source URL: [http://hdl.handle.net/123456789/55551]
Collection: Tsinghua University
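Recommended citation
GB/T 7714: JIN Jian-ming, DING Xiao-qing, PENG Liang-rui, et al. Printed Uyghur Text Segmentation [J], 2010.
APA: JIN Jian-ming, DING Xiao-qing, PENG Liang-rui, & WANG Hua. (2010). Printed Uyghur Text Segmentation.
MLA: JIN Jian-ming, et al. "Printed Uyghur Text Segmentation." (2010).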