A comparative study on selecting acoustic modeling units in deep neural networks based large vocabulary Chinese speech recognition
Li, Xiangang ; Yang, Yuning ; Pang, Zaihu ; Wu, Xihong
Journal: International Conference on Intelligent Science and Intelligent Data Engineering (IScIDE)
2015
Keywords: Deep neural networks; Multi-task learning; Chinese automatic speech recognition; Acoustic modeling units; Syllable
DOI: 10.1016/j.neucom.2014.07.087
Abstract: This paper compares the performance of different acoustic modeling units in deep neural network (DNN) based large vocabulary continuous speech recognition (LVCSR) systems for Chinese. Recently, DNN-based acoustic modeling has achieved very competitive performance on many speech recognition tasks and has become the focus of current LVCSR research. Some previous work has studied context-independent and context-dependent DNN-based acoustic models. For Chinese, a syllabic language, the choice of basic modeling units in DNN-based LVCSR systems is a very important issue. Three basic modeling units, namely syllables, initials/finals, and phones, are discussed and compared. Experimental results show that, in the DNN-based systems, context-dependent (CD) phones obtain the best performance, and context-independent (CI) syllables perform similarly to CD initials/finals. How the number of clustered states affects the performance of DNN-based systems is also discussed; the behavior differs from that of GMM-based systems. In addition, by introducing a multi-task learning strategy, these multiple modeling units can be combined in the DNN training procedure. The experimental results indicate that combining these multiple modeling units using multi-task learning outperforms each individual modeling unit. (C) 2015 Published by Elsevier B.V.; SCI(E); ARTICLE; 251-256; 170
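The multi-task idea in the abstract, training one network against several modeling units at once, can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not describe their architecture, so the single sigmoid hidden layer, the layer sizes, the task names, the equal task weights, and the class name `MultiTaskNet` are all assumptions made for illustration. The sketch only shows the general pattern of shared hidden layers feeding one softmax output layer per unit type, with the per-task cross-entropy losses summed.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(weights, bias, x):
    """Affine map: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def init_layer(rng, n_out, n_in):
    """Small random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class MultiTaskNet:
    """Hypothetical shared-hidden-layer network with one softmax head per task."""

    def __init__(self, n_in, n_hidden, task_sizes, seed=0):
        rng = random.Random(seed)
        # Shared hidden layer used by every task.
        self.hidden = init_layer(rng, n_hidden, n_in)
        # One output layer per modeling unit (e.g. phones, syllables).
        self.heads = {name: init_layer(rng, size, n_hidden)
                      for name, size in task_sizes.items()}

    def forward(self, x):
        """Return a dict of per-task posterior distributions for one frame."""
        h = [1.0 / (1.0 + math.exp(-z)) for z in linear(*self.hidden, x)]
        return {name: softmax(linear(w, b, h))
                for name, (w, b) in self.heads.items()}

    def loss(self, x, targets, task_weights=None):
        """Weighted sum of per-task cross-entropy losses for one frame."""
        probs = self.forward(x)
        total = 0.0
        for name, target in targets.items():
            weight = 1.0 if task_weights is None else task_weights[name]
            total += -weight * math.log(probs[name][target])
        return total


# Illustrative usage with made-up sizes and labels.
net = MultiTaskNet(n_in=4, n_hidden=8,
                   task_sizes={"phone": 5, "initial_final": 3, "syllable": 6})
frame = [0.2, -0.1, 0.4, 0.0]
joint_loss = net.loss(frame, {"phone": 2, "initial_final": 1, "syllable": 4})
```

In a sketch like this, gradients from every head flow back into the shared hidden layer, which is the usual motivation for multi-task training; at decoding time, only the head for the chosen modeling unit would typically be used.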
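Language: English
Content type: Journal article
Source URL: [http://ir.pku.edu.cn/handle/20.500.11897/459243]
Collection: School of Electronics Engineering and Computer Science (信息科学技术学院)
Recommended citation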
GB/T 7714
Li, Xiangang, Yang, Yuning, Pang, Zaihu, et al. A comparative study on selecting acoustic modeling units in deep neural networks based large vocabulary Chinese speech recognition[J]. International Conference on Intelligent Science and Intelligent Data Engineering (IScIDE), 2015.
APA: Li, Xiangang, Yang, Yuning, Pang, Zaihu, & Wu, Xihong. (2015). A comparative study on selecting acoustic modeling units in deep neural networks based large vocabulary Chinese speech recognition. International Conference on Intelligent Science and Intelligent Data Engineering (IScIDE).
MLA: Li, Xiangang, et al. "A comparative study on selecting acoustic modeling units in deep neural networks based large vocabulary Chinese speech recognition". International Conference on Intelligent Science and Intelligent Data Engineering (IScIDE) (2015).