Word-level Permutation and Improved Lower Frame Rate for RNN-Based Acoustic Modeling
Yuanyuan Zhao1,2; Shiyu Zhou1,2; Shuang Xu1; Bo Xu1
2017
Conference date: November 14-18, 2017
Conference location: Guangzhou, China
Keywords: RNN-based Acoustic Model; Acoustic Trajectory; Lower Frame Rate; Word-level Permutation
Pages: 859-869
Abstract: Recently, RNN-based acoustic models have shown promising performance. However, their ability to generalize across multiple scenarios is limited for two reasons. First, the model encodes inter-word dependencies, which conflicts with the principle that an acoustic model should model only the pronunciation of words. Second, an RNN-based acoustic model that depicts the inner-word acoustic trajectory frame by frame is too precise to tolerate small distortions. In this work, we propose two variants to address these two problems. The first is word-level permutation: the order of the input features and their corresponding labels is shuffled, with a suitable probability, according to word boundaries. It aims to eliminate inter-word dependencies. The second is the improved lower frame rate (iLFR) model, which splits the original sentence equidistantly into N utterances so that no data is discarded, as happens in the LFR model. Results with an LSTM RNN show a 7% relative performance improvement from combining word-level permutation and iLFR.
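The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `permute_words` and `split_equidistant` are hypothetical, and the record does not say how the shuffle probability is applied or how the equidistant split is defined. The sketch assumes that the probability is applied once per utterance, that word boundaries are given as (start, end) frame index pairs with an exclusive end, and that the k-th of the N sub-utterances keeps frames k, k+N, k+2N, and so on.

```python
import random


def permute_words(frames, labels, boundaries, prob, rng):
    """Shuffle the word segments of one utterance with probability `prob`.

    Assumption (not stated in the abstract): the probability is drawn once
    per utterance, and `boundaries` lists one (start, end) pair per word,
    end exclusive, covering the utterance without gaps.
    """
    if rng.random() >= prob:
        # Leave the utterance in its original order.
        return list(frames), list(labels)
    order = list(range(len(boundaries)))
    rng.shuffle(order)
    new_frames, new_labels = [], []
    for i in order:
        start, end = boundaries[i]
        # Frames and labels move together, so alignment is preserved.
        new_frames.extend(frames[start:end])
        new_labels.extend(labels[start:end])
    return new_frames, new_labels


def split_equidistant(frames, labels, n):
    """Split one utterance into n sub-utterances by frame offset.

    Assumption: sub-utterance k keeps frames k, k+n, k+2n, ...
    Keeping only k = 0 would drop the other frames; keeping all n
    offsets uses every frame exactly once.
    """
    return [(frames[k::n], labels[k::n]) for k in range(n)]


# Toy utterance: 10 frames, three "words" of 3, 4 and 3 frames.
frames = list(range(10))
labels = ["a"] * 3 + ["b"] * 4 + ["c"] * 3
boundaries = [(0, 3), (3, 7), (7, 10)]

rng = random.Random(0)
shuffled_frames, shuffled_labels = permute_words(frames, labels, boundaries, 1.0, rng)
sub_utterances = split_equidistant(frames, labels, 3)
```

In this toy case every frame still carries its original label after the shuffle, and the three sub-utterances together contain all ten frames. Whether the two steps are applied in this order during training is not stated in the record.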
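Proceedings: iconip2017
Language: English
Content type: Conference paper
Source URL: [http://ir.ia.ac.cn/handle/173211/15429]
Collection: Research Center for Digital Content Technology and Services_Auditory Models and Cognitive Computing
Corresponding author: Yuanyuan Zhao
Author affiliations: 1. Institute of Automation, Chinese Academy of Sciences
2. University of Chinese Academy of Sciences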
Recommended citation
GB/T 7714
Yuanyuan Zhao, Shiyu Zhou, Shuang Xu, et al. Word-level Permutation and Improved Lower Frame Rate for RNN-Based Acoustic Modeling[C]. In: . Guangzhou, China. November 14-18, 2017.