Word-level Permutation and Improved Lower Frame Rate for RNN-Based Acoustic Modeling | |
Yuanyuan Zhao1,2 | |
2017 | |
Conference date | November 14-18, 2017 |
Conference location | Guangzhou, China |
Keywords | RNN-based Acoustic Model; Acoustic Trajectory; Lower Frame Rate; Word-level Permutation |
Pages | 859-869 |
Abstract | Recently, RNN-based acoustic models have shown promising performance. However, their ability to generalize across multiple scenarios is limited for two reasons. First, an RNN-based acoustic model encodes inter-word dependencies, which conflicts with the principle that an acoustic model should model only the pronunciation of words. Second, because it depicts the intra-word acoustic trajectory frame by frame, the model is too precise to tolerate small distortions. In this work, we propose two variants to address these two problems. The first is word-level permutation: the order of the input features and their corresponding labels is shuffled, with a suitable probability, according to word boundaries, with the aim of eliminating inter-word dependencies. The second is the improved lower frame rate (iLFR) model, which splits the original sentence equidistantly into N utterances so that no data is discarded, as happens in the LFR model. Results with an LSTM RNN show a 7% relative performance improvement when word-level permutation and iLFR are combined. |
Proceedings | iconip2017 |
Language | English |
Content type | Conference paper |
Source URL | [http://ir.ia.ac.cn/handle/173211/15429] |
Collection | Research Center for Digital Content Technology and Services_Auditory Models and Cognitive Computing |
Corresponding author | Yuanyuan Zhao |
Author affiliations | 1. Institute of Automation, Chinese Academy of Sciences 2. University of Chinese Academy of Sciences |
Recommended citation (GB/T 7714) | Yuanyuan Zhao, Shiyu Zhou, Shuang Xu, et al. Word-level Permutation and Improved Lower Frame Rate for RNN-Based Acoustic Modeling[C]. In:. Guangzhou, China. November 14-18, 2017. |
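
The two techniques named in the abstract can be illustrated with a minimal Python sketch. It is an interpretation of the abstract only, not the authors' implementation. The function names, the `(start, end)` word-boundary format, the per-utterance shuffle probability, and the reading of "equidistantly splits" as taking every N-th frame at offsets 0 to N-1 are all assumptions made for illustration.

```python
import numpy as np


def word_level_permutation(features, labels, word_bounds, prob, rng):
    """Shuffle the word segments of one utterance with probability `prob`.

    Assumed (hypothetical) input format: `word_bounds` is a list of
    (start, end) frame indices, end exclusive, one pair per word.
    Features and labels are reordered together, so each frame keeps its label.
    """
    if not word_bounds or rng.random() >= prob:
        return features, labels
    order = rng.permutation(len(word_bounds))
    # Frames stay in order inside a word; only the word order changes.
    idx = np.concatenate([np.arange(*word_bounds[i]) for i in order])
    return features[idx], labels[idx]


def ilfr_split(features, labels, n):
    """Split one utterance into n strided sub-utterances (offsets 0..n-1).

    Assumed reading of iLFR: a plain LFR model would keep only one of these
    sub-utterances, while keeping all n of them discards no frames.
    """
    return [(features[k::n], labels[k::n]) for k in range(n)]


# Toy utterance: 6 frames of 2-dimensional features, 3 words.
rng = np.random.default_rng(0)
feats = np.arange(12, dtype=np.float32).reshape(6, 2)
labs = np.array([0, 0, 1, 1, 1, 2])
bounds = [(0, 2), (2, 5), (5, 6)]

perm_feats, perm_labs = word_level_permutation(feats, labs, bounds, 1.0, rng)
sub_utts = ilfr_split(feats, labs, n=3)
```

In this sketch the permutation is applied per utterance at training time, and the N sub-utterances from `ilfr_split` would each be treated as a separate training example. Details the abstract does not state, such as frame stacking or how the probability is chosen, are left out.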