An Efficient Speech Keyword Spotting System (一种高效的语音关键词检索系统)


	Jun Luo (罗骏) ; Zhijian Ou (欧智坚)
	2010-07-15
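Conference	Proceedings of the National Symposium on Network and Information Security Technology 2005, Volume 2 (全国网络与信息安全技术研讨会'2005论文集（下册）) ; National Symposium on Network and Information Security Technology 2005 ; Beijing, China ; CNKI ; Internet Emergency Response Coordination Office, Ministry of Information Industry (信息产业部互联网应急处理协调办公室)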
Keywords	Information retrieval ; Keyword spotting ; Syllable graph ; Confidence measure ; TP391.3
Alternative title	An efficient keyword spotting system for information retrieval
Abstract	This paper proposes a new two-stage keyword spotting system based on syllable graphs for audio information retrieval. The system can efficiently retrieve text of interest from large volumes of speech data, and thereby serve national security purposes. It consists of a preprocessing stage and a search stage. In the preprocessing stage, the speech data is recognized into syllable graphs with high coverage, and several rounds of unsupervised maximum likelihood linear regression (MLLR) adaptation progressively improve the quality of the graphs. In the search stage, which answers frequent user queries, the system only needs to find syllable strings in the graph that match the syllables of the keyword; a forward-backward algorithm based on a syllable N-gram then computes confidence measures that are used to filter the search results. Experiments show that the system achieves high recall and accuracy, and that the search stage requires only 0.01 times real time, which meets the demand for fast retrieval.
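The search stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the lattice layout, the function names (`forward_backward`, `spot_keyword`), the threshold, and the toy scores are all hypothetical. The sketch also assumes that each edge score already combines the acoustic and syllable N-gram log scores, whereas the paper applies the N-gram inside the forward-backward pass itself.

```python
import math
from collections import defaultdict


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list of finite values."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_backward(num_nodes, edges):
    """Forward/backward log scores over a syllable graph.

    Assumption: nodes are numbered 0..num_nodes-1 in time order, every edge is
    (start, end, syllable, log_score) with start < end, node 0 is the graph
    start and node num_nodes-1 is the graph end.
    """
    incoming, outgoing = defaultdict(list), defaultdict(list)
    for edge in edges:
        outgoing[edge[0]].append(edge)
        incoming[edge[1]].append(edge)

    alpha = [-math.inf] * num_nodes
    beta = [-math.inf] * num_nodes
    alpha[0] = 0.0
    beta[num_nodes - 1] = 0.0

    for n in range(1, num_nodes):
        terms = [alpha[s] + w for s, _, _, w in incoming[n] if alpha[s] > -math.inf]
        if terms:
            alpha[n] = logsumexp(terms)
    for n in range(num_nodes - 2, -1, -1):
        terms = [w + beta[t] for _, t, _, w in outgoing[n] if beta[t] > -math.inf]
        if terms:
            beta[n] = logsumexp(terms)
    return alpha, beta, outgoing


def spot_keyword(num_nodes, edges, keyword, threshold=0.5):
    """Return (start_node, end_node, confidence) for each matching syllable string.

    Confidence is the posterior probability that a path through the graph
    contains the matched edge sequence.
    """
    alpha, beta, outgoing = forward_backward(num_nodes, edges)
    total = alpha[num_nodes - 1]
    hits = []

    def extend(node, idx, start, log_score):
        if idx == len(keyword):
            conf = math.exp(alpha[start] + log_score + beta[node] - total)
            if conf >= threshold:
                hits.append((start, node, conf))
            return
        for _, nxt, syllable, w in outgoing[node]:
            if syllable == keyword[idx]:
                extend(nxt, idx + 1, start, log_score + w)

    for n in range(num_nodes):
        extend(n, 0, n, 0.0)
    return hits


# Toy graph with made-up scores: two competing syllables on each of the first two arcs.
demo_edges = [
    (0, 1, "qing", math.log(0.6)), (0, 1, "jing", math.log(0.4)),
    (1, 2, "hua", math.log(0.7)), (1, 2, "fa", math.log(0.3)),
    (2, 3, "da", math.log(1.0)),
]
demo_hits = spot_keyword(4, demo_edges, ["qing", "hua"], threshold=0.3)
```

On this toy graph, `demo_hits` holds one match spanning nodes 0 to 2 with a confidence of about 0.42, while a query for `["jing", "fa"]` (posterior about 0.12) is filtered out by the same threshold. Because the graph is built once in preprocessing, each query only pays for this lightweight scan, which is the property the abstract attributes to the search stage.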
Language	Chinese
Content type	Conference paper
Source URL	[http://hdl.handle.net/123456789/69826]
Collection	Tsinghua University
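Recommended citation (GB/T 7714)	Luo Jun, Ou Zhijian. 一种高效的语音关键词检索系统 [An efficient keyword spotting system for information retrieval][C]. In: Proceedings of the National Symposium on Network and Information Security Technology 2005 (Volume 2), National Symposium on Network and Information Security Technology 2005, Beijing, China, CNKI, Internet Emergency Response Coordination Office, Ministry of Information Industry.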
