Classification of Music and Speech in Mandarin News Broadcasts

Title	Classification of Music and Speech in Mandarin News Broadcasts
Authors	Chuan Liu ; Lei Xie ; Helen Meng
Year	2007
Conference name	NCMMSC2007
English abstract	Audio scene analysis refers to the problem of classifying segments in a continuous audio stream according to content, e.g. speech versus non-speech, music, ambient noise, etc. Techniques that support such automatic segmentation are indispensable for multimedia information processing. For example, segmentation is a precursor to processes such as indexing of speech segments by automatic speech recognition, automatic story segmentation based on recognition transcripts, speaker diarization, etc. This paper describes our work on developing a speech/music discriminator for Mandarin broadcast news audio. We formed a high-dimensional feature vector that includes LPCC, LPS and STFT coefficients, 94 in all. We also experimented with three classifiers: KNN, SVM and MLP. Experiments on Voice of America Mandarin news broadcasts show high classification performance, with F-measure = 0.98. Among the three classifiers, the SVM also strikes the best balance between classification performance and computation time (real-time).
Indexed by	Other
Language	English
Content type	Conference paper
Source URL	[http://ir.siat.ac.cn:8080/handle/172644/2030]
Collection	深圳先进技术研究院_集成所
Recommended citation (GB/T 7714)	Chuan Liu, Lei Xie, Helen Meng. Classification of Music and Speech in Mandarin News Broadcasts[C]. In: NCMMSC2007.
