Classification of Music and Speech in Mandarin News Broadcasts | |
Chuan Liu ; Lei Xie ; Helen Meng | |
2007 | |
Conference | NCMMSC2007 |
Abstract | Audio scene analysis refers to the problem of classifying segments in a continuous audio stream according to content, e.g. speech versus non-speech, music, ambient noise, etc. Techniques that support such automatic segmentation are indispensable for multimedia information processing. For example, segmentation is a precursor to processes such as indexing of speech segments by automatic speech recognition, automatic story segmentation based on recognition transcripts, speaker diarization, etc. This paper describes our work on the development of a speech/music discriminator for Mandarin broadcast news audio. We formed a high-dimensional feature vector of 94 coefficients in total, comprising LPCC, LPS and STFT coefficients. We also experimented with three classifiers: KNN, SVM and MLP. Experiments on Voice of America Mandarin news broadcasts show high classification performance, with an F-measure of 0.98. Among the three classifiers, the SVM also strikes the best balance between classification performance and computation time (real-time). |
Indexed by | Other |
Language | English |
Content type | Conference paper |
Source URL | [http://ir.siat.ac.cn:8080/handle/172644/2030] |
Collection | Shenzhen Institutes of Advanced Technology_Institute of Integration Technology (集成所) |
Recommended citation (GB/T 7714) | Chuan Liu, Lei Xie, Helen Meng. Classification of Music and Speech in Mandarin News Broadcasts[C]. In: NCMMSC2007. |
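
The abstract reports classification performance as an F-measure. As a minimal sketch of how such a score can be computed for a two-class speech/music task, the snippet below assumes the balanced F-measure (F1, the harmonic mean of precision and recall) with "speech" as the positive class; the record does not state which variant or positive class the authors used. The function name `f_measure` and the toy label lists are hypothetical and are not data from the paper.

```python
def f_measure(reference, predicted, positive="speech"):
    """Balanced F-measure (F1) for one class, given per-segment labels.

    Assumption: F1 with `positive` as the class of interest; the paper
    may have used a different weighting or averaging scheme.
    """
    pairs = list(zip(reference, predicted))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for r, p in pairs if r == positive and p == positive)
    fp = sum(1 for r, p in pairs if r != positive and p == positive)
    fn = sum(1 for r, p in pairs if r == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels for six audio segments (illustrative only).
reference = ["speech", "speech", "music", "speech", "music", "music"]
predicted = ["speech", "music", "music", "speech", "music", "speech"]
score = f_measure(reference, predicted)
print(f"F-measure: {score:.3f}")
```

In this toy case one speech segment is missed and one music segment is mislabeled as speech, so precision and recall are both 2/3.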