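Title: 电话通信中说话人确认方法研究 (Research on Speaker Verification Methods in Telephone Communication)
Author: 刘倓倓
Degree type: Doctoral
Defense date: 2007-06-01
Degree-granting institution: Institute of Acoustics, Chinese Academy of Sciences
Place of conferral: Institute of Acoustics
Keywords: speaker verification; Gaussian mixture model; support vector machine; telephone speech; channel mismatch
Alternative title: Speaker Verification for Telephone Speech
Degree discipline: Signal and Information Processing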
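Abstract (translated from Chinese): This thesis studies methods for speaker verification on telephone speech and their implementation. The task is to determine whether a given telephone utterance belongs to a target speaker. A speaker verification system first extracts speech features from the telephone speech signal and builds a target speaker model; in the verification phase, the target speaker model scores the feature sequence of the given utterance, and the score is used to confirm the speaker's identity. Speaker verification can be applied in many settings, such as securities trading, banking transactions, police evidence collection, voice locks for personal computers and cars, identity cards, and credit cards. With the rapid development of telephone networks, the telephone has become the most important means of communication today, so telephone-oriented speaker verification has very broad application prospects. The verification process can be divided into three stages: feature extraction, speaker modeling, and the decision on the test speech. The main difficulties lie in extracting parameters that effectively reflect a speaker's vocal characteristics and in choosing a speaker model that describes those characteristics. Speaker verification on telephone speech also faces a mismatch between the channel conditions of the test speech and the training speech. This thesis studies speaker verification methods based on the Gaussian mixture model (GMM) and on the support vector machine (SVM). After analyzing the modeling strengths of the GMM and the classification strengths of the SVM, it explores a verification method that combines the two. To address the mismatch between training and test conditions, it studies methods for removing channel effects at both the feature level and the post-processing level, including cepstral mean normalization, feature warping, feature mapping, NAP, zero normalization, and test normalization. The thesis implements several speaker verification systems and compares their performance on the NIST 2006 SRE test set. The best system achieves an equal error rate of 7.0%.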
English abstract: This thesis explores methods for speaker verification on telephone speech. Speaker verification is the process of accepting or rejecting a speaker's identity claim. In the training phase, features extracted from the target speaker's telephone speech are used to build a target speaker model. In the verification phase, the test speech is compared against the target speaker model to produce a verification score, and the identity claim is accepted when the score exceeds a threshold. Speaker verification has a wide range of applications, such as stock trading, telephone banking, forensic evidence collection, voice locks for computers and cars, identity cards, and credit cards. With the development of communication networks, the telephone has become the main medium of everyday communication, so speaker verification for telephone speech has broad applicability. A speaker verification system consists of three main modules: feature extraction, target speaker modeling, and test speech verification. Research in speaker verification focuses on choosing features that effectively characterize speaker identity and speaker models that represent a speaker's voice. In addition, speaker verification on telephone speech is difficult because of its application conditions, in particular the mismatch between the channel conditions of the training speech and the test speech. This thesis discusses two state-of-the-art speaker verification systems: one based on the Gaussian mixture model (GMM) and the other on the support vector machine (SVM). It also proposes a method that combines GMM and SVM, exploiting the modeling ability of the GMM and the classification ability of the SVM. Methods for reducing the influence of the channel are also introduced: Cepstral Mean Normalization (CMN), feature warping, feature mapping, Nuisance Attribute Projection (NAP), Zero Normalization, Test Normalization, etc.
Several speaker verification systems are built and compared experimentally on the NIST test set. An equal error rate (EER) of 7.0% is achieved on the NIST 2006 SRE data.
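Language: Chinese
Publication date: 2011-05-07
Pages: 59
Content type: Thesis
Source URL: [http://159.226.59.140/handle/311008/250]
Collection: Institute of Acoustics_IOA Doctoral and Master's Theses_Doctoral and Master's Theses, 1981-2009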
Recommended citation
GB/T 7714
刘倓倓. 电话通信中说话人确认方法研究[D]. Institute of Acoustics. Institute of Acoustics, Chinese Academy of Sciences. 2007.
 
