Text-To-Visual Speech in Chinese Based on Data-Driven Approach (in English)

	WANG Zhi-Ming (王志明) ; CAI Lian-Hong (蔡莲红) ; AI Hai-Zhou (艾海舟)
	2010-06-09
Keywords	text-to-speech (TTS) ; text-to-visual speech (TTVS) ; viseme ; co-articulation ; TP391.1
Alternative title	Text-To-Visual Speech in Chinese Based on Data-Driven Approach
Abstract	Text-to-visual speech (TTVS) synthesis by computer can increase speech intelligibility and make human-computer interfaces friendlier. This paper describes a Chinese text-to-visual speech synthesis system based on a data-driven (sample-based) approach, which generates new visual speech by concatenating short video segments. An effective method for constructing two visual confusion trees for Chinese initials and finals is developed. A co-articulation model based on visual distance and a hardness factor is proposed; it can be used for selecting recording-corpus sentences in the analysis phase and for unit selection in the synthesis phase. Obvious differences between the boundary images of concatenated video segments are smoothed with an image morphing technique. Combined with an existing acoustic text-to-speech (TTS) system, a Chinese text-to-visual speech synthesis system is realized. ; Funding: Doctoral Program Foundation of the Ministry of Education of China ; University of Science and Technology Beijing internal research fund
Language	English
Content type	Journal article
Source URL	[http://hdl.handle.net/123456789/55417]
Collection	Tsinghua University
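Recommended citation (GB/T 7714)	WANG Zhi-Ming, CAI Lian-Hong, AI Hai-Zhou. Text-To-Visual Speech in Chinese Based on Data-Driven Approach [J], 2010.
APA	WANG Zhi-Ming, CAI Lian-Hong, & AI Hai-Zhou. (2010). Text-To-Visual Speech in Chinese Based on Data-Driven Approach.
MLA	WANG Zhi-Ming, et al. "Text-To-Visual Speech in Chinese Based on Data-Driven Approach." (2010).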
