Browse/search results: 78 records in total, showing records 1-10
Filter: Collection: Institute of Automation
Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey
Journal article
Machine Intelligence Research, 2023, volume: 20, issue: 4, pages: 447-482
Authors: Xiao Wang
Views/Downloads: 12/0 | Submitted: 2023/08/02
Multi-modal (MM), pre-trained model (PTM), information fusion, representation learning, deep learning
Large-scale Multi-modal Pre-trained Models: A Comprehensive Survey
Journal article
Machine Intelligence Research, 2023, volume: 20, issue: 4, pages: 447-482
Authors: Xiao Wang; Guangyao Chen; Guangwu Qian; Pengcheng Gao; Xiao-Yong Wei
Views/Downloads: 1/0 | Submitted: 2024/04/23
Multi-modal (MM), pre-trained model (PTM), information fusion, representation learning, deep learning
Hybrid Autoregressive and Non-Autoregressive Transformer Models for Speech Recognition
Journal article
IEEE SIGNAL PROCESSING LETTERS, 2022, pages: 762-766
Authors: Zhengkun Tian; Jiangyan Yi; Jianhua Tao; Shuai Zhang; Zhengqi Wen
Views/Downloads: 12/0 | Submitted: 2022/06/14
Everybody’s Talkin’: Let Me Talk as You Want
Journal article
IEEE Transactions on Information Forensics and Security, 2022, volume: 17, issue: 1, pages: 585-598
Authors: Song LS (宋林森); Wu WY (吴文岩); Qian C (钱晨); He R (赫然); Loy, Chen Change
Views/Downloads: 0/0 | Submitted: 2023/06/29
Talking face generation
Video generation
GAN
Audio dubbing
NeuralDPS: Neural Deterministic Plus Stochastic Model With Multiband Excitation for Noise-Controllable Waveform Generation
Journal article
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, volume: 30, pages: 865-878
Authors: Wang, Tao; Fu, Ruibo; Yi, Jiangyan; Tao, Jianhua; Wen, Zhengqi
Views/Downloads: 12/0 | Submitted: 2022/06/06
Vocoders
Stochastic processes
Neural networks
Speech processing
Signal to noise ratio
Acoustics
Speech enhancement
Vocoder
speech synthesis
deterministic plus stochastic
multiband excitation
noise control
Investigating Parameter Sharing in Multilingual Speech Translation
Conference paper
Incheon, Korea, 18-22 September 2022
Authors: Wang, Qian; Wang, Chen; Zhang, Jiajun
Views/Downloads: 0/0 | Submitted: 2022/12/19
WASE: LEARNING WHEN TO ATTEND FOR SPEAKER EXTRACTION IN COCKTAIL PARTY ENVIRONMENTS
Conference paper
Toronto, June 6-11, 2021
Authors: Yunzhe Hao; Jiaming Xu; Peng Zhang; Bo Xu
Views/Downloads: 10/0 | Submitted: 2022/06/23
F0-Noise-Robust Glottal Source and Vocal Tract Analysis Based on ARX-LF Model
Journal article
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, volume: 29, pages: 3375-3383
Authors: Li, Yongwei; Tao, Jianhua; Erickson, Donna; Liu, Bin; Akagi, Masato
Views/Downloads: 18/0 | Submitted: 2021/12/28
Speech recognition
Iterative methods
Production
Estimation
Brain modeling
Shape
Low-frequency noise
Glottal source
vocal tract
source-filter model
ARX-LF model
Simultaneous Estimation of Glottal Source Waveforms and Vocal Tract Shapes from Speech Signals Based on ARX-LF Model
Journal article
JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2020, volume: 92, issue: 8, pages: 831-838
Authors: Li, Yongwei; Sakakibara, Ken-Ichi; Akagi, Masato
Views/Downloads: 7/0 | Submitted: 2020/08/03
Glottal source waveform
Vocal tract shape
ARX-LF model
Uncertainty-optimized deep learning model for small-scale person re-identification
Journal article
SCIENCE CHINA-INFORMATION SCIENCES, 2019, volume: 62, issue: 12, pages: 13
Authors: Zhao, Cairong; Chen, Kang; Zang, Di; Zhang, Zhaoxiang; Zuo, Wangmeng
Views/Downloads: 103/0 | Submitted: 2020/03/30
person re-identification
uncertainty analysis
deep learning