Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition
Zitong, Yu2; Benjia, Zhou6; Jun, Wan1; Pichao, Wang3; Haoyu, Chen2; Xin, Liu4; Stan, Z., Li5; Guoying, Zhao2
Journal: IEEE Transactions on Image Processing
Year: 2021
Volume: 30; Pages: 5626-5640
Abstract

Gesture recognition has attracted considerable attention owing to its great potential in applications. Although great progress has been made recently in multi-modal learning, existing methods still lack effective integration to fully exploit synergies among spatio-temporal modalities for gesture recognition. This is partly because existing manually designed network architectures are inefficient at jointly learning from multiple modalities. In this paper, we propose the first neural architecture search (NAS)-based method for RGB-D gesture recognition. The proposed method includes two key components: 1) enhanced temporal representation via the proposed 3D Central Difference Convolution (3D-CDC) family, which captures rich temporal context by aggregating temporal difference information; and 2) optimized backbones for multi-sampling-rate branches and lateral connections among varied modalities. The resultant multi-modal multi-rate network provides a new perspective for understanding the relationship between RGB and depth modalities and their temporal dynamics. Comprehensive experiments on three benchmark datasets (IsoGD, NvGesture, and EgoGesture) demonstrate state-of-the-art performance in both single- and multi-modality settings. The code is available at https://github.com/ZitongYu/3DCDC-NAS.
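The abstract does not spell out how a central difference convolution is computed, so the following is only a minimal sketch of the idea, assuming the commonly used formulation in which a plain convolution is combined with a term that subtracts the centre value weighted by the kernel sum. The function name `cdc3d`, the mixing parameter `theta`, and the choice to include every spatio-temporal neighbour in the difference term are assumptions made for illustration; this is not the authors' implementation, and the individual members of the 3D-CDC family may select neighbours differently.

```python
def cdc3d(volume, kernel, theta=0.7):
    """Sketch of a central-difference 3D convolution (one channel, no padding).

    volume: nested list indexed [t][y][x] (frames, rows, columns).
    kernel: nested list indexed [dt][dy][dx] with odd sizes.
    theta:  hypothetical mixing weight; 0 gives a plain convolution,
            1 gives a pure difference-to-centre response.
    """
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    n_t, n_h, n_w = len(volume), len(volume[0]), len(volume[0][0])
    kernel_sum = sum(w for plane in kernel for row in plane for w in row)

    out = []
    for t in range(n_t - kt + 1):
        plane_out = []
        for y in range(n_h - kh + 1):
            row_out = []
            for x in range(n_w - kw + 1):
                # Plain convolution response over the local 3D window.
                vanilla = sum(
                    kernel[dt][dy][dx] * volume[t + dt][y + dy][x + dx]
                    for dt in range(kt)
                    for dy in range(kh)
                    for dx in range(kw)
                )
                # Value at the centre of the window (middle frame, middle pixel).
                centre = volume[t + kt // 2][y + kh // 2][x + kw // 2]
                # Subtracting theta * centre * sum(w) is algebraically the same
                # as blending the plain response with sum(w * (neighbour - centre)).
                row_out.append(vanilla - theta * centre * kernel_sum)
            plane_out.append(row_out)
        out.append(plane_out)
    return out


# A clip with no spatial or temporal variation: every voxel equals 2.
constant_clip = [[[2 for _ in range(3)] for _ in range(3)] for _ in range(3)]
ones_kernel = [[[1 for _ in range(3)] for _ in range(3)] for _ in range(3)]

plain = cdc3d(constant_clip, ones_kernel, theta=0.0)
difference_only = cdc3d(constant_clip, ones_kernel, theta=1.0)
```

With `theta=0.0` the sketch reduces to an ordinary 3D convolution, while with `theta=1.0` a clip that does not change over space or time yields a zero response, which illustrates why a difference term emphasises motion and local variation rather than absolute intensity.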

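Language: English
Content type: Journal article
Source URL: http://ir.ia.ac.cn/handle/173211/57115
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems
Corresponding authors: Jun, Wan; Guoying, Zhao
Author affiliations:
1. Institute of Automation, Chinese Academy of Sciences
2. University of Oulu
3. Alibaba Group
4. Tianjin University
5. Westlake University
6. Macau University of Science and Technology
Recommended citation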
GB/T 7714
Zitong, Yu, Benjia, Zhou, Jun, Wan, et al. Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition[J]. IEEE Transactions on Image Processing, 2021, 30: 5626-5640.
APA: Zitong, Yu., Benjia, Zhou., Jun, Wan., Pichao, Wang., Haoyu, Chen., ... & Guoying, Zhao. (2021). Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition. IEEE Transactions on Image Processing, 30, 5626-5640.
MLA: Zitong, Yu, et al. "Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition". IEEE Transactions on Image Processing 30 (2021): 5626-5640.