Temporal-Spatial Mapping for Action Recognition | |
Song, Xiaolin³; Lan, Cuiling²; Zeng, Wenjun²; Xing, Junliang¹; Sun, Xiaoyan²; Yang, Jingyu³ | |
Journal | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY |
2020-03-01 | |
Volume | 30; Issue: 3; Pages: 748-759 |
Keywords | Two dimensional displays; Three-dimensional displays; Feature extraction; Optical imaging; Computational modeling; Streaming media; Kernel; Temporal-spatial mapping (TSM); action recognition; deep learning |
ISSN | 1051-8215 |
DOI | 10.1109/TCSVT.2019.2896029 |
Corresponding author | Lan, Cuiling (culan@microsoft.com); Yang, Jingyu (yjy@tju.edu.cn) |
Abstract | Deep learning models have enjoyed great success for image related computer vision tasks such as image classification and object detection. For video related tasks such as human action recognition, however, the advancements are not as significant yet. The main challenge is the lack of effective and efficient models in modeling the rich temporal-spatial information in a video. We introduce a simple yet effective operation, termed temporal-spatial mapping, for capturing the temporal evolution of the frames by jointly analyzing all the frames of a video. We propose a video level 2D feature representation by transforming the convolutional features of all frames to a 2D feature map, referred to as VideoMap. With each row being the vectorized feature representation of a frame, the temporal-spatial features are compactly represented, while the temporal dynamic evolution is also well embedded. Based on the VideoMap representation, we further propose a temporal attention model within a shallow convolutional neural network to efficiently exploit the temporal-spatial dynamics. The experiment results show that the proposed scheme achieves state-of-the-art performance, with 4.2% accuracy gain over the temporal segment network, a competing baseline method, on the challenging human action benchmark dataset HMDB51. |
Funding project | National Science Foundation of China [61672519]; National Natural Science Foundation of China [61771339]; Reserved Peiyang Scholar Program of Tianjin University, Tianjin, China |
WOS research area | Engineering |
Language | English |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
WOS accession number | WOS:000519551500010 |
Funding organization | National Science Foundation of China; National Natural Science Foundation of China; Reserved Peiyang Scholar Program of Tianjin University, Tianjin, China |
Content type | Journal article |
Source URL | [http://ir.ia.ac.cn/handle/173211/38619] |
Collection | Intelligent Systems and Engineering |
Corresponding author | Lan, Cuiling; Yang, Jingyu |
Author affiliations | 1. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China; 2. Microsoft Res Asia, Beijing 100080, Peoples R China; 3. Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China |
Recommended citation (GB/T 7714) | Song, Xiaolin, Lan, Cuiling, Zeng, Wenjun, et al. Temporal-Spatial Mapping for Action Recognition[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2020, 30(3): 748-759. |
APA | Song, Xiaolin, Lan, Cuiling, Zeng, Wenjun, Xing, Junliang, Sun, Xiaoyan, & Yang, Jingyu. (2020). Temporal-Spatial Mapping for Action Recognition. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 30(3), 748-759. |
MLA | Song, Xiaolin, et al. "Temporal-Spatial Mapping for Action Recognition". IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 30.3 (2020): 748-759. |
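
The abstract describes VideoMap as a 2D map in which each row is the vectorized feature representation of one frame. The following is a minimal sketch of that row-stacking idea only, not the authors' implementation. The function names `flatten_frame` and `build_video_map`, the channel-major flattening order, and the toy feature values are all assumptions made for illustration; in practice the per-frame features would come from a convolutional network, and the temporal attention model is not shown.

```python
from typing import List, Sequence

# One frame's convolutional feature, indexed [channel][row][column].
FrameFeature = Sequence[Sequence[Sequence[float]]]


def flatten_frame(feature: FrameFeature) -> List[float]:
    """Vectorize one frame's feature in channel-major order (an assumed ordering)."""
    return [value for channel in feature for row in channel for value in row]


def build_video_map(frame_features: Sequence[FrameFeature]) -> List[List[float]]:
    """Stack per-frame vectors in temporal order: row t holds frame t's vector."""
    if not frame_features:
        raise ValueError("need at least one frame")
    rows = [flatten_frame(feature) for feature in frame_features]
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all frames must produce vectors of the same length")
    return rows


# Toy input: 3 frames, each with 2 channels of size 1 x 2 (made-up values).
frames = [
    [[[1.0, 2.0]], [[3.0, 4.0]]],
    [[[5.0, 6.0]], [[7.0, 8.0]]],
    [[[9.0, 10.0]], [[11.0, 12.0]]],
]
video_map = build_video_map(frames)
print(len(video_map), len(video_map[0]))  # 3 4
```

In this layout the vertical axis of the map is time and the horizontal axis is the flattened per-frame feature, so a 2D convolution over the map sees several consecutive frames at once.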