States with intense spatiotemporal coupling frequently appear
in robotic tasks, and this coupling enriches the information
encapsulated in each state. Exploiting historical observations
can provide more information about the robot, especially
in partially observable Markov decision processes. How to handle
this coupling remains a challenging issue in robotic reinforcement
learning (RL), and we argue that an imbalanced capability for processing
spatial and temporal details is one of the bottlenecks of
the vanilla transformer model in learning robotic policies. To address
this problem, we propose an efficient spatiotemporal
transformer structure. To our knowledge, this work is the first
to improve the transformer with spatiotemporal information in
RL. In each attention block, we execute the attention
computation twice in sequence: first to process the temporal sequence of
the input, and then to process the spatial state. This input
reconstruction enables sufficient information extraction and thereby promotes
data efficiency. We also add correlation encoding to the query and
key computation of multi-head attention, which makes it possible
to associate states between and within time steps. We evaluate
the proposed approach on several robot tasks, and it outperforms
state-of-the-art transformer-based online RL methods.
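
The two-pass attention described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the layer details, so the input layout (T time steps, N spatial state tokens, D features), the single attention head, the residual connections, the causal mask over time, and the additive form of the correlation encoding (`time_enc`, `space_enc`) are all assumptions made for illustration, and every function and parameter name is hypothetical.

```python
import numpy as np


def softmax(z, axis=-1):
    # Numerically stable softmax; masked (-inf) entries become exactly 0.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def attention(x, enc, w_q, w_k, w_v, causal=False):
    """Single-head scaled dot-product attention over the second-to-last axis.

    x:   (..., L, D) input tokens
    enc: (L, D) encoding added to the query and key inputs only
         (an assumed stand-in for the paper's correlation encoding)
    """
    q = (x + enc) @ w_q
    k = (x + enc) @ w_k
    v = x @ w_v
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    if causal:
        # Assumption: a step may only attend to itself and earlier steps.
        length = scores.shape[-1]
        future = np.triu(np.ones((length, length), dtype=bool), k=1)
        scores = np.where(future, -np.inf, scores)
    return softmax(scores) @ v


def spatiotemporal_block(x, params):
    """Apply temporal attention, then spatial attention. x: (T, N, D)."""
    # Pass 1 (temporal): for each spatial token, attend across time steps.
    xt = np.swapaxes(x, 0, 1)  # (N, T, D)
    xt = xt + attention(xt, params["time_enc"], *params["temporal"], causal=True)
    x = np.swapaxes(xt, 0, 1)  # back to (T, N, D)
    # Pass 2 (spatial): within each time step, attend across state tokens.
    x = x + attention(x, params["space_enc"], *params["spatial"])
    return x


def init_params(rng, T, N, D):
    # Random weights stand in for learned parameters.
    def proj():
        return tuple(rng.normal(scale=D ** -0.5, size=(D, D)) for _ in range(3))

    return {
        "temporal": proj(),
        "spatial": proj(),
        "time_enc": rng.normal(scale=0.1, size=(T, D)),
        "space_enc": rng.normal(scale=0.1, size=(N, D)),
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, N, D = 5, 3, 8
    params = init_params(rng, T, N, D)
    history = rng.normal(size=(T, N, D))
    out = spatiotemporal_block(history, params)
    print(out.shape)  # (5, 3, 8)
```

In this sketch the encodings enter only the query and key projections, so they influence which tokens are associated with each other without altering the values being aggregated. Multi-head attention, layer normalization, and the feed-forward sublayer of a full transformer block are omitted for brevity.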
Yang YM, Xing DP, Xu B. Efficient Spatiotemporal Transformer for Robotic Reinforcement Learning[J]. IEEE Robotics and Automation Letters, 2022, 7(3): 7982-7989.
APA
Yang YM, Xing DP, & Xu B. (2022). Efficient Spatiotemporal Transformer for Robotic Reinforcement Learning. IEEE Robotics and Automation Letters, 7(3), 7982-7989.
MLA
Yang YM, et al. "Efficient Spatiotemporal Transformer for Robotic Reinforcement Learning." IEEE Robotics and Automation Letters 7.3 (2022): 7982-7989.