DepthFormer: Exploiting Long-range Correlation and Local Information for Accurate Monocular Depth Estimation
Author: Zhenyu Li (affiliation 2)
Journal: Machine Intelligence Research
Year: 2023
Volume: 20, Issue: 6, Pages: 837-854
Keywords: Autonomous driving, 3D reconstruction, monocular depth estimation, Transformer, convolution
ISSN: 2731-538X
DOI: 10.1007/s11633-023-1458-0
Abstract: This paper aims to address the problem of supervised monocular depth estimation. We start with a meticulous pilot study to demonstrate that the long-range correlation is essential for accurate depth estimation. Moreover, the Transformer and convolution are good at long-range and close-range depth estimation, respectively. Therefore, we propose to adopt a parallel encoder architecture consisting of a Transformer branch and a convolution branch. The former can model global context with the effective attention mechanism and the latter aims to preserve the local information as the Transformer lacks the spatial inductive bias in modeling such contents. However, independent branches lead to a shortage of connections between features. To bridge this gap, we design a hierarchical aggregation and heterogeneous interaction module to enhance the Transformer features and model the affinity between the heterogeneous features in a set-to-set translation manner. Due to the unbearable memory cost introduced by the global attention on high-resolution feature maps, we adopt the deformable scheme to reduce the complexity. Extensive experiments on the KITTI, NYU, and SUN RGB-D datasets demonstrate that our proposed model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods with prominent margins. The effectiveness of each proposed module is elaborately evaluated through meticulous and intensive ablation studies.
Content type: Journal article
Source URL: http://ir.ia.ac.cn/handle/173211/54170
Collection: Institute of Automation_Academic Journals_International Journal of Automation and Computing
Author affiliations: 1. Department of Automation, University of Science and Technology of China, Hefei 230026, China
2. Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
Recommended citation:
GB/T 7714: Zhenyu Li. DepthFormer: Exploiting Long-range Correlation and Local Information for Accurate Monocular Depth Estimation[J]. Machine Intelligence Research, 2023, 20(6): 837-854.
APA: Zhenyu Li. (2023). DepthFormer: Exploiting Long-range Correlation and Local Information for Accurate Monocular Depth Estimation. Machine Intelligence Research, 20(6), 837-854.
MLA: Zhenyu Li. "DepthFormer: Exploiting Long-range Correlation and Local Information for Accurate Monocular Depth Estimation". Machine Intelligence Research 20.6 (2023): 837-854.