Research on Image Stitching and 3D Reconstruction Algorithms for Large Scenes


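Title	面向大场景的图像拼接与三维重建算法研究 (Research on Image Stitching and 3D Reconstruction Algorithms for Large Scenes)
Author	赵戈 (Zhao Ge)
Degree	Doctoral
Defense date	2013-11-20
Degree-granting institution	Shenyang Institute of Automation, Chinese Academy of Sciences
Advisor	唐延东 (Tang Yandong)
Keywords	image stitching; 3D reconstruction; disparity map; image registration; feature matching
Alternative title	Panoramic image stitching and 3D reconstruction
Discipline	Pattern Recognition and Intelligent Systems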
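Chinese abstract (translated)	Image stitching for large scenes is one of the main research topics in computer vision, and the technique is widely used in virtual reality, augmented reality, and robotics. Its goal is to stitch several mutually overlapping input images into a single panorama with a wide field of view. Algorithmically, large-scene image stitching can be divided into two stages: image registration and image fusion. The purpose of image registration is to find the common part of two images. At present, registration algorithms for large-scene stitching fall into two groups: methods based on cylindrical projection and methods based on global optimization. The former require equipment such as a panoramic head and a tripod to ensure that the camera rotates horizontally about its optical center. Their basic idea is to project the captured image sequence onto a cylinder whose radius equals the camera focal length; after projection, the rotation angle between two images becomes a translation on the cylinder. The advantage of cylindrical-projection registration is its high speed and low memory overhead. Methods based on global optimization remove the restriction on camera motion and need no equipment other than the camera. Their core idea is to estimate the motion parameters of the camera model by optimizing an objective function; depending on whether the objective function is based on feature points, they are further divided into feature-based methods and direct methods. Direct methods do not need to detect or match feature points; their objective function is defined over the intensities of all pixels. However, because direct methods define the objective function mainly through arithmetic operations on pixel intensities, they are not robust to brightness changes or camera gain. To address this problem, this dissertation proposes an objective function defined in the non-parametric space of the image, which has been shown to be robust to both camera gain and brightness changes. In addition, because the proposed objective function is difficult to differentiate, a heuristic hybrid optimization method is designed: first, a genetic algorithm roughly locates the optimum over the global range; then, golden-section search explores the neighborhood of that location more finely. The advantage of this coarse-to-fine search strategy is that it both ensures the global optimality of the solution and speeds up the search.
The purpose of image fusion is to merge two images into one panorama through their overlapping region. Optimal-seam fusion is a widely used approach: a seam is sought inside the overlapping region, pixels to the left of the seam are taken from the left image and pixels to the right from the right image, and the two images are finally joined along the seam into one panorama. An optimal seam should pass through the parts of the overlapping region where the two images differ least, so that the fused image looks good. In the optimal-seam method, the difference between the two images at corresponding positions in the overlapping region is called the seam cost. Analysis of the camera imaging model shows that some commonly used seam costs are sensitive to optical distortions such as vignetting; moreover, these seam costs cannot accurately reflect differences in local image structure. To address these problems, this dissertation proposes a new optimal-seam generation method based on the analysis of the imaging model. The method first describes the local structure at each position of the overlapping region with a second-moment matrix built from the image intensity and its derivatives of each order; it then analyzes the space formed by the second-moment matrices as a Riemannian manifold and uses the geodesic distance on that manifold to build a seam cost matrix; finally, dynamic programming backtraces a path on the cost matrix to generate the optimal seam. The proposed stitching method is compared with several other recent stitching methods under illumination changes, vignetting, and similar conditions; the experimental results show that it is robust to these non-ideal conditions.
The purpose of 3D reconstruction is to obtain the 3D geometry of a scene from a stereo image pair captured by a binocular camera. There are currently two research focuses in 3D reconstruction: 3D reconstruction of feature points and generation of dense disparity maps. Feature-point reconstruction broadly consists of five steps: stereo camera calibration, image rectification, feature extraction, feature matching, and triangulation. Among these, feature matching between the stereo pair is the core step and the main difficulty. This dissertation designs a new genetic-algorithm-based method for feature matching between stereo images. The genetic algorithm is a heuristic algorithm whose advantages are strong global search capability and an inherently parallel structure, which makes it suitable for distributed computing. Because the genetic algorithm provides only a general problem-solving framework, a specific implementation must be designed for each concrete problem. This dissertation modifies the fitness function, encoding scheme, crossover, and mutation of the genetic algorithm according to the characteristics of the feature matching problem, so that it can match features between two stereo images more effectively. Dense disparity map generation is the other research focus in 3D reconstruction; because disparity is inversely proportional to depth, the 3D geometry of the scene can also be recovered from a dense disparity map. In recent years the rank transform has been widely used in dense disparity map generation to improve robustness to optical distortion and illumination changes. The rank transform replaces the magnitude of a pixel with the rank of its intensity within a local neighborhood (window). The problem with the rank transform, however, is that it loses the edge and contour information contained in the original image, which lowers the discriminability of the transformed image. To address this problem, this dissertation proposes a new rank transform that inherits the advantages of the original rank transform while effectively preserving the edge information of the original image. Finally, a dense disparity map usually contains many points whose values differ substantially from the true disparity; the disparity values at these points are called unreliable disparities. This dissertation designs a disparity map post-processing method that first detects unreliable disparities along the image scanlines and then corrects them using the reliable disparities around them. The proposed dense disparity map generation method is tested on standard benchmark datasets and compared with several other recent methods; the experimental results demonstrate the advantages of the proposed method.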
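As an illustration of the rank transform mentioned in the abstract, the following is a minimal sketch of the classic version, in which each pixel is replaced by the number of neighbors in its window that are darker than it. It is not the edge-preserving variant proposed in the dissertation, whose details the abstract does not give. The function name `rank_transform`, the `radius` parameter, and the list-of-lists image layout are assumptions made for this example.

```python
def rank_transform(image, radius=1):
    """Classic rank transform (illustrative sketch, not the thesis variant).

    `image` is a list of rows of numeric intensities. Each output value is
    the count of neighbours inside the (2*radius+1)^2 window whose intensity
    is strictly lower than the centre pixel. Neighbours outside the image
    are ignored, which is one possible border policy among several.
    """
    height = len(image)
    width = len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            center = image[y][x]
            rank = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    # Skip the centre itself and positions outside the image.
                    if (dy or dx) and 0 <= ny < height and 0 <= nx < width:
                        if image[ny][nx] < center:
                            rank += 1
            out[y][x] = rank
    return out


img = [[10, 20, 30],
       [40, 50, 60],
       [70, 80, 90]]
# Simulate a camera gain of 2 and a bias of 10 on the same scene.
brighter = [[2 * v + 10 for v in row] for row in img]
print(rank_transform(img) == rank_transform(brighter))
```

Because only the ordering of intensities inside the window matters, any increasing gain-and-bias change leaves the output unchanged, which is the robustness property the abstract refers to.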
Call number	TP391.41/Z43/2013
English abstract	Panoramic image stitching is one of the main research topics in computer vision and is widely used in virtual reality, augmented reality, and robotics. The main task of panoramic image stitching is to combine several overlapping images of the same scene into a single panorama with an enlarged field of view. The stitching process can be divided into two major steps: image registration and image fusion. The purpose of image registration is to find the overlapping region between two images. Currently, image registration methods used in panoramic image stitching fall into two classes: registration using cylindrical projection and registration using global optimization. Methods in the former class need additional equipment, such as a panoramic head that places the camera's optical center on the rotation axis. Their advantage is memory and computational efficiency. Methods in the latter class place no restriction on camera motion and need no equipment other than the camera itself. The parameters of the camera motion model are estimated by optimizing an objective function. Depending on whether features are used, these methods can be further divided into feature-based methods and direct methods. Because current direct methods define their objective functions by performing arithmetic operations directly on intensity magnitudes, they are sensitive to illumination changes and to camera bias and gain. To solve this problem, we propose a novel objective function in the non-parametric image domain. The non-parametric image domain has been shown to be robust against camera bias and gain as well as brightness fluctuation. Moreover, a heuristic hybrid optimization strategy is applied: a genetic algorithm is first used to find a good initial solution globally, and this solution is then refined locally by golden-section search.
The presented method is a coarse-to-fine strategy that both ensures the global optimality of the solution and increases the search speed. The aim of image fusion is to combine two images into a single panorama using their overlapping region. The optimal seam method is widely used in image fusion. Its basic idea is to first find a seam in the overlapping area and then copy each image to the corresponding side of this seam. The optimal seam is expected to pass through areas where the difference between the two images is smallest, so that the final composite image is visually pleasing. In the optimal seam method, the difference between two correspondingly located pixels in the two overlapping images is called the seam cost. However, analysis of the imaging model reveals that some commonly used seam costs are sensitive to optical distortions such as vignetting. Moreover, they cannot accurately describe differences in local image structure. To resolve these problems, this dissertation proposes a novel optimal seam estimation method based on a study of the imaging model. First, the intensity and image derivatives are used to formulate a second moment matrix for each location in the overlapping region. Then, the seam cost is defined as the geodesic distance between two second moment matrices on the Riemannian manifold. Finally, the optimal seam is backtraced on the seam cost surface using dynamic programming. The proposed image stitching method is compared with several other state-of-the-art methods under challenging conditions such as illumination variation and vignetting; the experimental results show that the proposed method is more robust against these undesirable environmental effects. The purpose of 3D reconstruction is to recover the 3D geometric structure of the scene from a stereo image pair.
Currently, there are two main research topics in 3D reconstruction: feature point reconstruction and dense disparity map generation. Feature point reconstruction involves five basic steps: stereo camera calibration, image rectification, feature detection, feature matching, and triangulation. In this dissertation, a new feature matching method using the genetic algorithm is presented. The genetic algorithm (GA) is a heuristic search algorithm. To match extracted feature points between two stereo images effectively, major GA components such as the fitness function, the encoding strategy, and the crossover and mutation operations are redesigned around the characteristics of the feature matching problem. Dense disparity map generation is the other main research topic in 3D reconstruction. Recently, the rank transform has been widely used to increase the robustness of disparity map generation methods against various optical distortions as well as illumination changes. The main problem with the rank transform is the loss of edges and boundaries contained in the original image, which reduces the discriminative power of the transformed image. To address this problem, a new version of the rank transform is presented in this dissertation. The new rank transform not only inherits the desirable properties of the original rank transform but also preserves more of the edges and boundaries in the original image. Finally, a dense disparity map usually contains some false disparities that are far from their true values. A novel disparity post-processing method is proposed in this dissertation. This method first detects false disparities along the image scanlines and then corrects them using the nearby correct disparities. The proposed dense disparity map generation method is evaluated on standard benchmarks and compared with several other state-of-the-art methods.
The corresponding experimental results show the superiority of the proposed methods.
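The dynamic-programming step of the optimal seam method described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the seam cost here is a plain absolute intensity difference, swapped in for the geodesic distance between second moment matrices, and the names `abs_diff_cost` and `optimal_seam` are hypothetical. The seam runs from top to bottom and may shift by at most one column per row.

```python
def abs_diff_cost(left, right):
    """Stand-in seam cost: per-pixel absolute intensity difference.

    Assumption for illustration only; the dissertation defines the cost as
    a geodesic distance between second moment matrices instead.
    """
    return [[abs(a - b) for a, b in zip(lrow, rrow)]
            for lrow, rrow in zip(left, right)]


def optimal_seam(cost):
    """Return one column index per row for the cheapest top-to-bottom seam."""
    height, width = len(cost), len(cost[0])
    # acc[y][x] holds the cheapest cumulative cost of a seam ending at (y, x).
    acc = [list(cost[0])]
    for y in range(1, height):
        row = []
        for x in range(width):
            lo, hi = max(0, x - 1), min(width - 1, x + 1)
            row.append(cost[y][x] + min(acc[y - 1][lo:hi + 1]))
        acc.append(row)
    # Backtrace upward from the cheapest cell in the last row.
    x = min(range(width), key=lambda c: acc[-1][c])
    seam = [x]
    for y in range(height - 1, 0, -1):
        lo, hi = max(0, x - 1), min(width - 1, x + 1)
        x = min(range(lo, hi + 1), key=lambda c: acc[y - 1][c])
        seam.append(x)
    seam.reverse()
    return seam


cost = [[5, 1, 5],
        [5, 5, 1],
        [5, 1, 5]]
print(optimal_seam(cost))  # prints [1, 2, 1]
```

To composite the overlap, pixels left of the seam column in each row would be taken from the left image and the remaining pixels from the right image.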
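Language	Chinese
Ownership rank	1
Pages	110
Classification number	TP391.41
Content type	Dissertation
Source URL	[http://ir.sia.ac.cn/handle/173321/14840]
Collection	Shenyang Institute of Automation_Robotics Laboratory
Recommended citation (GB/T 7714)	赵戈. 面向大场景的图像拼接与三维重建算法研究[D]. 中国科学院沈阳自动化研究所. 2013.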
