This research tackles the problem of dense three-dimensional reconstruction from binocular image sequences. Recovering 3D information has been a focus of the computer vision community for several decades, yet no fully satisfactory method has been found so far. The main difficulty with vision is that the image perceived by the computer is a two-dimensional projection of the 3D world. Three-dimensional reconstruction can thus be regarded as the process of re-projecting the 2D image(s) back to a 3D model, thereby recovering the depth dimension that was lost during projection.
In this work, we focus on dense reconstruction, meaning that a depth estimate is sought for each pixel of the input image. Most attention in the 3D reconstruction area has been on stereo-vision-based methods, which use the displacement of objects between two (or more) images. Whereas stereo vision can be seen as a spatial integration of multiple viewpoints to recover depth, it is also possible to perform a temporal integration. The problem arising in this situation is known as the Structure from Motion problem and deals with extracting three-dimensional information about the environment from the motion of its projection onto a two-dimensional surface. Based on the observation that the human visual system uses both stereo and structure from motion for 3D reconstruction, this research also targets the integration of stereo information into a structure-from-motion-based 3D reconstruction scheme. The data fusion problem arising in this case is solved by casting it as an energy minimization problem in a variational framework.
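As a small illustration of the stereo cue described above, the following sketch converts a per-pixel displacement (disparity) map into a dense depth map. It assumes a rectified stereo pair and the standard pinhole relation Z = f · B / d; the function names, the focal length, the baseline, and the disparity values are hypothetical and chosen only for illustration. It is not the variational method used in this research.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres for one pixel of a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity (or a failed match).
        return float("inf")
    return focal_length_px * baseline_m / disparity_px


def dense_depth_map(disparity_map, focal_length_px, baseline_m):
    """Apply the per-pixel relation to every entry of a 2D disparity map."""
    return [
        [depth_from_disparity(d, focal_length_px, baseline_m) for d in row]
        for row in disparity_map
    ]


# Hypothetical 2x2 disparity map (pixels), focal length (pixels), baseline (metres).
disparities = [[8.0, 16.0], [32.0, 0.0]]
depth = dense_depth_map(disparities, focal_length_px=800.0, baseline_m=0.125)
```

Larger disparities map to smaller depths, which is why nearby objects appear to shift more between the two views than distant ones.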
Video Results (check also our YouTube channel):
Reconstruction of an indoor stereo sequence:
Incorporated in the videos
Robots used for this research subject: