The Recovery of 3-D Structure Using Visual Texture Patterns

Angeline Loh
2006

Abstract

One common task in Computer Vision is the estimation of three-dimensional surface shape from two-dimensional images. This task is important as a precursor to higher level tasks such as object recognition - since the shape of an object gives clues to what the object is - and object modeling for graphics. Many visual cues have been suggested in the literature to provide shape information, including the shading of an object, its occluding contours (the outline of the object that slants away from the viewer) and its appearance from two or more views. If the image exhibits a significant amount of texture, then this too may be used as a shape cue. Here, 'texture' is taken to mean the pattern on the surface of the object, such as the dots on a pear, or the tartan pattern on a tablecloth. This problem of estimating the shape of an object based on its texture is referred to as shape-from-texture and it is the subject of this thesis.

One motivation for studying shape-from-texture is the fact that, according to psychophysical experiments, texture plays an important role in the human perception of shape. It would be useful if computers could mimic this behaviour. Some advantages of using texture as a cue are that it allows shape to be recovered from static monocular images (rather than multiple views) and the fact that textures are ubiquitous in the world around us. Another reason for studying texture as a shape cue is to use it in combination with other cues for a more robust solution.

During the past three decades, there has been much work in shape-from-texture. This thesis contributes to the existing body of work by providing three new algorithms: two are shape-from-texture algorithms that solve the problem under different sets of assumptions regarding the texture and viewing geometry; the other algorithm solves the low-level task of estimating the transformation between patches of texture. These three algorithms are described in more detail below.

The first shape-from-texture method is fast and direct, and works with homogeneous and stationary textures viewed orthographically, with the frontal texture known. The method is based on the fact that, as textures are foreshortened due to the relative orientation of the surface patch to the viewer, the second spectral moments do not change about the axis orthogonal to the tilt axis. From this, the tilt axis may be identified, which in turn leads to an estimate of the slant angle of the surface patch. In this way the orientation of each surface patch may be recovered. A number of issues affecting the ideal behaviour of the system are explored, including the scaling property of the Fourier transform, windowing schemes, illumination and blur. A property of this method is that occluding boundaries are estimated to curve away from the viewer. The new method is compared to a recent and well-known method from the literature that uses the same assumptions regarding texture and viewing geometry. This thesis demonstrates that the new method has several advantages over the existing method: among other things, it is more robust, exhibits no ambiguities other than the unavoidable tilt ambiguity of ±π, and never returns complex, and hence unusable, values.
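
The moment relation underlying this method can be sketched numerically. In the minimal illustration below the tilt axis is assumed to be aligned with the x-axis, and the directional second spectral moments are hypothetical normalized values; it shows only the scaling fact the method rests on (foreshortening by the slant scales frequencies along the tilt axis by 1/cos of the slant, leaving the orthogonal moment unchanged), not the full algorithm.

```python
import numpy as np

# Hypothetical normalized second spectral moments of the known frontal
# texture, along the tilt axis (x) and the orthogonal axis (y).
m_fx, m_fy = 1.5, 0.8

slant = np.deg2rad(40.0)         # ground-truth slant of the surface patch
# Foreshortening compresses the image by cos(slant) along the tilt axis,
# which stretches spatial frequencies by 1/cos(slant) in that direction:
m_ox = m_fx / np.cos(slant)**2   # observed moment along the tilt axis grows
m_oy = m_fy                      # moment about the orthogonal axis is unchanged,
                                 # which is what identifies the tilt axis

# Recovery: once the tilt axis is found, the ratio of frontal to observed
# moment along it gives the slant angle.
slant_est = np.arccos(np.sqrt(m_fx / m_ox))
print(np.rad2deg(slant_est))     # ≈ 40.0
```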

The second shape-from-texture method aims to solve the problem in one of its most general forms: the texture is not assumed to be isotropic, homogeneous, stationary or viewed orthographically. In addition, the frontal texture is not assumed to be known a priori, to come from a known set, or even to be present in the image. Instead, the surface is assumed to be smooth and covered in identical texture elements; this allows the entire surface to be recovered via a consistency constraint. The key idea is that only if the correct transformation from an arbitrary reference texel to a frontal texel is estimated will a consistent, integrable surface be produced. It is shown that a Levenberg-Marquardt search can estimate the frontal texture efficiently. The methods described in this thesis have been quantitatively tested on the entire set of Brodatz textures and also on real images.
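
The role of the Levenberg-Marquardt search can be sketched with SciPy. The residual below is a purely illustrative stand-in (a linear function with a known minimiser); in the actual method the residual vector would instead measure how far the surface recovered from a candidate frontal-texel transformation is from being consistent and integrable.

```python
import numpy as np
from scipy.optimize import least_squares

def consistency_residuals(p, obs, target):
    """Stand-in residual: one entry per observed texel. In the real problem
    each entry would score the consistency of the surface implied by the
    frontal-texel parameters p; here it is a synthetic linear residual."""
    return obs @ p - target

rng = np.random.default_rng(0)
obs = rng.normal(size=(20, 2))            # 20 synthetic "texel observations"
target = obs @ np.array([1.2, 0.7])       # residuals vanish at p = (1.2, 0.7)

# Levenberg-Marquardt search over the frontal-texel parameters.
fit = least_squares(consistency_residuals, x0=[0.0, 0.0],
                    args=(obs, target), method='lm')
print(fit.x)                              # ≈ [1.2, 0.7]
```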

This thesis also investigates the relationship between shape-from-texture and structure-from-motion. If the camera is stationary and the moving object is planar, or nearly so, then the superposition of the images of the moving object produces a texture whose structure can be recovered. The second shape-from-texture algorithm was adapted to demonstrate an example of such a structure-from-motion reconstruction.

The other algorithm presented in this thesis estimates the transformation between patches of the same texture viewed from different orientations. Previous methods for doing this are shown to be non-robust, or to break down when the change between the two textures is not incremental. The new method overcomes the drawbacks of these previous methods, and is also robust to blurring and illumination variations.
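
For context, one generic closed-form way to relate two patches of the same texture is to match their spectral second-moment matrices, which recovers a symmetric (foreshortening-like) map between them. The sketch below illustrates that generic moment-matching idea only, not the new method of this thesis; all matrix values are hypothetical.

```python
import numpy as np
from scipy.linalg import sqrtm

# Hypothetical second-moment matrices: M1 for the first patch, and M2 for
# the second patch, related by a symmetric map A via M2 = A @ M1 @ A.
M1 = np.array([[2.0, 0.5],
               [0.5, 1.0]])
A_true = np.array([[1.8, 0.3],
                   [0.3, 1.1]])   # ground-truth symmetric transform
M2 = A_true @ M1 @ A_true

# Closed-form recovery of the unique symmetric positive-definite solution of
# A @ M1 @ A = M2:  A = M1^{-1/2} (M1^{1/2} M2 M1^{1/2})^{1/2} M1^{-1/2}.
R1 = sqrtm(M1)
R1i = np.linalg.inv(R1)
A_est = np.real(R1i @ sqrtm(R1 @ M2 @ R1) @ R1i)
print(np.round(A_est, 3))
```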

The work in this thesis is likely to have an impact in a number of ways. The second shape-from-texture algorithm provides one of the most general solutions to the problem. On the other hand, if the assumptions of the first shape-from-texture algorithm are met, that algorithm provides an extremely usable method: users should be able to input images of textured objects and click on the frontal texture to quickly reconstruct a fairly good estimate of the surface. Lastly, the algorithm for estimating the transformation between textures can be used as a component of many shape-from-texture algorithms, as well as being useful in other areas of Computer Vision. This thesis gives two examples of other applications for the method: re-texturing an object and placing objects in a scene.
