1. Introduction
Because the human brain can process the slight differences between the images seen by the left and right eyes, we can perceive the external 3-D world; this ability is called stereoscopic vision. A stereo image sequence is a three-dimensional form of visual imagery: it uses paired left and right images to describe each scene, and the human visual system perceives 3-D depth information from the relative positions of corresponding points in the two images. The 3-D reconstruction of video scenes by stereo analysis is an important advance in machine vision and computer vision. Most automation tasks require a model of the 3-D structure of the environment. Compared with traditional two-dimensional images, stereo images are more realistic and describe the scene more naturally. Currently, 3-D vision systems are widely applied in stereo video communications (Raymond & Clifton, 2000), robot vision (Nishihara & Poggio, 1984), aviation navigation (Stefano, Marchioni, Mattoccia, & Neri, 2004), and other fields. As people's material and cultural needs grow, stereo images are expected to gradually replace traditional monocular images and to be used more widely in television, online shopping, remote medical diagnosis, and other civilian areas.
To store or transmit stereo images, a more efficient compression coding scheme must be developed. Stereo image sequence compression was first proposed in the late 1980s; after more than 25 years of development, several comparatively mature algorithms exist. However, from the standpoint of practical application, there is still no unified coding standard.
The video-object-based stereo video coding method separates each video object from the scene, extracts its boundary, texture, motion, and other parameters, and then codes these parameters in order to code the whole image (Aizawa & Huang, 1995). This method exploits the implicit 3-D depth information: by building 3-D object and coding models, it improves coding efficiency and reduces blocking effects. It also provides a more natural interpretation of the scene. However, the approach requires sophisticated image analysis, such as object segmentation and object modeling, and these techniques are not yet mature. At present it can therefore be applied only to scenes with a single background and simple motion; its wider use depends on better solutions to these key problems.
This paper presents a stereo video object segmentation algorithm based on the redundant wavelet transform. First, we use the redundant wavelet transform to extract feature points from the stereo video images; we then perform disparity estimation at these feature points to form a disparity map. Stationary objects are segmented from the stereo images using the disparity map. For moving objects, we use a moving object extraction algorithm, also based on the redundant wavelet transform, to segment the moving target in the redundant wavelet domain. Experimental results show that the algorithm can segment both stationary and moving video objects from stereo video images with good results, well-preserved detail, and a simple calculation process, all of which benefit the subsequent coding operation.
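The first two steps of this pipeline (feature extraction in a redundant wavelet domain, followed by disparity estimation at the feature points) can be illustrated with a minimal sketch. It is not the algorithm evaluated in this paper. The B3-spline "à trous" filter, the detail threshold, the sum-of-absolute-differences (SAD) matching cost, and the function names atrous_detail, feature_points, and sparse_disparity are all assumptions chosen for illustration. Rectified images are also assumed, so that a point at column x in the left image appears at column x - d in the right image.

```python
import numpy as np

def atrous_detail(image, level=1):
    """One level of an undecimated (a trous) wavelet transform along rows.

    Assumption: B3-spline smoothing kernel; the paper does not name its filter.
    Returns (approx, detail), both the same size as the input.
    """
    kernel = np.array([1, 4, 6, 4, 1], dtype=float) / 16.0
    step = 2 ** (level - 1)
    # Insert zeros ("holes") between taps instead of downsampling the image.
    dilated = np.zeros((len(kernel) - 1) * step + 1)
    dilated[::step] = kernel
    pad = len(dilated) // 2
    image = image.astype(float)
    padded = np.pad(image, ((0, 0), (pad, pad)), mode="reflect")
    approx = np.apply_along_axis(
        lambda row: np.convolve(row, dilated, mode="valid"), 1, padded
    )
    detail = image - approx  # no decimation, so approx + detail == image
    return approx, detail

def feature_points(detail, threshold):
    """Pixels whose detail coefficient magnitude exceeds a (hypothetical) threshold."""
    ys, xs = np.nonzero(np.abs(detail) > threshold)
    return list(zip(ys.tolist(), xs.tolist()))

def sparse_disparity(left, right, points, max_disp=8, half=2):
    """SAD matching along the same row, evaluated only at the feature points."""
    h, w = left.shape
    disparity = {}
    for y, x in points:
        # Skip points whose window or search range would leave the image.
        if y < half or y >= h - half or x < half + max_disp or x >= w - half:
            continue
        patch = left[y - half:y + half + 1, x - half:x + half + 1]
        costs = [
            np.abs(
                patch - right[y - half:y + half + 1, x - d - half:x - d + half + 1]
            ).sum()
            for d in range(max_disp + 1)
        ]
        disparity[(y, x)] = int(np.argmin(costs))
    return disparity
```

Because the transform is not decimated, every detail coefficient stays aligned with its pixel, so feature positions can be used directly as matching locations. In a sketch like this, grouping feature points by their disparity values would be one simple way to separate objects at different depths; the segmentation and moving-object steps described above are not reproduced here.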