Detecting and Tracking Segmentation of Moving Objects Using Graph Cut Algorithm

Raviraj Pandian (GSSS Institute of Engineering and Technology for Women, India) and Ramya A. (KalaignarKarunanidhi Institute of Technology, India)
Copyright: © 2018 | Pages: 16
DOI: 10.4018/978-1-5225-5775-3.ch010


This chapter presents real-time moving object detection, classification, and tracking capabilities in a system that operates on both color and gray-scale video imagery from a stationary camera. The system can handle object detection in indoor and outdoor environments and under changing illumination conditions. Object detection in a video is usually performed by object detectors or background subtraction techniques. The proposed method determines the threshold automatically and dynamically, depending on the intensities of the pixels in the current frame. It updates the background model with a learning rate that depends on the pixel differences with respect to the background model of the previous frame. The graph cut segmentation-based region merging approach achieves both segmentation and optical flow computation accurately and can work in the presence of large camera motion. The algorithm uses the shape of the detected objects and temporal tracking results to categorize objects into pre-defined classes such as human, human group, and vehicle.
Chapter Preview


Automated video analysis is important for many vision applications, such as surveillance, traffic monitoring, augmented reality, and vehicle navigation (Yilmaz, Javed, & Shah, 2006; Moeslund, Hilton, & Kruger, 2006). As pointed out by Yilmaz, Javed, and Shah (2006), there are three key steps in automated video analysis: object detection, object tracking, and behavior recognition. As the first step, object detection aims to locate and segment objects of interest in a video. These objects can then be tracked from frame to frame, and the tracks can be analyzed to recognize object behavior. Thus, object detection plays a critical role in practical applications.

Object detection is usually achieved by object detectors or background subtraction (Yilmaz, Javed, & Shah, 2006). An object detector is often a classifier that scans the image with a sliding window and labels each sub-image defined by the window as either object or background. Generally, the classifier is built by offline learning on separate datasets (Papageorgiou, Oren, & Poggio, 1998; Viola, Jones, & Snow, 2005) or by online learning initialized with a manually labeled frame at the start of a video (Grabner & Bischof, 2006; Babenko, Yang, & Belongie, 2011). Alternatively, background subtraction (Piccardi, 2004) compares images with a background model and detects the changes as objects. It usually assumes that no object appears in the images used to build the background model (Toyama, Krumm, Brumitt, & Meyers, 1999; Moeslund, Hilton, & Kruger, 2006). Such requirements for training examples, whether for object or background modeling, limit the applicability of the above-mentioned methods in automated video analysis.
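The background subtraction idea described above can be illustrated with a minimal sketch. It is not the chapter's implementation: the function names, the threshold rule (mean plus `k` standard deviations of the per-pixel differences in the current frame), and the fixed learning rate `alpha` are all assumptions chosen for illustration. Frames are plain nested lists of gray-scale intensities.

```python
from statistics import mean, pstdev


def subtract_background(frame, background, k=2.0):
    """Return a binary foreground mask (1 = changed pixel) for one frame.

    Assumption: the threshold is recomputed for every frame as
    mean + k * std of the absolute differences (a hypothetical rule).
    """
    diffs = [[abs(p - b) for p, b in zip(frow, brow)]
             for frow, brow in zip(frame, background)]
    flat = [d for row in diffs for d in row]
    threshold = mean(flat) + k * pstdev(flat)
    return [[1 if d > threshold else 0 for d in row] for row in diffs]


def update_background(frame, background, mask, alpha=0.05):
    """Blend background pixels toward the current frame.

    Pixels marked as foreground are left unchanged so that moving
    objects are not absorbed into the model. alpha is an assumed,
    fixed learning rate.
    """
    return [[b if m else (1 - alpha) * b + alpha * p
             for p, b, m in zip(frow, brow, mrow)]
            for frow, brow, mrow in zip(frame, background, mask)]


# Toy example: a flat 4x4 background and a frame with one bright pixel.
background = [[10.0] * 4 for _ in range(4)]
frame = [[10.0] * 4 for _ in range(4)]
frame[1][2] = 200.0

mask = subtract_background(frame, background)
background = update_background(frame, background, mask)
```

In this toy case only the bright pixel is flagged, and the background model at that position is left untouched. A uniform brightness change across the whole frame produces no foreground under this threshold rule, because every difference equals the mean; the model then drifts toward the new intensity at the rate set by `alpha`.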

Another category of object detection methods that avoids training phases consists of motion-based methods (Yilmaz, Javed, & Shah, 2006; Moeslund, Hilton, & Kruger, 2006), which use only motion information to separate objects from the background. The problem can be rephrased as follows: given a sequence of images in which foreground objects are present and moving differently from the background, can we separate the objects from the background automatically? Figure 1(a) shows such an example, in which a walking lady is always present and is recorded by a handheld camera. The goal is to take the image sequence as input and directly output a mask sequence of the walking lady. Figure 1(b) shows another example, taken from a surveillance video at an airport.

Figure 1.

(a) A sequence of 40 frames, where a walking lady is recorded by a handheld camera. From left to right are the first, 20th, and 40th frames. (b) A sequence of 48 frames clipped from a surveillance video at the airport.

