Occlusion Handling in Object Detection

Farjana Z. Eishita (University of Saskatchewan, Canada), Ashfaqur Rahman (Central Queensland University Rockhampton, Australia), Salahuddin A. Azad (Central Queensland University Rockhampton, Australia) and Akhlaqur Rahman (American International University, Bangladesh)
DOI: 10.4018/978-1-4666-1830-5.ch005


Object tracking is a process that follows an object through consecutive image frames to determine the object's movement relative to other objects in those frames. In other words, tracking is the problem of estimating the trajectory of an object in the image plane as it moves around a scene. This chapter presents research that deals with the problem of tracking objects when they are occluded. An object can be partially or fully occluded. Depending on the tracking domain, a tracker can deal with partial and full object occlusions using features such as colour and texture, but it sometimes fails to detect the objects after occlusion. The shape feature of an individual object can provide additional information when combined with colour and texture features. It has been observed that when two objects share the same colour and texture, taking their shape information into account allows both objects to be detected after the occlusion has occurred. Based on this observation, a new and very simple algorithm is presented in this chapter, which is able to track objects after occlusion even if their colour and texture are the same. Experimental results are shown along with several case studies to compare the effectiveness of the shape features against colour and texture features.
Chapter Preview

1. Introduction

Object tracking is the process of following moving objects across a video sequence (Trucco & Plakas, 2006). In other words, object tracking is the problem of estimating the trajectory of an object in the image plane as it moves around a scene. Object tracking has many applications, including traffic monitoring, security and surveillance, human-computer interaction, video annotation, video editing, medical imaging, robotics, and augmented reality.

There are three major steps in video analysis: detection of the moving objects, tracking of the moving objects from frame to frame, and analysis of the object tracks to recognize their behavior (Yilmaz, Javed, & Shah, 2006). A tracker assigns consistent labels to the tracked objects across different video frames. In addition, depending on the tracking domain, a tracker can provide object-centric information, such as the area or shape of an object. Simple video tracking algorithms rely on selecting, in the first frame, the region of interest associated with the moving objects. Some of the more demanding methods place constraints on the shape of the tracked object (Erdem, Tekalp, & Sankur, 2003; Shin, Kim, Kang, Lee, Paik, Abidi, & Abidi, 2005). In general, algorithms of this type include a priori training on the possible shapes of the object.

A typical video tracker has two components: target representation and localization, and filtering and data association. The first process, which is mainly a bottom-up approach, deals with the appearance of the target. The second process, which is a top-down approach, deals with the dynamics of the tracked object, the learning of scene priors and the evaluation of hypotheses (Comaniciu, Ramesh, & Meer, 2003). How much weight each process receives depends on the application and the purpose of the tracking. Face tracking in a cluttered scene depends more on target representation than on target dynamics (DeCarlo & Metaxas, 2000), while in aerial video surveillance, target motion and target dynamics are more important than target representation.

Target representation and localization algorithms can be classified into three groups: blob tracking, kernel tracking and silhouette tracking. Blob tracking algorithms localize the target object using blob detection, block-based correlation or optical flow. Algorithms of this sort are suitable for tracking objects that occupy a small region of the image (Yilmaz, Javed, & Shah, 2006). Kernel-based algorithms localize the target object by maximizing a similarity measure such as the Bhattacharyya coefficient. Silhouette tracking localizes the target object by detecting the object boundary. Algorithms of this type are suitable for tracking non-rigid objects (Yilmaz, Javed, & Shah, 2006).

Filtering and data association processes are used to deal with object dynamics, often by incorporating prior information about the scene and the object. The Kalman filter and the particle filter are two well-known filtering algorithms (Bradski, 1998; Comaniciu & Meer, 2002; Han & Davis, 2004). The Kalman filter is an optimal recursive Bayesian filter for linear systems subject to Gaussian noise. The particle filter (PF) is a Monte Carlo (i.e., random sampling) method for monitoring dynamic systems, which non-parametrically approximates probability distributions using weighted samples (particles) (Arulampalam, Maskell, Gordon, & Clapp, 2002).
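As an illustration of the similarity measure used in kernel-based localization, the following minimal sketch computes the Bhattacharyya coefficient between two histograms and picks the candidate region that maximizes it. It is not taken from the chapter: the function names (`normalize`, `bhattacharyya_coefficient`, `best_candidate`) and the use of plain colour histograms as the target model are assumptions made for this example only.

```python
import math


def normalize(hist):
    """Scale a histogram of non-negative counts so that its bins sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must contain positive mass")
    return [h / total for h in hist]


def bhattacharyya_coefficient(p, q):
    """Return sum_i sqrt(p_i * q_i) for two histograms with the same binning.

    The value lies in [0, 1]: 1 for identical distributions,
    0 for distributions with no overlapping bins.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    p, q = normalize(p), normalize(q)
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def best_candidate(target_hist, candidate_hists):
    """Hypothetical helper: index and score of the most similar candidate."""
    scores = [bhattacharyya_coefficient(target_hist, c) for c in candidate_hists]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

Calling `best_candidate(target, candidates)` with a target histogram and a list of histograms extracted from candidate regions returns the index of the region most similar to the target. A practical kernel tracker would typically weight pixels by a spatial kernel and search iteratively rather than exhaustively; that part is omitted here.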

Occlusion can significantly undermine the performance of object tracking algorithms. In many practical tracking scenarios, a moving object is frequently occluded, entirely or partially, by other objects in the scene. The tracking algorithm must be able to detect the occlusion quickly and find the object again once the occlusion is over.
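One common way to bridge a short occlusion, sketched below, is to let a Kalman filter keep predicting the object's position while no measurement is available and to resume updating once the object is detected again. This is a generic illustration, not the algorithm presented in this chapter. It assumes a one-dimensional constant-velocity motion model, a simplified process noise `q` added to the diagonal of the covariance, and a measurement list in which `None` marks an occluded frame; all names and parameter values are hypothetical.

```python
def predict(state, cov, dt=1.0, q=0.01):
    """Propagate (position, velocity) and its 2x2 covariance one frame ahead."""
    pos, vel = state
    (p00, p01), (p10, p11) = cov
    new_state = (pos + dt * vel, vel)
    # F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I (simplified noise model).
    new_cov = (
        (p00 + dt * (p01 + p10) + dt * dt * p11 + q, p01 + dt * p11),
        (p10 + dt * p11, p11 + q),
    )
    return new_state, new_cov


def update(state, cov, z, r=1.0):
    """Correct the prediction with a position measurement z of variance r."""
    pos, vel = state
    (p00, p01), (p10, p11) = cov
    innovation = z - pos
    s = p00 + r                   # innovation variance
    k0, k1 = p00 / s, p10 / s     # Kalman gain for position and velocity
    new_state = (pos + k0 * innovation, vel + k1 * innovation)
    # (I - K H) P with H = [1, 0].
    new_cov = (
        ((1 - k0) * p00, (1 - k0) * p01),
        (p10 - k1 * p00, p11 - k1 * p01),
    )
    return new_state, new_cov


def track(measurements, dt=1.0, q=0.01, r=1.0):
    """Return (position estimate, position variance) for every frame.

    A measurement of None means the object is occluded in that frame,
    so only the prediction step runs and the uncertainty grows.
    """
    state = (0.0, 0.0)
    cov = ((100.0, 0.0), (0.0, 100.0))  # large initial uncertainty
    estimates = []
    for z in measurements:
        state, cov = predict(state, cov, dt, q)
        if z is not None:
            state, cov = update(state, cov, z, r)
        estimates.append((state[0], cov[0][0]))
    return estimates
```

For example, `track([0, 2, 4, 6, 8, None, None, None, 14, 16])` carries the estimate forward at the learned velocity through the three occluded frames, with the position variance growing until the object is measured again. The growing variance can also serve as a search radius when looking for the object after the occlusion ends.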
