Improved Subject Identification in Surveillance Video Using Super-Resolution

Simon Denman, Frank Lin, Vinod Chandran, Sridha Sridharan, Clinton Fookes
Copyright: © 2013 |Pages: 44
DOI: 10.4018/978-1-4666-2660-7.ch011

Abstract

The time-consuming and labour-intensive task of identifying individuals in surveillance video is often hampered by poor resolution and the sheer volume of stored video. Faces or identifying marks such as tattoos are often too coarse for direct matching by machine or human vision. Object tracking and super-resolution can be combined to automate the detection and enhancement of areas of interest. The object tracking process enables the automatic detection of people of interest, greatly reducing the amount of data passed to super-resolution. Smaller regions such as faces can also be tracked. A number of instances of such regions can then be used to obtain a super-resolved version for matching. Performance improvement from super-resolution is demonstrated using a face verification task. It is shown that there is a consistent improvement of approximately 7% in verification accuracy, using both Eigenface and Elastic Bunch Graph Matching approaches for automatic face verification, starting from faces with an eye-to-eye distance of 14 pixels. Visual improvement in image fidelity from super-resolved images over low-resolution and interpolated images is demonstrated on a small database. Current research and future directions in this area are also summarized.

Introduction

Forensic use of surveillance video is limited by the low resolution of the frames and the human effort required for extracting useful footage from recordings. Good quality images provide sharp and clear facial, clothing, tattooed skin and other areas of interest and are essential for the surveillance footage to be useful. However, surveillance videos are low resolution due to two main factors.

Storage requirements make high-resolution video prohibitive for use in a surveillance system. The video is often compressed using standards proposed by the Moving Picture Experts Group (MPEG) to reduce file size. In a best-case scenario, a DVD-quality video could be encoded in the state-of-the-art MPEG-4 format at around 100 KB/s. This still translates to 8.2 GB of data for one camera per day. The total storage requirements become prohibitive very quickly as the number of cameras is increased or if the period of retention is lengthened.
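The storage figure above can be checked with a short calculation. The sketch below is illustrative only: it assumes binary units (1 GB = 1024 × 1024 KB), which is the convention that reproduces the quoted 8.2 GB (decimal units would give about 8.64 GB), and the function name and the 16-camera, 30-day scenario are hypothetical rather than taken from the chapter.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def storage_gb(rate_kb_per_s, cameras=1, days=1):
    """Return the storage needed in GB, assuming 1 GB = 1024 * 1024 KB."""
    total_kb = rate_kb_per_s * SECONDS_PER_DAY * cameras * days
    return total_kb / (1024 * 1024)


# One camera recording for one day at the quoted 100 KB/s.
print(f"{storage_gb(100):.1f} GB")  # 8.2 GB

# Hypothetical deployment: 16 cameras retained for 30 days.
print(f"{storage_gb(100, cameras=16, days=30) / 1024:.2f} TB")
```

Storage grows linearly in both the number of cameras and the retention period, which is why even a modest network quickly reaches several terabytes.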

Further, in order to reduce the number of cameras required to cover an area, wide-angle lenses are fitted to cameras to increase their field of view (FOV). As a consequence, subjects of interest occupy only small portions of the entire scene. Figure 1 contains a sample frame from the i-LIDS (i-LIDS Team, 2006) dataset, which consists of surveillance footage for evaluating tracking algorithms. Although the video was captured at DVD resolution, the subjects' faces were less than 30 pixels (px) wide. It becomes progressively harder to distinguish the two faces once the width of the face drops below 32px, as can be seen from Figure 2. Interpolation or “digital zoom” does not help because although these procedures add extra samples, they do not add extra information at high spatial frequencies.

Figure 1.

Typical image captured by a surveillance camera. The whole frame is 720 by 576px. The faces only occupy a small area compared to the frame. Widths of the highlighted faces are: a) 17px, b) 14px, c) 12px and d) 27px.

Super-resolution is a signal-processing method that can be applied to enhance the resolution of surveillance video by fusing complementary information contained in successive frames of the video. As super-resolution is a computationally intensive process, applying it to an entire sequence without any guidance is undesirable. A tracking system enables subjects of interest to be detected, tracked and super-resolved, resulting in a more computationally efficient and useful system. The tracking system also discards frames of the sequence that do not contain moving subjects, reducing the manual effort required in extracting useful frames.
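The idea of fusing complementary information from several frames can be illustrated with a toy one-dimensional example. The sketch below is not the chapter's method (which uses robust optical flow for registration); it is an idealized shift-and-add scheme that assumes the frame offsets are known exactly and that there is no blur or noise. All function names and the sample signal are hypothetical.

```python
def decimate(signal, factor, offset):
    """Keep every `factor`-th sample from `offset`: one low-resolution frame."""
    return signal[offset::factor]


def linear_upsample(frame, factor):
    """Upsample a single frame by linear interpolation (last value is held)."""
    out = []
    for i, value in enumerate(frame):
        nxt = frame[i + 1] if i + 1 < len(frame) else value
        for k in range(factor):
            out.append(value + (nxt - value) * k / factor)
    return out


def shift_and_add(frames, offsets, factor):
    """Place each frame's samples on the fine grid at its known offset."""
    length = len(frames[0]) * factor
    total = [0.0] * length
    count = [0] * length
    for frame, offset in zip(frames, offsets):
        for i, value in enumerate(frame):
            total[offset + i * factor] += value
            count[offset + i * factor] += 1
    # Average where several frames hit the same position; 0.0 where none did.
    return [t / c if c else 0.0 for t, c in zip(total, count)]


scene = [0, 9, 0, 9, 3, 7, 3, 7]   # fine detail alternates sample to sample
offsets = [0, 1]                   # assumed known shifts between the frames
frames = [decimate(scene, 2, o) for o in offsets]

interpolated = linear_upsample(frames[0], 2)  # "digital zoom" on one frame
fused = shift_and_add(frames, offsets, 2)     # two frames combined

print(interpolated)  # [0.0, 0.0, 0.0, 1.5, 3.0, 3.0, 3.0, 3.0]
print(fused)         # [0.0, 9.0, 0.0, 9.0, 3.0, 7.0, 3.0, 7.0]
```

Interpolating a single frame yields more samples but none of the alternating detail, whereas the second frame, shifted by one fine-grid sample, supplies exactly the values the first frame missed. Real footage requires estimating the shifts and handling blur, noise and non-rigid motion, which is where most of the computational cost arises.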

In this chapter, we present an end-to-end system for detecting and tracking people, extracting sequences of faces for each subject and using super-resolution to improve the image quality prior to a face verification process. This process is illustrated in Figure 3. The proposed super-resolution approach uses a robust optical flow technique to guide the registration of a set of facial images, allowing for non-rigid deformations to be incorporated into the registration and super-resolution. This novel super-resolution approach is combined with an object tracking technique, resulting in a novel end-to-end system for automatically extracting, enhancing and recognizing facial images from surveillance footage.

Figure 3.

Outline of the proposed system. Video footage from a CCTV network is analyzed by a person tracking routine that extracts sequences of face images for each person observed in the network. These sequences are registered and super-resolution is used to obtain a high-quality image for each subject. The resultant high-resolution face images are compared to a database of biometric credentials to determine the subject's identity.

The remainder of this chapter is organized as follows. An overview of super-resolution and object tracking techniques is presented in Sections 2 (Super Resolution) and 3 (Person Tracking). An integrated system that performs super-resolution on areas identified by the tracking process is described in Section 4 (Integrated System). Current research trends and future directions are summarized in Section 5 (Future Research Directions). Concluding remarks are provided in Section 6 (Conclusion).
