Reliable Motion Detection, Location and Audit in Surveillance Video

Samaan Poursoltan (University of Adelaide, Australia) and Matthew J. Sorell (University of Adelaide, Australia)
DOI: 10.4018/978-1-60960-515-5.ch019


The review of video captured by fixed surveillance cameras is a time-consuming, tedious, expensive and potentially unreliable human process, yet the footage is of very high evidentiary value. Two key challenges stand out in such a task: ensuring that all motion events are captured for analysis, and demonstrating that all motion events have been captured so that the evidence survives being challenged in court. In previous work (Zhao, Poursoltanmohammadi & Sorell, 2008), it was demonstrated that tracking the average brightness of video frames or frame segments provided a more robust metric of motion than other commonly hypothesized motion measures. This paper extends that work in three ways: by setting automatic localized motion detection thresholds, by maintaining a frame-by-frame, single-parameter normalized motion metric, and by locating regions of motion events within the footage. A tracking filter approach is used for localized motion analysis, which adapts to localized background motion or noise within each image segment. When motion is detected, location and size estimates are reported to provide some objective description of the motion event.
Chapter Preview


Consider a surveillance scenario of a fixed-position video camera mounted indoors or outdoors, monitoring a specific low-traffic area such as an entrance, a cash machine or a stairwell. Low-traffic in this context means that much of the time there are no motion events of potential interest. This does not, however, mean that the scene is stationary. It is common in such a situation for the camera to introduce significant random noise from one frame to the next. There might also be trees or bushes waving in the wind in the background, a constantly moving escalator, and so on. Such occurrences do not constitute motion events of interest, but in the absence of an adequately robust motion detection algorithm, they are likely to cause constant false triggering of a motion detector.

In previous work (Zhao, Poursoltanmohammadi & Sorell, 2008), it was shown that of three likely candidates (frame luminance entropy, frame average luminance, frame differencing), average luminance provided the most robust mechanism for motion event detection, in large part because such a technique averages out speckle noise introduced by the camera sensor and subsequent processing and compression. That paper also proposed that by tracking such a motion metric on a frame-by-frame basis, and by capturing full frame-rate footage with lead-in and lead-out when motion is detected and low frame-rate footage otherwise, it would be possible to have confidence in the completeness of the motion footage, and to be able to demonstrate that confidence in court. However, the question of how to set decision thresholds, and the consideration of additional informative motion metrics, were left open. In particular, the challenge of how to deal with regular background motion, such as waving trees, was not addressed.
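The three candidate metrics named above can be illustrated with a minimal sketch. The function names, the representation of a frame as a list of rows of 0-255 luminance values, and the use of mean absolute difference as the frame-differencing measure are assumptions made for illustration; they are not taken from the cited paper.

```python
import math
from collections import Counter


def mean_luminance(frame):
    """Average luminance over all pixels of a frame."""
    pixels = [p for row in frame for p in row]
    return sum(pixels) / len(pixels)


def luminance_entropy(frame):
    """Shannon entropy, in bits, of the frame's luminance histogram."""
    pixels = [p for row in frame for p in row]
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())


def mean_abs_difference(prev, curr):
    """Mean absolute pixel difference between two frames of equal size."""
    diffs = [abs(a - b) for ra, rb in zip(prev, curr) for a, b in zip(ra, rb)]
    return sum(diffs) / len(diffs)


# A flat frame, and the same frame with zero-mean "speckle" (+2 / -2 checkerboard).
flat = [[100] * 4 for _ in range(4)]
noisy = [[102 if (r + c) % 2 == 0 else 98 for c in range(4)] for r in range(4)]

print(mean_luminance(flat), mean_luminance(noisy))        # averaging cancels the speckle
print(mean_abs_difference(flat, noisy))                   # differencing responds to it
print(luminance_entropy(flat), luminance_entropy(noisy))  # entropy responds to it too
```

In this toy case the average luminance is unchanged by the zero-mean noise while the other two measures react to it, which is the intuition behind the robustness claim; real sensor and compression noise is of course less tidy.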

The current work extends that previous work in three significant ways: first, by showing that a region of interest such as a square macroblock of 16x16 pixels provides sufficient noise reduction for effective motion detection; second, by proposing a technique that uses a tracking filter for each macroblock in a video frame to provide normalized local motion metrics; and third, by combining those metrics into an aggregated motion metric, a composite motion detector, and an estimator of the size and location of regions of motion.
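The pipeline outlined above (per-macroblock tracking, normalized local metrics, an aggregate metric, and a size and location estimate) might be sketched as follows. This preview does not specify the tracking filter, so the exponentially weighted mean and variance used here are an assumption, as are the parameter values `alpha`, `min_var` and `threshold`, the choice of the maximum block score as the aggregate, and all class and function names.

```python
import math

BLOCK = 16  # macroblock side in pixels, as stated in the text


def block_means(frame, block=BLOCK):
    """Average luminance of each block x block macroblock (frame: list of rows)."""
    rows, cols = len(frame) // block, len(frame[0]) // block
    return [
        [
            sum(
                frame[r][c]
                for r in range(br * block, (br + 1) * block)
                for c in range(bc * block, (bc + 1) * block)
            ) / (block * block)
            for bc in range(cols)
        ]
        for br in range(rows)
    ]


class BlockTracker:
    """Tracks a running mean and variance per macroblock (assumed filter form)."""

    def __init__(self, alpha=0.05, min_var=0.25, threshold=4.0):
        self.alpha, self.min_var, self.threshold = alpha, min_var, threshold
        self.mean = None
        self.var = None

    def update(self, frame):
        """Return per-block scores: deviation from the tracked mean in std units."""
        means = block_means(frame)
        if self.mean is None:  # initialise the background model from the first frame
            self.mean = [row[:] for row in means]
            self.var = [[self.min_var] * len(row) for row in means]
        scores = []
        for r, row in enumerate(means):
            score_row = []
            for c, m in enumerate(row):
                err = m - self.mean[r][c]
                var = max(self.var[r][c], self.min_var)
                score = abs(err) / math.sqrt(var)
                score_row.append(score)
                if score < self.threshold:  # adapt only while no motion is flagged
                    self.mean[r][c] += self.alpha * err
                    self.var[r][c] = (1 - self.alpha) * self.var[r][c] + self.alpha * err * err
            scores.append(score_row)
        return scores


def summarise(scores, threshold):
    """Aggregate score, bounding box in block units, and number of flagged blocks."""
    flagged = [(r, c) for r, row in enumerate(scores) for c, s in enumerate(row) if s >= threshold]
    aggregate = max(s for row in scores for s in row)
    if not flagged:
        return aggregate, None, 0
    rs = [r for r, _ in flagged]
    cs = [c for _, c in flagged]
    return aggregate, (min(rs), min(cs), max(rs), max(cs)), len(flagged)
```

Because each block is normalized by its own tracked variance, a block containing persistent background motion (a waving tree, say) accumulates a larger variance and needs a larger deviation to trigger, which is one plausible reading of how localized thresholds could adapt. The bounding box is returned as `(top, left, bottom, right)` in macroblock indices; multiplying by `BLOCK` gives pixel coordinates.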

It is proposed that such an approach has several notable advantages over current practice. First, human motion review is a tedious and time-consuming task, subject to errors due to fatigue; it is also expensive in terms of the work hours needed to perform such a review. The approach proposed here allows human or automated effort to be directed at secondary analysis of motion event footage rather than at motion detection itself. Second, metrics are produced which can be used to demonstrate confidence in the completeness of the captured record, that is to say, to demonstrate to a court that all motion events have been captured, but without excessive false-alarm overheads. Third, once the efficacy of the proposed approach is accepted by the court, the option exists to implement the algorithm at the camera, reducing storage and/or transmission overheads without compromising the completeness of the video record.
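The storage saving comes from the capture policy proposed in the earlier work: full frame rate, with lead-in and lead-out, around detected motion, and a low frame rate otherwise. A minimal sketch of that selection is given below; the function name, the parameter names and their default values are hypothetical.

```python
def frames_to_keep(motion_flags, lead_in=2, lead_out=2, idle_step=5):
    """Indices of frames to retain, given one boolean motion flag per frame.

    Quiet periods are sampled every `idle_step` frames; every frame from
    `lead_in` frames before to `lead_out` frames after a motion frame is kept.
    """
    n = len(motion_flags)
    keep = set(range(0, n, idle_step))  # low frame-rate record of quiet periods
    for i, moving in enumerate(motion_flags):
        if moving:
            keep.update(range(max(0, i - lead_in), min(n, i + lead_out + 1)))
    return sorted(keep)


flags = [False] * 20
flags[8] = flags[9] = True
print(frames_to_keep(flags))
```

Note that a lead-in requires the most recent frames to be buffered, so an in-camera implementation would need a short ring buffer of at least `lead_in` full-rate frames.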

Motion analysis has been considered within the realm of computer vision and image understanding for some years (Ullman, 1979; Huang & Lee, 1988; Maybank, 1993). Almost all of the research in this area has assumed that the video frames contain some form of motion, and the focus has been the classification of the movement into matters of interest (Konrad, 2000; Duque et al., 2006; Hu et al., 2004). While these approaches are useful in general, we are only concerned in this case with determining whether video footage contains motion, and if so, providing an estimate of the size and location of the region where such motion occurs. This is a critical question because an effective solution can significantly reduce the computational complexity of subsequent analysis, and can substantially reduce the storage or transmission capacity required. Although there is significant research effort in the topic of motion analysis, the literature has not addressed the question of whether reliable motion detection can be accomplished using only low-level features of motion, and if so, how it can be achieved and which features of the frame to use.
