Towards Ontology-Based Realtime Behaviour Interpretation

Wilfried Bohlken (University of Hamburg, Germany), Patrick Koopmann (University of Hamburg, Germany), Lothar Hotz (University of Hamburg, Germany) and Bernd Neumann (University of Hamburg, Germany)
DOI: 10.4018/978-1-4666-3682-8.ch003

Abstract

The authors describe a generic framework for model-based behaviour interpretation and its application to monitoring aircraft service activities. Behaviour models are represented in a standardised conceptual knowledge base using OWL-DL for concept definitions and the extension SWRL for constraints. The conceptual knowledge base is automatically converted into an operational scene interpretation system implemented in Java and JESS that accepts tracked objects as input and delivers high-level activity descriptions as output. The interpretation process employs Beam Search for exploring the interpretation space, guided by a probabilistic rating system. The probabilistic model cannot be efficiently represented in the ontology, but it has been designed to closely correspond to the compositional hierarchy of behaviour concepts. Experiments are described that demonstrate the system performance with real airport data.

1. Introduction

This chapter is about realtime monitoring of object behaviour in aircraft servicing scenes, such as arrival preparation, unloading, tanking, and others, based on video streams from several cameras. The focus is on high-level interpretation of object tracks extracted from the video data. The term “high-level interpretation” denotes meaning assignment above the level of individually recognised objects, typically involving temporal and spatial relations between several objects and qualitative behaviour descriptions corresponding to concepts used by humans. We prefer to use the term “scene interpretation” in order to avoid reference to a particular level structure. Scene interpretation is understood to include the recognition of multi-object structures (e.g. the facade of a building) as well as the recognition of activities and occurrences (e.g. criminal acts). Regarding its scope, scene recognition can be compared to silent-movie understanding.

For aircraft servicing, scene interpretation aims to recognise the various servicing activities at the apron position of an aircraft, beginning with arrival preparation, passenger disembarking via a passenger bridge, unloading and loading operations involving several kinds of vehicles, refuelling, catering, and other activities. Real-time monitoring may serve several purposes. First, delays in performing scheduled activities can be noticed and counteracted early. Secondly, predictions about the completion of a turnaround can be provided, facilitating planning. Thirdly, monitoring of service activities can be extended to include unrelated object behaviour, e.g. of vehicles not allowed in the proximity of the aircraft.

Our approach aims at developing a largely domain-independent scene interpretation framework, designed to be adaptable to changes and to be reusable for other applications. In fact, our basic approach of structuring the conceptual knowledge base in terms of compositional hierarchies and guiding the interpretation process accordingly has been used in other domains (Hotz & Neumann, 2006; Hotz, Neumann & Terzic, 2008) and by other authors (Rimey, 1993; Fusier, Valentin, Brémond, Thonnat, Borg, Thirde & Ferryman, 2007; Mumford & Zhu, 2007). In this introductory section we briefly discuss major contributions of past research which have influenced our current understanding of scene interpretation and our design decisions for a framework. We also compare our approach with recent work on ontology-based scene interpretation.

Although scene interpretation has enjoyed much less attention in Computer Vision research than object recognition, there exists a considerable body of related work dating back to the 1970s. Badler (1975) was one of the first to derive high-level descriptions of simple traffic scenes represented by hand-drawn sketches for lack of computer-generated low-level data about real scenes. He used spatial relations between pairs of objects, corresponding to spatial adverbials, to describe a snapshot of a scene, and changes of these relational structures to describe the temporal development. A temporal concept such as “across-motion” would be recognised by rules defined in terms of preconditions and postconditions. His work showed that spatial predicates form the bridge from quantitative low-level data to qualitative high-level descriptions.
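To make the role of spatial predicates concrete, the following minimal sketch shows how a qualitative predicate can be computed from quantitative positions and then used in a precondition/postcondition rule for an “across-motion”. It is an illustration only and does not reproduce Badler's actual rules: the names `Position`, `Road`, `side_of` and `is_across_motion`, as well as the simplification of a road to a horizontal strip, are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    """Quantitative low-level datum: an object position at one time step."""
    x: float
    y: float


@dataclass(frozen=True)
class Road:
    """Hypothetical road, simplified to the horizontal strip y_min <= y <= y_max."""
    y_min: float
    y_max: float


def side_of(pos: Position, road: Road) -> str:
    """Qualitative spatial predicate: maps coordinates to 'below', 'on' or 'above'."""
    if pos.y < road.y_min:
        return "below"
    if pos.y > road.y_max:
        return "above"
    return "on"


def is_across_motion(track: list, road: Road) -> bool:
    """Rule sketch for an 'across-motion'.

    Precondition:  the object starts on one side of the road.
    Postcondition: the object ends on the opposite side.
    In between, it must have been on the road at least once.
    """
    sides = [side_of(p, road) for p in track]
    if not sides:
        return False
    start, end = sides[0], sides[-1]
    return {start, end} == {"below", "above"} and "on" in sides
```

Only `side_of` touches coordinates; the rule itself is phrased entirely in qualitative terms, which is the bridging function attributed to spatial predicates above.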

A first systematic approach to motion analysis is due to Tsotsos, Mylopoulos, Covvey, and Zucker (1980), who introduced a taxonomy of motion types. Specific high-level motion events (in this case pathological human-heart motions) were described as a composition of elementary motions, thus also establishing a compositional hierarchy.

Structuring motion by taxonomical and compositional hierarchies, reflecting the logical structure of natural language terms, has played a significant part in most approaches to modelling scene interpretation, including the early work of Neumann (1989) on natural-language description of traffic scenes. One of his additional achievements was the separation of behaviour models from control structures for behaviour recognition. Occurrences in street traffic were modelled as declarative aggregates enabling both bottom-up and top-down recognition. Temporal relations between occurrences were modelled by constraints and processed by a constraint system. As in earlier work, low-level image analysis had to be bypassed by manually providing input data in terms of a “Geometric Scene Description” (GSD) consisting of typed objects and quantitative object trajectories.
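The separation of declarative behaviour models from recognition control can be sketched as follows. This is a hedged illustration under simplifying assumptions, not Neumann's system: the class names, the two temporal relations, and the brute-force matching procedure are all hypothetical, and only bottom-up recognition is shown. The aggregate model is pure data (parts plus temporal constraints), while `recognise` is a generic procedure that works for any such model.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Occurrence:
    """An observed occurrence with a type and a time interval."""
    kind: str
    start: float
    end: float


def before(a: Occurrence, b: Occurrence) -> bool:
    """a ends no later than b starts."""
    return a.end <= b.start


def during(a: Occurrence, b: Occurrence) -> bool:
    """a lies within the interval of b."""
    return b.start <= a.start and a.end <= b.end


# Temporal relations available to models (an assumed, minimal set).
RELATIONS = {"before": before, "during": during}


@dataclass(frozen=True)
class AggregateModel:
    """Declarative aggregate: named parts and temporal constraints between them."""
    name: str
    parts: tuple        # pairs (role, required occurrence kind)
    constraints: tuple  # triples (relation name, role_a, role_b)


def recognise(model: AggregateModel, evidence: list):
    """Bottom-up recognition by exhaustive matching.

    Tries every assignment of observed occurrences to the model's roles and
    returns the first binding that satisfies all constraints, else None.
    """
    roles = [role for role, _ in model.parts]
    candidates = [[o for o in evidence if o.kind == kind] for _, kind in model.parts]
    for combo in product(*candidates):
        binding = dict(zip(roles, combo))
        if all(RELATIONS[rel](binding[a], binding[b])
               for rel, a, b in model.constraints):
            return binding
    return None


# Hypothetical street-traffic aggregate: a vehicle slows down, then turns.
TURN_OFF = AggregateModel(
    name="turn-off",
    parts=(("slowing", "slow-down"), ("turning", "turn")),
    constraints=(("before", "slowing", "turning"),),
)
```

Because the model carries no control information, the same `TURN_OFF` description could in principle also drive top-down recognition, for example by generating expectations for a missing part once the other part has been observed. A dedicated constraint system would replace the exhaustive search used here.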
