This chapter introduces a cutting-edge research field of computer vision and image understanding: spatial-temporal behavior understanding. It gives an overview of the main concepts, the research focus, the typical techniques, and the rapid development of this new field in recent years. An important task in computer vision and image understanding is to analyze a scene by processing images of it, in order to guide action. To do this, one needs to locate the objects in the scene and to determine how their positions, attitudes, speeds, and spatial relationships change over time. In short, the goal is to grasp actions in time and space, to determine the purpose of those actions, and thus to understand the semantics of the information they convey. This is referred to as the understanding of spatial-temporal behaviors.
Background
Research efforts and results on this topic have appeared only in recent years; some statistics can be drawn from the survey on image engineering (Zhang, 2015c). The annual survey series of yearly bibliographies on image engineering started in 1995 and has been carried out for 21 years (Zhang, 2016). When the series entered its second decade (with the literature statistics for 2005), new hot spots had appeared in image engineering research and application, and a new subcategory (C5), spatial-temporal technology (including 3-D motion analysis, gesture and posture detection, object tracking, and behavior judgment and understanding), was added to the image understanding category (C) (Zhang, 2006). The emphasis here is on the comprehensive use of the variety of information contained in images and video in order to interpret the dynamics of the scene and the objects in it.
In the past eleven years, the number of publications belonging to subcategory C5 in the annual survey series has reached a total of 153. Category C has five subcategories, and the total number of publications belonging to category C in these eleven years is 1352, so C5 accounts for only about 11% of the category and is still a small subcategory. The distribution over the years is shown by the bars in Figure 1, in which a third-order polynomial curve fitted to the yearly publication counts is drawn to show the trend. Overall, this is still a relatively new field of research, and its development is not yet stable.
Figure 1. Statistics on the number of publications for spatial-temporal technology
Main Focus Of The Article
The definition, development, and stratification of spatial-temporal technology are provided first. Then, moving from low level to high level, the following are introduced in turn: the detection of points of interest, the formation of dynamic trajectories and activity paths, example techniques for action classification and recognition, and the modeling of actions and activities. Several further research directions are discussed before the concluding remarks.