Warehousing and Mining Streams of Mobile Object Observations

S. Orlando, A. Raffaetà, A. Roncato, C. Silvestri
DOI: 10.4018/978-1-60566-328-9.ch004

Abstract

In this chapter, the authors discuss how data warehousing technology can be used to store aggregate information about trajectories of mobile objects, and to perform OLAP operations over them. To this end, the authors define a data cube with spatial and temporal dimensions, discretized according to a hierarchy of regular grids. This chapter analyses some measures of interest related to trajectories, such as the number of distinct trajectories in a cell or starting from a cell, the distance covered by the trajectories in a cell, the average and maximum speed and the average acceleration of the trajectories in the cell, and the frequent patterns obtained by a data mining process on trajectories. The authors focus on some specialised algorithms to transform data, and load the measures in the base cells. Such stored values are used, along with suitable aggregate functions, to compute the roll-up operations. The main issues derive, in this case, from the characteristics of input data (i.e., trajectory observations of mobile objects), which are usually produced at different rates, and arrive in streams in an unpredictable and unbounded way. Finally, the authors also discuss some use cases that would benefit from such a framework, in particular in the domain of supervision systems to monitor road traffic (or movements of individuals) in a given geographical area.

Introduction

The widespread diffusion of modern technologies such as low-cost sensors and wireless, ubiquitous, location-aware mobile devices makes it possible to collect overwhelming amounts of data about trajectories of moving objects. Such data are usually produced at different rates, and arrive in streams in an unpredictable and unbounded way. This opens new opportunities for monitoring and decision-making applications in a variety of domains, such as traffic control management and location-based services. However, for these applications to become reality, new technical advances in spatial information management are needed. Typically, analytical and reasoning processes over large data sets require, as a starting point, that the data be organised in repositories, or data warehouses (DWs), from which they can be extracted with powerful operators and further elaborated by means of sophisticated algorithms.

In this chapter we define a Trajectory DW (TDW) model for storing aggregates about trajectories, implementable using off-the-shelf DW systems. More specifically, it is a data cube with spatial and temporal dimensions, discretized according to a hierarchy of regular grids. The model abstracts from the identifiers of the objects in favour of aggregate information concerning global properties of a set of moving objects, such as the distance travelled by these objects inside an area, or their average speed or acceleration, or spatial patterns co-visited by many trajectories.
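As an illustration of the cube just described, the following minimal sketch maps observations to base cells of a regular grid and rolls them up to a coarser level. It is not the chapter's implementation: the function names, the `(object_id, x, y, t)` tuple layout, the cell sizes and the choice of observation count as the measure are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical layout: each observation is (object_id, x, y, t).
# Cell sizes are illustrative values, not taken from the chapter.

def base_cell(x, y, t, cell_size=1.0, time_step=60.0):
    """Map an observation to the (ix, iy, it) index of its base cell."""
    return (int(x // cell_size), int(y // cell_size), int(t // time_step))

def load_counts(observations, cell_size=1.0, time_step=60.0):
    """Count observations per base cell; object identifiers are discarded."""
    cube = defaultdict(int)
    for _obj_id, x, y, t in observations:
        cube[base_cell(x, y, t, cell_size, time_step)] += 1
    return dict(cube)

def roll_up(cube, factor=2):
    """Merge blocks of factor x factor x factor cells by summing their counts."""
    coarse = defaultdict(int)
    for (ix, iy, it), count in cube.items():
        coarse[(ix // factor, iy // factor, it // factor)] += count
    return dict(coarse)

observations = [
    ("a", 0.5, 0.5, 10.0),
    ("a", 1.5, 0.5, 70.0),
    ("b", 0.2, 0.9, 20.0),
    ("b", 3.1, 2.2, 130.0),
]
base = load_counts(observations)
coarse = roll_up(base)
print(base)    # counts per base cell
print(coarse)  # counts per cell one level up in the grid hierarchy
```

Summation works here because an observation count is a distributive measure. A count of distinct trajectories could not be rolled up this way, since the same trajectory may cross several base cells and would be counted more than once.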

There are good reasons for storing only this aggregate information: in some cases personal data should not be stored due to legal or privacy issues; individual data may be irrelevant or unavailable; and individual data may be highly volatile and involve huge space requirements. In addition, current spatio-temporal applications are much more interested in aggregates than in information about individual objects (Tao & Papadias, 2005). For example, traffic supervision systems usually monitor the number of cars in an area of interest rather than their ids. Similarly, mobile phone companies can exploit the number of phone calls per cell in order to identify trends and prevent potential network congestion.

Note that a different solution, alternative to our TDW, could be based on the exploitation of Moving Object Databases (MODs) (Güting & Schneider, 2005), which extend database technologies for modelling, indexing and querying raw trajectories. The first main drawback of this MOD-based solution concerns space complexity and privacy: we need to store and maintain huge amounts of raw data and individual information. The other drawback is the time complexity of computing a spatio-temporal window aggregate query (which usually specifies a spatial rectangle, a time interval, and an aggregate function to compute): we first need to perform an expensive step to extract from the MOD all the relevant trajectory segments, and then compute the requested aggregate function on them.
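The two-step evaluation of a window aggregate query over raw data can be sketched as follows. This is a simplified illustration under stated assumptions: it filters point observations with a linear scan rather than extracting trajectory segments through a MOD index, and the names `window_aggregate` and `distinct_objects` and the `(object_id, x, y, t)` tuple layout are hypothetical.

```python
# Hypothetical layout: each raw observation is (object_id, x, y, t).

def window_aggregate(observations, rect, interval, aggregate):
    """Evaluate an aggregate over the observations inside a spatio-temporal window."""
    x_min, y_min, x_max, y_max = rect
    t_start, t_end = interval
    # Step 1: extract the relevant raw data (a full scan in this sketch).
    selected = [
        (obj_id, x, y, t)
        for obj_id, x, y, t in observations
        if x_min <= x <= x_max and y_min <= y <= y_max and t_start <= t <= t_end
    ]
    # Step 2: compute the requested aggregate function on the extracted data.
    return aggregate(selected)

def distinct_objects(selected):
    """Number of different objects observed; this needs the individual ids."""
    return len({obj_id for obj_id, _x, _y, _t in selected})

raw = [
    ("a", 0.5, 0.5, 10.0),
    ("a", 1.5, 0.5, 70.0),
    ("b", 0.2, 0.9, 20.0),
    ("b", 3.1, 2.2, 130.0),
]
n_objects = window_aggregate(raw, (0.0, 0.0, 2.0, 1.0), (0.0, 100.0), distinct_objects)
n_points = window_aggregate(raw, (0.0, 0.0, 2.0, 1.0), (0.0, 100.0), len)
print(n_objects, n_points)
```

Both drawbacks are visible in the sketch: the raw list keeps object identifiers, and every query pays for the extraction step before any aggregation takes place.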

Concerning the TDW measures about trajectory data, in this chapter we go beyond numerical ones (Orlando, Orsini, Raffaetà, Roncato & Silvestri, 2007). We are also interested in aggregate properties obtained through a knowledge discovery process from the raw data. In particular, we focus on the knowledge extracted by a Spatio-Temporal Frequent Pattern Mining (ST-FPM) tool. Apart from the transformation phase of ST raw data, the above problem can be reduced to the well-known Frequent Itemset Mining (FIM) problem (Agrawal & Srikant, 1994). This implies that the trajectory properties we store in and retrieve from the TDW are sets of spatial regions which are frequently visited, in any order, by a number of trajectories exceeding a certain threshold. The extraction of frequent patterns is a time-consuming task. Similarly to the way we process and load the other measures, as soon as data arrive, we transform data, extract patterns, and load the base cells of our TDW with the mined patterns. Such partial aggregations stored in the base cells can be aggregated in order to answer roll-up queries about patterns occurring in larger ST cells.
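The reduction to FIM can be illustrated with a small sketch: each trajectory is transformed into the set of grid regions it visits (a transaction), and a level-wise, Apriori-style search returns the region sets whose support reaches a threshold. This is a generic textbook-style procedure written for this example, not the ST-FPM tool used in the chapter; the function names, the grid cell size and the sample trajectories are assumptions.

```python
from itertools import combinations

def visited_regions(points, cell_size=1.0):
    """Transformation step: the set of grid regions touched by a trajectory."""
    return frozenset((int(x // cell_size), int(y // cell_size)) for x, y in points)

def frequent_region_sets(transactions, min_support):
    """Return {region_set: support} for every set visited by >= min_support trajectories."""
    items = sorted({region for t in transactions for region in t})
    candidates = [frozenset([region]) for region in items]
    frequent = {}
    k = 1
    while candidates:
        # Support counting: a candidate is supported by every transaction containing it.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent sets that contain exactly k regions.
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent

trajectories = [
    [(0.5, 0.5), (1.5, 0.5), (2.5, 0.5)],
    [(0.2, 0.1), (1.1, 0.9)],
    [(1.7, 0.3), (2.2, 0.8)],
    [(0.9, 0.9), (1.0, 0.0), (2.9, 0.1)],
]
transactions = [visited_regions(points) for points in trajectories]
patterns = frequent_region_sets(transactions, min_support=3)
for regions, support in patterns.items():
    print(sorted(regions), support)
```

Because regions are stored as sets, the order of visits is ignored, matching the "in any order" notion of pattern described above. The sketch covers mining within a single cell only; combining patterns across cells for roll-up queries is not shown here.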
