Introduction
The ever-increasing popularity of mobile devices, coupled with the advances in wireless technologies and global positioning systems, has helped to create an environment where data access truly is anywhere at any time. The wireless communication market has grown by leaps and bounds, and these new technologies bring a flood of location-dependent applications where it is not only desirable, but often critical, to provide real-time query results to the respective users (Cai, Hua, Cao, & Xu, 2006). With the worldwide proliferation of GPS-enabled devices and the staggering growth of smart phones (Chang et al., 2009), it is only a matter of time until these applications, often referred to as Location Based Services (LBS), truly become ubiquitous.
Traditionally, mobile object databases augment the standard database model of persistent data storage and complex querying by adding new models and index structures geared to store, track, and process the locations of moving objects efficiently (Guttman, 1984; Beckmann et al., 1990; Gedik & Liu, 2006). R-trees (Guttman, 1984) have been the most popular mechanism for spatial indexing, and many variants have been proposed, including the R*-tree (Beckmann et al., 1990), X-tree (Berchtold et al., 1996), Lazy Update R-tree (Kwon et al., 2002), and a plethora of other suggestions. Additionally, there is a large body of research focused on reducing the computational burden of continuously monitoring and evaluating real-time queries over these mobile objects. Such works include MQM (Cai, Hua, Cao, & Xu, 2006), MobiEyes (Gedik & Liu, 2006), Domino (Wolfson et al., 1999), and CAT (Trajcevski et al., 2004), to name a few. While these models and structures did initially extend the research in this area, the past few years have witnessed the emergence of a new class of data-intensive applications that often require the continuous processing of potentially unbounded sequences of transient data, called data streams. Examples include financial tickers, internet traffic, sensor data, and transaction logs. The high arrival rates of these spatio-temporal data streams, coupled with their massive data sizes, make it infeasible for traditional DBMS techniques to store, query, or index these streams and therefore dictate the need for better solutions.
In simplest terms, a data stream can be defined as “a sequence of characters or bits that is too large to be viewed in its entirety” (Hartzman & Watters, 1990). Several works have convincingly argued that the two research fields of spatio-temporal data streams and the management of moving objects can naturally come together (Chandrasekaran & Franklin, 2003; Ghanem et al., 2007; Mokbel et al., 2004). For example, the output of a GPS receiver monitoring the position of a mobile object can be viewed as a data stream of location updates. This stream, along with those from the potentially many other mobile objects, is received at a centralized server, which processes the streams upon arrival, effectively updating the answers to the currently active queries in real time. From this model, it becomes clear that additional applications could benefit from modeling location updates as streaming data, including, but not limited to, network traffic, time series data, telephone records, weather data, and web click streams.
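The server-side model described above can be illustrated with a minimal sketch. It assumes rectangular range queries and a naive linear scan over all active queries for each update; the names `LocationUpdate`, `RangeQuery`, and `process_stream` are hypothetical and chosen only for illustration. It is not the method of any of the cited systems, which rely on indexing and other optimizations to avoid exactly this kind of exhaustive scan.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Set, Tuple


@dataclass(frozen=True)
class LocationUpdate:
    """One element of the stream: an object's reported position at a time."""
    object_id: str
    x: float
    y: float
    timestamp: float


@dataclass
class RangeQuery:
    """A continuous query over a fixed rectangle (an assumed query type)."""
    query_id: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    members: Set[str] = field(default_factory=set)  # current answer set

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def process_stream(
    updates: Iterable[LocationUpdate], queries: List[RangeQuery]
) -> Iterator[Tuple[str, str, str]]:
    """Consume updates one at a time and yield (query_id, object_id, change)
    whenever an object enters ("+") or leaves ("-") a query's answer set.
    Updates are never stored; only the current answer sets are kept."""
    for update in updates:
        for query in queries:  # naive scan; real systems would use an index
            inside = query.contains(update.x, update.y)
            if inside and update.object_id not in query.members:
                query.members.add(update.object_id)
                yield (query.query_id, update.object_id, "+")
            elif not inside and update.object_id in query.members:
                query.members.discard(update.object_id)
                yield (query.query_id, update.object_id, "-")


# Usage: one active query and three location updates from two objects.
queries = [RangeQuery("downtown", 0.0, 0.0, 10.0, 10.0)]
stream = [
    LocationUpdate("car1", 5.0, 5.0, 1.0),    # car1 enters the region
    LocationUpdate("car2", 20.0, 20.0, 1.5),  # car2 is outside: no change
    LocationUpdate("car1", 12.0, 5.0, 2.0),   # car1 leaves the region
]
changes = list(process_stream(stream, queries))
print(changes)  # [('downtown', 'car1', '+'), ('downtown', 'car1', '-')]
```

Because `process_stream` is a generator, it works equally well on an unbounded source of updates: each update is handled on arrival and then discarded, which mirrors the transient nature of stream data described above.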