Cross-Correlation Measure for Mining Spatio-Temporal Patterns

James Ma, Daniel Zeng, Huimin Zhao, Chunyang Liu
Copyright: © 2013 |Pages: 22
DOI: 10.4018/jdm.2013040102


Spatio-temporal data mining is finding applications in many domains, such as public health, public safety, financial fraud detection, transportation, and product lifecycle management. Correlation analysis is an important spatio-temporal mining technique for unveiling spatial and temporal relationships among multiple event types. This paper presents a new measure for assessing and analyzing spatio-temporal cross-correlations. This measure extends Ripley’s K function, a widely used measure of spatial correlation, with an additional temporal dimension. Empirical studies using real-world data show that the new measure can lead to a more discriminating and flexible spatio-temporal data analysis framework. In contrast with its predecessor, this measure also allows the discovery of leading (and potentially causal) event types whose occurrences precede those of other event types. Findings from analyses employing this measure may bear important managerial implications.

As a subfield of data mining (Fayyad & Uthurusamy, 1996; Rajagopalan & Krovi, 2002), spatio-temporal data mining studies the discovery of interesting, implicit relationships and characteristics from spatio-temporal data (Koperski, Han, & Adhikary, 1998; Yao, 2003). This field has been attracting significant research interest in recent years, driven by the increasing availability of large datasets containing important spatial and temporal elements across a wide spectrum of application domains. Some examples of such application domains include public health (disease case reports), public safety (crime case reports), financial fraud detection (financial transaction tracking data), transportation (data from Global Positioning Systems (GPS)), and product lifecycle management (data generated by Radio Frequency Identification (RFID) devices) (P. Yan & Zeng, 2008a, 2008b; Zeng, Ma, Chen, & Chang, 2009). Actionable knowledge discovered from data with spatial and temporal dimensions can provide decision makers with valuable insights and support in their decision making processes.

Current practice in spatio-temporal data analysis centers largely on the identification of “hotspots”, areas that exhibit exceptionally high or low measures on some characteristic, and on the timely discovery of significant changes in geographic areas (Chang, Zeng, & Chen, 2005; Kulldorff, 2001). While such analyses focus on a single event type, we study spatio-temporal relationships among multiple event types in this paper. We focus on two case studies in the domains of infectious disease informatics (Lu, Zeng, & Chen, 2010) and crime analysis (Chen, et al., 2003; Zhao, et al., 2006) for evaluation purposes. Our methods, however, are general and can be used to solve business problems, such as transport demand modeling (D. Wang & Cheng, 2001) and financial crime analysis (Masciandaro, 2004), where understanding the spatio-temporal relationships among multiple event types may bear important managerial implications.

Assessing and analyzing spatio-temporal cross-correlations among multiple data streams can unveil the relationships among the underlying event types. Correlation analysis has been applied mainly in such fields as forestry (Stoyan & Penttinen, 2000), acoustics (Tichy, 1973; Veit, 1976), entomology (Cappaert, Drummond, & Logan, 1991), and animal science (Lean, et al., 1992; Procknor, Dachir, Owens, Little, & Harms, 1986), where the analyses have focused on either time series or spatial data. However, in applications such as infectious disease informatics and crime analysis where both spatial and temporal dimensions are essential, considering only one of the dimensions at a time can be problematic. Important correlations may be missed due to aggregate effects on the overlooked dimension. Spurious, misleading correlations may be signaled if the directionality of time is ignored.

One widely adopted measure of correlation is Ripley’s K(r) function, which mainly focuses on spatial data (B. D. Ripley, 1976; B. D. Ripley, 1981). The parameter r characterizes the spatial distance scale under consideration. To analyze datasets with both spatial and temporal dimensions, we have extended Ripley’s K(r) with an additional temporal parameter t. The classical K(r) then becomes a special case of this new measure K(r, t), other than a scaling difference, when the temporal parameter t equals the entire time span under investigation and a two-tail time window is employed.
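The idea of adding a time window to a spatial cross-correlation count can be illustrated with a minimal sketch. This is not the definition given in the full article, which is not reproduced in this preview. It assumes a naive cross-K style estimate with no edge correction, a spatial distance threshold `r`, a time window `t`, and a normalization of `area / (n_i * n_j)`. The function name `cross_k`, its parameters, and the sample data are all hypothetical, chosen only for illustration.

```python
import math


def cross_k(events_i, events_j, r, t=None, two_tail=True, area=1.0):
    """Naive cross-K style estimate (illustrative; no edge correction).

    events_i, events_j: lists of (x, y, time) tuples for two event types.
    r: spatial distance threshold.
    t: time window; None ignores time (purely spatial count).
    two_tail: if True, type-j events within t before or after count;
              if False, only type-j events at or after the type-i event count.
    area: size of the study region (assumed normalization constant).
    """
    if not events_i or not events_j:
        raise ValueError("both event lists must be non-empty")
    count = 0
    for xi, yi, ti in events_i:
        for xj, yj, tj in events_j:
            # Skip pairs that are too far apart in space.
            if math.hypot(xj - xi, yj - yi) > r:
                continue
            lag = tj - ti  # positive when the type-j event occurs later
            if t is None:
                in_window = True
            elif two_tail:
                in_window = abs(lag) <= t
            else:
                in_window = 0 <= lag <= t
            if in_window:
                count += 1
    return area * count / (len(events_i) * len(events_j))


# Hypothetical sample data: (x, y, time) for two event types.
type_i = [(0.0, 0.0, 0.0), (1.0, 1.0, 5.0)]
type_j = [(0.1, 0.0, 1.0), (1.0, 1.2, 3.0), (5.0, 5.0, 2.0)]

spatial_only = cross_k(type_i, type_j, r=0.5)
two_tail_full = cross_k(type_i, type_j, r=0.5, t=10.0, two_tail=True)
one_tail = cross_k(type_i, type_j, r=0.5, t=2.0, two_tail=False)
```

In this sketch, a two-tail window that covers the whole time span counts the same pairs as the purely spatial estimate, mirroring the special-case relationship described above. The one-tail window drops pairs in which the type-j event precedes the type-i event, which is the kind of directionality a leading-event analysis would rely on.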
