Enabling RDF Stream Processing for Sensor Data Management in the Environmental Domain

Alejandro Llaves (Ontology Engineering Group, Universidad Politécnica de Madrid, Madrid, Spain), Oscar Corcho (Ontology Engineering Group, Universidad Politécnica de Madrid, Madrid, Spain), Peter Taylor (Data61, Commonwealth Scientific and Industrial Research Organisation, Hobart, Australia) and Kerry Taylor (Australian National University, Acton, Australia)
Copyright: © 2016 | Pages: 21
DOI: 10.4018/IJSWIS.2016100101

Abstract

This paper presents a generic approach to integrating environmental sensor data efficiently, allowing the detection of relevant situations and events in near real time through continuous querying. Data variety is addressed by using the Semantic Sensor Network ontology for observation data modelling, together with semantic annotations for environmental phenomena. Data velocity is handled by distributing sensor data messaging and serving observations as RDF graphs on query demand. The stream processing engine presented in the paper, morph-streams++, provides adapters for different data formats and distributed processing of streams in a cluster. An evaluation of different configurations of parallelization and semantic annotation parameters shows that the described approach reduces the average latency of message processing in some cases.

Introduction

The demand for processing streams of data in real time has increased in recent years due to various factors, including the growing amount of sensor data made available online (Margara et al., 2014) and the number of applications that provide context-aware services using smartphones (e.g. Waze and RunKeeper). Furthermore, some government organizations and agencies have a long tradition of publishing sensor data on the Web (e.g. USGS water data), and companies have recently started doing so following open data initiatives (e.g. APIs for public transport data in the UK and Madrid). These are examples of resources that allow developers to build new applications upon dynamic datasets. Yet extracting meaningful information from streams of data is not trivial: it requires data integration procedures and processing systems that scale and remain robust under varying conditions in data sources, complex queries, and system failures.

The authors of this paper focus on data produced by sensors that are available online. The concept of Sensor Web (Delin & Jackson, 2001) refers to a network of interlinked sensing devices distributed in space, which is able to monitor uncertain environments. The Open Geospatial Consortium's (OGC) Sensor Web Enablement (SWE) provides a set of standards for managing online sensor networks and the data they produce (Botts et al., 2008). SWE's data models and service specifications address syntactic heterogeneities, but lack semantic descriptions. To solve this problem, the Semantic Sensor Web (Sheth et al., 2008) aims at providing a framework for the interoperable exchange and processing of sensor data by enriching observations with spatial, temporal, and thematic metadata.

Sensor data providers and consumers face several challenges driven by the data deluge (Corcho & Garcia-Castro, 2010): support for flexible querying (e.g. including spatio-temporal parameters), the need for on-the-fly aggregations, detection of relevant events and outliers, integration of heterogeneous data sources, and efficient management of system scalability. Making the collected sensor data available to consumers is also a data management task. Commonly, this task is solved by setting up a data access Web portal, by deploying OGC's Sensor Observation Services, or by providing an API. According to the Linked Data principles, the proper format to publish data on the Web is the Resource Description Framework (RDF). The W3C RDF Stream Processing (RSP) Community Group aims at defining a common model for producing, transmitting, and continuously querying data streams encoded in RDF. In this paper, the authors show how the conversion of sensor data (coming from the Sensor Cloud infrastructure) to RDF allows the ingestion of sensor data time series into RSP engines. For this purpose, the authors implemented morph-streams++, a distributed and parallelized version of morph-streams (Calbimonte et al., 2010; Calbimonte et al., 2012) that provides ontology-based data access to execute SPARQL-like queries over a range of data streaming systems. In previous work, the authors have discussed how to extend morph-streams in terms of scalability, adaptive query processing, and RDF stream compression (Llaves et al., 2014; Fernandez et al., 2014). In this paper, the focus is on the use of morph-streams++ for sensor data management in the environmental domain and, more concretely, on the pre-processing of environmental sensor data before it is ingested by the RSP engine.
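To make the idea of converting a sensor reading to RDF more concrete, the following is a minimal sketch in Python. It is not the mapping used by morph-streams++, which this preview does not describe. It assumes the namespace of the SSN ontology published by the W3C Semantic Sensor Network Incubator Group and uses its Observation, observedBy, and observedProperty terms. The example.org namespace, the value and time properties, and the structure of the input reading are hypothetical and serve for illustration only.

```python
from datetime import datetime, timezone

# Assumed namespace of the SSN ontology (W3C SSN Incubator Group version).
SSN = "http://purl.oclc.org/NET/ssnx/ssn#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"
# Hypothetical namespace for resources and for the value/time properties.
EX = "http://example.org/"


def iri(value):
    """Wrap an IRI string in N-Triples angle brackets."""
    return f"<{value}>"


def literal(value, datatype):
    """Build a typed N-Triples literal using an XSD datatype name."""
    return f'"{value}"^^<{XSD}{datatype}>'


def observation_to_ntriples(reading):
    """Turn one sensor reading (a dict) into a list of N-Triples lines.

    The identifiers in the reading are assumed to be IRI-safe strings.
    """
    obs = iri(f"{EX}observation/{reading['id']}")
    timestamp = datetime.fromtimestamp(reading["epoch_seconds"], tz=timezone.utc)
    triples = [
        (obs, iri(RDF_TYPE), iri(SSN + "Observation")),
        (obs, iri(SSN + "observedBy"), iri(f"{EX}sensor/{reading['sensor']}")),
        (obs, iri(SSN + "observedProperty"),
         iri(f"{EX}property/{reading['property']}")),
        # Hypothetical properties, used here instead of the full SSN result model.
        (obs, iri(EX + "value"), literal(reading["value"], "double")),
        (obs, iri(EX + "time"), literal(timestamp.isoformat(), "dateTime")),
    ]
    return [f"{s} {p} {o} ." for s, p, o in triples]


if __name__ == "__main__":
    sample = {
        "id": "o1",
        "sensor": "s1",
        "property": "air_temperature",
        "value": 21.5,
        "epoch_seconds": 0,
    }
    for line in observation_to_ntriples(sample):
        print(line)
```

Each reading becomes a small RDF graph whose subject is the observation. A stream of such graphs, each paired with its timestamp, is the kind of input an RSP engine could query continuously.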
