Improving Geospatial Big Data Analytics Approaches: A Focus on High-Velocity Data Streams

Sana Rekik
Copyright: © 2021 | Pages: 9
DOI: 10.4018/978-1-7998-1954-7.ch005

Abstract

The advent of geospatial big data has led to a paradigm shift in which most related applications have become data driven, and therefore intensive in both data and computation. This shift has reached most domains, notably real-time systems such as web search engines, social networks, and tracking systems. Such systems are tied to the high-velocity feature, which characterizes dynamic, fast-changing, and moving data streams. Consequently, query response time and space complexity are among the core requirements of data stream analysis systems, and both still call for improvement through sophisticated algorithms. In this vein, this chapter discusses new approaches that can reduce complexity and cost in time and space while improving the efficiency and quality of responses in geospatial big data stream analysis, so as to efficiently detect changes over time, draw conclusions, and predict future events.
Chapter Preview

Introduction

Since its emergence, big data has established itself as an important multidisciplinary phenomenon spanning various disciplines and areas of research, including parallel and distributed systems, communication networks, and data analysis. As a research field, big data has affected most existing domains, including health, industry, environment, and transportation (Lytras, 2017). Informally, big data is defined as a massive amount (volume) of data that exceeds the capacity of traditional storage. More formally, however, it is defined through a set of characteristics known as the big data Vs, of which volume is just one. These Vs define big data models that range from 3Vs to 7Vs (Uddin, 2014), and the range has recently been extended to 10Vs models (Khan, 2018). The Vs are, respectively, Volume of data; Velocity, the high speed at which data is produced and generated; Variety, the heterogeneity of data; Value, the usefulness of the data; and Veracity, the quality of data (Storey, 2017). The other Vs are Viability, Validity, Variability, Volatility, and Viscosity, which are explained further in (Khan, 2018).

Owing to its variety, big data includes different types of data. Geospatial (geo) data, which usually refers to geographical location, is an important one. Apart from traditional capture tools, geospatial data can be generated by geo-located sensors, Internet of Things objects, tracking systems, and volunteered geographic information (VGI) sources such as social networks (Li, 2016). Geospatial data is used in a wide range of applications, such as supply chain management (Irizarry, 2013) and traffic and health monitoring services (Griffin, 2015). This has led to the appearance of a specific sub-domain: geospatial big data. The latter has demonstrated significant support for real-time systems through the application of several analytics approaches (Lee, 2015). Researchers have therefore opened new opportunities to further leverage geospatial big data in this context. On the other hand, given the complexity of geospatial data, related operations are becoming increasingly complex and costly. Traditional geospatial tools, whether for storage, processing, analysis, or visualization, have therefore reached their limits in the face of these constraints (Robinson, 2017). Consequently, a need has emerged to migrate to new geospatial big data tools and techniques.

Data-intensive applications have emerged in recent years and now appear frequently in various fields, given the wide availability of data and of computation techniques (Chen, 2014). This evolution has required studying the different challenges of these applications and, more generally, of the big data phenomenon.

As already mentioned, high velocity is among the characteristics (Vs) of geospatial big data. It is a significant factor for stream-based applications, which require real-time responses and therefore low latency in data stream processing and analysis. Processing real-time data thus imposes more requirements than other kinds of processing, since the time constraint and the speed of the response are a priority in such applications. Streams whose data change frequently over time also require sophisticated algorithms to identify these changes. In this context, research has to be directed toward new strategies and methods for coping with scaling problems. The principal aim is to find a compromise that avoids penalties in terms of cost and complexity. Thus, the approaches proposed for geospatial big data stream analysis have to address time and space complexity and cost while ensuring efficiency, accuracy, and quality of responses. Accordingly, this chapter investigates state-of-the-art research efforts directed toward handling the high velocity of geospatial big data, with reference to improved and effective analytics approaches that can lead to efficient decision making. The first part of this chapter presents an overview of stream data analysis techniques and frameworks applied in the context of big data generally and of geospatial big data specifically. The second part discusses the performance of existing approaches and their related challenges under high-velocity constraints. Possible solutions and recommendations for the cited issues are then described in the third part. Finally, open issues and future research directions are presented.
