Towards Big GeoData Mining and Processing

Aroua Boulaaba (ENSI, Tunis, Tunisia) and Sami Faiz (ISAMM, Megrine, Tunisia)
DOI: 10.4018/IJOCI.2018040104

Abstract

The proliferation of advanced technologies and the rapid adoption of connected and mobile devices have driven a tremendous growth of geographic data, giving rise to the concept of Big GeoData. Extracting useful patterns from such data is a valuable process, but the specific features of spatial data and the complexity of relationships among spatial objects outpace the capacities of current systems. The only way to tackle the challenges raised by Big GeoData is to develop new platforms and systems. To discuss this topic, this article begins with an introduction to Big GeoData and spatial data mining tasks, followed by a review of work on Big GeoData analytics and its open issues. Finally, the authors propose an approach for developing a tool for analyzing Big GeoData. This tool will encompass a wide range of algorithms and techniques to carry out most data mining tasks.
1. Introduction

The commonly cited claim that "80% of data is geographic" (Morais, 2012) has been the subject of serious debate among researchers. While none of them dispute the significance of this proportion, all discussions confirm the spectacular increase in spatial data acquisition.

Geospatial data describe features of geographic objects, such as their orientation and position in the real world, and must also capture the spatial relations among neighboring elements.
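As a minimal illustration (not from the article) of computing one such spatial relation, the classic ray-casting test determines whether a point lies inside a polygon, a building block of many spatial queries:

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: count crossings of a horizontal ray from
    `point` with the polygon's edges; an odd count means inside."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the ray's y-coordinate?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses the horizontal line at y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon((2, 2), square))  # True: inside
print(point_in_polygon((5, 2), square))  # False: outside
```

Production geospatial systems delegate such predicates to dedicated libraries and spatial indexes, but the sketch shows the kind of geometric relation Big GeoData systems must evaluate at scale.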

The term Big GeoData has come to express the deluge of such specific data, with its varied formats and great complexity. Lying at the intersection of the Geomatics and Big Data disciplines, Big GeoData has become a hot topic attracting considerable attention.

The term designates large quantities of geographic data continuously generated from numerous sources, which must be analyzed and processed with short delays. Volume, Velocity and Variety were first proposed by Laney (2001) to introduce the concept of Big Data. Since these 3Vs were not sufficient to characterize the emerging concept, further dimensions such as Veracity, Variability and others have been added.

The most fundamental characteristics, which help clarify the context of Big GeoData, may be described as follows (Li et al., 2016):

  • Volume: Spatial data volumes have risen sharply and are increasing at a rate of roughly 20% per year. These large amounts of data come from photogrammetry and a variety of advanced technologies and devices, such as modern sensor networks and artificial satellites, which produce several GB or TB every hour. This growing volume exceeds the capacity of current tools, which cannot handle such massive datasets, and raises challenges in storage, analysis and visualization.

  • Variety: A wide variety of sources generate different formats of spatial data, including raster and vector data, geo-tagged text, maps, images and many other forms. Most of the data has a complex structure and may be structured, semi-structured or unstructured.

  • Velocity: Velocity involves not only the speed of spatial data generation, such as modern sensors continuously producing streams of data, but also the speed of processing required to deliver useful information in real time.

  • Veracity: Veracity is a key feature of Big GeoData and concerns the provenance of the data to be processed. In fact, despite the rapid growth of spatial data, its value density is low, owing to the vast quantity of junk and polluted spatial data coming from unverified sources. To ensure the quality of analysis results and avoid incorrect conclusions, it is necessary to work with reliable, verified data.

  • Variability: Concerns variations in spatial data structures and semantics, which constantly change the meaning of the data.

  • Value: An assessment of the results of spatial data analysis; it refers to the value added after the data has been processed.
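The Velocity dimension above can be made concrete with a small sketch (the sensor readings and window length are hypothetical, not from the article): a streaming pipeline that keeps only recent readings in a rolling time window, so that a summary is always available in near real time rather than after batch processing:

```python
from collections import deque

class RollingWindow:
    """Keep only the readings from the last `window_s` seconds
    and expose an up-to-date average, as a stream processor might."""
    def __init__(self, window_s):
        self.window_s = window_s
        self.buf = deque()  # (timestamp, value) pairs, oldest first

    def add(self, ts, value):
        self.buf.append((ts, value))
        # Evict readings that have fallen out of the time window.
        while self.buf and self.buf[0][0] < ts - self.window_s:
            self.buf.popleft()

    def mean(self):
        return sum(v for _, v in self.buf) / len(self.buf)

# Simulated sensor stream: (timestamp in seconds, temperature)
w = RollingWindow(window_s=10)
for ts, temp in [(0, 20.0), (4, 22.0), (8, 24.0), (15, 30.0)]:
    w.add(ts, temp)
print(w.mean())  # 27.0: only the readings at t=8 and t=15 remain
```

Real Big GeoData platforms distribute this kind of windowed computation across many nodes, but the eviction-on-arrival pattern is the same idea at its smallest scale.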

The enormous growth of geographic data in size, complexity, variety and velocity presents challenges in capture, storage, management, analysis and visualization (Kasemsap, 2017). Traditional tools and systems cannot cope with the new challenges brought by Big GeoData because they were not designed for such massive and complex datasets. Thus, high-performance tools are required and new technologies must be developed (Niebles, Wang, & Fei-Fei, 2008).

The major issue moving forward is how to analyze abundant spatial data exhibiting all of these characteristics.
