BigGIS With Hadoop in MapReduce Environment: Towards an M2BG Framework

Nada M. Alhakkak (Baghdad College of Economic Sciences University, Iraq)
DOI: 10.4018/978-1-5225-9238-9.ch002

Abstract

BigGIS is a new product that resulted from developing GIS in the “Big Data” area; it is used for storing and processing big geographical data and helps in solving the associated issues. This chapter describes M2BG, an optimized Big GIS framework in the Map Reduce environment. The suggested framework has been integrated into the Map Reduce environment in order to solve storage issues and benefit from the Hadoop environment. M2BG includes two steps: Big GIS warehouse and Big GIS Map Reduce. The first step contains three main layers: Data Source and Storage Layer (DSSL), Data Processing Layer (DPL), and Data Analysis Layer (DAL). The second layer is responsible for clustering using swarms, which provides the inputs for the Hadoop phase. These inputs are then scheduled in the mapping part using a preemptive priority scheduling algorithm, in which some data types are classified as critical and others as ordinary; the reduce part uses the merge sort algorithm. M2BG should also address security and be implemented with real data in a simulated environment and later in the real world.

Background

The background of the present research covers some key terms that are widely and newly discussed as urban tools: Big GIS, warehousing, Hadoop, cloud computing with big geo-information, intelligent GIS services, and parallel GIS based on a Hadoop cluster.

Generally speaking, the term ‘urban analysis’ is related to collecting geographical information about cities (Elshater & Abusaada, 2016) and towns and analyzing it. GIS is one of the main tools used for storing the collected data in databases and analyzing it for better decision making. This improves the required work with big geospatial data generated from GIS applications (Yeh, 1999; Elshater, 2015).

Big GIS has been introduced in response to the vision of data holders in the GIS area for managing, processing, visualizing, and analyzing big data. Most traditional GIS software is limited in dealing with the big data challenges discussed earlier. As a result, Big GIS was introduced to extend traditional GIS software to handle these issues and challenges; the new term refers to managing and processing big geospatial data (Yue & Jiang, 2014). The rapid growth of all the data types used by today's applications requires more flexible data warehousing software, with a focus on factors such as data format and volume, the variety of data sources, the integration of unstructured data, and data analysis tools. As the field of big data changes quickly, it is important to solve the issues and challenges related to data storage and the associated warehouses, and to use more flexible tools such as Hadoop and Hive (Sebaa, Chick, Nouicer, & Tari, 2017).

Hadoop is an open-source, Java-based programming framework that supports managing big data in a distributed environment. It addresses failure issues by supporting clustering in a master-slave structure, as realized in the Hadoop Distributed File System (HDFS). Meanwhile, there are two foremost programming tools used with Hadoop, namely Map Reduce and Spark, each with its own benefits and limitations. This work uses Map Reduce, which covers three main functions: scheduling, monitoring, and re-executing failed tasks (Zhao, Chen, Ranjan, Choo, & He, 2016). The leading idea of Map Reduce can be summarized as follows:

  • Divide the data into individual blocks, which are processed by map jobs in parallel.

  • The framework sorts the outputs of the maps, which are then passed as input to the reduce tasks.
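
The two points above can be illustrated with a minimal, single-machine sketch in Python. It is not Hadoop code and not the chapter's M2BG implementation; it only imitates the split, map, sort, and reduce flow. The function names, the block size, and the sample records (region name plus a placeholder payload) are all assumptions made for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(records, block_size):
    """Divide the input into individual blocks of at most block_size records."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def map_block(block):
    """Map task: emit a (region, 1) pair for every record in one block.

    The (region, payload) record layout is hypothetical.
    """
    return [(region, 1) for region, _payload in block]


def shuffle_and_sort(mapped_blocks):
    """Group map outputs by key and sort by key, as the framework would."""
    grouped = defaultdict(list)
    for pairs in mapped_blocks:
        for key, value in pairs:
            grouped[key].append(value)
    return sorted(grouped.items())


def reduce_group(key, values):
    """Reduce task: count the records that share one key."""
    return key, sum(values)


def run_job(records, block_size=2):
    """Run the whole sketch: split, map in parallel, sort, then reduce."""
    blocks = split_into_blocks(records, block_size)
    # Threads stand in for the parallel map tasks of a real cluster.
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_block, blocks))
    return [reduce_group(key, values) for key, values in shuffle_and_sort(mapped)]
```

Calling `run_job` on a list of `(region, payload)` tuples returns one `(region, count)` pair per region, ordered by region name. In a real Hadoop job the blocks would be HDFS splits and the map and reduce tasks would run on different nodes; the thread pool here only stands in for that parallelism.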
