Iterative MapReduce: i-MapReduce on Medical Dataset Using Hadoop

Utkarsh Srivastava (VIT University, India) and Ramanathan L. (VIT University, India)
DOI: 10.4018/978-1-5225-3643-7.ch007

Abstract

Diabetes mellitus has become a major public health problem in India. Recent statistics indicate that 63 million people in India have diabetes, and this figure is likely to rise to 80 million by 2025. Given the rise of big data as a socio-technical phenomenon, analyzing big data and handling it raise a number of complications. This chapter examines Hadoop, an open source framework that permits the distributed processing of large datasets across clusters of computers, and shows how deploying Iterative MapReduce on it produces better results. The goal of the chapter is to analyze and bring out the improved performance of data analysis in a distributed environment. Iterative MapReduce (i-MapReduce) plays a major role in optimizing analytics performance. The implementation is done on Cloudera Hadoop installed on top of the Hortonworks Data Platform (HDP) Sandbox.

Background

Big Data is like ordinary data, but enormous in size. The term is generally used to describe a collection of data that is already very large and is still growing exponentially with time. In short, a collection of data so large that it is difficult to handle with traditional databases and other management tools is called ‘BigData’. Generic features of BigData are:

  • Volume

  • Velocity

  • Variability

  • Veracity

The main objective of this analysis is to find interesting patterns based on conditional dependence among the given attributes of a dataset. As the rate of data generation increases, analyzing the patterns in a dataset becomes very difficult. The situation becomes more critical when the dataset contains sequential patterns, i.e. when the order of dependency matters. Such behavior is common in everyday data, such as customer shopping behavior, medical symptoms that lead to a future disease in a patient, and financial stock market predictions. Pattern mining of BigData using Hadoop faces many issues in terms of data storage, data shuffling, data scanning, data processing units, etc.
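
To make the idea of order-dependent patterns concrete, the following is a minimal single-machine sketch in Python, not the chapter's Hadoop implementation. It imitates the map, shuffle, and reduce steps to count how many sequences contain one event before another. The toy records and all function names (`map_phase`, `shuffle`, `reduce_phase`, `ordered_pair_support`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: each record is one ordered sequence of events.
records = [
    ["thirst", "fatigue", "diabetes"],
    ["fatigue", "thirst", "diabetes"],
    ["thirst", "fatigue", "blurred_vision"],
]

def map_phase(sequence):
    """Emit ((earlier, later), 1) once per distinct ordered pair in a sequence."""
    seen = set()
    for i, j in combinations(range(len(sequence)), 2):
        pair = (sequence[i], sequence[j])  # i < j, so order is preserved
        if pair not in seen:
            seen.add(pair)
            yield pair, 1

def shuffle(pairs):
    """Group emitted values by key, as the shuffle step would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the counts for each ordered pair."""
    return {key: sum(values) for key, values in grouped.items()}

def ordered_pair_support(sequences):
    """Count, for each ordered pair, the number of sequences that contain it."""
    emitted = (kv for seq in sequences for kv in map_phase(seq))
    return reduce_phase(shuffle(emitted))

support = ordered_pair_support(records)
print(support[("thirst", "fatigue")])  # sequences where thirst precedes fatigue
print(support[("fatigue", "thirst")])  # sequences where fatigue precedes thirst
```

Because the key is an ordered pair, ("thirst", "fatigue") and ("fatigue", "thirst") are counted separately, which is the sense in which the order of dependency matters. On a real cluster the shuffle step is where the data movement cost mentioned above would arise.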

Figure 1. Implications of BigData
