Big Data Analytics Tools and Platform in Big Data Landscape

Mohd Imran, Mohd Vasim Ahamad, Misbahul Haque, Mohd Shoaib
DOI: 10.4018/978-1-5225-3870-7.ch006

The term big data analytics refers to mining and analyzing the voluminous amounts of data in big data using various tools and platforms. Popular tools include Apache Hadoop, Apache Spark, HBase, Storm, GridGain, HPCC, Cassandra, Pig, Hive, and NoSQL databases. Which tool is used depends on the parameters considered for the big data analysis. A comparative analysis of these analytical tools is therefore needed to choose the best and simplest way of analysis, in order to gain better throughput and more efficient mining. This chapter contributes a comparative study of big data analytics tools based on aspects such as their functionality, pros, and cons, characteristics that can be used to determine the best and most efficient among them. The comparative study is intended to help people use such tools more efficiently.
Chapter Preview

Background And Main Focus

Big data analytics is a newly trending analytical approach used to search previously collected data, generated by numerous applications, for patterns that cannot be examined, processed, managed, and categorized by other existing tools or technologies (Yadav, Verma, and Kaushik, 2015). Hence, new technologies or tools must be adopted that can handle the vast datasets generated from commodity servers distributed across the globe. It is a technique for extracting useful, correlated information from massive datasets. Big data can be categorized as structured, semi-structured, or unstructured. The main objective of big data analytics is to mine the structured, unstructured, and unrelated information collected from large corporations, research institutions, and healthcare organizations, and to make it useful by managing, structuring, and controlling it. Taken together, big data analytics (BDA) is an information management tool that uncovers hidden patterns and correlations in vast big data sets to support decision making and optimized performance in large organizations. The main focus is to deploy big data analytics tools in different market sectors in order to obtain various patterns for market research.

Key Terms in this Chapter

Hive: A SQL-based framework that allows a programmer to use the MapReduce algorithm through a SQL-like query language.

OLAP: Online analytical processing, used in applications for analytical processing.

Structured Query Language (SQL): A programming language specifically designed for managing data sets in a relational database management system.

Opinion Mining: A method of collecting the opinions of different people on a website in documented format.

Hadoop: Open source software that stores and analyzes massive unstructured data sets.

Batch Processing: The processing of collected jobs of the same nature together as a batch.

HATS: An HIV and AIDS testing software developed by doctors for the diagnosis of symptoms. It diagnoses and provides a quick report on the patient who is tested, and the report can be shared on the internet.

Digital Image Processing and Communication Device: A standard procedure for integrating imaging devices such as scanners, printers, network hardware, and servers, which enables the storage and communication of medical images online.

Text-Processing: Refers to the discipline of mechanizing the creation or manipulation of electronic text.

MapReduce: An algorithm used to split massive data sets across many pieces of commodity hardware in an effort to reduce computing time.

Data Grid: An architecture, or in other words a set of services, that gives individuals or groups of users the ability to access, manipulate, and transfer voluminous amounts of geographically distributed data intended for research purposes.
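
As an illustration of the MapReduce entry in the key terms above, the following is a minimal single-process sketch in Python of the map, shuffle, and reduce steps applied to a word count. It is not Hadoop code and does not use any Hadoop API; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions introduced only for this example. In a real cluster, each chunk would be mapped on a different commodity machine and the grouped keys would be reduced in parallel.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine the values of one key into a single count."""
    return key, sum(values)


def word_count(chunks):
    """Run map, shuffle, and reduce over a list of text chunks.

    Each chunk stands in for the slice of data one worker would receive;
    here the chunks are processed sequentially for simplicity.
    """
    mapped = []
    for chunk in chunks:
        mapped.extend(map_phase(chunk))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, split into three chunks.
    sample_chunks = ["big data tools", "big data platforms", "data analytics"]
    print(word_count(sample_chunks))
```

Because the map step treats every chunk independently, the work can be spread across many machines, which is where the reduction in computing time described in the definition comes from.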
