Big Data Analytics and Visualization for Food Health Status Determination Using Bigmart Data

Sumit Arun Hirve (VIT-AP University, India) and Pradeep Reddy C. H. (VIT-AP University, India)
Copyright: © 2020 | Pages: 27
DOI: 10.4018/978-1-5225-9750-6.ch011

Abstract

Traditional data visualization techniques are immature: they face several challenges and cannot handle very large amounts of data, particularly in the gigabyte and terabyte range. In this research, we propose an R-based tool and data analytics framework for handling large volumes of stored commercial market data and for discovering knowledge patterns in the dataset to convey the derived conclusions. In this chapter, we elaborate on pre-processing a commercial market dataset using R and its packages for information and visual analytics. We suggest a recommendation system that identifies whether a food entry inserted into the database is hygienic or non-hygienic, based on its quality-preserved attributes. To obtain a precise recommendation system with strong predictive accuracy, we emphasize algorithms such as J48 and Naive Bayes and use whichever achieves the higher accuracy in the comparison. Such a system, when combined with the R language, can potentially be used for enhanced decision making.

Introduction

Categories of 'Big Data'

Big Data can be found in three forms: structured, unstructured, and semi-structured.

Structured

Any data that can be stored, accessed, and processed in a fixed format is termed 'structured' data. Over time, computer science has made considerable progress in developing techniques (Li, Wang, Lian et al., 2018) for working with such data (where the format is well known in advance) and for deriving value from it. Nowadays, however, problems are foreseen as the size of such data grows to a huge extent, with typical sizes in the range of multiple zettabytes.

Unstructured

Any data whose form or structure is unknown is classified as unstructured data. In addition to being huge in size, unstructured data poses multiple challenges when it is processed to derive value from it. A typical example of unstructured data is a heterogeneous data source containing a combination of simple text files, images, videos, and so on (Anandakumar & Umamaheswari, 2018). Nowadays, organizations have a wealth of data available to them, but unfortunately they do not know how to derive value from it because the data is in its raw form or an unstructured format.

Semi-structured

Semi-structured data can contain both forms of data. Semi-structured data appears structured in form, but it is not defined by, for example, a table definition in a relational DBMS. An example of semi-structured data is data represented in an XML file.
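To make the XML example concrete, the following minimal Python sketch parses a small document in which the tags describe each record but no table definition forces the records to share the same fields. The element names (`items`, `item`, `fat_content`, `weight`) and the values are hypothetical and are not taken from the Bigmart dataset.

```python
import xml.etree.ElementTree as ET

# Hypothetical product records: the tags label the data, but no schema
# requires every record to carry the same fields.
XML_DOC = """
<items>
  <item id="A1">
    <name>Milk</name>
    <fat_content>Low Fat</fat_content>
  </item>
  <item id="B2">
    <name>Bread</name>
    <weight>0.4</weight>
  </item>
</items>
"""


def parse_items(xml_text):
    """Return one dict per <item>, keeping whatever fields each record has."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for item in root.findall("item"):
        record = {"id": item.get("id")}
        for child in item:
            # Each child tag becomes a key; values stay as text.
            record[child.tag] = child.text
        records.append(record)
    return records


records = parse_items(XML_DOC)
for record in records:
    print(record)
```

The two records come back with different keys, which is the property that separates semi-structured data from a relational table with a fixed set of columns.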

Characteristics of 'Big Data'

Volume

The name Big Data itself relates to a size that is enormous. The size of data plays a crucial role in determining the value that can be derived from it (Ahmed, 2019). Whether particular data can be considered Big Data at all also depends on its volume. Hence, 'Volume' is one characteristic that needs to be considered when dealing with Big Data.

Variety

The next aspect of Big Data is its variety (Lee, 2019). Variety refers to heterogeneous sources and the nature of data, both structured and unstructured. In earlier days, spreadsheets and databases were the only sources of data considered by most applications. Nowadays, data in the form of emails, photos, videos, monitoring devices, PDFs, audio, and so on is also considered in analysis applications. This variety of unstructured data poses certain issues for storing, mining, and analyzing data.

Velocity

The term 'velocity' refers to the speed at which data is generated. How fast the data is generated and processed to meet demand determines the real potential of the data.

Big Data velocity deals with the speed at which data flows in from sources such as business processes, application logs, networks, social media sites, sensors, mobile devices, and so on. The flow of data is massive and continuous (Anandakumar & Umamaheswari, 2017).
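As a rough illustration of velocity as a measurable quantity, the sketch below computes an average arrival rate (records per second) over a fixed observation window. The timestamps and the helper name `records_per_second` are hypothetical; a real pipeline would read arrival times from its own logs or message queue.

```python
from datetime import datetime

# Hypothetical arrival times of records from a single source.
ARRIVALS = [
    "2020-01-01T00:00:00",
    "2020-01-01T00:00:01",
    "2020-01-01T00:00:01",
    "2020-01-01T00:00:03",
    "2020-01-01T00:00:04",
]


def records_per_second(timestamps, window_start, window_end):
    """Average number of records arriving per second in [window_start, window_end)."""
    start = datetime.fromisoformat(window_start)
    end = datetime.fromisoformat(window_end)
    seconds = (end - start).total_seconds()
    if seconds <= 0:
        raise ValueError("window_end must be later than window_start")
    # Count only the records whose arrival time falls inside the window.
    count = sum(
        1 for t in timestamps if start <= datetime.fromisoformat(t) < end
    )
    return count / seconds


rate = records_per_second(ARRIVALS, "2020-01-01T00:00:00", "2020-01-01T00:00:05")
print(rate)
```

Five records in a five-second window give an average of one record per second. The same calculation applied to successive windows would show whether the inflow is steady or bursty.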

Variability

This refers to the inconsistency that the data can show at times, which hampers the process of handling and managing the data effectively.
