Olympics Big Data Prognostications

Arushi Jain (Ambedkar Institute of Advanced Communication Technologies and Research, New Delhi, India) and Vishal Bhatnagar (Department of Computer Science and Engineering, Ambedkar Institute of Advanced Communication Technologies and Research, New Delhi, India)
Copyright: © 2016 | Pages: 14
DOI: 10.4018/IJRSDA.2016100103

Abstract

Data has grown continuously over the years, and with it the volume that must be stored and tamed to yield accurate results. This has given rise to the concept now known as big data analytics. With the 2016 Summer Olympics in Rio de Janeiro, Brazil, around the corner, the authors have implemented a mathematical model, using an efficient MapReduce program, to predict the number of medals each country might win at the games. A number of factors, such as the country's historical performance in terms of medals won, the performance of its athletes, the country's financial situation, the fitness levels and nutrition of athletes, and familiarity with the playing conditions, can be used to come up with a reliable estimate.

Introduction

Big data has already become a prominent part of a $64 billion database and analytics market. Gartner defined big data by three characteristics (volume, velocity, variety), commonly known as the 3 V’s of big data. Later, IBM added one more characteristic (veracity), giving the 4 V’s, which include the three given by Gartner. Nowadays organizations are struggling with a fifth V of big data: finding the value contained in it. The analytics market is leaving no stone unturned to tame the available data so that it yields interesting results and better insight into an industry or organization.

  • Volume: This refers to the amount of data. When data size grows from terabytes to exabytes, it is referred to as big data. Large firms such as Facebook and Twitter generate billions of data items daily and use big data analytics to retrieve valuable information when required.

  • Velocity: The speed at which data is received and processed. It corresponds to data in motion. Real-time applications and the Internet of Things (IoT) require immediate responses, so in such cases the time taken to harness the data should be in the range of seconds to milliseconds.

  • Variety: This feature covers the different formats of data. Big data can handle any form of data, whether structured, semi-structured or unstructured. Unstructured data can contain the same information as structured data, but in an unorganized manner rather than in row-column format.

  • Veracity: This refers to data in doubt. It concerns an undesired feature of data: the uncertainty and noise present in it.

  • Value: This characteristic refers to the intrinsic value contained in big data.

Some of the challenges of big data are:

  1. The biggest challenge in big data is to aggregate data from heterogeneous sources and analyze it to obtain useful information that improves various aspects of the functioning and business processes of organizations. The data may come from various social networks, each with a different format.

  2. One of the main characteristics of big data is autonomy: each data source works independently, without depending on centralized control. For example, each server on the World Wide Web functions correctly without involving other servers.

  3. Another big data challenge is complexity, as data is collected from heterogeneous sources.

  4. Big data is always evolving; thus the evolution of complex data poses a big challenge.

For analyzing large and complex data sets, an application framework known as Hadoop is used. Hadoop stores data in the Hadoop Distributed File System (HDFS), a distributed file system (DFS). HDFS divides a file into blocks, which are then assigned to different nodes in the Hadoop cluster, where the input data is processed with MapReduce programs and the result is written back to HDFS.
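The flow described above (split the input into blocks, map each block, group by key, reduce) can be illustrated with a minimal single-process sketch in Python. This is not the authors' model and does not use Hadoop itself; the function names (`split_into_blocks`, `map_phase`, `shuffle`, `reduce_phase`, `medal_counts`), the block size, and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import islice


def split_into_blocks(records, block_size):
    """Split records into fixed-size blocks, loosely mimicking how HDFS
    divides a file into blocks (hypothetical helper, not a Hadoop API)."""
    it = iter(records)
    while True:
        block = list(islice(it, block_size))
        if not block:
            return
        yield block


def map_phase(block):
    """Map step: emit a (country, 1) pair for every medal record in a block."""
    return [(country, 1) for country, _medal in block]


def shuffle(mapped_blocks):
    """Shuffle step: group all emitted values by key across blocks."""
    grouped = defaultdict(list)
    for pairs in mapped_blocks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the values for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def medal_counts(records, block_size=2):
    """Run the three steps in sequence on an in-memory list of records."""
    blocks = split_into_blocks(records, block_size)
    mapped = [map_phase(block) for block in blocks]
    return reduce_phase(shuffle(mapped))


# Made-up sample records of the form (country, medal); not real Olympic data.
SAMPLE = [
    ("A", "gold"),
    ("B", "silver"),
    ("A", "bronze"),
    ("C", "gold"),
    ("A", "silver"),
]

if __name__ == "__main__":
    print(medal_counts(SAMPLE))
```

In a real Hadoop cluster each block would be mapped on a different node and the grouped results written back to HDFS; here everything runs in one process so that the data flow is easy to follow.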
