Role of Open Source Software in Big Data Storage

Rupali Ahuja, Jigyasa Malik, Ronak Tyagi, R. Brinda
DOI: 10.4018/978-1-7998-9158-1.ch043

Abstract

Today, the world revolves around Big Data. Every organization is working to derive value from the huge volume of data generated each moment. Open Source Software is widely adopted by academics, researchers, and industry practitioners to handle various Big Data needs because of its easy availability, flexibility, affordability, and interoperability. As a result, several open source Big Data tools have been developed. This chapter discusses the role of Open Source Software in Big Data Storage and how various organizations have benefited from its use. It provides an overview of the popular Open Source Big Data Storage technologies that exist today. Distributed File Systems and NoSQL databases meant for storing Big Data are discussed along with their features and applications, and are compared with one another.

Background

The amount of data generated each second is growing at an exponential rate. Facebook, a social networking website, is home to 40 billion photos, and more than 100 hours of video are uploaded to YouTube every minute. Such figures are rising rapidly in almost every field, increasing the interest in and demand for Big Data Storage and management technologies. A forecast from International Data Corporation (IDC) sees the Big Data technology and services market growing at a Compound Annual Growth Rate (CAGR) of 23.1% over the 2014-2019 forecast period, with annual spending reaching $48.6 billion in 2019 (IDC, 2016).

Open Source tools play a prominent role in addressing Big Data Storage issues. Two of the most dominant technologies in the Big Data world, Hadoop and Apache Spark, are Open Source tools. Popular Big Data software distribution companies such as Cloudera and Hortonworks have built their businesses around open source technologies. Open Source is the platform best suited for Big Data solutions. Almost all Big Data solutions run on top of UNIX-like operating systems such as Linux, which is open source. Without open source tools, the Big Data world would not have grown so rapidly. According to Talend’s CEO, Mike Tuchen, “the entire next-generation data platform will be open source” (Noyes, 2016).
