Big Data Computing Strategies

Malavika Jayanand, M. Anil Kumar, K. G. Srinivasa, G. M. Siddesh
DOI: 10.4018/978-1-4666-6559-0.ch004


To store huge volumes of data, organizations have to buy servers and scale them according to their needs. One storage solution is the cloud environment: there is no need to scale up storage by adding more physical servers, or to install, update, or run backups, which also cuts down on system hardware and makes application integration easier. This chapter discusses key concepts of cloud storage as a solution for Big Data, different Big Data domains, and their applications, with case studies of how organizations are leveraging Big Data for better business opportunities.
Chapter Preview

1. Introduction

1.1 Big Data: A Revolutionary Technology

Big Data and Cloud Computing have brought about a huge transition in computing infrastructure: traditional computing techniques cannot be used to store and process very large quantities of digital information. Cisco cited an estimate that, by the end of 2015, global internet traffic will reach 4.8 zettabytes a year, that is, 4.8 billion terabytes, which indicates both the Big Data challenge and the Big Data opportunities on the horizon.
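
The conversion from zettabytes to terabytes can be checked with a short calculation. The sketch below assumes decimal (SI) units, where one zettabyte is 10^21 bytes and one terabyte is 10^12 bytes; the function name `zettabytes_to_terabytes` is introduced here purely for illustration.

```python
# Assumption: decimal (SI) units, not binary units such as tebibytes.
BYTES_PER_ZETTABYTE = 10**21
BYTES_PER_TERABYTE = 10**12


def zettabytes_to_terabytes(zettabytes):
    """Convert a quantity in zettabytes to terabytes (SI definitions)."""
    # One zettabyte equals 10**9 (one billion) terabytes.
    terabytes_per_zettabyte = BYTES_PER_ZETTABYTE // BYTES_PER_TERABYTE
    return zettabytes * terabytes_per_zettabyte


if __name__ == "__main__":
    terabytes = zettabytes_to_terabytes(4.8)
    print(f"4.8 ZB is about {terabytes / 10**9:.1f} billion TB")
```

Under these definitions the two figures quoted in the estimate agree with each other.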

1.2 Exploding Data Volumes

The amount of digital data being generated is growing exponentially for a number of related reasons. First, every organization is becoming increasingly aware of the value locked in massive amounts of data. E-commerce retailers are starting to build up vast databases of recorded customer activity. Organizations working in many sectors, including healthcare, retail, and social media, are generating additional value by capturing more and more data. New data collectors such as sensors, geo-data, and internet click tracking have created a world where everything around us, from automobiles to mobile phones, is collecting massive amounts of data that may potentially be mined to generate valuable insights.

1.3 How to Avoid Data Exhaust?

The more an organization recognizes the lucrative role of Big Data, the more data it seeks to capture and utilize. However, because of Big Data’s volume, velocity, and variety, a number of organizations have ignored huge quantities of potentially very valuable information. In effect, most of the data that surrounds organizations today is ignored. Massive amounts of the data they gather are lost unprocessed, with a significant quantity of useful information passing straight through them as “data exhaust”.

Traditional large-scale computing solutions rely on expensive server hardware that is highly fault tolerant. The alternative is to handle hardware failures or other system problems at the application level, so that a high level of service continuity can be delivered from clusters of server computers, each of which may be prone to failure. The most feasible approach is then to process vast quantities of data across large, low-cost distributed computing infrastructures.

Big Data requires new technological solutions. The need for more dependable, scalable, distributed computing systems capable of handling the Big Data deluge led to the use of more flexible technologies such as Hadoop, database virtualization, storage virtualization, and network virtualization, in order to avoid single-device barriers, which impede scaling. To spread analytical processing across a cluster of commodity servers, Hadoop uses MapReduce. Elasticity and agility are needed to scale and address Big Data.
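
The MapReduce model named above can be illustrated with a minimal, single-process sketch. This is not Hadoop's API and does not distribute any work: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample records are hypothetical, chosen only to show how a job is split into a map step, a grouping step, and a reduce step. In a real cluster, each step would run in parallel across many commodity servers.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Grouping step: collect all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(records):
    """Run the three steps in sequence over an iterable of text records."""
    mapped = chain.from_iterable(map_phase(record) for record in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical input records standing in for a much larger data set.
    sample = ["big data needs scale", "cloud storage scales", "big data big value"]
    print(word_count(sample))
```

Because each call to `map_phase` depends only on its own record, and each call to `reduce_phase` only on its own key, both steps can be spread across independent machines, which is what makes the model suitable for low-cost clusters.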

Big Data is characterized by the volume and velocity of its data, and convenient processing requires decentralization. Owing to limiting factors such as resources and domain expertise, organizations are highly unlikely to implement their own solutions. Nonetheless, pioneers in the market such as Amazon, NetApp, and Google allow organizations of all sizes to start benefiting from Big Data processing capabilities. Big Data sets need to be utilized, and since data and processing needs will change, and that change may be rapid and disruptive, a more flexible platform such as cloud computing services makes them more manageable and agile (James Taylor, 2011). As technology advances and new types of computer processing power become available, Big Data will progress in leaps and bounds.


2. Cloud Computing For Big Data

Cloud computing transforms the way Big Data has been handled over the years: it has emerged as a cost-effective and elastic computing paradigm that facilitates large-scale data storage and analysis. The cloud facilitates Big Data processing for a wide spectrum of enterprises of various sizes. Getting business insights from Big Data is becoming prevalent, and the cloud is becoming an ideal choice for it. The cloud complements Big Data because cloud computing provides boundless capabilities on demand. With these capabilities now available, any organization can work with unstructured data at a huge scale (Gartner Report, 2013).

Key Terms in this Chapter

Data Exploration: The process in which gathered data are explored and analysed in order to find relevant data for statistical reporting, trend spotting, and pattern spotting.

Big Data: A term used to describe large volumes of both structured and unstructured data. Gartner defines it as data with three dimensions: Volume, Velocity, and Variety.

Performance Management: Big Data performance management maintains data quality and enables businesses to meet their goals through efficient data management.

Big Data Biometrics: Biometric data about a person, such as DNA, retina, and voice characteristics, is collected for biometric-based person identification and verification.

Social Media Analytics: Customer data is captured to understand consumer sentiment or attitude in order to predict consumer behavior and provide recommendations for the next best action.

Analytics: Big Data analytics is the process of finding meaningful patterns in data, or gaining insight into it, through extensive use of statistical and mathematical techniques.

Cloud Computing: For Big Data, cloud computing is a technological storage solution in which bulk data is stored and which provides scalability and flexibility.
