Harnessing the Power of Big Data Analytics

Billie Anderson, J. Michael Hardin

Source Title: Encyclopedia of Business Analytics and Optimization

DOI: 10.4018/978-1-4666-5202-6.ch101

Introduction

Every two days we create as much data as we did up to 2003. -Eric Schmidt, CEO of Google

The age of big data is upon us. Businesses are collecting data at an unprecedented rate through Web sources, cellular phones, and social media. The growth of Internet businesses has led to a whole new scale of data processing challenges. Companies such as Google, Facebook, Yahoo, Twitter, and Amazon now routinely collect and process hundreds to thousands of terabytes of data on a daily basis. This represents a significant increase in the volume of data that can be processed, a major reduction in the processing time required, and a reduction in the cost of storing data.

Organizations have been collecting data for years, but the digital age has brought a substantial increase in the amount of data available to the modern-day business. For example, the genealogy site Ancestry.com stores about 2.5 petabytes of data (White, 2012). Twitter collects 7 terabytes of new data each day (Soffer & Heid, 2012). This growth in data size can be attributed to several factors. The first is a more prominent online presence. Many major retailers, such as Apple, Wal-Mart, Target, Macy’s, Best Buy, Kohl’s, and Walgreens, have a much larger online presence than they did 10 years ago, which increases the amount of data each company can access and collect. In the financial services and healthcare sectors, more data is being produced for business protection purposes, that is, for more backup, recovery, and monitoring of customer or patient records.

In 2011, researchers from MIT Sloan Management Review and IBM asked 3,000 executives, managers, and analysts how they obtain value from their massive amounts of data. The study found that organizations that used business information and analytics outperformed organizations that did not. Specifically, the researchers found that top-performing businesses were twice as likely as their lower-performing counterparts to use analytics to guide both future strategies and day-to-day operations (LaValle, Lesser, Shockley, Hopkins, & Kruschwitz, 2011).

To extract value from big data, companies need to be able to work easily with the terabytes and petabytes of data constantly generated by employees, customers, competitors, and Websites. It is not only the size of the data sets that distinguishes the big data movement, but also the differing types of data that must be handled. The scope of data collected by organizations is more diverse than ever. Data comes in a variety of forms, both structured and unstructured, and is spread across internal and external sources. Data is also more dynamic in the age of big data. It constantly changes and evolves in real time, making the window for taking action considerably shorter than in the past (McAfee & Brynjolfsson, 2012).

This chapter will define big data and big data analytics. Emerging data architectures that can handle vast amounts of data, such as Hadoop, will be examined. Hive, the SQL-like programming framework developed at Facebook that makes Hadoop more accessible, will be explained. A survey of how software and hardware companies are creating new businesses and technology from the big data architectures will be provided. The chapter will conclude with future directions for big data.

Background

Typical business practice for large-scale data analysis has traditionally focused on Enterprise Data Warehouses (EDWs). EDWs dominated academic research and industrial development throughout the 1990s. A data warehouse is a large repository of an organization's historical and current transaction data. An EDW is a centralized data warehouse that is accessible to the entire organization. EDWs are considered to be the cornerstone of good information technology (IT) (Cohen, Hellerstein, Dolan, Welton, & Dunlap, 2009). EDWs play a pivotal role in information-centered organizations in industries such as retail and telecommunications. The EDW serves as the central meeting point for data integration within a large organization. It has traditionally been an advantage for computing enterprise-wide analytics because it can gather and organize data from all elements of the organization. The main focus of an EDW is to compute data-intensive reports for high-level decision-making management.

Key Terms in this Chapter

Structured Query Language (SQL): A programming language specifically designed for managing data sets in a relational database management system.

Hive: A SQL programming framework that allows a programmer to use the MapReduce algorithm through a SQL-like programming language.

MapReduce: An algorithm that splits the processing of massive data sets across many pieces of commodity hardware in an effort to reduce computing time.

Enterprise Data Warehouse: A data storage system that acts as a repository for the entire business enterprise.

Hadoop: Open-source software that stores and analyzes massive unstructured data sets.

Business Intelligence: The transformation of data collected from all aspects of the business into a decision-making tool.

Data Warehouse Generation: The development of a data warehouse.

Commodity Hardware: Hardware that is already available and not being fully utilized by a business.
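
The MapReduce entry above can be made more concrete with a small word-count sketch in Python. This is only an illustration of the map, shuffle, and reduce steps running in a single process, not how Hadoop itself is implemented. The function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample documents are assumptions introduced for this example and do not come from the chapter.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: combine each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    # On a real cluster each document (or data block) would be mapped on a
    # different commodity machine; here the map calls run one after another
    # in a single process purely for illustration.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    # Hypothetical sample input.
    docs = ["big data is big", "data is everywhere"]
    print(word_count(docs))
```

Because each map call depends only on its own document, the map work could in principle be spread across many machines, which is the property the definition above relies on to reduce computing time.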
