Big Data Analytics: Trends and Case Studies

Hoda Ahmed Abdelhafez (Suez Canal University, Egypt)
Copyright: © 2014 |Pages: 13
DOI: 10.4018/978-1-4666-5202-6.ch029

Chapter Preview


Volume, Variety, and Velocity of Big Data

The three attributes, or components, of big data are volume, variety, and velocity (Russom, 2011); SAS (2012) adds complexity and variability, as shown in Figure 1.

Figure 1. The components of big data


Data volume is the primary attribute of big data. It can be quantified in terabytes or petabytes, or by counting records, tables, transactions, and files (Kaisler et al., 2013; Russom, 2011). For instance, the total book stack in the Library of Congress measures 15 terabytes, Google processes more than 1 petabyte every hour, and Bank of America Merrill Lynch manages petabytes of data for advanced analytics and new regulatory requirements (Forsyth Communications, 2012). Organizations facing such huge amounts of data cannot manage and analyze them with traditional IT structures; they need scalable storage and a distributed approach to processing (Dumbill, 2012).
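As a rough illustration of the two ways of quantifying volume mentioned above, the following minimal Python sketch expresses a byte count in the largest convenient unit. The function name `human_size` and the use of decimal (SI) units, where 1 terabyte is 10^12 bytes, are assumptions made for this example; the chapter does not state which convention its figures use.

```python
# Hypothetical helper: decimal (SI) units are an assumption, not the chapter's stated convention.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]


def human_size(num_bytes: int) -> str:
    """Express a byte count in the largest SI unit that keeps the value below 1000."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value fits, or at the largest unit available.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


# The Library of Congress figure quoted above, expressed in bytes.
library_of_congress = 15 * 10**12
print(human_size(library_of_congress))
```

Counting records, tables, or files is the complementary measure: it describes how many items must be processed rather than how much storage they occupy.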

Data variety refers to the diversity of data sources, driven by the explosion of sensors, smart devices, and social collaboration technologies (Kaisler et al., 2013; Zikopoulos et al., 2012). Variety includes structured, semi-structured (XML, RSS feeds), and unstructured (text and human language) data, as well as data from audio, video, and other devices, in addition to multidimensional data that can be drawn from a data warehouse (Russom, 2011). About 80 percent of a company’s data is unstructured, including office productivity documents, e-mails, and Web content, in addition to social media (Forsyth Communications, 2012). Text, video, and other forms of media need a completely different architecture and technologies to perform the required analysis. For example, many marketing departments want to find ways to do sentiment and brand analysis based on what is being posted on Facebook, Twitter, and YouTube. This becomes a particular challenge in Asia, with local social media sites such as Nate in Korea and RenRen in China (Carter, 2011).
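The three broad categories of variety described above can be made concrete with a small Python sketch. The function `classify_variety` and its rules are a deliberately simple heuristic invented for this illustration, not a method from the chapter: well-formed XML is treated as semi-structured, comma-separated rows of equal width as structured, and everything else as unstructured.

```python
import csv
import io
import xml.etree.ElementTree as ET


def classify_variety(payload: str) -> str:
    """Heuristically label a text payload (hypothetical rules, for illustration only)."""
    text = payload.strip()

    # Semi-structured: the payload parses as well-formed XML (RSS feeds are XML).
    if text.startswith("<"):
        try:
            ET.fromstring(text)
            return "semi-structured"
        except ET.ParseError:
            pass

    # Structured: at least two rows, each with the same number (>1) of fields.
    rows = list(csv.reader(io.StringIO(text)))
    if len(rows) >= 2 and len(rows[0]) > 1 and all(len(r) == len(rows[0]) for r in rows):
        return "structured"

    # Everything else, such as free text in a social media post.
    return "unstructured"


print(classify_variety("id,amount\n1,9.50\n2,12.00"))
print(classify_variety("<rss><item>Hello</item></rss>"))
print(classify_variety("Loving the new phone, battery lasts all day!"))
```

Real pipelines would rely on declared content types or schemas rather than guessing from the payload, and audio or video would need separate handling altogether.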

Key Terms in this Chapter

Big Data Analytics: The process of examining large volumes of data of a variety of types to discover useful information.

No-SQL Database: A non-relational database, also read as "not only SQL". It is an approach used to manage large sets of distributed data.

Structured Query Language (SQL): The standard user and application program interface to a relational database.

Big Data: Data that exceeds the processing capacity of conventional database systems.

Big Data Attributes: The components of big data, namely volume, variety, and velocity, in addition to complexity and variability.

Cloudera's Distribution including Apache Hadoop (CDH): Cloudera's open-source Hadoop distribution focusing on enterprise-class deployments of that technology.

Data Visualization: The depiction of information in graphical form for analysis that can be used for strategic planning and improvement.
