Big Data Issues and Challenges

Stephen Kaisler, Frank Armour, William Money, J. Alberto Espinosa
DOI: 10.4018/978-1-4666-5888-2.ch035

Background

Big Data was originally characterized by three Vs: volume, velocity, and variety (Laney, 2001). Kaisler, Armour, Espinosa, and Money (2013) have suggested two more: value and veracity.

Table 1. Five Vs of Big Data

V: Description
Data Volume: The amount of data collected and available. It is estimated that over 2.5 exabytes (1 exabyte = 10^18 bytes) of data were created every day as of 2012 (Wikipedia, 2013).
Data Velocity: The rate at which data accumulates or arrives, how quickly it is purged, how frequently it changes, and how fast it becomes outdated.
Data Variety: The types of data required for analysis, either structured (such as RDF files, databases, and Excel tables) or unstructured (such as text, audio files, and video).
Data Value: The value derived from processing the data that contributes to decision making and problem solving. A large amount of data may be valueless if it is perishable, late, imprecise, or has some other weakness or flaw.
Data Veracity: The accuracy, precision, and reliability of the data. Depending on the collection methods and tools, a data set may contain very accurate data that nonetheless has low precision and low reliability.
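
The distinction between accuracy and precision in the Data Veracity row can be made concrete with a small sketch. The readings, the reference value, and the helper names `bias` and `spread` below are all made up for illustration; distance of the mean from the true value is used here as one possible proxy for accuracy, and the sample standard deviation as one possible proxy for precision.

```python
from statistics import mean, stdev

# Hypothetical reference value and two invented sets of sensor readings.
TRUE_VALUE = 100.0
readings_a = [90.0, 110.0, 95.0, 105.0, 100.0]    # centered on the truth, widely scattered
readings_b = [107.0, 107.1, 106.9, 107.0, 107.0]  # tightly clustered, offset from the truth


def bias(readings, true_value):
    """Accuracy proxy: absolute distance of the mean from the true value."""
    return abs(mean(readings) - true_value)


def spread(readings):
    """Precision proxy: sample standard deviation of the readings."""
    return stdev(readings)


for name, readings in [("A", readings_a), ("B", readings_b)]:
    print(f"{name}: bias={bias(readings, TRUE_VALUE):.2f}, spread={spread(readings):.2f}")
```

Under these assumptions, set A is accurate but imprecise (its mean sits on the true value while individual readings scatter), and set B is precise but inaccurate.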

Big Data has often been used to denote a large volume of data of a single type, such as text, numbers, or pixels. Recently, many organizations have begun creating blended data by analyzing data sources of varied types. These data come from instruments, sensors, Internet transactions, email, social media (such as Twitter, YouTube, Reddit, Pinterest, and Tumblr), and clickstreams. New data types may be derived through analysis or by joining different types of data.
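
As a minimal sketch of what joining different types of data might look like, the following blends numeric transaction records with free-text social media posts on a shared user key and derives a field that exists in neither source alone. The records, field names, and the `blend` function are hypothetical and only illustrate the idea; they are not taken from the chapter.

```python
from collections import defaultdict

# Made-up records standing in for two differently typed sources.
transactions = [
    {"user": "u1", "amount": 30.0},
    {"user": "u2", "amount": 12.5},
    {"user": "u1", "amount": 7.5},
]
posts = [
    {"user": "u1", "text": "Great service"},
    {"user": "u3", "text": "Still waiting on my order"},
]


def blend(transactions, posts):
    """Join numeric and text records on the shared user key."""
    blended = defaultdict(lambda: {"total_spent": 0.0, "posts": []})
    for t in transactions:
        blended[t["user"]]["total_spent"] += t["amount"]
    for p in posts:
        blended[p["user"]]["posts"].append(p["text"])
    # Derive a new field that neither source contains on its own.
    for record in blended.values():
        record["spend_per_post"] = (
            record["total_spent"] / len(record["posts"]) if record["posts"] else None
        )
    return dict(blended)


profiles = blend(transactions, posts)
print(profiles["u1"])
```

A user who appears in only one source still gets a profile; the derived field is `None` when there are no posts to divide by.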

Key Terms in this Chapter

Data Variety: The number of relations and interdependencies among the data in one data set.

Big Data: The volume of data that is just beyond technology’s capability to store, manage and process efficiently.

Data Volume: The amount of data collected and, perhaps, available for use.

Data Velocity: The rate at which data is accumulated or streams into a collection area.

Data Veracity: The accuracy, precision and reliability of the data.

Data Curation: The extraction, moving, cleaning, and preparing of data for storage and processing.

Data Enrichment: The process of augmenting collected raw data or processed data with existing data or domain knowledge to enhance the analytic process.

Data Value: The value derived from processing the data with different analytics that contributes to problem solving.
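
The Data Curation and Data Enrichment terms above can be illustrated with a short sketch. Everything here is an assumption made for the example: the raw sensor records, the `SENSOR_SITES` lookup standing in for existing data or domain knowledge, and the `curate` and `enrich` helper names.

```python
# Hypothetical raw records, as they might arrive from a collection point.
raw = [
    {"sensor": " S1 ", "temp_c": "21.5"},
    {"sensor": "s2", "temp_c": ""},       # missing reading
    {"sensor": "S1", "temp_c": "21.5"},   # duplicate once cleaned
    {"sensor": "S3", "temp_c": "19.0"},
]

# Hypothetical existing reference data used for enrichment.
SENSOR_SITES = {"S1": "Warehouse", "S3": "Loading dock"}


def curate(records):
    """Clean and prepare: normalize ids, parse numbers, drop blanks and duplicates."""
    seen, cleaned = set(), []
    for r in records:
        sensor = r["sensor"].strip().upper()
        if not r["temp_c"]:
            continue  # skip records with no reading
        row = (sensor, float(r["temp_c"]))
        if row in seen:
            continue  # skip exact duplicates
        seen.add(row)
        cleaned.append({"sensor": sensor, "temp_c": row[1]})
    return cleaned


def enrich(records, sites):
    """Augment each curated record with existing site knowledge."""
    return [{**r, "site": sites.get(r["sensor"], "unknown")} for r in records]


prepared = enrich(curate(raw), SENSOR_SITES)
print(prepared)
```

Curation runs first so that enrichment only looks up identifiers that have already been normalized.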
