Big Data: The Path to Maturity

Stephen H. Kaisler, William H. Money, Frank Armour, J. Alberto Espinosa
DOI: 10.4018/IJSSOE.2017040101

Abstract

Big Data refers to data volumes in the range of exabytes (10^18 bytes) that require processing by distributed online storage systems with thousands of processors, mainframes, or supercomputers whose processing speed is measured in GFLOPS. The rate at which data are collected is accelerating and will approach the zettabyte-per-year range. Other attributes of Big Data are also expanding concurrently, including variety/variability, velocity, value, and vital concerns for veracity. Storage and data transport technology issues may be solvable in the near term; however, these communication, quantity-management, and processing technologies also represent long-term challenges that require new research, paradigms, and analytical practices. This paper extends the authors' previous analysis of the issues and challenges of Big Data. It presents a table that contrasts their previous research findings and projections with the state of Big Data today, along with their projections of what managers and decision makers will (or should) seek to accomplish as the Big Data universe continues to expand and evolve.

Introduction

The concept of Big Data has been endemic within computer science since the earliest days of computing. “Big Data” originally meant the volume of data that could not be processed (efficiently) by traditional database methods and tools. Each time a new storage medium was invented, the amount of accessible data exploded because it could be easily reached. The original definition focused on structured data, but many researchers and practitioners have now come to appreciate that a very significant and growing percentage of the world’s data and accumulated information resides in massive, unstructured data and information sinks. Further, very large portions of Big Data exist largely in the form of unstructured text, imagery, and video. The explosion of data has not been accompanied by the expanded availability of new storage media that can accumulate it while ensuring ready access.

We define “Big Data” as the amount of data just beyond technology’s capability to store, manage, and process efficiently. These limitations are discovered only through a robust analysis of the data itself, explicit statements of its processing needs, and assessments of the capabilities and deficiencies of the tools (hardware, software, and methods) used to analyze and administer it. As with any newly recognized problem, deciding how to proceed may lead to a recommendation that new tools be forged to perform the new tasks.

As little as five years ago, researchers and analysts were thinking only of tens to hundreds of terabytes of storage for our personal computers. Today, we are thinking in tens to hundreds of petabytes. Thus, Big Data, like our universe, appears to be a constantly moving and expanding target. It is the growing amount of data (in volume, type, and perhaps new characteristics) that lies just beyond our immediate capabilities and grasp, i.e., we must work hard to store it, access it, manage it, and process it.
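To make these magnitudes concrete, here is a minimal Python sketch of the decimal (SI) storage scales named above. The 1 MB/s sensor stream used at the end is a hypothetical figure chosen for illustration, not a number from the paper.

```python
# Illustrative sketch of the decimal (SI) storage scales discussed above.
# The 1 MB/s stream below is a hypothetical assumption, not a paper figure.

SCALES = {
    "terabyte (TB)":  10**12,
    "petabyte (PB)":  10**15,
    "exabyte (EB)":   10**18,
    "zettabyte (ZB)": 10**21,
}

for name, size in SCALES.items():
    print(f"{name:>15}: {size:.0e} bytes")

# A single hypothetical sensor emitting 1 MB/s shows how quickly volume mounts:
seconds_per_year = 60 * 60 * 24 * 365
bytes_per_year = 10**6 * seconds_per_year      # ~3.15e13 bytes
print(f"One 1 MB/s stream yields about {bytes_per_year / 10**12:.1f} TB/year")
```

Even a single modest stream accumulates tens of terabytes per year; millions of such sources push aggregate collection toward the petabyte and exabyte ranges discussed above.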

Our conclusion about the growth rate is that the current growth in the volume of data collected continues to be staggering. Further, no researcher or manager in the literature surveyed projects a slowing of the rate of growth, or that a cap may exist on the total volume of data collected in the future. A major challenge for IT researchers and practitioners is that this growth rate is rapidly exceeding our ability to: (1) design appropriate systems to handle the data effectively, and (2) analyze it to extract meaning relevant to decision making. In this paper, we identify critical issues associated with data storage, management, processing, and security. To the best of our knowledge, the research literature has addressed some of these issues, but we believe new approaches, technologies, and processes will (and must) continue to emerge to address them as data volume approaches the exabyte range.
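As a rough illustration of how an uncapped compound growth rate outruns capacity, the sketch below solves for the time an assumed annual collection volume needs to reach the zettabyte range. The 50 EB/year starting point and 40% growth rate are assumptions made for this example, not figures from the paper or the surveyed literature.

```python
import math

# Hypothetical projection of annual data volume under compound growth.
# Both the starting volume and the growth rate are illustrative assumptions;
# the paper itself does not commit to specific figures.
start_eb_per_year = 50.0     # assumed: 50 exabytes collected per year today
annual_growth = 0.40         # assumed: 40% compound annual growth
target_eb_per_year = 1000.0  # 1 zettabyte/year = 1000 exabytes/year

# Solve start * (1 + g)**t = target for t.
years = (math.log(target_eb_per_year / start_eb_per_year)
         / math.log(1 + annual_growth))
print(f"At {annual_growth:.0%} annual growth, {start_eb_per_year:.0f} EB/year "
      f"reaches 1 ZB/year in roughly {years:.1f} years")
```

Under these assumed figures the zettabyte-per-year range arrives in under a decade, which is why the paper treats storage, management, and processing limits as near-term rather than distant concerns.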
