
A Review of System Benchmark Standards and a Look Ahead Towards an Industry Standard for Benchmarking Big Data Workloads

Copyright © 2014. 18 pages.
DOI: 10.4018/978-1-4666-4699-5.ch017

Nambiar, Raghunath, and Meikel Poess. "A Review of System Benchmark Standards and a Look Ahead Towards an Industry Standard for Benchmarking Big Data Workloads." In Big Data Management, Technologies, and Applications, edited by Wen-Chen Hu and Naima Kaabouch, 415-432. Hershey, PA: IGI Global, 2014. doi:10.4018/978-1-4666-4699-5.ch017


Abstract

Industry standard benchmarks have played, and continue to play, a crucial role in the advancement of the computing industry. Demand for them has existed since buyers were first confronted with the choice of purchasing one system over another. Over the years, industry standard benchmarks have proven critical to both buyers and vendors: buyers use benchmark results when evaluating new systems in terms of performance, price/performance, and energy efficiency, while vendors use benchmarks to demonstrate the competitiveness of their products and to monitor release-to-release progress of products under development. Historically, industry standard benchmarks have enabled healthy competition that results in product improvements and the evolution of brand-new technologies. Over the past quarter-century, industry standard bodies like the Transaction Processing Performance Council (TPC) and the Standard Performance Evaluation Corporation (SPEC) have developed several industry standards for performance benchmarking, which have been a significant driving force behind the development of faster, less expensive, and/or more energy-efficient system configurations. The world has been in the midst of an extraordinary information explosion over the past decade, punctuated by rapid growth in the use of the Internet and the number of connected devices worldwide. Today we are seeing a rate of change faster than at any point in history, and both enterprise application data and machine-generated data, known as Big Data, continue to grow exponentially, challenging industry experts and researchers to develop innovative new techniques to evaluate and benchmark hardware and software technologies and products. This chapter examines techniques for measuring the effectiveness of hardware and software platforms that deal with big data.
Chapter Preview

1. Introduction To System Benchmarks

System benchmarks have played, and continue to play, a crucial role in the advancement of the computing industry. Existing system benchmarks are critical to both buyers and vendors. Buyers use benchmark results when evaluating new systems in terms of performance, price/performance, and energy efficiency, while vendors use benchmarks to demonstrate the competitiveness of their products and to monitor release-to-release progress of products under development. With no standard system benchmarks available for Big Data systems, today's situation is similar to that of the mid-1980s, when the lack of standard database benchmarks led many system vendors to practice what is now referred to as "benchmarketing," a practice in which organizations make performance claims based on self-designed, highly biased benchmarks. The goal of publishing results from such tailored benchmarks was to support marketing claims, regardless of the absence of relevant and verifiable technical merit. In essence, these benchmarks were designed as foregone conclusions to fit a pre-established marketing message. Similarly, vendors would create configurations, referred to as "benchmark specials," that were specifically designed to maximize performance against a specific benchmark while offering limited benefit to real-world applications.
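The three metrics buyers evaluate are derived mechanically from a published benchmark result. As a minimal sketch (the numbers and field names here are purely hypothetical, not actual published results), price/performance and energy efficiency fall out of raw throughput, total system price, and average power draw:

```python
def metrics(result):
    """Derive comparison metrics from a raw benchmark result.
    Lower price_perf is better; higher perf_per_watt is better."""
    return {
        "price_perf": result["price_usd"] / result["throughput_tpm"],     # $ per tpm
        "perf_per_watt": result["throughput_tpm"] / result["avg_watts"],  # tpm per watt
    }

# Purely hypothetical results for two systems (illustrative numbers only).
systems = {
    "System A": {"throughput_tpm": 1_200_000, "price_usd": 850_000, "avg_watts": 9_500},
    "System B": {"throughput_tpm": 950_000, "price_usd": 500_000, "avg_watts": 6_000},
}

for name, result in systems.items():
    m = metrics(result)
    print(f"{name}: {m['price_perf']:.2f} $/tpm, {m['perf_per_watt']:.0f} tpm/W")
```

Note that the rankings need not agree: a system can win on raw throughput yet lose on price/performance or efficiency, which is exactly why standard benchmarks publish all three dimensions.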

As a direct consequence of the benchmarketing era, two benchmarking consortia emerged: the Transaction Processing Performance Council (TPC) and the Standard Performance Evaluation Corporation (SPEC). The TPC, founded in 1988, defines transaction processing and database benchmarks and disseminates objective, verifiable TPC performance data to the industry. While TPC benchmarks involve the measurement and evaluation of computer transactions, the TPC regards a transaction as it is commonly understood in the business world: a commercial exchange of goods, services, or money. The TPC currently offers two benchmarks for measuring On-Line Transaction Processing (OLTP) systems, TPC-C and TPC-E; two for measuring decision support performance, TPC-H and TPC-DS; and one for measuring virtualized databases. SPEC, best known for its component-based benchmarks such as SPEC CPU, is a non-profit corporation formed to establish, maintain, and endorse a standardized set of relevant benchmarks that can be applied to the newest generation of high-performance computers. Its portfolio includes processor-intensive benchmarks, benchmarks measuring graphics and workstation performance, high-performance computing benchmarks, Java client/server benchmarks, mail server benchmarks, network file system benchmarks, and SPECpower_ssj2008, a benchmark focused on the relationship between power and performance.
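To make concrete what an OLTP benchmark exercises, the following is a minimal sketch of a "new order" style transaction, loosely inspired by TPC-C's New-Order profile. The schema and logic are illustrative assumptions only; the real TPC-C specification defines many more tables, transaction types, and workload-mix rules:

```python
import sqlite3

# Toy schema standing in for an order-entry database (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock  (item_id INTEGER PRIMARY KEY, quantity INTEGER);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY AUTOINCREMENT,
                         item_id INTEGER, amount INTEGER);
    INSERT INTO stock VALUES (1, 100);
""")

def new_order(item_id, amount):
    """Atomically record an order and decrement stock.
    The connection context manager commits on success and rolls back
    on any exception, preserving the all-or-nothing property."""
    with conn:
        qty, = conn.execute("SELECT quantity FROM stock WHERE item_id = ?",
                            (item_id,)).fetchone()
        if qty < amount:
            raise ValueError("insufficient stock")
        conn.execute("UPDATE stock SET quantity = quantity - ? WHERE item_id = ?",
                     (amount, item_id))
        conn.execute("INSERT INTO orders (item_id, amount) VALUES (?, ?)",
                     (item_id, amount))

new_order(1, 5)
```

An OLTP benchmark's headline metric is essentially how many such transactions the system under test completes per minute while meeting the specification's response-time constraints.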

While the two consortia follow different methodologies for benchmark development, dissemination, and composition, they share the same primary goal: to provide industry and academia with realistic, verifiable, and fair means of comparing performance.

Since those early days, other, more specialized consortia have arisen, such as the Storage Performance Council (SPC), a non-profit corporation founded to define, standardize, and promote storage subsystem benchmarks and to disseminate objective, verifiable performance data to the computer industry and its customers. Since its founding in 1997, the SPC has developed and published benchmarks and benchmark results focused on storage subsystems and the adapters, controllers, and storage area networks (SANs) that connect storage devices to computer systems. All major system and software vendors are members of these organizations. The TPC membership includes systems and database vendors; SPEC membership also includes universities and research institutions as associates; SPC membership includes systems and storage vendors.

System benchmarks can be classified into industry standard benchmarks, application benchmarks, and benchmarks based on synthetic workloads. Industry standard benchmarks are driven by industry standard consortia whose members include vendors, customers, and research organizations, and which follow democratic procedures for all key decision making. Prominent industry standard consortia are the TPC, SPEC, and SPC. Industry standard benchmarks enable the fairest comparison of technologies and are typically platform agnostic.


Complete Chapter List

Table of Contents
Foreword
Wen-Chang Fang
Preface
Wen-Chen Hu, Naima Kaabouch
Chapter 1. Technologies for Big Data
Kapil Bakshi

Chapter 2. Applying the K-Means Algorithm in Big Raw Data Sets with Hadoop and MapReduce
Ilias K. Savvas, Georgia N. Sofianidou, M-Tahar Kechadi

Chapter 3. Synchronizing Execution of Big Data in Distributed and Parallelized Environments
Gueyoung Jung, Tridib Mukherjee

Chapter 4. Parallel Data Reduction Techniques for Big Datasets
Ahmet Artu Yildirim, Cem Özdogan, Dan Watson

Chapter 5. Techniques for Sampling Online Text-Based Data Sets
Lynne M. Webb, Yuanxin Wang

Chapter 6. Big Data Warehouse Automatic Design Methodology
Francesco Di Tria, Ezio Lefons, Filippo Tangorra

Chapter 7. Big Data Management in the Context of Real-Time Data Warehousing
M. Asif Naeem, Gillian Dobbie, Gerald Weber

Chapter 8. Big Data Sharing Among Academics
Jeonghyun Kim

Chapter 9. Scalable Data Mining, Archiving, and Big Data Management for the Next Generation Astronomical Telescopes
Chris A. Mattmann, Andrew Hart, Luca Cinquini, Joseph Lazio, Shakeh Khudikyan, Dayton Jones, Robert Preston, Thomas Bennett, Bryan Butler, David Harland, Brian Glendenning, Jeff Kern, James Robnett

Chapter 10. Efficient Metaheuristic Approaches for Exploration of Online Social Networks
Zorica Stanimirovic, Stefan Miškovic

Chapter 11. Big Data at Scale for Digital Humanities: An Architecture for the HathiTrust Research Center
Stacy T. Kowalczyk, Yiming Sun, Zong Peng, Beth Plale, Aaron Todd, Loretta Auvil, Craig Willis, Jiaan Zeng, Milinda Pathirage, Samitha Liyanage, Guangchen Ruan, J. Stephen Downie

Chapter 12. GeoBase: Indexing NetCDF Files for Large-Scale Data Analysis
Tanu Malik

Chapter 13. Large-Scale Sensor Network Analysis: Applications in Structural Health Monitoring
Joaquin Vanschoren, Ugo Vespier, Shengfa Miao, Marvin Meeng, Ricardo Cachucho, Arno Knobbe

Chapter 14. Accelerating Large-Scale Genome-Wide Association Studies with Graphics Processors
Mian Lu, Qiong Luo

Chapter 15. The Need to Consider Hardware Selection when Designing Big Data Applications Supported by Metadata
Nathan Regola, David A. Cieslak, Nitesh V. Chawla

Chapter 16. Excess Entropy in Computer Systems
Charles Loboz

Chapter 17. A Review of System Benchmark Standards and a Look Ahead Towards an Industry Standard for Benchmarking Big Data Workloads
Raghunath Nambiar, Meikel Poess