Data-Centric Benchmarking

Jérôme Darmont (Université de Lyon, Lyon 2, ERIC EA3083, France)
Copyright: © 2018 | Pages: 11
DOI: 10.4018/978-1-5225-2255-3.ch154

Abstract

In data management, both system designers and users routinely resort to performance evaluation. Performance evaluation by experimentation on a real system is generally referred to as benchmarking. The aim of this chapter is to present an overview of the major past and present state-of-the-art data-centric benchmarks. This review includes the TPC standard benchmarks, as well as alternative or more specialized benchmarks. Surveyed benchmarks are categorized into three families: transaction benchmarks aimed at On-Line Transaction Processing (OLTP), decision-support benchmarks aimed at On-Line Analytical Processing (OLAP), and big data benchmarks. Issues, tradeoffs and future trends in data-centric benchmarking are also discussed.

Introduction

In data management, both system designers and users routinely resort to performance evaluation. On the one hand, designers need to test architectural features and hypotheses regarding the actual (vs. theoretical) behavior of a system, especially in terms of response time and scalability. Performance tuning also requires accurate performance evaluation. On the other hand, users are keen to compare the efficiency of different technologies before selecting a software solution. Hence, performance measurement tools are of prime importance in the data management domain.

Performance evaluation by experimentation on a real system is generally referred to as benchmarking. It consists of performing a series of tests on a given system to estimate its performance in a given setting. Typically, a data-centric benchmark is made up of two main elements: a data model (conceptual schema and extension) and a workload model (a set of read and write operations) to apply to this dataset according to a predefined protocol. Both models may be parameterized. Most benchmarks also include a set of simple or composite performance metrics such as response time, throughput, number of input/output operations, and disk or memory usage.
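
To make this structure concrete, here is a minimal sketch in Python of a toy benchmark with the three ingredients named above: a data model, a parameterized workload model, and a few performance metrics. It is not taken from any TPC specification. The `account` table, the function names, the default parameters and the use of an in-memory SQLite database are all assumptions chosen for illustration.

```python
import random
import sqlite3
import time


def build_dataset(conn, n_rows, seed=42):
    """Data model: a schema plus its extension (synthetic rows).

    The 'account' table is a hypothetical schema used only for this sketch.
    """
    rng = random.Random(seed)
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO account (id, balance) VALUES (?, ?)",
        [(i, rng.randint(0, 1000)) for i in range(n_rows)],
    )
    conn.commit()


def run_workload(conn, n_ops, n_rows, write_ratio=0.2, seed=7):
    """Workload model: a parameterized mix of read and write operations.

    Returns the response time of each operation, in seconds.
    """
    rng = random.Random(seed)
    latencies = []
    for _ in range(n_ops):
        key = rng.randrange(n_rows)
        start = time.perf_counter()
        if rng.random() < write_ratio:
            # Write operation: update one row and commit.
            conn.execute(
                "UPDATE account SET balance = balance + 1 WHERE id = ?", (key,)
            )
            conn.commit()
        else:
            # Read operation: fetch one row by primary key.
            conn.execute(
                "SELECT balance FROM account WHERE id = ?", (key,)
            ).fetchone()
        latencies.append(time.perf_counter() - start)
    return latencies


def summarize(latencies):
    """Performance metrics: simple figures derived from response times."""
    total = sum(latencies)
    return {
        "operations": len(latencies),
        "mean_response_time_s": total / len(latencies),
        "max_response_time_s": max(latencies),
        "throughput_ops_per_s": len(latencies) / total if total > 0 else float("inf"),
    }


def run_benchmark(n_rows=1000, n_ops=500, write_ratio=0.2):
    """Protocol: load the dataset, run the workload, report the metrics."""
    conn = sqlite3.connect(":memory:")
    try:
        build_dataset(conn, n_rows)
        return summarize(run_workload(conn, n_ops, n_rows, write_ratio))
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_benchmark())
```

Varying `n_rows` changes the data model's scale and varying `write_ratio` changes the workload mix, which is the sense in which both models may be parameterized. The measured figures depend on the machine and are only meaningful relative to one another.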

The Transaction Processing Performance Council (TPC), a non-profit organization founded in 1988, plays a leading role in data-centric benchmarking. Its mission is to issue standard benchmarks, to verify that the industry applies them correctly, and to publish performance test results. TPC members include all the major industrial actors in the database field.

The aim of this chapter is to present an overview of the major past and present state-of-the-art data-centric benchmarks. Our review includes the TPC standard benchmarks, as well as alternative or more specialized benchmarks. We survey benchmarks from three families: transaction benchmarks aimed at On-Line Transaction Processing (OLTP), decision-support benchmarks aimed at On-Line Analytical Processing (OLAP), and big data benchmarks. Finally, we discuss the issues, tradeoffs and future trends in data-centric benchmarking.

Key Terms in this Chapter

Data Model: In a data-centric benchmark, a database schema and a protocol for instantiating this schema, i.e., generating synthetic data or reusing real-life data.

Cloud Benchmarking: Use of cloud services in the respective (distributed) systems under test (Folkerts et al., 2012).

Performance Metrics: Simple or composite metrics aimed at expressing the performance of a system.

Benchmark: A standard program that runs on different systems to provide an accurate measure of their performance.
