A Proposed Validation Method for a Benchmarking Methodology

Tudorica Bogdan George (Petroleum-Gas University of Ploiesti, Ploiesti, Romania & The Bucharest University of Economic Studies, Bucharest, Romania)
Copyright: © 2014 | Pages: 10
DOI: 10.4018/ijsem.2014100101

Abstract

The aim of this paper is to describe a means of validating a previously proposed benchmarking method targeted at NoSQL databases. The method described here is based on a statistical analysis of the results produced by the benchmarking software. The motivation for proposing such a validation method is the author's view that any benchmarking methodology should be both based on a rigorous algorithm and verified thoroughly (by validation and by testing) before practical use.
Article Preview

About Benchmarking and Its Problems

The benchmarking activity usually involves repeating a series of operations. The execution time of these operations is then interpreted in one way or another, depending on the purpose of the benchmark. This way of working is prone to a series of difficulties, which are listed in the following paragraphs.
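As a rough illustration of that working pattern (a minimal sketch, not the method proposed in the paper), the loop below repeats an operation, records each execution time, and keeps the raw samples for later interpretation. The workload and the repetition count are placeholders chosen only for the example.

```python
import time

def run_benchmark(operation, repetitions=30):
    """Run `operation` repeatedly and return the list of execution times in seconds."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    return samples

if __name__ == "__main__":
    # Placeholder workload: sorting a list stands in for a real database operation.
    times = run_benchmark(lambda: sorted(range(10_000), reverse=True))
    print(f"min={min(times):.6f}s  mean={sum(times)/len(times):.6f}s  max={max(times):.6f}s")
```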

First, software or hardware manufacturers can optimize their products so that they behave better in the well-known benchmarking applications (and they have done so on several occasions). For this reason the results may or may not be meaningful. Such an optimization can even lower the product's results on real-world workloads.

Second, most benchmarks target execution times exclusively, ignoring other important (but harder to test) traits of the benchmarked products, such as quality of service (security, availability, trust level, execution integrity, maintainability, scalability, the ability to swiftly relocate computing capabilities, etc.). In most real situations, users are looking for a balance between performance and one or more of these traits. In addition, database applications have other qualities that are even more difficult to test, such as ACID completeness, adherence to scalability rules, and compliance with level-of-service rules.

Third, most benchmarks do not measure the total cost of ownership. As an exception, the Transaction Processing Performance Council benchmark partly covers this subject by including a price/performance metric. Even this metric can be gamed by manufacturers who artificially lower prices by various means. Another metric that can be significant is the performance achieved per unit of energy consumed.
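For illustration only (the figures below are invented and do not come from the TPC benchmark or from the article), a price/performance metric divides total system cost by throughput, while an energy metric divides throughput by power draw:

```python
# Hypothetical figures for a single benchmarked system.
system_price_usd = 50_000.0   # assumed total system cost
throughput_tps = 12_500.0     # assumed transactions per second
power_draw_watts = 800.0      # assumed average power consumption

price_performance = system_price_usd / throughput_tps   # USD per tps (lower is better)
energy_efficiency = throughput_tps / power_draw_watts   # tps per watt (higher is better)

print(f"price/performance: {price_performance:.2f} USD per tps")
print(f"energy efficiency: {energy_efficiency:.2f} tps per watt")
```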

There are also difficulties in adapting existing benchmarks to distributed environments (especially cloud and grid ones), because these environments are neither fully compatible with older standalone systems nor do they work in the same way.

To add to the difficulties, the concept of performance is a subjective one; most users equate performance with the capability of the evaluated object to reach a desired level of service.

The performance of many server architectures degrades sharply at load levels close to 100%, yet almost no benchmarks run at this kind of load.

Most benchmarks focus on a single application or class of applications (e.g., office applications only), excluding even the case of simultaneous execution of other applications. Also, most benchmarks are designed to run on physical machines, not on virtual ones (which behave differently in some situations), although physical machines are increasingly being replaced by virtual ones.

Finally, many benchmarks are not based on a scientific working methodology: no optimal sample size is established or used, no control variables are used, the repeatability of the results is not ensured, and so on.
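The following sketch shows the kind of statistical checks this paragraph says are usually missing: a confidence interval for the mean execution time and a rough estimate of the number of repetitions needed for a target precision. It assumes approximately normal sample means and uses invented pilot timings; it is not the validation method proposed in the paper.

```python
import statistics

def mean_confidence_interval(samples, z=1.96):
    """95% confidence interval for the mean (normal approximation)."""
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / len(samples) ** 0.5
    return mean - z * sem, mean + z * sem

def required_sample_size(stdev, margin_of_error, z=1.96):
    """Smallest n such that the CI half-width is at most margin_of_error."""
    return max(2, int((z * stdev / margin_of_error) ** 2) + 1)

if __name__ == "__main__":
    pilot = [1.02, 0.98, 1.05, 0.97, 1.01, 1.03, 0.99, 1.04]  # pilot timings in seconds
    low, high = mean_confidence_interval(pilot)
    print(f"95% CI for mean execution time: [{low:.3f}, {high:.3f}] s")
    print("repetitions needed for ±0.005 s:",
          required_sample_size(statistics.stdev(pilot), 0.005))
```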
