Control-Based Database Tuning Under Dynamic Workloads

Yi-Cheng Tu, Gang Ding
Copyright: © 2009 | Pages: 6
DOI: 10.4018/978-1-60566-010-3.ch053


Introduction

Database administration (tuning) is the process of adjusting database configurations to meet performance goals. The job is performed by human operators, database administrators (DBAs), who are generally well paid and are becoming more expensive as modern databases grow in complexity and scale. Considerable effort has been dedicated to reducing this cost (which often dominates the total cost of ownership of mission-critical databases) by making database tuning more automated and transparent to users (Chaudhuri et al., 2004; Chaudhuri and Weikum, 2006). Research in this area seeks ways to automate hardware deployment, physical database design, parameter configuration, and resource management in such systems. The goal is to achieve acceptable performance at the whole-system level with little or no human intervention.

According to Weikum et al. (2002), problems in this category can be stated as:

    workload × configuration (?) → performance

which means that, given the features of the incoming workload to the database, we are to find the right settings for all system knobs such that the performance goals are satisfied. The following two are representative of a series of such tuning problems in different databases:

  • Problem 1: Maintenance of multi-class service-level agreements (SLAs) in relational databases. Database service providers usually offer various levels of performance guarantees to requests from different groups of customers. Such guarantees (SLAs) are fulfilled by allocating different amounts of system resources to different queries. For example, query response time is negatively related to the amount of memory buffer assigned to that query. We need to dynamically allocate memory to individual queries such that the absolute or relative response-time targets of queries from different users are met.

  • Problem 2: Load shedding in stream databases. Stream databases process data generated continuously from sources such as a sensor network. In stream databases, data processing delay, i.e., the time consumed to process a data point, is the most critical performance metric (Tatbul et al., 2003). The ability to stay within a desired level of delay is significantly hampered under overload (caused by bursty data arrivals and time-varying unit data processing cost). When overloaded, the system discards some data (i.e., load shedding) in order to keep pace with the incoming load. The system needs to continuously adjust the amount of data to be discarded such that (1) delay is kept below a desired level and (2) data is not discarded unnecessarily.

Such problems can hardly be solved by rules of thumb or by simply adding more hardware. In the following section, we shall also see that the traditional approach of treating tuning problems as static optimization problems does not work well for dynamic workloads such as those with OLAP queries. In this chapter, we introduce an emerging approach to self-tuning database problems that is based on well-established results in feedback control theory. Specifically, we address the core issues of the approach and identify critical challenges of applying control theory in the area of self-tuning databases.
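
To make the feedback idea concrete for Problem 2, the following is a minimal sketch of a load shedder driven by a simple proportional feedback rule. It is an illustration only and is not the controller design developed in this chapter. The queue model, the per-period time step, the fixed and known processing capacity, and the names shed_fraction, simulate, and gain are all assumptions made for the example.

```python
def shed_fraction(queue_len, arrivals, capacity, target_delay, gain=0.5):
    """Return the fraction of this period's arrivals to discard.

    Assumed toy model: delay is approximated as queue_len / capacity,
    and capacity (tuples processed per period) is known and constant.
    """
    if arrivals <= 0:
        return 0.0
    # Queue length that corresponds to the desired delay.
    target_queue = target_delay * capacity
    # Feedback step: admit enough tuples to close part of the gap
    # between the measured queue length and its target.
    admit = capacity + gain * (target_queue - queue_len)
    fraction = 1.0 - admit / arrivals
    # A discard fraction only makes sense between 0 and 1.
    return min(1.0, max(0.0, fraction))


def simulate(arrival_trace, capacity, target_delay, gain=0.5):
    """Replay an arrival trace; return (shed fraction, delay) per period."""
    queue_len = 0.0
    history = []
    for arrivals in arrival_trace:
        u = shed_fraction(queue_len, arrivals, capacity, target_delay, gain)
        admitted = (1.0 - u) * arrivals
        # The queue grows by admitted tuples and drains by capacity.
        queue_len = max(0.0, queue_len + admitted - capacity)
        history.append((u, queue_len / capacity))
    return history


if __name__ == "__main__":
    # Hypothetical trace: quiet phase, burst above capacity, quiet phase.
    trace = [100] * 5 + [300] * 20 + [100] * 10
    for step, (u, delay) in enumerate(simulate(trace, capacity=150, target_delay=2.0)):
        print(f"{step:2d} shed={u:.2f} delay={delay:.2f}")
```

In this toy setting the shedder discards nothing while arrivals stay below capacity, and during the burst it discards only as much as is needed for the delay to approach the target from below. A real stream database would have to estimate the time-varying processing cost rather than assume a constant capacity.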

Background

Current research in automatic tuning (or self-tuning) of databases tends to treat the problem as an optimization problem with the performance metrics and workload characteristics as inputs. The main drawback of this strategy is that real-world workloads, especially OLAP workloads, are highly unpredictable: their parameters and behaviors can change very frequently (Tu et al., 2005). Such uncertainties in workloads can cause dramatic variations in system performance and leave the database running in a suboptimal state. In order to maintain consistently good performance, we need to develop means for the database to adapt quickly to changes in workload.
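
The drawback of a statically optimized setting can be illustrated with a small sketch based on the load-shedding problem. The model below is a deliberately simplified assumption (a single queue, a fixed processing capacity per period, and a discard fraction chosen once from a workload forecast); the helper names static_shed_fraction and run are hypothetical and do not come from any cited system.

```python
def static_shed_fraction(expected_arrivals, capacity):
    """Choose one fixed discard fraction from a workload forecast.

    Hypothetical helper: discards just enough for the forecast arrival
    rate to match the processing capacity, and nothing otherwise.
    """
    if expected_arrivals <= capacity:
        return 0.0
    return 1.0 - capacity / expected_arrivals


def run(arrival_trace, capacity, fraction):
    """Replay a trace with a fixed discard fraction.

    Returns (final delay in periods, total tuples discarded), where
    delay is approximated as queue length divided by capacity.
    """
    queue_len = 0.0
    discarded = 0.0
    for arrivals in arrival_trace:
        discarded += fraction * arrivals
        admitted = (1.0 - fraction) * arrivals
        queue_len = max(0.0, queue_len + admitted - capacity)
    return queue_len / capacity, discarded


if __name__ == "__main__":
    # Hypothetical workload that shifts from 100 to 300 tuples per period.
    trace = [100] * 10 + [300] * 10
    for forecast in (100, 300):
        fraction = static_shed_fraction(forecast, capacity=150)
        delay, dropped = run(trace, capacity=150, fraction=fraction)
        print(f"forecast={forecast} fraction={fraction:.2f} "
              f"final_delay={delay:.1f} discarded={dropped:.0f}")
```

Under these assumptions neither fixed setting is satisfactory. The setting tuned for the quiet phase lets the queue grow throughout the burst, while the setting tuned for the burst discards half of the data during the quiet phase even though the system is not overloaded.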
