A Run-Time Based Technique to Optimize Queries in Distributed Internet Databases
Latifur Khan (University of Texas at Dallas, USA), Arunkumar Ponnusamy (University of Texas at Dallas, USA), Dennis McLeod (University of Southern California, Los Angeles, USA) and Cyrus Shahabi (University of Southern California, Los Angeles, USA)
Copyright: © 2003
An adaptive probe-based optimization technique is developed and demonstrated in the context of an Internet-based distributed database environment. Database systems distributed across servers that communicate via the Internet are increasingly common, and a query at a given site might require data from remote sites. Optimizing the response time of such queries is a challenging task because server performance and network traffic at the time of data shipment are unpredictable; as a result, a static query optimizer may select an expensive query plan. We constructed an experimental setup consisting of two servers running the same DBMS connected via the Internet. Concentrating on join queries, we demonstrate how a static query optimizer might mistakenly choose an expensive plan. This is due to the lack of a priori knowledge of the run-time environment, inaccurate statistical assumptions in size estimation, and neglect of the cost of remote method invocation. We address these shortcomings collectively by proposing a probing mechanism. Furthermore, we extend our mechanism with an adaptive technique that detects sub-optimality of a plan during query execution and attempts to switch to the cheapest plan while avoiding redundant work and imposing little overhead. We demonstrate that this probe technique can be extended to a client-server environment as a basis for choosing the right place to execute user-defined functions (UDFs). Our run-time optimization technique for queries was implemented in Java and incorporated into an experimental setup. The results demonstrate the superiority of our probe-based optimization over static optimization.
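The abstract does not spell out the probing mechanism, so the following is only a minimal sketch of the general idea, not the authors' implementation (which was written in Java). It assumes a probe means running each candidate plan on a small prefix of the input, and that the observed cost can be extrapolated linearly to the full input. The names `probe_and_choose`, `ship_outer_relation`, `ship_inner_relation`, and the parameter `probe_fraction` are hypothetical, and the plan costs are simulated rather than measured over a real network.

```python
from typing import Callable, Dict, Sequence

# Hypothetical type: a plan processes a batch of rows and returns the seconds
# it took. In a real system this would be a measured wall-clock time that
# reflects current server load and network traffic.
Plan = Callable[[Sequence[int]], float]


def probe_and_choose(plans: Dict[str, Plan], rows: Sequence[int],
                     probe_fraction: float = 0.1) -> str:
    """Run every candidate plan on a small prefix of the data and return the
    name of the plan with the lowest estimated cost for the full input."""
    probe_size = max(1, int(len(rows) * probe_fraction))
    probe_rows = rows[:probe_size]
    estimates = {}
    for name, plan in plans.items():
        elapsed = plan(probe_rows)
        # Linear extrapolation from probe to full input (an assumption).
        estimates[name] = elapsed * len(rows) / probe_size
    return min(estimates, key=estimates.get)


# Simulated plans: the per-row cost stands in for run-time conditions that a
# static optimizer could not know in advance.
def ship_outer_relation(batch: Sequence[int]) -> float:
    return 0.004 * len(batch)


def ship_inner_relation(batch: Sequence[int]) -> float:
    return 0.001 * len(batch) + 0.05  # includes a fixed setup cost


rows = list(range(1000))
best = probe_and_choose(
    {"ship_outer": ship_outer_relation, "ship_inner": ship_inner_relation},
    rows,
)
print(best)
```

With these simulated costs the probe favours `ship_inner`. An adaptive variant in the spirit of the abstract would repeat a similar comparison during execution and switch plans if the running one turns out to be sub-optimal; that part is not sketched here.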