Article Preview
TopIntroduction
Technologies such as grid and cloud computing infrastructures and service-oriented architectures have become adequately mature and have been adopted by a large number of enterprises and organizations. This trend has altered, to an extent, the way complex computational tasks are formulated, giving rise to approaches that rely on composition of services to be executed in a parallel and distributed manner (Alpdemir et al., 2004; Ng, Ooi, Tan, & Zhou, 2003; Malik, Szalay, Budavari, & Thakar, 2003). As a consequence, there is a growing interest in systems that are capable of processing complex tasks formulated as Web Service (WS) workflows utilizing remote computational resources. These tasks may involve the complete management of online sales, the cleaning of large volumes of accumulated data from mistypes, incorrect entries, etc., and the loosely-coupled integration of local applications with tools made available on the Web.
In Srivastava, Munagala, Widom, and Motwani (2006), the notion of Web Service Management System (WSMS) is introduced as a general purpose system, which possesses such advanced processing capabilities. In a WSMS, processing of data takes place through (remote) calls to WSs. The latter provide an interface of the form
, where X and Y are sets of attributes, i.e., given values for attributes in X, WS returns values for the attributes in Y, as shown in the following example adapted from Srivastava et al. (2006). In the generic case, the input data items (or tuples) may have more attributes than X, while attributes in Y are appended to the existing ones. Note that in the rest of this article, we will use the terms tuple and data item interchangeably.
There are multiple valid orderings to perform this task, although there is one precedence constraint: WS2 must precede WS3. The optimization process aims at deciding on the optimal (or near optimal) ordering under given optimization goals. Two possible WS linear orderings that can be formed using the above services are C1 = WS2 WS3 WS1 WS4 and C2 = WS1 WS2 WS3 WS4. In the first ordering, first, the customers having a good payment history are initially selected (WS2, WS3), and then, the remaining customers whose credit history is below some threshold are filtered out (through WS1). The C2 linear plan performs the same tasks in a reverse order. The above linear orderings have different response time. In a subsequent section it will be shown that C2 is the optimal one.