Article Preview
TopIntroduction
The increasing deployment of the service-oriented programming paradigm, in which programs are composed by weaving together distributed, platform-independent software components, raises the problem of service substitution: How to best replace one software component with another. The substitution may be motivated by a variety of reasons: An existing component might have failed, or there could be an alternative service that is of higher quality or lower cost. As the service-oriented paradigm gains popularity, the number of available services increases substantially, making the searches for substitutions more complicated and costly, and raising the need for methods and tools that facilitate such searches (Sillito, Murphy et al. 2008).
Finding the service (or group of services) that are most similar to a given service can be stated as a k-nearest neighbors problem. As such, it requires establishing a formal notion of service similarity. In this paper we describe a formal model for information services that incorporates the concept of service similarity, and we utilize this model for addressing the challenge of finding service substitutions.
Our model has several distinguishing aspects. First, the model considers services as mappings of inputs to outputs. A simple example is a service that returns the current temperature for a given US Zip code. These services are stateless and free of side-effects. Our view of services is thus quite similar to that advocated by the Representative State Transfer (REST) architectural style (Zhao and Doshi 2009). Services designed according to RESTful principles make resources available to clients through Web pages: Clients navigate to a URL, provide the necessary input values, invoke the service and receive a response. Since REST describes functionality with simple URLs rather than operations enveloped in complex XML standards, RESTful service implementations are becoming a popular alternative to SOAP/WSDL technologies. Unfortunately, formal models and techniques that prove so useful for developing services with SOAP/WSDL standards are absent for this type of service. So, as more data is made available via RESTful services, a formal model for information services is expected to be useful for RESTful services.
An important feature of the model is that the input and output of a service could be arbitrarily complex. For example, a service that returns the value of a stock portfolio given a particular date and a set of stock symbols and their corresponding quantities — has input which is a pair comprising a single value (date) and an arbitrarily-sized set of pairs (stock symbol and quantity). Another feature of the model is that the semantics of services need not be represented externally — by means of natural language descriptions, sets of keywords, ontologies, and so on. Instead, when necessary, semantics are inferred from observable behavior.
Our model defines a notion of service similarity that is based solely on observable behavior. Similarity is measured with distance metrics. Each basic domain (such as real numbers or character strings) is associated with a simple metric, and these metrics are then combined to create complex metrics that could be associated with domains of arbitrary complexity. Given two services with identically structured inputs and outputs, their similarity is derived from the similarity of their outputs for identical inputs. Additionally, the model is extended to deal with service exceptions: instances in which services might not provide valid outputs as expected.
A naïve solution to the problem of finding substitutions is to compare the observable behavior of the given service to each of the candidate services. To avoid long computations at the time when the substitutions are needed, similarity measurements should be prepared ahead of time. The cost, however, could be high, since metrics could be complex, and the number of services could be high. To overcome this challenge we describe a method that embeds the given set of services into a vector space, where the complexity of measurement is reduced significantly.