1. Introduction
Service-oriented computing has been widely applied in a variety of fields. A web service is composed of a series of operations, each of which takes a SOAP message containing a list of input parameters, fulfills a certain task, and returns the result in an output SOAP message. Enterprises increasingly rely on web services as a way to provide services to users. The rapid growth in the number of web services raises a challenging problem: how to accurately find the required web service.
To solve this problem, current web service discovery methods fall into two main categories (Hua, Liu, & Liu, 2008; Lv, Yang, & Yang, 2009): semantics-based methods and syntax-based methods, the latter known as traditional web service discovery. The traditional method uses UDDI (Universal Description, Discovery and Integration) (WWW Companion, 2004) as the information registration specification and WSDL (Web Services Description Language) (Christensen, Curbera, & Meredith, 2001) to describe the web service, and matches web services by exact keyword matching. Both kinds of methods have problems: (1) Semantics-based methods such as OWL-S (Martin & Burstein, 2007) and WSMO (Battle, Bernstein, & Boley, 2005) use a formal language to describe web services and use reasoning-based similarity algorithms to search for services. However, semantic annotation is a manual process that requires people to study the ontology and annotate the web services. Moreover, web service providers and requesters may not use the same ontology, and an ontology may contain a large number of concepts. Therefore, although accuracy is improved, this comes at a considerable cost in time, and the workload remains unpredictable. (2) Some classic information retrieval algorithms are based on WSDL. They extract terms from the XML-based WSDL document to form a vector and use a traditional similarity algorithm to calculate the similarity of web services directly. This kind of method performs poorly because it is based on term frequency, while the terms obtained from a WSDL document are relatively few and occur relatively rarely. To address these problems, Woogle (Dong, Halevy, Madhavan, Nemes, & Zhang, 2009) and URBE (Plebani & Pernici, 2009) have been built. They use the semantics of terms extracted from WSDL documents to support the similarity calculation.
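The term-vector approach in item (2) can be made concrete with a small sketch. The code below is only an illustration under stated assumptions, not the method of any cited work: the identifier lists are hypothetical stand-ins for names extracted from two WSDL documents, and `split_terms`, `term_vector`, and `cosine_similarity` are helper names introduced here.

```python
import math
import re
from collections import Counter


def split_terms(identifier):
    """Split an identifier such as 'getWeatherByZip' into lowercase terms."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [p.lower() for p in parts]


def term_vector(identifiers):
    """Count how often each term occurs across a service's identifiers."""
    counts = Counter()
    for name in identifiers:
        counts.update(split_terms(name))
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical identifiers, standing in for names pulled from two WSDL files.
weather_a = term_vector(["getWeatherByZip", "zipCode", "temperature"])
weather_b = term_vector(["fetchForecast", "postalCode", "degrees"])

similarity = cosine_similarity(weather_a, weather_b)
print(f"{similarity:.3f}")
```

Although both hypothetical services deal with weather by postal code, the only shared term is "code", so the similarity comes out at roughly 0.15. This is the sparsity problem noted above: with so few terms per document, purely syntactic overlap says little about whether two services are related.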
In (Dong, Halevy, Madhavan, Nemes, & Zhang, 2009), meaningful concepts are clustered from the parameter names collected from web services, and the TF/IDF algorithm is then applied to these concepts to obtain the similarity of services. Plebani and Pernici (2009) use additional knowledge, such as the lexical database WordNet, to obtain the semantic distance between terms extracted from the operations and input/output parameters in WSDL documents. Apart from these two representative methods, numerous other techniques have also been developed (Hess & Kushmerick, 2003; D.-S., & Coalition. Daml-s, 2002; Paolucci, Kawamura, Payne, & Sycara, 2002). Although much progress has been made, most of these methods use only the information available within the service descriptions, and hence apply only a single similarity metric. These restrictions often lead to a dilemma: whichever metric is selected, there are cases that cannot be handled correctly, as illustrated below.
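As a rough illustration of applying TF/IDF to concepts rather than raw terms, consider the following minimal sketch. It is not Woogle's implementation: the `CONCEPT_OF` map is hand-written and hypothetical (in the cited work the concepts are obtained by clustering parameter names), the smoothed IDF formula is one common variant chosen here for convenience, and all function names are assumptions.

```python
import math
from collections import Counter

# Hypothetical term-to-concept map. In the approach described above the
# concepts come from clustering parameter names; here they are hand-written.
CONCEPT_OF = {
    "zip": "location", "postal": "location", "city": "location",
    "temperature": "weather", "forecast": "weather",
    "price": "cost", "fee": "cost",
}


def concept_bag(terms):
    """Map terms to concepts and count concept occurrences; unknown terms are dropped."""
    return Counter(CONCEPT_OF[t] for t in terms if t in CONCEPT_OF)


def tfidf_vectors(bags):
    """Weight each concept by term frequency times a smoothed inverse document frequency."""
    n_docs = len(bags)
    doc_freq = Counter()
    for bag in bags:
        doc_freq.update(bag.keys())
    vectors = []
    for bag in bags:
        total = sum(bag.values())
        vec = {}
        for concept, count in bag.items():
            # Smoothed IDF (an assumption of this sketch) stays positive
            # even when a concept occurs in every document.
            idf = math.log((1 + n_docs) / (1 + doc_freq[concept])) + 1.0
            vec[concept] = (count / total) * idf
        vectors.append(vec)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse weight vectors stored as dicts."""
    dot = sum(w * b[c] for c, w in a.items() if c in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical term lists for three services.
services = [["zip", "temperature"], ["postal", "forecast"], ["price", "fee"]]
vectors = tfidf_vectors([concept_bag(s) for s in services])

sim_weather = cosine(vectors[0], vectors[1])    # two weather-by-location services
sim_unrelated = cosine(vectors[0], vectors[2])  # weather service vs. pricing service
```

In this toy corpus the first two services share no literal term, yet they map to the same concepts and therefore receive a similarity of essentially 1, while the pricing service shares no concept with them and scores 0. The result depends entirely on the quality of the concept map, which is the part this sketch assumes rather than computes.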