Topology and Topic-Aware Service Clustering

Weifeng Pan (Zhejiang Gongshang University, Hangzhou, China), Jilei Dong (University of Connecticut, Storrs, USA), Kun Liu (Hubei University of Economics, Wuhan, China) and Jing Wang (Jiangxi University of Finance and Economics, Nanchang, China)
Copyright: © 2018 | Pages: 20
DOI: 10.4018/IJWSR.2018070102

Abstract

This article describes how the sheer number and variety of services make it difficult to discover the desired services accurately. Service clustering is an effective way to facilitate service discovery. However, existing approaches are usually designed for a single type of service document and do not fully use the topic and topological information in service profiles and usage histories. To overcome these limitations, this article presents a novel service clustering approach that proceeds in four steps. First, it adopts a bipartite network to describe the topological structure of service usage histories and uses the SimRank algorithm to measure the topological similarity of services. Second, it applies Latent Dirichlet Allocation to extract topics from service profiles and quantifies the topic similarity of services. Third, it quantifies the overall similarity of services by integrating the topological and topic similarities. Finally, it uses the Chameleon clustering algorithm to cluster the services. An empirical evaluation on a real-world data set highlights the benefits of combining topological and topic similarities.
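
The first and third steps of the approach can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes usage histories are given as a mapping from each composition (for example, a mashup) to the services it invokes, runs a plain bipartite SimRank iteration over that mapping, and blends the result with a topic similarity through a weighted sum. The names `bipartite_simrank`, `cosine`, `combined_similarity`, the weight `alpha`, the decay value, the use of cosine similarity, and the sample data are all hypothetical; the abstract does not state how the two similarities are integrated. The topic vectors are hand-written placeholders standing in for Latent Dirichlet Allocation output, and the Chameleon clustering step is not shown.

```python
from itertools import product
from math import sqrt


def bipartite_simrank(usage, decay=0.8, iterations=10):
    """Return SimRank scores between services.

    `usage` maps each composition (for example a mashup) to the set of
    services it invokes; this layout is an assumption for illustration.
    """
    comps = sorted(usage)
    services = sorted({s for used in usage.values() for s in used})
    used_by = {s: [c for c in comps if s in usage[c]] for s in services}

    def identity(nodes):
        # Every node starts fully similar to itself and to nothing else.
        return {(a, b): float(a == b) for a, b in product(nodes, repeat=2)}

    def step(nodes, neighbours, other_scores):
        # One SimRank update for one side of the bipartite network,
        # averaging the previous scores of neighbours on the other side.
        scores = {}
        for a, b in product(nodes, repeat=2):
            na, nb = neighbours[a], neighbours[b]
            if a == b:
                scores[a, b] = 1.0
            elif not na or not nb:
                scores[a, b] = 0.0
            else:
                total = sum(other_scores[x, y] for x in na for y in nb)
                scores[a, b] = decay * total / (len(na) * len(nb))
        return scores

    sim_services, sim_comps = identity(services), identity(comps)
    for _ in range(iterations):
        # Both sides are updated from the previous iteration's scores.
        sim_services, sim_comps = (
            step(services, used_by, sim_comps),
            step(comps, usage, sim_services),
        )
    return sim_services


def cosine(p, q):
    """Cosine similarity of two topic distributions (assumed measure)."""
    dot = sum(x * y for x, y in zip(p, q))
    return dot / (sqrt(sum(x * x for x in p)) * sqrt(sum(y * y for y in q)))


def combined_similarity(topological, topic, alpha=0.5):
    """Weighted sum of the two similarities; `alpha` is hypothetical."""
    return alpha * topological + (1 - alpha) * topic


# Made-up usage histories and placeholder topic vectors.
usage = {"m1": {"maps", "geocoder"}, "m2": {"maps", "weather"}, "m3": {"photos"}}
topics = {
    "maps": [0.7, 0.2, 0.1],
    "geocoder": [0.6, 0.3, 0.1],
    "photos": [0.1, 0.1, 0.8],
}
topo = bipartite_simrank(usage)
for other in ("geocoder", "photos"):
    score = combined_similarity(
        topo["maps", other], cosine(topics["maps"], topics[other])
    )
    print(f"maps vs {other}: {score:.3f}")
```

In this toy data, `maps` and `geocoder` are invoked together by one composition, while `photos` shares no composition with either, so its topological similarity to `maps` stays at zero and only the topic term can contribute. The resulting pairwise scores are the kind of input a clustering algorithm would consume.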

Introduction

Service-Oriented Computing (SOC) is a new computing paradigm that provides a distributed computing infrastructure for both intra- and cross-enterprise application integration and collaboration (Papazoglou, 2003). It has become one of the hottest topics in recent years, receiving much attention from both the research community and industry. SOC advocates building distributed applications by composing services, which greatly changes the way that software applications are designed, architected, delivered and consumed (Zhang & Zhang, 2013). Software is now exposed to an open and collaborative environment constituted by services. With the prevalence of SOC, the number and variety of services published on the Internet have been growing rapidly (Chen et al., 2015), and an Internet of Services is taking shape (Tan et al., 2010). With so many services of so many types, accurately and efficiently discovering the desired services has become a problem for many service consumers (Zhang et al., 2016).

Service clustering is considered an effective way to facilitate service discovery: it organizes services into clusters of similar services, which helps prune the query space (Xia et al., 2015; Zhang et al., 2009). In recent years, several kinds of approaches have been proposed for clustering services to improve the performance of service discovery, mainly functional-attributes-based (FAB) approaches (Dasgupta et al., 2011) and nonfunctional-attributes-based (NFAB) approaches (Chen et al., 2015; Xia et al., 2011; Zhu et al., 2012). In the current work, we focus on FAB approaches and aim to improve the quality of functionally similar clusters. Previous work on FAB approaches has made considerable progress, but the following problems remain:

  • Most prior FAB approaches are designed for a single type of service document and rely on the information in WSDL (Web Services Description Language) documents or OWL-S (Web Ontology Language for Services) documents to compute service similarity. However, some newer types of services, such as RESTful services, Web APIs, and Mashups, have no WSDL or OWL-S document to draw on;

  • Most prior FAB approaches use the vector space model (Salton et al., 1975) to reduce the dimensionality of service documents. Only a few approaches take the topics of services into consideration when computing service similarity;

  • Most prior FAB approaches do not use the topological information in service usage histories to compute service similarity. However, service usage histories also contain rich information about the similarity of services, which is a key factor in clustering performance.

Thus, it is natural to ask questions such as “how to organize services into clusters in the absence of WSDL documents or OWL-S documents” and “how to group services into clusters using the topics of services and the topology of service usage histories”.
