Clustering Mashups by Integrating Structural and Semantic Similarities Using Fuzzy AHP

Weifeng Pan (School of Computer Science and Information Engineering, Zhejiang Gongshang University, China), Xinxin Xu (School of Computer Science and Information Engineering, Zhejiang Gongshang University, China), Hua Ming (School of Engineering and Computer Science, Oakland University, USA) and Carl K. Chang (Department of Computer Science, Iowa State University, USA)
Copyright: © 2021 | Pages: 24
DOI: 10.4018/IJWSR.2021010103

Abstract

Mashup technology has become a promising way to develop and deliver applications on the web. Automatically organizing Mashups into functionally similar clusters helps improve the performance of Mashup discovery. Although many approaches aim to cluster Mashups, they focus solely on semantic similarities to guide the clustering process and are unable to utilize both the structural and semantic information in Mashup profiles. In this paper, a novel approach to cluster Mashups into groups is proposed, which integrates structural similarity and semantic similarity using fuzzy AHP (fuzzy analytic hierarchy process). The structural similarity is computed from the usage histories between Mashups and Web APIs using the SimRank algorithm. The semantic similarity is computed from the descriptions and tags of Mashups using LDA (latent Dirichlet allocation). A clustering algorithm based on a genetic algorithm is employed to cluster Mashups. Comprehensive experiments are performed on a real data set collected from ProgrammableWeb. The results show the effectiveness of the approach when compared with two kinds of conventional approaches.

Introduction

Services are a special kind of software component, developed and deployed by different vendors. They are available over the Internet and can be used to provide certain business functionalities (Zhang et al., 2013). However, an individual service usually offers limited functionality, so people need to combine several services to satisfy complex real-world requirements (Chen et al., 2015). “Mashup” has become an important technique to develop applications (i.e., Mashups) by composing services (i.e., Web APIs) (Xia et al., 2014), and can be seen as a novel way to perform service composition (Zhang et al., 2010). To date, a large number of Mashups have been published. These Mashups are usually hosted in online repositories such as ProgrammableWeb (PWeb) (ProgrammableWeb, 2017), myExperiment (myExperiment, 2017), and Biocatalogue (Biocatalogue, 2017). These repositories provide tools to help users search and browse for Mashups that satisfy their requirements. However, because the number of Mashups is so large, Mashup discovery has become a challenging and time-consuming task (Zhang et al., 2016).

Clustering is regarded as one of the most effective ways to improve the performance of Mashup discovery: it organizes similar Mashups into groups and thereby shrinks the search space (Liu & Wong, 2009; Liu & Wong, 2008). Similarity calculation is a central aspect of any Mashup clustering approach. Thus, a large part of the existing work on service clustering is devoted to improving clustering performance by using more of the information about services to quantify service similarity. For example, some clustering approaches (Liu & Wong, 2009; Liu & Wong, 2008; Platzer et al., 2009; Liang et al., 2014; Cassar et al., 2010; Aznag et al., 2013) utilized the description information (e.g., WSDL documents, Mashup descriptions, etc.) in the service profile to quantify service similarity, and then organized services into groups. Other clustering approaches (Chen et al., 2011; Wu et al., 2014; Wu et al., 2012; Chen et al., 2013; Li et al., 2014) utilized both the description information and user-contributed tags to quantify service similarity, and then organized services into clusters. In fact, whichever information (descriptions or tags) the existing approaches used, it is all expressed as text. That is, existing approaches actually captured the semantic similarity between services. However, they have never applied any structural similarity to guide service clustering. By structural similarity, we mean the similarity that results from the structural context of any two services. For example, any Mashup is developed by composing a set of Web APIs. A Mashup, the Web APIs it uses, and the “use” coupling between them constitute a topological structure, so two Mashups give rise to two such structures. If the two Mashups share some Web APIs in common, then the two Mashups are structurally similar. Thus, we can propose some metrics to characterize the structural similarity between two Mashups. Structural similarity is also of great importance. It captures a different aspect of service similarity and is orthogonal to semantic similarity. In our previous work, we have applied one structural similarity to cluster services (Pan & Chai, 2018). Our preliminary experimental results indicate that our structural similarity has better performance than the semantic similarities in service clustering. However, as far as we know, there is very little work on integrating both the semantic similarity and structural similarity to cluster services.
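
To make the notion of structural similarity concrete, the following is a minimal sketch of a bipartite SimRank computation over the Mashup-to-Web-API "use" relation. It is an illustration only and not the authors' implementation: the function name `bipartite_simrank`, the decay factor of 0.8, the iteration count, and the toy Mashup and API names are all assumptions introduced here. The idea it shows is that two Mashups are similar if they use similar Web APIs, and two Web APIs are similar if they are used by similar Mashups.

```python
from itertools import combinations


def bipartite_simrank(usage, decay=0.8, iterations=10):
    """Return a dict mapping (mashup, mashup) pairs to SimRank scores.

    usage: dict mapping each Mashup name to the set of Web APIs it uses.
    decay and iterations are illustrative defaults, not values from the paper.
    """
    mashups = sorted(usage)
    apis = sorted({api for used in usage.values() for api in used})
    # Reverse relation: which Mashups use each Web API.
    users = {api: {m for m in mashups if api in usage[m]} for api in apis}

    # Every node is fully similar to itself and, initially, to nothing else.
    m_sim = {(p, q): 1.0 if p == q else 0.0 for p in mashups for q in mashups}
    a_sim = {(p, q): 1.0 if p == q else 0.0 for p in apis for q in apis}

    def average_pair_similarity(left, right, sim):
        # Mean similarity over all cross pairs; 0 if either side is empty.
        if not left or not right:
            return 0.0
        total = sum(sim[(x, y)] for x in left for y in right)
        return total / (len(left) * len(right))

    for _ in range(iterations):
        # Both updates read the scores from the previous iteration.
        new_m = dict(m_sim)
        for p, q in combinations(mashups, 2):
            score = decay * average_pair_similarity(usage[p], usage[q], a_sim)
            new_m[(p, q)] = new_m[(q, p)] = score
        new_a = dict(a_sim)
        for p, q in combinations(apis, 2):
            score = decay * average_pair_similarity(users[p], users[q], m_sim)
            new_a[(p, q)] = new_a[(q, p)] = score
        m_sim, a_sim = new_m, new_a
    return m_sim


# Hypothetical toy data: two Mashups share their APIs, a third shares none.
usage = {
    "trip_map": {"maps_api", "photo_api"},
    "photo_atlas": {"maps_api", "photo_api"},
    "tune_finder": {"music_api"},
}
similarity = bipartite_simrank(usage)
print(similarity[("trip_map", "photo_atlas")])
print(similarity[("trip_map", "tune_finder")])
```

In this toy example the two Mashups that share Web APIs receive a positive score, while the Mashup with no shared or related API stays at zero. A score matrix of this kind could then be combined with a text-based semantic similarity before clustering.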
