Introduction
Services are a special kind of software component developed and deployed by different vendors. They are available over the Internet and can be used to provide certain business functionalities (Zhang et al., 2013). However, an individual service usually offers limited functionality, so people need to combine several services to satisfy complex real-world requirements (Chen et al., 2015). “Mashup” has become an important technique for developing applications (i.e., Mashups) by composing services (i.e., Web APIs) (Xia et al., 2014), and can be seen as a novel way to perform service composition (Zhang et al., 2010). To date, a large number of Mashups have been published. They are usually hosted in online repositories such as ProgrammableWeb (PWeb) (ProgrammableWeb, 2017), myExperiment (myExperiment, 2017), and Biocatalogue (Biocatalogue, 2017). These repositories provide tools that help users search and browse for Mashups that satisfy their requirements. However, because the number of Mashups is so large, Mashup discovery has become a challenging and time-consuming task (Zhang et al., 2016).
Clustering is regarded as one of the most effective ways to improve the performance of Mashup discovery: it organizes similar Mashups into groups and thereby reduces the search space (Liu & Wong, 2009; Liu & Wong, 2008). Similarity calculation is a central part of any Mashup clustering approach. Accordingly, much of the existing work on service clustering tries to improve clustering performance by using more information about services to quantify their similarity. For example, some clustering approaches (Liu & Wong, 2009; Liu & Wong, 2008; Platzer et al., 2009; Liang et al., 2014; Cassar et al., 2010; Aznag et al., 2013) used the description information in the service profile (e.g., WSDL documents, Mashup descriptions, etc.) to quantify service similarity, and then organized services into groups. Other clustering approaches (Chen et al., 2011; Wu et al., 2014; Wu et al., 2012; Chen et al., 2013; Li et al., 2014) used both the description information and user-contributed tags to quantify service similarity, and then organized services into clusters. Whichever information the existing approaches use (descriptions or tags), it is always textual. That is, existing approaches capture the semantic similarity between services, but none of them has used structural similarity to guide service clustering. By structural similarity, we mean the similarity that arises from the structural context of two services. For example, every Mashup is developed by composing a set of Web APIs. A Mashup, the Web APIs it uses, and the “use” coupling between them constitute a topological structure, so two Mashups give rise to two such structures. If the two Mashups share some Web APIs, they are structurally similar.
Thus, we can propose metrics to characterize the structural similarity between two Mashups. Structural similarity is also of great importance: it captures a different aspect of service similarity and is orthogonal to semantic similarity. In our previous work, we applied one structural similarity to cluster services (Pan & Chai, 2018). Our preliminary experimental results indicate that this structural similarity performs better than the semantic similarities in service clustering. However, as far as we know, there is very little work on integrating semantic similarity and structural similarity to cluster services.
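To make the notion of structural similarity concrete, the following minimal sketch scores two Mashups by the overlap of the Web APIs they use. The Jaccard coefficient is an assumption chosen here for illustration and is not necessarily the metric applied in our previous work; the function name `structural_similarity` and all Mashup and Web API names are likewise hypothetical.

```python
from typing import Dict, Set


def structural_similarity(apis_a: Set[str], apis_b: Set[str]) -> float:
    """Jaccard overlap of the Web API sets used by two Mashups.

    Illustrative metric only (an assumption, not taken from the article):
    |A intersect B| / |A union B|, defined as 0.0 when both sets are empty.
    """
    union = apis_a | apis_b
    if not union:
        return 0.0
    return len(apis_a & apis_b) / len(union)


# Hypothetical Mashup -> Web API "use" relations, invented for illustration.
mashup_apis: Dict[str, Set[str]] = {
    "TripPlanner": {"Google Maps", "Flickr", "Twitter"},
    "PhotoMap": {"Google Maps", "Flickr"},
    "NewsDigest": {"Twitter", "YouTube"},
}

if __name__ == "__main__":
    # Print the pairwise structural similarity of all Mashups.
    names = sorted(mashup_apis)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score = structural_similarity(mashup_apis[a], mashup_apis[b])
            print(f"{a} vs {b}: {score:.3f}")
```

Under this assumed metric, two Mashups that use exactly the same Web APIs score 1, and two Mashups with no Web API in common score 0, regardless of how similar their textual descriptions or tags are. This is the sense in which structural similarity is independent of semantic similarity.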