Introduction
Web services are gradually becoming the mainstream technology for implementing service-oriented architecture (SOA) applications. As more SOA-based applications emerge, more and more Web services are available on the Internet. For this reason, the rapid and accurate discovery and selection of required Web services has become a fundamental challenge in service computing. In addition, the lack of a formal description model, the brevity of description text, and the irregular description language further increase the difficulty of Web services discovery and selection (Ye, Cao et al. 2019).
Many researchers have studied and proposed Web services classification techniques. The main goal is to reduce the space and time required for Web services search and thereby improve the efficiency and quality of Web service discovery. Most of these studies classify Web services based on their functional attributes (Wang, Yang et al. 2017) (Xia, Fan et al. 2014). They typically employ TF-IDF weighting together with cosine similarity or other similarity measures to determine the functional similarity between Web services based on Web services description language (WSDL) documents (Xia, Fan et al. 2014). Furthermore, several researchers have used LDA topic models or their extensions (Shi, Liu et al. 2019) (Shi, Liu et al. 2017) (Cao, Liu et al. 2017a) (Cao, Liu et al. 2019) to mine hidden topic information in Web services. These topic-model-based works represent Web services using low-dimensional topic vector features and classify Web services by computing similarities between these topic vectors. However, approaches that consider only service content information struggle to achieve good results because WSDL documents are short and sparse (Cao, Liu et al. 2016).
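As a rough illustration of the content-based approach described above, the following minimal sketch computes TF-IDF vectors and cosine similarity for three short service descriptions. The descriptions, the whitespace tokenization, and the smoothed IDF formula are assumptions chosen for the example; they are not taken from the cited works, which operate on WSDL documents.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption of this sketch) so weights stay positive.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (cnt / len(doc)) * idf[t] for t, cnt in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical short descriptions standing in for WSDL-derived text.
descriptions = [
    "get current weather forecast for a city",
    "weather forecast service by city name",
    "convert currency exchange rate",
]
docs = [d.split() for d in descriptions]
vecs = tfidf_vectors(docs)

sim_weather = cosine(vecs[0], vecs[1])    # two weather services share terms
sim_unrelated = cosine(vecs[0], vecs[2])  # no shared terms at all
```

The third description shares no term with the first, so their similarity is exactly zero even though a topical relation could exist through synonyms. This is the sparsity problem that short service descriptions cause for purely content-based similarity.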
Web services are also directly or indirectly related to other information (e.g., tags and mashups), which characterizes the functional properties of a Web service from several perspectives (Cao, Liu et al. 2017b). Therefore, several methodologies classify Web services using auxiliary relationships such as tags. Although these methodologies improve the accuracy of Web service classification to a certain extent, they rely on attribute information, such as textual descriptions and labels, and do not fully consider the complex structural interactions between Web services (combination and shared labeling relationships).
Several kinds of objects link Web services together, forming a natural heterogeneous information network that offers new ideas for some special Web service classification situations. This has recently led several researchers to study node representation learning for heterogeneous information networks (HIN) (Shi, Li et al. 2016) and to apply it to service classification. HIN representation learning aims to map the input space to a lower-dimensional space while preserving heterogeneous structure and semantics; among the most promising of such works are Metapath2vec (Dong, Chawla et al. 2017), HERec (Shi, Hu et al. 2018), and Hin2vec (Fu, Lee et al. 2017). Although these methods achieve significant improvements in node classification accuracy, they also have limitations. First, they usually draw negative samples by randomly selecting existing nodes in the network. Second, they focus on capturing rich semantic information over heterogeneous information networks without paying attention to the underlying node distribution. For these reasons, they do not perform well on real networks, which tend to be sparse and noisy.
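To make the two ingredients mentioned above more concrete, the sketch below shows a meta-path-guided random walk in the spirit of Metapath2vec, followed by negative sampling that picks existing nodes uniformly at random. It is a simplified illustration, not the implementation of any cited method: the toy network, the node names, the S-T meta-path (service, tag), and the helper functions `metapath_walk` and `uniform_negatives` are all hypothetical.

```python
import random

# Toy heterogeneous network (hypothetical data): services (S), tags (T), mashups (M).
node_type = {"s1": "S", "s2": "S", "s3": "S", "t1": "T", "t2": "T", "m1": "M"}
edges = {
    "s1": ["t1", "m1"],
    "s2": ["t1", "t2", "m1"],
    "s3": ["t2"],
    "t1": ["s1", "s2"],
    "t2": ["s2", "s3"],
    "m1": ["s1", "s2"],
}


def metapath_walk(start, metapath, length, rng):
    """Random walk that only steps to neighbors whose type matches the
    next symbol of the (cyclically repeated) meta-path."""
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match the first meta-path symbol")
    walk = [start]
    while len(walk) < length:
        wanted = metapath[len(walk) % len(metapath)]
        candidates = [n for n in edges[walk[-1]] if node_type[n] == wanted]
        if not candidates:  # dead end: no neighbor of the required type
            break
        walk.append(rng.choice(candidates))
    return walk


def uniform_negatives(positive, k, rng):
    """Draw k negatives uniformly from existing nodes, ignoring node type
    and the underlying node distribution."""
    pool = [n for n in node_type if n != positive]
    return [rng.choice(pool) for _ in range(k)]


rng = random.Random(0)
walk = metapath_walk("s1", ["S", "T"], 7, rng)  # follows S-T-S-T-...
negatives = uniform_negatives("s1", 3, rng)
```

In this sketch the negatives for a service node may be tags, mashups, or closely related services, because the sampler knows nothing about node types or about how nodes are distributed. That is one way to picture the first limitation noted above.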