Augmenting Labeled Probabilistic Topic Model for Web Service Classification

Shengye Pang, Guobing Zou, Yanglan Gan, Sen Niu, Bofeng Zhang
Copyright: © 2019 |Pages: 21
DOI: 10.4018/IJWSR.2019010105

Abstract

Web service classification has become an urgent need for service-oriented applications. Most existing classification algorithms rely mainly on the original service descriptions, which leads to low classification accuracy because those descriptions cannot fully reflect the semantic features specific to a service category. To solve this issue, this article proposes a novel approach to web service classification that comprises service topic feature extraction, service functionality augmentation, and service classification model learning. Its distinguishing characteristic is that the original service descriptions are semantically augmented and then used to derive a service classifier via a labeled probabilistic topic model. A benefit of this approach is that it can be applied to an online service management platform, where it assists service providers during the registration process. Extensive experiments have been conducted on a large-scale real-world data set crawled from ProgrammableWeb. The results demonstrate that the approach outperforms state-of-the-art methods in terms of service classification accuracy and convergence speed.

1. Introduction

Due to the rapid advancement of Web 2.0 technologies and service-oriented computing, more and more service providers publish their services on the internet, mainly in the form of web APIs. These services can be easily organized and manipulated in a loosely coupled style to create service mashups that fulfill comprehensive functional requirements and offer value-added integrated software systems with complex business processes. The rapid increase in the number and diversity of web services accelerates interoperable machine-to-machine interaction and greatly promotes service discovery, optimal selection, automatic composition, and recommendation (Xia, Luo, Li, & Zhu, 2013; Xia, Liu, Liu, & Zhu, 2012; Li, Luo, Xia, Han, & Zhu, 2015). However, as the functional characteristics of published web services multiply, an online RESTful service repository typically contains hundreds of categories. As a result, finding an appropriate category among the many registered ones becomes a labor-intensive and challenging task for service providers when they publish their API services on a service management platform. For example, ProgrammableWeb.com, the largest online RESTful service repository (APIs and mashups), collects over 19,000 APIs and 7,000 mashups in more than 400 diverse categories. When registering an API service on ProgrammableWeb, a service provider must not only supply basic registration information but also manually choose at least one category from the more than 400 available so that it matches the service's functional description. Therefore, how to design an effective approach that can classify web services and recommend an accurate category has become a critical research issue (Ames & Naaman, 2007).

In recent years, related research efforts have been devoted to web service classification (Tsoumakas, Katakis, & Taniar, 2008). Existing approaches perform web service classification and service tag recommendation by training a traditional supervised learning model (e.g., SVM) (Lopez & Maldonado, 2016; Wang, Shy, Zhou, & Bouguettaya, 2010), an active learning-based supervised learning model (Tong & Koller, 2001; Liu, Agarwal, Ding, & Yu, 2016; Shi, Liu, & Yu, 2017), or a comprehensive supervised learning model in which an unlabeled probabilistic topic model (e.g., LDA) (Krestel, Fankauser, & Nejdl, 2009) is applied to extract the semantic features of web services. Some of these works learn a classification model from an existing labeled service repository, while others use active learning to boost the learned service classifier: at each iteration, the most informative services are intelligently selected and manually labeled to improve the quality of small-scale training data. Although these approaches take advantage of the existing service repository as training data to derive a service classifier that can be easily deployed and applied, they still fall short of service providers' demands for highly accurate web service classification.

The essential reason is that existing approaches have deficiencies in both effectiveness and efficiency. More specifically, the disadvantages of the current paradigm for web service classification are twofold. (1) On the one hand, existing approaches rely mainly on the original service descriptions to learn a service classifier, yet the functional description of a RESTful web service consists only of a short piece of text (e.g., 10 to 20 words), which is too little to fully characterize its category. Furthermore, some words recur frequently across different service descriptions, which blurs the distinction between service categories. Both factors significantly harm classification accuracy. (2) On the other hand, most existing approaches leverage a traditional classification algorithm (e.g., SVM), where multiple basic models must be trained together to perform web service classification, because each of them is a binary classifier that cannot directly solve a multi-class problem. As a result, they incur high complexity, in terms of both large space consumption and slow convergence, when training a service classifier on a large-scale web service repository.
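
To give a rough sense of scale for point (2), the short Python sketch below counts how many binary classifiers a multi-class problem requires under the two standard decomposition schemes, one-vs-rest and one-vs-one. The article does not say which scheme the cited SVM-based approaches use, so both are shown. The function name binary_classifier_count is hypothetical, and the value 400 is taken as an illustrative lower bound from the "more than 400 categories" figure quoted above.

```python
def binary_classifier_count(num_classes: int, scheme: str) -> int:
    """Number of binary models needed to cover a multi-class problem.

    Hypothetical helper for illustration only; it is not part of the
    approach described in the article.
    """
    if num_classes < 2:
        raise ValueError("a classification problem needs at least 2 classes")
    if scheme == "one-vs-rest":
        # One model per class: that class against all the others.
        return num_classes
    if scheme == "one-vs-one":
        # One model per unordered pair of classes: K * (K - 1) / 2.
        return num_classes * (num_classes - 1) // 2
    raise ValueError(f"unknown scheme: {scheme!r}")


# Assumption: 400 categories, a lower bound based on the figure in the text.
NUM_CATEGORIES = 400

for scheme in ("one-vs-rest", "one-vs-one"):
    print(scheme, binary_classifier_count(NUM_CATEGORIES, scheme))
# one-vs-rest 400
# one-vs-one 79800
```

Under either scheme, the number of separately trained models grows with the number of categories, which is consistent with the space and convergence concerns raised above.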
