Introduction
A web service is the principal method for implementing the features of Service Oriented Architecture (SOA). In recent years, web services have played a major role in online trading, reservation systems, healthcare management, e-learning, and similar domains. Over the last decade, the number of services in public registries grew by more than 130 percent (Al-Masri & Mamoud, 2008). As web service development expands, the difficult task is measuring the Quality of Service (QoS). A single service often cannot satisfy a client's needs, so web service composition was introduced to aggregate multiple services to meet the requirements of the service consumer. Selecting a high-quality service therefore becomes a tedious task.
QoS is measured in terms of response time, throughput, failure rate, availability, and related attributes. Kondratyeva, Cavalli, Kushik, and Yevtushenko (2013) observed that research on web services in 2010 had increased roughly 40-fold compared with 2000. Moreover, the number of research publications based on Quality of Experience (QoE) is smaller than that of QoS-based research; still, QoS and QoE remain the main metrics for choosing services. Other metrics, Quality of Business (QoB) and Quality of Design (QoD), have been identified by the IT industry for testing the quality of services.
Most research addresses web service selection, service composition, service prediction, finding trusted services, service reputation, agent mechanisms, and related topics. In this competitive world, solutions should be found before problems occur: analyzing users' preferences to recommend products, apps, music albums, or books while shopping; predicting healthcare issues; auto-completing search keywords; forecasting environmental conditions; and so on. There is a corresponding need to improve customer satisfaction, process requests efficiently, and solve business challenges. QoS values vary with dynamic changes in network traffic, with the status of a service as the number of clients and the workload increase, and with internet connection speed. The service user's location is therefore considered here to play the main role.
Clustering algorithms are an effective way to organize spatially oriented datasets into specific groups; in spatial datasets the clusters may take different, uneven shapes. Two broad categories exist: partition clustering and hierarchical clustering. In partition clustering, the database is divided into a finite number of partitions, so domain knowledge is needed to specify the required number of clusters in advance; the clusters are formed around chosen center points. For example, k-means and k-medoid are partitioning algorithms: k-means was first used in 1967, k-medoid was proposed in 1987, and both have been analyzed by several authors (Arora, Varshney, & others, 2016; Kriegel, Schubert, & Zimek, 2017). For a large dataset, Clustering Large Applications based on RANdomized Search (CLARANS) (Ng & Han, 2002) is more effective than the k-medoid algorithm, although it still requires considerable running time for a large number of records n. The hierarchical clustering algorithm decomposes the database from root to leaves (divisive approach) or from leaves to root (agglomerative approach) to form the clusters; here, a termination rule must be specified for the decomposition process.

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a density-based clustering algorithm developed by Ester, Kriegel, Sander, and Xu (1996); a generalized DBSCAN was designed by Sander, Ester, Kriegel, and Xu (1998). DBSCAN determines the number of clusters by itself: high-density points that are close neighbors are grouped together by location into a cluster, while low-density points are omitted and treated as noise points. It proves more effective than the CLARANS algorithm at forming clusters of arbitrary shape. It requires two parameters, ε (epsilon) and a minimum number of points needed to form a cluster; ε denotes the maximum radius between core points and their neighborhood points.
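The density-based grouping described above can be illustrated with a minimal, self-contained sketch of the DBSCAN procedure in pure Python. The point data, ε, and minimum-points values below are illustrative assumptions, not data from this study; neighborhood counts include the point itself, and noise points receive the label -1:

```python
import math

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: grow clusters from core points; label noise as -1."""
    NOISE, UNVISITED = -1, None
    labels = [UNVISITED] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        labels[i] = cluster            # i is a core point: start a cluster
        seeds = list(neighbors)
        while seeds:                   # expand the cluster outward
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins, but not expanded
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:   # j is also a core point
                seeds.extend(j_neighbors)
        cluster += 1
    return labels

# Illustrative data: two dense blobs and one far-away outlier.
points = [(1, 1), (1.2, 1), (1, 1.2), (1.1, 1.1),
          (5, 5), (5.2, 5), (5, 5.2), (5.1, 5.1), (9, 9)]
labels = dbscan(points, eps=0.5, min_pts=3)
# labels -> [0, 0, 0, 0, 1, 1, 1, 1, -1]: two clusters, one noise point
```

The two blobs are recovered without specifying the number of clusters in advance, in contrast to k-means, and the isolated point is reported as noise rather than forced into a cluster.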
The algorithm starts from any unvisited point and, when a sufficient number of minimum points lies within the ε radius, grows a cluster, continuing until all points in the spatial space have been processed; visited points without enough neighbors are considered noise. For setting the ε radius (Rahmah & Sitanggang, 2016), a k-distance graph is used, in which the big bend of the plot denotes the right value of ε. For a large dataset, a larger value for the minimum number of points should be chosen. The distance function may be the Euclidean distance or another distance metric. This clustering approach is highly scalable and efficient for large datasets.
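The k-distance heuristic for choosing ε can be sketched as follows (pure Python; the sample points are illustrative assumptions): compute the distance from each point to its k-th nearest neighbour and sort the values in descending order, then read ε off the sharp bend of the resulting curve.

```python
import math

def k_distance_curve(points, k):
    """Distance from each point to its k-th nearest neighbour,
    sorted in descending order. Plotting this curve and locating
    the 'big bend' (knee) gives a candidate eps for DBSCAN."""
    kth = []
    for p in points:
        dists = sorted(math.dist(p, q) for q in points if q is not p)
        kth.append(dists[k - 1])
    return sorted(kth, reverse=True)

# Illustrative data: two dense blobs and one far-away outlier.
points = [(1, 1), (1.2, 1), (1, 1.2), (1.1, 1.1),
          (5, 5), (5.2, 5), (5, 5.2), (5.1, 5.1), (9, 9)]
curve = k_distance_curve(points, k=3)
# The curve drops sharply from roughly 5.5 (the outlier) to roughly
# 0.2 (blob members); any eps chosen between those values separates
# dense regions from noise.
```

In practice k is set to the intended minimum number of points, so that core points fall below the knee of the curve and sparse points sit above it.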