An Improved Gravitational Clustering Based on Local Density

Lei Chen, Qinghua Guo, Zhaohua Liu, Long Chen, HuiQin Ning, Youwei Zhang, Yu Jin
DOI: 10.4018/IJMCMC.2021010101

Abstract

The gravitational clustering algorithm (Gravc) is a dynamic clustering algorithm that can accurately cluster complex datasets with arbitrary shape and distribution. However, its high time complexity is a key challenge. To address this problem, this paper proposes an improved gravitational clustering algorithm based on local density, called FastGravc. The main contributions are as follows. First, a local density-based data compression strategy is designed to reduce both the number of data objects and the number of neighbors of each object that participate in gravitational clustering. Second, the traditional gravity model is optimized to accommodate the differences in mass among objects that the data compression strategy introduces. Third, the improved algorithm, FastGravc, is obtained by integrating these optimization strategies. Finally, extensive experiments on synthetic and real-world datasets verify the effectiveness and efficiency of FastGravc.
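To make the two ideas in the abstract concrete, the following is a minimal sketch and not the authors' FastGravc procedure, whose details are not given in this preview. It assumes a simple grid as a stand-in for the local density estimate: points that fall in the same cell are merged into one representative whose weight is the number of merged points, and a Newton-style attraction is then computed between weighted representatives. The names compress_by_grid, attraction, cell_size, g, and eps are hypothetical.

```python
from collections import defaultdict
import math


def compress_by_grid(points, cell_size):
    """Merge 2-D points that share a grid cell into (centroid, mass) pairs.

    Assumption: a fixed grid stands in for a local density estimate.
    """
    cells = defaultdict(list)
    for x, y in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append((x, y))

    representatives = []
    for key in sorted(cells):  # sorted only to make the output order stable
        members = cells[key]
        mass = len(members)  # a dense cell yields a heavy representative
        cx = sum(p[0] for p in members) / mass
        cy = sum(p[1] for p in members) / mass
        representatives.append(((cx, cy), mass))
    return representatives


def attraction(rep_a, rep_b, g=1.0, eps=1e-9):
    """Magnitude of a Newton-style pull between two weighted representatives."""
    (ax, ay), mass_a = rep_a
    (bx, by), mass_b = rep_b
    dist_sq = (ax - bx) ** 2 + (ay - by) ** 2
    # eps avoids division by zero when two representatives coincide
    return g * mass_a * mass_b / (dist_sq + eps)


if __name__ == "__main__":
    data = [(0.1, 0.1), (0.2, 0.2), (0.3, 0.3), (5.1, 5.1)]
    reps = compress_by_grid(data, cell_size=1.0)
    print([mass for _, mass in reps])  # four points become two representatives
    print(attraction(reps[0], reps[1]))
```

In this sketch, compression reduces the number of objects that take part in the pairwise computation, and the mass term is what lets the gravity model account for representatives that stand for different numbers of original points.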

Introduction

Clustering is an indispensable method for mining complex real-world data. It reveals hidden rules and patterns in data in an unsupervised way. Over the past 20 years, a large number of clustering algorithms have been proposed, applied, and further optimized. Overall, these algorithms can be divided into the following categories: partitioned clustering, hierarchical clustering, density clustering, and dynamic clustering (Saxena A et al., 2017). (1) Partitioned clustering and hierarchical clustering are the most widely used algorithms; K-Means and BIRCH are typical examples. These algorithms achieve excellent clustering accuracy on regular, noise-free datasets whose clusters are circular or elliptical, and their time efficiency is very high (Stevan N et al., 2014). However, on irregular and non-uniform datasets, such as those with non-circular clusters, their clustering accuracy is unsatisfactory. (2) Density clustering and graph clustering are also commonly used; DBSCAN and spectral clustering are typical examples, respectively. These two kinds of algorithms are suitable for a wide range of datasets and achieve excellent clustering performance on irregular and uneven datasets. However, they require longer clustering time and cannot accurately identify the noise in a dataset. (3) Dynamic clustering is a more recent family of algorithms (Bae J et al., 2020), of which Gravc is a typical example. The basic idea is to extract a dynamic process from a natural phenomenon such as gravitation, synchronization, or evolution, and to use that process to cluster complex datasets. These algorithms achieve good clustering accuracy on irregular and uneven datasets and can accurately identify noise and abnormal data (Chen L et al., 2017). However, because the clustering process is dynamic and iterative, their time complexity is very high.
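The cost of the dynamic process can be illustrated with a short sketch. It is not the published Gravc algorithm; it only assumes that every point has unit mass and moves a capped distance along the net pull from all other points. The names gravity_step, g, step, and eps are hypothetical. The nested loop evaluates n(n - 1) pairs per iteration, and the step must be repeated many times before points gather, which is where the high time complexity comes from.

```python
import math


def gravity_step(points, g=1.0, step=0.1, eps=1e-6):
    """Move every 2-D point once along the net pull from all other points.

    Assumptions: unit masses, inverse-square attraction, and a displacement
    capped at `step` so that nearby points do not overshoot each other.
    """
    moved = []
    for i, (xi, yi) in enumerate(points):
        fx = fy = 0.0
        for j, (xj, yj) in enumerate(points):  # all other points: O(n) per point
            if i == j:
                continue
            dx, dy = xj - xi, yj - yi
            dist = math.sqrt(dx * dx + dy * dy) + eps
            # magnitude g / dist^2 along the unit vector (dx, dy) / dist
            fx += g * dx / dist ** 3
            fy += g * dy / dist ** 3
        norm = math.hypot(fx, fy)
        if norm > 0.0:
            scale = min(step, norm) / norm
            moved.append((xi + fx * scale, yi + fy * scale))
        else:
            moved.append((xi, yi))  # balanced forces: the point stays put
    return moved


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
    for _ in range(3):
        pts = gravity_step(pts)
    print(pts)
```

All positions are updated from the same snapshot of the previous iteration, so the result does not depend on the order in which points are visited.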

With the arrival of the big data era, increasingly complex data have emerged from domains such as wireless and mobile multimedia networks (Ajay K et al., 2019), transportation, weather, and wireless sensor networks (Surender S et al., 2011). These data pose new challenges to traditional clustering algorithms in terms of both clustering accuracy and time efficiency (Fahad A et al., 2014). Regarding accuracy, complex data exhibit new characteristics such as irregularity, unevenness, and high noise, which seriously degrade the clustering accuracy of traditional algorithms (Shirkhorshidi A S et al., 2014). Regarding time efficiency, the growing scale of complex data places higher demands on traditional clustering algorithms (Mohebi A et al., 2016).
