ASCCN: Arbitrary Shaped Clustering Method with Compatible Nucleoids

ASCCN: Arbitrary Shaped Clustering Method with Compatible Nucleoids

Renxia Wan (North University for Nationalities and Donghua University, China), Lixin Wang (Anhui Institute of Architecture and Industry, China) and Xiaoke Su (Donghua University, China)
Copyright: © 2010 |Pages: 15
DOI: 10.4018/jdwm.2010100101
OnDemand PDF Download:
No Current Special Offers


Special clustering algorithms are attractive for the task of grouping an arbitrary shaped database into several proper classes. Until now, a wide variety of clustering algorithms for this task have been proposed, although the majority of these algorithms are density-based. In this paper, the authors extend the dissimilarity measure to compatible measure and propose a new algorithm (ASCCN) based on the results. ASCCN is an unambiguous partition method that groups objects to compatible nucleoids, and merges these nucleoids into different clusters. The application of cluster grids significantly reduces the computational cost of ASCCN, and experiments show that ASCCN can efficiently and effectively group arbitrary shaped data points into meaningful clusters.
Article Preview

1. Introduction

In data mining field, clustering pays a very important role. Clustering is the task of categorizing a set of objects into different clusters such that objects in the same cluster are more similar to each other than objects in different clusters according to some defined criteria (Huang, 1998). It is useful in a number of tasks, for example, by partitioning objects into clusters, interesting object groups may be discovered, such as the groups of clients in a banking database having a heavy investment in real estate, and clustering of data streams also can find some important applications in tracking evolution of various phenomena in medical, meteorological, astrophysical, seismic studies (Bhatnagar et al., 2009).

Cluster analysis has become the subject of active research in several fields such as statistics, pattern recognition, machine learning and data mining. Up to now, a wide variety of clustering algorithms has been proposed, and also received a lot of attention in the last few years (c.f., section 2). In these algorithms, discovery of arbitrary shaped clusters is often to be a real obstacle for their applications. Ester (1996) and Halkidi (2001) imply that some typical clustering algorithms such as k-means, CURE, ClARANS and so on will get some poor results if there are some nonconvex shape data sets or some ball-shaped data sets of significantly differing sizes in the database. To get the arbitrary shaped clusters, algorithms based on density are designed (DBSCAN is a typical one), but these algorithms also face challenges from the efficiency and the affectivity such as the computation time may be intolerable or parameters input is not “user-friendly”.

In this paper, we present the new clustering algorithm ASCCN. It is a crisp partition method, and clusters objects with compatible nucleoids. The new algorithm requires only one input parameter, can discover arbitrary size and shaped clusters, is efficient even for large data sets especially data with high dimension. The rest of this paper is organized as follows: In section 2, we survey related work. In section 3, we define the compatible relation. The new algorithm ASCCN is presented in section 4. The experimental results are reported to illustrate the new algorithm in section 5. Finally, we draw our conclusions in section 6.

Complete Article List

Search this Journal:
Volume 18: 4 Issues (2022): Forthcoming, Available for Pre-Order
Volume 17: 4 Issues (2021)
Volume 16: 4 Issues (2020)
Volume 15: 4 Issues (2019)
Volume 14: 4 Issues (2018)
Volume 13: 4 Issues (2017)
Volume 12: 4 Issues (2016)
Volume 11: 4 Issues (2015)
Volume 10: 4 Issues (2014)
Volume 9: 4 Issues (2013)
Volume 8: 4 Issues (2012)
Volume 7: 4 Issues (2011)
Volume 6: 4 Issues (2010)
Volume 5: 4 Issues (2009)
Volume 4: 4 Issues (2008)
Volume 3: 4 Issues (2007)
Volume 2: 4 Issues (2006)
Volume 1: 4 Issues (2005)
View Complete Journal Contents Listing