A Novel Chaotic Northern Bald Ibis Optimization Algorithm for Solving Different Cluster Problems [ICCICC18 #155]

Ravi Kumar Saidala (Acharya Nagarjuna University, Guntur, India) and Nagaraju Devarakonda (Lakireddy Bali Reddy College of Engineering, Mylavaram, India)
DOI: 10.4018/IJSSCI.2019040101

Abstract

This article proposes a new data clustering method that finds optimal clusters by incorporating chaotic maps into the standard Northern Bald Ibis Optimization Algorithm (NOA). NOA, a recently developed optimization technique, has been shown to generate optimal results efficiently with the lowest solution cost. Incorporating chaotic maps into a metaheuristic helps the algorithm diversify its search of the solution space in both phases, exploration and exploitation. To make NOA more efficient and to avoid premature convergence, chaotic maps are incorporated in this work; the resulting variants are termed CNOAs. Ten different chaotic maps are each incorporated individually into the standard NOA to test optimization performance. The CNOA is first benchmarked on 23 standard functions. Second, the numerical complexity of the new clustering method, which uses CNOA, is tested by solving 10 UCI data cluster problems and 4 web document cluster problems. Comparisons are made using statistical and graphical results. The superiority of the proposed clustering algorithm is evident from the simulations and comparisons.
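To illustrate the general idea of driving a search step with a chaotic sequence instead of uniform random numbers, the following is a minimal sketch only. The abstract does not name the ten chaotic maps, so the logistic map is used here as an assumed example, and `chaotic_step` is a hypothetical update rule, not the NOA or CNOA update from the article.

```python
def logistic_map(x0, n, r=4.0):
    """Return n successive values of the logistic map x_{k+1} = r * x_k * (1 - x_k).

    Assumption: the logistic map stands in for any one of the chaotic maps;
    with r = 4 and 0 < x0 < 1 the sequence stays within [0, 1].
    """
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie strictly between 0 and 1")
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_step(position, best, chaos_value, lower, upper):
    """Hypothetical update: move each coordinate toward `best` by a
    chaos-scaled fraction of the gap, then clamp it to [lower, upper]."""
    return [
        min(upper, max(lower, p + chaos_value * (b - p)))
        for p, b in zip(position, best)
    ]


# Use the first chaotic value as the step fraction for one candidate solution.
sequence = logistic_map(x0=0.7, n=5)
candidate = chaotic_step([2.0, -3.0], [0.0, 0.0], sequence[0], -5.0, 5.0)
```

In a sketch like this, the chaotic value simply replaces the call to a uniform random number generator; which coefficient of the algorithm is replaced, and by which map, is a design choice the abstract leaves to the full article.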

Introduction

The generation and sharing of large volumes of data by public administrations, businesses, scientific research, and numerous industrial and non-profit sectors has increased immeasurably. These data range from textual content (unstructured, semi-structured, and structured) (Hashimi et al., 2015) to multimedia content (e.g. audio, images, videos) on a variety of platforms (e.g. sensor networks, system-to-system communications, cyber-physical systems, social media websites, and the Internet of Things) (Witten et al., 2016). Because of this incessant growth, new and efficient techniques are needed for accessing data, discovering the hidden knowledge in it, and sharing that knowledge across domains (Larose et al., 2014). Manual knowledge extraction from such huge data is a tiresome task, and its results have been found to be no longer accurate. Classical algorithms are likewise inaccurate in interpreting and extracting hidden knowledge. New and advanced technologies are therefore needed to automate the knowledge extraction process and to summarize the meaningful information according to application requirements (Thuraisingham, 2014), and clever, efficient techniques must be designed to analyze this massive data. Since data mining techniques appeared in the database family in the 1990s, they have been broadly used to extract hidden knowledge and patterns from enormous data sets (Han, 2011). This extraction uses two kinds of techniques, supervised and unsupervised (Brownlee, 2016). Clustering is the most widely used unsupervised data analysis technique in data mining; it extracts the hidden knowledge in data by partitioning the data into clusters or groups. The ultimate purpose of clustering is to generate clusters of similar data objects by grouping the unlabeled input data.
By doing this, the similarity between the objects within each cluster is maximized, while the similarity between objects of different clusters is minimized. The numerous clustering algorithms that have been developed fall into two primary categories, hierarchical and partitional (Jain, 2010). Algorithms in the first category build a tree structure of clusters without prior knowledge of the number of clusters. In the second category, initial cluster centroids are assigned. The k-means algorithm is the most prevalent and widely used partitional clustering technique. It groups extensive datasets effectively and with a short runtime. Although the k-means algorithm is quicker than numerous other algorithms, it suffers from two noteworthy issues: high sensitivity to initialization and slow convergence to local optima (Jain, 2010; Kantardzic, 2011). The literature (Alam et al., 2014; Esmin et al., 2015; José-García & Gómez-Flores, 2016; Nanda & Panda, 2014; Saidala & Devarakonda, 2018a) indicates that combining nature-inspired optimization algorithms with standard data clustering techniques yields accurate solutions and helps overcome the drawbacks of the standard techniques.
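As an illustration of how a partitional clustering problem can be handed to an optimization algorithm, the sketch below scores a set of candidate centroids by the sum of squared distances from each point to its nearest centroid, which is the quantity k-means tries to reduce. The choice of this fitness function, the names `assign` and `sse`, and the toy data are assumptions made for illustration; the excerpt does not state which fitness function the proposed method uses.

```python
import math


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def sse(points, centroids):
    """Sum of squared Euclidean distances from each point to its nearest
    centroid; a lower value means tighter clusters."""
    return sum(
        min(math.dist(p, c) ** 2 for c in centroids)
        for p in points
    )


# Toy data with two well-separated groups (illustrative values only).
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
good = [(0.0, 0.5), (10.0, 10.5)]  # one centroid per natural group
poor = [(0.0, 0.0), (0.0, 1.0)]    # both centroids placed in the same group

labels = assign(points, good)
good_cost = sse(points, good)
poor_cost = sse(points, poor)
```

The `poor` centroids mimic an unlucky initialization: both start inside the same group, so the cost is far higher than for `good`. A nature-inspired optimizer would treat each candidate centroid set as one solution and search for the set with the lowest cost, rather than relying on a single starting point.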

Table 1. List of unimodal and multimodal benchmark functions
Columns: Function description, Range, fmin
[The function formulas and ranges appear as images in the source and are not reproduced here.]
