1. Introduction and Background
Data mining is a subfield of computer science that uncovers valuable hidden information in data. It is the investigation of data for hidden information and the use of software techniques to discover patterns, regularities, and relationships in datasets, drawing on methods at the intersection of database systems and mathematics (Aggarwal, 2014). Partitioning an input dataset into meaningful groups is one of the fundamental data analysis concepts for understanding and learning valuable information. Clustering and classification are two important data analysis concepts for extracting useful hidden information from a dataset. Clustering is an unsupervised machine learning technique defined as the process of identifying similar groups of data in a dataset. The groups formed by a clustering process should be externally isolated and internally interconnected, i.e., both the degree of homogeneity within clusters and the degree of heterogeneity between clusters should be high (Han et al., 2011; Kharat et al., 2018; Witten et al., 2016). The key challenge in this kind of data analysis is to categorize patterns or useful hidden knowledge into a set of meaningful groups without knowledge of class labels (Larose & Larose, 2014). For decades, cluster analysis has played a vital role in numerous fields, including artificial intelligence, machine vision and machine learning, pattern recognition, and image processing (Aggarwal, 2014; Celebi, 2014; Celebi & Aydin, 2016). Unmasking outliers in a given dataset is a pertinent problem in the data mining area. Outlier mining is the process of distinguishing ambient events, irregular or divergent elements, sudden changes in system-generated data, and exceptions (Campos et al., 2016; Roopa et al., 2020). Outlier analysis is thus an important data mining issue in knowledge extraction.
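The twin requirements of internal homogeneity and external heterogeneity can be made concrete with a small sketch. The toy points below are purely illustrative (they do not come from any dataset in this article): a good clustering keeps the average distance of members to their own centroid far smaller than the distance between centroids.

```python
import math

# Two hand-picked toy clusters (illustrative data only).
cluster_a = [(1.0, 1.2), (0.9, 1.0), (1.1, 0.8)]
cluster_b = [(8.0, 8.1), (7.9, 8.3), (8.2, 7.9)]

def centroid(points):
    """Coordinate-wise mean of a set of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

ca, cb = centroid(cluster_a), centroid(cluster_b)

# Homogeneity within a cluster: average member-to-centroid distance.
intra_a = sum(dist(p, ca) for p in cluster_a) / len(cluster_a)
# Heterogeneity between clusters: distance between the two centroids.
inter = dist(ca, cb)

print(intra_a < inter)  # → True: tight groups, well separated
```

Cluster-validity indices such as the silhouette coefficient formalize exactly this ratio of intra-cluster cohesion to inter-cluster separation.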
Clustering-based outlier unmasking is one of the best approaches for detecting and eliminating such irregular or divergent data elements (Hodge & Austin, 2004). Classical clustering techniques become inaccurate as data size increases (Nikbakht & Mirvaziri, 2015). Many researchers have stated that the clustering performance of classical techniques can be improved with the help of nature-inspired metaheuristics.
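A minimal sketch of the clustering-based idea: once cluster centroids are known, a point whose distance to its nearest centroid exceeds a chosen threshold is flagged as an outlier. The points, centroids, and threshold below are all assumed illustrative values (in practice the centroids would come from a clustering algorithm such as k-means, and the threshold from the data's spread).

```python
import math

# Toy 2-D points: two dense groups plus one stray point (illustrative data).
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.1),
          (3.0, 9.0)]  # stray point, far from both groups

# Assumed centroids of the two groups (would normally be produced
# by a clustering step, e.g. k-means).
centroids = [(1.0, 1.0), (5.0, 5.0)]

def dist(p, q):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Flag any point whose nearest centroid is farther than the threshold
# (0.5 here, an illustrative choice, not a value from the text).
threshold = 0.5
outliers = [p for p in points
            if min(dist(p, c) for c in centroids) > threshold]

print(outliers)  # → [(3.0, 9.0)]
```

The stray point is the only one farther than the threshold from every centroid, so it alone is flagged.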
The term optimum means the ultimate ideal, or the best. Optimization therefore refers to finding the best solution to real-world NP-hard problems: an iterative procedure for obtaining the desired outcome of a formulated problem while satisfying all of its constraints or bounded conditions (Mirjalili et al., 2017). Nature has always been a key source of inspiration for researchers developing algorithms to solve complex real-world problems. Algorithms developed by taking inspiration from nature, mimicking certain natural phenomena (Mirjalili et al., 2018), are referred to as nature-inspired metaheuristics. Over the past few decades, nature has paved the way for the design and development of numerous efficient algorithms for complex optimization problems, and many metaheuristic algorithms have recently gained the attention of researchers in pattern recognition and clustering. Many nature-inspired metaheuristics have been developed recently and successfully applied to complex problems, for instance the recently invented Whale Optimization Algorithm (WOA) (Mirjalili & Lewis, 2016), Tornadogenesis Optimization Algorithm (TOA) (Saidala & Devarakonda, 2017a), and New Class Topper Optimization Algorithm (NCTOA) (Das et al., 2018); new versions of these algorithms have also been applied to many real-world problems (Moein, 2014; Esmin et al., 2015; Saidala & Devarakonda, 2017b; Saidala & Devarakonda, 2017c; Li et al., 2015; Saidala & Devarakonda, 2018a; Saidala & Devarakonda, 2019; Lin et al., 2018; Saidala & Devarakonda, 2018b; Saidala & Devarakonda, 2018c; Saidala et al., 2018e; Saidala et al., 2018f). The well-known no-free-lunch theorem (Wolpert & Macready, 1997) justifies the continued development of new optimization algorithms to solve complex problems at low solution cost.
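The iterative, bounded search that this paragraph describes can be sketched in its simplest form: a bare-bones (1+1) stochastic hill climber minimizing the sphere benchmark function. This is a minimal illustration of the metaheuristic loop (perturb, evaluate, keep if better, respect bounds), not an implementation of any of the named algorithms (WOA, TOA, NCTOA); the objective, bounds, and step size are assumed for the example.

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible

def sphere(x):
    """Simple benchmark objective: minimum value 0 at the origin."""
    return sum(v * v for v in x)

bounds = (-5.0, 5.0)          # bounded search space
x = [random.uniform(*bounds) for _ in range(2)]  # random initial solution
best = sphere(x)

for _ in range(2000):
    # Perturb the current solution, clamp to the bounds,
    # and keep the change only if the objective improves.
    cand = [min(max(v + random.gauss(0, 0.1), bounds[0]), bounds[1])
            for v in x]
    if sphere(cand) < best:
        x, best = cand, sphere(cand)

print(best)  # typically very close to the optimum value 0
```

Population-based metaheuristics such as WOA replace the single candidate with a swarm of solutions and add nature-inspired update rules, but the accept-if-better, constraint-respecting loop above is the common core.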