Introduction
Clustering is one of the most challenging tasks in pattern recognition (Kamel & Selim, 1994), image analysis (Omran, Salman, & Engelbrecht, 2002), and other complex applications (Bouveyron & Brunet-Saumard, 2014; Fu, Niu, Zhu, Wu, & Li, 2012; Lv et al., 2016). It is a well-known method of statistical data analysis used in many fields, including image analysis, data mining, machine learning, bioinformatics, and pattern recognition, where the data may have any distribution, shape, and size. In pattern recognition, data analysis can be performed by two different learning methods, unsupervised and supervised: the first operates only on unlabeled data, whereas the second operates on labeled data (training patterns with known category labels) (Hart & Stork, 2001; Peters & Weber, 2012). A third type of method, semi-supervised learning (Chapelle, Scholkopf, & Zien, 2009), is a hybrid of the two, in which some of the available data is labeled (supervised) while the rest is unlabeled (unsupervised). Several approaches have been applied to unsupervised learning, for instance the ant colony clustering algorithm, the K-means algorithm, genetic algorithms, Tabu Search, the Simulated Annealing approach, particle swarm optimization, the ABC algorithm, HABC, FABC, and the Cuckoo Search Algorithm (CSA) (Karaboga & Ozturk, 2011; Shah, Herawan, Naseem, & Ghazali, 2014; Zhang, Liu, Yang, & Dai, 2016). They are discussed in detail in the next section.
The K-means algorithm is one of the most widely used clustering algorithms (Selim & Al-Sultan, 1991); it is fast, simple, and center-based. Its core idea is to find a partition that minimizes the squared error between the points in each cluster and that cluster's empirical mean. The algorithm has the shortcoming that it depends strongly on the initial conditions: from its starting position the search converges to a local minimum, and with a reasonable amount of computational effort it cannot find global solutions for large problems (Fathian, Amiri, & Maroosi, 2007). To overcome this local-optima problem, researchers from various backgrounds have applied density-based clustering, artificial-intelligence-based clustering methods, partition-based clustering, and hierarchical clustering, drawing for instance on graph theory (Zahn, 1971), statistics (Forgy, 1965), expectation maximization, evolutionary algorithms, artificial neural networks, and swarm intelligence algorithms (Bakhta & Ghalem, 2014; Bouarara, Hamou, & Amine, 2015; Cheng, Shi, & Qin, 2011; Harish, Jagdish Chand, Arya, & Kusum, 2012; Tarun Kumar & Millie, 2011).
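The K-means procedure described above can be sketched as follows. This is a minimal illustrative implementation, not code from any of the cited works; the function name `kmeans` and its parameters are our own choices.

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Minimal K-means sketch: repeatedly assign each point to its
    nearest center, then move each center to its cluster's mean,
    which monotonically reduces the within-cluster squared error."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # illustrative init: k random points
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            clusters[i].append(p)
        # Update step: each center becomes the mean of its cluster.
        new_centers = []
        for c, cl in zip(centers, clusters):
            if cl:
                new_centers.append(tuple(sum(x) / len(cl) for x in zip(*cl)))
            else:
                new_centers.append(c)  # keep center of an empty cluster
        if new_centers == centers:  # converged (a local minimum)
            break
        centers = new_centers
    return centers, clusters
```

Note how the result depends entirely on the initial centers drawn in `rng.sample`: a poor draw leads the loop to a poor local minimum, which is exactly the sensitivity to starting conditions criticized in the text.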
Selim and Al-Sultan showed theoretically that the Simulated Annealing approach can resolve the problem, faced by K-means, of getting stuck at local minima (Selim & Al-Sultan, 1991). The algorithm does not "stick" to a locally optimal solution; rather, it can reach the optimal one. One disadvantage of the simulated annealing approach is that no computational characterization of a stopping point is available. Another is that verifying that a data set is standard is more difficult than solving the clustering problem itself. A new algorithm based on the Tabu Search (TS) technique has been proposed for this problem; on many test problems it achieved better results than the well-known K-means and SA algorithms (Al-Sultan, 1995).
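The escape mechanism that distinguishes simulated annealing from K-means can be sketched as below: a worse reassignment is accepted with probability exp(-Δ/T), and the temperature T is gradually cooled. This is a generic illustrative sketch, not the Selim and Al-Sultan formulation; the names `sse` and `sa_cluster` and all parameter values are our own assumptions.

```python
import math
import random

def sse(points, assign, k):
    """Sum of squared errors of an assignment of points to k clusters."""
    total = 0.0
    for c in range(k):
        cl = [p for p, a in zip(points, assign) if a == c]
        if not cl:
            continue
        mean = tuple(sum(x) / len(cl) for x in zip(*cl))
        total += sum(sum((a - b) ** 2 for a, b in zip(p, mean))
                     for p in cl)
    return total

def sa_cluster(points, k, t0=10.0, cooling=0.99, steps=2000, seed=0):
    """Simulated annealing over cluster assignments: propose moving one
    point to a random cluster; accept improvements always, and worse
    moves with probability exp(-delta / t), cooling t each step."""
    rng = random.Random(seed)
    assign = [rng.randrange(k) for _ in points]  # random start
    cost = sse(points, assign, k)
    best, best_cost = assign[:], cost
    t = t0
    for _ in range(steps):
        i = rng.randrange(len(points))
        old = assign[i]
        assign[i] = rng.randrange(k)  # propose a single-point move
        new_cost = sse(points, assign, k)
        delta = new_cost - cost
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            cost = new_cost  # accept (possibly a worse solution)
            if cost < best_cost:
                best, best_cost = assign[:], cost
        else:
            assign[i] = old  # reject, restore previous assignment
        t *= cooling
    return best, best_cost
```

Early on, the high temperature lets the search cross cost barriers that would trap a greedy method; as t shrinks the process degenerates into hill climbing, which illustrates the stopping-point difficulty noted above: the cooling schedule and step budget are chosen by hand.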