Gene Clustering Using Metaheuristic Optimization Algorithms

P. K. Nizar Banu (Department of Computer Applications, B.S. Abdur Rahman University, Chennai, India) and S. Andrews (Department of Information Technology, Mahendra Engineering College, Mallasamudram, India)
Copyright: © 2015 | Pages: 25
DOI: 10.4018/IJAMC.2015100102


Gene clustering is a common step in the exploratory analysis of high-dimensional biological data. It groups genes with similar expression patterns into the same cluster and aims at analyzing gene functions, which in turn supports the development of drugs and the early diagnosis of diseases. In recent years, many approaches based on nature-inspired metaheuristic algorithms have been proposed. Cuckoo search is one such optimization algorithm, inspired by the breeding strategy of a parasitic bird, the cuckoo. This paper proposes cuckoo search clustering and clustering using Lévy flight cuckoo search for grouping a brain tumor gene expression dataset. A comparative study is made of genetic algorithm clustering, PSO clustering, cuckoo search clustering and clustering using Lévy flight cuckoo search. A Lévy flight is a random walk whose step lengths are drawn from the Lévy distribution, a property that helps the search cover the entire search space. The breeding pattern of the cuckoo is likened to the behavior of genes that cause a tumor to grow and gradually affect other organs. The clusters generated by these algorithms are validated by measuring the closeness of the genes within a cluster and the separation of genes between clusters. The experimental results reported in this paper show that cuckoo search clustering outperforms the other clustering methods tested.
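To make the Lévy flight idea more concrete, the following Python sketch draws heavy-tailed step lengths and uses them to perturb a candidate solution. It is only an illustration under stated assumptions: the use of Mantegna's algorithm, the exponent beta = 1.5, the 0.01 step scale and the names levy_step and levy_move are hypothetical choices, since the text above does not say how the paper generates its steps.

```python
import math
import random


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step length (Mantegna's algorithm, an assumed choice)."""
    # Scale of the numerator Gaussian so that u / |v|^(1/beta) has Levy-like tails.
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def levy_move(position, best, step_scale=0.01, beta=1.5, rng=random):
    """Perturb a candidate solution with one Levy step per coordinate.

    The step is scaled by the distance to the current best solution,
    so the best solution itself is left unchanged (hypothetical update rule).
    """
    return [
        x + step_scale * levy_step(beta, rng) * (x - b)
        for x, b in zip(position, best)
    ]
```

Most steps drawn this way are small, but occasional large jumps occur, which is what lets a Lévy flight explore regions far from the current solution.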

DNA microarray technology is a fundamental tool for gene expression analysis. The accumulation of datasets from this technology, which measures the relative abundance of mRNA of thousands of genes across tens or hundreds of samples, has underscored the need for quantitative analytical tools to examine such data. Because of the large number of genes and the complexity of gene regulation networks, clustering is a useful exploratory technique for analyzing these data. Clustering divides the data of interest into a small number of relatively homogeneous groups, or clusters. There are two ways of applying cluster analysis to microarray data. One is to cluster genes according to their expression patterns across different conditions. The other is to cluster samples from different tissues, cells at different time points of a biological process, or cells under different treatments (Chen et al., 2002). Gene expression profiles can be built by measuring the transcription levels of genes in an organism under various conditions, at different developmental stages and in different tissues; these profiles characterize the dynamic functioning of each gene in the genome (Alvis & Vilo, 2000). Microarray gene expression data are presented as an M × N matrix, where M is the number of microarray experiments and N is the number of genes (Tuzhilin & Adomavicius, 2002). Further analysis must be performed on these data to retrieve useful biological information. Cluster analysis is one such technique: it discovers useful biological information by detecting genes that have similar expression profiles (Kotala et al., 2001). A wide variety of clustering algorithms are available for clustering gene expression data (Bezdek, 1981). Researchers have introduced a number of clustering algorithms; based on the characteristics of the clustering procedure, they are classified into two broad categories, namely partitional and hierarchical clustering.
Grid-based clustering (Liao et al., 2004), projection-based clustering (Bouguessa & Wang, 2009), subspace clustering (Agrawal et al., 1998), density-based clustering (Ester et al., 1996), model-based methods, graph-theoretic methods and soft computing methods are other families of clustering algorithms presented in the literature.
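The two ways of clustering microarray data described above differ only in which axis of the M × N expression matrix is treated as the set of objects. The short Python sketch below illustrates this with a toy matrix; the function names gene_profiles and euclidean, the toy values and the use of Euclidean distance are assumptions made for illustration, not details taken from the paper.

```python
import math


def gene_profiles(expression):
    """Transpose an M x N matrix (experiments x genes) into N gene profiles.

    Each returned profile holds one gene's expression level in all M experiments.
    """
    return [list(column) for column in zip(*expression)]


def euclidean(a, b):
    """Euclidean distance between two profiles of equal length (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy data: M = 2 experiments (rows), N = 3 genes (columns).
expression = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]

# Clustering genes: compare the columns, i.e. the transposed rows.
profiles = gene_profiles(expression)
gene_distance = euclidean(profiles[0], profiles[1])

# Clustering samples: compare the rows of the matrix directly.
sample_distance = euclidean(expression[0], expression[1])
```

In this orientation, gene clustering operates on N vectors of length M, while sample clustering operates on M vectors of length N.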

In recent years, optimization algorithms have also been introduced for clustering. In optimization-based clustering, minimizing the sum of squared error is taken as the objective, and researchers use the optimization procedure defined by their chosen algorithm to solve this clustering objective (Binu et al., 2013). Following a similar procedure, Genetic Algorithm (Mualik & Bandyopadhyay, 2002), Particle Swarm Optimization (Premalatha & Natrajan, 2008), bacterial foraging optimization (Wan et al., 2012), simulated annealing (Selim & Alsultan, 1991), artificial bee colony (Zhang et al., 2010; Karaboga & Ozturk, 2011), firefly algorithm (Senthilnath et al., 2011) and cuckoo search (Goel et al., 2011) have been applied to clustering. This paper focuses on the application of cuckoo search based clustering to a brain tumor gene expression dataset.
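The sum of squared error objective mentioned above can be sketched in a few lines of Python. In this illustration a candidate solution is a list of cluster centroids, each gene is assigned to its nearest centroid, and the squared distances are added up; a metaheuristic such as cuckoo search would then search for the centroids that minimize this value. The function names and the toy data are hypothetical, and the exact encoding used in the paper may differ.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(profiles, centroids):
    """Return, for each profile, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in profiles
    ]


def sum_squared_error(profiles, centroids):
    """Clustering objective: total squared distance to the nearest centroid."""
    return sum(
        min(squared_distance(p, c) for c in centroids)
        for p in profiles
    )


# Toy data: three gene profiles and a candidate solution with two centroids.
profiles = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0]]
centroids = [[0.0, 1.0], [10.0, 10.0]]

labels = assign(profiles, centroids)            # [0, 0, 1]
fitness = sum_squared_error(profiles, centroids)  # 1 + 1 + 0 = 2.0
```

A lower value means that genes lie closer to their assigned centroids, so any of the optimizers listed above can be compared by the fitness of the best centroid set each one finds.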
