Michigan vs. Pittsburgh Style GA Optimisation of Fuzzy Rule Bases for Gene Expression Analysis

Gerald Schaefer (Department of Computer Science, Loughborough University, Loughborough, UK) and Tomoharu Nakashima (School of Knowledge and Information Systems, Osaka Prefecture University, Sakai, Japan)
Copyright: © 2013 | Pages: 13
DOI: 10.4018/ijfsa.2013100105


Microarray studies and gene expression analysis have received a lot of attention and provide many promising avenues towards the understanding of fundamental questions in biology and medicine. In this paper, the authors perform gene expression analysis and apply two hybrid GA-fuzzy approaches to classify gene expression data. Both are based on fuzzy if-then rule bases but they differ in the way these rule bases are optimised. The authors employ both a Michigan style approach, where single rules are handled as individuals in the population of the genetic algorithm, and a Pittsburgh type algorithm, which treats whole rule sets as individuals. Experimental results show that both approaches achieve good classification accuracy but that the Michigan style algorithm clearly outperforms the Pittsburgh classifier.
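
The difference between the two encodings can be sketched in a few lines. This is only an illustration and not the authors' implementation: the rule representation (one fuzzy label index per gene), the constants `N_GENES`, `N_LABELS` and `RULES_PER_SET`, and the function names are all assumptions made for the example. It also assumes the common reading of the Michigan style in which the population as a whole forms the rule base.

```python
import random

N_GENES = 4        # hypothetical number of genes used in each rule
N_LABELS = 3       # hypothetical number of fuzzy labels per gene
RULES_PER_SET = 5  # hypothetical size of one rule base


def random_rule(rng):
    """One fuzzy if-then rule, encoded as one antecedent label index per gene."""
    return tuple(rng.randrange(N_LABELS) for _ in range(N_GENES))


def michigan_population(rng, size=RULES_PER_SET):
    """Michigan style: each individual is a single rule."""
    return [random_rule(rng) for _ in range(size)]


def pittsburgh_population(rng, size=10):
    """Pittsburgh style: each individual is a whole rule set."""
    return [[random_rule(rng) for _ in range(RULES_PER_SET)] for _ in range(size)]


rng = random.Random(0)
michigan = michigan_population(rng)      # list of rules
pittsburgh = pittsburgh_population(rng)  # list of rule sets
```

In this sketch, a Michigan style GA would assign fitness to each rule in `michigan`, whereas a Pittsburgh type GA would assign fitness to each complete rule set in `pittsburgh`.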
Article Preview

One of the main challenges in classifying gene expression data is that the number of genes is typically much higher than the number of analysed samples. Also, it is not clear which genes are important and which can be omitted without reducing classification performance. Many pattern classification techniques have been employed to analyse microarray data. For example, Golub et al. (1999) used a weighted voting scheme, Fort and Lambert-Lacroix (2005) employed partial least squares and logistic regression techniques, whereas Furey et al. (2000) applied support vector machines. Dudoit et al. (2002) investigated nearest neighbour classifiers, discriminant analysis, classification trees and boosting, while Statnikov et al. (2005) explored several support vector machine techniques, nearest neighbour classifiers, neural networks and probabilistic neural networks. Several of these studies have found that no single classification algorithm performs best on all datasets (although SVMs seem to perform best on several of them), and that exploring several classifiers is hence useful. Similarly, as several studies (Liu, Li, & Wong, 2002; Statnikov et al., 2005) have shown, no universally ideal gene selection method has yet been found.

Several authors have used fuzzy logic to analyse gene expression data before. Woolf and Wang (2000) used fuzzy rules to explore the relationships between several genes of a profile, while Vinterbo, Kim, and Ohno-Machado (2005) used fuzzy rule bases to classify gene expression data. However, Vinterbo et al.’s method has the disadvantage that it allows only linear discrimination and that it describes each gene by only two fuzzy partitions (‘up’ and ‘down’). In Schaefer et al. (2007) and Schaefer & Nakashima (2010), we presented a fuzzy rule-based classification system to analyse microarray expression data. Gene expression data was described by fuzzy sets, and rules built from combinations of these sets were employed to arrive at a classification. In Schaefer & Nakashima (2010), we derived a more compact rule base using a GA that assesses the fitness of individual rules and selects a rule ensemble that maximises classification performance.
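
To illustrate how a fuzzy if-then rule base can arrive at a classification, the following is a minimal sketch under stated assumptions rather than the system described in the cited papers. The triangular membership functions, the three labels, the product combination of membership grades, the rule weights and the single-winner decision are all assumptions chosen for the example, as are the rule contents and class names.

```python
def triangular(x, centre, width):
    """Triangular membership: 1 at centre, falling to 0 at centre +/- width."""
    return max(0.0, 1.0 - abs(x - centre) / width)


# Hypothetical three-label partition of an expression value normalised to [0, 1].
LABELS = {"low": 0.0, "medium": 0.5, "high": 1.0}
WIDTH = 0.5


def compatibility(antecedent, sample):
    """Product of the membership grades of each gene value in its rule label."""
    grade = 1.0
    for label, value in zip(antecedent, sample):
        grade *= triangular(value, LABELS[label], WIDTH)
    return grade


def classify(rules, sample):
    """Single-winner decision: the rule with the highest weighted grade decides.

    Each rule is (antecedent labels, class, weight). Returns None when no
    rule is compatible with the sample.
    """
    best_class, best_score = None, 0.0
    for antecedent, cls, weight in rules:
        score = compatibility(antecedent, sample) * weight
        if score > best_score:
            best_class, best_score = cls, score
    return best_class


# Hypothetical two-gene rule base, for illustration only.
rules = [
    (("high", "low"), "tumour", 0.9),
    (("low", "high"), "normal", 0.8),
]
prediction = classify(rules, [0.9, 0.2])
```

Here the sample `[0.9, 0.2]` is compatible only with the first rule, so `prediction` is `"tumour"`. A sample that matches no rule is left unclassified (`None`).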
