Breast Cancer Classification With Microarray Gene Expression Data Based on Improved Whale Optimization Algorithm


S. Sathiya Devi, Prithiviraj K.
Copyright: © 2023 | Pages: 21
DOI: 10.4018/IJSIR.317091

Abstract

Breast cancer is one of the most common and dangerous cancers in women worldwide. Because it is generally a genetic disease, microarray-based cancer prediction is an important option among the many available diagnostic methods. Microarray gene expression data, however, contain few samples and many redundant and noisy genes, which leads to inaccurate diagnosis and low prediction accuracy. To overcome these difficulties, this paper proposes an Improved Whale Optimization Algorithm (IWOA) for wrapper-based feature selection in gene expression data. The proposed IWOA incorporates modified crossover and mutation operations to enhance the exploration and exploitation of the classical WOA. It adopts a multiobjective fitness function that balances minimizing the error rate against minimizing the number of selected features. Experimental analysis demonstrates that the proposed IWOA with a Gradient Boost Classifier (GBC) achieves a high classification accuracy of 97.7% with a minimal subset of features and converges quickly on the breast cancer dataset.
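The abstract does not state the form of the multiobjective fitness function. A common choice in wrapper-based feature selection is a weighted sum of the classification error rate and the fraction of features kept, and the sketch below illustrates that idea only. The function name `wrapper_fitness`, the weight `alpha`, and its default value are assumptions made for illustration and are not taken from the paper.

```python
def wrapper_fitness(error_rate, n_selected, n_total, alpha=0.99):
    """Weighted-sum fitness (lower is better).

    ASSUMPTION: this weighted-sum form and the default alpha are illustrative;
    the paper's actual fitness function is not given in the abstract.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must lie in [0, 1]")
    if n_total <= 0 or not 0 < n_selected <= n_total:
        raise ValueError("need 0 < n_selected <= n_total")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # First term rewards accurate subsets, second term rewards small subsets.
    return alpha * error_rate + (1.0 - alpha) * (n_selected / n_total)


# Two hypothetical candidate subsets with the same error rate:
# the one that keeps fewer genes receives the lower (better) fitness.
small_subset = wrapper_fitness(error_rate=0.05, n_selected=20, n_total=2000)
large_subset = wrapper_fitness(error_rate=0.05, n_selected=400, n_total=2000)
print(small_subset < large_subset)
```

With `alpha` close to 1, the error rate dominates and the subset size acts mainly as a tie-breaker, which is one plausible way to trade off the two objectives.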

In microarray datasets, the high ratio between the very large number of genes (features) and the small number of samples results in inaccurate and imbalanced cancer prediction. Most of the genes in microarray data are uninformative or redundant. Such genes can be identified with a machine learning technique called feature subset selection. Feature selection not only identifies the significant genes but also improves classification accuracy. There are three approaches to feature selection: (i) the filter method, (ii) the wrapper method, and (iii) the hybrid method. In the filter method, feature importance is measured from properties of the dataset, and the features are ranked by their relevance score (feature importance score). This method is simple and fast but does not consider the correlation among features. The wrapper method generally incorporates a predefined classification algorithm to search for and select the relevant features. This method considers feature dependencies but is computationally intensive and slower. The hybrid method combines the filter and wrapper methods: it first applies a filter technique to reduce the feature space and then uses the wrapper method for feature subset selection. Although the wrapper method is expensive, it has proved beneficial in finding feature subsets that suit a predetermined classifier (Alshamlan et al., 2015).
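The hybrid approach described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method proposed in the paper: the filter score (a simple signal-to-noise ratio), the wrapper search (greedy forward selection), the classifier (nearest centroid with leave-one-out evaluation), the function names, and the toy data are all hypothetical choices made to keep the example self-contained.

```python
from statistics import mean, pstdev


def filter_scores(X, y):
    """Filter step: score each feature from the data alone (no classifier)."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        spread = pstdev(pos) + pstdev(neg) + 1e-12  # avoid division by zero
        scores.append(abs(mean(pos) - mean(neg)) / spread)
    return scores


def loo_error(X, y, features):
    """Leave-one-out error of a nearest-centroid classifier on `features`.

    Assumes every class has at least two samples.
    """
    errors = 0
    for i in range(len(X)):
        centroids = {}
        for label in sorted(set(y)):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            centroids[label] = [mean([r[j] for r in rows]) for j in features]
        point = [X[i][j] for j in features]
        predicted = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
        )
        if predicted != y[i]:
            errors += 1
    return errors / len(X)


def hybrid_select(X, y, keep=2):
    """Filter first (keep the top-`keep` features), then wrap a classifier."""
    scores = filter_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    candidates = ranked[:keep]
    selected, best_error = [], 1.0
    improved = True
    while improved:  # greedy forward selection over the filtered candidates
        improved = False
        for j in candidates:
            if j in selected:
                continue
            err = loo_error(X, y, selected + [j])
            if err < best_error:
                best_error, best_feature, improved = err, j, True
        if improved:
            selected.append(best_feature)
    return selected, best_error


# Toy data (hypothetical): 6 samples, 4 "genes"; only gene 0 separates the classes.
X = [
    [5.1, 0.2, 1.0, 3.3],
    [4.9, 0.9, 1.1, 3.1],
    [5.3, 0.5, 0.9, 3.2],
    [1.0, 0.3, 1.0, 3.2],
    [1.2, 0.8, 1.1, 3.3],
    [0.8, 0.6, 0.9, 3.1],
]
y = [1, 1, 1, 0, 0, 0]

selected, error = hybrid_select(X, y, keep=2)
print(selected, error)
```

The filter step never trains a classifier, which is why it is fast, while the wrapper step retrains the classifier for every candidate subset, which is where the extra cost comes from. A metaheuristic such as WOA would replace the greedy loop in `hybrid_select` with a population-based search over feature subsets.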
