Abstract
Feature selection is essential for improving classification effectiveness. This paper presents a new adaptive algorithm called FS-PeSOA (feature selection penguins search optimization algorithm), a meta-heuristic feature selection method based on the “Penguins Search Optimization Algorithm” (PeSOA). It is combined with different classifiers to find the feature subset that achieves the highest classification accuracy. To explore the candidate feature subsets, the bio-inspired PeSOA approach generates a trial feature subset during the search and estimates its fitness value with three classifiers, one per case: Naive Bayes (NB), K-Nearest Neighbors (KNN) and Support Vector Machines (SVMs). The proposed approach was evaluated on six well-known benchmark datasets (Wisconsin Breast Cancer, Pima Diabetes, Mammographic Mass, Dermatology, Colon Tumor and Prostate Cancer). Experimental results show that FS-PeSOA achieves the highest classification accuracy across the different datasets.
Sensitive data, such as patients’ records and body images containing tumor- and surgery-related information, should not be in the public domain. All such data should remain within the hospital and not in any public cloud. Hence, the design and implementation of private clouds is essential for biomedical scientists to generate, process, update, archive and store their data (Chang & Wills, 2016). Six benchmark datasets are used in this paper: the Wisconsin Breast Cancer, Pima Diabetes, Mammographic Mass, and Dermatology datasets were obtained from the UCI machine learning repository (UCI), and the colon cancer and prostate cancer datasets were taken from the Kent Ridge Biomedical Data Repository. The main characteristics of these datasets are summarized in Table 1.
Table 1. The characteristics of the used datasets

| Dataset | Features | Instances | Classes | Missing Values |
|---|---|---|---|---|
| Wisconsin Breast Cancer | 32 | 569 | 2 | No |
| Pima Diabetes | 8 | 768 | 2 | Yes |
| Mammographic Mass | 5 | 961 | 2 | Yes |
| Dermatology | 33 | 366 | 6 | Yes |
| Colon Tumor | 2000 | 62 | 2 | No |
| Prostate Cancer | 12600 | 21 | 2 | No |
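
The fitness evaluation mentioned in the abstract, where a trial feature subset is scored by the accuracy of a classifier restricted to it, can be illustrated with a minimal sketch. This is not the authors' implementation: the binary-mask encoding, the function name `subset_fitness`, the 1-nearest-neighbour classifier, the leave-one-out protocol, and the toy data are all assumptions made for illustration. The PeSOA search itself is not shown.

```python
from math import dist


def subset_fitness(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier that
    only sees the features j for which mask[j] is truthy.

    Hypothetical helper: the paper's actual fitness function and
    validation protocol are not given in this excerpt.
    """
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty subset cannot classify anything

    # Project every sample onto the selected features.
    projected = [[row[j] for j in cols] for row in X]

    correct = 0
    for i, point in enumerate(projected):
        # Nearest neighbour among all *other* samples.
        nearest = min(
            (k for k in range(len(projected)) if k != i),
            key=lambda k: dist(point, projected[k]),
        )
        correct += y[nearest] == y[i]
    return correct / len(projected)


# Toy data (assumed): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.1, -4.0], [0.2, 9.0],
     [1.0, 5.1], [1.1, -4.1], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]

informative = subset_fitness(X, y, [1, 0])  # keep only feature 0
noisy = subset_fitness(X, y, [0, 1])        # keep only feature 1
```

On this toy data the subset containing only the informative feature scores higher than the subset containing only the noisy one. A wrapper-style search would use that difference to prefer one candidate subset over another.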