A Heterogeneous AdaBoost Ensemble Based Extreme Learning Machines for Imbalanced Data

Adnan Omer Abuassba (University of Science and Technology Beijing (USTB), Beijing, China & Arab Open University - Palestine, Ramallah, Palestine), Dezheng Zhang (University of Science and Technology Beijing (USTB), Beijing, China) and Xiong Luo (University of Science and Technology Beijing (USTB), Beijing, China)
DOI: 10.4018/IJCINI.2019070102


Extreme learning machine (ELM) is an effective learning algorithm for the single hidden layer feed-forward neural network (SLFN). It comes in several variants that differ in their kernels or feature mapping functions, including kernel ELM and regularized ELM, and it learns quickly while often achieving good performance. Dealing with imbalanced data has long been a focus for learning algorithms that aim at satisfactory analytical results. An unbalanced class distribution poses serious obstacles to learning tasks in real-world applications, including online visual tracking and image quality assessment. This article addresses the issue with a diverse AdaBoost-based ELM ensemble (AELME) for imbalanced binary and multiclass data classification, with the aim of improving classification accuracy on imbalanced data. In the proposed method, the ensemble is built by splitting the training data into corresponding subsets. Different enhanced ELM algorithms, including regularized ELM and kernel ELM, are used as base learners, so that a strong learner is constructed from a group of relatively weak base learners. AELME is implemented by training a randomly selected ELM classifier on a subset chosen by random re-sampling. The labels of unseen data are then predicted using the weighting approach. AELME is validated through classification on real-world benchmark datasets.

1. Introduction

Among the popular machine learning methods (Adnan, Abuassba et al., 2017a; Bezdek, 2016; Chen, Li et al., 2018; Luo, Sun et al., 2018; Luo, Jiang et al., 2019; Luo, Xu et al., 2018; Adnan, Abuassba et al., 2017b), the extreme learning machine (ELM) is well known for solving classification and regression problems in real-world applications. It is designed for a single hidden layer feed-forward network (SLFN). It has been shown theoretically and practically (Huang, Zhu et al., 2006; Huang, Wang et al., 2010; Huang, Zhou et al., 2012; Huang, 2014) that ELM is efficient and fast in both classification and regression (Liu, He et al., 2008; Huang, Ding et al., 2010). Unlike traditional gradient-based algorithms, it avoids parameter tuning. The imbalanced data problem arises when the negative (majority) class dominates the positive (minority) class; that is, the number of majority class examples exceeds the number of minority class examples. Many real-world applications suffer from imbalanced data, including text classification (Song, Huang et al., 2016), credit card fraud detection (Hirose, Ozawa et al., 2016), fault diagnosis (Duan, Xie et al., 2016), medical diagnosis (Mazurowski, Habas et al., 2008), and others.
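To make the ELM idea concrete, the following is a minimal sketch of a basic regularized ELM for binary labels, assuming NumPy is available. It is an illustration under stated assumptions, not the implementation used in the article: the function names `train_elm` and `predict_elm`, the sigmoid feature mapping, and the parameters `n_hidden` and `reg` are hypothetical choices made for this example. The point it shows is that the hidden-layer weights are drawn at random and left untouched, while the output weights are obtained in a single linear solve rather than by gradient descent.

```python
import numpy as np

def train_elm(X, y, n_hidden=20, reg=1e-3, seed=0):
    """Fit a basic regularized ELM for binary labels in {-1, +1}.

    n_hidden, reg and seed are illustrative parameters (assumptions).
    """
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are random and never tuned.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))  # sigmoid feature mapping
    # Output weights from ridge-regularized least squares:
    # beta = (H^T H + reg * I)^{-1} H^T y
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def predict_elm(model, X):
    """Return predicted labels in {-1, +1} for the rows of X."""
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.where(H @ beta >= 0, 1.0, -1.0)
```

Because training reduces to one linear solve, such a model is cheap to fit, which is one reason ELMs are convenient base learners when many classifiers must be trained inside an ensemble.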

Because the class distribution is unbalanced, learning from imbalanced data is not a trivial process for standard machine learning algorithms, which tend to be biased toward the negative class and to ignore the positive one. Yet correctly predicting the positive (minority) class is often more important than predicting the negative one. Imbalanced class learning has therefore drawn increasing attention in recent years. Previous research addresses this issue at the data level (Fernández, García et al., 2008), at the algorithm level, and through cost-sensitive methods (Sun, Kamel et al., 2007; Tapkan, Özbakir et al., 2016), which combine both.

At the data level, a preprocessing technique such as under-sampling, over-sampling, or a hybrid of the two is used to balance the original data. Under-sampling eliminates a number of majority class examples; however, it may discard informative examples. Likewise, over-sampling increases the number of minority class examples; however, it may over-fit the training data. Hybrid methods have been proposed to deal with these issues. The Synthetic Minority Over-sampling Technique (SMOTE) creates new synthetic examples based on the similarity between existing ones (Rani, Ramadevi et al., 2016). However, it increases the overlap between classes when used for over-sampling. At the algorithm level, on the other hand, the learning algorithm itself is designed to suit imbalanced data. Cost-sensitive learning is one such approach: a penalty cost is applied to misclassified examples, i.e., misclassified minority class examples are assigned a higher cost than misclassified majority class ones (Tapkan, Özbakir et al., 2016). Several researchers (Jiang, Shen et al., 2015; Zhang, Liu et al., 2016; Ren, Cao et al., 2017) have proposed ELM ensemble techniques to address the imbalanced classification problem. The ELM ensemble methodology assigns weights to the training examples so that samples misclassified by the previous classifier receive more attention.
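The data-level techniques described above can be sketched in a few lines of plain Python. This is a simplified illustration, not a reference implementation of SMOTE: the function names `smote_like` and `random_undersample` and the parameters `n_new`, `k`, `n_keep`, and `seed` are hypothetical, and the interpolation step only mimics the core idea of placing a synthetic point on the segment between a minority example and one of its nearest minority neighbours.

```python
import random

def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours.

    Assumes `minority` holds at least two feature vectors (lists of floats).
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - c) ** 2 for a, c in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (c - a) for a, c in zip(base, neighbour)])
    return synthetic

def random_undersample(majority, n_keep, seed=0):
    """Keep a random subset of the majority class (this discards examples)."""
    return random.Random(seed).sample(majority, n_keep)
```

Since every synthetic point lies between two existing minority examples, over-sampling of this kind can place points in regions shared with the majority class, which is the overlap risk noted above; random under-sampling, in turn, makes no attempt to keep the most informative majority examples.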

Weighted ELM (Li, Kong et al., 2014) and the AdaBoost algorithm have been combined in a unified structure. Weighted ELM assigns a different weight to each training example in a way that alleviates the impact of the class imbalance, by giving extra weight to the minority class. These weights were chosen by the user, which accordingly affected performance. How to determine the sample weights therefore remains an unsolved issue. A multiclass ELM ensemble that combines ELM and AdaBoost has also been proposed (Jiang, Shen et al., 2015). It was applied directly to a group of ELMs in a face recognition application. An ELM with a fuzzy activation function (Wang and Li, 2010) has been proposed as the base learner in a dynamic AdaBoost ensemble of ELMs.
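The interplay between per-sample weights and boosting can be illustrated with a short plain-Python sketch. It is an assumption-laden example rather than the scheme of any cited work: the function names `class_balanced_weights` and `adaboost_update` are hypothetical, the initial weights are set inversely proportional to class size as one common heuristic (the text above notes that weights were user-chosen), and the update is the standard binary AdaBoost rule.

```python
import math

def class_balanced_weights(labels):
    """Initial weights inversely proportional to class size, summing to 1.

    Inverse class frequency is an illustrative choice (assumption).
    """
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    raw = [1.0 / counts[y] for y in labels]
    total = sum(raw)
    return [w / total for w in raw]

def adaboost_update(weights, labels, predictions):
    """One round of binary AdaBoost: returns (alpha, new_weights)."""
    # Weighted error of the current base learner.
    err = sum(w for w, y, p in zip(weights, labels, predictions) if y != p)
    err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
    alpha = 0.5 * math.log((1 - err) / err)  # vote of this learner
    # Raise the weights of misclassified samples, lower the others.
    new = [w * math.exp(alpha if y != p else -alpha)
           for w, y, p in zip(weights, labels, predictions)]
    total = sum(new)
    return alpha, [w / total for w in new]
```

With class-balanced starting weights, each class carries the same total weight regardless of its size, so a misclassified minority example costs the base learner more than a misclassified majority example; each boosting round then shifts further weight onto whatever the previous classifier got wrong.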
