Ensemble Learning via Extreme Learning Machines for Imbalanced Data

Adnan Omer Abuassba (Arab Open University, Palestine), Dezheng O. Zhang (School of Computer and Communication Engineering, University of Science and Technology Beijing, China) and Xiong Luo (School of Computer and Communication Engineering, University of Science and Technology Beijing, China)
DOI: 10.4018/978-1-7998-3038-2.ch004


Ensembles reduce the risk of selecting the wrong model by aggregating all candidate models, and they are generally more accurate than single models. Accuracy has been identified as an important factor in explaining the success of ensembles, and several techniques have been proposed to improve it, but no single technique has yet proved best in every setting. The focus of this research is on how to create accurate extreme learning machine (ELM) ensembles for classification that can deal with supervised data, noisy data, imbalanced data, and semi-supervised data. To address these issues, the authors propose a heterogeneous ELM ensemble. The proposed heterogeneous ensemble of ELMs (AELME) for classification combines different ELM algorithms, including regularized ELM (RELM) and kernel ELM (KELM). The authors propose a new, diverse AdaBoost ensemble-based ELM (AELME) for binary and multiclass classification to deal with the imbalanced data issue.
Chapter Preview


Among the popular machine learning methods (Abuassba, Zhang, Luo, Zhang, & Aziguli, 2017; Bezdek, 2016; Chen, Li et al., 2018; Luo, Sun et al., 2018; Luo, Jiang et al., 2019; Luo, Xu et al., 2018; Abuassba et al., 2018), the extreme learning machine (ELM) is well known for solving classification and regression problems in real-world applications. It is designed for a single hidden layer feed-forward network (SLFN). It has been shown theoretically and empirically (Huang, Zhu et al., 2006; Huang, Wang et al., 2010; Huang, Zhou et al., 2012; Huang, 2014) that ELM is efficient and fast in both classification and regression (Liu, He et al., 2008; Huang, Ding et al., 2010). Unlike traditional gradient-based algorithms, it avoids iterative parameter tuning. The imbalanced data issue appears when the negative (majority) class dominates the positive (minority) class, that is, when the number of majority class examples greatly exceeds the number of minority class examples. Many real-world applications suffer from imbalanced data, including text classification (Song, Huang et al., 2016), credit card fraud detection (Hirose, Ozawa et al., 2016), fault diagnosis (Duan, Xie et al., 2016), medical diagnosis (Mazurowski, Habas et al., 2008), and others.
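To make the description of ELM concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation. It assumes a sigmoid activation, Gaussian random hidden-layer weights that are never trained, and a small ridge term when solving for the output weights (in the spirit of regularized ELM). The names train_elm, predict_elm, n_hidden, and reg are illustrative assumptions.

```python
import numpy as np


def _hidden_output(X, W, b):
    # Sigmoid activation of the randomly projected inputs.
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def train_elm(X, T, n_hidden=20, reg=1e-3, seed=0):
    """Fit a basic ELM. X: (n, d) inputs, T: (n, c) one-hot targets."""
    rng = np.random.default_rng(seed)
    # Hidden-layer parameters are drawn once at random and then kept fixed.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = _hidden_output(X, W, b)
    # Output weights come from one ridge-regularized least-squares solve:
    # beta = (H^T H + reg * I)^(-1) H^T T, with no gradient iterations.
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta


def predict_elm(model, X):
    """Return the index of the highest-scoring class for each row of X."""
    W, b, beta = model
    return np.argmax(_hidden_output(X, W, b) @ beta, axis=1)
```

Because the only fitted quantity is beta, training reduces to a single linear solve, which is why ELM needs no iterative tuning of the hidden layer.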

Because the class distribution is unbalanced, learning from imbalanced data is not trivial for standard machine learning algorithms: they tend to be biased toward the negative (majority) class and to ignore the positive one. Yet correctly predicting the positive class is often more important than predicting the negative one. Imbalanced class learning has therefore drawn increasing attention in recent years. Previous research addresses this issue at the data level (Fernández, García et al., 2008), at the algorithm level, and through cost-sensitive methods (Sun, Kamel et al., 2007; Tapkan, Özbakir et al., 2016), which combine both.

At the data level, a preprocessing technique such as under-sampling, over-sampling, or a hybrid of the two is used to balance the original data. Under-sampling eliminates a number of majority class examples; however, it may discard informative examples. Likewise, over-sampling increases the number of minority class examples; however, it may over-fit the training data. Hybrid methods have been proposed to deal with these issues. The Synthetic Minority Oversampling Technique (SMOTE) creates new synthetic examples based on the similarity between existing ones (Rani, Ramadevi et al., 2016); however, it can increase the overlap between classes. At the algorithm level, on the other hand, the learning algorithm itself is designed to suit imbalanced data. Cost-sensitive learning is one such approach: it applies a penalty cost to misclassified examples, typically assigning a higher cost to misclassifying a minority class example than a majority class one (Tapkan, Özbakir et al., 2016). Several researchers (Jiang, Shen et al., 2015; Zhang, Liu et al., 2016; Ren, Cao et al., 2017) have proposed ELM ensemble techniques to address the imbalanced classification problem. The ELM ensemble methodology assigns weights to the training examples so that samples misclassified by the previous classifier receive more attention from the next one.
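The two ideas in this paragraph, synthetic over-sampling and reweighting of misclassified samples, can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the method of the cited works: smote_like_sample shows only the interpolation step of SMOTE (the nearest-neighbor search is omitted), and adaboost_reweight is the textbook binary AdaBoost update for labels in {-1, +1} with weights that already sum to 1. Both function names are hypothetical.

```python
import math
import random


def smote_like_sample(x, neighbor, rng=random):
    """Create one synthetic minority point on the segment from x to neighbor."""
    gap = rng.random()  # random position in [0, 1) along the segment
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]


def adaboost_reweight(weights, y_true, y_pred):
    """One boosting round: raise the weights of misclassified samples.

    Assumes labels in {-1, +1} and weights that sum to 1.
    Returns the normalized new weights and the classifier's vote weight alpha.
    """
    # Weighted error of the current classifier.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
    alpha = 0.5 * math.log((1 - err) / err)
    # t * p is +1 for a correct prediction and -1 for a wrong one, so
    # correct samples shrink and misclassified samples grow.
    new = [w * math.exp(-alpha * t * p)
           for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(new)
    return [w / total for w in new], alpha
```

After one round the misclassified samples together carry half of the total weight, so the next classifier in the ensemble is pushed to handle them.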

Key Terms in this Chapter

Single Hidden Layer Feed Forward Networks (SLFNs): Artificial neural networks with one hidden layer.

Extreme Learning Machine (ELM): A single hidden layer feed-forward network, later extended to multilayer networks. Proposed by Huang Guang-Bin (Huang, 2015).

Geometric Mean (G-Mean): The geometric mean of sensitivity and specificity, which measures the overall performance of a learning algorithm. It is calculated as the square root of the product of sensitivity and specificity.

AdaBoost: Adaptive boosting, a boosting technique that focuses on instances that are hard to classify.

Imbalance Ratio (IR): The ratio of the number of instances in the negative class to the number of instances in the positive class.

Boosting: A machine learning ensemble technique that combines many relatively weak and inaccurate learners to construct a single accurate one.

Imbalanced Data: Data in which the number of instances in one class heavily dominates the number in the other.

Synthetic Minority Oversampling Technique (SMOTE): An over-sampling technique that addresses the imbalanced data issue by adding synthetic minority instances to the data set.

Receiver Operating Characteristics (ROC): A measurement used to compare learners’ performance on imbalanced data.
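Two of the terms above, G-Mean and IR, reduce to short formulas. The sketch below writes them out in Python from confusion-matrix counts; the function and argument names are illustrative, and the positive class is assumed to be the minority class.

```python
import math


def g_mean(tp, fn, tn, fp):
    """Square root of sensitivity times specificity."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return math.sqrt(sensitivity * specificity)


def imbalance_ratio(n_negative, n_positive):
    """Number of negative (majority) instances per positive (minority) one."""
    return n_negative / n_positive
```

A classifier that labels every instance as negative has a sensitivity of zero, so its G-Mean is zero even when its plain accuracy is high; this is why G-Mean is preferred to accuracy on imbalanced data.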
