1. Introduction
Among the popular machine learning methods (Adnan, Abuassba et al., 2017a, 2017b; Bezdek, 2016; Chen, Li et al., 2018; Luo, Sun et al., 2018; Luo, Jiang et al., 2019; Luo, Xu et al., 2018), the extreme learning machine (ELM) is well known for solving classification and regression problems in real-world applications. It is designed for a single hidden layer feed-forward network (SLFN). It has been shown theoretically and practically (Huang, Zhu et al., 2006; Huang, Wang et al., 2010; Huang, Zhou et al., 2012; Huang, 2014) that ELM is efficient and fast in both classification and regression (Liu, He et al., 2008; Huang, Ding et al., 2010). Unlike traditional gradient-based algorithms, ELM avoids iterative parameter tuning: the hidden layer parameters are assigned at random and only the output weights are learned. The imbalanced data problem arises when the negative (majority) class dominates the positive (minority) class, i.e., the number of majority class examples greatly exceeds the number of minority class examples. Many real-world applications suffer from imbalanced data, including text classification (Song, Huang et al., 2016), credit card fraud detection (Hirose, Ozawa et al., 2016), fault diagnosis (Duan, Xie et al., 2016), and medical diagnosis (Mazurowski, Habas et al., 2008).
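To make the ELM training procedure concrete, the following is a minimal NumPy sketch (the function names, the number of hidden nodes, and the sigmoid activation are illustrative assumptions, not taken from the cited works): the input weights and biases are drawn at random and never tuned, and only the output weights are solved for in closed form via the Moore-Penrose pseudoinverse.

```python
# Minimal ELM sketch for an SLFN; illustrative, not the cited authors' code.
import numpy as np

def elm_train(X, T, n_hidden, rng=np.random.default_rng(0)):
    """X: (n, d) inputs, T: (n, m) one-hot targets. Returns fixed random
    hidden parameters and the learned output weights."""
    d = X.shape[1]
    W = rng.standard_normal((d, n_hidden))   # random input weights, never tuned
    b = rng.standard_normal(n_hidden)        # random hidden biases, never tuned
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # hidden layer output (sigmoid)
    beta = np.linalg.pinv(H) @ T             # closed-form output weights (pseudoinverse)
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return (H @ beta).argmax(axis=1)         # predicted class index
```

Because training reduces to a single linear least-squares solve, there is no gradient descent loop, which is the source of ELM's speed advantage noted above.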
Because the class distribution is unbalanced, learning from imbalanced data is not a trivial process for standard machine learning algorithms: they tend to be biased toward the negative class and to ignore the positive one, even though correctly predicting the positive class is often the more important task. Imbalanced class learning has therefore drawn increasing attention in recent years. Prior research addresses this issue at the data level (Fernández, García et al., 2008), at the algorithm level, and through cost-sensitive methods (Sun, Kamel et al., 2007; Tapkan, Özbakir et al., 2016) that combine both.
At the data level, a preprocessing technique is used to balance the original data, such as under-sampling, over-sampling, or a hybrid of the two. Under-sampling removes a number of majority class examples, but it may discard informative ones. Over-sampling, conversely, increases the number of minority class examples, but it may overfit the training data. Hybrid methods have been proposed to cope with these issues. The Synthetic Minority Oversampling Technique (SMOTE) creates new synthetic examples based on the similarity between existing minority examples (Rani, Ramadevi et al., 2016); as a side effect, it can increase the overlap between classes. At the algorithm level, on the other hand, the learning algorithm itself is designed to suit imbalanced data. Cost-sensitive learning is one such approach, in which a penalty cost is imposed on misclassified examples, with misclassified minority class examples charged a higher cost than majority class ones (Tapkan, Özbakir et al., 2016). Several researchers (Jiang, Shen et al., 2015; Zhang, Liu et al., 2016; Ren, Cao et al., 2017) have proposed ELM ensemble techniques to address the imbalanced classification problem. The ELM ensemble methodology assigns weights to training examples so that the samples misclassified by the previous classifier receive more attention.
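As an illustration of the data-level route, here is a short sketch of SMOTE-style interpolation (the function name and the parameters k and n_synthetic are illustrative): each synthetic point lies on the line segment between a minority example and one of its k nearest minority neighbours, which is also why the technique can increase class overlap near the decision boundary.

```python
# Sketch of SMOTE-style oversampling of the minority class; illustrative only.
import numpy as np

def smote(X_min, n_synthetic, k=5, rng=np.random.default_rng(0)):
    """X_min: (n, d) minority class examples with n > k.
    Returns (n_synthetic, d) interpolated synthetic examples."""
    # pairwise squared distances among minority examples
    d2 = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                 # exclude each point as its own neighbour
    neighbours = np.argsort(d2, axis=1)[:, :k]   # k nearest minority neighbours
    synthetic = np.empty((n_synthetic, X_min.shape[1]))
    for i in range(n_synthetic):
        j = rng.integers(len(X_min))             # pick a minority example at random
        nb = neighbours[j, rng.integers(k)]      # pick one of its k neighbours
        gap = rng.random()                       # interpolation factor in [0, 1)
        synthetic[i] = X_min[j] + gap * (X_min[nb] - X_min[j])
    return synthetic
```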
Weighted ELM (Li, Kong et al., 2014) and the AdaBoost algorithm have been combined in a unified structure. Weighted ELM assigns a different weight to each training example in a way that alleviates the dominance of the majority class, by giving extra weight to the minority class. Those weights are decided by the user, which accordingly affects performance; how to determine the sample weights automatically remains an open issue. A multiclass ELM ensemble that combines ELM and AdaBoost has been proposed (Jiang, Shen et al., 2015) and applied directly to a group of ELMs in a face recognition application. An ELM with a fuzzy activation function (Wang and Li, 2010) has also been used as the base learner in a robust AdaBoost ensemble of ELMs.
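For concreteness, the following sketches one common weighted, regularized ELM output-weight solution under a 1/class-count weighting scheme; the regularization constant C, the weighting rule, and all variable names are assumptions for illustration, not necessarily the exact formulation of the cited papers. Each example's residual is scaled by the inverse frequency of its class, so minority examples carry more weight in the least-squares fit.

```python
# Sketch of a weighted, regularized ELM output-weight solve; illustrative only.
import numpy as np

def weighted_elm_beta(H, T, y, C=1.0):
    """H: (n, L) hidden layer outputs, T: (n, m) one-hot targets,
    y: (n,) integer class labels. Returns (L, m) output weights."""
    counts = np.bincount(y)
    w = 1.0 / counts[y]                # minority examples get larger weights
    HtW = H.T * w                      # equivalent to H^T @ diag(w)
    L = H.shape[1]
    # beta = (I/C + H^T W H)^{-1} H^T W T
    beta = np.linalg.solve(np.eye(L) / C + HtW @ H, HtW @ T)
    return beta
```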