Imbalanced Classification for Business Analytics

Talayeh Razzaghi (University of Central Florida, USA), Andrea Otero (University of Central Florida, USA) and Petros Xanthopoulos (University of Central Florida, USA)
Copyright: © 2014 | Pages: 10
DOI: 10.4018/978-1-4666-5202-6.ch105

Chapter Preview



Advances in science and technology make raw data more accessible and create new opportunities for knowledge discovery. Imbalanced problems arise in a wide variety of applications, including security surveillance (Wu, Wu, Jiao, Wang, & Chang, 2003), medical diagnosis (Mena & Gonzalez, 2009; You, Zhao, Li, & Hu, 2011), bioinformatics (Al-Shahib, Breitling, & Gilbert, 2005), geomatics (Kubat, Holte, & Matwin, 1998), telecommunications (Tang, Krasser, Judge, & Zhang, 2006), risk management (Ezawa, Singh, & Norton, 1996), manufacturing (Adam et al., 2011), quality estimation (Lee, Song, Song, & Yoon, 2005), and power management (Hu, Zhu, & Ren, 2008). Imbalanced classification has been examined in a number of studies (N. V. Chawla, 2010; Guo, Yin, Dong, Yang, & Zhou, 2008; He & Garcia, 2009; Su, Mao, Zeng, Li, & Wang, 2009; Sun et al., 2009). Previous work on the classification of imbalanced data (N. V. Chawla, 2010; Kubat et al., 1998; Ngai, Hu, Wong, Chen, & Sun, 2011; Su et al., 2009; Sun et al., 2009) reports that many standard classification algorithms perform poorly on such data. Therefore, despite the extensive existing literature, there is room for improvement and future contributions.
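One way to see why standard classifiers can appear adequate yet perform poorly on imbalanced data is to score a trivial classifier that always predicts the majority class. The Python sketch below is only an illustration: the 990/10 class split and the function names are assumptions made for this example, not data or code from the chapter.

```python
# Hypothetical data: 990 majority-class (0) and 10 minority-class (1) examples.
labels = [0] * 990 + [1] * 10

def majority_baseline(y):
    """Return predictions that always pick the most frequent label in y."""
    most_common = max(set(y), key=y.count)
    return [most_common] * len(y)

def accuracy(y_true, y_pred):
    """Fraction of examples labeled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted positive."""
    actual = sum(t == positive for t in y_true)
    found = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    return found / actual if actual else 0.0

predictions = majority_baseline(labels)
print(accuracy(labels, predictions))  # 0.99
print(recall(labels, predictions))    # 0.0
```

The baseline reaches 99% accuracy while detecting none of the minority-class examples, which is why accuracy alone can be a misleading measure on skewed data.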

Key Terms in this Chapter

Weighted Support Vector Machine (WSVM): A variant of SVM that assigns different weights to the examples in the dataset.

Contrast Pattern Mining: A pattern recognition method, NP-hard in general (Wang, Zhao, Dong, & Li, 2005), that mines contrast patterns to separate fraudulent from genuine behavior.

Supervised Learning: A machine learning technique that predicts the value of a given function for any input, based on labeled training data.

Semi-Supervised Learning: A machine learning technique that uses both labeled and unlabeled data for training.

Support Vector Machine (SVM): A supervised machine learning method, based on convex optimization, that analyzes data and recognizes patterns for classification and regression analysis.

Unsupervised Learning: A machine learning technique that detects unknown patterns in unlabeled data.

Imbalanced Data: Data in which the classes are unequally represented, with varying degrees of skew between classes.
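The terms "Imbalanced Data" and "Weighted Support Vector Machine" can be tied together with a small Python sketch. It measures the skew between classes and derives per-example weights of the kind a weighted SVM could use to penalize minority-class errors more heavily. The inverse-frequency weighting rule, the 90/10 label split, and the function names are assumptions chosen for illustration; the chapter preview does not specify how weights are assigned.

```python
from collections import Counter

def imbalance_ratio(y):
    """Ratio of the largest class count to the smallest class count."""
    counts = Counter(y)
    return max(counts.values()) / min(counts.values())

def inverse_frequency_weights(y):
    """Per-class weight n / (k * n_c): rarer classes get larger weights."""
    counts = Counter(y)
    n, k = len(y), len(counts)
    return {label: n / (k * count) for label, count in counts.items()}

# Hypothetical labels: 90 genuine (0) and 10 fraudulent (1) transactions.
labels = [0] * 90 + [1] * 10
class_weights = inverse_frequency_weights(labels)

# One weight per training example, looked up from its class.
example_weights = [class_weights[label] for label in labels]

print(imbalance_ratio(labels))  # 9.0
print(class_weights[1])         # 5.0
```

Under this assumed rule, each minority-class example receives nine times the weight of a majority-class example, so both classes contribute equally in total to a weighted training objective.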
