Weighted SVMBoost based Hybrid Rule Extraction Methods for Software Defect Prediction

Jhansi Lakshmi Potharlanka, Maruthi Padmaja Turumella
Copyright: © 2019 |Pages: 10
DOI: 10.4018/IJRSDA.2019040104

Abstract

Appropriate automatic defect prediction models mitigate software testing effort and cost. Many automatic software defect prediction (SDP) models have been developed using machine learning methods; however, it is difficult for end users to comprehend the knowledge extracted from these models. Further, SDP data is unbalanced in nature, which hampers model performance. To address these problems, this paper presents hybrid weighted SVMBoost (WSVMBoost) based rule extraction models, namely WSVMBoost with Decision Tree, WSVMBoost with RIPPER, and WSVMBoost with Bayesian Network, for SDP problems. The extraction of rules from the opaque SVMBoost is carried out in two phases: (i) knowledge extraction and (ii) rule extraction. Experimental results on four NASA MDP datasets show that the WSVMBoost and Decision Tree hybrid yielded better performance than the other hybrids and WSVM.
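The two-phase scheme the abstract describes, extracting rules from an opaque model by fitting a transparent learner to its predictions, can be sketched as follows. This is a minimal illustration, not the paper's implementation: it stands in a weighted SVM (via scikit-learn's `class_weight`) for the WSVMBoost ensemble, uses a synthetic imbalanced dataset in place of the NASA MDP data, and all hyperparameters are illustrative.

```python
# Hedged sketch of pedagogical rule extraction: an opaque weighted SVM
# (class_weight compensates for the unbalanced data) is approximated by a
# transparent decision tree trained on the SVM's own predictions.
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, export_text

# Imbalanced synthetic stand-in for an SDP dataset (defective = minority).
X, y = make_classification(n_samples=600, n_features=8,
                           weights=[0.85, 0.15], random_state=0)

# Phase (i): knowledge extraction -- fit the opaque weighted SVM and
# relabel the training data with its predictions.
svm = SVC(kernel="rbf", class_weight="balanced").fit(X, y)
y_opaque = svm.predict(X)

# Phase (ii): rule extraction -- fit a transparent learner to mimic the
# opaque model, then read its paths off as if-then rules.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y_opaque)

# Fidelity: how closely the extracted rules reproduce the opaque model.
fidelity = (tree.predict(X) == y_opaque).mean()
print(f"fidelity to the opaque SVM: {fidelity:.2f}")
print(export_text(tree, feature_names=[f"metric_{i}" for i in range(8)]))
```

The same phase (ii) step would apply unchanged with RIPPER or a Bayesian network as the transparent learner; only the surrogate model swaps out.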
Article Preview

Researchers have used different analysis techniques, ranging from statistics to machine learning, to build effective prediction models (Nam, 2014; Kamei & Shihab, 2016). Recently, Li et al. categorized recent SDP efforts into machine learning-based prediction algorithms, methods for manipulating the data, and mechanisms for effort-aware prediction (Li, Jing, & Zhu, 2018). Nagappan and Ball (2005) applied the PREfast and PREfix static analysis tools to defect prediction and reported 82.91% model accuracy. Among the studies that adopted machine learning techniques, Naive Bayes (Menzies, Greenwald, & Frank, 2007) reported 71% accuracy, and a Bayesian network of metrics and defect proneness (Okutan et al., 2014) reported 72.5% average accuracy. Support Vector Machines as base learners achieved 80% accuracy (Gray et al., 2009). A combined model of Support Vector Machines (SVM) and Probabilistic Neural Networks (PNN) reported 87.62% accuracy (Al-Jamimi & Ghouti, 2011). An ensemble of Neural Network (NN), Decision Tree (DT), PART, Logistic Regression (LR), and AdaBoost achieved 75.6% average accuracy (Arisholm, Briand, & Johannessen, 2010).
