Predicting Bug Priority Using Topic Modelling in Imbalanced Learning Environments

Jayalath Bandara Ekanayake
Copyright: © 2021 |Pages: 12
DOI: 10.4018/IJSSOE.2021010103

Abstract

Manual classification of bug reports is time-consuming because reports arrive in large quantities. As an alternative, this project proposes automatic prediction models to classify bug reports. Topics, or candidate keywords, are mined from the developer descriptions in bug reports using the RAKE algorithm and converted into attributes. These attributes, together with the target attribute (priority level), make up the training datasets. Naïve Bayes, logistic regression, and decision tree learners are trained, and prediction quality is measured using the area under the receiver operating characteristic curve (AUC), because AUC is not sensitive to class imbalance in the datasets. The logistic regression model outperforms the other two, achieving an AUC of 0.86, whereas naïve Bayes and the decision tree learner recorded 0.79 and 0.81, respectively. The results suggest that bugs can be classified without developer involvement and that logistic regression, like naïve Bayes, is a promising candidate for bug classification.
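
The keyword-to-attribute step described above can be illustrated with a minimal RAKE-style sketch. This is not the authors' implementation: the stopword list, the function names (`candidate_phrases`, `rake_scores`, `to_attributes`), the binary attribute encoding, and the two sample reports are all assumptions made for illustration. It follows the general RAKE idea of splitting text at stopwords and punctuation, then scoring each phrase by the sum of its words' degree-to-frequency ratios.

```python
import re
from collections import defaultdict

# Tiny illustrative stopword list (an assumption; real RAKE lists are far larger).
STOPWORDS = {"a", "an", "the", "is", "are", "when", "on", "in", "of", "to",
             "and", "with", "after", "for", "it", "not"}

def candidate_phrases(text):
    """Split text into candidate keyword phrases at stopwords and punctuation."""
    phrases = []
    for fragment in re.split(r"[^\w\s]+", text.lower()):
        current = []
        for word in fragment.split():
            if word in STOPWORDS:
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases

def rake_scores(text):
    """Score each phrase as the sum of degree(word) / frequency(word)."""
    phrases = candidate_phrases(text)
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrence count, including the word itself
    return {" ".join(p): sum(degree[w] / freq[w] for w in p) for p in phrases}

def to_attributes(reports, vocabulary):
    """One binary vector per report: 1 if the keyword was extracted from that report."""
    return [[1 if kw in rake_scores(r) else 0 for kw in vocabulary] for r in reports]

# Hypothetical bug report descriptions (made-up data).
reports = ["Application crashes when saving large file",
           "Typo in the settings dialog title"]
vocabulary = sorted({kw for r in reports for kw in rake_scores(r)})
attributes = to_attributes(reports, vocabulary)
```

Each row of `attributes` would then be paired with the report's priority level to form one training instance. A binary encoding is only one option; keyword scores or counts could be used as attribute values instead.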

Background

Recent years have seen a rise in mining software repositories, such as source code, bug, and email archives, using modern machine learning algorithms. The learning algorithms extract hidden relationships from these repositories. Some of the extracted patterns are non-trivial, previously unknown, and potentially useful in the future. Such information is considered new knowledge.

Several models have been developed for automatic bug report prioritization (Kanwal, 2012), (Alenezi, 2013), (Xia, 2014), (Tian, 2015), (Kumari, 2018), (Waqar, 2020), (Cheng, 2020), (Sharma, 2020), (Li, 2020). These models were trained on attributes constructed from the contents of bug reports and used for classification tasks.

Kanwal and Maqbool (2012) proposed an approach to prioritize bug reports using Naïve Bayes and Support Vector Machine (SVM) classifiers. They use both categorical and textual data for classification and measure the prediction quality of models using precision and recall.
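
For readers unfamiliar with these two measures, the following minimal sketch shows how precision and recall can be computed for a single priority level by treating it as the positive class. The function name `precision_recall` and the priority labels are hypothetical and the data is made up; none of it is taken from Kanwal and Maqbool's study.

```python
def precision_recall(actual, predicted, positive):
    """Precision and recall for one class, treating it as the positive label."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0  # share of predicted positives that are right
    recall = tp / (tp + fn) if tp + fn else 0.0     # share of actual positives that were found
    return precision, recall

# Hypothetical priority labels for six bug reports (made-up data).
actual    = ["P1", "P2", "P2", "P3", "P1", "P2"]
predicted = ["P1", "P2", "P3", "P3", "P2", "P2"]

p1_precision, p1_recall = precision_recall(actual, predicted, "P1")
p3_precision, p3_recall = precision_recall(actual, predicted, "P3")
```

Here every report predicted as P1 really is P1, but only one of the two actual P1 reports is found, so precision and recall for that class differ.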
