A Text Mining Framework for Analyzing Change Impact and Maintenance Effort of Software Bug Reports

Ruchika Malhotra, Megha Khanna
Copyright: © 2022 |Pages: 18
DOI: 10.4018/IJIRR.295974


Software practitioners often strive to achieve “bug-free” software, though this is a myth. Software Bug Categorization (SBC) models, which assign levels (viz. “low”, “moderate” or “high”) to a software bug, aid effective bug management. They assist in the allocation of proper maintenance resources for bug elimination to improve software quality. This study proposes the development of SBC models that allocate levels on the basis of three software bug aspects, i.e., the maintenance effort required to correct a bug, its change impact, and the combined effect of both. To develop SBC models, we use a text mining approach, which extracts relevant features from bug descriptions and relates these features to different software bug levels. The results of the study indicate that the categorization of software bugs in accordance with maintenance effort and change impact is possible. Furthermore, the combined-approach SBC models were also found to be effective.


Software bugs are inevitable. They are generally introduced during the development or maintenance phase and lead to poor-quality software products and unsatisfied customers. Therefore, software maintenance activities, which remove these bugs and ensure the smooth functioning of software, are mandatory (Elmishali et al. 2018). However, maintenance resources are always constrained. The software community needs to manage these resources appropriately in order to deliver timely upgrades of the software. One possible method for managing maintenance resources while removing software bugs is Software Bug Categorization (SBC), which is explored in this study.

SBC involves cataloguing software bugs into different levels (categories) on the basis of various bug attributes such as their criticality, priority, etc. (Menzies & Marcus 2008; Tian et al. 2015). SBC models use textual features and other information available in bug reports to catalogue them. With the aid of SBC, a software developer can make informed decisions while handling a bug and planning its correction. This study categorizes software bugs into three levels, namely “low”, “moderate” and “high”, on the basis of two important bug attributes, viz. Maintenance Effort (ME) and/or Change Impact (CI). The ME required to correct a software bug is estimated by the lines of code that must be added or deleted during bug correction. The CI of a bug is computed as the number of classes that are modified while removing the bug.
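The two measures defined above can be illustrated with a minimal sketch. It assumes ME is taken as the sum of added and deleted lines, and it uses hypothetical cut-off values to map a measure to a level; the preview does not state how the levels are delimited, so the thresholds, the `BugFix` structure and the class names below are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass
class BugFix:
    """Summary of the change that corrected one bug (hypothetical structure)."""
    lines_added: int
    lines_deleted: int
    classes_modified: set


def maintenance_effort(fix):
    # ME: lines of code added or deleted during bug correction
    # (summing the two counts is an assumption of this sketch).
    return fix.lines_added + fix.lines_deleted


def change_impact(fix):
    # CI: number of classes modified while removing the bug.
    return len(fix.classes_modified)


def to_level(value, low_max, moderate_max):
    # Maps a raw measure to a level; the cut-offs are illustrative
    # assumptions, not values taken from the study.
    if value <= low_max:
        return "low"
    if value <= moderate_max:
        return "moderate"
    return "high"


# Hypothetical bug fix touching three classes.
fix = BugFix(lines_added=40, lines_deleted=12,
             classes_modified={"Dialer", "CallLog", "Contacts"})
me = maintenance_effort(fix)
ci = change_impact(fix)
me_level = to_level(me, low_max=10, moderate_max=50)
ci_level = to_level(ci, low_max=1, moderate_max=4)
print(me, me_level, ci, ci_level)
```

With these assumed cut-offs the same bug falls into different levels for ME and CI, which is why the two attributes are treated as separate categorization targets.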

It is important to categorize software bugs on the basis of ME and CI, as a bug-fixing regime is highly dependent on them. A software developer who is aware that a specific bug belongs to the “high” level with respect to its ME will allocate more effort to correcting that bug than to bugs belonging to the “low” and “moderate” levels. As a result, constrained maintenance resources can be used optimally. Moreover, maintenance activities can be completed on time and within the designated budget.

A bug with high CI is likely to affect the quality of the existing software product, as the ripple effect of its correction will be experienced by a larger number of classes. This may deteriorate the quality of the existing system, as new bugs may be introduced in other classes. If software practitioners are aware of a bug’s CI, they will take care during regression testing so that new bugs do not escape into the system and are corrected as soon as possible. In particular, a software professional who is aware that a bug belongs to the “high” level with respect to its CI will be more careful with regression testing after bug correction and will execute a larger number of test cases than for a bug at the “low” or “moderate” level of CI. Likewise, bugs that are allocated the “high” level on the basis of the combined effect of ME and CI need both more maintenance resources and stringent regression testing for their effective resolution. Such bugs, if known, can be allocated the required resources. Thus, planning maintenance activities based on the discussed SBC models would result in effective resolution of all categories of software bugs, leading to good-quality, maintainable products and customer satisfaction.

Though studies in the literature have developed SBC models on the basis of various bug attributes such as severity, criticality, etc. (Menzies & Marcus 2008; Tian et al. 2015), the categorization of bugs in accordance with their probable CI has not been explored. This study is also the first to successfully identify the CI level of a bug on the basis of its description. Therefore, the contributions of the study include:

  • Assessment of ME levels and CI levels of software bugs on the basis of the textual description present in their corresponding bug reports.

  • Prediction of a software bug level on the basis of the combined effect of CI and ME to ease allocation of both maintenance and testing resources.

  • Meticulous evaluation of results on five application packages of Android, a popular open-source software system, with the results statistically validated using the Wilcoxon signed rank test.
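The idea of relating words in a bug description to a level can be sketched as follows. The preview does not name the feature extraction or learning technique used in the study, so this sketch substitutes a plain bag-of-words multinomial Naive Bayes classifier with add-one smoothing; the training reports and their labels are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lower-case the description and keep alphabetic words only.
    return re.findall(r"[a-z]+", text.lower())


def train(reports):
    """reports: list of (description, level) pairs."""
    word_counts = defaultdict(Counter)  # level -> word frequencies
    level_counts = Counter()            # level -> number of reports
    vocab = set()
    for description, level in reports:
        tokens = tokenize(description)
        word_counts[level].update(tokens)
        level_counts[level] += 1
        vocab.update(tokens)
    return word_counts, level_counts, vocab


def predict(model, description):
    word_counts, level_counts, vocab = model
    total_reports = sum(level_counts.values())
    best_level, best_score = None, -math.inf
    for level in level_counts:
        # Log prior of the level plus smoothed log likelihood of each known word.
        score = math.log(level_counts[level] / total_reports)
        total_words = sum(word_counts[level].values())
        for token in tokenize(description):
            if token in vocab:
                score += math.log((word_counts[level][token] + 1)
                                  / (total_words + len(vocab)))
        if score > best_score:
            best_level, best_score = level, score
    return best_level


# Invented training data; real labels would come from measured ME or CI.
reports = [
    ("crash when rotating screen during call", "high"),
    ("application crash on startup after update", "high"),
    ("typo in settings label", "low"),
    ("wrong icon color in settings menu", "low"),
]
model = train(reports)
print(predict(model, "crash on startup"))
print(predict(model, "typo in menu label"))
```

The same pipeline could be trained three times, once each with ME levels, CI levels and combined levels as labels, mirroring the three kinds of SBC model described above.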
