Mining Software Repositories for Revision Age-Based Co-Change Probability Prediction

Anushree Agrawal (Indira Gandhi Delhi Technical University For Women, India) and R. K. Singh (Indira Gandhi Delhi Technical University For Women, India)
Copyright: © 2020 | Pages: 17
DOI: 10.4018/IJOSSP.2020040102


Changeability is an important aspect of software maintenance and helps in better planning of development and testing resources. Early detection of change-prone entities saves both time and money and helps teams estimate and meet deadlines reliably. Co-change prediction identifies the entities affected when a change is implemented in a software system. Recent research recommends using revision history to identify co-changed artifacts. However, very few studies investigate the effect of history size and age on prediction results. This manuscript studies the effect of the age of change history on co-change prediction in software applications by varying the weight given to change commits over time. ROC analysis is used to evaluate the accuracy of the proposed approach, and the results indicate that older change commits have lower significance in deriving the changeability pattern. The derived change impact set will be useful to software practitioners in change implementation and selective regression testing.
Article Preview


Source code dependencies grow tremendously in evolving software systems. It therefore becomes quite challenging for developers to identify the aftereffects of making changes to the system (Gupta, Chauhan and Dutta, 2015). Change impact analysis is a technique (Bohner and Arnold, 1996; Kung et al., 1994) for identifying the software entities affected by a given change. Several techniques based on association rule mining have been proposed in the literature (Islam et al., 2018; Lehnert, 2011; Vidacs and Pinzger, 2018); they identify dependencies in software systems by exploiting the developers' intrinsic knowledge recorded in commit comments, bug reports, etc. The commit comments and bug reports contain information about entities that co-changed in the past, which is used to derive dependencies. A few other techniques exploit the relationship between object-oriented metrics and change proneness between classes to derive the dependencies (Agrawal and Singh, 2020a, 2020b). These techniques are centred on how the software system has changed with time and thus can be beneficial in identifying dependencies.

In this paper, we consider the historical co-change between classes to identify their future changeability pattern. The techniques proposed in the literature are based on association rules learned from the commit history (Machado and Choren, 2018; Oliva and Gerosa, 2015). The two main factors that affect these mined rules are the history length, i.e. the number of transactions in the commit history, and the revision age, i.e. how old the revision used to make predictions is. Only a few studies in the literature examine the influence of history length and revision age on change impact analysis (Moonen et al., 2016; 2018). These studies show that both factors affect the prediction results. To the best of our knowledge, all the studies on change impact analysis give equal weight to every transaction in the commit history. It has been argued, however, that not all transactions affect the prediction results in the same manner (Moonen et al., 2018). In this work, we vary the weights assigned to change commits according to revision age to study whether the predictive ability of change commits diminishes with time. We measure the co-change probability between two classes and generate the changeability pattern by adding every class whose co-change probability exceeds the cut-off value to the change set. The main objectives of this paper are (1) to study the evolutionary coupling between classes for identifying future changeability patterns, and (2) to study the effect of revision age on co-change prediction and measure the changeability of classes in terms of the Likelihood of Change to determine the set of change-prone classes. We have investigated the following research questions in this paper:

  • RQ1: How useful is the information from software change history for deriving the future changeability pattern among classes?

  • RQ2: Is there any influence of commit age on the prediction accuracy of co-changed classes?
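The age-weighting idea behind RQ2 can be illustrated with a minimal sketch. This preview does not give the paper's actual weighting scheme or formula, so everything here is an assumption made for illustration: the exponential decay with a `half_life` parameter, the function names, and the definition of co-change probability as the weighted share of commits touching a source class that also touch the target class.

```python
import math


def commit_weights(n_commits, half_life):
    """Weight per commit, oldest first; the newest commit gets weight 1.0.

    Assumption: exponential decay. With half_life = math.inf every commit
    gets weight 1.0, which reproduces the equal-weight baseline.
    """
    return [0.5 ** ((n_commits - 1 - i) / half_life) for i in range(n_commits)]


def co_change_probability(commits, source, target, half_life=math.inf):
    """Weighted share of the commits touching `source` that also touch `target`.

    `commits` is a list of sets of class names, ordered oldest to newest.
    """
    weights = commit_weights(len(commits), half_life)
    source_total = sum(w for w, c in zip(weights, commits) if source in c)
    both_total = sum(
        w for w, c in zip(weights, commits) if source in c and target in c
    )
    return both_total / source_total if source_total > 0 else 0.0


def change_set(commits, source, cutoff, half_life=math.inf):
    """Classes whose co-change probability with `source` exceeds `cutoff`."""
    candidates = set().union(*commits) - {source}
    return {
        target
        for target in candidates
        if co_change_probability(commits, source, target, half_life) > cutoff
    }


# Hypothetical history: A changed with C twice long ago, and with B most recently.
history = [{"A", "C"}, {"A", "C"}, {"A", "B"}]
print(change_set(history, "A", cutoff=0.5))                 # equal weights
print(change_set(history, "A", cutoff=0.5, half_life=1.0))  # recent commits dominate
```

With equal weights the older pairing of A with C dominates the change set, while a short half-life shifts it towards the recent pairing of A with B. That shift is the kind of effect RQ2 asks about.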

We conducted a study on fourteen open-source software applications to explore co-change behavior among changed entities. We derived the Likelihood of Change (LiCh) for all the changed pairs and conducted Receiver Operating Characteristic (ROC) analysis to assess the accuracy of the results. The results indicate that older commits have less influence on deriving dependent classes and that the proposed approach is useful in identifying co-change behavior in software applications.
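As a rough illustration of how ROC analysis can score such predictions, the sketch below computes the area under the ROC curve from predicted likelihoods and observed outcomes. It is not the authors' evaluation procedure: the function name, the example scores, and the labelling convention (1 if a pair actually co-changed later, 0 otherwise) are all assumptions.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive pair receives a
    higher score than a randomly chosen negative pair; ties count as half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted likelihoods and whether each pair later co-changed.
predicted = [0.9, 0.8, 0.4, 0.3]
observed = [1, 0, 1, 0]
print(roc_auc(predicted, observed))
```

An area of 0.5 corresponds to random ranking and 1.0 to perfect separation, so comparing this value across weighting schemes is one way to judge whether down-weighting older commits helps.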

The rest of the paper is structured as follows. Related work is discussed in Section 2, followed by the research methodology in Section 3. The data collection process is described in Section 4, and the results are presented in Section 5. Section 6 outlines future directions and concludes the study.
