SBCSim: Classification and Prioritization of Similarities Between Versions

Ritu Garg, Rakesh Kumar Singh
Copyright: © 2022 | Pages: 18
DOI: 10.4018/IJSI.309111

Abstract

Change management deals with sets of changes within entities that pose a risk to the stability of those entities. A similarity index helps to analyze and prioritize the changes, but it gives inaccurate results because it does not consider the type of similarity. Existing types of similarity also lack accuracy when comparing versions, because history is sliced when entities are renamed or shifted. The proposed combination of fuzzy and hybrid techniques therefore uses a modified similarity index (MSI) that helps prioritize the changes in entities based on the proposed nominal similarity classifications. It reduces the cost of change management by identifying risky entities. Results show that 15% of files, 19% of classes, and 66% of methods pose risk, with a decrease of 35.29% and 52.53% in unstable/risky entities at the file and entity level, respectively, compared to the Understand tool. Moreover, MSI captures similarities between entities where SI fails in GIT repositories, thereby enhancing the engineering change order process.

Introduction

Change management plays an important role in software evolution and maintenance for Version Controlled Applications (Agrawal & Singh, 2020). To ease the process, it usually involves Diff and Merge operations that reflect changes from or to a remote repository in a distributed environment. These operations present a comparison between sets of entities in a hierarchical relationship, highlighting the changes retrieved from logs along with metadata such as who made the change, when it was made, and what changed, expressed as the number of lines added, deleted, and modified. These changes are accessible at different granularities: File (F), Class (C), and Method (M), including constructors. Changes between versions are accessible by entity name or as line-wise similarity within the entity, using different colors for similar, added, or changed lines.
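The added, deleted, and modified line counts that a Diff operation reports can be illustrated with a minimal Python sketch. It assumes each version of an entity is available as a list of lines and uses the standard library's `difflib.SequenceMatcher`. The function name `line_change_counts` and the rule for splitting a replaced region into modified, added, and deleted lines are illustrative assumptions, not the way Understand or GIT compute their reports.

```python
import difflib


def line_change_counts(old_lines, new_lines):
    """Count added, deleted, and modified lines between two versions.

    Assumption: inside a replaced region, old and new lines are paired
    one-to-one as "modified"; any surplus lines count as added or deleted.
    """
    counts = {"added": 0, "deleted": 0, "modified": 0}
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        old_len, new_len = i2 - i1, j2 - j1
        if tag == "insert":
            counts["added"] += new_len
        elif tag == "delete":
            counts["deleted"] += old_len
        elif tag == "replace":
            paired = min(old_len, new_len)
            counts["modified"] += paired
            counts["added"] += new_len - paired
            counts["deleted"] += old_len - paired
    return counts


# Hypothetical two-version example.
old_version = ["a", "b", "c", "d"]
new_version = ["a", "B", "c", "e", "f"]
sample_counts = line_change_counts(old_version, new_version)
print(sample_counts)
```

Pairing lines inside a replaced region is only one possible convention. A tool could equally report every replaced line as one deletion plus one addition.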

Understand (SciTools, n.d.) and GIT are popular tools that provide line-wise similarity within files during Diff and Merge operations. Understand provides similarity/change information in the form of entity names (FCM), but without identifying renaming, shifting, or similarity classifications between entities. GIT, on the other hand, provides similarity/change information in the form of file names, accounting for renaming and shifting at file granularity, along with the amount of similarity as a Similarity Index (SI) (Git - Git-Diff Documentation, n.d.). SI computes the number of unchanged lines out of the total number of lines between the files of two versions. However, SI suffers from three drawbacks:

  • Lack of similarity information at intermediate granularity (class and method level): SI captures line-wise (finer-granularity) similarity information at the file level (coarse granularity) only. Finergit (Higo, Hayashi, & Kusumoto, 2020) and Historage (Fujiwara et al., 2014; Hata, Mizuno, & Kikuno, 2011) extended it to the method level. However, classes, which play a key role in early fault identification in Model Driven Engineering (Garg & Singh, 2019), have not yet been covered specifically in any study, especially from the perspective of robustness (covering renaming and shifting) in Diff and Merge for software code.

  • The type of change that has occurred is not always accessible in a generalized way: GIT allows a one-line comment describing the change, which is not sufficient to identify the type of change that has occurred. For example, in the case of restructuring, the entities that contain changes may involve any of the following:

    • Layout Changes: Custom settings in GIT allow differences to be viewed while ignoring spaces, blank lines, indentation, etc. within a file. However, if a declarative or executable statement is moved onto a separate line using a newline character, the case is not handled by GIT, Finergit, or Historage, even though it is also a change in layout, as presented in figure 1. Such a change does not alter the functionality of the entity and carries the least risk, so it may be ignored (Göde & Koschke, 2013).

    • Scope Changes: Sometimes the content within an entity remains the same, but its coverage with respect to other entities changes, provided that similar entities are present in both versions, as represented in figure 2. This gives rise to changes in other entities that use this entity. It therefore carries risk that cannot be ignored (Eilertsen & Murphy, 2021).

    • Functional Changes: Any other change, with or without an accompanying layout or scope change, is a functional change within the entity.

  • SI does not reliably indicate risk, since entities with a low Similarity Index (SI) may carry less risk than entities with a higher SI: This is evident from the file NameTooLongException.java, which has 66% SI but contains layout changes only, as shown in figure 3, and the file RTRecordTest.java, which has 70% SI but contains functional changes, as shown in figure 4. This raises the need for similarity classification to prioritize/segregate the entities that have risk associated with them.
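The interplay between SI and change type described in the list above can be sketched in Python. This is a rough illustration under stated assumptions, not the authors' MSI and not GIT's own implementation. `similarity_index` assumes a line-based definition (twice the number of unchanged lines divided by the total lines in both versions), and `classify_change` assumes that a change is layout-only when the two versions are identical once all whitespace is removed. Both function names and the sample snippets are hypothetical. Scope changes are not detected, since that would require information about the surrounding entities.

```python
import difflib
import re


def similarity_index(old_lines, new_lines):
    """Percentage of unchanged lines over the lines of both versions (assumed definition)."""
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    unchanged = sum(block.size for block in matcher.get_matching_blocks())
    total = len(old_lines) + len(new_lines)
    return 100.0 if total == 0 else 100.0 * 2 * unchanged / total


def strip_layout(lines):
    """Remove all whitespace so that line breaks and indentation are ignored."""
    return re.sub(r"\s+", "", "".join(lines))


def classify_change(old_lines, new_lines):
    """Crude heuristic: 'identical', 'layout', or 'functional'."""
    if old_lines == new_lines:
        return "identical"
    if strip_layout(old_lines) == strip_layout(new_lines):
        return "layout"
    return "functional"


# Hypothetical layout-only change: one statement moved onto its own line.
old_layout = ["class Sample {", "    int a; int b;", "}"]
new_layout = ["class Sample {", "    int a;", "    int b;", "}"]

# Hypothetical functional change: the returned expression differs.
old_func = ["int f(int x) {", "    int y = x + 1;", "    return y;", "}"]
new_func = ["int f(int x) {", "    int y = x + 1;", "    return y * 2;", "}"]

for name, old, new in [("layout", old_layout, new_layout),
                       ("functional", old_func, new_func)]:
    print(name, round(similarity_index(old, new), 2), classify_change(old, new))
```

In this sketch the layout-only change receives a lower SI than the functional change, which is the situation that motivates classifying similarities before prioritizing entities. Removing all whitespace is deliberately crude: it can merge adjacent tokens, so a production classifier would more likely compare token streams.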
