Fault Localization With Data Flow Information and an Artificial Neural Network

Jun-Hyuk Jo, Jihyun Lee, Aman Jaffari, Eunmi Kim
Copyright: © 2021 | Pages: 13
DOI: 10.4018/IJSI.2021070105

Abstract

Fault localization is a technique for identifying the exact source code line that contains a fault. It typically costs considerable time and money because, to locate the fault, a developer must trace the execution of the failed program line by line. Many methods have been proposed to reduce this effort. However, the range of code they flag as suspicious is wide, and their localization effectiveness is limited. To cope with this limitation, this paper computes the degree of fault suspiciousness of each statement by using an artificial neural network together with information about the executed test cases: statement coverage, execution result, and definition-use pairs. Compared with an approach that trains the artificial neural network on statement coverage alone, the approach that also includes definition-use pairs showed higher accuracy for 15 of the 29 real fault types in the experiments.

Introduction

In IT and R&D projects, the budget allocated to testing is generally between 11% and 40% of the total (Lee et al., 2012). According to the World Quality Report, average spending on quality assurance and testing accounts for 26% of the total IT budget (International Software Testing Qualifications Board, 2016; World Quality Report, 2019). Therefore, to reduce these costs, further research is required to improve test efficiency by detecting more faults with fewer test cases.

Software errors arise from human mistakes made while writing source code. An error is a mistake that results in wrong or unexpected program behavior and thereby introduces a fault into the software. Errors that are not eliminated can cause software failure. A failure is a phenomenon that occurs when a fault in the code is executed (Fagan, 1986). When a failure occurs, the system may not behave as expected. Not all faults lead to system failures, but they remain potential problems and can even cause the software to hang while it is running.

Fault localization is a technique for notifying developers or test engineers about source code that is expected to be defective in a program. It typically costs considerable time and money because, to locate the fault, developers must trace the execution of the failed program line by line. Hence, many fault localization techniques have been proposed to reduce this cost and effort. Wong et al. (2016) analyzed software fault localization techniques by grouping them into slicing-based, spectra-based, program state-based, machine learning-based, and model-based approaches. The spectra-based approach describes program execution information from a specific perspective, such as conditional branching. It uses information about failed executions to identify suspicious code by indicating which parts of the program under test were exercised during execution (Abreu et al., 2007). The machine learning-based approach accumulates test coverage information and execution results and learns the relationship between them through a machine learning model, which then indicates the probability that a fault exists in a statement (Wong & Qi, 2009).
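As a rough illustration of how a spectra-based technique turns coverage and pass/fail results into a statement ranking, the sketch below computes the Ochiai coefficient, one commonly used spectrum-based metric. The coverage matrix, the `failed` vector, and the function name `ochiai` are illustrative assumptions and are not taken from this paper.

```python
import math


def ochiai(coverage, failed):
    """Return one Ochiai suspiciousness score per statement.

    coverage[t][s] is 1 if test t executed statement s, else 0.
    failed[t] is True if test t failed.
    """
    total_failed = sum(1 for f in failed if f)
    scores = []
    for s in range(len(coverage[0])):
        # Count failing and passing tests that executed statement s.
        ef = sum(1 for t, row in enumerate(coverage) if row[s] and failed[t])
        ep = sum(1 for t, row in enumerate(coverage) if row[s] and not failed[t])
        denom = math.sqrt(total_failed * (ef + ep))
        scores.append(ef / denom if denom else 0.0)
    return scores


# Hypothetical spectrum: 4 tests (rows) over 4 statements (columns).
coverage = [
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
]
failed = [True, False, True, False]

scores = ochiai(coverage, failed)
# Statement indices ordered from most to least suspicious.
ranking = sorted(range(len(scores)), key=lambda s: scores[s], reverse=True)
```

In this toy spectrum, the statement at index 1 is executed by both failing tests and by no passing test, so it ranks first.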

Among the various fault localization techniques, program slicing is the fundamental one. The slicing-based fault localization technique (Agrawal et al., 1995) limits the search area to the particular statements associated with the failed test case. As a result, its contribution to fault localization is limited, because faults can also exist in locations other than those where the test cases fail. Even if the fault lies within the localized slice, the slices derived from the failed test cases are still considerably large.

To cope with this problem, this paper uses an artificial neural network and data flow information. Depending on the input data, a neural network model can perform classification or clustering by identifying the characteristics or patterns of the given data. Exploiting this property, we calculate the fault suspiciousness of each statement by using the features of the statement as input data. The features used here are the test case coverage of each statement, the execution result, and the corresponding definition-use (du) pairs, which represent the data flow. The collected information is preprocessed so that it can serve as input data for the neural network model. Next, the weights of the artificial neural network are adjusted by feeding it the coverage information and du-pairs, and the trained model finally calculates the fault suspiciousness of each statement. Because the classification results indicate the degree of fault suspiciousness for each statement, the suspected range of code can be narrowed down to a single statement rather than an arbitrary slice.
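The following is a minimal sketch of the general idea, not the authors' implementation. It assumes each test case is encoded as statement-coverage bits followed by du-pair bits, with the execution result as the label, and it queries the trained network with one "virtual" test per statement, a convention borrowed from earlier neural-network-based localization work. The toy data, the network size, and the names `train`, `predict`, and `virtual_test` are all hypothetical.

```python
import math
import random


def sigmoid(x):
    # Clamp the input so math.exp cannot overflow.
    return 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, x))))


def predict(model, x):
    w1, b1, w2, b2 = model
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    return sigmoid(sum(w * hj for w, hj in zip(w2, h)) + b2)


def train(inputs, labels, hidden=4, epochs=500, lr=0.5, seed=0):
    """Train a one-hidden-layer network with plain stochastic gradient descent."""
    rng = random.Random(seed)
    n_in = len(inputs[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, y in zip(inputs, labels):
            h = [sigmoid(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j]) for j in range(hidden)]
            out = sigmoid(sum(w2[j] * h[j] for j in range(hidden)) + b2)
            d_out = out - y  # cross-entropy gradient at the output pre-activation
            for j in range(hidden):
                d_h = d_out * w2[j] * h[j] * (1.0 - h[j])
                w2[j] -= lr * d_out * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * d_h * x[i]
                b1[j] -= lr * d_h
            b2 -= lr * d_out
    return w1, b1, w2, b2


def virtual_test(stmt, n_stmts, du_pairs):
    # Assumed encoding: cover only `stmt`, plus every du-pair that involves it.
    cov = [1 if s == stmt else 0 for s in range(n_stmts)]
    du = [1 if stmt in pair else 0 for pair in du_pairs]
    return cov + du


# Hypothetical program with 4 statements and 2 du-pairs (def stmt, use stmt).
n_stmts = 4
du_pairs = [(0, 1), (1, 3)]
# Each row: 4 coverage bits followed by 2 du-pair bits.
inputs = [
    [1, 1, 0, 1, 1, 1],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 1, 1],
    [1, 0, 1, 1, 0, 0],
]
labels = [1, 0, 0, 1, 0]  # 1 = test failed, 0 = test passed

model = train(inputs, labels)
suspiciousness = [predict(model, virtual_test(s, n_stmts, du_pairs)) for s in range(n_stmts)]
```

Each entry of `suspiciousness` is the network output for one statement and can be sorted to obtain a ranking. Appending the du-pair bits to the coverage bits is one simple way to combine both kinds of information in a single input vector; other encodings are possible.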

This paper is organized as follows. Section 2 describes existing research on fault localization techniques, including work that uses artificial neural networks for fault localization. Section 3 presents the fault localization method that uses data flow information. Section 4 applies the proposed technique to three programs of different sizes and complexities, then summarizes and analyzes the results. Section 5 presents the discussion, and Section 6 gives conclusions and future work.
