Inforence: Effective Fault Localization Based on Information-Theoretic Analysis and Statistical Causal Inference


📝 Original Info

  • Title: Inforence: Effective Fault Localization Based on Information-Theoretic Analysis and Statistical Causal Inference
  • ArXiv ID: 1712.03361
  • Date: 2017-12-12
  • Authors: Farid Feyzi, Saeed Parsa

📝 Abstract

In this paper, a novel approach, Inforence, is proposed to isolate suspicious code that is likely to contain faults. Inforence employs a feature selection method, based on mutual information, to identify the bug-related statements that may cause the program to fail. Because the majority of program faults may be revealed as an undesired joint effect of program statements on each other and on the program termination state, Inforence, unlike state-of-the-art methods, tries to identify and select groups of interdependent statements that together may affect program failure. The interdependence among the statements is measured according to their mutual effect on each other and on the program termination state. To provide the context of failure, the selected bug-related statements are chained to each other according to the program's static structure. The resulting cause-effect chains are then ranked according to their combined causal effect on program failure. To validate Inforence, we present the results of experiments on seven sets of programs: the Siemens suite, gzip, grep, sed, space, make, and bash. The results are compared with those of other fault localization techniques for both single-fault and multi-fault programs, and they show that the proposed method outperforms the state-of-the-art techniques.

📄 Full Content

Received Oct 25, 2016; accepted Jun 23, 2017
E-mail: Parsa@iust.ac.ir
Front. Comput. Sci.
DOI: 10.1007/s11704-017-6512-z
RESEARCH ARTICLE

Inforence: Effective Fault Localization Based on Information-Theoretic Analysis and Statistical Causal Inference

Farid FEYZI¹, Saeed PARSA¹ (*)
¹ Department of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
© Higher Education Press and Springer-Verlag Berlin Heidelberg 2017

Keywords: Fault Localization, Debugging, Backward Dynamic Slice, Mutual Information, Feature Selection

1 Introduction

To eliminate a bug, programmers must identify its location and figure out its cause. This process is referred to as software fault localization, and it is one of the most expensive activities in debugging. Because manual fault localization is intricate and inaccurate, a great deal of research has been carried out to develop automated techniques and tools that assist developers in finding bugs [1-11]. Most of these techniques, known as Spectrum-based Fault Localization (SBFL), use dynamic information from test executions.

The majority of SBFL techniques do not perform well on bugs caused by undesired interactions between statements, because they consider statements only in isolation. In other words, for each individual statement, they contrast its presence in failing and passing runs and assign a fault suspiciousness value according to that contrast measure. However, as shown in Section 2, there are situations in which a specific combination of statements causes undesired program results. Hence, modeling the combinatorial effect of statements on each other, in failing and passing executions, may considerably improve the fault localization process.

In this regard, we suggest locating failure-causing statements by considering their combinatorial effect on program failure. The idea is inspired by the observation that most program failures are revealed only when a specific combination of correlated statements is executed. In this article, we present a novel approach, Inforence, for fault localization using an information-theoretic feature selection algorithm.
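
The contrast between scoring statements in isolation and scoring them jointly can be illustrated with a minimal sketch. It is not the paper's algorithm: the coverage data is hypothetical, Ochiai is used only as one representative SBFL contrast measure, and the function names `ochiai` and `mutual_information` are illustrative. The failure here occurs only when both statements execute in the same run.

```python
import math
from collections import Counter

def ochiai(covered, failed):
    """Ochiai suspiciousness of one statement, scored in isolation."""
    ef = sum(1 for c, f in zip(covered, failed) if c and f)      # covered, failing
    ep = sum(1 for c, f in zip(covered, failed) if c and not f)  # covered, passing
    denom = math.sqrt(sum(failed) * (ef + ep))
    return ef / denom if denom else 0.0

def mutual_information(xs, ys):
    """Empirical I(X;Y) in bits for two equally long discrete sequences."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * math.log2(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

# Hypothetical coverage of two statements over 8 test runs (1 = executed).
s1 = [0, 0, 0, 0, 1, 1, 1, 1]
s2 = [0, 0, 1, 1, 0, 0, 1, 1]
# Assumed failure model: a run fails only when both statements execute.
failed = [a & b for a, b in zip(s1, s2)]

mi_s1 = mutual_information(s1, failed)
mi_s2 = mutual_information(s2, failed)
# Treat the pair (s1, s2) as a single joint feature.
mi_pair = mutual_information(list(zip(s1, s2)), failed)

print(f"Ochiai   s1={ochiai(s1, failed):.3f}  s2={ochiai(s2, failed):.3f}")
print(f"MI bits  s1={mi_s1:.3f}  s2={mi_s2:.3f}  pair={mi_pair:.3f}")
```

On this toy data each statement alone carries about 0.31 bits about the outcome, while the pair carries about 0.81 bits, which is more than the two individual values combined. That surplus is the kind of interdependence a per-statement contrast measure cannot see.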
Inforence employs a dynamic-weighting-based feature selection algorithm, inspired by [12-14], which not only selects the most relevant program statements and eliminates redundant ones but also tries to recognize groups of interdependent statements that together may affect program failure. To this end, relevance, interdependence, and redundancy analyses are performed using information-theoretic criteria. Instead of directly using the scores computed by a feature selection method to localize faults, Inforence employs a method based on statistical causal inference to estimate the failure-causing effect of the selected program statements. As a result, unlike existing machine-learning-based fault localization methods, Inforence addresses the confounding bias problem [15], whose negative impact on fault localization performance has been shown in recent work [15-18]. More importantly, by combining feature selection with statistical causal inference, we have alleviated two significant limitations of existing causal-inference-based methods: their scalability issues, which stem from considerable computational and profile storage overheads, and their poor performance on programs containing bugs with combined causes. Inforence also takes advantage of the strength of program slicing [19] in res

…(Full text truncated)…
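
The confounding bias mentioned in the introduction can be illustrated with a small sketch. The estimator the paper actually uses is not visible in the truncated text, so the code below is only an assumption-laden example: it adjusts for a single hypothetical confounder by stratification, and the names `naive_effect`, `adjusted_effect`, `z`, and `s` are illustrative.

```python
from collections import defaultdict

def failure_rate_diff(rows):
    """Failure rate with the statement covered minus the rate without it."""
    t = [f for x, f in rows if x]
    c = [f for x, f in rows if not x]
    if not t or not c:
        return None  # no contrast available in this group
    return sum(t) / len(t) - sum(c) / len(c)

def naive_effect(covered, failed):
    """Unadjusted association between covering a statement and failing."""
    return failure_rate_diff(list(zip(covered, failed)))

def adjusted_effect(covered, confounder, failed):
    """Stratify on the confounder, then average per-stratum differences."""
    strata = defaultdict(list)
    for x, z, f in zip(covered, confounder, failed):
        strata[z].append((x, f))
    n = len(covered)
    effect = 0.0
    for rows in strata.values():
        diff = failure_rate_diff(rows)
        if diff is not None:  # strata without contrast are skipped in this sketch
            effect += (len(rows) / n) * diff
    return effect

# Hypothetical runs: z is the truly faulty statement and s tends to execute
# in the same runs as z. Assumed failure model: a run fails exactly when z runs.
z      = [1, 1, 1, 1, 0, 0, 0, 0]
s      = [1, 1, 1, 0, 1, 0, 0, 0]
failed = list(z)

print("naive effect of s:   ", naive_effect(s, failed))
print("adjusted effect of s:", adjusted_effect(s, z, failed))
```

In this toy data the unadjusted difference for `s` is 0.5, yet it drops to 0.0 once runs are compared within the same value of `z`. The apparent suspiciousness of `s` comes entirely from its correlation with the faulty statement.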


