📝 Original Info
- Title: Inforence: Effective Fault Localization Based on Information-Theoretic Analysis and Statistical Causal Inference
- ArXiv ID: 1712.03361
- Date: 2017-12-12
- Authors: Farid Feyzi, Saeed Parsa
📝 Abstract
This paper proposes Inforence, a novel approach to isolating suspicious code that is likely to contain faults. Inforence employs a feature selection method based on mutual information to identify the bug-related statements that may cause the program to fail. Because the majority of program faults may be revealed as an undesired joint effect of program statements on each other and on the program termination state, Inforence, unlike state-of-the-art methods, tries to identify and select groups of interdependent statements that together may affect program failure. The interdependence among statements is measured according to their mutual effect on each other and on the program termination state. To provide the context of failure, the selected bug-related statements are chained to each other according to the program's static structure. The resulting cause-effect chains are then ranked according to their combined causal effect on program failure. To validate Inforence, we present the results of experiments with seven sets of programs: the Siemens suite, gzip, grep, sed, space, make, and bash. The results are compared with those of other fault localization techniques on both single-fault and multi-fault programs, and they show that the proposed method outperforms the state-of-the-art techniques.
📄 Full Content
Received Oct 25, 2016; accepted Jun 23, 2017
E-mail: Parsa@iust.ac.ir
Front. Comput. Sci.
DOI : 10.1007/s11704-017-6512-z
RESEARCH ARTICLE
Inforence: Effective Fault Localization Based on Information-Theoretic Analysis and Statistical Causal Inference
Farid FEYZI1, Saeed PARSA1 (*)
1 Department of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
© Higher Education Press and Springer-Verlag Berlin Heidelberg 2017
Abstract
This paper proposes Inforence, a novel approach to isolating suspicious code that is likely to contain faults. Inforence employs a feature selection method based on mutual information to identify the bug-related statements that may cause the program to fail. Because the majority of program faults may be revealed as an undesired joint effect of program statements on each other and on the program termination state, Inforence, unlike state-of-the-art methods, tries to identify and select groups of interdependent statements that together may affect program failure. The interdependence among statements is measured according to their mutual effect on each other and on the program termination state. To provide the context of failure, the selected bug-related statements are chained to each other according to the program's static structure. The resulting cause-effect chains are then ranked according to their combined causal effect on program failure. To validate Inforence, we present the results of experiments with seven sets of programs: the Siemens suite, gzip, grep, sed, space, make, and bash. The results are compared with those of other fault localization techniques on both single-fault and multi-fault programs, and they show that the proposed method outperforms the state-of-the-art techniques.
Keywords: Fault Localization, Debugging, Backward Dynamic Slice, Mutual Information, Feature Selection
1 Introduction
To eliminate a bug, programmers must identify its location and figure out its cause. This process, referred to as software fault localization, is one of the most expensive debugging activities. Because manual fault localization is intricate and inaccurate, a great amount of research has been carried out to develop automated techniques and tools that assist developers in finding bugs [1-11]. Most of these techniques, known as spectrum-based fault localization (SBFL), use dynamic information from test executions.
The majority of SBFL techniques do not perform well on bugs caused by undesired interactions between statements, because they consider statements only in isolation. In other words, for each individual statement, they contrast its presence in failing and passing runs and assign a fault suspiciousness value according to the contrast measure. However, as shown in Section 2, there are situations in which a specific combination of statements causes undesired program results. Hence, modeling the combinatorial effect of statements on each other, in failing and passing executions, may considerably improve the fault localization process. We therefore suggest locating failure-causing statements by considering their combinatorial effect on program failure. The idea is inspired by the observation that most program failures are revealed only when a specific combination of correlated statements is executed.
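To make the per-statement contrast concrete, the following minimal sketch scores each statement with the Tarantula formula, one widely used SBFL measure. The text above does not name a specific formula, so the choice of Tarantula, the function name `tarantula`, and the toy coverage data are illustrative assumptions, not part of Inforence.

```python
def tarantula(coverage, failed):
    """Score each statement in isolation (Tarantula suspiciousness).

    coverage: list of sets, the statement ids executed in each run.
    failed:   list of bools, True where the corresponding run failed.
    """
    total_fail = sum(failed)
    total_pass = len(failed) - total_fail
    scores = {}
    for stmt in sorted(set().union(*coverage)):
        # Count failing and passing runs that executed this statement.
        ef = sum(1 for cov, f in zip(coverage, failed) if f and stmt in cov)
        ep = sum(1 for cov, f in zip(coverage, failed) if not f and stmt in cov)
        fail_ratio = ef / total_fail if total_fail else 0.0
        pass_ratio = ep / total_pass if total_pass else 0.0
        denom = fail_ratio + pass_ratio
        scores[stmt] = fail_ratio / denom if denom else 0.0
    return scores


# Hypothetical toy program: a run fails only when BOTH "a" and "b" execute;
# "c" executes in every run.
runs = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"c"}] * 2
outcomes = [True, False, False, False] * 2

scores = tarantula(runs, outcomes)
for stmt, score in scores.items():
    print(stmt, round(score, 3))
```

In this toy data, "a" and "b" each receive a score of about 0.75 and "c" about 0.5. Neither "a" nor "b" stands out on its own, even though executing the two together coincides with failure in every run, which is the kind of situation that per-statement scoring cannot express.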
In this article, we present a novel approach, Inforence, for fault localization using an information-theoretic feature selection algorithm. Inforence employs a dynamic-weighting-based feature selection algorithm, inspired by [12-14], which not only selects the most relevant program statements and eliminates redundant ones but also tries to recognize groups of interdependent statements that together may affect program failure. To this end, relevance, interdependence, and redundancy analyses are performed using information-theoretic criteria. Instead of directly using the scores computed by a feature selection method to localize faults, Inforence employs a method based on statistical causal inference to estimate the failure-causing effect of the selected program statements. As a result, unlike existing machine-learning-based fault localization methods, Inforence addresses the confounding bias problem [15], whose negative impact on fault localization performance has been shown in recent works [15-18]. More importantly, by performing feature selection and statistical causal inference in combination, we mitigate two significant limitations of existing causal-inference-based methods: their scalability issues, due to considerable computational and profile storage overheads, and their poor performance on programs containing bugs with combined causes.
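The relevance and interdependence notions above rest on mutual information between statement execution and the program termination state. The sketch below is not the dynamic weighting algorithm of [12-14]; it only illustrates, on hypothetical toy data, how the mutual information of a pair of statements with the failure outcome can exceed the sum of their individual values. The function name `mutual_information` and the data are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in bits for two discrete samples."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts.
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical execution profiles over 8 runs: 1 means the statement executed.
a = [1, 1, 0, 0, 1, 1, 0, 0]
b = [1, 0, 1, 0, 1, 0, 1, 0]
# A run fails (1) only when both statements executed.
fail = [x & y for x, y in zip(a, b)]

mi_a = mutual_information(a, fail)                  # relevance of a alone
mi_b = mutual_information(b, fail)                  # relevance of b alone
mi_ab = mutual_information(list(zip(a, b)), fail)   # relevance of the pair

print(round(mi_a, 4), round(mi_b, 4), round(mi_ab, 4))
```

Here each statement alone carries roughly 0.31 bits about the outcome, while the pair carries roughly 0.81 bits, which equals the full entropy of the outcome in this data. A selection criterion that evaluates statements only one at a time would miss that gain.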
Inforence also takes advantage of the strength of
program slicing [19] in res
…(Full text truncated)…