ACDC: Altering Control Dependence Chains for Automated Patch Generation
Rawad Abou Assi Chadi Trad Wes Masri
Department of Electrical and Computer Engineering American University of Beirut Beirut, Lebanon {ria21, cht02, wm13}@aub.edu.lb
ABSTRACT
Once a failure is observed, the developer's primary concern is
to identify its cause in order to repair the code that induced
the incorrect behavior. Until a permanent repair is available,
code repair patches are invaluable. The aim of this work is to
devise an automated patch generation technique that proceeds as
follows: Step 1) It identifies a set of failure-causing control
dependence chains that are minimal in terms of number and
length. Step 2) It identifies a set of predicates within the
chains, along with associated execution instances, such that
negating the predicates at the given instances would yield
correct behavior. Step 3) For each candidate predicate, it
creates a classifier that dictates when the predicate should be
negated to yield correct program behavior. Step 4) Prior to each
candidate predicate, the faulty program is injected with a call
to its corresponding classifier, which is passed the program
state and returns a prediction of whether or not to negate the
predicate. The role of the classifiers is to ensure that: 1) the
predicates are not negated during passing runs; and 2) the
predicates are negated at the appropriate instances within
failing runs.
We implemented our patch generation approach for the Java
platform and evaluated our toolset using 148 defects from the
IntroClass and Siemens benchmarks. The toolset identified 56 full
patches and another 46 partial patches, and the classification
accuracy averaged 84%.
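
The effect of the classifier call injected before a candidate predicate can be pictured with a minimal Java sketch. Everything in it is an illustrative assumption rather than the toolset's actual code: the `NegationClassifier` interface, the hand-written rule standing in for a trained classifier, and the toy defect (a `>` where `>=` was intended).

```java
/** Hypothetical sketch of a predicate guarded by a negation classifier. */
final class PredicatePatchSketch {

    /** Assumed classifier interface: maps a program state to a negate/keep decision. */
    interface NegationClassifier {
        boolean shouldNegate(int[] state);
    }

    /** Hand-written stand-in for a trained classifier: negate only when a == b. */
    static final NegationClassifier CLASSIFIER = state -> state[0] == state[1];

    /** Toy faulty predicate: uses > where >= was intended, so it is wrong when a == b. */
    static boolean faultyPredicate(int a, int b) {
        return a > b;
    }

    /** Patched predicate: ask the classifier first, then flip the outcome if told to. */
    static boolean patchedPredicate(int a, int b) {
        boolean negate = CLASSIFIER.shouldNegate(new int[] {a, b});
        boolean outcome = faultyPredicate(a, b);
        return negate ? !outcome : outcome;
    }

    public static void main(String[] args) {
        System.out.println(patchedPredicate(3, 3)); // failing input: outcome is negated, prints true
        System.out.println(patchedPredicate(5, 2)); // passing input: outcome is kept, prints true
        System.out.println(patchedPredicate(1, 2)); // passing input: outcome is kept, prints false
    }
}
```

In this toy case the classifier leaves the predicate untouched on inputs where the faulty program already behaves correctly and negates it only on the failing input, which mirrors the two guarantees the classifiers are meant to provide.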
CCS CONCEPTS
• Software verification and validation • Software defect analysis •
Software testing and debugging
KEYWORDS
Automated patch generation • automated program repair •
coverage based fault localization • dependence chains • causal
inference • supervised learning • predicate switching
1 INTRODUCTION
During the debugging process, the developer replicates the
failure at hand in order to: 1) identify what caused it, and 2)
prevent it from happening again by modifying, adding, or
deleting code. These two activities are respectively termed fault
localization and program repair. In most cases, these activities
cannot be completed in a timely manner, which calls for
temporary reliance on automated patch generation, the subject of
this work.
For over three decades, researchers have proposed a plethora of
automated fault localization techniques and tools, and in recent
years a number of automated program repair and patch
generation techniques have been proposed that leverage varying
approaches such as evolutionary algorithms [19][39], constraint
solving [14][16][11][26][27], and program mutation [10]. The
aim of this work is to devise an effective patch generation
technique that leverages a variant of an accurate coverage-based
fault localization (CBFL) approach that was previously presented
in the literature [1].
CBFL techniques generally entail two main steps. First, they
identify the executing program elements that correlate most
with failure. Second, starting from these elements, which are not
necessarily the causes of the failure, they try to locate the faulty
code following some examination strategy. It often happens that
in the first step the correlation measure of the identified
elements is not high enough to successfully guide the developer
to the fault. This shortcoming is likely due to the fact that the
program elements covered are simple, such as statements and
branches, and therefore cannot characterize most defects, which
are typically complex. This calls for covering program elements
whose complexity matches the complexity of the defect under
consideration, no more and no less. A less complex element cannot
characterize the defect to begin with, whereas an excessively
complex element is likely to ‘subsume’ the defect and
successfully characterize it, but might lead to erroneously
tagging too many statements as suspicious. The ultimate goal
then is to define a program element type that characterizes as
closely as possible the defect at hand. The CBFL technique
presented in [1] attempts to achieve that goal by identifying the
data/control dependence chains that correlate with failure, and
are minimal in number and length.
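
The first CBFL step, ranking covered elements by how strongly they correlate with failure, can be sketched as follows. The text above does not name a correlation measure, so the Ochiai coefficient is used here purely as an assumed stand-in; the `Run` class, the element identifiers, and the example data are likewise hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: scoring covered program elements by correlation with failure. */
final class ChainSuspiciousnessSketch {

    /** One test run: the elements it covered (statements, chains, ...) and its verdict. */
    static final class Run {
        final Set<String> covered;
        final boolean failed;

        Run(Set<String> covered, boolean failed) {
            this.covered = covered;
            this.failed = failed;
        }
    }

    /** Ochiai coefficient (assumed metric): ef / sqrt(totalFailed * (ef + ep)). */
    static double ochiai(int failedCovering, int passedCovering, int totalFailed) {
        double denom = Math.sqrt((double) totalFailed * (failedCovering + passedCovering));
        return denom == 0.0 ? 0.0 : failedCovering / denom;
    }

    /** Returns a suspiciousness score for every element covered by at least one run. */
    static Map<String, Double> score(List<Run> runs) {
        int totalFailed = 0;
        Map<String, int[]> counts = new LinkedHashMap<>(); // element -> {ef, ep}
        for (Run run : runs) {
            if (run.failed) {
                totalFailed++;
            }
            for (String element : run.covered) {
                int[] c = counts.computeIfAbsent(element, k -> new int[2]);
                c[run.failed ? 0 : 1]++;
            }
        }
        Map<String, Double> scores = new LinkedHashMap<>();
        for (Map.Entry<String, int[]> e : counts.entrySet()) {
            scores.put(e.getKey(), ochiai(e.getValue()[0], e.getValue()[1], totalFailed));
        }
        return scores;
    }
}
```

With two failing runs that cover both a statement `s1` and a chain `c1`, plus one passing run that covers only `s1`, the chain scores 1.0 while the statement scores lower. This illustrates how a richer element can characterize a defect that a single statement only weakly correlates with.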
Our patch generation approach uses a variant of the above CBFL
work as a starting point. But it first improves its accuracy by
considering the causal relationships amongst program statements.
The proposed patch generation approach proceeds as follows:
FSE 2017
Step1. It identifies a set of suspicious control chains via the improved CBFL technique. Step2. It identifies a set of predicates within the suspicious chains along with associated ex