ACDC: Altering Control Dependence Chains for Automated Patch Generation

📝 Abstract

Once a failure is observed, the developer's primary concern is to identify what caused it in order to repair the code that induced the incorrect behavior. Until a permanent repair is available, code repair patches are invaluable. The aim of this work is to devise an automated patch generation technique that proceeds as follows: Step 1) It identifies a set of failure-causing control dependence chains that are minimal in terms of number and length. Step 2) It identifies a set of predicates within the chains along with associated execution instances, such that negating the predicates at the given instances would exhibit correct behavior. Step 3) For each candidate predicate, it creates a classifier that dictates when the predicate should be negated to yield correct program behavior. Step 4) Prior to each candidate predicate, the faulty program is injected with a call to its corresponding classifier, passing it the program state and receiving a return value that predicts whether or not to negate the predicate. The role of the classifiers is to ensure that: 1) the predicates are not negated during passing runs; and 2) the predicates are negated at the appropriate instances within failing runs. We implemented our patch generation approach for the Java platform and evaluated our toolset using 148 defects from the IntroClass and Siemens benchmarks. The toolset identified 56 full patches and another 46 partial patches, and the classification accuracy averaged 84%.
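
To make Steps 3 and 4 concrete, here is a minimal sketch of a predicate guarded by a classifier. It is an illustration under stated assumptions, not the authors' implementation: it is written in Python rather than Java, the faulty `grade_faulty` function and its off-by-one predicate are invented, the training labels are made up, and the 1-nearest-neighbour rule is a stand-in chosen for brevity (the abstract does not say which classifier is used).

```python
def make_classifier(training):
    """Build a hypothetical 1-nearest-neighbour classifier.

    training: list of (state_value, should_negate) pairs, standing in for
    the labeled execution instances gathered in Steps 2 and 3.
    """
    def should_negate(value):
        # Pick the training sample whose state is closest to the current one.
        nearest = min(training, key=lambda sample: abs(sample[0] - value))
        return nearest[1]
    return should_negate


# Invented labels: negating the predicate at score == 90 repairs the failing run,
# while every other observed state belongs to a passing run.
training = [(50, False), (89, False), (90, True), (91, False), (100, False)]
should_negate = make_classifier(training)


def grade_faulty(score):
    # Defect: the predicate should be `score >= 90`.
    return "A" if score > 90 else "B"


def grade_patched(score):
    outcome = score > 90          # original (faulty) predicate
    if should_negate(score):      # injected classifier call (Step 4)
        outcome = not outcome     # negate only when the classifier says so
    return "A" if outcome else "B"


print(grade_faulty(90), grade_patched(90), grade_patched(70), grade_patched(95))
```

In this toy setting the patched function differs from the faulty one only at the state the classifier flags, which mirrors the two stated goals: leave passing runs untouched and negate the predicate at the right instances of failing runs.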

📄 Content

Rawad Abou Assi, Chadi Trad, Wes Masri

Department of Electrical and Computer Engineering, American University of Beirut, Beirut, Lebanon
{ria21, cht02, wm13}@aub.edu.lb

CCS CONCEPTS • Software verification and validation • Software defect analysis • Software testing and debugging
KEYWORDS Automated patch generation • automated program repair • coverage based fault localization • dependence chains • causal inference • supervised learning • predicate switching
1 INTRODUCTION
During the debugging process, the developer replicates the failure at hand in order to: 1) identify what caused it, and 2) prevent it from happening again by modifying, adding, or deleting code. These two activities are respectively termed fault localization and program repair. In most cases, these activities cannot be completed in a timely manner, which calls for temporary reliance on automated patch generation, the subject of this work.
For over three decades, researchers have proposed a plethora of automated fault localization techniques and tools. In recent years, a number of automated program repair and patch generation techniques have also been proposed, leveraging approaches such as evolutionary algorithms [19][39], constraint solving [14][16][11][26][27], and program mutation [10]. The aim of this work is to devise an effective patch generation technique that leverages a variant of an accurate coverage-based fault localization (CBFL) approach previously presented in the literature [1].
CBFL techniques generally entail two main steps. First, they identify the executing program elements that correlate most with failure. Second, starting from these elements, which are not necessarily the causes of the failure, they try to locate the faulty code by following some examination strategy. It often happens that, in the first step, the correlation measure of the identified elements is not high enough to guide the developer to the fault. This shortcoming is likely due to the fact that the program elements covered are simple, such as statements and branches, and therefore cannot characterize most defects, which are typically complex. This calls for covering program elements whose complexity matches that of the defect under consideration, no more and no less. A less complex element cannot characterize the defect to begin with, whereas an excessively complex element is likely to 'subsume' the defect and characterize it successfully, but might lead to erroneously tagging too many statements as suspicious. The ultimate goal, then, is to define a program element type that characterizes the defect at hand as closely as possible. The CBFL technique presented in [1] attempts to achieve that goal by identifying the data/control dependence chains that correlate with failure and are minimal in number and length.
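The first CBFL step, ranking executed elements by how strongly they correlate with failure, can be sketched as follows. This is only an illustration: the Ochiai formula is one commonly used correlation measure and is chosen here as an assumption (the text does not name a specific metric), and the `ochiai` and `rank_elements` helpers and the toy coverage data are hypothetical. The element identifiers are plain strings here, but they could equally stand for dependence chains.

```python
import math


def ochiai(failed_cover, passed_cover, total_failed):
    """Ochiai score: failed_cover / sqrt(total_failed * (failed_cover + passed_cover))."""
    denom = math.sqrt(total_failed * (failed_cover + passed_cover))
    return failed_cover / denom if denom else 0.0


def rank_elements(runs):
    """Rank covered elements by suspiciousness.

    runs: list of (covered_elements, failed) pairs, one per test run,
    where covered_elements is a set of element identifiers.
    Returns (element, score) pairs, most suspicious first.
    """
    total_failed = sum(1 for _, failed in runs if failed)
    elements = set()
    for covered, _ in runs:
        elements |= covered
    scores = {}
    for element in elements:
        in_failed = sum(1 for cov, failed in runs if failed and element in cov)
        in_passed = sum(1 for cov, failed in runs if not failed and element in cov)
        scores[element] = ochiai(in_failed, in_passed, total_failed)
    # Sort by descending score, breaking ties by name for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy coverage data: two failing runs and two passing runs.
runs = [
    ({"s1", "s2", "s3"}, True),
    ({"s1", "s3"}, True),
    ({"s1", "s2"}, False),
    ({"s1"}, False),
]
ranking = rank_elements(runs)
print(ranking)
```

Here `s3` is covered by every failing run and no passing run, so it ranks first, while `s1`, which every run covers, scores lower. Simple elements such as these often fail to separate failing from passing runs this cleanly, which is the motivation given above for covering richer elements such as dependence chains.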
Our patch generation approach uses a variant of the above CBFL work as a starting point, but first improves its accuracy by considering the causal relationships amongst program statements. The proposed patch generation approach proceeds as follows:
FSE 2017 Name et al.

Step 1. It identifies a set of suspicious control chains via the improved CBFL technique.
Step 2. It identifies a set of predicates within the suspicious chains along with associated ex
