Code-based Automated Program Fixing
Many programmers, when they encounter an error, would like to have the benefit of automatic fix suggestions—as long as they are, most of the time, adequate. Initial research in this direction has generally limited itself to specific areas, such as data structure classes with carefully designed interfaces, and relied on simple approaches. To provide high-quality fix suggestions in a broad area of applicability, the present work relies on the presence of contracts in the code, and on the availability of dynamic analysis to gather evidence on the values taken by expressions derived from the program text. The ideas have been built into the AutoFix-E2 automatic fix generator. Applications of AutoFix-E2 to general-purpose software, such as a library to manipulate documents, show that the approach provides an improvement over previous techniques, in particular purely model-based approaches.
💡 Research Summary
The paper introduces a novel “code‑based” approach to automatic program fixing that extends earlier “model‑based” techniques. Model‑based fixing, as implemented in the predecessor tool AutoFix‑E, relies on contracts (preconditions, postconditions, class invariants) and on public queries to build abstract models of correct and faulty executions. By comparing these models it can locate faults, but it works well only when a class exposes a rich public interface; otherwise the model lacks enough information to pinpoint the error.
The code‑based method overcomes this limitation by analysing the actual program text. For a routine that fails a contract, the system extracts every non‑constant expression that appears in the routine body or in the violated contract clause. Each expression is evaluated at runtime on a set of automatically generated test cases (using the AutoTest random‑testing framework). The test suite is split into passing and failing cases; an expression that frequently appears in failing cases is considered suspicious.
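As a rough illustration of this bookkeeping step, the sketch below counts how often each candidate expression holds in passing versus failing runs. This is a Python analog, not the tool's implementation (AutoFix‑E2 instruments Eiffel code); the state dictionaries and expression table are invented for the example:

```python
# Hedged sketch: tally how often each candidate expression holds in
# passing vs. failing test runs. States and expressions are illustrative
# stand-ins, not AutoFix-E2's actual data structures.

def occurrence_counts(expressions, passing_states, failing_states):
    """expressions: dict mapping a source-level expression (as text)
    to a predicate over a recorded program state (a plain dict).
    Returns per-expression counts over the two run sets."""
    counts = {}
    for text, holds in expressions.items():
        counts[text] = {
            "pass": sum(1 for s in passing_states if holds(s)),
            "fail": sum(1 for s in failing_states if holds(s)),
        }
    return counts

# Example: the suspicious expression from the paper's running example.
exprs = {"idx > index": lambda s: s["idx"] > s["index"]}
passing = [{"idx": 2, "index": 3}, {"idx": 1, "index": 1}]
failing = [{"idx": 4, "index": 3}]
print(occurrence_counts(exprs, passing, failing))
# {'idx > index': {'pass': 0, 'fail': 1}}
```

An expression that holds far more often in the failing set than in the passing set is flagged as suspicious in the scoring step described next.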
Three scores are computed for each expression:
- Expression‑dependence (edep) – a syntactic similarity measure between expressions;
- Control‑dependence (cdep) – a distance on the control‑flow graph between the locations where the expressions occur;
- Dynamic score (dyn) – the difference in frequency of the expression in failing versus passing tests.
A weighted combination of these scores yields a global “fixme” score that ranks expressions by their likelihood of being responsible for the fault. The highest‑ranked expressions become candidates for fixing actions. Two kinds of actions are generated automatically: (1) value‑adjustment actions that modify the expression’s value (e.g., idx := idx - 1), and (2) guard‑insertion actions that add a conditional test before the expression (e.g., if before then …).
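The ranking can be sketched as follows. The weights and component values below are placeholders (the paper's calibrated formula is not reproduced here); only the shape of the computation is meant to match the description above:

```python
# Hedged sketch of the combined "fixme" ranking. The edep and cdep
# component scores are given as precomputed numbers; the weights are
# illustrative, not the paper's calibrated values.

def dyn_score(fail_count, total_fail, pass_count, total_pass):
    # Difference in relative frequency between failing and passing runs.
    return fail_count / total_fail - pass_count / total_pass

def fixme_score(edep, cdep, dyn, w=(1.0, 1.0, 2.0)):
    # Weighted combination; higher means more suspicious.
    return w[0] * edep + w[1] * cdep + w[2] * dyn

candidates = {
    "idx > index": fixme_score(edep=0.8, cdep=0.9, dyn=dyn_score(1, 1, 0, 2)),
    "count = 0":   fixme_score(edep=0.2, cdep=0.5, dyn=dyn_score(0, 1, 1, 2)),
}
ranked = sorted(candidates, key=candidates.get, reverse=True)
print(ranked[0])  # → idx > index
```

The top-ranked expressions are then the targets for the value-adjustment and guard-insertion actions.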
The candidate fixes are injected into the faulty routine, and the whole regression test suite is re‑executed. Any variant that passes all tests is accepted as a valid fix.
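Validation is a generate-and-test loop. A minimal sketch, with patched routines and regression tests modeled as plain functions rather than instrumented Eiffel classes:

```python
# Hedged sketch: accept the first candidate patch under which every
# regression test passes. Patches and tests are plain Python callables,
# standing in for AutoFix-E2's instrumented Eiffel routines.

def validate(candidates, tests):
    """candidates: list of patched routines; tests: predicates that
    exercise a routine and report success. Returns the first routine
    passing all tests, or None if every candidate fails."""
    for routine in candidates:
        if all(test(routine) for test in tests):
            return routine
    return None

# Toy example: a buggy routine vs. a patched one against an abs-value spec.
buggy = lambda x: x            # wrong for negative inputs
patched = lambda x: abs(x)
tests = [lambda f: f(3) == 3, lambda f: f(-2) == 2]
assert validate([buggy, patched], tests) is patched
```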
The authors illustrate the approach with two concrete bugs in a doubly‑linked‑list‑based set implementation (class TWO_WAY_SORTED_SET). The first bug occurs when remove decrements the element count, causing the subsequent call to go_i_th to violate its precondition because the saved index (idx) is now larger than the new count. The code‑based analysis identifies the expression idx > index as highly suspicious, generates the corrective action if idx > index then idx := idx - 1 end, and inserts it before the call to go_i_th. This fix resolves the error for all test cases, not only the specific scenario that triggered the failure.
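A Python analog may make the scenario concrete. The class below is an invented, much-simplified stand-in for TWO_WAY_SORTED_SET (1-based cursor, contract written as an assertion), not the paper's Eiffel code:

```python
# Hedged Python analog of the first bug: after a removal shrinks the
# container, a saved cursor position can exceed the new count, so
# restoring the cursor violates go_i_th's precondition. The generated
# guard-plus-adjustment repairs the saved position first.

class CursorList:
    def __init__(self, items):
        self.items = list(items)
        self.index = 1                  # 1-based cursor, as in Eiffel

    @property
    def count(self):
        return len(self.items)

    def go_i_th(self, i):
        # Contract precondition: 1 <= i <= count.
        assert 1 <= i <= self.count, "go_i_th precondition violated"
        self.index = i

    def remove_i_th(self, i):
        """Remove the element at position i, then restore the cursor."""
        idx = self.index                # save the cursor
        del self.items[i - 1]           # count shrinks by one
        if idx > i:                     # generated fix: the saved position
            idx = idx - 1               # shifted left past the removal point
        # (Removing the last element while the cursor sits on it is a
        # corner case ignored in this sketch.)
        self.go_i_th(idx)

# Cursor on the last element; removing an earlier element would make
# the unguarded go_i_th(3) fail, since count drops to 2.
lst = CursorList([1, 2, 3])
lst.go_i_th(3)
lst.remove_i_th(1)
print(lst.index, lst.items)  # → 2 [2, 3]
```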
The second bug involves a call to put_left when the cursor is at position 0; the public query before already captures this condition, so both model‑based and code‑based techniques generate a guard if before then forth end before the call.
An extensive experimental evaluation (Section 4) applies AutoFix‑E2 to several real‑world libraries, including a document‑manipulation library and generic collection classes. Compared with the earlier model‑based AutoFix‑E, AutoFix‑E2 fixes 30–45% more faults automatically. All generated patches pass the full regression suite, and manual inspection confirms they are readable and maintain performance. The experiments also show that automatically generated random tests provide sufficient dynamic information; no manual test authoring is required.
The paper discusses limitations: the ranking computation can become costly when many expressions are present; complex control flow may make it hard to locate the optimal insertion point; and in the absence of any contracts the approach loses its primary source of correctness criteria. Future work is suggested in integrating static analysis and machine‑learning techniques to improve ranking, and in extending the framework to other languages and platforms.
In conclusion, the work demonstrates that combining contract‑driven error detection with dynamic analysis of code‑level expressions yields a powerful, fully automated fixing pipeline. AutoFix‑E2 can handle errors that model‑based methods miss, especially in classes with sparse public interfaces, and does so without human intervention beyond providing the source code and contracts. This represents a significant step toward practical, large‑scale automatic program repair.