The Search for the Laws of Automatic Random Testing
Can one estimate the number of remaining faults in a software system? A credible estimation technique would be immensely useful to project managers as well as customers. It would also be of theoretical interest, as a general law of software engineering. We investigate possible answers in the context of automated random testing, a method that is increasingly accepted as an effective way to discover faults. Our experimental results, derived from best-fit analysis of a variety of mathematical functions, based on a large number of automated tests of library code equipped with automated oracles in the form of contracts, suggest a poly-logarithmic law. Although further confirmation remains necessary on different code bases and testing techniques, we argue that understanding the laws of testing may bring significant benefits for estimating the number of detectable faults and comparing different projects and practices.
💡 Research Summary
The paper investigates whether a general law can predict the number of remaining faults in a software system under automated random testing (ART). The authors argue that such a law would be valuable both for project management, where it would allow more accurate scheduling, budgeting, and quality assurance, and for software engineering theory, where a universal empirical relationship could be considered a “law of software testing.”
To explore this question, the study focuses on libraries written in Eiffel and Java that are equipped with contract‑based oracles (Design by Contract in Eiffel, JML in Java). These contracts automatically detect violations during test execution, making it possible to run massive numbers of random test cases without manual inspection. The authors selected roughly thirty modules covering collections, I/O, string manipulation, and other common utilities, and generated on the order of a billion random method calls across these modules. For each test run they recorded the cumulative number of distinct faults discovered, denoted F(t), as a function of the elapsed testing effort t (measured either in time or in the number of generated test cases).
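As a minimal sketch of how the curve F(t) described above could be computed, the Python snippet below counts distinct faults cumulatively over a sequence of test outcomes. The log format (a fault identifier per test case, or `None` for a passing test) and the function name `cumulative_distinct_faults` are assumptions made for illustration; the authors' actual tooling and fault-identification scheme are not described here.

```python
from typing import Hashable, Iterable, List, Optional


def cumulative_distinct_faults(outcomes: Iterable[Optional[Hashable]]) -> List[int]:
    """Return F(t) for t = 1..n.

    outcomes[t-1] is the identifier of the fault exposed by the t-th test
    case (for example, a violated contract), or None if the test passed.
    """
    seen = set()
    curve = []
    for fault_id in outcomes:
        if fault_id is not None:
            seen.add(fault_id)  # re-triggering a known fault adds nothing
        curve.append(len(seen))
    return curve


# Hypothetical log: (class, routine, kind of contract violated) or None.
log = [
    None,
    ("LIST", "remove", "precondition"),
    None,
    ("LIST", "remove", "precondition"),  # same fault seen again
    ("STRING", "substring", "postcondition"),
    None,
]
curve = cumulative_distinct_faults(log)
print(curve)  # [0, 1, 1, 1, 2, 2]
```

Here effort t is measured in test cases. Measuring it in elapsed time instead would only require attaching a timestamp to each outcome and sampling the running count at fixed intervals.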
The core of the analysis consists of fitting a wide variety of candidate mathematical functions to the empirical F(t) curves. The candidate families include exponential decay, simple logarithmic, polynomial, hyperbolic, and especially poly‑logarithmic forms of the type
F(t) = a·
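The best-fit comparison can be illustrated with a short Python sketch that fits a few candidate families to one F(t) curve and ranks them by residual sum of squares. Several parts are assumptions: the exact poly-logarithmic form is cut off in the text above, so the form a·(ln t)^b used here is only a guess; the data is synthetic and not taken from the paper; and fitting by linear regression on transformed data is a simplification, not necessarily the authors' fitting procedure.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

Model = Callable[[float], float]


def linear_fit(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rss(model: Model, ts: Sequence[float], fs: Sequence[float]) -> float:
    """Residual sum of squares, measured in the original (untransformed) space."""
    return sum((model(t) - f) ** 2 for t, f in zip(ts, fs))


def fit_logarithmic(ts: Sequence[float], fs: Sequence[float]) -> Model:
    """Simple logarithmic: F(t) = a * ln(t) + b, linear in ln(t)."""
    a, b = linear_fit([math.log(t) for t in ts], fs)
    return lambda t: a * math.log(t) + b


def fit_power(ts: Sequence[float], fs: Sequence[float]) -> Model:
    """Power law: F(t) = a * t**b, fitted as ln F = ln a + b * ln t."""
    b, ln_a = linear_fit([math.log(t) for t in ts], [math.log(f) for f in fs])
    a = math.exp(ln_a)
    return lambda t: a * t ** b


def fit_polylog(ts: Sequence[float], fs: Sequence[float]) -> Model:
    """ASSUMED form F(t) = a * ln(t)**b, fitted as ln F = ln a + b * ln(ln t).

    Requires t > 1 and F > 0 so that both logarithms are defined.
    """
    b, ln_a = linear_fit(
        [math.log(math.log(t)) for t in ts], [math.log(f) for f in fs]
    )
    a = math.exp(ln_a)
    return lambda t: a * math.log(t) ** b


# Synthetic curve generated from the assumed poly-logarithmic form (not real data).
ts = [10.0, 100.0, 1_000.0, 10_000.0, 100_000.0, 1_000_000.0]
fs = [2.0 * math.log(t) ** 1.5 for t in ts]

candidates: Dict[str, Callable[[Sequence[float], Sequence[float]], Model]] = {
    "logarithmic": fit_logarithmic,
    "power": fit_power,
    "poly-logarithmic": fit_polylog,
}
scores = {name: rss(fit(ts, fs), ts, fs) for name, fit in candidates.items()}
best = min(scores, key=scores.get)
print(best, scores)
```

Because the synthetic data was generated from the assumed poly-logarithmic form, that family wins here by construction; the sketch shows only the mechanics of the comparison. Regression on log-transformed values minimizes error in log space, so a nonlinear least-squares fit may yield somewhat different parameters on real, noisy data.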