Searching publications on software testing

This note concerns a search for publications in which the pragmatic concept of a test, as conducted in the practice of software testing, is formalized; in which a theory about software testing based on such a formalization is presented; or in which it is demonstrated, on the basis of such a theory, that there are solid grounds to test software in cases where other forms of analysis could in principle be used. This note reports on how the search was carried out and on its main outcomes. The message of the note is that the fundamentals of software testing are still incomplete in some respects.


💡 Research Summary

The paper presents a systematic literature search aimed at uncovering scholarly work that either (1) formalizes the pragmatic concept of a software test as it is performed in practice, (2) builds a theoretical framework on the basis of such a formalization, or (3) demonstrates, through that framework, that there are solid reasons to conduct testing even when other forms of analysis (e.g., static analysis, model checking) could in principle be applied. The authors begin by clarifying what they mean by “test” in the practical sense: the selection of inputs, execution of the program under test, observation of its behavior, and the judgment of outcomes. They argue that a formal representation of this process—using logical formulas, state‑transition systems, or other mathematical models—would provide a foundation for rigorous reasoning about test design, execution, and result interpretation.
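The four steps named above (input selection, execution, observation, judgment) can be sketched as a small formal model. This is an illustrative assumption, not notation from the paper: the names `Test`, `oracle`, and `execute` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable

# A "test" in the practical sense: a selected input plus an oracle that
# judges the observed output. All identifiers here are illustrative.
@dataclass(frozen=True)
class Test:
    test_input: Any                     # the selected input
    oracle: Callable[[Any, Any], bool]  # judges (input, observed output)

def execute(program: Callable[[Any], Any], test: Test) -> str:
    """Run the program on the test input, observe, and judge the outcome."""
    observed = program(test.test_input)  # execution + observation
    return "pass" if test.oracle(test.test_input, observed) else "fail"

# Example: testing a hand-written absolute-value function against its oracle.
def abs_impl(x: int) -> int:
    return x if x >= 0 else -x

t = Test(test_input=-7, oracle=lambda x, y: y == abs(x))
print(execute(abs_impl, t))  # → pass
```

Even this toy model makes the ingredients of a test explicit, which is exactly the kind of precision the surveyed literature is said to lack.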

To locate relevant publications, the authors followed a classic systematic review protocol. They queried major digital libraries (ACM Digital Library, IEEE Xplore, SpringerLink, Scopus, and DBLP) for the period 2010‑2025, employing a Boolean search string that combined terms such as “software testing”, “formalization”, “test theory”, “test justification”, “formal methods”, and “empirical validation”. Inclusion criteria required that a paper (a) explicitly models the testing process with a formal language, (b) uses that model to influence test generation, execution, or analysis, or (c) provides a logical or empirical argument that testing adds value in contexts where alternative analyses are feasible. Papers that merely described tools, presented case studies without formal models, or were non‑English were excluded.
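The Boolean search string described above might look roughly as follows when applied to bibliographic records; this is an assumed reconstruction for illustration, not the authors' exact query, and the record texts are invented.

```python
# Hypothetical rendering of the review's Boolean query:
# "software testing" AND (formalization OR "test theory" OR
# "test justification" OR "formal methods" OR "empirical validation")
SECONDARY_TERMS = ["formalization", "test theory", "test justification",
                   "formal methods", "empirical validation"]

def matches(record: str) -> bool:
    """True if a record satisfies the (assumed) Boolean search string."""
    text = record.lower()
    return "software testing" in text and any(t in text for t in SECONDARY_TERMS)

records = [
    "A formalization of software testing objectives",
    "A case study of agile tooling",
]
print([matches(r) for r in records])  # → [True, False]
```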

The initial query returned 1,237 records. After title, abstract, and keyword screening, 312 papers were retained for full‑text review. Ultimately, only 27 papers satisfied all inclusion criteria. The authors then classified these 27 works along several dimensions: domain of application, type of formalism, presence of a theoretical framework, and whether a comparative justification against other analysis techniques was offered.

Key findings are as follows. First, only about one‑third of the selected papers attempted any formalization of the testing activity, and even those tended to focus on narrow aspects such as formal definitions of test objectives or rule‑based test‑case generation expressed in a specific formal language. Second, the formalizations were largely domain‑specific (e.g., embedded systems, real‑time control, safety‑critical software) and did not propose a generalizable theory of testing. Third, comparative studies that argue for the necessity of testing in the presence of static analysis or model checking were scarce—only five papers—most of which relied on conceptual arguments rather than empirical evidence.

The analysis leads the authors to conclude that the foundational theory of software testing remains underdeveloped. While there is a substantial body of work on test automation, tool development, and domain-specific case studies, very few contributions address the deeper question of "what is a test?" in a mathematically precise way, and fewer still provide a rigorous justification for testing as a distinct activity when other verification techniques are available.

Based on these observations, the paper outlines several research directions. (1) Develop comprehensive mathematical models of the testing process (e.g., representing test specifications as propositional logic, modeling test execution as transition systems) and evaluate their impact on measurable attributes such as fault detection rate, cost efficiency, and risk reduction. (2) Explore hybrid frameworks that integrate model‑based testing with formal verification, thereby clarifying the complementary strengths of dynamic testing and static analysis. (3) Formalize the mechanisms behind AI/ML‑driven test generation and prioritization, and assess their consistency with existing formal test theories. (4) Incorporate formal testing concepts into software engineering curricula to bridge the gap between academic research and industry practice.
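Direction (1) above mentions modeling test execution as a transition system. A minimal sketch of that idea follows; the states, action labels, and transition relation are illustrative assumptions, not a model proposed in the paper.

```python
# A labeled transition system for one test execution: states are stages
# of the testing activity, labels are the tester's actions.
TRANSITIONS = {
    ("idle", "select_input"): "input_chosen",
    ("input_chosen", "execute"): "output_observed",
    ("output_observed", "judge"): "verdict",
}

def run(trace):
    """Replay a sequence of actions from the initial state 'idle'.
    Returns the final state, or None if the trace is not a valid
    test execution in this model."""
    state = "idle"
    for action in trace:
        state = TRANSITIONS.get((state, action))
        if state is None:
            return None
    return state

print(run(["select_input", "execute", "judge"]))  # → verdict
print(run(["execute"]))                           # → None
```

A model of this shape makes it possible to state properties such as "every complete test execution ends in a verdict", which is the kind of measurable claim the proposed research directions call for.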

In summary, the paper documents a thorough search effort, reveals a pronounced scarcity of work that formalizes the practical notion of a software test and builds a solid theoretical basis for it, and calls for renewed focus on establishing rigorous, generalizable foundations for software testing. This is especially urgent given the growing reliance on automated and AI‑enhanced testing techniques, which would benefit from a clear, mathematically grounded understanding of what testing actually entails.

