Completeness Guarantees for Incomplete Ontology Reasoners: Theory and Practice

To achieve scalability of query answering, the developers of Semantic Web applications are often forced to use incomplete OWL 2 reasoners, which may fail to derive all answers for at least one query, ontology, and data set. The lack of completeness guarantees, however, may be unacceptable for applications in areas such as health care and defence, where missing answers can adversely affect the application's functionality. Furthermore, even if an application can tolerate some level of incompleteness, it is often advantageous to estimate how many and what kind of answers are being lost. In this paper, we present a novel logic-based framework that allows one to check whether a reasoner is complete for a given query Q and ontology T, that is, whether the reasoner is guaranteed to compute all answers to Q w.r.t. T and an arbitrary data set A. Since ontologies and typical queries are often fixed at application design time, our approach allows application developers to check whether a reasoner known to be incomplete in general is actually complete for the kinds of input relevant for the application. We also present a technique that, given a query Q, an ontology T, and reasoners R_1 and R_2 that satisfy certain assumptions, can be used to determine whether, for each data set A, reasoner R_1 computes more answers to Q w.r.t. T and A than reasoner R_2. This allows application developers to select the reasoner that provides the highest degree of completeness for Q and T that is compatible with the application's scalability requirements. Our results thus provide a theoretical and practical foundation for the design of future ontology-based information systems that maximise scalability while minimising or even eliminating incompleteness of query answers.


💡 Research Summary

The paper addresses a critical gap in Semantic Web engineering: how to guarantee that an ontology reasoner, which may be incomplete in the general case, actually returns all answers for the specific queries and ontologies used by an application. The authors propose a logic‑based framework that, given a fixed query Q and ontology T, constructs a finite “completeness test suite” of ABox instances. If a reasoner R produces the correct answer set for every ABox in this suite, then R is provably complete for Q with respect to T for any arbitrary data set A. The construction of the test suite is based on a detailed syntactic analysis of T and Q, taking into account features such as role chains, complex class expressions, and negation. The authors prove that the suite is a sufficient condition for completeness, using model‑theoretic arguments and inductive reasoning over the structure of T and Q.
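The shape of the single-reasoner check can be pictured with a small sketch. This is illustrative only, not the paper's implementation: the interface, the helper `passes_test_suite`, and the toy ontology (a single axiom Cat ⊑ Animal with query Q(x) = Animal(x)) are assumptions made here for the example. A reasoner is modelled as a function from an ABox to the answer set it returns for the fixed Q and T, so completeness for the suite reduces to agreement with a complete oracle on every test ABox:

```python
from typing import Callable, FrozenSet, Iterable, Tuple

# An ABox is a set of assertions such as ("Cat", "felix"); a reasoner is
# modelled as a function from an ABox to the answers it returns for the
# fixed query Q and ontology T. (Hypothetical interface for illustration.)
ABox = FrozenSet[Tuple[str, str]]
Reasoner = Callable[[ABox], FrozenSet[str]]

def passes_test_suite(reasoner: Reasoner,
                      oracle: Reasoner,
                      test_suite: Iterable[ABox]) -> bool:
    """True iff `reasoner` agrees with the complete `oracle` on every
    ABox in the finite completeness test suite for Q and T."""
    return all(reasoner(abox) == oracle(abox) for abox in test_suite)

# Toy setup: T = {Cat ⊑ Animal}, Q(x) = Animal(x).
def oracle(abox: ABox) -> FrozenSet[str]:
    # Complete: derives Animal(x) from either kind of assertion.
    return frozenset(x for c, x in abox if c in ("Animal", "Cat"))

def shallow(abox: ABox) -> FrozenSet[str]:
    # Incomplete: ignores the TBox axiom Cat ⊑ Animal entirely.
    return frozenset(x for c, x in abox if c == "Animal")

suite = [frozenset({("Cat", "felix")}), frozenset({("Animal", "rex")})]
```

Here `shallow` fails the suite because it misses the answer `felix` on the first test ABox; a reasoner that passed the suite would, by the paper's sufficiency result, be complete for Q and T on every data set.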

In addition to the single‑reasoner completeness check, the paper introduces a comparative method for two reasoners R₁ and R₂ that satisfy two mild assumptions: (i) neither reasoner returns false positives (answers that are not entailed), and (ii) both behave monotonically, so adding assertions to the data never removes answers. Under these assumptions the method decides whether, for every possible ABox, R₁ yields a superset of the answers returned by R₂. If the answer is negative, the algorithm automatically generates a concrete counter‑example ABox that demonstrates the difference, enabling developers to inspect the exact nature of the incompleteness.
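Under those assumptions, the comparison amounts to checking answer-set containment over a finite family of candidate ABoxes, and a failed containment check yields the witnessing ABox directly. The sketch below is an assumption-laden illustration of that idea (the name `find_counterexample`, the reasoner signature, and the candidate enumeration are hypothetical, not the paper's algorithm):

```python
from typing import Callable, FrozenSet, Iterable, Optional, Tuple

ABox = FrozenSet[Tuple[str, str]]
Reasoner = Callable[[ABox], FrozenSet[str]]

def find_counterexample(r1: Reasoner,
                        r2: Reasoner,
                        candidate_aboxes: Iterable[ABox]) -> Optional[ABox]:
    """Return an ABox on which r2 produces an answer that r1 misses,
    or None if r1's answers subsume r2's on every candidate ABox."""
    for abox in candidate_aboxes:
        if not r2(abox) <= r1(abox):   # r2 found something r1 missed
            return abox
    return None
```

A returned ABox is exactly the kind of minimal counter‑example described above: developers can inspect it to see which answers the weaker reasoner loses; a `None` result certifies that R₁ is at least as complete as R₂ on the candidates checked.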

The authors evaluate their approach on a diverse set of ontologies (including a medical SNOMED‑CT fragment, a military operational ontology, and a scholarly metadata ontology) and on a range of typical SPARQL‑like queries (instance retrieval, path navigation, conjunctive conditions). They test several widely used reasoners: an OWL RL rule engine, ELK, HermiT, and a custom optimised engine. The experiments reveal that many reasoners labelled “incomplete” in the literature are in fact complete for the specific Q–T pairs relevant to real applications. For example, the OWL RL engine passes the test suite for several medical queries, and ELK outperforms HermiT on certain role‑chain queries by returning strictly more answers. The counter‑example generation component successfully produces minimal ABoxes that expose the precise missing answers, allowing developers to make informed trade‑offs between scalability and completeness.

The discussion acknowledges that the size of the completeness test suite can grow exponentially with the structural complexity of T and Q, suggesting future work on minimisation techniques and heuristic pruning. Extending the framework to dynamic data streams, ontology evolution, and probabilistic reasoning is identified as an open research direction. The authors also propose integrating the framework into service‑level agreements (SLAs) and automated reasoner selection pipelines, thereby turning completeness guarantees into a measurable quality‑of‑service metric.

In conclusion, the paper delivers both a theoretical foundation and a practical toolkit for verifying and comparing the completeness of ontology reasoners in application‑specific contexts. By enabling developers to certify that a high‑performance, otherwise incomplete reasoner is actually complete for their workload, the work paves the way for more reliable, scalable Semantic Web systems in safety‑critical domains such as health care and defence.