The Logical Difference for the Lightweight Description Logic EL

We study a logic-based approach to versioning of ontologies. Under this view, ontologies provide answers to queries about some vocabulary of interest. The difference between two versions of an ontology is given by the set of queries that receive different answers. We investigate this approach for terminologies given in the description logic EL extended with role inclusions and domain and range restrictions, for three distinct types of queries: subsumption, instance, and conjunctive queries. In all three cases, we present polynomial-time algorithms that decide whether two terminologies give the same answers to queries over a given vocabulary and compute a succinct representation of the difference if it is non-empty. We present an implementation, CEX2, of the developed algorithms for subsumption and instance queries and apply it to distinct versions of Snomed CT and the NCI ontology.


💡 Research Summary

The paper introduces a logic-based framework for ontology versioning that defines the “logical difference” between two ontology versions as the set of queries that receive different answers. Focusing on terminologies expressed in the lightweight description logic EL, the authors extend EL with role inclusions and domain and range restrictions, which are common in large biomedical ontologies such as SNOMED CT and the NCI Thesaurus. Three query families are considered: subsumption queries (C ⊑ D), instance queries (C(a)), and conjunctive queries (CQs). For each family the authors develop polynomial-time decision procedures that (i) determine whether two terminologies give identical answers over a user-specified vocabulary V, and (ii) if not, produce a compact representation of the difference, called a “difference witness.”
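For subsumption queries, the definition above can be written out as follows. This is a sketch in notation assumed here (the names Diff and sig are illustrative, not quoted from the paper); the instance and conjunctive-query cases follow the same pattern with a different query language.

```latex
\[
\mathrm{Diff}_V(\mathcal{T}_1, \mathcal{T}_2)
  = \{\, C \sqsubseteq D \;\mid\;
       \mathrm{sig}(C) \cup \mathrm{sig}(D) \subseteq V,\;
       \mathcal{T}_1 \models C \sqsubseteq D,\;
       \mathcal{T}_2 \not\models C \sqsubseteq D \,\}
\]
```

Under this reading, the two terminologies give the same answers over V exactly when both Diff_V(T1, T2) and Diff_V(T2, T1) are empty. Since C and D range over arbitrary concepts built from V, a non-empty difference can contain infinitely many inclusions, which is why a succinct representation is needed.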

The core technical contribution is a graph‑based algorithm that merges the two terminologies into a single extended graph and then propagates EL inference rules (concept inclusion, role inclusion, domain and range constraints) simultaneously. For subsumption and instance queries the algorithm checks, for every pair of concepts or concept‑instance pairs drawn from V, whether the inclusion holds in one terminology but not the other. Whenever such a discrepancy is found, the corresponding pair is recorded as a witness. Because propagation is limited to the size of V and the number of axioms, the procedure runs in time polynomial in the combined size of the input terminologies.
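The pairwise comparison described above can be illustrated with a deliberately simplified Python sketch. It is not the paper's algorithm: it assumes terminologies that contain only atomic inclusions A ⊑ B (no existential restrictions, role inclusions, or domain and range restrictions) and compares only inclusions between concept names in V. The function names `entailed_subsumptions` and `subsumption_difference` are hypothetical.

```python
def entailed_subsumptions(axioms, vocabulary):
    """Return all pairs (A, B) of distinct names in `vocabulary` such that
    A ⊑ B follows from `axioms`, a collection of atomic inclusions (A, B).

    Assumption: axioms are atomic only, so entailment is plain reachability.
    """
    supers = {}
    for sub, sup in axioms:
        supers.setdefault(sub, set()).add(sup)

    def reachable(start):
        # Depth-first search over told subsumptions (transitive closure).
        seen = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in supers.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    return {
        (a, b)
        for a in vocabulary
        for b in reachable(a)
        if b in vocabulary and a != b
    }


def subsumption_difference(t1, t2, vocabulary):
    """Pairs over `vocabulary` entailed by one terminology but not the other."""
    e1 = entailed_subsumptions(t1, vocabulary)
    e2 = entailed_subsumptions(t2, vocabulary)
    return {"only_in_t1": e1 - e2, "only_in_t2": e2 - e1}


# Toy example with made-up concept names.
t1 = [("Finger", "BodyPart"), ("BodyPart", "Anatomy")]
t2 = [("Finger", "Digit"), ("Digit", "BodyPart")]
vocab = {"Finger", "BodyPart", "Anatomy"}
diff = subsumption_difference(t1, t2, vocab)
print(sorted(diff["only_in_t1"]))  # [('BodyPart', 'Anatomy'), ('Finger', 'Anatomy')]
print(sorted(diff["only_in_t2"]))  # []
```

Note that `Digit` is outside the vocabulary, so the intermediate step through it in the second terminology is invisible to the comparison; only the resulting inclusion Finger ⊑ BodyPart counts. Each reachability search is linear in the number of axioms, so this toy version is polynomial in the size of the input.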

Conjunctive queries are more challenging because CQ answering in EL is NP-hard in combined complexity. The authors identify a restricted class of CQs that is expressive enough for many practical use cases yet admits a polynomial-time algorithm when role inclusions and domain/range restrictions are present. The algorithm reduces a CQ to a pattern-matching problem on the merged graph, uses the same propagation machinery to prune candidate matches, and finally decides whether the query can be satisfied in both terminologies. If a CQ is answered differently, the algorithm returns a witness consisting of the minimal set of atoms that cause the divergence.
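To make the pattern-matching view concrete, the following Python sketch matches a conjunctive query against a small labeled graph by backtracking. It is only an illustration under assumptions made here: the graph encoding, the atom format, and the function name `match_query` are hypothetical, and plain backtracking is exponential in the worst case, so it does not reproduce the polynomial-time procedure summarized above.

```python
def match_query(atoms, concept_facts, role_facts):
    """Find one assignment of query variables to graph nodes, or None.

    atoms:         list of ("concept", A, x) or ("role", r, x, y) tuples,
                   where x and y are variable names.
    concept_facts: set of (A, node) pairs.
    role_facts:    set of (r, source, target) triples.
    """
    nodes = {n for _, n in concept_facts}
    nodes |= {n for _, s, t in role_facts for n in (s, t)}
    variables = sorted({v for atom in atoms for v in atom[2:]})

    def consistent(assignment):
        # Check every atom whose variables are all assigned so far.
        for atom in atoms:
            args = atom[2:]
            if not all(v in assignment for v in args):
                continue
            values = tuple(assignment[v] for v in args)
            facts = concept_facts if atom[0] == "concept" else role_facts
            if (atom[1],) + values not in facts:
                return False
        return True

    def extend(assignment, remaining):
        if not remaining:
            return dict(assignment)
        var, rest = remaining[0], remaining[1:]
        for node in sorted(nodes):
            assignment[var] = node
            if consistent(assignment):
                result = extend(assignment, rest)
                if result is not None:
                    return result
            del assignment[var]
        return None

    return extend({}, variables)


# Toy query: is there a Finger that is partOf a Hand?
query = [("concept", "Finger", "x"), ("role", "partOf", "x", "y"),
         ("concept", "Hand", "y")]
concepts = {("Finger", "n1"), ("Hand", "n2")}
print(match_query(query, concepts, {("partOf", "n1", "n2")}))  # {'x': 'n1', 'y': 'n2'}
print(match_query(query, concepts, set()))                     # None
```

In this toy setting, a query that matches the graph derived from one terminology but not the other would count as a difference between the two.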

To validate the theory, the authors implement the subsumption and instance algorithms in a prototype tool named CEX2. CEX2 reads two EL terminologies, a vocabulary V, and a set of queries, then reports whether the answers coincide and, if not, prints the witnesses in a human-readable form (e.g., “C ⊑ D holds only in T1”). Experiments were conducted on successive releases of SNOMED CT and the NCI ontology, both large biomedical terminologies. The results show that CEX2 decides equivalence in a few seconds even for the largest versions, and that the generated witnesses are concise enough for ontology engineers to inspect and act upon.

Overall, the paper makes several notable contributions. First, it formalizes ontology versioning at the semantic level rather than at the syntactic level, aligning version control with the actual reasoning services that users depend on. Second, it demonstrates that for EL, augmented with realistic role and typing constraints, logical difference can be computed efficiently, whereas comparing versions at this level is considerably harder for more expressive logics. Third, the notion of a difference witness provides a succinct, actionable summary of changes, bridging the gap between automated detection and manual review. Finally, the prototype and empirical evaluation indicate that the approach scales to real-world biomedical ontologies, suggesting immediate applicability in ontology maintenance pipelines. Future work may extend the techniques to more expressive description logics (e.g., ALC, SROIQ), to richer query languages (e.g., SPARQL-DL), and to integration with visualization tools that help stakeholders understand the impact of ontology evolution.