Syntactic vs. Semantic Locality: How Good Is a Cheap Approximation?
Extracting a subset of a given OWL ontology that captures all the ontology’s knowledge about a specified set of terms is a well-understood task. This task can be based, for instance, on locality-based modules (LBMs). These come in two flavours, syntactic and semantic, and a syntactic LBM is known to contain the corresponding semantic LBM. For syntactic LBMs, polynomial extraction algorithms are known, implemented in the OWL API, and being used. In contrast, extracting semantic LBMs involves reasoning, which is intractable for OWL 2 DL, and these algorithms had not been implemented yet for expressive ontology languages. We present the first implementation of semantic LBMs and report on experiments that compare them with syntactic LBMs extracted from real-life ontologies. Our study reveals whether semantic LBMs are worth the additional extraction effort, compared with syntactic LBMs.
💡 Research Summary
This paper reports an empirical comparison of syntactic and semantic locality-based modules for OWL ontologies, made possible by the first implementation of a semantic locality-based module extractor.
The core problem addressed is modular ontology extraction: given a large ontology O and a signature Σ (a set of concept and role names of interest), the goal is to extract a subset (module) M of O that preserves all entailments over Σ. Locality-Based Modules (LBMs) offer a solution. Semantic LBMs (based on ∅- or ∆-locality) rest on a semantic test: an axiom is local with respect to Σ if it is satisfied whenever every non-Σ term is interpreted as the empty set (∅) or as the full domain (∆), that is, if it becomes a tautology under that replacement. Axioms that fail the test go into the module. Syntactic LBMs (based on ⊥- or ⊤-locality) approximate this test by matching only the syntactic form of the axiom against predefined patterns. Crucially, every axiom the syntactic test accepts as local is also semantically local, but not vice versa, so a syntactic module is always a superset of the corresponding semantic module for the same Σ. While syntactic extraction runs in polynomial time and is widely implemented (e.g., in the OWL API), semantic extraction requires logical reasoning, which is intractable for OWL 2 DL, and had not been implemented prior to this work.
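To make the two locality tests concrete, here is a minimal sketch on a toy fragment. It is not the authors' implementation and does not use the OWL API. It assumes every axiom has the form A1 ⊓ ... ⊓ An ⊑ B1 ⊓ ... ⊓ Bm over concept names only, where the semantic test can be decided by set inclusion instead of calling a reasoner. All function names (`ax`, `extract_module`, and so on) are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, Set, Tuple

# Toy axiom (assumption): (lhs, rhs) stands for
#   A1 ⊓ ... ⊓ An ⊑ B1 ⊓ ... ⊓ Bm   over concept names only.
# An empty side stands for ⊤. Real OWL 2 DL axioms are far richer.
Axiom = Tuple[FrozenSet[str], FrozenSet[str]]
LocalityCheck = Callable[[Axiom, Set[str]], bool]


def ax(lhs: str, rhs: str) -> Axiom:
    """Build a toy axiom from space-separated concept names."""
    return (frozenset(lhs.split()), frozenset(rhs.split()))


def signature(axiom: Axiom) -> Set[str]:
    lhs, rhs = axiom
    return set(lhs | rhs)


def syntactic_bot_local(axiom: Axiom, sigma: Set[str]) -> bool:
    """Pattern check: the left side contains a non-Σ name (so it is
    ⊥-equivalent), or the right side is ⊤ (empty conjunction)."""
    lhs, rhs = axiom
    return any(name not in sigma for name in lhs) or not rhs


def semantic_empty_local(axiom: Axiom, sigma: Set[str]) -> bool:
    """Tautology check after interpreting non-Σ names as the empty set.
    In this fragment, a conjunction of names is subsumed by another in
    every interpretation exactly when the right-hand names are a subset
    of the left-hand names, so no reasoner is needed here."""
    lhs, rhs = axiom
    if any(name not in sigma for name in lhs):
        return True  # left side collapses to ⊥, and ⊥ ⊑ D always holds
    return rhs <= lhs


def extract_module(ontology: Iterable[Axiom], sigma: Set[str],
                   is_local: LocalityCheck) -> Set[Axiom]:
    """Add every non-local axiom, grow the signature, repeat to a fixpoint."""
    axioms = list(ontology)
    module: Set[Axiom] = set()
    sig = set(sigma)
    changed = True
    while changed:
        changed = False
        for axiom in axioms:
            if axiom not in module and not is_local(axiom, sig):
                module.add(axiom)
                sig |= signature(axiom)
                changed = True
    return module


ontology = [ax("A", "B"), ax("B", "C"), ax("D", "E"), ax("A", "A")]
syn = extract_module(ontology, {"A"}, syntactic_bot_local)
sem = extract_module(ontology, {"A"}, semantic_empty_local)
print(sorted((sorted(l), sorted(r)) for l, r in syn - sem))
```

For the seed signature {A}, both modules contain A ⊑ B and B ⊑ C and leave out D ⊑ E. The tautology A ⊑ A is the only axiom in the syntactic module that the semantic module omits, which illustrates the superset relation. In an expressive language the semantic check would instead be a tautology test delegated to a reasoner, which is where the extra cost comes from.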
The central research question is: How good is this “cheap” syntactic approximation compared to the “correct” but expensive semantic method? Are the extra, potentially superfluous axioms in syntactic modules substantial enough to justify the high computational cost of semantic extraction?
To answer this, the authors built a corpus of 156 real-world ontologies from the BioPortal and TONES repositories. Their experimental design is twofold:
- Random Signature Analysis: For each ontology, they generated 400 random signatures (each term included with probability 1/2) to obtain statistically significant results (±5% margin of error, 95% confidence level). For each signature, they compared the extracted syntactic ⊥-module and semantic ∅-module.
- Genuine Module Analysis: They also extracted modules for the signature of every single axiom in each ontology. These yield “genuine” modules, which form a basis for all possible modules (every module is a union of genuine modules) and number at most one per axiom.
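The sampling scheme and the quoted error margin can be sketched as follows. This is an illustration under assumptions, not the authors' code: it assumes the ±5% figure comes from the usual worst-case normal approximation for a proportion, and the helper names (`random_signature`, `margin_of_error`, `genuine_seed_signatures`) are hypothetical.

```python
import math
import random
from typing import FrozenSet, Iterable, Set


def random_signature(terms: Iterable[str], rng: random.Random,
                     p: float = 0.5) -> Set[str]:
    """Include each term independently with probability p."""
    return {t for t in terms if rng.random() < p}


def margin_of_error(n: int, z: float = 1.96, p: float = 0.5) -> float:
    """Half-width of a normal-approximation confidence interval for a
    proportion; z = 1.96 corresponds to 95% confidence and p = 0.5 is
    the worst case."""
    return z * math.sqrt(p * (1 - p) / n)


def genuine_seed_signatures(
        axiom_signatures: Iterable[Iterable[str]]) -> Set[FrozenSet[str]]:
    """One seed signature per axiom, with duplicates collapsed."""
    return {frozenset(sig) for sig in axiom_signatures}


rng = random.Random(0)  # fixed seed so the sketch is repeatable
terms = ["A", "B", "C", "D", "E", "r", "s"]
signatures = [random_signature(terms, rng) for _ in range(400)]

print(len(signatures))                  # 400
print(round(margin_of_error(400), 3))   # 0.049
```

With 400 samples the worst-case margin is 1.96 · √(0.25 / 400) = 0.049, which matches the stated ±5% at the 95% confidence level. The seed signatures for genuine modules need no sampling: there is one per axiom, so their number is bounded by the size of the ontology.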
The results are striking and consistent:
- Near-Equivalence: For the vast majority of ontology-signature pairs, the syntactic ⊥-module and the semantic ∅-module were identical. Differences were extremely rare.
- Minimal Differences: In the few cases where differences occurred, the number of axioms present in the syntactic but absent from the semantic module was very small (often just one or two axioms).
- Characterization of Differences: Analysis revealed that the differing axioms typically fell into two categories: simple tautologies (e.g., A ⊑ A) or axioms matching specific patterns that could, in principle, be easily added to the syntactic locality checker’s rules.
- Performance Gap: As expected, semantic module extraction was significantly slower than syntactic extraction due to the required reasoning calls.
The study’s conclusion is clear: for practical purposes on real-world ontologies, the syntactic approximation is excellent (“Cheap is Great!”). The additional computational effort required to obtain a (slightly) smaller semantic module is generally not justified: the syntactic module almost always coincides with the semantic one, and where it does not, the few superfluous axioms are of no practical concern. This gives ontology engineers empirical evidence that efficient syntactic module extraction is a reliable and sufficient technique in most real-world scenarios.