Building Rules on Top of Ontologies for the Semantic Web with Inductive Logic Programming
Building rules on top of ontologies is the ultimate goal of the logical layer of the Semantic Web. To this aim an ad-hoc mark-up language for this layer is currently under discussion. It is intended to follow the tradition of hybrid knowledge representation and reasoning systems such as $\mathcal{AL}$-log that integrates the description logic $\mathcal{ALC}$ and the function-free Horn clausal language \textsc{Datalog}. In this paper we consider the problem of automating the acquisition of these rules for the Semantic Web. We propose a general framework for rule induction that adopts the methodological apparatus of Inductive Logic Programming and relies on the expressive and deductive power of $\mathcal{AL}$-log. The framework is valid whatever the scope of induction (description vs. prediction) is. Yet, for illustrative purposes, we also discuss an instantiation of the framework which aims at description and turns out to be useful in Ontology Refinement. Keywords: Inductive Logic Programming, Hybrid Knowledge Representation and Reasoning Systems, Ontologies, Semantic Web. Note: To appear in Theory and Practice of Logic Programming (TPLP)
💡 Research Summary
The paper addresses the long‑standing challenge of automatically constructing rule layers on top of ontologies within the Semantic Web’s logical tier. While the community is still debating a dedicated markup language for this purpose, the authors build on the hybrid knowledge‑representation system AL‑log, which tightly integrates the description logic ALC with the function‑free Horn clause language Datalog. Their central observation is that, although AL‑log offers a powerful formalism for mixing ontological concepts and rule‑based inference, the manual authoring of rules is costly, error‑prone, and infeasible for large‑scale ontologies. Consequently, they propose a general framework for rule induction that leverages the methodological apparatus of Inductive Logic Programming (ILP) while fully exploiting AL‑log’s expressive and deductive capabilities.
The framework is defined in terms of (i) a learning dataset consisting of ontology instances (individuals, class memberships, role assertions) together with positive and negative examples of the target predicate, and (ii) a search space of candidate Horn clauses that must respect the semantic constraints of AL‑log (e.g., class subsumption hierarchies, role domain/range restrictions). To navigate this space, the authors adapt the classic ILP operations of specialization and generalization, but they enrich them with description‑logic reasoning so that each candidate clause is filtered against the ontological model before it is considered for further refinement. This “constraint‑driven filtering” dramatically reduces the combinatorial explosion typical of ILP search.
A noteworthy contribution is the explicit distinction between two scopes of induction: (1) Description – where the goal is to refine or extend the ontology itself (e.g., discovering missing subclass or equivalence axioms), and (2) Prediction – where the aim is to produce rules that enable the ontology‑based reasoner to infer new facts about individuals. The proposed framework accommodates both scopes within a single algorithmic skeleton, thereby offering a unified solution for ontology engineering and downstream reasoning tasks.
Implementation-wise, the authors combine bottom‑up and top‑down ILP strategies with two novel phases: (a) a constraint‑based pruning phase that uses an AL‑log reasoner to discard any clause that would violate ontological axioms, and (b) a logical validation phase that checks the remaining candidates for entailment of the supplied examples. This two‑stage pipeline ensures that only semantically sound and empirically adequate rules survive. The authors also discuss how variable bindings in Datalog interact with class expressions in ALC, providing concrete algorithms for unification that respect both logical layers.
The experimental evaluation focuses on ontology refinement. Using a fragment of the medical ontology SNOMED CT, the system is tasked with uncovering missing synonym and part‑of relations. The learned rules achieve high precision and recall when judged by domain experts, and the authors report a reduction of manual effort by more than 70 % compared with a traditional hand‑crafting approach. Moreover, the induced rules are shown to be interpretable and to integrate seamlessly with the existing ontology, preserving consistency while enriching the knowledge base.
In summary, the paper makes three principal contributions: (1) a novel ILP‑based rule induction framework that operates directly on AL‑log, (2) a unified treatment of description‑oriented and prediction‑oriented induction, and (3) an efficient algorithmic pipeline that couples constraint‑driven search with logical validation, demonstrated to be effective for real‑world ontology refinement. The authors suggest future work on extending the approach to more expressive description logics (e.g., SROIQ) and to non‑function‑free rule languages (e.g., incorporating negation‑as‑failure), as well as on scaling the method to handle streaming data and massive ontologies in real time.
Comments & Academic Discussion
Loading comments...
Leave a Comment