A Data-Parallel Version of Aleph
This paper presents work on modifying the Aleph ILP system so that it evaluates hypothesised clauses in parallel by distributing the data set among the nodes of a parallel or distributed machine. The paper briefly discusses MPI, the interface used to access message-passing libraries for parallel computers and clusters. It then describes an extension of YAP Prolog with an MPI interface and an implementation of data-parallel clause evaluation for Aleph through this interface. The paper concludes by evaluating the data-parallel Aleph on artificially constructed data sets.
💡 Research Summary
The paper presents a concrete effort to parallelise the clause‑evaluation phase of the Aleph inductive‑logic‑programming (ILP) system by distributing the training examples across multiple processing nodes. The authors begin by motivating the need for parallelism: Aleph, while powerful for relational learning, is fundamentally a single‑process system and its performance degrades sharply as the number of examples grows. To address this, they adopt a data‑parallel strategy, keeping the hypothesis‑generation and refinement logic unchanged but off‑loading the costly evaluation of each candidate clause to a set of worker processes.
The technical backbone of the solution is the Message Passing Interface (MPI). After a brief overview of MPI primitives (initialisation, rank/size discovery, broadcast, scatter, gather, reduce), the authors describe how they extended YAP Prolog with a native MPI binding. This binding is written in C, exposing Prolog predicates such as mpi_send/2, mpi_recv/2, mpi_bcast/2, etc. The binding handles conversion between Prolog terms (lists, atoms, numbers) and the raw byte buffers required by MPI, and it incorporates error checking to ensure robust communication on heterogeneous clusters.
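The paper's binding is written in C against the MPI API, so the details of marshalling Prolog terms are not reproducible here. As a language-neutral illustration of the term-to-buffer round trip such a binding must perform, the sketch below serialises a nested, term-like structure into a byte buffer and restores it. The function names and the pickle-based encoding are illustrative assumptions, not the paper's actual API.

```python
import pickle

def term_to_buffer(term):
    """Flatten a nested Prolog-like term (tuples/lists/atoms/numbers) into bytes."""
    return pickle.dumps(term)

def buffer_to_term(buf):
    """Restore the original term from its byte-buffer representation."""
    return pickle.loads(buf)

# A candidate clause represented as a nested structure:
# grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
clause = ("grandparent", ("X", "Z"),
          [("parent", ("X", "Y")), ("parent", ("Y", "Z"))])
buf = term_to_buffer(clause)
assert buffer_to_term(buf) == clause  # lossless round trip
```

The real binding additionally has to respect MPI's typed buffers and check return codes on every call, which is where the error handling mentioned above comes in.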
The parallel Aleph workflow is organised as a master‑worker pattern. The master process holds the global state: the current set of candidate clauses, the background knowledge, and the overall learning loop. At the start of each evaluation cycle the master uses MPI_Scatter to partition the full set of training examples into roughly equal chunks and sends each chunk to a distinct worker. Each worker then evaluates the received clause(s) against its local subset, counting true positives, false positives, true negatives, and false negatives. Once the local counts are computed, workers participate in an MPI_Reduce operation that aggregates the statistics back to the master. The master uses the aggregated metrics to rank the clauses, select the best one, and possibly prune the search space. After this, the master proceeds with the usual Aleph steps of feature selection and clause refinement, which remain sequential.
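The scatter/evaluate/reduce cycle described above can be sketched in a few lines of Python. This is a sequential simulation of the pattern, not the paper's C/Prolog implementation: `scatter` mimics MPI_Scatter's roughly equal partitioning, `evaluate_local` is the per-worker confusion-matrix count, and `reduce_counts` stands in for an element-wise-sum MPI_Reduce. The example representation (feature dict plus boolean label) is an assumption made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Counts:
    tp: int = 0
    fp: int = 0
    tn: int = 0
    fn: int = 0

def scatter(examples, n_workers):
    """Mimic MPI_Scatter: split examples into n roughly equal chunks."""
    k, r = divmod(len(examples), n_workers)
    chunks, start = [], 0
    for i in range(n_workers):
        end = start + k + (1 if i < r else 0)
        chunks.append(examples[start:end])
        start = end
    return chunks

def evaluate_local(clause, chunk):
    """Worker side: confusion-matrix counts of `clause` on its local chunk.
    Each example is (features, label); `clause` is a predicate on features."""
    c = Counts()
    for features, label in chunk:
        pred = clause(features)
        if pred and label:
            c.tp += 1
        elif pred and not label:
            c.fp += 1
        elif not pred and label:
            c.fn += 1
        else:
            c.tn += 1
    return c

def reduce_counts(partials):
    """Mimic MPI_Reduce with an element-wise sum back at the master."""
    total = Counts()
    for p in partials:
        total.tp += p.tp
        total.fp += p.fp
        total.tn += p.tn
        total.fn += p.fn
    return total

# Master loop for one candidate clause, on a toy data set:
examples = [({"x": i}, i % 2 == 0) for i in range(10)]
clause = lambda f: f["x"] % 2 == 0  # toy candidate hypothesis
partials = [evaluate_local(clause, ch) for ch in scatter(examples, 4)]
total = reduce_counts(partials)
print(total)  # → Counts(tp=5, fp=0, tn=5, fn=0)
```

Because addition is associative and commutative, the reduced totals are identical to what a sequential pass over the full data set would produce, which is what makes this phase safe to distribute.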
To validate the approach, the authors constructed two synthetic benchmark datasets. The first dataset encodes a simple linear relationship among predicates, while the second contains multiple interacting relations and nested logical operators, thereby increasing the computational cost of clause evaluation. Both datasets were scaled to sizes ranging from 10 K to 200 K examples. Experiments were run on a modest cluster consisting of up to eight nodes, each equipped with an Intel Xeon CPU and connected via a Gigabit Ethernet network. OpenMPI served as the communication layer.
Results show a clear trend: when the dataset is sufficiently large (≥ 50 K examples), the parallel version achieves near‑linear speed‑ups up to four nodes and respectable gains up to eight nodes (approximately 4.5× faster than the sequential baseline). For smaller datasets, the communication overhead dominates, leading to negligible or even negative speed‑up. Moreover, the benefit is more pronounced when the number of candidate clauses is high and each evaluation is computationally intensive, confirming that the parallelisation is most effective for CPU‑bound evaluation workloads.
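As a rough consistency check, Amdahl's law relates the reported 4.5× speed-up on eight nodes to the fraction of runtime spent in the parallelised evaluation phase. The fraction computed below is inferred from that single figure, not stated in the paper, and ignores communication overhead, so it should be read as an upper-bound sketch only.

```python
def amdahl_speedup(p, n):
    """Amdahl's law: speed-up on n nodes when fraction p of the work is parallel."""
    return 1.0 / ((1.0 - p) + p / n)

# Solve for the parallel fraction implied by a 4.5x speed-up on 8 nodes:
#   1 / ((1 - p) + p/8) = 4.5   =>   p = (1 - 1/4.5) / (1 - 1/8)
p = (1 - 1 / 4.5) / (1 - 1 / 8)
print(round(p, 3))  # → 0.889
```

An implied parallel fraction of roughly 89% fits the paper's observation that the remaining sequential phases (hypothesis generation, refinement) cap the achievable speed-up as node counts grow.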
The paper also discusses several limitations. First, only the evaluation phase is parallelised; hypothesis generation, background‑knowledge loading, and refinement remain sequential, potentially becoming the new bottleneck for very large search spaces. Second, the static partitioning of examples can cause load imbalance if some chunks contain disproportionately many positive examples or more complex structures. Third, the reliance on MPI ties the implementation to environments where MPI is available, making deployment on Windows desktops or certain cloud platforms less straightforward. The authors suggest future work such as dynamic work‑stealing schedulers, hybrid data‑ and task‑parallel models, and integration with higher‑level distributed frameworks (e.g., Spark or Ray) to broaden applicability.
In conclusion, the study demonstrates that a relatively modest engineering effort—adding an MPI interface to YAP Prolog and restructuring Aleph’s evaluation loop—can yield substantial performance improvements for relational learning on medium‑to‑large datasets. The presented YAP‑MPI bridge constitutes a reusable component for other Prolog‑based systems seeking parallel execution. By extending parallelism to the hypothesis‑generation stage and addressing load‑balancing concerns, the approach could become a cornerstone for scalable ILP in domains such as bioinformatics, natural‑language processing, and knowledge‑graph construction, where both data volume and relational complexity are rapidly increasing.