Testing the AgreementMaker System in the Anatomy Task of OAEI 2012

The AgreementMaker system was the leading system in the anatomy task of the Ontology Alignment Evaluation Initiative (OAEI) competition in 2011. While AgreementMaker did not compete in OAEI 2012, here we report on its performance in the 2012 anatomy task, using the same configurations of AgreementMaker submitted to OAEI 2011. We also test AgreementMaker using an updated version of the UBERON ontology as the mediating ontology, with otherwise identical configurations. AgreementMaker achieved an F-measure of 91.8% with the 2011 configurations, and an F-measure of 92.2% with the updated UBERON ontology. Thus, had it competed in the anatomy task of OAEI 2012, AgreementMaker would have been the second-best system, with an F-measure only 0.1 percentage points below that of the best system.

💡 Research Summary

The paper revisits the performance of the AgreementMaker system on the anatomy track of the 2012 Ontology Alignment Evaluation Initiative (OAEI), even though the system did not compete that year. The authors reuse the exact configuration that secured the top position in the 2011 anatomy task and apply it unchanged to the 2012 dataset. In a second experiment, the only change is the replacement of the mediating ontology, UBERON, with its most recent release; all other parameters stay identical.

AgreementMaker’s architecture combines three complementary matching strategies: lexical similarity (string‑based), structural similarity (graph‑based), and semantic similarity (ontology‑based). The mediating ontology, UBERON, serves as a bridge that enriches the semantic layer by providing cross‑species anatomical relationships. In the first experiment (the 2011 configuration), the system achieved a precision of 92.5 % and a recall of 91.1 %, resulting in an F‑measure of 91.8 %. When the updated UBERON version was employed, precision rose slightly to 92.8 % and recall to 91.6 %, yielding an F‑measure of 92.2 %.
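The reported F‑measures can be checked against the quoted precision and recall. The sketch below assumes the F‑measure is the balanced harmonic mean of precision and recall (F1); the function name `f_measure` is introduced here for illustration and is not taken from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the summary, in percent.
runs = {
    "2011 configuration": (92.5, 91.1),
    "updated UBERON": (92.8, 91.6),
}

for name, (p, r) in runs.items():
    print(f"{name}: F-measure = {f_measure(p, r):.1f} %")
# 2011 configuration: F-measure = 91.8 %
# updated UBERON: F-measure = 92.2 %
```

Both values agree with the figures reported for the two experiments after rounding to one decimal place.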

These figures place AgreementMaker in second position for the 2012 anatomy task, only 0.1 percentage points behind the best‑performing system (which recorded an F‑measure of 92.3 %). The modest gain obtained with the newer UBERON suggests that recently curated anatomical concepts and relationships capture a few extra correct correspondences that the older version missed. However, the overall improvement is small, which indicates that the core matching pipeline is already operating near its performance ceiling.
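How a richer mediating ontology can recover extra correspondences may be easier to see in a toy example. The sketch below is not AgreementMaker's algorithm: it assumes a simple scheme in which source and target concepts are each anchored to a mediator concept by exact label match, and two concepts are aligned when they share an anchor. All concept identifiers, labels, and function names are invented for illustration.

```python
Ontology = dict[str, set[str]]  # concept id -> labels and synonyms


def normalize(label: str) -> str:
    """Lower-case a label and unify underscores and whitespace."""
    return " ".join(label.lower().replace("_", " ").split())


def anchor_to_mediator(ontology: Ontology, mediator: Ontology) -> dict[str, str]:
    """Map each concept to the first mediator concept sharing one of its labels."""
    label_index = {
        normalize(label): mid
        for mid, labels in mediator.items()
        for label in labels
    }
    anchors = {}
    for cid, labels in ontology.items():
        for label in sorted(labels):  # sorted for deterministic behaviour
            mid = label_index.get(normalize(label))
            if mid is not None:
                anchors[cid] = mid
                break
    return anchors


def mediated_match(source: Ontology, target: Ontology, mediator: Ontology) -> set[tuple[str, str]]:
    """Align source and target concepts that anchor to the same mediator concept."""
    src = anchor_to_mediator(source, mediator)
    tgt = anchor_to_mediator(target, mediator)
    return {(s, t) for s, ms in src.items() for t, mt in tgt.items() if ms == mt}


# Hypothetical toy data; none of these ids or synonyms come from the paper.
source = {"S:1": {"hindlimb"}, "S:2": {"heart"}}
target = {"T:1": {"Lower_Extremity"}, "T:2": {"Heart"}}
old_mediator = {"M:1": {"heart"}, "M:2": {"hindlimb"}}
new_mediator = {"M:1": {"heart"}, "M:2": {"hindlimb", "lower extremity"}}

old_alignment = mediated_match(source, target, old_mediator)
new_alignment = mediated_match(source, target, new_mediator)
print(sorted(old_alignment))  # [('S:2', 'T:2')]
print(sorted(new_alignment))  # [('S:1', 'T:1'), ('S:2', 'T:2')]
```

In this toy setting a single added synonym in the mediator yields one more correspondence, which mirrors the kind of small recall gain described above.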

A deeper analysis reveals that AgreementMaker tends to prioritize precision over recall, employing relatively conservative filtering of candidate mappings. This design choice minimizes false positives but can suppress recall, especially when novel terms appear in the target ontologies. The updated UBERON mitigates this effect to a small extent by providing richer semantic context, but the fundamental trade‑off remains.
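The precision and recall trade‑off introduced by conservative filtering can be illustrated with a minimal sketch. It assumes a plain similarity threshold applied to scored candidate mappings, which is a simplification and not AgreementMaker's actual selection procedure; the candidate scores, the reference alignment, and the `evaluate` helper are all invented for this example.

```python
Pair = tuple[str, str]


def evaluate(candidates: dict[Pair, float], reference: set[Pair], threshold: float) -> tuple[float, float]:
    """Keep candidates scoring at or above the threshold; return (precision, recall)."""
    selected = {pair for pair, score in candidates.items() if score >= threshold}
    true_positives = len(selected & reference)
    # By convention here, precision is 0.0 when nothing is selected.
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical candidate mappings with similarity scores.
candidates = {
    ("a", "x"): 0.95,
    ("b", "y"): 0.80,
    ("c", "z"): 0.60,
    ("d", "w"): 0.55,  # incorrect
    ("e", "v"): 0.40,
    ("g", "u"): 0.35,  # incorrect
    ("h", "t"): 0.30,  # incorrect
}
reference = {("a", "x"), ("b", "y"), ("c", "z"), ("e", "v")}

for threshold in (0.9, 0.5, 0.3):
    precision, recall = evaluate(candidates, reference, threshold)
    print(f"threshold={threshold:.1f}  precision={precision:.2f}  recall={recall:.2f}")
# threshold=0.9  precision=1.00  recall=0.25
# threshold=0.5  precision=0.75  recall=0.75
# threshold=0.3  precision=0.57  recall=1.00
```

A high threshold keeps only the safest mappings, so precision is high and recall is low; lowering it admits more correct mappings along with more false positives.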

The authors also discuss why AgreementMaker did not compete in OAEI 2012. They suggest that the decision was strategic rather than technical, and that the present re‑evaluation confirms the system’s continued competitiveness. Consequently, a future re‑entry into the competition would likely secure a top‑three placement without substantial redesign.

Beyond the immediate results, the study underscores the importance of ontology version management for ontology‑matching systems. Regularly incorporating updated reference ontologies can yield incremental performance gains and keep the system aligned with the evolving biomedical knowledge landscape. The paper recommends future work on dynamic adjustment of filtering thresholds, exploration of additional mediating ontologies, and more sophisticated integration of semantic embeddings to further balance precision and recall. Such enhancements could push AgreementMaker’s F‑measure beyond the current plateau and reinforce its status as a leading solution for large‑scale anatomical ontology alignment.