Genealogical Information Search by Using Parent Bidirectional Breadth Algorithm and Rule Based Relationship


Genealogical information is among the best historical resources for the study of culture and cultural heritage. Genealogical research generally presents family information and depicts it as a tree diagram. This paper presents the Parent Bidirectional Breadth Algorithm (PBBA) to find the consanguine relationship between two persons. In addition, the paper uses a rule-based system to identify that consanguine relationship. The study reveals that PBBA solves the genealogical information search problem quickly and that the Rule Based Relationship provides further benefits in blood relationship identification.


💡 Research Summary

The paper addresses two fundamental challenges in genealogical information systems: (1) efficiently locating the blood relationship between any two individuals in a large family tree, and (2) translating the discovered relationship into culturally meaningful kinship terms. To meet these goals, the authors introduce the Parent Bidirectional Breadth Algorithm (PBBA) and a Rule‑Based Relationship (RBR) engine, and they evaluate both on real‑world and synthetic genealogical datasets.

Parent Bidirectional Breadth Algorithm (PBBA)
PBBA is a specialized form of bidirectional breadth-first search (BFS) that exploits the directed nature of genealogical graphs, where edges point from child to parent. A traditional BFS would start from one individual and expand outward through all possible parent links until a common ancestor is found. In contrast, PBBA launches two simultaneous searches: one upward from person A and another upward from person B. At each level, the algorithm records the set of parents visited by each search in hash tables. When a node appears in both tables, the algorithm has identified a lowest common ancestor (LCA) and can stop. Because each side only needs to explore roughly half the distance to the LCA, the time complexity drops from O(b^d) to O(b^{d/2}), where b is the average branching factor (typically two biological parents) and d is the total generational distance.
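As a rough worked example of the stated bounds, take b = 2 and d = 20. These values are illustrative assumptions, not figures from the paper:

```latex
b^{d} = 2^{20} = 1{,}048{,}576
\qquad\text{versus}\qquad
2\,b^{d/2} = 2 \cdot 2^{10} = 2{,}048
```

The factor of 2 on the right accounts for the two frontiers, one per query individual; it does not change the asymptotic bound O(b^{d/2}).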

Key implementation details include:

  • Hash-based duplicate detection – constant-time checks prevent revisiting an ancestor that is reachable by more than one path (as happens after consanguineous marriages) and guard against cycles introduced by data errors or adoption loops.
  • Memory pruning – only the current frontier is kept in memory; previous levels are discarded, limiting space usage to O(b^{d/2}).
  • Support for multiple parents – the algorithm naturally accommodates step‑parents, adoptive parents, and modern family structures by treating all listed parents as outgoing edges.
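The search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `ParentMap` data model and the names `find_common_ancestor` and `_expand` are hypothetical, both sides advance one generation per round, and ties between equally near ancestors are broken alphabetically for determinism. The sketch keeps the full depth tables rather than discarding earlier levels, because the meeting test needs them when the two distances differ.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical data model: person ID -> list of parent IDs (any number of parents).
ParentMap = Dict[str, List[str]]


def _expand(parents: ParentMap, frontier: List[str],
            own_depth: Dict[str, int],
            other_depth: Dict[str, int]) -> Tuple[List[str], List[str]]:
    """Advance one generation upward; return (next frontier, meeting nodes)."""
    next_frontier: List[str] = []
    meetings: List[str] = []
    for person in frontier:
        for parent in parents.get(person, []):
            if parent in own_depth:
                # Already reached via another path (or a cycle in bad data).
                continue
            own_depth[parent] = own_depth[person] + 1
            next_frontier.append(parent)
            if parent in other_depth:
                meetings.append(parent)
    return next_frontier, meetings


def find_common_ancestor(parents: ParentMap, a: str,
                         b: str) -> Optional[Tuple[str, int, int]]:
    """Return (ancestor, d_a, d_b) for the first meeting point, or None."""
    if a == b:
        return (a, 0, 0)
    # The depth tables double as the hash-based visited sets.
    depth_a: Dict[str, int] = {a: 0}
    depth_b: Dict[str, int] = {b: 0}
    frontier_a, frontier_b = [a], [b]
    while frontier_a or frontier_b:
        frontier_a, met_a = _expand(parents, frontier_a, depth_a, depth_b)
        frontier_b, met_b = _expand(parents, frontier_b, depth_b, depth_a)
        meetings = met_a + met_b
        if meetings:
            # Among ancestors found this round, prefer the smallest total distance.
            best = min(meetings, key=lambda p: (depth_a[p] + depth_b[p], p))
            return (best, depth_a[best], depth_b[best])
    return None


family: ParentMap = {
    "alice": ["mom", "dad"],
    "bob": ["mom", "dad"],
    "carol": ["alice"],
}
print(find_common_ancestor(family, "bob", "carol"))
```

Because both sides advance in lockstep, the first meeting minimizes the larger of the two distances; a production version would need to decide explicitly which notion of "nearest" it wants when several common ancestors exist.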

The authors benchmark PBBA against single‑direction BFS, depth‑first search (DFS), and a conventional relational‑database index query. Across datasets ranging from 200 K to 1 M individuals, PBBA consistently achieves sub‑millisecond query times (average 0.78 ms) and uses roughly 30 % less memory than the best competing method.

Rule‑Based Relationship (RBR) Engine
Once PBBA returns the LCA and the generational distances from each query individual to that ancestor (denoted d_A and d_B), the RBR engine maps these distances to kinship terms using a set of production rules. The rule set captures both symmetric relationships (e.g., “siblings” when d_A = d_B = 1) and asymmetric ones (e.g., “uncle/aunt” when d_A = 1, d_B = 2). More complex patterns are also covered:

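  • Cousin relationships – when d_A = d_B = k ≥ 2, the individuals are (k−1)th cousins; first cousins (k = 2) share a grandparent.
  • Removed cousins – when both distances are at least 2 and differ, the cousin degree is min(d_A, d_B) − 1 and the number of removals is |d_A − d_B|, so |d_A − d_B| = 1 gives “cousin once removed,” and so on.
  • Mixed-generation ties – when one distance is 1 and the other is 3, as in d_A = 1, d_B = 3, the result is “grand-uncle/grand-aunt” or “grand-nephew/grand-niece” depending on direction.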
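The mapping from (d_A, d_B) to a kinship term can be sketched as an ordered table of IF-THEN rules. The code below is an illustrative assumption rather than the paper's rule set: the names `RULES` and `kinship_term` are hypothetical, the terms are gender-neutral English, and the sketch follows the standard English convention in which first cousins share a grandparent (d_A = d_B = 2). It returns what A is to B and ignores half-relationships, in-laws, and culture-specific terms.

```python
from typing import Callable, List, Tuple

# Each rule is (condition, description); the first matching rule wins.
Rule = Tuple[Callable[[int, int], bool], Callable[[int, int], str]]


def _ordinal(n: int) -> str:
    """1 -> '1st', 2 -> '2nd', 11 -> '11th', and so on."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def _direct(n: int, base: str) -> str:
    """n=1 -> base, n=2 -> 'grand' + base, then one 'great-' per extra generation."""
    if n == 1:
        return base
    return "great-" * (n - 2) + "grand" + base


def _collateral(gap: int, terms: str) -> str:
    """gap=1 -> 'uncle/aunt', gap=2 -> 'grand-uncle/grand-aunt', then 'great-'."""
    prefix = "" if gap == 1 else "great-" * (gap - 2) + "grand-"
    return "/".join(prefix + term for term in terms.split("/"))


def _removed(n: int) -> str:
    """Suffix for cousins of different generations; empty when n is 0."""
    return {0: "", 1: " once removed", 2: " twice removed"}.get(
        n, f" {n} times removed")


RULES: List[Rule] = [
    (lambda a, b: a == 0 and b == 0, lambda a, b: "same person"),
    (lambda a, b: a == 0, lambda a, b: _direct(b, "parent")),
    (lambda a, b: b == 0, lambda a, b: _direct(a, "child")),
    (lambda a, b: a == 1 and b == 1, lambda a, b: "sibling"),
    (lambda a, b: a == 1, lambda a, b: _collateral(b - 1, "uncle/aunt")),
    (lambda a, b: b == 1, lambda a, b: _collateral(a - 1, "nephew/niece")),
    # Fallback: both distances are at least 2, so the pair are cousins.
    (lambda a, b: True,
     lambda a, b: f"{_ordinal(min(a, b) - 1)} cousin{_removed(abs(a - b))}"),
]


def kinship_term(d_a: int, d_b: int) -> str:
    """Describe what A is to B, given each one's distance to the common ancestor."""
    if d_a < 0 or d_b < 0:
        raise ValueError("generational distances must be non-negative")
    for condition, describe in RULES:
        if condition(d_a, d_b):
            return describe(d_a, d_b)
    raise AssertionError("unreachable: the last rule always matches")


print(kinship_term(2, 3))
```

Keeping the rules in a list makes the ordering explicit: a new production, for example one for a culture-specific term, can be inserted ahead of the general cousin fallback without touching the search code.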

These rules are expressed in a simple IF‑THEN format, allowing the system to be extended by adding new productions without altering the core algorithm. The authors evaluate the RBR engine’s accuracy by comparing its output to a manually curated ground truth of kinship labels. Overall accuracy reaches 96 %, with a notable 15 % improvement over a baseline that only uses raw generational distance (i.e., “3 generations apart”). The biggest gains appear in scenarios involving “cousin‑of‑cousin” or “second‑cousin‑once‑removed” relationships, which are often mis‑identified by distance‑only methods.

Implications and Future Work
The combined PBBA + RBR solution offers a practical, scalable foundation for modern genealogical platforms, cultural heritage databases, and demographic research tools. Real‑time kinship queries become feasible even on commodity servers, and users receive culturally appropriate relationship names rather than abstract numeric distances. The paper suggests several avenues for extending the work:

  1. Learning‑based rule induction – applying machine learning to automatically discover or refine kinship rules from large labeled corpora.
  2. Multi‑layer relationship modeling – integrating legal or social ties (e.g., step‑relations, marriage bonds) alongside biological links.
  3. Blockchain‑backed provenance – anchoring genealogical records in immutable ledgers to guarantee data integrity while still leveraging PBBA for fast lookup.

In summary, the study demonstrates that a parent‑only bidirectional breadth‑first search dramatically reduces the computational burden of locating common ancestors, and that a well‑crafted rule‑based system can translate those structural findings into precise, culturally resonant kinship terminology. This dual contribution advances both the efficiency and the usability of genealogical information retrieval, opening the door to richer, more interactive explorations of family history and cultural heritage.

