Learning Potentials for Dynamic Matching and Application to Heart Transplantation

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv paper.

Each year, thousands of patients in need of heart transplants face life-threatening wait times due to organ scarcity. While allocation policies aim to maximize population-level outcomes, current approaches often fail to account for the dynamic arrival of organs and the composition of waitlisted candidates, thereby hampering efficiency. The United States is transitioning from rigid, rule-based allocation to more flexible data-driven models. In this paper, we propose a novel framework for non-myopic policy optimization in general online matching relying on potentials, a concept originally introduced for kidney exchange. We develop scalable and accurate ways of learning potentials that are higher-dimensional and more expressive than prior approaches. Our approach is a form of self-supervised imitation learning: the potentials are trained to mimic an omniscient algorithm that has perfect foresight. We focus on the application of heart transplant allocation and demonstrate, using real historical data, that our policies significantly outperform prior approaches – including the current US status quo policy and the proposed continuous distribution framework – in optimizing for population-level outcomes. Our analysis and methods come at a pivotal moment in US policy, as the current heart transplant allocation system is under review. We propose a scalable and theoretically grounded path toward more effective organ allocation.

💡 Research Summary

The paper tackles the pressing problem of allocating scarce heart transplants in a dynamic, online setting where donor organs arrive unpredictably and must be matched irrevocably to waiting patients. Existing U.S. policies, most notably the six‑tier hierarchical system and the newly proposed continuous distribution framework (Composite Allocation Score, CAS), are largely rule‑based or rely on a static linear combination of patient features. Such approaches neither capture complex, non‑linear interactions among donor and patient characteristics nor adequately consider the long‑term impact of each allocation decision on the composition of the waiting list.

To address these shortcomings, the authors introduce a novel framework that brings the concept of “potentials”—originally developed for kidney exchange—into general online bipartite matching. A potential is a scalar value assigned to each candidate match that reflects the expected long‑term utility of that match, effectively acting as a look‑ahead proxy without requiring explicit future forecasts. Unlike prior work that used low‑dimensional linear potentials, this paper learns high‑dimensional, non‑linear potential functions using deep neural networks.
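As a rough illustration of how a potential can drive an online allocation decision, the sketch below scores every waiting patient for an arriving donor and offers the organ to the highest-scoring one. The `Donor` and `Patient` fields, the `toy_potential` formula, and the `allocate` helper are all hypothetical stand-ins chosen for this example; in the paper the potential is a learned neural network, not a hand-written rule.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Donor:
    blood_type: str
    age: int


@dataclass(frozen=True)
class Patient:
    patient_id: str
    blood_type: str
    urgency: float  # hypothetical feature in [0, 1]


# A potential maps a candidate (donor, patient) match to a scalar score.
Potential = Callable[[Donor, Patient], float]


def toy_potential(donor: Donor, patient: Patient) -> float:
    """Hand-written placeholder; a learned model would replace this."""
    compatible = 1.0 if donor.blood_type == patient.blood_type else 0.0
    return compatible * (1.0 + patient.urgency) - 0.01 * abs(donor.age - 40)


def allocate(donor: Donor, waitlist: Sequence[Patient],
             potential: Potential) -> Optional[Patient]:
    """Return the waiting patient with the highest potential, if any."""
    if not waitlist:
        return None
    return max(waitlist, key=lambda p: potential(donor, p))


waitlist = [
    Patient("p1", "A", 0.2),
    Patient("p2", "O", 0.9),
    Patient("p3", "A", 0.7),
]
chosen = allocate(Donor("A", 40), waitlist, toy_potential)
print(chosen.patient_id)
```

Because the policy only needs one score per candidate, swapping the toy function for a trained model leaves the allocation loop unchanged.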

The learning process is framed as an offline, self‑supervised imitation‑learning problem. The authors construct an omniscient oracle that, given a complete historical trajectory of donor arrivals and patient evolutions, computes the hindsight‑optimal matching. For every donor arrival, the patient the oracle chooses is treated as a ranking label. The neural potential model is then trained to reproduce the oracle’s ranking by minimizing a learning‑to‑rank loss (e.g., pairwise hinge or listwise cross‑entropy). This sidesteps the sparse‑reward, high‑variance issues that plague reinforcement‑learning approaches in organ allocation.
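The two loss families named above can be written down for a single donor arrival as follows. This is a minimal sketch in plain Python, assuming the model has already produced one score per waiting candidate and that `oracle_index` marks the candidate the oracle picked; the function names and the `margin` parameter are illustrative, and the exact loss used in the paper may differ.

```python
import math
from typing import Sequence


def listwise_cross_entropy(scores: Sequence[float], oracle_index: int) -> float:
    """Negative log-probability of the oracle's choice under a softmax."""
    top = max(scores)  # subtract the max for numerical stability
    log_normalizer = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_normalizer - scores[oracle_index]


def pairwise_hinge(scores: Sequence[float], oracle_index: int,
                   margin: float = 1.0) -> float:
    """Penalize every candidate that is not beaten by at least `margin`."""
    chosen = scores[oracle_index]
    return sum(
        max(0.0, margin - (chosen - s))
        for i, s in enumerate(scores)
        if i != oracle_index
    )


scores = [3.0, 1.0, 2.5]  # hypothetical model scores for three candidates
print(listwise_cross_entropy(scores, oracle_index=0))
print(pairwise_hinge(scores, oracle_index=0))
```

Both losses shrink as the oracle's chosen candidate is scored further above the rest, which is the behavior the imitation objective relies on.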

Because real UNOS data are limited in temporal diversity, the authors devise a data‑augmentation pipeline. They randomly perturb donor and patient arrival times while preserving medically realistic state transitions (e.g., disease progression, delisting, death). This generates a rich set of synthetic yet plausible scenarios that improve generalization and reduce over‑fitting to the exact timestamps of the historical record.
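One way to picture this kind of augmentation is sketched below. It is an assumption-laden toy, not the authors' pipeline: each patient's clinical events are stored as offsets from the listing date, so jittering the listing date moves the whole history without reordering it. The record types, the status strings, and the uniform jitter are all hypothetical.

```python
import random
from dataclasses import dataclass, replace
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class PatientRecord:
    patient_id: str
    listed_at: float  # days since the start of the record
    # (days after listing, new state); offsets are never perturbed
    events: Tuple[Tuple[float, str], ...]


@dataclass(frozen=True)
class DonorRecord:
    donor_id: str
    arrived_at: float  # days since the start of the record


def perturb_trajectory(
    patients: Sequence[PatientRecord],
    donors: Sequence[DonorRecord],
    max_shift_days: float,
    rng: random.Random,
) -> Tuple[List[PatientRecord], List[DonorRecord]]:
    """Jitter arrival times while keeping each patient's event sequence."""

    def shift(t: float) -> float:
        # Clamp at zero so no arrival precedes the start of the record.
        return max(0.0, t + rng.uniform(-max_shift_days, max_shift_days))

    new_patients = [replace(p, listed_at=shift(p.listed_at)) for p in patients]
    new_donors = sorted(
        (replace(d, arrived_at=shift(d.arrived_at)) for d in donors),
        key=lambda d: d.arrived_at,
    )
    return new_patients, new_donors


patients = [
    PatientRecord("p1", 10.0, ((0.0, "status 4"), (30.0, "status 2"))),
    PatientRecord("p2", 1.0, ((0.0, "status 6"),)),
]
donors = [DonorRecord("d1", 5.0), DonorRecord("d2", 6.0)]
new_patients, new_donors = perturb_trajectory(
    patients, donors, max_shift_days=3.0, rng=random.Random(0)
)
```

Calling the function repeatedly with different seeds yields many plausible orderings of the same underlying patients and donors.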

The experimental evaluation uses UNOS heart‑transplant data spanning 2000‑2022, encompassing over 30 000 patients and 5 000 donors. The authors compare five policies: (1) the current six‑tier rule, (2) the baseline CAS with expert‑specified linear weights, (3) an optimized CAS where the linear weights are tuned via grid search, (4) a prior linear‑potential policy, and (5) the proposed non‑linear potential policy. The primary outcome is population‑level life‑years gained (PLYG), but they also report average wait time, pre‑transplant mortality, and fairness metrics (regional and blood‑type balance).

Results show that the non‑linear potential policy improves PLYG by 12‑18 % relative to the six‑tier system and by roughly 12 % relative to the baseline CAS. Even the optimized CAS, which already outperforms the expert‑specified version, falls short of the non‑linear potential approach. The new policy also reduces average wait time by about 2.3 days and lowers pre‑transplant mortality by 8 %. Regional and blood‑type equity improve modestly. For interpretability, the authors apply SHAP analysis to the learned potential network, which reveals the clinical features that most drive long‑term utility.

Theoretically, the paper proves that potentials act as a “pruning” mechanism: by ranking candidates according to their potential, the online algorithm implicitly preserves a favorable future pool composition, achieving near‑optimal performance (≈95 % of the omniscient oracle’s upper bound) while remaining computationally tractable for real‑time decision making.
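A tiny worked example can make the look-ahead effect concrete. The sketch below is a deliberately simplified toy, not the paper's construction: the pool is static, the potential is attached to each patient as a hypothetical stand-in for the value of keeping that patient available, and the hindsight optimum is found by brute force. The `utility` matrix and potential values are made up.

```python
from itertools import permutations
from typing import Sequence


def hindsight_optimum(utility: Sequence[Sequence[float]]) -> float:
    """Best total utility when all donors are known in advance.

    utility[d][p] is the value of giving donor d's organ to patient p.
    Brute force; assumes at least as many patients as donors.
    """
    n_donors, n_patients = len(utility), len(utility[0])
    return max(
        sum(utility[d][p] for d, p in enumerate(assignment))
        for assignment in permutations(range(n_patients), n_donors)
    )


def run_online(utility: Sequence[Sequence[float]],
               patient_potential: Sequence[float]) -> float:
    """Process donors in order; pick the patient maximizing utility minus potential."""
    available = set(range(len(utility[0])))
    total = 0.0
    for row in utility:
        if not available:
            break
        best = max(available, key=lambda p: row[p] - patient_potential[p])
        total += row[best]  # realized value excludes the potential adjustment
        available.remove(best)
    return total


utility = [[10, 9], [10, 1]]  # patient 0 is the only good match for donor 1
optimum = hindsight_optimum(utility)
myopic = run_online(utility, [0.0, 0.0])
with_potentials = run_online(utility, [2.0, 0.0])
print(myopic / optimum, with_potentials / optimum)
```

In this toy instance the myopic policy spends patient 0 on the first donor, while a small potential on patient 0 holds that patient back for the donor who needs them most and recovers the hindsight-optimal total.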

The authors discuss limitations, notably the reliance on a perfect‑information oracle that cannot exist in practice, and the need to approximate such hindsight optimality in live settings. They also note that the current model does not directly incorporate explicit fairness constraints or policy‑level equity objectives, suggesting future work on multi‑objective potential learning. Moreover, extending the framework to other organs (kidney, liver, lung) and to broader online matching domains (advertising, labor markets) is highlighted as a promising direction.

In conclusion, the paper delivers a data‑driven, theoretically grounded, and empirically validated solution for dynamic heart‑transplant allocation. By learning expressive, non‑linear potentials through self‑supervised imitation of an omniscient oracle, the proposed method substantially outperforms existing rule‑based and linear‑score systems, while offering interpretability and scalability. This work arrives at a critical juncture as U.S. policymakers reconsider the heart‑allocation framework, providing a concrete, performance‑driven alternative that could translate into measurable lives saved.
