Application and Validation of Geospatial Foundation Model Data for the Prediction of Health Facility Programmatic Outputs - A Case Study in Malawi

The reliability of routine health data in low- and middle-income countries (LMICs) is often constrained by reporting delays and incomplete coverage, necessitating the exploration of novel data sources and analytics. Geospatial Foundation Models (GeoFMs) offer a promising avenue by synthesizing diverse spatial, temporal, and behavioral data into mathematical embeddings that can be used efficiently for downstream prediction tasks. This study evaluated the predictive performance of three GeoFM embedding sources - Google Population Dynamics Foundation Model (PDFM), Google AlphaEarth (derived from satellite imagery), and mobile phone call detail records (CDR) - for modeling 15 routine health programmatic outputs in Malawi, and compared their utility to traditional geospatial interpolation methods. We fit XGBoost models to data from 552 health catchment areas (January 2021 to May 2023), using an 80/20 train/test split with 5-fold cross-validation on the training portion, and assessed performance with R². While predictive performance was mixed, the embedding-based approaches improved upon baseline geostatistical methods for 13 of the 15 (87%) indicators tested. A Multi-GeoFM model integrating all three embedding sources produced the most robust predictions, achieving mean 5-fold cross-validated R² values of 0.63 for population density, 0.57 for new HIV cases, and 0.47 for child vaccinations, with test-set R² values of 0.64, 0.68, and 0.55, respectively. Performance was poor for targets with low primary data availability, such as TB and malnutrition cases. These results demonstrate that GeoFM embeddings provide a modest predictive improvement for select health and demographic outcomes in an LMIC context. We conclude that the integration of multiple GeoFM sources is an efficient and valuable tool for supplementing and strengthening constrained routine health information systems.

💡 Research Summary

This paper investigates whether geospatial foundation model (GeoFM) embeddings can improve the prediction of routine health program outputs in a low‑resource setting, using Malawi as a case study. The authors assembled a dataset covering 552 health catchment areas from January 2021 to May 2023, comprising 15 health‑related indicators such as population density, new HIV cases, child vaccination counts, tuberculosis notifications, and malnutrition cases. Three distinct GeoFM sources were evaluated: (1) Google’s Population Dynamics Foundation Model (PDFM), which encodes demographic and mobility patterns; (2) Google AlphaEarth, a satellite‑imagery‑derived model that captures land‑use, infrastructure, and environmental features; and (3) mobile phone call detail records (CDR) that reflect human movement and communication behavior. Each source was transformed into high‑dimensional vector embeddings and fed into gradient‑boosted decision tree models (XGBoost).
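As a rough illustration of how embeddings from several sources can be assembled into one feature row per catchment before being passed to a tree model, the sketch below joins per-source vectors by catchment ID. The catchment IDs, vector lengths, and the choice to drop catchments missing from any source are assumptions made for the example, not details reported by the paper.

```python
def build_feature_rows(sources):
    """Join per-source embeddings into one flat feature row per catchment.

    `sources` maps a source name to {catchment_id: embedding_vector}.
    Catchments missing from any source are dropped (an assumption of this
    sketch; the paper does not say how gaps were handled).
    """
    names = sorted(sources)  # fixed source order so columns line up across rows
    shared_ids = set.intersection(*(set(sources[name]) for name in names))
    return {
        cid: [value for name in names for value in sources[name][cid]]
        for cid in sorted(shared_ids)
    }


# Toy embeddings with hypothetical IDs and tiny dimensions.
sources = {
    "pdfm": {"A": [0.1, 0.2], "B": [0.3, 0.4]},
    "alphaearth": {"A": [1.0, 1.1, 1.2], "B": [1.3, 1.4, 1.5]},
    "cdr": {"A": [9.0], "C": [9.9]},
}

rows = build_feature_rows(sources)
print(rows)
```

Only catchment "A" appears in all three toy sources, so it is the only row produced; in practice the coverage of each source would determine how many of the 552 catchments survive such a join.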

Model development followed an 80/20 train-test split, with five-fold cross-validation applied within the training portion to guard against over-fitting. Predictive performance was measured using the coefficient of determination (R²) and compared against a baseline geostatistical interpolation approach (e.g., ordinary kriging). The results show that embedding-based models outperformed the baseline on 13 of the 15 indicators (87%). Notably, the "Multi-GeoFM" model, which concatenates PDFM, AlphaEarth, and CDR vectors, produced the most robust predictions, with mean cross-validated R² values of 0.63 for population density, 0.57 for new HIV cases, and 0.47 for child vaccinations. On the held-out test set, the same model recorded R² = 0.64, 0.68, and 0.55 for those three outcomes respectively, indicating that performance on these indicators held up out of sample.
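The evaluation protocol described above (an 80/20 split, five-fold cross-validation inside the training portion, and R² as the metric) can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: the data are synthetic, and a one-feature least-squares line stands in for the XGBoost regressor used in the study; only the catchment count of 552 comes from the paper.

```python
import random
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def split_indices(n, test_frac=0.2, seed=0):
    """Shuffle 0..n-1 and return (train_idx, test_idx)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_frac)
    return idx[n_test:], idx[:n_test]


def k_fold(train_idx, k=5):
    """Yield (fit_idx, val_idx) pairs; each fold is the validation set once."""
    folds = [train_idx[i::k] for i in range(k)]
    for i, val in enumerate(folds):
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, val


def fit_line(xs, ys):
    """Stand-in model (NOT XGBoost): ordinary least squares on one feature."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


# Synthetic stand-in data: one feature and one target per catchment.
rng = random.Random(42)
N_CATCHMENTS = 552
x = [rng.uniform(0, 10) for _ in range(N_CATCHMENTS)]
y = [2.0 * xi + rng.gauss(0, 0.5) for xi in x]

train_idx, test_idx = split_indices(N_CATCHMENTS)

# Five-fold cross-validation uses the training portion only.
cv_scores = []
for fit_idx, val_idx in k_fold(train_idx):
    model = fit_line([x[i] for i in fit_idx], [y[i] for i in fit_idx])
    preds = [model(x[i]) for i in val_idx]
    cv_scores.append(r2_score([y[i] for i in val_idx], preds))

# Refit on the full training portion, then score once on the held-out 20%.
final_model = fit_line([x[i] for i in train_idx], [y[i] for i in train_idx])
test_r2 = r2_score([y[i] for i in test_idx], [final_model(x[i]) for i in test_idx])

print("mean CV R2:", mean(cv_scores))
print("test R2:", test_r2)
```

Keeping the test indices out of every cross-validation fold is what makes the test-set R² an out-of-sample figure. Note that R² is zero for a model that always predicts the mean and can go negative for worse models, which is useful context when reading low scores for sparse indicators.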

Indicators with sparse primary data, specifically tuberculosis and malnutrition, exhibited poor predictive performance (R² < 0.2). This underscores that models built on GeoFM embeddings still depend on the quality and coverage of the underlying health reporting system, which supplies the targets they are trained on. The study also highlights methodological considerations. The PDFM and AlphaEarth embeddings are generated by proprietary Google models, which makes the feature space a black box. And while XGBoost captures non-linear relationships effectively, interpreting the contribution of individual embedding dimensions remains challenging.

The authors argue that the integration of multiple GeoFM sources provides complementary spatial, temporal, and behavioral information, leading to modest gains over traditional geostatistical methods on most indicators. They acknowledge limitations, including reliance on external proprietary models, the relatively short study window (29 months, January 2021 to May 2023), and the lack of explainability tools for policy makers. Future work is suggested to (i) fine-tune embeddings with locally collected data to improve relevance, (ii) incorporate explainable AI techniques such as SHAP values to make model outputs actionable for health planners, and (iii) expand the data horizon to capture longer-term trends and evaluate intervention impacts.

In conclusion, the paper demonstrates that GeoFM embeddings can serve as an efficient supplement to constrained routine health information systems in LMICs, especially when multiple embedding streams are combined. While not a panacea for all data gaps, this approach offers a scalable pathway to enhance health surveillance, resource allocation, and program monitoring in settings where traditional data collection is delayed or incomplete.
