AI-Driven Predictive Modelling for Groundwater Salinization in Israel

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Increasing salinity and contamination of groundwater are serious issues in many parts of the world, causing degradation of water resources. The aim of this work is to form a comprehensive understanding of the causal factors underlying groundwater salinization and to identify important meteorological, geological and anthropogenic drivers of salinity. We integrated different datasets of potential covariates to create a robust framework for machine-learning-based predictive models of groundwater salinity, including Random Forest (RF), XGBoost, neural network, Long Short-Term Memory (LSTM), convolutional neural network (CNN) and linear regression (LR). Additionally, Recursive Feature Elimination (RFE) followed by global sensitivity analysis (GSA) and Explainable AI (XAI) based SHapley Additive exPlanations (SHAP) were used to estimate importance scores and gain insight into the drivers of salinization. We also performed causality analysis via Double Machine Learning using various predictive models. From these analyses, key meteorological (precipitation, temperature), geological (distance from river, distance to saline body, topographic wetness index (TWI), shoreline distance), and anthropogenic (area of agricultural fields, treated wastewater) covariates are identified as influential drivers of groundwater salinity across Israel. XAI analysis also identified treated wastewater (TWW) as an essential anthropogenic driver of salinity, its significance being context-dependent but critical in vulnerable hydro-climatic environments. Our approach provides deeper insight into global salinization mechanisms at country scale, reducing AI model uncertainty and highlighting the need for tailored strategies to address salinity.


💡 Research Summary

The manuscript presents a comprehensive, data‑driven investigation of groundwater salinization across Israel, focusing on chloride concentration as the proxy for salinity. The authors assembled a high‑resolution, nation‑wide dataset that combines long‑term observations of groundwater chemistry with a broad suite of potential drivers grouped into climatic (e.g., annual precipitation, mean temperature), geological (e.g., distance to rivers, distance to saline bodies, topographic wetness index, shoreline distance), and anthropogenic (e.g., agricultural area, treated wastewater (TWW) discharge, groundwater extraction, population density) variables. Physical variables were treated as static, while climate and human‑impact variables were constructed as yearly time‑series, yielding a balanced panel for 2,500 monitoring wells spanning 2000‑2022.

Feature selection began with Recursive Feature Elimination (RFE) to prune the predictor space to the most informative subset. The reduced set was then examined with a Sobol‑Morris global sensitivity analysis (GSA) to quantify first‑order and interaction effects. Explainable‑AI (XAI) techniques, specifically SHapley Additive exPlanations (SHAP), were applied to each machine‑learning model to visualize variable importance, directionality, and local contributions. Notably, treated wastewater emerged as a context‑dependent driver whose influence varied spatially, underscoring the need for localized interpretation.
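The elimination loop behind RFE can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a plain least-squares linear model as the ranking estimator, synthetic data, and hypothetical feature names (`precipitation`, `temperature`, `tww_volume`, and two noise columns). scikit-learn's `sklearn.feature_selection.RFE` offers the same procedure for arbitrary estimators.

```python
import numpy as np

def rfe_linear(X, y, n_keep):
    """Recursive feature elimination with a least-squares linear model.

    Each round refits on the surviving standardized features and drops
    the one with the smallest absolute coefficient. Returns the indices
    of the surviving columns, in their original order.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Standardize so coefficient magnitudes are comparable across features.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        coef, *_ = np.linalg.lstsq(Xs[:, remaining], yc, rcond=None)
        weakest = int(np.argmin(np.abs(coef)))  # position within `remaining`
        remaining.pop(weakest)
    return remaining

# Synthetic demo: three informative columns and two pure-noise columns.
# All names and coefficients here are hypothetical, chosen for illustration.
rng = np.random.default_rng(0)
names = ["precipitation", "temperature", "tww_volume", "noise_a", "noise_b"]
X = rng.normal(size=(500, 5))
y = -2.0 * X[:, 0] + 1.5 * X[:, 1] + 1.0 * X[:, 2] + rng.normal(scale=0.5, size=500)

selected = rfe_linear(X, y, n_keep=3)
print([names[i] for i in selected])
```

Refitting after every removal is what distinguishes RFE from a one-shot ranking: a feature's coefficient can change once a correlated competitor has been dropped.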

Six regression models were trained and benchmarked: Linear Regression (LR), Random Forest (RF), XGBoost, Feed‑Forward Neural Network (FFNN), Long Short‑Term Memory (LSTM), and Convolutional Neural Network (CNN). Model performance was assessed via 5‑fold cross‑validation and an independent hold‑out set (20 % of data) using RMSE, MAE, and R². XGBoost and RF achieved the highest R² (~0.78) and lowest RMSE (≈12 µS/cm), while LSTM captured temporal dynamics but showed signs of over‑fitting due to limited temporal depth.
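The evaluation protocol described above (a 20 % hold-out set plus 5-fold cross-validation scored with RMSE, MAE, and R²) can be sketched as follows. The sketch assumes synthetic data and an ordinary least-squares model standing in for the six learners; the helper names are hypothetical and nothing here reproduces the reported scores.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R^2 for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
        "r2": 1.0 - ss_res / ss_tot,
    }

def fit_predict_ols(X_train, y_train, X_test):
    """Fit least squares with an intercept, then predict on X_test."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ beta

# Synthetic data (hypothetical): 400 rows, 3 predictors.
rng = np.random.default_rng(1)
n = 400
X = rng.normal(size=(n, 3))
y = 3.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.3, size=n)

# Reserve 20 % as an independent hold-out set.
perm = rng.permutation(n)
test_idx, dev_idx = perm[: n // 5], perm[n // 5:]

# 5-fold cross-validation on the remaining 80 %.
folds = np.array_split(rng.permutation(len(dev_idx)), 5)
cv_scores = []
for i, val in enumerate(folds):
    train = np.concatenate([f for j, f in enumerate(folds) if j != i])
    pred = fit_predict_ols(X[dev_idx[train]], y[dev_idx[train]], X[dev_idx[val]])
    cv_scores.append(regression_metrics(y[dev_idx[val]], pred))

# Final check: refit on all development rows, score on the hold-out set.
holdout = regression_metrics(
    y[test_idx], fit_predict_ols(X[dev_idx], y[dev_idx], X[test_idx])
)
print("mean CV R2:", np.mean([s["r2"] for s in cv_scores]))
print("hold-out:", holdout)
```

With repeated yearly observations per well, splitting by well rather than by row may be worth considering to limit leakage between folds; the summary does not state which scheme the authors used.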

To move beyond correlation, the authors employed Double Machine Learning (DML) for causal inference. In the first stage, flexible learners (RF, XGBoost) estimated the nuisance functions, predicting both salinity and the covariate of interest from the remaining high‑dimensional confounders; in the second stage, the salinity residuals were regressed on the residuals of that covariate to obtain a debiased causal effect estimate, and the procedure was repeated for each covariate. DML confirmed that reduced precipitation, higher temperatures, and increased TWW discharge each produce a statistically significant increase in groundwater salinity. Interaction terms revealed that the effect of TWW is amplified in regions with extensive agricultural land, indicating a synergistic anthropogenic pressure.
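The two-stage residual-on-residual idea can be sketched for a partially linear model y = θ·d + g(W) + ε. This is a minimal illustration under stated assumptions, not the authors' implementation: ordinary least squares is swapped in for the RF and XGBoost nuisance learners, the data are synthetic with a known effect of 0.8, and `dml_partial_linear` is a hypothetical helper name.

```python
import numpy as np

def ols_fit_predict(X_train, y_train, X_test):
    """Least squares with an intercept; stand-in for a flexible learner."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ beta

def dml_partial_linear(y, d, W, k=5, seed=0):
    """Cross-fitted partialling-out estimate of theta in y = theta*d + g(W) + e.

    Stage 1: predict y and d from the confounders W on held-out folds
    and keep the out-of-fold residuals.
    Stage 2: regress the y-residuals on the d-residuals.
    Returns (theta_hat, standard_error).
    """
    n = len(y)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), k)
    y_res, d_res = np.empty(n), np.empty(n)
    for i, test in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        y_res[test] = y[test] - ols_fit_predict(W[train], y[train], W[test])
        d_res[test] = d[test] - ols_fit_predict(W[train], d[train], W[test])
    theta = float(d_res @ y_res / (d_res @ d_res))
    # Plug-in standard error from the orthogonal score.
    psi = d_res * (y_res - theta * d_res)
    se = float(np.sqrt(np.mean(psi ** 2) / n) / np.mean(d_res ** 2))
    return theta, se

# Synthetic data (hypothetical): W confounds both treatment d and outcome y.
rng = np.random.default_rng(42)
n = 2000
W = rng.normal(size=(n, 4))
d = W @ np.array([1.0, -0.5, 0.0, 0.0]) + rng.normal(size=n)
y = 0.8 * d + W @ np.array([2.0, 1.0, 0.5, 0.0]) + rng.normal(size=n)

theta_hat, se = dml_partial_linear(y, d, W)
# Naive slope of y on d alone, ignoring the confounders, for comparison.
naive = float(np.cov(d, y)[0, 1] / np.var(d, ddof=1))
print("DML estimate:", theta_hat, "+/-", se)
print("naive slope:", naive)
```

Cross-fitting, where residuals for each fold come from learners trained on the other folds, keeps overfitting in the nuisance models from leaking into the effect estimate. That matters more with flexible learners such as RF or XGBoost than with the linear stand-in used here.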

The integrated framework yields several key insights:

  1. Climate change (drier, hotter conditions) is a primary natural driver of salinization.
  2. Proximity to saline water bodies, high topographic wetness, and shoreline distance modulate baseline salinity levels.
  3. Anthropogenic pressures—especially treated wastewater reuse and intensive agriculture—exert strong, spatially variable influence, often acting as effect modifiers.

The authors discuss practical implications for water‑resource managers: targeted reduction of TWW inputs in high‑agriculture zones, adaptive irrigation practices, and enhanced monitoring in hydro‑geologically vulnerable basins. Limitations include reliance on annual aggregates, which may mask extreme events (e.g., drought spikes), and the absence of high‑frequency spatial interaction modeling. Future work is suggested to incorporate monthly or weekly data, explore graph‑neural‑network architectures for explicit spatial coupling, and conduct scenario‑based policy simulations.

Overall, the paper makes a methodological contribution by coupling RFE‑GSA‑SHAP for transparent feature importance with DML for causal effect estimation, thereby converting black‑box predictions into actionable scientific knowledge. The approach is scalable to other arid and semi‑arid regions facing groundwater salinization, offering a template for integrating AI, explainability, and causal inference in environmental decision‑making.

