Measuring Nonlinear Relationships and Spatial Heterogeneity of Influencing Factors on Traffic Crash Density Using GeoXAI

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv paper.

This study applies a Geospatial Explainable AI (GeoXAI) framework to analyze the spatially heterogeneous and nonlinear determinants of traffic crash density in Florida. By combining a high-performing machine learning model with GeoShapley, the framework provides interpretable, tract-level insights into how roadway characteristics and socioeconomic factors contribute to crash risk. Specifically, results show that variables such as road density, intersection density, neighborhood compactness, and educational attainment exhibit complex nonlinear relationships with crashes. Extremely dense urban areas, such as Miami, show sharply elevated crash risk due to intensified pedestrian activity and roadway complexity. The GeoShapley approach also captures strong spatial heterogeneity in the influence of these factors. Major metropolitan areas including Miami, Orlando, Tampa, and Jacksonville display significantly higher intrinsic crash contributions, while rural tracts generally have lower baseline risk. Each factor exhibits pronounced spatial variation across the state. Based on these findings, the study proposes targeted, geography-sensitive policy recommendations, including traffic calming in compact neighborhoods, adaptive intersection design, speed management on high-volume corridors such as I-95 in Miami, and equity-focused safety interventions in disadvantaged rural areas of central and northern Florida. Moreover, the paper compares the results obtained from the GeoShapley framework against other established methods (e.g., SHAP and MGWR), demonstrating its ability to explain nonlinearity and spatial heterogeneity simultaneously.


💡 Research Summary

This paper introduces a novel Geospatial Explainable AI (GeoXAI) framework that simultaneously captures nonlinear relationships and spatial heterogeneity in traffic‑crash density across the state of Florida. The authors first assemble a comprehensive dataset at the census‑tract level, combining two years (2023‑2024) of crash records from the Florida Department of Transportation’s Signal Four Analytics system, detailed roadway network and traffic‑volume attributes from the FDOT GIS database, and a suite of socioeconomic variables derived from the U.S. Census. Crash density (crashes per square kilometre) serves as the dependent variable, while predictors include road‑segment density, intersection density, neighborhood compactness, educational attainment, average annual daily traffic (AADT), truck‑percentage, and other built‑environment factors.
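As a minimal sketch of the dependent variable described above, crash density can be computed as crash count divided by tract area in square kilometres. The tract IDs, counts, and areas below are hypothetical placeholders, not values from the study, and the function name is illustrative only.

```python
# Hypothetical tract records: (tract_id, crash_count, land_area_in_square_metres).
# These values are made up for illustration and do not come from the paper.
tracts = [
    ("T001", 120, 2_500_000.0),
    ("T002", 15, 40_000_000.0),
    ("T003", 0, 1_000_000.0),
]

SQ_M_PER_SQ_KM = 1_000_000.0


def crash_density(crash_count, area_m2):
    """Return crashes per square kilometre for a single tract."""
    if area_m2 <= 0:
        raise ValueError("tract area must be positive")
    return crash_count / (area_m2 / SQ_M_PER_SQ_KM)


# Map each tract ID to its crash density (crashes per km^2).
densities = {tid: crash_density(n, area) for tid, n, area in tracts}

for tid, d in densities.items():
    print(f"{tid}: {d:.3f} crashes per km^2")
```

Normalizing by area keeps small urban tracts and large rural tracts comparable, which is presumably why density rather than raw count is used as the target.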

To avoid the labor‑intensive model‑selection process, the study employs an automated machine‑learning (AutoML) pipeline that evaluates a range of algorithms (random forests, XGBoost, LightGBM, etc.) and automatically selects the best‑performing model based on cross‑validated error metrics. The chosen model, a gradient‑boosted decision tree, is then interpreted using GeoShapley, a game‑theoretic extension of SHAP that treats geographic location as an explicit feature. GeoShapley therefore yields (1) the marginal contribution of each predictor to the model’s output, (2) the functional (often nonlinear) shape of each predictor’s effect, and (3) location‑specific interaction terms that quantify how the effect of a predictor varies across space.
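The idea of treating location as a feature can be illustrated with a small, exact Shapley computation in which the two coordinates form a single joint player. This is a simplified sketch under stated assumptions, not the GeoShapley library or the authors' implementation: the model, feature names, and data are hypothetical, and the sketch computes only per-player contributions, omitting the separate location-by-feature interaction terms that GeoShapley reports.

```python
from itertools import combinations
from math import factorial


def toy_model(row):
    """Hypothetical stand-in for a fitted model; row = (lon, lat, road, edu)."""
    lon, lat, road, edu = row
    urban = 1.0 if (lon > 0.5 and lat > 0.5) else 0.0  # crude location effect
    # Nonlinear road term, location-by-road interaction, linear education term.
    return 2.0 * urban + road ** 2 + 1.5 * urban * road - 0.5 * edu


# Each player maps to the column indices it controls.
# "GEO" groups longitude and latitude so they enter coalitions together.
PLAYERS = {"GEO": (0, 1), "road_density": (2,), "education": (3,)}


def coalition_value(model, x, background, coalition):
    """Mean prediction with coalition columns fixed to x, others from background."""
    fixed = {i for player in coalition for i in PLAYERS[player]}
    total = 0.0
    for b in background:
        mixed = tuple(x[i] if i in fixed else b[i] for i in range(len(x)))
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values by enumerating all coalitions of the players."""
    names = list(PLAYERS)
    n = len(names)
    phi = {}
    for p in names:
        others = [q for q in names if q != p]
        value = 0.0
        for k in range(len(others) + 1):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_p = coalition_value(model, x, background, subset + (p,))
                without_p = coalition_value(model, x, background, subset)
                value += weight * (with_p - without_p)
        phi[p] = value
    return phi


# Hypothetical background sample and one observation to explain.
background = [
    (0.1, 0.2, 0.3, 0.4),
    (0.7, 0.8, 0.5, 0.6),
    (0.4, 0.9, 0.9, 0.2),
]
x = (0.8, 0.9, 0.6, 0.9)

phi = shapley_values(toy_model, x, background)
baseline = sum(toy_model(b) for b in background) / len(background)
print(phi)
print("prediction - baseline:", toy_model(x) - baseline)
```

The contributions sum to the difference between the prediction for `x` and the mean background prediction, which is the efficiency property that makes Shapley-based explanations additive. Exact enumeration is feasible here only because there are three players; practical tools rely on sampling or kernel approximations.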

Key empirical findings are as follows:

  1. Nonlinear effects – Road density and intersection density display U‑shaped curves: risk declines up to a moderate density (reflecting reduced exposure and better network redundancy) but rises sharply beyond a threshold where congestion, pedestrian‑vehicle conflicts, and network complexity dominate. This threshold is markedly lower in dense urban cores such as Miami, where pedestrian activity and mixed‑use streets amplify risk.

  2. Spatial heterogeneity – GeoShapley’s tract‑level intrinsic contribution scores reveal that major metros (Miami, Orlando, Tampa, Jacksonville) possess substantially higher baseline crash propensity than the state average, whereas rural tracts in northern and central Florida show lower baseline risk. However, within rural areas, tracts with high socioeconomic vulnerability exhibit elevated contributions, indicating that poverty and limited infrastructure can offset the protective effect of low traffic volumes.

  3. Variable‑specific spatial patterns – Neighborhood compactness raises crash risk in high‑density neighborhoods (more intersections, shorter block lengths) but has a neutral or even protective effect in sprawling suburban tracts. Educational attainment generally correlates with lower risk, yet the magnitude of this protective effect varies: in affluent urban tracts the effect is strong, while in disadvantaged rural tracts the effect diminishes, suggesting that education alone cannot compensate for poor road design or lack of safety enforcement.

  4. Methodological comparison – The authors benchmark GeoShapley against two conventional approaches: (a) standard SHAP applied to the same ML model, which does not treat location as a joint feature and so fails to capture location‑specific interactions, and (b) Multiscale Geographically Weighted Regression (MGWR), which models spatially varying coefficients but assumes linear functional forms. The ML model explained by GeoShapley is reported to achieve the highest explanatory power (R² ≈ 0.78, lower MAE), and GeoShapley uniquely reveals the combined nonlinear‑spatial dynamics that the other methods miss.
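The two fit metrics mentioned in the comparison, R² and MAE, can be sketched as follows. The observed and predicted values are hypothetical and are not results from the paper; they only show how the metrics are computed.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical crash densities (crashes per km^2) for four tracts.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

r2 = r_squared(observed, predicted)
mae = mean_absolute_error(observed, predicted)
print(f"R^2 = {r2:.3f}, MAE = {mae:.2f}")
```

When comparing models such as a tree ensemble and MGWR, both metrics should be computed on the same held-out tracts so that the comparison is like for like.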

Based on these insights, the paper proposes a set of geography‑sensitive policy recommendations:

  • Urban traffic calming – In highly compact neighborhoods, implement curb extensions, raised crosswalks, and reduced speed limits to mitigate the steep risk increase observed beyond the density threshold.
  • Adaptive intersection design – Deploy smart signal timing, protected turn phases, and roundabouts in corridors with high intersection density, especially in Miami’s downtown and along I‑95 where congestion spikes.
  • Speed management on high‑volume arterials – Install automated speed‑enforcement cameras and variable‑speed limit signage on I‑95 and other major corridors to curb the nonlinear surge in crashes at very high traffic volumes.
  • Equity‑focused rural interventions – Prioritize low‑cost safety upgrades (e.g., rumble strips, better signage) and targeted driver‑education programs in socio‑economically vulnerable rural tracts of central and northern Florida.

The study concludes that the GeoXAI framework, powered by AutoML and GeoShapley, provides a powerful, unified platform for simultaneously uncovering nonlinear relationships and spatial heterogeneity in transportation safety data. Its ability to deliver tract‑level, interpretable explanations makes it valuable not only for traffic safety analysts but also for urban planners, public‑health officials, and policymakers seeking data‑driven, location‑specific interventions. The authors suggest that future work could extend the approach to other domains such as air‑quality modeling, housing‑price dynamics, or climate‑impact assessments, where complex interactions between space and variables are similarly critical.

