Prior Smoothing for Multivariate Disease Mapping Models
To date, we have seen the emergence of a large literature on multivariate disease mapping. That is, incidence of (or mortality from) multiple diseases is recorded at the scale of areal units where incidence (mortality) across the diseases is expected to manifest dependence. The modeling involves a hierarchical structure: a Poisson model for disease counts (conditioning on the rates) at the first stage, and a specification of a function of the rates using spatial random effects at the second stage. These random effects are specified as a prior and introduce spatial smoothing to the rate (or risk) estimates. What we see in the literature is the amount of smoothing induced under a given prior across areal units compared with the observed/empirical risks. Our contribution here extends previous research on smoothing in univariate areal data models. Specifically, for three different choices of multivariate prior, we investigate both within-prior smoothing according to hyperparameters and across-prior smoothing. The benefit to the user is to illuminate the expected nature of departure from perfect fit associated with these priors, since model performance is not a question of goodness of fit. We propose both theoretical and empirical metrics for our investigation and illustrate with both simulated and real data.
💡 Research Summary
This paper investigates how spatial priors used in multivariate disease‑mapping models affect the amount of smoothing imposed on estimated disease risks. Building on earlier work that quantified smoothing for univariate CAR priors, the authors extend the analysis to the multivariate setting by employing the computationally efficient “M‑model” framework. Three families of multivariate spatial priors are examined: the intrinsic CAR (iCAR), the Leroux CAR (LCAR), and a disease‑specific Leroux CAR (LjCAR) that allows each disease its own spatial dependence parameter.
The hierarchical model assumes Poisson counts conditional on unknown rates, with logit‑transformed rates modeled as an overall intercept plus spatial random effects. The spatial effects are given a multivariate normal prior with covariance Σ = Σ_w ⊗ Σ_b, where Σ_w captures within‑disease spatial dependence (determined by the chosen CAR prior) and Σ_b captures between‑disease dependence. Σ_b is parameterized via a Bartlett decomposition of a Wishart prior, reducing the number of free parameters to J(J + 1)/2 and avoiding over‑parameterization.
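The separable covariance described above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: the 3 × 3 grid, the value λ = 0.7, the degrees of freedom, and the helper names `lattice_adjacency`, `lcar_precision`, and `bartlett_factor` are all assumptions made for the example. It uses the standard Bartlett construction of a Wishart draw (a lower-triangular factor with J(J + 1)/2 free entries) for Σ_b and a Leroux CAR precision for the within-disease part.

```python
import numpy as np

def lattice_adjacency(nrow, ncol):
    """Binary rook-adjacency matrix W for an nrow x ncol grid of areas."""
    n = nrow * ncol
    W = np.zeros((n, n))
    for r in range(nrow):
        for c in range(ncol):
            i = r * ncol + c
            if c + 1 < ncol:          # neighbour to the right
                W[i, i + 1] = W[i + 1, i] = 1.0
            if r + 1 < nrow:          # neighbour below
                W[i, i + ncol] = W[i + ncol, i] = 1.0
    return W

def lcar_precision(W, lam):
    """Leroux CAR precision: lam * (D - W) + (1 - lam) * I."""
    D = np.diag(W.sum(axis=1))
    return lam * (D - W) + (1.0 - lam) * np.eye(W.shape[0])

def bartlett_factor(J, df, rng):
    """Lower-triangular Bartlett factor A, so that A @ A.T ~ Wishart(df, I)."""
    A = np.zeros((J, J))
    for j in range(J):
        A[j, j] = np.sqrt(rng.chisquare(df - j))   # chi-square diagonal
        A[j, :j] = rng.standard_normal(j)          # N(0, 1) below the diagonal
    return A

rng = np.random.default_rng(0)
J = 3                                   # number of diseases (illustrative)
W = lattice_adjacency(3, 3)             # 9 areas (illustrative)

A = bartlett_factor(J, df=J + 2, rng=rng)
Sigma_b = A @ A.T                       # between-disease covariance
Sigma_w = np.linalg.inv(lcar_precision(W, lam=0.7))  # within-disease covariance
Sigma = np.kron(Sigma_w, Sigma_b)       # area-major ordering: J x J blocks per area
```

With this ordering, the effects of area i occupy rows i·J to (i + 1)·J − 1 of `Sigma`, which is what makes the per-area J × J blocks easy to extract.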
To quantify smoothing, the authors introduce a theoretical metric called Total Conditional Variance (TCV). For each area i, the conditional distribution of the vector of spatial effects θ·i given all other areas has covariance equal to the inverse of the i‑th J × J diagonal block of the precision matrix Σ⁻¹. The determinant of this conditional covariance (the generalized variance) is computed for every area and summed across all areas; a smaller TCV indicates stronger smoothing because each conditional distribution is more concentrated around its conditional mean. Closed‑form expressions for TCV are derived for iCAR, LCAR (with spatial dependence parameter λ), and LjCAR (with disease‑specific λ_j).
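As a rough illustration of the TCV definition, the sketch below computes it directly from a precision matrix and compares it with a shortcut that holds under the separable structure Σ⁻¹ = Q_w ⊗ Σ_b⁻¹, where the i-th diagonal block is (Q_w)_ii Σ_b⁻¹ and its inverse has determinant |Σ_b| / (Q_w)_ii^J. That shortcut follows from the definition and is not quoted from the paper; the four-area path graph, the 2 × 2 Σ_b, and the function names are hypothetical.

```python
import numpy as np

def lcar_precision(W, lam):
    """Leroux CAR precision; lam = 1 gives the iCAR structure D - W."""
    D = np.diag(W.sum(axis=1))
    return lam * (D - W) + (1.0 - lam) * np.eye(W.shape[0])

def total_conditional_variance(Q, J):
    """Sum over areas of det(inverse of the i-th J x J diagonal block of Q)."""
    n_areas = Q.shape[0] // J
    tcv = 0.0
    for i in range(n_areas):
        block = Q[i * J:(i + 1) * J, i * J:(i + 1) * J]
        tcv += 1.0 / np.linalg.det(block)   # det(B^{-1}) = 1 / det(B)
    return tcv

def separable_tcv(W, lam, Sigma_b):
    """Shortcut valid only when the precision is Q_w kron inv(Sigma_b)."""
    J = Sigma_b.shape[0]
    q_diag = lam * W.sum(axis=1) + (1.0 - lam)   # diagonal of Q_w
    return np.linalg.det(Sigma_b) * np.sum(q_diag ** (-J))

# Four areas on a line: neighbour counts are 1, 2, 2, 1 (illustrative).
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Sigma_b = np.array([[1.0, 0.5],
                    [0.5, 2.0]])
J = Sigma_b.shape[0]

tcv_by_lambda = {}
for lam in (0.0, 0.5, 1.0):
    Q = np.kron(lcar_precision(W, lam), np.linalg.inv(Sigma_b))
    tcv_by_lambda[lam] = total_conditional_variance(Q, J)
```

In this toy setting TCV shrinks as λ moves from 0 to 1, in line with the reading that smaller TCV corresponds to stronger smoothing. Note that the computation only needs the precision matrix, so it remains well defined at λ = 1 even though the iCAR precision is singular.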
In addition to TCV, the paper proposes empirical smoothing metrics: mean squared error between estimated and empirical risk rates, spatial autocorrelation measures (Moran’s I variants), and Frobenius‑norm differences of risk matrices. These metrics allow assessment of whether a prior over‑smooths (masking true hotspots) or under‑smooths (leaving noisy maps).
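The three kinds of empirical metric named above can be sketched as follows. This is an assumption-laden illustration: it uses the textbook global Moran's I rather than whichever variants the paper defines, and the risk matrices `R_emp` and `R_hat` (areas in rows, diseases in columns) are invented numbers in which the "estimate" is simply the empirical risk shrunk halfway toward each disease's mean.

```python
import numpy as np

def risk_mse(R_hat, R_emp):
    """Mean squared difference between estimated and empirical risks."""
    return float(np.mean((R_hat - R_emp) ** 2))

def risk_frobenius(R_hat, R_emp):
    """Frobenius norm of the difference between the two risk matrices."""
    return float(np.linalg.norm(R_hat - R_emp, ord="fro"))

def morans_i(x, W):
    """Global Moran's I: (n / sum(W)) * (z' W z) / (z' z), z = x - mean(x)."""
    z = x - x.mean()
    return float(len(x) / W.sum() * (z @ W @ z) / (z @ z))

# Four areas on a line (illustrative adjacency).
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Hypothetical empirical risks for two diseases.
R_emp = np.array([[1.0, 0.8],
                  [2.0, 1.1],
                  [3.0, 0.9],
                  [4.0, 1.2]])
# Hypothetical smoothed estimate: shrink halfway toward each column mean.
R_hat = 0.5 * R_emp + 0.5 * R_emp.mean(axis=0)

mse = risk_mse(R_hat, R_emp)
frob = risk_frobenius(R_hat, R_emp)
moran_emp = [morans_i(R_emp[:, j], W) for j in range(R_emp.shape[1])]
moran_hat = [morans_i(R_hat[:, j], W) for j in range(R_hat.shape[1])]
```

The MSE and Frobenius metrics are tied together (the squared Frobenius norm equals the MSE times the number of entries), so they rank priors identically; Moran's I adds separate information about how spatially structured the fitted surface is.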
Two simulation studies are conducted. The first varies the number of areas G (25, 50, 100) and examines how TCV and empirical metrics change with λ (for LCAR) and with the eigenvalues of Σ_b. The second compares the three priors under different scenarios of between‑disease correlation and spatial heterogeneity. Results show that iCAR provides the strongest global smoothing, which can be excessive when local risk variation is important. LCAR offers a tunable amount of smoothing via λ, smoothly interpolating between independence (λ = 0) and iCAR (λ = 1). LjCAR, by allowing each disease its own λ_j, achieves a balance: it preserves disease‑specific spatial patterns while still borrowing strength across diseases. When between‑disease correlation is high, LjCAR’s flexibility prevents over‑smoothing of one disease by another.
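To make the contrast between LCAR and LjCAR concrete, the sketch below assumes the usual M-model construction Θ = ΦM, in which each column of Φ is an independent Leroux CAR field with its own λ and M mixes the columns across diseases. Under that assumption the covariance between areas i and k is Mᵀ diag(C_1[i,k], …, C_J[i,k]) M, where C_j is the spatial covariance of column j. The matrix `M`, the λ values, and the function names are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def lcar_precision(W, lam):
    """Leroux CAR precision: lam * (D - W) + (1 - lam) * I."""
    D = np.diag(W.sum(axis=1))
    return lam * (D - W) + (1.0 - lam) * np.eye(W.shape[0])

def mmodel_covariance(W, lams, M):
    """Covariance of the area-major stacked effects when Theta = Phi @ M."""
    G, J = W.shape[0], M.shape[1]
    # Spatial covariance of each latent column of Phi (requires lam < 1).
    covs = [np.linalg.inv(lcar_precision(W, lam)) for lam in lams]
    Sigma = np.zeros((G * J, G * J))
    for i in range(G):
        for k in range(G):
            d = np.diag([C[i, k] for C in covs])
            Sigma[i * J:(i + 1) * J, k * J:(k + 1) * J] = M.T @ d @ M
    return Sigma

# Four areas on a line and two diseases (illustrative).
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
M = np.array([[1.0, 0.4],
              [0.0, 0.9]])

# Common lambda: the structure collapses to the separable LCAR form.
Sigma_common = mmodel_covariance(W, [0.5, 0.5], M)
Sigma_separable = np.kron(np.linalg.inv(lcar_precision(W, 0.5)), M.T @ M)

# Disease-specific lambdas: no single Kronecker factorisation is imposed.
Sigma_specific = mmodel_covariance(W, [0.2, 0.9], M)
```

When all λ_j coincide, `Sigma_common` matches `Sigma_separable`, so in this sketch the disease-specific prior contains the common-λ LCAR as a special case; letting the λ_j differ is what gives each latent field its own strength of spatial dependence.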
The methodology is applied to real data on female mortality (age ≥ 50) from colon, stomach, and pancreas cancer across continental Spain. All three cancers display clear spatial clusters, but the degree of clustering differs. The LjCAR model best captures these differences: it retains the distinct high‑risk zones for each cancer while reducing random noise. Estimated Σ_b reveals a strong positive correlation between colon and stomach cancer risks, and a weaker correlation with pancreas cancer, reflecting plausible shared risk factors.
Overall, the paper makes three substantive contributions: (1) a principled theoretical measure (TCV) for multivariate spatial smoothing, (2) a comprehensive empirical evaluation framework linking TCV to observable model performance, and (3) a practical demonstration that disease‑specific spatial priors (LjCAR) can improve risk estimation in multivariate disease mapping. The authors suggest future extensions such as incorporating population offsets explicitly, exploring non‑binary adjacency structures (distance‑based weights), and adding temporal dynamics to model disease evolution over time.