Mapping tuberculosis fatalities by region and age group in South Korea: A dataset for targeted health policy optimization
In South Korea, age-disaggregated tuberculosis (TB) data at the district level are not publicly available due to privacy constraints, limiting fine-scale analyses of healthcare accessibility. To address this limitation, we present a high-resolution, district-level dataset on TB fatalities and hospital accessibility in South Korea, covering the years 2014 to 2022 across 228 districts. The dataset is constructed using a reconstruction method that infers age-disaggregated TB cases and fatalities at the district level by integrating province-level age-specific statistics with district-level spatial and demographic data, enabling analyses that account for both spatial heterogeneity and age structure. Building on an existing hospital allocation framework, we extend the objective function to an age-weighted formulation and apply it to the reconstructed dataset to minimize TB fatalities under different age-weighting schemes. We demonstrate that incorporating age structure can give rise to distinct optimized hospital allocation patterns, even when the total number of minimized fatalities is similar, revealing trade-offs between efficiency and demographic targeting. In addition, the dataset supports temporal analyses of TB burden, hospital availability, and demographic variation over time, and provides a testbed for spatial epidemiology and optimization studies that require high-resolution demographic and healthcare data.
💡 Research Summary
This paper addresses a critical data gap in South Korea’s tuberculosis (TB) surveillance: while district‑level (si‑gun‑gu) totals for new TB cases, deaths, and hospitals are publicly available, age‑disaggregated TB statistics are only released at the province (do) level due to privacy regulations. To enable fine‑grained, age‑aware analyses, the authors develop a reconstruction pipeline that upscales province‑level age distributions to the 228 districts for the period 2014‑2022.
First, they obtain province‑level counts of newly reported TB cases (N_i,t) and TB‑related deaths (D_i,t) for ten‑year age groups (40‑49, 50‑59, …, 80+), where i indexes the province and t the age group. They compute age fractions n_i,t = N_i,t / N_i and d_i,t = D_i,t / D_i, with N_i and D_i the corresponding province totals. For each district s belonging to province i, they multiply these fractions by the district’s total reported cases N_s and deaths D_s (which are available at the district level) to estimate age‑specific counts N_s,t = n_i,t N_s and D_s,t = d_i,t D_s. This simple proportional allocation preserves each district total and, when the district totals add up to the province total, reproduces the province‑level age‑specific counts. The authors verify that the reconstructed numbers sum to the original province totals and that temporal trends (declining cases and deaths, relatively stable hospital numbers) are consistent with the raw data.
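The proportional allocation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' released code: the function name `allocate_by_age` and all numbers are made up, and the province total N_i is assumed to be the sum over the listed age groups.

```python
import numpy as np

def allocate_by_age(province_age_counts, district_totals):
    """Split each district total across age groups using province age fractions.

    province_age_counts: counts per age group for one province (N_{i,t}).
    district_totals: total counts for each district in that province (N_s).
    Returns an array of shape (districts, age groups) holding N_{s,t}.
    """
    province_age_counts = np.asarray(province_age_counts, dtype=float)
    district_totals = np.asarray(district_totals, dtype=float)
    # Assumption: the province total is the sum over the listed age groups.
    fractions = province_age_counts / province_age_counts.sum()  # n_{i,t}
    return np.outer(district_totals, fractions)                  # N_s * n_{i,t}

# Hypothetical province with five age groups and three districts.
province_cases = [10, 20, 30, 25, 15]   # sums to 100
district_cases = [50, 30, 20]           # also sums to 100
estimated = allocate_by_age(province_cases, district_cases)
```

Row sums of `estimated` return the district totals, and column sums return the province age counts because the district totals here add up to the province total. Every district in a province receives the same age profile, which is the homogeneity assumption the paper lists among its limitations.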
The reconstructed dataset therefore contains, for each year and each of the 228 districts, the number of TB cases and deaths broken down by age group, together with the number of secondary‑care hospitals. This enables three‑dimensional analyses (space, time, age) that were previously impossible without violating privacy constraints.
Building on a previously published spatial optimization framework, the authors model the relationship between hospital density η_s (hospitals per unit area) and the district‑level fatality rate φ_s as an exponential decay: φ_s = exp(−η_s / η̃_s). The characteristic density η̃_s is calibrated from observed data by setting φ_s equal to the observed ratio D_s / N_s, which gives η̃_s = η_s / log(N_s / D_s). The original model minimizes total expected deaths E(η) = Σ_s N_s exp(−η_s / η̃_s) subject to a fixed total number of hospitals, yielding an analytically tractable optimal density η*_s.
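The calibration and minimization can be illustrated with a short numerical sketch. Several pieces here are assumptions rather than the paper's method: the hospital constraint is taken to be Σ_s A_s η_s = H with district areas A_s, densities are kept non-negative, and the optimum is found by bisection on the Lagrange multiplier instead of the closed-form solution the paper refers to. All function names and data are hypothetical.

```python
import numpy as np

def characteristic_density(eta, cases, deaths):
    """eta_tilde_s = eta_s / log(N_s / D_s); requires N_s > D_s > 0 and eta_s > 0."""
    return eta / np.log(cases / deaths)

def expected_deaths(eta, eta_tilde, cases):
    """E(eta) = sum_s N_s * exp(-eta_s / eta_tilde_s)."""
    return float(np.sum(cases * np.exp(-eta / eta_tilde)))

def optimal_density(eta_tilde, cases, area, total_hospitals, iters=200):
    """Minimize E(eta) subject to sum_s area_s * eta_s = total_hospitals, eta_s >= 0.

    Stationarity gives eta_s = eta_tilde_s * log(N_s / (lam * A_s * eta_tilde_s)),
    clipped at zero; log(lam) is found by bisection.
    """
    base = np.log(cases / (area * eta_tilde))

    def density(log_lam):
        return np.maximum(0.0, eta_tilde * (base - log_lam))

    hi = float(np.max(base))          # at this multiplier every density is zero
    lo = hi - 1.0
    while np.sum(area * density(lo)) < total_hospitals:
        lo -= 1.0                     # widen until the budget is bracketed
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if np.sum(area * density(mid)) > total_hospitals:
            lo = mid                  # too many hospitals: raise the multiplier
        else:
            hi = mid
    return density(0.5 * (lo + hi))

# Hypothetical three-district example.
area = np.array([10.0, 20.0, 40.0])
hospitals = np.array([5.0, 4.0, 4.0])
cases = np.array([200.0, 100.0, 80.0])
deaths = np.array([10.0, 8.0, 10.0])

eta_obs = hospitals / area
eta_tilde = characteristic_density(eta_obs, cases, deaths)
deaths_obs = expected_deaths(eta_obs, eta_tilde, cases)
eta_opt = optimal_density(eta_tilde, cases, area, hospitals.sum())
deaths_opt = expected_deaths(eta_opt, eta_tilde, cases)
```

By construction of η̃_s, evaluating E at the observed densities returns the observed death total, so `deaths_obs` equals `deaths.sum()`. Because the objective is convex and the observed configuration is feasible, `deaths_opt` cannot exceed `deaths_obs`.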
Crucially, the present study extends this objective to incorporate age‑specific fatality risk. The authors introduce age weights w_t (either w_t = 1 for an age‑agnostic scenario, or w_t = φ_t / φ̄, where φ_t is the national age‑specific fatality rate and φ̄ its mean) and reformulate the objective as
min Σ_s Σ_t w_t D_s,t exp(−η_s / η̃_s)
while keeping the total number of hospitals constant. This creates a multi‑age‑group optimization problem in which, under the age‑weighted scheme, districts serving older populations receive a higher marginal benefit from additional hospitals.
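A small sketch can make the two weighting schemes concrete. It evaluates the objective exactly as written in this summary, with the reconstructed counts D_s,t as the district-by-age input. The names `age_weights` and `weighted_objective`, the use of an unweighted mean over age groups for φ̄, and all numbers are assumptions for illustration only.

```python
import numpy as np

def age_weights(national_cases, national_deaths):
    """w_t = phi_t / mean(phi), with phi_t = national deaths_t / national cases_t.

    Assumption: the mean is the unweighted average over age groups.
    """
    phi = np.asarray(national_deaths, dtype=float) / np.asarray(national_cases, dtype=float)
    return phi / phi.mean()

def weighted_objective(eta, eta_tilde, counts_by_age, weights):
    """sum_s sum_t w_t * D_{s,t} * exp(-eta_s / eta_tilde_s)."""
    district_weight = counts_by_age @ weights   # W_s = sum_t w_t * D_{s,t}
    return float(np.sum(district_weight * np.exp(-eta / eta_tilde)))

# Hypothetical national counts for five age groups, youngest to oldest.
national_cases = np.array([1000, 1500, 2000, 2500, 3000])
national_deaths = np.array([10, 30, 80, 200, 600])
w_age = age_weights(national_cases, national_deaths)
w_flat = np.ones_like(w_age)

# Hypothetical reconstructed counts for three districts (rows) by age group (columns).
counts_by_age = np.array([[1.0, 1.0, 2.0, 3.0, 3.0],
                          [0.0, 1.0, 1.0, 2.0, 4.0],
                          [0.0, 0.0, 1.0, 3.0, 6.0]])
eta = np.array([0.5, 0.2, 0.1])
eta_tilde = np.array([0.17, 0.08, 0.05])

objective_flat = weighted_objective(eta, eta_tilde, counts_by_age, w_flat)
objective_age = weighted_objective(eta, eta_tilde, counts_by_age, w_age)
```

Since the exponential factor does not depend on t, the age weights collapse into a single per-district coefficient W_s = Σ_t w_t D_s,t. The age-weighted problem therefore has the same mathematical form as the original one, with W_s in place of N_s, and the choice of w_t only changes how strongly each district is prioritized.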
Using the reconstructed dataset, the authors solve the optimization for two weighting schemes: (1) age‑agnostic (w_t = 1) and (2) age‑weighted (w_t = φ_t / φ̄). Both solutions achieve a comparable reduction in total projected deaths (≈5% relative to the current configuration), but the spatial patterns of hospital reallocation differ markedly. The age‑agnostic solution concentrates hospitals in high‑patient‑density districts, primarily the Seoul‑Gyeonggi metropolitan area, reflecting pure efficiency. In contrast, the age‑weighted solution shifts resources toward districts with higher proportions of elderly residents (often non‑metropolitan, rural areas such as Gangwon, Jeolla, and parts of Gyeongsang), thereby improving access for the most vulnerable age groups even though the overall death count is similar.
These findings illustrate a clear policy trade‑off: a strategy focused solely on minimizing aggregate deaths may overlook equity concerns, whereas an age‑sensitive approach can achieve comparable efficiency and better protect high‑risk older populations. The authors argue that, given South Korea’s rapidly aging society, incorporating age‑specific risk into health‑facility planning is essential for equitable TB control.
The paper also discusses limitations. The proportional allocation assumes homogeneous age distributions within each province, which may not hold for all districts and could introduce estimation error. The model treats hospitals as identical units and does not consider capacity, specialty, or operational costs; extending the framework to a multi‑objective setting (efficiency, equity, cost) is a natural next step. Moreover, fatality rates are approximated by reported deaths divided by reported cases, ignoring undiagnosed cases and long‑term outcomes.
Finally, the authors release the full reconstructed dataset, the code for the upscaling procedure, and the optimization scripts under an open‑source license, inviting the research community to reuse the data for other infectious diseases, to test alternative optimization criteria, or to conduct comparative studies across countries.
In summary, this work makes three major contributions: (1) it fills a privacy‑driven data gap by generating a high‑resolution, age‑disaggregated TB mortality dataset for South Korea; (2) it extends a spatial hospital allocation model to explicitly account for age‑specific fatality risk; and (3) it demonstrates that age‑aware optimization can produce distinct, policy‑relevant allocation patterns without sacrificing overall efficiency. The study provides a valuable methodological template for integrating demographic heterogeneity into health‑infrastructure planning in aging societies.