Spatial-Morphological Modeling for Multi-Attribute Imputation of Urban Blocks
Accurate reconstruction of a city's missing morphological indicators is crucial for urban planning and data-driven analysis. This study presents the spatial-morphological (SM) imputer, a tool inspired by the SpaceMatrix framework that combines data-driven morphological clustering with neighborhood-based methods to reconstruct missing values of the floor space index (FSI) and ground space index (GSI) at the city block level. The approach uses city-scale morphological patterns as global priors and local spatial information for context-dependent interpolation. The evaluation shows that while SM alone captures meaningful morphological structure, combining it with inverse distance weighting (IDW) or spatial k-nearest neighbor (sKNN) methods outperforms existing state-of-the-art models. The composite methods demonstrate the complementary advantages of morphological and spatial approaches.
💡 Research Summary
The paper tackles the problem of imputing missing built‑form indicators, specifically the Floor Space Index (FSI) and Ground Space Index (GSI), at the city‑block level. Recognizing that urban data are often fragmented, the authors propose a hybrid “spatial‑morphological” (SM) imputer that merges global morphological priors with local spatial interpolation.
First, the authors replicate the SpaceMatrix idea by clustering blocks in the normalized (FSI, GSI) space using K‑means. Each resulting cluster represents a canonical urban form (e.g., compact mid‑rise, low‑density spread) and provides a centroid that serves as a prototype FSI‑GSI pair. To assign a block to these prototypes, a CatBoost classifier is trained on the block’s land‑use composition (seven fractional shares) and total site area, outputting a probability distribution over the morphological clusters, P(c_k | X_i). The expected morphological prediction for block i is then the weighted sum of cluster centroids using these probabilities.
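The expected morphological prediction described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `expected_morphology`, the three centroids, and the probability vector are all hypothetical, and the probabilities are assumed to come from any classifier that outputs P(c_k | X_i).

```python
from typing import Sequence, Tuple


def expected_morphology(
    probs: Sequence[float],
    centroids: Sequence[Tuple[float, float]],
) -> Tuple[float, float]:
    """Probability-weighted mean of cluster centroids, returned as (FSI, GSI)."""
    if len(probs) != len(centroids):
        raise ValueError("need one probability per cluster centroid")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    # Renormalise so rounding error in the classifier output does not bias the result.
    fsi = sum(p * c[0] for p, c in zip(probs, centroids)) / total
    gsi = sum(p * c[1] for p, c in zip(probs, centroids)) / total
    return fsi, gsi


# Hypothetical (FSI, GSI) centroids for three clusters, and one block's
# class probabilities; neither is taken from the paper.
centroids = [(0.5, 0.15), (1.5, 0.30), (3.0, 0.45)]
probs = [0.2, 0.5, 0.3]
fsi_hat, gsi_hat = expected_morphology(probs, centroids)
```

Because the estimate is an expectation rather than the centroid of the single most likely cluster, a block that sits between two urban forms receives intermediate FSI and GSI values.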
Second, the method incorporates classic neighborhood‑based interpolation: inverse distance weighting (IDW) and spatial k‑nearest neighbor (sKNN). These techniques generate a locally informed estimate based solely on the values of surrounding blocks.
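A rough sketch of the two neighborhood estimators follows. The summary does not give their exact definitions, so this assumes Euclidean distance between block centroids, a distance exponent of 2 for IDW, and an unweighted mean for sKNN; the names `idw` and `sknn` and the sample coordinates are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Neighbor = Tuple[Point, float]  # (block centroid, observed indicator value)


def idw(target: Point, neighbors: List[Neighbor], power: float = 2.0) -> float:
    """Inverse distance weighting: nearer blocks get larger weights 1 / d**power."""
    if not neighbors:
        raise ValueError("at least one neighbor with an observed value is required")
    num = 0.0
    den = 0.0
    for (x, y), value in neighbors:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return value  # coincident location: use the observed value directly
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


def sknn(target: Point, neighbors: List[Neighbor], k: int = 3) -> float:
    """Spatial k-NN (assumed form): plain mean of the k closest observed blocks."""
    if not neighbors or k < 1:
        raise ValueError("need k >= 1 and at least one neighbor")
    ranked = sorted(
        neighbors,
        key=lambda n: math.hypot(n[0][0] - target[0], n[0][1] - target[1]),
    )
    nearest = ranked[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Hypothetical observed FSI values around a block located at the origin.
observed = [((1.0, 0.0), 1.0), ((2.0, 0.0), 2.0), ((0.0, 4.0), 4.0)]
idw_estimate = idw((0.0, 0.0), observed)
sknn_estimate = sknn((0.0, 0.0), observed, k=2)
```

Both functions use only the values of surrounding blocks, which is why they cannot recover a plausible estimate when a block differs sharply from its neighbors.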
The final imputed values are a convex combination of the morphological estimate and the spatial estimate. The mixing weight α is tuned via cross‑validation, allowing the model to lean more on global morphology where land‑use patterns are homogeneous and more on local spatial cues where strong spatial autocorrelation exists.
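The blending step can be illustrated as below. Two assumptions are made that the summary does not confirm: α multiplies the morphological estimate (so 1 − α multiplies the spatial one), and a simple grid search on one held-out set stands in for full cross-validation. The names `blend`, `rmse`, and `tune_alpha` are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def blend(morph: float, spatial: float, alpha: float) -> float:
    """Convex combination: alpha on the morphological estimate (assumed convention)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * morph + (1.0 - alpha) * spatial


def rmse(pred: Sequence[float], true: Sequence[float]) -> float:
    """Root mean squared error between predictions and held-out true values."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


def tune_alpha(
    morph_preds: Sequence[float],
    spatial_preds: Sequence[float],
    truths: Sequence[float],
    grid: Optional[List[float]] = None,
) -> float:
    """Return the alpha from the grid with the lowest validation RMSE."""
    if grid is None:
        grid = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
    return min(
        grid,
        key=lambda a: rmse(
            [blend(m, s, a) for m, s in zip(morph_preds, spatial_preds)], truths
        ),
    )


# Toy validation data: the truth lies midway between the two estimates.
best_alpha = tune_alpha([1.0, 2.0], [3.0, 4.0], [2.0, 3.0])
```

In a fuller setup the same search would be repeated across folds, and α could be tuned separately for FSI and GSI.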
Experiments are conducted on a dataset of roughly 100 000 blocks from St. Petersburg, Russia. Missingness is simulated at random at a rate of 20 %. Baselines include pure morphological clustering, SMV‑NMF, GraphSAGE (a GNN approach), and image‑based deep learning models. Evaluation metrics (RMSE, MAE, R²) show that the SM‑IDW and SM‑sKNN composites outperform all baselines, reducing RMSE by about 12 % and 9 % respectively. The advantage is most pronounced in mixed‑use neighborhoods, where traditional K‑NN underestimates density; the probabilistic morphological component corrects this bias.
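The evaluation protocol can be approximated with the sketch below: hide a random 20 % of known values, impute them, and score the imputations with RMSE, MAE, and R². The helper names `mask_indices` and `regression_metrics`, the fixed seed, and the toy numbers are assumptions for illustration; they are not the paper's data or code.

```python
import math
import random
from typing import Dict, List, Sequence


def mask_indices(n: int, rate: float = 0.2, seed: int = 0) -> List[int]:
    """Pick round(n * rate) distinct indices to hide, uniformly at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n), round(n * rate)))


def regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> Dict[str, float]:
    """Compute RMSE, MAE, and R^2 over the masked (held-out) values."""
    n = len(y_true)
    errors = [p - t for p, t in zip(y_pred, y_true)]
    ss_res = sum(e * e for e in errors)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        # R^2 is undefined when the true values have zero variance.
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
    }


hidden = mask_indices(100, rate=0.2)  # indices of blocks whose values are hidden
scores = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 6.0])
```

Fixing the seed keeps the masked set identical across methods, so differences in the scores reflect the imputers rather than the sampling.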
Beyond accuracy, the morphological clusters are interpretable: planners can inspect a cluster’s centroid to understand typical FSI/GSI values for a given urban form, facilitating scenario analysis for redevelopment or zoning changes. The probability distribution also quantifies uncertainty, supporting risk‑aware decision making.
The authors acknowledge several limitations. The choice of the number of clusters (K) strongly influences performance, and the approach needs enough labeled (non‑missing) data to train the CatBoost model without overfitting. They suggest future work to automate K selection via Bayesian optimization, integrate graph neural networks for richer neighbor aggregation, and test transferability across cities with different morphological vocabularies.
In summary, the study demonstrates that coupling global morphological knowledge with local spatial interpolation yields a robust, interpretable, and superior solution for multi‑attribute imputation of urban blocks, offering a practical tool for digital twin construction and data‑driven urban planning.