CSPR-Net: Self-supervised Curved Surface Projection Rectification Network for Geometric Distortion Correction in Non-planar Projections
Projecting images onto non-planar surfaces inevitably introduces geometric distortions that degrade visual quality. Traditional correction methods often require tedious manual calibration or structured light sequences to establish pixel-wise correspondences. In this paper, we develop the Curved Surface Projection Rectification Network (CSPR-Net), a self-supervised deep learning framework for automated distortion correction. Our approach employs dual coordinate-based neural networks to learn the bi-directional mapping between the projector and camera spaces. By enforcing a robust cycle-consistency constraint, CSPR-Net autonomously resolves complex geometric transformations without requiring ground-truth deformation fields. Furthermore, a gradient-based loss function is introduced to mitigate the impact of complex ambient light interference and accurately capture high-frequency geometric variations. Quantitative evaluations in physical experiments demonstrate that CSPR-Net achieves a 20.7% improvement in end-to-end fidelity (SSIM) and outperforms the polynomial baseline in SSIM by 3.8% and 5.4% for the forward and inverse mappings, respectively, effectively generating high-precision pre-warped images for seamless projection.
💡 Research Summary
The paper introduces CSPR‑Net (Curved Surface Projection Rectification Network), a self‑supervised deep learning framework designed to correct geometric distortions that arise when projecting images onto non‑planar (curved) surfaces. Traditional correction pipelines rely on labor‑intensive calibration procedures such as structured‑light patterns, checkerboards, or external 3‑D scanning devices to obtain pixel‑wise correspondences between projector and camera spaces. These methods are fragile under varying illumination and surface reflectance, and often cannot capture high‑frequency local deformations caused by complex surface topologies.
CSPR‑Net circumvents these limitations by learning a bijective mapping between the projector coordinate domain Ωₚ and the camera coordinate domain Ω𝚌 using two independent coordinate‑based multilayer perceptrons (MLPs). One network, Netₚ₂𝚌, predicts a forward displacement field Δuₚ→𝚌, defining the forward mapping F(uₚ) = uₚ + MLPₚ₂𝚌(uₚ). The second network, Net𝚌₂ₚ, predicts the inverse displacement field Δu𝚌→ₚ, defining G(u𝚌) = u𝚌 + MLP𝚌₂ₚ(u𝚌). Both MLPs consist of four fully‑connected layers with Layer Normalization and LeakyReLU (α = 0.2). By initializing the final layer weights to zero, the networks start as identity transforms and gradually adapt to the surface‑induced warps.
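The zero-initialized final layer is the key trick: it makes both mappings start as exact identity transforms. A minimal NumPy sketch of one such coordinate MLP illustrates this (the hidden width of 64 is an assumption; the paper specifies only four fully-connected layers, LayerNorm, and LeakyReLU with α = 0.2):

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def layer_norm(x, eps=1e-5):
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
dims = [2, 64, 64, 64, 2]           # (u, v) -> hidden layers -> 2-D displacement
Ws = [rng.normal(0, 0.1, (dims[i], dims[i + 1])) for i in range(4)]
bs = [np.zeros(d) for d in dims[1:]]
Ws[-1][:] = 0.0                     # zero-init final layer => zero displacement

def displacement(u):
    h = u
    for W, b in zip(Ws[:-1], bs[:-1]):
        h = leaky_relu(layer_norm(h @ W + b))
    return h @ Ws[-1] + bs[-1]      # predicted Δu

def forward_map(u):                 # F(u) = u + MLP(u)
    return u + displacement(u)

coords = rng.uniform(-1, 1, (5, 2))
assert np.allclose(forward_map(coords), coords)   # identity at initialization
```

Starting from the identity means early training never produces wildly folded coordinate fields, which stabilizes the self-supervised losses.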
A differentiable spatial transformer (bilinear interpolation) is employed to warp images according to the predicted coordinates, allowing image‑level losses to be back‑propagated directly to the coordinate parameters. The training objective is a weighted sum of five loss terms:
- Forward and backward photometric loss (L_fwd, L_bwd) – combines an L1 intensity difference with a Sobel‑gradient structural term, encouraging edge alignment even under strong ambient light.
- Cycle‑consistency loss (L_cyc) – enforces G(F(uₚ)) ≈ uₚ and F(G(u𝚌)) ≈ u𝚌, guaranteeing that the two networks are approximate inverses and preserving bijectivity.
- Smoothness loss (L_sm) – a Total Variation regularizer on the displacement fields, preventing unrealistic folding or discontinuities.
- Mask‑consistency loss (L_msk) – aligns the projected region mask with the camera field‑of‑view mask after warping, ensuring that mappings stay within the physically observable area.
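The loss terms above can be sketched in plain NumPy; the weights λ are illustrative assumptions (the paper's actual values are not reproduced here), and the warped inputs are assumed to come from the differentiable bilinear sampler:

```python
import numpy as np

def sobel_grads(img):
    # Sobel x/y responses used by the gradient-structural term
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    H, W = img.shape
    pad = np.pad(img, 1, mode="edge")
    gx = sum(kx[i, j] * pad[i:i + H, j:j + W] for i in range(3) for j in range(3))
    gy = sum(ky[i, j] * pad[i:i + H, j:j + W] for i in range(3) for j in range(3))
    return gx, gy

def photometric(pred, target):
    # L1 intensity difference plus Sobel-gradient structural term
    gx_p, gy_p = sobel_grads(pred)
    gx_t, gy_t = sobel_grads(target)
    return (np.abs(pred - target).mean()
            + np.abs(gx_p - gx_t).mean() + np.abs(gy_p - gy_t).mean())

def cycle(u, u_roundtrip):              # e.g. ||G(F(u)) - u||
    return np.abs(u_roundtrip - u).mean()

def tv(disp):                           # total-variation smoothness on Δu
    return (np.abs(np.diff(disp, axis=0)).mean()
            + np.abs(np.diff(disp, axis=1)).mean())

def mask_consistency(mask_warp, mask_cam):
    return np.abs(mask_warp - mask_cam).mean()

# Illustrative weights; toy data stands in for warped/captured images.
lam = dict(fwd=1.0, bwd=1.0, cyc=10.0, sm=0.1, msk=1.0)
rng = np.random.default_rng(1)
I_cam, I_warp = rng.random((8, 8)), rng.random((8, 8))
u = rng.random((8, 8, 2))
disp = rng.normal(0, 0.01, (8, 8, 2))   # stand-in residual of the round trip
total = (lam["fwd"] * photometric(I_warp, I_cam)
         + lam["cyc"] * cycle(u, u + disp)
         + lam["sm"] * tv(disp)
         + lam["msk"] * mask_consistency(np.ones((8, 8)), np.ones((8, 8))))
```

Because every term is a differentiable function of the sampled coordinates, gradients flow back through the spatial transformer into the MLP weights.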
No ground‑truth deformation maps or manual annotations are required; the system learns solely from raw projector‑camera image pairs.
Experimental validation is performed in two stages. In simulation, a high‑fidelity ray‑tracing engine creates a virtual projector‑camera pair viewing a cylindrical surface (radius = 2.0). Ground‑truth pre‑warped images are generated via inverse ray‑tracing, providing an absolute benchmark. CSPR‑Net achieves a forward‑mapping RMSE of 17.11 px (≈50 % reduction vs. a 3rd‑degree polynomial baseline), SSIM = 0.9603, and similar gains in the inverse and pre‑warped domains (PSNR improvements of >8 dB, SSIM gains of >0.15). Error heatmaps show markedly lower boundary errors compared with the polynomial method.
Physical experiments use a commercial DLP projector and a CMOS camera placed arbitrarily, with only the requirement that the camera fully captures the projected area. A multi‑scale high‑contrast calibration pattern (varying grid density and HSL color transitions) provides rich features for the gradient‑based loss. Training curves demonstrate stable convergence without any supervised signal. The final pre‑warped images achieve SSIM = 0.9282, surpassing the polynomial baseline by 0.15, and visual inspection confirms near‑perfect rectification of severe non‑linear warps.
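At deployment time, the learned inverse mapping resamples the target image into the pre-warped image fed to the projector. A hedged sketch of that step, with a plain-NumPy bilinear sampler and a hypothetical `inverse_map` standing in for the trained Net𝚌₂ₚ:

```python
import numpy as np

def bilinear_sample(img, coords):
    """Sample img at float (row, col) coords; coords has shape (H, W, 2)."""
    H, W = img.shape
    r = np.clip(coords[..., 0], 0, H - 1)
    c = np.clip(coords[..., 1], 0, W - 1)
    r0 = np.floor(r).astype(int); c0 = np.floor(c).astype(int)
    r1 = np.minimum(r0 + 1, H - 1); c1 = np.minimum(c0 + 1, W - 1)
    fr, fc = r - r0, c - c0
    top = img[r0, c0] * (1 - fc) + img[r0, c1] * fc
    bot = img[r1, c0] * (1 - fc) + img[r1, c1] * fc
    return top * (1 - fr) + bot * fr

def inverse_map(grid):
    # Hypothetical stand-in for the trained inverse network: a constant
    # half-pixel shift instead of a learned displacement field.
    return grid + 0.5

H, W = 16, 16
grid = np.stack(np.meshgrid(np.arange(H), np.arange(W),
                            indexing="ij"), axis=-1).astype(float)
target = np.random.default_rng(2).random((H, W))
prewarped = bilinear_sample(target, inverse_map(grid))  # image sent to projector
```

When the pre-warped image is projected onto the curved surface, the physical forward warp ideally cancels the applied inverse warp, yielding the undistorted target in the viewer's (camera's) frame.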
Key contributions and insights:
- A fully self‑supervised pipeline that eliminates the need for structured‑light calibration or external 3‑D reconstruction.
- Dual‑path MLPs that learn forward and inverse mappings simultaneously, reinforced by a robust cycle‑consistency constraint.
- Introduction of a gradient‑consistency loss that explicitly targets high‑frequency edge fidelity, making the system resilient to ambient illumination variations.
- Demonstrated superiority over traditional parametric fitting across multiple quantitative metrics (RMSE, PSNR, SSIM) in both synthetic and real‑world settings.
Limitations and future work: While the MLP‑based coordinate representation is flexible, extremely complex surfaces with abrupt depth changes may require deeper networks or additional hierarchical representations. Real‑time deployment would benefit from model compression or hardware acceleration of the spatial transformer. Extending the framework to dynamic surfaces or multi‑projector setups is a promising direction.
In summary, CSPR‑Net offers a practical, low‑cost, high‑precision solution for geometric distortion correction in curved‑surface projection scenarios, opening new possibilities for spatial augmented reality, immersive installations, and any application where accurate projection onto non‑planar geometry is essential.