RealSynCol: a high-fidelity synthetic colon dataset for 3D reconstruction applications
Deep learning has the potential to improve colonoscopy by enabling 3D reconstruction of the colon, providing a comprehensive view of mucosal surfaces and lesions, and facilitating the identification of unexplored areas. However, the development of robust methods is limited by the scarcity of large-scale ground truth data. We propose RealSynCol, a highly realistic synthetic dataset designed to replicate the endoscopic environment. Colon geometries extracted from 10 CT scans were imported into a virtual environment that closely mimics intraoperative conditions and rendered with realistic vascular textures. The resulting dataset comprises 28,130 frames, paired with ground truth depth maps, optical flow, 3D meshes, and camera trajectories. A benchmark study was conducted to evaluate the available synthetic colon datasets for the tasks of depth and pose estimation. Results demonstrate that the high realism and variability of RealSynCol significantly enhance generalization performance on clinical images, proving it to be a powerful tool for developing deep learning algorithms to support endoscopic diagnosis.
💡 Research Summary
The paper addresses a critical bottleneck in developing deep‑learning‑based 3‑D reconstruction tools for colonoscopy: the lack of large, accurately annotated datasets that reflect the visual complexity of real endoscopic procedures. To bridge this gap, the authors introduce RealSynCol, a high‑fidelity synthetic colon dataset that markedly improves realism, anatomical variability, and motion diversity compared with previously released synthetic collections.
Dataset creation begins with the selection of ten CT colonography scans from the ACRIN 6664 repository. The authors deliberately balance gender, age (mean 56 ± 7 years), and patient positioning (five prone, five supine) to capture a wide range of colon lengths, diameters, volumes, and tortuosity. After semi‑automatic segmentation of the lumen in 3D Slicer, manual refinement removes residual noise and ensures continuity of the anatomical model. A dense point‑cloud representation of the lumen and a precisely traced centerline are exported to Blender.
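As a rough illustration of the shape statistics mentioned above (colon length and tortuosity), the sketch below computes them from an exported centerline. It is a minimal example, assuming the centerline is an ordered array of 3-D points in millimetres; the file name and function are hypothetical, not part of the released pipeline.

```python
import numpy as np

def centerline_stats(points):
    """Length and tortuosity of a colon centerline given as an (N, 3) array
    of ordered 3-D points (e.g., exported from 3D Slicer in millimetres)."""
    points = np.asarray(points, dtype=float)
    segment_lengths = np.linalg.norm(np.diff(points, axis=0), axis=1)
    length = segment_lengths.sum()                           # total centerline length
    endpoint_dist = np.linalg.norm(points[-1] - points[0])   # straight-line end-to-end distance
    tortuosity = length / endpoint_dist                      # >= 1; higher means more winding
    return length, tortuosity

# Hypothetical usage with a centerline saved as a plain-text point list.
# centerline = np.loadtxt("case_01_centerline.txt")
# length_mm, tortuosity = centerline_stats(centerline)
```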
In Blender, a virtual endoscopic environment is constructed: a pinhole camera with realistic intrinsics, a single xenon‑type light source reproducing both white‑light and narrow‑band illumination, and high‑resolution texture maps derived from clinical photographs and vascular atlases. Crucially, the authors model specular reflections from fluid and mucus using physically‑based shaders, a feature largely absent from earlier datasets.
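A minimal Blender Python (bpy) sketch of this kind of scene setup is shown below: a pinhole camera whose focal length is derived from pixel-space intrinsics, with a spot light rigidly attached to it to mimic illumination from the endoscope tip. The intrinsic values, light parameters, and object names are illustrative assumptions, not the authors' actual configuration.

```python
import bpy

# Illustrative pinhole intrinsics (pixels); the actual RealSynCol values may differ.
IMG_W, IMG_H = 1024, 1024
FX = 768.0          # focal length in pixels (hypothetical)
SENSOR_MM = 36.0    # Blender's default sensor width

scene = bpy.context.scene
scene.render.resolution_x = IMG_W
scene.render.resolution_y = IMG_H

# Pinhole camera: convert the focal length from pixels to millimetres.
cam_data = bpy.data.cameras.new("EndoCam")
cam_data.sensor_width = SENSOR_MM
cam_data.lens = FX * SENSOR_MM / IMG_W
cam_obj = bpy.data.objects.new("EndoCam", cam_data)
scene.collection.objects.link(cam_obj)
scene.camera = cam_obj

# Single light source co-located with the camera, as on a real endoscope tip.
light_data = bpy.data.lights.new("EndoLight", type='SPOT')
light_data.energy = 5.0       # watts; tune to match endoscopic exposure
light_data.spot_size = 1.4    # cone angle in radians
light_obj = bpy.data.objects.new("EndoLight", light_data)
light_obj.parent = cam_obj    # light moves rigidly with the camera
scene.collection.objects.link(light_obj)
```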
Camera trajectories are generated not by naïvely following the centerline but by emulating real endoscopic motion. To obtain realistic motion statistics, an experienced endoscopist navigates an Olympus Exera II CL V‑180 endoscope inside a silicone colon phantom (Ecoflex 00‑50) equipped with an electromagnetic sensor. The recorded 6‑DOF trajectories capture the characteristic rapid direction changes, variable insertion speeds, and occasional backward motions seen in clinical practice. These statistics are then used to synthesize 20 distinct navigation sequences, each following one of the ten anatomical models, yielding a total of 28,130 RGB frames at 1024 × 1024 resolution. Every frame is paired with ground‑truth depth, optical flow, camera intrinsics, extrinsic pose, and the underlying 3‑D mesh.
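These per-frame annotations lend themselves to a simple record structure. Below is a hedged sketch of how one frame might be represented and loaded; the file names, directory layout, and archive keys are assumptions for illustration, not the dataset's actual format.

```python
from dataclasses import dataclass
from pathlib import Path

import numpy as np
from PIL import Image

@dataclass
class SynColFrame:
    rgb: np.ndarray         # (H, W, 3) uint8 rendered image
    depth: np.ndarray       # (H, W) float32 depth map
    flow: np.ndarray        # (H, W, 2) float32 optical flow to the next frame
    intrinsics: np.ndarray  # (3, 3) camera matrix K
    pose: np.ndarray        # (4, 4) camera-to-world extrinsic matrix

def load_frame(seq_dir: Path, idx: int) -> SynColFrame:
    """Hypothetical loader; adapt paths and keys to the released file layout."""
    rgb = np.array(Image.open(seq_dir / f"rgb_{idx:05d}.png"))
    gt = np.load(seq_dir / f"gt_{idx:05d}.npz")
    return SynColFrame(
        rgb=rgb,
        depth=gt["depth"].astype(np.float32),
        flow=gt["flow"].astype(np.float32),
        intrinsics=gt["K"].astype(np.float32),
        pose=gt["pose"].astype(np.float32),
    )
```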
The authors benchmark RealSynCol against six existing synthetic colon datasets (Mahmood et al., EndoSLAM, Synth‑Colon, EndoMapper, C3VD, SimCol3D). Identical depth‑and‑pose estimation networks (e.g., MonoViT, Lite‑Mono) are pre‑trained on each dataset and then evaluated on a held‑out set of real colonoscopy videos from multiple hospitals. Performance metrics include mean absolute depth error (MAE), root‑mean‑square error (RMSE), and average pose rotation and translation errors. Models pre‑trained on RealSynCol achieve a 33 % reduction in MAE (0.12 m vs. 0.18 m) and a 30 % reduction in rotation error (8.7° vs. 12.4°) compared with the best competing synthetic source. The improvements are especially pronounced in sequences containing specular highlights and abrupt motion, confirming that the added realism and motion variability effectively narrow the domain gap.
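For reference, these metrics follow standard formulations. The sketch below is a generic implementation of median-scaled depth MAE/RMSE and relative-pose rotation/translation error, not the authors' evaluation code; the median-scaling step is an assumption carried over from common monocular depth evaluation practice.

```python
import numpy as np

def depth_errors(pred, gt, valid=None):
    """MAE and RMSE after median scaling (usual monocular depth convention)."""
    if valid is None:
        valid = gt > 0
    pred, gt = pred[valid], gt[valid]
    pred = pred * np.median(gt) / np.median(pred)   # resolve the global scale ambiguity
    mae = np.mean(np.abs(pred - gt))
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    return mae, rmse

def pose_errors(T_pred, T_gt):
    """Rotation error (degrees) and translation error between two 4x4 relative poses."""
    R_err = T_pred[:3, :3].T @ T_gt[:3, :3]
    cos_angle = np.clip((np.trace(R_err) - 1.0) / 2.0, -1.0, 1.0)
    rot_deg = np.degrees(np.arccos(cos_angle))
    trans_err = np.linalg.norm(T_pred[:3, 3] - T_gt[:3, 3])
    return rot_deg, trans_err
```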
Limitations are acknowledged. The current version does not embed pathological lesions (polyps, inflammation) nor does it systematically vary fluid‑content levels, which may restrict direct training for lesion detection. Only monocular imaging is considered; extending to stereo or multispectral modalities remains future work. Moreover, while physically based rendering improves specular realism, the authors note that full light transport simulation could further enhance fidelity.
In conclusion, RealSynCol represents a substantial step forward for synthetic endoscopic data: it combines ten anatomically diverse colon models, high‑resolution textures, realistic lighting, and clinically derived motion patterns. The benchmark demonstrates that training on this dataset yields markedly better generalization to real colonoscopy images, facilitating the development of robust depth and pose estimation networks. The authors outline future directions, including adding pathological textures, integrating physics‑based ray tracing, coupling with real‑time SLAM pipelines, and exploring domain‑adaptation techniques such as CycleGAN or foundation‑model‑based style transfer. RealSynCol is released publicly, providing the community with a valuable resource to accelerate AI‑driven advances in colorectal cancer screening and endoscopic navigation.