Purpose: Kidney ureteroscopic navigation is challenging and has a steep learning curve. However, current clinical training has major deficiencies: it requires one-on-one feedback from experts and takes place in the operating room (OR). There is therefore a need for a phantom training system with automated feedback to greatly expand training opportunities. Methods: We propose a novel, purely ureteroscope-video-based scope localization framework that automatically identifies calyces missed by the trainee during a phantom kidney exploration. We use a slow, thorough, prior exploration video of the kidney to generate a reference reconstruction, which can then be used to localize any exploration video of the same phantom. Results: In 15 exploration videos, 69 of 74 calyces were correctly classified. We achieve a camera pose localization error below 4 mm. Given the reference reconstruction, the system takes 10 minutes to generate results for a typical exploration (1-2 minutes long). Conclusion: We demonstrate a novel camera localization framework that can provide accurate and automatic feedback for kidney phantom explorations. We show that it is a valid tool for enabling out-of-OR training without requiring supervision from an expert.
In ureteroscopic kidney stone removal operations, up to 20% of patients require a second operation due to missed stones [1]. This is partly due to the challenging nature of navigating the kidney collecting system [2], which requires precise endoscopic manipulation and knowledge of the kidney's anatomy to ensure that every kidney cavity, called a calyx, is fully visited.
Accurately navigating the kidney has a steep learning curve. Unfortunately, training opportunities are limited because the current training paradigm relies on one-on-one, apprenticeship-style guidance during operating room (OR) cases, which are subject to significant time and safety constraints [3]. Additionally, trainees often receive only limited verbal feedback at the end of a case [4], based primarily on an expert's subjective judgment. In this work, we aim to improve ureteroscopy training by introducing an automated, objective feedback mechanism for kidney phantom exploration.
In prior work [5], we introduced anatomically accurate phantoms that can be used for training outside the OR. However, training on these phantoms still requires one-on-one guidance from an expert. Although the electromagnetic tracking used in [5] can provide automated feedback on exploration completeness, it also increases the hardware cost and complexity. Automatic assessment of exploration completeness without additional hardware may facilitate the adoption of ureteroscopy training on phantoms.
Computer vision-based methods such as Simultaneous Localization and Mapping (SLAM) and Structure from Motion (SfM) show great promise in reconstructing the anatomical scene and localizing scope poses from video input alone [6,7]. However, their performance is sensitive to the quality of the exploration videos [8]. This leads to frequent failures when these algorithms are run on poor-quality exploration videos with heavy motion blur, such as those generated by trainees learning to use the endoscope.
In this paper, we propose a novel, purely ureteroscope-video-based framework that measures a trainee's exploration coverage of a kidney and provides automatic feedback by identifying the calyces the trainee missed. The framework requires no hardware beyond a consumer-grade computer. We overcome the challenges faced by existing ureteroscope-based reconstruction methods by using a slow and thorough reference exploration of the kidney phantom to build a reference 3D reconstruction. This reconstruction acts as a reusable prior that simplifies the pose localization problem for the challenging, normal-speed query exploration videos recorded by trainees, and it can be reused for any exploration video of the same phantom. The code base will be made publicly available upon acceptance of the paper.
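To make the coverage-feedback idea concrete, the sketch below shows one way the final assessment step could work once query frames have been localized against the reference reconstruction (e.g., with an hloc/COLMAP-style pipeline). This is a minimal, hypothetical illustration, not the paper's implementation: the 5 mm visit radius, the labeled calyx centers, and the assumption that failed localizations are simply skipped are all assumptions made here for clarity.

```python
# Hypothetical sketch (not the authors' code): given per-frame camera centers
# obtained by localizing a trainee video against the reference reconstruction,
# and labeled calyx centers from that reconstruction, report the missed calyces.
import numpy as np

def missed_calyces(frame_centers: np.ndarray,            # (N, 3) localized camera centers, in mm
                   calyx_centers: dict[str, np.ndarray],  # calyx name -> (3,) center, in mm
                   visit_radius_mm: float = 5.0           # assumed "visited" threshold
                   ) -> list[str]:
    """Return the names of calyces that no localized frame came close to."""
    missed = []
    for name, center in calyx_centers.items():
        # A calyx counts as visited if any localized frame passed within the radius.
        dists = np.linalg.norm(frame_centers - center, axis=1)
        if not np.any(dists < visit_radius_mm):
            missed.append(name)
    return missed
```

In practice the per-frame camera centers would come from the localization stage, and frames that fail to localize (e.g., due to motion blur) would contribute nothing to coverage.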
Simulation Models for Ureteroscopic Training: Traditionally, surgical training has followed an apprenticeship model, but simulation-based training is gaining traction because it adds training opportunities, provides objective performance metrics, and has demonstrated educational effectiveness [9]. Physical bench-top models exist for urology applications, such as the Uro-Scopic Trainer (Limbs & Things Ltd., Bristol, UK), the Scope Trainer (Mediskills Ltd., Edinburgh, UK), and the adult ureteroscopy trainer (Ideal Anatomic Modelling, MI). They mimic anatomical structure and texture with high fidelity. Their realism has been confirmed through clinical user studies, and users' scores on these systems correlate with their experience [10][11][12]. Despite their realism, none of these physical models provides automatic feedback on task performance.
3D Reconstruction in Surgical Applications: 3D reconstruction and camera position tracking are progressing quickly in medical research because of their ability to provide intraoperative information about the surgical scene and improve guidance precision [7,13]. In colonoscopy, such reconstruction algorithms have been developed to ensure full exploration coverage [13], similar to our goal in ureteroscopy. 3D reconstruction for ureteroscopy is relatively underexplored. Maza et al. [6] adapted a SLAM algorithm for ureteroscopy, adding image preprocessing and adopting a new image feature detection algorithm; nonetheless, their system can still lose track under rapid motions. Acar et al. [14] performed SfM reconstruction of patient and CT-rendered ureteroscopy videos using different methods, of which hloc [15] produced the most robust results. Overall, endoscopic images of poor quality, such as those with motion blur, present a significant challenge for these systems to function robustly [8].
Our framework consists of two stages (Fig. 1). In the first stage, a reference 3D reconstruction of the phantom's collecting system is generated with an SfM algorithm from two slow and thorough reference exploration videos [14,16]. The reference reconstruction consists of a 3D point cloud of the kidney collecting system and the endoscope poses of the reference video frames.
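As a rough, non-authoritative illustration of what this offline stage involves, the snippet below runs a standard incremental SfM reconstruction with pycolmap on frames extracted from the reference videos. pycolmap is only one possible SfM toolchain and is not necessarily the one used in this work; the directory paths are placeholders, and the API shown is that of recent pycolmap releases.

```python
# Illustrative only: build a sparse reference reconstruction (3D point cloud plus
# camera poses) from extracted reference-video frames with pycolmap's standard
# incremental SfM pipeline. Generic example, not the authors' exact pipeline.
from pathlib import Path
import pycolmap

image_dir = Path("reference_frames/")   # frames extracted from the reference exploration videos
output_dir = Path("reference_sfm/")
output_dir.mkdir(exist_ok=True)
database_path = output_dir / "database.db"

pycolmap.extract_features(database_path, image_dir)   # detect and describe keypoints
pycolmap.match_exhaustive(database_path)              # match features across all frame pairs
maps = pycolmap.incremental_mapping(database_path, image_dir, output_dir)
maps[0].write(output_dir)                             # save the sparse point cloud and camera poses
```

The resulting model is the reusable prior against which every later trainee (query) exploration of the same phantom is localized.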