Simultaneous Localization and Mapping (SLAM) from endoscopy videos can enable autonomous navigation, guidance to unsurveyed regions, blind-spot detection, and 3D visualization, which can significantly improve patient outcomes and the endoscopy experience for both physicians and patients. Existing dense SLAM algorithms often assume distant, static lighting and optimize scene geometry and camera parameters by minimizing a photometric rendering loss, often called photometric bundle adjustment. However, endoscopy videos exhibit dynamic near-field lighting: the light source is co-located with the camera and moves extremely close to the surface. In addition, the low-texture surfaces in endoscopy videos cause the photometric bundle adjustment of existing SLAM frameworks to perform poorly compared to indoor/outdoor scenes. To mitigate this problem, we introduce a Near-Field Lighting Bundle Adjustment loss (NFL-BA) that explicitly models near-field lighting as part of the bundle adjustment loss, enabling better performance on low-texture surfaces. Our proposed NFL-BA can be applied to any neural-rendering-based SLAM framework. We show that replacing the traditional photometric bundle adjustment loss with NFL-BA yields improvements for both a neural implicit SLAM and a 3D Gaussian Splatting (3DGS) SLAM. In addition to producing state-of-the-art tracking and mapping results on the C3VD colonoscopy dataset, we also show improvement on real colonoscopy videos.
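For intuition, the sketch below shows what a near-field-lighting photometric term could look like. It is not the paper's exact formulation: it assumes a Lambertian surface and a single point light co-located with the camera center, so predicted intensity falls off with inverse-square distance and the cosine between the surface normal and the direction back toward the camera. The function name, tensor layout, and `light_intensity` parameter are all illustrative.

```python
import torch
import torch.nn.functional as F

def near_field_photometric_loss(points_cam, normals_cam, albedo, image,
                                light_intensity=1.0):
    """Sketch of a near-field-lighting photometric BA term (illustrative).

    points_cam  : (N, 3) per-pixel 3D points in the camera frame
    normals_cam : (N, 3) unit surface normals in the camera frame
    albedo      : (N,)   per-point reflectance from the scene representation
    image       : (N,)   observed pixel intensities
    """
    # Squared distance from the co-located light (camera origin) to each point.
    d2 = (points_cam ** 2).sum(dim=-1).clamp_min(1e-8)
    # Unit direction from each surface point back toward the camera/light.
    light_dir = -points_cam / d2.sqrt().unsqueeze(-1)
    # Lambertian shading: cosine between normal and light direction, clamped.
    cos_theta = (normals_cam * light_dir).sum(dim=-1).clamp_min(0.0)
    # Inverse-square falloff modulates the predicted intensity.
    predicted = light_intensity * albedo * cos_theta / d2
    return F.l1_loss(predicted, image)
```

In a neural-rendering SLAM framework, a term of this form would stand in for the standard photometric rendering loss inside bundle adjustment, with gradients flowing to both the scene representation and the camera pose.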
Please allow a moment for point clouds to load after changing sequences.
Controls: Click and drag inside the viewer to rotate. Use the mouse wheel to zoom. Use the arrow keys (↑ ↓ ← →) to move, 'a'/'d' to yaw, 'w'/'s' to pitch, 'q'/'e' to roll. Click here to reset the view to default.
This work is supported by National Institutes of Health (NIH) project #1R21EB035832, "Next-gen 3D Modeling of Endoscopy Videos". We also thank Prof. Stephen M. Pizer and Dr. Sarah McGill for helpful discussions during the project.
We used the project page of Fuzzy Metaballs as a template.