InFusionSurf: Refining Neural RGB-D Surface Reconstruction Using Per-Frame Intrinsic Refinement and TSDF Fusion Prior Learning

Abstract

We introduce InFusionSurf, an enhancement for neural radiance field (NeRF) frameworks that reconstruct 3D surfaces from RGB-D video frames. Building on previous methods that employ feature encoding to speed up optimization, we further improve reconstruction quality by refining depth information, with minimal impact on optimization time. InFusionSurf addresses the blur that camera motion induces in each depth frame through a per-frame intrinsic refinement scheme. It also uses truncated signed distance field (TSDF) Fusion, a classical real-time 3D surface reconstruction method, to pretrain the feature grid, which improves both reconstruction detail and training speed. Quantitative and qualitative comparisons show that InFusionSurf reconstructs scenes with high accuracy while maintaining optimization efficiency. An ablation study further validates the effectiveness of the intrinsic refinement and the TSDF Fusion-based pretraining.
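
For readers unfamiliar with TSDF Fusion, the sketch below illustrates the classical idea in its simplest form: each depth observation is converted to a truncated signed distance and merged into a grid by a running weighted average. It is a minimal 1D illustration along a single camera ray, not the pretraining pipeline used in InFusionSurf. The function names (`tsdf_update`, `fuse`, `zero_crossing`), the unit per-observation weight, and the sample values are all assumptions made for this example.

```python
def tsdf_update(tsdf, weights, voxel_depths, observed_depth, trunc):
    """Fuse one depth observation into a 1D TSDF sampled along a camera ray."""
    for i, z in enumerate(voxel_depths):
        sdf = observed_depth - z      # positive in front of the surface
        if sdf < -trunc:              # far behind the surface: occluded, skip
            continue
        d = min(1.0, sdf / trunc)     # truncate and normalise to [-1, 1]
        w = weights[i]
        # Running weighted average with an assumed weight of 1 per observation.
        tsdf[i] = (w * tsdf[i] + d) / (w + 1.0)
        weights[i] = w + 1.0
    return tsdf, weights


def fuse(voxel_depths, depth_observations, trunc):
    """Fuse several depth observations of the same ray into one TSDF."""
    tsdf = [1.0] * len(voxel_depths)     # start as free space
    weights = [0.0] * len(voxel_depths)  # zero weight means "never observed"
    for depth in depth_observations:
        tsdf_update(tsdf, weights, voxel_depths, depth, trunc)
    return tsdf, weights


def zero_crossing(tsdf, voxel_depths):
    """Linearly interpolate the depth at which the TSDF changes sign."""
    for i in range(len(tsdf) - 1):
        a, b = tsdf[i], tsdf[i + 1]
        if a > 0.0 >= b:
            t = a / (a - b)
            return voxel_depths[i] + t * (voxel_depths[i + 1] - voxel_depths[i])
    return None


# Hypothetical example: three noisy depth readings of a surface near 1.0 m.
voxel_depths = [i * 0.1 for i in range(21)]   # samples from 0.0 m to 2.0 m
tsdf, weights = fuse(voxel_depths, [0.98, 1.02, 1.00], trunc=0.3)
surface = zero_crossing(tsdf, voxel_depths)
```

Averaging over frames suppresses per-frame depth noise, and the surface is recovered as the zero crossing of the fused field. A fused grid of this kind is the sort of prior that a feature grid could be pretrained against.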

Results

We compare our method with GO-Surf and Neural RGB-D on ScanNet at different points during training. With a shorter training time, InFusionSurf recovers high-frequency details that GO-Surf overlooks and produces far fewer erroneous surfaces. With a longer training time, InFusionSurf achieves better results than Neural RGB-D, in some cases recovering structures that Neural RGB-D missed.

BibTeX

@INPROCEEDINGS{lee2024infusionsurf,
  author={Lee, Seunghwan and Park, Gwanmo and Son, Hyewon and Ryu, Jiwon and Chae, Han Joo},
  booktitle={2024 IEEE International Conference on Multimedia and Expo (ICME)},
  title={InFusionSurf: Refining Neural RGB-D Surface Reconstruction Using Per-Frame Intrinsic Refinement and TSDF Fusion Prior Learning},
  year={2024},
  volume={},
  number={},
  pages={1-6},
  keywords={Training;Surface reconstruction;Three-dimensional displays;Refining;Reconstruction algorithms;Neural radiance field;Cameras;RGB-D Surface Reconstruction;TSDF Fusion;Neural Radiance Field;Camera Motion Blur},
  doi={10.1109/ICME57554.2024.10687901}
}