
Beyond SuGaR! Open source! GOF: breaking the barrier between 3D GS and surface reconstruction!

Author: 3D Vision Workshop

Source: 3D Vision Workshop

Add the assistant (dddvision) with a note of your research direction + school/company + nickname to be added to the group. Industry subdivisions are listed at the end of the article.

In a nutshell: Gaussian Opacity Fields (GOF) enable geometry extraction directly from 3D Gaussians by identifying the level set of the opacity field. The paper's regularization improves surface reconstruction, and marching tetrahedra yield compact, scene-adaptive mesh extraction.

Let's read about this work together~

Title: Gaussian Opacity Fields: Efficient and Compact Surface Reconstruction in Unbounded Scenes

Authors: Zehao Yu, Torsten Sattler, Andreas Geiger

Institutions: University of Tübingen, Tübingen AI Center, Czech Technical University in Prague

Original link: https://arxiv.org/abs/2404.10772

Code link: https://github.com/autonomousvision/gaussian-opacity-fields

Official Website: https://niujinshuchong.github.io/gaussian-opacity-fields/

Recently, 3D Gaussian Splatting (3DGS) has shown impressive novel view synthesis results while rendering high-resolution images in real time. However, surface reconstruction from 3D Gaussians remains challenging due to their explicit and discontinuous nature. In this work, we propose Gaussian Opacity Fields (GOF), a novel approach for efficient, high-quality, and compact surface reconstruction in unbounded scenes. Our GOF is derived from ray-tracing-based volume rendering of 3D Gaussians, enabling geometry extraction directly from the 3D Gaussians by identifying its level set, without resorting to the Poisson reconstruction or TSDF fusion of previous work. We approximate the surface normal of a Gaussian as the normal of the ray-Gaussian intersection plane, enabling regularization that significantly enhances the geometry. Furthermore, we develop an efficient geometry extraction method utilizing marching tetrahedra, in which the tetrahedral grid is induced from the 3D Gaussians and therefore adapts to the scene's complexity. Our evaluation shows that GOF surpasses existing 3DGS-based methods in both surface reconstruction and novel view synthesis, and it compares favorably to, or even outperforms, neural implicit methods in both quality and speed.

TSDF fusion on the rendered depth maps of the recent Mip-Splatting model yields noisy and incomplete meshes, whereas the meshes extracted by GOF are complete, smooth, and detailed. This is achieved by establishing a Gaussian opacity field from the 3D Gaussians and extracting geometry by directly identifying its level set. In addition, GOF generates a tetrahedral grid from the 3D Gaussians and utilizes marching tetrahedra to extract a compact and adaptive mesh.


2DGS cannot reconstruct background geometry, whereas GOF can reconstruct more detailed geometry for foreground objects and backgrounds.


Compared to SuGaR, GOF can reconstruct more detailed and smooth geometry for both foreground objects and backgrounds.


This article proposes Gaussian Opacity Fields (GOF), a new method for efficient, high-quality, and compact surface reconstruction directly from 3D Gaussians. The key insights are threefold:

First, a Gaussian opacity field is established from a set of 3D Gaussians. Unlike projection-based volume rendering, GOF utilizes an explicit ray-Gaussian intersection to determine each Gaussian's contribution during volume rendering. This ray-tracing-inspired formulation makes it possible to evaluate the opacity value at any point along a ray. The opacity of an arbitrary 3D point is then defined as the minimal opacity over all training views in which the point is observed. GOF is consistent with volume rendering during training and extracts surfaces from the 3D Gaussians by directly identifying the level set, without the need for Poisson reconstruction or TSDF fusion.
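The explicit ray-Gaussian intersection above has a closed form: the Gaussian's exponent is quadratic along a ray, so its peak depth can be solved directly. The following NumPy sketch illustrates that idea under stated assumptions; the function names and the simplified front-to-back accumulation are illustrative, not the paper's exact implementation.

```python
import numpy as np

def ray_gaussian_max_response(o, d, mu, Sigma_inv, opacity):
    """Peak contribution of one 3D Gaussian along the ray x(t) = o + t*d.

    The Gaussian's exponent is quadratic in t, so its maximum along the
    ray has the closed form t* = d^T S (mu - o) / (d^T S d), with S = Sigma^-1.
    """
    t_star = (d @ Sigma_inv @ (mu - o)) / (d @ Sigma_inv @ d)
    x = o + t_star * d                       # 3D point of maximal response
    g = np.exp(-0.5 * (x - mu) @ Sigma_inv @ (x - mu))
    return t_star, opacity * g               # peak alpha contribution

def opacity_along_ray(o, d, gaussians, t_query):
    """Accumulated opacity at depth t_query, using each Gaussian's peak
    response: O(t) = 1 - prod(1 - alpha_i) over peaks in front of t_query."""
    transmittance = 1.0
    for mu, Sigma_inv, op in gaussians:
        t_star, alpha = ray_gaussian_max_response(o, d, mu, Sigma_inv, op)
        if 0.0 < t_star <= t_query:
            transmittance *= 1.0 - alpha
    return 1.0 - transmittance
```

Taking the minimum of `opacity_along_ray` over all training views that observe a point would then give the opacity-field value GOF thresholds for surface extraction.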

Second, the surface normal of a 3D Gaussian is approximated as the normal of the ray-Gaussian intersection plane. This allows regularization to be integrated during training, enhancing the fidelity of the geometric reconstruction.
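One simple way to realize such a normal (a sketch, not necessarily the paper's exact derivation): the level sets of a 3D Gaussian are ellipsoids, and their outward normal at a point is the normalized gradient direction of the density.

```python
import numpy as np

def intersection_plane_normal(x, mu, Sigma_inv):
    """Approximate surface normal at the ray-Gaussian intersection point x.

    The gradient of the Gaussian density at x is proportional to
    -Sigma_inv @ (x - mu); its direction is normal to the ellipsoidal
    level set through x, which regularizers can compare against normals
    derived from rendered depth (depth-normal consistency).
    """
    n = Sigma_inv @ (x - mu)
    return n / np.linalg.norm(n)
```

For an isotropic Gaussian this reduces to the radial direction from the center, as expected.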

Third, an efficient surface extraction technique based on a tetrahedral grid is proposed. Recognizing that the 3D Gaussians are an effective indicator of potential surface locations, opacity evaluation is focused on these regions. Specifically, the centers and the corners of the 3D bounding boxes around the 3D Gaussians are used as the vertex set of the tetrahedral grid. After evaluating the opacity at the tetrahedra's vertices, the marching tetrahedra algorithm is used to extract a triangular mesh. Since the opacity field violates the assumption that opacity varies linearly along an edge, a binary search is further employed to accurately locate the level set of the opacity field, which greatly improves the quality of the resulting surfaces.
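The binary-search step can be sketched in a few lines. Standard marching tetrahedra places the mesh vertex on an edge by linear interpolation of the two endpoint values; because the opacity field is nonlinear along an edge, a few bisection steps on the sign of (opacity - level) give a more accurate crossing. The function below is an illustrative sketch (the field evaluator `opacity_fn` and the iteration count are assumptions, not the paper's exact settings).

```python
import numpy as np

def find_level_crossing(p0, p1, opacity_fn, level=0.5, iters=8):
    """Locate the level-set crossing on the tetrahedron edge (p0, p1)
    by bisection, assuming opacity_fn(p0) and opacity_fn(p1) straddle
    `level`. Each iteration halves the bracketing interval."""
    a, b = np.asarray(p0, float), np.asarray(p1, float)
    fa = opacity_fn(a) - level
    for _ in range(iters):
        m = 0.5 * (a + b)
        fm = opacity_fn(m) - level
        if fa * fm <= 0.0:   # crossing lies in [a, m]
            b = m
        else:                # crossing lies in [m, b]
            a, fa = m, fm
    return 0.5 * (a + b)
```

With a nonlinear field such as f(p) = p_x^2 and level 0.25, linear interpolation between f=0 and f=1 would place the vertex at x = 0.25, while bisection converges to the true crossing at x = 0.5.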

Given multiple posed and calibrated images, GOF's goal is to efficiently reconstruct the 3D scene while allowing detailed, compact surface extraction and realistic novel view synthesis. To this end, the Gaussian Opacity Field (GOF) is first constructed from the 3D Gaussians, and geometry extraction is realized directly by identifying its level set, eliminating the need for traditional Poisson reconstruction or TSDF fusion. Next, two effective regularizers are extended from 2DGS to 3D Gaussians, improving reconstruction quality. Finally, a new tetrahedra-based method is proposed to extract detailed and compact meshes from the GOF via marching tetrahedra.

Surface reconstruction on the Tanks and Temples dataset. Normal maps rendered from the extracted meshes are shown alongside the GT images for reference. SuGaR's meshes are noisy, and 2DGS cannot reconstruct the background regions. In contrast, GOF reconstructs detailed surfaces for both foreground objects and background regions.


SOTA implicit and explicit surface reconstruction methods were first compared on the Tanks and Temples dataset. Since the ground-truth point clouds do not contain background regions, only foreground objects are reconstructed and evaluated. As shown in Table 1, GOF is highly competitive with leading implicit methods while being much faster to optimize. Notably, most implicit methods only reconstruct foreground objects, whereas GOF also reconstructs a detailed mesh for the background regions, which is important for real-time mesh-based rendering. In addition, although GOF is slightly slower than 3DGS and 2DGS due to the ray-Gaussian intersection computation, it significantly outperforms all SOTA 3DGS-based methods in reconstruction quality.


A comparison with SOTA surface reconstruction methods was also performed on the DTU dataset. As shown in Table 2, GOF outperforms all 3DGS-based methods. Despite a performance gap with leading implicit reconstruction methods, GOF optimizes much faster. The gap is attributed to the strong view-dependent appearance of the DTU dataset; better view-dependent appearance modeling or coarse-to-fine training strategies may further improve the reconstruction results.


To evaluate GOF's NVS results, further comparisons were made with SOTA NVS methods on the Mip-NeRF 360 dataset; the quantitative results are shown in Table 3. GOF not only slightly outperforms all other 3DGS-based methods in PSNR, but in outdoor scenes it significantly outperforms all other methods in LPIPS. The main improvement comes from the improved densification strategy.


Qualitative comparison of extracted meshes. As on the Tanks and Temples dataset, GOF reconstructs detailed surfaces for foreground objects and background regions, while SuGaR's meshes are noisy and less detailed, and 2DGS cannot extract meshes for the background regions.


This paper proposes Gaussian Opacity Fields (GOF), a new approach for efficient, high-quality, and compact surface reconstruction in unbounded scenes. GOF is derived from ray-tracing-based volume rendering of 3D Gaussians, maintaining consistency with RGB rendering. GOF makes it possible to extract geometry directly from the 3D Gaussians by identifying the field's level set, without Poisson reconstruction or TSDF fusion. The surface normal of a Gaussian is approximated as the normal of the ray-Gaussian intersection plane, and a depth-normal consistency regularization is applied to enhance the geometric reconstruction. In addition, an efficient and compact mesh extraction method using marching tetrahedra is proposed, in which the tetrahedral grid is induced from the 3D Gaussians. The evaluation shows that GOF is superior to existing methods in surface reconstruction and novel view synthesis in unbounded scenes.

Readers who are interested in more experimental results and details of the article can read the original paper~

This article is for academic sharing only; if there is any infringement, please contact us to delete it.

3D Vision Workshop Exchange Group

At present, we have established multiple communities in the direction of 3D vision, including 2D computer vision, large models, industrial 3D vision, SLAM, autonomous driving, 3D reconstruction, drones, etc., and the subdivisions include:

2D computer vision: image classification/segmentation, object detection/tracking, medical imaging, GAN, OCR, 2D defect detection, remote sensing and mapping, super-resolution, face detection, behavior recognition, model quantization/pruning, transfer learning, human pose estimation, etc.

Large models: NLP, CV, ASR, generative adversarial models, reinforcement learning models, dialogue models, etc

Industrial 3D vision: camera calibration, stereo matching, 3D point cloud, structured light, robotic arm grasping, defect detection, 6D pose estimation, phase deflection, Halcon, photogrammetry, array camera, photometric stereo vision, etc.

SLAM: visual SLAM, laser SLAM, semantic SLAM, filtering algorithms, multi-sensor fusion algorithms, multi-sensor calibration, dynamic SLAM, MOT SLAM, NeRF SLAM, robot navigation, etc.

Autonomous driving: depth estimation, Transformer, millimeter-wave radar, lidar, visual camera sensors, multi-sensor calibration, multi-sensor fusion, 3D object detection, path planning, trajectory prediction, 3D point cloud segmentation, model deployment, lane line detection, Occupancy, object tracking, a general autonomous driving group, etc.

3D reconstruction: 3DGS, NeRF, multi-view geometry, OpenMVS, MVSNet, COLMAP, texture mapping, etc.

Unmanned aerial vehicles: quadrotor modeling, unmanned aerial vehicle flight control, etc

In addition to these, there are also exchange groups such as job search, hardware selection, visual product landing, the latest papers, the latest 3D vision products, and 3D vision industry news

Add the assistant (dddvision) with a note of your research direction + school/company + nickname (e.g., 3D point cloud + Tsinghua + Little Strawberry) to be added to the group.

3D Vision Workshop Knowledge Planet

3DGS, NeRF, structured light, phase measuring deflectometry, robotic arm grasping, point cloud practice, Open3D, defect detection, BEV perception, Occupancy, Transformer, model deployment, 3D object detection, depth estimation, multi-sensor calibration, planning and control, UAV simulation, 3D vision C++, 3D vision Python, dToF, camera calibration, ROS2, robot control and planning, LeGo-LOAM, multimodal fusion SLAM, LOAM-SLAM, indoor/outdoor SLAM, VINS-Fusion, ORB-SLAM3, MVSNet 3D reconstruction, COLMAP, line/area structured light, hardware structured-light scanners, UAVs, etc.