
Beyond 4DGaussians | Efficient 3D dynamic scene reconstruction method based on Gaussian embedding

Author: 3D Vision Workshop

Editor: Computer Vision Workshop



Title: Per-Gaussian Embedding-Based Deformation for Deformable 3D Gaussian Splatting

Authors: Jeongmin Bae et al.

Homepage: https://jeongminb.github.io/e-d3dgs/

1. Introduction

This paper introduces an efficient 3D dynamic scene reconstruction method based on per-Gaussian embeddings. By assigning a latent embedding vector to each Gaussian and combining it with a temporal embedding, the method achieves high-quality prediction of each Gaussian's deformation parameters, while decomposing the deformation into coarse and fine components to model motion at different speeds. In addition, an efficient training strategy is proposed, which improves both convergence speed and reconstruction quality.


2. Method

[Figure 2: Overview of the proposed framework]

Figure 2 illustrates the framework proposed by the authors. Existing field-based methods suffer from entanglement between nearby Gaussians, because deformation parameters are predicted from Gaussian coordinates. To address this problem, the authors propose the following:

2.1. Embedding-based Gaussian deformation

The authors assign a 32-dimensional embedding vector zg to each Gaussian and a 256-dimensional temporal embedding vector zt to each frame. The deformation function Fθ takes these two embedding vectors as input and outputs each Gaussian's deformation parameters at the current frame. Unlike coordinate-based deformation fields, this design models each Gaussian's deformation independently, avoiding unwanted interference between adjacent Gaussians.
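A minimal PyTorch sketch of this idea follows. The layer widths, depth, and the exact set of predicted offsets (position, rotation, scale) are illustrative assumptions, not the paper's actual architecture:

```python
import torch
import torch.nn as nn

class PerGaussianDeformation(nn.Module):
    """Sketch: predict per-Gaussian deformation from a per-Gaussian
    embedding z_g (32-D) and a per-frame temporal embedding z_t (256-D)."""
    def __init__(self, g_dim=32, t_dim=256, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(g_dim + t_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            # 3 (position) + 4 (rotation) + 3 (scale) offsets, for example
            nn.Linear(hidden, 10),
        )

    def forward(self, z_g, z_t):
        # z_g: (N, 32), one embedding per Gaussian; z_t: (256,) for the frame
        z_t = z_t.expand(z_g.shape[0], -1)  # broadcast over all Gaussians
        out = self.mlp(torch.cat([z_g, z_t], dim=-1))
        return out.split([3, 4, 3], dim=-1)  # d_xyz, d_rot, d_scale

# Usage: N Gaussians, one frame; both embeddings are learned jointly
N = 1000
z_g = nn.Parameter(torch.randn(N, 32) * 0.01)
z_t = nn.Parameter(torch.randn(256) * 0.01)
d_xyz, d_rot, d_scale = PerGaussianDeformation()(z_g, z_t)
```

Because the network is conditioned on zg rather than on the Gaussian's coordinates, two Gaussians that happen to be close in space can still move independently.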

2.2. Coarse and fine deformation

The authors decompose the temporal variation into coarse and fine deformations, which model the slow and fast motion in the scene, respectively. The coarse deformation uses a temporally downsampled temporal embedding, while the fine deformation uses the full-resolution temporal embedding. This decomposition allows motion at different speeds to be modeled in greater detail.
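One plausible way to realize the two temporal resolutions is to keep a full-resolution per-frame embedding table alongside a coarser table that is linearly interpolated over time. The table sizes, the 1/10 downsampling factor, and the interpolation scheme below are assumptions for illustration:

```python
import torch

num_frames, t_dim = 300, 256
fine_table = torch.nn.Parameter(torch.randn(num_frames, t_dim) * 0.01)
# Coarse table: temporally downsampled (here 1/10 of the frame count)
coarse_table = torch.nn.Parameter(torch.randn(num_frames // 10, t_dim) * 0.01)

def temporal_embeddings(frame_idx):
    """Return (coarse, fine) temporal embeddings for a frame index."""
    z_fine = fine_table[frame_idx]
    # Linearly interpolate the coarse table at the fractional position
    pos = frame_idx / (num_frames - 1) * (coarse_table.shape[0] - 1)
    lo = int(pos)
    hi = min(lo + 1, coarse_table.shape[0] - 1)
    w = pos - lo
    z_coarse = (1 - w) * coarse_table[lo] + w * coarse_table[hi]
    return z_coarse, z_fine

z_coarse, z_fine = temporal_embeddings(frame_idx=42)
# Each embedding would feed its own deformation head, with the coarse and
# fine offsets combined (e.g., summed) to give the final deformation.
```

The coarse embedding changes slowly across frames, so it naturally captures slow motion, while the per-frame fine embedding is free to model rapid changes.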

2.3. Efficient training strategy

The authors propose several efficient training strategies: uniform camera sampling based on inter-camera distance, frame sampling weighted by rendering error, and periodically promoting Gaussian densification with a multi-view DSSIM loss. These strategies accelerate convergence and yield higher-quality reconstructions.
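As an illustration of error-based frame sampling, the sketch below keeps a running per-frame error and samples hard frames more often. The exponential-moving-average update and the temperature parameter are assumptions, not the paper's exact scheme:

```python
import numpy as np

num_frames = 300
# Running per-frame rendering error, initialized uniform
frame_errors = np.ones(num_frames)

def sample_frame(rng, temperature=1.0):
    """Sample a training frame with probability proportional to its
    recent rendering error, so hard frames are revisited more often."""
    p = frame_errors ** temperature
    p /= p.sum()
    return rng.choice(num_frames, p=p)

rng = np.random.default_rng(0)
frame = sample_frame(rng)
# ... render `frame` and compute its loss, then update the running error:
loss_value = 0.05  # placeholder for the actual rendering loss
frame_errors[frame] = 0.9 * frame_errors[frame] + 0.1 * loss_value
```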


In summary, the authors propose an embedding-based Gaussian deformation method that models each Gaussian's deformation independently by assigning it its own embedding vector and combining it with a temporal embedding. The temporal variation is further decomposed into coarse and fine deformations, and efficient training strategies yield higher-quality dynamic scene reconstructions.

3. Experiments

Baseline methods: The authors selected a number of baselines, including NeRF-based methods (e.g., DyNeRF, NeRFPlayer), Gaussian-based methods (e.g., 4DGaussians, 4DGS, D3DGS), and voxel-based methods (e.g., MixVoxels, K-Planes).

Evaluation metrics: The authors used PSNR, SSIM, and LPIPS to evaluate the quality of rendered images. PSNR quantifies per-pixel color error, SSIM measures the structural similarity between the rendered image and the ground truth, and LPIPS measures higher-level perceptual similarity using deep features.
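A minimal sketch of computing these metrics with common open-source packages (assuming scikit-image and the lpips package; the authors' exact evaluation code may differ):

```python
import numpy as np
import torch
import lpips
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(rendered: np.ndarray, gt: np.ndarray, lpips_fn) -> dict:
    """rendered, gt: float arrays in [0, 1] with shape (H, W, 3)."""
    psnr = peak_signal_noise_ratio(gt, rendered, data_range=1.0)
    ssim = structural_similarity(gt, rendered, channel_axis=-1, data_range=1.0)
    # LPIPS expects NCHW tensors scaled to [-1, 1]
    to_t = lambda x: torch.from_numpy(x).permute(2, 0, 1)[None] * 2 - 1
    lp = lpips_fn(to_t(rendered), to_t(gt)).item()
    return {"PSNR": psnr, "SSIM": ssim, "LPIPS": lp}

lpips_fn = lpips.LPIPS(net="alex")  # AlexNet backbone, common for evaluation
img_a = np.random.rand(256, 256, 3).astype(np.float32)  # stand-in images
img_b = np.random.rand(256, 256, 3).astype(np.float32)
print(evaluate(img_a, img_b, lpips_fn))
```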

Experimental datasets: The authors conducted experiments on three datasets, Neural 3D Video, Technicolor Light Field, and HyperNeRF, all of which contain complex dynamic scenes.

Quantitative and qualitative comparison: Through qualitative and quantitative comparisons, the authors demonstrate the advantages of the proposed method in capturing dynamic-region details, reconstruction quality, and computational efficiency.

Ablation studies: The authors conducted ablation experiments to verify the effectiveness of the coarse-and-fine deformation and the efficient training strategies.


4. Summary

We propose a per-Gaussian deformation that takes per-Gaussian embeddings as input, instead of the coordinate-based deformation field typically used in deformable 3DGS methods, to achieve high performance. We further improve quality by decomposing dynamic changes into coarse and fine deformations. However, our approach has limitations. When there is significant movement between frames, the dynamic areas of the rendered result tend to become blurry, as in other baselines (Figure 10). This is expected to be addressed through additional supervision, such as correspondences in rapidly changing dynamic areas. In addition, our method tends to render more slowly than existing Gaussian splatting methods. As mentioned earlier, this could be improved by skipping deformation prediction for static areas and pruning unnecessary points after training.


Here we recommend the new course "New SLAM Algorithm Based on NeRF/Gaussian 3D Reconstruction", jointly launched by 3D Vision Workshop and Gigi.

About the Speaker


Course outline


Course Highlights:

  • This course covers both theory and code implementation, taking you from scratch through the principles of NeRF/Gaussian-based SLAM, paper reading, and code walkthroughs.
  • On the theory side, the course builds from linear algebra up to classical computer graphics, so you understand the theoretical foundations and origins of modern 3D reconstruction.
  • On the code side, a series of exercises guides you through reproducing computer graphics and NeRF-related work.

What you will gain

  • Get started in the field of NeRF/Gaussian-based SLAM
  • Learn how to quickly grasp a paper's key points and innovations
  • Learn how to quickly run a paper's code and understand its ideas through the code
  • Parse NeRF code line by line, grasp every implementation detail, and reproduce and improve it by hand

Course requirements

  • System requirements: Linux
  • Programming language: Python
  • Prerequisites: basic Python and PyTorch

Who this course is for

  • Newcomers who don't know where to start with a new paper's open-source code
  • Beginners in SLAM localization and mapping or NeRF 3D reconstruction
  • Practitioners working on 3D reconstruction who want a reference
  • First-time readers of NeRF papers
  • Students interested in SLAM and NeRF

Start time

Saturday, February 24, 2024 at 8 p.m., with one chapter updated weekly.

Course Q&A

Course Q&A takes place mainly in the course's dedicated goose circle group, where students can post questions at any time during their studies.

▲ Add the assistant: cv3d007 to learn more

