
ICRA 2024 | Oxford proposes SiLVR: fusing NeRF with LiDAR SLAM for large-scale scene reconstruction

Source: 3D Vision Workshop

Author: Yifu Tao (posted with the first author's authorization) | Editor: 3DCV


What does this paper do?

As an emerging 3D reconstruction paradigm, NeRF offers strong novel-view rendering capability, but its geometric accuracy is limited on texture-poor surfaces and when multi-view constraints are weak.

This paper proposes SiLVR, a large-scale 3D reconstruction system that fuses LiDAR with multiple fisheye cameras. Built on NeRF, the system introduces depth and surface-normal constraints derived from LiDAR to obtain accurate, smooth 3D models even on low-texture surfaces. The sensor suite is carried by UAVs and quadruped robots; the system combines the multi-fisheye cameras with LiDAR SLAM, using the online SLAM trajectory for bundle adjustment (BA) and submap generation, to reconstruct large outdoor scenes.
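To make the submapping idea concrete, below is a minimal sketch (not the authors' code) that splits a SLAM trajectory into overlapping submaps by travelled distance; the 50 m submap length and 10 m overlap are hypothetical parameters, and the paper's actual partitioning criterion may differ.

```python
import numpy as np

def split_trajectory_into_submaps(positions, submap_length=50.0, overlap=10.0):
    """Split a SLAM trajectory into overlapping submaps by travelled distance.

    positions: (N, 3) keyframe positions in metres, in trajectory order.
    Returns a list of index arrays, one per submap.
    """
    # Cumulative distance travelled along the trajectory.
    step = np.linalg.norm(np.diff(positions, axis=0), axis=1)
    dist = np.concatenate([[0.0], np.cumsum(step)])

    submaps, start = [], 0.0
    while start < dist[-1]:
        mask = (dist >= start) & (dist < start + submap_length)
        if mask.any():
            submaps.append(np.flatnonzero(mask))
        # Advance by (length - overlap) so neighbouring submaps share keyframes.
        start += submap_length - overlap
    return submaps

# Example: a 600 m straight-line trajectory with a keyframe every metre yields
# roughly 15 overlapping submaps, each of which can be trained as its own NeRF.
traj = np.stack([np.arange(600.0), np.zeros(600), np.zeros(600)], axis=1)
print(len(split_trajectory_into_submaps(traj)))
```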


Paper information

Title: SiLVR: Scalable Lidar-Visual Reconstruction with Neural Radiance Fields for Robotic Inspection

Authors: Yifu Tao, Yash Bhalgat, Lanke Frank Tarimo Fu, Matias Mattamala, Nived Chebrolu, Maurice Fallon

Institution: Oxford Robotics Institute

Original link: https://arxiv.org/abs/2403.06877

Project Homepage: https://ori-drs.github.io/projects/silvr/

Abstract

We present a neural-field-based large-scale reconstruction system that fuses LiDAR and visual data to generate high-quality reconstructions with accurate geometry and realistic texture. The system adapts the state-of-the-art Neural Radiance Field (NeRF) representation and adds strong geometric constraints by incorporating depth and surface normals derived from LiDAR. We exploit the trajectory from a real-time LiDAR SLAM system to bootstrap the Structure-from-Motion (SfM) stage, which significantly reduces computation time and provides the metric scale that is critical for the LiDAR depth loss. We use submapping to scale the system to large environments. We demonstrate the system with a multi-camera, LiDAR sensor suite on several robot platforms: a quadruped robot, a handheld device carried for more than 600 m while scanning building scenes, and a UAV surveying a multi-storey mock disaster site.
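To illustrate how a metrically scaled SLAM trajectory can bootstrap SfM, the sketch below converts body poses from LiDAR SLAM into camera poses and writes them in COLMAP's images.txt text format, so the SfM stage can start from known, metric poses instead of estimating them from scratch. This is an assumption-laden illustration: the camera ID, output path, and extrinsic calibration T_BC are placeholders, and the paper's pipeline may interface with SfM differently.

```python
import numpy as np
from scipy.spatial.transform import Rotation as Rot

def write_colmap_images_txt(T_WB_list, T_BC, image_names, path="images.txt"):
    """Export SLAM-derived camera poses in COLMAP's images.txt text format.

    T_WB_list : list of 4x4 body-to-world poses from the LiDAR SLAM system.
    T_BC      : 4x4 camera-to-body extrinsic calibration (placeholder value).
    """
    with open(path, "w") as f:
        f.write("# IMAGE_ID, QW, QX, QY, QZ, TX, TY, TZ, CAMERA_ID, NAME\n")
        for i, (T_WB, name) in enumerate(zip(T_WB_list, image_names), start=1):
            T_WC = T_WB @ T_BC              # camera-to-world pose of this image
            T_CW = np.linalg.inv(T_WC)      # COLMAP stores world-to-camera
            qx, qy, qz, qw = Rot.from_matrix(T_CW[:3, :3]).as_quat()
            tx, ty, tz = T_CW[:3, 3]
            f.write(f"{i} {qw} {qx} {qy} {qz} {tx} {ty} {tz} 1 {name}\n")
            f.write("\n")  # empty 2D-point line; triangulation fills this later
```

Starting from poses that are already in metres is what makes a LiDAR depth loss meaningful: a reconstruction obtained from SfM alone is only defined up to an unknown scale.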

Results

The authors collected three datasets with robots including a Boston Dynamics Spot and a DJI M600. The reference 3D model for each dataset was captured with a Leica BLK360 scanner with millimetre accuracy. SiLVR matches the accuracy of a LiDAR-only solution while using the cameras to deliver a more complete reconstruction with realistic texture.


Principles and methods

Hardware & Systems

The perception payload comprises a Hesai QT64 LiDAR, three AlphaSense fisheye cameras, and an IMU, mounted on a Boston Dynamics Spot and a DJI M600. The LiDAR point cloud is converted into depth maps and surface-normal maps, which are fed into SiLVR together with the online SLAM trajectory for bundle adjustment (BA) and submap generation; a NeRF that fuses vision and LiDAR is then trained for each submap.
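The sketch below shows one plausible way to turn a LiDAR scan into the depth and surface-normal inputs described above: project the points into the image to form a depth map, and fit local planes (PCA over nearest neighbours) to estimate normals. It uses a pinhole projection for simplicity, whereas the paper uses fisheye cameras; treat it as an illustration, not the authors' implementation.

```python
import numpy as np

def lidar_depth_and_normals(points_cam, K, image_size, knn=16):
    """Build a depth image from LiDAR points (already in the camera frame) and
    estimate per-point normals by local plane fitting.

    points_cam : (N, 3) points in camera coordinates, z pointing forward.
    K          : 3x3 pinhole intrinsics (simplification; SiLVR uses fisheye lenses).
    """
    H, W = image_size
    depth = np.full((H, W), np.inf)

    # Depth image: keep the nearest LiDAR return per pixel.
    z = points_cam[:, 2]
    valid = z > 0.1
    uvw = (K @ points_cam[valid].T).T
    u = np.round(uvw[:, 0] / uvw[:, 2]).astype(int)
    v = np.round(uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    for ui, vi, zi in zip(u[inside], v[inside], z[valid][inside]):
        depth[vi, ui] = min(depth[vi, ui], zi)

    # Normals by plane fitting: the eigenvector of the smallest eigenvalue of the
    # local covariance (brute-force neighbour search for clarity; use a KD-tree in practice).
    normals = np.zeros_like(points_cam)
    for i, p in enumerate(points_cam):
        nbr = points_cam[np.argsort(np.linalg.norm(points_cam - p, axis=1))[:knn]]
        cov = np.cov((nbr - nbr.mean(axis=0)).T)
        n = np.linalg.eigh(cov)[1][:, 0]
        normals[i] = -n if np.dot(n, p) > 0 else n  # orient towards the sensor
    return depth, normals
```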


Geometric constraints of LiDAR on NeRF

The LiDAR data are converted into a depth map and a surface-normal map, which supervise NeRF through differentiable volume rendering. The depth supervision concentrates the samples along each ray around the LiDAR range measurement, so low-texture regions such as the ground recover the correct height; the surface normals, obtained by plane fitting of the LiDAR points, make the NeRF-reconstructed surfaces smoother.
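As a hedged illustration of such constraints, the PyTorch sketch below shows a depth loss that concentrates the volume-rendering weights around the LiDAR range (in the spirit of DS-NeRF / Urban Radiance Fields style supervision) and a normal loss that aligns rendered normals with the LiDAR plane-fit normals; the paper's exact loss formulations may differ.

```python
import torch

def lidar_depth_loss(weights, t_vals, lidar_depth, sigma=0.1):
    """Encourage ray termination near the LiDAR range.

    weights     : (R, S) volume-rendering weights per ray sample.
    t_vals      : (R, S) sample distances along each ray.
    lidar_depth : (R,) LiDAR range for each ray (metric, thanks to the SLAM scale).
    """
    # Target distribution: a narrow Gaussian around the measured depth, i.e. the
    # ray's mass should concentrate at the LiDAR return and vanish in free space.
    target = torch.exp(-0.5 * ((t_vals - lidar_depth[:, None]) / sigma) ** 2)
    target = target / (target.sum(dim=-1, keepdim=True) + 1e-8)
    return ((weights - target) ** 2).sum(dim=-1).mean()

def lidar_normal_loss(pred_normals, lidar_normals):
    """Align rendered surface normals with normals from LiDAR plane fitting."""
    pred = torch.nn.functional.normalize(pred_normals, dim=-1)
    ref = torch.nn.functional.normalize(lidar_normals, dim=-1)
    return (1.0 - (pred * ref).sum(dim=-1)).mean()
```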


Multi-camera system in the inspection scenario

NeRF relies on multi-view observations, but in many inspection tasks the robot can only follow a nearly straight path. Under such trajectories a single-camera system provides limited geometric constraints and low reconstruction accuracy, whereas the multi-fisheye-camera platform used in this paper still achieves accurate 3D reconstruction along linear trajectories.


This article is for academic sharing only; if there is any infringement, please contact us and it will be deleted.
