
CVPR'24 Open Source | A Fast and Accurate Visual-Inertial Navigation System!

Author: 3D Vision Workshop

Editor: Computer Vision Workshop


There are two main approaches to VINS: optimization-based and filter-based. Optimization-based methods achieve high localization accuracy but suffer from high computational complexity; conversely, filter-based methods achieve high efficiency at the expense of accuracy. There is therefore a pressing need for a framework that combines high accuracy with high efficiency. Inspired by the Schur complement used in optimization-based approaches, the authors exploit the sparse structure inherent in the high-dimensional residual model over poses and landmarks to retain the efficiency of the EKF. This paper therefore proposes an EKF-based VINS framework that achieves both high efficiency and high accuracy.

Let's read about this work together~

Title: SchurVINS: Schur Complement-Based Lightweight Visual Inertial Navigation System

Authors: Yunfei Fan, Tianyu Zhao, Guidong Wang

Affiliation: ByteDance

Original link: https://arxiv.org/abs/2312.01616

Code link: https://github.com/bytedance/SchurVINS

Accuracy and computational efficiency are the most important metrics for a visual-inertial navigation system (VINS). Existing VINS algorithms offering either high accuracy or low computational complexity struggle to provide high-precision localization on resource-constrained devices. To this end, the authors propose a novel filter-based VINS framework called SchurVINS, which guarantees both high accuracy and low computational complexity by constructing a complete residual model and applying the Schur complement. Technically, a complete residual model is first formulated in which the gradient, Hessian, and observation covariance are modeled explicitly. The Schur complement is then used to decompose the complete model into an ego-motion residual model and a landmark residual model. Finally, an efficient extended Kalman filter (EKF) update is performed on both models. Experiments on the EuRoC and TUM-VI datasets show that the method significantly outperforms state-of-the-art methods in terms of accuracy and computational complexity.

Comparison of runtime, CPU usage, and RMSE on the EuRoC dataset. Different shapes and colors represent different methods and accuracy levels.


In this paper, the authors propose a filter-based VINS framework called SchurVINS, which guarantees high accuracy and low computational complexity by constructing a complete residual model and applying the Schur complement. Technically, a complete residual model is first developed in which the gradient, Hessian matrix, and observation covariance are modeled explicitly. The Schur complement is then used to decompose the complete model into an ego-motion residual model and a landmark residual model. Finally, the extended Kalman filter (EKF) update is implemented efficiently on these two models. Experiments on the EuRoC and TUM-VI datasets show that SchurVINS significantly outperforms state-of-the-art methods in terms of accuracy and computational complexity. The key contributions are:

(1) An equivalent residual model is proposed to handle ultra-high-dimensional observations, comprising the gradient, Hessian matrix, and corresponding observation covariance. The method is broadly applicable to EKF systems.

(2) A lightweight EKF-based landmark solver is proposed to estimate landmark positions efficiently.

(3) A novel EKF-based VINS framework is developed to estimate both ego-motion and landmarks accurately and efficiently.
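To make the Schur complement trick underlying these contributions concrete, the toy numpy sketch below (illustrative dimensions, not the authors' implementation) eliminates the landmark block from a small Gauss-Newton system and recovers the landmark increments by back-substitution, matching the full solve:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy normal equations H @ d = b with a 6-DoF pose block x and two 3-D
# landmark blocks l (dimensions are illustrative, not from the paper).
J = rng.standard_normal((24, 12))      # stacked measurement Jacobian
H = J.T @ J + 1e-3 * np.eye(12)        # Hessian approximation (SPD)
b = J.T @ rng.standard_normal(24)      # gradient

nx = 6                                  # pose dimension
Hxx, Hxl = H[:nx, :nx], H[:nx, nx:]
Hlx, Hll = H[nx:, :nx], H[nx:, nx:]
bx, bl = b[:nx], b[nx:]

# Schur complement: eliminate the landmark block, leaving a pose-only system.
Hll_inv = np.linalg.inv(Hll)
H_red = Hxx - Hxl @ Hll_inv @ Hlx      # reduced (ego-motion) Hessian
b_red = bx - Hxl @ Hll_inv @ bl        # reduced gradient

dx = np.linalg.solve(H_red, b_red)     # pose increment from the small system
dl = Hll_inv @ (bl - Hlx @ dx)         # landmark increments by back-substitution

# Same answer as solving the full 12x12 system directly.
assert np.allclose(np.concatenate([dx, dl]), np.linalg.solve(H, b))
```

The pose-only system is much smaller than the full one, which is exactly why marginalizing landmarks this way pays off when the number of landmarks is large.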

SchurVINS is built on the open-source SVO2.0 in a stereo configuration: the original backend of SVO2.0 is replaced with a sliding-window EKF backend, and the original landmark optimizer is replaced with the EKF-based landmark solver. P1 to Pm denote valid landmarks in the surrounding environment, which are used to build the residual models.
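The paper does not spell out its landmark solver here, but the kind of per-landmark update an EKF-based solver performs can be sketched with a generic EKF measurement update (the function name, dimensions, and linear measurement model below are illustrative assumptions, not the authors' code):

```python
import numpy as np

def landmark_ekf_update(p, P, z, H, R):
    """One EKF measurement update for a single landmark.

    p : (3,)   landmark position estimate
    P : (3,3)  landmark covariance
    z : (m,)   measurement residual (observation minus prediction)
    H : (m,3)  measurement Jacobian w.r.t. the landmark
    R : (m,m)  observation covariance
    """
    S = H @ P @ H.T + R                # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)     # Kalman gain
    p_new = p + K @ z                  # corrected position
    P_new = (np.eye(3) - K @ H) @ P    # corrected covariance
    return p_new, P_new

# Toy usage: repeated noisy direct observations of a fixed point pull the
# estimate toward the truth and shrink the covariance.
rng = np.random.default_rng(0)
truth = np.array([1.0, 2.0, 3.0])
p, P = np.zeros(3), np.eye(3)
for _ in range(10):
    obs = truth + 0.01 * rng.standard_normal(3)
    p, P = landmark_ekf_update(p, P, obs - p, np.eye(3), 1e-4 * np.eye(3))
```

Because each landmark carries only a 3x3 covariance, such an update is far cheaper than folding every landmark into one large optimization problem.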


Accuracy

The overall accuracy of SchurVINS was evaluated using root mean square error (RMSE) on EuRoC and TUM-VI. Among the filter-based methods reported to date, SchurVINS achieves the lowest average RMSE on these datasets, and it also outperforms most optimization-based methods. In addition, SchurVINS reaches accuracy similar to the well-known optimization-based method BASALT and slightly below its recent competitor DM-VIO. The re-evaluation experiments in Table 2 matched expectations. It is worth emphasizing that, although its accuracy is slightly lower than that of the two optimization-based competitors, the computational complexity of SchurVINS is significantly lower than both, as detailed in the next section.
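For reference, trajectory RMSE of the kind reported on EuRoC is the absolute trajectory error after aligning the estimate with ground truth. A minimal sketch follows; it uses mean-offset alignment only, whereas real evaluations also align timestamps and the full SE(3)/Sim(3) frame (e.g., via a Umeyama fit):

```python
import numpy as np

def ate_rmse(est, gt):
    """RMSE of absolute trajectory error for time-aligned (N,3) position
    arrays, after removing the mean translational offset (simplified
    alignment)."""
    err = (est - est.mean(axis=0)) - (gt - gt.mean(axis=0))
    return float(np.sqrt((np.linalg.norm(err, axis=1) ** 2).mean()))

# A constant offset between the two trajectories does not count as error.
gt = np.cumsum(np.full((100, 3), 0.1), axis=0)
assert ate_rmse(gt + np.array([5.0, 0.0, 0.0]), gt) < 1e-9
```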


Efficiency

The efficiency evaluation was performed on a desktop platform with an Intel i7-9700 (3.00 GHz). Global BA (GBA), pose-graph optimization, and loop-closure detection were disabled in all algorithms. As shown in Table 3, SchurVINS achieves nearly the lowest processor usage among all of the VINS algorithms considered. Notably, SVO2.0-wo has CPU usage similar to SchurVINS but suffers significant inaccuracy because it is almost pure visual odometry (VO).


In Table 4, the optimizeStructure module of SchurVINS is nearly 3 times faster than that of SchurVINS-GN. This is because SchurVINS reuses the intermediate results of the Schur complement, yielding significant computational savings, whereas SchurVINS-GN rebuilds the optimization problem from scratch to estimate landmarks. Compared with SVO2.0-wo, SchurVINS is faster because it replaces the computationally intensive SparseImageAlign with the propagation module. On the other hand, the optimizeStructure of SVO2.0-wo is significantly faster than that of SchurVINS-GN, because the latter uses almost 4 times as many measurements for optimization. The root cause of SVO2.0's significantly longer runtime is the high computational complexity of its LBA. As for OpenVINS, neither its default configuration nor a configuration with a sliding-window size of 4 makes it more efficient than SchurVINS. Strikingly, the update of SLAM points in OpenVINS requires noticeably more computational resources than the EKF-based landmark estimation proposed in SchurVINS.


Ablation studies

The above experiments strongly support SchurVINS, so it is worth studying the impact of its individual components. Starting from SchurVINS, the EKF-based landmark solver is replaced or removed to analyze its effectiveness. As shown in Table 5, without either a GN-based or an EKF-based landmark solver, SchurVINS cannot sufficiently limit global drift; moreover, in some challenging scenarios, not estimating landmarks jointly can cause the system to diverge. The comparison of SchurVINS and SchurVINS-GN in Table 5 shows that both the proposed EKF-based landmark solver and the original GN-based landmark solver of SVO2.0 are valid and reliable, and both guarantee high accuracy. A comparison across Tables 4 and 5 further shows that, although the proposed EKF-based landmark solver causes a slight drop in accuracy, it achieves significantly lower computational complexity. An intuitive explanation for the reduced accuracy is that SchurVINS uses only the observations within the sliding window for landmark estimation.


This paper develops an EKF-based VINS algorithm, including a novel EKF-based landmark solver, to achieve highly efficient and accurate 6-DoF estimation. Specifically, an equivalent residual model composed of the Hessian, the gradient, and the corresponding observation covariance is used to estimate pose and landmarks jointly, ensuring high-precision localization. To achieve high efficiency, the equivalent residual model is decomposed via the Schur complement into a pose residual model and a landmark residual model for the EKF update. Thanks to the probabilistic independence of the surrounding environmental elements, the resulting landmark residual model is further split into a set of small, independent residual models, one per landmark, for the EKF update, significantly reducing computational complexity.
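The per-landmark split described above works because the landmark system is block-diagonal: solving each small block separately gives the same answer as one big solve, at roughly O(m·3³) instead of O((3m)³) cost. A toy numpy sketch (not the paper's code) demonstrates the equivalence:

```python
import numpy as np

rng = np.random.default_rng(1)
m = 5                                   # number of landmarks (illustrative)

blocks, rhs = [], []
for _ in range(m):
    A = rng.standard_normal((3, 3))
    blocks.append(A @ A.T + np.eye(3))  # SPD 3x3 Hessian block per landmark
    rhs.append(rng.standard_normal(3))

# Full landmark system: block-diagonal because landmarks are independent.
Hll = np.zeros((3 * m, 3 * m))
for i, B in enumerate(blocks):
    Hll[3 * i:3 * i + 3, 3 * i:3 * i + 3] = B
bl = np.concatenate(rhs)

full = np.linalg.solve(Hll, bl)          # one big (3m x 3m) solve
per = np.concatenate([np.linalg.solve(B, r)  # m tiny 3x3 solves
                      for B, r in zip(blocks, rhs)])
assert np.allclose(full, per)
```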

This work is the first to use a Schur-complement-factorized residual model in an EKF-based VINS algorithm for acceleration. Experiments on the EuRoC and TUM-VI datasets show that SchurVINS significantly outperforms EKF-based methods overall, as well as most optimization-based methods, in both accuracy and efficiency. In addition, SchurVINS requires almost half the computational resources of SOTA optimization-based approaches while achieving comparable accuracy. The ablation study also clearly shows that the EKF-based landmark solver is not only highly efficient but also maintains high accuracy. In future work, the authors will focus on local map refinement in SchurVINS to pursue greater accuracy.

Readers who are interested in more experimental results and details of the article can read the original paper~

This article is for academic sharing only; if there is any infringement, please contact us for deletion.

Computer Vision Workshop Exchange Group

At present, we have established multiple communities in the direction of 3D vision, including 2D computer vision, large models, industrial 3D vision, SLAM, autonomous driving, 3D reconstruction, drones, etc., and the subdivisions include:

2D Computer Vision: Image Classification/Segmentation, Object Detection/Tracking, Medical Imaging, GAN, OCR, 2D Defect Detection, Remote Sensing Mapping, Super-Resolution, Face Detection, Behavior Recognition, Model Quantization/Pruning, Transfer Learning, Human Pose Estimation, etc.

Large models: NLP, CV, ASR, generative adversarial models, reinforcement learning models, dialogue models, etc

Industrial 3D vision: camera calibration, stereo matching, 3D point cloud, structured light, robotic arm grasping, defect detection, 6D pose estimation, phase deflection, Halcon, photogrammetry, array camera, photometric stereo vision, etc.

SLAM: visual SLAM, laser SLAM, semantic SLAM, filtering algorithm, multi-sensor fusion, multi-sensor calibration, dynamic SLAM, MOT SLAM, NeRF SLAM, robot navigation, etc.

Autonomous driving: depth estimation, Transformer, millimeter-wave radar, lidar, visual camera sensors, multi-sensor calibration, multi-sensor fusion, 3D object detection, path planning, trajectory prediction, 3D point cloud segmentation, model deployment, lane line detection, BEV perception, Occupancy, object tracking, end-to-end autonomous driving, a general autonomous driving group, etc.

3D reconstruction: 3DGS, NeRF, multi-view geometry, OpenMVS, MVSNet, colmap, texture mapping, etc

Unmanned aerial vehicles: quadrotor modeling, unmanned aerial vehicle flight control, etc

In addition to these, there are also exchange groups such as job search, hardware selection, visual product landing, the latest papers, the latest 3D vision products, and 3D vision industry news

Add a small assistant: dddvision, note: research direction + school/company + nickname (such as 3D point cloud + Tsinghua + Little Strawberry), pull you into the group.

3D Visual Learning Knowledge Planet

3DGS, NeRF, Structured Light, Phase Deflection, Robotic Arm Grabbing, Point Cloud Practice, Open3D, Defect Detection, BEV Perception, Occupancy, Transformer, Model Deployment, 3D Object Detection, Depth Estimation, Multi-Sensor Calibration, Planning and Control, UAV Simulation, 3D Vision C++, 3D Vision python, dToF, Camera Calibration, ROS2, Robot Control Planning, LeGo-LAOM, Multimodal fusion SLAM, LOAM-SLAM, indoor and outdoor SLAM, VINS-Fusion, ORB-SLAM3, MVSNet 3D reconstruction, colmap, linear and surface structured light, hardware structured light scanners, drones, etc.
