
An on-board lidar mapping and positioning method that integrates vision and IMU

Author: ICVS Autonomous Driving Commercialization

Source: Journal of Fuzhou University (Natural Science Edition). Authors: Lin Chenhao, Peng Yuhui. Reprinted from the intelligent driving industry; will be removed upon request in case of infringement.

Abstract: To address the non-uniform motion distortion of lidar, a three-dimensional simultaneous localization and mapping (SLAM) method that fuses a visual-inertial odometry with a lidar odometry is proposed. After data preprocessing and timestamp alignment, visual estimation and inertial measurement unit (IMU) pre-integration are applied to initialize the vision; through constrained sliding window optimization and the high-frequency poses of the visual odometry, the traditional radar constant-velocity model is improved into a multi-stage uniform-acceleration model, thereby reducing point cloud distortion. Meanwhile, the Levenberg-Marquardt (LM) method is used to optimize the laser odometry, and a loop detection method fused with a bag-of-words model is proposed, finally realizing the three-dimensional map construction. On real-vehicle test data, compared with the results of the LeGO-LOAM (lightweight and ground-optimized lidar odometry and mapping on variable terrain) method, the mean error and median error of the proposed method improve by 16% and 23%, respectively.

introduction

With the release of the "Intelligent Vehicle Innovation Development Strategy" in February 2020, research on driverless vehicle technology has become a hot topic in universities and industry. Simultaneous localization and mapping (SLAM) is an indispensable part of unmanned driving technology. When the global positioning system (GPS) fails, SLAM allows an unmanned vehicle to rely on its own sensors to complete pose estimation and navigation autonomously, without prior information. The mainstream SLAM methods can currently be divided by sensor type into camera-based visual SLAM and radar-based laser SLAM. In recent years, visual SLAM fused with an inertial measurement unit (IMU) can estimate absolute scale and is less affected by image quality, and it has gradually become a research hotspot in this field.

Visual SLAM can be further divided into the feature point method and the optical flow method. The feature point method tracks feature points by matching them between frames and finally optimizes the rotation R and translation t. The optical flow method is based on the grayscale-invariance assumption and replaces the descriptor computation and feature point matching of the feature point method with optical flow tracking. Among the many feature-point-based SLAM schemes, ORB-SLAM (oriented FAST and rotated BRIEF) is the most representative: ORB features are invariant to viewpoint and illumination changes, and its keyframe extraction and rejection of redundant frames also guarantee the efficiency and accuracy of BA (bundle adjustment) optimization. Pure visual SLAM, however, easily loses tracking under rotation, especially pure rotation, is sensitive to noise, and cannot recover absolute scale. Therefore, Xu Kuan et al. fused the IMU with vision, applied the pre-integrated IMU information and the visual information in the Gauss-Newton method, and then used graph optimization to jointly optimize the visual reprojection error and the IMU residual, obtaining a more accurate pose. Among current optical flow schemes, VINS-Mono (a robust and versatile monocular visual-inertial state estimator) performs excellently outdoors; its IMU plus monocular camera setup recovers metric scale, and because it uses optical flow tracking rather than descriptors as the front end, it is more robust than ORB-SLAM and less prone to tracking loss during high-speed motion.

However, pure visual SLAM requires moderate lighting conditions and distinct image features, and it struggles to build 3D maps outdoors. Laser SLAM can build outdoor three-dimensional maps, but it is prone to non-uniform motion distortion during motion and localizes inaccurately in degraded scenes. Therefore, based on the point cloud information collected by lidar, this paper proposes an outdoor three-dimensional map construction and positioning method that fuses multiple sensors. The method first computes the visual-inertial odometry (VIO) and outputs high-frequency poses, then removes the lidar motion distortion with these high-frequency poses and computes the laser odometry (LO), and finally realizes the three-dimensional map construction.

1 Algorithm framework

As shown in Figure 1, the algorithm framework is roughly divided into two modules: the visual-inertial odometry module, and the laser odometry and mapping module. The visual-inertial odometry uses KLT (Kanade-Lucas-Tomasi) optical flow to track features between adjacent frames and uses IMU pre-integration as the motion predictor between the two frames, while extracting bag-of-words information from the images for loop detection. The initialization module loosely couples vision with the IMU pre-integration to solve for the gyroscope bias, scale factor, gravity direction, and the velocity between adjacent frames. The residual terms built from the visual structure and from the IMU are optimized by the sliding window method, and the high-frequency absolute poses computed by the VIO are output. The two modules obtain an extrinsic matrix through joint camera-radar calibration, which converts the absolute poses from the camera coordinate system to the radar coordinate system.


The laser odometry and mapping module first segments the point cloud into different categories of clustered points to facilitate subsequent feature extraction, and then fuses the high-frequency VIO poses to improve the traditional radar constant-velocity model into a multi-stage uniform-acceleration model. At this point the point cloud has fused the camera and IMU information; after ICP (iterative closest point) frame-to-frame matching, LM optimization yields the pose transformation matrix between the two point cloud frames, which is converted into the initial point cloud coordinate system. Finally, loop detection based on the bag-of-words model is fused to build the three-dimensional map.

2 Mapping and positioning methods

2.1 Camera and IMU data preprocessing

FAST feature extraction is efficient and KLT optical flow tracking requires no descriptors, so these two are chosen for feature extraction and optical flow tracking. Let Ix and Iy denote the image gradients of pixel brightness in the x and y directions, It the gradient in the time direction t, and u and v the velocity components of the optical flow along the x and y axes. A constraint equation is constructed from the KLT optical flow principle, and the least squares method is applied to obtain u and v, as follows.
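A standard Lucas-Kanade formulation of this constraint and its least squares solution over a local window of n pixels (a generic reconstruction of the KLT principle, not necessarily the paper's exact notation) is:

$$I_x u + I_y v = -I_t, \qquad A = \begin{bmatrix} I_{x_1} & I_{y_1} \\ \vdots & \vdots \\ I_{x_n} & I_{y_n} \end{bmatrix}, \quad \mathbf{b} = -\begin{bmatrix} I_{t_1} \\ \vdots \\ I_{t_n} \end{bmatrix}, \quad \begin{bmatrix} u \\ v \end{bmatrix} = \left(A^{\top}A\right)^{-1} A^{\top}\mathbf{b}$$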


In each new image, existing feature points are tracked by the KLT algorithm and new feature points are detected. To keep the feature points evenly distributed, the image is divided into 18 × 10 sub-regions of identical size, each sub-region extracts at most 10 FAST corner points, and 50 to 200 FAST corner points are kept in each image. The displacement between adjacent images is large in outdoor scenes and the brightness of individual pixels may change abruptly, which harms feature point tracking. Therefore, outlier feature points need to be rejected before the points are projected onto the unit sphere. Outlier rejection uses RANSAC and incorporates a Kalman filter to achieve robust optical flow tracking in outdoor dynamic scenes. Figure 2 shows feature point tracking in an outdoor scene with and without the RANSAC algorithm; using RANSAC reduces false tracking.
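As a rough illustration of this front end, the sketch below shows gridded FAST detection, KLT tracking, and RANSAC outlier rejection with OpenCV; the function names, grid handling, and thresholds are illustrative assumptions, not the authors' code.

```python
import cv2
import numpy as np

def detect_new_corners(img, grid=(18, 10), max_per_cell=10):
    """Keep corners uniformly distributed: detect FAST corners per sub-region."""
    fast = cv2.FastFeatureDetector_create(threshold=20)
    h, w = img.shape[:2]
    cell_w, cell_h = w // grid[0], h // grid[1]
    pts = []
    for gx in range(grid[0]):
        for gy in range(grid[1]):
            x0, y0 = gx * cell_w, gy * cell_h
            cell = img[y0:y0 + cell_h, x0:x0 + cell_w]
            kps = sorted(fast.detect(cell, None), key=lambda k: -k.response)[:max_per_cell]
            pts.extend((k.pt[0] + x0, k.pt[1] + y0) for k in kps)
    return np.array(pts, dtype=np.float32).reshape(-1, 1, 2)

def track_features(prev_img, curr_img, prev_pts):
    """Track corners with pyramidal KLT, then reject outliers with RANSAC."""
    curr_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_img, curr_img, prev_pts, None, winSize=(21, 21), maxLevel=3)
    ok = status.reshape(-1).astype(bool)
    p0, p1 = prev_pts[ok], curr_pts[ok]
    if len(p0) < 8:                       # too few matches for a fundamental matrix
        return p0, p1
    # RANSAC on the epipolar constraint removes mistracked (outlier) points.
    F, mask = cv2.findFundamentalMat(p0, p1, cv2.FM_RANSAC, 1.0, 0.99)
    if F is None:
        return p0, p1
    inlier = mask.reshape(-1).astype(bool)
    return p0[inlier], p1[inlier]
```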


The IMU responds quickly, is unaffected by image quality, and can estimate absolute scale, complementing visual positioning where outdoor scenes lack structure. Optimizing a camera pose for every IMU sampling instant during pose estimation would reduce the efficiency of the program, so the IMU measurements are pre-integrated: the high-frequency acceleration and angular velocity measurements are converted into a single observation, which is re-linearized during the nonlinear iterations and forms a constraint factor on the states between frames. The continuous-time IMU pre-integration is shown in the following equation.
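A commonly used VINS-Mono-style form of this continuous-time propagation between image frames bk and bk+1, neglecting noise terms (reproduced here as a reference form; the paper's exact expression may differ slightly), is:

$$\mathbf{p}^{w}_{b_{k+1}} = \mathbf{p}^{w}_{b_k} + \mathbf{v}^{w}_{b_k}\Delta t_k + \iint_{t\in[t_k,\,t_{k+1}]}\!\left(\mathbf{R}^{w}_{t}\,(\hat{\mathbf{a}}_t - \mathbf{b}_{a_t}) - \mathbf{g}^{w}\right)\mathrm{d}t^{2}$$
$$\mathbf{v}^{w}_{b_{k+1}} = \mathbf{v}^{w}_{b_k} + \int_{t\in[t_k,\,t_{k+1}]}\!\left(\mathbf{R}^{w}_{t}\,(\hat{\mathbf{a}}_t - \mathbf{b}_{a_t}) - \mathbf{g}^{w}\right)\mathrm{d}t$$
$$\mathbf{q}^{w}_{b_{k+1}} = \mathbf{q}^{w}_{b_k}\otimes\int_{t\in[t_k,\,t_{k+1}]}\tfrac{1}{2}\,\Omega\!\left(\hat{\boldsymbol{\omega}}_t - \mathbf{b}_{\omega_t}\right)\mathbf{q}^{b_k}_{t}\,\mathrm{d}t$$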


where b is the IMU coordinate system; w is the coordinate system in which the IMU is initialized at the origin, that is, the world coordinate system; at and ωt are the acceleration and angular velocity measured by the IMU; qbk,t is the rotation at time t from the IMU coordinate system to the world coordinate system; and Ω denotes quaternion right multiplication. Integrating all IMU data between frame k and frame k+1 yields the position (P), velocity (v), and rotation (Q) of frame k+1, which serve as the initial values of the visual estimate, with the rotation in quaternion form.

2.2 Sliding window optimization

The initialization module recovers the scale of the monocular camera by loosely coupling the visual information with the IMU information. First, SFM is used to solve the poses of all frames in the sliding window together with the three-dimensional positions of all landmark points, which are then aligned with the IMU pre-integration values obtained earlier to solve for the angular velocity bias, gravity direction, scale factor, and the velocity corresponding to each frame. As the system runs, the number of state variables keeps growing, so the sliding window method is used to optimize only the state variables inside the window. The optimization vector xi defined in the window at time i is as follows.
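A typical form of this per-frame state, consistent with the component definitions below (the ordering here is an assumption), is:

$$\mathbf{x}_i = \left[\, \mathbf{R}_i,\ \mathbf{p}_i,\ \mathbf{v}_i,\ \mathbf{a}_{b_i},\ \boldsymbol{\omega}_{b_i} \,\right]$$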


where Ri and pi are the rotation and translation parts of the camera pose; vi is the velocity of the camera in the world coordinate system; and abi and ωbi are the acceleration bias and angular velocity bias of the IMU, respectively.

Let Xk be the set of all xi involved in the frames of the sliding window at time k, and Zk the set of all observations of the system. Combining Bayes' rule, the system state is estimated by maximizing the posterior probability, as follows.
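In the usual factorized form (a standard maximum a posteriori formulation, assuming independent measurements), this reads:

$$\mathcal{X}_k^{*} = \arg\max_{\mathcal{X}_k}\, p\!\left(\mathcal{X}_k \mid \mathcal{Z}_k\right), \qquad p\!\left(\mathcal{X}_k \mid \mathcal{Z}_k\right) \propto p\!\left(\mathcal{X}_k\right)\prod_{z_j \in \mathcal{Z}_k} p\!\left(z_j \mid \mathcal{X}_k\right)$$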


The maximum a posteriori problem is converted into an optimization problem, with the objective function defined as shown in the following equation.
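Under a Gaussian noise assumption, taking the negative log-likelihood turns this into a nonlinear least squares cost over the residuals defined below (a standard sliding-window objective; the information matrices weighting each term are omitted here):

$$\mathcal{X}_k^{*} = \arg\min_{\mathcal{X}_k}\left\{ \left\| \mathbf{r}_0 \right\|^{2} + \sum_{(i,j)} \left\| \mathbf{r}_{\mathcal{I}_{ij}} \right\|^{2} + \sum_{(i,l)} \left\| \mathbf{r}_{\mathcal{C}_{il}} \right\|^{2} \right\}$$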


where X*k is the maximum a posteriori estimate; r0 is the initial (prior) residual of the sliding window; rIij is the IMU observation residual; and rCil is the camera observation residual. Combining the calibration methods proposed by Unnikrishnan et al. and Kassir et al., the camera and lidar are jointly calibrated to determine their relative pose; the optimized VIO poses are converted to the radar coordinate system and output to the laser odometry module, and the bag of words (DBoW2) constructed from BRIEF descriptors is used to compute the similarity with the current frame for loop detection.

2.3 Point cloud data preprocessing

The point cloud preprocessing partially improves on the LeGO-LOAM scheme and divides the point cloud into ground points, effective cluster points, and outlier points in two steps: 1) Ground point cloud extraction: the point cloud is divided into a 0.5 m × 0.5 m grid, the height difference between the highest and lowest points within each cell is computed, and cells with a height difference of less than 0.15 m are classified as ground points. 2) Effective cluster point extraction: after the ground points are marked, the disturbance from small objects would affect the subsequent inter-frame registration, so Euclidean clustering is applied to the point cloud, and clusters with fewer than 30 points or occupying fewer than 3 laser beams in the vertical direction are filtered out.
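A minimal sketch of this two-step preprocessing, using illustrative NumPy code and assumed array layouts (not the authors' implementation):

```python
import numpy as np

def label_ground(points, cell=0.5, max_height_diff=0.15):
    """Mark ground points: rasterize x-y into 0.5 m cells and check the
    height spread of the points inside each cell."""
    idx = np.floor(points[:, :2] / cell).astype(np.int64)     # (N, 2) cell indices
    ground = np.zeros(len(points), dtype=bool)
    keys, inverse = np.unique(idx, axis=0, return_inverse=True)
    for k in range(len(keys)):
        members = np.where(inverse == k)[0]
        z = points[members, 2]
        if z.max() - z.min() < max_height_diff:
            ground[members] = True
    return ground

def filter_clusters(labels, rings, min_points=30, min_rings=3):
    """Keep only clusters with >= 30 points spanning >= 3 vertical laser beams.
    `labels` are Euclidean-cluster ids per point, `rings` are beam indices."""
    keep = np.zeros(len(labels), dtype=bool)
    for cid in np.unique(labels):
        members = np.where(labels == cid)[0]
        if len(members) >= min_points and len(np.unique(rings[members])) >= min_rings:
            keep[members] = True
    return keep
```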

2.4 Removing point cloud distortion with high-frequency VIO poses

Because a mechanical lidar produces non-uniform motion distortion in the point cloud during scanning, the high-frequency poses output by the VIO are used to remove the point cloud distortion and improve the accuracy of point cloud registration.


First, the timestamps of the two sensor systems are aligned. As shown in Figure 3, tLq is defined as the timestamp of the radar at the q-th scan and tV-Ik as the timestamp of the k-th pose output of the VIO system; each radar scan interval is then aligned with the VIO pose outputs that bracket it.


The four absolute poses computed at times tV-Ik, tV-Ik+1, tV-Ik+2, and tV-Ik+3 are denoted Tk, Tk+1, Tk+2, and Tk+3. The displacement between tLq and tLq+1 is divided into two uniform-acceleration stages, improving the radar motion model into a multi-stage uniform-acceleration model.


Using the displacement and velocity of the two stages, the velocity, displacement, and Euler angles of each point in the cloud are interpolated, eliminating the distortion caused by the non-uniform motion of the radar.
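The sketch below illustrates the idea: each point is re-projected into the scan-start frame using a pose interpolated at its own timestamp, with positions interpolated by a piecewise constant-acceleration (quadratic) fit over the bracketing VIO poses and rotations interpolated by slerp. This is a simplified illustration under assumed data layouts, not the paper's exact multi-stage model.

```python
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def const_accel_position(t, times, positions):
    """Constant-acceleration (quadratic) interpolation through three positions.
    times: (3,) increasing stamps; positions: (3, 3)."""
    t0, t1, t2 = times
    p0, p1, p2 = positions
    v01 = (p1 - p0) / (t1 - t0)                          # mean velocity over [t0, t1]
    v12 = (p2 - p1) / (t2 - t1)                          # mean velocity over [t1, t2]
    a = (v12 - v01) / (0.5 * (t1 + t2) - 0.5 * (t0 + t1))  # constant acceleration
    v0 = v01 - 0.5 * a * (t1 - t0)                       # velocity at t0
    dt = t - t0
    return p0 + v0 * dt + 0.5 * a * dt * dt

def deskew(points, point_times, vio_times, vio_positions, vio_quats, t_start):
    """Re-project every lidar point into the lidar pose at scan start t_start.
    vio_times: (4,) stamps covering the scan; vio_positions: (4, 3);
    vio_quats: (4, 4) xyzw quaternions; points: (N, 3); point_times: (N,)."""
    slerp = Slerp(vio_times, Rotation.from_quat(vio_quats))
    R_start = slerp([t_start])[0]
    p_start = const_accel_position(t_start, vio_times[:3], vio_positions[:3])
    deskewed = np.empty_like(points)
    for i, (pt, tp) in enumerate(zip(points, point_times)):
        stage = slice(0, 3) if tp <= vio_times[2] else slice(1, 4)   # two stages
        p_t = const_accel_position(tp, vio_times[stage], vio_positions[stage])
        R_t = slerp([tp])[0]
        world = R_t.apply(pt) + p_t                      # point in the world frame
        deskewed[i] = R_start.inv().apply(world - p_start)
    return deskewed
```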

2.5 Point cloud feature point extraction and matching

Point cloud feature points mainly contain two categories: planar feature points and edge feature points.


As shown in Figure 4, breakpoints have large curvature and parallel points have small curvature, so they are easily mistaken for edge points and planar points, respectively. Therefore, before feature extraction, the breakpoints on occluded cross-sections and the points lying on surfaces nearly parallel to the laser beam must be removed. Define the point cloud roughness ck,i, where Sk,i is the set of five points on either side of the i-th point pk,i of scan k; edge points and planar points are then segmented by thresholding ck,i, as follows:
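A roughness definition in the style of LOAM/LeGO-LOAM that matches this description (reconstructed as the commonly used form, not necessarily the paper's exact expression) is:

$$c_{k,i} = \frac{1}{\left|S_{k,i}\right|\cdot\left\|\mathbf{p}_{k,i}\right\|}\,\left\|\sum_{j\in S_{k,i},\,j\neq i}\left(\mathbf{p}_{k,i}-\mathbf{p}_{k,j}\right)\right\|$$

Points whose ck,i exceeds the threshold are selected as edge points, and points below it as planar points.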


The core of the inter-frame matching step is to solve a least squares problem for the pose change between two frames, using the distance between matched feature points as the error function f(Pk-1, Pk), as follows:
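In LOAM-style matching, each transformed feature point is compared against its matched edge line or local plane in the previous frame; a generic form of the error (the exact notation is an assumption) is:

$$f\!\left(P_{k-1}, P_k\right) = \sum_{i} d\!\left(T^{L}_{(k-1,k)}\, \mathbf{p}_{(k,i)},\ P_{k-1}\right)$$

where d(·) denotes the point-to-line distance for edge points and the point-to-plane distance for planar points.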


where p(k,i) is the i-th point of frame k, and TL(k-1,k) is the transformation matrix between frames k and k-1. Let Sk be the set of edge points and planar points of frame k; stacking the constraints between frames k-1 and k for all points in Sk gives the error equation set D(k-1,k).


TL(k-1,k) can be obtained by driving D(k-1,k) toward zero with the nonlinear iterative Levenberg-Marquardt (LM) method, as follows:
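A generic Levenberg-Marquardt update for this problem (the standard damped Gauss-Newton step; the damping schedule for λ is an implementation detail) is:

$$T^{L}_{(k-1,k)} \leftarrow T^{L}_{(k-1,k)} - \left( \mathbf{J}^{\top}\mathbf{J} + \lambda\,\operatorname{diag}\!\left(\mathbf{J}^{\top}\mathbf{J}\right) \right)^{-1} \mathbf{J}^{\top}\mathbf{d}$$

where J is the Jacobian of the stacked distances in D(k-1,k) with respect to the pose, d is the vector of current distances, and λ is the damping factor.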


The loop closure module adopts a point cloud neighbor search: taking the current lidar position as the search point, it finds candidate positions within a 5 m radius, and it combines these with the DBoW2 detection results while adding a constraint on the loop detection time to improve the search accuracy.
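A minimal sketch of this double check, combining a radius search over past key poses with a time-gap constraint and the DBoW2 similarity score (names and thresholds here are illustrative assumptions, and the DBoW2 scores are treated as given):

```python
import numpy as np
from scipy.spatial import cKDTree

def find_loop_candidate(curr_pos, curr_time, key_positions, key_times,
                        dbow_scores, radius=5.0, min_time_gap=30.0, min_score=0.04):
    """Return the index of a loop closure candidate, or None.

    key_positions: (M, 3) historical key poses; key_times: (M,) their stamps;
    dbow_scores: (M,) DBoW2 similarity of each keyframe against the current frame.
    min_time_gap / min_score are illustrative thresholds, not values from the paper.
    """
    tree = cKDTree(key_positions)
    neighbours = tree.query_ball_point(curr_pos, r=radius)   # spatial candidates
    best, best_score = None, min_score
    for idx in neighbours:
        if curr_time - key_times[idx] < min_time_gap:         # skip recent frames
            continue
        if dbow_scores[idx] > best_score:                     # appearance agreement
            best, best_score = idx, dbow_scores[idx]
    return best
```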

3 Real-world scenario verification

The dataset in this paper was collected on the Qishan campus of Fuzhou University, and the experimental vehicle is a self-developed driverless formula racing test platform, as shown in Figure 5. The lidar is a Pandar40P, the camera is a MYNT-1000D (with built-in IMU), the industrial computer is a Nuvo-7160GC, and the operating system is Ubuntu 16.04. The experimental routes are located in the north area of the Qishan campus of Fuzhou University, corresponding to the large-scene, square, Z-shaped, and P-shaped maps, as shown in Figure 6.


The method in this paper is denoted method A and the LeGO-LOAM method is denoted method B. Table 1 compares methods A and B against the ground truth, i.e., the GPS data. In Table 1, Max is the maximum error, Mean the mean error, Median the median error, Min the minimum error, RMSE the root mean square error, SSE the sum of squared errors, and STD the standard deviation.


From the comparison groups of sequences 1 to 4 in Table 1, method A has smaller errors than method B on every type of route. On the large-scene map, the maximum error is reduced by 26%, the mean error by 16%, the median error by 23%, the minimum error by 88%, the root mean square error by 20%, the sum of squared errors by 36%, and the standard deviation by 30%. Figures 7(a) and 7(b) show the trajectory comparison and error analysis between method A and the ground truth on the large-scene map, and Figures 7(c) and 7(d) show the trajectory comparison and error analysis between method B and the ground truth on the same map.


Figure 8(a) compares the trajectories of both methods with the ground-truth trajectory, and Figures 8(b), 8(c), and 8(d) are local enlargements. In Figure 8, the dashed line is the GPS ground truth, the blue line is the trajectory produced by method A, and the green line is the trajectory of method B; the vehicle trajectory produced by method A is closer to the GPS ground truth than that of method B. Figure 9 is the three-dimensional map of the north area of the Qishan campus of Fuzhou University built by the proposed method, showing that the loop closures in each part are in good condition.


4 Conclusion

Outdoor lidar 3D mapping suffers from non-uniform motion distortion of the point cloud. This paper integrates a visual-inertial odometry into the traditional laser SLAM method, improves the radar constant-velocity motion model into a multi-stage uniform-acceleration model, and introduces a bag-of-words model into the loop detection module. Comparing the absolute pose error of the mapping with the LeGO-LOAM method, the mean error and median error of the proposed method improve by 16% and 23%, respectively, and the mapping accuracy is greatly improved. The multi-stage uniform-acceleration radar motion model effectively reduces the accumulated odometry error during long-duration mapping, and the proposed double loop detection imposes a stronger time constraint on loop timing than the traditional method, meeting the needs of outdoor three-dimensional mapping.
