
New algorithms help intelligent driving "see" more clearly | The front line of science and technology

Author: Voice of the Chinese Academy of Sciences

Recently, researchers at the Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, made breakthroughs on two key problems in intelligent driving perception: grid occupancy prediction and panoptic segmentation. The work was accepted by IEEE ICRA 2024, an international academic conference in the field of robotics.

01

Perception is the foundation of autonomous driving

Autonomous driving is central to the development of intelligent connected vehicles and has become a key measure of how intelligent a vehicle is. Its core consists of a perception system, a decision-making system and an execution system. The perception system acts like the eyes and ears of a human driver: it captures road conditions through various sensors, and it is the foundation and prerequisite for autonomous driving.

At present, autonomous driving perception broadly follows two approaches. The first uses cameras as the main sensor, supplemented by low-cost sensors such as millimeter-wave radar, with image recognition as the core process. The second is built around high-cost lidar, which offers a longer detection range, better angular resolution, and less sensitivity to ambient light. Although the first approach costs less, it depends more heavily on algorithms and chips, and with current algorithms and models it lags behind the second in reliability and accuracy.

To address this, the researchers improved the back-end algorithms for the low-cost approach that uses cameras as the main sensor. They made breakthroughs on two of its key problems, grid occupancy prediction and panoptic segmentation, improving environmental perception, the accuracy of driving-scene understanding, and safety.

02

Overcome obstacle occlusion

Grid occupancy prediction is commonly used to understand and analyze a vehicle's surroundings. It divides the space around the vehicle into many small cells, known as grids, and interprets what each cell contains so that the autonomous driving back end can use it. However, existing methods restore some scenes with too little detail and do not fully capture the geometry of vehicles and obstacles, so when the shape or appearance of an object in an open scene is unclear, obstacles are often misestimated. To address this, the researchers propose CVFormer, an ego-vehicle-centered surround-view representation for occupancy prediction.
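To make the grid idea concrete, here is a minimal Python sketch of how the space around a vehicle can be divided into cells and marked as occupied. It only illustrates the representation and is not CVFormer: the function name `voxelize`, the parameters `cell_size` and `extent`, and the sample points are all assumptions, and a camera-based method has to predict occupancy from images rather than read it from known 3D points.

```python
import math

def voxelize(points, cell_size=0.5, extent=10.0):
    """Map 3D points (x, y, z), in metres in the ego-vehicle frame,
    to the set of grid cells they fall into.

    cell_size (cell edge length) and extent (half-width of the cube
    covered by the grid) are hypothetical values for illustration.
    """
    occupied = set()
    for point in points:
        # Skip points that fall outside the cube covered by the grid.
        if not all(-extent <= c < extent for c in point):
            continue
        # Shift so that indices start at 0, then divide by the cell size.
        index = tuple(int(math.floor((c + extent) / cell_size)) for c in point)
        occupied.add(index)
    return occupied

# Two nearby points share one cell; the last point lies outside the grid.
sample_points = [(0.1, 0.2, 0.0), (0.3, 0.4, 0.1), (5.0, -3.0, 1.0), (50.0, 0.0, 0.0)]
cells = voxelize(sample_points)
print(sorted(cells))
```

In a full occupancy prediction system each cell would also carry a semantic class (car, pedestrian, free space and so on) rather than a bare occupied flag.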

CVFormer uses a "surround-view cross-attention module", which builds multiple two-dimensional view representations from the views around the car in order to describe the surrounding three-dimensional scene effectively. Its "temporal multi-attention module" makes better use of relationships between frames, improving the accuracy and efficiency of prediction. The researchers also introduced 2D and 3D class consistency constraints into CVFormer so that the predictions agree better with the actual scene.
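The article does not describe the internals of these modules, so the following is only a generic sketch of scaled dot-product cross-attention, the standard mechanism that lets one set of features (queries) gather information from another set (keys and values). The names `cross_attention`, `grid_queries`, `view_keys` and `view_values` are hypothetical, and nothing here should be read as CVFormer's actual design.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """For each query vector, return a weighted average of the value
    vectors, weighted by how well the query matches each key."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity between this query and every key, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Blend the value vectors using the attention weights.
        blended = [sum(w * v[j] for w, v in zip(weights, values))
                   for j in range(len(values[0]))]
        outputs.append(blended)
    return outputs

# One grid query that matches the first camera-view feature much more strongly.
grid_queries = [[10.0, 0.0]]
view_keys = [[1.0, 0.0], [0.0, 1.0]]
view_values = [[1.0, 0.0], [0.0, 1.0]]
fused = cross_attention(grid_queries, view_keys, view_values)
print(fused)
```

Because the query aligns with the first key, almost all of the attention weight goes to the first value vector. This is the sense in which attention lets a location in the scene draw on the most relevant view.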

With these techniques, CVFormer can cope with views blocked by obstacles around the vehicle and provide more accurate and reliable environmental perception for autonomous vehicles.

▲CVFormer visualization of the 3D occupancy prediction task on nuScenes, a commonly used dataset for autonomous driving

03

Improve the accuracy of panoptic segmentation

Because the autonomous driving approach that uses cameras as the main sensor does not process 3D lidar point cloud data, panoptic segmentation becomes a crucial core technology, used mainly to recognize and understand driving routes and streets.

Panoptic segmentation combines semantic segmentation and instance segmentation. Semantic segmentation assigns each region of an image to a category; instance segmentation separates each individual object instance. Panoptic segmentation fuses the two, but in practice their predictions can contradict each other, leading to misjudgments in the back end.
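The fusion step, and the kind of conflict it has to resolve, can be sketched in Python as follows. This uses a common heuristic (paste instance masks in order of confidence, then fill the rest from the semantic map) and is not the method used by BEE-Net. The function name `merge_panoptic`, the dictionary keys and the toy data are all assumptions.

```python
def merge_panoptic(semantic, instances):
    """Fuse a semantic map with instance masks.

    semantic:  2D list of class ids, one per pixel.
    instances: list of dicts with keys 'cls' (class id), 'score'
               (confidence) and 'pixels' (set of (row, col) tuples).
    Returns a 2D list of (class_id, instance_id) pairs; instance_id 0
    means the pixel belongs to no individual object.
    """
    height, width = len(semantic), len(semantic[0])
    # Start from the semantic prediction, with no instance assigned.
    result = [[(semantic[r][c], 0) for c in range(width)] for r in range(height)]
    taken = set()
    ranked = sorted(instances, key=lambda inst: inst["score"], reverse=True)
    for instance_id, inst in enumerate(ranked, start=1):
        # Where masks overlap, the more confident instance keeps the pixel.
        for r, c in inst["pixels"] - taken:
            result[r][c] = (inst["cls"], instance_id)
        taken |= inst["pixels"]
    return result

# Toy 2x3 image: class 0 is road, class 1 is car.
semantic_map = [[0, 0, 1],
                [0, 1, 1]]
car_masks = [
    {"cls": 1, "score": 0.6, "pixels": {(1, 1), (1, 2)}},
    {"cls": 1, "score": 0.9, "pixels": {(0, 2), (1, 2)}},
]
panoptic = merge_panoptic(semantic_map, car_masks)
print(panoptic)
```

Both masks claim pixel (1, 2), and the heuristic simply hands it to the higher-scoring instance. Rules like this are where the two predictions can disagree, particularly along object boundaries.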

To solve this problem, the researchers designed BEE-Net, an end-to-end panoptic segmentation model based on gated encoding and edge constraints. The model optimizes segmentation quality at edges through a semantic-instance-panoptic triple edge optimization algorithm, which significantly improves scene segmentation performance while maintaining high efficiency.

BEE-Net was evaluated on CityScapes, an authoritative dataset for driving scene segmentation, where it achieved a panoptic quality (PQ) score of 65.0%. This exceeds the 63.3% that was previously the best result among CNN-based panoptic segmentation models. In efficiency, it outperforms all Transformer-based panoptic segmentation models, so it meets the requirements for both segmentation accuracy and efficiency. It has also completed test and verification on the intelligent driving perception system of a next-generation production model.
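For readers unfamiliar with the PQ (panoptic quality) metric, it is the sum of IoU over matched segment pairs divided by TP + 0.5 × FP + 0.5 × FN, where a predicted and a ground-truth segment match when their IoU exceeds 0.5. The Python sketch below computes it for a single class on toy data. The function names and the segments are assumptions, and benchmark scores are normally computed per class and then averaged, which this sketch does not do.

```python
def iou(a, b):
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def panoptic_quality(predicted, ground_truth):
    """PQ for one class; each segment is a set of pixel ids."""
    matched_predictions = set()
    true_positives = 0
    iou_sum = 0.0
    for truth in ground_truth:
        for index, prediction in enumerate(predicted):
            if index in matched_predictions:
                continue
            overlap = iou(prediction, truth)
            # IoU above 0.5 counts as a match (a true positive).
            if overlap > 0.5:
                matched_predictions.add(index)
                true_positives += 1
                iou_sum += overlap
                break
    false_positives = len(predicted) - true_positives
    false_negatives = len(ground_truth) - true_positives
    denominator = true_positives + 0.5 * false_positives + 0.5 * false_negatives
    return iou_sum / denominator if denominator else 0.0

# One good match (IoU 0.75), one spurious prediction, one missed object.
truth_segments = [{1, 2, 3, 4}, {10, 11}]
predicted_segments = [{1, 2, 3}, {20, 21}]
score = panoptic_quality(predicted_segments, truth_segments)
print(score)  # 0.75 / (1 + 0.5 + 0.5) = 0.375
```

The metric rewards both finding the right objects and outlining them tightly, which is why better edge quality can raise PQ.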

▲Segmentation results of BEE-Net on the CityScapes dataset

Overall, BEE-Net helps alleviate confusion between semantic and instance predictions and improves segmentation quality, especially at edges. This raises the accuracy of panoptic segmentation and makes the autonomous driving algorithm's perception of the environment more accurate and reliable.

Source: Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences

Editor in charge: Cao Yang
