
Baidu Intelligent Cloud: Reforming ground truth solutions for intelligent driving, with technology leading a new wave of data cost reduction

Author: Gasgoo

Training traditional perception algorithms requires large amounts of labeled data. Ground truth data that is hard to produce, poorly reusable, costly, slow to deliver, and often unusable is a major obstacle to improving model capability.

On April 17, 2024, at the 2nd Automotive Artificial Intelligence Conference, Zhang Xiaoxiao, senior product manager in the AI Data Service Department of Baidu Intelligent Cloud, presented two ground truth solutions matched to the characteristics of static and dynamic elements: BEV map ground truth and a ground truth vehicle evaluation solution. Baidu Intelligent Cloud provides a full mapping toolchain, BEV map ground truth, and a ground truth projection and verification process, and has produced road base maps covering nearly 10,000 kilometers.

Compared with high-precision maps, BEV static maps pay more attention to positive and negative sample road sections, emphasize that the labeled range corresponds to time-series data, and allow perception results from multiple sensor suites to be aligned. Data accuracy can reach the centimeter level, and static elements are labeled in finer detail. The mature ground truth vehicle solution addresses slow vehicle retrofitting and difficult data integration. A professional collection team with extensive experience gathers high-quality road data, and end-to-end data processing tools help improve the efficiency of data circulation.


Zhang Xiaoxiao | Senior Product Manager of Baidu Intelligent Cloud AI Data Service Department

The following is a summary of the speech:

Ground truth maps and automated annotation exploration

The current technology trend emphasizes perception and de-emphasizes high-precision maps. This places higher demands on perception algorithms, which must cover a wider range of scenarios with higher accuracy, and that in turn requires larger volumes of more precise perception ground truth as input. The largest ground truth data cost is manual annotation. Traditionally, ground truth is produced by hand. Baidu Intelligent Cloud has about 10,000 full-time annotators across the country, plus a large online crowdsourced annotation workforce, yet a workforce of this size can meet less than 10% of the industry's labeling demand.

Beyond the long cycle inherent in manual labeling, the same road section must be re-labeled whenever the vehicle model, sensor configuration, or driving behavior changes. During annotation, the quality of the raw data, such as blur, lighting, and occlusion by vehicles ahead, affects the results of manual annotation. The outcome is high cost, long cycles, low efficiency, and slow model deployment. The industry is therefore constantly exploring new ways to make ground truth easier to obtain.

Automated annotation is a trend that has received a lot of attention in recent years. In practice, we found that it faces problems such as adaptability and reliability across different scenarios, which led us to new solutions: introducing large model capabilities, ground truth maps, and ground truth vehicles.

Ground truth maps are mainly used for scenarios with static ground elements or stationary traffic participants. As autonomous driving reduces the vehicle's dependence on high-definition maps, static elements become the new core challenge for perception algorithms. The ground truth map is not applied directly to the vehicle-side planning and control algorithms. Instead, it serves as the ground truth base map in the production of perception training data: its annotations can be treated as trusted ground truth and projected directly onto the perception data collected by the vehicle under test.

Compared with traditional collection vehicles or high-precision map collection vehicles, our ground truth map collection vehicles carry higher-precision positioning equipment and higher-density lidars, and each road is collected repeatedly to obtain denser point information. After feature extraction and aggregation of the multiple passes into one map, the road surface in the point cloud base map is more complete, roadside elements and road structure are clearer, and ground shading shows up very clearly. This high-quality raw point cloud is fed directly into the automatic pre-labeling algorithm, which produces more accurate pre-recognition results and significantly improves manual annotation efficiency. Augmenting data with preceding and following frames in the time series, completing the current local map with points from future frames, and labeling on the fused local map is what is often called 4D annotation.
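The multi-pass aggregation described above can be illustrated with a small sketch. It is not Baidu's pipeline: it assumes 2D points, a known (x, y, yaw) pose per pass, and a simple grid of point counts, and the names `to_world` and `aggregate_passes` are hypothetical.

```python
import math
from collections import defaultdict

def to_world(points, pose):
    """Transform 2D points from the vehicle frame to the world frame.

    pose is the (x, y, yaw) of the vehicle in the world frame, yaw in radians.
    """
    px, py, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return [(px + c * x - s * y, py + s * x + c * y) for x, y in points]

def aggregate_passes(passes, voxel=0.5):
    """Merge several collection passes into one grid of per-cell point counts."""
    grid = defaultdict(int)
    for points, pose in passes:
        for wx, wy in to_world(points, pose):
            cell = (math.floor(wx / voxel), math.floor(wy / voxel))
            grid[cell] += 1
    return dict(grid)

# Two hypothetical passes over the same road, recorded from different poses.
pass_a = ([(1.0, 0.0), (2.0, 0.0)], (0.0, 0.0, 0.0))
pass_b = ([(1.0, 0.0), (2.0, 0.0)], (1.0, 0.0, 0.0))

single = aggregate_passes([pass_a])
merged = aggregate_passes([pass_a, pass_b])
print(len(single), len(merged), merged[(4, 0)])
```

In this toy case the merged grid covers more cells than a single pass, and the cell seen by both passes holds more points, which is the sense in which repeated collection makes the base map both more complete and denser.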


Source: Speaker material

Ground truth map construction and application

Beyond obtaining denser and more precise raw point clouds, we also process the data further, in what can be called the ground truth map production process. This process incorporates traditional static element annotation and adds new map-type annotation and near-high-precision requirements.

Ground truth map processing covers three kinds of elements. The first is conventional static ground truth: common road markings, lane lines, signs, traffic lights, and so on. The second is lane connectivity, where feature points such as intersection occupancy relationships, drivable areas, and samples of special road sections are labeled specifically. The third is high-precision and special static annotation. As a data service provider, we have collected and analyzed the data requirements of many mainstream car manufacturers and understand the relevant technical details in depth. For example, specific areas on special road sections, such as speed bumps, pedestrian areas, and parking spaces, are a new annotation trend, and we provide the corresponding annotation capabilities in the ground truth base map to meet customers' actual needs.

Ground truth maps differ from single-frame annotation in that the raw point cloud base map, collected over multiple passes with more precise positioning equipment, is denser, has clearer edges, and is structurally more complete. Fed into the pre-recognition algorithm, it quickly yields more accurate pre-recognition results, manual annotation efficiency improves greatly, and overall annotation accuracy can exceed 99%. It also avoids the heavy occlusion found in single frames, where annotators must guess at hidden content, and removes the need to annotate the same elements repeatedly across multiple or consecutive frames. Once the ground truth map is built, the customer's vehicle can drive and collect data in the area covered by the base map.

After the vehicle collects perception data in the area covered by the base map, the two data sets are aligned in space and time, aided by positioning and motion compensation algorithms, so that they match well and the ground truth on the base map can be mapped directly onto the perception data. This matching work takes about 1-2 weeks, after which large-scale automated production can begin.

This solution differs significantly from traditional manual labeling. Its biggest advantage is that once the ground truth map is built, a large share of subsequent manual labeling can be automated. People are only responsible for quality inspection, and unlike traditional labeling QA, inspectors do not modify or supplement labels; they only judge whether the mapping result for the whole road section is correct. This greatly lowers the skill requirements for personnel and makes quality inspection more efficient.

The solution is also highly reusable. Once the ground truth map is built, vehicles and sensor configurations from different manufacturers can use the same base map directly and obtain stable ground truth output. Weather conditions, occlusion, and sensor state do not affect the ground truth results.

It does have a disadvantage: the up-front cost of the ground truth base map is extremely high. Building one may cost 3-5 times as much as producing a high-precision map, which is not cost-effective if a single OEM or data service provider bears it alone. This is therefore a solution the industry needs to build together, and we hope more OEMs and ecosystem partners will join, so that a one-time base map build generates economies of scale and the cost savings are shared.
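As an illustration of the alignment and mapping step, the sketch below interpolates the vehicle pose at a sensor timestamp and then projects base map elements into the vehicle frame. It is a simplified 2D sketch under stated assumptions: poses are (t, x, y, yaw) samples, yaw changes little between samples, and the element labels, the 50 m range, and all function names are hypothetical.

```python
import math

def interpolate_pose(t, pose_a, pose_b):
    """Linearly interpolate between two (t, x, y, yaw) pose samples.

    Assumes the yaw change between samples is small (no angle wrap-around).
    """
    ta, xa, ya, yawa = pose_a
    tb, xb, yb, yawb = pose_b
    r = (t - ta) / (tb - ta)
    return (xa + r * (xb - xa), ya + r * (yb - ya), yawa + r * (yawb - yawa))

def world_to_vehicle(point, pose):
    """Express a world-frame point in the frame of a vehicle at pose (x, y, yaw)."""
    px, py, yaw = pose
    dx, dy = point[0] - px, point[1] - py
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * dx + s * dy, -s * dx + c * dy)

def project_ground_truth(elements, pose, max_range=50.0):
    """Keep base map elements within range and return them in the vehicle frame."""
    projected = []
    for label, point in elements:
        vx, vy = world_to_vehicle(point, pose)
        if math.hypot(vx, vy) <= max_range:
            projected.append((label, (vx, vy)))
    return projected

# Hypothetical base map annotations in world coordinates (metres).
basemap = [("stop_line", (10.0, 0.0)), ("speed_bump", (200.0, 0.0))]

# Pose samples before and after a sensor frame captured at t = 0.5 s.
pose = interpolate_pose(0.5, (0.0, 0.0, 0.0, 0.0), (1.0, 4.0, 0.0, 0.0))
frame_truth = project_ground_truth(basemap, pose)
print(pose, frame_truth)
```

A production system would work in 3D with full calibration and motion compensation; the point here is only that, once poses and timestamps are aligned, labels follow from a coordinate transform rather than from fresh manual annotation.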
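The cost-sharing argument can be made concrete with some simple arithmetic. The sketch assumes the build cost is split evenly among partners and normalizes the cost of one high-precision map to 1; only the 3-5x multiplier comes from the talk, and the function names are hypothetical.

```python
import math

def per_partner_cost(hd_map_cost, multiplier, partners):
    """Base map cost per partner when the build is split evenly (an assumption)."""
    return hd_map_cost * multiplier / partners

def partners_to_break_even(multiplier):
    """Smallest partner count at which each share is at most one high-precision map."""
    return math.ceil(multiplier)

hd_map_cost = 1.0  # normalised: one high-precision map = 1 unit
for multiplier in (3, 5):
    n = partners_to_break_even(multiplier)
    print(multiplier, n, per_partner_cost(hd_map_cost, multiplier, n))
```

Under these assumptions, three to five partners sharing one base map would each pay roughly what a single high-precision map costs, and every additional partner lowers the share further.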

Ground truth vehicle solution exploration

For dynamic elements, a new solution has emerged in recent years: the ground truth vehicle. In short, a set of ground truth sensors is mounted on the vehicle under test, offering higher accuracy, wider coverage, smaller blind spots, and perception redundancy. These sensors collect higher-precision perception data, and a ground truth algorithm then generates the ground truth results directly. The results can be projected onto the perception data of the vehicle under test, or the two can be compared against each other.

Building a ground truth vehicle involves ground truth data collection, data management, ground truth generation, and downstream ground truth applications such as evaluation. For this, we provide a relatively complete technical solution and the corresponding data services.

Ground truth vehicles face high modification costs, long lead times, and difficult downstream data fusion. That is why we launched our own ground truth vehicle solution. It uses an integrated ground truth module that combines the ground truth sensor module and the data acquisition module. It has a split design, so it can be mounted on a vehicle quickly and adapts well to both SUVs and sedans.

We also provide comprehensive retrofitting services, including time synchronization, calibration, and other technical services, to head off problems in later data use. In one real application, a Hongqi vehicle adopted our ground truth vehicle solution with the split design, and both its appearance and its data collection are very stable. The perception sensor modules are also fitted with high-precision equipment.

After the ground truth vehicle is built, the next phase is data collection, where enterprises face compliance issues. With increasingly stringent policies and regulations, compliance has become a major challenge. As one of the map vendors holding dual Class A surveying and mapping qualifications, we can provide enterprises with comprehensive qualification support and compliance solutions to keep data collection compliant. With years of data collection experience, we also have strong collection skills and route planning capabilities that can accurately cover the scenarios enterprises require.

Once collected, the data enters our data management platform, which can be customized for the characteristics of intelligent driving data and the needs of downstream applications. Its rich tool library supports cleaning, compliance processing, and data mining of intelligent driving data. The platform also has a built-in automated workflow engine that handles an enterprise's routine, large-scale, repetitive data processing as a streamlined pipeline.

Within the data management module, the ground truth generation algorithm is a crucial link. Generating ground truth data depends on two components: the accumulated ground truth algorithms and the manual tuning tools. The ground truth algorithms draw mainly on accumulated recognition algorithms for intelligent driving and on the application of Baidu's large models in this field. At present, our ground truth models cover more than 20 annotation scenarios. Besides static ground truth, they cover dynamic participants such as pedestrians, vehicles, and animals, as well as traffic cones and other scenarios.
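To show why time synchronization matters for later data use, here is a minimal sketch that pairs frames from the vehicle under test with the nearest ground truth sensor frame. It assumes both timestamp streams already share one clock, and the 50 ms tolerance and the name `match_frames` are hypothetical choices.

```python
from bisect import bisect_left

def match_frames(truth_ts, test_ts, tolerance=0.05):
    """Pair each test timestamp with the nearest truth timestamp within tolerance.

    truth_ts must be sorted in ascending order; timestamps are in seconds.
    """
    pairs = []
    for t in test_ts:
        i = bisect_left(truth_ts, t)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = truth_ts[max(i - 1, 0): i + 1]
        if not candidates:
            continue
        best = min(candidates, key=lambda c: abs(c - t))
        if abs(best - t) <= tolerance:
            pairs.append((t, best))
    return pairs

truth_ts = [0.00, 0.10, 0.20, 0.30]  # ground truth sensor frames
test_ts = [0.02, 0.16, 0.50]         # frames from the vehicle under test
print(match_frames(truth_ts, test_ts))
```

Frames with no ground truth counterpart inside the tolerance are dropped rather than force-matched, since a badly timed pairing would put moving objects in the wrong place.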


Source: Speaker material

Because the automatic recognition results of the ground truth system are only about 90% accurate, manual tuning is still required afterwards. For this we have a built-in manual tuning platform for quick data verification and manual correction.

One application of ground truth is to project it onto the original vehicle's sensor data, automatically generating ground truth in place of manual annotation. Another is ground truth evaluation: the ground truth can be quickly imported into the evaluation platform. The platform was built together with customers in earlier projects and has accumulated more than 30 model evaluation metrics, and it will continue to improve through practical use and ongoing cooperation with enterprise customers.

Baidu Intelligent Cloud has one to two years of technical exploration and hands-on project experience with both of these solutions. We have a complete offering for data acquisition and can provide the corresponding Baidu Intelligent Cloud technical capabilities. Both solutions are forward-looking and exploratory for the industry, and we look forward to working with industry partners, OEMs, and Tier 1 customers to reduce the difficulty and cost of obtaining ground truth and to accelerate the deployment of intelligent driving models.

(The above is compiled from the keynote speech "Reforming Ground Truth Solutions for Intelligent Driving, with Technology Leading a New Wave of Data Cost Reduction", delivered by Zhang Xiaoxiao, senior product manager in the AI Data Service Department of Baidu Intelligent Cloud, at the 2nd Automotive Artificial Intelligence Conference on April 17-18, 2024.)
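One common way to combine automatic results with manual tuning is to route only uncertain labels to human reviewers. The sketch below assumes each automatic label carries a confidence score, which the talk does not state; the threshold and field names are hypothetical.

```python
def triage(labels, threshold=0.9):
    """Split automatic labels into auto-accepted and manual-review lists."""
    auto, manual = [], []
    for label in labels:
        (auto if label["confidence"] >= threshold else manual).append(label)
    return auto, manual

# Hypothetical automatic labels with confidence scores.
labels = [
    {"id": 1, "type": "vehicle", "confidence": 0.98},
    {"id": 2, "type": "pedestrian", "confidence": 0.71},
    {"id": 3, "type": "traffic_cone", "confidence": 0.93},
]
auto, manual = triage(labels)
print([l["id"] for l in auto], [l["id"] for l in manual])
```

The threshold trades reviewer workload against the risk of accepting a wrong label, so in practice it would be tuned per annotation scenario.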
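As a rough illustration of ground truth evaluation, the sketch below scores a perception model's detections against ground truth positions using precision and recall. These two metrics are an assumption chosen for illustration, since the talk does not list the platform's 30-plus indicators, and the greedy nearest-neighbor matching and 1 m threshold are likewise hypothetical.

```python
import math

def evaluate(detections, ground_truth, max_dist=1.0):
    """Greedily match detections to ground truth and return (precision, recall)."""
    unmatched = list(ground_truth)
    true_positives = 0
    for det in detections:
        if not unmatched:
            break
        nearest = min(unmatched, key=lambda g: math.dist(det, g))
        if math.dist(det, nearest) <= max_dist:
            unmatched.remove(nearest)  # each ground truth object matches once
            true_positives += 1
    precision = true_positives / len(detections) if detections else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall

# Hypothetical object positions in metres, in a shared vehicle frame.
ground_truth = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
detections = [(0.2, 0.0), (10.5, 0.0), (40.0, 0.0), (41.0, 0.0)]
precision, recall = evaluate(detections, ground_truth)
print(precision, recall)
```

Here two of four detections match ground truth objects and one ground truth object is missed, so precision is 0.5 and recall is about 0.67.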