laitimes

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024

author:Journal of Surveying and Mapping
Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024

The content of this article comes from the Journal of Surveying and Mapping, Issue 2, 2024 (drawing review number: GS Jing (2024) No. 0297)

Optimal extraction method of road alignment combination adapted to different trajectory data scenariosYao Zhipeng1

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024

, PENG Cheng1, TANG Jianbo1,2

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024

, LIU Guoping3, YANG Xuexue1,2, LIU Huimin1, DENG Min1,2 1. School of Earth Sciences and Information Physics, Central South University, Changsha 410083, Hunan, China;

2. Hunan Geospatial Information Engineering Technology Research Center, Changsha 410007, China;

3. Beijing Didi Chuxing Technology Co., Ltd., Beijing 100089 Foundation Projects: National Natural Science Foundation of China (42271462; 42171441;42271485); National Key R&D Program of China(2022YFB3904203); Natural Science Foundation of Hunan Province (2021JJ40727; 2022JJ30703; 2020JJ4749) Abstract:Vehicle trajectory data is an important data source for the dynamic update of urban navigation road network maps, and extracting and fitting road geometry from chaotic trajectory points or trajectory lines, and then generating a structured road vector map is a key step in the construction and update of road network maps based on trajectory data. However, the geometry of real roads is complex and diverse, and the quality of vehicle trajectory data is uneven, resulting in a single road alignment fitting algorithm that can only be applied in some specific data scenarios, and cannot adaptively fit the ideal road centerline for different data scenarios. In addition, compared with the high-frequency trajectory data collected by professional measurement methods, the low-frequency trajectory data collected by taxis and other countries have problems such as sparse trajectory points, more noise, and large positioning errors, which makes it still challenging to extract the ideal road centerline from the low-frequency trajectory data, especially for complex intersection areas. Therefore, based on the idea of divide and conquer strategy, this paper proposes an optimal extraction method for road alignment combination suitable for different trajectory data scenarios. On the basis of trajectory data preprocessing, the data is classified according to the distribution characteristics of trajectory data. Then, the optimal line fitting algorithm is matched for different data scenarios, and the ideal road centerline is generated through the combination optimization strategy. The proposed method integrates the complementary advantages of different fitting algorithms, which can effectively solve the problem of road alignment fitting in different data scenarios such as sparse data distribution and complex road structure (such as self-intersecting overpasses). Based on the trajectory data of Beijing taxis, the average position accuracy of the road generated by the proposed method is 1.24 m, which is significantly better than the existing representative methods. Keywords: trajectory data, road centerline, line fitting, adaptive, road network extraction

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024
Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024

Citation format: Yao Zhipeng, Peng Cheng, Tang Jianbo, et al. Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios[J]. Journal of Surveying and Mapping,2024,53(2):379-390. DOI: 10.11947/j.AGCS.2024.20220606

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024

YAO Zhipeng, PENG Cheng, TANG Jianbo, et al. An adaptive road centerline extraction method for different trajectory data scenarios based on combinatorial optimization[J]. Acta Geodaetica et Cartographica Sinica, 2024, 53(2): 379-390. DOI: 10.11947/j.AGCS.2024.20220606

Read more: http://xb.chinasmp.com/article/2024/1001-1595/20240216.htm Introduction

The current navigation road network map is an important data cornerstone for location-based services such as route planning and travel time estimation. Because the vehicle GNSS trajectory data records the vehicle position, movement direction, and status, the trajectory data is large and dynamically updated quickly, which can perceive the geometry, topology, and traffic status of the road in real time, and contains rich road network information, which has become an important data source for the generation and change of navigation road network maps [1-3]. Extracting road geometric centerlines from chaotic trajectory points or trajectory lines is a key step in the generation and update of road network maps based on vehicle trajectory data. The existing research on road centerline extraction based on trajectory data mainly includes the following methods.

Curve fitting methods are usually based on mathematically obtaining a curve that matches known data and using that curve as the centerline of the road. Ref. [4—5] Based on the mathematical idea of least squares method to fit the road centerline, this method is simple and convenient, but it is easily affected by noise trajectories. In Ref. [6—8], the B-spline fitting method is used to fit a smooth and continuous curve, which can avoid the deviation of the whole curve due to abnormal data, and the method needs to set appropriate control points according to the road shape, otherwise the fitting effect will be affected. In Ref. [9], the piecewise linear fitting method is used to fit the new straight road, but there are some limitations when it is used in non-straight road scenarios. Main curve fitting is another commonly used curve fitting algorithm for extracting road centerlines from trajectory data [10-15], which can extract arbitrary road alignments with self-intersection, but this method is susceptible to noise and generates wrong road shapes in the trajectory sparse scenario.

The trajectory rasterization method first converts the trajectory data into a binary image, and then uses image morphology and refinement algorithms to extract the road centerline [16]. In Ref. [17—18], the trajectory data was converted into binary images by kernel density estimation, and then the road centerline was extracted according to the density stratification. This method can reduce the interference of noisy data and adapt to scenarios where data is relatively sparse, but the extraction results will have the problems of glitches and loss of topological information. For areas with complex road morphology such as three-dimensional intersections, it is difficult to extract the fine geometry of roads by rasterization methods.

The random selection method randomly selects a trajectory from the trajectory cluster passing through the road as the road centerline, which is simple and efficient in high-quality trajectory data, but is easily affected by noise trajectory in low-frequency trajectory data. In Ref. [19], the trajectory increment method was used to extract the road network, and the trajectory was randomly selected as the road centerline to incrementally expand the road network. In Ref. [20], the trajectory aggregation method based on the gravitational-repulsive model [21] was used to gather trajectories and improve the quality of trajectory data, but due to the high noise and sparsity of low-frequency sampled trajectory data, the randomly selected method still has many limitations when applied to low-frequency trajectory data.

The graph theory-based method generally constructs the graph through the trajectory points, and then extracts the centers of the primitives in the graph based on the graph theory algorithm, and connects these centers in turn to form the road centerline. Ref. [22] and ref. [23—24] proposed methods based on Voronoi diagram and Delaunay triangulation network to extract road centerlines, respectively. Ref. [25—26] proposes a cross-section road center method, which cuts the trajectory cluster according to a certain step size to obtain the cross-sectional center, and then connects the cross-sectional center to obtain the road centerline. Similarly, the road centerline was extracted by finding the density center point as the seed point for connection in Ref. [27]. When the trajectory data is sparse, this type of method is prone to position deviation when calculating the center point of the primitive, and the graph-based method is sensitive to noise.

To sum up, most of the existing road centerline extraction methods in the trajectory data use a single algorithm or strategy to extract the road centerline, and do not fully consider the complex diversity of trajectory data distribution and road morphology, resulting in these methods can only be applied in some specific data scenarios, and cannot adaptively extract the ideal road centerline according to the characteristics of the trajectory data itself. As shown in Figure 1, how to adaptively fit the ideal road centerline in complex and diverse data scenarios is still a difficult problem in the current research on road centerline extraction due to the complex road morphological structure and the high noise and sparse sampling of low-frequency trajectory data.

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024
图 1 低频轨迹数据中道路中心线拟合的主要挑战Fig. 1 Challenges of road centerline extraction from low frequency trajectory data
Diagram options

Therefore, based on the divide and conquer strategy, this paper proposes an optimal extraction method for road alignment combinations that is suitable for different trajectory data scenarios. Based on the idea of ensemble learning, through the systematic combing and applicability analysis of the existing methods, the method integrates the existing methods to establish the optimal matching rules between different data scenarios and the road alignment extraction algorithm according to the advantages of road alignment extraction in different data scenarios, and constructs an adaptive fitting method for complex road centerlines in trajectory data based on divide and conquer strategy. Compared with the existing methods, the advantage of this method is that it can be applied to extract the road centerline in different trajectory data scenarios, and effectively solve the linear adaptive fitting problem of complex roads (such as self-intersecting overpasses) in low-frequency trajectory data.

1 Adaptive extraction method of road centerline

Vehicle trajectory is the movement footprint left by the vehicle driving on the road, which can reflect the real-time status information of the road, and is one of the important data sources for the construction and dynamic update of navigation road network maps. Although the trajectory data collected by professional survey vehicles has a high sampling frequency (≤5 s), less data noise and high positioning accuracy, it is difficult to realize real-time data collection of large-scale urban road network due to the high acquisition cost. Crowdsourced trajectory data, represented by taxi trajectory data, has become an important data source for real-time acquisition and update of road information due to its characteristics of fast data update, wide coverage, and low acquisition cost [3, 28-30]. However, the average sampling frequency of crowdsourced trajectory data collected by taxis is low (>30 s), noisy, and inaccurate positioning, and it is difficult for a single trajectory to reflect the accurate location of the road centerline. Therefore, in the road centerline extraction study based on trajectory data, the data distribution center of the trajectory along the road direction is usually taken as the road centerline, and the trajectory clusters distributed in the same lane are found by the method of trajectory line clustering [14, 20, 27] or trajectory point clustering [15, 28-29], and then the road centerline is extracted from the trajectory cluster to generate a vectorized road network map. Therefore, on the basis of the existing trajectory clustering, the main problem solved in this paper is how to fit a road centerline representing the current road or lane from the trajectory cluster for different trajectory data scenarios. For the sake of description, the following definitions are given.

(1) Definition 1 (trajectory): A trajectory T is a sequence composed of multiple trajectory points pi. A trajectory is defined as T={p1, p2, ..., pn}, where the trajectory point pi contains information such as vehicle ID, sampling point coordinates, sampling time, direction of movement, and speed.

(2) Definition 2 (trajectory cluster): A trajectory cluster T-cluster is a set of one or more similar trajectory Ti. A trajectory cluster is defined as T-cluster={T1, T2, ..., Tk}, and each trajectory in the trajectory cluster is collected on the same road or lane, which is similar in morphology, similar in distance, and the beginning and end points of the trajectory are also adjacent to each other.

(3) Definition 3 (center trajectory): A center trajectory Tcenter refers to the trajectory closest to the center of the trajectory cluster. A center track is defined as the track that is the least sum of distances from the other tracks in the track cluster. The distance measurement method can be measured in terms of Hausdorff distance, which is defined as shown in Eq. (1).

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024

(1)

where TA and TB represent two different trajectories in the trajectory cluster; pi and pk represent the trajectory points on TA and TB.

(4) Definition 4 (road centerline): The road centerline C represents a feature line formed by the sequential connection of a trajectory cluster along the data distribution center points in the direction of the road, which can reflect the plane position and straightness change of the road. In this paper, the road centerline is obtained by algorithm fitting from a trajectory cluster, which is a sequence composed of multiple fitting points Fi. A road centerline is defined as C={f1, f2, ..., fn}, where the fitting point fi=(xi, yi) is a two-dimensional coordinate point. A simple example of a single track, a cluster of tracks, and a road centerline is shown in Figure 2.

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024
图 2 轨迹、轨迹簇和道路中心线Fig. 2 Trajectory, trajectory cluster and road centerline
Diagram options

In view of the problem that most of the existing methods use a single algorithm to fit the road centerline, it is difficult to adapt to different complex data scenarios. Based on the idea of divide and conquer strategy, this paper synthesizes the advantages of existing methods and proposes a road alignment combination optimization extraction method suitable for different trajectory data scenarios. Firstly, the original trajectory data was preprocessed to reduce the influence of noise trajectory. Then, according to the number and length of trajectories, the complex and diverse data were classified into scenes. Finally, the characteristics of the existing fitting methods were analyzed, and the optimal fitting algorithm was selected and proposed for the road centerline combination extraction for the classified data scene. The overall flow of the method in this article is shown in Figure 3.

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024
图 3 道路中心线自适应提取方法Fig. 3 The adaptive road centerline extraction method
Diagram options

1.1 Trajectory cluster preprocessing

Due to the influence of GNSS equipment such as positioning error or signal occlusion, there may be some trajectories in the vehicle trajectory data that deviate far from the actual road or lane centerline, and these noise trajectories will affect the accuracy of road centerline extraction. Inspired by the fact that the trajectory distribution on the same lane is approximately in line with the Gaussian distribution [27], the trajectories within a certain threshold range from the center trajectory Tcenter are extracted, and the trajectories that are far away from the center trajectory are excluded as noise. As shown in Figure 4(a), the statistical distribution of the distance from each track to the center trajectory in the track cluster on a certain road in this experiment is shown, and it can be found that the farther away from the center trajectory is, the less the trajectory distribution. Based on this, the center trajectory Tcenter of the trajectory cluster is calculated according to equation (1), and then the Hausdorff distance dT from other trajectories in the trajectory cluster (such as Ti) to the center trajectory Tcenter is calculated separately, and if dT satisfies the following conditions, the trajectory Ti is identified as a noise trajectory and is eliminated from the trajectory cluster

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024

(2)

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024
图 4 轨迹分布特征及噪声轨迹剔除示例Fig. 4 Distribution characteristics of trajectories and noise trajectory filter process
Diagram options

where dT denotes the Hausdorff distance from the trajectory to the center trajectory; x and σ represent the mean and standard deviation of the Hausdorff distance from all trajectories to the center locus in the trajectory cluster, respectively. c represents the adjustment coefficient, and in this experiment, c=3 is set according to the 3σ criterion. The physical meaning of Eq. (2) is to divide a reasonable distance interval from the center of the center to screen out the trajectories closer to the center, as shown in Fig. 4(b), so even if the trajectory distribution in the trajectory cluster does not conform to the Gaussian distribution characteristics, the trajectory closer to the center can still be screened.

At the same time, due to the low sampling frequency of low-frequency trajectory data, the trajectory points in the trajectory cluster are sparse, which affects the effect of the curve fitting algorithm. In this paper, after removing the noise trajectories, a simple linear interpolation algorithm is used to encrypt the trajectory points of each trajectory in the trajectory cluster, which is simple and efficient, which can increase the number of trajectory points without changing the original form of the trajectory, meet the requirements of the fitting algorithm for the number of trajectory points, and reduce the fitting error caused by the small number of trajectory points. Specifically, when the distance between adjacent track points in a trajectory is greater than a certain distance threshold s, the track point interpolation is performed, otherwise the interpolation is not performed. The smaller the distance threshold s, the denser the trajectory points after interpolation, and the smaller the error of the trajectory line expressed by the track point set, but the computational amount of trajectory point processing will also increase accordingly, in order to balance the trajectory representation error and computational efficiency, s is set to 3 m in this experiment. In this paper, linear interpolation is used for track point interpolation, which is simple and efficient, which can increase the number of track points without changing the original form of the trajectory, meet the requirements of the fitting algorithm for the number of track points, and reduce the fitting error caused by the small number of track points.

1.2 Classification of data scenarios

Due to the different distribution densities of vehicle trajectories on different roads and the complex and diverse road forms, especially urban three-dimensional intersections, it is difficult for a single fitting algorithm to be applied to the extraction of road centerlines in different data scenarios. As shown in Figure 5, the applicability of the currently commonly used K-segment main curve algorithm [31] and B-spline fitting algorithm [32] in different data scenarios is compared. For complex interchange self-intersection scenarios (Fig. 5(a)), the K-segment main curve algorithm has a better fitting effect than the B-spline fitting algorithm, but for the data sparse scenario with fewer trajectories (Fig. 5(b)) and long-distance scenarios (Fig. 5(c)), the B-spline fitting algorithm has more advantages. There are differences in the applicable data scenarios of different fitting algorithms, so it is necessary to classify the data and find the optimal method to extract the road centerline for different data scenarios, so as to improve the universality of the algorithm and the accuracy of road centerline extraction in the trajectory data. Through the observation and comparative analysis of a large number of trajectory data, this paper classifies the data scene of trajectory clusters according to three data characteristics: the number of trajectories, the length of trajectories, and the morphology of trajectories.

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024
图 5 K段主曲线算法与B样条拟合的提取结果比较Fig. 5 Comparison of K-segment principal curve algorithm and B-spline fitting method
Diagram options

(1) According to the number of trajectories in the trajectory cluster, the data is divided into sparse and non-sparse scenarios. If the number of trajectories in a trajectory cluster is less than tnum, it is represented as a sparse scene, otherwise it is a non-sparse scene.

(2) According to the average length of the trajectory in the trajectory cluster, it is divided into short-distance, medium-distance and long-distance scenarios. When the average length of the trajectory is less than tshort, it is represented as a short-distance scenario. If it is greater than tlong, it is represented as a long-distance scenario. Others are medium-range scenes.

(3) According to the morphological characteristics of the trajectories in the trajectory cluster, the trajectories are divided into self-intersecting and non-self-intersecting scenarios.

1.3 Combined extraction strategy and optimization method of road centerline

1.3.1 Analysis of applicable scenarios of different fitting algorithms

By classifying the trajectory data, the appropriate road centerline extraction method can be selected in a more targeted manner. For the high-frequency trajectory data collected by professional equipment, the commonly used curve fitting algorithms (such as B-spline fitting) can achieve good results. In the existing research, the extraction of road centerlines for low-frequency trajectory data is still a difficult point. Among them, the trajectory rasterization method [17], the cross-section road center method [25] (Fig. 6(a)), the B-spline fitting algorithm [32] and the K-segment main curve algorithm [31] are the representative methods for road centerline extraction in the current low-frequency trajectory data. However, these methods still have certain limitations. For example, the road centerline extracted by the trajectory rasterization method is prone to glitches and loss of steering information. The line segment extracted by the cross-section road center method is not smooth enough, and is easily affected by adjacent trajectories and noise, which causes the cross-section center to shift, as shown in Fig. 6(b). The B-spline fitting algorithm can extract smooth curves, but it is easy to be disturbed by noise, and the extracted road centerline may deviate from the true center position.

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024
图 6 横截道路中心方法及其存在的问题Fig. 6 Illustration of the sweeping method and its limitations
Diagram options

Compared with the previous methods, the K-segment main curve algorithm can extract the road centerline that is smooth and close to the data centerline, and the main principle of the method is as follows: without considering the direction of the track point, the data is fitted by continuously inserting and updating the position of the principal component line segments, until the fitting error threshold or the maximum predetermined value is reached, and then the road centerline that is smooth and close to the data center is extracted by using the inserted line segments as the skeleton to connect these line segments through graph optimization. An example of a simple segment insertion of this algorithm is shown in Figure 7(a). The principle of this method makes it better suitable for low-frequency trajectory data, but it also has the following problems: (1) in the sparse scene, too little data affects the insertion effect of line segments, resulting in poor curve fitting effect, as shown in Fig. 5(b); (2) In short-distance or long-distance scenarios, overfitting or underfitting problems are prone to occur, as shown in Figure 7(b) and the corresponding line segment insertion results, the direction estimation of some inserted line segments in short-distance scenarios is incorrect, resulting in connection errors resulting in overfitting, and in long-distance scenarios, insufficient insertion of line segments leads to underfitting due to failure of fitting results close to the data distribution center; (3) In some scenarios that are easy to induce line segment insertion errors or connection errors (such as U-shaped vehicle U-turn scenes), fitting errors may occur.

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024
图 7 K段主曲线方法及其存在的问题Fig. 7 K-segment method and its existing problem
Diagram options

1.3.2 Road centerline extraction method based on combinatorial optimization strategy

Based on the analysis of the applicable data scenarios and limitations of the existing methods, this paper proposes a road centerline extraction method based on the combined optimization strategy on the basis of the existing methods. The method consists of a combinatorial extraction strategy and an optimization step. The combined extraction strategy is to preferentially use the K-segment main curve algorithm to extract the road centerline when encountering the data scenario adapted to the K-segment main curve algorithm. When the result of the K-segment main curve generation is wrong or encounters the data scenario where the K-segment main curve algorithm is not applicable, the improved cross-section road center method is used to extract the road centerline. The improved cross-section road center method proposed in this paper is to first cross-cut the road center by using the trajectory point of the center trajectory as the anchor point (Fig. 6(a)), and then the cross-section road center point is fitted with the B-spline curve fitting method to obtain a smooth road centerline. In the experimental analysis, it is found that the combined extraction strategy can be applied to most of the data scenarios, but when the trajectory data is sparse and there are chaotic local self-intersection situations (as shown in Fig. 8(b), the improved cross-section road center method is easily affected by these chaotic trajectory segments. Therefore, this paper further proposes a center trajectory optimization method based on K-means algorithm, which replaces the improved cross-section road center method to extract the road centerline when the trajectory is sparse and there is a self-intersection scene or a trajectory U-turn scenario.

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024
图 8 基于K-means的中心轨迹优化算法Fig. 8 Road center line optimization algorithm based on K-means
Diagram options

In this paper, K-means clustering is performed on the trajectory points after the central trajectory interpolation encryption, and the clusters with abnormal number of trajectory points (locally chaotic trajectory segments) are eliminated. Then, the center points of each cluster were connected as control points (anchor points) in the order of the original trajectory, and the smooth road centerline was obtained by using the B-spline fitting method. This method can be used to optimize the central trajectory in the following two aspects: (1) due to the curling of the trajectory morphology, the number of abnormal tracks at the local self-intersection position will be formed after clustering (Fig. 8(b)), and the local trajectory noise of the central trajectory can be removed by eliminating the clusters with an abnormal number of points (Fig. 8(b)); (2) The center trajectory can be divided into multiple sequentially arranged clusters, and then the center point of each cluster can be used as the control point for B-spline curve fitting, which can make the center trajectory smoother, as shown in Figure 8 before and after the optimization of the center trajectory in the overpass self-intersection and local self-intersection scenarios. There are two key parameters to be determined in this method, one is the initial number of clusters of K-means clustering, and the other is the determination threshold of the abnormal cluster of the number of points.

(1) K-means clustering of the central trajectory is essentially a region segmentation of the central trajectory, each cluster obtained by K-means is a small region, and the approximate length of each cluster Δl is preset, and the initial number of K-means clusters can be determined by the ratio of the total length of the trajectory centerline to Δl; At the same time, in order to avoid the misaggregation of trajectory points on adjacent roads, the setting of Δl can be set according to the average width of the road in the study area, and Δl is set to 10 m in this paper.

(2) There are areas of local self-intersection and abnormal morphology in the low-frequency trajectory data, which affects the extraction accuracy of the road centerline. The distribution of trajectory points in these areas is more dense than that in the normal trajectory segment, and when the number of trajectory points n in the cluster satisfies Eq. (3), it is judged to be an abnormal cluster

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024

(3)

where you and δ represent the mean and standard deviation of the number of trajectory points in the cluster, and m is the coefficient (m is set to 2 in this experiment).

To sum up, this paper adopts the K-segment main curve algorithm to extract first, and the improved cross-section road center method and the center trajectory optimization method complement each other to extract the road centerline combination optimization extraction strategy, and the method flow is shown in Figure 9. Among them, the long-distance scene can be transformed into a medium-distance scene through segmentation, and then the road centerline can be extracted by using the K-section main curve method. The fitting quality evaluation link is to carry out quality control of the extraction results of the road centerline, that is, when the Hausdorff distance between the extraction results and the center trajectory is greater than the given threshold tdist, the quality of the extraction results is considered to be poor, and the other two methods are used to replace them. The U-turn scene is judged for the improved cross-section road center method, and the problem shown in Figure 6(b) exists in the U-turn scenario, so the center trajectory optimization method is also used to extract the road centerline in the U-turn scenario.

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024
图 9 道路中心线组合提取策略Fig. 9 The road centerline combination extraction strategy
Diagram options

2 Test analysis

2.1 Data and Test Environment

In order to verify the effectiveness of the proposed method, the trajectory data of Beijing taxis was used for experimental analysis. The trajectory data collection time is from 8 a.m. to 12 a.m. on August 20, 2020, including vehicle ID, sampling time and coordinate information, with a sampling frequency of 10~40 s, about 380,000 trajectories, and an average trajectory length of 4.2 km. In this paper, the road centerline extraction algorithm is implemented based on Python language, and the experimental test environment is a Spark 10-node cluster environment running on an Ubunu system, and each node is equipped with 8-core Intel processors and 16 GB memory.

2.2 Determination of algorithm parameters

(1) Algorithm parameters of the K-segment main curve [13, 29]. In this paper, the fitting error is set to 3 m, the angle smoothing penalty coefficient connecting each line segment is set to 1.0, and the maximum value is set to 30, which can avoid the problem of algorithm efficiency and overfitting due to too many line segments being inserted, and on the other hand, it can also extract the ideal road centerline for the trajectory cluster of the medium-distance scene.

(2) Determination of data scene classification parameters. Considering the position accuracy requirements of the vehicle navigation road network, the 5 m is taken as the qualified limit, when the number of trajectories in the K segment main curve is 3 or more, the fitting effect can be guaranteed, and when the average length of the trajectories is greater than 100 m and less than 2 km, the overfitting and underfitting problems can be effectively avoided. Therefore, in this experiment, the tnum is set to 3, and the tshort and tlong are set to 100 m and 2 km, respectively.

(3) Determination of the combined extraction strategy and optimization method parameters of the road centerline. In the quality evaluation process, it is necessary to judge whether there is a fitting error in the K-segment main curve algorithm, and considering the position accuracy requirements of the vehicle navigation road network, the tdist is set to 5 m in this paper. In the centerline optimization algorithm based on K-means, the parameter Δl is set to 10 m according to the average width of the road.

2.3 Analysis of Results

Intersections are the most structurally complex areas of the road network, and they are also the most challenging part of road structure extraction based on trajectory data. Therefore, in order to verify the ability of the proposed method to extract complex road centerlines, the intersection identification and trajectory clustering methods in Ref. [15] were used to divide the trajectory data into intersection and non-intersection areas, and the trajectory clusters in the intersection and non-intersection areas were obtained, and then the road centerline structure of the road network was extracted by the method in this paper, as shown in Figure 10. Among them, A-D magnify the trajectory clusters and road centerline extraction results in some intersection areas, respectively. It can be found that the proposed method can accurately extract the smooth road centerline from the original trajectory cluster, and the extraction results reflect the real road shape of the intersection, which is consistent with the manual recognition results. Furthermore, some other representative data scenarios were selected from the study area for road centerline extraction test analysis, and the extraction results are shown in Figure 11. The experimental results show that the proposed method can also accurately extract the road centerline in some complex data scenarios.

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024
图 10 研究区域内道路中心线提取结果Fig. 10 Some results of road center lines extracted by the proposed method
Diagram options
Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024
图 11 典型数据场景下的道路中心线提取结果Fig. 11 Results in some representative data scenarios
Diagram options

2.4 Comparative analysis

In order to verify the applicability of the proposed method, several typical data scenarios such as sparse trajectories and complex roads were selected, and the extraction results of the proposed method were compared with the existing representative methods. The representative methods compared in this paper include the random selection method [19], the B-spline fitting method [32], the segmented linear fitting method [33], the K-segment main curve method [31], the trajectory rasterization method [17], and the cross-section road center method [25]. The extraction results of the method in this paper and the comparison method are shown in Figure 12. In the data scenario of sparse trajectories and local noise trajectory segments (Fig. 12(a)), the trajectory rasterization method and the proposed method can reduce the influence of local self-intersecting noise trajectory segments while maintaining the morphological characteristics of the original trajectories, and there is a large position deviation in the road centerline extraction results of the piecewise linear fitting method, the cross-section road center method and the K-segment main curve method. As shown in Fig. 12(b) and Fig. 12(c), in the data scenario of complex road structures such as overpasses, the results of the segmented linear fitting method and the trajectory rasterization method extracted from the intersecting road centerline are of poor quality, and the random selection method, the B-spline fitting method, and the cross-section road center method can extract the main form of the road, but due to the influence of noise, the position of the extracted road centerline deviates from the trajectory data distribution center, and the K-segment main curve method and the proposed method can accurately extract the complex road centerline. It has a high degree of fit with the distribution center of trajectory data.

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024
图 12 不同道路中心线提取方法的结果对比Fig. 12 Comparison of results by different methods
Diagram options

Furthermore, in order to quantitatively evaluate the accuracy of the road centerline extracted by the proposed method, the original trajectory data was superimposed with the OSM road network data, and it was found that there was a large position shift between the trajectory data and the OSM road network data (Fig. 13(a)). To this end, 652 feature points (such as road center points, intersections, etc.) are marked in the trajectory data by manual annotation, as shown in Fig. 13(b), and then the average offset distance between these manually marked feature points and the points with the same name extracted by the algorithm is calculated to evaluate the position accuracy of the extracted road centerline. Table 1 shows the location accuracy of the seven comparison methods, and it can be found that the position accuracy of the road centerline extracted by the proposed method is significantly better than that of other representative methods.

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024
图 13 精度评价方法的选择Fig. 13 Selection of accuracy evaluation method
Diagram options

表 1 不同方法提取道路中心线的位置精度比较Tab. 1 Accuracy evaluation of road centerlines extracted by different methods m

Yao Zhipeng, School of Earth Sciences and Information Physics, Central South University: Optimal Extraction Method of Road Alignment Combination Adapted to Different Trajectory Data Scenarios |Journal of Surveying and Mapping, Vol. 53, No. 2, 2024

Table options

3 Conclusion

Road centerline extraction in trajectory data is a key step in generating vectorized road network maps and road network map updates based on trajectory data. Although many classical algorithms have been proposed for this study, most of the existing methods use a single fitting strategy for road center extraction for specific data scenarios, which is difficult to apply to the road center line extraction needs in different data scenarios such as sparse trajectory data and complex road morphology. At the same time, due to the high noise of low-frequency trajectory data and the sparseness of sampling points, how to accurately extract the road centerline from low-frequency trajectory data is still a difficult problem to be solved. In view of the existing problems, based on the divide and conquer strategy and the advantages of different road centerline extraction algorithms, this paper proposes a road alignment combination optimization extraction method suitable for different trajectory data scenarios, classifies different data scenarios according to the data distribution characteristics, and selects the optimal road centerline fitting method for different data scenarios. The experimental analysis of Beijing taxi trajectory data shows that the proposed method can effectively extract the road centerline for different data scenarios, and the location accuracy is significantly better than that of the existing representative methods. Compared with the existing research, the main contributions of this method are as follows: (1) Considering that complex and diverse data scenarios will make a single road centerline extraction method inapplicable, the data is reasonably classified by scenes, and the appropriate method for road centerline extraction can be selected for different types of data scenarios, which can obtain better results. (2) In view of the characteristics of low-frequency trajectory data such as noisy and sparse sampling points, the proposed method can also fit better road centerline results in the scenario of sparse and noisy low-frequency trajectory data. In the future, we will continue to improve the fitting method of complex road centerlines, further improve the accuracy and applicability of the method in this paper, and carry out the extraction research of three-dimensional road centerlines.

作者简介第一作者简介:姚志鹏(1998—), 男, 硕士, 研究方向为时空轨迹数据挖掘与分析。 E-mail: [email protected] 通信作者:唐建波 E-mail:[email protected]

First trial: Zhang Yanling review: Song Qifan

Final Judge: Jin Jun

information

Read on