Intelligent mapping | Zheng Ye: Intelligent Collaborative Control Method of Traffic Signal Oriented to Geographical Road Network

The content of this article is from the Journal of Surveying and Mapping, No. 9, 2021 (Review No. GS (2021) No. 6102)

Intelligent collaborative control method of traffic signal oriented to geographical road network

Zheng Ye 1,2, Guo Renzhong 1,2, Ma Ding 1,2, Zhao Zhigang 1,2, Li Xiaoming 1,2

1. Smart City Research Institute, School of Architecture and Urban Planning, Shenzhen University, Shenzhen 518060, China; 2. Shenzhen Key Laboratory of Spatial Information Intelligent Perception and Service, Shenzhen 518060, China

Fund Projects: National Key R&D Program (2018YFB2100700; 2019YFB2103104; 2019YFB210310); China Postdoctoral Fund (2019M663070)

Abstract: The operating efficiency of urban transportation is an important factor in the development of urban productivity and an important research topic in smart city construction. With the development of computer technology, artificial intelligence, and reinforcement learning in particular, plays an important role in traffic signal control. At present, reinforcement-learning-based traffic signal control is mainly optimized for single intersections or urban arterial roads, and there are few studies on regional coordinated control over urban geographic road networks. Building on Markov sequential decision-making, this paper proposes a two-layer agent collaborative control method based on reinforcement learning. In layer 1, coarse-tuning training is carried out for each single intersection: the agent controls the signal timing by observing the queue length in each lane of the intersection, so that the intersection does not become blocked. In layer 2, the coarsely trained agent models are placed into the geographic road network for collaborative fine-tuning training across multiple intersections. Experiments take traffic coordination in a middle school area in Ningbo as the optimization goal. The results show that the proposed method achieves higher traffic efficiency than the original fixed timing scheme.

Keywords: geographical road network; traffic signal control; collaborative control; reinforcement learning

Citation format: Zheng Ye, Guo Renzhong, Ma Ding, et al. Intelligent collaborative control method of traffic signal oriented to geographical road network[J]. Acta Geodaetica et Cartographica Sinica, 2021, 50(9): 1203-1210. DOI: 10.11947/j.AGCS.2021.20210191

ZHENG Ye, GUO Renzhong, MA Ding, et al. Multi-agent cooperative control for traffic signal on geographic road network[J]. Acta Geodaetica et Cartographica Sinica, 2021, 50(9): 1203-1210. DOI: 10.11947/j.AGCS.2021.20210191

Read more: http://xb.sinomaps.com/article/2021/1001-1595/2021-9-1203.htm

Introduction

With car ownership in China increasing, traffic congestion has gradually become one of the most difficult problems in urban management. The urban road network carries most of a city's transportation, and coordinating the timing and phases of the signals at key intersections in the network can effectively improve traffic efficiency, which makes it a key research topic in smart city construction [1]. Traditional intersection signal timing usually takes one of two approaches: (1) a graphical method, in which the spatial information of the road is plotted from measured geometric information; (2) a mathematical model, in which the optimal solution of an objective function is sought. Taking green wave coordination as an example, the traditional method uses the distances between the signals on the arterial road and the green wave speed of the vehicles to construct a mixed-integer linear program, which is solved for the largest green wave bandwidth [2-5]. This approach has the following limitations: (1) all vehicles must travel at the same speed (the green wave speed), and even a small number of vehicles whose speed differs greatly from the green wave speed will disrupt the whole platoon and degrade the green wave; (2) the method requires the departure pattern to remain relatively stable, so the computed green wave becomes less effective when the traffic flow changes.

The development of computer technology has promoted the introduction of machine learning algorithms such as fuzzy logic control [6], genetic algorithms [7], and expert systems [8] into the field of intelligent transportation. Among these algorithms, deep reinforcement learning (DRL) is based on Markov decision theory: the agent repeatedly makes decisions in the environment and receives feedback on them, so that it can find the decision sequence with the highest return in that environment [9]. An intelligent traffic signal control system achieves intelligent control of traffic lights by defining the behavior vectors, state vectors, and return functions of the traffic scenario [10-12]. With the development of 5G and cloud computing, DRL has made new breakthroughs in traffic management. Reference [13] proposes using DRL to build a traffic control system that supports dynamic scheduling in the cloud and at the edge in an Internet of Vehicles and 5G environment. Reference [14] proposes a DRL traffic collection method based on edge computing and applies it to alleviate traffic congestion. Reference [15] designs a DRL signal control system from the perspective of smart city construction, coordinating multiple intersections to improve overall traffic throughput. Reference [16] refines the details of the DRL signal control algorithm: the state vector of the agent is the traffic flow data divided into a grid, the decision behavior is the change in the duration of the traffic light, and the return function is the difference in cumulative waiting time between two cycles. The DDPG-BAND algorithm proposed in reference [17] uses DRL to coordinate the green wave on urban arterial roads and thereby achieves collaborative control of multiple intersections along an arterial.

In general, DRL has been applied successfully to traffic signal control, but current research is mostly limited to single intersections or urban arterial roads, and there is little work on multi-agent collaborative signal control over geographic road networks. Combining the characteristics of urban geographic road networks with reinforcement learning, this paper proposes a two-layer collaborative signal control training method based on reinforcement learning and applies it to the road network of a middle school area in Ningbo. The feasibility and effectiveness of the method are demonstrated by comparing travel time, throughput, and number of stops in the simulator against traditional timing methods.

1 Method of this paper

1.1 Markov Decision-Making Process

Reinforcement learning studies the sequential decisions of agents that are trained by continual trial and error and feedback in a dynamic environment, so that they obtain the greatest cumulative return in a changing environment [18]. Reinforcement learning is based on the Markov decision process (MDP), which consists of three basic elements: the state vector S (also called the observation vector), the decision vector A, and the return function R [19]. After performing a decision, the agent interacts with the environment and its state moves from S1 to S2; the state transition matrix is denoted P. When executing sequential decisions, the current decision has a greater impact than historical decisions. Assuming that the decay rate of a decision is γ (γ∈[0,1]), the MDP is expressed by equation (1).

(1)

In an MDP, the agent behaves differently in different states. The policy function gives the probability with which the agent performs each candidate decision in the current state: its inputs are the current state s (s∈S) and a decision a (a∈A), and its output is the probability of that candidate decision. If π denotes the policy function, then π(s,a) is the probability that the agent executes decision a in state s. If the agent follows the policy function π, the return of the decision performed at time t is r_t, and the process of moving from state s_t to s_{t+1} and obtaining the return r_t is expressed as

(2)

The sum of the returns of the MDP is expressed as

(3)

As the above formulas show, different policy functions give the agent different probabilities of executing each behavior, and different behaviors produce different return values; the policy function in reinforcement learning is judged by the total return of the entire decision sequence. A good policy function not only maximizes the return of the current decision but also maximizes the overall return of the whole sequential decision process. Since the policy function π(s,a) describes a probabilistic state transfer process, the state-action value function Qπ(s,a) represents the mathematical expectation of the return obtained when the agent starts from state s with decision a and then makes sequential decisions according to the policy function π, that is,

(4)

Therefore, the essence of the MDP problem is to find the optimal policy function π such that the decisions of the agent, starting from an arbitrary state S′, maximize the state-action value function Qπ(s,a). According to the Bellman equation [20], the state-action value function of decision t is related only to that of decision t-1, so the state-action value function can be reduced to

(5)
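
For reference, the quantities behind equations (1) to (5) are conventionally written as below. These are standard textbook forms using the symbols defined in this section, offered as an assumption about the notation rather than a transcription of the published equations.

```latex
\begin{align}
\mathrm{MDP} &= \langle S, A, P, R, \gamma \rangle, \qquad \gamma \in [0,1] \\
s_t &\xrightarrow{\; a_t \sim \pi(s_t,\cdot) \;} s_{t+1}, \qquad \text{with return } r_t \\
R_t &= \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k} \\
Q^{\pi}(s,a) &= \mathbb{E}_{\pi}\!\left[ R_t \mid s_t = s,\; a_t = a \right] \\
Q^{\pi}(s,a) &= \mathbb{E}_{\pi}\!\left[ r_t + \gamma\, Q^{\pi}(s_{t+1}, a_{t+1}) \mid s_t = s,\; a_t = a \right]
\end{align}
```

The last line follows from the third by splitting off the first term, since R_t = r_t + γ R_{t+1}.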

In deep reinforcement learning [21], the agent stores the state-action values in a deep neural network indexed by s and a, and updates the network by continually interacting with the environment and receiving feedback from the return function. Eventually the state-action values stored in the network can correctly guide the agent to the decision sequence with the highest return in the environment.
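
The update idea can be illustrated with a table in place of the deep network. The sketch below is a plain tabular Q-learning step (which bootstraps with the maximum over next actions), not the authors' network or training loop; the state names, action labels, and the learning rate `alpha` are hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step.

    Moves Q(state, action) toward reward + gamma * max_a' Q(next_state, a').
    Returns the updated value.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Unseen (state, action) pairs start at 0.0.
q_table = defaultdict(float)
actions = ["extend_green", "keep"]  # hypothetical action labels

first = q_update(q_table, "s0", "keep", reward=1.0, next_state="s1",
                 actions=actions, alpha=0.5, gamma=0.9)
second = q_update(q_table, "s1", "keep", reward=0.0, next_state="s0",
                  actions=actions, alpha=0.5, gamma=0.9)
```

A deep network replaces the table when the state is a continuous or high-dimensional vector, such as the queue lengths used later in this paper, but the target it is trained toward has the same form.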

1.2 Geographical road network traffic scenario settings based on MDP

In this paper, the agent achieves traffic coordination by changing the green light duration of each signal phase. The scenario is set up as follows:

(1) The green light timing and the phase offset of the signal are determined by the agent's decisions.

(2) The agent does not change the phase sequence of the signal.

(3) The yellow light duration of the signal is fixed at 2 s.

Based on these preset conditions, the proposed method is divided into two layers (Figure 1). The first layer consists of working agents, whose responsibility is to optimize individual intersections: each intersection's agent adjusts the green light durations of its own intersection so that no traffic jam occurs. The second layer is the management agent, whose responsibility is to coordinate the working agents and improve the overall traffic efficiency of the geographic road network.

Figure 1 Two-layer collaborative optimization strategy

By extracting the variables of the geographic road network traffic scenario and substituting them into the Bellman equation, the agent can regulate the traffic lights automatically after training. The following sections describe how the state vector, decision vector, and return function of the MDP are defined for the two layers of agents.

1.3 Setup strategy for the single-intersection working agent

1.3.1 State vector S of the working agent

In a geographic road network traffic scenario, the state vector of the working agent must reflect the congestion state of the current intersection. As shown in Figure 2, the queue length is the total number of vehicles at an intersection waiting for a red light to turn green. At a single intersection, the vehicle queue length reflects the traffic flow in each direction and is a key factor in determining the green light duration of the corresponding phase. Beyond queue length, the starting speed of a vehicle and the time it takes to turn are directly determined by its weight and length. This paper therefore distinguishes two types of vehicles, which are weighted differently in the state vector: large vehicles weighing more than 15 t or longer than 12 m (such as dump trucks or buses), and ordinary passenger cars.

Note: Each rectangle represents a vehicle, and the color of its rear indicates its driving state: green means the vehicle is driving normally, yellow means it is slowing down, and red means it is stopped waiting at a red light.

Fig. 2 Road intersection [17]

Using the weighted queue length described above, state vectors are defined at two granularities: (1) the coarse-grained state vector only computes the sum of the weighted queue lengths on the road in each direction; taking Figure 2 as an example, the coarse-grained state vector has dimension 4, and each component is the weighted sum of vehicles in one direction; (2) the fine-grained state vector has dimension 8, and each component is the weighted sum of vehicles in one lane.
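
A minimal sketch of the two granularities follows. The paper does not state the numeric weights, so `LARGE_WEIGHT` and `CAR_WEIGHT` are assumed values, and the lane ordering (consecutive lanes belong to the same approach direction) is also an assumption made for illustration.

```python
# Assumed weights: large vehicles count more than passenger cars,
# but the actual values used in the paper are not given.
LARGE_WEIGHT = 2.0
CAR_WEIGHT = 1.0


def fine_state(lane_queues):
    """Fine-grained state: one weighted queue length per lane.

    lane_queues is a list of (n_cars, n_large) tuples, one per lane.
    """
    return [CAR_WEIGHT * cars + LARGE_WEIGHT * large for cars, large in lane_queues]


def coarse_state(lane_queues, lanes_per_direction=2):
    """Coarse-grained state: weighted queue lengths summed per direction.

    Assumes lanes are ordered so that each consecutive group of
    lanes_per_direction lanes belongs to one approach direction.
    """
    fine = fine_state(lane_queues)
    return [sum(fine[i:i + lanes_per_direction])
            for i in range(0, len(fine), lanes_per_direction)]


# Eight lanes, two per direction, as in the four-way example of Figure 2.
queues = [(3, 0), (1, 1), (0, 0), (4, 2), (2, 0), (0, 1), (5, 0), (1, 0)]
fine = fine_state(queues)      # 8 components
coarse = coarse_state(queues)  # 4 components
```

The coarse vector loses the per-lane split: the first direction reads 6.0 whether the queue sits in the left-turn lane or the through lane, which is the information a multi-phase signal needs.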

1.3.2 Decision vector A of the working agent

According to MDP theory, the policy function π computes the agent's next decision from the state vector S and the decision vector A, so the decision vector A of the working agent in the geographic road network must be defined. The agent achieves traffic coordination by changing the release time of each signal phase, and the decision vector of the working agent is a multi-dimensional array that stores the green light duration of each phase. In this paper, the decision vector A of the single-intersection working agent must satisfy the following conditions:

(1) Each component of the decision vector is a positive integer (green light durations are generally whole numbers of seconds).

(2) Each component of the decision vector must be greater than a fixed minimum value (the signal cycle is the full time for a signal to go from green to red and back to green; to ensure that pedestrians can cross at normal speed, it must exceed a fixed value related to the width of the intersection).

(3) The sum of all components of the decision vector must be less than a fixed maximum (the signal cycle must not exceed what an average person can tolerate, for example 5 min).
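
The three conditions above can be sketched as a validity check on a candidate decision vector. The function name and the example bounds are hypothetical; the paper gives 5 min only as an example upper bound and does not state the minimum.

```python
def is_valid_decision(green_times, min_green, max_total):
    """Check a candidate decision vector of per-phase green durations (seconds).

    Condition 1: every component is a positive integer.
    Condition 2: every component is greater than min_green.
    Condition 3: the sum of all components is less than max_total.
    """
    if not green_times:
        return False
    for g in green_times:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(g, bool) or not isinstance(g, int) or g <= 0:
            return False
        if g <= min_green:
            return False
    return sum(green_times) < max_total


# Assumed bounds for illustration: 10 s minimum green, 300 s (5 min) total.
ok = is_valid_decision([20, 30, 25, 15], min_green=10, max_total=300)
too_short = is_valid_decision([20, 8, 25, 15], min_green=10, max_total=300)
not_integer = is_valid_decision([20.5, 30, 25, 15], min_green=10, max_total=300)
too_long = is_valid_decision([100, 100, 100, 100], min_green=10, max_total=300)
```

In training, a check like this could be used to reject or clip actions before they are sent to the simulator.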

After the agent sets the green light timing of the signal phases, the traffic simulator runs for a certain period and the decision is evaluated through the return function, so defining the return function correctly is the key to this algorithm.

1.3.3 Return function R of the working agent

The definition of the return function R determines the optimization goal of the agent's policy function π. The goal of the working agent after training is to ensure that no traffic jam occurs at its intersection, so traffic congestion must first be defined quantitatively. As shown in Figure 2, the lane blocking line is located near the end of the road and is used to measure whether traffic is blocked. If the vehicle queue extends past the lane blocking line, traffic congestion is considered to have occurred at that intersection. In the general case, the distance between the blocking line and the end of the road is not less than 20% of the road length; that is, when the vehicle queue does not exceed 80% of the total lane length, traffic in that lane is considered smooth. The number of traffic jams is the total number of times, over all lanes and within a given period, that the queue exceeds the blocking line; it is the basis for judging how well a working agent regulates traffic. Based on this definition, the return function is set as follows:

(1) If the number of traffic jams before regulation is 0 and the number after regulation is greater than 0, the traffic has changed from a non-blocked state to a blocked state, and the return value is -1.

(2) If the number of traffic jams before regulation is greater than 0 and the number after regulation is 0, the traffic has changed from a blocked state to a non-blocked state, and the return value is 1.

(3) If the number of traffic jams after regulation is lower or higher than before regulation by more than 20%, the effect of regulation is clear, and the return value is 1 or -1, respectively.

(4) In all other cases, the effect of regulation is not clear enough to judge, and the return value is 0.
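
The four rules above can be sketched as a small function. The helper `count_jams` counts blocked lanes for a single observation using the 80% threshold from the text; accumulating it over a period, as the paper describes, is left to the caller. Function names are hypothetical.

```python
def count_jams(queue_lengths, lane_lengths, threshold=0.8):
    """Number of lanes whose queue extends past the blocking line.

    A lane counts as blocked when its queue is longer than
    threshold * lane length (80% by default, as in the text).
    """
    return sum(1 for q, length in zip(queue_lengths, lane_lengths)
               if q > threshold * length)


def working_reward(jams_before, jams_after):
    """Return value of the working agent from jam counts before/after regulation."""
    if jams_before == 0 and jams_after > 0:
        return -1  # rule 1: non-blocked -> blocked
    if jams_before > 0 and jams_after == 0:
        return 1   # rule 2: blocked -> non-blocked
    if jams_before > 0:
        change = jams_after - jams_before
        # Integer form of |change| / jams_before > 0.2, avoiding float rounding.
        if 5 * (-change) > jams_before:
            return 1   # rule 3: reduced by more than 20%
        if 5 * change > jams_before:
            return -1  # rule 3: increased by more than 20%
    return 0  # rule 4: no clear effect


blocked_lanes = count_jams([85.0, 40.0], [100.0, 100.0])
```

With these definitions, `working_reward(10, 7)` is 1 (a 30% reduction) while `working_reward(10, 9)` is 0, since a 10% reduction falls under rule 4.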

1.4 Collaborative optimization strategy of the management agent

The working agents ensure that no traffic jam occurs at their respective intersections, that is, the number of traffic jams for each working agent is 0. On this basis, the management agent further coordinates the working agents to optimize the traffic efficiency of the entire geographic road network. The state vector and decision vector of the management agent are similar to those of the working agents; their dimension is the sum of the dimensions of all working agents, and they represent the queue lengths and the green light timings at all intersections, respectively. This section therefore mainly defines the return function of the management agent.

The optimization goal of the management agent must change with the scenario. For example, during peak periods, signal coordination should let the road network pass more vehicles per unit time, so the optimization goal for the morning peak is defined as the overall throughput of the road network; during off-peak periods, signal coordination should instead aim to reduce the average time vehicles in the road network wait at red lights. Let the traffic efficiency coefficient denote the optimization goal in the specified scenario (for example, in the morning rush hour it is the sum of the intersection throughputs in the scene per unit time). The return function of the management agent is then defined as follows:

(1) If the number of traffic jams at any intersection is greater than 0 after regulation, the return value is -1.

(2) If the traffic efficiency coefficient after regulation is 10% higher than before regulation, the regulation effect is excellent, and the return value is 1.

(3) If the traffic efficiency coefficient after regulation is 10% lower than before regulation, the regulation effect is poor, and the return value is -1.

(4) In all other cases, the regulation effect is not clear, and the return value is 0.
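
A minimal sketch of these rules follows. It assumes the traffic efficiency coefficient is defined so that larger is better (as with throughput; a waiting-time goal would need its sign flipped) and treats the 10% thresholds as inclusive, which the text does not specify. The function name is hypothetical.

```python
def management_reward(jams_per_intersection, eff_before, eff_after, threshold=0.10):
    """Return value of the management agent.

    jams_per_intersection: jam counts at each intersection after regulation.
    eff_before, eff_after: traffic efficiency coefficient before/after
    regulation, assumed here to be "larger is better".
    """
    if any(j > 0 for j in jams_per_intersection):
        return -1  # rule 1: any jam overrides the efficiency comparison
    if eff_before <= 0:
        return 0   # relative change undefined; treated as "no clear effect"
    relative_change = (eff_after - eff_before) / eff_before
    if relative_change >= threshold:
        return 1   # rule 2: efficiency up by 10%
    if relative_change <= -threshold:
        return -1  # rule 3: efficiency down by 10%
    return 0       # rule 4


jam_case = management_reward([0, 0, 1, 0], 100.0, 150.0)
better = management_reward([0, 0, 0, 0], 100.0, 115.0)
worse = management_reward([0, 0, 0, 0], 100.0, 85.0)
unclear = management_reward([0, 0, 0, 0], 100.0, 105.0)
```

Checking jams first means the management agent cannot trade a blocked intersection for higher network throughput, which preserves what the working agents learned in layer 1.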

2 Experimental analysis

2.1 Introduction of test data and test environment

As shown in Figure 3, the middle school area is located in Yinzhou District, Ningbo City, and is one of the more heavily trafficked areas in urban Ningbo. The test area extends from Fuming Road in the east to Sangtian Road in the west and from Qiaoqiao Road in the south to Min'an Road in the north, forming a geographic road network of 4 traffic signals and 12 roads.

Figure 3 Traffic road network in the test area

The test traffic flow data come from intersection cameras in the test area, recorded from 7:00 AM to 9:30 AM on December 6, 2020; averages were obtained after applying a target tracking algorithm (Table 1). The data cover the 4 entry directions and the 3 exit movements (left, straight, and right) at each intersection, and distinguish large vehicles from small cars.

Table 1 Intersection traffic flow information

The algorithm was tested on a high-performance computer with a 24-core CPU and 32 GB of memory running CentOS 7. It is implemented in Python 3.7.3, the neural network is built with TensorFlow 1.14, and the traffic environment runs in the simulation software SUMO 1.3.1 [22] (open source software developed by the Institute of Transportation Systems of the German Aerospace Center).

2.2 Analysis of test results

2.2.1 Working Agent Training Results

This section examines the training of the single-intersection working agents under the coarse-grained and fine-grained state vectors. In the experiment, every two signal cycles the agent observes the queue length at the intersection and generates an input vector to update the neural network of state-action values; 7200 s of simulation time counts as one iteration. Figure 4 shows how the traffic congestion coefficient of each intersection's working agent changes with the number of iterations, where the ordinate is the cumulative number of traffic jams in one iteration and the abscissa is the number of iterations. The results show that the traffic congestion coefficient of all four intersections decreases as the number of iterations increases and converges at about 100 iterations. In addition, training under the fine-grained state vector is more stable and reaches a better congestion coefficient. This is because the coarse-grained state vector computes queue length per edge. When the vehicles on one edge are controlled by several signal phases, the coarse-grained state vector cannot distinguish which phase needs more green time, so its training converges with more difficulty.

Fig. 4 Training results of the working agent at a single intersection

2.2.2 Management agent training results

The working agents trained in Section 2.2.1 are placed into the geographic road network, and collaborative optimization training over the region is carried out under the coordination of the management agent. The neural network of the agent's state-action values is updated every 3 signal cycles, and 10 800 s of simulation time counts as one iteration. Three indicators are selected as the traffic efficiency coefficients to optimize: average vehicle travel time, average number of stops, and throughput in the geographic road network. As shown in Figure 5, all three indicators improve after training (average travel time and number of stops decrease, throughput increases) and converge after a certain number of iterations, which indicates that the method is effective. Compared with the first 30 iterations, over the last 30 iterations the average travel time decreased by 19.12%, the average throughput increased by 21.47%, and the average number of stops decreased by 3%.
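
The comparison between the first and last 30 iterations amounts to a relative change between two window means. The sketch below shows that calculation on synthetic numbers chosen only for illustration; they are not the paper's data, and the function name is hypothetical.

```python
def percent_change(values, window=30):
    """Relative change (%) of the mean of the last `window` iterations
    with respect to the mean of the first `window` iterations."""
    if len(values) < 2 * window:
        raise ValueError("need at least two non-overlapping windows")
    first = sum(values[:window]) / window
    last = sum(values[-window:]) / window
    return 100.0 * (last - first) / first


# Synthetic per-iteration average travel times (seconds), NOT measured data.
travel_times = [200.0] * 30 + [180.0] * 40 + [160.0] * 30
change = percent_change(travel_times)  # negative means travel time decreased
```

For an indicator where larger is better, such as throughput, a positive value of the same quantity indicates improvement.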

Figure 5 Training results of the management agent on the geographic road network

2.2.3 Comparison of traffic efficiency coefficients

To verify the effectiveness of the method, the traffic efficiency of the signal timing schemes produced by the proposed method, the original graphical method, and the classical Webster method [23] is compared on the geographic road network. The experiment used 10 random seeds; each seed generates a different departure pattern under the specified traffic flow conditions, and averaging the road network traffic efficiency over the 10 seeds keeps the comparison fair. As shown in Figure 6, the test compares the traffic efficiency coefficients of the three methods with a statistical cycle of 270 s. The results show that the average travel time of the proposed method is 7.03% lower than that of the original graphical method and 2.87% lower than that of the classical Webster method; the number of stops is 12.56% lower than that of the original graphical method and 10.49% lower than that of the classical Webster method; and the throughput rate is 8.3% higher than that of the original graphical method and 6.4% higher than that of the classical Webster method. Overall, the proposed method performs well in average travel time, number of stops, and throughput rate. In terms of the number of stops in particular, the efficiency of the other two methods declines noticeably as the cycles progress. This is because the traditional methods compute a fixed timing scheme mathematically, whereas the algorithm in this paper changes the timing in real time according to the queue length in each direction and therefore adapts better.

Fig. 6 Comparison of three traffic efficiency coefficients

3 Conclusion

Building on the characteristics of Markov sequential decision-making, this paper proposes a two-layer agent collaborative control training method based on reinforcement learning. In layer 1, coarse-tuning training is carried out for each single intersection: the agent adjusts the signal timing by observing the queue length in each lane of the intersection, so that the intersection does not become blocked. In layer 2, the coarsely trained agent models are placed into the geographic road network for collaborative fine-tuning training across multiple intersections. The experimental results show that, compared with the traditional algorithm, travel time is shortened by 7.03%, the number of stops is reduced by 12.56%, and throughput is increased by 8.3%. In addition, traffic signal coordination based on reinforcement learning can change the timing scheme in real time according to the lane queue lengths at the intersections, which lets it adapt better to a complex and changing traffic environment.

About the Author

Zheng Ye (1989-), male, Ph.D., postdoctoral fellow. Research interest: geographic AI. E-mail: [email protected]

Corresponding author: Zhao Zhigang. E-mail: [email protected]

Preliminary review: Zhang Yanling

Second review: Song Qifan

Final review: Jin Jun
