
Extreme cornering overtakes, top human players beaten: Sony's AI racer makes the cover of Nature

Report from the Heart of the Machine

Editors: Du Wei, Chen Ping

Sony AI has developed a formidably strong racing agent that has defeated some of the world's top esports racers in Gran Turismo.


From chess to Go to poker, AI agents have outperformed humans in many games. Now such an agent has set a new benchmark in Gran Turismo as well.

Gran Turismo is a racing game series created in 1997 by renowned producer Kazunori Yamauchi and developed by Polyphony Digital, a subsidiary of Sony Interactive Entertainment. From its graphics and handling to its lineup of cars, the series pushes realism as far as possible; with over 50 tracks and more than 1,000 car models, it has been described as a car museum.

Today, Sony announced that its researchers have developed an AI driver called "GT Sophy" that can beat top human esports racers over consecutive laps in Gran Turismo Sport. The paper made the cover of Nature.



Some may think this a simple challenge; racing, after all, seems to be mostly about speed and reflexes. But experts in video game racing and AI say GT Sophy is a major breakthrough that demonstrates the agent's mastery of tactics and strategy.

Chris Gerdes, a Stanford University professor who studies autonomous driving, said that outracing the best drivers with such skill is a landmark achievement for AI in motorsport.

GT Sophy was trained with a method called reinforcement learning: essentially a form of trial and error in which an AI agent is dropped into an environment with no instructions and rewarded for achieving certain goals. In GT Sophy's case, Sony's researchers say they had to design those rewards very carefully, for example fine-tuning the collision penalty to produce a driving style assertive enough to win without the AI rudely forcing other cars off the road.

Using reinforcement learning, GT Sophy learned to drive around the track within just a few hours of training, and within a day or two it was faster than 95% of the drivers in its training dataset. After roughly 45,000 hours of training, GT Sophy achieved superhuman performance on three tracks.

AI agents have many inherent advantages in tests like these: their runs are perfectly repeatable, and their reaction times are very fast. Sony's researchers note that GT Sophy does enjoy some advantages over human players, such as a precise map of the course with track-boundary coordinates and exact vehicle-state information, including the load and slip angle of each tire. However, Sony says it constrained the agent on two particularly important factors: action frequency and reaction time.

GT Sophy's inputs are limited to 10 Hz, whereas humans can theoretically issue inputs at up to 60 Hz, which Sony says gives human drivers smoother control at high speed. As for reaction time, GT Sophy can respond to events in the racing environment within 23-30 milliseconds, far faster than the 200-250 milliseconds typical of professional athletes. To compensate, the researchers trained versions of GT Sophy with artificial delays that raised its reaction time to 100 ms, 200 ms, and 250 ms, and found that all three versions still achieved superhuman lap times.
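
One simple way to impose such an artificial delay is to queue the policy's actions and execute each one only after a fixed number of control steps has elapsed. The sketch below is my own illustration of that idea, not Sony's implementation; at the 10 Hz action rate described above, a two-step queue corresponds to roughly 200 ms of added reaction time.

```python
from collections import deque

class DelayedAgent:
    """Wraps a policy so its actions take effect after a fixed delay.

    Illustrative only. At a 10 Hz control rate, delay_steps=2 adds
    roughly 200 ms of artificial reaction time.
    """

    def __init__(self, policy, delay_steps, noop_action):
        self.policy = policy
        # Pre-fill the queue so the first few outputs are well defined.
        self.queue = deque([noop_action] * delay_steps)

    def act(self, observation):
        # Decide an action for the current observation...
        self.queue.append(self.policy(observation))
        # ...but execute the one chosen delay_steps steps ago.
        return self.queue.popleft()
```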

GT Sophy was tested against three top esports drivers: Emily Jones, Valerio Gallo, and Igor Fraga. None of them managed to beat the AI in time trials, but racing against it allowed them to discover new tactics.

Sony says it is currently working on integrating GT Sophy into future Gran Turismo games, but there is no clear timeline yet.

GT Sophy's technological innovations

This groundbreaking racing agent, which surpasses human players, was developed by Sony AI in collaboration with Polyphony Digital (PDI) and Sony Interactive Entertainment (SIE). The researchers' contributions fall mainly into the following areas:

A hyper-realistic simulator

New reinforcement learning techniques

Distributed training platform

Large-scale training infrastructure


As mentioned above, GT Sport is a PlayStation 4 driving simulator developed by Polyphony Digital. It recreates real-world racing environments as faithfully as possible, including the cars, the tracks, and even physical phenomena such as air resistance and tire friction. Polyphony Digital provided access to the APIs needed to train GT Sophy in this high-fidelity simulation environment.


Reinforcement learning (RL) is a type of machine learning that trains an AI agent by letting it act in an environment and rewarding or penalizing it based on the results of its actions. The agent takes an action, receives a reward or penalty, and chooses its next move based on how the state of the environment has changed.
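
In code, this interaction loop is compact. Below is a minimal sketch against the open-source Gymnasium API, which is only a stand-in for illustration; Sony trained GT Sophy through GT Sport's own non-public interface, and a random policy is used here in place of a trained one.

```python
import gymnasium as gym

# Stand-in environment: a simple continuous-control task, not GT Sport.
env = gym.make("MountainCarContinuous-v0")
observation, info = env.reset(seed=0)

for _ in range(1000):
    # A trained policy would map the observation to an action here;
    # we sample randomly just to show the loop's structure.
    action = env.action_space.sample()

    # The environment returns the next state and a scalar reward,
    # the signal an RL algorithm uses to improve the policy.
    observation, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        observation, info = env.reset()

env.close()
```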


Researchers and engineers at Sony AI have developed a range of innovative reinforcement learning technologies, including the following:

A novel training algorithm called Quantile-Regression Soft Actor-Critic (QR-SAC);

Racing rules encoded in a form agents can understand;

A training regimen for building racing skills.

Deep reinforcement learning (deep RL) has recently become a key component of AI milestones in arcade games, in complex strategy games such as chess, shogi, and Go, and in real-time multiplayer strategy games. RL is particularly well suited to developing game AI because RL agents consider the long-term impact of their behavior and can collect their own data during learning, eliminating the need for complex hand-coded behavioral rules.

However, a game as complex as Gran Turismo demands equally sophisticated and subtle algorithms, rewards, and training scenarios.

GT Sophy has mastered three skills with RL

Through key innovations in RL technology, Sony AI's GT Sophy mastered three skills: race car control, racing tactics, and racing etiquette.

First, let's look at car control.

The new algorithm, QR-SAC, can accurately reason about the possible outcomes of GT Sophy's high-speed driving decisions. By weighing those consequences and the uncertainties involved, GT Sophy is able to corner at the limit.
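
QR-SAC extends Soft Actor-Critic with a quantile-regression critic: rather than predicting a single expected return, the critic predicts a set of quantiles of the return distribution, which captures the uncertainty of risky maneuvers. The sketch below shows the quantile Huber loss at the core of that idea; it is my own illustrative PyTorch code, not Sony's implementation.

```python
import torch
import torch.nn.functional as F

def quantile_huber_loss(pred_quantiles, targets, kappa=1.0):
    """Quantile-regression loss for a distributional critic.

    pred_quantiles: (batch, N) predicted quantiles of the return
    targets:        (batch, M) target return samples
    Illustrative only; hyperparameters do not come from the paper.
    """
    n = pred_quantiles.shape[1]
    # Midpoint quantile fractions tau_i = (2i + 1) / (2N).
    tau = (torch.arange(n, dtype=torch.float32) + 0.5) / n

    # Pairwise TD errors between every target and every predicted quantile.
    td = targets.unsqueeze(1) - pred_quantiles.unsqueeze(2)   # (batch, N, M)
    huber = F.huber_loss(
        pred_quantiles.unsqueeze(2).expand_as(td),
        targets.unsqueeze(1).expand_as(td),
        reduction="none",
        delta=kappa,
    )
    # Asymmetric weights pull each output toward its own quantile level.
    weight = torch.abs(tau.view(1, -1, 1) - (td.detach() < 0).float())
    return (weight * huber).mean()
```

In a SAC-style learner, a loss of this shape would replace the usual mean-squared Bellman error when updating the critic.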


The GT Sophy agent can run close to the walls through corners without making contact.

The second is racing tactics.

While RL agents can collect their own data, specific skills such as slipstream passing can only be practiced when an opponent is in a particular position. To solve this, GT Sophy was trained on mixed scenarios, combining hand-crafted race situations likely to matter on each track with professional sparring opponents who helped the agent learn these skills (see the sketch below). These scenarios helped GT Sophy acquire professional racing skills, including handling crowded starts, defensive moves, and more.
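
Conceptually, mixed-scenario training amounts to drawing each episode's starting conditions from a weighted blend of free driving and hand-crafted race situations. The following sketch illustrates that sampling idea; the scenario names, weights, and setup fields are hypothetical, not the ones Sony used.

```python
import random

# Hypothetical scenario mix; names, weights, and setups are illustrative.
SCENARIOS = [
    ("free_driving",    0.5, {"opponents": 0}),
    ("slipstream_pass", 0.2, {"opponents": 1, "start_gap_m": 10}),
    ("crowded_start",   0.2, {"opponents": 7, "grid_position": 4}),
    ("defensive_line",  0.1, {"opponents": 1, "start_gap_m": -5}),
]

def sample_scenario():
    """Pick the next training episode's setup by weighted sampling."""
    weights = [w for _, w, _ in SCENARIOS]
    name, _, setup = random.choices(SCENARIOS, weights=weights, k=1)[0]
    return name, setup

name, setup = sample_scenario()
print(f"Launching episode '{name}' with setup {setup}")
```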


GT Sophy successfully overtakes a human driver through a sharp turn.

Finally, there is racing etiquette.

To help GT Sophy learn racing etiquette, Sony AI researchers found ways to encode both written and unwritten racing rules into a complex reward function. They also found they needed to balance the number and mix of opponents during training so that GT Sophy stayed competitive without becoming too aggressive or too timid when racing against human drivers.
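
As a rough illustration of encoding rules into a reward, one can combine a progress term with weighted penalties for contact and corner cutting. The components and weights below are hypothetical, chosen only to show the shape of such a function; the paper's actual reward has more terms and carefully tuned coefficients.

```python
def race_reward(progress_m, hit_wall, hit_car, at_fault, off_track,
                w_wall=0.5, w_car=2.0, w_fault=5.0, w_off=1.0):
    """Toy composite reward: progress minus weighted rule violations.

    All terms and weights are illustrative, not the paper's values.
    Tuning w_car and w_fault trades aggression against sportsmanship:
    too low and the agent shoves rivals off track, too high and it
    becomes timid in wheel-to-wheel situations.
    """
    reward = progress_m                      # reward forward progress
    reward -= w_wall * float(hit_wall)       # wall-contact penalty
    reward -= w_car * float(hit_car)         # any car-contact penalty
    reward -= w_fault * float(at_fault)      # extra penalty if at fault
    reward -= w_off * float(off_track)       # corner-cutting penalty
    return reward
```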

Distributed, Asynchronous Rollouts and Training (DART) is a custom web-based platform developed by Sony AI that enables its researchers to train GT Sophy on PlayStation 4 consoles in SIE's cloud gaming platform.

DART allows researchers to easily specify experiments, run them automatically as cloud resources become available, and view the collected data in a browser. It also manages the PlayStation 4 consoles, compute resources, and GPUs used for training across data centers. The system lets Sony AI's research team seamlessly run hundreds of experiments simultaneously while exploring ways to take GT Sophy to the next level.
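
DART itself is not public, but the pattern it implements, many asynchronous rollout workers feeding a central trainer, is standard in distributed RL. The schematic sketch below uses Python subprocesses purely for illustration; in DART the workers are real PlayStation 4 consoles in SIE's cloud.

```python
import multiprocessing as mp

def rollout_worker(worker_id, experience_queue):
    """Stand-in for one PS4 console: runs episodes with the current
    policy and streams transitions back to the central trainer."""
    for episode in range(100):
        # Fake transitions; a real worker would step the game here.
        transitions = [(worker_id, episode, step) for step in range(10)]
        experience_queue.put(transitions)

def trainer(experience_queue, num_batches):
    """Central learner: consumes experience asynchronously."""
    for _ in range(num_batches):
        batch = experience_queue.get()
        # A gradient update on `batch` would happen here.

if __name__ == "__main__":
    queue = mp.Queue()
    workers = [mp.Process(target=rollout_worker, args=(i, queue))
               for i in range(4)]
    for w in workers:
        w.start()
    trainer(queue, num_batches=4 * 100)  # one batch per produced episode
    for w in workers:
        w.join()
```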


The DART platform has access to more than 1,000 PlayStation 4 (PS4) consoles, each used either to collect training data for GT Sophy or to evaluate a trained version. The platform also includes the compute components (GPUs, CPUs) needed to interact with this large fleet of PS4s and to sustain long stretches of large-scale training.


While GT Sophy marks a significant milestone, there is still room for improvement. Sony AI will continue working with PDI and SIE to upgrade GT Sophy's capabilities and explore ways to integrate the agent into the Gran Turismo series. Beyond Gran Turismo, Sony AI is also eager to explore new partnerships that use AI to enhance players' gaming experience.

Reference Links:

https://www.gran-turismo.com/us/gran-turismo-sophy/technology/

https://www.theverge.com/2022/2/9/22925420/sony-ai-gran-turismo-driving-gt-sophy-nature-paper
