SenseTime and partners release a generalist AI agent that beats Minecraft, surviving, exploring and creating like a human

Author: Leifeng.com

There is an interesting, counterintuitive phenomenon in the development of AI: "Tasks that are relatively difficult for humans, such as playing chess, are relatively easy for AI, while things that are relatively simple for humans in the open world, such as interacting with the environment, planning and making decisions, pose great challenges for AI." This is Moravec's paradox.

Now, though, GITM has broken through this paradox, succeeding in a complex, real-world-like environment where it can survive, explore, and create like a human!

In the best-selling game "Minecraft", which closely simulates the real world, Ghost in the Minecraft (GITM), a generalist AI agent jointly proposed by SenseTime, Tsinghua University, Shanghai Artificial Intelligence Laboratory and other institutions, not only beats the game but also outperforms all previous agents.

"Ghost in the Minecraft"(GITM)

Survive, explore and create like humans

This research is an important step toward artificial general intelligence (AGI).

Broad task coverage: GITM achieved 100% task coverage on all technical challenges in the Minecraft Overworld (successfully unlocking all 262 items in the full tech tree), whereas all previous agents combined covered only about 30%. (All previous agent methods, including those from OpenAI and DeepMind, unlocked a total of only 78 items.)

High task success rate: On the most talked-about task, "Get Diamonds", GITM achieved a 67.5% success rate, 47.5 percentage points higher than the previous best result (OpenAI VPT).

Extremely high training efficiency: GITM's training efficiency has also reached new heights. It needs only one ten-thousandth of the environment interaction steps required by existing methods, and training completes in 2 days on a single CPU node, far less than the 6,480 GPU-days required by OpenAI VPT or the 17 GPU-days required by DeepMind DreamerV3.
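
As a quick sanity check on the first two figures above, the short Python sketch below recomputes them from the numbers quoted in this article. The constant and function names are illustrative and do not come from the GITM project, and the "implied" previous best is derived here by subtraction, on the assumption that the reported 47.5% gain is an absolute difference in success rate.

```python
# Figures quoted in the article; names are illustrative assumptions.
TOTAL_ITEMS = 262    # items in the full Overworld tech tree
PRIOR_ITEMS = 78     # items unlocked by all previous agents combined
GITM_SUCCESS = 67.5  # "Get Diamonds" success rate, in percent
REPORTED_GAIN = 47.5 # reported improvement, assumed to be percentage points


def prior_coverage_percent(prior: int = PRIOR_ITEMS, total: int = TOTAL_ITEMS) -> float:
    """Share of the tech tree covered by earlier agents, in percent."""
    return 100.0 * prior / total


def implied_previous_best(current: float = GITM_SUCCESS, gain: float = REPORTED_GAIN) -> float:
    """Previous best success rate implied by an absolute gain, in percent."""
    return current - gain


if __name__ == "__main__":
    print(f"Prior coverage: {prior_coverage_percent():.1f}%")
    print(f"Implied previous best on 'Get Diamonds': {implied_previous_best():.1f}%")
```

The first value rounds to the roughly 30% coverage cited for earlier agents, which shows that the 30% and 78-item figures describe the same result.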

Ghost in the Minecraft (GITM), a generalist AI agent, starts from scratch in survival mode, obtains every item in the Overworld, mines diamonds, and makes enchanted books!

GITM can handle all kinds of terrain, environments, day and night scenes, and even monsters.

GITM can also be applied to more complex Minecraft tasks, such as building the shelters, farmland and iron golems needed to survive, the redstone circuits needed for automated equipment, and the Nether portals needed to enter the Nether.

These tasks demonstrate GITM's power and scalability, allowing agents to survive and develop in Minecraft over long periods and to explore more advanced worlds.

Breakthroughs in artificial general intelligence accelerate the AI industrialization revolution

The development of GITM, an AI agent that overcomes all the technical challenges in Minecraft, is aimed at artificial general intelligence: agents that learn on their own and master the full range of real-world skills.

GITM departs from the traditional reinforcement learning (RL) based architecture and adopts a new paradigm in which a Large Language Model (LLM) is the core of the agent.
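
To give a feel for what putting a language model at the core of an agent can mean, here is a toy Python sketch. It is not GITM's implementation, which this article does not describe. The function `toy_llm_decompose` is a hypothetical stand-in for a real LLM query, backed by a hand-written and deliberately simplified prerequisite table, and `plan` is an illustrative helper that turns a goal into an ordered list of sub-goals.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical, heavily simplified prerequisite table. A real agent would
# obtain this kind of knowledge by prompting a language model.
TOY_KNOWLEDGE: Dict[str, List[str]] = {
    "diamond": ["iron_pickaxe"],
    "iron_pickaxe": ["iron_ingot", "stick"],
    "iron_ingot": ["stone_pickaxe"],
    "stone_pickaxe": ["wooden_pickaxe"],
    "wooden_pickaxe": ["stick"],
    "stick": ["log"],
    "log": [],
}


def toy_llm_decompose(goal: str) -> List[str]:
    """Stand-in for an LLM call: return the sub-goals needed before `goal`."""
    return TOY_KNOWLEDGE.get(goal, [])


def plan(
    goal: str,
    decompose: Callable[[str], List[str]] = toy_llm_decompose,
    ordered: Optional[List[str]] = None,
) -> List[str]:
    """Expand a goal depth-first so prerequisites come before what needs them."""
    if ordered is None:
        ordered = []
    for sub_goal in decompose(goal):
        if sub_goal not in ordered:
            plan(sub_goal, decompose, ordered)
    if goal not in ordered:
        ordered.append(goal)
    return ordered


if __name__ == "__main__":
    print(" -> ".join(plan("diamond")))
```

The point of the sketch is the division of labour: the ordering of steps comes from queried knowledge rather than from a policy learned through many environment interactions, which is the contrast with RL that the paragraph above draws.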

Project Home Page: https://github.com/OpenGVLab/GITM

This innovation will also help accelerate progress toward the research goals of artificial general intelligence (AGI): developing agents that can sense, understand and interact like humans in open-world environments. It is expected to bring major breakthroughs to industries such as robotics and autonomous driving, help address complex environments and the many long-tail problems of the real world, and promote the deployment of AI technology on a larger scale.

Thanks to its strategy of promoting AGI through "large models + large-scale computing power", together with its full-stack large-model R&D system, SenseTime has developed rapidly in the field of multimodal, multi-task general-purpose large models. With the "Daily New SenseNova" large model system at the core, it continues to help innovative technologies reach smart cars, smart life, smart business and smart cities quickly, steadily raising the penetration rate of industrial intelligence.

At the same time, SenseTime is actively accumulating know-how from industrial applications. As early as 2016 it began working on intelligent vehicles, and it has continued to explore and solve a large number of planning and decision-making problems in autonomous driving applications. In mid-2022, DI-star, an AI model developed by SenseTime on the OpenDILab decision-making AI platform, defeated a former Greater China champion in StarCraft, demonstrating strong decision-making AI capabilities and helping autonomous driving break free of rule-based constraints to achieve more efficient planning and control. Today, the success of GITM will push applications such as autonomous driving to the next level in handling complex tasks and break through the technological ceiling.

Leifeng Net