
Three reasons why robots are about to have a "ChatGPT moment"

Author: DeepTech

Since the field's earliest days, roboticists have hoped to build robots that can handle a variety of household chores. For a long time, that remained a distant dream.

While roboticists have managed to make robots do impressive things in the lab, such as parkour, those feats typically require meticulous planning in a tightly controlled environment.

That makes it hard for robots to work reliably in homes, especially homes with children and pets. Every house is also laid out differently and full of unpredictable clutter.

There is a famous observation in robotics known as Moravec's paradox: tasks that humans find hard are easy for machines, while tasks that humans find easy are hard for robots.

Now, with artificial intelligence, that is changing. Robots are starting to perform tasks such as folding laundry and cooking, which until recently were considered nearly impossible.

In the cover story of the latest issue of MIT Technology Review, I examine how the field of robotics arrived at this turning point.

There is a very exciting convergence in robotics research that may (just maybe) get robots out of the lab and into our homes.

Here are three reasons why robots are about to have a "ChatGPT moment."


Inexpensive hardware makes research more accessible

Robots are expensive. Highly complex robots can start at hundreds of thousands of dollars, putting them out of reach for most researchers. PR2, one of the earliest domestic robots, weighed 200 kilograms and sold for $400,000.

But newer, cheaper robots are letting more researchers do interesting work. The startup Hello Robot has developed and launched a robot called Stretch, which costs about $18,000 and weighs about 22.6 kilograms.

It has a small mobile base, a pole with a camera mounted on it, and an adjustable arm with suction cups at the end, and it can be operated with a controller.

Meanwhile, a team at Stanford University in the United States built a system called Mobile ALOHA (short for "a low-cost open-source hardware teleoperation system") that learned to cook shrimp and perform other tasks from as few as 20 human demonstrations.

They cobbled together a lower-priced robot using off-the-shelf components, costing tens of thousands of dollars instead of hundreds of thousands.


Artificial intelligence is helping us build "robot brains"

The software in these new robots is also different from that of past machines. Thanks to the AI boom, research is now shifting its focus from making expensive robots more dexterous to building "general-purpose robot brains" in the form of neural networks.

Instead of relying on traditional planning and programming, roboticists have begun using deep learning and neural networks to create systems that learn from their environment and adjust their behavior accordingly.

In the summer of 2023, Google launched RT-2, a vision-language-action model. The model gains a general understanding of the world from online text and images, as well as from the robot's own interactions, and converts that data into robot actions.
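One publicly documented idea behind models like RT-2 is that continuous robot actions can be discretized into tokens, so the same transformer that emits text tokens can emit actions. The sketch below is purely illustrative: the bin count, value ranges, and 7-D action layout are assumptions, not RT-2's actual configuration.

```python
import numpy as np

# Illustrative sketch: quantize continuous actions into a small token
# vocabulary, and recover approximate actions from those tokens.
N_BINS = 256  # assumed number of tokens per action dimension

def action_to_tokens(action, low=-1.0, high=1.0, n_bins=N_BINS):
    """Map each continuous action dimension to an integer token."""
    clipped = np.clip(action, low, high)
    scaled = (clipped - low) / (high - low)  # normalize to [0, 1]
    return np.minimum((scaled * n_bins).astype(int), n_bins - 1)

def tokens_to_action(tokens, low=-1.0, high=1.0, n_bins=N_BINS):
    """Invert the mapping: each token becomes its bin-center value."""
    return low + (tokens + 0.5) / n_bins * (high - low)

# A hypothetical 7-D action: six arm-joint deltas plus a gripper command.
action = np.array([0.10, -0.42, 0.90, 0.0, -1.0, 0.33, 1.0])
tokens = action_to_tokens(action)
recovered = tokens_to_action(tokens)
print(np.max(np.abs(recovered - action)))  # small quantization error
```

The round trip loses at most half a bin width per dimension, which is the usual trade-off when folding control outputs into a discrete language-model vocabulary.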

Researchers at the Toyota Research Institute, Columbia University, and the Massachusetts Institute of Technology have been able to quickly teach robots many new tasks with the help of an AI technique called imitation learning, combined with generative AI.

They believe they have found a way to extend generative AI from text, images, and video to robot motion.
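The core idea of imitation learning is to fit a policy that maps observations to actions using only recorded demonstrations, with no reward signal or trial and error. The toy sketch below is an assumption for illustration, not the Toyota/Columbia/MIT method: it uses synthetic data and the simplest possible policy, a linear map fit by least squares.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "demonstrations": 20 (observation, action) pairs, echoing
# the roughly 20 demos Mobile ALOHA needed per task. The hidden
# "expert" mapping stands in for a human teleoperator.
true_policy = np.array([[0.5, -1.0], [2.0, 0.3]])
observations = rng.normal(size=(20, 2))   # e.g. gripper pose features
actions = observations @ true_policy      # expert actions

# "Training" is just fitting the policy to the demonstration data.
learned_policy, *_ = np.linalg.lstsq(observations, actions, rcond=None)

# The learned policy now imitates the expert on an unseen observation.
new_obs = np.array([1.0, -0.5])
predicted_action = new_obs @ learned_policy
expert_action = new_obs @ true_policy
print(np.allclose(predicted_action, expert_action, atol=1e-6))  # True
```

Real systems replace the linear map with a deep network and the toy features with camera images and joint states, but the training signal is the same: match what the human demonstrator did.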

Many researchers are experimenting with generative AI. Covariant, a robotics startup spun out of OpenAI's now-shuttered robotics research unit, has built a multimodal model called RFM-1.

It can accept prompts in the form of text, images, videos, robot instructions, or measurements (data). Generative AI enables robots to both understand instructions and generate images or videos related to those tasks.


More data, more skills

The power of large AI models such as GPT-4 stems from vast amounts of data scraped from the internet. That approach doesn't work for robots, which need data collected specifically for robots.

They require demonstration data showing how to open washing machines and refrigerators, pick up plates, fold laundry, and so on. Such data is currently scarce, and it takes humans a long time to collect.

Google DeepMind has launched a new initiative called the Open X-Embodiment Collaboration to change that.

In 2023, the company collaborated with about 150 researchers across 34 research labs to collect data from 22 different robot types, including Hello Robot's Stretch.

The resulting dataset, released in October 2023, covers 527 skills demonstrated by robots, such as picking up, pushing, and moving objects.


(Source: TOYOTA RESEARCH INSTITUTE)

Early signs suggest that more data produces smarter robots. The researchers built two versions of a robot model called RT-X, one that can run locally on computers in individual labs and one that is accessed over the web.

The larger, web-accessible version is pre-trained on internet data, drawing on large language and image models to develop "visual common sense," a basic understanding of the world.

When the researchers ran the RT-X models on many different robots, they found the robots were 50% more successful at learning skills than the systems each lab was developing on its own.

Support: Ren

Operation/Typesetting: He Chenlong
