Quick guide
Google is using Gemini AI to improve its robots' navigation and task completion. DeepMind's research team describes how it used Gemini 1.5 Pro's long context window to make interaction with RT-2 robots more seamless through natural language commands. The researchers teach a robot its environment by recording a video tour of a specific area; the robot then learns from the video and follows instructions given in natural language or through visual cues. The Gemini-powered robots achieved a 90% success rate in large operating spaces, an encouraging result. Gemini 1.5 Pro also gives the robot more advanced planning capabilities, such as working out a sequence of actions when a user asks whether a particular drink is available. DeepMind plans to explore these capabilities further to improve robot performance.
Google uses Gemini AI in robot training
Google is leveraging Gemini AI to improve its robots' navigation and task completion. In a recent research paper, the DeepMind robotics team details how it uses Gemini 1.5 Pro's long context window to enable more seamless interaction with RT-2 robots through natural language commands.
Video-based robot navigation training
To teach the robot its environment, the researchers record a video tour of a specific area, such as a home or office. Using Gemini 1.5 Pro, the robot learns from the video and can then follow instructions given in natural language or through visual cues. The approach has shown encouraging results: Gemini-powered robots achieved a 90% success rate across a variety of user instructions in large operating spaces.
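One way to picture the idea of navigating from a recorded tour is the toy sketch below. It is purely illustrative and is not the method from the DeepMind paper: the tour frames, their captions, and the helper names (`build_graph`, `pick_goal_frame`, `shortest_path`, `navigate`) are all hypothetical, and a simple word-overlap match stands in for the language model that would pick the relevant frame.

```python
from collections import deque

# Hypothetical tour: one short caption per recorded frame, in walking order.
TOUR = ["entrance", "hallway", "kitchen with refrigerator", "desk area", "whiteboard"]


def build_graph(num_frames):
    """Link consecutive tour frames so the walk can be retraced in either direction."""
    graph = {i: set() for i in range(num_frames)}
    for i in range(num_frames - 1):
        graph[i].add(i + 1)
        graph[i + 1].add(i)
    return graph


def pick_goal_frame(instruction, captions):
    """Stand-in for the model (assumption): pick the frame whose caption
    shares the most words with the instruction."""
    words = set(instruction.lower().split())
    scores = [len(words & set(caption.lower().split())) for caption in captions]
    return scores.index(max(scores))


def shortest_path(graph, start, goal):
    """Breadth-first search over frame indices; returns [] if goal is unreachable."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for neighbor in sorted(graph[node]):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    if goal not in parents:
        return []
    path = []
    node = goal
    while node is not None:
        path.append(node)
        node = parents[node]
    return path[::-1]


def navigate(instruction, start=0):
    """Map an instruction to a goal frame, then return the captions along the route."""
    graph = build_graph(len(TOUR))
    goal = pick_goal_frame(instruction, TOUR)
    return [TOUR[i] for i in shortest_path(graph, start, goal)]


if __name__ == "__main__":
    print(navigate("take me to the refrigerator"))
```

In this sketch the tour serves two purposes: it is the material the instruction is matched against, and it is the map the route is planned over. A real system would replace the word-overlap stub with a multimodal model reading the video itself.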
Gemini drives advanced planning capabilities for robots
The study also shows that Gemini 1.5 Pro lets the robot plan beyond basic navigation. For example, when a user asks whether a particular beverage (such as Coke) is available, the robot can work out a plan: go to the refrigerator, check for the drink, and come back to report the result. DeepMind intends to explore these capabilities in depth to further improve the robot's performance.