Jin Lei and Feng Se, from Aofei Temple
QbitAI | WeChat official account QbitAI
The popular Stanford all-around housework robot Mobile ALOHA has flipped! over!!
You think it's a no-brainer, but it's actually like this:
It spills the drink all over the place, and breaks a cup while it's at it......
You'd think it could cook like a chef, but it just burns the bottom of the wok for you:
And that's not all there is to Mobile ALOHA.
For example, it accidentally loses its grip on the pan that just fried the shrimp:
Even though the staff member dashed over, he couldn't stop the "tragedy" from happening (and he seems to have burned his hands).
This scene really looks like Sister-in-law Zhuang smashing a bowl......
Mobile ALOHA, which was on a pedestal just yesterday, had so many "clumsy" moments exposed overnight, drawing many netizens to watch.
This time, however, even faced with ironclad evidence of failure, the netizens' reactions took an unusual turn:
It's not perfect, but it's lovely.
There will always be room for error.
Most importantly:
Relieved. (manual dog head, i.e. tongue in cheek)
What the hell is going on?
Stanford team exposes "scandal"
It turns out this robot-failure video was released by Tony Z. Zhao, one of the authors of Stanford's Mobile ALOHA.
And he also bluntly said:
Robots aren't ready to take over the world.
And this failure video shows exactly what the robot did in fully autonomous mode.
In the author's words, "the dumbest mistakes".
After all, beyond the few examples just shown, Mobile ALOHA can't even fit a pot into the cupboard:
Not to mention failing to pour the fried shrimp out of the pan, it can't even find where the bowl is:
It can't find the right spot to pick up a pen either:
Faced with this failure compilation, the author quipped:
It's my favorite video so far, (though) when the robot makes a mistake right in front of you, it doesn't feel so funny.
Indeed, after all, his hands got burned......
However, the author revealed today that there should actually be another reason for this video.
The god-tier Mobile ALOHA demo videos from two days ago did attract a lot of attention, but many people mistakenly assumed they were recorded in autonomous mode.
In fact, Mobile ALOHA uses a hybrid mode and is not fully autonomous, and the author urged netizens to read the paper and code carefully while enjoying the show.
It is worth mentioning that the author also cited, as a tribute, the 2015 "falling over" compilation of the Boston Dynamics Atlas humanoid robot.
Perhaps this is also as Nvidia scientist Jim Fan said:
One step at a time.
Practice 50 times, and the success rate can reach 90%
In the past two days, the Mobile ALOHA team released three viral videos in a row, showing off the robot's agile, dexterous housework skills and stunning netizens.
Including cooking a full Manchu-Han feast (cracking eggs, flipping chicken, and so on), all handled with ease:
Putting on pillowcases and laying out bed sheets:
Watering flowers, mopping floors, opening bottle caps, and even teasing cats:
It really is like a person who can hold their own in both the living room and the kitchen.
However, most of these, like the clips above, are teleoperated by a human.
For a more intuitive view, look at the GIFs below of pulling a tissue and wiping glass, with a human standing right behind performing a 1:1 demonstration:
Some relatively simple tasks, though, such as this sautéed shrimp:
as well as scrubbing pots, returning dining chairs, calling and riding elevators, wiping tables, and so on, can be learned from a small number of human demonstrations and then performed autonomously, without a human.
Specifically, according to the authors, the simple actions above need only about 50 demonstrations to reach a 90% success rate:
In testing, Mobile ALOHA wiped up spilled wine 9 times in a row and called the elevator 5 times in a row without error, showing a degree of robustness.
It is also resistant to interference: while it places a pot in the cabinet, the experimenter keeps throwing debris in front of it, which does not affect its performance at all:
A chair it never saw during training? It can still accurately identify it and return it to its place.
So how did the authors get Mobile ALOHA to perform tasks autonomously with only 50 demonstrations?
The key is imitation learning with ACT or diffusion policies, jointly training the robotic system with existing static-manipulation data.
This co-training approach significantly improves the robot's performance, especially on tasks that require precise manipulation.
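The co-training idea — oversampling the scarce mobile-task demos while mixing in the much larger static dataset — can be illustrated with a toy batch sampler. This `cotrain_batch` helper is a hypothetical sketch for illustration, not code from the Mobile ALOHA repository, and it omits the actual ACT/diffusion policy networks:

```python
import random

def cotrain_batch(mobile_demos, static_demos, batch_size=8, mobile_ratio=0.5):
    """Sample one co-training batch: a fixed fraction of episodes comes from
    the small mobile-task demo set, the rest from the large static dataset,
    so the scarce mobile data is heavily oversampled relative to its size."""
    n_mobile = int(batch_size * mobile_ratio)
    batch = [random.choice(mobile_demos) for _ in range(n_mobile)]
    batch += [random.choice(static_demos) for _ in range(batch_size - n_mobile)]
    random.shuffle(batch)
    return batch

# Toy data: ~50 mobile demos vs. thousands of static episodes.
mobile = [("mobile", i) for i in range(50)]
static = [("static", i) for i in range(5000)]
batch = cotrain_batch(mobile, static, batch_size=16, mobile_ratio=0.5)
print(sum(1 for src, _ in batch if src == "mobile"))  # prints 8: half the batch is mobile data
```

Each optimization step would then compute the policy's imitation loss on such a mixed batch; the 50/50 ratio here is an illustrative choice, not a value taken from the paper.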
Finally, a re-introduction to the Stanford team's earlier robot result:
The original ALOHA was officially released back at the end of March, after 8 months of iteration and 2 months of testing.
There are three authors, two of whom are Chinese Ph.D. students in computer science at Stanford (the third is their advisor):
At that time, the robot could already use tools to perform all kinds of delicate work, but only from a fixed position:
Of course, behind the scenes it was also teleoperated by a human.
And as its name suggests, ALOHA stands for "A Low-cost Open-source Hardware" system; this robot focuses on being open source and low cost:
All the software and hardware designs, including code and data, have been released, and building the system costs "only" 32,000 US dollars (about 227,000 yuan). The authors have also published a list of the required hardware, so interested readers can DIY their own.
The first year of robots?
Almost at the same time as Stanford's explosive robot, Google also released its latest research results, and it was all in one go:
One increases the robot's decision-making speed by 14% while also improving its operation accuracy by 10.6%;
another is a new framework specializing in generalization, which uses a new approach to raise the robots' success rate on never-before-seen tasks from 29% to 63%;
and a data collection system that can run 20 robots at the same time, which will be used to speed up training robots to understand human instructions.
All of these new achievements are used to upgrade Google's robot model RT-2.
Compared with Stanford's Mobile ALOHA, Google's RT-2 demos look less flashy, but all of its results run fully autonomously.
Beyond these two, Fei-Fei Li's team has also been following up: its robotic system VoxPoser can already understand human speech and carry out various instructions without additional training.
This inevitably brings to mind many people's prediction that "2024 will be the first year of robots":
Do you think it will come true?
Reference Links:
https://twitter.com/tonyzzhao/status/1743378437174366715
— END —
QbitAI · Signed author on Toutiao
Follow us and be the first to know about cutting-edge technology trends