Humanoid robot + large model, why is it a new outlet for investors to chase?

Image source: @VisualChina

"In a single morning, more than 40 investors came to the booth, all of them to see the humanoid robot," a Dreame employee said.

An investor pointed to the robot making latte art and asked, "What makes this action difficult?" Nearby, a man who described himself as a coal mine owner pointed to another humanoid robot and asked, "How much is it, and how do I order one?"

At the 2023 World Robot Conference, humanoid robots, once seen only in science fiction movies, performed their skills live. Xiaomi, Dreame Technology, UBTECH, Cloudminds Technology, Unitree Technology, Dalian Tiace Technology, Star Era, Polytechnic Huahui and many other companies brought their humanoid robots to the show.

The humanoid robot displayed by Dalian Tiace Technology, shot on the spot by Jiazi Lightyear

This was one lively corner of the 2023 World Robot Conference. At this year's conference, 160 robot companies from China and abroad showed off 600 robots on site, and humanoid robots stole most of the limelight. Marc Raibert, the world-renowned founder of Boston Dynamics, and Hiroshi Ishiguro, the well-known Japanese roboticist, also attended.

Humanoid robots are becoming a new hot spot in the capital markets. In the primary market, Baidu, Jingwei, Hillhouse, CDH, Gaorong, Yunqi, ZhenFund, Meihua Venture Capital and others are all actively doing research on the front line, and virtually every venture investor who spent the first half of the year on large models is now watching general-purpose robots.

There have already been several deals in China and abroad:

Figure completed two rounds of financing in two months: in May, it closed a $70 million Series A round led by Parkway Venture Capital; in July, it received a $9 million investment from Intel Capital. According to Reuters, Figure was already valued at more than $400 million at the time of the May round.

In China, Agibot (Zhiyuan Robot), founded by Zhihuijun, a former member of Huawei's "Genius Youth" program, is also being courted by top-tier funds; Baidu, Jingwei, Hillhouse, CDH, Gaorong and others have all invested.

Recently, the general-purpose robot company Yuequan Bionic also completed a round funded solely by Beijing Beike Zhong Development Qihang Venture Capital Fund. Its core business is the industrialization of general-purpose bionic humanoid robots and core components. Yuequan Bionic was founded by the team of academician Ren Luquan of the Key Laboratory of Engineering Bionics of the Ministry of Education at Jilin University.

In the secondary market, humanoid robot concept stocks have gone through several waves of speculation. In May, Musk's remarks about the humanoid robot Optimus at Tesla's 2023 shareholder meeting directly lifted A-share robot concept stocks: Seymour Intelligent hit the 20% daily price limit, and Fengli Intelligent rose nearly 160% in six trading days, which drew the attention of the Shenzhen Stock Exchange and a request to explain the reasons for, and the reasonableness of, the sharp rise in its share price.

Why have humanoid robots become so sought after? And behind all the investor research, what opportunities and challenges do humanoid robots face?

1. Tesla, Xiaomi, Dreame: humanoid robots in full swing

Tesla is the direct driver of this humanoid robot craze.

At the 2023 shareholder meeting, Musk said that Optimus had significantly improved its motion and force control as well as its environmental perception, and that the technology was iterating rapidly. He predicted that demand for humanoid robots could reach 10 billion units or more: if the ratio of robots to humans reaches 2:1, demand for humanoid robots could far exceed demand for cars.

Musk's belief and input ignited many people's confidence in the humanoid robot track.

Tesla may push the entire industry chain to maturity. Wu Shichun, founding partner of Meihua Venture Capital, told Jiazi Lightyear: "In the new energy vehicle industry chain, Tesla drove the overall development of the intelligent vehicle supply chain, and its Shanghai factory drove the gradual maturing of China's entire intelligent vehicle industry chain. The next one could be humanoid robots."

"That's a good thing, and we are also looking at where the investment opportunities in the industry are, whether in complete machines, components or software," Wu Shichun said.

At Tesla's 2022 AI Day, the Tesla humanoid robot "Optimus" made its debut, walking, turning, stopping and waving in greeting. "Optimus" did not appear at this year's robot conference, but at this year's World Artificial Intelligence Conference in Shanghai we saw one on display in a showcase.

Filmed on the spot by Jiazi Lightyear

"Optimus" uses the same computer vision as Tesla's cars, a "brain" that processes visual data, makes action decisions and supports communication, and the same chip as Tesla vehicles. It is also equipped with the FSD computer and Autopilot-related neural network technology that share their origins with Tesla vehicles, and is expected to sell for no more than 20,000 US dollars (about 144,000 yuan).

Huang Mingming, founding partner of Mingshi Capital, believes that electric vehicle companies have an innate advantage in building humanoid robots. "When Musk set out to build Tesla Bot two years ago, a lot of people thought he was straying from his core business. But if you look closely at Tesla's technology stack, you will find that robots are a natural extension of electric vehicles. The car is the first generation of four-wheeled robot, and the vision Li Auto set at the beginning of this year is not to become the world's largest electric car company, but to become the best artificial intelligence and robotics company."

He said that cognitive robots are the next big breakthrough. "They could be humanoid, or they could be quadrupeds. We already have all kinds of robots in factories, on production lines and in logistics, but their programs are written by human engineers and their actions are fixed by us. Cognitive robots, like autonomous driving, can perceive, analyze and judge, engage in human-machine interaction, understand the 3D world in real time, and operate accurately."

The task generalization ability of a humanoid robot determines how far it can go. Many companies are aiming in this direction.

Expedition A1, the embodied intelligent robot recently released by Zhiyuan Robot (Agibot), is a humanoid robot. Zhihuijun said: "Zhiyuan Robot is committed to closely integrating advanced robotics and AI technology with human life and manufacturing, making robots powerful assistants to humans. In the future, the A1 will be able to autonomously complete mobility and manipulation tasks in a variety of complex scenarios."

The aforementioned startup Yuequan Bionic has achieved dexterity similar to that of a human hand. The company's self-developed humanoid intelligent dexterous hand can actively move, flex and flip to adjust its grip under external disturbance so that the object it holds does not fall. "In addition to basic gripping and pressing, it can complete 27 different complex and delicate hand operations, such as picking up small objects with chopsticks, applying skin care products, stirring coffee, swiping on a mobile phone and undoing buttons." The dexterous hand reportedly uses tension-compression body drive technology and has a very high number of degrees of freedom. It also has built-in flexible sensors that provide tactile feedback.

Yuequan Bionic's intelligent dexterous hand; photo provided by the company

Zhao Di, CTO of Yuequan Bionic, told Jiazi Lightyear that today's joint-driven humanoid robots can generally only grasp and hold, and struggle with more complex actions such as undoing buttons. For humanoid robots to truly serve humans, a pair of capable hands is essential, so it is particularly important to innovate on the underlying principles and propose new configurations.

For the locomotion of humanoid robots, Yuequan Bionic is also pursuing its own original theory of the "bionic tension-compression body robot". Zhao Di said, "A joint can normally have up to 6 degrees of freedom, but joint-driven robots lock degrees of freedom in pursuit of control accuracy, so in practice each joint has only 1 to 3. The result is high power consumption, dozens or even hundreds of times that of human movement."

According to him, Yuequan Bionic's tension-compression body robot can overcome the shortcomings of traditional joint-driven robots. It gives the robot movement characteristics similar to a human's, maintains stability even with a high number of degrees of freedom, and can rapidly and adaptively adjust joint stiffness so that the robot interacts safely with its environment, while its locomotion energy consumption is only one to two times that of the human body. Because of the new drive method, Yuequan Bionic's products also no longer need complex and expensive reducers, which lowers cost. The company is understood to have developed micromotors and bionic materials as well, to match the performance needs of the tension-compression body drive.

At the robot conference, we also saw a lot of humanoid robots showing various skills.

Dreame brought the humanoid robot it released in March this year. The robot is 178 cm tall and weighs 56 kg, with 44 degrees of freedom across its body, including a full 6 degrees of freedom in each leg, which allows it to stand on one leg. It is also equipped with a depth camera for 3D modeling of indoor environments, and integrates a large language model for real-time conversation.

Dreame's robot making latte art; image from Dreame

Yu Chao, head of humanoid robots at Dreame Technology, told Jiazi Lightyear: "The difficulty in having a humanoid robot make latte art on its own is that it has to interact with people in an open space, which means its spatial position and trajectory are subject to many uncertainties, and it has to recognize tools of different materials and sizes. Solving these problems requires more intelligent models and sensors, as well as innovative adjustments to the mechanical design."

Robot company Unitree Technology brought not only the quadruped robots it has already deployed in industrial settings, but also its newly released humanoid robot, the H1.

Filmed on the spot by Jiazi Lightyear

It is a full-size general-purpose humanoid robot that can run, billed as having the highest dynamic performance among robots of comparable specifications worldwide. It has 360° panoramic depth perception, a walking speed above 1.5 m/s, a potential movement speed above 5 m/s, and a total weight of about 47 kg. Staff told Jiazi Lightyear that the H1 will officially go into production in the second half of the year, priced within a few hundred thousand yuan.

Even when kicked from the side or from behind, the H1 behaves like a person: after a slight stagger, it regains its balance without falling. A robotics engineer told Jiazi Lightyear: "This is not easy to achieve. Everyone knows which mathematical functions are involved, but it is hard to actually implement them in a product."

Xiaomi drew attention mainly for its newly released quadruped robot, while "Tie Da", the humanoid robot it released last year, attracted less notice. Jiazi Lightyear has learned that "Tie Da" was manufactured by Dreame, which is itself a member of the Xiaomi ecosystem and has built up experience in both software and hardware, including high-speed motors.

Filmed on the spot by Jiazi Lightyear

Cloudminds brought its "Seven Fairies" bipedal humanoid robot Xiaozi. The robot is 165 cm tall and weighs 65 kg; its body is made of lightweight, high-strength carbon fiber composite and has more than 60 intelligent compliant joints. It was developed in-house by Cloudminds across the full stack, runs the Hairui (HARIX) cloud brain operating system, and integrates RobotGPT, a multimodal large AI model for robots. Another humanoid robot, wearing a jersey, performed fixed-point shooting. Huang Xiaoqing, founder and CEO of Cloudminds, said the "Seven Fairies" will be officially released in 2024 and mass-produced in 2025.

Students from the robotics team of the School of Control at Zhejiang University also attended with the "Wukong-4" humanoid robot. "Wukong-4" is said to handle outdoor roads, grass, mud and other terrain, reach a top speed of 6 km/h, jump 0.5 meters, and climb and descend 25-degree slopes and 10 cm steps. Under unknown disturbances such as slippery ground or external pushes, it can quickly recover its balance and keep walking stably.

Photo courtesy of Zhejiang University

By integrating legged locomotion and environment perception technologies, "Wukong-4" can build three-dimensional maps of its environment and navigate autonomously and dynamically. The project's advisors are Zhu Qiuguo and Xiong Rong, both robotics experts who have long worked on legged robots, intelligent robot perception and control, and multi-robot cooperative control, and who have deep theoretical and technical foundations.

Star Era (Xingdong Era), a startup incubated by the Institute for Interdisciplinary Information Sciences at Tsinghua University and the Shanghai Qi Zhi Institute, also brought humanoid robot products to the show.

Both of Star Era's products, Xiaoxing and Xiaoxing MAX, were demonstrated live. Xiaoxing can walk briskly and with reasonable stability on concrete, in woods, on grass and on gravel roads.

Filmed on the spot by Jiazi Lightyear

Behind these two robots is a set of hardware and software technologies developed in-house by the company: a humanoid robot body driven by proprioception; self-developed modular joints with high torque density and an integrated structural design; advanced materials such as high-strength alloys, carbon fiber and engineering plastics that preserve an attractive appearance while improving structural strength and stability; and a large language model paired with advanced force control algorithms, giving high dynamic performance and a better understanding of humans.

From investor enthusiasm and the entry of leading technology companies to innovative research by startups and universities, humanoid robots are advancing on multiple fronts and entering a new stage of development.

2. Equip humanoid robots with brains

Large models are another key technology variable driving the boom in humanoid robots.

OpenAI single-handedly brought humanity to the threshold of artificial general intelligence. Large models are reaching into every industry, and their combination with robots has opened up new possibilities for general-purpose robots: chatting alone is too limited, so can a robot break a task down from an instruction and carry it through to the end by itself?

Chen Yu, a partner at Yunqi Capital, believes that large models are in essence a software capability, and that the best way to connect the digital world with the real world is through hardware. "At present, we are more focused on how to combine large models with robot hardware to arrive at a general-purpose robot." In his view, embodied intelligence has revealed a possible path to general-purpose robots. Capital is adding firewood to the industry so that, after five to ten years of large-scale investment, the fire burns hotter and general-purpose robots are finally commercialized.

Chen Yu noted that the focus of robot R&D has changed: robots are no longer limited to one specific type of work, but are expected to complete multiple types of tasks. In the past, delivery robots handled only delivery, and construction robots could only paint walls. With general intelligence, flexible robot labor may become possible, such as a factory robot that can drive screws, spray paint and assemble parts.

Whether the label is "embodied intelligence" or "artificial general intelligence", both express the market's new expectations for humanoid robots in 2023: to create value in the physical world, artificial general intelligence needs an embodied carrier that interacts with the real physical world, so that it can affect humans on a larger scale.

Xin Wei, investment director at Linear Capital, believes that large models make general-purpose robots achievable. Xin Wei noted that generalizing robots requires solving technical problems at three levels: the top layer must understand, define, plan and decompose tasks; the middle layer must be a highly generalizable execution layer that can carry out tasks across different scenarios; and the bottom layer is the relatively mature field of robot control, paired with a suitable hardware body. "Of the three layers, before large models appeared, we believed the top layer was the hardest to achieve. The capabilities of large models fully match the requirements of this layer, which makes it possible for general-purpose robots to become reality."
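
To make the three layers concrete, here is a minimal Python sketch of how they might fit together. It is only an illustration: every name in it (Step, plan, SKILLS, low_level_control, run) is hypothetical, the planner is a string-splitting stand-in for a large model, and the controller simply echoes commands rather than driving any hardware.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    skill: str   # name of a reusable skill in the execution layer
    target: str  # object or location the skill acts on


def plan(instruction: str) -> List[Step]:
    """Top layer: understand and decompose the task.

    Hypothetical stand-in for a large model. This toy version splits the
    instruction on ' then ' and treats the first word of each clause as
    the skill and the remainder as the target.
    """
    steps = []
    for clause in instruction.lower().split(" then "):
        skill, _, target = clause.strip().partition(" ")
        steps.append(Step(skill=skill, target=target))
    return steps


def low_level_control(command: str) -> str:
    """Bottom layer: stand-in for conventional robot motion control."""
    return f"executed:{command}"


# Middle layer: skills that expand one step into low-level commands.
SKILLS: Dict[str, Callable[[str], List[str]]] = {
    "open": lambda target: [f"reach({target})", f"pull({target})"],
    "pick": lambda target: [f"reach({target})", f"grasp({target})"],
}


def run(instruction: str) -> List[str]:
    """Pass an instruction through all three layers and return a log."""
    log = []
    for step in plan(instruction):
        if step.skill not in SKILLS:
            raise ValueError(f"no skill for step: {step}")
        for command in SKILLS[step.skill](step.target):
            log.append(low_level_control(command))
    return log
```

Calling run("Open drawer then pick apple") sends the instruction through the planner, the skill table and the controller in turn. In this sketch, the claim in the quote corresponds to replacing the toy plan function with a large model while the lower two layers stay as they are.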

Many startups are also betting on technology as the driver: the new capabilities of humanoid robots may open up new demand and bring them closer to real-world deployment.

Wang Xingxing, founder of Unitree Technology, said that a few years ago the main reason the market was not optimistic about humanoid robots was that available control technology could not handle a robot form as complex as a humanoid. Today, with the development of large models, AI has advanced well beyond what robots require. "To build a humanoid robot now, you can borrow from existing large model technology; a small or even a medium-sized model is enough. That has lifted general-purpose humanoid robot technology to a new level. In the next few years, as long as the engineering problems are solved, humanoid robots can deliver great productive value and bring disruptive real-world applications."

Zhang Wei, founder of LimX Dynamics, an intelligent robot company working on bipedal robots, believes the opportunity for general-purpose robots that humanoids bring lies in the fact that they must not only solve problems that today's specialized machines cannot handle, but also have the abilities of at least two or three workers in different jobs, such as moving boxes, picking goods and performing quality inspection. That calls for a general-purpose physical mobility platform.

Overseas, progress is a step ahead of China: the capabilities of large models have already reached from language into the execution layer.

In July this year, Fei-Fei Li's team published a new embodied intelligence research project online: robots connected to large models can open drawers, unscrew bottle caps, weigh apples and perform other actions by following verbal instructions from humans.

Robotics Transformer 2 (RT-2), the robot model Google DeepMind launched on July 28, extends research in the same direction. According to Google DeepMind's paper, RT-2 is a new vision-language-action (VLA) model that learns from web and robot data and translates that knowledge into generalized instructions for robot control, while retaining web-scale capabilities. RT-2 showed better generalization: its semantic and visual understanding went beyond the robot data it was exposed to, and it could interpret new commands and respond to user instructions by performing rudimentary reasoning.

These studies mark a key step toward general-purpose robots. In China, a team at Tsinghua University has also been working in this field. Andrew Chi-Chih Yao (Yao Qizhi), winner of the 2000 Turing Award, academician of the Chinese Academy of Sciences and dean of the Institute for Interdisciplinary Information Sciences at Tsinghua University, said in a forum talk that this new generation of embodied agents integrating large model capabilities needs three components:

The first is the body, which needs sufficient hardware, such as sensors and actuators. The second is the cerebellum, which governs perception such as vision and touch, controls the body, and completes complex tasks. The third is the brain, which governs logical reasoning, decision-making, long-term planning, and natural-language communication with other agents and the environment.

If general-purpose robots built around embodied intelligence are the future, why must they take humanoid form?

General-purpose robots are not necessarily humanoid, but many robotics practitioners say the humanoid is currently recognized as the best form for a general-purpose robot.

Andrew Yao said in his talk that the humanoid is currently the best form for a general-purpose robot. On the one hand, humanoid robots adapt better to a variety of environments. On the other, the human social environment is designed around people: the structure of stairs, the height of door handles and the shape of cups are all tailored to the human body. So if you want to build a general-purpose robot that can be applied everywhere, the human form is currently the best and most suitable one.

Liu Yuan, a partner at ZhenFund, told Jiazi Lightyear that he sees the product definition of, and demand for, humanoid robots as fundamental, much like what humans expected of robots hundreds of years ago. Throughout history, many new products have been distant echoes of the needs and solutions for future life imagined in science fiction long before. "Hundreds of years ago, humans wanted robots to do housework. Then came washing machines, microwave ovens and robot vacuums. You could say that science fiction wrote the product definition for human needs."

But he also noted that many of the companies now rushing into humanoid robots are largely following the trend.

3. There are many problems that large models cannot solve

Ideals are full, but reality is lean. Technology companies accelerating into the humanoid robot track must seize the current opportunity while also confronting more practical technical problems and commercialization challenges.

At the technical level, as the work of Google, Fei-Fei Li's team and others suggests, general-purpose robotics also needs a large model like GPT-4: one that gets there in a single step, integrates multimodal capabilities, and truly unifies the development of embodied intelligence.

But this is no easy task. Xia Ling, a partner at Mingshi Capital, told Jiazi Lightyear that the robots combined with large language models in current papers and demos focus on the interaction problem, but solving interaction does not turn a humanoid robot into a general-purpose robot. "Even if high-level task decomposition and planning is done through human-machine interaction, the robot still has to control and execute, which includes general mobility over complex terrain and high-precision manipulation. These capabilities remain a big challenge for robots."

Xia Ling believes large language models can do little for the control and execution layer. "Looking at the development of general-purpose robots as a whole, large language models have contributed, but their impact on low-level control and execution is limited. Academia is currently taking an AI-driven approach and hopes to use reinforcement learning for low-level control and execution, but that is not directly related to large language models. Moreover, most reinforcement learning control methods are still at the academic research stage."

Companies in the field face these challenges too. Wang Xingxing, founder of Unitree Technology, said robotics will eventually have its own large models. In his view, integrating large models with robots is difficult: some general-purpose large models are good at text logic and processing, but because they were not built specifically for general-purpose humanoid robots, their environmental cognition and perception are essentially zero. Unlike the datasets for large language models, which can be obtained directly from the Internet, robot data is dynamic: it requires collecting dynamic simulation data in simulated environments and also depends on grounded interaction with the physical environment, all of which takes time.

Wang Xingxing is relatively optimistic about future technical progress: "NVIDIA has been pushing training in simulation environments. Judging from the current global enthusiasm and the pace of the artificial intelligence industry, there will be significant progress within 10 years."

Others think it will not happen soon. Professor Alois C. Knoll of the Technical University of Munich in Germany said that robotics must next do what large language models did: gradually integrate simulation, modeling, programming, artificial intelligence and other capabilities, step by step, and develop its own intelligent generalization ability. "Humanoid robots are among the hardest and most complex machines we have seen to date. The process takes time, may be slower than AGI, and may not bring dramatic changes quickly."

Another key challenge lies in the co-evolution of hardware and software capabilities.

Unlike the many people promoting the disruptive opportunities that large models bring to humanoid robots, Marc Raibert, founder of Boston Dynamics, said in his speech that hardware engineering and software are equally important to the future of robotics. "There are people who think that software can overcome all the problems and limitations of hardware, and I don't agree with that."

In his view, the world's best robots can only be designed when the best hardware designers and software designers work together. In the case of Boston Dynamics' humanoid robot Atlas, the company did a great deal of hardware engineering, including the hydraulic system, multiple specialized valves, special batteries and payload, and cut the robot's weight from 170 kilograms to 90 kilograms. In the process they did not compromise on the robot's functions, and instead increased its range of motion, strength and speed.

This is the most realistic difficulty at the moment. Especially for startups, balancing technology implementation, performance, and cost is a key capability.

Chen Jianyu, CEO of Star Era, said the company wants its robots to deliver strength, speed, precision and low cost all at once, but that this is genuinely hard. "Hydraulics can make a robot fast and powerful, but the cost is too high. Electric drives rely on harmonic reducers with high reduction ratios, but once accuracy and load capacity are good, dexterity falls. When dexterity goes up and cost is relatively low, load and accuracy have to be sacrificed. For now it is hard to cover everything, and the elements can only be balanced for each application scenario."

On safety, the hallucinations of large language models may do little harm on their own, but once a robot carrying a large model enters daily life, accuracy and safety must be guaranteed. These are directions in which the technology still needs to improve.

Solving these problems will require humanoid robot companies to keep experimenting, by trial and error, in real scenarios.

Xin Wei, investment director at Linear Capital, told Jiazi Lightyear that the demos humanoid robots can show today are relatively rudimentary, whether in mobility or manipulation. Real use in the field requires strong generalization from both algorithms and hardware, which is the basis for commercialization. "Of course, we should neither overestimate the short-term effect of technology nor ignore its long-term progress. General-purpose robots have become a hot field, and more people and resources are pouring in from both academia and industry. I believe commercial products that can be deployed to some degree are not far off."

Xia Ling, a partner at Mingshi Capital, believes that for startups that want to build general-purpose robots today, it is especially important to find an L2 that can be commercialized and can close the data loop. Such a product has real commercial value, and on the basis of that commercial value the underlying technology can spin up a data flywheel that supports continued development toward L4. "If you only have L2, without the technical architecture, ambition and capability, you cannot get to L4. So you need the big dream of L4, and at the same time you need to find an L2 that can be commercialized."

Technology, scenarios, cost and safety: opportunities and challenges are arriving at the same time, and humanoid robots are taking a crucial step toward the future.

(Cover image source: Baidu Wenxin Yige)