On June 27, 2024, Morgan Stanley released a Tesla research report arguing that AI is driving transformative change in robotics, that the opportunity for humanoid robots will be far greater than for self-driving cars, with faster adoption and more capital investment, and that Tesla is at the center of this theme.
On June 28, 2024, Meta founder and CEO Mark Zuckerberg had an in-depth conversation with technology creator Robin Kallaway about how technology will develop over the next 10 years, especially the application of smart glasses, neural wristbands, and AI for creators and small businesses. Zuckerberg believes that smart glasses will gradually replace mobile phones and that AI will diversify, allowing creators and small businesses to create customized AIs.
This issue of Intelligent Frontline shares an excerpt from Morgan Stanley's Tesla research report and the minutes of Zuckerberg's conversation with Robin Kallaway, originally published by Master Station and Youxin Newin and selected and refined by Liuhe Business Research. Enjoy!
Body:
The full text is 13,542 words
Estimated reading time: 27 minutes
Tesla's humanoid robot Optimus: the investment implications of embodied AI
When: June 29, 2024
Source: Master Station
Word Count: 3,855
AI is driving transformative change in robotics, the opportunity for humanoid robots will be far greater than for self-driving cars, with faster adoption and more capital investment, Tesla is at the center of this topic, and investors may need to add a tab to their Excel models.
Here is an excerpt from Morgan Stanley's latest Blue Paper, "Humanoid Robots: The Investment Implications of Embodied AI".
The advancement of AI is changing the robotics industry. We believe that the adoption of embodied AI could be much faster than that of autonomous vehicles.
Labor shortages and changing demographics, growing business relevance, adoption paths across a wide range of industries, and the economic payback period all support this view.
We built a proprietary TAM (total addressable market) model that examines workforce dynamics and humanoid-robot optionality, covering more than 830 job classifications and a $30 trillion global labor market.
We include a comprehensive competitive analysis from the Asian robotics team and a proprietary BotBOM to help investors think about the hardware cost curve.
"Daddy, baby tigers learn by watching their mom hunt," my 9-year-old son announced at the family dinner. "They pounce in the grass and catch small prey such as fawns to practice."
For years, machine learning was limited to self-reinforcing software algorithms. Advances in large language models (LLMs) and generative AI (GenAI) have led to a quantum leap in robot learning, accelerating the learning of physical machines through natural language, imitation, and simulation.
GenAI is changing the way robots learn, giving them the opportunity to observe and mimic behavior in both physical and virtual worlds, connect through natural language, and iterate in the data center.
Just as large language models (LLMs) are helping drive ChatGPT's growing capabilities, multimodal models (MMMs) are driving innovation in robotics.
AI algorithms can dramatically shorten R&D cycles by automating repetitive tasks, enhancing data analysis and prediction capabilities, enabling virtual simulations, and optimizing design and testing processes.
As an AI-adjacent field, humanoid hardware development can now benefit directly from increased capital formation and R&D investment in robotics topics.
AI leaps into the physical/atomic world. AI is everywhere, AI is listening to you, AI sees your face and body, AI knows where you are now, AI can read, AI can write, AI can talk, AI can make pictures of cats in cowboy hats throwing punches.
Aside from running a lot of algorithms and flipping a few switches, AI rarely actually moves. In nature, locomotion refers to an organism's ability to move independently using its own energy. In the fossil record, the earliest locomotion on Earth dates back to Precambrian bacterial flagella, whip-like structures. Today, the line between mobile devices and robots is becoming increasingly blurred.
Why do we need humanoid robots? There are good reasons for robots to take highly specialized forms: robotic arms, snake robots, robot dogs, mechanical dust, and so on.
Many robotics and AI experts say the strongest argument for humanoid robots is that the world has already been built for humans, so the environment is ready-made for them.
Nvidia founder and CEO Jensen Huang recently said: "The easiest robot to adapt to the world is a humanoid robot, because we built this world for ourselves. We also have the most data to train these robots, since we have the same body form. Think of the many tasks humans perform with our hands or with tools, and the many machines designed for human hands and fingers. The most important reason robots should look like people is that we built the world for ourselves: the workstations and production lines in factories were created for people."
Define TAM. As of November 2023, the U.S. workforce is about 162 million people. At an average salary of $59,428, the U.S. labor market is worth nearly $10 trillion per year. According to Statista, about 3.4 billion people are employed worldwide. Assuming an annual salary of $9,000 per worker, the global labor market is worth about $30 trillion, or about 30% of global GDP.
We predict that, out of a theoretical $30 trillion, the cumulative installed base of humanoid robots will reach 8 million units by 2040, affecting $357 billion in wages, and 63 million units by 2050, affecting $3 trillion in wages. Our analysis does not currently assume that the installed base of humanoid robots exceeds the existing human workforce, although in some cases the economics of the technology may make this a reality.
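To make the arithmetic above concrete, here is a minimal sketch in Python that reproduces the labor-market sizing and derives the implied wage impact per robot; the workforce, salary, unit, and wage figures come from the report as quoted, while the per-unit division is our own illustrative step.

```python
# Back-of-the-envelope reproduction of the labor-market sizing quoted above.
US_WORKERS = 162_000_000           # U.S. workforce, Nov 2023
US_AVG_SALARY = 59_428             # average U.S. salary, USD/year
GLOBAL_WORKERS = 3_400_000_000     # employed people worldwide (Statista)
GLOBAL_AVG_SALARY = 9_000          # assumed global average salary, USD/year

us_labor_market = US_WORKERS * US_AVG_SALARY              # ~$9.6 trillion/year
global_labor_market = GLOBAL_WORKERS * GLOBAL_AVG_SALARY  # ~$30.6 trillion/year

# Morgan Stanley's adoption scenario: installed base and wages affected.
scenarios = {
    2040: (8_000_000, 357e9),    # 8 million units, $357B of wages
    2050: (63_000_000, 3e12),    # 63 million units, $3T of wages
}

print(f"US labor market:     ${us_labor_market / 1e12:.1f}T per year")
print(f"Global labor market: ${global_labor_market / 1e12:.1f}T per year")
for year, (units, wages) in scenarios.items():
    # Implied wages addressed per robot -- an illustrative derived figure.
    print(f"{year}: {units / 1e6:.0f}M units, ${wages / 1e12:.2f}T wages, "
          f"~${wages / units:,.0f} per robot per year")
```

Under these quoted figures, each robot in the scenario displaces roughly $45,000 to $48,000 of wages per year, which is comparable to the starting BOM estimates discussed later in the excerpt.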
At Tesla's recent annual shareholder meeting, Musk said he believes the number of humanoid robots will eventually exceed the number of humans: "I think the ratio of humanoid robots to humans will be at least one to one, probably something like two to one."
That is, there could be about 10 billion humanoid robots. Perhaps, it could be 20 billion, or even 30 billion.
Will there be 1 billion humanoid robots in the 2040s? Judging from his posts, Musk has spent much of recent months at the Optimus engineering center in Palo Alto. Tesla first demonstrated the Optimus humanoid robot on September 30, 2022.
Tesla's bipedal robot Optimus, which includes 28 actuators, is divided into two categories:
1. Rotary actuators, including harmonic reducers, ball bearings and sensors, for shoulder and elbow rotation actions;
2. Linear actuators, including planetary rollers, ball bearings and sensors, used for muscle-like linear motion. There are a total of 12 actuators across both hands, and many further details remain in-house.
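As a rough way to keep this publicly described actuator mix straight, here is a minimal structured summary in Python; the grouping follows the two categories above, and because the text does not say whether the 12 hand actuators are counted within the 28, the sketch keeps them separate.

```python
# Structured summary of the Optimus actuator mix as described above.
# Note: the source does not say whether the 12 hand actuators are included
# in the 28 body actuators, so body and hands are kept as separate entries.
optimus_actuators = {
    "body": {
        "count": 28,
        "rotary": {
            "components": ["harmonic reducer", "ball bearings", "sensors"],
            "used_for": "shoulder and elbow rotation",
        },
        "linear": {
            "components": ["planetary rollers", "ball bearings", "sensors"],
            "used_for": "muscle-like linear motion",
        },
    },
    "hands": {"count": 12, "note": "total across both hands"},
}

for part, spec in optimus_actuators.items():
    print(f"{part}: {spec['count']} actuators")
```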
In January 2024, Musk said he expected more than 1 billion humanoid robots to be in operation by the 2040s. At Tesla's annual shareholder meeting on June 13, 2024, he predicted that by 2025, Tesla would have at least 1,000 Optimus robots at work, and things would scale rapidly from there. He believes that humanoid robots will eventually surpass the number of humans, possibly reaching 20 billion or more, without sharing a timeline.
A dynamic, rapidly changing competitive environment. In addition to Tesla, dozens of startups and established companies are developing humanoid robots, driven by GenAI's rapid growth in 2022/2023. We note that even before NVIDIA's March 2024 keynote, the company's intentions around physical AI and robots were evident; it had been a recurring theme at the Morgan Stanley TMT conference. After many ups and downs, a range of venture capitalists and companies are betting on the promise of embodied AI.
In 2024, humanoid robotics startups Figure AI and Agility Robotics were valued at $2.6 billion and $1.2 billion, respectively, in private rounds, with a broader theme attracting major investors including OpenAI, SoftBank, Tiger Global, Amazon, NVIDIA and Microsoft.
Major public companies in industries ranging from automotive to consumer electronics are also actively involved in humanoid robot development, and some are partnering with humanoid robotics startups to explore potential future use cases.
Robotics is going through a ChatGPT moment. According to Vincent Vanhoucke, director of advanced robotics at Google DeepMind, experts call two years ago "the good old days." LLMs and GenAI have suddenly pulled robotics off its isolated island and into the AI flywheel. LLM and GenAI science had long been seen as a completely different world from robotic actuators; now these worlds are colliding, with far-reaching implications.
We've seen this before. In 1821, Faraday observed that a wire suspended above a magnet rotated when electricity was passed through it. This not only marked the invention of the first electric motor, showing how electrical energy produces mechanical motion, but also linked two seemingly unrelated fields of science: electricity and magnetism. Albert Einstein later discovered a previously unimagined connection between matter and energy, E = mc². Could we be on the verge of unraveling the relationship between generative AI and robotics?
Networked collective robot learning. Imagine a humanoid robot standing at a kitchen island, on which sits a small plate with an onion and a paring knife beside it. Now imagine a large warehouse with 1,000 humanoid robots, each standing next to a kitchen island with the same setup. With each round of trial and error, the entire group learns collectively at the rate of the best robot at any point in time.
This kind of networked collective learning improves rapidly, and group learning accelerates. When the physical practice is complete and the top-performing robot peels onions better than the other 999, best practices can be further shared and refined through digital twins across hundreds of millions of trials in a simulated holographic universe.
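A toy simulation of the "fleet learns at the rate of the best robot" idea is sketched below in Python; the task, the skill scores, and the update rule are invented purely for illustration and are not taken from any Tesla or Morgan Stanley disclosure.

```python
import random

# Toy model: n_robots each try a random variation of the shared policy every
# round; the fleet then synchronizes to the best performer, so the group
# "learns at the rate of the best robot" and never regresses.
def simulate_fleet(n_robots: int = 1000, rounds: int = 10, seed: int = 0) -> float:
    rng = random.Random(seed)
    fleet_skill = 0.0  # quality of the shared policy, arbitrary units
    for r in range(rounds):
        trials = [fleet_skill + rng.gauss(0.0, 1.0) for _ in range(n_robots)]
        fleet_skill = max(fleet_skill, max(trials))  # broadcast the best result
        print(f"round {r + 1:2d}: fleet skill = {fleet_skill:.2f}")
    return fleet_skill

if __name__ == "__main__":
    # A 1,000-robot fleet improves far faster than a single robot would,
    # because each round it keeps the best of 1,000 attempts.
    simulate_fleet()
```

Running the same loop with n_robots=1 shows how much more slowly a lone robot accumulates skill, which is the intuition behind fleet learning.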
Have you seen or interacted with a robot today? Some of you may have; most of you reading this in the summer of 2024 probably have not. That era is quickly passing into the nostalgic history of human technology. The LLM/GenAI revolution, still in its early stages, has already penetrated the field of robotics.
For a long time, LLM and robotics were seen as two distinct fields of science. With the advancement of LLMs, there may be more overlap in the training and learning of robots. Whether it's a car-shaped robot or a humanoid robot, the AI brain is looking for its robotic body.
Humanoid robots and self-driving cars. An autonomous vehicle (AV) is a relatively simple robot: a robotaxi has only three main control outputs, steering angle, accelerator, and brake. Its operating domain, however, is extremely complex, and public roads are full of unpredictable elements.
We believe humanoid robots can reach commercialization much faster than AVs. AVs operate in a highly variable real-world environment with direct safety implications for human passengers and pedestrians, whereas humanoid robot morphologies can be learned in geofenced areas (warehouses, factories, enclosed work cells). Humanoid robots have more physical outputs, but the difficult operating domains, safety concerns, and regulatory scrutiny faced by autonomous vehicles have delayed the AV adoption curve.
Key drivers of humanoid robot adoption. The humanoid robot story involves understanding three main areas: AI, robots, and people.
At different stages, advances in AI's multimodal models, neural network training, and computation may progress faster than the physical sciences of robotics, such as optics, actuators, and battery manufacturing, which may follow their own, potentially nonlinear, improvement paths. At the same time, multiple labor-related drivers across industries and geographies will largely determine the economic payback period, adoption rate, and social acceptance.
The development of advanced humanoid robots is still in its infancy. We believe that over the past few years, advances in adjacent fields, GenAI, Actuators & Machinery, and battery storage, have proven to be important contributors to the development of humanoid robots. Further progress in these three areas will be key to the commercialization of humanoid robots.
There are some constraints that must be taken into account. The large-scale commercialization of humanoid robots must overcome a range of technical challenges, as well as a wide range of social/policy/safety barriers.
On the technology side, creating humanoid robots that can navigate the nuances of the human environment may require continued generative AI advancements, as well as efforts to tailor these advanced models specifically to humanoid robots.
Further refinements in precision actuators, sensors, and battery capacity are critical to increasing the range of tasks that humanoid robots can perform.
The sudden and rapid arrival of generative AI models, developed over roughly the past 10 years, into modern robotics could create situations where mental capabilities outstrip physical capabilities, exposing a series of potential hardware bottlenecks that will need to be addressed as humanoid robots get smarter. Social, policy, and safety considerations, along with analogies to autonomous vehicles (AVs), help us understand the range of obstacles humanoid robots may face.
We believe that the use of digital twins or the training of humanoid robots in enclosed geofenced work cells provides a comparative advantage for humanoid robots in handling potential safety regulations relative to public streets.
Leveraging the strengths of Morgan Stanley's Asia industrials coverage, from Chinese and Japanese industrials to Chinese automotive suppliers, we gain an in-depth understanding of the inner workings of humanoid robots, analyzing component costs and future cost-reduction potential.
According to our estimates, the cost of building a humanoid robot can range from $10,000 to $300,000 depending on the configuration and downstream application. For example, based on price quotes from major module suppliers and proprietary analysis, we estimate the current bill of materials (BOM) of the Tesla Optimus Gen2 at $50,000 to $60,000 per unit, excluding software.
With economies of scale, AI algorithms that significantly shorten the R&D cycle, and cost-effective components from China, we see opportunities for significant cost reductions that could bring the selling price of Optimus to around $20,000, achieving Musk's goal.
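One common way to reason about how a $50,000 to $60,000 BOM could fall toward a roughly $20,000 selling price is an experience-curve (Wright's law) calculation; the sketch below uses an assumed learning rate and starting volume purely for illustration, not figures from the report.

```python
import math

# Wright's law: unit cost falls by a fixed fraction each time cumulative
# production doubles. All constants below are illustrative assumptions.
START_COST = 55_000     # midpoint of the $50,000-60,000 Optimus Gen2 BOM estimate
LEARNING_RATE = 0.15    # assumed 15% cost decline per doubling of cumulative volume
START_VOLUME = 1_000    # assumed cumulative units produced at the starting cost

def unit_cost(cumulative_units: int) -> float:
    doublings = math.log2(cumulative_units / START_VOLUME)
    return START_COST * (1 - LEARNING_RATE) ** doublings

for units in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{units:>9,} cumulative units -> ~${unit_cost(units):,.0f} per unit")
```

Under these assumptions, cost falls into the $20,000 range somewhere between 10,000 and 100,000 cumulative units, which is one way to read Musk's emphasis on scaling production quickly.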
Zuckerberg's latest 10,000-word insights: information feeds will shift more toward personal and AI interactions, and he is not optimistic about these three types of AI hardware
When: June 30, 2024
Source: Youxin Newin
Word Count: 9,587
Recently, Meta CEO Mark Zuckerberg had an in-depth conversation with technology creator Robin Kallaway to discuss how technology will develop over the next 10 years, especially the application of smart glasses, neural wristbands, and AI for creators and small businesses.
Zuckerberg discussed the future direction of smart glasses in detail, arguing that they will gradually replace mobile phones as the main personal hardware device. In the future, smart glasses will come in three types: a basic type without a display, an intermediate type with a heads-up display, and an advanced type with a holographic display.
In the future, AI will not be monolithic but diverse, allowing creators and small businesses to create customized AIs. This diversity will enhance the richness and personalization of user interactions. Smart glasses and neural wristbands will change the way people interact; Zuckerberg believes these technologies will let people stay focused on the real world while accessing information and interacting in a more natural and efficient way.
The following is the full text of the conversation
Robin Kallaway: Can you talk about Meta's strategy in the broader AI space? There will be thousands of creators who will hear this. They know AI, they know the players, they play with some tools, and I think it would be very helpful to hear you talk about it. What is the Meta AI program? How does it fit into the market?
Mark Zuckerberg: Our approach is very different from other companies, and you're going to see a lot of companies trying to build a major AI for you to use. Whether it's Google Gemini, or OpenAI ChatGPT. Our view is that we will have a basic Meta AI assistant for people to use. Our overall view is that there shouldn't be just one.
We believe that people want to interact with many different people and businesses, and that many different AIs need to be created to reflect people's different interests. A big part of our approach is to have every creator, and ultimately every small business on the platform, create their own AI to help them engage with the community and customers.
We think this will create a more engaging experience, one that is more dynamic and useful than a single thing, partly because not all of it is built by us.
We're building the underlying technology, and we want to make the underlying model we are building, Llama, world-leading. We want to go all out and try to build completely generic intelligence, build leading models, and I'm very happy with our progress.
An equally important part is building tools for creators and businesses that enable them to create AI that reflects themselves over time, creating a variety of different experiences.
That's what this week's AI Studio announcement is about, and it's an early beta that is an exciting step forward in bringing that vision to life.
Robin Kallaway: That's exactly what I want to talk about. This view of the web, maybe single-purpose, maybe a multi-purpose combination of agents customized for each creator and business, is where I think we're going in the future.
It is not controversial to say that in 10 years we will have more creators, and mainstream content will flow more through creators. Meta is already the main layer laying these tracks today. When you think about what the future creator experience looks like, what does it look like from a tactical, use-case perspective? What seems interesting to you?
Mark Zuckerberg: First of all, I completely agree with you. If you look at the broader trajectory of human history, more and more people have had the opportunity to pursue their creativity and interests, rather than doing work that they might find monotonous or just for the sake of work, and more and more of us are doing what we really love.
A lot of technology has been developed to achieve this, not only by increasing productivity for other tasks, but also by providing people with a variety of new tools.
It's definitely part of the future, and we want to build more tools so that more people, including people who don't consider themselves creators today, can be creative in some way.
Like I see with my kids: they don't think of themselves as creators, but they're definitely creating all sorts of things when they play with Lego. It's the old Picasso saying: every child is an artist, and the challenge is to remain one as you grow up.
Part of what we're going to do is build tools that allow everyone to do that. What trends am I seeing? When it comes to social media, there are a few big trends.
One is a shift from feed-based media to more personal messaging interactions. For example, if you look at Instagram, direct messaging is one of the fastest-growing parts of the system, and that's what I'm excited about with Creator Studio and AI Studio.
We're enabling people to create a persona for their own AI version to help them handle all the direct messages sent by the community. That's the classic problem, there's not enough time.
Every creator wants to interact with every fan who reaches out, and you simply don't have the time. And there are probably even more people who want to interact with a creator's content than those who actually send messages; a lot of people may not send a message at all, knowing the creator won't have time to reply.
The question is how good the AI agents that creators build for themselves will be. It will become an art form that evolves and improves over time, getting better and better.
For a lot of people, it makes sense as long as they know they're interacting with something created by the creator. It's probably not as good as interacting with the creator directly, but for many people that simply isn't available; the creator doesn't have enough time to respond. That's an important part, and we can dive into this topic.
Another fast-growing area is short video. There's been an interesting progression lately: from movies to long-form TV shows, then to what was once considered short-form video; people compared those to TV and movies, and today a lot of YouTube videos feel long compared to true short-form video.
You're probably multitasking while watching a YouTube video that runs several minutes. This trend is likely to continue; people have tools to create very engaging content that is compact and concise, and the speed at which these tools evolve will keep increasing.
People use AI to create and edit videos, and there's a lot of creativity in the process, and you'll need to polish what you're building, like a sculpture.
Fundamentally, it will become more accessible, and the quality of content will improve as people are able to try more ideas. Messaging and short video are the two megatrends I'm seeing right now.
Robin Kallaway: This barbell strategy, with super-raw, unedited long-form podcast conversations on one end and super-elaborate short stories on the other, seems to be going to both ends of the spectrum.
In terms of tools, I want to talk about AI Studio. Today is basically the foundation-laying stage; it's the foundational building block for how creators will interact with AI in the Meta world, and the first foundational tool.
Can you talk about your strategy and how it was implemented step by step? What's released today? Why are you excited about this? I can share my feedback as a test.
Mark Zuckerberg: We can talk about the gradual rollout of these tools; there are several technical paths going on simultaneously.
One is Llama development, tuning the underlying model, which we do as we get feedback; that's the core underlying infrastructure. On top of that, we're creating all of the product experiences and tools that enable people to create these different AIs, whether it's creators building an AI agent version of themselves to interact with their community.
At some point, we'll also be rolling out features that let anyone create user-generated content AI. It doesn't have to be created by yourself, it can be a new avatar that you want to exist on Instagram and interact with people on other apps.
Today we probably won't go into details, but in terms of business, the number of small businesses is no less than the number of creators right now, and that's a huge opportunity.
It should be very easy for any business to press a few buttons to create an agent version that can help you with customer support and e-commerce support, which will be very powerful.
Today we are rolling out the first beta phase, and we are trying to do it in phases. We started with about 50 creators and will gradually roll it out to a small group of people. As we adjust this, we'll roll it out gradually, probably in the next month or so, so that more people can interact with the AI created by these creators. Maybe by the end of July or August, we'll have a full rollout.
It's going to be a really fun experience to see how people enjoy interacting with these AIs, and building tools for creators is a big part of it.
Would love to hear your feedback on how you feel while using this tool, what works well and what needs improvement.
Robin Kallaway: I'd love to share. For me, as a creator, my parents ran a golf course in Ohio. They're a small business use case, they have an Instagram account, they get a lot of requests, they can't handle the volume of direct messages and messages.
My thought is, it's kind of like a spectrum, there's factual questions, there's a massive influx of these questions, and I'm sure you've had that experience too, it's hard to imagine, thousands of factual questions.
Like, do you have this link, or have you done this video, or where did your shirt come from? These are obvious questions that just need to be answered by cloning yourself, and that's the amazing thing about the initial use cases of this technology, and every small business and creator will want this.
At the other end of the spectrum, what I've been playing with is more of an opinion type question. For example, if someone asked, if you were me, how would you grow your brand? This is a multifaceted question that is difficult to answer in one sitting.
And that's where I find it interesting: watching the AI train on me and my responses, and honing it. If I can scale these opinionated responses, it serves one of my goals as a creator: building one-on-one trust, which is the only thing I care about, and where I've reached the limits of what I'm personally capable of.
One of the issues that comes to mind is that factual questions make a lot of sense; every fan will agree they just want the information. With opinions, some fans will worry that AI assistants may dehumanize the magical connection between creators and fans.
The question is, how do you build these AI tools that are incredibly useful for creators, while maintaining a real and human connection?
Mark Zuckerberg: The opinion question you mentioned is more of an art form. That is, the training process of these things.
The first is the basic Llama model. Creators, when setting up these AIs, have the opportunity to extract all sorts of information from their social media presence and any other information they want to train the system on.
Factual questions are relatively clear-cut; you can tell when we're getting them right and when we're not. With opinions, creators will have more views on how they want the AI to express their opinions. As for people knowing they're interacting with AI, that's one of our core design principles; we don't want people to think they're interacting with the creator himself.
We want it to be as high-fidelity as possible in reflecting the creator's intent, while also being very clear that it's AI, so there's no confusion. When you interact with your community yourself, you may feel you can be more liberal or take more risks in how you express things, and you may not want your AI to do that.
You can have it trained based on your social media content, and you may want to train it to be more cautious about certain things and avoid touching on certain topics until you feel more confident that it will accurately reflect your intentions.
These tools are very important, it's an art form, and we don't know what is the most engaging, trust-building formula when we start out. We wanted to provide tools for people to experiment and see what works best in the end.
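To make the idea of tuning what a creator AI will and won't opine on more concrete, here is a hypothetical persona configuration in the spirit of what Zuckerberg describes; the field names and structure are invented for illustration and are not the actual AI Studio schema.

```python
# Hypothetical creator-AI persona configuration -- not the real AI Studio schema.
creator_ai_config = {
    "display_name": "Creator AI (clearly labeled as an AI, not the creator)",
    "training_sources": ["public_posts", "reel_captions", "dm_faq_answers"],
    "tone": "friendly, concise, first-person",
    "factual_questions": "answer",          # links, gear, past videos, etc.
    "opinion_questions": "cautious",        # hedge until the creator trusts the outputs
    "blocked_topics": ["medical advice", "legal advice", "unannounced projects"],
    "disclosure": "Every reply states that it comes from the creator's AI.",
}

def allowed_to_answer(topic: str) -> bool:
    # Simple guardrail check of the kind a creator might configure.
    return topic not in creator_ai_config["blocked_topics"]

print(allowed_to_answer("where is your shirt from"))  # True
print(allowed_to_answer("medical advice"))            # False
```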
Robin Kallaway: What I really want to talk about is what AI Studio could do in the future, so let me give you a really cool example.
When you watch a video clip, if someone can click on your shirt or my hat or this light, there's immediately an AI layer that identifies the brand and the product, automatically adds it to a cart, and automatically tracks the rebate. This invisible layer could really help with monetization.
Millions of Reels are created every hour, so supporting this at the infrastructure level is difficult today, though it may be on the roadmap. What future AI Studio features haven't we discussed yet that you think could be achieved in the next 3 to 5 years?
Mark Zuckerberg: Understanding what different items are, and applying that to all posts over time, we'll get there. We have an early version of multimodal AI on the Ray-Ban Meta glasses: you can say, Meta, take a look, what is this thing? It tells you what you're looking at and can answer related questions.
This will only get better as the Llama model continues to improve and as we fully roll out the next version; it's a big feature. There are a lot of things like that. Being able to automatically translate and dub content is something I'm very interested in for the future.
English speakers often forget that many people in the world don't speak English. Being able to automatically translate everything into other languages, so it's accessible to more people, would be very powerful, if it felt authentic and felt like you were actually speaking that language.
These are some very exciting ideas, and they're different from AI Studio; they're different applications of AI in content, content understanding, content translation, and so on.
For AI Studio, it's going to be an ongoing evolution of how to give creators more tools to tweak the experience, make it more fun, and build trust, and we're adding different modes.
At first it's text; over time there will be video and audio, and eventually you can make it 3D, so you could appear as a hologram in someone's living room, which would be really cool. We're focused on the metaverse and everything around embodied presence; that's our natural path, and we're trying to make it possible for creators to interact with people in a more natural way.
Robin Kallaway: I really like the concept of the agent network that you mentioned. When I was playing with this, I was thinking: this is level one, what is level two? An example is a market research agent, where I try to figure out what video to make or what product or course to offer to the community.
If I had an agent who could go out and have a one-on-one conversation with 5% of the audience, dig into their pain points, and automate those things, it would be awesome. You stack these little use cases on top of each other, and all of a sudden, you have a set of agents, AI, or bots that are really valuable to creators.
Mark Zuckerberg: That's a good point. In the business environment, it's clearer that businesses need customer support, and the next level is being able to aggregate analytics to understand all the things that people need to support and improve them. For creators, there are similar versions, like, how my community likes to interact with my content, what the different feedbacks are, and how I incorporate those factors into my creative process or business model.
Robin Kallaway: I'm a big fan of talking about all of these use cases in the future, we're both technology optimists, and I believe that almost all of these technologies, in the long run, will have a net benefit for humanity.
There's a big question that I'm sure you hear a lot about as well, from friends, family, and I hear a lot in the comment section. Many people have a lot of fear and uncertainty about AI, mainly creatives and artists who are afraid that AI may replace them and take away their jobs.
I think of my brother, who is a great animator; he's very worried about whether his education and training will become meaningless.
For this group, it would be very helpful to hear how you think about an AI-driven future. What does it mean for them? Can you offer them a mindset or framework?
Mark Zuckerberg: In the future there will be more creative jobs than there are now, and you can look at the human trajectory, where most people used to be farmers, and now we don't need so many people in agriculture, people can pursue more creative things.
As technology evolves, so do the tools we use. The key to being a talented person is to stay up to date with these tools. Fundamentally, there will be more creative opportunities in the future, along with more powerful tools that will enable people to do just that.
Taking a step back, one thing that baffles me a little is the narrative some people in the industry are pushing: that there's going to be one AI that can do everything. I don't think things are going to go that way.
I understand that if you're in an AI lab where you might want what you're doing to be very important, you feel like we're building the only thing for the future, which is not the case.
It's not like people have only one app on their phones, and people just want to get all the content from one creator. People value diversity, which creates a sense of richness and learning and progress in our lives, experiencing different kinds of things.
I'm a big believer that there won't be just one AI in the future, there will be a lot of different AIs that will allow a lot of different people to create different things, and that's part of the reason why I believe in open source.
I don't think AI technology should be hoarded, and only one company can use it to build their central product. If you believe in the best experience and the best future, there will be a lot of different AI and a lot of different experiences, and you want it to spread in a variety of ways.
Part of this is building tools so that creators and platform users can create their own AIs, like the user-generated AI case, as well as all the business use cases.
The other part is open source, so that other companies can also create different things and people can play around with and modify them. At a pretty deep level, I feel very uncomfortable when people in the tech industry talk about building the one and only AI; they seem to think they're creating God. That's not what we do, and I don't think things will go that way. Ten years from now we'll have different tools than we do today, just as we use different tools today than we did 10 years ago, and there will be more creative work in the future than there is today.
Robin Kallaway: I think of digital cameras. There were a lot of photographers at the time; digital cameras appeared, and taste in photography still mattered. They just got a better tool that could be used for different use cases. I couldn't agree more with what you said: a lot of companies are trying to build this kind of closed, all-in-one platform, and to me that feels more limiting than beneficial.
Mark Zuckerberg: It's not going to create more value in the world; it's a bit of a weird ideology. From my perspective, that's not the way to create the best experiences for people. You want to unlock and unleash as many people as possible to try different things; that's the nature of culture. It's not a few people having a monopoly on everything; you want all sorts of different ideas out there.
Robin Kallaway: Absolutely. I'd like to go back to the Ray-Ban Meta glasses you mentioned earlier; I was very impressed when I used them. The combination of audio, camera quality, and multimodal AI far exceeded my expectations.
I've heard you share a framework in which devices like smart glasses could become the next generation of phones, and devices like the Quest 3 could become the next generation of computers and TVs, and that makes a lot of sense to me. As a product builder, I'm curious: what else do Ray-Ban Meta or smart glasses need for billions of people to prefer glasses over mobile phones as their primary hardware device?
Mark Zuckerberg: That's an interesting question.
If you had asked me a few years ago, or even a year and a half ago, I would have said that we need to get to holographic AR for this to become the main platform people use, and we're working on that.
We answer this question from two different directions.
For Ray-Ban Meta, we asked ourselves: if we take only the best glasses form factor available today, how much technology can we cram in without compromising the form factor, weight, and so on? That's what we got with Ray-Ban Meta.
From the other direction, we want to create holographic AR. That still needs to be glasses, not a headset, though the frames may be slightly thicker because you're cramming in more technology.
The prototype version we have, I'm excited about, and we're getting closer to being able to show it. It's not the most fashionable thing, but it's good.
It is undoubtedly glasses, not a headset. Over time, the two paths will merge. I used to think we needed holograms for a sense of presence, but AI has taken such a leap forward that even a simpler product will win people over faster.
Demand for the Ray-Ban Meta product is much higher than we expected. On the one hand, you like to see that; on the other, it's a bit frustrating that we're not producing enough.
It's sold out in most styles, you can still buy the basic black model, and a lot of people want other styles that are already sold out. We are accelerating our production and factory lines to make more products.
My view now is that there will be a lot of these products. You can use the camera, microphone, speakers, and multimodal AI to create a great experience even if the glasses don't have any display. Interestingly, a display may not suit everyone right away; it adds weight to the glasses and makes them more expensive.
You can buy Ray-Ban Meta for $300; adding a holographic display would significantly increase the cost, assuming we could even fit it into the form factor we want.
I still think that for people who can afford more expensive equipment and don't mind a little heavier, they might want holographic functionality. In 10 years, we'll have really small holographic devices, and it's going to be great.
In the near future, I think many people will prefer Ray-Ban Meta-style products. We're going to keep this form factor, which will get smaller and smaller over time; it's a very exciting evolution. Basically there will be three different products:
1. No display: Glasses without a display can only perform AI operations, capture content, listen to audio books, music, answer calls, etc.
2. Non-holographic: A non-holographic display, meaning it won't cover your entire field of vision like a hologram. It's likely to have a bit of a heads-up display, which opens up a lot of interesting use cases. You can receive notifications, send messages, talk to the AI, answer questions, and not only hear, but also see, which provides higher bandwidth. It's going to be exciting, and there's a lot of use for a small screen, even if it's just a small head-up display.
3. Hologram: The top version is a full-field-of-view holographic display. In our future conversations, you could be a hologram sitting on the sofa in my living room, or feel like you're actually here, not just a video call, not just a hologram on a screen. We could interact: if you want to play cards, we can have a holographic deck; we could mess around with the same objects, create art or content together, or draw on a whiteboard. You could do all of these things. It's going to be pretty crazy, and that's still the ultimate direction. I'm more optimistic now that this will be a big deal even before we get there.
Robin Kallaway: Tony Stark's glasses feel like the ultimate use case. I've heard you mention a wristband that captures subcutaneous nerve signals, and I think that's pretty cool as well.
Mark Zuckerberg: It's a neural wristband, a neural-interface wristband. When people hear about neural interfaces, I think their first reaction is that it must be something implanted in your brain. Most people don't want anything implanted in their brains; I agree.
Your brain sends signals to your body through the nervous system, which is how you activate all the muscles. It turns out that there are many different pathways that are not being used in the normal functioning of your body.
You can wear a wristband that you train to capture the signals your brain sends down those different pathways, by moving your hand differently than you normally would. Eventually it will reach a point where you can communicate through this neural interface without visibly moving your hand at all; it will start with some simple movements.
In the coming years, you'll be able to type with it, control a cursor, do all kinds of things; it will be pretty crazy. Combined with glasses, it opens up a range of amazing use cases, even with just the heads-up display version, or even no display at all.
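For intuition about how a wrist-worn neural interface might turn muscle signals into input events, here is a heavily simplified sketch in Python; the signal features, gesture classes, and thresholds are invented for illustration and do not describe Meta's actual device, which would rely on trained classifiers over multi-channel sEMG rather than a single threshold.

```python
from statistics import mean

# Toy decoder: map a window of simulated wrist-EMG-like samples to an input
# event. Purely illustrative; real decoding is a learned, multi-channel problem.
GESTURE_ACTIONS = {
    "pinch": "select",       # e.g. confirm / click
    "swipe": "move_cursor",
    "rest": None,
}

def classify_window(samples: list[float]) -> str:
    energy = mean(abs(s) for s in samples)  # crude signal-energy feature
    if energy > 0.8:
        return "pinch"
    if energy > 0.3:
        return "swipe"
    return "rest"

def decode(samples: list[float]):
    return GESTURE_ACTIONS[classify_window(samples)]

print(decode([0.9, 1.1, 0.7, 1.0]))      # -> 'select'
print(decode([0.05, 0.02, 0.04, 0.01]))  # -> None (rest)
```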
You can sit there and you can send a message to someone or AI wherever you are. Once again, this isn't just for Meta, over time it will work for all the different AI, all the creators' AI, whoever you want to interact with.
You can sit there and send messages silently and discreetly, and you can hear the answer in your ear, or if you have the monitor version, you can see the little text pop up, and I think it's going to be pretty crazy.
For me, the best thing about it is that I prefer to be face-to-face. One of the best parts of group conversations on Zoom is that you can have a whole back channel, chatting with a subset of people while the main meeting or conversation is going on. Sometimes in a meeting I have a question I want to ask somebody, but I don't want to ask it in front of everybody, so I have to wait until the meeting is over.
In a digital meeting, you can send a message to someone while the meeting is in progress, such as a message on WhatsApp. It would be great to be able to do this through glasses when interacting face-to-face. It will be very powerful that you can interact with people and also get information during conversations to make things more efficient.
Robin Kallaway: One of the things I used to underestimate is how much phones break your focus. When you're using your phone, you're completely inside the phone, not out in the world. With glasses, even without a heads-up display, you aren't pulled away and you stay focused, and you can get information from the real world and the digital layer at the same time. The wristband combined with glasses is a very interesting balance.
Mark Zuckerberg: You ask when this will replace mobile phones.
In the history of technology, new platforms usually don't make people stop using the old stuff entirely; you just use it less. Many things people do on their phones today they used to do on their computers, and the phone is simply more convenient. You don't turn on your computer as often, you don't walk over to your desk, you just do it on your phone. Even when I'm sitting at my desk, I do a lot of things on my phone that I would have done on my computer 10 to 15 years ago, and glasses will evolve the same way.
It's not that we stop using our phones; they just stay in your pocket more. You'll take it out when you need it, and more and more people will say: I can take pictures with my glasses, ask the AI questions, or send messages, and it's more convenient. I wouldn't be surprised if we still have mobile phones in 10 years, but the way we use them will be much more intentional rather than the default for every need.
Robin Kallaway: Every few years, society goes through some disruptive consumer tech moment. I think of the first time I searched on Google, the first time I received a friend request on Facebook, or the first time I used Uber, and people can almost remember the moment when they first experienced these transitions.
One of the coolest parts of your job is that you can experience disruptive technologies 5~10 years in advance. When we talk about eyewear, I'm curious about what other technologies are on Meta's roadmap for the next 10 years that have the potential to be disruptive consumer tech moments?
Mark Zuckerberg: I totally agree with you, it's one of the best parts of technology. A lot of other areas where you can do the same thing for a long time, and in technology, every once in a while there are some new opportunities that come up and you need to rethink what you're doing, which is exciting.
I don't know, we've talked a lot about it.
Glasses will be a big deal, and we're almost ready to start showing holographic glasses prototypes. We're not going to sell the prototypes widely; we're going to focus on building the full consumer version, but we'll start showing the prototypes to people, which is crazy. I've shown it to people and they were very excited, and I'm really looking forward to showing it to more people.
The neural wristband is also crazy: you can input information through subtle hand movements, eventually just by imagining how to move your hand.
These will become richer over time. It usually takes until the second or third version for these things to really get debugged and for people to understand them; we saw that with the second version of Ray-Ban Meta, which was a really big hit. The neural wristband will be a big deal, and depending on how things progress, it could be used primarily as an input for glasses and mixed-reality headsets, or eventually become a standalone platform. You can imagine a world where you control all the devices and computers in your home with a neural wristband; over time it will be really cool.
In terms of AI, the pace of progress is phenomenal. We went from Llama 2 in the summer of 2023, which was not the most advanced model, to Llama 3, which is close to catching up with the best models. It's open source, available to a lot of people, and it's unlocking a lot of great stuff.
We've started working on Llama 4 and I'm excited about it; each version adds more modalities. Llama 3 added more image capability and some voice, and Llama 4 will go deeper on those and add some reasoning ability.
This way, when you're a creator and you're editing something, you don't need to describe what you're doing in great detail. You can communicate back and forth and try different ideas, which will be very engaging.
It will be really cool when it goes from a turn-based chatbot to an agent where you can give intentions and complete more complex tasks.
I don't know, but I do think the journey we've started on the creator-AI side will be an interesting start. A lot of people are experiencing these AIs now, mostly the main AIs built by a few companies. Creator AIs, along with some of the business AIs we've launched, will get people interacting with more diverse experiences, which will add real richness to the ecosystem as a whole, which is fantastic.
Robin Kallaway: I totally agree, sometimes it's just a new app or packaged differently. Like the theme we talked about, from singular to diverse.
Here's an interesting question, on the creator side, I don't think I've ever heard you answer it.
You are a very prolific creator: you design your own clothes, you make music. I've heard about Mark's Meats and all your hobbies.
A question that a lot of creators are asked, and that I'm often asked: what would I do if I were building a brand online from scratch, or building a brand around a hobby? I'm curious how you would answer it.
Let's say you're just Mark the guitarist, or Mark the founder of Mark's Meats. You know everything you need to know about worldbuilding and fan psychology, but we take away all your resources. How would you think about this challenge? How would you, as an entrepreneur, try to build a meaningful brand online?
Mark Zuckerberg: It's interesting, we've almost been reflexively trained to think about these things now.
For the Mark's Meats example, I wasn't trying to create a business around it; I instinctively thought about the story behind it. We don't just raise cattle; we try to raise high-quality cattle in Hawaii, we feed them a unique macadamia-nut meal, and we brew beer for them to drink. We're vertically integrated: we grow the macadamia trees, we brew the beer. I find that's the fun part of it; everyone does things in their own unique way.
Maybe one day when I retire, Mark's Meats will become a commercial operation. Right now we're just trying to do this great thing, and part of what makes it great is having a narrative and a story around it. Social media and other online tools help people tell that story, which also trains us to think about what the narrative and the story of things are. It's an interesting question.