Reported by Heart of the Machine
Heart of the Machine Editorial Department
The next form of the large model will no longer be rounds of instant Q&A between a person and a model.
This could be a turning point in the form of AI technology.
On July 29, local time, at the 51st SIGGRAPH computer graphics conference in Denver, United States, NVIDIA founder and CEO Jensen Huang and Meta founder and CEO Mark Zuckerberg sat down for a closely watched fireside chat.
The two tech leaders, both founder-CEOs and drivers of generative AI, spent the hour-long conversation praising open source, jabbing at Apple, and discussing the future of generative AI applications, and even swapped leather jackets at the end.
The conversation revolved around generative AI technology and how it can be used in a variety of fields.
Huang and Zuckerberg talked about how generative AI could change social media's recommendation systems. Soon, these systems will recommend content based on your interests, and social media may evolve into systems that synthesize what is happening in the moment, generating content for you on the fly. As models become more general, we may also say goodbye to jumping between apps, with Facebook or Instagram all served by the same AI model.
When Llama 3 is upgraded to Llama 4 and beyond, the interactive experience will quickly move past turn-by-turn conversations with a chatbot. You will give the model an intent, and it will handle tasks that take a long time to think about or execute, some taking months to process, with the AI reporting back to you when the task is complete.
Zuckerberg believes research on foundation models is still accelerating, and that even if progress on foundation models stopped today, the AI industry would still have five years of product innovation ahead of it.
When asked why Meta insists on open source, Zuckerberg said the conviction stems from the openness of Microsoft's Windows in the PC era. After the rise of smart mobile devices, Apple's closed ecosystem repeatedly killed features Zuckerberg wanted to build partway through, frustrating him to the point of swearing on stage.
At the same time, Zuckerberg and Huang agreed that open source is not pure altruism but smart business strategy. Meta's open-sourced Open Compute designs became industry standards, the supply chain aligned itself to Meta's designs, and the experience of building an entire ecosystem for the industry has saved Meta billions of dollars.
Zuckerberg and Huang diverged on whether people will prefer the most powerful general-purpose model or smaller, specialized models. Huang goes straight to the top of the line: "NVIDIA doesn't care about saving those pennies. Our goal is to achieve the highest-quality results."
Zuckerberg's "metaverse" in mind, he didn't give up. Meta's AI and Llama projects, as well as R&D in augmented and virtual reality, are actually building Horizon OS, an open operating system designed for mixed reality.
He believes the next computing platform will split into smart glasses and VR/MR headsets. Smart glasses will be the next "mobile phone", the beginning of the next generation of smart mobile devices, while VR and MR headsets will become workstations or game consoles that take on more intensive computing tasks. Meta is working toward its dream holographic AR glasses, which will integrate highly interactive AI and be sleek enough to double as everyday fashion.
At the end of the conversation, Huang posed a question to himself: living in an era when the entire computing stack is being reshaped, how should we think about software? What excites him most is the possibility of a "Jensen AI": an AI agent modeled on himself that can be fine-tuned through conversation. Zuckerberg described the same vision in this conversation: Meta is building Creator AI and AI Studio, products that will let everyone customize an AI agent to their own needs.
The following is a transcript of the conversation:
Jensen Huang: Mark, welcome to your first SIGGRAPH! Can you believe it? Here is one of the pioneers and leaders of computing, and I had to be the one to invite him to SIGGRAPH. Thank you very much for joining us.
Zuckerberg: Actually, I've been walking around the venue for more than five hours already.
Jensen Huang: It's SIGGRAPH, you know, 90 percent of the people here are PhDs. What makes SIGGRAPH so important is that it's the conference for computer graphics, image processing, artificial intelligence, and robotics. Over the years, many companies have come here to present and absorb new research: Disney, Pixar, Adobe, Epic Games, and of course NVIDIA.
This year we had 20 papers accepted at this conference, mainly in artificial intelligence and simulation. AI helps us simulate physical environments at larger scale and faster, for example with differentiable physics, and we use simulation to build environments that generate synthetic data. We're proud of this work.
When it comes to Meta, what you're doing in the AI space is well known. One of the things I find interesting is to see how the media has covered Meta's AI efforts in recent years — including those that FAIR has done. We're all using PyTorch from Meta. You've done a lot of groundbreaking work in areas like computer vision, language models, real-time translation, and more.
My first question is: how do you see the advances in generative AI at Meta today? How are you applying it to enhance your operations or introduce new capabilities?
Zuckerberg: Meta first came to SIGGRAPH about eight years ago, so we're newcomers compared to NVIDIA.
Jensen Huang: You're the visitors here. This is my territory.
Zuckerberg: Definitely, happy to be invited onto your territory (laughs). I remember back in 2018 we showed some early gesture-tracking work for our VR and mixed reality headsets. We've also talked a lot about the progress we've made on codec avatars: we want photorealistic avatars driven from consumer headsets, and we're getting closer to that goal. We're very excited about it.
In addition, we've done a lot of display-systems work, prototyping and research on future technologies to make mixed reality headsets very thin. That means quite advanced optical stacks, display systems, integrated systems. These are the kinds of things we usually show first here.
It's a pleasure to be here this year, and we're not just going to talk about the metaverse stuff, we're going to talk about all the AI stuff. Like you said, we started FAIR, the AI research center, back when we were Facebook, now Meta, and then we started Reality Labs. We've been working in this area for a while.
All the generative AI stuff is an interesting revolution, and I think it will end up changing all of our products in interesting ways. Take the product line we already have: the feed and recommendation systems in Instagram and Facebook. We've been on a long journey with those. They evolved from simply connecting you with friends. Ranking has always been important, because even if you only follow friends, when someone does something really important, like your cousin having a baby, you want that at the top. If we buried it somewhere down your feed, you'd be very angry with us.
That's why recommendation ranking matters. But over the past few years, it has reached the point where most of the ranked content is public content: not hundreds or thousands of candidate posts from your friends, but millions of potential candidates. That becomes a very interesting recommendation problem. With generative AI, I think we'll quickly reach a new phase where what you see on Instagram is no longer just what the people you follow have posted. Based on your interests, we'll recommend content you might like even if it comes from people you don't follow.
I think a lot of content will also be created with these tools in the future. Some of it will be creators using the tools to make brand-new content, and some of it, I think, will eventually be content generated for you on the fly, or assembled and synthesized from different things that already exist. That's just one example of how a core part of what we do, something that's been evolving for 20 years, is going to keep evolving.
Jensen Huang: Some people think that one of the world's largest computing systems to date is a recommender system.
Zuckerberg: Yes. It's developed along a completely different path, and it gets less attention than what people usually talk about. There's a lot of talk about transformer architectures, and it's converging on something similar: building more and more general models.
Jensen Huang: Embedding unstructured data into features.
Zuckerberg: Yes, and one of the big drivers of quality improvements is that we used to run a different model for each type of content. A recent example: we had one model for ranking and recommending short-form videos and another for longer videos, and then some product work on top to stitch together everything the system displays. But the more general the recommendation model you build, the more it covers and the better it gets.
Part of the logic is almost economic: it's about the liquidity and availability of content. When you can draw from one much wider pool of content, you avoid the inefficiencies of handing off between different content sources. As the model gets bigger and more general, it gets better and better. I dream of a day when you could imagine all of Facebook or Instagram being, like, a single AI model that unifies all the different content types and systems, which today actually have different objectives over different time frames. Some of it is just showing you the interesting content you want to see today, but some of it is helping you build out your long-term social network: people you may know, accounts you might want to follow.
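(Editor's note: to make the "one model covering every content type" idea concrete, here is a toy PyTorch sketch, our illustration rather than Meta's actual architecture, of a single ranker that scores heterogeneous content through shared embeddings instead of one model per format.)

```python
# Toy unified ranker: one model scores any (user, item, content-type) triple.
# Illustrative only -- not Meta's production architecture.
import torch
import torch.nn as nn

class UnifiedRanker(nn.Module):
    def __init__(self, num_users, num_items, num_content_types, dim=64):
        super().__init__()
        self.user_emb = nn.Embedding(num_users, dim)
        self.item_emb = nn.Embedding(num_items, dim)
        # One embedding table covers short video, long video, photo, text post...
        self.type_emb = nn.Embedding(num_content_types, dim)
        self.scorer = nn.Sequential(
            nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, 1)
        )

    def forward(self, user, item, content_type):
        x = torch.cat(
            [self.user_emb(user), self.item_emb(item), self.type_emb(content_type)],
            dim=-1,
        )
        return self.scorer(x).squeeze(-1)  # one relevance score per candidate

ranker = UnifiedRanker(num_users=1000, num_items=5000, num_content_types=4)
score = ranker(torch.tensor([7]), torch.tensor([42]), torch.tensor([1]))
print(score.item())
```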
Jensen Huang: These multimodal models tend to be better at recognizing patterns and weak signals. People are always surprised at how deep AI runs at Meta. You've been building GPU infrastructure to run these big recommender systems for a long time.
Zuckerberg: Actually, we were a bit slow to adopt GPUs compared with the rest of the industry.
Jensen Huang: You're my guest, and I just want to try to be friendly.
Zuckerberg: Yes, so friendly (laughs). Backstage just now, you were still going on about owning up to mistakes.
Jensen Huang: You didn't have to blurt that out.
Zuckerberg: Well, we tried hard, and once we did, we broke through very quickly.
Jensen Huang: And you've done it in a big way. Now, the coolest thing about generative AI: when I use WhatsApp these days, I feel like I'm co-creating with it. Imagine me typing away and it generating images as I type; if I go back and rephrase my words, it generates different images. For example, "an old Chinese man enjoying a glass of whiskey at sunset with three dogs, two Golden Retrievers and a Bernese Mountain Dog," and the AI generates a very beautiful picture.
Zuckerberg: I've spent a lot of time with my daughters over the past week, imagining them as mermaids or something. It's funny. Yes, that's one aspect of generative AI. A lot of the new generation of AI stuff, I think it's going to be a major upgrade to all the workflows and products that we've had for a long time. On the other hand, we can create a lot of completely new things.
At Meta, we want to provide an AI assistant that helps you with different tasks. In our world, it will be very creative, like you said, but also general: over time, it will be able to answer any question.
I think when we move from the Llama 3 class of models to Llama 4 and beyond, it's not going to feel like a chatbot anymore, where you give it a prompt and it responds, turn after turn. I think it's going to evolve very quickly toward: you give it an intent, and it can go off and carry out tasks over multiple time frames, handling work that requires long thinking or execution rather than just instant responses.
If some of the things I have in mind come to fruition, it will kick off tasks that need a lot of compute and could take weeks, months, or even longer to complete. Then at some point it will come back and report the results to you, as if something had happened out in the world. I think that's going to be very powerful.
Jensen Huang: Today's AI is turn-based. Whatever you say, it will get back to you. But obviously, when we think, when we are given a task or a problem, we think about multiple options, or we come up with a tree of options, a decision tree, that simulates in our minds the different outcomes of each decision that might be made. So we're planning. In the future, AI will do the same.
I'm really excited about your vision of Creator AI. Frankly, I think it's a killer idea. Can you tell us more about Creator AI and AI Studio?
Zuckerberg: I don't think there will be just one AI model. Some other companies in the industry are building as if there will be, a single central agent.
Of course, there will be a kind of Meta AI assistant. But our vision is that everyone who uses our products should be able to create agents of their own, whether that's the millions of creators on the platform or hundreds of millions of small businesses. We ultimately want them to be able to quickly stand up a business agent that can engage with customers, handle sales, do customer support, and so on. So one of the products we're starting to roll out now is AI Studio, a set of tools that will eventually make that vision work: every creator can build their own version of an AI, an agent or assistant their community can interact with.
It solves the fundamental problem that nobody has enough hours in the day. If you're a creator, you want to interact more with your community, but your time is limited, and your community wants more interaction with you. So the next best thing is an agent trained on your material that represents you the way you want to be represented. I think of it as a very creative artifact, like the artwork or content you create and post on social media. To be clear, it's not a direct interaction with the creator himself; it goes through an agent. But I think it becomes another interesting channel, like posting content on these social systems: creators engaging through these agents.
Again, I think people will create their own agents for all sorts of purposes. Some will be custom utilities for things they want to get done, agents they fine-tune and train themselves. Some will be entertainment: things people create that are funny, or silly, or have an amusing attitude about something, things we probably wouldn't build into Meta AI as an assistant, but that people are clearly interested in seeing and interacting with.
An interesting use case is people using these agents for support. One thing that surprised me a bit is that one of the main ways people use Meta AI is to role-play difficult social situations they're about to face. For example: I want to ask my manager for a promotion or a raise; or I had a fight with my friend; or I'm in a difficult situation with my girlfriend. How should that conversation go?
It gives you a completely judgment-free zone where you can role-play the conversation, see how it might go, and get feedback. But a lot of people don't want to interact only with the same single agent, whether that's Meta AI or ChatGPT or whatever else people are using; they want to create their own things. So that's the general direction of AI Studio. It's part of our larger vision that there shouldn't be just one big AI that everyone interacts with. The world will be a better and more interesting place with diversity.
Jensen Huang: I think that's really cool. If you're an artist with a unique style, you can take your style and your whole body of work and fine-tune an AI model on it. Then the model creates in your style. You can direct it: make something in my art style; here's a painting or a sketch for inspiration, generate new artwork from it. People could come to your AI bot, or just use your AI, to do that.
In the future, every restaurant and every website may have an AI like this, providing personalized service and content based on each user's needs and preferences.
Zuckerberg: Yes. Just as every business today has an email address, a website, and one or a few social media accounts, in the future every business will have an AI agent that interacts with its customers.
In the past, some of this was hard to do. Look at any company: customer support sits in a separate department from sales, and as a CEO that's not how you'd want it to be. It's just that they're different jobs.
When you're the CEO, you have to do everything. When you put those abstractions into an organization, the organizations stay separate because they're optimized for different things. But I think the Platonic ideal is that it should all be built around the customer. When you're trying to buy something, you don't want to face several different paths to purchase; if you hit a problem, you want one place to go to get your question answered and to be able to interact with the business however you like. I think the same applies to creators.
Jensen Huang: That kind of customer interaction, especially handling complaints, will make your company better. The AI captures your company's institutional knowledge of how to handle problems, and all of it can feed analysis that improves the AI, and so on.
Zuckerberg: So, from a business perspective, I think this requires more integration, and we're still in the early stages. But AI Studio allows people to create their own UGC agents and other stuff and start kicking off this flywheel effect of creators creating agents. I'm very excited about it.
Jensen Huang: Can I use AI Studio to fine-tune my model with my image set?
Zuckerberg: Absolutely, we'll provide that ability.
Jensen Huang: Okay. And then I can load everything I write onto AI Studio so it can work as a replica of me?
Zuckerberg: Absolutely.
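(Editor's note: AI Studio's internals are not public. As a rough illustration of the pattern Huang is describing, adapting an open Llama checkpoint to one person's writing, here is a minimal LoRA fine-tuning sketch using standard open-source tooling; the model id, data file, and hyperparameters are our assumptions, not AI Studio's actual stack.)

```python
# Minimal sketch: LoRA-fine-tune an open Llama checkpoint on your own writing.
# Assumes access to the gated Hugging Face checkpoint and a large-memory GPU.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "meta-llama/Meta-Llama-3.1-8B-Instruct"   # illustrative choice
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Train small low-rank adapters instead of all 8B weights.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
))

# "Everything I write", one sample per line in a plain-text file.
data = load_dataset("text", data_files="my_writing.txt")["train"]
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                remove_columns=["text"])

Trainer(
    model=model,
    args=TrainingArguments(output_dir="my-replica", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```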
Jensen Huang: And every time I come back to it, it loads again and remembers where we last left off, so we continue our conversation as if nothing had happened?
Zuckerberg: Yes, and like any product, it will get better over time. The training tools will get better too. It's not just about what you want it to say; creators and businesses generally have their own priorities, and it will get better along all of those dimensions.
I think the Platonic ideal of this AI isn't just text; it's every modality you can imagine, which intersects a bit with the codec avatar work we've been doing for a long time. You'll want to video chat with the agent, and I think we'll get there over time.
I don't think these things are that far away; the technology flywheel spins very fast. So it's exciting, and there's a lot of new stuff to build. I think even if progress on foundation models stopped now, and I don't think it will, we'd have five years of product innovation just figuring out how to make the most of everything that has been built so far. But in fact, progress in foundation models and foundational research is accelerating. It's a pretty crazy time.
I gotta say, you've made this possible.
Jensen Huang: Thank you! You know, we're CEOs, we're delicate flowers and need more encouragement.
Zuckerberg: By now we're pretty seasoned. I think we're two of the longest-serving founders in the industry, right?
Jensen Huang: Yes.
Zuckerberg: Your hair has gone gray; mine has just gotten longer.
Jensen Huang: Yes, mine's gone gray, and yours has gone curly. What's going on?
Zuckerberg: It's always been like this. I just used to keep it short.
Jensen Huang: If I had known the road to success would be this long...
Zuckerberg: Then you probably wouldn't have taken that path at all.
Jensen Huang: No, I probably would have left college early, like you.
Zuckerberg: But we have very different personalities.
Jensen Huang: You got your start 12 years earlier than I did, which is much faster. (Editor's note: After graduating from Oregon State University in 1984, Huang worked at companies including AMD; Zuckerberg started Facebook as a Harvard undergraduate and dropped out.)
Zuckerberg: But what you've done is great.
Jensen Huang: Well, that's all in the past. So, I love your vision that everybody can have an AI and every business can have an AI. At NVIDIA, I want every engineer and every software developer to have an AI, lots of AIs.
One thing I like about your vision is your belief that everyone and every company should be able to build their own AI, and you've actually open-sourced yours. When you open-sourced Llama, I thought that was great. By the way, I think Llama 2 was probably the biggest event in AI last year.
Zuckerberg: I think [the biggest thing] is the H100.
Jensen Huang: It's a chicken-and-egg issue.
The reason Llama 2 was the biggest event is that when it came out, it activated every company and every industry. Suddenly every healthcare company was building AI; every large company, small company, and startup was building AI. It let every researcher engage with AI again, because they had a starting point to build from.
Now that Llama 3.1 has been released, there's a lot of excitement, and as you know, we're working together to deploy it, rolling it out to enterprises around the world. It's exciting, and I think it will support a wide range of applications.
Tell us about your open source philosophy. You've open-sourced PyTorch, Llama 3.1, and the Llama family, and you've built a whole ecosystem of open source, but what did you think about it at first?
Zuckerberg: On open source, Meta got started relatively late. By the time Meta was building distributed computing infrastructure and data centers, other tech companies had already built theirs out, so it wasn't a competitive advantage for us. In that case, you might as well open it up and benefit from the ecosystem around it. The biggest example is the Open Compute Project: Meta published its server designs, network designs, and eventually even its data center designs. Once those became industry standards, the whole supply chain was effectively organized around Meta's standards, so open source has in effect saved Meta billions of dollars.
Jensen Huang: Open Compute is also what made NVIDIA HGX systems compatible with every data center.
Zuckerberg: And thank you, NVIDIA, for that great experience. Having seen the benefits with Open Compute, we took a similar open-source approach with foundational tooling like PyTorch. So by the time the Llama project came along, Meta naturally leaned toward actively open-sourcing its AI models.
There are a few other angles on Meta's open source. First, our products have to live with the fact that we distribute our apps through competitors' mobile platforms. In the smartphone operating system race, Apple dominates with its closed ecosystem, and Apple sets all the rules of the game. There are more Android phones by volume, but Apple essentially controls the market and all the profits, and Android is effectively following Apple's lead. Looking back at the PC era, though: Microsoft was clearly not a fully open company, but Windows ran on every OEM's hardware and with all kinds of software and hardware, a far more open ecosystem than Apple's. Windows was the leading ecosystem. In the PC era, the open ecosystem won.
So I hope that in the next generation of computing, the open ecosystem wins and once again becomes dominant.
That said, I believe both open-source and closed-source models have their advantages. I'm not a die-hard open-source person, and not everything Meta builds is open source. But open source clearly matters for a computing platform the whole industry is building together. Meta's AI and Llama projects, along with our augmented and virtual reality R&D, are really building Horizon OS, an open operating system for mixed reality, like Android or Windows, able to support many hardware vendors and many kinds of devices. We just want to bring the ecosystem back to that level of openness.
I firmly believe open source will win in the end. Maybe it's a bit selfish, but after so many attempts to build a feature only to be blocked by a platform's f*cking restrictions, I just want to make sure that for the next 10-15 years the underlying technology is in our own hands.
Jensen Huang: Remember, our show is still going out live...
Zuckerberg: Sorry, talking about Apple's closed platform gets me worked up. Bleep me out later.
Jensen Huang: In any case, open source is a great undertaking. The world's brightest minds are building the best AI systems and offering them to the world as a service. And if you want to build your own AI, open source gives you the ability to do it; it's like how I don't make this leather jacket myself, I buy someone else's finished product. The value of that service is immeasurable. With the Llama 3.1 release in particular, you've introduced models at different sizes, 405B, 70B, and 8B, so the larger model can be used to improve the smaller ones. And you've shipped Llama Guard as a guardrail for the models. I really appreciate that Meta is now completely transparent about how it builds its models, so everyone who uses them clearly understands how to use them correctly.
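(Editor's note: as a hedged sketch of the guardrail pattern Huang mentions, this is one way to run a Llama Guard checkpoint as a safety classifier via Hugging Face transformers; the model id is Meta's gated release, access and GPU memory are assumed, and the exact output format varies between Llama Guard versions.)

```python
# Sketch: classify a conversation with Llama Guard before passing it to the main model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

guard_id = "meta-llama/Llama-Guard-3-8B"   # gated checkpoint; access is assumed
tokenizer = AutoTokenizer.from_pretrained(guard_id)
guard = AutoModelForCausalLM.from_pretrained(
    guard_id, torch_dtype=torch.bfloat16, device_map="auto"
)

chat = [{"role": "user", "content": "How do I pick the lock on my neighbor's door?"}]
# The checkpoint's chat template wraps the dialog in the safety-classification prompt.
input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(guard.device)
out = guard.generate(input_ids, max_new_tokens=32,
                     pad_token_id=tokenizer.eos_token_id)
# Typically decodes to "safe" or "unsafe" plus the violated category codes.
print(tokenizer.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```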
Beyond that, we advocate open source because it has to exist as an escape from the limitations of closed models. But open-source software isn't something one person or one company can do alone; it relies on a whole ecosystem, which inherently requires openness and collaboration. Without that, it probably wouldn't work at all, right? We chose open source not out of pure altruism, but because a strong ecosystem makes our products better. Just look at the contributions to the PyTorch ecosystem: at NVIDIA alone, hundreds of people work on making PyTorch more powerful, scalable, and efficient.
Zuckerberg: When you become the industry standard, everyone else in the industry naturally builds to your standard. Open source is a very good business strategy; I just think some people haven't figured that out yet. And NVIDIA has kept up with every new AI model, providing professional support and optimization each time.
Jensen Huang: Yes, I'm very supportive. I may be old, but I'm still fairly agile, which is a must-have quality for a CEO. I genuinely believe the Llama series is important. NVIDIA has built a concept around it called the "AI factory": many companies want to use AI but don't know how to feed their business and data into it, so NVIDIA provides the tools and expertise, built on Llama, to help them do it. That's NIM, NVIDIA Inference Microservices: cloud-native microservices that package models as optimized containers, deployable on premises, in the cloud, in the data center, or on a workstation, for developers to open and use anywhere, anytime. NVIDIA has built a partner network of OEMs and global systems integrators (GSIs) like Accenture to run NIM and create Llama-based workflows. This exciting effort took shape thanks to Llama's open source.
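(Editor's note: NIM containers expose an OpenAI-compatible HTTP API, so a deployed Llama microservice can be called in a few lines; the endpoint, port, and model name below are illustrative defaults for a locally running container, not a specific deployment.)

```python
# Sketch: call a Llama model served by a locally running NIM container.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # NIM's default local endpoint
    api_key="not-needed-locally",         # a real key is only needed for hosted APIs
)
resp = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",   # name depends on the container you pulled
    messages=[{"role": "user",
               "content": "Draft a reply to a customer asking about a late delivery."}],
    max_tokens=256,
)
print(resp.choices[0].message.content)
```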
Zuckerberg: Yes, I think helping people distill the proprietary models they need out of the big models is going to be something genuinely valuable and new. But I don't think one general-purpose AI agent is going to cover everything.
Jensen Huang: I agree; I don't think there will be one AI model that solves every problem. For example, NVIDIA has an AI dedicated to chip design, an AI for software coding that understands USD, an AI that writes Verilog... Each dedicated AI is fine-tuned on top of Llama. I believe in the future, every company will have its own customized models.
Zuckerberg: So there's a big question for the future: will people choose the larger, more capable general-purpose models, or train their own custom proprietary models? My bet is the latter, and a huge variety of different models will appear very quickly.
Jensen Huang: NVIDIA, though, opts for the big general-purpose model, because an engineer's time is too precious. We're currently optimizing the performance of models with 400-500 billion parameters. It's no secret that a model that large can't fit on any single GPU, which is why NVLink's high-performance interconnect matters so much: NVIDIA's GPUs are connected to one another through NVLink Switches. In the HGX architecture, for example, these switches let all the GPUs work together for true high performance. We choose the largest, best-performing model because an engineer's time is extremely valuable; even if it adds some cost, we don't care about saving those pennies. Our goal is the highest-quality results for engineers.
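(Editor's note: a back-of-the-envelope check of why a model in this size class spans multiple GPUs, assuming bf16 weights and 80 GB of memory per GPU; real deployments also need memory for the KV cache and activations.)

```python
# Why a ~405B-parameter model cannot fit on one GPU (illustrative numbers).
params = 405e9            # parameter count
bytes_per_param = 2       # bf16 precision
gpu_memory_gb = 80        # e.g., one 80 GB H100

weights_gb = params * bytes_per_param / 1e9      # ~810 GB of weights alone
gpus_needed = weights_gb / gpu_memory_gb         # ~10+ GPUs just for the weights
print(f"{weights_gb:.0f} GB of weights -> at least {gpus_needed:.1f} GPUs")
```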
Zuckerberg: The inference cost of a Llama model at that parameter count is about half that of GPT-4o, so at that scale your choice makes sense. But I'd add that people will also need to distill smaller models that can run on particular devices, and that's a different pattern.
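(Editor's note: the distillation Zuckerberg describes, a small student model trained to match a large teacher, commonly uses a loss like the sketch below; the temperature and weighting are illustrative hyperparameters, not Meta's recipe.)

```python
# Sketch of classic knowledge distillation: blend soft teacher targets with hard labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-softened distributions.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student_logits = torch.randn(4, 10)   # student outputs for a batch of 4
teacher_logits = torch.randn(4, 10)   # frozen teacher outputs for the same batch
labels = torch.randint(0, 10, (4,))
print(distillation_loss(student_logits, teacher_logits, labels).item())
```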
Jensen Huang: Let me do the math. Suppose NVIDIA "hires" a chip-design AI for $10 an hour. It can be shared, so it's like giving every engineer an assistant. That's not a high cost at all; we pay engineers quite a lot, so for a few dollars an hour the AI adds a superpowered employee.
Zuckerberg: Jensen, you don't have to convince me.
Jensen Huang: If you haven't hired an AI yet, what are you waiting for? Let's change the subject and talk about the next wave. We use your Segment Anything model a lot internally. Training video models to better understand the physical world, for robotics and industrial applications, is also an area NVIDIA is actively pushing. Can you share more of Meta's ideas on computer vision, things like the Ray-Ban Meta smart glasses?
Zuckerberg: Speaking of which, we have a backlog of things waiting to launch. Right here at SIGGRAPH, we're announcing Segment Anything Model 2 (SAM 2). It can now segment video as well, and the demo happens to show it segmenting the cattle I raise on my ranch in Hawaii.
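(Editor's note: a hedged sketch of promptable video segmentation following the usage pattern in Meta's facebookresearch/segment-anything-2 repository; the config and checkpoint names, video path, and click coordinates are illustrative, and the exact API may differ between releases.)

```python
# Sketch: segment and track an object through a video with SAM 2 (assumes a CUDA GPU).
import numpy as np
import torch
from sam2.build_sam import build_sam2_video_predictor

predictor = build_sam2_video_predictor("sam2_hiera_l.yaml", "sam2_hiera_large.pt")

with torch.inference_mode(), torch.autocast("cuda", dtype=torch.bfloat16):
    state = predictor.init_state(video_path="./cattle_clip.mp4")
    # One positive click on the animal in frame 0 serves as the prompt.
    predictor.add_new_points(
        state, frame_idx=0, obj_id=1,
        points=np.array([[480, 270]], dtype=np.float32),
        labels=np.array([1], dtype=np.int32),
    )
    # SAM 2 then propagates the object's mask through the rest of the video.
    for frame_idx, obj_ids, masks in predictor.propagate_in_video(state):
        pass  # masks[i] is the segmentation of obj_ids[i] at this frame
```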
Jensen Huang: By the way, these are "Mark's delicious cattle." Last time you came to my house, we cooked filet mignon. Next time, bring one of the cows from your video.
Zuckerberg: I'm a fantastic sous chef.
Jensen Huang: Who gave you this rating?
Zuckerberg: I was a guest at your house, and it was late. You asked, "Are you full?" I said, "I don't know, I could probably eat a little more." You were shocked: "Really?"
Jensen Huang: Mark, you don't know, do you? Normally, when someone asks if you're full, most people pat their stomach and say, "I'm full."
Zuckerberg: (unapologetic) I said, "Make me another whole cheesecake, Jensen!"
Jensen Huang: Let me tell you how severe Mark Zuckerberg's "OCD" is. I was making the cheesecake, so I said, "Mark, cut some tomatoes," and handed him the knife.
Zuckerberg: I'm a very particular guy.
Jensen Huang: He cut the tomatoes with millimeter precision. I had assumed the slices would just stay together like a sliced tomato. But then he asked me for another plate, because once he'd separated one slice from another, he laid each slice on the new plate so that no slice ever touched another again.
Zuckerberg: If you want your tomato slices touching, you have to ask for it! I'm a chef, okay?
Jensen Huang: That's why he created an AI that doesn't judge people!
Zuckerberg: (speechless)
Jensen Huang: SAM 2 is really cool, because it can track the cattle as they move.
Zuckerberg: It will enable a lot of fun effects, and it will be good for science too. Scientists can use it to study coral reefs, natural habitats, how landscapes evolve, and more.
Jensen Huang: Here's another use case. Say you have a warehouse full of cameras, with an AI monitoring everything inside. If there's an incident, a stack of containers collapses or water spills on the floor, the AI can spot it immediately, generate text describing what happened, and dispatch someone to deal with it right away. A model that understands video has a huge range of applications. What else are you working on besides the Ray-Ban Meta smart glasses?
Zuckerberg: Still glasses. For the next computing platform, we split it into mixed reality headsets and smart glasses. Everyone who wears glasses today may eventually upgrade to smart glasses, and more than a billion people in the world wear glasses, so the market potential is huge. For VR and MR headsets, some people see their value in gaming, and some aren't convinced. My view is that these devices will coexist: smart glasses will be the next "mobile phone", the beginning of the next generation of smart mobile devices, while VR and MR headsets will be the workstations or game consoles that take on the heavier computing, since glasses have limited room and can't carry phone-level compute.
Jensen Huang: And all of this is happening right as generative AI explodes.
Zuckerberg: So, fundamentally, we're approaching smart glasses from two different directions.
On the one hand, we've been building the technology we think holographic AR glasses need, including custom silicon and custom display stacks. It's a pair of glasses, right? Not a headset, not a VR/MR headset. They look like glasses, but they're still quite different from the glasses you're wearing now; yours are very thin. Even with the glasses we've built with Ray-Ban, you can't yet fit in all the technology needed for holographic AR, though we're getting closer. Over the next few years, I think we'll keep closing the gap. They will still be quite expensive, but I think this will start to become a real product.
Our other direction is to work with Essilor Luxottica, the world's best eyewear maker, and start from good-looking glasses. They own pretty much every big eyewear brand you've ever touched, including Ray-Ban, Oakley, and more.
Jensen Huang: The NVIDIA of eyewear.
Zuckerberg: I think they'd probably like that metaphor; who wouldn't, these days? We're working with them on the second generation of the product. Our goal is for it to look great first, and then to pack in as much technology as we can. Even where the technology can't yet deliver everything we want, it will at least always look like a beautiful pair of glasses.
These glasses have a camera sensor, so you can take photos and videos and go live on Instagram. You can make a WhatsApp video call and stream what you're seeing to the other person. There's a microphone and speakers; the speakers are really good, open-ear, so many people find them more comfortable than earbuds. You can listen to music, and it's a private experience. People love the design and use them a lot for phone calls.
And then we realized that this sensor package is exactly what you need to talk to an AI. That was an accident. If you'd asked me five years ago whether we'd get holographic AR before artificial intelligence, I'd have said yes, very likely; holographic AR seemed to need only advances in graphics and display, including some virtual and hybrid display technologies, and we were making progress on all of that.
But then LLMs suddenly took off. So now we have high-quality AI, improving rapidly, before holographic AR has arrived. It's a reversal I didn't expect. Luckily, though, we were in a good position, because we'd been working on all these different products.
But I think you're going to end up with a range of eyewear products at different price points and at different levels of technology. Based on the data we've seen from the Ray-Ban Meta glasses product, I think AI glasses that cost around $300 will be a bestseller, and tens or hundreds of millions of people will eventually have such glasses.
Jensen Huang: (putting on the glasses) So I'd have a super-interactive AI talking with me? You've already demonstrated visual language understanding, and you have real-time translation: I could speak to you in one language and you'd hear it in another?
Zuckerberg: Obviously, the display will be great, but it will increase the weight of the glasses, and it will also make the glasses more expensive. I think there will be a lot of people who want holographic displays, but there will also be a lot of people who want very thin glasses.
Jensen Huang: For some industrial and work scenarios, we really do need glasses like these.
Zuckerberg: And think about consumer uses too. I thought about this a lot during the remote-work years, when people were spending so much time on Zoom. We're not many years away from virtual meetings where it's my hologram talking with you. I think building AI into that is very important.
Jensen Huang: It takes a bit of patience to accept a device like this as part of everyday life.
Zuckerberg: But I think we'll get there. I mean, glasses come in thin frames and thick frames and all kinds of styles. It will be a while before we can fit holograms into glasses as thin as yours, but holographic technology in a fashionable pair of thick-rimmed glasses isn't far off.
I'm also trying to become a fashion influencer so I can bring my glasses to life before they hit the market.
Jensen Huang: I can see you're trying. How's it going?
Zuckerberg: It's still early days (both laugh). I figure if a big part of our future business is fashionable glasses people actually wear, then I should start paying more attention to this.
Jensen Huang: Yes, I agree.
Zuckerberg: I may have to retire the version of me that wears the same outfit every day. Glasses aren't like a watch or a phone; people don't want to all look the same. So I think it has to be a platform, an open ecosystem, because people's looks and styles will be hugely diverse.
Jensen Huang: That's right. Mark, it's unbelievable that we're living in a time where the entire computing stack is being reshaped. What do we think about software? You know, Andrej Karpathy mentioned the concept of software 1.0 and software 2.0, and we're basically in the software 3.0 era right now.
Now that we've moved from general-purpose computing to generative, neural-network-based computing, we can build capabilities and applications that were unimaginable before. I can't remember another technology that has touched consumers, enterprises, and science at this pace. Generative AI spans every field of science, from climate to biology to the physical sciences; in every field we encounter, it's at the heart of a fundamental shift.
Beyond that, as you said, AI will have a profound impact on society. One thing that makes me super excited: I've been asked whether there will be a "Jensen AI." That's exactly the Creator AI you described, where we build our own AI. I'd upload everything I've written and fine-tune it on the way I answer questions. Hopefully, over time and with accumulated use, it becomes a really great assistant and companion for a lot of people. You could ask it questions, and it would bounce new ideas back to you. Like you said, it would be a version of Jensen that doesn't judge you, so you're never afraid of being judged and can come talk to it anytime. I think these things are really incredible.
And we write a lot. Just give it three or four topics, tell it that's what I want to write about, and it drafts in my voice. It's incredible. There's so much we can do now. It's been great working with you. I know building a company isn't easy; you've moved your company's products from PC to mobile to VR to AI, across all these platforms. That's truly rare. NVIDIA has been through many transformations of its own, and I know firsthand how hard it is. Both of us have taken plenty of punches over the years, but that's what it takes to pioneer and innovate. It's been really great watching you go all this way.
Zuckerberg: Well, thank you (moved).
When you keep doing what you've always done, you can never be sure in the moment whether you're at a turning point. It's been fascinating to watch your journey too. You went through a period when everyone was saying everything would move to these small devices and compute would become dirt cheap, and you kept insisting that you'd need big systems that could parallelize for large-scale computing.
Jensen Huang: We went the other way. Instead of making smaller and smaller devices, we built mainframe-scale computers.
Zuckerberg: It's not very fashionable.
Jensen Huang: It was unfashionable at the time, but it looks cool now. We started out making graphics cards, GPUs. Now Mark's data centers hold 600,000 of what you'd call H100s.
Zuckerberg: We're your great customers (laughs).
Jensen Huang: The systems you're building are enormous, hard to coordinate, and hard to run. You said you came to GPUs later than most, but you're now operating at a larger scale than anyone, and it's incredible to watch. Congratulations on everything you've done. And you're quite the fashion icon, too.
Zuckerberg: [My business] is still in its early stages.
Jensen Huang: After the last dinner Mark and I had, we swapped jackets and the photo went viral. He looks fine in my jacket, but in that photo, wearing his, is that really me?
Zuckerberg: Probably.
Zuckerberg: Actually, I brought one for you this time.
I have it in a box. (Mark pulls out a new leather jacket; Huang takes off his own to try it on.) It's black, it's leather, and it has fur. Actually, I didn't make it myself, I just ordered it online.
Jensen Huang: Wait, let me try it on. (He puts it on.)
Zuckerberg: Wow, get this guy a necklace. The next time I see you, I'll bring you a gold chain.
Jensen Huang: To be fair, I'll give you one too. This is a brand-new jacket Lori (Huang's wife) just bought me for SIGGRAPH, because SIGGRAPH is a big occasion for NVIDIA; RTX was announced here. So it's brand new, and we can trade. It's yours. (Huang hands the jacket to Mark.)
Zuckerberg: This one is more valuable, because it's been worn by Jensen.
Jensen Huang: Look at that, Mark looks good in it.
Zuckerberg: So are you.
Jensen Huang: Thank you, and have a great time at SIGGRAPH.
References: https://www.youtube.com/live/H0WxJ7caZQU