
What big move will OpenAI make next week? Altman may have spoiled the story in this interview

Tencent Technology

2024-05-12 08:25, posted on the official account of Tencent News Technology Channel, Beijing

Highlights:

  1. Altman examines the concept of universal basic income and explores the new economic models that may be needed in the era of artificial intelligence.

  2. OpenAI has not yet decided when GPT-5 will be released, nor has it settled on a final name, and it is committed to ensuring the model meets a high standard at launch.

  3. Altman hopes to develop an open-source model that is as good as possible and can run efficiently on mobile phones.

  4. Altman calls the iPhone one of the greatest technology products in human history; anything that aims to surpass it must clear a very high bar.

  5. He envisions a device that is always active, extremely easy to use, and able to understand the user's needs through voice, text, or, better yet, other means.

Tencent Technology News reported on May 12 that, according to foreign media reports, OpenAI CEO Sam Altman recently appeared on the "All-In Podcast". During the episode, Altman discussed a series of topics including inference compute, open-source models, the GPT-5 large language model under development, AI regulation, implementing universal basic income (UBI) in the post-AI era, how advanced agents will change the way we interact with applications, and the boardroom upheaval at OpenAI at the end of last year.

In addition, it was reported on Friday that Apple is close to a deal with OpenAI, under which the next-generation mobile operating system, iOS 18, may integrate ChatGPT features. In the interview, Altman also discussed what a future device capable of competing with an AI-enabled iPhone might look like.

The following is a full transcript of the program.

Q: It's a great honor to have Sam Altman, an innovator and entrepreneur who has made a splash in the tech world. Let's turn back the clock about 20 years, to when I had the pleasure of meeting Altman for the first time. At the time, he was working on a mobile app called Loopt, which was later backed by Sequoia Capital. We both became angel investors with Sequoia Capital, and Altman showed extraordinary insight back then, leading the investment in Stripe, a fintech company that was still unknown at the time, while I invested in Uber. The fund we participated in together turned a few million dollars into a staggering return of more than $200 million.

During that time, Altman's career shifted to Y Combinator, the startup incubator, where he served as president from 2014 to 2019. In 2015, he co-founded OpenAI with a grand vision: to create artificial general intelligence that benefits all of humanity. In 2019, Altman left Y Combinator to devote himself to the role of CEO of OpenAI. On November 30, 2022, with OpenAI's release of ChatGPT, Altman's name became widely known across the tech world. In January 2023, tech giant Microsoft invested up to $10 billion in OpenAI.

However, in November 2023, the situation changed dramatically. Over just five days, social media was flooded with news that Altman had been fired by OpenAI's board of directors. Many speculated that he might join Microsoft and that the OpenAI team working toward artificial general intelligence could disband as a result. But only days later, Altman dramatically returned to OpenAI as CEO.

Recent reports suggest that Altman is looking to raise up to $7 trillion for an AI chip project. He is also reported to have teamed up with Jony Ive, Apple's former design chief, and to be planning to raise $1 billion from SoftBank CEO Masayoshi Son to develop a product that could compete with the iPhone. Since its release, ChatGPT has continued to evolve and has had a profound impact on the way we work. ChatGPT reportedly became the fastest-growing consumer app in history, gaining 100 million users in just two months. OpenAI's revenue growth has also been impressive, reaching an annualized $2 billion at the end of last year.

Now, let's warmly welcome Altman to the "All-In Podcast".

Altman: Thank you very much!

No release date has been set for GPT-5, but a new release strategy is under consideration

Q: The industry is eagerly awaiting the release of GPT-5. There are reports that the product will be officially launched sometime this summer. Is it possible to narrow down the time frame? When will you release GPT-5?

Altman: We have not yet set a specific date for the release of GPT-5. We are cautious about launching new models and are committed to ensuring they meet our high standards at launch. As you mentioned, since the release of GPT-4 we have seen continuous improvement in model performance, which reflects a natural trend of technological progress. We believe that through continuous use and optimization, AI systems will serve society better, not just through simple increments of a version number.

We're considering a new release strategy that may differ from what we've done before. Our goal is to make AI technology more widely available so that a broader group of users can enjoy advanced technology. We believe that by providing easily accessible AI tools, we can unleash the innovative potential of more people, which is one of our core missions.

We're still discussing the naming and release strategy for GPT-5, but it's safe to assume that we're committed to making this new model a great experience for users when it is released. We will continue to monitor the development of the technology and explore the best release strategy so that more people can benefit from advanced AI technology.

Q: Does this mean that there will not be a long training cycle, but rather an ongoing iteration of training or training submodels? What architectural changes will be made in the future for large models?

Altman: As you can imagine, it seems reasonable to keep training the right model on an ongoing basis.

Q: You just mentioned that GPT-5's launch might be a bit different. Are you thinking about releasing GPT-5 to paying users first, or is the safety risk high right now, meaning you plan to have red teams test it before releasing it to users?

Altman: We put a lot of emphasis on making AI technology broadly accessible, and that's really one of our core missions. Currently, GPT-4 is mainly for paid users, but we are actively exploring how to make advanced technology available to free users. Our philosophy is to develop AI tools and make them as widely available as possible, for free or at very low cost. We believe this helps people use these tools to invent and create new things and promotes the progress of society. While artificial general intelligence is still a distant goal, we believe that through continuous innovation and optimization we can gradually get closer to it. We're trying to find ways to make advanced models like GPT-4 available to more users at no cost or at lower cost. I'd be very sad if we never figured out how to make GPT-4-level models available to users for free. That's something we really want to do.

Both open source and closed source have their own unique value and role, and more models are planned to be open-sourced in the future

Q: I think there are two main factors people talk about a lot. First, open source limits, to some extent, the potential cost and delay of developing killer apps. The second factor is that people can build applications in an open-source environment over the long term, and the energy in this space is striking: the open-source community is extremely passionate. We demoed Devin just over a month ago and were very impressed. How do you see the open-source model evolving in the coming years?

Altman: Regarding the speed and cost issues you mention, we attach great importance to both. While I can't give an exact timeline for when significant reductions will be achieved, I'm confident that we can achieve them. We're committed to reducing latency and dramatically cutting costs. Although we are still in the early stages of understanding the science of how AI works, we firmly believe that with continued effort and innovation we will reach the desired goals. All of our current development work is progressing steadily. We recognize that it will be a major breakthrough when the cost of AI becomes low enough to be almost negligible and its speed becomes nearly instant for us and other users. Achieving that will unlock enormous potential for ourselves and all of our users.

When it comes to open source and closed source, we believe both have unique value and roles. We already have plans to open-source more models in the future, and we are actively developing general AI and exploring how to distribute its benefits broadly. This strategy has been embraced by many people, though it may not be for everyone. We're building a huge ecosystem that will include open-source models and a developer community built on top of them. Personally, I'm particularly interested in the open-source space, especially in developing the best possible open-source model that runs efficiently on mobile phones. At the moment there doesn't seem to be a very good solution on the market, but I'm sure it would be a very important technological advance.

Q: When will you develop an open-source model that will run on mobile phones?

Altman: I don't know if we're going to do that, or if anyone is. Maybe Llama 3 or Llama 4 will be able to.

Q: I'm guessing that the 7 billion parameter version of the Llama 3 model might be suitable for running on a mobile phone.

Altman: I don't know whether that version of Llama 3 was built for mobile phones, but I think it could run on one. I'm not sure yet; I haven't tried it.

Q: When Llama 3 was released, many people thought its performance had caught up with GPT-4, or at least come very close in some respects. OpenAI released a new version of GPT-4 not long ago and is still working on GPT-5. Given the strong performance of open-source efforts like Devin, what does OpenAI, as an industry leader, need to do to maintain its leading position in artificial intelligence?

Altman: Our goal is not just to develop the smartest possible model weights, but to create a practical intelligence layer that people can apply in a variety of scenarios. The model, although a core component, is only one part of the overall intelligent system. I am confident that OpenAI will maintain its leading position in this field, and we are determined to keep that advantage. To do so, we also need to build more infrastructure and support around the system. Like any other business, we must build lasting value in the traditional way. That means discovering and upholding a great product vision that continues to deliver value to our customers.

We are working to build an ecosystem that includes not only advanced AI models, but also user interfaces, developer tools, educational resources, and community support to ensure that our technology is widely accepted and effectively utilized. With such a holistic approach, we hope to achieve long-term success and continue to make a positive impact on society in the field of AI.

Q: When OpenAI started, the organization's goal was to be open source, because openness was considered so important. Then came the switch: the argument was that the technology had become so easy to develop and use that it needed to be locked down. The cynical read is different. Why did you end up moving from open source to closed source?

Altman: Part of the reason we released ChatGPT was to show the world what we were doing. We've been trying to get the message across that AI is really, really important. Back in October 2022, not many people were aware of AI's importance or its imminent impact. A big part of our effort is enabling people to actually use the technology. There are a number of different ways to achieve that, and I think it plays a very critical role. The fact is that many people use the free version of ChatGPT; we don't run ads on it and haven't focused on monetizing it. We launched the free version because we wanted people to be able to take advantage of these tools. I think it already provides a lot of value, like teaching people to fish. We're also doing it to give people a better understanding of what's happening across the AI industry.

As for whether the closed-source strategy is correct, we don't have a definitive answer yet. Like other companies, we are exploring, and we adjust our strategy as we learn new things. When we started OpenAI, we didn't know how things would turn out. We hadn't even developed any products before our first language model. We're simply exploring step by step and moving forward steadily, and we will continue to do so.

Intelligence is simply an emergent property of matter, like the laws of physics

Q: I think this connects to what you just said about open source and closed source: regardless of business decisions, all of these models will approach a certain level of accuracy. Not all of them, but say four or five models with enough money behind them, from OpenAI, Meta, Google, Microsoft, and so on, plus maybe a startup and an open-source model. Then, fairly quickly, the accuracy or value of those models shifts to proprietary training data: data you can get that no one else can, or data others have that you can't get. Is that how you see this developing? Open source lets everyone reach a certain threshold, and then the competition is over data, isn't it?

Altman: I definitely don't think it's going to turn into a race for data, because once models become smart enough, at some point they will stop relying on more data, at least for training. Data may still be needed to make them more useful. One of the most important lessons I've learned is that it's very difficult to make confident predictions about where the next few years will go, so I won't try to do that right now. I do expect many remarkable models to emerge in the world.

It seems to me that we have just discovered a new fact of nature or science, whatever you want to call it: intelligence is just an emergent property of matter, like the laws of physics. It's almost a spiritual insight. People will come to understand this, but there will be many different approaches to system design; people will make different choices and come up with new ideas. Just like any other industry, many different approaches will emerge in AI, and different people will have different preferences. In the same way that some people like iPhones and some like Android phones, I think AI models will show a similar diversity.

Q: A question about cost and speed. All AI companies are somewhat constrained by Nvidia's production capacity, aren't they? You and pretty much everyone else have effectively announced the number of chips you can get, because that's simply Nvidia's maximum capacity. To compute more cheaply and quickly, and to get more energy, something needs to happen at the infrastructure level so these problems can actually be solved. How are you helping shape the industry to address them?

Altman: We're definitely going to make significant progress on algorithms, and I don't want to underestimate that. I'm very interested in chips and energy. If we can double a model's efficiency at the same performance, that effectively means we have twice the computing power, doesn't it? I believe there's still a lot of headroom on efficiency, and I look forward to the results as they materialize. Beyond that, the complexity of the entire supply chain cannot be ignored: the production capacity of logic chips, the availability of high-bandwidth memory (HBM), and how quickly we can get construction permits, pour concrete, build data centers, and complete cabling. Access to energy is also a huge bottleneck. But I believe that when these technologies deliver enough value to people, the world will take the necessary action, and we will work to accelerate that process.

Of course, there is also the possibility you mentioned earlier: a major breakthrough in infrastructure could give us a far more efficient approach to computing. However, I don't want to rely too heavily on that possibility, and I don't spend too much time thinking about it.
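Altman's efficiency point is simple arithmetic: halving the compute each query needs at fixed quality is equivalent to doubling the fleet. A minimal sketch with made-up numbers (all figures below are assumptions, not OpenAI data):

```python
# Hypothetical numbers illustrating the equivalence Altman describes
# between model efficiency gains and extra computing capacity.

fleet_flops = 1.0e21           # total FLOP/s of the serving fleet (assumed)
flops_per_query = 1.0e15       # cost of one query at current efficiency (assumed)

queries_per_sec = fleet_flops / flops_per_query

# Doubling efficiency halves the per-query cost at the same quality...
flops_per_query_improved = flops_per_query / 2
queries_per_sec_improved = fleet_flops / flops_per_query_improved

# ...which serves exactly twice the traffic on unchanged hardware,
# the same effect as doubling the fleet with the old model.
assert queries_per_sec_improved == 2 * queries_per_sec
print(f"{queries_per_sec:.1e} -> {queries_per_sec_improved:.1e} queries/s")
```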

Voice interaction is an important clue to the future of interaction

Q: What about the device? You've mentioned models that can be adapted to mobile phones, whether it's large language models (LLMs) or small language models (SLMs), and I'm sure you're already thinking about the use of these models. But will the device itself change as well? Will these devices also become as expensive as iPhones?

Altman: I'm very interested in this topic and passionate about new forms of computing devices. Every major technological leap seems to open up new possibilities. Current phones are amazing, so the bar for the next step is naturally very high. Personally, I believe the iPhone is one of the greatest technology products ever made in human history. It really is an extraordinary device. As I've said before, it's already so good that to surpass it, we have to set a very high bar.

Q: Does it have to be designed to be more sophisticated, or should it actually be more economical and easier to use?

Altman: Almost everybody is already willing to spend money on a phone, so even if you could make a device that costs much less, I think there's real resistance to carrying or using a second device. Given that most of us already pay for our phones, I don't think simply reducing cost is the answer.

Q: Will a different (device) be the answer? Could there be a specialized chip that powers mobile phones and is particularly adept at supporting phone-sized AI models?

Altman: There likely will be chips like that, and phone manufacturers will certainly work in that direction. It doesn't necessarily require a completely new device. I think the key is finding a genuinely different mode of interaction that technological progress makes possible. If I knew what that was, I'd be excited to be working on it right now.

Q: However, your current app already has a voice function, and in fact, I set the quick action button on my phone to launch ChatGPT's voice app directly. I use it a lot with my kids and they love to communicate with it. Although this app sometimes has some lag issues, it's really outstanding.

Altman: We will continue to improve the quality of our voice features. I believe voice interaction is an important clue to the future of interaction. If you can achieve a truly high-quality voice experience, it becomes a whole new way of interacting with a computer.

Q: By the way, why does ChatGPT sometimes not respond? It feels like talking over a two-way radio, which is really annoying. But it's just as amazing when it provides the right answer.

Altman: We're working on improvements, but at the moment it does feel a bit clunky: not responsive enough, lacking a smooth, natural feel. We're committed to making that experience significantly better.

Q: In the field of computer vision, is it possible to imagine having glasses or wearing a pendant that allows you to combine visual or video data with voice information? Through this combination, AI is able to get a complete picture of everything happening around you.

Altman: The ability to interact across multiple modalities is extremely powerful. For example, you could ask, "Hey, ChatGPT, what am I looking at?" or "I'm not quite sure what kind of plant this is." That's obviously another direction to explore. But as for whether people want to wear glasses or hold up a device to get information when they need it, I think there are a lot of complex social and interpersonal considerations, especially when it comes to wearing computing devices on the face.

Q: We saw what happened with Google Glass, where wearers got into physical altercations, and it raised a lot of questions. If artificial intelligence becomes ubiquitous on everyone's phone, what applications might that spawn? Do you have a hunch about what might happen, or what kinds of applications you'd like to see built?

Altman: What I want is an always-on, extremely easy-to-use device that understands my needs through voice, text, or, better yet, other means. I envision a system that assists me around the clock, gathers as much contextual information as possible, and acts as the best assistant in the world, constantly helping me improve. When it comes to the future of AI, there are two perspectives that may sound similar but differ significantly in practical system design. One view wants AI to be an extension of the individual, like a ghost or an alter ego, able to act on my behalf, even handling email without notifying me, becoming something like a part of me. The other view wants AI to be a great senior employee: it understands me very well, I can delegate tasks to it, it can access my email, but I see it as a separate entity. Personally, I prefer the latter and think that's where we want to go. In this sense, AI is not a simple extension of the user but an assistant or executor that is always available, always excellent, and extremely capable.

The agent does not just execute commands mechanically, it is capable of reasoning

Q: The agent is, in a way, like a representative of you, able to understand your needs and anticipate your intentions. That's exactly what I take from what you're saying.

Altman: I expect agents like that to emerge, but there's a difference between a senior employee and a mere agent. One thing I appreciate about senior employees is that they push back. Sometimes they choose not to do exactly what I instruct; they might say, "If we do what you're asking, here's what will follow, and then this, and then that. Are you sure you want that?" Such an agent doesn't just execute commands mechanically; it is capable of reasoning. The relationship I expect with it is the kind of interaction I'd have with a truly capable colleague, not with someone who blindly follows orders.

Q: In this envisioned world, if we had advanced agents like Jarvis, how would they change the way we interact with applications? These agents are capable of reasoning. How will that affect the product interfaces we use today, especially apps where the interface itself carries value, such as Instacart, Uber, and DoorDash, which don't just expose APIs but deliver convenience directly to people? In a world where agents act on behalf of 8 billion people, how do we need to rethink how applications work and the entire experience architecture?

Altman: I'm very interested in the idea of designing a world that works for both humans and AIs. I like the interpretability of that design, the fluidity of task handoffs, and our ability to provide feedback. For example, DoorDash could expose an API to my future AI assistant so it can place orders automatically. Or I could say to the assistant on my phone, "Okay, AI assistant, please place this order on DoorDash," then watch the app respond, see the taps happen on the interface, and, if needed, correct it instantly: "Hey, not that option," or "Yes, that's what I want."
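As a concrete illustration of that "watch it happen, correct it" loop, here is a minimal sketch. Everything in it, including the `DeliveryAPI` class and its methods, is invented for illustration; no real DoorDash interface is implied:

```python
# A toy sketch of the interaction Altman describes: the assistant drafts
# an order through a hypothetical delivery API, every step is shown to
# the user, and the user can correct it before anything is committed.

from dataclasses import dataclass, field

@dataclass
class DeliveryAPI:                       # hypothetical stand-in, not a real API
    cart: list[str] = field(default_factory=list)

    def add_item(self, item: str) -> None:
        self.cart.append(item)
        print(f"[ui] added: {item}")     # the user watches each "tap"

    def remove_item(self, item: str) -> None:
        self.cart.remove(item)
        print(f"[ui] removed: {item}")

    def place_order(self) -> None:
        print(f"[ui] order placed: {self.cart}")

def assistant_orders(api: DeliveryAPI, draft: list[str]) -> None:
    for item in draft:
        api.add_item(item)
    # Pause for correction before committing: "hey, not that option".
    fix = input("item to remove (or press enter to confirm): ").strip()
    if fix in api.cart:
        api.remove_item(fix)
    api.place_order()

if __name__ == "__main__":
    assistant_orders(DeliveryAPI(), ["salmon nigiri", "tuna roll"])
```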

Q: So you can see voice interaction potentially making traditional applications redundant. You could simply say you want sushi, and it would recommend the kind of sushi you like based on your past preferences, avoid the options you don't, and try to give you the best ordering experience.

Altman: It's hard for me to imagine we end up in a situation where you simply say, "Hey, ChatGPT, order me some sushi," and it responds, "Okay, which restaurant do you want to order from? What type of sushi? When should it be delivered?" and so on. I believe the visual user interface remains very useful for many tasks. A world where you never look at a screen and rely only on voice interaction is hard for me to picture; I really can't imagine it.

Q: Exactly. This is true for many transactions. For example, Apple introduced Siri, which is supposed to let us book an Uber automatically, but I suspect very few people actually do that. After all, why take the risk? As you pointed out, the quality of service of voice assistants isn't high enough yet. Once it is, I think people really will lean toward voice assistants because of the convenience. You don't need to take out your phone, open apps, or tap anything. And if you hit an app that has logged you out, or you need to sign back in, or there's two-factor authentication (2FA), the whole process becomes very cumbersome and annoying.

Altman: For something as simple as setting a timer, I always use Siri, because it's effective and convenient and I don't need any extra information. But for something like calling an Uber, I may want to compare the prices of several options, see how far away the cars actually are, or even watch the driver's real-time location on the map, since sometimes I might choose to walk to a more convenient pickup spot. By working directly in the Uber app's interface, I get those details much faster than I could by running the whole process through voice commands.

Q: I love the idea you came up with to "watch it happen", that's really cool.

Altman: I think there will be different interfaces for different tasks, and I think that's going to continue.

I'm most interested in having AI accelerate and improve the process of scientific discovery

Q: Of all the developers who are building apps and experiences using the OpenAI platform, are there any projects that stand out to you that show great potential for growth, even if they are still in the early stages or similar toy applications? Or are there any outcomes that your team specifically identified and found significant?

Altman: I met a newly formed startup team this morning, just two people. They plan to spend the summer building an AI tutor. I've always had a keen interest in this space, and while many people have already built great things on our platform, it would be very exciting if someone could deliver what they described as a Montessori-level reinvention of how we learn. I'm personally excited about the prospect of finding new ways for people to explore and learn in a personalized manner. A lot of the coding-related things you mentioned earlier are a really cool glimpse of the future, too. I believe healthcare is a field in particular need of a radical overhaul. But personally, what excites me most is the possibility of accelerating and improving the process of scientific discovery. GPT-4 is clearly not yet a major breakthrough there, but it may have accelerated things somewhat by making scientists more productive.

Q: These models are trained and built differently from language models. While there are many commonalities between them, there are also many differences. Many of these models are designed based on an entirely new kind of infrastructure, and they are designed for a specific set of problems or use cases, such as modeling chemical interactions. Is that so?

Altman: Yes, certain elements are essential, but I think what we're generally missing right now, for many of the applications we're discussing, is models with reasoning capabilities. Once models can reason, they can be hooked up to tools such as chemistry simulators and used to make predictions and form hypotheses.
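The division of labor Altman sketches, a reasoning model proposing hypotheses and an external simulator checking them, can be made concrete with a toy loop. Both the "model" and the "simulator" below are invented stand-ins; no real chemistry package is implied:

```python
# A toy propose-simulate-refine loop: a stand-in "reasoning model"
# proposes candidate molecules, a stand-in "simulator" scores them,
# and the next round of proposals is biased by the feedback.

import random

def propose_candidates(history: list[tuple[str, float]]) -> list[str]:
    """Stand-in for a reasoning model: propose new candidates,
    refining variants of the best-scoring one seen so far."""
    if not history:
        return [f"mol-{i}" for i in range(3)]
    best_name, _ = max(history, key=lambda h: h[1])
    return [f"{best_name}.v{i}" for i in range(3)]

def simulate(molecule: str) -> float:
    """Stand-in for a chemistry simulator returning a fitness score."""
    return random.random() + 0.1 * molecule.count(".v")

history: list[tuple[str, float]] = []
for round_no in range(3):
    for mol in propose_candidates(history):
        history.append((mol, simulate(mol)))
    best = max(history, key=lambda h: h[1])
    print(f"round {round_no}: best so far {best}")
```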

Q: Indeed, one of the key topics I want to explore today is the idea of networks of models. When discussing agents, people tend to picture a linear series of function calls, but in biology we observe complex networks of interacting systems that produce results through ensembles and networks rather than simple linear calls. Are we seeing the rise of architectures in which specialized models or networked models cooperate to solve more complex task sets and apply reasoning, with some models focusing on chemistry or arithmetic and others taking on different responsibilities? Such an architecture avoids relying on a single, omniscient model to handle everything.

Altman: I'm not sure whether reasoning will eventually become a broadly applicable capability. I have a hunch, and a hope, that it will; if it does, that would be ideal. But I'm not sure it's really the case.

Q: Let's take protein structure prediction as an example. There is a large amount of protein image and sequence data, and the researchers use this data to build a predictive model, and they follow a series of processes and steps to achieve this. Is it conceivable that in the future, there may be a model with artificial general intelligence or advanced reasoning ability that can autonomously construct sub-models to solve this problem, collect the necessary data and process it to solve the problem of protein structure prediction?

Altman: There are many possibilities. For example, it could train a specialized model for a specific task; or it might rely on a large model that can identify the additional training data it needs, actively query for that data, and update its knowledge accordingly.

Q: I wonder, is it true that all these startups are going to fail? Many startups today are adopting the strategy of collecting specific data and then training a new model based on that data that is optimized for a task and outperforms all other solutions.

Altman: In a way, yes; as we noted when talking about biology and those networks of complex systems, I've gotten a glimpse of that. I laugh because I recently went through a serious illness and am now mostly recovered, and the whole experience was like watching the body's systems take hits one by one. You could really see, for example, that this was a problem in the digestive system. It reminded me of what we discussed earlier: you don't appreciate the complexity of the interactions between these systems until something goes wrong, which is itself a very interesting phenomenon. At the same time, I tried using ChatGPT to understand my condition. I might say I wasn't sure about some question, then drop the link to a relevant paper into the context without reading it, and find that it resolved exactly the point I was unsure about. That's an example of what you mentioned earlier: you can say, "I don't know this," then add more contextual information, and instead of retraining the whole model, you just add context on the fly and reach a conclusion.
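The point about adding context on the fly rather than retraining is easy to make concrete. In the minimal sketch below, `ask_model` is a hypothetical stand-in for any chat-completion call; the new knowledge rides along in the prompt, and no weights are updated:

```python
# In-context updating: prepend the new document to the prompt instead
# of retraining the model on it. `ask_model` is a hypothetical stand-in.

def ask_model(prompt: str) -> str:
    # Stand-in for an LLM call; a real system would send `prompt`
    # to a chat-completion endpoint.
    return f"<answer conditioned on {len(prompt)} characters of context>"

question = "Is symptom X consistent with condition Y?"
paper_text = "...full text of the linked paper, fetched at question time..."

# No weight update anywhere: the paper is simply part of the context.
prompt = f"Context:\n{paper_text}\n\nQuestion: {question}"
print(ask_model(prompt))
```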

Q: Models for predicting protein structures, for example, form the cornerstone of that field. I'm curious: can AlphaFold 3 handle predictions for other molecule types? If so, then in theory the best general-purpose model would be able to ingest data, learn, and solve problems autonomously. Perhaps you can give us an example of that process. Can you talk about Sora? Your video model generates stunning moving images and videos. What makes its architecture unique, and how does it differ from other models?

Altman: It's true that for general problems we obviously need tools like specialized simulators, connectors, and various pieces of data. I have a gut feeling, though I must stress there is no scientific basis for it, that if we can understand the essence of general reasoning and apply it to new problem domains, much as human reasoning generalizes, it is achievable, and it could happen as a fast breakthrough. At least that's what I think. As for Sora, it is not based on a language model but is a model designed specifically for video.

Q: Obviously we're not quite at that level yet, right? To build an efficient video model today, you may be starting from scratch with a unique architecture and a specific dataset. Going forward, though, a system with general reasoning capabilities, what we call artificial general intelligence, should in theory be able to master video rendering through self-directed learning and logical reasoning.

Altman: As an example: as far as I know, most of the best text models in the world are still autoregressive, while the best image and video models are diffusion models, which is in some ways strange.

Different uses of training data raise different issues, and OpenAI is staying out of music for now

Q: There is a lot of controversy around the use of training data. You've been trying to handle these issues with fair use and respect for creators' rights. Why did OpenAI choose not to get involved in the music industry? You're more cautious than other companies. You've set up some licensing agreements, but as far as I know you haven't reached a settlement with the New York Times, and I'm guessing that's about the use of training data. How do you view and balance the fair use principle? We had a heated discussion about this on our podcast. By entering into licensing agreements, you've demonstrated a commitment to fairness. What do you personally think about the rights of artists who create great music, lyrics, and books? When you use their work, create derivatives, and commercialize them, how should fairness be defined? How do we build a world where artists can create content and still control how others use their work? I'm curious about your personal opinion, because I know you've thought deeply about this, and many people in our industry don't give enough thought to content creators' rights.

Altman: I think there are significant differences between different kinds of cases. Taking fair use as an example, I think our position is reasonable under current law, but the nature of AI means that for creative work such as art, we need to think about these issues from a new angle. For example, if someone learns math by reading math materials on the Internet, most people would consider that beyond reproach. For other cases, opinions will differ; to avoid being too long-winded, I won't go into the details.

It seems to me there is a general sense that learning common human knowledge, such as mathematical theorems, belongs to the public domain, while systematically imitating an artist's style or likeness involves much thornier copyright questions. Between those two extremes lie many different scenarios. Traditionally the discussion has focused on training data, but as the marginal value of training data falls, attention is shifting to how the model behaves at inference time and how it accesses and uses information in context in real time. Inference-time behavior, and how the economic model around it evolves, will become the new focus of discussion.

Using music as an example: if someone asks a model to compose a song in the style of Taylor Swift, even a model never trained on any Taylor Swift songs, we still face a challenge, because the model may still know who Taylor Swift is and what her music is about. This raises the question: should the model be allowed to imitate her style even without direct training on her work? If so, how should Taylor Swift be paid? I think there should first be an opt-in or opt-out mechanism, and then an economic model built on top of it to handle these questions.

From the history of sampling in the music industry, we can find an interesting perspective on how this economic model works. Although this is not exactly the same as AI creation, it provides us with a starting point for thinking.
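To make the opt-in idea concrete, here is a toy sketch of what such a mechanism could look like: named-style requests are honored only for artists who opted in, and each honored request logs a payout event. The registry, rates, and function names are all invented for illustration:

```python
# A toy opt-in registry for style imitation: requests naming an artist
# are declined unless the artist opted in, and honored requests accrue
# a royalty entry. All names and rates here are hypothetical.

OPT_IN_REGISTRY = {"artist-a": 0.02}      # artist -> revenue share (assumed)
royalty_ledger: list[tuple[str, float]] = []

def generate_in_style(artist: str, request: str, price: float) -> str:
    if artist not in OPT_IN_REGISTRY:
        return "declined: artist has not opted in to style imitation"
    royalty_ledger.append((artist, price * OPT_IN_REGISTRY[artist]))
    return f"<song in the style of {artist}: {request}>"

print(generate_in_style("artist-a", "a song about heartbreak", price=1.00))
print(generate_in_style("artist-b", "a song about heartbreak", price=1.00))
print(royalty_ledger)                     # [('artist-a', 0.02)]
```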

Q: Sam, I'd like to push back on the example you gave. The model learns the elements of song structure, rhythm, melody, harmonic relationships, and so on, the key factors that make music work, and then uses the training data to create new music. How is that different from a person listening to lots of music, absorbing that knowledge, and building a similar predictive model or understanding in their brain? What's the difference? Why do you argue that artists should be specially compensated? This isn't simple sampling, because rather than copying or storing the original songs, the AI is learning the structure of the music, isn't it?

Altman: That's not the point I'm trying to make; I agree that humans take inspiration from other humans. What I'm talking about is when you say, "Create a Taylor Swift-style song for me."

Q: I see, okay, there's an artist's style in the prompt.

Altman: Personally, I think those are two different cases.

Q: Would you be comfortable with a music model trained on all the music humans have created, without paying royalties to the artists who created it, where you're not allowed to name a particular artist in a request, but you can ask something like: "Play me a fairly modern pop song about heartbreak, sung by a female voice"?

Altman: We've decided not to enter the music business for now, partly because of these difficult questions about where to draw the lines. For example, I recently met with several musicians I really respect to explore some of the edge cases. Imagine we paid 10,000 musicians to create a large body of music specifically to build a high-quality training set that teaches a music model everything about strong rhythmic structure and compelling hooks. If we trained only on that music, we could in theory still build a great music model. I offered this as a thought experiment, and the musicians said that in that scenario they had no principled objection. Even so, I personally have reservations. That's not to say we shouldn't do it, but there are things to consider.

Q: Have you seen Apple's recent ad? It condenses all human creativity into a very thin iPad. What are your thoughts on this?

Altman: People have had a very emotional reaction to it, much stronger than you might expect. I'm very positive about AI myself, but I believe there are things in human creativity and artistic expression that are invaluable. We certainly welcome AI that can advance scientific research. But when it comes to AI capable of deep and wonderful human creative expression, I think we should proceed with care. The development of this technology is inevitable, and it will be a tool that helps us reach higher levels of creativity, but along the way we should find a path that both advances the technology and preserves the creative spirit we hold dear.

Some advanced AI systems will have the ability to cause serious global harm

Q: When it comes to the regulation of AI and the implementation of a universal basic income in an AI-dominated world, there are advocates for "comprehensive regulation of AI". What exactly does this mean? Can you share some thoughts on the recent regulatory proposals made by California? If you wish, we can explore this topic further.

Altman: I'm a little worried about the current situation. While many regulatory proposals for AI are being discussed, from what I've personally seen, many of the proposals in California concern me. I'm also generally concerned about each state regulating AI on its own. And when people say "regulate AI," I don't think they mean the same thing. Some advocate banning AI development entirely, while others insist AI should be open source rather than closed.

What particularly concerns me personally is this: I believe that in the not-too-distant future (and I admit this is a forward-looking statement, and such predictions always carry risk) some advanced AI systems will be capable of causing serious global harm. For those systems, I would like to see an international body, similar to the global oversight of nuclear weapons or synthetic biology, that supervises the most powerful systems and ensures they undergo proper safety testing, to prevent them from getting out of control, recursively self-improving, or posing other similar risks.

Q: Critical voices point out that you have the resources to lobby and build strong ties with politicians, and that you have been very actively involved. However, for startups that are also passionate about the space and are investing in it, they may not have the resources to lobby or deal with regulatory capture. As venture capital mogul Bill Gurley mentioned in a brilliant speech last year, this could be an issue of concern. Maybe you can confront this problem head-on and give your opinion.

Altman: If regulation were limited to "we only look at models trained on compute clusters costing more than $1 billion or $10 billion," I think that would be acceptable, and a standard like that could be set. I don't think such regulation would impose a burden on startups.
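The threshold Altman sketches is just a cost test over the training cluster. A back-of-envelope sketch, with all numbers assumed for illustration:

```python
# A toy version of a compute-cost threshold: regulation attaches only
# to training runs whose cluster cost exceeds a floor. All figures are
# assumptions for illustration, not real policy numbers.

THRESHOLD_USD = 10_000_000_000           # the "$10 billion" floor

def is_covered(gpu_count: int, cost_per_gpu_usd: float) -> bool:
    """True only if the training cluster's cost exceeds the floor."""
    return gpu_count * cost_per_gpu_usd > THRESHOLD_USD

# A startup-scale cluster falls far below the floor...
print(is_covered(gpu_count=10_000, cost_per_gpu_usd=30_000))     # False ($0.3B)
# ...while a frontier-scale run crosses it.
print(is_covered(gpu_count=500_000, cost_per_gpu_usd=30_000))    # True ($15B)
```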

There are strong concerns that regulation may be overdone, and GPT-4 does not pose a substantial threat

Q: Nuclear material for building a weapon is something only a handful of actors can obtain, so you can compare that situation to the analogy of requiring nuclear inspections.

Altman: Actually, I think that's an interesting point. On regulation, I'd add that I'm very concerned about over-regulation. I believe we can err by doing too much, and we can also err by doing too little. But I do think we have a responsibility and a duty to talk about what we believe is coming and what it will take to get it right.

Q: The challenge is that we have existing regulations meant to protect people and society as a whole, but we're now facing a new kind of regulation that might give governments power we've never seen before to review code and trade secrets. For example, legislation proposed in California, and some proposed federal legislation, essentially requires the government to audit the model and software and to check and verify the model's parameters and weights. You can't deploy these models or software for commercial or public use until the government has certified them. To me it looks like people are scared of AI and its potential impact and want to control it, and one way to do that is to require an audit before release. I think these legislators may still have a limited understanding of AI. As you know better than anyone, with how fast the technology moves, those rules may no longer apply a year later.

Altman: The reason I advocate an agency-based approach for the macro-level issues, rather than writing them into law, is that I think within 12 months those laws would all prove to be wrong. Even if the legislators were world-class experts, I doubt they could get it right on a 12- or 24-month horizon. I don't believe in policies like "we must review all your source code and inspect your weights one by one"; I think there are a lot of unrealistic proposals out there. But just as an aircraft must pass a series of safety tests before it can be certified, which is completely different from reading all of its maker's code, we should review the model's outputs rather than its internals. I do think safety testing makes sense.

Q: How do we get there? I'm not asking just for OpenAI but for the entire industry, even for all of humanity. I fear that if we limit remarkable technologies that can greatly advance human progress, we may push ourselves back toward something like the Dark Ages. How do we shift the current mood and get there? Because momentum at the government level is building fast, and many people don't seem to be framing the issue correctly. Also, Llama's architectural decisions are interesting: the idea is to let Llama develop as freely as possible, with a separate system, Llama Guard, providing protective controls. Do you think that approach solves the problem correctly, or how do you see it?

Altman: Given the capabilities of current models, there will undoubtedly be some problems. I don't mean to downplay them or ignore their severity, but for models like GPT-4, I'm personally not worried about catastrophic risk. I believe there are a number of viable approaches to deploying this class of model safely. If we can agree on that, we may find more consensus. What you raise is particularly interesting for models that are technically capable of recursive self-improvement, even if they are not actually used that way, or capable of, say, autonomously designing and deploying biological weapons.

For those potentially threatening models, I think safety testing should happen at the international level. I don't think GPT-4 poses a substantial threat, and there are many safe ways to release a model like it. But you also understand that for things that can cause mass casualties, airplanes and many other examples, we're happy to have a testing framework in place to ensure safety. When I board a plane I don't usually worry about its safety, because I can assume it's safe, right? Right now, I think there's a lot of unnecessary worry beyond that.

The future may not just be about UBI, but more about universal basic computing resources

Q: On the question of jobs, you did some tests when you were at Y Combinator, and I think you did some research on UBI, and the results of your research will come out soon.

Altman: That was a research project that ran for five years and has now wrapped up; or rather, the project actually began five years ago. A pilot study was conducted in the initial phase, followed by the long-term study.

Q: Can you explain why you decided to launch the UBI? What was your original intention?

Altman: We started talking about this in 2016, around the time we began to take AI seriously. Our theory was that the changes coming to society, to employment, to the economy, and at a deeper level to things like the nature of the social contract, would be so great that we needed to do a lot of research into new ways of organizing society. I'm also not very happy with how the government has approached most policies aimed at helping the poor. I'm more inclined to believe that if you give people money directly, they will make good choices and the market will do its work.

I am a big proponent of raising basic living standards and reducing and eliminating poverty, but I am interested in finding solutions more effective than the existing social safety net and current approaches. Money may not solve every problem or make people happy by itself, but it may solve some problems and give people a better starting point to help themselves, and I'm very interested in that prospect. Now, watching how AI is developing, I wonder whether there's something better than traditional universal basic income. The future may be less about universal basic income and more about universal basic compute. For example, everyone could receive a certain slice of GPT-7's compute, which they could use, resell, or donate to others, say for cancer research. What you receive is not money but a share of productivity: you own a piece of the productive capacity itself.
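As a toy illustration of the "universal basic compute" idea, here is a minimal sketch of a ledger in which everyone holds a slice of model compute they can spend, resell, or donate. The units and mechanics are invented; nothing here reflects an actual OpenAI plan:

```python
# A toy ledger for "universal basic compute": each person is granted
# the same allotment of compute units, which can then be transferred
# (resold or donated) like any other asset. All mechanics are invented.

from collections import defaultdict

balances: dict[str, float] = defaultdict(float)

def grant(person: str, units: float) -> None:
    balances[person] += units            # the universal allotment

def transfer(src: str, dst: str, units: float) -> None:
    assert balances[src] >= units, "insufficient compute share"
    balances[src] -= units               # resale and donation are the same move
    balances[dst] += units

grant("alice", 100.0)                    # everyone receives the same slice
grant("bob", 100.0)
transfer("alice", "cancer-research-fund", 40.0)   # donate part of the slice
print(dict(balances))
```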

Hours after being fired, there was a state of utter confusion, and the board's intentions were sincere

Q: What exactly happened? You were fired and then came back. Was it an internal power struggle? Did someone betray you? Had you made a breakthrough in artificial general intelligence? What exactly was the situation? Please tell us.

Altman: After I was fired, I was deciding whether to come back, because I was very frustrated and a little lost. But then I realized how deeply I cared about OpenAI and the people there, and I decided to return. I understood that coming back would be hard, and in fact it was harder than I expected, but I felt it could be managed, and I agreed to return. The board spent some time figuring things out, and during that period we did our best to keep the team intact and keep serving our customers. Then we began making a new plan. In the meantime, the board had decided to appoint another interim CEO, Emmett Shear, which surprised many people; his tenure was very short.

I got a text message the night before, followed by a phone call, and after that everything became very confusing. My phone practically became an ornament; it vibrated nonstop, flooded with texts and incoming calls. Basically, I felt like I was fired over social media, the way it happened to some people during the Trump administration. I spent hours in my hotel room in a state of utter confusion, trying to figure out what to do next. Then things got really strange. I flew home around 3 p.m.; my phone rang the whole way, calls and messages barely letting up. That evening I met with some people face to face and decided, okay, I'll keep working on artificial general intelligence and stay optimistic about the future. The next morning I was on the phone with a couple of board members discussing my possible return, which kicked off another stretch of frenzy and chaos. Eventually, despite many crazy moments along the way, things worked out.

OpenAI had only a nonprofit board at the time, so all board members were independent directors. The board had shrunk to six members. They first removed then-chairman and president Greg Brockman from the board, and then fired me. That's what happened.

Q: I mean, was there a culture clash on the board between members with only nonprofit backgrounds and those with startup experience? If you're willing, can you share anything about the motivations behind those decisions, or any details you'd be comfortable revealing?

Altman: I always thought some culture clash was inevitable. Obviously, not every board member is someone I particularly like, but I have deep respect for how seriously they take artificial general intelligence and the importance of getting AI safety right. While I strongly disagree with some of their decisions and actions, and I do, I have never doubted their integrity or their commitment to our shared goal of achieving safe and beneficial artificial general intelligence.

Q: Do you think they made the right decisions in the process, or that they knew how to balance all the things that needed to be taken care of?

Altman: I don't think so. But I do believe their intentions were sincere, and their focus on artificial general intelligence and commitment to getting this technology right is serious.

The "$7 trillion" project is not an individual project, but a project of OpenAI

Q: I'd like to ask about OpenAI's mission, which has a clear goal of developing artificial general intelligence. This is indeed a very interesting goal. Many people think that if we succeed in creating artificial general intelligence, it could be an unexpected result, or even a sign that something is seriously wrong, which scares them a lot. However, OpenAI actually has this as its core mission. Does such a mission create more worries for the work you do? I understand that it can also be motivating, but how do you balance the two? Why did you choose such a mission?

Altman: I'll answer the first question first, then the second. I think the pursuit of artificial general intelligence does cause widespread fear, which is understandable; many people are scared of today's AI, and more so of a future AGI. At the same time, they are excited about current AI and even more excited about future possibilities, and that excitement is accompanied by deeper apprehension. We're grappling with these complicated emotions, but I think the arrival of artificial general intelligence is inevitable and will come to pass. I believe that, despite everything, it will bring great benefits. However, we do need to find a sensible path to that future. There will be a great deal of change, and change makes people uncomfortable, so we need to make the right decisions and adjustments in many areas.

Q: You are an exceptional dealmaker. I've watched you throughout your career, and you truly excel at deals. You have a wide network, you're highly skilled at fundraising, and you've made a lot of money. Your moves in the investment community, and the huge sums the companies you're involved with are raising to build chip fabs and so on, all show that strength. To be fair, we all know you didn't actually raise $7 trillion; that figure may be more like the eventual market value of whatever gets built. Setting that aside: why, despite all the deals you've made, is there a lack of trust in you? What is your motivation? What is the ultimate goal you're pursuing? Which opportunities should stay inside OpenAI, and which can accrue to you personally? Is it that people from the nonprofit world are suspicious of you?

Altman: Regarding the device company and the chip fabrication projects, these are not my personal projects but OpenAI's, and OpenAI will receive the corresponding equity. I understand why outsiders might see it otherwise, especially people who don't have to comment on these matters every day. That's fair, because we haven't announced them; they aren't done yet. I don't think most people in the world think about these issues as deeply as you do.

I also agree that the situation breeds conspiracy theories, especially among tech critics. If I could go back in time, I would have been explicit up front about how the equity would be held and made sure that was very clear. That way everyone would understand it; even so, I would still be involved, because I care deeply about artificial general intelligence and believe it is the most fascinating work in the world. At the very least, doing so would have clearly communicated the essence of the chip project to everyone. (Compilation/Mowgli)
