
A glimpse of OpenAI's technology roadmap for the new year: Sam Altman's 12-item wish list

Author | CSDN

Compiled by | Su Mi and Yuan Gun

Produced by | CSDN(ID:CSDNnews)

On December 24, local time, Sam Altman opened a rare "wishing pool" on the X platform, asking "What do you want OpenAI to build/fix in 2024?" The tweet quickly drew responses from heavyweights in the AI field and ordinary netizens alike.

Two hours later, Sam Altman singled out the 12 most-requested wishes and vowed to "do everything we can to deliver (and many other things we're excited about but not mentioned here)", in effect a snapshot of OpenAI's roadmap for 2024.

  • AGI (please be patient)
  • GPT-5
  • Better voice mode
  • Higher rate limits
  • Better GPTs
  • Better reasoning ability
  • Control over degree of wokeness/behavior
  • Video processing/generation capabilities
  • Personalization
  • Better browsing/search
  • Sign in with OpenAI
  • Open source

For this wishing pool, even former GitHub CEO Nat Friedman showed up in the comments: "Please make sure the voice mode in ChatGPT is good enough to pass a 10-minute conversational Turing test, thank you!"


"In 2024, OpenAI will not have AGI"

In the past year, with the explosion of ChatGPT, the successive launches of large models such as GPT-4, GPT-4 Turbo, and DALL·E 3 have pushed AI development to a new climax. Many people are looking forward to doing more with AI as the underlying models mature.

Of course, it is not hard to see from the wish list that many people are hoping AI will make a breakthrough toward AGI in the new year.

There is no universal definition of AGI, but when ChatGPT itself is asked, the explanation it gives is an AI system with a level of intelligence similar to or beyond that of humans. With AGI, AI would be able to learn and adapt to a wide variety of tasks and domains the way a human does, with more comprehensive cognitive capacity. Achieving AGI is considered a long-term goal of artificial intelligence and one of the most challenging problems in computer science and AI research.

Previously, Nvidia CEO Jensen Huang predicted that we could see AGI within the next five years. Huang defines AGI as a piece of software or a computer that can complete tests reflecting basic intelligence and is "quite competitive" with a normal person.

However, for OpenAI, which specializes in foundational large models, AGI is bound to become the foundation of its AI products, not just a piece of software.

If AI does reach AGI, an era of deep symbiosis between humans and machines will arrive. Laws and regulations, application scenarios, and ethics will all need to be fully prepared for it; otherwise, AI could produce many uncontrollable situations.

That is why OpenAI is cautious about AGI. In an interview with Time magazine earlier this month, Altman said, "I think AGI will be the most powerful technology ever invented by humanity, especially in enabling democratized access to information around the world... Just like any other powerful technology that has come before, this will lead to incredible new things, but it will also come with some real negative effects."

Faced with netizens' high expectations this time, Sam Altman replied bluntly on X: "Wow, there were many more requests for AGI in the first 2 minutes than expected. Sorry to disappoint you, but I don't think we can achieve that goal in 2024..."


Will GPT-5 be stronger?

Compared with AGI, which is off the table for now, the arrival of GPT-5, the highly anticipated next-generation AI language model, looks far more promising.

In July this year, OpenAI filed a trademark application for GPT-5, and Sam Altman later revealed in an interview that the next-generation model, GPT-5, is under development and that he hopes investors such as Microsoft will provide further financial support.


There are hints everywhere that a next-generation model is being developed inside OpenAI. As for when GPT-5 will arrive, Sam Altman has cautioned: "There are many things we need to figure out before we can make the model we call GPT-5."

However, based on the pace of OpenAI's iterations, and on predictions from the Fireflies.ai community, we can set the following expectations for the next-generation "GPT-5":

1. Training data

GPT-3 and GPT-4 have 175 billion and reportedly more than 1 trillion parameters, respectively. Building on that, GPT-5 is expected to be trained on even larger datasets and could reach several trillion parameters.

Meanwhile, in August 2023, OpenAI released GPTBot, a web crawler that can expand its datasets by collecting publicly available information from the internet while taking copyright into account. The industry has read this move as OpenAI hoping to use the tool to help train GPT-5-related models.
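
According to OpenAI's GPTBot documentation, site owners who do not want their content crawled can opt out via robots.txt; a full opt-out looks roughly like this:

```
User-agent: GPTBot
Disallow: /
```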

2. Higher accuracy

Although GPT-4 is currently the most advanced AI model in the industry, it is still not immune to hallucinations and false or misleading information.

Judging by the iterations of OpenAI's previous model versions, however, accuracy is one dimension that is bound to be upgraded. According to OpenAI's report, GPT-4 hallucinates significantly less than GPT-3 and earlier versions, reaching an accuracy level of over 80% in the science and history categories, with significant improvements in other categories as well.


GPT-5 is expected to have less than 10% hallucinations so that users can trust the language model.

3. Comprehensive multimodality

Given the rise of multimodal AI systems like Microsoft's Bing Chat and Google Bard, many have speculated that GPT-5 is likely to be upgraded with comprehensive multimodal capabilities, potentially having the ability to process and generate text, images, audio, video, and 3D content more smoothly.
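
A first step toward such multimodality is already exposed in OpenAI's API: GPT-4 with vision accepts images alongside text. Below is a minimal sketch using the OpenAI Python SDK; the image URL is a placeholder and the model name reflects the preview available at the time of writing.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Ask GPT-4 with vision to describe an image given by URL
response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what is happening in this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
    max_tokens=300,
)
print(response.choices[0].message.content)
```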

4. Cost-effective scaling

Not long ago, Anthropic, widely regarded as OpenAI's strongest competitor, released Claude Pro. It charges the same as ChatGPT Plus, but while ChatGPT Plus users are limited to 50 messages every three hours, Claude Pro users can send at least 100 messages to Claude 2 every eight hours. That sets a new industry benchmark and naturally puts some pressure on OpenAI.

To compete effectively with Claude Pro, OpenAI needs to address key challenges around cost, scale, and performance. Whether GPT-5 will overcome them remains to be seen.


Other wishes on the list

In addition, netizens also hope that OpenAI can implement and fix some of the following features:

  • Better voice mode

Last month, on the first day after its internal turmoil settled, OpenAI quietly announced on X that ChatGPT's voice feature is now free and open to all users, powered mainly by the Whisper model. The voice feature is available in the ChatGPT mobile apps for iOS and Android.

There are limitations, though: ChatGPT offers only five voices to choose from (Breeze, Ember, Cove, Juniper, and Sky).

In the new year, many users hope OpenAI will support more voice and language options, and they also look forward to the feature being added to the web version.
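
For developers, the underlying pieces are already available through the API: Whisper for speech-to-text and a separate text-to-speech endpoint. The sketch below assumes the OpenAI Python SDK; the file names are illustrative, and the API's voices (alloy, echo, fable, onyx, nova, shimmer) differ from the five offered in the ChatGPT app.

```python
from openai import OpenAI

client = OpenAI()

# Speech-to-text with Whisper
with open("question.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )
print("User said:", transcript.text)

# Text-to-speech for the reply
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="Sure, here is the answer to your question.",
)
speech.stream_to_file("reply.mp3")
```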

  • Higher rate limits

The access limits OpenAI sets for ChatGPT and GPT-4 refer to caps on the number of messages each user can send within a given period, and on the number of times a user or client can call the servers within a specified window.

With rate limiting, OpenAI can prevent users from abusing or misusing the APIs, ensure everyone gets fair access, and manage the total load on its underlying infrastructure.

Of course, different account types and usage tiers have different rate limits. The table below shows the default limits for the OpenAI API, measured in two ways: RPM (requests per minute) and TPM (tokens per minute).

[Table: default OpenAI API rate limits by account type and usage tier, in RPM and TPM]

Of course, in special circumstances or with a strong justification, you can also apply to OpenAI for a rate limit increase. As AI application scenarios multiply, more and more users hope OpenAI will simply raise the limits across the board first.

It is easy to imagine that raising rate limits would also come at a cost for OpenAI, whether more computing power and infrastructure, greater network bandwidth, or software-side measures such as better algorithms, more parallelization, and lower latency.
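
Until limits are raised, API callers mostly have to cope on the client side. Below is a minimal retry-with-backoff sketch using the OpenAI Python SDK; the model name and retry parameters are illustrative.

```python
import time

from openai import OpenAI, RateLimitError

client = OpenAI()

def chat_with_backoff(messages, max_retries=5):
    """Call the chat endpoint, backing off exponentially on rate-limit errors."""
    delay = 1.0
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4-1106-preview",
                messages=messages,
            )
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            time.sleep(delay)  # wait before retrying
            delay *= 2         # exponential backoff

reply = chat_with_backoff([{"role": "user", "content": "Hello!"}])
print(reply.choices[0].message.content)
```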

  • Better GPTs

In November, Sam Altman announced that "GPTs are now available to all ChatGPT+ subscribers", ushering in an era in which anyone can build an agent with zero code. At its inaugural DevDay, OpenAI also announced that it would launch a GPT Store to help verified developers monetize their creations.

Unfortunately, OpenAI then descended into boardroom turmoil. Although Sam Altman returned a few days after the original board ousted him, the episode disrupted the planned product launches.

Because of this unexpected episode, OpenAI informed users that the GPT Store would be postponed to 2024. Looking ahead, richer GPTs should not be hard to deliver in the new year.
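
GPTs themselves are configured in the ChatGPT interface, but the Assistants API announced at the same DevDay is roughly their programmatic counterpart. The sketch below shows the basic flow with the OpenAI Python SDK; the assistant's name and instructions are made up, and a real application would poll the run's status until it completes.

```python
from openai import OpenAI

client = OpenAI()

# Create an assistant, analogous to defining a custom GPT
assistant = client.beta.assistants.create(
    name="CSV helper",
    instructions="You help users analyze the CSV files they describe.",
    model="gpt-4-1106-preview",
    tools=[{"type": "code_interpreter"}],
)

# Start a conversation thread and add a user message
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What summary statistics would you compute for a sales CSV?",
)

# Kick off a run; in real code, poll run.status until it is "completed"
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)
print(run.id, run.status)
```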

  • Better reasoning ability

Reasoning ability refers to a large model's capacity to handle complex tasks, solve problems, and generate logically coherent text. It involves understanding and applying existing knowledge, then performing inference, induction, and deduction to produce accurate and well-founded conclusions.

Reasoning could be improved through better model architectures, larger-scale training data, improved pre-training and fine-tuning strategies, multi-task learning, and the incorporation of external knowledge and context.
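
Those are training-side levers; on the usage side, prompting techniques such as chain-of-thought can already coax more careful reasoning out of today's models. A minimal sketch with the OpenAI Python SDK follows; the prompt and model name are illustrative.

```python
from openai import OpenAI

client = OpenAI()

# Chain-of-thought style prompt: ask the model to reason step by step before answering
response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {"role": "system", "content": "Think through the problem step by step, then state the final answer on its own line."},
        {"role": "user", "content": "A train travels 120 km in 1.5 hours, then 80 km in 1 hour. What is its average speed?"},
    ],
)
print(response.choices[0].message.content)
```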

  • Control over degree of wokeness/behavior

This is a matter of AI ethics and safety. AI technology needs to be applied and developed within legal, ethical, and social frameworks to ensure it serves human interests and well-being. Of course, this is not something OpenAI can solve alone; it requires governments, regulators, developers, and research institutions working together to ensure that AI systems are used reliably, transparently, and under control.

  • Video processing and generation capabilities

At present, compared with text, audio, and images, large models' ability to process video remains limited, and handling video data requires more complex architectures and techniques.

Video data is usually high-dimensional and has a large amount of time series information, which is more complex and time-consuming to process. In addition, video processing also involves the recognition, tracking, and action understanding of visual content, which requires deeper visual comprehension capabilities.

This is also the next stop that OpenAI and many other large-model companies are working toward.

  • Personalization

Personalizing large models has also become a mainstream trend. The goal is to produce outputs that better match each user's individual needs and to improve satisfaction and experience. It can be achieved by taking the user's personal information, context, and feedback into account, and its more human-like behavior is what fundamentally distinguishes it from a generic large model.

However, when implementing personalization, developers like OpenAI need to balance personalization and privacy protection.
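
One lightweight way developers approximate personalization today is to inject a user profile into the system prompt, much like ChatGPT's custom instructions. The sketch below is illustrative only: the profile fields are invented, and in practice such data should be collected and stored only with the user's consent.

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical user profile, collected with the user's consent
profile = {
    "name": "Alex",
    "role": "backend developer",
    "preferences": "concise answers with Python examples",
}

system_prompt = (
    f"The user is {profile['name']}, a {profile['role']}. "
    f"They prefer {profile['preferences']}."
)

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "How should I cache API responses?"},
    ],
)
print(response.choices[0].message.content)
```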

  • Better browsing/search

ChatGPT's knowledge cutoff has long drawn complaints from netizens: GPT-3.5's knowledge ends in September 2021 and GPT-4's in April 2023. For ordinary language tasks that is good enough, but for tasks involving news and current knowledge, users demand timeliness and hope OpenAI will strengthen real-time web search in the product.
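
Until built-in browsing satisfies these needs, a common workaround is retrieval augmentation: fetch fresh results from a search service yourself and feed them into the prompt. In the sketch below, web_search is a hypothetical helper standing in for whatever search API you choose.

```python
from openai import OpenAI

client = OpenAI()

def web_search(query: str) -> list[str]:
    """Hypothetical helper: call your search API of choice and return text snippets."""
    raise NotImplementedError("plug in a real search API here")

def answer_with_fresh_context(question: str) -> str:
    snippets = web_search(question)
    context = "\n".join(f"- {s}" for s in snippets)
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",
        messages=[
            {"role": "system", "content": "Answer using only the search snippets provided; say so if they are insufficient."},
            {"role": "user", "content": f"Search snippets:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```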

  • Sign in with OpenAI

Some netizens hope that OpenAI accounts will support social login on other websites, which is a reasonable request: ChatGPT has become the fastest-growing product in history, with over 180 million users and roughly 100 million weekly active users according to current statistics.


This request also reflects netizens' expectation that OpenAI will keep growing; after all, social login only really makes sense for an application operating at that kind of massive scale.

  • Open source

Open source was actually the most popular New Year's wish in the comments, with some developers saying they want OpenAI to open-source its weights and datasets, yet Altman listed open source last.

OpenAI has been mocked as "CloseAI" over its closed-source stance, and whether models that claim to be open source are genuinely open has likewise sparked debate across the industry.

OpenAI's commanding lead has pushed other players to bet on open-source strategies in an attempt to overtake it on the curve. Meta moved fastest with Llama 2, which has driven the rapid development of the global large-model ecosystem over the past few months, and the vigorous growth of the Llama 2 ecosystem has given Meta hope of catching up through its ecosystem play.

At this year's 1024 Programmer's Day, Zhang Jiaxing, a chair scientist at the IDEA Research Institute, said: "Open-source code works like the high seas: everyone contributes, and the initiators of open-source projects gain a great deal. But an open-source model is different from open-source code; if a parameter is modified, the model's performance changes too. Once a model is open-sourced, continued training becomes possible, and if people keep training it, the model's lineage grows very large and takes on a tree structure. From another point of view, we also hope everyone can be truly open source, for example by releasing more training code and training data, which would genuinely help developers continue training and fine-tuning."

Yang Zhilin, founder of Moonshot AI (whose Chinese name means "the dark side of the moon"), has his own view on open-sourcing large models: "A team should choose between open and closed source according to its own direction. If, like OpenAI, you plan to stay closed source, that may be the only path to a super app, while open source is merely a user-acquisition tool for the ToB market."

Yang Zhilin believes that "anything that wants to be a consumer-facing super app will be closed source."


In closing

AI has come a long way, but there are still many challenges and limitations:

From a technical point of view, the development of AI is limited by computing power, data quality, and algorithm architecture. As hardware advances and algorithms continue to improve, it can be expected that the performance of AI will continue to improve.

In addition, the development of AI is also constrained by ethical, legal, and social factors. AI technology has sparked a flurry of discussions about privacy, data security, employment impact, distribution of responsibilities, and more. These issues need to be considered comprehensively and appropriate norms and policies should be developed as AI evolves.

OpenAI remains a unicorn of the AI field; let's wait and see what amazing features and products it brings in 2024.

Reference:

https://www.linkedin.com/pulse/what-expect-from-gpt-5-fireflies-inc-vll6f/?trk=article-ssr-frontend-pulse_more-articles_related-content-card

https://twitter.com/sama/status/1738673279085457661
