
Don't let ChatGPT run

Source: Screenshot of the movie "I, Robot"

Is ChatGPT just another AI gimmick?

On the surface, ChatGPT's popularity has recently cooled, which seems to vindicate those who have long been bearish on the AI industry: just like Deep Blue, which beat the world chess champion, and AlphaGo, which beat the world's top Go players, hot AI tools always calm down eventually.

These impressive tools always face the same unavoidable question: where is the path to commercialization?

In terms of use cases, no one except professional chess and Go players needs to play against a machine every day. As for ChatGPT, a large model trained on a vast body of language material, with more than 170 billion parameters, its most obvious application seems to be limited to writing content summaries and standardizing citation formats for academic papers, and helping authors dodge plagiarism checks. To be honest, ChatGPT does this so well that it has become something of an open secret among international students, to the point that a Chinese student wrote an app called GPTZero to detect ChatGPT-generated content in homework.

But that is about it. On the cost side, tens of millions of dollars in development and deployment expenses give outsiders a firm reason to look down on it: this so-called intelligent chat tool is simply too expensive. Not to mention that its most impressive parts, its grasp of human language and conversational logic and its "generative" creation of answers, are being rapidly "disenchanted" by a growing pile of embarrassing failures. Optimistic claims that it will replace search engines or disrupt intelligent voice assistants are fading away.

In the public eye, ChatGPT seems poised to follow the path of past AI tools: dazzling like a meteor, then falling silent.


Source: Screenshot of the "Matrix" movie

But is that really all there is to it?

The first to build an airplane

Sheng, a PhD student at Tsinghua working on pre-trained large models, spoke about ChatGPT with a mixture of excitement and nervousness.

"Just two years ago, whether to take the direction of pre-training large models was still being discussed by the entire academic community." Sheng said that the reason is exactly what was mentioned above, the cost of training a large model at one time is too high, and what results can be obtained is uncertain, and few people are willing to take risks. Players in related directions in China once tended to use the method of cooperating large and small models to improve the effect of AI tools, because the traditional view is that training on relatively small models is not necessarily worse than large models. More than one practitioner in the direction of AI also said that in the past, the industry paid far less attention to manually labeled data, and everyone did not expect that the reinforcement learning based on human feedback used by ChatGPT would be so good.

Until OpenAI launched ChatGPT.

"As much intelligence as there is human beings." This is a phrase that is often used as a joke in the field of artificial intelligence, and it is fitting to describe ChatGPT. As a pre-trained large model, it embodies the word "big" very well. On the one hand, compared with GPT1, the parameter scale of GPT3 is nearly 1500 times higher. On the other hand, thanks to the so-called "self-supervised learning" mechanism, the model can be trained using a large amount of text data from the Internet.

Large models of this level are unprecedented.

"Recent research tells us that when the model reaches a certain scale, there will be something emergent ability." Sheng said.

To some extent, OpenAI, the developer of ChatGPT, was gambling too. No one knew whether this road was passable, and it was OpenAI's persistent investment that finally proved that pre-trained large models have cognitive understanding and generalization abilities that ordinary models lack. In other words, pre-trained large models come very close to people's ideal of a "general-purpose" AI model.

Unlike AlphaGo, which was tailored specifically for Go, ChatGPT was not developed for a single narrow problem. It looks more like some kind of immature general-purpose AI model: it can answer open-ended questions and shows the potential to be deployed flexibly across many fields.

That is why ChatGPT matters: it shows people the power of pre-trained large models, and it means the third wave of AI has reached a critical node after more than a decade of development.

"ChatGPT / GPT-3.5 is an epoch-making product, it is almost the difference between missiles and bows and arrows from the previous common language models, and it must attract the highest degree of attention." An article trying to help the open source community replicate the GPT3.5 technology roadmap seriously points this out at the beginning. (https://zhuanlan.zhihu.com/p/593519656)

Sheng likened the birth of ChatGPT to the Wright brothers' invention of the airplane: "Everyone knew that airplanes could theoretically be built, but no one had ever seen one. ChatGPT is like someone suddenly putting an airplane in front of you. It may only fly 100 meters and it may break down easily, but it exists."


Source: Screenshot of the "Matrix" movie

Bigger than bigger: how much potential do large models still have?

Compared with what ChatGPT reveals about this critical node in the AI wave, its own defects and weak commercialization prospects are minor problems. What's more, for many practitioners, the shortcomings ChatGPT has exposed are not unsolvable.

One widely criticized flaw is the cutoff of its training data. ChatGPT was trained on a fixed dataset with a cutoff of September 2021, which means it knows nothing about what has happened in the world since then: the release of the iPhone 14, the U.S. midterm elections, even today's weather. In this respect, ChatGPT performs worse than any voice assistant on the market today.

Technically, though, this problem is not hard to solve. In fact, according to foreign media reports, Microsoft, which has a strategic partnership with OpenAI, plans to launch a new version of Bing with AI dialogue capabilities in March, combining its search engine with ChatGPT's capabilities; Microsoft even intends to bring similar capabilities to the Office suite.

The much-discussed cost problem also has many possible fixes at the algorithm level. For example, since ChatGPT's training shows that a machine can learn to imitate human behavior in answering questions, one direction worth exploring is to stop having the model recall pure facts and information from its own parameters and instead retrieve such content directly from the web. The model could then be scaled down without losing performance, and training costs would fall.
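The idea of offloading factual lookups from a model's parameters to an external source is essentially retrieval augmentation. A minimal toy sketch of that flow follows; the corpus, the keyword-overlap scoring, and the helper names are all illustrative assumptions, not anything from OpenAI:

```python
# Toy sketch of retrieval augmentation: instead of storing facts inside
# model parameters, fetch relevant text at question time and hand it to
# a (potentially much smaller) model as context.

# A stand-in for a live web or search index.
CORPUS = [
    "The iPhone 14 was released by Apple in September 2022.",
    "The United States held midterm elections in November 2022.",
    "AlphaGo defeated Lee Sedol at the game of Go in 2016.",
]

def retrieve(question: str) -> str:
    """Pick the document sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(CORPUS,
               key=lambda doc: len(q_words & set(doc.lower().split())))

def build_prompt(question: str) -> str:
    """Compose the prompt a downstream language model would receive."""
    context = retrieve(question)
    return f"Context: {context}\nQuestion: {question}\nAnswer:"

print(build_prompt("When was the iPhone 14 released?"))
```

A real system would replace the keyword overlap with a search engine or vector index, but the division of labor is the same: facts live outside the model, and the model only has to read and compose.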

As for commercial applications, beyond the relatively certain fields of text generation and intelligent assistants, large stretches of territory frankly remain undeveloped, yet many practitioners are optimistic.

"The difficult thing is the original innovation from 0 to 1, and the latter is not a problem." An AI researcher who works for a large factory said, "Especially in China, where the market is so big and everyone is so rolled, since the big model path has been proven to be feasible, then soon all smart people will join in." Sheng also expects that in as little as a year or two, there will be commercial products based on pre-trained large models.


Source: Screenshot of the "X-Men-Apocalypse" movie

A must-climb mountain

In fact, AI has been a rare hot track in this year's investment scene. But cost hangs like a binding spell over every player unable to develop pre-trained large models on its own.

Running a pre-trained large model at ChatGPT's level costs tens of millions of dollars, and commercial deployment in high-concurrency, multi-user tasks like chatbots only costs more; one estimate from Xiaoice CEO Li Di puts it at 300 million per day. This means only a very small number of organizations in China can afford this money-burning game, while most startups, and even many universities, are "talked out of it" by the price tag.

Sun, an investment manager at a top-tier domestic investment firm, has reviewed the pitch decks of countless AI projects this year without making a single investment: "Commercial projects are very pragmatic. Do you own the core technology? How high are your competitive barriers?"

Few Chinese companies have been able to respond to such cross-examinations.

In this situation, a company that wants AI capabilities in its product can only call a public large-model API (such as GPT-3), which amounts to placing its core capability in someone else's hands.

A very cruel example is Jasper.AI. Once valued at $1.5 billion, the text-generation company Jasper.AI was also built on the GPT-3 model underneath. When ChatGPT appeared almost without warning, Jasper's business took an immediate hit (The Information has told this story): Jasper's cheapest plan cost $29 and generated only 20,000 words, while ChatGPT cost users nothing and offered better interaction and performance.

What's more, OpenAI itself is under operational pressure. According to people inside OpenAI, the cost of pre-training large models is painful even for them. Since GPT-3, OpenAI's models are no longer open source; instead it sells paid API subscriptions (Jasper pays OpenAI a fee for access to the GPT API).

Developing its own large model is obviously what every company with AI ambitions should do.

In addition to OpenAI, Google, which proposed the Transformer architecture, also has LaMDA, a large language model built specifically for conversational applications, and MUM, a multimodal task model; both are considered comparable to ChatGPT in capability. In Silicon Valley, startups such as Perplexity and YouChat are building new chatbots on large language models. OpenAI has also signaled the existence of GPT-4, and the version number alone suggests this industry-leading large language model will go further still.


Source: Screenshot of the "X-Men-Apocalypse" movie

For China, then, time waits for no one. Pre-trained large models are a hard bone that must be gnawed, an AI "arms race" China cannot afford to miss. Blindly imitating others or relying on open-source model APIs means letting others keep a chokehold on the technology, and it would leave China at a disadvantage in the coming AI industry competition.

Sun offered an example that may not be perfectly apt: Einstein proposed the mass-energy equation in 1905; the Americans successfully tested the atomic bomb 40 years later, in 1945; and it took New China nearly 20 more years to master the technology.

The rapid development of AI technology will not give China such a long time to catch up.

The good news is that AI does not face technical barriers as unattainable as chip manufacturing's. Although no ChatGPT paper has been published, more than one AI engineer said that with the knowledge now publicly available, a top AI engineering team could likely reproduce a ChatGPT-like model, because "the technology itself is off the shelf."

The bad news is that we are running out of time.

ChatGPT was born of huge funding plus deep reserves of technology and talent, conditions that domestic giants can still just barely meet. But if China fails to catch up during the current, critically important window of opportunity, the accumulated experience of algorithm iteration will give AI companies like OpenAI structural technical barriers, a generation gap; and once that gap forms, it will be extremely hard to close.

Although machines still cannot truly think or innovate, the "intelligence" that "emerges" from pre-trained large models, together with their excellent generalization ability, means the AI industry will no longer be just another track. It will evolve into a basic productive resource like oil or the power grid, completely reshaping the entire information industry.

After more than a decade of development, the wave of AI built on deep learning has found a clear direction, and a scene as heady as the Gold Rush will follow, with countless opportunities and breakthroughs. To keep up with this blazing AI revolution, we must have our own ChatGPT. Otherwise, let it run too far ahead, and it will be too late to give chase.

Resources:

The Secret of ChatGPT Evolution https://zhuanlan.zhihu.com/p/593519656

The Best Little Unicorn in Texas: Jasper Was Winning the AI Race—Then ChatGPT Blew Up the Whole Game  https://www.theinformation.com/articles/the-best-little-unicorn-in-texas-jasper-was-winning-the-AI-race-then-ChatGPT-blew-up-the-whole-game
