
ChatGPT and the new Moore's Law: is the door to the next systemic opportunity near?

Author: Matrix Partners

A new generation of AI is sweeping the world. From Silicon Valley and Wall Street to Zhongguancun, and in offices and university classrooms in every corner of the globe, people are hotly debating the future of ChatGPT and artificial intelligence.

Recently, OpenAI released GPT-4, taking AI to "unprecedented and irreversible new heights." GPT-4 now accepts image input, whereas GPT-3.5 accepted only text. Its "professional competence" has also improved dramatically: on a simulated bar exam, GPT-4 scored above roughly 90% of human test-takers, while the earlier GPT-3.5 scored above only about 10%.

Image understanding, advanced reasoning, and a vastly larger command of language are at the core of this GPT iteration. With its multimodal capability, GPT-4 can take both images and text as input and generate text based on the information users provide. This time, however, OpenAI disclosed neither model parameters nor data scale, nor any technical details or training methods, which may make it difficult for pursuers to imitate.

A few weeks earlier, OpenAI had also released the ChatGPT API, cutting the price to about US$2.7 (roughly 18 yuan) per million words of output, one-tenth the cost of the previous GPT-3.5. This seems to announce the arrival of the era of "one ChatGPT per person," and it also cements OpenAI's position as new infrastructure for the AI era, with a large number of new middle-layer application companies to follow.

At the same time, OpenAI founder Sam Altman proposed a "new Moore's Law" on Twitter: the total amount of intelligence in the universe doubles every 18 months.

Facing this "iPhone moment" of the AI era, we have invited senior experts from academia and industry, as well as investors and entrepreneurs, for a series of in-depth conversations.

This issue's special guest, Mr. Zhou Bowen, founder of Beijing Zhiyuan Technology, was formerly president of IBM Research's AI foundations research institute, chief scientist of the IBM Watson Group, and an IBM Distinguished Engineer, and later senior vice president of JD.com, chairman of its group technology committee, and president of its Cloud & AI division. He also serves as the Hui Yan Chair Professor at Tsinghua University and a tenured professor in Tsinghua's Department of Electronic Engineering.

Professor Zhou has more than 20 years of research experience in speech and natural language processing, multimodal knowledge representation and generation, human-computer dialogue, and trustworthy AI. As early as 2016, the natural language representation mechanism he proposed, combining self-attention with a multi-head mechanism, became one of the core ideas of the Transformer architecture. In addition, two widely cited natural language generation model architectures and algorithms in the AIGC field also originated with him.

On the industrial side, Professor Zhou founded Zhiyuan Technology at the end of 2021, adopting a vertical model that connects its own foundational large model, application scenarios, and end users into a closed loop. Its core products, CIP and ProductGPT, are trained on vertical datasets with instruction tuning added, so the model keeps improving through continuous tuning iterations. The goal is to help enterprises achieve product innovation and to surpass ChatGPT on vertical tracks with deeper, more accurate insight and innovation capability.

Facing such an exciting new trend, we had too many questions to discuss, so this article is long. Since the interview was conducted a few weeks ago, we have made only limited additions concerning GPT-4 and will not expand on it too much here; that discussion will come later. Enjoy:

1. What are the core breakthroughs behind ChatGPT? What are the core iterations in GPT-4?

2. Why has OpenAI persisted? Why did Google become a "fragile giant"?

3. Another model in the era of large models: vertical closed loops

4. Under the "data flywheel effect" of ChatGPT, how should Chinese companies catch up?

5. Summing up ChatGPT: its arrival is no surprise, its impact should not be underestimated, and its future is not to be feared


1

What are the core breakthroughs behind ChatGPT?

What are the core iterations in GPT-4?

Jingwei: OpenAI recently released GPT-4 and earlier cut the price of the ChatGPT API by 90%, establishing OpenAI as infrastructure for the AI era, with a wave of new middle-layer application companies to follow. What changes do you think this will bring?

Zhou Bowen: GPT-4 brings three major changes: multimodality, logical reasoning, and controllability.

The first is multimodality, which unifies the channel of human-computer collaborative interaction. GPT-4 has very powerful image understanding, supporting pixel-level processing of graphics and text: it can write code from design mockups, write out the solution steps for a photographed problem, and summarize and answer questions from document images. GPT-4's multimodal capability will inevitably give rise to a wider range of downstream applications; an era of a "Moore's Law" for agents has arrived.

The second is a significant improvement in understanding and generating complex long texts. GPT-4 raises the context limit to 32k tokens, enough to handle texts of more than 25,000 words, enabling long-form creation, extended dialogue, and document search and analysis. GPT-4 can integrate more complex and diverse training datasets and shows a significant improvement in logical reasoning over ChatGPT. It has now reached human-level performance on various professional and academic assessments, such as the US bar exam (MBE), Advanced Placement (AP) exams, and the SAT.

The third is controllability. GPT-4 has creative writing abilities, including writing songs and scripts and learning a user's writing style. In the fine-tuning process, OpenAI invested heavy human effort to ensure high-quality supervision signals. Compared with InstructGPT and ChatGPT, one can guess that RLHF in GPT-4 may be a more general paradigm, covering a wider range of tasks and scenarios.

GPT-4 still has limitations and room for improvement. The problems of GPT-3 and GPT-3.5, such as data staleness and "hallucination," largely persist in GPT-4. Although GPT-4 achieves excellent results on evaluation tasks, it still stumbles on some simple problems, which relates to how the model stores, locates, and modifies knowledge. Current large models are still built on the fully connected Transformer architecture, in which the mechanisms for controllable storage, localization, and modification of knowledge, and for continuous evolution, remain unknown; the representation of knowledge along the time-varying dimension is also deficient.

So far, GPT-4's public technical report and System Card demonstrate only its support for visual input and relatively shallow reasoning; it still needs to be evaluated and verified on harder, deeper reasoning tasks. GPT-4 also lacks the ability to understand and generate audio, video, and other modalities, perhaps related to the Transformer pre-training architecture; combining the image generation capability of diffusion models and building a unified multimodal understanding-and-generation model pose significant technical challenges for the future. Much of the work now being pushed by the AI research community builds multimodal capabilities on top of powerful language models, and leveraging language intelligence toward AGI is worth looking forward to.

ChatGPT opens a new stage of collaborative interaction, using interaction itself as a means of learning. GPT-4 goes a step further, using visual signals to provide better insights, generate new knowledge, and complete tasks. We therefore believe this new round of AI innovation will gradually shift from traditionally simple scenarios such as intelligent quality inspection and customer service to complex scenarios such as product innovation and knowledge discovery.

GPT-4 finished training last August, so many of the problems visible now may already have been resolved internally. It is undeniable that GPT-4 has enormous technical barriers that will be difficult to surpass in the short term. OpenAI predicted GPT-4's performance envelope through scaling laws, and GPT-4 is the strongest AI performance boundary we can currently see, which helps us reflect on the strengths and weaknesses of existing AI theory.

OpenAI is no longer open, and simply following is no longer an option. Participants in this new age of AI navigation need their own deep technical understanding and forward-looking judgment of technology trends; they need feedback and polishing from real scenarios; and they need pathfinding leaders who illuminate the way for everyone with their own small light.

Sam Altman proposed a "new Moore's Law" on Twitter a while ago, namely that "the total amount of intelligence in the universe will double every 18 months." I think it would be more accurate to say "the number of intelligent touchpoints will double," and that is already happening. OpenAI sharply cut the price of the ChatGPT API to accelerate its focus on developers and, through more developers, to explore more application scenarios and form a new AI ecosystem.

The cost of using large models comes mainly from two parts: training and inference. OpenAI's recent move to push inference cost as low as possible was to be expected and will keep happening: as a model is continuously optimized, its density and inference efficiency rise, and the cost of inference keeps falling.
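To make the economics concrete, here is a back-of-the-envelope sketch of per-call inference cost, using the $2.7-per-million-words figure quoted above as a hypothetical flat rate (real OpenAI pricing is per token and varies by model):

```python
def api_cost_usd(n_words: int, price_per_million: float = 2.7) -> float:
    """Estimate the cost of generating n_words of output at a flat
    price per million words (illustrative figure from the article;
    actual API billing is per token, not per word)."""
    return n_words / 1_000_000 * price_per_million

# A 500-word answer at $2.7 per million words:
cost = api_cost_usd(500)   # 0.00135 USD, i.e. about a tenth of a cent
```

At this price, an individual application can make thousands of calls for a few dollars, which is why the cut reads as an ecosystem-building move rather than a margin play.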

This is a challenging signal for competitors focused on general-purpose models, whether big companies or startup teams. In the future they must not only race to catch up technically and algorithmically, but also bear the high cost of model training and inference deployment, without holding pricing power over inference-call revenue. At the same time, they face the passive situation of developer ecosystems and user mindshare rapidly concentrating on "pioneers" such as OpenAI, and must complete their climb against that tide.

On the training side, however, OpenAI's price cut changes nothing fundamental, for example in matching ChatGPT's depth of insight and innovation in vertical scenarios. Of course, I don't expect OpenAI to venture into vertical markets at this stage; they are unlikely to delay the chance to occupy the whole platform market for the sake of one vertical.

Against this backdrop, large-model entrepreneurs who want to succeed must first find the right business model and moat and "enjoy the ride of this wave": believe that the growth in the number of intelligent touchpoints will make them develop faster rather than worse, while not being overwhelmed by the platform advantages (technology, heavy training investment, inference pricing power, and a rapidly cultivated and occupied ecosystem) of leading general-purpose large-model players such as OpenAI.

Jingwei: You were already working on artificial intelligence at IBM, and much of the research of that era, such as the Transformer, laid the foundation for ChatGPT's success today. What core advances do you think lie behind the major breakthroughs of the Transformer and ChatGPT?

Zhou Bowen: Yes, I began studying artificial intelligence as an undergraduate at the University of Science and Technology of China, then continued with speech and language understanding in graduate school and during my studies in the United States. After graduating from CU-Boulder, I went straight to the IBM T. J. Watson Research Center. At the time, IBM was one of the world's most powerful AI institutions in speech and language; groundbreaking work such as using machine learning for speech recognition and machine translation originated there. Many of those researchers later went into academia, to places like JHU, Yale, and CMU; some went to Wall Street to apply Hidden Markov Models (HMMs) to quantitative high-frequency trading. In the early days, my own research direction was to integrate speech recognition, natural language processing, and machine translation into speech-to-speech translation; later I worked on deep language understanding, representation learning, and reasoning.

Why has ChatGPT succeeded? I think we should first talk about the Transformer. As an extremely important supporting point for ChatGPT, it integrates several core breakthroughs:

The first core breakthrough was using self-attention and multi-head mechanisms to represent natural language. This core idea was first published in 2016 by the IBM team I led, in "A Structured Self-Attentive Sentence Embedding," which the 2017 Transformer paper acknowledged and cited.

Previously, the most common natural language representations were sequence-to-sequence models with attention mechanisms. For example, when an AI learns to answer a question, the input is the question and the output is the answer, each represented by a sequence model such as an RNN or LSTM; that is the sequence-to-sequence mode. On this basis, Bengio's group introduced attention, whose core insight is that not all words are equally important when answering a question: if you can identify the more critical parts from the correspondence between question and answer, and attend more to those parts, you can give a better answer. This attention model quickly gained very wide acceptance. Building on the idea, around 2015 I published some of the earliest widely cited AI generative models for natural language writing.

However, this approach has a problem: the attention is built from the given answers. An AI trained this way is, figuratively, like a college student who asks the teacher for the exam's key points before finals and then does a targeted (attention-driven) review. It may perform better on specific problems, but it does not generalize. We therefore proposed learning that does not depend on the given task or output at all: based only on the internal structure of the input text, the AI learns, through multiple readings, which parts are more important and how they relate to one another. That is representation learning with self-attention plus a multi-head mechanism. This learning mechanism looks only at the input, more like a student who studies the course repeatedly and systematically before the exam rather than cramming fragments around the expected questions. It is closer to the goal of general artificial intelligence and greatly strengthens the AI's ability to learn.
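The mechanism described above can be sketched in a few lines: every token attends to every other token in the input, and several independent "heads" each learn their own pattern of importance. This is illustrative NumPy under simplified assumptions (random weights, no masking or output projection), not the exact formulation of either paper:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over one sequence.
    X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_head) projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # token-to-token importance
    return softmax(scores) @ V               # weighted mix of value vectors

def multi_head(X, heads):
    """Concatenate the outputs of several independent attention heads."""
    return np.concatenate([self_attention(X, *w) for w in heads], axis=-1)

rng = np.random.default_rng(0)
d_model, d_head, n_heads, seq_len = 8, 4, 2, 5
X = rng.normal(size=(seq_len, d_model))      # toy token embeddings
heads = [tuple(rng.normal(size=(d_model, d_head)) for _ in range(3))
         for _ in range(n_heads)]
out = multi_head(X, heads)                   # (5, 8): one contextual vector per token
```

Note that the attention weights here are computed from the input alone, with no reference to any answer, which is exactly the "study the whole course, not the exam's key points" distinction drawn above.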

The second core breakthrough was the use of simple positional encoding and the abandonment of sequential neural structures such as RNNs and LSTMs. In my view this is the simplest and smartest point of that important paper: a simplification that freed the Transformer from the difficulty of parallelizing RNN/LSTM training, so it could be trained far more efficiently on far more data. The paper thus became a milestone for the field, driving the series of changes that followed and finally ushering in the era of large models. The Transformer paper's title, "Attention Is All You Need," also means "self-attention matters, multi-head attention matters, but RNNs may not matter as much as we previously thought." Incidentally, Ashish Vaswani, the first author of the Transformer paper, was a student I mentored at IBM who later joined the Google Brain team.
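The positional-encoding simplification can be sketched as follows. This is the sinusoidal scheme from the Transformer paper; the sequence length and dimension here are illustrative:

```python
import numpy as np

def sinusoidal_positions(seq_len: int, d_model: int) -> np.ndarray:
    """Fixed sinusoidal positional encodings, added to token embeddings
    so the model knows word order without any recurrent structure."""
    pos = np.arange(seq_len)[:, None]         # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]      # (1, d_model/2)
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)              # even dimensions
    pe[:, 1::2] = np.cos(angles)              # odd dimensions
    return pe

pe = sinusoidal_positions(seq_len=50, d_model=16)
# Because position is encoded directly into each vector, all 50 tokens
# can be processed in parallel -- no left-to-right RNN dependency.
```

This is the whole trick: word order becomes a property of the embedding rather than of the computation, which is what unlocked parallel training at scale.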

With that history in mind, consider ChatGPT's significance as a milestone. Its "predecessors" include IBM Deep Blue, IBM Watson, and DeepMind's AlphaGo. Each was the leading AI of its day, but the core difference from ChatGPT is that those systems were designed to compete with humans, proving AI's technical progress by beating humans in some domain.

In contrast, ChatGPT introduces instruction tuning, which emphasizes collaboration, interaction, and value alignment with people. After the long and not very successful exploration of GPT-1 and GPT-2, major engineering breakthroughs were realized at the GPT-3 stage. Today's ChatGPT builds on GPT-3 by introducing instruction tuning and human-in-the-loop reinforcement learning: through human annotation and feedback, the AI achieves value alignment, understands better, knows more clearly what makes a good answer, and keeps learning from that.

For example, ask an AI to explain the moon landing to a 6-year-old. GPT-3's base model can answer from many angles: gravity and physical principles, the US-Soviet Cold War that drove the landing, the Earth-Moon relationship in astronomy, or humanity's myths and legends about the Moon. Finding this information and weaving it into generated text is not hard; the difficulty is how GPT-3 can identify which answer is most suitable for a 6-year-old. That is value alignment.

The standard approach ranks answers by model probability. ChatGPT goes further: people select, score, and rank the candidate answers, and this feedback is used to fine-tune the GPT-3 model, aligning it with human intent and human standards of evaluation, thereby changing the model's parameters and its outputs.
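The ranking step described here is typically turned into a training signal with a pairwise loss: a reward model should score the human-preferred answer higher than the rejected one. A toy sketch of that loss (the full RLHF pipeline also fine-tunes the policy with reinforcement learning, which is omitted here):

```python
import math

def pairwise_ranking_loss(r_chosen: float, r_rejected: float) -> float:
    """-log sigmoid(r_chosen - r_rejected): near zero when the reward
    model already prefers the human-chosen answer, large when it has
    the order backwards."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# Reward model agrees with the human ranking -> low loss:
good = pairwise_ranking_loss(r_chosen=2.0, r_rejected=-1.0)
# Reward model disagrees -> high loss, pushing parameters to flip the order:
bad = pairwise_ranking_loss(r_chosen=-1.0, r_rejected=2.0)
```

Minimizing this loss over many human-ranked pairs is what bakes "which answer suits a 6-year-old" into the model's parameters rather than leaving it to raw probability.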

After such interaction with people, if you ask ChatGPT to write a fairy tale for a 6-year-old, it will learn on its own to begin with "once upon a time," because it is already in the context of a conversation with a 6-year-old and that is the better way to answer. So the more humans use ChatGPT, the smarter it becomes.

While marveling at ChatGPT's results, you may also realize that those results depend on how users ask questions, and on the skill and patience with which they guide ChatGPT to correct and iterate its answers. Strictly speaking, these amazing effects are co-created by users and the AI. For this reason, ChatGPT is the first milestone product in history built on interaction with people rather than competition with them, a people-centered product that serves people better, with limitless social value and potential. This matches the view I have long held in frontier AI research that AI's greater value will come from collaboration and interaction between people and their environment, which is why I joined Tsinghua in May 2022 and founded the Collaborative Interaction Intelligence Research Center in the Department of Electronic Engineering.

Admittedly, looking back, the limited parameter counts of GPT-1 and GPT-2 were also an important factor. GPT-1 had only 110 million parameters and GPT-2 only 1.5 billion; it was not until GPT-3 soared to 175 billion that capabilities surged, with more breakthrough results following. Given the enormous demands on compute and training investment along the way, one must admit that success required not only long-term research accumulation and clear, forward-looking guiding ideas, but also sufficient funding.


Zhou Bowen (center) at the IBM T.J. Watson Research Center, New York, summer 2001

2

Why has OpenAI persisted?

Why has Google become a "fragile giant"?

Jingwei: Behind the GPT large model lies a difficult entrepreneurial history: the first two generations often lost to Google's BERT for lack of maturity, until GPT-3 achieved a real leap. Many people admire OpenAI's perseverance through the repeated setbacks of GPT-1 and GPT-2, ultimately proving itself right. You know many of the key people in this industry; what is your view of how OpenAI kept going and finally succeeded?

Zhou Bowen: After the Transformer's success, everyone used it for a while to build all kinds of large models, but the NLP field split into two camps. One camp, including OpenAI, focused on left-to-right pre-training, having the AI learn to predict the next word and thereby gradually achieve natural language generation. The underlying idea is consistent with the self-attention concept our 2016 paper emphasized: do not let the AI use future information to learn. This is closer to the idea of general artificial intelligence.

The other camp, represented by Google's BERT, adopted a task-oriented mindset aimed at understanding natural language well: a passage should be read from left to right and from right to left, and the more you look, the stronger the understanding.
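Concretely, the two philosophies differ in the attention mask applied during pre-training: GPT-style models hide future tokens, while BERT-style models see the whole sequence in both directions. A minimal sketch:

```python
import numpy as np

def causal_mask(seq_len: int) -> np.ndarray:
    """GPT-style: token i may attend only to positions <= i, so the
    model must predict the next word without peeking at the future."""
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def bidirectional_mask(seq_len: int) -> np.ndarray:
    """BERT-style: every token attends to the full sequence,
    left and right, which suits understanding tasks."""
    return np.ones((seq_len, seq_len), dtype=bool)

c, b = causal_mask(4), bidirectional_mask(4)
# In a 4-token sequence, the first token of a causal model sees only
# itself, while in the bidirectional case it sees all four tokens.
```

The mask is applied to the attention scores before the softmax; everything else in the architecture can be identical, which is why this one choice so cleanly separates the two camps.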

In fact, there is no right or wrong between these two ideas; they reflect a huge philosophical difference, just like the self-attention we proposed: insist that students not read the exam questions before studying, but first understand the knowledge and then take the exam. That is why I think GPT's philosophy is better suited to truly general AI. But in the early stages the GPT line did frustrate OpenAI greatly; GPT-1 and GPT-2 failed to beat BERT, and it was not until GPT-3 that it could finally hold its head high.

There is another perspective worth noting: OpenAI's success was not achieved by one company alone; it relied on the support and help of the entire AI research community. As the English saying goes, "it takes a village to raise a child." OpenAI could keep betting on GPT partly because of the community's rich research into and analysis of large models. For instance, many researchers have worked to show that the lower and middle layers of GPT and Transformer models store lexical and syntactic knowledge, while much semantic and commonsense knowledge is stored in the middle and upper layers.

This verification and analysis work greatly strengthened the OpenAI team's confidence and sense of direction. Without those spontaneous research efforts, OpenAI might have struggled to persist. Imagine training on massive data for a long time only to find no evidence that the large model had learned any knowledge or reasoning, just statistical correlations, with no prospect of accumulation or future emergent abilities: who would have the resolve to keep going? ChatGPT's success rests precisely on the strong AI research community behind OpenAI and a healthy ecosystem integrating industry, academia, and research, which is worth learning from.

Jingwei: In terms of scale, OpenAI is just a startup of a few hundred people, while Google is a technology giant with tens of thousands of employees. Inside Google, in both technology and ideas, there is surely no shortage of AI work, yet compared with OpenAI there is no real product. One reason may be that Google's main profit comes from its search business, and generative AI could completely disrupt that business model. Is this another Kodak-and-digital-camera story?

Zhou Bowen: There are two levels here: the business level, and the decision-making level of large companies. Large companies, however strong they may seem, are in many cases fragile, especially at moments of generational technology transition.

ChatGPT's deep, AI-driven dialogue mode will greatly reduce the value of the search business itself, and the original business model of ranked search keywords may no longer hold, because users no longer need to wade through so many ranked links on a results page; Google's gross margin would decline rapidly. For Microsoft, which holds under 10% of search share and has long been stuck in second place, this is a once-in-a-lifetime opportunity, as its aggressive investment in the field shows.

At the same time, Microsoft's To B business and customer base are very diverse, and I think Microsoft's organizational capability is far superior to Google's. Microsoft can lean on its lead in To B while quickly retuning the whole organization to meet new challenges better than Google can, and fight a war of attrition with Google in search.

Moreover, Microsoft can embed ChatGPT in many more To B scenarios, where Google is dwarfed. I therefore believe investors in the AI era can no longer look down on the To B field. Previously, AI was not powerful enough as a productivity tool, so it became a "consumer toy"; now that AI has crossed the technical threshold, its impact on the B side will grow and grow. That is not to say To C is unimportant; the best model is still to win at both To B and To C.

At the decision-making level of large companies, there are always voices questioning their lack of innovation. But large companies rarely lack single-point innovation capability; the problems usually appear in systematic innovation, especially in coordinating and focusing internal resources. Large companies also carry heavy burdens. Google, for example, must maintain its technical image: if it judges a newly developed product not good enough, it will not open a public beta. ChatGPT had plenty of errors and problems at the start; had Google released it, the public and the press might not have been as tolerant as they are of a startup like OpenAI. Early in a technology's development there can even be political controversies, which can seriously affect a company's market value.

Combining these two factors, Google tends toward conservatism in launching similar products. But for this generative AI technology, a major threshold between GPT and ChatGPT is real interaction with a large number of users; without large-scale user feedback, that threshold can never be crossed, and once you fall behind you may stay behind. OpenAI dared to invest boldly and focused on designing and polishing a good product. Large companies shoulder pressures from market-cap management, capital efficiency, technical reputation, and social reputation, so their decision-making is easily distorted.

That is why startups like OpenAI run faster and steer more flexibly: they carry little of the big-company baggage and can push forward whatever the difficulty. Of course, at both Google and Microsoft there are colleagues and friends I respect, very smart people individually no less capable than those at OpenAI.

It is worth noting that Microsoft accomplished this by investing in OpenAI externally rather than building in-house: had it failed, the damage would have been limited to PR; once it succeeded, Microsoft won. That is a commendable display of investment vision and technique.

Having worked in large companies at home and abroad for a long time, I can say this decision-making problem is deep-rooted and cannot be changed by one person or one team. So for a large company, the best decision is to invest in a startup focused on the field while also innovating internally.

Jingwei: It is not only new companies; everyone needs to think actively about integration. For example, the first wave of consumer-facing change may come from Microsoft: integrating AI into Word, Excel, PowerPoint, and Outlook would be a huge scenario. At the same time, GPT will disrupt many SaaS companies; with a financial SaaS product, for instance, a customer might only need to type a question to get the answer directly. Do you think many companies will be threatened by this?

Zhou Bowen: For SaaS companies whose business touches the industry only shallowly, merely automating processes or aggregating information, the threat is indeed great: if all those processes are rebuilt on deep natural language understanding and collaborative interaction, not only does the entry barrier drop rapidly, the experience will far exceed current products. But if the business is deeply tied to the industry with very strong industry know-how, then ChatGPT's arrival can only help, not threaten, because ChatGPT currently cannot generate real insight and is not usable where precise answers are required.

In this situation, the barrier of the end-to-end vertical model will be deeper. Some SaaS companies with shallow industry ties may be able to rebuild their business on ChatGPT, but that capability is leveled with everyone else's: anyone can do the same thing, and the threshold is very low.

3

Another model in the era of large models: vertical closed loops

Jingwei: So in fact, you want to build an end-to-end, top-to-bottom model?

Zhou Bowen: Yes, everything is connected, from capability and scenario to user. From the underlying model to deep dialogue capability, it is tightly integrated with the scenario. This forms a closed loop of rapid iteration across the foundational large model, application scenarios, and end users, which is more valuable to users. We can also take user feedback to help iterate the base model, and we add instructions from industry experts to training, so the model keeps improving through continuous tuning iterations.

ChatGPT's model has the advantage of broad coverage, but the drawback of shallowness: it only integrates existing information. I believe that besides breadth, there will be another form of high-value AI application: on the basis of a certain breadth, going very deep in a specific field, even deeper than the professionals.

We say this because we expect AI to be able to do it within a decade. Daniel Kahneman, winner of the 2002 Nobel Prize in Economics, wrote the best-seller Thinking, Fast and Slow, in which he proposed two modes of human thinking, "System 1" and "System 2." System 1 is fast and effortless, based on intuition and experiential judgment; System 2 is slow, based on complex computation and logic, cognitively demanding, with a high cognitive threshold.

In the last wave of artificial intelligence, most people thought AI was suited to "System 1" work, such as face recognition or industrial quality inspection via pattern recognition, while "System 2" work was far beyond its abilities. AI was therefore deployed mostly in blue-collar scenarios, replacing repetitive work.

But I think AI's greater value lies in helping people do System 2 work more effectively and deeply: work that requires very complex reasoning, data and logic, that generates more innovation in specific areas, and that even creates new knowledge to complete more complex tasks. Recent advances in AIGC and large models are showing potential in exactly this direction. To truly cross the value threshold along this path, we should not aim at a large, all-encompassing field but narrow the field instead.

Based on these ideas, Zhiyuan has been developing its own large model powering CIP and ProductGPT to help enterprises achieve product innovation. They provide comprehensive analysis and detailed data support, as well as in-depth analysis by brand, category and characteristics, which genuinely helps professionals.

CIP and ProductGPT are collaborative interactive AI for vertical fields. According to our market validation and forecasts, they can multiply innovation opportunities by 10x, shorten the time-to-market cycle by nearly 10x, significantly reduce innovation costs, and bring enterprises more revenue, growth and profit. Our model aims to surpass ChatGPT in the field of product innovation.

Jingwei: OpenAI has also put forward the idea of an application middle layer: on top of OpenAI's GPT large model, various application fields are connected to form an intermediate layer. A large model like GPT covers widely but shallowly, so new companies can join without building their own models, using GPT directly and training on datasets from specific verticals such as medicine or law. Will such companies compete strongly with vertical closed-loop companies in the future?

Zhou Bowen: I divide this market into three categories. The first is startups like Zhiyuan Technology that build the underlying model themselves, from technical algorithms to model iteration to the scenario closed loop; this is the vertical model. The second builds on someone else's model (such as GPT) and combines it with their own industry know-how for training. The third is pure application: taking a model and using it directly, where the barriers are low.

Why do I think the first model is more competitive in the long run? Technically, because it forms a complete closed loop of infrastructure, large model, application scenarios and end users. When the company puts specific functions in front of end users, it generates a great deal of usage data, and that feedback improves both the application and the base model's capabilities, so the model keeps being optimized and iterated. You start end to end and slowly iterate toward larger business models. This also reduces training complexity: in terms of cost and speed, a technical team can run on the order of a hundred times more training experiments at a smaller cost, and those hundreds of runs rapidly polish the engineering, the various kinds of know-how, the engineering skills and the product experience.

As for whether the second category can succeed, I think it will take time to verify; it is not yet clear. Nobody yet knows an effective way to integrate industry know-how with a big model, or how such a company builds a moat and a sustainable business model.

From the perspective of OpenAI or the big vendors, they naturally like this "application middle layer" model. But whether it really works as infrastructure will only become clear after it has operated for a while.

But society definitely needs another model, because staying innovative also matters. If things become too centralized, say all the applications in the world are integrated into a single big model, the world becomes rather frightening. Such a model is trained on massive data feedback and has the ability to align with certain values, which would pose a huge challenge to the governance of human society.

There are technical reasons too. With only one general large-model approach, there is no way to see different technical directions iterate and compete. As mentioned earlier, without BERT to compete with GPT, GPT would not have developed so fast; only in competition does GPT gain momentum. Academic innovation and the technology ecosystem both need diversity; they cannot all concentrate on one big model, nor should everyone think in one way.

4

Under ChatGPT's "data flywheel effect",

How should Chinese companies catch up?

Jingwei: With the explosion of ChatGPT, China's AI-related companies also need to catch up, but OpenAI does have a first-mover advantage and enjoys the data flywheel effect. How do you think China's AI industry should chart its path to catching up?

Zhou Bowen: On the one hand, we need our own big model; on the other, we may have to start from vertical fields first. My view is to first learn, through a vertical-field model, how a big model works and how it interacts within a scenario, obtain more data, form a vertical data flywheel, and then see how the business model unfolds. Once the vertical field is done well, it is time to think about the big model.

Large models especially require a lot of engineering. Engineering means making enough attempts for engineers to gain experience and distill know-how, so the next attempt is more likely to succeed. At some stage, this process turns into whoever puts in more money gets to try more. But if every company invests heavily in its own large model and produces its own know-how, it will undoubtedly waste resources on duplication.

Focusing on a vertically integrated field of sufficient breadth, with saturation training on large amounts of data and a real closed loop of scenarios and user feedback, you obtain more vertical data, and the depth and reasoning capabilities of large models may emerge at lower cost. Besides, China's computing resources are very tight. If companies rush to build large models, and each needs 10,000 A100s, fierce internal competition could leave no one with 10,000 A100s in the end. Rather than such vicious competition, it is better to first build a solid vertical model on 100 A100s, scale to 1,000 as it generates value in toB or toC mode, and finally let the market concentrate computing resources, up to 10,000 cards, on whoever creates the most value. Starting from the vertical is therefore more in line with objective reality.

Of course, I firmly believe China will eventually have its own general-purpose big model, but the path need not completely mimic OpenAI's. OpenAI struggled for a long time with technical obstacles and bottlenecks as well as computing power and data dilemmas. Large companies, understandably, face more pressure from liability and from factors such as the impact on their own search business.

Jingwei: Yes, from the perspective of parameters, more is not necessarily better, and OpenAI itself has said GPT-4 would not be an excessively large parameter count. What parameter magnitude do you think is reasonable?

Zhou Bowen: It is true that more parameters are not necessarily better; adequate training matters more. With adequate training, 80 billion parameters may achieve better results than 100 billion. The parameter scale should also be increased gradually according to how training actually goes. It is also worth noting that before ChatGPT was released in 2022, many companies claimed their models had far more parameters than GPT-3, yet so far none matches ChatGPT's actual performance.

From a technical point of view, the complexity of a model, including its parameter count, should follow Occam's razor: if two models fit a hypothesis equally well, the one with fewer parameters is always preferable. Fewer parameters mean the model makes fewer assumptions and generalizes more easily. A similar idea is popularly summarized as the KISS principle: "Keep it simple, stupid!"
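The trade-off between parameter count and adequate training can be made concrete with a back-of-the-envelope calculation. The C ≈ 6·N·D approximation (training compute ≈ 6 × parameters × tokens) comes from the scaling-law literature (e.g. DeepMind's Chinchilla work), not from this interview; a minimal sketch:

```python
# Rough sketch: under a FIXED compute budget, a smaller model can be
# trained on more tokens. Uses the common approximation C ~= 6 * N * D
# (compute ~= 6 x parameters x training tokens), an assumption taken
# from the scaling-law literature, not from this interview.

def affordable_tokens(compute_flops: float, n_params: float) -> float:
    """Training tokens a model of n_params can see within compute_flops."""
    return compute_flops / (6.0 * n_params)

# Budget: enough compute to train a 100B-parameter model on 1T tokens.
budget = 6.0 * 100e9 * 1e12

d_100b = affordable_tokens(budget, 100e9)  # 1.00e12 tokens
d_80b = affordable_tokens(budget, 80e9)    # 1.25e12 tokens, 25% more data

print(f"100B model: {d_100b:.2e} tokens ({d_100b / 100e9:.1f} tokens/param)")
print(f" 80B model: {d_80b:.2e} tokens ({d_80b / 80e9:.1f} tokens/param)")
```

Under the same budget, the 80B model trains on 25% more data, which is one concrete sense in which an adequately trained 80B model can beat an under-trained 100B one.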

Jingwei: There is a conservative view that ChatGPT has great limitations: however amazing its current answers, it is essentially a statistical language model, one that looks at a lot of data and then predicts the next step from the statistics. If we feed it jumbled data, its answers become illogical. Holders of this view therefore doubt that, even given ever more parameters and data, it can ultimately become a general artificial intelligence. What do you think?

Zhou Bowen: First of all, I don't think ChatGPT equals general artificial intelligence. But it is genuinely an attempt to create better, more powerful AI.

At the same time, ChatGPT has many weaknesses. First, it lacks real insight, and its reasoning ability is simply insufficient. Second, it still integrates information at a relatively shallow semantic level; although it can distinguish different viewpoints and combine them, it lacks depth. Third, there is the question of its credibility with respect to knowledge and data.

In contrast, what Zhiyuan Technology wants to build is not a very broad general platform but AI trained on more vertical data to go deeper in a specific direction: one that gives finer, deeper and more accurate answers, better helping professionals complete insights and product innovation. That will be another new form of powerful artificial intelligence.
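The "predict the next step from statistics" behavior raised in the question can be illustrated with a toy bigram model. This is only a minimal caricature of statistical language modeling, not how ChatGPT (a large neural network tuned with human feedback) actually works:

```python
# Toy illustration of statistical next-word prediction: count which word
# follows which in a corpus, then predict the most frequent successor.
# This is a caricature for intuition, NOT ChatGPT's actual mechanism.
from collections import Counter, defaultdict

def train_bigram(corpus: str) -> dict:
    """Map each word to a Counter of the words observed right after it."""
    follows = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1
    return follows

def predict_next(model: dict, word: str):
    """Return the most frequent successor of `word`, or None if unseen."""
    counts = model.get(word)
    return counts.most_common(1)[0][0] if counts else None

model = train_bigram("the cat sat on the mat the cat ran")
print(predict_next(model, "the"))  # "cat" follows "the" twice, "mat" once
```

Feed such a model data whose statistics are jumbled and its "answers" degrade immediately, which is the intuition behind the conservative view in the question.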

5

Summary of ChatGPT

"The arrival is not surprising, the impact is not underestimated, and the future is not terrible"

Jingwei: In recent years, although new technologies such as AI painting, AI video, AI sound, and AI prediction of protein structure have emerged, they are still distributed in points. The launch of ChatGPT shocked the world in the form of a productized chatbot. What do you think of the future of AI?

Zhou Bowen: Many people have asked me lately what I think of ChatGPT; some are excited by its arrival, others apprehensive. My view can be summed up in three phrases: "Its arrival is no surprise, its impact is not to be underestimated, and its future is not to be feared."

"Arrive not surprised" means that this is not the kind of "Sputnik Moment", because many of the technologies and ideas in it are actually trends that have emerged since 2021. Therefore, this round is not too surprising for people who have been doing AI frontier and forward-looking research for a long time, and most of the core technological innovation points have appeared in 2021. Therefore, the emergence of integrated product innovation such as ChatGPT is inevitable, but there is a certain chance when and who will make it in the end.

"Impact not underestimated" means that ChatGPT will change a lot of things. The emergence of ChatGPT at this moment is a milestone, and its impact on human society will be reflected in all aspects such as economy and technology.

"The future is not terrible" means that I do not agree with the demonization of AI by many people, including Musk's so-called "crisis awareness". For now, at least, AI is controllable. In the future, government policymakers, academic research teams, entrepreneurs and legal professionals will continue to think about how AI should be integrated into human society.

Of course, there are problems right now. For example, ChatGPT has a rather sycophantic disposition: it tends to keep correcting itself to follow the answers it is fed. Human society, however, is full of contradiction, conflict and mixed information, and how ChatGPT should iterate while forming its own value system is a question well worth studying.

Intellectual property is another unavoidable issue. Much of ChatGPT's data rests on mass creation; when it comes to commercialization, how should the benefits be distributed? What's more, ChatGPT is not a simple collection but a fusion mechanism, so tracing sources, distributing benefits and clarifying everything involved will be very complicated.

There is also the problem of defining acceptable use. For example, some academic publishers do not allow authors to use ChatGPT, yet many non-native English speakers like to use it to fix grammar and polish sentences; such application scenarios are also worth discussing.

In short, ChatGPT is an epoch-making product. With it, AI has truly found an ignition point for applications, and it will continue to integrate and develop with all kinds of industries. Let me close with those three phrases, hoping everyone becomes familiar with the new era of AI that is happening and coming: "Its arrival is no surprise, its impact is not to be underestimated, and its future is not to be feared."
