
China Think Tank Report丨Is the Boom in AI Large Models Cause for Celebration or Concern?

Author: CNR

CNR, Beijing, June 26 (Reporter Zhang Mianmian): Competition among artificial intelligence large models around the world has turned white-hot since the release of large models represented by ChatGPT. The China Electronics Information Industry Development Research Institute (CCID) recently published "CCID Think Tank View丨A Cool-Headed Look at the AI Large-Model Fever".

What problems lie behind the rush of big companies to launch large models? How should the development of large-model technology be promoted? And how can a sound supervision mechanism for large models be established? Zhang Mianmian, a Voice of China reporter at CNR, spoke with Zhong Xinlong, head of the report project, researcher and senior consultant at the Future Industry Research Center of CCID.

Think tank article: "CCID Think Tank View丨A Cool-Headed Look at the AI Large-Model Fever"

Think tank: China Electronics Information Industry Development Research Institute (CCID)

Report author: Zhong Xinlong, researcher and senior consultant, Future Industry Research Center, CCID

A cluster of super-scholars versed in everything from astronomy to geography

An AI large model is an algorithm that uses big data and neural networks to simulate human thinking and creativity. It draws on massive amounts of data and deep-learning techniques to understand, generate, and predict new content, often with tens of billions or even trillions of parameters, and can exhibit intelligence across different domains and tasks. The familiar ChatGPT, for example, is such a "large model". What exactly can a large model do? How does it differ from the artificial intelligence of the past?

Reporter: AI large models are still unfamiliar to many people. Can you explain in plain language what an AI large model is?

Zhong Xinlong: The industry now defines it as a general artificial intelligence model. What is general artificial intelligence? It is like a super cluster of scholars that answers everything and responds anytime, anywhere: a very approachable, almost anthropomorphic tool. Generally speaking, a large model learns from vast amounts of data and then understands and generates the natural language we use every day.

Chinese and English, for example, are the natural languages of human society; the data can also include images, audio, and other common types. The large model, like that huge cluster of super-scholars, learns from examples the way humans do: we study model articles, look at example pictures, listen to typical music, and watch typical videos, and from these dimensions of information the model analyzes how the information of the human world is composed.

Reporter: How does such a cluster of super-scholars differ from the AI of the past? Is today's "large model" equivalent to the underlying layer on which AI applications are built?

Zhong Xinlong: Yes, you could say so. The field is usually divided at a time node in the second half of last year. Before that, AI was mainly in the era of decision-making AI. Speech recognition and speech-to-text, which everyday users encounter, along with image-to-text and similar applications, are examples: each is a relatively single-purpose algorithm built for one specific scenario, with no possibility of transferring to other scenarios. Nor could such algorithms meet the needs of generalized tasks; traditional decision-making AI can only be used in specific scenarios and according to specific patterns.

After large models like ChatGPT caught fire in the second half of last year, the industry began calling them "general artificial intelligence large models". They can solve more generalized problems and, most importantly, require no programming language such as C, Python, or Java. This is a very prominent advantage, because programming languages do not fit the habits of the general public, who communicate in natural languages such as Chinese and English. A large model of this kind can be used through everyday language, so there is no threshold: anyone can use it, any question can be asked directly, and the vast majority of questions receive a "decent" answer. This is how general artificial intelligence large models differ fundamentally from the decision-making AI of the past.

The large-model boom has arrived, but hidden worries such as cost warrant vigilance

At present, competition between domestic and foreign giants in the large-model field has become intense. OpenAI has become the benchmark leading large-model development and is expected to release a more advanced GPT-5 in the fourth quarter of this year. On May 24, Microsoft announced that Windows 11 would be connected to GPT-4; on May 10, Microsoft's direct competitor Google launched its new-generation large model PaLM 2, with more than 25 AI products and features connected to it; Amazon has partnered with AI startup Hugging Face, developer of the ChatGPT rival BLOOM. Leading domestic technology companies are also releasing self-developed large models in quick succession: Baidu released its Wenxin large model, and Alibaba released an ultra-large-scale language model. What commercial value lies behind this concentrated emergence of large models? And what are the biggest concerns?

Reporter: Can you first summarize the general characteristics of today's large models?

Zhong Xinlong: First, the more significant trend is that the industry has begun using large models to empower applications. Take Microsoft 365 Copilot, announced by Microsoft in March: it builds AI into Office applications to provide the most basic underlying application support. For example, you can generate a PPT with one click through dialogue, or give commands to process spreadsheets quickly. There are domestic benchmarks as well; Kingsoft should now be in internal testing of building large-model processing capabilities into WPS.

Second, the underlying trend of AI applications is the empowerment of vertical industries. Abroad, BloombergGPT is a typical example of a large model empowering the financial industry. Something never seen before is that almost every industry field now faces reconstruction around large models, and the foundation of industry intelligence will be rooted in them.

Of course, this requires iterative application development and time to mature; the next three to five years may see sustained effort in this direction. That is the international state of development.

Reporter: Why are leading enterprises at home and abroad flocking to develop large models? What advantages do they see in them?

Zhong Xinlong: At present, the industry mainly focuses on one point: the commercial value of the large model. To put it bluntly, the development of large-model applications ultimately depends on business value.

At the current, relatively preliminary stage of application, a GPT-style large language model can be connected to automated customer service and automated office-process systems, or used for auxiliary work with fixed processes and fixed templates, which can significantly improve efficiency and reduce costs. For example, art illustration, document customization, and the customer service of an e-commerce platform can all be extended with underlying access to a model, gaining higher usability and stronger effectiveness than the more mechanical AI of the past.
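
The fixed-process versus generalized-question split described above can be sketched in a few lines. This is an illustrative toy, and `call_llm` is a hypothetical stand-in for whatever model API a business actually uses, not a real client library:

```python
# Toy customer-service router: answer fixed-template questions cheaply and
# deterministically, and hand open-ended questions to a large model.

FAQ_TEMPLATES = {
    "refund": "Refunds are processed within 7 business days of approval.",
    "shipping": "Standard shipping takes 3-5 business days.",
}

def call_llm(question: str) -> str:
    # Placeholder for a real model call (e.g. an HTTP request to a hosted LLM).
    return f"[LLM draft reply for: {question}]"

def answer(question: str) -> str:
    q = question.lower()
    for keyword, reply in FAQ_TEMPLATES.items():
        if keyword in q:          # fixed-process path
            return reply
    return call_llm(question)     # generalized path

print(answer("How long does shipping take?"))
print(answer("Can you summarize my order history?"))
```

The design point is the one the report makes: the template path is the "mechanical" AI of the past, while the fallback path is where a large model adds generality.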

In addition, thanks to many years of accumulation, especially 10 to 20 years of information digitization, we have more usable data than at any other period in history, and data is the essential ingredient for training large models. Had large models arrived at an earlier historical moment, the training results might not have been good simply because there was not enough data.

Finally, in the AI field today, almost all research teams, including leading companies, are competing to develop the most advanced large models, and many of the leading models have begun to go open source. I think this is healthy competition, conducive to advancing the technology and promoting its sharing.

Reporter: Behind the concentrated emergence of large models at home and abroad, are there hidden worries at the technical level?

Zhong Xinlong: First, there is the cost problem. In terms of training cost, ChatGPT cost roughly millions of dollars per training run. In a highly iterative training process, the threshold for entry runs into the hundreds of millions, and that does not even include computing power. Include the cost of compute, say tens of thousands or hundreds of thousands of professional compute cards, and the figure reaches billions. Such high costs directly exclude many small research institutions and small and medium-sized companies, concentrating AI R&D in the hands of those with the most resources and aggravating inequality.
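
A back-of-the-envelope calculation shows where the "millions of dollars per run" figure comes from. This sketch uses the common heuristic that training a dense transformer takes roughly 6 × N × D floating-point operations (N parameters, D training tokens); every number below is an assumption chosen for illustration, not a published figure for any specific model:

```python
# Rough training-cost estimate under the 6*N*D FLOPs heuristic.

def training_cost_usd(n_params, n_tokens, flops_per_sec,
                      utilization, usd_per_gpu_hour, n_gpus):
    total_flops = 6 * n_params * n_tokens          # heuristic total compute
    effective = flops_per_sec * utilization * n_gpus  # cluster throughput
    hours = total_flops / effective / 3600
    return hours * n_gpus * usd_per_gpu_hour

cost = training_cost_usd(
    n_params=175e9,          # GPT-3-scale parameter count
    n_tokens=300e9,          # assumed training-token count
    flops_per_sec=312e12,    # one accelerator's peak throughput (assumed)
    utilization=0.3,         # realistic cluster efficiency (assumed)
    usd_per_gpu_hour=2.0,    # assumed cloud price per card-hour
    n_gpus=1024,
)
print(f"~${cost/1e6:.1f}M per training run")
```

Even with generous assumptions, a single run lands in the millions, which is why repeated iteration pushes total outlays far higher.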

Second, the race to train many large models consumes a great deal of energy; carbon emissions and electricity consumption raise both the economic and the environmental cost.

Third is the cost of data. Training a large model generally requires a great deal of data, and that data must be not only large in volume but also high in quality. Collecting and cleaning a large, high-quality training set takes substantial time and human resources.

Fourth, AI competition also brings maintenance and update costs. From GPT-2 to GPT-4, for example, the parameter count kept increasing, raising the cost of repeated iterative optimization and training as well as of operation and maintenance. Not only is the cost of updating rising, but operators must also perform ongoing maintenance, and the cost of repeated maintenance and updates grows non-linearly with the complexity and parameter scale of large models.

Solving problems such as privacy leakage and rising costs requires multiple measures

With the rapid development of large models, the industry is generally optimistic about their prospects, but risks such as privacy leakage keep appearing. How should such risks be understood? How can application empowerment achieve a business path of cost recovery and incremental revenue? And how can these problems be solved?

Reporter: Many people worry that large models will leak private information and bring risks.

Zhong Xinlong: To give a simple example: if such risks exist without basic prevention and correction, they become potential factors in crime. Methods and channels for committing violations and crimes through large models must be prohibited. Moreover, when you train your own large model, you will find that even the sources of good training sets carry certain copyright disputes.

Abroad, for example, companies such as Google and OpenAI sign copyright-authorization agreements with news publishers, which shows that copyright disputes do arise in assembling training datasets, so this too is a risk issue. Much work remains to be done on the governance system that large models require, especially in ethics and legal governance.

Reporter: Are there better solutions to the problems above, and to the very prominent cost problem?

Zhong Xinlong: In response to these problems, first, we should study whether there are good ways to reduce computing resources and energy consumption, for example by researching and adopting more efficient training algorithms and hardware optimizations that reduce the demand for computing resources and personnel.

Second, echoing the concern that AI is too concentrated at the top, could we consider establishing public computing resources and datasets, open to the public or to ordinary practitioners in the industry? That way, more small and medium-sized research institutions and companies could participate in large-model R&D.

Third, we must pay attention to data privacy protection. When collecting or using data, we need to comply with the laws and ethical provisions on data privacy and protection: use reasonable data masking and anonymization techniques, formulate usage-authorization rules with copyright owners, and draw up statements covering explicit user consent and risk factors.
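
The data-masking step mentioned above can be as simple as rule-based redaction applied before text enters a training set. This is a minimal sketch; the two patterns below are simplified examples for illustration, not a complete taxonomy of personal information:

```python
import re

# Minimal rule-based PII masking for text destined for a training corpus.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\b\d{3}[- ]?\d{4}[- ]?\d{4}\b")  # e.g. 11-digit mobile format

def mask_pii(text: str) -> str:
    text = EMAIL.sub("[EMAIL]", text)   # replace addresses with a placeholder
    text = PHONE.sub("[PHONE]", text)   # replace phone numbers likewise
    return text

sample = "Contact Zhang San at zhangsan@example.com or 138-1234-5678."
print(mask_pii(sample))
```

Real pipelines layer many more patterns (ID numbers, addresses, names) and often combine rules with learned entity recognizers, but the principle of masking before training is the same.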

Fourth, the interpretability of large models should be improved as much as possible.

Recall that when humans make decisions, every decision has a logical chain; we decide step by step, so human decisions can be explained. The outputs of today's large models, by contrast, are unexplainable, which leaves large models at a disadvantage in gaining public trust. By studying the interpretability of large models, society's trust can be earned at the level of underlying mechanisms and ethics, and governance can draw on safer, more reliable means.

Finally, preventing bias and protecting fairness is a core problem that almost all large models face today. In the data collection and training stages, discriminatory, unfair, or biased values may be present, and these issues inevitably risk surfacing in generated content. In response, it may be necessary to study algorithms that intervene for fairness, anti-discrimination, and value correction, and to correct and audit both data and models.
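
One concrete form such an audit can take is measuring whether a model's positive-outcome rate differs across groups (the demographic-parity gap). The data and numbers below are invented purely for illustration:

```python
from collections import defaultdict

def parity_gap(records):
    """records: iterable of (group, predicted_label) pairs; labels are 0/1.
    Returns the gap between the highest and lowest positive rates, plus rates."""
    totals, positives = defaultdict(int), defaultdict(int)
    for group, label in records:
        totals[group] += 1
        positives[group] += label
    rates = {g: positives[g] / totals[g] for g in totals}
    return max(rates.values()) - min(rates.values()), rates

# Toy audit: group A receives positive predictions twice as often as group B.
records = [("A", 1), ("A", 1), ("A", 0), ("B", 1), ("B", 0), ("B", 0)]
gap, rates = parity_gap(records)
print(rates, f"gap={gap:.2f}")
```

A large gap does not by itself prove discrimination, but it flags where the data or model warrants the correction and auditing the report calls for.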

Copyright statement: The copyright of this article belongs to CNR (Central Radio Network) and it may not be reproduced without authorization. For reprint permission, please contact: [email protected].
