丨Focus
1. The cost of a single training run for a large model is usually in the millions of dollars, making training a very expensive and compute-intensive process.
2. Parameters are the elements that make up the model itself, similar to indicators such as the number of neurons in a brain. The parameter count determines the complexity of the model, but larger is not always better.
3. Large model applications generally face four problems: insufficient computing power, data problems, the toolchain required for large model training, and a lack of professional teams and experience.
4. Model as a Service means users only need to pay attention to the model itself, without caring about the complex algorithms and engineering processes underneath it.
丨Overview
ChatGPT has been popular for seven months, and although attention has dipped slightly, "phenomenal product" is no longer enough to describe this product of the artificial intelligence era.
In the eyes of many, large models represented by GPT are a bridge to the era of artificial general intelligence (AGI). Whether giants such as Google and Microsoft or newly founded startup teams, all hope to bet on the large model track and secure a ticket to the coming era, giving rise to a "war of ten thousand models."
Although there is consensus on the enormous transformative power of large models and artificial intelligence, views still differ on specific models, algorithms, data, and computing power.
At 20:00 on July 5, "Liangjian" exclusively invited Hou Fang, product leader of the Tencent Cloud TI platform, for a live discussion of large model data, the value of computing power, the exploration of application scenarios, and the platform's approach to empowering large model entrepreneurship.
In Hou Fang's view, training large models is a very high-cost business. The parameter count needs to match the application scenario: the more suitable, the better. For computing power, on the other hand, the more the better. "The larger a model's parameter count, the more training is needed to fill those parameters, which also means higher cost."
Hou also shared the difficulties encountered when working with customers to promote large model applications, namely "insufficient computing power, data problems, the toolchain required for large model training, and the lack of professional teams and experience." He also offered his own suggestions for applying large models: "From current practical experience, the first problem to solve is determining the applicable scenarios."
Hou Fang believes that more and more platforms will join the ranks of enablers, launching industry large model "select stores" built on the Model-as-a-Service concept to help solve customers' problems with models and computing power. He also called on more teams to choose this route into the large model era, to solve the cost and efficiency problems of enterprise adoption.
The following is an edited transcript of the live session:
01
For parameter count, the more suitable the better; for computing power, the more the better
Liu Xingliang: First of all, does a larger parameter count mean a better large model?
Hou Fang: Large models look very intelligent and complex, but we can understand them simply as a series of matrices. These matrices hold numerical values, and the model's entire intelligence is realized through a large number of matrix operations.
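The "model as a series of matrices" view can be made concrete with a toy sketch: below, a tiny two-layer network is nothing but two parameter matrices (filled with random stand-in values here, not learned ones) and inference is nothing more than repeated matrix operations on them.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 8))   # parameter matrix of layer 1
W2 = rng.standard_normal((8, 3))   # parameter matrix of layer 2

def forward(x):
    """The whole 'model': two matrix multiplications plus a nonlinearity."""
    h = np.maximum(x @ W1, 0.0)    # hidden layer with ReLU
    return h @ W2                  # output layer

y = forward(rng.standard_normal((1, 4)))
print(y.shape)   # (1, 3)
```

Here the "parameter count" is simply the total number of entries in W1 and W2 (4×8 + 8×3 = 56); real large models differ in scale, not in kind.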
Regarding parameter scale, people quickly started making comparisons after ChatGPT appeared. Everyone marveled at the huge parameter counts and saw that they produced a large amount of intelligence.
As far as the model itself is concerned, larger parameters are not necessarily better, nor are smaller ones; what matters most is the problem you are trying to solve. Different problem scenarios may call for a larger, more specialized, more focused model, or only a very small one.
So in terms of parameter count, from my point of view, the most suitable one is the best.
Liu Xingliang: Are large model parameters the same thing as the data we understood in the past? How would you explain them in one sentence?
Hou Fang: In simple terms, large models learn from data and transform it into internal parameters; that is a relatively intuitive explanation.
Data refers to the corpus used for large model learning, that is, the knowledge we provide to the model. Parameters are the elements that make up the model itself, similar to indicators like the number of neurons in our brains; the parameter count reflects, to a certain extent, the model's complexity.
Note that the cost of training large models is closely tied to both data and parameters. The amount of data involved affects the training cost: learning 100 pieces of knowledge costs something completely different from learning 10 million, so data scale matters. Likewise, the larger the model's parameter count, the more training data and training time are required, which also means higher cost.
As you can see from ChatGPT, the cost of a single training run for a large model is usually in the millions of dollars, so this is a very expensive and compute-intensive process.
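The "cost scales with parameters and data" point can be sketched with a back-of-envelope estimate using the widely cited rule of thumb that training takes roughly 6 × parameters × tokens floating-point operations. The GPU peak throughput, utilization, and rental price below are illustrative assumptions, not figures from the interview.

```python
def training_cost_usd(params, tokens, gpu_flops=312e12, utilization=0.4,
                      gpu_hour_price=2.0):
    """Rough training-cost estimate in USD.

    params         -- model parameter count
    tokens         -- number of training tokens
    gpu_flops      -- assumed peak FLOP/s per GPU (312 TFLOP/s, an A100-class figure)
    utilization    -- assumed fraction of peak actually achieved
    gpu_hour_price -- assumed rental price per GPU-hour in USD
    """
    total_flops = 6 * params * tokens          # common scaling rule of thumb
    gpu_seconds = total_flops / (gpu_flops * utilization)
    gpu_hours = gpu_seconds / 3600
    return gpu_hours * gpu_hour_price

# GPT-3-scale numbers: 175B parameters trained on 300B tokens.
cost = training_cost_usd(175e9, 300e9)
print(f"~${cost / 1e6:.1f} million")   # → ~$1.4 million
```

Even under these optimistic assumptions a single run lands in the millions of dollars, and doubling either the parameter count or the token count doubles the estimate, which is exactly the coupling described above.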
Large model development process diagram, source: New Zhiyuan
Liu Xingliang: When it comes to large models, computing power is an unavoidable topic. What is computing power, and what is its value for models?
Hou Fang: Generally speaking, the greater the computing power, the better.
More computing power means shorter training times. The training process and the emergence of intelligence in large models still hold surprises, so the more computing power resources you have, the more possibilities there are to try in model training itself.
In AI large model training, computing power usually refers to GPU cards, that is, the graphics cards originally built for graphics processing. These cards are very good at parallel computing, with many processing units built in. Of course, CPUs can also provide a certain amount of computing power, since they are another form of computing unit. That is what computing power means.
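Why parallel hardware matters can be shown even without a GPU: model workloads are dominated by operations applied to many numbers at once. The vectorized form below does the same arithmetic as the element-by-element loop but in a parallel-friendly way; GPUs push exactly this kind of parallelism much further.

```python
import numpy as np

a = np.arange(1_000_000, dtype=np.float64)
b = np.arange(1_000_000, dtype=np.float64)

# Element-by-element (serial style):
slow = np.empty_like(a)
for i in range(len(a)):
    slow[i] = a[i] * b[i]

# Vectorized (parallel-friendly style), identical result:
fast = a * b

print(np.allclose(slow, fast))   # True
```

The two results are identical; the difference is purely how well each style maps onto hardware with many processing units.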
02
Four challenges for large model applications
Liu Xingliang: What are the needs of enterprises for big models today?
Hou Fang: Since large models emerged, the frequency of communication and cooperation with customers has increased significantly. Customers' ideas are endlessly diverse, and we are constantly meeting a wide variety of needs.
Among the customers I have dealt with, several examples are typical. The most common is in the customer service field: improving the experience and efficiency of intelligent customer service. Many enterprises face similar problems, whether in the sales department or in other back-office support departments; customer service upgrades are the type of customer need we currently encounter most.
The customer service field involves many aspects: not just directly generating answers, but also requirements for accuracy and for completing specific tasks, such as booking hotels and flights.
Beyond customer service, we have also encountered many generative needs, such as writing novels, official documents, reports, video scripts, and marketing copy. Needs here are very broad and diverse, because there are so many kinds of things to write; this is the second type of demand.
The third type of demand gradually favors professional fields, including programming, research report writing, conference minutes, and so on.
Liu Xingliang: What are the core problems that enterprises encounter in applying large models?
Hou Fang: Because of ChatGPT, everyone starts with fairly high expectations, thinking large models can solve all problems. When we work with our clients, we often encounter four difficulties: insufficient computing power, data problems, the toolchain required for large model training, and a lack of professional teams and experience.
The first is resources, especially computing power. Many enterprises want to train models at scale but have limited computing power of their own. For example, we have come across customers whose machines have only two graphics cards, yet they want to handle more complex tasks. This is not because they lack funds, but because their accumulation in this field is relatively thin, and available computing resources are hard to find in the current market; the shortage of computing power has become a common phenomenon.
The second issue is data. Many enterprises want to train their own models but face insufficient data volume or data quality, and model training depends heavily on both. In addition, engineering tasks such as data cleaning, mixing, and preprocessing require a great deal of work; data engineering is itself difficult.
The third problem is the toolchain required for large model training. Training a large model requires a complete toolchain, because it involves high-performance computing, high-performance storage, multi-machine multi-card scheduling, acceleration frameworks, and so on.
The last problem is the lack of professional teams and experience. Although there are more and more large models, many enterprises lack the teams and experience to implement them and, when actually deploying, face problems such as model selection, data processing, and the choice of training method.
Liu Xingliang: What advice would you give to enterprise managers in applying large models?
Hou Fang: From current experience, the first thing to solve is determining the applicable scenarios. In conversations with our customers, we found this was the first key issue we faced together.
Since large model technology is still relatively new, no one can claim their large model is the absolute best choice for a specific scenario, which requires continued exploration and practice.
The second is the construction of computing power, data, and platform tools, all of which are key factors to consider.
Finally, a more general recommendation is to embrace large models, because they introduce new ideas and interaction methods for much software and business design, including methods for solving problems in specific scenarios.
03
Artificial intelligence makes the Internet "work again"
Liu Xingliang: What are the reasons and objective conditions that have contributed to the current wave of entrepreneurship of large models?
Hou Fang: The emergence of large model technology is a fundamental change in the entire field of technology.
In the past, we paid more attention to algorithms. Then the field gradually moved into the deep learning stage, where the role of data began to stand out, though algorithms still carried considerable weight.
With the advent of large models, it turns out that as long as data and computing power are provided, the model can learn a great deal on its own. The importance of the algorithm itself is gradually decreasing, which is a very big change in the technology's development path.
From the application perspective, AI has landed in more and more enterprise scenarios over the past few years, its applications have broadened, and it has become integrated into people's daily lives.
Driven by the above factors, people are beginning to firmly embrace AI and large model technology.
Liu Xingliang: What is the value of AIGC and what changes does it bring to the Internet industry?
Hou Fang: The biggest feature of large models is that they are generative: by mapping a large amount of data and knowledge into a mathematical space, they deconstruct it along many dimensions. Once knowledge is deconstructed, it can be recombined to produce all kinds of real works, such as paintings, text, or other creations.
A large model runs on an enormous number of computations, which means it can learn and create in a very structured way, recombining across hundreds of billions of dimensions to produce incredible works. That is why it sometimes creates paintings or copy that surprise us, things humans themselves might not have thought of or written.
Therefore, in content creation, and especially content generation, AI and large model technology will bring many new techniques to assist or replace the human creative process, allowing us to generate content more intelligently.
Upstream, midstream, and downstream of AIGC. Source: "AIGC/AI-Generated Content Industry Outlook Report," Qubit Think Tank
Liu Xingliang: In terms of computing power and data, small teams have no real advantage, so isn't choosing a mature open-source large model for training the obvious good deal?
Hou Fang: In fact, many enterprises are now trying open-source models, and the development of large models is inseparable from the open-source community's contributions.
However, training with open-source models has its thresholds, requiring certain algorithm knowledge and industry experience.
From the perspective of practical deployment, I would recommend the industry large models provided by cloud vendors, which are usually optimized for specific industries and better suited to solving enterprises' actual problems.
04
Making large model entrepreneurship a "supermarket"?
Liu Xingliang: I know Tencent's industry large models have just been released. Tencent Cloud has also launched a large model select store and proposed the concept of Model as a Service (MaaS), which should be among the earlier industry large model offerings. So what is a large model select store, and what is Model as a Service?
Tencent Cloud MaaS creates a one-stop industry model selection store
Hou Fang: Let's start with MaaS. This concept is actually very interesting, and it is the product of continuous evolution.
From the earliest Infrastructure as a Service (IaaS), to the later PaaS, and now MaaS, the goal is to make cloud services themselves simpler and simpler. Previously, using a service required the customer's own business system to develop and integrate it; as MaaS services spread, the threshold for using AI services will keep dropping.
MaaS has great significance for AI and large models. It means we can completely wrap up the underlying internal work, so the user only needs to focus on one thing: request something from the model, and the model directly provides the corresponding service.
At present there are a huge number of models in the industry to choose from; by incomplete statistics, more than 800. With so many choices, we need to think about how to select and apply models, and similar problems exist on the algorithm side. Therefore, Tencent's idea is to curate large models from various industries on the TI platform, similar to a "model supermarket," selecting by customer scenario and industry characteristics to provide the most suitable model.
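What "the user only needs to request something from the model" looks like in practice can be sketched as a few lines of client code. Everything below, including the endpoint URL, model name, request fields, and response shape, is hypothetical and illustrative only; it is not Tencent Cloud TI's actual API.

```python
import json
import urllib.request

def build_request(prompt, model="industry-llm-demo"):
    """Assemble the JSON body for a hypothetical MaaS endpoint."""
    return {"model": model, "prompt": prompt, "max_tokens": 256}

def ask_model(prompt, endpoint="https://maas.example.com/v1/generate",
              api_key="YOUR_KEY"):
    """Send the prompt and return the generated text (response shape assumed)."""
    data = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        endpoint, data=data,
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["text"]
```

The caller never sees GPUs, training pipelines, or serving infrastructure; from the application's side, the entire large model collapses into one request-and-response call, which is the essence of the MaaS idea described above.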
Liu Xingliang: What is the core value of the model store to customers?
Hou Fang: High-quality large models across multiple industries, corresponding supporting tools, and helping enterprises apply large models at lower cost and higher efficiency: these are the three core values of the large model select store.
Liu Xingliang: From the actual observation, is the industry big model an effective way to implement the current AI big model technology?
Hou Fang: The industry large model is one of the more effective paths at present. When facing common challenges, including computing power and data, it provides a better solution. However, it is hard to say what direction development will take in the next five or ten years, because the industrial revolution of large models has only just begun, and no one yet knows what state it will evolve into.
Liu Xingliang: What are the ways to commercialize AI large models?
Hou Fang: The industry large model is an effective path, because commercial adoption must consider not only solving the problem but also ROI, which is the core issue everyone cares about in commercialization.
I believe a key factor in commercialization is creating value that solves customer problems, for both the provider and the demand side; only then is commercialization sustainable. Commercialization involves not just solving customer problems but also people's emotional and communication needs. Although that scenario may not necessarily involve industry large models, it is indeed a very good commercial landing scenario.