
With industry-specific large models already in place, why did Tencent launch a general-purpose large model?

On September 7, Tencent's general-purpose large model "Hunyuan" was officially unveiled at the 2023 Tencent Global Digital Ecosystem Conference. On stage, Tencent executives demonstrated the Hunyuan large model, which has already been connected to 50 Tencent businesses and products, including Tencent Cloud, Tencent Advertising, and Tencent Games.

During the conference, Jiang Jie, Vice President of Tencent Group, spoke with a South+ (Nanfang+) reporter about the development strategy and business path of Tencent's general-purpose large model.

On applications:

Tencent's internal businesses serve as the "grindstone"

South+: In June, Tencent Cloud released its industry-specific large models, and the Hunyuan large model is now being applied not only on the B side but also in many C-side scenarios. How does Tencent position itself in the overall large model ecosystem?

Jiang Jie: Tencent first released industry-specific large models in June, and now the general-purpose model has been officially unveiled. Back in June, industries and customers already had strong demand for large models. Beyond the industries already covered, the general-purpose model will support more fields, and it will also serve as the foundation of Tencent Cloud's MaaS offering to further serve customers.

Over the past few months, we have been honing the model on Tencent's internal applications, treating Tencent's important and rich business scenarios as the "grindstone" before taking it out to serve more enterprises. Applications such as meetings and documents operate in complex environments, and thorough internal polishing is what gives us confidence, which is the most critical thing. We hope the Tencent Hunyuan model can become a "multiplier" for these businesses.

South+: What role do you hope large model applications will play?

Jiang Jie: Tencent Meeting, Tencent Docs, Tencent Advertising, and other products with large user bases have already been connected to the Hunyuan model and apply it in depth. When building this model, we first serve Tencent's own businesses, and then serve customers and ecosystem partners externally through Tencent Cloud. For a general-purpose large model, logical thinking and reasoning ability are critical: it needs not only complex reasoning capabilities, but also sound judgment on the security issues that arise in the course of that reasoning. We hope the large language model can genuinely bring convenience to people's lives and efficiency to their work.

South+: Training and storing large models involves large amounts of personal and sensitive data. How does Tencent ensure the security and privacy of this data and prevent leakage and abuse?

Jiang Jie: This is a privacy issue that is not directly tied to large models themselves. With or without large models, Tencent strictly complies with legal requirements: whether we build small models, large models, or large language models, we do not use personal private data. In addition, Tencent's content products provide a large-scale, high-quality, and diverse corpus for the Hunyuan model, allowing it to learn rich language knowledge and contextual understanding across various application scenarios.

On commercialization:

From internal use to customized solutions for customers

South+: How do you view the commercialization of large models?

Jiang Jie: Whether large models can generate solid B-side commercial revenue in the short term still needs to be explored. Large models are not yet mature enough, and their ability to handle complex tasks is limited; many serious, professional scenarios cannot yet be unlocked, so application scenarios remain quite limited. We need to improve together with teams across the industry and even academia. Tencent's Hunyuan large model system is built first on Tencent's own applications and R&D, and then combined more deeply with large-model applications, in order to offset the high equipment, training, and personnel costs of the whole effort.

South+: Is there a clearer path to commercialization?

Jiang Jie: The first thing is to get the technology itself right and return to the essence of technology. As for commercialization, we open all of Hunyuan's capabilities to every Tencent business; they are shared internally, and everything is used and iterated on Tencent's machine learning platform. For example, the work with Tencent Docs and Tencent Meeting is deeply combined with each business before external release, and the Hunyuan model needs more data annotation, more frameworks, and more data for training. Internally, you can think of Hunyuan as an internally open-sourced model: each business unit can see its capabilities and build applications on top of them. On the B side, it will be opened to the public through Tencent Cloud APIs. In the future, if an industry needs deep customization, Tencent Cloud will provide that service as well.

On self-development:

Full mastery of the technology enables better iteration

Jiang Jie: Why pursue full-link self-development? There are many open-source models, and with their help you can build things on top of them. But if you do not develop from scratch yourself, you never fully master the technology. For example, if a model trained by someone else gives a wrong answer involving illegal or harmful information, you cannot make deeper modifications to it. Self-development also makes iteration and R&D faster, and it fits better with how Tencent's technology stack will be integrated in the future.

Tencent self-develops everything from the lowest-level servers, network cards, and high-speed networking up to the platform, models, and algorithms, and this self-development allows our subsequent iterations to accelerate. Deep integration with other businesses will also speed up. Tencent runs massive, highly concurrent businesses, and many open-source architectures cannot handle Tencent's business volume, so we had to carve out an R&D path based on an independent system in order to withstand that massive, highly concurrent load.

South+: What optimizations will customers see in terms of cost and effectiveness?

Jiang Jie: Right now the cost of the large model is borne by Tencent itself, so it is certainly high, but we hope to keep driving it down, from improving training efficiency and the training framework to the inference stage where we serve users. In the future we will also offer some customized approaches to minimize costs for customers.

South+: What were the main technical challenges during the R&D process?

Jiang Jie: We actually started in 2021; the results you see today did not come all at once. First of all, we had to build the underlying training framework, otherwise the model could not accommodate hundreds of billions of parameters and 2 trillion tokens of training data. The entire system, from platform architecture to models and algorithms, is self-developed. We did not start with a dense large model but with a sparse one, which was first put to work supporting the advertising business. Along the way, Tencent has kept increasing its investment in these technical capabilities. Recent R&D is also pursuing deeper capability evolution, working not only with industry but also with academia to improve application practice.

South+ reporter Gao Xiaoping

【Author】 Gao Xiaoping

【Source】 Southern Press Media Group, South+ client