
Revealing Tencent Hunyuan: 400+ scenarios implemented, collaboration SaaS products fully connected

Author: QbitAI

Mengchen, reporting from Aofei Temple

QbitAI | WeChat official account QbitAI

Entering 2024, the winds in the large-model field have shifted.

In the early days of the "war of a hundred models", simply throwing out a demo and doing MaaS (Model as a Service), that is, letting users interact with the large model directly, was enough to make a splash.

But now the trend has shifted to carefully crafted applications, whether AI-native products or AI features integrated into existing ones.

Even the usually low-key and secretive Tencent Hunyuan team has announced its application progress:

Tencent Hunyuan now powers more than 400 internal businesses and scenarios, and is fully open to enterprises and individual developers through Tencent Cloud.

Many well-known "national-level" apps, such as WeCom, Tencent Docs, and Tencent Meeting, have been fully equipped with AI.

More Tencent Cloud SaaS products are also backed by AI, such as Tencent Lexiang, an enterprise knowledge-learning platform, and Tencent e-Sign, an electronic contract-management tool.

Tencent Hunyuan only debuted last September. Is the team deliberately speeding up?

Faced with this question, Zhang Feng, head of Tencent Hunyuan applications, gave an answer with a touch of humblebrag ("Versailles", in Chinese internet slang):

We are just following our normal rhythm. We are not merely connecting products to the large model; we have already entered the stage of polishing the user experience.

Among China's large-model vendors, why has Tencent taken such a distinctive route? We sat down for an in-depth chat with Zhang Feng.

Tencent's AI products are already polishing the user experience

Tencent has been known for its products all these years, and it has carried that style into the AI era.

Take Tencent Hunyuan Assistant, the public face of the model, as an example: "already polishing the user experience" is no empty claim.

For example, ask it to solve a simple math problem: the AI analyzes the approach fluently, and even notices when the problem is missing a condition, but it pauses slightly before giving the final result.

This pause does not fit the pattern of a large model predicting the next token; it looks as if the answer is actually being computed.

Zhang Feng revealed that behind the scenes, the AI first writes a piece of code, executes it on the backend, and then returns the result.

It is an ingenious way to work around large models' inaccurate arithmetic. But why not display the code in the foreground, like GPT-4's Code Interpreter?

An important scenario for Hunyuan Assistant is its use inside WeChat mini programs, where code displayed on a mobile screen would run particularly long. Zhang Feng believes the current strategy better matches users' habits.

The product strategy is there, but implementing it is not simple. First, the large model must recognize that the user's request requires exact calculation; then it must generate appropriate code; finally, the code must be executed successfully via a function call.
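The Hunyuan team has not published its implementation, but the "write code, execute on the backend, show only the result" pattern can be sketched. Everything below is an assumption for illustration; a safe arithmetic evaluator stands in for the sandboxed execution of model-generated code:

```python
import ast
import operator

# Stand-in for a sandboxed runtime: evaluate arithmetic via the AST
# instead of exec(), so only whitelisted operations can run.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv,
       ast.Pow: operator.pow, ast.USub: operator.neg}

def safe_eval(expr: str):
    """Evaluate an arithmetic expression safely."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval"))

def answer_math(question_expr: str) -> str:
    # The backend runs the generated expression; the user sees only the result.
    return f"The answer is {safe_eval(question_expr)}"

print(answer_math("(3 + 5) * 12 / 4"))  # The answer is 24.0
```

In the real product, the model would generate the code from a natural-language question and a function call would hand it to the executor; here that generation step is elided.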

There are many more examples of polishing the user experience from the details.

For example, Tencent Meeting, familiar to everyone, has built many differentiated features beyond simple AI voice transcription and meeting-minute summarization.

During transcription, Tencent Meeting's AI intelligently cleans up filler sounds such as "uh" and "hmm", making the post-meeting transcript much tidier.
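The cleanup model itself is not public; a naive rule-based pass conveys the idea (the filler list and the `tidy` function are purely illustrative):

```python
import re

# Illustrative filler list; a production system would use a learned model
# rather than a fixed regex.
FILLERS = re.compile(r"\b(uh+|um+|hmm+|er+|you know)\b[,\s]*", re.IGNORECASE)

def tidy(utterance: str) -> str:
    cleaned = FILLERS.sub("", utterance)          # drop filler words
    return re.sub(r"\s{2,}", " ", cleaned).strip() # collapse leftover spaces

print(tidy("Um, so uh the deadline is Friday"))  # so the deadline is Friday
```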

Another question Tencent Meeting has been pondering: the format of AI-generated summaries should adapt to the type of meeting.

A meeting with a clear theme and agenda calls for a very different format than a free-flowing brainstorming session. So, in addition to generating minutes by time and chapter, Tencent Meeting will also introduce generating minutes by speaker or by topic.
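To illustrate, a transcript stored as (speaker, utterance) pairs can be regrouped by speaker before summarization; the data format here is an assumption, not Tencent Meeting's:

```python
from collections import defaultdict

# Toy transcript: chronological (speaker, utterance) pairs.
transcript = [
    ("Alice", "Let's review last sprint."),
    ("Bob", "The API migration is done."),
    ("Alice", "Great, next topic is the release date."),
]

def minutes_by_speaker(turns):
    """Regroup a chronological transcript into per-speaker buckets."""
    grouped = defaultdict(list)
    for speaker, text in turns:
        grouped[speaker].append(text)
    return dict(grouped)

print(minutes_by_speaker(transcript))
```

Each per-speaker (or, analogously, per-topic) bucket would then be summarized separately instead of summarizing the timeline as a whole.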

Tencent Lexiang, as an enterprise knowledge-collaboration platform, has its AI Q&A identify who is asking, so that different users get different answers ("a thousand faces for a thousand people").

If the company's HR asks the AI about the salary structure, they get a direct answer; when someone in another role asks the same question, the AI refuses. Convenient and secure at the same time.
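Lexiang's access-control logic is not public; a minimal sketch of permission-aware answering, with an invented permission table, shows the shape of the idea:

```python
# Illustrative permission table: topic -> roles allowed to see the answer.
# Role names and topics are invented for this sketch.
PERMISSIONS = {
    "salary_structure": {"hr"},
    "office_wifi": {"hr", "engineer", "sales"},
}

def answer(topic: str, role: str, knowledge: dict) -> str:
    """Answer only if the asker's role is authorized for the topic."""
    if role not in PERMISSIONS.get(topic, set()):
        return "Sorry, you are not authorized to view this information."
    return knowledge[topic]

kb = {"salary_structure": "Base + bonus + options.",
      "office_wifi": "SSID: corp."}
print(answer("salary_structure", "hr", kb))        # direct answer
print(answer("salary_structure", "engineer", kb))  # refusal
```

In a real system, the gate would sit in front of retrieval, so unauthorized documents never reach the model's context at all.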

Kuangzhen, a law firm in Hunan Province, has built an AI knowledge base with Lexiang Assistant. Employee surveys put satisfaction with AI answers to typical questions at 93 points, with end-to-end answer accuracy at 91%.

Tencent e-Sign uses an AI document-review system to identify risky contract clauses, making it easier for enterprises to control contract risk. Since enterprises differ in their risk-control requirements, Tencent e-Sign also uses large models plus few-shot techniques to train small vertical models suited to each customer's industry, keeping operating costs low. Through a hybrid-cloud model, it supports private deployment of data and models, solving efficiency problems while ensuring compliance.

Among the 400+ application scenarios, examples like this abound; we won't list them all here.

The next question worth exploring is how Tencent managed to polish its AI products in such a short time.

The complete pipeline for application landing has been run through

At Tencent, large-scale model development and business applications are a two-way street.

According to Zhang Feng, iteration during the development of Tencent Hunyuan is very fast, with four to five versions a month.

This speed comes from efficient cooperation with the business application teams: business teams raise requirements and contribute fine-tuning data, so the R&D team can strengthen the large model's capabilities in a targeted way. During online testing, bad cases are constantly discovered, and the model's shortcomings are patched quickly.

In this mode of accounting for real application needs during R&D, Tencent Hunyuan is positioned as a "practical-grade general model".

Among China's large models, Tencent Hunyuan was the first to complete the upgrade to a MoE (Mixture of Experts) architecture, moving from a single dense model to a sparse model composed of multiple experts.

In the MoE architecture, the total parameter count grows while the number of activated parameters stays the same, so the model can handle more tokens; and because only a small fraction of parameters is actually activated, training and inference costs drop significantly.
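Hunyuan's exact architecture is not public, but the economics of top-k routing can be sketched: each token activates only k of n experts, so compute tracks k while total parameter count tracks n. All sizes below are toy values:

```python
import math
import random

random.seed(0)
N_EXPERTS, TOP_K, DIM = 8, 2, 4

# Each "expert" here is just a weight vector; a real expert is a full FFN.
experts = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(N_EXPERTS)]
router = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(N_EXPERTS)]

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def moe_forward(token):
    # Router scores every expert, but only the top-k actually run.
    scores = [sum(w * x for w, x in zip(r, token)) for r in router]
    top = sorted(range(N_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    gates = softmax([scores[i] for i in top])
    out = [0.0] * DIM
    for gate, idx in zip(gates, top):
        for d in range(DIM):
            out[d] += gate * experts[idx][d] * token[d]
    return out, top

output, active = moe_forward([0.5, -0.2, 0.1, 0.9])
print(f"activated experts {active} out of {N_EXPERTS}")
```

Only TOP_K of the N_EXPERTS experts run per token, which is why the total parameter count can grow without a matching growth in per-token compute.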

The rapid switch to this route also benefited from early insight into the needs of the business application side.

During this polishing alongside business applications, Tencent Hunyuan focuses on improving three capabilities of the general model:

Instruction following: users can give all kinds of complex, structured, long instructions, and Tencent Hunyuan executes them as required.

Web page and document comprehension: users often need AI to summarize long text, and this capability reduces their cognitive load.

Function calling: the Hunyuan team also judges this to be one of the trends for the next stage of large models.

The general model is just the beginning.

Zhang Feng explained that in practical applications, beyond the MoE main model, businesses with heavy call volumes can, for cost-effectiveness, consider smaller models of various sizes, or vertical small models fine-tuned on business data.

"Fine-tuning" (微调) is the common academic term; within Tencent, the team prefers to say "refined tuning" (精调).

From data management to the self-developed AngelPTM training framework and AngelHCF inference framework, through model evaluation and deployment, each link is carefully refined.

So, facing today's 400+ scenarios, with even more businesses set to use large models in the future, the R&D team clearly cannot spare the effort to fine-tune each one individually. How is this solved?

The answer: with Hunyuan's one-stop platform, business teams with needs can easily handle it themselves.

The one-stop platform not only supports invoking the Hunyuan large-model service directly through an API, but also visualizes many of the steps from training to deployment, which can be completed quickly with a few mouse clicks and no code.

With the platform, AI engineers no longer need to wrestle with much code, and business engineers who are not machine-learning experts can easily get started.

Next, by following the complete process from fine-tuning a model to going live, we can see the platform's capabilities.

First, on the model side, the platform provides a matrix of base models of various sizes, divided into three tiers: general models, models optimized for typical scenarios, and sub-models for tasks in more vertical domains.

As mentioned earlier, two examples of scenario-optimized models: when developing Agent applications, you can use a model with strengthened function-calling ability; in knowledge-dense scenarios, you can choose a model optimized for summarization.

If you have not only a vertical application scenario but also a vertical dataset, secondary training on private data can be completed on the one-stop platform, so that the vertical sub-model retains good general understanding while excelling at professional domain knowledge.

Next comes the platform's data-processing capability.

For data of varying quality from different sources, automated tools handle everything from cleansing steps such as quality inspection and deduplication, through statistically allocating data ratios across topics, to the harder problem of data value alignment, efficiently removing the biases the data contain.

Even if, after launch, the model proves weak in some aspect because a certain type of data was missing, supplementary data can quickly be fed into continued training to support rapid iteration.
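As a toy illustration of two cleansing steps named above, deduplication and a crude quality check (real pipelines use far more elaborate filters):

```python
import hashlib

def clean(samples, min_len=10):
    """Drop too-short fragments and exact duplicates from a text corpus."""
    seen, kept = set(), []
    for text in samples:
        text = text.strip()
        if len(text) < min_len:          # quality check: drop fragments
            continue
        digest = hashlib.sha256(text.encode()).hexdigest()
        if digest in seen:               # exact-duplicate removal
            continue
        seen.add(digest)
        kept.append(text)
    return kept

raw = ["A long enough sample.", "A long enough sample.",
       "too short", "Another distinct record."]
print(clean(raw))  # ['A long enough sample.', 'Another distinct record.']
```

Production systems typically add near-duplicate detection (e.g. MinHash) and model-based quality scoring on top of exact hashing.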

With a base model and data in hand, you can fine-tune on demand to create your own model. Both fast, low-cost LoRA fine-tuning and full-parameter deep fine-tuning can be completed on the Hunyuan one-stop platform.
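The LoRA idea behind the "fast and low-cost" claim can be shown numerically: freeze the pretrained weight W and learn a low-rank update B @ A, so only 2·d·r parameters are trained instead of d². This toy sketch shows generic LoRA, not Hunyuan's implementation:

```python
def matmul(X, Y):
    """Plain-Python matrix multiply for the toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen pretrained weight (2 x 2)
B = [[0.5], [1.0]]             # trainable, 2 x r with rank r = 1
A = [[2.0, 0.0]]               # trainable, r x 2

delta = matmul(B, A)           # low-rank update, expands back to 2 x 2
W_eff = [[w + d for w, d in zip(wr, dr)] for wr, dr in zip(W, delta)]
print(W_eff)  # [[2.0, 0.0], [2.0, 1.0]]
```

At realistic sizes the saving is dramatic: for d = 4096 and r = 8, the adapter trains about 65K parameters per matrix versus roughly 16.8M for full fine-tuning of that matrix.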

Evaluation, deployment, and launch of the fine-tuned model are likewise automated; in particular, one-click release for deployment is one of the platform's core capabilities.

In summary, compared with traditional machine-learning platforms, the one-stop platform's biggest features are its pre-trained base models, its automated and optimized data processing, and its streamlined, efficient workflow for model fine-tuning and application integration. Automation and intelligent tooling address challenges such as massive training data, model customization, and deployment, greatly lowering the threshold for businesses to adopt large models: fast, effective, and accessible in multiple ways.

In a word: the complete pipeline from model development to application landing has been run through.

The internal pipeline has been run through end to end and verified across 400+ scenarios. External developers and enterprises can now call Tencent Hunyuan's capabilities directly through Tencent Cloud APIs, and the next step is helping partners upgrade their businesses with intelligence.

One More Thing

At the end of the exchange, QbitAI handed the team a list of problems found while testing Hunyuan Assistant, ones the model still could not solve well.

It was already past 6 p.m. Beijing time when the session ended, nearly two hours over the scheduled time.

Most members of the Hunyuan team prepared to leave for the airport to rush back to the Shenzhen R&D headquarters.

Zhang Feng did not leave the conference room with everyone.

After a brief goodbye, he sat back down on the couch, absorbed in figuring out how to fix those bad cases.

— END —

QbitAI (量子位), contracted author on Toutiao

Follow us and be the first to know about cutting-edge technology trends
