Tencent Hunyuan officially "enters the war" as the second half of the large-model race begins

author:IT Times

Hunyuan has been connected to more than 50 Tencent services

Author/ IT Times reporter Hao Junhui

Editor/Sun Yan

On September 7, at the 2023 Tencent Global Digital Ecosystem Conference, Tencent officially unveiled its Hunyuan large model. Tencent, one of the "troika" of Chinese Internet companies, had entered the field with industry large models on June 19 this year; with a general-purpose large language model, it has now finally joined the "100-model war".

The move had been foreshadowed. A week earlier, it was reported that 11 domestic large-model products, including Baidu's Wenxin Yiyan, Alibaba's Tongyi Qianwen, and Baichuan's large model, were the first to complete filing under the "Interim Measures for the Management of Generative Artificial Intelligence Services" and could officially launch services to the public. Tencent was also on the list.

On the morning of the 7th, Tencent Hunyuan was officially unveiled: it has more than 100 billion parameters and a pre-training corpus of more than 2 trillion tokens, and it has been connected to more than 50 Tencent businesses such as Tencent Cloud, Tencent Advertising, Tencent Games, Tencent Fintech, Tencent Meeting, and Tencent Docs.

On the same day, Tencent announced that Hunyuan was officially opened to the public through Tencent Cloud. Tang Daosheng, senior executive vice president of Tencent Group and CEO of the Cloud and Smart Industry Business Group, said that companies in every industry can call Hunyuan through APIs, or use Hunyuan as a base model to build large-model applications for different industry scenarios.

Tang Daosheng, Senior Executive Vice President of Tencent Group and CEO of Cloud and Smart Industry Business Group

With this, the field of domestic general-purpose large language models is gradually closing to new entrants. Latecomers will focus more on industry and vertical large models, and as large models complete filing one after another, the second half of the "100-model war" has officially begun.

From zero to two trillion

"Tencent's Hunyuan model was trained from scratch, starting from the very first token." From zero to 2 trillion tokens, the belated Hunyuan is the culmination of Tencent's fully self-developed technology chain, from model algorithms to machine learning frameworks to AI infrastructure. This gives Tencent Vice President Jiang Jie great confidence. "Because we have mastered the technology of the whole chain, we are confident that we can keep upgrading this technical system in the future to cope with changes in the external environment," he said.

Although the "100-model war" is already under way, few large-model vendors anywhere in the world have full-chain self-development capabilities, because participants need "killer skills" in both software development and hardware infrastructure.

In fact, apart from large cloud service providers such as Google, Microsoft, Amazon, Alibaba, Baidu, Tencent, and Huawei, large-model vendors find full-chain self-development difficult. Large models, whose parameter counts routinely run into the trillions, require large numbers of servers linked by high-speed networks into computing clusters that complete training tasks together, and only powerful large cloud vendors can crack the hard nut of integrating software, hardware, and networking.

A case presented at the event supported the effectiveness of Tencent's full-chain self-development.

"Who is the stronger fighter, Guan Gong or Qin Qiong?" "Hallucination" is a problem no large model can avoid. Faced with this classic trick question (the two figures lived centuries apart), a large domestic model and ChatGPT 3.5 both gave wrong answers, while Hunyuan answered correctly.

The common industry approach to reducing "hallucinations" is to attach plugins such as search augmentation or knowledge graphs to a large model, which is equivalent to an open-book exam, but this approach has significant limitations in practice. Hunyuan, written from the first line of code, instead adopts an "exploration" technique that optimizes the objective function in the pre-training stage. According to Jiang Jie, compared with the common open-source large models on the market, this method reduces hallucinations by 30% to 50%.

Another clear advantage of Tencent Hunyuan over other large models is its support for ultra-long text output. Although multimodality is becoming an important direction of evolution for mainstream large models, when it comes to text output, large models including GPT-3.5 and GPT-4 struggle to produce answers of more than 1,000 words, and users have to type "continue" to make the model carry on. Hunyuan breaks through this limitation. By optimizing positional encoding to improve the quality and performance of ultra-long text processing, it can generate long texts and give a complete answer of 4,000 words. This should greatly expand the range of uses for AIGC and help large models "think through" deeper, more comprehensive answers.

Training a trillion-parameter model in four days

Qiu Yuepeng, a Tencent vice president and the third speaker at the main forum, deliberately walked around behind Jiang Jie on his way to the stage. Qiu Yuepeng's other role is president of Tencent Cloud, "because the cloud is the foundation behind the large model."

Since April this year, Tencent Cloud has released a series of infrastructure products for large-model training. From the self-developed Xingxinghai servers, to the new-generation HCC (High-Performance Computing Cluster), to the self-developed Xingmai high-speed network, Tencent has built a complete high-performance intelligent computing network for AIGC.

"Our HCC is the strongest high-performance computing cluster in China," a staff member at the Tencent Cloud booth said without hesitation. "Many large-model vendors now run internal tests for their customers, and we really do come out best on performance and price-performance."

According to the staff member, the cluster is built from the latest generation of Tencent Cloud's self-developed Xingxinghai servers and combines NVIDIA H800 GPUs with Tencent's self-developed XPU. It provides 3.2T of interconnect bandwidth, described as the highest in the industry and 3 times higher than the previous generation, and it shortens training time for the same trillion-parameter large model by 80%. Qiu Yuepeng further revealed that Tencent Cloud can now support large-scale training clusters of more than 100,000 cards computing in parallel, and that one round of training for a trillion-parameter large model can be completed within four days.
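
As a quick sanity check on the 80% figure quoted above, a cut in run time can be converted into a speed-up factor. The sketch below is plain arithmetic; the function name is illustrative and does not come from Tencent.

```python
def speedup_from_time_reduction(reduction: float) -> float:
    """Return the speed-up factor implied by a fractional cut in run time.

    reduction is the fraction of time saved, e.g. 0.80 for "80% shorter".
    """
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    # If the new run takes (1 - reduction) of the old time,
    # the same work is done 1 / (1 - reduction) times as fast.
    return 1.0 / (1.0 - reduction)


factor = speedup_from_time_reduction(0.80)
print(f"80% less training time is about a {factor:.1f}x speed-up")
```

In other words, shortening training time by 80% means the same job finishes in one fifth of the time.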

Improving computing power is subject to a classic "barrel effect": compute, storage, and networking are all indispensable, and a bottleneck in any one of them seriously slows computation. In training especially, once a card fails, the entire run must be interrupted and rolled back to the last checkpoint; combined with the huge volume of training data, this places extremely high demands on checkpoint read and write speeds. Tencent Cloud storage can now write more than 3 TB of data in 60 seconds, which improves training efficiency and shortens training time for the whole model.
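
To put the storage figure in perspective, the sketch below converts "3 TB in 60 seconds" into a sustained write rate and estimates how long a weights-only checkpoint would take at that rate. Several inputs are assumptions, not figures from the article: decimal terabytes, a model of exactly one trillion parameters, and 2 bytes per parameter (16-bit weights, with no optimizer state).

```python
TB = 10**12  # decimal terabyte in bytes (assumption: the article does not say TB vs TiB)
GB = 10**9   # decimal gigabyte in bytes


def sustained_write_gb_per_s(bytes_written: int, seconds: float) -> float:
    """Average write throughput in GB/s."""
    return bytes_written / seconds / GB


def checkpoint_bytes(num_params: int, bytes_per_param: int) -> int:
    """Size of a weights-only checkpoint (ignores optimizer state)."""
    return num_params * bytes_per_param


def seconds_to_write(size_bytes: int, gb_per_s: float) -> float:
    """Time needed to write size_bytes at a given GB/s rate."""
    return size_bytes / (gb_per_s * GB)


rate = sustained_write_gb_per_s(3 * TB, 60)   # figure quoted in the article
ckpt = checkpoint_bytes(10**12, 2)            # hypothetical: 1e12 params, 16-bit weights
print(f"write rate: {rate:.0f} GB/s")
print(f"hypothetical checkpoint: {ckpt / TB:.0f} TB, "
      f"written in about {seconds_to_write(ckpt, rate):.0f} s")
```

Under these assumptions the quoted rate works out to 50 GB/s, so a 2 TB weights-only checkpoint would take roughly 40 seconds to write. A real checkpoint that also stores optimizer state would be several times larger.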

It is understood that Tencent Cloud has built a full set of capabilities around large models, including high-performance computing clusters, data processing engines such as cloud-native data lakes and vector databases, model security, and toolchains for model training and fine-tuning. Enterprises and developers can choose products flexibly according to their own needs and reduce the cost of training large models.

Hunyuan connects to more than 50 Tencent services

After nearly a year of exploration, no one doubts that a future-oriented cloud service provider must have its own large model and offer MaaS (model as a service).

Tencent is, of course, Hunyuan's best "first customer". At the conference, Tang Daosheng announced that Tencent will fully embrace large models. At present, Hunyuan has been connected to more than 50 Tencent services for testing, with preliminary results, including Tencent Cloud, Tencent Advertising, Tencent Games, Tencent Fintech, Tencent Meeting, Tencent Docs, WeChat Search, QQ Browser, and other businesses and products, and it is gradually becoming the intelligent foundation of Tencent's businesses.

Clearly, large models will create a new form of next-generation cloud service and redefine cloud tools: enterprises will be able to use smarter, more convenient, and easier-to-use cloud products, and new modes of interaction will keep emerging. On the same day, Tencent Cloud announced that nearly 10 intelligent applications and solutions had been upgraded on the basis of large-model technology. Products such as the Tencent Cloud risk-control large model, the Tencent Cloud AI code assistant, and the Tencent Meeting AI assistant have seen significant gains in efficiency and user experience thanks to large-model capabilities.

Take Tencent Meeting as an example. A meeting typically lasts from tens of minutes to several hours and involves tens of thousands of words, many of them colloquial. If your attention drifts during a meeting, you can ask the AI assistant what a particular speaker just said. If you hear a term you do not understand, you can ask about that too, and the assistant will explain not only what the term means but also the context in which it came up. After the meeting, the assistant can turn the full meeting content into a to-do list showing who should complete what and by when, which is quite practical.

At present, the Tencent Meeting AI assistant and the Qidian Analysis AI assistant are officially accepting trial applications.

The second half of the "100-model war"

Although Hunyuan arrived late, it looks like the last shoe to drop, bringing to a close the general-purpose large-model frenzy among the major vendors over the past year.

In fact, as early as June this year, when Tencent took the lead in entering the current round of the "100-model war" with industry large models, the development paths of large models had begun to diverge. Most of the more than 30 large models shown at the World Artificial Intelligence Conference in July were also industry large models. A basic consensus has formed in the industry: "expensive" general-purpose large models are a game for only a few players, while industry large models focused on specific scenarios and business customers are the most cost-effective AI tools.

Still, as long as Tencent had not produced a general-purpose large model, people felt that something was unsettled.

Since launching its strategic upgrade in 2018, Tencent has championed the slogan "taking root in the consumer Internet and embracing the industrial Internet", and to-B business has become the focus of its transformation. "Fintech and business services", the segment that represents the industrial Internet and the integration of the digital and real economies, has accounted for more than 30% of revenue for nine consecutive quarters.

But Tencent is still China's "king of the consumer side". According to CTR-Xinghan's 2023 Q2 China Mobile Internet Power List, WeChat still topped the list with 1.29 billion quarterly active users. Whether judged by its capacity to generate data or by user demand, Tencent needs a general-purpose large model.

Judging from what Tencent announced this time, among the more than 50 Tencent businesses connected to Hunyuan, Tencent Meeting, Tencent Docs, WeChat Search, QQ Browser, and others are all high-frequency products that consumers access directly. This means Hunyuan has had a naturally large number of user touchpoints from the start, and once it is fully opened, both its costs and the pressure on it will be higher than for industry large models and ordinary general-purpose large models.

"As model complexity increases, inference latency also rises, and the GPU performance needed to meet business performance requirements becomes extremely high, which greatly increases the inference cost per QPS." At present, large models are used mainly for productivity, because serving costs are so high that they are viable only in high-value user scenarios. If the cost could be cut to one tenth or less, large models could extend from productivity to entertainment, content, and every user interface. At the Internet AIGC Application Session of the 2023 Tencent Global Digital Ecosystem Conference, Tencent Cloud officially released its full-stack AIGC solution. Mao Dehui, an industry solution expert at Tencent Cloud, said that Tencent Cloud's full-chain acceleration capabilities can help enterprises improve efficiency and reduce costs on the road to AGI and make AIGC services more accessible.

Perhaps Tencent is now only waiting for the east wind that will send Hunyuan skyward.

On September 7, the reporter searched for "Tencent Hunyuan Assistant" in WeChat mini programs, and the system showed "application successful, now in the queue". Compared with other large models that have already officially launched, such as Baidu's Wenxin Yiyan, Hunyuan is still keeping a last measure of caution.

Typesetting / Ji Jiaying

Photo / Tencent IT Times

Source: IT Times public account vittimes
