21st Century Business Herald reporter Bai Yang, reporting from Shenzhen
ChatGPT's debut at the end of last year brought large AI models to global attention. Over the past six months, large-model products have sprung up one after another, yet amid this scramble for the spotlight, Tencent has been slow to make a move.
Many authoritative figures in the industry have argued that only a few general-purpose large models will survive in the future large-model ecosystem, and Tencent has always been regarded as one of the strong contenders.
At the 2023 Tencent Global Digital Ecosystem Conference on September 7, the Tencent Hunyuan large model was officially unveiled.
According to Tencent, Hunyuan is a general-purpose large language model developed entirely in-house across the full stack, with more than 100 billion parameters and a pre-training corpus of more than 2 trillion tokens. Its Chinese content creation, logical reasoning and task execution capabilities will be opened to external users through Tencent Cloud.
In the past week, after the Cyberspace Administration of China published filing information for domestic deep synthesis service algorithms, many large-model products began moving from internal testing to public testing. The Tencent Hunyuan large model has also completed its filing, so in a sense Tencent and other domestic vendors now stand on the same starting line.
No "half-finished products"
Judged by its technical reserves, Tencent's decision to release a large-model product only now is in fact a deliberate slowdown.
According to the 21st Century Business Herald reporter's understanding, Tencent began to explore large-model technologies as early as 2021. Since 2021, it has successively launched sparse NLP large models with hundreds of billions and then trillions of parameters. Tencent first disclosed R&D progress on the Hunyuan AI large model in April 2022. According to information released at the time, the Hunyuan AI large model fully covered NLP (natural language processing), CV (computer vision), multimodal models and many industry-specific models.
It would not have been difficult for Tencent to follow the trend and release a large-model product in the first half of the year. Why didn't it? Ma Huateng, chairman and CEO of Tencent, gave a very clear explanation at Tencent's 2023 shareholders' meeting in May.
Ma Huateng said at the time, "At first, we thought this was a once-in-a-decade opportunity for the internet, but the more we thought about it, the more we felt it was a once-in-centuries opportunity, akin to the industrial revolution brought about by the invention of electricity. Internet companies have accumulated a great deal and are all working on this. We are also heads-down on research and development, but we are in no hurry to finish early and show off half-finished products."
Ma Huateng believes that, for an industrial revolution, bringing out the light bulb a month earlier matters little over a long time span. The key is to do solid work on the underlying algorithms, computing power and data, and, more importantly, on deployment in real scenarios: "I believe we have many scenarios where this can be deployed, and (we) are still doing some thinking."
Self-developed across the full stack
Ma Huateng's remarks raised outside expectations for Tencent's large model, which translated directly into enormous pressure on Tencent's R&D team.
Over the past six months, many Tencent employees have told reporters about the pressure they feel internally. Jiang Jie, vice president of Tencent Group and the person in charge of the Hunyuan large model, said frankly in an interview with the 21st Century Business Herald that Tencent has been developing Hunyuan since 2020, and that the results seen today were not achieved overnight. Strictly speaking, Hunyuan will always be a work in progress and its training dataset will keep being updated; Tencent chose to unveil it now because it has reached a usable, practical state.
According to Tencent, the Hunyuan large model follows an independently planned development path, and the company has mastered self-developed technology across the full stack, from model algorithms to machine learning frameworks to AI infrastructure.
Jiang Jie said Tencent chose to train the Hunyuan large model from scratch, starting from the first token, because large-model technology cannot be fully mastered without in-house development, and because open-source models are not suited to Tencent's high-concurrency business scenarios.
Against the backdrop of accumulated experience with domestic large models, Tencent Hunyuan has also explored and innovated on a number of technical routes. Thanks to its fully self-developed stack, Hunyuan can understand context and retain long texts in memory, allowing it to carry on smooth multi-turn conversations in professional fields.
Emphasis on "practicality"
At the conference, Tencent also put forward a new concept: the "practical" large model. In Tencent's view, current large models perform well on simple tasks and in scenarios with high fault tolerance, but making them process complex information more reliably still poses great technical challenges.
The "practical" large model can be understood as one that effectively solves problems, improves work efficiency, and delivers high accuracy and reliability across multiple fields and tasks in real application scenarios. Such models not only excel at simple tasks but also handle complex information and bring real value to customers and users.
The value of the "practical" large model is first reflected inside Tencent itself. Jiang Jie said, "Tencent's goal in developing large models is not to get high scores on evaluations, but to apply the technology to real scenarios. Tencent will fully embrace large models."
On the earnings call in March, Martin Lau, Tencent's president and chairman of its investment committee, said that AI would be a multiplier for the company's future business growth. Generative AI and foundation model technology can complement and optimize Tencent's businesses, so the company will actively invest resources in building a foundation model that will benefit each business line in the future.
At present, more than 50 Tencent businesses and products, including Tencent Cloud, Tencent Advertising, Tencent Games, Tencent Fintech, Tencent Meeting, Tencent Docs, WeChat Search and QQ Browser, have been connected to the Tencent Hunyuan large model for testing and have achieved preliminary results.
At the conference, Jiang Jie also demonstrated how Tencent Meeting, Tencent Docs, Tencent Advertising and other businesses work in practice after being connected to the Tencent Hunyuan large model.
For external customers, the Tencent Hunyuan large model will serve as the foundation of Tencent Cloud's MaaS (model-as-a-service) offering. Customers can call Hunyuan directly through APIs or use it as a base model to build their own dedicated industry models.
Jiang Jie said that when Tencent released its industry large-model solutions in June this year, they were ready for only a few industries; now Tencent Hunyuan can support more.
There is no doubt that the large-model era is accelerating. As of July, more than 130 large models had been released in China, setting off a genuine "war of a hundred models". Over the past two decades, Tencent secured its place in the PC internet and mobile internet eras with QQ and WeChat. Whether the Hunyuan large model can win Tencent a "ticket" to the large-model era is worth watching.