Tencent Hunyuan large model upgraded! Performance up 50%, 16s video generation supported, app arriving at the end of the month

author: Zhidongxi
Author | vanilla

Editor | Li Shuiqing

Zhidongxi reported on May 17 that Tencent Cloud today announced a series of advances in generative AI research and products. Tencent's Hunyuan large model has been fully upgraded: its largest version reaches the trillion-parameter scale, and overall performance is 50% higher than the previous generation. On the multimodal side, Tencent Cloud has open-sourced a Chinese-native text-to-image model built on the DiT architecture, Hunyuan can generate a 3D model from a single image in 30 seconds, and generated videos can run up to 16 seconds.

Building on its self-developed Hunyuan large model, Tencent Cloud has created a native toolchain for the large model era and released three AI engine tools: a large model knowledge engine, an image creation engine, and a video creation engine. Tencent Cloud has also launched Tencent Yuanqi, a one-stop open platform for creating and distributing AI agents, on which users can build their own agents and publish them to Tencent ecosystems such as QQ and WeChat.

According to reports, Tencent Cloud will launch the Hunyuan-based app "Tencent Yuanbao" on May 30, at the end of this month, offering an efficient information-integration tool driven by Hunyuan and search engines. In text-to-video, an area that has drawn much attention, Hunyuan is comprehensively upgrading its architecture based on ST-DiT. It is expected to reach 30-second video generation by the third quarter and to open a text-to-video API within 2-3 months.

Tang Daosheng, Senior Executive Vice President of Tencent Group and CEO of the Cloud & Smart Industry Business Group, emphasized that Tencent has always taken "industrial practicality" as its core strategy for developing large models. The company aims to build the AI closest to the industry through high-performance models, efficient tool platforms, agile scenario applications, high-availability computing infrastructure, and a strong, secure model environment.

After the press conference, Zhidongxi and a few other media outlets had an in-depth conversation with Wu Yunsheng, vice president of Tencent Cloud and head of Tencent Cloud Intelligence, Tencent Youtu Lab, and Tencent Qidian.

Asked about the commercial and application value of the voice assistants released by OpenAI and Google this week, Wu Yunsheng said that technology which truly integrates the three modalities of vision, audio, and text, with end-to-end input and output, deserves attention. He expects it to become the main trend in future technology development, with strong prospects for commercialization.

Tencent Yuanqi trial application address:

https://open.hunyuan.tencent.com

Tencent Cloud official website address:

https://cloud.tencent.com/product/hunyuan

1. Trillion-parameter MoE, a 3D model from a single image in 30s, and video generation of up to 16s

Jiang Jie, vice president of Tencent Group, said that Hunyuan now covers the full stack, from infrastructure such as computing power and platforms to models for text, images, video, and 3D.

▲Hunyuan's layout from infrastructure to model construction

Tencent released the Hunyuan large model last year. After several iterations, the text generation base model has grown to the trillion-parameter scale, uses an MoE (mixture-of-experts) structure, supports context windows of up to 256k, and delivers overall performance 50% higher than the previous generation.
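For readers unfamiliar with the mixture-of-experts idea mentioned above, here is a minimal, purely illustrative sketch of top-k expert routing. It is not Tencent's implementation: the function names, the toy "experts" (simple scaling functions), and the gate weights are all assumptions made up for this example. The point is only that a gate scores every expert, and each input is processed by a few of them rather than by the whole model.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, gate_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts and mix their outputs.

    gate_weights holds one weight vector per expert (hypothetical values);
    the gate score is the dot product of that vector with x.
    """
    logits = [sum(w * v for w, v in zip(weights, x)) for weights in gate_weights]
    # Indices of the top_k experts, best score first.
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Renormalize the gate scores over the chosen experts only.
    mix = softmax([logits[i] for i in chosen])
    output = [0.0] * len(x)
    for weight, index in zip(mix, chosen):
        expert_out = experts[index](x)
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output, chosen


def make_expert(scale):
    """Toy expert: multiplies every input element by a fixed scale."""
    return lambda x: [scale * v for v in x]


experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]
output, chosen = moe_forward([0.5, 2.0], gate_weights, experts, top_k=2)
print(chosen, output)
```

Only two of the four toy experts run for this input, which is why an MoE model can have a very large total parameter count while keeping the compute per token much smaller.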

At present, Hunyuan is offered in three versions, Pro, Standard, and Lite, at the trillion-, hundred-billion-, and ten-billion-parameter scales respectively. All three are available to developers and enterprise users through the Tencent Cloud platform.

▲Hunyuan scaled up to a trillion-parameter MoE model

In text-to-image, the underlying architecture of Hunyuan's model has been comprehensively upgraded from the traditional U-Net to DiT, the parameter count has grown more than tenfold, and its evaluation results are reported to lead in China.

▲Hunyuan text-to-image architecture upgrade

Besides generating high-quality images in many styles, Hunyuan has upgraded its multi-turn dialogue capability, so users can re-edit generated images through natural-language interaction.

▲Hunyuan text-to-image multi-turn dialogue capability

On the commercial side, Hunyuan can efficiently synthesize product materials, for example by swapping in different backgrounds for product images, and is already in production use in advertising scenarios.

▲Hunyuan text-to-image can efficiently synthesize product materials

According to Jiang Jie, Hunyuan needs only 30 seconds to generate 3D models in categories such as animation, automobiles, and buildings.

▲Hunyuan's 3D generation layout

In video generation, Tencent Hunyuan has four core capabilities: text-to-video, image-to-video, image-and-text-to-video, and video-to-video. It also supports a range of product features such as video stylization and video redrawing. Tencent says it offers higher resolution and larger motion amplitude than competitors such as Pika and Runway, and it can generate videos of up to 16 seconds.

▲The four core capabilities of Hunyuan video generation

Jiang Jie revealed that Hunyuan is comprehensively upgrading its architecture based on ST-DiT. It is expected to reach 30-second video generation by the third quarter and to open a text-to-video API within 2-3 months.

▲Hunyuan text-to-video capability

2. A one-stop agent creation and distribution platform launches, and the "Tencent Yuanbao" app arrives at the end of the month

Building on the Hunyuan large model, Tencent fully open-sourced its DiT-architecture text-to-image model on Tuesday (May 14). It is the industry's first open-source, Chinese-native text-to-image model built on DiT. It accepts and understands both Chinese and English input and has 1.5 billion parameters.

Compared with other open-source models in the industry, Hunyuan-DiT shows no weak spots across multiple dimensions. Its overall score ranks third among all open-source and closed-source algorithms, making it state of the art (SOTA) among open-source models.

▲Hunyuan's Chinese-native DiT text-to-image model is open-sourced

Today, Tencent Cloud announced that it will open source three MoE models, including Hunyuan-S for mobile phone deployment, Hunyuan-M for PC deployment, and Hunyuan-L for cloud/data center deployment, with parameter sizes ranging from 3 billion to 30 billion.

▲Hunyuan will soon open-source MoE models in a range of sizes

In addition, Tencent Cloud has fully opened up its agent ecosystem and launched "Tencent Yuanqi", a one-stop open platform for creating and distributing AI agents. Users can create their own AI agents, use Tencent's official plug-ins and knowledge bases, and publish the agents to QQ, WeChat, or apps.

Tencent Yuanqi is open for trial applications starting today. Its advantages include a low barrier to creating agents, a rich set of plug-ins and knowledge bases, and access to distribution channels across Tencent's ecosystem.

▲Tencent Yuanqi, a one-stop AI agent creation and distribution open platform

Building on the Hunyuan model, Tencent Cloud will officially launch the Tencent Yuanbao app on May 30. It aims to search and distill information efficiently, giving users an information-integration tool driven by Hunyuan and search engines, and to offer fun, practical everyday features by drawing on Tencent's content ecosystem.

Jiang Jie said that Tencent Yuanbao keeps its interaction simple: a single input box gives access to functions such as AI search, document summarization, translation, and spoken-language practice.

▲Tencent Yuanbao App will be launched soon

3. Three engines lower the barrier to deploying models, with enterprise-level knowledge applications built in 5 minutes

Wu Yunsheng, Vice President of Tencent Cloud, said that AI has become a key driving force for digital development, with large model technology as the core. According to the "Research Report on the Path to the Generative AI Industry" released by Gartner, the number of large models with more than 1 billion parameters in China has exceeded 100, and more than 60% of Chinese enterprises plan to deploy generative AI in the next 12-24 months.

▲AI has become a key driving force for digital development

However, in order to accelerate the innovation of the large model industry, large model manufacturers also need to solve three major challenges: lowering the threshold for tool use, improving platform adaptability, and ensuring security compliance.

To address these challenges, Tencent Cloud has launched a new native toolchain for the large model era, including three PaaS tools, namely "Large Model Knowledge Engine", "Large Model Image Creation Engine", and "Large Model Video Creation Engine", to help enterprises improve the quality and efficiency of knowledge services, image and video creation scenarios.

For knowledge management scenarios, Tencent Cloud has launched the large model knowledge engine, so that AI understands not only the "industry" but also the "enterprise" and the "product".

The knowledge engine is a large model application development platform focused on enterprise knowledge services, built on a large model + RAG (retrieval-augmented generation) framework. Using natural language, enterprise users can develop a knowledge service application in 5 minutes and quickly deploy it in business scenarios such as customer service, marketing, and enterprise knowledge communities.

▲Large model knowledge engine

The Tencent Cloud OCR parsing model behind it improves the accuracy of "knowledge parsing" by 25%. With semantic-level knowledge segmentation and data vectorization, the large model can quickly retrieve the best-matching answer, greatly reducing hallucinations and making its answers more reliable.
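The segment, vectorize, and retrieve flow described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the knowledge engine's actual API: the function names, the sample knowledge base, and the prompt wording are invented here, paragraph splitting stands in for semantic-level segmentation, and word-count vectors stand in for the learned embeddings a real system would use.

```python
import math
import re
from collections import Counter


def split_into_chunks(text):
    """Split on blank lines so each chunk is one paragraph (crude stand-in for semantic segmentation)."""
    return [part.strip() for part in re.split(r"\n\s*\n", text) if part.strip()]


def vectorize(text):
    """Toy embedding: lowercase word counts (a real system would use a learned embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=1):
    """Return the top_k chunks most similar to the question."""
    query = vectorize(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, vectorize(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question, context_chunks):
    """Ground the model's answer in retrieved text, which is how RAG limits hallucination."""
    context = "\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


# Hypothetical enterprise knowledge base, invented for this example.
KNOWLEDGE_BASE = """Refunds are processed within 7 business days of approval.

The warranty covers hardware defects for 12 months after purchase.

Support is available by phone from 9:00 to 18:00 on weekdays."""

chunks = split_into_chunks(KNOWLEDGE_BASE)
question = "How long does the warranty last?"
best = retrieve(question, chunks)
prompt = build_prompt(question, best)
print(prompt)
```

The resulting prompt would then be sent to the large model, so the answer is drawn from the retrieved enterprise text rather than from the model's general memory.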

The large model image creation engine offers AI image generation and processing as an API service. It generates image content that matches the input text or images and supports capabilities such as image stylization, AI photography, and line-drawing generation.

Based on a series of audio and video AI technologies such as Tencent Video's large model, the large model video creation engine supports high-quality video content generation or processing, covering video translation, video stylization, image dancing, video interpolation, WordArt video, motion brush, canvas expansion and other capabilities.

While the large model native toolchain makes AI more accessible, Tencent Cloud has also upgraded a number of large model product applications, such as the intelligent cockpit, the Qidian Marketing Cloud AI assistant, and the AI code assistant, so that "out-of-the-box" AI can speed up industry adoption.

4. Connected to 600+ internal scenarios, with a two-pronged push on the B-side and C-side

According to Tang Daosheng, the Hunyuan model has been connected to more than 600 application scenarios within Tencent.

On the C-side, WeChat Reading has recently launched new Hunyuan-based features such as AI book Q&A and AI outlines, which greatly improve users' reading efficiency and experience.

▲The Hunyuan model deployed on the C-side

On the B-side, Tencent's SaaS collaboration products are fully connected to Hunyuan. Tencent's customer service team upgraded its intelligent customer service system on the Hunyuan model and built a fine-tuned model for the customer service vertical. Compared with traditional small models, it improves intent-understanding accuracy and multi-turn Q&A fluency in intelligent dialogue by 38%.

Tencent Advertising has launched Tencent Advertising Miaosi, a one-stop AI advertising creative platform based on Hunyuan, to improve the efficiency of ad production and delivery. The average click-through rate of its image-to-image creatives has increased by 15%.

▲The Hunyuan model deployed on the B-side

Conclusion: Breaking AI's "floor-to-ceiling glass" so that large models go from visible to usable

Large model technology has gradually matured, but there is still a gap between "visible" and "usable". How can companies find the best path to generative AI? Tencent Cloud's answer is a full-chain strategy spanning models, platforms, scenario applications, and computing infrastructure.

As Tang Daosheng said, "building large models is only the starting point; the goal is to apply the technology in industrial scenarios and create value." By creating a large model native toolchain that helps enterprises improve quality and efficiency in knowledge services and in image and video creation, and by "building the AI closest to the industry", Tencent Cloud aims to let large models deliver their greatest value.
