
AI large-model melee: Huawei, Alibaba, Tencent and other giants all make moves in a single day

Author: Wallstreetcn

In recent months, competition among China's large models has grown so fierce that it has been called a "battle of the gods". This Friday the melee reached a new height: according to Wallstreetcn's incomplete tally, Huawei, Alibaba, Tencent, SenseTime and other companies all released or updated large models today.

In this "war of a hundred models", who is most likely to build China's answer to GPT-4?

HUAWEI CLOUD officially releases Pangu Model 3.0

On July 7, HUAWEI CLOUD released Pangu Model 3.0 at its Developer Conference 2023. Zhang Pingan, Executive Director of Huawei and CEO of HUAWEI CLOUD, said that Pangu Model 3.0 is a large model built entirely for industry use, with a "5+N+X" three-layer architecture.

Zhang Pingan said at the conference that the Pangu model "does not write poetry, it only gets work done". It will keep building its core competitiveness around three directions of innovation, "reshaping industries", "deep-rooted technology" and "soaring together through openness", to better serve industry customers, partners and developers.

The three-tier architecture is:

The L0 layer comprises five foundation models, for natural language, vision, multimodality, prediction and scientific computing, which supply the range of skills that industry scenarios require. Pangu 3.0 offers customers a series of foundation models with 10 billion, 38 billion, 71 billion and 100 billion parameters, matching different scenarios, latency requirements and response speeds. It also provides a new capability set that customers and partner companies can call directly: knowledge Q&A, copywriting and code generation from the NLP large model, and image generation and image understanding from the multimodal large model. Whatever the parameter size of the model, Pangu provides the same capability set.

The L1 layer consists of N industry large models. HUAWEI CLOUD offers general-purpose industry models trained on open industry data, covering government affairs, finance, manufacturing, mining, meteorology and other sectors. It can also train a proprietary large model for a customer on top of Pangu's L0 and L1 layers, using the customer's own data.

The L2 layer provides finer-grained scenario models that focus on specific industry applications or business scenarios, such as government hotlines, network assistants, lead-compound drug screening, foreign-object detection on conveyor belts and typhoon track prediction, and are delivered to customers as "out-of-the-box" model services.

The Pangu large model uses a fully layered, decoupled design, so it can adapt quickly to the changing needs of an industry. Customers can load their own datasets into their large model, upgrade the base model, or upgrade the capability set, each independently of the others.
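The idea of layered decoupling can be illustrated with a minimal Python sketch. This is only an illustration of the concept, not Huawei's implementation or API: the class `ModelStack`, its fields and the three helper functions are all hypothetical names invented for this example. It shows how a base model, a dataset list and a capability set can each be changed without touching the others.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelStack:
    """Hypothetical view of one customer's models across the three layers."""
    base_model: str                        # L0 foundation model (assumed label)
    industry_model: str                    # L1 industry model, e.g. "finance"
    scenario_model: str                    # L2 scenario model
    datasets: tuple = ()                   # datasets the customer has loaded
    capabilities: frozenset = frozenset()  # capability set exposed to callers


def upgrade_base(stack: ModelStack, new_base: str) -> ModelStack:
    """Swap only the base model; every other field is carried over unchanged."""
    return replace(stack, base_model=new_base)


def load_dataset(stack: ModelStack, name: str) -> ModelStack:
    """Add one dataset without touching the models or the capability set."""
    return replace(stack, datasets=stack.datasets + (name,))


def upgrade_capabilities(stack: ModelStack, new_caps) -> ModelStack:
    """Extend the capability set independently of models and datasets."""
    return replace(stack, capabilities=stack.capabilities | frozenset(new_caps))
```

Because each helper returns a new `ModelStack` that differs in exactly one field, an upgrade to one part cannot silently alter another, which is the property the decoupled design is described as providing.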

On top of the L0 and L1 models, HUAWEI CLOUD also provides an industry large-model development kit, which lets customers build their own exclusive industry models through secondary training on their own data. To meet customers' differing data security and compliance requirements, the Pangu large model is also offered in several deployment forms: public cloud, a dedicated large-model cloud zone, and hybrid cloud.

Alibaba launches AIGC app "Tongyi Wanxiang"

At the 2023 World Artificial Intelligence Conference, Alibaba Cloud officially launched a new AI painting product "Tongyi Wanxiang".

Tongyi Wanxiang is built on Composer, a compositional generative model developed by Alibaba. It proposes a diffusion-based "compositional generation" framework that decomposes and recombines image design elements such as color scheme, layout and style, producing image generation that is both highly controllable and highly flexible.

Users enter a prompt in Tongyi Wanxiang to generate a corresponding image. Beyond text-to-image generation, Tongyi Wanxiang also offers style transfer and similar-image generation.

This will greatly lower the barrier to image design; art design, games, and the cultural and creative industries can all expect change.

At present, Tongyi Wanxiang has three functions: text-to-image generation, similar-image generation, and style transfer.

For text-to-image, the user enters a prompt and selects a creative style (watercolor, oil painting, Chinese painting, flat illustration, anime, sketch, 3D cartoon and so on), and the tool automatically generates a large batch of creative ideas. Tongyi Wanxiang has been officially launched and is open to outside users.

Similar-image generation lets users quickly expand a set of similar assets from existing material. Given a reference image, it returns images similar in content and style.

Style transfer, by contrast, renders an original image in a specified style.

In a test by "New Zhiyuan", Tongyi Wanxiang was used to restyle a picture in the manner of the French Impressionist painter Renoir.

Once the style transfer was complete, the result was an Impressionist portrait.

According to the "New Zhiyuan" evaluation, some of Tongyi Wanxiang's drawing capabilities already approach those of Midjourney, which is often described as the world's most powerful AI painting tool.

Tencent upgrades its MaaS platform

During the World Artificial Intelligence Conference, Tencent Cloud announced an upgrade of its MaaS platform, applying industry large-model capabilities to new scenarios such as financial risk control, simultaneous interpretation and digital human customer service. The financial risk control large model, announced for the first time, is 10 times more efficient than traditional risk control.

On the infrastructure side, Tencent Cloud's self-developed Xingmai high-performance computing network and its vector database provide richer computing infrastructure for the industrial application of large models. The newly upgraded Xingmai network can raise GPU utilization by 40%, cut model training costs by 30% to 60%, and deliver a tenfold improvement in communication performance for large AI models. Running on Tencent Cloud's next-generation HCC computing cluster, it can support computing at a scale of 100,000 cards. Tencent Cloud's AI-native vector database supports retrieval over up to 1 billion vectors with latency held to milliseconds, a retrieval scale 10 times that of traditional standalone plug-in databases, and it can handle a peak of millions of queries per second (QPS).

In terms of application innovation, Tencent Cloud's industry large-model capabilities have been applied to scenarios such as financial risk control, interactive translation and digital human customer service, greatly improving the efficiency of intelligent applications.

The financial risk control solution backed by the industry large model is 10 times more efficient than its predecessor. Drawing on Tencent's more than 20 years of experience fighting black- and gray-market fraud and on thousands of real business scenarios, it improves overall anti-fraud performance by about 20% over the traditional model. Using a prompt-based approach, enterprises can iterate on their risk control capabilities with no manual involvement at any stage, from sample collection and model training to deployment and launch, cutting modeling time from 2 weeks to just 2 days. Even with few accumulated samples, a risk control system can be set up quickly to get through the "cold start" phase.

In interactive translation, industry large-model technology means that simultaneous interpretation no longer needs millions of training examples: "small-sample" training alone can achieve better results. Translation in specialized fields also needs less manual tuning while maintaining quality, and can be deployed in multiple vertical industries. Tencent Simultaneous Interpretation has provided AI simultaneous interpretation for the main forum of the World Artificial Intelligence Conference for six consecutive years.

In digital humans, Tencent Cloud this year launched a small-sample digital human factory, which can reproduce a 2D digital twin within 24 hours from only a small amount of data, greatly reducing the cost for enterprises of adopting digital human services. Now, with AI generation algorithms, the speed of reproducing a digital human's 3D likeness has also improved greatly. Through generative motion driving combined with industry large-model capabilities, enterprises can obtain digital employees that are more "personalized, professional, natural and realistic", making "face-to-face" professional services possible.

SenseTime fully upgrades its large models

During the World Artificial Intelligence Conference, at the "Boundless Love, Daily New" artificial intelligence forum, SenseTime announced a comprehensive upgrade of its "SenseNova" large model system, along with a series of large-model product updates and deployment results under that system.

SenseChat 2.0, a natural language processing model with hundreds of billions of parameters, breaks through the input-length limit of large language models and comes in versions of different parameter scales, so it can fit the requirements of different terminals and scenarios, from mobile to cloud, and reduce deployment costs. SenseMirage 3.0, SenseTime's self-developed large model, has grown from 1 billion parameters at its first release in April this year to 7 billion, enabling image detail at the level of professional photography.

The SenseAvatar 2.0 digital human generation platform improves voice and lip-sync fluency by more than 30% over version 1.0, achieves 4K HD video quality, and adds AIGC image generation and digital human singing functions. SenseSpace 2.0 (Qiongyu) improves space reconstruction efficiency by 20% and rendering performance by 50%, and can map a 100-square-kilometer scene in only 38 hours (with 1,200 TFLOPS of computing power). SenseThings 2.0 achieves millimeter-level fidelity in restoring the texture and material of small objects, and overcomes the difficulty of capturing highly reflective and mirror-like objects.

In finance, SenseTime works with banks, insurers, securities firms and other customers, using digital humans for intelligent customer service, smart marketing and similar tasks. By connecting large language model capabilities, it adds new functions such as investment research analysis and research report writing, helping to cut costs and raise efficiency. Once a financial knowledge base is attached, the system can also answer questions based 100% on the customer's product descriptions and keep that information up to date.

In healthcare, SenseTime has built a Chinese medical language model, "Big Doctor", on massive medical knowledge and clinical data. It provides multi-scenario, multi-turn conversation capabilities such as triage guidance, medical consultation, health advice and decision support. It will soon support multimodal analysis that combines medical images, text and structured data, and will keep improving its medical language understanding and reasoning to help hospitals raise the efficiency of diagnosis and treatment and improve patient services.

Progress at other AI companies

Domestic AI unicorn Mobvoi releases "Sequence Monkey"

Mobvoi brought its large model "Sequence Monkey", currently in internal testing, and its AI CoPilot solution to the World Artificial Intelligence Conference. According to reports, "Sequence Monkey" is a large language model with multimodal generation ability. With language as its core, its capability system covers six dimensions, "knowledge, dialogue, mathematics, logic, reasoning and planning", and it can support tasks such as text generation, image generation, 3D content generation, speech generation and speech recognition. Sequence Monkey has natural language understanding, knowledge, logic and reasoning abilities, and can hold conversations based on them.

JD.com: large model in training, confident about its prospects

He Xiaodong, vice president of JD.com and president of the JD Explore Academy, said that training a general-purpose foundation model currently takes about two months and is estimated to cost tens of millions of yuan, and that he is very confident in the commercial prospects and deployment scenarios of large models. He suggested that startups entering the field should find their own "moat". As for the current "war of a hundred models", He Xiaodong believes that pressure and competition are good for the market and will effectively drive the industry forward.

This article is from Wallstreetcn.

