China's first SOTA music model, "Tiangong Music Model", enters public beta today

On April 17, 2024, the first anniversary of the "Tiangong" model, Kunlun Wanwei announced that the "Tiangong 3.0" base model and the "Tiangong SkyMusic" music model have officially opened for public beta testing.

"Tiangong 3.0" has 400 billion parameters, surpassing the 314 billion parameters of Grok-1, and is the world's largest open-source MoE model. "Tiangong 3.0" delivers breakthrough improvements in semantic understanding, logical reasoning, generalization, handling of uncertain knowledge, and learning ability, and its capabilities in mathematics, reasoning, code, and cultural creativity have improved by more than 30%.

(Tiangong 3.0 surpasses Grok-1 in parameter count, becoming the world's largest open-source Mixture-of-Experts (MoE) model)

The model's technical strength gives "Tiangong 3.0" outstanding performance. On a number of authoritative multi-modal benchmarks such as MMBench, "Tiangong 3.0" surpassed GPT-4V and leads the world.

(Tiangong 3.0 multi-modal performance surpasses GPT-4V, leading the world)

At the same time, the "Tiangong SkyMusic" music model under "Tiangong 3.0" also opens to the public for beta testing today. "Tiangong SkyMusic" is China's first SOTA music model, and it marks the first time that China's self-developed large model technology has led the world in the field of AIGC.

(Tiangong SkyMusic surpasses Suno V3 in overall performance, achieving SOTA among large music models and leading the world)

Tiangong SkyMusic: China's first music AIGC SOTA model

Large models have already made breakthroughs in technical fields such as text and images, bringing sweeping changes to the industry. In AI music generation, however, the world has been waiting for a product to usher in the "ChatGPT moment" for music.

This is because, for a long time, much of the research in the AI music industry focused on the technical route of symbolic music generation. Most of this work could only generate background music (BGM) without vocals, and the quality, effect, and aesthetics of the music fell far short of a usable level, so the industry was unable to take off.

("Tiangong SkyMusic" self-developed AI music model technical architecture)

Unlike the mainstream path in the industry, "Tiangong SkyMusic" adopts a self-developed technical route of large-model music audio generation. This route uses large model technology to directly generate music end to end, integrating instruments, vocals, melodies, volumes, and notes. It is extremely difficult technically, and only a very few top players in the world, including Kunlun Wanwei, are pursuing it.

In a head-to-head evaluation against Suno V3, the top overseas AI music model, "Tiangong SkyMusic" significantly leads in vocal and BGM sound quality, vocal naturalness, pronunciation intelligibility, and other areas. It surpasses Suno V3 with a comprehensive score of 6.65 points, becoming the global SOTA AI music model.

In addition, "Tiangong SkyMusic" offers two original capabilities: reference music generation and dialect song generation.

Reference music generation: Users can upload their own reference music, or select existing reference music from the "SkyMusic" database, to generate songs with a similar style and singing voice. This further lowers the barrier to using large music models and makes it easy for users unfamiliar with music theory to get started.

Dialect song generation: The music generated by "Tiangong SkyMusic" not only performs well in vocal naturalness and vocal intelligibility, but also supports many dialects such as Cantonese, Chengdu dialect, and Beijing dialect, allowing users to express themselves musically more freely and to spread dialect culture.

"Tiangong SkyMusic" is China's first publicly available AI music generation model, and it marks the first time that China's self-developed large model technology has led the world in the field of AIGC.

At present, OpenAI has attracted global attention in the field of large text models. In segments such as AI search and AI music generation, however, Chinese players are forging ahead, repeatedly achieving SOTA performance in their segments through self-developed technology, jointly building China's large model industry, and creating an independent and controllable large model ecosystem.

Tiangong 3.0: 400 billion parameters, the world's largest open-source MoE model

Building on the leading MoE model of the previous generation, "Tiangong 2.0", "Tiangong 3.0" delivers a comprehensive performance upgrade. It adopts a 400 billion-parameter Mixture-of-Experts (MoE) architecture and is currently the world's largest and most powerful open-source MoE model.

"Tiangong 3.0" has comprehensively upgraded its logical reasoning, semantic comprehension, handling of complex requests, and content creation. It also adds AI capabilities such as multi-round search with combined tool calling, chart drawing, research mode, enhanced mode, and image modification and expansion, bringing users a new AI experience.

Multi-round search and combined tool calling: "Tiangong 3.0" has been specifically trained to independently plan, call, and combine external tools and to integrate the resulting information. It can therefore generate and run code on its own to meet a variety of complex user needs, including industry research, product evaluation, information analysis, image generation, and chart drawing.

At the same time, "Tiangong 3.0" uses its semantic understanding capabilities to break user tasks down into subtasks, determine in real time whether it needs to go online or call tools, and carry out one or more rounds of web search and tool calls. In this way it completes complex user requests involving multi-round search, trending-topic analysis, and image generation.

(Query: Look up the latest all-time Chinese movie box office rankings and display them as a chart)
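
The plan-then-call flow described above can be illustrated with a small sketch. This is only a toy illustration under stated assumptions, not how "Tiangong 3.0" is implemented: the keyword-based `plan` function stands in for the model's semantic understanding, and the names `Step`, `plan`, `run`, and `STUB_TOOLS` are hypothetical, as are the two stub tools, which return placeholder strings instead of searching the web or drawing a chart.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A tool takes (argument, context from the previous round) and returns text.
Tool = Callable[[str, str], str]


@dataclass
class Step:
    tool: str       # name of the tool to call
    argument: str   # what to pass to it


def plan(query: str) -> List[Step]:
    """Toy planner (assumption): keywords decide which tools are needed."""
    text = query.lower()
    steps: List[Step] = []
    if "latest" in text or "ranking" in text:
        steps.append(Step("search", query))      # needs fresh information
    if "chart" in text:
        steps.append(Step("chart", "plot the data"))
    return steps


def run(query: str, tools: Dict[str, Tool]) -> str:
    """Run each planned step in turn, feeding every result into the next round."""
    context = ""
    steps = plan(query)
    if not steps:
        return "answered directly without tools"
    for step in steps:
        handler = tools.get(step.tool)
        if handler is None:
            context = f"[no tool named {step.tool}]"
            continue
        context = handler(step.argument, context)
    return context


# Stub tools (assumption): placeholders for real search and charting back ends.
STUB_TOOLS: Dict[str, Tool] = {
    "search": lambda arg, ctx: f"search results for: {arg}",
    "chart": lambda arg, ctx: f"chart built from ({ctx})",
}

if __name__ == "__main__":
    print(run("Latest box office rankings as a chart", STUB_TOOLS))
    print(run("Hello", STUB_TOOLS))
```

In this sketch a query that needs no tools is answered directly, while the box office query triggers a search round followed by a chart round that consumes the search output, mirroring the single-round versus multi-round distinction in the text.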

Chart drawing: "Tiangong 3.0" comprehensively improves its logical reasoning and its understanding of users' natural-language queries. It can therefore judge user needs more accurately, generate and run code on its own, and analyze content and build charts in real time according to the text requirements, giving users more intuitive and efficient comparisons.

(Query: Which is more fun: Beijing, Shanghai, or Chongqing?)

Multi-round search, combined tool calling, and chart drawing are all comprehensive large model capabilities unique to "Tiangong 3.0". They connect the underlying capabilities of "Tiangong 3.0", such as AI search, AI dialogue, AI code generation, and AI image recognition, at the foundational level and are triggered directly through semantic recognition, bringing users a more convenient and efficient AI experience and making the model a real AI productivity tool.

In addition, "Tiangong 3.0" adds a number of AI capabilities such as research mode, enhanced mode, and image modification and expansion.

Research mode: In research mode, "Tiangong 3.0" can expand a simple user instruction into related questions and automatically generate research outlines, maps, practice summaries, and mind maps, helping users grasp the core content quickly and clearly and fulfilling their complex research needs.

(Query: The era of the Kangxi-Qianlong golden age)

Enhanced mode: In enhanced mode, "Tiangong 3.0" can break down and refine a complex user query, ask follow-up questions, and fill in missing information. This gives it stronger natural-language understanding and better handling of uncertain knowledge, so it can meet user needs more accurately and efficiently.

(Query: 2024 Spring Festival movies; "Tiangong 3.0" understands the request and asks about the user's needs)

Image modification and expansion: "Tiangong 3.0" has made a comprehensive breakthrough in multi-modal performance, surpassing GPT-4V and ranking first in the world. Backed by this technical foundation, the AI drawing capability of "Tiangong 3.0" adds new functions such as image size expansion, image orientation adjustment, reference-image generation, reference-image evolution, and reference-image expansion.

("Tiangong 3.0" AI image modification, retouching, expansion, etc.)
