
Kunlun Wanwei Fanghan: After reading 200 AI papers a year, I found the best money-making model for large models

Source: Jiazi Guangnian (甲子光年)
"Free + to C" will produce the next generation of AI giants.

Author|Zhao Jian

Can an internet company credibly pivot to large models? Kunlun Wanwei is looking for an opportunity that suits it.

Founded in 2008, the internet company started with games, and its business now spans social networking, entertainment, and other fields. In 2022, seeing generative AI's disruptive progress in image generation and other areas, Kunlun Wanwei announced it was going "All in AIGC". In 2023, it released a series of AI products, including the foundation model "Tiangong" and Tiangong AI Search.

At the beginning of 2024, Kunlun Wanwei released a new company vision, with the mission of "realizing general artificial intelligence, allowing everyone to better shape and express themselves".

Today, Kunlun Wanwei announced that the "Tiangong 3.0" base model and the "Tiangong SkyMusic" music model have officially entered public beta. With 400 billion parameters, "Tiangong 3.0" surpasses the 314 billion of Grok-1, making it the world's largest open-source MoE model. "Tiangong SkyMusic" is China's first music model to reach SOTA (state of the art), significantly ahead of Suno V3 in vocal and BGM sound quality, vocal naturalness, and pronunciation intelligibility.

Kunlun Wanwei has prepared extensively for this push into large models.

In terms of computing power, Kunlun Wanwei has a training pool of nearly 10,000 GPUs, enough to support training its next generation of multimodal MoE models and video-generation models.

In terms of technology, to keep pace with the industry's state of the art, Fang Han, chairman and CEO of Kunlun Wanwei, reads 3-4 technical papers every week and read more than 200 papers in 2023.

In terms of business model, Fang Han believes the "free + to C" model of the mobile internet era still applies in the AI era. Only "free + to C" will produce the giants of the AI era; it is currently the most suitable business model, and the one most likely to break even and deliver positive ROI.

To get there, large models must drive down the cost of inference. The endgame is on-device inference; the midgame is a foundation-model base plus an AI UGC platform, which is the route Kunlun Wanwei has chosen.

Recently, "Jiazi Lightyear" spoke with Fang Han, chairman and CEO of Kunlun Wanwei, about exactly how Kunlun Wanwei plans to make money from large models.

1. Reduce the cost of inference to be free

Fang Han's judgment is that the next generation of AI giants will resemble the giants of the internet and mobile internet eras: they must be "C-end + free". There are 8 billion consumers in the world, so the market ceiling is the highest, and even a tiny per-user revenue multiplied by 8 billion is an astonishing number.

Many large-model companies abroad, such as OpenAI, currently use a subscription model. Fang Han did some arithmetic on it: assuming a monthly subscription fee of $19, perhaps only about 100 million users worldwide are willing to pay, and the remaining 7.9 billion people still need a free model.
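A quick back-of-envelope version of that arithmetic. The $19 fee and 100 million paying users are Fang Han's stated assumptions; the $3-per-free-user figure is our own illustrative placeholder, not anything from the interview:

```python
# Back-of-envelope sketch of the subscription-vs-free argument.
# All figures are illustrative assumptions, not company data.
subscribers = 100_000_000        # ~100M users worldwide willing to pay
fee_per_month = 19               # assumed monthly subscription fee, USD
free_users = 7_900_000_000       # the remaining 7.9B people

subscription_annual = subscribers * fee_per_month * 12

# Even a tiny per-user annual revenue over the free base is comparable:
tiny_revenue_per_user = 3        # hypothetical $3/user/year (e.g. from ads)
free_annual = free_users * tiny_revenue_per_user

print(f"subscription model: ${subscription_annual / 1e9:.1f}B per year")
print(f"free model:         ${free_annual / 1e9:.1f}B per year")
```

Even with a per-user figure orders of magnitude below the subscription fee, the 7.9-billion-person base puts the free model in the same revenue ballpark, which is the "tiny income multiplied by 8 billion" point above.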

How to achieve the free model? Fang Han believes that there are three paths.

The first path is to reduce the cost of inference. If large-model inference cost fell to a ten-thousandth of today's, the models could be free for almost everyone. But reducing inference cost and improving model capability constrain each other. Fang Han said inference cost is now falling by roughly ten times per year, while model capability is growing by dozens of times per year (driving cost back up), "like two donkeys turning the same millstone".

Besides technical and engineering optimization, costs can also be cut with "small models". For example, the ChatGPT we use today is reportedly not the 175-billion-parameter model but one optimized down to about 8 billion parameters. Fang Han said Kunlun Tiangong follows a similar logic: although Tiangong 3.0 is a 400-billion-parameter MoE model, not every service actually calls a model that large; instead, many small models are distilled from it to serve users.
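The "distill small models from a big one" idea can be sketched as the standard knowledge-distillation loss: a temperature-softened KL divergence between the large model's (teacher's) output distribution and the small model's (student's). This is a minimal illustration of the general technique, not Kunlun's actual training pipeline; all logits below are made up:

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature that softens the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                          # subtract max for stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as in standard knowledge distillation."""
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)
    kl = sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))
    return kl * temperature ** 2

teacher = [4.0, 1.0, 0.2]   # hypothetical logits from the large model
student = [3.5, 1.2, 0.1]   # hypothetical logits from a distilled small model
print(distillation_loss(teacher, student))
```

Training the student to minimize this loss (usually mixed with the ordinary hard-label loss) transfers much of the teacher's behavior into a model cheap enough to serve every request.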

The second path is the UGC-platform model: the 1% of users who create content pay for AI, and the remaining 99% who consume that content watch for free. That cuts inference cost by roughly 100 times, making the business model much easier to establish.

The third path is on-device inference, as in AI PCs and AI phones. But AI phones won't be widespread for another 3-5 years, because users take a full replacement cycle to change phones. Fang Han said: "It's like how 4G and phone cameras drove the short-video industry. Without phone cameras and 4G networks, short video would never have appeared."

Once AI phones are widespread, the market will be unprecedentedly large. There is no fundamental technical obstacle: with engineering optimization, phones can already run inference on 7B and 13B models, which covers 70% to 80% of user needs.
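Whether a 7B or 13B model fits on a phone is largely a weight-memory question. A rough sketch of the arithmetic (illustrative only; it ignores activations and the KV cache, which add further overhead):

```python
def weight_memory_gb(params_billion, bits_per_weight):
    """Approximate weight footprint in GB: parameter count x bytes/weight.
    1e9 params at 8 bits/weight is ~1 GB, so the formula reduces to
    params_billion * bits_per_weight / 8."""
    return params_billion * bits_per_weight / 8

for params in (7, 13):
    for bits in (16, 8, 4):
        gb = weight_memory_gb(params, bits)
        print(f"{params}B model at {bits}-bit: ~{gb:.1f} GB of weights")
```

At 4-bit quantization a 7B model needs only about 3.5 GB for weights, which is why running such models on current flagship phones is plausible after the kind of engineering optimization described above.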

Fang Han believes that until AI phones are widespread, the UGC platform is the best mid-term business model. Kunlun Tiangong has chosen to build a UGC platform driven by its own foundation model.

2. Large model base + AI UGC platform

Kunlun Wanwei currently runs a matrix of six businesses: AI large models, AI search, AI music, AI video, AI social, and AI games. These are essentially two business lines: a general foundation model at the bottom, and AI UGC platforms on top.

Fang Han explained: "From a technical point of view, human knowledge is stored as text, and the domain-specific models for social, games, music, and video all depend on the capability of the text model. For example, the video data used to train a video model has to be labeled with a text model, and the text model's capability determines how well the system follows the user's prompt and how relevant the generated video ends up being."

To build a strong base model, Kunlun Wanwei has reserved a training pool of nearly 10,000 GPUs, enough to support training its next generation of multimodal MoE models and video-generation models. It has also run extensive internal tests on domestic chips; the latest version of one domestic chip reaches about 80% of the Nvidia H100's performance.

From a business point of view, Fang Han also noted that owning the foundation model ensures you cannot be cut off at the source. Similar incidents have already happened: companies that build applications on someone else's foundation model face the risk of that cooperation being terminated.

At the application layer, the goal of Kunlun's product matrix is to build a comprehensive UGC platform with IP at its core.

Fang Han said that IP is mostly born in novels and comics, as with Harry Potter, The Lord of the Rings, and Marvel, and then monetized through video and games; that is the complete IP loop. Users don't much care whether the form is text, comics, or video; they care whether you can tell a good story, and that story is the IP. If you can generate new IP, users will be willing to spend on your platform. Kunlun Tiangong's AI UGC platform aims to let everyone who creates with AI complete that full IP loop on it.

On growth expectations, Fang Han quoted the saying that "a soldier who doesn't want to be a general is not a good soldier". Right now, the products with the best technology, that is, the ones that reach SOTA, win the most users. So as long as you reach SOTA in a vertical category, whether a domain or a language, rapid growth follows.

The Tiangong SkyMusic model, which entered public beta today, is the SOTA model in the music field.


Tiangong SkyMusic's overall performance surpasses Suno V3; image from Kunlun Wanwei

Based on current progress, Fang Han believes all AIGC technologies will reach a sufficiently usable level within two or three years.

3. Large-model companies need technologists as CEOs

In the fiercely competitive large-scale model track, how does Kunlun Wanwei ensure the leading position of technology?

Fang Han believes there is only one principle for continuous progress: curiosity. "As long as you stay curious about the world, you can keep moving forward. As for how to move forward, I don't think there is any way other than going to the front line and working with the technology directly."

This wave of large models is different from previous ones. Its defining feature is that it is technology-driven: at most startups, the CEO is the top technologist.

After graduation, Fang Han was assigned to the Institute of High Energy Physics, the first organization in China to connect to the internet. He says his curiosity about technology is endless.

Fang Han reads 3-4 technical papers a week, read more than 200 in 2023, and still writes code and prompts on the front line: "I dare say my prompt-writing ability may exceed that of 90% of my colleagues in the company."

Fang Han put it bluntly: "As a company manager, if you don't read papers, you don't know where the boundaries of the technology are, and there is no way to design products and business models around those boundaries. If you don't understand the technology, others will attack you from a higher dimension, crush you on technical metrics, and you're finished, just like the GPT-wrapper companies that were wiped out after the GPT Store launched. For me, the only way to keep up with the industry is to read papers and talk to technical colleagues, and the same goes for our company's management."

Fang Han believes that as long as decision-makers know where the boundaries of the technology lie, the planning built on top of them will be sound.

In addition to standing on the front line of technology, Kunlun Wanwei also attaches great importance to the introduction of talents.

In September 2023, AI expert Yan Shuicheng joined Kunlun Wanwei as co-CEO of Tiangong Intelligence and president of the 2050 Global Research Institute. Fang Han revealed that a number of professors and PhDs have already joined the institute.

Fang Han has also observed that because large models made the entire technology stack brand-new, the strongest people on this track are actually the PhD and graduate students still in school, not those who graduated years ago. He finds that current doctoral students write the most creative and insightful large-model papers.

4. The debate between open source and closed source

Just one day before Kunlun announced the Tiangong 3.0 public beta, Baidu chairman and CEO Robin Li voiced a controversial view on the open-source versus closed-source debate at the Create 2024 Baidu Developer Conference: in today's large-model ecosystem, he said, open-source models will fall further and further behind.

Fang Han also shared his thoughts on the open source vs. closed source debate.

First of all, does open source have a business model? In the software industry, open source has always been a very controversial topic.

At the end of the last century, in the software industry's early days, open source indeed had no good business model; the only one was charging for services. The most profitable such company at the time was Red Hat, which was later acquired by IBM.

Later, one software company, MongoDB, changed the open-source business model. MongoDB found that cloud providers were making money selling its open-source product as a service without giving a penny back to the open-source organization. In response, it introduced the SSPL license, under which the product stays free for all users except cloud service providers, who must pay.

Another open-source business model treats open source as the cheapest lead-generation channel. Once a product is open-sourced, many users come to try it, and when they hit problems they turn to the original vendor for support. Several domestic open-source databases have taken this approach.

Therefore, Fang Han believes that open source still has a business model.

From a technical point of view, who is more advantageous in open source or closed source models?

There are many ways to evaluate large models today, and Fang Han considers the LMSYS Chatbot Arena the most authoritative. Since 2023, GPT-4 has held the No. 1 spot almost continuously; Claude 3 briefly surpassed it after release, and recently the latest GPT-4 Turbo has regained first place.


Large Model Arena Leaderboard, image courtesy of LMSYS Chatbot Arena

Among open-source models, the highest-ranked on the current Arena leaderboard is Alibaba's Qwen1.5-72B-Chat, in eleventh place. Fang Han argues that open-source models have narrowed their gap behind closed-source ones from more than two years to only 4-6 months, which shows the gap is shrinking rather than growing.
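The Arena rankings above come from pairwise human votes aggregated into Elo-style ratings. A minimal sketch of a single Elo update after one head-to-head vote (illustrative only; LMSYS's published leaderboard actually fits a Bradley-Terry model over all votes, and the ratings below are made up):

```python
def elo_update(rating_a, rating_b, winner, k=32):
    """One Elo update after a head-to-head comparison.
    winner is 'a', 'b', or 'tie'; k controls update size."""
    expected_a = 1 / (1 + 10 ** ((rating_b - rating_a) / 400))
    score_a = {"a": 1.0, "b": 0.0, "tie": 0.5}[winner]
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta

# Hypothetical vote: lower-rated model A beats higher-rated model B,
# so A gains more points than it would against an equal opponent.
a, b = elo_update(1200, 1300, winner="a")
print(round(a), round(b))
```

Because the expected score depends on the rating gap, an upset win moves both ratings sharply, while a win by the favorite barely moves them; over many votes the ratings converge toward a stable ordering.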

In another track, text-to-image, the advantages of open-source models are even more obvious.

Text-to-image has two representative closed-source SOTA models, DALL-E and Midjourney. Yet after the open-source Stable Diffusion was released, many art workflows in the games industry gradually adopted Stable Diffusion, partly because of computing resources.

Fang Han said that PhD students and faculty at universities are actually in an awkward position: they don't have much compute, so they can only build their work on the open-source Stable Diffusion, and their ingenuity ends up contributed to the open-source model. The open-source model is thus an ecosystem builder, better suited to serving users' long-tail needs.

Fang Han knows two Chinese authors of personal fine-tuned models: one designs tattoo patterns for tattoo artists, the other draws scaffolding diagrams for shops. These are long-tail needs that closed-source models cannot satisfy and that can only be met by customizing open-source models.

Fang Han believes open-source and closed-source models are both parts of one ecosystem. Neither will replace the other; they are complementary, and each will have its own space.

(Cover image from "Iron Man")
