ByteDance sets off a price war in large models

The official account of Beijing Business Daily

2024-05-15 21:25, published on the official account of Beijing Business Daily, Beijing

ByteDance, long seen as "sluggish" in AI, has used price to "escape the gravity of mediocrity". Its flagship model costs 0.0008 yuan per 1,000 tokens, 99.3% below the industry price, which has led outsiders to say that ByteDance has started a "price war" among large models. For the industry, the lower threshold means the ecosystem will flourish faster; for ByteDance, this "belated" press conference may also mean that the giant is sorting out its own logic for doing AI.

"Price Power"

AI has also begun to compete on "price power". On May 15, ByteDance's Doubao large model was officially released at the Volcano Engine Force Conference. The family currently comprises nine models, including the general model Pro, the general model Lite, a speech recognition model, a speech synthesis model, and a text-to-image model.

The Doubao model, formerly known as "Skylark", was among the first large models in China to complete algorithm filing. But more than the collective debut of the Doubao model family, the price was the biggest surprise of this press conference.

"The official price of the Doubao general model Pro-32k is 0.0008 yuan per 1,000 tokens, which is 99.3% lower than the industry price." As soon as Tan Dai, president of Volcano Engine, announced this number at the press conference, the audience broke into exclamations and applause, so much so that Tan repeated, "Yes, you heard that right, it is a 99.3% reduction."

By comparison, models of the same specification on the market are generally priced at 0.12 yuan per 1,000 tokens, 150 times the price of the Doubao model. According to a calculation released by Volcano Engine, one yuan buys 1.25 million tokens of the main Doubao model, or about 2 million Chinese characters, equivalent to three copies of "Romance of the Three Kingdoms".
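The figures in the paragraph above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function and variable names are invented here, and the two prices are the ones quoted in the article. Exact fractions are used instead of floats so the ratios come out without rounding noise.

```python
from fractions import Fraction

# Prices quoted in the article, in yuan per 1,000 tokens.
DOUBAO_PRICE = Fraction(8, 10000)    # 0.0008 yuan
INDUSTRY_PRICE = Fraction(12, 100)   # 0.12 yuan


def price_ratio(industry: Fraction, model: Fraction) -> Fraction:
    """How many times more expensive the industry price is."""
    return industry / model


def discount_percent(industry: Fraction, model: Fraction) -> Fraction:
    """Price reduction relative to the industry price, in percent."""
    return (1 - model / industry) * 100


def tokens_per_yuan(price_per_thousand: Fraction) -> Fraction:
    """Number of tokens that one yuan buys at the given price."""
    return 1000 / price_per_thousand


ratio = price_ratio(INDUSTRY_PRICE, DOUBAO_PRICE)
discount = discount_percent(INDUSTRY_PRICE, DOUBAO_PRICE)
tokens = tokens_per_yuan(DOUBAO_PRICE)

print(f"Industry price is {float(ratio):.0f}x the Doubao price")
print(f"Discount: {float(discount):.1f}%")
print(f"One yuan buys {int(tokens):,} tokens")
```

The three results match the article's claims of 150 times, 99.3%, and 1.25 million tokens per yuan. The conversion from tokens to Chinese characters is Volcano Engine's own estimate and is not reproduced here.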

What supports Doubao's pricing was probably the first question to arise when the news broke. In the group interview after the press conference, Tan Dai explained that the team has many technical optimization methods, including adjusting the model structure to cut costs while keeping good results, and using distributed inference and hybrid scheduling to greatly reduce deployment costs.

Call volume gives Doubao the confidence to price low, and the low price is meant to pry open a larger market. After a year of iteration and market validation, the Doubao model is reportedly becoming one of the most heavily used large models in China, with some of the richest application scenarios, processing an average of 120 billion tokens of text and generating 30 million images per day. Tan believes that cost reduction is a key factor in fast-forwarding large models to the "value creation stage".

"Doubao's low price makes large models affordable for more enterprises and individuals, lowering the threshold for using large model technology, which will help large models to be applied in more industries and scenarios." Wang Peng, an associate researcher at the Beijing Academy of Social Sciences, told a Beijing Business Daily reporter that the low-price strategy will help attract more potential customers and expand market share. As the user base grows, the commercialization prospects of large models will also broaden.

A race no one can afford to lose

The timing of AI's "price power" story also involves some coincidences. Two days earlier, OpenAI stunned the field with GPT-4o; for API use, GPT-4o is half the price and twice the speed of GPT-4 Turbo.

On the same day, Zhipu's large model open platform launched a new pricing system. The call price of its entry-level GLM-3 Turbo model was cut by 80%, from 5 yuan per million tokens to 1 yuan per million tokens.

"On a global scale, the inference cost of large models, especially of non-top-tier large models, has fallen much faster than previously estimated," Internet investor Zhuang Minghao said in an interview with a Beijing Business Daily reporter.

Given the domestic situation, low prices may also be an inevitable move. Zhuang Minghao said that, compared with the handful of large model companies abroad, the competitive environment for domestic large models is more complex, with many leading companies and star startups. The application layer, by contrast, has taken off more slowly than expected. So with technical capabilities having reached something of a bottleneck, expanding the application ecosystem as far as possible is also a sensible business practice.

Among the leading large model companies on the other side of the ocean, the battle is equally fierce. In the early morning of May 15, Beijing time, Google officially hit back at OpenAI, not only bringing heavyweight releases such as the lightweight model Gemini 1.5 Flash and the general-purpose AI agent Project Astra, but also launching the AI search that OpenAI had been expected to release but did not, defending its position in the search market.

On whether ByteDance will trigger a chain reaction of price cuts, Wang Peng believes that when one company adopts a low-price strategy, competitors will often cut prices in response in order to maintain or expand market share. Doubao's low-price strategy is therefore likely to trigger a price war in the large model industry.

Some people even joked that before long, calling large models will no longer cost enterprises anything, and that large model providers may even have to subsidize enterprises for calling their models. The joke reflects two kinds of anxiety, one about the application ecosystem and the other about market competition.

Tan also mentioned that for enterprises, there are too many uncertainties in AI transformation, and the cost of trial and error must be kept as low as possible.

"As large models have developed, the underlying technology has reached a bottleneck, and people are eager to know what large models can do and whether they can really achieve a flywheel effect, which takes time and money. No one can afford to lose this race, and no one can say whether it will lead to an extreme price war," Zhuang said.

For money-burning large models, whether low prices will squeeze profit margins has also become a core question. Wang Peng believes that price cuts will undoubtedly put some pressure on companies' profitability in the short term, but that this pressure can be eased by expanding market share and increasing usage.

In the long run, Wang Peng believes that with the increase in the number of users and the increase in the frequency of use, enterprises can achieve economies of scale and reduce unit costs, thereby offsetting the impact of price reductions to a certain extent. In addition, enterprises can also increase revenue sources by providing value-added services, customized services, etc., to cope with the profitability pressure caused by price reductions.

"We are not doing this by taking losses; losses are unacceptable." In the group interview, Tan Dai also said that the pricing is based on confidence in the technology.

A belated press conference

At the annual all-staff meeting at the beginning of 2024, ByteDance CEO Liang Rubo set the company's keywords for 2024 as "always be a startup, escape the gravity of mediocrity". In that speech, Liang Rubo mentioned a "sense of crisis" many times, including the crisis of the organization becoming mediocre and the crisis of being sluggish.

AI is also a source of crisis. Liang Rubo said that the company-level semi-annual technology review did not begin discussing GPT until 2023, whereas the large model startups doing better in the industry were founded between 2018 and 2021.

This assessment matches the outside world's impression that ByteDance's AI efforts have been too low-key or even "slow". Most earlier news of product releases, including Doubao and Coze, reached the public as online rumor, and the Volcano Engine Force Conference may be ByteDance's first press conference truly focused on AI.

A slow ByteDance also needs to find its own rhythm, whether in releasing products or in stepping into the spotlight on stage.

"From a communications perspective, ByteDance's past AI moves have actually been very low-key. Although the numbers for Doubao and Coze are good, perhaps only the people who follow this industry most closely know that, and its public profile is far below that of companies such as Zhipu and MiniMax. In this sense, ByteDance needs a press conference like this to put things back where they belong," Zhuang Minghao said.

In fact, during the hour-long group interview, the length of Doubao's preparation and the timing of its release came up many times. Tan Dai's response was, "Our style is not to speak until we are ready."

In Zhuang Minghao's view, ByteDance's "slowness" and "low profile" may both come down to its huge organizational structure: "Throughout the development of large models, it has been very difficult for these very large companies with very complex businesses to sort out their own AI strategy."

As an example, Zhuang Minghao said that ByteDance has a wide range of products and is also very good at development, so today's situation more likely arose because things were not clear internally. Whether to act in the name of Volcano Engine or ByteDance, of Doubao or Coze, of ToB or ToC: these questions may all have been contested within ByteDance, and this press conference came only after the division of labor was settled.

Coincidentally, when Tan Dai was asked in the group interview about ByteDance's overall strategic thinking on AI and the large model business, it was explained on the spot that he is mainly responsible for Volcano Engine: "We can talk about Volcano Engine's strategy; ByteDance's strategy can be discussed another time."

Beijing Business Daily reporter Yang Yuehan
