
Volcano Engine Enters the Large Model Arena: Scenarios, Scenarios, Scenarios

Author: Titanium Media APP

Tan Dai, President of Volcano Engine

What makes large models hard to land? Model quality, inference cost, and deployment difficulty: each hurdle is enough to give enterprises a headache, and "affordability" is the first problem they run into.

Large models have now entered the stage of serving a broad range of customer scenarios, which means they must not only run reliably but also serve efficiently, delivering a higher level of intelligence with less computing power and at a reasonable price. As model requirements grow more complex, costs rise further, and high inference costs can become a barrier to adoption and innovation for many enterprises that simply cannot afford the expense. The lowest price, in turn, brings users one step closer to affordable AI.

On May 15, ByteDance's Doubao model slashed prices in one stroke: the flagship Doubao model is now priced at 0.0008 yuan per 1,000 tokens, 99.3% lower than the industry average. 0.8 li (0.0008 yuan) covers more than 1,500 Chinese characters, and 1 yuan buys 1.25 million tokens, making the price extremely competitive in today's global market.
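As a rough sanity check on those figures, the arithmetic can be worked out in a few lines. This is a minimal sketch: the ratio of roughly 1.5 Chinese characters per token is an assumption inferred from the "120 billion tokens ≈ 180 billion characters" throughput figure cited later in this article, not an official specification.

```python
# Back-of-the-envelope check of the quoted Doubao pricing.
# Assumption: ~1.5 Chinese characters per token, inferred from the
# "120 billion tokens / ~180 billion characters" figure quoted later.

PRICE_PER_1K_TOKENS = 0.0008  # yuan per 1,000 tokens (quoted price)
CHARS_PER_TOKEN = 1.5         # assumed ratio, not an official number

# How many tokens one yuan buys at the quoted rate.
tokens_per_yuan = 1 / PRICE_PER_1K_TOKENS * 1000
print(f"1 yuan buys about {tokens_per_yuan:,.0f} tokens")  # ~1,250,000

# How many characters 0.8 li (0.0008 yuan, i.e. 1,000 tokens) covers.
chars_per_0_8_li = 1000 * CHARS_PER_TOKEN
print(f"0.0008 yuan covers about {chars_per_0_8_li:,.0f} Chinese characters")  # ~1,500
```

Both results line up with the numbers quoted above, under the stated character-per-token assumption.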

"The ultra-low pricing of the large model comes from the fact that we are technically confident in optimizing costs." Volcano Engine President Tan Cheng said.

Explaining the logic behind the price cut at a media briefing, Tan Dai said, "We have made a lot of technical optimizations at the model level. On the AI engineering side, for example, we have optimized hybrid scheduling and distributed inference for the underlying heterogeneous computing power, which greatly reduces inference costs." As things stand today, many enterprises face considerable risk when they try to build innovative applications on large models, so on the supply side the cost of trial and error must be driven very low before widespread use becomes possible.

Of course, the value of technology is always judged by implementation.

This is also the logic behind Volcano Engine's pricing of the Doubao model, its product matrix, and its new-generation AI technology stack: large model applications have entered a more pragmatic stage, and only by creating real scenario value can a vendor win customers.

Stay close to the business, get the large model running

Large models will inevitably be woven into enough scenarios, but that will not be easy. Some of the reasons are technical, such as hallucinations and fine-tuning capability; others are not, such as fit with the enterprise's business and the learning curve for business staff.

Precisely because the combination of large models and industry is still at an early stage, with no mature path to follow, enterprises usually struggle to choose.

At present, many general-purpose large models on the market still chase hundreds of billions or even trillions of parameters. It is true that models with higher parameter counts usually offer stronger processing power and generalize to more scenarios. But a model that only pursues parameter count may fail to deliver the best results in specific business scenarios, because it overlooks actual business requirements.

From the enterprise's perspective, using large models in real business scenarios means facing not only technical problems such as cost, security, and weak algorithm performance, but also a lack of implementation experience and reference cases.

From the technology vendor's perspective, the common commercialization problem is how to understand the needs of thousands of enterprise customers while lowering the barrier to adopting large models as far as possible.

The answer comes from practice.

As a typical representative of data-intensive industries, the financial industry has been moving relatively fast in the application of large models, especially in the fields of marketing, risk control, investment research, and customer service.

Take Huatai Securities. Its earlier securities customer-service system suffered from isolated product forms, weak generalization in intent recognition, and a lack of multi-turn conversation understanding, leaving the user experience in need of improvement. Huatai Securities is now using AI large models to upgrade its wealth management assistant: by introducing large model technology and integrating it with traditional model algorithms and business transaction processes, the new generation of the assistant solves the problem that traditional technology could neither accurately identify intent nor sustain multiple rounds of interaction with customers.


Image source: Unsplash

Watching Huatai Securities take its wealth management assistant from idea to polished product, it becomes clear that many companies have a large number of user scenarios and strong knowledge barriers, and one prerequisite for introducing advanced technology is that the solution must already be fairly mature. As the technology provider, Volcano Engine offers capabilities such as multi-model invocation, flexible and observable cloud computing power, and a security sandbox. At the model application stage, Volcano Engine's "1+N+X" model capability matrix provides what it positions as the best model combination in China, sparing customers the pressure of hard choices, keeping model selection competitive, and making model invocation observable and model versions smoothly upgradable, so that model access can be upgraded, rolled back, and monitored.
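In application code, "observable, upgradable, and roll-back-able" model access can look roughly like the sketch below. This is a hedged illustration, not Volcano Ark's actual SDK: the client object, model names, and version tags are placeholders invented for the example.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("model-router")

# Hypothetical registry: pin each logical model to an explicit version so an
# upgrade or rollback is a one-line config change rather than a code change.
MODEL_VERSIONS = {
    "general-pro": "v1.2",   # placeholder version tags, not real Doubao releases
    "general-lite": "v1.1",
}


def call_model(client, logical_name: str, prompt: str) -> str:
    """Route a request to a pinned model version and record basic call metrics."""
    endpoint = f"{logical_name}-{MODEL_VERSIONS[logical_name]}"
    started = time.time()
    try:
        # `client.generate` stands in for whatever invocation API the platform exposes.
        reply = client.generate(model=endpoint, prompt=prompt)
        log.info("model=%s latency=%.2fs status=ok", endpoint, time.time() - started)
        return reply
    except Exception:
        log.exception("model=%s latency=%.2fs status=failed", endpoint, time.time() - started)
        raise
```

The point of the sketch is the separation of concerns: the business code names a logical model, while version pinning and call logging live in one thin layer that can be swapped or rolled back without touching the application.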

At the same time, Volcano Engine works to understand customer needs and pain points in depth, evaluates and optimizes models from the customer's perspective, and pays attention to ease of use and interpretability, in pursuit of the best results and cost performance. In Volcano Engine's approach, we can see a path for large model applications to accelerate their landing.

Go a step further, take root in the scenario

Any technology adoption path runs into concrete challenges in business scenarios. Enterprises in different industries have their own business scenarios, proprietary data, and specific requirements; whether a model can solve the problems of a particular scenario, and what practical effect it will have, cannot be guaranteed in one step. Landing a large model is a process of iteration and trial and error.

Titanium Media spoke in depth with one company that had sorted out nearly 100 business scenarios in the early stage of applying large models. Even with so many real business scenarios, careful evaluation and planning were still needed before anything could land: prioritizing scenarios by input-output ratio, clarifying which were high-value and which were only loosely tied to the core business, working out how to run inference efficiently at the edge, checking whether existing production systems could collect the data that model inference requires, and even whether front-line employees fully understood AI large models. Verifying the large model in a single pilot took several months.


Image source: Pexels

Throughout the landing of large model scenarios, vendors have taken different paths: some focus on rebuilding their own business, others on accumulating industry benchmarks. For the latter, most vendors first build a general-purpose large model system and then offer comprehensive solutions for certain industries and scenarios on top of it. This usually requires enterprise customers to have very strong AI demands and implementation capabilities, and the scenarios it targets are highly homogeneous.

Volcano Engine, by contrast, chooses to start from the customer, understand each customer's unique scenarios, and polish the product scenario by scenario, working backwards toward broad scenario coverage. This road looks harder to walk, but it has earned a higher call volume and validated the company's leading position.

The Doubao model, formerly known as "Skylark," was one of the first large models to pass the security filing for large model services. The AI assistant built on it, the Doubao app, has surpassed 100 million downloads and has become a mainstream AI productivity tool that users on platforms such as Douyin, Xiaohongshu, and Toutiao rely on for work and daily life. According to the company, more than 8 million agents have been created on the Doubao platform, and monthly active users have reached 26 million.

Beyond the Doubao app, ByteDance has built AI applications in scenarios such as development, learning, interactive entertainment, and digital-clone creation, and has pushed more than 50 of its internal businesses, along with Volcano Engine customers in the financial, automotive, and pan-internet industries, to gradually adopt the Doubao model for AI-driven efficiency gains.

Feishu, for example, ByteDance's flagship B-end product, has embraced AI across the board. Its Feishu Smart Partner feature, built on large model technology, can provide enterprises with services such as daily work summaries, meeting-minute summaries, and enterprise knowledge search.

The massive user base and wide range of customer application scenarios give the Doubao model an excellent training ground. Only heavy usage can polish the best large model, and in turn, the better the model performs, the better it understands real user needs and the more users it attracts.

According to Tan Dai, the Doubao model currently processes an average of 120 billion tokens of text (about 180 billion Chinese characters) and generates 30 million images per day. To meet enterprises' demands across different scenarios, the Doubao model is offered externally through the Volcano Ark large model service platform, including the two main models, Doubao General Model Pro and Doubao General Model Lite, as well as a multimodal model family spanning image, speech, and semantics that fits a variety of business scenarios: role-playing, speech synthesis, speech recognition, voice cloning, text-to-image, function-calling, and vectorization models.

While most vendors are still circling a large number of homogeneous scenarios, the Doubao model has started from internal and external business and customer applications, polishing its capabilities as it solves scenario-specific needs, and then, through Volcano Ark, offering enterprise customers the best cost performance and the greatest freedom in its products.

Meanwhile, the Volcano Ark large model service platform provides customers and developers with a secure, stable, and easy-to-use environment, helping customers put large models into practice with the highest cost performance and the greatest degree of freedom.

Volcano Engine's model products and technical capabilities form a positive cycle of scenario practice and technology iteration; by teaching customers to fish, it empowers them to spark more innovation and growth.

The landing of large models also requires full-stack services

AI technology is not a castle in the air: landing a large model is not a single-point capability output, and behind the intelligence lies systems engineering. Looking at the enterprise customers already using large models to drive business applications, this is also their common demand.

Huatai Securities, for example, has achieved flexible invocation of multiple models through the Volcano Ark large model service platform, and relies on the computing power, data services, and security sandbox provided by Volcano Engine to safeguard business growth.

While exploring a new model of knowledge management and collaboration, FlowUs worked with Volcano Engine to connect to the Doubao model and apply it across multiple scenarios, and also used DataFinder, Volcano Engine's growth analytics product, to improve overall information flow efficiency.

ASUS's "Douding AI Assistant", an artificial intelligence service specially designed for consumers, uses ByteDance's large model capabilities to gain excellent dialogue, search, and creation capabilities, and at the same time, with the help of ViKingDB, a vector database of the volcano engine, improves the vector similarity retrieval ability, ensuring that when users ask questions to Douding, the system can quickly and accurately locate the problem, retrieve relevant results, and submit them to the large model for refining and summarizing.

Through collisions with a large number of customers and its own internal business practice, Volcano Engine has also abstracted new common demands into Volcano Ark, a one-stop large model service platform better suited to enterprises and developers. At the 2024 Spring FORCE Conference, Volcano Ark was fully upgraded to 2.0, providing a web-search plugin with the same search capability as Toutiao and Douyin, a content plugin drawing on the same massive content sources as Toutiao and Douyin, a RAG knowledge base plugin powered by the Doubao vectorization model, and the "Buckle" (Coze) professional development platform to help enterprises quickly build AI applications. Volcano Engine also offers many out-of-the-box AI applications, such as ChatBI, an AI assistant for intelligent data insight; Intelligent Creation Cloud 2.0 for marketing scenarios; and Sales Copilot, a sales AI assistant.

At present, the combination of the Doubao model, the Volcano Ark large model service platform, and the cloud base has become Volcano Engine's full-stack AI service solution for helping enterprise partners with intelligent transformation. What customers need, clearly, is not merely to plug in a large model, but a large model service platform with stronger system-carrying capacity, better plugins, better algorithm services, and secure, trustworthy solutions. From the model to the platform to the computing base, every layer supports enterprises' digital intelligence.

As ByteDance's ToB capability output platform, Volcano Engine's role in the era of AI large models is steadily expanding. Although full competition in the large model industry is still some way off, Volcano Engine is running at full speed on data, computing power, algorithms, and ecosystem co-creation with customer scenarios in order to become a better provider of intelligence and computing power.

(This article was first published on Titanium Media App. Author: Yang Li; Editor: Gai Hongda)
